Is Big Data ready for the enterprise?
As with all new technologies that evolve fast, traditional enterprises find it hard to catch up. This is true of Big Data too. For Big Data to be considered enterprise ready, it would have to reach a level of maturity comparable to that of traditional software.
To start with, let's take the case of documentation. This is an interesting case: given the pace of development in the open source ecosystem, it's pretty hard for documentation to keep up to date. Unfortunately, large enterprises depend on documentation, rather than on reading code, to guide their business as usual.
Then there is the question of the state of the ecosystem, which seems to be in constant flux. The pattern that was designed 6 months ago no longer seems valid, and the component that was supposed to do the ingestion gets replaced by a newcomer. And then there are projects which go in and out of fashion, which gain vendor support and then lose it, e.g. Storm vs Spark, Hue vs Ambari.
Then there is the question of security, which the vendors and the ecosystem don't seem to have come to terms with. A few examples:
Active Directory integration: This becomes a problem when multiple components in the ecosystem fail to understand what is happening at the operating system level, including group memberships. Both Ambari and YARN are affected.
Kerberos integration: Enterprises generally want authentication, authorisation and audit in place, especially in regulated industries. The challenge is that this is currently a bit broken: assuming one uses Ambari, one has to turn Kerberos off to update Ambari. So much for security.
HSM integration: Regulation implies that keys and certificates should be stored securely, i.e. in HSMs. Unfortunately the integration so far has not been great; it has yet to support KMIP.
So in summary, the answer to the question is: it depends. It looks like some of the things that are high priority for enterprises are low priority for the vendors and contributors. If what you are doing does not need security, you might not have to worry about the security issues. Documentation is again something where one has to assess the risks and make a decision based on in-house expertise. As for the state of the ecosystem, architecture decisions have to be based on what feels architecturally stable for the set of use cases you want to run on it.