Going Green: Why I Joined Nimble Storage

Dimitris Krekoukias
Published in RecoveryMonkey
6 min read · Mar 28, 2016

I am proud to announce that, as of today, I am a member of the Nimble Storage team.


This marks the end of an era — I spent quite a bit of time at NetApp: learned a lot, did a lot — by the end I had my hands in all kinds of sausage making… :)

I wish my friends at NetApp the best of luck for the future. The storage industry is a very tough arena, and one that will only get harder and less forgiving.

Why?

I compared Nimble Storage with many competitors before making my decision. Quite simply, Nimble’s core values agree with mine. It goes without saying that I wouldn’t choose to move to a company unless I believed they had the best technology (and the best support), but the core values are where it all starts. The product is built upon those core values.

I firmly believe that modern storage should be easy to consume. Indeed, it should be a joy to consume, even for complex multi-site environments. It should not be a burden to deal with. Nor should it be a burden to sell.

Systems that are holistically easy to consume have several business benefits, some of which are: lower OPEX and CAPEX, increased productivity, less risk, easier planning, faster problem resolution.

It’s important to understand that easy to consume is not at all the same as easy to use: that is but a very small subset of easy consumption.

The core value of easy consumption encompasses several aspects, many of which are ignored by most storage vendors. Most modern players will focus on ease of use, show demos of a pretty GUI and suchlike. “Look how easy it is to install” or “look how easy it is to create a LUN”. Well — there’s a lot more to worry about in real life.

The lifecycle of a storage system

Beyond initial installation and simple element creation, there is a multitude of things users and vendors need to be concerned with. Here’s a partial list:

  • Installation
  • Migration to/from
  • Provisioning
  • Host/fabric configuration
  • Backups, restores, replication
  • Scaling up/out
  • Upgrading from a smaller/older version
  • Firmware updates for all components (including drives)
  • Tech refresh
  • Support

What about more advanced topics?

A storage solution cannot exist in a vacuum. There are several ancillary (but extremely important) services needed in order to help consume storage, especially at scale. Services that typically cannot (and many that should not) reside on the storage system itself. How about…

  • Initial and future sizing
  • Capacity planning based on long-term usage data
  • Performance analysis and profiling
  • Performance issue resolution/recommendations
  • Root cause analysis
  • What-if scenario modeling
  • Support case resolution
  • Comprehensive end-to-end monitoring and alerting
  • Comprehensive reporting (including auditing)
  • Security (including RBAC and delegation)
  • Upgrade planning
  • Pervasive automation (including host-side)
  • Ensuring adherence to best practices

If a storage solution doesn’t make all or most of the above straightforward, then it is not truly easy to consume.

The problem:

Storage vendors will typically either be lacking in many of the above areas, or may need many different tools, and significant manual effort, in order to provide even some of these services.

Not having the tools creates an obvious problem — the customer and vendor simply can’t perform these functions, or the implementation is too basic. Most smaller vendors are in this camp. Not much functionality beyond what’s inside the storage device itself. Long-term consumption, especially at scale, becomes challenging.

On the other hand, having a multitude of tools to help with these areas also makes the solution hard to consume overall. Larger vendors fall into this category.

For instance: Customers may need to access many different tools just to monitor and alert on various metrics. One tool may provide certain information, another tool provides different metrics (often with significant overlap with the first tool), and so on. And not all tools work with all versions of the product. This increases administrative complexity and overall time and money spent. And the end result is often compromised and incredibly hard to support.

Vendors that need many different tools also create a problem for themselves: Almost nobody on staff will have the expertise to deal with the plethora of tools necessary to do certain things like sizing, performance troubleshooting or even a tech refresh. Or optimizing a product for specific workloads. Deep expertise is often needed to interpret the results of the tools. This causes interminable delays in problem resolution, lengthens sales cycles, complicates product development, creates staffing challenges, increases costs, and in general makes life miserable.


How?

What always fascinated me about Nimble Storage is that not only did they recognize these challenges, they actually built an entire infrastructure and innovative approach in order to solve the problem.

Nimble recognized the value of Predictive Analytics.

The challenge: How to use Big Data to solve the challenges faced by storage customers and storage vendors. And how to do this in a way that achieves a dramatically better end result.

While most vendors have call-home features, and some even have rudimentary capacity, configuration and maybe even performance telemetry being sent to some central repository (usually very infrequently), Nimble elected instead to send extremely comprehensive sensor telemetry to a huge analytics engine. A difficult undertaking, but one that would define the company in the years to come.

Nimble also recognized the need to do this from the very beginning. Each Nimble array sends 30–70 million data points back to Nimble every day. Trying to retrofit telemetry of this scope would be extremely difficult if not impossible to achieve effectively.
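To put that figure in perspective, here is a quick back-of-the-envelope conversion of the daily count into an average per-second rate. The even spread across the day is my simplifying assumption; real telemetry is likely burstier.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def points_per_second(points_per_day: int) -> float:
    """Average telemetry rate, assuming points are spread evenly over a day."""
    return points_per_day / SECONDS_PER_DAY


# The 30-70 million range is the per-array daily figure quoted above.
low = points_per_second(30_000_000)
high = points_per_second(70_000_000)

print(f"{low:.0f} to {high:.0f} data points per second per array")
# prints "347 to 810 data points per second per array"
```

In other words, each array is reporting hundreds of sensor readings every second, around the clock.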

This wealth of data (the largest storage-related analytics engine in the world, by far) is used to help customers with the challenges mentioned previously, while at the same time lowering complexity.

It also, crucially, helps Nimble better support customers and design better products without having to bother customers for data dumps.

For example: What if a Nimble engineer trying to optimize SQL I/O performance wants to see detailed I/O statistics only for SQL workloads on all Nimble arrays in the world? Or on one array? Or on all arrays at a certain customer? It’s only a simple query away… and that’s just scratching the surface of what’s possible. It certainly beats trying to design storage based on arbitrary synthetic benchmarks, or manually grabbing performance traces from customer gear…
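To make the idea of "a simple query away" concrete, here is a minimal sketch of that kind of filtering. Everything in it is hypothetical: the `IoSample` record layout, the field names, the `select_samples` helper and the sample values are invented for illustration, and none of it reflects InfoSight's actual schema or interfaces.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass(frozen=True)
class IoSample:
    # Hypothetical record layout, invented purely for illustration.
    array_id: str
    customer: str
    workload: str           # e.g. "sql" or "vdi"
    read_latency_ms: float


def select_samples(
    samples: List[IoSample],
    workload: str,
    customer: Optional[str] = None,
    array_id: Optional[str] = None,
) -> List[IoSample]:
    """Keep samples for one workload, optionally narrowed by customer or array."""
    return [
        s for s in samples
        if s.workload == workload
        and (customer is None or s.customer == customer)
        and (array_id is None or s.array_id == array_id)
    ]


def mean_latency(samples: List[IoSample]) -> float:
    """Average read latency across the selected samples."""
    return mean(s.read_latency_ms for s in samples)


# Made-up data standing in for a fleet-wide telemetry store.
SAMPLES = [
    IoSample("array-1", "acme", "sql", 0.5),
    IoSample("array-1", "acme", "vdi", 1.5),
    IoSample("array-2", "acme", "sql", 1.0),
    IoSample("array-3", "globex", "sql", 1.5),
]

all_sql = select_samples(SAMPLES, "sql")                       # every array
acme_sql = select_samples(SAMPLES, "sql", customer="acme")     # one customer
one_array = select_samples(SAMPLES, "sql", array_id="array-3") # one array

print(mean_latency(all_sql), mean_latency(acme_sql), mean_latency(one_array))
```

The point is that the same question can be scoped to the whole installed base, one customer, or one array just by changing the filter, without asking anyone for a data dump.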

What?

Enter InfoSight. That’s the name of the gigantic analytics engine currently ingesting trillions of anonymized sensor data points every week. And growing. Check some numbers here

Nimble Storage customers do not need to install custom monitoring tools to perform highly advanced storage analytics, performance troubleshooting, and even hardware upgrade recommendations based on automated performance analysis heuristics.

No need to use the CLI, no need to manually send data dumps to the vendor, no need to use 10 different tools.

All the information customers need is available through a browser GUI. Even the vast majority of support cases are automatically handled by InfoSight, and I’m not talking about simply sending replacement hardware (that’s trivial).

I always saw InfoSight as the core offering of Nimble Storage, the huge differentiator that works hand in hand with the hardware and helps customers consume storage easily. Yes, Nimble Storage arrays are fast, reliable, easy to use, have impressive data reduction abilities, scale nicely, have great features, are cost-effective etc. But other vendors can claim they can satisfy at least some of those attributes.

Nobody else has anything even remotely approaching the depth and capability of InfoSight. This is why Nimble calls their offering the Predictive Flash Platform. InfoSight Predictive Analytics + great hardware = Predictive Flash.

I will be covering this fascinating topic in a lot more depth in the future. An AI Expert System powered by a behemoth analytics engine, helping reduce complexity and making the solution Easy To Consume, is a pretty impressive piece of engineering.

Watch this space…

D

Technorati Tags: NetApp, Nimble, Big Data, Analytics, InfoSight



I have been dealing with technology since I was a small child. In the past, I built large scale computing and backup. Today, high end storage, AI, analytics.