Scale Different: How We Reimagined Data Storage to Give Data Freedom
What’s wrong with data storage?
Before starting Datera in 2013, Nicholas Bellinger and I had contributed the block storage subsystem to Linux (“Linux-IO”), which was adopted by numerous companies, including Pure Storage, Red Hat (now IBM) and Google. Linux-IO eventually became an industry standard, and Nic and I became somewhat notorious as the original software-defined storage dudes. Life was good.
While the industry took great advantage of our open source software to replace proprietary storage hardware with less-proprietary storage hardware (and called it “software-defined”), we thought they missed the point: mapping the rigid architectures of the past into software is not enough to meet the demands of digital transformation. Getting stuck with the hardware you start with simply no longer works.
We imagined a service-centric architecture to orchestrate data anywhere, while scaling continuous availability and predictable performance across private and public clouds. We knew hyperscale architecture had the innate adaptability we wanted, but we needed to systematically reimagine it to address its inherent challenges. If we could harness this adaptability to let users change their intent as they go, we could free their mind from the mental slavery of trying to anticipate future storage needs. Thus, Datera was born.
Swarms are adaptive; monoliths are not. A swarm of starlings is infinitely adaptable and resilient, whereas dinosaurs are the antithesis of adaptive, and extinct. If the swarm concept is implemented correctly, everything else flows from it.
For instance, Google adopted swarm design almost twenty years ago. They build their data centers from thousands of commodity servers, and use autonomous distributed software to orchestrate them into coherent swarms. Now known as “hyperscale,” this architecture is transforming how IT is designed and delivered, putting IT as we know it under existential pressure — the Jurassic IT Era is coming to its end.
The promise of hyperscale is to converge diverse hardware resources into one coherent swarm with incredible adaptability and scalability. Its implementation, however, is riddled with hard practical challenges like node heterogeneity, data gravity, data consistency, combinatorial reliability and availability, operational complexity, performance and scalability cliffs, and so on, plus fundamental physical realities like time, distance and latency.
To harness the power of hyperscale, we assembled an amazing interdisciplinary team around our founding architects Claudio Fleiner (a hyperscale wizard), Raghu Krishnamurthy (an automation thought leader) and Bill Rozas (a brilliant computer architect), who created a game-changing core technology portfolio and product that solves the innate conceptual challenges with hyperscale.
This allowed us to make hyperscale storage work, and finally bring its benefits to a wide spectrum of workloads, traditional and cloud native, including the most demanding mission-critical enterprise applications. Effectively, we delivered the first enterprise-grade software-defined storage.
True software-defined storage
Over the next few weeks, I’ll be publishing a series of four blogs that explain Datera’s key architecture tenets, why they are fundamental to make hyperscale storage work, and how they help transform IT.
The series covers how we addressed fundamental hyperscale problems by systematically rethinking basic concepts from first principles in three key categories: the infrastructure model, the automation model and the hybrid cloud model. It then shows how these three combine to transform the traditional IT operating model from systems to scalable services, thereby capturing significantly more value across the entire data and system life cycle.
1. Infrastructure model
Our key infrastructure tenet was to make hyperscale frictionless. What good is hyperscale if it can’t adapt rapidly and continuously, deliver inspiring and predictable performance, scale seamlessly, rebalance quickly among a rich spectrum of endpoints, and so on?
So, at the inception of Datera, we spent significant time reimagining hyperscale without its innate friction points. As a result, we created a hyperscale infrastructure model that delivers:
- Enterprise performance: transparent data mobility and lockless distributed coherence combine into frictionless I/O, driving tier-1 performance, low latency and seamless scalability
- Continuous availability: built for change, transparent data mobility across generations of endpoints driven by current and future application intent, no forklift upgrades
- Eternal clusters: heterogeneous endpoint choice and scalability, economic elasticity, no hardware lock-in, no technical debt, future-ready; the hardware you start with is no longer the hardware you’re stuck with
- Fully programmable: all data infrastructure is accessible through a REST API, making it browsable, searchable and meshable, data infrastructure-as-code
- Ecosystem integration: standard protocols in extensible containers, forward agility, easily surfacing the benefits of new tech to all applications
The result is a uniquely flexible data services platform that can independently orchestrate data services and data across all of its endpoints. Live data services mobility and live data mobility lay the foundation to achieve data freedom across private and public clouds.
2. Automation model
Now that we created an infrastructure model that makes hyperscale frictionless and truly scalable, effectively creating long invisible arms across the datacenter, we turned to our next key tenet: automation. What good is frictionless hyperscale if it requires slow humans to work it?
So, our hyperscale automation model is driven by applications, which know best what they want. It is effortless, instant and invisible:
- Application-driven automation: data infrastructure is continuously composed and delivered as a service, driven by applications and not by humans, based on application profiles (or intent). Intent is invariant and, together with transparent data mobility, makes data portable and scalable — across heterogeneous endpoints, technology and innovation. Most importantly, Datera allows you to change your mind as you go — it always adapts and gives you an operational out
- Data Orchestration: AI-driven continuous self-optimization, based on application intent, infrastructure feedback (or insights) and live data mobility, 24x7 lights-out operations
- Role-based multi-tenancy: application intent can be mapped dynamically, depending on how it is instantiated — so tenants get a degree of self-service, while operating within pre-defined business envelopes
- Lifecycle management: manage full lifecycle of data and endpoints, from sunrise (e.g., test) to sunset (e.g., archive), always best economics
- Global service management: global ops portal with visual insights and cloud-based machine learning across the installed base, driving anomaly detection and predictive operations
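To make the idea of application-driven automation more concrete, here is a minimal Python sketch. It is not Datera's implementation: the names `Intent`, `Endpoint` and `place`, the fields, and all numbers are hypothetical and chosen only for illustration. It shows the general pattern described above: the application states what it wants, a planner picks endpoints that satisfy that intent, and changing the intent simply re-runs the planner.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Intent:
    """Hypothetical application profile: what the application wants."""
    replicas: int
    min_iops: int


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical storage node with an advertised capability."""
    name: str
    iops: int


def place(intent: Intent, endpoints: list[Endpoint]) -> list[str]:
    """Pick the least capable endpoints that still satisfy the intent.

    Raises ValueError when too few endpoints meet the intent.
    """
    eligible = sorted(
        (e for e in endpoints if e.iops >= intent.min_iops),
        key=lambda e: e.iops,
    )
    if len(eligible) < intent.replicas:
        raise ValueError("not enough endpoints satisfy the intent")
    return [e.name for e in eligible[: intent.replicas]]


# A heterogeneous cluster (illustrative values only).
cluster = [
    Endpoint("hdd-1", 2_000),
    Endpoint("hybrid-1", 20_000),
    Endpoint("hybrid-2", 25_000),
    Endpoint("flash-1", 200_000),
    Endpoint("flash-2", 250_000),
]

# Initial intent: a test workload with modest needs.
test_placement = place(Intent(replicas=2, min_iops=10_000), cluster)

# The application owner changes the intent; the planner re-runs, and a
# real system would then move the data to the new placement.
prod_placement = place(Intent(replicas=2, min_iops=100_000), cluster)
```

In this sketch the intent never names hardware, so endpoints can be added or retired without touching the application profile.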
Containers further escalate the speed, scale and elasticity pressure on infrastructure, because they consume infrastructure as a continuously delivered service. In that model, Kubernetes, Mesos and their brethren replace manual operations with application-driven automation of compute swarms, and Datera is to data as Kubernetes is to compute. The impact of this new automation model is hard to imagine without experiencing it.
Remember when the iPhone overnight made flip phones feel so 1990s? Experiencing Datera is a similar watershed moment — it makes traditional storage feel antiquated. What made the key difference for the iPhone? Apps. The iPhone is driven by apps, not by humans using a dialpad, just like Datera is driven by applications, not by humans using a keyboard.
3. Hybrid cloud model
Now that we have achieved data freedom across the data center, we can parlay it into a hybrid cloud model that lets us scale it across private and public clouds.
Our application-driven automation model allows describing the behavior of data in invariant application profiles, or storage blueprints, that we automatically adapt to tenancy and roles. Storage blueprints seamlessly expand the behavior of data beyond the box — they allow a “data broker” to make data portable, scalable and hybridizable.
Our complementary frictionless infrastructure model provides scalable live data mobility — effectively implementing a “data exchange” across private and public clouds.
- Data center automation: data center awareness to co-orchestrate data with the data center, e.g., optional L3 network virtualization to participate in flat L3 networks (one flat IP address space) instead of using L2 overlays, which streamlines network configuration and management, and makes data services continuously available by letting them float across the data center (behind fixed virtual IP addresses) with practically instant session failovers
- Storage blueprints: data packaged with behavior, active/passive data portability and scalability across a wide spectrum of diverse endpoints in private and public clouds
- Edge to cloud data freedom: data portability and scalability (storage blueprints), live data mobility (synchronous stretch clusters) and scale (lockless distributed coherence) allow active/active synchronous replication — essentially creating a single multi-cloud data continuum that allows applications to float from edge to cloud
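The notion of an invariant storage blueprint that is adapted to tenancy and roles can be sketched as follows. This is an illustration under stated assumptions, not Datera's data model: `Blueprint`, `Envelope`, `instantiate` and every field and number are hypothetical. The blueprint itself never changes; each tenant gets a copy clamped to its own business envelope.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Blueprint:
    """Hypothetical storage blueprint: an invariant description of data behavior."""
    name: str
    replicas: int
    max_iops: int


@dataclass(frozen=True)
class Envelope:
    """Hypothetical per-tenant business limits."""
    max_replicas: int
    max_iops: int


def instantiate(blueprint: Blueprint, envelope: Envelope) -> Blueprint:
    """Return a copy of the blueprint clamped to the tenant's envelope."""
    return replace(
        blueprint,
        replicas=min(blueprint.replicas, envelope.max_replicas),
        max_iops=min(blueprint.max_iops, envelope.max_iops),
    )


database = Blueprint("database", replicas=3, max_iops=100_000)

# A production tenant with a generous envelope gets the blueprint as written.
prod = instantiate(database, Envelope(max_replicas=3, max_iops=200_000))

# A development tenant gets the same blueprint, scaled down to its limits.
dev = instantiate(database, Envelope(max_replicas=1, max_iops=5_000))
```

Because the dataclasses are frozen, `database` is left untouched by either instantiation, which mirrors the idea that the blueprint stays invariant while its instances vary by tenant.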
As workloads are moving to the edge, the data center is evolving into a meta data center, and clouds get abstracted behind brokerage layers, we can provide the scaled data broker and data exchange to achieve data freedom from the intelligent edge to cloud.
4. Operating model
Our innovative storage infrastructure and automation models allow us to fundamentally decouple storage consumption from deployment, and continuously broker between them. Now we can rethink the storage operating model from rigid point-in-time systems to continuously composable and scalable data services that empower both consumers and operators to independently optimize their needs:
- Application owners: can self-service by simply defining composable and scalable data services as application intent — and change it as they go; and complementarily
- Storage operators: can independently scale performance and capacity by simply adding optimized servers from the spot market — and vary them as they go.
We can uplevel this concept to refactor the entire IT value creation chain, from planning through obsolescence, to deliver transformational simplicity and efficiency.
Effectively, we bring the cloud experience to enterprise data storage, and let you free your mind (and data!) to focus on creating business value:
- Planning + Procurement: plan your services, not your systems, and change your plans as you go. Datera always adapts and gives you an operational out. Free your mind from trying to plan ahead and getting it wrong, from fallible point-in-time tech commitments that decay into technical and operational debt, from expensive under- or overprovisioning, and from the futile mental slavery of trying to anticipate future IT needs in the face of an unprecedented rate of innovation
- Deployment + Operations: zero-touch continuous composability and delivery of data services. Free your mind from Sisyphean, repetitive manual configuration, aggravated by fast-churning modern environments like containers or Kubernetes
- Scaling: an extensible data continuum that transparently scales across space and time — across new technologies, consumption models and hardware capabilities, and across accelerating innovation and obsolescence. Free your mind from hardware boundaries that create rigid silos with a never-ending treadmill of forklift upgrades
- Service + Maintenance: continuous availability with a regular, planned maintenance cadence. Free your mind from unpredictable failure “cliffs” and stressful emergency incident responses
- Upgrades + Obsolescence: “eternal” clusters with live software upgrades on rolling endpoints. Free your mind from the remaining operational risk, including planning long obsolescence cycles, tech forklifts and data migration sprees
The result is a comprehensive data foundation for the modern software-defined data center. Together with enterprise partners that have a global brand, reach and support, efficient supply chains, and equipment financing, we can deliver game-changing operational value to enterprise customers, not just for hyperscale, but for any scale.
Customers are looking to replatform their IT to cloud in order to increase business agility and reduce technology risk. Public clouds have enormous OpEx elasticity, which makes failure cheap but success expensive, and they lock customers in with captive data services. Thus there is a universal need for data services that converge public cloud simplicity and elasticity with private cloud control and efficiency, to create multi-cloud optionality.
To meet this need, we have reimagined storage to bring the cloud experience to data. We have rethought storage to scale and orchestrate data across the constant flux of capabilities and consumption, across technology innovation and obsolescence cycles, and across private and public clouds, so that it scales across space and time, driven by current and future application intent. We have envisioned an “eternal” data services continuum that combines software-defined simplicity with enterprise “abilities.”
As a result, we created mission-critical software-defined storage that is future-proof for the demands of digital transformation: a 24x7 lights-out data continuum that scales data freedom from the intelligent edge to cloud. Because the people who are crazy enough to think they can reimagine storage at scale are the ones who do.
Please visit us at www.datera.io, or tweet me at @MarcFleischmann.