Cloud Storage Options Part 2

Falafel Software Bloggers
Published in Falafel Software
Feb 1, 2017

This is a continuation of Cloud Storage Options Part 1, which covers Google Cloud Storage and Google Cloud SQL, both from a .NET developer’s perspective. Part 2 covers the two remaining structured storage solutions offered in GCP, Cloud Datastore and Cloud Bigtable, and again focuses on how .NET developers can get started with these storage options.

NoSQL

Although Google categorizes them as “structured”, Datastore and Bigtable aren’t as structured as you might be used to if you come from a background of relational databases like MS SQL Server. Both are NoSQL storage solutions. What is NoSQL? In part, it is a response to the internet age’s exponential growth in both user and device data. That growth created the need for a new kind of storage solution: one that is fast, easily distributed, and flexible. As the name implies, NoSQL (Not Only SQL) storage options are non-relational, which gives them an advantage when data needs to be flexible, scalable, and lightning fast. Since Google itself helped start the NoSQL movement with the original Bigtable paper in 2006, it makes sense that GCP is leading the way today for NoSQL cloud-based solutions.

Cloud Datastore

Cloud Datastore was originally the backing technology behind App Engine, but is now accessible as its own service as well. Google encourages the use of Datastore for flexible, high-availability data storage needs under 1TB; beyond that, your storage needs might be better served by Bigtable. Scalability is one of the big pluses of Cloud Datastore, so if your data is going to start small and then grow huge, and you need high availability, Datastore is the storage solution Google recommends.

Like Azure’s DocumentDB, Cloud Datastore follows the document-store NoSQL model. That means it stores more than just keys with values; it stores keys with documents. A document in this case just means an object with an internal structure, such as a JSON object. Think of it as semi-structured application data that can also include hierarchies. User profiles and product catalogs are great examples of data that is well-suited to Cloud Datastore. With Datastore, you often hear about the flexibility of storage, or ad-hoc storage capabilities, but what does that mean? Well, consider that even after storing hundreds of objects with the same 3 properties, you can then store the next hundred with those 3 properties plus 3 more. Get the idea?
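To make that flexibility concrete, here is a minimal sketch, assuming the Google.Cloud.Datastore.V1 NuGet package. The property names are hypothetical and chosen only for illustration. The entities are built locally and nothing is sent to Datastore; the point is just that two entities intended for the same kind can carry different property sets.

```csharp
using System;
using Google.Cloud.Datastore.V1;

public static class FlexibleSchemaSketch
{
    // Builds two entities meant for the same (hypothetical) kind.
    // The second one carries the same 3 properties plus 3 more.
    public static Entity[] BuildProfiles()
    {
        var basic = new Entity
        {
            ["name"] = "Ada",
            ["email"] = "ada@example.com",
            ["active"] = true
        };

        var extended = new Entity
        {
            ["name"] = "Grace",
            ["email"] = "grace@example.com",
            ["active"] = true,
            ["city"] = "Arlington",
            ["loginCount"] = 12L,
            ["newsletter"] = false
        };

        return new[] { basic, extended };
    }

    public static void Main()
    {
        // Print the property names each entity carries.
        foreach (Entity entity in BuildProfiles())
        {
            Console.WriteLine(string.Join(", ", entity.Properties.Keys));
        }
    }
}
```

No schema change is needed between the two: Datastore stores whatever properties each entity happens to have.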

Oh, did I mention it’s free to get started with Cloud Datastore? The free quota includes up to 1GB of stored data, which is enough to get started and see if it is the storage solution for your application. Storage costs after that depend on the amount of data and also on the structure of the data, so you’ll want to take a look at the size calculations and pricing.

.NET Datastore API

If you’re a .NET developer, you can get started right away with the .NET API for Cloud Datastore. Using the library is straightforward in C# once you get used to the basics. For anyone familiar with relational database structures, here’s a friendly mapping from the documentation to help with the terminology.

Concept                          Cloud Datastore    Relational database
Category of object               Kind               Table
One object                       Entity             Row
Individual data for an object    Property           Field
Unique ID for an object          Key                Primary key

Remember, Cloud Datastore Entities of the same Kind can have different Properties. That flexibility comes at a cost, however. Even though Cloud Datastore allows for some SQL-like queries, the functionality is limited. There is no support for join operations, inequality filtering on more than one property, or filtering based on the results of a subquery, just to name a few.
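As a rough illustration of what does fit within those limits, the sketch below builds a range query that applies inequality filters to a single property. It assumes the Google.Cloud.Datastore.V1 package; the `SurveyResponse` kind and `rating` property are hypothetical. The query is only constructed here, not executed.

```csharp
using System;
using Google.Cloud.Datastore.V1;

public static class SinglePropertyRangeSketch
{
    // Both inequality filters target the same property ("rating"),
    // which stays within the one-inequality-property restriction.
    public static Query BuildRatingRangeQuery(long minInclusive, long maxExclusive)
    {
        return new Query("SurveyResponse")
        {
            Filter = Filter.And(
                Filter.GreaterThanOrEqual("rating", minInclusive),
                Filter.LessThan("rating", maxExclusive))
        };
    }

    public static void Main()
    {
        Query query = BuildRatingRangeQuery(2, 5);
        Console.WriteLine(query.Filter.CompositeFilter.Filters.Count);
    }
}
```

Adding a second inequality on a different property (say, a hypothetical `submitted` date) is the kind of query the restriction above rules out.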

Using the Console

To explore Cloud Datastore using the GCP Console, you can select Datastore under Storage, and choose Create an Entity. This is an interesting exercise to demonstrate just how the Entities work, and what working with them in code will entail. First, choose your region and let Google Cloud initialize the Datastore.

Now you can set the properties for your Entity. Namespace is important if you plan on having multitenancy.

Also note the Parent option here — the description provided is helpful in understanding how data hierarchies work, which is a big part of the appeal of document-store NoSQL.
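In code, the Parent option corresponds to building a key from another key. The following is a minimal sketch, assuming the Google.Cloud.Datastore.V1 package; the `Survey` kind, the key names, and the project ID are hypothetical placeholders. Keys are built locally, so no connection is needed to try it.

```csharp
using System;
using Google.Cloud.Datastore.V1;

public static class AncestorKeySketch
{
    // Root key for a hypothetical parent entity in the default namespace.
    public static Key BuildSurveyKey(string projectId)
    {
        return new KeyFactory(projectId, "", "Survey").CreateKey("customer-feedback");
    }

    // Child key: the parent key's path becomes the start of the child's path.
    public static Key BuildQuestionKey(Key surveyKey)
    {
        return new KeyFactory(surveyKey, "SurveyQuestion").CreateKey("q1");
    }

    // Query restricted to entities stored under the given parent.
    public static Query QuestionsUnder(Key surveyKey)
    {
        return new Query("SurveyQuestion") { Filter = Filter.HasAncestor(surveyKey) };
    }

    public static void Main()
    {
        Key questionKey = BuildQuestionKey(BuildSurveyKey("my-project-id"));
        foreach (var element in questionKey.Path)
        {
            Console.WriteLine($"{element.Kind}: {element.Name}");
        }
    }
}
```

The hierarchy lives entirely in the key path, which is why the parent has to be chosen when the entity is created.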

Now we can start adding properties! The typical options for property type are available, and you can choose whether to index them or not.

Interesting to note here are the Array and Embedded Entity types. If you choose these types, you will be presented with pre-formatted JSON in the Value window, which you can alter for your needs.

After creating a couple of entities, you can try out querying your Datastore, either by Kind or by GQL, using the properties you set as indexes. This is a great little service for experimenting, because if you’re used to T-SQL, you’ll almost certainly find some query functionality to be missing. If you are unsure why a particular query isn’t working, take a look at the unsupported features list.
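The same kind of GQL query can also be issued from C#. This is a hedged sketch assuming the Google.Cloud.Datastore.V1 package and the `SurveyQuestion` kind used in the C# example in this post; the class and method names are made up for illustration, and `PrintMatches` needs a `DatastoreDb` created with valid credentials.

```csharp
using System;
using Google.Cloud.Datastore.V1;

public static class GqlSketch
{
    // Builds a parameterized GQL query; @type is bound by name.
    public static GqlQuery BuildByQuestionType(string questionType)
    {
        return new GqlQuery
        {
            QueryString = "SELECT * FROM SurveyQuestion WHERE questiontype = @type",
            NamedBindings =
            {
                { "type", new GqlQueryParameter { Value = questionType } }
            }
        };
    }

    // Runs the query against Datastore and prints each question's text.
    public static void PrintMatches(DatastoreDb db, string questionType)
    {
        foreach (Entity entity in db.RunQuery(BuildByQuestionType(questionType)).Entities)
        {
            Console.WriteLine(entity["text"]?.StringValue);
        }
    }

    public static void Main()
    {
        // Building the query needs no connection; only PrintMatches does.
        Console.WriteLine(BuildByQuestionType("starrating").QueryString);
    }
}
```

Binding the value as a named parameter, rather than concatenating it into the query string, avoids quoting problems.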

Example in C#

So what does a data operation for Cloud Datastore look like in C#? Let’s take a look. The Cloud Storage C# example code for this series is available here. Remember, you’ll need to have set up default application authentication just as in the previous post, along with the rest of the setup in the First Steps of the previous examples.

Let’s start with adding a few entities.

// Requires: using System.Collections.Generic; using Google.Cloud.Datastore.V1;
// dsdb = DatastoreDb.Create(projectId, dsNamespace);
var kFactory = dsdb.CreateKeyFactory("SurveyQuestion");
var entities = new List<Entity>()
{
    new Entity()
    {
        // Incomplete key: Datastore assigns the numeric ID on write
        Key = kFactory.CreateIncompleteKey(),
        ["questiontype"] = "yesno",
        ["text"] = "Would you purchase this product again?",
        ["categories"] = new ArrayValue() {Values = {"product", "purchase"}},
    },
    new Entity()
    {
        Key = kFactory.CreateIncompleteKey(),
        ["questiontype"] = "starrating",
        ["text"] = "How would you rate the quality of our product?",
        ["categories"] = new ArrayValue() {Values = {"product", "quality"}},
    },
    new Entity()
    {
        Key = kFactory.CreateIncompleteKey(),
        ["questiontype"] = "starrating",
        ["text"] = "How would you rate the price of our product?",
        ["categories"] = new ArrayValue() {Values = {"product", "price"}},
    }
};

// Write all three entities in a single call
dsdb.Upsert(entities);

Then we can run a simple query by category.

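// category is a string such as "quality"; an equality filter on an
// array property matches entities whose array contains that value
Query query = new Query("SurveyQuestion")
{
    Filter = Filter.Equal("categories", category)
};

var results = dsdb.RunQuery(query).Entities;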
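Once the results come back, each one is an `Entity` whose properties can be read by name. The helper below is a small sketch of that, assuming the Google.Cloud.Datastore.V1 package and the property names from the entities above; the `Describe` helper itself is hypothetical and not part of the library or the linked sample.

```csharp
using System;
using System.Linq;
using Google.Cloud.Datastore.V1;

public static class ResultReaderSketch
{
    // Formats one SurveyQuestion entity as "text [category, category]".
    public static string Describe(Entity entity)
    {
        string text = entity["text"]?.StringValue ?? "(no text)";
        ArrayValue categories = entity["categories"]?.ArrayValue;
        string joined = categories == null
            ? ""
            : string.Join(", ", categories.Values.Select(v => v.StringValue));
        return $"{text} [{joined}]";
    }

    public static void Main()
    {
        // A locally built entity stands in for a query result here.
        var sample = new Entity
        {
            ["text"] = "How would you rate the price of our product?",
            ["categories"] = new ArrayValue { Values = { "product", "price" } }
        };
        Console.WriteLine(Describe(sample));
    }
}
```

With real query results, the same helper could be applied in a loop, for example `foreach (var e in results) Console.WriteLine(ResultReaderSketch.Describe(e));`.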

The example code linked above uses these operations in a C# console app, which you can download to get started.

Cloud Bigtable

Cloud Bigtable is another NoSQL option, one that is most cost-effective for very, very large data sets starting at 1TB. Bigtable is not a document store, but instead is a Wide Column Store, also known as an Extensible Record Store. This is another reason it is recommended for the largest types of data, such as IoT, financial, or geospatial datasets. Bigtable is notable for its low latency and high throughput, and also for its compatibility with open-source APIs such as HBase.

To create a Bigtable instance, start from the Storage menu in the Console, and create a new instance.

You’ll essentially be creating a cluster of nodes, so beware: if you are working off of a trial account, this may eat up some of your trial credits. After creation, you can view the instance in the Console dashboard.

Bigtable and HBase

Bigtable supports the open-source HBase API, making it a natural choice if you are experienced with the HBase API and the Hadoop ecosystem. A cloud storage solution unique to Google Cloud Platform, Bigtable targets very large datasets in every way. If you aren’t familiar with this sort of NoSQL storage, consider a giant table with a sorted key/value map that can scale to billions of rows and thousands of columns, yet can be sparsely populated. Combine that with the HBase ecosystem for reading, writing, and organizing the data, and that’s Bigtable.
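To get a feel for that data model, here is a toy in-memory sketch in plain C#. It does not use any Bigtable or HBase client, and the class, row keys, and column names are all invented for illustration. It only mimics two of the ideas above: rows kept sorted by key, and rows that store just the columns they actually have.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class WideColumnSketch
{
    // Rows stay sorted by key (ordinal string order); each row holds only
    // the columns that were written to it, so the table can be sparse.
    private readonly SortedDictionary<string, Dictionary<string, string>> _rows =
        new SortedDictionary<string, Dictionary<string, string>>(StringComparer.Ordinal);

    public void Put(string rowKey, string column, string value)
    {
        if (!_rows.TryGetValue(rowKey, out var row))
        {
            row = new Dictionary<string, string>();
            _rows[rowKey] = row;
        }
        row[column] = value;
    }

    // Returns matching row keys in sorted order. This toy version filters
    // every key; a real store would seek directly to the start of the range.
    public List<string> ScanPrefix(string prefix)
    {
        return _rows.Keys
            .Where(key => key.StartsWith(prefix, StringComparison.Ordinal))
            .ToList();
    }

    public int ColumnCount(string rowKey)
    {
        return _rows.TryGetValue(rowKey, out var row) ? row.Count : 0;
    }

    public static void Main()
    {
        var table = new WideColumnSketch();
        table.Put("sensor#002#t1", "metrics:temp", "21.5");
        table.Put("sensor#001#t2", "metrics:temp", "20.1");
        table.Put("sensor#001#t1", "metrics:temp", "19.8");
        table.Put("sensor#001#t1", "metrics:humidity", "40");

        // Prints sensor#001#t1 then sensor#001#t2, regardless of insert order.
        foreach (string key in table.ScanPrefix("sensor#001"))
        {
            Console.WriteLine(key);
        }
    }
}
```

Because rows are sorted by key, the way row keys are composed (here, sensor ID followed by a timestamp label) decides which rows sit next to each other and can be read together.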

.NET Compatibility

Bigtable is a little different in that there isn’t (at the time of this article) a Google-sourced C# API. Instead, you can use the HBase-compliant .NET library of your choice. Google does note (but does not explicitly support) third-party libraries, one of which is a .NET library written in C#. It is not yet available on NuGet, but can be accessed through the GitHub link. You can also use the HBase Shell for Cloud Bigtable.

Other Resources

Video Overview of Cloud Storage Options

.NET Client Library Developer’s Guide

Google Cloud .NET Code Sample Quickstarts

.NET APIs

Datastore API tutorial
