Querying JSON data in Couchbase using Scopes and Collections
This week I’m attending the 3-day Couchbase Connect event and will be reporting on some of the topics that I find most interesting. Today I’m reviewing one of the new N1QL features coming in the next release, presented by Keshav Murthy, VP R&D at Couchbase. This session is available as an on-demand replay once you register.
The first day saw many sessions, but I knew the N1QL update session would have a lot of technical innovations. N1QL is the SQL-like query language used on JSON documents stored in Couchbase. With Couchbase, you are not just using a key-value store but a full JSON document database that also exposes data through queries like a relational database (among other things).
Most of N1QL is identical to SQL, but there are nuances that add even more power than standard SQL provides, especially for JSON-specific features like sub-objects, array handling, and now, more refined logical data management.
One of the most practical new features for both developers and DBAs is the implementation of scopes and collections.
I’m planning to dig into other topics such as the “flex index” N1QL features for search but will do so in a separate post so this one stays short.
Logical Data Containment: Scopes and Collections
Background
The main units of containment in Couchbase have been buckets that store documents. Named buckets allow the grouping of documents for a particular application or even a workgroup that requires different user permissions than others.
Today’s Update
A 2019 release previewed the addition of collections as a subunit of buckets. The update discussed today refines this further, adding scopes within a bucket and collections within scopes.
Each level of this hierarchy is accessible through the table name path syntax used elsewhere in the platform. For example, in the graphic above, the FROM clause specifies the data to be used in the query. Each container holds a subset of documents and is referenceable by name:
cxprof.usa.loginfo -> bucket.scope.collection
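As a rough sketch of how the three-part name might be assembled in application code, the Python helpers below build a backtick-quoted keyspace path and place it in a FROM clause. The helper names (quote_identifier, keyspace_path, select_all), the alias l, and the LIMIT are all hypothetical choices for illustration. Nothing here talks to a Couchbase server; the code only produces the statement text, assuming the bucket.scope.collection syntax shown above.

```python
def quote_identifier(name: str) -> str:
    """Wrap one N1QL identifier in backticks (hypothetical helper)."""
    if not name or "`" in name:
        # Keep the sketch simple: refuse names that would need escaping.
        raise ValueError(f"unsupported identifier: {name!r}")
    return f"`{name}`"


def keyspace_path(bucket: str, scope: str, collection: str) -> str:
    """Join bucket, scope and collection into a dotted keyspace path."""
    return ".".join(quote_identifier(part) for part in (bucket, scope, collection))


def select_all(bucket: str, scope: str, collection: str, limit: int = 10) -> str:
    """Build a SELECT statement that reads from a single collection."""
    path = keyspace_path(bucket, scope, collection)
    return f"SELECT l.* FROM {path} AS l LIMIT {int(limit)}"


if __name__ == "__main__":
    # Statement text only; running it would require a Couchbase query service.
    print(select_all("cxprof", "usa", "loginfo"))
```

Quoting each part separately keeps the dots as path separators, which matters if a bucket name ever contains characters such as a hyphen.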
Note that indexes are as easy to create as in any SQL environment, and only the fields that you want to query need to be indexed. Likewise, data can be partitioned automatically or with manual intervention, depending on need.
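To make the indexing point a little more concrete, here is a minimal sketch that builds a CREATE INDEX statement for one collection. The index name idx_loginfo_time and the field eventTime are made up for illustration (the session did not name any fields), and the function create_index_statement is a hypothetical helper that only produces text; it does not contact a server.

```python
from typing import Sequence


def create_index_statement(index_name: str, keyspace: str, fields: Sequence[str]) -> str:
    """Build a CREATE INDEX statement covering only the listed fields."""
    if not fields:
        raise ValueError("at least one field is required")
    # Only the fields you intend to filter or sort on need to appear here.
    field_list = ", ".join(f"`{field}`" for field in fields)
    return f"CREATE INDEX `{index_name}` ON {keyspace}({field_list})"


# Assumed names: the keyspace comes from the example above, the rest is invented.
statement = create_index_statement("idx_loginfo_time", "cxprof.usa.loginfo", ["eventTime"])
print(statement)
```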
Why it matters
Logical data groupings in NoSQL systems make data easier to manage. Some applications will only ever need access to a subset of documents, so exposing all of them can add unneeded complexity.
Without this ability to tightly scope, a query may end up needing multiple WHERE conditions to drill down to the subset of data of interest, whereas scopes and collections can manage it all behind the scenes.
For example, documents could have fields like countryName or appName instead of using the above syntax, but the result would be two more WHERE conditions added to the query, e.g.:
FROM cxprof
WHERE countryName = "usa"
AND appName = "loginfo"
Instead, the new features allow desired scoping within the FROM clause:
FROM cxprof.usa.loginfo
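The difference between the two approaches can be mimicked in plain Python, with no Couchbase involved. In this sketch the sample documents and the message field are invented, and the nested dictionary is only an analogy for scopes and collections, not a description of how Couchbase stores data. It shows that once documents are placed by scope and collection, the path alone selects the same subset that the two field filters would.

```python
from collections import defaultdict

# Hypothetical sample documents; the message field is made up for illustration.
documents = [
    {"countryName": "usa", "appName": "loginfo", "message": "login ok"},
    {"countryName": "usa", "appName": "billing", "message": "invoice sent"},
    {"countryName": "canada", "appName": "loginfo", "message": "login failed"},
    {"countryName": "usa", "appName": "loginfo", "message": "logout"},
]


def filter_flat(docs, country, app):
    """Mimic FROM cxprof WHERE countryName = ... AND appName = ..."""
    return [d for d in docs if d["countryName"] == country and d["appName"] == app]


def group_into_collections(docs):
    """Place each document into a scope -> collection -> documents mapping."""
    bucket = defaultdict(lambda: defaultdict(list))
    for d in docs:
        bucket[d["countryName"]][d["appName"]].append(d)
    return bucket


cxprof = group_into_collections(documents)

# Mimic a FROM clause that names bucket.scope.collection: the path selects the subset.
scoped = cxprof["usa"]["loginfo"]
flat = filter_flat(documents, "usa", "loginfo")

print([d["message"] for d in scoped])  # ['login ok', 'logout']
print(scoped == flat)                  # True
```

The filtering work has not disappeared; it has moved to the moment a document is written into its collection, so queries no longer need to repeat it.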
Why it _really_ matters
Extending all the query capabilities of N1QL to support scope and collection keyspaces allows even more granular management of data, including data security.
Role-based access control (RBAC) is available at these new levels, further enabling application developers to offload access-related issues to the database using roles. DBAs can define users and limit them to particular areas and assign read/write/edit level permissions accordingly.
- For more information, read the blogs by Keshav Murthy that dig deeper.
This is only a small part of the N1QL update given today, but I will have more over the following days as I try out new features and follow more sessions at the event.
Join the event at connect.couchbase.com and see all the other innovations with N1QL including ACID transactions, UDFs, index advisor, and more. I can’t wait to give them all a try with a real project :)