Reading the Open-Source Nutrition Labels
The open-source projects that underpin various IBM cloud services
IBM used to be a closed-source company. Customers would come to us because it was the only place to get software we built behind closed doors in our research labs. The picture today is very different. We still have exclusive technologies such as Watson and Db2, but there are also familiar open-source projects operated as-a-service by IBM.
Each such service is based on an open-source project but may be renamed to distinguish it from the original. IBM’s implementation may have additional features such as a web-based dashboard, backup tools, or unified authentication.
Each of the open-source projects is built with one or more programming languages, using frameworks and external libraries that are open-source projects in their own right. They run on a Linux distribution, which itself is a soup of open-source projects, libraries, and tools. When a particular open-source tool is deployed, it may even run in containers orchestrated by other open-source utilities.
This is all possible because free, open-source software is both free of charge and free to re-model in other forms (within the limits of the license agreement), but it is certainly not free to produce. Many, many human hours are needed to create, maintain, and improve these products. IBM doesn’t just take from the open-source community; it is an active contributor to the products and tools it uses.
Sometimes, it may not be clear that the IBM product you are using is actually based on an open-source project. So let’s clear this up with a list of IBM products and the open-source projects they’re built on.
IBM Cloudant / Apache CouchDB™
IBM Cloudant is a JSON document data store built on Apache CouchDB. The two are broadly equivalent, but Cloudant adds geospatial support and free-text search. Both features are open-sourced but don’t ship with vanilla CouchDB.
IBM Message Hub / Apache Kafka™
IBM Blockchain / Hyperledger, a Linux Foundation Project
IBM Blockchain allows private blockchains to be built where all the participants know that the transaction history is secure and tamper-proof. It is built on the Linux Foundation’s Hyperledger project.
IBM Cloud Functions / Apache OpenWhisk™
IBM Cloud Functions allows tiny microservices to be deployed as scalable “serverless” actions. It is a deployment of the Apache OpenWhisk project which combines nginx, Apache Kafka, Apache CouchDB, and Docker, amongst others.
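To give a feel for how small an action can be, here is a minimal sketch in Python. It assumes the OpenWhisk Python runtime convention of a `main` function that receives the invocation parameters as a dict and returns a dict; the `name` parameter and the greeting text are made up for illustration.

```python
def main(args):
    """Entry point of the action.

    Assumption: the runtime calls main() with a dict of parameters
    and expects a JSON-serializable dict in return.
    """
    # "name" is a hypothetical parameter; fall back to a default if absent.
    name = args.get("name", "world")
    return {"greeting": f"Hello, {name}!"}


if __name__ == "__main__":
    # Local smoke test; on the platform the runtime invokes main() for you.
    print(main({"name": "IBM"}))
```

Saved as a file such as `hello.py`, a function like this could be registered with the OpenWhisk `wsk` command-line tool, after which the platform takes care of running and scaling it on demand.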
This one’s easy! Spark is a distributed data-processing framework and it should be no surprise that IBM’s Spark service is based on Apache Spark. It is available in Jupyter notebooks and in conjunction with BigInsights, a proprietary IBM technology.
Open-source offerings from Compose
The lovely folks at Compose provide a range of data layer services based on open-source technologies:
- Elasticsearch / Elasticsearch
- etcd / etcd
- MongoDB / MongoDB
- MySQL / MySQL
- PostgreSQL / PostgreSQL
- RabbitMQ / RabbitMQ
- Redis / Redis
- RethinkDB / RethinkDB
- ScyllaDB / ScyllaDB
- JanusGraph / JanusGraph + Apache Tinkerpop + ScyllaDB
The Compose offerings are multi-node deployments of the service, not just the single-node instance you get when you download and run some software on your machine. They also add an access control layer on top of your service and add logging/monitoring underneath it. There’s an API, command-line tooling, and a web-based dashboard to control your service, plus other tools such as backup, restore, and version management.
The next time you’re perusing the Bluemix Catalog, remember: Many of these services are made on equipment that also processes open-source software and infrastructure tools, and may contain traces of all of the above.