Getting started with the CortexPlatform
Agility, flexibility and fast results are just a few of the demands that IT departments have faced in recent years. The CortexPlatform is a good place to start, especially for data analysis, data discovery, or data access tailored to specialist departments (for analysis or for their own applications).
This overview introduces the included applications, the APIs and the CortexDB. More detailed information can be found in the online documentation of Cortex AG. The free version can be downloaded from the manufacturer’s website after registration. Registering for the download also gives access to a version in the cloud (offered as an Amazon web service, with an application for importing data from CSV files and immediate access to the database).
The graphic shows that the CortexPlatform is divided into three basic layers: the CortexImplex for data import, the CortexDB as the database at the core of the platform, and the CortexUniplex as the application. The three layers complement each other, can be operated on different hardware, and can be supplemented with tools from other manufacturers if certain functions are not sufficient.
Each of the three layers also has its own API for developers. These APIs differ in what they allow, and the manufacturer can extend them on a project basis if required.
Short definition: CortexImplex, CortexDB, CortexUniplex
The CortexImplex is a Java application that imports CSV and XML files directly. A CortexDB can also serve as a data source. Its API is an abstract Java class (a so-called reader class) that can be used to access other sources as well (e.g. SQL databases).
The CortexDB can be accessed via API in several ways. Its API is currently being revised to make it much easier to use, so the CortexUniplex API is the recommended choice for now.
The CortexUniplex is the standard application for creating a data model and defining the associated parameters. The CortexDB itself is schemaless. The CortexUniplex, however, requires a schema in order to present data records to users in a “comprehensible” way and to provide its other functions.
As already mentioned, the CortexImplex is a data integration tool that imports data from different sources. By default, it can import CSV and XML files as well as data from another CortexDB.
An abstract Java class is available for extending the CortexImplex. Developers can use it to create a so-called “reader class” for reading out further data sources (e.g. SQL databases).
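The reader-class idea can be pictured with a small sketch. The real extension point is an abstract Java class whose name and methods are not given here, so the following Python code only illustrates the pattern: `SourceReader`, `read_records` and `SqliteReader` are hypothetical names, and an in-memory SQLite database stands in for "an SQL database".

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, Iterator, List


class SourceReader(ABC):
    """Hypothetical stand-in for the abstract reader class (not the real Java API)."""

    @abstractmethod
    def read_records(self) -> Iterator[Dict[str, str]]:
        """Yield one source record at a time as a field-name -> value mapping."""


class SqliteReader(SourceReader):
    """Example reader that pulls all rows from one SQL table."""

    def __init__(self, connection: sqlite3.Connection, table: str) -> None:
        self.connection = connection
        self.table = table

    def read_records(self) -> Iterator[Dict[str, str]]:
        # The table name is interpolated for brevity; validate it in real code.
        cursor = self.connection.execute(f"SELECT * FROM {self.table}")
        columns = [description[0] for description in cursor.description]
        for row in cursor:
            yield {name: str(value) for name, value in zip(columns, row)}


def demo_sql_reader() -> List[Dict[str, str]]:
    """Build a tiny in-memory table and read it back through the reader."""
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE persons (name TEXT, city TEXT)")
    connection.execute("INSERT INTO persons VALUES ('Ada', 'London')")
    return list(SqliteReader(connection, "persons").read_records())
```

The import logic only depends on the abstract interface, which is why a new data source needs nothing more than a new subclass.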
The basic import procedure requires a CortexUniplex configuration. The import uses this configuration to check field contents and to import the data records correctly.
Note: The CortexDB works independently of the configuration of the CortexUniplex and the CortexImplex. The above configuration is only necessary for using the CortexUniplex. An import can also take place without prior configuration. However, this makes very little sense in most cases.
The following procedure is a good way to get started:
- Configure fields and record types in the CortexUniplex
- Create the import configuration for the CortexImplex
- Test the import configuration with source data
- Create a database backup
- Run the data import
- Check the imported records
- If necessary, restore the database, adapt the configuration and re-import
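Testing the import configuration against source data, the third step above, can be pictured as a dry run that checks every value against the configured field type. The sketch below is an illustration only and does not use the CortexImplex configuration format: `FIELD_CHECKS`, `check_source` and the ISO date format are assumptions made for this example.

```python
import csv
import io
from datetime import datetime
from typing import Callable, Dict, List, Tuple


def _is_numeric(text: str) -> bool:
    try:
        float(text)
        return True
    except ValueError:
        return False


def _is_date(text: str) -> bool:
    try:
        datetime.strptime(text, "%Y-%m-%d")  # assumed date format
        return True
    except ValueError:
        return False


# Hypothetical mapping from field type to a content check.
FIELD_CHECKS: Dict[str, Callable[[str], bool]] = {
    "string": lambda text: True,
    "numeric": _is_numeric,
    "date": _is_date,
}


def check_source(csv_text: str, field_types: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Return (row number, field, value) for every value that fails its check.

    Fields that are missing from the configuration are reported as well.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        for field, value in row.items():
            kind = field_types.get(field)
            text = value if isinstance(value, str) else ""
            if kind is None or not FIELD_CHECKS[kind](text):
                problems.append((row_number, field, text))
    return problems
```

An empty result means the source file matches the field configuration. Any reported row is a reason to adapt the configuration before running the real import against a backed-up database.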
Note: The CortexImplex and the CortexDB each have parameters for optimizing the import process. The import configuration also offers a large number of functions for checking the source data during the import and for importing any changes that have been made. The online documentation for the CortexImplex describes these functions, including the formation of hash values and the delta import, which are the first steps toward speeding up the import.
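The general idea behind hash values and a delta import can be sketched as follows. How the CortexImplex actually forms its hash values is described in its online documentation and is not reproduced here; hashing the JSON form of a record with SHA-256, as well as the names `record_hash` and `delta_import`, are assumptions made for this illustration.

```python
import hashlib
import json
from typing import Dict, List


def record_hash(record: Dict[str, str]) -> str:
    """Hash a record in a canonical form so that field order does not matter."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def delta_import(source: Dict[str, Dict[str, str]], known_hashes: Dict[str, str]) -> List[str]:
    """Return the IDs of new or changed records and remember their hashes.

    `known_hashes` maps record ID -> hash from the previous import run.
    """
    changed = []
    for record_id, record in source.items():
        digest = record_hash(record)
        if known_hashes.get(record_id) != digest:
            changed.append(record_id)  # only these records need to be imported
            known_hashes[record_id] = digest
    return changed
```

On a repeated import of the same source, only records whose hash differs from the stored one are touched, which is where the speed-up comes from.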
The CortexDB is the heart of the CortexPlatform. As in any other database, records can be stored, read, modified and deleted. What sets the CortexDB apart is that it combines two different database paradigms and offers different functions for database modeling (“multi-model”).
Data records are stored in a schemaless format. Each record must therefore be understood as an independent object (a so-called container), and even similar records (persons, companies, articles, …) can be structured differently. Storage works as in a document store, but uses its own container format rather than the usual XML or JSON.
Unlike in other databases, no index needs to be configured. Whenever a data record is changed, every value and every field is mapped to the record ID in a multidimensional, universal key/value store as part of the same transaction. The temporal context is recorded as well (historical values, “slowly changing dimensions”). No further normalization is possible in this key/value structure, because each value and each field exists exactly once, without redundancy. The highest degree of normalization has thus been reached: every value and every field is atomic.
Only this key/value structure is used to search for data records, so no additional search technology (e.g. Elasticsearch or Lucene) is needed.
In principle, the CortexDB therefore offers a universal index scheme on top of schemaless storage: the index is used to find data records, and the records themselves are read from the schemaless storage (“document store”).
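The combination of schemaless containers and an automatically maintained key/value index can be pictured with a toy model. This is a conceptual sketch only, not the CortexDB storage engine: the class `TinyStore` and its methods are invented for illustration, and the temporal context (historical values) is left out.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


class TinyStore:
    """Toy model: schemaless containers plus an index that maintains itself."""

    def __init__(self) -> None:
        self.containers: Dict[int, Dict[str, str]] = {}
        # (field, value) -> IDs of all records that contain this pair
        self.index: Dict[Tuple[str, str], Set[int]] = defaultdict(set)

    def save(self, record_id: int, fields: Dict[str, str]) -> None:
        """Store a record and update the index in the same step."""
        for pair in self.containers.get(record_id, {}).items():
            self.index[pair].discard(record_id)  # drop outdated entries
        self.containers[record_id] = dict(fields)
        for pair in fields.items():
            self.index[pair].add(record_id)

    def find(self, field: str, value: str) -> List[Dict[str, str]]:
        """Look up record IDs in the index, then read the containers."""
        record_ids = sorted(self.index.get((field, value), ()))
        return [self.containers[record_id] for record_id in record_ids]
```

Two records with entirely different fields can be saved without any prior declaration, and both are immediately searchable by every field they contain.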
Any number of fields and any amount of information can be stored within the CortexDB container format. The CortexDB itself distinguishes only four field types. Everything beyond that is an interpretation by the CortexUniplex (which must also be taken into account in the CortexImplex) or a definition made during software development.
- “characters” < 220 bytes (only these are transferred to the key/value structure)
- “characters” > 220 bytes (so-called multi-line text fields; “plain text”)
- binary large objects (blob)
- JSON objects
A data record can be up to 4 GB in size; a single field (JSON, blob or plain text) can also be up to 4 GB.
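The distinction between the field kinds can be sketched as a simple classification by Python type and byte length. This is an illustration under assumptions: the text does not say how the CortexDB counts bytes or how it treats a value of exactly 220 bytes, so the sketch assumes UTF-8 and counts 220 bytes as plain text. The names `classify_field` and `is_indexed` are hypothetical.

```python
from typing import Any

SHORT_LIMIT_BYTES = 220  # threshold named in the text; boundary handling is assumed


def classify_field(value: Any) -> str:
    """Map a Python value to one of the field kinds described above."""
    if isinstance(value, (bytes, bytearray)):
        return "blob"
    if isinstance(value, (dict, list)):
        return "json"
    if isinstance(value, str):
        size = len(value.encode("utf-8"))  # assumed encoding
        return "characters" if size < SHORT_LIMIT_BYTES else "plain text"
    raise TypeError(f"unsupported value type: {type(value).__name__}")


def is_indexed(value: Any) -> bool:
    """Only short character fields are transferred to the key/value structure."""
    return classify_field(value) == "characters"
```

Note that the limit is given in bytes, not characters, so text with multi-byte characters reaches it sooner than its length suggests.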
The CortexUniplex can serve as a first, temporary or permanent tool for users and developers, for data evaluation, or for the closer analysis and use of data (“data science”).
The flexibility of the CortexDB in storing data records is reflected in various functions of the CortexUniplex. The CortexUniplex API can also be used to develop individual applications, and it takes the complete CortexUniplex configuration into account. Developers can therefore use rights and roles for accessing data records and fields, as well as individual functions (lists, search options, …), directly without having to redevelop them. In addition, preconfigured selections and lists can be used directly as JSON objects.
The CortexUniplex is a web application that runs in current browsers. It is stored within the CortexDB and served by the supplied web server (httpd). The CortexUniplex must therefore be imported into the database and does not require any file storage on the server. It can be extended with custom functions, which can in turn use the CortexUniplex API.
To save data records and to display and navigate them easily, the permitted fields and data record types must be configured. Each field can also be assigned a field type (string, numeric, date, …). The CortexUniplex thus ensures that information is stored consistently in the CortexDB and defines a concept for storing and processing data records (a “database schema”).
In contrast to the CortexDB, the CortexUniplex uses a dynamic schema for data records. This means that the schema defines the maximum extent of a data record, and user rights further restrict which of this information can be used. Further fields can be added at any time; whether an existing field can be changed depends on how it is used in existing data records.
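One way to picture such a dynamic schema is as an upper bound on the fields a record may carry, combined with a per-role filter. The sketch below follows that reading and is not the CortexUniplex configuration model: `RecordType`, `validate`, `add_field` and `visible_fields` are hypothetical names.

```python
from typing import Dict, Iterable, List, Set


class RecordType:
    """Hypothetical record type: the set of fields a record may contain."""

    def __init__(self, name: str, fields: Iterable[str]) -> None:
        self.name = name
        self.fields: Set[str] = set(fields)

    def add_field(self, field: str) -> None:
        """Extend the schema at any time; existing records stay valid."""
        self.fields.add(field)

    def validate(self, record: Dict[str, str]) -> List[str]:
        """Return the field names the schema does not permit (empty if valid)."""
        return sorted(set(record) - self.fields)


def visible_fields(record: Dict[str, str], record_type: RecordType,
                   allowed_for_role: Set[str]) -> Dict[str, str]:
    """Restrict a record to the fields that are both permitted and visible to the role."""
    return {
        field: value
        for field, value in record.items()
        if field in record_type.fields and field in allowed_for_role
    }
```

A record does not have to fill every permitted field, which is what keeps the schema an upper bound rather than a fixed layout.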
The CortexUniplex can therefore be regarded as a universal application that provides a uniform database interface for different departments.
The interaction of data import (CortexImplex), database (CortexDB), universal application (CortexUniplex) and all associated APIs makes it very easy to use the CortexPlatform for a broad range of purposes. The CortexDB can be used on its own or together with the other tools via the API, and can be extended through individual development.