Open Source: Project A Releases Its Data Warehouse Infrastructure
Berlin, October 11, 2018. The Operational VC Project A has decided to release its business intelligence infrastructure, which goes by the name Mara, to the public as open source software. The centerpiece of the software, essentially a library for integrating a business’s data into what is known as a data warehouse (DWH), is available on GitHub at github.com/mara.
Data warehouses are used to integrate data in an automated way into one central system, in order to build up the consistent data pool needed to run a data-driven business. Over the past several years, Project A has been working together with its portfolio companies to develop an infrastructure for setting up data warehouses.
By releasing its business intelligence infrastructure as open source software, Berlin-based Project A aims not only to strengthen collaboration with its former and existing portfolio companies, but also to encourage external companies to take their first steps toward adopting a data-driven approach and mindset.
“We strongly believe that data-drivenness is crucial for building and growing digital companies,”
says Florian Heinemann, Founding Partner at Project A.
“Hence we’ve been prioritizing this area since our early days and it’s exciting to share our solution now with the outside world to empower other companies to take the step into data management. We’re convinced that we are making a contribution to the digital ecosystem with this release and helping to create a more data-driven mindset within our industry and beyond.”
Mara was developed for companies that have chosen to build their data warehouse in-house with their own team of developers. One advantage of the software is that the infrastructure can be highly customized to the specific business model. Although common data aggregation tools or cloud solutions seem inexpensive in the short term, in-house development pays off in the long run: companies are able to react not only faster, but also more flexibly.
Following the growing trend of specifying data transformations in Python or SQL source code, Mara offers a Python framework for programming data integration processes in PostgreSQL, as well as a number of smaller libraries. Unlike “click-based” tools, Mara lets users follow modern software engineering best practices such as versioning and automated testing, and makes it possible for larger teams to work in parallel on one data warehouse.
The development of Mara has been led by Project A’s Chief Data Officer, Dr. Martin Loetzsch. While working with startups as well as established companies, Loetzsch recognized that the biggest challenge in setting up a data warehouse is not handling large amounts of data. For most companies, mastering the complexity of the data and creating transparency and consistency are much bigger issues.
As a consequence, many components in Mara have been kept as simple as possible, and a lot of energy has been put into the development of automated visualizations of data integration processes.
“It is quite rare for us not to understand how a piece of data was computed. This helps a lot in building up trust with those who are using the data,” added Dr. Loetzsch.
Remaining true to the concept of an Operational VC for nearly six years now, Project A pursues an approach characterized by knowledge sharing and data-driven business management. It actively brings this culture into its portfolio companies, which thus profit from the experience, mistakes and lessons learned of their peers. A strong dialogue between the startups and Project A’s data team has allowed for the continuous development of the Mara software, integrating the specific needs of various business models as well as changing technical requirements.
On October 12, Dr. Martin Loetzsch will present Mara at the first Project A Knowledge Conference in Berlin.