OpenPaaS NG research project: 2nd year

OpenPaaS is a collaborative platform, developed with love and ethics by Linagora. The OpenPaaS developers also take part in an ambitious research project named “OpenPaaS:NG”. We work with other awesome Open Source companies, Nexedi and xWiki, and great research laboratories, Polytechnique and Loria. We have just completed the first two years of this four-year project, so I guess it’s time for a short overview of what we have achieved and of the next steps.

What is this project about?

The goal of the project is to bring a ready-to-use collaboration platform to the community. The project is split into Work Packages, each of which addresses a specific topic. In brief, the objectives are:

  • to provide a secure and scalable peer-to-peer platform, thus enabling collaboration in serverless architectures
  • to bring to the community a collaborative office editor
  • to provide an AppStore inside the platform
  • to provide a better video conferencing experience by analysing what participants say and acting on it

Needless to say, this is an ambitious project! Our long-term goal is to offer a great Open Source alternative to products like Microsoft Office365 and Google G-Suite.

Decentralized collaboration through P2P

This is a hot topic. Besides the protocol-level challenges related to the number of connected peers, and the security concerns that come with them, collaborating also means, from a more technical point of view, synchronizing data between peers. Without a central point of authority, namely the server, it’s hard to reconcile the changes made by different peers.

We designed and implemented two approaches to data synchronization.

The first one relies on formal research and is called the LogootSplit CRDT algorithm. Gérald Oster, a member of the Loria lab, designed it, and we put our code where our mouth is. We want to create rich JSON objects that look just like classic JSON objects, but are synchronized between all peers.
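To give a feel for what “synchronized without a server” means, here is a minimal sketch of a much simpler CRDT than LogootSplit: a last-writer-wins map. It is not the algorithm used in the project (LogootSplit is a sequence CRDT aimed at text editing); the class name LWWMap and all of its methods are hypothetical and only illustrate the convergence idea on a JSON-like object.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

Stamp = Tuple[int, str]  # (logical clock, replica id); the id breaks ties


@dataclass
class LWWMap:
    """Hypothetical last-writer-wins map; NOT LogootSplit, just an illustration."""
    replica_id: str
    clock: int = 0
    entries: Dict[str, Tuple[Stamp, Any]] = field(default_factory=dict)

    def set(self, key: str, value: Any) -> None:
        # Every local write gets a fresh, strictly increasing stamp.
        self.clock += 1
        self.entries[key] = ((self.clock, self.replica_id), value)

    def merge(self, other: "LWWMap") -> None:
        # Keep, for each key, the entry with the highest stamp.
        # Taking a maximum is commutative, associative and idempotent,
        # so peers converge whatever the order of the merges.
        for key, (stamp, value) in other.entries.items():
            if key not in self.entries or stamp > self.entries[key][0]:
                self.entries[key] = (stamp, value)
        self.clock = max(self.clock, other.clock)

    def to_json(self) -> Dict[str, Any]:
        # Strip the stamps: what is left looks like a classic JSON object.
        return {key: value for key, (_, value) in self.entries.items()}


# Two peers edit the same object concurrently, then exchange their state.
alice = LWWMap("alice")
bob = LWWMap("bob")
alice.set("title", "Draft")
bob.set("title", "Minutes")
bob.set("owner", "bob")

alice.merge(bob)
bob.merge(alice)
converged = alice.to_json() == bob.to_json()
```

Both peers end up with the same object even though nobody arbitrated between the two concurrent writes to “title”. The price of this simple scheme is that one of the two writes is silently dropped, which is exactly what a sequence CRDT tries to avoid for collaborative text.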

The second one is called the DOM Diff/Patch algorithm, and the underlying promise is wonderful: being able to synchronize the web! The web being a collection of DOM trees, if we are capable of synchronizing a DOM tree, then… This protocol relies on a clever algorithm to detect changes and create the diff, on applying the patches received, and on heuristics to deal with browser-specific DOM handling. Once again, we delivered the code. Kudos to James Caleb De Lisle and other xWiki members for making this happen, as well as the side project Cryptpad, a collaborative, zero-knowledge pad.
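The diff-then-patch idea can be sketched on a toy tree. This is a deliberately naive illustration, not the project’s algorithm: nodes are plain dictionaries rather than real DOM nodes, children are compared position by position, and there are none of the browser heuristics mentioned above. The helper names (node, diff, apply_patches) and the patch format are assumptions made up for this example.

```python
import copy


def node(tag, text="", children=None):
    # Toy stand-in for a DOM element.
    return {"tag": tag, "text": text, "children": children or []}


def diff(old, new, path=()):
    """Return a list of (operation, path, payload) turning `old` into `new`."""
    if old["tag"] != new["tag"]:
        # Different element type: replace the whole subtree.
        return [("replace", path, copy.deepcopy(new))]
    patches = []
    if old["text"] != new["text"]:
        patches.append(("set_text", path, new["text"]))
    common = min(len(old["children"]), len(new["children"]))
    for i in range(common):
        patches.extend(diff(old["children"][i], new["children"][i], path + (i,)))
    for child in new["children"][common:]:
        patches.append(("append", path, copy.deepcopy(child)))
    # Remove surplus children from the end so earlier indices stay valid.
    for i in range(len(old["children"]) - 1, common - 1, -1):
        patches.append(("remove", path + (i,), None))
    return patches


def apply_patches(tree, patches):
    """Apply patches to a copy of `tree` and return the patched copy."""
    root = copy.deepcopy(tree)
    for op, path, payload in patches:
        if op == "replace" and not path:
            root = copy.deepcopy(payload)
        elif op in ("replace", "remove"):
            parent = root
            for i in path[:-1]:
                parent = parent["children"][i]
            if op == "replace":
                parent["children"][path[-1]] = copy.deepcopy(payload)
            else:
                del parent["children"][path[-1]]
        else:
            target = root
            for i in path:
                target = target["children"][i]
            if op == "set_text":
                target["text"] = payload
            else:  # "append"
                target["children"].append(copy.deepcopy(payload))
    return root


old = node("ul", children=[node("li", "one"), node("li", "two")])
new = node("ul", children=[node("li", "one"), node("li", "2"), node("li", "three")])

patches = diff(old, new)            # what one peer would send
patched = apply_patches(old, patches)  # what the other peer would compute
```

Only the patches travel between peers, not the whole tree. A real implementation would also have to cope with concurrent patches and with moved or reordered nodes, which this positional comparison handles poorly.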

Can I haz .doc?

One of the greatest features of Office365 or G-Suite is the ability to work with Office documents: Text Documents, Spreadsheets, and Presentations. This is also where the Open Source alternatives fall short: the development cost of such tools is huge, not to mention the file format conversion systems. Fortunately, a company named Ascensio systems took the bet and offers an Open Source Office suite. We decided to plug it into OpenPaaS, to provide the added value of the platform (real-time collaboration, file indexing and search, …). Right now, users can import, export, open, save, and share Office files through OpenPaaS. We intend to push the integration much further.

A platform Application Marketplace

Like any platform, we are modular and the PaaS can be enriched. Moreover, it’s entirely possible to deploy SaaS on top of it. Hence the need for some kind of App Store, where the community can publish specific apps. Well… an App Store data model has a lot in common with the model of an e-commerce system. Luckily, we have Nexedi on the team, which specializes in this kind of business with its Enterprise Resource Planning system, ERP5. Working with them, we decided to start both from the technical side, by designing the API based on business workflows, and from the business side, by defining what OpenPaaS appstore invoices will look like (even if it’s free :-). We looked at software versions and persistence, update strategies, as well as typical business cases. Now we have a better view of the things we need to build.

An augmented video-conferencing experience

Video-conferencing is still a hot topic. Everyone is convinced that this is now a commodity, because, you know, every smartphone or laptop now has an integrated camera and microphone (Hello NSA! :-). However, the overall experience of meeting remotely is still really poor. I’ll skip the infrastructure issues (poor microphone, or poor network bandwidth), and focus on the added value that an enterprise platform could bring. At Linagora, we have a dream that, in the future, no one will ever have to write meeting minutes again, because the software will understand what is being said during the meeting and write it down for us. Well, that is exactly what we are building right now.

We used the Kaldi Speech Recognition Toolkit as a starting point. We hired some trainees to build a French corpus. We then worked with the LIX researchers to plug in their awesome keyword extractor and summarization engine. We ended up with a first proof of concept of a “Hubl.in on steroids” system, which analyzes the attendees’ speech, offers external resources related to the current conversation topics, and finally builds a summary of the meeting!

Conclusion

Software development is a huge area. Our team works on delivering software built with current technologies, as well as on inventing the technologies that will power tomorrow’s applications and bring new kinds of features to end users. The challenges are hard and exciting, the partners are nice and highly motivated, and we’re on track to deliver a great experience to users! Let’s see where this project leads us in two years.