Palantir and Open-Source Software
Palantir is a substantial user of and contributor to open-source software (OSS). Our software and internal tools are built around OSS databases like Cassandra and Postgres, data processing frameworks like Apache Spark, RPC libraries like OkHttp and Jersey, and frontend libraries like React. All of our developers use tools like Gradle or Yarn on a daily basis. This post explores why and how Palantir contributes to OSS, from Gradle plugins, to fixes for established projects, to ecosystem-level evolution like Spark-on-Kubernetes.
As a commercial software company, we must decide on a case-by-case basis which parts of our software we contribute to the OSS community, and which parts we keep in closed-source repositories. Depending on the type of software and the nature of the contribution, we have different motivations for pursuing OSS contributions:
Improvements and bug fixes. We push all bug fixes and performance improvements directly to the respective OSS repositories. This is a no-brainer win-win: we don’t have to maintain internal versions that deviate from upstream, and the entire community benefits from the fixes. Examples include a fix for a connection leak in the Feign RPC library or a performance tweak for Cassandra compactions.
Evolution. Every so often, the OSS constellations align and create space for major evolutions in some software field. We pursue such opportunities out of strategic interest: shaping a software ecosystem from its infancy allows us to push for the features and implementations we believe are important. This is precisely where our investment in Spark-on-Kubernetes originated: as heavy users of both Spark and Kubernetes, we prototyped the feature set that allowed our customers to run Spark workloads on Kubernetes clusters. This functionality is now available to the entire Spark community and Palantir is involved in its ongoing development.
Developer tooling. We would gain no competitive advantage from keeping developer tooling closed-source, and the overhead of maintaining it on public GitHub instead of internal infrastructure is effectively zero. Open-sourcing is then simply the right thing to do. Many of our developers use our OSS projects to develop their public reputation as software developers, and many enjoy the learning and growth opportunities associated with interacting with an external developer community. One such project, tslint, is today the standard linter for the TypeScript community.
The following is a shortlist of representative Palantir OSS projects:
- Spark-on-Kubernetes: Kubernetes scheduler back-end for Apache Spark, now merged into mainline Spark
- tslint: extensible linter for the TypeScript language
- gödel: build system for Golang (blog post)
- Python Language Server: implementation of the Language Server Protocol for Python
- AtlasDB: transactional distributed database
- Blueprint: React-based UI toolkit for the web (blog post)
- Plottable: library of modular chart components built on D3
- gradle-baseline: set of Gradle plugins that configure default code quality tools for Java developers
- gradle-processors: Gradle plugin for integrating Java annotation processors
- gradle-docker: Gradle plugin for orchestrating Docker builds and pushes
- docker-compose-rule: JUnit rule to manage Docker containers using docker-compose
We are currently working hard to get Conjure, our HTTP+JSON RPC framework, ready for open-sourcing. Conjure allows developers to define APIs in a YAML DSL and generates client and server stubs across backend and frontend languages like Java, TypeScript, or Go. We’re aiming for a first release by the end of 2018. Stay tuned!
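To give a feel for the define-once, generate-stubs idea, here is a minimal TypeScript sketch. It is not Conjure's actual schema or generator: the `ServiceDef` and `EndpointDef` shapes, the `generateClientInterface` function, and the `RecipeService` example are all hypothetical names invented for illustration. A plain object stands in for a parsed YAML definition.

```typescript
// Hypothetical, simplified API description. This is NOT Conjure's real
// schema; it stands in for what a parsed YAML definition could look like.
interface EndpointDef {
  name: string;                  // method name in the generated stub
  http: string;                  // HTTP verb and path, e.g. "GET /recipes/{id}"
  args: Record<string, string>;  // argument name -> type name
  returns: string;               // return type name
}

interface ServiceDef {
  name: string;
  endpoints: EndpointDef[];
}

// Emit the source text of a TypeScript client interface for one service.
function generateClientInterface(service: ServiceDef): string {
  const methods = service.endpoints.map((endpoint) => {
    const params = Object.keys(endpoint.args)
      .map((argName) => `${argName}: ${endpoint.args[argName]}`)
      .join(", ");
    return (
      `  /** ${endpoint.http} */\n` +
      `  ${endpoint.name}(${params}): Promise<${endpoint.returns}>;`
    );
  });
  return [`export interface ${service.name} {`, ...methods, "}"].join("\n");
}

// Illustrative service definition (assumed, not taken from any real API).
const recipeService: ServiceDef = {
  name: "RecipeService",
  endpoints: [
    {
      name: "getRecipe",
      http: "GET /recipes/{id}",
      args: { id: "string" },
      returns: "Recipe",
    },
  ],
};

const generatedStub = generateClientInterface(recipeService);
console.log(generatedStub);
```

The same definition could feed generators for other target languages, which is the appeal of keeping the API description language-neutral.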
Palantir’s OSS projects are available on GitHub at https://github.com/palantir. We love getting feedback from the open-source community, and are always happy to discuss and merge contributions!
Authors: Robert F.