The next major RDKit version, v2020.03, should be released next week. Once the release is out I’ll do a couple of posts on the RDKit blog about some of the new features, but I thought it would be worth doing a quick post here beforehand to describe some new testing that I’ve started doing.

Be forewarned that this post is also a bit of light advertising for the RDKit support contracts I offer through T5 Informatics.

The RDKit and testing

The RDKit is pretty well tested. We have automated tests for:

  • the C++ code (really good coverage)
  • the Python wrappers (really good coverage)
  • the pure Python code (really good…


I really enjoyed reading this article in Nature about uses of Slack in science. I think Slack is an awesome tool and have found it really useful as an additional way for the RDKit community to stay in touch. It’s really nice to have a real-time(ish) communications channel that is searchable, allows file uploads, etc. Oh, and the GitHub integration is also pretty useful. So back to the Nature article… it’s got a nice overview of some of the ways that labs/teams use Slack to do their research. There is, however, at least one missing: taking advantage of Slack’s open API to add some chemistry capabilities. One can hardly blame the author though: this is an experiment I started earlier this year, but I’ve only ever really talked about it on Slack itself and at the RDKit UGM this year. …


Note: as usual, I’m looking for feedback on this document. Please get in touch if you have thoughts, tips, or concerns.

Background

Over the years that the RDKit has been around, I have placed a strong emphasis on maintaining backwards compatibility. Things have been added to the API, but not much has been changed or taken away. With a few exceptions, code that worked 10 years ago will still (modulo bug fixes) work today. This approach has some obvious benefits for RDKit users and those who build software on top of it, but there is a cost: we’re stuck with the consequences of bad (or, perhaps, poorly thought through?) …


Note: This document has been cooking for a while and I’ve circulated it to smaller groups in a couple of different forms; it’s time to get it out there and start moving.

The topic is reasonably technical, but there is an important practical implication: the change discussed here will break compatibility with older C++ compilers. This should not be a problem for people using binary distributions of the RDKit (i.e. conda packages or packages installed by the operating system’s package manager), but it may introduce problems for people who build the RDKit themselves on older systems.

Background

It’s time to start “allowing” the use of modern C++ (by which I mean C++14) in the RDKit. I think this is an important step both for code quality in the toolkit itself and for allowing us (the developers) to continue to learn and use modern tools. Who knows, it may even help with performance. …
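To give a flavor of what “modern C++” means in practice, here is a minimal sketch using three features that arrived with C++14: return type deduction for ordinary functions, `std::make_unique`, and generic lambdas. The `Atom` struct and the function names are hypothetical and made up purely for illustration; none of this is RDKit code. A compiler that only supports C++11 will reject it, which is exactly the kind of compatibility break described above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type used only for illustration; not part of the RDKit API.
struct Atom {
  int atomicNum;
  std::string symbol;
};

// C++14: the return type of an ordinary function can be deduced with `auto`.
auto makeAtom(int atomicNum, std::string symbol) {
  // C++14: std::make_unique (C++11 only provided std::make_shared).
  return std::make_unique<Atom>(Atom{atomicNum, std::move(symbol)});
}

// Counts atoms heavier than hydrogen.
int countHeavyAtoms(const std::vector<std::unique_ptr<Atom>> &atoms) {
  // C++14: a generic lambda, i.e. a lambda with an `auto` parameter.
  auto isHeavy = [](const auto &atom) { return atom->atomicNum > 1; };
  int count = 0;
  for (const auto &atom : atoms) {
    if (isHeavy(atom)) {
      ++count;
    }
  }
  return count;
}

// Small self-check tying the pieces together.
void selfCheck() {
  std::vector<std::unique_ptr<Atom>> atoms;
  atoms.push_back(makeAtom(6, "C"));
  atoms.push_back(makeAtom(1, "H"));
  atoms.push_back(makeAtom(8, "O"));
  assert(countHeavyAtoms(atoms) == 2);
}
```

With GCC or Clang this should build with `-std=c++14` (or any later standard); passing `-std=c++11` instead is a quick way to see what an older toolchain would make of it.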


Obligatory apologetic beginning: it’s been too long since I wrote one of these.

The first RDKit UGM, around five years ago in London, had a number of interesting and important outputs. One of the most important ones for me was a change in the frequency of RDKit releases. During one of the roundtables I asked what I could do to better support the community. I wasn’t completely sure what to expect when I asked this (and, to be honest, I still wasn’t completely sold on the notion that having a community was a great idea, but that’s another story), but I certainly did not expect: “Could you please do releases less frequently? We can’t keep up with you.” Back then I used to do a release every quarter. It was a non-zero amount of extra work for me, but it seemed like a reasonable rhythm to maintain visible forward progress. Being asked to help out by doing less (not particularly pleasant) work was an easy request for me to take on. I switched immediately and since then I’ve done two releases a year: one in September and one in March. …


In addition to doing standard support/maintenance contracts for the RDKit, I believe that T5 Informatics could also generate revenue by helping people make effective (or more effective) use of the RDKit. We have a strong motivation to create happy, productive RDKit users. In order to do this we will offer services to help install the RDKit and integrate it into customers’ systems. We’ll also have a variety of training modules on offer to help get local users familiar with how to use the RDKit to accomplish common cheminformatics tasks. …


One of the ways that I think T5 Informatics is going to generate revenue is by offering support contracts for the RDKit. I say “I think” because that’s still just a hypothesis. This post is part of starting to test that hypothesis.

Support for the RDKit is currently provided via the mailing lists, GitHub, and, to a lesser extent, the Slack channel. Oh, and via email to me directly. I tend to direct people who send me personal email to the mailing lists so that other members of the community have a chance to answer and so that both the question and the answer end up being publicly findable via a search, but I do answer the occasional direct question, particularly those coming from companies. This system isn’t perfect, but it does work reasonably well thanks to the helpfulness and openness of the RDKit community (we have a great community). It can, however, make it difficult for organizations to report problems or ask questions: often they need to invest effort in making sample structures/inputs generic before asking. I believe that the lack of formal support offerings can also be a barrier to adoption of the RDKit in some organizations. …

Greg Landrum
