It has been almost two months since we first released TRAINS. We knew this was an amazingly useful tool for running deep-learning experiments, but we still had knots in our stomachs as we prepared to step onto the public stage.
As it turns out, our audience had a very warm welcome in store: ever since the initial release, we have received daily feedback from an expanding community.
This feedback has not only confirmed the immediate value TRAINS provides right off the bat, but also shone a light on some key areas of interest in current AI model development, and on what can make a workflow more efficient.
So here’s a list of the main capabilities we’ve added to TRAINS (available here) based on our user feedback.
Jupyter Notebook support
TRAINS now fully supports Jupyter notebooks, whether on-premises or in the cloud. As we all know, version control and Jupyter notebooks do not mix, and we received numerous requests to help out.
Our solution: TRAINS now automagically converts your Jupyter notebook to Python. Yes, automatically: TRAINS runs in the background, and every checkpoint you create is converted to Python code and stored. On top of that, you can now compare not only the model parameters but also the Jupyter code that created them.
Improved Git Integration
TRAINS now logs not only your git repository and commit id, but also the current ‘git diff’!
This provides the ultimate solution to the all-too-familiar problem of forgetting to commit before training (and thus never being able to reproduce the experiment).
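As a rough sketch of what "logging the git state" involves, here is the kind of snapshot that makes an uncommitted run reproducible: the HEAD commit id plus the uncommitted diff. The function and field names below are our own illustration, not the actual TRAINS API.

```python
# Hypothetical sketch: capture the current commit id and uncommitted changes.
import subprocess

def git_state(repo_path="."):
    """Return the HEAD commit and uncommitted diff, or None values if the
    path is not a git repository (or git is not installed)."""
    def run(*args):
        try:
            out = subprocess.run(
                ["git", "-C", repo_path, *args],
                capture_output=True, text=True,
            )
        except FileNotFoundError:  # git binary not available
            return None
        return out.stdout if out.returncode == 0 else None

    return {"commit": run("rev-parse", "HEAD"), "diff": run("diff", "HEAD")}
```

Storing the diff alongside the commit id means the exact working-tree state can be reconstructed later with a checkout plus an apply of the saved patch.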
Tracking Python Packages
TRAINS now automatically tracks and logs the Python packages and versions you are using. Naturally, this is also part of the experiment comparison, so TRAINS can now tell you which model was trained with which TensorFlow version…
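For intuition, here is a minimal sketch of what per-experiment package tracking boils down to: snapshotting every installed distribution and its version, similar in spirit to `pip freeze`. This is an illustration, not the TRAINS implementation.

```python
# Hypothetical sketch: snapshot installed packages and versions.
from importlib import metadata

def package_versions():
    pkgs = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:  # skip distributions with malformed metadata
            pkgs[name] = dist.version
    return pkgs
```

Diffing two such snapshots is what lets an experiment-comparison view answer "which TensorFlow version trained this model?".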
Remotely Stopping Experiments
TRAINS now lets you stop an experiment running on any machine directly from the web UI: just right-click it and press abort.
Imagine a bash script running one training job after another. This feature lets you quickly decide that a model is "going nowhere fast" (a scientific term commonly used by deep-learning veterans) and abort the experiment, so the next one in the list gets some GPU time. By the way, this is our first step towards "DevOps-on-TRAINS", tailored for the AI/DL community.
Web UI Login
We were under the impression that everyone in the organization should be able to log in and look at all the beautiful graphs. Admittedly, we were wrong, so we added a configurable user/password for the TRAINS-server. Minor downside: now you have to talk to DevOps to get a user (worth it 😃).
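For reference, a sketch of what a fixed-users configuration might look like. The file path, key names, and the example user are assumptions based on the trains-server configuration format; check the server documentation for the exact syntax.

```
# Assumed location: /opt/trains/config/apiserver.conf
auth {
    fixed_users {
        enabled: true
        users: [
            {username: "jane", password: "change-me", name: "Jane Doe"}
        ]
    }
}
```

After editing the file, restart the server so the new credentials take effect.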
Because it seems the color selection for plots is one of your favorite features
We made sure you get a color palette for every experiment, but also made sure you can change it to whatever ghastly combination you want.
Machine Resource Monitoring
This is probably one of the features we are most proud of. TRAINS now automatically logs the resources on your machine and gives you colorful graphs, in real time! Here is what you can expect: per-GPU and CPU utilization, memory, and temperature, plus networking and I/O.
TRAINS server back-end
TRAINS-server deployment improvements, based on the feedback we received: there are now automatically updated AMI images, unified Docker images for standalone OS X & Linux installation, and, as one would expect, Kubernetes & Helm support.
XGBoost & scikit-learn Support
We have added support for XGBoost & scikit-learn (thanks, Reddit!).
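To show how little code the integration needs, here is a sketch: the two commented lines are the TRAINS hook-in (`Task.init` is the TRAINS entry point; the project and task names are our own placeholders), and everything else is ordinary scikit-learn code that the integration would log automatically.

```python
# Hypothetical hook-in (uncomment with TRAINS installed and a server configured):
# from trains import Task
# task = Task.init(project_name="examples", task_name="sklearn demo")

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Plain scikit-learn training; nothing here is TRAINS-specific.
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y)
print("train accuracy:", clf.score(X, y))
```

The design point is that the experiment script stays unchanged: the tracking layer wraps the run rather than forcing logging calls into the training code.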
We’ve found the community to be a vibrant, positive, and exceptionally skilled group, and we encourage everyone to share thoughts and ideas on how to improve TRAINS, or to report any overlooked bugs (it happens to the best of us) by posting an issue. We try to address all of them.
As TRAINS has been so well received so far, we are taking the extra step and are looking for companies and organizations to join our design-partner program. This means early access to unreleased features, a direct line to our product/system designers and, most importantly, a part in shaping the next versions of TRAINS.
✉️ Shoot us an email with “Design Partner” in the subject line.