MATLAB is extremely popular in academia, partly because it is approachable even for researchers with little experience in software development. As a result, many MATLAB users may not be aware of some of the best practices commonly used in developing and maintaining reusable software. One such important practice is package management.
What is package management, and why would I need it?
Package management is the idea that if some part of the code you are writing is needed in more than one project, you should move it into a separate package and use that package in every project that needs it. The main goal is to avoid maintaining a separate copy of the same code in each project that uses it, which is both inconvenient and error-prone.
Consider the example of fixing bugs. Let’s say you discover a small bug in some code that is repeated in several projects. How do you approach fixing it? You would first need to fix the code in at least one of the projects. Once you have the fixed code, you probably also want to apply the fix to all other projects that have the issue. But how would you find all of them? You may have tens of projects that use the same code or a similar pattern of code, but you may not easily remember all of them. For each project that you suspect may have the bug, you would need to manually check the code, find the part(s) that have the bug, apply the fix, test to make sure the code still works, and reproduce any results that may change due to the fix. You may even have shared your code with others. You would need to somehow let them know about the bug and how to fix it… This is a nightmare!
Enter package management! Imagine having one central version of your code to which you apply the fix, after which every project that uses that code, both yours and everyone else’s, has access to the latest debugged version. You don’t necessarily want the code in all of these projects to change to the new version automatically, but you would like everyone to be able to make the switch with a single command, without having to search through the code and copy and paste things around. That is what package management does. For each project, the package manager keeps track of all the pieces of code that the project depends on (its dependencies). It also keeps track of the exact version of each dependency that the project uses at all times, so that you have a record of when the buggy version was replaced with an updated one.
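To make the bookkeeping concrete, here is a minimal sketch of the idea, written in Python purely for illustration. It is not how PackMan or any real package manager is implemented, and every name in it (`Dependency`, `Project`, `signal-utils`, the example URL, the version numbers) is hypothetical. It only shows the two things described above: each project records the exact version of each dependency it uses, and moving to a fixed version is one explicit step per project that leaves a record behind.

```python
from dataclasses import dataclass, field


@dataclass
class Dependency:
    """One piece of shared code a project relies on (all names hypothetical)."""
    name: str
    source: str    # where the single central copy lives
    version: str   # exact version the project currently uses
    history: list = field(default_factory=list)  # versions used previously

    def update(self, new_version: str) -> None:
        """Switch to a new version, keeping a record of the one it replaces."""
        if new_version != self.version:
            self.history.append(self.version)
            self.version = new_version


@dataclass
class Project:
    name: str
    dependencies: dict = field(default_factory=dict)

    def add(self, dep: Dependency) -> None:
        self.dependencies[dep.name] = dep

    def update(self, dep_name: str, new_version: str) -> None:
        """The 'single command' that moves this project to a fixed version."""
        self.dependencies[dep_name].update(new_version)


# Two projects depend on the same central package, pinned at the buggy version.
paper = Project("paper-figures")
thesis = Project("thesis-analysis")
for project in (paper, thesis):
    project.add(
        Dependency("signal-utils", "https://example.com/signal-utils.git", "1.0.0")
    )

# The bug is fixed once, centrally, and released as 1.0.1.
# Each project opts in explicitly; nothing changes behind its back.
paper.update("signal-utils", "1.0.1")

print(paper.dependencies["signal-utils"].version)   # 1.0.1
print(paper.dependencies["signal-utils"].history)   # ['1.0.0']
print(thesis.dependencies["signal-utils"].version)  # 1.0.0 (not yet updated)
```

Note that the second project stays on the old version until someone updates it on purpose, which is exactly the behavior you want when results need to remain reproducible.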
Some programming languages have great package managers and incredibly vibrant package ecosystems: Python has pip and Node.js has npm. These package managers provide some invaluable functionality for developing and maintaining code in those languages. PackMan is an attempt at bringing some of their key functionalities to MATLAB.
Two essential capabilities are required in any package management solution:
- The ability to install packages in a project from the network (e.g. the internet)
- The ability to keep track of the exact version of the package code that is installed, and to install that exact same version later on
PackMan provides these for MATLAB. It was originally forked from a brilliant open-source project called depmat by Tom Doel, which already had the first capability. PackMan added the second capability and some other critical features, including graceful handling of recursive dependencies, clean addition to and removal from the MATLAB path, and an automated update script.
To rely on a package manager for your dependencies, you need some assurance that it will still be around years from now. PackMan is designed with that in mind: it requires no infrastructure of its own. It doesn’t even have a single server to keep the list of packages and their versions. It runs completely locally on your computer.
The key idea that enables PackMan to provide package management with no infrastructure is very simple: rely on well-established source control services such as GitHub to host all versions of packages. As a result, as long as your Git source control service (such as GitHub) is around, your package manager will be around too.
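The reason a Git host can stand in for a package server is that Git already answers both questions a package manager needs answered: where the code is (the repository URL) and which exact version is meant (a commit hash). The sketch below, again in Python and again only an illustration rather than PackMan’s actual code, builds the two standard Git commands that fetch a repository and check out one specific commit. The function names, the URL, the commit hash, and the destination folder are all placeholders. The `run` argument is injectable so that the example can be tried as a dry run without network access.

```python
import subprocess


def install_commands(repo_url: str, commit: str, dest: str) -> list:
    """Return the Git commands that place one exact version of a package in dest."""
    return [
        ["git", "clone", repo_url, dest],
        ["git", "-C", dest, "checkout", "--detach", commit],
    ]


def install(repo_url: str, commit: str, dest: str, run=subprocess.run) -> None:
    """Execute the commands in order; pass a different 'run' for a dry run."""
    for cmd in install_commands(repo_url, commit, dest):
        run(cmd, check=True)


# Dry run: record the commands instead of executing them.
recorded = []
install(
    "https://example.com/user/signal-utils.git",  # placeholder repository URL
    "0123abc",                                    # placeholder commit hash
    "deps/signal-utils",                          # placeholder destination folder
    run=lambda cmd, check: recorded.append(cmd),
)
for cmd in recorded:
    print(" ".join(cmd))
```

Calling `install` without the `run` argument would execute the same commands for real through `subprocess.run`, provided Git is installed and the repository is reachable.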
Using PackMan to add packages to your MATLAB projects and to update them is quite easy, and is explained on the GitHub page for PackMan. Briefly, you will need to do two things:
- Copy a file called “installDeps.m” to your project.
- Create a file called “package.json” and add to it the link to each package that you want to install.
Making packages that can be installed in other projects via PackMan is even easier. Any code that is tracked with Git (including any repository on GitHub that you have access to) can be installed via PackMan. You don’t need to do anything special to make the code available: as long as you have access to the Git repository, whether public or private, PackMan can install it.
This is just a starting point. At this stage PackMan fits into my workflow, and I have been using it in all my MATLAB projects for over a year now. It may, however, lack features needed for other use cases, have bugs, etc. So please don’t hesitate to get involved with the development on GitHub. I would also be eager to hear any other comments about it.