Open-source static analysis for Security in 2018 — Part 1: Python
I have a love-hate relationship with static analysis tools. I want to love them, and I use them until they break and I can’t deliver product, at which point I disable them. Since I currently have a rather broad job definition of “Security and Compliance”, I occasionally get questions about secure coding, review practices, and static/dynamic code analysis. I decided to revisit a few languages to see where the community stands from a practical perspective. Obviously this isn’t a substitute for rigorous vulnerability scans, integration tests, and pen-tests, as they cover different aspects of the software development lifecycle.
This post is going to cover 4 languages that comprise the bulk of my company’s important source code. That doesn’t mean we don’t have other things, but this should give a good overview of what’s out there for at least these languages. I’m finding that there’s a bit of convergence happening with static analysis. And while type checking/safety is making a resurgence, it’s still going to be a while before we’ll see static code checking become powerful.
These are the languages I’m covering, starting with Python.
The goal isn’t so much completeness as practicality. For each tool, I ask:
Accuracy: Does the tool produce so many false positives that the best thing an engineer can do is disable it?
Actionability: Is the tool giving me actionable suggestions or examples?
Integration: Can I use it with my existing tools and pipeline?
Maintainability: I don’t know the future, but is the tool being maintained right now? Or is it in a good enough state that I can maintain it if need be? YMMV here.
There is a curated list of static analysis tools for many more languages.
Python is an interesting language from a static analysis perspective: there’s an ast package in the standard library, but a lot of code can be changed at runtime. There’s the idea of keeping code “pythonic”, and frequent code reviews might catch these problems, but catching some of these issues earlier is still great.
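To make the AST point concrete, here is a minimal sketch of what a static check looks like with the standard-library ast module. The sample source, the RISKY_CALLS set and the find_risky_calls helper are all made up for illustration; this is not how any particular tool is implemented.

```python
import ast

# Hypothetical snippet to scan; any Python source string works here.
SOURCE = '''
def load(expr):
    return eval(expr)

def run(code):
    exec(code)

def safe(x):
    return len(x)
'''

RISKY_CALLS = {"eval", "exec"}  # illustrative choice, not an exhaustive list


def find_risky_calls(source):
    """Return (name, line_number) for each direct call to a risky builtin."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only plain-name calls such as eval(...) are matched. Aliased or
        # dynamically resolved calls slip through, which is the runtime
        # mutability problem in a nutshell.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in RISKY_CALLS:
                findings.append((node.func.id, node.lineno))
    return sorted(findings, key=lambda finding: finding[1])


if __name__ == "__main__":
    for name, line in find_risky_calls(SOURCE):
        print(f"line {line}: call to {name}()")
```

Note that something as simple as `e = eval` followed by `e("1")` is already invisible to this check, which is why static analysis in Python tends to complement code review rather than replace it.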
1. Check your requirements.txt
It’s 2018, and anyone writing Python is using virtualenv to develop and ship the code. This means your upstream dependencies are recorded in a requirements.txt. If you’re not doing this yet, please stop reading and follow instructions here to begin using virtualenv.
We use the safety package to check our requirements.txt, and to date it has found a lot of upstream vulnerabilities. Whenever possible, we stay with the latest stable versions of packages. I can’t recommend this approach enough. While it does add some pain, it avoids huge drift and surprises. Whenever I need to diverge from an upstream package, I know relatively quickly and I can do that.
Accuracy: Do you trust https://github.com/pyupio/safety-db ?
Actionable: This is great. Usually an upgraded package is already available on PyPI and we just upgrade.
Integration: Reports are exportable, and we just run something like this in our build. Any open vulnerability translates to a failed build and a report. This gives the engineer a clear idea of what to do next, without making them parse JSON :)
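safety check -r "$TOPDIR/requirements.txt" --full-report --json > "$TOPDIR/safety-report.json"
# Assumption: the JSON report is a top-level array with one entry per vulnerability
open_vulnerabilities=$(python -c 'import json, sys; print(len(json.load(open(sys.argv[1]))))' "$TOPDIR/safety-report.json")
if [[ $open_vulnerabilities -gt 0 ]]; then
  echo "$open_vulnerabilities open known vulnerabilities exist in packages, failing build"
  safety check -r "$TOPDIR/requirements.txt" --full-report
  exit 1
fi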
Maintainability: Well, if safety-db goes away, I don’t see why someone else or a foundation couldn’t own this. Also, why isn’t PyPI doing this?
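If you would rather post-process the JSON report in Python than in shell, a rough sketch might look like the following. It assumes the report is a JSON array with one entry per vulnerability, each entry itself an array that starts with the package name; verify that against the output of your safety version. The sample data, summarize and build_should_fail are hypothetical names for illustration.

```python
import json
from collections import Counter

# Assumed report shape: one array per vulnerability, laid out as
# [package, affected_spec, installed_version, advisory, vulnerability_id].
# The entries below are invented sample data, not real advisories.
SAMPLE_REPORT = json.dumps([
    ["examplepkg", "<1.2.0", "1.1.0", "Hypothetical advisory text.", "00001"],
    ["examplepkg", "<1.1.5", "1.1.0", "Another hypothetical advisory.", "00002"],
    ["otherpkg", "<2.0", "1.9", "Hypothetical advisory text.", "00003"],
])


def summarize(report_text):
    """Count open vulnerabilities per package from a JSON report string."""
    entries = json.loads(report_text)
    return Counter(entry[0] for entry in entries)


def build_should_fail(report_text):
    """Mirror the build check: fail when at least one vulnerability is listed."""
    return sum(summarize(report_text).values()) > 0


if __name__ == "__main__":
    for package, count in sorted(summarize(SAMPLE_REPORT).items()):
        print(f"{package}: {count} open vulnerabilities")
```

In a build you would read the report file instead of SAMPLE_REPORT and exit non-zero when build_should_fail returns True.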
2. Check your code
Ok, so nothing upstream has a known vulnerability, great. Now let’s run bandit on our code. This tool parses your Python code and allows you to select which types of errors to check for. If there are false positives, you can always add a # nosec comment to the offending line to ignore it.
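As a small illustration of the suppression comment, here is a sketch with one reviewed false positive and one security-sensitive function that is left unsuppressed. The function names and the scenario (MD5 used only as a cache key) are made up for the example.

```python
import hashlib


def cache_key(payload: bytes) -> str:
    """Derive a short cache key; MD5 is used for bucketing, not for security."""
    # bandit reports MD5 as a weak hash. After review, this use was judged a
    # false positive, so the trailing comment suppresses findings on this
    # line only.
    return hashlib.md5(payload).hexdigest()  # nosec


def password_digest(password: bytes, salt: bytes) -> str:
    """Security-sensitive hashing: no suppression, so bandit keeps checking it."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000).hex()


if __name__ == "__main__":
    print(cache_key(b"GET /index.html"))
```

Keeping a short justification next to each suppression makes it easier for a reviewer to tell a considered exception from a silenced warning.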
Accuracy: I found this relatively accurate in the code I tested
Actionable: It was very noisy. However, the tool can export in different formats and filter by severity with the -l flag (-l, -ll, -lll in increasing severity) and by confidence with the -i flag (-i, -ii, -iii in increasing confidence). So if you only want to see the most severe findings that the tool is also most confident about, you can run
bandit -lll -iii -r dir
Integration: You can choose the export format with the -f flag: csv, html, json, screen, txt or xml. This is very useful if you want to triage the findings first and, say, make some JIRA tickets.
Maintainability: Looks good to me. The project is managed by OpenStack, and it’s supposed to allow hooking in your own AST checks, so I can see more novel uses.
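For the triage step, a sketch along these lines could turn a JSON export into per-file lists that are easy to paste into tickets. The sample report is trimmed and invented; it only includes the keys this code reads (results, filename, line_number, test_id, issue_severity, issue_confidence, issue_text), and the triage function and its thresholds are my own illustration rather than anything bandit ships.

```python
import json
from collections import defaultdict

# Trimmed, hypothetical sample in the shape of a bandit JSON export.
SAMPLE = json.dumps({
    "results": [
        {"filename": "app/db.py", "line_number": 12, "test_id": "B608",
         "issue_severity": "MEDIUM", "issue_confidence": "LOW",
         "issue_text": "Possible SQL injection vector."},
        {"filename": "app/run.py", "line_number": 7, "test_id": "B602",
         "issue_severity": "HIGH", "issue_confidence": "HIGH",
         "issue_text": "subprocess call with shell=True identified."},
        {"filename": "app/util.py", "line_number": 3, "test_id": "B101",
         "issue_severity": "LOW", "issue_confidence": "HIGH",
         "issue_text": "Use of assert detected."},
    ]
})

RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def triage(report_text, min_severity="HIGH", min_confidence="HIGH"):
    """Group findings at or above both thresholds by file name."""
    grouped = defaultdict(list)
    for issue in json.loads(report_text).get("results", []):
        # Unknown levels rank as 0, so they are dropped by any threshold.
        severe = RANK.get(issue["issue_severity"], 0) >= RANK[min_severity]
        confident = RANK.get(issue["issue_confidence"], 0) >= RANK[min_confidence]
        if severe and confident:
            grouped[issue["filename"]].append(
                f'{issue["test_id"]} line {issue["line_number"]}: {issue["issue_text"]}'
            )
    return dict(grouped)


if __name__ == "__main__":
    for filename, findings in sorted(triage(SAMPLE).items()):
        print(filename)
        for finding in findings:
            print("  " + finding)
```

With the default thresholds this keeps only high-severity, high-confidence findings, similar in spirit to running with -lll -iii; lowering both to "LOW" returns everything in the report.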
Tomorrow, I’ll talk about the history and present state of static analysis in Java