Modern Automation: Git PLC & SCADA Backup
Will Git work as a backup tool for PLC and SCADA?
In my last few stories I’ve been exploring the feasibility of using Git as a backup tool for PLC and SCADA configurations, rather than the traditional method of using ZIP files. Some initial testing convinced me that Git was a feasible solution, but it wasn’t a comprehensive test. I needed to experiment with a decent sample set of modified PLC and SCADA projects. This story examines the results of those experiments, specifically the storage efficiency of Git versus ZIP.
The theory was that Git would use storage more efficiently than ZIP for a sample set of 30 commits. This would be tested by making a series of project changes, each of which would be committed to Git individually. In parallel, a new ZIP file of the entire project would be made to back up each change. The series of changes would include 10 file additions, 10 file modifications and 10 file deletions. Separate tests would be conducted for Siemens PLC and Citect SCADA project types. The efficiency of each backup method would then be assessed based on the total storage space used.
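To make the procedure concrete, here is a minimal Python sketch of one iteration of the test: commit the change to Git, write a full ZIP alongside it, then measure both stores. The function names, the `backup_NNN.zip` naming and the choice to measure the `.git` folder on disk are my own assumptions for illustration, not the exact steps used in the experiment. It also assumes `git` is on the PATH, the project folder is already a Git repo, and the ZIP folder lives outside the project.

```python
import subprocess
import zipfile
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total size in bytes of all regular files under path."""
    return sum(p.stat().st_size for p in path.rglob("*") if p.is_file())


def zip_snapshot(project: Path, out_dir: Path, label: str) -> Path:
    """Write a full ZIP of the project (skipping .git) and return its path."""
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{label}.zip"
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zf:
        for p in sorted(project.rglob("*")):
            rel = p.relative_to(project)
            if p.is_file() and ".git" not in rel.parts:
                zf.write(p, rel.as_posix())
    return target


def git_commit(project: Path, message: str) -> None:
    """Stage every change (additions, modifications, deletions) and commit."""
    subprocess.run(["git", "add", "-A"], cwd=project, check=True)
    subprocess.run(["git", "commit", "-m", message], cwd=project, check=True)


def record_change(project: Path, zip_dir: Path, n: int) -> tuple[int, int]:
    """Back up change number n both ways; return (git_bytes, zip_bytes)."""
    git_commit(project, f"change {n}")
    zip_snapshot(project, zip_dir, f"backup_{n:03d}")
    return dir_size(project / ".git"), dir_size(zip_dir)
```

Calling `record_change(project, zip_dir, n)` after each of the 30 edits would give the two growth curves being compared: the size of the `.git` folder against the size of the ZIP collection.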
A quick note on Git LFS (Large File Storage). In my previous post I discussed the issue of PLC and SCADA projects being made up of predominantly binary files. I’d planned to manage this issue by using Git LFS, which promised to make my local repo more lightweight. However, I quickly ran into a problem when conducting the SCADA test. The Git repo size had exceeded 1GB after only 3 commits. This was versus a total ZIP file collection size of just 76MB.
The problem was caused by a single database file in the SCADA project. The file was about 200MB and was modified during every SCADA compilation. This would trigger Git LFS to store another uncompressed copy of the file on every commit and inflate the repo. Unfortunately, this file was a critical project file and could not be ignored. Consequently, because Git LFS doesn’t currently support compression, it was abandoned as a feasible option. The plan going forward was to use a standard Git repository, which does support file compression.
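A quick way to see why compression matters for a file like this is to compare its raw size with its zlib-compressed size. Standard Git compresses the objects it stores with zlib, so the sketch below gives a rough idea of what one copy might cost in a normal repo versus the raw copy that LFS keeps. The function name is hypothetical, and the result is only an estimate: Git's packfiles can also delta-compress between versions, which this doesn't model.

```python
import zlib
from pathlib import Path


def raw_and_deflated_size(path: Path, level: int = 6) -> tuple[int, int]:
    """Return (raw_bytes, zlib_compressed_bytes) for one file, read in chunks."""
    compressor = zlib.compressobj(level)
    raw = packed = 0
    with path.open("rb") as fh:
        # Stream 1 MiB at a time so a 200MB file never sits fully in memory.
        while chunk := fh.read(1 << 20):
            raw += len(chunk)
            packed += len(compressor.compress(chunk))
    packed += len(compressor.flush())
    return raw, packed
```

Running this against the large database file before committing would show whether it compresses well enough to live in a standard repo.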
PLC Git Test
After making 30 PLC project commits, including additions, modifications and deletions, the Git and ZIP results looked like this.
The S7 Git repo started at about 7MB with negligible growth over 30 commits. In comparison the ZIP collection grew by about 7MB with every commit, totaling just over 200MB. Based on this growth rate I’d expect a ZIP collection size of over 700MB after 100 commits. The combination of differential file backup and compression gives Git a clear advantage over ZIP compression alone.
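The ZIP projection is a simple linear extrapolation: every commit adds one more full-size archive. Here is a small sketch of that arithmetic using the approximate 7MB per commit from the PLC test. The function name is my own, and the per-commit figure is a rounded observation rather than an exact measurement.

```python
def project_zip_total(per_commit_mb: float, commits: int) -> float:
    """Linear projection: each commit adds one full ZIP of roughly the same size."""
    return per_commit_mb * commits


print(project_zip_total(7, 30))   # 210
print(project_zip_total(7, 100))  # 700
```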
SCADA Git Test
The same test was applied to the SCADA project and the results looked like this.
The SCADA Git repo started at about 9MB with negligible growth over 30 commits. In comparison, the ZIP collection grew by about 12MB with every commit, totaling over 400MB. Based on this growth rate I’d expect a ZIP collection size of over 1.4GB after 100 commits. Once again Git performance was considerably better than ZIP.
There were a couple of key findings from this experiment. The first was that Git LFS won’t be suitable for PLC and SCADA backups until it supports integrated compression. LFS would have been nice to have, but it’s not necessary if you’re getting good file compression.
The second finding was that Git performed significantly better than ZIP in terms of storage efficiency. The total storage space used by Git was approximately 3% of what was used by an equivalent ZIP collection. This is a significant storage saving, which reduces cost and makes the backups easier to maintain.
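For anyone wanting to check that ratio, here is a minimal sketch of the calculation using the rounded figures quoted above (about 7MB vs 200MB for PLC, about 9MB vs 400MB for SCADA). The function name is hypothetical and the inputs are approximations, so treat the output as a ballpark rather than a precise result.

```python
def git_share_percent(git_mb: float, zip_mb: float) -> float:
    """Git repo size as a percentage of the equivalent ZIP collection."""
    return 100 * git_mb / zip_mb


plc = git_share_percent(7, 200)            # PLC test, rounded figures
scada = git_share_percent(9, 400)          # SCADA test, rounded figures
combined = git_share_percent(7 + 9, 200 + 400)

print(f"PLC {plc:.1f}%, SCADA {scada:.1f}%, combined {combined:.1f}%")
```

With these inputs all three values land in the 2% to 4% range, which is consistent with the approximate figure of 3%.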
Based on these results I’m now confident that Git can work as an effective and efficient backup tool for PLC and SCADA projects. The next step will be to create a prototype that automates the same Git functions that were used in this experiment.
Time for some code 😊.