GSoC’21: Coding Phase second week

Sarthak Singh
SCoRe Lab


In this period of the coding phase, I built six Docker images of ClamAV and found a way to cut the image size by about 200 MB using one simple idea: mounting the virus database into the container as a volume.

I started by building a Docker image of ClamAV on Ubuntu, with a startup script that launches the clamd process in the background so that the Python backend can talk to it directly. This approach works, but the Ubuntu image comes to around 425 MB because of freshclam, the virus database update tool for ClamAV. It downloads the whole virus database into the image, and ClamAV uses that database to detect viruses, trojans, etc.
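The startup step described above could be sketched in Python roughly as follows. This is only an illustration, not the project's actual script: the socket path and the backend command (`app.py`) are assumptions, and the real values depend on the image's clamd.conf and on how the backend is launched.

```python
import os
import subprocess
import time

# Assumed values, not taken from the project: adjust to the real image.
CLAMD_SOCKET = "/var/run/clamav/clamd.ctl"  # socket path from clamd.conf (assumed)
BACKEND_CMD = ["python3", "app.py"]         # hypothetical backend entrypoint


def wait_for_path(path, timeout=60.0, interval=0.5):
    """Poll until `path` exists; return False if `timeout` seconds pass first."""
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
    return True


def main():
    # With the default configuration clamd forks into the background,
    # so this call returns while the daemon keeps running.
    subprocess.run(["clamd"], check=True)

    # Loading the signature database can take a while; wait for the socket.
    if not wait_for_path(CLAMD_SOCKET):
        raise SystemExit("clamd did not create its socket in time")

    # Replace this process with the backend so it receives container signals.
    os.execvp(BACKEND_CMD[0], BACKEND_CMD)
```

Calling `main()` from the container entrypoint would start clamd first and hand over to the backend only once the socket is there, so the backend never tries to connect too early.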

When I inspected the image with the dive tool, the image efficiency score was about 99%, so the layers themselves looked fine. I then created another ClamAV image based on Alpine, which brought the size down to 200 MB. That was progress, but I wanted the image to be smaller still. So I tried storing the virus database on an external volume and attaching that volume to the container at run time, so that ClamAV can read the database from there and work properly.
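To make the volume idea concrete, here is a small sketch that builds the `docker run` command for such a container. The image name, volume name, and mount point are placeholders I picked for illustration (`/var/lib/clamav` is a common database directory, but the real one is whatever `DatabaseDirectory` points to in clamd.conf).

```python
def docker_run_command(image, volume, mount_point="/var/lib/clamav", name="clamav"):
    """Build a `docker run` command that attaches the database volume.

    All arguments are illustrative; the mount point must match the
    DatabaseDirectory configured for clamd inside the image.
    """
    return [
        "docker", "run",
        "--detach",                           # run the container in the background
        "--name", name,
        "--volume", f"{volume}:{mount_point}",  # named volume holding the virus database
        image,
    ]


# Hypothetical names, for illustration only.
cmd = docker_run_command("scan8/clamav:alpine", "clamav-db")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine with Docker installed. Because the database lives in the volume, the image itself no longer has to carry it.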

After a few attempts and setting up permissions for the relevant user, ClamAV was able to use the database mounted into the container. This reduces the image size significantly: the Alpine image is now only 17.1 MB, and the Ubuntu image drops to 200 MB. That is great, but there is one issue. freshclam is no longer installed in the container, so it can't update the database, and the virus definitions will become outdated after some time. We want to scan our files with up-to-date virus definitions, right?

To solve this, I came up with a small Alpine container whose only job is to update the virus database every hour. I will discuss this approach in more detail in the next blog, including whether it is suitable for production and any errors I run into while working on the image.
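A minimal sketch of what the updater container's main loop could look like is given below. It assumes that freshclam is installed in the updater image and that the shared volume is mounted at `/var/lib/clamav`; both are assumptions for illustration, and the `max_runs` parameter exists only so the loop can be tried out without running forever.

```python
import subprocess
import time

DB_DIR = "/var/lib/clamav"  # assumed mount point of the shared database volume


def update_database(db_dir=DB_DIR):
    """Run freshclam once against `db_dir` and return its exit code.

    The meaning of non-zero exit codes differs between ClamAV versions,
    so check the freshclam man page before acting on them.
    """
    return subprocess.run(["freshclam", f"--datadir={db_dir}"]).returncode


def run_updater(update=update_database, sleep=time.sleep,
                interval_seconds=3600, max_runs=None):
    """Call `update` every `interval_seconds`; return the collected exit codes.

    `max_runs=None` loops forever, which is what the container would use.
    """
    exit_codes = []
    while max_runs is None or len(exit_codes) < max_runs:
        exit_codes.append(update())
        # Skip the final sleep when a run limit has been reached.
        if max_runs is None or len(exit_codes) < max_runs:
            sleep(interval_seconds)
    return exit_codes
```

The container would simply call `run_updater()`. freshclam also has its own daemon mode (`-d` together with `--checks`), which might be a simpler option than a custom loop.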

My mentors then provided me with GCP account credentials. In the upcoming week, I have to run these Docker images there and test ClamAV against actual viruses, which can be found at https://bazaar.abuse.ch/browse/.

Coming up

For the coming week, I have decided to look into CI/CD pipelines and try to integrate one into the project.

Stay tuned for further updates, and feel free to connect with me on LinkedIn.

Join the Gitter channel of Scan8 for more insights into the project.
