Measuring code coverage in Snowpark Python Stored Procedures

Python Stored Procedures provide a convenient and flexible way to package and run your Python code on Snowflake. They are useful for tasks like:

  • Data processing and transformation
  • Machine learning
  • Custom business logic

As you build larger and more sophisticated Stored Procedures, you’ll want to write test cases to validate that they work as expected, and to ensure they continue to work as you make changes. So you write a suite of tests, and they all pass 🥳.

But how do you measure the extent of testing, and ensure that the critical parts of your code are included? This type of measurement is called Code Coverage. Software engineering teams use it for:

  • Identifying untested areas of the codebase
  • Providing a metric for progress
  • Risk management: if coverage is low in critical areas, you can allocate resources to improving coverage until it meets your desired thresholds
  • Compliance requirements: Some industries have standards and regulations for software to meet, and code coverage can provide a quantitative measure

New in version 0.1.25 of snowcli, you can automatically measure code coverage on your Stored Procedures as your tests run, using the coverage.py package.

In this article, we’ll do a walkthrough of the new feature.

The scenario

Here’s a very contrived Stored Procedure which can add, subtract, multiply or divide two numbers.

As you can see, there are a few different code paths that lead to different business logic, including some of it from another module (other_stuff.py).
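The original post shows the source only as a screenshot, so here's a minimal sketch of what such a handler might look like. The function bodies are assumptions based on the handler name app.calculator and the signature used in the deployment commands shown later; in the deployed version the handler would also receive a Snowpark Session as its first argument, and multiply/divide stand in for the logic imported from other_stuff.py.

```python
# Hypothetical sketch of the procedure source (details are assumptions).

def multiply(first: float, second: float) -> float:
    # imagine this lives in other_stuff.py
    return first * second

def divide(first: float, second: float) -> float:
    # imagine this lives in other_stuff.py
    return first / second

def calculator(thing_to_do: str, first: float, second: float) -> float:
    # dispatch to the requested operation
    if thing_to_do == "add":
        return first + second
    if thing_to_do == "subtract":
        return first - second
    if thing_to_do == "multiply":
        return multiply(first, second)
    if thing_to_do == "divide":
        return divide(first, second)
    raise ValueError(f"Unknown operation: {thing_to_do}")
```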

snowcli in normal mode

Normally, you run snow procedure package followed by snow procedure create to upload your procedure to Snowflake, so the process looks like this:

[Diagram: your code, shown in red]

snowcli adding code coverage

To measure code coverage as we test, we can simply add the new --install-coverage-wrapper parameter. The full commands to package and deploy our procedure are:

snow procedure package
snow procedure update -n calculator -h "app.calculator" -i "(thing_to_do string, first float, second float)" --return-type "numeric" --install-coverage-wrapper --replace-always

Here’s what happens when that flag is added:

[Diagram: your code in red, auto-generated coverage wrapper code in cyan]

Invoking the procedure

Now that we’re measuring code coverage, let’s give it a test:

snow procedure execute -p "calculator('add',7,10)"

So the calculator works!

Note: Normally we’d build test cases in code, each asserting that the returned value is correct for the given inputs, so that they can run in a CI pipeline; here we invoke the procedure manually for illustration.

Generating a coverage report

Now that we’ve invoked our procedure at least once, we can generate a coverage report:

snow procedure coverage report -n "calculator" -i "(thing_to_do string, first float, second float)"

Behind the scenes, snowcli fetched all of the .coverage files from the DEPLOYMENTS stage under this proc’s /coverage/ subfolder.

These .coverage files are tiny little SQLite databases containing information on which lines in each file were executed. If there is more than one coverage file, snowcli will ask coverage.py to merge them together before generating the report.
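To make that merge step concrete, here's a standalone sketch using coverage.py's CoverageData API. The file names and line numbers are made up; the real data files are the ones fetched from the stage.

```python
# Standalone sketch of merging coverage data files with coverage.py
# (requires the `coverage` package; names and line numbers are made up).
import os
import sqlite3
import tempfile

from coverage import CoverageData

workdir = tempfile.mkdtemp()

# Simulate two .coverage files produced by two procedure invocations.
run1 = CoverageData(os.path.join(workdir, ".coverage.run1"))
run1.add_lines({"app.py": [1, 2, 5]})
run1.write()

run2 = CoverageData(os.path.join(workdir, ".coverage.run2"))
run2.add_lines({"app.py": [1, 2, 8, 9]})
run2.write()

# Merge both runs into one data file before reporting.
merged = CoverageData(os.path.join(workdir, ".coverage"))
merged.update(run1)
merged.update(run2)
merged.write()
print(sorted(merged.lines("app.py")))  # the union of lines from both runs

# Each data file really is a small SQLite database.
con = sqlite3.connect(os.path.join(workdir, ".coverage"))
tables = [row[0] for row in con.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
con.close()
print(tables)
```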

If we open the coverage report, we see:

38% is not so great. Looking at the files themselves:

Increasing our coverage

We can see visually the code paths we need to test. Let’s do some more testing:

# subtract 3 from 8, returns 5
snow procedure execute -p "calculator('subtract',8,3)"
# multiply 3 by 4, returns 12
snow procedure execute -p "calculator('multiply',3,4)"
# divide 10 by 2, returns 5
snow procedure execute -p "calculator('divide',10,2)"
# test an operation that doesn't exist, returns an error
snow procedure execute -p "calculator('to the power of',10,2)"

With that done, we re-run our coverage report:

snow procedure coverage report -n "calculator" -i "(thing_to_do string, first float, second float)"

This time we see there have been 5 invocations in total:

And after refreshing the report, we see we’ve obtained the mythical 100% code coverage:

If we want to start over (e.g. after changing our test cases or the stored procedure itself), we can delete all the coverage files:

snow procedure coverage clear -n "calculator" -i "(thing_to_do string, first float, second float)"

What should I be aiming for?

100% code coverage is an unreasonable goal on a typical codebase, and it’s important not to over-fixate on the number. Remember that in any organisation “you get what you measure”, and just like measuring lines of code as a proxy for developer productivity, creating pressure around code coverage stats can create the wrong incentives.

Often there are diminishing returns above 70–80%, so you should instead use a more realistic number as a minimum threshold. Remember that code coverage treats each line with equal weight, but in reality some bugs are inconsequential, others are embarrassing, and others are company-ending, depending on where in the codebase they occur. So it can be a good idea to ensure you have high coverage in the higher criticality areas.
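If you do adopt a minimum threshold, coverage.py can enforce it programmatically. Here's a standalone sketch; the 70% gate, the file name, and the line data are all illustrative.

```python
# Standalone sketch of gating on a minimum coverage threshold with the
# coverage.py API (requires the `coverage` package; values illustrative).
import os
import tempfile

import coverage
from coverage import CoverageData

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "calc.py")
with open(src, "w") as f:
    f.write("a = 1\nb = 2\nc = 3\nd = 4\ne = 5\n")  # five statements

# Pretend a test run executed four of the five lines.
data = CoverageData(os.path.join(workdir, ".coverage"))
data.add_lines({src: [1, 2, 3, 4]})
data.write()

cov = coverage.Coverage(data_file=os.path.join(workdir, ".coverage"))
cov.load()
total = cov.report()  # prints a report and returns the total percentage
meets_threshold = total >= 70  # fail the CI job when this is False
print(f"total={total:.0f}% meets_threshold={meets_threshold}")
```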

Equally, you should encourage qualitative aspects such as readability and maintainability, via manual code review.

Summary

In this article, we demonstrated how to measure code coverage for Snowpark Python Stored Procedures as their tests run. By investing a small amount of time getting this in place, you can ship code faster and with a greater degree of confidence.

That’s it — happy testing!
