snowflake-vcrpy: faster Python tests for Snowflake

Integration testing against Snowflake requires network requests for every test, which can be time-consuming. Fortunately, we have released a new tool, snowflake-vcrpy, which reduces the amount of network traffic in your tests and significantly speeds up your testing.

This article was co-authored by Adam Ling, Software Engineer at Snowflake.

What is snowflake-vcrpy?

snowflake-vcrpy is an open source Python library that allows you to record and replay HTTP interactions initiated by Snowflake’s Python driver for testing purposes. It is a pytest plugin built upon VCR.py, a Python library that mocks HTTP interactions. The first time you run a test with the library, the HTTP requests and responses are recorded and stored in a cassette file. The next time you run the same test, the library loads the serialized requests and responses from the cassette file. If it recognizes an HTTP request from the original test run, it intercepts the request and returns the corresponding recorded response. This approach reduces the network traffic required for testing and speeds up the testing process.
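
To make the record-and-replay idea concrete, here is a minimal sketch in plain Python. It is not how snowflake-vcrpy or VCR.py is implemented: the ToyCassette class, the fake_backend function, and the JSON file format are all hypothetical stand-ins chosen for illustration (the real cassettes are YAML files of HTTP interactions).

```python
import json
import tempfile
from pathlib import Path


class ToyCassette:
    """Toy record-and-replay store; illustrative only, not snowflake-vcrpy's code."""

    def __init__(self, path):
        self.path = Path(path)
        # Load previously recorded interactions if the cassette file exists.
        if self.path.exists():
            self.records = json.loads(self.path.read_text())
        else:
            self.records = {}

    def send(self, request, transport):
        """Replay the recorded response for `request`, or call `transport` and record it."""
        if request in self.records:
            return self.records[request]  # replay: no network call
        response = transport(request)  # record: reach the real backend once
        self.records[request] = response
        self.path.write_text(json.dumps(self.records))
        return response


network_calls = []


def fake_backend(request):
    """Hypothetical stand-in for a network round trip to Snowflake."""
    network_calls.append(request)
    return {"rows": [[1]]}


cassette_file = Path(tempfile.mkdtemp()) / "test_method.json"

# First run: the request reaches the backend and is recorded to disk.
first = ToyCassette(cassette_file).send("select 1", fake_backend)

# Second run: a fresh cassette object replays from disk; the backend is not called.
second = ToyCassette(cassette_file).send("select 1", fake_backend)
```

After both runs, network_calls holds a single entry, which is the saving the library aims for: only the first run pays for the network round trip.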

Install and use snowflake-vcrpy

snowflake-vcrpy can be installed from the GitHub source code:

git clone git@github.com:Snowflake-Labs/snowflake-vcrpy.git
pip install ./snowflake-vcrpy

After installation, you can use the “snowflake_vcr” pytest marker to annotate the tests you want to run in record-and-replay mode. Here’s an example test case for the Python connector:

import pytest
import snowflake.connector

@pytest.mark.snowflake_vcr
def test_method():
    CONNECTION_PARAMETERS = {}  # your credentials
    with snowflake.connector.connect(**CONNECTION_PARAMETERS) as conn, conn.cursor() as cursor:
        assert cursor.execute('select 1').fetchall() == [(1,)]

When test_method is first run, snowflake-vcrpy generates a cassette file called test_method.yaml in a cassettes directory within the test directory. This file records every HTTP request and response that passes through the Snowflake Python driver. In this case, the driver sends an HTTP request to Snowflake to execute “select 1”. The next time test_method is run, snowflake-vcrpy detects the existing cassette file and intercepts the HTTP request, returning the corresponding response stored in the file.
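
The naming convention described above can be sketched as a small helper. The expected_cassette_path function and the tests/test_example.py path are hypothetical names used only for illustration; the layout (a cassettes directory next to the test file, one YAML file named after the test) is taken from the description above.

```python
from pathlib import Path


def expected_cassette_path(test_file, test_name):
    """Hypothetical helper: the cassette location described in this article."""
    return Path(test_file).parent / "cassettes" / f"{test_name}.yaml"


# Assumed example layout: a test module at tests/test_example.py.
path = expected_cassette_path("tests/test_example.py", "test_method")
print(path.as_posix())  # tests/cassettes/test_method.yaml
```

Because replay depends on the cassette file being present, removing that file should put the test back into its first-run state so that it records again. Treat this as an assumption and confirm it against the project README.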

Here’s another example, this time a test written with snowflake-snowpark-python:

import pytest
from snowflake.snowpark import Session, Row

@pytest.mark.snowflake_vcr
def test_method():
    CONNECTION_PARAMETERS = {}  # your credentials

    session = Session.builder.configs(CONNECTION_PARAMETERS).create()
    df = session.create_dataframe([[1, "a"], [2, "b"]]).to_df("id", "value")
    assert df.collect() == [Row(1, "a"), Row(2, "b")]

In addition, snowflake-vcrpy provides a pytest option called “--snowflake-record-tests-selection” that allows you to select which tests run in record-and-replay mode. Set the value to “all” to run all tests in record-and-replay mode, whether or not they are annotated. Set it to “annotated” (the default) to run only the annotated tests in record-and-replay mode. Finally, set it to “disable” to ignore the “snowflake_vcr” markers, so that every test talks to Snowflake directly.

pytest <tests> --snowflake-record-tests-selection all
pytest <tests> --snowflake-record-tests-selection disable
pytest <tests> --snowflake-record-tests-selection annotated
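
The effect of the three values can be summarized as a small decision function. This is a model of the behavior described above, not the plugin’s own code; the function name runs_in_record_and_replay is hypothetical.

```python
def runs_in_record_and_replay(selection, is_annotated):
    """Model of the three option values described above (not the plugin's code)."""
    if selection == "all":
        return True  # every test is recorded and replayed
    if selection == "annotated":
        return is_annotated  # only tests carrying the marker
    if selection == "disable":
        return False  # markers are ignored
    raise ValueError(f"unknown selection: {selection!r}")


# Show the decision for an annotated and an unannotated test under each value.
for selection in ("all", "annotated", "disable"):
    print(
        selection,
        runs_in_record_and_replay(selection, True),
        runs_in_record_and_replay(selection, False),
    )
```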

The cassettes directory can be committed to source control so that other developers can reuse the cassettes when running integration tests locally and so that CI pipelines can access them. Alternatively, you could store the cassettes on another file system or storage service, such as S3.

Conclusion

snowflake-vcrpy is an experimental open source library that we developed in response to customers who were looking for a better development experience with Snowpark. As a Snowflake-Labs project, it isn’t covered by official support today.

Interested in giving it a spin? Head to https://github.com/Snowflake-Labs/snowflake-vcrpy for instructions on how to run your Snowflake tests in record-and-replay mode. Let us know what you think by commenting on this post or opening a GitHub issue.

Issues, pull requests, and ideas are all welcome!
