Create artificial data with Gretel Synthetics and Google Colaboratory
In this post we’ll use Gretel Synthetics and Google Colaboratory’s free GPUs to train a machine learning model to automatically generate fake, anonymized data with differential privacy guarantees.
Today we will walk through some of the new features in version 0.6.0 of gretel_synthetics, Gretel’s open-source synthetic data library, including:
- Google SentencePiece support for unsupervised tokenization, with configurable vocabulary size and character coverage.
- smart_open support to load datasets from AWS, GCP, and Azure.
- Launch directly into Colaboratory.
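To make "character coverage" a little more concrete, here is a minimal pure-Python sketch of the idea: keep the most frequent characters until they account for the requested share of all characters in the corpus, and treat the rest as unknown. This is only an illustration under that assumption, not SentencePiece's or gretel_synthetics' actual implementation. The function name `covered_characters` and the sample string are hypothetical.

```python
from collections import Counter


def covered_characters(text, character_coverage=1.0):
    """Return the most frequent characters whose combined frequency
    reaches `character_coverage` (a fraction between 0 and 1) of `text`.

    Hypothetical helper for illustration only.
    """
    counts = Counter(text)
    total = sum(counts.values())
    kept, running = set(), 0
    # Walk characters from most to least frequent; break ties by character
    # so the result is deterministic.
    for ch, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if running / total >= character_coverage:
            break
        kept.add(ch)
        running += n
    return kept


# Made-up sample: 'a', 'b', 'c' appear 4 times each, ',' 3 times, 'é' once.
sample = "aaaa,bbbb,cccc,é"
full = covered_characters(sample, 1.0)   # every character is kept
most = covered_characters(sample, 0.9)   # the rare 'é' is dropped
print(sorted(full), sorted(most))
```

In SentencePiece itself, the corresponding trainer options are named `vocab_size` and `character_coverage`. A coverage slightly below 1.0 is typically used for languages with very large character sets, so that rare characters do not take up vocabulary slots.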
Check out the walk-through screencast below, or click the Colab link to get started creating your own synthetic dataset!
For more on anonymizing precise location data, check out our previous deep dive on anonymizing scooter ride-share data, which covers how we discovered privacy concerns in public ride-share feeds and partnered with Uber to fix them.