FAKER: Fake Data for Machine Learning

Snekhasuresh
Published in featurepreneur
2 min read, Oct 8, 2022

Looking for a dataset for training and testing your machine learning model?

Well, you are at the right place!

In this article, you will learn how to generate fake data for your projects.

What is FAKER?

Often you need dummy or fake data to test the software you are working on, and the real pain is finding data that suits the test. If you are lucky, a Google search may turn up a usable dataset, but it can just as easily be a futile exercise.

What if you could generate the dummy data quickly and load it into a pandas DataFrame, so that you don’t have to spend time searching for the right dataset? Yes, it’s possible with Faker.

Faker is a Python package that generates fake data.

INSTALLATION:

Use the command below to install it. Note that starting from version 4.0.0, Faker only supports Python 3.6 and above. If you are still using Python 2.x, you should install Faker 3.0.1 instead.

pip install Faker

IMPLEMENTATION:

First, create a Faker object, then call methods on that object to get the fake data you need.

from faker import Faker
fake = Faker()

Now, let’s generate random data.

>>> fake.name()
'Lucy Cechtelar'
>>> fake.address()
'0535 Lisa Flats\nSouth Michele, MI 38477'

LOCALIZED DATA:

By default, Faker generates data such as names and addresses in English (the en_US locale).

You can also generate data in the language of your choice by passing a locale when you create the Faker object.

# Generating fake data in Hindi
>>> from faker import Faker
>>> fake = Faker(locale='hi_IN')
>>> for _ in range(2):
...     print(fake.name())
...
मोहिनी काले
फ़ातिमा ड़ाल

EXAMPLE:

Now let’s generate fake profile data and convert it into a pandas DataFrame.

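import pandas as pd
from faker import Faker

# Seed the generator so the same profiles come out on every run
Faker.seed(42)

fake = Faker(locale='en_US')
# Each profile is a dict (name, sex, address, mail, birthdate, job, ...)
fake_workers = [fake.profile() for _ in range(5)]

# One row per profile, one column per profile field
df = pd.DataFrame(fake_workers)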

This creates a DataFrame containing 5 fake profiles, one row per profile.

CONCLUSION:

Congratulations! Now you don’t have to worry about finding datasets for your machine learning projects!

Hope this article helped you!
