TDS Archive

An archive of data science, data analytics, data engineering, machine learning, and artificial intelligence writing from the former Towards Data Science Medium publication.

Learn you a Kedro*

9 min read · Jul 2, 2021

Latte art. By Yuko Honda on Flickr (CC BY-SA 2.0) https://flic.kr/p/d9KiHk

Introducing Kedro

Kedro concepts

Node

from kedro.pipeline import node

# Prepare first node: takes no inputs and returns a greeting
def return_greeting():
    return "Hello"

return_greeting_node = node(func=return_greeting, inputs=None, outputs="my_salutation")

# Prepare second node: consumes the greeting and builds the final message
def join_statements(greeting):
    return f"{greeting} Kedro!"

join_statements_node = node(join_statements, inputs="my_salutation", outputs="my_message")

Pipeline

from kedro.pipeline import Pipeline
# Assemble nodes into a pipeline
pipeline = Pipeline([return_greeting_node, join_statements_node])

DataCatalog

from kedro.io import DataCatalog, MemoryDataSet
# Prepare a data catalog
data_catalog = DataCatalog({"my_salutation": MemoryDataSet()})

Runner

Hello Kedro!
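
To show how the four concepts fit together without depending on a particular Kedro version, here is a toy sketch in plain Python. It is not Kedro's implementation: `ToyNode` and `run_sequentially` are hypothetical names invented for this illustration, the catalog is just a dictionary, and the nodes run in list order, whereas Kedro works out the order from each node's inputs and outputs.

```python
# Toy illustration only: these names are hypothetical and are NOT Kedro's API.
from typing import Callable, NamedTuple, Optional


class ToyNode(NamedTuple):
    func: Callable
    inputs: Optional[str]  # name of the dataset to read, or None
    outputs: str           # name of the dataset to write


def run_sequentially(nodes, catalog):
    """Run nodes in list order, passing data between them via the catalog dict."""
    for n in nodes:
        args = [] if n.inputs is None else [catalog[n.inputs]]
        catalog[n.outputs] = n.func(*args)
    return catalog


def return_greeting():
    return "Hello"


def join_statements(greeting):
    return f"{greeting} Kedro!"


toy_pipeline = [
    ToyNode(return_greeting, None, "my_salutation"),
    ToyNode(join_statements, "my_salutation", "my_message"),
]
toy_catalog = run_sequentially(toy_pipeline, {})
print(toy_catalog["my_message"])  # prints: Hello Kedro!
```

The point of the sketch is the separation of concerns: the functions know nothing about storage, the nodes only name their inputs and outputs, and the runner is the only piece that touches the catalog.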

Get started!

Kedro spaceflights tutorial

Set up the project

Set up the data

# conf/base/catalog.yml
companies:
  type: pandas.CSVDataSet
  filepath: data/01_raw/companies.csv

reviews:
  type: pandas.CSVDataSet
  filepath: data/01_raw/reviews.csv

shuttles:
  type: pandas.ExcelDataSet
  filepath: data/01_raw/shuttles.xlsx

# In a Kedro IPython or Jupyter session, where `catalog` is already defined
companies = catalog.load("companies")
companies.head()
shuttles = catalog.load("shuttles")
shuttles.head()

Data processing modular pipeline

Modular pipeline for data science

# conf/base/parameters.yml
test_size: 0.2
random_state: 3
features:
  - engines
  - passenger_capacity
  - crew
  - d_check_complete
  - moon_clearance_complete
  - iata_approved
  - company_rating
  - review_scores_rating

# conf/base/catalog.yml
regressor:
  type: pickle.PickleDataSet
  filepath: data/06_models/regressor.pickle
  versioned: true
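
As a rough illustration of what the `test_size` and `random_state` parameters above control, the sketch below holds out a fraction of the rows after a seeded shuffle. It is a stand-in that uses only the standard library, not the tutorial's own split function; `split_rows` is a hypothetical helper and the ten integer "rows" are made-up placeholder data.

```python
import random


def split_rows(rows, test_size, random_state):
    """Shuffle a copy of the rows with a fixed seed, then hold out a test fraction."""
    shuffled = list(rows)
    random.Random(random_state).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


# Same values as the parameters file above
parameters = {"test_size": 0.2, "random_state": 3}

rows = list(range(10))  # placeholder data, stands in for the model input table
train, test = split_rows(rows, parameters["test_size"], parameters["random_state"])
print(len(train), len(test))  # prints: 8 2
```

Because the seed is fixed, rerunning the split with the same parameters gives the same partition, which is what makes a pipeline run reproducible.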

Test the pipeline

Summary

Wow, that was some long tutorial!

Acknowledgements


Published in TDS Archive



Written by Jo Stichbury

Technical content creator writing about data science and software. Old-school Symbian C++ developer, now accidental cat herder and goose chaser.
