Puzzle 3 Month by Month

Pandas Brain Teasers, by Miki Tebeka

The Pragmatic Programmers

--


monthly.py

    from io import StringIO
    import pandas as pd

    csv_data = '''\
    day,hits
    2020-01-01,400
    2020-02-02,800
    2020-02-03,600
    '''

    df = pd.read_csv(StringIO(csv_data))
    print(df['day'].dt.month.unique())

Guess the Output

IMPORTANT

Try to guess what the output is before moving to the next page.


This code will raise an AttributeError.


The comma-separated values (CSV) format does not have a schema. Everything you read from it is a string. Pandas does a great job of “guessing” the types of data inside the CSV, but sometimes it needs help.
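To make this concrete for the puzzle, here is a minimal sketch that reuses the same data and checks two things: the values in the day column are plain Python strings, and the .dt accessor refuses to work on them. The names first_day and raised are illustrative only and do not come from the book.

```python
from io import StringIO
import pandas as pd

csv_data = '''\
day,hits
2020-01-01,400
2020-02-02,800
2020-02-03,600
'''

df = pd.read_csv(StringIO(csv_data))

# Without a hint, pandas keeps the day column as text.
first_day = df['day'].iloc[0]
print(type(first_day))  # <class 'str'>

# The .dt accessor is only available on datetime-like columns,
# so accessing it on a column of strings raises AttributeError.
try:
    df['day'].dt.month
    raised = False
except AttributeError as err:
    raised = True
    print(err)
```

The hits column, by contrast, is recognized as numeric, which is the kind of guessing pandas gets right on its own.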

You can use .dtypes to see what types a DataFrame has:

    In [3]: df.dtypes
    Out[3]:
    day     object
    hits     int64
    dtype: object

The object dtype usually means the column holds str values (Python’s strings). The read_csv function has many parameters, including parse_dates, which tells pandas which columns to parse as dates.
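One way to apply parse_dates to the puzzle is sketched below. It assumes the list form of the parameter, naming the day column explicitly; the variable months is illustrative.

```python
from io import StringIO
import pandas as pd

csv_data = '''\
day,hits
2020-01-01,400
2020-02-02,800
2020-02-03,600
'''

# parse_dates lists the columns pandas should convert to datetimes.
df = pd.read_csv(StringIO(csv_data), parse_dates=['day'])

# The day column is now datetime-like, so the .dt accessor works.
print(df.dtypes)
months = df['day'].dt.month.unique()
print(months)  # [1 2]
```

With the column parsed as dates, the original print call no longer raises and reports the two distinct months in the data, January and February.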
