Importing and Visualizing Data in Google Colab

Shweta Pardeshi
Published in Analytics Vidhya
3 min read · May 3, 2020


Image Source: https://www.bernardmarr.com/default.asp?contentID=1508

In this article, I’ll show you how to load your own dataset into a Google Colab notebook and visualize it.

How to upload a dataset from a local file?

You can upload a small dataset (a .csv or .txt file) or data saved in a Python file (.py) with files.upload() from the google.colab module. Calling it adds a “Choose Files” button to the cell output, through which you can upload any file.
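As a rough sketch of the upload step: in Colab, files.upload() returns a dictionary that maps each uploaded file name to its raw bytes. The helper names read_uploaded_csv and upload_csv_in_colab below are made up for illustration, and the second one only works inside a Colab runtime.

```python
import io

import pandas as pd


def read_uploaded_csv(uploaded, name):
    """Build a DataFrame from the {filename: bytes} mapping that files.upload() returns."""
    return pd.read_csv(io.BytesIO(uploaded[name]))


def upload_csv_in_colab():
    """Show the "Choose Files" button and load the first uploaded file (Colab only)."""
    from google.colab import files  # importable only inside a Colab runtime

    uploaded = files.upload()  # blocks until the upload finishes
    first_name = next(iter(uploaded))
    return read_uploaded_csv(uploaded, first_name)
```

In a Colab cell, something like data = upload_csv_in_colab() would then give you a DataFrame, assuming the uploaded file is a comma-separated text file.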

I have uploaded file.py, which contains data in dictionary format (a Python dict). After importing the dictionary, you can convert it to a DataFrame using the pd.DataFrame.from_dict() method.
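Here is a minimal sketch of the dictionary-to-DataFrame step. The variable name sales and its values are hypothetical stand-ins for whatever file.py actually defines; only the from_dict() calls are the point.

```python
import pandas as pd

# Stand-in for the dictionary stored in file.py (hypothetical name and values).
# In Colab, once file.py has been uploaded, it could be imported instead:
#     from file import sales
sales = {
    "Country": ["France", "Canada", "France"],
    "Revenue": [950, 1200, 400],
}

# Default orient="columns": each key becomes a column.
df = pd.DataFrame.from_dict(sales)

# orient="index": each key becomes a row label instead.
per_order = {
    "order_1": {"Revenue": 950, "Cost": 600},
    "order_2": {"Revenue": 1200, "Cost": 700},
}
df_rows = pd.DataFrame.from_dict(per_order, orient="index")
```

Which orient to use depends on how the dictionary is laid out: keys as column names (the default) or keys as row labels.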

Data Visualization

I am using a sales dataset for this purpose.

Importing required libraries and dataset:
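The snippets in this section rely on the names pd, plt, sns and data. A minimal setup could look like the sketch below. The article does not name the sales file, so a tiny made-up CSV sample (containing only columns used later) stands in for it; the figure size is an arbitrary choice.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical sample standing in for the real sales file.
# In Colab this would typically be pd.read_csv("<your uploaded file>.csv").
csv_text = """Country,Age_Group,Unit_Cost,Cost,Revenue,Profit
France,Adults,45,360,950,590
Canada,Youth,45,700,1200,500
Germany,Adults,30,150,400,250
"""
data = pd.read_csv(io.StringIO(csv_text))

sns.set_theme()  # optional: apply seaborn's default styling to matplotlib figures
plt.rcParams["figure.figsize"] = (8, 5)  # arbitrary default size for later plots

print(data.shape)
print(data.dtypes)
```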

Understanding the data:

# Simplified format dictionary with values that make sense for our data
format_dict = {'Mes': '{:%m-%Y}'}
data.head(10).style.format(format_dict).background_gradient(cmap='BuGn')

Scatter Plot

plt.scatter(x=data['Profit'], y=data['Cost'])

Pie Plot

data['Age_Group'].value_counts().plot(kind='pie', figsize=(6,6))

Bar Plot

ax = data['Country'].value_counts().plot(kind='bar', figsize=(14,6))
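Both the pie and the bar plot are drawn from value_counts(), which returns a Series of category counts sorted from most to least frequent. The sketch below uses a small made-up Country column to show that intermediate step; the axis labels are my own additions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; the real sales dataset has many more rows.
data = pd.DataFrame(
    {"Country": ["France", "Canada", "France", "Germany", "France", "Canada"]}
)

# One entry per distinct country, most frequent first.
counts = data["Country"].value_counts()
print(counts)

# Plot the counts as bars and label the axes.
ax = counts.plot(kind="bar", figsize=(14, 6))
ax.set_xlabel("Country")
ax.set_ylabel("Number of rows")
```

Inspecting counts before plotting is a quick way to check that the categories look as expected.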

Histogram

sns.distplot(data['Revenue'], bins=10, kde=True)

Correlation Matrix

corr = data.corr()

Heat Map

fig = plt.figure(figsize=(8,8))
plt.matshow(corr, cmap='RdBu', fignum=fig.number)
plt.xticks(range(len(corr.columns)), corr.columns, rotation='vertical')
plt.yticks(range(len(corr.columns)), corr.columns)
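One caveat: from pandas 2.0 onward, data.corr() raises an error if the DataFrame contains text columns such as Country. A sketch that sidesteps this by selecting the numeric columns first is shown below. The sample values are made up, and the colorbar is an addition of mine.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample with one text column and three numeric ones.
data = pd.DataFrame(
    {
        "Country": ["France", "Canada", "Germany", "France"],
        "Cost": [360, 700, 150, 900],
        "Revenue": [950, 1200, 400, 2100],
    }
)
data["Profit"] = data["Revenue"] - data["Cost"]

# Keep only numeric columns so corr() does not fail on text data.
corr = data.select_dtypes(include="number").corr()

# Draw the matrix as a heat map with column names on both axes.
fig = plt.figure(figsize=(8, 8))
plt.matshow(corr, cmap="RdBu", fignum=fig.number)
plt.xticks(range(len(corr.columns)), corr.columns, rotation="vertical")
plt.yticks(range(len(corr.columns)), corr.columns)
plt.colorbar()  # legend mapping colors to correlation values
```

On pandas 1.5 or later, data.corr(numeric_only=True) should give the same result.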

Density Plot

ax = data['Unit_Cost'].plot(kind='density', figsize=(14,6))
ax.axvline(data['Unit_Cost'].mean(), color='red')
ax.axvline(data['Unit_Cost'].median(), color='green')

Pair Plot

sns.pairplot(data.iloc[:, 12:15])

