Importing and Visualizing Data in Google Colab
In this article, I’ll show you how to use your own dataset in a Google Colab notebook.
How to upload a dataset from a local file?
You can upload a small dataset (a .csv or .txt file) or data saved in a Python file (.py) using this approach. Calling files.upload() from the google.colab module adds a “Choose Files” button to the cell output, and you can pick any file from your machine.
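The upload step can be sketched as follows. This is a minimal sketch, assuming the notebook runs in Colab, where files.upload() returns a dictionary that maps each uploaded file name to its raw bytes. The helper name uploaded_to_frames and the stand-in file sales.csv are made up for illustration, and .txt files are assumed to be comma-separated.

```python
import io

import pandas as pd


def uploaded_to_frames(uploaded):
    """Turn the {filename: bytes} dict from files.upload() into DataFrames.

    Only .csv and .txt files are parsed (assumed comma-separated);
    anything else, such as a .py file, is skipped.
    """
    frames = {}
    for name, content in uploaded.items():
        if name.lower().endswith((".csv", ".txt")):
            frames[name] = pd.read_csv(io.BytesIO(content))
    return frames


try:
    # Only importable inside a Colab runtime.
    from google.colab import files
except ImportError:
    files = None

if files is not None:
    uploaded = files.upload()  # shows the "Choose Files" button
else:
    # Outside Colab: made-up stand-in bytes so the sketch still runs.
    uploaded = {"sales.csv": b"Country,Revenue\nFrance,100\nSpain,250\n"}

frames = uploaded_to_frames(uploaded)
```

Each entry in frames is then an ordinary pandas DataFrame keyed by the uploaded file name.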
I have uploaded file.py, which contains data in dictionary format (a Python dict). After importing the dictionary, you can convert it to a data frame using the pd.DataFrame.from_dict() method.
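As a rough illustration of the conversion, here is a small sketch. The dictionaries are made-up stand-ins for whatever the uploaded file.py actually defines; in the notebook the dictionary would be imported from that module instead.

```python
import pandas as pd

# Stand-in for the dictionary imported from the uploaded file.py
# (hypothetical contents; the real keys and values will differ).
sales = {
    "Country": ["France", "Spain", "Italy"],
    "Revenue": [100, 250, 175],
}

# Default orient="columns": each key becomes a column.
df = pd.DataFrame.from_dict(sales)

# orient="index": each key becomes a row label instead,
# and the column names are supplied separately.
by_row = pd.DataFrame.from_dict(
    {"France": [100, 40], "Spain": [250, 90]},
    orient="index",
    columns=["Revenue", "Cost"],
)

print(df)
print(by_row)
```

Which orient to use depends on how the dictionary is laid out: keys as column names (the default) or keys as row labels.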
Data Visualization
I am using a sales dataset for this purpose.
Importing required libraries and dataset:
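A minimal sketch of this step, assuming the usual aliases that the snippets in this section rely on (pd, plt, sns) and a DataFrame named data. The inline CSV text, its values, and the Date column are made up so the block runs on its own; in a notebook you would pass the name of your uploaded file to pd.read_csv instead.

```python
import io

import matplotlib.pyplot as plt  # used by the plotting snippets
import pandas as pd
import seaborn as sns  # used by the histogram and pair plot

# Hypothetical stand-in for the uploaded sales file (made-up rows).
csv_text = """Date,Country,Age_Group,Unit_Cost,Cost,Revenue,Profit
2016-01-05,France,Adults,45,360,950,590
2016-01-12,Spain,Youth,30,180,480,300
2016-02-03,Italy,Adults,15,90,290,200
2016-02-20,France,Seniors,65,520,1200,680
"""

# parse_dates converts the Date column to datetime64 on load.
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

print(data.shape)
print(data.dtypes)
```

Checking shape and dtypes right after loading is a quick way to confirm that dates and numbers were parsed as intended.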
Understanding the data:
# Simplified format dictionary with values that make sense for our data
# ('Mes' must name a datetime column in data for this format string to apply)
format_dict = {'Mes': '{:%m-%Y}'}
data.head(10).style.format(format_dict).background_gradient(cmap='BuGn')
Scatter Plot
plt.scatter(x=data['Profit'], y=data['Cost'])
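Axis labels make a scatter plot easier to read. A minimal sketch using Matplotlib's object-oriented interface, with made-up numbers standing in for the Profit and Cost columns:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers standing in for the sales data.
data = pd.DataFrame({
    "Profit": [590, 300, 200, 680],
    "Cost": [360, 180, 90, 520],
})

fig, ax = plt.subplots(figsize=(6, 6))
ax.scatter(data["Profit"], data["Cost"], alpha=0.6)  # alpha softens overlapping points
ax.set_xlabel("Profit")
ax.set_ylabel("Cost")
ax.set_title("Cost vs. Profit")
```

In a Colab cell the figure is rendered automatically once the cell finishes.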
Pie Plot
data['Age_Group'].value_counts().plot(kind='pie', figsize=(6,6))
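The pie plot draws one slice per distinct value, sized by how often that value occurs. To see the numbers behind the slices, you can inspect value_counts() directly; the labels below are made up for illustration.

```python
import pandas as pd

# Made-up age-group labels standing in for the sales data.
data = pd.DataFrame(
    {"Age_Group": ["Adults", "Youth", "Adults", "Seniors", "Adults"]}
)

# Rows per group, sorted from most to least frequent.
counts = data["Age_Group"].value_counts()

# The same counts expressed as fractions of the total.
shares = data["Age_Group"].value_counts(normalize=True)

print(counts)
print(shares.round(2))
```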
Bar Plot
ax = data['Country'].value_counts().plot(kind='bar', figsize=(14,6))
Histogram
sns.histplot(data['Revenue'], bins=10, kde=True)  # histplot replaces the deprecated sns.distplot
Correlation Matrix
corr = data.corr(numeric_only=True)  # skip text columns such as Country (pandas 1.5 or later)
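By default, corr() computes the Pearson coefficient for every pair of columns, a value between -1 and 1, with 1 on the diagonal. A small sketch with made-up rows shows the shape of the result; it assumes pandas 1.5 or later, where numeric_only=True tells corr() to leave out text columns.

```python
import pandas as pd

# Made-up rows; Profit is derived as Revenue minus Cost.
data = pd.DataFrame({
    "Country": ["France", "Spain", "Italy", "France"],
    "Cost": [360, 180, 90, 520],
    "Revenue": [950, 480, 290, 1200],
})
data["Profit"] = data["Revenue"] - data["Cost"]

# Country is text, so it is excluded from the matrix.
corr = data.corr(numeric_only=True)

print(corr.round(2))
```

The result is a square DataFrame indexed by column name on both axes, which is why it can be passed straight to a heat map.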
Heat Map
fig = plt.figure(figsize=(8,8))
plt.matshow(corr, cmap='RdBu', fignum=fig.number)
plt.xticks(range(len(corr.columns)), corr.columns, rotation='vertical')
plt.yticks(range(len(corr.columns)), corr.columns)
Density Plot
ax = data['Unit_Cost'].plot(kind='density', figsize=(14,6))
ax.axvline(data['Unit_Cost'].mean(), color='red')  # mean
ax.axvline(data['Unit_Cost'].median(), color='green')  # median
Pair Plot
sns.pairplot(data.iloc[:, 12:15])