Daily Report Using Python and Spreadsheet.

Qayyum Mujib
TheLorry Data, Tech & Product
5 min readOct 15, 2021

The aim of this article is to help you generate a daily report by using python and analyze the data using Spreadsheet. After reading this article to the end, I’m sure you will fall in love with spreadsheet and maybe spreadsheet can be your best friends after this, like I’m right now.

Why I can say I fall in love with spreadsheet, because before this I’m fully used python for data extraction, manipulation, analyze the data and generate the result to excel table, even to color the excel cell I need to code using python.

But now, not anymore, spreadsheets can save a lot of time using hard coding because spreadsheets have all the functions that I need for data analysis. I’m pretty sure, even if you don’t have knowledge in programming or you come from Admin, Finance or any department you still can automate your daily report by using spreadsheets.

Btw, I have mentioned above to use python also right, actually I used python for ETL part, which is Extract, Transform and Load. Extract data from data source, do transformation and load the data into Spreadsheet. Don’t worry, I will guide the end to end process, python and spreadsheet part. I just need you to read this article until the end to make sure you get some input and knowledge from this article.

Image BY GREG BAUGUES

First things in data analysis is about the data source, so you need to know where your data come from. For example, you can directly get the data from a website, or your data from a database and so on. So, after you know where your data comes from, it’s easier for you to automate the ETL part. For this article, my data source comes from Big Query so I will explain how to read data from Big Query using python. If your data inside your company website, then you need to do web-scraping, if you need knowledge about web-scraping then you can refer to this link, and if you need help how to read data from MYSQL, you can refer to this link, btw the writer read MYSQL using app script, but don’t worry. It is still on the same boat with what we have discussed here.

Ok, Let’s start with the ETL part. Before that automated ETL part is very important because after we automate the process no need to manually download and replace the data inside the spreadsheets, because after that, when the data changes the output also will change that is why we need to automate this process.

For this ETL part, I’m using python for

  1. Read data from Big Query.
  2. Clean Spreadsheet cell, this is to replace it with a new data set.
  3. Push the data to Spreadsheet.

Ok, first import all the required library and you need your bqAPI.json file to give access to read the data from BQ, code as below:

import pandas as pd
import numpy as np
import datetime
import time
from datetime import date,timedelta
from oauth2client.service_account import ServiceAccountCredentials
import gspread
from df2gspread import df2gspread as d2g
import gspread_dataframe as gd
import os
import pyarrow
from google.cloud import bigquery
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="bqApi.json"

Then, you can read the data from BQ as below.

client = bigquery.Client()
QUERY = """
SELECT * FROM `project_name.data_setname.table_name`
"""
query_job = client.query(QUERY)
df = query_job.to_dataframe()

you will get something like this,

Then we can push the data into Spreadsheet. The code as below,

def push_to_spreadsheet(spreadsheet_name,sheet_name,clean_cell,data_frame):
scope = ['https://www.googleapis.com/auth/spreadsheets','https://www.googleapis.com/auth/drive.file','https://www.googleapis.com/auth/drive']
creds = ServiceAccountCredentials.from_json_keyfile_name('credential.json',scope)
client = gspread.authorize(creds)
ws = client.open(spreadsheet_name).worksheet(sheet_name)cell_list = ws.range(clean_cell)for cell in cell_list:
cell.value = ''
# Update in batch
ws.update_cells(cell_list)
time.sleep(5)
gd.set_with_dataframe(ws,data_frame)
print('done complete : ', sheet_name)

For the above code, I create it in 1 function, therefore you just need to call the function to make sure the code can run.

4 parameters of the function describe as below:

  1. spreadsheet_ name : name of spreadsheet you save for the project
  2. sheet_name : name of the sheet to work with.
  3. clean_cell : cell range that we need to delete, before replacing with new dataset.
  4. data_frame : name of dataframe that we want to push into spreadsheet.

example as below :

Ok, settle on the data extraction part then let’s start with analyzing the data inside the spreadsheet.

First thing in sale analysis, people are more interested to know about the sale performance. Therefore, for sale performance I will provide in two types, first overall sale performance and second by month. If you need daily performance, I let you explore by yourself, because if you follow step by step what I will explain later, surely you can find out how to apply daily sales performance.

Overall Sale Performance,

To get the total sale, we can use sumifs or sumif syntax. The difference between sumifs and sumif is sumif only have one criteria, if you have more than one criteria then consider using sumifs.

Therefore, to get overall sales performance then I only use sumif because there is only one criteria that I need to consider which is Order Status. The syntax as below :

SUMIF(sum_range, criteria_range, criterion)sum_range : range to be summed
criteria_range : range to check against criterion
criterion : pattern or test to apply on criterion

so let’s apply in our use case:

=sumif(Sheet1!D:D,Sheet1!B:B,"Shipped")

After we get overall revenue, then we can break down the revenue by month, to help management get an overall view about month performance.

So to get by month revenue we have 2 conditions we need to focus on, month and order status, therefore for this metric we can use sumifs. The syntax,

SUMIFS(sum_range,criteria_range_1,criterion_1,...,criteria_range_n,criterion_n)=sumifs(Sheet1!D:D,Sheet1!B:B,"Shipped",Sheet1!I:I,"Jan 2021")

so you will get the output as below,

It’s boring right, when we see something without color, therefore you can put the cell color by using conditional formatting. Besides, it helps people to understand the table in a short time.

click on format > conditional formatting > choose “colour scale” > select the cell > change the color based on your preference.

For me I prefer to use green color for maximum revenue and red color as minimum revenue.

Much better right compared to the table before? You can directly understand the table, in which Jan 2021 the revenue is lowest compared to other months, while in March you achieve the highest revenue compared to other months.

If you need more about spreadsheet function, then you can refer to this link.

Thank you…..

--

--