The pandas library with SQlite3 for data analysis /machine learning

Umakant_Shinde
Analytics Vidhya
Published in
5 min readOct 12, 2020

aim of this post is to store data from data base using sqlite and accessing data from data base and store in csv file.

in my previous post i’m shared interacting with various data base like CSV file,HTML file,SQL file now i’m explaining interacting with data base ,sql is part of data base but other data base like mongodb ,NOSQL and web XML file , Microsoft Excel files.

now try to understand data in machine learning when you want to train model then firstly to get data without data we can not do anything but data is available everywhere but need data is most important these data accessing from internet then which method to access data and how we can access data these method or way i’m explaining here,

how to access data from data base and accessing data from microsoft excel file ,xml file and loaded data how we can stored in data frame .

Accessing data from SQLite3:-

short introduce about sqlite

sqlite is a public domain software package that provide user define records stored in data base in proper format and these records are very large or very small but all records stored in Relational Data base Management System (RDBMS) form and these records data produce some summaries.

no need of server in sqlite that is serverless.

sqlite is not client server database engine.it is embedded into end program.

accessing data from data base you need to first install sqlite3 library here use following command to install sqlite3.

pip install sqlite

Serverless:-

when we use sqlite library sqlite not need any server we can dirct interact with data base

in traditional way to accessing data is

here we can see that without sqlite3 library how we can access data but this process is very depth and it time taken so mach there for execution of program is delay

here when we use sqlite file then program run fastly

user program can directly interact with sqlite data base and these process run fastly .

Steps

step1: import library

step2: create data base and table

step 3: insert data

step 4: fetch data and store in data frame

step 5:store in csv file

i’m showing you one example that example is that creating table and these table are stored in one data base and these table accessing into data frame format and these data frame format converted into csv file this process i’m sharing with you.

in that i’m created data base and table in that table i’m inserted data let show below picture

here am created data base and table , database name is mydata_with_records.sqlite and connected with that data base and “con” is variable

and created table table name is Student_name_data in that there are 4 columns that store student name , middle name, last name and marks

and con.execute(query) is function that run query in database that is mydata_with_records data base.

here you can see it.

here i’m inserting data in student table but in that when you run this code firstly you want to insert how much student data like 4 students data , 5 students data or 6 that size to execute this code and using for loop it run n’th time and execute statement and data store in database in that database store in table you see that.

and here all student record data selected and stored in cursor variable using con.execute() function and these list stored in data frame (when you fetch data into in any variable then these data by default stored in list)

here you can see it all data stored into data frame by proper format and these data loaded into student table . but these data not in variable below photo you can see that data are stored in datas variable(data frame)

here data are loaded from student_name_data using read_sql() function

previous photo data are fetched using sql command and this code used pandas read_sql function and data are fetched using pandas function and these are difference .

and all data are stored in student_records csv file using to_csv() function in pandas and data2 variable we stored student records file

you can see that we are successfully data stored in data base and these data loaded in data frame and these data frame stored in csv file after loading csv file we can analysis data .

code of this program is here:-

#program created by Shinde_umakant

import sqlite3

import sqlalchemy as sqla

import pandas as pd

query = “””

CREATE TABLE Student_name_data

(First_name VARCHAR(20),middle_name VARCHAR(20), last_name VARCHAR(20),

Marks INTEGER

);”””

#here i’m created one query in that created table

con = sqlite3.connect(‘mydata_wtih_records.sqlite’)

con.execute(query)

con.commit()

dataa=int(input(“Enter data size”))

for i in range(dataa):

frist_name=input(“Enter first name”)

middle_name=input(“Enter middle name”)

last_name=input(“Enter last name”)

Marks=input(“Enter marks”)

data = [(frist_name,middle_name,last_name,Marks)]

tmt = “INSERT INTO Student_name_data VALUES(?,?,?,?)”

con.executemany(tmt,data)

con.commit()

cursor = con.execute(‘select * from Student_name_data’)

rows = cursor.fetchall()

rows

pd.DataFrame(rows, columns=[x[0] for x in cursor.description])

db = sqla.create_engine(‘sqlite:///mydata_wtih_records.sqlite’)

datas=pd.read_sql(‘select * from Student_name_data’, db)

datas

type(datas)

datas.to_csv(“student_records.csv”)

data2=pd.read_csv(“student_records.csv”)

data2.head()

this program is sample we can access data from html and and html data we can store in data base just change inserting data time code and after stored data we can access and next process i’m shared here .

next blog i will share how we can access data from xml, mongodb, Microsoft Excel files.file.

if you are not readed my previous post then read it i’m sharing link here

https://medium.com/analytics-vidhya/the-pandas-library-in-depth-part-1-c85dde2931d5

--

--

Umakant_Shinde
Analytics Vidhya

Computer Science Engineer. machine learning and data science . I’m trying to cover basic level to advance level topic in the data science domain