The pandas library with SQlite3 for data analysis /machine learning
aim of this post is to store data from data base using sqlite and accessing data from data base and store in csv file.
in my previous post i’m shared interacting with various data base like CSV file,HTML file,SQL file now i’m explaining interacting with data base ,sql is part of data base but other data base like mongodb ,NOSQL and web XML file , Microsoft Excel files.
now try to understand data in machine learning when you want to train model then firstly to get data without data we can not do anything but data is available everywhere but need data is most important these data accessing from internet then which method to access data and how we can access data these method or way i’m explaining here,
how to access data from data base and accessing data from microsoft excel file ,xml file and loaded data how we can stored in data frame .
Accessing data from SQLite3:-
short introduce about sqlite
sqlite is a public domain software package that provide user define records stored in data base in proper format and these records are very large or very small but all records stored in Relational Data base Management System (RDBMS) form and these records data produce some summaries.
no need of server in sqlite that is serverless.
sqlite is not client server database engine.it is embedded into end program.
accessing data from data base you need to first install sqlite3 library here use following command to install sqlite3.
pip install sqlite
Serverless:-
when we use sqlite library sqlite not need any server we can dirct interact with data base
in traditional way to accessing data is
here we can see that without sqlite3 library how we can access data but this process is very depth and it time taken so mach there for execution of program is delay
here when we use sqlite file then program run fastly
user program can directly interact with sqlite data base and these process run fastly .
Steps
step1: import library
step2: create data base and table
step 3: insert data
step 4: fetch data and store in data frame
step 5:store in csv file
i’m showing you one example that example is that creating table and these table are stored in one data base and these table accessing into data frame format and these data frame format converted into csv file this process i’m sharing with you.
in that i’m created data base and table in that table i’m inserted data let show below picture
here am created data base and table , database name is mydata_with_records.sqlite and connected with that data base and “con” is variable
and created table table name is Student_name_data in that there are 4 columns that store student name , middle name, last name and marks
and con.execute(query) is function that run query in database that is mydata_with_records data base.
here you can see it.
here i’m inserting data in student table but in that when you run this code firstly you want to insert how much student data like 4 students data , 5 students data or 6 that size to execute this code and using for loop it run n’th time and execute statement and data store in database in that database store in table you see that.
and here all student record data selected and stored in cursor variable using con.execute() function and these list stored in data frame (when you fetch data into in any variable then these data by default stored in list)
here you can see it all data stored into data frame by proper format and these data loaded into student table . but these data not in variable below photo you can see that data are stored in datas variable(data frame)
here data are loaded from student_name_data using read_sql() function
previous photo data are fetched using sql command and this code used pandas read_sql function and data are fetched using pandas function and these are difference .
and all data are stored in student_records csv file using to_csv() function in pandas and data2 variable we stored student records file
you can see that we are successfully data stored in data base and these data loaded in data frame and these data frame stored in csv file after loading csv file we can analysis data .
code of this program is here:-
#program created by Shinde_umakant
import sqlite3
import sqlalchemy as sqla
import pandas as pd
query = “””
CREATE TABLE Student_name_data
(First_name VARCHAR(20),middle_name VARCHAR(20), last_name VARCHAR(20),
Marks INTEGER
);”””
#here i’m created one query in that created table
con = sqlite3.connect(‘mydata_wtih_records.sqlite’)
con.execute(query)
con.commit()
dataa=int(input(“Enter data size”))
for i in range(dataa):
frist_name=input(“Enter first name”)
middle_name=input(“Enter middle name”)
last_name=input(“Enter last name”)
Marks=input(“Enter marks”)
data = [(frist_name,middle_name,last_name,Marks)]
tmt = “INSERT INTO Student_name_data VALUES(?,?,?,?)”
con.executemany(tmt,data)
con.commit()
cursor = con.execute(‘select * from Student_name_data’)
rows = cursor.fetchall()
rows
pd.DataFrame(rows, columns=[x[0] for x in cursor.description])
db = sqla.create_engine(‘sqlite:///mydata_wtih_records.sqlite’)
datas=pd.read_sql(‘select * from Student_name_data’, db)
datas
type(datas)
datas.to_csv(“student_records.csv”)
data2=pd.read_csv(“student_records.csv”)
data2.head()
this program is sample we can access data from html and and html data we can store in data base just change inserting data time code and after stored data we can access and next process i’m shared here .
next blog i will share how we can access data from xml, mongodb, Microsoft Excel files.file.
if you are not readed my previous post then read it i’m sharing link here
https://medium.com/analytics-vidhya/the-pandas-library-in-depth-part-1-c85dde2931d5