Part 5.1 !! Pandas DataFrame to PostgreSQL using Python
How to read data from PostgreSQL to Pandas DataFrame?
Prerequisites
Python 3.8.3 : Anaconda download link
PostgreSQL 13 : Download link
Psycopg2 : To install Psycopg2 use the command: pip install psycopg2
Objective
Generally we create training and testing data by importing csv file into pandas DataFrame but when we have large data stored in database server then we need a method to extract it into pandas DataFrame directly from database server. For this objective, we will learn step by step working code for different methods for doing it.
Step 1 : Import libraries
# import sys to get more detailed Python exception info
import sys
# import the connect library for psycopg2
import psycopg2
# import the error handling libraries for psycopg2
from psycopg2 import OperationalError, errorcodes, errors
import psycopg2.extras as extras
import pandas as pd
Step 2 : Specify the connection parameters
# Note: please change your database, username & password as per your own values
conn_params_dic = {
"host" : "localhost",
"database" : "irisdb",
"user" : "postgres",
"password" : "Passw0rd"
}
Step 3 : Support Functions
# Define a function that handles and parses psycopg2 exceptions
def show_psycopg2_exception(err):
# get details about the exception
err_type, err_obj, traceback = sys.exc_info()
# get the line number when exception occured
line_n = traceback.tb_lineno
# print the connect() error
print ("\npsycopg2 ERROR:", err, "on line number:", line_n)
print ("psycopg2 traceback:", traceback, "-- type:", err_type)
# psycopg2 extensions.Diagnostics object attribute
print ("\nextensions.Diagnostics:", err.diag)
# print the pgcode and pgerror exceptions
print ("pgerror:", err.pgerror)
print ("pgcode:", err.pgcode, "\n")# Define a connect function for PostgreSQL database server
def connect(conn_params_dic):
conn = None
try:
print('Connecting to the PostgreSQL...........')
conn = psycopg2.connect(**conn_params_dic)
print("Connection successfully..................")
except OperationalError as err:
# passing exception to function
show_psycopg2_exception(err)
# set the connection to 'None' in case of error
conn = None
return conn# Method 2 : Using psycopg2
# Define postgresql_to_dataframe function to load data into a pandas # dataframe
def postgresql_to_dataframe(conn, sql, col_names):
cursor = conn.cursor()
try:
cursor.execute(sql)
except (Exception, psycopg2.DatabaseError) as err:
# passing exception to function
show_psycopg2_exception(err)
# Naturally we get a list of tupples
tupples = cursor.fetchall()
cursor.close()
# We just need to turn it into a pandas dataframe
df = pd.DataFrame(tupples, columns=col_names)
return df# Method 3 : Using Alchemy
# Define using_alchemy function to load data into a pandas # dataframe
connect_alchemy = "postgresql+psycopg2://%s:%s@%s/%s" % (
conn_params_dic['user'],
conn_params_dic['password'],
conn_params_dic['host'],
conn_params_dic['database']
)
def using_alchemy(sql):
try:
engine = create_engine(connect_alchemy)
df = pd.read_sql_query(sql,con=engine)
except OperationalError as err:
# passing exception to function
show_psycopg2_exception(err)
return df
Step 4: Perform main task
# Connecting to PostgreSQL Data
conn = connect(conn_params_dic)
col_names = ['sepal_length','sepal_width','petal_length','petal_width','species']
# Create a statement for querying PostgreSQL.
sql = "select * from iris"
Method 1 : Using psycopg2
df = postgresql_to_dataframe(conn, sql, col_names)
print()
print(df.head())
Method 2 : Using psycopg2 and pd.read_sql_query()
conn = connect(conn_params_dic)
df = pd.read_sql_query(sql,con=conn)
df.head()
Method 3: Using sqlalchemy
df = using_alchemy(sql)
df.head()
Conclusion : This ends our Part 5.1. In this tutorial we have learned How to read data from PostgreSQL bdatabase to Pandas DataFrame?
All code for this article is available as a Jupyter Notebook on GitHub.
Next Part 5.2 : Extract, Transform, and Load PostgreSQL Data in Python
Previous Learning :
Part 1: Introduction , Connection & Database Creation
Part 2 Create Table in PostgreSQL Database Using Python
Part 3.1: Insert Bulk Data Using executemany() Into PostgreSQL Database
Part 3.2: Insert Bulk Data Using execute_batch() Into PostgreSQL Database
Part 3.3: Insert Bulk Data Using execute_values() method into PostgreSQL Database using Python
Part 3.4: Insert Bulk Data Using mogrify() Into PostgreSQL Database
Part 3.5: Insert Bulk Data Using copy_from() Into PostgreSQL Database
Part 3.6 : Insert Bulk Data Using copy_from() with StringIO Into PostgreSQL Database
Part 3.7 : Insert Bulk Data Using to_sql()(alchemy) Into PostgreSQL Database
Part 4: Comparison of Methods for Importing bulk CSV data Into PostgreSQL Using Python
Stay Positive !! Stay Safe !! Keep Learning :):)
Thanks you for reading! !