Getting Started With PostgreSQL
Every App needs a storage to store and retrieve data. This blog can be used to setup postgres using docker and also setup logging and basic settings for postgres.
This document can be used by IT professionals looking to setup postgres server and create basic server which is ready to be used by developers.
PostgreSQL, often referred to as Postgres, is a powerful and open-source object-relational database system (ORDBMS).
Core Features:
- SQL Language: PostgreSQL extends the SQL language and incorporates many features.
- Data Workloads: It safely stores and scales even the most complex data workloads.
- Reliability: PostgreSQL has earned a strong reputation for its proven architecture and data integrity.
- Extensibility: Developers can define custom data types, functions, and even write code in different programming languages without recompiling the database.
- Operating Systems: It runs on all major operating systems.
- ACID-Compliant: PostgreSQL has been ACID-compliant since 2001.
But why postgres
- Free And Open Source: Its Free as in freedom , you are free to use, modify, study or improve the software. You can use internally or provide service as SaaS.
- Its Battle tested: Its used by many major companies like apple, spotify, instagram, reddit etc. Its heavily battle tested and improved over time to support lite to heavy traffic
- SQL compliant: Its a relational database and SQL can be use to interact with DB
- ACID Compliant: its acid compliant ( will be discussed later)
Official Docs: https://www.postgresql.org/docs/
Getting Started
Okay lets get started with PostgreSQL. We will use docker to deploy and play around with postgres.
We will use docker image
docker pull postgres:14-bullseye
Now create a file called docker-compose.yaml and paste the following
version: '3.9'
services:
postgres:
image: postgres:14
ports:
- 5432:5432
volumes:
- ./postgres:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD="my-super-duper-secret-password"
- POSTGRES_USER=admin
- POSTGRES_DB=admin
Description:
we are exposing postgres at 5432 port and providing few env variables for password and default db and user.
For persistence we are mounting ./postgres to /var/lib/postgresql/data for data.
later we will add config file also.
Lets Perform few basic operations
Playing with psql
In this section we will go through very basics of psql.
First lets exec into the container
docker exec -it installation-postgres-1 psql -Uadmin
Pay attention how we used admin as user
Description:
psql is command which is used to launch and connect with shell and interact to postgres database. few parameters are given below.
-U : Username to connect as
-p : port of db
-d : dbName to connect to
Cool now you should be in shell
Lets run basic MySQL commands
-- Create a new database called 'Books' --
CREATE DATABASE library;
--- Switch to library database ---
\c library
--- Create Table called book ---
CREATE TABLE book(
bookid Serial Primary Key,
bookName varchar(32) Not NULL,
genre text[],
bookAuthor varchar(32) NOT NULL
);
You may have noticed we used \c to switch the database.
few more helpful shortcuts are
- Connect to a specific database:
\c database_name;
2. List all databases
\l
3. Lists all tables in a current database.
\dt
4. Enable Expanded mode
\x
5. List all users:
\du
Cool Currently you are logged in as admin user. We will discuss about this in authorization and authentications section.
Logging
Logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or just information on current operations.
In postgres by default logs are stored at /etc/log/postgres/
In docker compose output you can also see the logs. but they are not persistent
You can pass arguments -c during starting postgres server to set logging directory
example:
postgres -c logging_collector=on -c log_destination=stderr -c log_directory=/logs
In your docker compose file make the following changes
version: '3.9'
services:
postgres:
image: postgres:14
ports:
- 5432:5432
## Add Logging
command: postgres -c logging_collector=on -c log_destination=stderr -c log_directory=/logs
volumes:
- ./postgres:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD=VfzhCn6
- POSTGRES_USER=admin
- POSTGRES_DB=admin
now create a folder called logs to store your logs
stop the docker compose and run docker compose up once more
If you get error with permission declined, change directory ownership to 999
use:
sudo chmod 777 logs -R
Now your default log file is changed
Configuration file
Configurations file provides a reliable and fast way to reproduce the server setup.
You can find location of config file using command in psql shell.
SHOW config_file;
This is location of your config_file.
Now lets copy default config file
Run docker ps
to get all running container and select the id of your container
example
Now run docker cp
to copy default file. I am copying it to ./config/postgresql.conf
docker cp <container-id>:/var/lib/postgresql/data/postgresql.conf ./config/postgresql.conf
Before we deep dive, lets update compose file to read from new config in our local.( its at default location only so even if you don’t change it , it will be same)
Also note we will move config for logging also inside the conf
version: '3.9'
services:
postgres:
image: postgres:14
ports:
- 5432:5432
command: postgres -c config_file=/var/lib/postgresql/data/postgresql.conf
volumes:
- ./postgres:/var/lib/postgresql/data
- ./logs:/logs
- ./config/postgresql.conf:/var/lib/postgresql/data/postgresql.conf
environment:
- POSTGRES_PASSWORD=VfzhCn6
- POSTGRES_USER=admin
- POSTGRES_DB=admin
Exploring Config File
Lets explore few arguments in config file, shall we
In section Connections and Authentications we find the following settings
listen_addresses = '*'
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
#port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)
# - TCP settings -
# see "man tcp" for details
#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#tcp_user_timeout = 0 # TCP_USER_TIMEOUT, in milliseconds;
# 0 selects the system default
#client_connection_check_interval = 0 # time between checks for client
# disconnection while running queries;
# 0 for never
# - Authentication -
#authentication_timeout = 1min # 1s-600s
#password_encryption = scram-sha-256 # scram-sha-256 or md5
#db_user_namespace = off
Most of settings have comments associated with them, lets discuss few of most frequently used (This are directly taken from link below and added use case at end for better understanding)
Ref. https://www.postgresql.org/docs/14/runtime-config-connection.html
listen_addresses = '*'
Description:
By default , postgres listens on all the incoming sockets.Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications. The value takes the form of a comma-separated list of host names and/or numeric IP addresses. The special entry *
corresponds to all available IP interfaces. The entry 0.0.0.0
allows listening for all IPv4 addresses and ::
allows listening for all IPv6 addresses. If the list is empty, the server does not listen on any IP interface at all, in which case only Unix-domain sockets can be used to connect to it. If the list is not empty, the server will start if it can listen on at least one TCP/IP address. A warning will be emitted for any TCP/IP address which cannot be opened. The default value is localhost, which allows only local TCP/IP “loopback” connections to be made.
Usecase:
listen_addresses
controls which interfaces accept connection attempts, which can help prevent repeated malicious connection requests on insecure network interfaces. This parameter can only be set at server start.
port = 5432 (integer)
port(integer)
Description:
The TCP port the server listens on; 5432 by default. Note that the same port number is used for all IP addresses the server listens on. This parameter can only be set at server start.
Usecase:
This is very useful if you want to run 2 servers from same machine or change default port
max_connections = 100
max_connections
(integer
)
Description:
Determines the maximum number of concurrent connections to the database server. The default is typically 100 connections, but might be less if your kernel settings will not support it (as determined during initdb). This parameter can only be set at server start.
When running a standby server, you must set this parameter to the same or higher value than on the primary server. Otherwise, queries will not be allowed in the standby server.
Usecase:
the default value 100
is very small, and is mostly changed to a bigger number to support production traffic.
superuser_reserved_connections = 3
superuser_reserved_connections
(integer
)
Description:
Determines the number of connection “slots” that are reserved for connections by PostgreSQL superusers. At most max_connections connections can ever be active simultaneously. Whenever the number of active concurrent connections is at least max_connections
minus superuser_reserved_connections
, new connections will be accepted only for superusers, and no new replication connections will be accepted.
The default value is three connections. The value must be less than max_connections
. This parameter can only be set at server start.
Usecase:
Remember when max_connections is hit the postgres will refuse any connections including, shell. So always create a new user without super user permissions to use in applications.
Logging
Under section REPORTING AND LOGGING you can find below config
#------------------------------------------------------------------------------
# REPORTING AND LOGGING
#------------------------------------------------------------------------------
# - Where to Log -
#log_destination = 'stderr' # Valid values are combinations of
# stderr, csvlog, syslog, and eventlog,
# depending on platform. csvlog
# requires logging_collector to be on.
# This is used when logging to stderr:
#logging_collector = off # Enable capturing of stderr and csvlog
# into log files. Required to be on for
# csvlogs.
# (change requires restart)
# These are only used if logging_collector is on:
#log_directory = 'log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_file_mode = 0600 # creation mode for log files,
# begin with 0 to use octal notation
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.
#log_truncate_on_rotation = off # If on, an existing log file with the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.
# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
#syslog_sequence_numbers = on
#syslog_split_messages = on
# This is only relevant when logging to eventlog (Windows):
# (change requires restart)
#event_source = 'PostgreSQL'
# - When to Log -
#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic
#log_min_error_statement = error # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic (effectively off)
#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds
#log_min_duration_sample = -1 # -1 is disabled, 0 logs a sample of statements
# and their durations, > 0 logs only a sample of
# statements running at least this number
# of milliseconds;
# sample fraction is determined by log_statement_sample_rate
#log_statement_sample_rate = 1.0 # fraction of logged statements exceeding
# log_min_duration_sample to be logged;
# 1.0 logs all such statements, 0.0 never logs
#log_transaction_sample_rate = 0.0 # fraction of transactions whose statements
# are logged regardless of their duration; 1.0 logs all
# statements from all transactions, 0.0 never logs
Few Imp configs are:
log_destination
: PostgreSQL supports several methods for logging server messages, including stderr, csvlog and syslog. On Windows, eventlog is also supported. Set this parameter to a list of desired log destinations separated by commas. The default is to log to stderr only.log_directory
: directory to log to.log_filename
: log file name pattern.
To reduce log file size use log rotations
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.
To change what should be logged use
#log_min_messages = warning # values in order of decreasing detail:
Include other config
To seprate out configs , you can go to CONFIG FILE INCLUDES and config the files and dir to pick custom configs
Useful if you want to reduce config in single file or have multiple servers with different sections in configs.
#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------
# These options allow settings to be loaded from files other than the
# default postgresql.conf. Note that these are directives, not variable
# assignments, so they can usefully be given more than once.
#include_dir = '...' # include files ending in '.conf' from
# a directory, e.g., 'conf.d'
#include_if_exists = '...' # include file only if it exists
#include = '...' # include file
This concludes the basic settings we need to do , before running postgres server.
If you are looking to add authentication and authorization checkout article
Hope you liked the article, If there are any suggestions or mistakes i have done please let me know.