Getting Started With PostgreSQL

Shubham Thakur
9 min readNov 14, 2023

--

postgresql

Every App needs a storage to store and retrieve data. This blog can be used to setup postgres using docker and also setup logging and basic settings for postgres.

This document can be used by IT professionals looking to setup postgres server and create basic server which is ready to be used by developers.

PostgreSQL, often referred to as Postgres, is a powerful and open-source object-relational database system (ORDBMS).

Core Features:

  • SQL Language: PostgreSQL extends the SQL language and incorporates many features.
  • Data Workloads: It safely stores and scales even the most complex data workloads.
  • Reliability: PostgreSQL has earned a strong reputation for its proven architecture and data integrity.
  • Extensibility: Developers can define custom data types, functions, and even write code in different programming languages without recompiling the database.
  • Operating Systems: It runs on all major operating systems.
  • ACID-Compliant: PostgreSQL has been ACID-compliant since 2001.

But why postgres

  1. Free And Open Source: Its Free as in freedom , you are free to use, modify, study or improve the software. You can use internally or provide service as SaaS.
  2. Its Battle tested: Its used by many major companies like apple, spotify, instagram, reddit etc. Its heavily battle tested and improved over time to support lite to heavy traffic
  3. SQL compliant: Its a relational database and SQL can be use to interact with DB
  4. ACID Compliant: its acid compliant ( will be discussed later)

Getting Started

Okay lets get started with PostgreSQL. We will use docker to deploy and play around with postgres.

We will use docker image

docker pull postgres:14-bullseye

Now create a file called docker-compose.yaml and paste the following

version: '3.9'

services:
postgres:
image: postgres:14
ports:
- 5432:5432
volumes:
- ./postgres:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD="my-super-duper-secret-password"
- POSTGRES_USER=admin
- POSTGRES_DB=admin

Description:

we are exposing postgres at 5432 port and providing few env variables for password and default db and user.

For persistence we are mounting ./postgres to /var/lib/postgresql/data for data.

later we will add config file also.

Lets Perform few basic operations

Playing with psql

In this section we will go through very basics of psql.

First lets exec into the container

docker exec -it installation-postgres-1  psql -Uadmin 

Pay attention how we used admin as user

Description:

psql is command which is used to launch and connect with shell and interact to postgres database. few parameters are given below.

-U : Username to connect as
-p : port of db
-d : dbName to connect to

Cool now you should be in shell

Lets run basic MySQL commands

-- Create a new database called 'Books' --
CREATE DATABASE library;

--- Switch to library database ---
\c library

--- Create Table called book ---
CREATE TABLE book(
bookid Serial Primary Key,
bookName varchar(32) Not NULL,
genre text[],
bookAuthor varchar(32) NOT NULL


);

You may have noticed we used \c to switch the database.

few more helpful shortcuts are

  1. Connect to a specific database:
\c database_name;

2. List all databases

\l

3. Lists all tables in a current database.

\dt

4. Enable Expanded mode

\x

5. List all users:

\du

Cool Currently you are logged in as admin user. We will discuss about this in authorization and authentications section.

Logging

Logging is the act of keeping a log of events that occur in a computer system, such as problems, errors or just information on current operations.

In postgres by default logs are stored at /etc/log/postgres/

In docker compose output you can also see the logs. but they are not persistent

You can pass arguments -c during starting postgres server to set logging directory

example:

postgres -c logging_collector=on -c log_destination=stderr -c log_directory=/logs

In your docker compose file make the following changes

version: '3.9'

services:
postgres:
image: postgres:14
ports:
- 5432:5432
## Add Logging
command: postgres -c logging_collector=on -c log_destination=stderr -c log_directory=/logs
volumes:
- ./postgres:/var/lib/postgresql/data
environment:
- POSTGRES_PASSWORD=VfzhCn6
- POSTGRES_USER=admin
- POSTGRES_DB=admin

now create a folder called logs to store your logs

stop the docker compose and run docker compose up once more

If you get error with permission declined, change directory ownership to 999

use:

sudo chmod 777 logs -R

Now your default log file is changed

Configuration file

Configurations file provides a reliable and fast way to reproduce the server setup.

You can find location of config file using command in psql shell.

SHOW config_file;

This is location of your config_file.

Now lets copy default config file

Run docker ps to get all running container and select the id of your container

example

Now run docker cp to copy default file. I am copying it to ./config/postgresql.conf

docker cp <container-id>:/var/lib/postgresql/data/postgresql.conf ./config/postgresql.conf

Before we deep dive, lets update compose file to read from new config in our local.( its at default location only so even if you don’t change it , it will be same)

Also note we will move config for logging also inside the conf

version: '3.9'

services:
postgres:
image: postgres:14
ports:
- 5432:5432

command: postgres -c config_file=/var/lib/postgresql/data/postgresql.conf
volumes:
- ./postgres:/var/lib/postgresql/data
- ./logs:/logs
- ./config/postgresql.conf:/var/lib/postgresql/data/postgresql.conf
environment:
- POSTGRES_PASSWORD=VfzhCn6
- POSTGRES_USER=admin
- POSTGRES_DB=admin

Exploring Config File

Lets explore few arguments in config file, shall we

In section Connections and Authentications we find the following settings

listen_addresses = '*'
# comma-separated list of addresses;
# defaults to 'localhost'; use '*' for all
# (change requires restart)
#port = 5432 # (change requires restart)
max_connections = 100 # (change requires restart)
#superuser_reserved_connections = 3 # (change requires restart)
#unix_socket_directories = '/var/run/postgresql' # comma-separated list of directories
# (change requires restart)
#unix_socket_group = '' # (change requires restart)
#unix_socket_permissions = 0777 # begin with 0 to use octal notation
# (change requires restart)
#bonjour = off # advertise server via Bonjour
# (change requires restart)
#bonjour_name = '' # defaults to the computer name
# (change requires restart)

# - TCP settings -
# see "man tcp" for details

#tcp_keepalives_idle = 0 # TCP_KEEPIDLE, in seconds;
# 0 selects the system default
#tcp_keepalives_interval = 0 # TCP_KEEPINTVL, in seconds;
# 0 selects the system default
#tcp_keepalives_count = 0 # TCP_KEEPCNT;
# 0 selects the system default
#tcp_user_timeout = 0 # TCP_USER_TIMEOUT, in milliseconds;
# 0 selects the system default

#client_connection_check_interval = 0 # time between checks for client
# disconnection while running queries;
# 0 for never

# - Authentication -

#authentication_timeout = 1min # 1s-600s
#password_encryption = scram-sha-256 # scram-sha-256 or md5
#db_user_namespace = off

Most of settings have comments associated with them, lets discuss few of most frequently used (This are directly taken from link below and added use case at end for better understanding)

Ref. https://www.postgresql.org/docs/14/runtime-config-connection.html

listen_addresses = '*'

Description:

By default , postgres listens on all the incoming sockets.Specifies the TCP/IP address(es) on which the server is to listen for connections from client applications. The value takes the form of a comma-separated list of host names and/or numeric IP addresses. The special entry * corresponds to all available IP interfaces. The entry 0.0.0.0 allows listening for all IPv4 addresses and :: allows listening for all IPv6 addresses. If the list is empty, the server does not listen on any IP interface at all, in which case only Unix-domain sockets can be used to connect to it. If the list is not empty, the server will start if it can listen on at least one TCP/IP address. A warning will be emitted for any TCP/IP address which cannot be opened. The default value is localhost, which allows only local TCP/IP “loopback” connections to be made.

Usecase:

listen_addresses controls which interfaces accept connection attempts, which can help prevent repeated malicious connection requests on insecure network interfaces. This parameter can only be set at server start.

port = 5432 (integer)

port(integer)

Description:

The TCP port the server listens on; 5432 by default. Note that the same port number is used for all IP addresses the server listens on. This parameter can only be set at server start.

Usecase:

This is very useful if you want to run 2 servers from same machine or change default port

max_connections = 100

max_connections (integer)

Description:

Determines the maximum number of concurrent connections to the database server. The default is typically 100 connections, but might be less if your kernel settings will not support it (as determined during initdb). This parameter can only be set at server start.

When running a standby server, you must set this parameter to the same or higher value than on the primary server. Otherwise, queries will not be allowed in the standby server.

Usecase:

the default value 100 is very small, and is mostly changed to a bigger number to support production traffic.

superuser_reserved_connections = 3

superuser_reserved_connections (integer)

Description:

Determines the number of connection “slots” that are reserved for connections by PostgreSQL superusers. At most max_connections connections can ever be active simultaneously. Whenever the number of active concurrent connections is at least max_connections minus superuser_reserved_connections, new connections will be accepted only for superusers, and no new replication connections will be accepted.

The default value is three connections. The value must be less than max_connections. This parameter can only be set at server start.

Usecase:

Remember when max_connections is hit the postgres will refuse any connections including, shell. So always create a new user without super user permissions to use in applications.

Logging

Under section REPORTING AND LOGGING you can find below config

#------------------------------------------------------------------------------
# REPORTING AND LOGGING
#------------------------------------------------------------------------------

# - Where to Log -

#log_destination = 'stderr' # Valid values are combinations of
# stderr, csvlog, syslog, and eventlog,
# depending on platform. csvlog
# requires logging_collector to be on.

# This is used when logging to stderr:
#logging_collector = off # Enable capturing of stderr and csvlog
# into log files. Required to be on for
# csvlogs.
# (change requires restart)

# These are only used if logging_collector is on:
#log_directory = 'log' # directory where log files are written,
# can be absolute or relative to PGDATA
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
# can include strftime() escapes
#log_file_mode = 0600 # creation mode for log files,
# begin with 0 to use octal notation
#log_rotation_age = 1d # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.
#log_truncate_on_rotation = off # If on, an existing log file with the
# same name as the new log file will be
# truncated rather than appended to.
# But such truncation only occurs on
# time-driven rotation, not on restarts
# or size-driven rotation. Default is
# off, meaning append to existing files
# in all cases.

# These are relevant when logging to syslog:
#syslog_facility = 'LOCAL0'
#syslog_ident = 'postgres'
#syslog_sequence_numbers = on
#syslog_split_messages = on

# This is only relevant when logging to eventlog (Windows):
# (change requires restart)
#event_source = 'PostgreSQL'

# - When to Log -

#log_min_messages = warning # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic

#log_min_error_statement = error # values in order of decreasing detail:
# debug5
# debug4
# debug3
# debug2
# debug1
# info
# notice
# warning
# error
# log
# fatal
# panic (effectively off)

#log_min_duration_statement = -1 # -1 is disabled, 0 logs all statements
# and their durations, > 0 logs only
# statements running at least this number
# of milliseconds

#log_min_duration_sample = -1 # -1 is disabled, 0 logs a sample of statements
# and their durations, > 0 logs only a sample of
# statements running at least this number
# of milliseconds;
# sample fraction is determined by log_statement_sample_rate

#log_statement_sample_rate = 1.0 # fraction of logged statements exceeding
# log_min_duration_sample to be logged;
# 1.0 logs all such statements, 0.0 never logs


#log_transaction_sample_rate = 0.0 # fraction of transactions whose statements
# are logged regardless of their duration; 1.0 logs all
# statements from all transactions, 0.0 never logs

Few Imp configs are:

  1. log_destination : PostgreSQL supports several methods for logging server messages, including stderr, csvlog and syslog. On Windows, eventlog is also supported. Set this parameter to a list of desired log destinations separated by commas. The default is to log to stderr only.
  2. log_directory : directory to log to.
  3. log_filename : log file name pattern.

To reduce log file size use log rotations

#log_rotation_age = 1d   # Automatic rotation of logfiles will
# happen after that time. 0 disables.
#log_rotation_size = 10MB # Automatic rotation of logfiles will
# happen after that much log output.
# 0 disables.

To change what should be logged use

#log_min_messages = warning  # values in order of decreasing detail:

Include other config

To seprate out configs , you can go to CONFIG FILE INCLUDES and config the files and dir to pick custom configs

Useful if you want to reduce config in single file or have multiple servers with different sections in configs.

#------------------------------------------------------------------------------
# CONFIG FILE INCLUDES
#------------------------------------------------------------------------------

# These options allow settings to be loaded from files other than the
# default postgresql.conf. Note that these are directives, not variable
# assignments, so they can usefully be given more than once.

#include_dir = '...' # include files ending in '.conf' from
# a directory, e.g., 'conf.d'
#include_if_exists = '...' # include file only if it exists
#include = '...' # include file

This concludes the basic settings we need to do , before running postgres server.

If you are looking to add authentication and authorization checkout article

Hope you liked the article, If there are any suggestions or mistakes i have done please let me know.

--

--