PostgreSQL is an open-source relational database management system (RDBMS) trusted by many large-scale applications and organizations for various purposes including web applications, data warehousing, and analytics.
Among Postgres’ many properties, it’s worth mentioning the following:
- PostgreSQL ensures transactions are Atomic, Consistent, Isolated, and Durable (ACID).
- Supports complex data types such as arrays, hstore (key-value pairs), JSON, and XML.
- Utilizes Multiversion Concurrency Control (MVCC) to handle multiple transactions simultaneously.
- Provides robust security features like authentication, authorization, SSL, and advanced role management.
What does ACID mean?
ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability. These are the key properties that ensure reliable processing of database transactions.
- ATOMICITY: each transaction is treated as a single "unit," which either completely succeeds or completely fails. If any part of the transaction fails, the entire transaction is rolled back, and the database state is left unchanged.
- CONSISTENCY: data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof.
- ISOLATION: transactions occur independently without interference from other transactions. The intermediate states of a transaction are invisible to other transactions, and concurrent transactions appear as though they were executed sequentially.
- DURABILITY: once a transaction has been committed, its results are permanently recorded in the database.
What is Multiversion Concurrency Control (MVCC)?
Multiversion Concurrency Control (MVCC) is a database management technique used to handle concurrent access to the database in a way that avoids conflicts and ensures data consistency.
MVCC allows multiple transactions to occur simultaneously without locking the data, instead MVCC maintains multiple versions of each data item (row); so, when a transaction makes changes to a row, it creates a new version of that row rather than modifying the existing one.
Each transaction sees a snapshot of the database at a particular point in time. This snapshot includes only the committed data as of the start of the transaction, along with any changes made by the transaction itself.
When a transaction commits, its changes are made visible to other transactions that start afterward. If a transaction is rolled back, the changes are discarded, and the database remains as it was before the transaction began.
The database periodically removes old versions of data that are no longer needed, which is known as garbage collection. This process ensures that the database does not grow indefinitely.
Install PostgreSQL
You need to install this only once per Ubuntu distro. If you add a new Ubuntu distro then you’ll have to repeat this step.
sudo apt install postgresql postgresql-contrib
Check permissions
cd /var/lib
ls -la
PostgreSQL should have the owner and group as postgres. If this is not the case, run the following command
chown -R postgres:postgres postgresql
Start the database server
sudo service postgresql start
Check Postgres’ status
pg_lsclusters
or with the following command
service postgresql status
Log in to psql
psql is a command-line interface (CLI) tool for interacting with PostgreSQL.
The installation of PostgreSQL on Linux results in the creation of the following:
- a new Linux user called postgres;
- a database named postgres owned by user postgres.
To list the Linux users, run either command:
compgen -u
cut -d: -f1 /etc/passwd
At this point we haven’t created our PostgreSQL user yet nor a new database, therefore we can only log in to the postgres database as the postgres user.
Run the following command to log in to Postgres with psql.
The -U flag means user, the -d flag means database:
psql -U postgres -d postgres
This will result in an error
The reason is peer authentication
cd /etc/postgresql/16/main/
sudo nano pg_hba.conf
Peer authentication means that the connection is only allowed if the name of the PostgreSQL user is the same as the name of the operating system (Ubuntu) user.
So, if you run psql -U postgres as Ubuntu user root or your_username (e.g. md in my case), it won't work.
The solution is to specify a mapping between operating system users and database users in a Postgres file called pg_ident.conf
cd /etc/postgresql/16/main/
sudo nano pg_ident.conf
Save and close the file with Ctrl+x, then type in y and then hit the Enter key.
We need to reference this mapping in another file called pg_hba.conf
sudo nano pg_hba.conf
Save and close the file.
Restart the service so that the changes take effect:
sudo service postgresql restart
If you try to log in, now it should work. Either command works:
psql -U postgres -d postgres
psql -U postgres
Once you’re logged in, run the following command to check username you’re logged in as, and database you’re logged in to
\conninfo
To exit psql:
\q
You may have noticed that I added “postgres” to the pg_ident.conf file. This ensures that I can switch between the “mapping” option and the alternative methods that we’re about to learn. If I don’t add “postgres”, the alternative methods won’t work without disabling the mapping first.
If you do not want to change the pg_ files, you can alternatively log in to psql by switching to the postgres user with the sudo command which elevates privileges to execute commands as root.
The postgres Linux user was automatically created when PostgreSQL was installed.
The command below switches the current user to the postgres user and opens a new shell session.
sudo su postgres
psql
If you don’t specify a database with the -d flag, the psql command logs you in a database that has the same name as the user, in this case the postgres database (because you are logging in as the postgres user).
\conninfo
\q
exit
Another alternative is:
sudo -u postgres psql
The command above runs the psql command directly as the postgres user without opening a new shell session.
\conninfo
\q
Create a new PostgreSQL user
Log in to psql
sudo -u postgres psql
CREATE ROLE your_username LOGIN SUPERUSER INHERIT CREATEDB CREATEROLE;
Here's a breakdown of what each part of the command does:
- LOGIN: The role can be used to log into the PostgreSQL server.
- SUPERUSER: The role has the highest level of privileges and can perform any action without restriction.
- INHERIT: The role will inherit privileges from any roles it is a member of, simplifying privilege management.
- CREATEDB: The role can create new databases, which is useful for administrative tasks.
- CREATEROLE: The role can create, modify, and delete other roles, allowing it to manage user access and permissions.
ALTER ROLE your_username SET client_encoding TO 'utf8';
This command sets the default client encoding for a specific role. This ensures that whenever the specified role connects to the PostgreSQL server, the client encoding is automatically set to UTF-8, which is a widely used character encoding for Unicode.
All text data sent to and received from the PostgreSQL server by clients using this role will be encoded in UTF-8. This helps maintain consistency in how text data is handled, especially in applications that use multiple languages or special characters.
Setting a default client encoding can help avoid issues where the client and server might use different encodings, which could lead to data corruption or incorrect display of text.
ALTER ROLE your_username SET default_transaction_isolation TO 'read committed';
The command above sets the default transaction isolation level to 'read committed' for the specified role, ensuring that all transactions initiated by this role adhere to this isolation level unless overridden.
Transaction Isolation Levels in PostgreSQL
- Read Committed: This is the default isolation level in PostgreSQL. In this mode, a query in a transaction sees only data that was committed before the query began. It does not see changes made by concurrent transactions that have not yet been committed.
- Repeatable Read: This isolation level ensures that if a query is executed again within the same transaction, it will see the same rows, even if other transactions are concurrently modifying the data.
- Serializable: This is the strictest isolation level, ensuring complete isolation from concurrent transactions. It guarantees that the transaction operates as if it were the only transaction in the system.
- Read Uncommitted: This isolation level allows a transaction to see uncommitted changes made by other transactions. PostgreSQL does not actually support this level; it maps it to Read Committed.
ALTER ROLE your_username WITH PASSWORD 'your_password';
\q
I’m going to log back in to psql as my new user “md”
psql
This command will result in an error because there is no database named “md”. We need to specify the database name. Let’s log in to the only database available at this point, that is “postgres”
psql -d postgres
To delete a user (or role):
DROP ROLE md;
Create a new PostgreSQL database
I’m logging into the postgres database as user md and going to create a new database named my_db
psql -d postgres
CREATE DATABASE my_db;
With the /l command you can see the list of all databases
Switch database:
\connect my_db;
Exit psql
\q
To delete a database:
DROP DATABASE my_db;
Install psycopg2
So far, we have interacted with PostgreSQL via the terminal using psql.
To allow our Python/Django application to interact with our PostgreSQL database, we need to install psycopg2.
Psycopg2 is a PostgreSQL database adapter for the Python programming language. It is a powerful and flexible library for interacting with PostgreSQL databases from Python applications.
Before installing psycopg2, we need to install in Ubuntu two development packages: python3-dev and libpq-dev.
These two packages ensure that your system has all the necessary tools and libraries to compile and install psycopg2 and other similar Python packages that rely on C extensions and PostgreSQL libraries.
sudo apt-get update && sudo apt-get upgrade
sudo apt install python3-dev libpq-dev
pip3 install psycopg2
If you want to install a specific version of psycopg2:
pip3 install psycopg2==version_number