Database – Mellowhost Blog

How to see all the constraints in a Postgresql Database

To see/list the constraints, first, connect to the database using the following:

\c my_prod;

Here we are assuming the database name is my_prod. Please note, we are putting these commands in the psql client utility.

Now, use the following query to list all the constraints in the database:

select pgc.conname as constraint_name,
       ccu.table_schema as table_schema,
       ccu.table_name,
       ccu.column_name,
       contype,
        pg_get_constraintdef(pgc.oid)
from pg_constraint pgc
         join pg_namespace nsp on nsp.oid = pgc.connamespace
         join pg_class  cls on pgc.conrelid = cls.oid
         left join information_schema.constraint_column_usage ccu
                   on pgc.conname = ccu.constraint_name
                       and nsp.nspname = ccu.constraint_schema
order by pgc.conname;

Good luck

How to stop Postgresql when you have multiple versions of PGSQL Running on Ubuntu

Question: How to stop Postgresql when you have multiple versions of PGSQL Running on Ubuntu

You may run the following command to stop specific version of postgresql when using multiple versions of postgresql in a single system, under Ubuntu

systemctl stop postgresql[@version-main]

So, for example, if you have a system, with 3 postgresql server, 12, 14, 15, and would like to stop 14 and 15, then you can run the following:

systemctl stop postgresql@15-main
systemctl stop postgresql@14-main

To disable them from booting:

systemctl disable postgresql@15-main
systemctl disable postgresql@14-main

How to Set Timezone in Postgres Databases

Question: How to Set Timezone in Postgres Databases

Answer

Postgres allows you to set a timezone per database. To view the list of available timezones, you may use the following query from psql:

select * from pg_timezone_names;

To set a timezone for say a database called ‘inventory_report’, you may use the following query:

alter database inventory_report set timezone to 'Asia/Dhaka';

Tips:

To List Postgres Databases, you may use the following command from psql:

\l

To List Postgres Tables in a Database:

Connect to the database first:

\c my_prod

Now, list:

\dt

How to Allow Remote User Access in PostgreSQL

Configuration PostgreSQL is a bit tricky and complicated compared to MySQL for a few cases. One such case would be allowing remote user access in PGSQL. You might be interested to allow your home network to connect to your test or production database to run a few queries. To do that, if you are not using a tool like Percona Distribution for Postgres or a Cluster tool like Clustercontrol, it might get a bit complicated.

I have hence, set a list of steps, we need to do to achieve the goal.

Set PostgreSQL to Allow Remote Addresses

To do that, open up your postgresql.conf file. It is usually located under /var/lib/pgsql/9.6/data/postgresql.conf file. I am running PGSQL 9.6, hence the folder is in 9.6. If you are running any other version of PGSQL, it shall change based on the version.

nano /var/lib/pgsql/9.6/data/postgresql.conf

Find the directive ‘listen_address’ and uncomment it, if it is commented. Also, set this to wildcard like the following:

listen_address = "*"

Create a PostgreSQL User with Encrypted Password

Open up your psql console from the user postgres and run the query given below:

su - postgres
psql

create user new_user with encrypted password 'testu1234'

You should always use ‘with encrypted password’, as this would avoid storing your password in plaintext instead of md5.

Now, grant the user on the database, you would like it to access:

grant all privileges on database my_prod to new_user;

Add Remote Access for User in PostgreSQL

PostgreSQL manages a remote access file, to allow access from remote sources. The file is located in /var/lib/pgsql/9.6/data/pg_hba.conf. To allow remote access from an IP, we need to add a line in this file:

Edit the file in nano

nano /var/lib/pgsql/9.6/data/pg_hba.conf

Add the following line

host    all             new_user           27.147.176.2/32       md5

Here 27.147.176.2 is the IP from where I would like to access the database.

Restart PostgreSQL

Now, restart your PostgreSQL instance

systemctl restart postgresql-9.6.service

You should be now set.

Few Additional Things For Reference

How to list the users of PostgreSQL from the console?

From the Postgres console (psql), run the following command:
\du
How to change the password for a PostgreSQL user?

From the Postgres console (psql), run the following query:

ALTER USER username WITH ENCRYPTED PASSWORD 'password';

where username is the username you would like the password changed. Remember to use encrypted password keyword to save the password in md5.

3. How to change the user to a superuser?

You may use the following query:

ALTER USER new_user WITH SUPERUSER;

How to Backup/Dump and Restore PostgreSQL Databases

Here is the situation. We have a customer, who runs a POS backed by PostgreSQL. Customers want a second POS with the updated SQL till last night. Point to note, he didn’t want to use the Master/Replica or Master/Master replication, instead, a day earlier backup to be restored.

A similar situation is probably applicable for developers who are trying to work with staging and production environments and vice versa. In such a situation, you need a technique for dumping and restoring databases.

Backup/Dump Postgresql Databases

To dump the PostgreSQL database, you need to use pg_dump. First ssh to your server, change the user to Postgres, and then run the dump

su - postgres
pg_dump your_database > your_database_2021_13_11.sql

Here, your_database is the name of the database you are trying to dump.

Copy the SQL to the Remote Server

Once the dump is done, now you may copy the SQL file to the remote server. You may do that using rsync:

rsync -vrplogDtH --progress your_database_2021_13_11.sql [email protected]:/var/lib/pgsql/

Replace your_database_2021_13_11.sql with your SQL dump name and ‘ip.ip.ip.ip’ with the IP of your remote destination.

Restore the Dump

Now, ssh to the remote server, and continue with the following command:

# change to postgres user
su - postgres
# drop the old database
psql -c 'drop database your_database;'
# create database (replace your_username with the one you want to use for the database)
psql -c 'create database your_database with owner your_username;'
# restore database
psql your_database < /var/lib/pgsql/your_database_2021_13_11.sql

Once done, you should be able to use the new restored database. Good luck.

ERROR: cannot drop the currently open database – PostgreSQL

If you are trying to drop a database and getting an error like the following in PostgreSQL:

ERROR: cannot drop the currently open database

Then, the easiest technique to solve the problem is to restart your PostgreSQL server. I am using PGSQL 9.6, here was my command:

systemctl stop postgresql-9.6.service
systemctl start postgresql-9.6.service

If you are using PostgreSQL 13 or above, you have an option to drop the database with force now. You may drop the database using the following:

DROP DATABASE database_to_drop WITH (FORCE);

Replace ‘database_to_drop’ with the database you would like to drop.

How to decide your Innodb Buffer Pool Size? – Why MySQL Buffer Pool is awe-inspiring?

One key bottleneck, every software engineer who uses MySQL, fails to realize, is how to utilize MySQL cache. I have seen people believing the MySQL query cache as the MySQL cache. Sadly, it is not. The query cache is probably one of the worst cache attributes you may enable for your MySQL instance optimization for a high-traffic website.

MySQL Buffers

MySQL utilizes buffers for caching. These buffers are stored in RAM. The key benefit of these buffers is that the changes can also be stored in RAM before flushing the data to disk, along with sync to the replica. Let’s look at an example.

Let’s say, you have selected a post with id 1001

select * from posts where id = 1001

If you have a MySQL buffer, then the result of this query would be placed in the memory.

Now, the subsequent call, let’s say the following:

update posts
set title = 'This is a new title'
where id = 1001

Now, imagine, how the above query will be handled in MySQL? MySQL would write in the RAM, and keep the chunk as dirty to be flushed to disk. Now, that makes the above update super fast to the interactive queries. Because writing to disk would be super slow compared to writing to RAM.

If you live under a heavy load, this is an awe-inspiring tweak. Because this makes your MySQL instance a RAM-based database, with an IO player, which keeps track of your dirty bits, and flushes them when needed. While the con of the approach is that if the system is crashed, dirty bits are lost. Now, we can’t directly say lost, if you are using a system with journaling enabled, that keeps track of the changes, and rolls back.

What is the right Innodb Buffer Pool Size?

There is much misinformation about the size of your buffer pool. MySQL says you should use 80% of your RAM as the pool, which is significantly incorrect. If your database is of size 500MB and the RAM you hold is much larger, then using 80% is quite a waste. You could probably use them for other buffers like a join buffer or a temporary table or a heap table. But unwisely using that for Innodb Buffer is not worthy.

To start with, you can set the buffer pool size to the total size of your database first and let the system run for a day or two. Once done, then you may analyze a few MySQL status variables to find out whether should you increase or decrease. Let’s find out a few details

To get the MySQL status variables, run the following from your MySQL console:

SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool%';

Now, this should give you a details like the following:

MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool%';
+---------------------------------------+--------------------------------------------------+
| Variable_name                         | Value                                            |
+---------------------------------------+--------------------------------------------------+
| Innodb_buffer_pool_dump_status        |                                                  |
| Innodb_buffer_pool_load_status        | Buffer pool(s) load completed at 210915 13:30:05 |
| Innodb_buffer_pool_resize_status      |                                                  |
| Innodb_buffer_pool_load_incomplete    | OFF                                              |
| Innodb_buffer_pool_pages_data         | 117007                                           |
| Innodb_buffer_pool_bytes_data         | 1917042688                                       |
| Innodb_buffer_pool_pages_dirty        | 557                                              |
| Innodb_buffer_pool_bytes_dirty        | 9125888                                          |
| Innodb_buffer_pool_pages_flushed      | 80192071                                         |
| Innodb_buffer_pool_pages_free         | 7339                                             |
| Innodb_buffer_pool_pages_misc         | 6726                                             |
| Innodb_buffer_pool_pages_total        | 131072                                           |
| Innodb_buffer_pool_read_ahead_rnd     | 0                                                |
| Innodb_buffer_pool_read_ahead         | 7627222                                          |
| Innodb_buffer_pool_read_ahead_evicted | 41991                                            |
| Innodb_buffer_pool_read_requests      | 480918989539                                     |
| Innodb_buffer_pool_reads              | 117010959                                        |
| Innodb_buffer_pool_wait_free          | 1206                                             |
| Innodb_buffer_pool_write_requests     | 290588732                                        |
+---------------------------------------+--------------------------------------------------+
19 rows in set (0.001 sec)

We can use a few metrics from this to determine whether we need a change in buffer or not.

Free Pages

First, we look at the variable ‘Innodb_buffer_pool_pages_free’. It says the amount the free pages you have. If you have set a pretty large buffer pool, you should have a large number of pages in total, hence seeing the number ‘Innodb_buffer_pool_pages_free’ should not make you realize if you are going out of the pool page or not. You may calculate a ratio like the following:

free_page_ratio = ( Innodb_buffer_pool_pages_free / Innodb_buffer_pool_pages_total ) * 100

free_page_ratio should not be less than 10% according to my experience until you are running out of RAM. If you have enough RAM, and the ratio is less than 10%, you have fewer free pages, you might want to increase the pool size.

Now, one thing to remember, the value, might stay below 10% all the time, it might not mean you are running out of pool space, instead, you are at the top of the optimization or near to cut off. To determine, if you are out of pool space, make sure to monitor the variable several times of the day and see how often does it change. If it gets lower too often, that suggests you are hitting the boundary and it is time to increase the buffer pool size.
Read Ratio

Next, you need to look at the number of read requests and the number of reads you have performed in total. If you are seeing a lot of read requests and the ratio to reads is higher, then the buffer isn’t enough to store your full database requests in the RAM, and you should increase the value.

read_ratio = (Innodb_buffer_pool_reads / Innodb_buffer_pool_read_requests) * 100

Remember, Innodb_buffer_pool_reads means the amount of requests Innodb had to query the disk to get the data from. The value should be less than 1% for maximum efficiency. In my above example, the value is less than .1%, which makes the Innodb performance the best.
Flushed & Wait Free

The other thing, we need to track is that, Innodb_buffer_pool_wait_free. The variable should be near 0 or 0. If the variable is higher, it suggests a situation has happened, where the flushing was essential to free pages. Even if you have a good read_ratio, but if the variable shows a large number, then you need to increase the buffer pool size to make sure we do not need to free pages at the peak times.

Continuous monitoring is essential to perfectly use Innodb Buffer Pool. It is an extremely essential part of MySQL-based software and should be used wisely. Proper usage can excessively decrease your server CPU cost, and maximize the overall performance.

[ERROR] We are not able to run systemctl start mysql return code: 5. Fatal error, see /var/log/installLogs.txt for full details – Cyberpanel

If you are trying to install Cyberpanel 2.1.2 on CentOS 7, you are probably seeing your installation fails with the following in the screen:

[ERROR] We are not able to run systemctl start mysql return code: 5.  Fatal error, see /var/log/installLogs.txt for full details.

This is appearing because Cyberpanel installation script located here:

/usr/local/cyberpanel/install/installCyberPanel.py

wants to restart MariaDB 10.4 with the following:

systemctl start mysql

Unfortunately, this was only possible until 10.3 or below. Since 10.4, Mariadb uses the following command:

systemctl start mariadb

Cyberpanel uses that command if you are using CentOS 8 or Ubuntu. But it looks like CentOS 7 now installs MariaDB 10.4 as default unlike 10.3 as default previously.

Even if you edit this command, Cyberpanel will reclone the script, hence it is not possible to change this from here. The nasty trick would be to load a mysql.service file before starting the cyberpanel installation. Here is what you would need to do to do this.

First reload your VPS with CentOS 7
Run yum update and install nano

yum update -y
yum install nano -y

3. Now, load the following file with the text given below:

nano /usr/lib/systemd/system/mysql.service

Load the file with the following text

# It's not recommended to modify this file in-place, because it will be
# overwritten during package upgrades.  If you want to customize, the
# best way is to create a file "/etc/systemd/system/mariadb.service",
# containing
#       .include /usr/lib/systemd/system/mariadb.service
#       ...make your changes here...
# or create a file "/etc/systemd/system/mariadb.service.d/foo.conf",
# which doesn't need to include ".include" call and which will be parsed
# after the file mariadb.service itself is parsed.
#
# For more info about custom unit files, see systemd.unit(5) or
# https://mariadb.com/kb/en/mariadb/systemd/
#
# Copyright notice:
#
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.

[Unit]
Description=MariaDB 10.4.21 database server
Documentation=man:mysqld(8)
Documentation=https://mariadb.com/kb/en/library/systemd/
After=network.target

[Install]
WantedBy=multi-user.target
Alias=mysql.service
Alias=mysqld.service


[Service]

##############################################################################
## Core requirements
##

Type=notify

# Setting this to true can break replication and the Type=notify settings
# See also bind-address mysqld option.
PrivateNetwork=false

##############################################################################
## Package maintainers
##

User=mysql
Group=mysql

# CAP_IPC_LOCK To allow memlock to be used as non-root user
# CAP_DAC_OVERRIDE To allow auth_pam_tool (which is SUID root) to read /etc/shadow when it's chmod 0
#   does nothing for non-root, not needed if /etc/shadow is u+r
# CAP_AUDIT_WRITE auth_pam_tool needs it on Debian for whatever reason
CapabilityBoundingSet=CAP_IPC_LOCK CAP_DAC_OVERRIDE CAP_AUDIT_WRITE

# PrivateDevices=true implies NoNewPrivileges=true and
# SUID auth_pam_tool suddenly doesn't do setuid anymore
PrivateDevices=false

# Prevent writes to /usr, /boot, and /etc
ProtectSystem=full

# Doesn't yet work properly with SELinux enabled
# NoNewPrivileges=true

# Prevent accessing /home, /root and /run/user
ProtectHome=true

# Execute pre and post scripts as root, otherwise it does it as User=
PermissionsStartOnly=true



# Perform automatic wsrep recovery. When server is started without wsrep,
# galera_recovery simply returns an empty string. In any case, however,
# the script is not expected to return with a non-zero status.
# It is always safe to unset _WSREP_START_POSITION environment variable.
# Do not panic if galera_recovery script is not available. (MDEV-10538)
ExecStartPre=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"
ExecStartPre=/bin/sh -c "[ ! -e /usr/bin/galera_recovery ] && VAR= || \
 VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] \
 && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1"

# Needed to create system tables etc.
# ExecStartPre=/usr/bin/mysql_install_db -u mysql

# Start main service
# MYSQLD_OPTS here is for users to set in /etc/systemd/system/mariadb.service.d/MY_SPECIAL.conf
# Use the [Service] section and Environment="MYSQLD_OPTS=...".
# This isn't a replacement for my.cnf.
# _WSREP_NEW_CLUSTER is for the exclusive use of the script galera_new_cluster
ExecStart=/usr/sbin/mysqld $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION

# Unset _WSREP_START_POSITION environment variable.
ExecStartPost=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"



KillSignal=SIGTERM

# Don't want to see an automated SIGKILL ever
SendSIGKILL=no

# Restart crashed server only, on-failure would also restart, for example, when
# my.cnf contains unknown option
Restart=on-abort
RestartSec=5s

UMask=007

##############################################################################
## USERs can override
##
##
## by creating a file in /etc/systemd/system/mariadb.service.d/MY_SPECIAL.conf
## and adding/setting the following under [Service] will override this file's
## settings.

# Useful options not previously available in [mysqld_safe]

# Kernels like killing mysqld when out of memory because its big.
# Lets temper that preference a little.
# OOMScoreAdjust=-600

# Explicitly start with high IO priority
# BlockIOWeight=1000

# If you don't use the /tmp directory for SELECT ... OUTFILE and
# LOAD DATA INFILE you can enable PrivateTmp=true for a little more security.
PrivateTmp=false

# Set an explicit Start and Stop timeout of 900 seconds (15 minutes!)
# this is the same value as used in SysV init scripts in the past
# Galera might need a longer timeout, check the KB if you want to change this:
# https://mariadb.com/kb/en/library/systemd/#configuring-the-systemd-service-timeout
TimeoutStartSec=900
TimeoutStopSec=900

##
## Options previously available to be set via [mysqld_safe]
## that now needs to be set by systemd config files as mysqld_safe
## isn't executed.
##

# Number of files limit. previously [mysqld_safe] open-files-limit
LimitNOFILE=32768

# Maximium core size. previously [mysqld_safe] core-file-size
# LimitCore=

# Nice priority. previously [mysqld_safe] nice
# Nice=-5

# Timezone. previously [mysqld_safe] timezone
# Environment="TZ=UTC"

# Library substitutions. previously [mysqld_safe] malloc-lib with explicit paths
# (in LD_LIBRARY_PATH) and library name (in LD_PRELOAD).
# Environment="LD_LIBRARY_PATH=/path1 /path2" "LD_PRELOAD=

# Flush caches. previously [mysqld_safe] flush-caches=1
# ExecStartPre=sync
# ExecStartPre=sysctl -q -w vm.drop_caches=3

# numa-interleave=1 equalivant
# Change ExecStart=numactl --interleave=all /usr/sbin/mysqld......

# crash-script equalivent
# FailureAction=

Now, save the file, and start your cyberpanel installation as it is with the following:

sh <(curl https://cyberpanel.net/install.sh || wget -O - https://cyberpanel.net/install.sh)

Hope it solves the problem. I am hoping, soon enough Cyberpanel would be able to address the issue the way they solved this for CentOS 8 and Ubuntu.

How to aggregate column with non-aggregable value in an Array using Group in PostgreSQL

Let’s imagine, scenario is like the following:

We have three columns with the following values in a table named ‘sale_line’

----------------------------------------
product_id	qty_done	sale_id
----------------------------------------
1010		5.0		101
1010		1.0		102
1010		2.0		103
1012		1.0		104
1012		4.0		105
1012		4.0		106
1012		9.0		107

We have a product_id column that contains which product we are referring to, a qty_done column, which contains the number of products sold, and sale_id refers to the sale order where the product_id was dispatched. Now the example table contains how many of each product dispatched in different sale orders. We can write a query, to group the product_id and the total number of products sold using aggregation

select product_id, sum(qty_done) as total_sold from sale_line group by product_id;

The producing result would be the following:

---------------------------
product_id	total_sold
---------------------------
1010		8.0		
1012		17.0

We can see, how group by can give you the benefit of aggregation. But have you noticed, we have a column called ‘sale_id’ that doesn’t go with any aggregation function like ‘count’, ‘sum’ or ‘avg’. Can we use them for any purpose?

Sometimes, you may be interested to track down the orders responsible for a set of products using group by when each products will contain the list of orders responsible for causing the total sell. Can we accomplish that in one query?

We actually can. We can concat distinct ids and put them in a resulting array instead of our known aggregation function like ‘count’ or ‘sum’. Postgres provides a array aggregation function called ‘array_agg’, it can be used to produce such result. Have a look at the following query:

select product_id, sum(qty_done) as total_sold, array_agg(sale_id) as sale_ids from sale_line group by product_id;

The producing result would be the following:

-----------------------------------------
product_id	total_sold	sale_ids
-----------------------------------------
1010		8.0		[101, 102, 103]
1012		17.0		[104, 105, 106, 107]

As you can see, we have made postgres return us an array that contains the sale_ids for consisting total_sold even though, sale_id column wasn’t conventionally aggregable or let’s say different than the usual numeric pattern.

This technique can have many use cases. For example if you have a different model/table for managing sale orders, purchase orders, pickings and invoices while, your sale orders contains the price information, and the pickings doesn’t, but picking contains the actual number of products got dispatched, then you may first aggregate the total_sold from the picking table, then produce an average price using the sale_ids you have produced from the picking table using group by and array aggregation, then merge them. This seems quite complicated, I understand, but again, it’s all about putting a break to your reporting time, and manage scalability of your application, putting a constant cost based reporting algorithm using single sql query, even if you process 10 times more orders in future.

How to get purchase_id from stock_picking table in Odoo?

stock_picking is one of the core model for Odoo if you using Odoo for ‘Inventory’. Now, if you use ‘Sales’ module, then stock_picking would be extended with a new field for model and column for database table, naming ‘sale_id’. This can be used to detect if the picking is originated from a sale order or not. But if you install ‘Purchase’ module, then stock_picking model is extended with ‘purchase_id’ like the ‘sale_id’ for purchases, but the database isn’t expanded with a column like ‘Sale’ module.

What does this mean?

This means, if you use Odoo ORM, only then, you may use purchase_id of a stock_picking. An example could be like the following. Let’s say, we would like to pick the pickings that originated from purchase orders, aka, GRN, we could use something like this:

purchase_pickings = self.env['stock.picking'].search([('purchase_id', '!=', False)])

This works, only if you are not trying to make a report from a huge lot of pickings, purchase orders and sale orders, when you want to use SQL statement to produce efficient joins and generate the report quickly.

Let me demonstrate what I meant

We know, stock_picking has a field called sale_id and also this also belongs to the database column as well. Hence, to get all the pickings belongs to sale order, we may first use the ORM:

sale_pickings = self.env['stock.picking'].search([('sale_id', '!=', False)])

or a direct PostGRE SQL

query = """select * from stock_picking where sale_id is not null"""
self.env.cr.execute(query)
result = self.env.cr.fetchall()

Now, the second example is not only faster, but also, it allows you to extend the facility further to use joins or select specific field of a table result, which is only possible using ‘read’ Odoo ORM method, again, domain specification is not permissible like it is available in ‘search’.

We are able to do things like the following with the sql:

query = """select sale_order.name, stock_picking.name from stock_picking left join on stock_picking.sale_id = sale_order.id where stock_picking.sale_id is not null"""
self.env.cr.execute(query)
result = self.env.cr.fetchall()

This would give you a result of each sale order with it’s picking name. To produce a result like the above using ORM is costly as it follows ‘N+1’ algorithm, hence inefficient in making reports or scaling the software.

Now, we understand, we are able to use such field and make the reports efficient using SQL as sale_id is distinctively available in the database. But what if you want to check how the product has been purchased, and then sold? Then, we also need purchase_order model to connect to our above query, right? But unfortunately, as ‘Purchase’ module doesn’t add a column purchase_id, we are unable to use this directly.

So, how can we still use purchase_id in the SQL Query to generate report in Odoo?

First, we need to see, how purchase_id is added in Odoo.

purchase_id is added in stock_picking model in the ‘purchase_stock’ module. If you open the following file:

purchase_stock/models/stock.py

you may see, how purchase_id is defined as related Many2one field:

class StockPicking(models.Model):
    _inherit = 'stock.picking'

    purchase_id = fields.Many2one('purchase.order', related='move_lines.purchase_line_id.order_id',
        string="Purchase Orders", readonly=True)

A related field in Odoo, is like a pointer, a syntactic sugar of foreign key for less used fields. If the field is highly used, this might cause performance issue, as Odoo has to do multiple lookups unlike direct lookup for a related field. Now, get to the point, purchase_id is related to ‘move_lines.purchase_line_id.order_id’. This is a long relation. Let me go one by one:

move_lines : stock_picking has an One2many relation with stock.move model, that derives the available moves for the picking.
purchase_line_id: Each move line derived from a purchase order line, and while doing so, it keeps the ID of the purchase order line in a foreign key of stock.move model, namely purchase_line_id.
order_id: Each purchase_order_line has a foreign key with the purchase.order model kept in order_id field.

Now, we know, how the purchase_id derives the purchase_order id using the following relation:

Picking > Moves > Purchase Order Line > Purchase Order

Now we can use the following kind of relation for detecting purchase order from stock picking:

select purchase_order.name, stock_picking.name from stock_picking left join stock_move on stock_move.picking_id = stock_picking.id left join purchase_order_line on purchase_order_line.id = stock_move.purchase_line_id left join purchase_order on purchase_order.id = purchase_order_line.order_id where stock_move.purchase_line_id is not null group by stock_picking.name, purchase_order.name

Here, we are able to get the picking and purchase in relation with one query. This concept can be used to derive many data, like, let’s say, you would like to see, how many of your products are purchased, then, sold and returned, all can be done in few queries, without having N+1 problem.