How to aggregate column with non-aggregable value in an Array using Group in PostgreSQL

Let’s imagine, scenario is like the following:

We have three columns with the following values in a table named ‘sale_line’

----------------------------------------
product_id	qty_done	sale_id
----------------------------------------
1010		5.0		101
1010		1.0		102
1010		2.0		103
1012		1.0		104
1012		4.0		105
1012		4.0		106
1012		9.0		107

We have a product_id column that contains which product we are referring to, a qty_done column, which contains the number of products sold, and sale_id refers to the sale order where the product_id was dispatched. Now the example table contains how many of each product dispatched in different sale orders. We can write a query, to group the product_id and the total number of products sold using aggregation

select product_id, sum(qty_done) as total_sold from sale_line group by product_id;

The producing result would be the following:

---------------------------
product_id	total_sold
---------------------------
1010		8.0		
1012		17.0	

We can see, how group by can give you the benefit of aggregation. But have you noticed, we have a column called ‘sale_id’ that doesn’t go with any aggregation function like ‘count’, ‘sum’ or ‘avg’. Can we use them for any purpose?

Sometimes, you may be interested to track down the orders responsible for a set of products using group by when each products will contain the list of orders responsible for causing the total sell. Can we accomplish that in one query?

We actually can. We can concat distinct ids and put them in a resulting array instead of our known aggregation function like ‘count’ or ‘sum’. Postgres provides a array aggregation function called ‘array_agg’, it can be used to produce such result. Have a look at the following query:

select product_id, sum(qty_done) as total_sold, array_agg(sale_id) as sale_ids from sale_line group by product_id;

The producing result would be the following:

-----------------------------------------
product_id	total_sold	sale_ids
-----------------------------------------
1010		8.0		[101, 102, 103]
1012		17.0		[104, 105, 106, 107]

As you can see, we have made postgres return us an array that contains the sale_ids for consisting total_sold even though, sale_id column wasn’t conventionally aggregable or let’s say different than the usual numeric pattern.

This technique can have many use cases. For example if you have a different model/table for managing sale orders, purchase orders, pickings and invoices while, your sale orders contains the price information, and the pickings doesn’t, but picking contains the actual number of products got dispatched, then you may first aggregate the total_sold from the picking table, then produce an average price using the sale_ids you have produced from the picking table using group by and array aggregation, then merge them. This seems quite complicated, I understand, but again, it’s all about putting a break to your reporting time, and manage scalability of your application, putting a constant cost based reporting algorithm using single sql query, even if you process 10 times more orders in future.

How to Display Select Result Vertically in Postgres psql Console

Mysql has a pretty cool feature with \G, that sets the mysql query result in vertical mode. If you have a table with a lot of column, you will face difficulty to display them in horizontal mode. Hence, you want to see the result vertically.

For Postgres, this is called ‘expanded display’. To set the expanded display on or off, you need to use the following command in psql console:

my_prod=# \x
Expanded display is on.
my_prod=# \x
Expanded display is off.
my_prod=#

If your expanded display is on, then \x will set this to off, else it would set the display to on. If you set the expanded display to on, this would show the column and value in a vertical key value pair style display, which is much easier to understand.

How to Backup PostgreSQL Database from Command Line

Question

How to Backup PostgreSQL Database from Command Line

Solution

Postgresql comes with two utilities called ‘pg_dump and ‘pgsql’. These can be used to backup and restore database.

To backup postgresql database using command line, you can use this:

pg_dump database_name > database_name.sql

This would take the backup and store it in .sql file.

To restore postgresql (.sql) file to a database using command line, you can use this:

pgsql database_name < database_name.sql