How to aggregate column with non-aggregable value in an Array using Group in PostgreSQL

Let’s imagine, scenario is like the following:

We have three columns with the following values in a table named ‘sale_line’

----------------------------------------
product_id	qty_done	sale_id
----------------------------------------
1010		5.0		101
1010		1.0		102
1010		2.0		103
1012		1.0		104
1012		4.0		105
1012		4.0		106
1012		9.0		107

We have a product_id column that contains which product we are referring to, a qty_done column, which contains the number of products sold, and sale_id refers to the sale order where the product_id was dispatched. Now the example table contains how many of each product dispatched in different sale orders. We can write a query, to group the product_id and the total number of products sold using aggregation

select product_id, sum(qty_done) as total_sold from sale_line group by product_id;

The producing result would be the following:

---------------------------
product_id	total_sold
---------------------------
1010		8.0		
1012		17.0	

We can see, how group by can give you the benefit of aggregation. But have you noticed, we have a column called ‘sale_id’ that doesn’t go with any aggregation function like ‘count’, ‘sum’ or ‘avg’. Can we use them for any purpose?

Sometimes, you may be interested to track down the orders responsible for a set of products using group by when each products will contain the list of orders responsible for causing the total sell. Can we accomplish that in one query?

We actually can. We can concat distinct ids and put them in a resulting array instead of our known aggregation function like ‘count’ or ‘sum’. Postgres provides a array aggregation function called ‘array_agg’, it can be used to produce such result. Have a look at the following query:

select product_id, sum(qty_done) as total_sold, array_agg(sale_id) as sale_ids from sale_line group by product_id;

The producing result would be the following:

-----------------------------------------
product_id	total_sold	sale_ids
-----------------------------------------
1010		8.0		[101, 102, 103]
1012		17.0		[104, 105, 106, 107]

As you can see, we have made postgres return us an array that contains the sale_ids for consisting total_sold even though, sale_id column wasn’t conventionally aggregable or let’s say different than the usual numeric pattern.

This technique can have many use cases. For example if you have a different model/table for managing sale orders, purchase orders, pickings and invoices while, your sale orders contains the price information, and the pickings doesn’t, but picking contains the actual number of products got dispatched, then you may first aggregate the total_sold from the picking table, then produce an average price using the sale_ids you have produced from the picking table using group by and array aggregation, then merge them. This seems quite complicated, I understand, but again, it’s all about putting a break to your reporting time, and manage scalability of your application, putting a constant cost based reporting algorithm using single sql query, even if you process 10 times more orders in future.

How to get purchase_id from stock_picking table in Odoo?

stock_picking is one of the core model for Odoo if you using Odoo for ‘Inventory’. Now, if you use ‘Sales’ module, then stock_picking would be extended with a new field for model and column for database table, naming ‘sale_id’. This can be used to detect if the picking is originated from a sale order or not. But if you install ‘Purchase’ module, then stock_picking model is extended with ‘purchase_id’ like the ‘sale_id’ for purchases, but the database isn’t expanded with a column like ‘Sale’ module.

What does this mean?

This means, if you use Odoo ORM, only then, you may use purchase_id of a stock_picking. An example could be like the following. Let’s say, we would like to pick the pickings that originated from purchase orders, aka, GRN, we could use something like this:

purchase_pickings = self.env['stock.picking'].search([('purchase_id', '!=', False)])

This works, only if you are not trying to make a report from a huge lot of pickings, purchase orders and sale orders, when you want to use SQL statement to produce efficient joins and generate the report quickly.

Let me demonstrate what I meant

We know, stock_picking has a field called sale_id and also this also belongs to the database column as well. Hence, to get all the pickings belongs to sale order, we may first use the ORM:

sale_pickings = self.env['stock.picking'].search([('sale_id', '!=', False)])

or a direct PostGRE SQL

query = """select * from stock_picking where sale_id is not null"""
self.env.cr.execute(query)
result = self.env.cr.fetchall()

Now, the second example is not only faster, but also, it allows you to extend the facility further to use joins or select specific field of a table result, which is only possible using ‘read’ Odoo ORM method, again, domain specification is not permissible like it is available in ‘search’.

We are able to do things like the following with the sql:

query = """select sale_order.name, stock_picking.name from stock_picking left join on stock_picking.sale_id = sale_order.id where stock_picking.sale_id is not null"""
self.env.cr.execute(query)
result = self.env.cr.fetchall()

This would give you a result of each sale order with it’s picking name. To produce a result like the above using ORM is costly as it follows ‘N+1’ algorithm, hence inefficient in making reports or scaling the software.

Now, we understand, we are able to use such field and make the reports efficient using SQL as sale_id is distinctively available in the database. But what if you want to check how the product has been purchased, and then sold? Then, we also need purchase_order model to connect to our above query, right? But unfortunately, as ‘Purchase’ module doesn’t add a column purchase_id, we are unable to use this directly.

So, how can we still use purchase_id in the SQL Query to generate report in Odoo?

First, we need to see, how purchase_id is added in Odoo.

purchase_id is added in stock_picking model in the ‘purchase_stock’ module. If you open the following file:

purchase_stock/models/stock.py

you may see, how purchase_id is defined as related Many2one field:

class StockPicking(models.Model):
    _inherit = 'stock.picking'

    purchase_id = fields.Many2one('purchase.order', related='move_lines.purchase_line_id.order_id',
        string="Purchase Orders", readonly=True)

A related field in Odoo, is like a pointer, a syntactic sugar of foreign key for less used fields. If the field is highly used, this might cause performance issue, as Odoo has to do multiple lookups unlike direct lookup for a related field. Now, get to the point, purchase_id is related to ‘move_lines.purchase_line_id.order_id’. This is a long relation. Let me go one by one:

  1. move_lines : stock_picking has an One2many relation with stock.move model, that derives the available moves for the picking.
  2. purchase_line_id: Each move line derived from a purchase order line, and while doing so, it keeps the ID of the purchase order line in a foreign key of stock.move model, namely purchase_line_id.
  3. order_id: Each purchase_order_line has a foreign key with the purchase.order model kept in order_id field.

Now, we know, how the purchase_id derives the purchase_order id using the following relation:

Picking > Moves > Purchase Order Line > Purchase Order

Now we can use the following kind of relation for detecting purchase order from stock picking:

select purchase_order.name, stock_picking.name from stock_picking left join stock_move on stock_move.picking_id = stock_picking.id left join purchase_order_line on purchase_order_line.id = stock_move.purchase_line_id left join purchase_order on purchase_order.id = purchase_order_line.order_id where stock_move.purchase_line_id is not null group by stock_picking.name, purchase_order.name

Here, we are able to get the picking and purchase in relation with one query. This concept can be used to derive many data, like, let’s say, you would like to see, how many of your products are purchased, then, sold and returned, all can be done in few queries, without having N+1 problem.