December 9, 2023
Summary: In this article, we are going to see how the SQL LATERAL JOIN works, and how we can use it to cross-reference rows from a subquery with rows in the outer table and build compound result sets.
Table of Contents
Database table
Let’s assume we have the following blog
database table storing the blogs hosted by our platform:
And, we have two blogs currently hosted:
| id | created_on | title | url |
|----|------------|---------------------|------------------------------------|
| 1 | 2013-09-30 | PostgreSQL Blog | https://www.rockdata.net/blog/ |
| 2 | 2017-01-22 | PostgreSQL Tutorial | https://www.rockdata.net/tutorial/ |
Getting the report without using the SQL LATERAL JOIN
We need to build a report that extracts the following data from the blog
table:
- the blog id
- the blog age, in years
- the date for the next blog anniversary
- the number of days remaining until the next anniversary.
Calculating the blog age using date interval functions
The blog age needs to be calculated by subtracting the blog creation date from the current date.
The date of the next blog anniversary can be calculated by incrementing the age in years and adding it to the blog creation date.
The number of days until the next anniversary can be calculated by extracting the number of days from the interval given by the next blog anniversary and the current date.
To get the result, you can use the following query:
SELECT
b.id as blog_id,
extract(
YEAR FROM age(now(), b.created_on)
) AS age_in_years,
date(
created_on + (
extract(YEAR FROM age(now(), b.created_on)) + 1
) * interval '1 year'
) AS next_anniversary,
date(
created_on + (
extract(YEAR FROM age(now(), b.created_on)) + 1
) * interval '1 year'
) - date(now()) AS days_to_next_anniversary
FROM blog b
ORDER BY blog_id
And, you are going to get the expected result:
| blog_id | age_in_years | next_anniversary | days_to_next_anniversary |
|---------|--------------|------------------|--------------------------|
| 1 | 7 | 2021-09-30 | 295 |
| 2 | 3 | 2021-01-22 | 44 |
As you can see, the age_in_years
has to be defined three times because you need it when calculating the next_anniversary
and days_to_next_anniversary
values.
And, that’s exactly where LATERAL JOIN can help us.
Getting the report using the SQL LATERAL JOIN
LATERAL JOIN allows us to reuse the age_in_years
value and just pass it further when calculating the next_anniversary
and days_to_next_anniversary
values.
For instance, the previous SQL query can be rewritten like this:
SELECT
b.id as blog_id,
age_in_years,
date(
created_on + (age_in_years + 1) * interval '1 year'
) AS next_anniversary,
date(
created_on + (age_in_years + 1) * interval '1 year'
) - date(now()) AS days_to_next_anniversary
FROM blog b
CROSS JOIN LATERAL (
SELECT
cast(
extract(YEAR FROM age(now(), b.created_on)) AS int
) AS age_in_years
) AS t
ORDER BY blog_id
And, the age_in_years
value can be calculated once and reused for the next_anniversary
and days_to_next_anniversary
computations:
| blog_id | age_in_years | next_anniversary | days_to_next_anniversary |
|---------|--------------|------------------|--------------------------|
| 1 | 7 | 2021-09-30 | 295 |
| 2 | 3 | 2021-01-22 | 44 |
Much better, right?
The age_in_years
is calculated for every record of the blog
table. So, it works like a correlated subquery, but the subquery records are joined with the primary table and, for this reason, we can reference the columns produced by the subquery.