I often find myself working with legacy database tables that don’t have primary keys or auto increment keys. In addition to this, I always find myself needing to get the latest record for a particular set of data in my database. It is never as easy as just calling a max
in the where clause. A subquery is often needed to get the most recent record in a table.
For this example, lets use the scenario where we have a table that contains the access logs of every user. The table has no primary key and no auto increment value. The table contains the username, access timestamp, and system that each user has logged into over the course of forever.
Such a table can be seen below:
Username Varchar(50) | Activity_Date Date | System Varchar(255) |
---|---|---|
nraboy | 2014-07-03 | Unix |
mcampos | 2014-06-02 | Windows |
mcampos | 2014-07-26 | Unix |
nraboy | 2014-07-25 | Unix |
To further this example, the database is Oracle. However, this tutorial is generic enough to work with most SQL languages.
select *
from tbl_access_logs tal1
where tal1.username = 'nraboy'
and tal1.activity_date = (select max(tal2.activity_date) from tbl_access_logs tal2 where tal2.username = tal1.username);
The above SQL query will return the most recent access record for the nraboy user.
Now why can’t you just do the following:
select *
from tbl_access_logs tal1
where tal1.username = 'nraboy'
and tal1.activity_date = max(tal1.activity_date);
This will translate to where tal1.username = ‘nraboy’ and tal1.activity_date = 2014-07-26
which will bring no results since the last sign in for nraboy
was 2014-07-25
. A subquery is necessary to make this happen correctly because you want to make sure the max is for the particular user, not the entire table.