Home > Software design >  Trying to join 2 tables based on the most recent timestamp *before* a specific date
Trying to join 2 tables based on the most recent timestamp *before* a specific date

Time:01-09

Here is my SQL statement, which seemed to function perfectly well before we created a new database. This approach seems to work just fine on another, similarly structured, pair of tables.

SELECT *
FROM tele2_details AS d
    INNER JOIN 
    tele2_usage AS u
    ON    
        d.iccid = u.iccid
    AND
        u.timestamp = (
        SELECT MAX(u.timestamp)
        FROM tele2_usage 
        WHERE u.timestamp <= DATE_FORMAT('2022-01-08 09:30:00','%Y-%m-%d %H:%i:%s') 
      )
WHERE
    accountCustom1='Horizon'

What I'm attempting to do here is join the details table with the usage table, the usage rows just contains the iccid of a sim card, a timestamp, and their current usage in bytes. What this should do is find the most recent usage record before the specified date (2022-01-08 09:30:00). This should give me a set of sims, each joined with it's most recent usage record before the specified time, however I usually get zero results on this particular combo of tables.

Specifically though, it does match any records where the date is the exact same as specified in the query, but not dates that are before or equal to the specified date. Can anybody help me with where I'm going wrong. This query worked fine in a previous database, we're rebuilding our systems and this has now appeared as an issue.

Thanks in advance for any help.


Edit Here's some more information, that I hope will make the question make more sense. So here is an outline of the details table, I've removed some of the columns but this is at least illustrative of table.

CREATE TABLE `tele2_details` (
    `iccid` VARCHAR(255) NOT NULL,
    `msisdn` VARCHAR(255) NOT NULL,
    `status` VARCHAR(255) NOT NULL,
    `ratePlan` VARCHAR(255) NOT NULL,
    `communicationPlan` VARCHAR(255) NOT NULL,
    PRIMARY KEY (`iccid`)
)
COLLATE='utf8mb4_0900_ai_ci'
ENGINE=InnoDB
;

Then we also have a usages table, which stores sample of the data usage of sim cards, along with timestamps...

CREATE TABLE `tele2_usage` (
    `id` INT NOT NULL AUTO_INCREMENT,
    `iccid` VARCHAR(255) NOT NULL COLLATE 'latin1_swedish_ci',
    `msisdn` VARCHAR(255) NOT NULL COLLATE 'latin1_swedish_ci',
    `timestamp` DATETIME NOT NULL,
    `ctd_data_usage` BIGINT NOT NULL,
    `ctd_sms_usage` BIGINT NOT NULL,
    `ctd_voice_usage` BIGINT NOT NULL,
    `session_count` INT NOT NULL,
    PRIMARY KEY (`id`)
)
COLLATE='utf8mb4_hr_0900_ai_ci'
ENGINE=InnoDB
AUTO_INCREMENT=10116319
;

The query I'm trying to create should return a set of sim details, joined with a usage record which is closest to, but not after, a particular time.

So if you look at the original query at the top, I'm trying to join the details onto the usage record which is **closest to, but not after 2022-01-08 09:30:00 **

I hope that makes sense.

Let's have a look at a particular sim

SELECT iccid, msisdn, status, ratePlan, communicationPlan FROM tele2_details WHERE iccid='xxxx203605100034xxxx'

results in 1 match

"xxxx203605100034xxxx" "xxxx9120012xxxx" "ACTIVATED" "Pay as use - Existing Business" "Data LTE"

And if I look in the usage table for that same sim I can see many records that should satisfy my conditions

SELECT id, iccid, TIMESTAMP, ctd_data_usage FROM tele2_usage WHERE iccid='xxxx203605100034xxxx' AND TIMESTAMP <= '2022-01-08 09:30:00'

results in

"10096279"  "xxxx203605100034xxxx"  "2022-01-08 09:01:00"   "77517560"
"10092271"  "xxxx203605100034xxxx"  "2022-01-08 08:01:03"   "77002733"
"10088263"  "xxxx203605100034xxxx"  "2022-01-08 07:01:11"   "76270445"
"10084255"  "xxxx203605100034xxxx"  "2022-01-08 06:01:05"   "76270445"

of which I would like to select the first record (with the 09:01 timestamp) for joining on to the details record. I can get that timestamp with the following query

SELECT MAX(timestamp)
FROM tele2_usage 
WHERE TIMESTAMP <= DATE_FORMAT('2022-01-08 09:30:00','%Y-%m-%d %H:%i:%s') AND iccid='xxxx203605100034xxxx'

which results in '2022-01-08 09:01:00', which is exactly what I want. So now I put it all together...

SELECT *
FROM tele2_details AS d
    INNER JOIN 
    tele2_usage AS u
    ON    
      d.iccid = u.iccid
    AND
      u.timestamp = (
        SELECT MAX(timestamp)
        FROM tele2_usage 
        WHERE timestamp <= DATE_FORMAT('2022-01-08 09:30:00','%Y-%m-%d %H:%i:%s') 
      )
WHERE
    accountCustom1='Horizon' AND iccid='xxxx203605100034xxxx'

And i get nothing! I would expect to get back the details record joined with that particular sim, but actually I get zero results and I don't understand why.

Ideally I would remove the final AND for the iccid and I would expect to recieve a set of all the sims for that client with the usage from closest to but not after the specified date.

So with that explanation, does anyone know why I'm not getting any records? I have a similar table for another sim provider that structured exactly the same, with a details table and a usage table and this query works just fine on that table. I simply can't understand why this doesn't work.


Edit 2 @Serg suggested trying to alias the subquery (I think that's what it's called) which resulted in the following code...

SELECT *
FROM tele2_details AS d
    INNER JOIN 
    tele2_usage AS u
    ON    
      d.iccid = u.iccid
    AND
      u.timestamp = (
        SELECT MAX(u2.timestamp)
        FROM tele2_usage AS u2
        WHERE u2.timestamp <= DATE_FORMAT('2022-01-08 09:30:00','%Y-%m-%d %H:%i:%s') 
      )
WHERE
    accountCustom1='Horizon'

Unfortunately this still resulted in zero results.

CodePudding user response:

You should correlate the subquery:

SELECT *
FROM tele2_details AS d INNER JOIN tele2_usage AS u
ON d.iccid = u.iccid
AND u.timestamp = (
        SELECT MAX(u2.timestamp)
        FROM tele2_usage u2 
        WHERE d.iccid = u2.iccid AND u2.timestamp <= DATE_FORMAT('2022-01-08 09:30:00','%Y-%m-%d %H:%i:%s') 
      )
WHERE d.accountCustom1='Horizon';

Or, with a join to an aggregation query:

SELECT *
FROM tele2_details AS d 
INNER JOIN tele2_usage AS u ON d.iccid = u.iccid
INNER JOIN (
  SELECT iccid, MAX(timestamp) timestamp
  FROM tele2_usage 
  WHERE timestamp <= DATE_FORMAT('2022-01-08 09:30:00','%Y-%m-%d %H:%i:%s') 
  GROUP BY iccid
) m ON m.iccid = u.iccid AND m.timestamp = u.timestamp
WHERE d.accountCustom1='Horizon';
  •  Tags:  
  • Related