"Get the first row of each day" SQL query-CodePudding

I am the end-user of a SQL Server DB with multiple lines ordered by date. Lets take this DB as an example:

Amount	Date
23.5	20210512010220111
24	20210512020220111
30	20210512030220111
1.2	20210513011020111
1000	20210513020220111
24	20210514100220111
240	20210514100220111

Be advised that the date is just a long that represent the date in the format: yyyymmddhhMMssfff.

I am trying to create a SQL query like this: "Get the first row of each day" So for the above example the result will be:

Amount	Date
23.5	20210512010220111
1.2	20210513011020111
24	20210514100220111

I saw this example in multiple sources: https://learnsql.com/cookbook/how-to-select-the-first-row-in-each-group-by-group/

The problem is when I tried it that was way too slow for me the DB is storing hundreds of millions of rows (with 9 columns each)

A couple of weeks ago I used a similar(ish) query for a daily min, avg, max:

SELECT MIN(Amount), AVG(Amount), MAX(Amount)
FROM table
GROUP BY Date/1000000000

the "/1000000000" is for days.

That worked quickly enough, if there is something similar to FIRST(Amount) that would be great. Just to clarify, I am just an end-user, I have no saying over the overall structure of the DB.

Edit: This is the query I tried and was too slow:

WITH added_row_number AS (
  SELECT
    *,
    ROW_NUMBER() OVER(PARTITION BY Date/1000000000 ORDER BY Date ASC) AS row_number
  FROM table
)
SELECT
  *
FROM added_row_number
WHERE row_number = 1;

Thank you.

CodePudding user response：

try something to affect

SELECT t.yyyy_mm_dd_date, table.*
FROM table
JOIN (
  SELECT SUBSTRING (Date, 1, 8) as yyyy_mm_dd_date, MIN(Date) as min_date
  FROM table
) t
ON t.min_date = table.Date

in general i find SQL queries to run fast with joins and aggregations (especially over their indices), so if can translate query to use those believe generally should run fairly fast

CodePudding user response：

Feels like you can use a cte here and cast the date strings as actual DATE values (the last query) here is my sample using test data (you did not post the column types so I guessed)

Not 100% clear about the "date" column if it is an actual datetime, you can just cast as a date

DECLARE @mytable TABLE (
    Amount NUMERIC(10,2) NOT NULL,  
    [Date] VARCHAR(30) NOT NULL
);
INSERT INTO @mytable(Amount,[Date])
VALUES
    (3.5, '20210512010220111'),
    (24.0,  '20210512020220111'),
    (30.0,'20210512030220111'),
    (1.2,   '20210513011020111'),
    (1000.0,    '20210513020220111'),
    (24.0,  '20210514100220111'),
    (240.0, '20210514100220111')
;

SELECT 
    [Amount], 
    MAX(CAST( LEFT([Date], 8)  AS DATE)) AS NewDate
FROM @mytable
GROUP BY AMOUNT 
ORDER BY MAX(CAST( LEFT([Date], 8)  AS DATE)) DESC;


/* this is what we want perhaps: */
;
WITH cte AS (
  SELECT
    Amount,
    CAST(LEFT([Date], 8) AS DATE) AS MyDate,
    ROW_NUMBER() OVER(PARTITION BY CAST(LEFT([Date], 8) AS DATE) ORDER BY CAST(LEFT([Date], 8) AS DATE) DESC) AS row_number
  FROM @mytable
)
SELECT
  *
FROM cte
WHERE row_number = 1;