Home > Software engineering >  Postgresql duplicating values
Postgresql duplicating values

Time:01-06

Good afternoon, I need help to be able to "inactivate" a line, for my result these lines are duplicated, and I only need 1 of them, I have several examples like this, this is only 1 of the cases

    SELECT p.cd_key, r.nm_lane, to_char(r.dt_record_pass, 'yyyyMMdd HH24:MI:SS'),
    r.ds_license_plate, r.vl_speed FROM tbgen_rdo r, tbgen_processor p, tbgen_equipment e 
   WHERE p.id_processor = r.id_processor AND p.id_equipment = e.id_equipment 
   and r.dt_record_pass between '2021-09-27 10:13:00' and '2021-09-27 10:13:01' and 
   r.ds_license_plate ='KRR1000'

resulted:

cd_key nm_lane to_char ds_license_plate vl_speed
1516617158432 A 20210927 10:13:00 KRR1000 33
1516617158432 B 20210927 10:13:00 KRR1000 34

I consider this value doubled because I look only at the ds_license_plate and the date (to_char) vl_speed and nm_lane don't matter in this case, I only need to bring 1 record of this

I always consider the shortest lane, in this case lane A

CodePudding user response:

I always consider the shortest lane, in this case lane A

WITH x as (
 SELECT 
  p.cd_key, 
  r.nm_lane, 
  to_char(r.dt_record_pass, 'yyyyMMdd HH24:MI:SS') as dt,
  r.ds_license_plate, 
  r.vl_speed, 
  ROW_NUMBER() OVER(PARTITION BY to_char(r.dt_record_pass, 'yyyyMMdd HH24:MI:SS'), r.ds_license_plate ORDER BY r.nm_lane) rn 

FROM
  tbgen_rdo r
  INNER JOIN tbgen_processor p ON p.id_processor = r.id_processor 
  INNER JOIN tbgen_equipment e ON p.id_equipment = e.id_equipment 
WHERE  
   r.dt_record_pass between '2021-09-27 10:13:00' and '2021-09-27 10:13:01' and 
   r.ds_license_plate ='KRR1000'
)
SELECT * FROM X WHERE rn = 1

We ask the Database to number the rows according to increasing order of nm_lane, then we filter it down to just the first row (the row with rn = 1). The "partition by" means the row numbering will start over from 1 again if the date or the license plate changes (which it won't for this example because you've fixed it in the where clause, but just in case you search for multiple plates one day..) - this query would this pick the lowest lane row for every unique variation of license plate and date seen in the query

I've also updated the JOIN syntax to 1992 (that's a year) style - about 30 years ago improvements to the spec meant we didn't have to write joins like FROM t1, t2 WHERE t1.x = t2.y any more. It's recommended to write all your joins in this syntax

Note that you said "shortest" for lane and I interpreted that as "soonest in the alphabet", but if you mean there is some column somewhere that literally specifies the lane length in yards, miles etc you should use that column as your ORDER BY

  •  Tags:  
  • Related