Home > Blockchain >  how often the user change his device, query sql
how often the user change his device, query sql

Time:02-01

I want to know "On average, how often the user change his devices by month?"

This is the table_1 in SQL, AWS Athenas

id          date    month   device      carrier
-----------------------------------------------
12-ac8147   2019       4    Sgx1         alca
12-ac8147   2019       5    Sgx1         alca
12-ac8147   2019       5    Sgx1         alca
12-ac8147   2019       6    Sgx1         alca
12-ac8147   2019       6    Sgx1         alca
12-ac8147   2019       6    iPhone8,1    telc
12-ac8147   2019       9    iPhone8,1    telc
12-ac8147   2019       9    iPhone8,1    telc
12-ac8147   2020       1    iPhone8,1    telc
12-ac8147   2020       10   iPhone11,2   telc
12-ac8147   2020       12   iPhone11,2   telc
12-ac8147   2020       12   iPhone11,2   service_m
12-ac8147   2020       12   iPhone11,2   service_m
12-ac8147   2021       1    iPhone11,2   service_m
12-ac8147   2021       12   iPhone11,5   service_m
12-ac8147   2021       12   iPhone11,5   movil_tel   

So my idea was (in SQL athenas AWS amazon)

with table as
(
    id, device, max(year * 100   month) mas_n, min(year * 100   month) min_n)
    from table_1
    where device is not null and device <> ''
    group by id, device
)
SELECT id, AVG(max_n - min_n) prom_dif
FROM table
GROUP BY id

But this doesn't work, because it counts all the months:

    id              prom_dif
12-ac8147            206 

CodePudding user response:

First detect changes. You can do this with LAG t compare with the previous row. Then count changes per user and month. At last get the average per user.

select id, avg(cnt * 1.0) avg_changes
from
(
  select id, month, device, count(*) as cnt
  from
  (
    select 
      id, year, month,
      device,
      lag(device) over (partiton by id order by year, month) as prev_device
    from table_1
  ) with_previous_month
  where device <> prev_device
  group by id, year, month
) monthly_changes
group by id
order by id;

This takes into consideration that a user can switch from device A to B, then back to A, then back to B for instance, so we must count three changes, although it's only two different devices.

If months are missing from the table, the query is missing them, too. This means, if you only have January and December and one change, then the query will show you 0.5 (1 change in two months) instead of 0.083 (one change in twelve months) or 0,091 (one change in eleven months, in case you don't want to include both January and December, but only one of them). If you want to amend for this, you must generate all months first and outer join your data.

(ID is a bad name here, by the way, because despite its name it does not uniquely identify a row in the table, but a user it seems. So better call this user_num, or - if a user table exists - user_id.)

CodePudding user response:

select id, 
       count(distinct device) * 1.0 / count(distinct year * 100   month) as prom_dif
from table_1
where device is not null and device <> ''
group by id
  •  Tags:  
  • Related