Home > Mobile >  How to write a faster sql query to break apart custom dictionary in Postgres column
How to write a faster sql query to break apart custom dictionary in Postgres column

Time:01-27

I'm working with this fun data that has this custom dictionary-ish format in one of the columns.

id parameters
1 x_id x_value; y_id y_value; z_id z_value;
2 y_id y_value2; z_id z_value2;

And looking to get it into this format...

id x_id y_id z_id
1 x_value y_value z_value
2 NULL y_value2 z_value2

I'd prefer for it to be completely dynamic, but I can live with knowing all possible column names in advance if it reduces complexity/improves performance. I can also guarantee this pattern is consistent. No additional nesting of dictionaries, etc.

Not being a sql master, this is the naive implementation I came up with but it seems quite slow. Is there a more performant way to do this?

select 
        string_agg(x_id, ',') as x_id,
        string_agg(y_id, ',') as y_id,
        string_agg(z_id, ',') as z_id
    from (
        select
            t.id,
            case when 'x_id' LIKE kvs.key then kvs.value else null end as x_id,
            case when 'y_id' LIKE kvs.key then kvs.value else null end as y_id,
            case when 'z_id' LIKE kvs.key then kvs.value else null end as z_id
        from my_table as t
        join (
            SELECT
                id,
                split_part(trim(both ' ' FROM unnest(string_to_array(parameters, ';'))), ' ', 1) "key",
                split_part(trim(both ' ' FROM unnest(string_to_array(parameters, ';'))), ' ', 2) "value"
            FROM my_table
        ) as kvs
            on t.id = kvs.id
    ) as params
    group by params.id

CodePudding user response:

I would convert this to a JSON value, then you can access each key quite easily:

select id, 
       parameters ->> 'x_id' as x_id,
       parameters ->> 'y_id' as y_id,
       parameters ->> 'z_id' as z_id
from (       
  select t.id, 
         jsonb_object_agg(split_part(p.parm, ' ', 1), split_part(p.parm, ' ', 2)) as parameters
  from the_table t
    left join unnest(regexp_split_to_array(trim(';' from parameters), '\s*;\s*')) as p(parm) on true
  group by id
) x
order by id;  

The trim(';' from p_input) is necessary to remove the trailing ;, otherwise regexp_split_to_array() will return one empty array element.

You can put this into a function to make things easier:

create or replace function convert_to_json(p_input text)
  returns jsonb
as
$$
  select jsonb_object_agg(elements[1], elements[2]) as parameters
  from (
    select string_to_array(p.parm, ' ') as elements
    from unnest(regexp_split_to_array(trim(';' from p_input), '\s*;\s*')) as p(parm)
  ) t1;
$$
language sql
immutable;

The this gets a bit simpler:

select id, 
       convert_to_json(parameters) ->> 'x_id' as x_id,
       convert_to_json(parameters) ->> 'y_id' as y_id,
       convert_to_json(parameters) ->> 'z_id' as z_id
from the_table
order by id;
  •  Tags:  
  • Related