Home > Blockchain >  Why does this update query not complete?
Why does this update query not complete?

Time:02-03

I have 2 tables, customers (3000 rows) and phone_call_log (350 000 rows).

I need to materialize the time of last call to each customer, using the call log (faster for frontend searches)

Indexes are on:

  • start_time (timestamp)
  • callee(bigint(32) unsigned)
  • caller(bigint(32) unsigned)
  • phonenumber(bigint(32) unsigned)
  • last_call(timestamp)

Running this query without the OR statement completes in < 2 seconds for either caller / callee columns, but with the OR in place, it will not complete (I've not allowed it to run longer than 30 minutes in testing).

UPDATE customers
SET customers.last_call =
        (select max(phone_call_log.start_time)
         from phone_call_log
         WHERE (
                 (
                             phone_call_log.callee = customers.phonenumber
                     )
                 or (
                             phone_call_log.caller = customers.phonenumber
                     )
             )
        )
where customers.phonenumber is not null
  AND LENGTH(customers.phonenumber) > 6
  AND customers.phonenumber > 1000000;

CodePudding user response:

Queries using OR cannot use index (as efficiently). I suggest you try the following:

UPDATE customers
SET last_call = GREATEST(
    (SELECT MAX(start_time) FROM phone_call_log WHERE callee = customers.phonenumber),
    (SELECT MAX(start_time) FROM phone_call_log WHERE caller = customers.phonenumber)
)

Be advised that GREATEST has issues with NULL values.

CodePudding user response:

Fastest

Change the data flow to update customers.last_call when a call comes in.

UPDATE JOIN

UPDATE works better with a JOIN than IN ( SELECT ... )

OR

OR is a performance killer. The query is likely to scan the entire phone_call_log each for each customer.

One fix is to do two UPDATEs and have suitable indexes:

UPDATE 
    SET customers.last_call = GREATEST( customers.last_call,
                 ( select max(phone_call_log.start_time)
                      FROM  phone_call_log
                     WHERE  phone_call_log.callee = customers.phonenumber 
                 )
    WHERE ...
UPDATE 
    SET customers.last_call = GREATEST( customers.last_call,
                 ( ... caller ... )
                 )
    WHERE ...

That needs these indexes on phone_call_log:

    INDEX(callee, start_time)
    INDEX(caller, start_time)

and remove your current 1-column indexes on caller and callee.

Datatype

It is probably wrong and counterproductive to use BIGINT for a phone number, especially in light of LENGTH(customers.phonenumber) > 6.

In fact, all of this boils down to a single test:

where customers.phonenumber is not null
  AND LENGTH(customers.phonenumber) > 6
  AND customers.phonenumber > 1000000;

Each of the > check for NOT NULL anyway; use only one of the others, depending on datatype. And do have it indexed.

(Please provide SHOW CREATE TABLE; the 'English' is less precise.)

  •  Tags:  
  • Related