I have a query and it seems very slow
My Problem
select conversation_hash as search_hash
from conversation
where conversation_hash ='xxxxx'
and result_published_at between '1600064000' and '1610668799'
order by result_published_at desc
limit 5
There is a total of 773179 Records when I run
select count(*)
from conversation
where conversation_hash ='xxxxx'
After I do an explain query
explain select conversation_hash as search_hash
from conversation
where conversation_hash ='xxxxx'
and result_published_at between '1600064000' and '1610668799'
order by result_published_at desc
limit 5
i got this
id,select_type,table,partitions,type,possible_keys,key,key_len,ref,rows,filtered,extra
1, SIMPLE, conversation, , range, idx_result_published_at,conversation_hash_channel_content_id_index,conversation_result_published_at_index,virtaul_ad_id_conversation_hash, idx_result_published_at, 5, , 29383288, 1.79, Using index condition;Using where
Possible Issues
- By looking in the explain query I can see it return more rows(29383288) than the total Records (ie 773179)
- key_len is 5. result_published_at is a timestamp field and its length is def more than 5 eg(1625836640)
What can I improve to make this query Fast, Thanks in advance
EDIT
Indexes for conversation
Table,Non_unique,Key_name,Seq_in_index,Column_name,Collation,Cardinality,Sub_part,Packed,Null,Index_type,Comment,Index_comment
conversation,0,PRIMARY,1,id,A,96901872,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_result_id_unique,1,conversation_hash_id,A,240485,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_result_id_unique,2,result_id,A,100693480,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_channel_content_id_unique,1,conversation_hash_id,A,232122,NULL,NULL,,BTREE,,
conversation,0,conversation_conversation_hash_id_channel_content_id_unique,2,channel_content_id,A,100693480,NULL,NULL,,BTREE,,
conversation,1,conversation_tool_id_foreign,1,tool_id,A,7788,NULL,NULL,,BTREE,,
conversation,1,idx_result_published_at,1,result_published_at,A,38164712,NULL,NULL,YES,BTREE,,
conversation,1,idx_user_name,1,user_name,A,10896208,NULL,NULL,YES,BTREE,,
conversation,1,conversation_hash_channel_content_id_index,1,conversation_hash,A,294048,NULL,NULL,,BTREE,,
conversation,1,conversation_hash_channel_content_id_index,2,channel_content_id,A,99699696,NULL,NULL,,BTREE,,
conversation,1,idx_parent_channel_content_id,1,parent_channel_content_id,A,3550741,NULL,NULL,YES,BTREE,,
conversation,1,idx_channel_content_id,1,channel_content_id,A,90350472,NULL,NULL,,BTREE,,
conversation,1,conversation_result_published_at_index,1,result_published_at,A,37177476,NULL,NULL,YES,BTREE,,
conversation,1,virtaul_ad_id_conversation_hash,1,conversation_hash,A,238906,NULL,NULL,,BTREE,,
conversation,1,virtaul_ad_id_conversation_hash,2,virtual_ad_id,A,230779,NULL,NULL,YES,BTREE,,
conversation,1,idx_ad_story_id,1,ad_story_id,A,167269,NULL,NULL,YES,BTREE,,
CodePudding user response:
Query is correct, it seems you have to update server configurations for mysql, which is probably not available on a shared hosting environment. However, if you have your own server the follow these steps:
- Go to
my.cnffile, in my case it is hosted at/etc/mysql/my.cnf - Increate the values of
query_cache_size,max_connection,innodb_buffer_pool_size,innodb_io_capacity - Switch from
MyISAMtoInnoDB(if possible) - Use latest MySQL version (if possible)
You can get more help from this article https://phoenixnap.com/kb/improve-mysql-performance-tuning-optimization
CodePudding user response:
It's a bit hard to read the output of the Explain command because the possible_keys output is separated by commas.
Depending on the data access patterns, you might want to create a unique index on conversation_hash, in case rows are unique.
If conversation_hash is not a unique field you can create a compound index on conversation_hash, result_published_at so your query will be fulfilled from the index itself.
CodePudding user response:
EXPLAIN estimates the row counts. (It has no way to get the exact number of rows without actually running the query.) That estimate may be a lot lower or higher than the real count. It is rare for the estimate to be that far off, but I would not be worried, just annoyed.
The existence of Text and Blob columns sometimes adds to the imprecision of Explain.
Key_len:
- The raw length, which is 5 for
TIMESTAMP(more below). - 1 for if
NULL( 0 forNOT NULL). - Not very useful for
VARCHAR.
In older versions of MySQL, a Timestamp took 4 bytes and a DATETIME took 8. When fractional seconds were added, those numbers were changed to 5 in both cases. This allowed for a "length" to indicate the number of decimal places. And Datetime was changed from packed decimal to an integer.
Suggest you run ANALYZE TABLE. This might improve the underlying statistics that feed into the estimates.
Please provide SHOW CREATE TABLE; it may give more insight.
The 'composite' INDEX(conversation_hash, result_published_at), in that order, is optimal for that query.
