CopyDisable

Friday 2 May 2014

Optimizing MySQL queries with memory tables

In this query optimization tip, I will show you how we can make our queries faster using MySQL’s memory storage engine.

I have some report queries in my call center application (using which we capture the calls that we received in our call center).

The main table to be used was callflow:

image

I have to calculate some stats from this table, like the total duration of the call.

image

One of the report query was:

select date(receivedon), sum(TIME_TO_SEC(TIMEDIFF(finishedon,receivedon))) Total, count(rcv_call_id) No_of_calls,
max(TIME_TO_SEC(TIMEDIFF(finishedon,receivedon))) maximum ,
min(TIME_TO_SEC(TIMEDIFF(finishedon,receivedon))) Minimum
from
(select FkCallID rcv_call_id, receivedon from callflow where FlowType='R' and
ReceivedOn >= '2014-01-01 00:59:03' and
ReceivedOn <= '2014-01-31 00:59:03') rcv,
(select FkCallID clsd_call_id, finishedon from callflow where FlowType='C' and
FinishedOn >= '2014-01-01 00:59:03' and
FinishedOn <= '2014-01-31 00:59:03') clsd where rcv_call_id=clsd_call_id group by date(receivedon);

This query returns total number of calls for a day, total time in seconds for the calls, max time taken for a call and min time taken for a call.

image

This query was taking around 30 seconds to execute which is too much.

So I decided to use MySQL’s memory tables for above query. The above query has two derived tables and I am going to use memory tables for those two derived tables.

 

1) First I will create the memory tables for the derived tables and going to add index on the column which will be used in where condition.

CREATE TABLE rcv ENGINE=MEMORY
SELECT FkCallID rcv_call_id, receivedon from callflow  where  FlowType='R' and
ReceivedOn >= '2014-01-01 00:59:03' and
ReceivedOn <= '2014-01-31 00:59:03' ;

ALTER TABLE rcv ADD INDEX (rcv_call_id);
 
CREATE TABLE clsd ENGINE=MEMORY select FkCallID clsd_call_id, finishedon from callflow  where FlowType='C'  and
FinishedOn >= '2014-01-01 00:59:03' and
FinishedOn <= '2014-01-31 00:59:03';

ALTER TABLE clsd ADD INDEX (clsd_call_id);

 

2) Using these memory tables, I will rewrite the query:


select date(receivedon), sum(TIME_TO_SEC(TIMEDIFF(finishedon,receivedon))) Total, count(rcv_call_id) No_of_calls,
max(TIME_TO_SEC(TIMEDIFF(finishedon,receivedon))) maximum ,
min(TIME_TO_SEC(TIMEDIFF(finishedon,receivedon))) Minimum
from rcv, clsd where rcv_call_id=clsd_call_id group by date(receivedon);

 

3) After running the query and getting the result, I will drop the temporary memory tables:

drop table rcv;
drop table clsd;

 

My query after using the memory tables gave results in 87 ms which is lightning fast compared to 30 secs it was taking previously with derived tables.

But remember to adjust the max_heap_table_size system variable, as this value restrict the maximum size of memory tables. Default is 16MB, so adjust it as per your need to take the benefit of memory tables.

No comments:

Post a Comment