database - How to improve sub query performance in MySQL


I have a CRM system that generates attributes using the EAV model. The problem, as you may be aware, is that the EAV model requires complex queries to pull data, since each attribute has to be returned in a separate column.
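For readers less familiar with EAV, here is a minimal, self-contained sketch (Python with SQLite; the table and column names are illustrative, not the CRM's actual schema) of why each attribute needs its own join or sub query to come back as a column:

```python
import sqlite3

# Minimal EAV illustration: every attribute of an entity is stored as a
# separate ROW, so returning one COLUMN per attribute requires a pivot
# via one self-join (or sub query) per attribute.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE eav (entity_id INTEGER, field_id INTEGER, value TEXT);
    INSERT INTO eav VALUES (1, 19, 'medium'), (1, 10, '2015-03-01'),
                           (2, 19, 'low');
""")
# One LEFT JOIN per attribute turns rows into columns.
rows = con.execute("""
    SELECT e.entity_id, f19.value AS priority, f10.value AS last_training_on
    FROM (SELECT DISTINCT entity_id FROM eav) e
    LEFT JOIN eav f19 ON f19.entity_id = e.entity_id AND f19.field_id = 19
    LEFT JOIN eav f10 ON f10.entity_id = e.entity_id AND f10.field_id = 10
    ORDER BY e.entity_id
""").fetchall()
print(rows)  # [(1, 'medium', '2015-03-01'), (2, 'low', None)]
```

With dozens of attributes this becomes dozens of joins, which is the scaling problem described in the question.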

When sub queries are involved, MySQL performance suffers. I have to find a better way to write my queries, by analyzing them against the given WHERE clause, sort order, and LIMIT (if any).

By sub query I am referring to a query like this:

SELECT  a.account_name, a.account_type, a.status, a.account_id,
        s.fieldvalue, s2.last_training_on, s3.fieldvalue
FROM accounts a
INNER JOIN clients c ON c.client_id = a.client_id
LEFT JOIN (
    SELECT p.related_to AS account_id, decimal_value AS fieldvalue
    FROM df_answers_text p
    INNER JOIN df_field_to_client_relation r ON r.field_id = p.field_id
    WHERE p.field_id = '19' AND r.client_id = '7'
) s ON s.account_id = a.account_id
LEFT JOIN (
    SELECT p.related_to AS account_id, datetime_value AS last_training_on
    FROM df_answers_text p
    INNER JOIN df_field_to_client_relation r ON r.field_id = p.field_id
    WHERE p.field_id = '10' AND r.client_id = '7'
) s2 ON s2.account_id = a.account_id
LEFT JOIN (
    SELECT p.related_to,
           CAST(GROUP_CONCAT(o.label SEPARATOR " | ") AS CHAR(255)) AS fieldvalue
    FROM df_answer_predefined p
    INNER JOIN df_fields_options o ON o.option_id = p.option_id
    INNER JOIN df_field_to_client_relation r ON r.field_id = o.field_id
    WHERE o.is_place_holder = 0 AND o.field_id = '16' AND r.field_id = '16' AND r.client_id = '7'
    GROUP BY p.related_to
) s3 ON s3.related_to = a.account_id
WHERE c.client_id = '7'
  AND c.status = 'active'
  AND (a.account_type = 'test' OR a.account_type = 'value' OR s2.last_training_on > '2015-01-01 00:00:00')
  AND (s.fieldvalue = 'medium' OR s.fieldvalue = 'low' OR a.exptype = 'very high')
ORDER BY a.account_name
LIMIT 500;

I thought of creating temporary tables using the MEMORY engine with the content of each sub query, like this:

CREATE TEMPORARY TABLE s (KEY(account_id, fieldvalue)) ENGINE = MEMORY
SELECT p.related_to AS account_id, decimal_value AS fieldvalue
FROM df_answers_text p
INNER JOIN df_field_to_client_relation r ON r.field_id = p.field_id
WHERE p.field_id = '19' AND r.client_id = '7';

CREATE TEMPORARY TABLE s2 (KEY(account_id), INDEX USING BTREE (last_training_on)) ENGINE = MEMORY
SELECT p.related_to AS account_id, datetime_value AS last_training_on
FROM df_answers_text p
INNER JOIN df_field_to_client_relation r ON r.field_id = p.field_id
WHERE p.field_id = '10' AND r.client_id = '7';

CREATE TEMPORARY TABLE s3 (KEY(related_to)) ENGINE = MEMORY
SELECT p.related_to,
       CAST(GROUP_CONCAT(o.label SEPARATOR " | ") AS CHAR(255)) AS fieldvalue
FROM df_answer_predefined p
INNER JOIN df_fields_options o ON o.option_id = p.option_id
INNER JOIN df_field_to_client_relation r ON r.field_id = o.field_id
WHERE o.is_place_holder = 0 AND o.field_id = '16' AND r.field_id = '16' AND r.client_id = '7'
GROUP BY p.related_to;

The new query then becomes:

SELECT  a.account_name, a.account_type, a.status, a.account_id,
        s.fieldvalue, s2.last_training_on, s3.fieldvalue
FROM accounts a
INNER JOIN clients c ON c.client_id = a.client_id
LEFT JOIN s  ON s.account_id = a.account_id
LEFT JOIN s2 ON s2.account_id = a.account_id
LEFT JOIN s3 ON s3.related_to = a.account_id
WHERE c.client_id = '7'
  AND c.status = 'active'
  AND (a.account_type = 'test' OR a.account_type = 'value' OR s2.last_training_on > '2015-01-01 00:00:00')
  AND (s.fieldvalue = 'medium' OR s.fieldvalue = 'low' OR a.exptype = 'very high')
ORDER BY a.account_name
LIMIT 500;

DROP TEMPORARY TABLE s, s2, s3;

The problem I am facing is that CREATE TEMPORARY TABLE materializes the entire data set available in the database, which consumes a lot of time. The outer query is only looking for 500 records sorted by a.account_name. If a temporary table holds 1 million records, that is a waste of time and gives me bad performance.

I am looking for a better way to pass the outer WHERE clause down into the sub query, so that the temporary table is created with only the data the outer query needs.
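One way to shrink the temporary tables without parsing the full outer WHERE clause is to restrict each sub query with a semi-join on the accounts the outer query can possibly return (here, one client's accounts). This is a sketch under a deliberately simplified, assumed schema (Python with SQLite so it is runnable), not the CRM's real tables:

```python
import sqlite3

# Simplified, illustrative schema: restrict the materialized sub query to
# one client's accounts instead of copying every row in the answers table.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE accounts (account_id INTEGER, client_id INTEGER);
    CREATE TABLE answers  (related_to INTEGER, field_id INTEGER, value TEXT);
    INSERT INTO accounts VALUES (1, 7), (2, 7), (3, 99);
    INSERT INTO answers  VALUES (1, 19, 'medium'), (2, 19, 'low'),
                                (3, 19, 'high');   -- another client's data
""")
con.execute("""
    CREATE TEMPORARY TABLE s AS
    SELECT related_to AS account_id, value AS fieldvalue
    FROM answers
    WHERE field_id = 19
      AND related_to IN (SELECT account_id FROM accounts WHERE client_id = 7)
""")
rows = con.execute("SELECT * FROM s ORDER BY account_id").fetchall()
print(rows)  # only client 7's rows were copied: [(1, 'medium'), (2, 'low')]
```

The `client_id = 7` predicate is one the GUI already knows, so it can be pushed down even when the rest of the clause mixes AND and OR.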

Note: these queries are generated dynamically by a GUI. I am unable to figure out how to extract the logic/clauses and pass them to the sub query.

Questions

  • How can I look at the WHERE clauses, parse them, and pass them to the sub queries to reduce the amount of data in the sub queries? If every clause were combined with AND, life would be easier, but since I have a mix of AND and OR it is complex.
  • Is there a better approach to this problem rather than using temporary tables?

Edit: here are the table definitions.

create table df_answer_predefined (
  answer_id int(11) unsigned not null auto_increment,
  field_id int(11) unsigned default null,
  related_to int(11) unsigned default null,
  option_id int(11) unsigned default null,
  created_by int(11) unsigned not null,
  created_on datetime default current_timestamp,
  primary key (answer_id),
  unique key un_row (field_id,option_id,related_to),
  key field_id (field_id),
  key related_to (related_to),
  key to_delete (field_id,related_to),
  key outter_view (field_id,option_id,related_to)
) engine=innodb auto_increment=4946214 default charset=utf8;

create table df_fields_options (
  option_id int(11) unsigned not null auto_increment,
  field_id int(11) unsigned not null,
  label varchar(255) default null,
  is_place_holder tinyint(1) not null default '0',
  is_default tinyint(1) not null default '0',
  sort smallint(3) not null default '1',
  status tinyint(1) not null default '1',
  primary key (option_id),
  key (field_id),
  key d (option_id,field_id,is_place_holder)
) engine=innodb auto_increment=155 default charset=utf8;

create table df_field_to_client_relation (
  relation_id int(11) unsigned not null auto_increment,
  client_id int(11) unsigned default null,
  field_id int(11) unsigned default null,
  primary key (relation_id),
  unique key unique_row (field_id,client_id),
  key client_id (client_id),
  key flient_id (field_id)
) engine=innodb auto_increment=26 default charset=utf8;

create table df_answers_text (
  answer_id int(11) unsigned not null auto_increment,
  notes varchar(20000) default null,
  datetime_value datetime default null,
  date_value date default null,
  us_phone_number char(10) default null,
  field_id int(11) unsigned default null,
  related_to int(11) unsigned default null,
  created_by int(11) unsigned not null,
  created_on datetime default current_timestamp,
  modified_by int(11) default null,
  modified_on datetime default null,
  big_unsigned_value bigint(20) default null,
  big_signed_value bigint(19) default null,
  unsigned_value int(11) default null,
  signed_value int(10) default null,
  decimal_value decimal(18,4) default null,
  primary key (answer_id),
  unique key unique_answer (field_id,related_to),
  key field_id (field_id),
  key related_to (related_to),
  key big_unsigned_value (big_unsigned_value),
  key big_signed_value (big_signed_value),
  key unsigned_value (unsigned_value),
  key signed_value (signed_value),
  key decimal_value (decimal_value)
) engine=innodb auto_increment=2458748 default charset=utf8;

The part of the query that takes time is the third sub query, alias s3.

Here is the execution plan of the query that takes a long time ("2 seconds"):

[screenshot of the EXPLAIN output]

UNIQUE(a,b,c)
INDEX(a)

Drop the INDEX, since a UNIQUE key is an index, and that INDEX is a prefix of the UNIQUE key.

PRIMARY KEY(d)
UNIQUE(a,b,c)

Why have d at all? Simply use PRIMARY KEY(a,b,c).

FROM ( SELECT ... )
JOIN ( SELECT ... ) ON ...

This optimizes poorly (until 5.6.6). Whenever possible, turn a JOIN ( SELECT ) into a JOIN with a table. As you suggested, using tmp tables may be better, especially if you can add a suitable index to the tmp table. It is best to avoid having more than one "table" be a subquery.

In a many-to-many relation table, don't include an id for the table; instead have only:

PRIMARY KEY (a,b),  -- enforcing uniqueness, providing the PK, and going one direction
INDEX       (b,a)   -- going the other way

The EXPLAIN does not seem to match the SELECT provided. Each is useless without the other.

Another approach that might help... Instead of:

SELECT ..., s2.foo, ...
    ...
    JOIN ( SELECT ... FROM x ... ) s2 ON s2.account_id = a.account_id

see if you can reformulate it as:

SELECT ...,
       ( SELECT foo FROM x ... WHERE related = a.account_id ) AS foo, ...
    ...

That is, replace the JOIN of a subquery with a correlated subquery that fetches the one value you need.
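A runnable sketch of that rewrite (Python with SQLite, over a simplified, assumed schema), showing that the derived-table join and the correlated scalar subquery return the same rows:

```python
import sqlite3

# Illustrative schema, one attribute (field_id = 10 = last training date).
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE accounts (account_id INTEGER, name TEXT);
    CREATE TABLE answers  (related_to INTEGER, field_id INTEGER, value TEXT);
    INSERT INTO accounts VALUES (1, 'acme'), (2, 'beta');
    INSERT INTO answers  VALUES (1, 10, '2015-03-01');
""")
# Form 1: LEFT JOIN against a derived table (the slow pattern).
join_form = con.execute("""
    SELECT a.name, s2.last_training_on
    FROM accounts a
    LEFT JOIN (SELECT related_to, value AS last_training_on
               FROM answers WHERE field_id = 10) s2
      ON s2.related_to = a.account_id
    ORDER BY a.account_id
""").fetchall()
# Form 2: correlated scalar subquery fetching the one value needed.
scalar_form = con.execute("""
    SELECT a.name,
           (SELECT value FROM answers
            WHERE field_id = 10 AND related_to = a.account_id) AS last_training_on
    FROM accounts a
    ORDER BY a.account_id
""").fetchall()
print(join_form == scalar_form)  # True: same result, no derived table to scan
```

The correlated form probes the answers table once per output row via its index, instead of materializing the whole derived table first.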

The bottom line: the EAV model sucks.

Hmmm... I don't see the need for r at all, since r is not otherwise used in the query...

INNER JOIN df_field_to_client_relation r ON r.field_id = p.field_id
WHERE p.field_id = '19' AND r.client_id = '7'

It seems to be equivalent to:

WHERE EXISTS ( SELECT * FROM df_field_to_client_relation
               WHERE field_id = '19' AND client_id = '7' )

But why bother even checking for the existence?
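A small demonstration of that equivalence (Python with SQLite, illustrative schema). Note the two forms only match because (field_id, client_id) is unique in the relation table, so the join cannot multiply rows:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE answers  (related_to INTEGER, field_id INTEGER, value TEXT);
    CREATE TABLE relation (field_id INTEGER, client_id INTEGER,
                           UNIQUE (field_id, client_id));
    INSERT INTO answers  VALUES (1, 19, 'medium'), (2, 19, 'low');
    INSERT INTO relation VALUES (19, 7);
""")
# Form 1: the join whose only purpose is filtering on a constant pair.
with_join = con.execute("""
    SELECT p.related_to, p.value
    FROM answers p
    JOIN relation r ON r.field_id = p.field_id
    WHERE p.field_id = 19 AND r.client_id = 7
    ORDER BY p.related_to
""").fetchall()
# Form 2: the same filter expressed as a constant EXISTS test.
with_exists = con.execute("""
    SELECT related_to, value FROM answers
    WHERE field_id = 19
      AND EXISTS (SELECT * FROM relation
                  WHERE field_id = 19 AND client_id = 7)
    ORDER BY related_to
""").fetchall()
print(with_join == with_exists)  # True
```

And since the application presumably only generates field_id = 19 for clients that actually have that field, the EXISTS test is always true and can simply be dropped, which is the answer's point.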

