I have an entity_relationship
table that describes relationships between entities. Each relationship will have 2+ entities involved.
In some cases it can be said that one entity is a constituent of a constituency. This is stored in the entity_relationship
table.
I then intend to add entities to the relationships based on this: If entity Joe Bloggs is a constituent of constituency University of Life, and the University of Life is itself a constituent of constituency Made Up Universities then I will add Joe Bloggs into the 2nd relationship as an implied-constituent.
When all the implied-consituent relationships are populated (by running the query until no records are added) I'll have a quick way to find out who's linked to what without needing to do recursion at that stage.
The entity_relationship table looks like:
+-----------------+---------------+
| Field | Type |
+-----------------+---------------+
| entity_id | int(10) |
| relationship_id | int(10) |
| type | enum(...) |
+-----------------+---------------+
and has keys (currently LOADS of 'em, trying to optimise!)
+------------+----------+--------------+-----------------+-------------+------+
| Non_unique | Key_name | Seq_in_index | Column_name | Cardinality | Null |
+------------+----------+--------------+-----------------+-------------+------+
| 0 | PRIMARY | 1 | entity_id | 179429 | |
| 0 | PRIMARY | 2 | relationship_id | 179429 | |
| 1 | r_t_e | 1 | relationship_id | 179429 | |
| 1 | r_t_e | 2 | type | 179429 | YES |
| 1 | r_t_e | 3 | entity_id | 179429 | |
| 1 | t_r_e | 1 | type | 8 | YES |
| 1 | t_r_e | 2 | relationship_id | 179429 | |
| 1 | t_r_e | 3 | entity_id | 179429 | |
| 1 | t_e_r | 1 | type | 6 | YES |
| 1 | t_e_r | 2 | entity_id | 179429 | |
| 1 | t_e_r | 3 | relationship_id | 179429 | |
+------------+----------+--------------+-----------------+-------------+------+
And then the query I'm trying is:
INSERT INTO entity_relationship
SELECT lt.entity_id entity_id,
py.relationship_id relationship_id,
'implied-constituent' `type`
FROM entity_relationship lt,
entity_relationship ly,
entity_relationship pt,
entity_relationship py
WHERE lt.type='constituent'
AND lt.relationship_id = ly.relationship_id
AND ly.type='constituency'
AND ly.entity_id = pt.entity_id
AND pt.type='constituent'
AND pt.relationship_id = py.relationship_id
AND py.type='constituency';
The problem is this is taking 42s to run (even when the query results in zero rows to insert). The output of EXPLAIN (on the SELECT) shows:
+--------+-----+----+-------+---+------------------------+-----+------------------------+
| s._type|table|type|key |len|ref |rows |Extra |
+--------+-----+----+-------+---+------------------------+-----+------------------------+
| SIMPLE |lt |ref |t_r_e |2 |const |89714|Using where; Using index|
| SIMPLE |ly |ref |r_t_e |6 |lt.relationship_id,const| 1|Using where; Using index|
| SIMPLE |pt |ref |PRIMARY|4 |ly.entity_id | 1|Using where |
| SIMPLE |py |ref |r_t_e |6 |pt.relationship_id,const| 1|Using where; Using index|
+--------+-----+----+-------+---+------------------------+-----+------------------------+
Which looks OK – except perhaps the 3rd line where it does not say Using Index.
Can anyone see a way to optimise this?
Best Answer
How to fix it in MySQL
As can be seen from line 3 of the EXPLAIN statement, a [edit:]covering index is not being used here. It needs to join on the entity_id and the type and no key is available for that.
This makes the key available, but MySQL chooses not to use it. It can be forced with:
USE INDEX (e_t_r)
:This now runs in 0.8s. (compare to 17.7s without forcing it to use that index.) This is with MySQL 5.1.63.
How to fix it in MariaDb
Well, you don't have to!
MariaDb executes the query very fast with or without the
USE INDEX
intervention (0.25s, but it's on a different host so I would expect this to be nearer the 0.8s in the optimised version above).Also, MariaDb does not use the new index and is quite happy with the t_e_r index, including using it as a covering index (i.e. uses the index for the entire data fetch).
I'm getting more and more impressed with MariaDb and now considering switching.