Mysql – How does compound index help speed up query

indexMySQLperformancequery-performance

Say we have a bunch of column in mysql table.

Latitude Longitude

Say we make an index (Latitude and Longitude)

Well, the index will sort by Latitude first and IF the Latitude is the same it'll sort by Longitude.

Something fishy should show up. Latitude is rarely, if ever, exactly the same.

Say we want to find all points whose Lat,Long is "in a box" How in the earth those indexes can be useful at all.

Note: I am trying to understand how indexing work. I am not trying to snobbishly say that compound index is stupid. Doesn't have to be full answer. Any pointing is fine.

More importantly, I am suspicious that a query will only use 1 index and the only way for a query to use multilpe index is to make compound index.

Best Answer

Indexes are in most cases a B-Tree structure (or some sort of). There are DBMS that supported different indexing types.

As you are talking about longitude/latitude PostgreSQL with it's GIST indexes come to mind. Oracle has Bitmap indexes in addition to the B-Tree indexes. I'm sure SQL Server and DB2 also have some special index types. And then there are full text indexes which make searching text very efficient.

A B-Tree index is very efficient in finding a specific value - think of a primary key index where all values are different. If the index only contains the PK column(s) (i.e. it is not a clustered index) then typically looking up a row by a specific PK values takes not more than (roughly) 3-4 IO operations (at least with Oracle). 2-3 to find the index block and an additional one to read the whole row. This gets more efficient if the index contains additional columns so that the lookup of the actual table row is not needed. The term for that is "covering index" or "index only retrieval".

Now for doing "range lookups" (e.g. where foo > 42) an index is very helpful as well as in most DBMS the index can also be scanned according to a predicate. Usually (again this highly depends on the DBMS) this is slightly less efficient than a direct lookup (again this also depends on the ability to do an "index only retrieval").

I don't know of any BMS which can not use more than one index in a query. Think a join on a PK and a FK column - depending on the data distribution the DBMS might use the index to find the parent rows (PK lookup) and the child rows (FK lookup).

But not all DBMS can use more than one index for the same table in a single query.

After all whether or not an index is being used or not depends on a lot of things.

I can highly recommend http://use-the-index-luke.com/ which is a very good introduction on indexing across all major DBMS.

DBMS specific information:

Oracle: http://docs.oracle.com/cd/E11882_01/server.112/e25789/indexiot.htm
PostgreSQL: http://www.postgresql.org/docs/current/static/indexes.html