Sql-server – Is a Non-Clustered index needed when referencing a small table with a Foreign Key

clustered-indexforeign keyindexnonclustered-indexsql server

I'm not a DBA, excuse me if this question sounds dumb. Hope I can get help from an experienced DBA, because I'm questioning this for a long time.

Assume there are 2 tables, one table with a lot of records, and one table is just static data with 2 records (for example: a Yes/No table).

As example take these 2 tables:

  • Table_Yes_No (only 2 records: Yes and No. Of course they have an incrementing clustered PK)
  • Table_Form (Very big table with lots of records, and one column has FK to the PK of Table_Yes_No).

Now my question; Is it worth putting an index on the FK (reference from Table_Form to the Table_Yes_No table)? The column will only contain 1 or 2 (but not sorted, because no index). Is it worth indexing such a FK?

An example in SQLFiddle: http://sqlfiddle.com/#!18/51614/3/0

That column will be queried anyway, the question is will the index help the performance or not.
The small table is static data, will never be changed. The big table will be queried heavily, and there will be also a lot of CRUDs on it.

Best Answer

Short answer

No.

Long answer

No, and adding the index could be detrimental to performance.

Something that factors heavily into whether or not a secondary/non-clustered index is even used is how selective it is (and the searches you are trying to perform are). Y/N has two values - it's ability to be selective is going to depend on the proportion of Y to N. If they are evenly split, and there is no rhyme/reason to Y/N with relation to the clustered index, they will not be selective and the query optimizer will almost always ignore that index.

The potential slowdown occurs if you issue a delete on the larger table or an update against that Y/N column. There's no efficient way to update the secondary index as it is organized by Y/N - so the only option is to basically scan the entire index looking for the record to update/delete. Depending on the number of rows in the main table and the size of the clustered index, this could be an issue.

Caveats

If there are few Ys or few Ns, and you need to locate those records quickly, you can use a filtered index to create a very small index that contains only records for the sparsely populated value. This can be beneficial and has a much lower footprint than a traditional non-clustered index would in that situation.

If you're just trying to enforce Y/N (or some other easily understood code set), that can be handled easily through a check constraint, which will have a lower overhead than the FK.

ADD CONSTRAINT CK_<ColumnName>_Is_YesNo CHECK (<ColumnName> IN ('Y','N'))