When it comes to querying, indexing of a table should never be your first concern.
The queries you plan to use should dictate the indexes you need.
Based on the queries, some columns can be individually indexed. Other queries require compound indexes. The ORDER BY
and GROUP BY
clauses should provide immediate hints for indexes to make. Not using such hints may result in temp table sorting rather than using the indexes for data in the desired order needed.
Low cardinality of column values should eliminate the need for an index.
Even with these things taken into consideration, you may find that query may need some adjustment (a.k.a. refactoring) for performance gains.
When you reach the point of having the right indexes, not you have to worry about the size of those indexes. For a MyISAM table, this would mean that the .MYI file may grow significantly.
The size of the index file as well as the number of indexes should now be weighed against the performance of your queries, especially if the indexes provide the proper ordering of data and fastest retrieval.
Explain plans for queries may change over time depending on the number of rows, cardinality of columns, number of DELETEs and UPDATEs. Once a query's explain plan changes from what it looked like months ago, you should explore the need to add or remove indexes.
Reading your question literally, "if we delete 1 of 2 records pointing to Table2 from Table1 would also delete referenced record from Table2", you just need an ordinary ON DELETE CASCADE
, as you've shown: Table2 references Table1, and deleting a record in Table1 causes all related records in Table2 to be deleted, even if those records are being pointed to by other records in Table1.
However, if your goal is instead to delete from Table2 when there are no more associated records in Table1, whereas there are normally two such records, you'll need something more elaborate.
I think better in concrete terms, so based on your pre-edit question, I'll call Table1 Guardians
(because "parents" can be confusing for SQL-heads when they're actually the child half of a relationship) and Table2 Students
(similarly, avoiding the word "children"). The goal is to ensure that any students that do not have any guardians get deleted.
If polygamy and divorce are unheard on, one option would be to have two NULLable fields in Students
, FatherID
and MotherID
(or Parent1
and Parent2
, to allow for step-parents, gay marriage, etc.). Set these to reference Guardians
with ON DELETE SET NULL
, and add an ON UPDATE
trigger to Students
to delete a record if both fields become NULL. If Bob's father is deleted but he still has a mother, he's OK, but if his mother is deleted the poor orphan gets wiped from the database. This assumes that a CASCADE UPDATE
event triggers triggers, I don't know that for a fact.
A more robust solution would be to create a third table relating Guardians
to Students
. It would include fields GuardianID
, StudentID
, EffectiveDate
(these three making up a candidate key), StopDate
, and maybe fields to indicate the rights of the guardian; perhaps only one has legal custody, and the other should not be allowed to remove the student from school grounds. This would allow for death or divorce: just set the old record's StopDate
. Insert a new record if the remaining guardian ever remarries. You could create an ON UPDATE
trigger and an ON DELETE
trigger on this third table, checking to see if there are now any students which lack active guardians.
However, before putting too much engineering into this, consider: do you really need to cascade? Keep the referential integrity, of course, but just enforce deletion logic in your business logic layer. For that matter, does it even make sense to delete students? If you're asked how many students took a specific class in 2011, but you've deleted some because they've subsequently dis-enrolled, you'll have an undercount.
Best Answer
In many cases, there are more than one way to join two tables; See the other answers for lots of examples. Of course, one could say that it would be an error to use the 'automatic join' in those cases. Then only a handfull of simple cases where it can be used would be left.
However, there is a severe drawback! Queries that are correct today, might become an error tomorrow just by adding a second FK to the same table!
Let me say that again: by adding columns, queries that do not use those columns could turn from 'correct' into 'error'!
That is such a maintenance nightmare, that any sane style guide would prohibit to use this feature. Most already prohibit
select *
for the same reason!All this would be acceptable, if performance would be enhanced. However, that's not the case.
Summarizing, this feature could be used in only a limited set of simple cases, does not increase performance, and most style guides would prohibit its usage anyway.
Therefor it is not supprising that most database vendors choose to spend their time on more important things.