I concur with your speculation that it's bad design practice, and I'd recommend against it. Any time you do this sort of thing, it almost inevitably comes back to bite you, and the performance difference compared to a properly designed schema should be small.
You say that as it stands now, your process "would involve searching through the first table for all question_ids answered by the user, then searching the topics_and_questions table" but that's not really a good analysis of what should happen behind the scenes.
Properly designed, this process would involve checking an index on the first table for all questions answered by the user, joining to the second table to find the topics of the questions (and possibly a third table to find the names of the topics), but all in one query and with no full table scans.
If your first table (no table name provided) has (user_id, question_id) as its primary key (as it probably should), then finding the questions answered by each user will be a fast operation, no matter how many records are in this table. Rows in InnoDB are stored in primary key order, so there's no "searching through" the entire table that needs to happen... the storage engine can go directly to where that user's records are without a full table scan. If (user_id, question_id) isn't the primary key, then adding an index on (user_id, question_id) will optimize this part of the query.
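For concreteness, a minimal sketch of what that could look like (the table name answers and the column types are assumptions, since no names were given):

CREATE TABLE answers (
  user_id     INT UNSIGNED NOT NULL,
  question_id INT UNSIGNED NOT NULL,
  PRIMARY KEY (user_id, question_id)  -- InnoDB clusters rows on this key,
                                      -- so one user's rows are stored together
) ENGINE=InnoDB;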
In topics_and_questions, either (question_id) or (question_id, topic_id) should probably be the primary key, depending on whether a question can be in multiple topics. If it can, of course, then counting questions answered becomes more complicated.
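Put together, the single query described above could look something like this (a sketch only; the answers table, a topics table holding topic names, and the sample user_id are assumptions):

SELECT t.topic_name,
       COUNT(DISTINCT a.question_id) AS questions_answered
FROM answers a
JOIN topics_and_questions tq ON tq.question_id = a.question_id
JOIN topics t ON t.topic_id = tq.topic_id
WHERE a.user_id = 12345  -- the key on (user_id, question_id) makes this a range read, not a scan
GROUP BY t.topic_name;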
Since you appear to be allowing a user to answer the same question more than once, there may be yet another issue to consider: you actually have to duplicate part of what you're trying to avoid every time you update this new proposed table. You have to check whether the question just answered is one that's been answered before, and only increment this counter if it hasn't been, I would assume (a sketch of that check follows).
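A minimal sketch of that check in MySQL, assuming the hypothetical answers table above plus a hypothetical topic_counters table for the proposed counters: run the insert with a uniqueness guard, and let the application increment the counter only when the insert actually added a row.

-- IGNORE makes a repeat answer a no-op instead of a duplicate-key error
INSERT IGNORE INTO answers (user_id, question_id) VALUES (12345, 678);

-- run this only if the client reported 1 affected row above,
-- i.e. this was the first time the user answered this question
UPDATE topic_counters tc
JOIN topics_and_questions tq ON tq.topic_id = tc.topic_id
SET tc.questions_answered = tc.questions_answered + 1
WHERE tc.user_id = 12345
  AND tq.question_id = 678;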
Based on the information given, I will take a shot at an answer...
You stated
If I add identifiedtex as the last column in the index, will this speed up a lookup of the texture since the result is part of the index?
Let's say you have the following query:
SELECT identifiedtex FROM texels
WHERE texelx = 210938
AND texely = 378432
AND texelhue = 23;
What would be the effect of having an index (call it Index3) on (texelx, texely, texelhue) versus an index (call it Index4) on (texelx, texely, texelhue, identifiedtex)?
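In DDL terms, those two candidates would be (the index names are just the labels above; texels comes from the sample query):

CREATE INDEX Index3 ON texels (texelx, texely, texelhue);
CREATE INDEX Index4 ON texels (texelx, texely, texelhue, identifiedtex);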
When you retrieve identifiedtex using only Index3, it will require an additional lookup into the table to fetch the value of identifiedtex.
When you retrieve identifiedtex using only Index4, it will be an index-only lookup because all the columns mentioned in the query are in the index. This is known as a covering index.
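In MySQL you can see the difference with EXPLAIN; when the covering index is used, the Extra column reports "Using index":

EXPLAIN SELECT identifiedtex FROM texels
WHERE texelx = 210938
  AND texely = 378432
  AND texelhue = 23;
-- with Index4 in place, Extra should show "Using index",
-- meaning the query was satisfied from the index alone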
I have mentioned covering indexes in past posts.
So, to answer your original question: yes, adding identifiedtex to the index will make for a faster query.
Best Answer
If your business logic layer(s) know that there has been no change (i.e. they have already re-read the data and can compare it to the provided input), then you can simply not send anything to the DB.
If you don't know, before hitting the data layer, whether there has been any change, then you can do something like:
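(A sketch only; the table name some_table, the id key, and the fieldN columns are placeholders, not from the original.)

UPDATE some_table
SET field1 = @field1,
    field2 = @field2,
    fieldN = @fieldN
WHERE id = @id
  AND (field1 <> @field1
    OR field2 <> @field2
    OR fieldN <> @fieldN);  -- only writes when at least one value differs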
This avoids any update if nothing has changed, so it is more efficient (the lookups needed to reject the update would be needed to perform it anyway, and the extra CPU time for the comparison is vanishingly small unless you have huge data in any of the columns), but it is more work to maintain and more prone to errors, so most of the time I would recommend keeping it simple and updating without checking every column, i.e. the standard:
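(Again a sketch, with the same placeholder names as above.)

UPDATE some_table
SET field1 = @field1,
    field2 = @field2,
    fieldN = @fieldN
WHERE id = @id;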
The maintainability issue is even worse for NULLable columns, as

fieldN <> @fieldN

has to be changed to

(fieldN <> @fieldN OR (fieldN IS NULL AND @fieldN IS NOT NULL) OR (fieldN IS NOT NULL AND @fieldN IS NULL))

because comparisons involving NULL never evaluate to true: NULL is not equal to NULL, but it is also not not-equal to NULL. NULL by definition is unknown, so all comparisons (=, !=, <, >, ...) return unknown, which is treated as false, unless your DB supports non-ANSI behaviour like https://stackoverflow.com/questions/9766717/ (a NULL-safe shorthand is sketched at the end of this answer). If this is all baked into your data access layer then fine; some ORMs, for instance, provide such optimisation as an option, but you don't want to be manually writing and maintaining it for every update statement in your application.

The DB (every engine I'm aware of) will perform a full write of the relevant data pages even if no values have changed; it won't spend the few extra CPU cycles checking.

Away from the efficiency of the update itself, you need to consider knock-on effects and certain business requirements. If there are triggers that take action on updates to the table, they will fire even if nothing has changed, which could be a significant consideration if you have any complex triggers. From a business logic point of view, do you need to log that the user actively saved the record with no changes? If you are relying on automated change capture of some sort (such as your own triggers, or the temporal table features in postgres and the upcoming MSSQL2016) to track changes, then you (or your end users) need to decide whether the data model should care that a no-change update happened.

Of course, if the record includes a "last saved" column this all becomes a moot point: there is always something to update upon save, if only that one column.
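As an aside on the NULL comparisons above: standard SQL and postgres offer IS DISTINCT FROM as a NULL-safe shorthand for that expanded predicate (a sketch, keeping the same placeholder names and @-style parameter markers as above for consistency):

UPDATE some_table
SET field1 = @field1,
    fieldN = @fieldN
WHERE id = @id
  AND (field1 IS DISTINCT FROM @field1   -- true when the values differ,
    OR fieldN IS DISTINCT FROM @fieldN); -- treating NULL as a comparable value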