I'm just curious.
Say you have a table of 1 million records/rows.
select order_value from store.orders
Does it make a difference to actual query time whether that table has 1 field, 2 fields, or 100 fields? I mean all the fields other than "order_value."
Right now I'm pushing data to a data warehouse. Sometimes I dump fields into the table that "may be used in the future, someday" – but they aren't being queried right now, by anything. Would these 'extraneous' fields affect select statements that do not include them, directly or indirectly (no `*`, I mean)?
Best Answer
This really depends on indexes and data types.
Using the Stack Overflow database as an example, this is what the Users table looks like:
It has a PK/CX on the Id column. So it's the entirety of the table data sorted by Id.
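For reference, here's a sketch of that table's definition (abridged; the column list is assumed from the public Stack Overflow data dump, so treat it as illustrative rather than exact):

```sql
CREATE TABLE dbo.Users
(
    Id             int IDENTITY  NOT NULL,
    AboutMe        nvarchar(max) NULL,      -- LOB column, can be stored off-row
    CreationDate   datetime      NOT NULL,
    DisplayName    nvarchar(40)  NOT NULL,
    DownVotes      int           NOT NULL,
    LastAccessDate datetime      NOT NULL,
    Location       nvarchar(100) NULL,
    Reputation     int           NOT NULL,
    UpVotes        int           NOT NULL,
    Views          int           NOT NULL,
    WebsiteUrl     nvarchar(200) NULL,
    -- the PK/CX mentioned above: the whole table, sorted by Id
    CONSTRAINT PK_Users_Id PRIMARY KEY CLUSTERED (Id)
);
```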
With that as the only index, SQL has to read that whole thing (sans the LOB columns) into memory if it's not already there.
The stats time and io profile looks like this:
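(The original screenshot isn't reproduced here, but the measurement is along these lines, assuming a single-column query analogous to the asker's:)

```sql
-- Turn on per-query CPU/elapsed time and logical/physical read reporting
SET STATISTICS TIME, IO ON;

-- Only one narrow column is selected, but with just the clustered index,
-- the scan still reads the whole table's in-row data.
SELECT u.Id
FROM dbo.Users AS u;
```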
If I add an additional nonclustered index on just Id, I now have a much smaller index that satisfies my query.
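Something like this (index name is made up for the example):

```sql
-- A narrow nonclustered index containing only the Id key.
-- A scan of this index touches far fewer pages than a scan
-- of the clustered index, which carries every column.
CREATE NONCLUSTERED INDEX ix_Users_Id
    ON dbo.Users (Id);
```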
The profile here:
We're able to do far fewer reads and save a little CPU time.
Without more information about your table definition, I can't really try to reproduce what you're trying to measure any better.
Yes, this is specific to rowstore tables. Data is stored by the row on data pages. Even if other data on the page is irrelevant to your query, that whole row > page > index needs to be read into memory. I wouldn't say that the other columns are "scanned" so much as the pages they exist on are scanned to retrieve the single value on them relevant to the query.
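One rough way to see this for yourself is to compare how many data pages each index occupies, since a full scan has to read them all. This uses standard DMVs, sketched against the hypothetical `dbo.Users` table above:

```sql
-- Page counts per index: the clustered index (every column) vs.
-- any narrow nonclustered indexes on the same table.
SELECT i.name                    AS index_name,
       ps.in_row_data_page_count AS data_pages
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.object_id = OBJECT_ID('dbo.Users');
```

The wider the rows, the fewer fit on each 8 KB page, so the more pages a scan of the clustered index has to pull into memory.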
Using the ol' phonebook example: even if you're just reading phone numbers, when you turn the page, you're turning last name, first name, address, etc. along with the phone number.