Sql-server – how does SQL Server 2008 handle joins internally

sql serversql-server-2008

for example, I have tables: User and Order

User Table:
UserId, Name, Address, Phone, Age

Order Table:
OrderId, UserId, Date, Item, Quantity

SQL:

select User.Name, Order.Date
From Order 
JOIN User ON Order.UserId = User.UserId

Will the join produce a virtual table including all fields in the tables or just the two in the select?

The point to this question is if one of my fields is large (> 1GB) then I would like to place it in its own table and not leave it in a table that gets joined many times all over the place… in order to keep the large fields from being a burden across the board.

Best Answer

The internals of any query operation can be reviewed in the query execution plan. There are many factors involved in how the data is extracted and queried from the source tables, but for the purposes of your question, there are two elements you need to be concerned about:

Seeks and Scans

The first step to any query is to extract the data from the source tables. This is performed using one of two actions (and variations within): a seek or a scan. A seek is where the query engine can leverage an index to narrow down the data retrieval. If no index is available, then the engine must perform a scan of all records in the table. In general, seeks typically perform better than scans due because it is more selective.

Physical Joins

Once data is retrieved from the tables, it must be physically joined. Note, this is not the same as logical joins (INNER and OUTER). There are three physical join operators that can be used by the engine:

The query engine will evaluate several factors to determine which physical join will be the most efficient for the query execution. As a simple example, I assembled this SQL Fiddle example, where you can view a basic execution plan using your sample schema. Note, however, that this plan could change with more data being queried. You will need to review the query plans in your environment to truly understand how your data is being retrieved.

Query plan analysis and performance tuning is beyond the scope of this answer, but I strongly recommend getting a hold of Grant Fritchey' SQL Server 2012 Query Performance Tuning for a comprehensive explanation of query execution internals and tuning approaches.