I am trying to get a greater understanding on the differences between the DELETE
and TRUNCATE
commands. My understanding of the internals goes something along the lines of:
DELETE
-> the database engine finds and removes the row from the relevant data pages and all index pages where the row is entered. Thus, the more indexes the longer the delete takes.
TRUNCATE
-> simply removes all the table's data pages en mass making this a more efficient option for deleting the contents of a table.
Assuming the above is correct (please correct me if not):
- How do different recovery modes affect each statement? If there is any effect at all
- When deleting, are all indexes scanned or only those where the row is? I would assume all indexes are scanned (and not seeked?)
- How are the commands replicated? Is the SQL command sent and processed on each subscriber? Or is MSSQL a bit more intelligent than that?
Best Answer
Yes, though there are two options here. Rows may be deleted from nonclustered indexes row-by-row by the same operator that performs the base table deletes. This is known as a narrow (or per-row) update plan:
Or, the nonclustered index deletions may be performed by separate operators, one per nonclustered index. In this case (known as a wide, or per-index update plan) the complete set of actions is stored in a worktable (eager spool) before being replayed once per index, often explicitly sorted by the particular nonclustered index's keys to encourage a sequential access pattern.
Yes.
TRUNCATE TABLE
is more efficient for a number of reasons:Deletion is always fully logged (every row deleted is recorded in the transaction log). There are some small differences in the content of log records if the recovery model is other than
FULL
, but this is still technically full logging.Deleting a row in an index (using either the narrow or wide update plans shown previously) is always an access by key (a seek). Scanning the whole index for each row deleted would be horribly inefficient. Let's look again at the per-index update plan shown earlier:
Execution plans are demand-driven pipelines: parent operators (to the left) drive child operators to do work by requesting a row at a time from them. The Sort operators are blocking (they must consume their whole input before producing the first sorted row), but they are still being driven by their parent (the Index Delete) requesting that first row. The Index Delete pulls a row at a time from the completed Sort, updating the target nonclustered index for each row.
In a wide update plan, you will often see columns being added to the row stream by the base table update operator. In this case, the Clustered Index Delete adds nonclustered index key columns to the stream. This data is required by the storage engine to locate the row to remove from the nonclustered index:
Truncation is not allowed on a table that is published using transactional or merge replication. How deletions are replicated depends on the type of replication and how it is configured. For example, snapshot replication just replicates a point-in-time view of the table using bulk methods - incremental changes are not tracked or applied. Transactional replication works by reading log records and generating appropriate transactions to apply the changes at subscribers. Merge replication tracks changes using triggers and metadata tables.
Related reading: Optimizing T-SQL queries that change data