DELETE - DATA MANIPULATION LANGUAGE(DML)
The DELETE
statement in any RDBMS is considered a DML statement. Also known as CRUD (Create, Read, Update, Delete), this kind of statement is intended to manipulate data in a database without affecting the underlying structure of the objects. What this means in practical term is:
- A
DELETE
statement can be fine-tuned using a predicate via either WHERE
or JOIN
to delete some or all rows in a table.
- A
DELETE
statement will be logged by the database and can be rolled back within a transaction if the statement fails.
- Typically a
DELETE
will take row level locks on the data it is deleted, though this could escalate higher as necessary.
- Because of the transactional overhead,
DELETE
can be "slow" (this is relative), but safer because it is fine-grained.
TRUNCATE - DATA DEFINITION LANGUAGE(DDL)
TRUCNATE
is considered a DDL statement, meaning that it is intended to alter how objects are defined in a database. Usually, DDL statements are CREATE
, ALTER
, or DROP
, but TRUNCATE
serves a particular purpose, that of "resettting" a table by removing all the rows. The methods of this differ between the RDBMS engines and I would recommend looking at the specifics of MySQL. The practical implications of a TRUNCATE
are:
TRUNCATE
can not be fine grained. If successful, it will remove all the rows from your table.
TRUNCATE
is typically not logged. This varies by RDBMS and I would suggest you looks more specifically at how MySQL handles it. (Hint, it varies by version.)
TRUNCATE
requires a table metadata lock to execute. How this is actually implemented can be specific to RDBMS, but essentially the TRUNCATE
process must prevent other processes from mucking with the table in order to execute its DDL.
- Because it is (typically) not logged and does not use predicates, a
TRUNCATE
will be faster than a DELETE
, but less safe.
DROP TABLE - DATA DEFINITION LANGUAGE(DDL)
DROP TABLE
goes further than a TRUNCATE
in that it actually removes the table from the database completely. This includes removing all associated objects, such as indexes and constraints. If you dropped a table, not only would you remove all the data, but the structure as well. This will usually be done when a table is no longer needed. The primary concern is that since a DROP
is DDL, you typically can't undo it. Some RDBMS engines will allow you to wrap DDL within a transaction so you can roll it back, but this is not considered best practice and should be avoided.
If you just want to Truncate and reload data, then faffing around with indexes is not necessarily useful.
If you're inserting data in Clustered Index order i.e. in CONCEPT_CD ASC
order, then there's no real advantage to dropping the Clustered Index. It'll be far more pain rebuilding it at the end on 3 billion rows than it would inserting the data in your Clustered Index order in the first place.
However, if you want to disable your indexes, then that would be as follows:
-- Disable indexes on OBSERVATION_FACT
-- If you're dropping, don't disable. If you're disabling, don't drop...
ALTER INDEX ALL ON OBS_FACT DISABLE;
-- Truncate your table
TRUNCATE TABLE dbo.OBS_FACT;
-- ETL Process here....
-- REBUILD indexes on OBSERVATION_FACT
-- Or recreate them if you've dropped them in step 1
ALTER INDEX ALL ON OBS_FACT REBUILD WITH (ONLINE = ON);
Creating following by immediately rebuilding your indexes as per your script, is pointless, as creating builds the index. Why then immediately rebuild?
Best Answer
DROP
&CREATE
is slightly more expensive, as it actually deletes rows from some system tables (pg_class
,pg_attribute
, ...) in addition to removing the physical table files - and later has to parse etc. the newCREATE TABLE
command, whileTRUNCATE
only removes the physical files for the table and starts new ones, keeping the catalog entries. But the difference is negligible for simple tables, especially for temp tables. And it gets smaller, yet if you factor in an additionalANALYZE
that might be needed afterTRUNCATE
. Then again, you might need that in any case. See:By "filling the tables" you mean
COPY
, I suppose? A much more costly difference would be toCREATE
orTRUNCATE
a plain table in a separate transaction before writing to it, as in this case you accrue the additional (substantial) cost of writing WAL (Write Ahead Log) entries. The manual:Bold emphasis mine. (
minimal
is the default forwal_level
.) Does not affect temp tables, which do not write WAL at all.You might be interested in
CREATE TEMP TABLE ...
ON COMMIT DELETE ROWS
. The manual:Should be fastest. But the difference still typically small.