Sql-server – Recreating Clustered Index Efficiently

clustered-indexindexsql serversql server 2014t-sql

In a test database, I am looking to:

  • Drop clustered index (they are Primary Key Clustered Constraints on a row that is super useless for us.)
  • Create new Clustered index
  • re-create Primary Key constraint as nonclustered index
  • rebuild all other non-clustered indexes.

My workflow is as above as well, with the addition of disabling all the non-clustered indexes prior to dropping the clustered.

Since dropping the clustered constraint index requires the table to save as a HEAP, the amount of time this process takes on our 45m row table is tremendous. The drop on the constraint has been going for 1:17:00 and seems to only be at about 31m (based on Logical Reads in Spotlight for the Session).

Is there a more efficient way to handle this workflow? Perhaps a way to drop the constraint index and rebuild as the new clustered index, rather than as a HEAP?

Thanks,
Wes

DDL Statements:

TABLE STRUCTURE

    CREATE TABLE [dbo].[hist](
    [prrowid] [varchar](36) NOT NULL,
    [part] [varchar](30) NULL,
    [date] [datetime] NULL,
    [per_date] [datetime] NULL,
    [type] [varchar](80) NULL,
    [loc] [varchar](80) NULL,
    [loc_begin] [decimal](28, 10) NULL,
    [begin_qoh] [decimal](28, 10) NULL,
    [qty_req] [decimal](28, 10) NULL,
    [qty_chg] [decimal](28, 10) NULL,
    [qty_short] [decimal](28, 10) NULL,
    [um] [varchar](30) NULL,
    [last_date] [datetime] NULL,
    [nbr] [varchar](30) NULL,
    [so_job] [varchar](80) NULL,
    [ship_type] [varchar](30) NULL,
    [addr] [varchar](80) NULL,
    [rmks] [varchar](80) NULL,
    [xdr_acct] [varchar](80) NULL,
    [xcr_acct] [varchar](80) NULL,
    [mtl_std] [decimal](28, 10) NULL,
    [lbr_std] [decimal](28, 10) NULL,
    [bdn_std] [decimal](28, 10) NULL,
    [price] [decimal](28, 10) NULL,
    [trnbr] [int] NULL,
    [gl_amt] [decimal](28, 10) NULL,
    [xdr_cc] [varchar](30) NULL,
    [xcr_cc] [varchar](30) NULL,
    [lot] [varchar](80) NULL,
    [sub_std] [decimal](28, 10) NULL,
    [gl_date] [datetime] NULL,
    [qty_loc] [decimal](28, 10) NULL,
    [userid] [varchar](80) NULL,
    [serial] [varchar](50) NULL,
    [effdate] [datetime] NULL,
    [prod_line] [varchar](30) NULL,
    [xslspsn1] [varchar](80) NULL,
    [xslspsn2] [varchar](80) NULL,
    [xcr_proj] [varchar](80) NULL,
    [xdr_proj] [varchar](80) NULL,
    [line] [int] NULL,
    [user1] [varchar](80) NULL,
    [user2] [varchar](80) NULL,
    [curr] [varchar](30) NULL,
    [ex_rate] [decimal](28, 10) NULL,
    [rev] [varchar](30) NULL,
    [time] [int] NULL,
    [ovh_std] [decimal](28, 10) NULL,
    [site] [varchar](80) NULL,
    [status] [varchar](80) NULL,
    [grade] [varchar](30) NULL,
    [expire] [datetime] NULL,
    [assay] [decimal](28, 10) NULL,
    [xgl_ref] [varchar](30) NULL,
    [_chr01] [varchar](80) NULL,
    [_chr02] [varchar](80) NULL,
    [_chr03] [varchar](80) NULL,
    [_chr04] [varchar](80) NULL,
    [_chr05] [varchar](80) NULL,
    [_chr06] [varchar](80) NULL,
    [_chr07] [varchar](80) NULL,
    [_chr08] [varchar](80) NULL,
    [_chr09] [varchar](80) NULL,
    [_chr10] [varchar](80) NULL,
    [_chr11] [varchar](80) NULL,
    [_chr12] [varchar](80) NULL,
    [_chr13] [varchar](80) NULL,
    [_chr14] [varchar](80) NULL,
    [_chr15] [varchar](80) NULL,
    [_dte01] [datetime] NULL,
    [_dte02] [datetime] NULL,
    [_dte03] [datetime] NULL,
    [_dte04] [datetime] NULL,
    [_dte05] [datetime] NULL,
    [_dec01] [decimal](28, 10) NULL,
    [_dec02] [decimal](28, 10) NULL,
    [_dec03] [decimal](28, 10) NULL,
    [_dec04] [decimal](28, 10) NULL,
    [_dec05] [decimal](28, 10) NULL,
    [_log01] [bit] NULL,
    [_log02] [bit] NULL,
    [ref] [varchar](80) NULL,
    [msg] [int] NULL,
    [program] [varchar](30) NULL,
    [ord_rev] [int] NULL,
    [ref_site] [varchar](80) NULL,
    [rsn_code] [varchar](80) NULL,
    [vend_lot] [varchar](30) NULL,
    [vend_date] [datetime] NULL,
    [daycode] [varchar](80) NULL,
    [for] [varchar](30) NULL,
    [slspsn##1] [varchar](82) NULL,
    [slspsn##2] [varchar](82) NULL,
    [slspsn##3] [varchar](82) NULL,
    [slspsn##4] [varchar](82) NULL,
    [fsm_type] [varchar](80) NULL,
    [upd_isb] [bit] NULL,
    [auto_install] [bit] NULL,
    [ca_int_type] [varchar](80) NULL,
    [covered_amt] [decimal](28, 10) NULL,
    [fcg_code] [varchar](80) NULL,
    [batch] [varchar](30) NULL,
    [fsc_code] [varchar](80) NULL,
    [sa_nbr] [varchar](80) NULL,
    [sv_code] [varchar](80) NULL,
    [eng_area] [varchar](30) NULL,
    [sys_prod] [varchar](30) NULL,
    [svc_type] [varchar](30) NULL,
    [ca_opn_date] [datetime] NULL,
    [cprice] [decimal](28, 10) NULL,
    [eng_code] [varchar](80) NULL,
    [wod_op] [int] NULL,
    [enduser] [varchar](80) NULL,
    [ship_inv_mov] [varchar](80) NULL,
    [ship_date] [datetime] NULL,
    [ship_id] [varchar](30) NULL,
    [ex_rate2] [decimal](28, 10) NULL,
    [ex_ratetype] [varchar](80) NULL,
    [exru_seq] [int] NULL,
    [promise_date] [datetime] NULL,
    [fldchg_cmtindx] [int] NULL,
    [SrcPDB] [varchar](12) NULL,
     CONSTRAINT [hist_PK] PRIMARY KEY CLUSTERED 
    (
        [prrowid] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

CURRENT INDEXES

ALTER TABLE [dbo].[hist] ADD CONSTRAINT [hist_PK] PRIMARY KEY CLUSTERED ( [prrowid] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##addr_eff] ON [dbo].[hist] ( [addr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##batch] ON [dbo].[hist] ( [batch] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##date_trn] ON [dbo].[hist] ( [date], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##eff_trnbr] ON [dbo].[hist] ( [effdate], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##nbr_eff] ON [dbo].[hist] ( [nbr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_eff] ON [dbo].[hist] ( [part], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_trn] ON [dbo].[hist] ( [part], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##ref_filter] ON [dbo].[hist] ( [ref] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##serial] ON [dbo].[hist] ( [serial] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr] ON [dbo].[hist] ( [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##type] ON [dbo].[hist] ( [type], [effdate] ) WITH (FILLFACTOR=100);

DESIRED INDEXES

CREATE UNIQUE CLUSTERED INDEX [hist##date_trn_CX] ON [dbo].[hist] ( [date], [trnbr] ) WITH (FILLFACTOR=100);
ALTER TABLE [dbo].[hist] ADD CONSTRAINT [hist_PK] PRIMARY KEY NONCLUSTERED ( [prrowid] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##addr_eff] ON [dbo].[hist] ( [addr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##batch] ON [dbo].[hist] ( [batch] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##eff_trnbr] ON [dbo].[hist] ( [effdate], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_eff] ON [dbo].[hist] ( [part], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_trn] ON [dbo].[hist] ( [part], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##ref_filter] ON [dbo].[hist] ( [ref] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##serial] ON [dbo].[hist] ( [serial] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr] ON [dbo].[hist] ( [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##nbr_eff] ON [dbo].[hist] ( [trnbr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr_char] ON [dbo].[hist] ( [trnbr_char] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##type] ON [dbo].[hist] ( [type], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##vend_lot] ON [dbo].[hist] ( [vend_lot] ) WITH (FILLFACTOR=100);

— NOTE —
The below answer worked perfectly for me. I did have to have a volume added. I created a second Filegroup and a datafile on the new drive. Additionally, another log file as well.

Best Answer

Ideally you would do something like this:

  1. Drop the existing primary key constraint but keep the clustered index.
  2. Recreate the clustered index on the new columns with the DROP_EXISTING = ON option set.
  3. Create the primary key constraint on a new nonclustered index.

That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.

When the primary key is deleted, the corresponding index is deleted.

In addition, BOL has this to say about changing a primary key with DROP_EXISTING = ON:

If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:

  • You probably shouldn't use SELECT INTO. That will copy the data to a heap which is the step you're trying to avoid. However, both the SELECT INTO and the creation of the clustered index are eligible for parallelism in SQL Server 2014.
  • Take advantage of minimal logging if your recovery model allows for it. Note that for INSERT INTO... SELECT you'll need a TABLOCK hint against the target table to get minimal logging.
  • Create your nonclustered indexes after all data is loaded.
  • When creating the nonclustered indexes use the SORT_IN_TEMPDB = ON option if tempdb is sized for it.
  • Check for foreign keys on other tables. If you're able to disable those that might help speed up things.

As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:

DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));

CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);

INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;


-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);

SQL Server Execution Times: CPU time = 31 ms, elapsed time = 51 ms.

SQL Server Execution Times: CPU time = 2406 ms, elapsed time = 3484 ms.

-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2) 
WITH (DROP_EXISTING = ON);

SQL Server Execution Times: CPU time = 2422 ms, elapsed time = 3411 ms.