Sql-server – Recreating Clustered Index Efficiently

clustered-indexindexsql serversql server 2014t-sql

In a test database, I am looking to:

Drop clustered index (they are Primary Key Clustered Constraints on a row that is super useless for us.)
Create new Clustered index
re-create Primary Key constraint as nonclustered index
rebuild all other non-clustered indexes.

My workflow is as above as well, with the addition of disabling all the non-clustered indexes prior to dropping the clustered.

Since dropping the clustered constraint index requires the table to save as a HEAP, the amount of time this process takes on our 45m row table is tremendous. The drop on the constraint has been going for 1:17:00 and seems to only be at about 31m (based on Logical Reads in Spotlight for the Session).

Is there a more efficient way to handle this workflow? Perhaps a way to drop the constraint index and rebuild as the new clustered index, rather than as a HEAP?

Thanks,
Wes

DDL Statements:

TABLE STRUCTURE

    CREATE TABLE [dbo].[hist](
    [prrowid] [varchar](36) NOT NULL,
    [part] [varchar](30) NULL,
    [date] [datetime] NULL,
    [per_date] [datetime] NULL,
    [type] [varchar](80) NULL,
    [loc] [varchar](80) NULL,
    [loc_begin] [decimal](28, 10) NULL,
    [begin_qoh] [decimal](28, 10) NULL,
    [qty_req] [decimal](28, 10) NULL,
    [qty_chg] [decimal](28, 10) NULL,
    [qty_short] [decimal](28, 10) NULL,
    [um] [varchar](30) NULL,
    [last_date] [datetime] NULL,
    [nbr] [varchar](30) NULL,
    [so_job] [varchar](80) NULL,
    [ship_type] [varchar](30) NULL,
    [addr] [varchar](80) NULL,
    [rmks] [varchar](80) NULL,
    [xdr_acct] [varchar](80) NULL,
    [xcr_acct] [varchar](80) NULL,
    [mtl_std] [decimal](28, 10) NULL,
    [lbr_std] [decimal](28, 10) NULL,
    [bdn_std] [decimal](28, 10) NULL,
    [price] [decimal](28, 10) NULL,
    [trnbr] [int] NULL,
    [gl_amt] [decimal](28, 10) NULL,
    [xdr_cc] [varchar](30) NULL,
    [xcr_cc] [varchar](30) NULL,
    [lot] [varchar](80) NULL,
    [sub_std] [decimal](28, 10) NULL,
    [gl_date] [datetime] NULL,
    [qty_loc] [decimal](28, 10) NULL,
    [userid] [varchar](80) NULL,
    [serial] [varchar](50) NULL,
    [effdate] [datetime] NULL,
    [prod_line] [varchar](30) NULL,
    [xslspsn1] [varchar](80) NULL,
    [xslspsn2] [varchar](80) NULL,
    [xcr_proj] [varchar](80) NULL,
    [xdr_proj] [varchar](80) NULL,
    [line] [int] NULL,
    [user1] [varchar](80) NULL,
    [user2] [varchar](80) NULL,
    [curr] [varchar](30) NULL,
    [ex_rate] [decimal](28, 10) NULL,
    [rev] [varchar](30) NULL,
    [time] [int] NULL,
    [ovh_std] [decimal](28, 10) NULL,
    [site] [varchar](80) NULL,
    [status] [varchar](80) NULL,
    [grade] [varchar](30) NULL,
    [expire] [datetime] NULL,
    [assay] [decimal](28, 10) NULL,
    [xgl_ref] [varchar](30) NULL,
    [_chr01] [varchar](80) NULL,
    [_chr02] [varchar](80) NULL,
    [_chr03] [varchar](80) NULL,
    [_chr04] [varchar](80) NULL,
    [_chr05] [varchar](80) NULL,
    [_chr06] [varchar](80) NULL,
    [_chr07] [varchar](80) NULL,
    [_chr08] [varchar](80) NULL,
    [_chr09] [varchar](80) NULL,
    [_chr10] [varchar](80) NULL,
    [_chr11] [varchar](80) NULL,
    [_chr12] [varchar](80) NULL,
    [_chr13] [varchar](80) NULL,
    [_chr14] [varchar](80) NULL,
    [_chr15] [varchar](80) NULL,
    [_dte01] [datetime] NULL,
    [_dte02] [datetime] NULL,
    [_dte03] [datetime] NULL,
    [_dte04] [datetime] NULL,
    [_dte05] [datetime] NULL,
    [_dec01] [decimal](28, 10) NULL,
    [_dec02] [decimal](28, 10) NULL,
    [_dec03] [decimal](28, 10) NULL,
    [_dec04] [decimal](28, 10) NULL,
    [_dec05] [decimal](28, 10) NULL,
    [_log01] [bit] NULL,
    [_log02] [bit] NULL,
    [ref] [varchar](80) NULL,
    [msg] [int] NULL,
    [program] [varchar](30) NULL,
    [ord_rev] [int] NULL,
    [ref_site] [varchar](80) NULL,
    [rsn_code] [varchar](80) NULL,
    [vend_lot] [varchar](30) NULL,
    [vend_date] [datetime] NULL,
    [daycode] [varchar](80) NULL,
    [for] [varchar](30) NULL,
    [slspsn##1] [varchar](82) NULL,
    [slspsn##2] [varchar](82) NULL,
    [slspsn##3] [varchar](82) NULL,
    [slspsn##4] [varchar](82) NULL,
    [fsm_type] [varchar](80) NULL,
    [upd_isb] [bit] NULL,
    [auto_install] [bit] NULL,
    [ca_int_type] [varchar](80) NULL,
    [covered_amt] [decimal](28, 10) NULL,
    [fcg_code] [varchar](80) NULL,
    [batch] [varchar](30) NULL,
    [fsc_code] [varchar](80) NULL,
    [sa_nbr] [varchar](80) NULL,
    [sv_code] [varchar](80) NULL,
    [eng_area] [varchar](30) NULL,
    [sys_prod] [varchar](30) NULL,
    [svc_type] [varchar](30) NULL,
    [ca_opn_date] [datetime] NULL,
    [cprice] [decimal](28, 10) NULL,
    [eng_code] [varchar](80) NULL,
    [wod_op] [int] NULL,
    [enduser] [varchar](80) NULL,
    [ship_inv_mov] [varchar](80) NULL,
    [ship_date] [datetime] NULL,
    [ship_id] [varchar](30) NULL,
    [ex_rate2] [decimal](28, 10) NULL,
    [ex_ratetype] [varchar](80) NULL,
    [exru_seq] [int] NULL,
    [promise_date] [datetime] NULL,
    [fldchg_cmtindx] [int] NULL,
    [SrcPDB] [varchar](12) NULL,
     CONSTRAINT [hist_PK] PRIMARY KEY CLUSTERED 
    (
        [prrowid] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
    ) ON [PRIMARY]

CURRENT INDEXES

ALTER TABLE [dbo].[hist] ADD CONSTRAINT [hist_PK] PRIMARY KEY CLUSTERED ( [prrowid] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##addr_eff] ON [dbo].[hist] ( [addr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##batch] ON [dbo].[hist] ( [batch] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##date_trn] ON [dbo].[hist] ( [date], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##eff_trnbr] ON [dbo].[hist] ( [effdate], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##nbr_eff] ON [dbo].[hist] ( [nbr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_eff] ON [dbo].[hist] ( [part], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_trn] ON [dbo].[hist] ( [part], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##ref_filter] ON [dbo].[hist] ( [ref] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##serial] ON [dbo].[hist] ( [serial] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr] ON [dbo].[hist] ( [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##type] ON [dbo].[hist] ( [type], [effdate] ) WITH (FILLFACTOR=100);

DESIRED INDEXES

CREATE UNIQUE CLUSTERED INDEX [hist##date_trn_CX] ON [dbo].[hist] ( [date], [trnbr] ) WITH (FILLFACTOR=100);
ALTER TABLE [dbo].[hist] ADD CONSTRAINT [hist_PK] PRIMARY KEY NONCLUSTERED ( [prrowid] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##addr_eff] ON [dbo].[hist] ( [addr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##batch] ON [dbo].[hist] ( [batch] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##eff_trnbr] ON [dbo].[hist] ( [effdate], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_eff] ON [dbo].[hist] ( [part], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##part_trn] ON [dbo].[hist] ( [part], [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##ref_filter] ON [dbo].[hist] ( [ref] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##serial] ON [dbo].[hist] ( [serial] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr] ON [dbo].[hist] ( [trnbr] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##nbr_eff] ON [dbo].[hist] ( [trnbr], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##trnbr_char] ON [dbo].[hist] ( [trnbr_char] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##type] ON [dbo].[hist] ( [type], [effdate] ) WITH (FILLFACTOR=100);
CREATE INDEX [hist##vend_lot] ON [dbo].[hist] ( [vend_lot] ) WITH (FILLFACTOR=100);

— NOTE —
The below answer worked perfectly for me. I did have to have a volume added. I created a second Filegroup and a datafile on the new drive. Additionally, another log file as well.

Best Answer

Ideally you would do something like this:

Drop the existing primary key constraint but keep the clustered index.
Recreate the clustered index on the new columns with the DROP_EXISTING = ON option set.
Create the primary key constraint on a new nonclustered index.

That would skip the step of the table being converted to a heap. Unfortunately, step 1 doesn't appear to be possible in SQL Server.

When the primary key is deleted, the corresponding index is deleted.

In addition, BOL has this to say about changing a primary key with DROP_EXISTING = ON:

If the index enforces a PRIMARY KEY or UNIQUE constraint and the index definition is not altered in any way, the index is dropped and re-created preserving the existing constraint. However, if the index definition is altered the statement fails. To change the definition of a PRIMARY KEY or UNIQUE constraint, drop the constraint and add a constraint with the new definition.

As far as I can tell, the best that you can do is to avoid the heap conversion by creating a copy of the table and moving all data there. Removing or adding a clustered index creates an internal copy of the data anyway so it's not like it'll require more space. Here are a few hints to speed that up:

You probably shouldn't use SELECT INTO. That will copy the data to a heap which is the step you're trying to avoid. However, both the SELECT INTO and the creation of the clustered index are eligible for parallelism in SQL Server 2014.
Take advantage of minimal logging if your recovery model allows for it. Note that for INSERT INTO... SELECT you'll need a TABLOCK hint against the target table to get minimal logging.
Create your nonclustered indexes after all data is loaded.
When creating the nonclustered indexes use the SORT_IN_TEMPDB = ON option if tempdb is sized for it.
Check for foreign keys on other tables. If you're able to disable those that might help speed up things.

As an aside, if you were curious to see step 2 in action (I was), here's some sample code which shows how the heap conversion step can be skipped:

DROP TABLE IF EXISTS dbo.X_NUMBERS_1000000;
CREATE TABLE dbo.X_NUMBERS_1000000 (ID INT NOT NULL, ID2 INT NOT NULL, FILLER VARCHAR(500));

CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID);

INSERT INTO dbo.X_NUMBERS_1000000 WITH (TABLOCK)
SELECT TOP (1000000) 
  ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
, REPLICATE('Z', 500)
FROM master..spt_values t1
CROSS JOIN master..spt_values t2;


-- option 1
DROP INDEX X_NUMBERS_1000000.CI_X_NUMBERS_1000000;
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000_2_COL ON dbo.X_NUMBERS_1000000 (ID, ID2);

SQL Server Execution Times: CPU time = 31 ms, elapsed time = 51 ms.

SQL Server Execution Times: CPU time = 2406 ms, elapsed time = 3484 ms.

-- option 2 (after resetting the table)
CREATE CLUSTERED INDEX CI_X_NUMBERS_1000000 ON dbo.X_NUMBERS_1000000 (ID, ID2) 
WITH (DROP_EXISTING = ON);

SQL Server Execution Times: CPU time = 2422 ms, elapsed time = 3411 ms.

Disclaimer

This is experimental and only tested rudimentarily. Proceed at your own risk. I would not use it myself and just drop / recreate constraints with standard DDL commands. If you break entries in the catalog tables you could easily mess up your database.

For all I know, there are only two differences between a PRIMARY KEY and a UNIQUE constraint in the catalog tables (the index itself is identical):

pg_index.indisprimary:
For PRIMARY KEY constraint ... TRUE
For UNIQUE constraint ... FALSE

pg_constraint.contype:
PRIMARY KEY constraint ... 'p'
UNIQUE constraint ... 'u'

You could convert constraint and index in place, from PRIMARY KEY constraint to UNIQUE constraint, my_idx being the (optionally schema-qualified) index name:

UPDATE pg_index SET indisprimary = FALSE WHERE indexrelid = 'my_idx'::regclass
UPDATE pg_constraint SET contype = 'u' WHERE conindid = 'my_idx'::regclass;

Or upgrade from UNIQUE to PRIMARY KEY:

UPDATE pg_index SET indisprimary = TRUE WHERE indexrelid = 'my_idx'::regclass;
UPDATE pg_constraint SET contype = 'p' WHERE conindid = 'my_idx'::regclass;

Best Answer

Related Solutions

Sql-server – Why is Clustered Index on Primary Key compulsory

Postgresql – What happens to the index of a primary key after a DROP CONSTRAINT

Disclaimer

Related Question