Sql-server – Where in the database file is an index rebuild started? Alternative to SHRINKFILE

disk-spaceindexsql server

When rebuilding an index in SQL server, where in the datafile does it put that data?

Does it start at the beginning of free space and fill in the gaps from there?

Context: I have some large .ndf files I'm trying to shrink (not important why); but instead of using a SHRINKFILE command, I'm wondering if I can rebuild the indexes and then truncate the file from 80GB down to 25GB.

Edit: There are many reasons to want to reduce the file space or reorganize it logically. These may relate to fragmentation, how that fragmentation is translated to IO requests, moving large files across a network, swapping out storage space, how data is read sequentially, etc. Excessive space usage could even prevent other files from being written to your disk that need the space or dbs that need room to grow. You might have done some testing and now need to clean up excessive data and the space it required. Imagine you just switched to a newer database server and could take a 400GB database and compress it down to 100GB with newly available db management techniques, compression, etc, That's got value and I'm sure there are many other scenarios where this matters. Just be careful to judge the reasons behind questions please.

Best Answer

I don't believe those details are documented anywhere, and even if you were to figure it out (by asking Paul Randal, or perhaps by mapping the exact pages before and after an index rebuild), you may not be able to rely on those going forward. You also might end up with some random single pages at the end of the file anyway, and still need to do a shrink.

The best idea, in my opinion, is to do an index rebuild into a different file group:

Create a new filegroup

Move all affected tables and indexes into the new filegroup using the CREATE INDEX … WITH (DROP_EXISTING = ON) ON syntax, to move the tables and remove fragmentation from them at the same time

Drop the old filegroup that you were going to shrink anyway (or shrink it way down if its the primary filegroup)

If that's not an option, then you can instead do a shrink-then-reindex, or maybe a reindex-shrink-reindex:

Initial reindex: this releases free space within each page, and might be necessary if you've made major changes in the table structure like dropping large columns or changing column datatypes
ShrinkFile (never use ShrinkDatabase): Pick a reasonable target size, shrink each file with some remaining free space (I usually pick 15-20% free)
Final reindex: Resolves any fragmentation we created in step 2.

This multi-pass approach is a ton of IO, and might take a while to do. That's why the single-pass approach into a different file group is preferred, if you can pre-allocate the space.

Also, the final reindex may reclaim some of the space recovered in step 2. This is common where the majority of space is taken up by a single huge table or single huge index.

`TABLESPACE`

Yes, your last idea could work. Create a new tablespace on the other disk.

CREATE TABLESPACE ts1 LOCATION '/data/disk2';

Then create an optimized copy of the table in the new tablespace:

CREATE TABLE new_tbl 
TABLESPACE ts1 AS
SELECT * FROM tbl
ORDER BY ....  -- ORDER BY is optional

The copy will be packed tightly without dead rows.

Then you might just drop the old table and rename the new one to keep using the new disk.

Or, if you have to keep it on the old disk for some reason, delete the old table, rename the new one and move it back to the default tablespace. This step is much faster now with:

ALTER TABLE tbl SET TABLESPACE pg_default

Per documentation:

This form changes the table's tablespace to the specified tablespace and moves the data file(s) associated with the table to the new tablespace. Indexes on the table, if any, are not moved; but they can be moved separately with additional SET TABLESPACE commands. See also CREATE TABLESPACE.

Either way, don't forget to (re)create all depending objects. Indexes, foreign keys, views, ...

Well, you could use ALTER TABLE tbl SET TABLESPACE ... to begin with, but then the table would not be optimized as requested, just moved as is. But you would have enough wiggle room then to run pg_repack etc.

`COPY`

A complete backup / restore might take too long, but you could do that for only the table in question.

COPY tbl TO '/path/to/other/disk/tbl.pgsql';
TRUNCATE tbl;
COPY tbl FROM '/path/to/other/disk/tbl.pgsql';

The table is now packed tightly.

Temporary table

If you should happen to have enough RAM, you could do something similar with a temporary table in RAM. Would be much faster. Detailed instructions:

Best Answer

Related Solutions

Sql-server – How does DBCC SHRINKFILE() affect backups

Postgresql – How to reclaim disk space after delete without rebuilding table

TABLESPACE

COPY

Temporary table

Related Question

`TABLESPACE`

`COPY`