AWS Redshift – Managing Deep Copy Disk Space

awsredshift

I am trying to perform a deep copy on Redshift. See http://docs.aws.amazon.com/redshift/latest/dg/performing-a-deep-copy.html

I created the new/temporary table using the SAME DDL, including compression settings, as the original table.
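For reference, this is roughly what I did (a sketch; the actual DDL is larger). Per the linked doc, the CREATE TABLE ... LIKE form of a deep copy inherits the encoding, distkey, and sortkey of the parent table:

-- Create the target table with the same DDL as the original.
-- LIKE inherits column encodings, distribution key, and sort key.
create table events_tmp (like events);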

I have 3 nodes, each node is using 21% disk space.
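(For what it's worth, I'm getting the per-node numbers from STV_PARTITIONS, along these lines; used and capacity are in 1 MB blocks:)

-- Per-node disk usage, as a percentage of capacity.
select owner as node,
       sum(used) as used_mb,
       sum(capacity) as capacity_mb,
       (sum(used)::decimal / sum(capacity) * 100)::decimal(5,1) as pct_used
from stv_partitions
group by owner
order by owner;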

When I run the copy:

insert into events_tmp (select * from events); 

I run out of disk space.
Does the deep copy not compress the data?

Thanks!

Crash and Burn

Best Answer

See vceron's post on https://discourse.snowplowanalytics.com/t/redshift-maintenance-best-practices/1124/2

When doing a deep copy of a highly unsorted table, Redshift needs to sort the table before inserting the rows into the new one.
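You can see how unsorted the source table is with SVV_TABLE_INFO before running the deep copy (something like the following; the table name is the one from the question):

-- "unsorted" is the percentage of rows in the unsorted region;
-- "size" is the table size in 1 MB blocks.
select "table", unsorted, size, tbl_rows
from svv_table_info
where "table" = 'events';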

This sort operation takes place in an intermediate temporary table, which is initially held in memory but, as the data set is too big, eventually spills over to disk.

However, those intermediate temporary tables are not compressed, so Redshift has to allocate uncompressed temporary disk space for the operation, which results in a disk-full error if there is not enough free space for the temporary data.
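You can confirm that the sort spilled to disk by checking SVL_QUERY_SUMMARY for the INSERT's query ID, along these lines (the query ID below is a placeholder):

-- Steps with is_diskbased = 't' ran out of working memory
-- and spilled to disk.
select query, seg, step, label, rows, workmem, is_diskbased
from svl_query_summary
where query = 123456  -- placeholder: query ID of the INSERT
order by seg, step;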