I'm trying to load data from a very large table (200 million rows) into a Presentation Datamart (PDM) for an Enterprise Data Warehouse using Teradata 14.10. I want to split this large table into several separate tables containing 30 million rows each.
I've simplified things down, but here's the table structure:
CREATE SET TABLE MYDB.LARGE_TABLE ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
CUSTOMERID INTEGER TITLE 'CUSTOMER IDENTIFIER' NOT NULL
, FULLNAME VARCHAR(30) NOT CASESPECIFIC TITLE 'FULLNAME'
);
I've gotten as far as using ROW_NUMBER() so I know how many actual rows there are in the table:
SELECT
ROW_NUMBER() OVER (ORDER BY CUSTOMERID) AS RANK_CUST
, CUSTOMERID
, FULLNAME
FROM
MYDB.LARGE_TABLE AS MYTBL
Because of the SQL standards that are enforced, we have to follow these restrictions:
- Cannot use Stored Procedures
- No single piece of code to contain more than 3 table joins
- No single piece of code to result in more than 30 million rows of output
I'm preparing this SQL script for a very stringent review, and it definitely won't be approved unless I can find a way to split this large table into smaller sets of rows. With these restrictions in mind, does anyone have any ideas?
Best Answer
You can use the QUALIFY clause to restrict the number of rows returned from a large table.
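For example, a minimal sketch of loading one 30-million-row slice (the target table name `MYDB.LARGE_TABLE_01` and the band boundaries are illustrative assumptions; repeat with the next band, 30000001 to 60000000 and so on, for each target table):

```sql
-- Populate one 30-million-row slice of the source table.
-- MYDB.LARGE_TABLE_01 is a placeholder target table with the
-- same structure as MYDB.LARGE_TABLE.
INSERT INTO MYDB.LARGE_TABLE_01 (CUSTOMERID, FULLNAME)
SELECT
    CUSTOMERID
,   FULLNAME
FROM
    MYDB.LARGE_TABLE
QUALIFY ROW_NUMBER() OVER (ORDER BY CUSTOMERID)
    BETWEEN 1 AND 30000000;
```

QUALIFY filters on the window-function result after it is computed, so you avoid wrapping the query in a derived table. Each statement touches a single table and emits at most 30 million rows, which keeps it within the stated restrictions.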
Edit: corrected syntax of ROW_NUMBER()