I'm trying to load data from a very large table (200 million rows) into a Presentation Datamart (PDM) for an Enterprise Data Warehouse using Teradata 14.10. I want to split this large table into several separate tables containing 30 million rows each.
I've simplified things down, but here's the table structure:
CREATE SET TABLE MYDB.LARGE_TABLE ,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
CUSTOMERID INTEGER TITLE 'CUSTOMER IDENTIFIER' NOT NULL
, FULLNAME VARCHAR(30) NOT CASESPECIFIC TITLE 'FULLNAME'
);
I've gotten as far as using ROW_NUMBER() so I know how many actual rows there are in the table:
SELECT
ROW_NUMBER() OVER (ORDER BY CUSTOMERID) AS RANK_CUST
, CUSTOMERID
, FULLNAME
FROM
MYDB.LARGE_TABLE AS MYTBL
Because of the SQL standards that are enforced, we have to follow these restrictions:
- Cannot use Stored Procedures
- No single piece of code to contain more than 3 table joins
- No single piece of code to result in more than 30 million rows of output
I'm preparing this SQL script for a very stringent review, and it definitely won't be approved unless I can find a way to split this large table into smaller sets of rows. With these restrictions in mind, does anyone have any ideas?
Best Answer
You can use the QUALIFY clause to restrict the number of rows returned from a large table.
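For example, a minimal sketch of loading one 30-million-row slice (the target table name `MYDB.LARGE_TABLE_01` and the band boundaries are illustrative assumptions; repeat with the next band, 30000001 to 60000000 and so on, for each target table):

```sql
-- Populate one 30-million-row slice of the source table.
-- MYDB.LARGE_TABLE_01 is a placeholder target table with the
-- same structure as MYDB.LARGE_TABLE.
INSERT INTO MYDB.LARGE_TABLE_01 (CUSTOMERID, FULLNAME)
SELECT
    CUSTOMERID
,   FULLNAME
FROM
    MYDB.LARGE_TABLE
QUALIFY ROW_NUMBER() OVER (ORDER BY CUSTOMERID)
    BETWEEN 1 AND 30000000;
```

QUALIFY filters on the window-function result after it is computed, so you avoid wrapping the query in a derived table. Each statement touches a single table and emits at most 30 million rows, which keeps it within the stated restrictions.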
Edit: corrected syntax of ROW_NUMBER()