DB2 LUW – How to Determine Page Size for Index

db2db2-luw

I'm in the process of migrating a set of databases from code set ISO8859-1 to UTF-8, STRING_UNITS=CODEUNITS32. I suspect that quite a few indexes will no longer fit in there current tablespace page size (SQL0614N). Is there a formula that can be used to determine the minimum page size for each index?

I've found the following article by Serge Reilau:

https://www.ibm.com/developerworks/community/blogs/SQLTips4DB2LUW/entry/getrowsize?lang=en

that I guess I can modify to determine the size of the index, but I'm not entirely sure whether the same formula holds for indexes.

Any pointers?

Best Answer

Not sure whether this is actually correct, but I'll use it as an estimate until I figure out something better:

CREATE OR REPLACE FUNCTION GetInxSize(indschema VARCHAR(128), indname VARCHAR(128))
RETURNS INTEGER
SPECIFIC GETINXSIZE
READS SQL DATA
DETERMINISTIC
NO EXTERNAL ACTION
    RETURN
        SELECT SUM(
                    CASE c.TYPENAME
                       WHEN 'SMALLINT'        THEN length
                       WHEN 'INTEGER'         THEN length
                       WHEN 'BIGINT'          THEN length
                       WHEN 'REAL'            THEN length
                       WHEN 'DOUBLE'          THEN length
                       WHEN 'DECFLOAT'        THEN length
                       WHEN 'DECIMAL'         THEN TRUNC(length / 2) + 1
                       WHEN 'CHARACTER'       THEN 4*length
                       WHEN 'VARCHAR'         THEN 4*length + 4
                       WHEN 'GRAPHIC'         THEN length * 2
                       WHEN 'VARGRAPHIC'      THEN length * 2 + 4
                       WHEN 'XML'             THEN 4*c.inline_length + 3
                       WHEN 'DATE'            THEN length
                       WHEN 'TIME'            THEN length
                       WHEN 'TIMESTAMP'       THEN length
                       ELSE raise_error('78000', 'Unknown type')
                    END +
                       CASE WHEN NULLS = 'Y' THEN 1 ELSE 0 END
                 )
        FROM SYSCAT.INDEXES i
        JOIN SYSCAT.INDEXCOLUSE ic
            ON i.indschema = ic.indschema
            AND i.indname = ic.indname
        JOIN SYSCAT.COLUMNS c
            ON i.tabschema = c.tabschema
            AND i.tabname = c.tabname
            AND ic.colname = c.colname
        WHERE i.indschema = GetInxSize.indschema
          AND i.indname = GetInxSize.indname
@

The DB:s in question don't use compression, and I also skipped types that can not be part of an index.

SELECT rtrim(indschema) || '.' || rtrim(indname)
FROM SYSCAT.INDEXES i 
JOIN SYSCAT.TABLES t 
    ON i.tabschema = t.tabschema 
    AND i.tabname = t.tabname 
JOIN SYSCAT.TABLESPACES ts 
    ON coalesce(t.index_tbspace, t.tbspace) 
WHERE t.tabschema not like 'SYS%' 
  AND getinxsize(indschema, indname) > ts.pagesize / 4;

alt:

SELECT distinct rtrim(t.tabschema) || '.' || rtrim(t.tabname)
FROM SYSCAT.INDEXES i 
JOIN SYSCAT.TABLES t 
    ON i.tabschema = t.tabschema 
    AND i.tabname = t.tabname 
JOIN SYSCAT.TABLESPACES ts 
    ON coalesce(t.index_tbspace, t.tbspace) = ts.tbspace 
WHERE t.tabschema not like 'SYS%' 
  AND getinxsize(indschema, indname) > ts.pagesize / 4;

EDIT: Example of XML column in index

A table like:

CREATE TABLE TESTING_APA.ATOM_FEED  
( FEED_XML XML 
, FEED_XML_IS_NULL SMALLINT NOT NULL 
      GENERATED ALWAYS AS (CASE WHEN FEED_XML IS NULL THEN 1 ELSE 0 END) 
) IN USERSPACE1
ORGANIZE BY ROW;

Generates two indexes (not sure why though) where type name of the column is XML:

select tabschema, tabname, indname, colnames 
from syscat.indexes 
where indname in ('SQL190304085605540','SQL190304085605550')

TESTING_APA  ATOM_FEED  SQL190304085605540  +FEED_XML                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
TESTING_APA  ATOM_FEED  SQL190304085605550  +FEED_XML                    

In addition, explicit xml indexes also result in indexes in the catalog whose column type is XML.

Since the source database is not UTF-8, there can't be any JSON columns. For a UTF-8 database, it may be worthwhile to investigate whether indexes for JSON comes into play. The function GetInxSize is however specially designed to handle non-utf -> utf, so it will have to be changed for utf-8 databases.