XML domains(lookups): Prevent duplicates between domains

database-designduplicationoraclexml

Background:

I have some GIS domains (a.k.a. lookup tables) that are stored in an XML column in a single system table. The domains are not directly accessible via SQL, so I use a view to extract the values. I combine the domains into a single view so that I can use them in a query for reporting.

SELECT
    SUBSTR(EXTRACTVALUE(CodedValues.COLUMN_VALUE, 'CodedValue/Code'),1,255) AS Code
    ,SUBSTR(EXTRACTVALUE(CodedValues.COLUMN_VALUE, 'CodedValue/Name'),1,255) AS Description
    ,I.NAME as Domain_Name
FROM
    SDE.GDB_ITEMS_VW I
JOIN 
    SDE.GDB_ITEMTYPES IT
    ON I.Type = IT.UUID,
    TABLE(XMLSEQUENCE(XMLType(Definition).Extract('/GPCodedValueDomain2/CodedValues/CodedValue'))) CodedValues
WHERE
    I.NAME IN('ACTIVITY_SIDEWALK'
             ,'ACTIVITY_SEWER'
             ,'ACTIVITY_ROAD')

USER1.ACTIVITY_COMBINED_VW
+-----------------+-------------------------------+-------------------+
|      CODE       |          DESCRIPTION          |      DOMAIN       |
+-----------------+-------------------------------+-------------------+
| CONSTR_S        | CONSTRUCT NEW SIDEWALK        | ACTIVITY_SIDEWALK |
| RECON_S         | RECONSTRUCT EXISTING SIDEWALK | ACTIVITY_SIDEWALK |
+-----------------+-------------------------------+-------------------+
| CONSTR_SEW      | CONSTRUCT NEW SEWER           | ACTIVITY_SEWER    |
+-----------------+-------------------------------+-------------------+
| CONSTR_ROAD     | CONSTRUCT NEW ROAD            | ACTIVITY_ROAD     |
| RECON_ROAD      | RECONSTRUCT EXISTING ROAD     | ACTIVITY_ROAD     |
+-----------------+-------------------------------+-------------------+

The report is based on the CAPITAL_PROJECTS table:

+------------+-------------+-----------+
| PROJECT_ID |  ACTIVITY   |   COST    |
+------------+-------------+-----------+
|         01 | RECON_S     | 5,000,000 |
|         02 | CONSTR_SEW  | 6,000,000 |
|         03 | CONSTR_ROAD | 7,000,000 |
|         04 | RECON_ROAD  | 8,000,000 |
+------------+-------------+-----------+

I have a report query that grabs the DESCRIPTION from the ACTIVITY_COMBINED_VW view:

SELECT 
    CAPITAL_PROJECTS.PROJECT_ID
   ,ACTIVITY_COMBINED_VW.DESCRIPTION
   ,CAPITAL_PROJECTS.COST
FROM 
    USER1.CAPITAL_PROJECTS 
LEFT JOIN 
    USER1.ACTIVITY_COMBINED_VW 
    ON CAPITAL_PROJECTS.ACTIVITY = ACTIVITY_COMBINED_VW.CODE;

USER1.REPORT_VW
+------------+-------------------------------+-----------+
| PROJECT_ID |        DESCRIPTION            |   COST    |
+------------+-------------------------------+-----------+
|         01 | RECONSTRUCT EXISTING SIDEWALK | 5,000,000 |
|         02 | CONSTRUCT NEW SEWER           | 6,000,000 |
|         03 | CONSTRUCT NEW ROAD            | 7,000,000 |
|         04 | RECONSTRUCT EXISTING ROAD     | 8,000,000 |
+------------+-------------------------------+-----------+

Problem:

While the GIS application does prevent users from entering duplicate values within a single domain, the application does not prevent users from entering duplicate values between different domains.

For example, it is entirely possible that a user could accidently enter a RECON_S CODE in both the ACTIVITY_SIDEWALK domain and the ACTIVITY_SEWER domain:

              +-----------------+-------------------------------+-------------------+
              |      CODE       |          DESCRIPTION          | DOMAIN            |
              +-----------------+-------------------------------+-------------------+
              | CONSTR_S        | CONSTRUCT NEW SIDEWALK        | ACTIVITY_SIDEWALK | 
Existing ---> | RECON_S         | RECONSTRUCT EXISTING SIDEWALK | ACTIVITY_SIDEWALK | 
              +-----------------+-------------------------------+-------------------+
              | CONSTR_SEW      | CONSTRUCT NEW SEWER           | ACTIVITY_SEWER    |
New,     ---> | RECON_S         | RECONSTRUCT EXISTING SEWER    | ACTIVITY_SEWER    | 
duplicate     +-----------------+-------------------------------+-------------------+
              | CONSTR_ROAD     | CONSTRUCT NEW ROAD            | ACTIVITY_ROAD     |
              | RECON_ROAD      | RECONSTRUCT EXISTING ROAD     | ACTIVITY_ROAD     |
              +-----------------+-------------------------------+-------------------+

This scenario might seem silly or unlikely in the context of the sample data. But in reality, I have dozens of domains, each with dozens of values. So yes, it's all too easy for users to enter duplicate domain values between different domains.

And so, due to the LEFT JOIN, the REPORT_VW produces incorrect results:

+------------+-------------------------------+-----------+
| PROJECT_ID |        DESCRIPTION            |  COST     |
+------------+-------------------------------+-----------+
|         01 | RECONSTRUCT EXISTING SIDEWALK | 5,000,000 | <--Two records for project 01 
|         01 | RECONSTRUCT EXISTING SEWER    | 5,000,000 | <  (an extra 5 million dollars) 
|         02 | CONSTRUCT NEW SEWER           | 6,000,000 |              
|         03 | CONSTRUCT NEW ROAD            | 7,000,000 |
|         04 | RECONSTRUCT EXISTING ROAD     | 8,000,000 |
+------------+-------------------------------+-----------+

Question:

Is there anything I can do to prevent duplicate domain values being entered between domains?

Under normal circumstances, I think domains (lookup tables) would be stored in plain old database tables. A simple unique index or primary key would be used.

But I'm not so lucky. My domains are stored in an XML column in a system table.

Ideas?

Best Answer

You can create a unique function-based index on the CODE column.

create table t1 (c1 clob);

Because your table uses CLOB based on the linked documentation.

insert into t1 values('<x1><p1>HELLO</p1><p2>WORLD</p2></x1>');

1 row inserted.

commit;

Commit complete.

select extractvalue(xmltype(c1), 'x1/p1') from t1;

EXTRACTVALUE(XMLTYPE(C1),'X1/P1')
---------------------------------
HELLO

Create the unique index for x1/p1:

create unique index i1 on t1(extractvalue(xmltype(c1), 'x1/p1'));

insert into t1 values('<x1><p1>SECOND</p1><p2>TEST</p2></x1>');
commit;

1 row inserted.

commit;

Commit complete.

select extractvalue(xmltype(c1), 'x1/p1') from t1;

EXTRACTVALUE(XMLTYPE(C1),'X1/P1')
---------------------------------
HELLO
SECOND

Try to insert duplicate (p1=HELLO) data:

insert into t1 values('<x1><p1>HELLO</p1><p2>TEST</p2></x1>');
*
ERROR at line 1:
ORA-00001: unique constraint (BP.I1) violated