For each field name in a list of fields, get the unique values

dynamic-sqljoin;oracleplsql

The Question:

I have a view that is a list of fields in tables:

FIELD_LIST_VW
+-------------+------------+
| TABLE_NAME  | FIELD_NAME |
+-------------+------------+
| ENG.TABLE_1 | FIELD_1    |
| ENG.TABLE_1 | FIELD_2    |
| ENG.TABLE_2 | FIELD_A    |
+-------------+------------+

The underlying tables look like this:

TABLE_1
+---------+---------+
| FIELD_1 | FIELD_2 |
+---------+---------+
| A       | X       |
| A       | Y       |
| B       | Y       |
| B       | Y       |
| B       |         |
| C       |         |
+---------+---------+

TABLE_2
+---------+
| FIELD_A |
+---------+
|       1 |
|       1 |
|       1 |
|       2 |
|       3 |
+---------+

TABLE_3, TABLE_4, etc...

I want to expand the list of fields from FIELD_LIST_VW, so that there is a row for each unique value, per field:

+-------------+------------+--------------+
| TABLE_NAME  | FIELD_NAME | UNIQUE_VALUE |
+-------------+------------+--------------+
| ENG.TABLE_1 | FIELD_1    | A            |
| ENG.TABLE_1 | FIELD_1    | B            |
| ENG.TABLE_1 | FIELD_1    | C            |
+-------------+------------+--------------+
| ENG.TABLE_1 | FIELD_2    | X            |
| ENG.TABLE_1 | FIELD_2    | Y            |
+-------------+------------+--------------+
| ENG.TABLE_2 | FIELD_A    | 1            |
| ENG.TABLE_2 | FIELD_A    | 2            |
| ENG.TABLE_2 | FIELD_A    | 3            |
+-------------+------------+--------------+

What I've tried:

I figure a good place to start would be a cross join sub-select:

SELECT
    a.table_name
    ,a.field_name
    ,b.unique_value
FROM
    eng.field_list_vw a
CROSS JOIN (
    SELECT 
        'FIELD_1' AS field_name    --<-- Hardcoded
         ,field_1 AS unique_value  --<-- Hardcoded
    FROM 
        eng.table_1                --<-- Hardcoded
    GROUP BY 
        field_1                    --<-- Hardcoded
    ) b
WHERE a.field_name = b.field_name

But of course, the cross join sub-select is full of hardcoded table & field names. The query joins TABLE_1, FIELD_1, but not any of the other fields in FIELD_LIST_VW. This is not what I want:

+-------------+------------+--------------+
| TABLE_NAME  | FIELD_NAME | UNIQUE_VALUE |
+-------------+------------+--------------+
| ENG.TABLE_1 | FIELD_1    | A            |
| ENG.TABLE_1 | FIELD_1    | B            |
| ENG.TABLE_1 | FIELD_1    | C            |
+-------------+------------+--------------+
(rows are missing)

Instead of hardcoding, is there a way to dynamically cross join to the unique values of each field in FIELD_LIST_VW?

I do not have CREATE TYPE privileges.

Best Answer

One approach for solving this problem is to use dynamic sql. In fact, your question is one of the example use cases:

For example, a SELECT statement that includes an identifier that is unknown at compile time (such as a table name) or a WHERE clause in which the number of subclauses is unknown at compile time.

At compile time, you don't know the table names, column names, or the number of subclauses for your query. As far as I know, you cannot write this as a simple SQL query. You'll need to have some intermediate object create the query text for you.

First let's write out a full query that does what you need for your three example rows. I propose the following:

SELECT 'ENG.TABLE_1', 'FIELD_1', FIELD_1 FROM ENG.TABLE_1
UNION
SELECT 'ENG.TABLE_1', 'FIELD_2', FIELD_2 FROM ENG.TABLE_1
UNION
SELECT 'ENG.TABLE_2', 'FIELD_A', FIELD_A FROM ENG.TABLE_2

The UNION operator will only return the distinct result set from the query so this should return the correct results. One way to generate the above query using a query against the view is to take advantage of the LISTAGG operator.

For a specified measure, LISTAGG orders data within each group specified in the ORDER BY clause and then concatenates the values of the measure column.

As a single-set aggregate function, LISTAGG operates on all rows and returns a single output row.

For the measure expression I propose building a string that adds quotes around the column names as needed. Because this is a string it gets a bit messy:

'SELECT ''' || CAST(FIELD_LIST_VW.TABLE_NAME AS VARCHAR2(30)) 
    || ''', ''' || CAST(FIELD_LIST_VW.FIELD_NAME AS VARCHAR2(30)) 
    || ''', ' || CAST(FIELD_LIST_VW.FIELD_NAME AS VARCHAR2(30)) 
    || ' FROM ' || CAST(FIELD_LIST_VW.TABLE_NAME AS VARCHAR2(30))

For the delimiter I propose the following:

' UNION '

Here is what the query looks like when you put it all together and use the view:

SELECT LISTAGG(
    'SELECT ''' || CAST(FIELD_LIST_VW.TABLE_NAME AS VARCHAR2(30)) 
    || ''', ''' || CAST(FIELD_LIST_VW.FIELD_NAME AS VARCHAR2(30)) 
    || ''', ' || CAST(FIELD_LIST_VW.FIELD_NAME AS VARCHAR2(30)) 
    || ' FROM ' || CAST(FIELD_LIST_VW.TABLE_NAME AS VARCHAR2(30))
    , ' UNION ')
WITHIN GROUP (ORDER BY NULL) into varStringToExecute
FROM FIELD_LIST_VW;

What the above query does is generates the select query that you need for each table and column from the view as a string, adds UNION between all of the queries, and concatenates the entire string together. If you need the query to build in a specific order you can use the ORDER BY clause in LISTAGG.

You can execute the query using dynamic SQL:

EXECUTE IMMEDIATE varStringToExecute;

You didn't say what you need to do with this data. If you need to just return it to the client you may want to use a cursor. Here is one set of ideas on how to accomplish that.

Note the return value of LISTAGG is VARCHAR2, which means that your query text is limited to 4000 characters with default settings. It is possible to increase that limit up to 32767 by setting the MAX_STRING_SIZE parameter, but I don't know if that's appropriate for your database. If you can't do that, there are various workarounds for creating a string over 4k characters.

Apologies if some of the code in this answer doesn't not compile. I don't have access to Oracle at the moment.