Mysql – How to perform a UNION query on a dynamic list of tables

database-designMySQLselectunion

My database has an unknown amount of tables, named with the same prefix followed by a random string: _ct_<random_string>.

Each of these tables has the same data structure.

I also have another table in which I store the name of each table for future reference.

The following table name is 'content_types' and provides reference for other tables:

+-----------------+
| id | table_name |
+-----------------+
| 1  | _ct_table1 |
| 2  | _ct_table2 |
| 3  | _ct_table3 |
| 4  | _ct_fdfsf  |
| 5  | _ct_535dgf |
...
+-----------------+

The following are example to the referenced tables in 'content_types'.'table_name':

_ct_table1:                              _ct_table2:
+-----------------------------------+    +-----------------------------------+
| id | title | subtitle | date      |    | id | title | subtitle | date      |
+-----------------------------------+    +-----------------------------------+
| 1  | fddfs | ...      | 2012-08-22|    | 4  | fddfs | ...      | 2012-03-18|
| 2  | bdsf  |          | 2012-08-22|    | 5  | b4sf  | dfs      | 2012-07-20|
| 3  | sfdsd |          | 2012-08-22|    | 6  | sfdsd |          | 2012-03-12|
+-----------------------------------+    +-----------------------------------+

Is it possible to create a query with a dynamic UNION statement based on the tables listed in 'content_types'.'table_name'?

Something like this:

select id,title,subtitle,date 
from 
    (
        select table_name 
        from content_types
    );

Best Answer

Yes you can. There are several ways to do this.

1> If you are using the database interactively, then just type in the query. 2> If the database is being accessed via a application or script, write a method in the application/script to create the query required and then run it. You would have a database call to get the list of table names, then create a string with the union query then execute that query. 3> In a similar vein to option 2, most stored procedure languages allow you to dynamically create a query as a string and then execute that string - Check out your DB's documentation to find out how to do this.

A Mysql example https://stackoverflow.com/questions/190776/how-to-have-dynamic-sql-in-mysql-stored-procedure. You may like to use a cursor ( http://dev.mysql.com/doc/refman/5.0/en/cursors.html ) to iterate though the list of tables names to Concat ( http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_concat ) together the query string.

Related Solutions

SQL Server – Choosing Prefixes for Normalized Data Store

There is always at least something else you should also know and almost equally, always something else you should be consciously putting a stop to. Specifically in the context of data warehousing, which is a relatively fledgling sector, leveraging relatively new technologies.

In regards to what I've seen in the real world, walking into a company for the first time and seeing what I'm understanding about your design would be genuinely tear-inducing: Tears of joy and relief. From the outset, you are well on your way to beginning what appears to be a well thought-out ( well engineered ) ETL / data warehousing system. As with the implementation of any software product, your mileage may vary as the solution grows and is consumed by the business, but fundamentally, you are on The Right Track™ ( and yes, you know what a natural key is ).

I've found there to be a number of challenges with these type of solutions, which I will touch upon to reinforce some of your decisions and perhaps lend some insight into the road ahead of you. Firstly, the number of times I've found myself in a predicament on account of a developer ( even fellow database administrators / data professionals ) misunderstanding the context of a control column ( using, for example running a process against the DateInserted column, a mere time stamp of insertion, over the DateReceived or similarly named column, intending to relate a row to a particular date of occurance ), that while I agree completely with the cautions @Aaron Bertrand raises, I feel that the prefixes for your control columns could actually be leveraged as a sort of flag to help prevent their misuse. Obvious should be obvious of course, but much like writing code in general, explicit is preferable. That said, I would almost certainly leave such prefixes out of the indexes and such ( probably even keys - PK types can and should stay in my opinion, but unless there's a real threat of DWD_SubCategories and DWF_SubCategories existing in the same schema, they really are just fluff ). I think the concern about the DWD and DWF prefixes is valid, but they'll be living in the [NDS] catalog and would serve to indicate intent, making it completely fine to use the nomenclature in that manner.

The second ( and perhaps most infuriating ) challenge is one of cross-training your coworkers. All of the software engineering, usage flags and design practice rules are completely for naught if your striving-for-paycheque-over-excellence colleagues get involved and do their less than very best ( or to be fair, are even just simply having a bad day ). Do keep in mind that large projects generally have many fingers in the pot, so it is imperative that those fingers are behaving well.

The last thing I'll touch on here is to always keep in mind the actual value of any ETL system to a business. Of the Extract, Transform and Load paradigm, the first and final letters have absolutely no business value, so you will want to work on making the development and maintenance of both the Extract and Load processes as minimal as possible - the "real" work will be done in the Transform phase, so you will want to automate the E and L steps as much as possible so that you can focus on making ( and keeping ) your solution valuable to the business unit by actively working on the transforms.

All of that said, I've only had the opportunity to work on a handful of different warehousing solutions so perhaps a more knowledgeable user could step in and remove my foot from my mouth if I need correcting. As I said initially, this is one of those areas where one can always learn or unlearn something, and I am absolutely no exception.

Oh, one more thing ( and probably the most important ) - Unit Test! Once your E and L are working as intended and you've had the opportunity to put a few domains through your T solution, get somebody to vet the results. If they're good, save the result set somewhere, so that when you make changes ( and you will, without a doubt ) you can ensure you haven't broken something, somewhere else. Again, automate this process as much as you possibly can ( it's another 0-value process to the business, until they go without it at least ;) ). I generally set up a separate schema or catalog for this purpose.

Hopefully some of what I've said will be useful to you!

As an update, @Aaron Bertrand's schema separation seems like it would be quite a good way to avoid unnecessary prefixing as well, so certainly consider that ( I know I will haha ).

MySQL – How to Perform a Query on a Dynamic List of Tables

All of this could have been accomplished with the INFORMATION_SCHEMA database

The table names are in the INFORMATION_SCHEMA

SELECT table_name FROM information_schema.tables
WHERE table_schema='conntrack';

You could then line up the tables that exist against the tabidx table

SELECT B.table_name FROM (SELECT date FROM tabidx) A
INNER JOIN (SELECT table_name FROM information_schema.tables
WHERE table_schema='conntrack') B ON A.date = B.table_name;

With these table_names, setup a long query against all tables in conntrack

PROPOSED QUERY

SET @fromdate = '2015-04-25';
SET @todate = CURRENT_DATE();
SET @given_username='L001000';
SELECT
    GROUP_CONCAT(qry SEPARATOR ' UNION ')
    INTO @sql
FROM
(
    SELECT CONCAT('SELECT ',QUOTE(table_name),' as "Date and Time"',
    ',username,''%'' srcip,''%'' scport FROM `',table_name,
    '` WHERE username=',QUOTE(@given_username)) qry
    FROM
    (
        SELECT BB.table_name FROM
        (
            SELECT date FROM tabidx
            WHERE date>=@fromdate AND date<=@todate
        ) AA
        INNER JOIN
        (
            SELECT table_name FROM information_schema.tables
            WHERE table_schema='conntrack'
        ) BB ON AA.date = BB.table_name
    ) A
) B;
PREPARE s FROM @sql; EXECUTE s; DEALLOCATE PREPARE s;

SAMPLE DATA (From Your Question)

DROP DATABASE IF EXISTS conntrack_sample;
CREATE DATABASE conntrack_sample;
USE conntrack_s
CREATE TABLE tabidx (date date, PRIMARY KEY (date));
INSERT INTO tabidx VALUES 
('2015-04-25'),('2015-04-26'),('2015-04-27'),
('2015-04-28'),('2015-04-29');
SELECT * FROM tabidx;
CREATE TABLE `0000-00-00`
(
    time time,username CHAR(7),
    srcip CHAR(15),scport INT,
    PRIMARY KEY (username,time)
);
CREATE TABLE `2015-04-25` LIKE `0000-00-00`;
CREATE TABLE `2015-04-26` LIKE `0000-00-00`;
CREATE TABLE `2015-04-27` LIKE `0000-00-00`;
CREATE TABLE `2015-04-28` LIKE `0000-00-00`;
CREATE TABLE `2015-04-29` LIKE `0000-00-00`;
INSERT INTO `2015-04-25` VALUES
('19:20:00','L001000','10.10.10.1',1304),
('19:20:00','L001001','10.10.10.2',1640),
('19:20:01','L001002','10.10.10.3',2001);
INSERT INTO `2015-04-26` VALUES
('19:21:00','L001000','10.10.10.1',1304),
('19:21:00','L001001','10.10.10.2',1640),
('19:21:01','L001002','10.10.10.3',2001);

SAMPLE DATA LOADED

mysql> DROP DATABASE IF EXISTS conntrack_sample;
Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> CREATE DATABASE conntrack_sample;
Query OK, 1 row affected (0.00 sec)

mysql> USE conntrack_sample
Database changed
mysql> CREATE TABLE tabidx (date date, PRIMARY KEY (date));
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO tabidx VALUES
    -> ('2015-04-25'),('2015-04-26'),('2015-04-27'),
    -> ('2015-04-28'),('2015-04-29');
Query OK, 5 rows affected (0.02 sec)
Records: 5  Duplicates: 0  Warnings: 0

mysql> SELECT * FROM tabidx;
+------------+
| date       |
+------------+
| 2015-04-25 |
| 2015-04-26 |
| 2015-04-27 |
| 2015-04-28 |
| 2015-04-29 |
+------------+
5 rows in set (0.00 sec)

mysql> CREATE TABLE `0000-00-00`
    -> (
    ->     time time,username CHAR(7),
    ->     srcip CHAR(15),scport INT,
    ->     PRIMARY KEY (username,time)
    -> );
Query OK, 0 rows affected (0.03 sec)

mysql> CREATE TABLE `2015-04-25` LIKE `0000-00-00`;
Query OK, 0 rows affected (0.05 sec)

mysql> CREATE TABLE `2015-04-26` LIKE `0000-00-00`;
Query OK, 0 rows affected (0.02 sec)

mysql> CREATE TABLE `2015-04-27` LIKE `0000-00-00`;
Query OK, 0 rows affected (0.02 sec)

mysql> CREATE TABLE `2015-04-28` LIKE `0000-00-00`;
Query OK, 0 rows affected (0.05 sec)

mysql> CREATE TABLE `2015-04-29` LIKE `0000-00-00`;
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO `2015-04-25` VALUES
    -> ('19:20:00','L001000','10.10.10.1',1304),
    -> ('19:20:00','L001001','10.10.10.2',1640),
    -> ('19:20:01','L001002','10.10.10.3',2001);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> INSERT INTO `2015-04-26` VALUES
    -> ('19:21:00','L001000','10.10.10.1',1304),
    -> ('19:21:00','L001001','10.10.10.2',1640),
    -> ('19:21:01','L001002','10.10.10.3',2001);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql>

PROPOSED QUERY EXECUTED

mysql> SET @fromdate = '2015-04-25';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @todate = CURRENT_DATE();
Query OK, 0 rows affected (0.02 sec)

mysql> SET @given_username='L001000';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT
    ->     GROUP_CONCAT(qry SEPARATOR ' UNION ')
    ->     INTO @sql
    -> FROM
    -> (
    ->     SELECT CONCAT('SELECT ',QUOTE(table_name),' as "Date and Time"',
    ->     ',username,''%'' srcip,''%'' scport FROM `',table_name,
    ->     '` WHERE username=',QUOTE(@given_username)) qry
    ->     FROM
    ->     (
    ->         SELECT BB.table_name FROM
    ->         (
    ->             SELECT date FROM tabidx
    ->             WHERE date>=@fromdate AND date<=@todate
    ->         ) AA
    ->         INNER JOIN
    ->         (
    ->             SELECT table_name FROM information_schema.tables
    ->             WHERE table_schema='conntrack_sample'
    ->         ) BB ON AA.date = BB.table_name
    ->     ) A
    -> ) B;
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql>

PROPOSED QUERY OUTPUT

mysql> PREPARE s FROM @sql; EXECUTE s; DEALLOCATE PREPARE s;
Query OK, 0 rows affected (0.00 sec)
Statement prepared

+---------------+----------+-------+--------+
| Date and Time | username | srcip | scport |
+---------------+----------+-------+--------+
| 2015-04-25    | L001000  | %     | %      |
| 2015-04-26    | L001000  | %     | %      |
+---------------+----------+-------+--------+
2 rows in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql>

GIVE IT A TRY !!!

NOTE: Here is what @sql has

mysql> SELECT @sql\G
*************************** 1. row ***************************
@sql: SELECT '2015-04-25' as "Date and Time",username,'%' srcip,'%' scport FROM `2015-04-25` WHERE username='L001000' UNION SELECT '2015-04-26' as "Date and Time",username,'%' srcip,'%' scport FROM `2015-04-26` WHERE username='L001000' UNION SELECT '2015-04-27' as "Date and Time",username,'%' srcip,'%' scport FROM `2015-04-27` WHERE username='L001000' UNION SELECT '2015-04-28' as "Date and Time",username,'%' srcip,'%' scport FROM `2015-04-28` WHERE username='L001000' UNION SELECT '2015-04-29' as "Date and Time",username,'%' srcip,'%' scport FROM `2015-04-29` WHERE username='L001000'
1 row in set (0.00 sec)

mysql>