Mysql – Optimizing the design of a workflow engine database

database-designMySQLoptimization

I need help optimizing this design because I don't think it's going to scale well.

Background info

The database is for a workflow engine app. End users create Transactions with different attributes, and a transaction goes through a series of steps based on a subset of its attributes (right now the subset is 3, but it could change in the future).

The number and order of the steps the Transaction goes through comes from routes that I defined. These routes almost never change.

This is the structure of the tables that are used to find a route.

Route Structure

This is what the Transaction table looks like.

CREATE TABLE `Transaction` (
 `TID` int(11) NOT NULL AUTO_INCREMENT,
 `Type` varchar(30) COLLATE utf8mb4_unicode_ci NOT NULL,
 `WavPost` varchar(30) COLLATE utf8mb4_unicode_ci NOT NULL,
 `EmployeeClass` char(1) COLLATE utf8mb4_unicode_ci NOT NULL,
 PRIMARY KEY (`TID`)
)

This is a view I created to generate all possible routes (route_vw)

  select 
    `d`.`facstaff` AS `facstaff`, 
    `f`.`waveposted` AS `waveposted`, 
    `h`.`tempreg` AS `tempreg`, 
    `j`.`step_name` AS `step_name`, 
    `i`.`route_step_id` AS `route_step_id`, 
    `i`.`route_id` AS `route_id`, 
    `i`.`step_id` AS `step_id`, 
    `i`.`ndx` AS `ndx`, 
    `i`.`on_time` AS `on_time`, 
    `i`.`behind` AS `behind`, 
    `i`.`late` AS `late` 
from 
    (
        (
            (
                (
                    (
                        (
                            (
                                (
                                    `route_defn` `a` 
                                    join `fac_staff_list` `c` on(
                                        (
                                            `a`.`fac_staff_list_id` = `c`.`fac_staff_list_id`
                                        )
                                    )
                                ) 
                                join `fac_staff` `d` on(
                                    (
                                        `c`.`fac_staff_list_id` = `d`.`fac_staff_list_id`
                                    )
                                )
                            ) 
                            join `wave_post_list` `e` on(
                                (
                                    `a`.`wave_post_list_id` = `e`.`wave_post_list_id`
                                )
                            )
                        ) 
                        join `wave_post` `f` on(
                            (
                                `e`.`wave_post_list_id` = `f`.`wave_post_list_id`
                            )
                        )
                    ) 
                    join `temp_reg_list` `g` on(
                        (
                            `a`.`temp_reg_list_id` = `g`.`temp_reg_list_id`
                        )
                    )
                ) 
                join `temp_reg` `h` on(
                    (
                        `g`.`temp_reg_list_id` = `h`.`temp_reg_list_id`
                    )
                )
            ) 
            join `route_step` `i` on(
                (
                    `a`.`route_defn_id` = `i`.`route_id`
                )
            )
        ) 
        join `step_defn` `j` on(
            (
                `i`.`step_id` = `j`.`step_defn_id`
            )
        )
    )

This is how I find the route for a (the) Transaction(s).

select *
from `Transaction` `A`
join `route_vw` `B`
on (
    (`B`.`facstaff` = `A`.`EmployeeClass`) 
    and (`B`.`waveposted` = `A`.`WavPost`) 
    and (`B`.`tempreg` = `A`.`Type`)
)

Problem

Array
(
    [EVENT_ID] => 4
    [Duration] => 0.003877
    [SQL_TEXT] => SELECT *
FROM Transaction A
INNER JOIN route_vw D on (
A.EmployeeClass = D.facstaff AND
A.WavPost = D.waveposted AND
A.Type = D.tempreg 
)
    [DIGEST_TEXT] => SELECT `d` . `facstaff` AS `facstaff` , `f` . `waveposted` AS `waveposted` , `h` . `tempreg` AS `tempreg` , `j` . `step_name` AS `step_name` , `i` . `route_step_id` AS `route_step_id` , `i` . `route_id` AS `route_id` , `i` . `step_id` AS `step_id` , `i` . `ndx` AS `ndx` , `i` . `on_time` AS `on_time` , `i` . `behind` AS `behind` , `i` . `late` AS `late` FROM ( ( ( ( ( ( ( ( `UchihaTigerTalent` . `route_defn` `a` JOIN `UchihaTigerTalent` . `fac_staff_list` `c` ON ( ( `a` . `fac_staff_list_id` = `c` . `fac_staff_list_id` ) ) ) JOIN `UchihaTigerTalent` . `fac_staff` `d` ON ( ( `c` . `fac_staff_list_id` = `d` . `fac_staff_list_id` ) ) ) JOIN `UchihaTigerTalent` . `wave_post_list` `e` ON ( ( `a` . `wave_post_list_id` = `e` . `wave_post_list_id` ) ) ) JOIN `UchihaTigerTalent` . `wave_post` `f` ON ( ( `e` . `wave_post_list_id` = `f` . `wave_post_list_id` ) ) ) JOIN `UchihaTigerTalent` . `temp_reg_list` `g` ON ( ( `a` . `temp_reg_list_id` = `g` . `temp_reg_list_id` ) ) ) JOIN 
    [NO_INDEX_USED] => 0
    [NO_GOOD_INDEX_USED] => 0
    [ROWS_AFFECTED] => 0
    [ROWS_SENT] => 214
    [ROWS_EXAMINED] => 820
)

These stats were taken when Transaction had 14 rows. Based on the ROWS_EXAMINED I don't think this method is going to scale up well. Is there a better way to write this so that I don't have to examine so many rows?

Note: I uploaded a db-fiddle.com demo.

Update 1

Question – "I don't understand your 1-column "*_list" tables."

Answer – The *_lists tables allows me to specify and reuse a set/list of values that a route should be used for.

E.G

enter image description here
enter image description here

Route #4 (route_defn.route_defn_id = 4) is applicable to Staff transactions.

Route #5 is applicable to Faculty transactions.

A route with fac_staff_list_id = 3 is applicable to both Faculty and Staff transactions. * Note there are no routes with a fac_staff_list_id = 3.

This logic and structure applies to temp_reg_list and wave_post_list as well.

Update 2

"No intermediate table (*_list) is needed in a 1:many mapping. The fac_staff table simply needs a column route_defn_id. Get rid of the intervening fac_staff_list. (Ditto for other 1:many relationships.)" – @rick-james

"Please provide EXPLAIN SELECT … and SHOW CREATE TABLE" – @rick-james

The updated schema below is based on the suggestions from @rick-james.

enter image description here

show create view route_vw

CREATE ALGORITHM=MERGE DEFINER=`root`@`localhost` SQL SECURITY INVOKER VIEW `route_vw` AS select `fac_staff`.`facstaff` AS `facstaff`,`wave_post`.`waveposted` AS `waveposted`,`temp_reg`.`tempreg` AS `tempreg`,`route_step`.`route_step_id` AS `route_step_id`,`route_step`.`route_id` AS `route_id`,`route_step`.`step_id` AS `step_id`,`route_step`.`ndx` AS `ndx`,`route_step`.`on_time` AS `on_time`,`route_step`.`behind` AS `behind`,`route_step`.`late` AS `late` from ((((`route_defn` join `fac_staff` on((`route_defn`.`route_defn_id` = `fac_staff`.`route_defn_id`))) join `wave_post` on((`route_defn`.`route_defn_id` = `wave_post`.`route_defn_id`))) join `temp_reg` on((`route_defn`.`route_defn_id` = `temp_reg`.`route_defn_id`))) join `route_step` on((`route_defn`.`route_defn_id` = `route_step`.`route_id`)))

show create table route_defn

CREATE TABLE `route_defn` (
 `route_defn_id` int(11) NOT NULL AUTO_INCREMENT,
 PRIMARY KEY (`route_defn_id`)
) ENGINE=InnoDB AUTO_INCREMENT=11 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

show create table fac_staff

CREATE TABLE `fac_staff` (
  `fs_id` int(11) NOT NULL AUTO_INCREMENT,
  `route_defn_id` int(11) NOT NULL,
  `facstaff` varchar(1) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`fs_id`),
  UNIQUE KEY `route_defn_id` (`route_defn_id`,`facstaff`),
  KEY `facstaff` (`facstaff`),
  CONSTRAINT `fac_staff_ibfk_1` FOREIGN KEY (`route_defn_id`) REFERENCES `route_defn` (`route_defn_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=27 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

show create table wave_post

CREATE TABLE `wave_post` (
  `wp_id` int(11) NOT NULL AUTO_INCREMENT,
  `route_defn_id` int(11) NOT NULL,
  `waveposted` varchar(10) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`wp_id`),
  UNIQUE KEY `route_defn_id` (`route_defn_id`,`waveposted`),
  KEY `waveposted` (`waveposted`)
) ENGINE=InnoDB AUTO_INCREMENT=36 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

show create table Transaction

CREATE TABLE `Transaction` (
  `TID` int(11) NOT NULL AUTO_INCREMENT,
  `Type` varchar(30) COLLATE utf8mb4_unicode_ci NOT NULL,
  `RequestType` varchar(30) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `WavPost` varchar(30) COLLATE utf8mb4_unicode_ci NOT NULL,
  `IncSupPosition` int(8) unsigned zerofill NOT NULL,
  `IncSupEmplid` int(6) unsigned zerofill NOT NULL,
  `Supervisor` varchar(55) COLLATE utf8mb4_unicode_ci NOT NULL,
  `Phone` varchar(10) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `Location` varchar(200) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `HireMgrName` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `HireMgrEmail` varchar(191) COLLATE utf8mb4_unicode_ci NOT NULL,
  `Comment` varchar(1000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `Title` varchar(100) COLLATE utf8mb4_unicode_ci NOT NULL,
  `SalaryStart` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL,
  `PositionJustification` varchar(1500) COLLATE utf8mb4_unicode_ci NOT NULL,
  `InternalComparison` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `AdditionalDuties` varchar(1000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `ReducedTask` varchar(500) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `PositionNumber` int(8) unsigned zerofill DEFAULT NULL,
  `StatePosition` varchar(8) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `JobReqID` varchar(8) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `JobCode` varchar(6) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `DepartmentID` int(4) unsigned zerofill NOT NULL,
  `DepartmentName` varchar(200) COLLATE utf8mb4_unicode_ci NOT NULL,
  `BudgetCenter` varchar(4) COLLATE utf8mb4_unicode_ci NOT NULL,
  `FullPart` varchar(1) COLLATE utf8mb4_unicode_ci NOT NULL,
  `StandardHours` float NOT NULL,
  `IncumbentEmplid` int(6) unsigned zerofill DEFAULT NULL,
  `IncumbentName` varchar(200) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `PreHireEscalate` int(11) NOT NULL DEFAULT ''1'',
  `CountyCode` int(2) unsigned zerofill NOT NULL,
  `StateCode` varchar(2000) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `EmployeeClass` char(1) COLLATE utf8mb4_unicode_ci NOT NULL,
  `SubmittedBy` varchar(8) COLLATE utf8mb4_unicode_ci NOT NULL,
  `TimeStamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`TID`),
  UNIQUE KEY `TID` (`TID`,`Type`,`WavPost`,`EmployeeClass`),
  KEY `BudgetCenter` (`BudgetCenter`),
  KEY `HireMgrEmail` (`HireMgrEmail`),
  KEY `SubmittedBy` (`SubmittedBy`),
  KEY `Type` (`Type`),
  KEY `EmployeeClass` (`EmployeeClass`),
  KEY `WavPost` (`WavPost`),
  KEY `Type_2` (`Type`,`WavPost`,`EmployeeClass`)
) ENGINE=InnoDB AUTO_INCREMENT=6380 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

show create table temp_reg

CREATE TABLE `temp_reg` (
  `tr_id` int(11) NOT NULL AUTO_INCREMENT,
  `route_defn_id` int(11) NOT NULL,
  `tempreg` varchar(50) COLLATE utf8mb4_unicode_ci NOT NULL,
  PRIMARY KEY (`tr_id`),
  UNIQUE KEY `route_defn_id` (`route_defn_id`,`tempreg`),
  KEY `tempreg` (`tempreg`),
  KEY `rotue_defn_id` (`route_defn_id`),
  CONSTRAINT `temp_reg_ibfk_1` FOREIGN KEY (`route_defn_id`) REFERENCES `route_defn` (`route_defn_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=84 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

show create table route_step

CREATE TABLE `route_step` (
  `route_step_id` int(11) NOT NULL AUTO_INCREMENT,
  `route_id` int(11) NOT NULL,
  `step_id` int(11) NOT NULL,
  `ndx` int(11) NOT NULL,
  `on_time` tinyint(1) DEFAULT NULL,
  `behind` tinyint(1) DEFAULT NULL,
  `late` tinyint(1) DEFAULT NULL,
  PRIMARY KEY (`route_step_id`),
  UNIQUE KEY `route_id_2` (`route_id`,`step_id`,`ndx`),
  KEY `step_id` (`step_id`),
  KEY `ndx` (`ndx`),
  KEY `route_id` (`route_id`),
  CONSTRAINT `route_step_ibfk_1` FOREIGN KEY (`step_id`) REFERENCES `step_defn` (`step_defn_id`) ON DELETE CASCADE ON UPDATE CASCADE,
  CONSTRAINT `route_step_ibfk_2` FOREIGN KEY (`route_id`) REFERENCES `route_defn` (`route_defn_id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=34 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci

Best Answer

1 route is applicable many fac_staff types.

No intermediate table (*_list) is needed in a 1:many mapping. The fac_staff table simply needs a column route_defn_id. Get rid of the intervening fac_staff_list. (Ditto for other 1:many relationships.)