Mysql – what would be the primary disadvantages of splitting the tables into table spaces and/or separate schemata

MySQL

I would like to organize the MySQL tables of a complex application into some groups, to avoid looking at a one big bin of tables. One big bin is harder to administrate when e.g. I would like to reset certain groups of related tables, or if I'd later wish to move a group of related tables to a different MySQL server altogether, and it is also more error prone when developing new DDL code.

I can apply a sense of grouping with my own table name prefixing "convention", but thought about using table spaces or even separate schemata. What would be the primary disadvantages of using separate table spaces or schemata for applying a sense of grouping upon my tables?

Best Answer

DISADVANTAGE #1

InnoDB only has one Buffer Pool. The benefits of partitioning or compartmentalizing database tables for different clients can still be bottlenecked by database buffers (log buffer, InnoDB Buffer Pool)

DISADVANTAGE #2

Scaling becomes dependent on manual operation. This applies to scaling up disks, hardware (disk controllers), and memory.

DISADVANTAGE #3

More tables means more open file handles to cache distinct tables. Requires more flush to disk.

DISADVANTAGE #4

Introducing new groups of related tables regurgitates the first three(3) disadvantages and exacerbates the overall design.

See my other posts on such setups and other drawbacks

Nov 15, 2011 : One Big Database vs. Several Smaller Ones
Mar 11, 2011 : What problems will I get creating a database per customer?

SUGGESTIONS

You might have to compensate these deficiencies with lots of RAM with the InnoDB Buffer Pool and the Log Buffer getting the lion's share.

You could create a hybrid setup where InnoDB Log Files are on one disk and InnoDB System Tablespace is on another, and all the data on a third disk. This concept is well expressed in YOSHINORI MATSUNOBU'S BLOG (FaceBook DB Engineer).

Related Solutions

Mysql – Cannot write to thesql database while select query in process

Your first query is asking for 20,000 rows. what is being victimized here ???

Your Innodb Buffer Pool and Log Files.

Because of SELECT * FROM history WHERE log LIKE '%a%' ORDER BY log ASC LIMIT 0, 20000;, all the data pages for the history table is hogging InnoDB resources.

Since you are using Amazon RDS, you cannot resize the InnoDB Log Files. They are always 128M under all RDS MySQL models. Your buffer pool may or may not be full (You have 5882511360 set for it, which is 5610M or 5.4785G). Naturally, bigger model means more IOPs.

You can run this to see how full the buffer pool is

SELECT FORMAT(A.num * 100.0 / B.num,2) BufferPoolFullPct FROM
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_data') A,
(SELECT variable_value num FROM information_schema.global_status
WHERE variable_name = 'Innodb_buffer_pool_pages_total') B;

Another aspect I see is the log field (TEXT). You are asking for an ORDER BY on this big column.

You need to change the first query to throttle the SELECT. In other words, fetch smaller chunks of history.

If you cannot change the data or the query, there is nothing you can configure except to shift to m1.xlarge (Comes with 11922309120 as the Buffer Pool Size = 11370M = 11.1035G) although log file size would still be stuck at 128M.

Infrastructure aside, it is possible for SELECTs to block INSERTs, UPDATEs, and DELETEs in InnoDB.

I wrote about this before

MySQL Recursive Query – Using Stored Procedure and CURSOR

So the three questions I'm hoping someone can answer are:

1.Should I be using a CURSOR here or is there an alternative that will get the same recursive result?

No, you should not. The CURSOR keep being opened over and over. Thinking of the overhead make me cringe. Personally, I stay away from CURSORs.

2.How can I get the results back in a single result set so that it can be used in the same fashion as a subselect?

3.What is the proper way of using the results from the CALL because I can't at least as far as I've tried get the sample SELECT statement above to work? I believe this is because I can't use CALL inline but I'm not sure.

I would like to bail you out of this by suggesting something I learned from my college days when learning about data structures. You need to perform tree traversal using a queue. This allows you to start at a root and express all descendants of a tree.

The algorithm goes like this

STEP 01 : Start with an empty queue
STEP 02 : Dequeue Node From the Front of the Queue
STEP 03 : Enqueue All Children of the Latest Node
STEP 04 : Process Info From the Latest Node
STEP 05 : If the Queue is Not Empty, Go Back to STEP 02
STEP 06 : All Done

This allows you to traverse a recursive structure without using Programmatic Recursion. At this point, you are probably asking: How can I traverse a tree structure without recursion?

I wrote a post about how to script three Stored Procedures that can using a loop in a single CALL that will traverse a table with nodes and its parent in a table:

GetParentIDByID
GetAncestry
GetFamilyTree

The post is Find highest level of a hierarchical field: with vs without CTEs (Oct 24, 2011). It contains the Stored Procedures already written that will traverse the following table structure:

+------------+--------------+------+-----+---------+----------------+
| Field      | Type         | Null | Key | Default | Extra          |
+------------+--------------+------+-----+---------+----------------+
| id         | int(11)      | NO   | PRI | NULL    | auto_increment | 
| parent_id  | int(11)      | YES  |     | NULL    |                | 
| name       | varchar(255) | YES  |     | NULL    |                | 
| notes      | text         | YES  |     | NULL    |                | 
+------------+--------------+------+-----+---------+----------------+

Please read the code carefully and apply the principles.

Give it a Try !!!