SUGGESTION #1 : Use Distribution Masters
A Distribution Master is a MySQL slave with log-bin and log-slave-updates enabled that contains only tables using the BLACKHOLE storage engine. You can apply replicate-do-db options on the Distribution Master so that the binary logs it writes contain only the database schema(s) you want binlogged. In this way you reduce the size of the binlogs the Distribution Master ships downstream.
You can set up a Distribution Master as follows:
1. mysqldump your database(s) using the --no-data option to generate a schema-only dump.
2. Load the schema-only dump into the Distribution Master.
3. Convert every table in the Distribution Master to the BLACKHOLE storage engine.
4. Set up replication to the Distribution Master from a master with real data.
5. Add replicate-do-db option(s) to /etc/my.cnf on the Distribution Master.
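The key options can be sketched in the Distribution Master's /etc/my.cnf; the server-id and database names here are placeholders for illustration:

```ini
# /etc/my.cnf on the Distribution Master (illustrative values)
[mysqld]
server-id         = 2          # must differ from the real master's server-id
log-bin           = mysql-bin  # write a binary log of its own
log-slave-updates              # re-log replicated events into that binlog
replicate-do-db   = mydb1      # only these schema(s) end up in the binlog
replicate-do-db   = mydb2
```

With log-slave-updates plus replicate-do-db, the Distribution Master's binlog contains only the filtered schemas, which is what keeps the outgoing binlogs small.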
For steps 2 and 3 you could instead edit the schema-only dump, replacing ENGINE=MyISAM and ENGINE=InnoDB with ENGINE=BLACKHOLE, and then load that edited dump into the Distribution Master.
For step 3 alone, if you want to script the conversion of all MyISAM and InnoDB tables to BLACKHOLE in the Distribution Master, run the following query and send its output to a text file:
mysql -h... -u... -p... -A --skip-column-names -e"SELECT CONCAT('ALTER TABLE ',table_schema,'.',table_name,' ENGINE=BLACKHOLE;') BlackholeConversion FROM information_schema.tables WHERE table_schema NOT IN ('information_schema','mysql') AND engine <> 'BLACKHOLE'" > BlackholeMaker.sql
An added bonus of scripting the conversion of tables to the BLACKHOLE storage engine is that MEMORY storage engine tables are converted as well. While MEMORY tables do not take up disk space for data storage, they do take up memory. Converting MEMORY tables to BLACKHOLE keeps memory on the Distribution Master uncluttered.
As long as you do not send any DDL to the Distribution Master, you can transmit any DML (INSERT, UPDATE, DELETE) you like before letting clients replicate just the database(s) they want.
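A minimal sketch of what BLACKHOLE does with DML (the table name here is made up for illustration):

```sql
-- Illustrative only: a BLACKHOLE table accepts DML, writes it to the
-- binary log (when log-bin is enabled), but stores nothing locally.
CREATE TABLE demo.t (id INT, msg VARCHAR(20)) ENGINE=BLACKHOLE;

INSERT INTO demo.t VALUES (1, 'hello');      -- logged, then discarded
UPDATE demo.t SET msg = 'bye' WHERE id = 1;  -- likewise logged

SELECT * FROM demo.t;  -- always returns an empty result set
```

This is why the Distribution Master stays tiny on disk while still producing a complete binlog for its downstream slaves.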
I already wrote a post in another StackExchange site that discusses using a Distribution Master.
SUGGESTION #2 : Use Smaller Binary Logs and Relay Logs
If you set max_binlog_size to something ridiculously small, then binlogs can be collected and shipped out in smaller chunks. There is also a separate option to set the size of relay logs, max_relay_log_size. If max_relay_log_size = 0, it will default to whatever max_binlog_size is set to.
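For example, in /etc/my.cnf (the values here are illustrative; check your version's documented minimum and maximum for max_binlog_size before copying them):

```ini
# /etc/my.cnf (illustrative values)
[mysqld]
max_binlog_size    = 10M  # rotate binlogs at ~10MB instead of the 1GB default
max_relay_log_size = 0    # 0 = relay logs fall back to max_binlog_size
```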
SUGGESTION #3 : Use Semisynchronous Replication (MySQL 5.5 only)
Set up your main database and multiple Distribution Masters on MySQL 5.5. Enable Semisynchronous Replication so that the main database can quickly ship binlogs to the Distribution Masters. If ALL your slaves are Distribution Masters, you may not need Semisynchronous Replication or MySQL 5.5. If any slaves other than Distribution Masters hold real data for reporting, high availability, passive standby, or backup purposes, then go with MySQL 5.5 in conjunction with Semisynchronous Replication.
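A sketch of enabling the semisync plugins on MySQL 5.5 (plugin file names assume a Linux build; adjust the timeout to taste):

```sql
-- On the main database (master side):
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 1000;  -- ms; falls back to async after this

-- On each Distribution Master (slave side):
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
-- Restart the I/O thread so it registers as a semisync slave:
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;
```

To make the settings survive a restart, also add the corresponding rpl_semi_sync_* options to my.cnf on each server.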
SUGGESTION #4 : Use Statement-Based Binary Logging NOT Row-Based
If an SQL statement updates multiple rows in a table, Statement-Based Binary Logging (SBBL) stores only the SQL statement. The same statement under Row-Based Binary Logging (RBBL) records a change event for every affected row. Transmitting single statements rather than per-row changes is what makes SBBL binlogs smaller than RBBL binlogs.
Another problem is using RBBL in conjunction with replicate-do-db when table names have the database name prepended. This cannot be good for a slave, especially for a Distribution Master. Therefore, make sure all DML does not have a database name and a period in front of any table names.
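To illustrate why qualified table names are risky with replicate-do-db: under statement-based logging, the filter is checked against the session's default database, not the database named in the statement. The database and table names below are hypothetical:

```sql
-- Assume the slave has replicate-do-db = mydb.

USE mydb;
UPDATE t SET c = 1;        -- default database is mydb: replicated

USE otherdb;
UPDATE mydb.t SET c = 1;   -- default database is otherdb: with
                           -- statement-based logging this statement is
                           -- NOT replicated, even though it touches mydb.t
```

Keeping DML unqualified (always issuing USE first) avoids this silent mis-filtering.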
It depends on how you use your database. If it is read-heavy with few writes (like a blog or a newspaper), you could have one MySQL server for writes and two for reads. You would set it up so the write server is a replication master and the two read servers are slaves.
All application servers need to know about both the write server and one of the read servers; that way, when you balance load across the application servers you automatically balance reads between the MySQL servers. It's also easy to add another MySQL+application server pair once demand grows.
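The master/slave split above boils down to a few my.cnf lines; server IDs and log names here are illustrative:

```ini
# Write server (master) -- /etc/my.cnf
[mysqld]
server-id = 1
log-bin   = mysql-bin   # required so slaves can replicate from it
```

```ini
# Each read server (slave) -- /etc/my.cnf, unique server-id per slave
[mysqld]
server-id = 2
read_only = 1           # reject writes from non-SUPER users
```

Setting read_only on the slaves is a cheap safety net: an application accidentally pointed at a read server fails fast instead of silently diverging from the master.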
If, on the other hand, you have a write-heavy site (I can't even find an example), you need to do some research on sharding. It's normally not recommended unless you really need it.
Best Answer
Regardless of how you solve the problem, you have to go through some sort of "router". And that may be the least secure part.
Separate computer for each customer -- you need to tell user which machine to go to.
Separate process or API or VM on same machine -- user must log into the appropriate path.
Separate databases (one per customer) on single MySQL server -- either API or MySQL could handle security.
Single database, but separate rows in the tables -- Now the API must take responsibility for security. That is, the user must not have direct access to MySQL, only indirect through an API layer.
Each has "privacy by design".
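For the last option (separate rows per customer), the API layer typically scopes every query by a tenant column; the schema and names below are hypothetical:

```sql
-- Hypothetical schema: every row carries its owning customer.
CREATE TABLE orders (
  id          INT PRIMARY KEY,
  customer_id INT NOT NULL,
  total       DECIMAL(10,2),
  KEY (customer_id)          -- index so per-tenant scans stay cheap
);

-- The API layer must append the tenant filter to every statement;
-- customers never get to supply raw SQL of their own.
SELECT id, total FROM orders WHERE customer_id = 42;
```

The security of this design rests entirely on the API never omitting that WHERE clause, which is why the other options (separate databases or machines) push the isolation down a layer instead.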