It's going to be a challenge to apply a uniform policy across such a varied estate. My first port of call would be to identify the biggest consumers of key resources (CPU, memory, IO) on each instance, with a view to isolating the problematic databases from the others. Glenn Berry's diagnostic DMV queries would be a good starting point.
Who's using most of the memory (buffer pool) on each instance?
SELECT DB_NAME(database_id) AS [Database Name],
       COUNT(*) * 8 / 1024.0 AS [Cached Size (MB)] -- buffer pages are 8 KB each
FROM sys.dm_os_buffer_descriptors
WHERE database_id > 4 -- exclude the system databases (master, tempdb, model, msdb)
  AND database_id <> 32767 -- exclude the resource database
GROUP BY DB_NAME(database_id)
ORDER BY [Cached Size (MB)] DESC OPTION (RECOMPILE);
How long are pages staying in the buffer pool?
SELECT cntr_value AS [Page Life Expectancy]
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE N'%Buffer Manager%' -- LIKE copes with named instances (MSSQL$InstanceName:Buffer Manager)
  AND counter_name = N'Page life expectancy' OPTION (RECOMPILE);
Who's generating the most IO?
-- File-level IO statistics for the current database; pass (NULL, NULL)
-- to sys.dm_io_virtual_file_stats to cover every database on the instance.
-- NULLIF guards against divide-by-zero on files with no activity yet.
SELECT DB_NAME(DB_ID()) AS [Database Name], [file_id], num_of_reads, num_of_writes,
       io_stall_read_ms, io_stall_write_ms,
       CAST(100. * io_stall_read_ms / NULLIF(io_stall_read_ms + io_stall_write_ms, 0) AS DECIMAL(10,1)) AS [IO Stall Reads Pct],
       CAST(100. * io_stall_write_ms / NULLIF(io_stall_write_ms + io_stall_read_ms, 0) AS DECIMAL(10,1)) AS [IO Stall Writes Pct],
       (num_of_reads + num_of_writes) AS [Reads + Writes], num_of_bytes_read, num_of_bytes_written,
       CAST(100. * num_of_reads / NULLIF(num_of_reads + num_of_writes, 0) AS DECIMAL(10,1)) AS [# Reads Pct],
       CAST(100. * num_of_writes / NULLIF(num_of_reads + num_of_writes, 0) AS DECIMAL(10,1)) AS [# Writes Pct],
       CAST(100. * num_of_bytes_read / NULLIF(num_of_bytes_read + num_of_bytes_written, 0) AS DECIMAL(10,1)) AS [Read Bytes Pct],
       CAST(100. * num_of_bytes_written / NULLIF(num_of_bytes_read + num_of_bytes_written, 0) AS DECIMAL(10,1)) AS [Written Bytes Pct]
FROM sys.dm_io_virtual_file_stats(DB_ID(), NULL) OPTION (RECOMPILE);
Which databases are consuming the most CPU, and which queries in particular?
-- Total CPU (worker time) per database, from the plan cache
SELECT DB_NAME(qt.dbid) AS DatabaseName,
       SUM(qs.total_worker_time) AS [TotalWorkerTime]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
WHERE DB_NAME(qt.dbid) IS NOT NULL
GROUP BY qt.dbid
ORDER BY SUM(qs.total_worker_time) DESC
OPTION (RECOMPILE);
-- Top cached statements by total CPU
SELECT TOP (1000)
       DB_NAME(qt.dbid) AS DatabaseName,
       qt.[text] AS [Query Text],
       qs.total_worker_time AS [TotalWorkerTime]
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.[sql_handle]) AS qt
WHERE DB_NAME(qt.dbid) IS NOT NULL
ORDER BY qs.total_worker_time DESC
OPTION (RECOMPILE);
I am not very familiar with your needs, but perhaps storing each data point in the database is overkill. It sounds almost like storing an image library by recording each pixel as a separate row in a relational database.
As a general rule, storing binary data in databases is wrong most of the time; there is usually a better way to solve the problem. While it is not inherently wrong to store binary data in a relational database, the disadvantages often outweigh the gains. Relational databases, as the name suggests, are best suited to storing relational data, and binary data is not relational. It adds size (often significantly) to databases, can hurt performance, and may leave you asking questions about how to maintain billion-record MySQL instances.
The good news is that there are stores especially well suited to binary data. One of them, though not always the obvious choice, is your file system! Simply come up with a directory and file naming scheme for your binary files, and store those paths in your MySQL database together with any other data that may yield value through querying.
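To make that concrete, here is a minimal sketch of the file-system approach; the table and column names (runs, data_file, and so on) are illustrative, not taken from your schema:

-- Hypothetical layout: the raw binary data lives on disk; MySQL keeps
-- the queryable metadata plus a pointer (a relative path) to each file.
CREATE TABLE runs (
    run_id     INT UNSIGNED NOT NULL AUTO_INCREMENT,
    instrument VARCHAR(64)  NOT NULL,
    started_at DATETIME     NOT NULL,
    data_file  VARCHAR(255) NOT NULL, -- e.g. '2011/03/run_000123.dat' under an agreed data root
    PRIMARY KEY (run_id),
    UNIQUE KEY uq_data_file (data_file)
) ENGINE=InnoDB;

Queries then filter on the metadata columns as usual, and the application opens the file named by data_file only when it actually needs the raw points.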
Another approach would be to use a document-oriented store for your datapoint (and perhaps spectra) data, and MySQL for the runs (or perhaps putting the runs into the same store as the rest); a sketch of the link between the two follows.
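If you went that route, the MySQL side only needs to carry the key the document store issued for each run's datapoints. Building on the hypothetical runs table sketched above (the column name is likewise made up):

-- Hypothetical link column: datapoints_doc_id holds whatever identifier
-- your document store assigns (e.g. a MongoDB ObjectId rendered as hex).
ALTER TABLE runs
    ADD COLUMN datapoints_doc_id VARCHAR(64) NOT NULL,
    ADD UNIQUE KEY uq_datapoints_doc_id (datapoints_doc_id);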
Best Answer
Both DBAs are right, in a way. There are no limits on the number of schemas except those imposed by the filesystem: in MySQL/MariaDB a schema is basically a directory, and for both MyISAM and InnoDB it makes absolutely no difference, in terms of performance or features, whether a table lives in one schema or another (you can join two tables in different schemas just as fast, or as slow, as if they were in the same schema). I've managed instances with hundreds of thousands of schemas, and the multiple directories were not the performance bottleneck. A schema is a pure identifier. While you can run into some issues, those are indirect consequences of having more or fewer schemas: for example, in 5.6, a slave can apply changes in parallel only when the rows belong to different schemas.
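For instance, a cross-schema join is written and executed exactly like a same-schema one; the schema and table names here are made up:

SELECT o.order_id, c.name
FROM sales.orders AS o
JOIN crm.customers AS c
  ON c.customer_id = o.customer_id;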
Why do I say that both are right, in a way? Because having too many objects in the database is a very common source of issues. When you have lots of databases, you usually have 10x or more as many tables, and that may mean thousands of open file descriptors (which require memory), thousands of data dictionary cache entries, millions of open table cache entries, and so on, which can genuinely lead to a CPU and, above all, memory bottleneck.
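You can gauge how close an instance is to that point with the standard status counters and cache settings; what counts as healthy depends entirely on workload and available RAM:

-- Tables currently held open vs. tables ever opened: a steadily growing
-- Opened_tables suggests the cache is too small for the object count.
SHOW GLOBAL STATUS LIKE 'Open%tables';
-- The caches that bound those numbers (and consume memory as they grow).
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
SHOW GLOBAL VARIABLES LIKE 'table_definition_cache';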