MySQL – use MySQL to serve hundreds of millions of small image files

Tags: blob, myisam, MySQL, performance

I need to serve 300,000,000 small image files totaling 1.5 TB. Unpacking these into individual files on the file system is next to impossible (we see roughly 1 MByte/sec of throughput when creating the small files).

For the time being, I have dumped the files into large 12 GB binary files and keep an index that lets me seek to the right offset and read the correct number of bytes.
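
For reference, the read path of that stop-gap is only a few lines. Here's a minimal sketch in Python, assuming a fixed-width index of (image ID, pack number, offset, length) records; the record layout and file naming are my invention, not necessarily what the poster uses.

```python
import struct

# Hypothetical index format: each record maps an image ID to
# (pack file number, byte offset, length).
INDEX_RECORD = struct.Struct("<QIQI")  # image_id, pack_no, offset, length

def read_image(index_path, packs_dir, image_id):
    """Linear scan of the index for clarity; a real index would be
    sorted, mmapped, or held entirely in memory."""
    with open(index_path, "rb") as idx:
        while True:
            rec = idx.read(INDEX_RECORD.size)
            if len(rec) < INDEX_RECORD.size:
                raise KeyError(image_id)
            iid, pack_no, offset, length = INDEX_RECORD.unpack(rec)
            if iid == image_id:
                break
    # Seek into the matching 12 GB pack file and read exactly one blob.
    with open(f"{packs_dir}/pack_{pack_no:04d}.bin", "rb") as pack:
        pack.seek(offset)
        return pack.read(length)
```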

But that solution was a stop-gap. I also need to update the files daily; generally this means adding new files, and once in a while (monthly/quarterly) doing housekeeping to get rid of old ones.

So I wonder: will I be shooting myself in the foot by trying to create a massive MyISAM table to store these 2-10 KB blobs?

On a 64-bit Linux system there's a 4 TB limit on DB file size, right? Can I split such a massive table into multiple physical files as the dataset grows? Eventually we might serve 6+ TB of small image files.

My options right now are between:

  • Dump the binary data to a large flat file and manage the indexes on my own (either with metadata in the URL or a separate index file on the webserver)
  • Use a MySQL table and add new data via fast LOAD DATA INFILE, which we have had good experience with on another 100 GB reference table (a sketch of the blob-staging step follows this list).
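
On the second option, one wrinkle is that raw blob bytes can collide with LOAD DATA INFILE's tab and newline delimiters, so a common workaround is to hex-encode the blobs in the staging file and decode with UNHEX() at load time. A hedged sketch, with made-up table and column names:

```python
import binascii

# Matching DDL / load statement (names are illustrative):
#   CREATE TABLE images (
#       id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
#       data MEDIUMBLOB NOT NULL
#   ) ENGINE=MyISAM;
#
#   LOAD DATA INFILE '/tmp/images.tsv' INTO TABLE images
#       (id, @hexdata) SET data = UNHEX(@hexdata);

def write_load_file(rows, out_path="/tmp/images.tsv"):
    """rows yields (image_id, blob_bytes) pairs; hex-encoding keeps the
    staging file plain text so the bulk load parses it safely."""
    with open(out_path, "w") as out:
        for image_id, blob in rows:
            out.write(f"{image_id}\t{binascii.hexlify(blob).decode()}\n")
```

Hex-encoding doubles the size of the staging file, but the load itself stays a single bulk statement.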

Thoughts? Cautions?

Best Answer

I wonder why any kind of DB is needed. It sounds like you'd like it if these were just files, but that the number of them and their small size are problematic.

Two strategies can help here. The first is to use a few levels of directories to partition the files into smaller clusters.
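
As a concrete illustration, here is one common sharding scheme in Python: hash the image ID and use the first hex digits to pick the directories. The two-level, 256 x 256 fan-out is an assumption; tune it so each leaf holds a few thousand files.

```python
import hashlib
import os

def shard_path(root, image_id):
    # Hash the ID so files spread evenly regardless of ID distribution.
    digest = hashlib.md5(str(image_id).encode()).hexdigest()
    return os.path.join(root, digest[:2], digest[2:4], f"{image_id}.jpg")

def store(root, image_id, blob):
    path = shard_path(root, image_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(blob)
```

With 300 million files, that fan-out works out to roughly 4,500 files per leaf directory.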

Now, some file systems may have inode issues or become very wasteful of space when the files are tiny and the block sizes are large. Indeed, if you're on Linux it may be worth spending some time looking at the alternative file systems available to see whether one fits your use case well.

A 'simple' way to avoid the issue is, for the last layer, to keep the files inside a zip archive with zero compression rather than as loose files in a leaf directory. This works almost like a directory of files but sidesteps the two issues above. It effectively gives you what you've been doing by hand, with a more standard tool-chain.
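
Here is a sketch of that leaf-zip idea using Python's standard zipfile module. ZIP_STORED writes entries uncompressed, so a read is effectively a seek plus a read, and the zip's central directory plays the role of the hand-rolled index; paths and names are illustrative.

```python
import zipfile

def add_image(leaf_zip, name, blob):
    # Append mode creates the archive on first use; ZIP_STORED means
    # the bytes land in the archive verbatim, with no compression pass.
    with zipfile.ZipFile(leaf_zip, "a", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, blob)

def get_image(leaf_zip, name):
    # The central directory lookup replaces a per-file open() on disk.
    with zipfile.ZipFile(leaf_zip, "r") as zf:
        return zf.read(name)
```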

From what you've described so far, I personally wouldn't be putting this anywhere near any kind of DB, NoSQL or traditional.

If you really want a DB solution, you could look at some form of partitioning across multiple tables to avoid any file size limits, but it wouldn't be my first option!
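
If it came to that, the least exotic form is application-level sharding: hash the key to one of N identically-shaped tables, each backed by its own files on disk. A tiny sketch, with the shard count and naming scheme invented for illustration:

```python
NUM_SHARDS = 64  # assumption; size so each shard stays under the file limit

def shard_table(image_id):
    # Map an image ID to one of N identical tables, e.g. images_00..images_63.
    return f"images_{image_id % NUM_SHARDS:02d}"

# e.g. SELECT data FROM images_21 WHERE id = 123456789
print(shard_table(123456789))  # -> images_21
```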