Mysql – Extract all unique chars from all texts across the table

character-setMySQL

I have a huge mysql table (20 columns by 320.000 rows) that contains a lot of text from a lot of languages and the problem is that I need to identify all the unique characters in that table. For example, if character "," appears one time or more than one time in that table I need this query to show it to me once.

The query should return a list with all the unique characters from this database. I don’t need the query to count them.

Can you help me finding this solution?
Thanks in advance, Catalin

Best Answer

MySQL has no way to split a cell into a bunch of values. (Yeah, it can split a piece of a cell off -- see LEFT(), etc. But not an arbitrary set.)

You would be better off doing the task in PHP or some other language -- more powerful than SQL. (Yeah, it could be done in a Stored Procedure, the the code would be really messy.)

Here is some pseudo code for PHP (etc):

declare $hash as an associative array
foreach row of SELECT * FROM tbl
    foreach cell
        split cell into characters
            $hash[$char] = 1
dump the _keys_ of $hash

(Getting the counts is a trivial enhancement.)

Be sure to work in "characters", not "bytes". The details of that depend on the language.