How to pivot data in SQLite

pivotsqlite

I've a table which contains text data containing a bunch of translations for different language. Each translation is for a specific label.

I need to generate a pivot table so to get quickly what's missing.

An example of the records is

1, en, hello
1, fr, bonjour
1, es, hola
2, en, how are you
2, fr, 
3, es, come es stas

Although translations should be always there for all language, I'm not 100% sure this is the case. So missing fields have to be considered.

The desired outcome is this

|ID|EN|FR|ES|
|1|hello|bonjour|hola|
|2|how are you| |come es stas|

The challenge I have is that the column order may not always be the same when the database is populated, so in theory I should have a dynamic list of fields.

There is not a direct PIVOT function in SQLite, so I started experimenting with the group_concat obtaining a comma separated string.

SELECT DISTINCT language, group_concat(word, ',') OVER (PARTITION BY language) AS group_concat
FROM vocabulary;

I can ran past the results later on in Python if needed; the issue is that any missing value is not appending an empty item thus shifting all the concatenation by n, thus making this solution not valid.

I have attempted also to use the filter clause in the select predicate (though this mean hardcoding the columns), but I was not able to succeed.

Any idea on how this can be achieved?

Best Answer

The following query will return all IDs that do not have all translations:

SELECT id, COUNT(*) AS count
FROM vocabulary
GROUP BY id
HAVING count < (SELECT COUNT(DISTINCT language)
                FROM vocabulary);

If you want to know which languages are missing for each ID, use a compound query to look up those languages that are in the list of all languages but not in those for this ID:

WITH missing_ids AS (
  SELECT id, COUNT(*) AS count
  FROM vocabulary
  GROUP BY id
  HAVING count < (SELECT COUNT(DISTINCT language)
                  FROM vocabulary)
)
SELECT id,
       (SELECT group_concat(language)
        FROM (SELECT DISTINCT language FROM vocabulary
              EXCEPT
              SELECT language FROM vocabulary WHERE id = missing_ids.id)
       ) AS missing_languages
FROM missing_ids;
Related Question