MySQL Rows to Columns to HBase (using Sqoop)

hadoopMySQLpivot

I have a table that looks like this in MySQL:

id | date | num | avg

I have found that I can use this for rows to columns and it works so far:

SELECT 
    CONCAT(id, '-', date),
     MAX(IF(`num` = 0, avg, NULL)) num0 
FROM table 
GROUP BY 
    id, 
    date;

The problem is I have 4000 'num' values (values are integers from 0 to 4000) and I do not want to write 4000 MAX statements manually. Is there anyway I can put a loop in there somehow? Ideally, I would like to have 4030 columns…

Edit:

I have edited my post to give some more context to the problem I am facing. I am sorry as I should have probably done this earlier.

Context of problem: I have to migrate data from MySQL to HBase. I want to import the values so that each day (which would be represented by a row key along with the id) have many columns (num) which have a value (avg). The column family is avg. I was originally planning on selecting Sqoop's "Select statement" but I understand I may be pushing the boundaries regarding MySQL's columns and I am not sure what the max is. If anyone knows a better way of doing this and can share with me, that would be great.

Best Answer

If you are interested in returning all the values as columns, you need to be try something very adventurous. First look at your query

SELECT 
    CONCAT(id, '-', date),
    MAX(IF(`num` = 0, avg, NULL)) num0 
FROM table 
GROUP BY 
    id, 
    date;

If would be a big mess to get MySQL to execute it as a query by giving MySQL the query with each column formulated.

Perhaps you can get MySQL to concatenate the column values using GROUP_CONCAT. That function was designed for aggregation (or aggravation if you are the actual developer). You can take all the num values can display it as a column-separated listed of numbers like this:

SELECT 
    CONCAT(id, '-', date),
    GROUP_CONCAT(IF(`num` = 0, avg, 0))) numlist
FROM table 
GROUP BY 
    id, 
    date;

You can also change the list to be delimited by pipes instead of commas like this:

SELECT 
    CONCAT(id, '-', date),
    GROUP_CONCAT(IF(`num` = 0, avg, 0)) SEPARATOR '|') numlist
FROM table 
GROUP BY 
    id, 
    date;

The default maximum length of a GROUP_CONCAT is 1024.

You need to change that max length in the session using this:

SET group_concat_max_len = 10240;

before you issue your query.

Give it a Try !!!