I'm given source data in a tab-delimited text file where the string columns are not quoted.
Among the columns is LAST_NAME, and naturally my source data contains a person with the surname Null. Further, some of the records have redacted names, which I would like to store as NULLs, so I can't define my column as NOT NULL. (A sentinel value is used for the redacted names, so I can identify them.)
I am trying to import this data via LOAD DATA INFILE.
I read each source column into a variable and pass those variables through a function that looks for the sentinel values and translates them to NULL. When this function returns a NULL value, the column gets assigned as NULL. But when this function returns the string 'NULL', the column still gets assigned as NULL, instead of being assigned the string 'NULL'.
How can I prevent the string 'NULL' from being interpreted as a NULL value in the context of a SET clause of a LOAD DATA INFILE command? (Assigning to a nullable varchar column.)
Update with specific data and commands:
An input data file (/tmp/data.txt); note that the fields are TAB-delimited, but I don't know how to do that here:
SMITH Other text is
REDACTED "inconsistently quoted"
NULL REDACTED
My commands:
CREATE FUNCTION despair(value varchar(255))
RETURNS varchar(255) DETERMINISTIC
RETURN
CASE value
WHEN NULL THEN 'NULL'
WHEN 'NULL' THEN 'NULL'
WHEN 'REDACTED' THEN NULL
ELSE value
END;
CREATE TABLE `misery` (LAST_NAME varchar(50), TEXT varchar(100));
LOAD DATA INFILE '/tmp/data.txt' INTO TABLE `misery`
FIELDS TERMINATED BY '\t' ENCLOSED BY '"' LINES TERMINATED BY '\n'
(@name, @text) SET LAST_NAME = despair(@name), TEXT = despair(@text);
SELECT LAST_NAME, TEXT, LENGTH(LAST_NAME), LENGTH(TEXT) FROM `misery`;
Current output. Note that the TEXT column looks right, because quotation marks are removed as desired, but the last row's LAST_NAME is a NULL value, but should instead be the string 'NULL'.
+-----------+-----------------------+-------------------+--------------+
| LAST_NAME | TEXT | LENGTH(LAST_NAME) | LENGTH(TEXT) |
+-----------+-----------------------+-------------------+--------------+
| SMITH | Other text is | 5 | 13 |
| NULL | inconsistently quoted | NULL | 21 |
| NULL | NULL | NULL | NULL |
+-----------+-----------------------+-------------------+--------------+
3 rows in set (0.00 sec)
Best Answer
Provide a
FIELDS ENCLOSED BY
from the docsEssentially if you set
FIELDS ENCLOSED BY '"'
then you can provide surname "Null", as"Null"
, andnulls
asnull
(no quotes).