I'm inserting data (Bulk insert and merge update) for a Portuguese data source (CSV files) into my database. Although the DB collation is Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data
and data stored in NVARCHAR
columns, the portuguese charaters are not correctly being saved.
i.e : Original Data
Regularização de Importação de mercadorias das ZEEs com Pagamento diferido, com Isenção
Data in database after insert
Regularização de Importação de mercadorias das ZEEs com Pagamento diferido, com Isenção
Here the import statement:
CREATE PROCEDURE [dbo].[SP_Import_Temp_CPC]
-- Add the parameters for the stored procedure here
@FilePath [nvarchar](1000)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
TRUNCATE TABLE Temp_CPC;
-- Insert statements for procedure here
EXEC ('BULK INSERT Temp_CPC
FROM ''' + @FilePath + '''
WITH ( FIELDTERMINATOR = ''<,>'', ROWTERMINATOR =''\n'', FIRSTROW = 2, KEEPIDENTITY, CODEPAGE = ''ACP'',TABLOCK,ORDER(ITEMID ASC) );');
END
I've tried with "ACP,RAW and OEM" but get the same result.
Am I doing something wrong here?
Best Answer
You're not going to get the correct result until you tell BULK INSERT the code page of your .csv file. You tried ACP, RAW and OEM but you did not try the correct one. Which one is the correct one you ask? That would be something only you can answer, since you have the file and you (should) know what code page was originally created as! Perhaps MS-DOS Portuguese?
the other alternative is to create the .csv file itself as an Unicode file, but that goes back to your source of the .csv file and depends how that file is created (you bcp out, you have it from a friend, was always used by the original program author, etc etc).
SQL Server does not support UTF-8 files for bulk insert. You must save your file as UCS-2LE or as ASCII with codepage 860. You can use libiconv to do the conversion if you cannot save the file again in the proper encoding.