I've been struggling for days with a problem on an MS SQL database. Here is the situation: I have an old open-source PHP application (Limesurvey version 1.92) running on a Windows 2008 R2 server, with SQL Server 2008 R2 as the RDBMS. I need to migrate the surveys in the application to a new instance running on Linux with MySQL. The new instance is version 3, and the database structure differs between the two versions.

The approach I tried is to upgrade the old version on the old server and then export the surveys as export files, which I can import into the new application. It works well, except that I'm stuck on a character-encoding problem.

The old web application uses UTF-8, but the database uses only VARCHAR and TEXT columns. UTF-8 characters are therefore stored as several single-byte characters in these columns: é is stored as Ã©, for example. When I upgrade the application, the upgrade script changes the structure of the tables: TEXT columns become NVARCHAR(MAX), but the content of the columns is not converted to UTF-8.

So I'm looking for a way to convert the content of the NVARCHAR(MAX) columns from ISO8859 to UTF-8. The database has approximately 200 tables and is 600 MB.
Sql-server – How to bulk convert ISO8859 text in Text columns to UTF8 in nvarchar(max) columns
character-set, sql server, sql-server-2008-r2, type conversion, utf-8
Related Solutions
I'm not certain. I tried to start out by reproducing your problem, but the ALTER worked fine for me.
test > CREATE TABLE `bar` ( `content` text ) ENGINE=MyISAM DEFAULT CHARSET=latin1; INSERT INTO bar VALUES (0x8081828384858687898A8B8C8D8E8F909192939495969798999A9B9C9D9E9F);
Query OK, 0 rows affected (0.02 sec)
Query OK, 1 row affected (0.00 sec)
test > ALTER TABLE bar CHANGE content content TEXT CHARACTER SET UTF8;
Query OK, 1 row affected (0.04 sec)
Records: 1 Duplicates: 0 Warnings: 0
test > select * from bar;
+---------------------------------+
| content |
+---------------------------------+
| ����������������������������� |
+---------------------------------+
1 row in set (0.00 sec)
test > set names utf8;
Query OK, 0 rows affected (0.00 sec)
test > select * from bar;
+---------------------------------------------------------------------------------+
| content |
+---------------------------------------------------------------------------------+
| €‚ƒ„…†‡‰Š‹ŒŽ‘’“”•–—˜™š›œžŸ |
+---------------------------------------------------------------------------------+
1 row in set (0.00 sec)
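One plausible reading of the session above, sketched in Python rather than SQL: the first SELECT ran while the connection was still latin1, so the server sent single bytes such as 0x80 to a terminal that (I assume) expected UTF-8, and after set names utf8 the same characters arrived as multi-byte UTF-8 sequences. The snippet assumes that MySQL's latin1 matches Windows-1252 for the byte shown, and it only mimics what the terminal sees, not MySQL internals.

```python
# Illustration only: models the bytes a client terminal receives.
stored_byte = b"\x80"  # first byte of the value inserted into `bar`

# Interpreted as Windows-1252, byte 0x80 is the euro sign.
char = stored_byte.decode("cp1252")

# A latin1 connection sends the single byte; a UTF-8 terminal cannot decode it
# and shows a replacement character instead.
seen_with_latin1 = stored_byte.decode("utf-8", errors="replace")

# A utf8 connection sends the three-byte UTF-8 encoding, which displays fine.
sent_with_utf8 = char.encode("utf-8")
seen_with_utf8 = sent_with_utf8.decode("utf-8")

print(repr(char), repr(seen_with_latin1), sent_with_utf8.hex(), repr(seen_with_utf8))
```

If that reading is right, the stored data was converted correctly by the ALTER and only the display differed between the two SELECTs.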
Here are my related character-set settings:
test > show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | utf8 |
| character_set_connection | utf8 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | utf8 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
Edit
My char settings before running set names utf8
test > show variables like '%char%';
+--------------------------+----------------------------+
| Variable_name | Value |
+--------------------------+----------------------------+
| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)
Version
test > select version();
+-------------------------+
| version() |
+-------------------------+
| 5.1.41-3ubuntu12.10-log |
+-------------------------+
1 row in set (0.00 sec)
OK, I've rewritten your SP for you with QUOTENAME and the schema, as @MartinSmith suggested. That said, I'm not quite sure why you are using an SP for this; a standalone piece of code without the SP wrapper would seem to make more sense for something like this.
You use QUOTENAME to deal with odd characters in names. For example, Table-Test is a valid table name, but it won't work in your code unless you put []s around it: [Table-Test]. QUOTENAME takes care of that for you. It also handles names that happen to contain a ]. When you write dynamic SQL, it is generally considered best practice to include schemas and to use QUOTENAME where needed.
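To make the quoting rule concrete, here is a minimal Python sketch of what QUOTENAME does with its default bracket delimiters: wrap the name in [] and double any ] inside it. The helpers quote_name and build_alter are hypothetical names used only for this illustration; they approximate the behaviour and ignore details such as QUOTENAME's input-length limit.

```python
def quote_name(name: str) -> str:
    """Approximate T-SQL QUOTENAME(name) with the default [ ] delimiters."""
    # A closing bracket inside the name is escaped by doubling it.
    return "[" + name.replace("]", "]]") + "]"


def build_alter(schema: str, table: str, column: str) -> str:
    """Build the same kind of ALTER statement the procedure assembles."""
    return (
        "ALTER TABLE " + quote_name(schema) + "." + quote_name(table)
        + " ALTER COLUMN " + quote_name(column) + " nvarchar(max);"
    )


print(quote_name("Table-Test"))
print(quote_name("odd]name"))
print(build_alter("dbo", "Table-Test", "notes"))
```

Without the quoting step, the statement for Table-Test would be parsed as Table minus Test and fail.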
USE [EventLogic]
GO
/****** Object: StoredProcedure [dbo].[usp_SL_ConvertNtextToNvarchar] Script Date: 08/08/2013 16:28:58 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[usp_SL_ConvertNtextToNvarchar]
AS
/*
*/
BEGIN
SET NOCOUNT ON;
DECLARE @sql nvarchar(max)
DECLARE @table_schema nvarchar(128)
DECLARE @table_name nvarchar(128)
DECLARE @column_name nvarchar(128)
DECLARE @totalCount int
DECLARE @count int
SET @totalCount = 0;
SET @count = 0;
SET @sql = '';
-- Eventlogic
DECLARE tables_cursor CURSOR FOR
SELECT SCHEMA_NAME(so.schema_id) AS table_schema, so.name as table_name,
sc.name as column_name
FROM sys.objects so
JOIN sys.columns sc ON so.object_id = sc.object_id
JOIN sys.types stp ON sc.user_type_id = stp.user_type_id
AND stp.name = 'ntext'
WHERE so.type = 'U' -- to show only user tables
OPEN tables_cursor
FETCH NEXT FROM tables_cursor INTO @table_schema, @table_name, @column_name
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = 'ALTER TABLE Eventlogic.'+QUOTENAME(@table_schema)+'.' +
QUOTENAME(@table_name) + ' ALTER COLUMN ' +
QUOTENAME(@column_name) + ' nvarchar(max);'
EXEC sp_executesql @sql
--PRINT @sql
SET @sql = 'UPDATE Eventlogic.'+QUOTENAME(@table_schema)+'.' +
QUOTENAME(@table_name) + ' SET ' +
QUOTENAME(@column_name) + '=' +
QUOTENAME(@column_name) + ' '
EXEC sp_executesql @sql
--PRINT @sql
SET @count = @count + 1; -- running count of columns converted so far
PRINT ('Eventlogic.'+@table_schema+'.' + @table_name + '.' + @column_name + ' ' + CAST(@count AS nvarchar(10)))
SET @totalCount = @count; -- keep the total equal to the number of columns processed
FETCH NEXT FROM tables_cursor INTO @table_schema, @table_name, @column_name
END
CLOSE tables_cursor
DEALLOCATE tables_cursor
PRINT ('Total columns updated: ' + CAST(@totalCount AS nvarchar(10)))
END;
GO
Best Answer
SQL Server 2008 R2 does not support UTF-8 (such support does not arrive until the upcoming 2019 release); it only supports UCS-2 (essentially a subset of UTF-16). Because of this, there is unlikely to be a practical solution within SQL Server itself.
You may have to code up a tool to pull the data out, convert it, and update the DB with the newly converted values.
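As a rough idea of the "convert it" step, here is a minimal Python sketch. It assumes the old VARCHAR/TEXT columns used the Windows-1252 code page (an assumption; check the column collation), so each NVARCHAR character is mapped back to its original byte and the byte sequence is then decoded as UTF-8. The function name repair_mojibake is hypothetical, and reading and writing the rows through a database driver is left out.

```python
def repair_mojibake(text: str, codepage: str = "cp1252") -> str:
    """Undo UTF-8 bytes that were misread as a single-byte code page."""
    raw = bytearray()
    for ch in text:
        try:
            raw += ch.encode(codepage)
        except UnicodeEncodeError:
            if ord(ch) > 0xFF:
                # A character that never came from a single byte:
                # leave the whole value untouched.
                return text
            # Bytes undefined in cp1252 (such as 0x81) are assumed to have
            # surfaced as the control character with the same number.
            raw += ch.encode("latin-1")
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not a valid UTF-8 byte sequence, so the value was probably fine.
        return text


print(repair_mojibake("caf\u00c3\u00a9"))
```

Values that are not double-encoded are returned unchanged, so the function can be run over every text column, although it is worth checking a sample of rows by hand before updating the database.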
On rereading the question, I notice you mention MySQL as the target database, though your tags do not mention it. The types you mention (NVARCHAR etc.) are not MySQL types. You may need to clarify the question and update the tags.