I am tasked with migrating part of our database from an on-site MSSQL server to an Amazon RDS MySQL. This will be my first attempt with a database migration, so I am doing a dry run from our MSSQL server to a locally installed instance of MySQL using MySQL Workbench. Local install MySQL Workbench version 6.3. MySQL Server version 5.7
- Error generated:
Inserting Data: Incorrect string value: '\xA0\xA0 I-D...' for column 'Address1' at row 77
while performing bulk data transfer.- This happens twice in the same table (two different rows). All other tables are successful.
Symptoms:
- On the Object Migration -> Manual Editing -> Column Mappings screen, I see a list of migration messages/warnings.
- On columns that go from source VARCHAR/NVARCHAR to VARCHAR, I get "Collation SQL_Latin1_General_CP1_CI_AS migrated to utf8_general_ci. Same for Char to Char.
- On columns that go from source BIT to TINYINT(1), I get "Source column type BIT was migrated to TINYINT(1)
Settings:
- Source Selection is MSSQL Server, ODBC (Free TDS)
- Advanced tab has "Driver sends Unicode data as UTF-8" checked. Required for Free TDS
- Target Selection is local MySQL. Connection method is Standard(TCP/IP)
I have read THIS article which seems quite promising, but I am not sure how to implement step 3 as it is a migration and when I get to the "Create Schemas" step, it says it will drop the schema and recreate anew if it already exists.
-
All the indexes on this database are on small columns, so the 3-bit to 4-bit utf8 will not be an issue for indexes.
-
As far as "Modify connection, client, and server character sets" to be utf8mb4, I cannot even find the file to modify. I went to MySQL Server 5.7/my-default.ini and changed it to include
[client] default-character-set = utf8mb4 [mysql] default-character-set = utf8mb4 [mysqld] character-set-client-handshake = FALSE character-set-server = utf8mb4 collation-server = utf8mb4_unicode_ci
as requested, but my variables still look like the below picture, so that is obviously not the right file. Plus, will I actually have access to this on the Amazon RDS instance?
Any thoughts?
Best Answer
Hex
A0
probably came from a Word document, where it is a "hard space".It appears that you did not specify that the source was 'latin1' and the destination is 'utf8'. Otherwise,
A0
should have turned intoC2A0
. That seems correct. However, this seems incorrect: "Driver sends Unicode data as UTF-8". Or perhaps just inconsistent with "Collation SQL_Latin1_General_CP1_CI_AS migrated to utf8_general_ci". Did you do anything in ODBC to tell it to convert latin1 to utf8?You have an inconsistency -- some places you mention utf8, some mention utf8mb4. Outside MySQL, there is only "utf8" (or "UTF-8"), but inside MySQL there is a difference between utf8 and utf8mb4. If you intend to handle Chinese or Emoji, go with utf8mb4.