MySQL – Why would SequelPro only import 23k rows out of 130k

I use SequelPro for MySQL on Mac OS X. I used the import function to upload a 130,000-row .csv file to my database. Everything seems to work fine, but then I get this message:

File Read Error: An error occurred when reading the file, as it could not be read using the encoding you selected (Auto-detect – Unicode (UTF-8)). Only 23,000 rows were imported.

When I hit "OK," everything else seems to work relatively fine, but I'm missing about 107,000 rows.

Any idea as to what it could be? Maybe I should use something other than auto-detect during the import? I thought that it might have been some extra commas floating around in the actual .csv file, which there were, but I got rid of those and the same thing happened.

Out of 130,000 rows, there's definitely a possibility of some non-English characters. Which ones does MySQL not accept, and how would I find and replace them?

This is what I'm getting when I run the character set query:

show variables like 'character_set%';

Variable_name               Value
character_set_client        latin1
character_set_connection    latin1
character_set_database      latin1
character_set_filesystem    binary
character_set_results       latin1
character_set_server        latin1
character_set_system        utf8
character_sets_dir          /usr/local/mysql-5.6.10-osx10.7-x86_64/share/charsets/

Best Answer

This may depend on where you generated the CSV file. If the CSV file was generated on a Windows machine, there could be some character set issues.

SequelPro's character set problems are not new. See https://code.google.com/p/sequel-pro/issues/detail?id=1629

If the CSV file was generated on another Mac OS X server, you should not be having this issue.
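To check whether the file really contains bytes that are not valid UTF-8, and where, a small script can help. The following is a minimal sketch, not part of SequelPro or MySQL: the function name `find_undecodable_lines` and the sample rows are made up for illustration, and it assumes the import stopped at the first line that could not be decoded as UTF-8.

```python
from typing import Iterable, Iterator, Tuple


def find_undecodable_lines(
    lines: Iterable[bytes], encoding: str = "utf-8"
) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (line number, byte offset in the line, raw line) for lines that fail to decode."""
    for number, raw in enumerate(lines, start=1):
        try:
            raw.decode(encoding)
        except UnicodeDecodeError as error:
            # error.start is the offset of the first offending byte in this line.
            yield number, error.start, raw


# In-memory stand-in for a CSV opened in binary mode; the rows are made up.
sample = [
    b"id,name\n",
    b"1,caf\xc3\xa9\n",  # valid UTF-8 for "cafe" with an acute accent
    b"2,caf\xe9\n",      # the same word with a single latin1/cp1252 byte
]

for number, offset, raw in find_undecodable_lines(sample):
    print(f"line {number}, byte {offset}: {raw!r}")
```

For a real file, pass a handle opened in binary mode, for example `open("data.csv", "rb")` (the file name is hypothetical). If the first reported line number is close to the number of rows that were imported, an encoding mismatch is the likely cause.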

You may have to resort to setting the default character set to match that of the CSV file. It sounds weird, but here it goes:

Please run this query and you will see something like this:

mysql> show variables like 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

mysql>

You can also see the character set of the database:

mysql> show create database mydb\G
*************************** 1. row ***************************
       Database: mydb
Create Database: CREATE DATABASE `mydb` /*!40100 DEFAULT CHARACTER SET latin1 */
1 row in set (0.00 sec)

mysql>

Perhaps you should load another table that has the matching character set:

CREATE TABLE anothertable LIKE mytable;

Change the whole table's character set:

ALTER TABLE anothertable CONVERT TO CHARACTER SET charset_name [COLLATE collation_name];

or change a column's character set:

ALTER TABLE anothertable MODIFY col1 CHAR(50) CHARACTER SET utf8;

Then, have SequelPro load anothertable.

I guess to be less aggressive, just change the column's character set.
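A different route from the one described above is to leave the table alone and convert the file to UTF-8 before importing it. The sketch below is only that, a sketch: it assumes the file was written on Windows in cp1252, which should be verified first, and the helper name `transcode_to_utf8` is made up.

```python
def transcode_to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8.

    Raises UnicodeDecodeError if a byte has no mapping in source_encoding.
    """
    return raw.decode(source_encoding).encode("utf-8")


# Made-up row: the single cp1252 byte 0xE9 becomes the two-byte UTF-8 sequence.
converted = transcode_to_utf8(b"2,caf\xe9\n")
print(converted)
```

If the guess about the source encoding is wrong, the conversion may either raise an error or silently produce the wrong characters, so it is worth spot-checking a few converted rows before importing.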

Related Solutions

MySQL my.cnf won’t take any effect

The option character-set-database should not be configured in my.cnf.

Please note what the MySQL Documentation says on character-set-database:

The character set used by the default database. The server sets this variable whenever the default database changes. If there is no default database, the variable has the same value as character_set_server.

Footnote : This option is dynamic, but only the server should set this information. You should not set the value of this variable manually.

Even though the Documentation says it is dynamic, it is not supposed to be changed by any manual intervention in my.cnf. If you look inside the database subfolder, you will find a file called db.opt. EXAMPLE : When you run use dbname in the mysql client, the file /var/lib/mysql/dbname/db.opt is read in order to set the character-set-specific database options contained in that file. For this reason, the variable has to be dynamic.
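If you do have OS access, db.opt is a small text file of key=value lines. The sketch below parses such text into a dictionary. The two keys in the example are an assumption about what the file typically holds for a latin1 database on MySQL 5.x, and the function name `parse_db_opt` is made up.

```python
from typing import Dict


def parse_db_opt(text: str) -> Dict[str, str]:
    """Parse key=value lines, such as those in a db.opt file, into a dict."""
    options: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()
    return options


# Assumed contents, consistent with the latin1 databases shown in this answer.
example = "default-character-set=latin1\ndefault-collation=latin1_swedish_ci\n"
print(parse_db_opt(example))
```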

If you cannot access the database from the OS to see db.opt, simply run this command:

SHOW CREATE DATABASE dbname;

on any database and you will see what db.opt contains (or the defaults if db.opt is not there):

mysql> show create database mysql;
+----------+------------------------------------------------------------------+
| Database | Create Database                                                  |
+----------+------------------------------------------------------------------+
| mysql    | CREATE DATABASE `mysql` /*!40100 DEFAULT CHARACTER SET latin1 */ |
+----------+------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql>

In light of this, you should try setting character-set-server in my.cnf only (or at least remove character-set-database from my.cnf). Then, run service mysql restart.
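Before restarting, it can help to confirm which groups in my.cnf still set the option. This is a rough sketch of a line-based scan, not a full option-file parser: the function name `find_option` and the sample fragment are made up, `!include` directives are skipped rather than followed, and dashes and underscores in option names are treated as interchangeable.

```python
from typing import List, Tuple


def find_option(text: str, option: str) -> List[Tuple[str, int, str]]:
    """Return (group, line number, value) for each line that sets `option`."""
    wanted = option.replace("_", "-")
    group = ""
    hits: List[Tuple[str, int, str]] = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped[0] in "#;!":
            continue  # blank line, comment, or !include directive
        if stripped.startswith("[") and stripped.endswith("]"):
            group = stripped[1:-1].strip()  # remember the current [group]
            continue
        name, _, value = stripped.partition("=")
        if name.strip().replace("_", "-") == wanted:
            hits.append((group, number, value.strip()))
    return hits


# Made-up my.cnf fragment for illustration.
sample_cnf = """[mysqld]
character-set-server = utf8
character_set_database = utf8
"""
print(find_option(sample_cnf, "character-set-database"))
```

Any hit for character-set-database is a line to remove; character-set-server under [mysqld] is the one to keep.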

Give it a Try !!!

UPDATE #1

I sort of dealt with a question like this before : Why default character_set_server is latin1?

Looking back at my old link, I had an idea and ran this:

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.6.10    |
+-----------+
1 row in set (0.00 sec)

mysql> select * from information_schema.collations where COLLATION_NAME like '%kor%';
+-----------------+--------------------+----+------------+-------------+---------+
| COLLATION_NAME  | CHARACTER_SET_NAME | ID | IS_DEFAULT | IS_COMPILED | SORTLEN |
+-----------------+--------------------+----+------------+-------------+---------+
| euckr_korean_ci | euckr              | 19 | Yes        | Yes         |       1 |
+-----------------+--------------------+----+------------+-------------+---------+
1 row in set (0.00 sec)

mysql>

You could set a Korean character set if you need to.

UPDATE #2

You should leave wait_timeout and interactive_timeout out of the [client] and [mysql] groups.

MySQL database drop insanely slow

I hate the checking permissions issue.

You may have to disable key checks before the DROP DATABASE:

SET unique_checks = 0;
SET foreign_key_checks = 0;
SET GLOBAL innodb_stats_on_metadata = 0;
DROP DATABASE db_madeintouch;
SET GLOBAL innodb_stats_on_metadata = 1;
SET foreign_key_checks = 1;
SET unique_checks = 1;

UPDATE 2013-04-15 18:04 EDT

I just noticed you have innodb_file_per_table OFF. What gives ?

You currently have all the InnoDB data and the corresponding index sitting in a single file.
Any CREATE TABLE statement must make data dictionary updates and look for space (small but annoying in this instance)
Internal Fragmentation of ibdata1
Dropping a table means scanning the table and its indexes for availability to lock. With data and index pages possibly fragmented, this takes spindles, seek time, and latency.
See Pictorial Representation of ibdata1 to see everything that goes into ibdata1

Recommendation : Remove all Data and Index Pages from ibdata1

This will give ibdata1 a breather to handle just data dictionary and MVCC management. In addition, ibdata1 will stay rather lean and mean and can be read more quickly.

You will need to perform the InnoDB Infrastructure Cleanup. I wrote out all the steps back on October 29, 2010 on StackOverflow.

UPDATE 2013-04-22 08:10 EDT

Three suggestions

SUGGESTION 1 : I just noticed something else. You are using an ancient version of MySQL (5.0.45). You should think about upgrading to MySQL 5.6.11, as it performs significantly faster than MySQL 5.5 and way faster than MySQL 5.0.

SUGGESTION 2 : You should also go ahead and implement the InnoDB Infrastructure Cleanup.

SUGGESTION 3 : You should also check the disk itself. If the data is sitting on a RAID10 set, one of the disks may have an issue. Check the disk controller's battery as well, because it can slow down disk caching and affect read performance.