During my cursory research I have not been able to find a definitive answer on the benefits that SELECT INTO OUTFILE provides over INSERT INTO ... SELECT. The docs on the locks that INSERT INTO ... SELECT sets on InnoDB tables state:
INSERT INTO T SELECT ... FROM S WHERE ... sets an exclusive index record lock (without a gap lock) on each row inserted into T. If the transaction isolation level is READ COMMITTED or innodb_locks_unsafe_for_binlog is enabled, and the transaction isolation level is not SERIALIZABLE, InnoDB does the search on S as a consistent read (no locks). Otherwise, InnoDB sets shared next-key locks on rows from S.
So with INSERT INTO ... SELECT, it seems that I must ensure the isolation level is READ COMMITTED to avoid locks on the source table during the query.
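A minimal sketch of what that would look like, assuming hypothetical tables source_tbl and target_tbl with matching columns (both names are placeholders, not from the docs):

```sql
-- Hypothetical table names; adjust to your schema.
-- Applies to transactions started in this session from now on.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;

START TRANSACTION;
-- Under READ COMMITTED, InnoDB should read source_tbl as a consistent read
-- (no locks), per the documentation quoted above.
INSERT INTO target_tbl
SELECT * FROM source_tbl;
COMMIT;
```

One caveat to check before relying on this: with binary logging enabled, READ COMMITTED on InnoDB generally requires binlog_format to be ROW or MIXED.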
However, I have not been able to find any authoritative answer regarding locks when using SELECT INTO OUTFILE; not even the MySQL docs mention lock behavior for it.
My goal is to avoid locking the source table while the query runs to avoid connection stacking.
Best Answer
You should use SELECT ... LOCK IN SHARE MODE. Why ?
In your case, you could attempt this
This would do two SELECT queries:
SELECT to lock the rows in the table you wish
SELECT to perform SELECT ... INTO OUTFILE
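A minimal sketch of those two queries, assuming a hypothetical InnoDB table source_tbl with a primary key id and a hypothetical output path. This is an illustration of the idea, not necessarily the exact statements intended above:

```sql
START TRANSACTION;

-- 1) Take shared locks on the rows to be exported.
SELECT id FROM source_tbl LOCK IN SHARE MODE;

-- 2) Export the same rows to a text file on the server host.
--    The file must not already exist, the account needs the FILE privilege,
--    and secure_file_priv may restrict which directories are allowed.
SELECT * FROM source_tbl
INTO OUTFILE '/tmp/source_tbl.txt'
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n';

-- Release the shared locks.
COMMIT;
```

Other sessions can still read the locked rows while this runs; writers to those rows wait until the COMMIT.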
Personally, I do not think you have to be this heavy-handed. Transaction isolation should be smart enough to pull off this atomic SELECT and use the same rows for the INSERT. I know I said "should be", which is why you are asking your question in the first place.
Whether you do SELECT ... INTO OUTFILE as one command or in the heavy-handed manner I am proposing, the row data of the source table will be fully readable.
GIVE IT A TRY !!!
UPDATE 2014-12-10 15:12 EST
Your comment
They are operationally different:
SELECT INTO OUTFILE creates a text file
INSERT INTO SELECT loads one table from the results of the SELECT
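To make the difference concrete, here is a sketch of both routes, assuming hypothetical tables source_tbl and target_tbl with identical column layouts and a hypothetical file path:

```sql
-- Route 1: export to a text file, then load that file into the target
-- (two statements, with a file on the server host in between).
SELECT * FROM source_tbl
INTO OUTFILE '/tmp/source_tbl.txt';

LOAD DATA INFILE '/tmp/source_tbl.txt'
INTO TABLE target_tbl;

-- Route 2: copy directly from one table to the other in a single statement.
INSERT INTO target_tbl
SELECT * FROM source_tbl;
```

Route 1 relies on the default field and line terminators being the same for INTO OUTFILE and LOAD DATA INFILE, so no format clauses are needed in this simple case.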
UPDATE 2014-12-11 12:21 EST
The only thing I can think of in this context is the point-in-time of the data and when you are using it. With both types of operations, there will be some implicit shared locking.
With SELECT INTO OUTFILE, you are preparing a result and saving it externally. Loading that data into a table using LOAD DATA INFILE will not involve any shared locking during the load process. Keep in mind that SELECT INTO OUTFILE will incur disk I/O and still impose some caching along the way.
With INSERT INTO SELECT, the shared locks would probably have to live longer in InnoDB because you are locking rows and using those same rows to INSERT into another table.
Therefore, if I were looking for a performance bonus, I would have to give the edge to INSERT INTO SELECT because you are doing the same amount of shared row locking. The disk I/O for the single operation would have to be less than the separate SELECT INTO OUTFILE and subsequent LOAD DATA INFILE. Of course, you would have to compare the two methods against your dataset. What could be a performance bonus for one dataset might be a performance cost for another dataset.
UPDATE 2014-12-17 00:00 EST
Your Comment
The only authoritative response would have to come from the MySQL Documentation.
First, what does the MySQL Documentation on LOAD DATA INFILE say?
Two paragraphs later, it says
When you look at Speed of INSERT Statements, it says this:
Things start to look a little nebulous at this point because you must tune the load process in terms of the storage engine. MyISAM is rather straightforward in this respect because the bulk insert buffer is for MyISAM only, and LOAD DATA INFILE will harness the bulk insert buffer. InnoDB will not.
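For a MyISAM target, a sketch of enlarging that buffer for the session before the load might look like this. The table name, file path, and the 256 MB figure are all illustrative assumptions, not recommendations:

```sql
-- bulk_insert_buffer_size only affects MyISAM tables; InnoDB ignores it.
-- Per the MySQL docs it applies to bulk inserts into nonempty tables.
SET SESSION bulk_insert_buffer_size = 268435456;  -- 256 MB, illustrative value

-- Hypothetical file and table names.
LOAD DATA INFILE '/tmp/source_tbl.txt'
INTO TABLE target_myisam_tbl;
```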
Take a look at this Pictorial Representation of InnoDB (Percona CTO Vadim Tkachenko)
There are other considerations to tweak options but LOAD DATA INFILE will literally slam everything into the InnoDB Buffer Pool, funneling changes through the Log Buffer, Double Write Buffer, Insert Buffer (if the target table has nonunique indexes), Redo Logs (ib_logfile0, ib_logfile1), and the Physical File of the Table. This is where LOAD DATA INFILE's benefits have to get nullified.
I wrote about this
Feb 06, 2012: LOAD DATA (400k rows) INFILE takes about 7 minutes, cannot kill the "logging slow query" process?
Apr 20, 2012: Why is 'LOAD DATA INFILE' faster than normal INSERT statements?
Jul 13, 2012: Mysql load from infile stuck waiting on hard drive
Jan 11, 2013: MySQL LOAD DATA INFILE slows by 80% after a few gigs of input with InnoDB engine
EPILOGUE
As I already said in my previous update to this answer
Basically, you will have to test SELECT INTO OUTFILE/LOAD DATA INFILE against INSERT INTO SELECT. It may be 6 of one, half a dozen of the other for one dataset, and a landslide victory for another dataset.
All being said from the MySQL Docs and my past posts, I still give the edge to INSERT INTO SELECT. You will just have to test the two methods.