Max Mether pointed out an error in something I mentioned last week in a performance tuning webinar and I wanted to clarify and correct. In the webinar I said that concurrent inserts would only be available on a MyISAM table if an AUTO_INCREMENT primary key was used. That’s incorrect. I was confusing the concurrent_insert server variable with the AUTO_INCREMENT property. I should have just re-read my own book, which covers the topic in some detail in Chapter 5. Oh well.
But, in case you missed the webinar or just aren’t aware of what concurrent insert is, let me explain. First, some background information on MyISAM.
MyISAM and Table-Level Locks
Unlike InnoDB, which employs row-level locking, MyISAM uses a much coarser-grained locking system to ensure that data is written to the data file in a protected manner. Table-level locking is the only level of lock for MyISAM, and this has a couple consequences:
- Any connection issuing an UPDATE or DELETE against a MyISAM table will request an exclusive write lock on the MyISAM table. If no other locks (read or write) are currently placed on the table, the exclusive write lock is granted and all other connections issuing requests of any kind (DDL, SELECT, UPDATE, INSERT, DELETE) must wait until the thread with the exclusive write lock updates the record(s) it needs to and then releases the write lock.
- Since there is only table-level locks, there is no ability (like there is with InnoDB) to only lock one or a small set of records, allowing other threads to SELECT from other parts of the table data.
As I noted in the webinar, this make MyISAM less than ideal for certain environments, such as OLTP applications or heavy-concurrency environments with a decent number of UPDATE operations.
However, note in the first bullet point above that I did not say that an INSERT request would trigger a request for an exclusive write lock — at least not necessarily. Why is that? Well, that is the concurrent insert ability of MyISAM.
Concurrent Insert for MyISAM
MyISAM can insert new records into the .MYD data file even when other threads are reading data from the file. The reason for this stems from the fact that new records are appended to the end of the .MYD file. Because the writing thread knows that the next record slot will be at the end of the data file.
The server variable which controls MyISAM’s concurrent insert functionality is, well, concurrent_insert. The value of the variable can be:
- 0 — Concurrent Insert is disabled
- 1 — Concurrent Insert can occur when there are no records marked as deleted (see above)
- 2 — Concurrent Insert can occur even when there are records marked as deleted
The default value of concurrent_insert is 1, so as long as there are no records marked as deleted in the .MYD file, records may be appended to the end of the data file without locking out SELECT statements, significantly improving the performance of applications that do many INSERTs and SELECTs, but not so many UPDATEs or DELETEs. This is why for environments with heavy read activity, little UPDATE or DELETE activity, and many INSERTs, MyISAM is a very well-performing storage engine, and the main reason why MyISAM is typically chosen in data-warehousing environments.
A second part of this article is covering the internals of concurrent inserts in MyISAM, in all their glory geek details. I’ll update this entry with a link in the near future.