While this may be old hat to those of you out there who, like me, regularly scour through the source code commits on the internals (and newly forked commits) lists, however, in the past few weeks, a couple of documentation and code commits have caught my attention. Things catch my attention for any number of reasons. Most often, the commits that I flag for a later look at are those things which I suspected would change in the near future when I wrote chapters of Pro MySQL that dealt with the source code, architecture and storage engines. Other times, I catch something about a particular bug that I’ve run into has been fixed. Other times, I just notice some really cool idea that’s going into the code and decide to investigate. Of course, besides the few things below, there were hundreds of code changes, bug fixes, and documentation updates in the past few weeks. These two just happened to catch my attention quickly… feel free to add to the list if you have other ones you found particularly compelling…
Changes to the MyISAM Table Cache in MySQL 5.1
Paul Frost committed some documentation updates for the 5.1 release which address the source code changes that deal with the MyISAM table cache. In Chapters 4 and 5 of Pro MySQL, I wrote about the MyISAM table cache, which (up until 5.1.3) is a cache storing file descriptors of the MyISAM tables opened during queries. The table cache actually is a hash of table structs that is used by the optimizer and query engine to determine, among other things, what columns are in a table. Importantly, the table C struct contained a pointer to a file descriptor, which is an integer value used to identify an actual operating system file in use by a process running under the operating system. Most, if not all, operating systems, impose a limit on the number of file descriptors that any process can simulataneously hold open.
Unfortunately, because of this link to a file descriptor, systems that had large numbers of tables (for instance, a shared web server) sometimes had issues where the number of file descriptors needed by the MyISAM table cache exceeded the number of open file descriptors available to the OS user running the mysqld process. There are a number of forum posts I’ve seen which have encountered this issue. Until MySQL 5.1, the solution wasn’t particularly scaleable, nor particularly easy to administer.
When I was scrounging through the MySQL source code in my research for chapter 4, I made a note that there were comments in the source code which indicated that the table cache was to undergo an overhaul in future versions. So, it was with great interest that I came across Paul’s documentation commit:
+ <listitem> + <para> + <literal>table_definition_cache</literal> + </para> + + <listitem> + <para> + The table definition cache. If you use a large number of + tables, you can create a large table definition cache to + speed up opening of tables. The table definition cache + takes less space and does not use file descriptors, + unlike the normal table cache. This variable was added + in MySQL 5.1.3. + </para> + </listitem> + </listitem>
This is fantastic news for those people who have struggled with table cache issues in the past. I’m going to dig into the 5.1 code and find some of the areas changed and do some more research on it. The original server variable, table_cache, has now been separated into table_open_cache, which has a similar meaning to the original table_cache variable: it stores structs having a file descriptor link, however, is not apparently used for table definition lookups anymore. Instead, a new server variable, table_definition_cache, is used to manipulate a separate hash of structs which serves to describe the metadata of the table columns and other information about the tables. I assume that this table_definition_cache is invalidated with any DDL changes to the tables and is reloaded similar to the ACL hash that serves to cache user privilege information. Anyway, I’ll follow up on this important change in a future post… stay tuned!
mmap Support in MyISAM
Another interesting commit made on Vadim Tkachenko on December 1st was the inclusion of mmap support in the MyISAM storage engine in the MySQL 5.1 tree.
mmap is an alternative to the traditional read/write/lseek file I/O operations used to bring data into a buffer from a file (or device). Instead of filling a buffer with the results of the read() function, the mmap() function literally maps a file or device into the memory space of the calling process. In some instances, this has been shown to generate performance increases, particularly for larger files.
Though I by no means an expert C coder, the idea of using mmap() for the MyISAM storage engine definitely piqued my interest. Why? Well, unlike the InnoDB storage engine, which uses a logical data page memory allocation (extents are organized from 16KB data pages and managed via InnoDB), MyISAM uses a much simpler scheme for accessing data. The .MYD file is opened, and a record pointer or address is used to “seek” into the file and read or update the data in place. Records are added to the .MYD file sequentially, regardless of the primary key column (hence, the name Indexed Sequential Access Method or ISAM). B-tree indexes are built outside of the data file, and the index records contain pointers to the “slot” in the .MYD file which contains the rest of the record data.
So, while the MyISAM storage engine caches index records (in the MyISAM key cache), unlike the InnoDB storage engine, the data records are not themselves cached. With mmap, however, it seems to be possible to have the entire .MYD file mapped into memory, making both read and write operations much speedier. Why? Well, because there would be no need to open the file, seek to the correct offset, and then read into a buffer. The data would already be in memory. Vadim’s code below shows an example of what’s going on. The following is from the mi_mmap_pread() function in /storage/myisam/mi_dynrec.c.
+ if (info->s->mmaped_length >= offset + Count) + { + memcpy(Buffer, info->s->file_map + offset, Count); + if (info->s->concurrent_insert) + rw_unlock(&info->s->mmap_lock); + return 0; + } + else + { + if (info->s->concurrent_insert) + rw_unlock(&info->s->mmap_lock); + return my_pread(info->dfile, Buffer, Count, offset, MyFlags); + } +}
The if construct checks to see if the mmap()ed file is available. If so, a simple memcpy() call is used to copy the data at the specified offset (record “slot”) into the caller’s buffer. If no mmap() is used, or the mapped file has no more room or does not contain the needed record, a call to my_pread() is issued, which issues the standard read/seek/copy to buffer operations.
For large files, this mmap() method may prove to be a performance benefit, but, since more memory is needed to store the record file entirely or mostly in memory, there will be other issues involved in this type of strategy. I’m very interested to see how this develops…
for those of you interested in the exact code changes I was looking at, see here.