Taking a little break from refactoring temporal data handling this evening, I decided to run some profiles against both Drizzle and MySQL 5.1.33. I profiled the two servers with callgrind (a valgrind tool/skin) while running the drizzleslap/mysqlslap test case. In both cases, I had to make a small change to the drizzled/tests/test-run.pl Perl script.
For the MySQL build, I used the BUILD/compile-amd64-debug-max build script. For Drizzle, I used my standard build process which builds Drizzle with maximum debugging symbols and hooks. It’s worth noting that the debug and build process for MySQL and Drizzle are very different, and the MySQL debug build contains hooks to the DBUG library, which you’ll notice appear on the MySQL call graphs and consume a lot of the overall function calls. You won’t see this in the Drizzle graphs because we do not use DBUG. For all intents and purposes, just ignore the calls to anything in the DBUG library in the MySQL graphs since in a non-debug build all that stuff is NOOPed out…
FYI, the drizzleslap/mysqlslap test case is a decent one to run profiling against because it tests a range of different SQL statements in a concurrent environment, something you won’t really see in the other tests cases. This is the reason I like using it when profiling with valgrind/callgrind/oprofile…
OK, so on to the graphs…
Drizzle Callgrind Profile — The Function Calls
Here is the top portion of the function call tree for the drizzled server over the time of the drizzleslap test run. I ordered the result by the percentage of total execution time that was spent in each function call. At the top, you’ll see that the get_text function used by the Lex_input_stream class and the InnoDB log_group_write_buf() function are the two top execution time consumers.
For the log_group_write_buf() function (defined in /storage/innobase/log/log0log.c) , this makes sense: Drizzle’s default storage engine is InnoDB (yes, even for the INFORMATION_SCHEMA) and therefore you’ll notice this function, which is responsible for writing to one of InnoDB’s log group files.
More interestingly, there is a fairly alarming 6.45% of the total execution time, for the get_text() static function (defined in drizzled/sql_lex.cc). If we look at this routine, it’s fairly clear what the function does — it simply reads an unescaped text literal, without quotes. Nothing particularly fancy, although this comment above the routine might be telling: “Return an unescaped text literal without quotes. Fix sometimes to do only one scan of the string”
Perhaps it’s time to look into that single-scan thing…or perhaps not.
The next “biggie” is the my_mb_wc_utf8mb4() routine, which is called an astonishing 20,779,971 times over the course of the execution of approximately 17,000 statements. How’d I come up with 17,000 statements? I just looked at the number of times DRIZZLEparse() is called, which is close enough…
my_mb_wc_utf8mb4() is called from a number of places, most notably from within the parser and lexer, and when converting to various string classes and primitives. It’s really scary that this function is called so many times! Why? Well, one reason is that Drizzle does not support any other character set than UTF8 full 4-byte. Although we support many collations, we don’t support the myriad character sets that MySQL does. So, it kind of makes sense that UTF8 routines would be called quite a bit. But…20M executions for 17K statements seems like there is an obvious inefficiency here. 🙂
This leads nicely to MySQL. If Drizzle is spending so much time in UTF8 routines, and MySQL doesn’t by default use UTF8 as it’s character set (and MySQL 5.1.33 doesn’t support 4-byte UTF8, it would make sense that MySQL would NOT be spending nearly as much time executing character set conversion routines, right? Well, not so much. 🙁
MySQL Callgrind Profile — The Function Calls
Remember when looking at the MySQL call graph to disregard the DBUG library calls (look for dbug.c in the source file column) since these would be optimized away in a production environment….
What we notice in the MySQL calls is that pthread_getspecific is the number one execution time consumer, followed by a few of those DBUG library calls. I have a suspicion that the pthread_specific calls are actually related to the DBUG library calls, which track debugging information in the threads. I might be wrong about this but given the stark difference between pthread_specific()‘s top spot in MySQL’s call graph and it not appearing in Drizzle’s graph, it makes sense that this is related to the DBUG library. So, I’ll ignore it for now. 🙂
So, after those, you’ll notice a bunch of calls to memcpy() and the number of calls to memcpy() very closely matches the number of calls from the Drizzle graph. This makes sense. Drizzle’s mechanism for transporting data across the wire and for translating record formats between the database kernel and the storage engine has not yet changed much, and this is where much of the calls to memcpy() are coming from. (this will change with ValueObjects, BTW, but more on that later…)
After memcpy(), though, if you scan the function call list, you’ll notice that MySQL, even with UTF8 not as the default character set, there is still a whole lot of calls, just like in Drizzle, to various character set routines — notably my_uni_utf8(), my_uft8_uni(), copy_and_convert(), my_mb_wc_latin1(), my_ismbchar_utf8(), and so on.
It turns out that if you add up all the character set conversion and comparison routine executions in both MySQL and Drizzle, that all those function calls are taking up more than 12% of the total execution time for both Drizzle and MySQL!
Call Trees Seem to Blame the Parser
I’m not going to go on too much further, as it’s getting late and I’m tired, but I’m putting the call trees for Drizzle and MySQL for the profiling runs below. I think it’s fairly clear that the parser is eating up a large chunk of execution time. Perhaps it’s time to look into prototyping and benchmarking other parsers, or at the very least, looking into streamlining the existing parser to be more efficient when it comes to character set routines… 🙂
Feel free to click on the images below for the fullsize versions.
Here is the Drizzle call tree:
And here is the MySQL one: