The Great Escape

This week, I am working on putting together test cases which validate the Drizzle transaction log‘s handling of BLOB columns.

I ran into an interesting set of problems and am wondering how to go about handling them. Perhaps the LazyWeb will have some solutions. :)

The problem, in short, is inconsistency in the way that the NUL character is escaped (or not escaped) in both the MySQL/Drizzle protocol and the MySQL/Drizzle client tools. And, by client tools, I mean both everyone’s favourite little mysql command-line client, but also the mysqltest client, which provides infrastructure and runtime services for the MySQL and Drizzle test suites.

Even within the server and client protocol, there appears to be some inconsistency in how and when things are escaped. Take a look at this interesting output from the drizzle client program (FYI, output is identical for mysql client, I checked…)

drizzle> select 'test\0me';
+---------+
| test    |
+---------+
| test me | 
+---------+
1 row in set (0 sec)

You’ll notice that in the first SELECT statement, the column header is cut off — i.e. the column header is not escaping the \0 NUL character in the string 'test\0me'. However, the result data does not truncate the string but replaces the NUL character with a space character. So, I came to the conclusion that the drizzle client does not escape column headers but does do some sort of escaping for the result data. Given this conclusion, you will understand my raised eyebrow when the following SELECT statement was displayed:

drizzle> select 'test\0me' = 'test me';
+------------------------+
| 'test\0me' = 'test me' |
+------------------------+
|                      0 | 
+------------------------+
1 row in set (0 sec)

Hmmm…so maybe column headers are being escaped by the MySQL/Drizzle client? Clearly, the NUL character was escaped as the characters ‘\\’ followed by the character ’0′ in the column header above. Indeed, quite puzzling.

OK, so the above anomaly needs to be investigated. However, a similar issue exists for the mysqltest/drizzletest client program. To see the problem, check the following out. I create a simple test case with the following in it:

--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings

SELECT 'test\0me';

CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
SELECT COUNT(*) FROM t1;
DROP TABLE t1;

Now, what you would expect to see for the output of the above — at least if you expect results similar to the MySQL/Drizzle client output — is the following:

DROP TABLE IF EXISTS t1;
SELECT 'test\0me';
test
test me
CREATE TABLE t1 (fld BLOB NULL);
INSERT INTO t1 VALUES ('test\0me');
SELECT COUNT(*) FROM t1;
COUNT(*)
1
DROP TABLE t1;

That is what you would expect to see in the output of course… Here is what you actually get in the output:

DROP TABLE IF EXISTS t1;
SELECT 'test\0me';
test
test

So, the mysqltest/drizzletest client apparently does not escape the NUL character for the result data at all. It looks like it does do some escaping/replacing for the NUL character in the column header, though, otherwise the second “test” line would not appear. This leads to the result file being essentially truncated as soon as a NUL character is included in any output to the mysqltest/drizzletest client. This essentially makes the mysqltest/drizzletest client useless for testing and validating BLOB data.

Possible Solutions?

I think the cleanest solution would be to create a shared library of code that would be responsible for uniformly and consistently escaping data, and then linking the various clients (and server) with this library and removing all of the various escaping functions currently in the server. This would, of course, take some time, but would be the most future proof solution. Anyone else have ideas on solving the problem of being able to test and validate binary data via the test suite? Cheers!