While studing for the ZCE exam, I’ve read a lot about prepared statements, especially prepared statements with PDO. According to the textbook, using prepared statements in repetitive queries (such as inserting multiple rows in the database and so on) can lead to a substantial improvement due to the fact that the query is compiled one time and the the values are quoted and put in place, whereas in the “traditional” way, a query is compiled every time it’s executed.
So I’ve decided to give it a try, and see how faster prepared statements actually are. To give the test some objectivity and relevance, I’ve calculated the amount of time required for 1000 inserts, repeated this operation 100 times and computed the average execution time.
First, the database table, very simple, 2 columns, one integer and one string:
CREATE TABLE `test_table` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`numeric` INT NOT NULL ,
`string` VARCHAR( 32 ) NOT NULL
)
The first script, the one without prepared statements. I’ve quoted the values using PDO by hand, because when using prepared statements, the arguments are automatically quoted and I’ve tried to replicate the same behaviour as close as possible.
set_time_limit( 0 );
$pdo = new PDO( DSN, DB_USER, DB_PASS );
$total = 0;
for( $tests = 0; $tests < 100; $tests++ ) {
$start = microtime( true );
for( $index = 0; $index < 1000; $index++ ) {
$pdo->exec(
sprintf(
'INSERT INTO test_table (numeric, string) VALUES ( %d, %s )',
$pdo->quote( $index, PDO::PARAM_INT ),
$pdo->quote( md5( $index ), PDO::PARAM_STR )
)
);
}
$end = microtime( true );
$total += ( $end - $start );
}
echo 'Average: ' . number_format( $total / 100 , 4, '.', ' ' ) . ' milliseconds'; // Average: 0.0708 milliseconds
And now the script with prepared statements:
set_time_limit( 0 );
$pdo = new PDO( DSN, DB_USER, DB_PASS );
$total = 0;
$stmt = $pdo->prepare( 'INSERT INTO test_table (numeric, string) VALUES ( ?, ? )' );
for( $tests = 0; $tests < 100; $tests++ ) {
$start = microtime( true );
for( $index = 0; $index < 1000; $index++ ) {
$stmt->execute( array( $index, md5( $index ) ) );
}
$end = microtime( true );
$total += ( $end - $start );
}
echo 'Average: ' . number_format( $total / 100 , 4, '.', ' ' ) . ' milliseconds'; // Average: 0.0693 milliseconds
The script using prepared statement was faster by an astonishing 0.0015 milliseconds. Wow! A huge accomplishment. NOT! But maybe I’ve been using too simple queries that are compiled very fast and thus the result is irrelevant. So I’ve decided to try again, this time using a table with 10 columns and strings.
CREATE TABLE `second_test_table` ( `id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY , `c1` VARCHAR( 32 ) NOT NULL, `c2` VARCHAR( 32 ) NOT NULL, `c3` VARCHAR( 32 ) NOT NULL, `c4` VARCHAR( 32 ) NOT NULL, `c5` VARCHAR( 32 ) NOT NULL, `c6` VARCHAR( 32 ) NOT NULL, `c7` VARCHAR( 32 ) NOT NULL, `c8` VARCHAR( 32 ) NOT NULL, `c9` VARCHAR( 32 ) NOT NULL, `c10` VARCHAR( 32 ) NOT NULL )
Now the first test, without prepared statements:
set_time_limit( 0 );
$pdo = new PDO( DSN, DB_USER, DB_PASS );
$total = 0;
for( $tests = 0; $tests < 100; $tests++ ) {
$start = microtime( true );
for( $index = 0; $index < 1000; $index++ ) {
$pdo->exec(
sprintf(
'INSERT INTO second_test_table ( c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 ) VALUES ( %s, %s, %s, %s, %s, %s, %s, %s, %s, %s )',
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR ),
$pdo->quote( md5( srand() ), PDO::PARAM_STR )
)
);
}
$end = microtime( true );
$total += ( $end - $start );
}
echo 'Average: ' . number_format( $total / 100 , 4, '.', ' ' ) . ' miliseconds'; //Average: 0.2846 miliseconds
</pre>
Now the one with prepared statements:
<pre lang="php">
set_time_limit( 0 );
$pdo = new PDO( DSN, DB_USER, DB_PASS );
$total = 0;
$stmt = $pdo->prepare( 'INSERT INTO second_test_table ( c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 ) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )' );
for( $tests = 0; $tests < 100; $tests++ ) {
$start = microtime( true );
for( $index = 0; $index < 1000; $index++ ) {
$stmt->execute(
array(
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() ),
md5( srand() )
)
);
}
$end = microtime( true );
$total += ( $end - $start );
}
echo 'Average: ' . number_format( $total / 100 , 4, '.', ' ' ) . ' miliseconds'; // Average: 0.2784 miliseconds
Another 0.0062 milliseconds gained. Not good enough. The voices in my head started telling me that maybe I should read the manual and find out more about these prepared statements. How about that? That’s crazy enough to actually work. So I did. And found this out:
MySQL supports prepared statements in version 4.1 and above.
All I know about my MySQL server is that it the default version that comes bundled with Ubuntu Interprid Ibex, but don’t know which version it is. A quick fix to that:
tudor@thor:~$ mysql -V mysql Ver 14.12 Distrib 5.0.67, for debian-linux-gnu (i486) using readline 5.2
Okay, than it should work! But the it doesn’t. An improvement of only 0.0062 milliseconds in the execution time of a thousand queries is far beyond lame. Why would anyone bother for that? Something doesn’t smell good here (yeah, go ahead and make a gym socks joke, I haven’t heard that in a while).
After staring to my monitor for about 5 minutes, I’ve noticed that all the rows in the database contain the same information. That’s because of srand(). If no seed is given, srand takes the current timestamp as a seed for generating random numbers. And since all the queries are executed in the same second, it’s quite logical that they’re all identical, so they could trigger a MySQL query caching system I don’t know about (yet). ( Later edit: these tests were conducted very late at night so I’ve mistaken srand() for rand(). Coding late at night sucks, because it leads to this kind of errors and to some incredibly lame explanations on why things happen the way they do.) So I’ve decided on a last try, now with different queries.
Without prepared statements:
set_time_limit( 0 );
$pdo = new PDO( DSN, DB_USER, DB_PASS );
$total = 0;
for( $tests = 0; $tests < 100; $tests++ ) {
$start = microtime( true );
for( $index = 0; $index < 1000; $index++ ) {
$pdo->exec(
sprintf(
'INSERT INTO second_test_table ( c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 ) VALUES ( %s, %s, %s, %s, %s, %s, %s, %s, %s, %s )',
$pdo->quote( md5( $index . 1 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 2 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 3 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 4 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 5 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 6 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 7 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 8 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 9 ), PDO::PARAM_STR ),
$pdo->quote( md5( $index . 10 ), PDO::PARAM_STR )
)
);
}
$end = microtime( true );
$total += ( $end - $start );
}
echo 'Average: ' . number_format( $total / 100 , 4, '.', ' ' ) . ' miliseconds'; //Average: 0.2833 miliseconds
…and with prepared statements…
set_time_limit( 0 );
$pdo = new PDO( DSN, DB_USER, DB_PASS );
$total = 0;
$stmt = $pdo->prepare( 'INSERT INTO second_test_table ( c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 ) VALUES ( ?, ?, ?, ?, ?, ?, ?, ?, ?, ? )' );
for( $tests = 0; $tests < 100; $tests++ ) {
$start = microtime( true );
for( $index = 0; $index < 1000; $index++ ) {
$stmt->execute(
array(
md5( $index . 1 ),
md5( $index . 2 ),
md5( $index . 3 ),
md5( $index . 4 ),
md5( $index . 5 ),
md5( $index . 6 ),
md5( $index . 7 ),
md5( $index . 8 ),
md5( $index . 9 ),
md5( $index . 10 )
)
);
}
$end = microtime( true );
$total += ( $end - $start );
}
echo 'Average: ' . number_format( $total / 100 , 4, '.', ' ' ) . ' miliseconds'; // Average: 0.2068 miliseconds
Now it starting to make some sense – 0.0765 milliseconds difference for 1000 insert queries on a 10 columns table. Still not enough, but it’s okay.
My conclusion on prepared statements
I’m going to stick with them, because they allow a much better separation between the query and the data being feed into it, preventing by default SQL injection attacks. Also, this separation improves the code’s readability and in some cases using prepared statements can lead to an improved performance, especially when dealing with repetitive queries.
These are called “parameterized queries” in .NET. It’s the best way to write queries as you don’t bother about how to send the data to the server (stuff like “how do I format my float values depending on the language of the SQL server and that of the .Net local culture?”). It allows the system to “cache” the structure of the query and just send the parameters.
But I don’t think the gain is in INSERT statements, but rather in complex SELECTs with many JOINs that may not be on foreign keys.
Like Siderite, I’ve worked with the .NET version of the PHP “prepared statements”, also found it useful and came in handy every time.
A little hint for MSSQL:
http://msdn.microsoft.com/en-us/library/aa260835.aspx
Raul> are you turning to the dark side?