The Cache: Technology Expert's Forum
 
*
Welcome, Guest. Please login or register. October 19, 2019, 01:57:58 PM

Login with username, password and session length


Pages: [1]
  Print  
Author Topic: MySQL MultiQuery Performance Testing & Results  (Read 3876 times)
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« on: September 25, 2008, 03:14:38 PM »

As many here know, I've been pushing hard on every angle I can for total efficiency on my servers, web services and sites. Among other things, I've been working with MySQL to squeeze every bit of speed out of my connection and database. This is one of the reasons for the new update to the dbConnection class and the multiQuery.

Perk's updated dbConnection class can be found HERE.

It's been long stated that multiple queries with mysqli_multi_query is more efficient, but I've not seen any hard, empirical evidence - and specifically no examples that really pertained to what I wanted to do. So I decided to tackle a little function I need for a current project and write it 3 different ways to demonstrate/prove or disprove the speed advantages of multi queries. Skip straight to the bottom if you're just looking for the money shot test results.

The Function
I use, what I call, complexRecords, which are essentially multi-dimensional arrays that represent everything about <something I want from the database>. This might be a contact record, which contains all the basics of a contact, all their address records, all their form-of-payment records, all their memos and pointers to all of their transactions. Or a transaction record, which contains the basics of a transaction, original items in that transaction, events on that transaction, then items and payments associated with each event. In many ways, I see my arrays the same way that I see XML: hierarchical constructs of data I need while I'm doing <something>.

The function I decided to test is fetchContactRecord(). This function is supposed to populate a contact record with everything I need to know about a contact during web functions like "maintaining my account" or executing a transaction. The array looks like this:
ContactRecord
{
   [basic]
      fname
      lname
      ...
   [addresses]
      [addressIDX0]
         addr1
         addr2
         city, state zip
         ...
      [addressIDX1]
         ...
   [fops]
      [fopIDX0]
         ccnum
         ccexpmonth
         ccexpyear
         ...
      [fopIDX1]
         ...
   [memos]
      [memoIDX0]
         caption
         detail
      [memoIDX1]
         ...
}

The Test Code
I extended my dbConnection class and added 3 functions: fetchContactRecord1(), fetchContactRecord2() and fecthContactRecord3(). The method for retrieval is as follows:
  • Version 1: Use multiple single-query calls to the query()
  • Version 2: Use a multiquery and a ptr/case statement to correctly place the incoming data
  • Version 3: Use multiquery and a serial set of retrieval statements to pull the data into the array

The code for the 3 different versions of the script: Note that this is a fragment of a class extension and is not usable stand-alone
Code:
<?php

function 
fetchContactRecord1($contactID, &$theArr)
{
// Write this as a local query pull
if (!$this->__connected) { $this->connect(); }

$this->query("select * from contacts where id=$contactID");
$this->fetchAssoc();
$theArr['contact'] = $this->row;

$this->query("select * from contact_addresses where contact_id=$contactID");
while ($this->fetchAssoc()) $theArr['addresses'][$this->row['id']] = $this->row;

$this->query("select * from contact_fops where contact_id=$contactID");
while ($this->fetchAssoc()) $theArr['fops'][$this->row['id']] = $this->row;

$this->query("select * from contact_memos where contact_id=$contactID");
while ($this->fetchAssoc()) $theArr['memos'][$this->row['id']] = $this->row;
}


function fetchContactRecord2($contactID, &$theArr)
{
$sql = <<<SQL
select * from contacts where id=$contactID;
select * from contact_addresses where contact_id=
$contactID;
select * from contact_fops where contact_id=
$contactID;
select * from contact_memos where contact_id=
$contactID;
SQL;

$ptr 0;
if (!$this->__connected) { $this->connect(); }
mysqli_multi_query($this->__myConnection$sql);
do 
{
$res mysqli_store_result($this->__myConnection);
while($row mysqli_fetch_assoc($res))
{
switch($ptr)
{
case :
$theArr['contact'] = $row;
break;

case :
$theArr['addresses'][$row['id']] = $row;
break;

case :
$theArr['fops'][$row['id']] = $row;
break;

case :
$theArr['memos'][$row['id']] = $row;
break;
}
}
$res->close();
$ptr++;
} while(mysqli_next_result($this->__myConnection));
}


function fetchContactRecord3($contactID, &$theArr)
{
$sql = <<<SQL
select * from contacts where id=$contactID;
select * from contact_addresses where contact_id=
$contactID;
select * from contact_fops where contact_id=
$contactID;
select * from contact_memos where contact_id=
$contactID;
SQL;

if (!$this->__connected) { $this->connect(); }
mysqli_multi_query($this->__myConnection$sql);

$res mysqli_store_result($this->__myConnection);
$theArr['contact'] = mysqli_fetch_assoc($res);
$res->close();

mysqli_next_result($this->__myConnection);
$res mysqli_store_result($this->__myConnection);
while ($row mysqli_fetch_assoc($res)) $theArr['addresses'][$row['id']] = $row;
$res->close();

mysqli_next_result($this->__myConnection);
$res mysqli_store_result($this->__myConnection);
while ($row mysqli_fetch_assoc($res)) $theArr['fops'][$row['id']] = $row;
$res->close();

mysqli_next_result($this->__myConnection);
$res mysqli_store_result($this->__myConnection);
while ($row mysqli_fetch_assoc($res)) $theArr['memos'][$row['id']] = $row;
$res->close();
}
?>


The data is as follows: there are about 10K contact records - I am obviously only grabbing one. The contact I'm using has 5 addresses, 7 forms of payment and about 90 memos. The script I wrote to test looks like this:

Code:
#!/usr/local/bin/php
<?php

require "source/class.dbExtension.php";
$dbx = new dbExtension('127.0.0.1''user''pass''thedatabase');
$theArr = array();

$start time();
for (
$i=0$i<100000$i++)
$dbx->fetchContactRecord1(1$theArr);

echo 
"done: " date('n:s'time() - $start);

?>


I changed the number of times the loop ran from 10K to 50K to 100K and the results were consistent and very interesting.

Results
At 10K iterations the results were rough, at 100K the resolution was pretty good. I found that:
  • Version 1 of the function at 10K executed in 13.5 secs (average)
  • Version 2 ran at 11.2 seconds (average)
  • Version 3 ran at 10 even.

Version 2 was a 19% increase in performance over version 1, version 3 was a 26% increase in performance. I changed the iteration count and the essential findings were confirmed. But I was curious: What happened if I changed the number of rows in a table and the number of queries?

The next thing I did was to triple the number of queries (I copy/pasted the 4 original queries 3 times to make 12 queries). This was where the difference was most strong:
  • Version 1 performed 10K iterations in 38 seconds
  • Versions 2 & 3 performed in 23 seconds - a 40% increase in performance! Again, I increased the iterations to verify the results. I also got to watch a lot of MSNBC during this part of the test /yawn/.

It was interesting to me that version 2 & 3 performed almost identically in the last test, but this seemed clearly because the number of rows per statements was low. So although I could predict the next findings, I tested anyway - I changed the test to a 1000 record pull from a single statement and executed that 10K times (removing the "multiqueryness" from the equation) - as expected, there was virtually no difference in the results between version 1 and 3, but version 2 was slightly slower (about 5%) because of the switch statement being executed on every single record retrieval.

Conclusion
  • If you have a single query, or perhaps even 2 simple queries, you will not see much benefit in multi queries, although it is considerably easier to code, and, arguably, every little bit helps your server.
  • If you have many many rows you are retrieving, put as little code in between each retrieval as you possibly can. Duh.
  • If you have several queries to execute with an unknown number of rows, you will experience a definite and important increase in efficiency, as well as a lowering of the total workload on the server - particularly if you are doing all of this on a single box.
« Last Edit: September 25, 2008, 03:21:17 PM by perkiset » Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
dbrown
Rookie
**
Offline Offline

Posts: 28


View Profile
« Reply #1 on: October 05, 2008, 08:23:54 PM »

nice... Applause
Logged
Pages: [1]
  Print  
 
Jump to:  

Perkiset's Place Home   Best of The Cache   phpMyIDE: MySQL Stored Procedures, Functions & Triggers
Politics @ Perkiset's   Pinkhat's Perspective   
cache
mart
coder
programmers
ajax
php
javascript
Powered by MySQL Powered by PHP Powered by SMF 1.1.2 | SMF © 2006-2007, Simple Machines LLC
Seo4Smf v0.2 © Webmaster's Talks


Valid XHTML 1.0! Valid CSS!