The Cache: Technology Expert's Forum
 
Author Topic: XSL versus Serialize/template  (Read 9507 times)
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« on: November 24, 2008, 07:42:23 PM »

So my network of sites is big, and it grows bigger on its own, as many of you can imagine.

I was building my pages by caching the content and other stuff in an XML file, and using XSLT to transform that XML with an XSL template. This works great, however, it seems to bog down a bit, especially on cheapo shared hosts that have hissy fits if you tax the server at all, understandably.

So I finally got a "piss off, you're using too much" email, which made me reconsider XSLT vs serialize.

I have translated all my code to store the content and data in a serialized array in a file.
And my templates are simple str_replaces for now, though some will need to be regex for loops and such.

The verdict is...

XSLT IS A PIECE OF SHIT.
powerful as hell, but a piece of shit nonetheless.
at least that's what it seems like.

the serialize method is much faster from what I can tell.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #1 on: November 24, 2008, 10:02:13 PM »

Very disappointed to hear that. Since I'm almost completely old-style HTML (no visual constructs) with utter CSS formatting, I was just starting to think about moving back towards XSLT and hitting you up.

Undecided
Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
jairez
Expert
****
Offline Offline

Posts: 164


JTFC


View Profile
« Reply #2 on: November 25, 2008, 12:35:19 AM »

So I'm curious here .... can you give an example of serialization?  I too have given some consideration to XSL/XSLT - hell, I even have a book. Roll Eyes

Thanx.
Logged

Spontaneity has it's time and place.  [Sluggo, 1990-ish]
DangerMouse
Expert
****
Offline Offline

Posts: 244



View Profile
« Reply #3 on: November 25, 2008, 02:33:20 AM »

Yeah, disappointed to hear this too. XSL, XPath etc. are such cool technology, shame they can't work out the performance issues.

DM
Logged
nutballs
« Reply #4 on: November 25, 2008, 10:19:08 AM »

ok I just did some testing.

first off, i now know my real bottleneck is out of my control, amazon... <shakefist>

serialize is twice as fast as xslt, though we are talking microtime here.
.0012sec vs .0024sec consistently.

but considering when getting spidered I sometimes am doing 100 pages a second for a few seconds, when feeding cached data, that adds up.

It is possible that I streamlined some code along the way. Since this is a real-world program that I used for testing, it's probably not a legit comparison. I am going to try to figure out how to test it for realz, because frankly, I would prefer to use XSLT because of standards.

@jairez
an example of serializing would be like this.
I get data from amazon XML service.
I pull out the pieces that I need and store them into an array with a structure like:

product=>
  asin=>1234567890
  title=>some book
  reviews=>
   review=>this book is awesome!!
   review=>this book blows!!
  url=>http://makememoneybitches.com

I then serialize that and store it to a file, so I never have to get the product from amazon again. I could just as well store it in a DB, but this application warrants a file based storage system (zero setup).
Then on page load, I unserialize the file, and replace tokens in my template with matching elements in the array structure.
Technically, this is the same exact process as XSLT, though XSLT is robust out of the box and uses standardized XML methodology. The serialize method is only as robust as I make it, which is good in a lot of respects, but the structure is mine and could cause issues with other applications that don't know about its lack of standards, such as not encoding characters like & < >.
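For anyone following along, the cache-then-template flow described above might look roughly like this. It is a minimal sketch, not the actual site code: the product values, the temp-file location and the two-token template are made up for illustration, and the reviews are stored as a list because a PHP array cannot hold two 'review' keys. htmlspecialchars() is added to cover the & < > encoding gap mentioned above.

```php
<?php
// Hypothetical product data; in the thread this comes from Amazon's XML service.
$product = [
    'asin'    => '1234567890',
    'title'   => 'some book',
    'reviews' => ['this book is awesome!!', 'this book blows!!'],
];

// Cache step: flatten the array to a string and write it to a file.
$cacheFile = tempnam(sys_get_temp_dir(), 'prod');
file_put_contents($cacheFile, serialize($product));

// Page-load step: read the file back and rebuild the array.
$cached = unserialize(file_get_contents($cacheFile));

// Replace tokens in the template with matching array elements,
// encoding & < > on the way out since serialize() does not do it.
$template = '<h1>%%title%%</h1><p>ASIN: %%asin%%</p>';
$page = str_replace(
    ['%%title%%', '%%asin%%'],
    [htmlspecialchars($cached['title']), htmlspecialchars($cached['asin'])],
    $template
);

unlink($cacheFile);
echo $page, "\n";
```

The array that comes back from unserialize() is identical to the one that went in, so the page-load side never has to touch the XML service.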

nutballs
« Reply #5 on: November 25, 2008, 10:45:27 AM »

alrighty. I just did a pure test.

serialize is 3 times faster very consistently.
0.0006 seconds for serialize
0.0018 seconds for XSLT


neither is outputting anything, just the pure processing of the template with the product file. so the time to spit out the page is not factored in.
of course as a serialized version gets more complex, such as more template details, then you might start to lose the benefit.
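Single-run timings this small are noisy, so one way to firm up the comparison is to average over many iterations. The sketch below assumes a hypothetical benchmark() helper (not part of PHP) and uses a tiny in-memory product instead of the real files; the closure body would be swapped for either the XSLT or the serialize code path.

```php
<?php
/**
 * Average wall-clock seconds per call of $fn over $iterations runs.
 * benchmark() is a hypothetical helper written for this example.
 */
function benchmark(callable $fn, int $iterations = 1000): float
{
    $start = microtime(true); // float seconds, no explode() needed
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (microtime(true) - $start) / $iterations;
}

// Stand-in for the cached product file contents.
$data = serialize(['title' => 'some book', 'price' => '9.99']);

$perCall = benchmark(function () use ($data) {
    $product = unserialize($data);
    str_replace('%%title%%', $product['title'], '<h1>%%title%%</h1>');
});

echo 'Average: ' . number_format($perCall, 7, '.', '') . " seconds per page\n";
```

Running both variants through the same helper on the same host keeps file-system caching and server load roughly equal between them.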


XSLT version:
Code:
<?php
//XSLT
$gentime = microtime();
$gentime = explode(' ', $gentime);
$gentime = $gentime[1] + $gentime[0];
$pg_start = $gentime;

$doc = new DomDocument();
$xsl = new XSLTProcessor();
$doc->load('thetemplatefile');
$xsl->importStylesheet($doc);
$doc->load('theproductfile');
$pageout = $xsl->transformToXML($doc);


$gentime = microtime();
$gentime = explode(' ', $gentime);
$gentime = $gentime[1] + $gentime[0];
$pg_end = $gentime;
$totaltime = ($pg_end - $pg_start);
$showtime = number_format($totaltime, 4, '.', '');
echo("This page was generated in " . $showtime . " seconds");
?>


Serialized version
Code:
<?php
//serialize
$gentime = microtime();
$gentime = explode(' ', $gentime);
$gentime = $gentime[1] + $gentime[0];
$pg_start = $gentime;

$template = file_get_contents('thetemplatefile');
$product = unserialize(file_get_contents('theproductfile'));


$pageout = generatehtmloutput($template, $product);


$gentime = microtime();
$gentime = explode(' ', $gentime);
$gentime = $gentime[1] + $gentime[0];
$pg_end = $gentime;
$totaltime = ($pg_end - $pg_start);
$showtime = number_format($totaltime, 4, '.', '');
echo("This page was generated in " . $showtime . " seconds");


function generatehtmloutput($template, $product)
{
    $out = $template;
    $out = str_replace('%%domain%%', 'aaaaa', $out);
    $out = str_replace('%%metatitle%%', $product['metatitle'], $out);
    $out = str_replace('%%title%%', $product['title'], $out);
    $out = str_replace('%%price%%', $product['price'], $out);
    $out = str_replace('%%asin%%', '1111111111', $out);
    $out = str_replace('%%content%%', $product['content'], $out);

    $buff = '';
    foreach ($product['nodeproducts'] as $item)
    {
        $buff .= '<a href="'.$item['asin'].'.html">'.$item['title']."</a><br />\n";
    }
    $out = str_replace('%%nodeproducts%%', $buff, $out);

    $buff = '';
    foreach ($product['similarproducts'] as $item)
    {
        $buff .= '<a href="'.$item['asin'].'.html">'.$item['title']."</a><br />\n";
    }
    $out = str_replace('%%similarproducts%%', $buff, $out);

    $buff = '';
    foreach ($product['accessories'] as $item)
    {
        $buff .= '<a href="'.$item['asin'].'.html">'.$item['title']."</a><br />\n";
    }
    $out = str_replace('%%accessories%%', $buff, $out);

    $buff = '';
    foreach ($product['interlinks'] as $item)
    {
        $buff .= '<a href="'.$item['url'].'">'.$item['name']."</a><br />\n";
    }
    $out = str_replace('%%interlinks%%', $buff, $out);

    $buff = '';
    $buff = displayattributes($product['attributes']);
    $out = str_replace('%%attributes%%', $buff, $out);

    return $out;
}
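function displayattributes($att)
{
    $out = '';
    foreach ($att as $key => $value)
    {
        // Recurse into nested attribute groups. is_array() also handles
        // one-element groups and avoids calling count() on a plain string.
        if (is_array($value))
        {
            $out .= displayattributes($value);
        }
        else
        {
            $out .= $key.' : '.$value."<br />\n";
        }
    }
    return $out;
}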
?>

perkiset
« Reply #6 on: November 25, 2008, 11:09:20 AM »

NBs I hope you don't mind, but I think this might go a little faster for you:

Code:
<?php
$start = microtime_float();

$template = file_get_contents('thetemplatefile');
$product = unserialize(file_get_contents('theproductfile'));

$pageout = generatehtmloutput($template, $product);

$diff = microtime_float() - $start;
echo("This page was generated in $diff seconds");


function generatehtmloutput($template, $product)
{
    // Start with empty arrays so implode() still works when a list is empty.
    $prodArr = $simArr = $accArr = $interArr = array();

    foreach ($product['nodeproducts'] as $item)
        $prodArr[] = <<<HTML
<a href="{$item['asin']}.html">{$item['title']}</a><br />
HTML;

    foreach ($product['similarproducts'] as $item)
        $simArr[] = <<<HTML
<a href="{$item['asin']}.html">{$item['title']}</a><br />
HTML;

    foreach ($product['accessories'] as $item)
        $accArr[] = <<<HTML
<a href="{$item['asin']}.html">{$item['title']}</a><br />
HTML;

    foreach ($product['interlinks'] as $item)
        $interArr[] = <<<HTML
<a href="{$item['url']}">{$item['name']}</a><br />
HTML;

    // Parallel search/replace arrays let str_replace() handle every token in one call.
    $search[] = '%%domain%%'; $replace[] = 'aaaaa';
    $search[] = '%%metatitle%%'; $replace[] = $product['metatitle'];
    $search[] = '%%title%%'; $replace[] = $product['title'];
    $search[] = '%%price%%'; $replace[] = $product['price'];
    $search[] = '%%asin%%'; $replace[] = '1111111111';
    $search[] = '%%content%%'; $replace[] = $product['content'];
    $search[] = '%%nodeproducts%%'; $replace[] = implode(chr(10), $prodArr);
    $search[] = '%%similarproducts%%'; $replace[] = implode(chr(10), $simArr);
    $search[] = '%%accessories%%'; $replace[] = implode(chr(10), $accArr);
    $search[] = '%%interlinks%%'; $replace[] = implode(chr(10), $interArr);
    $search[] = '%%attributes%%'; $replace[] = displayattributes($product['attributes']);

    return str_replace($search, $replace, $template);
}
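function displayattributes($att)
{
    $out = '';
    foreach ($att as $key => $value)
    {
        // Recurse into nested attribute groups. is_array() also handles
        // one-element groups and avoids calling count() on a plain string.
        if (is_array($value))
        {
            $out .= displayattributes($value);
        }
        else
        {
            $out .= $key.' : '.$value."<br />\n";
        }
    }
    return $out;
}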

function microtime_float()
{
    list($usec, $sec) = explode(' ', microtime());
    return ((float)$usec + (float)$sec);
}

?>


I think there's a faster way to do your attribute dereferencing as well, but I don't have the time ATM.

/p
perkiset
« Reply #7 on: November 25, 2008, 11:13:44 AM »

One more thing - if you go too far beyond this number of search/replaces, you may find that going to a preg_replace_callback works out better, since you would have access to every %%whatever%% variable in a single pass, using a switch statement to do the replacement for you. The preg_ funcs pull in a lot of overhead, but with this number of replacements it's getting close to being better, I think. But this should still be a little faster, on first blush.
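A rough sketch of that single-pass idea, assuming the same %%token%% convention as the template code in this thread. renderTokens() and the sample values are hypothetical, and whether this beats str_replace() at a given token count would need measuring.

```php
<?php
// renderTokens() is a hypothetical helper: one regex pass over the template,
// with a switch deciding how each %%token%% gets replaced.
function renderTokens(string $template, array $values): string
{
    return preg_replace_callback(
        '/%%(\w+)%%/',
        function (array $m) use ($values) {
            switch ($m[1]) {
                case 'domain':
                    return 'aaaaa';
                case 'price':
                    return '$' . $values['price'];
                default:
                    // Plain lookup; unknown tokens are left in place untouched.
                    return array_key_exists($m[1], $values) ? $values[$m[1]] : $m[0];
            }
        },
        $template
    );
}

echo renderTokens(
    '%%title%% costs %%price%% at %%domain%% (%%missing%%)',
    ['title' => 'some book', 'price' => '9.99']
), "\n";
```

The callback runs only for tokens that actually appear in the template, whereas the str_replace() version scans for every token whether it is present or not.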
nutballs
« Reply #8 on: November 25, 2008, 11:24:12 AM »

cool i will try that and see, no time right now though. maybe later tonight.

I thought str_replace was almost always going to be faster than regex, though I see what you're saying: with preg_replace, I could possibly do it in a single line.

jairez
« Reply #9 on: November 25, 2008, 11:34:57 AM »

Quote
an example of serializing would be like this.

Got it.  Thanx, NB.  Great stuff!
rcjordan
Lifer
*****
Offline Offline

Posts: 882


View Profile
« Reply #10 on: November 25, 2008, 11:48:36 AM »

OK, I'm confused. What does the serialized file look like? Just a flatfile?
Logged
perkiset
« Reply #11 on: November 25, 2008, 11:53:45 AM »

Quote
I thought str_replace was almost always going to be faster than regex, though I see what you're saying: with preg_replace, I could possibly do it in a single line.

It really depends on the complexity of the replace. Much like what you have outlined here, I have parameterized replacements in my HTML. For example:
[%currency:123.4]
[%graphicspath:secure]

...where the portion after the colon is per-replacement parameter(s) for how the replacement should go. So with a single preg_replace_callback, I look for something like this:
Code:
\[\%([^:]+):([^\]]*)\]
then in a single pass over the entire string I can pick up the replacement type and the way I want it replaced. In the first example, I will replace that string with the surfer's appropriate currency symbol, the recalculated amount of 123.4 into today's currency value and then format it. So in the end I have a single pattern, a single callback and only a switch for processing the data.
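As an illustration of the parameterized version, here is a minimal sketch. expandTags(), the flat conversion rate, the currency symbol and the /img/ path layout are all assumptions made up for the example; the real callback would look up the surfer's currency and today's rate.

```php
<?php
// expandTags() is a hypothetical helper. Group 1 of the pattern is the
// replacement type, group 2 is the per-replacement parameter.
function expandTags(string $html, string $symbol, float $rate): string
{
    return preg_replace_callback(
        '/\[%([^:\]]+):([^\]]*)\]/',
        function (array $m) use ($symbol, $rate) {
            $type  = $m[1];
            $param = $m[2];
            switch ($type) {
                case 'currency':
                    // Convert the amount, then format to two decimals.
                    return $symbol . number_format((float)$param * $rate, 2, '.', '');
                case 'graphicspath':
                    // Assumed directory layout, for illustration only.
                    return '/img/' . $param . '/';
                default:
                    // Unknown types are left as written.
                    return $m[0];
            }
        },
        $html
    );
}

echo expandTags('Price: [%currency:123.4] <img src="[%graphicspath:secure]logo.png">', '$', 2.0), "\n";
```

Adding a new replacement type is then just another case in the switch, with no extra pass over the page.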

Just

Side: Heyya Jairez!
« Last Edit: November 25, 2008, 12:02:17 PM by perkiset »
perkiset
« Reply #12 on: November 25, 2008, 11:59:08 AM »

Quote
OK, I'm confused. What does the serialized file look like? Just a flatfile?

serialize and unserialize convert arrays to/from a text string. For example, this array: ('a'=>'apple', 'b'=>'orange', 'c'=>'banana') would convert to:

a:3:{s:1:"a";s:5:"apple";s:1:"b";s:6:"orange";s:1:"c";s:6:"banana";}

... which is easily read/writeable to a textfile, obviously. This is about as fast as you can move text data into a complex array in PHP. So previously, NBs would have taken his array and done something like file_put_contents('thefile.txt', serialize($myArr)); so that he could retrieve it in this fashion later.
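The round trip is easy to try for yourself. This small sketch uses the same three-element array as above, plus a made-up nested array to show how sub-arrays and integer keys are encoded.

```php
<?php
$myArr = ['a' => 'apple', 'b' => 'orange', 'c' => 'banana'];

// Flatten to a string.
$flat = serialize($myArr);
echo $flat, "\n";
// a:3:{s:1:"a";s:5:"apple";s:1:"b";s:6:"orange";s:1:"c";s:6:"banana";}

// Rebuild the array from the string; keys, values and order are preserved.
$restored = unserialize($flat);
var_dump($restored === $myArr); // bool(true)

// Nested arrays are encoded recursively; list items get integer keys (i:0, i:1).
$nested = serialize(['reviews' => ['good', 'bad']]);
echo $nested, "\n";
// a:1:{s:7:"reviews";a:2:{i:0;s:4:"good";i:1;s:3:"bad";}}
```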
nutballs
« Reply #13 on: November 25, 2008, 11:59:36 AM »

Quote
OK, I'm confused. What does the serialized file look like? Just a flatfile?

yep. what serialize does is take an array, multidimensional or not, and flatten it into a string format.
kind of like CSV for example.
here is a snippet.
a:14:{s:4:"asin";s:10:"0316015849";s:5:"title";s:36:"Twilight (The Twilight Saga, Book 1)";s:9:"metatitle";s:47

so you can see a:14: means an array of 14 elements follows inside {}
s:4 means string, 4 long

It's a way to retain an array beyond the usage of that array, or from one session to another, or to send an array to another website, like RPC. Make sense?
rcjordan
« Reply #14 on: November 25, 2008, 12:14:05 PM »

>make sense

Close, but I'm not quite locked in. Keep holding my hand, hhh!

In the above example code using Amazon, is 'theproductfile' written as a separate file for each unique product, OR is it one large file of the entire catalog with each serialized record stored as a line in the flatfile?
