perkiset

This is an XML handler that I wrote quite a while ago. It can either parse or create XML for you and is pretty darn fast, although I wouldn't go head to head with any of the new PHP 5 XML handlers. It will run perfectly in PHP 4 with no external lib requirements.

I'll put up some code on how to use it in my next post.


<?php

class xmlNode {
var $tag = "unassigned";
var $data = "";
var $children = array();
var $attributes = array();
var $parentID = 0;
var $mID = 0;
var $handler = null;
        var $attributeNames = array();
        var $attributeValues = array();


function xmlNode($theHandler)
{
$this->handler = $theHandler;
}
function _attribStr()
{
$out = "";
$theCount = count($this->attributeNames);
for ($i=0; $i<$theCount; $i++)
{
if ($this->attributeValues[$i])
{
$out .= ' ' . $this->attributeNames[$i] . '="' . $this->attributeValues[$i] . '"';
}
}
return $out;
}
function _nextIsClose()
{
// This function is called by nodes that want to know if the next
// item in the tag list is actually <themseleves> closing...
$ptr = $_SERVER['inputPtr'];
return (($_SERVER['inputXML'][$ptr] == '<') and ($_SERVER['inputXML'][$ptr + 1] == '/'));
}
function addChild($theTag, $theValue)
{
$myPtr = count($this->children);
$theHandler = &$this->handler;
$masterPtr = $theHandler->__newXMLNode();
$this->children[$myPtr] = $masterPtr;
$theArray = &$theHandler->__xmlNodeArray;
$tempNode = &$theArray[$masterPtr]; // take a reference so PHP 4 does not edit a copy
$tempNode->parentID = $this->mID;
$tempNode->tag = $theTag;
$tempNode->data = $theValue;
return $masterPtr;
}
function addChildPtr($thePtr)
{
$ptr = count($this->children);
$this->children[$ptr] = $thePtr;
}
function attributeValue($attrName)
{
$attrName = strtolower($attrName);
for ($i=0; $i<count($this->attributeNames); $i++)
{
if ($this->attributeNames[$i] == $attrName)
{
return $this->attributeValues[$i];
}
}
return '';
}
function buildXML($indent)
{
$myIndent = '';
if (! $this->handler->forAjaxMS)
{
$myIndent = substr($this->handler->indentTemplate, 0, $indent * 4);
}

$out = $myIndent . '<' . $this->tag . $this->_attribStr();

if ( (strlen($this->data) ==0) and (!count($this->children)) )
{
$out .= '/>';
$_SERVER['outputArray'][] = $out;
return;
}

// I have children...
if ($this->children)
{
$_SERVER['outputArray'][] = $out . '>';
$out = '';
$theCount = count($this->children);
$theHandler = &$this->handler;
$theTag = $this->tag;
for ($i=0; $i<$theCount; $i++)
{
$childPtr = $this->children[$i];
$theNode = &$theHandler->__xmlNodeArray[$childPtr];
$theNode->buildXML($indent + 1);
}
$_SERVER['outputArray'][] = $myIndent . '</' . $this->tag . '>';
return false;
}

$out .= '>' . $this->data . '</' . $this->tag . '>';
$_SERVER['outputArray'][] = $out;

}
function dump()
{
$out = "tag=" . $this->tag . ', ';
$out .= 'attribs=[' . trim($this->_attribStr()) . '] ';
$out .= 'data="' . $this->data . '" ';
if ($this->children)
{
$theCount = count($this->children);
$out .= chr(10);
for ($i=0; $i<$theCount; $i++)
{
if ($i > 0) { $out .= chr(10); }
$out .= "  child[" . $i . '] = ' . $this->children[$i];
}
} else $out .= " No children";
return $out;
}
function hasChildren() { return (count($this->children) > 0); }
function importNode()
{
// Get the entire opening tag...
$workPtr = $_SERVER['inputPtr'] + 1;
$workPtr--;
$startPos = $workPtr;
$endOfTag = strpos($_SERVER['inputXML'], '>', $workPtr);
$realLen = $endOfTag - $startPos;
$workStr = substr($_SERVER['inputXML'], $startPos, $realLen);
$workPtr = $endOfTag + 1;
$_SERVER['inputPtr'] = $workPtr;

// Find where the name of the tag ends and grab it...
$emptyTag = false;
if (strpos($workStr, '/') > 0)
{
$emptyTag = true;
$workStr = substr($workStr, 0, strlen($workStr) - 1);
}
$ptr = strlen($workStr);
$space = strpos($workStr, ' ');
if (!$space) { $space = 99999; }
if ($space < $ptr) { $ptr = $space; }
$this->tag = strtolower(trim(substr($workStr, 1, $ptr)));

$workStr = substr($workStr, $ptr + 1, 1024);


// set the start-of-search pointer to where we just ended the tag name...
$startSearch = $ptr;

// Time for the attributes...
$attrArr = explode('" ', $workStr);
$this->attributeNames = array();
$this->attributeValues = array();
for ($i=0; $i<count($attrArr); $i++)
{
if (! strpos($attrArr[$i], '=')) { continue; }
$ptr = count($this->attributeNames);
$parts = explode('=', $attrArr[$i]);
$this->attributeNames[$ptr] = trim(strtolower($parts[0]));
$this->attributeValues[$ptr] = str_replace('"', '', $parts[1]);
}


// If I am an empty tag then there is no more to do...
if ($emptyTag) { return false; }


// If the next character in the main buffer is anything other than a '<' then there
// is text for me to collect, and the tag must, by XML rules, be all done...
if ($_SERVER['inputXML'][$workPtr] <> '<')
{
$textEnd = strpos($_SERVER['inputXML'], '<', $workPtr);
$realLen = $textEnd - $workPtr;
$this->data = substr($_SERVER['inputXML'], $workPtr, $realLen);
$ptr = $_SERVER['inputPtr'] + $realLen + 1;
$ptr = strpos($_SERVER['inputXML'], '>', $ptr) + 1;
$_SERVER['inputPtr'] = $ptr;
return false;
}


// OK: If I am here, then either there are children OR the very next tag is <me> closing
// and I was empty after all.
while ((!$this->_nextIsClose()) && ($_SERVER['inputPtr'] < $_SERVER['inputEnd']))
{
$masterPtr = $this->addChild('', '');
$theHandler = &$this->handler;
$tempNode = &$theHandler->__xmlNodeArray[$masterPtr];
$tempNode->parentID = $this->mID;
$tempNode->importNode();
}


// Right here we have allowed all children to import themselves and
// <my> closing tag is all that is left...
$workPtr = $_SERVER['inputPtr'] + 1;
$workPtr = strpos($_SERVER['inputXML'], '>', $workPtr) + 1;
$_SERVER['inputPtr'] = $workPtr;
}
function updateAttribute($theName, $theValue)
{
$theName = strtolower($theName);
$theCount = count($this->attributeNames);
for ($i=0; $i<$theCount; $i++)
{
if ($this->attributeNames[$i] == $theName)
{
$this->attributeValues[$i] = $theValue;
return true;
}
}
// It isn't in the list yet...
$this->attributeNames[$theCount] = $theName;
$this->attributeValues[$theCount] = $theValue;
}
}

class xmlHandler {
var $__xmlNodeArray = array();
var $rootPtr = 0;
var $rootNode = null;
var $currentPtr = 0;
var $originalXML = '';
var $forAjaxMS = false; // Used when formatting output...
var $indentTemplate = '                                                                                                  ';

function xmlHandler() { $this->clear(); }
function __newXMLNode()
{
$ptr = count($this->__xmlNodeArray);
$tempNode = new xmlNode($this);
$tempNode->mID = $ptr;
$tempNode->handler = &$this;
$this->__xmlNodeArray[$ptr] = &$tempNode;
return $ptr;
}
function addChild($tagName, $tagValue, $stepIn=false)
{
$currentNode = &$this->__xmlNodeArray[$this->currentPtr];
$currentChildren = &$currentNode->children;
$newPtr = $this->__newXMLNode();
$newNode = &$this->__xmlNodeArray[$newPtr];
$newNode->tag = $tagName;
$newNode->data = $tagValue;
$newNode->parentID = $currentNode->mID;

$currentNode->addChildPtr($newPtr);

if ($stepIn)
{
$this->currentPtr = $newPtr;
}
return $newPtr;
}
function attributeValue($theName)
{
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
return $theNode->attributeValue($theName);
}
function childData($theIDX)
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
$childPtr = $tempNode->children[$theIDX];
$tempNode = &$this->__xmlNodeArray[$childPtr];
return $tempNode->data;
}
function childDataNamed($tagName)
{
$retData = "";
$holdPtr = $this->currentPtr;
if ($this->stepInto($tagName))
{
$retData = $this->currentData();
$this->currentPtr = $holdPtr;
}
return $retData;
}
function &childNode($theIDX)
{
$retNode = null;
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
if (($theIDX >=0) and ($theIDX < count($theNode->children)))
{
$ptr = $theNode->children[$theIDX];
$retNode = &$this->__xmlNodeArray[$ptr];
}
return $retNode;
}
function &childNodeNamed($tagName)
{
$retNode = null;
$holdPtr = $this->currentPtr;
if ($this->stepInto($tagName) )
{
$retNode = &$this->__xmlNodeArray[$this->currentPtr];
$this->currentPtr = $holdPtr;
}
return $retNode;
}
function childTag($theIDX)
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
$childPtr = $tempNode->children[$theIDX];
$tempNode = &$this->__xmlNodeArray[$childPtr];
return $tempNode->tag;
}
function clear()
{
$this->__xmlNodeArray = array();
$this->rootPtr = $this->__newXMLNode();
$this->rootNode = &$this->__xmlNodeArray[0];
$this->rootNode->parentID = 0; // special - the root's parent is itself...
$this->currentPtr = 0;
}
function currentChildCount()
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
return count($tempNode->children);
}
function currentData()
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
return $tempNode->data;
}
function &currentNode()
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
return $tempNode;
}
function currentNodeID() { return $this->currentPtr; }
function currentTag()
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
return $tempNode->tag;
}
function db_allRowsToChildren($dataSet, $pkName)
{
$pkName = strtolower($pkName);
$this->updateAttribute('pkey', $pkName);
$max = mysql_num_fields($dataSet);
$row = mysql_fetch_row($dataSet);
while ($row)
{
$this->addChild('temp', '', true);
$thisName = ''; // reset for each row; set below when the pkey column is seen
for($i=0; $i<$max; $i++)
{
$theName = mysql_field_name($dataSet, $i);
$theValue = $row[$i];
$theValue = str_replace('>', '[$fegt]', $theValue);
$theValue = str_replace('<', '[$felt]', $theValue);
if (strtolower($theName) == $pkName) { $thisName = $theValue; }
$this->addChild($theName, $theValue);
}
$this->updateTag($thisName);
$this->stepOut();
$row = mysql_fetch_row($dataSet);
}
}
function db_childrenToInsertSQL($tableName)
{
$out = array();
$out[] = "insert into $tableName (";
$max = $this->currentChildCount();
for ($i=0; $i<$max; $i++)
{
$thisStr = '';
if ($i>0) { $thisStr = ', '; }
$thisStr .= $this->childTag($i);
$out[] = $thisStr;
}
$out[] = ')' . chr(10) . 'values(';
for ($i=0; $i<$max; $i++)
{
$thisStr = '';
if ($i>0) { $thisStr = ', '; }
$thisVal = $this->childData($i);
$thisVal = stripslashes($thisVal);
$thisVal = str_replace('[$felt]', '<', $thisVal);
$thisVal = str_replace('[$fegt]', '>', $thisVal);
$thisStr .= '"' . $thisVal . '"';
$out[] = $thisStr;
}
$out[] = ')'; // close the values( list opened above
return implode(' ', $out);
}
function db_childrenToUpdateSQL($tableName, $pkField)
{
$pkField = strtolower($pkField);
$out = array();
$out[] = "update $tableName set";
$max = $this->currentChildCount();
for ($i=0; $i<$max; $i++)
{
$thisName = strtolower($this->childTag($i));
$thisData = $this->childData($i);
if ($thisName == $pkField) { $pkData = $thisData; }
                        $thisData = stripslashes($thisData);
                        $thisData = str_replace('[$felt]', '<', $thisData);
                        $thisData = str_replace('[$fegt]', '>', $thisData);
$outStr = '';
if ($i > 0) { $outStr .= ','; }
$out[] = $outStr . $thisName . '="' . addslashes($thisData) . '"';
}
$out[] = "where $pkField=\"$pkData\"";
return implode(chr(10), $out);
}
function db_rowToChildren($dataSet)
{
$row = mysql_fetch_row($dataSet);
if ($row)
{
for ($i=0; $i<mysql_num_fields($dataSet); $i++)
{
$theName = mysql_field_name($dataSet, $i);
$theValue = $row[$i];
$theValue = str_replace('>', '[$fegt]', $theValue);
$theValue = str_replace('<', '[$felt]', $theValue);
$this->addChild($theName, $theValue);
}
}
}
function dump()
{
$theCount = count($this->__xmlNodeArray);
$out = array();
for ($i=0; $i<$theCount; $i++)
{
$tempNode = &$this->__xmlNodeArray[$i];
$out[] = "[$i] " . $tempNode->dump();
}
return implode(chr(10), $out);
}
function importXML($inputXML)
{
$this->clear();
$this->originalXML = $inputXML;

// if not an ajaxMS transfer...
if (! strpos($inputXML, 'ajaxms="1"'))
{
// ... then kill all unneccessary (and hindering) white space and comments...
$inputXML = eregi_replace(">" . "[[:space:]]+" . "<", "><", $inputXML);
$inputXML = eregi_replace(">" . "[[:space:]]+", ">", $inputXML);
$inputXML = eregi_replace("[[:space:]]+" . "<", "<", $inputXML);
$inputXML = eregi_replace("=" . "[[:space:]]+" . "\"", "=\"", $inputXML);
$inputXML = eregi_replace("<\?.*\?>", "", $inputXML);
}
$_SERVER['inputXML'] = $inputXML;
$_SERVER['inputPtr'] = 0;
$_SERVER['inputEnd'] = strlen($inputXML);

$theNode = &$this->__xmlNodeArray[0];
$theNode->importNode();
}
function moveToRoot() { $this->currentPtr = $this->rootPtr; }
function &node($ptr) { return $this->__xmlNodeArray[$ptr]; }
function outputAllXML($makePretty=false)
{
// Note that this will by default compress the XML,
// unless set to true - probably a user wanting to look at it.
$holdPtr = $this->currentPtr;
$this->moveToRoot();
$out = $this->outputCurrentXML($makePretty);
$this->currentPtr = $holdPtr; // restore the caller's position before returning
return $out;
}
function outputCurrentXML($makePretty=false)
{
// Note that this will by default compress the XML,
// unless set to true - probably a user wanting to look at it.
$this->forAjaxMS = !$makePretty;
$_SERVER['outputArray'] = array();
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
$theNode->updateAttribute('ajaxms', ($this->forAjaxMS) ? '1' : '');
$theNode->buildXML(0);
$theSep = ($this->forAjaxMS) ? '' : chr(10);
return implode($theSep, $_SERVER['outputArray']);
$_SERVER['outputArray'] = array();
}
function stepOut()
{
$tempNode = &$this->__xmlNodeArray[$this->currentPtr];
$this->currentPtr = $tempNode->parentID;
}
function returnToID($theID) { $this->currentPtr = $theID; }
function stepDeep($thePath)
{
$holdPtr = $this->currentPtr;
$thePath = strtolower($thePath);
$steps = explode('|', $thePath);
for ($i=0; $i<count($steps); $i++)
{
if (!$this->stepInto($steps[$i]))
{
$this->currentPtr = $holdPtr;
return false;
}
}
return true;
}
function stepInto($theTag)
{
$theTag = strtolower($theTag);
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
$theCount = count($theNode->children);
for ($i=0; $i<$theCount; $i++)
{
$childID = $theNode->children[$i];
$thisNode = &$this->__xmlNodeArray[$childID];
if ($thisNode->tag == $theTag)
{
$this->currentPtr = $childID;
return true;
}
}
return false;
}
function stepIntoIDX($theIDX)
{
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
if (($theIDX >= 0) && ($theIDX < count($theNode->children)))
{
$this->currentPtr = $theNode->children[$theIDX];
return true;
}
return false;
}
function updateAttribute($theName, $theValue)
{
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
$theNode->updateAttribute($theName, $theValue);
}
function updateData($newTag)
{
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
$theNode->data = $newTag;
}
function updateTag($newTag)
{
$theNode = &$this->__xmlNodeArray[$this->currentPtr];
$theNode->tag = $newTag;
}
}
?>

perkiset

This little script exercises the class a bit so you can see how it works. It's sort of like the old "turtle graphics" methodology, where your turtle is currently "on" a node and you navigate and do stuff from there, i.e., stepInto (deeper into the tree), stepOut (towards the root) and such. Every node is an object, so xmlNodes have plenty of methods and properties as well, although I only make use of a tiny bit of them here.
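
For anyone who wants the "turtle" idea in isolation first, here is a minimal sketch of it, not the xmlHandler class itself. The class name TinyTree and its array layout are made up for illustration, and it uses PHP 5+ syntax rather than the PHP 4 style of the class above. It only shows the shape of the thing: a flat list of nodes plus one "current" pointer that stepInto and stepOut move around.

```php
<?php
// Hypothetical miniature of the cursor idea; not the xmlHandler class itself.
class TinyTree
{
    public $nodes = array();   // flat list of nodes, indexed by ID
    public $current = 0;       // ID of the node the "turtle" is standing on

    function __construct($rootTag)
    {
        // The root is its own parent, so stepping out of it goes nowhere.
        $this->nodes[0] = array('tag' => $rootTag, 'parent' => 0, 'children' => array());
    }

    function addChild($tag, $stepIn = false)
    {
        $id = count($this->nodes);
        $this->nodes[$id] = array('tag' => $tag, 'parent' => $this->current, 'children' => array());
        $this->nodes[$this->current]['children'][] = $id;
        if ($stepIn) { $this->current = $id; }
        return $id;
    }

    function stepInto($tag)
    {
        foreach ($this->nodes[$this->current]['children'] as $id) {
            if ($this->nodes[$id]['tag'] == $tag) {
                $this->current = $id;
                return true;
            }
        }
        return false;   // no such child: the cursor stays put
    }

    function stepOut()
    {
        $this->current = $this->nodes[$this->current]['parent'];
    }

    function currentTag()
    {
        return $this->nodes[$this->current]['tag'];
    }
}

$t = new TinyTree('test');
$t->addChild('node', true);     // create <node> and move onto it
$t->addChild('name');           // create <name> but stay on <node>
$t->stepInto('name');           // now on <name>
echo $t->currentTag(), "\n";
$t->stepOut();                  // back on <node>
echo $t->currentTag(), "\n";
```

Keeping every node in one flat array and storing only IDs in the children lists is what makes "where am I" a single integer, which is why saving and restoring a position is cheap.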

Good luck, ping if you need help.


#! /usr/bin/php
<?php

require "/www/sites/lib/classes/class.xmlHandler.php";

$anXMLString = <<<XML
<test>
<node>
<name>Tesikrep</name>
<address>1234 Main St.</address>
</node>
<node>
<namedata>
<name>Someone Else</name>
<address>Another Address</address>
</namedata>
</node>
</test>
XML;

$xml = new xmlHandler();
$xml->importXML($anXMLString);
echo
"Here comes the original as parsed: ",
$xml->outputAllXML(false),
" Here comes the original 'prettied': ",
$xml->outputAllXML(true), " ";

$xml->moveToRoot();
$xml->stepIntoIDX(0);
$xml->stepInto('name');
$xml->updateData('Perkiset');
$xml->stepOut();
$nodePtr = $xml->childNodeNamed('address');
$nodePtr->data = 'Second crater on the left, Mars';
$xml->moveToRoot();
$xml->stepIntoIDX(1);
$xml->stepDeep('namedata|name');
$xml->updateAttribute('attr1', 'this is the attribute data');
$xml->moveToRoot();
$xml->stepInto(0);
$xml->addChild('comments', '', true); // create new child node, no data, step into it
for($i=0; $i<5; $i++)
$xml->addChild("Comment$i", "This is the data for comment $i", false);

echo
"Here comes the XML at the current pointer: ",
$xml->outputCurrentXML(true),
" Here comes ALL of the new XML: ",
$xml->outputAllXML(true),
" ";

?>


The above code will produce this output:

Here comes the original as parsed:
<test ajaxms="1"><node><name>Tesikrep</name><address>1234 Main St.</address></node><node><namedata>
<name>Someone Else</name><address>Another Address</address></namedata></node></test>


Here comes the original 'prettied':
<test>
    <node>
        <name>Tesikrep</name>
        <address>1234 Main St.</address>
    </node>
    <node>
        <namedata>
            <name>Someone Else</name>
            <address>Another Address</address>
        </namedata>
    </node>
</test>


Here comes the XML at the current pointer:
<comments>
    <Comment0>This is the data for comment 0</Comment0>
    <Comment1>This is the data for comment 1</Comment1>
    <Comment2>This is the data for comment 2</Comment2>
    <Comment3>This is the data for comment 3</Comment3>
    <Comment4>This is the data for comment 4</Comment4>
</comments>


Here comes ALL of the new XML:
<test>
    <node>
        <name>Perkiset</name>
        <address>Second crater on the left, Mars</address>
    </node>
    <node>
        <namedata>
            <name attr1="this is the attribute data">Someone Else</name>
            <address>Another Address</address>
        </namedata>
    </node>
    <comments>
        <Comment0>This is the data for comment 0</Comment0>
        <Comment1>This is the data for comment 1</Comment1>
        <Comment2>This is the data for comment 2</Comment2>
        <Comment3>This is the data for comment 3</Comment3>
        <Comment4>This is the data for comment 4</Comment4>
    </comments>
</test>

nattsurfaren

Thanks for the code.

I have been playing with it.
It doesn't work if an xml has:
<?xml version="1.0"?>

Are you stripping/adding this at the beginning and the end before you use your class?

/Natt

thedarkness

Natt there is no corresponding closing version tag IIRC

Cheers,
td

perkiset

Hey Natt -

Yes, you are correct: it bombs on an XML record that has the header. As TD points out, the XML header has no closing tag, so as far as this parser is concerned it turns well-formed XML into ill-formed XML in one fell swoop.

When I see that on line one I just ditch it since it doesn't have anything to do with the record really...
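
A rough sketch of "just ditch it", assuming the header sits at the very start of the string. The function name stripXmlDeclaration is made up for illustration and is not part of the class; it relies on preg_replace, which needs the PCRE extension.

```php
<?php
// Hypothetical helper, not part of the original class.
function stripXmlDeclaration($xml)
{
    // Remove a leading XML declaration plus any whitespace around it.
    // The limit of 1 means only the first match is ever replaced.
    return preg_replace('/^\s*<\?xml\b.*?\?>\s*/s', '', $xml, 1);
}

$raw = "<?xml version=\"1.0\"?>\n<test><name>Tesikrep</name></test>";
echo stripXmlDeclaration($raw), "\n";
```

A string with no header comes back unchanged, so it should be safe to run on every input before handing it to importXML.');
assert(stripXmlDeclaration('<test/>') === '<test/>');
assert(stripXmlDeclaration("  <?xml version=\"1.0\" encoding=\"UTF-8\"?>  <a>1</a>") === '<a>1</a>');
</test>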

nattsurfaren


I made some minor changes for newbies, by a newbie.

Added a function:
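function CutHeader($inputXML)
{
$_SERVER["xmlheader"] = ''; // nothing to put back unless a header is found
if (strpos($inputXML, '<?xml') !== false)
{
$hstart = strpos($inputXML, "<?");
$hstop = strpos($inputXML, "?>", $hstart) + 2;
// Remember the header so it can be put back on output...
$_SERVER["xmlheader"] = substr($inputXML, $hstart, $hstop - $hstart);
$inputXML = substr_replace($inputXML, "", $hstart, $hstop - $hstart);
}
// Always hand the XML back, header or not...
return $inputXML;
}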


The CutHeader function is called in importXML. See the last line.
function importXML($inputXML)
{
$this->clear();

$inputXML=$this->CutHeader($inputXML);

The return line was changed to the following:
return $_SERVER["xmlheader"].implode($theSep, $_SERVER['outputArray']);

in this function
function outputCurrentXML($makePretty=false)
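
The same cut-and-restore idea can also be sketched without going through $_SERVER, by returning the header and the body together. This is only an illustration under assumptions: the function name splitXmlHeader and the two-element array it returns are made up here and are not part of the class.

```php
<?php
// Hypothetical helper; the name and the returned array shape are assumptions.
function splitXmlHeader($xml)
{
    $start = strpos($xml, '<?xml');
    if ($start === false) {
        return array('', $xml);            // no header: body is the input as-is
    }
    $stop = strpos($xml, '?>', $start);
    if ($stop === false) {
        return array('', $xml);            // unterminated header: leave input alone
    }
    $stop += 2;                            // step past the closing marker
    $header = substr($xml, $start, $stop - $start);
    $body   = substr_replace($xml, '', $start, $stop - $start);
    return array($header, $body);
}

list($header, $body) = splitXmlHeader('<?xml version="1.0"?><test><a>1</a></test>');
// $body is what would be handed to importXML(); $header is glued back on afterwards.
echo $header . $body, "\n";
```

Checking strpos against false with !== matters here, because a header at position 0 would otherwise look the same as "not found".');
assert(splitXmlHeader('<a/>') === array('', '<a/>'));
assert(splitXmlHeader('<?xml version="1.0"<a/>') === array('', '<?xml version="1.0"<a/>'));
</test>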


/Natt

perkiset

Well done man... thanks for the update!

