I needed a simple PHP script to parse large XML files quickly and without heavy memory consumption, so I wrote a small class for it.
The class parses large XML files (it works with small ones too) quickly and with minimal memory consumption.
It can parse any valid XML and convert it to an array. What it does not do is read the attributes of the nodes.
If you need that, contact me and I can implement it for you.
You can also parse just a part of the XML, as the class supports XPath, with the same performance as parsing the entire XML (well, a little faster, since there is less data to parse).
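To illustrate the general idea of low-memory parsing (not the internals of this class, which are not shown in this post), here is a minimal sketch that assumes PHP's built-in XMLReader and SimpleXML extensions. The functions streamNodesAsArrays() and elementToArray() are hypothetical names made up for this illustration, and the sketch matches nodes by element name rather than by a full XPath expression.

```php
<?php
// Hypothetical helper: recursively converts a SimpleXMLElement to an array.
// Leaf elements become trimmed strings; attributes are ignored.
function elementToArray(SimpleXMLElement $element)
{
    if ($element->count() === 0) {
        return trim((string) $element);
    }
    $result = [];
    foreach ($element->children() as $name => $child) {
        // Group children by tag name so repeated tags are all kept.
        $result[$name][] = elementToArray($child);
    }
    return $result;
}

// Hypothetical helper: streams $file and converts every element named
// $nodeName to an array, keeping only one matching subtree in memory at a time.
function streamNodesAsArrays(string $file, string $nodeName): array
{
    $reader = new XMLReader();
    if (!$reader->open($file)) {
        throw new RuntimeException("Cannot open $file");
    }
    $result = [];
    $more = $reader->read();
    while ($more) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === $nodeName) {
            $result[] = elementToArray(new SimpleXMLElement($reader->readOuterXml()));
            $more = $reader->next(); // skip the subtree that was just converted
        } else {
            $more = $reader->read(); // step to the next node in document order
        }
    }
    $reader->close();
    return $result;
}
```

The array layout used here (each child name maps to a list of values) is an assumption chosen for the sketch and may differ from what the class returns.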
Here is an example
Let’s say we have the following XML file (called example.xml)
<?xml version="1.0" encoding="UTF-8" ?>
<!--
/**********************
 * Example XML (not that large, but it's for demo purpose only)
 **********************/
-->
<myFirstNode>
    <color-palettes>
        <color type='txt'>red</color>
        <color type='txt'>yellow</color>
        <color type='txt'>lime</color>
        <color type='txt'>cyan</color>
        <color type='txt'>blue</color>
        <color type='txt'>magenta</color>
        <color type='txt'>white</color>
        <color type='txt'>black</color>
        <color type='hex'>#FF0000</color>
        <color type='hex'>#FFFF00</color>
        <color type='hex'>#00FF00</color>
        <color type='hex'>#00FFFF</color>
        <color type='hex'>#0000FF</color>
        <color type='hex'>#FF00FF</color>
        <color type='hex'>#FFFFFF</color>
        <color type='hex'>#000000</color>
    </color-palettes>
    <first-100-numbers>
        <number n='1'>1</number>
        <number n='2'>2</number>
        <number n='3'>3</number>
        ...
        <number n='97'>97</number>
        <number n='98'>98</number>
        <number n='99'>99</number>
        <number n='100'>100</number>
    </first-100-numbers>
    <searchengines>
        <engine>
            <name>Google</name>
            <website>http://www.google.com</website>
        </engine>
        <engine>
            <name>Yahoo</name>
            <website>http://www.yahoo.com</website>
        </engine>
        <engine>
            <name>Bing</name>
            <website>http://www.bing.com</website>
        </engine>
    </searchengines>
</myFirstNode>
And here is the PHP code to extract some data from it as an array:
<?php
// include the class
require_once('SimpleLargeXMLParser.class.php');
$xml = "example.xml";
// get all colors in hex format as an array
$array = SimpleLargeXMLParser::parseXML($xml, "//myFirstNode/color-palettes/color[@type='hex']");
// get all numbers greater than 50 as an array
$array = SimpleLargeXMLParser::parseXML($xml, "//myFirstNode/first-100-numbers/number[@n>'50']");
// get all search engines as an array
$array = SimpleLargeXMLParser::parseXML($xml, "//myFirstNode/searchengines");
// get the full XML file as an array
// if you don't specify the first node, the script will search for it and use the root node
// for performance reasons it is better to specify it if you know it
$array = SimpleLargeXMLParser::parseXML($xml, "//myFirstNode");
?>
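If you want to check what these XPath expressions select before running them on a big file, a quick way is PHP's built-in DOMXPath on a small sample. This is only a sketch: xpathValues() is a hypothetical helper, the sample is a shortened version of example.xml, and DOMDocument loads the whole document into memory, so it is not a replacement for the class on large files.

```php
<?php
// Shortened sample of example.xml, just enough to try the expressions.
$xml = <<<XML
<myFirstNode>
  <color-palettes>
    <color type='txt'>red</color>
    <color type='hex'>#FF0000</color>
    <color type='hex'>#FFFF00</color>
  </color-palettes>
  <first-100-numbers>
    <number n='3'>3</number>
    <number n='97'>97</number>
    <number n='100'>100</number>
  </first-100-numbers>
</myFirstNode>
XML;

// Hypothetical helper: returns the text of every node matched by $expression.
function xpathValues(string $xml, string $expression): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $xpath = new DOMXPath($doc);
    $values = [];
    foreach ($xpath->query($expression) as $node) {
        $values[] = $node->textContent;
    }
    return $values;
}

// Same expressions as in the example above.
$hex = xpathValues($xml, "//myFirstNode/color-palettes/color[@type='hex']");
$big = xpathValues($xml, "//myFirstNode/first-100-numbers/number[@n>'50']");

print_r($hex); // ['#FF0000', '#FFFF00']
print_r($big); // ['97', '100']
```

Note that in XPath 1.0 the > operator converts both sides to numbers, so [@n>'50'] is a numeric comparison: 100 matches and 3 does not, even though '50' is written as a string.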
A new version is available. See this post for more information about what’s new.
Download
Download here. (there are some examples in the package)
Hello,
Nice class, it is saving me time.
I have one question to ask.
How do I get node attribute values?
I’ve updated the class. Now you can get the attributes as well.
See this post: http://www.protung.ro/2009/10/new-php-simple-large-xml-parser-version/
Hi,
Such a nice class, dude,
but I have a problem:
I can’t read this type of XML: http://feeds2.feedburner.com/24thfloor
Can you please provide some help here?
It only produces a result when I save it as XML and manually add the tag …
I tried
$array = SimpleLargeXMLParser::parseXML($xml, '//feed');
and
$array = SimpleLargeXMLParser::parseXML($xml, '//entry');
but it doesn’t work.
Please help,
Thanks in advance
Hi Vijay,
there is a new version of this class.
See this post:
http://www.protung.ro/2009/10/simple-xml-parser-namespace-support/
Also, you need to make sure the encoding of your feed is correct (in your case it isn’t).
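For anyone hitting the same problem: assuming the feed is Atom (the //feed and //entry queries suggest so), its elements live in a default namespace, and a plain XPath 1.0 query like //entry does not match namespaced elements. The sketch below shows this with PHP's built-in DOMXPath rather than with the class; the sample feed and the atom prefix are made up for illustration.

```php
<?php
// Made-up Atom sample; the xmlns attribute puts every element in a namespace.
$feed = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>
XML;

$doc = new DOMDocument();
$doc->loadXML($feed);
$xpath = new DOMXPath($doc);

// Unprefixed names in XPath 1.0 only match elements in no namespace.
$plain = $xpath->query('//entry');

// Bind a prefix (any name works) to the Atom namespace URI and use it.
$xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
$prefixed = $xpath->query('//atom:entry');

echo $plain->length, "\n";    // 0
echo $prefixed->length, "\n"; // 2
```

This only covers the namespace side of the problem; a wrong encoding declaration in the feed still has to be fixed separately.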