Simple XML Parser Namespace support

October 27th, 2009 by Dragos

I have added support for namespaces to the parser.
This means you can now register a namespace so that XML using that namespace can be parsed.

The usage of the class has also changed:

<?php

$xml = "http://www.protung.ro/feed/atom/"; // parse an Atom feed

// create a new object
$parser = new SimpleLargeXMLParser();
// load the XML
$parser->loadXML($xml);

// register the namespace
$parser->registerNamespace("atom", "http://www.w3.org/2005/Atom");
// this will get an array of entries
$array = $parser->parseXML("//atom:feed/atom:entry");

?>
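
To illustrate why the namespace has to be registered, here is a minimal sketch that uses PHP's built-in DOMXPath rather than SimpleLargeXMLParser (whose internals are not shown in this post). The inline feed is made-up sample data, and the assumption is only that the parser's XPath handling follows the same rule: an unprefixed path matches elements in no namespace, so Atom elements are found only through a registered prefix.

```php
<?php
// Made-up sample feed, kept inline so the sketch runs without network access.
$atom = <<<XML
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>
XML;

$doc = new DOMDocument();
$doc->loadXML($atom);

$xpath = new DOMXPath($doc);

// Unprefixed names refer to elements in no namespace, so this matches nothing.
$plain = $xpath->query('//feed/entry');

// Bind the prefix "atom" to the Atom namespace URI, then query with it.
$atomNs = 'http://www.w3.org/2005/Atom';
$xpath->registerNamespace('atom', $atomNs);
$entries = $xpath->query('//atom:feed/atom:entry');

// Collect the title of each entry.
$titles = [];
foreach ($entries as $entry) {
    $titles[] = $entry->getElementsByTagNameNS($atomNs, 'title')->item(0)->textContent;
}

echo $plain->length, " match(es) without a prefix\n";
echo $entries->length, " match(es) with the atom prefix\n";
```

The prefix itself is arbitrary; only the namespace URI has to match the one declared in the document.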

As always, you can download the new version from here.


6 comments

  1. gael says:

    Really good job !
    Tested with a huge file (around 5 MB, 4000 entries): less than 1 sec to parse it 8-)

  2. Gabri says:

    I’m having some headaches parsing an XML file. The problem is with hex entities inside the file (e.g. &#xe0; and so on): I cannot understand why, but the parsed output contains strange characters and does not preserve these entities.
    I found your classes and tried putting some hex entities in example.xml, but I get the same results as before.
    I tried a lot but I cannot find a solution, do you have one?

    • Dragos says:

      Make sure you surround the text in CDATA

      So if you have something like:
      &#xe0;

      make it like:
      <![CDATA[&#xe0;]]>

  3. Gabri says:

    Thank you, it works for text inside XML nodes. What doesn’t work yet is hex/decimal entities inside an attribute (e.g. <color TEXT="an entity">#333333). In this case, surrounding the value with CDATA breaks the parsing process.
    I think the only solution is to pre-process the XML file and post-process the output. Thank you anyway, Dragos

    • Dragos says:

      you have to escape the &

      for example:

      <node type='&amp;#xe0;'>&amp;#xe0;</node>

      the same applies for node values if you do not escape it with CDATA (it’s better with CDATA anyway)
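
The difference between the three forms discussed in this thread can be checked with a short sketch. It uses PHP's built-in SimpleXML rather than the parser class from this post, and the element names (`raw`, `wrapped`, `node`) are made up for the illustration. A bare character reference is resolved by the XML parser into the character itself, while CDATA text and an escaped `&amp;` both leave the literal `&#xe0;` in the parsed value.

```php
<?php
// Hypothetical sample document covering the three cases.
$xml = <<<XML
<root>
  <raw>&#xe0;</raw>
  <wrapped><![CDATA[&#xe0;]]></wrapped>
  <node type="&amp;#xe0;">&amp;#xe0;</node>
</root>
XML;

$root = simplexml_load_string($xml);

// Bare reference: the parser replaces it with the character U+00E0 (UTF-8 bytes C3 A0).
$raw = (string) $root->raw;

// CDATA: the content is taken literally, so the text "&#xe0;" survives.
$wrapped = (string) $root->wrapped;

// Escaped ampersand: "&amp;" becomes "&", leaving the literal text "&#xe0;".
// This form also works inside attribute values, where CDATA is not allowed.
$attr = (string) $root->node['type'];
$text = (string) $root->node;

var_dump($raw === "\xC3\xA0", $wrapped, $attr, $text);
```

Whether the "strange chars" seen in the output are a real problem or only a display-encoding mismatch depends on how the result is printed; the resolved character is UTF-8 encoded.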

      • Gabri says:

        Thank you very much! You saved me! :)
        (and sorry for the proliferation of “wrong” comments)
