Libxml2: Parsing XML in LabVIEW is too slow... or is it?

Overview

I've done a fair bit of work with XML and for a while it was frustratingly slow parsing large XML files no matter what method I tried.

I know I'm not the only one. Many have come across this issue, and mostly we just learned to live with it.

My users just had to wait 30+ seconds for their information to load.

 

Then I started to look deeper into it. LabXML / LibXML seemed to be much faster at the open stage and a little bit faster when using XPath expressions. Not enough to change everything over, though.

Then I realised all the Call Library Function (CLF) nodes were set to run in the UI thread, so I changed them all to run in 'any thread'. A few more tweaks later and it was running about 30 times faster than anything else.

 

Now here's the trade-off. It is a bit more tricky to use. Some of the other XML parsing methods do a fair bit of work to make them easy to use.

I don't care about easy, I want FAST.

Using the libXML functions gives you fine control over your XML operation. It isn't the simplest method, but it is fast.

 

If you want easy try EasyXML. If you want fast, try this.

Description

LabXML comes in two variants. One uses MSXML, one uses the LibXML toolkit compiled for Windows. All credit for this goes to the good people at xmlsoft.org and Thijs Bolhuis who made the LabXML LabVIEW wrappers.

All I did was tweak the LabXML VIs and CLF calls. Attached is my version of the project library and the DLLs needed to make it work.

For help with how to use XML or XPATH you can Google it. For help with the specific libxml functions you can go to xmlsoft.org.

There's also an example below to get you started.

 

Example

I've taken the example from this thread here, where they discuss why XML parsing is so slow and benchmark how long it takes.

I left the original VI intact, except that I covered the LabVIEW XML functions with a diagram disable structure and added my libXML version over the top of it, so you can switch between the two.

The big example XML file (LabVIEW_Labs_RWWbig.xml) contains 40960 clusters and is a 24MB file (quite repetitive so it zips down nice and small).

It takes the LabVIEW XML parser about 87 seconds to parse the 40960 clusters in it.
It takes the libXML parser about 4 seconds to parse the 40960 clusters in it.

(Win7 2core i5-2400)

 

More than 20 times faster in this example. I think it's worth it!

 

(v2.1 includes xmlParseDoc, xmlDocDumpFormatMemory and a corrected xmlXPathFreeObject call in LibXML_FreeXPath.vi.)

LibXMLUpdate(2).zip contains additions and updated consistent naming conventions. - Thanks to GarryG

VI Package manager file added.

lXMLwrap64.dll.zip added thanks to EricH

2019-07-23 - Corrected the return type for xmlXPathFreeObject(): it was int32 but should have been void.

Troy - CLD "If a hammer is the only tool you have, everything starts to look like a nail." ~ Maslow/Kaplan - Law of the instrument

Example code from the Example Code Exchange in the NI Community is licensed with the MIT license.

Comments
Bob_Schor

I downloaded both files, unzipped them into the same folder (LibXML), opened the XMLDiscussion Project, and attempted to run Main.  It ran, but converted 0 elements, giving me an empty array "very fast".  I tried this in LabVIEW 2012 (SP1) and LabVIEW 2011 (SP1) on Windows 7 Pro (64-bit, using 32-bit LabVIEW).

What am I doing wrong?

BS

TroyK

Hmmm... It could have been worse, it could have taken a long time to give you zero results!

I hadn't done any error checking in the main.vi and the libxml file open is quite forgiving when it points to a non-existent file. That could be the problem.

I've updated the example to include a check if the file exists.

I just designed my code to specifically read the example XML file, not any LabVIEW XML file.
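The "zero elements, very fast" symptom above can be guarded against by checking the file before parsing. The following is only a text-based sketch of that idea, using Python's standard-library ElementTree as a stand-in (it is not libxml2 and not the LabVIEW wrapper VIs); the function names `load_root` and `count_children` are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def load_root(path):
    """Return the root element of an XML file, failing loudly on a bad path."""
    # Check the path up front so a missing file is reported as an error
    # instead of quietly turning into an empty result.
    if not os.path.isfile(path):
        raise FileNotFoundError(f"XML file not found: {path}")
    return ET.parse(path).getroot()


def count_children(path, tag):
    """Count the direct children of the root element that are named `tag`."""
    root = load_root(path)
    return len(root.findall(tag))
```

A caller can then tell "file missing" (an exception) apart from "file present but no matching elements" (a count of zero).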

Bob_Schor

Oops, I should have caught that.  Now that the demo is working, I recognize the XML data file -- it's my data!  Good demo, I'll have to poke around a bit in the libXML parser.

GarryG

Troy, thanks for this example.  I've run into the same problems with large XML documents.  Can you post an example where you read from a string rather than a file?

Thanks in advance!

CLA, CTA
Not my tempo... AGAIN!
TroyK

GarryGarrett wrote: Can you post an example where you read from a string rather than a file?                 

Well there certainly seems to be a libxml function that will do that but it wasn't included in the original LabXML wrappers.

http://www.xmlsoft.org/html/libxml-parser.html#xmlParseDoc

I will attempt to make a new wrapper VI for that function and add it to the attached libXML zip file.

[Edit] Done. The function is now included and worked on the test VI in the included xmlReadDiscussion example. Just use libXML parse doc.vi instead of libXML open XML-file.vi.
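To illustrate the difference between the two entry points (parse an in-memory string versus open a file), here is a minimal text-based sketch. It uses Python's standard-library ElementTree purely as a stand-in for the idea; it does not call xmlParseDoc or the LabVIEW wrappers, and the sample element names are made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the element names are for illustration only.
XML_TEXT = (
    "<Clusters>"
    "<Cluster><Name>A</Name></Cluster>"
    "<Cluster><Name>B</Name></Cluster>"
    "</Clusters>"
)


def names_from_string(xml_text):
    """Parse XML that is already in memory and return every Name text."""
    root = ET.fromstring(xml_text)
    return [node.text for node in root.findall("./Cluster/Name")]


def names_from_file(path):
    """Parse the same structure from a file on disk."""
    root = ET.parse(path).getroot()
    return [node.text for node in root.findall("./Cluster/Name")]
```

Once the document tree exists, the query code is identical in both cases; only the first step differs.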

GarryG

Just a quick report back, in some benchmark testing with enormous XML files, I have over a 500x (yes, 500 times) speed improvement using this parsing library and have proven I get the same resultant data output as the LabVIEW parser.

You have to do some work reading the LibXML open source documentation and filling in the blanks for functions you may want that aren't included. We've also reworked the VIs to meet our framework's coding standards and naming conventions, updated some of the XPaths, and rewritten some sections of code, because with this library you can easily specify a new root node (a relative starting location within the document) as you pass the node handles between VIs, which you really couldn't do with the NI library. This library is incredibly fast compared to the out-of-the-box LabVIEW library, if you're willing to do your homework.

THANK YOU TROY!

TroyK

You're welcome Garry! (Although most of the credit goes to the good folks at xmlsoft.org and Thijs Bolhuis for the original LabXML VI wrappers.)

Wow, that sounds like an incredible result! Thanks for reporting back. I'm so glad to see others are finding this library useful.

I'd love to see the improvements you've made. I left the library as close as possible to the original naming. I didn't want to change it too much in case people were using the original library and wanted to easily upgrade.

Having said that, I'm all for good naming conventions and I know there are many more libxml functions that didn't have VI wrappers.

If it is possible, could you please upload your updated version of the wrapper library. It sounds like it would be very useful for people who are starting fresh with libxml whereas my version would be more convenient for those who were already using LabXML.

GarryG

This won't allow me to add attachments to the message, so, I've posted them over in the discussions forum on the thread you mentioned above, here:

http://forums.ni.com/t5/LabVIEW/Why-does-reading-XML-take-so-long/m-p/2541017/highlight/false#M76998...

TroyK

GarryGarrett's updated library "LibXMLUpdate(2).zip" with consistent naming, terminal layouts and icons has been uploaded into the document above. [link]

MinuteEngineering

If you want to parse XML data in LabVIEW, just visit http://minuteeng.blogspot.com/2014/01/rss-feeds-in-labview.html  or  https://decibel.ni.com/content/docs/DOC-35100 for a complete video tutorial.

gbecker

This saved me at the last possible moment. Times for my 500 KB XML file: LV DOM using XPaths approx. 600 s, libxml2 using XPaths approx. 6 s!

Many thanks for this library.

joernheit

Very nice library, thanks a lot for implementing this. Which is the latest update: "LibXMLUpdate(2).zip" (above) or the one from VIPM (version 1.0.0.1)? And by the way, how can I make an intermediate check during the parsing process? That is, I would like to view the XML document as a string in a string indicator while it is being processed.

TIA,

Jörn

TroyK

Good point. It did occur to me that the file naming wasn't very clear.

libXML LV2011 v2.zip kept the same naming and terminal layout as LabXML so that if anyone was using the old library they could easily update.

LibXMLUpdate(2).zip has updated terminal layouts and naming conventions. It's the one I use and will add updates to.

To make it easier for people to install and use I used VIPM to make a package: libxml2-1.0.0.1.vip

This is the latest uploaded version.

Perhaps I should rename the files to make it less confusing.

If only there was some way to clearly associate comments with attachments.

joernheit

Thanks for the clarification. I began coding; unfortunately libxml is kind of complex and I am not the smartest XML guy. The Read XML example is nice, but I had difficulties with building XML. Meanwhile I have moved ahead and have working examples, but I still have some questions. Should we start a LibXML thread in the forum for further questions?

TroyK

I think starting a thread in the forum is probably a good idea. It will alert more people to the topic and attract better answers from more experienced developers.

It's also a better format for a question and answer discussion IMHO.

Did you have a look at the examples on xmlsoft.org?

There is a write example there (in C) that could be used as a guide to creating a LabVIEW write example.

I haven't done any XML file writing using libxml yet so I don't have an example I can upload.

MikaelH

Does anybody have the 64-bit DLLs?

TroyK

I believe the latest 64bit dlls can be found here: ftp://ftp.zlatkovic.com/libxml/64bit

Source page for libxml2 http://xmlsoft.org/downloads.html points to https://www.zlatkovic.com/libxml.en.html for windows binaries.

The ftp download area on that page is https://www.zlatkovic.com/pub/libxml/ which redirects to the ftp location, then there's the 64bit subdirectory mentioned above.

MikaelH

Thanks, I'll give that a go.

I have a tool that goes through a set of VIs and, using scripting, adds a conditional structure around every DLL call, adds a 64-bit case, and puts in the same DLL call pointed at a different DLL name (e.g. *-64.dll).

This makes it easy to use the same code in both 32- and 64-bit LabVIEW.
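The selection logic described above can be sketched in text form. This is only an illustration in Python of choosing a DLL name by bitness; it is not the scripting tool itself, the helper names are hypothetical, and the "-64" suffix simply follows the naming pattern mentioned in the comment.

```python
import struct


def pointer_bits():
    """Return 32 or 64 depending on the bitness of the running process."""
    return struct.calcsize("P") * 8


def dll_name_for(base_name, bits):
    """Map a base library name to the 32-bit or 64-bit DLL file name."""
    if bits == 64:
        return f"{base_name}-64.dll"  # assumed naming pattern: *-64.dll
    if bits == 32:
        return f"{base_name}.dll"
    raise ValueError(f"unsupported bitness: {bits}")


def dll_name_for_this_process(base_name):
    """Pick the DLL name that matches the current process."""
    return dll_name_for(base_name, pointer_bits())
```

Keeping the choice in one place means the calling code stays identical for both bitnesses, which is the point of the conditional structure.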

MikaelH

I did find a 64 bit dll there for the libxml2.dll, thanks.

But who made the IXMLwrap.dll?

Is there a 64-bit version of this?

I see that there are other DLLs as well (iconv.dll, libxslt.dll, zlib1.dll). Are these used by the wrapper DLL?

TroyK

IXMLwrap.dll, iconv.dll, libxslt.dll and zlib1.dll are all needed.

I don't know who made/compiled IXMLwrap.dll though.

There is this... http://forums.ni.com/t5/LabVIEW/Migrating-LabXML-tool-to-64bits/td-p/2958699

MikaelH

I've created a LibXML class that has the same connector pane and VI names as NI's version, so you can just search and replace.

For a small XML text it's about 6 times faster

For a medium size it's about 30 times faster

For a Large size it's >180 times faster

I will post this library as soon as I've fixed a few issues; I'm still trying to fix the 64-bit issue.

I've found the 64-bit libxml DLL, but now I need the wrapper DLL and the other dependent DLLs in 64-bit flavour.

I've found the Wrapper's c-code and I could potentially create my own 64 bit dll.

Update: A colleague of mine has found all 64 bit DLLs needed, now I just got to test them.

billko

I literally LOL'd when I refactored my code to use this and ran it.  It finished in less time than it took to do one iteration the native LV way.  (Parsing 340 data packets from an XML file.)

So it was at least 340x faster!

Just glad I wasn't eating or drinking anything when I ran it.

THANKS!!!

Bill
CLD
(Mid-Level minion.)
My support system ensures that I don't look totally incompetent.
Proud to say that I've progressed beyond knowing just enough to be dangerous. I now know enough to know that I have no clue about anything at all.
Humble author of the CLAD Nugget.
MikaelH

Status Update:

I still haven't been able to build a 64-bit version of the wrapper DLL. Has anybody else tried to do this?

WNM

I experienced some frustration following what was going on in this example VI and applying it to a similar libXML VI that I was trying to build to parse some pre-existing files.  In particular, the "querystring" syntax was never documented anywhere within the LabVIEW library that I could find. After much digging and following dead-end links into C-programming back alleys I finally managed to turn up the document to explain the syntax:  XML Path Language
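For readers who have not met the query syntax before, a few typical path expressions are sketched below. The sketch uses Python's standard-library ElementTree, which supports only a subset of XPath and is not libxml2; the document and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, for illustration only.
XML_TEXT = """<Root>
  <Cluster id="1"><Name>Voltage</Name><Value>1.5</Value></Cluster>
  <Cluster id="2"><Name>Current</Name><Value>0.2</Value></Cluster>
</Root>"""

ROOT = ET.fromstring(XML_TEXT)


def query_texts(expression):
    """Run a path expression against ROOT and return the matched texts."""
    return [node.text for node in ROOT.findall(expression)]


# Every Name element anywhere below the root.
ALL_NAMES = query_texts(".//Name")

# The Value of the Cluster whose id attribute equals "2".
BY_ATTRIBUTE = query_texts("./Cluster[@id='2']/Value")

# The Value of the Cluster whose Name child has the text "Voltage".
BY_CHILD_TEXT = query_texts("./Cluster[Name='Voltage']/Value")
```

The same three ideas (descend anywhere, filter by attribute, filter by child content) carry over to full XPath as documented in the XML Path Language specification.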

TroyK

Oops, yes, I don't explicitly state anywhere that you should use xpath expressions to search the xml tree, I just assumed that was the standard query language for xml.

I do mention xpath in the document description and tags, although if you've never heard of xpath before it might not mean much.

Glad you worked it out.

billko

Maybe it's because you didn't realize it is the same thing as "xpath?"  W3 schools has an excellent tutorial here.  As a matter of fact, LabVIEW has an example using xpath - albeit using the native xml queries.

Bill
TJ.hu

Hello Everybody,

Does anyone know how to process XML with namespaces with this libxml?

Thanks,

TJ.

TJ.hu

Anyway, I found a way around. Just replaced the namespace section at the beginning and the end of the text, after reading the file into a string.

Now the libxml2 method works very well on the modified text, with incomparably faster results for XPath queries.

Thanks.

TJ.
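An alternative to stripping the namespace text is to bind a prefix to the namespace URI and use that prefix in the query. libxml2's C API offers xmlXPathRegisterNs for this; whether the LabVIEW wrapper library in this post exposes it is not stated in the thread. The sketch below shows the idea with Python's standard-library ElementTree as a stand-in; the namespace URI, prefix and element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaced document, for illustration only.
XML_TEXT = """<cfg:Settings xmlns:cfg="http://example.com/cfg">
  <cfg:Item id="1">alpha</cfg:Item>
  <cfg:Item id="2">beta</cfg:Item>
</cfg:Settings>"""

# The prefix used in queries is local to the query; only the URI must match
# the one declared in the document.
NAMESPACES = {"c": "http://example.com/cfg"}


def item_texts(xml_text):
    """Return the text of every Item element using a namespace-aware query."""
    root = ET.fromstring(xml_text)
    return [item.text for item in root.findall("c:Item", NAMESPACES)]


def unprefixed_match_count(xml_text):
    """Show that a query without the namespace prefix finds nothing."""
    root = ET.fromstring(xml_text)
    return len(root.findall("Item"))
```

This explains the original symptom: the elements live in a namespace, so an unprefixed path matches no nodes until the namespace is either registered or removed.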

Kevin_Mckinley

I am seeing 2x to nearly 60x parsing speed increase while searching through 90+ XML files of sizes varying from 100 KB to 2 MB.  Speed increase is dependent on the number of queries required; the more queries the more speed increase observed.

I am floored by the improvement, I was contemplating creating a database to try to use SQL instead or figure a way to load the files to memory.

Running on: i7-4800, 2.7 GHz, 16 GB RAM, Samsung SSD

EricH

Hi everyone,

A 64-bit version of lXMLwrap.dll can be found at the link below. The 64-bit versions of the other necessary DLLs are included.

http://forums.ni.com/t5/LabVIEW/Migrating-LabXML-tool-to-64bits/td-p/2958699

This was tested on LabVIEW 2016 (64-bit).

billko

I can see how this would be amazingly helpful to those who need 64-bit LabVIEW to handle memory-hungry apps.  Just think how parsing GB of xml data orders of magnitude faster would be so helpful!

Bill
billko

FYI - the links to the files are no longer active.  I THINK this is a link to the project.

Bill
MikaelH

I love this library, and we use it a lot (now also working in 64 bit).
But when we open our projects in Win10, we from time to time get issues.

Windows Error Reporting locks up the PC by adding errors to the Windows Application Log, and the LabVIEW IDE becomes unresponsive (we're not running anything, just having the project open).
In the Event Viewer, I get errors like this, every few seconds.

Faulting application name: LabVIEW.exe, version: 18.0.1.4001, time stamp: 0x5ba1d799
Faulting module name: libwinpthread-1.dll, version: 1.0.0.0, time stamp: 0x02911bc0
Exception code: 0xc0000005
Fault offset: 0x0000000000006500
Faulting process id: 0x3ab0
Faulting application start time: 0x01d4bc0f902c6407
Faulting application path: C:\Program Files\National Instruments\LabVIEW 2018\LabVIEW.exe
Faulting module path: C:\Program Files\National Instruments\LabVIEW 2018\user.lib\_LabVIEWCommon\LcXML\LibXML2_class\private\dll\libwinpthread-1.dll

I also posted about this here:
https://forums.ni.com/t5/LabVIEW/Migrating-LabXML-tool-to-64bits/m-p/3693821/highlight/false#M103874...

Has anybody else seen this?

 

ViltBalint

Hi,

 

I need to find an XML parser for real time that can validate against an XSD schema and supports XPath. Do you know of any tool that could do this job? By the way, I tried to use libxml2 on real time and it works, but the utility DLLs are compiled for Windows. Would it be possible to migrate this library to real time?

 

Best regards,

 

Balint

dhana03

Hi, 

I am a new user of LibXML. I have downloaded all the above attachments and extracted the zip files.

After extracting, I ran the setup file (libxml2-1.0.0.2.vip) and installed it on my Windows 10 machine. After installing, I opened Main.vi (Desktop\Libxml\LibXMLUpdate(2.1)\LibXMLUpdate(2.1)\Main.vi) and ran it, but got a broken arrow with a list of errors (non-executable VIs). Please advise on the procedure to make this example work on my system.

 

Liam2016

THANK YOU MikaelH!!!!! I found your link

https://forums.ni.com/ni/attachments/ni/170/1003519/1/LibXML2_class(64-bit%20friendly).zip

I was about ready to surrender to the slow processing speed of the native LabVIEW VIs when I FINALLY found this. Not sure if the content is a copy from one of the links above, but there's so many links and I tried quite a few. I just wanted to add this message here in case anyone like me needs a class that's already put together and working. All I did was add a build spec and build it and it worked!

THANK YOU to all the developers! this is a huge time saver.. even though I just spent 3 days banging my head against this wall trying to figure out why I have problems with the executable and not in the development environment.

AgentBK

Hi.

First of all, great work on the library. A file which takes 6 min to read with the NI VIs takes 100 ms with this library 😵

My only problem is I tried installing the vip file, but I get the following error:

"Libxml2 (LabXML updated) v1.0.0.1 could not be added to VIPM because the file format is not valid.

Note: The remaining install operation has been aborted."

 

I am using the VIPM version 2020.3.

 

Do you know perhaps what the reason could be?

MikaelH

Not sure, but just rename the .vip to .zip and unzip it.
Read the text file "spec" and see where all the files should be copied into.
It's that easy to install a vip file 🙂
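The "treat the .vip as a zip and read the spec file" step can also be done without renaming anything. The sketch below assumes, as the comment above says, that the package is an ordinary zip archive with a text member named "spec"; the fake package it builds and its contents are purely hypothetical placeholders.

```python
import io
import zipfile


def read_spec(package_bytes):
    """Return (member names, spec text) from a zip-format package."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        names = archive.namelist()
        spec_text = archive.read("spec").decode("utf-8")
    return names, spec_text


def make_fake_package():
    """Build a tiny in-memory archive that mimics the assumed layout."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("spec", "placeholder spec contents\n")
        archive.writestr("File Group 0/example.vi", b"")
    return buffer.getvalue()
```

With a real package, the bytes would come from reading the .vip file from disk in binary mode.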

AgentBK

Thank you MikaelH.

I was able to create a new package with the help of the spec file. This one now installs properly. I am not sure what the problem was with the original.

Knight of NI

FYI: I posted an alternate implementation of reading XML using the miniXML library: https://forums.ni.com/t5/Example-Code/LabVIEW-miniXML-Library/ta-p/4265728

Hanoub

Hi Guys,
I'm wondering if someone has found a solution for indenting the XML file correctly. I've tried passing 1 as the xmlSaveOption flags to the SAVE TO FILE VI, but the XML is still written on a single line, which makes it a bit hard to read.
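One thing that may be worth checking: the libxml2 documentation for its formatted dump functions notes that the format flag only produces indentation if xmlIndentTreeOutput is 1 or xmlKeepBlanksDefault(0) was called. Whether that applies to the wrapper VI used here is an assumption, not something confirmed in this thread. For comparison, the sketch below shows what indented output looks like using Python's standard-library ElementTree (Python 3.9 or later for `ET.indent`); it is a stand-in, not libxml2.

```python
import xml.etree.ElementTree as ET


def pretty(xml_text, space="  "):
    """Return the document re-serialised with one element per line."""
    root = ET.fromstring(xml_text)
    # ET.indent rewrites the whitespace between elements in place.
    ET.indent(root, space=space)
    return ET.tostring(root, encoding="unicode")


# Hypothetical single-line input, for illustration only.
FLAT = "<a><b>1</b><c>2</c></a>"
INDENTED_LINES = pretty(FLAT).splitlines()
```

The key point in both libraries is that indentation is added as whitespace between elements, so any setting that preserves the original (absent) whitespace will keep the output on one line.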
