LabVIEW

XML load: preserve whitespace bug?

Check this out.
I think there is a bug in the "preserve whitespace" input of the XML Load function.
Setting it to TRUE or FALSE makes no difference: documentElement (the root of the XML) always has 7 children.

thanks

Message 1 of 10
I don't think this is a bug with the LabVIEW VI as much as a debatable point as to how the Xerces parser (which is what is used) works. From what I've read the XML recommendation says that a parser should report whitespace if no DTD or schema is provided, regardless of the setting of "preserve whitespace".
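
To illustrate where a child count like 7 can come from, here is a minimal sketch in Python using the standard library's xml.dom.minidom (not Xerces and not LabVIEW). The sample document is made up for illustration. minidom is a non-validating parser and reports whitespace-only text as Text nodes, so three elements written on separate lines give seven children:

```python
from xml.dom import minidom

# Hypothetical sample: three child elements, each on its own line, no DTD.
SAMPLE = "<root>\n  <a/>\n  <b/>\n  <c/>\n</root>"

def child_summary(xml_text):
    """Return (total children, element children) of the document element."""
    root = minidom.parseString(xml_text).documentElement
    elements = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
    return len(root.childNodes), len(elements)

total, elements = child_summary(SAMPLE)
print(total, elements)  # 7 3: four whitespace Text nodes plus three elements
```

The same document written on a single line, with no whitespace between tags, would report only the three element children.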
Message 2 of 10

Wait, the LabVIEW documentation for the function says:

"preserve whitespace: specifies whether a validating parser includes ignorable whitespaces as Text Nodes. The default is TRUE. If you select FALSE, the parser discards ignorable whitespaces and does not add Text Nodes to the DOM tree. "

 

and later:

"process schema: disables schema processing for the XML parser when set to FALSE, the default. If you set the control to TRUE, you must set process namespaces to TRUE."

 

Why doesn't it say the same for "preserve whitespace", that is, "if you set the control to FALSE, you must provide a DTD/Schema"?

Message 3 of 10

I can't edit the message.

This information is very important because the code could be wrong.

Also, why is there no "innerXML" property?

Example: <myelement>type some text here</myelement>

has 1 node, with innerXML="type some text here"

Now I am forced to use first child + node value to get the same text. Am I wrong?
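
For comparison, here is how the same two approaches look in a DOM that also lacks an inner-XML property. This is a sketch in Python's xml.dom.minidom, not the LabVIEW API; inner_xml is a hypothetical helper written for this example.

```python
from xml.dom import minidom

def inner_xml(element):
    """Serialize all children of `element`, similar in spirit to .NET's InnerXml."""
    return "".join(child.toxml() for child in element.childNodes)

doc = minidom.parseString("<myelement>type some text here</myelement>")
elem = doc.documentElement

# The text lives in a child Text node, not on the element itself.
text_via_child = elem.firstChild.nodeValue
text_via_helper = inner_xml(elem)

print(text_via_child)   # type some text here
print(text_via_helper)  # type some text here
```

The two results agree only when the element contains a single text node. With nested markup, the first child's value returns just the first text fragment, while the helper returns the serialized children.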

Message 4 of 10

Hi Slyfer,

I agree with smercurio.

The XML DOM specification explicitly says the "preserve whitespace" option is considered only if there is a valid DTD.

And that's how the Xerces DOM parser works. Since the LabVIEW XML functions are simply a wrapper over the Xerces parser, in general you should refer to the Xerces behaviour and documentation for information.

I do agree that the LabVIEW documentation of the XML Load function is not very clear and could be updated to match the Xerces documentation.

I am going to notify R&D about that.

Message 5 of 10

Could you please post the link to the Xerces documentation you are referring to? I am not able to find the right documentation (Java API? Xerces-C++? Apache?).

 

Here is the equivalent code in C#. Their parser does not need a schema (DTD is obsolete, by the way...):

 

// Loading a document with DOM access (needs System.Xml and System.Windows.Forms)
private void button1_Click(object sender, EventArgs e)
{
    XmlDocument doc = new XmlDocument();
    // Must be set before Load to take effect
    doc.PreserveWhitespace = checkBox1.Checked;
    doc.Load(Application.StartupPath + @"\employees.xml");
    MessageBox.Show(doc.InnerXml);
}

 

This code is bound to a button's click event. It reads the state of a checkbox and does the trick on a given XML file, without any notion of a schema.

PreserveWhitespace is a property of the XmlDocument object.

Is requiring a schema for the XML just to remove whitespace an XML standard recommendation, or an implementation choice of Xerces?

Message 6 of 10

Sure!

Here's the explanation:

"Preserve whitespace" refers to the property "element content whitespace", which the W3C XML Information Set defines as a property of validated documents (see par. 2.6).

Therefore that property has no meaning for non-validated documents.

It is true that some other DOM parser implementations (see Oracle and MS) apply the same property to non-validated documents, but that is actually a semantic violation of the standard (I think this thread in the Apache dev list is clear in that regard).

In conclusion, Xerces does nothing more than respect the standard.

Anyway, in my opinion it is not such a big problem.

In LabVIEW you can simply work around it in this way:

 

remove_empty_spaces.jpg
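
The attached block diagram is not reproduced in text. As a rough textual analogue, and only as an assumption about what such a workaround might do, one option is to walk the DOM after loading and delete Text nodes that contain nothing but whitespace. The sketch below uses Python's xml.dom.minidom rather than the LabVIEW XML VIs; strip_whitespace_nodes is a hypothetical helper name.

```python
from xml.dom import minidom

def strip_whitespace_nodes(node):
    """Recursively remove Text nodes that contain only whitespace."""
    for child in list(node.childNodes):  # copy, because we mutate while iterating
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_nodes(child)

doc = minidom.parseString("<root>\n  <a>text</a>\n  <b/>\n</root>")
root = doc.documentElement

before = len(root.childNodes)
strip_whitespace_nodes(root)
after = len(root.childNodes)

print(before, after)  # 5 2
```

Note that this removes every whitespace-only Text node, including ones that may be significant in mixed content, so it suits data-style XML better than document-style XML.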

Cheers,

 

Lucius

Message 7 of 10

Thanks for the explanation, but I am not able to make it work. See the first part of the attachment: I get an error when loading the XML as a text file. What are those strange characters at the beginning? omg...

 

About DTDs: they are a W3C standard. But the XSD schema specification is a W3C recommendation too, and it cannot be supported by Xerces? I remind you that DTD is an obsolete technology... It is as if I were using the <font> tag in HTML pages: I could do it, but it's obsolete.

 

Message 8 of 10

Hi Slyfer,

The strange characters you get at the start of your string are the Byte Order Mark (BOM) of the Unicode Standard (specifically its UTF-8 form), displayed as single-byte characters.

The problem is that LabVIEW works best with ASCII.

You should simply convert the XML file to ASCII (if you want to do that in LabVIEW, see this post).
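
As an illustration of what those characters are, here is a small sketch in Python (not LabVIEW) that shows the UTF-8 BOM as single-byte characters and strips it. The sample content is made up, and strip_utf8_bom is a hypothetical helper. Stripping the BOM only yields pure ASCII if the rest of the file contains no non-ASCII characters.

```python
import codecs

def strip_utf8_bom(raw):
    """Return `raw` without a leading UTF-8 byte order mark, if present."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw

# Hypothetical file content: UTF-8 BOM (bytes EF BB BF) followed by a tiny XML document.
raw = codecs.BOM_UTF8 + b"<root/>"

# Interpreted one byte per character (Latin-1 here), the BOM appears as three odd characters.
as_single_bytes = raw.decode("latin-1")
cleaned = strip_utf8_bom(raw)

print(repr(as_single_bytes[:3]))
print(cleaned)
```

In Python the "utf-8-sig" codec does the same thing while decoding: raw.decode("utf-8-sig") drops the BOM if there is one.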

 

Speaking of XML Schema, why do you think Xerces does not support it? I never said that. I only said that a specific property of the XML Info Set is defined only for valid documents.

The Xerces parser certainly supports it. Just try it yourself 🙂

Cheers

 

Lucius

Message 9 of 10

I will try but, again, the LV documentation speaks only about DTD 🙂

I will try to link an XSD schema and see what happens... in LabVIEW 🙂

Do you know if it is possible (and how) to "extend" the LabVIEW wrapper, in case I would like to expose new properties or methods? I haven't seen innerXML(), for example (it returns the XML string inside an element).

 

Message 10 of 10