LabVIEW

XPS (XML Paper Specification) Reader

This is a little hard to search for, but I'm looking for a way to read text out of an XPS file.  My searches mostly turned up people referring to the XP operating system in the plural (XPs).  Does anyone have any suggestions for reading this file?  The only option I have thought of so far is to use an XPS to PDF converter, followed by a tool I saw that turns a PDF into text.  This will of course only work if the XPS to PDF conversion preserves the text.

 

Has anyone programmatically read information out of a generated XPS file?

0 Kudos
Message 1 of 11
(4,704 Views)

From LabVIEW, if you navigate to Help -> Find Examples and search for “Flatten and Unflatten XML,” you will see a project file (.lvproj) that has some good example VIs outlining how to write to an XML file as well as how to read and unflatten data from an XML file. This would be a great place to start!

Ross S.
Applications Engineering
National Instruments
0 Kudos
Message 2 of 11
(4,658 Views)

Hooovahh,

 

     Try searching for "xps microsoft".  I think this is an "open" (can you believe that?) protocol that Microsoft hoped would replace PDF, which some Other Company developed.  I found numerous citations with this search topic -- maybe there's even Helpful Information ...

 

BS

0 Kudos
Message 3 of 11
(4,642 Views)

Could you post the xps file(s)? It would be interesting to see what you're doing. I looked up xps, and it looks like it is going to be a matter of reading the xml in to generate what is needed.

Here's a link to xps including some schemas:

https://msdn.microsoft.com/en-us/library/windows/hardware/dn614032(v=vs.85).aspx

 

Glad to answer questions. Thanks for any KUDOS or marked solutions 😉
0 Kudos
Message 4 of 11
(4,634 Views)

Thank you, Microsoft!  I just opened (on an XP VM) a Word document and printed it to an XPS file.  Can you say "Unreadable"?  Certainly not ASCII.  Let me try with a "pure text file" from an old-fashioned text editor ... Well, still unreadable, and considerably larger (my 93-byte text file "printed" to a 17KB .XPS file).

 

BS

0 Kudos
Message 5 of 11
(4,617 Views)

Okay, yeah, I guess I should have posted an example.  Here is one such XPS file.  Please note that despite the name this does not appear to be XML.  The unflatten/flatten XML won't work, it is not ASCII.

 

I found several XPS to PDF converters, but the text is lost in the process, so if I run the result through a PDF to text converter the output is garbage.  For my specific application I found a possible workaround, but the discussion can still continue with suggestions for parsing and understanding this file format.

0 Kudos
Message 6 of 11
(4,579 Views)
Solution
Accepted by topic author Hooovahh

@Hooovahh wrote:

Please note that despite the name this does not appear to be XML.  The unflatten/flatten XML won't work, it is not ASCII.


I've just had a look at the Wikipedia page, and according to that, the XPS file itself is a container for XML files:

 

An XPS file is a Unicoded ZIP archive using the Open Packaging Conventions, containing the files which make up the document. These include an XML markup file for each page, text, embedded fonts, raster images, 2D vector graphics, as well as the digital rights management information. The contents of an XPS file can be examined by opening it in an application which supports ZIP files.

 

So you might be able to unzip it to get the xml files and then extract the bits of information you need from there.
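
As a rough illustration of the unzip idea, here is a minimal Python sketch that treats the XPS file as an ordinary ZIP archive and lists what is inside. The function name `list_xps_parts` is made up for this example, and it assumes the file really is a ZIP-based package as the Wikipedia description says.

```python
import zipfile


def list_xps_parts(xps_path):
    """Return the names of all parts stored in an XPS (ZIP) package.

    `xps_path` may be a file path or a binary file-like object.
    """
    # An XPS package is expected to be a ZIP archive; fail early if it is not.
    if not zipfile.is_zipfile(xps_path):
        raise ValueError(f"{xps_path!r} does not look like a ZIP-based package")
    with zipfile.ZipFile(xps_path) as package:
        return package.namelist()
```

Printing the returned names for a sample file should show where the per-page markup lives inside the package, which is the part worth parsing.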

 

There does seem to be support for reading/writing XPS files in .NET - perhaps you might have some luck with that? https://msdn.microsoft.com/en-us/library/windows/desktop/dd316975(v=vs.85).aspx


LabVIEW Champion, CLA, CLED, CTD
Message 7 of 11
(4,573 Views)

Here is a link you might want to look over if you haven't already gone past this in other research:

 

http://www.wictorwilen.se/Post/Dissecting-XPS-part-1--The-basics.aspx

 

Glad to answer questions. Thanks for any KUDOS or marked solutions 😉
Message 8 of 11
(4,564 Views)

Sweet, thanks, sorry I missed this at first.  So for those interested, you can get information out by extracting the zip and then looking in the Documents\1\Pages folder, where you'll find a text document for each page containing text that can more or less be understood.  I saw the potential .NET method but didn't get very far.
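
The extraction described above could be sketched in Python roughly as follows. This is not what was actually used in this thread; it is a sketch under a few assumptions: each page is stored as a single `.fpage` part inside the archive (some writers split pages into pieces, which this does not handle), the visible text sits in the `UnicodeString` attribute of `Glyphs` elements, and a leading `{}` in that attribute is an escape prefix to be dropped. The names `extract_xps_text` and `_page_sort_key` are invented for the example.

```python
import re
import zipfile
import xml.etree.ElementTree as ET


def _page_sort_key(part_name):
    """Sort pages numerically so that 10.fpage comes after 2.fpage."""
    folder, _, filename = part_name.rpartition("/")
    match = re.match(r"(\d+)\.fpage$", filename, re.IGNORECASE)
    return (folder, int(match.group(1)) if match else 0, filename)


def extract_xps_text(xps_path):
    """Return a list with one string of text per page part, in page order."""
    pages = []
    with zipfile.ZipFile(xps_path) as package:
        # Assumption: every page is a single part whose name ends in .fpage.
        page_parts = [name for name in package.namelist()
                      if name.lower().endswith(".fpage")]
        for name in sorted(page_parts, key=_page_sort_key):
            # Passing bytes lets the XML parser detect the encoding itself.
            root = ET.fromstring(package.read(name))
            lines = []
            for element in root.iter():
                # Compare the local tag name so the XML namespace is ignored.
                local_name = element.tag.rsplit("}", 1)[-1]
                text = element.get("UnicodeString")
                if local_name != "Glyphs" or not text:
                    continue
                # Assumption: a leading "{}" is an escape prefix, not text.
                if text.startswith("{}"):
                    text = text[2:]
                lines.append(text)
            pages.append("\n".join(lines))
    return pages
```

Each `Glyphs` element typically holds one run of text rather than a whole paragraph, so the output may need further joining or cleanup depending on how the document was generated.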

0 Kudos
Message 9 of 11
(4,556 Views)

@Hooovahh

 

I looked at your sample:

In Explorer it only shows one file

After extraction it only shows one file

 

I'm on a Win 7 box. Does this matter?

Glad to answer questions. Thanks for any KUDOS or marked solutions 😉
0 Kudos
Message 10 of 11
(4,528 Views)