LabVIEW Idea Exchange

altenbach

Associative Arrays!

Status: Completed

Available in LabVIEW 2019 with the Collection > Map functions and VIs. 

LabVIEW has a somewhat hidden feature built into the variant attributes functionality that easily allows the implementation of high performance associative arrays. As discussed elsewhere, it is implemented as a red-black tree.

 

I wonder if this functionality could be exposed with a more intuitive set of tools that does not require dummy variants and somewhat obscure VIs hidden deeply in the variant palette (who would ever look there!).

 

Also, the key is currently restricted to strings (Of course we can flatten anything to strings to make a "name" for a more generalized use of all this).

 

I imagine a set of associative array tools:

 

 

  • Create associative array (key datatype, element datatype)
  • insert key/element pair (replace if key exists)
  • lookup key (index key) to get element
  • read all keys
  • delete key/element
  • delete all keys/elements
  • dump associative array to disk
  • restore associative array from disk
  • destroy associative array
  • ... (I probably forgot a few more)
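
The wish list above maps fairly directly onto a small class. Here is a minimal sketch in Python rather than LabVIEW. The class name `AssocArray` and its method names are made up for illustration, and Python's built-in `dict` is a hash table, not the red-black tree behind variant attributes. There is no explicit "destroy" because Python's garbage collector handles that.

```python
import pickle
from typing import Dict, Generic, Hashable, List, TypeVar

K = TypeVar("K", bound=Hashable)  # key datatype
V = TypeVar("V")                  # element datatype


class AssocArray(Generic[K, V]):
    """Hypothetical wrapper mirroring the operations in the list above."""

    def __init__(self) -> None:
        self._items: Dict[K, V] = {}

    def insert(self, key: K, element: V) -> None:
        # Insert a key/element pair; an existing key is replaced.
        self._items[key] = element

    def lookup(self, key: K) -> V:
        # Raises KeyError if the key is not present.
        return self._items[key]

    def keys(self) -> List[K]:
        return list(self._items)

    def delete(self, key: K) -> None:
        del self._items[key]

    def clear(self) -> None:
        self._items.clear()

    def dump(self, path: str) -> None:
        # Dump the whole array to disk (pickle is an arbitrary choice here).
        with open(path, "wb") as f:
            pickle.dump(self._items, f)

    @classmethod
    def restore(cls, path: str) -> "AssocArray[K, V]":
        restored = cls()
        with open(path, "rb") as f:
            restored._items = pickle.load(f)
        return restored
```

Type hints such as `AssocArray[str, list]` are only checked by external tools like mypy, so this does not give the edit-time type propagation a native LabVIEW primitive would.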
 
 
I am currently writing such a tool set as a high performance cache to avoid duplicate expensive calculations during fitting (Key: flattened input parameters, element: calculated array).
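
For readers who think in text languages, the caching idea looks roughly like this in Python. This is only a sketch under assumptions: `flatten_key`, `ModelCache`, and the `model` callable are hypothetical names, and the parameters are assumed to be a flat list of doubles.

```python
import struct
from typing import Callable, Dict, List, Sequence


def flatten_key(params: Sequence[float]) -> bytes:
    # Flatten the input parameters to raw bytes, similar in spirit to
    # flattening to a string in LabVIEW. Little-endian doubles are assumed.
    return struct.pack(f"<{len(params)}d", *params)


class ModelCache:
    """Remembers results of an expensive calculation, keyed by its inputs."""

    def __init__(self, model: Callable[[Sequence[float]], List[float]]) -> None:
        self._model = model
        self._store: Dict[bytes, List[float]] = {}

    def evaluate(self, params: Sequence[float]) -> List[float]:
        key = flatten_key(params)
        if key not in self._store:
            # Cache miss: run the expensive calculation exactly once.
            self._store[key] = self._model(params)
        return self._store[key]
```

A fitting routine would call `cache.evaluate(params)` instead of the model itself, so repeated parameter sets cost only a lookup.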
 
However, I cannot easily write it in a truly generalized way, just as a version targeted for my specific datatype. I've done some casual testing and the variant attribute implementation is crazy fast for lookup and insertion. Somebody at NI really did a fantastic job and it would be great to get more exposure for it.
 
Example performance: (Key size: 1200 bytes, element size: 4096 bytes, 10000 elements) 
insert: ~60 microseconds
random lookup: ~12 microseconds
(compare with a random lookup using linear search (search array): 10ms average. 1000x slower!)
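
To give a feel for where such a gap comes from, here is a rough Python sketch that counts how many keys each strategy has to inspect. Binary search over a sorted list stands in for the balanced tree (both need on the order of log2(n) probes). It is not the variant attribute implementation, and it counts probes, not time.

```python
from typing import List, Tuple


def linear_search(keys: List[int], target: int) -> Tuple[int, int]:
    """Return (index, probes) scanning from the front, like Search 1D Array."""
    probes = 0
    for index, key in enumerate(keys):
        probes += 1
        if key == target:
            return index, probes
    return -1, probes


def binary_search(sorted_keys: List[int], target: int) -> Tuple[int, int]:
    """Return (index, probes); each loop pass inspects one key."""
    lo, hi = 0, len(sorted_keys) - 1
    probes = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] == target:
            return mid, probes
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, probes
```

With 10000 keys, the linear scan inspects about 5000 keys on average for a random hit, while the binary search never needs more than 14 probes. Since each probe here means comparing 1200-byte keys, the difference adds up quickly.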
 
Thanks! 

 

25 Comments
jdunham
Member

We discussed associative arrays (dictionaries) yet again at this week's Bay Area LabVIEW Users' Group meeting.  The points I made were:

  • If you told any recent CS graduate "I use this really awesome computer language, but it doesn't support associative arrays natively, you have to hack it up with a side-effect of one of the other language features", they would laugh in your face.
  • Given that Big Data is all the rage, and the whole technology is driven by high-performance associative arrays, it seems like NI is trying to remain as irrelevant as possible by not adding any kind of dictionary as a native feature.
  • It's not too hard to build a dictionary with the variant attributes, but what's really missing is the edit-time polymorphic behavior you get with built-in LabVIEW primitives.  Having to clone every dictionary I use from a template class is a pain in the neck and bloats my code.

Why is this feature still marked as "New" after 3 years?!?

 

 

Underflow
Active Participant

I agree with everything Mr. Dunham mentions above.

 

The one feature that I need is greater access to the underlying tree nodes.  For instance, if I want to search through a variant attribute dictionary using a regex, I need to copy the whole structure out to arrays.  It would be nice to have something closer to an iterator or callback that checks nodes in search order to save some cycles and memory.
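
In a text language the iterator idea would look something like the Python generator below. It is only an illustration: `iter_matching` is a made-up name, and a Python `dict` walks its entries in insertion order, not in the key order a tree would give.

```python
import re
from typing import Dict, Iterator, Tuple, TypeVar

V = TypeVar("V")


def iter_matching(table: Dict[str, V], pattern: str) -> Iterator[Tuple[str, V]]:
    """Lazily yield (key, value) pairs whose key matches the regex."""
    regex = re.compile(pattern)
    for key, value in table.items():
        # Each entry is checked in place; no key or value arrays are built.
        if regex.search(key):
            yield key, value
```

Because it is lazy, a caller that only needs the first match can stop after one `next()` call and never touch the rest of the table.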

Darren
Proven Zealot
Status changed to: In Development
 
GregSands
Active Participant

Smiley Happy

Rob_Calhoun
Member

Status changed to: In Development

 

Well, that is welcome news indeed.

 

We make extensive use of variant attributes as "variant hash tables" in our application. I found 141 uses of "set variant attribute" and 166 uses of "get variant attribute" in the ~5000 VIs that make up the project. For Darren's benefit I wanted to show two typical uses, the "strongly-typed hash table" and the "weakly-typed hash table", or "dynamically-defined object".

 

Strongly-typed Hash Tables:

This is for data structures that can be defined at compile time, and is the most common use case in our code.

 

Frequently, there is a need for an application to manage some set of objects or internal state. In this case, all values have the same data type, and the objects/states are accessed via a string key.

A typical example looks like this when implemented with variant hashes:

variant-hash.png

This has good performance but since variant hashes are not strongly-typed, the compiler can't check the code for correctness. Things get particularly messy when the hash table "references" (a by-value variant!) are passed from one SubVI to another, since a variant control will accept anything. Any variant hash table that is scoped outside of a single VI is inviting trouble.

 

Much better would be a strongly-typed hash table modeled after LabVIEW's excellent queue and notifier functions. For example, say we could create a hash table like this:

 

strongly-typed.png

Much better! Now the compiler can verify code correctness before execution, hash refnums can't be miswired, and NI can optimize the internals any way it wants to. In this way the useful but obscure variant hash feature would be made available to a wider audience. I'm not uninterested in functions that allow regexp key retrieval (a nifty idea!) etc., but even just a set of basic hash table functions (create_hash, add_item, retrieve_item, delete_item, retrieve_all, delete_all, release_hash) would go a long way towards filling the very large hole in the LabVIEW toolbox. I very much look forward to it.

 

Weakly-Typed Hash Tables (Dynamically Defined Objects):

Perhaps "dynamically defined object" is a better term for this than "weakly-typed hash table", but I am adding it to this associative array thread because I'm thinking about json-style objects that have only properties associated with them, and these are more similar to associative arrays / hash tables than to full LabVIEW objects.

 

These "property-only" objects are still very useful. It's nice to have strongly-typed data, but sometimes code cannot know the format of the data it will be working with until runtime. Most of the time the requirement for dynamic data types has come from the need to interact with various web services. For example, the AWS Machine Learning API defines the feature vector as a variable length record of name-value pairs. As the data format for contemporary web services is nearly always json, that is the natural serialization format for dynamic objects and we might as well set out the requirement as "LabVIEW must be able to instantiate a json-defined object at runtime and re-serialize it again to json without data loss".

 

It is actually possible to do this in current versions of LabVIEW. The technique again relies on the obscure "variant hash" features of variants. The key concept is that a variant attribute value can itself be a variant, which allows for arbitrary tree construction, just what we need to implement dynamically-defined objects. (In the code below, the "{}" VI is just an empty variant, there as a reminder that it is a variant being used as an object.)

 

dynamic-object.png
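
For comparison, the same "attribute value can itself be a container" idea in Python is just nested dictionaries. This is a sketch, not a translation of the diagram above: `set_path` and `get_path` are hypothetical helper names, and an empty `dict` plays the role of the "{}" empty variant.

```python
from typing import Any, Dict, Sequence


def set_path(obj: Dict[str, Any], path: Sequence[str], value: Any) -> None:
    """Set a nested property, creating intermediate objects as needed."""
    node = obj
    for name in path[:-1]:
        # An empty dict stands in for the empty variant used as an object.
        node = node.setdefault(name, {})
    node[path[-1]] = value


def get_path(obj: Dict[str, Any], path: Sequence[str]) -> Any:
    """Walk down the tree one property name at a time."""
    node: Any = obj
    for name in path:
        node = node[name]
    return node
```

For example, `set_path(obj, ["sensor", "range", "max"], 500)` builds the intermediate "sensor" and "range" objects on the fly. The sketch assumes intermediate names never collide with an existing non-object value.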

 

We still need a serializer/deserializer. By modifying Tomi Maila's outstanding JKI JSON package to support variant hash tables, as is done in my fork of JKI JSON, we gain the ability to create dynamically-defined objects from arbitrary json and vice-versa. I try to use strongly-typed data, typedefs, etc. wherever possible, but when the data format is not known at compile time this technique is invaluable. Since a "dynamic object" is just an empty variant, you can even embed weakly-typed objects inside strongly-typed controls. These tools let one solve programming tasks that are trivial in python, javascript, etc. but quite challenging in LabVIEW. Here's a good example: a json pretty-printer in two VIs!

 

pretty-print.png
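
To show what "trivial in python" means here, this is roughly the equivalent using only Python's standard `json` module. The function name `pretty_print` is my own; `json.loads` and `json.dumps` are the real library calls.

```python
import json


def pretty_print(text: str, indent: int = 2) -> str:
    """Parse arbitrary json into a dynamic object, then re-serialize it."""
    obj = json.loads(text)  # objects become dicts, arrays become lists
    return json.dumps(obj, indent=indent)
```

The parse step is the "instantiate a json-defined object at runtime" half and the dump step is the "re-serialize" half, with no schema known in advance.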

 

 

The resulting code works fine, has good performance and mixes well with strongly-typed data, but it's such a baroque usage of variant attributes that it's hard to claim it's "good code". I'd be happy to switch to something cleaner.

 

The lack of associative arrays / hash tables is the number one weakness of LabVIEW as a language. I'm boggled that it has taken 8 years for Altenbach's suggestion to move to "In Development" but I am certainly pleased to see that it finally has.

 

Rob Calhoun

AristosQueue (NI)
NI Employee (retired)

Rob: Thanks for the feedback. Here's an image from my computer...

Untitled.png

AristosQueue (NI)
NI Employee (retired)

To go a bit further... I know your strongly-typed hash tables case is covered by the new feature. I'm not sure you'll see much improvement in the weak-typed case ... that map will still be string-to-variant. So it'll be clearer from your API that you're using a map (instead of a top-level variant that seems strange to anyone unfamiliar with the technique), but you're still handling the underlying data as a variant. ... I suppose in some use cases that you might do better than string-to-variant by integer-to-variant, where integer is an index into the original JSON string, thus reducing the copy overhead. Whatever you pick, the key can at least be strongly typed.

Darren
Proven Zealot
Status changed to: In Beta
 
Darren
Proven Zealot
Status changed to: Completed

Available in LabVIEW 2019 with the Collection > Map functions and VIs. 

Rob_Calhoun
Member

There is a bug in the LabVIEW 2019 (19.0) "Map" implementation which causes LabVIEW to return the wrong values when reading maps with integer keys in an array context. For string values, it returns empty strings; for array values, it returns empty arrays; for float values, it returns some value (I got 5.31406E-315). See attached example.

 

The workaround is either to pass the map into array as both a map and as an array, get the list of keys from the array context then look up the value from the map context, or stick with string keys, which seem to be ok, or at least not obviously broken.

 

I will also file a support ticket. Sigh.

 

map bug

 

LabView2019MapBug2.png