LabVIEW

cancel
Showing results for 
Search instead for 
Did you mean: 

Python Dictionary ported to LabVIEW

I missed having proper hash map dictionaries in LabVIEW and was not happy with the implementation from Bloomy so I implemented a proper dictionary in LabVIEW. Based mostly off of this wonderful talk from one of the core Python developers, it uses 4 arrays to track the keys, values, hashes and index map and 2 ints to keep track of how many elements there are and how many buckets are used in the index map.

 

On average, inserts and gets are much faster than variant attributes, relabels are orders of magnitude faster and deletes are a tad slower.

1000 elements

Capture.PNG

 

Try it out and see if you can squeeze any more performance out of it. I'm not super familiar with the intricacies of the LabVIEW compiler so I'm sure there is something I've overlooked performance-wise but it was pretty fun nonetheless and I'm quite happy with the results.

Message 1 of 20
(3,835 Views)

I'd really like to see how you did this, and perhaps use it in our application. Would you mind posting a 2013 port? Thanks!

0 Kudos
Message 2 of 20
(3,713 Views)

@OneOfTheDans wrote:

I'd really like to see how you did this, and perhaps use it in our application. Would you mind posting a 2013 port? Thanks!


Attached.

0 Kudos
Message 3 of 20
(3,697 Views)

Thanks for the port, this looks neat! Did you build this exactly following Python's algorithm, or just was that just the source of inspiration? And this handles key collisions by re-modulating the hash inside _lookup? I'm looking forward to using this in one of our projects soon. I'll touch base here if there are any issues, or further optimizations.

0 Kudos
Message 4 of 20
(3,663 Views)

I can't open your code right now, but i'll assume you have 1 variant, right? You could do an array of variants, similar to buckets and then use attributes as the list. I wonder how that'll compare. 🙂

/Y

G# - Award winning reference based OOP for LV, for free! ADDQ VIPM Now on GitHub
"Only dead fish swim downstream" - "My life for Kudos!" - "Dumb people repeat old mistakes - smart ones create new ones."
Certified-LabVIEW-Developer
0 Kudos
Message 5 of 20
(3,654 Views)
@OneOfTheDans wrote:

Thanks for the port, this looks neat! Did you build this exactly following Python's algorithm, or just was that just the source of inspiration? And this handles key collisions by re-modulating the hash inside _lookup? I'm looking forward to using this in one of our projects soon. I'll touch base here if there are any issues, or further optimizations.


I pulled the majority of the algorithm straight from Python. Only thing that's a bit different is the resizing strategy and that's mostly because LabVIEW makes an array copy whenever you append or delete from the end of an array.

 

The hashing scheme uses a linear probe with perturbation on hash collision. Sounds complicated but the source code is really well commented.

 


@Yamaeda wrote:

I can't open your code right now, but i'll assume you have 1 variant, right? You could do an array of variants, similar to buckets and then use attributes as the list. I wonder how that'll compare. 🙂

/Y


What you're suggesting sounds very similar to Bloomy's implementation. The problem with using Variant Attributes is that they use a binary search tree (O(log n)) to store values rather than a hash map (O(1)).

0 Kudos
Message 6 of 20
(3,636 Views)

If each Hash Bucket is small, the log n-part is small. The O(1) is only quite true if you have enough buckets. 🙂 I'll have to look at it, but it feels like the implementation would be simpler.

How big is your Hash-structure? Is this a case of sacrificing memory for speed?

/Y

G# - Award winning reference based OOP for LV, for free! ADDQ VIPM Now on GitHub
"Only dead fish swim downstream" - "My life for Kudos!" - "Dumb people repeat old mistakes - smart ones create new ones."
Certified-LabVIEW-Developer
0 Kudos
Message 7 of 20
(3,611 Views)

@Yamaeda wrote:

If each Hash Bucket is small, the log n-part is small. The O(1) is only quite true if you have enough buckets. 🙂 I'll have to look at it, but it feels like the implementation would be simpler.

How big is your Hash-structure? Is this a case of sacrificing memory for speed?

/Y


The implementation is definitely a bit more involved but simpler isn't always better (and it's really not too bad, only __resize.vi does anything weird).

 

I do resize a bit more aggressively than Bloomy dicts (quadrupling the number of buckets until over 50k then only doubling) but each bucket is still pretty small (1.67 I32 integers + 0.67 strings + 0.67 variants per bucket) so there is no monstrous space consumption because of it. There is definitely a memory vs speed trade-off between dictionaries and binary trees though, always has been.

 

I made this as an exercise to get myself back into LabVIEW. I shared it because I thought it was interesting and potentially useful. I'm not looking to argue about the algorithm, it's been in use for years by one of the most popular languages on the planet. I would love for someone to nitpick my code though. I'm not as familiar with LabVIEW as I am other languages and I'm sure there's an unneeded buffer allocation or something somewhere that I've missed.

 

 

 

0 Kudos
Message 8 of 20
(3,599 Views)

Here is some nitpicking, do not know if it will be useful. Your skill level is above mine.

 

1.) Unnecessary Buffer allocation in _hash.vi. You have a buffer allocation going from V8 to I32 which is unneeded. Convert to I32 for the element and in the loop, see below.

 

pict1.png

 

2.) For the _resize.vi you can try "reshaping" the array instead of making a new array and inserting the old values in. See below.

 

pict2.png

 

I have not tested any of these suggestions, so not sure if they are faster, better, etc.

 

Cheers,

mcduff

0 Kudos
Message 9 of 20
(3,587 Views)

Great code 🙂

 

I replaced the variant-based core of some of the dictionary-solutions available on ni.com (like the Point Value Map for example) with this hash map to get the best of both worlds (multiple nameable dictionaries etc), and got a very nice performance improvement (perhaps an idea the owners of those solutions would be interested in...).

 

I also did a version that only supports doubles, just to see how fast I could get it. Brilliant. It is the fastest I've found in native LabVIEW code.

0 Kudos
Message 10 of 20
(3,070 Views)