03-15-2020 09:27 AM - edited 03-15-2020 09:44 AM
Hi Everyone,
As weird as the title sounds, I will try to explain my problem here.
I have two nested actors acquiring data from different places, with a similar period, let's say 1 Hz. All the messages are sent back to the caller. The caller then decides what to do with the data.
Now I introduce a third actor, which uses the data from these two producer actors to perform a given calculation. I know that these data can't be synchronous, given all the issues involving parallel execution, jitter, and so on.
But I want to use the latest data from both, and these data must not be too far apart in time.
I have already thought of using a shift register to store the first incoming data and wait for the second, but I am not sure it is the proper way of handling this in AF. Probably it is just my habit of always thinking asynchronously that keeps me from seeing it clearly.
Do you guys have any thoughts about it?
Regards,
03-16-2020 09:26 AM - edited 03-16-2020 09:26 AM
This is not a request I've ever gotten before, so I'm just brainstorming here.
Let's clarify the request a bit.
Root sends "start collection" message to A and B.
A and B send replies back to Root.
Root then sends the combined AB data to C.
Yes?
Assuming I've got that right...
Let's denote a message from A with timestamp as A1, A2, etc, and similar for B as B1, B2, etc.
A and B start running. They're both running free, so one of them will always be ahead of the other, but because of jitter, which one is ahead can shift. So let's look at some scenarios.
Root starts receiving data:
A1, A2, A3, A4, A5...
As long as Root doesn't have data from B, it cannot send to C. Root knows that it will eventually get data from B. When that data arrives, is it guaranteed to be B1? Or could it be anywhere in the range? I'm going to assume it can be anywhere in the range. That means that Root has to store all the A values in its private data. So you have some sort of internal data structure to store the list. You'll need two such structures, one for A data and one for B data.
Now, that internal data structure could be an array. But I suggest a private queue refnum would give you better performance for what you're doing. Even if you decide to use an array, you're going to use that array as if it is a queue: insert arriving data at one end, removing data from the other end so the data is processed in order. For this reason, regardless of which actual data type you choose to use, I'm going to use the term "queue" in the text below.
Ok, now the first B message arrives:
B3
Let's say you stored data in a queue. Because the A queue is not empty, you know that A is currently ahead of B, so start dequeueing from the A queue and comparing timestamps. A1? Too old. A2? Too old. A3? Ah match. Pass A3B3 message to C.
Next B arrives again:
B6
Again, start dequeuing. A4? Too old. A5? Too old. But A5 is the last message in the queue. You have a choice... do you use A5 or wait for better correlated data? I suggest you use A5 as it is the most recent from A matching the most recent from B. This means you send the message A5B6 to C.
At this point both queues are empty, so whichever one arrives next will go in the queue. The patterns above work regardless of which one pulls ahead.
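So the rules of this plan are simple:
- If data arrives from A and the B queue is empty, enqueue the A message in the A queue. Same for data from B when the A queue is empty.
- If data arrives from A and the B queue is not empty, dequeue from the B queue until you find the first message equal or greater in time to the A message OR you get to the last message in the B queue. Send the combined A+B message. Same for data from B when the A queue is non-empty.
In this scheme, you should never have both A queue and B queue be non-empty at the same time.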
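Since LabVIEW code is graphical, here is a rough text sketch of the pairing scheme in Python, just to make the queue handling concrete. The class name `PairingBuffer`, the method `on_message`, and the `(timestamp, value)` tuple layout are made up for illustration; in AF this logic would live in Root's message handlers with the two queues in Root's private data.

```python
from collections import deque


class PairingBuffer:
    """Pairs messages from two free-running sources, "A" and "B" (hypothetical names)."""

    def __init__(self):
        # One FIFO per source; at most one of them is non-empty at any time.
        self.queues = {"A": deque(), "B": deque()}

    def on_message(self, source, timestamp, value):
        """Handle one arriving message.

        Returns None if the message was only stored, or a pair
        ((a_time, a_value), (b_time, b_value)) ready to forward to C.
        """
        other = "B" if source == "A" else "A"
        pending = self.queues[other]

        if not pending:
            # The other source has nothing waiting: store and wait.
            self.queues[source].append((timestamp, value))
            return None

        # Discard entries older than the arriving message, stopping at the
        # first one that is equal or newer, or at the last entry in the queue.
        candidate = pending.popleft()
        while candidate[0] < timestamp and pending:
            candidate = pending.popleft()

        incoming = (timestamp, value)
        return (incoming, candidate) if source == "A" else (candidate, incoming)
```

Feeding it A1 through A5 and then B3 and B6 gives the A3B3 and A5B6 pairs from the walkthrough, and leaves both queues empty afterwards. A maximum allowed timestamp difference is not checked here; that would be an extra comparison before returning the pair.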
Does this match your desired data pairing for C?
03-16-2020 09:58 AM
If you are OK with having the latest values for both places, perhaps something like the CVT (Current Value Table), or an actor that just keeps track of the current value for all your actors, would work.
03-16-2020 10:56 AM
The specific solution will definitely depend on your actual data and programmatic requirements. AQ and Fabiola's solutions are both great, but another option (if C needs REALLY well synchronized data, and it doesn't need to process it in real-time but can be delayed by a few samples) is to upsample both incoming data streams by, say, 10x or so, then use the upsampled data as the input to your third actor. If I had two independently sampled data sets and was trying to merge them in Excel I think this is the method I'd use.
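To make the resampling idea concrete, here is a minimal Python sketch (LabVIEW being graphical). It assumes linear interpolation onto a shared time grid, which is only one possible way to "upsample"; the function names `interpolate` and `resample_pair` and the `step` parameter are hypothetical, and timestamps are assumed to be ascending.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate the series (times, values) at time t.

    times must be in ascending order; t outside the range is clamped
    to the first or last value.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def resample_pair(a_times, a_values, b_times, b_values, step):
    """Resample two independently timed series onto one grid.

    The grid covers only the interval where both series have data.
    Returns a list of (t, a_at_t, b_at_t) tuples.
    """
    start = max(a_times[0], b_times[0])
    stop = min(a_times[-1], b_times[-1])
    count = int((stop - start) / step + 1e-9) + 1
    grid = [start + k * step for k in range(count)]
    return [
        (t, interpolate(a_times, a_values, t), interpolate(b_times, b_values, t))
        for t in grid
    ]
```

With two 1 Hz streams, a `step` of 0.1 s would correspond to the 10x upsampling mentioned above. The consumer then reads A and B at identical grid times, at the cost of waiting until both streams have a sample past the grid point.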
The last time I needed to do this was for a system that sampled a load cell and a position sensor to measure stiffness of an object. I basically used Fabiola's method but didn't use any external libraries as I just had two data sources. The load cell was very well timed via DAQmx but the position sensor was coming in from another controller whose timing I couldn't verify. In this situation, they wanted to plot position vs load for display purposes and would do the "real" analysis later on, so I just coded it so that each time the force sensor reported data, it took the most recent position data and plotted it. It was very simple and didn't take much time to do, and the customer was happy as they could do any splining or reinterpolation after the fact, when they could work with whole data sets instead of trying to do the computations in real time.
03-16-2020 11:26 AM
@AristosQueue (NI) wrote:
So the rules of this plan are simple:
- If data arrives from A and the B queue is empty, enqueue the A message in the A queue. Same for data from B when the A queue is empty.
- If data arrives from A and the B queue is not empty, dequeue from the B queue until you find the first message equal or greater in time to the A message OR you get to the last message in the B queue. Send the combined A+B message. Same for data from B when the A queue is non-empty.
In this scheme, you should never have both A queue and B queue be non-empty at the same time.
Thank you for the fast reply.
You've got exactly what I was thinking.
This solution may solve the problem, although I really need to check the requirements to see what the maximum allowed timestamp difference between them is. I suspect it is not that restrictive.
I'll mark your post as the current solution.
By the way, do you think this solution may be scalable? i.e. another actor comes in the scene.
Regards,
03-16-2020 11:31 AM
@FabiolaDelaCueva wrote:
If you are OK with having the latest values for both places, perhaps using something like the CVT (Current Value Table) or an actor that just keeps track of the current value for all your actors
Thanks for suggesting this library. I haven't used it yet, but I'll look into it.
In a quick overview of the content, I did not see any timestamp related to the value, am I right?
Not exactly in this case, but it may be a solution for other projects I am handling.
Regards,
03-16-2020 01:20 PM
By the way, do you think this solution may be scalable? i.e. another actor comes in the scene.
03-16-2020 07:34 PM
@felipe.foz wrote:
In a quick overview of the content, I did not see any timestamp related to the value, am I right?
No, there is no timestamp out of the box, but you could use the CVT library as what it is: a reference design, a springboard for you to make your own. They are using a lookup table implemented with variants inside an Action Engine to keep track of the current values. You could save this library or fork its source code and create your own where you add to the cluster of current values a timestamp for each value.
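As a rough text illustration of that idea (not the CVT library's actual API), a current value table that stores a timestamp next to each value could look like the Python sketch below. The class `TimestampedValueTable`, its methods, and the `max_skew` parameter are all hypothetical names; in LabVIEW the Action Engine would serialize access, which this plain dictionary does not attempt.

```python
import time


class TimestampedValueTable:
    """Keeps the most recent (value, timestamp) per tag. Hypothetical sketch."""

    def __init__(self):
        self._entries = {}

    def write(self, tag, value, timestamp=None):
        """Store the latest value for a tag, stamping it now if no time is given."""
        if timestamp is None:
            timestamp = time.time()
        self._entries[tag] = (value, timestamp)

    def read(self, tag):
        """Return (value, timestamp) for a tag; raises KeyError if never written."""
        return self._entries[tag]

    def read_if_close(self, tags, max_skew):
        """Return {tag: value} if every tag has a value and all timestamps
        lie within max_skew seconds of each other; otherwise return None."""
        entries = [self._entries.get(tag) for tag in tags]
        if any(entry is None for entry in entries):
            return None
        stamps = [stamp for _, stamp in entries]
        if max(stamps) - min(stamps) > max_skew:
            return None
        return {tag: entry[0] for tag, entry in zip(tags, entries)}
```

The consumer would call `read_if_close(["A", "B"], max_skew)` whenever it wants a pair and skip the calculation when it gets `None` back, so stale combinations never reach it.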
Happy wiring,
Fab
03-17-2020 06:26 AM
@AristosQueue (NI) wrote:
Whether or not it is scalable enough is hard to answer. There are different types of scalability.
Memory scaling: If your actors are very fast and have little latency, the queues might not get that big. Or they might blow up a lot.
Performance scaling: If your system has a lot of things going on, those queues might get large and suddenly the timestamp comparison becomes a performance bottleneck.
Implementation scaling: If you hardcode in two queues into your Root, then you have to hardcode in a third. But if you make it an array of queues, then you can freely add more actors.
Scalability is a question only you can answer for your situation.
Understood. Thanks for your feedback. I'll be coding these possibilities, and I'll let you know when I get some news.
03-17-2020 06:27 AM
@FabiolaDelaCueva wrote:
No, there is no timestamp out of the box, but you could use the CVT library as what it is: a reference design, a springboard for you to make your own. They are using a lookup table implemented with variants inside an Action Engine to keep track of the current values. You could save this library or fork its source code and create your own where you add to the cluster of current values a timestamp for each value.
Happy wiring,
Fab
Nice. Thanks for the tip Fabiola.
Regards,