Hardware Developers Community - NI sbRIO & SOM

TCP/IP connection is terminated on the client side when I load a new FPGA bitfile on NI sbRIO-9651 SOM

Hi Developers,

 

I use the NI sbRIO-9651 SOM in our product, which acts as a server and accepts TCP/IP connections over Ethernet. I have noticed some strange behavior with keeping connections alive.

 

What I do:

1. The device boots up, loads an FPGA bitfile, and starts listening for connections (it is the Server)

2. I establish a TCP/IP connection from a Client (PC) side

3. Then I communicate with the device, perform some actions

4. Then I send the device a TCP/IP command asking it to load another bitfile onto the FPGA. And then something interesting happens:

 

5a. If I have a direct cable connection (1-to-1), the TCP/IP connection on the Client side is terminated with Error 66 (The network connection was closed by the peer). The Server side doesn't detect the problem right away; only after 20 seconds or so does it close the broken connection as well.

5b. If I have an Ethernet switch between my device and my PC, nothing bad happens and the connection stays alive.

 

Have you guys seen something like this before? Do you know why this happens? What would you do to work around this difference for an end user?

 

Thank you in advance!

 

 

 
 

 

 

Nikita Prorekhin
CLA
0 Kudos
Message 1 of 11
(1,868 Views)

Hi Nikita,

 

On the sbRIO-9651, the Ethernet interface goes through the FPGA fabric. When you load a new FPGA bitfile, the connection between the CPU and the Ethernet PHY is briefly disrupted. Normally this does not cause any issues. However, the 9651 will miss some packets during this time, and the link may briefly go down. That brief link drop may be causing your host PC to drop the connection, whereas with a switch in between, the host PC's link never drops and everything continues to work. This is my best educated guess at what is causing the behavior you are seeing.

 

Thanks,

Nathan

Nathan
NI Chief Hardware Engineer
0 Kudos
Message 2 of 11
(1,846 Views)

Thank you Nathan, this explains the behavior I am seeing.

 

That is why we always suggest that our customers use switches or routers with our device, but this is not as convenient as a direct cable or a USB-to-LAN adapter. Not only is our API connection broken, but so is an SSH connection. In addition, the device tries to acquire a new IP address from a DHCP server (if no static IP address is configured) and sometimes even stops responding to ping commands for 30 seconds or so.

 

You wrote "The link dropping briefly may be causing your host PC to drop the connection" and "Normally this does not cause any issues." This makes me wonder whether we made a mistake in our carrier board design, since it always happens: all connections are broken, the network is reset, and the device requests an IP address from a DHCP server.

 

Do you think we have some flaws in our board design so we always get this issue?

Do you think there are any hardware tricks we could use on a carrier board to work around this?

Do you know if a new SOM-like device is on the NI roadmap, so we can potentially discuss this issue and influence product design decisions?

 

There are a lot of questions, I know :) We like your product and want to keep using it in the future.

Thank you very much for your time!

Nikita Prorekhin
CLA
0 Kudos
Message 3 of 11
(1,830 Views)

Hi Nikita,

 

Is this behavior happening on the "primary" Ethernet port (the one where you just provide the magnetics and connector on your board)? Or is it happening on the optional secondary port, where you provide the Ethernet PHY on your board?

 

If it is the primary port, then it is unlikely that your layout is to blame. If it is the secondary port, then the issue might be made worse depending on how your board handles the Ethernet PHY's reset signal.

 

Part of the problem with connecting directly to the host PC is that it may drop all connections as soon as it sees a link down, regardless of the timeouts, etc. for each individual connection. The switch resolves this problem because the host PC always maintains link with the switch. If you absolutely cannot have a switch, then I would look at software solutions to the problem, such as automatically shutting down and re-establishing connections to the SOM when you reconfigure the FPGA.
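A minimal sketch of what such a software workaround could look like on the client side, written in Python for brevity (the same pattern should carry over to a LabVIEW client). The `ReconnectingClient` class, the `LOAD_BITFILE` command string, and the retry parameters are all hypothetical and not part of any NI API. The idea is for the client to close its socket deliberately right after requesting the reconfiguration, then retry until the SOM is reachable again, rather than waiting for the host OS to tear the connection down on link loss.

```python
import socket
import time


def connect_with_retry(host, port, attempts=10, delay_s=1.0, timeout_s=2.0):
    """Try to open a TCP connection, retrying while the link is down."""
    last_error = None
    for _ in range(attempts):
        try:
            return socket.create_connection((host, port), timeout=timeout_s)
        except OSError as exc:  # refused, unreachable, timed out, ...
            last_error = exc
            time.sleep(delay_s)
    raise ConnectionError(f"could not reach {host}:{port}") from last_error


class ReconnectingClient:
    """Hypothetical client wrapper that survives an FPGA reconfiguration."""

    def __init__(self, host, port):
        self.host, self.port = host, port
        self.sock = connect_with_retry(host, port)

    def reload_bitfile(self, command=b"LOAD_BITFILE\n", settle_s=0.0, **retry):
        # Send the (hypothetical) reload command, then close the socket
        # ourselves instead of waiting for the OS to report a link drop.
        self.sock.sendall(command)
        self.sock.close()
        time.sleep(settle_s)  # give the FPGA time to reconfigure
        # Extra keyword arguments are forwarded to connect_with_retry.
        self.sock = connect_with_retry(self.host, self.port, **retry)
```

Because the client closes the connection itself, the server sees an orderly close before the reconfiguration starts and does not have to wait for its own timeout to discover a dead connection. Suitable values for `settle_s` and the retry parameters would need to be found by experiment.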

 

I don't have any information currently on future SOM products, but I would love to hear any feedback you have on our current offerings. 

 

Note: you have another option if the host PC is close enough for a USB connection to the SOM. This may require a board change on your part, but you can use one of the SOM's USB ports as a device port. When set as a USB device port and attached to a host PC, it looks like a USB Ethernet adapter to the host PC and works just like our normal Ethernet connections to the SOM do. The difference is that it does not go through the FPGA and would not be affected by reconfigurations.

 

Thanks,

Nathan
NI Chief Hardware Engineer
Message 4 of 11
(1,818 Views)

Hi, Nathan

 

Just curious whether the issue you mentioned above also affects the sbRIO-9607, which is also based on the Zynq-7020. The problem we've run into is that the primary Eth0 port sometimes stops working when the target is rebooted. I can see the yellow light on the Eth0 RJ45 stay solid yellow instead of flashing. To be clearer, the yellow and orange lights on the RJ45 port are both on without flashing. I'm wondering whether Eth0 on the 9607 also goes through the FPGA fabric. If not, do you have any idea why this happens sometimes?

 

Thanks,

Richie

0 Kudos
Message 5 of 11
(1,784 Views)

Hi Richie, 

 

Yes, the sbRIO-9607 is the same. Its primary and secondary Ethernet ports go through the FPGA fabric. Any time you reboot the CPU, the Ethernet PHYs are also reset, but the ports should always work after the reboot. Are you saying it does not work at all after a CPU reboot? If so, can you describe how you are causing the reboot?

 

Thanks,

Nathan
NI Chief Hardware Engineer
0 Kudos
Message 6 of 11
(1,747 Views)

Hi Nathan,

 

We use Eth0 on our board, so it seems our current board design has no flaws and we need to look for a software solution.

 

We could change our board design in the future to provide a USB-Ethernet connection, as you suggested. However, I was not able to find any mention of this functionality in https://www.ni.com/pdf/manuals/376962c.pdf (from page 9) or in https://www.ni.com/pdf/manuals/376960c.pdf

 

Do you know which port (USB0 or USB1) and which mode (Host/Device) we should use?

 

Thank you again,

Nikita.

 

Nikita Prorekhin
CLA
0 Kudos
Message 7 of 11
(1,734 Views)

Hi, Nathan

 

There are three scenarios.

1. When we use the NI RAD Utility to flash an image to a new board, the tool requires a reboot. Sometimes the YELLOW light on the RJ45 stays YELLOW after the reboot.

2. Sometimes we use WinSCP to run a reboot command.

3. In our code base, we sometimes use System Exec.vi or nisyscfg.lvlib:Restart.vi to reboot the CPU.

 

We have run into the broken connection after rebooting with each of the three methods above.

 

BTW, since the Eth ports go through the fabric, do you recommend that we download the bitfile to flash, or download it from our software?

 

Thanks,

Richie

0 Kudos
Message 8 of 11
(1,722 Views)

Hi Nikita,

 

Device mode is what I am talking about and what you want to use: USB0 in device mode. It will work as a USB Ethernet connection.

 

Thanks,

Nathan
NI Chief Hardware Engineer
0 Kudos
Message 9 of 11
(1,709 Views)

Hi Richie,

 

A connection to the device should break when it is rebooted. This is because the entire device is reset, including the ethernet ports. Once reboot is complete you should be able to reconnect to the device. If this is not working then your device may be damaged. Please message me directly if you think that is the case.
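One way to check this from the host is to time how long the device takes to accept TCP connections again after a reboot. The following Python sketch assumes a hypothetical helper `wait_for_port`. The host name, port, and timing values are placeholders chosen for illustration (for example, the SSH port, if SSH is enabled on the target).

```python
import socket
import time


def wait_for_port(host, port, deadline_s=120.0, poll_s=1.0):
    """Return seconds until host:port accepts a TCP connection, or None."""
    start = time.monotonic()
    while time.monotonic() - start < deadline_s:
        try:
            # A successful connect means the port is reachable again;
            # the probe socket is closed immediately by the with-block.
            with socket.create_connection((host, port), timeout=poll_s):
                return time.monotonic() - start
        except OSError:
            time.sleep(poll_s)
    return None  # the port never came back within the deadline
```

A `None` result after a generous deadline would correspond to the case where the port does not come back at all, and is worth logging together with the method used to trigger the reboot.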

 

Thanks,

 

Nathan
NI Chief Hardware Engineer
0 Kudos
Message 10 of 11
(1,706 Views)