L33R Network Drop

Gadelric (Member, joined Nov 2018, Location: Midwest, Posts: 137)

Team,

I have a CompactLogix L33R that is having a random network issue.

About a month ago, I had my first issue with this station.
The HMI indicators showed error states, and the secondary testing PC reported that it could not connect to the PLC.
The Raspberry Pi (RPi) that connects to the HMI over TightVNC was still responding to HMI screen changes, so it could still see the HMI.
I corrected the issue by power cycling the switches.

Two days later, the issue came back; I power cycled the switches and it recovered. There were two 4-port 10/100 switches in the cabinet. All I had in MRO was a 12-port gigabit switch, so I popped it in during their break and everything worked.

Two days later, the PLC dropped again while the RPi stayed connected to the HMI. With only a single switch in use now, I ruled the switch out as the potential issue. I noticed the network cable to the PLC was hand made, replaced it with a pre-made one, and everything came back.

The next day, the issue returned. I unplugged the network cable from port 1 on the PLC and plugged it back in, with no change. I plugged it into port 2 and everything returned.

A couple of weeks later (last night at 2 am, to be exact), I got the call that it's down again. I walked 3rd shift maintenance through unplugging the cable and reconnecting it, with no resolution. We tried switching back to port 1, again with no resolution. Of note: the link light would start flashing once the network cable was plugged into either port 1 or port 2, but the IO light never went solid.
We power cycled the station, with no resolution. We power cycled it a second time, pulled the fuse from the battery backup to drop the PLC, and then powered it all back up, and everything works.



For now, I have the network cable pulled from the wire tray and jumpered directly from the switch to the PLC. The developer was not very diligent about separating the high and low voltage in this cabinet.
I am not completely sold that the PLC is the issue, but I'm starting to get to the point of grasping at straws.
Any recommendations as to what I should do to isolate the PLC from the equation?


Thanks,

Gad
 
Is the HMI in the IO tree of the PLC?

The flashing IO light means that one or more modules are not communicating on the network. If you have access to the PLC program, I would look to see which IO module is not communicating before trying to get the system back up, so that it narrows down your issue.

By chance could something with a duplicate IP address be joining the network when this happens?
 
Dock,

The HMI is not mapped to the IO tree. The only things mapped are the Atlas Copco torque gun, 3 Keyence IV nav cameras, and 9 1732E-16CFMG12R/A ArmorBlock modules.
This system is behind a NAT device.
There is a second network within this station.
The second network is on a separate subnet that is connected to the tester PC.
Although I never want this thing to drop again, I am hoping that when it does I will be here and will try to connect directly to the PLC via one of the open ports and see what is missing.

If I was just dropping the HMI, that would be a place to start.
I am, however, dropping just the PLC from both the HMI and the test equipment PC.
The Raspberry Pi still stays connected to the HMI via TightVNC. Pi to HMI and HMI to PLC use the same switch.
 
Have you checked the webpage for diagnostics? Check counters and look for any faults as well as total connections. Something could be connecting and not gracefully closing.
 
Maxkling has good advice. You could also do this in the controller properties, since you have access to it.

When you say VNC, do you mean an actual program like TightVNC, or are you running the HMI through FTView ME on a PC?

Another thing I would try would be to change the IP address of the HMI. If for some reason there are duplicates joining the network, this would resolve it. Obviously there are lots of other possible reasons for the network errors, but it seems local to the HMI.

The PLC is the host; if you can communicate with it, then I would focus on the other network IO.
 
Team,

After a long day of crawling through this machine, doing all the wiggle tests and watching processes run, I have not been able to "make it" crash.

I have a series of tests set up to run on this station if/when this issue happens again.
Until then, I am at a loss as to what the root cause is.
We also took the extra step to face a security camera at this station so we can see if things are being plugged into the line by operators.
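
One test of this kind could be a small reachability logger left running on a PC on the same subnet, so that the exact time of each drop is on record. The sketch below is only an illustration, not what is actually running on this station: it assumes the PLC accepts TCP connections on port 44818 (the standard EtherNet/IP explicit messaging port), and the log file name and polling interval are made-up defaults.

```python
import socket
import time
from datetime import datetime


def port_reachable(host, port=44818, timeout=1.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def state_changes(samples):
    """Reduce (timestamp, reachable) samples to the points where the state flips.

    The first sample is always reported, since there is no earlier state.
    """
    changes = []
    previous = None
    for stamp, reachable in samples:
        if reachable != previous:
            changes.append((stamp, reachable))
            previous = reachable
    return changes


def watch(host, interval=1.0, log_path="plc_reachability.log"):
    """Poll forever, appending a line to log_path whenever reachability changes."""
    previous = None
    while True:
        reachable = port_reachable(host)
        if reachable != previous:
            with open(log_path, "a") as log:
                state = "UP" if reachable else "DOWN"
                log.write(f"{datetime.now().isoformat()} {host} {state}\n")
            previous = reachable
        time.sleep(interval)
```

Calling `watch("192.168.1.10")` (a placeholder address) would run until interrupted. Logging only the state changes keeps the file short enough to line up against the security camera footage.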

Random gremlins are the worst.


Gad
 
Someone suggested checking for a duplicate IP address. Did you test this theory?

Were the switches Phoenix brand?

The C-shift guy possibly falling in love with you.

RPI, or more properly RPi: can you live without it, or disconnect it?
 
Did you check the diagnostic counters?


Yes I did, nothing stands out, no errors.


The only thing I could find in the Event Log was:

SysLog: detected a slow reader | 31830 | SysLog.cpp | 519 | 1 day, 12h:22m:28s | 0 | 0
Slave clock outside sync threshold | 31829 | OsFastTimerImpl.cpp | 648 | 1 day, 12h:22m:28s | 65516 | 0
Connection timeout notification | 31828 | CmdConsTimeout.cpp | 176 | 1 day, 12h:22m:24s | 1011 | 419581633

I'd like to find out more about the event log: which device(s) is being classified as a slow reader, and which device created the timeout notification.
I'm not sure whether the slave clock event could cause this as well; it is the most prominent event in the log. Every event in the log is timestamped around the time they powered the PLC back on, and all the events are within 30 seconds of each other.

As for the Raspberry Pi (RPi): the HMI is in a location the operator is unable to see. The workaround is the Pi using VNC so the operator can see what they need to do with the part in the station. Not having it will create a headache that I would like to avoid; the off-shifters get real angry when we turn things off.

As for now, everything is chugging along great.
 
Someone suggested checking for a duplicate IP address. Did you test this theory?




We are worried that it could be an external device that someone is plugging into the line and then disconnecting after the fault happens.
We have a security camera recording everything that happens at this station.

So far, every IP address assigned to this station is accounted for and functional with no duplications.
 
So, you might re-crimp the Ethernet ends, and test the cable if you have a tester.

If you can, get a known-good Ethernet cable, run it temporarily, and see if the problem comes back.

Check to see if the spanning tree of the switch is set up correctly. We had a switch where, every time a device was plugged in or unplugged, the switch would redo the tree and everything would disconnect from the switch very briefly. (I might have the wrong terminology, it's been a while, but I know it was the spanning tree.)

Finally, you might check that you don't have a weird loop in the network. I had that, and it ****ed me off when I found it. During production I was able to reconfigure the two switches and remove the loop and any future potential loops.
 
Maybe you could set up Wireshark (perhaps on one of the HMIs connected to that network, or another PC such as a laptop) with the right filter (arp) and leave it running. Normally ARP (Address Resolution Protocol) packets are only sent when a device's LAN interface link goes up, and then periodically, depending on the device, so the amount of data generated is not huge. Because ARP requests are broadcast packets, you do not even need a special port-mirroring tap; any switch will forward them to the Wireshark machine. If someone is indeed connecting a PC with a conflicting IP address, you will get the MAC address of that PC/device and the timestamp of the moment it was connected.
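
To illustrate what to look for in such a capture, here is a minimal Python sketch that pulls the sender MAC and sender IP out of a raw Ethernet/IPv4 ARP payload and flags any IP address claimed by more than one MAC. It is only a sketch: the function names are made up for this example, and how the ARP payloads get from the capture into the script is left open.

```python
import socket
import struct
from collections import defaultdict

# Ethernet/IPv4 ARP payload (28 bytes): hardware type, protocol type,
# hardware length, protocol length, opcode, sender MAC, sender IP,
# target MAC, target IP.
ARP_FORMAT = "!HHBBH6s4s6s4s"


def parse_arp(payload):
    """Return (sender_mac, sender_ip) from a raw Ethernet/IPv4 ARP payload."""
    fields = struct.unpack(ARP_FORMAT, payload[:struct.calcsize(ARP_FORMAT)])
    sender_mac, sender_ip = fields[5], fields[6]
    mac = ":".join(f"{byte:02x}" for byte in sender_mac)
    return mac, socket.inet_ntoa(sender_ip)


def find_duplicate_ips(observations):
    """Map each IP claimed by more than one MAC to its sorted list of MACs.

    observations is an iterable of (timestamp, mac, ip) tuples. ARP probes
    use a sender IP of 0.0.0.0, so that address is ignored.
    """
    macs_by_ip = defaultdict(set)
    for _stamp, mac, ip in observations:
        if ip != "0.0.0.0":
            macs_by_ip[ip].add(mac)
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}
```

An empty result from `find_duplicate_ips` over a capture that spans a fault would be reasonable evidence against the duplicate-IP theory.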
 
We are starting to believe that the issue is related to the secondary equipment that does the functionality tests.
This PLC has 2 Ethernet ports on it.
Everything is run through port 1, both the local side and the test equipment side.
If I were to move the test equipment to port 2 and leave the local PLC side on port 1, would this issue still drop both ports, or just the one it's connected to?
 
From memory the PLC doesn't have 2 ports but 1 port that has 2 connections. (Meaning it has 1 IP address but 2 ports so you can daisy chain or DLR.)
 
Update,

While hardwired directly to the PLC via port 2, I was able to catch the IO starting to drop. Networked IO blocks began dropping at random, and the RTA dropped as well. These drops were only for a second or two, and then everything would come right back.

While that was happening, the Parts Per Hour (pph) board was angry, stating it could not reach the PLC.

The PPH board runs AdvancedHMI to display the PPH.
Whenever the PPH board dropped, a random network device would drop from the PLC IO tree.
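
If both kinds of drop were logged with timestamps, this pattern could be checked rather than eyeballed. The Python sketch below is a hypothetical helper, not something that was run on this station: it takes two lists of event times in seconds and returns the events in the first list that have a partner in the second list within a chosen window.

```python
from bisect import bisect_left


def matched_events(primary, secondary, window=2.0):
    """Return primary timestamps with a secondary timestamp within window seconds."""
    secondary = sorted(secondary)
    matched = []
    for stamp in primary:
        # First secondary event at or after the start of the window.
        i = bisect_left(secondary, stamp - window)
        if i < len(secondary) and secondary[i] <= stamp + window:
            matched.append(stamp)
    return matched
```

A high fraction of matched PPH drops would support the link between the two, while a low fraction would suggest coincidence.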

I went over to the PPH PC and pulled the ethernet cable and shut it down.
Network stabilized, IO stopped dropping, and production continued.

As we reflected on what changed with this PC, we noticed that the first instance of this error happened the first day IT put some new antivirus on the system.
This antivirus hates AdvancedHMI.

I really doubt AdvancedHMI is the issue, as it was running on this computer for months with no issues.
All my other AdvancedHMI programs run fine, but they are not on PCs that IT has access to.

For now, I have this PC disconnected and will run without it until after the Thanksgiving holiday.

As I have had this issue almost every night this week, if we stop having problems, I will look into building another PC that IT doesn't have access to.


Thanks for your support, I have learned more than I wanted to about network troubleshooting.
🍻
Gad
 
