1756 Modules Faulting Randomly

lostcontrol

This installation has been having issues with IO faults occurring and then clearing straight away. The devices are simple valves, 1x OP & En/DeEn feedback, with standard fault logic.

The installation consists of 2x 1756 racks, connected via ENETs. The issues are occurring at the remote rack.

When on site, I noticed on a couple of occasions that one of the output modules would fault (outputs switch off, LED flashes red) and then come right again.
I am not sure this is the exact cause of these issues, but I want to target it first, as it seems likely to be related: if the CPU commands a valve to operate and the output module switches itself off, the valve is not at its commanded state for the fault time, so it faults.

The other confusing thing is that the SCADA is logging the fault as occurring and clearing with exactly the same timestamp. This could be possible, as the IOServer is configured for 1 sec and the alarm log only displays to the second.
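
To illustrate why identical timestamps are plausible, here is a small Python sketch. The times and the 700 ms fault length are made up for illustration, and `log_stamp` is a hypothetical helper standing in for an alarm log that only shows whole seconds.

```python
from datetime import datetime, timedelta


def log_stamp(t: datetime) -> str:
    """Format a time the way a log with one-second resolution would show it."""
    return t.strftime("%H:%M:%S")


if __name__ == "__main__":
    # Made-up example: fault raised at 12:00:05.120 and cleared 700 ms later.
    raised = datetime(2024, 1, 1, 12, 0, 5, 120_000)
    cleared = raised + timedelta(milliseconds=700)
    print(log_stamp(raised), log_stamp(cleared))
```

Any fault that starts and ends inside the same clock second would show up this way, so the matching timestamps only say the fault was shorter than about a second.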

What I would like to know is what can cause a 1756 IO module to fault randomly like this. All of the modules on this remote rack have their RPIs set to 35 ms.


I have put some code in to monitor the module fault status, so hopefully, if it occurs again I will be able to see what the codes & info are.
We have had issues with the ETN comms in the past, but that has been reasonably solid for quite some time. There are plans to change this to ControlNet in the near future...
 
Noise, inductive loads without surge protection?

On the Rack supply side, possible I guess, there is no UPS on the remote rack supply, or even the main rack.
Is odd though that it is a particular module that I witnessed on a couple of occasions. I did notice 3 of the modules in this rack at one stage also...
 
I had an unusual incident somewhat like this.

Turned out that when the panel builder had tightened the rack down to the backplane, it was not square.

During commissioning we had errors somewhat the same. Fix? Loosen the rack so that it was no longer in tension.
 
Make sure that your fault trap logic includes a way to capture the duration of the connection failure; that might be an important clue.
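
As a rough idea of what such a trap does, here is a minimal Python sketch of the logic only, not controller code. It assumes something is called once per scan with the current fault status and code (for example, values read by a GSV in the real system); `FaultTrap`, `FaultEvent`, and `scan` are hypothetical names, and the code value in the example is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FaultEvent:
    start_s: float                      # time the fault was first seen
    code: int                           # fault code captured at onset
    duration_s: Optional[float] = None  # filled in when the fault clears


@dataclass
class FaultTrap:
    events: List[FaultEvent] = field(default_factory=list)
    _active: Optional[FaultEvent] = None

    def scan(self, now_s: float, faulted: bool, code: int = 0) -> None:
        """Call once per scan with the current fault status and code."""
        if faulted and self._active is None:
            # Rising edge: latch the onset time and the code.
            self._active = FaultEvent(start_s=now_s, code=code)
            self.events.append(self._active)
        elif not faulted and self._active is not None:
            # Falling edge: record how long the fault lasted.
            self._active.duration_s = now_s - self._active.start_s
            self._active = None


if __name__ == "__main__":
    trap = FaultTrap()
    # (time in seconds, faulted?, code) - placeholder values for illustration
    samples = [(0.0, False, 0), (0.1, True, 0x0203),
               (0.2, True, 0x0203), (7.3, False, 0)]
    for t, faulted, code in samples:
        trap.scan(t, faulted, code)
    for ev in trap.events:
        print(f"code=0x{ev.code:04X} start={ev.start_s}s duration={ev.duration_s:.1f}s")
```

The code is latched at the moment the fault appears because it may read as zero again once the connection recovers, and the history is kept in a list so that several short events are not overwritten by the latest one.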

An EtherNet/IP connection failure happens fast; the connection times out if data is not received for 4x the RPI, with a minimum of 100 milliseconds. But re-connection usually takes between 6 and 10 seconds, for the scanner to politely wait to re-establish TCP and/or CIP connections.

A connection failure across the backplane will happen just as fast but it might re-establish faster.
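
For a quick feel for the numbers, here is a minimal Python sketch of the rule of thumb as stated above (4x the RPI, but never less than 100 ms). The function name is made up for illustration, and the multiplier a given controller actually applies may differ, so treat the result as an estimate.

```python
def connection_timeout_ms(rpi_ms: float) -> float:
    """Estimate the connection timeout from the RPI.

    Assumption: uses the rule of thumb quoted in this thread
    (4 x RPI, with a floor of 100 ms).
    """
    return max(4 * rpi_ms, 100.0)


if __name__ == "__main__":
    for rpi in (10, 20, 35, 50):
        print(f"RPI {rpi} ms -> timeout ~{connection_timeout_ms(rpi):.0f} ms")
```

With the 35 ms RPI used on this remote rack, that works out to roughly 140 ms without data before the connection is declared failed.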

If this was my system I would do a full sweep for the "Atmel64" backplane bridge chips that caused headaches in 2007, put in fault duration and code trap logic, and start looking at the connection statistics in the 1756-ENBT modules' embedded web pages as well as the Backplane tab on the Module Statistics applet (right-click, select Module Statistics) in RSLinx Classic.

Very interesting stuff about the backplane torsion, Oakley; thanks for sharing that.
 
I had an unusual incident somewhat like this.

Turned out that when the panel builder had tightened the rack down to the backplane, it was not square.

During commissioning we had errors somewhat the same. Fix? Loosen the rack so that it was no longer in tension.
Interesting, I will give this a crack. This install is 8+ years old, so may not be the root cause, but worth a look.

Make sure that your fault trap logic includes a way to capture the duration of the connection failure; that might be an important clue.
What would be the best way to do this, using EntryStatus, or FaultCode attributes?
An EtherNet/IP connection failure happens fast; the connection times out if data is not received for 4x the RPI, with a minimum of 100 milliseconds. But re-connection usually takes between 6 and 10 seconds, for the scanner to politely wait to re-establish TCP and/or CIP connections.
I have witnessed this type of occurrence on other sites, in similar circumstances. At the moment though, I think this is module related, but I have not ruled out ETN scanner comms.

A connection failure across the backplane will happen just as fast but it might re-establish faster.

If this was my system I would do a full sweep for the "Atmel64" backplane bridge chips that caused headaches in 2007, put in fault duration and code trap logic, and start looking at the connection statistics in the 1756-ENBT modules' embedded web pages as well as the Backplane tab on the Module Statistics applet (right-click, select Module Statistics) in RSLinx Classic.
This site is 8+ years old, so it may have escaped the 2007 headaches.
At the moment, I have put in a GSV for FaultCode on the module in question. Should I be looking at EntryStatus as well?
Unfortunately, these racks are linked by ENETs, so are reasonably old.
 
Power Supply

Just something that I have encountered in the past. Make sure that you are not overloading your power supply or that it is providing good power. Low voltage can cause this randomly depending on the load being accessed.
 
Just something that I have encountered in the past. Make sure that you are not overloading your power supply or that it is providing good power. Low voltage can cause this randomly depending on the load being accessed.
Do you mean the IO or Rack Power Supply? I am working through the IO supply, but am not sure how to test the Rack Supply??
 
Do you mean the IO or Rack Power Supply? I am working through the IO supply, but am not sure how to test the Rack Supply??

I am pretty sure he means the Rack Supply. If it is overloaded, it can cause all kinds of weird errors. I don't have a system available to me right now, so I can't make any suggestions about testing. Perhaps someone else can jump in with a suggestion.

Stu.....
 
all the above

The IO rack should have a power supply. There will be a terminal on this power supply that you can check. Sometimes the IO will be powered from this rack or it will be powered externally or at times both. The key is to make sure whatever is powering the cards has the proper voltage and can handle all of the loads you may encounter.

This is what creates the randomness (is that a word?) of the problem. There may be two or more devices that intermittently load the power supply down to the point of failing.
 
Also look at the ENET setup. If your remote interface is not set up for the correct rate (10T, 100T, Auto), you can have issues like this. AB recommends using Auto-detect.

Steve
 
