Intermittent ControlLogix communication issues

DarrenG · Apr 28, 2017

Hi folks,

I'll try to keep this as short as possible. I don't expect anyone to troubleshoot the whole issue on here, as there are too many things it could be. I've only given an overview of the ongoing issues so that, if something in the screenshots below jumps out at someone and the story behind it backs that up, I'll be happy to have narrowed down the problem. All I really want is for the screenshots to be viewed and any potential issues flagged. I only recently graduated, so network issues aren't my strong point.

(BTW we are using InTouch 10.1 with AB 1756 ControlLogix PLCs)

So, we recently commissioned a second line in one of our plants and during commissioning we had comms issues. After updating the SCADA, for example, it would take anywhere from a few seconds to a few hours for the data (temperatures, pressures etc.) to start populating. So we added another subnet to reduce the load on the original, and that seemed to work OK at the time.

Since then, though, if we need to update the SCADA it may or may not communicate immediately afterwards; the time still varies. I watched the log file in System Management Console the last time this happened, and errors of either "invalid item 'tagname' not defined in processor", "unknown object name" or "Device connection attempt to xxx timed out, socket closed" kept popping up. Under the configuration tree where you can view the tag list, the tags are usually green if everything is OK, red if there are issues, or not shown at all if you're not connected. This time the tags would show red for a few seconds, disappear altogether, come back red after a few seconds, and repeat.

After a few hours of monitoring, shutting down, reopening or reactivating the server in various combinations, it became clear something needed to be changed, so I tried increasing the max CIP connections and it worked. No idea why, as it was working fine before the update (typical!), but it did. The time before that it happened on a different station, so it doesn't seem to be one station in particular, and I did not need to increase the CIP connections on either of the other two for them to work, which leads me to believe it couldn't have been the update either.
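For anyone wanting to put numbers on those log messages, here is a minimal Python sketch that tallies the three error types quoted above. It assumes the SMC log has been exported or copied to plain text with one message per line, which may not match how a given install exports it. The category names and the `classify` and `tally` helpers are made up for illustration.

```python
import re
from collections import Counter

# Patterns built from the three messages quoted in the post.
# The category names on the left are hypothetical labels, not SMC terms.
PATTERNS = {
    "invalid_item": re.compile(r"invalid item .* not defined in processor", re.I),
    "unknown_object": re.compile(r"unknown object name", re.I),
    "connect_timeout": re.compile(r"connection attempt to .* timed out", re.I),
}


def classify(line):
    """Return the category for one log line, or None if nothing matches."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def tally(lines):
    """Count how many lines fall into each category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts


# Example input: the messages as quoted, plus one unrelated line.
sample = [
    "invalid item 'tagname' not defined in processor",
    "unknown object name",
    "Device connection attempt to xxx timed out, socket closed",
    "Device connection attempt to xxx timed out, socket closed",
    "some unrelated message",
]
print(tally(sample))
```

If the timeouts turn out to dominate, that might point more towards the connection itself than towards tag lookups, but that is only a guess to be checked against the real log.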

Anyway, today I was having a look at the PLC's web server and noticed a few things that may point us in the right direction. I just don't know heaps about it, so if there is anything glaringly obvious to someone more in the know, I can at least pass this information on and maybe we can get someone in to have a look.

I took 4 screenshots of potential problems (from the PLC on the original line; the newer line has less I/O, so I'm going to start here) and highlighted what stood out to me.

Thanks in advance for your help

PS. It's taken me so long to write this concisely that the day is almost over, so apologies for any delay in replying over the weekend/bank holiday!

DarrenG · Apr 28, 2017

OK these should be a bit better quality... sorry!

Mispeld · Apr 28, 2017

Are there communication links other than the SCADA and HMI that could be interfering, with the most obvious impact showing up as poor SCADA performance? Specifically, are there produced/consumed connections at extremely short Requested Packet Intervals (RPI) that may be affecting network capacity and/or processor communications overhead? This is the kind of thing that could happen during an upgrade where an RPI for a new device or consumed tag(s) on a heavily utilized network is in the single-digit millisecond range. Just a guess.

DarrenG · May 2, 2017

Thanks for your help Mispeld. The vast majority of the modules are configured to 80ms but there are a few as low as 20ms. I monitored each of the tasks and a few of them have real-time scan times as low as 6us, though that could be down to the fact that there's very little in their routines. One of the tasks was as high as 8607us, any ideas on how we could reduce that? Split it in two maybe? Or would it not make any difference as the two separated tasks would still be requesting the same amount of data? Apologies for all the silly questions. I'll pass this on to my boss anyway and see what he thinks about increasing the RPI, it's as good a place to start as any.

As a matter of interest, what would be considered a relatively normal RPI for non-critical values, let's say temperatures on cooling tower lines or something? Would 200ms be OK or could we go higher again? Also, would staggering the RPIs make a big difference? As in, probably 85% of the modules are set to 80ms at the moment, so instead of almost everything requesting information every 80ms, if I alternated them at 120, 130, 140 etc. or something like that, would that be better?

JohnCalderwood · May 2, 2017

Darren,

You say you are using 1756 PLCs - was a new one added to the system as part of the new line?

What model numbers and firmware revisions are you using?

Was the SCADA upgraded too, or was it always V10.1?

The only reason I ask is that we tried to upgrade an InTouch V8 to 2014R2, whilst still using the 1756-L55 PLC which was version 16, and we had some issues between SCADA and PLC. We had to admit defeat and upgrade to a 1756-L71 at version 21.

Mispeld · May 2, 2017

DarrenG said:
As a matter of interest, what would be considered a relatively normal RPI for non-critical values, let's say temperatures on cooling tower lines or something? Would 200ms be OK or could we go higher again? Also, would staggering the RPIs make a big difference? As in, probably 85% of the modules are set to 80ms at the moment, so instead of almost everything requesting information every 80ms, if I alternated them at 120, 130, 140 etc. or something like that, would that be better?

In my experience these RPI values are typical and have not caused network loading issues. I would have been more concerned with an RPI of, say, 2 ms, which one time happened at my site when a device came with that interval as the default setup. Without knowing the number of devices on your network, it is difficult to quote a specific number. I would not expect alternating or staggering the RPIs to have a significant impact.
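To put rough numbers on that, here is a back-of-envelope Python sketch. It assumes the common rule of thumb that each I/O connection sends about one packet in each direction per RPI, and it uses a made-up module count (5 modules at 20 ms and 36 slower ones), since the real count isn't given in this thread. It only estimates the average packet rate and says nothing about bursts or switch behaviour.

```python
def packets_per_second(rpis_ms):
    """Estimate total packets/s for a list of connection RPIs in milliseconds.

    Assumes one packet each way per RPI per connection (rule of thumb).
    """
    return sum(2 * 1000.0 / rpi for rpi in rpis_ms)


# Hypothetical module mix: 5 fast modules plus 36 slower ones.
fast = [20] * 5
scenarios = {
    "slow modules at 80 ms": fast + [80] * 36,
    "slow modules at 130 ms": fast + [130] * 36,
    "slow modules staggered 120/130/140 ms": fast + [120, 130, 140] * 12,
    "slow modules at 200 ms": fast + [200] * 36,
}

for label, rpis in scenarios.items():
    print(f"{label}: {packets_per_second(rpis):.0f} packets/s")
```

Under these assumptions the staggered case comes out almost the same as setting everything to 130 ms, so any saving would come from the longer interval rather than from the staggering itself.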

On a related item, maybe someone with more knowledge of the subtleties of the "System Overhead Time Slice" (Controller Properties, Advanced tab) can comment on whether changing it might help correct the observed behavior. It will also depend somewhat on whether your program is structured as a continuous task, periodic tasks, or both.

DarrenG · May 2, 2017

John, yes there was a new one added for the second line. On both lines they are 1756-L72s using V19.01.00 (CPR 9 SR 3) of RSLogix 5000, and InTouch stayed the same; it was 10.1 originally. What sort of issues were you having?

Oh right OK, the lowest RPI value was 20ms which was on about 5 modules. I'll have a look at the original line in a bit and see if there's anything lower than that but the problem has only reared its ugly head since the new line so that's why I'm focusing there.

GTUnit · May 2, 2017

Check to see that you are not maxing out the PLC memory. I have seen that cause comm issues as well.

DarrenG · May 3, 2017

How do I check that?

DarrenG · May 3, 2017

OK so I found it, should have looked a bit harder before replying! There's plenty of memory left on both lines

The CPU usage is pretty low too, one's 50% and the other's 30%, and the RPIs on the original line are mostly at 100ms too.

I wonder if there is interference being caused when we have a particular part of the plant running. If none of those figures I originally posted seem to be an issue, the RPIs etc. all seem fine, and the problem only occurs occasionally, then maybe it's a power cable run too close to a PLC cable or something like that. It's only a guess, but when it happens again I'll take note of what's running and what's not.

But then again would that cause all the tags in SMC to appear/disappear over and over again, and the other stations to work fine.... maybe not.
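One way to line the dropouts up against what's running might be a small connectivity logger left running on one of the stations. This is only a sketch: it tries a plain TCP connection to the PLC's Ethernet module on port 44818 (the standard EtherNet/IP TCP port) and prints a timestamped OK/FAIL line. The `probe` and `watch` names and the default interval are made up, and it only tests whether the port is reachable, not whether the data server itself is happy.

```python
import socket
import time
from datetime import datetime

ENIP_PORT = 44818  # standard EtherNet/IP TCP port


def probe(host, port=ENIP_PORT, timeout=2.0):
    """Try one TCP connection; return (ok, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:  # covers refused connections and timeouts
        ok = False
    return ok, time.monotonic() - start


def watch(host, interval=5.0, cycles=None):
    """Print one timestamped result per probe; cycles=None runs until stopped."""
    done = 0
    while cycles is None or done < cycles:
        ok, elapsed = probe(host)
        stamp = datetime.now().isoformat(timespec="seconds")
        status = "OK" if ok else "FAIL"
        print(f"{stamp} {host} {status} {elapsed * 1000:.0f} ms")
        done += 1
        time.sleep(interval)
```

It could be started with something like `watch("192.168.1.10")`, where the address is a placeholder for the real module. Each probe briefly opens a TCP connection to the module, so the interval is probably best kept at a few seconds or more.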

OkiePC · May 3, 2017

Do you have EtherNet/IP I/O communications sharing the same network?

DarrenG · May 3, 2017

Our VSD motors are on Ethernet and the IO panels are connected via Ethernet, but all the instruments are 4-20mA. So yes, the motors are on the same network as the rest of the IO. We have two subnets, one for each line: everything for line 1, including the common items, is on the original subnet, and everything exclusively for line 2 is on the other.
