ControlLogix Time Slices

PLC_Time · Dec 2, 2009

I have an application with two ControlLogix controllers in the same rack. One is set up as just a continuous loop. The other has 3 timed intervals (5ms, 15ms, 50ms) and a continuous loop. There are 28 axes between the two processors, and I know CPU load goes up and scan times slow down dramatically during production. The integrator that set up the system has things chugging along so that CPU utilization is around 95%+ all the time according to the monitor system. I have seen it at 96%, and it turns red in the system monitor. We also sometimes get watchdog errors (it appears the watchdog is not set too tight).

Previously all my coding had been done with continuous loop, thus the time slice concepts are new to me. I have attached a screen shot of things.

Having some difficulty tracking some of this down online and in the help menus. We are having some issues with very slow response times on some AB PanelViews (over Ethernet), with 3-5 second delays, and sometimes intermittent position problems. We think it may be due to the PLC missing some things because of the way the logic is scanning. The integrator has tried everything they seem to know to do.

To sum it up I have three questions for now:

1. Any clues to know if this is a time slice issue?
2. What is the overlap column telling me in the screenshot?
3. Any tips on how to learn more about how to optimize things?

THANKS!

mellis · Dec 2, 2009

The overlap column tells you how many times this particular task did not have enough time to complete before it was ordered to start again. Anything besides zero here is BAD.

Your 50ms task is the lowest priority of the timed tasks and it has been interrupted enough by the higher priority tasks that it did not complete 20 times. That max scan time of almost 70ms on a 50ms task is bad news too. The last scan was only 2.8ms, so something is going on there that makes it take a LOT more time occasionally.

Your 15ms task is getting interrupted by the 5ms task so much that it failed to complete 223 times.

Even the 5ms task which shouldn't be interrupted by anything failed to complete 2 times.

Nothing good will come from code that doesn't execute reliably.

Another way to look at it:

The highest priority task is running at 5ms. Say it takes 1ms to run, that means it consumes 20% of the processor time all by itself.

The next highest task gets to run in the remaining 80%. It is running at 15ms. Say it takes 7.5ms to run, that means it consumes 50% of the remaining processor time, leaving 40% for everything else.

The lowest timed task runs in that 40%. It is running at 50ms. Say it takes 5ms to run, that means it consumes 10% of the remaining processor time, leaving 36% for the continuous task and overhead.

Let's say the main continuous task has a 100ms scan time. But since there is only 36% of the processor left, the best you can expect for the time between one complete execution and the next is at least 3 times that. I also see that your system overhead time slice is set to 33%, which means that of the remaining 36%, one third of the time is reserved for overhead, leaving 24% of the processor time to run the continuous task. So that 100ms continuous task is taking at least 400ms to complete.

The numbers I picked are just to illustrate the concept and because they were easy to do in my head.
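
The budget above can be written out as a short calculation. This is a minimal Python sketch of the simplified model in this post only, not of how the Logix scheduler really accounts for time; the function name and the (period, execution time) pairs are made up for illustration.

```python
def continuous_task_share(periodic_tasks, overhead_slice):
    """Fraction of CPU left for the continuous task under the simplified model.

    periodic_tasks: list of (period_ms, exec_ms), highest priority first.
    overhead_slice: system overhead time slice as a fraction (1/3 for ~33%).
    """
    remaining = 1.0
    for period_ms, exec_ms in periodic_tasks:
        # Each task takes exec/period of whatever the higher-priority tasks left over.
        remaining *= 1.0 - exec_ms / period_ms
    # The overhead time slice is reserved out of what remains.
    return remaining * (1.0 - overhead_slice)


# Illustrative numbers from the post: 1ms in 5ms, 7.5ms in 15ms, 5ms in 50ms.
tasks = [(5.0, 1.0), (15.0, 7.5), (50.0, 5.0)]
share = continuous_task_share(tasks, overhead_slice=1.0 / 3.0)
effective_scan_ms = 100.0 / share  # a 100ms continuous task stretched by its share

print(f"share for continuous task: {share:.0%}")
print(f"effective continuous scan: {effective_scan_ms:.0f} ms")
```

With these example numbers the share comes out at about 24% and the stretched scan at roughly 417ms, which matches the "at least 400ms" figure.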

In the scenario I describe, this system still functions because we never overlap. Your system definitely overlaps. Bottom line is your processor doesn't have enough horsepower to execute the code it has as fast as it's been scheduled to run. When your timed tasks overlap, there may be very little or no time left for the continuous and overhead tasks to run. This can produce long delays in response on HMIs.

How to fix it? Very difficult to say without seeing the system.

thiem · Dec 2, 2009

See AB publication 1756-PM001G-EN-P. The one I have is a little old, from March 2004; you might find a newer one on the AB site. The manual is named Logix5000 Controllers Common Procedures. Chapter 4 talks about managing multiple tasks.

I can answer your question 2. Overlaps tell you the number of times that a task was called to execute but it had not finished the previous execution. Therefore it DOES NOT execute for that call, it is skipped. This might not be good depending on what is programmed in that task.
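
To make the skip behavior concrete, here is a rough Python sketch that counts overlaps for a single periodic task. It is only an illustration of the rule described above, not the controller's real scheduler; the function name is invented, and the 2.8ms and 70ms durations are borrowed from the 50ms task discussed earlier in the thread.

```python
def count_overlaps(period_ms, durations_ms):
    """Count skipped triggers for one periodic task.

    durations_ms: elapsed time of each execution that actually starts,
    including any time spent interrupted by higher-priority work.
    """
    overlaps = 0
    busy_until = 0.0   # time at which the current execution finishes
    trigger = 0.0      # time of the next scheduled trigger
    pending = list(durations_ms)
    while pending:
        if trigger < busy_until:
            # Previous execution still running: this trigger is skipped.
            overlaps += 1
        else:
            busy_until = trigger + pending.pop(0)
        trigger += period_ms
    return overlaps


# A 50ms task that usually takes 2.8ms but once takes 70ms.
overlaps = count_overlaps(50.0, [2.8, 70.0, 2.8])
print(overlaps)
```

The 70ms execution starts at t=50ms and is still running at the t=100ms trigger, so that trigger is skipped and counted as one overlap.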

Your screen shot does not show too long of an execution for the 5 ms task, but two overlaps. I would guess either the max scan was reset and not the overlap count, or your motion planner was updating and that is why you skipped your 5 ms task??

There is more to your CPU load than just task scan. The motion planner supersedes all task priorities. I remember a couple of years ago we had issues with serial port usage being too high a processor priority, causing motion planner problems. AB said they were going to resolve this (I believe this was around v16 release time). Can anyone else confirm whether the serial port is now in fact a lower priority??

Comm is updated during unused time of the CPU, so during high CPU load it will be hard to update things like an HMI on Ethernet (and MSG's) if there is no overhead to do so.

Try posting back a pic of the task tool showing actual CPU load instead of the zeros by the tasks. I believe the next tab will show CPU usage for the motion planner, serial, comms, etc. That might help get some other responses.

If you need clarification on task management, definitely search the AB site for some help. Chapter 4 in the reference might be a good place to start.

PLC_Time · Dec 2, 2009

Thanks for the feedback so quickly!!!! Right now the machine is down and it's in an aseptic cleaning process, so maybe late tomorrow morning I can get some screen shots with it running.

Nice to know the servos get priority; this probably explains the long lags with the touchscreens. According to the vendor, other "bad" things happen if I just try to increase the scan times, especially on the one that hits every 5ms. I do not know why the monitor tool is not showing CPU load on the modules but shows such high CPU usage even with nothing running. In the interim I am sending the monitor screen shots at idle and will send them again when it's running.

Alan Case · Dec 2, 2009

From a very quick read of what is happening, it seems like the vendor/machine designer has built this thing right on or over the limit of the processor's capabilities.
As a matter of interest, can you send us a screen shot of the I/O layout (i.e., what cards are installed)?
It sounds scary when your supplier says that bad things happen if the task operates slower than every 5 milliseconds.
Maybe some of the axis control needs to be passed off to dedicated controllers.
Hope you keep us informed, as this sounds like an interesting problem.

Regards
Alan Case

jstolaruk · Dec 3, 2009

I wouldn't be surprised to find some poorly written code in those processor tasks; for-next, gotos, etc. Can you post the code? Remove all of the annotation if that is a concern, any poor program practices will stick out anyways.

Brownhat · Dec 3, 2009

I see you are at firmware level 13. Would updating the firmware to 16 or 17 improve the instruction execution time, and therefore reduce the processor utilization?

Oakley · Dec 3, 2009

Does this system have redundant controllers?

thiem · Dec 3, 2009

Just a step back here, but as was already stated, it looks like you're on v13? Is this a new system or has it been there for a couple of years already? If it has been running for a while, are these problems new? V13 is not a very recent release of the firmware/software.

Also, if you are making changes to this system which sounds rather 'involved' and you are unsure of something, make sure you talk to the equipment provider for clarification.

Read the release notes for the newer revisions of Logix (v16, v17). See if there are any changes/fixes that will help your situation out. If you do decide to upgrade and you are on v13, your motion commands will change. Specifically, I remember they added jerk as a required parameter for a lot of blocks, which you will have to address.

AB technote 28660 lays out the CPU priorities. Serial Port is highest, even higher than the motion planner. So from your screen shot stating 14 percent usage, that is not helping you out at all.

Technote 42964 gives a brief explanation of System Overhead Time Slice and how it works when you have a continuous task in the system.

Technote 39085 gives a brief explanation of the Coarse Update Rate for the Motion Planner. Again, from the screen shot provided, the 70 msec execution of the motion planner seems long to me.

It looked like you had an L63 with plenty of overhead memory left, so you should be good there. However, if you are using a lot of MSG's for communications to other devices and the connections are not cached, then the system needs to use the unconnected buffers to open connections all the time, and this takes overhead as well. Chapter 10 in my old common procedures manual talks about MSG communications.

Tread carefully. Again, this sounds like an involved system. There are a lot of things to look at: managing multiple tasks, motion planner, communications servicing, I/O updating, etc.

rdrast · Dec 7, 2009

Late to the party, but it is generally a very bad idea to have so many fast, periodic tasks. Much better would be to have ONLY the 5ms one, and maintain a counter (simple 'Add' instruction works fine) for the sub-tasks, and jump to them conditionally.

Ex: in the 5ms task, every third pass, go to the '15ms routines'. Every tenth pass, go to the '50ms routines'.

That is generally much more efficient than multiple fast tasks.
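
The counter idea can be sketched in a few lines. The real thing would be ladder or structured text inside the 5ms task; the Python below is only a stand-in to show the pass counting, and the class name, the call tallies, and the wrap at 30 passes are assumptions added for illustration.

```python
class FastTask:
    """Single 5ms periodic task that calls slower routines on selected passes."""

    def __init__(self):
        self.pass_count = 0
        # Tally of how often each group of routines has run (for illustration).
        self.calls = {"5ms": 0, "15ms": 0, "50ms": 0}

    def scan(self):
        """Body of one 5ms execution."""
        self.pass_count += 1
        self.calls["5ms"] += 1          # fast logic runs every pass
        if self.pass_count % 3 == 0:    # every third pass: the '15ms routines'
            self.calls["15ms"] += 1
        if self.pass_count % 10 == 0:   # every tenth pass: the '50ms routines'
            self.calls["50ms"] += 1
        if self.pass_count >= 30:       # 30 = lcm(3, 10); wrap so the counter stays small
            self.pass_count = 0


task = FastTask()
for _ in range(60):   # 60 passes = 300ms of simulated time
    task.scan()
print(task.calls)
```

One thing to watch with this layout: on every 30th pass both slower groups run in the same 5ms execution, so the worst-case pass still has to finish inside 5ms. Offsetting the two checks so they never land on the same pass is one possible way around that.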

Also, it may be that someone is attempting too much in the fast tasks. Periodic tasks should be "Get in, do the minimum required work, get out". They generally shouldn't be doing a whole lot of processing, or any logic that can be handled in the background by the cyclic task.

PLC_Time · Dec 9, 2009

Sorry to be so slow to post back, all. Thanks for all the ideas!!! Multiple issues everywhere, it seems. This application is in a pharma environment awaiting studies/clinicals to finish, so even though it's new it has done a lot of sitting over the last year, thus the older firmware rev 13. We have discussed upgrading with the vendor, but they are uncomfortable with it. For now it's still under warranty, so my hands are somewhat tied on upgrading.

Just found today that the overlaps only occur when the amplifiers are going out of run mode or a fault occurs on one of them. The scan times are actually only about half of the scheduled period under normal running. So it appears that under normal operation it does not overlap. Still don't like it, but it appears to be "okay".

We also found that our AB touchscreens were running an older development version that caused memory leaks by never closing a screen; it would always just open more and more screens. Compiled another version and downloaded it. Am monitoring RAM usage.

Still have a question though. According to the vendor, there is nothing we can do about the CPU utilization. They claim that if I slow down the timed modules it will just make the continuous task run faster, thus the overall CPU utilization would be the same. Anyone know if this is true?

Also found the Coarse servo movement time is set to 10ms, and AB recommends 1ms per axis (we have 28). Thus I am going to look into tweaking this as well. But that also routes back to my question: if I slow this down, will the extra time be given back to the continuous loop, so that the overall CPU usage will still be 96-97%?

Assuming I can get a window of downtime, I plan to just try the coarse setting and the above to see what happens. Didn't know if others have been down the same road?

And yes, this machine runs at 40 parts a minute, thus it runs fairly quick (actually 400 a minute, as it fills 10 every cycle). So I am a little hesitant to just start tweaking.

jstolaruk · Dec 9, 2009

As you mentioned, it's still under warranty, so for now you can only lean on the OEM to clean it up while they still have responsibility. Wasn't any of this left as an open issue during the machine run-off / acceptance phase?

PLC_Time · Dec 9, 2009

Should have been. This is getting into the gray area: qualifications were completed and approved, so technically we said the machine was okay, but now we have unveiled some issues while running. Pushbacks both ways...

Am working with the OEM, but they are wanting to be done, and I am concerned about intermittent problems later. I will keep working with them and AB over this. Hope to have the time slice tests later today. I will report back.

mellis · Dec 9, 2009

PLC_Time said:
Sorry to be so slow to post back, all. Just found today that the overlaps only occur when the amplifiers are going out of run mode or a fault occurs on one of them. The scan times are actually only about half of the scheduled period under normal running. So it appears that under normal operation it does not overlap. Still don't like it, but it appears to be "okay".

Seems to me these conditions aren't being handled properly if they cause such a spike in scan time. But it's not my system. Maybe you have to live with it, I'd want to really understand why I had to live with it if it was mine.

PLC_Time said:
Still have a question though. According to the vendor, there is nothing we can do about the CPU utilization. They claim that if I slow down the timed modules it will just make the continuous task run faster, thus the overall CPU utilization would be the same. Anyone know if this is true?

Yes, that is the case. The continuous task runs as fast as it can. If you aren't having overlaps, and the continuous task is getting done in a reasonable time, you're good. Remember, the continuous task shares time with overhead and comms. So having the continuous task executing several times a second is not a bad thing.

rdrast · Dec 10, 2009

Rubbish. There are absolutely things that can be done about scan issues, especially as you are describing them now. Also, you aren't actually looking to reduce CPU loading, but prevent scan overruns on the periodic tasks.

Overruns caused by axis shutdowns seem odd. Actually, I'm not sure why you would even be trying to handle the shutdowns in periodic tasks (especially at those rates), which sounds like what you are doing.

As far as staying at Rev. 13 uhhh... If this is falling under the standard Pharma certification, get the processor firmware upgraded before you certify! Get to at least Rev 16.x, if not 17.x.

If worst comes to worst, you might have to have your OEM add another processor, but I don't see the need unless they are doing very silly things in the logic (and it sounds like they are).
