Minor Error Bit

eight_bools · Dec 3, 2021

Good Morning,

We recently installed some logic in one of our SLC 5/05 processors to give us a packs-per-minute readout. I attached a photo of the logic for review. We have been getting the following fault, and the time the faults started lines up with when the changes were made. The faults happen randomly; we have not been able to find any kind of pattern.

"A minor error bit is set at the end of the scan. refer to S:5 minor error bits."

I have been reading that the math overflow bit is often the culprit, and understand that it is common to unlatch the S:5/0 at the end of ladder 2. I am curious though if anyone can help me find where the mistake was made so we don't make it moving forward.

Logic is as follows

If the machine is in auto/run mode, then time the duration between flags. When the prox flags, divide 6000 by the timer accumulated value. Finally, divide the output from that instruction by the number of units in the pack to get packs per minute.

If the machine is not in auto/run, set the packs-per-minute integer to 0.
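For readers without the photo, the described logic can be sketched roughly as follows in Python. The function and parameter names are made up for illustration, and truncating integer division is an assumption; the controller's DIV instruction may round differently. Like the description, the sketch has no check on the accumulated value before dividing.

```python
def packs_per_minute(acc_cs: int, units_per_pack: int, auto_run: bool) -> int:
    """Rough model of the described rungs (hypothetical names).

    acc_cs         -- timer accumulated value between flags, in 0.01 s ticks
    units_per_pack -- number of units in a pack
    auto_run       -- True when the machine is in auto/run mode
    """
    if not auto_run:
        # Not in auto/run: the readout is forced to 0.
        return 0
    # 6000 ticks of 0.01 s make one minute, so this is flags per minute.
    units_per_minute = 6000 // acc_cs
    # Scale by pack size to get packs per minute.
    return units_per_minute // units_per_pack


print(packs_per_minute(250, 6, True))   # 250 ticks between flags, 6 units per pack
print(packs_per_minute(0, 6, False))    # machine stopped
```

The value 6 for the pack size is only an example; the thread does not state the real one.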

Let me know if anyone has any questions, appreciate any responses.

Thanks

geniusintraining · Dec 3, 2021

Just a WAG... but does T4:32 ever have an ACC of 0?

eight_bools · Dec 3, 2021

geniusintraining said:
Just a WAG... but does T4:32 ever have an ACC of 0?

It has a value of zero when not in auto/run mode. I attached another screenshot of this condition. It usually starts up and runs just fine though when transitioning.

When I typed 6000/0 into my calculator I got the error "cannot divide by 0". Will this cause a minor fault in the SLC?

mylespetro · Dec 3, 2021

Yes, I'm pretty sure it will cause a fault when you try to DIV by 0. I don't know what else N7:52 gets used for, but maybe you could have a GRT in front of the first MOV or in front of the DIV so that N7:50 must be greater than 0 in order for them to operate, or you could avoid setting the .ACC to 0 when it's not in auto/run.

EDIT: Changed tag to N7:50 from N7:52

plvlce · Dec 3, 2021

Attempting to divide by zero will generate a minor fault, yes.

EDIT: A value of 0 in either T4:32.ACC or N7:51 will result in the error. Since you indicate that you are always using 6000 as your starting divisor, N7:51 should not be the culprit.

Do you reset the timer anywhere? If so, then if your case counter oneshot is true the same scan* that the timer is reset, then the timer would be zero when the math is run.

It would also be zero if the timer is not running when the oneshot is true.

*in the case the reset is later than these rungs in the program, it would be the following scan instead of the same scan.

geniusintraining · Dec 3, 2021

What I have done in the past... if :32.ACC = 0 then MOVe a 1 into that register

drbitboy · Dec 3, 2021

Simple fix to eliminate the divide-by-0 fault would replace

MOV T4:32.ACC N7:50

with

ADD T4:32.ACC 1 N7:50

It would introduce a slight error in the result, but it is better than a minor fault becoming a major fault and halting the PLC; also that error is small compared to the error introduced by the roundoff.
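To put rough numbers on that comparison, here is a small Python sketch. It assumes the final integer result can be off by up to 0.5 from the exact value (round to nearest), and it uses a pack size of 6 purely as an example; neither is confirmed in the thread.

```python
def plus_one_error(acc: int) -> float:
    """Relative change in 6000/ACC caused by dividing by ACC + 1 instead."""
    exact = 6000 / acc
    shifted = 6000 / (acc + 1)
    return (exact - shifted) / exact  # simplifies to 1 / (acc + 1)


def roundoff_error(acc: int, units_per_pack: int) -> float:
    """Worst-case relative roundoff if the final integer is off by up to 0.5."""
    exact_rate = 6000 / acc / units_per_pack
    return 0.5 / exact_rate


# Compare the two error sources at a couple of example accumulator values.
for acc in (50, 250):
    print(acc, round(plus_one_error(acc), 4), round(roundoff_error(acc, 6), 4))
```

Which error dominates depends on the actual ACC and pack-size values, so it is worth plugging in the numbers seen on the real machine.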

I have some queries though:

What value is in C5:1.PRE, and does it ever change as the program is normally running without manual or operator intervention?

What do you want that expression

(6000 ÷ T4:32.ACC) ÷ C5:1.PRE

to calculate?

Do you see the calculated value (N7:53) starting as a large number, which initially quickly, then more gradually, decreases to a final value of 0 at about the time T4:32.ACC reaches 4000?

Ken Moore · Dec 3, 2021

Just put a compare in front of the divide block, and only do the division if N7:50 > 0. BTW, this should be standard practice for all divides where there is the potential for divide by 0.
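In Python terms, the guard amounts to something like the sketch below. The names are hypothetical, and holding the previous value when the divide is skipped is one possible choice, not something the thread specifies.

```python
def guarded_rate(acc_cs: int, units_per_pack: int, last_value: int = 0) -> int:
    """Only divide when both divisors are greater than 0 (hypothetical names)."""
    if acc_cs > 0 and units_per_pack > 0:
        # Same two divides as the ladder logic, but only reached when safe.
        return (6000 // acc_cs) // units_per_pack
    # Divide skipped: keep whatever value was displayed before.
    return last_value


print(guarded_rate(250, 6))               # normal case, divide runs
print(guarded_rate(0, 6, last_value=4))   # zero accumulator, divide skipped
```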

eight_bools · Dec 3, 2021

Wow

Thanks everyone for the responses, this has been very helpful. Here are a couple of quick responses.

plvlce
The timer is not reset anywhere in the program

drbitboy
C5:1.PRE is only changed during package changeovers while the line is down, not while in auto/run

The timer never reaches its preset (6000), the largest value I have seen during machine ramp up is 250.

I ended up replacing the MOV with the ADD. This is not mission critical; it's just for an operator interface, so a small error is acceptable.

Just out of curiosity, though, could this whole thing be replaced with a CPT instruction? I have never used one before.

(6000/(T4:32.ACC+1))/C5:1.PRE
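One thing to watch when collapsing two divides into one expression is where the rounding happens. The Python sketch below contrasts a two-step integer calculation with a single expression that is rounded once at the end. How CPT on the SLC actually treats intermediate results is not established in this thread, so both functions are illustrative assumptions with hypothetical names.

```python
def two_step(acc: int, units_per_pack: int) -> int:
    """Two separate integer divides, truncating after each one."""
    per_minute = 6000 // (acc + 1)
    return per_minute // units_per_pack


def single_expression(acc: int, units_per_pack: int) -> int:
    """Whole expression evaluated in floating point, rounded once at the end."""
    return round((6000 / (acc + 1)) / units_per_pack)


# The two approaches can disagree by one count for the same inputs.
print(two_step(250, 6), single_expression(250, 6))
```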

OkiePC · Dec 3, 2021

I usually put a rung at the end of the MAIN LAD 2 file that examines the math error bit and, if true, sets an alarm bit and unlatches it. This keeps the CPU running in the event of a math error (which may not always be the right thing to do).

If the program is lengthy and math intensive, I may put that rung at the end of each file with unique alarm bits for each file to help me figure out which file to look for if there is an error.
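The effect of that end-of-file rung can be modeled with a toy scan in Python. This is only a sketch of the behavior described in this thread (a minor error bit that halts the processor if it is still set at the end of the scan); the variable names are made up and it is not a model of the real firmware.

```python
def run_scan(acc: int, clear_at_end: bool) -> tuple[bool, bool]:
    """Simulate one scan. Returns (major_fault, math_alarm)."""
    overflow_trap = False  # stands in for the minor error bit S:5/0
    math_alarm = False     # hypothetical alarm bit for the HMI

    # Math rungs: a divide by zero sets the minor error bit.
    if acc == 0:
        overflow_trap = True

    # Last rung of file 2: if the bit is set, raise an alarm and unlatch it.
    if clear_at_end and overflow_trap:
        math_alarm = True
        overflow_trap = False

    # End of scan: a minor error bit that is still set becomes a major fault.
    major_fault = overflow_trap
    return major_fault, math_alarm


print(run_scan(0, clear_at_end=False))  # bit left set, processor faults
print(run_scan(0, clear_at_end=True))   # bit cleared, alarm raised instead
```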

When dividing by a variable, it is a good practice to put a NEQ instruction ahead of that instruction to skip it when that value is equal to zero (Ken Moore already said nearly the same thing).

When moving a variable into a timer preset, it is a good practice to put a GEQ instruction in front of the MOV to ensure you are not moving a negative number into a timer preset (another cause of the math overflow error).

Ken Moore · Dec 3, 2021

OkiePC, I know for a fact that moving a negative number into a timer preset will fault the processor (CLX platform for sure, suspect others will too).
I use that to test fault routines in SIS systems.

drbitboy · Dec 3, 2021

Thanks for the feedback.

I suspect a CPT statement would work.

Btw, if the one-shot [CASE COUNTER LS 1 SHOT B3:0/15] can never be 1 if [AUTO_RUN] is not 1, then the [XIC AUTO_RUN] instruction could be moved to be ANDed with the [XIO B3:0/15] on the TON feed rung, and all the rungs could be arranged in parallel instead of nested, which would look a little nicer.

drbitboy · Dec 4, 2021

brandonspeaks72 said:
I have been reading that the math overflow bit is often the culprit, and understand that it is common to unlatch the S:5/0 at the end of ladder 2.

See red-highlighted instructions in image below for the "right" way to do this.

brandonspeaks72 said:
I am curious though if anyone can help me find where the mistake was made so we don't make it moving forward.

Great question.

N.B. @plvlce already covered this; I am only going to expand on what they said.

Assumption: the problem is that the value of the timer accumulator, [T4:32.ACC], and therefore also the value of N7:50, is 0 when the first divide [DIV N7:51 N7:50 N7:52] occurs. That divide only occurs when the one-shot [CASE COUNTER LS 1 SHOT B3:0/15] is 1. The result is a divide-by-0 minor fault which, if not cleared, is promoted to a major fault at the end of the scan, and that major fault halts the PLC.

We don't know what triggers the one-shot to become 1; from the original post and one possible analysis-of-units interpretation of the math shown:

Code:

6000 cs     1 unit        1 pack             pack
--------  x -------  x  ----------  =  Rate ------
1 minute    .ACC cs     .PRE units          minute

it appears that .ACC is how long the timer has been running, in cs (centiseconds), before a physical unit triggers a prox event, which event eventually assigns a 1 to the one-shot [B3:0/15] for one scan, which [XIC B3:0/15] in turn triggers the divides.

Also, the timer is reset, and the value of .ACC becomes 0, EITHER

by the [XIO B3:0/15] branch when the one-shot is 1 on the same rung if [AUTO_RUN B3:0/10] is 1,

OR

by the [XIC B3:0/10] if [AUTO_RUN B3:0/10] is 0.

So the question

Why does the divide-by-0 occur?

becomes

Why is .ACC's value zero when the AUTO_RUN's and one-shot's values are 1?

which in turn becomes

Why is the one-shot's value 1 during a scan, when AUTO_RUN's value is 1, within 1cs (0.01s) of the last time the timer was reset?

So I see two possibilities. The first is

AUTO_RUN's value can transition from 0-to-1 while a unit is in front of the prox, and the one-shot logic includes AUTO_RUN e.g. [XIC AUTO_RUN XIC PROX ONS memory_bit OTE one-shot-B3:0/15]. That might be a programming mistake: the one-shot transition should occur across two successive scans where AUTO_RUN's value is 1 on both scans.

If this was the case then the fault would be occurring on a transition of the value of AUTO_RUN from 0 to 1. However, OP said there was no apparent pattern to the faults, and I would assume if the fault occurred at that transition it would be noticed, so this is unlikely.

The second possibility is

While AUTO_RUN is 1, the prox has two separate rising edges within 1cs (10ms, i.e. a few scans) but more than one scan apart. So the one-shot resets the timer and assigns 0 to the .ACC value on the first rising edge, and then does the divide on the second rising edge before .ACC increments to 1.

This seems unlikely to be caused by two separate units triggering the prox, but it could be "bounce" in the prox input signal, in which case applying part of the debounce pattern (cf. this link) to the prox input signal could fix the problem.
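As an illustration of the general idea (not the specific pattern behind the link, which is not reproduced here), a scan-count debounce can be sketched in Python. The input is only accepted as changed after it has held the new state for a number of consecutive scans; the threshold of 3 scans and all names are assumptions.

```python
def debounce(samples: list[bool], stable_scans: int = 3) -> list[bool]:
    """Return the debounced signal, one value per scan."""
    out = []
    state = False  # debounced state
    count = 0      # consecutive scans the raw input has disagreed with state
    for raw in samples:
        if raw == state:
            count = 0
        else:
            count += 1
            if count >= stable_scans:
                state = raw
                count = 0
        out.append(state)
    return out


def rising_edges(signal: list[bool]) -> int:
    """Count 0-to-1 transitions, which is what the one-shot reacts to."""
    edges, prev = 0, False
    for s in signal:
        if s and not prev:
            edges += 1
        prev = s
    return edges


# One unit passing the prox, with a single bounce at the leading edge.
raw = [False, True, False, True, True, True, True, False, False, False, False]
print(rising_edges(raw), rising_edges(debounce(raw)))
```

The trade-off is that the accepted edge arrives a few scans late, which slightly shifts the measured time between flags.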

If the problem is bounce in the prox, then another possible solution would be to change the timer base to 0.001s i.e. milliseconds, so the .ACC value would be less likely to be 0 on the second rising edge i.e. first bounce. The first problem here is that the bounce could still cause the divide-by-0. The second problem is the divide instruction into the number representing .ACC units per minute. That number would be 60000 and too high for a 16-bit signed INT, but that could be solved by using 30000 as the numerator and adding a factor of two somewhere else [DIV .ACC 2 N7:50], or by switching to using real values (does the SLC allow that?) in the calculations.
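The 16-bit limit and the halving workaround can be checked with a short Python sketch. The function names are hypothetical and truncating division is assumed; note that an accumulated value of 0 or 1 ms still produces a zero divisor after halving, so a guard would still be needed.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def fits_int16(value: int) -> bool:
    """True if the value can be stored in a 16-bit signed integer."""
    return INT16_MIN <= value <= INT16_MAX


def units_per_minute_ms(acc_ms: int) -> int:
    """60000 / ACC rewritten as 30000 / (ACC / 2) to stay within 16 bits."""
    half_acc = acc_ms // 2  # corresponds to [DIV .ACC 2 N7:50]
    return 30000 // half_acc


print(fits_int16(60000), fits_int16(30000))
print(units_per_minute_ms(2500))  # 2.5 s between flags
```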

plvlce · Dec 4, 2021

Ken Moore said:
OkiePC, I know for a fact that moving a negative number into a timer preset will fault the processor (CLX platform for sure, suspect others will too).
I use that to test fault routines in SIS systems.

A negative timer preset will fault the processor when the timer is enabled on the SLC platform, but iirc it has its own major fault rather than just using the overflow bit so it couldn't be the issue here.

Ran into this myself a month or two ago… a machine in service for over a decade faulted for a negative preset. Come to find out the preset in question was adjustable from the HMI, but the PanelView had the entry configured as an unsigned int with the full range (0-65535) of entry allowed. Any value >32767 would be interpreted as a negative in the PLC, causing the processor to fault once the process was started.
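The reinterpretation is plain two's complement: the same 16 bits read as signed instead of unsigned. A minimal Python sketch (the function name is made up):

```python
def as_int16(unsigned_value: int) -> int:
    """Reinterpret a 16-bit unsigned value as a 16-bit signed integer."""
    bits = unsigned_value & 0xFFFF
    # Values with the top bit set wrap around to the negative range.
    return bits - 0x10000 if bits > 0x7FFF else bits


for entry in (32767, 32768, 40000, 65535):
    print(entry, as_int16(entry))
```

Limiting the HMI entry to 0-32767 keeps every allowed value non-negative on the PLC side.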

daba · Dec 5, 2021

Just unlatch the damn overflow trap bit at the end of file 2 !!

No other A-B platform has this "trap", it is useless, because it doesn't tell you where the over or underflow occurred.

IMHO, a pointless bit of error trapping that you should be doing yourselves in logic if it's important.
