It Stopped Running Because The 20 Year Old PLC Program Changed

I_Automation

Just had the anomaly.


Stamping press, 20 years old, SLC5/04 CPU with EEPROM memory module.



The customer says that a couple of days ago they suddenly had to start holding the run buttons down; as soon as they let go, the press top stops.


DH+ to a PanelMate HMI at 230.4K baud, and the DF-1 port is set to ASCII. It took a while to get an old laptop with the PCMK card connected to the DH+.


Had to upload the file in use as it didn't match their last save. Searching through it, there is a bit that an OTE sets true if any of a dozen conditions or faults come true: the top stop bit.


However, the rung that turns on the top stop output for the relay checks the top stop bit with an XIO, while the saved archive has an XIC on the top stop bit.
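To make the difference concrete, here is a minimal Python sketch of the two examine instructions on a one-instruction rung. The function names are made up for illustration, and the rung is simplified to a single examine instruction driving the output; the real rung may have more on it.

```python
def xic(bit: bool) -> bool:
    """Examine If Closed: the instruction is true when the bit is 1."""
    return bit


def xio(bit: bool) -> bool:
    """Examine If Open: the instruction is true when the bit is 0."""
    return not bit


def top_stop_output(top_stop_bit: bool, instruction) -> bool:
    """Simplified rung: the output simply follows the examine instruction."""
    return instruction(top_stop_bit)


# Same bit, opposite output, depending only on which instruction is on the rung.
for bit in (False, True):
    print(f"bit={bit}  XIC rung={top_stop_output(bit, xic)}  "
          f"XIO rung={top_stop_output(bit, xio)}")
```

Swapping one instruction for the other inverts the output for every state of the bit, which is why a single changed instruction is enough to change how the machine behaves.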


Can't online edit through DH+ and can't download to the CPU even if it is in Program by the key - RSLogix says it can't get ownership of the CPU. Pulled the battery & wiped the program with the EEPROM removed and still couldn't connect to the DF-1 port. Put the EEPROM back in and it reloaded the glitched instruction, so the XIC changed in the EEPROM itself and was loaded into the CPU from there.


The only recourse I had was to get another 5/04, change channel 1 to System at 19,200 baud, and download the program to it, so that it has the correct program and the DF-1 port is usable.
 
Stupid program.

Apologies if I'm not following the plot. It sounds like the suspect buttons were PanelMate buttons. HMIs have a tendency to "stick" in the opposite state. What if that happened (a button stuck as if being pressed) and someone got online and inverted the state of the contact to "fix" the stuck HMI button? Then it flipped again down the road, leaving them having to hold the button.
 
No, nothing to do with the HMI.


A rung on the PLC ladder changed from XIC B3:6/13 to XIO - here's a screenshot of the upload I saved.


In the customer's archived save it was XIC.
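For anyone comparing an upload against an archive by hand, a rough Python sketch of the idea follows. It assumes both programs have been exported as one line of text per rung, which is my assumption and not a format from this thread; only `B3:6/13` comes from the post, while `FAULT_A` and `TOP_STOP_RELAY` are made-up placeholder names.

```python
from itertools import zip_longest


def changed_rungs(archived, uploaded):
    """Return (rung_number, archived_text, uploaded_text) for each rung that differs."""
    diffs = []
    pairs = zip_longest(archived, uploaded, fillvalue="<missing>")
    for number, (old, new) in enumerate(pairs):
        if old != new:
            diffs.append((number, old, new))
    return diffs


# Hypothetical rung listings; only B3:6/13 is taken from the thread.
archived = [
    "XIC FAULT_A OTE B3:6/13",
    "XIC B3:6/13 OTE TOP_STOP_RELAY",
]
uploaded = [
    "XIC FAULT_A OTE B3:6/13",
    "XIO B3:6/13 OTE TOP_STOP_RELAY",
]

for number, old, new in changed_rungs(archived, uploaded):
    print(f"rung {number}: archive has '{old}', upload has '{new}'")
```

This is only a positional compare; if rungs were inserted or deleted, every rung after that point would show as changed, and a proper diff tool would be the better choice.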


EDIT: The buttons are the 2 palm buttons on the control panel.

Capture.jpg
 
I've had a few problems on SLCs ranging from the 5/02 to the 5/04, most commonly when the program is running from RAM. I had exactly that, where a contact had changed just like yours and nobody else had access to it; one system did not even have an HMI. I also had a few where the program was corrupted and could not be recovered, so I had to download the backup. I also had one where an EEPROM was fitted: same thing. I suppose a corruption could change a contact, perhaps even to another address; after all, it's only bits. I would have thought there would be some sort of checksum tested on startup, but who knows.
One system got an NVRAM problem. Downloading the backup cured it a few times. Replaced the processor: same happened. Changed the PSU: same happened. Changed the rack: never happened again.
I've had many of these systems and in general they are very reliable, but as they age they cause all sorts of problems with bad joints, poor contacts and failing capacitors. They seem to be more unreliable than other makes when they are a few years old. I know it's time to upgrade them, but try telling the accountants that.
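On the checksum point, here is a small Python sketch of how a startup check would catch a single flipped bit. Whether and how the SLC actually checks its memory is not established in this thread; CRC-32 from Python's `zlib` is only a stand-in, and the 64-byte "program image" is dummy data.

```python
import zlib


def flip_bit(image: bytes, byte_index: int, bit_index: int) -> bytes:
    """Return a copy of the image with exactly one bit inverted."""
    corrupted = bytearray(image)
    corrupted[byte_index] ^= 1 << bit_index
    return bytes(corrupted)


def passes_startup_check(image: bytes, stored_crc: int) -> bool:
    """Compare the CRC of the image against the value stored with it."""
    return zlib.crc32(image) == stored_crc


program = bytes(range(64))          # dummy stand-in for a program image
stored_crc = zlib.crc32(program)    # value saved when the program was stored

corrupted = flip_bit(program, byte_index=10, bit_index=3)

print("original passes: ", passes_startup_check(program, stored_crc))
print("corrupted passes:", passes_startup_check(corrupted, stored_crc))
```

A CRC catches any single-bit change, so for a flipped instruction to load without complaint, either no such check runs on that memory or the stored check value changed along with it.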
 
I think that for some reason a quick fix pertaining to the button in question (*) was implemented onsite, and the program was backed up but never stored in the EEPROM.
Then years later, due to a power supply issue, the program loads automatically from the EEPROM and thereby reverts to the state before the quick fix.

And that can happen when you fix something, start the machine and check that it works OK, but now production won't stop the machine because they are already behind schedule.
You then make a mental note to have the program loaded on the EEPROM later, but forget about it because something else pops up.

*: Maybe the button was supposed to be an N.O. but an N.C. was installed.
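The revert scenario described in this post can be sketched as a toy model in Python. The class and field names are hypothetical, and `load_module_on_boot` is just a stand-in for whatever setting makes the processor reload from the memory module; it is not an actual SLC status bit name.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    eeprom_image: str          # program stored in the memory module
    ram_image: str             # program currently running
    load_module_on_boot: bool  # hypothetical "reload from module" setting

    def online_edit(self, new_image: str) -> None:
        """Change only the running program; the module is untouched."""
        self.ram_image = new_image

    def store_to_eeprom(self) -> None:
        """Copy the running program into the memory module."""
        self.eeprom_image = self.ram_image

    def power_cycle(self) -> None:
        """On boot, optionally overwrite RAM with the module contents."""
        if self.load_module_on_boot:
            self.ram_image = self.eeprom_image


plc = Controller(eeprom_image="original", ram_image="original",
                 load_module_on_boot=True)
plc.online_edit("quick fix")   # fix applied, but never stored to the module
plc.power_cycle()              # module reloads and the fix is gone
print(plc.ram_image)
```

Calling `store_to_eeprom()` before the power cycle is the step that gets forgotten in this scenario; with it, the fix survives the reboot.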
 
The story seems to be that the program spontaneously changed not one, but two, XIC instructions to XIO instructions.

And these are for "2 palm buttons?" On a stamping press?

Are we sure the change was not an intentional (or necessary) safety improvement?
 
I think it can be the same explanation for 2 buttons.
Original program loaded on EEPROM.
A hardware difference was found during testing on both buttons, so the program was modified and loaded, but only in RAM.

I am also concerned if this is a safety function programmed in a non-safety PLC.
In the past there were many posts on how to program 2-hand control, but I thought that it was mostly beginners doing some training, not a real life (or real death !) situation.
 
The instruction that changed was only one instruction, in one place.


It had nothing to do with the 2 palm buttons or their use - it's just that while the operator held the palm buttons, the relay logic, which is not controlled by the PLC, didn't activate the Top Stop. The PLC has nothing to do with the 2 hand controls except to watch a RUNNING CR while the press is running.



The EEPROM is loaded every boot with S:5/8 ON and the machine has been powered down every night for years, so it wasn't an edit made when this was last connected to, 3 years ago, that was never saved to EEPROM.


Plus, with Channel 1 set to USER-ASCII, online edits, and even downloading an updated program, are not possible. That could only be done once with the new processor, and once the channel config change was accepted, no more edits; the only way of changing that CPU is to install a new EEPROM. I did a pull-the-battery factory reset to default and it did wipe the program, but channel 1 did not revert to DF-1 1200 baud as I figured it would. EDIT: Or any other DF-1 baud.
 
Hey, ghosts exist.....


I've been to a machine that I was told by reliable sources had not been worked on or touched by anyone. But somehow I had to rewire part of a circuit to get it to work, because there was no real way it was going to function as it was. The customer had no access to the PLC program either, due to not owning Studio 5000.


I asked more than a few times if anyone had tried to repair the problem by swapping wires around, and I was told no every time. It had to be the same ghost that keeps screwing up all the other machines I get called to help fix, where changes have been put in place that nobody did.


On second thought, I think my kids work at all these places, because nobody was ever the one to screw something up at home either...
 
The story seems to be that the program spontaneously changed not one, but two, XIC instructions to XIO instructions.

And these are for "2 palm buttons?" On a stamping press?

Are we sure the change was not an intentional (or necessary) safety improvement?

Nobody ever reached for a part on the shelf and said, "Well shoot, all we have are these normally closed contacts for the palm buttons, but we need normally open... I have an idea."
 
The customer says

Possibility 1: A technician or contractor made changes to the controller and saved them in the EEPROM, and the customer is not telling you the whole story.

Possibility 2: Electromagnetic impulses or high-energy particles struck the EEPROM in precisely the correct places to change two related instructions to their opposite instruction codes and corrupt the memory checksum to the correct value to match.

The universe is big enough for Possibility 2, but I know which one I'd bet the rent on.
 
I'm not hugely familiar with the SLC's and the fine detail around their EEPROM storage, but could I venture Possibility 3?

Sometime in the early 2000's...
- Program is stored on EEPROM with the two instructions as XIO's and S:5/8 ON to load from EEPROM each power up
Sometime in the early 2010's...
- Palm buttons break and are replaced. Unfortunately they order the wrong type and then the customer has to hold down the buttons or the machine stops. Customer calls in a programmer to have a look
- Programmer identifies that the buttons have been replaced with NO instead of NC and changes the two XIO's to XIC's
- That night the machine is powered down, and when it powers back up in the morning, the hold-to-run issue is back
- Customer calls the programmer back in. Programmer realises the program has been set to load from EEPROM. Programmer changes the XIO's back to XIC's again, and turns off bit S:5/8 to stop it happening again
Sometime in mid 2022...
- A small furry rodent chews a cable it should not chew, and the resulting power surge corrupts the program in the SLC. The next morning the SLC powers up, realises that it has corrupt memory, and loads the image from the EEPROM, which has two XIO's and bit S:5/8 set to on
 
Could be that someone bypassed a failed safety switch [Pilz] to keep the line running, which would be a huge mistake on the tech's part if this was the case.
 
All palm buttons for 2 hand controls are 1 N.O./1 N.C., and both contacts have to work on each cycle of the buttons. With 2 hand controls, the N.O. contacts on both buttons also have to come on within a set time of each other (usually 1/4 to 1/3 of a second). Also, as I said, the palm buttons are not wired to the PLC at all; the 2 hand control relay logic turns on a Running CR and the PLC gets a signal from that.
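For readers who have not worked with 2 hand controls, the two checks described above can be sketched in Python. This is for illustration only: on this press the checks are done in hardwired relays, not software, and the function names and the 0.3 second default window are my own assumptions (picked from inside the 1/4 to 1/3 second range mentioned).

```python
def contacts_consistent(no_closed: bool, nc_closed: bool) -> bool:
    """A healthy 1 N.O./1 N.C. button never has both contacts in the same state."""
    return no_closed != nc_closed


def two_hand_start_allowed(left_press_s: float, right_press_s: float,
                           window_s: float = 0.3) -> bool:
    """Both N.O. contacts must close within window_s seconds of each other."""
    return abs(left_press_s - right_press_s) <= window_s


# Pressed button: N.O. closed, N.C. open -> consistent.
print(contacts_consistent(no_closed=True, nc_closed=False))
# Welded or miswired button: both contacts closed -> not consistent.
print(contacts_consistent(no_closed=True, nc_closed=True))

# Hands landing 0.2 s apart are accepted; 0.5 s apart (or one button tied down) are not.
print(two_hand_start_allowed(0.0, 0.2))
print(two_hand_start_allowed(0.0, 0.5))
```

The contact check is why a single wrong-type replacement button would show up immediately, and the time window is what defeats tying one button down.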




Also, this press doesn't have a Pilz or Banner 2 hand control safety relay, just standard relays like the ones that have been used in 2 hand controls since the 1940s (maybe earlier).


This customer is good in that they have an archive on their network of every PLC, HMI and robot program on their property - even keeping files for machines they have sold or scrapped. Plus they demand a copy of any update, even if it's just an online save of current data.
 
