Better troubleshooting experience for Siemens PLCs?

Li Yi (Member, joined Jul 2024, Suzhou, 11 posts)
Hello friends! I am a pre-development engineer at Siemens. I used to work mostly on math and algorithms, but recently I have moved to practical tasks. My personal feeling is that troubleshooting PLCs on real machines (such as a montage (assembly) station) is really annoying. For example:
1. To repeat a test, I need to click several buttons in sequence manually. I won't or can't write a test program, because I need to observe some data step by step, or because coding the test program takes more time than the manual operation.
2. Some strange failures occur, but I can't recreate them. I know I missed something, or that something in the background changed, but I don't know exactly what should be fixed.
3. I have TIA Portal Trace, but it supports only a limited number of signals/tags and a limited data length.
4. Even if I have the Trace function and some records, I can't use them for debugging.
For points 1 and 2, I wrote some ugly but workable code that saves DB data in each PLC cycle. I know there are SPS recorders and IBA recorders, but they can't solve all the problems and are also quite expensive. I assume that many automation engineers have similar complaints. I have the chance to improve future Siemens products, so I would like to hear more of your stories and suggestions. Thanks.
 
What is it that you're trying to do? What functions are you using?

Something that tripped me up early on was that MOVE blocks do have ENO enabled by default, which means that the program continues processing while the move is carried out.
Also a 10ms timer can sometimes give you that 1-2 scan cycle delay if things aren't executing how they should.
Make sure you distinguish between SR and RS if you use them, as the wrong one can return the opposite of the result you want.
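To illustrate the SR/RS point, here is a minimal Python sketch (not PLC code) of the two latch priorities. The function names are made up for illustration. As far as I know, in TIA Portal LAD the dominant input is the one marked with a 1 (R1 on SR, S1 on RS), but check the instruction help for your CPU rather than relying on this.

```python
def reset_dominant(q: bool, s: bool, r: bool) -> bool:
    """Latch where reset wins when set and reset are both true."""
    return (q or s) and not r


def set_dominant(q: bool, s: bool, r: bool) -> bool:
    """Latch where set wins when set and reset are both true."""
    return s or (q and not r)


if __name__ == "__main__":
    # The two latches only disagree when both inputs are true in the same scan.
    for s, r in [(True, False), (False, True), (False, False), (True, True)]:
        print(f"S={s!s:5} R={r!s:5} "
              f"reset_dominant={reset_dominant(False, s, r)!s:5} "
              f"set_dominant={set_dominant(False, s, r)}")
```

The only row where the outputs differ is S and R both true, which is exactly the case that is easy to miss when watching the code online.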
 
There should be alarms coded into the PLC to cover every foreseeable hardware error situation.
If there are functional errors in the code (i.e. program "bugs"), then you can pinpoint these by:
1. Observing the PLC online during operation. Either look at the code directly or use watch tables.
2. Observing the simulated PLC code.

Trace is an excellent tool. If there are spurious error situations and you cannot be online to observe the program 24/7, then Trace can give you the information on what or where the problem is. You can use that information to recreate the scenario in a simulation.

even if I have Trace function and some records, I can't use it for debugging.
That's on you.

I use extensive PLC simulation before commissioning. That way I have the program debugged to maybe 95% before even starting the machine for the first time.
 
Hi Jesper, thanks for your reply and your praise of Siemens products!

The errors are more likely caused by "corner cases" or "unforeseen user operation". In the former case, I can't simulate something I can't foresee; in the latter, my user might be a rookie who doesn't notice that something was done wrong. I can't monitor it online, because there are many meetings and other tasks waiting for me. The failure may have happened an hour ago, or I may not know what the trigger should look like, because it is a random failure. I know it sounds like I am excusing my own incompetence, but it happened.

I plan to have something like this: a better version of Trace, which can record EVERY tag value at the end of each PLC cycle for hours and also record which subroutines were invoked in each cycle. Then I could load the record file into TIA Portal PLCSIM for troubleshooting and bug fixing. But if it is only my problem, my boss won't approve the idea.
 
Hi Puddle, you are fully correct. That's why I said it is ugly but workable and can't solve all problems.
 
Before Trace existed, it was common to capture events that occur only very rarely by programming "traps" in software. On one occasion this was really the only way to diagnose what turned out to be a Profibus data consistency error, and also to prove that it was fixed.
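For readers who have not built one, a trap is roughly the following idea, sketched here in Python rather than PLC code. The class name, the buffer depth and the simulated glitch are all assumptions for illustration, not the code used on the machine described above: keep the last few scans in a ring buffer and freeze them the first time the suspicious condition is true.

```python
from collections import deque


class Trap:
    """Keep the last `depth` scans and freeze them when the condition fires."""

    def __init__(self, depth: int):
        self.history = deque(maxlen=depth)  # ring buffer of recent scans
        self.captured = None                # filled once, on the first trigger

    def scan(self, cycle: int, values: dict, condition: bool) -> None:
        if self.captured is not None:
            return  # already tripped; leave the evidence untouched
        self.history.append((cycle, dict(values)))
        if condition:
            self.captured = list(self.history)


if __name__ == "__main__":
    trap = Trap(depth=3)
    for cycle in range(10):
        sent = cycle
        echoed = cycle - 1 if cycle == 6 else cycle  # simulated rare glitch
        trap.scan(cycle, {"sent": sent, "echoed": echoed}, sent != echoed)
    for entry in trap.captured:
        print(entry)
```

Because the buffer stops updating after the first hit, the scans leading up to the event are still there when someone finally goes online hours later.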
 
1. To repeat a test, I need to click several buttons in sequence manually. I won't or can't write a test program, because I need to observe some data step by step, or because coding the test program takes more time than the manual operation.
I have used a unit test product by Siemens... you write the sequence of events you need and tell the processor to advance by cycle or by time. Surely, if you work at Siemens this is available to you? It works with the Professional PLCSIM.

The professional PLCSim also lets you generate sequences of conditional steps to perform, but with no control over the timing, as far as I remember.
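The "advance by cycle" idea can be sketched in plain Python as below. This is not the Siemens test product or the PLCSIM API; the seal-in logic, the function names and the step format are all hypothetical, just to show a scripted sequence replacing the manual button clicks.

```python
def scan(state: dict, inputs: dict) -> dict:
    """One PLC-style cycle of a hypothetical start/stop seal-in circuit."""
    running = (inputs["start"] or state["running"]) and not inputs["stop"]
    return {"running": running}


def run_sequence(steps):
    """Hold each step's inputs for its number of cycles; log the output per cycle."""
    state = {"running": False}
    log = []
    for inputs, cycles in steps:
        for _ in range(cycles):
            state = scan(state, inputs)
            log.append(state["running"])
    return log


if __name__ == "__main__":
    steps = [
        ({"start": True, "stop": False}, 1),   # press start for one cycle
        ({"start": False, "stop": False}, 3),  # release; output should stay on
        ({"start": False, "stop": True}, 1),   # press stop
    ]
    print(run_sequence(steps))
```

Once the sequence is written down like this, repeating the test is one call, and the per-cycle log gives the step-by-step view that manual clicking was used for.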

2. Some strange failures occur, but I can't recreate them. I know I missed something, or that something in the background changed, but I don't know exactly what should be fixed.

In a more advanced setup, programming is done by two people. Both read a document detailing what to do; one writes the tests, the other writes the program, and the two should match when the time comes to run the software. This also keeps the tester from being swayed by logic he wrote himself (we're human and that's how we function).
3. I have TIA Portal Trace, but it supports limited signals/tags and data length.

Do you have ServiceLab? I found it nicer to use, but most times you will need to set traps as MangleMender mentioned. Or if you want to be fancy, use the instruction that writes to the event log of the processor.
Traps can be monitored in a trending tool thus giving you some sense of sequence of events.

4. Even if I have the Trace function and some records, I can't use them for debugging.
Why? Just curious.

I have the chance to improve future Siemens products, so I want to know more of your stories and suggestions, thanks.

I think you probably need to learn all the available products from Siemens and how they work first, no?
 
Manglemender, hats off to you for this feat!
That must have been quite a tough one, and it must have required huge insight into and understanding of the system and the process.
 
I plan to have something like this: a better version of Trace, which can record EVERY tag value at the end of each PLC cycle for hours and also record which subroutines were invoked in each cycle. Then I could load the record file into TIA Portal PLCSIM for troubleshooting and bug fixing. But if it is only my problem, my boss won't approve the idea.
How would that work? It would effectively create a snapshot of every single tag and DB value per cycle and store that data on the PLC until you download it, similar to data logs?

I have a 1211 CPU that's used to collate data from a network of temperature controllers. Cycle time averages 10 ms and there are over 500 tags in use; let's say they're all UINT to make it easier. That's 2 B per value stored directly, or 4-5 B as CSV.

That program would take 6 MB of storage per minute.
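The arithmetic behind that figure, as a small Python check. The function name is made up; the inputs are the numbers quoted above (500 tags, 2 B each, 10 ms cycle), and the CSV lines use the 4-5 B per value estimate.

```python
def bytes_per_minute(tags: int, bytes_per_tag: float, cycle_ms: float) -> float:
    """Storage needed to snapshot every tag once per PLC cycle for one minute."""
    cycles_per_minute = 60_000 / cycle_ms
    return tags * bytes_per_tag * cycles_per_minute


if __name__ == "__main__":
    print(bytes_per_minute(500, 2, 10) / 1e6, "MB/min stored as raw UINT")
    print(bytes_per_minute(500, 4, 10) / 1e6, "MB/min as CSV, low estimate")
    print(bytes_per_minute(500, 5, 10) / 1e6, "MB/min as CSV, high estimate")
```

So 6 MB per minute is the raw binary case; written as CSV it would be roughly two to two and a half times that.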
 
Hence why the manager isn't moving forward with it...
As computing evolved, the basics most programmers were brought up on, like managing memory as the finite resource it is, were thrown out of the window because you can always get more memory or a larger hard drive. Surely you've seen browsers and Teams display this behavior...
 
The errors are more likely caused by "corner cases" or "unforeseen user operation".
Interview the operators about the error scenario.
Tell them to take photos or videos of the machine right after the error situation. Also take photos and videos of the HMI.
Then, based on that information, recreate the scenario in simulation.
Try different kinds of user errors.
If you can recreate the error, you can fix it.

edit: Ask the operators if they can recreate the error scenario.

Another thing to log could be all user inputs on the HMI. This can be far less data than logging all machine I/O with millisecond resolution.
 
Data consistency is a strange beast that can catch you out. When you speak to a DP slave, you open a data consistency window which must be closed - usually by reading/writing to the last byte defined in the comms structure.

I have worked with Profibus since its invention back in the days of S5, and data consistency has only been an issue twice. The first time required the assistance of Siemens with a packet analyser to identify a very intermittent issue where an analogue input would fail to update its value for seconds at a time.

On the second occasion I was sending move data to a Bosch Rexroth motion controller. The GSD defined 10 bytes, but the BR engineer insisted that I should only send 9 bytes, which immediately raised a concern in my mind (I was overruled by the engineering manager). Most of the time this worked fine, but very, very occasionally the machine would make the same part twice and skip a part. The "trap" read back some data from the motion controller and compared it with what was being sent and what had been sent previously. Some 12 hours later, bingo! After that, 10 bytes were sent and the problem was never seen again.
 
I got caught once by data consistency and kept getting some weird glitching values... After reading the manual for the read function, the word "consistency" popped out, along with the advice to use the read consistent data function. :)
 
Why? Just curious.

I think you probably need to learn all the available products from Siemens and how they work first, no?
Hi, thanks for the patient answer.
About Trace for debugging: I don't have every input signal, so I cannot simulate the process. Also, I haven't found an automatic way to put the measured values into an I/O force table.
I must admit that even as a Siemens employee, it is hard to know every Siemens and partner product. I have consulted my senior expert colleagues; they have similar or very complicated solutions (deep into circuits and runtime) that solve part of the problem, but none matches precisely.
 
How would that work? It would effectively create a snapshot of every single tag and DB value per cycle and store that data on the PLC until you download it, similar to data logs?

I have a 1211 CPU that's used to collate data from a network of temperature controllers. Cycle time averages 10 ms and there are over 500 tags in use; let's say they're all UINT to make it easier. That's 2 B per value stored directly, or 4-5 B as CSV.

That program would take 6 MB of storage per minute.
Indeed, it is neither efficient nor easy. Many manual operations must be done, combined with stop or pause. I even tried copying from RAM to ROM.

Hence why the manager isn't moving forward with it...
As computing evolved, the basics most programmers were brought up on, like managing memory as the finite resource it is, were thrown out of the window because you can always get more memory or a larger hard drive. Surely you've seen browsers and Teams display this behavior...
The amount of data can be reduced. Not all data varies between two cycles, so only changed values would be saved, together with a "cycle stamp". They would not be saved as CSV or txt, but in binary form. Then 1024 4-byte signals sampled at 1024 Hz need only about 1 MB/s, assuming that only 1/4 of the signals change in each cycle. Of course, that is still too large, and more like a DAQ (data acquisition) system than a PLC. However, PC-based PLCs are appearing and gaining some market share. Beckhoff has PLCs with very fast cycles, and Keyence has a playback function. A PLC not only for experienced automation engineers who can ... Thus, I need to talk with engineers who have more practical experience to test my idea.
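A minimal Python sketch of the change-only recording described above. The record layout (a 4-byte cycle stamp, a 2-byte count, then a 2-byte signal index plus a 4-byte value per change) is an assumption chosen for illustration, not an existing Siemens format.

```python
import struct


def encode_cycle(cycle: int, previous: list, current: list) -> bytes:
    """Encode one cycle as: cycle stamp, change count, then (index, value) pairs."""
    changed = [(i, v) for i, (p, v) in enumerate(zip(previous, current)) if p != v]
    record = struct.pack("<IH", cycle, len(changed))  # 6-byte header per cycle
    for index, value in changed:
        record += struct.pack("<HI", index, value)    # 6 bytes per changed signal
    return record


if __name__ == "__main__":
    previous = [0] * 1024
    current = list(previous)
    for i in range(256):          # a quarter of the signals change this cycle
        current[i] = 1
    per_cycle = len(encode_cycle(0, previous, current))
    print(per_cycle, "bytes per cycle,", per_cycle * 1024, "bytes per second at 1024 Hz")
    print(1024 * 4 * 1024, "bytes per second for a full snapshot every cycle")
```

The 1 MB/s estimate counts the 4-byte values only. With the index and header assumed here it comes to about 1.5 MB/s, against about 4 MB/s for a full snapshot every cycle, so the saving is real but depends heavily on how often signals actually change.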
 
