Not solving the cause of the problem

Hundikoer

Member
Join Date
Jul 2003
Location
Tallinn
Posts
271
Hello automation masters

I am on a business trip in France at the moment, and this customer case prompted me to write the following.

The plant I am working at belongs to the automotive industry, and I have been here several times. The machines I have to program (actually, fix bugs in) are typical pick-and-place robots. Seems quite simple. However, when the machines were built I had to use our ordinary base program for what are the most complex machines of this kind in the company's history, because time was very limited and I did not get permission to use my "smart(er)" algorithm. And that causes a lot of unpredictable situations. Now I hope this is my last time here (although they pay well, I do not want to come here anymore). I have noticed that this time, and also last time, I did not really fix the causes of the problems (these can be very hard to find) but chose the faster way: I define the "special" situation, and if all conditions are OK, I just overwrite some values (values that should already have been overwritten by the program).
These solutions work great, do not kill anyone, and the customer is very happy, saying "You are the best!"

Now the question

Does anyone else have this kind of experience, where it is easier to overwrite some values when all the necessary conditions are OK, or do you always try to find the source of the bug, no matter what it takes?


Regards
Lauri


PS!
Maybe MS uses the same approach, and that could explain the problems with PC software?
 
It's pretty easy to say that you should ALWAYS find the source of the error. Conceptually I agree. But given time and production pressures it's not always reasonable to do so.

Only you can determine if the machine is left in a safe state by doing this. As you said in your original post, the solution doesn't kill anyone, which is an important first evaluation step.

The biggest concern with not finding the baseline cause of the problem is that it may show itself in an unforeseen way, and you will be going back anyway.

At the very least you need to take your best educated guess at what is causing the issue. Three or four educated guesses would be even better. Then write some error-trapping software. You obviously know when the values are wrong, as you are overwriting them with the right ones. When you have to overwrite these values, also log some additional values in an error log that can be evaluated later. After some reasonable period of time, retrieve the logged data and see if you can pick out a recurrence or trend. If you know what should be triggering calculation of the affected values, that would be the first stuff I would log.
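A minimal sketch of that kind of error trap, written in Python purely for illustration (on the real machine it would be done in the PLC's own language). The names `ErrorTrap`, `target_position`, `step`, `gripper_closed` and `part_present` are hypothetical and not taken from the actual program. The idea is only that every time a value has to be corrected, a snapshot of the suspected trigger signals is stored along with it.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ErrorTrap:
    """Corrects a known-bad value and records context for later review."""
    log: list = field(default_factory=list)

    def check(self, name, actual, expected, context, now=None):
        """Return the value to use; log a snapshot whenever a correction is needed."""
        if actual == expected:
            return actual  # nothing wrong, nothing logged
        self.log.append({
            "time": time.time() if now is None else now,
            "name": name,
            "actual": actual,
            "expected": expected,
            "context": dict(context),  # copy so later changes don't alter the record
        })
        return expected  # the "patch": overwrite with the correct value


trap = ErrorTrap()
# Hypothetical signals: the sequence step and the inputs that should trigger the update.
context = {"step": 12, "gripper_closed": True, "part_present": False}
value = trap.check("target_position", actual=0, expected=350, context=context, now=1000.0)
print(value)          # 350
print(len(trap.log))  # 1
```

Each educated guess about the cause becomes one more entry in `context`, so the log can later confirm or rule it out.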

Hope this helps.
Keith
 
The best solution is to eliminate the bug.

Sometimes this is not easy to do, especially if the bug is a result of program structural problems and not a specific item of code. Sometimes in this case it's easier to write a trap to catch and correct the bug when it occurs.

In either case you need to understand what the cause of the bug is.
 
Here's the Catch-22 to your situation. In order to be sure that your software patch is a good solution you need to know exactly what is causing the problem in the first place. If you knew that, you'd be able to fix the problem without having to resort to a patch.
 
I have had some experience with situations like this, and here is what I suggest. Put in some code that will help you log, time- and date-stamp, count, and otherwise identify problems as they happen.
If you have an external connection over ethernet or modem you can periodically review what's going on and eventually eliminate the root cause bug.
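A rough sketch of the review step, in Python for illustration only. It assumes the controller's fault records have already been uploaded as a list of entries with a timestamp, a fault name and the sequence step. The field names and the sample entries are made up for the example, not taken from any real machine.

```python
from collections import Counter
from datetime import datetime


def summarize(entries):
    """Count faults per name and per (name, step) pair, and report first/last time seen."""
    by_name = Counter(e["name"] for e in entries)
    by_step = Counter((e["name"], e["step"]) for e in entries)
    times = sorted(e["time"] for e in entries)
    span = (times[0], times[-1]) if times else None
    return by_name, by_step, span


# Hypothetical entries, as they might look after an upload over Ethernet or modem.
entries = [
    {"time": datetime(2004, 3, 1, 6, 15), "name": "target_position", "step": 12},
    {"time": datetime(2004, 3, 1, 14, 2), "name": "target_position", "step": 12},
    {"time": datetime(2004, 3, 2, 6, 20), "name": "cycle_counter", "step": 4},
]
by_name, by_step, span = summarize(entries)
print(by_name.most_common(1))  # [('target_position', 2)]
print(by_step.most_common(1))  # [(('target_position', 12), 2)]
```

If one fault keeps turning up at the same step, that step is the first place to look for the root cause.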
 
Thanks for replies

First of all I must say that the life of these machines is up to 6 years. After that they are in such bad condition (because of the line managers who take "spare parts" from other cells, etc.) that they must be rebuilt from zero. Therefore I am not worried about what happens if someone else must fix bugs.

I know that I am a lazy ugly *******, but normally I choose the easiest way, to be more efficient. I must admit that if someone else opens the program, he/she will scream.

What I am trying to say is that fixing the cause is not always the best way; sometimes it is better to eliminate the bad results that the end user can see. Also, I know that I will not win any popularity here with that kind of statement.


BR
Lauri
 
Hundikoer said:
Also, I know that I will not win any popularity here with that kind of statement.


BR
Lauri

Do not beat yourself up too hard. I think everyone on this site has been in the same position at one time or another. I know I have. It is very tough to just "slap a band-aid" on something when you know it really needs "stitches", but when the manager of the plant is standing over your shoulder complaining that he is losing thousands of dollars in lost production, you do what you can to get them going.

Bob
 
Continue...

I would plan another trip there... Leave your program running, in a "safe" condition, take your copy and work on it with less stress, then return for the final loading and testing...🙃
 
Jiri Toman said:
I have had some experience with situations like this, and here is what I suggest. Put in some code that will help you log, time- and date-stamp, count, and otherwise identify problems as they happen.
If you have an external connection over ethernet or modem you can periodically review what's going on and eventually eliminate the root cause bug.

This is a very good method for debugging someone else's logic when the solution is not clear. I'll be doing this Friday morning on a machine that was programmed by a vendor before me and screws up on average once a month.
 
Steve said it first...

You are employing patches to "fix a problem". But of course, as Steve implied, you don't know why the so-called problem is occurring. It could very well be that the original code is doing exactly what it is expected to do... even though that might not be the particular response you were hoping for.

You have to be very careful about slapping patches around... you could be shooting yourself in the foot.

Could it be that your "patch" could actually replace the original code?

The suggestion that Keith, Alaric and Jiri made about building some error-trapping code is good. It works best while you are on-site so you can check the error-trapping code as soon as a fault occurs.

Basically, this technique is used with poorly written code. In this case, the code is probably more like spaghetti-code... that is, very hard to follow from rung to rung. If so, then this is an obvious case of no, or inadequate, modularity.

Then Daniel comes with the "A+" answer...
Take a copy of the current program home with you and rebuild it at your leisure (leisure... yeah, right). This, of course, requires that you really understand the particular process. Even though you have the on-site program in a somewhat stable situation, this is the classic "poster-child" situation for breaking something just so you can fix it!

Just reading what you have said... I am more than sure that the program is nothing less than a "hack-job".

While at home (really, at home) you could rebuild this code (with true modularity) and add a bunch of bells and whistles to help the operators and maintenance folks. You could then use that program as a model for the next generation. Then, when you get called back (the money is good, right?) then you can have a super-duper up-grade for them. Then, maybe, they will really have reason to think that you are the BEST!

Then Lauri, you said...

What I am trying to say is that fixing the cause is not always the best way; sometimes it is better to eliminate the bad results that the end user can see. Also, I know that I will not win any popularity here with that kind of statement.

... stand-by for a little moralizing tangent... (if you can't appreciate it, then just ignore it).
You are right... you will not get my vote. Could it be that night-life and socializing are more important to you than doing what you have to do to support your livelihood? I have always thought that both had the same level of importance. Even with respect to "family", each (professional-life and personal-life) depends on the other.

Life, in all of its different aspects, is a trade-off. You have to do what you have to do, but you have to maintain the balance. You have to be honest with yourself about what you really have to do... "each depends on the other".

You have to find the "right" balance of what is really important. Always keep in mind... everyone is always surprised when old-age creeps up on them.
 
squash the bug. i'm way too obsessive to leave stuff alone just cause it works with some hacked up patch job. i'll have these crazy circular dreams about it and won't be able to get the problem out of my head until i fully understand it... it actually gets really annoying at times.

but as far as efficiency goes this kicks me in the pants every time when stuff just needs to get done.
 
Fixing the root cause can be difficult.
The cause can be too elusive to find, or there may be no opportunity to access the machine for troubleshooting.

If you really cannot squash the bug, then the second best thing is an assured way to reset the machine so that the production can get up and running again.

Apart from that, I would not dismiss the problem as "not so bad" too soon. Over the machine's 6-year lifetime it can ruin your reputation in the market.
The initial joy that the machine seems to work will fade, and the customer will start to be irritated when (if) the error comes back too many times.
 
Hundikoer said:
Hello automation masters

...I did not get permission to use my "smart(er)" algorithm.

Explain why you are not responsible for this program.

Why would you have to ask permission for the nature of your code, but now don't seem to have to ask any permission for hacking it?

I don't put the blame on you but you seem to put it on your boss. In the end this is YOUR program.

Fix it! Fix it as if YOU were the one paying for it.
 
