PROFINET MRP takes 5 seconds to recover from break

Mark Whitt · Apr 6, 2016

Hello fellow Automation folks!

Disclosure: I've made this same post on the Siemens support forum, but so far haven't gotten anywhere. So, I'm hoping that someone here may be able to lend some guidance.

I'm new to MRP and having some trouble understanding some things about it.

I've read where it should take on the order of 200 milliseconds for an MRP loop to detect a break and re-route messages the other way around the loop. But when I test the setup I have here, it seems to take about 5 seconds for the bus to recover and data to start coming into the PLC again.

I'm wondering whether I have something configured wrong.

Here's a link to a picture of the Topology and the Domain Management views of the configuration that I'm using: ftp://ftp-static.mt.com/pub/Indmkg/Whitt/Temp/MRP.jpg

I'm using an S7-315 2 PN/DP for my tests. I have the MRP loop connected to both Ethernet ports, and am monitoring the program data over the USB MPI link. Programming package is Step 7 V5.5 SP3.

The details of what I'm seeing are:

When I break the loop in one direction, the interrupt is almost unnoticeable. However, when I re-connect the break, the devices will drop offline for about 5 seconds. Is this normal?
When I break the loop in the other direction, the devices drop offline for about 5 seconds. When I reconnect the break, the devices drop offline again for another 5 seconds. This one surprised me the most.
When the communications drop out, the SF and BF2 lights will flash on the PLC. When the communications re-establish, the lights go out. This all seems to line up with what I'm seeing happening to the data as the links re-establish.

One last thing - on the domain management view, the form doesn't seem to remember that I've selected "MRP domain" for the Display radio buttons. Could this indicate that I've set something up wrong?

Oh! And the devices on the network are new. It's possible that there is a problem with the devices themselves.

Thanks in advance for any comments and guidance!

Mark Whitt · Apr 7, 2016

A little more information. I've found out that the devices were tested at PI and supposedly passed this functional test. So, I'm thinking that this is a programming issue - either an OB needs to be defined (which one? I've tried several) or there needs to be some handling code in an OB (OB122 seems to be what comes up in the diagnostic log). Nothing obvious in the manuals. Searches on the internet come up empty. No responses on the Siemens forum. And this forum is disturbingly quiet on the topic. Is MRP just not used anywhere?

Mark Whitt · Apr 11, 2016

Finally got to the bottom of this! But it took some help from two other suppliers to do it. Hopefully, posting it here will save someone else the days of work that it cost me!

It turns out that the problem was that the Device Watchdog timers were set lower than the 200 millisecond recovery time for the MRP loop. As a result, when a break took place, the Watchdog timer would trip and cause the device to fault. It took about 5 seconds for the device to recover from that fault.

Details:
In the hardware configuration, double-click on the slot "X1" line "Interface" module of the Device. Then click on the "IO Cycle" tab. In the frame for the "Watchdog Time", click on the drop-down box and select the number of I/O updates that the device can accept with missing I/O data. The time (shown on the next line below) is calculated as this number multiplied by the Update Time (see the "Update Time" frame right above the "Watchdog Time" frame).

Make sure that the total Watchdog time is GREATER than the 200 millisecond recovery time for the MRP loop.
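For anyone who wants to sanity-check the numbers before opening the hardware configuration, here is a minimal Python sketch of the arithmetic described above. The function names and the sample update times are made up for illustration, and the 200 ms figure is simply the MRP recovery time quoted in this thread. STEP 7 may only offer certain factors in the drop-down, so treat the result as a lower bound rather than a value you can necessarily select.

```python
import math

# Worst-case MRP ring recovery time quoted in this thread (assumption: 200 ms).
MRP_RECOVERY_MS = 200.0


def watchdog_time_ms(update_time_ms: float, accepted_missing_updates: int) -> float:
    """Watchdog time = update time x number of accepted update cycles with missing I/O data."""
    return update_time_ms * accepted_missing_updates


def min_missing_updates(update_time_ms: float, recovery_ms: float = MRP_RECOVERY_MS) -> int:
    """Smallest factor whose watchdog time is strictly greater than recovery_ms."""
    return math.floor(recovery_ms / update_time_ms) + 1


if __name__ == "__main__":
    # Hypothetical update times in milliseconds, chosen only for illustration.
    for update_time in (1.0, 2.0, 8.0, 16.0):
        factor = min_missing_updates(update_time)
        total = watchdog_time_ms(update_time, factor)
        print(f"update {update_time:>5.1f} ms -> factor >= {factor:>3} (watchdog {total:.1f} ms)")
```

As an example of the failure mode: a 2 ms update time with a factor of 3 gives a 6 ms watchdog, which trips long before the ring has a chance to recover.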

This is also posted in the following Siemens support forum:

https://support.industry.siemens.co...m-break/145707/?page=0&pageSize=10#post587165

Pete.S. · Apr 11, 2016

Mark, I'm not familiar with MRP.
What are the advantages/disadvantages? I assume the primary purpose is network redundancy?

Thanks,
Pete

Mark Whitt · Apr 11, 2016

Hello Pete,

Yes, that's exactly what it is for. The PROFINET link is set up in a loop so that the same message from the PLC will go out both ports. If all is well, the messages will come back in the opposite ports and let the PLC (or MRP Manager) know that the link is okay. If a break occurs, all devices in the loop should still get the messages from one direction or the other.

It's very similar to EtherNet/IP's DLR (Device Level Ring).

Mark Whitt · Apr 11, 2016

Advantages and disadvantages

I forgot to mention the advantages and disadvantages. The obvious advantage is more robust communications.

Disadvantages would be that special equipment is required (Switches have to be MRP capable, as does the PLC and any other devices put into the loop). Configuration can also be a challenge, which is why I created this thread in the first place.

Pete.S. · Apr 11, 2016

Thanks Mark, for your reply.

It looks interesting, but support seems somewhat limited outside of Siemens. I'm vaguely familiar with spanning tree, which is used on IT switches.

It looks like it has to be a ring. I wonder how you connect devices that don't have two Ethernet ports, for instance Siemens HMIs.

Mark Whitt · Apr 12, 2016

Hello Pete,

I can't claim to be an expert at MRP, so someone else may have more accurate and useful information than I do. If so, I hope they chime in!

Actually, the configuration of the MRP loop wasn't too bad - except for this issue I ran into with the Watchdog timers. So, I wouldn't be too concerned about limited support as long as we have forums like this.

My understanding is that you can put non-MRP enabled devices into a loop. But you have to do it using an MRP enabled switch, and those all appear to be managed (so it's not as simple as just plugging things in). Those switches also are not cheap. But if you need reliable communications, then finding inexpensive equipment probably isn't your highest priority.

One nice thing about both MRP (PROFINET) and DLR (Ethernet/IP) is that you can set up an entire network without using switches at all - because each device has two ports on it that already act as a switch, and allow you to daisy chain one to the next. That's really handy in panels where all of those cables from a star network can take up space that could be better used for other things - such as air flow enhancement. Not to mention that you don't have to allocate space and power for a switch either.

Those same devices can be set up in a standard daisy chain and give you most of the same benefits - if you're not too concerned about message timing (it takes time to jump from one device to the next which adds up as the number of devices increases) and don't have a high reliability requirement.

mk42 · Apr 12, 2016

Pete.S. said:
Thanks Mark, for your reply.

It looks interesting, but support seems somewhat limited outside of Siemens. I'm vaguely familiar with spanning tree, which is used on IT switches.

It looks like it has to be a ring. I wonder how you connect devices that don't have two Ethernet ports, for instance Siemens HMIs.

Actually some of the HMIs from Siemens do support it, the Comfort line, with 2 ethernet ports (or more, in the larger sizes).

MRP definitely does need to be a ring. It is intended for use in situations where people are planning to daisy chain a number of devices together, like they would have for Profibus, but would like some network redundancy. It is intended for a ring of mostly field devices (a PLC and its IO), although some switches support it as well.

The advantage it has over something like Spanning Tree (STP) or Rapid STP is that it detects the failure and repairs itself faster. Even Rapid spanning tree often takes 5 seconds to detect and repair a change in the network, whereas MRP does it in less than 200 ms.

The disadvantage to MRP compared to RSTP is that MRP ONLY works in a ring, whereas RSTP can be used in much more flexible topologies with multiple redundant links. That extra flexibility is what can cause the delay. RSTP is more intended for switches than for field devices.

The disadvantage that MRP has in comparison to some ring protocols is that it isn't bumpless. Some ring protocols send every packet in both directions, so there is no need to detect a failure and repair. However, that doubles the traffic on the network, and requires the switches to be much smarter (and therefore more expensive). Some Profinet products support MRPD, which is bumpless, but this is typically only common in devices involved in coordinated motion (drives or something else that typically uses PN IRT), and not in more typical field devices.

Mark is correct that MRP is pretty much equivalent to DLR, in the same way that Profinet is pretty much equivalent to Ethernet/IP. Essentially, two different approaches to the same basic problem.

Pete.S. · Apr 12, 2016

Interesting, thanks guys!

In what scenarios would you say using MRP is a good choice?
I mean the CPU and I/Os are normally not redundant (well unless you use the redundant ones).

Mark Whitt · Apr 12, 2016

The processes that I'm familiar with where redundant communications are required are things like you would find in refineries or glass furnace operations where a prolonged loss of control can damage the process or cause unsafe conditions. Or something like the nuclear industry where it takes a lot of time for the maintenance people to suit up so they can get to a problem.

Another type of process where redundant communications are important would be anything that is time-critical, like when you have a conveyor going through an oven and the product would be ruined if the I/O failed.

Of course, anything that requires motion control with fine positioning (so things don't crash for instance) would require an even higher level of redundancy - such as MRPD (bumpless) that mk42 referred to.

I'm sure that there are others. If people would chime in with some that they can point to, that would be great!

Pete.S. · Apr 12, 2016

Mark Whitt said:
The processes that I'm familiar with where redundant communications are required are things like you would find in refineries or glass furnace operations where a prolonged loss of control can damage the process or cause unsafe conditions. Or something like the nuclear industry where it takes a lot of time for the maintenance people to suit up so they can get to a problem.

I'm familiar with redundant systems in the chemical industry and refineries, but that is usually done with DCS systems rather than PLCs, and they use redundant everything.

And the plants I've worked with have had a separate safety system, so this is not something the control system would handle.

I was thinking more along the lines of when using the MRP would make sense for regular non-redundant PLC systems. Or maybe it always makes sense?

Mark Whitt · Apr 12, 2016

Hello Pete,

Maybe the answer you're looking for has to do with daisy chaining. Daisy chaining has a number of advantages over a star network topology, but it has the obvious disadvantage of being more susceptible to a single device taking down the network. With an MRP (or DLR) ring, the impact of a problem with one device is very much reduced, which brings the overall network reliability nearly up to that of a star network.

mk42 · Apr 12, 2016

Pete.S. said:
Interesting, thanks guys!

In what scenarios would you say using MRP is a good choice?
I mean the CPU and I/Os are normally not redundant (well unless you use the redundant ones).

tl;dr: Rings/MRP are a way to cheaply build a network that can survive a fault.

To me, looking at when something like MRP makes sense requires starting with 2 questions: 1)Why are you using a line topology instead of a star, and 2)Why are you using a ring instead of a line? The two big factors in those decisions are the physical layout of the system and what needs to happen when faults occur.

Star topologies, where all the devices are connected to a common switch, are probably the most common Ethernet layout that I see. Industrial rated Ethernet switches are pretty reliable, but if you lose the switch, either through failure or power loss, you lose communication to every device. On the plus side, you can power off any of the end devices without affecting the communications to the rest of the system. As an aside, I always recommend managed switches for the diagnostics, but unmanaged often get the job done.

Line topologies tend to be a little bit cheaper, because you daisy chain from device to device, and you don't need to spend money for a switch (although you can include them if you choose). This can make a lot of sense if you have a lot of IP65 field devices mounted outside of cabinets around the machine. It may be much simpler to cable from device to device than to wire everything back to one central point. However, if you lose one of the nodes in the middle of the line, it breaks the line in half. Each half can talk, but no traffic can go across the failure. This can make commissioning a huge pain.

Ring topologies are pretty much a direct improvement on Line topologies. With one extra cable, you complete the ring, which means that if you have a failure somewhere, the network will repair itself, and after a short pause all traffic will be able to flow again. The downside here is making sure that all components support the same ring protocol, and the extra engineering/commissioning effort to make sure it works. If you create an Ethernet loop with no ring configured, your network will go south fast.
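To make the line-versus-ring difference concrete, here is a rough Python sketch that checks which nodes the PLC can still reach after one cable breaks. The node names and the broken link are hypothetical, and the model covers physical connectivity only. It does not model how MRP blocks one ring port in normal operation or how long the reconfiguration takes.

```python
from collections import deque


def reachable(nodes, links, start):
    """Return the set of nodes reachable from start over undirected links (breadth-first search)."""
    adjacency = {node: set() for node in nodes}
    for a, b in links:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def without(links, broken):
    """Copy of links with the broken cable removed (direction does not matter)."""
    return [link for link in links if set(link) != set(broken)]


# Hypothetical network: one PLC and four daisy-chained I/O devices.
nodes = ["PLC", "IO1", "IO2", "IO3", "IO4"]
line_links = [("PLC", "IO1"), ("IO1", "IO2"), ("IO2", "IO3"), ("IO3", "IO4")]
ring_links = line_links + [("IO4", "PLC")]  # the one extra cable that closes the ring

broken = ("IO2", "IO3")
line_after = reachable(nodes, without(line_links, broken), "PLC")
ring_after = reachable(nodes, without(ring_links, broken), "PLC")

print("line:", sorted(line_after))
print("ring:", sorted(ring_after))
```

In the line, the PLC loses IO3 and IO4 once the cable between IO2 and IO3 is gone. In the ring, every device is still reachable the other way around.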

So now that we have that part of the discussion out of the way, Line topologies don't make sense when you have a line that needs to/can run through a failure. This is typical in redundant type setups, but it is also common with distributed setups. If a PLC controls 4 independent robot cells, you don't want to connect them in a linear way. You might want the other 3 robots to keep running even if one is down.

Either Rings or Stars can be a good upgrade from the line topology, depending on what your system needs. Often, both are valid options, but I typically see star topologies far more often than rings. True redundancy isn't a requirement all that often, and when you really need it, the redundancy usually goes way past just one ring.

Pete.S. · Apr 12, 2016

Good discussion guys.

I haven't given network topologies much thought for many years as they almost always turn out to be stars. But as you said a ring with Profinet MRP is an option to consider.
