Hi all,
I'm trying to troubleshoot why a point-to-point serial DNP3 link suddenly stopped working and need some expert advice, since I have no formal training in DNP3 and everything I know was learnt on the job.
Context
There's a serial fibre link from a Prosoft DNP3 gateway to an SEL RTU that was working fine but suddenly developed major issues. It now works for only one or two minutes at a time, drops out for almost an hour, then works again for a few minutes before dropping out for another long period. The RTU is the DNP3 master and the Prosoft gateway is the slave (it converts to Modbus on the other end).
The DNP3 polling scheme is basically an integrity poll every 2 seconds with unsolicited responses disabled by the master RTU (I'm aware how idiotic this setup is but that's what I have to work with). The actual request is below:
Code:
Function: PRI_UNCONFIRMED_USER_DATA Dest: 1 Source: 0 Length: 20
FIR: 1 FIN: 1 SEQ: 31 LEN: 14
FIR: 1 FIN: 1 CON: 0 UNS: 0 SEQ: 8 FUNC: READ
060,002 - Class Data - Class 1 - all objects
060,003 - Class Data - Class 2 - all objects
060,004 - Class Data - Class 3 - all objects
060,001 - Class Data - Class 0 - all objects
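As a cross-check on the decoder output, here is a minimal sketch that rebuilds the application-layer bytes of this request. It assumes the standard DNP3 layout (application control octet, function code, then one group/variation/qualifier triple per object header with qualifier 0x06 for "all objects"); the helper names are my own and not from any library.

```python
# Application control octet: FIR (bit 7), FIN (bit 6), CON (bit 5),
# UNS (bit 4), SEQ (bits 0-3).
def app_control(fir, fin, con, uns, seq):
    return (fir << 7) | (fin << 6) | (con << 5) | (uns << 4) | (seq & 0x0F)

FUNC_READ = 0x01               # DNP3 function code for READ
QUALIFIER_ALL_OBJECTS = 0x06   # "all objects", no range field follows

def build_request(seq, func, objects):
    """Build an application fragment from (group, variation) pairs."""
    frag = bytearray([app_control(1, 1, 0, 0, seq), func])
    for group, variation in objects:
        frag += bytes([group, variation, QUALIFIER_ALL_OBJECTS])
    return bytes(frag)

# Same order as the decoded request: Class 1, 2, 3, then Class 0.
integrity_poll = build_request(8, FUNC_READ, [(60, 2), (60, 3), (60, 4), (60, 1)])
print(integrity_poll.hex(" "))  # c8 01 3c 02 06 3c 03 06 3c 04 06 3c 01 06
print(len(integrity_poll))      # 14
```

The 14-byte result is consistent with the LEN: 14 reported in the decode above, so the request itself looks like an ordinary, well-formed integrity poll.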
The fibre link is dedicated to the RTU-gateway comms, there are no other nodes that communicate on it. I have full access to the gateway, while I have very limited access to the remote RTU as it is a device managed by a third-party.
Questions I need help with
I captured the raw hex dump from the Prosoft gateway and ran it through a DNP3 decoder. What I've found is that the RTU is sending integrity polls far more sporadically than its configured 2 second interval. This RTU was previously polling fine with the same settings and program.
The traffic I'm seeing far more often from the RTU is the request to disable unsolicited responses from the gateway slave. The decoded request is shown below:
Code:
Function: PRI_UNCONFIRMED_USER_DATA Dest: 1 Source: 0 Length: 17
FIR: 1 FIN: 1 SEQ: 55 LEN: 11
FIR: 1 FIN: 1 CON: 0 UNS: 0 SEQ: 0 FUNC: DISABLE_UNSOLICITED
060,002 - Class Data - Class 1 - all objects
060,003 - Class Data - Class 2 - all objects
060,004 - Class Data - Class 3 - all objects
The response from the gateway is very telling:
Code:
Function: PRI_UNCONFIRMED_USER_DATA Dest: 0 Source: 1 Length: 10
FIR: 1 FIN: 1 SEQ: 56 LEN: 4
FIR: 1 FIN: 1 CON: 0 UNS: 0 SEQ: 0 FUNC: RESPONSE IIN: [0x16, 0x20]
IIN1.1 - Class 1 events
IIN1.2 - Class 2 events
IIN1.4 - Need time
IIN2.5 - Configuration corrupt
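To double-check the decoder, the two IIN octets can be decoded by hand. This is a minimal sketch assuming the standard DNP3 IIN bit assignments; the function and table names are my own.

```python
# Standard DNP3 internal indication bits (IIN2.6 and IIN2.7 are reserved).
IIN1_BITS = {
    0: "Broadcast",
    1: "Class 1 events",
    2: "Class 2 events",
    3: "Class 3 events",
    4: "Need time",
    5: "Local control",
    6: "Device trouble",
    7: "Device restart",
}
IIN2_BITS = {
    0: "Function code not supported",
    1: "Object unknown",
    2: "Parameter error",
    3: "Event buffer overflow",
    4: "Already executing",
    5: "Configuration corrupt",
}

def decode_iin(iin1, iin2):
    """Return the names of all set bits in the two IIN octets."""
    flags = []
    for octet_no, value, names in ((1, iin1, IIN1_BITS), (2, iin2, IIN2_BITS)):
        for bit in range(8):
            if value & (1 << bit):
                name = names.get(bit, "Reserved")
                flags.append(f"IIN{octet_no}.{bit} - {name}")
    return flags

for flag in decode_iin(0x16, 0x20):
    print(flag)
# IIN1.1 - Class 1 events
# IIN1.2 - Class 2 events
# IIN1.4 - Need time
# IIN2.5 - Configuration corrupt
```

This matches the decoder output: 0x16 sets bits 1, 2 and 4 of the first octet, and 0x20 sets bit 5 of the second.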
Basically, it indicates that there is some configuration corruption in the gateway, which I obviously need to look at. However, even with the corrupt configuration, the gateway is still perfectly capable of reporting its DNP3 points in response to a read request. Furthermore, it does not explain why the polling rate of the RTU master would be affected. Instead of integrity polling as it should, the RTU just keeps sending disable-unsolicited requests.
The issue isn't that the gateway is sending back malformed data (it's not); the problem is more that the RTU is not polling it as it originally did, despite no changes having been made to it. I even tested this with a DNP3 master simulator, and the simulator has no issues polling the gateway every 2 seconds - the data comes in perfectly fine.
My question is: is the RTU dynamically adjusting its polling rate based on the IIN bits it receives from the gateway? That is the only explanation I can think of for why the RTU is not polling as originally configured. I don't have much experience designing DNP3 communication schemes, so is this normal programmed behaviour?
Thanks in advance to anyone who has read my wall of text.