Peter Nachtwey
Member
PLCDontUQuitOnMe said: What I found surprised me. The new, elegant, two-rung solution was 2 1/2 times slower than the old, brute-force, 17-rung method. Why is this?
My scan time is 200 us with the old code and 500 us with the new (your implementation).
Do the CLR, ADD, LES, and JMP instructions take THAT much more time to execute than simple XIC, OTE, and a single MOV?
I know it doesn't seem right, but I bet the RSLogix programmers have the XIC, XIO, and OTE instructions optimized so that the bit numbers and masks can be hard-coded. This can all be done at compile time in RSLogix. It is hard to make a general-case indirect XIC with a variable bit number nearly as efficient, because the bit lookup must be done at run time. Also, for each bit you are doing an extra compare and jump that isn't required with the simple 16x XIC/OTE method. Personally, I would use the XIC/OTE method for only 16 bits, or one of the methods in the bit twiddling hacks link above.
A modern C or C++ compiler will often unroll loops and put the code inline to avoid the compare and jumps.
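To make the loop overhead concrete, here is a rough C sketch of the two approaches for a 16-bit word. The function names are made up for illustration, and this is only an analogy to the ladder code, not what RSLogix actually generates. The loop version computes its masks at run time and pays a compare and jump per bit. The unrolled version has every mask fixed at compile time, much like 16 XIC/OTE rungs.

```c
#include <stdint.h>

/* Loop form: the bit index is a run-time variable, so each pass
   builds two masks and ends with a compare and a jump. */
uint16_t reverse16_loop(uint16_t in)
{
    uint16_t out = 0;
    for (int i = 0; i < 16; i++) {
        if (in & (1u << i)) {
            out |= (uint16_t)(1u << (15 - i));
        }
    }
    return out;
}

/* Unrolled form: every source and destination mask is a constant,
   so there is no loop counter, compare, or backward jump. */
uint16_t reverse16_unrolled(uint16_t in)
{
    uint16_t out = 0;
    out |= (in & 0x0001u) ? 0x8000u : 0u;
    out |= (in & 0x0002u) ? 0x4000u : 0u;
    out |= (in & 0x0004u) ? 0x2000u : 0u;
    out |= (in & 0x0008u) ? 0x1000u : 0u;
    out |= (in & 0x0010u) ? 0x0800u : 0u;
    out |= (in & 0x0020u) ? 0x0400u : 0u;
    out |= (in & 0x0040u) ? 0x0200u : 0u;
    out |= (in & 0x0080u) ? 0x0100u : 0u;
    out |= (in & 0x0100u) ? 0x0080u : 0u;
    out |= (in & 0x0200u) ? 0x0040u : 0u;
    out |= (in & 0x0400u) ? 0x0020u : 0u;
    out |= (in & 0x0800u) ? 0x0010u : 0u;
    out |= (in & 0x1000u) ? 0x0008u : 0u;
    out |= (in & 0x2000u) ? 0x0004u : 0u;
    out |= (in & 0x4000u) ? 0x0002u : 0u;
    out |= (in & 0x8000u) ? 0x0001u : 0u;
    return out;
}
```

Both functions return the same result for any input. Whether a given C compiler actually unrolls the first form into something like the second depends on the compiler and its optimization settings.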
Note, I would be embarrassed if it took an extra 300 microseconds on our product, but to be honest we don't allow small loops that could possibly cause the PLC or motion controller to get stuck in an infinite loop and fault. I like these questions because I implement these problems on our product just to see how we measure up. I would recommend the following because it can be done in-line and is very deterministic: the same code is executed regardless of the bit pattern:
Code:
Bits:=(SHR(Bits,1) AND 16#55555555) OR SHL(Bits AND 16#55555555,1);
Bits:=(SHR(Bits,2) AND 16#33333333) OR SHL(Bits AND 16#33333333,2);
Bits:=(SHR(Bits,4) AND 16#0F0F0F0F) OR SHL(Bits AND 16#0F0F0F0F,4);
Bits:=(SHR(Bits,8) AND 16#00FF00FF) OR SHL(Bits AND 16#00FF00FF,8);
Bits:=SHR(Bits,16) OR SHL(Bits,16);
See http://forum.deltamotion.com/viewtopic.php?f=12&t=30. You can see that execution is fast even though our compiler still isn't optimized. Too often Bits is stored at the end of one expression and then reloaded at the beginning of the next. The compiler does no optimization across expressions.
This takes 12 microseconds on our product, and it is swapping 32 bits. The alternative is swapping the 32 bits one bit at a time into another word:
Code:
BitsA.31:=BitsB.0;
BitsA.30:=BitsB.1;
...
BitsB:=BitsA; // copy all 32 bits back to the original DWORD
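For anyone who wants to check the mask-and-shift reversal off the controller, here is a minimal C sketch of the same five steps, next to a plain bit-at-a-time version to compare against. The function names are my own for illustration, and the timings quoted above apply to the controller code, not to this C version.

```c
#include <stdint.h>

/* Mask-and-shift reversal: swap adjacent bits, then pairs, nibbles,
   bytes, and finally the two 16-bit halves. Always five steps. */
uint32_t reverse32_masks(uint32_t b)
{
    b = ((b >> 1) & 0x55555555u) | ((b & 0x55555555u) << 1);
    b = ((b >> 2) & 0x33333333u) | ((b & 0x33333333u) << 2);
    b = ((b >> 4) & 0x0F0F0F0Fu) | ((b & 0x0F0F0F0Fu) << 4);
    b = ((b >> 8) & 0x00FF00FFu) | ((b & 0x00FF00FFu) << 8);
    b = (b >> 16) | (b << 16);
    return b;
}

/* Bit-at-a-time reference: copy bit i of the source into
   bit (31 - i) of the result. */
uint32_t reverse32_bitwise(uint32_t b)
{
    uint32_t a = 0;
    for (int i = 0; i < 32; i++) {
        a |= ((b >> i) & 1u) << (31 - i);
    }
    return a;
}
```

Both functions should agree for every input. For example, 0x00000001 reverses to 0x80000000 and 0x12345678 reverses to 0x1E6A2C48.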
I wonder how an S7 would do using hand-optimized STL.