ASF's code with the minor adjustment to the location of the JMP works perfectly and it's pretty fast. Not quite as fast as the ST but close enough. I combined it with the rest of the logic needed, made an AOI and it works great.
I just wish I could pull the description field from a Logix PLC tag from within PLC logic. That would be the cherry on top - but alas that can't yet be done.
Yes, and subtracting 1 from -32768 would cause an overflow, and any 16-bit signed integer with a value of 1 in alarm bit 15 will eventually result in subtracting 1 from -32768.
Update: Actually, these appear to be DINTs, but they still have the same problem if any bit 31 values are 1.
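To make the overflow concrete, here is a rough Python sketch (not the PLC code; the helper names are made up for illustration) that simulates the subtract-and-AND loop on a 16-bit word and lists the signed values it passes through. Because each pass clears the lowest 1 bit, the sign bit is always the last one left, so the final value is -32768.

```python
def to_signed16(word):
    """Interpret a 16-bit pattern as a signed integer (hypothetical helper)."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word

def suband_steps(value):
    """List the signed 16-bit values seen before each 'subtract 1' step."""
    n = value & 0xFFFF          # work on the raw bit pattern
    steps = []
    while n != 0:
        steps.append(to_signed16(n))
        n &= n - 1              # clear the lowest 1 bit
    return steps

steps = suband_steps(-3)        # 0xFFFD: bit 15 is set
last_value = steps[-1]          # the value the PLC would subtract 1 from last
```

On a PLC with signed math, the subtract on that last value is the one that trips the overflow; the same reasoning applies to bit 31 of a DINT.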
Is the juice here worth the squeeze?
Like you, I like to get the code as tight as I can, if for no other reason than my own OCD, and I think that's true here.
Having said that... now that you pointed it out... Damn it!
Here are the results for a MicroLogix 1100; I don't know if they will be similar in a Logix/Studio 5000 environment.
The timing used the S:4/S:35 10kHz timer for 1000 iterations of four 32-bit signed integers (MicroLogix Longs). Each of the four Longs was filled with four consecutive bytes of the same value (0x00_00_00_00, 0x01_01_01_01, ..., 0xFF_FF_FF_FF).
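For anyone wanting to reproduce the test data off the PLC, the fill patterns can be sketched in Python like this (the function names are mine, not from the MicroLogix program):

```python
def fill_pattern(byte_value):
    """Repeat one byte four times to form a 32-bit pattern."""
    return (byte_value & 0xFF) * 0x01010101

def to_signed32(word):
    """Interpret a 32-bit pattern as a signed Long (hypothetical helper)."""
    word &= 0xFFFFFFFF
    return word - 0x100000000 if word & 0x80000000 else word

# One pattern per byte value 0x00 .. 0xFF, as signed 32-bit values
patterns = [to_signed32(fill_pattern(b)) for b in range(256)]
```

Every pattern built from a byte of 0x80 or above comes out negative as a Long, so half of the test values exercise the sign-bit case.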
The following 1-valued bit counting algorithms were coded and measured:
CONTRL - no bit counting (control group; measures mean scan time)
SUBAND - Kernighan's way; SUBtract 1 and AND the result with the original value to clear the right-most 1 bit, counting one per pass; what @waterboy used; cf. here.
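As a reference for what SUBAND computes, here is a minimal Python sketch of Kernighan's method, assuming a 32-bit word. It is not the ladder or ST code that was timed; `MASK32` and `count_ones_suband` are names I made up. Masking to an unsigned pattern first sidesteps the signed-overflow problem a PLC would hit.

```python
MASK32 = 0xFFFFFFFF

def count_ones_suband(value):
    """Count 1 bits in a 32-bit word by repeatedly clearing the lowest one."""
    n = value & MASK32      # treat the Long as a raw 32-bit pattern
    count = 0
    while n != 0:           # same role as the NEQ-against-0 test
        n &= n - 1          # SUBtract 1, then AND: drops the right-most 1 bit
        count += 1
    return count
```

The loop runs once per 1 bit, so a word of all zeros costs nothing and a word of all ones costs 32 passes.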
N := 0;
Y[N] := N * N; N := N + 1;
Y[N] := N * N; N := N + 1;
Y[N] := N * N; N := N + 1;
Optimizing compilers can unroll FOR loops when the number of iterations is known, and even sometimes when the number of iterations is not known. It provides a performance enhancement because the loop index comparison and its implicit GOTO (JMP equivalent) are eliminated.
In our case, the NEQ element 0 test was the loop comparison and is still needed, so only some JMPs were saved, but it does save a few µs per Long.
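The squares fragment above can be put side by side with its rolled form. This Python sketch only illustrates the structure (Python itself gains nothing from it), and the function names are hypothetical:

```python
def squares_rolled(count):
    """Loop form: one index comparison and one jump back per iteration."""
    y = [0] * count
    for n in range(count):
        y[n] = n * n
    return y

def squares_unrolled_3():
    """Unrolled form for exactly three iterations: no comparison, no jump."""
    y = [0] * 3
    n = 0
    y[n] = n * n; n += 1
    y[n] = n * n; n += 1
    y[n] = n * n; n += 1
    return y
```

Both produce the same array; the unrolled version trades code size for the per-iteration compare and jump.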