Adapting a solution found ca. page 180 of this white paper:
https://support.amd.com/TechDocs/25112.PDF
I finagled it into this PICO-8-specific form, which counts every bit (including fractional bits) in constant and pretty-brief time:
function popcount(v) v-=band(v*.5,0x5555.5555) v=band(v,0x3333.3333)+band(v*.25,0x3333.3333) return band((band(v+v*.0625,0x0f0f.0f0f)*0x0101.0101)*256,255) end |
It's a bit uglier than it would be in C/C++, but it's mostly the same operations, with a minor adaptation for the 16-bit shifts inherent in 16.16 fixed-point multiplies.
Just leaving it here in case anyone ever needs it.
PS: Apparently it's commonly called "popcount" or "popcnt", after an SSE instruction that does it. Basically, it's the population-of-1's count. Had no idea it had a name.


Just in case you haven't seen it, I strongly recommend this page if you find yourself dealing with this sort of algorithm with any frequency: https://graphics.stanford.edu/~seander/bithacks.html
That page presents your algorithm as
v = v - ((v >> 1) & 0x55555555); // reuse input as temporary v = (v & 0x33333333) + ((v >> 2) & 0x33333333); // temp c = ((v + (v >> 4) & 0xF0F0F0F) * 0x1010101) >> 24; // count |
[Please log in to post a comment]