Here's a simple tunnel effect I did to test out doing pixel effects with userdata. Not super inspired, but it does run at 60fps at full resolution!



Added some more motion to the effect using strided userdata operations. My intuition was that this change should effectively be free from a performance perspective, but that hasn't worked out for a few reasons:
- I can't get userdata ops to work as I expect when `u0` and `u2` are not the same object.
- I'm not sure stride parameters allow you to "cut out" part of `u0`? It looks to me like you always have to iterate over each element of `u0`.
- Strided ops seem slow - the strided copies here seem to be ~3x the cost of non-strided copies.
So as a result this cart only runs at 20fps, unlike the original, which runs at 60fps.



Thank you for sharing this. I would also have expected this change to add only the overhead of three strided copies. It seems that, since you shared this, strided copies have broken for matrices of any type other than "u8": the first element gets broadcast throughout.
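To make clear what I mean by "expected" versus "broadcast", here is a rough reference model of a strided copy over flat row-major data. It is written in plain Python rather than Picotron Lua so that I don't have to guess at the exact userdata method signatures. The function name, parameter names and parameter order are my own assumptions, not Picotron's API, and the model only reflects how I understand the offset, length, stride and span parameters to interact.

```python
def strided_copy(src, dst, src_offset, dst_offset, span_len,
                 src_stride, dst_stride, spans):
    """Hypothetical model: copy `spans` runs of `span_len` elements.

    Each run starts `src_stride` elements after the previous one in `src`
    and `dst_stride` elements after the previous one in `dst`.
    """
    for s in range(spans):
        src_start = src_offset + s * src_stride
        dst_start = dst_offset + s * dst_stride
        for i in range(span_len):
            dst[dst_start + i] = src[src_start + i]
    return dst


def looks_broadcast(values):
    """True when every value equals the first one (the symptom described above)."""
    return len(values) > 1 and all(v == values[0] for v in values)


# A 4x4 source stored row-major, holding the values 0..15.
src = list(range(16))

# Copy the 2x2 block at row 1, column 1 into a packed 2x2 destination.
dst = strided_copy(src, [0] * 4, src_offset=5, dst_offset=0,
                   span_len=2, src_stride=4, dst_stride=2, spans=2)

print(dst)                   # [5, 6, 9, 10]
print(looks_broadcast(dst))  # False
```

With the failure I'm describing, the destination would instead hold the same value in every slot, which is what `looks_broadcast` checks for.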



This cart successfully exercises strided copies for userdata matrices of type "u8":



While this cart fails trying to do the same with userdata matrices of type "u32" (as of Picotron 0.2.0h3):