-- update: fixed a bug where I was still drawing 16x16 sprites, left over from testing.
Cart showing how to exploit the undocumented multiple display functions to set up extra sprite buffers that can be mapped back into memory for negligible cost.
Use at your own risk ;)
poke(0x5f36,1) -- enable multidisplay
- write sheet 0 of sprite graphics to screen however you want
- write sheet 1 of sprite graphics to screen
- write sheet 2 of sprite graphics to screen
- write sheet 3 of sprite graphics to screen
- swap screen and sprite addresses
poke(0x5f55,0) -- map display to where sprites are to start
poke(0x5f54,0x60) -- map sprites to where screen mem was
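Put together, the setup steps above might look something like the sketch below. This is not the code from the cart: `setup_sheets` and the `load_sheet` callback are made-up names, and calling `_map_display(i)` before filling each sheet is my assumption about how each buffer gets selected for writing.

```lua
-- rough sketch only, not the cart's code.
-- load_sheet(i) is a hypothetical callback that writes the
-- graphics for sheet i into screen memory (0x6000-0x7fff).
function setup_sheets(load_sheet)
  poke(0x5f36,1)       -- enable multidisplay
  for i=0,3 do
    _map_display(i)    -- assumption: select display i for writing
    load_sheet(i)      -- fill it with sheet i however you want
  end
  poke(0x5f55,0)       -- map display to where sprites are to start
  poke(0x5f54,0x60)    -- map sprites to where screen mem was
end
```

You would call this once from `_init()`, passing whatever function you use to get sheet data onto the screen.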
Obviously your cart doesn't get any more storage space for sprites this way so how you fill the extra buffers is a bit of a challenge.
- map in display containing the sprite you want (_map_display seems to map to 0x6000 regardless of whether you've poked the screen or sprites address to somewhere else)
- draw the sprite as normal with spr() (or sspr() or map() or tline() etc.)
In the example cart there's a simple wrapper function that takes a sprite number from 0 to 1023, maps in the buffer and chooses the sprite for spr() to use.
This is used to draw 256 sprites across the whole screen each frame. Looking at the performance with ctrl-P on my system, it costs approx 0.02-0.04 CPU per frame compared to reverting to spr() with no mapping.
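A wrapper along those lines could be as small as the sketch below. The name `spr_ex` is made up for illustration and the cart's own wrapper may well differ. It assumes the four sheets are already sitting in displays 0 to 3 and that the screen and sprite addresses have been swapped.

```lua
-- sketch of a wrapper for sprite numbers 0..1023.
-- spr_ex is a hypothetical name, not taken from the cart.
function spr_ex(n,x,y)
  _map_display(flr(n/256)) -- map in the buffer (0..3) holding sprite n
  spr(n%256,x,y)           -- draw it using its index within that sheet
end
```

So `spr_ex(700,8,16)` would map in display 2 and draw sprite 188 from it at 8,16.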
I've been storing extra sprite sheets in strings that I decode into packed tables and dump into memory with a line like
poke4(0,unpack(sprites)). This works okay, but costs a significant amount of CPU each frame. To stay at 60fps with other stuff going on I had to restrict changing sprite sheets pretty carefully, e.g. to twice a frame: background sprites, then character/object sprites. I also tried mapping in only portions of the sprite sheet as needed, but it gets fiddly and still hurts performance.
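For anyone who hasn't tried the string approach, here is a deliberately simple sketch of the decoding idea. It is not what I described above (that used packed tables and poke4): this version pokes one byte at a time, which is fine for a one-off fill but too slow to do every frame. The name `load_hex` and the string format (two hex digits per byte, in memory order) are assumptions for the example.

```lua
-- sketch: decode a hex string straight into memory.
-- load_hex and the two-digits-per-byte format are assumptions,
-- not the format used in my carts.
function load_hex(addr,s)
  for i=1,#s,2 do
    -- each pair of hex digits becomes one byte
    poke(addr,tonum("0x"..sub(s,i,i+1)))
    addr=addr+1
  end
end
```

For example, `load_hex(0x6000,sheet_hex)` would write the decoded bytes to wherever 0x6000 currently points, where `sheet_hex` is whatever string you've stored the sheet in.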
I've also played about with the multidisplay functionality (take my Christmas Chaos game on itch for instance). During dev on that I realised that the different display buffers are persistent and accessible whether PICO-8 is actually in multi display mode or not, but it didn't seem very useful at the time.
I knew we were going to get to change the address of screen and sprite memory with 0.2.4, but I haven't got round to playing with it until now.
Perhaps there are some other fun things to do with this, but I haven't thought of them yet.
I hope zep doesn't mind me posting this :)
In the meantime I tried this:
function blank(i) end
cls()
poke(0x5f36,1) -- enable multidisplay
printh("start")
start=t()
for i=0,99 do
 for j=0,9999 do
  blank(j%4)
 end
end
printh(t()-start)
start=t()
for i=0,99 do
 for j=0,9999 do
  _map_display(j%4)
 end
end
printh(t()-start)
Output on my machine is:
start
1.1667
0.9667
It's hard to see any caching method beating this when mapping in a display is faster than calling an empty function. To be fair, changing the address of a buffer should be faster than copying into that buffer, really.
That said, this way does seem a little bit of an exploit. There aren't really any apparent downsides to it either: the token cost is minimal, all the memory above 0x8000 is still available for maps etc., and map() will still work (at least with the sprites on the display mapped in when called).
I think I worked out why my comparison cart wasn't working and, unless I'm very confused, I've found a bug.
When the screen and sprite memory addresses are changed, memcpy doesn't seem to respect the change, i.e. writing to 0x6000 makes visible changes on the screen even if the screen address is poked to 0.
Here's a repro (at least on my machine):
When run (if it behaves how it does for me) then it shows a black screen followed by a red screen when X is pressed. I think, if PICO-8 is behaving as the manual says, that it should show red immediately, not black.
With that in mind I fixed my comparison cart pretty easily:
X changes method used for multiple sprite sheets.
Z swaps out the actual spr() function so that no drawing occurs to try and show only the overhead from the methods.
UP swaps between 1024 and 256 sprites, mostly to allow comparison with spr on its own, but it also shows the zero cache miss scenario for the "hicache" method.