An implementation of Conway's Game of Life cellular automata.
Game of Life wiki: http://www.conwaylife.com/wiki/Main_Page
v0.1: Initial version. GoL simulation uses the screen itself for storing cell data, using the two-line buffer method, with null edges.
v0.2: Pattern library. Audible feedback. Small bug fixes and performance improvements.
v0.3: Performance improvements. Pattern placer now lets you stamp multiple copies until you cancel. Simulation displays a generation count. Small visual changes.
Neat! I was wondering if someone was gonna demo up Life; seems like an important piece of the canon.
Using the screen for storage and calculations is sort of slick as a way of keeping things minimal but it definitely gives itself away in the visible rasterization of the screen updates -- I'd definitely recommend writing the new state to elsewhere in memory (just chucking it to an array would be simple) and then using that to draw to the screen all in one frame to get rid of the draw artifacts.
And I'd imagine some actual wizards will have more deep suggestions, but as one simple optimization for frame rendering speed you could with each board update track the leftmost, topmost, rightmost, and bottommost cells that are alive, and use that to create a bounding rectangle with an extra cell or two of buffer that you can restrict your calculations to. You'll get bigger gains from smaller play areas that way, so not so useful if your goal is a steady frame rate, but it'll let you whip through small initial configurations a lot faster.
The current implementation minimizes memory use but for no good reason. I'm not using anything in the sprite, map, or user-defined memory regions, so I have plenty of space for two screen buffers written directly in the two-pixels-per-byte screen graphics format. Writing to two buffers and memcpy'ing them onto the screen might be faster than populating line buffer arrays and pget'ing and pset'ing one cell at a time. I'll give it a try for comparison.
v0.1 also uses a 0/1 internal representation of a cell that simplifies neighbor counting, but also requires a lot of conversion to pixel colors. Assuming that empty_color is black (0) would allow for just using multiples of cell_color (7) for everything.
I wrapped some repeated logic in function calls. I don't know if rolling those out would save execution time. I'd only do that as a last resort. I want this to be a clean code sample.
I'm not aware of a better buffering technique than the two-line buffer. There are alternate data structures, but they're mostly for saving memory or doing long-term calculations (Hashlife), not iterative simulation speed.
I'm doing some speed tests of the various methods. I can post code later if people are interested, but I thought this was slightly interesting:
1) Two full-board buffers stored in Lua arrays as 0's and 1's and copied to the screen with pset(): 50 generations in 31 seconds
2) Direct-to-screen using a two-line buffer: 50 generations in 29 seconds
3) Two full-board buffers written in screen data format at 0x0000-0x1fff and 0x2000-0x3fff, using peek and poke to access the buffers and copied to the screen with memcpy: 50 generations in 57 seconds
Method #3 requires rewriting pset and pget in Lua so that I can set pixels in a non-screen portion of memory. I haven't profiled these yet but my hunch is that this is a likely cause of slowness. I bet if I double the width of the cells and just peek and poke 0x00 (two blacks) and 0x77 (two whites) I could do better.
My measurement for method #2 can probably be improved by cleaning up my code to more resemble my experimental set-up in #1 and #3, and doing tiny things to avoid unnecessary conversions from color values (7) to 0/1 cell values.
I'll try these improvements when I have time.
Method #3 with double-width cells for simpler reads and writes: 50 generations in 18.5 seconds. That's only a partial improvement considering that my board is now half the width (equivalent to 37 seconds at full width), so I'm not going to call that a win over using pset()/pget() to copy from a buffer in an array. My custom pset()/pget() may have been slow, though.
I did notice that I had wrapped my poke() in a function for calculating the address from coords. Inlining this poke saved 2 seconds (with the half-width board). So, as usual, function calls are expensive.
I may or may not try to set up a comparable experiment of method #2 with a half-width board for comparison. It might be the winner.
The UI needs a lot of work. I want a [random] button that fills the screen with an interesting % of live cells, and a [shape] brush that creates gliders, toggles, etc when you press z, instead of a single dot. You could even make puzzle levels where the goal is to kill everything with a limited number of gliders, by causing side effects from them, but that's probably a bigger idea...
oh, and you might want to look into pal(). you can remap 1 to be white, so there's less conversion cost.
It's getting much better! Few things I would change:
- change  button text to [draw], since it's possible to edit with patterns too.
- make the current pixel toggle if z is down (btn) and any direction was just pressed (btnp), so dragging works. btnp periodically fires so z has to use btn instead.
- change the flashing highlight colors when selecting pattern vs. placing pattern.
- make it so you can continue to place the selected pattern until you cancel with x.
Is there any way to evaluate the next step using convolution shader techniques? i.e. from the wiki, the egocentric method where sum of nine = 3 is birth (solid white), = 4 is preserve (transparent), else death (solid black). You might be able to map all pal colors 0-9 to the correct color and transparency (palt), then just draw a spritesheet over the screen with a single call. And since pget returns the remapped values, the whole map will only be 0's and 7's for the next iteration. (oh, and if you don't want this, try pal(1,7,1), which only remaps on display, instead of changing the drawn data.
innomin: To paraphrase, correct me if I'm wrong: the iteration loop could count all pixels in a block of nine, set the corresponding middle pixel in a board-size sheet of sprite data to one of three values based on that count (3=white 4=transparent else=black), then map-draw the sprite data on top of the board. That'd be more fruitful if there were a faster way to read a rectangle of pixels than accessing them individually. I'm not sure it simplifies the conditional expression enough to conflate two states using the transparency trick. The sprite draw otherwise sounds similar to the two-board memcpy method and seems unlikely to outperform it, even if transparency is handled in an optimized way. Cool thinking, though.
I like your other UI ideas. I may steal them.
Just had an unrelated idea for an optimization. brb. :)
v0.3 is now posted. It can now do 50 generations in 11 seconds.
Function calls are slow, so I burned some tokens to handle the edge cases separately from the main cell loop. Because the main loop knows it is not handling the edges, it doesn't need to range check, so I can inline linebuf and pget accesses and drop the helper functions. That makes a big difference.
I took several of innomin's suggestions. The pattern placer lets you stamp multiple copies until you cancel, so you can paste in a fleet of spaceships. Place mode uses a different blink color. "edit" is now "draw".
Also, the simulation now displays a generation count. Enjoy!
Very nice speedup gains! Also, I like how the spaceships become gliders when they hit the edge. This is nearly shippable. =)
You paraphrased correctly, I was thinking by eliminating all conditional checks until draw you could really speed up the per-pixel function. Just adding all the local values and remapping 0-9 to the correct palette of 0,1. I may open your code and have an attempt at it, if that's okay with you.
Heh, I may have started early, but it's good to know that you approve! Here's my palette recast version, runs at 175/100 your 0.3 version.
I did it by only peeking each address once, saving them in local vars, and bitshifting to do two pixels at the same time. And there were a lot of little pal() calls everywhere to make sure things still worked. Oh, and I had to move the patterns into map memory while running because my algorithm used the entire sprite buffer to run! Still could do with a lot of optimization, but I'm happy with the speed boost I got out of it. =)
EDIT: And I just now realized that the text is color 1 and incorrectly being remapped to white. Oh, and btnp for exiting run is spotty, should have used btn...
thanks! and sorry about the icon, for some reason the thread is using that gray square now...
I'm a little confused about what you meant when you said GoL golf. I thought that meant we'd be taking turns, but now it seems more likely that would be called GoL tennis or pong or something.
I want to do an optimized version of my algorithm this weekend, but then I think that's the most I can do to help. And I'll fix those recoloring bugs at the same time. Oh, and you're welcome to use any part of my versions in your final, if that wasn't clear. =)
I'll be happy to steal your stuff, yes. :) And thanks!
I'll let you take another crack at it, then I'll fold it in. I have some more polish planned but not much else. I like all of the ideas of adding game-like elements such as challenges or puzzles (draw a 6x6 pattern with exactly 15 cells that looks like This on generation 6...), so I'll give that some thought. Could also be a two-player real time action strategy game: wipe out the other player's cells by spawning up to 100 cells behind the line... Dunno, the life rules might be too fragile. Probably prior art in this area, too.
I also have an idea for a separate game where you play a cell trying to survive as many generations as you can through various means (movement, power-ups, etc.). No idea if that'd be fun but might be worth a prototype.
I was thinking I might make my final optimized version a toroid shaped one (so top and bottom wrap, etc). Let me know if you want that!
And yeah, the puzzles will be tricky to make fun. A lot of patterns leave garbage, so I'm not sure many puzzles will end in nothing... but I like the idea of matching patterns (though my algorithm ruins the entire spriteset range of memory, so I was storing the patterns in map 0x2000 during run).
I was also thinking maybe you can only draw in a certain area of the screen, and you have to get life to reach an edge of the screen, or a target, and even stay there for x cycles. Then unlock new patterns by beating these levels that you can use in later levels. But if I were you I'd keep the life simulation in a separate cart, and call this Puzzle Life or something.
... and every time I look at my sprbuffer life I keep thinking I should call one of these carts Thug Life. XD
Here we go! Apologies for the monthlong hiatus - I've been working on my own cart, but I had another idea for a major optimization here. This new version runs at twice the speed of my last cart, a full 17 frames per second!
It works by using the full 32 bits in PICO-8 lua variables, instead of 8. Some fancy shifting and adding means I do 1/4 the math, but the extra overhead means it's only double speed. Still quite a good boost, though! =)
And I decided against doing toroid edges because (a) it's slower and (b) the almost-square area means gosper guns and the like would kill themselves immediately, anyway. I'm calling this done now, if you want to fold it into your version up top, and I fixed the two little issues from last time, too.
Edit: apparently there are always bugs. The [F7] screenshot doesn't take screen palettes into account, so all the colors are off. Oh well!
[Please log in to post a comment]