Compression seems to be a thing I've been doing recently. Continuing that tradition, I am releasing a (only partially done) e-book reader for the PICO-8, with the first two chapters of Moby Dick (plus part of the third)!
Some things to note:
Text is stored in the graphics, map, and sprite property data, as Huffman-coded bytes.
The text is divided into sections that encode up to 6351 characters each (the length of user RAM minus one screen's worth of characters). These sections are byte-aligned.
With Huffman coding, each character of the source text is replaced with a variable-length bit sequence (one that is guaranteed never to be a prefix of any other), with more frequent characters getting shorter bit sequences.
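The cart itself is written in PICO-8 Lua; as a rough illustration of the idea (not the cart's actual code), here's a minimal Python sketch that builds a prefix-free Huffman code table from character frequencies and decodes a bit string with it:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Build a prefix-free code table; frequent characters get shorter codes."""
    # Heap entries are (frequency, tiebreak, tree); tree is a char or (left, right).
    heap = [(f, i, ch) for i, (ch, f) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)
        f2, _, b = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, n, (a, b)))  # merge two rarest subtrees
        n += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"  # degenerate one-symbol alphabet
    walk(heap[0][2], "")
    return codes

def huffman_decode(bits, codes):
    """Walk the bit string, emitting a character whenever a full code matches.

    Because no code is a prefix of another, the first match is always right."""
    inverse = {v: k for k, v in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    return "".join(out)

text = "call me ishmael"
codes = huffman_codes(text)
encoded = "".join(codes[ch] for ch in text)
assert huffman_decode(encoded, codes) == text
```

The same prefix-free property is what lets the cart decode a packed bit stream one bit at a time without any separators between characters.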
I tried to design this in a general way, such that anyone who knows how (i.e. me) can change the aesthetics, or the parameters for decoding the book, by changing a few values in the first few lines of the code.
Currently, the encoded data is only stored in the first 12544 bytes of RAM. In theory, filling the music and SFX with compressed data could add over 8000 characters to the book. I will leave this, however, as an exercise for another day.
If I can find a book that fits within the ~22000 characters that this e-book reader can hold, I will release an official PICO-8 cartridge of that book (and possibly add some new features, like page-by-page scrolling and bookmark-saving!). Please alert me to any books that might fit this limit!
Thanks to @Scylus on the PICO-8 Discord for providing me with a way to generate and interpret the dictionaries.
Nice! I did something similar with Tale of Two Cities a while back: https://www.lexaloffle.com/bbs/?tid=2776
I had slightly different goals. I was trying to compress a large collection of relatively small strings such that you can index into the compressed data and use a common dictionary to decompress it, such as for an adventure game. With my tool you can tag inline string literals with special syntax, and a preprocessor extracts them, replaces them with function calls, and stores the compressed text in the image data. I ended up testing it with an ebook-like thing similar to yours, with similar performance. Example unprocessed source: https://github.com/dansanderson/p8advent/blob/master/tests/testdata/pager.lua
It was especially interesting to see how you can balance dictionary size against the compression performance. For a game with other stuff in it I'd need to cut down on dictionary size to leave Lua RAM headroom for actual game logic.
@dddaaannn Your demo of "A Tale of Two Cities" was actually the inspiration behind this little demo. I wanted to try the same idea of compressing text somewhere, using a method that I'm at least a little familiar with (Huffman coding) and which would require minimal resources to actually decode. My end goal was more like an e-book reader than a general text compression engine, though this method could certainly also be used generally.
I ended up bundling a modified version of this engine with a full short story; it compressed a 21747-character story to 11968 bytes - 55% of its original size. Not as compressed as your example, but hey, it accomplished my goal.
I read the whole Poe story in your reader last night!
Going by word count I think yours is just as performant as mine, but maybe I didn't look too closely. I didn't do any special tricks, I was just copying the LZW algorithm out of Wikipedia. Kind of surprised I got it to work at all. :)
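For reference, the textbook LZW algorithm mentioned here can be sketched in Python (this follows the standard Wikipedia presentation, not dddaaannn's actual cart code): the compressor emits dictionary indices for the longest already-seen prefix, and both sides grow the dictionary in lockstep.

```python
def lzw_compress(data):
    """Textbook LZW: emit an index for the longest known prefix, then
    add that prefix plus the next character to the dictionary."""
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    w, out = "", []
    for ch in data:
        wc = w + ch
        if wc in dictionary:
            w = wc
        else:
            out.append(dictionary[w])
            dictionary[wc] = next_code
            next_code += 1
            w = ch
    if w:
        out.append(dictionary[w])
    return out

def lzw_decompress(codes):
    """Rebuild the dictionary on the fly; no dictionary is stored."""
    dictionary = {i: chr(i) for i in range(256)}
    next_code = 256
    w = dictionary[codes[0]]
    out = [w]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            entry = w + w[0]  # the classic "code not yet defined" case
        out.append(entry)
        dictionary[next_code] = w + entry[0]
        next_code += 1
        w = entry
    return "".join(out)

text = "TOBEORNOTTOBEORTOBEORNOT"
packed = lzw_compress(text)
assert lzw_decompress(packed) == text
```

Unlike Huffman coding, LZW needs no stored dictionary at all, which is part of why it's attractive when every byte of cart space counts.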