So I'm brushing up against the compressed codesize limit with a cart that's barely halfway towards the token limit. It sounds like this is unusual, and from what I can tell it's because I'm commenting the code fairly thoroughly.
And I have to wonder, what is the intent behind the compressed size limit? What is it still meant to accomplish that the token limit doesn't already?
The token count is a rough representation of binary or bytecode size, which was an authentic historical constraint for many systems. It's measurable from within the editor: a coder can see at a glance how many tokens they have left and how they are affecting that number as they type. While it's approximate, it's directly correlated to code complexity, and this makes it fairly intuitive to reason about. The steps for reducing token count are also intuitive: simplify code, improve code sharing, generate data rather than hardcoding it, do more with less.
The token limit defines a scope for pico-8 cartridges; it encourages creative solutions and algorithmic content generation, and it plays off the other cart limits by discouraging tactics like offloading data into code.
Compressed code size, on the other hand, is a representation only of how much entropy your source code exhibits, a limitation no developer historically suffered under. It cannot be easily measured as you make changes, and it has no direct correlation to the complexity of your program or the efficiency of your code; only to the volume of text and the amenability of that text to an arbitrary compression algorithm.
This doesn't foster creativity: all it does is cause unwelcome surprise and punish certain coding styles. Specifically it discourages comments and descriptive naming, which become barriers for anyone else trying to learn from your code. The compressed size limit doesn't even keep you honest; it's trivially easy to bypass with the aid of a minifier, at the cost of making your code unreadable and uneditable.
The compressed size limit does prevent you from packing your code full of string data, which, okay, fair enough: text storage was a huge constraint for early game developers and led to creative solutions like paragraph books and the Z-machine's bespoke text compression format. But characters in text strings seem like they ought to count toward the token limit anyway - and maybe they already do?
IMO, the compressed code size shouldn't count comments at all, and I'm pretty sure you're right in thinking that it does. I realize that carts are meant to decompress with everything intact, comments and whitespace included, so that they can be opened in the editor; that's why it all ends up in the compressed file. The problem is that counting comments and whitespace against the compressed limit discourages good commenting. Since a traditional compiler would strip them out entirely anyway, I feel the size calculation should just pretend they aren't there.
I agree with what you said about the characters in strings - they should totally still count, because they are still data used by the compiled program. But comments and whitespace (including newline chars) shouldn't. You're spot on about the limit producing unreadable, unmaintainable code. I've seen too many carts that name everything with one or two characters, use little or no comments, and barely indent, probably to avoid the character limit (though some of that could also be down to the editor's 33-column width and the urge to avoid horizontal scrolling; I've been guilty of some of these myself, primarily for that reason).
Scathe: comments and whitespace are indeed counted into the compressed size, which is AFAIK just the byte size of the zlib-compressed source code.
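If the compressed size really is just the byte count of the zlib-deflated source (as suggested above), the effect of comments is easy to demonstrate outside of pico-8. A quick sketch, using an invented Lua snippet (the exact numbers will vary, only the comparison matters):

```python
import zlib

# Two versions of the same (made-up) Lua snippet: one commented,
# one with comments and extra whitespace stripped out.
commented = b"""
-- move the player one cell in the given direction,
-- clamping to the edges of the 128x128 map
function move_player(dx, dy)
  px = mid(0, px + dx, 127)  -- mid() clamps its middle argument
  py = mid(0, py + dy, 127)
end
"""

stripped = b"""function move_player(dx,dy)
px=mid(0,px+dx,127)
py=mid(0,py+dy,127)
end
"""

for label, src in (("commented", commented), ("stripped", stripped)):
    print(label, len(src), "chars ->", len(zlib.compress(src, 9)), "compressed")
```

English comments do compress well relative to their length (they're low-entropy text), but they never compress to zero, so every comment still eats into the limit.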
From testing in 0.1.5 it appears strings of any length are treated as a single token, so the character and compressed size limits are all that are keeping you from releasing an ebook in pico8 form.
I feel like it would be truer to pico-8's design goals to count individual chars as tokens, but then the token limit would have to be raised (or another separate limit implemented) so as not to break existing carts.
That's essentially what it already does: the maximum cart size is 16kB+16kB (max code size+fixed size of sprite/map/sound data).
The issue is that "max code size" here means the size of a zipped version of your original source code, not the size of the machine code or bytecode it would - theoretically - compile down to. That's what the token count roughly represents, and is what actually constrained the developers of yore (the ones who weren't writing in BASIC, at least.*)
So this feels like the wrong metric to me. To draw an analogy, it's a bit like limiting the amount of sprite data you can fit by basing it on the size of the photoshop PSD file you drew the sprites in - not the size of the actual 16-color sprites it produces.
(* BASIC programmers operated under much the same constraints as pico-8 now: their source code was what they shipped, and every extra character was disk space they could ill afford. But I would make the point that their source code tended to be unreadably terse as a result, and one of pico-8's great theoretical strengths is that you can easily explore and learn from other people's code.)
The compressed code limit is an actual format limit of the PNG cartridge system. It's the maximum amount of data that can be stored in the pixels of the cartridge image in accordance with the format Zep came up with. It's not in any way intended to be a part of Pico-8's standard limitations (as far as I know), just a side effect of how cartridge saving works.
The reason you can't change it to a bytecode limit is because Zep maintains that all Pico-8 cartridges should be open-source. This is partly because Lua bytecode is not platform-independent and not secure -- if bytecode that has been tampered with could be loaded from Pico-8 cartridges, it could crash the Pico-8 or worse. But if it's open-source, that means the source code (including all comments) has to be exported on the cartridge.
So if you want to resolve this yourself, your best bet is probably to get Zep to un-limit the text format (.p8) if it's limited in the same way, and save to that during development, then use a tool like picotool https://www.lexaloffle.com/bbs/?tid=2691 to strip the comments out before you export it to .p8.png to post on the BBS.
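For a rough illustration of the kind of preprocessing picotool can do (this is a naive sketch, not picotool's actual algorithm: it handles `--` line comments and `--[[ ]]` block comments with regexes, but would be fooled by a `--` appearing inside a string literal, which a real tokenizing tool handles correctly):

```python
import re

def strip_lua_comments(src):
    """Naively remove Lua block and line comments.

    Warning: this does not tokenize, so a '--' occurring inside a
    string literal would be (wrongly) treated as a comment start.
    """
    # block comments: --[[ ... ]], possibly spanning lines
    src = re.sub(r"--\[\[.*?\]\]", "", src, flags=re.S)
    # line comments: -- to end of line
    src = re.sub(r"--[^\n]*", "", src)
    # collapse the blank runs left behind
    src = re.sub(r"\n\s*\n+", "\n", src)
    return src

code = """--[[ player state ]]
x = 1 -- horizontal position
y = 2
"""
print(strip_lua_comments(code))
```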
@JTE: Thanks for shedding light on that! That does make a lot more sense, and I had gotten hung up on intent rather than considering there might be practical reasons.
In 0.1.5 the compressed size limit is applied to .p8s as well as .p8.pngs: pico-8 will happily load and play an oversized cart, but refuses to save it in either format.
Previous threads all suggest that the compressed size limit ought to be a nonissue: that carts should hit the token limit before maxing out compressed size. (Perhaps that analysis dates from before pico-8 got more picky about what it treated as tokens.)
On the other hand, it would be fairly easy to store the data in separate chunks of the PNG format; that's how I do it with the Nano89 cartridges. They appear as PNG images as well, but can hold up to 1MB of data, which is the address-space limit of the internal MMU that pages the cartridge into the 64KB system space of my fantasy console.
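For anyone curious how that works: a PNG file is just an 8-byte signature followed by length/type/data/CRC chunks, and decoders are required to skip ancillary chunks they don't recognize. A minimal sketch of smuggling extra data into a private chunk (the `caRt` chunk type here is invented for the example; the lowercase first letter marks it ancillary):

```python
import struct, zlib

def make_chunk(ctype, data):
    """Build one PNG chunk: big-endian length, type, data, CRC over type+data."""
    return (struct.pack(">I", len(data)) + ctype + data
            + struct.pack(">I", zlib.crc32(ctype + data) & 0xffffffff))

def insert_chunk(png_bytes, ctype, data):
    """Insert a private ancillary chunk just before IEND."""
    iend = png_bytes.rindex(b"IEND") - 4  # back up over the length field
    return png_bytes[:iend] + make_chunk(ctype, data) + png_bytes[iend:]

def read_chunk(png_bytes, ctype):
    """Walk the chunk list and return the data of the first match, else None."""
    pos = 8  # skip the 8-byte PNG signature
    while pos < len(png_bytes):
        length, = struct.unpack(">I", png_bytes[pos:pos + 4])
        if png_bytes[pos + 4:pos + 8] == ctype:
            return png_bytes[pos + 8:pos + 8 + length]
        pos += 12 + length  # length field + type + data + crc
    return None

# build a minimal 1x1 grayscale PNG to demonstrate the round trip
sig = b"\x89PNG\r\n\x1a\n"
ihdr = make_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
idat = make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))  # filter byte + 1 pixel
png = sig + ihdr + idat + make_chunk(b"IEND", b"")

png2 = insert_chunk(png, b"caRt", b"hello cart data")
print(read_chunk(png2, b"caRt"))  # b'hello cart data'
```

The image still renders normally in any viewer; the payload just rides along in a chunk the viewer ignores.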
Since this thread touches on a topic that interested me today, "compressed size," I would like to point out that it might be helpful for coders to be able to allocate what space they want, and where.
For instance, it is highly unlikely I will ever use MUSIC or the MAPPER. I can do one channel music as a SFX and did so effectively in my Haunted House game. That space could be released to me to allow more coding space.
Alternatively, someone could say I'm going to do it all myself, and free ALL resources except for coding space.
It would not exceed the current memory and still give greater flexibility to the programmer.
On the Apple ][+ computer, you could do this. If you decided not to use HIRES graphics (280x192, 2 pages) and instead chose LORES graphics at $0400-$0800, that was an additional $3C00 bytes you could work with instead of just the basic $4000.
To sum up, PICO could add this command in code:
ALLOCATE(Code space, Sprite space, Mapper Space, SFX Space, Music Space).
tbh, counting comments is bullshizz and shouldn't happen.
But if you want a personal/dirty solution you can always just number your comments and include a text file with the source like footnotes.
I know this isn't elegant nor where anyone wants to be, but it is a solution until/if something else gets fixed.
Hmm ... unlimited coding space for REMARKS. I like that, Cabledragon, and so should other coders.
Tobiasvl, I'm definitely using a compressor and decompressor for pixel data in my paint program.
It's helping quite a bit and I've saved a lot of space instead of just all-out raw default drawing to the sprite page.
I'd vote for a new cart format where the code gets automatically split into:
a) Minified form, which is what gets executed and is bound by the current limitations.
b) "Unminification instructions" which can be used to turn the minified code into the original code without changing semantics (e.g. can only add comments & whitespace) in order to display it on the site. This would not be bound by as many limitations.
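Python's difflib happens to ship a delta format that works exactly like this two-part scheme, which makes for a quick proof of concept (the "minified" form below is just a hand-made stand-in, not what pico-8 would actually generate):

```python
import difflib

original = [
    "-- bump the score and refresh the hud\n",
    "function add_score(n)\n",
    "  score = score + n\n",
    "  draw_hud()\n",
    "end\n",
]

# stand-in "minified" form: comments and indentation gone
minified = [
    "function add_score(n)\n",
    "score = score + n\n",
    "draw_hud()\n",
    "end\n",
]

# the "unminification instructions": an ndiff delta between the two
delta = list(difflib.ndiff(minified, original))

# given only the minified code plus the delta, the original is recoverable
restored = list(difflib.restore(delta, 2))
assert restored == original
```

The delta carries no executable semantics of its own, so in principle it could live outside the compressed-size budget without raising the bytecode security concerns mentioned above.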
If we're going this route, maybe it would also be nice to allow users to have a RAM copy of the 3x5 font so it could be adjusted.
Instead of the lowercase letters just looking like smaller uppercase, they could truly be lowercase and have descenders as well.
I have a lot of extra code in my game dedicated to debugging and automated testing; it won't make it into the release, but it's useful during development. So I keep a debug build separate from the release build to include all this extra code (as is done in other engines). Except the extra code was blowing past both the token and character limits, making the debug build unusable in practice.
I was desperate to still have it run on my machine to make development easier, so I patched my copy of PICO-8 to extend the token limit. I could not find a way to extend the char limit though, so right now I'm trying ways to minimize the source code itself, without necessarily reducing the token count (mainly renaming variables with shorter names, I'm also trying luamin as suggested by the OP).
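Renaming helps with one limit but not the other because an identifier is a single token no matter how long it is. A rough sketch of that asymmetry (the regex below is only an approximation of a Lua tokenizer, not pico-8's actual token-counting rules):

```python
import re

# crude stand-in for a Lua tokenizer: identifiers, numbers, and
# single punctuation characters each count as one token
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|\d+|[^\sA-Za-z0-9_]")

def count_tokens(src):
    return len(TOKEN.findall(src))

verbose = "player_velocity_x = player_velocity_x + acceleration"
terse   = "pvx = pvx + acc"

# same token count either way, very different character counts
print(count_tokens(verbose), len(verbose))
print(count_tokens(terse), len(terse))
```

So shortening names attacks the character and compressed-size limits while leaving the token count untouched, which is exactly why it's the go-to move here.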
Patching the app is clearly not a solution to unlock code limitations for actual game releases, only a hack that allows development with more features. End-users should use a vanilla PICO-8 anyway.
I'm not sure zep would appreciate the move, even though it's meant for developers only, so I'm not releasing the patch until I get permission. Honestly, I don't think it would be useful for most projects, but since mine is open source, I still want people to be able to download the code and build the different configs, including the debug one. (There's actually a full debug config with profiling and all, plus a config for simulation tests that shows the character running around, and I'd like to show the latter for demonstration purposes.) So in the end, I'd like to provide that patch to people who want to test those configs.
Anyway, there are still many things I can do to reduce the character count, but for the token count I'm stuck as my architecture is already streamlined and there is not much to change. Maybe some hacks like inlining functions used only one time... But that should be done in a build step, not directly in the source to keep the code readable.
As I add more features, the release code itself will grow and may eventually reach the token limit as well. At that time, my hack will become useless as it won't work for end-users.