Hello, first post here.

I just got started with Pico-8 recently, and have learned about the usefulness of storing data in strings to get around token limits. I was wondering, can I build a data string using a p8 program, then output that string as text that can be fed into another p8 program without it being reformatted? I'm working on a graphical level-editor program that will output level data strings usable by another cart.

I tried using printh(), but when I opened the text file it mangled up the extended character glyphs and didn't preserve the 4x3 font lower-case letters I had in the string. My goal is to have all 121 characters available so I can store about 7 bits per character, instead of the 4 possible with hexadecimal, if this is possible.

Thanks.

P#73959 2020-03-15 22:20

It's wonky, but maybe you could use the clipboard? printh(string, "@clip") and then paste into the p8 file? We were able to copy at least the printable characters, paste them into a file in Notepad++, and get them back intact when the program ran.

P#73964 2020-03-16 01:04

There is a known bug with printh and Unicode (glyphs).

P#73974 2020-03-16 06:43

You should probably not think in terms of bits per character, but of bits per byte in the cartridge. Not all characters are equal when stored in a cartridge: only 59 of them (0123456789abcdefghijklmnopqrstuvwxyz!#%(){}[]<>+=/*:;.,~_ plus space and \n) will be stored as one byte, all others need two bytes for storage.

This means that using 121 different characters will only encode 5.1462 bits per byte, whereas using those 59 “good” characters will give you 5.8826 bits per byte. So if your decoder’s size can be negligible, I would strongly suggest using those 59 characters. It gives you the best encoding ratio and gets rid of your Unicode output issue.

Note that there will actually be 245 characters available (0x10-0xff) when 0.1.12 is out, allowing you to encode 7.9366 bits of information per character, but only a mere 4.9239 bits per byte.

A final note: writing a base-59 bigint encoder/decoder will probably be very tricky. You could instead split your data into 47-bit chunks and encode them as 8-digit base-59 numbers (those can hold 47.0611 bits of information). In terms of storage, this technique will give you exactly 5.875 bits per byte (instead of the theoretical maximum of 5.8826) with a much simpler encoder/decoder.
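
As a rough illustration of the chunking idea above, here is a minimal sketch of an offline encoder/decoder written in Python rather than Pico-8 Lua. The names `ALPHABET`, `encode_chunk` and `decode_chunk` are made up for this example, and the digit order (most significant first) is an arbitrary choice. A decoder running inside the cart would also have to cope with Pico-8's 16.16 fixed-point numbers, which this sketch does not attempt.

```python
# The 59 characters listed above as costing one byte each in a cartridge.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz!#%(){}[]<>+=/*:;.,~_ \n"
BASE = len(ALPHABET)   # 59
CHUNK_BITS = 47        # bits of data per chunk
DIGITS = 8             # base-59 digits per chunk; 59**8 is just above 2**47


def encode_chunk(value):
    """Encode an integer below 2**47 as 8 base-59 digits, most significant first."""
    if not 0 <= value < (1 << CHUNK_BITS):
        raise ValueError("chunk must fit in 47 bits")
    digits = []
    for _ in range(DIGITS):
        value, digit = divmod(value, BASE)
        digits.append(ALPHABET[digit])
    return "".join(reversed(digits))


def decode_chunk(text):
    """Inverse of encode_chunk: turn 8 base-59 digits back into an integer."""
    if len(text) != DIGITS:
        raise ValueError("expected exactly 8 characters")
    value = 0
    for ch in text:
        value = value * BASE + ALPHABET.index(ch)
    return value


if __name__ == "__main__":
    sample = 0x1234_5678_9ABC  # hypothetical 47-bit payload
    packed = encode_chunk(sample)
    print(repr(packed), decode_chunk(packed) == sample)
```

Every 47 bits of data become 8 one-byte characters, which is where the 47 / 8 = 5.875 bits per byte figure comes from.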

P#73978 2020-03-16 09:52 ( Edited 2020-03-16 11:32)
:: Felice

@samhocevar

I think you'd be better off sticking with base64 and just taking the hit for the occasional use of the 5 characters that need a second byte. Otherwise you stand a good chance of obscuring patterns in your data that the source code compressor could condense well.

Mostly I'd just suggest trying methods and going with whatever actually performs best.

P#73983 2020-03-16 13:57

@samhocevar,

Thanks for the detailed rundown. I've been a bit confused about some issues and that makes things a lot clearer. I'm thinking I'll go with either 60 characters or 64, as these will give me nearly 99% and 95% of maximum density respectively, and 59 is just a very odd number to work with. I actually don't need to do any kind of encoding so to speak in my program, just to read string symbols as decimal values and plug these into standard functions. I will say it's a shame that we'll be stuck under 6 bits per byte even with the expanded character set, but oh well...
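
For anyone wanting to check those percentages, here is a small Python sketch of the arithmetic. It assumes every character in the alphabet is used equally often, that exactly 59 characters cost one byte and all others cost two (as described earlier in the thread), and it ignores whatever the cartridge compressor does on top. The function name `bits_per_byte` is made up for this example.

```python
import math

ONE_BYTE_CHARS = 59  # count of one-byte characters, taken from the post above


def bits_per_byte(alphabet_size):
    """Average information per stored byte, assuming uniform character use."""
    one_byte = min(alphabet_size, ONE_BYTE_CHARS)
    two_byte = alphabet_size - one_byte
    avg_bytes_per_char = (one_byte + 2 * two_byte) / alphabet_size
    return math.log2(alphabet_size) / avg_bytes_per_char


if __name__ == "__main__":
    best = bits_per_byte(59)
    for size in (59, 60, 64):
        density = bits_per_byte(size)
        print(f"base {size}: {density:.4f} bits/byte, "
              f"{100 * density / best:.1f}% of base 59")
```

Under these assumptions base 60 comes out a little under 99% of the base-59 density and base 64 a little under 95%.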

@Felice,

Yeah, I'll have to try things out. If base64 seems to work or compress better than base60, I'll just go with the former. I have noticed that different character sets compress differently, with Pico-8 able to store about 20,000 random hex characters, as opposed to about 16,000 random chars from an extended set I tried out, and I'll do more experiments like that. Oh, btw, thanks for your contributions in the forum, I've been using some bits of your code for converting values to and from hex. =)

P#73984 2020-03-16 14:18

Oh, one more small question: I tried working with the 59-character set a bit, but the line-return symbol "\n" doesn't seem to format correctly into a string that I can print out, and just (as the name would suggest) goes to the next line. Is there something I'm missing? Otherwise I think I'll substitute another character for it.

P#73986 2020-03-16 15:13
