(recommended: use with PICO-8 0.2.1b or later)
This function converts binary strings into a format that can be pasted into source code. Binary strings can contain any character from chr(0) to chr(255), including unprintable / unstorable ones. escape_binary_str() adds the needed escape codes and stores the remaining characters as-is. For example, character 10 becomes \n and character 0 becomes \0, or \000 when followed by a digit (to avoid ambiguity: \0 followed by "1" would otherwise be read as \01).
This is useful for storing dense binary data efficiently (e.g. data compressed with PX9). If you are storing structured data in code (like a raw image), it will likely be easier and almost as efficient to store it as a string of hexadecimal characters.
function escape_binary_str(s)
  local out=""
  for i=1,#s do
    local c  = sub(s,i,i)
    local nc = ord(s,i+1)
    -- pad nul to \000 when the next character is a digit (48..57)
    local pr = (nc and nc>=48 and nc<=57) and "00" or ""
    local v=c
    if(c=="\"") v="\\\""
    if(c=="\\") v="\\\\"
    if(ord(c)==0) v="\\"..pr.."0"
    if(ord(c)==10) v="\\n"
    if(ord(c)==13) v="\\r"
    out..= v
  end
  return out
end
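For comparison, here is a minimal sketch of the hexadecimal alternative mentioned above. It is not part of the original post: hex_to_bytes is a hypothetical helper name, and it assumes the data is stored as two hex digits per byte.

```lua
-- hypothetical helper: decode pairs of hex digits into a table of byte values
function hex_to_bytes(hex)
  local bytes={}
  for i=1,#hex,2 do
    -- tonum() parses a "0x.." string as a hexadecimal number
    add(bytes,tonum("0x"..sub(hex,i,i+1)))
  end
  return bytes
end

-- sample data: four bytes stored as eight hex characters
b=hex_to_bytes("00ff0a30")
```

Each byte costs two characters this way, but the string stays plain ASCII and needs no escaping.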
Workflow
Step 1. Generate a Binary String
binstr=""
for i=1,256 do
  binstr..=chr(i%256) -- any data you like
end
?#binstr -- 256
?ord(binstr,256) -- 0
?ord(binstr, 13) -- 13
Step 2. Escape the String and Copy to Clipboard
printh(escape_binary_str(binstr), "@clip")
Step 3. Paste into Source Code
* Turn on Puny Mode (CTRL-P) to make sure uppercase characters are encoded as punyfont.
* Press CTRL-V to paste into source code as a string value: bindat="[paste here]". You should get something like this:
bindat="¹²³⁴⁵⁶⁷⁸ \nᵇᶜ\rᵉᶠ▮■□⁙⁘‖◀▶「」¥•、。゛゜ !\"#$%&'()*+,-./0123456789:;<=>?@𝘢𝘣𝘤𝘥𝘦𝘧𝘨𝘩𝘪𝘫𝘬𝘭𝘮𝘯𝘰𝘱𝘲𝘳𝘴𝘵𝘶𝘷𝘸𝘹𝘺𝘻[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~○█▒🐱⬇️░✽●♥☉웃⌂⬅️😐♪🅾️◆…➡️★⧗⬆️ˇ∧❎▤▥あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをんっゃゅょアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワヲンッャュョ◜◝\0"
Step 4. Enjoy your Binary Data
The contents of bindat can now be accessed with ord(bindat, index) (note that index is 1-based).
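As a rough sketch of what reading the data back could look like, the loop below copies a string into memory one byte at a time. str_to_mem is a hypothetical helper, the three-byte bindat is stand-in sample data, and 0x4300 (the start of the user data area) is just one possible target address.

```lua
-- stand-in sample data; in practice this is the pasted string
bindat="\1\2\255"

-- hypothetical helper: copy a binary string into memory starting at addr
function str_to_mem(s,addr)
  for i=1,#s do
    -- ord() is 1-based, memory offsets are 0-based
    poke(addr+i-1,ord(s,i))
  end
end

str_to_mem(bindat,0x4300)
```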


No, I would expect sub to return a substring of the initial byte string!
Edit: tested 'sub', works as intended (doesn't cut any character), so it can be used to split a big encoded string into sub-strings. This makes it possible, for example, to encode many levels in one string and then extract just the one to load, rather than being forced to define separate strings. We can't 'poke' a string directly though, so to load binary data we have to call 'ord' in a loop.
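A small sketch of the many-levels-in-one-string idea, assuming every level has the same fixed size. The names level_size, levels and get_level are made up for illustration, and the data is a placeholder.

```lua
-- hypothetical layout: each level occupies level_size bytes
level_size=4
-- placeholder data: two 4-byte levels packed back to back
levels="\1\2\3\4\5\6\7\8"

-- return the sub-string holding level n (1-based)
function get_level(n)
  local first=(n-1)*level_size+1
  return sub(levels,first,first+level_size-1)
end

lv=get_level(2)
```

Levels of varying size would need an index or a length prefix instead; that is left out here.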


I noticed someone referring to this post and also that it hasn't been updated in a while, so I figured I ought to take a stab at refining the function. This comes in at about half the tokens (now 57) and probably performs better, though this sort of function probably doesn't get used much at runtime, so maybe that's not so important.
-- ordinal (0..255) -> escape sequence table
-- "?" entries are placeholders that get overwritten below
-- positions 65..90 hold the puny-font glyphs
ord_esc=split("¹²³⁴⁵⁶⁷⁸\t?ᵇᶜ?ᵉᶠ▮■□⁙⁘‖◀▶「」¥•、。゛゜ !?#$%&'()*+,-./0123456789:;<=>?@𝘢𝘣𝘤𝘥𝘦𝘧𝘨𝘩𝘪𝘫𝘬𝘭𝘮𝘯𝘰𝘱𝘲𝘳𝘴𝘵𝘶𝘷𝘸𝘹𝘺𝘻[?]^_`abcdefghijklmnopqrstuvwxyz{|}~○█▒🐱⬇️░✽●♥☉웃⌂⬅️😐♪🅾️◆…➡️★⧗⬆️ˇ∧❎▤▥あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをんっゃゅょアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワヲンッャュョ◜◝",1,false)
ord_esc[0]="\\0" -- nul
ord_esc[10]="\\n" -- newline
ord_esc[13]="\\r" -- cr
ord_esc[34]="\\\"" -- quote
ord_esc[92]="\\\\" -- backslash

function str_esc(s)
  local r=""
  for i=1,#s do
    r..=ord_esc[ord(s,i)]
  end
  return r
end
BTW I had to convert a literal tab in my split("...") string into "\t" because the BBS code parser converts tabs to spaces. This seems like a possible problem. I really wish you'd preserve tabs in code previews and just set the CSS "tab-size" value to something appropriate for PICO-8. I suggest 2, as always, but do 4 or 1 or whatever, just as long as you keep the tab character as-is. Code blocks should never be molested in any way other than styling them, really.
Edit: Here's the code inside a cart just so it can run the unit test, which just compares my method's results with yours, and eventually break when something changes. 😜


That's nice and streamlined, but it doesn't add 2 extra zeroes to \0 glyphs if they're followed by numeric symbols, which could cause errors.


Yep, @JadeLombax and @Felice. For instance in my compressor I must use \48 to \57 for digits. If I don't the data messes up.


Oh, drat, I missed that element of zep's converter. I'll see if I can come up with something that's streamlined but still works well, hmm. Lemme think.
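One possible direction, offered only as a sketch and not as anyone's posted solution: keep the table lookup and add a look-ahead for a nul followed by a digit. To stay self-contained, the table here is built with chr() rather than split(); esc and str_esc_safe are hypothetical names.

```lua
-- default: every character maps to itself
esc={}
for i=0,255 do
  esc[i]=chr(i)
end
esc[0]="\\0"   -- nul
esc[10]="\\n"  -- newline
esc[13]="\\r"  -- cr
esc[34]="\\\"" -- quote
esc[92]="\\\\" -- backslash

function str_esc_safe(s)
  local r=""
  for i=1,#s do
    local c=ord(s,i)
    local nc=ord(s,i+1) -- nil past the end of the string
    if c==0 and nc and nc>=48 and nc<=57 then
      -- nul followed by a digit: emit the full 3-digit escape
      r..="\\000"
    else
      r..=esc[c]
    end
  end
  return r
end
```

The extra branch costs some tokens compared to the pure lookup, so this trades a little size for matching the \000 behavior of escape_binary_str().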