
(recommended: use with PICO-8 0.2.1b or later)

This function converts binary strings into a format that can be pasted into source code. Binary strings can contain any character from chr(0)..chr(255), and so include unprintable / unstorable characters. escape_binary_str() adds the needed escape codes and stores the remaining characters as-is. For example, character 10 becomes \n and character 0 becomes \0, or \000 when followed by a digit (to avoid ambiguity).

This is useful for storing dense binary data efficiently (e.g. compressed with PX9). If you are storing structured data in code (like a raw image), it will likely be easier and almost as efficient to store it as a string of hexadecimal characters.

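function escape_binary_str(s)
 local out=""
 for i=1,#s do
  local c  = sub(s,i,i)
  -- next byte (nil at the end of the string)
  local nc = ord(s,i+1)
  -- pad \0 to \000 when a digit ("0".."9") follows
  local pr = (nc and nc>=48 and nc<=57) and "00" or ""
  local v=c
  -- escape quote, backslash, nul, newline and carriage return
  if(c=="\"") v="\\\""
  if(c=="\\") v="\\\\"
  if(ord(c)==0) v="\\"..pr.."0"
  if(ord(c)==10) v="\\n"
  if(ord(c)==13) v="\\r"
  out..= v
 end
 return out
end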

Workflow

Step 1. Generate a Binary String

binstr=""
for i=1,256 do
 binstr..=chr(i%256) -- any data you like
end

?#binstr         -- 256
?ord(binstr,256) --   0
?ord(binstr, 13) --  13 

Step 2. Escape the String and Copy to Clipboard

printh(escape_binary_str(binstr), "@clip")

Step 3. Paste into Source Code

Turn on Puny Mode (CTRL-P) first, to make sure uppercase characters are encoded as punyfont.

Then press CTRL-V to paste into source code as a string value: bindat="[paste here]". You should get something like this:

bindat="¹²³⁴⁵⁶⁷⁸    \nᵇᶜ\rᵉᶠ▮■□⁙⁘‖◀▶「」¥•、。゛゜ !\"#$%&'()*+,-./0123456789:;<=>?@𝘢𝘣𝘤𝘥𝘦𝘧𝘨𝘩𝘪𝘫𝘬𝘭𝘮𝘯𝘰𝘱𝘲𝘳𝘴𝘵𝘶𝘷𝘸𝘹𝘺𝘻[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~○█▒🐱⬇️░✽●♥☉웃⌂⬅️😐♪🅾️◆…➡️★⧗⬆️ˇ∧❎▤▥あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをんっゃゅょアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワヲンッャュョ◜◝\0"

Step 4. Enjoy your Binary Data

The contents of bindat can now be accessed with ord(bindat, index) (note that index is 1-based).
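
Since data offsets are often counted from 0 while ord() counts from 1, a small wrapper can hide the difference. This is only a sketch; byte_at0 is a hypothetical helper name, not part of PICO-8 or of the post above.

```lua
-- hypothetical helper: read one byte using a 0-based offset
function byte_at0(s,off)
 -- ord() is 1-based, so shift the offset by one
 return ord(s,off+1)
end
```

With the string from Step 1, byte_at0(bindat,0) would read the first byte and byte_at0(bindat,255) the last one.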

P#78950 2020-07-05 21:26 ( Edited 2020-10-29 03:56)

Hi zep! (First comment here but not the first time this is discussed on discord :)

Could you confirm that in addition to 'ord', we can also use 'sub' to get more than one byte at once, and use 'poke' to get the data into memory?

P#96159 2021-08-17 21:58

sub returns a string - @merwork: are you expecting that to work:

poke4(0x4300,sub(byte_str,1,4))

??

P#96170 2021-08-18 08:23 ( Edited 2021-08-18 09:29)

No, I would expect sub to return a substring of the initial byte string!

Edit: tested 'sub', works as intended (doesn't cut any character), so it can be used to split a big encoded string into sub-strings. This makes it possible, for example, to encode many levels in one string and then extract one to load it, rather than being forced to define separate strings. We can't pass a string to 'poke' though, so to load binary data we have to call 'ord' in a loop.
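
A rough sketch of that idea, assuming every level occupies exactly the same number of bytes (that fixed size is my assumption for illustration). get_level and load_bytes are hypothetical names.

```lua
-- hypothetical helper: return level n (1-based),
-- assuming each level is exactly len bytes long
function get_level(s,n,len)
 local first=(n-1)*len+1
 return sub(s,first,first+len-1)
end

-- hypothetical helper: copy a string into memory,
-- one byte per ord() call
function load_bytes(s,addr)
 for i=1,#s do
  poke(addr+i-1,ord(s,i))
 end
end
```

For example, load_bytes(get_level(levels,3,64),0x4300) would copy the third 64-byte level to the user data area, where levels is whatever string holds the pasted data.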

P#96178 2021-08-18 13:41 ( Edited 2021-08-18 17:59)

0.2.3 changelog:

> Added: ord(str, pos, num) returns num results starting from character at pos (similar to peek)

One call, no loop \o/
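
Something like the following should work on 0.2.3 or later, since poke() accepts several values after the address. Treat it as a sketch: load_bytes_fast is a hypothetical name, and I have not checked how it behaves with very long strings.

```lua
-- hypothetical helper: copy a whole string into memory
-- ord(s,1,#s) returns every byte of s as separate results,
-- and poke() writes them to consecutive addresses
function load_bytes_fast(s,addr)
 if #s>0 then
  poke(addr,ord(s,1,#s))
 end
end
```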

P#96965 2021-09-06 18:16

@zep

I noticed someone referring to this post and also that it hasn't been updated in a while, so I figured I ought to take a stab at refining the function. This comes in at about half the tokens (now 57) and probably performs better, though this sort of function probably doesn't get used much at runtime, so maybe that's not so important.

-- ordinal (0..255) -> escape sequence table
ord_esc=split("¹²³⁴⁵⁶⁷⁸\t?ᵇᶜ?ᵉᶠ▮■□⁙⁘‖◀▶「」¥•、。゛゜ !?#$%&'()*+,-./0123456789:;<=>?@abcdefghijklmnopqrstuvwxyz[?]^_`abcdefghijklmnopqrstuvwxyz{|}~○█▒🐱⬇️░✽●♥☉웃⌂⬅️😐♪🅾️◆…➡️★⧗⬆️ˇ∧❎▤▥あいうえおかきくけこさしすせそたちつてとなにぬねのはひふへほまみむめもやゆよらりるれろわをんっゃゅょアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワヲンッャュョ◜◝",1,false)
ord_esc[0]="\\0"    -- nul
ord_esc[10]="\\n"   -- newline
ord_esc[13]="\\r"   -- cr
ord_esc[34]="\\\""  -- quote
ord_esc[92]="\\\\"  -- backslash

function str_esc(s)
    local r=""
    for i=1,#s do
        r..=ord_esc[ord(s,i)]
    end
    return r
end

BTW I had to convert a literal tab in my split("...") string into "\t" because the BBS code parser converts tabs to spaces. This seems like a possible problem. I really wish you'd preserve tabs in code previews and just set the CSS "tab-size" value to something appropriate for PICO-8. I suggest 2, as always, but do 4 or 1 or whatever, just as long as you keep the tab character as-is. Code blocks should never be molested in any way other than styling them, really.

Edit: Here's the code inside a cart just so it can run the unit test, which just compares my method's results with yours, and will eventually break when something changes. 😜

Cart #foziyotehe-0 | 2022-12-30 | License: CC4-BY-NC-SA

P#123332 2022-12-30 22:05 ( Edited 2022-12-30 22:26)

@Felice

That's nice and streamlined, but it doesn't add 2 extra zeroes to \0 glyphs if they're followed by numeric symbols, which could cause errors.
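
One possible way to add that check to a table-driven escaper is to look at the next byte before emitting the nul escape. This is a sketch rather than a tested replacement: str_esc_safe is a hypothetical name, and esc is assumed to be a lookup table shaped like ord_esc above (byte value 0..255 -> escaped text).

```lua
-- hypothetical variant of a table-driven escaper
-- esc: table mapping byte value (0..255) to its escaped text
function str_esc_safe(s,esc)
 local r=""
 for i=1,#s do
  local b=ord(s,i)
  local nb=ord(s,i+1) -- nil at the end of the string
  if b==0 and nb and nb>=48 and nb<=57 then
   -- nul followed by a digit: use the 3-digit form
   r=r.."\\000"
  else
   r=r..esc[b]
  end
 end
 return r
end
```

The extra comparison costs some tokens, so this gives back part of the saving from the table lookup.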

P#123371 2022-12-31 18:45

Yep, @JadeLombax and @Felice. For instance in my compressor I must use \48 to \57 for digits. If I don't the data messes up.

P#123374 2022-12-31 19:23

-- removed incorrect comment --

P#123424 2023-01-01 09:59 ( Edited 2023-01-01 15:20)

@JadeLombax

Oh, drat, I missed that element of zep's converter. I'll see if I can come up with something that's streamlined but still works well, hmm. Lemme think.

P#123429 2023-01-01 13:41

I've been working with a system similar to this, but \0 isn't read correctly; it also corrupts the next byte if it's a 0. Are there any solutions for this?

P#139687 2024-01-06 01:35

Yes @teddblue, that's fixable: "\000" is another way to encode the 0 byte, "\005" for 5, etc. You should use this form to avoid issues when the next byte is an ASCII digit ("0"-"9", bytes 48-57).

escape_binary_str (above) handles this with the "local pr = ..." line; take a look at that part of the code.
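
If you would rather not look ahead at all, a simpler (though longer in the output) option is to always write the full 3-digit decimal form. A minimal sketch, with esc_byte3 as a hypothetical helper name:

```lua
-- hypothetical helper: byte value (0..255) -> "\ddd"
-- always emits 3 digits, so a following digit is never ambiguous
function esc_byte3(n)
 local d="00"..tostr(n)
 -- keep only the last three characters
 return "\\"..sub(d,#d-2,#d)
end
```

This turns 0 into \000, 5 into \005 and 255 into \255, at the cost of four characters for every byte escaped this way.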

P#139697 2024-01-06 06:28

Yeah, I just have my encoder use "\000" instead of "\0" now, and it works great.

P#140004 2024-01-11 20:35
