recently, while investigating the pico-8 preprocessor, i found some really weird behaviour, which culminated in some very strange token optimizations and an infinite token exploit

### Arithmetic assignment save

often, you want to perform multiple arithmetic operations on a variable, then assign the result back to it. for example, `a=2*a+1`

the 2 ways you'd normally do this in pico-8 are

```
a=2*a+1 -- 7 tokens
a*=2 a+=1 -- 6 tokens
```

however, using preprocessor trickery, we can reduce it even more:

 ```a*=2 +1 -- 5 tokens```

this works because the preprocessor's patching of compound assignments (`+=`, `*=`, ...) works line-wise, so this gets patched to

 ```a= a*(2) +1```
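
to sanity-check the arithmetic, here's a minimal sketch in plain lua (not pico-8, so the compound assignments are written out by hand). it assumes the patched shape shown above is accurate; `two_statements`, `patched` and `naive_reading` are hypothetical names used only for illustration.

```lua
-- plain lua sketch: each function spells out one way of writing the update
function two_statements(a)
  a = a * 2        -- a*=2
  a = a + 1        -- a+=1
  return a
end

function patched(a)
  a = a * (2) + 1  -- the shape the post gives for `a*=2 +1`
  return a
end

function naive_reading(a)
  a = a * (2 + 1)  -- what `a *= 2 + 1` would mean in most languages
  return a
end

for a = -3, 3 do
  assert(patched(a) == 2 * a + 1)
  assert(patched(a) == two_statements(a))
end
assert(patched(5) == 11)
assert(naive_reading(5) == 15)
```

the `+1` lands outside the parentheses, which is exactly why the trick computes `2*a+1` and not `3*a`.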

### Infinite token exploit #1

this exploit allows you to run any code that is on 1 line and doesn't use any pico-8 preprocessor-based syntax extensions (i.e. +=, shorthand if, ?), while only costing 8 tokens. it works as follows:

 ```a={} a["[t"]+=" < your code here > t(```

this looks extremely weird, because it is. note that our code is inside an (unclosed) string, so it only counts as 1 token

the preprocessor patches this code to

 ```a={} a["[t"] = t"] + (" < your code here > t( ) ```

we see that because of how the preprocessor parsed our expression, we don't have any unclosed strings anymore. this still looks weird though, so let's simplify it

 ```a={} a["[t"] = t("] + (") < your code here > t( ) ```

so, the code contains 4 parts

1. creating an empty table
2. assigning some value to some key in the table (note that t is the function time())
3. actually running our code
4. calling t()

parts 1, 2 and 4 don't actually do anything useful, which means we ran our code while only costing 8 tokens!
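
the four parts can be checked in plain lua, since the patched line is ordinary lua syntax (`t"..."` is just a function call with a string literal argument). this is a rough sketch, not pico-8 itself: `t` is a stub standing in for pico-8's time(), `calls` records what it receives, and `ran = true` is a hypothetical payload.

```lua
-- plain lua sketch; `t` is a stub for pico-8's time(), `ran` is a made-up payload
calls = {}
function t(arg)
  calls[#calls + 1] = arg or "<none>"
  return 0
end
ran = false

-- the patched line from the post, with `ran = true` as "your code"
a={} a["[t"] = t"] + (" ran = true t( )

assert(ran == true)              -- part 3: the payload ran as regular code
assert(a["[t"] == 0)             -- part 2: t's result was stored under the key "[t"
assert(calls[1] == "] + (")      -- part 2: t was called with the leftover string
assert(calls[2] == "<none>")     -- part 4: the trailing t( ) call
```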

### Infinite token exploit #2

the previous exploit is already nice, but being limited to 1 line is a bit annoying. so let's improve it:

 ```a={} a['[t']+=[[' < your code here > t(a[a[1]] ```

is patched to

 ```a={} a['[t'] = t'] + ([[' < your code here > t(a[a[1]] ) ```

similarly to exploit #1, before patching, our code is in a multiline string, and thus only costs 1 token. after patching, it is not in a string anymore, so pico-8 just runs it as regular code. so now we can run any code (with the same caveat as before of not using pico-8's preprocessor-based syntax extensions), using only 8 tokens
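
both views of that line can be tried in plain lua. this is a sketch under the assumption that the patched shape above is accurate; `t` is again a stub for time(), and `s` and `total` are hypothetical names for illustration.

```lua
-- before patching: everything from [[ up to the first ]] is one long string,
-- which is why the line ends in a[a[1]] (the ]] closes the string)
s = [[' < your code here > t(a[a[1]]
assert(s == "' < your code here > t(a[a[1")

-- after patching: the same characters are regular code.
-- run here with a stub t and a multi-line payload
function t() return 0 end
a={} a['[t'] = t'] + ([['
total = 0
for i = 1, 3 do
  total = total + i
end
t(a[a[1]] )

assert(total == 6)   -- the multi-line payload ran
assert(a['[t'] == 0) -- the stub's result was stored, nothing else happened
```

after patching, `a[a[1]]` is just a harmless table lookup: `a[1]` is nil, and reading `a[nil]` gives nil.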

### An argument against the preprocessor

all of these exploits are caused by the preprocessor being kind of weird and finicky. while i'm sure these specific ones can be fixed by changing it, i'm pretty convinced you could find things like these in every non-syntax-aware preprocessor. while @zep has been against adding compound operators (+=) to the syntax in the past, I think these examples (and all the other weird preprocessor behaviour) provide a decent argument for why they should be

### Demo

to show the viability of this method, I made a version of celeste that only uses 5 tokens, using exploit #1 (the 3 token save comes from using _ENV instead of defining a)

Cart #fivetokenleste-0 | 2022-10-28

P#119789 2022-10-28 19:51

Haha, nice one @gonengazit

I've been looking again at ditching the pre-processor recently while working a bit on Picotron (which does not use one), and this pretty much seals the deal. @samhocevar has already proved by example that it is a viable approach with z8lua, and the branch I'm experimenting with seems to have pretty good backwards compatibility already. Apart from getting rid of weird edge case behaviours and not being eternally bug-prone, I'm also happy that compound assignments like "num[rnd(5)\1] += 1" can work as expected without evaluating the rnd() twice.
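
To make the double evaluation concrete, here is a minimal sketch in plain Lua. It assumes the preprocessor expands the compound assignment textually, as described above; `pick` is a hypothetical stand-in for `rnd(5)\1` that returns a different index on every call so the problem is visible.

```lua
calls = 0
function pick()   -- hypothetical stand-in for rnd(5)\1
  calls = calls + 1
  return calls    -- a different index on every call
end

num = {0, 0, 0}

-- textual expansion of `num[pick()] += 1`: pick() appears twice, so it runs twice
-- and the slot that is read is not the slot that is written
num[pick()] = num[pick()] + 1
assert(calls == 2)

-- evaluating the index once, which is what a syntax-aware `+=` can do
i = pick()
num[i] = num[i] + 1
assert(calls == 3)
assert(num[3] == 1)
```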

P#119791 2022-10-28 21:03 ( Edited 2023-02-02 19:48)

Looks like this should be doable for 0.2.5d. Enjoy your phony tokens while they last, scallywags.

P#119792 2022-10-28 21:08

Everyone quick, save a copy of this build and never delete it! XD

P#119797 2022-10-28 22:50

@gonengazit: That is quite the find !

For those confused, try out this simple code:

Cart #kosesorita-0 | 2022-10-29

```
_𝘦𝘯𝘷['[t']+=[['
-- start of code
function _init()
  cls()
end
function _update()
  pset(rnd(128),rnd(128),rnd(16))
end
-- end of code
t(
```

Make sure to type that `_env` above in lowercase letters and the `t(` below. Use CTRL+P to help type lowercase and CTRL+P to return back to normal typing.

Yep, that only takes =5= tokens !

And no, I don't know of a way to call this through a string. That'd be awesome, since then strings could hold code and you could have self-modifying code. Yet ... ZEP said he is fixing this - so - ratz ...

Ha, @RyanC. I remember finding an exploit years ago in Pico-8 where you could use the `include()` command inside an executable to run a notepad file as code, defeating the purpose of purchasing the editor.

I brought that to Zep's attention and he fixed it, too ...

P#119803 2022-10-29 01:11 ( Edited 2022-10-29 04:13)

While I wouldn’t exploit this for a game, it might give me the freedom to implement all the debugging tools I need to finish the game comfortably, as discussed in this “developer mode” thread (https://www.lexaloffle.com/bbs/?tid=49573). Going to test it tomorrow!

P#119853 2022-10-30 10:46

@zep

Hey, while you're porting stuff out of the preprocessor, I assume you'll need to handle the assignment operators, which brings up an issue I noticed:

In 0.2.5 you added a synonym for PICO-8's `^^` xor operator to match what vanilla Lua did. I really liked this, because I've realized that `~` for xor makes total sense, considering subtraction (`a-b`) is to negation (`-b`) as xor (`a~b`) is to not/invert (`~b`).

However, there's no `~=` assignment operator, and I assume this is due to difficulty detecting the difference between that and the vanilla lua `~=` logical inequality operator with regexes. I'm gonna guess the parser is more capable, though, so I hope you can make that happen. I mean, having to use `^^=` isn't the end of the world, but it'd be nice to be consistent and use the more-appropriate `~=` when I mean xor-equals.

Side note: I think this is one of those things where a language got something wrong early on. I always thought Lua's main mistake was its 1-based tables, but the `~=` inequality operator became a close contender when bitwise ops were introduced. I really wish they'd stuck with the existing `!=` that nearly everyone else used. It's nigh-impossible to deprecate it now in favor of `!=`, but it'd sure be nice.

P#119964 2022-11-01 18:59 ( Edited 2022-11-01 19:04)

what?! `~=` is the normal lua inequality comparison for all types. it cannot become an augmented assignment.

P#119968 2022-11-01 19:41

@merwok

Of course it can. It's a contextual thing. The parser knows where assignment operators are vs. where expression operators are. You wouldn't find a comparison operator to the right of a receiving variable, so in that position it would obviously be an assignment operator. Likewise, Lua doesn't allow in-expression assignments the way C/C++ do, so there's no risk that it'll be mistaken for an assignment operator in the middle of an expression.

This is just like how the "-" character works both for subtraction and negation. The parser knows that when there's an expression to the left of it, it's subtraction, and when there's an operator (or nothing) next to it, it's negation.
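
As a rough illustration of that contextual rule, here is a toy classifier in plain Lua. It is not PICO-8's parser or anything zep has described: `classify_tilde_eq` is a hypothetical helper, `tokens` is assumed to be one statement that has already been split into tokens, and assignment targets are assumed to be bare names.

```lua
-- toy classifier, not a real parser: decides what `~=` means from its position
function classify_tilde_eq(tokens)
  -- a bare comparison is not a valid Lua statement, so `name ~= ...` at the
  -- very start of a statement could only be read as an assignment
  local first_is_name = tokens[1]:match("^[%a_][%w_]*$") ~= nil
  if first_is_name and tokens[2] == "~=" then
    return "xor-assign"
  end
  -- anywhere else it sits inside an expression, so it is the inequality test
  return "not-equal"
end

assert(classify_tilde_eq({"a", "~=", "b"}) == "xor-assign")
assert(classify_tilde_eq({"if", "a", "~=", "b", "then"}) == "not-equal")
assert(classify_tilde_eq({"x", "=", "a", "~=", "b"}) == "not-equal")
```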

Also, in my experience reading other people's carts, almost everyone uses the PICO-8-specific != inequality operator, probably because they're used to it from common languages like C/C++/Java/Javascript/etc.

P#119974 2022-11-01 22:49 ( Edited 2022-11-01 22:53)

@Felice

if we're already talking about nice things we could get from this change, it'd be extremely nice if compound assignment supported multiple assignment (i.e. `a,b += c,d`). While this should definitely be possible to express in the syntax, I don't know how difficult/complex it would be to implement in the parser. It'd be extremely nice to have though.

P#119975 2022-11-01 23:03

@gonengazit

Ooo, true. I've actually asked for that in the past, but the regex preprocessor method was apparently not up to the task.

I'd actually expect that, if zep can make the parser assignment-operator-aware, it'll just work right out of the box, since assignments naturally support tuples, but only @zep would know for certain.

That'd be really wonderful, for sure. There's hardly a game in existence that wouldn't be more elegant with `px,py += vx,vy`.
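
For reference, this is what that hypothetical statement would presumably stand for, written as an ordinary multiple assignment in plain Lua. The syntax `px,py += vx,vy` does not exist as of this thread, and the starting values are made up for illustration.

```lua
px, py, vx, vy = 10, 20, 1, -2

-- presumed meaning of the hypothetical `px,py += vx,vy`
px, py = px + vx, py + vy

assert(px == 11 and py == 18)
```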

P#119976 2022-11-01 23:07 ( Edited 2022-11-01 23:11)