@dddaaannn - preprocessors, picotool, programming philosophy, pizza

Felice • 2018-03-05 14:35

On yonder thread, where you wisely noted we should stop polluting its subject, you responded to me and said:

Static preprocessors like cpp are a weird fit for dynamic languages like Lua. For example, cpp-style includes have no order-dependent side effects in the languages they're used for (other than in the preprocessor macro language itself, I think?), so they can more simply insert code at first mention. This is not the case in Lua. Lua modules make the handling of side effects in the code explicit, so there's no confusion as to what a require() is expected to do. I'm very interested to know why a Pico-8 developer would prefer an #include-like to a require() because I can't think of a reason, as long as require() is implemented correctly.

Re: defines and such, I think what we actually want, especially in the context of Pico-8, is build-time constant folding and dead code elimination.

But we should take build tool discussion somewhere else, so people can use this thread to discuss Compos. :)

Well, I'm a C/C++ programmer at heart, and while I've certainly warmed to a lot of the dynamic aspects of non-static languages like Python and Lua, I'm also aware of a lot of the inherent costs that zep tends to hide from us. I like to adhere to practices that wouldn't be suicidal, perf-wise, in a normal context, so long as they aren't detrimental in this context.

Abstraction and modularity have their uses, but when your resources are limited, sometimes they get in the way. Flexibility is only as important as the path you're walking is wide. Like, consider tweetcarts... a very narrow path, no use for flexibility, just have to walk the exact line the cart needs to walk, or you'll fall off.

A lot of senior programmers are very quick to mutter about premature optimization, and they're right to, but I think there needs to be a lot more muttering about premature generalization. I said to someone recently that you don't create an object to represent every pixel, with a getter and a setter, so obviously there's a point somewhere between reading the user's input and putting pixels on the screen where you stop trying to create general, re-usable solutions and start making specific, one-off solutions that will work best in the context.

Anyway, the most direct answer as to why I'd prefer an #include-like solution is that it simply has the least overhead in the final executable. I'm very good about being modular in terms of keeping things in separate files and independent, because I do like re-usability, but that means that there's a non-negligible token cost (as in dozens, easily) to having a function wrapped around every require()d file.

Oh, and yeah, constant folding would be amazing, but I won't hold my breath for zep to add that. He's clearly not going to upgrade the lua engine embedded in PICO-8. I'd love to have far more than just the folding. Actual binary ops (rather than functions) would be awesome, among other things. But I don't see it happening. So I thought I'd hit you up for more stopgap measures. :)

I also wish for something along the lines of inlining. Just the ability to express something as a function, when it should be a function, rather than having to manually inline it all over because each inlined instance is one or two tokens fewer than calling it, due to the simplicity of the function.

Hm, think that covers it.

BTW, I only put pizza in the title because I wanted another 'P' word. As it happens, I personally prefer pepperoni pizza.

freds72 • 2018-03-05 16:14

Is it a call to vote for:

local lib=require("framework")

vs.

#include "framework.lua"

Felice • 2018-03-05 16:36

It doesn't have to be versus. I think you can have a system where you can do either thing.

dddaaannn • 2018-03-05 17:20

Your point about pragmatism is probably the most important. To that end I agree that a macro preprocessor would be useful. That's why I suggested that maybe an existing standalone preprocessor could be made to work as part of a build pipeline, or picotool could provide adaptor logic instead of implementing one of its own.

I also agree with your points about optimization in the abstract, but I think we need actual numbers and experience to determine whether (for example) the token overhead of the current require() implementation is untenable. That's why I'm interested in stories from people who have tried to use require() and hit its limitations in real-world scenarios.

Specific to tokens, with #include-like global idioms, I see a one-time cost of +49 tokens, +7 tokens per library, and +3 tokens per require() call. With Lua module idioms, add at least +2 per library to return an object, maybe +1 per exported symbol to rearrange lib contents in table form, and, as WaltCodes mentioned might be a deal breaker, +2 per library symbol dereference. (Devs can mitigate that last one with +4-per-symbol aliases into the global namespace. I filed an issue to contemplate ways of doing this automatically, though I don't have a huge win in mind.)
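To make the arithmetic concrete, here is a small sketch that plugs in the figures above. The function names and the example project (3 libraries, 3 require() calls, 10 exported symbols, 20 dereferences) are made up for illustration, and the module-style figure should be read as a lower bound, since the per-library and per-symbol costs are quoted as "at least" and "maybe".

```lua
-- Token overhead estimate using the figures quoted in this post.
-- Function names and the sample project sizes are hypothetical.

-- #include-like global idioms: +49 once, +7 per library, +3 per require() call.
function include_style_overhead(libs, require_calls)
  return 49 + 7 * libs + 3 * require_calls
end

-- Lua module idioms: the same base cost plus the module extras.
function module_style_overhead(libs, require_calls, exported, derefs)
  return include_style_overhead(libs, require_calls)
    + 2 * libs      -- return an export table (quoted as "at least +2")
    + 1 * exported  -- rearrange lib contents in table form (quoted as "maybe +1")
    + 2 * derefs    -- each lib.symbol dereference
end

print(include_style_overhead(3, 3))         -- 49 + 21 + 9
print(module_style_overhead(3, 3, 10, 20))  -- previous total + 6 + 10 + 40
```

With these sample sizes the global idiom costs 79 tokens and the module idiom 135, out of a cart's token budget. The dereference term dominates quickly, which is why the aliasing mitigation matters.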

Is that too much? It depends! Would a macro preprocessor give devs necessary control over the final contents of the cart? Maybe! Not every Pico-8 dev hits the token limit with every project, and some Pico-8 devs might find the utility of module semantics worth the slightly tighter belt.

My personal interest in instrumenting reusable Pico-8 libraries is less for the games that are likely to fill a cart and more for beginner/intermediate devs that benefit from a bit more leverage in the coding department. We're not there yet: picotool needs to be easier to use (or made irrelevant by built-in features), and we need more lib/framework-like things to explore which advantages of this approach are worth exploiting.

Re: constant folding and dead code elimination, I'm imagining this happening at build time, within p8tool build. mygame.lua could contain (41 tokens):

debug = false

radius = 10
pi = 3.14
area = pi * radius * radius
circ_color = 8

if debug then
 printh('area: ' .. area)
end

function unused_function(foo)
 return foo * area
end

circfill(50, 50, radius, circ_color)
print('area: ' .. area, 30, 80, circ_color)

The final cart would render as (9 tokens):

circfill(50, 50, 10, 8)
print('area: 314', 30, 80, 8)

This is what I'd want a macro preprocessor to do, and it wouldn't be that difficult to do it with just static analysis in conservative cases. My only blocker for this in picotool right now is I want to overhaul the parser to make this easier to implement, and that's been a more-than-a-weekend project so far.

Notice how this would interact with require(): You could load in a large library, maybe one that depends on many other libraries, and only the code that your cart actually uses would make it into the final cart. This could offset the overhead of require() and justify module semantics for organizing the code. Dead code elimination is orthogonal to require() vs #include. But, I assert, module semantics would make large loose collections of libraries easier to use. Dead code elimination might make it more practical.

freds72 • 2018-03-05 17:25

What would be the dependency resolution logic for p8.png files?
Search local file system to resolve the file or load it from the png itself?

IMHO, a couple of dependency management gotchas:

How to copy/paste code when the first line is #include "blah.lua"?
How to survive dependency management hell without a central repository (a la NuGet/Maven)?
Do we need such complexity in the first place, given that most complex projects will struggle to save every single token possible?

I see the benefits of reusable/sharable code.
Question is how to do it without losing the pico "soul" (i.e. everything you need is "there").

Felice • 2018-03-05 18:09

@dddaaannn

Yeah, no question that removing dead code or eliding unnecessary code would make the question entirely moot. In fact, it would probably eliminate my personal concern over the overhead of the wrapper functions, because the function call could be elided itself, with the contents simply inlined, as it's a single call from a single site and there's no practical reason for it to be a function call.

I suppose multiple paths to the require() would complicate that, though. I think I've seen you say that it deals gracefully with redundant or cyclic require()s. Is the logic in the preprocessor or in the runtime?

dddaaannn • 2018-03-05 20:22

@freds72

I think in this context we're referring exclusively to a workflow outside of the Pico-8 editor, where you edit .lua files in a text editor then run a tool to add the final code to a cart. That tool would have all the usual local build behaviors including path resolution, and would remove all non-Pico non-Lua lines from the final output.

I'm not inclined to solve a Maven-like packaging and distribution problem until we have one. For now, we can assume that libraries on offer can be downloaded, added to a library path, and referred to locally.

As for what kinds of projects would use such features, that's the question I want to answer through experimentation. We've had multiple attempts at libraries/frameworks so far that have just assumed that someone would copy-paste the entire thing into a new cart that wants to use it. I want to make it easier to develop and use a suite of small libraries, so we don't have to paste in large frameworks just to reuse code. It's difficult to visualize without examples, but there's less incentive to develop examples without the tooling. So the tooling comes first. picotool supports Lua-style require() today, and this discussion is about whether a textual preprocessor might make more sense for Pico-8.

@Felice

Dead code elimination wouldn't eliminate the function wrapper for require(). That's just part of how require() works, both with picotool and with standard Lua. Each lib gets a local scope and exports values by putting them in a table and returning the table. The function is called the first time require() is called for that path, and the resulting export table is cached.

Multiple require()'s of the same library do the right thing. Subsequent calls return the cached export table. That part specifically is handled in the runtime implementation of require(). The actual insertion of code into the path-to-function lookup table occurs in the preprocessor.

(In case you're referring to this, I did document that different path strings that refer to the same file in the Lua lookup path will result in multiple inclusions. This matches Lua's documented behavior, for better and for worse, and I didn't want to deviate in a first pass. If there are reasons to normalize by file path or something we can consider it.)
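A minimal sketch of the mechanism described above, to show where the wrapper function and the cache come from. This is not picotool's actual generated code: `modules`, `loaded`, `my_require`, `load_count` and the "lib/vec" paths are all hypothetical names chosen for illustration.

```lua
-- Hypothetical sketch of a require()-style loader; not picotool's real output.
-- A build step would fill `modules` with one wrapper function per path string.
modules = {}
loaded = {}
load_count = 0

modules["lib/vec"] = function()
  load_count = load_count + 1   -- stands in for the library's side effects
  local exports = {}            -- the library body gets its own local scope
  function exports.add(a, b) return a + b end
  return exports
end
-- The same file reached through a different path string is a separate entry.
modules["./lib/vec"] = modules["lib/vec"]

-- Runtime part: call the wrapper on first use, then serve the cached table.
function my_require(path)
  local cached = loaded[path]
  if cached == nil then
    cached = modules[path]()
    loaded[path] = cached
  end
  return cached
end

local a = my_require("lib/vec")
local b = my_require("lib/vec")    -- cached: same table, body not re-run
local c = my_require("./lib/vec")  -- different string: body runs again
print(a == b, a == c, load_count)
```

The first two calls share one export table, while the third produces a second copy because the cache is keyed on the path string rather than on the file.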

Felice • 2018-03-06 13:59

Okay, so it's a runtime check.

And yeah, I know dead code elimination won't get rid of it. That's why I also mentioned eliding unnecessary code.

Like, if you have the actual AST in your minifier, you can get rid of one-time-only function calls entirely and just inline their contents at the call site. That removes the tokens for the call and for the function definition, barring the need for a do/end to scope any locals that were inside the function.
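As a rough before/after picture of that transformation (hand-written, not the output of any existing tool; `setup_player` and the coordinate globals are invented names, and the second version writes to different globals only so the two can be compared side by side):

```lua
-- Before: setup_player() is defined once and called from exactly one site.
function setup_player()
  local half = 64
  px, py = half, half
end
setup_player()

-- After: the definition and the call are gone. The body sits at the former
-- call site, wrapped in do/end so that `half` stays local to the block.
do
  local half = 64
  qx, qy = half, half
end

print(px, py, qx, qy, half)  -- `half` is nil here: it did not leak
```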

In the case of require() calls, though, you'd likely have cyclic dependencies between library files that would produce those multiple require()s and make it trickier to eliminate the calls. You'd need a static compile/minification-time check instead of a runtime table of what's already been added. I suspect that wouldn't be too painful though. Also, if there are factors that can cause the scheme not to work, just disable the scheme when they're present. That way it works the most efficient way for most basic usages but falls back to the runtime check for complex ones.

Ideally, you could also add some kind of hinting with comments or something, e.g. --[[inline]] to request that a given function be inlined regardless of call count. Or just analyze the code and do it automatically where it works better. This might be great for attribute getters and setters where the contents of the function are no worse than a call to the function plus the function.

For instance:

-----------
--   set
-----------

-- 9 tokens
function object:setvalue(v)
  self.value=v
end

-- 4 tokens
ob:setvalue(v)

-- 4 tokens with no need for the function declaration
ob.value=v

-----------
--   get
-----------

-- 7 tokens
function object:getvalue()
  return self.value
end

-- 5 tokens
v=ob:getvalue()

-- 4 tokens with no need for the function declaration
v=ob.value

Most of the time, this is all getters and setters do, but it's nice to be able to use them everywhere so you can adapt them later, or occasionally have them do more complex stuff under the hood. It often feels like a shame to be manually inlining sets and gets in my OO code to save tokens and improve perf.

dddaaannn • 2018-03-06 20:57

I'm not keen on the general idea of statically inlining functions in a dynamic language, but I haven't put much thought into it. A function that refers to a global would have to check that the symbol still refers to the global in the inlined context, for example. Args pretty much have to be locals in a do block, or at least assigned to name-munged locals to avoid conflicts with other locals.

I'd want some evidence that such a feature provides a useful savings. One fun thing about Pico-8 tooling is that it's reasonably straightforward to analyze a large number of real-world carts to come up with such estimates. Naturally, you can only analyze carts written before the new feature existed. I think for token reduction stuff we probably want to prioritize techniques that don't require changes to idioms to be beneficial.

Re: the specific example of getters and setters, the main reason to use them is to make it possible to extend or replace their behavior later as the data mutates. In Lua, you can do this with the double-underscore index metamethod. Just have clients of a table (object) refer directly to a property, then if access needs to change, define or extend the double-underscore index metamethod to use special behavior for those properties and the default access otherwise. [Edit: double-underscore is converted to bold by the BBS, so I had to spell it out. :) ]
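A minimal sketch of that pattern, assuming a made-up rectangle object with `w`, `h` and a later-added computed `area` property. One caveat worth keeping in mind: the index metamethod only fires for keys that are absent from the table itself, so a property that needs special behavior must not also be stored as a raw field.

```lua
-- Clients read properties directly (ob.w, ob.area). Later, `area` becomes a
-- computed property without changing any call site.
-- All names here (rect_mt, ob, w, h, area) are hypothetical.
rect_mt = {
  __index = function(t, key)
    if key == "area" then
      return t.w * t.h   -- special behavior for this one property
    end
    return nil           -- default: other missing keys still read as nil
  end
}

ob = setmetatable({w = 4, h = 5}, rect_mt)

print(ob.w)     -- stored field: the metamethod is not consulted
print(ob.area)  -- missing field: computed through the metamethod
```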

And because you mention it, for reference and for fun, here's a previous discussion of type annotations in comments. I actually think it's not difficult to define conservative criteria to determine that a variable is used as a constant, but the general idea is sound and useful.

Felice • 2018-03-06 21:28

Well, for pure attributes, maybe, but for more complex get/set stuff, like say a setrect(), it's not always such a great idea to do it as attributes and then override for specific cases via __index.

In general I don't like using the __index fallback, since it incurs a heavy cost behind the scenes. On pico-8, it's not bad at all, but I really really hate developing habits on pico-8 that aren't a good idea anywhere else.

By the way, what's your issue with static inlining, exactly? I didn't follow you about the globals that might change. If you're referring to doing a static analysis to inline a global that's assigned as, say, tau=6.28, then I'd agree that that would be a bad idea. Personally, I'd prefer to do constants via preprocessor, so that they're truly constant, for exactly that reason. If I do the equivalent of, say, #define tau 6.28, I know it can't be stomped.

Anyway, in my previous comment, I was referring to inlining only code, not values. Code is static and has no such issues. I don't really mean to take the same approaches to code and values.

dddaaannn • 2018-03-07 02:54

It was just a corner case that stuck out to me at first glance. Something like:

state = 'level1'

function change_game_state(newstate)
 state = newstate
end

function geography_quiz()
 local state = 'washington'
 change_game_state('level2')
 do_quiz(state)
end

without proper care could become:

state = 'level1'

function geography_quiz()
 local state = 'washington'
 state = 'level2'
 do_quiz(state)
end

A reasonable solution is to never inline something that refers to a global, and to name-munge inlined locals. There are probably other worthy disqualifications.
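The difference between the two versions can be checked directly. The sketch below wraps both in a small harness; `do_quiz` is replaced by a stand-in that records its argument, and `run`, `last_quiz` and the `quiz_*` names are invented for the test.

```lua
function change_game_state(newstate)
  state = newstate
end

-- Stand-in for the real quiz: just record what it was asked about.
function do_quiz(s)
  last_quiz = s
end

-- Original: the call updates the global `state`.
function quiz_original()
  local state = 'washington'
  change_game_state('level2')
  do_quiz(state)
end

-- Naive inline: the pasted assignment now resolves to the local `state`.
function quiz_naive_inline()
  local state = 'washington'
  state = 'level2'
  do_quiz(state)
end

-- Reset the globals, run one version, report what it did.
function run(quiz)
  state = 'level1'
  last_quiz = nil
  quiz()
  return state, last_quiz
end

print(run(quiz_original))      -- global changed, quiz about 'washington'
print(run(quiz_naive_inline))  -- global untouched, quiz about 'level2'
```

Both the game state and the quiz subject come out wrong in the naive version, so an inliner has to either refuse this case or rename one of the two `state` variables.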

Static inlining in a dynamic language is different from inlining generated assembly in a compiled language. Not impossible I guess, but more to think about to result in a functionally equivalent program. It seems unlikely to save tokens, so saving the perf hit of the function call looks like the main benefit.

No real objection to the getter/setter example as something small that could be inlined, just something that occurred to me. I'm not sure I'd be inclined to use getters and setters in Pico-8 OO without a good reason within a single version of a game.

Felice • 2018-03-07 07:04

Actually, I was lying in bed thinking obsessively about this, as I am wont to do, and it occurred to me that without type information, you really can't inline getters and setters. At least, not with a class hierarchy. I guess you could do it with a flat set of classes, but that's hardly OO programming.

I guess with hinting, e.g. either comment pragmas or with name decoration, you could at least imply that some methods are always from the base class, but that seems of limited use.

Hmm. Needs more lying in bed, I think.

It'd still be nice to have global inlines, though. Like, a function that's nice to name, but which really only does one or two simple math ops, like the ceil() we used to have to write ourselves.

Oh, and yeah, you'd definitely have to be aware of local variable scope. I assume the AST has that info in it, though, since it's known at compile time.

Felice • 2018-03-07 07:22

Had a bit of a think and realized it might be better if we stick to one topic for the moment, rather than letting feature creep rule the day.

It's probably most useful to tackle the idea of constants for now. That'd aid the programming cause the most, I think. It might not help the runtime much, or at all, but I think it might produce the most quality of life for the least effort.

Maybe literal folding as well, at least getting rid of unary minus on literals. I think you should be able to pattern match on the AST and see unary-minus,literal pairs. Maybe you do this already, I forget. I should look at your code again.
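A sketch of what that pattern match could look like, written against a made-up AST shape (tables with a `kind` field). picotool's real parser uses its own node classes, so this only illustrates the idea.

```lua
-- Hypothetical AST node shapes; picotool's real node classes differ.
--   {kind="number", value=5}
--   {kind="unop", op="-", operand=<node>}
function fold_unary_minus(node)
  if type(node) ~= "table" then return node end
  -- Fold children first so nested cases such as -(-5) collapse too.
  for key, child in pairs(node) do
    node[key] = fold_unary_minus(child)
  end
  if node.kind == "unop" and node.op == "-"
     and node.operand.kind == "number" then
    return {kind = "number", value = -node.operand.value}
  end
  return node
end

-- spr(-5): the argument starts as unary-minus applied to the literal 5.
tree = {kind = "call", name = "spr", args = {
  {kind = "unop", op = "-", operand = {kind = "number", value = 5}},
}}
tree = fold_unary_minus(tree)
print(tree.args[1].kind, tree.args[1].value)
```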

dddaaannn • 2018-03-07 11:12

The most researched model for these shenanigans is Closure compiler, which does all this for JavaScript. Its approach relies heavily on metadata in comments to hint at and verify correct behavior. picotool may have to do the same, unfortunately.

A simple but pernicious example is tables. A table reference can use dot notation for symbol-like access, but at any time code can index into a table using a string calculated from whatever. A preprocessor can never know from the code alone that a given property assignment will never be used. The only viable solution is hints in metadata comments: this symbol is safe to mess with because I declare that I only refer to it using symbol syntax. (IIRC Closure compiler will even error if it sees objects declared this way accessed via array syntax.) picotool's minifier already has this problem, and having a fix for this will be necessary for combining this technique with require()'d libs.
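A tiny illustration of the problem, using an invented `stats` table. The second half simulates by hand what a property-renaming minifier would produce; it is not the output of picotool or any other tool.

```lua
stats = {hp = 10, mp = 3}

-- Symbol-style access: a static pass can see the name `hp` in the source.
print(stats.hp)

-- String-computed access: the source never spells out `mp` as a symbol here,
-- yet the field is read at runtime.
key = "m" .. "p"
print(stats[key])

-- Hand-simulated renaming of properties to short names (hp -> a, mp -> b).
-- The computed key still says "mp", so the lookup silently returns nil.
minified_stats = {a = 10, b = 3}
print(minified_stats[key])
```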

I'm reluctant to invest in a sophisticated meta-language, at least at first. Thankfully I think many carts are simple enough that some early experiments might prove fruitful. Just a lot to think about to get it right.

Felice • 2018-03-07 11:40

Yeah, I think it might be best to start simple and work incrementally. Having a grand unified theory would be great, but often I find that trying to create one causes a project to stall indefinitely, whereas tinkering with small ideas, which may or may not work out, tends to lead to practical progress and may, in time, lead one inevitably to the grand unified theory anyway.

How about starting with simple global constant replacement? On PICO-8, we know no one has access to the _G table that wraps global variables, so you can only muck with a global by accessing it by name. Should be reeeasonably straightforward to detect what globals are effectively constant and substitute their values, at least for number types.

dddaaannn • 2018-03-07 18:03

Like I mentioned earlier, I want to overhaul the parser first. Right now it's a bigger pain than it should be to transform the AST and still generate nice looking code.

Constant folding will be a build-time operation on the AST and will have nothing to do with Pico-8 or a Lua runtime. In theory, it'd be general enough for all Lua (though I can't imagine other Lua use cases that would want to eliminate tokens :) ). I hope to generalize some of this during the overhaul.

In the general solution, it can just analyze every scope and find variables that are assigned exactly once with a literal expression. It would make multiple passes (or use a queue) to allow a substitution to turn a runtime expression into a literal expression.
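A rough sketch of that loop over a deliberately simplified program representation: a flat list of global assignments whose expressions are numbers, names, or binary operations. Everything here (`fold`, `find_constants`, the table shapes) is hypothetical, and the sketch ignores scopes, shadowing and use-before-assignment, all of which a real implementation would have to handle.

```lua
-- Expressions: a number literal, a name (string), or {op=, lhs=, rhs=}.
-- Replace known constants and evaluate operations on two literals.
function fold(expr, consts)
  if type(expr) == "string" then
    local v = consts[expr]
    if v ~= nil then return v end
    return expr
  elseif type(expr) == "table" then
    local l, r = fold(expr.lhs, consts), fold(expr.rhs, consts)
    if type(l) == "number" and type(r) == "number" then
      if expr.op == "*" then return l * r end
      if expr.op == "+" then return l + r end
    end
    return {op = expr.op, lhs = l, rhs = r}
  end
  return expr
end

-- Repeat until no new constant appears, so that one substitution can turn
-- a runtime expression into a literal for the next pass.
function find_constants(assignments)
  local count = {}
  for _, a in ipairs(assignments) do
    count[a.name] = (count[a.name] or 0) + 1
  end
  local consts = {}
  local changed = true
  while changed do
    changed = false
    for _, a in ipairs(assignments) do
      a.expr = fold(a.expr, consts)
      if count[a.name] == 1 and type(a.expr) == "number"
         and consts[a.name] == nil then
        consts[a.name] = a.expr
        changed = true
      end
    end
  end
  return consts
end

program = {
  {name = "radius", expr = 10},
  {name = "pi", expr = 3.14},
  {name = "area", expr = {op = "*",
    lhs = {op = "*", lhs = "pi", rhs = "radius"}, rhs = "radius"}},
  {name = "score", expr = 0},
  {name = "score", expr = {op = "+", lhs = "score", rhs = 1}},
}
consts = find_constants(program)
-- `score` is assigned twice, so it is not treated as a constant.
print(consts.radius, consts.pi, consts.area, consts.score)
```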

Identifying literal expressions is a short hop away from evaluating them, which is why I was expecting to combine the features.

Another early simplification is to only consider the global scope of the main program. It still has to do some analysis of other scopes to ensure that it knows the difference between a global constant and a local shadow, both to know when a global is non-constant and when a symbol refers to a local instead of a global. With that info, it is again a relatively short leap to the general solution (where a function could have local constants!).

Dead code elimination is a broad subject involving many techniques. I'm excited to at least support simplifying if's with constant conditions. After some const substitution and literal expression simplification has taken place, it'd be trivial to recognize "if true" and "if false" and delete unused branches. Then you can continue doing more const identification, because a variable defined exactly once in both branches of a conditional simplifies to a single definition that may qualify as a const and inspire more pruning.

That much also gets you several features you might normally want a macro preprocessor to do. Hence my earlier example of:

debug = false

if debug then
 color = 7
else
 color = 8
end

print('hi', 40, 40, color)

The first pass wouldn't recognize color as a const because it's assigned in two places, but would recognize debug as a const. The second pass would notice "if false" and prune. The third pass would notice color is only assigned once and simplify again. The final result:

print('hi', 40, 40, 8)
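The pass-by-pass behavior can be sketched on a toy statement list. The node shapes and function names below are invented for the example and are not picotool's AST; the sketch only treats top-level, single, literal assignments as constants and assumes they are assigned before any use.

```lua
-- Hypothetical statement shapes:
--   {kind="assign", name=<string>, value=<literal or {ref=<name>}>}
--   {kind="if", cond=<literal or {ref=<name>}>, body={...}, orelse={...}}
--   {kind="call", name=<string>, args={<literal or {ref=<name>}>, ...}}

local function resolve(v, consts)
  if type(v) == "table" and v.ref ~= nil and consts[v.ref] ~= nil then
    return consts[v.ref]
  end
  return v
end

-- Count assignments per name across the whole tree, branches included.
local function count_assigns(stmts, counts)
  for _, s in ipairs(stmts) do
    if s.kind == "assign" then
      counts[s.name] = (counts[s.name] or 0) + 1
    elseif s.kind == "if" then
      count_assigns(s.body, counts)
      count_assigns(s.orelse, counts)
    end
  end
  return counts
end

-- Replace references to known constants everywhere, nested bodies included.
local function substitute(stmts, consts)
  for _, s in ipairs(stmts) do
    if s.kind == "assign" then
      s.value = resolve(s.value, consts)
    elseif s.kind == "if" then
      s.cond = resolve(s.cond, consts)
      substitute(s.body, consts)
      substitute(s.orelse, consts)
    elseif s.kind == "call" then
      for i, a in ipairs(s.args) do s.args[i] = resolve(a, consts) end
    end
  end
end

function simplify(stmts)
  local consts = {}
  local changed = true
  while changed do
    changed = false
    local counts = count_assigns(stmts, {})
    local out = {}
    for _, s in ipairs(stmts) do
      if s.kind == "assign" and counts[s.name] == 1
         and type(s.value) ~= "table" then
        consts[s.name] = s.value   -- remember the value, drop the statement
        changed = true
      elseif s.kind == "if" and type(s.cond) == "boolean" then
        for _, inner in ipairs(s.cond and s.body or s.orelse) do
          out[#out + 1] = inner    -- keep only the live branch
        end
        changed = true
      else
        out[#out + 1] = s
      end
    end
    substitute(out, consts)
    stmts = out
  end
  return stmts
end

result = simplify({
  {kind = "assign", name = "debug", value = false},
  {kind = "if", cond = {ref = "debug"},
   body = {{kind = "assign", name = "color", value = 7}},
   orelse = {{kind = "assign", name = "color", value = 8}}},
  {kind = "call", name = "print", args = {"hi", 40, 40, {ref = "color"}}},
})
print(#result, result[1].name, result[1].args[4])
```

On this input the loop makes three productive passes, in the same order as described above: `debug` becomes a constant, the `if false` is pruned, then `color` becomes a constant and is substituted into the call.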

Felice • 2018-03-07 20:00

Oh yeah, having the option to do the equivalent of #if/#ifdef through constant conditions would be marvelous. I have one project I've been working on for aaages, which I might actually manage to finish, but it's so painful to debug stuff when I'm this close to the token limit. Disable this, that, and the other, run, oh damnit, now the behavior is different and the bug doesn't repro, etc., sigh.

Good point about local constants, too. There are a lot of good uses for those.
