
How much time do I have in _update()? Does PICO-8 throttle the CPU at all, or do I have the native CPU's full processing power to execute Lua script on a single core?

The answer changes my approach to solving problems entirely: if it's a memory-constrained environment with possibly a lot of CPU, I would avoid caching solutions to computable problems.

Can someone explain if the PICO-8 has any CPU limitations?

I am not asking how fast _update() ticks. I am investigating how many instructions I can fit into a single frame at 30 Hz.

P#25672 2016-07-21 13:43 ( Edited 2016-07-21 17:43)

pico-8 has its own kind of simulated cpu, and you're only "allowed" a certain amount of operations per update. if you load up the stock raycasting demo, you can see this in action by comparing looking into the corner with looking out at the rest of the level

i don't know the rundown of how "expensive" each operation is, but! you can get an idea of how much of "the pico8 cpu" you used last frame with the stat(1) function
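a minimal sketch of putting that number on screen, assuming stat(1) reports the fraction of a frame used (so 1.0 would be a full frame). cpu_label is a made-up helper name; cls, print, flr and stat are pico-8 built-ins, so this only runs inside pico-8:

```lua
-- hypothetical helper: formats last frame's cpu usage as a percentage.
-- assumes stat(1) returns a fraction where 1.0 means one whole frame.
function cpu_label()
  return "cpu: "..flr(stat(1)*100).."%"
end

function _draw()
  cls()
  -- draw the label in the top-left corner in white (color 7)
  print(cpu_label(), 0, 0, 7)
end
```

watching that value while the scene changes gives a rough feel for which parts of a frame are expensive.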

for example, my 'brakeowt' alpha (check the WIP part of the forums) prints it right on the screen as the third line

P#25674 2016-07-21 14:19 ( Edited 2016-07-21 18:19)

Please correct me if I'm wrong, but: stat(1) returns the ratio of the clock time used by _update() to the length of a frame. That number will be different for the same code depending on the host operating system and architecture. Pico-8 runs its Lua interpreter at full blast in the host environment, and syncs the game loop to the frame clock. A faster host machine should be able to do more in 1/30th of a second than a slower one.

It'd be interesting if it did some sort of clock sync for the Lua bytecode or something but I don't think it does. It may not be necessary, even with the increasing range of platforms running Pico-8 (HTML export, PocketCHIP, desktop OSes, RetroPie).

P#25742 2016-07-22 18:46 ( Edited 2016-07-22 22:46)

I think you get about 70,500 pseudo-cycles per _update60() and about 141,000 per _update(). I forget the exact number, but I think it was about 4,240,000 cycles per second (4.24MHz clock).
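As a quick sanity check on those figures, here is the arithmetic as a plain Lua sketch meant for a desktop interpreter (the 4,240,000 value does not fit in a PICO-8 number). The clock estimate is the rough, undocumented figure from this post, and cycles_per_frame is a name made up for illustration:

```lua
-- Estimated pseudo-cycles per second, taken from the post (approximate).
clock_hz = 4240000

-- Hypothetical helper: per-frame budget if the clock is split evenly over frames.
function cycles_per_frame(fps)
  return clock_hz / fps
end

print(cycles_per_frame(60)) -- budget for _update60()
print(cycles_per_frame(30)) -- budget for _update()
```

The results land a little above 70,500 and 141,000, which fits with the clock number itself being only approximate.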

One math operation (*/+-%) is a single pseudo-cycle, as is an assignment. A function call is a handful, I think 3 with no args and +1 for each arg, plus the cost of the function itself. Everything has a cost, some more than others. Often you can just look at the token count for a line of code and get a rough idea.
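That rule of thumb can be written down as a tiny estimator. This is only a sketch of the guess above (3 for the call, +1 per argument, plus the body), not a documented cost table; call_cost and its parameters are hypothetical names:

```lua
-- Hypothetical estimator based on the guessed costs in this post.
-- nargs: number of arguments passed to the function
-- body_cost: estimated pseudo-cycles spent inside the function
function call_cost(nargs, body_cost)
  local call_overhead = 3 -- guessed base cost of a call with no args
  return call_overhead + nargs + body_cost
end

-- e.g. a two-argument function whose body is one multiply and one add
print(call_cost(2, 2))
```

For anything that matters, measuring the real code is more reliable than an estimate like this.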

This is not dictated by your physical processor, but by the simulated processor in PICO-8. Most modern processors could outrun this performance by a thousand times easily.

Note: This is all undocumented and subject to change. It's just information I've derived from some benchmarking code I wrote.

P#25763 2016-07-22 22:35 ( Edited 2016-07-23 03:43)

Have you checked your benchmarks with varied host processors? The closest thing to a "processor" in Pico-8 is just the Lua bytecode interpreter. I'm pretty sure Pico-8 doesn't compile Lua to anything else, nor does the interpreter itself run on a simulated processor. I can imagine hacking the bytecode interpreter with some timing normalization, but I'd still be surprised if this is necessary.

I'd also be curious to see how your benchmarks compare to the official Lua 5.3 interpreter on the same host hardware. Is that easy to do?

P#25766 2016-07-23 01:29 ( Edited 2016-07-23 05:29)

Shouldn't be too hard. The code is just plain Lua. Lemme take a look.

P#25767 2016-07-23 02:23 ( Edited 2016-07-23 06:24)

Okay, here we go. I should give a little background on how I've been testing this first, though.

Through a series of rather imprecise tests, I figured out roughly what the minimum time spent on any PICO-8 Lua operation was. Basically, I looked for some operation whose duration evenly divided the durations of every other operation. Unsurprisingly, a local variable assignment (including PICO-8's shorthand op-assigns, e.g. '+=') was the smallest unit of operation. I called its duration a cycle.

Edit: I accidentally deleted a paragraph here. Here it is:
I then figured out that this duration was the same length as one iteration of an empty numeric-range loop (e.g. for i=1,10 do). I increased the number of iterations on that loop until it was exactly one second long, or at least as exactly as PICO-8's time() function would let me know, since it's kind of imprecise. I had to guess at rounding mode and some other stuff, but the number of iterations I ended at was pretty close I think. That number is the assumed clock frequency, roughly 4.24 MHz.

To test other operations or pieces of code, I just put them inside the previously-empty loop and time it. The cycle count for the tested code is simply the number of seconds the loop now takes, minus the one-second overhead for the loop iteration itself.
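The conversion from a timing to a cycle count can be sketched like this. It assumes the loop was calibrated so that the empty version runs for the baseline time (one second here) at one cycle per iteration; cycles_for is a made-up name:

```lua
-- Hypothetical helper: pseudo-cycles per iteration for the code under test.
-- measured_seconds: how long the calibrated loop takes with the code inside
-- baseline_seconds: how long the same loop takes when empty (1 s in this post)
function cycles_for(measured_seconds, baseline_seconds)
  baseline_seconds = baseline_seconds or 1
  -- each extra baseline-length of time is one more cycle per iteration
  return (measured_seconds - baseline_seconds) / baseline_seconds
end

-- e.g. a loop that takes 3 seconds with the tested code in it
print(cycles_for(3))
```

So a loop that stretches from one second to three means the tested code costs about two cycles per iteration.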

Doing the same empty loop for one second, with lua53.exe on Win7x64 on a 4GHz Intel Core i7 3930K, requires almost exactly 50x the number of iterations. So PICO-8 runs its Lua code at about 2% of the speed at which my real computer runs its Lua code.
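For anyone who wants to repeat the desktop half of this, here is a rough plain-Lua sketch rather than the exact script used. It assumes os.clock() (CPU time in seconds) is a good enough timer, and the function names and the iteration count are arbitrary choices:

```lua
-- Plain Lua 5.3 sketch for a desktop interpreter, not for PICO-8.
-- Times an empty numeric for-loop of n iterations.
function empty_loop_seconds(n)
  local t0 = os.clock()
  for i = 1, n do end
  return os.clock() - t0
end

-- Estimates how many empty iterations fit in one second.
function iterations_per_second(n)
  local elapsed = empty_loop_seconds(n)
  if elapsed <= 0 then return nil end -- n too small for the timer resolution
  return n / elapsed
end

print(iterations_per_second(50000000))
```

Dividing the printed rate by the PICO-8 estimate of about 4.24 million gives the speed ratio on a given host.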

Also, since I'm mucking with numbers...

The lua53.exe "cpu", if we can call it that, running 50x faster than PICO-8's "cpu" indicates that it runs at an equivalent of around 50 * 4.24MHz = 212MHz on my 4GHz machine.

Therefore my PC's Lua interpreter is getting these simple operations done in about 4GHz/212MHz = ~19 real cycles each. Which sounds about right from what I remember of the Lua interpreter code. I think a lot of the more-complex operations would take considerably longer on PC, e.g. adding entries to a table. However, if we're just talking basic stuff like math, that's probably close.
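Spelled out as plain Lua for a desktop interpreter, using only the numbers from this post (variable names are just for illustration):

```lua
pico8_hz = 4.24e6 -- estimated PICO-8 pseudo-clock, cycles per second
speedup  = 50     -- measured ratio of lua53.exe to PICO-8
host_hz  = 4e9    -- host CPU clock, cycles per second

-- equivalent "Lua clock" of the desktop interpreter
lua_equiv_hz = speedup * pico8_hz

-- real CPU cycles spent per simple Lua operation
host_cycles_per_op = host_hz / lua_equiv_hz

print(lua_equiv_hz)
print(host_cycles_per_op)
```

The second value comes out just under 19, which is where the ~19 figure comes from.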


TL;DR: Win7x64 @4GHz runs simple Lua operations ~50x faster than PICO-8 @4.24MHz does. If clock rates were equal, PICO-8 would run Lua ~19x faster than the PC does.

P#25768 2016-07-23 04:00 ( Edited 2016-07-23 09:00)

Argh, I somehow deleted a paragraph. I edited it back in. See the paragraph under the italicized "Edit:" line.

P#25770 2016-07-23 04:58 ( Edited 2016-07-23 08:58)
