
I have started measuring function costs precisely, because I like accurate things. The results are all on the wiki, though the list is not complete yet.

Here are a few funny things I already learned:

  • <code>x^.5</code> costs 16 cycles, whereas <code>sqrt(x)</code> costs 27
  • <code>x^4</code> costs 8 cycles, but <code>x*x*x*x</code> only costs 3

Some of these, such as clipped <code>circ()</code>, are pretty tricky to measure. I hope someone can help!

Edit: removed claim about shl() because that function behaves a bit differently.

P#60180 2018-12-20 18:06 ( Edited 2018-12-20 21:36)

Out of interest, how are you measuring these things in the first place? Some sort of sampling profiler?

P#60190 2018-12-21 01:58

I just use stat(1) and stat(2) and call the code 1024 times:

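n = 1024

-- calibrate: time an empty loop (x,y from stat(1); t,u from stat(2))
flip() x,t=stat(1),stat(2) for i=1,n do end y,u=stat(1),stat(2)

-- measure sqrt(i): the same loop with the call under test in its body
for i=1,n do sqrt(i) end z,v=stat(1),stat(2)

-- convert the extra time of the second loop into cycles per iteration
function c(t0,t1,t2) return(t0+t2-2*t1)*128/n*256/60*256 end
print("lua cycles: "..c(x-t,y-u,z-v))
print("system cycles: "..c(t,u,v))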

This prints the cycle counts for sqrt(i):

lua cycles: 3
system cycles: 24
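
The `t0+t2-2*t1` term in `c()` is just "second loop minus first loop" written in one step. A minimal sketch in plain Lua, using made-up readings and a hypothetical helper name (`loop_body_cost` is not part of the snippet above):

```lua
-- hypothetical helper: cost of the loop body alone, given readings taken
-- before the empty loop (t0), after it (t1), and after the full loop (t2)
function loop_body_cost(t0, t1, t2)
  local empty_loop = t1 - t0   -- loop overhead only
  local full_loop = t2 - t1    -- loop overhead plus the measured call
  return full_loop - empty_loop
end

-- made-up readings, as fractions of a frame
local t0, t1, t2 = 0.10, 0.15, 0.40
print(loop_body_cost(t0, t1, t2))
-- the same quantity, written the way c() writes it
print(t0 + t2 - 2*t1)
```

Both lines print the same value. Note that this difference comes out negative whenever the empty loop happens to measure longer than the full one.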
P#60198 2018-12-21 09:31

Thanks - didn't know about stat(2).
Perhaps if I knew anything about Lua internals I'd be nodding sagely at this point, but I guess I've got some reading up to do.

P#60227 2018-12-22 01:16

I've done a bunch of this.

Be aware that there are periodic interrupts, possibly for audio, so timing long-running code can return unexpected results.

Also note that the kind of variable affects the timing, e.g. sqrt(i) will cost a different amount depending on whether 'i' is local or global.
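
One possible guard against the interrupts mentioned above is to repeat the measurement several times and keep the smallest result. This is only a sketch, and it assumes that an interruption can only add time to a run, never remove it. The helper name `min_sample` and the sample values are made up for illustration:

```lua
-- hypothetical helper: smallest value in a list of cycle estimates
function min_sample(samples)
  local best = samples[1]
  for i = 2, #samples do
    if samples[i] < best then best = samples[i] end
  end
  return best
end

-- made-up results from five runs; the fourth one looks disturbed
print(min_sample({27, 27, 28, 41, 27}))
```

Keeping each timed loop short also reduces the chance that a run overlaps an interrupt in the first place.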

P#60233 2018-12-22 09:30

this snippet probably saved my game!

is the '60' for 60 fps? should I change it for a 30 fps game?

function c(t0,t1,t2) return(t0+t2-2*t1)*128/n*256/60*256 end

btw, I get a minus value with this sometimes, any ideas what I'm doing wrong?

P#82410 2020-09-28 20:03

I've expanded on this snippet, and written out an explanation of what every term in that calculation is doing: https://www.lexaloffle.com/bbs/?tid=46117

(tl;dr: 128*256*256 comes from pico-8's speed (8MHz), and the 60 comes from 60fps)
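
To make that concrete, here is the same conversion with the constants named, as a rough sketch. The function name `cycles_per_call` and the `fps` parameter are assumptions added for illustration; the original snippet hard-codes 60, and whether another value is correct for a 30 fps cart is not settled in this thread.

```lua
-- sketch: turn a fraction of a frame into cycles per loop iteration.
-- frac: extra frame fraction used by the measured loop
-- n:    number of loop iterations
-- fps:  frames per second that frac is relative to (assumed parameter)
function cycles_per_call(frac, n, fps)
  -- 128*256*256 = 2^23 = 8388608 cycles per second (the "8MHz").
  -- the multiplications are interleaved with the divisions, as in the
  -- original, so intermediate values stay small; pico-8 numbers are
  -- 16.16 fixed point and cannot hold 2^23.
  return frac*128/n*256/fps*256
end

-- made-up example: 1024 calls that used 20% of a 60fps frame
print(cycles_per_call(0.2, 1024, 60))
```

With these made-up inputs the result is a little over 27 cycles per call.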

P#104805 2022-01-11 08:59 ( Edited 2022-01-11 08:59)
