I'm struggling with a function that randomly picks an element from a table with weighted probability. This is my current code:
```lua
items = {
 {value="a",probability=.35},
 {value="b",probability=.35},
 {value="c",probability=.145},
 {value="d",probability=.145},
 {value="e",probability=.008},
 {value="f",probability=.002}
} -- total probability = 1

function rnd_item()
 local rnd_number = rnd() -- between [0,0.99999)
 local i = 1
 local sum = items[i].probability
 while sum < rnd_number do
  i += 1
  sum += items[i].probability
 end
 return items[i]
end
```
What is the problem with the implementation? In some cases it crashes because `i` goes out of bounds. It seems that the problem occurs when rnd_number is near 1 (e.g. 0.99997), but I can't figure out how to fix it.
Is there a better implementation to solve the problem?
It looks like it's a decimal-to-binary conversion rounding error: when I run
```lua
numbers={.35,.35,.145,.145,.008,.002}
sum=0
for i=1,#numbers do
 sum+=numbers[i]
 print(tostr(numbers[i],true))
end
print(tostr(sum,true))
```
the numbers add up to 0x0.fffd < 0x0.ffff
I can think of three possible ways to handle it:

1. Instead of entering weights as decimal numbers, enter them as hexadecimal numbers with up to four hexadecimal places. This way there's no rounding and therefore no rounding error, and you can ensure everything adds to 1.
2. Enter the weights unnormalized (e.g. 350, 350, 145, 145, 8, 2), calculate the sum, and either:
2a. Normalize programmatically, dividing each weight but the last by the sum of the weights and setting the last to the remainder, or
2b. use

```lua
local rnd_number = rnd(sum)
```

to generate a random number in the range [0, sum).
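A sketch of method 2a in PICO-8 Lua (untested; it assumes an `items` table shaped like the original post's, with the raw integer weight kept in a hypothetical `weight` field):

```lua
-- method 2a (sketch): divide every weight but the last by the total,
-- and give the last item the remainder so everything sums to exactly 1
function normalize(items)
 local sum=0
 for it in all(items) do sum+=it.weight end
 local acc=0
 for i=1,#items-1 do
  items[i].probability=items[i].weight/sum
  acc+=items[i].probability
 end
 items[#items].probability=1-acc
end
```

Because the last probability is defined as `1-acc` rather than by division, the fixed-point rounding error lands in the last slot instead of pushing the cumulative sum below 1.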
+1 to everything packbat said, and +1 especially to method 2b. This is what I use:
```lua
function choose_weighted(opts)
 -- given opts that looks like {
 --  {ch_ground,3},
 --  {ch_key,1},
 --  {ch_ladder,6},
 -- }, returns one of the first elements,
 -- weighted based on the second elements
 local sum=0
 for e in all(opts) do
  sum+=e[2]
 end
 if sum==0 then return nil end
 local rng=rnd(sum)
 for e in all(opts) do
  rng-=e[2]
  if rng<0 then return e[1] end
 end
 assert(false) -- it should be impossible to execute this line
end
```
some test code:
```lua
table={
 {"x",20},
 {".",2},
 {"",10},
}
cls()
s=""
for y=1,20 do
 for x=1,20 do
  s..=choose_weighted(table)
 end
 s..="\n"
end
print(s)
```
result:
...actually, just realized there's a Method 3:
1. Enter the weights unnormalized.
2. Run a calculation to replace the weights in the table with cumulative sums.
3. Use the last entry in the table as the value for rnd(value) and just compare entry-by-entry each time you roll.
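Those three steps might look like this (untested sketch; `to_cumulative` and `pick` are hypothetical names, and `opts` uses the same {value, weight} pair shape as choose_weighted above):

```lua
-- method 3 (sketch): replace raw weights with running totals, once...
function to_cumulative(opts)
 local acc=0
 for e in all(opts) do
  acc+=e[2]
  e[2]=acc
 end
end

-- ...then each roll is rnd(last total) plus a linear scan
function pick(opts)
 local r=rnd(opts[#opts][2])
 for e in all(opts) do
  if r<e[2] then return e[1] end
 end
end
```

Since rnd(n) returns values strictly below n, the scan always terminates at or before the last entry.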
Or, alternatively, enter the normalized cumulative sums directly:
numbers={.35,.7,.845,.99,.998,1}
Edit: (not at computer, this is not tested)
```lua
items = {
 --                            cumulative probabilities
 {value="a",probability=.35},  -- .35
 {value="b",probability=.7},   -- .35
 {value="c",probability=.845}, -- .145
 {value="d",probability=.99},  -- .145
 {value="e",probability=.998}, -- .008
 {value="f",probability=1}     -- .002
} -- total probability = 1

function rnd_item()
 local rnd_number = rnd() -- between [0,0.99999)
 for i=1,#items do
  if items[i].probability > rnd_number then
   return items[i] -- will exit loop
  end
 end
end
```
Edit 2: Did a Monte Carlo test in PICO-8 and it seems to be working just fine. Probably the best method from a tokens/CPU perspective: the rnd_item() function is 26 tokens and not a lot of cycles per function call (sorting the list from most likely to least likely, as you did, is optimal). But it's a bit of a pain because you have to hand-calculate everything in advance, and tweaking is inconvenient.
Here's a fast alternative, using a bag of tokens:
```lua
bag={}

function to_bag(e,n)
 for i=1,n do add(bag,e) end
end

function from_bag()
 return bag[1+rnd(#bag)\1]
end
```
test code, borrowed from pancelor:
```lua
to_bag('x', 20)
to_bag('.', 2)
to_bag('', 10)
cls()
s=""
for y=1,20 do
 for x=1,20 do
  s..=from_bag()
 end
 s..="\n"
end
print(s)
```
this will use memory, depending on the granularity you need, but a hundred tokens is nothing and a thousand is still not much (any "big" thing you put in the bag is stored as a reference).
you could also shuffle the bag before use, and address the tokens in sequence. this way you can replay the same sequence if needed. also when you get to the end of the bag you're sure your tokens were uniformly distributed.
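A shuffled-bag sketch (untested; `shuffle` and `next_from_bag` are hypothetical names building on the `bag` above; shuffle once up front, then deal in sequence):

```lua
-- fisher-yates shuffle of the bag
function shuffle(t)
 for i=#t,2,-1 do
  local j=1+rnd(i)\1 -- random index in [1..i]
  t[i],t[j]=t[j],t[i]
 end
end

-- deal tokens in sequence, wrapping at the end;
-- reset bag_i to 0 to replay the same sequence
bag_i=0
function next_from_bag()
 bag_i=bag_i%#bag+1
 return bag[bag_i]
end
```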
@packbat Oh shit, I'm sorry! Your avatar confused me...
@ultrabrite With your solution we lose the weighted random selection, don't we?