
I noticed that split() splits the string at the first character of the separator if the separator is two or more characters long.
Is it a bug, or is it by design that split() cannot be used with separators of two or more characters?

Thank you.


If there were no limit on the length of the separator, string replacement could be implemented compactly (as in replace_short()).

function replace_short(s,f,r)
  return join(split(s,f),r or '')
end
-- join() is custom function
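
The join() helper used above is not shown in the post and is not built into PICO-8. A minimal sketch of what it might look like, assuming it takes a table and a delimiter in the same order replace_short() passes them:

```lua
-- hypothetical join(): concatenate
-- the elements of t, placing d
-- between neighbours
function join(t,d)
 local s=""
 for i=1,#t do
  if i>1 then s=s..d end
  s=s..t[i]
 end
 return s
end
```

With this, join({"a","b","c"},"-") gives "a-b-c".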

function replace(s,f,r)
 local a=''
 while #s>0 do
  local t=sub(s,1,#f)
  a=a..(t~=f and sub(s,1,1) or r or '')
  s=sub(s,t==f and 1+#f or 2)
 end
 return a
end

P#107090 2022-02-17 13:37

:: merwok

Pretty sure separator can only be one character.
There is an alternative: if you pass a number, then the string will be split in groups of this number of characters.

P#107111 2022-02-17 17:24
:: dw817

Ah, I =JUST= now ran into this same problem, @shiftalow and @merwok. So yes, I would like it if @zep would allow SPLIT() to recognize separators of 2 or more characters.

a="apple::banana"
a=split(a,"::")
cls()
?a[1] -- correct value
?a[2] -- incorrect value

Result is APPLE and a blank line.

I especially wanted it for the following:

function instr(a,b)
  return #split(a,b)[1]+1
end

This works fine when scanning for a single character. However, print(instr("coconut","on")) returns the incorrect answer of 2 (the correct position is 4), showing that split() looks only at the 1st character of the separator.
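
In the meantime, a multi-character split can be sketched with sub(). The name split_multi is hypothetical and not part of PICO-8. Unlike the built-in split(), this sketch always returns the pieces as strings and assumes a non-empty separator:

```lua
-- hypothetical helper: split s on a
-- separator of any length (not part
-- of the pico-8 api)
function split_multi(s,sep)
 -- guard: an empty separator would
 -- never advance the scan
 if #sep==0 then return {s} end
 local t,start,i={},1,1
 while i<=#s-#sep+1 do
  if sub(s,i,i+#sep-1)==sep then
   -- close the current piece
   t[#t+1]=sub(s,start,i-1)
   i=i+#sep
   start=i
  else
   i=i+1
  end
 end
 -- whatever is left is the last piece
 t[#t+1]=sub(s,start)
 return t
end
```

Here split_multi("apple::banana","::") gives a table of two pieces, "apple" and "banana".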

P#107185 2022-02-18 17:38 ( Edited 2022-02-18 17:43)
:: merwok

guess you have to write instr using sub!

P#107191 2022-02-18 20:20
:: dw817

Oh I've got INSTR() done, @merwok, using SUB() as you said.

-- return from string a where
-- string b is first found.
function instr(a,b)
local r=0
  if a>"" and b>"" then
    for i=1,#a-#b+1 do
      if sub(a,i,i+#b-1)==b then
        r=i
        break
      end
    end
  end
  return r
end

I was going to put together a bunch of string and numeric functions a bit later, that's one of them.

P#107203 2022-02-18 21:46

@merwok

Ok, I wanted to use it for replace(), but I decided to solve it another way!
> if you pass a number, then the string will be split in groups of this number of characters.

By the way, I use the number form of the argument to convert hexadecimal strings into decimal numbers.

function hextodecimal(b,n)
 local a={}
 foreach(split(b,n or 2),function(v)
  add(a,tonum('0x'..v))
 end)
 return a
end
P#107279 2022-02-20 01:01

@dw817

Oh, that's great. I found a solution with my replace() using your INSTR as a hint!
Thank you!

function replace(s,f,r)
 local a,i='',1
 while i<=#s do
  if sub(s,i,i+#f-1)~=f then
   a..=sub(s,i,i)
   i+=1
  else
   a..=r or ''
   i+=#f
  end
 end
 return a
end
P#107281 2022-02-20 01:02
:: dw817

Glad to help, @shiftalow. :)

And now that we are at it, here is what I wrote earlier to do a string replace.

-- return replacement of all
-- instances in string a of b
-- with c.
function replace(a,b,c)
local r,i=a,1
  if a>"" and b>"" and c>"" then
    r=""
    repeat
      if sub(a,i,i+#b-1)==b then
        r=r..c
        i=i+#b-1
      else
        r=r..sub(a,i,i)
      end
      i=i+1
    until i>#a -- run to the end so trailing characters are kept
  end
  return r
end

It's always curious and often enlightening to see someone else's solution to code you wrote yourself.

P#107288 2022-02-20 02:45 ( Edited 2022-02-20 02:46)
:: merwok

note that you can now use tonum("aa",1) to convert a byte (two hex chars) to a number!

P#107325 2022-02-20 11:34
:: Felice

@shiftalow

Lately I've been encoding nybbles as 0123456789:;<=>?, which is 16 sequential characters in ascii/p8scii order, so that I can use ord(s,i)-48 (because ord("0") is 48) to get the nybble's value without extracting/creating substrings. This has the downside that the nybble-encoded data isn't as readable (unless you memorize the new "digits" for A-F/10-15), but it simplified a lot of my code.

For instance, if I were to specialize your function for bytes:

function hexbytetodecimal(b)
 local a={}
 for i=1,#b,2 do
  -- -816 == -48*16-48
  add(a,ord(b,i)*16+ord(b,i+1)-816)
 end
 return a
end

This is also nice for unpacking to memory, you just replace the table add with a poke.

Similarly, you'd build up such a string with b..=chr(byte>>>4|48)..chr(byte&15|48).
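
To make the round trip concrete, here is a rough sketch of both directions. The helper names bytestonyb and nybtobytes are hypothetical, and the encoder uses flr() and % as an arithmetic equivalent of the shift and mask, assuming every byte is an integer from 0 to 255:

```lua
-- hypothetical helper: pack a table
-- of bytes (0..255) into a string of
-- "0".."?" nybble characters
function bytestonyb(t)
 local s=""
 for i=1,#t do
  local byte=t[i]
  -- high nybble, then low nybble
  s=s..chr(48+flr(byte/16))..chr(48+byte%16)
 end
 return s
end

-- hypothetical helper: decode the
-- same way hexbytetodecimal() does
function nybtobytes(s)
 local t={}
 for i=1,#s,2 do
  t[#t+1]=(ord(s,i)-48)*16+ord(s,i+1)-48
 end
 return t
end
```

For example, the bytes 0, 171 and 255 encode to "00:;??", and decoding that string returns the same three values.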

P#107470 2022-02-22 11:30 ( Edited 2022-02-26 10:45)

@merwok

Thanks!
I've created a new thread about tonum()!
https://www.lexaloffle.com/bbs/?tid=42448


@Felice

Oh! That's an interesting approach!
If split() gets a stricter CPU cost in the future, it would be worth considering.

P#107595 2022-02-24 12:42
