
I noticed that split() splits the string at the first character of the separator if the separator is two or more characters long.
Is this a bug, or is it by design that split() cannot be used with separators of two or more characters?

foreach(split('123456','45'),print)
--123
--56 (expected: 6)

Thank you.


If there were no limit on the length of the separator, string replacement could be implemented compactly, as in replace_short():

function replace_short(s,f,r)
  return join(split(s,f),r or '')
end

-- join() is a custom function

function replace(s,f,r)
 local a=''
 while #s>0 do
  local t=sub(s,1,#f)
  a=a..(t~=f and sub(s,1,1) or r or '')
  s=sub(s,t==f and 1+#f or 2)
 end
 return a
end
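
The custom join() called by replace_short() is not shown in the post. A minimal sketch, assuming it simply concatenates the table entries with a separator between them (the signature follows the call above; the body is an assumption, not the poster's code):

```lua
-- hypothetical join(): concatenate the entries of t,
-- placing sep between consecutive entries
function join(t,sep)
 local s=""
 for i=1,#t do
  if i>1 then s=s..sep end
  -- entries may be numbers (split() converts numeric
  -- tokens by default); .. turns them into strings
  s=s..t[i]
 end
 return s
end
```

With this, join({"a","b","c"},"-") gives "a-b-c", and an empty table gives an empty string.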

P#107090 2022-02-17 13:37 ( Edited 2023-06-15 13:36)

Pretty sure separator can only be one character.
There is an alternative: if you pass a number, then the string will be split in groups of this number of characters.

P#107111 2022-02-17 17:24

Ah, I =JUST= now ran into this same problem, @shiftalow and @merwok. So yes, I would like it if @zep would let the SPLIT() command recognize separators of two or more characters.

a="apple::banana"
a=split(a,"::")
cls()
?a[1] -- correct value
?a[2] -- incorrect value

The result is APPLE followed by a blank line.
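
As a stopgap, a separator of any length can be handled with a small helper built on sub(). This is only a sketch: split_multi is a hypothetical name, and unlike split() it always returns strings and does no number conversion. The first line is a shim so the sketch also runs in stock Lua; PICO-8 already provides sub().

```lua
-- shim: PICO-8 has a global sub(); stock lua uses string.sub
local sub=sub or string.sub

-- hypothetical helper: split s on a separator of any length
function split_multi(s,sep)
 -- an empty separator cannot match; return s unsplit
 if #sep==0 then return {s} end
 local out,start,i={},1,1
 while i<=#s-#sep+1 do
  if sub(s,i,i+#sep-1)==sep then
   -- store the piece before the separator, then skip it
   out[#out+1]=sub(s,start,i-1)
   i=i+#sep
   start=i
  else
   i=i+1
  end
 end
 -- whatever follows the last separator
 out[#out+1]=sub(s,start)
 return out
end
```

With this helper, split_multi("apple::banana","::") yields "apple" and "banana", and split_multi("123456","45") yields "123" and "6".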

I especially wanted it for the following:

function instr(a,b)
  return #split(a,b)[1]+1
end

This works fine when scanning for a single character. However, print(instr("coconut","on")) returns 2 instead of the expected 4, which shows that split() is only looking for the first character of the separator.

P#107185 2022-02-18 17:38 ( Edited 2022-02-18 17:43)

guess you have to write instr using sub!

P#107191 2022-02-18 20:20

Oh I've got INSTR() done, @merwok, using SUB() as you said.

-- return the position in string a
-- where string b is first found
-- (0 if it is not found).
function instr(a,b)
  local r=0
  if a>"" and b>"" then
    for i=1,#a-#b+1 do
      if sub(a,i,i+#b-1)==b then
        r=i
        break
      end
    end
  end
  return r
end

I was going to put together a bunch of string and numeric functions a bit later, that's one of them.

P#107203 2022-02-18 21:46

@merwok

Ok, I wanted to use it for replace(), but I decided to solve it another way!
> if you pass a number, then the string will be split in groups of this number of characters.

By the way, I use that feature (passing a number as the separator) to convert hexadecimal strings into decimal numbers.

function hextodecimal(b,n)
 local a={}
 foreach(split(b,n or 2),function(v)
  add(a,tonum('0x'..v))
 end)
 return a
end
P#107279 2022-02-20 01:01

@dw817

Oh, that's great. I found a solution with my replace() using your INSTR as a hint!
Thank you!

function replace(s,f,r)
 local a,i='',1
 while i<=#s do
  if sub(s,i,i+#f-1)~=f then
   a..=sub(s,i,i)
   i+=1
  else
   a..=r or ''
   i+=#f
  end
 end
 return a
end
P#107281 2022-02-20 01:02

Glad to help, @shiftalow. :)

And now that we are at it, here is what I wrote earlier to do a string replace.

-- return replacement of all
-- instances in string a of b
-- with c.
function replace(a,b,c)
  local r,i=a,1
  if a>"" and b>"" and c>"" then
    r=""
    repeat
      if sub(a,i,i+#b-1)==b then
        r=r..c
        i=i+#b-1
      else
        r=r..sub(a,i,i)
      end
      i=i+1
    -- scan to the end of a so that a trailing
    -- non-matching tail is not dropped
    until i>#a
  end
  return r
end

It's always curious and often enlightening to see someone else's solution to code you wrote yourself.

P#107288 2022-02-20 02:45 ( Edited 2022-02-20 02:46)

note that you can now use tonum("aa",1) to convert a byte (two hex chars) to a number!

P#107325 2022-02-20 11:34

@shiftalow

Lately I've been encoding nybbles as 0123456789:;<=>?, which is 16 sequential characters in ascii/p8scii order, so that I can use ord(s,i)-48 (because ord("0") is 48) to get the nybble's value without extracting/creating substrings. This has the downside that the nybble-encoded data isn't as readable (unless you memorize the new "digits" for A-F/10-15), but it simplified a lot of my code.

For instance, if I were to specialize your function for bytes:

function hexbytetodecimal(b)
 local a={}
 for i=1,#b,2 do
  -- -816 == -48*16-48
  add(a,ord(b,i)*16+ord(b,i+1)-816)
 end
 return a
end

This is also nice for unpacking to memory: you just replace the table add() with a poke().

Similarly, you'd build up such a string with b..=chr(byte>>>4|48)..chr(byte&15|48).
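
To make the round trip concrete, here is a minimal sketch of an encoder paired with the decoder above. The names encode_nybbles and decode_nybbles are hypothetical, and the arithmetic uses flr() and % instead of the shift and mask operators so that it also runs in stock Lua; the three shim lines are only needed outside PICO-8.

```lua
-- shims: PICO-8 provides ord(), chr() and flr() as globals
local ord=ord or string.byte
local chr=chr or string.char
local flr=flr or math.floor

-- hypothetical: table of bytes (0..255) -> nybble string
-- using the 16 characters "0".."?" (codes 48..63)
function encode_nybbles(bytes)
 local s=""
 for i=1,#bytes do
  local b=bytes[i]
  -- high nybble first, then low nybble
  s=s..chr(48+flr(b/16))..chr(48+b%16)
 end
 return s
end

-- hypothetical: nybble string -> table of bytes
function decode_nybbles(s)
 local out={}
 for i=1,#s,2 do
  -- -816 == -48*16-48 removes both character offsets
  out[#out+1]=ord(s,i)*16+ord(s,i+1)-816
 end
 return out
end
```

For example, encode_nybbles({0,255,171}) gives "00??:;", and decoding that string returns the original three bytes.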

P#107470 2022-02-22 11:30 ( Edited 2022-02-26 10:45)

@merwok

Thanks!
I've created a new thread about tonum()!
https://www.lexaloffle.com/bbs/?tid=42448


@Felice

Oh! That's an interesting approach!
If split() is given a stricter CPU cost in the future, it would be worth considering.

P#107595 2022-02-24 12:42
