Speako8 Speech Synthesis Library

bikibird • 2022-08-30 2022-08-30 19:14

Cart #speako8-2 | 2022-08-30 | License: CC4-BY-NC-SA

Add speech synthesis to your games with Speako8, a speech synthesis library for PICO-8 in under a thousand tokens! It's loosely based on a Klatt synthesizer and will remind some folks of Software Automatic Mouth (S.A.M.).

To add Speako8 to your games, copy and paste the library below:

Speako8's voice must be configured prior to first use:

function _init()
  -- values are assigned in the same order as the variables are listed
  spk8_pitch,spk8_rate,spk8_volume,spk8_quality,spk8_intonation,spk8_if0,spk8_shift,spk8_bandwidth,spk8_whisper=
  140,1,1,.5,10,10,1,1,1
end

Warning: This application may generate loud and harsh sounds. Protect your hearing! Do not test with headphones on, and turn the volume down before you start.

Variable	Range	Explanation
spk8_pitch	60-230	Fundamental pitch of voice (F0) in hertz
spk8_rate	.1-2	Standard rate of speech divisor— below 1 is slower; above 1 is faster.
spk8_volume	.1-2	Standard volume factor— below 1 is quieter; above 1 is louder.
spk8_quality	.1-2	Glottis open period— below .5 is creakier voice; .5 is modal voice; 1 is breathy voice; above 1 is weaker voice.
spk8_intonation	0-20	Degree of pitch prosody in hertz— set to zero for robotic monotone.
spk8_if0	0-20	Degree to which the inherent pitch (F0) of vowels varies (in hertz)— set to zero for robotic monotone.
spk8_shift	.8-1.2	Factor by which to shift formant frequencies (F1, F2, F3). Above 1 raises the formant frequencies; below 1 lowers them.
spk8_bandwidth	.5-5	Factor by which to alter bandwidth of formant frequencies (F1, F2, F3)— below 1 narrows the bandwidth of the formants; above 1 widens. When formants are shifted upward, it is recommended to widen bandwidths as well.
spk8_whisper	1 or 2	Speaking mode— Normal voice is 1; whispering is 2.
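
As a rough illustration of how the variables in the table combine, the sketch below applies named presets. The helper set_voice and the voices table are hypothetical (they are not part of Speako8). The preset values are untested guesses picked from inside the documented ranges, and the fallbacks are the defaults from the _init example above.

```lua
-- hypothetical presets; not part of speako8. values are guesses
-- picked from inside the documented ranges.
voices = {
  normal  = {},
  whisper = {whisper=2},
  robot   = {intonation=0, if0=0},  -- monotone
  -- formants shifted up, so the bandwidth is widened as recommended
  high    = {pitch=220, shift=1.15, bandwidth=1.5},
}

-- hypothetical helper: set every spk8_* global, falling back to the
-- defaults from the _init example for any field the preset omits.
-- (0 is truthy in lua, so an explicit 0 is kept.)
function set_voice(v)
  spk8_pitch      = v.pitch      or 140
  spk8_rate       = v.rate       or 1
  spk8_volume     = v.volume     or 1
  spk8_quality    = v.quality    or .5
  spk8_intonation = v.intonation or 10
  spk8_if0        = v.if0        or 10
  spk8_shift      = v.shift      or 1
  spk8_bandwidth  = v.bandwidth  or 1
  spk8_whisper    = v.whisper    or 1
end
```

Calling set_voice(voices.whisper) before say(...) should switch to whispering, assuming the settings can be changed between utterances (a later comment in this thread reports changing spk8_whisper and spk8_rate at runtime).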

To enable speech synthesis, you must call speako8 in the _update function. Then call say with a speech string:

function _update() 

  speako8()  -- must appear once unconditionally in _update.

  if btnp(5) then --❎ button
    -- use speaking if you want to check if anything is currently being said.
    -- if not speaking() then
    say("_/hh/-1.57/eh/-1.07/l/-1.33/3/ow/-1.03/-3/w/1.27/-3/er/1.65/-3/l/-1.64/-3/_/d")
    -- end
  elseif btnp(4) then -- 🅾️ button
    mute()  --flushes the sound queue and immediately stops whatever is currently being said.
  end
end

Speech strings represent text phonetically and include prosody markup. The easiest way to create and test them is with the Declare web app. You can also try out different voice options with it.

If you are interested in learning more about speech synthesis, try googling these key words: Klatt synthesizer, formant, acoustic phonetics. It's a fascinating topic. An unminimized version of the library is included in the demo cart.

Special thanks to the gang on the Discord server for lending me their ears and keeping me motivated, especially @packbat, who did it for science. Thanks also to @IMLXH for finding the Klatt prosody rules.

pcm speech synthesis sam

dw817 • 2022-08-30 2022-08-30 19:51

Ah, you cannot enter in just raw english text and have it speak, @bikibird. Nonetheless - a miracle to see - pardon - HEAR - in Pico-8 !

That scary whisper would be perfect for horror games.
Gold star to the elocution expert !

IMLXH • 2022-08-30 2022-08-30 21:22

ITS HERE ITS HERE ITS HERE AAAAAAAAAAAA

SealProgrammer • 2022-08-30 2022-08-30 21:55

Amazing!

taxicomics • 2022-08-30 2022-08-30 22:02

Damn! And a nice little demo cart. It'll be fun to think about ways to use this. Maybe a song generator with vocals? A Spaceship-Announcement-System? Speaking NPCs in an RPG? Great work!

MaxBize • 2022-08-30 2022-08-30 22:41

Incredible! Great work

thattomhall • 2022-08-31 2022-08-31 00:27

Haha this is AMAZING. The whispering was hilarious. :D

zep • 2022-08-31 2022-08-31 11:38

You had me at "Speako8", haha

Amazing work, and exciting that it's small enough to be used as a component. Now I'm trying to find an excuse to make something that needs speech synth.

Mr.Mouse • 2022-08-31 2022-08-31 19:51

bro, speako8 is the best thing i watch in my life.

icegoat • 2022-08-31 2022-08-31 20:15

This is an amazing achievement, and I hope to find a project to use it in as well!

bikibird • 2022-08-31 2022-08-31 21:35

@dw817, I think it's possible to write phonetic and prosody rules to convert English text to speech strings and fit them into a cart, but I doubt very much that there would be many tokens left over for anything else.

@zep, yes, really looking forward to seeing what folks do with this. Glad you liked the name.

To everyone, thank you for your appreciations. They made my day.

Vladislav Kolesnikov • 2022-09-02 2022-09-02 20:21

Imagine using this in a video game. Maybe the creepy whispers would be perfect for a scary atmosphere...

idk_how_to_code • 2022-09-04 2022-09-04 16:03

kinda reminds me of Stephen Hawking's tts voice (rip)

bikibird • 2022-09-04 2022-09-04 16:30

@idk_how_to_code, that's because it kinda is! Hawking spoke via a Klatt synthesizer and Speako8 is loosely based on that synthesizer. Much of the formant data and prosody rules I used came from Klatt's writings and are based on his own speech samples. So, yeah Speako8 sounds like Hawking sounds like Klatt.

DRStudio2010 • 2022-09-05 2022-09-05 00:45

It sounded like Currah μSpeech for the ZX Spectrum lol

m_b • 2022-09-05 2022-09-05 01:49

...but much worse than VOX for the ZX Spectrum. Anyway, the whisper mode is the only one i can understand. keep it up, don't remove it.

pahammond • 2022-09-05 2022-09-05 08:12

Absolutely amazing. Going to try a few quotes from the original "Berzerk" arcade game.

ridgek • 2022-09-14 2022-09-14 11:21

this is so cool

DANNY1 • 2022-09-20 2022-09-20 22:28

YOO THAT'S ACTUALLY AMAZING

augusto99999 • 2022-09-26 2022-09-26 21:19

this is very cool, but what if you could feed it raw english text and let it say stuff?

bikibird • 2022-09-26 2022-09-26 21:37

@augusto99999, it could be done. You'd have to write rules for phoneticizing. They wouldn't be as accurate as the pronouncing dictionary that Declare uses, but they might be okay enough. You'd also have to convert the prosody rules from declare.js. I'm afraid, though, that once those tasks were done, there wouldn't be many tokens left for an actual game.

emi_peny69 • 2022-09-30 2022-09-30 15:17

gabriel is coming to your games for only 1000 tokens of code

jalecko • 2022-10-08 2022-10-08 18:51

This is impressive

Nbrother1607 • 2022-10-23 2022-10-23 06:27

it works when music and sfx are playing

morningtoast • 2022-10-26 2022-10-26 22:08

I've added this to my exploring game and finding the CPU/mem usage spikes crazy high when the talking happens, slowing everything else in the game down.

I guess it doesn't surprise me given the magic it's doing, but is there anything I can/should do to try and minimize the hit?

I'm just using say() with a trigger, as documented. It's an incredibly tight and well-thought-out component - super thanks for that - but that puts the magic way above my head to try and tweak it :)

bikibird • 2022-10-26 2022-10-26 22:42

@morningtoast, as you've noticed, it tends to drop frames when the buffer is loading. You can improve things quite a bit, though not eliminate the issue entirely, by searching the Speako8 lib for while stat(108)<1920 and changing it to while stat(108)<256.
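
A back-of-the-envelope sketch of what those two thresholds mean. It assumes that stat(108) reports the number of samples waiting in PICO-8's PCM buffer and that PICO-8 plays PCM at 5512.5 samples per second. Neither assumption is stated in this thread, and the helper names are made up for illustration.

```lua
-- assumption: pico-8 pcm playback rate, in samples per second
pcm_rate = 5512.5

-- seconds of audio held by a buffer of n samples
function buffer_seconds(n)
  return n / pcm_rate
end

-- samples played back during one frame at the given frame rate
function samples_per_frame(fps)
  return pcm_rate / fps
end

-- buffer_seconds(1920)  is roughly 0.35
-- buffer_seconds(256)   is roughly 0.046
-- samples_per_frame(30) is 183.75
```

Under those assumptions, a 256-sample target still covers more than one 30 fps frame of playback, while leaving far fewer samples to generate in one go when speech starts. That would be consistent with fewer dropped frames.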

collinthenewmaker • 2022-11-02 2022-11-02 07:26

Whispering sounds like a true intruder moment

kalevolary • 2022-11-02 2022-11-02 09:00

Cool. Like it!

Vapayt • 2022-11-20 2022-11-20 08:53

can he sing?

bikibird • 2022-11-20 2022-11-20 14:50

@Vapayt, in theory, yes, but you'd have to change the pitch after each syllable and check that Speako is done saying the last syllable before sending the next.
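
A minimal sketch of that idea, assuming speaking() returns a truthy value while speech is in progress (as the commented-out check in the original post suggests). song, song_pos and sing_update are hypothetical names, not part of Speako8, and each str would need to be a one-syllable speech string made with Declare.

```lua
-- hypothetical song data, filled in by the caller:
-- { {pitch=<hz>, str=<speech string>}, ... }
song = {}
song_pos = 1

-- hypothetical sequencer step: sends at most one syllable per call
function sing_update()
  -- wait until the previous syllable has finished
  if speaking() then return end
  local note = song[song_pos]
  if note == nil then return end  -- song is over
  spk8_pitch = note.pitch  -- documented range is 60-230 hz
  say(note.str)
  song_pos = song_pos + 1
end
```

sing_update() would be called once per frame in _update, after speako8().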

FantasticCrab • 2023-01-14 2023-01-14 08:26

Reminds me of SAM, the software automatic mouth for the c64 iirc. It sounds pretty similar!

Charliedi • 2023-02-08 2023-02-08 19:36

very cool, thanks!

Omay • 2023-04-08 2023-04-08 23:20

sounds kinda like the faith voice to me

Vladislav Kolesnikov • 2023-04-19 2023-04-19 17:34

Very cool! It is a little robotic, but still I like it!

bikibird • 2023-04-19 2023-04-19 17:44

It's a lot robotic. I'm working on improving things, though. Stay tuned.

Verb • 2023-04-19 2023-04-19 19:27

Seems like the ideal tool for reenacting WarGames 🤣

D0S81 • 2023-06-27 2023-06-27 18:29

i can see the GlaDos singing simulations already

postgoodism • 2023-12-06 2023-12-06 15:43

I wanted some incomprehensible babble noises in my game, and searched for "pico8 speech" in case somebody had posted a tutorial for the best way to achieve that effect with an SFX. Instead I found this library, and suddenly I knew where those last ~1000 tokens were going! The look on players' faces when the cart speaks to them is magical.

I had an idea to reduce the library's token/character count: aside from the demo cart, many applications would probably only use a subset of the library's features on a small pool of utterances. In my case, the only parameters I change from their defaults at runtime are spk8_whisper and spk8_rate, and I only use two short sentences. The remaining parameters could safely be omitted/hard-coded, and the unused phonemes could be stripped from the data tables. I don't know how much of an impact it would make in general, but towards the end of the project when I was looking for assignment statements to combine and variable names to shorten (even after minifying), every little bit would've helped.

Amazing library, thank you so much for sharing! And calling it "speako8" is just chefskiss.

arrowonionbelly • 2024-04-01 2024-04-01 05:43

Will this work in picotron? I've got a game I'm porting over and already have no idea if the sound even actually works yet

bikibird • 2024-04-01 2024-04-01 14:02

@arrowonionbelly, no, I don't think so, not without a lot of work. Speako8 uses PCM. Currently there is no support for PCM in Picotron. I believe zep has said that PCM could be emulated by updating a waveform in the wavetable on the fly, but as far as I know there is no way to play a waveform indefinitely, so it would need to be tied to the note function or the tracker, and I'm not sure how I would control for duration in that scenario.

I've played around with the FX filter for instruments in Picotron and I've been able to make some vaguely vowel-like sounds, but that's a long way from true speech synthesis.

@zep, my Picotron wish is for a biquad filter to be added as a node type with parameters frequency, Q, and gain. I believe that such a filter would allow me to build a speakophone for Picotron, which would be very cool.