I haven't been able to stop thinking recently about how feasible text-to-speech synthesis would be in Pico-8. A lot of people have tried creating synthetic speech and struggled with it, but I think I have an idea that could simplify the whole process.

Basically, you would start by breaking everything down into phonetics. Each letter of the alphabet would have its own individual SFX, with the remaining SFX slots dedicated to combined sounds like "CH", "SH", etc. These would be called upon together to craft full words and sentences.
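To make the slot idea concrete, here is a rough sketch of how a word might be split into letter and combined-sound SFX. It's written in Python purely for illustration. The slot numbers, the `word_to_slots` name, and the short list of combined sounds are all my own assumptions; in an actual cart each resulting number would be handed to `sfx()`.

```python
# Hypothetical slot layout: letters A-Z occupy SFX slots 0-25,
# and combined sounds take the slots after that.
LETTER_SLOTS = {chr(ord("A") + i): i for i in range(26)}
DIGRAPH_SLOTS = {"CH": 26, "SH": 27, "TH": 28}  # illustrative, not exhaustive


def word_to_slots(word):
    """Return the list of SFX slot numbers to play for one word."""
    word = word.upper()
    slots = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPH_SLOTS:
            # Prefer the combined sound when two letters form one.
            slots.append(DIGRAPH_SLOTS[pair])
            i += 2
        elif word[i] in LETTER_SLOTS:
            slots.append(LETTER_SLOTS[word[i]])
            i += 1
        else:
            # Skip anything that has no sound assigned (spaces, punctuation).
            i += 1
    return slots
```

With this layout, `word_to_slots("CHASE")` starts with slot 26 (the "CH" sound) rather than the separate "C" and "H" slots, while `word_to_slots("CAT")` only uses single-letter slots.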

For the actual text-to-speech system, each letter of the alphabet would be assigned a number, though not necessarily in alphabetical order. The system would check sentences by multiplying and/or adding those numbers. For example, the letter "H" could be assigned the number 3, while "C" could be 6 and "S" could be 9. The sound that gets played would depend on the result of multiplying the assigned numbers of letters in the word: "C" with "H" gives 6 × 3 = 18, while "S" with "H" gives 9 × 3 = 27, so "CHASE" and "SHADE" would each use the correct phonetic for "C" and "S".
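As a rough sketch of the number idea, using the example values for "H", "C" and "S" from this post (Python again, just for illustration; the table and function names are made up):

```python
# Letter numbers taken from the example in the post.
LETTER_CODES = {"H": 3, "C": 6, "S": 9}

# Products that stand for a combined sound: 6*3 = 18 and 9*3 = 27.
PRODUCT_TO_SOUND = {18: "CH", 27: "SH"}


def combined_sound(first, second):
    """Return the combined sound for two letters, or None if there isn't one."""
    a = LETTER_CODES.get(first.upper())
    b = LETTER_CODES.get(second.upper())
    if a is None or b is None:
        return None  # at least one letter has no number assigned
    return PRODUCT_TO_SOUND.get(a * b)
```

Here `combined_sound("C", "H")` gives "CH" and `combined_sound("S", "H")` gives "SH", while "C" with "S" (54) matches nothing. One thing to watch out for: multiplication ignores order, so "H" followed by "C" also comes out as 18, and carelessly chosen numbers could give two different pairs the same product. Picking a distinct prime for each letter would probably avoid the second problem, though not the first.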

Actually executing this concept is far beyond my abilities in both code and audio, but I thought I'd throw it out there to maybe even kickstart some discussion. I have a feeling this would work even better with Japanese, since Katakana and Hiragana are built from basic syllables.

P#69492 2019-10-30 20:30 ( Edited 2019-10-30 20:30)


