This was a program I hacked up rather quickly back in the 90's when I was taking a night class in Dutch.

It was based very roughly on the Navy text-to-speech algorithm, but with a mod to perform hyphenation
before applying the rules.  The hyphenation is a substitute for breaking a word into its component
morphemes, so that accidental letter combinations across a morpheme boundary are not erroneously reduced
(like the 'ph' in the English word 'haphazard', which might otherwise be mis-coded as an 'f' sound).
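
To make the hyphenation trick concrete, here is a minimal sketch in Python (not the Imp77 code itself).
The rule list and the hyphenation table are toy placeholders invented for this illustration; a real
version would use a proper hyphenator and the full rule set.  The only point is that 'ph' is never
seen as a unit once the word has been split into 'hap' and 'haz'.

```python
# Toy letter-to-sound rules, tried in order, so the digraph "ph" comes first.
# These are illustrative placeholders, not the rules from the Imp77 program.
RULES = [("ph", "f"), ("a", "a"), ("h", "h"), ("p", "p"),
         ("z", "z"), ("r", "r"), ("d", "d")]

# Hypothetical hyphenation table standing in for a real hyphenator.
HYPHENATION = {"haphazard": ["hap", "haz", "ard"]}


def apply_rules(chunk):
    """Scan one chunk left to right, emitting a phoneme for each rule match."""
    out = []
    i = 0
    while i < len(chunk):
        for pattern, phoneme in RULES:
            if chunk.startswith(pattern, i):
                out.append(phoneme)
                i += len(pattern)
                break
        else:
            # No rule matched: pass the letter through unchanged.
            out.append(chunk[i])
            i += 1
    return out


def transcribe(word, hyphenate=True):
    """Apply the rules to each hyphenated chunk separately (or to the whole word)."""
    chunks = HYPHENATION.get(word, [word]) if hyphenate else [word]
    phonemes = []
    for chunk in chunks:
        phonemes.extend(apply_rules(chunk))
    return phonemes


print(transcribe("haphazard", hyphenate=False))  # 'ph' gets merged into 'f'
print(transcribe("haphazard"))                   # 'p' and 'h' stay separate
```

Because the rules only ever see one chunk at a time, no rule can match across a hyphenation point,
which is the whole effect the hack is after.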

The code was written in Imp77 using the variant for the Acorn Archimedes (ARM) system.

It's not perfect text-to-speech and it needs some rule tweaking, but overall I think it holds up pretty
well and is about as good for Dutch as the original Navy code was for English - maybe better, since the
rules of Dutch phonology are more consistent than those of English.

Somewhere on my Linux machine at home I have my C version of the Navy TTS, which I recoded to use a table
from a config file and the 8-bit ISO character set (not the current UTF-8 encoding, which did not
exist at the time).  However, I don't believe that version uses the hyphenation hack, nor did I do
a version for Dutch with it, so this Imp code is all I can offer at the moment.  There's an Imp77
compiler available on GitHub; look for "imp2026" to try it.
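
For anyone wondering what a table-driven rule file might look like, here is a rough Python sketch.
The file format is an assumption made for this example (one rule per line, written
left[match]right=phonemes, with '#' comments); it is not the format my C version reads, and the
phoneme symbols on the right-hand side are made up.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    left: str       # required context before the matched letters (may be empty)
    match: str      # the letters this rule consumes
    right: str      # required context after the matched letters (may be empty)
    phonemes: str   # what to emit when the rule fires


# Assumed line format: left[match]right=phonemes
RULE_LINE = re.compile(
    r"^(?P<left>[^\[\]=]*)\[(?P<match>[^\[\]=]+)\](?P<right>[^\[\]=]*)=(?P<phonemes>.*)$"
)


def parse_rules(text):
    """Turn the text of a rule file into a list of Rule tuples, in file order."""
    rules = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = RULE_LINE.match(line)
        if m is None:
            raise ValueError(f"bad rule line: {raw!r}")
        rules.append(Rule(m["left"], m["match"], m["right"], m["phonemes"].strip()))
    return rules


# A few Dutch-flavoured sample rules; the phoneme names are placeholders.
SAMPLE = """
# digraphs and trigraphs first
[sch]=s x
[oe]=u
[ij]=EI
c[h]=x
"""

for rule in parse_rules(SAMPLE):
    print(rule)
```

Keeping the rules in a file like this means a new language is a new table rather than a new program,
which is why I went that way in the C version.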

I looked this out today and put it online because I'm trying out an AtariVox+ speech device and I thought
it might be interesting to resurrect this code.  I will need to add a final pass to output the phonemes
in SpeakJet mnemonic format though, since this code uses my own rather trivial format.

In hindsight the simple string substitution that I was using in Imp could be replaced in a C implementation
by using regular expressions, and since I have a UTF-32-aware version of exactly that in my 'uparse'
parser suite, it might not be much of a job to do that conversion, especially if I start from an imptoc
translation of the Imp code...
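
As a rough idea of what the regular-expression approach could look like, here is a small Python
sketch.  It uses Python's built-in re module as a stand-in for the matcher in 'uparse', and the
rules and phoneme names are hypothetical examples chosen for illustration, not taken from the Imp
code.  Context that the string-substitution version has to check by hand becomes a lookahead or an
anchor in the pattern.

```python
import re

# (regex, phoneme) pairs, tried in order at each position in the word.
# Hypothetical sample rules with made-up phoneme names.
RULES = [
    (r"sch", "sx"),
    (r"oe", "u"),
    (r"ij", "EI"),
    (r"c(?=[ei])", "s"),   # 'c' before 'e' or 'i'
    (r"c", "k"),           # any other 'c'
    (r"d$", "t"),          # word-final 'd' is devoiced
]

COMPILED = [(re.compile(pattern), phoneme) for pattern, phoneme in RULES]


def to_phonemes(word):
    """Scan the word left to right, firing the first rule that matches at each position."""
    out = []
    pos = 0
    while pos < len(word):
        for pattern, phoneme in COMPILED:
            # match(word, pos) anchors at pos but still sees the whole string,
            # so lookaheads and '$' behave as expected.
            m = pattern.match(word, pos)
            if m and m.end() > pos:
                out.append(phoneme)
                pos = m.end()
                break
        else:
            # No rule applies: pass the letter through unchanged.
            out.append(word[pos])
            pos += 1
    return " ".join(out)


for w in ("schoen", "tijd", "cent", "cola"):
    print(w, "->", to_phonemes(w))
```

Python strings are already sequences of Unicode code points, so this sidesteps the 8-bit versus
UTF-8 question; a C version would need a matcher that works on wide characters, as described above.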

Graham Toal <gtoal@gtoal.com>
23rd April 2026