Text to Speech (TTS) service and API

asked 2014-07-22 13:46:21 +0200

Mika Hanhijärvi

updated 2015-04-29 21:31:52 +0200

rdmo

tl;dr: It is competitively essential to have, and to promote, a Text to Speech (TTS) service in Sailfish.

TTS makes applications accessible to visually impaired people. It would be useful for other users too: applications could, for example, notify users via spoken text when needed. TTS is also required by screen readers like Orca.

There should be an API which developers could use in their applications.

One solution is porting eSpeak, Festvox, Speech Dispatcher, etc. to Sailfish. Large parts of these are already available on desktop Linux and support multiple languages, but remain untapped by the jolly Jolla ecosystem.

TTS is missing. Who doesn't remember how handy it was to listen to SMS messages in the car on classic Symbian? and also

pan tau ( 2015-02-22 05:44:35 +0200 )

Also, the voice navigation in Here Maps announces street names if a TTS engine is properly configured in the system (and interfaced with Android apps).

Federico ( 2017-08-24 17:48:43 +0200 )

2 Answers


answered 2014-08-01 09:40:12 +0200

kimmoli

updated 2015-05-03 22:00:28 +0200

eSpeak seems to build on Jolla (it was also built by MartinK on the Mer OBS 8 months ago, but I found that too late).

The first attempt didn't give any audio output, so I decided to use my old friend, gst-launch.

Then I wrapped things in an ugly script, and now we have some kind of spoken notifications.

But yes, it needs more work. Bundling this into a daemon that listens on D-Bus and speaks whatever it is told to would be the next step.
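
For anyone who wants to experiment with the same idea, here is a minimal Python sketch of piping eSpeak into GStreamer. It is not the script mentioned above. The function names, the `gst-launch-1.0` binary name, and the exact pipeline elements are my assumptions; an older image may ship `gst-launch-0.10`, and a specific sink such as `pulsesink` may be needed instead of `autoaudiosink`.

```python
import subprocess


def build_speak_pipeline(text, voice="en", gst_launch="gst-launch-1.0"):
    """Return (espeak_cmd, gst_cmd) argument lists for one spoken message.

    Hypothetical helper: the binary names and the GStreamer pipeline are
    assumptions and may need adjusting for a given device.
    """
    # espeak: -v selects the voice, --stdout writes WAV data to stdout
    espeak_cmd = ["espeak", "-v", voice, "--stdout", text]
    # GStreamer: read the WAV stream from stdin (fd 0), parse, convert, play
    gst_cmd = [
        gst_launch, "-q",
        "fdsrc", "fd=0", "!", "wavparse", "!",
        "audioconvert", "!", "audioresample", "!", "autoaudiosink",
    ]
    return espeak_cmd, gst_cmd


def speak(text, voice="en"):
    """Run espeak and feed its output into gst-launch through a pipe."""
    espeak_cmd, gst_cmd = build_speak_pipeline(text, voice)
    producer = subprocess.Popen(espeak_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(gst_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so the consumer sees EOF when espeak exits
    producer.stdout.close()
    consumer.wait()
    return producer.wait()
```

Calling `speak("New message from Mika")` would then read the text aloud, provided both binaries are installed. A D-Bus daemon could call the same function from its message handler.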

STEP 2 :

I worked on exactly that some years ago for the N900/N8x0, but abandoned it when my N900's phone module broke and I switched to the N950. The source is a bit messy but available here;a=summary

onion ( 2014-08-01 15:34:06 +0200 )

answered 2017-08-23 10:50:32 +0200

rinigus

I keep hitting the lack of TTS facilities in Sailfish while developing for it. From the developer's point of view, we would like an API that lets us synthesize voice prompts in a given language (to WAV or live) and query the installed languages. Such an API is currently not available to us. In this post, I would like to summarize what I have found so far, in the hope that it is useful to others.
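
To make the wish a bit more concrete, here is a rough sketch of the shape such an API could take. Everything in it is hypothetical: `TtsEngine`, `SilentEngine`, and the method names are made up for illustration, and the stand-in engine only produces silence so that the interface can be exercised without any real synthesizer.

```python
import abc
import io
import wave


class TtsEngine(abc.ABC):
    """Hypothetical interface: query languages, synthesize live or to WAV."""

    @abc.abstractmethod
    def languages(self):
        """Return the list of installed language codes."""

    @abc.abstractmethod
    def synthesize(self, text, language):
        """Return the prompt as WAV-formatted bytes (for live playback)."""

    def synthesize_to_wav(self, text, language, path):
        """Write the prompt to a WAV file and return its path."""
        with open(path, "wb") as out:
            out.write(self.synthesize(text, language))
        return path


class SilentEngine(TtsEngine):
    """Stand-in engine: emits 50 ms of silence per character of text."""

    RATE = 16000  # samples per second, mono, 16-bit

    def languages(self):
        return ["en", "fi"]

    def synthesize(self, text, language):
        if language not in self.languages():
            raise ValueError("language not installed: " + language)
        frames = (self.RATE // 20) * len(text)
        buffer = io.BytesIO()
        with wave.open(buffer, "wb") as out:
            out.setnchannels(1)
            out.setsampwidth(2)
            out.setframerate(self.RATE)
            out.writeframes(b"\x00\x00" * frames)
        return buffer.getvalue()
```

An app would first check `engine.languages()`, then call either `synthesize()` for immediate playback or `synthesize_to_wav()` to cache the prompt. A real engine would replace `SilentEngine` behind the same two calls.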

In general, after looking into the area a bit, it seems that open-source TTS for Linux has a long way to go. See for some background information.

What makes our situation on SFOS rather complicated is that we are expected to have TTS for many languages, as on other mobile platforms. As many others have mentioned, while eSpeak does support many languages, its voice output is rather poor, to put it mildly.

At present, we have reasonable coverage for English via Mimic (based on Flite) and for a few other languages (de, es, fr, it) via PicoTTS (both are available at OpenRepos). These tools generate a WAV file from text. Playing the WAV file is the responsibility of the app requesting it.
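
As an illustration of the text-to-WAV workflow, the sketch below picks a command line depending on the language. It assumes the usual command-line front ends (`mimic -t TEXT -o FILE` and `pico2wave -l LOCALE -w FILE TEXT`); the binary names as packaged on OpenRepos may differ, and `wav_command`, `PICO_LOCALES`, and `generate_wav` are names I made up for this example.

```python
import subprocess

# Assumed mapping from short language codes to PicoTTS locale names
PICO_LOCALES = {"de": "de-DE", "es": "es-ES", "fr": "fr-FR", "it": "it-IT"}


def wav_command(text, language, wav_path):
    """Return the argument list that would render `text` into `wav_path`."""
    if language == "en":
        # Mimic keeps Flite's options: -t for the text, -o for the output file
        return ["mimic", "-t", text, "-o", wav_path]
    if language in PICO_LOCALES:
        # pico2wave expects the output file name to end in .wav
        return ["pico2wave", "-l", PICO_LOCALES[language], "-w", wav_path, text]
    raise ValueError("no TTS engine available for language: " + language)


def generate_wav(text, language, wav_path):
    """Run the selected engine; playing the file is left to the caller."""
    subprocess.run(wav_command(text, language, wav_path), check=True)
    return wav_path
```

The `ValueError` branch is exactly the gap described here: any language outside English and the four PicoTTS ones has no engine to dispatch to.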

As you can see, many languages are missing. On Linux, we could also use MaryTTS, which uses Java to generate speech. As highlighted by Ken Starks (see link above), it's probably the best tool available right now for many languages. That is no surprise, since it uses a unit selection technique for many of them. The RAM requirements are probably significant (expect 500 MB of RAM, and be surprised if it is less), but phones do get more RAM these days. As for CPU requirements, I have no idea, as I haven't tested it. We would need Java (non-GUI) to run it, but that is probably possible as well.
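
For anyone who wants to try MaryTTS: as far as I know, it can run as a local HTTP server and return audio for a text query. The sketch below assumes such a server is already running on its default port 59125 and answers on `/process`. The host, the helper names, and the choice of locale are assumptions, and I have not tried this on a phone.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: a MaryTTS server running locally on its default port
MARY_URL = "http://localhost:59125/process"


def mary_request_url(text, locale="en_US", base=MARY_URL):
    """Build the query URL asking the server for a WAV rendering of `text`."""
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
    }
    return base + "?" + urlencode(params)


def synthesize_to_wav(text, wav_path, locale="en_US"):
    """Fetch the audio from the server and store it as a WAV file."""
    with urlopen(mary_request_url(text, locale)) as response:
        audio = response.read()
    with open(wav_path, "wb") as out:
        out.write(audio)
    return wav_path
```

Since the Java process stays resident as a server, this is also where the RAM cost mentioned above would show up.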

Now, coming back to the API: Linux has Speech Dispatcher, which seems to be an interface between apps that need TTS and the TTS synthesis engines. Speech Dispatcher is what Qt Speech (5.9) uses, as far as I can see. Maybe that could be a solution for us as well, allowing us to specify in one place the preferred TTS synthesizer as well as the preferred voice (male/female, voice model). In theory, it would be possible to make a GUI for managing the voices and engines. Note that some voices could be rather large (100+ MB).
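
To show what "specify in one place" looks like from the app side, here is a small sketch around `spd-say`, the command-line client that ships with Speech Dispatcher. It assumes Speech Dispatcher is installed and configured, which is not the case on SFOS today; `spd_say_command` and `say` are hypothetical helper names.

```python
import subprocess


def spd_say_command(text, language=None, voice_type=None, module=None, wait=True):
    """Return the spd-say argument list for one message.

    Options left as None fall back to whatever the user configured
    in Speech Dispatcher itself.
    """
    cmd = ["spd-say"]
    if language:
        cmd += ["-l", language]      # language code, e.g. "en"
    if voice_type:
        cmd += ["-t", voice_type]    # e.g. "male1" or "female1"
    if module:
        cmd += ["-o", module]        # output module, i.e. the synthesizer
    if wait:
        cmd.append("-w")             # block until the message has been spoken
    cmd.append(text)
    return cmd


def say(text, **options):
    """Speak `text` through Speech Dispatcher."""
    return subprocess.run(spd_say_command(text, **options), check=True)
```

The app only passes text and, optionally, a language; which engine and voice actually speak is decided by the Speech Dispatcher configuration, which is what a settings GUI would edit.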

There are also several companies working in this area, and maybe that is the way to go. Several of them have developed TTS solutions for Linux, ARM included. It looks to me that they prefer a B2B model, but I haven't been in touch with any of them. Maybe someone at Jolla could contact them and ask whether they would be interested in selling their solution to people running SFOS? Ideally, it should allow users of all devices (ported, SFOS from Jolla, SFOS from RU) to purchase the software and the languages separately.

Given current developments, I think that TTS is becoming a necessity and an expected way for a device to communicate in certain situations.


Asked: 2014-07-22 13:46:21 +0200

Seen: 1,206 times

Last updated: Aug 23 '17