Can someone who has experience with TTS explain to me why the Azure AI Speech thingie has such a hard time using correct pauses and emphasis with prepared or generated text? I’m mostly interested in career mode “actors” and ATC.
We’ve all heard this: “One two three decimal four [pause] Five-Cessna Alpha Bravo Charlie.”
Sometimes it feels like the AI model goes out of its way to put the emphasis on the wrong word in every sentence it says. And in ATC messages, it seems to actively detach the last digit of a number, or the last character of a spelled-out aircraft registration or procedure ID, from the rest of the designator and glue it onto the next word, whatever that may be.
Isn’t there a way for Asobo to provide hints to the TTS system, the way punctuation in written text helps a reader parse the author’s intent? Couldn’t they add tags to the text they feed to TTS, telling it that a frequency is a number with two or three decimal digits, or that all the letters of an identifier belong together when it spells them out in the phonetic alphabet? Something like the SSML sketch below.
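For what it’s worth, the hooks for this do exist at the service level: Azure TTS accepts SSML, which provides exactly these kinds of hints (`<say-as>`, `<break>`, and so on). I have no idea what Asobo’s pipeline actually sends to the service, so treat this as a sketch of what the markup could look like, not of what they do; the key, region, voice name, and the phrasing itself are placeholders of mine:

```python
# A minimal sketch, assuming the Azure Speech SDK for Python.
# Key, region, and voice are placeholders; the markup is my guess at
# what "hints" for ATC phrases could look like, not what Asobo does.
import azure.cognitiveservices.speech as speechsdk

ssml = """
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    Contact tower on
    <say-as interpret-as="digits">123</say-as> decimal
    <say-as interpret-as="digits">45</say-as>.
    <break time="400ms"/>
    Cessna Alpha Bravo Charlie, cleared to land.
    <!-- Phonetic-alphabet words are written out literally, since
         interpret-as="characters" would read letter names ("A B C")
         rather than "Alpha Bravo Charlie". -->
  </voice>
</speak>
"""

speech_config = speechsdk.SpeechConfig(
    subscription="YOUR_KEY",   # placeholder
    region="YOUR_REGION",      # placeholder
)
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

# speak_ssml_async() makes the engine follow the markup instead of
# guessing number grouping and pauses from raw text.
result = synthesizer.speak_ssml_async(ssml).get()
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("spoken with the digits grouped where the markup says so")
```

Whether any of that survives the trip through the sim’s pipeline is a different question, of course.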
This is really, really bothering me more than it should.
I believe it is the acoustic equivalent of the “uncanny valley”.
The speech synthesis itself is so convincing that it’s really jarring when the immersion is broken by the wrong rhythm, cadence, semantics, not sure what to call it. (I throw balls far. You want good words? Date a languager.)
Is it just me?