Friday, June 11, 2010

Microsoft SAPI text-to-speech fun

SAPI seems to fail at resampling. I tested on Windows 7 with the stock Microsoft Anna voice. You probably won't notice unless you use an alternative output (e.g. writing to a file) and specify a non-native sample rate, which I suppose most people don't do.

Handily, I was working inside a foobar2000 component, so I was able to make SAPI output to memory at the native sample rate and use foobar2000's (FB2K's) resampler instead.
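For anyone without foobar2000 at hand, the same idea (take the native-rate PCM and resample it yourself) can be roughly sketched in plain Python. This is only an illustration: `resample_linear` is a made-up helper, linear interpolation is a much cruder method than what a real resampler such as FB2K's does, and I'm assuming "22kHz" means 22050 Hz. A generated sine tone stands in for the SAPI output.

```python
import math

def resample_linear(samples, src_rate, dst_rate):
    """Resample a mono sample sequence by linear interpolation.

    Hypothetical helper for illustration only; a proper resampler
    would use a band-limited (e.g. windowed-sinc) filter instead.
    """
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate      # position in source samples
        j = int(pos)                       # sample just before pos
        frac = pos - j                     # distance past that sample
        a = samples[j]
        b = samples[min(j + 1, last)]      # clamp at the end of the buffer
        out.append(a + (b - a) * frac)
    return out

# Stand-in for native-rate speech: one second of a 440 Hz tone at 16 kHz.
native = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]

# Assumed target rate: 22050 Hz.
resampled = resample_linear(native, 16000, 22050)
```

With real data, `native` would be the 16-bit samples read back from the memory buffer, and the result would need rounding and clipping to the 16-bit range before being written out.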

Here are some samples (native 16kHz, SAPI-resampled 22kHz, FB2K-resampled 22kHz). Feel free to share any insight in the comments...