Inputs
string: Text to generate speech from.
string: Generation voice.
string?: Generation language.
float32?: Voice speed multiplier.
Outputs
float32: Linear PCM audio samples with shape (F,) and a sample rate of 24 kHz.
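
As a rough illustration of how the output might be consumed, the sketch below encodes a one-dimensional sequence of float samples as a 16-bit mono WAV at 24 kHz, using only the Python standard library. The helper names `duration_seconds` and `pcm_float_to_wav_bytes` are hypothetical, and the code assumes the samples are normalized to the range [-1.0, 1.0], which the description above does not state.

```python
import io
import struct
import wave

SAMPLE_RATE = 24_000  # 24 kHz, per the output description


def duration_seconds(samples):
    """Length of the clip in seconds: F samples / 24,000 samples per second."""
    return len(samples) / SAMPLE_RATE


def pcm_float_to_wav_bytes(samples):
    """Encode mono float samples as 16-bit PCM WAV bytes (hypothetical helper)."""
    # Assumption: samples are normalized to [-1.0, 1.0]; out-of-range values are clipped.
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # shape (F,) implies a single channel
        wav.setsampwidth(2)      # 2 bytes per sample = 16-bit PCM
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
    return buf.getvalue()
```

With a real output array, the bytes returned by `pcm_float_to_wav_bytes(audio)` could be written to a `.wav` file, and `duration_seconds(audio)` gives the clip length, since F samples at 24 kHz last F / 24000 seconds.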