@supertone/supertonic-2

Generate speech from text with SuperTonic 2.

Inputs

string: Text to generate speech from.

string: Voice to generate with (F1-F5 are female voices, M1-M5 are male voices).

string?: Generation language code.

float32?: Speech speed multiplier.

int32?: Number of diffusion steps (more steps give higher quality but slower generation).
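
The inputs above can be collected and checked before a request is sent. The following is a minimal sketch, not the model's actual API: the listing gives only types and descriptions, so the field names (`text`, `voice`, `language`, `speed`, `steps`) and the `SpeechRequest` class are hypothetical. The checks for non-empty text, positive speed, and at least one step are also assumptions. Only the voice identifiers come from the listing.

```python
from dataclasses import dataclass
from typing import Optional

# Voice identifiers from the listing: F1-F5 (female) and M1-M5 (male).
VALID_VOICES = {f"{prefix}{i}" for prefix in ("F", "M") for i in range(1, 6)}


@dataclass
class SpeechRequest:
    # Hypothetical field names; the listing does not name the inputs.
    text: str                        # text to generate speech from
    voice: str                       # one of VALID_VOICES
    language: Optional[str] = None   # optional language code
    speed: Optional[float] = None    # optional speech speed multiplier
    steps: Optional[int] = None      # optional number of diffusion steps

    def validate(self) -> None:
        """Raise ValueError if a field is outside the assumed valid range."""
        if not self.text:
            raise ValueError("text must be non-empty")
        if self.voice not in VALID_VOICES:
            raise ValueError(f"unknown voice {self.voice!r}")
        if self.speed is not None and self.speed <= 0:
            raise ValueError("speed must be positive")
        if self.steps is not None and self.steps < 1:
            raise ValueError("steps must be at least 1")
```

For example, `SpeechRequest(text="Hello", voice="F3").validate()` passes, while a voice of `"F6"` raises `ValueError`.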

Outputs

float32: Linear PCM audio samples with shape (F,) and a sample rate of 22050 Hz.
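
Because the output is a flat array of float samples at a fixed rate, its duration is F divided by the sample rate, and it can be saved as a mono WAV file. The sketch below uses only the Python standard library. It assumes the samples are normalized to [-1.0, 1.0], which the listing does not state. The helper names `duration_seconds` and `write_wav` are illustrative.

```python
import struct
import wave

SAMPLE_RATE = 22050  # Hz, as stated for the output


def duration_seconds(samples, sample_rate=SAMPLE_RATE):
    """Length of the clip in seconds: number of samples / sample rate."""
    return len(samples) / sample_rate


def write_wav(path, samples, sample_rate=SAMPLE_RATE):
    """Write float samples as a 16-bit mono WAV file.

    Assumes samples lie in [-1.0, 1.0]; values outside that range are clipped.
    """
    ints = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # shape (F,) implies a single channel
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit PCM
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))
```

A one-second clip therefore contains 22050 samples, and `write_wav("out.wav", samples)` produces a file that standard audio players can open.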