Introducing Muna for Flutter

Run LLMs, speech models, and more in Flutter.

Feb 24, 2026

We just launched our Flutter client library. Developers can now run open models in Flutter apps on iOS and Android. The library includes an OpenAI-compatible client for creating chat completions, speech, transcriptions, and embeddings. And as in every other Muna client, developers can specify where each request runs: locally or on cloud GPUs.

Using the OpenAI Client

The quickest way to get started is our OpenAI-compatible muna.beta.openai client, which runs open models locally or in the cloud:

import "package:audioplayers/audioplayers.dart";
import "package:muna/muna.dart";

// 💥 Create an OpenAI client
final openai = Muna(accessKey: "<ACCESS KEY>").beta.openai;

// 🔥 Create speech
final response = await openai.audio.speech.create(
  input: text,
  model: "@kitten-ml/kitten-tts-mini-0.8",
  voice: "Bella",
  acceleration: "local_gpu",
);

// 🚀 Play back the audio
final audioPlayer = AudioPlayer();
await audioPlayer.play(BytesSource(response.content));

Bringing Your Own Model: From Python to Flutter

Muna also lets developers bring custom models to Flutter by compiling a Python function:

from muna import compile

@compile()
def greeting(name: str) -> str:
    """
    Say a friendly greeting.
    """
    return f"Hey {name}, welcome to Muna on Flutter."

The muna compile CLI command first transpiles the Python function to C++, then compiles the function for various platforms (see the docs). Once compiled, use the muna.predictions.create API to run the function in Flutter:

// 🔥 Run our compiled function
final prediction = await muna.predictions.create(
  "@your-username/greeting",
  inputs: { "name": "Lina" },
);

// 🚀 Use the results
print(prediction.results?.first);

Note that Python functions can be compiled to be compatible with the OpenAI client. Learn more.

Why this is different

Developers coming from the Python ecosystem often take its ease of use for granted: anyone can pull virtually any model from Hugging Face, wrap it in a few lines of code, and run it locally or in the cloud. Flutter has never had that flexibility.

Now, developers can take any model, write a Python function, run muna compile, and invoke the compiled function locally through the familiar OpenAI-compatible API. And if the model is too large to run on-device, they can simply route it to a datacenter GPU. That decision is just another parameter in code, no longer an expensive architectural commitment.
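To make that concrete, here is a minimal sketch of that routing decision, reusing the speech call from earlier in this post. The "remote_gpu" acceleration value is a hypothetical placeholder used for illustration; check the Muna docs for the exact values your account supports.

```dart
import "package:audioplayers/audioplayers.dart";
import "package:muna/muna.dart";

// Sketch: the local-vs-cloud decision is a single parameter on the request.
// NOTE: "remote_gpu" is a hypothetical value for illustration; consult the
// Muna docs for the accelerations actually available.
Future<void> speak(String text, {bool onDevice = true}) async {
  final openai = Muna(accessKey: "<ACCESS KEY>").beta.openai;
  final response = await openai.audio.speech.create(
    input: text,
    model: "@kitten-ml/kitten-tts-mini-0.8",
    voice: "Bella",
    // The only line that changes between on-device and datacenter execution:
    acceleration: onDevice ? "local_gpu" : "remote_gpu",
  );
  await AudioPlayer().play(BytesSource(response.content));
}
```

Swapping the flag swaps the execution target; the rest of the call site, and the model identifier, stay the same.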

The full breadth of the Python AI ecosystem is now available to Flutter developers.