Introducing Muna for Flutter
Run LLMs, speech models, and more in Flutter.
Feb 24, 2026
We just launched our Flutter client library. Developers can now run open models in Flutter apps on iOS and Android. The library includes an OpenAI-compatible client for creating chat completions, speech, transcriptions, and embeddings. And as with every other Muna client, developers can specify where each request runs: locally or on cloud GPUs.
Using the OpenAI Client
The quickest way to get started is with our OpenAI-compatible muna.beta.openai client, which can run open models locally or in the cloud:
import "package:muna/muna.dart";

// 💥 Create an OpenAI client
final openai = Muna(accessKey: "<ACCESS KEY>").beta.openai;

// 🔥 Create speech
final response = await openai.audio.speech.create(
  input: text,
  model: "@kitten-ml/kitten-tts-mini-0.8",
  voice: "Bella",
  acceleration: "local_gpu",
);

// 🚀 Playback the audio
await audioPlayer.play(BytesSource(response.content));
Bringing your own Model: From Python to Flutter
Muna also lets developers bring custom models to Flutter by compiling a Python function:
from muna import compile

@compile()
def greeting(name: str) -> str:
    """Say a friendly greeting."""
    return f"Hey {name}, welcome to Muna on Flutter."
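Stripped of the decorator, the function above is ordinary Python, so its logic can be sanity-checked locally before compiling. A minimal sketch (the @compile decorator is omitted here because it is only needed for the actual Muna build):

```python
# Plain-Python sketch of the function above, without the muna
# @compile decorator, to check the logic before compiling.
def greeting(name: str) -> str:
    """Say a friendly greeting."""
    return f"Hey {name}, welcome to Muna on Flutter."

print(greeting("Lina"))  # → Hey Lina, welcome to Muna on Flutter.
```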
The muna compile CLI command first transpiles the Python function to C++, then compiles the function for various platforms (see the docs).
Once compiled, use the muna.predictions.create API to run the function in Flutter:
// 🔥 Run our compiled function
final prediction = await muna.predictions.create(
  "@your-username/greeting",
  inputs: { "name": "Lina" },
);

// 🚀 Use the results
print(prediction.results?.first);
Note that Python functions can be compiled to be compatible with the OpenAI client. Learn more.
Why this is different
Developers coming from the Python ecosystem often take its ease of use for granted. Anyone can pull virtually any model from Hugging Face, wrap it in a few lines of code, and run it locally or in the cloud. Flutter has never had that flexibility.
Now, developers can take any model, write a Python function, run muna compile, and invoke the compiled function locally using the familiar OpenAI-compatible API. And if the model is too large to run on-device, they can simply route it to run on a datacenter GPU. That decision is just another parameter in code, no longer an expensive architectural commitment.
The full breadth of the Python AI ecosystem is now available to Flutter developers.