edgerunner
Simplified AI runtime integration for mobile app development
Website | Contact | Discord | Twitter | Docs
💡 Introduction
The purpose of Edgerunner is to facilitate easy integration of common AI model formats and inference runtimes (TFLite, ONNX, QNN, etc.) for consumer mobile devices (smartphones, laptops, tablets, wearables, etc.).
Edgerunner removes the complexity of deploying an off-the-shelf model into your app with NPU acceleration, regardless of model format or target device. Platform-specific NPU SDKs are managed for you and can be leveraged through a device-agnostic API. Edgerunner exposes a boilerplate-free API with sane defaults.
Kotlin bindings for building AI applications on Android with Edgerunner can be found at edgerunner-android. We are also building specific use cases on top of Edgerunner (Llama, Stable Diffusion, etc.), each of which will come with its own Android bindings.
Please request additional features or desired use cases through GitHub issues or on our Discord.
🔌 Support
OS
| Android | iOS | Linux | MacOS | Windows |
|---|---|---|---|---|
| ✅ | ⏳ | ✅ | ⏳ | ⏳ |
NPU
| Apple | Qualcomm | MediaTek | Samsung | Intel | AMD |
|---|---|---|---|---|---|
| ⏳ | ✅ | ⏳ | ⏳ | ⏳ | ⏳ |
🛠 Building and installing
Edgerunner is in its early development stages. Refer to the HACKING document to get set up.
🕹 Usage
Edgerunner is designed around the following usage pattern:
```cpp
#include <edgerunner/edgerunner.hpp>
#include <edgerunner/model.hpp>

auto model = edge::createModel("/path/to/model");

model.applyDelegate(DELEGATE::NPU);

auto input = model.getInput(0).getTensorAs<float>();

/* overwrite input data */

model.execute();

auto output = model.getOutput(0).getTensorAs<float>();

/* interpret output data */
```
See examples for more detailed usage.
See model.hpp and tensor.hpp for the complete API.
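As a concrete illustration, here is a minimal sketch that fleshes out the pattern above for a model with a single input and a single output. It assumes that `getTensorAs<float>()` returns a writable, iterable view over the tensor data (e.g. a span) and that `"model.tflite"` is a hypothetical model path; consult the examples and the headers above for the exact types and signatures.

```cpp
#include <algorithm>
#include <iostream>

#include <edgerunner/edgerunner.hpp>
#include <edgerunner/model.hpp>

int main() {
    /* "model.tflite" is a placeholder; use any supported model format */
    auto model = edge::createModel("model.tflite");

    /* request NPU acceleration on supported devices */
    model.applyDelegate(DELEGATE::NPU);

    /* assumed: a writable view over the input tensor's float data */
    auto input = model.getInput(0).getTensorAs<float>();
    std::fill(input.begin(), input.end(), 0.0F); /* replace with real preprocessing */

    model.execute();

    /* read back and inspect the results */
    auto output = model.getOutput(0).getTensorAs<float>();
    for (auto value : output) {
        std::cout << value << '\n';
    }
    return 0;
}
```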
🏆 Contributing
See the CONTRIBUTING document.
Join our Discord to discuss any issues.
📜 Licensing
See the LICENSING document.