Install once. Run fully offline. No accounts. No lock-in.
When you want more power, TuTu can tap distributed compute — on your terms.
TuTu removes friction at every step: instant local AI, zero cloud lock-in, and optional distributed scale when you need it.
Go from install to answers in under a minute. TuTu auto-downloads and optimizes top models for your hardware.
No accounts, no cloud calls, no hidden telemetry. Sensitive prompts and data stay on your machine.
Drop-in replacement for OpenAI's API on localhost:11434. Works with LangChain, LlamaIndex, and any OpenAI-compatible SDK.
Create custom models with a Tutufile — set system prompts, parameters, and adapters. Like a Dockerfile for AI.
Need more throughput? Opt in and unlock shared compute. Contribute idle GPU cycles and earn credits.
Ed25519 identity, gVisor sandbox, supply chain verification, Byzantine fault tolerance. Built for trust.
Enterprise AI tool access via Model Context Protocol. SLA tiers, usage metering, rate limiting built-in.
XP levels, achievements, weekly quests, passive earning. Max 1 notification/day. Respect-first design.
Built for 10M+ nodes across 7 continents. Hierarchical gossip, regional sharding, ML-driven scheduling.
No configuration. No accounts. No PhD required.
One command on macOS, Linux, or Windows. Single binary, zero dependencies.
tutu run llama3.2 — auto-downloads and starts chatting instantly.
Use the OpenAI-compatible API at localhost:11434 with any SDK.
Your idle GPU earns credits. Every node makes AI faster and cheaper for everyone.
TuTu speaks OpenAI's protocol. Any tool that works with OpenAI's API works with TuTu — just change the base URL.
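For example, here is a minimal sketch using the official openai Python package. It assumes TuTu exposes the compatible API under the conventional /v1 path; the model name and placeholder API key are illustrative.

```python
# Minimal sketch: point the official OpenAI SDK at a local TuTu
# instance. Assumes the compatible endpoint lives under /v1 (the
# usual convention); the key is unused locally but the SDK
# requires a non-empty value.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="tutu")

resp = client.chat.completions.create(
    model="llama3.2",  # any model you've pulled locally
    messages=[{"role": "user", "content": "Say hello from local AI."}],
)
print(resp.choices[0].message.content)
```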
Choose your platform. Be running AI models in under a minute.
Supports Intel & Apple Silicon.
Supports x86_64 and ARM64. APT/RPM repos coming soon.
PowerShell or WinGet. Requires Windows 10+.
Requires Go 1.24+. Single binary, no CGO required.
AI should be a utility, not a tax. TuTu gives developers private local intelligence first, then optional distributed scale.
TuTu Network is building a distributed AI supercomputer powered by real-world idle devices. We make advanced AI accessible without forcing users into expensive cloud contracts.
Start local in one command, keep full privacy, and stay productive with an OpenAI-compatible API that drops into your existing stack.
When extra capacity matters, opt into the network to access shared compute and contribute your idle GPU for credits. You keep control over when, how, and why your machine participates.
Zero telemetry. Unless you opt into the network, your data never leaves your machine. Period.
MIT licensed. Fully transparent. Community-driven development.
Written in Go. Single binary. Sub-50MB idle footprint.
Every contributor matters. Building together for everyone.
Track our progress, download the latest versions, and see what's new in every release.
Every release includes pre-built binaries for macOS, Linux, and Windows, detailed changelogs, and upgrade instructions.
Go to our GitHub releases page to see all versions, release notes, and download assets.
Click "Watch" → "Custom" → "Releases" on GitHub to get notified of every new version.
Run the install command again or download the latest binary. TuTu handles migrations automatically.
Replace your OpenAI calls by pointing your base URL at localhost. Works with any OpenAI-compatible SDK or framework.
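Here is what that swap looks like in LangChain, as a sketch under the same assumptions (the /v1 path convention and an illustrative model name; requires the langchain-openai package):

```python
# Sketch: swap a LangChain pipeline from OpenAI to a local TuTu
# model by changing the base URL. Assumes the /v1 convention.
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    model="mistral",                       # any locally pulled model
    base_url="http://localhost:11434/v1",  # point at TuTu, not OpenAI
    api_key="tutu",                        # unused locally, but required
)
print(llm.invoke("Summarize MCP in one sentence.").content)
```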
TuTu implements the MCP 2025-03-26 specification. Connect any AI client to any tool through one universal protocol.
MCP is an open standard that lets AI models interact with databases, APIs, file systems, and any external service — through one consistent interface.
AI coding assistants connected to your CI/CD. Support bots with CRM access. Data pipelines with SQL tools. DevOps agents managing Kubernetes — all via MCP.
4 SLA tiers (Free to Enterprise). Rate limiting, usage metering, and latency targets built-in. Scale from prototype to production without changing code.
JSON-RPC 2.0 over Streamable HTTP. Real-time tool execution, resource access, and prompt serving with session management and progress notifications.
Run tutu serve — the MCP endpoint is live at /mcp automatically.
Point Claude, ChatGPT, or any MCP-compatible client to http://localhost:11434/mcp.
The AI can now run models (tutu_run), list models (tutu_list), pull new models (tutu_pull), and check status (tutu_status).
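For the curious, here is the shape of a single request on the wire. This is a bare JSON-RPC 2.0 sketch, not a working client: a real MCP client performs the initialize handshake and session negotiation first, and the tool name comes from the list above.

```python
# Sketch of the JSON-RPC 2.0 message shape MCP uses over
# Streamable HTTP. A real client runs the initialize handshake
# first; this only illustrates a tools/call request body.
import requests

payload = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "tutu_list", "arguments": {}},
}
resp = requests.post(
    "http://localhost:11434/mcp",
    json=payload,
    # Streamable HTTP servers can answer with JSON or an event stream
    headers={"Accept": "application/json, text/event-stream"},
)
print(resp.text)
```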
| SLA Tier | Rate Limit | Latency Target | Price |
|---|---|---|---|
| Free | 10 req/min | Best effort | $0 |
| Pro | 100 req/min | < 500ms | Credits |
| Business | 1,000 req/min | < 200ms | Credits |
| Enterprise | 10,000 req/min | < 100ms | Credits |
Contribute GPU time, earn credits. Spend credits on network AI, fine-tuning, and priority access. Simple, fair, transparent.
Your idle GPU earns credits automatically. High-end GPUs earn 2.5× more. 99%+ uptime? 1.3× reliability bonus. Early adopters get 1.5× for the first 30 days.
Use credits for network inference (0.001 credits per token), fine-tuning jobs (10 credits per hour), MCP Pro tier (50 credits per month), and priority queue access (5 credits per request).
Double-entry bookkeeping, velocity checks, Benford's Law analysis, and full audit trails. Every transaction is balanced and verifiable.
Don't want to contribute GPU? Buy credits starting at $9.99/mo for 5,000 credits. Enterprise packages available for organizations.
| Package | Credits | Price | Best For |
|---|---|---|---|
| Starter | 500 | Free | Everyone. Included automatically. |
| Developer | 5,000 | $9.99/mo | Individual developers |
| Team | 25,000 | $39.99/mo | Small teams |
| Enterprise | 100,000 | $149.99/mo | Organizations |
| Custom | Unlimited | Contact us | Large-scale deployments |
Local AI is always 100% free. Credits are only for distributed network features. 500 free credits included for every user. No surprise charges.
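To make the earning math concrete, here is a back-of-the-envelope sketch. Only the multipliers and spend rates come from this page; the baseline credits-per-GPU-hour figure is a hypothetical placeholder.

```python
# Back-of-the-envelope credits math. The 2.5x / 1.3x / 1.5x
# multipliers and the 0.001-credits-per-token spend rate are from
# this page; the baseline earn rate is a made-up assumption.
BASE_CREDITS_PER_GPU_HOUR = 1.0  # hypothetical baseline

def nightly_earnings(hours, high_end=False, reliable=False, early=False):
    rate = BASE_CREDITS_PER_GPU_HOUR
    if high_end:
        rate *= 2.5  # high-end GPU bonus
    if reliable:
        rate *= 1.3  # 99%+ uptime reliability bonus
    if early:
        rate *= 1.5  # early-adopter bonus (first 30 days)
    return hours * rate

earned = nightly_earnings(8, high_end=True, reliable=True, early=True)
spent = 100_000 * 0.001  # 100k tokens of network inference
print(f"earned {earned:.1f} credits, spent {spent:.1f} credits")
```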
Customize any model for your use case. Use your own hardware or leverage the distributed network's GPU power with credits.
Fine-tune models on your own hardware using a Tutufile. Define adapters, system prompts, and training parameters — all in one declarative file. Zero cost.
Submit LoRA/QLoRA jobs to the network. Tasks are distributed across capable peers. Pay with credits. Get results faster than training alone.
Full fine-tune (48GB+ VRAM), LoRA (8GB+), or QLoRA (4GB+). Choose the method that fits your hardware and budget. Adapter merging lets you combine results.
Contribute GPU time for other users' fine-tuning jobs and earn credits. Your hardware works for you even when you're not using it.
Define your base model, training data, system prompt, and adapter parameters in a simple Tutufile (see the sketch after these steps).
Run tutu create my-model -f Tutufile locally, or tutu agent finetune --budget 100 to distribute across the network using credits.
Your fine-tuned model is ready. Run it with tutu run my-model. Share it on the marketplace if you want.
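Since this page doesn't show the full directive set, the following Tutufile is a hypothetical sketch modeled on the Dockerfile comparison above; every directive name and value is illustrative.

```
# Hypothetical Tutufile sketch; directive names are illustrative,
# not documented syntax.
FROM llama3.2                       # base model to adapt
SYSTEM "You are a support agent for Acme Inc."
PARAMETER temperature 0.7           # sampling parameter
ADAPTER lora rank=16 alpha=32       # LoRA adapter settings
TRAIN ./data/support-tickets.jsonl  # training data
```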
From solo hackers to enterprise teams, TuTu makes AI accessible to everyone.
"Finally, an AI runtime that respects my privacy. Installed in 10 seconds, running Llama 3.2 offline. No accounts, no telemetry. This is how software should be."
"The OpenAI-compatible API is seamless. Switched my entire LangChain pipeline from GPT-4 to local Mistral by just changing the URL. Zero code changes."
"The distributed compute vision is incredible. My gaming rig earns credits overnight while I sleep. It's like mining but actually useful — powering AI inference."
Run any model locally without limits. No accounts, no subscriptions, no token meters. Open source forever.
Be part of building the world's largest distributed AI supercomputer.