LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)
LM Studio's LM Link lets you run open-weight models on a home GPU and access them from your phone over an encrypted private network—ideal for privacy-focused MiniTavern and SillyTavern roleplay without cloud APIs.
- lm studio
- lm link
- local llm
- privacy
- sillytavern
- minitavern
- tutorial
LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)
If you use SillyTavern or MiniTavern for AI character-card roleplay, you have probably weighed the trade-off between cloud APIs (fast, smart, but your prompts leave your device) and local LLMs (private, unlimited, but tied to one machine). LM Studio closes that gap with LM Link—a feature that keeps inference on your own hardware while letting you chat from a phone, laptop, or tablet as if the model were running right beside you.
This guide explains how LM Link works, walks through setup step by step, and shows how to wire it into SillyTavern and MiniTavern for a privacy-first tavern workflow in 2026.
What Is LM Studio?
LM Studio is a desktop application for discovering, downloading, and running open-weight LLMs locally. It wraps the llama.cpp runtime behind a friendly GUI, supports GGUF model files, and exposes an OpenAI-compatible REST API on http://localhost:1234 by default.
Key LM Studio terms you will see in docs and forums:
| Term | Meaning |
|---|---|
| Model loader | UI panel where you pick and load a GGUF model into VRAM/RAM |
| Local server | Built-in API server (port 1234) that SillyTavern/MiniTavern connect to |
| lms CLI | Command-line tool for headless servers (lms server start, lms link enable) |
| llmster | Headless LM Studio variant for GPU rigs and servers without a GUI |
| OpenAI-compatible endpoint | /v1/chat/completions and /v1/completions routes—same shape as OpenAI’s API |
Unlike a cloud API, nothing in your character card, World Info, or chat history is sent to OpenAI, Anthropic, or DeepSeek—only to software you control.
What Is LM Link?
LM Link is LM Studio’s device-linking feature (built in partnership with Tailscale). It creates a private, end-to-end encrypted mesh network between machines you own. Once linked:
- A powerful desktop at home can load and serve a 13B–70B model.
- Your laptop or iPhone can use that model as if it were local—it appears in the model loader with a “Linked” badge.
- Requests to
localhost:1234on the client device are transparently routed to the remote GPU machine.
LM Link is currently in preview and rolling out in batches. Check lmstudio.ai/link for availability.
How LM Link Differs from Port Forwarding
Traditional remote access means opening router ports or exposing a public IP—risky for a home LLM server. LM Link uses Tailscale mesh VPNs: devices talk over encrypted tunnels, never exposed to the open internet. Neither LM Studio nor Tailscale can read your prompts; they only handle device discovery and routing.
Why Privacy-Focused Tavern Users Should Care
SillyTavern and MiniTavern users who care about privacy typically want:
- No third-party inference — character backstories, persona prompts, and intimate RP stay off corporate servers.
- Mobile access — phones are where most MiniTavern sessions happen, but phones cannot run 13B+ models smoothly.
- One card library, many devices — import once, play on desktop ST, MiniTavern iOS, or Web Tavern without re-uploading PNG cards to a cloud.
LM Link solves (2) and strengthens (1): your home PC becomes the inference engine, while your phone is just the chat front-end. Combined with MiniTavern’s offline card library and SillyTavern-compatible PNG imports, you get a full local-first tavern stack.
Architecture at a Glance
[Home PC — LM Studio]
├── Loaded GGUF model (e.g. Qwen2.5 14B)
├── Local server :1234
└── LM Link enabled (Tailscale mesh)
│
│ E2E encrypted
▼
[Phone / Laptop — client]
├── LM Studio + LM Link (or Locally app on iOS)
├── SillyTavern / MiniTavern → localhost:1234
└── API requests routed to home GPU
Your chat UI and character cards stay on the client; only token generation happens on the remote machine.
Prerequisites
- Home machine: Windows, macOS, or Linux with a GPU (8 GB+ VRAM for 7B–14B quantized models; 16 GB+ for larger).
- Client device: Another PC, Mac, or iPhone/iPad with LM Link access.
- LM Studio 0.3.4+ (LM Link requires a recent build—check release notes).
- Same LM Link account signed in on all devices.
- Character cards ready in SillyTavern or MiniTavern (browse the Card Quest Market or import via the MiniTavern Chrome Extension).
Step 1: Set Up LM Studio on Your Home PC
- Download LM Studio from lmstudio.ai.
- Open the Discover tab and search for a roleplay-friendly model, for example:
Qwen2.5-14B-Instruct(strong instruction following)Mistral-7B-Instruct-v0.3(fast on modest GPUs)Llama-3.1-8B-Instruct(balanced quality/speed)
- Download a Q4_K_M or Q5_K_M GGUF quant—good quality with reasonable VRAM use.
- Load the model in the Chat or Developer tab and confirm it responds.
Step 2: Enable LM Link on the Home PC
- Open Settings → LM Link.
- Toggle Enable LM Link to ON.
- Sign in with your LM Link account (Tailscale-backed).
- Enable Allow loading models on this machine so remote clients can trigger loads.
- Leave LM Studio running with the model loaded.
For headless GPU rigs, use the CLI:
lms login
lms link enable
lms server start --port 1234
Step 3: Link Your Phone or Laptop
On iPhone / iPad: Locally app
LM Studio acquired the Locally iOS app and integrated it into the LM Link mesh. After LM Studio 0.4.16+:
- Install Locally from the App Store.
- Sign in with the same LM Link account as your home PC.
- Linked models from your desktop appear in Locally—you can chat natively on the go.
This path is ideal for quick mobile chats without configuring API URLs.
On laptop or second PC: LM Studio client
- Install LM Studio on the client machine.
- Settings → LM Link → Enable → sign in with the same account.
- Open the model loader—remote models show as Linked.
- Optionally set a preferred device so API calls route to your home GPU.
Step 4: Start the Local API Server
On the client device (the one running SillyTavern or MiniTavern):
- In LM Studio, open the Developer tab (or Local Server panel).
- Click Start Server on port
1234. - Confirm the server status shows running.
With LM Link active, requests to http://localhost:1234/v1/chat/completions are served by whichever linked device holds the loaded model—usually your home PC.
Test with curl:
curl http://localhost:1234/v1/models
You should see the remote model listed.
Step 5: Connect SillyTavern
- Open SillyTavern (desktop or self-hosted).
- Click the plug icon → API Connections.
- Select Chat Completion (OpenAI-compatible) or Text Completion / KoboldAI depending on your ST version.
- Set the API URL to
http://localhost:1234/v1(chat) orhttp://localhost:1234(text completion). - Click Connect and pick the linked model from the dropdown.
- Import a character card and send a test message.
Tips for local roleplay:
- Shorten verbose system prompts—local models handle concise cards better.
- Set context to 4096–8192 tokens if VRAM allows.
- Temperature 0.7–0.9 works well for character RP.
- See our local LLM privacy guide for card tuning details.
Step 6: Connect MiniTavern on Mobile
MiniTavern’s Multi-Model Hub supports custom OpenAI-compatible endpoints—the same API LM Studio exposes.
At home (same Wi-Fi, no LM Link needed):
- Find your PC’s LAN IP (e.g.
192.168.1.42). - In MiniTavern → model settings, add a custom endpoint:
http://192.168.1.42:1234/v1. - Ensure LM Studio’s server allows connections from your network (check CORS / “serve on local network” if available).
Away from home (with LM Link):
- On a laptop with LM Studio + LM Link + local server running, use
http://localhost:1234/v1in MiniTavern if you sideload or use a remote-desktop workflow. - On iPhone, Locally is the native LM Link client; use MiniTavern with a cloud-free card library for cards and switch to Locally for linked inference—or use Web Tavern on a linked laptop.
The MiniTavern workflow: discover cards on the Character Card Market → manage with the Chrome Extension → play on iOS/Android with your chosen API backend.
Recommended Models for Character-Card Roleplay
| Model | Size | Best for |
|---|---|---|
| Qwen2.5 14B Instruct | ~9 GB Q4 | Strong RP, follows card personality well |
| Mistral 7B Instruct v0.3 | ~5 GB Q4 | Fast replies on 8 GB VRAM |
| Llama 3.1 8B Instruct | ~5 GB Q4 | Reliable instruction following |
| Gemma 2 9B | ~6 GB Q4 | Good dialogue, Google open weights |
Avoid sub-3B models for complex character cards—they struggle with personality consistency and World Info triggers.
Troubleshooting
| Problem | Fix |
|---|---|
| Linked model not visible | Confirm same LM Link account on both devices; restart LM Studio |
| ”Connection refused” on :1234 | Start the local server on the client; check firewall |
| Slow first token | Normal over WAN; home gigabit LAN is near-instant |
| Model loads on wrong device | Set preferred device in LM Link settings |
| SillyTavern empty replies | Match chat template to model family; reduce max tokens |
| LM Link not in settings | Feature is preview—update LM Studio or join waitlist |
LM Link vs Ollama vs Cloud APIs
| LM Link + LM Studio | Ollama (LAN only) | Cloud API | |
|---|---|---|---|
| Privacy | Full—your hardware | Full—your hardware | Data leaves device |
| Mobile away from home | Yes (encrypted mesh) | No (LAN only) | Yes |
| GUI model browser | Yes | CLI-first | N/A |
| OpenAI-compatible API | Yes (:1234) | Yes (:11434) | Yes |
| Setup complexity | Medium | Low | Lowest |
Ollama remains excellent for same-machine or same-LAN setups. LM Link adds secure remote access without VPN configuration—valuable for tavern users who want phone RP powered by a home GPU.
Privacy Best Practices
- Keep LM Studio updated—security patches for the local server matter.
- Use open-weight models from trusted sources (Hugging Face, LM Studio catalog).
- Disable cloud fallbacks in SillyTavern/MiniTavern so a misconfigured endpoint does not leak to OpenAI.
- Encrypt sensitive card files if you store personal lore on disk.
- Review Tailscale ACLs if you link multiple household devices.
Conclusion
LM Studio LM Link turns a home gaming PC into a private AI inference server for SillyTavern and MiniTavern—no cloud API keys, no usage caps, and end-to-end encrypted access from your phone. For the privacy-conscious tavern community, it is one of the most practical ways to combine mobile character-card roleplay with local model sovereignty.
Ready to build your private setup? Import cards via MiniTavern iOS/Android, browse the Character Card Market, and point your API connector at localhost:1234—your home GPU handles the rest.
Keep reading
More guides you might like
How to Create a Roleplay Character: A Step-by-Step Guide for AI Roleplay in 2026
Creating a compelling character for AI roleplay is more than just writing a name and a backstory. In 2026, character creation has evolved into a nuanced cr…
- roleplay
- character-creation
- ai-roleplay
- guide
Debugging Your SillyTavern Character Cards: A Troubleshooting Guide for Better AI Responses
If you’ve ever spent hours crafting the perfect character card in SillyTavern, only to have the AI respond with generic, outofcharacter, or outright broken…
- sillytavern
- character-cards
- troubleshooting
- errors
Introduction: Why Build a Character Card from Scratch in 2026?
The world of AI roleplay has evolved dramatically, and at the heart of it all lies the character card. Whether you're a seasoned roleplayer or a curious ne…
- sillytavern
- character-card
- creator
- guide