← Back to blog

LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)

LM Studio's LM Link lets you run open-weight models on a home GPU and access them from your phone over an encrypted private network—ideal for privacy-focused MiniTavern and SillyTavern roleplay without cloud APIs.

Published
  • lm studio
  • lm link
  • local llm
  • privacy
  • sillytavern
  • minitavern
  • tutorial

LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)

If you use SillyTavern or MiniTavern for AI character-card roleplay, you have probably weighed the trade-off between cloud APIs (fast, smart, but your prompts leave your device) and local LLMs (private, unlimited, but tied to one machine). LM Studio closes that gap with LM Link—a feature that keeps inference on your own hardware while letting you chat from a phone, laptop, or tablet as if the model were running right beside you.

This guide explains how LM Link works, walks through setup step by step, and shows how to wire it into SillyTavern and MiniTavern for a privacy-first tavern workflow in 2026.

What Is LM Studio?

LM Studio is a desktop application for discovering, downloading, and running open-weight LLMs locally. It wraps the llama.cpp runtime behind a friendly GUI, supports GGUF model files, and exposes an OpenAI-compatible REST API on http://localhost:1234 by default.

Key LM Studio terms you will see in docs and forums:

TermMeaning
Model loaderUI panel where you pick and load a GGUF model into VRAM/RAM
Local serverBuilt-in API server (port 1234) that SillyTavern/MiniTavern connect to
lms CLICommand-line tool for headless servers (lms server start, lms link enable)
llmsterHeadless LM Studio variant for GPU rigs and servers without a GUI
OpenAI-compatible endpoint/v1/chat/completions and /v1/completions routes—same shape as OpenAI’s API

Unlike a cloud API, nothing in your character card, World Info, or chat history is sent to OpenAI, Anthropic, or DeepSeek—only to software you control.

LM Link is LM Studio’s device-linking feature (built in partnership with Tailscale). It creates a private, end-to-end encrypted mesh network between machines you own. Once linked:

  • A powerful desktop at home can load and serve a 13B–70B model.
  • Your laptop or iPhone can use that model as if it were local—it appears in the model loader with a “Linked” badge.
  • Requests to localhost:1234 on the client device are transparently routed to the remote GPU machine.

LM Link is currently in preview and rolling out in batches. Check lmstudio.ai/link for availability.

Traditional remote access means opening router ports or exposing a public IP—risky for a home LLM server. LM Link uses Tailscale mesh VPNs: devices talk over encrypted tunnels, never exposed to the open internet. Neither LM Studio nor Tailscale can read your prompts; they only handle device discovery and routing.

Why Privacy-Focused Tavern Users Should Care

SillyTavern and MiniTavern users who care about privacy typically want:

  1. No third-party inference — character backstories, persona prompts, and intimate RP stay off corporate servers.
  2. Mobile access — phones are where most MiniTavern sessions happen, but phones cannot run 13B+ models smoothly.
  3. One card library, many devices — import once, play on desktop ST, MiniTavern iOS, or Web Tavern without re-uploading PNG cards to a cloud.

LM Link solves (2) and strengthens (1): your home PC becomes the inference engine, while your phone is just the chat front-end. Combined with MiniTavern’s offline card library and SillyTavern-compatible PNG imports, you get a full local-first tavern stack.

Architecture at a Glance

[Home PC — LM Studio]
  ├── Loaded GGUF model (e.g. Qwen2.5 14B)
  ├── Local server :1234
  └── LM Link enabled (Tailscale mesh)

           │  E2E encrypted

[Phone / Laptop — client]
  ├── LM Studio + LM Link (or Locally app on iOS)
  ├── SillyTavern / MiniTavern → localhost:1234
  └── API requests routed to home GPU

Your chat UI and character cards stay on the client; only token generation happens on the remote machine.

Prerequisites

  • Home machine: Windows, macOS, or Linux with a GPU (8 GB+ VRAM for 7B–14B quantized models; 16 GB+ for larger).
  • Client device: Another PC, Mac, or iPhone/iPad with LM Link access.
  • LM Studio 0.3.4+ (LM Link requires a recent build—check release notes).
  • Same LM Link account signed in on all devices.
  • Character cards ready in SillyTavern or MiniTavern (browse the Card Quest Market or import via the MiniTavern Chrome Extension).

Step 1: Set Up LM Studio on Your Home PC

  1. Download LM Studio from lmstudio.ai.
  2. Open the Discover tab and search for a roleplay-friendly model, for example:
    • Qwen2.5-14B-Instruct (strong instruction following)
    • Mistral-7B-Instruct-v0.3 (fast on modest GPUs)
    • Llama-3.1-8B-Instruct (balanced quality/speed)
  3. Download a Q4_K_M or Q5_K_M GGUF quant—good quality with reasonable VRAM use.
  4. Load the model in the Chat or Developer tab and confirm it responds.
  1. Open Settings → LM Link.
  2. Toggle Enable LM Link to ON.
  3. Sign in with your LM Link account (Tailscale-backed).
  4. Enable Allow loading models on this machine so remote clients can trigger loads.
  5. Leave LM Studio running with the model loaded.

For headless GPU rigs, use the CLI:

lms login
lms link enable
lms server start --port 1234

On iPhone / iPad: Locally app

LM Studio acquired the Locally iOS app and integrated it into the LM Link mesh. After LM Studio 0.4.16+:

  1. Install Locally from the App Store.
  2. Sign in with the same LM Link account as your home PC.
  3. Linked models from your desktop appear in Locally—you can chat natively on the go.

This path is ideal for quick mobile chats without configuring API URLs.

On laptop or second PC: LM Studio client

  1. Install LM Studio on the client machine.
  2. Settings → LM Link → Enable → sign in with the same account.
  3. Open the model loader—remote models show as Linked.
  4. Optionally set a preferred device so API calls route to your home GPU.

Step 4: Start the Local API Server

On the client device (the one running SillyTavern or MiniTavern):

  1. In LM Studio, open the Developer tab (or Local Server panel).
  2. Click Start Server on port 1234.
  3. Confirm the server status shows running.

With LM Link active, requests to http://localhost:1234/v1/chat/completions are served by whichever linked device holds the loaded model—usually your home PC.

Test with curl:

curl http://localhost:1234/v1/models

You should see the remote model listed.

Step 5: Connect SillyTavern

  1. Open SillyTavern (desktop or self-hosted).
  2. Click the plug iconAPI Connections.
  3. Select Chat Completion (OpenAI-compatible) or Text Completion / KoboldAI depending on your ST version.
  4. Set the API URL to http://localhost:1234/v1 (chat) or http://localhost:1234 (text completion).
  5. Click Connect and pick the linked model from the dropdown.
  6. Import a character card and send a test message.

Tips for local roleplay:

  • Shorten verbose system prompts—local models handle concise cards better.
  • Set context to 4096–8192 tokens if VRAM allows.
  • Temperature 0.7–0.9 works well for character RP.
  • See our local LLM privacy guide for card tuning details.

Step 6: Connect MiniTavern on Mobile

MiniTavern’s Multi-Model Hub supports custom OpenAI-compatible endpoints—the same API LM Studio exposes.

At home (same Wi-Fi, no LM Link needed):

  1. Find your PC’s LAN IP (e.g. 192.168.1.42).
  2. In MiniTavern → model settings, add a custom endpoint: http://192.168.1.42:1234/v1.
  3. Ensure LM Studio’s server allows connections from your network (check CORS / “serve on local network” if available).

Away from home (with LM Link):

  1. On a laptop with LM Studio + LM Link + local server running, use http://localhost:1234/v1 in MiniTavern if you sideload or use a remote-desktop workflow.
  2. On iPhone, Locally is the native LM Link client; use MiniTavern with a cloud-free card library for cards and switch to Locally for linked inference—or use Web Tavern on a linked laptop.

The MiniTavern workflow: discover cards on the Character Card Market → manage with the Chrome Extension → play on iOS/Android with your chosen API backend.

ModelSizeBest for
Qwen2.5 14B Instruct~9 GB Q4Strong RP, follows card personality well
Mistral 7B Instruct v0.3~5 GB Q4Fast replies on 8 GB VRAM
Llama 3.1 8B Instruct~5 GB Q4Reliable instruction following
Gemma 2 9B~6 GB Q4Good dialogue, Google open weights

Avoid sub-3B models for complex character cards—they struggle with personality consistency and World Info triggers.

Troubleshooting

ProblemFix
Linked model not visibleConfirm same LM Link account on both devices; restart LM Studio
”Connection refused” on :1234Start the local server on the client; check firewall
Slow first tokenNormal over WAN; home gigabit LAN is near-instant
Model loads on wrong deviceSet preferred device in LM Link settings
SillyTavern empty repliesMatch chat template to model family; reduce max tokens
LM Link not in settingsFeature is preview—update LM Studio or join waitlist
LM Link + LM StudioOllama (LAN only)Cloud API
PrivacyFull—your hardwareFull—your hardwareData leaves device
Mobile away from homeYes (encrypted mesh)No (LAN only)Yes
GUI model browserYesCLI-firstN/A
OpenAI-compatible APIYes (:1234)Yes (:11434)Yes
Setup complexityMediumLowLowest

Ollama remains excellent for same-machine or same-LAN setups. LM Link adds secure remote access without VPN configuration—valuable for tavern users who want phone RP powered by a home GPU.

Privacy Best Practices

  1. Keep LM Studio updated—security patches for the local server matter.
  2. Use open-weight models from trusted sources (Hugging Face, LM Studio catalog).
  3. Disable cloud fallbacks in SillyTavern/MiniTavern so a misconfigured endpoint does not leak to OpenAI.
  4. Encrypt sensitive card files if you store personal lore on disk.
  5. Review Tailscale ACLs if you link multiple household devices.

Conclusion

LM Studio LM Link turns a home gaming PC into a private AI inference server for SillyTavern and MiniTavern—no cloud API keys, no usage caps, and end-to-end encrypted access from your phone. For the privacy-conscious tavern community, it is one of the most practical ways to combine mobile character-card roleplay with local model sovereignty.

Ready to build your private setup? Import cards via MiniTavern iOS/Android, browse the Character Card Market, and point your API connector at localhost:1234—your home GPU handles the rest.

More guides you might like