LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)

If you use SillyTavern or MiniTavern for AI character-card roleplay, you have probably weighed the trade-off between cloud APIs (fast, smart, but your prompts leave your device) and local LLMs (private, unlimited, but tied to one machine). LM Studio closes that gap with LM Link—a feature that keeps inference on your own hardware while letting you chat from a phone, laptop, or tablet as if the model were running right beside you.

This guide explains how LM Link works, walks through setup step by step, and shows how to wire it into SillyTavern and MiniTavern for a privacy-first tavern workflow in 2026.

What Is LM Studio?

LM Studio is a desktop application for discovering, downloading, and running open-weight LLMs locally. It wraps the llama.cpp runtime behind a friendly GUI, supports GGUF model files, and exposes an OpenAI-compatible REST API on http://localhost:1234 by default.

Key LM Studio terms you will see in docs and forums:

Term	Meaning
Model loader	UI panel where you pick and load a GGUF model into VRAM/RAM
Local server	Built-in API server (port 1234) that SillyTavern/MiniTavern connect to
lms CLI	Command-line tool for headless servers (`lms server start`, `lms link enable`)
llmster	Headless LM Studio variant for GPU rigs and servers without a GUI
OpenAI-compatible endpoint	`/v1/chat/completions` and `/v1/completions` routes—same shape as OpenAI’s API

Unlike a cloud API, nothing in your character card, World Info, or chat history is sent to OpenAI, Anthropic, or DeepSeek—only to software you control.

What Is LM Link?

LM Link is LM Studio’s device-linking feature (built in partnership with Tailscale). It creates a private, end-to-end encrypted mesh network between machines you own. Once linked:

A powerful desktop at home can load and serve a 13B–70B model.
Your laptop or iPhone can use that model as if it were local—it appears in the model loader with a “Linked” badge.
Requests to localhost:1234 on the client device are transparently routed to the remote GPU machine.

LM Link is currently in preview and rolling out in batches. Check lmstudio.ai/link for availability.

How LM Link Differs from Port Forwarding

Traditional remote access means opening router ports or exposing a public IP—risky for a home LLM server. LM Link uses Tailscale mesh VPNs: devices talk over encrypted tunnels, never exposed to the open internet. Neither LM Studio nor Tailscale can read your prompts; they only handle device discovery and routing.

Why Privacy-Focused Tavern Users Should Care

SillyTavern and MiniTavern users who care about privacy typically want:

No third-party inference — character backstories, persona prompts, and intimate RP stay off corporate servers.
Mobile access — phones are where most MiniTavern sessions happen, but phones cannot run 13B+ models smoothly.
One card library, many devices — import once, play on desktop ST, MiniTavern iOS, or Web Tavern without re-uploading PNG cards to a cloud.

LM Link solves (2) and strengthens (1): your home PC becomes the inference engine, while your phone is just the chat front-end. Combined with MiniTavern’s offline card library and SillyTavern-compatible PNG imports, you get a full local-first tavern stack.

Architecture at a Glance

[Home PC — LM Studio]
  ├── Loaded GGUF model (e.g. Qwen2.5 14B)
  ├── Local server :1234
  └── LM Link enabled (Tailscale mesh)
           │
           │  E2E encrypted
           ▼
[Phone / Laptop — client]
  ├── LM Studio + LM Link (or Locally app on iOS)
  ├── SillyTavern / MiniTavern → localhost:1234
  └── API requests routed to home GPU

Your chat UI and character cards stay on the client; only token generation happens on the remote machine.

Prerequisites

Home machine: Windows, macOS, or Linux with a GPU (8 GB+ VRAM for 7B–14B quantized models; 16 GB+ for larger).
Client device: Another PC, Mac, or iPhone/iPad with LM Link access.
LM Studio 0.3.4+ (LM Link requires a recent build—check release notes).
Same LM Link account signed in on all devices.
Character cards ready in SillyTavern or MiniTavern (browse the Card Quest Market or import via the MiniTavern Chrome Extension).

Step 1: Set Up LM Studio on Your Home PC

Download LM Studio from lmstudio.ai.
Open the Discover tab and search for a roleplay-friendly model, for example:
- Qwen2.5-14B-Instruct (strong instruction following)
- Mistral-7B-Instruct-v0.3 (fast on modest GPUs)
- Llama-3.1-8B-Instruct (balanced quality/speed)
Download a Q4_K_M or Q5_K_M GGUF quant—good quality with reasonable VRAM use.
Load the model in the Chat or Developer tab and confirm it responds.

Step 2: Enable LM Link on the Home PC

Open Settings → LM Link.
Toggle Enable LM Link to ON.
Sign in with your LM Link account (Tailscale-backed).
Enable Allow loading models on this machine so remote clients can trigger loads.
Leave LM Studio running with the model loaded.

For headless GPU rigs, use the CLI:

lms login
lms link enable
lms server start --port 1234

Step 3: Link Your Phone or Laptop

On iPhone / iPad: Locally app

LM Studio acquired the Locally iOS app and integrated it into the LM Link mesh. After LM Studio 0.4.16+:

Install Locally from the App Store.
Sign in with the same LM Link account as your home PC.
Linked models from your desktop appear in Locally—you can chat natively on the go.

This path is ideal for quick mobile chats without configuring API URLs.

On laptop or second PC: LM Studio client

Install LM Studio on the client machine.
Settings → LM Link → Enable → sign in with the same account.
Open the model loader—remote models show as Linked.
Optionally set a preferred device so API calls route to your home GPU.

Step 4: Start the Local API Server

On the client device (the one running SillyTavern or MiniTavern):

In LM Studio, open the Developer tab (or Local Server panel).
Click Start Server on port 1234.
Confirm the server status shows running.

With LM Link active, requests to http://localhost:1234/v1/chat/completions are served by whichever linked device holds the loaded model—usually your home PC.

Test with curl:

curl http://localhost:1234/v1/models

You should see the remote model listed.

Step 5: Connect SillyTavern

Open SillyTavern (desktop or self-hosted).
Click the plug icon → API Connections.
Select Chat Completion (OpenAI-compatible) or Text Completion / KoboldAI depending on your ST version.
Set the API URL to http://localhost:1234/v1 (chat) or http://localhost:1234 (text completion).
Click Connect and pick the linked model from the dropdown.
Import a character card and send a test message.

Tips for local roleplay:

Shorten verbose system prompts—local models handle concise cards better.
Set context to 4096–8192 tokens if VRAM allows.
Temperature 0.7–0.9 works well for character RP.
See our local LLM privacy guide for card tuning details.

Step 6: Connect MiniTavern on Mobile

MiniTavern’s Multi-Model Hub supports custom OpenAI-compatible endpoints—the same API LM Studio exposes.

At home (same Wi-Fi, no LM Link needed):

Find your PC’s LAN IP (e.g. 192.168.1.42).
In MiniTavern → model settings, add a custom endpoint: http://192.168.1.42:1234/v1.
Ensure LM Studio’s server allows connections from your network (check CORS / “serve on local network” if available).

Away from home (with LM Link):

On a laptop with LM Studio + LM Link + local server running, use http://localhost:1234/v1 in MiniTavern if you sideload or use a remote-desktop workflow.
On iPhone, Locally is the native LM Link client; use MiniTavern with a cloud-free card library for cards and switch to Locally for linked inference—or use Web Tavern on a linked laptop.

The MiniTavern workflow: discover cards on the Character Card Market → manage with the Chrome Extension → play on iOS/Android with your chosen API backend.

Recommended Models for Character-Card Roleplay

Model	Size	Best for
Qwen2.5 14B Instruct	~9 GB Q4	Strong RP, follows card personality well
Mistral 7B Instruct v0.3	~5 GB Q4	Fast replies on 8 GB VRAM
Llama 3.1 8B Instruct	~5 GB Q4	Reliable instruction following
Gemma 2 9B	~6 GB Q4	Good dialogue, Google open weights

Avoid sub-3B models for complex character cards—they struggle with personality consistency and World Info triggers.

Troubleshooting

Problem	Fix
Linked model not visible	Confirm same LM Link account on both devices; restart LM Studio
”Connection refused” on :1234	Start the local server on the client; check firewall
Slow first token	Normal over WAN; home gigabit LAN is near-instant
Model loads on wrong device	Set preferred device in LM Link settings
SillyTavern empty replies	Match chat template to model family; reduce max tokens
LM Link not in settings	Feature is preview—update LM Studio or join waitlist

LM Link vs Ollama vs Cloud APIs

	LM Link + LM Studio	Ollama (LAN only)	Cloud API
Privacy	Full—your hardware	Full—your hardware	Data leaves device
Mobile away from home	Yes (encrypted mesh)	No (LAN only)	Yes
GUI model browser	Yes	CLI-first	N/A
OpenAI-compatible API	Yes (:1234)	Yes (:11434)	Yes
Setup complexity	Medium	Low	Lowest

Ollama remains excellent for same-machine or same-LAN setups. LM Link adds secure remote access without VPN configuration—valuable for tavern users who want phone RP powered by a home GPU.

Privacy Best Practices

Keep LM Studio updated—security patches for the local server matter.
Use open-weight models from trusted sources (Hugging Face, LM Studio catalog).
Disable cloud fallbacks in SillyTavern/MiniTavern so a misconfigured endpoint does not leak to OpenAI.
Encrypt sensitive card files if you store personal lore on disk.
Review Tailscale ACLs if you link multiple household devices.

Conclusion

LM Studio LM Link turns a home gaming PC into a private AI inference server for SillyTavern and MiniTavern—no cloud API keys, no usage caps, and end-to-end encrypted access from your phone. For the privacy-conscious tavern community, it is one of the most practical ways to combine mobile character-card roleplay with local model sovereignty.

Ready to build your private setup? Import cards via MiniTavern iOS/Android, browse the Character Card Market, and point your API connector at localhost:1234—your home GPU handles the rest.

LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)

LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)

What Is LM Studio?

What Is LM Link?

How LM Link Differs from Port Forwarding

Why Privacy-Focused Tavern Users Should Care

Architecture at a Glance

Prerequisites

Step 1: Set Up LM Studio on Your Home PC

Step 2: Enable LM Link on the Home PC

Step 3: Link Your Phone or Laptop

On iPhone / iPad: Locally app

On laptop or second PC: LM Studio client

Step 4: Start the Local API Server

Step 5: Connect SillyTavern

Step 6: Connect MiniTavern on Mobile

Recommended Models for Character-Card Roleplay

Troubleshooting

LM Link vs Ollama vs Cloud APIs

Privacy Best Practices

Conclusion

How to Create a Roleplay Character: A Step-by-Step Guide for AI Roleplay in 2026

Debugging Your SillyTavern Character Cards: A Troubleshooting Guide for Better AI Responses

Introduction: Why Build a Character Card from Scratch in 2026?

LM Studio LM Link: Run Local LLM APIs on Your PC and Chat from Your Phone (MiniTavern & SillyTavern Guide)

What Is LM Studio?

What Is LM Link?

How LM Link Differs from Port Forwarding

Why Privacy-Focused Tavern Users Should Care

Architecture at a Glance

Prerequisites

Step 1: Set Up LM Studio on Your Home PC

Step 2: Enable LM Link on the Home PC

Step 3: Link Your Phone or Laptop

On iPhone / iPad: Locally app

On laptop or second PC: LM Studio client

Step 4: Start the Local API Server

Step 5: Connect SillyTavern

Step 6: Connect MiniTavern on Mobile

Recommended Models for Character-Card Roleplay

Troubleshooting

LM Link vs Ollama vs Cloud APIs

Privacy Best Practices

Conclusion

Keep reading

How to Create a Roleplay Character: A Step-by-Step Guide for AI Roleplay in 2026

Debugging Your SillyTavern Character Cards: A Troubleshooting Guide for Better AI Responses

Introduction: Why Build a Character Card from Scratch in 2026?