Open Source LLM Download Center
Top 10 open-source LLMs with download links, install guides & hardware recommendations
No account needed, data stays local · MacBook / Mac mini / Windows / Linux supported
macOS (Homebrew): brew install ollama
Linux: curl -fsSL https://ollama.com/install.sh | sh
Windows (PowerShell): irm https://ollama.com/install.ps1 | iex
Download a model: ollama pull qwen3
Chat with it: ollama run qwen3
Featured Open Source Models
DeepSeek-R1
DeepSeek AI · Top reasoning model globally; math & code on par with proprietary flagships; efficient MoE architecture
ollama pull deepseek-r1
Qwen 3.5
Alibaba · Best Chinese comprehension globally; 0.6B–235B scales covering low-end to flagship hardware
ollama pull qwen3
MiniMax M2
MiniMax · Sparse MoE with ultra-fast, low-power inference; pair with OpenClaw for a 24/7 local AI assistant
huggingface-cli download MiniMaxAI/MiniMax-M2
Zhipu GLM
Zhipu AI · Tsinghua flagship open model; leading in code & tool calling; mature deployment ecosystem
ollama pull glm4
Llama 4
Meta · Meta's flagship open series with the world's largest open ecosystem; Scout/Maverick multi-size lineup
ollama pull llama4
Mistral Large 2
Mistral AI · Europe's top open model; excellent code & multilingual ability; privacy-compliant
ollama pull mistral-large
Gemma 3
Google · Google's lightweight open model; the 1B variant runs on 4 GB VRAM; first choice for low-end devices
ollama pull gemma3
Phi-4
Microsoft · Microsoft's precision small model; 14B outperforms same-size rivals; MIT license allows commercial use
ollama pull phi4
MiniCPM
Tsinghua / ModelBest · Ultra-lightweight model that runs on phones; best for edge deployment; excellent Chinese support
ollama pull minicpm-v
Yi-34B
01.AI · 01.AI's flagship open model; balanced Chinese–English; excellent long-document understanding & writing
ollama pull yi:34b
Hardware Reference
Choose a model based on your GPU's VRAM: the more VRAM you have, the larger the model you can run.
| GPU VRAM | Recommended Models |
|---|---|
| 4 GB | Gemma 3 (1B/4B), MiniCPM-3B, Qwen3-0.6B |
| 8 GB | Qwen3-8B, Phi-4 (14B quantized), GLM-9B |
| 16 GB | DeepSeek-R1-14B, Qwen3-14B, Yi-34B (quantized) |
| 24 GB+ | DeepSeek-R1-32B, Qwen3-32B, Mistral-22B |
Tip: All models can be downloaded and run via Ollama: `ollama pull <model-name>`
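The VRAM tiers above follow from a simple rule of thumb: a model needs roughly parameter-count × bytes-per-weight of memory, plus headroom for the KV cache and activations. A minimal Python sketch of that estimate (the 4-bit default and the 1.2× overhead factor are illustrative assumptions, not official figures):

```python
def vram_gb(params_b: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB for a model with params_b billion parameters
    at the given quantization bit width, with ~20% extra for the KV cache
    and activations (the overhead factor is an assumption)."""
    return params_b * bits / 8 * overhead

# An 8B model at 4-bit quantization fits comfortably in 8 GB:
print(round(vram_gb(8), 1))    # ≈ 4.8
# A 14B model at 4-bit needs about 8.4 GB, tight on an 8 GB card:
print(round(vram_gb(14), 1))   # ≈ 8.4
# A 32B model at 4-bit wants ~19 GB, hence the 24 GB+ tier:
print(round(vram_gb(32), 1))   # ≈ 19.2
```

These estimates line up with the table: 8B-class models land in the 8 GB row, 14B in 16 GB, and 32B in 24 GB+, with the gap covering longer contexts and higher-precision quantizations.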
Model Tier List
March 2026 · Unified benchmark comparison

Tier legend:

- S (Top tier): on par with leading proprietary models
- A: flagship capability on manageable hardware
- B: runs on a consumer GPU with excellent overall capability
- C: 4–8 GB VRAM; best for edge and low-end devices

| Tier | Model | Size · Vendor | MMLU-Pro (General Knowledge) | GPQA Diamond (Science Reasoning) | SWE-Bench (Code Repair) | Arena Elo (Human Preference) | VRAM |
|---|---|---|---|---|---|---|---|
| S | Qwen 3.5 | 397B/17B · Alibaba | 84.6% | 82.1% | 62.5% | 1451 | 8 GB+ |
| S | DeepSeek-R1 | 685B/37B · DeepSeek | 84.0% | 85.3% | 49.2% | 1420 | 16 GB+ |
| S | Zhipu GLM-5 | 744B/40B · Zhipu AI | 70.4% | 86.0% | 77.8% | 1452 | 24 GB+ |
| A | Llama 4 Maverick | 400B/17B · Meta | 83.2% | 78.5% | 55.8% | 1320 | 8 GB+ |
| A | Mistral Large 3 | 675B/41B · Mistral AI | 82.8% | 79.3% | 54.1% | 1315 | 24 GB+ |
| B | Llama 4 Scout | 109B/17B · Meta · 10M context | 78.5% | 74.2% | 48.5% | 1280 | 8 GB |
| B | Gemma 3 27B | 27B · Google | 67.5% | 42.4% | 35.2% | 1220 | 16 GB |
| C | Phi-4 | 14B · Microsoft · MIT | 75.2% | 56.1% | 41.3% | 1200 | 8 GB |
| C | Yi-1.5-34B | 34B · 01.AI | 63.1% | 40.2% | 31.5% | 1140 | 16 GB |
| C | MiniCPM-o 4.5 | 9B · Tsinghua/ModelBest · multimodal | 58.3% | 38.5% | 28.1% | 1150 | 6 GB |
Sources: Artificial Analysis · LMSYS Chatbot Arena · Official model reports (March 2026) · Some scores are community test estimates
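The tier placements can be roughly sanity-checked by averaging the three benchmark percentages per model. A quick sketch using the table's numbers (Arena Elo sits on a different scale and is omitted; the equal weighting is an arbitrary choice for illustration, not how the tiers were actually assigned):

```python
# (MMLU-Pro, GPQA Diamond, SWE-Bench) per model, from the table above
scores = {
    "Qwen 3.5":         (84.6, 82.1, 62.5),
    "DeepSeek-R1":      (84.0, 85.3, 49.2),
    "Zhipu GLM-5":      (70.4, 86.0, 77.8),
    "Llama 4 Maverick": (83.2, 78.5, 55.8),
    "Mistral Large 3":  (82.8, 79.3, 54.1),
    "Llama 4 Scout":    (78.5, 74.2, 48.5),
    "Gemma 3 27B":      (67.5, 42.4, 35.2),
    "Phi-4":            (75.2, 56.1, 41.3),
    "Yi-1.5-34B":       (63.1, 40.2, 31.5),
    "MiniCPM-o 4.5":    (58.3, 38.5, 28.1),
}

# Sort by unweighted mean of the three benchmarks, best first
ranked = sorted(scores, key=lambda m: -sum(scores[m]) / 3)
for model in ranked:
    print(f"{model:18s} {sum(scores[model]) / 3:5.1f}")
```

With equal weights the three S-tier models come out on top, with Zhipu GLM-5 first on this simple average thanks to its SWE-Bench score.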
Download LLMs Fast – Speed Matters
Downloading 50GB+ model files from Hugging Face requires a stable high-speed global network
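To see why bandwidth matters, transfer time is just size × 8 / bandwidth. A quick estimate in Python (decimal gigabytes assumed; real-world throughput also depends on the mirror and on parallel connections):

```python
def download_hours(size_gb: float, mbps: float) -> float:
    """Rough download time in hours for size_gb gigabytes (decimal)
    at a sustained rate of mbps megabits per second."""
    return size_gb * 8 * 1000 / mbps / 3600

# A 50 GB checkpoint on a 100 Mbit/s line takes over an hour:
print(round(download_hours(50, 100), 2))        # ≈ 1.11 hours
# On a gigabit connection it drops to under seven minutes:
print(round(download_hours(50, 1000) * 60, 1))  # ≈ 6.7 minutes
```

In practice, huggingface_hub's optional hf_transfer backend (enabled with `HF_HUB_ENABLE_HF_TRANSFER=1`) can help saturate fast links when downloading from Hugging Face.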
Related Tutorials
- DeepSeek R1 – Advanced Reasoning at 1/10th the Cost: GPT-4-level reasoning at a fraction of the cost. Full guide to DeepSeek R1: usage, local deployment and API calls. Read more →
- Qwen3.5 Ollama Setup – Run 0.8B–35B Free on PC & Mac: Install Qwen3.5 with Ollama on Windows, Mac or Linux in minutes. Run any size model for free, step by step. Read more →
- Qwen3.5 on iPhone – Run 9B AI Offline with MLX: Run Qwen3.5 9B fully offline on iPhone using Apple MLX. No cloud, no account, complete privacy. Read more →
- Qwen3.5 Android – Best Apps to Run AI Locally: Run Qwen3.5 locally on Android with MNN, llama.cpp or Jan AI. Install a 4B model offline on your phone. Read more →