Open Source LLM Download Center

Top 10 open-source LLMs with download links, install guides & hardware recommendations

🚀 Quick Start: Install Ollama and run any model in 3 steps

No account needed, data stays local · MacBook / Mac mini / Windows / Linux supported

1. brew install ollama   (macOS via Homebrew; Windows and Linux installers are available at ollama.com)
2. ollama pull qwen3
3. ollama run qwen3
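
Once a model is running, Ollama also serves a local REST API (POST http://localhost:11434/api/generate), so the steps above can be scripted instead of used interactively. A minimal sketch, assuming an Ollama server is running locally and qwen3 has been pulled; the helper name is ours:

```python
import json
import urllib.request

def build_generate_request(model: str, prompt: str, stream: bool = False) -> dict:
    """JSON body for Ollama's local /api/generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": stream}

payload = build_generate_request("qwen3", "Reply with one word: hello")
print(json.dumps(payload))

# Uncomment to call a locally running Ollama server:
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

With stream=False the server returns one JSON object whose "response" field holds the full completion; everything stays on your machine.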


Featured Open Source Models

DeepSeek-R1
DeepSeek AI · MIT · 671B params · 16GB+ min VRAM · ★★★★★ Reasoning

Top reasoning model globally; math & code on par with proprietary flagships; efficient MoE architecture.

Use cases: Daily Chat · Code Dev · Document
ollama pull deepseek-r1

Qwen 3.5
Alibaba · Apache 2.0 · 0.6B–235B params · 4GB+ min VRAM · ★★★★★ Chinese NLP

Best Chinese comprehension globally; the 0.6B–235B range covers everything from low-end hardware to flagship GPUs.

Use cases: Daily Chat · Code Dev · Document · Low-end
ollama pull qwen3

MiniMax M2
MiniMax · CC-BY-NC · 456B params · 24GB+ min VRAM · ★★★★☆ Inference Speed

Sparse MoE for very fast, low-power inference; pair with OpenClaw for a 24/7 local AI assistant. Note that the CC-BY-NC license does not permit commercial use.

Use cases: Daily Chat · Document
huggingface-cli download MiniMaxAI/MiniMax-M1-40k

Zhipu GLM
Zhipu AI · Apache 2.0 · 9B–32B params · 8GB+ min VRAM · ★★★★☆ Code Ability

Tsinghua-affiliated flagship open model; leading in code & tool calling, with a mature deployment ecosystem.

Use cases: Daily Chat · Code Dev · Document
ollama pull glm4

Llama 4
Meta · Llama 4 license · 17B–400B params · 8GB+ min VRAM · ★★★★★ Ecosystem

Meta's flagship open series, backed by the world's largest open ecosystem; Scout and Maverick variants cover multiple sizes.

Use cases: Daily Chat · Code Dev
ollama pull llama4

Mistral Large 2
Mistral AI · MRL 2.0 · 123B params · 24GB+ min VRAM · ★★★★☆ Multilingual

Europe's top open model; excellent code & multilingual performance, privacy-compliant.

Use cases: Daily Chat · Code Dev
ollama pull mistral-large

Gemma 3
Google · Gemma ToU · 1B–27B params · 4GB+ min VRAM · ★★★★☆ Lightweight

Google's lightweight open model; the 1B variant runs on 4GB VRAM, making it a first choice for low-end devices.

Use cases: Daily Chat · Low-end
ollama pull gemma3

Phi-4
Microsoft · MIT · 14B params · 8GB min VRAM · ★★★★★ Value

Microsoft's compact model; at 14B it outperforms same-size rivals, and the MIT license permits commercial use.

Use cases: Code Dev · Low-end
ollama pull phi4

MiniCPM
Tsinghua / ModelBest · Apache 2.0 · 3B–4B params · 4GB min VRAM · ★★★★☆ Edge Deploy

Ultra-lightweight model that runs on phones; best suited to edge deployment, with excellent Chinese support.

Use cases: Daily Chat · Low-end
ollama pull minicpm-v

Yi-34B
01.AI · Apache 2.0 · 34B params · 16GB+ min VRAM · ★★★★☆ CN+EN

01.AI's flagship open model; balanced Chinese-English capability with excellent long-document understanding & writing.

Use cases: Daily Chat · Document
ollama pull yi:34b

Hardware Reference

Choose a model based on your GPU's VRAM: the more VRAM you have, the larger the model (in parameters) you can run.

GPU VRAM | Recommended Models
4 GB     | Gemma 3 (1B/4B), MiniCPM-3B, Qwen3-0.6B
8 GB     | Qwen3-8B, Phi-4 (14B quantized), GLM-9B
16 GB    | DeepSeek-R1-14B, Qwen3-14B, Yi-34B (quantized)
24 GB+   | DeepSeek-R1-32B, Qwen3-32B, Mistral-22B

Tip: All models can be downloaded and run via Ollama: ollama pull <model-name>
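
The table's rows follow a simple rule of thumb: weight memory ≈ parameter count × bits-per-weight ÷ 8, plus some headroom for the KV cache and activations. A rough sketch of that arithmetic (the function name and the flat 1 GB overhead are our assumptions; real usage varies with context length and runtime):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead_gb: float = 1.0) -> float:
    """Rough VRAM needed to load a model: weight memory plus a flat
    allowance for KV cache / activations."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit ≈ 1 GB
    return round(weight_gb + overhead_gb, 1)

# A 14B model at 4-bit quantization lands in the 8 GB row above:
print(estimate_vram_gb(14, 4))   # → 8.0
```

The same arithmetic explains why a 32B model at 4-bit wants a 24 GB card (about 17 GB for weights plus cache headroom), and why quantization is what makes Yi-34B fit in 16 GB.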

Model Tier List

March 2026 · Unified benchmark comparison

Tier | Model | MMLU-Pro (General Knowledge) | GPQA Diamond (Science Reasoning) | SWE-Bench (Code Repair) | Arena Elo (Human Preference) | VRAM

S Tier — On par with leading proprietary models
S | Qwen 3.5 (397B/17B · Alibaba)                         | 84.6% | 82.1% | 62.5% | 1451 | 8GB+
S | DeepSeek-R1 (685B/37B · DeepSeek AI)                  | 84.0% | 85.3% | 49.2% | 1420 | 16GB+
S | Zhipu GLM-5 (744B/40B · Zhipu AI)                     | 70.4% | 86.0% | 77.8% | 1452 | 24GB+

A Tier — Flagship capability, manageable hardware
A | Llama 4 Maverick (400B/17B · Meta)                    | 83.2% | 78.5% | 55.8% | 1320 | 8GB+
A | Mistral Large 3 (675B/41B · Mistral AI)               | 82.8% | 79.3% | 54.1% | 1315 | 24GB+

B Tier — Runs on a consumer GPU, excellent overall capability
B | Llama 4 Scout (109B/17B · Meta · 10M context)         | 78.5% | 74.2% | 48.5% | 1280 | 8GB
B | Gemma 3 27B (27B · Google)                            | 67.5% | 42.4% | 35.2% | 1220 | 16GB

C Tier — 4–8GB VRAM, best for edge/low-end devices
C | Phi-4 (14B · Microsoft · MIT)                         | 75.2% | 56.1% | 41.3% | 1200 | 8GB
C | Yi-1.5-34B (34B · 01.AI)                              | 63.1% | 40.2% | 31.5% | 1140 | 16GB
C | MiniCPM-o 4.5 (9B · Tsinghua/ModelBest · multimodal)  | 58.3% | 38.5% | 28.1% | 1150 | 6GB
Legend: MMLU-Pro = General Knowledge · GPQA Diamond = PhD-level Science · SWE-Bench = Code Repair (max = 77.8%) · Arena Elo = Human Preference (max = 1500)

Sources: Artificial Analysis · LMSYS Chatbot Arena · Official model reports (March 2026) · Some scores are community test estimates

S Tier — Comparable to top proprietary models, high hardware demand

Zhipu GLM-5 · Apache 2.0
744B total / 40B active · 24GB+ · Elo 1452
SWE-Bench 77.8% · GPQA 86.0% · MMLU-Pro 70.4%
Highlights: Code #1 · China Chip

DeepSeek-R1 · MIT
685B total / 37B active · 16GB+ · MATH 97.3%
AIME 2025 79.8% · GPQA 85.3% · MMLU-Pro 84.0%
Highlights: Best Reasoning · Math #1

Qwen 3.5 · Apache 2.0
397B total / 17B active · 8GB+ · Elo 1451
SWE-Bench 62.5% · GPQA 82.1% · MMLU-Pro 84.6%
Highlights: Chinese #1 · 201 Languages

A Tier — Flagship capability, manageable hardware requirements

Llama 4 Maverick · Llama 4 license
400B total / 17B active · 8GB+
SWE-Bench 55.8% · GPQA 78.5% · MMLU-Pro 83.2%
Highlights: 1M Context · Largest Ecosystem

Mistral Large 3 · Apache 2.0
675B total / 41B active · 24GB+
SWE-Bench 54.1% · GPQA 79.3% · MMLU-Pro 82.8%
Highlights: EU Compliant · Strong Multilingual

B Tier — Runs on a single consumer GPU, excellent overall capability

Llama 4 Scout · Llama 4 license
109B total / 17B active · single H100
MMLU-Pro 78.5% · Context 10M
Highlights: 10M Long Context · Runs on a Single GPU

Gemma 3 27B · Gemma ToU
27B · 16GB consumer GPU
HumanEval 78.5% · MMLU-Pro 67.5% · GPQA 42.4%
Highlights: Best on a Consumer GPU · Multimodal
