| Model | Model size | Example checkpoints | Template |
| --- | --- | --- | --- |
| Baichuan 2 | 7B/13B | Baichuan2-7B-Chat, Baichuan2-13B-Chat | baichuan2 |
| BLOOM/BLOOMZ | 7.1B/176B | bloom-7b1, tr11-176B-logs | |
| ChatGLM3 | 6B | chatglm3-6b | chatglm3 |
| Command R | 35B/104B | aya-23-35B, c4ai-command-r-plus | cohere |
| DeepSeek (Code/MoE) | 7B/16B/67B/236B | deepseek-vl-7b-chat, deepseek-moe-16b-chat, deepseek-llm-67b-chat, DeepSeek-V2 | deepseek |
| DistilGPT2 | 82M | distilgpt2 | |
| Falcon | 7B/11B/40B/180B | falcon-7b, falcon-11B, falcon-40b, falcon-180B | falcon |
| Gemma/Gemma 2/CodeGemma | 2B/7B/9B/27B | gemma-2b, gemma-7b, codegemma-7b, gemma-2-9b, gemma-2-27b | gemma |
| GLM-4 | 9B | glm-4-9b-chat | glm4 |
| GPT2 | 124M/ | gpt2, gpt2-large, gpt2-xl | |
| InternLM2 | 7B/20B | internlm2-7b, internlm2-20b | intern2 |
| Llama 2 | 7B/13B/70B | Llama-2-7b-chat-hf, Llama-2-13b-hf, Llama-2-70b-hf | llama2 |
| Llama 3 | 8B/70B | Meta-Llama-3-8B, Meta-Llama-3-8B-Instruct, Meta-Llama-3-70B | llama3 |
| LLaVA-1.5 | 7B/13B | llava-1.5-7b-hf, llava-1.5-13b-hf | vicuna |
| Mistral/Mixtral | 7B/8x7B/8x22B | Mistral-7B-Instruct-v0.1, Mistral-7B-Instruct-v0.3, Mixtral-8x7B-v0.1, Mixtral-8x7B-Instruct-v0.1, Mixtral-8x22B-v0.1 | mistral |
| OLMo | 1B/7B | OLMo-1B-hf, OLMo-7B-hf | |
| OPT | 125M/350M | opt-125m, opt-350m | |
| Phi-1.5/Phi-2 | 1.3B/2.7B | phi-1_5, phi-2 | |
| Phi-3 | 4B/7B/14B | Phi-3-mini-128k-instruct, Phi-3-small-8k-instruct, Phi-3-medium-128k-instruct | phi |
| Qwen/Qwen1.5/Qwen2 (Code/MoE) | 0.5B/1.5B/7B/14B/32B/72B/110B | Qwen-7B-Chat, Qwen-72B, Qwen1.5-110B-Chat, Qwen2-0.5B, Qwen2-1.5B, Qwen2-7B-Instruct, Qwen2-57B-A14B-Instruct, Qwen2-72B-Instruct | qwen |
| StarCoder 2 | 3B/7B/15B | starcoder2-3b, starcoder2-7b, starcoder2-15b | |
| tiny-gpt2 | 110MB | tiny-gpt2 | |
| TinyLlama | 1.1B | TinyLlama-1.1B-Chat-v1.0 | |
| Vicuna | 7B/13B | vicuna-7b-v1.5, vicuna-13b-v1.5 | vicuna |
| Yi/Yi-1.5 | 6B/9B/34B | Yi-34B-Chat, Yi-1.5-6B-Chat, Yi-1.5-9B-Chat, Yi-1.5-34B-Chat | yi |
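
As a rough illustration of how the model-to-template pairing listed above might be used, the sketch below maps a checkpoint name to its template by prefix matching. The `TEMPLATE_BY_PREFIX` dictionary and the `guess_template` function are hypothetical names introduced here for illustration, not part of any library. The prefixes are derived only from the checkpoint names in this list, and families listed without a template return `None`.

```python
from typing import Optional

# Hypothetical lookup built from the list above (assumption: matching on a
# lowercase checkpoint-name prefix is enough to identify the model family).
TEMPLATE_BY_PREFIX = {
    "baichuan2": "baichuan2",
    "chatglm3": "chatglm3",
    "aya-23": "cohere",
    "c4ai-command-r": "cohere",
    "deepseek": "deepseek",
    "falcon": "falcon",
    "codegemma": "gemma",
    "gemma": "gemma",
    "glm-4": "glm4",
    "internlm2": "intern2",
    "llama-2": "llama2",
    "meta-llama-3": "llama3",
    "llava-1.5": "vicuna",
    "mistral": "mistral",
    "mixtral": "mistral",
    "phi-3": "phi",
    "qwen": "qwen",
    "vicuna": "vicuna",
    "yi-": "yi",
}


def guess_template(checkpoint: str) -> Optional[str]:
    """Return the template name for a checkpoint, or None if none is listed."""
    # Drop an optional "organization/" prefix and normalize case.
    name = checkpoint.split("/")[-1].lower()
    for prefix, template in TEMPLATE_BY_PREFIX.items():
        if name.startswith(prefix):
            return template
    return None


if __name__ == "__main__":
    for ckpt in ("Meta-Llama-3-8B-Instruct", "llava-1.5-7b-hf", "gpt2"):
        print(ckpt, "->", guess_template(ckpt))
```

Prefix matching keeps the lookup short but is only a heuristic. An explicit per-checkpoint mapping would be more robust if checkpoint names from different families ever share a prefix.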