Models

Explore all available models and compare their capabilities.

Reasoning models

Qwen3-0.6B
DeepSeek-R1-Distill-Qwen-1.5B

Flagship chat models

Qwen2.5-0.5B-Instruct
Qwen2.5-1.5B-Instruct
Qwen2.5-3B-Instruct
Qwen2.5-Coder-0.5B-Instruct
Qwen2.5-HA-0.5B-Instruct
Llama-3.2-1B-Instruct
openbuddy-llama3.2-1b-v23.1-131k

Multimodal models

InternVL2_5-1B-MPO
InternVL3-1B-MPO
Qwen3-VL-2B-Instruct
SmolVLM-256M-Instruct
SmolVLM-500M-Instruct

Text-to-speech

Models that can convert text into natural-sounding spoken audio.

MeloTTS-English
MeloTTS-Chinese
MeloTTS-Japanese
MeloTTS-Spanish
CosyVoice2

Transcription

Models that can transcribe and translate audio into text.

Whisper-tiny
Whisper-base
Whisper-small
SenseVoice-small

Keyword spotting

Models that can detect specific keywords in audio streams.

Keyword spotting

Voice activity detection

Models that can detect whether there is speech in an audio stream.

Silero-vad

Automatic Speech Recognition

Models that can convert spoken language into text.

Automatic Speech Recognition

Vision

Models that can process images and perform tasks such as object detection and depth estimation.

Yolo11n
Depth-Anything-V2

Model Pages

DeepSeek-R1-Distill-Qwen-1.5B
Qwen3-0.6B
Qwen2.5-0.5B-Instruct
Qwen2.5-1.5B-Instruct
Qwen2.5-3B-Instruct
Qwen2.5-Coder-0.5B-Instruct
Llama-3.2-1B-Instruct
openbuddy-llama3.2-1b-v23.1-131k
InternVL2_5-1B-MPO
InternVL3-1B-MPO
Qwen3-VL-2B-Instruct
SmolVLM-256M-Instruct
SmolVLM-500M-Instruct
MeloTTS-English
MeloTTS-Chinese
MeloTTS-Japanese
MeloTTS-Spanish
CosyVoice2
Whisper-tiny
Whisper-base
Whisper-small
SenseVoice-small
Yolo11n
Depth-Anything-V2
Keyword spotting
Silero-vad
Automatic Speech Recognition
