InternVL3-1B

Introduction

InternVL3 is an advanced series of multimodal large language models (MLLMs) that demonstrates outstanding overall performance. Compared with InternVL 2.5, InternVL3 delivers superior multimodal perception and reasoning capabilities, while further expanding its multimodal abilities to cover areas such as tool usage, GUI agents, industrial image analysis, and 3D visual perception.

Available NPU Models

Base Model

internvl3-1B-448-ax630c

The base model provides a context window of 1024, with a maximum output of 1280 tokens.
Supported platforms: LLM630 Compute Kit, Module LLM, and Module LLM Kit

  • Context window: 1024
  • Maximum output tokens: 1280
  • Time to first token (TTFT): 534.95 ms
  • Average generation speed: 9.78 tokens/s
  • Image encoding resolution: 448×448
  • Image encoding time: 2267.89 ms

Installation

apt install llm-model-internvl3-1b-448-ax630c

internvl3-1B-448-ax650

The base model provides a context window of 2048, with a maximum output of 2048 tokens.
Supported platform: AI Pyramid

  • Context window: 2048
  • Maximum output tokens: 2048
  • Time to first token (TTFT): 142.32 ms
  • Average generation speed: 26.67 tokens/s
  • Image encoding resolution: 448×448
  • Image encoding time: 393.08 ms

Installation

apt install llm-model-internvl3-1b-448-ax650
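
The benchmark figures listed for the two models can be combined into a rough end-to-end latency estimate. The Python sketch below is illustrative only: the `Benchmark` class and `estimate_latency_ms` function are hypothetical names, not part of any StackFlow API. It assumes that image encoding, prefill (TTFT), and decoding run strictly one after another, and that every output token is produced at the listed average speed. Real latency will vary with prompt length and system load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """Published figures for one model build (values copied from this page)."""
    ttft_ms: float          # time to first token, in milliseconds
    tokens_per_s: float     # average generation speed
    image_encode_ms: float  # time to encode one 448x448 image


# Figures from the spec lists above.
AX630C = Benchmark(ttft_ms=534.95, tokens_per_s=9.78, image_encode_ms=2267.89)
AX650 = Benchmark(ttft_ms=142.32, tokens_per_s=26.67, image_encode_ms=393.08)


def estimate_latency_ms(bench: Benchmark, output_tokens: int,
                        with_image: bool = True) -> float:
    """Rough estimate: image encoding + TTFT + decode time.

    Assumption (not from the source): the three stages do not overlap and
    decoding proceeds at the average speed for all output tokens.
    """
    encode_ms = bench.image_encode_ms if with_image else 0.0
    decode_ms = output_tokens / bench.tokens_per_s * 1000.0
    return encode_ms + bench.ttft_ms + decode_ms


if __name__ == "__main__":
    for name, bench in (("ax630c", AX630C), ("ax650", AX650)):
        total_s = estimate_latency_ms(bench, output_tokens=100) / 1000.0
        print(f"{name}: about {total_s:.1f} s for one image and 100 output tokens")
```

Under these assumptions, decode time dominates for longer answers, while image encoding is the larger fixed cost for short ones, especially on the ax630c build.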