InternVL3-1B

  1. Manually download the model and upload it to the Raspberry Pi 5, or pull the model repository with the following command.
Note
If git lfs is not installed, please refer to the git lfs Installation Instructions first.
git clone https://huggingface.co/AXERA-TECH/InternVL3-1B
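
If you prefer to fetch the repository from Python instead of git, the huggingface_hub package offers an equivalent download; a minimal sketch (the package and call are standard Hugging Face tooling, not part of this guide):

# Alternative download using huggingface_hub (illustrative; install it first
# with `pip install huggingface_hub`).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="AXERA-TECH/InternVL3-1B",  # same repository as the git clone above
    local_dir="InternVL3-1B",           # download target on the Raspberry Pi 5
)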

File Description

(axcl) m5stack@raspberrypi5:~/InternVL3-1B $ ls -lh
total 17M
-rw-r--r-- 1 m5stack m5stack 3.8K Jul 25 16:04 config.json
-rw-r--r-- 1 m5stack m5stack 3.9K Jul 25 16:04 gradio_demo.py
drwxr-xr-x 2 m5stack m5stack 4.0K Jul 25 16:06 internvl3_1b_ax650
drwxr-xr-x 2 m5stack m5stack 4.0K Jul 25 16:04 internvl3_tokenizer
-rw-r--r-- 1 m5stack m5stack 6.6K Jul 25 16:04 internvl3_tokenizer.py
-rw-r--r-- 1 m5stack m5stack 6.4M Jul 25 16:05 main_api_ax650
-rw-r--r-- 1 m5stack m5stack 1.9M Jul 25 16:05 main_api_axcl_x86
-rw-r--r-- 1 m5stack m5stack 6.3M Jul 25 16:05 main_ax650
-rw-r--r-- 1 m5stack m5stack 1.8M Jul 25 16:05 main_axcl_x86
-rw-r--r-- 1 m5stack m5stack  277 Jul 25 16:04 post_config.json
-rw-r--r-- 1 m5stack m5stack 6.0K Jul 25 16:04 README.md
-rw-r--r-- 1 m5stack m5stack  495 Jul 25 16:04 run_internvl_3_1b_448_api_ax650.sh
-rw-r--r-- 1 m5stack m5stack  516 Jul 25 16:04 run_internvl_3_1b_448_api_axcl_x86.sh
-rw-r--r-- 1 m5stack m5stack  506 Jul 25 16:04 run_internvl_3_1b_448_ax650.sh
-rw-r--r-- 1 m5stack m5stack  527 Jul 25 16:04 run_internvl_3_1b_448_axcl_x86.sh
-rw-r--r-- 1 m5stack m5stack  50K Jul 25 16:04 ssd_car.jpg
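
Before continuing, you can optionally check that the files used in the next steps are present; a minimal Python sketch (file names taken from the listing above, the script itself is illustrative):

# Optional sanity check for the files referenced in the following steps.
from pathlib import Path

repo = Path.home() / "InternVL3-1B"        # assumes the clone location shown above
required = [
    "internvl3_tokenizer.py",              # tokenizer HTTP service
    "internvl3_1b_ax650",                  # compiled model files
    "internvl3_tokenizer",                 # tokenizer assets
    "post_config.json",                    # sampling parameters
    "ssd_car.jpg",                         # bundled test image
]
for name in required:
    print(f"{name}: {'OK' if (repo / name).exists() else 'MISSING'}")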
  2. Start the tokenizer service
python internvl3_tokenizer.py --port 12345
  3. The tokenizer service uses the default host address with the port set to 12345. Once it is running, information similar to the following appears:
(axcl) m5stack@raspberrypi5:~/InternVL3-1B $ python internvl3_tokenizer.py --port 12345
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None None 151645 <|im_end|> 151665 151667
context_len is  256
prompt is <|im_start|>system
You are ShuSheng·WanXiang, English name InternVL, a multimodal large language model jointly developed by Shanghai Artificial Intelligence Laboratory, Tsinghua University, and multiple partners.<|im_end|>
<|im_start|>user
Hello
<img></img>
<|im_end|>
<|im_start|>assistant
47
http://0.0.0.0:12345
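
The launcher started in the next step connects to this service; if you want to confirm the port is reachable before continuing, a minimal sketch (this check is not part of the repository):

# Quick check that the tokenizer service is listening (illustrative only;
# the port must match the --port argument used above, 12345 here).
import socket

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(2.0)
    reachable = s.connect_ex(("127.0.0.1", 12345)) == 0
print("tokenizer service reachable" if reachable else "tokenizer service not reachable")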
  4. Run InternVL3-1B. Make the launcher and run script executable first:
m5stack@raspberrypi5:~/InternVL3-1B $ chmod +x main_axcl_aarch64 run_internvl_3_1b_448_axcl_aarch64.sh
  5. Test image: ssd_car.jpg, included in the repository.
  6. Output information
m5stack@raspberrypi5:~/InternVL3-1B $ ./run_internvl_3_1b_448_axcl_aarch64.sh
[I][                            Init][ 160]: LLM init start
[I][                            Init][  34]: connect http://0.0.0.0:12345 ok
bos_id: -1, eos_id: 151645
img_start_token: 151665
img_context_token: 151667
input size: 1
    name:    image [unknown] [unknown]
        1 x 3 x 448 x 448 size:2408448


output size: 1
    name:   output
        1 x 256 x 896 size:917504

[I][                            Init][ 265]: IMAGE_CONTEXT_TOKEN: 151667, IMAGE_START_TOKEN: 151665
[I][                            Init][ 290]: image encoder input nchw@float32
[I][                            Init][ 320]: image encoder output float32

[I][                            Init][ 330]: image_encoder_height : 448, image_encoder_width: 448
[I][                            Init][ 332]: max_token_len : 2047
[I][                            Init][ 335]: kv_cache_size : 128, kv_cache_num: 2047
[I][                            Init][ 343]: prefill_token_num : 128
[I][                            Init][ 347]: grp: 1, prefill_max_token_num : 1
[I][                            Init][ 347]: grp: 2, prefill_max_token_num : 128
[I][                            Init][ 347]: grp: 3, prefill_max_token_num : 256
[I][                            Init][ 347]: grp: 4, prefill_max_token_num : 384
[I][                            Init][ 347]: grp: 5, prefill_max_token_num : 512
[I][                            Init][ 347]: grp: 6, prefill_max_token_num : 640
[I][                            Init][ 347]: grp: 7, prefill_max_token_num : 768
[I][                            Init][ 347]: grp: 8, prefill_max_token_num : 896
[I][                            Init][ 347]: grp: 9, prefill_max_token_num : 1024
[I][                            Init][ 351]: prefill_max_token_num : 1024
________________________
|    ID| remain cmm(MB)|
========================
|     0|           5764|
¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯¯
[I][                     load_config][ 282]: load config:
{
    "enable_repetition_penalty": false,
    "enable_temperature": true,
    "enable_top_k_sampling": true,
    "enable_top_p_sampling": false,
    "penalty_window": 20,
    "repetition_penalty": 1.2,
    "temperature": 0.9,
    "top_k": 10,
    "top_p": 0.8
}

[I][                            Init][ 448]: LLM init ok
Type "q" to exit, Ctrl+c to stop current running
prompt >> Please describe this image
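
The sampling parameters printed under load_config can be tuned and the program restarted; a minimal sketch for editing them, assuming they are stored in post_config.json from the file listing (key names taken from the log above):

# Adjust sampling parameters (assumes post_config.json holds the values shown
# in the load_config output; back up the file before editing).
import json
from pathlib import Path

cfg_path = Path.home() / "InternVL3-1B" / "post_config.json"
cfg = json.loads(cfg_path.read_text())
cfg["temperature"] = 0.7               # lower temperature for more deterministic replies
cfg["enable_repetition_penalty"] = True
cfg_path.write_text(json.dumps(cfg, indent=4))
print(json.dumps(cfg, indent=4))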