Qwen2.5-1.5B

  1. Manually download the model and upload it to the Raspberry Pi 5 (a download sketch follows the clone command below), or pull the model repository using the following command.
Note
If git lfs is not installed, please refer to the git lfs installation guide to install it first.
git clone https://huggingface.co/AXERA-TECH/Qwen2.5-1.5B-Instruct-GPTQ-Int4
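
If you prefer the manual-download route, one option is the huggingface_hub Python package. This is only a sketch, and an assumption: huggingface_hub is not part of this repository and must be installed separately (pip install huggingface_hub); copy the downloaded directory to the Raspberry Pi 5 afterwards.

from huggingface_hub import snapshot_download

# Download the whole model repository into a local directory,
# then transfer it to the Raspberry Pi 5 (e.g. with scp).
snapshot_download(
    repo_id="AXERA-TECH/Qwen2.5-1.5B-Instruct-GPTQ-Int4",
    local_dir="Qwen2.5-1.5B-Instruct-GPTQ-Int4",
)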

File Description

m5stack@raspberrypi:~/rsp/Qwen2.5-1.5B-Instruct-GPTQ-Int4$ ls -lh
total 2.9M
-rw-rw-r-- 1 m5stack m5stack    0 Aug 12 10:48 config.json
-rw-rw-r-- 1 m5stack m5stack 976K Aug 12 10:48 main_axcl_aarch64
-rw-rw-r-- 1 m5stack m5stack 999K Aug 12 10:48 main_axcl_x86
-rw-rw-r-- 1 m5stack m5stack 932K Aug 12 10:48 main_prefill
-rw-rw-r-- 1 m5stack m5stack  277 Aug 12 10:48 post_config.json
drwxrwxr-x 2 m5stack m5stack 4.0K Aug 12 10:49 qwen2.5-1.5b-gptq-int4-ax650
drwxrwxr-x 2 m5stack m5stack 4.0K Aug 12 10:48 qwen2.5_tokenizer
-rw-rw-r-- 1 m5stack m5stack 4.2K Aug 12 10:48 qwen2.5_tokenizer.py
-rw-rw-r-- 1 m5stack m5stack 6.8K Aug 12 10:48 README.md
-rw-rw-r-- 1 m5stack m5stack  521 Aug 12 10:48 run_qwen2.5_1.5b_gptq_int4_ax650.sh
-rw-rw-r-- 1 m5stack m5stack  526 Aug 12 10:48 run_qwen2.5_1.5b_gptq_int4_axcl_aarch64.sh
-rw-rw-r-- 1 m5stack m5stack  522 Aug 12 10:48 run_qwen2.5_1.5b_gptq_int4_axcl_x86.sh
Note
If you have already created a qwen virtual environment before, you do not need to create it again; just activate it.
  2. Create a virtual environment
python -m venv qwen
  3. Activate the virtual environment
source qwen/bin/activate
  4. Install dependency packages
pip install transformers jinja2
  5. Start the tokenizer parser
python qwen2.5_tokenizer.py --port 12345
  6. Run the tokenizer service. The host IP defaults to localhost and the port is set to 12345. After it starts, the output is as follows (a short tokenization sketch follows the log):
(qwen) m5stack@raspberrypi:~/rsp/Qwen2.5-1.5B-Instruct-GPTQ-Int4 $ python qwen2.5_tokenizer.py --port 12345
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
None None 151645 <|im_end|>
<|im_start|>system
You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
<|im_start|>user
hello world<|im_end|>
<|im_start|>assistant

[151644, 8948, 198, 2610, 525, 1207, 16948, 11, 3465, 553, 54364, 14817, 13, 1446, 525, 264, 10950, 17847, 13, 151645, 198, 151644, 872, 198, 14990, 1879, 151645, 198, 151644, 77091, 198]
http://localhost:12345
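
The token IDs printed above are simply the Qwen2.5 chat template applied to the system and user messages. As a hedged sketch, and assuming the qwen2.5_tokenizer directory in this repository is a standard Hugging Face tokenizer, the same IDs can be reproduced with transformers:

from transformers import AutoTokenizer

# Assumption: qwen2.5_tokenizer/ holds a standard Hugging Face tokenizer.
tokenizer = AutoTokenizer.from_pretrained("qwen2.5_tokenizer")

messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": "hello world"},
]

# apply_chat_template wraps the messages in <|im_start|>/<|im_end|> markers,
# appends the assistant prompt, and returns the token id list shown in the log.
token_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(token_ids)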
Note
The following operations require opening a new terminal on the Raspberry Pi.
  7. Set executable permissions
chmod +x main_axcl_aarch64 run_qwen2.5_1.5b_gptq_int4_axcl_aarch64.sh
  8. Start the Qwen2.5 model inference service
./run_qwen2.5_1.5b_gptq_int4_axcl_aarch64.sh

After it starts successfully, the output is as follows:

m5stack@raspberrypi:~/rsp/Qwen2.5-1.5B-Instruct-GPTQ-Int4$ ./run_qwen2.5_1.5b_gptq_int4_axcl_aarch64.sh
build time: Feb 13 2025 15:44:57
[I][                            Init][ 111]: LLM init start
bos_id: -1, eos_id: 151645
  3% | ██                                |   1 /  31 [0.00s<0.09s, 333.33 count/s] tokenizer init
  100% | ████████████████████████████████ |  31 /  31 [28.75s<28.75s, 1.08 count/s] init post axmodel ok
[I][                            Init][ 226]: max_token_len : 1024
[I][                            Init][ 231]: kv_cache_size : 256, kv_cache_num: 1024
[I][                     load_config][ 282]: load config:
{
    "enable_repetition_penalty": false,
    "enable_temperature": true,
    "enable_top_k_sampling": true,
    "enable_top_p_sampling": false,
    "penalty_window": 20,
    "repetition_penalty": 1.2,
    "temperature": 0.9,
    "top_k": 10,
    "top_p": 0.8
}

[I][                            Init][ 288]: LLM init ok
Type "q" to exit, Ctrl+c to stop current running
>> hello
Hello! How can I assist you today?

[N][                             Run][ 610]: hit eos,avg 15.03 token/s

>>
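
The configuration block printed at load_config time (presumably the small post_config.json from the file listing) controls how the next token is sampled: temperature scaling is enabled with temperature 0.9, top-k sampling is enabled with top_k 10, and top-p sampling and the repetition penalty are disabled. As an illustrative sketch only (not the actual implementation inside main_axcl_aarch64), temperature plus top-k sampling works roughly like this:

import numpy as np

def sample_next_token(logits, temperature=0.9, top_k=10, rng=None):
    # Illustrative temperature + top-k sampling using the values from the config above.
    rng = rng or np.random.default_rng()
    scaled = logits / temperature          # higher temperature flattens the distribution
    top_idx = np.argsort(scaled)[-top_k:]  # keep only the top_k highest-scoring tokens
    top_logits = scaled[top_idx]
    probs = np.exp(top_logits - top_logits.max())
    probs /= probs.sum()                   # softmax over the retained tokens
    return int(rng.choice(top_idx, p=probs))

# Toy example with a fake vocabulary of 1000 tokens.
logits = np.random.randn(1000).astype(np.float32)
print(sample_next_token(logits))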