
DeepSeek-R1-Distill-Qwen-1.5B-GPTQ-Int4

  1. Click to download the model and upload it to the Raspberry Pi 5.

File Description

m5stack@raspberrypi:~/rsp/DeepSeek-R1-Distill-Qwen-1.5B-GPTQ-Int4$ ls -lh
total 2.9M
-rw-rw-r-- 1 m5stack m5stack    0 Aug 12 10:56 config.json
drwxrwxr-x 2 m5stack m5stack 4.0K Aug 12 11:00 deepseek-r1-1.5b-gptq-int4-ax650
drwxrwxr-x 2 m5stack m5stack 4.0K Aug 12 10:56 deepseek-r1_tokenizer
-rw-rw-r-- 1 m5stack m5stack 4.2K Aug 12 10:56 deepseek-r1_tokenizer.py
-rw-rw-r-- 1 m5stack m5stack 976K Aug 12 10:58 main_axcl_aarch64
-rw-rw-r-- 1 m5stack m5stack 999K Aug 12 10:58 main_axcl_x86
-rw-rw-r-- 1 m5stack m5stack 932K Aug 12 10:58 main_prefill
-rw-rw-r-- 1 m5stack m5stack  277 Aug 12 10:56 post_config.json
-rw-rw-r-- 1 m5stack m5stack 7.4K Aug 12 10:56 README.md
-rw-rw-r-- 1 m5stack m5stack  533 Aug 12 10:56 run_deepseek-r1_1.5b_gptq_int4_ax650.sh
-rw-rw-r-- 1 m5stack m5stack  538 Aug 12 10:56 run_deepseek-r1_1.5b_gptq_int4_axcl_aarch64.sh
-rw-rw-r-- 1 m5stack m5stack  534 Aug 12 10:56 run_deepseek-r1_1.5b_gptq_int4_axcl_x86.sh
  2. Create a virtual environment
python -m venv deepseek
  3. Activate the virtual environment
source deepseek/bin/activate
  4. Install dependencies
pip install transformers jinja2
  5. Start the tokenizer parser
python deepseek-r1_tokenizer.py --port 12345
  6. Run the tokenizer service. The host IP defaults to localhost; set the port to 12345.
    Once it is running, you should see output like this:
(deepseek) m5stack@raspberrypi:~/rsp/DeepSeek-R1-Distill-Qwen-1.5B-GPTQ-Int4 $ python deepseek-r1_tokenizer.py --port 12345
None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
151646 <|begin▁of▁sentence|> 151643 <|end▁of▁sentence|>
<|begin▁of▁sentence|>You are DeepSeek-R1, You are a helpful assistant.<|User|>hello world<|Assistant|>
[151646, 151646, 2610, 525, 18183, 39350, 10911, 16, 11, 1446, 525, 264, 10950, 17847, 13, 151644, 14990, 1879, 151645]
http://localhost:12345
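If you want to confirm what the service is encoding, you can reproduce the printed IDs directly with the transformers package installed above. The following is an optional sketch, not part of the official scripts: it assumes the files in ./deepseek-r1_tokenizer form a standard Hugging Face tokenizer that AutoTokenizer can load, and the exact IDs may differ slightly depending on how special tokens are added.

# Optional check: load the local tokenizer and reproduce the IDs printed above.
# Assumption: ./deepseek-r1_tokenizer is loadable by AutoTokenizer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./deepseek-r1_tokenizer")
print(tok.bos_token_id, tok.bos_token, tok.eos_token_id, tok.eos_token)

messages = [
    {"role": "system", "content": "You are DeepSeek-R1, You are a helpful assistant."},
    {"role": "user", "content": "hello world"},
]
# Build the chat-template prompt string, then encode it to token IDs.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
print(tok(prompt)["input_ids"])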
Tip
The following steps should be run in a new terminal on the Raspberry Pi.
  7. Grant execute permission
chmod +x main_axcl_aarch64 run_deepseek-r1_1.5b_gptq_int4_axcl_aarch64.sh
  8. Start the DeepSeek-R1-Distill-Qwen-1.5B model inference service
./run_deepseek-r1_1.5b_gptq_int4_axcl_aarch64.sh

After successfully starting, you should see:

m5stack@raspberrypi:~/rsp/DeepSeek-R1-Distill-Qwen-1.5B-GPTQ-Int4$ ./run_deepseek-r1_1.5b_gptq_int4_axcl_aarch64.sh
build time: Feb 13 2025 15:44:57
[I][                            Init][ 111]: LLM init start
bos_id: 151646, eos_id: 151643
  3% | ██                                |   1 /  31 [0.00s<0.09s, 333.33 count/s] tokenizer init
  100% | ████████████████████████████████ |  31 /  31 [28.78s<28.78s, 1.08 count/s] init post axmodel ok
[I][                            Init][ 226]: max_token_len : 1023
[I][                            Init][ 231]: kv_cache_size : 256, kv_cache_num: 1023
[I][                     load_config][ 282]: load config:
{
    "enable_repetition_penalty": false,
    "enable_temperature": true,
    "enable_top_k_sampling": true,
    "enable_top_p_sampling": false,
    "penalty_window": 20,
    "repetition_penalty": 1.2,
    "temperature": 0.9,
    "top_k": 10,
    "top_p": 0.8
}

[I][                            Init][ 288]: LLM init ok
Type "q" to exit, Ctrl+c to stop current running
>> hello


Hello! I'm DeepSeek-R1, an AI assistant ready to help you with any questions or tasks you may have. How can I assist you today?

[N][                             Run][ 610]: hit eos,avg 13.29 token/s

>>
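The load_config block in the startup log, likely read from the post_config.json listed earlier, controls sampling: with these values the runtime keeps temperature scaling (0.9) and top-k sampling (k=10) enabled, while top-p sampling and the repetition penalty are disabled. The sketch below is plain Python for explanation only, not the runtime's actual code; it shows the standard temperature plus top-k procedure those switches refer to.

# Illustration only: standard temperature + top-k sampling, as enabled in the config above.
import math
import random

def sample_next_token(logits, temperature=0.9, top_k=10):
    # Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Keep only the top_k highest-scoring token ids.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:top_k]
    # Softmax over the kept logits, then draw one id proportionally to its probability.
    m = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - m) for i in top]
    return random.choices(top, weights=weights, k=1)[0]

# Toy example with a 100-token vocabulary and random scores.
print(sample_next_token([random.gauss(0, 1) for _ in range(100)]))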
Model                                    | Quant Method | TTFT (ms) | token/s
DeepSeek-R1-Distill-Qwen-1.5B-GPTQ-Int4  | w4a16        | -         | 13.29
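The token/s figure matches the "avg 13.29 token/s" line at the end of the run above. As a quick sanity check, that throughput corresponds to roughly 75 ms per generated token:

# Quick arithmetic check on the decode throughput reported above.
tokens_per_second = 13.29
print(f"{1000 / tokens_per_second:.1f} ms per generated token")  # about 75.2 ms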