Qwen3.5-4B

INT4 Version

  1. Manually download the model and upload it to the Raspberry Pi 5, or pull the model repository with the following command.
Tip
If git lfs is not installed, install it first by following the git lfs installation guide.
git clone https://huggingface.co/AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047

File listing

m5stack@raspberrypi:~/rsp/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047 $ ls -lh
total 4.5G
-rw-rw-r-- 1 m5stack m5stack  977 Apr  1 09:39 config.json
-rw-rw-r-- 1 m5stack m5stack 347K Apr  1 09:41 image.png
-rw-rw-r-- 1 m5stack m5stack 1.2G Apr  1 09:43 model.embed_tokens.weight.bfloat16.bin
-rw-rw-r-- 1 m5stack m5stack  275 Apr  1 09:39 post_config.json
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l0_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l10_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l11_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l12_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l13_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l14_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l15_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l16_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l17_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l18_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l19_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l1_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l20_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l21_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l22_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:42 qwen3_5_text_p128_l23_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l24_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l25_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l26_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l27_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:39 qwen3_5_text_p128_l28_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l29_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l2_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l30_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l31_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l3_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:40 qwen3_5_text_p128_l4_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:41 qwen3_5_text_p128_l5_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:41 qwen3_5_text_p128_l6_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  67M Apr  1 09:41 qwen3_5_text_p128_l7_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:41 qwen3_5_text_p128_l8_together.axmodel
-rw-rw-r-- 1 m5stack m5stack  74M Apr  1 09:41 qwen3_5_text_p128_l9_together.axmodel
-rw-rw-r-- 1 m5stack m5stack 662M Apr  1 09:42 qwen3_5_text_post.axmodel
-rw-rw-r-- 1 m5stack m5stack 6.2M Apr  1 09:39 qwen3_5_tokenizer.txt
-rw-rw-r-- 1 m5stack m5stack 356M Apr  1 09:41 qwen3_5_vision.axmodel
-rw-rw-r-- 1 m5stack m5stack  15K Apr  1 09:39 README.md
Tip
If axllm is already installed, there is no need to install it again.
  2. Install axllm
curl -fsSL https://raw.githubusercontent.com/AXERA-TECH/ax-llm/axllm/install.sh | bash

After a successful installation, running axllm in the terminal prints the following usage information:

m5stack@raspberrypi:~ $ axllm 
Usage:
  axllm run <model_path> [options]    Run interactive chat mode
  axllm serve <model_path> [options]  Run HTTP API server mode

Arguments:
  model_path    Path to model directory containing config.json and model files

Serve options:
  --port <port> Server port (default: 8000)

Embedding model config:
  Set "is_embedding": true in config.json to enable /v1/embeddings.

Model directory structure:
  model_path/
    ├── config.json          # Model configuration
    ├── tokenizer.txt        # Tokenizer model
    ├── *.axmodel            # AXera model files
    └── post_config.json     # Post-processing config (optional)
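Before launching, you can sanity-check a model directory against the layout shown in the usage text. The helper below is our own minimal sketch (not part of axllm); it only verifies that the documented file kinds are present:

```python
import glob
import os

def check_model_dir(model_path: str) -> list[str]:
    """Return a list of missing required items, per axllm's documented layout."""
    missing = []
    if not os.path.isfile(os.path.join(model_path, "config.json")):
        missing.append("config.json")
    # The tokenizer file name varies per model (e.g. qwen3_5_tokenizer.txt).
    if not glob.glob(os.path.join(model_path, "*tokenizer*.txt")):
        missing.append("tokenizer .txt file")
    if not glob.glob(os.path.join(model_path, "*.axmodel")):
        missing.append("*.axmodel files")
    return missing

if __name__ == "__main__":
    missing = check_model_dir("Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047")
    if missing:
        print("missing:", ", ".join(missing))
    else:
        print("model directory looks complete")
```

post_config.json is optional, so the check deliberately does not require it.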
  3. Interactive chat mode (terminal)
axllm run Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047/
m5stack@raspberrypi:~/rsp $ axllm run Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047/
09:44:03.880 INF Init:218 | LLM init start
09:44:03.880 INF Init:226 | mixed attention enabled: full_attention_interval=4 ref_full_layer_idx=3
09:44:03.880 INF run:30 | AXCLWorker start with devid 0
tokenizer_type = 3
 97% | ###############################  |  34 /  35 [57.08s<58.75s, 0.60 count/s] init post axmodel ok,remain_cmm(3789 MB)       
09:45:01.287 INF Init:368 | max_token_len : 2047
09:45:01.287 INF Init:371 | kv_cache_size : 1024, kv_cache_num: 2047
09:45:01.287 INF Init:374 | prefill_token_num : 128
09:45:01.287 INF Init:379 | grp: 1, prefill_max_kv_cache_num : 1
09:45:01.287 INF Init:379 | grp: 2, prefill_max_kv_cache_num : 128
09:45:01.287 INF Init:379 | grp: 3, prefill_max_kv_cache_num : 256
09:45:01.287 INF Init:379 | grp: 4, prefill_max_kv_cache_num : 384
09:45:01.287 INF Init:379 | grp: 5, prefill_max_kv_cache_num : 512
09:45:01.287 INF Init:379 | grp: 6, prefill_max_kv_cache_num : 768
09:45:01.287 INF Init:379 | grp: 7, prefill_max_kv_cache_num : 1024
09:45:01.287 INF Init:379 | grp: 8, prefill_max_kv_cache_num : 1152
09:45:01.287 INF Init:384 | prefill_max_token_num : 1152
09:45:01.287 INF Init:27 | LLaMaEmbedSelector use mmap
100% | ################################ |  35 /  35 [57.08s<57.08s, 0.61 count/s] embed_selector init ok
09:45:06.702 INF Init:695 | Qwen-VL token ids: vision_start=248053 image_pad=248056 video_pad=248057
09:45:06.702 INF Init:728 | VisionModule init ok: type=Qwen3VL, tokens_per_block=144, embed_size=2560, out_dtype=fp32
09:45:06.702 INF load_config:282 | load config: 
09:45:06.702 INF load_config:282 | {
09:45:06.702 INF load_config:282 |     "enable_repetition_penalty": true,
09:45:06.702 INF load_config:282 |     "enable_temperature": true,
09:45:06.702 INF load_config:282 |     "enable_top_k_sampling": true,
09:45:06.702 INF load_config:282 |     "enable_top_p_sampling": true,
09:45:06.702 INF load_config:282 |     "penalty_window": 20,
09:45:06.702 INF load_config:282 |     "repetition_penalty": 1.0,
09:45:06.702 INF load_config:282 |     "temperature": 0.7,
09:45:06.702 INF load_config:282 |     "top_k": 20,
09:45:06.702 INF load_config:282 |     "top_p": 0.8
09:45:06.702 INF load_config:282 | }
09:45:06.702 WRN load_config:306 | Both top_p and top_k enabled; prefer top_p and disable top_k
09:45:06.702 INF Init:448 | LLM init ok
Commands:
  /q, /exit  退出
  /reset     重置 kvcache
  /dd        删除一轮对话
  /pp        打印历史对话
Ctrl+C: 停止当前生成
VLM enabled: after each prompt, input image path (empty = text-only). Use "video:<frames_dir>" for video.
----------------------------------------
prompt >> 

Text input

prompt >> hello
image >> 
09:45:44.708 INF SetKVCache:749 | prefill_grpid:2 kv_cache_num:128 precompute_len:0 input_num_token:20
09:45:44.708 INF SetKVCache:757 | current prefill_max_token_num:1152
09:45:44.708 INF SetKVCache:760 | first run
09:45:44.708 INF Run:818 | input token num : 20, prefill_split_num : 1
09:45:44.708 INF Run:858 | prefill chunk p=0 history_len=0 grpid=1 kv_cache_num=0 input_tokens=20
09:45:44.709 INF Run:881 | prefill indices shape: p=0 idx_elems=128 idx_rows=1 pos_rows=0
09:45:45.569 INF Run:1023 | ttft: 860.29 ms
<think>

</think>

Hello! How can I help you today?

09:45:47.825 NTC Run:1145 | hit eos,avg 5.76 token/s
09:45:47.826 INF GetKVCache:721 | precompute_len:33, remaining:1119
prompt >> 

Image input

prompt >> 描述图片内容
image >> bus.jpg
09:47:13.316 INF EncodeForContent:1121 | Qwen-VL pixel_values[0] bytes=884736 min=0 max=255 (w=384 h=384 tp=2 ps=16 sm=2)
09:47:13.474 INF EncodeForContent:1144 | vision cache store: bus.jpg
09:47:13.476 INF SetKVCache:749 | prefill_grpid:3 kv_cache_num:256 precompute_len:33 input_num_token:159
09:47:13.476 INF SetKVCache:757 | current prefill_max_token_num:1024
09:47:13.476 INF Run:818 | input token num : 159, prefill_split_num : 2
09:47:13.476 INF Run:858 | prefill chunk p=0 history_len=33 grpid=2 kv_cache_num=128 input_tokens=128
09:47:13.476 INF Run:881 | prefill indices shape: p=0 idx_elems=128 idx_rows=1 pos_rows=3
09:47:14.393 INF Run:858 | prefill chunk p=1 history_len=161 grpid=3 kv_cache_num=256 input_tokens=31
09:47:14.393 INF Run:881 | prefill indices shape: p=1 idx_elems=128 idx_rows=1 pos_rows=3
09:47:15.430 INF Run:1023 | ttft: 1953.40 ms
<think>

</think>

这张图片展示了一个城市街道的场景,画面中有一辆蓝色的公交车和几位行人。以下是详细描述:

1. **公交车**:
   - 公交车为蓝色,车身上有白色和黄色的标识,其中可以看到“ZERO EMISIONES”(西班牙语,意为“零排放”)字样,表明这是一辆环保或电动公交车。
   - 车门位于车辆前部,车窗较大,可以看到车内部分结构。
   - 车顶上有一个遮阳棚或广告牌。

2. **行人**:
   - 图片左侧有几位行人,他们正沿着人行道行走。
   - 其中一位穿着深色外套,另一位穿着浅色上衣,他们似乎在交谈或等待过马路。

3. **背景**:
   - 背景是一栋黄色的建筑,有多扇窗户和阳台,阳台上有黑色栏杆。
   - 建筑上方有一些绿色植物,可能是阳台上的盆栽。
   - 街道两侧有树木,增添了自然氛围。

4. **整体氛围**:
   - 图片整体给人一种安静、日常的城市生活感觉。
   - 天气看起来晴朗,光线充足,可能是白天。

如果你需要更具体的信息,请告诉我!

09:47:59.246 NTC Run:1145 | hit eos,avg 5.75 token/s
09:47:59.247 INF GetKVCache:721 | precompute_len:312, remaining:840
prompt >> 
  4. OpenAI-API-compatible server mode
Note
If an OpenAI API service based on StackFlow was installed previously, stop it first with systemctl stop llm-openai-api, or uninstall it with sudo apt remove llm-openai-api, to avoid a port conflict.
axllm serve Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047/
m5stack@raspberrypi:~/rsp $ axllm serve Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047/
09:49:16.652 INF Init:218 | LLM init start
09:49:16.652 INF Init:226 | mixed attention enabled: full_attention_interval=4 ref_full_layer_idx=3
09:49:16.652 INF run:30 | AXCLWorker start with devid 0
tokenizer_type = 3
 97% | ###############################  |  34 /  35 [56.58s<58.24s, 0.60 count/s] init post axmodel ok,remain_cmm(3789 MB)       
09:50:13.546 INF Init:368 | max_token_len : 2047
09:50:13.546 INF Init:371 | kv_cache_size : 1024, kv_cache_num: 2047
09:50:13.546 INF Init:374 | prefill_token_num : 128
09:50:13.546 INF Init:379 | grp: 1, prefill_max_kv_cache_num : 1
09:50:13.546 INF Init:379 | grp: 2, prefill_max_kv_cache_num : 128
09:50:13.546 INF Init:379 | grp: 3, prefill_max_kv_cache_num : 256
09:50:13.546 INF Init:379 | grp: 4, prefill_max_kv_cache_num : 384
09:50:13.546 INF Init:379 | grp: 5, prefill_max_kv_cache_num : 512
09:50:13.546 INF Init:379 | grp: 6, prefill_max_kv_cache_num : 768
09:50:13.546 INF Init:379 | grp: 7, prefill_max_kv_cache_num : 1024
09:50:13.546 INF Init:379 | grp: 8, prefill_max_kv_cache_num : 1152
09:50:13.546 INF Init:384 | prefill_max_token_num : 1152
09:50:13.546 INF Init:27 | LLaMaEmbedSelector use mmap
100% | ################################ |  35 /  35 [56.58s<56.58s, 0.62 count/s] embed_selector init ok
09:50:18.914 INF Init:695 | Qwen-VL token ids: vision_start=248053 image_pad=248056 video_pad=248057
09:50:18.914 INF Init:728 | VisionModule init ok: type=Qwen3VL, tokens_per_block=144, embed_size=2560, out_dtype=fp32
09:50:18.914 INF load_config:282 | load config: 
09:50:18.914 INF load_config:282 | {
09:50:18.914 INF load_config:282 |     "enable_repetition_penalty": true,
09:50:18.914 INF load_config:282 |     "enable_temperature": true,
09:50:18.914 INF load_config:282 |     "enable_top_k_sampling": true,
09:50:18.914 INF load_config:282 |     "enable_top_p_sampling": true,
09:50:18.914 INF load_config:282 |     "penalty_window": 20,
09:50:18.914 INF load_config:282 |     "repetition_penalty": 1.0,
09:50:18.914 INF load_config:282 |     "temperature": 0.7,
09:50:18.914 INF load_config:282 |     "top_k": 20,
09:50:18.914 INF load_config:282 |     "top_p": 0.8
09:50:18.914 INF load_config:282 | }
09:50:18.914 WRN load_config:306 | Both top_p and top_k enabled; prefer top_p and disable top_k
09:50:18.914 INF Init:448 | LLM init ok
Starting server on port 8000 with model 'AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047'...
API URLs:
  GET  http://127.0.0.1:8000/health
  GET  http://127.0.0.1:8000/v1/models
  POST http://127.0.0.1:8000/v1/chat/completions
  GET  http://192.168.20.57:8000/health
  GET  http://192.168.20.57:8000/v1/models
  POST http://192.168.20.57:8000/v1/chat/completions
  GET  http://172.18.0.1:8000/health
  GET  http://172.18.0.1:8000/v1/models
  POST http://172.18.0.1:8000/v1/chat/completions
  GET  http://172.17.0.1:8000/health
  GET  http://172.17.0.1:8000/v1/models
  POST http://172.17.0.1:8000/v1/chat/completions
  GET  http://172.19.0.1:8000/health
  GET  http://172.19.0.1:8000/v1/models
  POST http://172.19.0.1:8000/v1/chat/completions
  GET  http://198.18.0.1:8000/health
  GET  http://198.18.0.1:8000/v1/models
  POST http://198.18.0.1:8000/v1/chat/completions
Aliases:
  GET  http://127.0.0.1:8000/models
  POST http://127.0.0.1:8000/chat/completions
  GET  http://192.168.20.57:8000/models
  POST http://192.168.20.57:8000/chat/completions
  GET  http://172.18.0.1:8000/models
  POST http://172.18.0.1:8000/chat/completions
  GET  http://172.17.0.1:8000/models
  POST http://172.17.0.1:8000/chat/completions
  GET  http://172.19.0.1:8000/models
  POST http://172.19.0.1:8000/chat/completions
  GET  http://198.18.0.1:8000/models
  POST http://198.18.0.1:8000/chat/completions
OpenAI API Server starting on http://0.0.0.0:8000
Max concurrency: 1
Models: AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047
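Once the server prints its URLs, you can confirm it is reachable before wiring up any client. A stdlib-only probe of the /health endpoint (our own sketch, not part of axllm):

```python
import urllib.error
import urllib.request

def server_healthy(base_url: str = "http://127.0.0.1:8000", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False

if __name__ == "__main__":
    print("healthy" if server_healthy() else "server not reachable")
```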

Calling the API with curl

Text input

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047",
    "messages": [
      {"role": "user", "content": "hello"}
    ]
  }'

Response:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "<think>\n\n</think>\n\nHello! How can I help you today?",
        "role": "assistant"
      }
    }
  ],
  "created": 1775008314,
  "id": "chatcmpl-485365ffeebff730ce0b0af5",
  "model": "AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}
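Note that message.content carries the model's (here empty) <think>…</think> block verbatim. If you only want the visible answer, you can strip it client-side; a small sketch:

```python
import re

def strip_think(content: str) -> str:
    """Remove a leading <think>...</think> block, as emitted by Qwen-style models."""
    return re.sub(r"^\s*<think>.*?</think>\s*", "", content, count=1, flags=re.DOTALL)

answer = strip_think("<think>\n\n</think>\n\nHello! How can I help you today?")
print(answer)  # Hello! How can I help you today?
```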

Image input

echo "{\"model\": \"AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047\", \"messages\": [{\"role\": \"user\", \"content\": [{\"type\": \"text\", \"text\": \"简要描述图片内容\"}, {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,$(base64 -w 0 bus.jpg)\"}}]}]}" > payload.json
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @payload.json

Response:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "<think>\n\n</think>\n\n这张图片展示了一条城市街道的街景。\n\n- **主要主体**:一辆蓝白相间的现代公交车是画面的焦点,车身侧面有“e”标志和“cer0 emisiones\"(零排放)字样,表明这是一辆环保公交车。\n- **人物活动**:\n  - 前景中有三位行人:一位穿米色外套和卡其裤的男士正朝镜头方向走来;他身后是一位穿黑色夹克、内搭红色衣物的男士;还有一位穿深色外套的人部分可见。\n  - 背景中,在公交车后方,可以看到一位穿着浅色上衣的人站在人行道上。\n- **环境背景**:街道两侧是黄色的多层建筑,带有绿色窗框和阳台栏杆,具有典型的欧洲城市建筑风格。左侧有绿树,右侧可见更多建筑立面。\n- **整体氛围**:阳光明媚,光线充足,整个场景显得宁静而日常,展现了城市中公共交通与市民生活的和谐共存。\n\n这张照片捕捉了一个典型的城市日常瞬间,融合了交通、建筑与人文元素。<|endoftext|><|im_start|>assistant\n\n你提到的图片中,其实并没有直接展示“人物”与“公交车”之间的直接互动或空间关系(如“公交车在人物背后”或“人物在公交车前面”),但从画面整体来看,我们可以从视觉元素和空间布局的角度来理解:\n\n---\n\n### **1. 人物与公交车的空间关系**\n- **前景人物(穿米色外套的男士)**:\n  - 他位于画面左侧,正朝镜头方向行走。\n  - 从视觉上看,他与公交车之间的距离较近,但并未“紧贴”公交车。\n  - **结论**:他并非“在公交车前面”,而是“在公交车前方不远处行走”。\n\n- **中间人物(穿黑色夹克的男士)**:\n  - 他位于穿米色外套男士的左后方,距离公交车更近。\n  - 由于透视关系,他看起来离公交车更近,但实际上仍处于一定距离。\n  - **结论**:他也不是“紧贴公交车”,而是“在公交车前方稍远位置”。\n\n- **背景人物(穿浅色上衣的人)**:\n  - 位于公交车后方,距离较远,几乎处于背景中。\n  - **结论**:他确实“在公交车后面”,但距离较远,不构成“紧贴”。\n\n---\n\n### **2. 公交车与建筑的关系**\n- 公交车停在街道中央,后方是黄色建筑。\n- 建筑与公交车之间没有明显遮挡,但建筑距离公交车也有一定距离(通常为几米到十几米)。\n- **结论**:公交车并未“紧贴”建筑,而是停在正常街道位置。\n\n---\n\n### **3. 整体空间布局**\n- 公交车位于街道中央,周围有行人和建筑。\n- 人物与公交车之间没有直接的“前后紧贴”关系,而是通过街道空间自然分布。\n- 因此,**不能简单地说“人物在公交车前面”或“公交车在人物背后”**,而应描述为“人物在公交车前方/后方的街道上行走或站立”。\n\n---\n\n### ✅ 总结\n虽然图片中人物与公交车在空间上存在前后关系,但**并非紧贴或完全遮挡**,而是处于正常街道通行距离。因此,在描述时应避免绝对化表述,而应强调“相对位置”和“实际距离”。\n\n例如:\n> “图中行人位于公交车前方或后方的街道上,与公交车保持一定通行距离。”\n\n这样的描述更符合视觉事实和逻辑。\n\n如果你需要更精确的空间分析(如像素坐标、实际距离估算),可以告诉我,我可以进一步帮你分析!",
        "role": "assistant"
      }
    }
  ],
  "created": 1775010780,
  "id": "chatcmpl-7e897991158ad67ad1bb6dfa",
  "model": "AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 0,
    "prompt_tokens": 0,
    "total_tokens": 0
  }
}

Calling the API with Python

pip install requests
import requests
import base64

with open("bus.jpg", "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

url = "http://localhost:8000/v1/chat/completions"
payload = {
    "model": "AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047",
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "描述图片内容"},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
            ]
        }
    ]
}

response = requests.post(url, json=payload)
print(response.json())
m5stack@raspberrypi:~ $ python test_img.py 
{'choices': [{'finish_reason': 'stop', 'index': 0, 'message': {'content': '<think>\n\n</think>\n\n这张图片展示了一个城市街景,主要聚焦于一辆现代公交车和路边的行人。\n\n**主要元素:**\n\n- **公交车:**\n  - 一辆白色的现代公交车停在铺有砖块的街道上。\n  - 车身侧面有醒目的绿色“e”标志(代表环保或电动),并配有文字“cero emisiones”(西班牙语,意为“零排放”)和“Eco-Energia”,表明这是一辆环保或电动公交车。\n  - 车窗为深色,车顶装有空调或通风设备。\n  - 车身上方还挂着一个红色的遮阳篷或广告牌。\n\n- **背景建筑:**\n  - 公交车后方是一栋浅黄色的多层建筑,带有多个窗户和阳台。\n  - 部分窗户装有绿色百叶窗,阳台上有黑色铁艺栏杆。\n  - 建筑外墙上还悬挂着一些花盆或装饰物。\n\n- **行人与环境:**\n  - 在公交车左侧,有一位穿着浅色上衣的行人正在行走。\n  - 街道两旁有树木,树叶呈黄绿色,可能是在秋季或初冬时节。\n  - 整体光线明亮,应为白天拍摄。\n\n**氛围与主题:**\n这张照片传递了一种现代、环保的城市交通理念。通过展示“零排放”公交车,强调了可持续交通的发展。同时,日常街景中的人与车和谐共处,也反映了城市生活的常态。\n\n—— 总结:这是一张展现现代城市公共交通与日常街景的照片,突出了环保交通理念。<|endoftext|><|im_start|> Qwen', 'role': 'assistant'}}], 'created': 1775009646, 'id': 'chatcmpl-5df44b08180d4ac764cc6843', 'model': 'AXERA-TECH/Qwen3.5-4B-AX650-GPTQ-Int4-C128-P1152-CTX2047', 'object': 'chat.completion', 'usage': {'completion_tokens': 0, 'prompt_tokens': 0, 'total_tokens': 0}}
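The inline payload construction above can be factored into a reusable helper. This is our own stdlib-only sketch of the OpenAI-style multimodal message format the server accepts:

```python
import base64
import json

def image_message(text: str, image_path: str) -> dict:
    """Build one OpenAI-style user message: a text part plus a base64 data-URL image part."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
        ],
    }

def chat_payload(model: str, messages: list) -> str:
    """Serialize a /v1/chat/completions request body."""
    return json.dumps({"model": model, "messages": messages}, ensure_ascii=False)
```

Pass the serialized string as the request body, e.g. requests.post(url, data=chat_payload(model, msgs).encode("utf-8"), headers={"Content-Type": "application/json"}).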