Jina CLIP v2

jina-clip-v2 is a general-purpose multilingual, multimodal embedding model for text and images.
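As background for the steps below: a CLIP-style model provides two encoders that map text and images into one shared vector space, so text-image relevance reduces to the cosine similarity of two vectors. A minimal sketch with numpy, using random placeholder vectors in place of real encoder outputs (the 1024-dimensional size is an assumption about this model's output):

import numpy as np

# Placeholder embeddings; in practice these come from the text and image
# encoders, which both project into the same shared embedding space.
text_emb = np.random.randn(1024).astype(np.float32)
image_emb = np.random.randn(1024).astype(np.float32)

# Cosine similarity: L2-normalize both vectors, then take their dot product.
text_emb /= np.linalg.norm(text_emb)
image_emb /= np.linalg.norm(image_emb)
print("text -> image:", float(text_emb @ image_emb))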

  1. Manually download the model and upload it to the Raspberry Pi 5, or pull the model repository with the following command.
Tip
If git lfs is not installed, refer to the git lfs installation guide first (a typical installation is sketched right after the clone command below).
git clone https://huggingface.co/AXERA-TECH/jina-clip-v2
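On Raspberry Pi OS and other Debian-based systems, git lfs can typically be installed as follows (package name assumed; consult the official git lfs guide for other distributions):

sudo apt-get update
sudo apt-get install git-lfs
git lfs install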

File Description

(ppocr) m5stack@raspberrypi:~/rsp/jina-clip-v2 $ ls -lh
total 1.7G
-rw-rw-r-- 1 m5stack m5stack  51K Oct 20 12:24 beach1.jpg
-rw-rw-r-- 1 m5stack m5stack    0 Oct 20 12:24 config.json
-rw-rw-r-- 1 m5stack m5stack 351M Oct 20 12:25 image_encoder.axmodel
drwxrwxr-x 2 m5stack m5stack 4.0K Oct 20 12:24 jina-clip-v2
-rw-rw-r-- 1 m5stack m5stack 1.3K Oct 20 12:24 README.md
-rw-rw-r-- 1 m5stack m5stack 3.2K Oct 20 12:24 run_axmodel.py
-rw-rw-r-- 1 m5stack m5stack 1.3G Oct 20 12:26 text_encoder.axmodel
  2. Create a virtual environment
python -m venv clip
  3. Activate the virtual environment
source clip/bin/activate
  4. Install dependencies
pip install https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc2/axengine-0.1.3-py3-none-any.whl
pip install torch pillow transformers timm torchvision
  5. Run the inference script, passing the test image, the query text, both encoder models, and the local model repository path
python run_axmodel.py -i beach1.jpg -t "beautiful sunset over the beach" -iax ./image_encoder.axmodel -tax ./text_encoder.axmodel --hf_path ./jina-clip-v2

Execution result:

(clip) m5stack@raspberrypi:~/rsp/jina-clip-v2 $ python3 run_axmodel.py -i beach1.jpg -t "beautiful sunset over the beach" -iax ./image_encoder.axmodel -tax ./text_encoder.axmodel --hf_path ./jina-clip-v2
[INFO] Available providers:  ['AXCLRTExecutionProvider']
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 4.2 df480136
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 4.2 27910799
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
A new version of the following files was downloaded from https://huggingface.co/jinaai/jina-clip-implementation:
- transform.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
text -> image: 0.3140323
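The final line is the similarity between the text query and the image, computed by the two on-device encoders. For orientation, here is a hypothetical sketch of that flow, not the actual contents of run_axmodel.py: pyaxengine exposes an onnxruntime-style InferenceSession API, while the input tensor names and shapes below are assumptions (real inputs come from the Hugging Face processor in ./jina-clip-v2):

import numpy as np
import axengine as axe  # from the pyaxengine wheel installed above

# Load the two compiled encoders; execution is dispatched to the AX650N NPU.
text_sess = axe.InferenceSession("./text_encoder.axmodel")
image_sess = axe.InferenceSession("./image_encoder.axmodel")

# Dummy tensors stand in for the tokenizer/processor outputs; these
# input names and shapes are assumptions, not the model's real ones.
token_ids = np.zeros((1, 77), dtype=np.int64)
pixel_values = np.zeros((1, 3, 512, 512), dtype=np.float32)

text_emb = text_sess.run(None, {"input_ids": token_ids})[0].squeeze()
image_emb = image_sess.run(None, {"pixel_values": pixel_values})[0].squeeze()

# The reported score is the dot product of the L2-normalized embeddings.
text_emb /= np.linalg.norm(text_emb)
image_emb /= np.linalg.norm(image_emb)
print("text -> image:", float(text_emb @ image_emb))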