Jina CLIP v2

jina-clip-v2 is a general-purpose multilingual multimodal embedding model for text & images.

Manually download the model and upload it to raspberrypi5, or pull the model repository using the following command.

Tip

If git lfs is not installed, please refer to git lfs installation guide first.

git clone https://huggingface.co/AXERA-TECH/jina-clip-v2

File Description

(ppocr) m5stack@raspberrypi:~/rsp/jina-clip-v2 $ ls -lh
total 1.7G
-rw-rw-r-- 1 m5stack m5stack  51K Oct 20 12:24 beach1.jpg
-rw-rw-r-- 1 m5stack m5stack    0 Oct 20 12:24 config.json
-rw-rw-r-- 1 m5stack m5stack 351M Oct 20 12:25 image_encoder.axmodel
drwxrwxr-x 2 m5stack m5stack 4.0K Oct 20 12:24 jina-clip-v2
-rw-rw-r-- 1 m5stack m5stack 1.3K Oct 20 12:24 README.md
-rw-rw-r-- 1 m5stack m5stack 3.2K Oct 20 12:24 run_axmodel.py
-rw-rw-r-- 1 m5stack m5stack 1.3G Oct 20 12:26 text_encoder.axmodel

Create virtual environment

python -m venv clip

Activate the virtual environment

source clip/bin/activate

Install dependencies

pip install https://github.com/AXERA-TECH/pyaxengine/releases/download/0.1.3.rc2/axengine-0.1.3-py3-none-any.whl
pip install torch pillow transformers timm torchvision

python run_axmodel.py

Execution result:

(clip) m5stack@raspberrypi:~/rsp/jina-clip-v2 $ python3 run_axmodel.py -i beach1.jpg -t "beautiful sunset over the beach" -iax ./image_encoder.axmodel -tax ./text_encoder.axmodel --hf_path ./jina-clip-v2
[INFO] Available providers:  ['AXCLRTExecutionProvider']
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 4.2 df480136
[INFO] Using provider: AXCLRTExecutionProvider
[INFO] SOC Name: AX650N
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Compiler version: 4.2 27910799
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
A new version of the following files was downloaded from https://huggingface.co/jinaai/jina-clip-implementation:
- transform.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
text -> image: 0.3140323 

Next Overview

Linux PC

CM4Stack

CoreMP135

AI Accelerator Card

LLM-8850 Card

Quick Start

Vision Models

Large Language Models

Multimodal Models

Audio Models

Generative Models

Application List

Advanced Usage

LLM

Real-Time AI Voice Assistant

OpenAI Voice Assistant

XiaoZhi Voice Assistant

XiaoLing Voice Assistant

AtomS3R-M12 Volcengine Kit

Offline Voice Recognition

Unit ASR

Industrial Control

StamPLC

IoT Measuring Instruments

Air Quality

PowerHub

Module13.2 PPS

VAMeter

T-Lite

Input Device

Ezdata

Ethernet Camera

PoECAM

Wi-Fi Camera

TimerCAM

Unit CamS3/-5MP

AI Camera

UnitV2

M5StickV/UnitV

LoRa & LoRaWAN

TTN (The Things Network)

Meshtastic

Motor Control

Unit Roller485/CAN

Develop Tools

Hobby Kit