M5Module-LLM Arduino API

M5Module-LLM Arduino Driver Library API Documentation.

M5ModuleLLM Class

M5ModuleLLM initializes the LLM Module and exposes a member object for each unit of the module, so each unit can be set up quickly and combined into an application as needed.

class M5ModuleLLM {
public:
    bool begin(Stream* targetPort);
    bool checkConnection();
    void update();

    m5_module_llm::ApiSys sys;
    m5_module_llm::ApiLlm llm;
    m5_module_llm::ApiAudio audio;
    m5_module_llm::ApiTts tts;
    m5_module_llm::ApiKws kws;
    m5_module_llm::ApiAsr asr;
    m5_module_llm::ModuleMsg msg;
    m5_module_llm::ModuleComm comm;
private:
};

begin

Function Prototype:

bool begin(Stream* targetPort);

Description:

  • Initializes the UART configuration of the LLM Module.

Parameters:

  • Stream* targetPort:
    • Pointer to the serial port connected to the module, for example &Serial2.

Return Value:

  • bool:
    • true: Initialization successful.
    • false: Initialization failed.

checkConnection

Function Prototype:

bool checkConnection();

Description:

  • Sends the sys.ping command to check the connection status of the LLM Module.

Parameters:

  • null

Return Value:

  • bool:
    • true: Module responds.
    • false: Module does not respond.
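
The example sketch in this guide polls checkConnection() in an endless loop. A bounded variant could look like the sketch below. `waitForConnection` and its parameters are hypothetical names and not part of the library; the probe is passed in as a callable so the helper can wrap `module_llm.checkConnection()` without depending on it.

```cpp
#include <functional>

// Hypothetical helper (not part of M5ModuleLLM): calls `probe` up to
// `maxAttempts` times and returns the 1-based attempt that succeeded,
// or 0 if every attempt failed.
int waitForConnection(const std::function<bool()>& probe, int maxAttempts)
{
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (probe()) {
            return attempt;
        }
    }
    return 0;
}
```

In a sketch this might be called as `waitForConnection([&] { return module_llm.checkConnection(); }, 10)`, with the caller deciding what to show when 0 comes back.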

update

Function Prototype:

void update();

Description:

  • Fetches UART response data from the LLM Module. Call it on every iteration of loop() so that responses are read continuously.

Parameters:

  • null

Return Value:

  • null

ApiSys Class

The ApiSys sys member of M5ModuleLLM is used to control the SYS unit for operations such as system reset.

ping

Function Prototype:

int ping();

Description:

  • Sends the sys.ping command to check the connection status of the LLM Module.

Parameters:

  • null

Return Value:

  • int:
    • MODULE_LLM_OK / Error Code

reset

Function Prototype:

int reset(bool waitResetFinish = true);

Description:

  • Sends the sys.reset command to reset the software services.

Parameters:

  • bool waitResetFinish:
    • true: Blocking wait for reset to finish.
    • false: Non-blocking execution of reset.

Return Value:

  • int:
    • MODULE_LLM_OK / Error Code

reboot

Function Prototype:

int reboot();

Description:

  • Sends the sys.reboot command to reboot the system.

Parameters:

  • null

Return Value:

  • int:
    • MODULE_LLM_OK / Error Code

ApiAudio Class

The ApiAudio audio member of M5ModuleLLM is used to control the initialization and configuration of the AUDIO unit.

setup

Function Prototype:

String setup(ApiAudioSetupConfig_t config = ApiAudioSetupConfig_t(), String request_id = "audio_setup");

Description:

  • Initializes the Audio unit and activates the system sound card. (This feature needs to be enabled before using KWS and TTS)

Parameters:

  • ApiAudioSetupConfig_t config:
    • Audio unit initialization configuration:
  • String request_id:
    • Session id, use the default.
struct ApiAudioSetupConfig_t {
    int capcard      = 0;
    int capdevice    = 0;
    float capVolume  = 0.5;
    int playcard     = 0;
    int playdevice   = 1;
    float playVolume = 0.15;
};
| Parameter | Description | Input Value |
| --- | --- | --- |
| capcard | Microphone sound card index | Default system sound card: 0 |
| capdevice | Microphone device index | Onboard silicon mic: 0 |
| capVolume | Input volume | 0.0~10.0 (volume > 1 will amplify, default value is 0.5) |
| playcard | Speaker sound card index | Default system sound card: 0 |
| playdevice | Speaker device index | Onboard speaker: 1 |
| playVolume | Output volume | 0.0~10.0 (volume > 1 will amplify, default value is 0.15) |

Return Value:

  • String:
    • audio_work_id: audio unit work_id
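
Because both volume fields are documented as 0.0~10.0, a sketch may want to clamp user-supplied values before calling setup. The following is a minimal illustration under that assumption. `clampVolume` and `makeAudioConfig` are hypothetical helpers, and the struct is a local copy of the one shown above so the block compiles on its own (drop the copy when building against the library).

```cpp
#include <algorithm>

// Local copy of the struct shown above, for a self-contained build.
struct ApiAudioSetupConfig_t {
    int capcard      = 0;
    int capdevice    = 0;
    float capVolume  = 0.5;
    int playcard     = 0;
    int playdevice   = 1;
    float playVolume = 0.15;
};

// Hypothetical helper: keeps a requested volume inside the documented 0.0~10.0 range.
float clampVolume(float v)
{
    return std::min(10.0f, std::max(0.0f, v));
}

// Hypothetical helper: starts from the defaults and overrides only the two volumes.
ApiAudioSetupConfig_t makeAudioConfig(float capVolume, float playVolume)
{
    ApiAudioSetupConfig_t config;
    config.capVolume  = clampVolume(capVolume);
    config.playVolume = clampVolume(playVolume);
    return config;
}
```

The returned config would then be passed to `module_llm.audio.setup(config)`; the card and device indices keep their defaults.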

ApiKws Class

The ApiKws kws member of M5ModuleLLM is used to control the initialization and configuration of the KWS unit.

setup

Function Prototype:

String setup(ApiKwsSetupConfig_t config = ApiKwsSetupConfig_t(), String request_id = "kws_setup");

Description:

  • Initializes the KWS unit and configures the wake keyword.

Parameters:

  • ApiKwsSetupConfig_t config:
    • KWS unit initialization configuration:
  • String request_id:
    • Session id, use the default.
struct ApiKwsSetupConfig_t {
    String kws             = "HELLO";
    String model           = "sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01";
    String response_format = "kws.bool";
    String input           = "sys.pcm";
    bool enoutput          = true;
};
| Parameter | Description | Input Value |
| --- | --- | --- |
| model | Conversion model | English model: "sherpa-onnx-kws-zipformer-gigaspeech-3.3M-2024-01-01"; Chinese model: "sherpa-onnx-kws-zipformer-wenetspeech-3.3M-2024-01-01" |
| kws | KWS wake word text setting | No mixing of Chinese and English allowed; English must be in uppercase |
| enoutput | Enable UART output | Enable: true; Disable: false |

Return Value:

  • String:
    • kws_work_id: kws unit work_id
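
The uppercase requirement for English wake words is easy to get wrong, so a sketch might validate the text before passing it to setup. The check below is a hypothetical helper, not a library function. It assumes that an English wake word consists only of uppercase ASCII letters and spaces; the module may apply further rules that are not covered here. `std::string` stands in for Arduino's `String` so the block builds off-device.

```cpp
#include <string>

// Hypothetical check: true if `word` contains at least one letter and nothing
// other than uppercase ASCII letters and spaces (allowing spaces is an assumption).
bool isValidEnglishWakeWord(const std::string& word)
{
    bool hasLetter = false;
    for (char c : word) {
        if (c >= 'A' && c <= 'Z') {
            hasLetter = true;
        } else if (c != ' ') {
            return false;  // lowercase, digits, punctuation, or non-ASCII bytes
        }
    }
    return hasLetter;
}
```

A wake word that fails the check can be rejected or converted to uppercase before it is assigned to the `kws` field.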

ApiAsr Class

The ApiAsr asr member of M5ModuleLLM is used to control the initialization and configuration of the ASR unit.

setup

Function Prototype:

String setup(ApiAsrSetupConfig_t config = ApiAsrSetupConfig_t(), String request_id = "asr_setup");

Description:

  • Initializes the ASR unit to activate the speech-to-text functionality.

Parameters:

  • ApiAsrSetupConfig_t config:
    • ASR unit initialization configuration:
  • String request_id:
    • Session id, use the default.
struct ApiAsrSetupConfig_t {
    String model           = "sherpa-ncnn-streaming-zipformer-20M-2023-02-17";
    String response_format = "asr.utf-8.stream";
    String input           = "sys.pcm";
    bool enoutput          = true;
    bool enkws             = true;
    float rule1            = 2.4;
    float rule2            = 1.2;
    float rule3            = 30.0;
};
| Parameter | Description | Input Value |
| --- | --- | --- |
| model | Conversion model | English model: "sherpa-ncnn-streaming-zipformer-20M-2023-02-17"; Chinese model: "sherpa-ncnn-streaming-zipformer-zh-14M-2023-02-23" |
| response_format | Output format | Standard output: "asr.utf-8"; Streaming output: "asr.utf-8.stream" |
| input | Input | LLM input: "llm.xxx" (work_id of the llm unit); UART input: "tts.utf-8"; UART streaming input: "tts.utf-8.stream" |
| enkws | Support wake by KWS | Allow KWS to trigger ASR: true; Continuous ASR without KWS wake: false |
| rule1 | Timeout between wake and no content recognition | Unit: seconds |
| rule2 | Maximum interval time for recognition | Unit: seconds |
| rule3 | Maximum timeout duration for recognition | Unit: seconds |
| enoutput | Enable UART output | Enable: true; Disable: false |

Return Value:

  • String:
    • asr_work_id: asr unit work_id

ApiLlm Class

The ApiLlm llm member of M5ModuleLLM is used to control the initialization and configuration of the LLM unit.

setup

Function Prototype:

String setup(ApiLlmSetupConfig_t config = ApiLlmSetupConfig_t(), String request_id = "llm_setup");

Description:

  • Initializes the LLM unit and supports configuring the input and output data methods for the LLM unit.

Parameters:

  • ApiLlmSetupConfig_t config:
    • LLM unit initialization configuration:
  • String request_id:
    • Session id, use the default.
struct ApiLlmSetupConfig_t {
    String prompt;
    String model           = "qwen2.5-0.5b";
    String response_format = "llm.utf-8.stream";
    String input           = "llm.utf-8.stream";
    bool enoutput          = true;
    bool enkws             = true;
    int max_token_len      = 127;
};
| Parameter | Description | Input Value |
| --- | --- | --- |
| model | Conversion model | Predefined model: "qwen2.5-0.5b" |
| response_format | Output format | Standard output: "llm.utf-8"; Streaming output: "llm.utf-8.stream" |
| input | Input | ASR input: "asr.xxx" (work_id of the asr unit); UART input: "llm.utf-8"; UART streaming input: "llm.utf-8.stream" |
| enkws | Whether a KWS wake terminates the process | KWS interrupts the process: true; KWS does not interrupt the process: false |
| max_token_len | Maximum output token length | Max value: 1024, recommended: 127 |
| prompt | Model initialization prompt | String |
| enoutput | Enable UART output | Enable: true; Disable: false |

Return Value:

  • String:
    • llm_work_id: llm unit work_id

inference

Function Prototype:

int inference(String work_id, String input, String request_id = "llm_inference");

Description:

  • Inputs data and begins inference. The result content will be stored in the responseMsgList container within M5ModuleLLM.msg.

Parameters:

  • String work_id:
    • work_id of the LLM unit being called.
  • String input:
    • Input text.
  • String request_id:
    • Session ID, used to differentiate if multiple sessions are present.

Return Value:

  • int:
    • MODULE_LLM_OK / Error Code

inferenceAndWaitResult

Function Prototype:

int inferenceAndWaitResult(String work_id, String input, std::function<void(String&)> onResult, uint32_t timeout = 5000, String request_id = "llm_inference");

Description:

  • Inputs data and begins inference, blocks while waiting for the result, then calls the callback function with it.

Parameters:

  • String work_id:
    • work_id of the LLM unit being called.
  • String input:
    • Input text.
  • void onResult(String&):
    • Callback function for inference results.
  • uint32_t timeout:
    • Timeout duration for waiting for inference.
  • String request_id:
    • Session ID, used to differentiate if multiple sessions are present.

Return Value:

  • int:
    • MODULE_LLM_OK / Error Code
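
Since onResult is a std::function<void(String&)>, a capturing lambda can collect the result into a variable owned by the caller. The sketch below illustrates that pattern off-device. `fakeInference` is a hypothetical stand-in for the module, `std::string` replaces Arduino's `String`, and it is an assumption that the callback may be invoked more than once when a response arrives in several pieces.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the module: hands each chunk to the callback in order.
void fakeInference(const std::vector<std::string>& chunks,
                   const std::function<void(std::string&)>& onResult)
{
    for (std::string chunk : chunks) {  // copy, because the callback takes a non-const reference
        onResult(chunk);
    }
}

// Collects everything the callback receives into one string.
std::string collectResult(const std::vector<std::string>& chunks)
{
    std::string full;
    fakeInference(chunks, [&full](std::string& piece) { full += piece; });
    return full;
}
```

With the real API the same lambda shape would be passed as the third argument of inferenceAndWaitResult, and the captured string must outlive the call.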

ApiTts Class

The ApiTts tts member of M5ModuleLLM is used to control the initialization and configuration of the TTS unit.

setup

Function Prototype:

String setup(ApiTtsSetupConfig_t config = ApiTtsSetupConfig_t(), String request_id = "tts_setup");

Description:

  • Initializes the TTS unit to enable text-to-speech functionality.

Parameters:

  • ApiTtsSetupConfig_t config:
    • TTS unit initialization configuration:
  • String request_id:
    • Session id, use the default.
struct ApiTtsSetupConfig_t {
    String model           = "single_speaker_english_fast";
    String response_format = "tts.base64.wav";
    String input           = "tts.utf-8.stream";
    bool enoutput          = true;
    bool enkws             = true;
};
| Parameter | Description | Input Value |
| --- | --- | --- |
| model | Conversion model | English model: "single_speaker_english_fast"; Chinese model: "single_speaker_fast" |
| input | Input | LLM input: "llm.xxx" (work_id of the llm unit); UART input: "tts.utf-8"; UART streaming input: "tts.utf-8.stream" |
| enkws | Whether a KWS wake terminates the process | KWS interrupts the process: true; KWS does not interrupt the process: false |
| enoutput | Enable UART output | Enable: true; Disable: false |

Return Value:

  • String:
    • tts_work_id: tts unit work_id
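
The input field of the LLM and TTS configs accepts the work_id of an upstream unit, which is how units are chained without routing text through the host. A minimal sketch of that wiring follows. The structs are trimmed local stand-ins for the config structs above, `chainUnits` is a hypothetical helper, and the work_id strings used with it are placeholders rather than values the module is known to return.

```cpp
#include <string>

// Trimmed stand-ins for ApiLlmSetupConfig_t and ApiTtsSetupConfig_t
// (std::string replaces Arduino's String so the block builds off-device).
struct LlmConfig {
    std::string input = "llm.utf-8.stream";
    bool enoutput     = true;
};
struct TtsConfig {
    std::string input = "tts.utf-8.stream";
    bool enoutput     = true;
};
struct Pipeline {
    LlmConfig llm;
    TtsConfig tts;
};

// Hypothetical helper: points each unit at the work_id of the unit before it.
Pipeline chainUnits(const std::string& asr_work_id, const std::string& llm_work_id)
{
    Pipeline p;
    p.llm.input = asr_work_id;  // LLM reads the ASR output
    p.tts.input = llm_work_id;  // TTS reads the LLM output
    return p;
}
```

In a real sketch each work_id is only known after the preceding setup() call returns, so the configs would be filled in one at a time rather than in a single helper.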

inference

Function Prototype:

int inference(String work_id, String input, uint32_t timeout = 0, String request_id = "tts_inference");

Description:

  • Inputs data and begins inference. When inference completes, the result is played automatically on the speaker.

Parameters:

  • String work_id:
    • work_id of the TTS unit being called.
  • String input:
    • Input text.
  • uint32_t timeout:
    • Timeout duration for waiting for inference.
  • String request_id:
    • Session ID, used to differentiate if multiple sessions are present.

Return Value:

  • int:
    • MODULE_LLM_OK / Error Code

ModuleMsg Class

The ModuleMsg msg member of M5ModuleLLM provides the responseMsgList container, which is used to cache various information returned from the LLM Module. Refer to the example below to iterate and retrieve response results in the main loop.

void loop()
{
    module_llm.update();

    // Handle response msg
    for (auto& msg : module_llm.msg.responseMsgList) {
        // KWS msg
        if (msg.work_id == kws_work_id) {
            Serial.printf(">> Keyword detected\n");
        }

        // ASR msg
        if (msg.work_id == asr_work_id) {
            if (msg.object == "asr.utf-8.stream") {
                // Parse and get ASR result
                JsonDocument doc;
                deserializeJson(doc, msg.raw_msg);
                String asr_result = doc["data"]["delta"].as<String>();
                Serial.printf(">> %s\n", asr_result.c_str());
            }
        }
    }
    module_llm.msg.responseMsgList.clear();
}
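
The loop above follows a drain-then-clear pattern: every cached response is handled once, then the list is emptied so nothing is processed twice. The same pattern can be sketched off-device as below. `ResponseMsg` is a hypothetical stand-in that carries only the three fields the loop reads, and `drainMessages` is an illustrative helper, not a library function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one cached response.
struct ResponseMsg {
    std::string work_id;
    std::string object;
    std::string raw_msg;
};

// Passes every cached message to `handle`, clears the list, and returns
// how many messages were handled.
template <typename Handler>
std::size_t drainMessages(std::vector<ResponseMsg>& list, Handler handle)
{
    const std::size_t handled = list.size();
    for (const ResponseMsg& msg : list) {
        handle(msg);
    }
    list.clear();
    return handled;
}
```

The handler is where the per-unit checks go, for example comparing `msg.work_id` against the stored KWS or ASR work_id as in the loop above.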

VoiceAssistant Class

M5ModuleLLM_VoiceAssistant quickly creates an LLM voice assistant instance that implements the full pipeline: KWS (wake-up keyword) -> ASR (speech-to-text) -> LLM (large model inference) -> TTS (text-to-speech).

  • During initialization, simply pass the M5ModuleLLM instance to the constructor and register callback functions for the respective events to complete the creation of the voice assistant.
/*
 * SPDX-FileCopyrightText: 2024 M5Stack Technology CO LTD
 *
 * SPDX-License-Identifier: MIT
 */
#include <Arduino.h>
#include <M5Unified.h>
#include <M5ModuleLLM.h>

M5ModuleLLM module_llm;
M5ModuleLLM_VoiceAssistant voice_assistant(&module_llm);

/* On ASR data callback */
void on_asr_data_input(String data, bool isFinish, int index)
{
    M5.Display.setTextColor(TFT_GREEN, TFT_BLACK);
    M5.Display.printf(">> %s\n", data.c_str());

    /* If ASR data is finished */
    if (isFinish) {
        M5.Display.setTextColor(TFT_YELLOW, TFT_BLACK);
        M5.Display.print(">> ");
    }
}

/* On LLM data callback */
void on_llm_data_input(String data, bool isFinish, int index)
{
    M5.Display.print(data);

    /* If LLM data is finished */
    if (isFinish) {
        M5.Display.print("\n");
    }
}

void setup()
{
    M5.begin();
    M5.Display.setTextSize(2);
    M5.Display.setTextScroll(true);

    /* Initialize module serial port */
    Serial2.begin(115200, SERIAL_8N1, 16, 17);  // Basic
    // Serial2.begin(115200, SERIAL_8N1, 13, 14);  // Core2
    // Serial2.begin(115200, SERIAL_8N1, 18, 17);  // CoreS3

    /* Initialize module */
    module_llm.begin(&Serial2);

    /* Ensure module is connected */
    M5.Display.printf(">> Check ModuleLLM connection..\n");
    while (1) {
        if (module_llm.checkConnection()) {
            break;
        }
    }

    /* Begin voice assistant preset */
    M5.Display.printf(">> Begin voice assistant..\n");
    int ret = voice_assistant.begin("HELLO");
    if (ret != MODULE_LLM_OK) {
        while (1) {
            M5.Display.setTextColor(TFT_RED);
            M5.Display.printf(">> Begin voice assistant failed\n");
            delay(1000);  // Pause between messages so the display is not flooded
        }
    }

    /* Register on ASR data callback function */
    voice_assistant.onAsrDataInput(on_asr_data_input);

    /* Register on LLM data callback function */
    voice_assistant.onLlmDataInput(on_llm_data_input);

    M5.Display.printf(">> Voice assistant ready\n");
}

void loop()
{
    /* Keep voice assistant preset updated */
    voice_assistant.update();
}

Error Code

enum ModuleLLMErrorCode_t {
    MODULE_LLM_OK                              = 0,
    MODULE_LLM_RESET_WARN                      = -1,
    MODULE_LLM_JSON_FORMAT_ERROR               = -2,
    MODULE_LLM_ACTION_MATCH_FAILED             = -3,
    MODULE_LLM_INFERENCE_DATA_PUSH_FAILED      = -4,
    MODULE_LLM_MODEL_LOADING_FAILED            = -5,
    MODULE_LLM_UNIT_NOT_EXIST                  = -6,
    MODULE_LLM_UNKNOWN_OPERATION               = -7,
    MODULE_LLM_UNIT_RESOURCE_ALLOCATION_FAILED = -8,
    MODULE_LLM_UNIT_CALL_FAILED                = -9,
    MODULE_LLM_MODEL_INIT_FAILED               = -10,
    MODULE_LLM_MODEL_RUN_FAILED                = -11,
    MODULE_LLM_MODULE_NOT_INITIALISED          = -12,
    MODULE_LLM_MODULE_ALREADY_WORKING          = -13,
    MODULE_LLM_MODULE_NOT_WORKING              = -14,
    MODULE_LLM_NO_UPDATEABLE_MODULES           = -15,
    MODULE_LLM_NO_MODULES_AVAILABLE_FOR_UPDATE = -16,
    MODULE_LLM_FILE_OPEN_FAILED                = -17,
    MODULE_LLM_WAIT_RESPONSE_TIMEOUT           = -97,
    MODULE_LLM_RESPONSE_PARSE_FAILED           = -98,
    MODULE_LLM_ERROR_NONE                      = -99,
};