Wukong AI Capabilities

Last Updated on: 2026-01-09 02:30:38

This topic describes how to integrate Wukong AI capabilities into a robot vacuum, focusing on utilizing Tuya’s interfaces to enable intelligent interaction features and audio/video data processing.

Functional description

After integrating Wukong AI capabilities, the robot vacuum can achieve intelligent interaction with users, analyze and process audio and video data, and thus provide an enhanced user experience. The system offers comprehensive AI interaction lifecycle management, including initialization, pre-record configuration, starting interactions, and stopping interactions.

Interfaces and usage

Initialize AI capabilities

ty_ai_chat_service_init

Functional description

Initialize the AI functionalities. This interface must be called before using any AI capabilities.

Interface definition

/**
 * @brief         callback of ai cmd
 * @param[in]     cmd. specifically refer to TY_AI_CHAT_CMD_TYPE_E
 * @param[in]     args. additional data for cmd
 * @return        VOID
 */
typedef VOID (*TY_AI_CMD_CB)(TY_AI_CHAT_CMD_TYPE_E cmd, CONST PVOID_T args);

/**
 * @brief         callback of audio play
 * @param[in]     audio_data. if the AI voice ends, audio_data will be equal to NULL
 * @param[in]     len. len of audio data. if the AI voice ends, len will be equal to 0
 * @return        VOID
 */
typedef VOID (*TY_AI_AUDIO_PLAY_CB)(CONST CHAR_T *audio_data, INT_T len);

/**
 * @brief AI chat service initialization
 * @param[in] cmd_cb AI command callback function
 * @param[in] audio_play_cb Audio playback callback function
 * @param[in] tts_format TTS audio format, such as "mp3", "pcm", etc. Pass NULL to not set
 * @param[in] pre_record_time Pre-recording time in ms, developers can adjust as needed
 * @return OPRT_OK on success. Others on error, please refer to tuya_error_code.h
 */
OPERATE_RET ty_ai_chat_service_init(TY_AI_CMD_CB cmd_cb,
                                    TY_AI_AUDIO_PLAY_CB audio_play_cb,
                                    CHAR_T *tts_format,
                                    UINT_T pre_record_time);

Parameters

Parameter Description
cmd_cb The AI command callback function that handles AI voice commands. Implement the handling of specific operation commands or states in this callback.
audio_play_cb The AI audio playback callback function. Handles the playback of TTS audio.
tts_format The TTS audio format, such as MP3 and PCM. Pass NULL for no explicit setting.
pre_record_time The AI chat pre-record duration, in milliseconds (ms). When ty_ai_station_start_act is called, the Wukong AI retrieves audio/video data from the ring buffer starting from pre_record_time milliseconds before the current moment and sends it to the cloud for processing.

Return value

Return value Description
OPRT_OK Initialization succeeded.
Others An error occurred. For the specific error codes, refer to tuya_error_code.h.
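
To make the effect of pre_record_time concrete, the following minimal sketch computes how many bytes of audio a given pre-record duration covers and where reading would start in a ring buffer. The audio format (16 kHz, 16-bit, mono PCM) and the helper names pre_record_bytes and pre_record_start are assumptions made for illustration. They are not part of the SDK, and the SDK's internal ring buffer may be organized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed audio format, for illustration only: 16 kHz, 16-bit, mono PCM. */
#define SAMPLE_RATE_HZ   16000u
#define BYTES_PER_SAMPLE 2u

/* Number of PCM bytes that cover `ms` milliseconds of audio. */
static size_t pre_record_bytes(uint32_t ms)
{
    return (size_t)SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * ms / 1000u;
}

/*
 * For a ring buffer of `capacity` bytes that has already wrapped at least
 * once and whose next write position is `write_pos` (write_pos < capacity),
 * return the offset where reading starts so that the last `ms` milliseconds
 * are included. The window is clamped to the buffer capacity.
 */
static size_t pre_record_start(size_t capacity, size_t write_pos, uint32_t ms)
{
    size_t want = pre_record_bytes(ms);
    if (want > capacity) {
        want = capacity;
    }
    return (write_pos + capacity - want) % capacity;
}
```

Under these assumptions, a pre_record_time of 2000 ms corresponds to 64,000 bytes, so the ring buffer must hold at least that much audio for the full pre-record window to be available.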

Start AI interaction

ty_ai_station_start_act

Functional description

Start the current AI interaction session. After calling this interface, the system will begin retrieving audio and video data from the ring buffer and transmitting it to the cloud.

Interface definition

/**
 * @brief Start AI station activity
 *
 * @return OPRT_OK on success. Others on error, please refer to tuya_error_code.h
 */
OPERATE_RET ty_ai_station_start_act(void);

Return value

Return value Description
OPRT_OK The AI interaction started successfully.
Others An error occurred. For the specific error codes, refer to tuya_error_code.h.

Stop AI interaction

ty_ai_station_stop_act

Functional description

Stop the current AI interaction session. After the call, the device stops sending data to the cloud and starts receiving the data returned by the cloud. The returned data is delivered through the cmd_cb (TY_AI_CMD_CB) and audio_play_cb (TY_AI_AUDIO_PLAY_CB) callbacks registered in ty_ai_chat_service_init.

Interface definition

/**
 * @brief Stop AI station activity
 *
 * @return OPRT_OK on success. Others on error, please refer to tuya_error_code.h
 */
OPERATE_RET ty_ai_station_stop_act(void);

Return value

Return value Description
OPRT_OK The AI interaction stopped successfully.
Others An error occurred. For the specific error codes, refer to tuya_error_code.h.
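
As documented for TY_AI_AUDIO_PLAY_CB, the end of the AI voice is signaled by a call with audio_data equal to NULL and len equal to 0. The following minimal sketch shows one way a callback might track a reply and detect its end. The type definitions at the top are stand-ins so that the sketch compiles on its own, and demo_audio_play_cb, g_total_bytes, and g_reply_done are hypothetical names. In a real build, use the SDK headers instead of the stand-ins.

```c
#include <stddef.h>

/* Stand-ins for the SDK typedefs so that this sketch compiles on its own. */
typedef char CHAR_T;
typedef int  INT_T;
#define VOID  void
#define CONST const

static size_t g_total_bytes; /* Bytes received in the current reply. */
static int    g_reply_done;  /* Set to 1 when the end marker arrives. */

/* Hypothetical audio playback callback matching TY_AI_AUDIO_PLAY_CB. */
static VOID demo_audio_play_cb(CONST CHAR_T *audio_data, INT_T len)
{
    if (audio_data == NULL || len <= 0) {
        /* End of the AI voice: flush and close the player or file here. */
        g_reply_done = 1;
        return;
    }
    if (g_reply_done) {
        /* First chunk of a new reply: reset the counters. */
        g_total_bytes = 0;
        g_reply_done = 0;
    }
    /* Hand audio_data to the decoder or player here. */
    g_total_bytes += (size_t)len;
}
```

The sketch only counts bytes. A product implementation would pass each chunk to a decoder that matches the tts_format set during initialization.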

Procedure

The following example demonstrates the typical workflow for using the AI capabilities:

#include "tuya_error_code.h"
/**
 * @brief The AI command callback function.
 * @param[in] cmd AI command types, including speech recognition, emotion analytics, and motion control.
 * @param[in] args Pointer to command arguments.
 */
static VOID tuya_rvc_ai_cmd_cb(TY_AI_CHAT_CMD_TYPE_E cmd, CONST PVOID_T args)
{
    PR_DEBUG("tuya_rvc_ai_cmd_cb cmd:%d", cmd);
    // Handle the cmd as per your implementation.
    return;
}

/**
 * @brief The audio playback callback function.
 * @param[in] buf The audio data buffer.
 * @param[in] len The length of audio data.
 * @note Handles audio data storage by writing it to a file in the configured TTS format.
 *       When buf is NULL and len is 0, the AI voice has ended. Close the file.
 */
static VOID tuya_rvc_audio_play_cb(CONST CHAR_T *buf, INT_T len)
{
    // Implement file saving logic here.
    return;
}

// Main power-on initialization flow.
int main(int argc, char* argv[])
{
    OPERATE_RET ret = OPRT_OK;

    ret = ty_rvc_iot_init(); // Initialize robot-related events.
    if (ret != OPRT_OK) {
        PR_ERR("ty_rvc_iot_init err");
        return ret;
    }
    // Initialize the SDK
    // Initialize AI chat service, set TTS format to MP3 and pre-record duration to 2000 ms
    ret = ty_ai_chat_service_init(tuya_rvc_ai_cmd_cb, tuya_rvc_audio_play_cb, "mp3", 2000);
    if (ret != OPRT_OK) {
        PR_ERR("ty_ai_chat_service_init err");
        return ret;
    }

    // Start an AI interaction session
    ret = ty_ai_station_start_act();
    if (ret != OPRT_OK) {
        PR_ERR("ty_ai_station_start_act err");
        return ret;
    }

    PR_DEBUG("AI chat service started successfully");

    // Other service initialization

    return 0;
}

FAQs

Why is the pre-record time necessary?

The pre-record time ensures that the beginning of the user’s voice command is captured in full, thereby improving recognition accuracy.

Does the AI interaction session support concurrency?

It is recommended to maintain only one active AI interaction session at a time to ensure system stability.
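
One way to follow this recommendation is to wrap the start and stop calls in a small guard that rejects a second start while a session is active. The following is a minimal sketch, not the SDK's own mechanism. The definitions of OPERATE_RET and OPRT_OK and the two stub functions are stand-ins so that the sketch compiles on its own. DEMO_ERR_BUSY, demo_session_begin, and demo_session_end are hypothetical names. The guard is not thread-safe, so add locking if the calls can come from multiple threads.

```c
#include <stdbool.h>

/* Stand-ins for the SDK so that this sketch compiles on its own.
 * In a real build, include the SDK headers and remove these. */
typedef int OPERATE_RET;
#define OPRT_OK       0
#define DEMO_ERR_BUSY (-1) /* Hypothetical error code for this sketch. */

static int g_start_calls;
static int g_stop_calls;
static OPERATE_RET ty_ai_station_start_act(void) { g_start_calls++; return OPRT_OK; }
static OPERATE_RET ty_ai_station_stop_act(void)  { g_stop_calls++;  return OPRT_OK; }

static bool g_session_active; /* True while one interaction is in progress. */

/* Start an interaction only if none is active. */
static OPERATE_RET demo_session_begin(void)
{
    if (g_session_active) {
        return DEMO_ERR_BUSY;
    }
    OPERATE_RET ret = ty_ai_station_start_act();
    if (ret == OPRT_OK) {
        g_session_active = true;
    }
    return ret;
}

/* Stop the active interaction. Does nothing if no session is active. */
static OPERATE_RET demo_session_end(void)
{
    if (!g_session_active) {
        return OPRT_OK;
    }
    OPERATE_RET ret = ty_ai_station_stop_act();
    if (ret == OPRT_OK) {
        g_session_active = false;
    }
    return ret;
}
```

With this guard, a repeated start request while a session is active returns DEMO_ERR_BUSY without calling ty_ai_station_start_act again.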

Why am I not receiving audio data from the cloud?

Currently, the cloud supports only PCM and Opus audio. This issue can occur if the audio data written to the ring buffer is not encoded in one of these formats.

By using these interfaces appropriately, you can implement comprehensive AI interaction capabilities in your robot vacuum, making your product smarter and enhancing the user experience.