Interaction Mode

Last Updated on: 2026-03-23 08:55:00

Overview

The Mode module manages the different ways an AI conversation can be triggered (press and hold, single button press, keyword spotting, free talk, P2P, and translation) and provides a flexible framework for implementing interaction modes. Every mode is built on the same state machine, and you add a new mode by implementing a set of optional callbacks.

Directory

mode/
├── wukong_ai_mode.h/c             # Mode manager, providing external interfaces
├── wukong_ai_mode_free.c          # Free talk
├── wukong_ai_mode_hold.c          # Press and hold to start talk
├── wukong_ai_mode_oneshot.c       # Single button press turn-based talk
├── wukong_ai_mode_wakeup.c        # Keyword spotting turn-based talk
├── wukong_ai_mode_p2p.c           # P2P communication (requires trigger by app panel)
└── wukong_ai_mode_translate.c     # Translation

Terms and abbreviations

| Abbreviation | Full name |
| --- | --- |
| AEC | Acoustic Echo Cancellation |
| VAD | Voice Activity Detection |
| KWS | Keyword Spotting |
| ASR | Automatic Speech Recognition |
| TTS | Text-to-Speech |
| P2P | Peer-to-Peer |

Supported modes

All modes support AEC by default. Their differences mainly lie in the trigger method, VAD mode, whether KWS/Server VAD is enabled, and typical scenarios.

| Mode | Trigger method | Application scenario | Features |
| --- | --- | --- | --- |
| Press-and-hold (wukong_ai_mode_hold) | Press and hold the specified button | Push-to-talk conversation | Manual VAD mode (recording is controlled by the specified button); KWS disabled; Server VAD disabled |
| Single button press turn-based talk (wukong_ai_mode_oneshot) | Press the specified button once | Turn-based conversation | Manual VAD mode; KWS disabled; Server VAD enabled |
| Keyword spotting turn-based talk (wukong_ai_mode_wakeup) | Keyword spotting | Hands-free voice interaction | Automatic VAD mode; KWS enabled; Server VAD enabled |
| Free talk mode (wukong_ai_mode_free) | Keyword spotting and continuous conversation | Natural conversational flow | Automatic VAD mode; KWS enabled; Server VAD enabled; supports continuous conversation |
| P2P mode (wukong_ai_mode_p2p) | P2P communication (requires trigger by app panel) | Direct point-to-point audio communication | Manual VAD mode; KWS disabled; direct P2P transmission of audio streams to the app |
| Translation mode (wukong_ai_mode_translate) | Keyword spotting | Real-time language translation | Automatic VAD mode; KWS enabled; ASR-LLM-TTS workflow; supports retrieving the language list and switching among languages |
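
The feature matrix above can also be written down as a lookup table, which makes the differences between modes easy to query in a test or a debug tool. The following is only a sketch: every type and function name (FT_MODE_FEATURE_T, ft_find, and so on) is hypothetical and not part of the SDK, and the Server VAD setting is marked as unspecified for the P2P and translation modes because the table does not state it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; none of these names come from the SDK. */
typedef enum { FT_VAD_MANUAL, FT_VAD_AUTO } FT_VAD_E;
typedef enum { FT_NO, FT_YES, FT_UNSPECIFIED } FT_TRI_E;

typedef struct {
    const char *name;       /* short mode name, taken from the source file suffix */
    FT_VAD_E    vad;        /* manual or automatic VAD */
    FT_TRI_E    kws;        /* keyword spotting enabled? */
    FT_TRI_E    server_vad; /* server-side VAD enabled? */
} FT_MODE_FEATURE_T;

/* Values copied from the "Supported modes" table. */
static const FT_MODE_FEATURE_T s_ft_features[] = {
    { "hold",      FT_VAD_MANUAL, FT_NO,  FT_NO          },
    { "oneshot",   FT_VAD_MANUAL, FT_NO,  FT_YES         },
    { "wakeup",    FT_VAD_AUTO,   FT_YES, FT_YES         },
    { "free",      FT_VAD_AUTO,   FT_YES, FT_YES         },
    { "p2p",       FT_VAD_MANUAL, FT_NO,  FT_UNSPECIFIED },
    { "translate", FT_VAD_AUTO,   FT_YES, FT_UNSPECIFIED },
};

/* Return the feature row for a mode name, or NULL if the name is unknown. */
const FT_MODE_FEATURE_T *ft_find(const char *name)
{
    size_t count = sizeof(s_ft_features) / sizeof(s_ft_features[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(s_ft_features[i].name, name) == 0) {
            return &s_ft_features[i];
        }
    }
    return NULL;
}
```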

How to configure the modes

Which modes are enabled, the board type, and the feature switches are all set in the application configuration.

  • Entry: make app_menuconfig APP_NAME=tuyaos_demo_wukong_ai
  • Configuration generation: The configuration is written to the build directory (such as APPConfig) under the application root. After modification, you must run make app_config to generate the header files.

State machine

All modes share the same state machine. The state transition flow is as follows:

INIT → IDLE → LISTEN → UPLOAD → THINK → SPEAK
        ↑       ↑                         ↓
        └───────└─────────────────────────┘

| State | Description |
| --- | --- |
| INIT | Initialization |
| IDLE | Idle, waiting for a trigger |
| LISTEN | Listening for user input |
| UPLOAD | Uploading audio to the cloud |
| THINK | Waiting while the cloud AI processes the request |
| SPEAK | Playing the TTS reply |
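
The transition flow above can be encoded as a small lookup table, which is handy for catching unexpected transitions during development. This is a minimal sketch, not SDK code: the SM_* names are hypothetical, and the allowed edges are read directly off the diagram. Real modes may permit additional transitions (for example, returning to IDLE on a timeout) that the diagram does not show.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the SDK's state enum; names are illustrative only. */
typedef enum {
    SM_INIT, SM_IDLE, SM_LISTEN, SM_UPLOAD, SM_THINK, SM_SPEAK,
    SM_STATE_MAX
} SM_STATE_E;

/* s_sm_allowed[from][to] is true when the diagram has an arrow from -> to. */
static const bool s_sm_allowed[SM_STATE_MAX][SM_STATE_MAX] = {
    [SM_INIT]   = { [SM_IDLE]   = true },
    [SM_IDLE]   = { [SM_LISTEN] = true },
    [SM_LISTEN] = { [SM_UPLOAD] = true },
    [SM_UPLOAD] = { [SM_THINK]  = true },
    [SM_THINK]  = { [SM_SPEAK]  = true },
    [SM_SPEAK]  = { [SM_IDLE]   = true, [SM_LISTEN] = true },
};

/* Return true if the diagram allows moving from 'from' to 'to'. */
bool sm_transition_allowed(SM_STATE_E from, SM_STATE_E to)
{
    int f = (int)from;
    int t = (int)to;

    /* Reject out-of-range values before indexing the table. */
    if (f < 0 || f >= SM_STATE_MAX || t < 0 || t >= SM_STATE_MAX) {
        return false;
    }
    return s_sm_allowed[f][t];
}
```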

API reference

Mode management

/**
 * @brief  Initialize the mode system.
 *
 * Register all enabled modes and initialize the default mode.
 *
 * @return Return OPRT_OK on success. Otherwise, return an error code.
 */
OPERATE_RET wukong_ai_mode_init(VOID);

/**
 * @brief Switch to the next enabled mode.
 *
 * @param[in] cur_mode The current mode.
 * @return Return OPRT_OK on success. Otherwise, return an error code.
 */
OPERATE_RET wukong_ai_mode_switch(AI_TRIGGER_MODE_E cur_mode);

/**
 * @brief Switch to a specified mode.
 *
 * @param[in] mode The target mode.
 * @return Return OPRT_OK on success. Otherwise, return an error code.
 */
OPERATE_RET wukong_ai_mode_switch_to(AI_TRIGGER_MODE_E mode);

Message dispatching (Recommended entry)

Provides a unified interface to handle various states and messages. Only a single entry, wukong_ai_mode_dispatch, is exposed externally. It distributes requests to the corresponding callbacks of the current mode via operation enums.

/** Operation type: Corresponds one-to-one with the callbacks in AI_CHAT_MODE_HANDLE_T. */
typedef enum {
    AI_MODE_OP_INIT, AI_MODE_OP_DEINIT, AI_MODE_OP_KEY, AI_MODE_OP_TASK,
    AI_MODE_OP_EVENT, AI_MODE_OP_WAKEUP, AI_MODE_OP_VAD, AI_MODE_OP_CLIENT,
    AI_MODE_OP_NOTIFY_IDLE, AI_MODE_OP_AUDIO_INPUT, AI_MODE_OP_MAX,
} AI_MODE_OP_E;

/**
 * @brief Dispatch an operation to the current mode (recommended).
 * @param[in] op   The operation type.
 * @param[in] data The payload (it can be NULL).
 * @param[in] len  The payload length.
 * @return OPRT_OK on success, OPRT_NOT_FOUND if no corresponding handler is found, or otherwise the return value of the callback function.
 */
OPERATE_RET wukong_ai_mode_dispatch(AI_MODE_OP_E op, VOID *data, INT_T len);
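
To illustrate the idea of a single dispatch entry, here is a self-contained sketch of how an operation enum could be mapped to per-mode callbacks. It is not the SDK implementation: all DP_* names, the reduced operation list, and the placeholder error value are assumptions made so that the code compiles without the SDK headers.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the SDK's OPERATE_RET and OPRT_* values are not reproduced. */
typedef int DP_RET;
#define DP_OK         0
#define DP_NOT_FOUND  (-1)   /* placeholder value, not the SDK's OPRT_NOT_FOUND */

typedef enum { DP_OP_INIT, DP_OP_KEY, DP_OP_VAD, DP_OP_MAX } DP_OP_E;

typedef DP_RET (*DP_CB)(void *data, int len);

/* One slot per operation; a NULL slot means "not implemented by this mode". */
typedef struct {
    DP_CB cb[DP_OP_MAX];
} DP_MODE_T;

static const DP_MODE_T *s_dp_cur_mode = NULL;

/* Select the mode whose callbacks receive dispatched operations. */
void dp_set_mode(const DP_MODE_T *mode)
{
    s_dp_cur_mode = mode;
}

/* Forward an operation to the current mode, or report that no handler exists. */
DP_RET dp_dispatch(DP_OP_E op, void *data, int len)
{
    if (s_dp_cur_mode == NULL || (int)op < 0 || (int)op >= DP_OP_MAX) {
        return DP_NOT_FOUND;
    }
    DP_CB cb = s_dp_cur_mode->cb[op];
    if (cb == NULL) {
        return DP_NOT_FOUND;
    }
    return cb(data, len);
}

/* Example mode that implements only the key callback. */
int g_dp_key_count = 0;

DP_RET dp_demo_on_key(void *data, int len)
{
    (void)data;
    (void)len;
    g_dp_key_count++;
    return DP_OK;
}

const DP_MODE_T g_dp_demo_mode = { .cb = { [DP_OP_KEY] = dp_demo_on_key } };
```

In this sketch, an operation the mode does not implement returns the "not found" value instead of dereferencing a NULL pointer, which is the behavior the return-value description suggests.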

Mode handler interface

Each mode implements the following interfaces as needed. Interfaces that are not implemented (assigned as NULL) will be safely intercepted during event dispatching or fall back to default behavior.

typedef struct {
    OPERATE_RET (*on_init)(VOID *data, INT_T len);        /* Initialization when the mode is entered (for example, set the VAD mode and start/stop KWS) */
    OPERATE_RET (*on_deinit)(VOID *data, INT_T len);      /* Deinitialization when the mode is exited (resource cleanup) */
    OPERATE_RET (*on_key)(VOID *data, INT_T len);         /* Physical button event (for example, press and hold, or press once) */
    OPERATE_RET (*on_task)(VOID *data, INT_T len);        /* Scheduled task or main loop polling (handles timeouts or delayed operations) */
    OPERATE_RET (*on_event)(VOID *data, INT_T len);       /* AI cloud or system events (for example, ASR recognition results and TTS playback status) */
    OPERATE_RET (*on_wakeup)(VOID *data, INT_T len);      /* Local voice keyword wake-up event */
    OPERATE_RET (*on_vad)(VOID *data, INT_T len);         /* Voice Activity Detection event (detects start/end of human speech) */
    OPERATE_RET (*on_client)(VOID *data, INT_T len);      /* Client business instructions or other network downlink data */
    OPERATE_RET (*on_notify_idle)(VOID *data, INT_T len); /* System idle state notification (automatically return to the default state on timeout) */
    OPERATE_RET (*on_audio_input)(VOID *data, INT_T len); /* Audio data stream input (if this callback is not implemented, the default cloud uplink is used) */
} AI_CHAT_MODE_HANDLE_T;

Each mode can use AI_CHAT_MODE_PARAM_T (defined in wukong_ai_mode.h), which includes the wakeup_stat and state fields, as its internal context structure.

State change macro

Switches states within a mode and prints logs during the transition.

#define MODE_STATE_CHANGE(_mode, _old, _new) \
do { \
    PR_DEBUG("Mode %s status changes from %s to %s", \
             _mode_str[_mode], _state_str[_old], _state_str[_new]); \
    _old = _new; \
} while (0)
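
The sketch below exercises the macro outside the SDK to show two properties that follow from its definition: the _old argument must be a modifiable variable because the macro assigns to it, and the do { ... } while (0) wrapper lets the macro be used as a single statement in an if/else without braces. The PR_DEBUG stand-in, the two-entry string tables, and the MC_* names are assumptions made for this example only.

```c
#include <stdio.h>

/* Stand-in for the SDK logging macro so the sketch compiles on its own. */
#define PR_DEBUG(...) do { printf(__VA_ARGS__); printf("\n"); } while (0)

/* Hypothetical, reduced enums and name tables; the real ones live in the SDK. */
typedef enum { MC_MODE_HOLD, MC_MODE_WAKEUP } MC_MODE_E;
typedef enum { MC_IDLE, MC_LISTEN } MC_STATE_E;

static const char *_mode_str[]  = { "hold", "wakeup" };
static const char *_state_str[] = { "IDLE", "LISTEN" };

/* Same shape as the macro shown above. */
#define MODE_STATE_CHANGE(_mode, _old, _new) \
do { \
    PR_DEBUG("Mode %s status changes from %s to %s", \
             _mode_str[_mode], _state_str[_old], _state_str[_new]); \
    _old = _new; \
} while (0)

/* Return the state after a wake-up check; 'cur' is updated by the macro. */
MC_STATE_E mc_on_wakeup(MC_STATE_E cur, int woke)
{
    if (woke)
        MODE_STATE_CHANGE(MC_MODE_WAKEUP, cur, MC_LISTEN);
    else
        MODE_STATE_CHANGE(MC_MODE_WAKEUP, cur, MC_IDLE);
    return cur;
}
```

Because each argument is expanded more than once, it is safer to pass plain variables and constants rather than expressions with side effects.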

Example

Initialize

#include <stdio.h>          /* printf */
#include "wukong_ai_mode.h"

/* Initialize the mode system */
OPERATE_RET init_modes(VOID)
{
    OPERATE_RET rt = OPRT_OK;

    /* Initialize the mode manager */
    /* This will register all enabled modes and start the default mode */
    rt = wukong_ai_mode_init();
    if (rt != OPRT_OK) {
        printf("Failed to initialize mode: %d\n", rt);
        return rt;
    }

    return rt;
}

Switch mode

/* Switch to wakeup mode */
wukong_ai_mode_switch_to(AI_TRIGGER_MODE_WAKEUP);

/* Switch to the next enabled mode */
AI_TRIGGER_MODE_E current = tuya_ai_toy_trigger_mode_get();
wukong_ai_mode_switch(current);
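
The source does not describe how the "next enabled mode" is chosen. One plausible approach, shown here purely as an assumption, is to walk the mode enumeration in order, skip disabled entries, and wrap around at the end. All NX_* names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical, reduced mode enum for illustration only. */
typedef enum {
    NX_MODE_HOLD, NX_MODE_ONESHOT, NX_MODE_WAKEUP, NX_MODE_FREE,
    NX_MODE_MAX
} NX_MODE_E;

/*
 * Return the first enabled mode after 'cur', wrapping around the enum.
 * If no other mode is enabled, 'cur' is returned unchanged.
 */
NX_MODE_E nx_next_enabled(NX_MODE_E cur, const bool enabled[NX_MODE_MAX])
{
    for (int step = 1; step < NX_MODE_MAX; step++) {
        int candidate = ((int)cur + step) % NX_MODE_MAX;
        if (enabled[candidate]) {
            return (NX_MODE_E)candidate;
        }
    }
    return cur;
}
```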

Handle events (Consistently use dispatch + operation type)

/* Key */
PUSH_KEY_TYPE_E key_event = NORMAL_KEY;
wukong_ai_mode_dispatch(AI_MODE_OP_KEY, &key_event, sizeof(key_event));

/* AI event (asr_result is assumed to be provided by the caller) */
WUKONG_AI_EVENT_T event = { .type = WUKONG_AI_EVENT_ASR_OK, .data = asr_result };
wukong_ai_mode_dispatch(AI_MODE_OP_EVENT, &event, sizeof(event));

/* Wakeup and VAD */
wukong_ai_mode_dispatch(AI_MODE_OP_WAKEUP, NULL, 0);
WUKONG_AUDIO_VAD_FLAG_E vad_flag = WUKONG_AUDIO_VAD_START;
wukong_ai_mode_dispatch(AI_MODE_OP_VAD, &vad_flag, sizeof(vad_flag));
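
On the receiving side, every callback gets an untyped pointer and a length, so it is reasonable for a handler to validate both before casting. The following sketch shows one defensive pattern for a key callback. Whether the SDK's built-in modes check len this way is not stated in the source, and all KY_* names and key values are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the SDK's own types and error codes are not reproduced. */
typedef int KY_RET;
#define KY_OK       0
#define KY_INVALID  (-1)   /* placeholder error value */

typedef enum { KY_KEY_NORMAL, KY_KEY_LONG_START, KY_KEY_LONG_END } KY_KEY_E;

int g_ky_recording = 0;    /* 1 while this sketch considers recording active */

/* Push-to-talk style key handler: start on long-press begin, stop on release. */
KY_RET ky_on_key(void *data, int len)
{
    /* Reject a missing payload or a payload of the wrong size before casting. */
    if (data == NULL || len != (int)sizeof(KY_KEY_E)) {
        return KY_INVALID;
    }

    switch (*(const KY_KEY_E *)data) {
    case KY_KEY_LONG_START:
        g_ky_recording = 1;
        break;
    case KY_KEY_LONG_END:
        g_ky_recording = 0;
        break;
    default:
        /* Other key types are ignored in this sketch. */
        break;
    }
    return KY_OK;
}
```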

State change macro

/* Use MODE_STATE_CHANGE to perform state transitions within the mode */
MODE_STATE_CHANGE(AI_TRIGGER_MODE_WAKEUP, s_ai_wakeup.state, AI_CHAT_LISTEN);

Create a mode

Create a new file wukong_ai_mode_my_mode.c under the mode/ directory. This file should only depend on wukong_ai_mode.h and does not require a separate header file.

Preparation: Enable the mode in the configuration

  1. Add a configuration item. In the application Kconfig, add an option named ENABLE_AI_MODE_MY_MODE (or one named after your mode) so that it can be selected in make app_menuconfig.
  2. Enable the mode for board types. In the build/APPconfig file, add select ENABLE_AI_MODE_MY_MODE under the config section of the desired board type. If every board type should have this option, add it to each board type's section.
  3. Generate the configuration. After saving, run make app_config APP_NAME=tuyaos_demo_wukong_ai to generate the header files.

Step 1: Implement the mode source file (.c)

/* wukong_ai_mode_my_mode.c */
#include "wukong_ai_mode.h"

#if defined(ENABLE_AI_MODE_MY_MODE) && (ENABLE_AI_MODE_MY_MODE == 1)

STATIC AI_CHAT_MODE_HANDLE_T s_ai_my_mode_cb = {0};
STATIC AI_CHAT_MODE_PARAM_T s_ai_my_mode = {0};
STATIC AI_CHAT_STATE_E s_ai_cur_state = AI_CHAT_INVALID;

/* State callbacks */
STATIC OPERATE_RET __ai_my_mode_idle_cb(VOID *data, INT_T len)
{
    tuya_ai_toy_led_off();
    wukong_audio_input_wakeup_set(FALSE);
    return OPRT_OK;
}

STATIC OPERATE_RET __ai_my_mode_listen_cb(VOID *data, INT_T len)
{
    tuya_ai_toy_led_flash(500);
    wukong_audio_input_wakeup_set(TRUE);
    return OPRT_OK;
}

/* ... Implement other state callbacks ... */

/* Event handler */
STATIC OPERATE_RET wukong_ai_my_mode_event_cb(VOID *data, INT_T len)
{
    WUKONG_AI_EVENT_T *event = (WUKONG_AI_EVENT_T *)data;

    /* Guard against a dispatch call that carries no payload */
    if (event == NULL) {
        return OPRT_INVALID_PARM;
    }

    switch (event->type) {
    case WUKONG_AI_EVENT_ASR_OK:
        MODE_STATE_CHANGE(AI_TRIGGER_MODE_MY_MODE, s_ai_my_mode.state, AI_CHAT_THINK);
        break;

    case WUKONG_AI_EVENT_TTS_PRE:
        MODE_STATE_CHANGE(AI_TRIGGER_MODE_MY_MODE, s_ai_my_mode.state, AI_CHAT_SPEAK);
        break;

    /* ... Handle other events ... */

    default:
        break;
    }

    return OPRT_OK;
}

/* Initialize the handler */
STATIC OPERATE_RET wukong_ai_my_mode_init_cb(VOID *data, INT_T len)
{
    wukong_audio_input_wakeup_mode_set(WUKONG_AUDIO_VAD_AUTO);
    wukong_kws_enable();
    MODE_STATE_CHANGE(AI_TRIGGER_MODE_MY_MODE, s_ai_my_mode.state, AI_CHAT_IDLE);
    return OPRT_OK;
}

/* Register mode */
OPERATE_RET ai_my_mode_register(AI_CHAT_MODE_HANDLE_T **cb)
{
    if (cb == NULL) {
        return OPRT_INVALID_PARM;
    }

    /* The callbacks not shown above (deinit, key, task, wakeup, vad, client,
     * notify_idle) must also be defined in this file. Leave a member NULL
     * if the mode does not need it. */
    s_ai_my_mode_cb.on_init        = wukong_ai_my_mode_init_cb;
    s_ai_my_mode_cb.on_deinit      = wukong_ai_my_mode_deinit_cb;
    s_ai_my_mode_cb.on_key         = wukong_ai_my_mode_key_cb;
    s_ai_my_mode_cb.on_task        = wukong_ai_my_mode_task_cb;
    s_ai_my_mode_cb.on_event       = wukong_ai_my_mode_event_cb;
    s_ai_my_mode_cb.on_wakeup      = wukong_ai_my_mode_wakeup;
    s_ai_my_mode_cb.on_vad         = wukong_ai_my_mode_vad;
    s_ai_my_mode_cb.on_client      = wukong_ai_my_mode_client_run;
    s_ai_my_mode_cb.on_notify_idle = wukong_ai_my_mode_notify_idle_cb;
    /* on_audio_input stays NULL, so the default cloud uplink is used */

    *cb = &s_ai_my_mode_cb;
    return OPRT_OK;
}

#endif /* ENABLE_AI_MODE_MY_MODE */

Step 2: Register in the mode manager

In wukong_ai_mode.c, add an extern declaration for ai_my_mode_register. Then, within wukong_ai_mode_init(), register the new mode the same way the existing modes are registered: guard the call with the ENABLE_AI_MODE_MY_MODE switch and have it fill s_ai_mode_map[AI_TRIGGER_MODE_MY_MODE].

This step does not touch the configuration macros, but you must add AI_TRIGGER_MODE_MY_MODE to the enumeration and add a matching entry to the _mode_str table (see Step 3).
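
The registration described in this step might look roughly like the following self-contained mock-up. It is a sketch under stated assumptions, not the contents of wukong_ai_mode.c: the RG_* names, the reduced handle structure, and the shape of the mode map (an array of handle pointers indexed by mode) are guesses based on the ai_my_mode_register signature shown earlier.

```c
#include <stddef.h>

/* Normally generated by the configuration step; defined here so the sketch compiles. */
#define ENABLE_AI_MODE_MY_MODE 1

/* Hypothetical stand-ins for the SDK types. */
typedef int RG_RET;
#define RG_OK 0

typedef struct {
    RG_RET (*on_init)(void *data, int len);
} RG_HANDLE_T;

typedef enum { RG_MODE_HOLD, RG_MODE_MY_MODE, RG_MODE_MAX } RG_MODE_E;

/* Name table and mode map, indexed by the mode enum (assumed layout). */
const char *g_rg_mode_str[RG_MODE_MAX] = {
    [RG_MODE_HOLD]    = "hold",
    [RG_MODE_MY_MODE] = "my_mode",
};
RG_HANDLE_T *g_rg_mode_map[RG_MODE_MAX];

/* In a real project this part lives in the mode's own .c file. */
static RG_HANDLE_T s_rg_my_mode_cb;

static RG_RET rg_my_mode_init_cb(void *data, int len)
{
    (void)data;
    (void)len;
    return RG_OK;
}

RG_RET rg_my_mode_register(RG_HANDLE_T **cb)
{
    s_rg_my_mode_cb.on_init = rg_my_mode_init_cb;
    *cb = &s_rg_my_mode_cb;
    return RG_OK;
}

/* Manager side: register each mode only when its switch is enabled. */
RG_RET rg_mode_init(void)
{
    RG_RET rt = RG_OK;

#if defined(ENABLE_AI_MODE_MY_MODE) && (ENABLE_AI_MODE_MY_MODE == 1)
    rt = rg_my_mode_register(&g_rg_mode_map[RG_MODE_MY_MODE]);
    if (rt != RG_OK) {
        return rt;
    }
#endif

    return rt;
}
```

Modes whose switch is disabled simply leave their slot in the map as NULL, so the manager can tell enabled and disabled modes apart.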

Step 3: Add mode enumeration (for a new mode type)

In wukong_ai_mode.h, add the new enumeration value to AI_TRIGGER_MODE_E (for example, AI_TRIGGER_MODE_MY_MODE). Also add the matching entry to the _mode_str table in wukong_ai_mode.c, because MODE_STATE_CHANGE uses that table to print the mode name.

Configuration macros (such as ENABLE_AI_MODE_*) are generated by the existing build and configuration system, so this step does not add or modify any of them in the source code.

Support

If you have any problems with TuyaOS development, you can post your questions in the Tuya Developer Forum.