Voice Services

Last Updated on: 2023-02-13 03:09:24

AI voice technologies enable hands-free device control and open up some fascinating possibilities. This topic describes the voice services provided in TuyaOS Central Control SDK and how to use the APIs.

Background

The voice services depend on the cloud. The central control device uploads voice data to the cloud, and the cloud processes the data and returns the result. The voice services include the following capabilities:

  • Upload voice data to the cloud.
  • Send text-to-speech (TTS) audio to the central control device.
  • Send music or FM radio to the central control device.
  • Send automatic speech recognition (ASR) text output to the central control device.
  • Send TTS text to the central control device.
  • Get the details of alarm clocks.
  • Pair devices by voice.
  • Set a nickname.
  • Voice system data points (DPs), including volume setting, alarm clock/reminder, mic on/off, play/pause, previous/next, and Bluetooth playback on/off.

Enable voice services

Function prototype OPERATE_RET home_control_user_voice_sevice_start(VOID)
Function description This is a demo function, used to enable voice services.
Parameter -
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Example

OPERATE_RET home_control_user_voice_sevice_start(VOID)
{
    OPERATE_RET rt = OPRT_OK;
    TY_SPEAKER_CLOUD_CBS_S speaker_cloud_cbs;
    TY_VOICE_CAPABLE_CBS_S voice_cap = {0};
    // Register callbacks for handling commands received from the cloud.
    memset(&speaker_cloud_cbs, 0x0, sizeof(TY_SPEAKER_CLOUD_CBS_S));
    speaker_cloud_cbs.rev_media_cb = __voice_rcv_cloud_media_cb;
    speaker_cloud_cbs.sync_audio_cb = __voice_rcv_sync_audio_cb;
    speaker_cloud_cbs.thing_config_cb = __voice_rev_cloud_thing_config_cb;
    speaker_cloud_cbs.nick_name_cb = __voice_rev_cloud_set_nick_name_cb;
    speaker_cloud_cbs.rev_ext_custom_cb = __voice_rcv_cloud_ext_custom_cb;
    rt = tuya_speaker_cloud_register(&speaker_cloud_cbs);
    if (OPRT_OK != rt) {
        PR_ERR("tuya_speaker_cloud_register error, rt: %d", rt);
        return rt;
    }

    // Initialize voice upload channel.
    SPEAKER_UPLOAD_CONFIG_S upload_config = SPEAKER_UPLOAD_CONFIG_FOR_SPEEX();
    upload_config.report_stat_cb          = __voice_upload_media_status_cb;
    rt = speaker_intf_upload_init(&upload_config);
    if (OPRT_OK != rt) {
        PR_ERR("speaker_intf_upload_init error:%d", rt);
        return rt;
    }

    // Register callbacks for voice system DPs.
    user_voice_system_dp_register();
    return OPRT_OK;
}

Register callbacks on initialization

Volume setting callback

Function prototype VOID (*TY_VOICE_VOL_CTL_CB)(UINT8_T vol)
Function description Set the volume through the voice system DP 203.
Parameter vol: The volume level, ranging from 0 to 100.
Return value VOID
Detailed description After the setting is done, call tuya_voice_capable_report_vol to synchronize the current volume level with the cloud.
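
As a rough illustration of the flow described above, the following sketch clamps the requested level to the documented 0 to 100 range, applies it, and then reports it. The handler name user_voice_vol_ctl_cb, the state variables, and the typedefs are hypothetical stand-ins so that the snippet compiles on its own. tuya_voice_capable_report_vol is stubbed here, and the real function comes from the SDK.

```c
#include <assert.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own.
 * In a real project, they come from the TuyaOS headers. */
typedef unsigned char UINT8_T;
typedef int OPERATE_RET;
#define VOID void
#define OPRT_OK 0

#define VOL_MAX 100

static UINT8_T s_current_vol   = 50; /* hypothetical local volume state */
static int     s_last_reported = -1; /* records what the stub "reported" */

/* Stub with the documented prototype. The real function is provided by the SDK. */
OPERATE_RET tuya_voice_capable_report_vol(UINT8_T vol)
{
    s_last_reported = vol;
    return OPRT_OK;
}

/* Hypothetical handler that matches TY_VOICE_VOL_CTL_CB. */
VOID user_voice_vol_ctl_cb(UINT8_T vol)
{
    /* Clamp to the documented 0..100 range before applying. */
    if (vol > VOL_MAX) {
        vol = VOL_MAX;
    }
    s_current_vol = vol; /* Apply the level to the audio driver here. */
    assert(s_current_vol <= VOL_MAX);

    /* Synchronize the applied level with the cloud. */
    if (tuya_voice_capable_report_vol(s_current_vol) != OPRT_OK) {
        /* Log or retry in real code. The stub never fails. */
    }
}
```

Reporting the clamped value rather than the requested one keeps the cloud in step with what the device actually applied.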

DP control command callback

Function prototype VOID (*TY_VOICE_CTL_CB)(TY_VOICE_CTL_E ctl)
Function description The cloud sends DP commands to control the voice system of the central control device.
Parameter ctl: The control command.
Return value VOID
Detailed description The TY_VOICE_CTL_E enumeration defines all the voice system commands:
  • TY_VOICE_MIC_OPEN: Turn on mic.
  • TY_VOICE_MIC_CLOSE: Turn off mic.
  • TY_VOICE_BT_PLAY_OPEN: Turn on Bluetooth playback.
  • TY_VOICE_BT_PLAY_CLOSE: Turn off Bluetooth playback.
  • TY_VOICE_PLAY_START: Start playing the media.
  • TY_VOICE_PLAY_PAUSE: Pause the media.
  • TY_VOICE_PLAY_PREV: Go to the previous item.
  • TY_VOICE_PLAY_NEXT: Go to the next item.
After the setting is done, call tuya_voice_capable_report_ctl to synchronize the current status with the cloud.
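
A handler for these commands could look roughly like the sketch below. The enumerator names are taken from the list above, but their numeric values, the voice_state_t struct, and the handler name user_voice_ctl_cb are assumptions made for illustration. tuya_voice_capable_report_ctl is stubbed, and its signature is inferred from the ctl parameter described in this topic.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for SDK definitions. Real values come from the TuyaOS headers. */
typedef int OPERATE_RET;
#define VOID void
#define OPRT_OK 0

/* Names from this topic. The numeric values are placeholders. */
typedef enum {
    TY_VOICE_MIC_OPEN,
    TY_VOICE_MIC_CLOSE,
    TY_VOICE_BT_PLAY_OPEN,
    TY_VOICE_BT_PLAY_CLOSE,
    TY_VOICE_PLAY_START,
    TY_VOICE_PLAY_PAUSE,
    TY_VOICE_PLAY_PREV,
    TY_VOICE_PLAY_NEXT
} TY_VOICE_CTL_E;

/* Hypothetical local state of the voice system. */
typedef struct {
    bool mic_on;
    bool bt_play_on;
    bool playing;
    int  track;
} voice_state_t;

static voice_state_t s_state = { true, false, false, 0 };
static int s_report_count = 0;

/* Stub. The signature is an assumption based on the ctl parameter. */
OPERATE_RET tuya_voice_capable_report_ctl(TY_VOICE_CTL_E ctl)
{
    (void)ctl;
    s_report_count++;
    return OPRT_OK;
}

/* Hypothetical handler that matches TY_VOICE_CTL_CB. */
VOID user_voice_ctl_cb(TY_VOICE_CTL_E ctl)
{
    switch (ctl) {
    case TY_VOICE_MIC_OPEN:      s_state.mic_on = true;      break;
    case TY_VOICE_MIC_CLOSE:     s_state.mic_on = false;     break;
    case TY_VOICE_BT_PLAY_OPEN:  s_state.bt_play_on = true;  break;
    case TY_VOICE_BT_PLAY_CLOSE: s_state.bt_play_on = false; break;
    case TY_VOICE_PLAY_START:    s_state.playing = true;     break;
    case TY_VOICE_PLAY_PAUSE:    s_state.playing = false;    break;
    case TY_VOICE_PLAY_PREV:
        if (s_state.track > 0) {
            s_state.track--; /* Stay on the first item instead of going negative. */
        }
        break;
    case TY_VOICE_PLAY_NEXT:     s_state.track++;            break;
    default:
        return; /* Unknown command: change nothing and report nothing. */
    }
    assert(s_state.track >= 0);

    /* Synchronize the new status with the cloud. */
    (void)tuya_voice_capable_report_ctl(ctl);
}
```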

Create local alarm clocks

Function prototype VOID (*TY_VOICE_ALARM_CLOCK_CB)(CHAR_T *alarm)
Function description The cloud sends a local alarm clock to the central control device through the voice system DP 207.
Parameter alarm: The details of the alarm.
Return value VOID
Detailed description -

Voice upload status

Function prototype VOID(*SPEAKER_UPLOAD_REPORT_STAT_CB)(SPEAKER_UPLOAD_STAT_E stat)
Function description This callback is invoked when an error occurs during voice upload.
Parameter stat: The status of voice upload.
Return value VOID
Detailed description stat: SPEAKER_UP_STAT_NET_ERR indicates a network error.

Receive multimedia data from the cloud

Function prototype VOID (*TY_SPEAKER_REV_CLOUD_MEDIA_CB)(IN TY_CLOUD_MEDIA_S **pp_media_arr, IN CONST UINT_T arr_size)
Function description This callback is invoked when the central control device receives multimedia data from the cloud. The URL and audio parameters are used to request the audio data stream.
Parameter
  • pp_media_arr: Multimedia data.
  • arr_size: The number of elements in pp_media_arr.
Return value VOID
Detailed description Parameters for multimedia data:
  • comm: The audio parameter.
  • p_url: The URL of the HTTP request.
  • p_req_body: The body of the HTTP request.
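
One way to consume this callback is to walk the array and queue every entry that has a usable URL, as sketched below. The struct here is only a stand-in: the field names p_url and p_req_body come from this topic, while the type of comm, the queue, and the handler name user_voice_rcv_cloud_media_cb are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own. */
typedef char CHAR_T;
typedef unsigned int UINT_T;
#define VOID void

typedef struct {
    int     comm;       /* Placeholder. The real field holds the audio parameters. */
    CHAR_T *p_url;      /* The URL of the HTTP request. */
    CHAR_T *p_req_body; /* The body of the HTTP request. */
} TY_CLOUD_MEDIA_S;

/* Hypothetical fixed-size play queue. */
#define PLAY_QUEUE_MAX 8
static const CHAR_T *s_play_queue[PLAY_QUEUE_MAX];
static UINT_T s_play_count = 0;

/* Hypothetical handler that matches TY_SPEAKER_REV_CLOUD_MEDIA_CB. */
VOID user_voice_rcv_cloud_media_cb(TY_CLOUD_MEDIA_S **pp_media_arr, const UINT_T arr_size)
{
    if (pp_media_arr == NULL) {
        return;
    }
    for (UINT_T i = 0; i < arr_size; i++) {
        const TY_CLOUD_MEDIA_S *media = pp_media_arr[i];
        /* Skip entries that cannot be requested. */
        if (media == NULL || media->p_url == NULL || media->p_url[0] == '\0') {
            continue;
        }
        if (s_play_count >= PLAY_QUEUE_MAX) {
            break; /* Queue is full. Drop the rest in this sketch. */
        }
        s_play_queue[s_play_count++] = media->p_url;
    }
    assert(s_play_count <= PLAY_QUEUE_MAX);
}
```

The queue stores pointers into the received array rather than copies, so in this sketch the array must stay allocated until playback is finished.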

Sub-device pairing request

Function prototype VOID (*TY_SPEAKER_THING_CONFIG_CB)(IN TY_THING_CONFIG_MODE_T mode, IN CONST CHAR_T *token, IN CONST UINT_T timeout)
Function description The central control device receives a request from the cloud to start or stop pairing a sub-device.
Parameter
  • mode: The pairing command.
  • token: The token used to pair a Wi-Fi device.
  • timeout: The pairing timeout period.
Return value VOID
Detailed description The pairing command:
  • TY_THING_CONFIG_START: Start pairing.
  • TY_THING_CONFIG_STOP: Stop pairing.
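
The start and stop commands can be tracked with a small piece of state, as in the sketch below. The enumerator values, the pairing_ctx_t struct, the 64-byte token buffer, and the handler name are hypothetical. The unit of timeout is not stated in this topic, so the value is stored as is.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own. */
typedef char CHAR_T;
typedef unsigned int UINT_T;
#define VOID void

/* Names from this topic. The numeric values are placeholders. */
typedef enum {
    TY_THING_CONFIG_START,
    TY_THING_CONFIG_STOP
} TY_THING_CONFIG_MODE_T;

/* Hypothetical pairing context kept by the application. */
typedef struct {
    bool   active;
    CHAR_T token[64];
    UINT_T timeout; /* Stored unchanged because the unit is not documented here. */
} pairing_ctx_t;

static pairing_ctx_t s_pairing;

/* Hypothetical handler that matches TY_SPEAKER_THING_CONFIG_CB. */
VOID user_voice_thing_config_cb(TY_THING_CONFIG_MODE_T mode, const CHAR_T *token, const UINT_T timeout)
{
    switch (mode) {
    case TY_THING_CONFIG_START:
        /* Keep a bounded copy in case the pointer does not outlive the callback. */
        snprintf(s_pairing.token, sizeof(s_pairing.token), "%s", token ? token : "");
        s_pairing.timeout = timeout;
        s_pairing.active  = true;
        break;
    case TY_THING_CONFIG_STOP:
        s_pairing.active   = false;
        s_pairing.token[0] = '\0';
        break;
    default:
        break;
    }
    /* Invariant of this sketch: an inactive context never holds a token. */
    assert(s_pairing.active || s_pairing.token[0] == '\0');
}
```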

Nickname setting request

Function prototype VOID (*TY_SPEAKER_NICK_NAME_CB)(IN CONST TY_NICK_NAME_MODE_T mode, IN CONST CHAR_T *nickname, IN CONST CHAR_T *pinyin)
Function description The central control device receives a nickname setting request from the cloud.
Parameter
  • mode: The nickname setting command.
  • nickname: The nickname in Chinese.
  • pinyin: The nickname in Pinyin.
Return value VOID
Detailed description The nickname setting command:
  • TY_NICK_NAME_SET: Add a nickname.
  • TY_NICK_NAME_DEL: Delete a nickname.
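
The sketch below shows one possible handler: it stores or clears a single nickname and then reports the outcome with tuya_speaker_mqtt_report_nick_name, which is documented later in this topic and stubbed here. The 32-byte store, the handler name, and the typedefs are assumptions made so that the snippet is self-contained.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own. */
typedef char CHAR_T;
typedef int BOOL_T;
typedef int OPERATE_RET;
#define VOID void
#define TRUE 1
#define FALSE 0
#define OPRT_OK 0

/* Names from this topic. The numeric values are placeholders. */
typedef enum {
    TY_NICK_NAME_SET,
    TY_NICK_NAME_DEL
} TY_NICK_NAME_MODE_T;

static CHAR_T s_nickname[32];          /* Hypothetical single-slot store. */
static BOOL_T s_last_result = FALSE;   /* Records what the stub "reported". */

/* Stub with the documented parameter list. The real function is provided by the SDK. */
OPERATE_RET tuya_speaker_mqtt_report_nick_name(TY_NICK_NAME_MODE_T mode, const CHAR_T *nickname,
                                               const CHAR_T *pinyin, BOOL_T set_result)
{
    (void)mode;
    (void)nickname;
    (void)pinyin;
    s_last_result = set_result;
    return OPRT_OK;
}

/* Hypothetical handler that matches TY_SPEAKER_NICK_NAME_CB. */
VOID user_voice_nick_name_cb(TY_NICK_NAME_MODE_T mode, const CHAR_T *nickname, const CHAR_T *pinyin)
{
    BOOL_T ok = FALSE;

    if (mode == TY_NICK_NAME_SET) {
        /* Accept the nickname only if it fits in the local store. */
        if (nickname != NULL && strlen(nickname) < sizeof(s_nickname)) {
            strcpy(s_nickname, nickname);
            ok = TRUE;
        }
    } else if (mode == TY_NICK_NAME_DEL) {
        s_nickname[0] = '\0';
        ok = TRUE;
    }
    assert(strlen(s_nickname) < sizeof(s_nickname));

    /* Report the result of the operation to the cloud. */
    (void)tuya_speaker_mqtt_report_nick_name(mode, nickname, pinyin, ok);
}
```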

Pass through custom data

Function prototype VOID (*TY_SPEAKER_REV_CUSTOM_CB)(IN CONST CHAR_T *type, IN CONST ty_cJSON *json)
Function description A custom 501 pass-through callback for receiving custom data. You define the protocol and its description yourself.
Parameter
  • type: The interface type.
  • json: The JSON data.
Return value VOID
Detailed description -

Report the current volume level

Function prototype OPERATE_RET tuya_voice_capable_report_vol (UINT8_T vol)
Function description The central control device reports the current volume level to the cloud.
Parameter vol: The volume level, ranging from 0 to 100.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Report the current voice system status

Function prototype OPERATE_RET tuya_voice_capable_report_ctl (TY_VOICE_CTL_E ctl)
Function description The central control device reports the current voice system status to the cloud.
Parameter ctl: The command.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description The TY_VOICE_CTL_E enumeration defines all the voice system commands:
  • TY_VOICE_MIC_OPEN: Turn on mic.
  • TY_VOICE_MIC_CLOSE: Turn off mic.
  • TY_VOICE_BT_PLAY_OPEN: Turn on Bluetooth playback.
  • TY_VOICE_BT_PLAY_CLOSE: Turn off Bluetooth playback.
  • TY_VOICE_PLAY_START: Start playing the media.
  • TY_VOICE_PLAY_PAUSE: Pause the media.
  • TY_VOICE_PLAY_PREV: Go to the previous item.
  • TY_VOICE_PLAY_NEXT: Go to the next item.

Upload voice data

The following example opens a WAV audio file, reads the PCM audio data, and uploads it to the cloud. You can find the audio reference file in the audio_resource directory.

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
// The TuyaOS SDK headers that declare the types and APIs used below are also required.

OPERATE_RET home_control_user_voice_upload_cloud(CONST CHAR_T *pcm_file)
{
    OPERATE_RET ret = OPRT_OK;
    CONST INT_T send_len = 640;     // The number of bytes sent per call.
    CONST INT_T wav_head_len = 44;  // The length of the WAV header to skip.
    INT_T total_len = 0;
    INT_T sended_len = 0;
    INT_T remain_len = 0;
    INT_T chunk_len = 0;
    BOOL_T started = FALSE;
    FILE *p_file = NULL;
    CHAR_T *buf = NULL;
    struct stat statbuf;

    // Encoding parameters.
    TY_AUDIO_INFO_S audio_info = {
        .channels = 1,
        .rate = 16000,
        .bits_per_sample = 16,
    };

    PR_DEBUG("start upload test pcm");

    if (pcm_file == NULL) {
        PR_ERR("Param is invalid");
        return OPRT_INVALID_PARM;
    }

    // Open in binary mode because the file contains raw audio samples.
    p_file = fopen(pcm_file, "rb");
    if (p_file == NULL) {
        PR_ERR("Unable to open file %s", pcm_file);
        return OPRT_INVALID_PARM;
    }

    // The PCM payload is everything after the WAV header.
    if (stat(pcm_file, &statbuf) != 0 || statbuf.st_size <= wav_head_len) {
        PR_ERR("Invalid file size: %s", pcm_file);
        ret = OPRT_COM_ERROR;
        goto exit;
    }
    total_len = (INT_T)(statbuf.st_size - wav_head_len);

    buf = (CHAR_T *)malloc(total_len);
    if (buf == NULL) {
        PR_ERR("Malloc error");
        ret = OPRT_MALLOC_FAILED;
        goto exit;
    }

    if (fseek(p_file, wav_head_len, SEEK_SET) != 0 ||
        fread(buf, 1, total_len, p_file) != (size_t)total_len) {
        PR_ERR("Read error");
        ret = OPRT_COM_ERROR;
        goto exit;
    }

    // Step 1: Start the upload session.
    ret = tuya_voice_upload_media_start(TY_SPEEX, &audio_info);
    if (ret != OPRT_OK) {
        PR_ERR("tuya_voice_upload_media_start failed:%d", ret);
        goto exit;
    }
    started = TRUE;

    // Step 2: Send the data in chunks of up to send_len bytes.
    while (sended_len < total_len) {
        remain_len = total_len - sended_len;
        chunk_len = (remain_len < send_len) ? remain_len : send_len;
        ret = tuya_voice_upload_media_send((BYTE_T *)buf + sended_len, chunk_len);
        if (ret != OPRT_OK) {
            PR_ERR("tuya_voice_upload_media_send failed:%d", ret);
            ret = OPRT_COM_ERROR;
            goto exit;
        }
        sended_len += chunk_len;
    }

exit:
    // Step 3: Stop the session only if it was started, then release local resources.
    if (started) {
        tuya_voice_upload_media_stop();
    }
    if (p_file) {
        fclose(p_file);
    }
    free(buf);

    return ret;
}
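
The example above skips a fixed 44-byte header, which only holds for a canonical PCM WAV file. A small check like the following can confirm that assumption before the payload is uploaded. The function name wav_has_canonical_header is hypothetical and not part of the SDK. The offsets follow the canonical RIFF/WAVE layout with a 16-byte fmt chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define WAV_CANONICAL_HEADER_LEN 44

/* Returns true if hdr looks like a canonical 44-byte PCM WAV header:
 * "RIFF" at offset 0, "WAVE" at 8, "fmt " at 12, and "data" at 36. */
bool wav_has_canonical_header(const unsigned char *hdr, size_t len)
{
    assert(hdr != NULL);

    if (len < WAV_CANONICAL_HEADER_LEN) {
        return false;
    }
    return memcmp(hdr, "RIFF", 4) == 0 &&
           memcmp(hdr + 8, "WAVE", 4) == 0 &&
           memcmp(hdr + 12, "fmt ", 4) == 0 &&
           memcmp(hdr + 36, "data", 4) == 0;
}
```

Files that carry extra chunks before the data chunk fail this check, which is the intended behavior: for such files the payload does not start at byte 44.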

Voice data upload only supports a single thread. There are three steps involved: start, upload, and stop.
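
The upload step usually splits the buffer into fixed-size chunks with a shorter final chunk. The helper below sketches that loop in isolation. upload_in_chunks, chunk_sender_t, and counting_sender are hypothetical names. In real code, the sender would wrap tuya_voice_upload_media_send and the call would sit between the start and stop steps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sender type: returns 0 on success and non-zero on failure. */
typedef int (*chunk_sender_t)(const unsigned char *data, size_t len);

/* Send total_len bytes in chunks of at most chunk_len bytes.
 * Stops at the first failure and returns the sender's error code. */
int upload_in_chunks(const unsigned char *buf, size_t total_len,
                     size_t chunk_len, chunk_sender_t send)
{
    assert(chunk_len > 0 && send != NULL);

    size_t sent = 0;
    while (sent < total_len) {
        size_t remain = total_len - sent;
        size_t n = (remain < chunk_len) ? remain : chunk_len;
        int rt = send(buf + sent, n);
        if (rt != 0) {
            return rt;
        }
        sent += n;
    }
    return 0;
}

/* Test double that only counts what it receives. */
static size_t s_calls = 0;
static size_t s_bytes = 0;
static size_t s_last_len = 0;

static int counting_sender(const unsigned char *data, size_t len)
{
    (void)data;
    s_calls++;
    s_bytes += len;
    s_last_len = len;
    return 0;
}
```

Computing the chunk size before each call means a buffer shorter than one chunk is never over-read.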

Initialize voice upload

Function prototype OPERATE_RET tuya_voice_upload_media_start (IN CONST TY_MEDIA_ENCODE_T type, IN CONST TY_AUDIO_INFO_S *info)
Function description Enable uploading voice data to the cloud.
Parameter
  • type: The encoding type.
  • info: The audio parameters.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description The audio parameters:
  • channels: The number of audio channels. Set it to 1 for mono.
  • rate: The sample rate. Set it to 16000.
  • bits_per_sample: The sample depth. Set it to 16 bits.
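
With these settings, the raw PCM data rate is 16000 × 2 × 1 = 32000 bytes per second, so a 640-byte chunk holds 20 ms of audio. The helper below works through that arithmetic. pcm_bytes_for_ms is a hypothetical name and not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

/* Number of raw PCM bytes needed to hold duration_ms of audio. */
uint32_t pcm_bytes_for_ms(uint32_t sample_rate_hz, uint32_t bits_per_sample,
                          uint32_t channels, uint32_t duration_ms)
{
    /* This sketch only handles whole-byte sample sizes. */
    assert(bits_per_sample % 8 == 0);

    uint64_t bytes_per_second = (uint64_t)sample_rate_hz * (bits_per_sample / 8) * channels;
    return (uint32_t)(bytes_per_second * duration_ms / 1000);
}
```

For example, pcm_bytes_for_ms(16000, 16, 1, 20) evaluates to 640.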

Upload voice data

Function prototype OPERATE_RET tuya_voice_upload_media_send(IN CONST BYTE_T *p_buf, IN CONST UINT_T buf_len)
Function description The central control device uploads voice data to the cloud. The raw PCM stream is written to the upload buffer.
Parameter
  • p_buf: The buffer address of PCM raw stream.
  • buf_len: The length of PCM raw stream.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Stop uploading voice data

Function prototype OPERATE_RET tuya_voice_upload_media_stop(VOID)
Function description The central control device stops uploading voice data and frees resources.
Parameter VOID
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Free the resources for multimedia reception

Function prototype OPERATE_RET tuya_speaker_free_cloud_media_arr (IN TY_CLOUD_MEDIA_S **pp_media_arr, IN CONST UINT_T arr_size)
Function description The central control device frees the resources used for multimedia reception.
Parameter
  • pp_media_arr: Same as the parameter of __voice_rcv_cloud_media_cb.
  • arr_size: Same as the parameter of __voice_rcv_cloud_media_cb.
Return value OPERATE_RET
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description When __voice_rcv_cloud_media_cb is invoked, the application downloads the requested audio, such as TTS or music. After playback finishes, call this function to free the resources.

Notify the cloud of the end of a conversation

Function prototype OPERATE_RET tuya_speaker_mqtt_report_complete_tts(IN CONST CHAR_T *p_callback_val)
Function description The central control device notifies the cloud that the current conversation ended.
Parameter p_callback_val: The type of conversation.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description The user can hold a continuous conversation with the voice assistant after saying the wake word once. In this case, call this function when TTS playback finishes to notify the cloud that the current conversation has ended, so that a new conversation can start.

Pair devices by voice

Start pairing

Function prototype OPERATE_RET tuya_speaker_mqtt_report_thing_config_request (VOID)
Function description The central control device reports the start of pairing to the cloud.
Parameter VOID
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Stop pairing

Function prototype OPERATE_RET tuya_speaker_mqtt_report_thing_config_stop (VOID)
Function description The central control device reports the stop of pairing to the cloud.
Parameter VOID
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Reject pairing

Function prototype OPERATE_RET tuya_speaker_mqtt_report_thing_config_reject (VOID)
Function description The central control device reports the rejection of pairing to the cloud.
Parameter VOID
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Report the number of paired devices

Function prototype OPERATE_RET tuya_speaker_mqtt_report_thing_config_access_count (IN CONST INT_T count)
Function description The central control device reports the number of paired devices to the cloud.
Parameter count: The number of paired devices.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description -

Report the result of nickname setting

Function prototype OPERATE_RET tuya_speaker_mqtt_report_nick_name (IN CONST TY_NICK_NAME_MODE_T mode, IN CONST CHAR_T *nickname, IN CONST CHAR_T *pinyin, IN CONST BOOL_T set_result)
Function description The central control device reports the result of adding or deleting a nickname.
Parameter
  • mode: The nickname setting command.
  • nickname: The nickname in Chinese.
  • pinyin: The nickname in Pinyin.
  • set_result: The result of the operation. TRUE: Success. FALSE: Failure.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description The nickname setting command:
  • TY_NICK_NAME_SET: Add a nickname.
  • TY_NICK_NAME_DEL: Delete a nickname.

Upload text to the cloud

Function prototype OPERATE_RET tuya_speaker_mqtt_get_tts(IN CONST CHAR_T *p_tts_content)
Function description Upload the text content to the cloud.
Parameter p_tts_content: The text content.
Return value OPERATE_RET:
  • 0: Success.
  • Other values: Failure. See the error code.
Detailed description After the text is uploaded, __voice_rcv_cloud_media_cb receives the corresponding TTS audio for playback.