AI voice technologies enable hands-free device control. This topic describes the voice services provided by the TuyaOS Central Control SDK and how to use their APIs.
Background
The voice services depend on the cloud to work. The central control device uploads voice data to the cloud. The cloud processes the data and sends back the result. The voice services include the following capabilities:
- Upload voice data to the cloud.
- Send text-to-speech (TTS) audio to the central control device.
- Send music or FM radio to the central control device.
- Send automatic speech recognition (ASR) text output to the central control device.
- Send TTS text to the central control device.
- Get the details of alarm clocks.
- Pair devices by voice.
- Set a nickname.
- Voice system data points (DPs), including volume setting, alarm clock/reminder, mic on/off, play/pause, previous/next, and Bluetooth playback on/off.
Enable voice services
Function prototype |
OPERATE_RET home_control_user_voice_sevice_start(VOID) |
Function description |
This is a demo function, used to enable voice services. |
Parameter |
- |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Example
OPERATE_RET home_control_user_voice_sevice_start(VOID)
{
    OPERATE_RET rt = OPRT_OK;
    TY_SPEAKER_CLOUD_CBS_S speaker_cloud_cbs;

    /* Register the callbacks that handle data sent by the cloud. */
    memset(&speaker_cloud_cbs, 0x0, sizeof(TY_SPEAKER_CLOUD_CBS_S));
    speaker_cloud_cbs.rev_media_cb = __voice_rcv_cloud_media_cb;
    speaker_cloud_cbs.sync_audio_cb = __voice_rcv_sync_audio_cb;
    speaker_cloud_cbs.thing_config_cb = __voice_rev_cloud_thing_config_cb;
    speaker_cloud_cbs.nick_name_cb = __voice_rev_cloud_set_nick_name_cb;
    speaker_cloud_cbs.rev_ext_custom_cb = __voice_rcv_cloud_ext_custom_cb;
    rt = tuya_speaker_cloud_register(&speaker_cloud_cbs);
    if (OPRT_OK != rt) {
        PR_ERR("tuya_speaker_cloud_register error, rt: %d", rt);
        return rt;
    }

    /* Initialize voice upload with the Speex configuration. */
    SPEAKER_UPLOAD_CONFIG_S upload_config = SPEAKER_UPLOAD_CONFIG_FOR_SPEEX();
    upload_config.report_stat_cb = __voice_upload_media_status_cb;
    rt = speaker_intf_upload_init(&upload_config);
    if (OPRT_OK != rt) {
        PR_ERR("speaker_intf_upload_init error:%d", rt);
        return rt;
    }

    /* Register the voice system DPs. */
    user_voice_system_dp_register();
    return OPRT_OK;
}
Register callbacks on initialization
Volume setting callback
Function prototype |
VOID (*TY_VOICE_VOL_CTL_CB)(UINT8_T vol) |
Function description |
Set the volume through the voice system DP 203. |
Parameter |
vol : The volume level, ranging from 0 to 100. |
Return value |
VOID |
Detailed description |
After the setting is done, call tuya_voice_capable_report_vol to synchronize the current volume level with the cloud. |
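The flow described above (apply the volume, then report it) can be sketched as follows. This is only an illustration under stated assumptions: the typedefs and the body of tuya_voice_capable_report_vol are stand-ins so that the snippet compiles on its own (in a real project they come from the SDK headers), and app_set_speaker_volume and user_voice_vol_ctl_cb are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own. */
typedef int OPERATE_RET;
typedef uint8_t UINT8_T;
#define OPRT_OK 0

static UINT8_T g_reported_vol = 0; /* last value passed to the stubbed report call */
static UINT8_T g_speaker_vol = 50; /* volume currently applied by the application */

/* Stub: the real function is provided by the SDK and reports to the cloud. */
OPERATE_RET tuya_voice_capable_report_vol(UINT8_T vol)
{
    g_reported_vol = vol;
    return OPRT_OK;
}

/* Hypothetical application helper that applies the volume to the audio output. */
static void app_set_speaker_volume(UINT8_T vol)
{
    g_speaker_vol = vol;
}

/* Has the shape of TY_VOICE_VOL_CTL_CB: VOID (*)(UINT8_T vol). */
void user_voice_vol_ctl_cb(UINT8_T vol)
{
    /* The documented range is 0 to 100, so clamp anything larger. */
    if (vol > 100) {
        vol = 100;
    }
    app_set_speaker_volume(vol);

    /* Synchronize the applied value with the cloud. */
    if (tuya_voice_capable_report_vol(vol) != OPRT_OK) {
        printf("report volume failed\n");
    }
}
```

Reporting the clamped value rather than the raw input keeps the cloud in step with what the device actually applied.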
DP control command callback
Function prototype |
VOID (*TY_VOICE_CTL_CB)(TY_VOICE_CTL_E ctl) |
Function description |
The cloud sends DP commands to control the voice system of the central control device. |
Parameter |
ctl : The control command. |
Return value |
VOID |
Detailed description |
The TY_VOICE_CTL_E enumeration includes all the voice system commands:
- TY_VOICE_MIC_OPEN : Turn on the mic.
- TY_VOICE_MIC_CLOSE : Turn off the mic.
- TY_VOICE_BT_PLAY_OPEN : Turn on Bluetooth playback.
- TY_VOICE_BT_PLAY_CLOSE : Turn off Bluetooth playback.
- TY_VOICE_PLAY_START : Start playing the media.
- TY_VOICE_PLAY_PAUSE : Pause the media.
- TY_VOICE_PLAY_PREV : Go to the previous item.
- TY_VOICE_PLAY_NEXT : Go to the next item.
After the setting is done, call tuya_voice_capable_report_ctl to synchronize the current status with the cloud. |
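A control callback typically dispatches on the command and then reports the new state. The sketch below assumes that pattern. The enum is a stand-in whose values and order are not taken from the SDK, the type name follows the callback prototype, the signature of tuya_voice_capable_report_ctl is an assumption, and the state variables and user_voice_ctl_cb are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-ins for SDK definitions; real values come from the SDK headers. */
typedef int OPERATE_RET;
#define OPRT_OK 0

typedef enum {
    TY_VOICE_MIC_OPEN,
    TY_VOICE_MIC_CLOSE,
    TY_VOICE_BT_PLAY_OPEN,
    TY_VOICE_BT_PLAY_CLOSE,
    TY_VOICE_PLAY_START,
    TY_VOICE_PLAY_PAUSE,
    TY_VOICE_PLAY_PREV,
    TY_VOICE_PLAY_NEXT
} TY_VOICE_CTL_E;

static int g_report_count = 0;

/* Stub with an assumed signature; the real function reports to the cloud. */
OPERATE_RET tuya_voice_capable_report_ctl(TY_VOICE_CTL_E ctl)
{
    (void)ctl;
    g_report_count++;
    return OPRT_OK;
}

/* Hypothetical application state. */
static bool g_mic_on = true;
static bool g_bt_on = false;
static bool g_playing = false;

/* Has the shape of TY_VOICE_CTL_CB: VOID (*)(TY_VOICE_CTL_E ctl). */
void user_voice_ctl_cb(TY_VOICE_CTL_E ctl)
{
    switch (ctl) {
    case TY_VOICE_MIC_OPEN:      g_mic_on = true;   break;
    case TY_VOICE_MIC_CLOSE:     g_mic_on = false;  break;
    case TY_VOICE_BT_PLAY_OPEN:  g_bt_on = true;    break;
    case TY_VOICE_BT_PLAY_CLOSE: g_bt_on = false;   break;
    case TY_VOICE_PLAY_START:    g_playing = true;  break;
    case TY_VOICE_PLAY_PAUSE:    g_playing = false; break;
    case TY_VOICE_PLAY_PREV:
    case TY_VOICE_PLAY_NEXT:
        /* Track switching is application specific and omitted here. */
        break;
    default:
        printf("unknown voice control command: %d\n", (int)ctl);
        return; /* nothing was changed, so nothing is reported */
    }

    /* Synchronize the new status with the cloud. */
    tuya_voice_capable_report_ctl(ctl);
}
```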
Create local alarm clocks
Function prototype |
VOID (*TY_VOICE_ALARM_CLOCK_CB)(CHAR_T *alarm) |
Function description |
The cloud sends a local alarm clock to the central control device through the voice system DP 207. |
Parameter |
alarm : The details of the alarm. |
Return value |
VOID |
Detailed description |
- |
Voice upload status
Function prototype |
VOID(*SPEAKER_UPLOAD_REPORT_STAT_CB)(SPEAKER_UPLOAD_STAT_E stat) |
Function description |
This callback is invoked when an error occurs on voice upload. |
Parameter |
stat : The status of voice upload. |
Return value |
VOID |
Detailed description |
stat : SPEAKER_UP_STAT_NET_ERR indicates a network error. |
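One way to use this callback is to set a flag that the upload loop checks before sending the next chunk. The sketch below assumes that approach. The enum is a stand-in: only SPEAKER_UP_STAT_NET_ERR is named in this topic, its numeric value here is a placeholder, and SPEAKER_UP_STAT_OTHER, user_voice_upload_status_cb, and user_voice_upload_should_stop are hypothetical names.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the SDK enum; the values are placeholders. */
typedef enum {
    SPEAKER_UP_STAT_NET_ERR = 0,
    SPEAKER_UP_STAT_OTHER /* hypothetical placeholder for any other status */
} SPEAKER_UPLOAD_STAT_E;

/* Set by the callback, read by the upload loop. */
static volatile bool g_upload_aborted = false;

/* Has the shape of SPEAKER_UPLOAD_REPORT_STAT_CB. */
void user_voice_upload_status_cb(SPEAKER_UPLOAD_STAT_E stat)
{
    if (stat == SPEAKER_UP_STAT_NET_ERR) {
        g_upload_aborted = true;
        printf("voice upload: network error, aborting the current session\n");
    } else {
        printf("voice upload: status %d\n", (int)stat);
    }
}

/* The upload loop can call this before each send and stop early if it returns true. */
bool user_voice_upload_should_stop(void)
{
    return g_upload_aborted;
}
```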
Receive multimedia data from the cloud
Function prototype |
VOID (*TY_SPEAKER_REV_CLOUD_MEDIA_CB)(IN TY_CLOUD_MEDIA_S **pp_media_arr, IN CONST UINT_T arr_size) |
Function description |
This callback is invoked when the central control device receives multimedia data from the cloud. The URL and audio parameters are used to request the audio data stream. |
Parameter |
- pp_media_arr : The array of multimedia data items.
- arr_size : The number of items in the array. |
Return value |
VOID |
Detailed description |
Parameters for multimedia data:
- comm : The audio parameter.
- p_url : The URL of the HTTP request.
- p_req_body : The body of the HTTP request. |
Sub-device pairing request
Function prototype |
VOID (*TY_SPEAKER_THING_CONFIG_CB)(IN TY_THING_CONFIG_MODE_T mode, IN CONST CHAR_T *token, IN CONST UINT_T timeout) |
Function description |
The central control device receives a request from the cloud to start or stop pairing a sub-device. |
Parameter |
- mode : The pairing command.
- token : The token used to pair a Wi-Fi device.
- timeout : The pairing timeout period. |
Return value |
VOID |
Detailed description |
The pairing command:
- TY_THING_CONFIG_START : Start pairing.
- TY_THING_CONFIG_STOP : Stop pairing. |
Nickname setting request
Function prototype |
VOID (*TY_SPEAKER_NICK_NAME_CB)(IN CONST TY_NICK_NAME_MODE_T mode, IN CONST CHAR_T *nickname, IN CONST CHAR_T *pinyin) |
Function description |
The central control device receives a nickname setting request from the cloud. |
Parameter |
- mode : The nickname setting command.
- nickname : The nickname in Chinese.
- pinyin : The nickname in Pinyin. |
Return value |
VOID |
Detailed description |
The nickname setting command:
- TY_NICK_NAME_SET : Add a nickname.
- TY_NICK_NAME_DEL : Delete a nickname. |
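A nickname callback usually stores or clears the nickname and then reports the outcome with tuya_speaker_mqtt_report_nick_name. The sketch below assumes that flow. The typedefs, the enum values, and the stub body are stand-ins so that the snippet compiles on its own, and g_nickname with its 64-byte size and user_voice_nick_name_cb are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own. */
typedef int OPERATE_RET;
typedef int BOOL_T;
typedef char CHAR_T;
#define OPRT_OK 0
#define TRUE 1
#define FALSE 0

typedef enum { TY_NICK_NAME_SET, TY_NICK_NAME_DEL } TY_NICK_NAME_MODE_T;

static BOOL_T g_last_result = FALSE; /* last result passed to the stubbed report call */

/* Stub: the real function is provided by the SDK and reports to the cloud. */
OPERATE_RET tuya_speaker_mqtt_report_nick_name(const TY_NICK_NAME_MODE_T mode,
                                               const CHAR_T *nickname,
                                               const CHAR_T *pinyin,
                                               const BOOL_T set_result)
{
    (void)mode; (void)nickname; (void)pinyin;
    g_last_result = set_result;
    return OPRT_OK;
}

/* Hypothetical local storage for the nickname. */
static CHAR_T g_nickname[64] = "";

/* Has the shape of TY_SPEAKER_NICK_NAME_CB. */
void user_voice_nick_name_cb(const TY_NICK_NAME_MODE_T mode,
                             const CHAR_T *nickname,
                             const CHAR_T *pinyin)
{
    BOOL_T ok = FALSE;

    if (mode == TY_NICK_NAME_SET) {
        /* Accept only a non-empty nickname that fits in the local buffer. */
        if (nickname != NULL && nickname[0] != '\0' &&
            strlen(nickname) < sizeof(g_nickname)) {
            strcpy(g_nickname, nickname);
            ok = TRUE;
        }
    } else if (mode == TY_NICK_NAME_DEL) {
        g_nickname[0] = '\0';
        ok = TRUE;
    }

    /* Tell the cloud whether the request succeeded. */
    tuya_speaker_mqtt_report_nick_name(mode, nickname, pinyin, ok);
}
```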
Pass through custom data
Function prototype |
VOID (*TY_SPEAKER_REV_CUSTOM_CB)(IN CONST CHAR_T *type, IN CONST ty_cJSON *json) |
Function description |
A customized 501 pass-through callback for exchanging custom data. You define the protocol and its description yourself. |
Parameter |
- type : The interface type.
- json : The JSON data. |
Return value |
VOID |
Detailed description |
- |
Report the current volume level
Function prototype |
OPERATE_RET tuya_voice_capable_report_vol (UINT8_T vol) |
Function description |
The central control device reports the current volume level to the cloud. |
Parameter |
vol : The volume level, ranging from 0 to 100. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Report the current voice system status
Function prototype |
OPERATE_RET tuya_voice_capable_report_ctl (TY_VOICE_CTL_E ctl) |
Function description |
The central control device reports the current voice system status to the cloud. |
Parameter |
ctl : The command. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
The TY_VOICE_CTL_E enumeration includes all the voice system commands:
- TY_VOICE_MIC_OPEN : Turn on the mic.
- TY_VOICE_MIC_CLOSE : Turn off the mic.
- TY_VOICE_BT_PLAY_OPEN : Turn on Bluetooth playback.
- TY_VOICE_BT_PLAY_CLOSE : Turn off Bluetooth playback.
- TY_VOICE_PLAY_START : Start playing the media.
- TY_VOICE_PLAY_PAUSE : Pause the media.
- TY_VOICE_PLAY_PREV : Go to the previous item.
- TY_VOICE_PLAY_NEXT : Go to the next item. |
Upload voice data
The following example opens a WAV audio file, reads the PCM audio data, and uploads it to the cloud. You can find the reference audio file in the audio_resource directory.
OPERATE_RET home_control_user_voice_upload_cloud(CONST CHAR_T *pcm_file)
{
    OPERATE_RET ret = OPRT_OK;
    CONST INT_T send_len = 640;     /* maximum number of bytes per send call */
    CONST INT_T wav_head_len = 44;  /* size of the WAV header to skip */
    INT_T total_len = 0;
    INT_T sended_len = 0;
    INT_T chunk_len = 0;
    BOOL_T started = FALSE;
    CHAR_T *buf = NULL;
    FILE *p_file = NULL;
    struct stat statbuf;
    TY_AUDIO_INFO_S audio_info = {
        .channels = 1,
        .rate = 16000,
        .bits_per_sample = 16,
    };

    PR_DEBUG("start upload test pcm");
    if (pcm_file == NULL) {
        PR_ERR("Param is invalid");
        return OPRT_INVALID_PARM;
    }

    /* The file must contain audio data after the WAV header. */
    if (stat(pcm_file, &statbuf) != 0 || statbuf.st_size <= wav_head_len) {
        PR_ERR("Invalid file %s", pcm_file);
        return OPRT_INVALID_PARM;
    }
    total_len = (INT_T)statbuf.st_size - wav_head_len;

    /* Open in binary mode because the file holds raw audio data. */
    p_file = fopen(pcm_file, "rb");
    if (p_file == NULL) {
        PR_ERR("Unable to open file %s", pcm_file);
        return OPRT_INVALID_PARM;
    }

    buf = (CHAR_T *)malloc(total_len);
    if (buf == NULL) {
        PR_ERR("Malloc error");
        ret = OPRT_MALLOC_FAILED;
        goto exit;
    }
    memset(buf, 0, total_len);

    /* Skip the WAV header and read the PCM data. */
    if (fseek(p_file, wav_head_len, SEEK_SET) != 0 ||
        fread(buf, 1, total_len, p_file) != (size_t)total_len) {
        PR_ERR("Read error");
        ret = OPRT_COM_ERROR;
        goto exit;
    }

    ret = tuya_voice_upload_media_start(TY_SPEEX, &audio_info);
    if (ret != OPRT_OK) {
        PR_ERR("tuya_voice_upload_media_start failed:%d", ret);
        goto exit;
    }
    started = TRUE;

    /* Send the data in chunks of at most send_len bytes. */
    while (sended_len < total_len) {
        chunk_len = total_len - sended_len;
        if (chunk_len > send_len) {
            chunk_len = send_len;
        }
        ret = tuya_voice_upload_media_send((BYTE_T *)buf + sended_len, chunk_len);
        if (ret != OPRT_OK) {
            PR_ERR("tuya_voice_upload_media_send failed:%d", ret);
            goto exit;
        }
        sended_len += chunk_len;
    }

exit:
    /* Stop the session only if it was started, then release local resources. */
    if (started) {
        tuya_voice_upload_media_stop();
    }
    if (p_file) {
        fclose(p_file);
    }
    if (buf) {
        free(buf);
    }
    return ret;
}
Voice data can be uploaded from only one thread at a time. An upload involves three steps: start, send, and stop.
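The send step boils down to splitting a buffer into fixed-size pieces and passing each piece to a send function. A minimal, SDK-independent sketch of that loop is shown below. The names send_in_chunks, chunk_send_fn, and counting_send are hypothetical. In a real project, the callback would be a thin adapter around tuya_voice_upload_media_send.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sender type: returns 0 on success, non-zero on failure. */
typedef int (*chunk_send_fn)(const uint8_t *buf, size_t len);

/*
 * Sends `total` bytes in pieces of at most `chunk` bytes.
 * Returns 0 on success, -1 on invalid arguments, or the first
 * non-zero value returned by `send`.
 */
int send_in_chunks(const uint8_t *buf, size_t total, size_t chunk, chunk_send_fn send)
{
    if (buf == NULL || send == NULL || chunk == 0) {
        return -1;
    }

    size_t sent = 0;
    while (sent < total) {
        size_t n = total - sent; /* bytes still to send */
        if (n > chunk) {
            n = chunk;           /* never read past the end of the buffer */
        }
        int rt = send(buf + sent, n);
        if (rt != 0) {
            return rt;
        }
        sent += n;
    }
    return 0;
}

/* Demo sender that only counts calls and bytes; it stands in for the SDK call. */
static size_t g_calls = 0;
static size_t g_bytes = 0;

static int counting_send(const uint8_t *buf, size_t len)
{
    (void)buf;
    g_calls++;
    g_bytes += len;
    return 0;
}
```

With 16 kHz, 16-bit, mono PCM, one second of audio is 32,000 bytes, so a 640-byte chunk carries 20 ms of audio.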
Initialize voice upload
Function prototype |
OPERATE_RET tuya_voice_upload_media_start (IN CONST TY_MEDIA_ENCODE_T type, IN CONST TY_AUDIO_INFO_S *info) |
Function description |
Enable uploading voice data to the cloud. |
Parameter |
- type : The encoding type.
- info : The audio parameters. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
The audio parameters:
- channels : The audio channel. Set it to 1 for mono.
- sample_freq : The sample rate. Set it to 16000.
- bit_depth : The sample depth. Set it to 16 bits. |
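Because the upload expects 16 kHz, 16-bit, mono audio, it can help to check a WAV file before skipping its header. The sketch below assumes the canonical 44-byte PCM header layout (RIFF, WAVE, a 16-byte fmt chunk, then the data chunk). Files with extra chunks do not match this layout and are rejected. The function name wav_is_16k_mono_16bit is hypothetical and is not part of the SDK.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian integers from a byte buffer. */
static uint16_t rd_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns true if the canonical 44-byte header describes 16 kHz, 16-bit, mono PCM. */
bool wav_is_16k_mono_16bit(const uint8_t hdr[44])
{
    if (memcmp(hdr, "RIFF", 4) != 0 ||
        memcmp(hdr + 8, "WAVE", 4) != 0 ||
        memcmp(hdr + 12, "fmt ", 4) != 0) {
        return false;
    }
    return rd_le16(hdr + 20) == 1      /* audio format: 1 means PCM */
        && rd_le16(hdr + 22) == 1      /* number of channels */
        && rd_le32(hdr + 24) == 16000  /* sample rate in Hz */
        && rd_le16(hdr + 34) == 16;    /* bits per sample */
}
```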
Upload voice data
Function prototype |
OPERATE_RET tuya_voice_upload_media_send(IN CONST BYTE_T *p_buf, IN CONST UINT_T buf_len) |
Function description |
The central control device uploads voice data to the cloud by writing the raw PCM stream to the upload buffer. |
Parameter |
- p_buf : The address of the buffer that holds the raw PCM stream.
- buf_len : The length of the raw PCM stream, in bytes. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Stop uploading voice data
Function prototype |
OPERATE_RET tuya_voice_upload_media_stop(VOID) |
Function description |
The central control device stops uploading voice data and frees resources. |
Parameter |
VOID |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Free the resources for multimedia reception
Function prototype |
OPERATE_RET tuya_speaker_free_cloud_media_arr (IN TY_CLOUD_MEDIA_S **pp_media_arr, IN CONST UINT_T arr_size) |
Function description |
The central control device frees the resources used for multimedia reception. |
Parameter |
- pp_media_arr : Same as the parameter of __voice_rcv_cloud_media_cb.
- arr_size : Same as the parameter of __voice_rcv_cloud_media_cb. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
When __voice_rcv_cloud_media_cb is invoked, the application downloads the requested audio file, such as TTS or music. After the playback is finished, call this function to free the resources. |
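The receive, play, and free sequence can be sketched as follows. This is an illustration under several assumptions: the struct is a stand-in that keeps only the p_url and p_req_body fields named in this topic (comm is omitted), the stub body only counts items, and app_download_and_play and user_voice_rcv_cloud_media_cb are hypothetical. The helper here returns immediately, whereas real playback is usually asynchronous, in which case the free call belongs after playback completes.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for SDK definitions so the sketch compiles on its own. */
typedef int OPERATE_RET;
typedef unsigned int UINT_T;
typedef char CHAR_T;
#define OPRT_OK 0

typedef struct {
    CHAR_T *p_url;      /* URL of the HTTP request */
    CHAR_T *p_req_body; /* body of the HTTP request, may be NULL */
} TY_CLOUD_MEDIA_S;

static UINT_T g_freed_count = 0;
static UINT_T g_played_count = 0;

/* Stub: the real function is provided by the SDK and releases the array. */
OPERATE_RET tuya_speaker_free_cloud_media_arr(TY_CLOUD_MEDIA_S **pp_media_arr,
                                              const UINT_T arr_size)
{
    (void)pp_media_arr;
    g_freed_count += arr_size;
    return OPRT_OK;
}

/* Hypothetical helper: request the audio stream and play it to the end. */
static int app_download_and_play(const CHAR_T *url, const CHAR_T *req_body)
{
    printf("play %s (body: %s)\n", url, req_body ? req_body : "none");
    g_played_count++;
    return 0;
}

/* Has the shape of TY_SPEAKER_REV_CLOUD_MEDIA_CB. */
void user_voice_rcv_cloud_media_cb(TY_CLOUD_MEDIA_S **pp_media_arr, const UINT_T arr_size)
{
    if (pp_media_arr == NULL) {
        return;
    }

    for (UINT_T i = 0; i < arr_size; i++) {
        /* Skip entries that carry no URL. */
        if (pp_media_arr[i] == NULL || pp_media_arr[i]->p_url == NULL) {
            continue;
        }
        app_download_and_play(pp_media_arr[i]->p_url, pp_media_arr[i]->p_req_body);
    }

    /* Playback is finished, so hand the array back to the SDK. */
    tuya_speaker_free_cloud_media_arr(pp_media_arr, arr_size);
}
```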
Notify the cloud of the end of a conversation
Function prototype |
OPERATE_RET tuya_speaker_mqtt_report_complete_tts(IN CONST CHAR_T *p_callback_val) |
Function description |
The central control device notifies the cloud that the current conversation ended. |
Parameter |
p_callback_val : The type of conversation. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
The user can hold a continuous conversation with the voice assistant after saying the wake word only once. In this case, when TTS playback is finished, call this interface to notify the cloud that the current conversation is closed so that a new conversation can be initiated. |
Pair devices by voice
Start pairing
Function prototype |
OPERATE_RET tuya_speaker_mqtt_report_thing_config_request (VOID) |
Function description |
The central control device reports the start of pairing to the cloud. |
Parameter |
VOID |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Stop pairing
Function prototype |
OPERATE_RET tuya_speaker_mqtt_report_thing_config_stop (VOID) |
Function description |
The central control device reports the stop of pairing to the cloud. |
Parameter |
VOID |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Reject pairing
Function prototype |
OPERATE_RET tuya_speaker_mqtt_report_thing_config_reject (VOID) |
Function description |
The central control device reports the rejection of pairing to the cloud. |
Parameter |
VOID |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Report the number of paired devices
Function prototype |
OPERATE_RET tuya_speaker_mqtt_report_thing_config_access_count (IN CONST INT_T count) |
Function description |
The central control device reports the number of paired devices to the cloud. |
Parameter |
count : The number of paired devices. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
- |
Report the result of nickname setting
Function prototype |
OPERATE_RET tuya_speaker_mqtt_report_nick_name (IN CONST TY_NICK_NAME_MODE_T mode, IN CONST CHAR_T *nickname, IN CONST CHAR_T *pinyin, IN CONST BOOL_T set_result) |
Function description |
The central control device reports the result of adding or deleting a nickname. |
Parameter |
- mode : The nickname setting command.
- nickname : The nickname in Chinese.
- pinyin : The nickname in Pinyin.
- set_result : The result. TRUE : Success. FALSE : Failure. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
The nickname setting command:
- TY_NICK_NAME_SET : Add a nickname.
- TY_NICK_NAME_DEL : Delete a nickname. |
Upload text to the cloud
Function prototype |
OPERATE_RET tuya_speaker_mqtt_get_tts(IN CONST CHAR_T *p_tts_content) |
Function description |
Upload the text content to the cloud. |
Parameter |
p_tts_content : The text content. |
Return value |
OPERATE_RET:
- 0 : Success.
- Other values: Failure. See the error code. |
Detailed description |
After the text is uploaded, __voice_rcv_cloud_media_cb receives the resulting TTS audio for playback. |