Last Updated on: 2024-06-25 04:04:02
TuyaOS provides a Core Dump tool to capture the stack trace when a segmentation fault (segfault) occurs, helping you find the function call that caused the problem.
This tool is only applicable to the chip platforms that Tuya has adapted. Its usage on non-adapted platforms is not verified. It is recommended to simulate segfaults to test if this tool works for your platform. If it does not work, use your platform-specific debugger instead.
This topic describes how to implement the problem diagnosis feature with TuyaOS Gateway Development Framework.
A segfault in a program might cause a device to restart. If a segfault occurs while business logic is running, the corresponding feature might stop working. Some of these faults are not caught during testing and only surface after the product goes live.
For devices running on Linux, you can enable core dumps to create a .core file when various errors occur. You can then use GDB to read this file, trace the function calls, and identify the line of code that caused the problem. However, accumulated .core files consume a lot of storage space, so smart devices generally enable core dumps only in the debugging stages.
Given this, TuyaOS provides a lightweight Core Dump tool to help you track down program errors on deployed devices.
The SDK handles exception signals. When an exception in the program occurs, the SDK saves the stack trace to a file that is only a few KB in size. You can use the Core Dump tool to examine the stack trace and determine the function that caused the problem by reviewing the function at the top of the stack and the function call stack.
Generally, the executable program deployed on the device does not come with a symbol table. Therefore, be sure to compile an additional build of the program with the -g option so that it carries the debugging information that the Core Dump tool reads for stack trace analysis.
After gateway initialization, call tuya_gw_app_debug_start to enable the problem diagnosis feature, which captures the stack trace when exceptions occur in the program.
When you implement local logs in Log Management, package the stack trace file into the log file so that you can get the stack information at the time of a crash.
Example
int main(int argc, char **argv)
{
    OPERATE_RET rt = OPRT_OK;

    // TuyaOS
    TUYA_CALL_ERR_RETURN(tuya_iot_init("./"));

    // Set authorization information.
    TUYA_CALL_ERR_RETURN(tuya_iot_set_gw_prod_info(&prod_info));

    // Gateway pre-initialization.
    TUYA_CALL_ERR_RETURN(tuya_iot_sdk_pre_init(TRUE));

    // Gateway initialization.
    TUYA_CALL_ERR_RETURN(tuya_iot_wr_wf_sdk_init(IOT_GW_NET_WIRED_WIFI, GWCM_OLD, WF_START_AP_ONLY, M_PID, M_SW_VERSION, NULL, 0));

    // Gateway startup.
    TUYA_CALL_ERR_RETURN(tuya_iot_sdk_start());

    // Enable problem diagnosis. The parameter is the directory where the stack trace file resides.
    tuya_gw_app_debug_start("./log_dir/");

    while (1) {
        tuya_hal_system_sleep(10*1000);
    }

    return OPRT_OK;
}
When a segfault occurs, copy the stack trace file and the program compiled with debugging information to the directory of the Core Dump tool, and then run Core Dump for analysis. Note that the file name of the debug build must be the same as that of the executable program running on the device.
Run the following command to use Core Dump:
python3 coredump.py -d <dump file>
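Before running the command, it can help to confirm that the inputs described above are in place. The following is a minimal sketch, not part of the Tuya tool: the function name check_inputs and the program name user_iot are hypothetical. It assumes the analyzer resolves the debug build by file name in the tool directory and relies on the addr2line utility from GNU Binutils.

```python
import os
import shutil


def check_inputs(dump_file, program_name, tool_dir="."):
    """Return a list of problems; an empty list means the analysis can run."""
    problems = []
    if not os.path.isfile(dump_file):
        problems.append("dump file not found: {}".format(dump_file))
    # The debug build must sit in the tool directory under the same
    # file name as the executable that crashed on the device.
    debug_build = os.path.join(tool_dir, program_name)
    if not os.path.isfile(debug_build):
        problems.append("debug build not found: {}".format(debug_build))
    # The analyzer shells out to addr2line, so it must be on PATH.
    if shutil.which("addr2line") is None:
        problems.append("addr2line is not on PATH")
    return problems


if __name__ == "__main__":
    # "user_iot" is a hypothetical program name used for illustration.
    for problem in check_inputs("959_user_iot_1645100484", "user_iot"):
        print(problem)
```

For a cross-compiled program, the addr2line from the matching toolchain is usually the safer choice, because the host version might not understand the target's binary format.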
Source code
import argparse
import os

parser = argparse.ArgumentParser(description='SDK Coredump Analyzer')
parser.add_argument(
    '-d', '--dump_file', required=True, type=str, help='crash dump file')
args = parser.parse_args()

# System libraries whose frames are skipped during analysis.
sys_so = ["libc.so", "libc-", "libpthread-", "libpthread.so", "ld-", "ld.so", "stdc++", "uClibc", "libgcc"]

'''
crash dump file format:
stack dump:
00000c00 00000001 7fd10000 00000001
stack dump End
dump text section
00400000-00897000 r-xp 00000000 00:08 237597 /var/tmp/tyZ3Gw
'''


def parse_dump_file(filename):
    is_stack = False
    is_text = False
    stack = []
    text = {}

    if not os.path.isfile(filename):
        return stack, text

    with open(filename, 'r') as f:
        for line in f:
            if line.find("stack dump:") != -1:
                is_stack = True
                continue
            if line.find("stack dump End") != -1:
                is_stack = False
                continue
            if line.find("dump text section") != -1:
                is_text = True

            if is_stack:
                stack.extend(line.split())

            if is_text and line.find("r-xp") != -1:
                text_content = line.split()
                if len(text_content) != 6:
                    print("parse text section error")
                    continue

                addr = text_content[0]
                path = text_content[-1]
                module = os.path.basename(path)

                # Filter out system shared objects.
                is_omit = False
                for so_name in sys_so:
                    if module.find(so_name) != -1:
                        is_omit = True
                        break
                if is_omit:
                    continue

                addr_range = addr.split('-')
                if len(addr_range) != 2:
                    continue
                text[module] = addr_range

    return stack, text


def dump_addr2line(stack, text):
    for addr in stack:
        addr = int(addr, 16)
        for name in text:
            addr_start = int(text[name][0], 16)
            addr_end = int(text[name][1], 16)
            if addr >= addr_start and addr <= addr_end:
                # A shared object needs the offset from its load address.
                if name.find(".so") != -1:
                    addr = addr - addr_start
                addr = str(hex(addr))
                if not os.path.exists(name):
                    print("{} is not found".format(name))
                    break
                os.system('addr2line {} -e {} -f'.format(addr, name))
                break


def main():
    dump_file = args.dump_file
    print("crash dump file: {}".format(dump_file))
    stack, text = parse_dump_file(dump_file)
    dump_addr2line(stack, text)


if __name__ == '__main__':
    main()
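The address lookup that the script performs for every value in the stack dump can be illustrated in isolation. The sketch below is a simplified illustration and not the tool itself: the helper name locate and the mapping for libfoo.so are hypothetical, and it treats the end address of a mapping as exclusive, as in /proc/<pid>/maps, whereas the script above also accepts the end address.

```python
def locate(addr, text):
    """Map a raw stack value to (module, address for addr2line), or None."""
    for name, (start_hex, end_hex) in text.items():
        start, end = int(start_hex, 16), int(end_hex, 16)
        if start <= addr < end:
            # A shared object is loaded at a varying base address, so the
            # lookup uses the offset from the start of its mapping.
            if ".so" in name:
                return name, hex(addr - start)
            return name, hex(addr)
    return None


# Hypothetical mappings in the same shape that parse_dump_file returns.
text = {
    "tyZ3Gw": ["00400000", "00897000"],
    "libfoo.so": ["77000000", "77010000"],
}

print(locate(0x00401234, text))  # ('tyZ3Gw', '0x401234')
print(locate(0x77000abc, text))  # ('libfoo.so', '0xabc')
print(locate(0x7fd10000, text))  # None
```

Values that fall outside every executable mapping, such as plain data on the stack, are skipped. This is why the analysis prints fewer frames than there are words in the stack dump.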
Example of parsing
kyson@LAPTOP-ORFJBPHU:~/workspace/tuya/tools/crash_dump$ python3 coredump.py -d 959_user_iot_1645100484
crash dump file: 959_user_iot_1645100484
__start
??:?
sig_proc
/root/workspace_temp/EmbedSDKs/ty_gw_zigbee_ext_sdk/ty_gw_zigbee_ext_sdk/sdk/svc_linux_crash_dump/src/crash_dump.c:287
??
??:0
emberAfSendDefaultResponseWithCallback
/root/workspace_temp/EmbedSDKs/ty_gw_zigbee_ext_sdk/ty_gw_zigbee_ext_sdk/sdk/zigbee_host/slabs/v2.2/protocol/zigbee/app/framework/util/util.c:764
__start
??:?
...
Stack trace analysis prints the call stack as it was at the time of the segfault. In the output, focus on the function at the top of the stack and review the other frames as needed. In the example above, the segfault occurred in the function emberAfSendDefaultResponseWithCallback (util.c, line 764). With that context, you can then identify the line of code that caused the segfault.
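Reading a long trace by eye can be tedious, so the output can also be filtered programmatically. The following sketch is an assumption-laden illustration and not part of the Tuya tool. It relies on addr2line -f printing two lines per address (function name, then file:line), and it treats sig_proc as the SDK's signal handler frame based on the sample output. The helper names pair_frames and first_app_frame are hypothetical.

```python
def pair_frames(output):
    """Group addr2line -f output into (function, location) pairs."""
    lines = [ln.strip() for ln in output.splitlines()]
    lines = [ln for ln in lines if ln and ln != "..."]
    return list(zip(lines[0::2], lines[1::2]))


def first_app_frame(frames, skip=("sig_proc",)):
    """Return the first resolved frame that is not a signal handler."""
    for func, location in frames:
        if location.startswith("??") or func in skip:
            continue
        return func, location
    return None


sample = """\
__start
??:?
sig_proc
/root/workspace_temp/EmbedSDKs/ty_gw_zigbee_ext_sdk/ty_gw_zigbee_ext_sdk/sdk/svc_linux_crash_dump/src/crash_dump.c:287
??
??:0
emberAfSendDefaultResponseWithCallback
/root/workspace_temp/EmbedSDKs/ty_gw_zigbee_ext_sdk/ty_gw_zigbee_ext_sdk/sdk/zigbee_host/slabs/v2.2/protocol/zigbee/app/framework/util/util.c:764
__start
??:?
...
"""

func, location = first_app_frame(pair_frames(sample))
print(func)  # emberAfSendDefaultResponseWithCallback
```

Frames reported as ?? could not be resolved to a source location, often because the value on the stack was not a return address, so they are ignored here.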