I2C crash with release/v3.0 - what's an effective way to debug this?

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Wed Jan 31, 2018 8:30 pm

@ESP_Angus: when a backtrace occurs, there's often one or more yellow lines that looks like this:

Code: Select all

0x400847e7: i2c_master_cmd_begin_static at /Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c:1023
It's just occurred to me that these lines may actually be being displayed by the 'monitor' program, and not in the serial log emitted by the ESP32 itself. For example, I don't see this output when I use the 'simple_monitor' instead. Is this what you mean by 'addr2line'? When I first read that I wondered how addr2line was being run on the ESP32, but if 'monitor' is doing that, it makes far more sense.

So if that's true, then that helps to explain why I see these lines at boot time, sometimes, when no crash has actually occurred. Perhaps 'monitor' is holding them over from a previous backtrace and something in the boot output triggers their display. So is this a bug in 'monitor'?

Here's an example where those lines appear during boot (after a crash has occurred earlier and 'monitor' hasn't been restarted):

Code: Select all

ets Jun  8 2016 00:22:57

rst:0x1 (POWERON_RESET),boot:0x16 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:5576
load:0x40078000,len:0
ho 12 tail 0 room 4
load:0x40078000,len:13716
entry 0x40078fb4
I (29) boot: ESP-IDF v3.0-rc1-151-gb6b8af49-dirty 2nd stage bootloader
I (30) boot: compile time 12:35:35
I (31) boot: Enabling RNG early entropy source...
I (36) boot: SPI Speed      : 40MHz
I (40) boot: SPI Mode       : DIO
I (44) boot: SPI Flash Size : 4MB
I (48) boot: Partition Table:
I (52) boot: ## Label            Usage          Type ST Offset   Length
I (59) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (66) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (74) boot:  2 factory          factory app      00 00 00010000 00100000
I (81) boot: End of partition table
I (86) esp_image: segment 0: paddr=0x00010020 vaddr=0x3f400020 size=0x17e40 ( 97856) map
I (129) esp_image: segment 1: paddr=0x00027e68 vaddr=0x3ffb0000 size=0x033a0 ( 13216) load
I (134) esp_image: segment 2: paddr=0x0002b210 vaddr=0x40080000 size=0x00400 (  1024) load
0x40080000: _iram_start at /Users/david/esp32/esp-idf-v3.0/components/freertos/./xtensa_vectors.S:1685

I (136) esp_image: segment 3: paddr=0x0002b618 vaddr=0x40080400 size=0x049f8 ( 18936) load
I (152) esp_image: segment 4: paddr=0x00030018 vaddr=0x400d0018 size=0x64b58 (412504) map
0x400d0018: _stext at ??:?

I (296) esp_image: segment 5: paddr=0x00094b78 vaddr=0x40084df8 size=0x0def0 ( 57072) load
0x40084df8: i2c_isr_handler_default at /Users/david/esp32/esp-idf-v3.0/components/driver/./i2c.c:1023

I (320) esp_image: segment 6: paddr=0x000a2a70 vaddr=0x400c0000 size=0x00000 (     0) load
I (331) boot: Loaded app from partition at offset 0x10000
I (331) boot: Disabling RNG early entropy source...
I (332) cpu_start: Pro cpu up.
I (335) cpu_start: Starting app cpu, entry point is 0x4008102c
0x4008102c: call_start_cpu1 at /Users/david/esp32/esp-idf-v3.0/components/esp32/./cpu_start.c:215

I (0) cpu_start: App cpu up.
I (346) heap_init: Initializing. RAM available for dynamic allocation:
D (353) heap_init: New heap initialised at 0x3ffae6e0
I (358) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
D (364) heap_init: New heap initialised at 0x3ffb9140
I (369) heap_init: At 3FFB9140 len 00026EC0 (155 KiB): DRAM
I (375) heap_init: At 3FFE0440 len 00003BC0 (14 KiB): D/IRAM
I (382) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
D (389) heap_init: New heap initialised at 0x40092ce8
I (393) heap_init: At 40092CE8 len 0000D318 (52 KiB): IRAM
I (399) cpu_start: Pro cpu start user code
D (411) clk: RTC_SLOW_CLK calibration value: 3302912
D (85) intr_alloc: Connected src 46 to int 2 (cpu 0)
D (86) intr_alloc: Connected src 57 to int 3 (cpu 0)
D (86) stack_chk: Intialize random stack guard
D (91) intr_alloc: Connected src 24 to int 9 (cpu 0)
I (96) cpu_start: Starting scheduler on PRO CPU.
D (0) intr_alloc: Connected src 25 to int 2 (cpu 1)
I (0) cpu_start: Starting scheduler on APP CPU.
Perhaps something in this looks like a code address and triggers 'monitor' to try and decode it?

Code: Select all

I (134) esp_image: segment 2: paddr=0x0002b210 vaddr=0x40080000 size=0x00400 (  1024) load
EDIT: never mind, I just found this: http://esp-idf.readthedocs.io/en/latest ... -addresses. So that means that a previous crash must enable the address decoder for future boots. The fix, not that one is really needed I suppose, is to either ignore those lines, or restart 'monitor'. Now I know this, it's much less confusing!

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby ESP_Angus » Thu Feb 01, 2018 5:03 am

meowsqueak wrote: EDIT: never mind, I just found this: http://esp-idf.readthedocs.io/en/latest ... -addresses. So that means that a previous crash must enable the address decoder for future boots. The fix, not that one is really needed I suppose, is to either ignore those lines, or restart 'monitor'. Now I know this, it's much less confusing!
Glad you found the docs and they resolved some confusion.

The monitor tool will always try to look up anything which looks like a code address, because it doesn't know any better (it just sees the raw log output and looks for things that might be hex addresses). During the initial boot sequence some things which are actually code addresses (like the load address of the binary segments in IRAM) are logged and the monitor tool will decode these. You can ignore those lines - they're technically correct (address XYZ does correspond to function ABC in the ELF file), but this information is mostly meaningless.

On the other hand, when you see it decoding addresses in a crash dump these are probably useful! You need to check the log line above (ie the bit which isn't yellow) to see what context that particular hex address was printed in (ie was it a value stored in a register, or is it part of the Backtrace, or is it something else...)

I noticed there's a very similar I2C interrupt watchdog issue reported on Github recently, and the driver team is helping with this one, so you may want to follow along there as well: https://github.com/espressif/esp-idf/issues/1503

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Thu Feb 01, 2018 5:22 am

ESP_Angus wrote:I noticed there's a very similar I2C interrupt watchdog issue reported on Github recently, and the driver team is helping with this one, so you may want to follow along there as well: https://github.com/espressif/esp-idf/issues/1503
Well, fancy that... ;)

[I created that issue - it's one and the same]

ESP_Angus
Posts: 2344
Joined: Sun May 08, 2016 4:11 am

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby ESP_Angus » Thu Feb 01, 2018 6:14 am

Oops, I missed that you'd dropped that link at the bottom of a previous reply. Sorry.

ginodecock
Posts: 30
Joined: Thu Jun 29, 2017 7:10 pm

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby ginodecock » Thu Feb 01, 2018 8:06 pm

meowsqueak wrote:
ginodecock wrote:Hi,

I've got a simular error when the i2c sensor is disconnected.
What kind of sensor are you using? Can you paste your exact error message? When it crashes, does it crash over-and-over?
yes it's crashing over and over again

I am using the tmp102 temperature sensor


ESP_ERROR_CHECK failed: esp_err_t 0xfffffffff at 0x401068f1
0x401068f1: tmp102_init1 at C:/msys32/home/projects/ulp_cc_templogger/main/main.c:164

file: "C:/msys32/home/projects/ulp_cc_templogger/main/tmp102.c" line 199
func: tmp102_init1
expression: i2c_master_ccmd_begin(I2C_NUM_0, cmdconfig, 100 / portTICK_PERIOD_MS)

Backtrace: 0x400883bb:0x3ffc1f60 0x400887f1:0x3ffc1f80 0x401068f1:0x3ffc1fa0 0x40107261:0x3ffc1fc0 0x400d13f8:0x3ffc1ff0
0x400883bb: invoke_abort at Y:/VM/Share/GitHub/esp-idf-v3.0-rc1/components/esp32/panic.c:572

0x400887f1: _esp_error_check_failed at Y:/VM/Share/GitHub/esp-idf-v3.0-rc1/components/esp32/panic.c:584

0x401068f1: tmp102_init1 at C:/msys32/home/projects/ulp_cc_templogger/main/main.c:164

0x40107261: app_main at C:/msys32/home/projects/ulp_cc_templogger/main/main.c:460

0x400d13f8: main_task at Y:/VM/Share/GitHub/esp-idf-v3.0-rc1/components/esp32/cpu_start.c:456


Rebooting...
ets Jun 8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv::0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0018,len:4
It seems that my problem is unrelated. By removing the ESP_ERROR_CHECK the crashen stops.
Last edited by ginodecock on Sat Feb 03, 2018 6:50 am, edited 2 times in total.

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Thu Feb 01, 2018 8:19 pm

EDIT: ignore all this, see next message.

With your build/project-name.elf file, can you do what ESP_Angus suggested with gdb "info line" to make sure the backtrace line numbers are correct?

Something like:

Code: Select all

$ xtensa-esp32-elf-gdb build/ulp_cc_templogger.elf 
(gdb) info line *0x400883bb
(gdb) info line *0x400887f1
(gdb) info line *0x401068f1
(gdb) info line *0x40107261
(gdb) info line *0x400d13f8
Let's see if they point to the same lines as my issue.[/s]
Last edited by meowsqueak on Thu Feb 01, 2018 8:22 pm, edited 1 time in total.

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Thu Feb 01, 2018 8:21 pm

Actually, your code is crashing because a call to ESP_ERROR_CHECK failed - whatever it's calling within (see your main.c:164) is returning an ESP error code, and this causes an assert that causes a reboot. This is intentional - so you need to check what function is returning the error code. It might still be related though - if you remove the ESP_ERROR_CHECK calls and let the code continue, it would be interesting to see if you get a watchdog panic in i2c.c eventually.

ginodecock
Posts: 30
Joined: Thu Jun 29, 2017 7:10 pm

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby ginodecock » Sat Feb 03, 2018 3:48 pm

Hi,

Removing ESP_ERROR_CHECK made my code continue. I don't see a watchdog panic however.

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Sat Feb 10, 2018 1:49 am

Quick update: the current hypothesis is that it's an issue with "too many interrupts" causing the ISR to take too long to complete, triggering the watchdog timer. When I get time I should be able to prove this with some logging and some experiments with varying the watchdog timer. I've found an effective way to reproduce the problem:

https://github.com/DavidAntliff/esp32-i2c-error-1503

Just run that project, connect two wires, brush them together... and boom.

A few other people have also reported the same issue.

Sadly it seems Espressif have gone completely quiet on the issue, which is unfortunate since I spent a lot of time getting it to a simple reproducible state, and I don't even know if they have reproduced it yet. But I'm hopeful they are still investigating and have been able to reproduce it.

Also, if the I2C FSM is reset within i2c_driver_install, then the "repeated crash" doesn't occur, so there's at least one workaround - would be good to get some sort of response from Espressif on this point too. For example, is there a legitimate use case where one would not want to reset the FSM on initialisation?

meowsqueak
Posts: 151
Joined: Thu Jun 15, 2017 4:54 am
Location: New Zealand

Re: I2C crash with release/v3.0 - what's an effective way to debug this?

Postby meowsqueak » Sat Feb 17, 2018 11:58 pm

Bumping this in the hope that ESP_<someone> replies as my project is pretty much unimplementable with the ESP32 as things stand right now. At the very minimum I need confirmation that the unconditional "FSM reset at start-up" workaround isn't going to cause other problems (need someone with some in-depth knowledge of the hardware to comment please).

Who is online

Users browsing this forum: Baidu [Spider] and 140 guests