MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Wed May 23, 2018 7:16 am
by rahul.b.patel
Hello,
I have ported the Eclipse Paho MQTT client library from the link below,
https://github.com/eclipse/paho.mqtt.embedded-c

and added modifications for secure connection options with mbedTLS.

I tested it with the IDFv2.1 and IDFv3.0 stable releases and found an issue with the disconnect event of the MQTT connection. It occurs with IDFv3.0 only.

First of all, the relevant part of my code is as below,

Code:

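NetworkConnect()	/* open the secure (mbedTLS) connection; arguments omitted */
MQTTConnect()		/* send the MQTT CONNECT; arguments omitted */
while (1)
{
	/* Service the MQTT connection; expected to return after 1000 ms. */
	MQTTYield(&client, 1000);
	if (!MQTTIsConnected(&client))
	{
		ESP_LOGE(TAG,"Disconnected.......");
		break;
	}
}
/* Connection lost: tear down and reconnect. */
DisconnectNetwork()
NetworkConnect()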
MQTTConnect()

Now the issue is with IDFv3.0:
when the connection is lost (the ESP's STA interface is disconnected), MQTTYield() in the above code snippet does not return until the STA gets connected again. So until the STA is connected again, MQTTYield() blocks forever. I suspect that the timer does not expire in the implementation of the MQTT library.

But after changing back to IDFv2.1,
MQTTYield() returns after 1 sec as expected and gives the "disconnected..." log on the console.


This issue occurs only with IDFv3.0. With IDFv2.1, MQTTYield() returns as soon as the STA gets disconnected, and as per my functionality the code then disconnects the network and tries to connect again.

I hope I have explained the issue well enough to give an idea of it.
It would be a great help if anybody has come across the same issue and has a resolution.

Is there any change in IDFv3.0 which is causing this issue? Looking at MQTTYield(), it depends on gettimeofday(), which in turn depends on FreeRTOS ticks. In IDFv3.0, the tick sources for the two CPUs are separated. Could that be coming into the picture? Well, it's just a guess.
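
For readers following the timer theory, here is a minimal sketch of a gettimeofday()-based countdown timer of the kind MQTTYield() would rely on. The names (SketchTimer, sketch_timer_*) are hypothetical and are not taken from the ported library; the point is only that a timer like this expires on its own, provided the code that checks it is not stuck inside a blocking call.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical countdown timer; not the ported library's actual code. */
typedef struct {
    struct timeval end_time;   /* absolute deadline */
} SketchTimer;

/* Arm the timer to expire timeout_ms milliseconds from now. */
static void sketch_timer_countdown_ms(SketchTimer *t, unsigned int timeout_ms)
{
    struct timeval now;
    assert(t != NULL);
    gettimeofday(&now, NULL);
    t->end_time.tv_sec  = now.tv_sec + timeout_ms / 1000;
    t->end_time.tv_usec = now.tv_usec + (timeout_ms % 1000) * 1000;
    if (t->end_time.tv_usec >= 1000000) {   /* carry into seconds */
        t->end_time.tv_sec  += 1;
        t->end_time.tv_usec -= 1000000;
    }
}

/* Milliseconds left until the deadline, clamped to 0 once it has passed. */
static long sketch_timer_left_ms(const SketchTimer *t)
{
    struct timeval now;
    long left;
    assert(t != NULL);
    gettimeofday(&now, NULL);
    left = (long)(t->end_time.tv_sec - now.tv_sec) * 1000
         + (long)(t->end_time.tv_usec - now.tv_usec) / 1000;
    return left > 0 ? left : 0;
}

/* Non-zero once the deadline has passed. */
static int sketch_timer_expired(const SketchTimer *t)
{
    return sketch_timer_left_ms(t) == 0;
}
```

A yield loop would arm the timer once with sketch_timer_countdown_ms(&t, 1000) and keep reading packets while !sketch_timer_expired(&t). If the read itself never returns, the expiry check is never reached, however accurate the clock is.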

If anybody has any idea, it would be a great help.
Thanks.

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Thu May 24, 2018 12:22 pm
by rahul.b.patel
Hi,
Does anyone have an idea whether changes in IDFv3.0 may cause this issue?

Thanks.

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Fri May 25, 2018 2:43 am
by ESP_Angus
Hi Rahul,

You've explained the problem you're seeing well but without seeing your port it's not really possible to guess at what the problem might be. There aren't any changes in v3.0 that spring to mind as likely candidates, but it's hard to guess.

Suggest either using a JTAG debugger or adding some logging in the MQTT code so you can see (for example) each time the code goes into a delay loop or blocks for something. Check that the timeouts for delays and blocking calls are being set correctly. Maybe there is some integer overflow/underflow somewhere causing something to block indefinitely when it should be timing out.

Angus

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Fri May 25, 2018 12:36 pm
by rahul.b.patel
Hi Angus,
Thanks for the helpful reply. I analyzed the scenario with some logging and found that the mbedtls_ssl_read() function usually stays in a blocking state.
The issue I am observing is that on the STA disconnect event, mbedtls_ssl_read() comes out of the blocking state in IDFv2.1, while it stays blocked with IDFv3.0.

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Sat May 26, 2018 4:23 am
by rahul.b.patel
Hi Angus,

I back-ported the LWIP component from IDFv2.1 to IDFv3.0 and now it works fine. So I need to check which changes in the LWIP component in IDFv3.0 create this issue. Any suggestion from your side will be helpful.

thanks.

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Mon May 28, 2018 6:26 am
by littlesky
I guess that it may be caused by the IP address. In IDF v2.1, the IP address is lost once WiFi disconnects. But in IDF v3.0, the IP address is kept unchanged for 120 seconds after WiFi disconnects. Because the IP address is unchanged, the TCP connections bound to it are not changed either.

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Tue Jun 05, 2018 3:21 am
by liuzhifu
Please refer to my mail about this issue:

This is an issue introduced in IDFv3.0.

In the original LWIP, when WiFi disconnects, tcp_abort() is called to kill all active/bound TCP connections.

In IDFv3.0 and the latest IDF, when WiFi disconnects, the TCP connections are not removed, and the application is responsible for removing all TCP connections. When the WiFi connection recovers, the TCP connections do not need to be re-created; they are just rebound to the new IP address.

The historical reasons for not removing TCP connections in IDFv3.0:
1. Most applications have a WiFi auto-reconnect mechanism, so the WiFi will reconnect automatically if it disconnects for some reason.
2. If the WiFi successfully reconnects to the AP, the TCP connections do not need to be re-created.
3. If the WiFi no longer reconnects to the AP, or if esp_wifi_set_config() is called to reconnect the WiFi to a different AP, or if esp_wifi_stop()/esp_wifi_deinit() is called, then the application needs to close all socket connections.
4. Most applications, such as audio applications, use recv with a TIMEOUT value, so if the WiFi stays disconnected for a long time, recv returns because of the timeout.
5. However, some existing library APIs, such as mbedtls, use the LWIP recv in a way that blocks forever. That is what our customer ran into in this issue.

So the suggestions for customers:
1. Ignore this issue if the WiFi will reconnect.
2. Close the sockets if the WiFi will not reconnect or if the WiFi reconnects to a different AP.

In the future (sorry, I'm not sure of the exact date), we are considering adding a menuconfig option for this; it will be disabled by default (the behavior is then the same as IDFv2.1).
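
Point 4 above (recv with a timeout) can be sketched roughly as follows. This uses the standard BSD socket option SO_RCVTIMEO, which LWIP also offers when its LWIP_SO_RCVTIMEO option is enabled. The helper names are hypothetical, and this is only an illustration at the plain socket level. For a TLS connection, the timeout would have to be applied to the socket that mbedTLS reads from, or through mbedTLS's own read-timeout setting (mbedtls_ssl_conf_read_timeout()), which is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

#define RECV_TIMED_OUT (-2)   /* hypothetical return code for this sketch */

/* Make recv() on fd give up after timeout_ms.
 * Note: a timeout of 0 means "block forever", which is the default. */
static int set_recv_timeout_ms(int fd, unsigned int timeout_ms)
{
    struct timeval tv;
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/* Returns the number of bytes read, 0 if the peer closed the connection,
 * RECV_TIMED_OUT if nothing arrived in time, or -1 on any other error. */
static int recv_with_timeout(int fd, void *buf, size_t len)
{
    ssize_t n;
    assert(buf != NULL);
    n = recv(fd, buf, len, 0);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return RECV_TIMED_OUT;
    }
    return (int)n;
}
```

With a timeout in place, the caller gets control back periodically and can decide whether the link is still worth waiting on, instead of staying blocked until the WiFi comes back.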

Re: MQTT with Secure connection disconnection issue with IDFv3.0

Posted: Tue Jun 05, 2018 8:32 am
by rahul.b.patel
Hi,
Thanks for the detailed explanation. So the application needs to manage closing all sockets on the WiFi STA disconnect event, assuming the worst-case scenario that the AP will not connect again.
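
One possible way to structure that in the application, as a rough sketch only: the WiFi event handler sets a flag on the STA disconnect event, and the task that owns the socket reads in short timeout slices, checks the flag between slices, and closes its own socket. All names below are hypothetical, and the sketch is written against plain BSD sockets rather than the actual port. A real application might prefer an event group or an atomic over a volatile flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Set from the WiFi event handler, read by the task that owns the socket. */
static volatile bool sta_disconnected = false;

/* Hypothetical hook: call this from the application's handler for the
 * STA disconnect event. */
static void on_sta_disconnected(void)
{
    sta_disconnected = true;
}

typedef enum { READ_DATA, READ_PEER_CLOSED, READ_LINK_DOWN, READ_ERROR } read_result_t;

/* Wait for data in 100 ms slices so the flag is checked regularly.
 * On link loss the socket is closed and *fd is set to -1. */
static read_result_t read_until_link_down(int *fd, void *buf, size_t len, ssize_t *out_len)
{
    struct timeval slice = { .tv_sec = 0, .tv_usec = 100 * 1000 };
    assert(fd != NULL && buf != NULL && out_len != NULL);

    if (setsockopt(*fd, SOL_SOCKET, SO_RCVTIMEO, &slice, sizeof(slice)) != 0) {
        return READ_ERROR;
    }
    for (;;) {
        ssize_t n;
        if (sta_disconnected) {
            close(*fd);          /* the owning task closes its own socket */
            *fd = -1;
            return READ_LINK_DOWN;
        }
        n = recv(*fd, buf, len, 0);
        if (n > 0) {
            *out_len = n;
            return READ_DATA;
        }
        if (n == 0) {
            return READ_PEER_CLOSED;
        }
        if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR) {
            return READ_ERROR;
        }
        /* Slice elapsed with no data: loop and re-check the flag. */
    }
}
```

Closing the socket from the task that owns it avoids relying on whether the network stack wakes a recv that is blocked in another task when the socket is closed, which may differ between stacks and versions. The caller would clear the flag again once it starts reconnecting.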