overwhelming incoming socket

mzimmers
Posts: 643
Joined: Wed Mar 07, 2018 11:54 pm
Location: USA

Postby mzimmers » Fri Oct 12, 2018 10:05 pm

Hi all -

I'm implementing a file transfer facility for my ESP32 app. I'm using a TCP socket for this. Small files work OK, but bigger ones eventually fail. Initially I was getting about 70KB into a file transfer before the ESP broke[*]. I inserted a small delay into my Windoze program, and that got me about 10X deeper.

[*] By "broke" I mean that the recv() call returns 0 bytes, and then I start getting the error "No more processes" on subsequent read attempts. It doesn't recover until the sender gives up and sends a RST.

Here's the source code; can anyone suggest what I might be doing wrong?

Code:

int Ota::receiveBinary(int size)
{
    char buff[512];

    int bytesRead;
    int rc = 0;

    ESP_LOGI(TAG, "receiveBinary(): file size is %d", size);

    while (size > 0)
    {
        // read from the socket.
        bytesRead = recv(m_sockRead, buff, (size_t) sizeof(buff), 0);//MSG_DONTWAIT);
        if (bytesRead > 0)
        {
            ESP_LOGI(TAG, "receiveBinary(): %d bytes read.", bytesRead);
        }
        else if (bytesRead == 0)
        {
            ESP_LOGW(TAG, "receiveBinary(): recv() returned 0 bytes.");
            continue;
        }
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            ESP_LOGW(TAG, "receiveBinary(): recv() returned error %s", strerror(errno));
            continue;
        }
        else
        {
            ESP_LOGE(TAG, "receiveBinary(): recv() returned error %s.", strerror(errno));
            rc = -1;
            break;
        }

        size -= bytesRead;
        ESP_LOGI(TAG, "receiveBinary(): %d bytes remaining.", size);
    }

    ESP_ERROR_CHECK(close(m_sockRead));
    ESP_LOGI(TAG, "receiveBinary(): closed socket sockRead.");

    return rc;
}
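
One thing worth noting about the loop above: on a TCP socket, recv() returning 0 normally means the peer has closed its side of the connection, so retrying with `continue` can spin forever. Whether that is what happens in this case is not established in the thread. The following is only a minimal sketch of a stricter loop, written against plain POSIX sockets rather than ESP-IDF. The helper name `receiveExactly` and the `maxTimeouts` parameter are made up for illustration. It treats 0 as end-of-stream, never asks for more than the bytes still expected, and gives up after a bounded number of consecutive receive timeouts.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>

// Hypothetical helper (not from the original post): read and discard exactly
// `size` bytes from `sock`.
// Returns 0 on success, -1 on error, peer close, or too many timeouts in a row.
int receiveExactly(int sock, int size, int maxTimeouts)
{
    char buff[512];
    int timeouts = 0;

    while (size > 0)
    {
        // Never request more than what is still expected.
        size_t want = (size < (int) sizeof(buff)) ? (size_t) size : sizeof(buff);
        ssize_t n = recv(sock, buff, want, 0);

        if (n > 0)
        {
            size -= (int) n;
            timeouts = 0;           // progress was made, reset the counter
        }
        else if (n == 0)
        {
            return -1;              // peer closed before the full file arrived
        }
        else if (errno == EINTR)
        {
            continue;               // interrupted by a signal, just retry
        }
        else if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            // SO_RCVTIMEO expired (or the socket is non-blocking and empty).
            if (++timeouts > maxTimeouts)
            {
                return -1;
            }
        }
        else
        {
            return -1;              // any other error is treated as fatal
        }
    }
    return 0;
}
```

A call such as `receiveExactly(m_sockRead, size, 10)` would tolerate up to ten consecutive one-second timeouts with the socket setup used in this thread; the limit of ten is arbitrary.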

JayLogue
Posts: 19
Joined: Sat Apr 21, 2018 4:44 pm

Re: overwhelming incoming socket

Postby JayLogue » Sat Oct 13, 2018 4:16 am

Just a wild guess: I noticed your code checks for EAGAIN/EWOULDBLOCK. Have you specifically put the socket in non-blocking mode? If so, then your while loop will spin constantly when there's no data available, possibly starving the LwIP thread and/or causing excessive contention on a network-layer lock.

In general, when reading from a socket in a loop, one shouldn't have the socket in non-blocking mode.
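
A quick way to check the mode is to read the socket's file status flags. This is a rough sketch assuming the platform's fcntl() supports F_GETFL/F_SETFL with O_NONBLOCK (standard on POSIX; worth verifying for the LwIP version in use). The helper names `isNonBlocking` and `setBlocking` are made up for illustration.

```cpp
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>

// Hypothetical helper: returns 1 if `sock` is non-blocking, 0 if it is
// blocking, and -1 if the flags could not be read.
int isNonBlocking(int sock)
{
    int flags = fcntl(sock, F_GETFL, 0);
    if (flags < 0)
    {
        return -1;
    }
    return (flags & O_NONBLOCK) ? 1 : 0;
}

// Hypothetical helper: clears O_NONBLOCK so that recv() blocks again
// (subject to any SO_RCVTIMEO timeout). Returns 0 on success, -1 on failure.
int setBlocking(int sock)
{
    int flags = fcntl(sock, F_GETFL, 0);
    if (flags < 0)
    {
        return -1;
    }
    return (fcntl(sock, F_SETFL, flags & ~O_NONBLOCK) < 0) ? -1 : 0;
}
```

Note that a blocking socket with a receive timeout set via SO_RCVTIMEO can also return EAGAIN/EWOULDBLOCK when the timeout expires, so seeing that error does not by itself prove the socket is non-blocking.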

mzimmers
Posts: 643
Joined: Wed Mar 07, 2018 11:54 pm
Location: USA

Re: overwhelming incoming socket

Postby mzimmers » Sat Oct 13, 2018 9:41 pm

Hi Jay -

Here's my socket setup code; the only thing I do is set a timeout.

Code:

int Ota::setupSocket()
{
    struct sockaddr_in sockAddr;
    struct sockaddr_in sockServer;
    socklen_t addrLen = sizeof(sockAddr);
    timeval tv;
    socklen_t optlen;
    socklen_t *pOptlen = &optlen;
    int rc = 0;

    // Create the socket to receive the binary.
    m_sockListen = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (m_sockListen != ESP_FAIL)
    {
        // set receive timeout on socket.
        tv.tv_sec = 1;
        tv.tv_usec = 0;
        optlen = sizeof(tv);
        rc = setsockopt(m_sockListen, SOL_SOCKET, SO_RCVTIMEO, &tv, optlen);
        if (rc != ESP_OK)
        {
            ESP_LOGE(TAG, "setupSocket(): setsockopt() SO_RCVTIMEO on listening socket: error: %s.", strerror(errno));
        }
        rc = getsockopt(m_sockListen, SOL_SOCKET, SO_RCVTIMEO, &tv, pOptlen);
        //ESP_LOGI(TAG, "setupSocket(): TCP listening socket timeout value is %d.%06d.",(int) tv.tv_sec, (int) tv.tv_usec);

        // Bind our server socket to a port.
        sockServer.sin_family = AF_INET;
        sockServer.sin_addr.s_addr = htonl(INADDR_ANY);
        sockServer.sin_port = htons(OTA_PORT_NBR);
        rc = bind(m_sockListen, (struct sockaddr *)&sockServer, sizeof(sockServer));
        if (rc == ESP_OK)
        {
            // listen for new connections.
            rc = listen(m_sockListen, 5);
            if (rc == ESP_OK)
            {
                // accept incoming connection.
                m_sockRead = accept(m_sockListen, (struct sockaddr *)&sockAddr, &addrLen);
                if (m_sockRead >= 0)
                {
                    // if the accept was successful, set a timeout on the read socket.
                    rc = setsockopt(m_sockRead, SOL_SOCKET, SO_RCVTIMEO, &tv, optlen);
                    if (rc == ESP_OK)
                    {
                        rc = getsockopt(m_sockRead, SOL_SOCKET, SO_RCVTIMEO, &tv, pOptlen);
                        ESP_LOGI(TAG, "setupSocket(): TCP reading socket timeout value is %d.%06d.",
                                 (int) tv.tv_sec, (int) tv.tv_usec);
                    }
                    // and so on

JayLogue
Posts: 19
Joined: Sat Apr 21, 2018 4:44 pm

Re: overwhelming incoming socket

Postby JayLogue » Sun Oct 14, 2018 11:06 pm

I'm away from my computer, but a very quick review of the code doesn't reveal anything overtly wrong.

You might consider using Wireshark to capture and analyze the traffic. The capture would have to happen on the server or at an intermediate router. Look for ACK packets from the ESP32, and verify that the device continues to open the TCP receive window. If not, there's likely something wrong on the ESP32 side, such as running out of memory at the LwIP layer. Some of the ESP-IDF memory diagnostic functions could be useful here.

--Jay

Evilhommer
Posts: 1
Joined: Tue Oct 16, 2018 12:55 pm

Re: overwhelming incoming socket

Postby Evilhommer » Tue Oct 16, 2018 2:40 pm

Hello all!
I'm trying to flash firmware over TCP/IP but have the same problems with sockets.
I've attached a Wireshark log below.
Attachment: esp_revc_fail.zip (8.69 KiB)

JayLogue
Posts: 19
Joined: Sat Apr 21, 2018 4:44 pm

Re: overwhelming incoming socket

Postby JayLogue » Thu Oct 18, 2018 3:55 am

Evilhommer wrote:
Hello all!
I'm trying to flash firmware over TCP/IP but have the same problems with sockets.
I've attached a Wireshark log below.

Hi Evilhommer,

I gather from the packet trace that the ESP32 is acting as a web server and you are POSTing the firmware image to the device using some form of web client. This is very different from the OP who is transferring the file directly over TCP.

The problem in this case is that the web server on the ESP32 is timing out waiting for the entirety of the request:

Code:

HTTP/1.1 408 Request Timeout
Content-Type: text/html
Content-Length: 29

Server closed this connection

Prior to this there are long delays during which packets are lost or unacknowledged. This culminates in throughput dropping to zero for about 10 seconds prior to the server giving up.

Because of the retransmissions, this looks like a WiFi problem. Alternatively, the LwIP stack could be starved for processor time or buffers. It is possible that increasing the server timeout would allow the upload to succeed. However, the transfer rate is likely to be very slow.

--Jay

mzimmers
Posts: 643
Joined: Wed Mar 07, 2018 11:54 pm
Location: USA

Re: overwhelming incoming socket

Postby mzimmers » Tue Oct 23, 2018 8:28 pm

This seems to simply be a problem with data coming in too fast for the ESP32. Looking at Wireshark, I see the ESP32's window size steadily drop until it sends a ZeroWindow message. This seems to persist much longer (several seconds) than it should, given that the ESP32 is not doing anything with the data (just reading it and throwing it away). This condition results in ZeroWindowProbes from the client, and acks from the ESP32.

Two things improve the situation:
1. I just increased the priority of the task that handles socket communications.
2. I put a small delay in my client between socket writes.

This second fix really shouldn't be necessary, and I'm not sure why the first one helps, given how idle the other tasks are.

But the real concern is that the ESP32 continues to "stall" - just stops reading (at least, according to my telltales). If I'm going to use this for FW updates, the connection has to be a lot more reliable than it currently appears.
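
For reference, the client-side delay mentioned in item 2 can be sketched roughly as below. This is not the actual Windows client from this thread; it is a POSIX illustration, and the helper name `sendPaced` along with the `chunkSize` and `delayMs` parameters are assumptions made for the example. It writes the buffer in fixed-size chunks, handles partial sends, and sleeps briefly between chunks.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

// Hypothetical helper: send `len` bytes from `data` in chunks of at most
// `chunkSize` bytes, pausing `delayMs` milliseconds between chunks.
// Returns 0 when everything was sent, -1 on error or invalid chunk size.
int sendPaced(int sock, const char *data, size_t len, size_t chunkSize, unsigned delayMs)
{
    if (chunkSize == 0)
    {
        return -1;
    }

    size_t sent = 0;
    while (sent < len)
    {
        size_t want = (len - sent < chunkSize) ? (len - sent) : chunkSize;
        ssize_t n = send(sock, data + sent, want, 0);
        if (n < 0)
        {
            if (errno == EINTR)
            {
                continue;           // interrupted by a signal, retry the send
            }
            return -1;
        }
        sent += (size_t) n;         // send() may accept fewer bytes than asked

        // Pause between chunks, but not after the last one.
        if (sent < len && delayMs > 0)
        {
            struct timespec ts;
            ts.tv_sec = (time_t) (delayMs / 1000);
            ts.tv_nsec = (long) (delayMs % 1000) * 1000000L;
            nanosleep(&ts, nullptr);
        }
    }
    return 0;
}
```

TCP's own flow control (the receive window) should make this kind of pacing unnecessary in principle, so it is at best a workaround while the stall on the ESP32 side is investigated.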
