
Unexpected socket closing (lwan issue, CLOSED)

pontscho commented on June 3, 2024
Unexpected socket closing

from lwan.

Comments (8)

lpereira commented on June 3, 2024

Lwan doesn't have a cache for connections (it pre-allocates the connection structs but it's not really a cache per se).

What happens in this case is that the operating system always gives you back a file descriptor with the lowest available value first, and in the case that you make a request with Connection: close, and make another request, the OS is very likely to give you the same file descriptor from the connection that just closed to the new connection you just made.
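The descriptor reuse described here is easy to observe outside Lwan. Below is a minimal sketch in C, assuming a single-threaded POSIX process and using /dev/null in place of a real socket; the function name fd_number_is_reused is made up for illustration and is not part of Lwan.

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens a descriptor, closes it, opens another one, and reports whether the
 * second open() got the same descriptor number as the first.
 * Returns 1 if the number was reused, 0 if not, -1 on error.
 * Assumption: nothing else in the process opens or closes descriptors
 * between the calls (single-threaded). */
int fd_number_is_reused(void)
{
    int first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    close(first);

    /* POSIX requires open() to return the lowest-numbered descriptor that
     * is not currently open, so the number just released is handed out
     * again. */
    int second = open("/dev/null", O_RDONLY);
    if (second < 0)
        return -1;

    int reused = (second == first);
    close(second);
    return reused;
}
```

Seeing the same number again only means the kernel recycled it; the second descriptor is still a completely new open file.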

But it's going to be a new connection from the OS perspective and from Lwan's perspective. You can look in the spawn_coro() function in lwan-thread.c to see this being initialized. You can follow what calls this function to understand the flow of how new connections are accepted and initialized.

Have you seen any problem related to this? Like, Lwan trying to close the same connection twice, or something? If you saw this then this is definitely a bug; otherwise, this is working as designed.

pontscho commented on June 3, 2024

> Lwan doesn't have a cache for connections (it pre-allocates the connection structs but it's not really a cache per se).

Yep, I saw that when I "rev. eng'd" the lwan_connection_get_fd() function.

> Have you seen any problem related to this? Like, Lwan trying to close the same connection twice, or something?

Definitely. The flow is:

  • when an HTTP call happens, spawn_coro() calls timeout_queue_insert()
  • right after that call, or a while later (I don't remember my debug logs), timeout_queue_expire() is called from somewhere
  • after a while, Lwan reinserts this entry into the tq with timeout_queue_move_to_last() (in thread_io_loop()), even though this socket was already expired in a previous call (resume_coro())

This last move has two effects:

  • Lwan closes foreign sockets unexpectedly,
  • it causes some performance degradation, because the timeout list gets longer and longer (I saw this in my debug logs: Lwan tries to close those invalid sockets in the tq forever).

Those few lines above are just a band-aid, not a real solution, but they help to understand the problem that causes this strange hazard. It was fun to find.
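To make the sequence easier to follow, here is a toy model of it in C. This is only a sketch under loud assumptions: struct conn, struct tq, and every tq_* function below are hypothetical stand-ins, not Lwan's real data structures or its actual fix, and close() is replaced by a counter so nothing is really closed. It shows how moving an already-expired entry back into the queue leads to a second close of the same descriptor number, and how a simple "still alive" check would avoid it.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONNS 8

/* Hypothetical connection record; not Lwan's struct lwan_connection. */
struct conn {
    int fd;
    bool alive;       /* descriptor is still owned by this connection */
    int close_calls;  /* how many times close(fd) would have been called */
};

/* Hypothetical timeout queue: a small array kept in expiry order. */
struct tq {
    struct conn *items[MAX_CONNS];
    size_t len;
};

static void tq_insert(struct tq *tq, struct conn *c)
{
    if (tq->len < MAX_CONNS)
        tq->items[tq->len++] = c;
}

static void tq_remove(struct tq *tq, struct conn *c)
{
    for (size_t i = 0; i < tq->len; i++) {
        if (tq->items[i] != c)
            continue;
        for (size_t j = i + 1; j < tq->len; j++)
            tq->items[j - 1] = tq->items[j];
        tq->len--;
        return;
    }
}

/* Expire: drop the entry from the queue and "close" its descriptor. */
static void tq_expire(struct tq *tq, struct conn *c)
{
    tq_remove(tq, c);
    c->alive = false;
    c->close_calls++; /* stands in for close(c->fd) */
}

/* Unguarded move: re-queues the entry even if it has already expired. */
static void tq_move_to_last(struct tq *tq, struct conn *c)
{
    tq_remove(tq, c);
    tq_insert(tq, c);
}

/* Guarded move: ignores connections whose descriptor is already closed. */
static void tq_move_to_last_guarded(struct tq *tq, struct conn *c)
{
    if (c->alive)
        tq_move_to_last(tq, c);
}

/* Timeout sweep: expire everything that is still queued. */
static void tq_expire_all(struct tq *tq)
{
    while (tq->len > 0)
        tq_expire(tq, tq->items[0]);
}

/* Replays insert -> expire -> move-to-last -> sweep and returns how many
 * times the descriptor would have been closed. */
int closes_after_stale_move(bool guarded)
{
    struct tq tq = { .len = 0 };
    struct conn c = { .fd = 40, .alive = true };

    tq_insert(&tq, &c);
    tq_expire(&tq, &c);
    if (guarded)
        tq_move_to_last_guarded(&tq, &c);
    else
        tq_move_to_last(&tq, &c);
    tq_expire_all(&tq);

    return c.close_calls;
}
```

In this model closes_after_stale_move(false) counts two closes and closes_after_stale_move(true) counts one. The model does not reproduce the ever-growing list mentioned above; it only illustrates the double close.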

> If you saw this then this is definitely a bug

I think it is a bug, but I wasn't sure whether it was intended behavior.

lpereira commented on June 3, 2024

pontscho commented on June 3, 2024

Sorry for the late answer, I only just now had some time to work on this issue.

I know how sockets work over TCP, which is why I was surprised by these unexpected socket closings. Let me explain the steps to reproduce this issue with current master (939dbf6):

  1. first of all, please apply this patch:
diff --git a/src/lib/lwan-tq.c b/src/lib/lwan-tq.c
index 096e56b1..76d19fa9 100644
--- a/src/lib/lwan-tq.c
+++ b/src/lib/lwan-tq.c
@@ -66,6 +66,7 @@ void timeout_queue_move_to_last(struct timeout_queue *tq,
      * served.  In practice, if this is called, it's a keep-alive connection. */
     conn->time_to_expire = tq->current_time + tq->move_to_last_bump;

+    lwan_status_info(" ! LWAN MOVE TO LAST: %d", lwan_connection_get_fd(tq->lwan, conn));
     timeout_queue_remove(tq, conn);
     timeout_queue_insert(tq, conn);
 }
@@ -93,6 +94,7 @@ void timeout_queue_expire(struct timeout_queue *tq,
         conn->coro = NULL;
     }

+    lwan_status_info(" ! LWAN CLOSE: %d", lwan_connection_get_fd(tq->lwan, conn));
     close(lwan_connection_get_fd(tq->lwan, conn));
 }
  2. start Lwan, for example the hello HTTP server from the samples directory

  3. make a GET call first: curl -s -X GET http://127.0.0.1:8080 > /dev/null, and you should see something like this:

1606947 lwan-request.c:1488 log_request() 127.0.0.1 [Tue, 19 Apr 2022 13:51:09 GMT] ae3e96ebc6245d8b "GET / HTTP/1.1" 200 text/plain (r:0.003ms p:0.528ms)
1606947 lwan-tq.c:69 timeout_queue_move_to_last()  ! LWAN MOVE TO LAST: 40
1606947 lwan-tq.c:97 timeout_queue_expire()  ! LWAN CLOSE: 40

I think this is normal, but...

  4. make a PUT or POST call: curl -s -X PUT http://127.0.0.1:8080 > /dev/null, and you should see something like this:
1606947 lwan-request.c:1488 log_request() 127.0.0.1 [Tue, 19 Apr 2022 13:51:39 GMT] 1be8c15addce20a4 "PUT / HTTP/1.1" 400 text/html (r:0.007ms p:1.203ms)
1606947 lwan-tq.c:97 timeout_queue_expire()  ! LWAN CLOSE: 40
1606947 lwan-tq.c:69 timeout_queue_move_to_last()  ! LWAN MOVE TO LAST: 40
 ... after ~20 seconds...
1606947 lwan-tq.c:97 timeout_queue_expire()  ! LWAN CLOSE: 40

Lwan closes socket 40 twice, and if I open a socket and it gets a descriptor number that Lwan used previously, Lwan will close that socket no matter who opened it.
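The hazard of the second close can be reproduced in isolation. Below is a minimal sketch in C, assuming a single-threaded POSIX process and using /dev/null instead of real sockets; fd_is_open and stale_close_hits_foreign_fd are names made up for this illustration and are not part of Lwan.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns 1 if fd currently refers to an open descriptor, 0 otherwise. */
static int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Returns 1 if a stale second close() ended up closing a descriptor that
 * was opened by unrelated code, 0 if it did not, -1 on setup error. */
int stale_close_hits_foreign_fd(void)
{
    /* Stands in for the connection the server accepted. */
    int server_fd = open("/dev/null", O_RDONLY);
    if (server_fd < 0)
        return -1;
    close(server_fd); /* first, legitimate close */

    /* Unrelated code opens something and receives the recycled number. */
    int foreign_fd = open("/dev/null", O_RDONLY);
    if (foreign_fd < 0)
        return -1;
    if (foreign_fd != server_fd) {
        close(foreign_fd);
        return 0; /* number was not recycled, nothing to demonstrate */
    }

    /* The server still remembers the old number and closes it again. */
    close(server_fd);

    /* The foreign descriptor is now gone although its owner never closed it. */
    return !fd_is_open(foreign_fd);
}
```

The second close() does not fail in this situation, because from the kernel's point of view the number is valid again; that is what makes the bug hard to spot.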

lpereira commented on June 3, 2024

lpereira commented on June 3, 2024

pontscho commented on June 3, 2024

Hey, I think this bug is fixed, thank you very much. On another note, after this patch the CPU usage of Lwan decreased on macOS.

lpereira commented on June 3, 2024

That's a nice surprise! I rarely test Lwan on macOS these days as my Mac gave up the ghost, but it's good to know.
