Giter Site home page Giter Site logo

Comments (11)

mtrudel avatar mtrudel commented on September 28, 2024 1

CthulhuExceptions should be being logged in all cases (1.4.2 and 1.5.0+) but considering that the main focus of the 1.5.0 release was around logging of exceptions (and, if there's a bug fix to come from this, it'll be against the current version), it'd be much easier to establish a repro on the common ground of the current release.

from bandit.

mtrudel avatar mtrudel commented on September 28, 2024 1

As discussed with @alisinabh, this is likely due to the fact that curl shuts down the HTTP2 connection as soon as its one request is complete, where Firefox will keep the connection open. Because of the way that the Bandit HTTP/2 process model works, the parent connection process being terminated causes the child request process to terminate before it has a chance to complete its logging.

I'll look into getting the connection process join on any living children for some amount of time before shutting them down.

from bandit.

mtrudel avatar mtrudel commented on September 28, 2024

Assuming they're both coming via http/2 it could be any number of things. Since you're raising this as an error from your controller, the changes in 1.5.0 are probably relevant to your interests (it looks like you're using 1.4.2 above).

I'd be interested in seeing a repro of this! There's no end of interesting edge cases coming from Firefox (not 'wrong', just 'different').

from bandit.

DHager avatar DHager commented on September 28, 2024

Since you're raising this as an error from your controller, the changes in 1.5.0 are probably relevant to your interests (it looks like you're using 1.4.2 above).

At the moment I'm more concerned about logging unpredictable and arbitrary exceptions: If an undreamt-of CthulhuException rises from the stack-abyss to topple the production endpoints of man, we should at least get an alert e-mail about it. :p

from bandit.

DHager avatar DHager commented on September 28, 2024

I'll keep poking at it, but it does seem to happen with 1.5.2 as well.

* bandit 1.5.2 (Hex package) (mix)
* hpax 0.1.2 (Hex package) (mix)
* plug 1.16.0 (Hex package) (mix)
* telemetry 1.2.1 (Hex package) (rebar3)
* thousand_island 1.3.5 (Hex package) (mix)

from bandit.

mtrudel avatar mtrudel commented on September 28, 2024

Minimal repro case is raising an ArgumentError inside a phoenix controller and accessing it via Firefox / curl? How are you realizing http/2 connections? HTTPS? Proxy server?

from bandit.

DHager avatar DHager commented on September 28, 2024

I think this may be a race-condition where the process serving the request is getting terminated--silently!--after it sends data to the client but before it can finish logging properly. Perhaps there's some kind of bug in some code somewhere that supervises things and determines how long to spend on requests?

The code where it happens (on my box)

In my declining sanity, I've resorted to brute force of putting in IO.puts/1 calls inside Bandit.Pipeline.handle_error/5, like so:

defp handle_error(error, stacktrace, transport, span, opts) do
    #IEx.pry()
    IO.puts(">>>HANDLE_ERROR_2")
    Bandit.Telemetry.span_exception(span, :exit, error, stacktrace)
    IO.puts(">>>HANDLE_ERROR_2B")
    status = error |> Plug.Exception.status() |> Plug.Conn.Status.code()
    IO.puts(">>>HANDLE_ERROR_2C")
    if status in Keyword.get(opts.http, :log_exceptions_with_status_codes, 500..599) do
      IO.puts(">>>HANDLE_ERROR_ERRSTATUS_1")
      Logger.error(Exception.format(:error, error, stacktrace))
      IO.puts(">>>HANDLE_ERROR_ERRSTATUS_2")
      Bandit.HTTPTransport.send_on_error(transport, error)
      IO.puts(">>>HANDLE_ERROR_ERRSTATUS_3")
      {:error, error}
    else
      IO.puts(">>>HANDLE_ERROR_OTHERSTATUS_1")
      Bandit.HTTPTransport.send_on_error(transport, error)
      IO.puts(">>>HANDLE_ERROR_OTHERSTATUS_2")
      {:ok, transport}
    end
  end

Naturally, this requires some extra steps like mix deps.compile bandit to put the local changes into effect.

What gets printed

When making requests with Firefox or Chrome, this handler function completes entirely, and I also get the expected error logging. However with cURL, I get varying amounts of output! For example, sometimes it's as short as:

>>>HANDLE_ERROR_2
>>>HANDLE_ERROR_2B

While other times it gets a bit further:

>>>HANDLE_ERROR_2
>>>HANDLE_ERROR_2B
>>>HANDLE_ERROR_2C
>>>HANDLE_ERROR_ERRSTATUS_1

Possibly if I try it enough times, it'll eventually get further and log the error?

Interactive with IEx.pry/1

Additional evidence for "the process is being killed" comes from using an IEx.pry/1 call at the start of the method and starting up with iex -S mix phx.server.

With Firefox/Chrome triggering, everything works as expected, and I am able to confirm the prompt and dig around and them respawn to continue execution.

However in the buggy cause triggered with curl, the process is dead before I can do anything:

Request to pry #PID<0.2218.0> at Bandit.Pipeline.handle_error/5 (deps/bandit/lib/bandit/pipeline.ex:229)
  226:   end
  227: 
  228:   defp handle_error(error, stacktrace, transport, span, opts) do
  229:     IEx.pry()
  230:     IO.puts(">>>HANDLE_ERROR_2")
  231:     Bandit.Telemetry.span_exception(span, :exit, error, stacktrace)
  232:     IO.puts(">>>HANDLE_ERROR_2B")
Allow? [Yn] y

Interactive Elixir (1.15.7) - press Ctrl+C to exit (type h() ENTER for help)
** (EXIT from #PID<0.2218.0>) shell process exited with reason: no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started

Other

Minimal repro case is raising an ArgumentError inside a phoenix controller and accessing it via Firefox / curl?

Yes. Or at least, I haven't identified any other critical conditions yet. That said, calling curl multiple times shows variability in how much works and how much fails.

How are you realizing http/2 connections? HTTPS? Proxy server?

In this case it's all on localhost, no proxies or anything. Our application has multiple endpoints on different ports, and for the newer ones being developed we're using bandit instead of cowboy2.

from bandit.

DHager avatar DHager commented on September 28, 2024

Sounds good, just adding another data-point to confirm: If I call curl with additional URLs on the same invocation, then the issue only shows up if the problem-URL is the last one listed. I assume this is because curl is keeping its connections while it works on the URLs, since a later URL may benefit from a connection established previously.

That said, this behavior could potentially occur with other tools or libraries.

from bandit.

mtrudel avatar mtrudel commented on September 28, 2024

Thanks for that - that just about confirms my hypothesis.

There's nothing 'wrong' here from a safety or a code correctness perspective, this is just a classic race condition. Should be pretty easy to mitigate.

Does the condition happen 100% of the time with curl?

from bandit.

DHager avatar DHager commented on September 28, 2024

Does the condition happen 100% of the time with curl?

To invert the saying: "It never works on my machine." However someone with a slow network or slow client-computer might not trigger the issue, since it gets very close to the logging statement before termination.

I can repro the problem with Chrome by:

  1. Inside Pipeline.handle_error/5 and immediately before the Logger.error/1 line, add :thread.sleep(5000). This widens the time-window to a comfortable 5 seconds. (Recompile dependencies, relaunch server, etc.)

  2. Open up the browser with only one window/tab.

  3. Navigate to the problematic URL.

  4. Immediately (i.e. within 5 seconds) close the browser, indirectly forcing it to terminate the connection sooner than it would normally choose to.

  5. Wait out the remaining time, and observe that there's no log output on the server.

Conversely, skipping step 4 will cause the log entry to appear on that 5-second delay.

from bandit.

adonig avatar adonig commented on September 28, 2024

I'm not sure whether this is related. I run Phoenix 1.7.14 and get an error when I access my endpoint with a browser:

[info] Sent 200 in 516ms
[error] ** (Bandit.HTTPError) closed

This doesn't happen when I call it with CURL.

from bandit.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.