Comments (24)

silviucpp commented on August 18, 2024

Also, if you have a big pool, this stops the pool completely, because the supervisor restart limit is triggered.

from mysql-otp-poolboy.

RaoH commented on August 18, 2024

Hmm, do you have {keep_alive, true} set on the mysql connection? (https://mysql-otp.github.io/mysql-otp/index.html)
It sounds like mysql resets the connection if it hasn't been used for a while.
With the above setting, the driver sends a ping to keep it alive.
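
For reference, a minimal sketch of a connection option list with keep-alive enabled. The host, user, password and database values are placeholders, not taken from this thread; only the keep_alive option is the point here.

```erlang
%% Options passed to mysql:start_link/1; all values are placeholders.
[{host, "localhost"},
 {user, "app_user"},
 {password, "secret"},
 {database, "app_db"},
 {keep_alive, true}].  % ping the server when the connection sits idle
```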

But if the server is down, then the pool should be restarted. Our idea was to crash early.
Don't know if you agree, but we could do a new lib with another/different pool handler approach if the use of this one doesn't fit.

silviucpp commented on August 18, 2024

Yes, I'm using keep alive, but the behaviour is odd in the following scenarios:

Scenario 1

  1. Start the pool
  2. Run some queries = everything ok
  3. Stop mysql server
  4. Run queries a few times - the entire pool is stopped (the supervisor dies because of the restart limits)
  5. Start mysql
  6. Run some queries - nothing works; the supervisor is already dead

In emysql the same scenario works as follows:
point 4 - all queries fail (which is OK because the server is down)
point 6 - everything goes back to normal
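
The restart-limit behaviour in scenario 1 can be reproduced without MySQL. The sketch below is only an illustration: the module name, the intensity/period values and the crashing worker are all hypothetical stand-ins, not poolboy's actual configuration. The worker dies shortly after each start, the way a connection process would when the server is down, and the supervisor gives up once the limit is exceeded.

```erlang
-module(restart_limit_demo).
-behaviour(supervisor).
-export([run/0, start_worker/0, init/1]).

%% Hypothetical worker: starts fine, then dies almost immediately,
%% standing in for a connection to a server that is down.
start_worker() ->
    Pid = spawn_link(fun() -> timer:sleep(10), exit(connection_lost) end),
    {ok, Pid}.

init([]) ->
    %% Allow at most 2 restarts within 5 seconds (illustrative values).
    SupFlags = #{strategy => one_for_one, intensity => 2, period => 5},
    Child = #{id => conn,
              start => {?MODULE, start_worker, []},
              restart => permanent},
    {ok, {SupFlags, [Child]}}.

%% Starts the supervisor and reports whether it died.
run() ->
    process_flag(trap_exit, true),
    {ok, Sup} = supervisor:start_link(?MODULE, []),
    receive
        {'EXIT', Sup, Reason} -> {supervisor_died, Reason}
    after 2000 ->
        still_running
    end.
```

Once the supervisor has exited, nothing restarts the workers when the server comes back, which matches point 6 above.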

Scenario 2:

  1. Start the pool
  2. Run some queries = everything ok
  3. Restart mysql server
  4. Run queries - if the pool has 20 connections, then 20 queries fail before things are OK again.

The same scenario in emysql: at point 4, once the server is up again, no more queries fail. It somehow detects that the server is up and restarts all pool connections; I'm not sure how.

This is what anyone would expect from a pool: to minimize the number of failures and to handle reconnection properly.

RaoH commented on August 18, 2024

Hmm, true that... I'm realising that we have a manager that keeps the pool active and refreshed. (We use this pool manager, but wrapped in our own supervisor.)

I'll look into another solution to solve this. I can't promise a "fix" or anything like that this week =)
But I'll work on it.

silviucpp commented on August 18, 2024

Great Raoul, the driver is pretty nice, clean and modern, and implements a lot of nice features not otherwise available in the Erlang community. I don't think there is any other driver in active development at this point.

Silviu

silviucpp commented on August 18, 2024

Hello Raoul, any idea when you will have time to look into this issue?

Silviu

RaoH commented on August 18, 2024

Sorry, all I've managed so far is the outline for better handling. Work got in the way and I have had to focus on getting stuff back in order. But at least I've started :)

ihiasi commented on August 18, 2024

Great to know that you're working on this, Raoul - I also need this feature.

zuiderkwast commented on August 18, 2024

Just trying to explain a little how things work. If the server shuts down nicely, it will send a TCP message that it is closing the connection. We do not detect this because our gen_tcp connection is in passive mode. Making it active (or {active, once}) would probably let us detect that the server has disconnected so that our connection process can exit and be restarted by its supervisor or pool. I'm not sure if switching between passive and active after every query is good though.
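
To illustrate the passive versus {active, once} point, here is a small self-contained sketch. It is not the driver's code: a local listening socket stands in for the MySQL server, and the module and function names are made up for the example. After the "server" side closes the connection, the client switches from passive mode to {active, once} and receives a tcp_closed message.

```erlang
-module(active_once_demo).
-export([run/0]).

run() ->
    Loopback = {127, 0, 0, 1},
    %% Local listener standing in for the MySQL server (port 0 = any free port).
    {ok, Listen} = gen_tcp:listen(0, [binary, {active, false}, {ip, Loopback}]),
    {ok, Port} = inet:port(Listen),
    %% The client starts in passive mode, like the driver's connection.
    {ok, Client} = gen_tcp:connect(Loopback, Port, [binary, {active, false}]),
    {ok, Server} = gen_tcp:accept(Listen, 1000),
    %% The "server" shuts down nicely and closes its end.
    ok = gen_tcp:close(Server),
    %% In passive mode no message arrives; {active, once} lets the
    %% owning process be told about the close.
    ok = inet:setopts(Client, [{active, once}]),
    Result = receive
                 {tcp_closed, Client} -> closed_detected
             after 1000 ->
                 not_detected
             end,
    ok = gen_tcp:close(Listen),
    Result.
```

A connection process that receives tcp_closed this way can simply exit and leave the restart to its supervisor or pool.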

Reconnecting using the same connection process (which I think is what emysql does) is not a very good idea. The connection process holds a lot of state, such as which prepared statements are prepared. MySQL session variables are also reset when reconnecting, etc. This state should be lost when reconnecting if we can't reconstruct it in some way.

The problem @silviucpp describes in scenario 1 (that the whole pool is terminated) is an issue of the pool, not the driver itself. That is how poolboy was designed. As @RaoH said, we have been using another "manager" process that regularly tries to restart the pool after some delay. Another solution would be to write a new pool library that can wait for a server to be alive again by trying to restart stuff at regular intervals or so.

silviucpp commented on August 18, 2024

Hello,

I think the main goal here is to minimize the number of queries that fail when the server is not reachable (because it was restarted, for example) or the connections are broken. From my tests, emysql is very good in this respect. If I run a query over a dead connection, I can see a log message indicating that the connection is dead and that the driver will reconnect, but my query never fails.

I think it is very hard to achieve this if you don't reconnect in the same process.

Silviu

ihiasi commented on August 18, 2024

Silviu, I agree with Viktor on this one - the state SHOULD be lost when reconnecting. When the connection is lost, the application will get an error message like "mysql_protocol, badmatch {error, closed} ..." that it needs to interpret in order to 1. retry the operation or 2. return something like "fatal server error, please retry" to the client. To minimize the number of failing queries when the server is not reachable, the pool should just abort or buffer new requests until the connection is re-established.

silviucpp commented on August 18, 2024

Hi Ionut,

In my opinion, if the pool retries every failed query (where the error indicates that the current connection was closed) at least once on a fresh connection (not on an existing one, which might be broken as well without the pool having noticed), it will probably cover 99% of the reconnection problems.
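
A minimal sketch of that retry-once idea, kept independent of any real pool. Everything here is an assumption made for illustration: the module name, the NewConn and Run funs, and the convention that a dead connection shows up as {error, closed} (a real connection process may instead exit, which the caller would have to catch).

```erlang
-module(retry_once).
-export([query/3]).

%% Conn    :: a connection checked out from the pool (hypothetical)
%% NewConn :: fun(() -> {ok, Conn} | {error, Reason}), opens a fresh connection
%% Run     :: fun((Conn) -> {ok, Result} | {error, Reason}), runs the query
query(Conn, NewConn, Run) ->
    case Run(Conn) of
        {error, closed} ->
            %% The pooled connection was dead: retry exactly once, and
            %% only on a brand new connection, never on another pooled one.
            case NewConn() of
                {ok, Fresh} -> Run(Fresh);
                {error, _} = Err -> Err
            end;
        Result ->
            Result
    end.
```

If the server is really down, NewConn fails and the error goes back to the caller, so nothing is queued or buffered.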

Usually in production you don't close the cluster and expect the driver to buffer all requests until the server is up again. Most of the time you restart the server or the LB in order to load some other configs, etc.

Silviu

ihiasi commented on August 18, 2024

Hi Silviu,

"something to retry all failed queries" means we need to buffer failed queries, right? That goes against not wanting to "expect the driver to buffer requests until the server is up again". Please clarify - my use case might be a little different from yours: if a request fails, oh well, we could report the error back to the client and they could choose to retry - but still, the state should be lost when reconnecting. What I'm suggesting is to bucket the connections by destination IP, so that if one of them fails, we restart and reinitialize all the other connections in the same bucket (i.e., connected to the same IP - I'm using mysql_otp in a Galera + MariaDB cluster). This avoids too many restarts per second AND minimizes the number of failed operations at the same time. Kinda like HAProxy.

RaoH commented on August 18, 2024

I've been kinda swamped at work at the moment, but I've done some research. Poolboy at its core is simple and doesn't take care of a process if it fails. We could change the wrapper to handle it, but I feel that kinda defeats the purpose of a simple pool handler.

But! I do see the need here for more robust handling.
So I've been looking into using https://github.com/seth/pooler, which handles crashes in a pool a lot better.
So what do you guys think? Does it meet the "spec" =)

For me it's just a matter of time, sadly. But I should have something whipped up next Friday (not promising).

silviucpp commented on August 18, 2024

Hello Ionut,

I'm thinking that this error handling should be something general not focused on specific scenarios.
Of course you can implement all this retry logic in your main app, but this would be a nightmare for developers, because everyone would have to implement almost the same logic in every app using this driver.

How I see this feature from a behaviour point of view (not from an implementation one) is exactly how emysql does it. I tested emysql a lot in these scenarios and couldn't find anything wrong with the behaviour it provides. But that project seems old and no longer under active development. This driver can surely be better.

  1. When the server is down, you receive all errors in your app, because nothing can be done. Queueing might work for some apps and not for others, so clients that need a queue can implement it in their own apps. I think most of them will not want to queue the queries.
  2. When a connection is broken (for different reasons: the server was restarted, network failures not detected by the driver, etc.) and you run a query, you can see something like "Connection lost. Reconnecting... " in their logs, and they retry that query on a new connection that replaces the existing one. For the driver's users this is transparent. How it's implemented, or whether the driver resets its state, are internal details. In my opinion they retry the query on a brand new connection and not on an existing one in the pool, because all existing connections might be broken for the same reason as the one you tried to use. I think it's pretty hard to detect in real time when a connection is broken without sending pings too often.

Silviu

ihiasi commented on August 18, 2024

Raoul, thanks for the quick reply - I was looking into episcina (https://github.com/erlware/episcina). Will look into seth/pooler as well and let you know :).

RaoH commented on August 18, 2024

Ah! That's an option as well!

zuiderkwast commented on August 18, 2024

Here are my three cents:

  • The driver is simple and should lose its state if the connection goes down.
  • The poolboy wrapper is a simple wrapper which should be seen as a working example.
  • If you need something else, feel free to create another pool wrapper. I wouldn't mind adding more pool wrappers and similar projects under the mysql-otp organisation and/or adding links to such in the documentation for the driver.

Personally I feel that any retry logic should be in the application for maximum control but I also see that a retrying pool would be useful. If we retry a failing query, we should also retry an entire transaction if the connection is broken. A way to retry failing transactions would be to add a try-catch in the transaction/2,3,4 and in the with/2 functions in the pool. Then the logic in the transaction would be retried including any other side effects that the passed function may cause. This may or may not have to be a mysql-otp specific pool...

sngyai commented on August 18, 2024

Has it been solved?

ihiasi commented on August 18, 2024

In my case, it doesn't apply anymore: we run MaxScale on the Erlang machines to load balance access to a MariaDB cluster (so the MySQL connection goes to 127.0.0.1:3306), and if the connection disappears, we consider it fatal, etc.

silviucpp commented on August 18, 2024

I fixed these kinds of issues in https://github.com/silviucpp/mysql_pool (it's another pool project for mysql-otp, based on pooler).

Silviu

GeraldXv commented on August 18, 2024

Here is my solution, adding a "proxy" worker for poolboy:

```erlang
-module(mysql_worker).

-behaviour(poolboy_worker).

%% API
-export([start_link/1]).

%% Keep trying to connect, once per second, until mysqld is reachable.
start_link(Args) ->
    case mysql:start_link(Args) of
        {ok, Pid} ->
            {ok, Pid};
        {error, Error} ->
            lager:error("start mysql failed, reason: ~p~n", [Error]),
            timer:sleep(1000),
            start_link(Args)
    end.
```

When the mysql gen_server receives a tcp_closed, it stops itself. Poolboy then tries to restart the "worker", which calls start_link again and again until mysqld has started.

zuiderkwast commented on August 18, 2024

That's a good solution, @GeraldXv!

Maybe add some warning in the error log, so that the user actually understands what happens (perhaps only the first time so the log doesn't get flooded). Otherwise, the pool just appears to freeze.
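
A rough sketch of the "warn only the first time" idea. It is not part of mysql-otp-poolboy: the module name, the StartFun argument (which in the worker above would wrap mysql:start_link(Args)) and the delay parameter are assumptions for the example, and error_logger stands in for whatever logger the application uses.

```erlang
-module(retry_start).
-export([start_link/2]).

%% StartFun :: fun(() -> {ok, pid()} | {error, term()})
%% DelayMs  :: pause between attempts, in milliseconds
start_link(StartFun, DelayMs) ->
    loop(StartFun, DelayMs, 1).

loop(StartFun, DelayMs, Attempt) ->
    case StartFun() of
        {ok, Pid} ->
            {ok, Pid};
        {error, Reason} ->
            %% Log only on the first failure so the log is not flooded
            %% while the server stays down.
            case Attempt of
                1 ->
                    error_logger:warning_msg(
                      "mysql start failed (~p), retrying every ~p ms~n",
                      [Reason, DelayMs]);
                _ ->
                    ok
            end,
            timer:sleep(DelayMs),
            loop(StartFun, DelayMs, Attempt + 1)
    end.
```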

(We did something similar but we did it as a wrapper of the mysql-otp-poolboy itself, i.e. when the whole pool died, we tried to restart it at regular intervals.)

zuiderkwast commented on August 18, 2024

The original issue, detecting connection close, was solved in the driver years ago, so I'm closing this issue.
