Giter Site home page Giter Site logo

Threading Support about tufao HOT 66 CLOSED

vinipsmaker avatar vinipsmaker commented on September 27, 2024
Threading Support

from tufao.

Comments (66)

vinipsmaker avatar vinipsmaker commented on September 27, 2024

It would be nice to have a ThreadedHttpPluginServer, so the server can scale up when lots of requests come in at the same time.

Threads are the one feature that could make Tufão better than other frameworks, imho. I want to integrate threads support in Tufão, but I want to do it right.

Can you share some design ideas? Then we could iterate until we reach a final proposal.

Here are some notes:

  • I think threads support shouldn't be transparent. Transparent threads support requires implicit synchronization. Synchronization doesn't scale well. The user should be aware of threads support and enable it explicitly.
  • I've put some effort in design to allow share a large part of the implementation between HttpServer and HttpsServer. Thanks to that, the HttpServer became flexible and the user could tune its behaviour.
  • I wanna some design that fits well in current Tufão structure. Something that you can combine with HttpServer and HttpServerRequestRouter.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Well i'm still figuring out how the code works but here are some initial ideas:

  • I think we should not touch HttpServer or HttpsServer to implement that, the server can still run in the main
    thread and accept the requests, but then forwards to a Handler that uses Threads internally to handle
    them. However this would require the chain of handlers to work in a async way since we can not wait
    if a handler really took care of a request because this would block the main thread then.
  • Alternatively the Chain of request handlers need to be handled INSIDE the thread itself. But that would not fit
    nicely into how the library works atm.
  • I already thought about a way to have the user decide if he wants to handle requests threaded or non threaded,
    however if he chooses threads we may need to do some internal synchronization (for example sessiondata needs
    to be read by multiple threads at the same time, requests needs to be enqueued and dequeued)
  • A Pool of already running threads should be the best choice, the user can decide how big the pool is. After
    a request is finished the thread goes back to sleep. Using QWaitCondition we can control and wakeup the
    threads. I'm not shure yet if every thread should have its own copy of a request handler or just reenter the
    handler if needed. The code of the handler needs to be reentrant in either way.

Maybe we can even make it as easy as:

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);
    ThreadedHttpPluginServer plugins{"routes.json"};
    AsyncHttpServerRequestRouter router{
        {QRegularExpression{""}, plugins},
        {QRegularExpression{""}, HttpFileServer::handler("public")},
        {QRegularExpression{""}, NotFoundHandler::handler()}
    };
    HttpServer server;
    QObject::connect(&server, &HttpServer::requestReady,
                     &router, &AsyncHttpServerRequestRouter::handleRequest);
    server.listen(QHostAddress::Any, 8080);
    return a.exec();
}

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

I already thought about a way to have the user decide if he wants to handle requests threaded or non threaded, however if he chooses threads we may need to do some internal synchronization (for example sessiondata needs to be read by multiple threads at the same time, requests needs to be enqueued and dequeued)

If the user wants the best performance, he wants to use threads and he wants only the minimum level of synchronization.

If the idea is to use transparent thread support, it doesn't even make sense to create an interface. Just put everything in HttpServer and add tuneable options in configure/build time.

A Pool of already running threads should be the best choice, the user can decide how big the pool is. After a request is finished the thread goes back to sleep. Using QWaitCondition we can control and wakeup the threads. I'm not shure yet if every thread should have its own copy of a request handler or just reenter the handler if needed. The code of the handler needs to be reentrant in either way.

There is no need for a pool. We can use the same design of Monkey HTTP Daemon project. One thread per core (user tunable value), one event loop per thread, multiple handlers per event loop. User handle design and synchronization of his code (using mutexes, multiple copies of data, whatever...).

[code sample]

I like the simplicity, it might be a good start.

Does ThreadedHttpPluginServer imply that only plugins could be threaded?

from tufao.

bzeller avatar bzeller commented on September 27, 2024

If the user wants the best performance, he wants to use threads and he
wants only the minimum level of synchronization.
Agreed

If the idea is to use transparent thread support, it doesn't even make
sense to create an interface. Just put everything in HttpServer and add
tuneable options in configure/build time.

But HttpServer just emits a signal when a new request is available,
i think the threading code should be outside of HttpServer and just take
the incoming signal and dispatch the request to one of the threads.

There is no need for a pool. We can use the same design of Monkey HTTP
Daemon project http://monkey-project.com/. One thread per core (user
tunable value), one event loop per thread, multiple handlers per event
loop. User handle design and synchronization of his code (using mutexes,
multiple copies of data, whatever...).

I disagree on that, imagine you have 4 Cores, the user writes code that
pulls data from the database or does some other heavy work where the
thread is blocked, after only 4 requests like that the server will just
not handle requests until one of the threads is free again.
Also one core can handle more than one thread easily because of sleep
time on systemcalls and stuff like that.

Think about it, there are only a few reasons why you want code to run
on the server and not on the client in JS:

  • You want to store data (Will block on the SQL call or systemcall for
    a file based datastorage)
  • You want to restore data (Will block on the SQL call or systemcall for
    a file based datastorage)
  • Heavy work or calculations

I would say every thread has the same handlers as the others, that means
every thread can handle all requests, so the dispatcher code does not
need to think about that.
However a user tuneable size of the pool would make both approaches
possible:

  threadDispatcher->setPoolSize(QThread::idealThreadCount());

This threads are purely for handling lots of incoming requests and
waiting for them to be finished without blocking the main loop, the user
should not write code that communicates between those threads. If the
user needs threads for whatever reason he needs to start and handle
them by himself.

[code sample]

I like the simplicity, it might be a good start.

Does |ThreadedHttpPluginServer| imply that only plugins could be threaded?

No, of course not but atm it seems to be the logical choice. Maybe the
HttpServerRequestRouter is a better choice to put the thread handling
code? We need a point where we dispatch the requests to the handlers
inside the threads, what would you say is the most logical place in
your code?

The user of course has to be aware of using threads, so he can make
the right design choices. For example he has to be aware of that the
handler function needs to be reentrant and could be called multiple
times from different threads (same requirement for Java Servlets).
That means inside the AbstractHttpServerRequestHandler subclass there
should be no member variables used (if we use the same instance for
all threads).
Also that its not guaranteed that a thread will be called again for
the same session, which means only the SessionStorage is the right place
to store data between requests..

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

I see this like an Producer Consumer Problem!

And I thing to solve that we need:
HttpServer let as is.
When signal was emitted another class will add in a queue the request and increment the semaphore (producer);
And the each thread when free try to acquire one request to handler (the consumers);

And another class will handler how many threads are to be alive:

  • creating one when queue are long;
  • destroying one when takes to much time to be used (idle);
    this need be intelligent to don't destroy and create many times
    Or create X threads for eternity, but this can imply more resources
    needed like databases connections, them I prefer the intelligent creation.

https://en.wikipedia.org/wiki/Producer–consumer_problem

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.

Em 30/07/2013, às 22:43, Vinícius dos Santos Oliveira [email protected] escreveu:

I already thought about a way to have the user decide if he wants to handle requests threaded or non threaded, however if he chooses threads we may need to do some internal synchronization (for example sessiondata needs to be read by multiple threads at the same time, requests needs to be enqueued and dequeued)

If the user wants the best performance, he wants to use threads and he wants only the minimum level of synchronization.

If the idea is to use transparent thread support, it doesn't even make sense to create an interface. Just put everything in HttpServer and add tuneable options in configure/build time.

A Pool of already running threads should be the best choice, the user can decide how big the pool is. After a request is finished the thread goes back to sleep. Using QWaitCondition we can control and wakeup the threads. I'm not shure yet if every thread should have its own copy of a request handler or just reenter the handler if needed. The code of the handler needs to be reentrant in either way.

There is no need for a pool. We can use the same design of Monkey HTTP Daemon project. One thread per core (user tunable value), one event loop per thread, multiple handlers per event loop. User handle design and synchronization of his code (using mutexes, multiple copies of data, whatever...).

[code sample]

I like the simplicity, it might be a good start.

Does ThreadedHttpPluginServer imply that only plugins could be threaded?


Reply to this email directly or view it on GitHub.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

I see this like an Producer Consumer Problem!

And I thing to solve that we need:
HttpServer let as is.
When signal was emitted another class will add in a queue the request
and increment the semaphore (producer);
And the each thread when free try to acquire one request to handler (the
consumers);

Yes exactly. Sadly QSemaphore does not work that way, you can not
increment it, you have to tell it beforehand how much resources are
available:

QSemaphore sem(5); // sem.available() == 5

sem.acquire(3); // sem.available() == 2
sem.acquire(2); // sem.available() == 0
sem.release(5); // sem.available() == 5
sem.release(5); // sem.available() == 10

sem.tryAcquire(1); // sem.available() == 9, returns true
sem.tryAcquire(250); // sem.available() == 9, returns false

But Producer-Consumer is the right way to go, i would do it like that
(Pseudocode):

Somewhere in the Main Loop:

 Request* req = httpserver->nextRequest();
 mutex.lock();
 pendingRequests.append(req);

 int requests = pendingRequests.size();
 while(requests > 0){
   waitCond.wakeOne();
   requests--;
 }
 mutex.unlock();

In the Threads main function:

forever{
mutex.lock();
waitCond.wait(&mutex);

 Request *req = pendingRequests.takeFirst();
 mutex.unlock();

 //handle the request
 ....

 //tell the dispatcher we are done
 qApp->postEvent(dispatcher,new RequestFinishedEvent(req));
}

And another class will handler how many threads are to be alive:

  • creating one when queue are long;
  • destroying one when takes to much time to be used (idle);
    this need be intelligent to don't destroy and create many times
    Or create X threads for eternity, but this can imply more resources
    needed like databases connections, them I prefer the intelligent creation.

Releasing and spawning threads can be expensive, but if you can come
up with a intelligent algorithm, why not. It needs to spawn threads on
high loads but not free them before the high load is gone for some time.
But imho we can implement this when the basic threading code works,
first we should get the basic things to work.

I also thought about things like Database Connections, creating them
really can be expensive too.

https://en.wikipedia.org/wiki/Producer–consumer_problem

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.

Em 30/07/2013, às 22:43, Vinícius dos Santos Oliveira
[email protected] escreveu:

I already thought about a way to have the user decide if he wants to
handle requests threaded or non threaded, however if he chooses threads
we may need to do some internal synchronization (for example sessiondata
needs to be read by multiple threads at the same time, requests needs to
be enqueued and dequeued)

If the user wants the best performance, he wants to use threads and
he wants only the minimum level of synchronization.

If the idea is to use transparent thread support, it doesn't even
make sense to create an interface. Just put everything in HttpServer and
add tuneable options in configure/build time.

A Pool of already running threads should be the best choice, the user
can decide how big the pool is. After a request is finished the thread
goes back to sleep. Using QWaitCondition we can control and wakeup the
threads. I'm not shure yet if every thread should have its own copy of a
request handler or just reenter the handler if needed. The code of the
handler needs to be reentrant in either way.

There is no need for a pool. We can use the same design of Monkey
HTTP Daemon project. One thread per core (user tunable value), one event
loop per thread, multiple handlers per event loop. User handle design
and synchronization of his code (using mutexes, multiple copies of data,
whatever...).

[code sample]

I like the simplicity, it might be a good start.

Does ThreadedHttpPluginServer imply that only plugins could be threaded?


Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

I disagree on that, imagine you have 4 Cores, the user writes code that pulls data from the database or does some other heavy work where the thread is blocked, after only 4 requests like that the server will just not handle requests until one of the threads is free again. Also one core can handle more than one thread easily because of sleep time on systemcalls and stuff like that.

Think about it, there are only a few reasons why you want code to run on the server and not on the client in JS:

  • You want to store data (Will block on the SQL call or systemcall for a file based datastorage)
  • You want to restore data (Will block on the SQL call or systemcall for a file based datastorage)
  • Heavy work or calculations I would say every thread has the same handlers as the others, that means every thread can handle all requests, so the dispatcher code does not need to think about that.

Well, database connections are usually sockets and can be made async, but I see your point. Qt doesn't have a strong support for async operations. The Tufao::HttpFileServer itself block on system calls. Well, at least some options were discarded and we are close to the right design for Tufão.

This threads are purely for handling lots of incoming requests and waiting for them to be finished without blocking the main loop, the user should not write code that communicates between those threads. If the user needs threads for whatever reason he needs to start and handle them by himself.

There is no need for some code wait for incoming requests to be finished. The handlers receive the {input/request, output/response} pair and the work doesn't need to come back to some main class or alike.

Does |ThreadedHttpPluginServer| imply that only plugins could be threaded?

No, of course not but atm it seems to be the logical choice. Maybe the HttpServerRequestRouter is a better choice to put the thread handling code?

This change would propagate threads support to everything Tufão offers. It's a good option. I need to think more about it.

We need a point where we dispatch the requests to the handlers inside the threads, what would you say is the most logical place in your code?

If we dispatch the requests, we'll have a lot of problems to move objects between threads and transfer of ownership and responsibility among Tufão objects. If we don't intend to transfer ownership, then we'd need to give resources back to a central place or alike.

HttpServer has the problem of being intrusive. You call listen and it get the full port to itself. No way to pass socket descriptors to him. Maybe it'd be better to create a middleware-like worker that accepts sockets descriptors and provide the HttpServer-like interface. Maybe change HttpServer to accept socket descriptors and just ignore listen method. The socket descriptors idea is to kill the need for complex communications.

I'll need some time to come up with a design.

Some places to look:

The user of course has to be aware of using threads, so he can make the right design choices. For example he has to be aware of that the handler function needs to be reentrant and could be called multiple times from different threads (same requirement for Java Servlets). That means inside the AbstractHttpServerRequestHandler subclass there should be no member variables used (if we use the same instance for all threads).

Maybe we can use a factory of handlers, then each thread would have its own handlers.

Also that its not guaranteed that a thread will be called again for the same session, which means only the SessionStorage is the right place to store data between requests..

This is a good concern and we should put it in the documentation when we are done with the implementation.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 31.07.2013 20:16, Vinícius dos Santos Oliveira wrote:

I disagree on that, imagine you have 4 Cores, the user writes code
that pulls data from the database or does some other heavy work
where the thread is blocked, after only 4 requests like that the
server will just not handle requests until one of the threads is
free again. Also one core can handle more than one thread easily
because of sleep time on systemcalls and stuff like that.

Think about it, there are only a few reasons why you want code to
run on the server and not on the client in JS:

  * You want to store data (Will block on the SQL call or systemcall
    for a file based datastorage)
  * You want to restore data (Will block on the SQL call or
    systemcall for a file based datastorage)
  * Heavy work or calculations I would say every thread has the same
    handlers as the others, that means every thread can handle all
    requests, so the dispatcher code does not need to think about that.

Well, database connections are usually sockets and can be made async,
but I see your point. Qt doesn't have a strong support for async
operations. The |Tufao::HttpFileServer| itself block on system calls.
Well, at least some options were discarded and we are close to the right
design for Tufão.

Yeah sadly Qt lacks async database support. Anyway heavy computing still
would be an issue.

This threads are purely for handling lots of incoming requests and
waiting for them to be finished without blocking the main loop, the
user should not write code that communicates between those threads.
If the user needs threads for whatever reason he needs to start and
handle them by himself.

There is no need for some code wait for incoming requests to be
finished. The handlers receive the {input/request, output/response} pair
and the work doesn't need to come back to some main class or alike.

True but what with the chain of handlers? They need to know if a handler
really handled the request or not.

    Does |ThreadedHttpPluginServer| imply that only plugins could be
    threaded?

No, of course not but atm it seems to be the logical choice. Maybe
the HttpServerRequestRouter is a better choice to put the thread
handling code?

This change would propagate threads support to everything Tufão offers.
It's a good option. I need to think more about it.

We need a point where we dispatch the requests to the handlers
inside the threads, what would you say is the most logical place in
your code?

If we dispatch the requests, we'll have a lot of problems to move
objects between threads and transfer of ownership and responsibility
among Tufão objects. If we don't intend to transfer ownership, then we'd
need to give resources back to a central place or alike.

Or we just don't pass objects but give the user a flat object that can
contain everything we need:

setBody();
setHeader();

but no access to the sockets and then just write all the stuff to the
socket in the main thread. But i think maybe calling moveToThread for
our objects is the better way to go.
Question is how many QObjects are in a Request object.

HttpServer has the problem of being intrusive. You call |listen| and it
get the full port to itself. No way to pass socket descriptors to him.
Maybe it'd be better to create a middleware-like worker that accepts
sockets descriptors and provide the HttpServer-like interface. Maybe
change HttpServer to accept socket descriptors and just ignore listen
method. The socket descriptors idea is to kill the need for complex
communications.

I'll need some time to come up with a design.

Some places to look:

Maybe we can use a factory of handlers, then each thread would have its
own handlers.

Yeah good idea! This makes it much easier for the user. You can then
store request local data inside the handler and don't need to think if
another thread uses it. But i think for a plugin based handler that
means when plugins change we have to stop all threads to reload them.

Also that its not guaranteed that a thread will be called again for
the same session, which means only the SessionStorage is the right
place to store data between requests..

This is a good concern and we should put it in the documentation when we
are done with the implementation.
Same requirement for Java Servlets btw ;).

I already created a copy of the repository, i will have some time on my
hands tomorrow and could start to hack something. Atm i think we should
create a ThreadedRequestRouter. The router spawns threads and every
thread creates all request handlers that are registered to the router.
Not shure if that is compatible with the plugin based server but maybe
it is with some small changes.


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

bzeller avatar bzeller commented on September 27, 2024

I'm currently looking into what needs to be changed to make it work,
i found some things i just wanted to ask back:

  1. Parent ownership:
    Currently the Request and Response Object are owned by the HttpServer, when we need to move
    it, it might be better if the ownership is different:
    • The Request itself should have no owner at all (moveToThread does not allow that)
    • The Response object should maybe be owned by Request (it is directly related to the request obj)
    • All internally used QObjects should be owned by the surrounding object
      e.g. the Socket should be owned by the Request object instead of the HttpServer.
      That way they will be moved automatically, because all children of a QObject are moved.
    • All QObjects internally used should be pointers (afaik it is mentioned somewhere that QObject should
      normally be pointers), like the QTimer object also needs to be owned by the parent to get moved properly.
      And when the parent gets out of scope (object is deleted), it will try to delete all children which will crash
      because you can not delete a non pointer object.
  2. The Plugin loader does not call unload() explicit, according to the docs the Destructor of QPluginLoader does
    not automatically unload the lib. (not related to threading)
  3. I'm not shure yet if we can have a eventloop inside the threads, if we want to use QWaitCondition to handle
    them then its maybe not possible. Question is do we really need a eventloop. Maybe we can enter and exit the
    eventloop when we start or finish the thread, i have to think more about that.
  4. Copy by ref: Why do you use copy by reference for the request objects? Ok we can get the pointer from them
    using & to move them to the thread, and then make them references again. Is it just for usability? Or are their
    other reasons (just curious)

from tufao.

bzeller avatar bzeller commented on September 27, 2024

5 . The Upgrade handler is problematic, if the client sends a upgrade after the Request was moved to the thread,
we have a problem, because the Upgrade code runs in the main thread, but the Request belongs to the
worker thread....
Maybe disconnecting the upgrade signal and handling it inside the worker thread could work? Maybe with
passing the current upgrade handler with the request to the worker thread? Or implement a thread safe
getCurrentUpgradeHandler() function.

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

I don't have much time available now, but I want contribute, currently with ideas or brainstorming.

Just for reference I have my own httpserver (https://gitorious.org/qthttpserver) Based on Stefan Frings work (http://stefanfrings.de/qtwebapp/index-en.html)

He used an different approach in multi-thread support:
The http server receive a socket number and send to the thread pooler to send this to a empty thread or create a new one.

Maybe taking a look you help to decide the best design for Tufão.

In fact, I do an benchmark test between booth codes and Tufão are the winner
then I want to use Tufão but for this multi thread is an requirement.

Whatever, I here to contribute with Tufão!

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 01/08/2013, às 06:14, bzeller [email protected] escreveu:

I'm currently looking into what needs to be changed to make it work,
i found some things i just wanted to ask back:

Parent ownership: Currently the Request and Response Object are owned by the HttpServer, when we need to move it, it might be better if the ownership is different:
The Request itself should have no owner at all (moveToThread does not allow that)
The Response object should maybe be owned by Request (it is directly related to the request obj)
All internally used QObjects should be owned by the surrounding object e.g. the Socket should be owned by the Request object instead of the HttpServer. That way they will be moved automatically, because all children of a QObject are moved.
All QObjects internally used should be pointers (afaik it is mentioned somewhere that QObject should normally be pointers), like the QTimer object also needs to be owned by the parent to get moved properly. And when the parent gets out of scope (object is deleted), it will try to delete all children which will crash because you can not delete a non pointer object.
The Plugin loader does not call unload() explicit, according to the docs the Destructor of QPluginLoader does not automatically unload the lib.
I'm not shure yet if we can have a eventloop inside the threads, if we want to use QWaitCondition to handle them then its maybe not possible. Question is do we really need a eventloop. Maybe we can enter and exit the eventloop when we start or finish the thread, i have to think more about that.
Copy by ref: Why do you use copy by reference for the request objects? Ok we can get the pointer from them using & to move them to the thread, and then make them references again. Is it just for usability? Or are their other reasons (just curious)

Reply to this email directly or view it on GitHub.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

@bzeller:

1

Good remark.

The important characteristic to keep is: No memory leaks and memory management should be easy to the user.

2

Yeah, sounds like a bug. I'll fix it later.

Thanks for the report.

3

@israelins85 has more knowledge about event loops. He did a wizard job to make an easy body parser function. I don't remember where the code is.

4

C++ cool style. I was using too much Qt-style in Tufão 0.x, but Qt is old and not always right. Pass pointers when bare references are enough is more confusing. You have an explicit intent (I'm giving you a reference, but I'm not transferring ownership, giving guarantees that you can use them later or whatever).

5

Maybe disconnecting the upgrade signal and handling it inside the worker thread could work?

Handle in thread A and call the handler from thread B? It's okay if the user is aware (documentation), but I think it should be consistent/uniform with whatever we choose to handle requests.

Maybe with passing the current upgrade handler with the request to the worker thread?

Almost same as above. It's safer/easier.

Or implement a thread safe getCurrentUpgradeHandler() function.

I think this solution is complex and we should avoid unless needed. Maybe I'm wrong and I'm not seeing how this would be done.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

@israelins85:

I don't have much time available now, but I want contribute, currently with ideas or brainstorming.

Same as me.
^^

He used an different approach in multi-thread support:

The http server receive a socket number and send to the thread pooler to send this to a empty thread or create a new one.

Maybe taking a look you help to decide the best design for Tufão.

It may be a good idea to look this code for inspiration.

In fact, I do an benchmark test between booth codes and Tufão are the winner then I want to use Tufão but for this multi thread is an requirement.

It's difficult to run an accurate, objective and useful benchmark. I was delaying this task until I have all features I wanted.

But it's kind of interesting to have this information now. I myself liked the design of this other project and the major problem was related to license.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Ok i gave it some thought and played around with the code a bit.
I maybe have a solution and want to know what you guys think about it.

I think the main thread should just accept the connection and dispatch it, everything
else should be done by the thread like @israelins85 suggested. But that needs some
refactoring.

First refactor the Http handling code out of HttpServer and HttpsServer and put it inside
a HttpConnectionHandler and HttpsConnectionHandler. HttpServer and HttpsServer internally
will use that class so we don't break source compatibility to existing solutions.

The threaded solution will look something like that:

int main (int argc, char* argv[])
{
    auto intializer = []{
        //The connection handler the thread will use to pass the connections to
        HttpConnectionHandler* hdl = new HttpConnectionHandler();

       //Requests handlers, created on the heap and children of the HttpConnectionHandler
       //The thread will clean up the handler and with it everything thats a child of it
        HttpPluginServer *plugins = new HttpPluginServer({"routes.json"},hld);

        HttpServerRequestRouter *router = new HttpServerRequestRouter({
            {QRegularExpression{""}, plugins},
            {QRegularExpression{""}, HttpFileServer::handler("public")},
            {QRegularExpression{""}, NotFoundHandler::handler()}
        },hdl);

        QObject::connect(hdl, &HttpConnectionHandler::requestReady,
            router, &HttpServerRequestRouter::handleRequest);

        return hdl;
    };

    //The threaded connection dispatcher, it will schedule the threads and forward the 
    //connections to the different threads, replaces HttpServer for threaded approach.
    ThreadedHttpConnectionDispatcher dispatch;
    dispatch.setThreadInitializer(initializer);

    dispatch.listen(QHostAddress::Any, 8080);

    return a.exec();
}

I think this is the safest solution, because we don't need to worry about QObject thread affinity and
upgrade handlers. Everything will be in the right thread for a request.
The thread will run as long as the socket descriptor is open, once it is closed the thread goes
leaves the eventloop and goes back to sleep to pick up the next request.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

The threaded solution will look something like that [...]

I liked the interface.

I wanna read more about the internal architecture. Will be there a thread pool like you initially proposed? If so, how will it work?

I may have more time to look into this issue in the weekend and I might be able to propose how HttpsServer could be integrated in this design.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Sorry for the flood, the comment button gone crazy.
=(

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Well HttpsServer is already integrated into the design, just replace HttpConnetionHandler with
HttpsConnectionHandler.

The ConnectionHandlers are like Http and HttpsServer just without the QTcpServer code. The only difference
is they will have a:

AbstractHttpConnectionHandler::handleNewConnection(int socketDesc);

function.

Internally there will be used a threadpool for now. Maybe we can introduce a algorithm later that grows
and shrinks the pool on demand like @israelins85 suggested. But for now i want to keep it simple and make it
work first.
The ThreadedHttpConnectionDispatcher will accept the connections, handle the threadpool and forwards the
connection handling to the threads. Nothing more, small and needs only a Mutex for controlling the threads
(Maybe a per thread Mutex is better, i can not decide yet, i have to try it)
The workerthreads sleep on a QWaitCondition, when a new request is available the dispatcher puts it in the queue
and calls wakeOne() on the QWaitConidtion. The thread takes out the request and starts to handle it. It will call
handleRequest() and after the call enter the eventloop (makes async solutions work) and leaves it when the connection is closed. Once it is finished it goes back to sleep.
I think that solution should even make websockets work, the only bad thing with this is, it will use the thread
as long as the websocket is open.
For really long running connections we could run out of threads, then we need the growing threadpool feature or
a different solution (like give the user a way to detach the thread from the pool, so he is in control what happens). Otoh the threaded solution is maybe not first choice for websockets based programming.

I have some time in the next days and maybe can hack out a first solution in my own repos and create a pull
request.

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

In my project every socket are exactly that way:

  • The HttpServer only listen the sockets, and notifies when arrives.
  • The Poller create or re-use a thread (called connection handler),
    that parses the data on socket and call handler when request is complete;
  • The handler NEED be thread-safe;
  • When connection are responded to free the thread for another request I close the socket.

The problem I see with this is:
The "keep alive" optimization is lost, because I can't have 200 or 300 threads connecting to the database.

The prefer stay like that are the sockets are maintained on main thread, and encapsulated in a thread safe way to be
passed to threads handler, and stay opened for a while to next request be more fast.

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 07/08/2013, às 02:52, Vinícius dos Santos Oliveira [email protected] escreveu:

Sorry for the flood, the comment button gone crazy.
=(


Reply to this email directly or view it on GitHub.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

@israelins85 Do you have a solution for the Upgrade handler? Or can a upgrade only occur BEFORE a request is
created? The problem with the upgrade handler is, its in the main thread, but our requests are handled in the
workers. If a upgrade happens somewhere in between when its handled that would most likely break.
If its guaranteed that the upgrade handler is only called before the HttpServer emits the requestReady signal we are
fine to go.

Also if we push Requests to the threads we need to use moveToThread().

Another problem: with none of both approaches you can implement something like a server push, where the request is kept open to simulate a socket connection. Same problem for websockets btw.
We need a threadsafe way to pause and restart a request when we want to send data to the client.

Design question: One handler instance for ALL threads? Or one handler instance PER thread?
One for all: less memory consumption, but requires the user to make the handler threadsafe.
One per thread: more memory, but takes a lot of burden from the user to make the handler threadsafe and makes programming easier (less need for mutexes, threadlocal data and stuff like that)

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Or can a upgrade only occur BEFORE a request is created?

The upgrade interface exported via HttpServer creates an HttpServerRequest, but it only exposes this object when a request is ready OR an upgrade is requested. It's kind of before. See here for more detail.

The problem with the upgrade handler is, its in the main thread, but our requests are handled in the workers. If a upgrade happens somewhere in between when its handled that would most likely break. If its guaranteed that the upgrade handler is only called before the HttpServer emits the requestReady signal we are fine to go.

requestReady OR upgrade is emitted.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 07.08.2013 19:58, Vinícius dos Santos Oliveira wrote:

Or can a upgrade only occur BEFORE a request is created?

The upgrade interface exported via |HttpServer| creates an
HttpServerRequest, but it only exposes this object when a request is
ready OR an upgrade is requested. It's kind of /before/. See here
https://github.com/vinipsmaker/tufao/blob/master/src/httpserver.cpp#L100
for more detail.
OK that means upgrade and request handling are totally different things.
IF a upgrade happens its BEFORE we pass the request to the handlers.
That means we can forget about the request handler being threadsafe.

The problem with the upgrade handler is, its in the main thread, but
our requests are handled in the workers. If a upgrade happens
somewhere in between when its handled that would most likely break.
If its guaranteed that the upgrade handler is only called before the
HttpServer emits the requestReady signal we are fine to go.

|requestReady| OR |upgrade| is emitted.
Means a Request that is emitted by upgrade will never emit requestReady.
Good to know!


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Means a Request that is emitted by upgrade will never emit requestReady. Good to know!

One last thing: The event sequence requestReady, requestReady and upgrade for the same connection/HttpServerRequest object is possible. But it's true that after an upgrade, no more requestReady will be emitted.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Lets say, we use a approach where we push the Request objects to the
threads (not the socketdescriptors), and wait until end() is emitted,
then push the Request object back to the main thread. That means we can
still use the keep-alive mechanism.

However this will change the code a bit:

int main (int argc, char* argv[])
{
    //thread initializer now returns a RequestHandler
    auto intializer = []{

       //Requests handlers, created on the heap and children of the returned RequestHandler
       //The thread will clean up the handler and with it everything thats a child of it
        HttpPluginServer *plugins = new HttpPluginServer({"routes.json"});

        HttpServerRequestRouter *router = new HttpServerRequestRouter({
            {QRegularExpression{""}, plugins},
            {QRegularExpression{""}, HttpFileServer::handler("public")},
            {QRegularExpression{""}, NotFoundHandler::handler()}
        });

        plugins->setParent(router);

        return router;
    };

    HttpServer server;

    //just a normal AbstractHttpServerRequestHandler
    ThreadedHttpRequestDispatcher dispatch;
    dispatch.setThreadInitializer(initializer);

    QObject::connect(&server, &HttpServer::requestReady,
                  &dispatch,&ThreadedHttpRequestDispatcher::handleRequest);

    server.listen(QHostAddress::Any, 8080);

    return a.exec();
}

This solutions gives the user the possibility to handle some requests with threads
and some not (like requests that would open a server push channel)

Do you have a example for a Http-Server-Push mechanism implemented with
tufao? This works a bit different because we have long running requests.
But i think for something like that using threads to handle those
requests is wrong. This requests just wait until events are available
to be send to the client. Imho the user could have a own thread for
this connections and handle it differently.

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

I guess that are the best way!

Previously you comment about one handler per thread or a handler per thread.

Make an assumption: If user is using Tufão threaded they known how thread works!

Them a one handler is better to user less memory.
A Handle is not a class to have attributes.
I see 3 types of attributes on Tufão handler:
1- Global;
2- Per request;
3- Per thread;

1- If user whats they have to known how protect attributes between threads simultaneous access;
2 - Is Session data (cookies etc) have to be handled by a SessionStore class, that I have done in my HttpServer;
3- User HAVE to know how threads works, if they don't know they will continue using Tufão as now;

To put fire in that BrainStorm
Another way is used by PostgreSQL Daemon and Apache: They have an instance of process to each request.
I don't know how works in under level. But I guess the Listener program receives the request and create/re-use a process.
That way the handler and listener process are independent, if a handler aborts, just one request are lost;

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 07/08/2013, às 15:44, Vinícius dos Santos Oliveira [email protected] escreveu:

Means a Request that is emitted by upgrade will never emit requestReady. Good to know!

One last thing: The event sequence requestReady, requestReady and upgrade for the same connection/HttpServerRequest object is possible. But it's true that after an upgrade, no more requestReady will be emitted.


Reply to this email directly or view it on GitHub.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

I though about that one handler for all thing:
There comes one big problem into mind: Thread affinity of QObjects!
If the user want to do asynchronous request handling he is doomed.
And lots of our handler are already QObjects, like HttpServerRequestRouter or
HttpPluginServer.
Or imagine signal/slots, if the user uses some sort of signals on his objects and
the QObject lives in another thread. That would require event processing and
Qt::QueuedConnection.
Also the user can not have children of the handler object, because of different
thread affinity.
That sounds like a bit too much to think about for the user, i think handler per thread
is better. Its not that much more memory and even on an embedded platform that should
be no problem: just use less threads in the pool.
If we WANT to do it differently we have to remove the QObject from the handlers.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Using processes would work too, but there are some problems with that:

  • Connection to the client: On Unix you can forward a socket to another process using some tricks.
    No idea if that works on other platforms too. But it has to be implemented for every platform
  • You need a way to communicate with the new process and have to move all the data to it
  • Every new process costs more memory than a thread, and its slower to spawn a new process
  • It would maybe require lots of refactoring (that we need to check)
  • Totally different approach when it comes to programming, the user needs to write a new application instead
    of putting everything into the same codebase.

Otoh it would also solve some problems for us

  • The HttpPluginServer would correctly work (unloading and loading plugins)
  • The user can not access data from other requests (different process , different memory pages)
  • If one process crashes the others still work, the server does not go down
  • Maybe more

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 08/08/2013, às 04:30, bzeller [email protected] escreveu:

Using processes would work too, but there are some problems with that:

Connection to the client: On Unix you can forward a socket to another process using some tricks. No idea if that works on other platforms too. But it has to be implemented for every platform

  • May be we can leave the socket in main program and just write and read using QProcess to pass the data in, out.
    You need a way to communicate with the new process and have to move all the data to it
  • Asked above;
    Every new process costs more memory than a thread, and its slower to spawn a new process
  • That's true, in previously mail I say about less memory using only one handler, but maybe doing this will be better;
    It would maybe require lots of refactoring (that we need to check)
  • Yes, that's true, the problem doing this is, we need stay tufão working as now, for users that don't need this complexity,.
    Maybe having 3 ways to works:
    1- One process => one thread;
    2- One process => multi thread;
    3- Multi-process;

There be amazing have the 3 ways justing changing classes used!!!

Otoh it would also solve some problems for us

The HttpPluginServer would correctly work (unloading and loading plugins)
The user can not access data from other requests (different process , different memory pages)
If one process crashes the others still work, the server does not go down
Maybe more

Reply to this email directly or view it on GitHub.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Connection to the client: On Unix you can forward a socket to another
process using some tricks. No idea if that works on other platforms too.
But it has to be implemented for every platform

  • May be we can leave the socket in main program and just write and read
    using QProcess to pass the data in, out.
    Would work yes, but that will put more load on the main loop that should
    just handle the processes.

Every new process costs more memory than a thread, and its slower to
spawn a new process

  • That's true, in previously mail I say about less memory using only one
    handler, but maybe doing this will be better;
    When it comes to crash safety the current code has the same problem
    if one request crashes the server everything is gone.

It would maybe require lots of refactoring (that we need to check)

  • Yes, that's true, the problem doing this is, we need stay tufão
    working as now, for users that don't need this complexity,.
    Maybe having 3 ways to works:
    1- One process => one thread;
    2- One process => multi thread;
    3- Multi-process;
    There be amazing have the 3 ways justing changing classes used!!!

Multi-process can not be done by just changing classes.
The user needs to create executables for the handler processes.
Thats much more complexity than just changing classes.
Maybe that could be solved with loading plugins but still
its completely different from the current situation.

I think multi process would require a complete server solution,
not just a library you link to your application.
One would just have to write plugins and configure the server
correctly to load them and call them on the right path.
I like the safety the process solution could bring to
tufao. But remember things like shared session classes do not
work then, or also need to be handled between the processes.

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

In multi-tprocess we can provide a template server application ready for the Tufão user,
and they will need just make the Handle processes, like plugins are, configurable by a conf file.

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 08/08/2013, às 12:34, bzeller [email protected] escreveu:

Connection to the client: On Unix you can forward a socket to another
process using some tricks. No idea if that works on other platforms too.
But it has to be implemented for every platform

  • May be we can leave the socket in main program and just write and read
    using QProcess to pass the data in, out.
    Would work yes, but that will put more load on the main loop that should
    just handle the processes.

Every new process costs more memory than a thread, and its slower to
spawn a new process

  • That's true, in previously mail I say about less memory using only one
    handler, but maybe doing this will be better;
    When it comes to crash safety the current code has the same problem
    if one request crashes the server everything is gone.

It would maybe require lots of refactoring (that we need to check)

  • Yes, that's true, the problem doing this is, we need stay tufão
    working as now, for users that don't need this complexity,.
    Maybe having 3 ways to works:
    1- One process => one thread;
    2- One process => multi thread;
    3- Multi-process;
    There be amazing have the 3 ways justing changing classes used!!!

Multi-process can not be done by just changing classes.
The user needs to create executables for the handler processes.
Thats much more complexity than just changing classes.
Maybe that could be solved with loading plugins but still
its completely different from the current situation.

I think multi process would require a complete server solution,
not just a library you link to your application.
One would just have to write plugins and configure the server
correctly to load them and call them on the right path.
I like the safety the process solution could bring to
tufao. But remember things like shared session classes do not
work then, or also need to be handled between the processes.

I know that, I just saying the 3 approaches can live together just making this a decision to the Tufão user.


Reply to this email directly or view it on GitHub.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Lets say, we use a approach where we push the Request objects to the threads (not the socketdescriptors), and wait until end() is emitted, then push the Request object back to the main thread. That means we can still use the keep-alive mechanism.

It sounds like the simplest solution we had so far.

  • We don't need to think further to maintain HTTP and HTTPS working for the same handlers.
  • It works with WebSocket
  • It doesn't require several new classes like the other solutions
  • It allows the use of different scheduling algorithms (through an enum)
  • ...

Do you have a example for a Http-Server-Push mechanism implemented with tufao?

What do you mean by Http-Server-Push example?

I though about that one handler for all thing:

There comes one big problem into mind: Thread affinity of QObjects!

[...]

That sounds like a bit too much to think about for the user, i think handler per thread is better. Its not that much more memory and even on an embedded platform that should be no problem: just use less threads in the pool. If we WANT to do it differently we have to remove the QObject from the handlers.

Agreed.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 09.08.2013 20:11, Vinícius dos Santos Oliveira wrote:

Lets say, we use a approach where we push the Request objects to the
threads (not the socketdescriptors), and wait until end() is
emitted, then push the Request object back to the main thread. That
means we can still use the keep-alive mechanism.

It sounds like the simplest solution we had so far.

Seems like we have a way where we all more or less agree:

-> We push the requests to the thread, when the request is done,
its pushed back to the mainthread and can be reused
-> We provide a factory function that creates the main handler
for every thread, as in the example i had before
-> We want as much compatibility as possible for the thread stuff
it would be nice if the user just has to change to a factory
function and it works
-> Upgrades are handled in the main thread, if the user wants websockets
he also can have a own thread to handle them on himself. Thats not
the job of a http request handler

Did i forget anything?

Only problem i see with this is plugin loading, because we can only
unload a plugin when ALL plugin loader instances called unload.
But first things first, once we get the plain handlers to work we will
find a solution for the plugins too.

I already have some proof of concept code here, it needs some
refactoring, but maybe i can hack something in the next few days,
if you guys agree with my points.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Did i forget anything?

I don't think so.

I already have some proof of concept code here, it needs some refactoring, but maybe i can hack something in the next few days, if you guys agree with my points.

Everything is fine for me.
=)

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

Agreed too!

Enviado via iPhone

Em 09/08/2013, às 19:27, Vinícius dos Santos Oliveira [email protected] escreveu:

Did i forget anything?

I don't think so.

I already have some proof of concept code here, it needs some refactoring, but maybe i can hack something in the next few days, if you guys agree with my points.

Everything is fine for me.
=)


Reply to this email directly or view it on GitHub.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Ok guys, i have a first version in my tufao repository, if you have some time you can check it out.
There is also a tufao-thread-test repository with some testcode you can use.

Sometimes the server stops to handle requests, i have to investigate that when i have some more
time.

Tell me what you think!

from tufao.

bzeller avatar bzeller commented on September 27, 2024

I found the problem with the server stopping to handle requests.
Its fully functional now afaik. Now some other things need to be made threadsafe,
like the sessionstore. Any other things that use shared backends?

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Ok guys, i have a first version in my tufao repository, if you have some time you can check it out.

Can you test if HTTP Upgrade (e.g. WebSocket) work? You changed socket parent to HttpServerRequest, but the connect can live longer than the request. When an upgrade occurs, the HttpServer delete the HttpServerRequest.

There is also a tufao-thread-test repository with some testcode you can use.

Interface is great and I like it.

I think these two lines could be removed.

I'll review the rest of the code after these items are resolved.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Yeah the HTTP Upgrade would have failed because the Request object would have deleted the socket.
However i think i solved the problem with reparenting the Socket object to 0 (alternatively we could use the HttpServer object, but i think the user should be in charge of the socket after a request).

For example in WebSocket::startServerHandshake , the WebSocket should take parentship of the Socket, so the user can moveToThread on the WebSocket. In fact we should check if every internal used QObject is correctly parented to the wrapping QObject. Otherwise using threads could create ugly bugs that are hard to debug...

I removed the two lines in the testcode, that was just for testing a bug because in HttpServerRequestRouter::map
there was a crash if the internal QVector is empty. I already fixed it in my repository.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Yeah the HTTP Upgrade would have failed because the Request object would have deleted the socket.
However i think i solved the problem with reparenting the Socket object to 0 (alternatively we could use the HttpServer object, but i think the user should be in charge of the socket after a request).

The previous behaviour was HttpServer being the parent. We don't pay nothing to maintain the behaviour consistent with current version (e.g. reparenting to HttpServer).

I noticed that you were putting copyright in my name. Stop doing that, I don't require copyright assignment in Tufão contributions. Just use your name in the copyright notice at the beginning of the files. ;)

I'll review the rest of the code later today.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 15.08.2013 11:20, Vinícius dos Santos Oliveira wrote:

Yeah the HTTP Upgrade would have failed because the Request object
would have deleted the socket.
However i think i solved the problem with reparenting the Socket
object to 0 (alternatively we could use the HttpServer object, but i
think the user should be in charge of the socket after a request).

The previous behaviour was HttpServer being the parent. We don't pay
nothing to maintain the behaviour consistent with current version (e.g.
reparenting to HttpServer).
True for non threaded applications. But as i said that may be a problem
with threaded apps if the user does a moveToThread. Anyway i check if
the request is still parent, if not i assign one.

I noticed that you were putting copyright in my name. Stop doing that, I
don't require copyright assignment in Tufão contributions. Just use your
name in the copyright notice at the beginning of the files. ;)
Ok i'll change that in the next commit

I'll review the rest of the code later today.


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Hey guys, back from my vacation. Did you have some time to check out the code yet?

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Ups closing was and accident

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Hey guys, back from my vacation.

I hope you enjoyed your vacations.
=P

Did you have some time to check out the code yet?

Okay, I'll review it tonight.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Did you have some time to check out the code yet?

Okay, I'll review it tonight.

Too many changes, I'll need more time to review, but I may be able to do it this week.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 05.09.2013 02:49, Vinícius dos Santos Oliveira wrote:

    Did you have some time to check out the code yet?

Okay, I'll review it tonight.

Too many changes, I'll need more time to review, but I may be able to do
it this week.

No need to rush, i just found a bug but not the cause of it.
After the first 5 requests come in something blocks, it just does not
accept any more connections until the first 5 are finished, then it
continues to work. It really seems QTcpServer does not handle incoming
Connections for some reason. I started debugging it but that will take
some time, as there is no clear reason for this (mainloop is not
blocked).

I added a threaded Pluginserver too, all this stuff seems to work
except the connection handling.


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Ok that was not a bug, in fact it was a "feature" of my browser. Firefox AND Chromium always have max 6 connections to a single server, no matter how many tabs you have open.
Even if you open 20 requests at the same time, they are split up in a bunch of 6 requests and handled bunch by
bunch.

So thats not a bug, everything seems to work just fine. I still need to write some docs and i will upload my current
testproject this evening, so you can have a look at how it works.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Hi guys, I came here just to inform you that this is the last week of GSoC and I'll have time to review the code (and code together) after that.

Wish me luck.
=)

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Hey Vinícius,

good luck ;).

Just wanted to inform you that we hit a ugly bug in the Qt Eventloop
(thats our current assumption), with sockets and moveToThread().
Until now Thiago tries to debug it but i did not hear from him since
last week.
No idea if he will be able to fix the bug soon, thats why i started
a new branch (fd-based), where i implement the threading stuff with
socket descriptors instead of QTcpSocket. I had to do a lot of
refactoring but i think i managed it too keep source compatibility so
far. Nothing useable in that branch yet but i will inform you as soon
as there is code to test.

Ben

On 16.09.2013 18:37, Vinícius dos Santos Oliveira wrote:

Hi guys, I came here just to inform you that this is the last week of
GSoC and I'll have time to review the code (and code together) after that.

Wish me luck.
=)


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

israelins85 avatar israelins85 commented on September 27, 2024

Hi, guys.

I use QTcpSocket between threads with no problems, in many projects. Then you can explain the problem?

Regards,

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 16/09/2013, às 13:37, Vinícius dos Santos Oliveira [email protected] escreveu:

Hi guys, I came here just to inform you that this is the last week of GSoC and I'll have time to review the code (and code together) after that.

Wish me luck.
=)


Reply to this email directly or view it on GitHub.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

I doubt you hit the problem, it only occurs on very high
load in special circumstances (remote host needs to close the
connection). I stumbled over it when i was stress testing the
tufao code with autobench/httperf.

Sometimes the main thread and the workerthread both
get a close event for the same socket. Even though the
socket is properly moved to the workerthread. The thread
that received the event first, deletes the socket (because
we connect the disconnected signal with deleteLater). The other
thread then tries to access freed memory which will result in
a segfault.

That means you will run into that bug only when:

  1. you are moving sockets between threads
  2. you connect your disconnected signal to deleteLater, or delete the
    socket when the signal comes in.
  3. you have high load (not sure about that one)
  4. The remote host closes the connection in the right second

On 16.09.2013 21:35, israelins85 wrote:

Hi, guys.

I use QTcpSocket between threads with no problems, in many projects.
Then you can explain the problem?

Regards,

Israel Lins Albuquerque

Antes de imprimir, pense em sua responsabilidade com o MEIO AMBIENTE.
Before printing, think about your responsibility to the ENVIRONMENT.

Em 16/09/2013, às 13:37, Vinícius dos Santos Oliveira
[email protected] escreveu:

Hi guys, I came here just to inform you that this is the last week of
GSoC and I'll have time to review the code (and code together) after that.

Wish me luck.
=)


Reply to this email directly or view it on GitHub.


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Ok guys, brief update.

I did a rewrite to make the code use socketdescriptors instead of QTcpSockets.
This is how it works:

A new connection comes in and the socket descriptor is directly forwarded to a
worker thread. Inside of that worker script the Socket is created and handled just
like before. The request now belongs to that thread as long as the socket does not
close it or it times out.
I implemented a mechanism that allows me to have more than one request per thread,
its a timer that measures the work load of the thread, if the timeout is arriving too slow
the workload of the thread is high.

The request dispatcher does a round robin dispatching
but also uses the workload to determine the next thread and prefers completely idle threads.
Thats how the request dispatcher works (i'm open for other ideas):
-> New connection is coming in
-> The dispatcher looks if there is a idle thread (A thread with no attached request)
-> If it can not find a idle thread, it iterates over the working threads, and tries to find a
thread with a low workload
-> If it can not find a thread with a low workload it will use the first one with a medium workload
-> If it can not find a thread with low or medium workload it will enqueue the connection and
try again later
-> the selected thread is then moved to the end of the thread queue, to make the request
dispatching more fair

I had to do some refactoring, the biggest change is the new AbstractConnectionHandler class,
its basically a HttpServer without the internal QTcpServer. The AbstractConnectionHandler will
be used inside the Threads to handle the incoming connections.
However i did maintain source compatibility with HttpServer as much as possible, i think
that should rarely break something.

If someone wants to try the code checkout the fd-based branches from my tufao and tufao-thread-test
repositories.

from tufao.

bzeller avatar bzeller commented on September 27, 2024

Oh and btw, the threading code supports completely async and event based programming.
Multiple requests can be handled in one thread at the same time. That means you cannot
store request related variables inside the event handler for example. You always need to link that
to the request.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Hello folks, GSoC is over and I committed some changes to Tufão today (mainly regarding documentation).

I'll be reviewing @bzeller code. I have a lot of spare time and I'll be using it for this task.

[...] A new connection comes in and the socket descriptor is directly forwarded to a worker thread. Inside of that worker script the Socket is created and handled just like before. The request now belongs to that thread as long as the socket does not close it or it times out.
I implemented a mechanism that allows me to have more than one request per thread, its a timer that measures the work load of the thread, if the timeout is arriving too slow the workload of the thread is high. [...]

It sounds great.
=)

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 30.09.2013 10:19, Vinícius dos Santos Oliveira wrote:

Hello folks, GSoC is over and I committed some changes to Tufão today
(mainly regarding documentation).

I'll be reviewing @bzeller https://github.com/bzeller code. I have a
lot of spare time and I'll be using it for this task.

[...] A new connection comes in and the socket descriptor is
directly forwarded to a worker thread. Inside of that worker script
the Socket is created and handled just like before. The request now
belongs to that thread as long as the socket does not close it or it
times out.
I implemented a mechanism that allows me to have more than one
request per thread, its a timer that measures the work load of the
thread, if the timeout is arriving too slow the workload of the
thread is high. [...]

It sounds great.
=)


Reply to this email directly or view it on GitHub
#15 (comment).

I did not have time yet to write comments for everything.
But just have a look in my examples, i think you should find everything
there.
Btw i did some heavy load testing of a simple handler with threads
(file handler). I did not run into any crashes and all threads
returned to idle at the end of the test. No leftover requests or
something like that.
I think the fd-based version is much much more robust.

Don't forget to use the fd-based branches! The other one is still
the code that hits the Qt bug!

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

I think I'm going to write an AsyncFileOperation or something like that, then the HttpFileServer won't block the thread and everyone will be happy.

Btw i did some heavy load testing of a simple handler with threads (file handler). I did not run into any crashes and all threads returned to idle at the end of the test. No leftover requests or something like that. I think the fd-based version is much much more robust.

Good to know. =)

Okay, my review:

  • File src/doc/examples/threadedapplicationdefaultmain.cpp added, but not used.
    Maybe you forgot to include it in the documentation?
  • There are mixed new features on the branch, making difficult to review the
    work. e.g. Replacing override by Q_DECL_OVERRIDE
  • You left some commented code under HttpFileServer. It's safe to remove them
    • //request.socket().flush();
  • Why Priv type declaration was moved to public under HttpFileServer?
  • I'm reviewing HttpConnectionHandler and I'll bump more items later

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 01.10.2013 10:27, Vinícius dos Santos Oliveira wrote:

I think I'm going to write an AsyncFileOperation or something like that,
then the HttpFileServer won't block the thread and everyone will be happy.

Btw i did some heavy load testing of a simple handler with threads
(file handler). I did not run into any crashes and all threads
returned to idle at the end of the test. No leftover requests or
something like that. I think the fd-based version is much much more
robust.

Good to know. =)

Okay, my review:

  • File src/doc/examples/threadedapplicationdefaultmain.cpp added, but
    not used. Maybe you forgot to include it in the documentation?

Documentation is not finished yet.
I still need to do that but i have to do some other work first that
takes up most of my time.

  • There are mixed /new/ features on the branch, making difficult to
    review the work. e.g. Replacing /override/ by /Q_DECL_OVERRIDE/
    Sorry about that, i can revert those if you want
  • You left some commented code under HttpFileServer. It's safe to
    remove them
    o //request.socket().flush();
    Yeah the socket().flush() can be a problem if the Socket has closed
    already but QSocket does not know yet. There is also no way to
    check if that happend. I'll remove them.
  • Why Priv type declaration was moved to public under HttpFileServer?
    Ah, because my subclass also subclasses the Priv Type.
    Think about this:
    We have a 3 level subclassing. If i create a Private Type per subclass
    we get 3 additional pointers per instance. If i do that
    in the workerthreads, i get 3 pointers * nr of threads. And only
    for one Type. Thats why i chose to subclass the Private structures.
  • I'm reviewing HttpConnectionHandler and I'll bump more items later


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024
  • There are mixed /new/ features on the branch, making difficult to review the work. e.g. Replacing /override/ by /Q_DECL_OVERRIDE/

Sorry about that, i can revert those if you want

I'd appreciate that.
=)

  • Why Priv type declaration was moved to public under HttpFileServer?

Ah, because my subclass also subclasses the Priv Type. Think about this: We have a 3 level subclassing. If i create a Private Type per subclass we get 3 additional pointers per instance. If i do that in the workerthreads, i get 3 pointers * nr of threads. And only for one Type. Thats why i chose to subclass the Private structures.

I see. It's a good idea. Improving data locality (and cache usage, probably), memory usage, ...
=)

I'll continue with my review.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

@bzeller:
I've pulled/pushed/cloned/??? your fd-based branch to threads branch on Tufão repository.

  • My changes will go on this branch
    • For important changes, I'll bug you before do anything
  • When the code becomes ready, I'll merge it with master and release a new version, but I'll discuss with you if you want to change anything before the merge/release

Okay, the first changes to the new threads branch were:

  • Merged latest changes from master
  • Coding style

And my initial concerns are:

  • src/sessionstore.h:
    • {,un}signSession were moved to public. These methods implement the HMAC-SHA1 algorithm only, but a public API should be independent from the SessionStore class and its interface should allow the programmer to choose the algorithm, even if only one is available.
  • src/httpconnectionhandler.h
    • Shouldn't the upgrade-handler things be moved to AbstractConnectionHandler?

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

And more:

  • Why is AbstractConnectionHandler::handleConnection virtual?
  • I'm going to reintroduce the protected functions of HttpServer and document them with \deprecated. The actual implementation will only call the AbstractConnectionHandler functions
  • Do you mind if a rename AbstractConnectionHandler to AbstractHttpConnectionHandler?

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

Last review item for today: ThreadedHttpServer::RequestHandlerFactory signature is polluting public API. Can you get rid of void **customData?

from tufao.

bzeller avatar bzeller commented on September 27, 2024
  • {,un}signSession were moved to public.
    Whoops seems i pushed some of my changes i needed for the application i'm writing. This is not necessary, i
    needed it to sign SessionIds that i put in a JSON request object
  • Shouldn't the upgrade-handler things be moved to AbstractConnectionHandler?
    Probably, i was assuming that maybe not all Subclasses want to handle upgrades and because of that
    don't need the UpgradeHandler stuff, but it can not hurt i guess.
  • Why is AbstractConnectionHandler::handleConnection virtual?
    Think about the Scgi Handler we talked about, it does not need to create the Socket, but it needs to create
    ScgiRequest and ScgiResponse, instead of HttpRequest and Response
  • Do you mind if a rename AbstractConnectionHandler to AbstractHttpConnectionHandler?
    Hm i guess not, all possible subclasses should be Http related at least.
    Btw i personally think that names like HttpServerRequest or HttpServerRequestRouter don't need
    that Server, it makes the names very long ;)
  • ThreadedHttpServer::RequestHandlerFactory signature is polluting public API. Can you get rid of void **customData?
    Actually that was meant to be a feature. If the user wants to store some additional data, he can use the
    customData storage. ThreadedHttpPluginServer uses it for example. I could work around it if i put the
    pointer to ThreadData into the Lambda if you don't like it.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024
  • I'm going to reintroduce the protected functions of HttpServer and document them with \deprecated. The actual implementation will only call the AbstractConnectionHandler functions

I thought more about the problem and this won't fix.

The threads-branch is API-incompatible. I'll release these changes with Tufão 2.x (not a big deal, I'm not the kind of maintainer that thinks that a major release can only happen after we bloated the API and waited for 10 years or so on).

  • ThreadedHttpServer::RequestHandlerFactory signature is polluting public API. Can you get rid of void **customData?
    Actually that was meant to be a feature. If the user wants to store some additional data, he can use the
    customData storage. ThreadedHttpPluginServer uses it for example. I could work around it if i put the
    pointer to ThreadData into the Lambda if you don't like it.

Can you replace void ** by QVariant? It's the Qt-way and already adopted by Tufão.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024
  • ThreadedHttpServer::RequestHandlerFactory signature is polluting public API. Can you get rid of void **customData?
    Actually that was meant to be a feature. If the user wants to store some additional data, he can use the
    customData storage. ThreadedHttpPluginServer uses it for example. I could work around it if i put the
    pointer to ThreadData into the Lambda if you don't like it.

Can you replace void ** by QVariant? It's the Qt-way and already adopted by Tufão.

In fact, consider other options too. It'd be bad to deprecate a signal, just because the signature gone wrong. Maybe we are looking for a new RequestHandlerFactory class. Or something else...

from tufao.

bzeller avatar bzeller commented on September 27, 2024

On 15.10.2013 07:13, Vinícius dos Santos Oliveira wrote:

      * ThreadedHttpServer::RequestHandlerFactory signature is
        polluting public API. Can you get rid of void **customData?
        Actually that was meant to be a feature. If the user wants
        to store some additional data, he can use the customData
        storage. ThreadedHttpPluginServer uses it for example. I
        could work around it if i put the pointer to ThreadData into
        the Lambda if you don't like it.

Can you replace |void **| by QVariant? It's the Qt-
<https://qt-project.org/doc/qt-4.8/qmodelindex.html#data>way
<http://qt-project.org/doc/qt-4.8/qt.html#UserRole> and already
adopted by Tufão
<http://vinipsmaker.github.io/tufao/ref/1.x/class_tufao_1_1_http_server_request.html#a1b66a5edb66964f5a0d7ef101d2261c2>.

In fact, consider other options too. It'd be bad to deprecate a signal,
just because the signature gone wrong. Maybe we are looking for a new
|RequestHandlerFactory| class. Or something else...

I of course considered a RequestHandlerFactory class, but i figured
using lambdas is the Tufao way ;). Also we need a instance of the
factory per thread OR it needs to be reentrant, which makes it
impossible to store custom data. If we need a factory per thread
we either need a FactoryFactory which sucks, or it needs to be
copied for every thread, which means the user cannot do anything
in the Factorys constructor, because we need to make sure initialization
only happens inside the threads. Also its not common for a factory
do store things like custom data that belongs to the RequestHandler it
creates.

Alternatively we could force that the user has to return a
ThreadRequestHandler subclass from our factory lambda, cleaner but also
means more code for the user.

To using QVariant, i'm not sure this is a good idea in that case, since
i don't know what the user wants to store and if it is possible to copy.
Qt also uses void pointers here and there:
QModelIndex QAbstractItemModel::createIndex(int row, int column, void *
ptr = 0)

And using QVariant with pointer types could easily lead to memory
leaks if people forget to free the memory by hand.

Maybe we could use a QSharedPointer but i don't know if this works
(especially when destructing), alternatively we could have a
RequestHandlerCustomData subclass the user could put into a QSharedPointer.

Any comments? Not sure how we resolve that best.


Reply to this email directly or view it on GitHub
#15 (comment).

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

[...]

Any comments? Not sure how we resolve that best.

It's a lot of choices and you are better familiarized with the problem than me.

I indeed like lambdas because make related callback-style pieces of code closer together. It means easier reading, less bugs, ... . And lambdas are way shorter than write your own classes overriding the function-call operator too. But I think I'm going off-topic here.

About the signal using void*, I just wanted to emphasize that you shouldn't hear me about the what I initially proposed for replacement. I just wanted you to try better or teach me what I still need to review.
=P

About the suggestions and explanations you gave, I'll need to think more.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

5 months later... here I am again... oh... I missed this fun discussion so much :)

I was recently playing with the plugin's code and there are some thoughts to share:

[...] then each thread would have its own handlers.

[...] You can then store request local data inside the handler and don't need
to think if another thread uses it. But i think for a plugin based handler
that means when plugins change we have to stop all threads to reload them

Qt will handle real unload of plugins only after all QPluginLoader called
unload. This already ease things for us. The remaining problem (force reload)
can be tackled in several ways. Two I like:

  • User could use a different filename for the new version, then the plugin
    server objects in each thread will upgrade incrementally (kind of nice)
  • User can clean config and wait some time till all plugins were unloaded and
    then specify a filled config again. Ugly, I know

Any other ideas? I'll disable all plugins support in 2.0 and reenable it again
in 2.1, then there is more time to think about that later and focus on current
issues (threads and FastCGI).

Also worth mentioning is that @bzeller refactored the plugin code to be clearer, but there were some improvements in master branch and the refactoring wasn't rebased. I just reverted the refactoring, but I have nothing against it and I'd merge it if proposed again (rebased against recent releases). Anyway, this is going too off-topic, time to focus on threads...

I'm really interested in the idea of the initially proposed flat objects for representing HTTP requests. If this idea is adopted, then HTTP/2.0 and FastCGI could be introduced with no API/ABI breaks. Also it would be easier to dispatch requests to different threads (dispatching requests vs connections is finer-grained, which can help on loading partition). Problems:

  • Streaming: If we are handling file uploads, we'd like to directly write the file to disk to avoid wasting 2GB from the main memory.
  • HTTP pipelining allow the client to send data before the reply starts. This means that we cannot embed the QIODevice/QAbstractSocket object into the flat HTTP request object, because the user might end up reading the beginning of another request as data.
  • Another point against the QIODevice/QAbstract embedding: I don't know how the CGI protocol works, but if I pictured correctly, it uses environment variables, not some streaming interface. Even if I pictured CGI wrong, there may be one HTTP interface that introduces the same problem.
    • A streaming interface may not be used by the backend. The backend would need to forge a fake HTTP session just to be compatible, because rerouting. This is ugly and inefficient.

So my solution is:

Do not go full flat object, but do not try to route HTTP messages either.

  • The backend won't need to notify about headers, it'll deliver the object with all headers filled. It's higher level and the only interface we need anyway.
  • We introduce a QIODevice layer (this QIODevice is not the same object of the connection), which will be used by the backend to send body's data (and only this) to the flat HTTP request object.
  • The flat HTTP object will of course continue to inherit from QObject, handle the QIODevice::readyRead() signal and cache the body in memory (while export the readBody() to the user, who can do whatever he wants).

I'll present something later, I'm currently working on modernization (in the sense of work against today's Tufão) of the set of patches @bzeller created to implement threads.

Said all that, I was planning to release the version 2.0 in the beginning of the week, but I just realized how much work this means and I'm willing to release it in the end of the week.

from tufao.

vinipsmaker avatar vinipsmaker commented on September 27, 2024

We introduce a QIODevice layer (this QIODevice is not the same object of the connection), which will be used by the backend to send body's data (and only this)

PS:
This could also lead to transparent compression support on requests (this is already possible, just not implemented, in responses).

from tufao.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.