Comments (17)
It is known that the first time a project is loaded in QGIS Server, it may take an outrageous amount of time, depending on the number of layers and the datasources involved.
Note that py-qgis-server runs QGIS Server workers in child processes to handle requests (QGIS Server itself is not asynchronous), and that there is no shared cache between those workers (an issue that cannot be solved without rewriting a large part of the QGIS code).
As a result, you may experience latency each time a project has to be loaded into a worker's cache, and you will get optimal response times only once the project has been loaded into every worker's cache.
Depending on the nature and number of the projects you are using (number of layers, big datasources, ...), you may have to use different proxy strategies (for example, you may implement sharding with several py-qgis-server instances). If you have only a few projects, you may also consider seeding with multiple initial requests until all workers have their projects loaded.
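The seeding idea could be sketched roughly as follows. This is only an illustration, not something shipped with py-qgis-server: the base URL, the project names, the choice of a WMS GetCapabilities request and the `rounds` factor are all assumptions. Requests are sent concurrently so that several workers are busy at once, but nothing here guarantees that every worker ends up with every project loaded.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def build_seed_urls(base_url, projects, workers, rounds=3):
    """Return warm-up URLs: each project is requested workers * rounds times."""
    urls = []
    for project in projects:
        # MAP selects the project; GetCapabilities is an assumed warm-up request.
        query = urlencode(
            {"MAP": project, "SERVICE": "WMS", "REQUEST": "GetCapabilities"}
        )
        urls.extend([f"{base_url}?{query}"] * (workers * rounds))
    return urls


def seed(urls, concurrency, timeout=300):
    """Send the warm-up requests concurrently and return the HTTP status codes."""

    def fetch(url):
        with urlopen(url, timeout=timeout) as response:
            return response.status

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, urls))
```

A call might look like `seed(build_seed_urls("http://localhost:8080/ows/", ["a.qgs"], workers=4), concurrency=4)`, where the URL and project name are hypothetical.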
from py-qgis-server.
Interesting. How do you implement sharding with several py-qgis-server instances?
from py-qgis-server.
You start several py-qgis-server instances and may use nginx as a reverse proxy with consistent hashing of the MAP parameter.
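To illustrate what consistent hashing of the MAP parameter achieves, here is a minimal hash-ring sketch. In practice nginx would do this itself (its upstream module has a `hash` directive with a `consistent` parameter); the class below, the backend names and the number of replicas are assumptions made for the example only.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Minimal consistent-hash ring mapping MAP values to backend instances."""

    def __init__(self, backends, replicas=100):
        if not backends:
            raise ValueError("at least one backend is required")
        # Each backend gets several points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def backend_for(self, map_param: str) -> str:
        """Return the backend owning the first ring point after the key's hash."""
        index = bisect.bisect(self._points, _hash(map_param)) % len(self._ring)
        return self._ring[index][1]
```

Because a given MAP value always lands on the same instance, each project is loaded only in the workers of that instance, and removing one instance only remaps the projects that instance was serving.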
from py-qgis-server.
I noticed that there is a QGSRV_CACHE_ROOTDIR variable. Might it help?
from py-qgis-server.
No, QGSRV_CACHE_ROOTDIR sets the location of the project files. The configuration is not well documented and we are working on it. You may adjust the number of workers with QGSRV_SERVER_WORKERS.
from py-qgis-server.
QGSRV_SERVER_WORKERS
Interesting discussion!
So how do workers work? Is there a rule relating CPU threads and workers?
from py-qgis-server.
They are not threads; it is real multiprocessing. Requests are distributed using fair queuing with 0MQ messaging. You may also distribute your workers across a whole cluster by running worker-only and proxy-only containers.
from py-qgis-server.
OK, understood, thank you. So it means that a big infrastructure is needed to achieve good performance. But when I tried lizmap-docker-compose with a big project on 8 vCPUs and 8 GB of RAM, it never reached full machine load (RAM does not reach its maximum and CPU stays at around 25% per vCPU).
Even with aggressive parameters in:
- qgis server (QGSRV_SERVER_WORKERS from 2 to 8 to 32, which could be stupid but I was just trying to saturate the machine, and QGIS_SERVER_MAX_THREADS: 8),
- php fpm: pm.start_servers, pm.min_spare_servers, pm.max_spare_servers ... I will play later with other fpm parameters such as PM_CHILD_PROCESS, PM_MAX_CHILDREN, PM_MAX_REQUESTS and PM_PROCESS_IDLE_TIMEOUT; setting them as environment variables does nothing at this time in the current docker configuration. The first ones are written to the fpm configuration with the bash command sed.
- nginx: worker_processes and worker_connections
So at this time, my question is: any idea where this limitation comes from?
- docker? (I don't think so, because I manage to get 100% vCPU when running stress or stress-ng in the qgis server, lizmap, nginx and redis containers)
- project reading? 2.2 MB
- ...
from py-qgis-server.
You will saturate your CPU only with computation-intensive jobs. This is highly dependent on the context and the kind of project; as a rule of thumb, you may expect jobs to spend most of their time doing I/O, which means they have almost no impact on CPU demand.
Increasing the number of workers will not change the loading time nor the time spent internally by one worker to process your request: it will enable you to process more requests at the same time, and so to scale according to your request rate.
Because of this you must also set proper values for php-fpm depending on your scenario.
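As a rough way to reason about this, one can apply Little's law (a general queueing result, not something specific to py-qgis-server): the average number of requests in progress equals the request rate multiplied by the mean time a request takes. The function below is a back-of-the-envelope sketch; the function name, the `headroom` factor and the sample numbers are illustrative assumptions.

```python
import math


def workers_needed(request_rate: float, mean_service_time: float,
                   headroom: float = 1.0) -> int:
    """Estimate the number of workers kept busy on average.

    request_rate: incoming requests per second.
    mean_service_time: seconds one worker spends on one request.
    headroom: safety multiplier (assumed, tune for your own bursts).
    """
    return max(1, math.ceil(request_rate * mean_service_time * headroom))


# 10 requests/s at 0.5 s each keep about 5 workers busy on average.
print(workers_needed(10, 0.5))
```

Adding workers beyond this estimate does not make a single request faster; it only leaves spare capacity for bursts.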
As said before, performance depends on many factors and the appropriate solution depends on what you want to improve.
IMHO, here are the questions to ask:
The number of workers, the php configuration and the cache size will play a role with:
- What is the expected request rate?
- How are requests distributed across projects?
- How many different projects do I have to handle?
And the following will impact the internal performance of each worker:
- How many layers are there in the projects?
- How big is the data I have to handle?
- How is access to my backend databases? (I have seen many performance issues caused by bad settings in PostGIS databases.)
The former questions target your infrastructure choices; the latter depend for the most part on QGIS internal performance.
from py-qgis-server.
OK, thank you for the reply. docker stats clearly shows the I/O.
Docker has one limitation regarding I/O: by default it is reduced, depending on the Linux distribution. docker.service needs to be updated from
LimitNOFILE=1048576
LimitNPROC=1048576
to
LimitNOFILE=infinity
LimitNPROC=infinity
Tested on Lizmap; it shows an improvement on pre-cached layers.
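LimitNOFILE and LimitNPROC are the systemd directives for the RLIMIT_NOFILE and RLIMIT_NPROC resource limits. To see which values a process actually gets, a small check like the following can be run on the host and inside a container. This is only a sketch using Python's standard `resource` module (Unix only); the helper names are made up for the example, and the values seen in a container may also depend on other Docker settings.

```python
import resource


def describe(value):
    """Render a limit value, showing the unlimited marker as 'infinity'."""
    return "infinity" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return the (soft, hard) limits of this process for open files and processes."""
    limits = {}
    for name in ("RLIMIT_NOFILE", "RLIMIT_NPROC"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        limits[name] = (describe(soft), describe(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in current_limits().items():
        print(f"{name}: soft={soft} hard={hard}")
```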
from py-qgis-server.
Tested on Lizmap; it shows an improvement on pre-cached layers.
Good to know, thanks!
from py-qgis-server.
Do you have any metrics? It could be interesting to investigate the performance gain.
from py-qgis-server.
No, sorry, just visual, but you can read some input about I/O on docker, with metrics, here: moby/moby#21485
So you could test by yourself by reading and writing inside and outside your container.
PR at the end: moby/moby#24307
Consider looking at TasksMax=infinity as well, in the same systemd service, as it was not mentioned in the PR and is related to your kernel options.
from py-qgis-server.
It depends on your host Linux distribution; some distributions already have it well tuned. It may not be a final solution and needs more testing with qgis server.
It used to be in /usr/lib/systemd/system/docker.service
from py-qgis-server.
I see a new config "SERVER_RESTARTMON". Can we use it to improve the internal performance of each worker, by updating the file that "SERVER_RESTARTMON" is watching before a user makes an OWS request?
Is there a way to have a timeout before a request sends a '422 Unprocessable Entity'?
Thanks for your work!
from py-qgis-server.
@TANK2003 SERVER_RESTARTMON is just a very simple way to ask the workers to make a graceful restart, for example when you are updating plugins; it is not really related to internal performance.
The main process broadcasts a notification to the workers: they restart as soon as they have finished their current processing, while new incoming requests are held back by the dispatcher. This ensures that no requests are lost during the restart process.
'422 Unprocessable Entity' has nothing to do with timeouts; it is sent when you have invalid layers in strict checking mode.
from py-qgis-server.