Comments (10)
This is the type of message the queue worker gets (sometimes) when the queue connection drops:
Fail:
Code: 500
Value: Operation could not be completed within the specified time.
details (if any): <?xml version="1.0" encoding="utf-8"?><Error><Code>OperationTimedOut</Code> <Message>Operation could not be completed within the specified time.
RequestId:789e0000-0000-0036-59dd-d70517000000
Time:2018-04-19T12:51:27.6988826Z</Message></Error>.
I think that is handed over to Laravel as a queue message, which Laravel then attempts to run as a job (which it shouldn't, but Laravel's lack of validation of queue messages is another bugbear of mine). This locks up the queue worker, which can only be killed from the Linux command line. Not sure what it's doing - blocking while waiting for its non-existent child worker to finish, I suspect.
Anyway, I need to do some more tests of long-running connections, but if what I think is happening is correct, then inspecting the messages that the Azure queue library gives to this connector, and looking for some text that indicates the connection has dropped, would allow it to reconnect and not pass the message back to Laravel as a job.
"Fail:\nCode: %s\nValue: %s\ndetails (if any): %s." == AZURE_ERROR_MSG == ServiceException
I guess "details" is the raw XML data returned by the API, and "Value" is the human-readable response phrase.
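To make that guess concrete, here is a minimal sketch of splitting such a message back into its parts. It assumes the message follows the "Fail:\nCode: %s\nValue: %s\ndetails (if any): %s." format quoted above; `parseAzureFailMessage` is a hypothetical helper written for illustration, not part of the Azure library or this connector.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: split a "Fail: / Code: / Value: / details (if any):"
 * message into its parts. Returns null when the text does not fit the format.
 */
function parseAzureFailMessage(string $message): ?array
{
    $pattern = '/^Fail:\s*Code:\s*(\d+)\s*Value:\s*(.*?)\s*details \(if any\):\s*(.*)$/s';
    if (preg_match($pattern, $message, $m) !== 1) {
        return null;
    }

    // Assumption: "details" holds the raw XML body, so the Azure error
    // code (if any) sits inside a Code element.
    $errorCode = null;
    if (preg_match('/<Code>\s*([^<]+?)\s*<\/Code>/', $m[3], $c) === 1) {
        $errorCode = $c[1];
    }

    return [
        'status'    => (int) $m[1],  // HTTP status, e.g. 500
        'value'     => $m[2],        // human-readable phrase
        'errorCode' => $errorCode,   // e.g. OperationTimedOut, or null
    ];
}

// Usage with a message shaped like the one reported above.
$parsed = parseAzureFailMessage(
    "Fail:\nCode: 500\nValue: Operation could not be completed within the specified time.\n"
    . "details (if any): <Error><Code>OperationTimedOut</Code><Message>...</Message></Error>."
);
var_dump($parsed['status'], $parsed['errorCode']);
```

If the details turn out not to be XML, as in a flattened log line, `errorCode` is simply null and the status and value are still available.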
from azure-queue-laravel.
Let me know what I can do to help. There are a lot of pins set up in the chain here, and some things just have to be left to run for a few hours before the problem is observed. So if there are specific points at which it would be helpful for me to capture data, I can do that.
I also think this is something that Laravel should take on board to a certain extent, perhaps allowing the connector to return a "connection lost" exception so that Laravel can ask for a reconnection before trying again. Or something like that.
Noticed the Azure PHP storage libraries have all been taken up a point release in the last day. Not sure if the new version tackles any of the timeout/auto-reconnect issues.
I've had a look into this, and with what you are seeing it appears there may be two different scenarios happening:
- cURL is simply failing to connect to the Azure service:
  - cURL raises an exception for error 7 "Couldn't connect"
  - GuzzleHttp raises a ConnectException for this, setting the message to the format "cURL error %s: %s (%s)"
  - Within the Azure PHP Storage library the promise is rejected, the HTTP response is null as it never connected, so the library re-throws the exception, which makes its way up to your top error handler.
- The Azure service is returning an HTTP 500 error response for OperationTimedOut:
  - cURL completes the request and returns the HTTP 500 error
  - GuzzleHttp completes the promise with the 500 error and HTTP response body
  - Azure PHP Storage library is not expecting a 500 response from the server, and raises a ServiceException (see throwIfError in ServiceRestProxy)
  - Azure PHP Storage library formats the Exception message to the format "Fail:\nCode: %s\nValue: %s\ndetails (if any): %s."
  - Laravel receives this Exception in the Worker getNextJob method, which is then caught and reported. As the error message does not match any of the strings in causedByLostConnection, the worker does not quit and keeps running.
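The last step of the second scenario can be illustrated with a simplified stand-in for that string matching. This is a sketch, not Laravel's actual code: `matchesAnyNeedle` is a hypothetical function, and the three needles are only examples of the database-oriented phrases such a list contains, not the full list.

```php
<?php
declare(strict_types=1);

/** Hypothetical stand-in: true when the message contains any of the needles. */
function matchesAnyNeedle(string $message, array $needles): bool
{
    foreach ($needles as $needle) {
        if ($needle !== '' && strpos($message, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Example phrases only; the real list is longer and database-specific.
$needles = ['server has gone away', 'Lost connection', 'no connection to the server'];

// Build a message in the format the Azure library uses for ServiceException.
$azureMessage = sprintf(
    "Fail:\nCode: %s\nValue: %s\ndetails (if any): %s.",
    '500',
    'Operation could not be completed within the specified time.',
    '<Error><Code>OperationTimedOut</Code></Error>'
);

var_dump(matchesAnyNeedle($azureMessage, $needles));                 // bool(false)
var_dump(matchesAnyNeedle('MySQL server has gone away', $needles));  // bool(true)
```

Nothing in the Azure timeout text overlaps with phrases of that kind, which is consistent with the worker carrying on instead of quitting.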
The Laravel workers operate on a polling frequency rather than maintaining open connections. However, cURL could potentially be holding connections open under the hood, and the default polling of every 3 seconds could be sufficient to keep a connection open. Laravel already tries to re-attempt what it understands to be lost connections for queues, but this doesn't cover this specific scenario.
Options I see are:
1. I can handle the specific error in the pop method in the AzureQueue and re-throw an Exception that Laravel understands, which will cause it to kill the Worker and terminate the connection
2. I can submit a PR to the Laravel Framework to get the messages added into the causedByLostConnection method, which will have the same effect as Option 1
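A rough sketch of what the first option might look like, kept free of Laravel and Azure dependencies so it runs on its own. Everything named here is hypothetical: `QueueConnectionLostException`, `looksLikeAzureTimeout` and `popTranslatingTimeouts` are illustrative, the callable stands in for the connector's real pop logic, and the wording of the re-thrown message is a placeholder rather than a string Laravel is confirmed to recognise.

```php
<?php
declare(strict_types=1);

// Hypothetical exception type standing in for "an Exception that Laravel understands".
final class QueueConnectionLostException extends RuntimeException
{
}

/**
 * Assumption: an Azure timeout can be recognised by these markers in the
 * ServiceException message ("Fail:", "Code: 500", "OperationTimedOut").
 */
function looksLikeAzureTimeout(Throwable $e): bool
{
    $message = $e->getMessage();

    return strpos($message, 'Fail:') === 0
        && strpos($message, 'Code: 500') !== false
        && strpos($message, 'OperationTimedOut') !== false;
}

/**
 * Run a pop-style callable. Azure timeouts are re-thrown as a lost-connection
 * exception; every other failure is passed through unchanged.
 */
function popTranslatingTimeouts(callable $pop)
{
    try {
        return $pop();
    } catch (Throwable $e) {
        if (looksLikeAzureTimeout($e)) {
            // Keep the original exception as "previous" so no detail is lost.
            throw new QueueConnectionLostException(
                'Lost connection to Azure queue: OperationTimedOut',
                0,
                $e
            );
        }
        throw $e;
    }
}

// Usage: a callable that returns normally is passed straight through.
var_dump(popTranslatingTimeouts(function () {
    return 'next-job';
}));
```

Keeping the check in one small function means the set of recognised markers can grow without touching the pop logic itself.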
I'm not sure either will fix your first error, as that seems like a standard connection error. This could be Azure refusing the connection due to too many concurrent requests, or just an occasional network issue.
Also, I am not 100% sure whether the latter HTTP 500 error is actually a formal timeout / connection lost, so I may get some pushback from the Laravel maintainers if it is just a transient error where the Azure Service itself timed out and failed.
Thanks, that's some great analysis, very much appreciated.
I think you may be right about connections being held open. If Laravel is polling the queue every few seconds, and it is opening a new connection each time, then if connections are not closed properly we would certainly have a problem building up. Without restarting the queue worker, it would take anywhere between four and six hours before it effectively freezes. I could kill the workers (I have three of them) with a kill signal, but artisan queue:restart would not work, so I guess they were blocking on something - waiting for a spare connection slot to be freed up perhaps? I'm only guessing, with limited knowledge on the Azure side.
The long delays I was seeing in pushing to the queues could very well be a symptom of the same problem. If the queue workers have taken all available connection slots, then opening a connection to push a message could be blocking (somewhere in the route to the queue). Then we end up with a kind of deadlock.
If this is what is happening, then restarting the queue workers every ten minutes, which I am doing now, is probably the best workaround for now.
So, the options. I personally have an issue with Laravel's single list of "connection lost" messages for the database connections, which are hard-coded and need to cater for a wide range of databases. IMO those lists should be in the individual database connectors, and should be extendable to provide easy fixes for specific cases, and the ability to add messages in other languages (I cannot fathom how the non-English locales are coping with lost connections). So the queue workers also support lost connection handling? I wasn't aware of that. If so, then keeping it in this queue connector makes more sense to me (option 1). This connector knows about Azure, and should be telling the Laravel framework what to do :-)
I'm not an Azure expert, but I can get someone else to monitor our connection pool. If it is growing every few seconds, then that would give us some clues about what needs to be addressed in the longer term.
Looking at the Azure code last week, it seemed that just about any failure to talk to the remote queue resulted in a 500 exception. I suppose it is reasonable to always rethrow this as "connection lost", as a restart of the queue worker is likely to clear out any crap that has been building up by restarting the process.
Just looking at the lost connection detection in illuminate/queue, and I realise it just uses the database lost connection detection:
https://github.com/illuminate/queue/blob/9c063c804f6ab0596c68c41830496029fbca31f7/Worker.php#L9
That's a bit WTF, TBH. It may be relevant for the database queue connector, and only then for error messages that it knows about (it lacks many) but is completely inappropriate for non-database connectors. This is why there will be push-back on adding to that list - it's a list of database errors.
I had the same problem and wanted to know how to solve it.
Same 500 timeout error:
Fail: Code: 500 Value: Operation could not be completed within the specified time. details (if any): OperationTimedOut
Operation could not be completed within the specified time. RequestId:ffa1d05b-c003-00c1-205f-105061000000 Time:2020-04-12T00:14:23.0928233Z.
Error file:
/home/site/wwwroot/vendor/microsoft/azure-storage-common/src/Common/Internal/ServiceRestProxy.php