Comments (8)
Well, I finally found something! [QTBUG-124496] QObject::timerEvent may be called before the timer elapsed - Qt Bug Tracker
I wonder if we have a single-shot QTimer running somewhere in the IPC code, which Qt confuses with our timeout timer? Or maybe QLocalS(ocket|erver)
has one internally?
That also explains why anecdotally, "clicking around" in the UI made it reproduce faster, because that probably leads to more timers being used.
The bug also says Qt 6.5 in addition to 6.7, but the linked bug says 6.5.6 (or .5 in the comments?). So chances are it was a change that landed in Qt 6.7 but also was backported to 6.5 LTS?
Assuming it's still there in Qt 6.7.1, I suppose it might be a good idea to leave the warning logging in.
from qutebrowser.
I can somewhat reliably reproduce via running qutebrowser and this script:
import time
import json
import getpass
import random
from PyQt6.QtNetwork import QLocalSocket
socket = QLocalSocket()
while True:
name = f"qutebrowser-{getpass.getuser()}"
socket.connectToServer(name)
ok = socket.waitForConnected(1000)
assert ok
cmd = random.choice([":version", ":tab-close"])
json_data = {"args": [cmd], "target_arg": None, "version": "3.1.0", "protocol_version": 1}
line = json.dumps(json_data) + '\n'
data = line.encode("utf-8")
socket.writeData(data)
ok = socket.waitForBytesWritten(1000)
if not ok:
print("!", end="", flush=True)
continue
assert socket.error() == QLocalSocket.LocalSocketError.UnknownSocketError
socket.disconnectFromServer()
if socket.state() != QLocalSocket.LocalSocketState.UnconnectedState:
ok = socket.waitForDisconnected(1000)
assert ok
print(".", end="", flush=True)
time.sleep(random.uniform(0.01, 0.2))
It seems to be easier to reproduce when I do some clicking in the UI, e.g. switching tabs.
The logging I added in db3d15d might still be a red herring, because the ID is fetched when the timeout function runs, and not when the timeout is started...
from qutebrowser.
My best bet at getting a Windows C++ stack trace by inserting a breakpoint()
and letting DebugDiag do its thing:
readline/Python stuff:
ntdll!NtDeviceIoControlFile+14
KERNELBASE!ConsoleCallServerGeneric+e9
KERNELBASE!ReadConsoleInternal+18d
KERNELBASE!ReadConsoleW+1a
python310!PyOS_Readline+542
python310!PyOS_Readline+38d
python310!PyOS_Readline+132
python310!Py_gitversion+63ba2
python310!Py_SetStandardStreamEncoding+807
python310!PyEval_EvalFrameDefault+1aa2
python310!PyEval_EvalFrameDefault+f65
python310!PyEval_EvalFrameDefault+f65
python310!PyEval_EvalFrameDefault+f65
python310!PyEval_EvalFrameDefault+f65
python310!PyEval_EvalFrameDefault+f65
python310!PyFunction_Vectorcall+87
python310!PyObject_GC_Malloc+1133
python310!PySys_GetSizeOf+141
python310!PyUnicode_ToUppercase+3ef
python310!PyUnicode_ToUppercase+20c
python310!PyUnicode_ToUppercase+186
python310!PyUnicode_ToUppercase+325
python310!PyEval_EvalFrameDefault+9338
python310!PyFunction_Vectorcall+87
python310!PyObject_GC_Malloc+1133
python310!Py_HandleSystemExit+80c
python310!Py_HandleSystemExit+58f
python310!PyVectorcall_Call+5c
python310!PyObject_Call+4f
QtCore!PyInit_QtCore+eb2f5
QtCore!PyInit_QtCore+eb73a
QtCore!PyInit_QtCore+eb7f4
QtCore!PyInit_QtCore+ed7ab
Qt6Core!QMetaObject::metacall+41
Qt6Core!QObject::qt_static_metacall+1842
Qt6Core!QMetaObject::activate+84
The interesting part:
Qt6Core!QTimer::timerEvent+cf
QtCore+66e56
Qt6Core!QObject::event+64
QtCore+590f3
Qt6Widgets!QApplicationPrivate::notify_helper+10e
Qt6Widgets!QApplication::notify+18c1
QtWidgets!PyInit_QtWidgets+1c3dcf
Qt6Core!QCoreApplication::notifyInternal2+11c
Qt6Core!QEventDispatcherWin32Private::sendTimerEvent+1a6
Qt6Core!QEventDispatcherWin32::event+1fc
Qt6Widgets!QApplicationPrivate::notify_helper+10e
Qt6Widgets!QApplication::notify+18c1
QtWidgets!PyInit_QtWidgets+1c3dcf
Qt6Core!QCoreApplication::notifyInternal2+11c
Qt6Core!QCoreApplicationPrivate::sendPostedEvents+209
Qt6Gui!QWindowsGuiEventDispatcher::sendPostedEvents+f
Qt6Core!QEventDispatcherWin32::processEvents+90
Qt6Gui!QWindowsGuiEventDispatcher::processEvents+19
Qt6Core!QEventLoop::exec+19f
Qt6Core!QCoreApplication::exec+15d
Init stuff:
QtWidgets!PyInit_QtWidgets+7c3e5
python310!PyObject_IsTrue+101
python310!PyEval_EvalFrameDefault+3ffe
python310!PyFunction_Vectorcall+87
python310!PyEval_EvalFrameDefault+707
python310!PyFunction_Vectorcall+87
python310!PyEval_EvalFrameDefault+1b88
python310!PyFunction_Vectorcall+87
python310!PyEval_EvalFrameDefault+1b88
python310!PyObject_FastCallDictTstate+15d
python310!PyEval_EvalCode+82
python310!PyNumber_Negative+45f
python310!PyNumber_Negative+35f
python310!PyEval_EvalFrameDefault+1aa2
python310!PyFunction_Vectorcall+87
python310!PyEval_EvalFrameDefault+707
python310!PyFunction_Vectorcall+87
python310!PyVectorcall_Call+5c
python310!PyObject_Call+4f
python310!PyArg_Parse+1ba
python310!Py_RunMain+13d
python310!Py_RunMain+15
python310!Py_Main+25
python+1230
kernel32!BaseThreadInitThunk+10
ntdll!RtlUserThreadStart+2b
from qutebrowser.
I tried to improve the logging in f9ae555 and 245d117... and everything points to Windows/Qt/PyQt just immediately calling the timeout
function after starting the timer.
Why? How?? Why (hopefully!) only this timer? I have no clue! I was unable to find anything that sounded remotely related. No bugreports, no changes that look remotely relevant, nothing.
I think this is user-facing: It happens before the IPC connection ever gets read, and then we disconnect, so we'd (even if rarely) sometimes miss IPC commands.
The best I can come up with is:
@@ -168,6 +168,7 @@ def __init__(self, socketname, parent=None):
self._timer = usertypes.Timer(self, 'ipc-timeout')
self._timer.setInterval(READ_TIMEOUT)
self._timer.timeout.connect(self.on_timeout)
+ self._timer_start_time = None
if utils.is_windows: # pragma: no cover
self._atime_timer = None
@@ -261,6 +262,7 @@ def handle_connection(self):
log.ipc.debug("Client connected (socket {}).".format(self._socket_id))
self._socket = socket
self._timer.start()
+ self._timer_start_time = time.monotonic()
socket.readyRead.connect(self.on_ready_read)
if socket.canReadLine():
log.ipc.debug("We can read a line immediately.")
@@ -393,11 +395,23 @@ def on_ready_read(self):
if self._socket is not None:
self._timer.start()
+ self._timer_start_time = time.monotonic()
@pyqtSlot()
def on_timeout(self):
"""Cancel the current connection if it was idle for too long."""
assert self._socket is not None
+ assert self._timer_start_time is not None
+ if (
+ time.monotonic() - self._timer_start_time < READ_TIMEOUT / 1000 / 10
+ and qtutils.version_check("6.7.0", exact=True, compiled=False)
+ and utils.is_windows
+ ):
+ # WORKAROUND for unknown Qt bug (?) where the timer triggers immediately
+ # https://github.com/qutebrowser/qutebrowser/issues/8191
+ log.ipc.debug("Ignoring early on_timeout call")
+ return
+
log.ipc.error("IPC connection timed out "
"(socket {}).".format(self._socket_id))
self._socket.disconnectFromServer()
I hate it, but I spent the better part of my day with this now, and I can't come up with anything better. I think any further spelunking would require building Qt from source and firing up a C++ debugger to see what happens in Qt... but I'm exhausted and would rather move on to something more interesting.
@toofar thoughts?
from qutebrowser.
That's very strange. Had a look at recent commits to QTimer and QLocalSocket and nothing jumped out at me. I had a thought that maybe it could be due to calls getting interleaved, like if handle_connection()
gets called when we are in the middle of reading data from the socket. But there's nothing I could find in search results warning of anything like that and the docs for QTimer and QLocalSocket don't call it out either.
I guess if identifying and ignoring the situation is the best we can do then we can try that and look into it further if it comes up again.
A few questions:
- should we be setting
singleShot()
on this timer? The default is false right? (I tested and it didn't fix the issue though.) - when the timer fires early, what does
timer.remainingTime()
report? - assuming it is 0, should we be restarting the time when it has fired early?
I also worry that this will affect other timers, since we don't know what causes it. But since we haven't seen this issue on any of the linux test runs and people are already using 6.7 without complaint, I guess we hope it remains this isolated ¯_(ツ)_/¯
from qutebrowser.
That's very strange. Had a look at recent commits to QTimer and QLocalSocket and nothing jumped out at me.
Agreed on both parts. I also looked at changes in qeventdispatcher_win.cpp which has the Windows-specific implementation, and Qt changelogs, and PyQt changelogs, and even diffed PyQt... no dice. Nothing I could find in Qt issues either. I have no idea what's going on there, but it does seem pretty isolated.
I had a thought that maybe it could be due to calls getting interleaved, like if
handle_connection()
gets called when we are in the middle of reading data from the socket. But there's nothing I could find in search results warning of anything like that and the docs for QTimer and QLocalSocket don't call it out either.
In f9ae555 I tried making sure that every connection is a separate object with its own QTimer
. I like the result of that refactoring, but for now I want to do something different than poking at IPC stuff. I'll probably open an issue to finish that later™. However, it didn't help either.
I guess if identifying and ignoring the situation is the best we can do then we can try that and look into it further if it comes up again.
Agreed. I want to wait for Qt 6.7.1 before pushing it. It's now planned for next week apparently... but once it's out, I hope we'll be able to easily try it out with PyQt, as it should be binary compatible. This will probably delay v3.2.0 even further, but eh... Qt 6.7.0 seems to have other annoying issues on Windows, so it's probably for the better.
A few questions:
- should we be setting
singleShot()
on this timer? The default is false right? (I tested and it didn't fix the issue though.)
Good point. It shouldn't matter much because in on_timeout
we call self._socket.disconnectFromServer()
, and that calls the on_disconnected
slot, which stops the timer.
- when the timer fires early, what does
timer.remainingTime()
report?
0
- assuming it is 0, should we be restarting the time when it has fired early?
See above - in theory yes, but since it's currently a recurring timer, it does already restart anyways.
I also worry that this will affect other timers, since we don't know what causes it. But since we haven't seen this issue on any of the linux test runs and people are already using 6.7 without complaint, I guess we hope it remains this isolated ¯\(ツ)/¯
With 01cc3df and 31ab343 I tried to add a generic check to all timers that logs a warning. The CI run passed. I'm somewhat tempted to commit it and see how it fares over time (hah!). Then again, in case this issue just magically vanishes with Qt 6.7.1 again, we might as well pretend it was never there.
from qutebrowser.
Ohh, good spotting! So on Windows the timers are backed by some Windows API and somehow the Windows things aren't cleared when the Qt ones are and then a new Qt one will re-use a still active windows thing? Hmm, that doesn't sound right. Does stopping a timer fire a timer event and that's what's being picked up by the re-used one? Maybe we could make a new timer object before killing the old one and that would resolve this? We do stop and start the timer a couple of times in this class.
Anyway, it's reassuring to know it's for sure a Windows only problem. And that Thiago is assigned to the ticket.
I tried to add a generic check to all timers
Damn I saw that and though "wait, we are trying to get a release out, have you heard of scope creep!". Then I looked at the commits and remembered we already have a QTimer wrapper 🤣
Looks pretty reasonable to me.
from qutebrowser.
I pushed some more stuff to your branch! So the current set of stuff on there is:
- incrementing socket IDs in the IPC server
- ignore early timer firing for read timeout timer in IPC server
- detect and log early timer firing in usertypes.Timer
- switch all QTimer objects to be usertypes.Timer objects
The changes I made were to
- refactor that "is this an early timer" check to be only in usertypes.timer and have the IPC server use that one
- remove the is_windows and Qt6.7.0 checks (the checks should be a no-op if there is no bug, so they are safe?)
- fix tests and lint and add a couple of unit tests
In my opinion that's all good to go, although I'm not hung up on much of it if you think some of it is risky and want to drop it (and I haven't looked at the socket ID stuff at all). Also I probably left a bunch of typos in there as usual. Personally I would edit the commit history to have less incidental stuff in it though ;) (If only so we don't have to skip over commits when running git blame on the ci workflow file.)
I should have some time tomorrow to follow up on this or have a look at pdf.js, so let me know how you want to move forward.
from qutebrowser.
Related Issues (20)
- Configurable placeholders for audio status for tabs HOT 1
- Double download prompt segfault
- A stream site/service suddenly started causing renderer process crashed (status 133) on qutebrowser
- Add screen picker completion menu for screen sharing
- Youtube fullscreen doesn't work with Wayland (Ubuntu 24.04) HOT 2
- nitter.poast.org not working with qutebrowser HOT 2
- Feature Request: Picture-In-Picture mode for video players HOT 2
- Session autosave is unreliable HOT 1
- No module named 'PyQt6' after 3.02 update HOT 1
- Slack sign-in fails to redirect back to app HOT 6
- Crash on requesting one-time password on National Insurance website HOT 5
- Websites scroll to the top when statusbar is hidden HOT 1
- Renderer process crash when visiting atuin.sh website HOT 2
- quebrowser 3.2 Segmentation fault
- Build universal binaries for macOS
- Method to download-open a hint HOT 3
- Query state of webpage in userscript? HOT 3
- segfault: `hint links run fake-key -g ":download {hint-url}<Return><Return>"` HOT 2
- Can't view page source when using the uBlock Origin filter 'Badware' HOT 1
- Issues with Qt 6.8 HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from qutebrowser.