In 11b8bbe, one of the claims was that we now properly set the Notifier's
actual fd on the NotifierActivationEvent. It turns out that claim was
false, because a crucial step was forgotten: actually setting
m_notifier_fd when registering. Despite that mistake, it was ultimately
irrelevant, as the methods on NotifierActivationEvent are currently
unused code. We were posting the event to the correct Notifier receiver,
so on_activation was still getting invoked.
Given that those methods are unused, NotifierActivationEvent can be
defined the same way as TimerEvent, where we just pass the event type
enum to the Event base class. Additionally, NotificationType can be
moved to the Notifier header, as this enum is now always used in the
context of creating or using a Notifier instance.
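A minimal sketch of that simplified definition, assuming it mirrors
TimerEvent (the exact enum value name is an assumption):

    class NotifierActivationEvent final : public Event {
    public:
        NotifierActivationEvent()
            : Event(Event::NotifierActivation) // pass the event type enum, as TimerEvent does
        {
        }
    };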
The initial IOCP event loop implementation adjusted wake() to manually
queue a completion packet onto the current thread's IOCP. This made us
dependent on the current thread's IOCP, whereas the previous behaviour
did not depend on any data from the thread that was waking the event
loop.
Restoring that old behaviour allows https://hardwaretester.com/gamepad
to be loaded again.
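A sketch of the restored behaviour, under these assumptions: the
pre-IOCP loop owned a wake event that any thread could signal, and
m_wake_event is a hypothetical member name:

    // Restored: wake() only touches state owned by the loop being woken,
    // so it works regardless of which thread calls it.
    void EventLoopManagerWindows::wake()
    {
        SetEvent(m_wake_event);
    }

    // Problematic initial version (shape assumed): posting to the IOCP of
    // the *calling* thread, which is the wrong loop when waking another thread.
    // PostQueuedCompletionStatus(current_thread_iocp(), 0, WAKE_KEY, nullptr);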
The initial IOCP event loop implementation removed the single-shot
timer fix added in 0005207.
Adding this back allowed simple web pages like https://ladybird.org/ to
be loaded again.
The initial IOCP event loop implementation had an fd() method on the
EventLoopNotifier packet that did not actually return the fd for the
notifier, but rather the result of a to_fd() call on an object HANDLE
that was always NULL. This meant we were always posting
NotifierActivationEvents with an fd of 0.
This rendered all of our WinSock2 I/O invalid, meaning no IPC messages
would ever be successfully sent or received.
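Roughly, with assumed member names, the bug and the fix look like:

    // Buggy: m_object was a HANDLE that was never set (always NULL),
    // so to_fd() always produced 0.
    int fd() const { return to_fd(m_object); }

    // Fixed: return the fd the notifier was actually registered with.
    int fd() const { return m_fd; }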
This commit changes the event loop to use IOCPs instead of
WaitForMultipleObjects to wait on events. This is done through the NT
kernel API NtAssociateWaitCompletionPacket, which associates an event
with a completion packet. Each completion packet notifies only once, as
they are normally used to signal completion of an operation, so to use
them for notifiers they are re-associated after each time they are
triggered.
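A rough sketch of the create/associate/re-arm cycle (error handling
omitted; the NtCreateWaitCompletionPacket / NtAssociateWaitCompletionPacket
prototypes are assumed to come from phnt-style headers, and passing the
notifier pointer as the key context is an assumption):

    // One-time setup: create a packet and tie the notifier's event to the IOCP.
    HANDLE packet = nullptr;
    NtCreateWaitCompletionPacket(&packet, GENERIC_ALL, nullptr);
    NtAssociateWaitCompletionPacket(packet, iocp, event_handle,
        /* KeyContext */ notifier, /* ApcContext */ nullptr,
        /* IoStatus */ 0, /* IoStatusInformation */ 0,
        /* AlreadySignaled */ nullptr);

    // Wait loop: a packet notifies only once, so re-associate after each
    // delivery to keep receiving notifications for the same event.
    DWORD bytes = 0;
    ULONG_PTR key = 0;
    OVERLAPPED* overlapped = nullptr;
    if (GetQueuedCompletionStatus(iocp, &bytes, &key, &overlapped, INFINITE)) {
        // ... dispatch to the notifier identified by `key` ...
        NtAssociateWaitCompletionPacket(packet, iocp, event_handle,
            notifier, nullptr, 0, 0, nullptr);
    }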
There are more optimizations that can be done, such as reusing the
EventLoopNotifier and EventLoopTimer structures to reduce the number of
allocations and context switches for timer and notifier registration.
This fixes a really nasty EventLoop bug which I debugged for 2 weeks.
The spin_until([&]{return completed_tasks == total_tasks;}) in
TraversableNavigable::check_if_unloading_is_canceled spins forever.
Cause of the bug:

check_if_unloading_is_canceled is called deferred.
check_if_unloading_is_canceled creates a task:

    queue_global_task(..., [&] {
        ...
        completed_tasks++;
    }));

This task is never executed.
queue_global_task calls TaskQueue::add:

    void TaskQueue::add(task)
    {
        m_tasks.append(task);
        m_event_loop->schedule();
    }
    void HTML::EventLoop::schedule()
    {
        if (!m_system_event_loop_timer)
            m_system_event_loop_timer = Timer::create_single_shot(
                0, // delay
                [&] { process(); });
        if (!m_system_event_loop_timer->is_active())
            m_system_event_loop_timer->restart();
    }
EventLoop::process executes one task from the task queue and calls
schedule again if there are more tasks.
So task processing relies on one single-shot zero-delay timer,
m_system_event_loop_timer.
Timers and other notification events are handled by Core::EventLoop
and Core::ThreadEventQueue; these are different from the HTML::EventLoop
and HTML::TaskQueue mentioned above.
check_if_unloading_is_canceled is called using the deferred_invoke
mechanism, which is different from m_system_event_loop_timer;
see Navigable::navigate and Core::EventLoop::deferred_invoke.
The core of the problem is that Core::EventLoop::pump is called again
(from spin_until) after the timer fired but before its handler was
executed. In ThreadEventQueue::process, events are moved into a local
variable before they are executed. The first of those events is
check_if_unloading_is_canceled. Another of them is
Web::HTML::EventLoop::process, scheduled in EventLoop::schedule using
m_system_event_loop_timer.
When check_if_unloading_is_canceled calls queue_global_task, its
m_system_event_loop_timer is still active because Timer::timer_event
has not yet been called, so the timer is not restarted.
But Timer::timer_event (and hence EventLoop::process) will never
execute, because check_if_unloading_is_canceled calls spin_until after
queue_global_task, and EventLoop::process is no longer in
event_queue.m_private->queued_events.
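A sketch of the shape of ThreadEventQueue::process that makes this
possible (dispatch details simplified):

    void ThreadEventQueue::process()
    {
        // All pending events are moved out before any of them runs...
        auto events = move(m_private->queued_events);
        for (auto& event : events) {
            // ...so when a handler here calls spin_until -> pump -> process,
            // the nested call sees an empty queue, and the not-yet-executed
            // Web::HTML::EventLoop::process event is invisible to it.
            dispatch(event);
        }
    }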
By making the single-shot timer manual-reset we allow it to fire
several times. So when spin_until is executed, m_system_event_loop_timer
fires again. Not an ideal solution, but this is the best I could
come up with. This commit makes the behavior match
EventLoopImplementationUnix, in which a single-shot timer can also fire
several times.
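For reference, a manual-reset waitable timer stays signaled after it
fires until it is rearmed, so every later wait still sees it as
signaled. A minimal sketch of that Win32 behavior (not the actual event
loop code):

    // bManualReset = TRUE: once the due time passes, the timer stays
    // signaled, so repeated waits keep observing the 'fire'.
    HANDLE timer = CreateWaitableTimerW(nullptr, /* bManualReset */ TRUE, nullptr);

    LARGE_INTEGER due_time {};
    due_time.QuadPart = -1; // relative time: fire (almost) immediately
    SetWaitableTimer(timer, &due_time, /* lPeriod */ 0, nullptr, nullptr, FALSE);

    // An auto-reset timer would be consumed by the first successful wait;
    // a manual-reset timer is still signaled when spin_until waits again.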
Adding event_queue.process(); at the start of pump, like in
EventLoopImplementationQt, doesn't fix the problem.
Note: Timer::start calls EventReceiver::start_timer, which calls
EventLoop::register_timer with should_reload always set to true
(single-shot vs. periodic is handled in Timer::timer_event instead),
so I use static_cast<Timer&>(object).is_single_shot() instead of
!should_reload.
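The registration then looks something like this (a fragment with
surrounding bookkeeping omitted; the exact signature is assumed):

    // Inside EventLoopManagerWindows::register_timer:
    // should_reload is always true when called via EventReceiver::start_timer,
    // so ask the Timer itself whether it is single-shot.
    bool single_shot = static_cast<Timer&>(object).is_single_shot();
    HANDLE timer = CreateWaitableTimerW(nullptr, /* bManualReset */ single_shot, nullptr);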
This fixes a problem where none of the timers or notifiers get
executed if wake() is called frequently.
Note that calling WaitForMultipleObjects repeatedly until it fails
will not work, because a rapidly firing timer can get all the attention.
That's why I check every event individually with WaitForSingleObject.
This behavior matches EventLoopImplementationUnix.
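The per-event check is roughly (handle bookkeeping assumed):

    // Instead of looping on WaitForMultipleObjects, where one rapidly
    // firing timer can win every wait, poll each handle once with a zero
    // timeout so every signaled event gets serviced on this pass.
    for (size_t i = 0; i < event_handles.size(); ++i) {
        if (WaitForSingleObject(event_handles[i], 0) == WAIT_OBJECT_0) {
            // dispatch the timer/notifier associated with event_handles[i]
        }
    }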
This fixes unregister_notifier and unregister_timer in
EventLoopManagerWindows.
Destructors for thread-local objects are called before destructors of
global (non-thread-local) objects.
This is a partial stack of the problem; thread_data is already
destroyed at this point:
    >WebContent.exe!Core::ThreadData::the
     WebContent.exe!Core::EventLoopManagerWindows::unregister_notifier
     WebContent.exe!Core::EventLoop::unregister_notifier
     WebContent.exe!Core::Notifier::set_enabled
     WebContent.exe!Core::LocalSocket::~LocalSocket
     WebContent.exe!Requests::RequestClient::~RequestClient
     WebContent.exe!Web::`dynamic atexit destructor for 's_resource_loader'
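A standalone illustration of the ordering (standard C++ behavior, not
Ladybird code):

    #include <cstdio>

    struct Global { ~Global() { std::puts("global dtor"); } };
    struct PerThread { ~PerThread() { std::puts("thread_local dtor"); } };

    Global g;
    thread_local PerThread t;

    int main()
    {
        (void)&t; // force construction of the thread_local
        // On exit this prints "thread_local dtor" before "global dtor":
        // thread-local destructors run before static-duration destructors,
        // which is why ThreadData::the() must not be reached from the
        // destructor of a global like s_resource_loader.
    }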
This fixes a bug in which ImageDecoder and RequestServer
do not exit because their connections don't close.
This makes the shutdown behavior match the Linux version,
which receives FD_READ | FD_HANGUP on socket close, whereupon
TransportSocket::read_as_much_as_possible_without_blocking calls
schedule_shutdown when a read from the socket returns 0 bytes.
On Windows, we have to explicitly call the WIN32 shutdown function to
receive the FD_CLOSE notification.
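A sketch of the two ends on Windows (WinSock2; how the event handle is
wired into the notifier is assumed):

    // Receiving side: register interest in reads and in the peer hanging up.
    WSAEventSelect(receiving_socket, wsa_event, FD_READ | FD_CLOSE);

    // Closing side: without the explicit shutdown the peer never receives
    // FD_CLOSE, its read notifier never fires, and the connection looks alive.
    shutdown(closing_socket, SD_BOTH);
    closesocket(closing_socket);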