// Copyright (C) 2024 The Qt Company Ltd.
// SPDX-License-Identifier: LicenseRef-Qt-Commercial OR GFDL-1.3-no-invariants-only
/*!
\page watchdog.html
\ingroup qtappman
\title Watchdog
\brief Describes configuration and mode of operation of the built-in watchdog mechanism.
\section1 Introduction
The application manager features a built-in watchdog mechanism that monitors the main thread's
event loop, every Quick window's render thread and all clients of the application manager's Wayland
compositor for unresponsive behavior.
Event loop and render thread monitoring is implemented both for the System UI and for
any QML application running in multi-process mode.
The watchdog is implemented as a separate thread that periodically (see \c checkInterval)
checks the state of the monitored subsystems. If any of these fail to respond within a given time
frame, the watchdog will first issue a warning (see \c warnTimeout) and eventually kill
(see \c killTimeout) the affected thread or client.
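Keep in mind that, due to the periodic nature of this check, the actual warning and kill
messages might be delayed by up to one \c checkInterval. For example, with a \c checkInterval of
\c 1000ms and a \c warnTimeout of \c 2000ms, the warning can appear anywhere between 2000ms and
3000ms after the monitored object stopped responding.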
Killing the affected thread directly (instead of just aborting the whole process)
will cause the application manager's crash handler to print a backtrace for the stuck thread,
which can be very useful to diagnose freezes.
\note The watchdog is disabled by default. You need to enable it by setting at least one of the
\c checkInterval configuration values in the \l{Configuration}{main configuration} file to
a timeout that suits your specific device setup.
\section1 Systemd Support
Support for systemd's watchdog is built into the application manager as well: see \l{Installation}.
If enabled, the application manager will automatically detect at startup if it was launched by
systemd and if the systemd unit file has the \c{WatchdogSec} option set. If this is the case, the
application manager will periodically send the requested notifications to systemd from its
watchdog thread.
\section1 Logging
The watchdog logs all its messages to the \c{am.wd} logging category. All logging is done from the
separate watchdog thread to minimize interference with the monitored threads or render loops.
The following logging levels are used:
\table
\header
\li Log Level
\li Description
\row
\li \c info
\li The watchdog started (or stopped) watching an object (thread, window, Wayland client).
\row
\li \c warning
\li A \c warnTimeout has been exceeded.
\row
\li \c critical
\li A \c killTimeout has been exceeded.
\endtable
\section1 Performance Considerations
Nothing in life comes for free and the watchdog is no exception. While the overhead of the watchdog
is generally very low, it does impact three areas:
\list
\li For every frame rendered, the watchdog adds three invocations of a \e direct signal/slot
connection: each call retrieves the current system time and stores it via an atomic
fetch-and-store operation.
\li For every Qt event delivered in a watched thread, the watchdog adds two callbacks: each call
checks the state via an atomic load, then retrieves the current system time, but only one
stores it via an atomic fetch-and-store operation.
\li The separate watchdog thread runs a periodic check (see \c checkInterval). It retrieves the
current system time and then collects time data via atomic load operations once for each of
the watched objects.
\endlist
\section1 Configuration
The watchdog is configured via the \c{watchdog} key in the \l{Configuration}{main configuration}
file. Applications inherit these settings, but can also override any value by setting the
corresponding key in their \l{Manifest Definition}{info.yaml manifest} file.
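As a rough illustration of such a per-application override, the manifest fragment below relaxes
the event loop warning for one application and disables the render thread kill for it. This is
only a sketch: it assumes that the \c watchdog key sits at the top level of the manifest and that
the slash-separated keys in the table below map to nested YAML mappings. The application id and
all values are hypothetical.

\badcode
formatVersion: 1
formatType: am-application
---
id: 'com.example.slowstarter'   # hypothetical application id
# ... other manifest keys omitted ...

watchdog:                       # assumed to mirror the main configuration layout
  eventloop:
    warnTimeout: 5000ms         # illustrative value: warn later than the system-wide setting
  quickwindow:
    killTimeout: 0ms            # 0ms disables the kill for this application's render threads
\endcode

Keys that are not listed in the manifest keep the values inherited from the main configuration.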
The interval and timeout values listed in the table below let you specify exact
\l{Time Duration Values}{times} with millisecond precision.
Setting any of the values to \c 0ms (or \c off) disables the respective functionality.
There's also the \c{--disable-watchdog} command line option that makes your life easier when
debugging or testing in a production environment, as it completely disables all watchdog
functionality in the System UI as well as in QML applications.
\table
\header
\li Config Key
\li Type
\li Description
\row
\li \c eventloop/checkInterval
\li duration
\li If set to a positive time duration, the main event loop will be monitored by triggering
a timer every \c checkInterval. (default: off)
\row
\li \c eventloop/warnTimeout
\li duration
\li In case the check timer is not firing within \c warnTimeout, the watchdog will print a
warning. In addition another warning will be printed if the timer does eventually fire,
stating the exact duration the event loop was blocked. (default: off)
\row
\li \c eventloop/killTimeout
\li duration
\li In case the check timer is not firing within \c killTimeout, the watchdog will print a
critical warning and then abort the thread running the main event loop. (default: off)
\row
\li \c quickwindow/checkInterval
\li duration
\li The render thread monitor works a bit differently from the event loop and Wayland monitors:
instead of just a single "blocked" state, three different states are monitored:
\list
\li \c Sync: The time it takes for the render thread to synchronize with the main thread.
\li \c Render: The time it takes for the render thread to actually render a frame.
\li \c Swap: The time the render thread spends in the graphics driver, swapping buffers.
\endlist
As a render thread is not always actively rendering, the watchdog will only print a
warning every \c checkInterval, if the thread is active and stuck in one of the
aforementioned states. This periodic report also contains some statistics on how often
the render thread got stuck in each state. (default: off)
\row
\li \c quickwindow/warnTimeout
\li duration
\li The watchdog will print a warning if a render thread is stuck in any of the syncing,
rendering or swapping states for longer than \c warnTimeout. In addition another warning
will be printed if the thread eventually leaves the state it was stuck in, stating the
exact duration it was blocked. (default: off)
\row
\li \c quickwindow/killTimeout
\li duration
\li In case a render thread is stuck in any of the syncing, rendering or swapping states for
longer than \c killTimeout, the watchdog will print a critical warning and then abort
the thread. (default: off)
\row
\li \c wayland/checkInterval
\li duration
\li If set to a positive time duration, all currently active Wayland clients that use the
XDG shell protocol will be pinged every \c checkInterval. (default: off)
\row
\li \c wayland/warnTimeout
\li duration
\li In case the pong reply from the Wayland client is not received within \c warnTimeout,
the watchdog will print a warning. In addition another warning will be printed if the
pong reply is eventually received, stating the exact duration the ping/pong round-trip
took. (default: off)
\row
\li \c wayland/killTimeout
\li duration
\li In case the pong reply from the Wayland client is not received within \c killTimeout,
the watchdog will print a critical warning and then kill the unresponsive Wayland
client. For application manager apps, ApplicationObject::stop() with \c forceKill set
to \c true will be invoked. Other apps will be killed by raising \c SIGKILL on the
process id associated with the Wayland client. (default: off)
\endtable
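The keys from the table above could be combined in the main configuration file roughly as
follows. This is a minimal sketch, not a recommended setup: it assumes that the slash-separated
keys map to nested YAML mappings below \c watchdog, and all durations are illustrative values
that need to be tuned for the actual device.

\badcode
formatVersion: 1
formatType: am-configuration
---
watchdog:
  eventloop:
    checkInterval: 1000ms   # enables main event loop monitoring
    warnTimeout: 2000ms     # warn if the check timer has not fired for 2000ms
    killTimeout: 10000ms    # abort the main thread after 10000ms
  quickwindow:
    checkInterval: 2000ms   # period of the render thread report
    warnTimeout: 100ms      # warn if stuck in sync, render or swap for 100ms
    killTimeout: 10000ms
  wayland:
    checkInterval: 5000ms   # ping XDG shell clients every 5000ms
    warnTimeout: 1000ms     # warn if the pong is missing after 1000ms
    killTimeout: 0ms        # 0ms: never kill unresponsive Wayland clients
\endcode

Leaving out a \c checkInterval (or setting it to \c 0ms) keeps the corresponding monitor disabled,
which matches the default.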
*/