Timely Information Updating for Mobile Devices Without and With ML Advice

Yu-Pin Hsu and Yi-Hsuan Tseng

A preliminary version of this work appeared in the Proc. of IEEE ISIT, 2019 [tseng2019online]. Yu-Pin Hsu is with the Department of Communication Engineering, National Taipei University, New Taipei City 237303, Taiwan (e-mail: [email protected]). Yi-Hsuan Tseng was with the Department of Communication Engineering, National Taipei University, New Taipei City 237303, Taiwan (e-mail: [email protected]).

Abstract

This paper investigates an information update system in which a mobile device monitors a physical process and sends status updates to an access point (AP). A fundamental trade-off arises between the timeliness of the information maintained at the AP and the update cost incurred at the device. To address this trade-off, we propose an online algorithm that determines when to transmit updates using only available observations. The proposed algorithm asymptotically achieves the optimal competitive ratio against an adversary that can simultaneously manipulate multiple sources of uncertainty, including the operation duration, the information staleness, the update cost, and the availability of update opportunities. Furthermore, by incorporating machine learning (ML) advice of unknown reliability into the design, we develop an ML-augmented algorithm that asymptotically attains the optimal consistency-robustness trade-off, even when the adversary can additionally corrupt the ML advice. The optimal competitive ratio scales linearly with the range of update costs, but is unaffected by other uncertainties. Moreover, an optimal competitive online algorithm exhibits a threshold-like response to the ML advice: it either fully trusts or completely ignores the ML advice, as partially trusting the advice cannot improve the consistency without severely degrading the robustness. Extensive simulations in stochastic settings further validate the theoretical findings in the adversarial environment.

I Introduction

In recent years, the demand for timely information has surged across diverse systems. In Internet-of-Things (IoT) networks (e.g., unmanned aerial vehicles deployed for disaster response [gupta2015survey]), each IoT device is equipped with sensors (e.g., GPS, radar, and temperature sensors) that continuously monitor its surroundings. These sensors generate status updates about physical processes and transmit them to a central controller. By aggregating such updates, the controller has a real-time view of the environment, thereby enabling intelligent decision-making. Similarly, in location-based smartphone applications (e.g., navigation and gaming [karki2020characterizing]), users frequently report their locations to a central server so that the service can respond in real time. In both cases, a central entity relies on timely status updates from mobile devices to perform time-sensitive inference tasks.

To quantify the timeliness of information maintained at a central entity, Kaul, Yates, and Gruteser introduced the age of information metric in [kaul2012real], defined as the time elapsed since the most recently received update was generated. Under this definition, the information at the central entity ages linearly with time until it is updated. In addition to the linear aging function, more general nonlinear aging functions [kosta2017age] have also been analyzed. These functions further characterize the quality of an update, e.g., capturing how quickly the information held by the central entity deviates from the true status, or representing the penalty associated with using outdated information in decision-making. In this paper, we consider general aging functions.

While frequent updates reduce the age of information at a central entity, they also incur substantial update costs (e.g., energy consumption and bandwidth utilization) at local devices. Such costs are particularly significant for resource-limited mobile devices (e.g., battery-powered and bandwidth-constrained IoT devices or smartphones). We therefore investigate the fundamental trade-off between the age of information at the central entity and the update cost at the mobile device. Specifically, this paper considers an information update system in which a mobile device monitors a physical process and reports its latest status to a nearby access point (AP). To balance information timeliness at the AP with resource consumption at the device, a scheduling algorithm that determines when to transmit updates is crucial. Our goal is to design such an algorithm to minimize the total cost over the operation duration, where the total cost jointly accounts for an age cost (representing the AP’s information staleness) and an update cost (representing the device’s resource expenditure).

The scheduling problem is complicated by several forms of uncertainty in mobile networks: 1) The operation duration is uncertain, e.g., the runtime of a location-based application depends on how long a smartphone user keeps the application active. 2) The age increment may vary over time, e.g., a location update becomes stale more quickly when the device moves at a higher speed. 3) The update cost is also time-varying, e.g., user mobility causes fluctuations in energy consumption. 4) The device’s update opportunities may also be intermittent. Such cases include sporadic update arrivals (e.g., due to misalignment between the update generation periods and transmission slots) and transmission constraints that prevent the device from sending in certain slots (e.g., due to a power-saving policy [lin2022survey] or uplink scheduling decisions imposed by the AP [takeda2020understanding]).

Most prior works modeled uncertainties using stationary stochastic processes, e.g., employing an M/M/1 queueing model in [kaul2012real] to represent the update arrival and service processes. However, such assumptions are often unrealistic, e.g., when a device moves arbitrarily so that the service time no longer follows an exponential distribution. Even if such models fit reality reasonably well, the operation duration may be too short for the process to converge to stationarity. Moreover, practical stochastic models for the operation duration or the information aging are often unclear. Such non-stationary uncertainties pose the central challenge for our scheduling design. Under non-stationarity, a scheduling algorithm cannot rely on future knowledge and must instead operate solely on past and present observations, i.e., as an online algorithm.

Our first contribution is the design and analysis of online scheduling algorithms that operate under observable information. Specifically, the proposed algorithm requires knowledge only of the current age increment of the status held by the AP and also whether an update opportunity is currently available, without relying on any prior knowledge of the operation duration or the entire sequences of status aging, update costs, and update opportunities. Let $R$ denote the ratio between the maximum and minimum update cost. Our main result establishes that, asymptotically in the large update cost regime, the proposed algorithm achieves a total cost at most $\frac{e^{1/R}}{e^{1/R}-1}=\mathcal{O}(R)$ (known as the competitive ratio) times the minimum total cost attained by an optimal offline algorithm with complete knowledge of all uncertainties. This competitive ratio turns out to be optimal. Thus, we can observe that the optimal competitive ratio scales linearly with the range of update costs, while remaining independent of all other sources of uncertainty.

The above guarantee holds in the worst case (also referred to as the adversarial environment), across all possible uncertainty instances. However, such worst-case analysis can be overly pessimistic, since in practice future events often follow patterns that can be predicted using machine learning (ML). Motivated by this, Lykouris and Vassilvitskii [lykouris2018competitive] proposed incorporating (potentially imperfect) ML advice into online algorithms, as an approach that goes beyond the worst-case analysis. A central challenge in this setting is that the reliability of the ML advice is generally unknown. Thus, the design goal is twofold: 1) when the ML advice is accurate, the algorithm should perform well; 2) when the advice is unreliable, the algorithm should still provide performance guarantees. However, it is generally impossible to optimize both properties simultaneously. For example, an algorithm that blindly trusts the ML advice performs excellently under accurate ML advice but can suffer arbitrarily poor performance when the ML advice is wrong. Hence, the objective is to optimally balance these two properties.

Our second contribution is to integrate ML advice (specifically, advice on the next update time) into our online scheduling framework. We introduce a hyperparameter $\lambda\in(0,1]$ to control the level of trust in the ML advice, where a smaller $\lambda$ places greater reliance on the ML advice. Our main result is that the proposed algorithm achieves the following asymptotic trade-off: 1) it achieves a total cost at most $\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}=\mathcal{O}(R)$ (known as the consistency) times the cost of blindly following the ML advice; 2) it also achieves a total cost at most $\frac{e^{\lambda/R}}{e^{\lambda/R}-1}=\mathcal{O}(\frac{R}{\lambda})$ (known as the robustness) times the minimum cost achieved by an optimal offline algorithm. Again, this balance depends only on the ratio $R$ and turns out to be optimal. We can observe that partially trusting the ML advice with $\lambda\in(0,1)$ cannot leverage the ML advice, as it yields (asymptotically) no improvement in the consistency. Thus, an optimal online algorithm displays an almost threshold-type behavior with respect to ML advice, either fully adopting it or ignoring it altogether.
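To make the stated bounds concrete, the following minimal sketch evaluates the three closed-form expressions quoted above for a few values of $R$ and $\lambda$. The function names are illustrative and not part of the paper; the code only evaluates the formulas and does not implement the scheduling algorithm.

```python
import math

def competitive_ratio(R: float) -> float:
    """Evaluate e^{1/R} / (e^{1/R} - 1), the asymptotic competitive ratio."""
    return math.exp(1.0 / R) / math.expm1(1.0 / R)

def robustness(lam: float, R: float) -> float:
    """Evaluate e^{lam/R} / (e^{lam/R} - 1), the asymptotic robustness."""
    return math.exp(lam / R) / math.expm1(lam / R)

def consistency(lam: float, R: float) -> float:
    """Evaluate lam * e^{lam/R} / (e^{lam/R} - 1), the asymptotic consistency."""
    return lam * robustness(lam, R)

# Tabulate the bounds; the chosen values of R and lam are arbitrary examples.
for R in (1.0, 10.0, 100.0):
    for lam in (1.0, 0.5, 0.1):
        print(f"R={R:6.1f} lam={lam:3.1f} "
              f"consistency={consistency(lam, R):9.3f} "
              f"robustness={robustness(lam, R):9.3f}")
```

For large $R$, the printed consistency stays close to $R$ for every $\lambda$, while the robustness grows roughly like $R/\lambda$, which is the numerical counterpart of the threshold-type observation above.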

II Related Work

Extensive research has been conducted on analyzing and minimizing the age of information in diverse system settings. For example, Costa et al. [costa2016age] derived closed-form expressions for the average age in single-source systems; then, Yates and Kaul [yates2018age] extended the analysis to multi-source scenarios. Building on the foundational work, numerous system design strategies have been proposed to minimize the age, including scheduling algorithms [kadota2018scheduling], resource allocation schemes [park2020centralized], and sampling strategies [ornee2019sampling]. Beyond solely minimizing the age, several trade-off problems have also been investigated, such as age–throughput trade-off [mankar2021throughput, wang2025understanding] and age–energy trade-off [nath2018optimum, gu2019timely]. A comprehensive survey of these efforts is provided in [yates2021age].

Most prior age-related works assume stationary stochastic processes to model uncertainties (e.g., [yates2018age, kadota2018scheduling, park2020centralized, ornee2019sampling, mankar2021throughput, wang2025understanding, nath2018optimum, gu2019timely]). Since such assumptions can be overly optimistic, several works have examined how non-stationary (adversarial) environments affect information timeliness from different perspectives. Examples include adversarial ON/OFF channels [tseng2019online, sinha2022optimizing], adversarial update arrivals [saurav2023online], and adversarial aging functions [lin2025optimal, tripathi2021online]. Recent work [liu2025learning] further incorporated ML advice into online algorithms for adversarial ON/OFF channels.

To the best of our knowledge, there is no unified design and analysis framework capable of handling an adversary that simultaneously controls multiple sources of uncertainty as in our model, where the adversary can jointly manipulate the operation duration, information aging, update cost, and update opportunities. This gap is critical, since mobile networks inherently involve several forms of non-stationary uncertainty, and closing it is technically challenging because the adversary is so powerful. In particular, the impact of adversarially varying update costs (beyond simple ON/OFF channel models) on age-driven design has not been explored in the existing literature, and our results reveal that it is the most critical source of uncertainty affecting performance.

We address these challenges gradually. Sections III–V focus on the first three types of uncertainty introduced in Section I, and Section VII further generalizes the results to incorporate the fourth type of uncertainty.

III System overview

Figure 1: An example network model: (a) a mobile device updating an AP; (b) the age of information at the AP when the device sends updates in slots 3 and 5.

As illustrated in Fig. 1(a), we consider an information update system in which a mobile device monitors a physical process and reports its latest status to a nearby access point (AP). The system operates in discrete time slots indexed by $t=1,2,\cdots,T$ , where $T$ represents the total operation duration.

We begin with a scenario in which the device always has an update packet at the beginning of every slot and is also permitted to transmit it in every slot. Then, for each slot $t$ , the device decides whether to transmit the update. Let $d(t)\in\{0,1\}$ denote the device’s transmission decision, where $d(t)=1$ if the device transmits in slot $t$ , and $d(t)=0$ otherwise. If the device transmits at the beginning of a slot, the update is delivered by the end of that slot. In Section VII, we extend the model to more general scenarios in which the device cannot transmit updates in certain slots, i.e., under intermittent update opportunities.

III-A Age of information

If the device decides to transmit an update at the beginning of slot $t$ , then the age of information at the AP is reset to zero at the end of slot $t$ , indicating that the AP has received the latest update. Otherwise, the age increases by an amount $\Delta A(t)$ to reflect the continued staleness of the information at the AP. The value of $\Delta A(t)$ can vary across slots $t$ . Let $A(t)$ denote the age of information maintained by the AP at the end of slot $t$ . As illustrated in Fig. 1(b), the evolution of $A(t)$ across time slots is given by

$$A(t)=\begin{cases}A(t-1)+\Delta A(t),&\text{if }d(t)=0,\\ 0,&\text{if }d(t)=1,\end{cases}\tag{1}$$

with initial age $A(0)=A_{0}$ , where $A_{0}$ is specified by the AP during the initial connection. We define the age increment sequence as $\boldsymbol{\Delta A}=(\Delta A(1),\Delta A(2),\cdots,\Delta A(T))$ .
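The recursion in Eq. (1) can be sketched in a few lines of Python. The function name `age_trajectory` and the example values below are assumptions chosen for illustration; the update slots 3 and 5 mirror the scenario described in the caption of Fig. 1(b), but the increments and initial age used in the figure itself are not known here.

```python
from typing import List, Sequence

def age_trajectory(decisions: Sequence[int],
                   increments: Sequence[int],
                   initial_age: int = 0) -> List[int]:
    """Return [A(1), ..., A(T)] following Eq. (1).

    decisions[t-1] is d(t) and increments[t-1] is Delta A(t).
    """
    if len(decisions) != len(increments):
        raise ValueError("decisions and increments must have the same length")
    ages = []
    age = initial_age
    for d, delta in zip(decisions, increments):
        # An update resets the age to zero; otherwise the age grows by delta.
        age = 0 if d == 1 else age + delta
        ages.append(age)
    return ages

# Hypothetical example: updates in slots 3 and 5, unit increments, A_0 = 0.
print(age_trajectory([0, 0, 1, 0, 1, 0], [1, 1, 1, 1, 1, 1]))
```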

III-B Problem formulation

While transmitting updates in every slot minimizes the age of information at the AP, it also incurs substantial resource consumption (e.g., energy and bandwidth) at the device. To capture this trade-off, we introduce two cost metrics: the age cost and the update cost. Specifically, we assume that each unit of age in a slot incurs a cost of one unit; thus, the age cost in slot $t$ is given by $A(t)$. In addition, if the device transmits an update in slot $t$, it incurs an update cost denoted by $C(t)$. For instance, $C(t)$ can be modeled as the product of a unit cost $C_{u}(t)$ and the transmission energy $\mathcal{E}(t)$, i.e., $C(t)=C_{u}(t)\mathcal{E}(t)$. Here, $\mathcal{E}(t)$ depends on the instantaneous channel condition between the device and the AP, which may fluctuate due to user mobility. Meanwhile, $C_{u}(t)$ may also vary over time: it may depend on the device’s remaining energy, or, when multiple packets are present, different unit costs may be assigned to prioritize certain packets over others. The form $C_{u}(t)\mathcal{E}(t)$ can also be interpreted more generally as a unit cost multiplied by resource expenditure (e.g., energy or bandwidth). However, for clarity, in the remainder of this paper we focus on the energy example. We define the update cost sequence as $\mathbf{C}=(C(1),\cdots,C(T))$.

To balance the age cost and the update cost, the device needs a scheduling algorithm defined as $\boldsymbol{\pi}=(d(1),\cdots,d(T))$ . The total cost incurred by a scheduling algorithm $\boldsymbol{\pi}$ depends on the sources of uncertainty, including the operation duration $T$ , the age increment sequence $\boldsymbol{\Delta A}$ , and the update cost sequence $\mathbf{C}$ . We represent this uncertainty instance as $\mathcal{I}=\{T,\boldsymbol{\Delta A},\mathbf{C}\}$ . Given an instance $\mathcal{I}$ and a scheduling algorithm $\boldsymbol{\pi}$ , the total cost is defined as the sum of age and update costs:

$$J(\mathcal{I},\boldsymbol{\pi})=\sum_{t=1}^{T}\Big(C(t)d(t)+A(t)\Big).\tag{2}$$

Our objective is to design a scheduling algorithm $\boldsymbol{\pi}$ that minimizes the total cost $J(\mathcal{I},\boldsymbol{\pi})$ .
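As a rough illustration of Eq. (2), the sketch below accumulates the update cost and the age cost slot by slot for a given decision sequence. The function name `total_cost` and the numerical instance (unit increments, constant cost 3, six slots) are assumptions made only for this example.

```python
from typing import Sequence

def total_cost(decisions: Sequence[int],
               increments: Sequence[float],
               costs: Sequence[float],
               initial_age: float = 0) -> float:
    """Evaluate J(I, pi) = sum_t (C(t) d(t) + A(t)) from Eq. (2)."""
    age = initial_age
    total = 0
    for d, delta, c in zip(decisions, increments, costs):
        if d == 1:
            total += c      # update cost C(t)
            age = 0         # age is reset by the update, Eq. (1)
        else:
            age += delta    # age grows by Delta A(t)
        total += age        # age cost A(t) at the end of slot t
    return total

increments = [1] * 6
costs = [3] * 6
print(total_cost([0] * 6, increments, costs))             # never update
print(total_cost([1] * 6, increments, costs))             # update in every slot
print(total_cost([0, 0, 1, 0, 1, 0], increments, costs))  # update in slots 3 and 5
```

In this toy instance, updating in every slot and never updating are both noticeably more expensive than updating occasionally, which is the trade-off the objective is meant to capture.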

III-C Scheduling Classification

A scheduling algorithm is referred to as an offline scheduling algorithm if it has prior knowledge of the entire instance $\mathcal{I}$ . Such algorithms are generally impractical in real-world systems due to their reliance on future information. Thus, this paper focuses on a more realistic design, where the device has access only to the historical or current information but lacks knowledge of future values.

Because of the potential unavailability of real-time channel information, the current update cost $C(t)$ may not be known at the beginning of slot $t$ . Therefore, we design scheduling algorithms that do not rely on instantaneous channel knowledge. Instead, the algorithms use only the maximum and minimum possible update costs, denoted by $C_{M}$ and $C_{m}$ . The values of $C_{M}$ and $C_{m}$ can be estimated from the historically observed worst and best channel conditions (e.g., those observed by the AP within its service region and reported by the AP). If these values are unavailable, see Remark 10 for a slight modification of the proposed algorithms that preserves some performance guarantee.

A scheduling algorithm is called a (cost-agnostic) online scheduling algorithm if it requires only the constants $C_{M}$ and $C_{m}$ and the realized age increments up to the current slot. For simplicity, we will omit the term cost-agnostic, with the understanding that all references to online scheduling algorithms refer to the cost-agnostic setting. Without access to the complete uncertainty instance, an online algorithm is generally unable to achieve the minimum total cost attainable by an optimal offline scheduling algorithm. Given an instance $\mathcal{I}$ , let $\mathrm{OPT}(\mathcal{I})$ denote the minimum total cost achieved by an optimal offline algorithm. We evaluate the performance of online algorithms in terms of their competitiveness [buchbinder2009design] relative to the offline optimum, defined as follows.

Definition 1.

A scheduling algorithm $\boldsymbol{\pi}$ is said to be $\gamma$ -competitive if $\frac{J(\mathcal{I},\boldsymbol{\pi})}{\mathrm{OPT}(\mathcal{I})}\leq\gamma$ , for all possible instances $\mathcal{I}$ .

That is, a $\gamma$ -competitive online scheduling algorithm guarantees that the resulting total cost is at most $\gamma$ times the offline minimum cost, regardless of the instance $\mathcal{I}$ . Our goal is to design an online scheduling algorithm that achieves the smallest possible competitive ratio $\gamma$ .
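For small instances, $\mathrm{OPT}(\mathcal{I})$ can be found by exhaustive search, which gives a simple way to inspect the ratio in Definition 1 on a single instance. The sketch below is only an illustration under that assumption: the helper names are hypothetical, the search is exponential in $T$, and a ratio measured on one instance is merely a lower bound on the competitive ratio of the tested schedule, since Definition 1 quantifies over all instances.

```python
from itertools import product
from typing import Sequence

def total_cost(decisions: Sequence[int], increments: Sequence[float],
               costs: Sequence[float], initial_age: float = 0) -> float:
    """Total cost of Eq. (2) under the age recursion of Eq. (1)."""
    age, total = initial_age, 0
    for d, delta, c in zip(decisions, increments, costs):
        if d == 1:
            total += c
            age = 0
        else:
            age += delta
        total += age
    return total

def offline_optimum(increments: Sequence[float], costs: Sequence[float],
                    initial_age: float = 0) -> float:
    """Brute-force OPT(I): try all 2^T decision sequences (small T only)."""
    T = len(increments)
    return min(total_cost(d, increments, costs, initial_age)
               for d in product((0, 1), repeat=T))

def empirical_ratio(decisions: Sequence[int], increments: Sequence[float],
                    costs: Sequence[float], initial_age: float = 0) -> float:
    """J(I, pi) / OPT(I) on one instance; assumes OPT(I) > 0."""
    return (total_cost(decisions, increments, costs, initial_age)
            / offline_optimum(increments, costs, initial_age))

# Hypothetical instance: unit increments, constant update cost 3, T = 6.
increments, costs = [1] * 6, [3] * 6
print(offline_optimum(increments, costs))
print(empirical_ratio([1] * 6, increments, costs))  # always-update schedule
```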

Note that in addition to the competitive ratio, another common performance metric is regret [lattimore2020bandit]. Regret measures the additive performance gap between an online learning approach and the best offline algorithm restricted to fixed decision rules. In contrast, the competitive ratio compares the performance of an online algorithm against the best offline algorithm without such restrictions. Moreover, online learning approaches typically provide regret guarantees only as the decision horizon grows to infinity. Such guarantees are less suitable for our setting, since a device may monitor and transmit updates only for a short and unpredictable duration.

Moreover, with the advancement of machine learning (ML) techniques, it is increasingly feasible to leverage ML to provide scheduling advice. A scheduling algorithm is referred to as an online scheduling algorithm with ML if it additionally has access to ML advice. Let $\mathcal{M}(t)\in\{0,1\}$ denote the decision advised by ML at slot $t$ . We define the sequence $\boldsymbol{\mathcal{M}}=(\mathcal{M}(1),\cdots,\mathcal{M}(T))$ as the ML advice. Because such a scheduling algorithm can adapt its actions based on the ML advice, its decision sequence $\boldsymbol{\pi}$ is allowed to be a function of $\boldsymbol{\mathcal{M}}$ .

This paper considers the setting where the ML advice may be untrusted. While following perfect ML advice yields the minimum total cost, blindly trusting imperfect ML advice may lead to poor performance. Moreover, we assume that the reliability of the ML advice is unknown a priori. In this context, we characterize online scheduling algorithms with ML in terms of two metrics introduced in [lykouris2018competitive]: consistency and robustness. Consistency quantifies performance relative to the ML advice, while robustness guarantees performance in the worst case. These notions are formally defined as follows.

Definition 2.

An online scheduling algorithm $\boldsymbol{\pi}$ with ML advice is said to be $\alpha$ -consistent if $\frac{J(\mathcal{I},\boldsymbol{\pi})}{J(\mathcal{I},\boldsymbol{\mathcal{M}})}\leq\alpha$ , for all possible instances $\mathcal{I}$ and ML advice $\boldsymbol{\mathcal{M}}$ ; it is said to be $\beta$ -robust if $\frac{J(\mathcal{I},\boldsymbol{\pi})}{\mathrm{OPT}(\mathcal{I})}\leq\beta$ , for all possible instances $\mathcal{I}$ and advice $\boldsymbol{\mathcal{M}}$ .

In other words, an $\alpha$ -consistent and $\beta$ -robust online scheduling algorithm ensures that 1) when the ML advice is perfect, the resulting total cost is at most $\alpha$ times the offline minimum cost, and 2) when the advice is arbitrary or even adversarial, the cost remains within a factor of $\beta$ of the offline minimum cost. An algorithm that fully trusts the ML advice may achieve near-optimal consistency, but this often comes at the expense of robustness. Therefore, there exists an inherent trade-off between consistency $\alpha$ and robustness $\beta$ . The goal of this paper is to design an online scheduling algorithm with ML that achieves the optimal consistency-robustness trade-off, namely, to minimize the consistency $\alpha$ for any fixed robustness $\beta$ .
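The tension between the two metrics can be seen on a toy example. The sketch below assumes a policy that copies the ML advice verbatim and an advice sequence that never updates (both are hypothetical choices for illustration). It computes the two ratios of Definition 2 on single instances, using a dynamic program over the last update slot to obtain $\mathrm{OPT}(\mathcal{I})$; this dynamic program is one possible way to compute the offline optimum and is not taken from the paper.

```python
from typing import Sequence, Tuple

def total_cost(decisions: Sequence[int], increments: Sequence[float],
               costs: Sequence[float], initial_age: float = 0) -> float:
    """Total cost of Eq. (2) under the age recursion of Eq. (1)."""
    age, total = initial_age, 0
    for d, delta, c in zip(decisions, increments, costs):
        if d == 1:
            total += c
            age = 0
        else:
            age += delta
        total += age
    return total

def offline_optimum(increments: Sequence[float], costs: Sequence[float],
                    initial_age: float = 0) -> float:
    """OPT(I) via dynamic programming over the most recent update slot."""
    T = len(increments)
    inf = float("inf")
    # best[s]: minimum cost of slots 1..s given an update in slot s
    # (s = 0 means that no update has happened yet).
    best = [inf] * (T + 1)
    best[0] = 0
    answer = inf
    for s in range(T + 1):
        age = initial_age if s == 0 else 0
        holding = 0  # age cost of slots s+1..u-1 when none of them updates
        for u in range(s + 1, T + 1):
            best[u] = min(best[u], best[s] + holding + costs[u - 1])
            age += increments[u - 1]
            holding += age
        answer = min(answer, best[s] + holding)  # no update after slot s
    return answer

def blind_trust_ratios(T: int, cost: float) -> Tuple[float, float]:
    """Ratios J(pi)/J(M) and J(pi)/OPT when pi copies a never-update advice."""
    increments, costs = [1] * T, [cost] * T
    advice = [0] * T
    j_policy = total_cost(advice, increments, costs)  # pi follows the advice
    j_advice = total_cost(advice, increments, costs)
    return j_policy / j_advice, j_policy / offline_optimum(increments, costs)

for T in (10, 20, 40):
    print(T, blind_trust_ratios(T, cost=3.0))
```

On these instances the first ratio is always 1, whereas the second keeps growing with $T$, which illustrates why full trust in the advice cannot come with a bounded robustness guarantee.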

IV Linear program formulation

The main challenge in designing our scheduling algorithm arises from two sources: several forms of uncertainty that are impractical to model using stationary stochastic processes, and the limited observations available in the current slot. To address these challenges, we use online algorithm design techniques based on linear programming [buchbinder2009design, bamas2020primal].

However, casting our problem as a linear program (LP) is non-trivial due to the non-linear nature of the age cost. For example, consider a scenario where the device transmits an update in slot $1$ and schedules the next update in slot $x$. If the age increases linearly by one unit per slot until the next update, then by Eq. (1) we have $A(t)=t-1$ for $1\leq t<x$ and $A(x)=0$, so the cumulative age from slot $1$ to slot $x$ is $\sum_{t=1}^{x-1}(t-1)=(x-1)(x-2)/2$, which grows quadratically with the decision variable $x$. See [arafa2017age] for a concrete example illustrating this behavior. This quadratic growth implies that the total age cost $\sum_{t=1}^{T}A(t)$ in Eq. (2) includes non-linear terms, thereby complicating direct LP formulation.

To overcome this issue, we introduce a transformation of the age evolution into an equivalent virtual queueing system, described in Section IV-A. This transformation facilitates an LP formulation for the offline scheduling problem, as presented in Section IV-B. The resulting LP formulation serves as the foundation for the design and analysis of our online scheduling algorithms: Section V develops an online scheduling algorithm without ML, Section VI incorporates ML advice into the scheduling process, and Section VII extends the model to intermittent update opportunities.

IV-A Virtual queueing system

Without loss of generality, we assume that the age increment $\Delta A(t)$ is an integer for all $t$ . If this is not the case, we can multiply both $C(t)$ and $\Delta A(t)$ by a common constant so that every $\Delta A(t)$ becomes integer-valued. Such scaling does not alter the optimal solution to the objective in Eq. (2). Based on this assumption, we introduce a virtual queue that mirrors the evolution of the integer-valued age.
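The scaling step can be sketched as follows for rational age increments. This is an illustrative assumption: the helper names are hypothetical, the increments are represented as `fractions.Fraction` values, and the initial age $A_{0}$ is scaled by the same constant so that the whole objective in Eq. (2) is multiplied by one common factor.

```python
from fractions import Fraction
from functools import reduce
from math import gcd
from typing import List, Sequence, Tuple

def common_multiplier(values: Sequence[Fraction]) -> int:
    """Smallest positive integer k such that k * v is an integer for all v."""
    return reduce(lambda a, b: a * b // gcd(a, b),
                  (v.denominator for v in values), 1)

def scale_instance(increments: Sequence[Fraction],
                   costs: Sequence[Fraction],
                   initial_age: Fraction = Fraction(0)
                   ) -> Tuple[int, List[int], List[Fraction], int]:
    """Scale increments, costs, and A_0 by one constant k.

    After scaling, every increment (and A_0) is an integer, and the total
    cost of any schedule is multiplied by k, so the ranking of schedules
    does not change.
    """
    k = common_multiplier(list(increments) + [initial_age])
    scaled_increments = [int(inc * k) for inc in increments]
    scaled_costs = [c * k for c in costs]
    return k, scaled_increments, scaled_costs, int(initial_age * k)

# Hypothetical example with increments 1/2, 3/4, and 1.
print(scale_instance([Fraction(1, 2), Fraction(3, 4), Fraction(1)],
                     [Fraction(3)] * 3))
```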

We construct a virtual queueing system (shown in Fig. 2) consisting of a virtual server, a virtual queue, and virtual packet arrivals. The virtual system operates in the same discrete time slots as the real mobile network. Initially, the virtual queue contains $A_{0}$ virtual packets. At the beginning of each slot $t$ , $\Delta A(t)$ virtual packets arrive at the virtual queue. If the device decides to transmit an update in slot $t$ (in the actual network), then the virtual server clears the virtual queue at the end of the slot. Otherwise, the virtual server remains idle and the virtual packets accumulate. As a result, the virtual queue size evolves as follows: it resets to zero if $d(t)=1$ (i.e., the virtual server clears the virtual queue, corresponding to an update in the actual network), or increases by $\Delta A(t)$ if $d(t)=0$ (i.e., the virtual server idles, corresponding to no update in the actual network). This evolution exactly mirrors the age dynamics in Eq. (1). Therefore, we use the same notation $A(t)$ to denote the virtual queue size at the end of slot $t$ .

We index the virtual packets by $1,2,\cdots$ according to their arrival times, and let $T_{i}$ denote the slot in which virtual packet $i$ arrives. For each virtual packet $i$, we use a binary variable $z_{i}(t)\in\{0,1\}$ to indicate whether it remains in the virtual queue at the end of slot $t$, where $z_{i}(t)=1$ if it is still present, and $z_{i}(t)=0$ otherwise. Using this notation, the virtual queue size at the end of slot $t$ can be expressed as $A(t)=\sum_{i:T_{i}\leq t}z_{i}(t)$, which counts the number of virtual packets that have arrived by slot $t$ and remain in the virtual queue. This representation allows us to express the age $A(t)$ as a linear function of the binary variables $z_{i}(t)$. Substituting this expression into Eq. (2), we can rewrite the total cost $J(\mathcal{I},\boldsymbol{\pi})$ as the following linear function:

$$J(\mathcal{I},\boldsymbol{\pi})=\sum_{t=1}^{T}\left(C(t)d(t)+\sum_{i:T_{i}\leq t}z_{i}(t)\right).\tag{3}$$

This linear expression facilitates the formulation of an LP in the next section. Moreover, by Eq. (3), the update cost $C(t)$ can also be interpreted as a clearing cost incurred when the virtual queue is cleared in slot $t$ , while holding a virtual packet for one slot incurs a unit holding cost.
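The equivalence between the virtual queue size and the age can be checked numerically. The sketch below assumes $A_{0}=0$ (so no initial virtual packets need to be indexed) and integer increments; the helper names are hypothetical. It builds the arrival slots $T_{i}$, evaluates $z_{i}(t)$ directly from the decision sequence, and compares $\sum_{i:T_{i}\leq t}z_{i}(t)$ with the recursion in Eq. (1).

```python
from typing import List, Sequence

def packet_arrival_slots(increments: Sequence[int]) -> List[int]:
    """Arrival slot T_i of each virtual packet: Delta A(t) packets arrive in slot t."""
    slots: List[int] = []
    for t, delta in enumerate(increments, start=1):
        slots.extend([t] * delta)
    return slots

def remaining(arrival_slot: int, t: int, decisions: Sequence[int]) -> int:
    """z_i(t): 1 if the queue was not cleared in any slot T_i, ..., t, else 0."""
    cleared = any(decisions[tau - 1] == 1 for tau in range(arrival_slot, t + 1))
    return 0 if cleared else 1

def queue_sizes(decisions: Sequence[int], increments: Sequence[int]) -> List[int]:
    """Virtual queue size sum_{i: T_i <= t} z_i(t) for t = 1, ..., T."""
    arrivals = packet_arrival_slots(increments)
    return [sum(remaining(Ti, t, decisions) for Ti in arrivals if Ti <= t)
            for t in range(1, len(decisions) + 1)]

def ages(decisions: Sequence[int], increments: Sequence[int]) -> List[int]:
    """Age recursion of Eq. (1) with A_0 = 0."""
    out, age = [], 0
    for d, delta in zip(decisions, increments):
        age = 0 if d == 1 else age + delta
        out.append(age)
    return out

decisions = [0, 0, 1, 0, 1, 0]
increments = [1, 2, 1, 3, 1, 2]
print(queue_sizes(decisions, increments))
print(ages(decisions, increments))
```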

IV-B LP formulation

We note that the clear/idle behavior in the virtual queueing system directly corresponds to sending/withholding an acknowledgment (ACK) to clear all received packets in the Transmission Control Protocol (TCP). From this perspective, we can leverage prior studies for the classic online TCP ACK problem [buchbinder2009design, Chapter 12] to formulate our offline optimal scheduling problem as an integer program with a linear objective and constraint functions:


$$\begin{aligned}\min_{x(t),z_{i}(t)}\quad&\sum_{t=1}^{T}\left(C(t)x(t)+\sum_{i:T_{i}\leq t}z_{i}(t)\right)&&\text{(4a)}\\ \text{s.t.}\quad&z_{i}(t)+\sum_{\tau=T_{i}}^{t}x(\tau)\geq 1,\quad\text{for all $i$ such that $T_{i}\leq t$ and for all $t$};&&\text{(4b)}\\ &x(t),\,z_{i}(t)\in\{0,1\},\quad\text{for all $i$ and $t$.}&&\text{(4c)}\end{aligned}$$

In this integer program, we introduce a variable $x(t)$ to denote whether the device transmits an update in slot $t$ . That is, $x(t)$ in the integer program is exactly the decision variable $d(t)$ introduced earlier. The reason for redefining this variable is that we will relax $x(t)$ to take a real value between 0 and 1, which prevents immediate interpretation as a transmission decision. Later in Section V, we show how to convert a fractional solution for $x(t)$ into a randomized scheduling decision for $d(t)$ . Moreover, the constraint in Eq. (4b) ensures that each virtual packet $i$ arriving by slot $t$ either remains in the virtual queue at the end of slot $t$ (i.e., $z_{i}(t)=1$ in the first term of Eq. (4b)) or has been cleared by slot $t$ (i.e., there exists a slot $\tau\in\{T_{i},\cdots,t\}$ such that $x(\tau)=1$ in the second term of Eq. (4b)).

By relaxing the integrality constraint (4c) to allow continuous variables, we obtain the following LP:


$$\begin{aligned}\min_{x(t),z_{i}(t)}\quad&\sum_{t=1}^{T}\left(C(t)x(t)+\sum_{i:T_{i}\leq t}z_{i}(t)\right)&&\text{(5a)}\\ \text{s.t.}\quad&z_{i}(t)+\sum_{\tau=T_{i}}^{t}x(\tau)\geq 1,\quad\text{for all $i$ such that $T_{i}\leq t$ and for all $t$};&&\text{(5b)}\\ &x(t),\,z_{i}(t)\geq 0,\quad\text{for all $i$ and $t$.}&&\text{(5c)}\end{aligned}$$

Next, Section V proposes an online algorithm to compute a feasible solution to LP (5) without relying on ML, while Section VI extends this approach by incorporating ML advice.
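To make LP (5) tangible, the sketch below checks the constraints (5b) and (5c) for a candidate solution and evaluates the objective (5a). The data layout (a list for $x$, a dictionary keyed by `(i, t)` for $z$) and all function names are assumptions made for this illustration. The helper `tightest_z` uses the observation that, for a fixed $x$, the smallest feasible choice is $z_{i}(t)=\max\{0,1-\sum_{\tau=T_{i}}^{t}x(\tau)\}$.

```python
from typing import Dict, Sequence, Tuple

ZMap = Dict[Tuple[int, int], float]  # keys are (packet index i, slot t)

def tightest_z(x: Sequence[float], arrivals: Sequence[int]) -> ZMap:
    """Smallest z satisfying (5b) and (5c) for a given x."""
    T = len(x)
    return {(i, t): max(0.0, 1.0 - sum(x[Ti - 1:t]))
            for i, Ti in enumerate(arrivals) for t in range(Ti, T + 1)}

def is_feasible(x: Sequence[float], z: ZMap, arrivals: Sequence[int],
                tol: float = 1e-9) -> bool:
    """Check constraints (5b) and (5c) up to a numerical tolerance."""
    if any(v < -tol for v in x):
        return False
    T = len(x)
    for i, Ti in enumerate(arrivals):
        for t in range(Ti, T + 1):
            if z[(i, t)] < -tol or z[(i, t)] + sum(x[Ti - 1:t]) < 1.0 - tol:
                return False
    return True

def objective(x: Sequence[float], z: ZMap, costs: Sequence[float]) -> float:
    """Objective (5a): clearing cost plus holding cost."""
    return sum(c * v for c, v in zip(costs, x)) + sum(z.values())

# Hypothetical instance: one virtual packet per slot, constant cost 3, T = 3.
arrivals, costs = [1, 2, 3], [3.0, 3.0, 3.0]
for x in ([0.5, 0.5, 0.5], [0.0, 1.0, 0.0]):
    z = tightest_z(x, arrivals)
    print(x, is_feasible(x, z, arrivals), objective(x, z, costs))
```

For the integral choice of $x$, the objective coincides with the total cost in Eq. (2) of the corresponding schedule, while the fractional choice is feasible for LP (5) but not for the integer program (4).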

Remark 3.

Before solving our online problem, we remark that, via the virtual queue transformation, our formulation generalizes the classical online TCP ACK problem [buchbinder2009design, Chapter 12] and its learning-augmented variant [bamas2020primal]. In the classical TCP ACK setting, each ACK incurs a constant cost. In contrast, our objective in Eq. (4a) allows the ACK (i.e., clearing) cost to vary across slots. Later, in Section VII, we further generalize the problem to scenarios where the ACK channel alternates between ON and OFF states, and transmitting an ACK during an ON slot incurs an adversarially chosen cost. This generalization is practically relevant in noisy wireless environments; moreover, it poses theoretical challenges, since the adversary simultaneously controls multiple sources of uncertainty and an online algorithm is restricted to transmitting an ACK only in ON slots. As we will show, our generalized setting also yields a fundamentally different optimal competitive ratio that depends solely on the cost range (i.e., time-varying costs are the dominant factor). When augmented with ML advice, our generalized setting exhibits behavior that differs qualitatively from the classic learning-augmented TCP model; in particular, it yields a threshold-like optimal trust rule when the cost range is large.

V Online scheduling algorithm design without ML

This section develops an online scheduling algorithm without ML by leveraging LP (5). Section V-A introduces an online algorithm that can compute a feasible solution to the proposed LP in an online fashion. Based on this fractional solution, Section V-C proposes a randomized online scheduling algorithm without ML.

V-A Online LP algorithm

We propose Alg. 1, referred to as the online LP algorithm, which computes a feasible solution to LP (5). All variables are initialized to zero in Line 1. At the beginning of each slot $t$ , the algorithm iteratively adjusts the variables for all virtual packets that have arrived by slot $t$ , as specified in Line 1.

The underlying idea is that, in each slot $t$, the scheduling algorithm proposed in Section V-C makes a probabilistic decision: it sets $d(t)=1$ with some probability and $d(t)=0$ otherwise. The probability is governed by the current value of $x(t)$, which is determined in Line 1 of Alg. 1. Accordingly, $x(t)$ can be interpreted as the probability of clearing the virtual queue in slot $t$. In this context, the cumulative sum $\sum_{\tau=T_{i}}^{t}x(\tau)$ represents the cumulative clearing probability (up to slot $t$) for virtual packet $i$.

With this interpretation, the condition in Line 1 checks whether virtual packet $i$ has already been cleared by slot $t$ . If $\sum_{\tau=T_{i}}^{t}x(\tau)\geq 1$ , virtual packet $i$ is considered cleared, and no further processing is required. Otherwise, the virtual packet may still remain in the virtual queue and its associated variables should be adjusted. As shown in Line 1, for each such packet, Line 1 increases the value of $x(t)$ . That is, the more virtual packets remain in the virtual queue, the higher the resulting clearing probability.

Moreover, the idea behind Line 1 is that it adjusts the cumulative clearing probability $\sum_{\tau=T_{i}}^{t}x(\tau)$ as follows:

	$\displaystyle\sum_{\tau=T_{i}}^{t}x(\tau)\leftarrow\sum_{\tau=T_{i}}^{t}x(\tau)+\underbrace{\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}}_{\text{increment of $x(t)$}}=\left(1+\frac{1}{C_{M}}\right)\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}},$	(6)

which scales the cumulative clearing probability by a multiplicative factor of $1+(1/C_{M})$ and then adds a constant term of $1/(\theta C_{M})$ . The constant $\theta$ is chosen as in Line 1 so that the algorithm asymptotically achieves the minimum achievable competitive ratio (as stated in Lemma 9). The appearance of $C_{M}$ in the denominators reflects that a larger update cost reduces the rate at which the clearing probability increases. In addition, Line 1 sets $z_{i}(t)=1-\sum_{\tau=T_{i}}^{t}x(\tau)$ to ensure that the constraint in Eq. (5b) is satisfied, so that Alg. 1 produces a feasible solution to LP (5).
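As a brief worked step (a sketch that tracks only the activations of a single virtual packet, with $S_{n}$ being shorthand introduced here for its cumulative clearing probability after $n$ applications of Eq. (6) starting from zero), the recursion can be unrolled as a geometric sum:

```latex
S_{n}=\left(1+\frac{1}{C_{M}}\right)S_{n-1}+\frac{1}{\theta C_{M}},\quad S_{0}=0
\;\;\Longrightarrow\;\;
S_{n}=\frac{1}{\theta C_{M}}\sum_{j=0}^{n-1}\left(1+\frac{1}{C_{M}}\right)^{j}
     =\frac{\left(1+\frac{1}{C_{M}}\right)^{n}-1}{\theta}.
```

With $\theta=(1+(1/C_{M}))^{C_{m}}-1$ , this expression reaches 1 at $n=C_{m}$ , which is consistent with the bounds stated later in Lemmas 4 and 5.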

Note that Alg. 1 operates in an online manner, as it requires only the constants $C_{M}$ and $C_{m}$ , and the knowledge of virtual arrivals up to the current slot (which corresponds to the age increment sequence up to the current slot), without relying on any future information.

/* Initialize all variables as follows: */
$x(t),z_{i}(t)\leftarrow 0$ for all $i$ and $t$;
$\theta\leftarrow(1+\frac{1}{C_{M}})^{C_{m}}-1$;
/* At the beginning of slot $t=1,\cdots,T$, adjust all variables as follows: */
foreach virtual packet $i$ such that $T_{i}\leq t$ do
    if $\sum_{\tau=T_{i}}^{t}x(\tau)<1$ then
        $z_{i}(t)\leftarrow 1-\sum_{\tau=T_{i}}^{t}x(\tau)$;
        $x(t)\leftarrow x(t)+\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}$;
    end if
end foreach

Algorithm 1 Online LP algorithm without ML
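The listing above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `online_lp` and the input encoding (a list of arrival slots $T_{i}$ and a list of per-slot costs $C(t)$ ) are assumptions made here, and the objective is accumulated per activation as $C(t)\cdot(\text{increment of }x(t))+z_{i}(t)$ .

```python
from typing import List, Tuple


def online_lp(arrivals: List[int], costs: List[float],
              c_min: float, c_max: float) -> Tuple[List[float], float]:
    """Sketch of Alg. 1: returns (x(1..T), objective value of LP (5)).

    arrivals[k] is the arrival slot T_i (1-indexed) of the k-th virtual
    packet and costs[t-1] is C(t); both encodings are assumptions.
    """
    horizon = len(costs)
    theta = (1.0 + 1.0 / c_max) ** c_min - 1.0
    x = [0.0] * (horizon + 1)           # x[t] for t = 1..T; x[0] is unused
    objective = 0.0
    for t in range(1, horizon + 1):
        for t_i in arrivals:
            if t_i > t:
                continue                # packet has not arrived yet
            cum = sum(x[t_i:t + 1])     # cumulative clearing probability
            if cum < 1.0:               # packet is not yet fully cleared
                z = 1.0 - cum           # holding variable z_i(t)
                inc = cum / c_max + 1.0 / (theta * c_max)
                x[t] += inc
                objective += costs[t - 1] * inc + z
    return x[1:], objective
```

For example, `online_lp([1], [2.0] * 5, 2.0, 2.0)` runs the sketch on a single virtual packet arriving in slot 1 with a constant cost of 2 over five slots.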

V-B Analysis of Alg. 1

In this section, we analyze the objective value in Eq. (5a) computed by Alg. 1. Unlike prior studies [buchbinder2009design, bamas2020primal] that analyze online algorithms for LPs using primal–dual techniques, our analysis exploits structural properties of Alg. 1 and its relation to an optimal offline scheduling algorithm. An advantage of our approach is that it provides a unified analysis framework for all proposed LP algorithms (including Algs. 1, 3, and 4) without the need to construct separate dual solutions for different scenarios.

Let $P(t)=\{i:T_{i}\leq t\}$ denote the set of virtual packets that have arrived by slot $t$ . The following two lemmas characterize properties of this set. Here, when a virtual packet satisfies the condition in Line 1 and thus triggers the operation in Line 1, we say that it activates. For clarity and continuity, we move most detailed proofs of this paper to the appendices in the supplemental material.

Lemma 4.

For a fixed slot $t$ , after the virtual packets in $P(t)$ have activated $n$ times since slot $t$ , the value computed by Alg. 1 satisfies

\displaystyle\sum_{\tau=T_{i}}^{\infty}x(\tau)\geq\frac{\left(1+\frac{1}{C_{M}}\right)^{n}-1}{\theta},

for all $i\in P(t)$ .

Proof.

(Sketch) We prove by induction on $n$ . See Appendix A for details. ∎

Lemma 4 immediately implies the following result.

Lemma 5.

For a fixed slot $t$ , the virtual packets in $P(t)$ can activate at most $\lceil C_{m}\rceil$ times since slot $t$ .

Proof.

Fix a slot $t$ . By Lemma 4 and the choice of $\theta=(1+(1/C_{M}))^{C_{m}}-1$ defined in Line 1, once the packets in $P(t)$ have activated $\lceil C_{m}\rceil$ times, we obtain

\displaystyle\sum_{\tau=T_{i}}^{\infty}x(\tau)

\displaystyle\;\geq\;\frac{\left(1+\frac{1}{C_{M}}\right)^{\lceil C_{m}\rceil}-1}{\left(1+\frac{1}{C_{M}}\right)^{C_{m}}-1}\;\geq\;1,

for all $i\in P(t)$ , which implies that the condition in Line 1 no longer holds. Hence, the virtual packets in $P(t)$ can activate at most $\lceil C_{m}\rceil$ times. ∎

Leveraging Lemma 5, we are now ready to analyze the objective value in Eq. (5a) achieved by Alg. 1 in the following theorem. The theorem also characterizes the asymptotic behavior when the update cost scales linearly with the energy consumption, i.e., $C(t)=C_{u}\mathcal{E}(t)$ for a constant unit cost $C_{u}$ . The asymptotic regime $C_{u}\to\infty$ models scenarios with severely constrained resources. In this regime, the competitive ratio depends only on the ratio between the maximum and minimum update costs, denoted by $R=C_{M}/C_{m}$ . Let $\mathcal{E}_{M}$ and $\mathcal{E}_{m}$ denote the maximum and minimum per-update energy consumption, respectively. Then, when $C(t)=C_{u}\mathcal{E}(t)$ , the same ratio can also be written as $R=\mathcal{E}_{M}/\mathcal{E}_{m}$ .

Theorem 6.

The objective value in Eq. (5a) computed by Alg. 1 at the end of slot $T$ is bounded above by

\displaystyle\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{(1+\frac{1}{C_{M}})^{C_{m}}-1}\right)\mathrm{OPT}(\mathcal{I}),

for all possible instances $\mathcal{I}$ . Moreover, as the unit cost $C_{u}$ scales to infinity, the ratio with respect to the optimum approaches $\frac{e^{1/R}}{e^{1/R}-1}$ .

Proof.

(Sketch) Fix an instance $\mathcal{I}$ . Suppose that an optimal offline scheduling algorithm clears the virtual queue in slots $t_{1},\cdots,t_{n}$ , performing a total of $n$ clearing operations. Let $t_{0}=0$ and $t_{n+1}=T$ . We divide the timeline into $n+1$ periods, where period $k$ consists of slots $t_{k-1}+1$ through $t_{k}$ .

Let $J^{*}(k)$ denote the cost incurred by an optimal offline scheduling algorithm in period $k$ . Let $H^{*}(k)$ denote the holding cost incurred by the optimal offline scheduling algorithm for all virtual packets arriving in period $k$ . Consider a fixed $k\in\{1,\cdots,n\}$ . Including the additional clearing cost in slot $t_{k}$ , we have $J^{*}(k)=H^{*}(k)+C(t_{k})$ .

Similarly, let $J(k)$ denote the increment of the objective value in Eq. (5a) by Alg. 1, according to the activations of all virtual packets that arrive in period $k$ . Note that one activation increases the objective value by

		$\displaystyle C(t)\left(\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}\right)+\left(1-\sum_{\tau=T_{i}}^{t}x(\tau)\right)$
	$\displaystyle\leq\;$	$\displaystyle 1+\frac{1}{\theta}\quad\text{(since $C(t)\leq C_{M}$).}$		(7)

Next, we count the number of activations made by the virtual packets arriving in period $k$ . First, $H^{*}(k)$ exactly counts the number of iterations of Line 1 from slot $t_{k-1}+1$ through slot $t_{k}-1$ in period $k$ for the virtual packets arriving during this period. Second, by Lemma 5, the virtual packets arriving in period $k$ can activate at most $\lceil C_{m}\rceil$ additional times from slot $t_{k}$ onward. Hence, they can activate at most $H^{*}(k)+\lceil C_{m}\rceil$ times in total.

Thus, we obtain

$\displaystyle J(k)$	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\big(H^{*}(k)+\lceil C_{m}\rceil\big)$	(8)
	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\frac{\lceil C_{m}\rceil}{C_{m}}\,\big(H^{*}(k)+C(t_{k})\big)$
	$\displaystyle\leq\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{\theta}\right)J^{*}(k).$

The inequality also holds for $k=n+1$ . Thus, the objective value computed by Alg. 1 satisfies

\sum_{k=1}^{n+1}J(k)\leq\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{\theta}\right)\sum_{k=1}^{n+1}J^{*}(k).

Substituting $\mathrm{OPT}(\mathcal{I})=\sum_{k=1}^{n+1}J^{*}(k)$ and $\theta=(1+(1/C_{M}))^{C_{m}}-1$ completes the proof. See Appendix B for details. ∎
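To illustrate the asymptotic statement in Theorem 6 numerically, the following sketch evaluates the finite-cost bound and compares it with the limit $\frac{e^{1/R}}{e^{1/R}-1}$ . The function names and the sample energy values are assumptions chosen for illustration; the cost model $C_{m}=C_{u}\mathcal{E}_{m}$ and $C_{M}=C_{u}\mathcal{E}_{M}$ follows the text.

```python
import math


def finite_bound(c_min: float, c_max: float) -> float:
    """Bound of Theorem 6: (1 + 1/C_m) * (1 + 1/theta)."""
    theta = (1.0 + 1.0 / c_max) ** c_min - 1.0
    return (1.0 + 1.0 / c_min) * (1.0 + 1.0 / theta)


def asymptotic_ratio(r: float) -> float:
    """Limit e^{1/R} / (e^{1/R} - 1), written as 1 / (1 - e^{-1/R})."""
    return 1.0 / -math.expm1(-1.0 / r)


if __name__ == "__main__":
    e_min, e_max = 1.0, 4.0             # assumed per-update energies, R = 4
    for c_u in (1.0, 10.0, 100.0, 1e4, 1e6):
        bound = finite_bound(c_u * e_min, c_u * e_max)
        print(f"C_u={c_u:g}: bound={bound:.6f}, "
              f"limit={asymptotic_ratio(e_max / e_min):.6f}")
```

The finite bound always lies above the limit, because $(1+1/C_{M})^{C_{m}}<e^{1/R}$ and $1+1/C_{m}>1$ , and the gap shrinks as $C_{u}$ grows.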

Next, we also use Lemma 5 to analyze the computational complexity of Alg. 1, as stated in the following lemma. Here, we use $\Delta A_{M}$ to denote the maximum value of $\Delta A(t)$ for all possible $t$ .

Lemma 7.

At the end of any slot, at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ virtual packets satisfy the condition in Line 1 of Alg. 1.

Proof.

See Appendix C for details. ∎

According to Lemma 7, at the beginning of slot $t$ , at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil+\Delta A_{M}$ virtual packets may satisfy the condition in Line 1, where the additional $\Delta A_{M}$ term accounts for the virtual packets that newly arrive in slot $t$ . Therefore, Line 1 needs at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil+\Delta A_{M}$ iterations. Hence, for a fixed $\Delta A_{M}$ , the number of iterations per slot grows only with the square root of the minimum update cost.

V-C Randomized online scheduling algorithm

/* Initialize all variables as follows: */
$x_{\text{pre-sum}},x_{\text{sum}},x(t)\leftarrow 0$ for all $t$;
$\theta\leftarrow(1+\frac{1}{C_{M}})^{C_{m}}-1$;
Choose a random number $u\in[0,1)$ with the continuous uniform distribution;
/* At the beginning of slot $t=1,\cdots,T$, do as follows: */
foreach virtual packet $i$ such that $T_{i}\leq t$ do
    if $\sum_{\tau=T_{i}}^{t}x(\tau)<1$ then
        $x(t)\leftarrow x(t)+\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}$;
    end if
end foreach
$x_{\text{pre-sum}}\leftarrow x_{\text{sum}}$;
$x_{\text{sum}}\leftarrow x_{\text{sum}}+\min\{x(t),1\}$;
if $x_{\text{pre-sum}}\leq u<x_{\text{sum}}$ then
    $d(t)\leftarrow 1$;
    $u\leftarrow u+1$;
else
    $d(t)\leftarrow 0$;
end if

Algorithm 2 Randomized online scheduling algorithm without ML.

Leveraging the fractional-to-probabilistic conversion technique proposed in [buchbinder2009design, Chapter 12], this section presents a randomized online scheduling algorithm in Alg. 2, which converts the fractional solution $x(t)$ generated by Alg. 1 into a probabilistic transmission decision.

Alg. 2 adjusts the variable $x(t)$ in Line 2 using the same rule as in Alg. 1. In addition, we introduce two auxiliary variables: $x_{\text{pre-sum}}$ and $x_{\text{sum}}$ . The variable $x_{\text{pre-sum}}$ records the cumulative sum of $\min\{x(t),1\}$ up to slot $t-1$ (Line 2), while $x_{\text{sum}}$ records the cumulative sum up to slot $t$ (Line 2). In Line 2, Alg. 2 selects a uniform random number $u\in[0,1)$ . Then, according to Lines 2 and 2, if there exists $k\in\mathbb{N}$ such that $u+k\in[x_{\text{pre-sum}},x_{\text{sum}})$ , then the device decides to transmit an update (Line 2); otherwise, the device idles (Line 2). The idea behind Alg. 2 mirrors the classical technique for sampling from a distribution using its cumulative distribution function. In particular, by the uniform randomness of $u$ , the probability of transmitting an update (equivalently, clearing the virtual queue) in slot $t$ is exactly $\min\{x(t),1\}$ , and the cumulative transmission probability by slot $t$ is $\min\left\{\sum_{\tau=T_{i}}^{t}x(\tau),1\right\}$ .
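The conversion step can be sketched as follows. For simplicity, this sketch assumes that the fractional values $x(1),\cdots,x(T)$ have already been computed and are passed in as a list, whereas Alg. 2 computes them slot by slot; the function name `randomized_schedule` and its optional arguments are hypothetical.

```python
import random
from typing import List, Optional


def randomized_schedule(x: List[float], u: Optional[float] = None,
                        rng: Optional[random.Random] = None) -> List[int]:
    """Convert fractional x(1..T) into 0/1 decisions d(1..T) as in Alg. 2."""
    if u is None:
        u = (rng or random).random()    # uniform on [0, 1)
    x_sum = 0.0
    decisions = []
    for x_t in x:
        x_pre_sum = x_sum
        x_sum += min(x_t, 1.0)
        if x_pre_sum <= u < x_sum:
            decisions.append(1)         # transmit (clear the virtual queue)
            u += 1.0                    # next threshold lies one unit further
        else:
            decisions.append(0)         # idle
    return decisions
```

Because each slot covers an interval of length $\min\{x(t),1\}\leq 1$ and the threshold $u$ advances by one after each transmission, at most one threshold can fall into any slot's interval, which is why a single comparison per slot suffices.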

Because of the randomness of Alg. 2, we evaluate its performance in terms of the expected competitive ratio.

Theorem 8.

The expected competitive ratio of Alg. 2 is

\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{(1+\frac{1}{C_{M}})^{C_{m}}-1}\right).

Moreover, as the unit cost $C_{u}$ scales to infinity, the ratio approaches $\frac{e^{1/R}}{e^{1/R}-1}$ .

Proof.

Fix an instance $\mathcal{I}$ . Following [tseng2019online], we can show that the expected clearing cost in each slot $t$ under Alg. 2 is upper bounded by the value of $C(t)x(t)$ as computed by Alg. 1. Similarly, the expected number of virtual packets present in slot $t$ under Alg. 2 is upper bounded by $\sum_{i:T_{i}\leq t}z_{i}(t)$ as computed by Alg. 1. Therefore, the expected total cost in Eq. (3) incurred by Alg. 2 is bounded above by the objective value in Eq. (5a) computed by Alg. 1. The result then follows directly from Theorem 6. ∎

Next, we show that the competitive ratio of Alg. 2 is asymptotically optimal by establishing a matching lower bound as follows.

Lemma 9.

No online algorithm can achieve a competitive ratio smaller than $\frac{e^{1/R}}{e^{1/R}-1}$ .

Proof.

(Sketch) Consider the initial age $A_{0}=1$ , a fixed age increment sequence $\boldsymbol{\Delta A}=(0,\cdots,0)$ , and a fixed update cost sequence $\mathbf{C}=(C_{m},C_{m}R,C_{m}R,\cdots,C_{m}R)$ . Only the operation duration $T$ is unknown to the device. The all-zero age increment sequence models a scenario in which information becomes stale very slowly, or stops aging altogether because the content of the monitored process does not change, as in [salimnejad2025age]. Even with this single source of uncertainty, we show in Appendix D that as $C_{m}\to\infty$ , no online scheduling algorithm can achieve a competitive ratio smaller than $(e^{1/R})/(e^{1/R}-1)$ . ∎

Through the lower bound on the competitive ratio in Lemma 9 and the matching achievability scheme in Alg. 2, we establish that the optimal competitive ratio against an adversary that can jointly manipulate the operation duration, the age increment, and the update cost is asymptotically $(e^{1/R})/(e^{1/R}-1)$ . This matches the result for the classic online TCP ACK problem [buchbinder2009design] when $R=1$ (without cost variation). Moreover, this ratio is $\mathcal{O}(R)$ for large $R$ , i.e., it scales linearly with the update cost range $R$ , while it is unaffected by all other sources of uncertainty. Thus, when the cost fluctuates, it becomes fundamentally harder for any online algorithm to balance timeliness and update cost. Fig. 3 shows the competitive ratio at finite values of $R$ , which also appears approximately linear in $R$ .
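The linear scaling can be made explicit with a standard series expansion (a supplementary worked step, not taken from the original analysis). Writing the ratio as $1/(1-e^{-1/R})$ and expanding in $1/R$ gives:

```latex
\frac{e^{1/R}}{e^{1/R}-1}=\frac{1}{1-e^{-1/R}}
  =R+\frac{1}{2}+\frac{1}{12R}+O\!\left(R^{-3}\right)\quad\text{as }R\to\infty,
\qquad
\left.\frac{e^{1/R}}{e^{1/R}-1}\right|_{R=1}=\frac{e}{e-1}\approx 1.582 .
```

Hence the ratio exceeds $R$ by roughly one half for large $R$ , and it reduces to $e/(e-1)$ at $R=1$ .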

Remark 10.

This remark discusses how Alg. 1 can be adapted to scenarios in which the bounds $C_{M}$ and $C_{m}$ on the update cost are not known in advance. In this case, we propose to periodically estimate the update cost using channel estimation techniques (e.g., see [liu2025learning]). Let $T_{\text{est}}$ denote the estimation period and assume that the update cost $C(t)$ is measured in slots $t=1,\,T_{\text{est}}+1,\,2T_{\text{est}}+1,\,\cdots$ . For each slot $t$ , we define the maximum and minimum observed costs up to slot $t$ as $C_{M}(t)=\max_{0\leq n\leq(t-1)/T_{\text{est}}}C(nT_{\text{est}}+1)$ and $C_{m}(t)=\min_{0\leq n\leq(t-1)/T_{\text{est}}}C(nT_{\text{est}}+1)$ , respectively. In Alg. 2 (and the corresponding Alg. 1), we replace $C_{M}$ and $C_{m}$ by $C_{M}(t)$ and $C_{m}(t)$ , respectively. This remains an online algorithm, as no future information is used. Let $\theta(t)=(1+(1/C_{M}(t)))^{C_{m}(t)}-1$ . Then, the increment of the objective value in Eq. (7) becomes

	$\displaystyle C(t)\!\left(\frac{1}{C_{M}(t)}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta(t)\,C_{M}(t)}\right)+\left(1-\sum_{\tau=T_{i}}^{t}x(\tau)\right)$
	$\displaystyle\qquad\leq\frac{\max\{C(t),C_{M}(t)\}}{C_{M}(t)}\left(1+\frac{1}{\theta(t)}\right).$

Let $\Delta C_{M}=\max_{t}(C(t)-C(t-1))$ denote the maximum per-slot variation of the update cost. Since $C_{M}(t)$ is the maximum observed cost up to slot $\lfloor(t-1)/T_{\text{est}}\rfloor T_{\text{est}}+1$ , and there are at most $T_{\text{est}}-1$ additional slots between slot $\lfloor(t-1)/T_{\text{est}}\rfloor T_{\text{est}}+1$ and slot $t$ , we have

\max\{C(t),C_{M}(t)\}\leq C_{M}(t)+(T_{\text{est}}-1)\Delta C_{M}.

Moreover, $C_{M}(t)\leq C_{M}$ and $C_{m}(t)\geq C_{m}$ , so $\theta(t)\geq\theta$ , where $\theta$ denotes the value used in the original algorithms. Thus, the increment above is bounded by

\frac{C_{M}(t)+(T_{\text{est}}-1)\Delta C_{M}}{C_{M}(t)}\left(1+\frac{1}{\theta}\right).

If the device can estimate $C(t)$ at the beginning of each slot (i.e., $T_{\text{est}}=1$ ), then the factor above equals $1+(1/\theta)$ , achieving the same competitive ratio as in Theorems 6 and 8. Otherwise, suppose that the update cost also takes the form $C(t)=C_{u}\mathcal{E}(t)$ . Let $\Delta\mathcal{E}_{M}=\max_{t}(\mathcal{E}(t)-\mathcal{E}(t-1))$ . Then, we can express the bound by

\left(1+(T_{\text{est}}-1)\frac{\Delta\mathcal{E}_{M}}{\mathcal{E}_{m}}\right)\left(1+\frac{1}{\theta}\right).

As $C_{u}\to\infty$ , the modified online algorithm achieves the asymptotic competitive ratio of $\mathcal{O}(e^{1/R}/(e^{1/R}-1))$ , which matches the order of the original competitive ratio (but with an inflated multiplicative pre-constant $1+(T_{\text{est}}-1)\Delta\mathcal{E}_{M}/\mathcal{E}_{m}$ ). The same modification applies to all remaining algorithms proposed later, yielding the same pre-constant.
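The running estimates $C_{M}(t)$ and $C_{m}(t)$ in this remark can be computed as in the following sketch. The function name `observed_cost_bounds` and the list encoding of the cost sequence are assumptions introduced for illustration; only the costs in the measurement slots $1,\,T_{\text{est}}+1,\,2T_{\text{est}}+1,\,\cdots$ are read.

```python
from typing import List, Tuple


def observed_cost_bounds(costs: List[float], t: int,
                         t_est: int) -> Tuple[float, float]:
    """Return (C_M(t), C_m(t)) from the costs measured up to slot t.

    costs[s-1] is C(s); measurements are taken in slots n*T_est + 1 for
    n = 0, 1, ..., floor((t-1)/T_est), as described in Remark 10.
    """
    samples = [costs[n * t_est] for n in range((t - 1) // t_est + 1)]
    return max(samples), min(samples)


def theta_t(c_max_t: float, c_min_t: float) -> float:
    """theta(t) = (1 + 1/C_M(t))^{C_m(t)} - 1."""
    return (1.0 + 1.0 / c_max_t) ** c_min_t - 1.0
```

Since the estimates use only measurements taken no later than slot $t$ , substituting them into the update rule keeps the algorithm online.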

VI Online scheduling algorithm design with ML

This section extends the proposed online scheduling algorithm by incorporating ML that can suggest the next transmission time (i.e., to clear the virtual queue). We focus on the online LP algorithm design as in Section V-A, since the resulting fractional solution can also be converted into a randomized scheduling algorithm as in Section V-C.

VI-A Online LP algorithm with ML

This section extends the online LP algorithm in Alg. 1 by incorporating ML advice $\boldsymbol{\mathcal{M}}$ with unknown reliability, as described in Alg. 3. The key idea underlying Alg. 3 is as follows. A new variable $T_{ML}$ is introduced in Line 3 and is adjusted in Line 3 whenever the ML advice suggests clearing the virtual queue. Hence, the value of $T_{ML}$ represents the most recent slot in which the ML advice recommended a clearing. Since the ML advice may be imperfect, the device does not blindly follow it. Instead, Alg. 3 modulates its response based on whether a virtual packet $i$ has been suggested for clearing by the ML advice at the beginning of slot $t$ :

• If the ML advice has already recommended clearing the virtual packet $i$ (checked via Line 3), Alg. 3 raises the clearing probability more aggressively by setting a smaller value for $\theta$ in Line 3. We refer to this as a fast step.

• Conversely, if the ML advice has not yet recommended clearing the virtual packet $i$ by slot $t$ (checked via Line 3), Alg. 3 raises the clearing probability more conservatively by setting a larger value for the constant $\theta$ in Line 3. We refer to this as a slow step.

The adjustment of $\theta$ is governed by a hyperparameter $\lambda\in(0,1]$ , which reflects the device’s level of trust in the ML advice. A smaller value of $\lambda$ corresponds to greater confidence in the ML advice and leads to closer alignment with it, whereas a larger value represents caution and yields more robust behavior. This trade-off between the consistency and the robustness with respect to $\lambda$ will be further discussed in Section VI-B.

/* Initialize all variables as follows: */
$x(t),z_{i}(t)\leftarrow 0$ for all $i$ and $t$;
$T_{ML}\leftarrow 0$;
/* At the beginning of slot $t=1,\cdots,T$, adjust all variables as follows: */
if $\mathcal{M}(t)=1$ then
    $T_{ML}\leftarrow t$;
end if
foreach virtual packet $i$ such that $T_{i}\leq t$ do
    if $\sum_{\tau=T_{i}}^{t}x(\tau)<1$ then
        $z_{i}(t)\leftarrow 1-\sum_{\tau=T_{i}}^{t}x(\tau)$;
        if $T_{i}<T_{ML}$ then
            $\theta\leftarrow(1+\frac{1}{C_{M}})^{C_{m}\lambda}-1$;
        else
            $\theta\leftarrow(1+\frac{1}{C_{M}})^{C_{m}/\lambda}-1$;
        end if
        $x(t)\leftarrow x(t)+\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}$;
    end if
end foreach

Algorithm 3 Online LP algorithm with ML
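The listing above can be sketched in Python along the same lines as Alg. 1. This is an illustrative sketch only: the function name `online_lp_with_ml`, the list encoding of the advice sequence $\mathcal{M}(t)$ , and the way the objective is accumulated are assumptions made here.

```python
from typing import List, Tuple


def online_lp_with_ml(arrivals: List[int], costs: List[float],
                      advice: List[int], c_min: float, c_max: float,
                      lam: float) -> Tuple[List[float], float]:
    """Sketch of Alg. 3: returns (x(1..T), objective value of LP (5)).

    advice[t-1] is M(t) in {0, 1}; lam in (0, 1] is the trust parameter.
    """
    horizon = len(costs)
    base = 1.0 + 1.0 / c_max
    theta_fast = base ** (c_min * lam) - 1.0   # smaller theta: fast step
    theta_slow = base ** (c_min / lam) - 1.0   # larger theta: slow step
    x = [0.0] * (horizon + 1)                  # x[0] is unused
    objective = 0.0
    t_ml = 0                                   # latest slot with M(t) = 1
    for t in range(1, horizon + 1):
        if advice[t - 1] == 1:
            t_ml = t
        for t_i in arrivals:
            if t_i > t:
                continue                       # packet has not arrived yet
            cum = sum(x[t_i:t + 1])
            if cum < 1.0:
                z = 1.0 - cum                  # holding variable z_i(t)
                theta = theta_fast if t_i < t_ml else theta_slow
                inc = cum / c_max + 1.0 / (theta * c_max)
                x[t] += inc
                objective += costs[t - 1] * inc + z
    return x[1:], objective
```

With `lam=1.0` the two values of $\theta$ coincide and the sketch reduces to the update rule of Alg. 1, regardless of the advice sequence.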

VI-B Analysis of Alg. 3

In this section, we establish the robustness and consistency of the proposed Alg. 3. To this end, we begin by analyzing the set $P(t)$ under Alg. 3 in the following two lemmas, analogous to Lemmas 4 and 5, respectively. Here, if a virtual packet activates and performs a slow or fast step, we say that it activates a slow or fast step, respectively. Moreover, we denote $\theta_{s}=(1+(1/C_{M}))^{C_{m}/\lambda}-1$ and $\theta_{f}=(1+(1/C_{M}))^{C_{m}\lambda}-1$ .

Lemma 11.

For a fixed slot $t$ , after the virtual packets in $P(t)$ have activated $N_{s}$ slow steps and $N_{f}$ fast steps since slot $t$ , the value computed by Alg. 3 satisfies

	$\displaystyle\sum_{\tau=T_{i}}^{\infty}x(\tau)\geq$	$\displaystyle\,\frac{(1+\frac{1}{C_{M}})^{N_{s}}-1}{\theta_{s}}\left(1+\frac{1}{C_{M}}\right)^{N_{f}}$
		$\displaystyle+\frac{(1+\frac{1}{C_{M}})^{N_{f}}-1}{\theta_{f}},$

for all $i$ such that $T_{i}\leq t$ .

Proof.

(Sketch) We prove by induction on $N_{f}$ . See Appendix E for details. ∎

Lemma 12.

For a fixed slot $t$ , the virtual packets in $P(t)$ can activate $N_{s}$ slow steps and $N_{f}$ fast steps since slot $t$ , subject to the condition $N_{s}\lambda+N_{f}\leq C_{m}+1$ .

Proof.

(Sketch) Using Lemma 11, we show that when $N_{s}\lambda+N_{f}\geq C_{m}$ , then $\sum_{\tau=T_{i}}^{\infty}x(\tau)\geq 1$ for all $i\in P(t)$ . See Appendix F for details. ∎

Leveraging Lemma 12, we are ready to analyze the objective value in Eq. (5a) computed by Alg. 3. The next two theorems analyze its robustness and consistency, respectively.

Theorem 13.

The objective value in Eq. (5a) computed by Alg. 3 at the end of slot $T$ is bounded above by

\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{(1+\frac{1}{C_{M}})^{C_{m}\lambda}-1}\right)\mathrm{OPT}(\mathcal{I}),

for all possible instances $\mathcal{I}$ and ML advice $\boldsymbol{\mathcal{M}}$ . Moreover, as the unit cost $C_{u}$ scales to infinity, the ratio with respect to the optimum approaches $\frac{e^{\lambda/R}}{e^{\lambda/R}-1}$ .

Proof.

(Sketch) We follow the proof of Theorem 6 and show that $J(k)\leq(1+(1/C_{m}))(1+(1/\theta_{f}))J^{*}(k)$ for all $k$ , under Alg. 3. Then, applying the same reasoning as in the proof of Theorem 6 and substituting the definition of $\theta_{f}$ completes the proof. See Appendix G for details. ∎

Theorem 14.

The objective value in Eq. (5a) computed by Alg. 3 at the end of slot $T$ is bounded above by

	$\displaystyle\max\left\{1+\frac{1}{\left(1+\frac{1}{C_{M}}\right)^{C_{m}/\lambda}-1},\right.$
	$\displaystyle\quad\quad\left.\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\cdot\left(1+\frac{1}{\left(1+\frac{1}{C_{M}}\right)^{C_{m}\lambda}-1}\right)\right\}\cdot J(\mathcal{I},\boldsymbol{\mathcal{M}})$

for all possible instances $\mathcal{I}$ and ML advice $\boldsymbol{\mathcal{M}}$ . Moreover, as the unit update cost $C_{u}\to\infty$ , the ratio with respect to the cost $J(\mathcal{I},\boldsymbol{\mathcal{M}})$ incurred by the ML advice converges to $\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}$ .

Proof.

(Sketch) Fix an instance $\mathcal{I}$ and ML advice $\boldsymbol{\mathcal{M}}$ . We follow the proof of Theorem 6, with minor modifications. Redefine $t_{k}$ for $k\in\{1,\cdots,n\}$ as the slot when $\boldsymbol{\mathcal{M}}$ clears the virtual queue for the $k$ -th time. These redefined time points determine a new set of periods, replacing those used in the proof of Theorem 6.

Let $J_{\boldsymbol{\mathcal{M}}}(k)$ denote the cost incurred by $\boldsymbol{\mathcal{M}}$ in period $k$ . Let $J(k)$ be the increment of the objective value in Eq. (5a) by Alg. 3, according to the slow and fast step activations of all virtual packets arriving in period $k$ . We show that

\displaystyle J(k)\leq\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}J_{\boldsymbol{\mathcal{M}}}(k),

for all $k$ . Thus, the total objective value computed by Alg. 3 satisfies

\sum_{k=1}^{n+1}J(k)\;\leq\;\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}\sum_{k=1}^{n+1}J_{\boldsymbol{\mathcal{M}}}(k).

Substituting $J(\mathcal{I},\boldsymbol{\mathcal{M}})=\sum_{k=1}^{n+1}J_{\boldsymbol{\mathcal{M}}}(k)$ and the definitions of $\theta_{s}$ and $\theta_{f}$ proves the theorem. See Appendix H for details. ∎

Next, we show that the results in the above two theorems characterize the optimal consistency-robustness trade-off.

Lemma 15.

A $\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}$ -consistency scheduling algorithm has a robustness of at least $\frac{e^{\lambda/R}}{e^{\lambda/R}-1}$ .

Proof.

(Sketch) Using the same instance as in the proof of Lemma 9, we can establish the result. See Appendix I for details. ∎

Combining the lower bound in Lemma 15 with the matching achievability scheme in Alg. 3, we establish that the optimal consistency–robustness trade-off is characterized by the pair of consistency $(\lambda e^{\lambda/R})/(e^{\lambda/R}-1)$ and robustness $(e^{\lambda/R})/(e^{\lambda/R}-1)$ . This matches the result for the classic online TCP ACK problem with ML advice [bamas2020primal] when $R=1$ (without cost variation).

Note that in many prior studies on ML-augmented online algorithms (e.g., [bamas2020primal, liu2025learning]), the consistency approaches 1 as the trust parameter $\lambda\to 0$ , i.e., setting $\lambda\to 0$ forces the algorithm to rely entirely on the ML advice. In contrast, in our setting the consistency approaches $R$ as $\lambda\to 0$ , which exceeds 1 whenever $R>1$ . This difference is explained as follows. The robustness becomes unbounded as $\lambda\to 0$ . Because this trade-off is optimal, the robustness is infinite whenever the consistency falls below $R$ . This implies that robustness guarantees collapse even when an online algorithm follows ML advice only partially (so that its consistency remains below $R$ ). Thus, in our setting, taking $\lambda\to 0$ does not force the algorithm to rely fully on the ML advice. Instead, it identifies the limit in which the algorithm becomes as consistent with the ML advice as possible while still maintaining a finite robustness guarantee.

Thus far, we have characterized the optimal consistency–robustness trade-off as the trust level in the ML advice varies. Tuning the trust level leads to different competitive ratios. We next discuss how to determine an optimal trust level that minimizes the competitive ratio when the value of $R$ is large. For large $R$ , the consistency $(\lambda e^{\lambda/R})/(e^{\lambda/R}-1)$ varies from $(e^{1/R})/(e^{1/R}-1)\approx R$ at $\lambda=1$ to $R$ as $\lambda\rightarrow 0$ . Hence, the consistency is nearly identical for all $\lambda\in(0,1]$ . In contrast, for large $R$ , the robustness $(e^{\lambda/R})/(e^{\lambda/R}-1)=\mathcal{O}(R/\lambda)$ degrades as $\lambda$ decreases. Then, considering all possible $\lambda\in(0,1]$ (representing partial or no trust in the ML advice), an optimal online scheduling algorithm $\pi^{*}$ that minimizes the competitive ratio satisfies the following performance guarantees:

\displaystyle J(\mathcal{I},\pi^{*})\leq\min\big\{R\cdot J(\mathcal{I},\boldsymbol{\mathcal{M}}),\,(R/\lambda)\cdot\mathrm{OPT}(\mathcal{I})\big\},

for all possible $\mathcal{I}$ , $\boldsymbol{\mathcal{M}}$ , and all $\lambda\in(0,1]$ . The second term is minimized at $\lambda=1$ , leading to $J(\mathcal{I},\pi^{*})\leq R\cdot\mathrm{OPT}(\mathcal{I})$ . In addition, considering full trust in the ML advice, we also have $J(\mathcal{I},\pi^{*})\leq J(\mathcal{I},\boldsymbol{\mathcal{M}})$ . Suppose that the ML advice $\boldsymbol{\mathcal{M}}$ satisfies the following reliability guarantee: $\frac{J(\mathcal{I},\boldsymbol{\mathcal{M}})}{\mathrm{OPT}(\mathcal{I})}\leq\zeta(\boldsymbol{\mathcal{M}})$ for all possible $\mathcal{I}$ . Then, we have

\displaystyle J(\mathcal{I},\pi^{*})\leq\min\big\{R,\,\zeta(\boldsymbol{\mathcal{M}})\big\}\cdot\mathrm{OPT}(\mathcal{I}),

for all possible $\mathcal{I}$ and $\boldsymbol{\mathcal{M}}$ . That is, for large $R$ , regardless of the ML reliability $\zeta(\boldsymbol{\mathcal{M}})$ , the optimal response to ML advice (for minimizing the competitive ratio) is threshold-like: the algorithm should either fully trust the ML advice if $\zeta(\boldsymbol{\mathcal{M}})\leq R$ or completely ignore it otherwise.

Moreover, see Fig. 4 for the trade-off at finite values of $R$ . Here, we also observe a dramatic degradation in robustness resulting from even a small change in consistency. This indicates that the threshold structure nearly holds as well.
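The asymptotic consistency and robustness expressions from Theorems 13 and 14 can be tabulated with the following sketch, which illustrates how little the consistency changes over $\lambda\in(0,1]$ compared with the robustness when $R$ is large. The function names and the sample values of $R$ and $\lambda$ are assumptions chosen for illustration.

```python
import math


def robustness(lam: float, r: float) -> float:
    """e^{lam/R} / (e^{lam/R} - 1), written as 1 / (1 - e^{-lam/R})."""
    return 1.0 / -math.expm1(-lam / r)


def consistency(lam: float, r: float) -> float:
    """lam * e^{lam/R} / (e^{lam/R} - 1)."""
    return lam * robustness(lam, r)


if __name__ == "__main__":
    r = 100.0                                  # assumed large cost range
    for lam in (1.0, 0.5, 0.1, 0.01):
        print(f"lambda={lam:g}: consistency={consistency(lam, r):.3f}, "
              f"robustness={robustness(lam, r):.3f}")
```

At $\lambda=1$ the two expressions coincide with the ratio $e^{1/R}/(e^{1/R}-1)$ obtained without ML advice.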

VII Intermittent update opportunities

We extend our framework to scenarios where the device may be unable to update in certain slots (e.g., when no update is generated or when the device cannot transmit). Let $U(t)$ indicate whether the device is able to update in slot $t$ , where $U(t)=1$ if it can and $U(t)=0$ otherwise. Let $\mathbf{U}=(U(1),\cdots,U(T))$ denote the update opportunity sequence. We then redefine the uncertainty instance as $\mathcal{I}=\{T,\boldsymbol{\Delta A},\mathbf{C},\mathbf{U}\}$ to incorporate this additional source of uncertainty.

To model this, we augment the virtual queueing system described in Section IV-A with a virtual ON/OFF channel. Specifically, if $U(t)=1$ , the virtual channel is ON and the virtual server is allowed to clear virtual packets; if $U(t)=0$ , the virtual channel is OFF and the virtual server must idle. This leads to the following revised LP:


$\displaystyle\min_{x(t),z_{i}(t)}\ \sum_{t=1}^{T}\left(C(t)x(t)+\sum_{i:T_{i}\leq t}z_{i}(t)\right)$	(9a)
s.t.	$\displaystyle z_{i}(t)+\sum_{\tau=T_{i}}^{t}U(\tau)x(\tau)\geq 1$ for all $i$ such that $T_{i}\leq t$ and for all $t$ ;	(9b)
	$\displaystyle x(t),z_{i}(t)\geq 0\quad\text{for all $i$ and $t$}.$	(9c)

Here, Eq. (9b) differs from Eq. (5b) because a virtual packet is cleared only when the virtual channel is ON.

Next, we generalize Algs. 1 (without ML) and 3 (with ML) in Sections VII-A and VII-B, respectively, to handle scenarios with intermittent update opportunities.

VII-A Without ML Advice

This section extends Alg. 1, as described in Alg. 4. The key design change is that $x(t)$ is adjusted only when the virtual channel is ON, i.e., when $U(t)=1$ (Line 4). Furthermore, unlike Alg. 1, which adjusts $x(t)$ only for the current slot $t$ , Alg. 4 also considers all prior virtual OFF slots that occurred since the previous virtual ON slot. Concretely, for each such prior virtual OFF slot (Line 4), if the constraint in Eq. (9b) still holds (Line 4), the algorithm keeps increasing $x(t)$ (Line 4). This reflects the intuition that virtual packets held longer in the queue (due to consecutive virtual OFF periods) should have higher clearing probabilities once the virtual channel becomes ON. To implement this logic, the algorithm maintains a pointer $\hat{t}$ (Line 4) to denote the starting slot of this multiple increment procedure. This pointer is adjusted in Line 4 when the condition in Line 4 holds. The pointer identifies the slot immediately following the previous virtual ON slot, which is either the current slot (if the current virtual channel is ON) or the start of the current virtual OFF period (otherwise).

Note that if a virtual packet arrives in a virtual OFF slot, it must remain in the virtual queue until the next virtual ON slot. This limitation applies to all scheduling algorithms (including an offline optimal algorithm). Thus, the multiple increment mechanism applies only to those virtual packets that arrived before the previous virtual ON slot (Line 4). For virtual packets that arrive after the previous virtual ON slot, Alg. 4 adjusts $x(t)$ only once (Line 4).

/* Initialize all variables as follows: */
$x(t),z_{i}(t)\leftarrow 0$ for all $i$ and $t$; $\theta\leftarrow(1+\frac{1}{C_{M}})^{C_{m}}-1$; $\hat{t}\leftarrow 1$;

/* At the beginning of slot $t=1,\cdots,T$, adjust all variables as follows: */
if $t>1$ and $U(t-1)=1$ then
    $\hat{t}\leftarrow t$;
end if
foreach virtual packet $i$ such that $T_{i}\leq t$ do
    if $\sum_{\tau=T_{i}}^{t}x(\tau)<1$ then
        $z_{i}(t)\leftarrow 1-\sum_{\tau=T_{i}}^{t}x(\tau)$;
        if $U(t)=1$ then
            if $T_{i}<\hat{t}$ then
                for $t^{\prime}=t$ down to $\hat{t}$ do
                    if $\sum_{\tau=T_{i}}^{t}x(\tau)<1$ then
                        $x(t)\leftarrow x(t)+\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}$;
                    end if
                end for
            else
                if $\sum_{\tau=T_{i}}^{t}x(\tau)<1$ then
                    $x(t)\leftarrow x(t)+\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}$;
                end if
            end if
        end if
    end if
end foreach

Algorithm 4 Online LP algorithm without ML for intermittent update opportunities.
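As an illustration only, the update rule of Alg. 4 can be sketched in Python. The sketch assumes that the arrival slots $T_{i}$ and the virtual channel states $U(t)$ are supplied as plain lists with 1-indexed slots; the function name `run_alg4` and its argument layout are hypothetical and are not part of the algorithm statement.

```python
def run_alg4(T, arrival_slots, U, C_M, C_m):
    """Hypothetical helper: one pass of the Alg. 4 update rule.

    T: number of slots (slots are 1-indexed).
    arrival_slots: arrival slots T_i of the virtual packets.
    U: list of length T + 1 with U[t] in {0, 1}; U[0] is unused.
    Returns (x, z), where x[t] is the LP variable x(t) and
    z maps (packet index, slot) to z_i(t).
    """
    theta = (1 + 1 / C_M) ** C_m - 1
    x = [0.0] * (T + 1)
    z = {}
    t_hat = 1  # start of the multiple increment procedure
    for t in range(1, T + 1):
        # Move the pointer to the slot right after a virtual ON slot.
        if t > 1 and U[t - 1] == 1:
            t_hat = t
        for i, T_i in enumerate(arrival_slots):
            if T_i > t:
                continue
            if sum(x[T_i:t + 1]) < 1:
                z[(i, t)] = 1 - sum(x[T_i:t + 1])
                if U[t] == 1:
                    # Packets that arrived before t_hat get one increment
                    # per slot in {t_hat, ..., t}; others get a single one.
                    repeats = (t - t_hat + 1) if T_i < t_hat else 1
                    for _ in range(repeats):
                        partial = sum(x[T_i:t + 1])
                        if partial < 1:
                            x[t] += partial / C_M + 1 / (theta * C_M)
    return x, z
```

For instance, with a single virtual packet arriving in slot 1 and channel states ON, OFF, OFF, ON, the sketch increments $x(4)$ three times (once for each of slots 2, 3, and 4), while $x(2)$ and $x(3)$ stay at zero.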

We next analyze the performance of Alg. 4 and show that it can achieve the same asymptotic competitive ratio as stated in Theorem 6. To that end, we present a lemma that bounds the increment of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ under the multiple increment mechanism in Alg. 4. Here, when a virtual packet satisfies the condition in Line 4 or 4 and thus triggers the operation in Line 4 or 4, we say that it activates.

Lemma 16.

For a fixed slot $t$ , after the virtual packets in $P(t)$ have activated $n$ times since slot $t$ , the value of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ computed by Alg. 4 increases (relative to the beginning of slot $t$ ) by at most

\displaystyle\left(1+\frac{1}{\theta}\right)\left[\left(1+\frac{1}{C_{M}}\right)^{n}-1\right],

for all $i\in P(t)$ .

Proof.

(Sketch) We prove by induction on $n$ . See Appendix J for details. ∎

Using Lemma 16, we are ready to analyze Alg. 4 in the next result. Here, let $T_{\text{OFF}}$ denote the maximum number of consecutive virtual OFF slots up to and including the next virtual ON slot.

Theorem 17.

The objective value in Eq. (9a) computed by Alg. 4 at the end of slot $T$ is bounded above by

	$\displaystyle\left[\left(1+\frac{1}{C_{m}}\right)^{1+2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}\left(1+\frac{1}{(1+\frac{1}{C_{M}})^{C_{m}}-1}\right)\right.$
	$\displaystyle\quad\left.+\frac{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}{C_{m}}\right]\mathrm{OPT}(\mathcal{I}),$

for all possible instances $\mathcal{I}$ . Moreover, if $\Delta A_{M}$ and $T_{\text{OFF}}$ are finite, then as the unit cost $C_{u}$ scales to infinity, the ratio with respect to the optimum approaches $\frac{e^{1/R}}{e^{1/R}-1}$ .

Proof.

(Sketch) We follow the proof of Theorem 6. Fix a period $k\in\{1,\dots,n\}$ and a virtual ON slot $t$ in that period. We bound the increment of the objective value in Eq. (9a) in slot $t$ by considering how Alg. 4 adjusts $x(t)$ and the matching $z$-variables, where we use the notation $\sum_{\tau=T_{i}}^{t}x(\tau)\big|_{\text{condition}}$ to represent the value of $\sum_{\tau=T_{i}}^{t}x(\tau)$ under the specified condition. We distinguish three cases:

1.

If a virtual packet $i$ arrives before $\hat{t}$ and activates in iteration $t^{\prime}$ of Line 4 in slot $t$ , then Line 4 increases $x(t)$ by

\displaystyle\frac{1}{C_{M}}\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)+\frac{1}{\theta C_{M}}.

However, the paired $z_{i}(t^{\prime})$ was already set to $1-\sum_{\tau=T_{i}}^{t^{\prime}}x(\tau)$ in a slot $t^{\prime}\leq t$ . Because $x(\tau)$ does not change (for all possible $\tau$ ) over the virtual OFF period until slot $t$ , we have

\displaystyle z_{i}(t^{\prime})=1-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right).

Hence, the increment of the objective value due to the adjustment of $x(t)$ from virtual packet $i$ in iteration $t^{\prime}$ and the paired $z_{i}(t^{\prime})$ is

	$\displaystyle C(t)\left[\frac{1}{C_{M}}\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)+\frac{1}{\theta C_{M}}\right]$
	$\displaystyle\quad+1-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right)$
	$\displaystyle\leq 1+\frac{1}{\theta}$
	$\displaystyle\quad+\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right).$

By Lemma 7, we can show that there are at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil T_{\text{OFF}}$ activations from the start of slot $t$ until the considered activation. By Lemma 16, we have

	$\displaystyle\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right)$
	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\left[\left(1+\frac{1}{C_{M}}\right)^{2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil T_{\text{OFF}}}-1\right],$

which gives the bound on the increment of the objective value:

\displaystyle\left(1+\frac{1}{\theta}\right)\left(1+\frac{1}{C_{M}}\right)^{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}.	(10)

2.

If a virtual packet $i$ arrives before $\hat{t}$ but does not activate in iteration $t^{\prime}$ of Line 4 in slot $t$ , then because the paired $z_{i}(t^{\prime})$ was still set to $1-\sum_{\tau=T_{i}}^{t^{\prime}}x(\tau)$ in slot $t^{\prime}\leq t$ , it also contributes at most $1$ to the objective value in that iteration.
3.

If a virtual packet $i$ arrives in slot $t$ and activates in slot $t$ , then by the proof of Theorem 6 the increment from $x(t)$ in Line 4 and its paired $z_{i}(t)$ in Line 4 is $1+(1/\theta)$ , which is also bounded above by Eq. (10).

Next, we show that, for the virtual packets arriving in the period, there are at most $\lceil C_{m}\rceil$ Case 1 and Case 3 increments since slot $t_{k}$ , and at most $2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}$ Case 2 increments since slot $t_{k}$ . Then, we can rewrite Eq. (8) as

	$\displaystyle J(k)$	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\left(1+\frac{1}{C_{m}}\right)^{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}\left(H^{*}(k)+\lceil C_{m}\rceil\right)$
		$\displaystyle\quad+1\cdot\left(2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}\right).$

Finally, following the proof of Theorem 6 yields the desired result. See Appendix K for details. ∎

Compared with Theorem 6, the competitive ratio in Theorem 17 scales that in Theorem 6 by a factor of $(1+(1/C_{m}))^{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}$ (which approaches 1 as $C_{u}\to\infty$ ), and includes an additional term $(2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}})/C_{m}$ (which approaches 0 as $C_{u}\to\infty$ ). Thus, even under intermittent update opportunities, Alg. 4 asymptotically achieves the lower bound in Lemma 9.
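The convergence described above can be checked numerically. The following sketch evaluates the bracketed factor of Theorem 17 for growing $C_{m}$ and compares it with the limit $e^{1/R}/(e^{1/R}-1)$. It assumes $C_{M}=RC_{m}$ (as used in Appendix B); the function names and the chosen parameter values are illustrative only.

```python
import math


def theorem17_factor(C_m, R, dA_M, T_off):
    """Bracketed factor of Theorem 17, assuming C_M = R * C_m."""
    C_M = R * C_m
    n_act = 2 * math.ceil(math.sqrt(dA_M * C_m)) * T_off
    # (1 + 1/C_m)^(1 + n_act), computed via log1p for numerical accuracy.
    scale = math.exp((1 + n_act) * math.log1p(1 / C_m))
    # theta = (1 + 1/C_M)^(C_m) - 1
    theta = math.expm1(C_m * math.log1p(1 / C_M))
    return scale * (1 + 1 / theta) + n_act / C_m


def asymptotic_ratio(R):
    """Limit e^(1/R) / (e^(1/R) - 1) stated in Theorem 17."""
    return math.exp(1 / R) / math.expm1(1 / R)


if __name__ == "__main__":
    for C_m in (1e2, 1e4, 1e6, 1e8):
        print(C_m, theorem17_factor(C_m, R=2, dA_M=1, T_off=3), asymptotic_ratio(2))
```

Because the exponent grows only like $\sqrt{C_{m}}$ while the base approaches 1 like $1/C_{m}$, the gap to the limit shrinks slowly, which is why large values of $C_{m}$ are used in the loop.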

Next, we show that if the adversary is so powerful that it can also set $\Delta A_{M}$ or $T_{\text{OFF}}$ arbitrarily large, then no competitive ratio can be guaranteed.

Lemma 18.

If either $\Delta A_{M}$ or $T_{\text{OFF}}$ is unbounded, then no online algorithm can achieve a finite competitive ratio as $C_{m}\to\infty$ .

Proof.

See Appendix L for details. ∎

VII-B With ML advice

This section further incorporates ML advice. To this end, we modify Alg. 3 by applying the multiple increment mechanism for previous virtual OFF slots, as in Alg. 4. By extending the proof of Theorem 17 to revise those of Theorems 13 and 14, we can obtain the same scaling and additional terms (as in Theorem 17), yielding the same asymptotic results as in Theorems 13 and 14.

VIII Numerical studies

The previous sections established that our proposed algorithms achieve the best possible competitiveness and the optimal consistency–robustness trade-off in adversarial environments. In this section, we complement the theoretical results by evaluating the algorithms in stationary stochastic environments through numerical experiments.

We adopt a setting similar to [hsu2019scheduling], which derived optimal offline scheduling policies under stationary assumptions. We simulate a horizon of $T=10000$ slots. Update opportunities follow a Bernoulli process with rate $0.7$ . To model update costs with memory, we use a two-state Markov chain (as in the Gilbert–Elliott model) with states $\mathsf{L}$ (low cost) and $\mathsf{H}$ (high cost). Let $S(t)\in\{\mathsf{L},\mathsf{H}\}$ denote the state in slot $t$ , and let $\mathrm{tr}_{p}$ and $\mathrm{tr}_{q}$ be the transition probabilities from $\mathsf{L}$ to $\mathsf{H}$ and from $\mathsf{H}$ to $\mathsf{L}$ , respectively.
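A minimal sketch of this stochastic environment is given below. It assumes that the update cost equals $C_{m}$ in state $\mathsf{L}$ and $RC_{m}$ in state $\mathsf{H}$, that the chain starts in state $\mathsf{L}$, and that a fixed seed is used for reproducibility; these choices, as well as the function name `simulate_environment`, are assumptions made for illustration.

```python
import random


def simulate_environment(T=10000, rate=0.7, tr_p=0.2, tr_q=0.8,
                         C_m=10.0, R=3.0, seed=0):
    """Generate update opportunities and two-state Markov update costs.

    Assumptions (not from the paper): cost C_m in state L, R * C_m in
    state H, and initial state L.
    """
    rng = random.Random(seed)
    state = "L"
    opportunities, states, costs = [], [], []
    for _ in range(T):
        # Bernoulli update opportunity with the given rate.
        opportunities.append(1 if rng.random() < rate else 0)
        states.append(state)
        costs.append(C_m if state == "L" else R * C_m)
        # Two-state (Gilbert-Elliott style) transition for the next slot.
        if state == "L":
            state = "H" if rng.random() < tr_p else "L"
        else:
            state = "L" if rng.random() < tr_q else "H"
    return opportunities, states, costs
```

Under these assumptions, the long-run fraction of slots spent in state $\mathsf{H}$ is $\mathrm{tr}_{p}/(\mathrm{tr}_{p}+\mathrm{tr}_{q})$, which gives a quick sanity check on the generated trace.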

Following [hsu2019scheduling], an optimal offline scheduling policy in the stationary setting can be characterized by two age thresholds: if $S(t)=\mathsf{H}$ , the device transmits when the age reaches a threshold $T_{\mathsf{H}}$ ; if $S(t)=\mathsf{L}$ , it transmits when the age reaches a threshold $T_{\mathsf{L}}$ . We compute the optimal pair $(T_{\mathsf{H}},T_{\mathsf{L}})$ via exhaustive search to minimize the total cost in Eq. (2).

Next, we consider both a linear aging function in Section VIII-A and a nonlinear one in Section VIII-B.

VIII-A Linear aging function

In this section, we consider a constant age increment process with $\Delta A(t)=1$ for all $t$ . We evaluate the proposed online algorithm without ML in Section VIII-A1 and the ML-augmented version in Section VIII-A2.

VIII-A1 Online scheduling algorithm without ML

In this section, we validate Alg. 2 (modified according to Alg. 4 to handle intermittent update opportunities). Figs. 5 and 6 show the time-average cost (y-axis) for various values of $C_{m}$ (x-axis). In Fig. 5, we set $\mathrm{tr}_{p}=0.2$ , $\mathrm{tr}_{q}=0.8$ , resulting in longer stays in state $\mathsf{L}$ ; in Fig. 6, we set $\mathrm{tr}_{p}=0.8$ , $\mathrm{tr}_{q}=0.2$ , resulting in longer stays in state $\mathsf{H}$ . Each figure has three subfigures corresponding to $R=1$ , $R=3$ , and $R=5$ , respectively. Each subfigure plots five curves: “Proposed” (the proposed online algorithm without ML), “Revised” (a modified version described later), “Greedy” (a baseline policy described later), “OPT” (the offline optimum), and “Theory” (the upper bound from Theorem 8 multiplied by OPT). We observe that the proposed algorithm performs significantly better in practice than the worst-case theoretical upper bound, especially as the update cost range increases.

For comparison, we also simulate an online greedy algorithm (labeled “Greedy” in the figures) that myopically minimizes the current slot cost, i.e., it transmits in slot $t$ if the update cost $C(t)$ is less than the cost of waiting (i.e., $A(t)+1$ ). Note that while the greedy algorithm requires knowledge of the current update cost in each slot, the proposed algorithm does not. From Figs. 5 and 6, the proposed algorithm outperforms the greedy baseline except when the system frequently enters the high cost state with large $C_{M}$ (i.e., in Fig. 6(c)). This exception can be explained by Lemma 5: the proposed algorithm must transmit before activating $\lceil C_{m}\rceil$ times, forcing overly frequent updates when $C_{M}$ is large and occurs often, as in the environment of Fig. 6(c).
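The greedy decision rule fits in a few lines. In the sketch below, gating the decision on the availability of an update opportunity is an assumption added for the intermittent setting, and the function name is hypothetical.

```python
def greedy_transmits(cost, age, opportunity=True):
    """Myopic rule: transmit iff C(t) < A(t) + 1.

    The `opportunity` flag is an assumption: no transmission is
    attempted in a slot without an update opportunity.
    """
    return bool(opportunity) and cost < age + 1
```

For example, with age $A(t)=5$ the rule transmits at cost $3$ but waits at cost $7$.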

Although Alg. 2 asymptotically achieves the optimal competitive ratio and thus serves as an achievability scheme for the lower bound, we observe that it may be too aggressive in such stochastic environments. To remedy this, we propose a revised version in which the constant $\theta$ in Alg. 2 (and Alg. 1) is replaced by $(1+(1/C_{M}))^{C_{M}}-1$ . By Lemma 5, the revised algorithm transmits before activating $\lceil C_{M}\rceil$ times, thereby reducing the update frequency when $C_{M}$ is large. From Figs. 5 and 6 (labeled “Revised” in the figures), this revised algorithm consistently achieves the best empirical performance.

Given the revised algorithm’s superior stochastic performance, we analyze its worst-case guarantees. By Lemma 5, the virtual packets arriving in period $k$ can activate at most $\lceil C_{M}\rceil$ times from slot $t_{k}$ onward. Let $\theta^{\prime}=(1+(1/C_{M}))^{C_{M}}-1$ . Hence, Eq. (8) becomes

	$\displaystyle J(k)$	$\displaystyle\leq\left(1+\frac{1}{\theta^{\prime}}\right)\left(H^{*}(k)+\lceil C_{M}\rceil\right)$
		$\displaystyle\leq\frac{C_{M}+1}{C_{m}}\left(1+\frac{1}{\theta^{\prime}}\right)J^{*}(k).$

As $C_{u}\to\infty$ , we have $(C_{M}+1)/C_{m}\to R$ and $\theta^{\prime}\to e-1$ , yielding an asymptotic competitive ratio of $(e/(e-1))R$ . This remains order-optimal.
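To make the comparison between the two asymptotic ratios concrete, the following sketch evaluates $e^{1/R}/(e^{1/R}-1)$ (proposed algorithm) and $(e/(e-1))R$ (revised algorithm), together with the finite-$C_{m}$ bound $\frac{C_{M}+1}{C_{m}}(1+\frac{1}{\theta^{\prime}})$ derived above. It assumes $C_{M}=RC_{m}$; the function names are illustrative.

```python
import math


def proposed_asymptotic(R):
    """Asymptotic ratio e^(1/R) / (e^(1/R) - 1) of the proposed algorithm."""
    return math.exp(1 / R) / math.expm1(1 / R)


def revised_asymptotic(R):
    """Asymptotic ratio (e / (e - 1)) * R of the revised algorithm."""
    return math.e / (math.e - 1) * R


def revised_finite(C_m, R):
    """Finite-C_m bound (C_M + 1)/C_m * (1 + 1/theta'), with C_M = R * C_m."""
    C_M = R * C_m
    theta_rev = math.expm1(C_M * math.log1p(1 / C_M))  # (1 + 1/C_M)^C_M - 1
    return (C_M + 1) / C_m * (1 + 1 / theta_rev)


if __name__ == "__main__":
    for R in (1, 3, 5):
        print(R, proposed_asymptotic(R), revised_asymptotic(R),
              revised_finite(1e6, R))
```

The two ratios coincide at $R=1$ and both grow linearly in $R$, which is the sense in which the revised algorithm remains order-optimal.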

VIII-A2 Online scheduling algorithm with ML

In this subsection, we evaluate the benefit of incorporating ML advice. Using the same argument as in the previous section, we can show that replacing the constant $\theta$ in Alg. 3 with the revised value $(1+(1/C_{M}))^{C_{M}}-1$ achieves the robustness of $(e^{\lambda}/(e^{\lambda}-1))R$ and the consistency of $(e^{\lambda}/(e^{\lambda}-1))\lambda R$ , as $C_{u}\to\infty$ . The revised algorithm also preserves the order of the optimal consistency–robustness trade-off. Because the revised algorithm performs better in the stochastic environment as shown in the previous section, here we evaluate its performance when augmented with ML advice.

Let $T^{*}_{\boldsymbol{\mathcal{M}}}(t)$ denote the offline optimal threshold in slot $t$ . To investigate imperfect ML advice with controllable errors, we model the ML-estimated threshold as $T_{\boldsymbol{\mathcal{M}}}(t)=T^{*}_{\boldsymbol{\mathcal{M}}}(t)+\mathcal{N}$ , where $\mathcal{N}$ is a zero-mean Gaussian random variable with variance chosen such that

\Pr\!\left[\frac{\lvert T_{\boldsymbol{\mathcal{M}}}(t)-T^{*}_{\boldsymbol{\mathcal{M}}}(t)\rvert}{T^{*}_{\boldsymbol{\mathcal{M}}}(t)}\leq\epsilon\right]=0.95,

i.e., with $95\%$ probability the relative error is within $\epsilon$ . The ML advice is then $\mathcal{M}(t)=1$ if $A(t)\geq\max\{T_{\boldsymbol{\mathcal{M}}}(t),1\}$ and $\mathcal{M}(t)=0$ otherwise.
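One way to realize this noise model is sketched below. For a zero-mean Gaussian, the two-sided $95\%$ condition is met by setting the standard deviation to $\epsilon T^{*}_{\boldsymbol{\mathcal{M}}}(t)/z_{0.975}$, where $z_{0.975}\approx 1.96$ is the standard normal quantile. The helper names and the use of Python's `statistics.NormalDist` are choices made for this sketch, not details taken from the simulation code.

```python
import random
from statistics import NormalDist

# Two-sided 95% standard normal quantile (about 1.96).
Z_95 = NormalDist().inv_cdf(0.975)


def noise_std(T_star, eps):
    """Std. dev. such that Pr[|N| <= eps * T_star] = 0.95 for N ~ N(0, std^2)."""
    return eps * T_star / Z_95


def ml_advice(age, T_star, eps, rng):
    """Return M(t) = 1 if A(t) >= max(T_M(t), 1), else 0.

    `rng` is a random.Random instance (hypothetical interface).
    """
    sigma = noise_std(T_star, eps)
    noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
    T_ml = T_star + noise
    return 1 if age >= max(T_ml, 1) else 0
```

With $\epsilon=0$ the advice reduces to the offline optimal threshold rule, so the advice is error-free in that case.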

Figs. 7 (for $\mathrm{tr}_{p}=0.2$ and $\mathrm{tr}_{q}=0.8$ ) and 8 (for $\mathrm{tr}_{p}=0.8$ and $\mathrm{tr}_{q}=0.2$ ) plot the time-average cost of the proposed online algorithm with ML for $\epsilon=0,0.5,1,1.5,2,2.5$ . Each subfigure shows the results for $\lambda=10^{-5},0.25,0.5,0.75,1$ . According to our simulations, the performance of completely following the ML advice coincides with that of $\lambda=10^{-5}$ ; hence, we omit it from the plots for clarity. We observe that for small errors ( $\epsilon=0,0.5,1$ ), completely following the ML advice yields the best performance; for large errors ( $\epsilon=1.5,2,2.5$ ), completely ignoring the ML advice with $\lambda=1$ performs best. There is also a sharp transition between the two regimes in the stochastic setting, matching the threshold-type behavior predicted by the asymptotic analysis in the adversarial setting.

VIII-B Nonlinear aging function

In this section, we consider a nonlinear aging function similar to that in [kosta2017age]. Specifically, if $x$ slots have elapsed since the most recent update, then the age of information in the current slot is given by $\lfloor e^{0.3x}\rfloor-1$ . Throughout this section, we fix $R=2$ .
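The aging function can be tabulated directly. The short sketch below computes $\lfloor e^{0.3x}\rfloor-1$ for the first few values of $x$ and the resulting per-slot differences; interpreting these differences as the age increments is an assumption of the sketch, and the function name is illustrative.

```python
import math


def nonlinear_age(x):
    """Age after x slots without an update: floor(e^(0.3 x)) - 1."""
    return math.floor(math.exp(0.3 * x)) - 1


ages = [nonlinear_age(x) for x in range(9)]
# Differences between consecutive ages (assumed to play the role of
# the per-slot age increments).
increments = [b - a for a, b in zip(ages, ages[1:])]
```

Unlike the constant increment $\Delta A(t)=1$ of the linear case, the differences here grow with $x$, so staleness is penalized increasingly heavily the longer the device waits.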

Figs. 9 (for $\mathrm{tr}_{p}=0.2$ and $\mathrm{tr}_{q}=0.8$ ) and 10 (for $\mathrm{tr}_{p}=0.8$ and $\mathrm{tr}_{q}=0.2$ ) show the time-average cost of the online algorithms without ML. As in the linear aging case, both the proposed and the revised algorithms outperform the baseline policy.

We also evaluate the revised algorithm with ML advice in Figs. 11 (for $\mathrm{tr}_{p}=0.2$ and $\mathrm{tr}_{q}=0.8$ ) and 12 (for $\mathrm{tr}_{p}=0.8$ and $\mathrm{tr}_{q}=0.2$ ), for several values of the ML error parameter $\epsilon$ . In Fig. 11, when $\epsilon=1,3$ , completely following the ML advice yields the best performance; when $\epsilon=5,7$ , the performance is nearly identical for all values of $\lambda$ ; when $\epsilon=9,11$ , completely ignoring the ML advice is optimal. Similarly, in Fig. 12, when $\epsilon=1,2$ , completely following the ML advice is optimal; when $\epsilon=3,4$ , the performance is nearly the same for all $\lambda$ ; when $\epsilon=5,6$ , completely ignoring the ML advice performs best.

These results exhibit the same qualitative behavior observed under the linear aging case in the previous section: either blindly following the ML advice or completely ignoring it yields near-optimal performance, while partially trusting the ML advice provides little benefit.

IX Conclusion

This paper investigated a mobile information updating system subject to four sources of uncertainty. We developed online scheduling algorithms that enable a mobile device to cost-efficiently maintain fresh information at a central entity. The proposed online algorithm without ML asymptotically achieves the optimal competitive ratio, while the ML-augmented version also asymptotically attains the optimal consistency–robustness trade-off. Moreover, when augmented with ML, we showed that either blindly following or completely ignoring the ML advice minimizes the competitive ratio. This work opens several promising research directions for network design under non-stationary uncertainty. Interesting extensions include dynamically adjusting the threshold (for either blindly following ML or completely ignoring it) because the reliability of the ML advice is unknown in general, developing an optimal algorithm for both the adversarial and stochastic environments, exploring multi-device or networked update systems, and incorporating sampling decisions jointly with update scheduling.

X Acknowledgments

We thank the authors of [liu2025learning] for pointing out mistakes in our earlier preliminary work [tseng2019online]. This research was supported by the National Science and Technology Council, Taiwan, under Grants 110-2221-E-305-008-MY3 and 113-2628-E-305-001-MY3.

Supplementary Material

Appendix A Proof of Lemma 4

We use the notation $\sum_{\tau=T_{i}}^{\infty}x(\tau)\big|_{\text{condition}}$ to represent the value of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ under the specified condition. Fix a slot $t$ . We prove the claim by induction on $n$ . When $n=1$ , by Eq. (6) we have

	$\displaystyle\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{n=1}\right)$	$\displaystyle=\left(1+\frac{1}{C_{M}}\right)\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{n=0}\right)+\frac{1}{\theta C_{M}}$
		$\displaystyle\geq\frac{1}{\theta C_{M}}=\frac{\left(1+\frac{1}{C_{M}}\right)^{1}-1}{\theta},$

for all $i\in P(t)$ .

Assume that the result holds for $n-1$ , i.e.,

\displaystyle\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{n-1}\right)\geq\frac{\left(1+\frac{1}{C_{M}}\right)^{n-1}-1}{\theta},

for all $i\in P(t)$ .

We show that the result holds for $n$ : after the additional step, by Eq. (6) we have

	$\displaystyle\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{n}\right)$	$\displaystyle=\left(1+\frac{1}{C_{M}}\right)\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{n-1}\right)+\frac{1}{\theta C_{M}}$
		$\displaystyle\geq\left(1+\frac{1}{C_{M}}\right)\left(\frac{\left(1+\frac{1}{C_{M}}\right)^{n-1}-1}{\theta}\right)$
		$\displaystyle\quad+\frac{1}{\theta C_{M}}$
		$\displaystyle=\frac{\left(1+\frac{1}{C_{M}}\right)^{n}-1}{\theta},$

for all $i\in P(t)$ . This completes the inductive step and proves the lemma.
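The induction can be sanity-checked numerically by iterating the recursion used in the proof, $S\leftarrow(1+\frac{1}{C_{M}})S+\frac{1}{\theta C_{M}}$, and comparing it with the bound $((1+\frac{1}{C_{M}})^{n}-1)/\theta$. The sketch below does this for a single virtual packet; the function name, the starting value `S0`, and the tolerance are illustrative choices.

```python
def lemma4_check(C_M, C_m, S0=0.0, n_max=50):
    """Iterate the recursion from the proof and compare with the bound.

    Returns (ok, history), where history holds tuples (n, S_n, bound_n)
    and ok is True if S_n >= bound_n (up to rounding) for all n.
    """
    theta = (1 + 1 / C_M) ** C_m - 1
    S = S0
    ok = True
    history = []
    for n in range(1, n_max + 1):
        S = (1 + 1 / C_M) * S + 1 / (theta * C_M)
        bound = ((1 + 1 / C_M) ** n - 1) / theta
        history.append((n, S, bound))
        # Small tolerance to absorb floating-point rounding.
        ok = ok and S >= bound * (1 - 1e-12) - 1e-12
    return ok, history
```

Starting from $S_{0}=0$, the recursion meets the bound with equality, and for integer $C_{m}$ the value reaches $1$ after exactly $n=C_{m}$ steps, which is consistent with the activation count $\lceil C_{m}\rceil$ used in Appendix B.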

Appendix B Proof of Theorem 6

Fix an instance $\mathcal{I}$ . Suppose that an optimal offline scheduling algorithm (denoted by $\boldsymbol{\pi}^{*}$ ) clears the virtual queue in slots $t_{1},\cdots,t_{n}$ , for a total of $n$ clearing operations. Let $t_{0}=0$ and $t_{n+1}=T$ . We divide the timeline into $n+1$ periods, where period $k$ consists of slots $t_{k-1}+1$ through $t_{k}$ .

Let $J^{*}(k)$ denote the cost incurred by $\boldsymbol{\pi}^{*}$ in period $k$ . The total cost in Eq. (3) incurred by $\boldsymbol{\pi}^{*}$ is then $\sum_{k=1}^{n+1}J^{*}(k)$ . We calculate $J^{*}(k)$ for two cases.

1.

For $k\in\{1,\cdots,n\}$ : For each slot $\tau$ in period $k$ , the number of virtual packets present in the virtual queue is $\sum_{i=1}^{\infty}\mathbf{1}_{\{t_{k-1}+1\leq T_{i}\leq\tau\}}$ , which checks all virtual packets that arrived after the previous clearing in slot $t_{k-1}$ until slot $\tau$ . Hence, the holding cost of all the virtual packets that arrive in period $k$ is $\sum_{\tau=t_{k-1}+1}^{t_{k}-1}\sum_{i=1}^{\infty}\mathbf{1}_{\{t_{k-1}+1\leq T_{i}\leq\tau\}}$ . We denote this quantity by $H^{*}(k)$ . Adding the clearing cost $C(t_{k})$ in slot $t_{k}$ , we have $J^{*}(k)=H^{*}(k)+C(t_{k})$ .
2.

For $k=n+1$ : Here, the holding cost has the same form as above, but since no clearing occurs in this period, the cost is $J^{*}(n+1)=H^{*}(n+1)$ .

Next, let $J(k)$ denote the increment of the objective value in Eq. (5a) by Alg. 1, according to the activations of all virtual packets that arrive in period $k$ . This includes the increments of $z_{i}(t)$ in Line 1 and of $x(t)$ in Line 1, for all virtual packets $i$ with $t_{k-1}+1\leq T_{i}\leq t_{k}$ and for all slots $t$ . The objective value computed by Alg. 1 is then $\sum_{k=1}^{n+1}J(k)$ . We analyze $J(k)$ in two cases.

For $k\in\{1,\cdots,n\}$ : By the condition in Line 1, a virtual packet contributes to the objective value only when it activates. If a virtual packet $i$ arriving in period $k$ activates in some slot $t$ , then Line 1 increases $z_{i}(t)$ by $1-\sum_{\tau=T_{i}}^{t}x(\tau)$ , and Line 1 increases $x(t)$ by $(1/C_{M})(\sum_{\tau=T_{i}}^{t}x(\tau))+(1/(\theta C_{M}))$ . Hence, one activation increases the objective value by

		$\displaystyle C(t)\left(\frac{1}{C_{M}}\sum_{\tau=T_{i}}^{t}x(\tau)+\frac{1}{\theta C_{M}}\right)+\left(1-\sum_{\tau=T_{i}}^{t}x(\tau)\right)$
	$\displaystyle\leq\;$	$\displaystyle 1+\frac{1}{\theta}\quad\text{(since $C(t)\leq C_{M}$).}$		(11)

Note that $H^{*}(k)$ exactly counts the number of iterations of Line 1 from slot $t_{k-1}+1$ through slot $t_{k}-1$ for the virtual packets that arrive in period $k$ . Thus, the virtual packets can activate at most $H^{*}(k)$ times before slot $t_{k}$ . Moreover, by Lemma 5, the virtual packets arriving in period $k$ can activate at most $\lceil C_{m}\rceil$ additional times from slot $t_{k}$ onward. Hence, they can activate at most $H^{*}(k)+\lceil C_{m}\rceil$ times in total. Combining this with Eq. (11), we obtain

$\displaystyle J(k)$	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\big(H^{*}(k)+\lceil C_{m}\rceil\big)$	(12)
	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\frac{\lceil C_{m}\rceil}{C_{m}}\,\big(H^{*}(k)+C(t_{k})\big)$
	$\displaystyle\leq\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{\theta}\right)J^{*}(k).$

For $k=n+1$ : Since Alg. 1 terminates in slot $t_{n+1}$ with no clearing, the virtual packets arriving in this terminal period can activate at most $H^{*}(n+1)$ times. Thus, we have

	$\displaystyle J(n+1)$	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)H^{*}(n+1)$
		$\displaystyle\leq\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{\theta}\right)J^{*}(n+1).$

Combining both cases yields

\sum_{k=1}^{n+1}J(k)\;\leq\;\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{\theta}\right)\sum_{k=1}^{n+1}J^{*}(k).

Substituting $\mathrm{OPT}(\mathcal{I})=\sum_{k=1}^{n+1}J^{*}(k)$ and $\theta=(1+(1/C_{M}))^{C_{m}}-1$ proves the first part of the theorem.

For the asymptotic result, as $C_{u}\to\infty$ we have $(1+1/C_{m})\to 1$ and $(1+(1/C_{M}))^{C_{m}}=(1+(1/C_{M}))^{C_{M}/R}\to e^{1/R}$ , so the ratio approaches $e^{1/R}/(e^{1/R}-1)$ .

Appendix C Proof of Lemma 7

Fix a slot $t$ . First, if the total number of virtual packets that have arrived by slot $t$ is less than $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ , then the result is immediate. Otherwise, suppose that the number of virtual packet arrivals by slot $t$ is at least $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ .

Let $i^{\prime}=\arg\max_{i}\{T_{i}\leq t\}$ denote the most recently arrived virtual packet by slot $t$ . Since at most $\Delta A_{M}$ virtual packets can arrive in a single slot, the arrival of $\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ virtual packets requires at least $\lceil\sqrt{C_{m}/\Delta A_{M}}\rceil$ slots. Therefore, virtual packet $i^{\prime}-\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil+1$ must have arrived by slot $t-\lceil\sqrt{C_{m}/\Delta A_{M}}\rceil+1$ . This implies that the set $P(t-\lceil\sqrt{C_{m}/\Delta A_{M}}\rceil+1)$ contains the subset $\{i\in\mathbb{N}:i^{\prime}-2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil+1\leq i\leq i^{\prime}-\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil\}$ , which consists of $\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ virtual packets.

From slot $t-\lceil\sqrt{C_{m}/\Delta A_{M}}\rceil+1$ through slot $t$ (a total of $\lceil\sqrt{C_{m}/\Delta A_{M}}\rceil$ slots), these $\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ virtual packets in $P(t-\lceil\sqrt{C_{m}/\Delta A_{M}}\rceil+1)$ have at least $\lceil C_{m}\rceil$ opportunities to activate. By Lemma 5, the virtual packet $i^{\prime}-2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil+1$ in this set will no longer satisfy the condition in Line 1 at the end of slot $t$ . Therefore, at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ virtual packets can satisfy the condition at the end of slot $t$ .

Appendix D Proof of Lemma 9

Consider the initial age $A_{0}=1$ , a fixed age increment sequence $\boldsymbol{\Delta A}=(0,\cdots,0)$ , and a fixed update (or clearing) cost sequence $\mathbf{C}=(C_{m},C_{m}R,C_{m}R,\cdots,C_{m}R)$ . Only the operation duration $T$ is unknown to the device. Since the age no longer increases after the first update, the device will not transmit again beyond the first update. Thus, the scheduling problem reduces to deciding when to send a single update (i.e., deciding when to clear the virtual queue once) under uncertainty in $T$ .

To simplify the analysis, we rescale the objective function in Eq. (3) by dividing it by $C_{m}$ , resulting in

\displaystyle\sum_{t=1}^{T}\left(\frac{C(t)}{C_{m}}d(t)+\sum_{i:T_{i}\leq t}\frac{1}{C_{m}}z_{i}(t)\right).

This transformation does not alter the optimal solution. We redefine $C(t)/C_{m}$ as the new (normalized) clearing cost. Under the given instance, the transformation yields a clearing cost of $1$ in slot $1$ , and a clearing cost of $R$ in all subsequent slots. Moreover, the term $1/C_{m}$ can be interpreted as the cost of holding a virtual packet for a slot of duration $1/C_{m}$ , under the convention that holding a virtual packet for one unit time incurs a unit cost. As $C_{m}\to\infty$ , the slot length approaches zero, and the problem transitions to a continuous-time scheduling model, similar to prior studies [bamas2020primal]. In this continuous-time setting, we assume that time starts at $0$ . The clearing cost becomes $C(0)=1$ and $C(t)=R$ for all $t>0$ . Moreover, we consider a time horizon $T\in(0,1]$ , which is unknown to the device .

We now establish a lower bound on the competitive ratio of any randomized online scheduling algorithm. Let $p(t)$ denote the probability density function (PDF) describing the randomized clearing time. Since $C(0)=1$ and $C(t)=R\geq 1$ for all $t\in(0,1]$ , the virtual server optimally clears the virtual queue before time 1. Thus, the PDF $p(t)$ of the randomized clearing time must satisfy the condition $\int_{0}^{1}p(t)\,dt=1$ .

For a given realization of $T\in(0,1]$ , the expected cost incurred by the randomized algorithm is $\int_{0}^{T}(R+t)p(t)\,dt+\int_{T}^{1}Tp(t)\,dt$ , where the first term accounts for the cost incurred when the virtual server decides to clear by time $T$ , and the second term accounts for the cost incurred when the virtual server decides to clear after time $T$ . Moreover, for this instance, since $T\leq 1$ , an optimal offline strategy is to idle for all time, incurring the minimum total cost of $T$ . Let $c$ be the competitive ratio of the randomized algorithm. Then, we have $\int_{0}^{T}(R+t)p(t)\,dt+\int_{T}^{1}Tp(t)\,dt\leq cT$ for all $T\in(0,1]$ . Thus, we derive the following optimization problem to find the smallest achievable competitive ratio $c$ :


$\displaystyle\min_{c,\,p(\cdot)\geq 0}\,$	$\displaystyle c$	(13a)
s.t.	$\displaystyle\int_{0}^{T}(R+t)p(t)\,dt+\int_{T}^{1}Tp(t)\,dt\leq cT\quad\text{for all $T\in(0,1]$};$	(13b)
	$\displaystyle\int_{0}^{1}p(t)\,dt=1.$	(13c)

We propose a candidate solution of the form $p(t)=Ke^{t/R}$ for some constant $K$ . Substituting this into the constraint in Eq. (13c), we can obtain $K=1/(R(e^{1/R}-1))$ , which yields $p(t)=1/(R(e^{1/R}-1))\cdot e^{t/R}$ . Substituting this density into the left-hand side of constraint (13b), we obtain

\displaystyle\int_{0}^{T}(R+t)p(t)\,dt+\int_{T}^{1}Tp(t)\,dt=\frac{e^{1/R}}{e^{1/R}-1}T.

Comparing with the right-hand side $cT$ , we conclude that

\displaystyle c\geq\frac{e^{1/R}}{e^{1/R}-1},

which establishes the desired lower bound on the competitive ratio.
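The identity obtained by substituting $p(t)=Ke^{t/R}$ into constraint (13b) can be verified numerically. The sketch below approximates both integrals with a midpoint rule; the function name and the number of quadrature points are arbitrary choices for illustration.

```python
import math


def expected_cost(T, R, n=20000):
    """Left-hand side of (13b) for p(t) = K e^(t/R), via the midpoint rule."""
    K = 1.0 / (R * math.expm1(1.0 / R))

    def p(t):
        return K * math.exp(t / R)

    # Integral over [0, T] of (R + t) p(t): clearing happens by time T.
    h1 = T / n
    early = sum((R + (j + 0.5) * h1) * p((j + 0.5) * h1) for j in range(n)) * h1
    # T times the integral over [T, 1] of p(t): clearing planned after T.
    h2 = (1.0 - T) / n
    late = T * sum(p(T + (j + 0.5) * h2) for j in range(n)) * h2
    return early + late


def lower_bound_ratio(R):
    """The constant e^(1/R) / (e^(1/R) - 1)."""
    return math.exp(1 / R) / math.expm1(1 / R)
```

The result can also be seen analytically: since $\frac{d}{dt}\big(Rte^{t/R}\big)=(R+t)e^{t/R}$, the first integral equals $KRTe^{T/R}$ and the second equals $KRT(e^{1/R}-e^{T/R})$, so their sum is $KRTe^{1/R}$, independent of how $T$ splits the interval.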

Appendix E Proof of Lemma 11

Let $S$ denote the value of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ at the beginning of some iteration of Line 3 in a slot. Suppose that a slow step is followed by a fast step, and let $S_{s\rightarrow f}$ represent the resulting value of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ after these two steps. By Eq. (6), we have

	$\displaystyle S_{s\rightarrow f}$	$\displaystyle=\underbrace{\left(1+\frac{1}{C_{M}}\right)\underbrace{\left(\left(1+\frac{1}{C_{M}}\right)S+\frac{1}{\theta_{s}C_{M}}\right)}_{\sum_{\tau=T_{i}}^{\infty}x(\tau)\big|_{\text{after the slow step}}}+\frac{1}{\theta_{f}C_{M}}}_{\sum_{\tau=T_{i}}^{\infty}x(\tau)\big|_{\text{after the fast step}}}$
		$\displaystyle=S\left(1+\frac{1}{C_{M}}\right)^{2}+\frac{1}{\theta_{s}C_{M}}+\frac{1}{\theta_{s}C_{M}^{2}}+\frac{1}{\theta_{f}C_{M}}.$

Similarly, let $S_{f\rightarrow s}$ denote the resulting value of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ after applying a fast step followed by a slow step. We have

\displaystyle S_{f\rightarrow s}=S\left(1+\frac{1}{C_{M}}\right)^{2}+\frac{1}{\theta_{f}C_{M}}+\frac{1}{\theta_{f}C_{M}^{2}}+\frac{1}{\theta_{s}C_{M}}.

Since $\theta_{s}\geq\theta_{f}$ , we have $S_{s\rightarrow f}\leq S_{f\rightarrow s}$ . Therefore, swapping a fast step and a subsequent slow step can reduce the value of $\sum_{\tau=T_{i}}^{\infty}x(\tau)$ . Hence, to prove the desired lower bound, we can assume that all $N_{s}$ slow steps occur first, followed by all $N_{f}$ fast steps.

Next, we prove the bound by induction on $N_{f}$ . When $N_{f}=0$ , the result follows from Lemma 4, which gives

\displaystyle\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{N_{f}=0}\right)\geq\frac{(1+\frac{1}{C_{M}})^{N_{s}}-1}{\theta_{s}},

for all $i\in P(t)$ . Assume that the result holds for $N_{f}-1$ , i.e.,

	$\displaystyle\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{N_{f}-1}\right)$	$\displaystyle\geq\frac{(1+\frac{1}{C_{M}})^{N_{s}}-1}{\theta_{s}}\left(1+\frac{1}{C_{M}}\right)^{N_{f}-1}$
		$\displaystyle\quad+\frac{(1+\frac{1}{C_{M}})^{N_{f}-1}-1}{\theta_{f}},$

for all $i\in P(t)$ . We show the result also holds for $N_{f}$ : after an additional fast step, by Eq. (6) we have

	$\displaystyle\quad\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{N_{f}}\right)$
	$\displaystyle=\left(1+\frac{1}{C_{M}}\right)\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\bigg|_{N_{f}-1}\right)+\frac{1}{\theta_{f}C_{M}}$
	$\displaystyle\geq\left(1+\frac{1}{C_{M}}\right)\left[\frac{(1+\frac{1}{C_{M}})^{N_{s}}-1}{\theta_{s}}\left(1+\frac{1}{C_{M}}\right)^{N_{f}-1}\right.$
	$\displaystyle\quad\left.+\frac{(1+\frac{1}{C_{M}})^{N_{f}-1}-1}{\theta_{f}}\right]+\frac{1}{\theta_{f}C_{M}}$
	$\displaystyle=\frac{(1+\frac{1}{C_{M}})^{N_{s}}-1}{\theta_{s}}\left(1+\frac{1}{C_{M}}\right)^{N_{f}}+\frac{(1+\frac{1}{C_{M}})^{N_{f}}-1}{\theta_{f}},$

for all $i\in P(t)$ . This completes the inductive step and proves the lemma.
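The swapping argument and the resulting bound can be probed numerically. The sketch below applies $N_{s}$ slow and $N_{f}$ fast steps in random orders, starting from $S=0$, and compares the smallest final value with the bound of the lemma. It takes $\theta_{s}=(1+\frac{1}{C_{M}})^{C_{m}/\lambda}-1$ and $\theta_{f}=(1+\frac{1}{C_{M}})^{C_{m}\lambda}-1$, as read off from Eq. (15) in Appendix F; the function name, trial count, and seed are illustrative.

```python
import random


def lemma11_check(C_M, C_m, lam, N_s, N_f, trials=200, seed=0):
    """Compare random step orders with the slow-first order and the bound.

    Returns (worst, slow_first, bound): the smallest final value over the
    sampled orders, the value for the slow-first order, and the bound.
    """
    a = 1 / C_M
    theta_s = (1 + a) ** (C_m / lam) - 1
    theta_f = (1 + a) ** (C_m * lam) - 1
    bound = (((1 + a) ** N_s - 1) / theta_s * (1 + a) ** N_f
             + ((1 + a) ** N_f - 1) / theta_f)

    def run(order):
        S = 0.0
        for kind in order:
            theta = theta_s if kind == "s" else theta_f
            S = (1 + a) * S + 1 / (theta * C_M)
        return S

    slow_first = run(["s"] * N_s + ["f"] * N_f)
    rng = random.Random(seed)
    steps = ["s"] * N_s + ["f"] * N_f
    worst = float("inf")
    for _ in range(trials):
        rng.shuffle(steps)
        worst = min(worst, run(steps))
    return worst, slow_first, bound
```

In this sketch the slow-first order attains the bound (up to rounding), and no sampled order falls below it, in line with the observation that $S_{f\rightarrow s}-S_{s\rightarrow f}=(\frac{1}{\theta_{f}}-\frac{1}{\theta_{s}})/C_{M}^{2}\geq 0$.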

Appendix F Proof of Lemma 12

Fix a slot $t$ . We claim that if $N_{s}\lambda+N_{f}\geq C_{m}$ , then $\sum_{\tau=T_{i}}^{\infty}x(\tau)\geq 1$ for all $i\in P(t)$ , implying that the condition in Line 3 no longer holds.

To prove the claim, it suffices to consider the case where the $N_{s}$ slow steps are followed by the $N_{f}$ fast steps (as in Appendix E). Applying Lemma 11 under $N_{s}\lambda+N_{f}\geq C_{m}$ yields

	$\displaystyle\sum_{\tau=T_{i}}^{\infty}x(\tau)$	$\displaystyle\geq\frac{\left(1+\frac{1}{C_{M}}\right)^{(C_{m}-N_{f})/\lambda}-1}{\theta_{s}}\left(1+\frac{1}{C_{M}}\right)^{N_{f}}$
		$\displaystyle\quad+\frac{\left(1+\frac{1}{C_{M}}\right)^{N_{f}}-1}{\theta_{f}},$		(14)

for all $i\in P(t)$ . If $N_{f}=0$ , then Eq. (14) (after the inequality) equals $1$ . Next, we show that Eq. (14) is nondecreasing in $N_{f}$ . Differentiating it with respect to $N_{f}$ gives

	$\displaystyle\ln\left(1+\frac{1}{C_{M}}\right)\left(1+\frac{1}{C_{M}}\right)^{N_{f}}$
	$\displaystyle\quad\times\left[-\frac{\frac{1-\lambda}{\lambda}\left(1+\frac{1}{C_{M}}\right)^{(C_{m}-N_{f})/\lambda}+1}{\theta_{s}}+\frac{1}{\theta_{f}}\right].$

To show the derivative is nonnegative, we examine the bracketed term:

	$\displaystyle-\frac{\frac{1-\lambda}{\lambda}\left(1+\frac{1}{C_{M}}\right)^{(C_{m}-N_{f})/\lambda}+1}{\theta_{s}}+\frac{1}{\theta_{f}}$
$\displaystyle\mathop{\geq}^{(a)}\;$	$\displaystyle-\frac{\frac{1-\lambda}{\lambda}\left(1+\frac{1}{C_{M}}\right)^{C_{m}/\lambda}+1}{\theta_{s}}+\frac{1}{\theta_{f}}$
$\displaystyle=\;$	$\displaystyle-\frac{\frac{1-\lambda}{\lambda}\left(1+\frac{1}{C_{M}}\right)^{C_{M}/(R\lambda)}+1}{\left(1+\frac{1}{C_{M}}\right)^{C_{M}/(R\lambda)}-1}+\frac{1}{\left(1+\frac{1}{C_{M}}\right)^{(C_{M}\lambda)/R}-1},$	(15)

where $(a)$ sets $N_{f}=0$ (since $(1+(1/C_{M}))^{(C_{m}-N_{f})/\lambda}$ decreases in $N_{f}$ ). From [bamas2020primal, Page 27], we can write $(1+(1/C_{M}))^{C_{M}}=e^{x}$ for some $x\in(0,1)$ . Let $x^{\prime}=x/R\in(0,1)$ . Then, Eq. (15) becomes

-\frac{\frac{1-\lambda}{\lambda}e^{x^{\prime}/\lambda}+1}{e^{x^{\prime}/\lambda}-1}+\frac{1}{e^{x^{\prime}\lambda}-1},

which is known to be nonnegative [bamas2020primal, Page 27]. Hence, Eq. (14) is nondecreasing as its derivative is nonnegative, proving the claim.
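The nonnegativity claim is taken from [bamas2020primal]; as a quick numerical illustration (not a proof), the sketch below evaluates the rewritten form of Eq. (15) on a grid. It assumes $\lambda\in(0,1]$ and $x^{\prime}\in(0,1)$; the function names, grid ranges, and grid resolution are hypothetical choices.

```python
import math


def bracket_term(lam, x_prime):
    """Eq. (15) after substituting (1 + 1/C_M)^{C_M} = e^x and x' = x/R."""
    u = x_prime / lam
    slow = ((1 - lam) / lam * math.exp(u) + 1) / math.expm1(u)
    fast = 1 / math.expm1(x_prime * lam)
    return fast - slow


def min_over_grid(steps=60):
    """Smallest value of the bracketed term over a grid of (lambda, x')."""
    lams = [0.05 + 0.95 * i / steps for i in range(steps + 1)]
    xs = [0.01 + 0.98 * j / steps for j in range(steps + 1)]
    return min(bracket_term(lam, x) for lam in lams for x in xs)


smallest = min_over_grid()
```

On this grid the smallest value occurs at $\lambda=1$, where the two fractions coincide and the expression vanishes.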

Appendix G Proof of Theorem 13

We follow Appendix B. Consider a period $k\in\{1,\cdots,n\}$ . Here, a single slow or fast step activation can increase the objective value in Eq. (5a) by at most $1+(1/\theta_{s})$ or $1+(1/\theta_{f})$ , respectively. Since $1+(1/\theta_{s})\leq 1+(1/\theta_{f})$ , the increment of the objective value due to the activations of all virtual packets arriving in period $k$ , from slot $t_{k-1}+1$ up to slot $t_{k}-1$ , is bounded above by

	$\displaystyle H^{*}(k)\left(1+\frac{1}{\theta_{f}}\right).$		(16)

Moreover, let $N_{s}(k)$ and $N_{f}(k)$ denote the numbers of slow and fast steps, respectively, performed by the virtual packets arriving in period $k$ , from slot $t_{k}$ onward. Then, the increment of the objective value due to these $N_{s}(k)+N_{f}(k)$ activations is

	$\displaystyle\quad N_{s}(k)\left(1+\frac{1}{\theta_{s}}\right)+N_{f}(k)\left(1+\frac{1}{\theta_{f}}\right)$
	$\displaystyle\mathop{\leq}^{(a)}\frac{C_{m}+1-N_{f}(k)}{\lambda}\left(1+\frac{1}{\theta_{s}}\right)+N_{f}(k)\left(1+\frac{1}{\theta_{f}}\right),$		(17)

where $(a)$ is because $N_{s}(k)\lambda+N_{f}(k)\leq C_{m}+1$ from Lemma 12. Differentiating Eq. (17) (after the inequality) with respect to $N_{f}(k)$ gives

	$\displaystyle\quad-\frac{1}{\lambda}\left(1+\frac{1}{\theta_{s}}\right)+\left(1+\frac{1}{\theta_{f}}\right)$
	$\displaystyle=-\frac{1}{\lambda}\left(1+\frac{1}{\left(1+\frac{1}{C_{M}}\right)^{C_{m}/\lambda}-1}\right)$
	$\displaystyle\quad+\left(1+\frac{1}{\left(1+\frac{1}{C_{M}}\right)^{C_{m}\lambda}-1}\right).$

Following Appendix F, this derivative can be rewritten as

	$\displaystyle\quad-\frac{1}{\lambda}\left(1+\frac{1}{e^{x^{\prime}/\lambda}-1}\right)+\left(1+\frac{1}{e^{x^{\prime}\lambda}-1}\right)$
	$\displaystyle=-\frac{1}{\lambda}\left(\frac{1}{1-e^{-x^{\prime}/\lambda}}\right)+\frac{1}{1-e^{-x^{\prime}\lambda}},$

for some $x^{\prime}\in(0,1)$. This expression is known to be nonnegative [bamas2020primal, Page 27]. Hence, Eq. (17) is nondecreasing in $N_{f}(k)$, and its maximum, attained at $N_{f}(k)=C_{m}+1$, is

	$\displaystyle\big(C_{m}+1\big)\left(1+\frac{1}{\theta_{f}}\right).$		(18)

Combining Eqs. (16) and (18), we obtain

	$\displaystyle J(k)$	$\displaystyle\leq\left(1+\frac{1}{\theta_{f}}\right)\left(H^{*}(k)+C_{m}+1\right)$
		$\displaystyle\leq\left(1+\frac{1}{\theta_{f}}\right)\frac{C_{m}+1}{C_{m}}\left(H^{*}(k)+C(t_{k})\right)$
		$\displaystyle\leq\left(1+\frac{1}{C_{m}}\right)\left(1+\frac{1}{\theta_{f}}\right)J^{*}(k).$

Finally, following the line of Appendix B and substituting the definition of $\theta_{f}$ completes the proof.
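The second and third inequalities in the bound on $J(k)$ above can be spelled out as follows. This expansion assumes, as in Appendix B, that $J^{*}(k)=H^{*}(k)+C(t_{k})$ and that every update cost satisfies $C(t_{k})\geq C_{m}$:

	$\displaystyle H^{*}(k)+C_{m}+1$	$\displaystyle\leq\frac{C_{m}+1}{C_{m}}H^{*}(k)+\frac{C_{m}+1}{C_{m}}\,C_{m}$
		$\displaystyle\leq\frac{C_{m}+1}{C_{m}}\left(H^{*}(k)+C(t_{k})\right)$
		$\displaystyle=\left(1+\frac{1}{C_{m}}\right)J^{*}(k).$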

Appendix H Proof of Theorem 14

Fix an instance $\mathcal{I}$ and ML advice $\boldsymbol{\mathcal{M}}$ . We follow Appendix B, with minor modifications. Redefine $t_{k}$ for $k\in\{1,\cdots,n\}$ as the slot when $\boldsymbol{\mathcal{M}}$ clears the virtual queue for the $k$ -th time. These redefined time points determine a new set of periods, replacing the periods defined in Appendix B.

Let $J_{\boldsymbol{\mathcal{M}}}(k)$ denote the cost incurred by $\boldsymbol{\mathcal{M}}$ in period $k$ . Then, the total cost in Eq. (3) incurred by $\boldsymbol{\mathcal{M}}$ is $\sum_{k=1}^{n+1}J_{\boldsymbol{\mathcal{M}}}(k)$ . Let $H_{\boldsymbol{\mathcal{M}}}(k)$ be the holding cost incurred by $\boldsymbol{\mathcal{M}}$ for all virtual packets arriving in period $k$ . Following Appendix B, we have $J_{\boldsymbol{\mathcal{M}}}(k)=H_{\boldsymbol{\mathcal{M}}}(k)+C(t_{k})$ for all $k\in\{1,\cdots,n\}$ , and $J_{\boldsymbol{\mathcal{M}}}(n+1)=H_{\boldsymbol{\mathcal{M}}}(n+1)$ .

Let $J(k)$ be the increment of the objective value in Eq. (5a) under Alg. 3 due to the slow and fast step activations of all virtual packets arriving in period $k$. Consider a fixed $k\in\{1,\cdots,n\}$. The virtual packets arriving in period $k$ activate only slow steps from slot $t_{k-1}+1$ until slot $t_{k}-1$ (that is, before the ML advice clears the virtual queue). Each slow step activation increases the objective value by at most $1+(1/\theta_{s})$, so the total increment of the objective value in this interval is at most $(1+(1/\theta_{s}))H_{\boldsymbol{\mathcal{M}}}(k)$. Moreover, once the ML advice clears the virtual queue in slot $t_{k}$, the same virtual packets activate only fast steps. Following the proof of Lemma 5, there are at most $\lceil C_{m}\lambda\rceil$ such activations, each increasing the objective value by at most $1+(1/\theta_{f})$. Thus, the total increment of the objective value after slot $t_{k}$ is at most $(1+(1/\theta_{f}))\lceil C_{m}\lambda\rceil$.

Hence, we have

	$\displaystyle J(k)$	$\displaystyle\leq\left(1+\frac{1}{\theta_{s}}\right)H_{\boldsymbol{\mathcal{M}}}(k)+\left(1+\frac{1}{\theta_{f}}\right)\lceil C_{m}\lambda\rceil$
		$\displaystyle=\left(1+\frac{1}{\theta_{s}}\right)H_{\boldsymbol{\mathcal{M}}}(k)+\left(1+\frac{1}{\theta_{f}}\right)\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\,C_{m}$
		$\displaystyle\leq\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}\big(H_{\boldsymbol{\mathcal{M}}}(k)+C_{m}\big)$
		$\displaystyle\leq\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}\big(H_{\boldsymbol{\mathcal{M}}}(k)+C(t_{k})\big)$
		$\displaystyle=\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}J_{\boldsymbol{\mathcal{M}}}(k).$

Similarly, we also have

J(n+1)\leq\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}J_{\boldsymbol{\mathcal{M}}}(n+1).

Combining the two cases yields

\sum_{k=1}^{n+1}J(k)\leq\max\left\{1+\frac{1}{\theta_{s}},\frac{\lceil C_{m}\lambda\rceil}{C_{m}}\left(1+\frac{1}{\theta_{f}}\right)\right\}\sum_{k=1}^{n+1}J_{\boldsymbol{\mathcal{M}}}(k).

Substituting $J(\mathcal{I},\boldsymbol{\mathcal{M}})=\sum_{k=1}^{n+1}J_{\boldsymbol{\mathcal{M}}}(k)$ and the definitions of $\theta_{s}$ and $\theta_{f}$ proves the first part of the theorem.

Finally, following the derivation in Appendix G, we obtain the asymptotic behavior of the bound as $C_{u}\to\infty$ .
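To illustrate how the bound behaves for large update costs, the sketch below evaluates the maximum above numerically. It is only an illustration under stated assumptions: $\theta_{s}$ and $\theta_{f}$ are taken in the form in which they appear in Eq. (15), namely $\theta_{s}=(1+1/C_{M})^{C_{m}/\lambda}-1$ and $\theta_{f}=(1+1/C_{M})^{C_{m}\lambda}-1$ with $C_{M}=RC_{m}$, and the reference value is the consistency level $\lambda e^{\lambda/R}/(e^{\lambda/R}-1)$ that appears in Lemma 15. All function names are hypothetical.

```python
import math


def thetas(c_min, ratio, lam):
    """theta_s and theta_f as they appear in Eq. (15), with C_M = ratio * C_m."""
    log_a = math.log1p(1 / (ratio * c_min))
    theta_s = math.expm1(c_min / lam * log_a)
    theta_f = math.expm1(c_min * lam * log_a)
    return theta_s, theta_f


def consistency_bound(c_min, ratio, lam):
    """max{1 + 1/theta_s, ceil(C_m * lam)/C_m * (1 + 1/theta_f)}."""
    theta_s, theta_f = thetas(c_min, ratio, lam)
    fast_factor = math.ceil(c_min * lam) / c_min
    return max(1 + 1 / theta_s, fast_factor * (1 + 1 / theta_f))


def reference_level(ratio, lam):
    """lam * e^{lam/R} / (e^{lam/R} - 1), the level used in Lemma 15."""
    e = math.exp(lam / ratio)
    return lam * e / (e - 1)


# Illustrative values: R = 2, lambda = 0.5, growing minimum cost C_m.
bounds = [consistency_bound(c, 2.0, 0.5) for c in (10, 10**3, 10**6)]
target = reference_level(2.0, 0.5)
```

For these illustrative parameters the bound approaches the reference level as $C_{m}$ grows.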

Appendix I Proof of Lemma 15

We use the same continuous-time instance as in Appendix D. Consider a $\big(\lambda e^{\lambda/R}\big)/(e^{\lambda/R}-1)$ -consistent scheduling algorithm that chooses a random update (or equivalently clearing) time $t\in[0,1]$ with PDF $p(t)$ (so $\int_{0}^{1}p(t)\,dt=1$ ). Let $c$ denote the robustness factor. Following Appendix D, we have

	$\displaystyle\int_{0}^{T}(R+t)p(t)\,dt+\int_{T}^{1}Tp(t)\,dt\leq cT,$		(19)

for all $T\in(0,1]$ .

Setting $T=1$ and assuming perfect ML advice (updating at time $0$ ), the ML advice incurs a cost of $1$ , while the online algorithm incurs a cost of $\int_{0}^{1}(R+t)\,p(t)\,dt$ . By $\big(\lambda e^{\lambda/R}\big)/(e^{\lambda/R}-1)$ -consistency, we have

	$\displaystyle\int_{0}^{1}(R+t)\,p(t)\,dt\;\leq\;\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}\cdot 1.$		(20)

We now lower bound the optimal robustness $c$ subject to Eqs. (19) and (20), and $\int_{0}^{1}p(t)\,dt=1$ :


$\displaystyle\min_{c,\,p(\cdot)}\,$	$\displaystyle c$	(21a)
s.t.	$\displaystyle\int_{0}^{T}(R+t)\,p(t)\,dt+\int_{T}^{1}Tp(t)\,dt\leq cT,$
	$\displaystyle\hskip 99.58464pt\text{for all $T\in(0,1]$};$	(21b)
	$\displaystyle\int_{0}^{1}(R+t)\,p(t)\,dt\leq\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1};$	(21c)
	$\displaystyle\int_{0}^{1}p(t)\,dt=1.$	(21d)

By weak duality, any feasible solution to the dual of Eq. (21) yields a lower bound on $c$ . Let $\eta(T)$ , $\mu$ , and $\nu$ be the dual variables for Eqs. (21b), (21c), and (21d), respectively. Then, the dual program can be written as follows:


$\displaystyle\max_{\eta(\cdot),\mu,\nu}\quad$	$\displaystyle\nu-\mu\,\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}$	(22a)
s.t.	$\displaystyle\int_{0}^{1}T\eta(T)\,dT=1;$	(22b)
	$\displaystyle\nu-(R+t)\mu\leq\int_{0}^{t}T\eta(T)\,dT$
	$\displaystyle\hskip 79.6678pt+(R+t)\int_{t}^{1}\eta(T)\,dT,$
	$\displaystyle\hskip 85.35826pt\text{for all $t\in[0,1]$}.$	(22c)

Next, we construct a feasible solution to the optimization problem in Eq. (22). Let $\eta(T)=Ke^{-T/R}\mathbf{1}_{\{T\leq\lambda\}}$ for some constant $K$ to be determined, where $\mathbf{1}_{\{\cdot\}}$ is the indicator function. Substituting this form into Eq. (22b) yields $K=1/(R^{2}-R^{2}e^{-\lambda/R}-R\lambda e^{-\lambda/R})$.

We further propose $\nu=aK$ and $\mu=bKe^{-\lambda/R}$ for some constants $a$ and $b$ to be determined. Substituting these into the objective in Eq. (22a) gives

	$\displaystyle\nu-\mu\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}$
$\displaystyle=\,$	$\displaystyle K\left(a-be^{-\lambda/R}\frac{\lambda e^{\lambda/R}}{e^{\lambda/R}-1}\right)$
$\displaystyle=\,$	$\displaystyle\frac{1}{R^{2}-R^{2}e^{-\lambda/R}-R\lambda e^{-\lambda/R}}\left(a-ae^{-\lambda/R}-b\lambda e^{-\lambda/R}\right)$
	$\displaystyle\times\left(\frac{e^{\lambda/R}}{e^{\lambda/R}-1}\right).$	(23)

To make the objective equal to $(e^{\lambda/R})/(e^{\lambda/R}-1)$ (the claimed robustness bound), we choose $a=R^{2}$ and $b=R$ .

Next, we verify that the chosen values satisfy Eq. (22c). We consider two cases:

For $t\leq\lambda$ : The left-hand side of Eq. (22c) (before the inequality) is

\nu-(R+t)\mu=K\big(R^{2}-R^{2}e^{-\lambda/R}-tRe^{-\lambda/R}\big).

The right-hand side is

	$\displaystyle\quad\int_{0}^{t}T\eta(T)\,dT+(R+t)\int_{t}^{1}\eta(T)\,dT$
	$\displaystyle=K\left(\int_{0}^{t}Te^{-T/R}\,dT+(R+t)\int_{t}^{\lambda}e^{-T/R}\,dT\right)$
	$\displaystyle=K\big(R^{2}-R^{2}e^{-\lambda/R}-tRe^{-\lambda/R}\big),$

which matches the left-hand side.

For $t>\lambda$ : The left-hand side is

	$\displaystyle\nu-(R+t)\mu$	$\displaystyle\leq\nu-(R+\lambda)\mu$
		$\displaystyle=K\big(R^{2}-R^{2}e^{-\lambda/R}-\lambda Re^{-\lambda/R}\big)$
		$\displaystyle=1.$

The right-hand side is

	$\displaystyle\quad\int_{0}^{t}T\eta(T)\,dT+(R+t)\int_{t}^{1}\eta(T)\,dT$
	$\displaystyle=\int_{0}^{\lambda}T\eta(T)\,dT=1,$

so the inequality holds.

Therefore, Eq. (22c) is satisfied in both cases. By the weak duality theorem, the minimum possible robustness $c$ is at least

\frac{e^{\lambda/R}}{e^{\lambda/R}-1},

as stated in Eq. (23).
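The dual solution constructed above can also be checked numerically. The following sketch is an illustration rather than part of the proof: it assumes $R>0$ and $\lambda\in(0,1]$, uses a simple midpoint rule for the integrals, and all function names, the grid size, and the sample values of $R$ and $\lambda$ are hypothetical.

```python
import math


def midpoint(f, lo, hi, n=2000):
    """Midpoint-rule integral of f over [lo, hi]; returns 0 if hi <= lo."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def dual_check(R, lam, grid=50):
    """Return (value of Eq. (22b), worst slack in Eq. (22c), objective (22a))."""
    K = 1 / (R**2 - R**2 * math.exp(-lam / R) - R * lam * math.exp(-lam / R))
    nu = R**2 * K                      # nu = a*K with a = R^2
    mu = R * K * math.exp(-lam / R)    # mu = b*K*e^{-lam/R} with b = R

    def eta(T):
        # eta(T) on [0, lam]; it is zero beyond lam, so integrals stop at lam.
        return K * math.exp(-T / R)

    def weighted(T):
        return T * eta(T)

    normalization = midpoint(weighted, 0.0, lam)
    worst_slack = math.inf
    for j in range(grid + 1):
        t = j / grid
        rhs = midpoint(weighted, 0.0, min(t, lam)) + (R + t) * midpoint(eta, t, lam)
        worst_slack = min(worst_slack, rhs - (nu - (R + t) * mu))
    e = math.exp(lam / R)
    objective = nu - mu * lam * e / (e - 1)
    return normalization, worst_slack, objective


normalization, worst_slack, objective = dual_check(R=2.0, lam=0.5)
```

For these sample values the normalization is one, the constraint slack is nonnegative up to integration error, and the objective matches $e^{\lambda/R}/(e^{\lambda/R}-1)$.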

Appendix J Proof of Lemma 16

Fix a slot $t$ . We prove the result by induction on $n$ . When $n=1$ , by Eq. (6) we have

	$\displaystyle\quad\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n=1}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n=0}\right)$
	$\displaystyle=\frac{1}{C_{M}}\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n=0}\right)+\frac{1}{\theta C_{M}}$
	$\displaystyle\leq\frac{1}{C_{M}}\cdot 1+\frac{1}{\theta C_{M}}$
	$\displaystyle=\left(1+\frac{1}{\theta}\right)\!\left[\left(1+\frac{1}{C_{M}}\right)^{1}-1\right],$

for all $i\in P(t)$ .

Assume the result holds for $n-1$ , i.e.,

	$\displaystyle\quad\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n-1}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{0}\right)$
	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\!\left[\left(1+\frac{1}{C_{M}}\right)^{n-1}-1\right],$

for all $i\in P(t)$ .

We show that the result holds for $n$ :

	$\displaystyle\quad\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{0}\right)$
	$\displaystyle=\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n-1}\right)$
	$\displaystyle\quad+\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n-1}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{0}\right)$
	$\displaystyle\leq\frac{1}{C_{M}}\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n-1}\right)+\frac{1}{\theta C_{M}}$
	$\displaystyle\quad+\left(1+\frac{1}{\theta}\right)\!\left[\left(1+\frac{1}{C_{M}}\right)^{n-1}-1\right],$		(24)

where the inequality uses Eq. (6) and the inductive hypothesis. Moreover, the term $\sum_{\tau=T_{i}}^{\infty}x(\tau)\big|_{n-1}$ can be bounded as follows:

	$\displaystyle\quad\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n-1}\right)$
	$\displaystyle=\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{0}\right)+\Bigg[\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n-1}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{0}\right)\Bigg]$
	$\displaystyle\leq 1+\left(1+\frac{1}{\theta}\right)\!\left[\left(1+\frac{1}{C_{M}}\right)^{n-1}-1\right].$

Plugging this into Eq. (24) yields

	$\displaystyle\quad\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{n}\right)-\left(\sum_{\tau=T_{i}}^{\infty}x(\tau)\Big|_{0}\right)$
	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\!\left[\left(1+\frac{1}{C_{M}}\right)^{n}-1\right],$

for all $i\in P(t)$ . This completes the inductive step and proves the lemma.
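For completeness, the substitution in the last step can be expanded as follows. Writing $a=1+(1/C_{M})$ as shorthand (a symbol introduced here only for brevity), the right-hand side of Eq. (24) after plugging in the bound on the $(n-1)$-th sum becomes

	$\displaystyle\quad\frac{1}{C_{M}}\left[1+\left(1+\frac{1}{\theta}\right)\left(a^{n-1}-1\right)\right]+\frac{1}{\theta C_{M}}+\left(1+\frac{1}{\theta}\right)\left(a^{n-1}-1\right)$
	$\displaystyle=\left(1+\frac{1}{\theta}\right)\frac{1}{C_{M}}+\left(1+\frac{1}{\theta}\right)a\left(a^{n-1}-1\right)$
	$\displaystyle=\left(1+\frac{1}{\theta}\right)\left[\frac{1}{C_{M}}+a^{n}-a\right]$
	$\displaystyle=\left(1+\frac{1}{\theta}\right)\left[a^{n}-1\right],$

where the last equality uses $a-(1/C_{M})=1$.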

Appendix K Proof of Theorem 17

For any scheduling algorithm, a virtual packet that arrives during a virtual OFF slot must remain in the virtual queue until the next virtual ON slot. Hence, the holding cost accrued by the virtual packets that arrive in virtual OFF slots until the slot immediately before the next virtual ON slot is identical across all algorithms (including the offline optimum). We therefore remove this constant from the objective, which is equivalent to deferring any virtual packet arrival in a virtual OFF slot to the next virtual ON slot. Then, it suffices to consider a fixed instance $\mathcal{I}$ in which no virtual packet arrives in a virtual OFF slot.

We follow Appendix B. Fix a period $k\in\{1,\dots,n\}$ and a virtual ON slot $t$ in that period. We bound the increment of the objective value in Eq. (9a) in slot $t$ by considering how $x(t)$ and the matching $z$ -variables are adjusted by Alg. 4. There are three mutually exclusive cases:

1.

If a virtual packet $i$ arrives before $\hat{t}$ and activates in iteration $t^{\prime}$ of Line 4 in slot $t$, then Line 4 increases $x(t)$ by

\displaystyle\frac{1}{C_{M}}\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)+\frac{1}{\theta C_{M}},

and the paired variable is set to

\displaystyle z_{i}(t^{\prime})=1-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right).

Hence, the increment of the objective value due to the adjustment of $x(t)$ from virtual packet $i$ in iteration $t^{\prime}$ and the paired $z_{i}(t^{\prime})$ is

	$\displaystyle C(t)\left[\frac{1}{C_{M}}\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)+\frac{1}{\theta C_{M}}\right]$
	$\displaystyle\quad+1-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right)$
	$\displaystyle\leq 1+\frac{1}{\theta}$
	$\displaystyle+\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right).$		(25)

Next, we bound the total number of activations performed from the start of slot $t$ until the considered activation. By Lemma 7, at most $2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil$ virtual packets can satisfy the condition in Line 4 at the end of the previous ON slot $\hat{t}-1$. Because no virtual packets arrive during the virtual OFF period, at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil$ virtual packets can satisfy the condition at the beginning of slot $t$ as well. Moreover, since each such virtual packet can be iterated at most $T_{\text{OFF}}$ times in Line 4, the total number of activations performed from the start of slot $t$ until the considered activation is at most $2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil T_{\text{OFF}}$.

Then, by Lemma 16, we have

	$\displaystyle\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{after the activation}}\right)-\left(\sum_{\tau=T_{i}}^{t}x(\tau)\bigg|_{\text{start of slot~$t$}}\right)$
	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\left[\left(1+\frac{1}{C_{M}}\right)^{2\left\lceil\sqrt{\Delta A_{M}C_{m}}\right\rceil T_{\text{OFF}}}-1\right].$

Substituting this in Eq. (25) gives the bound on the increment of the objective value:

	$\displaystyle\left(1+\frac{1}{\theta}\right)\left(1+\frac{1}{C_{M}}\right)^{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}.$		(26)

2.

If a virtual packet $i$ arrives before $\hat{t}$ but does not activate in iteration $t^{\prime}$ of Line 4, then because the paired $z_{i}(t^{\prime})$ was still set to $1-\sum_{\tau=T_{i}}^{t^{\prime}}x(\tau)$ in slot $t^{\prime}\leq t$ , it contributes at most $1$ to the objective value in that iteration.
3.

If a virtual packet $i$ arrives in slot $t$ and activates in slot $t$ , then by Appendix B the increment from $x(t)$ in Line 4 and its paired $z_{i}(t)$ in Line 4 is $1+(1/\theta)$ , which is also bounded above by Eq. (26).

Next, we count the occurrences of the three cases from slot $t_{k}$ onward for the virtual packets that arrive in period $k$. By Lemma 5, these virtual packets can activate (in Cases 1 and 3 together) at most $\lceil C_{m}\rceil$ times from slot $t_{k}$ onward. In addition, by Lemma 7, at most $2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil$ virtual packets meet the condition in Line 4 at the end of slot $t_{k}$. Once such a virtual packet stops activating, it can contribute at most $T_{\text{OFF}}$ Case 2 increments, since Line 4 iterates at most $T_{\text{OFF}}$ times. Thus, there are at most $2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}$ Case 2 increments from slot $t_{k}$ onward.

Then, we can revise Eq. (12) in Appendix B as

	$\displaystyle J(k)$	$\displaystyle\leq\underbrace{\left(1+\frac{1}{\theta}\right)\left(1+\frac{1}{C_{m}}\right)^{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}H^{*}(k)}_{(a)}$
		$\displaystyle\quad+\underbrace{\left(1+\frac{1}{\theta}\right)\left(1+\frac{1}{C_{m}}\right)^{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}\lceil C_{m}\rceil}_{(b)}$
		$\displaystyle\quad+\underbrace{1\cdot\left(2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}\right)}_{(c)},$

where (a) corresponds to all cases before slot $t_{k}$ ; (b) corresponds to Cases 1 and 3 from slot $t_{k}$ onward; (c) corresponds to Case 2 from slot $t_{k}$ onward. These terms can be further simplified as

	$\displaystyle(a)+(b)$	$\displaystyle\leq\left(1+\frac{1}{\theta}\right)\left(1+\frac{1}{C_{m}}\right)^{1+2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}J^{*}(k),$

and also

	$\displaystyle(c)$	$\displaystyle\leq\frac{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}{C_{m}}\,(H^{*}(k)+C_{m})$
		$\displaystyle\leq\frac{2\lceil\sqrt{\Delta A_{M}C_{m}}\rceil T_{\text{OFF}}}{C_{m}}J^{*}(k).$

Finally, following Appendix B yields the desired result.

Appendix L Proof of Lemma 18

First, we consider the initial age $A_{0}=1$ and a fixed update cost sequence $\mathbf{C}=(C_{m},\cdots,C_{m})$ . Consider an online algorithm that updates in slot $t$ with probability $p(t)$ . Depending on $p(1)$ , the adversary constructs the operation duration $T$ , the age increment sequence $\boldsymbol{\Delta A}$ , and the update opportunity sequence $\mathbf{U}$ as follows:

1.

If $p(1)<1$ : Set $T=2$ , $\boldsymbol{\Delta A}=(0,C_{m}^{2}-2)$ , and $\mathbf{U}=(1,0)$ . In this case, the online algorithm incurs expected total cost

$\displaystyle\quad p(1)\,C_{m}+(1-p(1))\bigl(A(1)+A(2)\bigr)$

$\displaystyle=p(1)\,C_{m}+(1-p(1))C_{m}^{2},$

while the offline optimum updates in slot 1 and incurs total cost $C_{m}$ . Hence, the competitive ratio is $p(1)+(1-p(1))C_{m}$ , which diverges as $C_{m}\to\infty$ .
2.

If $p(1)=1$ : Set $T=1$ , $\Delta A(1)=0$ , and $U(1)=1$ . In this case, the online algorithm incurs cost $C_{m}$ , while the offline optimum chooses not to update and incurs total cost $1$ . Hence, the competitive ratio is $C_{m}$ , which again diverges as $C_{m}\to\infty$ .

In both cases, if $\Delta A_{M}$ can be arbitrarily large, the adversary can construct an instance that forces the competitive ratio of any online algorithm to diverge.

Second, we consider the initial age $A_{0}=1$ , a fixed age increment sequence $\boldsymbol{\Delta A}=(0,\cdots,0)$ , and a fixed update cost sequence $\mathbf{C}=(C_{m},\cdots,C_{m})$ . Depending on $p(1)$ , the adversary constructs the operation duration $T$ and the update opportunity sequence $\mathbf{U}$ as follows:

1.

If $p(1)<1$ : Set $T=C_{m}^{2}+1$ and $\mathbf{U}=(1,0,0,\cdots,0)$ . In this case, the online algorithm incurs expected total cost

$\displaystyle\quad p(1)\,C_{m}+(1-p(1))(T-1)$

$\displaystyle=p(1)\,C_{m}+(1-p(1))C_{m}^{2},$

while the offline optimum updates in slot 1 and incurs total cost $C_{m}$ . Hence, the competitive ratio is $p(1)+(1-p(1))C_{m}$ , which diverges as $C_{m}\to\infty$ .
2.

If $p(1)=1$ : Set $T=1$ and $U(1)=1$ . In this case, the online algorithm incurs cost $C_{m}$ , while the offline optimum chooses not to update and incurs total cost $1$ . Hence, the competitive ratio is $C_{m}$ , which again diverges as $C_{m}\to\infty$ .

In both cases, if $T_{\text{OFF}}$ can be arbitrarily large, the adversary can also construct an instance that forces the competitive ratio of any online algorithm to diverge, completing the proof.
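Both constructions force the same ratio, which the sketch below evaluates for a few values of $C_{m}$. It is only an illustration of the case analysis above; the function name and the sample values are hypothetical.

```python
def forced_ratio(p1, c_min):
    """Competitive ratio the adversary forces, given the probability p1 of
    updating in slot 1 and the fixed update cost c_min."""
    if p1 < 1:
        # Adversary blocks later updates; skipping slot 1 costs C_m^2 in age.
        online = p1 * c_min + (1 - p1) * c_min**2
        offline = c_min  # offline optimum updates in slot 1
    else:
        # Adversary ends the operation after slot 1.
        online = c_min   # the algorithm always pays the update cost
        offline = 1      # offline optimum skips the update
    return online / offline


ratios = [forced_ratio(0.5, c) for c in (10, 100, 1000)]
```

For any fixed $p(1)<1$ the ratio grows linearly in $C_{m}$, and for $p(1)=1$ it equals $C_{m}$.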
