Timely Information Updating for Mobile Devices Without and With ML Advice
Abstract
This paper investigates an information update system in which a mobile device monitors a physical process and sends status updates to an access point (AP). A fundamental trade-off arises between the timeliness of the information maintained at the AP and the update cost incurred at the device. To address this trade-off, we propose an online algorithm that determines when to transmit updates using only available observations. The proposed algorithm asymptotically achieves the optimal competitive ratio against an adversary that can simultaneously manipulate multiple sources of uncertainty, including the operation duration, the information staleness, the update cost, and the availability of update opportunities. Furthermore, by incorporating machine learning (ML) advice of unknown reliability into the design, we develop an ML-augmented algorithm that asymptotically attains the optimal consistency-robustness trade-off, even when the adversary can additionally corrupt the ML advice. The optimal competitive ratio scales linearly with the range of update costs, but is unaffected by other uncertainties. Moreover, an optimal competitive online algorithm exhibits a threshold-like response to the ML advice: it either fully trusts or completely ignores the ML advice, as partially trusting the advice cannot improve the consistency without severely degrading the robustness. Extensive simulations in stochastic settings further validate the theoretical findings in the adversarial environment.
I Introduction
In recent years, the demand for timely information has surged across diverse systems. In Internet-of-Things (IoT) networks (e.g., unmanned aerial vehicles deployed for disaster response [gupta2015survey]), each IoT device is equipped with sensors (e.g., GPS, radar, and temperature sensors) that continuously monitor its surroundings. These sensors generate status updates about physical processes and transmit them to a central controller. By aggregating such updates, the controller has a real-time view of the environment, thereby enabling intelligent decision-making. Similarly, in location-based smartphone applications (e.g., navigation and gaming [karki2020characterizing]), users frequently report their locations to a central server so that the service can respond in real time. In both cases, a central entity relies on timely status updates from mobile devices to perform time-sensitive inference tasks.
To quantify the timeliness of information maintained at a central entity, Kaul, Yates, and Gruteser introduced the age of information metric in [kaul2012real], defined as the time elapsed since the most recently received update was generated. Under this definition, the information at the central entity linearly ages with time until it is updated. In addition to the linear aging function, more general nonlinear aging functions [kosta2017age] have also been analyzed. These functions further characterize the quality of an update, e.g., capturing how quickly the information held by the central entity deviates from the true status, or representing the penalty associated with using outdated information in decision-making. In this paper, we consider general aging functions.
While frequent updates reduce the age of information at a central entity, they also incur substantial update costs (e.g., energy consumption and bandwidth utilization) at local devices. Such costs are particularly significant for resource-limited mobile devices (e.g., battery-powered and bandwidth-constrained IoT devices or smartphones). We therefore investigate the fundamental trade-off between the age of information at the central entity and the update cost at the mobile device. Specifically, this paper considers an information update system in which a mobile device monitors a physical process and reports its latest status to a nearby access point (AP). To balance information timeliness at the AP with resource consumption at the device, a scheduling algorithm that determines when to transmit updates is crucial. Our goal is to design such an algorithm to minimize the total cost over the operation duration, where the total cost jointly accounts for an age cost (representing the AP’s information staleness) and an update cost (representing the device’s resource expenditure).
The scheduling problem is complicated by several forms of uncertainty in mobile networks: 1) The operation duration is uncertain, e.g., the runtime of a location-based application depends on how long a smartphone user keeps the application active. 2) The age increment may vary over time, e.g., a location update becomes stale more quickly when the device moves at a higher speed. 3) The update cost is also time-varying, e.g., user mobility causes fluctuations in energy consumption. 4) The device’s update opportunities may also be intermittent. Such cases include sporadic update arrivals (e.g., due to misalignment between the update generation periods and transmission slots) and transmission constraints that prevent the device from sending in certain slots (e.g., due to a power-saving policy [lin2022survey] or uplink scheduling decisions imposed by the AP [takeda2020understanding]).
Most prior works modeled uncertainties using stationary stochastic processes, e.g., employing an M/M/1 queueing model in [kaul2012real] to represent the update arrival and service processes. However, such assumptions are often unrealistic, e.g., when a device moves arbitrarily so that the service time no longer follows an exponential distribution. Even if such models fit reality reasonably well, the operation duration may be too short for the process to converge to stationarity. Moreover, practical stochastic models for operation duration or the information aging are often unclear. Such non-stationary uncertainties pose the central challenge for our scheduling design. Under non-stationarity, a scheduling algorithm cannot rely on future knowledge and must instead operate solely based on past and present observations, as an online algorithm.
Our first contribution is the design and analysis of online scheduling algorithms that operate under observable information. Specifically, the proposed algorithm requires knowledge only of the current age increment of the status held by the AP and also whether an update opportunity is currently available, without relying on any prior knowledge of the operation duration or the entire sequences of status aging, update costs, and update opportunities. Let denote the ratio between the maximum and minimum update cost. Our main result establishes that, asymptotically in the large update cost regime, the proposed algorithm achieves a total cost at most (known as the competitive ratio) times the minimum total cost attained by an optimal offline algorithm with complete knowledge of all uncertainties. This competitive ratio turns out to be optimal. Thus, we can observe that the optimal competitive ratio scales linearly with the range of update costs, while remaining independent of all other sources of uncertainty.
The above guarantee holds in the worst case (also referred to as the adversarial environment), across all possible uncertainty instances. However, such worst-case analysis can be overly pessimistic, since in practice future events often follow patterns that can be predicted using machine learning (ML). Motivated by this, Lykouris and Vassilvitskii [lykouris2018competitive] proposed incorporating (potentially imperfect) ML advice into online algorithms, as an approach that goes beyond worst-case analysis. A central challenge in this setting is that the reliability of the ML advice is generally unknown. Thus, the design goal is twofold: 1) when the ML advice is accurate, the algorithm should perform well; 2) when the advice is unreliable, the algorithm should still provide performance guarantees. However, the two properties cannot both be optimized simultaneously. For example, an algorithm that blindly trusts the ML advice performs excellently under accurate ML advice but can suffer arbitrarily poor performance when the ML advice is wrong. Hence, the objective is to optimally balance these two properties.
Our second contribution is to integrate ML advice (specifically, advising the next update time) into our online scheduling framework. We introduce a hyperparameter to control the level of trust in the ML advice, where a smaller places greater reliance on the ML advice. Our main result is that the proposed algorithm achieves the following asymptotic trade-off: 1) it achieves a total cost at most (known as the consistency) times the cost of blindly following the ML advice; 2) it also achieves a total cost at most (known as the robustness) times the minimum cost achieved by an optimal offline algorithm. Again, this balance depends only on the ratio and turns out to be optimal. We can observe that partially trusting the ML advice with cannot leverage the ML advice, as it yields (asymptotically) no improvement in the consistency. Thus, an optimal online algorithm almost displays a threshold-type behavior with respect to ML advice, either fully adopting it or ignoring it altogether.
II Related Work
Extensive research has been conducted on analyzing and minimizing the age of information in diverse system settings. For example, Costa et al. [costa2016age] derived closed-form expressions for the average age in single-source systems; then, Yates and Kaul [yates2018age] extended the analysis to multi-source scenarios. Building on the foundational work, numerous system design strategies have been proposed to minimize the age, including scheduling algorithms [kadota2018scheduling], resource allocation schemes [park2020centralized], and sampling strategies [ornee2019sampling]. Beyond solely minimizing the age, several trade-off problems have also been investigated, such as age–throughput trade-off [mankar2021throughput, wang2025understanding] and age–energy trade-off [nath2018optimum, gu2019timely]. A comprehensive survey of these efforts is provided in [yates2021age].
Most prior age-related works assume stationary stochastic processes to model uncertainties (e.g., [yates2018age, kadota2018scheduling, park2020centralized, ornee2019sampling, mankar2021throughput, wang2025understanding, nath2018optimum, gu2019timely]). Since such assumptions can be overly optimistic, several works have examined how non-stationary (adversarial) environments affect information timeliness from different perspectives. Examples include adversarial ON/OFF channels [tseng2019online, sinha2022optimizing], adversarial update arrivals [saurav2023online], and adversarial aging functions [lin2025optimal, tripathi2021online]. Recent work [liu2025learning] further incorporated ML advice into online algorithms for adversarial ON/OFF channels.
To the best of our knowledge, there is no unified design and analysis framework capable of handling an adversary that simultaneously controls multiple sources of uncertainty as in our model, where the adversary can jointly manipulate the operation duration, information aging, update cost, and update opportunities. This gap is critical, since mobile networks inherently involve several forms of non-stationary uncertainty, and it is also technically challenging because the adversary is so powerful. Particularly, the impact of adversarially varying update costs (beyond simple ON/OFF channel models) on age-driven design has not been explored in the existing literature, and our results reveal that it is the most critical source of uncertainty affecting performance.
III System overview
As illustrated in Fig. 1(a), we consider an information update system in which a mobile device monitors a physical process and reports its latest status to a nearby access point (AP). The system operates in discrete time slots indexed by , where represents the total operation duration.
We begin with a scenario in which the device always has an update packet at the beginning of every slot and is also permitted to transmit it in every slot. Then, for each slot , the device decides whether to transmit the update. Let denote the device’s transmission decision, where if the device transmits in slot , and otherwise. If the device transmits at the beginning of a slot, the update is delivered by the end of that slot. In Section VII, we extend the model to more general scenarios in which the device cannot transmit updates in certain slots, i.e., under intermittent update opportunities.
III-A Age of information
If the device decides to transmit an update at the beginning of slot , then the age of information at the AP is reset to zero at the end of slot , indicating that the AP has received the latest update. Otherwise, the age increases by an amount to reflect the continued staleness of the information at the AP. The value of can vary across slots . Let denote the age of information maintained by the AP at the end of slot . As illustrated in Fig. 1(b), the evolution of across time slots is given by
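$$a(t) = \begin{cases} 0, & \text{if } d(t) = 1, \\ a(t-1) + \delta(t), & \text{if } d(t) = 0, \end{cases} \tag{1}$$
where $a(t)$ denotes the age at the end of slot $t$, $d(t) \in \{0, 1\}$ denotes the transmission decision in slot $t$, and $\delta(t)$ denotes the age increment in slot $t$. The initial age $a(0)$ is specified by the AP during the initial connection. We define the age increment sequence as $(\delta(1), \ldots, \delta(T))$, where $T$ is the operation duration.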
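The age recursion in Eq. (1) can be traced slot by slot. The following is a minimal sketch under assumed names (`age_trajectory`, `decisions`, `increments`, and `initial_age` are illustrative and not part of the paper's notation); it resets the age on a transmission and otherwise adds the slot's increment.

```python
from typing import List


def age_trajectory(decisions: List[int], increments: List[int],
                   initial_age: int = 0) -> List[int]:
    """Return the age at the end of each slot.

    decisions[i] is 1 if the device transmits in the (i+1)-th slot, else 0.
    increments[i] is the age increment applied when no update is sent.
    All names here are illustrative assumptions.
    """
    if len(decisions) != len(increments):
        raise ValueError("decisions and increments must have the same length")
    ages = []
    age = initial_age
    for transmit, inc in zip(decisions, increments):
        # A transmission resets the age at the end of the slot;
        # otherwise the age grows by that slot's increment.
        age = 0 if transmit else age + inc
        ages.append(age)
    return ages


if __name__ == "__main__":
    print(age_trajectory([0, 0, 1, 0], [1, 2, 1, 3], initial_age=2))
```

With an initial age of 2, the age climbs to 3 and then 5, drops to 0 in the slot with the update, and restarts from the next increment.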
III-B Problem formulation
While transmitting updates in every slot minimizes the age of information at the AP, it also incurs substantial resource consumption (e.g., energy and bandwidth) at the device. To capture this trade-off, we introduce two cost metrics: the age cost and the update cost. Specifically, we assume that each unit of age in a slot incurs a cost of one unit; thus, the age cost in slot is given by . In addition, if the device transmits an update in slot , it incurs an update cost denoted by . For instance, can be modeled as the product of a unit cost and the transmission energy by . Here, depends on the instantaneous channel condition between the device and the AP, which may fluctuate due to user mobility. Meanwhile, may also vary over time, e.g., depending on the device’s remaining energy or, when multiple packets are present, different unit costs can be assigned to prioritize certain packets over others. The form can also be interpreted more generally as a unit cost multiplied by resource expenditure (e.g., energy or bandwidth). However, for clarity, in the remainder of this paper we focus on the energy example. We define the update cost sequence as .
To balance the age cost and the update cost, the device needs a scheduling algorithm defined as . The total cost incurred by a scheduling algorithm depends on the sources of uncertainty, including the operation duration , the age increment sequence , and the update cost sequence . We represent this uncertainty instance as . Given an instance and a scheduling algorithm , the total cost is defined as the sum of age and update costs:
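$$J(\pi, \mathcal{I}) = \sum_{t=1}^{T} \bigl( a(t) + c(t)\, d(t) \bigr), \tag{2}$$
where $\pi$ is the scheduling algorithm, $\mathcal{I}$ is the instance, $T$ is the operation duration, $a(t)$ is the age at the end of slot $t$ (the age cost of slot $t$), $c(t)$ is the update cost in slot $t$, and $d(t) \in \{0, 1\}$ is the transmission decision of $\pi$ in slot $t$.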
Our objective is to design a scheduling algorithm that minimizes the total cost .
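To make the objective concrete, the sketch below evaluates the total cost of a given decision sequence. It assumes that the age cost of a slot is the age at the end of that slot and that the update cost is paid only in slots with a transmission; the function and argument names are hypothetical.

```python
from typing import Sequence


def total_cost(decisions: Sequence[int], increments: Sequence[int],
               update_costs: Sequence[float], initial_age: int = 0) -> float:
    """Sum of age costs and update costs over the whole operation duration.

    Assumption: the age cost of a slot equals the age at the end of that slot.
    """
    if not (len(decisions) == len(increments) == len(update_costs)):
        raise ValueError("all sequences must have the same length")
    age = initial_age
    cost = 0.0
    for transmit, inc, c in zip(decisions, increments, update_costs):
        age = 0 if transmit else age + inc   # age dynamics
        cost += age                          # age cost of this slot
        if transmit:
            cost += c                        # update cost of this slot
    return cost


if __name__ == "__main__":
    inc, costs = [1, 1, 1, 1], [3, 3, 3, 3]
    print(total_cost([0, 0, 0, 0], inc, costs))  # never update
    print(total_cost([1, 1, 1, 1], inc, costs))  # update in every slot
    print(total_cost([0, 1, 0, 0], inc, costs))  # a single update
```

In this toy instance, a single well-placed update costs less than either extreme, which is the trade-off the formulation is meant to capture.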
III-C Scheduling Classification
A scheduling algorithm is referred to as an offline scheduling algorithm if it has prior knowledge of the entire instance . Such algorithms are generally impractical in real-world systems due to their reliance on future information. Thus, this paper focuses on a more realistic design, where the device has access only to the historical or current information but lacks knowledge of future values.
Because of the potential unavailability of real-time channel information, the current update cost may not be known at the beginning of slot . Therefore, we design scheduling algorithms that do not rely on instantaneous channel knowledge. Instead, the algorithms use only the maximum and minimum possible update costs, denoted by and . The values of and can be estimated from the historically observed worst and best channel conditions (e.g., those observed by the AP within its service region and reported by the AP). If these values are unavailable, see Remark 10 for a slight modification of the proposed algorithms that preserves some performance guarantee.
A scheduling algorithm is called a (cost-agnostic) online scheduling algorithm if it requires only the constants and and the realized age increments up to the current slot. For simplicity, we will omit the term cost-agnostic, with the understanding that all references to online scheduling algorithms refer to the cost-agnostic setting. Without access to the complete uncertainty instance, an online algorithm is generally unable to achieve the minimum total cost attainable by an optimal offline scheduling algorithm. Given an instance , let denote the minimum total cost achieved by an optimal offline algorithm. We evaluate the performance of online algorithms in terms of their competitiveness [buchbinder2009design] relative to the offline optimum, defined as follows.
Definition 1.
A scheduling algorithm is said to be -competitive if , for all possible instances .
That is, a -competitive online scheduling algorithm guarantees that the resulting total cost is at most times the offline minimum cost, regardless of the instance . Our goal is to design an online scheduling algorithm that achieves the smallest possible competitive ratio .
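For small instances, the offline minimum can be found by exhaustive search, which gives a direct way to measure how far a given decision sequence is from the offline optimum on one instance. The sketch below is illustrative only: the names are assumptions, the search is exponential in the number of slots, and a ratio measured on a single instance is merely a lower bound on an algorithm's competitive ratio, which is a worst case over all instances.

```python
from itertools import product
from typing import Sequence


def total_cost(decisions, increments, update_costs, initial_age=0):
    """Age cost (end-of-slot age, an assumption) plus update cost."""
    age, cost = initial_age, 0.0
    for transmit, inc, c in zip(decisions, increments, update_costs):
        age = 0 if transmit else age + inc
        cost += age + (c if transmit else 0.0)
    return cost


def offline_optimum(increments: Sequence[int], update_costs: Sequence[float],
                    initial_age: int = 0) -> float:
    """Minimum total cost over all 2^T decision sequences (brute force)."""
    horizon = len(increments)
    return min(total_cost(d, increments, update_costs, initial_age)
               for d in product((0, 1), repeat=horizon))


def empirical_ratio(decisions, increments, update_costs, initial_age=0):
    """Cost of the given decisions divided by the offline minimum cost."""
    best = offline_optimum(increments, update_costs, initial_age)
    if best == 0:
        raise ValueError("offline optimum is zero; the ratio is undefined")
    return total_cost(decisions, increments, update_costs, initial_age) / best


if __name__ == "__main__":
    inc, costs = [1, 1, 1, 1], [3, 3, 3, 3]
    print(offline_optimum(inc, costs))
    print(empirical_ratio([1, 1, 1, 1], inc, costs))
```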
Note that in addition to the competitive ratio, another common performance metric is regret [lattimore2020bandit]. Regret measures the additive performance gap between an online learning approach and the best offline algorithm restricted to fixed decision rules. In contrast, the competitive ratio compares the performance of an online algorithm against the best offline algorithm without such restrictions. Moreover, online learning approaches typically provide regret guarantees only as the decision horizon grows to infinity. Such guarantees are less suitable for our setting, since a device may monitor and transmit updates only for a short and unpredictable duration.
Moreover, with the advancement of machine learning (ML) techniques, it is increasingly feasible to leverage ML to provide scheduling advice. A scheduling algorithm is referred to as an online scheduling algorithm with ML if it additionally has access to ML advice. Let denote the decision advised by ML at slot . We define the sequence as the ML advice. Because such a scheduling algorithm can adapt its actions based on the ML advice, its decision sequence is allowed to be a function of .
This paper considers the setting where the ML advice may be untrusted. Following perfect ML advice (i.e., advice that coincides with an optimal offline schedule) yields the minimum total cost, whereas blindly trusting imperfect ML advice may lead to poor performance. Moreover, we assume that the reliability of the ML advice is unknown a priori. In this context, we characterize online scheduling algorithms with ML in terms of two metrics introduced in [lykouris2018competitive]: consistency and robustness. Consistency quantifies performance relative to the ML advice, while robustness guarantees performance in the worst case. These notions are formally defined as follows.
Definition 2.
An online scheduling algorithm with ML advice is said to be -consistent if , for all possible instances and ML advice ; it is said to be -robust if , for all possible instances and advice .
In other words, an -consistent and -robust online scheduling algorithm ensures that 1) when the ML advice is perfect, the resulting total cost is at most times the offline minimum cost, and 2) when the advice is arbitrary or even adversarial, the cost remains within a factor of of the offline minimum cost. An algorithm that fully trusts the ML advice may achieve near-optimal consistency, but this often comes at the expense of robustness. Therefore, there exists an inherent trade-off between consistency and robustness . The goal of this paper is to design an online scheduling algorithm with ML that achieves the optimal consistency-robustness trade-off, namely, to minimize the consistency for any fixed robustness .
IV Linear program formulation
The main challenge in designing our scheduling algorithm arises from two factors: several forms of uncertainty that are impractical to model using stationary stochastic processes, and the limited observations available in the current slot. To address these challenges, we use online algorithm design techniques based on linear programming [buchbinder2009design, bamas2020primal].
However, casting our problem as a linear program (LP) is non-trivial due to the non-linear nature of the age cost. For example, consider a scenario where the device transmits an update in slot and schedules the next update in slot . If the age increases linearly by one unit per slot until the next update, the cumulative age from slot to slot is given by , which grows quadratically with the decision variable . See [arafa2017age] for a concrete example illustrating this behavior. This quadratic growth implies that the total age cost in Eq. (2) includes non-linear terms, thereby complicating direct LP formulation.
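The quadratic growth can be checked with a short worked sum. As an illustration under assumed notation (the symbols $s$ and $n$ are placeholders introduced here), suppose the device updates in slot $s$ and next in slot $s+n$, the age is zero at the end of an update slot, and the age grows by one unit in every slot without an update:

```latex
% Ages at the ends of slots s+1, ..., s+n-1 are 1, 2, ..., n-1;
% the age at the end of slot s+n is 0 because of the next update.
\begin{align*}
  \sum_{t=s+1}^{s+n} a(t)
    &= \underbrace{1 + 2 + \cdots + (n-1)}_{\text{slots } s+1,\ldots,s+n-1} + 0 \\
    &= \frac{n(n-1)}{2}.
\end{align*}
```

The cumulative age is therefore quadratic in the inter-update gap $n$, so the age cost cannot be written as a linear function of the update times directly.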
To overcome this issue, we introduce a transformation of the age evolution into an equivalent virtual queueing system, described in Section IV-A. This transformation facilitates an LP formulation for the offline scheduling problem, as presented in Section IV-B. The resulting LP formulation serves as the foundation for the design and analysis of our online scheduling algorithms: Section V develops an online scheduling algorithm without ML, Section VI incorporates ML advice into the scheduling process, and Section VII extends the model to intermittent update opportunities.
IV-A Virtual queueing system
Without loss of generality, we assume that the age increment in every slot is an integer. If this is not the case, we can multiply both the age increments and the update costs by a common constant so that every age increment becomes integer-valued. Such scaling does not alter the optimal solution to the objective in Eq. (2). Based on this assumption, we introduce a virtual queue that mirrors the evolution of the integer-valued age.
We construct a virtual queueing system (shown in Fig. 2) consisting of a virtual server, a virtual queue, and virtual packet arrivals. The virtual system operates in the same discrete time slots as the real mobile network. Initially, the number of virtual packets in the virtual queue equals the initial age. At the beginning of each slot, a number of virtual packets equal to that slot's age increment arrives at the virtual queue. If the device decides to transmit an update in that slot (in the actual network), then the virtual server clears the virtual queue at the end of the slot. Otherwise, the virtual server remains idle and the virtual packets accumulate. As a result, the virtual queue size evolves as follows: it resets to zero if the device transmits (i.e., the virtual server clears the virtual queue, corresponding to an update in the actual network), or increases by the slot's age increment if the device does not transmit (i.e., the virtual server idles, corresponding to no update in the actual network). This evolution exactly mirrors the age dynamics in Eq. (1). Therefore, we use the same notation as for the age to denote the virtual queue size at the end of each slot.
We index the virtual packets by according to their arrival times, and let denote the slot in which virtual packet arrives. For each virtual packet , we use a binary variable to indicate whether it remains in the virtual queue at the end of slot , where if it is still present, and otherwise. Using this notation, the virtual queue size at the end of slot can be expressed as , which counts the number of virtual packets that have arrived by slot and remain in the virtual queue. This representation allows us to express the age as a linear function of the binary variables . Substituting this expression into Eq. (2), we can rewrite the total cost as the following linear function:
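$$J(\pi, \mathcal{I}) = \sum_{t=1}^{T} \Bigl( c(t)\, d(t) + \sum_{j \,:\, t_j \le t} z_j(t) \Bigr), \tag{3}$$
where $t_j$ is the arrival slot of virtual packet $j$, $z_j(t) \in \{0, 1\}$ indicates whether virtual packet $j$ remains in the virtual queue at the end of slot $t$, $c(t)$ is the update cost in slot $t$, and $d(t) \in \{0, 1\}$ is the transmission decision in slot $t$.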
This linear expression facilitates the formulation of an LP in the next section. Moreover, by Eq. (3), the update cost can also be interpreted as a clearing cost incurred when the virtual queue is cleared in slot , while holding a virtual packet for one slot incurs a unit holding cost.
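The equivalence between the age and the virtual queue size can be checked numerically. The sketch below keeps an explicit list of waiting virtual packets and compares its length with the age recursion; the function names and the use of slot label 0 for the initially present packets are assumptions made for illustration.

```python
from typing import List


def age_trajectory(decisions: List[int], increments: List[int],
                   initial_age: int = 0) -> List[int]:
    """Age at the end of each slot (reset on transmission, else add increment)."""
    ages, age = [], initial_age
    for transmit, inc in zip(decisions, increments):
        age = 0 if transmit else age + inc
        ages.append(age)
    return ages


def virtual_queue_sizes(decisions: List[int], increments: List[int],
                        initial_age: int = 0) -> List[int]:
    """Queue size at the end of each slot, tracking individual virtual packets."""
    # Each entry is one waiting virtual packet, stored as its arrival slot.
    # Label 0 marks packets present initially (a labeling assumption).
    queue = [0] * initial_age
    sizes = []
    for slot, (transmit, inc) in enumerate(zip(decisions, increments), start=1):
        queue.extend([slot] * inc)   # arrivals at the beginning of the slot
        if transmit:
            queue.clear()            # the virtual server clears the queue
        sizes.append(len(queue))     # number of packets still held
    return sizes


if __name__ == "__main__":
    d, inc = [0, 0, 1, 0], [1, 2, 1, 3]
    print(virtual_queue_sizes(d, inc, initial_age=2))
    print(age_trajectory(d, inc, initial_age=2))
```

Both functions return the same sequence for integer increments, which is the property that lets the age cost be written as a count of held packets.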
IV-B LP formulation
We note that the clear/idle behavior in the virtual queueing system directly corresponds to sending/withholding an acknowledgment (ACK) to clear all received packets in the Transmission Control Protocol (TCP). From this perspective, we can leverage prior studies for the classic online TCP ACK problem [buchbinder2009design, Chapter 12] to formulate our offline optimal scheduling problem as an integer program with a linear objective and constraint functions:
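$$\min \ \sum_{t=1}^{T} \Bigl( c(t)\, x(t) + \sum_{j \,:\, t_j \le t} z_j(t) \Bigr) \tag{4a}$$
s.t.
$$z_j(t) + \sum_{s=t_j}^{t} x(s) \ge 1 \quad \text{for all } j \text{ such that } t_j \le t \text{ and for all } t; \tag{4b}$$
$$x(t),\ z_j(t) \in \{0, 1\} \quad \text{for all } j \text{ and } t. \tag{4c}$$
Here $x(t)$ is the transmission variable of slot $t$, $c(t)$ is the update cost in slot $t$, $t_j$ is the arrival slot of virtual packet $j$, and $z_j(t)$ indicates whether virtual packet $j$ remains in the virtual queue at the end of slot $t$.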
In this integer program, we introduce a variable to denote whether the device transmits an update in slot . That is, in the integer program is exactly the decision variable introduced earlier. The reason for redefining this variable is that we will relax to take a real value between 0 and 1, which prevents immediate interpretation as a transmission decision. Later in Section V, we show how to convert a fractional solution for into a randomized scheduling decision for . Moreover, the constraint in Eq. (4b) ensures that each virtual packet arriving by slot either remains in the virtual queue at the end of slot (i.e., in the first term of Eq. (4b)) or has been cleared by slot (i.e., there exists a slot such that in the second term of Eq. (4b)).
By relaxing the integrality constraint (4c) to allow continuous variables, we obtain the following LP:
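$$\min \ \sum_{t=1}^{T} \Bigl( c(t)\, x(t) + \sum_{j \,:\, t_j \le t} z_j(t) \Bigr) \tag{5a}$$
s.t.
$$z_j(t) + \sum_{s=t_j}^{t} x(s) \ge 1 \quad \text{for all } j \text{ such that } t_j \le t \text{ and for all } t; \tag{5b}$$
$$x(t),\ z_j(t) \ge 0 \quad \text{for all } j \text{ and } t. \tag{5c}$$
As in the integer program, $x(t)$ is the (now fractional) transmission variable of slot $t$, $c(t)$ is the update cost, $t_j$ is the arrival slot of virtual packet $j$, and $z_j(t)$ is the (now fractional) holding variable of packet $j$ in slot $t$.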
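For a fixed fractional clearing sequence, the cheapest feasible holding variables follow directly from the covering constraint: each holding variable is the shortfall of the cumulative clearing amount below one. The sketch below evaluates the relaxed objective this way. It assumes the constraint takes the TCP-ACK-style form "holding variable plus cumulative clearing since the packet's arrival is at least one", and all names are hypothetical.

```python
from typing import Sequence


def lp_objective(x: Sequence[float], arrival_slots: Sequence[int],
                 update_costs: Sequence[float]) -> float:
    """Relaxed objective for a given fractional clearing sequence x.

    x[t-1] is the clearing variable of slot t (slots are 1-indexed).
    arrival_slots[j] is the arrival slot (>= 1) of virtual packet j.
    The holding variables are set to their smallest feasible values.
    """
    horizon = len(x)
    if len(update_costs) != horizon:
        raise ValueError("x and update_costs must have the same length")
    clearing = sum(c * xt for c, xt in zip(update_costs, x))
    holding = 0.0
    for tj in arrival_slots:
        cumulative = 0.0
        for t in range(tj, horizon + 1):
            cumulative += x[t - 1]
            # Smallest z >= 0 with z + (cumulative clearing since arrival) >= 1.
            holding += max(0.0, 1.0 - cumulative)
    return clearing + holding


if __name__ == "__main__":
    arrivals, costs = [1, 2, 3, 4], [3.0, 3.0, 3.0, 3.0]
    print(lp_objective([0, 1, 0, 0], arrivals, costs))          # integral x
    print(lp_objective([0.5, 0.5, 0.5, 0.5], arrivals, costs))  # fractional x
```

For an integral clearing sequence, the value coincides with the total cost of the corresponding schedule, so the relaxation only enlarges the feasible set.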
Next, Section V proposes an online algorithm to compute a feasible solution to LP (5) without relying on ML, while Section VI extends this approach by incorporating ML advice.
Remark 3.
Before solving our online problem, we remark that, via the virtual queue transformation, our formulation generalizes the classical online TCP ACK problem [buchbinder2009design, Chapter 12] and its learning-augmented variant [bamas2020primal]. In the classical TCP ACK setting, each ACK incurs a constant cost. In contrast, our objective in Eq. (4a) allows the ACK (i.e., clearing) cost to vary across slots. Later, in Section VII, we further generalize the problem to scenarios where the ACK channel alternates between ON and OFF states, and transmitting an ACK during an ON slot incurs an adversarially chosen cost. This generalization is practically relevant in noisy wireless environments; moreover, it poses theoretical challenges, since the adversary simultaneously controls multiple sources of uncertainty and an online algorithm is restricted to transmitting an ACK only in ON slots. As we will show, our generalized setting also yields a fundamentally different optimal competitive ratio that depends solely on the cost range (i.e., time-varying costs are the dominant factor). When augmented with ML advice, our generalized setting exhibits behavior that differs qualitatively from the classic learning-augmented TCP model; in particular, it yields a threshold-like optimal trust rule when the cost range is large.
V Online scheduling algorithm design without ML
This section develops an online scheduling algorithm without ML by leveraging LP (5). Section V-A introduces an online algorithm that computes a feasible solution to the proposed LP in an online fashion, and Section V-B analyzes the objective value it attains. Based on this fractional solution, Section V-C proposes a randomized online scheduling algorithm without ML.
V-A Online LP algorithm
We propose Alg. 1, referred to as the online LP algorithm, which computes a feasible solution to LP (5). All variables are initialized to zero in Line 1. At the beginning of each slot , the algorithm iteratively adjusts the variables for all virtual packets that have arrived by slot , as specified in Line 1.
The underlying idea is that in each slot , our scheduling algorithm that will be proposed in Section V-C makes a probabilistic decision: to set with some probability or otherwise. The probability is governed by the current value of , which is determined in Line 1 of Alg. 1. Accordingly, can be interpreted as the probability of clearing the virtual queue in slot . In this context, the cumulative sum represents the cumulative clearing probability (up to slot ) for virtual packet .
With this interpretation, the condition in Line 1 checks whether virtual packet has already been cleared by slot . If , virtual packet is considered cleared, and no further processing is required. Otherwise, the virtual packet may still remain in the virtual queue and its associated variables should be adjusted. As shown in Line 1, for each such packet, Line 1 increases the value of . That is, the more virtual packets remain in the virtual queue, the higher the resulting clearing probability.
Moreover, the idea behind Line 1 is that it adjusts the cumulative clearing probability as follows:
| (6) |
which increases the cumulative clearing probability by a multiplicative factor of and an additive factor of . The constant is chosen as in Line 1 so that the algorithm asymptotically achieves the minimum achievable competitive ratio (as stated in Lemma 9). The appearance of in the denominators reflects that a larger update cost reduces the rate at which the clearing probability increases. In addition, Line 1 sets to ensure that the constraint in Eq. (5b) is satisfied, so that Alg. 1 produces a feasible solution to LP (5).
Note that Alg. 1 operates in an online manner, as it requires only the constants and , and the knowledge of virtual arrivals up to the current slot (which corresponds to the age increment sequence up to the current slot), without relying on any future information.
V-B Analysis of Alg. 1
In this section, we analyze the objective value in Eq. (5a) computed by Alg. 1. Unlike prior studies [buchbinder2009design, bamas2020primal] that analyze online algorithms for LPs using primal–dual techniques, our analysis exploits structural properties of Alg. 1 and its relation to an optimal offline scheduling algorithm. An advantage of our approach is that it provides a unified analysis framework for all proposed LP algorithms (including Algs. 1, 3, and 4) without the need to construct separate dual solutions for different scenarios.
Let denote the set of virtual packets that have arrived by slot . The following two lemmas characterize properties of this set. Here, when a virtual packet satisfies the condition in Line 1 and thus triggers the operation in Line 1, we say that it activates. For clarity and continuity, we move most detailed proofs of this paper to the appendices in the supplemental material.
Lemma 4.
For a fixed slot , after the virtual packets in have activated times since slot , the value computed by Alg. 1 satisfies
for all .
Proof.
(Sketch) We prove by induction on . See Appendix A for details. ∎
Lemma 4 immediately implies the following result.
Lemma 5.
For a fixed slot , the virtual packets in can activate at most times since slot .
Proof.
Leveraging Lemma 5, we are now ready to analyze the objective value in Eq. (5a) achieved by Alg. 1 in the following theorem. The theorem also characterizes the asymptotic behavior when the update cost scales linearly with the energy consumption, i.e., for a constant unit cost . The asymptotic regime models scenarios with severely constrained resources. In this regime, the competitive ratio depends only on the ratio between the maximum and minimum update cost, denoted by . Let and denote the maximum and minimum per-update energy consumption, respectively. Then, when , the same ratio can also be written as .
Theorem 6.
Proof.
(Sketch) Fix an instance . Suppose that an optimal offline scheduling algorithm clears the virtual queue in slots , performing a total of clearing operations. Let and . We divide the timeline into periods, where period consists of slots through .
Let denote the cost incurred by an optimal offline scheduling algorithm in period . Let denote the holding cost incurred by the optimal offline scheduling algorithm for all virtual packets arriving in period . Consider a fixed . Including the additional clearing cost in slot , we have .
Similarly, let denote the increment of the objective value in Eq. (5a) by Alg. 1, according to the activations of all virtual packets that arrive in period . Note that one activation increases the objective value by
(7)
Next, we count the number of activations made by the virtual packets arriving in period . First, exactly counts the number of iterations of Line 1 from slot through slot in period for the virtual packets arriving during this period. Second, by Lemma 5, the virtual packets arriving in period can activate at most additional times from slot onward. Hence, they can activate at most times in total.
Next, we also use Lemma 5 to analyze the computational complexity of Alg. 1, as stated in the following lemma. Here, we use to denote the maximum value of for all possible .
Proof.
See Appendix C for details. ∎
V-C Randomized online scheduling algorithm
Leveraging the fractional-to-probabilistic conversion technique proposed in [buchbinder2009design, Chapter 12], this section presents a randomized online scheduling algorithm in Alg. 2, which converts the fractional solution generated by Alg. 1 into a probabilistic transmission decision.
Alg. 2 adjusts the variable in Line 2 using the same rule as in Alg. 1. In addition, we introduce two auxiliary variables: and . The variable records the cumulative sum of up to slot (Line 2), while records the cumulative sum up to slot (Line 2). In Line 2, Alg. 2 selects a uniform random number . Then, according to Lines 2 and 2, if there exists such that , then the device decides to transmit an update (Line 2); otherwise, the device idles (Line 2). The idea behind Alg. 2 mirrors the classical technique for sampling from a distribution using its cumulative distribution function. In particular, by the uniform randomness of , the probability of transmitting an update (equivalently, clearing the virtual queue) in slot is exactly , and the cumulative transmission probability by slot is .
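As an illustration of the conversion described above, the following Python sketch applies the cumulative-sum rule to a given fractional solution. The function name, the argument names, and the half-open interval convention (transmit in a slot when the random number plus some nonnegative integer falls between the previous and the current cumulative sums) are assumptions made for this sketch; Alg. 2 may differ in such details.

```python
import math
import random


def randomized_schedule(fractions, u=None):
    """Convert per-slot fractional clearing amounts into 0/1 decisions.

    fractions[t] is the (hypothetical) fractional amount assigned to slot t.
    u is the single uniform random number drawn once, before the first slot.
    """
    if u is None:
        u = random.random()  # uniform on [0, 1)
    decisions = []
    prev = 0.0  # cumulative sum up to the previous slot
    for x in fractions:
        cur = prev + x  # cumulative sum up to the current slot
        # Smallest integer k with u + k >= prev; transmit iff u + k < cur.
        k = math.ceil(prev - u)
        decisions.append(u + k < cur)
        prev = cur
    return decisions


print(randomized_schedule([0.25, 0.25, 0.5], u=0.3))
```

In this sketch, when the fractional values are at most one, each slot is selected with probability equal to its fractional value, and a sequence summing to one selects exactly one slot.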
Because of the randomness of Alg. 2, we evaluate its performance in terms of the expected competitive ratio.
Theorem 8.
The expected competitive ratio of Alg. 2 is
Moreover, as the unit cost scales to infinity, the ratio approaches .
Proof.
Fix an instance . Following [tseng2019online], we can show that the expected clearing cost in each slot under Alg. 2 is upper bounded by the value of as computed by Alg. 1. Similarly, the expected number of virtual packets present in slot under Alg. 2 is upper bounded by as computed by Alg. 1. Therefore, the expected total cost in Eq. (3) incurred by Alg. 2 is bounded above by the objective value in Eq. (5a) computed by Alg. 1. The result then follows directly from Theorem 6. ∎
Next, we show that the competitive ratio of Alg. 2 is optimal by establishing a matching lower bound as follows.
Lemma 9.
No online algorithm can achieve a competitive ratio smaller than .
Proof.
(Sketch) Consider the initial age , a fixed age increment sequence , and a fixed update cost sequence . Only the operation duration is unknown to the device. This age increment sequence models a scenario in which information becomes stale very slowly, or stops aging altogether when the monitored process does not change, as in [salimnejad2025age]. Even with this single source of uncertainty, we show in Appendix D that as , no online scheduling algorithm can achieve a competitive ratio smaller than . ∎
Combining the lower bound in Lemma 9 with the matching achievability scheme in Alg. 2, we establish that the optimal competitive ratio against an adversary that can jointly manipulate the operation duration, the age increment, and the update cost is . This matches the result for the classic online TCP [buchbinder2009design] for (without cost variation). Moreover, this ratio is for large , scaling linearly with the update cost range , while remaining unaffected by all other sources of uncertainty. Thus, when the cost fluctuates, it becomes fundamentally harder for any online algorithm to balance timeliness and update cost. Fig. 3 shows the competitive ratio at finite values of , which also appears approximately linear in .
Remark 10.
This remark discusses how Alg. 1 can be adapted to scenarios in which the bounds and on the update cost are not known in advance. In this case, we propose to periodically estimate the update cost using channel estimation techniques (e.g., see [liu2025learning]). Let denote the estimation period and assume that the update cost is measured in slots . For each slot , let and denote the maximum and minimum observed costs up to slot , respectively. In Alg. 2 (and the corresponding Alg. 1), we replace and by and , respectively. This remains an online algorithm, as no future information is used. Let . Then, the increment of the objective value in Eq. (7) becomes
Let denote the maximum per-slot variation of the update cost. Since is the maximum observed cost up to slot , and there are at most additional slots between slot and slot , we have
Moreover, and , so , where denotes the value used in the original algorithms. Thus, the increment above is bounded by
If the device can estimate at the beginning of each slot (i.e., ), then the factor above equals , achieving the same competitive ratio as in Theorems 6 and 8. Otherwise, suppose that the update cost also takes the form . Let . Then, we can express the bound by
As , the modified online algorithm achieves the asymptotic competitive ratio of , which matches the order of the original competitive ratio (but with an inflated multiplicative pre-constant ). The same modification applies to all algorithms proposed in later sections, yielding the same pre-constant.
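The bookkeeping in Remark 10 can be sketched as follows. The function name and the assumption that measurements are taken in slots 0, P, 2P, and so on (0-indexed, with P the estimation period) are illustrative choices rather than part of the algorithms. Only measured costs enter the running maximum and minimum, so no future information is used.

```python
def running_cost_bounds(costs, period):
    """Return, for each slot, the (max, min) cost observed so far.

    costs[t] is the true update cost in slot t; it is only observed in
    slots that are multiples of `period` (an assumed measurement pattern).
    """
    bounds = []
    c_max = c_min = None
    for t, cost in enumerate(costs):
        if t % period == 0:  # a measurement is taken in this slot
            c_max = cost if c_max is None else max(c_max, cost)
            c_min = cost if c_min is None else min(c_min, cost)
        bounds.append((c_max, c_min))
    return bounds


print(running_cost_bounds([5, 9, 2, 7, 1], period=2))
```

In this example the cost 9 in slot 1 is never observed, which illustrates why the bound in the remark must account for the per-slot variation of the update cost between measurements.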
VI Online scheduling algorithm design with ML
This section extends the proposed online scheduling algorithm by incorporating ML that can suggest the next transmission time (i.e., when to clear the virtual queue). We focus on the online LP algorithm design as in Section V-A, since the resulting fractional solution can also be converted into a randomized scheduling algorithm as in Section V-C.
VI-A Online LP algorithm with ML
This section extends the online LP algorithm in Alg. 1 by incorporating ML advice with unknown reliability, as described in Alg. 3. The key idea underlying Alg. 3 is as follows. A new variable is introduced in Line 3 and is adjusted in Line 3 whenever the ML advice suggests clearing the virtual queue. Hence, the value of represents the most recent slot when the ML advice recommended a clearing. Since the ML advice may be imperfect, the device does not blindly follow it. Instead, Alg. 3 modulates its response based on whether a virtual packet has been suggested for clearing by the ML advice at the beginning of slot :
- •
- •
The adjustment of is governed by a hyperparameter , which reflects the device’s level of trust in the ML advice. A smaller value of corresponds to greater confidence in the ML advice and leads to closer alignment with it, whereas a larger value represents caution and yields more robust behavior. This trade-off between the consistency and the robustness with respect to will be further discussed in Section VI-B.
VI-B Analysis of Alg. 3
In this section, we establish the robustness and consistency of the proposed Alg. 3. To this end, we begin by analyzing the set under Alg. 3 in the following two lemmas, analogous to Lemmas 4 and 5, respectively. Here, if a virtual packet activates and performs a slow or fast step, we say that it activates a slow or fast step, respectively. Moreover, we denote and .
Lemma 11.
For a fixed slot , after the virtual packets in have activated slow steps and fast steps since slot , the value computed by Alg. 3 satisfies
for all such that .
Proof.
(Sketch) We prove by induction on . See Appendix E for details. ∎
Lemma 12.
For a fixed slot , the virtual packets in can activate slow steps and fast steps since slot , subject to the condition .
Leveraging Lemma 12, we are ready to analyze the objective value in Eq. (5a) computed by Alg. 3. The next two theorems analyze its robustness and consistency, respectively.
Theorem 13.
Proof.
Theorem 14.
Proof.
(Sketch) Fix an instance and ML advice . We follow the proof of Theorem 6, with minor modifications. Redefine for as the slot when clears the virtual queue for the -th time. These redefined time points determine a new set of periods, replacing those used in the proof of Theorem 6.
Let denote the cost incurred by in period . Let be the increment of the objective value in Eq. (5a) by Alg. 3, according to the slow and fast step activations of all virtual packets arriving in period . We show that
for all . Thus, the total objective value computed by Alg. 3 satisfies
Substituting and the definitions of and proves the theorem. See Appendix H for details. ∎
Next, we show that the results in the above two theorems characterize the optimal consistency-robustness trade-off.
Lemma 15.
A -consistent scheduling algorithm has a robustness of at least .
Proof.
Combining the lower bound in Lemma 15 with the matching achievability scheme in Alg. 3, we establish that the optimal consistency–robustness trade-off is characterized by the pair of and . This matches the result for the classic online TCP with ML [bamas2020primal] when (without cost variation).
Note that in many prior studies on ML-augmented online algorithms (e.g., [bamas2020primal, liu2025learning]), the consistency approaches 1 as the trust parameter , i.e., setting forces the algorithm to rely entirely on the ML advice. In contrast, in our setting the consistency approaches as , which exceeds 1 whenever . This difference is explained as follows. The robustness becomes unbounded as . Because this trade-off is optimal, the robustness is infinite whenever the consistency falls below . This implies that robustness guarantees collapse even when an online algorithm follows ML advice only partially (so that its consistency remains below ). Thus, in our setting, taking does not force the algorithm to rely fully on the ML advice. Instead, it identifies the limit in which the algorithm becomes as consistent with the ML advice as possible while still maintaining a finite robustness guarantee.
Thus far, we have characterized the optimal consistency–robustness trade-off as the trust level in the ML advice varies. Tuning the trust level leads to different competitive ratios. We next discuss how to determine an optimal trust level that minimizes the competitive ratio when the value of is large. For large , the consistency varies from at to as . Hence, the consistency is nearly identical for all . In contrast, for large , the robustness degrades as decreases. Then, considering all possible (representing partial or no trust in the ML advice), an optimal online scheduling algorithm that minimizes the competitive ratio satisfies the following performance guarantees:
for all possible , , and all . The minimum is attained at , leading to . In addition, considering full trust in the ML advice, we also have . Suppose that the ML advice satisfies the following reliability guarantee: for all possible . Then, we have
for all possible and . That is, for large , regardless of the ML reliability , the optimal response to ML advice (for minimizing the competitive ratio) is threshold-like: the algorithm should either fully trust the ML advice if or completely ignore it otherwise.
Fig. 4 shows the trade-off at finite values of . Here, we also observe a dramatic degradation in robustness resulting from even a small change in consistency. This indicates that the threshold structure approximately holds in this regime as well.
VII Intermittent update opportunities
We extend our framework to scenarios where the device may be unable to update in certain slots (e.g., when no update is generated or when the device cannot transmit). Let indicate whether the device is able to update in slot , where if it can and otherwise. Let denote the update opportunity sequence. We then redefine the uncertainty instance as to incorporate this additional source of uncertainty.
To model this, we augment the virtual queueing system described in Section IV-A with a virtual ON/OFF channel. Specifically, if , the virtual channel is ON and the virtual server is allowed to clear virtual packets; if , the virtual channel is OFF and the virtual server must idle. This leads to the following revised LP:
(9a)
s.t. for all such that and for all ; (9b)
(9c)
Here, Eq. (9b) differs from Eq. (5b) because a virtual packet is cleared only when the virtual channel is ON.
Next, we generalize Algs. 1 (without ML) and 3 (with ML) in Sections VII-A and VII-B, respectively, to handle scenarios with intermittent update opportunities.
VII-A Without ML advice
This section extends Alg. 1, as described in Alg. 4. The key design change is that is adjusted only when the virtual channel is ON, i.e., when (Line 4). Furthermore, unlike Alg. 1, which adjusts only for the current slot , Alg. 4 also considers all prior virtual OFF slots that occurred since the previous virtual ON slot. Concretely, for each such prior virtual OFF slot (Line 4), if the constraint in Eq. (9b) still holds (Line 4), the algorithm keeps increasing (Line 4). This reflects the intuition that virtual packets held longer in the queue (due to consecutive virtual OFF periods) should have higher clearing probabilities once the virtual channel becomes ON. To implement this logic, the algorithm maintains a pointer (Line 4) to denote the starting slot of this multiple increment procedure. This pointer is adjusted in Line 4 when the condition in Line 4 holds. The pointer identifies the slot immediately following the previous virtual ON slot, which is either the current slot (if the current virtual channel is ON) or the start of the current virtual OFF period (otherwise).
Note that if a virtual packet arrives in a virtual OFF slot, it must remain in the virtual queue until the next virtual ON slot. This limitation applies to all scheduling algorithms (including an offline optimal algorithm). Thus, the multiple increment mechanism applies only to those virtual packets that arrived before the previous virtual ON slot (Line 4). For virtual packets that arrive after the previous virtual ON slot, Alg. 4 adjusts only once (Line 4).
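The pointer bookkeeping described above can be sketched in Python as follows. The sketch only tracks which slots the multiple increment procedure would revisit in each virtual ON slot; it does not implement the increment rule of Alg. 4 itself. Slots are 0-indexed, and the function and variable names are hypothetical.

```python
def increment_windows(opportunities):
    """Map each virtual ON slot to the slots revisited in that slot.

    opportunities[t] is truthy when the virtual channel is ON in slot t.
    The pointer always holds the slot immediately following the previous
    virtual ON slot (or slot 0 if there has been none).
    """
    windows = {}
    pointer = 0
    for t, is_on in enumerate(opportunities):
        if is_on:
            # Revisit every OFF slot since the last ON slot, then slot t.
            windows[t] = list(range(pointer, t + 1))
            pointer = t + 1  # next window starts right after this ON slot
    return windows


print(increment_windows([1, 0, 0, 1, 1, 0]))
```

For the example sequence, slot 3 revisits the two preceding OFF slots, whereas slots 0 and 4 only process themselves because no OFF slot precedes them within their windows.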
We next analyze the performance of Alg. 4 and show that it can achieve the same asymptotic competitive ratio as stated in Theorem 6. To that end, we present a lemma that bounds the increment of under the multiple increment mechanism in Alg. 4. Here, when a virtual packet satisfies the condition in Line 4 or 4 and thus triggers the operation in Line 4 or 4, we say that it activates.
Lemma 16.
For a fixed slot , after the virtual packets in have activated times since slot , the value of computed by Alg. 4 increases (relative to the beginning of slot ) by at most
for all .
Proof.
(Sketch) We prove by induction on . See Appendix J for details. ∎
Using Lemma 16, we are ready to analyze Alg. 4 in the next result. Here, let denote the maximum number of consecutive virtual OFF slots up to and including the next virtual ON slot.
Theorem 17.
Proof.
(Sketch) We follow the proof of Theorem 6. Fix a period and a virtual ON slot in that period. We bound the increment of the objective value in Eq. (9a) in slot by considering how Alg. 4 adjusts and the matching -variables, where we use the notation to represent the value of under the specific condition:
1. If a virtual packet arrives before and activates in iteration of Line 4 in slot , then Line 4 increases by
However, the paired was already set to be in a slot . Because does not change (for all possible ) over the virtual OFF period until slot , we have
Hence, the increment of the objective value due to the adjustment of from virtual packet in iteration and the paired is
2. If a virtual packet arrives before but does not activate in iteration of Line 4 in slot , then because the paired was still set to in slot , it also contributes at most to the objective value in that iteration.
- 3.
Next, we show that, for the virtual packets arriving in the period, there are at most Case 1 and Case 3 increments since slot , and at most Case 2 increments since slot . Then, we can rewrite Eq. (8) as
Finally, following the proof of Theorem 6 yields the desired result. See Appendix K for details. ∎
Compared with Theorem 6, the competitive ratio in Theorem 17 scales that in Theorem 6 by a factor of (which approaches 1 as ), and includes an additional term (which approaches 0 as ). Thus, even under intermittent update opportunities, Alg. 4 asymptotically achieves the lower bound in Lemma 9.
Next, we show that if the adversary is powerful enough to make or arbitrarily large, then no finite competitive ratio can be guaranteed.
Lemma 18.
If either or is unbounded, then no online algorithm can achieve a finite competitive ratio as .
Proof.
See Appendix L for details. ∎
VII-B With ML advice
This section further incorporates ML advice. To this end, we modify Alg. 3 by applying the multiple increment mechanism for previous virtual OFF slots, as in Alg. 4. By extending the proof of Theorem 17 to revise those of Theorems 13 and 14, we can obtain the same scaling and additional terms (as in Theorem 17), yielding the same asymptotic results as in Theorems 13 and 14.
VIII Numerical studies
The previous sections established that our proposed algorithms achieve the best possible competitiveness and the optimal consistency–robustness trade-off in adversarial environments. In this section, we complement the theoretical results by evaluating the algorithms in stationary stochastic environments through numerical experiments.
We adopt a setting similar to [hsu2019scheduling], which derived optimal offline scheduling policies under stationary assumptions. We simulate a horizon of slots. Update opportunities follow a Bernoulli process with rate . To model update costs with memory, we use a two-state Markov chain (as in the Gilbert–Elliott model) with states (low cost) and (high cost). Let denote the state in slot , and let and be the transition probabilities from to and from to , respectively.
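A minimal Python sketch of such an environment is given below. It assumes that the chain starts in the low-cost state and that the state transition is sampled at the end of each slot; the function and parameter names are hypothetical and are not taken from the simulation code used for the figures.

```python
import random


def simulate_environment(horizon, p_opportunity, p_low_to_high,
                         p_high_to_low, seed=0):
    """Sample update opportunities and cost states for `horizon` slots.

    Returns two lists: opportunities[t] (Bernoulli with rate
    p_opportunity) and high_flags[t] (True when slot t is in the
    high-cost state of the two-state Markov chain).
    """
    rng = random.Random(seed)
    high = False  # assumed initial state: low cost
    opportunities, high_flags = [], []
    for _ in range(horizon):
        opportunities.append(rng.random() < p_opportunity)
        high_flags.append(high)
        # Leave the current state with the corresponding probability.
        leave_prob = p_high_to_low if high else p_low_to_high
        if rng.random() < leave_prob:
            high = not high
    return opportunities, high_flags


opps, highs = simulate_environment(10, 0.8, 0.1, 0.3, seed=1)
print(opps, highs)
```

A small transition probability out of a state produces long stays in that state, which is how the two chains used in the experiments favor either the low-cost or the high-cost state.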
Following [hsu2019scheduling], an optimal offline scheduling policy in the stationary setting can be characterized by two age thresholds: if , the device transmits when the age reaches a threshold ; if , it does so when the age reaches a threshold . We compute the optimal pair of and via exhaustive search to minimize the total cost in Eq. (2).
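The exhaustive search over the two thresholds can be sketched as follows. The cost model in this sketch is an assumption made for illustration: the age grows by one per slot, each slot incurs a staleness penalty equal to the age after the decision, and each transmission adds the current update cost. The exact form of Eq. (2) may differ, and all names are hypothetical.

```python
from itertools import product


def average_cost(tau_low, tau_high, costs, high_flags, opportunities):
    """Time-average cost of a two-threshold policy (assumed cost model)."""
    age, total = 0, 0.0
    for cost, is_high, can_update in zip(costs, high_flags, opportunities):
        age += 1  # assumed unit age increment per slot
        threshold = tau_high if is_high else tau_low
        if can_update and age >= threshold:
            total += cost  # pay the update cost
            age = 0        # information is fresh after the update
        total += age       # assumed per-slot staleness penalty
    return total / len(costs)


def best_thresholds(costs, high_flags, opportunities, max_tau):
    """Exhaustively search threshold pairs in {1, ..., max_tau}^2."""
    grid = product(range(1, max_tau + 1), repeat=2)
    return min(grid, key=lambda taus: average_cost(
        taus[0], taus[1], costs, high_flags, opportunities))


print(best_thresholds([3] * 4, [False] * 4, [True] * 4, max_tau=4))
```

The search cost grows quadratically in the largest threshold considered, which is acceptable for a one-off offline benchmark.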
Next, we consider both a linear aging function in Section VIII-A and a nonlinear one in Section VIII-B.
VIII-A Linear aging function
In this section, we consider a constant age increment process with for all . We evaluate the proposed online algorithm without ML in Section VIII-A1 and the ML-augmented version in Section VIII-A2.
VIII-A1 Online scheduling algorithm without ML
In this section, we validate Alg. 2 (modified according to Alg. 4 to handle intermittent update opportunities). Figs. 5 and 6 show the time-average cost (y-axis) for various values of (x-axis). In Fig. 5, we set , , resulting in longer stays in state ; in Fig. 6, we set , , resulting in longer stays in state . Each figure has three subfigures corresponding to , , and , respectively. Each subfigure plots five curves: “Proposed” (the proposed online algorithm without ML), “Revised” (a modified version described later), “Greedy” (a baseline policy described later), “OPT” (the offline optimum), and “Theory” (the upper bound from Theorem 8 multiplied by OPT). We observe that the proposed algorithm performs significantly better in practice than the worst-case theoretical upper bound, especially as the update cost range increases.
For comparison, we also simulate an online greedy algorithm (labeled “Greedy” in the figures) that myopically minimizes the current slot cost, i.e., it transmits in slot if the update cost is less than the cost of waiting (i.e., ). Note that while the greedy algorithm requires knowledge of the current update cost in each slot, the proposed algorithm does not. From Figs. 5 and 6, the proposed algorithm outperforms the greedy baseline except when the system frequently enters the high cost state with large (i.e., in Fig. 6(c)). This exception can be explained by Lemma 5: the proposed algorithm must transmit before activating times, forcing overly frequent updates when is large and occurs often, as in the environment of Fig. 6(c).
Although Alg. 2 asymptotically achieves the optimal competitive ratio and thus serves as an achievability scheme for the lower bound, we observe that it may be too aggressive in such stochastic environments. To remedy this, we propose a revised version in which the constant in Alg. 2 (and Alg. 1) is replaced by . By Lemma 5, the revised algorithm transmits before activating times, thereby reducing the update frequency when is large. From Figs. 5 and 6 (labeled “Revised” in the figures), this revised algorithm consistently achieves the best empirical performance.
Given the revised algorithm’s superior stochastic performance, we analyze its worst-case guarantees. By Lemma 5, the virtual packets arriving in period can activate at most times from slot onward. Let . Hence, Eq. (8) becomes
As , we have and , yielding an asymptotic competitive ratio of . This remains order-optimal.
VIII-A2 Online scheduling algorithm with ML
In this subsection, we evaluate the benefit of incorporating ML advice. Using the same argument as in the previous section, we can show that replacing the constant in Alg. 3 with the revised value achieves the robustness of and the consistency of , as . The revised algorithm also preserves the order of the optimal consistency–robustness trade-off. Because the revised algorithm performs better in the stochastic environment as shown in the previous section, here we evaluate its performance when augmented with ML advice.
Let denote the offline optimal threshold in slot . To investigate imperfect ML advice with controllable errors, we model the ML-estimated threshold as , where is a zero-mean Gaussian random variable with variance chosen such that
i.e., with probability the relative error is within . The ML advice is then if and otherwise.
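As a sketch of how such a variance can be calibrated, suppose the noise is additive, so that the estimate equals the true threshold τ plus zero-mean Gaussian noise, and the target is a relative error within ε with probability p. Then the standard deviation is σ = ετ / Φ⁻¹((1 + p)/2), where Φ⁻¹ is the standard normal quantile function. The symbols τ, ε, and p, the additive form, and the function names below are assumptions introduced for this illustration.

```python
import random
from statistics import NormalDist


def noise_std(tau, rel_error, prob):
    """Std. dev. such that P(|noise| <= rel_error * tau) = prob."""
    z = NormalDist().inv_cdf((1 + prob) / 2)  # two-sided Gaussian quantile
    return rel_error * tau / z


def ml_threshold(tau, rel_error, prob, rng):
    """Noisy threshold estimate under the assumed additive model."""
    return tau + rng.gauss(0.0, noise_std(tau, rel_error, prob))


sigma = noise_std(10.0, 0.1, 0.95)
print(sigma, ml_threshold(10.0, 0.1, 0.95, random.Random(0)))
```

Under this model, a larger ε or a smaller p yields a larger standard deviation and hence less reliable advice.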
Figs. 7 (for tr and tr) and 8 (for tr and tr) plot the time-average cost of the proposed online algorithm with ML for . Each subfigure shows the results for . According to our simulations, the performance of completely following the ML advice coincides with that of ; hence, we omit it for clarity. We observe that for small errors (), completely following the ML advice yields the best performance; for large errors (), completely ignoring the ML advice with performs best. There also exists a sharp transition between the two regimes in the stochastic setting, matching the threshold-type behavior predicted by the asymptotic analysis in the adversarial setting.
VIII-B Nonlinear aging function
In this section, we consider a nonlinear aging function similar to that in [kosta2017age]. Specifically, if slots have elapsed since the most recent update, then the age of information in the current slot is given by . Throughout this section, we fix .
Figs. 9 (for and ) and 10 (for and ) show the time-average cost of the online algorithms without ML. As in the linear aging case, both the proposed and the revised algorithms outperform the baseline policy.
We also evaluate the revised algorithm with ML advice in Figs. 11 (for and ) and 12 (for and ), for several values of the ML error parameter . In Fig. 11, when , completely following the ML advice yields the best performance; when , the performance is nearly identical for all values of ; when , completely ignoring the ML advice is optimal. Similarly, in Fig. 12, when , completely following the ML advice is optimal; when , the performance is nearly the same for all ; when , completely ignoring the ML advice performs best.
These results exhibit the same qualitative behavior observed under the linear aging case in the previous section: either blindly following the ML advice or completely ignoring it yields near-optimal performance, while partially trusting the ML advice provides little benefit.
IX Conclusion
This paper investigated a mobile information updating system subject to four sources of uncertainty. We developed online scheduling algorithms that enable a mobile device to cost-efficiently maintain fresh information at a central entity. The proposed online algorithm without ML asymptotically achieves the optimal competitive ratio, while the ML-augmented version also asymptotically attains the optimal consistency–robustness trade-off. Moreover, when augmented with ML, we showed that either blindly following or completely ignoring the ML advice minimizes the competitive ratio. This work opens several promising research directions for network design under non-stationary uncertainty. Interesting extensions include dynamically adjusting the threshold (for either blindly following ML or completely ignoring it) because the reliability of the ML advice is unknown in general, developing an optimal algorithm for both the adversarial and stochastic environments, exploring multi-device or networked update systems, and incorporating sampling decisions jointly with update scheduling.
X Acknowledgments
We thank the authors of [liu2025learning] for pointing out mistakes in our earlier preliminary work [tseng2019online]. This research was supported by the National Science and Technology Council, Taiwan, under Grant Nos. 110-2221-E-305-008-MY3 and 113-2628-E-305-001-MY3.
Supplementary Material
Appendix A Proof of Lemma 4
We use the notation to represent the value of under the specific condition. Fix a slot . We prove the claim by induction on . When , by Eq. (6) we have
for all .
Assume that the result holds for , i.e.,
for all .
We show that the result holds for : after the additional step, by Eq. (6) we have
for all . This completes the inductive step and proves the lemma.
Appendix B Proof of Theorem 6
Fix an instance . Suppose that an optimal offline scheduling algorithm (denoted by ) clears the virtual queue in slots , for a total of clearing operations. Let and . We divide the timeline into periods, where period consists of slots through .
Let denote the cost incurred by in period . The total cost in Eq. (3) incurred by is then . We calculate for two cases.
1. For : For each slot in period , the number of virtual packets present in the virtual queue is , which counts all virtual packets that arrived after the previous clearing in slot until slot . Hence, the holding cost of all the virtual packets that arrive in period is . We denote this quantity by . Adding the clearing cost in slot , we have .
2. For : Here, the holding cost has the same form as above, but since no clearing occurs in this period, the cost is .
Next, let denote the increment of the objective value in Eq. (5a) by Alg. 1, according to the activations of all virtual packets that arrive in period . This includes the increments of in Line 1 and of in Line 1, for all virtual packets with and for all slots . The objective value computed by Alg. 1 is then . We analyze in two cases.
1. For : By the condition in Line 1, a virtual packet contributes to the objective value only when it activates. If a virtual packet arriving in period activates in some slot , then Line 1 increases by , and Line 1 increases by . Hence, one activation increases the objective value by
(11)
Note that exactly counts the number of iterations of Line 1 from slot through slot for the virtual packets that arrive in period . Thus, the virtual packets can activate at most times before slot . Moreover, by Lemma 5, the virtual packets arriving in period can activate at most additional times from slot onward. Hence, they can activate at most times in total. Combining this with Eq. (11), we obtain
(12)
2. For : Since Alg. 1 terminates in slot with no clearing, the virtual packets arriving in this terminal period can activate at most times. Thus, we have
Combining both cases yields
Substituting and proves the first part of the theorem.
For the asymptotic result, as we have and , so the ratio approaches .
Appendix C Proof of Lemma 7
Fix a slot . First, if the total number of virtual packets that have arrived by slot is less than , then the result is immediate. Otherwise, suppose that the number of virtual packet arrivals by slot is at least .
Let denote the most recently arrived virtual packet by slot . Since at most virtual packets can arrive in a single slot, the arrival of virtual packets requires at least slots. Therefore, virtual packet must have arrived by slot . This implies that the set contains the subset , which consists of virtual packets.
From slot through slot (a total of slots), these virtual packets in have at least opportunities to activate. By Lemma 5, the virtual packet in this set will no longer satisfy the condition in Line 1 at the end of slot . Therefore, at most virtual packets can satisfy the condition at the end of slot .
Appendix D Proof of Lemma 9
Consider the initial age , a fixed age increment sequence , and a fixed update (or clearing) cost sequence . Only the operation duration is unknown to the device. Since the age no longer increases after the first update, the device will not transmit again beyond the first update. Thus, the scheduling problem reduces to deciding when to send a single update (i.e., deciding when to clear the virtual queue once) under uncertainty in .
To simplify the analysis, we rescale the objective function in Eq. (3) by dividing it by , resulting in
This transformation does not alter the optimal solution. We redefine as the new (normalized) clearing cost. Under the given instance, the transformation yields a clearing cost of in slot , and a clearing cost of in all subsequent slots. Moreover, the term can be interpreted as the cost of holding a virtual packet for a slot of duration , under the convention that holding a virtual packet for one unit time incurs a unit cost. As , the slot length approaches zero, and the problem transitions to a continuous-time scheduling model, similar to prior studies [bamas2020primal]. In this continuous-time setting, we assume that time starts at . The clearing cost becomes and for all . Moreover, we consider a time horizon , which is unknown to the device .
We now establish a lower bound on the competitive ratio of any randomized online scheduling algorithm. Let denote the probability density function (PDF) describing the randomized clearing time. Since and for all , the virtual server optimally clears the virtual queue before time 1. Thus, the PDF of the randomized clearing time must satisfy the condition .
For a given realization of , the expected cost incurred by the randomized algorithm is , where the first term accounts for the cost incurred when the virtual server decides to clear by time , and the second term accounts for the cost incurred when the virtual server decides to clear after time . Moreover, for this instance, since , an optimal offline strategy is to idle for all time, incurring the minimum total cost of . Let be the competitive ratio of the randomized algorithm. Then, we have for all . Thus, we derive the following optimization problem to find the smallest achievable competitive ratio :
(13a)
s.t. (13b)
(13c)
We propose a candidate solution of the form for some constant . Substituting this into the constraint in Eq. (13c), we can obtain , which yields . Substituting this density into the left-hand side of constraint (13b), we obtain
Comparing with the right-hand side , we conclude that
which establishes the desired lower bound on the competitive ratio.
Appendix E Proof of Lemma 11
Let denote the value of at the beginning of some iteration of Line 3 in a slot. Suppose that a slow step is followed by a fast step, and let represent the resulting value of after these two steps. By Eq. (6), we have
Similarly, let denote the resulting value of after applying a fast step followed by a slow step. We have
Since , we have . Therefore, swapping a fast step and a subsequent slow step can reduce the value of . Hence, to prove the desired lower bound, we can assume that all slow steps occur first, followed by all fast steps.
Next, we prove the bound by induction on . When , the result follows from Lemma 4, which gives
for all . Assume that the result holds for , i.e.,
for all . We show the result also holds for : after an additional fast step, by Eq. (6) we have
for all . This completes the inductive step and proves the lemma.
Appendix F Proof of Lemma 12
Fix a slot . We claim that if , then for all , implying that the condition in Line 3 no longer holds.
To prove the claim, it suffices to consider the case where the slow steps are followed by the fast steps (as in Appendix E). Applying Lemma 11 under yields
| (14) |
for all . If , then the right-hand side of Eq. (14) equals . Next, we show that this right-hand side is nondecreasing in . Differentiating it with respect to gives
To show the derivative is nonnegative, we examine the bracketed term:
| (15) |
where sets (since decreases in ). From [bamas2020primal, Page 27], we can write for some . Let . Then, Eq. (15) becomes
which is known to be nonnegative [bamas2020primal, Page 27]. Hence, Eq. (14) is nondecreasing as its derivative is nonnegative, proving the claim.
Appendix G Proof of Theorem 13
We follow Appendix B. Consider a period . Here, a single slow or fast step activation can increase the objective value in Eq. (5a) by at most or , respectively. Since , the increment of the objective value due to the activations of all virtual packets arriving in period , from slot up to slot , is bounded above by
| (16) |
Moreover, let and denote the numbers of slow and fast steps, respectively, performed by the virtual packets arriving in period , from slot onward. Then, the increment of the objective value due to these activations is
| (17) |
where holds because by Lemma 12. Differentiating the right-hand side of Eq. (17) with respect to gives
Following Appendix F, this derivative can be rewritten as
for some . This expression is known to be nonnegative [bamas2020primal, Page 27]. Hence, Eq. (17) is nondecreasing in , and its maximum is attained at , which yields
| (18) |
Appendix H Proof of Theorem 14
Fix an instance and ML advice . We follow Appendix B, with minor modifications. Redefine for as the slot when clears the virtual queue for the -th time. These redefined time points determine a new set of periods, replacing the periods defined in Appendix B.
Let denote the cost incurred by in period . Then, the total cost in Eq. (3) incurred by is . Let be the holding cost incurred by for all virtual packets arriving in period . Following Appendix B, we have for all , and .
Let be the increment of the objective value in Eq. (5a) by Alg. 3, according to the slow and fast step activations of all virtual packets arriving in period . Consider a fixed . The virtual packets arriving in period activate only slow steps from slot until slot (before advising clearing). Each slow step activation increases the objective value by at most , so the total increment of the objective value in this interval is at most . Moreover, after advising clearing in slot , the same virtual packets activate only fast steps. Following the proof of Lemma 5, there are at most such activations, each increasing the objective value by at most . Thus, the total increment of the objective value after slot is at most .
Hence, we have
Similarly, we also have
Combining the two cases yields
Substituting and the definitions of and proves the first part of the theorem.
Finally, following the derivation in Appendix G, we obtain the asymptotic behavior of the bound as .
Appendix I Proof of Lemma 15
We use the same continuous-time instance as in Appendix D. Consider a -consistent scheduling algorithm that chooses a random update (or equivalently clearing) time with PDF (so ). Let denote the robustness factor. Following Appendix D, we have
| (19) |
for all .
Setting and assuming perfect ML advice (updating at time ), the ML advice incurs a cost of , while the online algorithm incurs a cost of . By -consistency, we have
| (20) |
We now lower bound the optimal robustness subject to Eqs. (19) and (20), and :
| (21a) | ||||
| s.t. | ||||
| (21b) | ||||
| (21c) | ||||
| (21d) | ||||
By weak duality, any feasible solution to the dual of Eq. (21) yields a lower bound on . Let , , and be the dual variables for Eqs. (21b), (21c), and (21d), respectively. Then, the dual program can be written as follows:
| (22a) | ||||
| s.t. | (22b) | |||
| (22c) | ||||
Next, we propose a feasible solution to the optimization problem in Eq. (22). We propose for some constant to be determined, where is the indicator function. Substituting this form into Eq. (22b) yields .
We further propose and for some constants and to be determined. Substituting these into the objective in Eq. (22a) gives
| (23) |
To make the objective equal to (the claimed robustness bound), we choose and .
Next, we verify that the chosen values satisfy Eq. (22c). We consider two cases:
1. For : The left-hand side of Eq. (22c) (before the inequality) is
The right-hand side is
which matches the left-hand side.
2. For : The left-hand side is
The right-hand side is
so the inequality holds.
Therefore, Eq. (22c) is satisfied in both cases. By the weak duality theorem, the minimum possible robustness is at least
as stated in Eq. (23).
Appendix J Proof of Lemma 16
Assume the result holds for , i.e.,
for all .
Appendix K Proof of Theorem 17
For any scheduling algorithm, a virtual packet that arrives during a virtual OFF slot must remain in the virtual queue until the next virtual ON slot. Hence, the holding cost accrued by the virtual packets that arrive in virtual OFF slots until the slot immediately before the next virtual ON slot is identical across all algorithms (including the offline optimum). We therefore remove this constant from the objective, which is equivalent to deferring any virtual packet arrival in a virtual OFF slot to the next virtual ON slot. Then, it suffices to consider a fixed instance in which no virtual packet arrives in a virtual OFF slot.
We follow Appendix B. Fix a period and a virtual ON slot in that period. We bound the increment of the objective value in Eq. (9a) in slot by considering how and the matching -variables are adjusted by Alg. 4. There are three mutually exclusive cases:
1. If a virtual packet arrives before and activates in iteration of Line 4 in slot , then Line 4 increases by
However, the paired was already set to be in a slot . Because does not change (for all possible ) over the virtual OFF period until slot , we have
Hence, the increment of the objective value due to the adjustment of from virtual packet in iteration and the paired is
(25)
Next, we analyze the total number of activations performed from the start of slot until the considered activation. By Lemma 7, at most virtual packets can satisfy the condition in Line 4 at the end of the previous ON slot . Because no virtual packets arrive during the virtual OFF period, at most virtual packets can likewise satisfy the condition at the beginning of slot . Moreover, since each such virtual packet can be iterated at most times in Line 4, the total number of activations performed from the start of slot until the considered activation is bounded by .
Then, by Lemma 16, we have
Substituting this in Eq. (25) gives the bound on the increment of the objective value:
(26)
2. If a virtual packet arrives before but does not activate in iteration of Line 4, then because the paired was still set to in slot , it contributes at most to the objective value in that iteration.
3.
Next, we count the occurrences of the three cases since slot for the virtual packets that arrive in the period. By Lemma 5, the virtual packets that arrive in period can activate (in Cases 1 and 3 together) at most times since . In addition, by Lemma 7, at most virtual packets meet the condition in Line 4 at the end of slot . Once such a virtual packet stops activating, it can contribute at most Case 2 increments, because Line 4 runs for at most iterations. Thus, there are at most Case 2 increments since slot .
Then, we can revise Eq. (12) in Appendix B as
where (a) corresponds to all cases before slot ; (b) corresponds to Cases 1 and 3 from slot onward; (c) corresponds to Case 2 from slot onward. These terms can be further simplified as
and also
Finally, following Appendix B yields the desired result.
Appendix L Proof of Lemma 18
First, we consider the initial age and a fixed update cost sequence . Consider an online algorithm that updates in slot with probability . Depending on , the adversary constructs the operation duration , the age increment sequence , and the update opportunity sequence as follows:
1. If : Set , , and . In this case, the online algorithm incurs expected total cost
while the offline optimum updates in slot 1 and incurs total cost . Hence, the competitive ratio is , which diverges as .
2. If : Set , , and . In this case, the online algorithm incurs cost , while the offline optimum chooses not to update and incurs total cost . Hence, the competitive ratio is , which again diverges as .
In both cases, if can be arbitrarily large, the adversary can construct an instance that forces the competitive ratio of any online algorithm to diverge.
Second, we consider the initial age , a fixed age increment sequence , and a fixed update cost sequence . Depending on , the adversary constructs the operation duration and the update opportunity sequence as follows:
1. If : Set and . In this case, the online algorithm incurs expected total cost
while the offline optimum updates in slot 1 and incurs total cost . Hence, the competitive ratio is , which diverges as .
2. If : Set and . In this case, the online algorithm incurs cost , while the offline optimum chooses not to update and incurs total cost . Hence, the competitive ratio is , which again diverges as .
In both cases, if can be arbitrarily large, the adversary can also construct an instance that forces the competitive ratio of any online algorithm to diverge, completing the proof.