Efficient Asynchronous Federated Evaluation with Strategy
Similarity Awareness for Intent-Based Networking
in Industrial Internet of Things

Shaowen Qin, Jianfeng Zeng, Haodong Guo, Xiaohuan Li, Jiawen Kang, Qian Chen, Dusit Niyato

This work was supported in part by the National Natural Science Foundation of China under Grant U22A2054. (Corresponding author: Xiaohuan Li.)

Shaowen Qin, Jianfeng Zeng and Haodong Guo are with the Guangxi University Key Laboratory of Intelligent Networking and Scenario System (School of Information and Communication, Guilin University of Electronic Technology), Guilin 541004, China (e-mails: [email protected]; [email protected]; [email protected]).

Xiaohuan Li is with the Guangxi University Key Laboratory of Intelligent Networking and Scenario System (School of Information and Communication, Guilin University of Electronic Technology), Guilin 541004, China, and also with National Engineering Laboratory for Comprehensive Transportation Big Data Application Technology (Guangxi), Nanning 530001, China (e-mail: [email protected]).

Jiawen Kang is with the School of Automation, Guangdong University of Technology, Guangzhou 510006, China (e-mail: [email protected]).

Qian Chen is with the School of Architecture and Transportation Engineering, GUET, Guilin, 541004, China (e-mail: [email protected]).

Dusit Niyato is with the College of Computing and Data Science, Nanyang Technological University, Singapore (e-mail: [email protected]).

This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.

Abstract

Intent-Based Networking (IBN) offers a promising paradigm for intelligent and automated network control in Industrial Internet of Things (IIoT) environments by translating high-level user intents into executable network strategies. However, frequent strategy deployment and rollback are impractical in real-world IIoT systems due to tightly coupled workflows and high downtime costs, while the heterogeneity and privacy constraints of IIoT nodes further complicate centralized policy verification. To address these challenges, we propose FEIBN, a Federated Evaluation Enhanced Intent-Based Networking framework. FEIBN leverages large language models (LLMs) to align multimodal user intents into structured strategy tuples and employs federated learning to perform distributed policy verification across IIoT nodes without exposing raw data. To improve training efficiency and reduce communication overhead, we design SSAFL, a Strategy Similarity Aware Federated Learning mechanism that selects task-relevant nodes based on strategy similarity and resource status, and triggers asynchronous model uploads only when updates are significant. Experiments demonstrate that SSAFL can improve model accuracy, accelerate model convergence, and reduce the cost by 27.8% compared with SemiAsyn.

I Introduction

With the rapid advancement of intelligent manufacturing, the Industrial Internet of Things (IIoT) has evolved substantially in both scale and complexity, becoming a core enabling technology for modern industrial systems [r1, r2]. Intent-Based Networking (IBN) provides a promising paradigm for intelligent operation in IIoT by allowing users to express desired outcomes through human-readable intents, which are automatically translated into executable policies for deployment and enforcement [r3, r4]. However, IIoT intents often involve task execution goals, device coordination rules, safety constraints, and temporal requirements, rather than simple network configuration updates [r46]. For example, in a sensing-driven environment equipped with temperature, humidity, water-level, and ultrasonic modules, an engineer may express intentions such as “increase the sampling priority of the ultrasonic sensing module” or “allocate more processing resources to the water-level monitoring zone.” Ensuring that such high-level instructions are correctly interpreted and mapped to actionable IIoT strategies is crucial for safe and efficient system operation [r5, r37]. Traditional intent analysis methods, which rely on rule-based or shallow semantic models [r47], suffer from limited generalization and adaptability in complex industrial scenarios. Large Language Models (LLMs) [r6], with their powerful semantic understanding and cross-modal reasoning capabilities, can integrate intents expressed across different modalities into a unified semantic representation, thereby significantly enhancing the intent recognition capability of IBN systems [r43].

However, accurate intent recognition alone is insufficient to ensure reliable policy execution. Unlike traditional network management intents that primarily involve routing or configuration updates, IIoT intents directly drive physical actions, so incorrect interpretations or unsafe deployments can lead to costly downtime or even physical hazards [r35, r36]. This necessitates thorough policy verification prior to deployment to prevent costly failures or interruptions [r4]. Existing AI-based methods verify network policies before actual deployment, but they require uploading operational and environmental data from multiple devices to a centralized server for model training and performance evaluation. Nevertheless, IIoT nodes are typically distributed and heterogeneous, and the data held by each node often involves sensitive information such as device parameters and operational status [r7], rendering centralized evaluation and prediction model training infeasible. Federated Learning (FL) [r8, r9], as a distributed collaborative learning framework, enables cross-node policy verification without requiring raw data to leave local devices [r45]. FL can be categorized into synchronous FL and asynchronous FL. In synchronous FL, the server must wait for all clients to upload their updates, causing faster clients to idle until the slowest ones finish. This straggler effect slows down training and leads to inefficient resource utilization, resulting in prolonged aggregation time and delayed convergence [r24, r25]. Asynchronous FL addresses this problem by allowing the server to aggregate and update the global model as soon as it receives a single client model [r23]. This method significantly reduces the waiting time of faster clients and expedites the training of the global model.
Although integrating asynchronous FL with industrial intent-based networking effectively enhances distributed policy verification, it also brings the following new issues.

  • i. There is a lack of a complete framework that connects multimodal intent fusion, semantic translation, policy generation, and distributed verification into a unified process. Although several recent studies have introduced LLMs into IBN, existing LLMs can only process unstructured textual descriptions, which do not fully meet the requirements of multimodal inputs [r33, r34]. Moreover, current IBN approaches for IIoT largely focus on intent interpretation while seldom integrating verification and feedback mechanisms into the overall workflow, making it difficult to form a closed-loop system in which intents can be accurately interpreted, reliably executed, and continuously optimized.

  • ii. Because different strategies often correspond to distinct execution conditions and action sets [r38], IBN policy verification tasks exhibit strong task-specific characteristics. However, existing methods usually neglect the relevance between nodes and strategies, with node evaluation metrics only focusing on capability, which can result in inefficient or low-value training.

  • iii. IBN policy verification tasks impose strict requirements on communication efficiency and response time, since frequent uploads of minor updates may lead to resource waste and delay timely strategy deployment due to prolonged training [r10]. Although asynchronous FL accelerates global model updates, it often results in redundant communication and unstable convergence due to uneven resource availability and unbalanced node participation.

To address these challenges, we propose a Federated Evaluation Enhanced Intent Based Networking (FEIBN) framework tailored for IIoT environments, which aims to enhance the precision and adaptability of intent understanding through multi-modal alignment and semantic modeling, while mitigating the risks of high deployment costs and node heterogeneity inherent in traditional IBN systems. The framework is driven by user intents and employs multi-modal alignment and LLMs to more precisely and efficiently transform heterogeneous intent expressions into a unified policy semantic space. Meanwhile, a federated evaluation mechanism is introduced to verify the effectiveness of the generated strategies in a distributed manner, thereby ensuring data privacy and enhancing evaluation efficiency. Furthermore, because existing participation metrics overlook task relevance and strategy similarity, we design a Strategy-Similarity-Aware Federated Learning (SSAFL) mechanism within the framework to address the inefficiencies in training and communication during policy validation. This mechanism introduces a new metric called the participation score, which evaluates nodes based on both historical strategy similarity and resource availability. Nodes with higher participation scores, indicating stronger task relevance and greater resource availability, are dynamically prioritized for training. In addition, an asynchronous upload mechanism based on model update magnitude is adopted, allowing only significant local updates to be uploaded. This design effectively reduces communication overhead while maintaining model convergence quality. The major contributions of this paper are as follows.

  • We propose FEIBN, a Federated Evaluation Enhanced IBN framework. FEIBN employs multi-modal alignment combined with LLMs to improve the accuracy, consistency, and adaptability of intent understanding in IIoT environments. Moreover, by integrating federated learning for distributed policy verification, FEIBN enhances the precision of intent–policy mapping and strengthens deployment reliability across heterogeneous IIoT nodes.

  • We design SSAFL, a strategy-similarity-aware FL mechanism that prioritizes nodes based on strategy relevance and resource availability. SSAFL achieves more efficient training, faster convergence, and substantial communication cost reduction, ensuring practical scalability for policy validation in IIoT networks.

  • We analyze the effectiveness of SSAFL by comparing it with FedAvg, FedAsyn, and SemiAsyn on realistic datasets. The experimental results show that SSAFL can improve model accuracy, accelerate model convergence, and significantly reduce network communication costs.

The rest of the paper is organized as follows. Section II reviews the related work. Section III presents the system model of the proposed FEIBN framework. Section IV details the design of the SSAFL. Section V presents the experimental setup and results. Section VI concludes the paper.

II Related Work

II-A Intent-Based Networking

IBN abstracts user requirements into high-level intents and automatically maps them to executable network policies, offering a promising approach for achieving automated and intelligent network control in IIoT environments. With the advancement of artificial intelligence, some studies have leveraged AI-driven methods to enhance intent understanding. The authors in [r15] introduced an AI-powered IBN architecture that automates the mapping from user intents to policy execution logic. In addition, LLMs have also been explored as powerful tools for semantic alignment in IBN systems. The authors in [r16] designed a custom LLM-driven framework for extracting intents in 5G core networks, showcasing significantly improved intent interpretation for policy generation. The authors in [r17] proposed an LLM-guided assurance mechanism to detect and correct intent drift in real time, ensuring policy consistency. The authors in [r18] introduced an industrial Agentic AI system that decomposes high-level intent into executable control flows using LLM agents, demonstrating feasibility in predictive maintenance scenarios. A summary of related studies is provided in Table I.

However, due to the involvement of multiple production-line devices in IIoT environments, it is impractical to frequently deploy and roll back strategies in real-world industrial operations. IBN in IIoT still lacks effective mechanisms for verifying the effectiveness of strategies prior to deployment. In addition, the heterogeneity and distributed nature of IIoT nodes further exacerbate the complexity of centralized policy verification and coordination.

TABLE I: Summary of Related Work on IBN
Ref. | Focus | Insight | Advantages (✓) & Limitations (×)
[r15] | End-to-end intent life cycle design including intent parsing, policy generation, and closed-loop execution | Establishes a complete AI-driven IBN pipeline that transforms high-level intents into enforceable network policies through multi-stage processing | ✓ Provides structured IBN architecture, covers full policy workflow; × Lacks LLM-based semantic reasoning, limited validation under dynamic IIoT or heterogeneous environments
[r16] | LLM-based natural-language intent extraction, entity recognition, and slot filling | Demonstrates that LLMs significantly improve intent interpretation accuracy in 5G core networks and reduce configuration ambiguity | ✓ Enhances understanding of telecom intents, improves mapping precision; × Focuses solely on extraction without supporting policy verification, assurance, or runtime validation
[r17] | Runtime assurance, semantic drift detection, state-to-intent consistency verification | Introduces the concept of intent drift, enabling LLMs to detect mismatches between desired intents and actual network behaviors | ✓ Strong in assurance and runtime monitoring, provides a new conceptual model; × No intent translation or policy generation; performance relies heavily on drift model robustness
[r18] | Agentic AI-based intent decomposition, multi-agent orchestration, and tool-enabled execution | Proposes an agentic intent-processing pipeline that decomposes industrial intents into actionable tasks via LLM-based multi-agent collaboration | ✓ Strong alignment with Industry 5.0, enables autonomous planning and execution; × Conceptual and lacks network-level policy verification, not tailored for communication constraints or heterogeneity
TABLE II: Summary of Related Work on FL
Ref. | Focus | Insight | Advantages (✓) & Limitations (×)
[r10] | Asynchronous aggregation under heterogeneous device states | Improves model freshness by adjusting aggregation timing according to client states | ✓ Better stability under asynchronous updates; × Does not distinguish task relevance among clients and lacks mechanisms to prevent low-value or irrelevant updates from harming global convergence
[r11] | Similarity-aware personalized FL | Uses confidence estimation and similarity weighting to improve personalized performance | ✓ Higher accuracy for heterogeneous autonomous devices; × Does not address client participation strategy and overlooks the impact of unreliable or inconsistent updates on training efficiency
[r12] | Task-grained knowledge sharing for heterogeneous task sequences | Shares compact task knowledge to support continual learning across diverse edge tasks | ✓ Strong support for heterogeneous tasks with reduced communication cost; × Does not handle asynchronous participation and lacks a mechanism to prioritize high-value contributors under dynamic edge conditions
[r13] | Client clustering and personalized lightweight patches | Forms intrinsic client groups to improve personalization under non-IID data | ✓ Strong personalization capability; × Relies on fixed cluster structures and lacks adaptive handling of dynamic client states

II-B Federated Learning

Due to the wide distribution of IIoT nodes and the high sensitivity of local data, federated learning often faces practical challenges such as task diversity, heterogeneous device capabilities, and varying policy applicability across clients, making it difficult to meet the personalized and efficient requirements of IBN policy verification. To address this, several studies have focused on task-aware federated learning approaches. The authors in [r11] proposed a federated learning method that emphasizes task similarity among clients by adopting a confidence-aware weighted aggregation strategy, guiding clients with similar tasks to share model parameters more closely and thus improving knowledge transfer efficiency. The authors in [r12] introduced a task-granular knowledge aggregation method, where each client selectively integrates only the task-relevant parts of global knowledge to reduce communication costs and mitigate catastrophic forgetting. The authors in [r13] presented a personalized federated learning framework based on task similarity, which dynamically adjusts aggregation weights to enhance collaborative effectiveness across tasks. The authors in [r10] developed an asynchronous federated learning framework tailored for heterogeneous IoT environments, utilizing asynchronous updates and adaptive aggregation to improve training efficiency and overall stability under non-synchronous conditions. A summary of related studies is provided in Table II.

Figure 1: The FEIBN framework for IIoT. The framework supports intent-driven strategy generation and verification across distributed IIoT nodes. Within this framework, an LLM translates multimodal user intents into strategy tuples, and asynchronous federated policy verification is performed based on a similarity-aware node selection mechanism.

However, most of these methods are designed for general-purpose learning tasks and lack mechanisms specifically tailored for IBN policy verification, such as explicit modeling of task relevance, policy–semantic alignment, and strategy-aware client selection. Particularly in IIoT-based IBN scenarios, where devices are highly heterogeneous, node states are dynamic, and both semantic relevance and communication efficiency are critical, existing approaches fall short in balancing training efficiency with verification quality.

III Federated Evaluation Enhanced Intent-Based Networking with LLM

To enable intelligent intent understanding and distributed policy verification in IIoT environments, we propose FEIBN, as illustrated in Fig. 1. The FEIBN framework consists of four core modules: intent expression, intent translation, intent analysis, and network configuration. First, in the intent expression module, users express their intents in multiple modalities, which are processed by a multimodal alignment module composed of pretrained encoders to extract semantic features. These features are then fused and interpreted in the intent translation module by an LLM, producing a structured strategy tuple. Next, in the intent analysis module, strategy validation is initiated across distributed IIoT nodes. A similarity-aware participation scoring mechanism evaluates each node’s relevance to the current strategy and its available resources. Based on this score, a subset of high-quality nodes is selected to participate in local training. Each participating node computes the magnitude of its local model update and uploads the update only if it exceeds a dynamic threshold, ensuring communication efficiency. Finally, in the network configuration module, the central server aggregates these updates to evaluate the policy effectiveness, and, if validated, the policy is deployed to the industrial control system for execution. The main notations used in this paper are shown in Table III.

TABLE III: Summary of Main Notations
Notation | Description
$a_g$ | Scaling factor controlling sensitivity to threshold differences in condition similarity.
$b_m$ | Bias parameter in the projection for modality $m$.
$d_i$ | Magnitude of local model update at client $i$.
$f(\theta; x_j, y_j)$ | Loss function on sample $(x_j, y_j)$, typically a regression loss such as MSE.
$g_n$ | A goal element, defined by a metric, relational operator, and threshold.
$h(g, g')$ | Pairwise condition similarity between two goals.
$m$ | Modality index, such as text, audio, or vision.
$q(t)$ | Set of client indices whose updates arrive within event window $t$.
$u_i$ | Normalized CPU utilization of node $i$.
$w$ | Aggregation weight.
$z^{m}$ | Projected embedding of modality $m$.
$A_{i,j}$ | Action set of the $j$-th historical strategy of node $i$.
$B_i$ | Normalized available bandwidth of node $i$.
$C_{i,j}$ | Condition set of the $j$-th historical strategy of node $i$.
$D_i$ | Local dataset of node $i$.
$E'$ | Entities or resources in the executable strategy tuple.
$F_i(\theta)$ | Local objective function at node $i$.
$G'$ | Executable goal set in the strategy.
$H_i$ | Suitability score of node $i$, combining similarity and resources.
$I$ | Total number of IIoT nodes.
$S$ | Structured intent tuple including user, goals, entities, actions, and time.
$Sim_i$ | Strategy similarity of node $i$.
$T_i$ | Number of local rounds completed by client $i$.
$U$ | User in the intent tuple.
$\Gamma_i^t$ | Communication cost of client $i$ at round $t$.
$\beta_1, \beta_2$ | Weights for similarity and resource in the suitability score.
$\delta_1, \delta_2$ | Weights for CPU and bandwidth in the resource score.
$\epsilon_i$ | Client-specific upload threshold.
$\eta$ | Local learning rate used in client training.
$\lambda_s$ | Scaling factor in adaptive threshold design.
$\mu_g, \mu_{g'}$ | Threshold values of conditions in goals $g$ and $g'$.
$\nu$ | Convergence tolerance in global objective.
$\theta_i^t$ | Local model parameter of node $i$ at time $t$.
$\gamma_1, \gamma_2, \gamma_3$ | Weights for action, condition, and resource similarity components.
$\tau_s$ | Threshold for selecting clients based on suitability score.
$y, \hat{y}$ | True and predicted outputs in regression-based evaluation.
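
The update-magnitude gating mentioned in the framework overview can be illustrated with a minimal sketch in the notation of Table III ($d_i$, $\epsilon_i$). The sketch assumes that $d_i$ is the Euclidean norm of the difference between the new and old local parameters and that $\epsilon_i$ is supplied as a fixed number; both are illustrative assumptions, and the function names are hypothetical rather than part of FEIBN.

```python
import math


def update_magnitude(theta_old, theta_new):
    """d_i as the Euclidean norm of the local update (norm choice is an assumption)."""
    return math.sqrt(sum((n - o) ** 2 for o, n in zip(theta_old, theta_new)))


def should_upload(theta_old, theta_new, epsilon_i):
    """Upload only when the update magnitude exceeds the client threshold epsilon_i."""
    return update_magnitude(theta_old, theta_new) > epsilon_i


theta_old = [0.0, 0.0]
small_step = [0.003, 0.004]  # magnitude of about 0.005
large_step = [0.3, 0.4]      # magnitude of about 0.5
epsilon_i = 0.05             # hypothetical threshold value

print(should_upload(theta_old, small_step, epsilon_i))
print(should_upload(theta_old, large_step, epsilon_i))
```

With these toy values only the larger step is uploaded, which is the behavior that saves communication when local training has barely moved the model.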

III-A Intent Expression

In IIoT environments, user intents may appear in diverse forms. For instance, a field operator may issue a voice command such as “prioritize safety policies in the pump station due to abnormal vibration,” a supervisor may send a text message like “increase throughput of line B by 10% within 2 hours,” while a monitoring system may provide a visual signal indicating machine overheating. These heterogeneous inputs contain complementary cues: text captures explicit goals, audio conveys urgency or priority, and vision reflects real-time physical states.

We develop an intent expression module in FEIBN that projects text, audio, and images into a unified semantic space, ensuring that intents expressed across diverse industrial contexts can be uniformly interpreted and effectively processed. To achieve consistent interpretation, these heterogeneous signals are first encoded into modality-specific embeddings. Specifically, textual sequences are processed using a pretrained BERT encoder [r26], audio waveforms are transformed into latent representations by Wav2Vec2 [r27], and visual inputs are converted into high-level semantic features via ResNet [r28]. These models are selected for their strong generalization ability and proven robustness across multiple tasks, making them suitable for industrial scenarios where signals exhibit diverse formats and noise patterns. Since these encoders produce representations in different spaces, a learnable linear projection is applied to map each modality into a unified latent space as follows:

$$z_m = W_m h_m + b_m \tag{1}$$

where $m$ represents the type of input, $h_m$ is the modality-specific embedding produced by the corresponding encoder, and $W_m$ and $b_m$ are trainable parameters. The projected embeddings from multiple modalities are concatenated and then processed by a Transformer encoder, which models cross-modal dependencies and contextual relations among modalities. For example, it can associate the spoken phrase “slow down” with a corresponding visual cue of increasing conveyor-belt speed, thereby reinforcing semantic coherence. Through self-attention, the Transformer learns which modality carries dominant information for a given intent. The resulting fused representation $z$ serves as a comprehensive semantic descriptor that combines textual precision, auditory intent strength, and visual situational awareness. Finally, the fused representation $z$ is passed to an LLM (e.g., GPT [r29], DeepSeek [r31], or LLaMA [r32]), providing a coherent semantic interface for LLM-based intent translation. This unified representation enables the LLM to reason over structurally consistent inputs, thereby improving the accuracy and stability of policy generation and forming the foundation for subsequent strategy generation and strategy-similarity evaluation.
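
As a minimal sketch of the projection in Eq. (1), the following example maps toy encoder outputs of different dimensions into a shared two-dimensional space. The embedding values, the parameter matrices, and the function name `project` are hypothetical; in practice the embeddings would come from the pretrained encoders and the parameters would be learned, and the Transformer fusion step is not shown.

```python
def project(W, h, b):
    """Affine map z_m = W_m h_m + b_m for a single modality."""
    return [sum(w_ij * h_j for w_ij, h_j in zip(row, h)) + b_i
            for row, b_i in zip(W, b)]


# Toy encoder outputs with different dimensions (hypothetical values).
h = {"text": [1.0, 2.0, 3.0], "audio": [0.5, -1.0]}

# Hypothetical projection parameters mapping each modality to a 2-D shared space.
W = {"text": [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]],
     "audio": [[2.0, 0.0], [0.0, 2.0]]}
b = {"text": [0.0, 0.5], "audio": [1.0, 1.0]}

# One projected vector per modality, all with the same dimension.
z = {m: project(W[m], h[m], b[m]) for m in h}

# Sequence of modality tokens that a fusion encoder could consume.
tokens = [z[m] for m in ("text", "audio")]
print(tokens)
```

Because every projected vector has the same length, the modalities can be stacked as a token sequence regardless of the native encoder dimensions.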

III-B Intent Translation

To ensure that high-level intents can be accurately and efficiently deployed in IIoT networks, the unified semantic representation needs to be converted into executable network strategies. In the intent translation module, the LLM is used for strategy generation, transforming abstract multimodal semantics into actionable and verifiable network configurations. The output strategy generated by an LLM is formally represented as a structured intent tuple, denoted as

$$S = \langle U, G, E, A, T \rangle \tag{2}$$

where $U$ denotes the user who defines the intent, $G$ denotes the objective, $E$ denotes the infrastructure for deploying the intent, $A$ denotes the set of actions to be executed in the network, and $T$ denotes the period during which the required service is scheduled to occur.

Once the intent tuple $S$ is received, the Central Strategy Engine transforms it into an executable strategy tuple, denoted as

$$S' = \langle U, G', E', A', T \rangle \tag{3}$$

where $G' = \langle g_1, g_2, \ldots, g_n \rangle$ denotes the set of goals that the strategy aims to achieve. Each goal $g_n$ can be formally expressed as $g_n = \ell_n \vartriangleright \theta_n$, with $\ell_n$ representing a metric, $\theta_n$ a threshold, and $\vartriangleright$ a relational operator (e.g., $>$, $<$, $\geq$, $\leq$). $E' = \langle e_1, e_2, \ldots, e_k \rangle$ identifies the devices or resources affected by the strategy. $A' = \langle a_1, a_2, \ldots, a_k \rangle$ denotes the set of actions to be executed, with each action $a_k$ indicating a concrete operational step. $T$ denotes the period in which the strategy is expected to take effect, that is, when the required service behavior should be enacted. Below, we provide an example output of the intent translation module for a user intent such as “reduce communication delay for the ultrasonic sensing module”: <'operator_02', {'metric': 'latency', 'operator': '<', 'value': 15}, 'ultrasonic_module', {'type': 'QoS_adjustment', 'params': {'priority': 5}}, (0, 600s)>. The field $U$ identifies the initiating user (operator_02). The goal set $G'$ specifies that the end-to-end latency should be kept below 15 ms. The entity set $E'$ indicates that the strategy targets the ultrasonic module. The action set $A'$ describes a concrete network operation, namely a QoS adjustment that raises the scheduling priority of the corresponding traffic to level 5, encoded through the type and params fields. Finally, the time field $T$ defines a 600-second window during which this strategy should be enforced.
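
One possible in-memory encoding of the executable strategy tuple is sketched below using the example values above. The class names, field names, and the `satisfied` helper are hypothetical choices made for illustration; the paper specifies only the tuple structure, not a concrete data model.

```python
import operator as op
from dataclasses import dataclass

# Relational operators allowed in a goal.
OPS = {"<": op.lt, "<=": op.le, ">": op.gt, ">=": op.ge}


@dataclass
class Goal:
    metric: str
    relation: str      # one of the keys of OPS
    threshold: float

    def satisfied(self, measured):
        """Check 'measured <relation> threshold' for this goal."""
        return OPS[self.relation](measured, self.threshold)


@dataclass
class Action:
    type: str
    params: dict


@dataclass
class Strategy:
    user: str          # U
    goals: tuple       # G'
    entities: tuple    # E'
    actions: tuple     # A'
    window: tuple      # T as (start_s, end_s)


strategy = Strategy(
    user="operator_02",
    goals=(Goal("latency", "<", 15),),
    entities=("ultrasonic_module",),
    actions=(Action("QoS_adjustment", {"priority": 5}),),
    window=(0, 600),
)

print(strategy.goals[0].satisfied(12.0))
```

Keeping the relational operator as data rather than code lets the same goal object be reused both for similarity comparison and for runtime checks against telemetry.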

III-C Intent Analyses with LLM

In IIoT environments, where production lines are fixed and downtime costs are high, it is impractical to validate strategies through frequent real-world deployments. Therefore, the intent analysis module is designed to evaluate the effectiveness of strategies in a distributed manner prior to actual deployment.

The intent analysis module initiates a federated learning task based on strategy $S$ to collaboratively train a predictive model capable of evaluating the strategy. We represent the set of IIoT nodes involved as $I_{node} = \{1, \ldots, i, \ldots, I\}$. Each node $i \in I_{node}$ possesses a local dataset $D_i$, consisting of samples $(x_j, y_j)$, where $x_j$ denotes the input feature and $y_j$ is the corresponding label indicating whether policy $S$ is suitable under the local context. Let $\theta$ denote the shared model parameter and $f(\theta; x_j, y_j)$ be the loss function on the $j$-th sample. The local objective of node $i$ is defined as

$$F_i(\theta) = \frac{1}{|D_i|} \sum_{(x_j, y_j) \in D_i} f(\theta; x_j, y_j) \tag{4}$$

where $|D_i|$ represents the size of the dataset $D_i$. Therefore, the loss function $F(\theta)$ of the server side can be calculated as

$$F(\theta) = \sum_{i=1}^{I} \frac{|D_i| \, F_i(\theta)}{|D|} \tag{5}$$

where $|D| = \sum_{i=1}^{I} |D_i|$. According to the above loss function, the optimization objective of FL can be formulated as

$$\theta^{*} = \arg\min_{\theta} F(\theta) \tag{6}$$

where $\theta^{*}$ is the optimal global model.
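
The objectives in Eqs. (4)-(6) can be checked numerically on a toy problem. The sketch below assumes a one-parameter linear model with a squared-error loss and replaces federated training by a coarse grid search over $\theta$; the datasets, the model, and the grid are hypothetical and serve only to show how the size-weighted global objective is assembled from the local ones.

```python
def sample_loss(theta, x, y):
    """Squared error of the toy model y_hat = theta * x (assumed loss)."""
    return (theta * x - y) ** 2


def local_objective(theta, dataset):
    """F_i(theta): mean per-sample loss over the local dataset D_i."""
    return sum(sample_loss(theta, x, y) for x, y in dataset) / len(dataset)


def global_objective(theta, datasets):
    """F(theta): local objectives weighted by |D_i| / |D|."""
    total = sum(len(d) for d in datasets)
    return sum(len(d) * local_objective(theta, d) for d in datasets) / total


datasets = [
    [(1.0, 2.0), (2.0, 4.0)],  # node 1 follows y = 2x
    [(1.0, 3.0)],              # node 2 follows y = 3x
]

# Grid search stands in for FL training in this illustration.
candidates = [k / 10 for k in range(0, 41)]
theta_star = min(candidates, key=lambda t: global_objective(t, datasets))
print(theta_star)
```

The minimizer lands closer to node 1's slope than to node 2's, because node 1 holds more samples and therefore carries more weight in Eq. (5).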

After convergence, the global model outputs a deployability score for strategy $S$, reflecting the probability that $S$ can achieve its goal set $G'$ across heterogeneous IIoT nodes. High-scoring strategies are approved for configuration and deployment, while low-scoring ones are refined or re-evaluated. Furthermore, to achieve efficient and scalable federated evaluation across heterogeneous IIoT nodes, a strategy-similarity-aware federated learning mechanism is employed, which is discussed in Section IV.

III-D Network Configurations

After the intent analysis module verifies that a candidate strategy $S$ satisfies the performance and safety requirements, the strategy proceeds to the network configuration stage for deployment in the industrial environment. In this stage, the verified intent is translated into executable control commands that are delivered to the corresponding network elements and industrial devices.

The action set $A'$ is mapped to concrete configuration commands for each entity $e \in E'$, which can be abstracted as

$$c_e = \Phi_e(S'), \quad \forall e \in E', \tag{7}$$

where $c_e$ denotes the configuration state of entity $e$ and $\Phi_e(\cdot)$ represents the configuration mapping implemented by the controller.

During deployment, real-time telemetry data, such as latency, bandwidth utilization, equipment status, and workload metrics, are continuously collected and compared with the expected performance objectives defined during strategy generation. Let ~n(t)\tilde{\ell}_{n}(t) denote the measured value of metric n\ell_{n} at time tt. The satisfaction indicator of goal gng_{n} at time tt is defined as

σn(t)={1,if ~n(t)θn,0,otherwise.\sigma_{n}(t)=\begin{cases}1,&\text{if }\tilde{\ell}_{n}(t)\ \triangleright\ \theta_{n},\\[2.0pt] 0,&\text{otherwise.}\end{cases} (8)

and the overall satisfaction of strategy SS^{\prime} at time tt is given by

JS(t)=n=1|G|σn(t),J_{S}(t)=\prod_{n=1}^{|G^{\prime}|}\sigma_{n}(t), (9)

where JS(t)=1J_{S}(t)=1 indicates that all goals in GG^{\prime} are satisfied and JS(t)=0J_{S}(t)=0 otherwise.

Over a deployment window $T$, the empirical satisfaction probability of $S^{\prime}$ is computed as

$$p_{S}=\frac{1}{|T|}\sum_{t\in T}J_{S}(t), \tag{10}$$

where $|T|$ denotes the number of observation instants in $T$. When deviations from the desired targets are detected, e.g., when $p_{S}<p_{\min}$ for a predefined reliability threshold $p_{\min}\in(0,1)$, the system dynamically adjusts configuration parameters or triggers re-verification through the federated evaluation process. This adaptive feedback ensures that each deployed strategy remains valid and stable even under varying network conditions or workload fluctuations.
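As an illustration of Eqs. (8) to (10), the following minimal sketch computes the per-goal indicators, the overall satisfaction, and the empirical satisfaction probability over a window. It assumes that each goal carries its own comparison operator standing in for $\triangleright$; the metric names, thresholds, telemetry samples, and the value $p_{\min}=0.9$ are hypothetical and chosen only for the example.

```python
import operator
from typing import Callable, Dict, List, Tuple

# A goal g_n is modelled as (comparison operator, threshold theta_n).
# The operator plays the role of the generic relation in Eq. (8) (assumption).
Goal = Tuple[Callable[[float, float], bool], float]


def goal_satisfied(measured: float, goal: Goal) -> int:
    """Indicator sigma_n(t) of Eq. (8): 1 if the measured value meets the goal."""
    compare, threshold = goal
    return 1 if compare(measured, threshold) else 0


def strategy_satisfied(sample: Dict[str, float], goals: Dict[str, Goal]) -> int:
    """J_S(t) of Eq. (9): product of indicators, so 1 only if every goal holds."""
    result = 1
    for metric, goal in goals.items():
        result *= goal_satisfied(sample[metric], goal)
    return result


def satisfaction_probability(window: List[Dict[str, float]],
                             goals: Dict[str, Goal]) -> float:
    """p_S of Eq. (10): fraction of observation instants with J_S(t) = 1."""
    return sum(strategy_satisfied(s, goals) for s in window) / len(window)


# Hypothetical goals: latency at most 10 ms, throughput at least 50 Mbps.
goals = {
    "latency_ms": (operator.le, 10.0),
    "throughput_mbps": (operator.ge, 50.0),
}
# Hypothetical telemetry samples collected over the deployment window.
window = [
    {"latency_ms": 8.0, "throughput_mbps": 60.0},   # both goals met
    {"latency_ms": 12.0, "throughput_mbps": 70.0},  # latency goal violated
    {"latency_ms": 9.5, "throughput_mbps": 55.0},   # both goals met
    {"latency_ms": 7.0, "throughput_mbps": 40.0},   # throughput goal violated
]
p_s = satisfaction_probability(window, goals)
needs_reverification = p_s < 0.9  # p_min = 0.9 is an illustrative choice
```

In this toy window two of the four instants satisfy all goals, so $p_{S}$ falls below the illustrative threshold and the strategy would be flagged for adjustment or re-verification.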

The network configuration module bridges the gap between intent-level decision-making and operational execution. It ensures that every strategy applied in the IIoT system is validated, explainable, and adaptive to dynamic industrial environments, thereby enabling trustworthy and autonomous operation within the intent-based networking framework.

IV Strategy Similarity Aware Asynchronous Federated Learning

In FEIBN, federated learning is employed to enable distributed policy verification. Traditional FL methods are primarily designed for general-purpose tasks and therefore cannot effectively distinguish which nodes possess the historical knowledge most relevant to the current strategy, nor can they leverage such relevance to guide efficient model training. To address this limitation, we design SSAFL, which introduces a strategy similarity metric to quantify the semantic closeness between the current strategy and each node’s historical strategy set. SSAFL adaptively selects nodes that are both semantically aligned and resource-sufficient, ensuring that nodes with the highest contribution value participate more substantially in the FL process. Furthermore, SSAFL incorporates a similarity-driven asynchronous update mechanism to prioritize meaningful model uploads and aggregation. As shown in Fig. 2, each node evaluates its strategy similarity score and resource availability score, which together determine its adaptability score for participation in the current federated round. Nodes with adaptability scores exceeding the upload threshold are selected to upload their local model updates to the server, while the others are temporarily excluded from the aggregation process. The server then performs a weighted aggregation to update the global model and redistributes it to the nodes that contributed updates. This mechanism ensures that nodes with higher semantic relevance to the current strategy and sufficient computational resources contribute more effectively to the global optimization process.

Figure 2: The process of SSAFL. Nodes first evaluate their strategy similarity and resource availability, and only those satisfying the selection criteria join the current round. Selected nodes receive the global model, perform local training, and compute adaptability scores. Nodes whose adaptability meets the upload threshold send updates asynchronously to the server. The server aggregates the received updates and distributes the refined global model.

IV-A Strategy Similarity Based Node Selection Scheme

To accurately quantify the similarity between the current strategy $S$ and the historical strategies maintained by nodes in FEIBN, we design a strategy similarity metric. This metric is decomposed into two components, action similarity and condition similarity, while resource status enters separately through the suitability score in Eq. (13). The strategy similarity score of node $i$ for strategy $S$ is defined as

$$Sim_{i}(S)=\gamma_{1}\frac{\left|A\cap A_{i,j}\right|}{\left|A\cup A_{i,j}\right|}+\gamma_{2}\frac{1}{|\mathcal{G}|}\sum_{c\in\mathcal{C}}\max_{c^{\prime}\in c_{i,j}}h(g,g^{\prime}), \tag{11}$$

where $\gamma_{1},\gamma_{2}\in\left[0,1\right]$ are weights satisfying $\gamma_{1}+\gamma_{2}=1$. The term $\frac{\left|A\cap A_{i,j}\right|}{\left|A\cup A_{i,j}\right|}$ denotes the action similarity, which evaluates the overlap between the action sets of the two strategies and is calculated using the Jaccard similarity coefficient. Here $\left|\bullet\right|$ denotes the cardinality of a set; a value of 1 indicates identical action sets, and a value of 0 indicates no common actions. The term $\frac{1}{|\mathcal{G}|}\sum_{c\in\mathcal{C}}\max_{c^{\prime}\in c_{i,j}}h(g,g^{\prime})$ denotes the condition similarity, which measures the degree of alignment between the conditions under which actions are applied. $h\left(g,g^{\prime}\right)$ is a pairwise condition similarity function, which can be defined as

$$h(g,g^{\prime})=\begin{cases}\exp\left(-\alpha_{g}\left(\frac{|\mu_{g}-\mu_{g^{\prime}}|}{\mu_{g}}\right)\right),&\text{if metrics match,}\\ 0,&\text{otherwise,}\end{cases} \tag{12}$$

where $\mu_{g}$ and $\mu_{g^{\prime}}$ are the thresholds of conditions $c$ and $c^{\prime}$, and $\alpha_{g}>0$ is a scaling factor controlling the sensitivity to threshold differences. $h\left(g,g^{\prime}\right)$ adopts an exponential decay formulation to measure the semantic closeness between two intent conditions. It ensures that two conditions exhibit a high similarity score when they involve the same performance metric and their thresholds are close, while their similarity decreases rapidly as the threshold gap widens [r39]. Such behavior naturally reflects the semantics of intent conditions in IBN, where even small deviations in latency, loss, or throughput constraints may lead to significantly different operational requirements.
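To make Eqs. (11) and (12) concrete, the sketch below scores the current strategy against a single historical strategy of a node. It rests on several assumptions that are not stated in the text: actions are represented as sets of strings, conditions as (metric, threshold) pairs with positive thresholds, two empty action sets count as identical, and a single scaling factor is shared by all conditions. How the scores of several historical strategies $j$ are combined is left open here.

```python
import math
from typing import List, Set, Tuple

Condition = Tuple[str, float]  # (metric name, threshold mu); representation is an assumption


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Action similarity: |A ∩ A'| / |A ∪ A'| (first term of Eq. (11))."""
    union = a | b
    # Convention chosen for this sketch: two empty action sets are identical.
    return len(a & b) / len(union) if union else 1.0


def h(g: Condition, g_prime: Condition, alpha: float = 1.0) -> float:
    """Pairwise condition similarity of Eq. (12); assumes mu_g > 0."""
    metric, mu = g
    metric_p, mu_p = g_prime
    if metric != metric_p:
        return 0.0
    return math.exp(-alpha * abs(mu - mu_p) / mu)


def condition_similarity(current: List[Condition],
                         historical: List[Condition],
                         alpha: float = 1.0) -> float:
    """Average over current conditions of the best match in the historical set."""
    if not current:
        return 0.0
    total = sum(max((h(g, gp, alpha) for gp in historical), default=0.0)
                for g in current)
    return total / len(current)


def strategy_similarity(actions: Set[str], conditions: List[Condition],
                        hist_actions: Set[str], hist_conditions: List[Condition],
                        gamma1: float = 0.5, gamma2: float = 0.5,
                        alpha: float = 1.0) -> float:
    """Sim_i(S) of Eq. (11) with gamma1 + gamma2 = 1."""
    return (gamma1 * jaccard(actions, hist_actions)
            + gamma2 * condition_similarity(conditions, hist_conditions, alpha))


# Hypothetical strategies used only to exercise the functions.
actions = {"reroute", "throttle"}
conditions = [("latency_ms", 10.0), ("loss_rate", 0.01)]
hist_actions = {"reroute", "prioritize"}
hist_conditions = [("latency_ms", 10.0)]
sim = strategy_similarity(actions, conditions, hist_actions, hist_conditions)
```

Here the action sets share one of three distinct actions and only the latency condition finds a matching metric, so both terms contribute partially to the score.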

To efficiently select IIoT nodes for federated training in FEIBN, we design a suitability score $H_{i}$ that evaluates each node’s potential contribution based on two key factors: strategy similarity and resource availability. The suitability score guides the asynchronous training process by preferentially selecting nodes most relevant to the current validation task. For a node $i$ and a target strategy $S$, the suitability score $H_{i}$ is defined as

$$H_{i}=\beta_{1}{Sim}_{i}\left(S\right)+\beta_{2}{Res}_{i}, \tag{13}$$

where ${Res}_{i}$ denotes the current resource status of node $i$, and $\beta_{1},\beta_{2}\in\left[0,1\right]$ are weights satisfying $\beta_{1}+\beta_{2}=1$.

The resource availability score ${Res}_{i}$ captures the computational and communication readiness of the node and is computed as

$$Res_{i}=\delta_{1}\left(1-U_{i}\right)+\delta_{2}B_{i}, \tag{14}$$

where $U_{i}$ denotes the normalized CPU utilization of node $i$, $B_{i}$ denotes the normalized available communication bandwidth, and $\delta_{1},\delta_{2}\in\left[0,1\right]$ are resource-specific importance weights satisfying $\delta_{1}+\delta_{2}=1$.

Given a threshold $\tau_{s}$, node $i$ is selected to participate in the current training round if $H_{i}\geq\tau_{s}$. Otherwise, it remains idle for this round.
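The selection rule built from Eqs. (13) and (14) can be sketched as follows. The node records, weight values, and the threshold are hypothetical; the sketch only assumes that the similarity score, CPU utilization, and bandwidth are already normalized to $[0,1]$.

```python
from typing import Dict, List


def resource_score(cpu_util: float, bandwidth: float,
                   delta1: float = 0.5, delta2: float = 0.5) -> float:
    """Res_i of Eq. (14): idle CPU share plus normalized available bandwidth."""
    return delta1 * (1.0 - cpu_util) + delta2 * bandwidth


def suitability(sim: float, res: float,
                beta1: float = 0.6, beta2: float = 0.4) -> float:
    """H_i of Eq. (13): weighted mix of strategy similarity and resource score."""
    return beta1 * sim + beta2 * res


def select_nodes(nodes: Dict[str, Dict[str, float]], tau_s: float) -> List[str]:
    """Return the nodes whose suitability score reaches the threshold tau_s."""
    selected = []
    for name, n in nodes.items():
        h_i = suitability(n["sim"], resource_score(n["cpu"], n["bw"]))
        if h_i >= tau_s:
            selected.append(name)
    return selected


# Hypothetical node states: similarity, CPU utilization, available bandwidth.
nodes = {
    "n1": {"sim": 0.9, "cpu": 0.2, "bw": 0.8},  # relevant and lightly loaded
    "n2": {"sim": 0.3, "cpu": 0.9, "bw": 0.4},  # weakly related and busy
    "n3": {"sim": 0.5, "cpu": 0.5, "bw": 0.5},  # middling on both factors
}
participants = select_nodes(nodes, tau_s=0.45)
```

With these illustrative values, the busy and weakly related node falls below the threshold and stays idle, while the other two join the round.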

IV-B Adaptive Model Training and Updating

To efficiently validate strategies in FEIBN, we adopt an asynchronous FL approach, where node participation and model updates occur independently based on each node’s readiness and relevance to the current validation task. Upon receiving the current validation strategy $S$ from the server, each selected node $i$ initiates local training. Each node computes the $L_{2}$ norm of its local model update, denoted as

$$\left\|\Delta\theta_{i}^{t}\right\|_{2}=\left\|\theta_{i}^{t}-\theta_{\text{global}}^{t-1}\right\|_{2}, \tag{15}$$

where $\theta_{i}^{t}$ denotes the node’s local model parameters after training and $\theta_{\text{global}}^{t-1}$ denotes the latest global model parameters received by the node before local training. We define $\left\|\Delta\theta_{i}^{t}\right\|_{2}$ as the distance between the model trained by node $i$ and the global model.

Each node is assigned an update threshold and uploads its update only when the update norm reaches this threshold. The update threshold is defined as

$$\epsilon_{i}=\epsilon_{\text{base}}\times\left(1+\lambda_{s}\times\left(1-{Sim}_{i}\left(S\right)\right)\right), \tag{16}$$

where $\epsilon_{\text{base}}$ is the base threshold value and $\lambda_{s}$ is the scaling factor controlling the influence of similarity on the threshold.

The node uploads its model update $\Delta\theta_{i}^{t}$ to the server if and only if $\left\|\Delta\theta_{i}^{t}\right\|_{2}\geq\epsilon_{i}$. Otherwise, the node continues to train its local model until the model distance reaches the threshold, thus avoiding unnecessary communication overhead.
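The upload test of Eqs. (15) and (16) amounts to comparing a distance with a similarity-dependent threshold. The following minimal sketch treats model parameters as flat lists of floats, which is a simplification made for this example; the numeric values are hypothetical.

```python
import math
from typing import Sequence


def l2_distance(theta_local: Sequence[float], theta_global: Sequence[float]) -> float:
    """Update norm of Eq. (15): Euclidean distance between local and global parameters."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_local, theta_global)))


def upload_threshold(sim: float, eps_base: float, lambda_s: float) -> float:
    """Personalized threshold of Eq. (16): lower for nodes with higher similarity."""
    return eps_base * (1.0 + lambda_s * (1.0 - sim))


def should_upload(theta_local: Sequence[float], theta_global: Sequence[float],
                  sim: float, eps_base: float, lambda_s: float) -> bool:
    """Upload only if the update norm reaches the node's threshold."""
    return l2_distance(theta_local, theta_global) >= upload_threshold(sim, eps_base, lambda_s)


# Hypothetical parameters: the update has norm 5 (a 3-4-5 triangle).
theta_global = [0.0, 0.0]
theta_local = [3.0, 4.0]
high_sim_uploads = should_upload(theta_local, theta_global, sim=0.9, eps_base=4.0, lambda_s=0.5)
low_sim_uploads = should_upload(theta_local, theta_global, sim=0.2, eps_base=4.0, lambda_s=0.5)
```

The same update passes the threshold of the high-similarity node (4.2) but not that of the low-similarity node (5.6), which illustrates how similarity biases uploads toward task-relevant nodes.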

When a node $i$ uploads its local model update $\Delta\theta_{i}^{t}$ to the server after passing the upload threshold, the server performs asynchronous aggregation immediately without waiting for other nodes. The server receives $\Delta\theta_{i}^{t}$ and computes the preliminary weight $w_{i}^{\prime}$ as

$$w_{i}^{\prime}={Sim}_{i}\left(S\right)\times\left\|\Delta\theta_{i}^{t}\right\|_{2}. \tag{17}$$

To avoid the situation where important nodes contribute insignificantly due to small update magnitudes, we introduce a minimum weight protection mechanism. The final aggregation weight $w_{i}$ is defined as

$$w_{i}=\max\left(w_{\min},\frac{w_{i}^{\prime}}{\sum_{j\in Q(t)}w_{j}^{\prime}}\right), \tag{18}$$

where $w_{\min}$ is a predefined minimum weight threshold and $Q\left(t\right)$ denotes the set of nodes whose updates have been received by the server in the current aggregation window. The server asynchronously updates the global model using

$$\theta_{\text{global}}^{t+1}=\theta_{\text{global}}^{t}+w_{i}\Delta\theta_{i}^{t}. \tag{19}$$
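A minimal sketch of the weighting in Eqs. (17) to (19) is given below. It follows the order used in Algorithm 2 (normalize the pre-weights, apply the minimum-weight floor, then renormalize) and applies the update to all buffered nodes at once. Parameters are flat lists of floats and all numeric values are hypothetical; the sketch assumes at least one buffered update with a positive pre-weight.

```python
from typing import Dict, List


def aggregation_weights(sims: Dict[str, float], norms: Dict[str, float],
                        w_min: float) -> Dict[str, float]:
    """Pre-weights (Eq. (17)), min-weight protection (Eq. (18)), renormalization."""
    pre = {i: sims[i] * norms[i] for i in sims}          # w'_i = Sim_i(S) * ||delta_i||_2
    total = sum(pre.values())                            # assumed to be positive
    protected = {i: max(w_min, w / total) for i, w in pre.items()}
    z = sum(protected.values())
    # After renormalization a floored weight can end up slightly below w_min.
    return {i: w / z for i, w in protected.items()}


def apply_updates(theta: List[float], updates: Dict[str, List[float]],
                  weights: Dict[str, float]) -> List[float]:
    """Global step of Eq. (19), summed over all buffered nodes."""
    new_theta = list(theta)
    for i, delta in updates.items():
        for k, d in enumerate(delta):
            new_theta[k] += weights[i] * d
    return new_theta


# Hypothetical buffer with one dominant node and one low-magnitude node.
sims = {"a": 0.9, "b": 0.1}
norms = {"a": 1.0, "b": 0.5}
weights = aggregation_weights(sims, norms, w_min=0.1)
theta_next = apply_updates([0.0, 0.0], {"a": [1.0, 0.0], "b": [0.0, 1.0]}, weights)
```

Without the floor, node b would receive roughly 5% of the total weight; with it, its share rises to about 9.5%, so a relevant but small update is not drowned out.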

IV-C Problem Formulation

We define the communication cost incurred by node $i$ after the $t$-th round of local training as $\Gamma_{i}^{t}$. Each time the local model $\theta_{i}^{t}$ satisfies $\left\|\Delta\theta_{i}^{t}\right\|_{2}\geq\epsilon_{i}$ and is uploaded, the cost increases by one. The communication cost of the node is therefore expressed as

$$\Gamma_{i}^{t}=\begin{cases}\Gamma_{i}^{t-1}+1,&\text{if }\|\Delta\theta_{i}^{t}\|_{2}\geq\epsilon_{i},\\ \Gamma_{i}^{t-1},&\text{otherwise}.\end{cases} \tag{20}$$

In FEIBN, the objective of SSAFL is to minimize the overall communication cost throughout the federated validation process while ensuring that the final global model achieves acceptable validation accuracy. The communication cost of each client throughout the training process is abbreviated as $\Gamma_{i}=\sum_{t=1}^{T_{i}}\Gamma_{i}^{t}$, where $T_{i}$ denotes the number of rounds trained by the $i$-th node. Then, the objective function can be formulated as

$$\begin{aligned}&\min\sum_{i=1}^{I}\Gamma_{i}\\ &\text{s.t.}\quad F(\theta^{t})\leq F(\theta^{*})+\nu,\\ &\qquad\ \ \|\Delta\theta_{i}^{t}\|_{2}\geq\epsilon_{i},\end{aligned} \tag{21}$$

where $\theta^{\ast}$ is the optimal FL training model, and $\nu$ is a constant.

IV-D Algorithm Design and Explanation

Algorithm 1 Client-side Local Training with Thresholded Upload
Input: Received $(\theta^{t},S,\epsilon_{i})$; local data $D_{i}$; local epochs $E_{i}$; stepsize $\eta$
1: $\theta_{i}\leftarrow\theta^{t}$
2: repeat
3:   // Local training against $F_{i}(w)$ in Eq. (4)
4:   for $e=1$ to $E_{i}$ do
5:     $\theta_{i}\leftarrow\theta_{i}-\eta\nabla F_{i}(\theta_{i})$
6:   end for
7:   $\Delta\theta_{i}\leftarrow\theta_{i}-\theta^{t}$; $d_{i}\leftarrow\|\Delta\theta_{i}\|_{2}$ {Eq. (15)}
8:   if $d_{i}\geq\epsilon_{i}$ then
9:     Upload $\Delta\theta_{i}$ (and metadata such as $|D_{i}|,t$) to server
10:     wait for next $\theta^{t+1}$ from server; $\theta^{t}\leftarrow\theta^{t+1}$; $\theta_{i}\leftarrow\theta^{t}$
11:   else
12:     Continue local training to accumulate updates
13:   end if
14: until receive signal from server
Algorithm 2 Server-side SSAFL in FEIBN
Input: Intent tuple $S=\langle U,G,E,A,T\rangle$, initial global model $\theta^{0}$; weight sets $\gamma,\beta,\delta$; thresholds $\tau_{s},\epsilon_{\text{base}}$; scale $\lambda_{s}$; minimum weight $w_{\min}$; stopping tolerance $\nu$.
1: // Node scoring and selection uses Eqs. (11),(13),(14)
2: for each node $i$ do
3:   Compute $Sim_{i}(S)$ by Eq. (11); compute $Res_{i}$ by Eq. (14);
4:   $H_{i}\leftarrow\beta_{1}\,Sim_{i}(S)+\beta_{2}\,Res_{i}$ {Eq. (13)}
5: end for
6: $P\leftarrow\{\,i\mid H_{i}\geq\tau_{s}\,\}$ {select by threshold}
7: // Personalized upload thresholds uses Eq. (16)
8: for each $i\in P$ do
9:   $\epsilon_{i}\leftarrow\epsilon_{\text{base}}\big(1+\lambda_{s}(1-Sim_{i}(S))\big)$
10:   Send $(\theta^{t},S,\epsilon_{i})$ to client $i$
11: end for
12: // Event-driven asynchronous aggregation uses Eqs. (15),(17),(19)
13: Initialize a short event window $\Delta$ and buffer $Q(t)=\varnothing$
14: loop
15:   Upon receiving update $\Delta\theta_{i}$ from any $i\in P$:
16:   if $\|\Delta\theta_{i}\|_{2}\geq\epsilon_{i}$ then
17:     $Q(t)\leftarrow Q(t)\cup\{i\}$
18:     $w^{\prime}_{i}\leftarrow Sim_{i}(S)\cdot\|\Delta\theta_{i}\|_{2}$ {Eq. (17)}
19:   end if
20:   if window $\Delta$ expires or $|Q(t)|\geq 1$ then
21:     Min-weight protection & normalization:
22:     $\tilde{w}_{i}\leftarrow\frac{w^{\prime}_{i}}{\sum_{j\in Q(t)}w^{\prime}_{j}}$, $w_{i}\leftarrow\max\{w_{\min},\tilde{w}_{i}\}$, $w_{i}\leftarrow\frac{w_{i}}{\sum_{j\in Q(t)}w_{j}}$
23:     $\theta^{t+1}\leftarrow\theta^{t}+\sum_{j\in Q(t)}w_{j}\,\Delta\theta_{j}$ {asynchronous update}
24:     Update communication counters $\Gamma_{i}$ by Eq. (20); clear $Q(t)$
25:     $t\leftarrow t+1$
26:   end if
27:   if $F(\theta^{t})\leq F(\theta^{\ast})+\nu$ or $t\geq T_{\max}$ then
28:     break
29:   end if
30: end loop
31: return $\theta^{t}$

The proposed SSAFL training process consists of two components: a client-side training procedure (Algorithm 1) and a server-side coordination mechanism (Algorithm 2). The client module handles local training and decides whether to upload updates based on an update norm threshold. The server module computes similarity-aware participation scores to select relevant nodes and aggregates valid updates asynchronously.

Algorithm 1 specifies the behavior of each participating client. After initialization with the received global model $\theta^{t}$, intent tuple $S$, and threshold $\epsilon_{i}$, the client performs local SGD training (Lines 3–6) according to Eq. (4). It then computes the update $\Delta\theta_{i}=\theta_{i}-\theta^{t}$ and its $L_{2}$ norm (Line 7, Eq. (15)). If the update magnitude reaches the threshold $\epsilon_{i}$ (Lines 8–10), the client uploads $\Delta\theta_{i}$ to the server and waits for the next global model. Otherwise, it continues local training to accumulate larger updates (Lines 11–12), thereby avoiding unnecessary communication. The process repeats until a stop signal is issued by the server (Line 14).
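The client-side loop can be sketched in a few lines under simplifying assumptions: parameters are flat lists of floats, the local objective is supplied through a full-gradient function, and a cap on the number of training blocks (`max_blocks`, a hypothetical parameter) stands in for the server's stop signal. The quadratic objective in the usage part is a toy example, not the loss used in the paper.

```python
import math
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def client_round(theta_global: Vector,
                 grad: Callable[[Vector], Vector],
                 eta: float, epochs: int, eps_i: float,
                 max_blocks: int = 100) -> Tuple[Optional[Vector], int]:
    """Train in blocks of `epochs` gradient steps until the update norm reaches eps_i.

    Returns (delta, blocks_used); delta is None if the threshold was never
    reached before max_blocks, i.e. nothing would be uploaded.
    """
    theta = list(theta_global)
    for block in range(1, max_blocks + 1):
        for _ in range(epochs):                      # local training steps
            g = grad(theta)
            theta = [t - eta * gk for t, gk in zip(theta, g)]
        delta = [t - t0 for t, t0 in zip(theta, theta_global)]
        if math.sqrt(sum(d * d for d in delta)) >= eps_i:
            return delta, block                      # threshold reached: upload
        # otherwise keep training to accumulate a larger update
    return None, max_blocks


# Toy objective F_i(theta) = 0.5 * ||theta - target||^2, so grad = theta - target.
target = [1.0, 0.0]


def toy_grad(theta: Vector) -> Vector:
    return [t - c for t, c in zip(theta, target)]


delta, blocks = client_round([0.0, 0.0], toy_grad, eta=0.1, epochs=1, eps_i=0.25)
```

In this toy run the distance to the global model grows as 0.1, 0.19, 0.271 over successive blocks, so the client uploads after the third block instead of after every one.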

Algorithm 2 describes the federated training and aggregation procedure executed by the central server. Lines 1–5 compute the strategy similarity $Sim_{i}(S)$ (Eq. (11)) and resource availability $Res_{i}$ (Eq. (14)) for each node, then derive the suitability score $H_{i}$ using Eq. (13). Line 6 selects nodes with $H_{i}\geq\tau_{s}$ to participate in training, ensuring only task-relevant and resource-capable nodes are involved. Lines 7–11 set personalized upload thresholds $\epsilon_{i}$ according to Eq. (16), making high-similarity nodes more likely to upload. Lines 12–30 form the asynchronous event-driven loop: updates are received and pre-weights $w^{\prime}_{i}$ are computed by Eq. (17) (Lines 15–19); micro-batch aggregation is then triggered (Lines 20–26), where minimum weight protection and normalization are applied before updating the global model via Eq. (19). The communication counters $\Gamma_{i}$ are updated following Eq. (20). Finally, convergence is checked (Lines 27–29) based on Eq. (21), and the global model $\theta^{t}$ is returned (Line 31).

The computational cost of SSAFL follows the same order as standard synchronous and asynchronous FL. For each aggregation event the server requires $O(|Q(t)|)$ operations to normalize weights and update the global model, where $|Q(t)|$ is the size of the micro-batch. The client-side training follows the standard stochastic gradient descent (SGD) procedure and thus retains a complexity of $O(E_{i}\cdot|D_{i}|)$ per local training round, where $E_{i}$ is the number of local epochs and $|D_{i}|$ the dataset size [r40]. Regarding communication, each client transmits its update only when the update norm reaches the threshold $\epsilon_{i}$ of Eq. (16). The number of transmissions per client is therefore at most $T_{i}$, and strictly smaller whenever some rounds fall below the threshold.

Following the FL convergence conditions defined in [r30], the convergence of the proposed SSAFL update rule can be analyzed within the asynchronous federated optimization framework of [r21]. The detailed convergence analysis is provided in the appendix (Convergence Analysis of SSAFL).

V Numerical Results

V-A Experimental Setting

We model the strategy validation problem as a regression task, where the goal is to predict the effectiveness score of a given strategy unit $S=\langle U,G,E,A,T\rangle$ within its contextual environment. The predicted value $\hat{y}$ is employed to approximate the true deployable outcome $y$.

Experimental Environment. The experiments were carried out on a computing platform running Ubuntu 22.04.5 LTS, equipped with an Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz and 4 × NVIDIA RTX 3090 GPUs. The experiments were implemented in Python 3.9, with federated training simulated using the FedML framework.

Figure 3: Matching accuracy of strategy tuples under different LLMs.
Figure 4: Federated evaluation number under different matching accuracies.

Datasets. The dataset used in the experiments consists of two components. The first part is device parameter data obtained from the publicly available Edge-IIoTset [r19] dataset, which includes real device operation logs and sensor parameters across various IIoT scenarios, thereby providing a representative reflection of IIoT node behaviors and characteristics under different operating conditions. The second part is intent-related data, which encompasses common business requirements in IIoT scenarios, such as bandwidth allocation, latency constraints, and energy–throughput trade-offs. In our setup, each client holds heterogeneous data sources, which naturally form a feature-skew non-IID distribution. Moreover, since the performance gains of SSAFL stem primarily from its similarity-aware scoring mechanism and asynchronous evaluation dynamics rather than from dataset-specific statistical properties, the same qualitative trends are expected to hold across different datasets.

Methods. We conducted comparative experiments on several federated learning strategies, including FedAvg [r20], Federated Asynchronous Learning (FedAsyn) [r21], and Semi-Asynchronous FL (SemiAsyn) [r22]. In FedAsyn, the server updates the global model immediately upon receiving an update from any client, whereas in SemiAsyn, the server performs an update once it has received updates from the top $k$ clients.

V-B FEIBN Performance Comparison

To evaluate the contribution of the multimodal alignment module, we analyze the accuracy of the generated strategy tuples $S$. As shown in Fig. 3, the alignment module notably improves the precision of slot prediction, with the most significant gain observed in the “Action” slot. This indicates that multimodal semantic fusion helps the model capture complex operational intents that cannot be fully expressed in text alone.

Fig. 4 shows the variation in the number of federated evaluations under different matching accuracies. As alignment accuracy increases from 0.6 to 0.9, the number of evaluations performed decreases significantly. This result indicates that higher alignment quality enhances the semantic consistency of the strategies generated by the LLMs (i.e., GPT-5.1 and DeepSeek-V3.2), enabling the system to make more accurate and confident decisions. Consequently, fewer redundant verifications are required, thereby improving the overall efficiency of the federated evaluation process.

Figure 5: Deployment time of different methods.

Fig. 5 shows the total time required for strategy deployment across different methods. Adding only the alignment module slightly increases the deployment time due to the additional semantic parsing process. In contrast, FEIBN, which integrates both alignment and federated evaluation, incurs a higher overall time cost, especially under lower alignment accuracy such as FEIBN-0.6, where more verification rounds are required. As the alignment accuracy increases to FEIBN-0.9, the deployment time decreases accordingly, indicating that improved alignment quality enhances the efficiency of federated validation and reduces the number of verifications.

V-C SSAFL Performance Comparison

We randomly assign each node a subset of the training data from the dataset as its local training set, while the test set is retained on the server for performance evaluation. Following previous experimental settings, we compare SSAFL with other FL methods, with each method repeated five times. In addition, an ablation experiment is conducted on the adaptive model aggregation at the server side within SSAFL to verify the impact of this controllable factor on model training. When SSAFL does not include adaptive aggregation, it is denoted as SSAFL*. The experimental results are reported in Table IV as point estimates using the mean ± standard deviation.

TABLE IV: Comparison of different methods.
| Method | MAE ↓ | RMSE ↓ | $R^{2}$ |
|---|---|---|---|
| FedAvg | 0.0637 ± 0.023 | 0.0677 ± 0.035 | 0.8398 ± 0.17 |
| FedAsyn | 0.0865 ± 0.036 | 0.0921 ± 0.033 | 0.7462 ± 0.28 |
| SemiAsyn | 0.0594 ± 0.017 | 0.0629 ± 0.022 | 0.8840 ± 0.11 |
| SSAFL* | 0.0541 ± 0.023 | 0.0597 ± 0.015 | 0.8703 ± 0.08 |
| SSAFL | 0.0497 ± 0.011 | 0.0521 ± 0.017 | 0.9177 ± 0.12 |
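For reference, the three metrics reported in Table IV follow their standard definitions and can be computed as in the sketch below. The arrays are toy values chosen for easy checking, not experimental data, and the use of the sample standard deviation for the "mean ± standard deviation" summary is an assumption, since the text does not say which estimator was used.

```python
import math
import statistics
from typing import Sequence, Tuple


def mae(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute error between true and predicted scores."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root mean squared error between true and predicted scores."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def r2(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def summarize(runs: Sequence[float]) -> Tuple[float, float]:
    """Mean and sample standard deviation over repeated runs (estimator is an assumption)."""
    return statistics.mean(runs), statistics.stdev(runs)


# Toy values for illustration only.
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.75]
scores = (mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
run_mean, run_std = summarize([0.05, 0.06, 0.04])
```

Lower MAE and RMSE and higher $R^{2}$ indicate a better fit of the predicted deployability score $\hat{y}$ to the true outcome $y$.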
Figure 6: Model training curves of different methods.
Figure 7: Communication rounds of clients under different methods.

Fig. 6 illustrates the R²-based training curves of five federated learning methods. SSAFL achieves the best training performance among all compared methods, converging to an R² of 0.89 within only 15 epochs. Its ablated variant SSAFL* also performs well, validating the effectiveness of similarity-aware node selection. FedAvg and FedAsyn show slower convergence and lower final R² scores, around 0.85 and 0.83 respectively. Overall, these results highlight the advantages of combining intent-aware participation scoring and asynchronous communication in federated policy verification.

To evaluate the communication cost of different FL strategies under heterogeneous client latency, we configure Client 1, Client 5, and Client 10 as fast, medium, and slow clients, respectively, by assigning different local training times and upload delays. The experimental results are displayed in Fig. 7. Synchronous FedAvg produces identical communication rounds for all clients since each aggregation must wait for the slowest client. In contrast, asynchronous strategies show clear disparities. Fast clients upload much more frequently, while slow clients contribute fewer updates. SSAFL achieves the lowest communication rounds across all clients by suppressing redundant fast-client uploads and filtering low-impact updates from slow clients.

VI Conclusion

In this paper, we have proposed FEIBN, a Federated Evaluation Enhanced Intent-Based Networking framework tailored for IIoT environments. FEIBN leverages large language models to align heterogeneous multimodal intents into structured strategy tuples, and integrates federated learning to achieve distributed policy verification without exposing sensitive local data. To address the challenges of communication cost and training efficiency, we have further designed SSAFL, a Strategy Similarity Aware Federated Learning mechanism that combines similarity-aware node selection with adaptive asynchronous update thresholds. The experiments have demonstrated that SSAFL significantly improves model accuracy and convergence speed while reducing communication overhead compared with existing synchronous and asynchronous baselines. The ablation studies further validated the effectiveness of similarity-aware participation scoring and adaptive aggregation in enhancing federated policy verification.

Convergence Analysis of SSAFL

According to the convergence conditions in the FL definition given by [r30, r44], it is assumed that centralized learning converges to the optimal model parameter $\theta^{(c)}$ and FL converges to the optimal model parameter $\theta^{(f)}$. If the gap between the two is small enough, that is, $\|\theta^{(f)}-\theta^{(c)}\|<\rho$ for a sufficiently small constant $\rho>0$, the FL model is said to converge.

We analyze the proposed SSAFL under standard smoothness assumptions [r41, r42] for the global objective $F(\theta)=\sum_{i=1}^{I}p_{i}F_{i}(\theta)$, where $p_{i}=\frac{|D_{i}|}{\sum_{j}|D_{j}|}$. Recall that in each aggregation event, the server updates $\theta^{t+1}=\theta^{t}+\sum_{i\in Q(t)}w_{i}\Delta\theta_{i}^{t_{i}}$, where $Q(t)$ is the set of clients that arrived within the micro-batch window, $t_{i}\leq t$ is the (possibly stale) local generation time of $\Delta\theta_{i}^{t_{i}}$, and $w_{i}$ are similarity-aware aggregation weights after minimum-weight protection and renormalization. Each client $i$ uploads only if $\|\Delta\theta_{i}^{t_{i}}\|_{2}\geq\epsilon_{i}$, where $\epsilon_{i}=\epsilon_{\text{base}}(1+\lambda_{s}(1-Sim_{i}(S)))$.

-A Assumptions

  • A1 (L-smoothness): Each local objective $F_{i}$ is $L$-smooth: $\|\nabla F_{i}(\theta)-\nabla F_{i}(\theta^{\prime})\|\leq L\|\theta-\theta^{\prime}\|$; hence $F$ is $L$-smooth.

  • A2 (Unbiased local gradients & bounded variance): Local stochastic gradients are unbiased with variance bounded by $\sigma^{2}$: $\mathbb{E}[g_{i}(\theta)\,|\,\theta]=\nabla F_{i}(\theta)$ and $\mathbb{E}\|g_{i}(\theta)-\nabla F_{i}(\theta)\|^{2}\leq\sigma^{2}$.

  • A3 (Bounded staleness): The delay is bounded: $0\leq t-t_{i}\leq\tau_{\max}$.

  • A4 (Step sizes): Each client uses a constant stepsize $\eta\leq\frac{1}{2L}$ in local SGD, with a finite number of local steps per upload.

  • A5 (Weights): Aggregation weights satisfy $w_{i}\geq w_{\min}>0$ for $i\in Q(t)$ and $\sum_{i\in Q(t)}w_{i}=1$.

  • A6 (Trigger bias control): The upload rule acts as magnitude-based sparsification: there exists $\zeta\in[0,1)$ such that $\left\|\sum_{i\in Q(t)}\tilde{w}_{i}\Delta\theta_{i}^{t_{i}}-\sum_{i\in Q(t)}w_{i}\Delta\theta_{i}^{t_{i}}\right\|\leq\zeta\left\|\sum_{i\in Q(t)}\tilde{w}_{i}\Delta\theta_{i}^{t_{i}}\right\|$, where $\tilde{w}_{i}$ are the pre-weights before minimum-weight protection.

Assumption A6 is mild: with thresholded uploads and renormalization, the effective deviation from the pre-weighted update is bounded, and the bound tightens as $\epsilon_{\text{base}}$ or $\lambda_{s}$ decreases.

-B One-step Progress

By $L$-smoothness and the update rule, $F(\theta^{t+1})\leq F(\theta^{t})+\left\langle\nabla F(\theta^{t}),\sum_{i\in Q(t)}w_{i}\Delta\theta_{i}^{t_{i}}\right\rangle+\frac{L}{2}\left\|\sum_{i\in Q(t)}w_{i}\Delta\theta_{i}^{t_{i}}\right\|^{2}$. Each client’s local update with step size $\eta$ and $E_{i}$ steps satisfies $\mathbb{E}[\Delta\theta_{i}^{t_{i}}\,|\,\theta^{t_{i}}]\approx-\eta E_{i}\nabla F_{i}(\theta^{t_{i}})$ and $\mathbb{E}\|\Delta\theta_{i}^{t_{i}}\|^{2}\leq c_{1}\eta^{2}E_{i}^{2}(\|\nabla F_{i}(\theta^{t_{i}})\|^{2}+\sigma^{2})$ for some constant $c_{1}$ determined by the local optimizer. Using bounded staleness (A3) and smoothness, we relate stale gradients to current ones: $\|\nabla F_{i}(\theta^{t_{i}})-\nabla F_{i}(\theta^{t})\|\leq L\|\theta^{t_{i}}-\theta^{t}\|\leq c_{2}\eta\tau_{\max}$, which yields the following descent lemma.

Lemma 1 (Descent with staleness and trigger). Under A1–A6 and $\eta\leq\frac{1}{2L}$, $\mathbb{E}\big[F(\theta^{t+1})\big]\leq\mathbb{E}\big[F(\theta^{t})\big]-\eta E\Big(1-\frac{L\eta}{2}-\kappa_{\tau}-\kappa_{\zeta}\Big)\,\mathbb{E}\|\nabla F(\theta^{t})\|^{2}+c_{3}\eta^{2}\Sigma$, where $E=\sum_{i\in Q(t)}w_{i}E_{i}$ is the weighted number of local steps, $\kappa_{\tau}=\mathcal{O}(L\eta\tau_{\max})$ captures staleness, $\kappa_{\zeta}=\mathcal{O}(\zeta)$ captures trigger/renormalization bias, and $\Sigma=\sum_{i\in Q(t)}w_{i}^{2}\sigma^{2}$.

-C Main Results

Theorem 1 (Convex case). If $F$ is convex and bounded below by $F^{\star}$, then choosing a constant $\eta\leq\min\{\frac{1}{4L},\frac{1}{2L(\tau_{\max}+1)}\}$ yields $\frac{1}{T}\sum_{t=0}^{T-1}\mathbb{E}\|\nabla F(\theta^{t})\|^{2}\leq\mathcal{O}\left(\frac{F(\theta^{0})-F^{\star}}{\eta ET}\right)+\mathcal{O}\big(\eta\Sigma\big)+\mathcal{O}\big(L\eta\tau_{\max}\big)+\mathcal{O}(\zeta)$. In particular, with $\eta=\Theta(1/\sqrt{T})$ we obtain the standard sublinear rate $\mathcal{O}(1/\sqrt{T})$ in terms of gradient norm, and with constant $\eta$ we get $\mathcal{O}(1/T)$ plus a steady-state error governed by variance, staleness, and trigger bias.

Theorem 2 (PL condition). If $F$ satisfies the Polyak–Łojasiewicz (PL) inequality $\frac{1}{2}\|\nabla F(\theta)\|^{2}\geq\mu(F(\theta)-F^{\star})$ for some $\mu>0$, then for $\eta\leq\min\{\frac{1}{4L},\frac{\mu}{4L^{2}(\tau_{\max}+1)}\}$, $\mathbb{E}[F(\theta^{t+1})-F^{\star}]\leq(1-\mu\eta E)\,\mathbb{E}[F(\theta^{t})-F^{\star}]+c_{4}\big(\eta^{2}\Sigma+\eta L\tau_{\max}+\zeta\big)$, i.e., linear convergence to a neighborhood whose radius scales with variance $\Sigma$, staleness $\tau_{\max}$, and trigger bias $\zeta$.
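The "neighborhood" in Theorem 2 can be made explicit by unrolling the one-step recursion. The short derivation below is a sketch under two simplifying assumptions that are not stated in the theorem: $E$ and $\Sigma$ are treated as constant across aggregation events, and $0<\mu\eta E<1$. The shorthand $a_{t}$, $q$, and $b$ is introduced only for this derivation.

```latex
% Shorthand (introduced here): a_t = E[F(\theta^t) - F^\star],
% q = 1 - \mu\eta E \in (0,1),  b = c_4(\eta^2\Sigma + \eta L\tau_{\max} + \zeta).
\begin{aligned}
a_{t+1} &\le q\,a_{t} + b \\
\Rightarrow\quad a_{t} &\le q^{t}a_{0} + b\sum_{k=0}^{t-1}q^{k}
  = q^{t}a_{0} + b\,\frac{1-q^{t}}{1-q} \\
&\le (1-\mu\eta E)^{t}\,a_{0}
  + \frac{c_{4}\big(\eta^{2}\Sigma+\eta L\tau_{\max}+\zeta\big)}{\mu\eta E}.
\end{aligned}
```

The first term decays geometrically, while the second term is the residual radius; under these assumptions it shrinks when the variance, the staleness bound, or the trigger bias decreases.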

-D Remarks on Design Parameters

Thresholds & similarity. Larger similarity $Sim_{i}(S)$ gives smaller $\epsilon_{i}$ and hence more frequent uploads; this reduces $\zeta$ (smaller trigger bias) and tightens the neighborhood in Theorem 2, at the cost of more communication. Conversely, a larger $\lambda_{s}$ or $\epsilon_{\text{base}}$ shrinks traffic but increases $\zeta$.

Minimum-weight protection. Enforcing $w_{i}\geq w_{\min}$ prevents starvation of informative but low-magnitude updates, which stabilizes $E$ and improves the contraction factor $1-\mu\eta E$.

Staleness. A smaller micro-batch window and bounded network delay keep $\tau_{\max}$ small, reducing the degradation terms $\mathcal{O}(L\eta\tau_{\max})$ and improving both bounds.

Overall, SSAFL achieves standard convergence guarantees of asynchronous federated optimization under common assumptions, while its similarity-aware triggering and weighting introduce explicit, controllable trade-offs among accuracy, communication, and delay.