Conversation

eddieruan-alibaba
Contributor

This HLD describes why we need to introduce RIB/FIB in SONiC and how we approach it.

@mssonicbld
Collaborator

/azp run


No pipelines are associated with this pull request.

@StormLiangMS
Contributor

Could we add a knob to enable this feature selectively, based on demand? The reason I ask is that the NHG process between kernel and zebra has not been stable—we’ve had it disabled since 202405 to address a production issue (see sonic-net/sonic-buildimage#23459
). I can imagine this new design will introduce additional complexity, so until it matures and is qualified in production, it would be helpful to have a knob to control it.


@selva-nexthop selva-nexthop left a comment


I have one design question regarding where the NHG block is being maintained. Currently, this HLD proposes to keep the NHG block in fpmsyncd and persist it in Redis for warm restart reconciliation. While this works for the FRR-based stack, it introduces a few long-term architectural risks:

Tight Coupling to FRR:
I am not sure about other deployments of SONiC that use different RIB entities, but modularization-wise I just want to understand the rationale for the placement of the NHG block.
Maintaining the NHG mapping in fpmsyncd makes the design dependent on FRR as the RIB. If a future deployment uses another routing entity (or an alternate northbound programming mechanism) instead of FRR, the cache management logic would need to be duplicated in other agents.

Placement of State Ownership:
Conceptually, NHG reconciliation feels closer to orchagent, since orchagent is the consumer responsible for applying NHG state into ASIC DB. Having the reconciliation state live in fpmsyncd risks splitting ownership of critical NHG state between two daemons.

Redis as Central Persistency:
Persisting the cache in Redis is fine for warm restart, but the ownership boundary still matters. If orchagent owns NHG lifecycle, it would be more consistent for it to also own the NHG cache, while Redis just acts as the persistence layer.

@eddieruan-alibaba
Contributor Author

I have one design question regarding where the NHG block is being maintained. Currently, this HLD proposes to keep the NHG block in fpmsyncd and persist it in Redis for warm restart reconciliation. While this works for the FRR-based stack, it introduces a few long-term architectural risks:

Tight Coupling to FRR: I am not sure about other deployments of SONiC that use different RIB entities, but modularization-wise I just want to understand the rationale for the placement of the NHG block. Maintaining the NHG mapping in fpmsyncd makes the design dependent on FRR as the RIB. If a future deployment uses another routing entity (or an alternate northbound programming mechanism) instead of FRR, the cache management logic would need to be duplicated in other agents.

EDDIE:
FPM is the Forwarding Plane Manager. It is a component of FRR that provides a mechanism to export routing updates (route additions, modifications, and deletions) from FRR's internal Routing Information Base (RIB) to an external forwarding plane.

Once we use FPM, we are tightly coupled with FRR.

Currently, I don't see any valid use case for replacing FRR with something else. I would not worry about this tight coupling to FRR, since it is the current architecture design. Any change to this architecture would need to be discussed in the TSC.

Placement of State Ownership: Conceptually, NHG reconciliation feels closer to orchagent, since orchagent is the consumer responsible for applying NHG state into ASIC DB. Having the reconciliation state live in fpmsyncd risks splitting ownership of critical NHG state between two daemons.

EDDIE:
We map Zebra NHG information to various SONiC APP DB objects here; the SONiC NHG object is just one of them. Currently this step is handled in zebra; now we want to move it into fpmsyncd to provide the FIB functionality.

Redis as Central Persistency: Persisting the cache in Redis is fine for warm restart, but the ownership boundary still matters. If orchagent owns NHG lifecycle, it would be more consistent for it to also own the NHG cache, while Redis just acts as the persistence layer.

EDDIE: We discussed it in the WG before. In the current logic, during warm reboot, fpmsyncd checks whether route information has changed and skips route updates to avoid unnecessary hardware updates. We want to keep this design philosophy as well.

Redis access is expensive. We want to keep only SONiC objects in there. The FIB would break RIB information into the corresponding SONiC objects.
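That skip-if-unchanged reconciliation can be sketched as follows (a minimal Python illustration with hypothetical names; the real fpmsyncd logic is C++):

```python
# Sketch: warm-reboot reconciliation that emits only the route updates whose
# content actually changed. All names here are illustrative, not the real API.

def reconcile(cached_routes: dict, new_routes: dict) -> list:
    """Compare the cached state against the freshly learned routes and
    return only the updates that require a write (changed, new, or deleted)."""
    updates = []
    for prefix, route in new_routes.items():
        if cached_routes.get(prefix) != route:
            updates.append((prefix, route))   # changed or new: push to APPDB
    for prefix in cached_routes.keys() - new_routes.keys():
        updates.append((prefix, None))        # withdrawn: delete from APPDB
    return updates
```

Routes that are byte-identical across the restart produce no Redis write at all, which is the point of keeping the cache.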

@eddieruan-alibaba
Contributor Author

Could we add a knob to enable this feature selectively, based on demand? The reason I ask is that the NHG process between kernel and zebra has not been stable—we’ve had it disabled since 202405 to address a production issue (see sonic-net/sonic-buildimage#23459 ). I can imagine this new design will introduce additional complexity, so until it matures and is qualified in production, it would be helpful to have a knob to control it.

That is a valid ask. We don't plan to remove the current code path. If NHG usage is disabled, we will use the current code path; if NHG usage is enabled, we will go with the RIB/FIB path.
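The knob-based dispatch could look something like this (a Python sketch; the table and field names, e.g. `nexthop_group_enabled`, are hypothetical placeholders, not an agreed CONFIG_DB schema):

```python
# Sketch: select the route-programming path based on a feature knob.
# The CONFIG_DB table/field names below are hypothetical.

def select_route_path(config: dict) -> str:
    """Return "ribfib" when the NHG knob is enabled, else the legacy path."""
    meta = config.get("DEVICE_METADATA", {}).get("localhost", {})
    if meta.get("nexthop_group_enabled", "false") == "true":
        return "ribfib"   # new RIB/FIB path using NHG objects
    return "legacy"       # existing flattened-route code path, unchanged
```

With the knob absent or set to "false", behavior is identical to today's code path.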

# Route Convergence Handling
A key objective of introducing the RIB/FIB model is to enhance route convergence. The design principle is to perform rapid updates on affected NHGs by removing failed paths based on existing NHG information. This mechanism mitigates traffic loss and provides sufficient time for routing protocols to complete reconvergence.
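The rapid-update step can be sketched as follows (Python, with hypothetical names; the actual implementation lives in fpmsyncd):

```python
# Sketch: fast local repair. Remove a failed path from every NHG that
# contains it, without waiting for the routing protocol to reconverge.

def prune_failed_path(nhgs: dict, failed_nh: str) -> list:
    """Drop `failed_nh` from each NHG's member list; return the IDs of the
    NHGs that were modified (these are the ones needing a hardware update)."""
    touched = []
    for nhg_id, members in nhgs.items():
        if failed_nh in members:
            members.remove(failed_nh)
            touched.append(nhg_id)
    return touched
```

Only the touched NHGs are reprogrammed; routes pointing at them keep their NHG IDs, so no per-route churn is needed while the protocols reconverge.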

## NHT Trigger
Contributor


Can an example of a local failure, e.g., an interface failure, be added to show how that works?

Contributor Author

@eddieruan-alibaba commented Oct 14, 2025


Example of backwalk

This section gives an example of how convergence gets handled.


![image](images/zebra_nhg_chain.png)

Once the resolve-through and resolve-via information is available, the SONiC forwarding chain can be represented as shown in the following graph.
Contributor


Can you add clear definitions of resolve-through and resolve-via? The terminology is missing.

In general, the FIB is responsible for handling both NHG and route events. However, since SONiC’s slow path relies on the Linux kernel and does not process route information in fpmsyncd, the SONiC FIB primarily handles NHG events. The code in fpmsyncd responsible for processing Zebra NHG events is referred to as the NHG block.

## NHG Block
![image](images/NHG_block.png)
Contributor


I believe there are a few things missing here:
1- A clear explanation of the difference between programming information from zebra to the kernel versus to SONiC.
2- A section describing the assumptions/expectations of what FRR provides (old approach vs. new approach).
3- A clear API definition for the extension coming from FRR.
4- A section describing why the RIB/FIB design addresses warm reboot requirements, and how.
5- fpmsyncd manages a new NHG-ID: why is that required? Why can't the NHG-IDs provided by zebra be used?

Contributor Author


  1. Please take a look at https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#handling-srv6-vpn-forwarding-chain-different-from-linux
  2. Please take a look at https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#fib-high-level-design — I added a section to explain the differences in terms of data structures between the two approaches.
  3. I will let Mark handle that for the FRR design.
  4. Items 4 and 5 were discussed in the WG many times. I attached some meeting-minutes links in https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#enable-nhg-id-handling-for-improving-route-convergence

### Backwalk Step 2
Fpmsyncd uses 2064:100::1d to trigger a lookup in the NEXTHOP to SONiC NHG ID hash table. This lookup returns the list of SONiC NHGs which contain 2064:100::1d as a nexthop.

![image](images/backwalk_2.png)
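The reverse lookup can be sketched as a simple inverted index (a Python illustration with hypothetical names; the actual hash table lives in fpmsyncd):

```python
from collections import defaultdict

# Sketch: build the NEXTHOP -> SONiC NHG ID index used by the backwalk.

def build_reverse_index(nhgs: dict) -> dict:
    """Map each nexthop address to the set of SONiC NHG IDs that use it."""
    idx = defaultdict(set)
    for nhg_id, members in nhgs.items():
        for nh in members:
            idx[nh].add(nhg_id)
    return idx
```

A single hash lookup on the failed nexthop then yields exactly the NHGs that must be walked, without scanning the whole NHG table.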
Contributor


It seems this picture differs from the previous one, with a new NHG-ID A. This is confusing.

Contributor Author


Updated

* Nexthop address: used in the walk spec for a SONiC NHG table walk, a.k.a. the PIC edge case.
* Its current resolved NHG ID: this NHG ID is used to trigger the backwalk update, a.k.a. the PIC core case.
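The two trigger kinds above can be captured in a small walk-spec structure (a Python sketch with illustrative field names, not the actual fpmsyncd types):

```python
from dataclasses import dataclass
from typing import Optional

# Sketch: a walk spec carries either trigger, depending on the failure type.

@dataclass
class WalkSpec:
    """Hypothetical backwalk trigger."""
    nexthop_addr: Optional[str] = None  # PIC edge: walk the SONiC NHG table by nexthop
    nhg_id: Optional[int] = None        # PIC core: walk the dependents of this NHG ID
```

Exactly one of the two fields is set per trigger; the backwalk infra dispatches on which one is populated.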

## Backwalk infra
Contributor


I don't understand why a backwalk function is required. As soon as an NHG-ID is updated, nothing else is required. Isn't that the whole purpose of this?

Contributor Author


It is required.

In the backwalk step 1 example, if 2064:100::1d is withdrawn by the IGP (eBGP in this case), you will get an NHT event for 2064:100::1d with NHG 243. Updating 243 alone is not enough: we need to backwalk from 243 to the other dependent NHGs and update each involved NHG. For example, we need to update 265, but there is no need to update 260.
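That dependency walk can be sketched as a transitive traversal over the NHG resolution graph (Python, with a hypothetical data layout; the NHG IDs come from the example above):

```python
# Sketch: collect every NHG that transitively depends on the updated NHG.
# `depends_on` maps a child NHG ID to the set of parent NHG IDs it
# resolves through (hypothetical representation).

def backwalk(depends_on: dict, start: int) -> set:
    """Return the set of NHG IDs that must be updated when `start` changes."""
    affected, stack = set(), [start]
    while stack:
        cur = stack.pop()
        for child, parents in depends_on.items():
            if cur in parents and child not in affected:
                affected.add(child)          # this NHG resolves through `cur`
                stack.append(child)          # and may itself have dependents
    return affected
```

With 265 resolving through 243 and 260 resolving elsewhere, a walk starting at 243 touches 265 and leaves 260 alone, matching the example.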


Explaining the problem in detail and how backwalk is solving it would be helpful.

Contributor Author


https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#example-of-backwalk

Here is an example to explain how backwalk is used. It is a very common technique in FIB handling.

Given these challenges, the FRR community recommends redefining Zebra to function solely as the RIB, delegating FIB management to each data plane. This allows forwarding chains to be derived and optimized according to the specific requirements of each data plane implementation.

## FIB's location
The FIB functionality in SONiC can be introduced at one of two points: before APPDB (within fpmsyncd) or after APPDB (within orchagent). Following discussions in the Routing Working Group, the current consensus is to implement the FIB functionality in fpmsyncd.
Contributor


I'm not convinced about this. I'm not sure FIB should be part of fpmsyncd. Question: do we need to change the APP_DB objects and orchagent to process the NHG-IDs coming from FRR in a much better way, since the structure is better than before (not flattened anymore)? I would have expected changes due to that.

Contributor Author


https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#fibs-location

The main idea is to determine a prefix's SONiC NHG ID before updating APPDB. This way, we avoid many unnecessary route updates to APPDB.
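A sketch of that pre-APPDB check (Python, with hypothetical names; the real code would write through the swss producer APIs):

```python
# Sketch: resolve the prefix's SONiC NHG ID first, and only emit an APPDB
# route write when that ID actually changed. Names are illustrative.

def push_route(appdb_cache: dict, prefix: str, sonic_nhg_id: int) -> bool:
    """Return True when an APPDB update was actually emitted."""
    if appdb_cache.get(prefix) == sonic_nhg_id:
        return False                      # same NHG ID: skip the Redis write
    appdb_cache[prefix] = sonic_nhg_id    # changed: record and push the update
    return True
```

When a path flap resolves back to the same NHG ID, no route write reaches APPDB, which is where the churn savings come from.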

@pbrisset
Contributor

Regarding the comment about FRR and the FIB manager being tightly coupled, I must agree with it. The FIB manager should also work for controllers tapping into SONiC.

Perhaps, Requirements should be clearly stated at the beginning of this document.

@eddieruan-alibaba
Contributor Author

eddieruan-alibaba commented Oct 14, 2025

Regarding the comment about FRR and the FIB manager being tightly coupled, I must agree with it. The FIB manager should also work for controllers tapping into SONiC.

Perhaps, Requirements should be clearly stated at the beginning of this document.

I added a section at the very beginning of this document to clarify that these discussions are not in the scope of this document.

https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#topics-not-in-the-scope

@mssonicbld
Collaborator

/azp run


No pipelines are associated with this pull request.
