-
Notifications
You must be signed in to change notification settings - Fork 1.2k
RIB FIB HLD #2060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
RIB FIB HLD #2060
Conversation
/azp run |
No pipelines are associated with this pull request. |
Could we add a knob to enable this feature selectively, based on demand? The reason I ask is that the NHG process between kernel and zebra has not been stable—we’ve had it disabled since 202405 to address a production issue (see sonic-net/sonic-buildimage#23459 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one design question regarding where the NHG block is being maintained. Currently, this HLD proposes to keep the NHG block in fpmsyncd and persist it in Redis for warm restart reconciliation. While this works for the FRR-based stack, it introduces a few long-term architectural risks:
Tight Coupling to FRR:
I am not sure about other deployments of SONiC using different RIB entities. But modularization-wise, just want to understand the rationale of the placement of NHG blcok.
Maintaining NHG mapping in fpmsyncd makes the design dependent on FRR as the RIB. If in future deployments another routing entity (or alternate northbound programming mechanism) is used instead of FRR, the cache management logic would need to be duplicated in other agents
Placement of State Ownership:
Conceptually, NHG reconciliation feels closer to orchagent, since orchagent is the consumer responsible for applying NHG state into ASIC DB. Having the reconciliation state live in fpmsyncd risks splitting ownership of critical NHG state between two daemons.
Redis as Central Persistency:
Persisting the cache in Redis is fine for warm restart, but the ownership boundary still matters. If orchagent owns NHG lifecycle, it would be more consistent for it to also own the NHG cache, while Redis just acts as the persistence layer.
EDDIE: Once we use FPM, it is tight coupled with FRR. Currently, I don't see any valid use cases to replace FRR to somethingelse. I would not worry about this tight couple to FRR, since it is current arch design. Any change in this arch design would need to be discussed in TSC.
EDDIE:
EDDIE: We discussed it in WG before. Current logic, during warmreboot, fpmsyncd would check if routes information are changed or not and skip routes updating to avoid unnecessary hardware update. We want to keep this design philosophy as well. Redis is an expensive access. We want to keep only SONIC OBJs in there. FIB would break RIB information into corresponding SONiC OBJs. |
That is a valid ask. We don't plan to remove current code path. If NHG usage is disabled, we will use current code path. But if NHG usage is enabled, we will go with RIB/FIB path. |
# Route Congernce Handling | ||
A key objective of introducing the RIB/FIB model is to enhance route convergence. The design principle is to perform rapid updates on affected NHGs by removing failed paths based on existing NHG information. This mechanism mitigates traffic loss and provides sufficient time for routing protocols to complete reconvergence. | ||
|
||
## NHT Trigger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can an example of local failure e.g., interface failure, be added and show how that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Example of backwalk
This section is used to give example on how convergence gets handled.
|
||
 | ||
|
||
Once the resolve-through and resolve-via information is available, the SONiC forwarding chain can be represented as shown in the following graph. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add clear definition of resolved-through and -via? Terminology is missing
doc/ribfib/ribfib.md
Outdated
In general, the FIB is responsible for handling both NHG and route events. However, since SONiC’s slow path relies on the Linux kernel and does not process route information in fpmsyncd, the SONiC FIB primarily handles NHG events. The code in fpmsyncd responsible for processing Zebra NHG events is referred to as the NHG block. | ||
|
||
## NHG Block | ||
 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there are few things missing here:
1- a clear explanation about the difference of programming information from zebra to kernel versus SONiC.
2- a section describing the assumptions / expectations of what FRR is providing (old approach vs new approach)
3- Clear API definition about the extension coming from FRR
4- a section describing why RIB/FIB design address warm reboot requirements and how.
5- fpmsyncd managing a new NHG-ID - why is that required? why are NHG-ID provided by zebra cannot be used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Please take a look https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#handling-srv6-vpn-forwarding-chain-different-from-linux
- Please take a look https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#fib-high-level-design I added a section to explain the differences in term of data structures in two approaches.
- I will let Mark to handle them for FRR design.
- and 5 are discussed in WG many times. I attached some meeting minutes links in https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#enable-nhg-id-handling-for-improving-route-convergence
### Backwalk Step 2 | ||
Fpmsyncd uses 2064:100::1d to trigger a lookup in NEXTHOP to SONIG NHG ID hash table. This lookup returns a list of SONiC NHGs which contains 2064:100::1d as its nexthop. | ||
|
||
 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this picture defer from previous one with a new NHG-ID A. This is confusing..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
* Nexthop address : used in walk spec or for SONiC NHG table walk, a.k.a PIC edge case. | ||
* Its current resolved NHG ID : this NHG id is used to trigger backwalk update , a.k.a PIC core case. | ||
|
||
## Backwalk infra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why a backwalk function is required. As soon a NHG-ID is updated, nothing else is required. Isn't the whole purpose of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is required.
In the backwalk step 1 example, if 2064:100::1d is withdrawn by IGP ( eBGP in this case), you will get an NHT event on 2064:100::1d with NHG at 243. The event for updating 243 is not enough. We need to backwalk from 243 to other depending NHG and updating involved NHG. For example, we need to update 265, but no need to update 260.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Explaining the problem in detail and how backwalk is solving it would be helpful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#example-of-backwalk
Here is an example to explain how backwalk is used. It is a very common method in normal FIB handling.
doc/ribfib/ribfib.md
Outdated
Given these challenges, the FRR community recommends redefining Zebra to function solely as the RIB, delegating FIB management to each data plane. This allows forwarding chains to be derived and optimized according to the specific requirements of each data plane implementation. | ||
|
||
## FIB's location | ||
The FIB functionality in SONiC can be introduced at two possible points: before APPDB, within fpmsyncd, or after APPDB, within orchagent. Following discussions in the Routing Working Group, the current consensus is to implement FIB functionality in fpmsyncd. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convince about this. I'm not sure if FIB should be part of fpmsyncd. Question: do we need to change APP_DB object and orchagent to process in a much better way NHG-ID coming from FRR since the structure is better than before. (not flatten anymore). I would have expect changes due to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#fibs-location
The main idea is to determine prefix' SONiC NHG ID before updating to APPDB. This way, we could avoid too many unnecessary routes updates to APPDB.
Regarding the comment about FRR and FIB manager being tightly coupled, I must agree with it. FIB manager should also work for controllers taping on SONiC. Perhaps, Requirements should be clearly stated at the beginning of this document. |
I added a section at the very beginning of this document to clarify these discussions are not in the scope of this document. https://github.com/eddieruan-alibaba/SONiC/blob/ribfib/doc/ribfib/ribfib.md#topics-not-in-the-scope |
/azp run |
No pipelines are associated with this pull request. |
This HLD describes why we need to introduce RIB/FIB in SONiC and how we approach it.