Network Working Group                                            T. Zhou
Internet-Draft                                                    Huawei
Intended status: Experimental                                      D. Li
Expires: 21 April 2025                               Tsinghua University
                                                                 X. Geng
                                                                  Huawei
                                                         18 October 2024


                  Perceptive Routing Information Model
           draft-zhou-rtgwg-perceptive-routing-information-00

Abstract

   This document defines an information model for perceptive routing,
   which can serve as a foundational component for implementations of
   perceptive routing.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 21 April 2025.

Copyright Notice

   Copyright (c) 2024 IETF Trust and the persons identified as the
   document authors.  All rights reserved.






Zhou, et al.              Expires 21 April 2025                 [Page 1]

Internet-Draft  draft-zhou-rtgwg-perceptive-routing-info    October 2024


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminologies
   3.  Perceptive Routing General Process
   4.  Perceptive Routing Information Model
     4.1.  Local information model of PR Sensing Node
       4.1.1.  Port Failure
       4.1.2.  Congestion
       4.1.3.  Queue Length
       4.1.4.  Link SLA
     4.2.  Network information model of PR Sensing Node
       4.2.1.  On Path Information
       4.2.2.  Bottleneck Information
       4.2.3.  Topology Information
     4.3.  Routing decision information model of PR routing node
       4.3.1.  Reroute
       4.3.2.  Congestion Control
       4.3.3.  ECMP (Equal-Cost Multi-Path) Mode
       4.3.4.  Hierarchical Routing
       4.3.5.  Service Routing
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  Normative References
   Authors' Addresses

1.  Introduction

   In many scenarios, especially in data centers (DCs), adaptive routing
   has emerged as a crucial technique for enhancing network performance
   and resilience.  Traditional routing methods, which rely on static or
   pre-defined paths, often struggle to cope with rapidly changing
   network conditions, such as link failures, congestion, and varying
   traffic demands.  Adaptive routing addresses these challenges by
   allowing routing decisions to be adjusted in real time, based on the
   current state of the network.

   Adaptive routing systems like Perceptive Routing (PR) continuously
   monitor network parameters, such as port status, congestion levels,
   and link SLAs, to make informed decisions that improve traffic
   distribution and fault tolerance.  A standardized information model
   could abstract the essential properties and relationships within the
   system, allowing different implementations to interoperate.  Such a
   model offers a common representation of the state of the network, so
   that devices can communicate critical information such as failures,
   congestion, and optimal paths, which facilitates dynamic and
   automated decision-making.

   This document defines an information model for perceptive routing,
   which can serve as a foundational component for implementations of
   perceptive routing.

2.  Terminologies

   PR-SN: Perceptive Routing Sensing Node.  A node that perceives local
   and network information for routing decisions.

   PR-RN: Perceptive Routing Routing Node.  A node that uses
   multi-dimensional sensing information to make routing decisions,
   such as rerouting, adjusting speed, and load balancing.

   PR-N: Perceptive Routing Notification.  The message sent from a
   PR-SN to a PR-RN.

3.  Perceptive Routing General Process

   The perceptive routing (PR) mechanism, akin to the adaptive routing
   network (ARN), aims to ensure efficient and resilient routing in
   dynamic network environments.  PR involves real-time monitoring and
   decision-making based on multi-dimensional network status
   information, which enables the network to adapt to changes, such as
   congestion or link failures, with minimal disruption.  The general
   process is summarized as follows:

   1.  Detection of Network Status Changes

   Perceptive Routing Sensing Nodes (PR-SN) continuously monitor both
   local and network-level conditions to detect any anomalies or changes
   in network performance, for example congestion or link/node failure.
   When such conditions are detected, PR-SN assesses whether they can be
   resolved locally or require further action.

   2.  Impact Assessment and Notification

   If the PR-SN determines that local measures (e.g., congestion
   mitigation strategies) are insufficient to address the problem, it
   generates a Perceptive Routing Notification (PR-N).  The PR-N message
   contains detailed information about the change in network status
   (e.g., the type of failure and the affected links or nodes) and is
   sent to the Perceptive Routing Routing Node (PR-RN) or other
   designated nodes.  These messages inform the PR-RN about issues that
   could affect network performance, allowing it to take proactive
   steps.
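
   As a minimal sketch of the notification step, the PR-N message could
   be modeled as a small record.  This document does not define the
   PR-N encoding, so the class name, field names, and event categories
   below are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field, asdict
from enum import Enum
from typing import List


class EventType(Enum):
    # Hypothetical event categories; the draft does not enumerate them.
    PORT_FAILURE = "port-failure"
    CONGESTION = "congestion"
    SLA_VIOLATION = "sla-violation"


@dataclass
class PRNotification:
    """Illustrative PR-N message sent from a PR-SN to a PR-RN."""
    sensing_node: str                      # identifier of the reporting PR-SN
    event_type: EventType                  # type of network status change
    affected_links: List[str] = field(default_factory=list)
    affected_nodes: List[str] = field(default_factory=list)

    def to_dict(self) -> dict:
        """Flatten the message into plain types, e.g., for serialization."""
        data = asdict(self)
        data["event_type"] = self.event_type.value
        return data


def should_notify(locally_resolved: bool) -> bool:
    """A PR-N is generated only when local measures are insufficient."""
    return not locally_resolved
```

   For example, a PR-SN that cannot resolve a port failure locally
   would build PRNotification("sn-1", EventType.PORT_FAILURE,
   affected_links=["sn-1:eth0"]) and send its dictionary form to the
   PR-RN.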

   3.  Routing Decision and Mitigation

   Upon receiving the PR-N message, PR-RN analyzes the specific
   information provided to make appropriate routing decisions.  These
   decisions include:

   *  Rerouting: Selecting an alternative path to avoid the impacted
      link or node.

   *  Traffic load adjustment: Rebalancing traffic flows to prevent
      further congestion or link overload.

   *  Congestion control and ECMP: Adjusting traffic flows across
      multiple paths if available, using mechanisms like Equal-Cost
      Multi-Path (ECMP).

   *  Hierarchical routing decisions: In cases of large-scale network
      changes, PR-RN may use hierarchical routing strategies to route
      traffic across different layers of the network efficiently.

   By leveraging real-time data provided by PR-SN and using advanced
   decision-making algorithms, PR-RN ensures that traffic is rerouted or
   adjusted dynamically, reducing latency, avoiding congested paths, and
   enhancing overall network efficiency.

   The following sections define a standardized information model for
   this general process.

4.  Perceptive Routing Information Model


4.1.  Local information model of PR Sensing Node

   This section focuses on the attributes collected by a Perceptive
   Routing (PR) sensing node that monitors and gathers real-time data
   about local conditions.

4.1.1.  Port Failure

   This type of attribute represents the status of ports on a node.
   This attribute indicates whether a port has failed and can no longer
   transmit or receive traffic.  Monitoring port failures allows the
   network to quickly reroute traffic or trigger failover mechanisms.

   The possible attributes could include:

   *  Port Status: Indicates if the port is active, down, or in a failed
      state.

   *  Failure Cause: Specifies reasons for failure, such as hardware
      issues, misconfigurations, or timeouts.
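
   The port failure attributes above could be represented as follows.
   This is only a sketch: the enumeration values mirror the states and
   causes listed in this section, while the class and field names are
   assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PortStatus(Enum):
    ACTIVE = "active"
    DOWN = "down"
    FAILED = "failed"


class FailureCause(Enum):
    HARDWARE = "hardware"
    MISCONFIGURATION = "misconfiguration"
    TIMEOUT = "timeout"


@dataclass
class PortFailureInfo:
    port_id: str                           # hypothetical port identifier
    status: PortStatus
    cause: Optional[FailureCause] = None   # meaningful only when not ACTIVE

    def can_forward(self) -> bool:
        """A port carries traffic only while it is active."""
        return self.status is PortStatus.ACTIVE
```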

4.1.2.  Congestion

   This type of attribute represents the level of congestion at the
   node, typically measured by monitoring packet delay, packet loss, and
   throughput.  This attribute informs the system of where congestion
   points are forming, helping to reroute traffic or apply congestion
   control techniques.

   The possible attributes could include:

   *  Traffic Load: Measures current traffic levels on the link.

   *  Congestion Thresholds: Defines limits for congestion states.

   *  Packet Drop Rate: The rate at which packets are dropped due to
      congestion.
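
   One possible way to combine these attributes is sketched below.  The
   two-threshold scheme, the state names, and the assumption that the
   node knows the link capacity are illustrative choices and are not
   specified by this document.

```python
from dataclasses import dataclass


@dataclass
class CongestionInfo:
    traffic_load_bps: float      # Traffic Load: current traffic on the link
    link_capacity_bps: float     # assumed to be known to the node
    warn_threshold: float        # utilization ratio that marks "warning"
    congested_threshold: float   # utilization ratio that marks "congested"
    packets_dropped: int
    packets_offered: int

    def drop_rate(self) -> float:
        """Packet Drop Rate: fraction of offered packets that were dropped."""
        if self.packets_offered == 0:
            return 0.0
        return self.packets_dropped / self.packets_offered

    def state(self) -> str:
        """Classify the link against the Congestion Thresholds."""
        utilization = self.traffic_load_bps / self.link_capacity_bps
        if utilization >= self.congested_threshold:
            return "congested"
        if utilization >= self.warn_threshold:
            return "warning"
        return "normal"
```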

4.1.3.  Queue Length

   This type of attribute represents the length of queues in the node.
   Long queues indicate potential bottlenecks and delays, while short
   queues suggest fast packet forwarding.  This attribute is vital for
   assessing node performance and avoiding network congestion.

   The possible attributes could include:

   *  Queue Depth: Real-time data about the number of packets in the
      queue.

   *  Queue Thresholds: Defines the conditions under which the queue is
      considered to have overflowed, possibly leading to packet loss.
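
   A minimal sketch of the queue attributes follows.  It assumes that
   depth is counted in packets and that two thresholds are configured
   (an early-warning watermark and the queue capacity); both are
   hypothetical modeling choices.

```python
from dataclasses import dataclass


@dataclass
class QueueInfo:
    depth_packets: int       # Queue Depth: packets currently in the queue
    high_watermark: int      # Queue Threshold: early-warning level
    capacity_packets: int    # Queue Threshold: maximum queue size

    def is_near_full(self) -> bool:
        """True once the depth reaches the early-warning level."""
        return self.depth_packets >= self.high_watermark

    def would_overflow(self) -> bool:
        """True when the queue is full, so further arrivals would be dropped."""
        return self.depth_packets >= self.capacity_packets
```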

4.1.4.  Link SLA

   This type of attribute represents the Service Level Agreement (SLA)
   associated with the link, including metrics like bandwidth, latency,
   jitter, and availability.  The node monitors whether the link's
   performance is within the agreed SLA parameters and flags any
   violations for corrective actions.

   The possible attributes could include:

   *  Link Latency: Measures the round-trip delay across the link.

   *  Bandwidth Utilization: Tracks the percentage of available
      bandwidth being used.
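
   The SLA check described above could look like the following sketch.
   The SLA parameters (a latency bound and a utilization bound) and all
   names are assumptions chosen to match the two attributes listed in
   this section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkSla:
    max_latency_ms: float         # agreed round-trip latency bound
    max_utilization_pct: float    # agreed bandwidth utilization bound


@dataclass
class LinkMeasurement:
    latency_ms: float             # Link Latency: measured round-trip delay
    used_bps: float
    capacity_bps: float

    def utilization_pct(self) -> float:
        """Bandwidth Utilization as a percentage of link capacity."""
        return 100.0 * self.used_bps / self.capacity_bps


def sla_violations(sla: LinkSla, m: LinkMeasurement) -> List[str]:
    """Return the names of the SLA parameters that are currently violated."""
    violations = []
    if m.latency_ms > sla.max_latency_ms:
        violations.append("latency")
    if m.utilization_pct() > sla.max_utilization_pct:
        violations.append("bandwidth-utilization")
    return violations
```

   A non-empty result is what the node would flag for corrective
   action.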

4.2.  Network information model of PR Sensing Node

   This section covers the attributes about network conditions beyond
   the local node, providing insights about paths, bottlenecks, and
   topology to assist in making routing decisions.

4.2.1.  On Path Information

   This type of attribute represents detailed information about the
   current paths in use for traffic forwarding, including path metrics
   such as latency, jitter, and hop count.  This attribute allows the
   node to assess the quality of the existing paths and their
   suitability for ongoing traffic demands.

   The possible attributes could include:

   *  Hop Count: Number of hops the data takes between source and
      destination.

   *  Latency Per Hop: The time it takes to traverse each node.
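
   As an illustration, path-level metrics can be derived from the
   per-hop values.  The sketch below assumes that per-hop latencies are
   reported in milliseconds and ordered from source to destination; the
   latency budget check is a hypothetical example of assessing
   suitability.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PathInfo:
    hop_latencies_ms: List[float]   # Latency Per Hop, source to destination

    @property
    def hop_count(self) -> int:
        """Hop Count: number of hops between source and destination."""
        return len(self.hop_latencies_ms)

    @property
    def total_latency_ms(self) -> float:
        """End-to-end latency as the sum of the per-hop latencies."""
        return sum(self.hop_latencies_ms)

    def meets(self, latency_budget_ms: float) -> bool:
        """Check whether the path fits a given latency budget."""
        return self.total_latency_ms <= latency_budget_ms
```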

4.2.2.  Bottleneck Information

   This type of attribute identifies and describes network bottlenecks
   where traffic is delayed or congested.  This can include points where
   the capacity of a link is exceeded or where high latency is
   introduced due to excessive queuing.

   The possible attributes could include:

   *  Link Utilization: Monitors bandwidth use on specific bottleneck
      links.

   *  Queue Status: Alerts when queues at a bottleneck link are nearing
      full capacity.
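
   A simple way to use these attributes is sketched below.  It treats
   the most utilized link on a path as the bottleneck and raises queue
   alerts above a configurable fill level; the 0.8 default and all
   names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkState:
    link_id: str
    utilization: float    # Link Utilization: fraction of capacity in use
    queue_fill: float     # Queue Status: fraction of queue capacity occupied


def find_bottleneck(path: List[LinkState]) -> Optional[LinkState]:
    """Return the most utilized link on the path, or None for an empty path."""
    return max(path, key=lambda link: link.utilization, default=None)


def queue_alerts(path: List[LinkState], alert_level: float = 0.8) -> List[str]:
    """List the links whose queues are nearing full capacity."""
    return [link.link_id for link in path if link.queue_fill >= alert_level]
```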

4.2.3.  Topology Information

   This type of attribute represents the structure of the network from
   the node's perspective.  This attribute includes details such as
   connected neighbors, available paths, link states, and node status,
   providing a global view of the network for optimizing routing
   decisions.

   The possible attributes could include:

   *  Neighboring Nodes: A list of adjacent nodes and their statuses.

   *  Link Metrics: Performance and quality of links connecting nodes in
      the topology.
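
   The node's view of the topology could be kept as a table of
   neighbors, as in the following sketch.  The single integer link
   metric (lower is better) and the class names are assumptions; this
   document does not fix how link metrics are expressed.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Neighbor:
    node_id: str
    is_up: bool          # status of the neighboring node
    link_metric: int     # Link Metrics: lower is better (assumed)


@dataclass
class TopologyView:
    local_node: str
    neighbors: Dict[str, Neighbor] = field(default_factory=dict)

    def add_neighbor(self, neighbor: Neighbor) -> None:
        """Insert or update an adjacent node."""
        self.neighbors[neighbor.node_id] = neighbor

    def usable_neighbors(self) -> List[Neighbor]:
        """Neighbors that are up, ordered from best to worst link metric."""
        up = [n for n in self.neighbors.values() if n.is_up]
        return sorted(up, key=lambda n: n.link_metric)
```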

4.3.  Routing decision information model of PR routing node

   This section covers the key attributes that influence the decision-
   making processes within a routing node.  These attributes determine
   how traffic is routed, how congestion is managed, and how network
   resources are allocated.

4.3.1.  Reroute

   This type of attribute describes the mechanisms and criteria used to
   reroute traffic in response to changes in the network, such as link
   failures or congestion events.  This attribute ensures that traffic
   is dynamically redirected to optimal paths.

   The possible attributes could include:

   *  Reroute Path: The alternative path selected during rerouting.

   *  Failover Time: Time taken to switch to an alternate path.
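
   One way to compute a Reroute Path is a shortest-path search that
   skips the failed links.  The sketch below uses Dijkstra's algorithm
   as an illustrative choice; this document does not mandate a path
   computation algorithm, and the graph representation is an
   assumption.  Failover Time is not modeled here.

```python
import heapq
from typing import Dict, List, Optional, Set, Tuple

Graph = Dict[str, Dict[str, int]]   # node -> {neighbor: link cost}


def reroute_path(graph: Graph, src: str, dst: str,
                 failed_links: Set[Tuple[str, str]]) -> Optional[List[str]]:
    """Dijkstra shortest path that ignores failed (directed) links."""
    dist = {src: 0}
    prev: Dict[str, str] = {}
    heap = [(0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nbr, cost in graph.get(node, {}).items():
            if (node, nbr) in failed_links:
                continue  # avoid the impacted link
            nd = d + cost
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                prev[nbr] = node
                heapq.heappush(heap, (nd, nbr))
    if dst not in dist:
        return None  # no alternative path exists
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]
```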

4.3.2.  Congestion Control

   This type of attribute details the strategies and protocols used to
   manage congestion at the routing node.  This attribute includes
   techniques like rate-limiting, traffic shaping, or prioritizing
   certain flows to alleviate network congestion.

   The possible attributes could include:

   *  Congestion Avoidance Policies: Mechanisms to prevent congestion
      before it occurs.

   *  Rate Limiting: Controls the traffic rate to avoid overwhelming the
      network.
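
   Rate limiting is commonly implemented with a token bucket.  The
   sketch below is one such limiter, offered as an example technique
   rather than as the mechanism required by this document; the
   parameter names are hypothetical, and the caller supplies the
   current time so that the behavior is deterministic.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    rate_bytes_per_s: float   # sustained rate at which tokens are refilled
    burst_bytes: float        # bucket depth, i.e., the largest allowed burst
    tokens: float = 0.0       # tokens currently available, in bytes
    last_time: float = 0.0    # time of the previous call, in seconds

    def allow(self, size_bytes: int, now: float) -> bool:
        """Refill the bucket, then admit the packet if enough tokens remain."""
        elapsed = max(0.0, now - self.last_time)
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        self.last_time = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True
        return False
```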

4.3.3.  ECMP (Equal-Cost Multi-Path) Mode

   This type of attribute refers to Equal-Cost Multi-Path (ECMP)
   routing, where multiple paths with equal cost are used to distribute
   traffic evenly across the network.  This attribute describes how ECMP
   is implemented and the criteria for path selection.

   The possible attributes could include:

   *  Hash Algorithm: Determines how ECMP chooses paths.

   *  Traffic Distribution: Shows how traffic is split across multiple
      paths.
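
   The two attributes can be illustrated with a flow-hash sketch.  It
   assumes that the hash input is the 5-tuple of a flow and uses
   SHA-256 only because it gives a stable result in Python; forwarding
   hardware typically uses much simpler hash functions, and this
   document does not specify one.

```python
import hashlib
from collections import Counter
from typing import Iterable, List, Tuple

# src IP, dst IP, protocol, src port, dst port
FlowKey = Tuple[str, str, int, int, int]


def ecmp_select(flow: FlowKey, paths: List[str]) -> str:
    """Hash Algorithm: map a flow to one of the equal-cost paths."""
    if not paths:
        raise ValueError("no equal-cost paths available")
    digest = hashlib.sha256(repr(flow).encode()).digest()
    index = int.from_bytes(digest[:8], "big") % len(paths)
    return paths[index]


def distribution(flows: Iterable[FlowKey], paths: List[str]) -> Counter:
    """Traffic Distribution: number of flows assigned to each path."""
    return Counter(ecmp_select(flow, paths) for flow in flows)
```

   Because the choice depends only on the flow key, all packets of a
   flow follow the same path, which avoids reordering within the flow.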

4.3.4.  Hierarchical Routing

   This type of attribute covers the use of hierarchical routing
   techniques to manage larger networks efficiently.  This attribute
   provides information about how the network is divided into tiers or
   areas, with routing decisions optimized within each layer.

   The possible attributes could include:

   *  Routing Layers: Defines the layers of routing, such as access,
      aggregation, and core.

   *  Aggregated Traffic Metrics: Summarizes traffic data for groups of
      lower-layer nodes.
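
   Aggregated Traffic Metrics could be computed by summing the traffic
   of lower-layer nodes per upper-layer parent, as sketched below.  The
   three layers follow the example in this section; the parent field
   and the other names are assumptions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class RoutingLayer(Enum):
    ACCESS = 1
    AGGREGATION = 2
    CORE = 3


@dataclass
class NodeTraffic:
    node_id: str
    layer: RoutingLayer
    parent: str           # upper-layer node this node attaches to (assumed)
    traffic_bps: float


def aggregate_by_parent(nodes: List[NodeTraffic],
                        layer: RoutingLayer) -> Dict[str, float]:
    """Sum the traffic of nodes in `layer`, grouped by upper-layer parent."""
    totals: Dict[str, float] = {}
    for node in nodes:
        if node.layer is layer:
            totals[node.parent] = totals.get(node.parent, 0.0) + node.traffic_bps
    return totals
```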

4.3.5.  Service Routing

   This type of attribute describes how the routing node handles
   service-specific routing requirements, such as directing traffic
   based on application needs (e.g., video streaming, voice, or data).
   This attribute ensures that service-level routing objectives are met,
   such as prioritizing latency-sensitive traffic.

   The possible attributes could include:

   *  Service Path: The path chosen for traffic according to a specific
      service type.

   *  Service-Specific SLAs: Monitors SLA adherence based on service-
      level routing.
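
   A Service Path could be chosen by filtering candidate paths against
   a service-specific SLA, as in the sketch below.  The SLA parameters
   (a latency bound and a minimum bandwidth) and the lowest-latency
   tie-breaking rule are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CandidatePath:
    path_id: str
    latency_ms: float
    available_bps: float


@dataclass
class ServiceSla:
    max_latency_ms: float
    min_bandwidth_bps: float


def select_service_path(paths: List[CandidatePath],
                        sla: ServiceSla) -> Optional[CandidatePath]:
    """Pick the lowest-latency path that satisfies the service SLA."""
    eligible = [
        p for p in paths
        if p.latency_ms <= sla.max_latency_ms
        and p.available_bps >= sla.min_bandwidth_bps
    ]
    return min(eligible, key=lambda p: p.latency_ms, default=None)
```

   A result of None indicates that no candidate path currently meets
   the SLA for that service type.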

5.  Security Considerations


6.  IANA Considerations

   This document makes no request of IANA.


   Note to RFC Editor: this section may be removed on publication as an
   RFC.

7.  Acknowledgements


8.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

Authors' Addresses

   Tianran Zhou
   Huawei
   Email: zhoutianran@huawei.com


   Dan Li
   Tsinghua University
   Email: tolidan@tsinghua.edu.cn


   Xuesong Geng
   Huawei
   Email: gengxuesong@huawei.com
