Network Working Group                                           Q. Xiong
Internet-Draft                                           ZTE Corporation
Intended status: Informational                                    K. Yao
Expires: 8 July 2025                                        China Mobile
                                                                C. Huang
                                                           China Telecom
                                                                  Z. Han
                                                            China Unicom
                                                                 J. Zhao
                                                                   CAICT
                                                          4 January 2025


       Problem Statement for High Performance Wide Area Networks
                 draft-xiong-hpwan-problem-statement-01

Abstract

   A High Performance Wide Area Network (HP-WAN) is designed for
   data-intensive applications, such as those in scientific research,
   academia, and education, that demand high-speed data transmission
   over WANs.  It needs to provide efficient transmission services that
   finish within a required completion time.  This document outlines
   the problems for HP-WANs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 July 2025.

Copyright Notice

   Copyright (c) 2025 IETF Trust and the persons identified as the
   document authors.  All rights reserved.





Xiong, et al.              Expires 8 July 2025                  [Page 1]

Internet-Draft   Problems Statement for High Performance    January 2025


   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Revised BSD License text as
   described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   2
     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   3
   2.  Terminology . . . . . . . . . . . . . . . . . . . . . . . . .   3
   3.  Technical Goals for HP-WANs . . . . . . . . . . . . . . . . .   4
   4.  Problem Statement . . . . . . . . . . . . . . . . . . . . . .   5
     4.1.  Unscheduled Traffic with Instantaneous Congestion . . . .   5
     4.2.  Long Convergence Time due to Incast Congestion  . . . . .   6
     4.3.  RTT Fluctuation upon Long Feedback Loop . . . . . . . . .   6
   5.  Security Considerations . . . . . . . . . . . . . . . . . . .   7
   6.  IANA Considerations . . . . . . . . . . . . . . . . . . . . .   7
   7.  Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .   7
   8.  References  . . . . . . . . . . . . . . . . . . . . . . . . .   7
     8.1.  Normative References  . . . . . . . . . . . . . . . . . .   7
   Authors' Addresses  . . . . . . . . . . . . . . . . . . . . . . .   9

1.  Introduction

   As described in [I-D.kcrh-hpwan-state-of-art], data is fundamental
   to research, academia, education, industry, and other data-intensive
   applications, such as High Performance Computing (HPC) for
   scientific research, cloud storage and backup of industrial internet
   data, and distributed training of Artificial Intelligence (AI)
   models.  These applications may generate huge volumes of data by
   using advanced instruments and high-end computing devices.  The
   research institutions, universities, and data centers involved need
   to be connected across large geographical areas over long-distance
   links.  For example, data shared between research institutes may
   have to travel hundreds or thousands of kilometers.  The network
   needs to support large-scale data transfer and provide stable and
   efficient transmission services over non-dedicated Wide Area
   Networks (WANs).  Moreover, some applications may demand periodic or
   on-demand migration with variable transmission frequency, requiring
   timely data transmission within a completion time.

   More recently, massive data transmission and long-distance
   connections over complicated WANs have become key factors affecting
   the performance of existing transport layer protocols such as
   Transmission Control Protocol (TCP), Quick UDP Internet Connections
   (QUIC), Remote Direct Memory Access (RDMA), and so on.  Traditional
   congestion control algorithms are typically implemented at the host
   (sender and receiver) and perform blind transmission: they control
   the size of the congestion window and adjust the rate only after
   detecting overloaded links.  Performance is therefore difficult to
   predict given the unpredictable behaviour of WANs.  For the host,
   lacking awareness of network capability leads to poor convergence
   speed, because slow start and passive rate adjustment lengthen the
   completion time.  It also leads to RTT fluctuation, because a long
   feedback loop allows long queues to build up in large buffers.  For
   the network, unscheduled traffic is carried with low bandwidth
   utilization because of bottleneck links and instantaneous
   congestion.  All of the above degrade performance and result in the
   untimely transmission of high-volume data.  The network should
   therefore consider providing predictable capability, and the
   transport protocols should consider signaling to and collaborating
   with the network to negotiate QoS and improve overall HP-WAN
   transmission performance.

   A High Performance Wide Area Network (HP-WAN) is designed for
   data-intensive applications, such as those in scientific research,
   academia, and education, that demand high-speed data transmission
   over WANs.  It needs to provide efficient transmission services that
   finish within a required completion time.  This document outlines
   the problems for HP-WANs.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

2.  Terminology

   This document adopts the terminology defined in
   [I-D.kcrh-hpwan-state-of-art].

   This document also uses the following abbreviations:

   BDP:           Bandwidth Delay Product

   DC:            Data Center

   DCI:           Data Center Interconnection

   HPC:           High Performance Computing

   WAN:           Wide Area Network

   PFC:           Priority Flow Control

   ECN:           Explicit Congestion Notification

   ECMP:          Equal-Cost Multipath

   RTT:           Round-Trip Time

   TCP:           Transmission Control Protocol

   RDMA:          Remote Direct Memory Access

   QUIC:          Quick UDP Internet Connections

3.  Technical Goals for HP-WANs

   The services provided by HP-WANs mainly focus on the timely
   transmission of massive data, while multiple services may coexist
   over long-distance WANs.  Their characteristics are described below.

   *  Massive data transmission: high-speed or high-volume data
      transfer, e.g., the data rate of a flow could range from 2 Gbps
      to 1 Tbps.

   *  Timely data transmission: a flow has a completion time but no
      strict real-time transmission requirements, e.g., a completion
      time ranging from milliseconds to minutes.

   *  Scheduled transmission: the data volume, start time, frequency,
      and completion time could be specified.

   *  Long-distance transmission over non-dedicated WANs with long RTT,
      routing changes, network congestion, packet loss, and link
      quality fluctuations, e.g., the distance between two sites or DCs
      could be more than 100 km or even 1000 km.

   *  Coexistence of multiple services with concurrent flows.

   An HP-WAN flow needs to achieve high-throughput data transmission in
   order to meet its completion time.  Moreover, it is also crucial to
   maximize bandwidth utilization while ensuring fairness among
   multiple services.  This document outlines the technical goals for
   HP-WANs as described below.

   *  High throughput: ensuring high-speed data transmission so that a
      flow finishes within its completion time.  The achieved
      throughput is affected by the bandwidth, convergence time, start
      time, and RTT.

   *  High bandwidth utilization: efficiently using the available
      network capacity with fairness, to maximize data transfer rates
      and minimize the completion time for multiple flows.
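
   As a rough illustration of how bandwidth, convergence time, start
   time, and RTT combine into a completion time, the following sketch
   uses a deliberately simplified model.  The function name, its
   parameters, the assumption that a flow averages half of its target
   rate while converging, and the example values are all illustrative
   assumptions and are not defined by this document.

```python
def completion_time_s(volume_bytes, rate_bps, start_delay_s,
                      convergence_s, rtt_s, ramp_efficiency=0.5):
    """Estimate flow completion time in seconds (simplified model).

    Assumptions (hypothetical, for illustration only):
    - the flow waits start_delay_s before sending;
    - during the first convergence_s seconds it averages
      ramp_efficiency * rate_bps;
    - afterwards it sends at rate_bps;
    - one RTT is added for the final acknowledgement.
    """
    bits = volume_bytes * 8
    ramp_rate_bps = ramp_efficiency * rate_bps
    ramp_bits = ramp_rate_bps * convergence_s
    if bits <= ramp_bits:
        # The flow finishes before the rate has converged.
        transfer_s = bits / ramp_rate_bps
    else:
        transfer_s = convergence_s + (bits - ramp_bits) / rate_bps
    return start_delay_s + transfer_s + rtt_s


# Example values (assumed): 1 TB over a 100 Gbps path with 50 ms RTT.
ideal = completion_time_s(1e12, 100e9, 0.0, 0.0, 0.05)
slow = completion_time_s(1e12, 100e9, 0.0, 30.0, 0.05)
print(f"no convergence delay: {ideal:.2f} s")
print(f"30 s convergence:     {slow:.2f} s")
```

   Under these assumptions, a 30 s convergence period adds 15 s to an
   80 s transfer, which shows why convergence time matters even for
   flows that run for minutes.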

4.  Problem Statement

   Traditional congestion control mechanisms perform blind
   transmission: they control the size of the congestion window and
   adjust the rate only after detecting overloaded links.  The WAN is a
   black box whose behaviour for high-speed transmission is
   unpredictable because of issues such as long Round-Trip Time (RTT),
   routing changes, network congestion, packet loss, link quality
   fluctuations, and bursty traffic.  Moreover, the services are
   massive and concurrent, with multiple types and different traffic
   models, and may occupy a large amount of network resources, leading
   to low network utilization.  The Bandwidth Delay Product (BDP),
   which represents the maximum amount of data that can be in transit
   on the network at any given time, varies over WANs.  As a result,
   the amount of in-flight data is difficult to predict for host-based
   congestion control algorithms.
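
   The BDP is the product of the bottleneck bandwidth and the RTT.  The
   short sketch below shows how strongly it varies with RTT alone.  The
   function name and the example bandwidth and RTT values are
   assumptions chosen for illustration.

```python
def bdp_bytes(bandwidth_bps, rtt_s):
    """Bandwidth Delay Product in bytes: bandwidth (bit/s) * RTT (s) / 8."""
    return bandwidth_bps * rtt_s / 8


# Example (assumed values): a 100 Gbps path at three different RTTs.
for rtt_ms in (10, 50, 200):
    bdp = bdp_bytes(100e9, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> BDP {bdp / 1e6:.0f} MB")
```

   On such a path the BDP ranges from 125 MB at 10 ms to 2.5 GB at
   200 ms, so a routing change that alters the RTT also changes the
   amount of data the sender should keep in flight.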

   Existing network technologies face numerous challenges and fall short
   of meeting performance requirements.  This document highlights the
   key issues associated with HP-WANs in the following sub-sections.

4.1.  Unscheduled Traffic with Instantaneous Congestion

   The host sends large volumes of traffic by blind transmission,
   leading to instantaneous congestion and variable bandwidth in WANs.
   The network infrastructure may struggle to handle high-volume data
   transfers efficiently if applications do not proactively schedule
   traffic and network resources are not scheduled to estimate and
   mitigate congestion preemptively.  For multiple high-speed flows,
   the rapid arrival and departure of unscheduled cross-traffic creates
   significant fluctuations in the available bandwidth of WANs, making
   it difficult to find the correct sending rate.  Without awareness of
   these traffic patterns, the network risks unscheduled resource
   allocation, leading to low bottleneck bandwidth utilization and
   reduced overall throughput, which lengthens the completion time.

   For example, HPC applications transmit large amounts of data, e.g.,
   the data volume of a single flow may range from 10G to 1TB.  The
   host sends this large traffic unscheduled, leading to instantaneous
   congestion, packet loss, and queuing delay within network devices in
   WANs, and resulting in low throughput.  When multiple services carry
   various types of flows, the optimal bandwidth and transmission time
   may differ per flow, and traffic joins and leaves at random without
   being scheduled onto multiple paths and fine-grained network
   resources, so timely transmission cannot be achieved.  The resources
   of WANs should be scheduled at the elements along the path to
   provide predictable capability for high-speed transmission.
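
   To make the contrast with unscheduled traffic concrete, the sketch
   below shows one possible way a scheduler could check whether a set
   of transfer requests fits a bottleneck link.  It is not a mechanism
   defined by this document.  The TransferRequest type, the admit()
   function, the earliest-deadline-first policy, and the simplification
   that all requests overlap in time are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TransferRequest:
    """A scheduled transfer (hypothetical type for illustration)."""
    name: str
    volume_bytes: float
    start_s: float
    deadline_s: float

    def required_rate_bps(self):
        """Constant rate needed to finish exactly at the deadline."""
        duration = self.deadline_s - self.start_s
        if duration <= 0:
            raise ValueError("deadline must be later than start time")
        return self.volume_bytes * 8 / duration


def admit(requests, capacity_bps):
    """Greedy admission, earliest deadline first.

    Simplification (assumed): all requests overlap in time, so the sum
    of their required rates must not exceed the link capacity.
    Returns the admitted names and the total reserved rate.
    """
    admitted, reserved_bps = [], 0.0
    for req in sorted(requests, key=lambda r: r.deadline_s):
        rate = req.required_rate_bps()
        if reserved_bps + rate <= capacity_bps:
            admitted.append(req.name)
            reserved_bps += rate
    return admitted, reserved_bps


# Example (assumed values): three transfers competing for 100 Gbps.
requests = [
    TransferRequest("A", 1e12, 0, 200),   # needs 40 Gbps
    TransferRequest("B", 5e11, 0, 100),   # needs 40 Gbps
    TransferRequest("C", 1e12, 0, 150),   # needs about 53.3 Gbps
]
names, reserved = admit(requests, 100e9)
print(names, f"{reserved / 1e9:.1f} Gbps reserved")
```

   In this example only B and C fit.  Without such a check, all three
   flows would start together and contend for the bottleneck.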

4.2.  Long Convergence Time due to Incast Congestion

   Traditional congestion control algorithms converge slowly because
   they rely on blind transmission with rate adjustment, while WANs
   exhibit unpredictable behaviour such as incast congestion.  When
   determining the starting rate of data transmission, slow start
   limits overall throughput, leaves bandwidth underutilized, and fails
   to exploit the full network capacity.  A fast start, on the other
   hand, cannot adapt to the buffer capacity of network devices,
   especially when multiple flows are transmitted over the same link,
   causing network congestion and resulting in packet loss and
   transmission delay.
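
   The cost of slow start on a long path can be estimated with a small
   calculation.  The sketch below counts the round trips an idealized,
   loss-free slow start needs to grow the congestion window to one BDP.
   The function name, the 1460-byte segment size, the initial window of
   10 segments, and the example path are assumptions.

```python
import math


def slow_start_rounds(bandwidth_bps, rtt_s, mss_bytes=1460,
                      initial_window=10):
    """Round trips for the window to double from initial_window
    segments up to one BDP, assuming no loss and no early exit."""
    bdp_segments = bandwidth_bps * rtt_s / (8 * mss_bytes)
    if bdp_segments <= initial_window:
        return 0
    return math.ceil(math.log2(bdp_segments / initial_window))


# Example (assumed values): 10 Gbps path with 100 ms RTT.
rounds = slow_start_rounds(10e9, 0.1)
print(f"{rounds} round trips, about {rounds * 0.1:.1f} s")
```

   This is a lower bound.  In practice slow start often ends at the
   first loss, well before the window reaches the BDP, and the slower
   congestion avoidance phase then dominates the convergence time.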

   For example, algorithms that use slow start and blind detection,
   unaware of network capability, have long convergence times, such as
   CUBIC (e.g., over 50s), BBR (e.g., over 30s), and BBRv2 (e.g.,
   30~50s).  BBR divides the entire process into four states: Startup,
   Drain, ProbeBW, and ProbeRTT.  The probe cycle of the ProbeRTT state
   is long, e.g., 10s.  The convergence time spans multiple probe
   cycles, which affects the completion time at the level of seconds.
   There is a significant gap between the sending rate that the host
   selects and the available network capacity.  The transport protocols
   should signal to and collaborate with the network to negotiate the
   rate at which the host sends traffic.

4.3.  RTT Fluctuation upon Long Feedback Loop

   Congestion control algorithms control the size of the congestion
   window and adjust the sending rate based on feedback about the
   network status.  Over long distances, the transmission delay and
   large RTT delay this feedback, so the sender cannot adjust its
   transmission rate in a timely manner.  This makes it challenging for
   congestion control over WANs to keep the total amount of data
   entering the network at an acceptable level.  With high-speed
   transmission and a long network state feedback loop, long queues
   build up in the large buffers of network devices and the RTT
   fluctuates.  The problem is most severe when multiple flows target
   an aggregating node and the peak amount of queued data exceeds the
   device's buffer capacity.
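
   The effect of the feedback delay on queue length can be sketched as
   follows.  The model assumes that senders keep their rates unchanged
   until feedback arrives, so the excess over the egress capacity
   accumulates in the buffer for the whole feedback delay.  The
   function name, the sender rates, the 20 ms delay, and the 64 MB
   buffer size are assumptions for illustration.

```python
def queue_buildup_bytes(sender_rates_bps, egress_bps, feedback_delay_s):
    """Bytes queued at an aggregating node before senders can react.

    Assumes constant sender rates during feedback_delay_s and an
    initially empty queue.
    """
    excess_bps = max(0.0, sum(sender_rates_bps) - egress_bps)
    return excess_bps * feedback_delay_s / 8


# Example (assumed values): four 40 Gbps flows converge on a 100 Gbps
# egress link, and feedback takes 20 ms to reach the senders.
queued = queue_buildup_bytes([40e9] * 4, 100e9, 0.02)
buffer_bytes = 64e6  # hypothetical device buffer
print(f"queued {queued / 1e6:.0f} MB, buffer {buffer_bytes / 1e6:.0f} MB")
print("overflow" if queued > buffer_bytes else "fits")
```

   With these values 150 MB would have to be queued, more than twice
   the assumed buffer, so packets are dropped before any sender has
   received a congestion signal.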




   For example, loss-based congestion control algorithms, such as Reno
   and CUBIC, depend on packet loss as the congestion notification.
   Explicit Congestion Notification (ECN) can be used to achieve
   end-to-end congestion notification at the IP and transport layers.
   When congestion occurs, the network may signal it by ECN markings or
   by dropping packets, and the receiver passes this information back
   to the sender in transport-layer acknowledgements, notifying the
   source to adjust the transmission rate.  These algorithms use slow
   start and require large buffers, and the buffer requirement grows
   with multiple hops and long RTT over WANs.
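
   The reaction of a loss-based sender to a single congestion signal
   illustrates why long RTTs are costly.  The sketch below applies the
   classic Reno behaviour, halving the window on a congestion signal
   and then adding one segment per round trip, and estimates how long
   the sender needs to return to a window of one BDP.  The function
   name, the segment size, and the example path are assumptions.  CUBIC
   uses a different growth function and recovers faster.

```python
def reno_recovery_time_s(bandwidth_bps, rtt_s, mss_bytes=1460):
    """Time for a Reno-style sender to regrow its window from BDP/2
    back to one BDP at one segment per round trip."""
    bdp_segments = bandwidth_bps * rtt_s / (8 * mss_bytes)
    rounds = bdp_segments / 2  # one segment regained per RTT
    return rounds * rtt_s


# Example (assumed values): 10 Gbps path with 100 ms RTT.
t = reno_recovery_time_s(10e9, 0.1)
print(f"recovery takes about {t:.0f} s ({t / 60:.0f} minutes)")
```

   On the example path a single congestion event costs more than an
   hour of reduced throughput, compared with well under a second on a
   1 ms data center path at the same bandwidth.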

   Congestion-based congestion control algorithms, such as BBR, depend
   on measuring congestion.  BBR actively measures the bottleneck
   bandwidth (BtlBw) and the round-trip propagation time (RTprop), uses
   them in its model to calculate the BDP, and then adjusts the
   transmission rate to maximize throughput and minimize latency.
   However, BBR relies on real-time measurement of these parameters.
   Although it mitigates buffer overflow, the benefit is not
   significant under large RTT, e.g., retransmissions increase when the
   buffer size is less than two BDPs, which reduces the control
   precision of BBR in long-distance networks.
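
   The model that BBR builds from its measurements can be sketched in a
   few lines.  The code below is a simplified illustration and not the
   BBR implementation: it treats the whole sample list as one
   measurement window, whereas real implementations use time-limited
   windows.  The function name, the sample values, and the in-flight
   cap of two BDPs are assumptions for illustration.

```python
def bbr_model(delivery_rate_samples_bps, rtt_samples_s, cwnd_gain=2.0):
    """Estimate BtlBw, RTprop, BDP, and an in-flight cap.

    BtlBw is taken as the maximum delivery rate sample and RTprop as
    the minimum RTT sample.  The BDP and the cap are in bytes.
    """
    btl_bw_bps = max(delivery_rate_samples_bps)
    rt_prop_s = min(rtt_samples_s)
    bdp = btl_bw_bps * rt_prop_s / 8
    return btl_bw_bps, rt_prop_s, bdp, cwnd_gain * bdp


# Example (assumed samples): a path of roughly 10 Gbps and 50 ms.
rates = [9.2e9, 9.8e9, 9.5e9]
rtts = [0.052, 0.050, 0.061]
btl_bw, rt_prop, bdp, cap = bbr_model(rates, rtts)
print(f"BtlBw {btl_bw / 1e9:.1f} Gbps, RTprop {rt_prop * 1000:.0f} ms")
print(f"BDP {bdp / 1e6:.2f} MB, in-flight cap {cap / 1e6:.1f} MB")
```

   With a cap of two BDPs, up to one extra BDP of data may sit in
   queues along the path.  On a long-distance path that is tens of
   megabytes per flow, which a shallow device buffer cannot absorb.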

5.  Security Considerations

   This document covers several representative applications and
   network scenarios that are expected to make use of HP-WAN
   technologies.  None of the potential use cases raises security
   concerns or issues by itself, but each may have security
   considerations from both the use-specific perspective and the
   technology-specific perspective.

6.  IANA Considerations

   This document makes no requests for IANA action.

7.  Acknowledgements

   The authors would like to acknowledge Guangping Huang, Yao Liu and
   Zheng Zhang for their thorough review and very helpful comments.

8.  References

8.1.  Normative References

   [I-D.ietf-tcpm-accurate-ecn]
              Briscoe, B., Kühlewind, M., and R. Scheffenegger, "More
              Accurate Explicit Congestion Notification (ECN) Feedback
              in TCP", Work in Progress, Internet-Draft, draft-ietf-
              tcpm-accurate-ecn-31, 21 December 2024,
              <https://datatracker.ietf.org/doc/html/draft-ietf-tcpm-
              accurate-ecn-31>.

   [I-D.kcrh-hpwan-state-of-art]
              King, D., Chown, T., Rapier, C., and D. Huang, "Current
              State of the Art for High Performance Wide Area Networks",
              Work in Progress, Internet-Draft, draft-kcrh-hpwan-state-
              of-art-00, 7 October 2024,
              <https://datatracker.ietf.org/doc/html/draft-kcrh-hpwan-
              state-of-art-00>.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997,
              <https://www.rfc-editor.org/info/rfc2119>.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001,
              <https://www.rfc-editor.org/info/rfc3168>.

   [RFC7424]  Krishnan, R., Yong, L., Ghanwani, A., So, N., and B.
              Khasnabish, "Mechanisms for Optimizing Link Aggregation
              Group (LAG) and Equal-Cost Multipath (ECMP) Component Link
              Utilization in Networks", RFC 7424, DOI 10.17487/RFC7424,
              January 2015, <https://www.rfc-editor.org/info/rfc7424>.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017, <https://www.rfc-editor.org/info/rfc8174>.

   [RFC8664]  Sivabalan, S., Filsfils, C., Tantsura, J., Henderickx, W.,
              and J. Hardwick, "Path Computation Element Communication
              Protocol (PCEP) Extensions for Segment Routing", RFC 8664,
              DOI 10.17487/RFC8664, December 2019,
              <https://www.rfc-editor.org/info/rfc8664>.

   [RFC9232]  Song, H., Qin, F., Martinez-Julia, P., Ciavaglia, L., and
              A. Wang, "Network Telemetry Framework", RFC 9232,
              DOI 10.17487/RFC9232, May 2022,
              <https://www.rfc-editor.org/info/rfc9232>.

   [RFC9331]  De Schepper, K. and B. Briscoe, Ed., "The Explicit
              Congestion Notification (ECN) Protocol for Low Latency,
              Low Loss, and Scalable Throughput (L4S)", RFC 9331,
              DOI 10.17487/RFC9331, January 2023,
              <https://www.rfc-editor.org/info/rfc9331>.

   [RFC9438]  Xu, L., Ha, S., Rhee, I., Goel, V., and L. Eggert, Ed.,
              "CUBIC for Fast and Long-Distance Networks", RFC 9438,
              DOI 10.17487/RFC9438, August 2023,
              <https://www.rfc-editor.org/info/rfc9438>.

Authors' Addresses

   Quan Xiong
   ZTE Corporation
   China
   Email: xiong.quan@zte.com.cn


   Kehan Yao
   China Mobile
   China
   Email: yaokehan@chinamobile.com


   Cancan Huang
   China Telecom
   China
   Email: huangcanc@chinatelecom.cn


   Zhengxin Han
   China Unicom
   China
   Email: hanzx21@chinaunicom.cn


   Junfeng Zhao
   CAICT
   Beijing
   China
   Email: zhaojunfeng@caict.ac.cn