From d2c7df42886ed80cf2e5a82d9a1521c0003dddf8 Mon Sep 17 00:00:00 2001
From: Amir Shehata
Date: Mon, 5 Oct 2020 11:31:18 -0700
Subject: [PATCH] LUDOC-479 lnet: Clarify transmit and routing credits

Updated the explanation of transmit and routing credits to make
them more in line with what is actually implemented in the code.

Change-Id: I912f39167afb4e50271362ed41b7ec6475c96370
Signed-off-by: Amir Shehata
Reviewed-on: https://review.whamcloud.com/40143
Reviewed-by: Andreas Dilger
Tested-by: jenkins
---
 LustreProc.xml | 56 +++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 35 insertions(+), 21 deletions(-)

diff --git a/LustreProc.xml b/LustreProc.xml
index ab3b89a..073b032 100644
--- a/LustreProc.xml
+++ b/LustreProc.xml
@@ -2090,7 +2090,7 @@ nid refs state max rtr min tx min queue
 rtr
- Number of routing buffer credits.
+ Number of available routing buffer credits.
@@ -2108,7 +2108,7 @@ nid refs state max rtr min tx min queue
 tx
- Number of send credits.
+ Number of available send credits.
@@ -2132,27 +2132,41 @@ nid refs state max rtr min tx min queue
- Credits are initialized to allow a certain number of operations (in the example
- above the table, eight as shown in the max column. LNet keeps track
- of the minimum number of credits ever seen over time showing the peak congestion that
- has occurred during the time monitored. Fewer available credits indicates a more
- congested resource.
- The number of credits currently in flight (number of transmit credits) is shown in
- the tx column. The maximum number of send credits available is shown
- in the max column and never changes. The number of router buffers
- available for consumption by a peer is shown in the rtr
- column.
- Therefore, rtr – tx is the number of transmits
- in flight. Typically, rtr == max, although a configuration can be set
- such that max >= rtr. The ratio of routing buffer credits to send
- credits (rtr/tx) that is less than max indicates
- operations are in progress. If the ratio rtr/tx is greater than
- max, operations are blocking.
- LNet also limits concurrent sends and number of router buffers allocated to a single
- peer so that no peer can occupy all these resources.
+ Credits are initialized to allow a certain number of operations
+ (in the example above the table, eight as shown in the
+ max column. LNet keeps track of the minimum
+ number of credits ever seen over time showing the peak congestion
+ that has occurred during the time monitored. Fewer available credits
+ indicates a more congested resource.
+ The number of credits currently available is shown in the
+ tx column. The maximum number of send credits is
+ shown in the max column and never changes. The
+ number of currently active transmits can be derived by
+ (max - tx), as long as
+ tx is greater than or equal to 0. Once
+ tx is less than 0, it indicates the number of
+ transmits on that peer which have been queued for lack of credits.
+
+ The number of router buffer credits available for consumption
+ by a peer is shown in rtr column. The number of
+ routing credits can be configured separately at the LND level or at
+ the LNet level by using the peer_buffer_credits
+ module parameter for the appropriate module. If the routing credits
+ is not set explicitly, it'll default to the maximum transmit credits
+ defined by peer_credits module parameter.
+ Whenever a gateway routes a message from a peer, it decrements the
+ number of available routing credits for that peer. If that value
+ goes to zero, then messages will be queued. Negative values show the
+ number of queued message waiting to be routed. The number of
+ messages which are currently being routed from a peer can be derived
+ by (max_rtr_credits - rtr).
+ LNet also limits concurrent sends and number of router buffers
+ allocated to a single peer so that no peer can occupy all resources.
+
- nis - Shows the current queue health on this node.
+ - nis - Shows current queue health on the node.
+ Example:
 # lctl get_param nis
 nid refs peer max tx min
--
1.8.3.1
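
The credit arithmetic described in the added text can be sketched as follows. This is only an illustration, not LNet code: the function name `describe_credits` and its return keys are hypothetical, and the treatment of a negative counter as "all credits in use" is an assumption, since the patch text only says that a negative value gives the number of queued messages.

```python
def describe_credits(max_credits: int, available: int) -> dict:
    """Interpret one peer's credit counter (tx or rtr) per the patch text.

    max_credits: the fixed maximum (the `max` column, or max_rtr_credits).
    available:   the current counter value (the `tx` or `rtr` column).
    """
    if available >= 0:
        # Non-negative counter: active operations are max - available,
        # and nothing is queued.
        return {"active": max_credits - available, "queued": 0}
    # Negative counter: its magnitude is the number of queued messages.
    # Assumption (not stated in the patch): every credit is then in use.
    return {"active": max_credits, "queued": -available}


if __name__ == "__main__":
    # Hypothetical values, with max = 8 as in the manual's example.
    for value in (8, 3, 0, -2):
        print(value, describe_credits(8, value))
```

The same function applies to the `rtr` column when `max_credits` is taken to be the configured routing credit limit.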