LU-930 ptlrpc: clarify AT error message
author Aurelien Degremont <degremoa@amazon.com>
Tue, 18 Jan 2022 13:55:01 +0000 (13:55 +0000)
committer Andreas Dilger <adilger@whamcloud.com>
Mon, 29 Jan 2024 08:51:33 +0000 (08:51 +0000)
Clarify the error message printed when the deadline for
AT (adaptive timeout) early replies has already passed.
It indicated that the system was CPU bound, which is wrong
most of the time: the usual cause is a communication failure
delaying RPC traffic. This could confuse people, who would
look at CPU resource consumption when network traffic is
the more likely cause.

Also use less cryptic keywords: the old ones made sense
only to the feature's developers, not to admins.

Lustre-change: https://review.whamcloud.com/49548
Lustre-commit: 9ce04000fba07706c73b8adb3605c959e5b62712

Test-Parameters: trivial
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Change-Id: Icdff8f4c6fb9905233f6b8ed1b961b2fd1127667
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/ex/lustre-release/+/53772
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
lustre/ptlrpc/service.c

index e39cd5c..25d93f2 100644
@@ -1604,12 +1604,11 @@ static int ptlrpc_at_check_timed(struct ptlrpc_service_part *svcpt)
                 * We're already past request deadlines before we even get a
                 * chance to send early replies
                 */
-               LCONSOLE_WARN("%s: This server is not able to keep up with request traffic (cpu-bound).\n",
-                             svcpt->scp_service->srv_name);
-               CWARN("earlyQ=%d reqQ=%d recA=%d, svcEst=%d, delay=%lldms\n",
-                     counter, svcpt->scp_nreqs_incoming,
-                     svcpt->scp_nreqs_active,
-                     at_get(&svcpt->scp_at_estimate), delay_ms);
+               LCONSOLE_WARN("'%s' is processing requests too slowly, client may timeout. Late by %ds, missed %d early replies (reqs waiting=%d active=%d, at_estimate=%d, delay=%lldms)\n",
+                             svcpt->scp_service->srv_name, -first, counter,
+                             svcpt->scp_nreqs_incoming,
+                             svcpt->scp_nreqs_active,
+                             at_get(&svcpt->scp_at_estimate), delay_ms);
        }
 
        /*
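
To see what an admin would read on the console after this change, the new format string can be exercised outside the kernel. The following is a minimal user-space sketch, not the Lustre implementation: `struct at_late_report` and `format_at_late_warning()` are hypothetical names introduced here, `snprintf()` stands in for `LCONSOLE_WARN()`, and the field types are assumed from the `%d`/`%lld` conversions in the patch. It also assumes that `first` holds the number of seconds remaining until the earliest deadline, so that it is zero or negative once the deadline has passed and `-first` is the lateness.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the values the patched code reads from
 * svcpt and its local variables. Types are assumed from the format
 * specifiers used in the patch.
 */
struct at_late_report {
	const char *srv_name;   /* svcpt->scp_service->srv_name */
	int first;              /* seconds to earliest deadline (assumed <= 0 when late) */
	int counter;            /* early replies that were missed */
	int nreqs_incoming;     /* svcpt->scp_nreqs_incoming */
	int nreqs_active;       /* svcpt->scp_nreqs_active */
	int at_estimate;        /* at_get(&svcpt->scp_at_estimate) */
	long long delay_ms;     /* delay_ms */
};

/*
 * Format the warning with the same format string and argument order
 * as the patch. Returns what snprintf() returns: the length the full
 * message needs, excluding the terminating NUL.
 */
static int format_at_late_warning(char *buf, size_t len,
				  const struct at_late_report *r)
{
	return snprintf(buf, len,
			"'%s' is processing requests too slowly, client may timeout. Late by %ds, missed %d early replies (reqs waiting=%d active=%d, at_estimate=%d, delay=%lldms)\n",
			r->srv_name, -r->first, r->counter,
			r->nreqs_incoming, r->nreqs_active,
			r->at_estimate, r->delay_ms);
}
```

With made-up sample values (service name `ost_io`, `first = -3`, `counter = 2`, 10 requests waiting, 4 active, estimate 5, delay 3200 ms), the formatted line begins `'ost_io' is processing requests too slowly, client may timeout. Late by 3s, missed 2 early replies`, followed by the queue and timing details in parentheses.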