The lnet_ni:ni_ping_count is currently reset on every (healthy) tx,
which means taking net_lock 0 on every tx. This results in a
performance loss for certain workloads. Only reset ni_ping_count when
receiving a message over the NI, and only take net_lock 0 on that
path.
Test-Parameters: trivial testlist=sanity-lnet
Fixes: 8fdf2bc62a ("LU-13569 lnet: Recover local NI w/exponential backoff interval")
HPE-bug-id: LUS-10427
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I67ea3aa977cb5d67b04f7957120c29e9985c83e6
Reviewed-on: https://review.whamcloud.com/45235
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
* faster recovery.
*/
lnet_inc_healthv(&ni->ni_healthv, lnet_health_sensitivity);
- lnet_net_lock(0);
- ni->ni_ping_count = 0;
/*
* It's possible msg_txpeer is NULL in the LOLND
* case. Only increment the peer's health if we're
* as indication that the router is fully healthy.
*/
if (lpni && msg->msg_rx_committed) {
+ lnet_net_lock(0);
lpni->lpni_ping_count = 0;
+ ni->ni_ping_count = 0;
/*
* If we're receiving a message from the router or
* I'm a router, then set that lpni's health to
&the_lnet.ln_mt_peerNIRecovq,
ktime_get_seconds());
}
+ lnet_net_unlock(0);
}
- lnet_net_unlock(0);
/* we can finalize this message */
return -1;