From: Cyril Bordage
Date: Sat, 10 Dec 2022 00:51:16 +0000 (+0100)
Subject: LU-16378 lnet: handles unregister/register events
X-Git-Tag: 2.15.53~1
X-Git-Url: https://git.whamcloud.com/?a=commitdiff_plain;h=3c9282a67d73799a03cb1d254275685c1c1e4df2;p=fs%2Flustre-release.git

LU-16378 lnet: handles unregister/register events

When network is restarted, devices are unregistered and then
registered again. When a device registers using an index that is
different from the previous one (before network was restarted), LNet
ignores it. Consequently, this device stays with link in fatal state.

To fix that, we catch unregistering events to clear the saved index
value, and when a registering event comes, we save the new value.

Signed-off-by: Cyril Bordage
Change-Id: I17e93a1103d588f3e630a9c7446b345f4d472b97
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/49375
Tested-by: jenkins
Tested-by: Maloo
Reviewed-by: Serguei Smirnov
Reviewed-by: Amir Shehata
Reviewed-by: Oleg Drokin
---

diff --git a/lnet/klnds/socklnd/socklnd.c b/lnet/klnds/socklnd/socklnd.c
index 84e176f..b1539ac 100644
--- a/lnet/klnds/socklnd/socklnd.c
+++ b/lnet/klnds/socklnd/socklnd.c
@@ -1993,10 +1993,28 @@ ksocknal_handle_link_state_change(struct net_device *dev,
 		sa = (void *)&ksi->ksni_addr;
 		found_ip = false;
 
-		if (ksi->ksni_index != ifindex ||
-		    strcmp(ksi->ksni_name, dev->name))
+		if (strcmp(ksi->ksni_name, dev->name))
+			continue;
+
+		if (ksi->ksni_index == -1) {
+			if (dev->reg_state != NETREG_REGISTERED)
+				continue;
+			/* A registration just happened: save the new index for
+			 * the device */
+			ksi->ksni_index = ifindex;
+			goto out;
+		}
+
+		if (ksi->ksni_index != ifindex)
 			continue;
 
+		if (dev->reg_state == NETREG_UNREGISTERING) {
+			/* Device is being unregitering, we need to clear the
+			 * index, it can change when device will be back */
+			ksi->ksni_index = -1;
+			goto out;
+		}
+
 		ni = net->ksnn_ni;
 
 		in_dev = __in_dev_get_rtnl(dev);
@@ -2092,6 +2110,8 @@ static int ksocknal_device_event(struct notifier_block *unused,
 	case NETDEV_UP:
 	case NETDEV_DOWN:
 	case NETDEV_CHANGE:
+	case NETDEV_REGISTER:
+	case NETDEV_UNREGISTER:
 		ksocknal_handle_link_state_change(dev, operstate);
 		break;
 	}
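
Illustrative note (not part of the commit): the index handling described
in the commit message can be replayed outside the kernel. The sketch below
is a minimal userspace model that assumes a single tracked interface. The
names struct iface, enum reg_state, enum action, handle_event() and
simulate_restart() are hypothetical stand-ins invented for this example;
only the order of the checks follows the first hunk of the patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel's dev->reg_state values. */
enum reg_state { REG_REGISTERED, REG_UNREGISTERING };

/* Hypothetical outcome codes, used only to make the branches observable. */
enum action { SKIPPED, INDEX_SAVED, INDEX_CLEARED, LINK_STATE_HANDLED };

/* Simplified stand-in for the ksni_name/ksni_index pair used in the patch. */
struct iface {
	char name[16];
	int index;	/* -1 means "no index currently saved" */
};

/* Applies the checks in the same order as the patched loop body. */
static enum action handle_event(struct iface *ksi, const char *dev_name,
				int ifindex, enum reg_state state)
{
	/* The name is compared first, since the index may have changed. */
	if (strcmp(ksi->name, dev_name))
		return SKIPPED;

	if (ksi->index == -1) {
		/* Index was cleared: adopt the new one once registered. */
		if (state != REG_REGISTERED)
			return SKIPPED;
		ksi->index = ifindex;
		return INDEX_SAVED;
	}

	if (ksi->index != ifindex)
		return SKIPPED;

	if (state == REG_UNREGISTERING) {
		/* Device is going away: forget its index. */
		ksi->index = -1;
		return INDEX_CLEARED;
	}

	/* Stands in for the pre-existing link state handling. */
	return LINK_STATE_HANDLED;
}

/* Replays a network restart and returns the index saved afterwards. */
static int simulate_restart(int old_index, int new_index)
{
	struct iface ksi = { "eth0", old_index };

	handle_event(&ksi, "eth0", old_index, REG_UNREGISTERING);
	assert(ksi.index == -1);
	handle_event(&ksi, "eth0", new_index, REG_REGISTERED);
	return ksi.index;
}
```

In this model, simulate_restart(2, 5) leaves index 5 saved, so later
events for "eth0" with index 5 reach the link state handling. Under the
old combined index-and-name check, those events would have been skipped.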