LU-14283 obdclass: connect vs disconnect race 56/41256/3
author Wang Shilong <wshilong@ddn.com>
Sat, 16 Jan 2021 12:46:13 +0000 (20:46 +0800)
committer Oleg Drokin <green@whamcloud.com>
Fri, 22 Jan 2021 20:14:39 +0000 (20:14 +0000)
A race is possible when setup (connect) and cleanup
(disconnect) interleave (see the similar comments in
osc_disconnect()):

  Thread1:                      Thread2:
  connecting                    class_cleanup
                                obd->obd_set_up = 0
  if (obd->obd_set_up)
    osc_init_grant() /* skipped */

If an RPC is then woken up and sent out before
class_disconnect_exports() runs, it can hit a divide-by-zero crash
in osc_announce_cached() because @cl_max_extent_pages is still zero.

The problem is that @obd_set_up is cleared too early. It should be
cleared only when the OBD is really shut down.
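
The interleaving above can be replayed deterministically in a small
single-threaded model. This is only a sketch under stated assumptions:
struct toy_obd and the toy_* functions are hypothetical stand-ins, not
Lustre code, and the value 256 is arbitrary. The model returns -1 where
the real code would divide by zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct obd_device; not the Lustre definition. */
struct toy_obd {
	bool set_up;                /* models obd->obd_set_up */
	bool stopping;              /* models obd->obd_stopping */
	unsigned long max_extent_pages; /* models cl_max_extent_pages */
};

/* Connect path: grant init runs only if the device still looks set up. */
static void toy_connect_finish(struct toy_obd *obd)
{
	assert(obd != NULL);
	if (obd->set_up)
		obd->max_extent_pages = 256; /* arbitrary non-zero value */
}

/*
 * Models an RPC that divides by max_extent_pages. Returns -1 in the
 * case where the divisor is zero instead of performing the division.
 */
static long toy_announce(const struct toy_obd *obd, unsigned long dirty_pages)
{
	assert(obd != NULL);
	if (obd->max_extent_pages == 0)
		return -1;
	/* round up to whole extents */
	return (long)((dirty_pages + obd->max_extent_pages - 1) /
		      obd->max_extent_pages);
}

/* Replays the interleaving: cleanup starts, then connect finishes. */
static long toy_replay(bool clear_flag_early)
{
	struct toy_obd obd = { .set_up = true };
	long extents;

	/* Thread2: cleanup begins and marks the device as stopping. */
	obd.stopping = true;
	if (clear_flag_early)
		obd.set_up = false; /* old behaviour: flag cleared up front */

	/* Thread1: the in-flight connect completes. */
	toy_connect_finish(&obd);

	/* An RPC goes out before the exports are disconnected. */
	extents = toy_announce(&obd, 1000);

	/* Final shutdown: the flag is cleared here in either variant. */
	obd.set_up = false;
	return extents;
}
```

In this model, toy_replay(true) skips the grant initialisation and
reports the zero divisor, while toy_replay(false) leaves the flag set
until final shutdown, so the divisor is initialised before the RPC
uses it.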

Fixes: 45900a ("LU-4134 obdclass: obd_device improvement")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I898b6f53602c05221a3154a61615a0e270167ac6
Reviewed-on: https://review.whamcloud.com/41256
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>

index bf86016..69ca1b4 100644
@@ -837,11 +837,6 @@ int class_cleanup(struct obd_device *obd, struct lustre_cfg *lcfg)
        /* Leave this on forever */
        obd->obd_stopping = 1;
-       /*
-        * function can't return error after that point, so clear setup flag
-        * as early as possible to avoid finding via obd_devs / hash
-        */
-       obd->obd_set_up = 0;
        /* wait for already-arrived-connections to finish. */