LU-16633 obdclass: fix rpc slot leakage 61/50261/12
author Alex Zhuravlev <bzzz@whamcloud.com>
Fri, 10 Mar 2023 17:47:05 +0000 (20:47 +0300)
committer Oleg Drokin <green@whamcloud.com>
Tue, 28 Mar 2023 22:18:21 +0000 (22:18 +0000)
commit 91a3726f313df33e099320d171039f8371fec27f
tree cd5cb258bb94d65287328df277a9fdafde3be745
parent 12c34651994b77ac0cf231cd710f9d511845a4e1
LU-16633 obdclass: fix rpc slot leakage

obd_get_mod_rpc_slot() can race with obd_put_mod_rpc_slot():
when wait_woken() finishes, it resets WQ_FLAG_WOKEN (which is set
when the corresponding thread gets a slot, incrementing
cl_mod_rpcs_in_flight). Another thread executing
__wake_up_locked_key() may then find that wq_entry again, since
it is still on the wait queue, and call claim_mod_rpc_function()
one more time, again incrementing cl_mod_rpcs_in_flight. Thus it
is incremented twice for a single obd_get_mod_rpc_slot().

  #1: obd_get_mod_rpc_slot()    #2: obd_put_mod_rpc_slot()
  flags &= ~WQ_FLAG_WOKEN
  list_add()
  wait_woken()
    schedule                    claim_mod_rpc_function()
                                  cl_mod_rpcs_in_flight++
                                  wake_up()

  flags &= ~WQ_FLAG_WOKEN

                                #3: obd_put_mod_rpc_slot()
                                  claim_mod_rpc_function()
                                    cl_mod_rpcs_in_flight++
                                    wake_up()
  list_del()

The patch replaces WQ_FLAG_WOKEN with a separate flag that is never
reset once set.
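
The interleaving above can be replayed in a small userspace model. This is
only a sketch, not the patched Lustre code: every sketch_* name, the
granted field, and the SKETCH_FLAG_WOKEN constant are hypothetical
stand-ins, and it assumes the pre-patch wake callback used WQ_FLAG_WOKEN
as its "already granted" guard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_FLAG_WOKEN 0x02u	/* stand-in for WQ_FLAG_WOKEN */

/* Hypothetical model of one waiter; not the real Lustre structure. */
struct sketch_waiter {
	unsigned int flags;	/* models wq_entry->flags */
	bool granted;		/* sticky flag: set once, never cleared */
};

struct sketch_client {
	int rpcs_in_flight;	/* models cl_mod_rpcs_in_flight */
};

/*
 * Models the wake callback run by a thread that releases a slot.
 * Returns true if a slot was charged to the waiter.
 */
static bool sketch_claim(struct sketch_client *cli,
			 struct sketch_waiter *w, bool use_sticky)
{
	bool already;

	assert(cli != NULL && w != NULL);

	/* Guard against granting twice to the same waiter. */
	already = use_sticky ? w->granted
			     : (w->flags & SKETCH_FLAG_WOKEN) != 0;
	if (already)
		return false;

	cli->rpcs_in_flight++;
	w->flags |= SKETCH_FLAG_WOKEN;
	w->granted = true;
	return true;
}

/*
 * Replays the sequence from the diagram in a single thread and returns
 * how many slots were charged for one get request.
 */
static int sketch_replay(bool use_sticky)
{
	struct sketch_client cli = { .rpcs_in_flight = 0 };
	struct sketch_waiter w = { .flags = 0, .granted = false };

	/* #1: the waiter clears the flag and queues itself. */
	w.flags &= ~SKETCH_FLAG_WOKEN;

	/* #2: the first put grants a slot and wakes the waiter. */
	sketch_claim(&cli, &w, use_sticky);

	/* #1: the wait returns and clears the flag; still queued. */
	w.flags &= ~SKETCH_FLAG_WOKEN;

	/* #3: a second put finds the same entry before list_del(). */
	sketch_claim(&cli, &w, use_sticky);

	return cli.rpcs_in_flight;
}
```

In this model sketch_replay(false) charges two slots for a single request,
which is the leak, while sketch_replay(true) charges one, because the
sticky flag survives the reset of the wait-queue flag.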

Fixes: 5243630b09 ("LU-15947 obdclass: improve precision of wakeups for mod_rpcs")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I29371c8c85414413c5a8e41dec3632f64ad127bb
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/50261
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/mdc/mdc_request.c
lustre/obdclass/genops.c