LU-9075 mdt: avoid race causing mdt_coordinator_cb() err msgs 43/25243/2
author Bruno Faccini <bruno.faccini@intel.com>
Fri, 3 Feb 2017 16:55:37 +0000 (17:55 +0100)
committer Oleg Drokin <oleg.drokin@intel.com>
Sun, 23 Apr 2017 03:11:09 +0000 (03:11 +0000)
This patch mainly moves the mdt_agent_record_update() call before
mdt_cdt_remove_request() in mdt_hsm_update_request_state(), to
avoid the frequent pair of error messages
"(mdt_coordinator.c:1473:mdt_hsm_update_request_state()) ...
Cannot find running request for cookie ..."
and
"(mdt_coordinator.c:339:mdt_coordinator_cb()) ...
cannot cleanup timed out request ..."
These most likely concern active requests that have completed and
have therefore already been removed from memory in
mdt_hsm_update_request_state() (by mdt_cdt_remove_request(), in the
context of an MDT thread handling the CT's MDS_HSM_PROGRESS
requests), while the corresponding action LLOG record update is
still stuck in mdt_agent_record_update(), waiting for the CDT to
release cdt_llog_lock.
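
The ordering issue described above can be modelled with a minimal,
single-threaded sketch. This is not the Lustre code: struct request,
record_update(), remove_request(), complete_request() and
coordinator_scan() are hypothetical stand-ins, and the behaviour of the
coordinator scan is assumed from the description in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified status of an action LLOG record. */
enum rec_status { REC_STARTED, REC_SUCCEED };

/* Hypothetical model of one HSM request: its in-memory presence
 * and the status stored in its action LLOG record. */
struct request {
	uint64_t	cookie;
	bool		in_memory;
	enum rec_status	llog_status;
};

/* Stand-in for mdt_agent_record_update(). In the real code this
 * step may wait for cdt_llog_lock; here it completes immediately. */
static int record_update(struct request *r, enum rec_status s)
{
	assert(r != NULL);
	r->llog_status = s;
	return 0;
}

/* Stand-in for mdt_cdt_remove_request(): fails if the request is
 * no longer in memory. */
static int remove_request(struct request *r)
{
	assert(r != NULL);
	if (!r->in_memory)
		return -1;
	r->in_memory = false;
	return 0;
}

/* Order after the patch: update the record first, then drop the
 * in-memory request. */
static int complete_request(struct request *r)
{
	int rc = record_update(r, REC_SUCCEED);

	if (rc != 0)
		return rc;
	return remove_request(r);
}

/* Assumed model of the coordinator scan: a record that is still
 * STARTED and has timed out needs its in-memory request for
 * cleanup; returns -1 when that request is already gone. */
static int coordinator_scan(struct request *r, bool timed_out)
{
	assert(r != NULL);
	if (r->llog_status != REC_STARTED || !timed_out)
		return 0;
	return remove_request(r);
}
```

In this model, running remove_request() first leaves a window in which
the record is still STARTED while the request is gone, so
coordinator_scan() fails. Running record_update() first closes that
window, because the scan then skips the record.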

Other related but minor changes: use arr_req_change instead of
arr_req_create to more accurately determine whether a request has
exceeded the timeout, and change the main debug msg in
mdt_hsm_update_request_state() to reflect whether the action LLOG
record update will occur.
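
As a rough illustration of the timeout change, the sketch below measures
the timeout from the last change time rather than from the creation
time. Only the field names arr_req_create and arr_req_change come from
this message; the struct, the request_timed_out() helper and the exact
comparison are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical subset of a request's timestamps. */
struct agent_req_times {
	time_t arr_req_create;	/* when the request was created */
	time_t arr_req_change;	/* when the request last changed */
};

/* A request counts as timed out only if nothing has changed for
 * more than 'timeout' seconds, regardless of its creation time. */
static bool request_timed_out(const struct agent_req_times *t,
			      time_t now, time_t timeout)
{
	assert(t != NULL);
	return now > t->arr_req_change + timeout;
}
```

With this check, a long-running request that still reports progress is
not flagged just because it was created long ago.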

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I043813f1ff11a7e9e99c534fa8560a35e2c52543
Reviewed-on: https://review.whamcloud.com/25243
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
