LU-4509 ptlrpc: re-enqueue ptlrpcd worker 05/9705/8
author James Simmons <uja.ornl@gmail.com>
Fri, 2 May 2014 13:52:41 +0000 (09:52 -0400)
committer Oleg Drokin <oleg.drokin@intel.com>
Thu, 22 May 2014 04:52:16 +0000 (04:52 +0000)
commit 901d3aff446fdfa7bb1bd694ac6708f4bb070f34
tree 12be0702704f2ced7ea9d850255315574887aed9
parent ed82a26746d22fc150f5b096aa9185a049d22a30

osc_extent_wait() can get stuck in a scenario like this:

1) thread-1 holds an active extent.
2) thread-2 calls flush cache and marks this extent as "urgent"
   and "sync_wait".
3) thread-3 wants to write to the same extent; osc_extent_find()
   reports a "conflict" because this extent is "sync_wait", so
   thread-3 starts to wait.
4) cl_writeback_work has been scheduled by thread-4 to write some
   other extents; it has sent the RPCs but has not returned yet.
5) thread-1 finishes its work and calls osc_extent_release()->
   osc_io_unplug_async()->ptlrpcd_queue_work(), which finds that
   cl_writeback_work is still running, so the request is ignored
   (-EBUSY).
6) thread-3 is stuck because nobody will wake it up.

This patch allows ptlrpcd_work to be rescheduled, so it no longer
misses requests.

Lustre-commit: d1dcff3084e929f5768dc733cdc104cddf168c06
Lustre-change: http://review.whamcloud.com/8922

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I4929d52b2d409c2ce081147bb5ee3dd380a86c43
Reviewed-on: http://review.whamcloud.com/8922
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/9705
lustre/ptlrpc/client.c