LU-4509 ptlrpc: re-enqueue ptlrpcd worker 22/8922/4
author Liang Zhen <liang.zhen@intel.com>
Mon, 20 Jan 2014 12:52:51 +0000 (20:52 +0800)
committer Oleg Drokin <oleg.drokin@intel.com>
Mon, 3 Feb 2014 02:30:54 +0000 (02:30 +0000)
commit d1dcff3084e929f5768dc733cdc104cddf168c06
tree 2c2190899c9829c6fb63e989f19d75065ab02ed3
parent b8d04564d54fbff0836f3783fd70dcbc8771c008
LU-4509 ptlrpc: re-enqueue ptlrpcd worker

osc_extent_wait() can get stuck in a scenario like this:

1) thread-1 holds an active extent
2) thread-2 calls flush cache and marks this extent as "urgent"
   and "sync_wait"
3) thread-3 wants to write to the same extent; osc_extent_find() will
   get "conflict" because this extent is "sync_wait", so it starts
   to wait...
4) cl_writeback_work has been scheduled by thread-4 to write some
   other extents; it has sent RPCs but has not returned yet.
5) thread-1 finishes its work and calls osc_extent_release()->
   osc_io_unplug_async()->ptlrpcd_queue_work(), but finds that
   cl_writeback_work is still running, so the request is ignored
   (-EBUSY)
6) thread-3 is stuck because nobody will wake it up.

This patch allows ptlrpcd_work to be rescheduled even while it is
still running, so it will not miss requests anymore.
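
The difference between dropping and re-enqueueing a request can be
sketched with a small single-threaded model. This is only an
illustration under stated assumptions, not the code in
lustre/ptlrpc/client.c: struct work_item, queue_work_drop(),
queue_work_requeue() and simulate() are hypothetical names, and the
real ptlrpcd work item is driven by concurrent threads rather than a
loop.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a work item; NOT the Lustre ptlrpcd structures. */
struct work_item {
    bool running;   /* the work callback is currently executing */
    bool pending;   /* a run has been requested and not yet started */
    int  runs;      /* number of times the callback has run */
};

typedef int (*queue_fn)(struct work_item *);

/* Old behaviour in this model: a request that arrives while the work is
 * scheduled or running is dropped with -EBUSY. */
int queue_work_drop(struct work_item *w)
{
    if (w->running || w->pending)
        return -EBUSY;
    w->pending = true;
    return 0;
}

/* Re-enqueue behaviour in this model: the request is always recorded, so
 * the worker runs the callback once more after the current run ends. */
int queue_work_requeue(struct work_item *w)
{
    w->pending = true;
    return 0;
}

/* Replay steps 4) and 5): the work is scheduled once, and a second queue
 * request arrives while the first run is still in progress.
 * Returns how many times the callback ended up running. */
int simulate(queue_fn queue)
{
    struct work_item w = { false, false, 0 };
    bool injected = false;

    assert(queue != NULL);
    (void)queue(&w);                /* step 4: thread-4 schedules the work */

    while (w.pending) {
        w.pending = false;
        w.running = true;
        w.runs++;
        if (!injected) {            /* step 5: thread-1 queues mid-run */
            injected = true;
            (void)queue(&w);
        }
        w.running = false;
    }
    return w.runs;
}
```

In this model simulate(queue_work_drop) runs the callback once, so the
request made in step 5) is lost and the waiter from step 3) is never
woken. simulate(queue_work_requeue) runs it twice, so the second pass
can pick up the released extent.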

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I4929d52b2d409c2ce081147bb5ee3dd380a86c43
Reviewed-on: http://review.whamcloud.com/8922
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
lustre/ptlrpc/client.c