LU-13377 llite: fix dead loop for short write 18/38018/6
author Wang Shilong <wshilong@ddn.com>
Sat, 21 Mar 2020 01:58:09 +0000 (09:58 +0800)
committer Oleg Drokin <green@whamcloud.com>
Tue, 7 Apr 2020 17:21:25 +0000 (17:21 +0000)
commit 13dfe0df4956afb50b323a11615b0b34ed014e53
tree 277a0b03664b8697dc9a77ec877980c32bcd4dbd
parent d292fb12a0bdcb3d8dae22767929b20b10d1197e
LU-13377 llite: fix dead loop for short write

|->vvp_io_write_start()
  |->__generic_file_write_iter()
    |->iov_iter_advance() if the write succeeds
  |->vvp_io_write_commit()
    |->update ci_nob

The problem is that the iov iter is advanced inside
__generic_file_write_iter(), but @ci_nob is only updated
after vvp_io_write_commit().

If the write runs out of quota or hits some other problem,
@ci_nob and @vui_iter can get out of sync.

@vui_iter->count is then reset from @ci_nob in
iov_iter_reexpand(), which makes @vui_iter->count larger
than what is really left, and we can loop forever in
vvp_mmap_locks() if the IO needs to be retried or restarted:

vvp_io_write_lock+0x45/0x80 [lustre]
cl_io_lock+0x5f/0x3d0 [obdclass]
cl_io_loop+0x92/0x190 [obdclass]
ll_file_io_generic+0x7b3/0xc90 [lustre]
ll_file_aio_write+0x12d/0x1f0 [lustre]
ll_file_write+0xce/0x1e0 [lustre]
vfs_write+0xc0/0x1f0
SyS_write+0x7f/0xf0
system_call_fastpath+0x22/0x27
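
The mismatch described above can be illustrated with a small
userspace model. This is only a sketch, not the code in
lustre/llite/vvp_io.c: every toy_* name is hypothetical, the
iterator is reduced to a position and a remaining count, and
toy_resync() is just one plausible way to bring the two back
in line (rewinding to the committed offset).

```c
#include <stddef.h>

/* Toy stand-in for an iov_iter: a position and a remaining byte count. */
struct toy_iter {
	size_t pos;	/* bytes already consumed from the user buffer */
	size_t count;	/* bytes the iterator claims are still left */
};

/* Toy stand-in for the IO accounting; ci_nob mirrors the name above. */
struct toy_io {
	size_t total;	/* size of the original write request */
	size_t ci_nob;	/* bytes committed so far */
};

/* Models the copy step: the iterator moves forward immediately. */
static void toy_advance(struct toy_iter *it, size_t n)
{
	it->pos += n;
	it->count -= n;
}

/* Models the commit step: only 'committed' bytes are accounted. */
static void toy_commit(struct toy_io *io, size_t committed)
{
	io->ci_nob += committed;
}

/* Models re-expansion on retry: count is rebuilt from ci_nob alone. */
static void toy_reexpand(struct toy_iter *it, const struct toy_io *io)
{
	it->count = io->total - io->ci_nob;
}

/* Assumed repair: rewind the iterator to the committed offset. */
static void toy_resync(struct toy_iter *it, const struct toy_io *io)
{
	it->pos = io->ci_nob;
	it->count = io->total - io->ci_nob;
}

/* Bytes by which the iterator's claimed range overruns the request. */
static size_t toy_overrun(const struct toy_iter *it, const struct toy_io *io)
{
	size_t end = it->pos + it->count;

	return end > io->total ? end - io->total : 0;
}
```

For a 100-byte request where the copy step advances the full
100 bytes but only 40 are committed, toy_reexpand() leaves
count at 60 while pos is already 100, so toy_overrun() reports
60 bytes past the end of the request. After toy_resync(), pos
is 40 and the overrun is 0.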

Change-Id: I5fb4c18cf02fb17bf50122b63decacef678caa01
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/38018
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
lustre/include/obd_support.h
lustre/llite/vvp_io.c
lustre/tests/sanity.sh