LU-13377 llite: fix dead loop for short write 63/38163/2
author Wang Shilong <wshilong@ddn.com>
Sat, 21 Mar 2020 01:58:09 +0000 (09:58 +0800)
committer Oleg Drokin <green@whamcloud.com>
Tue, 14 Apr 2020 17:56:11 +0000 (17:56 +0000)
commit 9e8b4e2fc2f0b3ad61b0fed9326580dad0389cbf
tree d49b690d56328bd359757932cc037035db2d8608
parent de994667dda925109e862edadb4aa4feaecd0e6b

|->vvp_io_write_start()
   |->__generic_file_write_iter()
      |->iov_iter_advance() if the write succeeds
   |->vvp_io_write_commit()
      |->update ci_nob

The problem is that the iov iter is advanced inside
__generic_file_write_iter(), but @ci_nob is only updated
after vvp_io_write_commit().

If the write runs out of quota or hits some other problem,
@ci_nob and @vui_iter can get out of sync.

@vui_iter->count is then reset from @ci_nob in
iov_iter_reexpand(), which makes @vui_iter->count larger
than what is really left, and we can loop forever in
vvp_mmap_locks() if the IO needs to be retried or restarted:

vvp_io_write_lock+0x45/0x80 [lustre]
cl_io_lock+0x5f/0x3d0 [obdclass]
cl_io_loop+0x92/0x190 [obdclass]
ll_file_io_generic+0x7b3/0xc90 [lustre]
ll_file_aio_write+0x12d/0x1f0 [lustre]
ll_file_write+0xce/0x1e0 [lustre]
vfs_write+0xc0/0x1f0
SyS_write+0x7f/0xf0
system_call_fastpath+0x22/0x27
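
The mismatch described above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: toy_iter, toy_io and
every toy_*() function are hypothetical stand-ins, not the kernel's
struct iov_iter or the Lustre structures. The resync branch shows one
possible way to bring the iterator back in line with the committed byte
count and is not necessarily what this patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct iov_iter. */
struct toy_iter {
	size_t pos;	/* bytes already consumed from the user buffer */
	size_t count;	/* bytes the iterator claims are still left */
};

/* Hypothetical stand-in for the IO descriptor; nob plays the role of @ci_nob. */
struct toy_io {
	size_t total;	/* size of the whole write request */
	size_t nob;	/* bytes committed so far */
};

/* Models iov_iter_advance(): consume n bytes from the iterator. */
void toy_advance(struct toy_iter *it, size_t n)
{
	assert(n <= it->count);
	it->pos += n;
	it->count -= n;
}

/* Models the reset on retry: the remaining count is derived from nob alone. */
void toy_reexpand(struct toy_iter *it, const struct toy_io *io)
{
	it->count = io->total - io->nob;
}

/*
 * One write pass: `copied` bytes are consumed from the iterator, but only
 * `committed` bytes are accounted. With resync set, the iterator is rewound
 * to the committed position before the count is reset (assumed remedy).
 */
void toy_write_pass(struct toy_iter *it, struct toy_io *io,
		    size_t copied, size_t committed, int resync)
{
	toy_advance(it, copied);
	io->nob += committed;
	if (resync)
		it->pos = io->nob;
	toy_reexpand(it, io);
}

/* Bytes by which pos + count runs past the end of the request. */
size_t toy_overrun(const struct toy_iter *it, const struct toy_io *io)
{
	size_t end = it->pos + it->count;

	return end > io->total ? end - io->total : 0;
}

/* Short write scenario: 4096 bytes copied, only 1024 committed. */
size_t toy_demo_overrun(int resync)
{
	struct toy_io io = { .total = 4096, .nob = 0 };
	struct toy_iter it = { .pos = 0, .count = 4096 };

	toy_write_pass(&it, &io, 4096, 1024, resync);
	return toy_overrun(&it, &io);
}
```

In this model toy_demo_overrun(0) returns 3072: the iterator has consumed
the whole buffer, yet its count is reset to 3072 bytes that are no longer
there. toy_demo_overrun(1) returns 0 because the position is rewound to the
committed offset first.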

Lustre-change: https://review.whamcloud.com/38018
Lustre-commit: 13dfe0df4956afb50b323a11615b0b34ed014e53

Change-Id: I5fb4c18cf02fb17bf50122b63decacef678caa01
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/38163
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
lustre/include/obd_support.h
lustre/llite/vvp_io.c
lustre/tests/sanity.sh