LU-17478 clio: parallelize unaligned DIO write copy
The data copying for unaligned/hybrid IO reads is already
parallel because it is done by the ptlrpc threads at the
end of the IO. The copy for writes is not parallel: it is
done by the submitting thread during IO submission.
This has a huge performance impact, limiting writes to
around 3.0 GiB/s when reads are at 12 GiB/s.
With the iov iter issue fixed, we can move this copy out of
the submitting thread and into the ptlrpcd threads.
With this and the patch to use page pools for buffer
allocation (https://review.whamcloud.com/53670), the
maximum performance of hybrid IO is basically the same as
DIO, at least for current speeds.
This means hybrid reads and writes at 20 GiB/s with
current master plus this patch and the pool patch.
Note this requires a funny workaround: If a user thread
calls fsync while a DIO write is in progress, the user
thread can pick that write up at the RPC layer and become
responsible for writing it out, even though that write
isn't in cache. (This can happen because the write is still
waiting to be picked up by a ptlrpcd thread.)
If that DIO write is unaligned, the user thread is unable
to do the memory copy. It's not an option to have the
thread ignore a ready RPC, so instead we spawn a kthread
to handle this case.
This only occurs when DIO is racing with fsync, so
performance doesn't really matter, and this cost is OK.
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Ic8209e1fda97cda83e5b87baba48d15dd4dcc15f
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/53844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>