LU-8964 clio: Parallelize generic I/O
Add a parallel version of the cl_io_loop() function, which uses stripe
information from the LOV layer and processes the stripes in parallel.
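The idea of splitting one I/O request along stripe boundaries and handling
the pieces concurrently can be sketched as follows. This is a user-space
illustration only, not the kernel code in this patch: it assumes a simple
round-robin (RAID0-style) layout, and the names split_by_stripe(),
parallel_io(), stripe_size, stripe_count and do_chunk are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_by_stripe(offset, length, stripe_size, stripe_count):
    """Cut [offset, offset + length) at stripe boundaries.

    Returns a list of (stripe_index, start, size) tuples. A round-robin
    layout is assumed: stripe unit k lives on stripe k % stripe_count.
    """
    chunks = []
    pos = offset
    end = offset + length
    while pos < end:
        unit = pos // stripe_size                      # stripe unit holding pos
        chunk_end = min((unit + 1) * stripe_size, end)  # stop at unit boundary
        chunks.append((unit % stripe_count, pos, chunk_end - pos))
        pos = chunk_end
    return chunks


def parallel_io(offset, length, stripe_size, stripe_count, do_chunk):
    """Run do_chunk(stripe_index, start, size) for every chunk in parallel.

    do_chunk is a hypothetical callback that returns the number of bytes
    it handled; the total over all chunks is returned.
    """
    chunks = split_by_stripe(offset, length, stripe_size, stripe_count)
    with ThreadPoolExecutor(max_workers=stripe_count) as pool:
        done = list(pool.map(lambda c: do_chunk(*c), chunks))
    return sum(done)
```

For example, a 4-byte request at offset 1 with stripe_size=2 and
stripe_count=2 is cut into three chunks, two of which land on stripe 0.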
This feature is disabled by default. To enable it, run
"lctl set_param llite.*.pio=1".
IOR results on KNL for:
access = file-per-process
ordering in a file = sequential offsets
ordering inter file= no tasks offsets
clients = 1 (1 per node)
repetitions = 1
blocksize = 128 GiB
aggregate filesize = 128 GiB
xfersize  pio   Write   Read
16        none  170.46  372.12
16        off   370.46  926.53
16        on    668.49  899.55
32        off   368.75  908.95
32        on    469.54  987.64
IOR results on Broadwell Xeon for:
access = file-per-process
ordering in a file = sequential offsets
ordering inter file= no tasks offsets
clients = 1 (1 per node)
repetitions = 1
xfersize = 16 MiB
blocksize = 128 GiB
aggregate filesize = 128 GiB
pio   Write    Read
none  1419.80  1277.88
off   1348.98  2245.84
on     990.76  2320.08
IOR scalability results on another Broadwell Xeon system for:
access = file-per-process
ordering in a file = sequential offsets
ordering inter file= no tasks offsets
repetitions = 1
xfersize = 4 MiB
blocksize = 8 GiB
Threads  pio  Write    Read
32       off  9358.38  2649.28
32       on   9147.14  2677.44
64       off  8538.65  2811.05
64       on   8944.19  2908.44
128      off  7978.61  2937.03
128      on   8613.91  2928.44
All numbers are in MB/s.
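To make the effect of pio easier to read off the tables, the on/off ratio
can be computed from the reported figures. The short script below is only a
convenience sketch; the helper name speedup() and the dictionary layout are
illustrative, and the values are copied from the tables above.

```python
def speedup(on, off):
    """Ratio of throughput with pio on to throughput with pio off."""
    return on / off


# (write MB/s, read MB/s) as reported above, keyed by test and pio setting.
results = {
    "KNL, xfersize 16": {"off": (370.46, 926.53), "on": (668.49, 899.55)},
    "KNL, xfersize 32": {"off": (368.75, 908.95), "on": (469.54, 987.64)},
    "Broadwell, 16 MiB": {"off": (1348.98, 2245.84), "on": (990.76, 2320.08)},
    "Scalability, 32 threads": {"off": (9358.38, 2649.28), "on": (9147.14, 2677.44)},
    "Scalability, 64 threads": {"off": (8538.65, 2811.05), "on": (8944.19, 2908.44)},
    "Scalability, 128 threads": {"off": (7978.61, 2937.03), "on": (8613.91, 2928.44)},
}

for name, row in results.items():
    write_ratio = speedup(row["on"][0], row["off"][0])
    read_ratio = speedup(row["on"][1], row["off"][1])
    print(f"{name}: write x{write_ratio:.2f}, read x{read_ratio:.2f}")
```

A ratio above 1 means pio=1 was faster in that run; a ratio below 1 means
it was slower, as with the single-client Broadwell write result.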
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Id028faba1726fb377d0e903e8b8095d5ea9d1ee2
Reviewed-on: https://review.whamcloud.com/26468
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
15 files changed: