LU-9749 llite: Reduce overhead for ll_do_fast_read
In ll_do_fast_read, looking up a cl_env adds some overhead,
and can also cause spinlock contention on older kernels.
Fast read can safely use the preallocated percpu cl_env, so
do that to reduce overhead.
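The difference between the two approaches can be sketched in
userspace as below. This is only an illustration of the general
pattern, not the Lustre cl_env code: struct env, NSLOTS,
env_get_shared(), env_put_shared() and env_get_percpu() are all
hypothetical names, and a pthread mutex stands in for the spinlock.

```c
#include <pthread.h>
#include <stddef.h>

#define NSLOTS 4 /* hypothetical stand-in for the number of CPUs */

/* Hypothetical environment object; not the real Lustre structure. */
struct env {
	int in_use;
	char scratch[64];
};

/* Shared cache: every get and put takes the same global lock. */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;
static struct env cache[NSLOTS];

/* Preallocated environments, one per slot, set up ahead of time. */
static struct env percpu_env[NSLOTS];

/* Lookup path: search the shared cache under the lock. */
struct env *env_get_shared(void)
{
	struct env *found = NULL;
	size_t i;

	pthread_mutex_lock(&cache_lock);
	for (i = 0; i < NSLOTS; i++) {
		if (!cache[i].in_use) {
			cache[i].in_use = 1;
			found = &cache[i];
			break;
		}
	}
	pthread_mutex_unlock(&cache_lock);
	return found; /* NULL if every entry is busy */
}

/* Return an environment to the shared cache, again under the lock. */
void env_put_shared(struct env *e)
{
	pthread_mutex_lock(&cache_lock);
	e->in_use = 0;
	pthread_mutex_unlock(&cache_lock);
}

/* Preallocated path: index by slot, no lock and no search. */
struct env *env_get_percpu(unsigned int slot)
{
	return &percpu_env[slot % NSLOTS];
}
```

The lock-free path is only correct if nothing else uses the same
slot while the caller holds the environment. The sketch assumes
that by having the caller pass its own slot index; in a kernel the
equivalent assumption is that the caller stays on its CPU for the
duration.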
Benchmark results follow. The SLES numbers were measured on a
recent Xeon, the CentOS numbers on VMs on older hardware. SLES
has queued spinlocks and scales almost linearly from 1 to 4
threads, with or without this patch. CentOS scales poorly at
small I/O sizes without this patch.
SLES is SLES12SP2, CentOS is CentOS 7.3.
SLES:
1 thread
              8b      1K      1M
Without:      23    2200    6800
With:       27.5    2500    7200
4 threads
              8b      1K      1M
Without:      90    8700   27000
With:        108   10000   28000
Earlier kernel (CentOS 7.3):
1 thread
              8b      1K      1M
Without:       9    1000    5100
With:         12    1300    5800
4 threads
              8b      1K      1M
Without:      22    2400   17000
With:         48    4900   20000
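The scaling claim can be checked directly from the figures above.
The helpers below are a small sketch for that arithmetic; the
function names are made up for illustration, and since the commit
does not state the units, only unit-free ratios are computed.

```c
/*
 * Ratio of 4-thread to 1-thread throughput.
 * 4.0 would be perfect linear scaling.
 */
double scaling(double one_thread, double four_threads)
{
	return four_threads / one_thread;
}

/* Relative gain of the patched figure over the unpatched one, in percent. */
double gain_pct(double unpatched, double patched)
{
	return (patched - unpatched) / unpatched * 100.0;
}
```

For the 8b column on CentOS 7.3, scaling(9, 22) is about 2.4
without the patch and scaling(12, 48) is 4.0 with it, while
gain_pct(22, 48) puts the 4-thread improvement at roughly 118%.
On SLES the same column gives scaling(23, 90) of about 3.9 even
without the patch.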
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ice5d653ace5ce76bc8911501a9b15c11b7a3234a
Reviewed-on: https://review.whamcloud.com/27970
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>