git://git.whamcloud.com - fs/lustre-release.git/commit

author	Patrick Farrell <paf@cray.com>
	Mon, 17 Jul 2017 14:03:07 +0000 (09:03 -0500)
committer	Oleg Drokin <oleg.drokin@intel.com>
	Sat, 22 Jul 2017 02:54:57 +0000 (02:54 +0000)
commit	c084c6215851d238d14b0d414374b6b55c91f525
tree	4cccbfee2f669eb818614cba59103d33a7e8c63d
parent	834e942d3328f97b852fb6f4992775f5e3963483

LU-9749 llite: Reduce overhead for ll_do_fast_read

In ll_do_fast_read, looking up a cl_env adds some overhead,
and can also cause spinlock contention on older kernels.

Fast read can safely use the preallocated percpu cl_env, so
do that to reduce overhead.
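
The difference between the two approaches can be sketched in userspace C. This is a simplified illustration, not the Lustre code: `struct env`, `env_lookup`, `env_release`, `env_percpu_get`, `env_percpu_put` and `NR_SLOTS` are hypothetical names, and the caller passes the slot index explicitly where kernel code would presumably use the current CPU. The sketch assumes the holder of a per-slot env neither nests nor gives up its slot while using it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_SLOTS 4	/* hypothetical stand-in for the number of CPUs */

/* Hypothetical stand-in for a cl_env: scratch state for one I/O. */
struct env {
	int  in_use;
	char scratch[64];
};

/* Lookup path: one shared pool behind one lock, so all callers contend. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;
static struct env shared_pool[NR_SLOTS];

struct env *env_lookup(void)
{
	struct env *found = NULL;

	pthread_mutex_lock(&pool_lock);
	for (size_t i = 0; i < NR_SLOTS; i++) {
		if (!shared_pool[i].in_use) {
			shared_pool[i].in_use = 1;
			found = &shared_pool[i];
			break;
		}
	}
	pthread_mutex_unlock(&pool_lock);
	return found;		/* NULL if every env is busy */
}

void env_release(struct env *e)
{
	pthread_mutex_lock(&pool_lock);
	e->in_use = 0;
	pthread_mutex_unlock(&pool_lock);
}

/* Preallocated path: one env per slot, no lock and no search. */
static struct env percpu_env[NR_SLOTS];

struct env *env_percpu_get(unsigned int cpu)
{
	struct env *e = &percpu_env[cpu % NR_SLOTS];

	assert(!e->in_use);	/* holder must not nest on the same slot */
	e->in_use = 1;
	return e;
}

void env_percpu_put(struct env *e)
{
	e->in_use = 0;
}
```

In this sketch the lookup path takes a lock and scans the pool on every call, while the preallocated path is a single array index. That is the kind of per-call cost and lock contention the change is meant to avoid.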

The SLES numbers were measured on a recent Xeon; the CentOS
numbers were measured on VMs on older hardware.  SLES has
queued spinlocks and scales almost linearly with the number
of threads, with or without this patch.  CentOS scales
poorly at small I/O sizes without this patch.

SLES is SLES12SP2, CentOS is CentOS 7.3.

SLES:
1 thread
         8b   1K   1M
Without: 23   2200 6800
With:    27.5 2500 7200

4 threads
         8b   1K    1M
Without: 90   8700  27000
With:    108  10000 28000

Earlier kernel (CentOS 7.3):
1 thread
         8b  1K   1M
Without: 9   1000 5100
With:    12  1300 5800

4 threads
         8b  1K   1M
Without: 22  2400 17000
With:    48  4900 20000
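
To read the tables as relative gains, a small helper like the following can be used. It is only a convenience sketch: `gain_percent` is a hypothetical name, and because the commit message does not state the unit of the figures, the function only compares the "Without" and "With" values as given.

```c
#include <assert.h>

/* Percentage change from the "Without" figure to the "With" figure. */
double gain_percent(double without, double with)
{
	assert(without > 0.0);	/* a zero baseline has no relative gain */
	return (with - without) / without * 100.0;
}
```

For example, `gain_percent(22, 48)` (CentOS, 4 threads, 8b) comes to roughly 118, while `gain_percent(90, 108)` (SLES, 4 threads, 8b) comes to roughly 20. This matches the observation that the patch matters most on the older kernel at small I/O sizes.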

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ice5d653ace5ce76bc8911501a9b15c11b7a3234a
Reviewed-on: https://review.whamcloud.com/27970
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>

lustre/llite/file.c
lustre/llite/rw.c