When copying one full page to another in kernel memory, the
kernel provides an optimized copy_page() helper which can be
used instead of memcpy().
This is relevant in the LNet loopback path, which copies data
from the client's kiov to the server's kiov.
(Sharing the same page between client and server is problematic
for a number of reasons, so copying is best.)
So we can check for a full-page-to-full-page copy and use
copy_page() in that case.
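The fast-path selection can be sketched in user space as follows. This is a minimal illustration, not the kernel code: PAGE_SIZE and copy_page() are stand-ins (copy_page() here is just a whole-page memcpy()), and copy_frag()/doff/soff are hypothetical names; the actual check in the hunk below tests this_nob together with the kiov bv_offset and copy offsets.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for kernel definitions: the kernel supplies
 * PAGE_SIZE and an arch-optimized copy_page(); here copy_page() is
 * simply a whole-page memcpy() so the logic runs anywhere. */
#define PAGE_SIZE 4096UL

static void copy_page(void *to, const void *from)
{
	memcpy(to, from, PAGE_SIZE);
}

/* Sketch of the fast-path check: daddr/saddr are assumed to be
 * already offset-adjusted; doff/soff are the page offsets, used only
 * to decide whether the copy covers exactly one full page. */
static void copy_frag(void *daddr, size_t doff,
		      const void *saddr, size_t soff, size_t nob)
{
	if (nob == PAGE_SIZE && doff == 0 && soff == 0)
		copy_page(daddr, saddr);	/* full page: fast path */
	else
		memcpy(daddr, saddr, nob);	/* partial: fall back */
}
```

Only copies that start at offset zero on both sides and span exactly one page take the fast path; everything else falls back to the byte-exact memcpy().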
On my small test VM, this improves maximum write performance
(with the fake write fail_loc enabled) by about 20%, from
4.4 GiB/s to 5.7 GiB/s.
We should eventually also be able to add a fake copy to the
fake read/fake write fail_loc, but that is a bit tricky, so it
is left out of this patch.
Signed-off-by: Patrick Farrell <patrick.farrell@oracle.com>
Change-Id: Iea89447ed03bd4646544883b588873700f6e09a4
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/54923
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
#define DEBUG_SUBSYSTEM S_LNET
#include <linux/pagemap.h>
+#include <linux/mm.h>
#include <lnet/lib-lnet.h>
#include <linux/nsproxy.h>
* However in practice at least one of the kiovs will be mapped
* kernel pages and the map/unmap will be NOOPs */
- memcpy (daddr, saddr, this_nob);
+ if (this_nob == PAGE_SIZE && !diov->bv_offset && !doffset &&
+ !siov->bv_offset && !soffset)
+ copy_page(daddr, saddr);
+ else
+ memcpy(daddr, saddr, this_nob);
nob -= this_nob;
if (diov->bv_len > doffset + this_nob) {