Whamcloud - gitweb
fs/lustre-release.git
2 months ago LU-13799 llite: Adjust dio refcounting 47/39447/16
Patrick Farrell [Fri, 7 May 2021 19:50:15 +0000 (15:50 -0400)]
LU-13799 llite: Adjust dio refcounting

We get a page reference in cl_page_find, then immediately
add another for cl_2queue_add and remove the first
reference.  This is pretty silly, since the life cycle is
the same on these.
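
A minimal sketch of the idea, written in Python rather than the kernel's C. The `Page` class and the two submit helpers are hypothetical stand-ins, not Lustre APIs: the old path takes a second reference for the queue and then drops the first, while the new path hands the lookup reference straight to the queue.

```python
class Page:
    """Toy stand-in for a cl_page; counts reference operations."""

    def __init__(self):
        self.refs = 0
        self.ref_ops = 0  # number of get/put calls made on this page

    def get(self):
        self.refs += 1
        self.ref_ops += 1

    def put(self):
        self.refs -= 1
        self.ref_ops += 1


def submit_old(queue):
    page = Page()
    page.get()          # reference taken by the lookup (cl_page_find)
    page.get()          # extra reference for the queue (cl_2queue_add)
    queue.append(page)
    page.put()          # drop the lookup reference again
    return page


def submit_new(queue):
    page = Page()
    page.get()          # reference taken by the lookup
    queue.append(page)  # the queue inherits that same reference
    return page


old, new = submit_old([]), submit_new([])
print(old.refs, old.ref_ops)  # 1 3
print(new.refs, new.ref_ops)  # 1 1
```

Both paths leave the page holding exactly one reference; the new path gets there with one operation instead of three.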

This improves DIO/AIO page submission by around 2%.

This patch reduces i/o time in ms/GiB by:
Write: 2 ms/GiB
Read: 2 ms/GiB

Totals:
Write: 170 ms/GiB
Read: 162 ms/GiB

mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect

With previous patches in series:
write        5955 MiB/s
read         6218 MiB/s

Plus this patch:
write        6028 MiB/s
read         6305 MiB/s
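
The ms/GiB figures can be cross-checked against the IOR throughput numbers. The helper below is only an illustrative sketch (the function name is made up here); it converts MiB/s into milliseconds per GiB.

```python
def ms_per_gib(mib_per_s):
    """Convert a throughput in MiB/s to the time needed per GiB, in ms."""
    return 1024 / mib_per_s * 1000


write_before, write_after = ms_per_gib(5955), ms_per_gib(6028)
read_before, read_after = ms_per_gib(6218), ms_per_gib(6305)

print(round(write_after), round(write_before - write_after))  # 170 2
print(round(read_after), round(read_before - read_after))     # 162 2
```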

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I228eca6d48c6007bbf2c8caae5e477b7d40521d1
Reviewed-on: https://review.whamcloud.com/39447
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13799 lov: Improve DIO submit 46/39446/16
Patrick Farrell [Fri, 7 May 2021 19:42:20 +0000 (15:42 -0400)]
LU-13799 lov: Improve DIO submit

Skip some unnecessary looping in page submission for the
DIO case.

This gives about a 2% improvement for AIO/DIO page
submission.

This patch reduces i/o time in ms/GiB by:
Write: 2 ms/GiB
Read: 2 ms/GiB

Totals:
Write: 172 ms/GiB
Read: 165 ms/GiB

mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect

With previous patches in series:
write        ~5885 MiB/s
read         5899 MiB/s

Plus this patch:
write        5954 MiB/s
read         6217 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Iedad978438ee3f1f3290d990311532626cba9e2d
Reviewed-on: https://review.whamcloud.com/39446
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13799 llite: Remove transient page counting 41/39441/15
Patrick Farrell [Sat, 29 May 2021 01:32:43 +0000 (21:32 -0400)]
LU-13799 llite: Remove transient page counting

Transient page counting is not used for anything, as
already noted in the commit message, but costs something
like 4% of the time in DIO page submission.

Remove it.

mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect

This patch reduces i/o time in ms/GiB by:
Write: 6 ms/GiB
Read: 11 ms/GiB

Totals:
Write: 174 ms/GiB
Read: 167 ms/GiB

With previous patches in series:
write     5703 MiB/s
read      5756 MiB/s

Plus this patch:
write     5900 MiB/s
read      6136 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I825de4f1b5d1dd1476a4a711bfa51e7d24b5027a
Reviewed-on: https://review.whamcloud.com/39441
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-13799 llite: Modify AIO/DIO reference counting 42/39442/14
Patrick Farrell [Fri, 7 May 2021 15:50:51 +0000 (11:50 -0400)]
LU-13799 llite: Modify AIO/DIO reference counting

For DIO pages, it's enough to have a reference on the
cl_object associated with the AIO.  This saves taking a
reference on the cl_object for each page, which saves about
5% of the time when doing DIO/AIO.

This is possible because the lifecycle of the aio struct is
always greater than that of the associated pages.
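
As a rough illustration only (Python, with a hypothetical `ClObject` class and helper names, and an assumed 4 KiB page size), the difference is between pinning the object once per page and pinning it once per AIO:

```python
class ClObject:
    """Toy stand-in for a cl_object; counts reference operations."""

    def __init__(self):
        self.refs = 0
        self.ref_ops = 0

    def get(self):
        self.refs += 1
        self.ref_ops += 1

    def put(self):
        self.refs -= 1
        self.ref_ops += 1


def dio_ref_per_page(obj, npages):
    for _ in range(npages):
        obj.get()   # every page pins the object
    for _ in range(npages):
        obj.put()   # and releases it when the page completes


def dio_ref_per_aio(obj, npages):
    obj.get()       # the AIO struct pins the object once
    # ... npages pages are submitted and complete here, relying on that pin
    obj.put()       # released when the AIO itself completes


NPAGES = 64 * 1024 * 1024 // 4096  # one 64 MiB transfer in 4 KiB pages (assumed)
per_page, per_aio = ClObject(), ClObject()
dio_ref_per_page(per_page, NPAGES)
dio_ref_per_aio(per_aio, NPAGES)
print(per_page.ref_ops, per_aio.ref_ops)  # 32768 2
```

This is only safe because the AIO outlives its pages, which is the lifecycle argument made above.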

This patch reduces i/o time in ms/GiB by:
Write: 6 ms/GiB
Read: 1 ms/GiB

Totals:
Write: 198 ms/GiB
Read: 197 ms/GiB

mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect

With previous patches in series:
write     5030 MiB/s
read      5174 MiB/s

Plus this patch:
write     5183 MiB/s
read      5200 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I970cda20417265b4b66a8eed6e74440e5d3373b8
Reviewed-on: https://review.whamcloud.com/39442
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13326 mds: remove MDS_SETATTR_PORTAL and service 98/37798/7
Andreas Dilger [Wed, 4 Mar 2020 20:28:26 +0000 (12:28 -0800)]
LU-13326 mds: remove MDS_SETATTR_PORTAL and service

Remove the MDS_SETATTR_PORTAL and the service threads listening on
this portal, since they have been unused since Lustre 2.1 and are no
longer needed.

Remove module tunables related to the mds_attr service threads:
- mds_attr_num_threads
- mds_attr_cpu_bind
- mds_attr_num_cpts

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I64f4f3f0004e1895ef7b49b31a4ad687a1abcca2
Reviewed-on: https://review.whamcloud.com/37798
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13417 test: mkdir_on_mdt0() in more tests 15/44315/8
Lai Siyao [Thu, 8 Jul 2021 08:09:01 +0000 (16:09 +0800)]
LU-13417 test: mkdir_on_mdt0() in more tests

Replace mkdir with mkdir_on_mdt0() in several tests.

Update recovery-small test_110k() in case there are open files on
MDT1, which would cause umount to stall.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iebc32568b7fc146b658f47c5f5053fd3db24432f
Reviewed-on: https://review.whamcloud.com/44315
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14655 lnet: Protect lpni deref in lnet_health_check 03/43503/3
Chris Horn [Wed, 28 Apr 2021 01:10:16 +0000 (20:10 -0500)]
LU-14655 lnet: Protect lpni deref in lnet_health_check

The discovery thread can modify the peer NI/peer net/peer relationship,
so we need to be careful when dereferencing the peer NI pointer in
lnet_health_check(). The discovery thread operates under the net lock,
so move the peer NI dereference under the net lock, which is taken for
incrementing the health stats.

Move some of the other code that is only relevant for messages with a
health status != LNET_MSG_STATUS_OK under the appropriate condition.
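
The locking pattern can be sketched as follows. This is a Python analogy with hypothetical names (`net_lock`, `PeerNI`, `discovery_update`, `health_check`), not the LNet code: the reader dereferences the peer pointer only while holding the same lock the writer takes.

```python
import threading

net_lock = threading.Lock()  # stands in for the LNet net lock


class PeerNI:
    def __init__(self, peer):
        self.peer = peer          # may be re-pointed by discovery
        self.health_stats = 0


def discovery_update(lpni, new_peer):
    with net_lock:
        lpni.peer = new_peer      # writer changes the relationship under the lock


def health_check(lpni):
    with net_lock:
        peer = lpni.peer          # dereference only while the lock is held
        lpni.health_stats += 1    # the lock is needed here anyway
        return peer


lpni = PeerNI("peer-A")
discovery_update(lpni, "peer-B")
seen = health_check(lpni)
print(seen, lpni.health_stats)  # peer-B 1
```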

HPE-bug-id: LUS-9962
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I3e6763b71bcdc9281f46b79c59e40f939190d468
Reviewed-on: https://review.whamcloud.com/43503
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13417 test: generate uneven MDTs early for sanity 413 84/44384/6
Lai Siyao [Tue, 20 Jul 2021 01:24:36 +0000 (09:24 +0800)]
LU-13417 test: generate uneven MDTs early for sanity 413

Fill MDT early to generate uneven MDTs for sanity test_413, and
add test_413z to unlink these directories.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-1
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I84e3670bb40c3666488139d6a272f29188b0dfae
Reviewed-on: https://review.whamcloud.com/44384
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-14868 llite: revert 'simplify callback handling for async getattr' 71/44371/6
Andreas Dilger [Wed, 21 Jul 2021 23:38:37 +0000 (23:38 +0000)]
LU-14868 llite: revert 'simplify callback handling for async getattr'

This reverts commit cbaaa7cde45f59372c75577d7274f7e2e38acd24.

This is causing process hangs and timeouts during file removal.

Test-Parameters: trivial
Fixes: cbaaa7cde4 ("LU-14139 llite: simplify callback handling for async getattr")
Change-Id: I77f5bc460850bfe7a5143e22b0c5f3e14a40474a
Reviewed-on: https://review.whamcloud.com/44371
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months ago LU-14826 mdt: getattr_name("..") under striped directory 68/44168/6
Lai Siyao [Thu, 8 Jul 2021 14:25:51 +0000 (10:25 -0400)]
LU-14826 mdt: getattr_name("..") under striped directory

For getattr_name(".."), the FID of the master object should be returned
for striped directories. This includes changes on both client and server:
* lmv_getattr_name() should use the master object FID if it's looking up
  "..".
* mdt_raw_lookup() should check whether the parent object is a sub-stripe;
  if so, it needs to look up again to get the master object FID. For old
  clients without the above change this needs to be checked twice.

This is needed by NFS export, because ll_get_parent() finds the parent by
getattr_name("..").

Reenable check_fhandle_syscall and update sanityn test_102.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I72c951293e41656ce3778750147402d7f8ca4cec
Reviewed-on: https://review.whamcloud.com/44168
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
2 months ago LU-14833 sec: quiet spurious gss_init_svc_upcall() message 97/44197/2
Sebastien Buisson [Fri, 9 Jul 2021 12:52:40 +0000 (14:52 +0200)]
LU-14833 sec: quiet spurious gss_init_svc_upcall() message

Switch from CWARN to CDEBUG(D_SEC) for the message printed by
gss_init_svc_upcall():
Init channel is not opened by lsvcgssd, following request might be
dropped until lsvcgssd is active
Indeed, this message is printed whether or not GSS is enabled, and
we do not have any way to check this at the time the kernel module
is loaded.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I66c8c2a16e58ca75973226c80e0f4a92c90b4025
Reviewed-on: https://review.whamcloud.com/44197
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-14114 lnet: print device status in net show command 69/44169/2
Cyril Bordage [Wed, 7 Jul 2021 13:27:54 +0000 (15:27 +0200)]
LU-14114 lnet: print device status in net show command

A device can be in a fatal state if the cable was disconnected or the
port was brought down on the switch side. In these cases, the LND (o2iblnd
for now) will flag the device as being in a fatal state. That device will
not be used any further. However, its health will not be decremented. This
causes some confusion when examining the state of the node.
It is better to print the device status in the output of the lnetctl
net show command.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I7c635ab1062f6153449fcec1bc07585065818a72
Reviewed-on: https://review.whamcloud.com/44169
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14804 nodemap: do not return error for improper ACL 27/44127/3
Sebastien Buisson [Thu, 1 Jul 2021 15:20:39 +0000 (00:20 +0900)]
LU-14804 nodemap: do not return error for improper ACL

In nodemap_map_acl(), if the ACL is incorrect, do nothing
and just return the initial size to the caller.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I26aba9ce43e4a8878bfa47e145b1b44cfff89403
Reviewed-on: https://review.whamcloud.com/44127
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14300 quota: avoid nested lqe lookup 26/43326/5
Sergey Cheremencev [Mon, 12 Apr 2021 23:44:34 +0000 (02:44 +0300)]
LU-14300 quota: avoid nested lqe lookup

lqe_locate, when called from qmt_pool_lqes_lookup for an lqe
that has no entry on disk, calls qmt_lqe_set_default.
This may call qmt_set_id_notify->qmt_pool_lqes_spec
and overwrite lqes already added to the qti. The overwritten
lqes may trigger an assertion:

LustreError: 5072:0:(qmt_pool.c:838:qmt_pool_lqes_lookup()) ASSERTION( (((qmt_info(env)->qti_lqes_num) > 16 ? qmt_info(env)->qti_lqes : qmt_info(env)->qti_lqes_small)[(qmt_info(env)->qti_glbl_lqe_idx)])->lqe_is_global ) failed:
LustreError: 5072:0:(qmt_pool.c:838:qmt_pool_lqes_lookup()) LBUG
Pid: 5072, comm: mdt_rdpg00_003 3.10.0-957.1.3957.1.3.x4.1.15.x86_64 #1 SMP Mon Nov 18 14:47:03 PST 2019
Call Trace:
 [<ffffffffc046f62c>] libcfs_call_trace+0x8c/0xc0 [libcfs]
 [<ffffffffc046f94c>] lbug_with_loc+0x4c/0xa0 [libcfs]
 [<ffffffffc0e4ae38>] qmt_pool_lqes_lookup+0x798/0x8f0 [lquota]
 [<ffffffffc0e3b0ce>] qmt_intent_policy+0x86e/0xe00 [lquota]
 [<ffffffffc109d53d>] mdt_intent_opc+0x3bd/0xb40 [mdt]
 [<ffffffffc10a5134>] mdt_intent_policy+0x1a4/0x360 [mdt]
 [<ffffffffc0a7bedb>] ldlm_lock_enqueue+0x3cb/0xad0 [ptlrpc]
 [<ffffffffc0aa4a46>] ldlm_handle_enqueue0+0xa56/0x1610 [ptlrpc]
 [<ffffffffc0b304b2>] tgt_enqueue+0x62/0x210 [ptlrpc]
 [<ffffffffc0b3753a>] tgt_request_handle+0x7ea/0x1750 [ptlrpc]

or a deadlock (the same lqe appears twice in the qti_lqes array):

 call_rwsem_down_write_failed+0x17/0x30
 qti_lqes_write_lock+0xb1/0x1b0 [lquota]
 qmt_dqacq0+0x2ee/0x1ac0 [lquota]
 qmt_intent_policy+0xbfe/0xe00 [lquota]
 mdt_intent_opc+0x3ba/0xb50 [mdt]
 mdt_intent_policy+0x1a1/0x360 [mdt]
 ldlm_lock_enqueue+0x3d6/0xaa0 [ptlrpc]
 ldlm_handle_enqueue0+0xa76/0x1620 [ptlrpc]
 tgt_enqueue+0x62/0x210 [ptlrpc]
 tgt_request_handle+0x96a/0x1680 [ptlrpc]
 kthread+0xd1/0xe0

The patch adds sanity-quota_73b to check that the issue
doesn't exist anymore.

Change-Id: Ib1ebe82c3b6e819b2538f30af08930060bd659ae
HPE-bug-id: LUS-9902
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://es-gerrit.dev.cray.com/158581
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/43326
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14508 lfs: make mirror operations preserve timestamps 09/42009/17
John L. Hammond [Thu, 11 Mar 2021 16:02:54 +0000 (10:02 -0600)]
LU-14508 lfs: make mirror operations preserve timestamps

Save and try to restore the file timestamps around the various mirror
operations. Add sanity-flr tests 61[abc] to verify this.
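
The save-and-restore pattern looks roughly like this. It is a Python sketch using `os.stat` and `os.utime`, with a hypothetical wrapper name and a stand-in operation; it is not the lfs implementation.

```python
import os
import tempfile


def with_preserved_timestamps(path, operation):
    """Run operation(path), then try to restore the original atime/mtime."""
    st = os.stat(path)
    try:
        operation(path)
    finally:
        try:
            os.utime(path, ns=(st.st_atime_ns, st.st_mtime_ns))
        except OSError:
            pass  # best effort: a failed restore does not fail the operation


def rewrite(path):
    """Stand-in for a mirror operation that touches the file's data."""
    with open(path, "ab") as f:
        f.write(b"x")


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"data")
path = tmp.name
os.utime(path, ns=(2_000_000_000, 4_000_000_000))  # known atime/mtime
with_preserved_timestamps(path, rewrite)
restored = os.stat(path)
os.unlink(path)
print(restored.st_atime_ns, restored.st_mtime_ns)
```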

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I5ef754e46cfbe82c731a709209576bbfcc73af3d
Reviewed-on: https://review.whamcloud.com/42009
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-12214 build: fix SLES build/install 72/39972/20
Alexey Lyashkov [Fri, 15 Jan 2021 15:21:15 +0000 (10:21 -0500)]
LU-12214 build: fix SLES build/install

Red Hat and SUSE can use different library package names for the same
devel package, so let's drop the strict requirement on the library
package name and ask rpm to use its autoprovide option.

Test-Parameters: trivial
Test-Parameters: clientdistro=sles15sp1 ossdistro=el7.7 mdsdistro=el8.2
HPE-bug-id: LUS-7204
Fixes: e1bf37870d LU-12214 build: fix build with gss enabled
Fixes: d746e64fe1 LU-13562 build: SUSE build support for azure
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I7e0fe83f9090e7616ab156fa75fed4821099406e
Reviewed-on: https://review.whamcloud.com/39972
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-12022 tests: error on resync failure sanity-flr 54/35754/7
James Nunez [Tue, 15 Jun 2021 17:14:49 +0000 (11:14 -0600)]
LU-12022 tests: error on resync failure sanity-flr

In sanity-flr test 200, we should error if the final resync
fails.  Replace all calls to 'mirror_io resync' that do
not inject an error with '$LFS mirror resync'.

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I9b2ec1beb7060086808b7529467bef80c8e9659f
Reviewed-on: https://review.whamcloud.com/35754
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
2 months ago LU-6142 libcfs: checkpatch cleanup of libcfs fail.c 07/44207/4
James Simmons [Sat, 10 Jul 2021 14:54:23 +0000 (10:54 -0400)]
LU-6142 libcfs: checkpatch cleanup of libcfs fail.c

Resolve several checkpatch issues reported for fail.c. This brings
us into alignment with the native Linux client version.

Test-Parameters: trivial
Change-Id: I71e71f48a94fa20756f7696b5fbf115c919d05d3
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44207
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months ago LU-6142 lnet: convert kiblnd/ksocknal_thread_start to vararg 22/44122/3
Mr NeilBrown [Thu, 1 Jul 2021 03:19:29 +0000 (13:19 +1000)]
LU-6142 lnet: convert kiblnd/ksocknal_thread_start to vararg

Rather than requiring the caller to format a thread name into a temp
buffer, change these thread_start functions to accept a format and
args, and to hand them directly to kthread_run().

This is done with a macro rather than a function as the functions are
trivial and varargs is slightly easier with macros.
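
The shape of the change, sketched in Python rather than as a C macro. The `thread_start` helper and the name format shown are illustrative assumptions: the caller passes a format string plus arguments, and the helper builds the thread name itself.

```python
import threading


def thread_start(fn, arg, name_fmt, *name_args):
    """Start fn(arg) in a thread whose name is built from a printf-style format."""
    t = threading.Thread(target=fn, args=(arg,), name=name_fmt % name_args)
    t.start()
    return t


# The caller no longer formats the name into a temporary buffer first.
t = thread_start(lambda _: None, None, "kiblnd_sd_%02d_%02d", 0, 3)
t.join()
print(t.name)  # kiblnd_sd_00_03
```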

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I73926ef38a9e84061d1a3f9acf5c0be4a247f957
Reviewed-on: https://review.whamcloud.com/44122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 lnet: discard lnet_current_net_count 89/44089/2
Mr NeilBrown [Mon, 28 Jun 2021 06:22:02 +0000 (16:22 +1000)]
LU-6142 lnet: discard lnet_current_net_count

The variable lnet_current_net_count is never used.  So remove it.
The function lnet_get_net_count() is only used to update that
variable, so remove it too.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id61f381f6220356c5b96c8a5822d8748a8ba43a4
Reviewed-on: https://review.whamcloud.com/44089
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14217 osd-zfs: allow SEEK_HOLE/DATA only with sync 70/40970/2
Mikhail Pershin [Tue, 15 Dec 2020 11:47:20 +0000 (14:47 +0300)]
LU-14217 osd-zfs: allow SEEK_HOLE/DATA only with sync

ZFS doesn't report a valid offset for SEEK_DATA if there is dirty
data, but may report SEEK_HOLE correctly. This causes unreliable
results: the same offset can be reported as HOLE (correctly) and
also as DATA (incorrectly, because of the fallback to the generic
approach, which assumes the whole file is data with a hole beyond
end of file).

To avoid that, we have to sync dirty data when dmu_offset_next()
reports EBUSY and repeat the lseek call. Considering that this can
cause a slowdown, this behavior is controlled via the new 'sync_on_lseek'
option. With this option turned off, osd-zfs reports that it doesn't
support SEEK_DATA/HOLE, because we cannot use unreliable results
in our tools to copy sparse data.
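
The control flow described above can be sketched as follows. This is Python with injected stand-in callables; the helper name, the fake backend, and the choice of EOPNOTSUPP for the "not supported" case are assumptions, not taken from the osd-zfs code.

```python
import errno


def seek_with_sync(seek, sync, offset, sync_on_lseek=True):
    """Return seek(offset); on EBUSY, flush dirty data and retry once."""
    if not sync_on_lseek:
        # Assumed errno: report SEEK_HOLE/SEEK_DATA as unsupported.
        raise OSError(errno.EOPNOTSUPP, "SEEK_HOLE/SEEK_DATA not supported")
    try:
        return seek(offset)
    except OSError as e:
        if e.errno != errno.EBUSY:
            raise
    sync()               # flush dirty data so offsets become reliable
    return seek(offset)  # repeat the lookup


class FakeObject:
    """Stand-in backend that fails with EBUSY while data is dirty."""

    def __init__(self):
        self.dirty = True
        self.syncs = 0

    def seek(self, offset):
        if self.dirty:
            raise OSError(errno.EBUSY, "dirty data")
        return 4096

    def sync(self):
        self.dirty = False
        self.syncs += 1


obj = FakeObject()
result = seek_with_sync(obj.seek, obj.sync, 0)
print(result, obj.syncs)  # 4096 1
```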

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic92c127628ce517a9c2f79f595a1d16116930383
Reviewed-on: https://review.whamcloud.com/40970
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14805 llite: No locked parallel DIO 31/44131/3
Patrick Farrell [Fri, 2 Jul 2021 17:24:48 +0000 (13:24 -0400)]
LU-14805 llite: No locked parallel DIO

If we are doing locked DIO, the OSC & LDLM locks are
released at the end of cl_io_loop, ie, before we wait for
parallel DIO at the llite layer.

This is problematic because the locks are released before
i/o done using them is complete; this can lead to data
inconsistencies.  (And at least one LBUG, see LU-14805.)

The easiest solution for now is to only do parallel DIO when
working lockless (which is the default; DIO only switches
to locked to manage conflicts with buffered i/o).

This problem & fix apply to AIO as well as parallel DIO.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If98a0551d6dde54220b406b26e978e284a6b1ebf
Reviewed-on: https://review.whamcloud.com/44131
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13440 utils: fix handling of lsa_stripe_off = -1 30/43530/11
Andreas Dilger [Tue, 4 May 2021 01:25:23 +0000 (19:25 -0600)]
LU-13440 utils: fix handling of lsa_stripe_off = -1

Use LMV_OFFSET_DEFAULT instead of "-1" for parsing lfs_setdirstripe()
since parse_targets() will return "(__u32)-1" to the caller for the
stripe index, but lsa_stripe_off is a signed long long so it is
interpreted as 4294967295.  This causes the parsing to fail when
"lfs setdirstripe -i -1 --max-inherit-rr 1" is used.

Update sanity test_413a/413c to also specify "-i -1" to verify this.
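
The sign problem can be reproduced with fixed-width integers. The sketch below uses Python's `ctypes` to mimic the C types and assumes that LMV_OFFSET_DEFAULT is defined as `(__u32)-1`:

```python
import ctypes

LMV_OFFSET_DEFAULT = ctypes.c_uint32(-1).value  # assumed to be (__u32)-1

# The stripe index comes back as an unsigned 32-bit value ...
stripe_index = ctypes.c_uint32(-1).value
# ... and is stored in a signed 64-bit field, where it is no longer -1.
lsa_stripe_off = ctypes.c_longlong(stripe_index).value

print(lsa_stripe_off)                        # 4294967295
print(lsa_stripe_off == -1)                  # False
print(lsa_stripe_off == LMV_OFFSET_DEFAULT)  # True
```

Comparing against the named constant instead of a literal -1 keeps the check correct regardless of the width of the field holding the value.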

In sanity test 413a, 413b and 413c, create "qos" directory on most
full MDT, so that its subdirectories won't be created on the
same MDT.

Fixes: f167f78e3bfd ("LU-13440 lmv: add default LMV inherit depth")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic934f859173155b1b2df56fcd315c8da633ebbe5
Reviewed-on: https://review.whamcloud.com/43530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-14541 llite: avoid stale data reading 76/43476/5
Wang Shilong [Wed, 28 Apr 2021 14:26:10 +0000 (22:26 +0800)]
LU-14541 llite: avoid stale data reading

remove_mapping() can refuse to remove a page from the page cache when
the page refcount != 2, so clear the uptodate flag in vvp_page_delete()
to avoid reading stale data later.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I322debec951b1a342246475456c0f40e10b0e578
Reviewed-on: https://review.whamcloud.com/43476
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14828 tests: Remove extra debug 75/44175/3
Patrick Farrell [Wed, 7 Jul 2021 16:44:52 +0000 (12:44 -0400)]
LU-14828 tests: Remove extra debug

Accidentally committed 398m with extra debug.
This is sometimes causing OOMs in testing, and it's a
mistake anyway.

Fixes: cba07b68f9 ("LU-13798 llite: parallelize direct i/o issuance")
Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I734aa3a952d2c085b3fc0014af1bdc0e881000e6
Reviewed-on: https://review.whamcloud.com/44175
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-14817 build: __xa_set_mark is not checked anymore 38/44138/3
Vitaly Fertman [Sat, 3 Jul 2021 09:25:14 +0000 (12:25 +0300)]
LU-14817 build: __xa_set_mark is not checked anymore

LC__XA_SET_MARK does not check for __xa_set_mark anymore after
LU-9859, however the result variable still exists and its value
has changed from 'no' to 'yes'.

Test-Parameters: trivial
Fixes: 84e12028be ("LU-9859 libcfs: add support for Xarray")
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: I24fffe7f2727b1d892ec3cabfc6e65ae8f68e024
Reviewed-on: https://review.whamcloud.com/44138
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14808 utils: fix YAML support for DOM files 33/44133/3
Vitaly Fertman [Tue, 15 Jun 2021 14:47:25 +0000 (17:47 +0300)]
LU-14808 utils: fix YAML support for DOM files

LFS getstripe never reports LLAPI_LAYOUT_DEFAULT for any stripe
parameter, but 0 or -1, whichever is appropriate.

LU-3285 added extra verification for the DOM parameters: the stripe
count, size and offset make no sense for DOM and are expected to be
LLAPI_LAYOUT_DEFAULT. However, this breaks the YAML support, which
uses getstripe output as the wanted values.

Also move the sanity-flr test_6 to ALWAYS_EXCEPT due to LU-14818.

Fixes: 6744eb8eeb ("LU-3285 lfs: add parameter for Data-on-MDT file")
Signed-off-by: Vitaly Fertman <c17818@cray.com>
HPE-bug-id: LUS-10090
Change-Id: Ide0c0fc264c7d1bac487306edf896d90153cf768
Reviewed-on: https://es-gerrit.dev.cray.com/158810
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-on: https://review.whamcloud.com/44133
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14786 lod: create missing debugfs file 13/44113/3
James Simmons [Tue, 29 Jun 2021 17:14:39 +0000 (13:14 -0400)]
LU-14786 lod: create missing debugfs file

While cleaning up debugfs symlinks the needed, but unused lod debugfs
directory was dropped. This results in the broken symlink

/sys/kernel/debug/lustre/lov/lustre-MDT0000-mdtlov

lctl params handling didn't see this due to glob returning only valid
directory entries so the error didn't get reported by stat(). Restore
the debugfs directory and add a new test to conf-sanity to detect any
potential breakage in the future.

Change-Id: I8fe0732d6caeeb83554833205998e24214343f88
Test-Parameters: env=ONLY=10a testlist=conf-sanity
Fixes: 462d476d ("LU-8066 obd: cleanup server sysfs symlinks handling")
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44113
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14677 sec: migrate/extend/split on encrypted file 78/43878/6
Sebastien Buisson [Fri, 28 May 2021 16:11:53 +0000 (18:11 +0200)]
LU-14677 sec: migrate/extend/split on encrypted file

lfs migrate/extend/split makes use of volatile files to swap layouts.
When operation is carried out on an encrypted file, the volatile file
must be assigned the same encryption context as the original file, so
that data moved/copied to different OSTs is identical to the original
file's.
Also update sanity-sec test_52 to exercise these commands.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3878b5e9e6d3738dfee0ce0f89a3646e6a7b976f
Reviewed-on: https://review.whamcloud.com/43878
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14430 mdd: rename mti_oa to mdi_oa and friends 39/43739/5
Andreas Dilger [Thu, 13 May 2021 10:42:20 +0000 (04:42 -0600)]
LU-14430 mdd: rename mti_oa to mdi_oa and friends

Rename fields in mdd_thread_info to avoid confusion with mdt_thread_info.
The second patch of several to rename all mdd_thread_info fields
to use a more unique field prefix:

  mti_dof->mdi_dof
  mti_dt_rec->mdi_dt_rec
  mti_ent->mdi_ent
  mti_flags->mdi_flags
  mti_hint->mdi_hint
  mti_key->mdi_key
  mti_link_data->mdi_link_data
  mti_name->mdi_name
  mti_oa->mdi_oa
  mti_range->mdi_range
  mti_spec->mdi_spec

The mti_lmv and mti_lrl fields are removed since they are unused.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6fd4b7f26b7e9561d8a8585eaa5438d6093ebbe5
Reviewed-on: https://review.whamcloud.com/43739
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-14430 mdd: rename mti_big_buf to mdi_big_buf 38/43738/6
Andreas Dilger [Thu, 13 May 2021 10:27:49 +0000 (04:27 -0600)]
LU-14430 mdd: rename mti_big_buf to mdi_big_buf

Avoid serious confusion with the MDT mti_big_buf, and other fields
in mdd_thread_info, since they are two separate buffers completely.

  mti_big_buf->mdi_big_buf
  mti_chlg_buf->mdi_chlg_buf
  mti_link_buf->mdi_link_buf
  mti_xattr_buf->mdi_xattr_buf

The first patch of several to rename all mdd_thread_info fields.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib0ec91c8481e747ed058afe5c08c3f60203ebbe5
Reviewed-on: https://review.whamcloud.com/43738
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag 91/40791/8
Qian Yingjin [Mon, 30 Nov 2020 02:08:17 +0000 (10:08 +0800)]
LU-10499 pcc: introducing OBD_CONNECT2_PCCRO flag

Add a new connection flag OBD_CONNECT2_PCCRO to solve access
consistency problems with old clients that lack PCC-RO support.

By necessity, also include definitions for OBD_CONNECT2_MODE_CONVERT
and OBD_CONNECT2_BATCH_RPC so obd_connect_names[] works.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I19716e94a86e53353c1628d414c92e61e084dfc9
Reviewed-on: https://review.whamcloud.com/40791
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
3 months ago LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock)) 54/44054/2
Bobi Jam [Fri, 4 Jun 2021 03:58:29 +0000 (11:58 +0800)]
LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock))

When setting the layout in a layout lock, the lock could lose its layout
bits, and we'd try to fetch the layout lock again.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I10f96e4cb03cfe228d3c1ea1500b1a8d8e4e5e54
Reviewed-on: https://review.whamcloud.com/44054
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14459 mdt: support fixed directory layout 91/43291/7
Lai Siyao [Wed, 28 Apr 2021 21:30:00 +0000 (05:30 +0800)]
LU-14459 mdt: support fixed directory layout

Users may not want directories split automatically in some cases:
* directory migrated.
* directory restriped.

To support this, an LMV flag LMV_HASH_FLAG_FIXED is added, and it will
be set on migrated/restriped directories. NB, if directory is migrated
or restriped to a one-stripe directory, it won't be transformed into a
plain directory, because this flag needs to be kept.

Update sanity 230q.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Icd12b2aa34d391e32c3323a8b9c24449ea3e3d0e
Reviewed-on: https://review.whamcloud.com/43291
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14459 mdt: restripe parent may be a stripe 90/43290/8
Lai Siyao [Mon, 12 Apr 2021 03:30:13 +0000 (11:30 +0800)]
LU-14459 mdt: restripe parent may be a stripe

mdt_restripe() checks parent LMV sanity with lmv_is_sane(), but the parent
may be a stripe, so use lmv_is_sane2() instead.

Clear lmv_migrate_hash/offset in layout shrink/update. Although stale
values won't cause any issue, it's strange to see them set in debug logs.

Add more race checks between directory restripe, auto-split and
migration.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I38950a07a8c9a8b4b20a2fd7aff229d27dbb403c
Reviewed-on: https://review.whamcloud.com/43290
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14459 llite: reset pfid after dir migration 89/43289/7
Lai Siyao [Mon, 12 Apr 2021 03:17:37 +0000 (11:17 +0800)]
LU-14459 llite: reset pfid after dir migration

A plain directory will be turned into a stripe upon
migration/restripe, and conversely, if the target is a plain directory, the
target stripe will be turned into a directory afterwards.

In the first case, set pfid, and in the latter case, clear pfid,
otherwise ll_lock_cancel_bits() will use the wrong master inode.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I01cac0103dc79d493166e6b090508d24f9678a57
Reviewed-on: https://review.whamcloud.com/43289
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14739 quota: nodemap squashed root cannot bypass quota 88/43988/7
Sebastien Buisson [Fri, 11 Jun 2021 14:49:47 +0000 (16:49 +0200)]
LU-14739 quota: nodemap squashed root cannot bypass quota

When root on client is squashed via a nodemap's squash_uid/squash_gid,
its IOs must not bypass quota enforcement as it normally does without
squashing.
So on client side, do not set OBD_BRW_FROM_GRANT for every page being
used by root. And on server side, check if root is squashed via a
nodemap and remove OBD_BRW_NOQUOTA.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I95b31277273589e363193cba8b84870f008bb079
Reviewed-on: https://review.whamcloud.com/43988
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14733 o2iblnd: Avoid double posting invalidate 90/44190/3
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:01 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Avoid double posting invalidate

When the kib_tx is provisioned during kiblnd_fmr_pool_map(), spare
WRs in the kib_fast_reg_descriptor are set up and the mapping of
pages is given to the mr.

kiblnd_post_tx_locked() then posts the spare WRs from the
kib_fast_reg_descriptor.

	if (rc == 0)
		return 0;

The code returns and the kib_fast_reg_descriptor still contains
the spare WRs.   The next time the kib_tx is used, the
now obsolete WRs will be inadvertently posted.   For rdmavt, the
obsolete invalidate will cause an -EINVAL to be returned from
the post send.

Fix by adding a state variable frd_posted to the kib_fast_reg_descriptor.
The variable is set to false in kiblnd_fmr_pool_unmap().
kiblnd_post_tx_locked() is adjusted to avoid prepending the
kib_fast_reg_descriptor WRs when frd_posted is true.   After
the post succeeds, the frd_posted is set to true.
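
The guard described above can be sketched in plain C. This is a
simplified user-space model, not the o2iblnd code: struct frd_model,
model_post_tx() and model_unmap() are hypothetical names, WRs are
reduced to counts, and the post is assumed to always succeed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for kib_fast_reg_descriptor: only the fields
 * needed to show the guard. */
struct frd_model {
	int  spare_wrs;	/* WRs prepared at map time */
	bool posted;	/* mirrors the frd_posted idea from this message */
};

/* Returns the number of WRs that would be handed to the post-send call. */
int model_post_tx(struct frd_model *frd, int tx_wrs)
{
	int total = tx_wrs;

	assert(frd != NULL);
	/* Prepend the spare WRs only if they have not been posted yet. */
	if (!frd->posted)
		total += frd->spare_wrs;

	/* Assumption: the post succeeded, so remember the WRs are used. */
	frd->posted = true;
	return total;
}

/* Unmapping makes the descriptor reusable, so the flag is cleared. */
void model_unmap(struct frd_model *frd)
{
	assert(frd != NULL);
	frd->posted = false;
}
```

In this model a second model_post_tx() on the same descriptor posts
only the tx's own WRs until model_unmap() resets the flag.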

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: I426dd05e635392e75d1aa48808782a229e83ce5f
Reviewed-on: https://review.whamcloud.com/44190
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14733 o2iblnd: Move racy NULL assignment 89/44189/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:00 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Move racy NULL assignment

kiblnd_fmr_pool_unmap() can race map and subsequent processing
because of this flaw in unmap:

	if (frd) {
		frd->frd_valid = false;
		spin_lock(&fps->fps_lock);
		list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list);
		spin_unlock(&fps->fps_lock);
		fmr->fmr_frd = NULL;
	}

The fmr can be pulled off the list in kiblnd_fmr_pool_unmap() on
another CPU, and fmr_frd could be in a state of flux and
potentially be seen incorrectly later on as the kib_tx is processed.

Fix by moving the fmr_frd assignment to before the fmr is added to the
list.

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: Ibddf132a363ecfe9db3cc06287cec873c021d2fb
Reviewed-on: https://review.whamcloud.com/44189
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14798 lnet: RDMA infrastructure updates 09/44109/2
Amir Shehata [Thu, 6 Feb 2020 01:46:03 +0000 (17:46 -0800)]
LU-14798 lnet: RDMA infrastructure updates

Add infrastructure to force RDMA for payloads < 4K.
Add infrastructure to extract the first page in a
payload. Useful for determining the type of the payload
to be transmitted.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id7dc26c83f00dadd26feca94fc4d8233872650d3
Lustre-change: https://review.whamcloud.com/37453
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Whamcloud-bug-id: EX-773
Reviewed-on: https://review.whamcloud.com/44109
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-14779 utils: no DNS lookups for NID in get_param 56/44056/4
Andreas Dilger [Wed, 23 Jun 2021 08:20:24 +0000 (02:20 -0600)]
LU-14779 utils: no DNS lookups for NID in get_param

Calling libcfs_str2nid() speculatively in "lctl get_param" to see
if there is a NID in the parameter name results in multiple DNS
lookups for invalid hostnames (e.g. "exports.192.168.0.10"). That
may take a very long time if there are a large number of connected
clients, and if the DNS server is overloaded or is having problems.

Instead of doing these speculative NID conversions, skip the whole
NID string in the parameter name for the two known parameters that
may contain a NID ("*.exports.<NID>.*" and "*.MGC<NID>.*").  This
is considerably faster since it is only working on a local string.

If new parameters are added that contain a NID (unlikely, but
possible), then "clean_path()" would need to be updated as part
of that change.

Fixes: 85cbe1a3ee69 ("LU-5030 util: migrate lctl params functions to use cfs_get_paths()")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I51f865e4ce3a7bc4879f9d688c4b3a68d731810f
Reviewed-on: https://review.whamcloud.com/44056
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14778 readahead: fix to reserve min pages 50/44050/3
Wang Shilong [Tue, 22 Jun 2021 01:26:40 +0000 (09:26 +0800)]
LU-14778 readahead: fix to reserve min pages

@pages_min might be larger than @pages, which indicates that
more pages should be read, and it will cause a warning
later.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ifd82f709c3877172f08b87ab0551da735a0613e0
Reviewed-on: https://review.whamcloud.com/44050
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14776 ldiskfs: Add Ubuntu 20.04 HWE support 39/44039/4
James Simmons [Mon, 28 Jun 2021 16:15:43 +0000 (10:15 -0600)]
LU-14776 ldiskfs: Add Ubuntu 20.04 HWE support

Use the already landed ldiskfs support for Linux 5.8.0 to enable
support for the Ubuntu 20.04 HWE 5.8.0-53 kernel. Another change
that started with the 5.7 kernel is removal of the flag
EXT4_GET_BLOCKS_KEEP_SIZE. The code was no longer needed with the
removal of EXT4_EOFBLOCKS_FL which happened in 2012. e2fsprogs
support for this flag has been removed since version 1.42.2.

Change-Id: I60db446bab50178a601e1c2c20e782435f9f50f2
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44039
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14767 utils: mkfs.lustre allow lazy_itable_init=1 19/44019/3
Andreas Dilger [Wed, 16 Jun 2021 20:48:33 +0000 (14:48 -0600)]
LU-14767 utils: mkfs.lustre allow lazy_itable_init=1

When "lazy_itable_init=0" was added to the mke2fs options the call
to append_unique() to see whether "lazy_itable_init" was already
listed in the mke2fs options was incorrect. It checks to see if
"lazy_itable_init=0" is already present in the options, and doesn't
match "lazy_itable_init=1" if it was specified on the command-line.

Separate the key and value passed to append_unique() so that it can
check if any form of the key is present in the existing options.
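
The key/value separation can be sketched as below. This is a minimal
illustration under stated assumptions, not the mkfs.lustre code:
opt_key_present() and opt_append_unique() are hypothetical helpers, and
the options are assumed to be a comma-separated string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if "key" or "key=<anything>" appears as a whole
 * comma-separated item in opts. */
bool opt_key_present(const char *opts, const char *key)
{
	size_t klen = strlen(key);
	const char *p = opts;

	while (p != NULL && *p != '\0') {
		if (strncmp(p, key, klen) == 0 &&
		    (p[klen] == '=' || p[klen] == ',' || p[klen] == '\0'))
			return true;
		p = strchr(p, ',');
		if (p != NULL)
			p++;	/* step past the comma to the next item */
	}
	return false;
}

/* Append "key=val" to buf unless any form of key is already there.
 * Returns 0 if appended or already present, -1 if buf is too small. */
int opt_append_unique(char *buf, size_t size, const char *key,
		      const char *val)
{
	size_t len;
	int n;

	assert(buf != NULL && key != NULL && val != NULL);
	len = strlen(buf);
	if (opt_key_present(buf, key))
		return 0;
	n = snprintf(buf + len, size - len, "%s%s=%s",
		     len > 0 ? "," : "", key, val);
	if (n < 0 || (size_t)n >= size - len) {
		buf[len] = '\0';	/* undo a truncated append */
		return -1;
	}
	return 0;
}
```

Matching on the key alone means a user-supplied "lazy_itable_init=1"
is left untouched instead of gaining a conflicting "=0" entry.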

Test-Parameters: trivial testlist=conf-sanity
Fixes: 701cc249594e ("LU-13533 utils: ext4lazyinit should be disabled")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic7a6dbb81f004dd35f0f1c5f5ddec0fb363ebbe5
Reviewed-on: https://review.whamcloud.com/44019
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-11872 quota: add get/set project support for non-dir/file 06/44006/7
Wang Shilong [Tue, 22 Jun 2021 12:09:29 +0000 (20:09 +0800)]
LU-11872 quota: add get/set project support for non-dir/file

Add ability to get/set non-dir/file's project ID and state.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ib8eee09254f9751797b5deb7f753c34eb2c0d5a5
Reviewed-on: https://review.whamcloud.com/44006
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14750 lnet: use ni fatal error when calculating net health 62/43962/2
Serguei Smirnov [Wed, 9 Jun 2021 21:22:12 +0000 (14:22 -0700)]
LU-14750 lnet: use ni fatal error when calculating net health

When ni is flagged with "fatal_error" by LND, its health score
remains unaffected. This allows for the net containing such ni
to be selected for tx even if it is the only ni in this net.
Take "fatal_error" status of the ni into account when calculating
the net health score.
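
A rough model of the intended behaviour, assuming the net health score
is the best health value among the net's NIs (that assumption, struct
ni_model and net_health_model() are illustrative and not taken from
the LNet sources):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-NI state: a health value and the LND's fatal flag. */
struct ni_model {
	int  healthv;
	bool fatal_error;
};

/* Best health among NIs that are not flagged with a fatal error.
 * Returns 0 when every NI in the net is flagged (or the net is empty). */
int net_health_model(const struct ni_model *nis, size_t n)
{
	int best = 0;
	size_t i;

	assert(nis != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (nis[i].fatal_error)
			continue;	/* flagged NIs do not contribute */
		if (nis[i].healthv > best)
			best = nis[i].healthv;
	}
	return best;
}
```

With this model, a net whose only NI is flagged scores 0 and is no
longer preferred for tx over a net with a healthy NI.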

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ib76245f835f1458873f0c05ad9b6727d295857de
Reviewed-on: https://review.whamcloud.com/43962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-11698 libcfs: Add checksum speed under /sys/fs 43/43943/11
Arshad Hussain [Tue, 8 Jun 2021 09:32:01 +0000 (05:32 -0400)]
LU-11698 libcfs: Add checksum speed under /sys/fs

This patch adds the total of registered checksums and all
registered checksum names, along with their speeds, under
/sys/kernel/debug/lustre/checksum_speed

TestCase sanity/77m added.

Sample output:
$ lctl get_param checksum_speed
checksum_speed=adler32: 1955
crc32: 2423
crc32c: 14035

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: If125032e35bfd9221eb66e6f77bf7e3753ffcc0f
Reviewed-on: https://review.whamcloud.com/43943
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14093 lnet: annotate LNET_WIRE_HANDLE_COOKIE_NONE as u64 13/43713/6
Dominique Martinet [Sat, 15 May 2021 22:32:56 +0000 (07:32 +0900)]
LU-14093 lnet: annotate LNET_WIRE_HANDLE_COOKIE_NONE as u64

Fix the following warning on newer gcc with -Wextra when including
lustre_idl.h in an external project:

.../include/linux/lnet/lnet-types.h: In function LNetMDHandleIsInvalid:
.../include/linux/lnet/lnet-types.h:355:46:
   error: comparison of integer expressions of different signedness:
   int and __u64 {aka long long unsigned int} [-Werror=sign-compare]
        return (LNET_WIRE_HANDLE_COOKIE_NONE == h.cookie);
                                             ^~
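
The kind of change involved can be illustrated with a small
stand-alone example. The macro names and the (-1) / (~0ULL) spellings
below are assumptions made for illustration, not copied from
lnet-types.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "before" form: a plain int constant.  Comparing it with
 * a 64-bit unsigned cookie mixes signedness, which -Wsign-compare flags. */
#define COOKIE_NONE_INT	(-1)

/* Hypothetical "after" form: the same bit pattern typed as unsigned
 * 64-bit, so both sides of the comparison agree in signedness. */
#define COOKIE_NONE_U64	((uint64_t)~0ULL)

/* Comparison using the unsigned constant; no sign-compare warning. */
bool cookie_is_none(uint64_t cookie)
{
	return cookie == COOKIE_NONE_U64;
}

/* Both spellings denote the same value once converted to uint64_t,
 * so annotating the constant does not change behaviour. */
void check_cookie_none_forms(void)
{
	assert((uint64_t)COOKIE_NONE_INT == COOKIE_NONE_U64);
}
```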

Change-Id: I05f21dcca5fe9dd15d1e0b6cb9a29c3999bcd807
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-on: https://review.whamcloud.com/43713
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-14648 lod: protect lod_object layout info 71/43671/3
Bobi Jam [Wed, 12 May 2021 08:18:00 +0000 (16:18 +0800)]
LU-14648 lod: protect lod_object layout info

Need to protect lod_object's layout access with ldo_layout_mutex.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I2c4a2078bdce64d15485d3ff18f6670d42ca90ba
Reviewed-on: https://review.whamcloud.com/43671
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14762 lmv: compare space to mkdir on parent MDT 97/43997/5
Lai Siyao [Mon, 14 Jun 2021 07:26:47 +0000 (15:26 +0800)]
LU-14762 lmv: compare space to mkdir on parent MDT

In QOS subdirectory creation, subdirectories are kept on the parent MDT
if it is less full than average. However, the check uses weight rather
than free space, and since "weight = free space - penalty", the result
is not accurate if MDTs have different penalties, so this may not
work.

Check free space instead, and loosen the criterion to allow the
free space within the range of QOS threshold.

Fixes: 3f6fc483013d ("LU-13439 lmv: qos stay on current MDT if less full")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id34cf8f3f58fee9d329f0d05c2f7a6463b67dfe1
Reviewed-on: https://review.whamcloud.com/43997
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-14654 tests: Ensure recovery_limit zero works as expected 02/43502/3
Chris Horn [Thu, 29 Apr 2021 18:09:07 +0000 (13:09 -0500)]
LU-14654 tests: Ensure recovery_limit zero works as expected

When lnet_recovery_limit is set to zero (the default) peer NIs are
eligible for recovery pings indefinitely. Verify this functionality
by modifying sanity-lnet test_211 to use recovery_limit 0 to make
a peer NI re-eligible for recovery.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-9953
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I00cb0940133e15ec73491e875d08b6db2bff3fe5
Reviewed-on: https://review.whamcloud.com/43502
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14654 lnet: Correct peer NI recovery age out calculation 01/43501/3
Chris Horn [Thu, 29 Apr 2021 18:14:34 +0000 (13:14 -0500)]
LU-14654 lnet: Correct peer NI recovery age out calculation

The calculation to age a peer NI out of recovery is only valid if
lnet_recovery_limit is non-zero. When set to zero, we allow peer NIs
to be in recovery indefinitely.
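
A minimal sketch of the corrected check, assuming the age is measured
in seconds since the peer NI was last alive (peer_ni_aged_out() and its
parameters are hypothetical, not the LNet function):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a peer NI has been in recovery for too long.
 * limit_sec == 0 means "no limit": the peer NI is never aged out. */
bool peer_ni_aged_out(int64_t now_sec, int64_t last_alive_sec,
		      uint32_t limit_sec)
{
	assert(now_sec >= last_alive_sec);
	if (limit_sec == 0)
		return false;	/* the age calculation does not apply */
	return now_sec - last_alive_sec > (int64_t)limit_sec;
}
```

Without the zero check, a limit of 0 would age out every peer NI as
soon as any time had passed, which is the opposite of "indefinitely".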

Test-Parameters: trivial
HPE-bug-id: LUS-9953
Fixes: cc27201a76 ("LU-13569 lnet: Age peer NI out of recovery")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6bb40ca3a9affa0eaaae9deb1cecdb03e4bb42c5
Reviewed-on: https://review.whamcloud.com/43501
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13055 mdd: per-user changelog names and mask 80/43380/9
Mikhail Pershin [Tue, 22 Jun 2021 18:16:26 +0000 (21:16 +0300)]
LU-13055 mdd: per-user changelog names and mask

Allow specifying a name for newly-registered changelog users,
rather than the default "clNNN" that is otherwise used. This
allows services to register a "well-known" changelog user,
rather than having to store the changelog username in HA storage
outside of the filesystem.

Each changelog user still has a unique ID appended to it, to allow
the changelog_clear and changelog_deregister commands to be run
using only the ID if necessary/desired. User name can be used to
deregister. User name is also unique per server.

If no name is given, then default "cl" format is used.

With this new functionality, it is possible to specify the name like:
 # lctl --device testfs-MDT0000 changelog_register --user watcher
   testfs-MDT0000: Registered changelog userid 'cl13-watcher'
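
The naming scheme shown in the sample output can be modelled as
follows. changelog_user_label() is a hypothetical helper written for
illustration; only the "cl<ID>" and "cl<ID>-<name>" shapes come from
this message:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "cl<id>-<name>" when a name is given, else the default "cl<id>".
 * Returns 0 on success, -1 if the label does not fit in buf. */
int changelog_user_label(char *buf, size_t size, unsigned int id,
			 const char *name)
{
	int n;

	assert(buf != NULL && size > 0);
	if (name != NULL && strlen(name) > 0)
		n = snprintf(buf, size, "cl%u-%s", id, name);
	else
		n = snprintf(buf, size, "cl%u", id);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Keeping the numeric ID as a prefix is what lets commands address the
user by ID alone even when a name was supplied.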

Per-user mask is also added to allow specific operation logging on
per-user basis. Mask can be set only during registration. Resulting
mask from per-server mask and all user masks is used for current
changelog operations.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I56028f54cc97bbc9af03fd6559c19ef854f759d8
Reviewed-on: https://review.whamcloud.com/43380
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-4684 tests: enable racer directory migration 59/41359/4
Andreas Dilger [Thu, 28 Jan 2021 20:44:27 +0000 (13:44 -0700)]
LU-4684 tests: enable racer directory migration

Enable the dir_migrate test by default in racer test runs.

Update test selection logic to match newer script code style.

Test-Parameters: trivial testlist=racer env=DURATION=3600
Test-Parameters: fstype=zfs testlist=racer env=DURATION=600
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ifba84c64b30d90b4a159232751b68c48c88dafcc
Reviewed-on: https://review.whamcloud.com/41359
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14139 llite: simplify callback handling for async getattr 12/40712/11
Qian Yingjin [Thu, 19 Nov 2020 15:15:37 +0000 (23:15 +0800)]
LU-14139 llite: simplify callback handling for async getattr

This patch prepares the inode and sets the lock data directly in
the interpret callback of the intent async getattr RPC request (in
ptlrpcd context), which simplifies the old implementation that deferred
this work to the statahead thread.

According to the benchmark result, the workload "ls -l" to a large
directory on a client without any caching (server and client),
containing 1M files (47001 bytes) shows the results with measured
elapsed time:
- w/o patch: 180 seconds;
- w patch: 181 seconds;

There is no obvious performance regression.

Change-Id: Ifcfad3eb26d831bec3beea0c3d7045f31d35fa6a
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/40712
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 obdclass: resolve lu_ref checkpatch issues 88/44088/2
James Simmons [Sat, 26 Jun 2021 18:05:15 +0000 (14:05 -0400)]
LU-6142 obdclass: resolve lu_ref checkpatch issues

Fix up all the checkpatch issues reported for the code handling
lu_ref. Also change USE_LU_REF to CONFIG_LUSTRE_DEBUG_LU_REF
which will match what will be upstream.

Change-Id: I100e2679fc04c97eb67e4d44c4f6a6b530da6fa8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44088
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14734 osd-ldiskfs: enable large_dir automatically 31/43931/7
Andreas Dilger [Sat, 5 Jun 2021 08:34:15 +0000 (02:34 -0600)]
LU-14734 osd-ldiskfs: enable large_dir automatically

Enable the large_dir feature automatically at mount time for
filesystems that do not have it enabled already.  Otherwise,
the REMOTE_PARENT_DIR may overflow if there are many remote
entries created, or for object directories on very large OSTs.
It isn't really needed on a dedicated MGS filesystem.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1c4ead26b09d60567ad12945d7b366b53475cebb
Reviewed-on: https://review.whamcloud.com/43931
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14516 mgc: configurable wait-to-reprocess time 20/42020/19
Alex Zhuravlev [Fri, 12 Mar 2021 09:00:37 +0000 (12:00 +0300)]
LU-14516 mgc: configurable wait-to-reprocess time

Make the wait-to-reprocess time configurable so it can be set shorter,
for testing purposes at least. To change the minimal wait time, the MGC
module option 'mgc_requeue_timeout_min' should be used (in seconds).
Additionally, a random value up to mgc_requeue_timeout_min is added to
avoid a flood of config re-read requests from clients. If
mgc_requeue_timeout_min is set to 0, then the random part will be up to
1 second.
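
The wait computation described above can be sketched like this. The
function name, the millisecond granularity and the caller-supplied
random value are assumptions for illustration, not the MGC code:

```c
#include <assert.h>
#include <stdint.h>

/* Wait time in milliseconds: the configured minimum plus a random part.
 * The random part ranges up to the minimum itself, or up to 1 second
 * when the minimum is 0.  rnd is any random value from the caller. */
uint64_t requeue_wait_ms(uint32_t min_sec, uint32_t rnd)
{
	uint64_t base_ms = (uint64_t)min_sec * 1000;
	uint64_t range_ms = min_sec != 0 ? base_ms : 1000;

	return base_ms + rnd % (range_ms + 1);
}

/* Usage example: bounds for a 5 second minimum and for a minimum of 0. */
void requeue_wait_example(void)
{
	uint64_t w = requeue_wait_ms(5, 12345);

	assert(w >= 5000 && w <= 10000);
	assert(requeue_wait_ms(0, 12345) <= 1000);
}
```

Passing the random value in keeps the sketch deterministic; a real
caller would draw it from whatever random source is available.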

ost-pools: before: 5840s, after: 3474s
sanity-flr: before: 1575s, after: 1381s
sanity-quota: before: 10679s, after: 9703s

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iff7dad4ba14d687b7e891a1c346397e4c370800d
Reviewed-on: https://review.whamcloud.com/42020
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14074 scripts: automatic LNet unconfigure 98/41698/3
Cyril Bordage [Fri, 19 Feb 2021 17:12:45 +0000 (18:12 +0100)]
LU-14074 scripts: automatic LNet unconfigure

After using the lnetctl utility a reference count is taken on the LNet
modules. lnetctl lnet unconfigure is called in order for
lustre_rmmod to remove the LNet module.

Signed-off-by: Cyril Bordage <cbordage@whamcloud.com>
Change-Id: I7251a0c62c45da7b3cb0fddea97394b32cb6902a
Reviewed-on: https://review.whamcloud.com/41698
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13799 osc: Simplify clipping for transient pages 40/39440/12
Patrick Farrell [Fri, 7 May 2021 15:38:07 +0000 (11:38 -0400)]
LU-13799 osc: Simplify clipping for transient pages

The combination of page clip and page flag setting for
transient pages takes up several % of the time when
submitting them for async DIO.

But neither is required - Transient pages do not change
after creation except in limited cases, and in any case,
they are only accessible from the submitting thread -
there is no possibility of parallel access.

So we can set the page flags, etc, at init time.

This patch improves i/o time in ms/GiB by:
Write: 17 ms/GiB
Read: 11 ms/GiB

Totals:
Write: 204 ms/GiB
Read: 198 ms/GiB

mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect

With previous patches in series:
write     4647 MiB/s
read      4888 MiB/s

Plus this patch:
write     5030 MiB/s
read      5174 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I974ebb0f55734a8628f1f7e1c01092eb2ce5f83b
Reviewed-on: https://review.whamcloud.com/39440
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13799 clio: Implement real list splice 39/39439/11
Patrick Farrell [Fri, 7 May 2021 15:37:40 +0000 (11:37 -0400)]
LU-13799 clio: Implement real list splice

Lustre's list_splice is actually just a slightly
depressing list_for_each; let's use a real list_splice.

This saves significant time in AIO/DIO page submission,
getting a several % performance boost.

This patch reduces i/o time in ms/GiB by:
Write: 16 ms/GiB
Read: 14 ms/GiB

Totals:
Write: 220 ms/GiB
Read: 209 ms/GiB

mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect

With previous patches in series:
write     4326 MiB/s
read      4587 MiB/s

With this patch:
write     4647 MiB/s
read      4888 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Icfd4a3d9dd6f162b011b402a1c88d7dae53eff40
Reviewed-on: https://review.whamcloud.com/39439
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-13799 osc: Don't get time for each page 37/39437/13
Patrick Farrell [Fri, 7 May 2021 15:35:28 +0000 (11:35 -0400)]
LU-13799 osc: Don't get time for each page

Getting the time when each batch of pages starts is
sufficiently accurate, and ktime_get() is several % of the
CPU time when doing AIO + DIO.

This relies on previous patches in this series.

Measuring this in milliseconds/gigabyte lets us measure the
improvement in absolute terms, rather than just relative
terms.
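
For reference, the conversion from IOR bandwidth to ms/GiB is simple
arithmetic; ms_per_gib() below is an illustrative helper, not part of
the patch:

```c
#include <assert.h>

/* Time to move one GiB, in milliseconds, at a bandwidth given in MiB/s:
 * 1 GiB = 1024 MiB, so seconds = 1024 / bandwidth. */
double ms_per_gib(double mib_per_sec)
{
	assert(mib_per_sec > 0.0);
	return 1024.0 / mib_per_sec * 1000.0;
}
```

Applied to the write numbers in this message, 4030 MiB/s is about
254 ms/GiB and 4326 MiB/s is about 237 ms/GiB, a difference of roughly
17 ms/GiB.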

This patch reduces i/o time in ms/GiB by:
Write: 17 ms/GiB
Read: 6 ms/GiB

Totals:
Write: 237 ms/GiB
Read: 223 ms/GiB

IOR:
mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
Without the patch:
write     4030 MiB/s
read      4468  MiB/s

With patch:
write     4326 MiB/s
read      4587 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I02897bf810683bc77a7d09156cdb83ba1d25ebf1
Reviewed-on: https://review.whamcloud.com/39437
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-13798 llite: parallelize direct i/o issuance 36/39436/30
Patrick Farrell [Fri, 28 May 2021 23:53:55 +0000 (19:53 -0400)]
LU-13798 llite: parallelize direct i/o issuance

Currently, the direct i/o code issues an i/o to a given
stripe, and then waits for that i/o to complete.  (This is
for i/os from a single process.)  This forces DIO to send
only one RPC at a time, serially.

In the case of multi-stripe files and larger i/os from
userspace, this means that i/o is serialized - so single
thread/single process direct i/o doesn't see any benefit
from the combination of extra stripes & larger i/os.

Using part of the AIO support, it is possible to move this
waiting up a level, so it happens after all the i/o is
issued.  (See LU-4198 for AIO support.)

This means we can issue many RPCs and then wait,
dramatically improving performance vs waiting for each RPC
serially.

This is referred to as 'parallel dio'.
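
As a toy model of why this helps, compare the elapsed time of the two
issuance strategies for a set of per-RPC latencies. The model is a
deliberate simplification (it ignores submission cost and bandwidth
limits), and both function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Serial issuance: each RPC is sent only after the previous one
 * completed, so the latencies add up. */
long serial_elapsed_us(const long *rpc_us, size_t n)
{
	long total = 0;
	size_t i;

	assert(rpc_us != NULL || n == 0);
	for (i = 0; i < n; i++)
		total += rpc_us[i];
	return total;
}

/* Parallel issuance: all RPCs are in flight together and there is one
 * wait at the end, so the slowest RPC bounds the elapsed time. */
long parallel_elapsed_us(const long *rpc_us, size_t n)
{
	long slowest = 0;
	size_t i;

	assert(rpc_us != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (rpc_us[i] > slowest)
			slowest = rpc_us[i];
	return slowest;
}
```

In this model the gap between the two grows with the number of RPCs
per i/o.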

Notes:
AIO is not supported on pipes, so we fall back to the old
sync behavior if the source or destination is a pipe.

Error handling is similar to buffered writes: We do not
wait for individual chunks, so we can get an error on an RPC
in the middle of an i/o.  The solution is to return an
error in this case, because we cannot know how many bytes
were written contiguously.  This is similar to buffered i/o
combined with fsync().

The performance improvement from this is dramatic, and
greater at larger sizes.

lfs setstripe -c 8 -S 4M .
mpirun -np 1  $IOR -w -r -t 64M -b 64G -o ./iorfile --posix.odirect
Without the patch:
write     764.85 MiB/s
read      682.87 MiB/s

With patch:
write     4030 MiB/s
read      4468  MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I7e8df7d16b131b55a235f57c3280509559f94476
Reviewed-on: https://review.whamcloud.com/39436
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9680 utils: add netlink infrastructure 30/34230/36
James Simmons [Wed, 16 Jun 2021 19:28:13 +0000 (15:28 -0400)]
LU-9680 utils: add netlink infrastructure

Netlink was designed as a successor to ioctl as defined under
RFC 3549. There are several advantages to using netlink over
ioctls or virtual file system interfaces like proc. Collecting
proc doesn't scale well which was seen with power drain on Android
phones. A netlink implementation was developed to remove this
performance hit. Details can be read at:

https://lwn.net/Articles/406975

Besides the scaling gains, the other benefit is the flexibility
with API changes. Adding or removing information to be transmitted
doesn't require creating a new interface as ioctls do. Instead
you add new code to handle the stream of attributes read from the
socket. Lastly you can multiplex data to N listeners with groups
using one request.

This patch adds netlink handling in a generic way that can be
used by the libyaml library. This greatly lowers the barrier by
only requiring the implementor to understand the libyaml API.

Change-Id: Idcdac653a1f9cc9931238e869c3beadaefcf3410
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/34230
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13716 tests: skip sanity 205b for older servers 93/43993/2
James Nunez [Sat, 12 Jun 2021 00:05:08 +0000 (18:05 -0600)]
LU-13716 tests: skip sanity 205b for older servers

Lustre job stats and sanity test 205b were modified in Lustre
version 2.13.54.91.  When we run version interop testing with
servers less than this version and clients that are greater,
the test will fail.

Skip sanity test 205b for Lustre servers with version less than
2.13.54.91 and client greater than that version.
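
A minimal sketch of the skip decision, assuming dotted version strings are compared field by field. The helper names are hypothetical and are not the test-framework functions; treating a client at exactly 2.13.54.91 as "new" is also an assumption.

```python
def parse_version(text):
    """Turn a dotted version such as '2.13.54.91' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def should_skip(server_version, client_version, changed_in="2.13.54.91"):
    """Skip when the server predates the change and the client does not."""
    threshold = parse_version(changed_in)
    server_is_old = parse_version(server_version) < threshold
    client_is_new = parse_version(client_version) >= threshold
    return server_is_old and client_is_new
```

With the Test-Parameters below, a 2.12.6 server paired with a current client would be skipped, while a matching old client and server pair would still run the old form of the test.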

Test-Parameters: trivial
Test-Parameters: serverdistro=el7.9 serverversion=2.12.6 env=ONLY=205 testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Icc5d6a6adcf03e5bd16b678596f28590fe31516e
Reviewed-on: https://review.whamcloud.com/43993
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
3 months agoLU-14533 tests: skip sanity-pfl 0d for older servers 71/43971/3
James Nunez [Thu, 10 Jun 2021 21:05:18 +0000 (15:05 -0600)]
LU-14533 tests: skip sanity-pfl 0d for older servers

sanity-pfl test 0d was added to Lustre version 2.14.50.115.
When we run version interop testing with servers with
version less than this, the test will fail.

We should skip sanity-pfl test 0d if the Lustre server
version is less than 2.14.50.115.

Fixes: 83e38bba62 ("LU-14180 utils: verify setstripe comp_end is valid")

Test-Parameters: trivial
Test-Parameters: serverversion=2.14.0 serverdistro=el8.3 env=ONLY=0d testlist=sanity-pfl
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I49b45c7a1e4804fece33d53a4fb946b49254de2b
Reviewed-on: https://review.whamcloud.com/43971
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
3 months agoLU-14322 tests: skip sanityn 51e for old servers 69/43969/3
James Nunez [Thu, 10 Jun 2021 18:54:51 +0000 (12:54 -0600)]
LU-14322 tests: skip sanityn 51e for old servers

sanityn test 51e was added to Lustre version 2.13.54.148.
When we run version interop testing with servers less than
this version, the test will fail.

We should skip sanityn test 51e if the server version is
less than 2.13.54.148.

Fixes: 3ea729fe82 ("LU-13693 lfs: check early for MDS_OPEN_DIRECTORY")

Test-Parameters: trivial
Test-Parameters: serverversion=2.12.6 serverdistro=el7.9 env=ONLY=51e testlist=sanityn
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id2f165b275c97c3a1396a0da18a3f254dbe5efa7
Reviewed-on: https://review.whamcloud.com/43969
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
3 months agoLU-14755 tests: create custom pools 66/43966/2
Elena Gryaznova [Thu, 10 Jun 2021 09:51:52 +0000 (12:51 +0300)]
LU-14755 tests: create custom pools

We are interested in running some tests on a filesystem with
pools. The proposed enhancement allows creating $FS_NPOOLS
pools, each containing $FS_POOL_NOSTS OSTs. If $FS_NPOOLS is
not set, the number of pools created is
$OSTCOUNT / $FS_POOL_NOSTS. Pool names are based on $FS_POOL.
Pools are not created if FS_POOL is not set.
Example 1:
  FS_POOL=global OSTCOUNT=2
lustre.global0
OST lustre-OST0000_UUID
OST lustre-OST0001_UUID
Example 2:
  FS_POOL=global OSTCOUNT=6 FS_POOL_NOSTS=3
lustre.global0
OST lustre-OST0000_UUID
OST lustre-OST0001_UUID
OST lustre-OST0002_UUID
lustre.global1
OST lustre-OST0003_UUID
OST lustre-OST0004_UUID
OST lustre-OST0005_UUID
Example 3:
  FS_POOL=p OSTCOUNT=5 KEEP_POOLS=true FS_NPOOLS=7 FS_POOL_NOSTS=3
Pool: lustre.p0
lustre-OST0000_UUID
lustre-OST0001_UUID
lustre-OST0002_UUID
Pool: lustre.p1
lustre-OST0003_UUID
lustre-OST0004_UUID
lustre-OST0000_UUID
Pool: lustre.p2
lustre-OST0001_UUID
lustre-OST0002_UUID
lustre-OST0003_UUID
Pool: lustre.p3
lustre-OST0004_UUID
lustre-OST0000_UUID
lustre-OST0001_UUID
Pool: lustre.p4
lustre-OST0002_UUID
lustre-OST0003_UUID
lustre-OST0004_UUID
Pool: lustre.p5
lustre-OST0000_UUID
lustre-OST0001_UUID
lustre-OST0002_UUID
Pool: lustre.p6
lustre-OST0003_UUID
lustre-OST0004_UUID
lustre-OST0000_UUID

The patch adds the ability to remove all old pools at the
start if DELETE_OLD_POOLS is set to true (default is false)
and the ability to keep the new pools at the end if
KEEP_POOLS is set to true (default is false).
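
The pool layouts in the three examples are consistent with a simple round-robin assignment, sketched here. This is a model inferred from the example output, not the test-framework code: the function name, the "lustre" fsname default, and the default of FS_POOL_NOSTS to OSTCOUNT (implied by Example 1) are assumptions.

```python
def plan_pools(fs_pool, ostcount, fs_npools=None, fs_pool_nosts=None,
               fsname="lustre"):
    """Return {pool_name: [ost_uuid, ...]} for the given settings."""
    if not fs_pool:
        return {}  # no FS_POOL, no pools
    # Assumption: with FS_POOL_NOSTS unset, one pool holds every OST.
    nosts = fs_pool_nosts if fs_pool_nosts else ostcount
    npools = fs_npools if fs_npools else ostcount // nosts
    pools = {}
    for pool_idx in range(npools):
        members = []
        for slot in range(nosts):
            # Continue numbering across pools and wrap at OSTCOUNT.
            ost = (pool_idx * nosts + slot) % ostcount
            members.append(f"{fsname}-OST{ost:04x}_UUID")
        pools[f"{fsname}.{fs_pool}{pool_idx}"] = members
    return pools
```

Calling plan_pools("p", 5, fs_npools=7, fs_pool_nosts=3) reproduces the wrap-around seen in Example 3, where OSTs are reused once the index passes OSTCOUNT.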

Test-Parameters: trivial testlist=sanity-flr,ost-pools,ost-pools,sanity-pfl,sanity,sanityn
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-8172
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Change-Id: I73b72f9f39933b5b875978ce4fede5e9828c4c71
Reviewed-on: https://review.whamcloud.com/43966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14327 tests: skip sanity-sec test 55 for older servers 49/43949/5
James Nunez [Tue, 8 Jun 2021 16:34:29 +0000 (10:34 -0600)]
LU-14327 tests: skip sanity-sec test 55 for older servers

sanity-sec test 55 was added to lustre-master version
2.13.57.12 and to lustre-b2_12 version 2.12.6.3.  When
we run version interop testing with Lustre servers less
than these versions, the test will fail.  Thus, skip
sanity-sec test 55 for Lustre servers less than 2.12.6.3.

Fixes: 355787745f21 ("LU-14121 nodemap: do not force fsuid/fsgid squashing")

Test-Parameters: trivial
Test-Parameters: serverversion=2.12.6 serverdistro=el7.9 env=ONLY=55 testlist=sanity-sec
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ie002c921e853897105396185b38485799df31b7a
Reviewed-on: https://review.whamcloud.com/43949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9897 utils: allow setting llverfs subdir count 47/39347/2
Andreas Dilger [Fri, 30 Aug 2019 23:19:29 +0000 (17:19 -0600)]
LU-9897 utils: allow setting llverfs subdir count

Allow specifying the subdirectory count directly rather
than calculating it based on the filesystem size.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Idcae188ef4bdb417f0f983718bce7e55093ebbe5
Reviewed-on: https://review.whamcloud.com/39347
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12982 tests: skip conf-sanity 5i for old servers 11/36811/4
James Nunez [Wed, 20 Nov 2019 22:03:11 +0000 (15:03 -0700)]
LU-12982 tests: skip conf-sanity 5i for old servers

conf-sanity test 5i was added to lustre-master with version
2.12.54.  For all version interop testing with Lustre servers with
version less than 2.12.54 and newer clients, conf-sanity test 5i
will fail and should be skipped.

Fixes: d1b5146eda4f (LU-12206 mdt: mdt_init0 failure handling)

Test-Parameters: trivial
Test-Parameters: serverversion=2.12.6 serverdistro=el7.9 fstype=ldiskfs env=ONLY=5 testlist=conf-sanity

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ia493b6f80b42fbd92254150e8d40a6fbb1039635
Reviewed-on: https://review.whamcloud.com/36811
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14752 obdclass: handle EBUSY returned for lu_object hashtable 68/43968/4
James Simmons [Thu, 10 Jun 2021 16:53:57 +0000 (12:53 -0400)]
LU-14752 obdclass: handle EBUSY returned for lu_object hashtable

When the rhashtable grows to a certain size it will be rescaled.
During rescaling, an insertion can return an ENOMEM or EBUSY error.
This was reported as:

LustreError: 3594004:0:(lu_object.c:2472:lu_object_assign_fid()) ASSERTION( rc == 0 ) failed: failed hashtable insertion: rc = -16
LustreError: 3594004:0:(lu_object.c:2472:lu_object_assign_fid()) LBUG
Pid: 3594004, comm: mdt01_020 4.18.0-240.22.1.1toss.t4.x86_64 #1 SMP Tue Apr 13 17:18:40 PDT 2021
Call Trace TBD:
Kernel panic - not syncing: LBUG
...
Call Trace:
dump_stack+0x5c/0x80
panic+0xe7/0x2a9
lbug_with_loc.cold.10+0x18/0x18 [libcfs]
lu_object_assign_fid+0x3b8/0x3c0 [obdclass]

Add handling of the EBUSY case for our lu_object hash.
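
One way to picture the change is a caller that retries on EBUSY instead of asserting rc == 0. The sketch below is a toy model under that assumption: BusyTable and insert_with_retry are hypothetical names, and the bounded retry count is an illustration choice, not something the commit message states.

```python
import errno


class BusyTable:
    """Toy table whose insert returns -EBUSY while a resize is pending."""

    def __init__(self, busy_rounds):
        self.busy_rounds = busy_rounds
        self.items = {}

    def insert(self, key, value):
        if self.busy_rounds > 0:
            self.busy_rounds -= 1
            return -errno.EBUSY
        self.items[key] = value
        return 0


def insert_with_retry(table, key, value, max_tries=10):
    """Retry the insert while the table reports -EBUSY."""
    rc = -errno.EBUSY
    for _ in range(max_tries):
        rc = table.insert(key, value)
        if rc != -errno.EBUSY:
            break  # success or a different error: stop retrying
    return rc
```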

Fixes: aff14dbc522 ("LU-8130 lu_object: convert lu_object cache to rhashtable")
Change-Id: Id85f32633117e02850b799e8d95e3e35d982cbd4
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43968
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14741 obdclass: Wake up entire queue of requests on close completion 41/43941/3
Oleg Drokin [Mon, 7 Jun 2021 19:17:27 +0000 (15:17 -0400)]
LU-14741 obdclass: Wake up entire queue of requests on close completion

Since close requests can be stuck behind normal requests, and
close requests get more slots, we need to either wake up the
entire accumulated queue waiting for the next modrpc slot or
have an additional waitqueue just for close requests.

This patch goes with the former approach.
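
The difference between waking one waiter and waking the whole queue can be shown with a toy model. It assumes the freed slot is one that only a close request may use, and that waking "one" means waking the head of the queue; the function name and queue representation are hypothetical.

```python
def close_slot_taker(queue, wake_all):
    """Return the index of the waiter that takes a freed close-only slot.

    queue holds "normal" and "close" entries in arrival order.  Only a
    "close" waiter may use the slot (assumption of this toy model).
    Returns None when no woken waiter can use it.
    """
    woken = range(len(queue)) if wake_all else range(min(1, len(queue)))
    for index in woken:
        if queue[index] == "close":
            return index
    return None
```

With the queue ["normal", "normal", "close"], waking only the head leaves the slot unused, while waking everyone lets the close request at the back take it.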

Fixes: 1fc013f901 ("LU-5319 mdc: manage number of modify RPCs in flight")
Change-Id: Ib4333c7f6731dd435364d5e5f529577a1600a235
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43941
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
3 months agoLU-13417 test: use mkdir_on_mdt0() in replay-dual 92/43492/6
Lai Siyao [Thu, 29 Apr 2021 03:51:33 +0000 (11:51 +0800)]
LU-13417 test: use mkdir_on_mdt0() in replay-dual

Replace mkdir with mkdir_on_mdt0() in replay-dual.sh if directory
needs to be created on MDT0.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=replay-dual
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9093e633412991571e18cb0ea264af013672bd8b
Reviewed-on: https://review.whamcloud.com/43492
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 months agoLU-13417 test: use mkdir_on_mdt0() in misc tests 91/43491/4
Lai Siyao [Thu, 29 Apr 2021 03:46:21 +0000 (11:46 +0800)]
LU-13417 test: use mkdir_on_mdt0() in misc tests

Replace mkdir with mkdir_on_mdt0() if directory needs to be created
on MDT0 in following tests:
* conf-sanity
* lustre-rsync-test
* ost-pools
* replay-ost-single
* replay-single
* replay-vbr
* sanity-hsm
* sanity-pcc
* sanity-quota
* sanity-sec

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=conf-sanity,lustre-rsync-test,ost-pools,replay-ost-single,replay-single,replay-vbr,sanity-hsm,sanity-pcc,sanity-quota,sanity-sec
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I96369f25982558a1dac7f4f7fe80a95bc1c0207d
Reviewed-on: https://review.whamcloud.com/43491
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 months agoLU-13417 test: add mkdir_on_mdt0() 89/43489/7
Lai Siyao [Wed, 28 Apr 2021 14:36:24 +0000 (22:36 +0800)]
LU-13417 test: add mkdir_on_mdt0()

Once default LMV is set on ROOT, and default stripe offset is "-1",
mkdir may not create directory on MDT0, but it's a premise for many
tests. Add a function mkdir_on_mdt0() to create directory on MDT0
by "lfs mkdir -i 0".

Replace mkdir with mkdir_on_mdt0() for such tests in sanity.sh and
sanityn.sh.

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I6155d036e6b28153d0bdbdbc01088bd68ee9e0af
Reviewed-on: https://review.whamcloud.com/43489
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 months agoLU-10948 mdt: New connect flag for non-open-by-fid lock request 07/43907/4
Oleg Drokin [Thu, 3 Jun 2021 00:10:47 +0000 (20:10 -0400)]
LU-10948 mdt: New connect flag for non-open-by-fid lock request

We removed the 2.1 check for open by fid when an open lock
is requested, but when talking to old servers that don't
have that patch, clients get an open error, so introduce a
compat flag.

Change-Id: I94d50ad98a2828519853a35fa90c5063adf2feab
Fixes: 41d99c4902 ("LU-10948 llite: Introduce inode open heat counter")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43907
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 months agoLU-14742 socklnd: detect link state to set fatal error on ni 52/43952/4
Serguei Smirnov [Tue, 8 Jun 2021 21:11:41 +0000 (14:11 -0700)]
LU-14742 socklnd: detect link state to set fatal error on ni

To help avoid selecting lnet ni which corresponds to a downed
ethernet link for sending, add a mechanism for detecting link
events in socklnd. On link up/down events, find corresponding
ni and toggle ni_fatal_error_on flag, similar to o2iblnd way.

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ie9f4f02fcb8b988c77bf63f751d5a621e79e9f58
Reviewed-on: https://review.whamcloud.com/43952
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14729 osd-ldiskfs: fix to declare write commits 94/43994/4
Wang Shilong [Mon, 14 Jun 2021 01:28:51 +0000 (09:28 +0800)]
LU-14729 osd-ldiskfs: fix to declare write commits

Fallocate might introduce unwritten extents, and writing
data to them will trigger an extent split, so we should
reserve credits for this case. To avoid a complicated
calculation, we just use the normal credits calculation if
the extent is mapped as unwritten.

See comments in ext4:
If we add a single extent, then in the worse case, each tree
level index/leaf need to be changed in case of the tree split.
If more extents are inserted, they could cause the whole tree
split more than once, but this is really rare.

Lustre always reserves extents for the 1 extent case; this is wrong.
Also fix indirect blocks calculation.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I9b67ec7b002711f040f46d0c77a645bb6f57a7de
Reviewed-on: https://review.whamcloud.com/43994
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12888 tests: remove big files in sanity 11/36511/10
Alex Zhuravlev [Fri, 18 Oct 2019 04:56:52 +0000 (07:56 +0300)]
LU-12888 tests: remove big files in sanity

otherwise sanity easily fails on a local setup

Test-Parameters: trivial

Change-Id: Ia0a561e650fca05837445eebe25ff1dea15366e4
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36511
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
3 months agoLU-14093 utils: fix DLSYM buffer over flow 38/43938/3
James Simmons [Mon, 7 Jun 2021 12:33:59 +0000 (08:33 -0400)]
LU-14093 utils: fix DLSYM buffer over flow

The 'name' string passed to DLSYM macro is created from the fsname
buffer in load_backfs_module(). That buffer is greater than 512
bytes in size but the temporary buffer in DLSYM is only 64 bytes.
The newest gcc version detects this bug.

mount_utils.c: In function ‘load_backfs_module’:
mount_utils.c:530:36: error: ‘%s’ directive output may be truncated writing up to 507 bytes into a region of size 64 [-Werror=format-truncation=]
  530 |   snprintf(_fname, sizeof(_fname), "%s_%s", prefix, #func); \
      |                                    ^~~~~~~
mount_utils.c:593:2: note: in expansion of macro ‘DLSYM’
  593 |  DLSYM(name, ops, init);

Change-Id: I8ae30a5288f236fb9272dffd40f44175e5e03ef9
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43938
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14736 utils: Change leak_finder to use stdout 34/43934/3
Patrick Farrell [Sat, 5 Jun 2021 21:17:23 +0000 (17:17 -0400)]
LU-14736 utils: Change leak_finder to use stdout

It is not an error for a leak checking script to find a
leak, so don't have leak_finder.pl print to stderr.  It also
prints several pieces of basic status to stderr, for which
there is no reason at all.

This makes it easier to redirect the output for interactive
use.

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: Iab226726ca4b36ada40a305962beedc363398c37
Reviewed-on: https://review.whamcloud.com/43934
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
3 months agoLU-14731 mdd: clear orphans changelog entries 01/43901/4
John L. Hammond [Wed, 2 Jun 2021 17:05:01 +0000 (12:05 -0500)]
LU-14731 mdd: clear orphans changelog entries

In mdd_changelog_llog_init(), adjust the orphan changelog index logic
to account for the case when no users are registered. Add sanity
test_160n() to verify this.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I03b0c1002a0e16f26af8ec23bf06c9a07dec858a
Reviewed-on: https://review.whamcloud.com/43901
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14690 kernel: RHEL 8.4 server support 91/43791/8
Jian Yu [Fri, 4 Jun 2021 07:47:14 +0000 (00:47 -0700)]
LU-14690 kernel: RHEL 8.4 server support

This patch makes changes to support RHEL 8.4 release with
kernel 4.18.0-305.3.1.el8_4 for Lustre server.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Change-Id: I484af80c4764367b40b28ce459a6ff9d87edf3a8
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43791
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14653 tests: Correct include path for sanity-lnet test_300 00/43500/6
Chris Horn [Thu, 29 Apr 2021 17:45:56 +0000 (12:45 -0500)]
LU-14653 tests: Correct include path for sanity-lnet test_300

We need to supply an appropriate include path for sanity-lnet
test_300 when we're running in tree.

Test-Parameters: trivial testlist=sanity-lnet env=ONLY=300
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ia04a713ef6f1989507a77a618328d31f74d48e0d
Reviewed-on: https://review.whamcloud.com/43500
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14649 lnet: Correct distance calculation of local NIDs 98/43498/3
Chris Horn [Wed, 28 Apr 2021 16:33:40 +0000 (11:33 -0500)]
LU-14649 lnet: Correct distance calculation of local NIDs

Multi-rail peers can have multiple local NIDs on the same net, but
LNetDist() may only identify a NID as local if it is the first one
returned by lnet_get_next_ni_locked().

We need to check all local NIs to find a match for the target NID
in LNetDist().

Add test to check LNetDist() calculation of local NIDs for a peer with
multiple NIDs on the same net.
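
The behaviour change can be sketched by comparing a first-match check with a scan of every local NI. This is a simplified model with NIDs as plain strings; both function names are hypothetical and the real LNetDist() also computes distance and ordering, which is left out here.

```python
def is_local_first_only(local_nids, target_nid):
    """Model of the old behaviour: only the first local NI is compared."""
    return bool(local_nids) and local_nids[0] == target_nid


def is_local_any(local_nids, target_nid):
    """Model of the fix: every local NI is checked for a match."""
    return any(nid == target_nid for nid in local_nids)
```

For a node with two NIs on the same net, the first-only check misses the second NID, which is the multi-rail case the added test covers.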

HPE-bug-id: LUS-9964
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ic8855f7798a90972c69d89d039d0bba882d8aed1
Reviewed-on: https://review.whamcloud.com/43498
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14513 osd: release o_guard before quota acquisition 18/42018/11
Alex Zhuravlev [Fri, 12 Mar 2021 06:17:11 +0000 (09:17 +0300)]
LU-14513 osd: release o_guard before quota acquisition

to avoid deadlocks as regular transactions (like write) start
a transaction, then grab o_guard.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I2678677ed6c213e4bed30cc1218e48b8f2900dc4
Reviewed-on: https://review.whamcloud.com/42018
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-930 utils: add --help option to lfs sub-commands 59/34659/21
Andreas Dilger [Tue, 21 Jan 2020 10:29:36 +0000 (03:29 -0700)]
LU-930 utils: add --help option to lfs sub-commands

Add the "--help" and "-h" options to lfs sub-commands, and
print out an error message if an invalid argument is given.
Otherwise, it is possible to get a help message but have no
idea why the command is failing (e.g. typo in argument name).

Format the usage messages consistently, using {} to indicate a
choice between multiple required parameters, putting arguments
in [] for optional parameters, and using capitalized arguments.

Update respective man pages to list "--help|-h" option.

Remove the old SETSTRIPE and GETSTRIPE checks from spelling.txt
to avoid spurious checkpatch warnings.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ic583c8161d1d5380e353f43a8613dd86c93ebbe5
Reviewed-on: https://review.whamcloud.com/34659
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-14489 utils: fix 'lfs find --mdt-count' 66/43866/3
Andreas Dilger [Fri, 28 May 2021 21:15:10 +0000 (15:15 -0600)]
LU-14489 utils: fix 'lfs find --mdt-count'

Running "lfs find --mdt-count" causes the find to exit if there
is no directory striping, rather than continuing to the next item.

If cb_get_dirstripe() receives ENODATA then it should consider
that directory as not having any striping and move on, rather
than returning this error to the caller.

Don't crash in cb_getdirstripe() if it is called with a NULL
directory pointer or no directory is opened.
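
A rough sketch of the intended control flow, assuming a lookup callback that returns a (rc, mdt_count) pair. The function names, the callback shape, the use of -EINVAL for a missing directory, and reporting an unstriped directory as mdt_count 0 are all assumptions for illustration.

```python
import errno


def visit_dir(path, get_dirstripe):
    """Return (rc, mdt_count); ENODATA means 'no directory striping'."""
    if path is None:
        return -errno.EINVAL, 0      # guard instead of crashing
    rc, mdt_count = get_dirstripe(path)
    if rc == -errno.ENODATA:
        return 0, 0                  # unstriped: not an error, move on
    return rc, mdt_count


def find_by_mdt_count(paths, wanted, get_dirstripe):
    """Collect the paths whose MDT count equals wanted."""
    matches = []
    for path in paths:
        rc, mdt_count = visit_dir(path, get_dirstripe)
        if rc != 0:
            return rc, matches       # a real error still stops the walk
        if mdt_count == wanted:
            matches.append(path)
    return 0, matches
```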

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If8dd135a86a6a8911bf804542132b2e7a3ce7057
Reviewed-on: https://review.whamcloud.com/43866
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12577 llog: protect partial updates from readers 89/43589/8
Alex Zhuravlev [Sun, 9 May 2021 06:32:55 +0000 (09:32 +0300)]
LU-12577 llog: protect partial updates from readers

llog_osd_write_rec() adds a record in a few steps: the header is
updated first, then the record itself is appended. A per-loghandle
semaphore is used, but remote readers allocate a new separate
loghandle for every access (header reading, blocks), so the
readers can't use the loghandle's semaphore to avoid accessing
partial updates. Use object-based locking [censored] to serialize
the writer vs the readers.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie4e4d4a1e9a6fcdea9fcca7d80b0da920e786424
Reviewed-on: https://review.whamcloud.com/43589
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13124 scrub: check for multiple linked file 94/37194/31
Hongchao Zhang [Tue, 8 Jun 2021 21:50:34 +0000 (05:50 +0800)]
LU-13124 scrub: check for multiple linked file

Files on OSTs should have only one link, but a file could
have more than one link when a disk failure causes
"multiply claimed block(s)" and e2fsck fixes it by cloning
the conflicting blocks. This patch adds a check for these
multiply linked files to scrub on the OST.

The name of an object under "O" depends on the object's FID;
the directory pattern is O/[FID_SEQ]/[SUB_DIR]/[FID_OID].
The inodes of these multiply linked files are normal, but
only one directory entry is compatible with the object.
This patch scans all files under "O" to check whether each
name matches its FID.
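
The name check can be sketched as rebuilding the expected path from the FID and comparing it with each directory entry. Only the O/[FID_SEQ]/[SUB_DIR]/[FID_OID] pattern comes from the text above; the hex formatting, the "d<oid mod 32>" sub-directory rule, and the function names are assumptions made for this illustration.

```python
SUBDIR_COUNT = 32  # assumption: number of hashed sub-directories


def expected_entry(seq, oid):
    """Build the path a given FID is expected to have under "O"."""
    return f"O/{seq:x}/d{oid % SUBDIR_COUNT}/{oid:x}"


def find_mismatched_entries(entries):
    """Return the paths whose name does not match the FID they point to.

    entries is an iterable of (path, (seq, oid)) pairs: a directory
    entry and the FID recorded for the inode it links to.
    """
    return [path for path, (seq, oid) in entries
            if path != expected_entry(seq, oid)]
```

Two entries that link to the same inode carry the same FID, so at most one of them can pass this comparison; the other is reported.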

Change-Id: I280a725939b037006935d47e9ef426a4a6a7b317
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37194
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-14627 lnet: Ensure ref taken when queueing for discovery 18/43418/9
Chris Horn [Thu, 22 Apr 2021 19:51:44 +0000 (14:51 -0500)]
LU-14627 lnet: Ensure ref taken when queueing for discovery

Call lnet_peer_queue_for_discovery() in
lnet_discovery_event_handler() to ensure that we take a ref on
the peer when forcing it onto the discovery queue. This also ensures
that the peer state has LNET_PEER_DISCOVERING.

Add a test to sanity-lnet.sh that can trigger the refcount loss bug
in discovery.

HPE-bug-id: LUS-7651
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie2908668c4ffde0f993b5b7ea9aa58acd1d6fa9c
Reviewed-on: https://review.whamcloud.com/43418
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-13569 tests: Check LNet Health recovery logic 23/39723/24
Chris Horn [Mon, 24 Aug 2020 21:14:07 +0000 (16:14 -0500)]
LU-13569 tests: Check LNet Health recovery logic

Add test cases to validate LNet Health recovery of local and peer
NIs.

The new test cases are added to the except list for aarch64 due to
an unresolved issue with the LNet drop functionality on that
architecture.

A few style issues are also addressed by this patch.

An asterisk was being supplied to the lctl net_drop_del commands when
this should have been the '-a' flag.

A bug in cleanup_testsuite is addressed where we were using the
wrong filename for the tmp files created by the subtests.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-9109
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I965df2449770631caa03ced7726abb0ea76c17e6
Reviewed-on: https://review.whamcloud.com/39723
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-13569 lnet: Add health ping stats 14/40314/12
Chris Horn [Thu, 15 Oct 2020 22:33:33 +0000 (17:33 -0500)]
LU-13569 lnet: Add health ping stats

Add the NI and peer NI ping count and next ping timestamp to
detailed output of lnetctl peer and net output.

Test-Parameters: trivial
HPE-bug-id: LUS-9109
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I208cb3ea0b08a2984572cf0ec9874dbd09f6168e
Reviewed-on: https://review.whamcloud.com/40314
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-14729 osd-ldiskfs: declare dirty block groups correctly 90/43890/2
Wang Shilong [Wed, 2 Jun 2021 01:52:39 +0000 (09:52 +0800)]
LU-14729 osd-ldiskfs: declare dirty block groups correctly

The dirty block group calculation only includes estimated extents;
indirect blocks and extent node/leaf blocks are missed, which
could leave us short of credits.

Fixes: 0271b17b80a82 ("LU-14134 osd-ldiskfs: reduce credits for new writing")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Iec8525823b04e909c030f94bf75b8eca60d31c50
Reviewed-on: https://review.whamcloud.com/43890
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-14093 llapi: remove ignored qualifier 12/43712/3
Dominique Martinet [Sat, 15 May 2021 22:32:53 +0000 (07:32 +0900)]
LU-14093 llapi: remove ignored qualifier

Fixes the following warning on newer gcc with -Wextra:
.../include/lustre/lustreapi.h:1000:1: error: type qualifiers ignored on function return type [-Werror=ignored-qualifiers]
 1000 | const __u16 llapi_layout_string_flags(char *string);
      | ^~~~~

As the qualifier is ignored, this should make no code difference.

Test-parameters: trivial

Change-Id: I049166bbc586007cdecc93225d508693607ef04e
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-on: https://review.whamcloud.com/43712
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months agoLU-14093 utils: fix format-overflow warning 11/43711/3
Dominique Martinet [Sat, 15 May 2021 22:32:47 +0000 (07:32 +0900)]
LU-14093 utils: fix format-overflow warning

Fix the following warning on gcc11 by making numbuf big enough to fit
format content.

lfs.c: In function ‘print_quota’:
lfs.c:7719:48: error: ‘sprintf’ may write a terminating nul past the end of the destination [-Werror=format-overflow=]
 7719 |                         sprintf(numbuf[0], "%s*", strbuf);
      |                                                ^
lfs.c:7719:25: note: ‘sprintf’ output between 2 and 33 bytes into a destination of size 32
 7719 |                         sprintf(numbuf[0], "%s*", strbuf);
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
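
The sizing rule behind the warning can be sketched as follows. The macro
and function names are hypothetical, and the use of snprintf is an
illustration rather than what the patch does (the patch enlarges
numbuf). A 32-byte source buffer holds at most 31 characters, so "%s*"
can produce 31 + 1 characters plus the terminating NUL, 33 bytes in all:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sizes mirroring the compiler note above. */
#define STRBUF_SIZE 32                /* source: up to 31 chars + NUL */
#define NUMBUF_SIZE (STRBUF_SIZE + 1) /* room for the extra '*' */

/*
 * Write src followed by '*' into dst.
 * Returns 0 on success, -1 if the result did not fit.
 */
int mark_with_star(char *dst, size_t dst_size, const char *src)
{
	int n;

	assert(dst != NULL && src != NULL && dst_size > 0);
	/* snprintf never writes more than dst_size bytes, NUL included. */
	n = snprintf(dst, dst_size, "%s*", src);
	return (n < 0 || (size_t)n >= dst_size) ? -1 : 0;
}
```

With a destination of NUMBUF_SIZE bytes the longest possible source
string fits; with the original 32 bytes the same call reports
truncation instead of writing past the end.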

Test-parameters: trivial

Change-Id: I021e6ffff2e1405eadbe689f718674af4d4d6376
Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Reviewed-on: https://review.whamcloud.com/43711
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
4 months ago LU-13811 client: don't panic for mgs evictions 55/43655/3
Alexander Boyko [Tue, 11 May 2021 09:33:36 +0000 (05:33 -0400)]
LU-13811 client: don't panic for mgs evictions

Avoid client panics for MGS evictions.
Create a function to check if the eviction is coming
from an MGS, and if so to ignore it.

Rework dump_on_eviction and lbug_on_eviction so
all logic is handled in one place.
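
A rough sketch of the idea, not the code from this patch: every type,
constant and function name below is hypothetical. One predicate answers
"did the MGS evict us?", and a single function turns the
dump_on_eviction and lbug_on_eviction settings into the actions to take,
so the MGS exception lives in one place:

```c
#include <stdbool.h>

/* Hypothetical classification of the server that evicted the client. */
enum evictor { EVICTOR_MGS, EVICTOR_OTHER };

/* Hypothetical action bits returned to the caller. */
#define EVICT_ACT_NONE 0u
#define EVICT_ACT_DUMP 1u /* dump the debug log */
#define EVICT_ACT_LBUG 2u /* panic the client */

bool eviction_is_from_mgs(enum evictor who)
{
	return who == EVICTOR_MGS;
}

/* Decide what to do on eviction; MGS evictions are always ignored. */
unsigned int eviction_actions(enum evictor who, bool dump_on_eviction,
			      bool lbug_on_eviction)
{
	unsigned int act = EVICT_ACT_NONE;

	if (eviction_is_from_mgs(who))
		return act;
	if (dump_on_eviction)
		act |= EVICT_ACT_DUMP;
	if (lbug_on_eviction)
		act |= EVICT_ACT_LBUG;
	return act;
}
```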

Test-Parameters: trivial
HPE-bug-id: LUS-197
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Signed-off-by: Ben Evans <jevans@cray.com>
Change-Id: Iaa8b06f52fa22ac891b569bc8a2271c8e1e63a3b
Reviewed-on: https://review.whamcloud.com/43655
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-13971 quota: report Pool Quotas for a user 75/39975/12
Sergey Cheremencev [Fri, 18 Sep 2020 13:24:10 +0000 (16:24 +0300)]
LU-13971 quota: report Pool Quotas for a user

This patch adds the ability to show quota limits and usage
from all pools per user. With this patch, the long option
--pool without an argument prints Pool Quotas for all
known pools:
lfs quota -u quota_usr --pool /mnt/testfs
Pools from lustre:
Quotas for pool: qpool1
Disk quotas for usr quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre       0       0   10240       -       0       0       0       -
Quotas for pool: qpool2
Disk quotas for usr quota_usr (uid 60000):
     Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
    /mnt/lustre       0       0   20480       -       0       0       0       -

To get information for a specific pool, you still
need to give the pool name after --pool:
lfs quota -u quota_usr --pool flash /mnt/testfs

The patch also adds sanity-quota_74 to check the new
feature.

Test-Parameters: trivial testlist=sanity-quota
HPE-bug-id: LUS-8720
Change-Id: Ib918eef84c2352946ce13342471f36e2b500df32
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/39975
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 months ago LU-13419 osc: Move shrink update to per-write 14/38214/6
Patrick Farrell [Mon, 13 Apr 2020 16:23:42 +0000 (11:23 -0500)]
LU-13419 osc: Move shrink update to per-write

Updating the grant shrink interval is currently done for
each page submitted, rather than once per write.  Since
the grant shrink interval is in seconds, this is
unnecessary.

This came up because this function showed up in the perf
traces for https://review.whamcloud.com/#/c/38151/, and
it is called with the cl_loi_list_lock held.

Note that this change makes this access to the grant shrink
interval a 'dirty' access, without locking, but the grant
shrink interval is:
A) already accessed like this in various places, and
B) able to be out of date or suffer a lost update
without affecting correctness or performance.
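
The shape of the change can be sketched as below. The struct, its
fields and the function names are hypothetical stand-ins, and the
assumption that the update sets a "next shrink" deadline of now plus the
interval is made for illustration. Both variants end with the same
deadline, but the second writes it once per write instead of once per
page:

```c
#include <stddef.h>

/* Hypothetical stand-in for the client state; not the OSC structures. */
struct client_sketch {
	long next_shrink;           /* next grant shrink time, seconds */
	long shrink_interval;       /* grant shrink interval, seconds */
	unsigned long updates;      /* times next_shrink was written */
	unsigned long pages_queued; /* pages submitted so far */
};

void update_next_shrink(struct client_sketch *cli, long now)
{
	cli->next_shrink = now + cli->shrink_interval;
	cli->updates++;
}

/* Before: the deadline is refreshed for every page submitted. */
void write_per_page(struct client_sketch *cli, size_t npages, long now)
{
	for (size_t i = 0; i < npages; i++) {
		cli->pages_queued++;
		update_next_shrink(cli, now);
	}
}

/* After: the deadline is refreshed once for the whole write. */
void write_per_write(struct client_sketch *cli, size_t npages, long now)
{
	for (size_t i = 0; i < npages; i++)
		cli->pages_queued++;
	update_next_shrink(cli, now);
}
```

Because the interval is measured in seconds, a single write cannot move
the deadline meaningfully between its first and last page, so the
per-page updates only add work.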

IOR performance testing with this test:
mpirun -np 36 $IOR -o $LUSTRE -w -t 1M -b 2G -i 1 -F

No patches:
5942 MiB/s
With 38151:
14950 MiB/s
With 38151+this:
15320 MiB/s

Signed-off-by: Patrick Farrell <farr0186@gmail.com>
Change-Id: I8110b3c2570c183d58be2bccdbf76813ea3e373a
Reviewed-on: https://review.whamcloud.com/38214
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>