Whamcloud - gitweb
fs/lustre-release.git
9 months ago LU-13783 osd-ldiskfs: use alloc_file_pseudo to create fake files 76/43876/20
James Simmons [Wed, 8 Dec 2021 22:13:40 +0000 (17:13 -0500)]
LU-13783 osd-ldiskfs: use alloc_file_pseudo to create fake files

kallsyms_lookup_name() is no longer exported in 5.8+ kernels, so the
workaround used to set up the security handling is broken. As a
result, osd-ldiskfs currently crashes because security_alloc() is
never called. The solution is to use alloc_file_pseudo() instead
to create our fake file.

Change-Id: Ib417ebdda7d9829a231c568022618154c273f3e6
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43876
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14704 tests: disable opencache for sanity/29 77/43777/14
Alex Zhuravlev [Tue, 25 May 2021 04:13:55 +0000 (07:13 +0300)]
LU-14704 tests: disable opencache for sanity/29

otherwise lock counting is not quite correct

Fixes: 41d99c4902 ("LU-10948 llite: Introduce inode open heat counter")

Test-Parameters: trivial

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia73e8aa4a16b7ced29490d41c8eac4ee839a3406
Reviewed-on: https://review.whamcloud.com/43777
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-11388 test: enable replay-single test_131b 21/40421/7
Vikentsi Lapa [Tue, 27 Oct 2020 14:39:58 +0000 (14:39 +0000)]
LU-11388 test: enable replay-single test_131b

The issue is fixed, so this commit re-enables the test to verify the fix.

Test-Parameters: trivial env=ONLY=131 testlist=replay-single fstype=zfs
Signed-off-by: Vikentsi Lapa <vlapa@whamcloud.com>
Change-Id: I609146172c1fee2a955d5c41f623c8b8c2ffaeaa
Reviewed-on: https://review.whamcloud.com/40421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-15357 mdd: fix changelog context leak 31/45831/4
Mikhail Pershin [Sat, 11 Dec 2021 12:49:47 +0000 (15:49 +0300)]
LU-15357 mdd: fix changelog context leak

The mdd_changelog_clear() shouldn't skip llog_ctxt_put()
in case of error.
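
The leak pattern described above can be sketched in user-space C. This
is only an illustration under stated assumptions: sketch_ctxt,
sketch_ctxt_get() and sketch_ctxt_put() are hypothetical stand-ins for
the llog context and its get/put helpers, and the functions below are
not the actual mdd_changelog_clear() code.

```c
#include <errno.h>

/* Hypothetical stand-in for a reference-counted llog context. */
struct sketch_ctxt {
	int refcount;
};

static struct sketch_ctxt *sketch_ctxt_get(struct sketch_ctxt *c)
{
	c->refcount++;
	return c;
}

static void sketch_ctxt_put(struct sketch_ctxt *c)
{
	c->refcount--;
}

/* Leaky shape: the early return skips the put, so the reference is lost. */
static int sketch_clear_leaky(struct sketch_ctxt *c, int fail)
{
	struct sketch_ctxt *ctxt = sketch_ctxt_get(c);

	if (fail)
		return -EINVAL;
	sketch_ctxt_put(ctxt);
	return 0;
}

/* Fixed shape: every exit goes through the label that drops the reference. */
static int sketch_clear_fixed(struct sketch_ctxt *c, int fail)
{
	struct sketch_ctxt *ctxt = sketch_ctxt_get(c);
	int rc = 0;

	if (fail) {
		rc = -EINVAL;
		goto out;
	}
	/* the real work would happen here */
out:
	sketch_ctxt_put(ctxt);
	return rc;
}
```

After a failing call, the leaky variant leaves the reference count one
higher than before, while the fixed variant leaves it unchanged.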

Fixes: 6b183927e1 (LU-14553 changelog: eliminate mdd_changelog_clear warning)
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I9c9aa3ce0d11e8f67470b450d007f2a1081644c6
Reviewed-on: https://review.whamcloud.com/45831
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15252 mdt: reduce contention at mdt_lsom_update 09/45709/5
Alexander Boyko [Thu, 2 Dec 2021 09:43:54 +0000 (04:43 -0500)]
LU-15252 mdt: reduce contention at mdt_lsom_update

mot_som_mutex serializes all close requests with lsom updates for
the same mdt_object. For a massive open/read/close load on a single
shared file, this leads to a high load average because many threads
sleep on the mutex.
This patch introduces a cached lsom size and takes the mutex in the
update path only. Close requests with an lsom size less than or equal
to the cached size do not take the mutex at all.
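
The fast path described above can be sketched in user-space C as
follows. This is a minimal illustration, not the mdt code: the struct
and function names are hypothetical, a pthread mutex and a C11 atomic
stand in for the kernel primitives, and the "updates" counter exists
only to make the behaviour observable.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-object state: a lock plus the last size written. */
struct sketch_som {
	pthread_mutex_t  lock;        /* protects the slow update path */
	_Atomic uint64_t cached_size; /* largest size recorded so far */
	unsigned int     updates;     /* how often the slow path ran */
};

static void sketch_som_init(struct sketch_som *s)
{
	pthread_mutex_init(&s->lock, NULL);
	atomic_init(&s->cached_size, 0);
	s->updates = 0;
}

/* Returns true if the locked update path ran, false on the fast path. */
static bool sketch_som_close(struct sketch_som *s, uint64_t size)
{
	bool grew;

	/* Fast path: a size that does not exceed the cache needs no lock. */
	if (size <= atomic_load(&s->cached_size))
		return false;

	pthread_mutex_lock(&s->lock);
	/* Re-check under the lock, another closer may have raced ahead. */
	grew = size > atomic_load(&s->cached_size);
	if (grew) {
		atomic_store(&s->cached_size, size);
		s->updates++;
	}
	pthread_mutex_unlock(&s->lock);
	return grew;
}
```

With many closers reporting the same or a smaller size, only the first
one pays for the mutex; the rest return from the unlocked comparison.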

Test results: MPI open/flock/funlock/close SSF,
10 iterations, 10 nodes, 100 threads each, 1000 file ops per thread

       close time (secs)     MDT load average
       master   patch        master   patch
avg    0.142    0.086        47.05    38.89
max    0.164    0.129        49.39    44.77
min    0.097    0.041        44.44    34.7

Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I807b468b128295df9391b0467e74d4f10240662e
Reviewed-on: https://review.whamcloud.com/45709
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-7372 tests: re-enable replay-dual test_26 82/43982/2
Andreas Dilger [Fri, 11 Jun 2021 07:19:52 +0000 (01:19 -0600)]
LU-7372 tests: re-enable replay-dual test_26

Re-enable test_26 since it was just the unfortunate victim of
either test_24 or test_25 causing MDS unmount to hang.

Test-Parameters: trivial testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib944028e798488c425501f0c48bf812fc13ebbe5
Reviewed-on: https://review.whamcloud.com/43982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15262 osd: bio_integrity_prep_fn return value processing 46/45646/3
Alexey Lyashkov [Mon, 22 Nov 2021 13:32:23 +0000 (16:32 +0300)]
LU-15262 osd: bio_integrity_prep_fn return value processing

osd_bio_integrity_handle() in lustre/osd-ldiskfs/osd_io.c checks the
return code of bio_integrity_prep_fn(), but the kernel integrity API
changed between mainline Linux 4.12 and 4.13. In 4.13+ (as well as
in any RHEL8, including the first beta), bio_integrity_prep() returns
boolean true on success.
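
Why the old check misfires can be shown with a small user-space sketch.
The functions below are hypothetical stand-ins for the two return
conventions, not the kernel API itself, and the choice of -EIO as the
mapped error code is an assumption made for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pre-4.13 convention: 0 means success. */
static int sketch_prep_old(bool fail)
{
	return fail ? -EIO : 0;
}

/* Hypothetical stand-in for the 4.13+ convention: true means success. */
static bool sketch_prep_new(bool fail)
{
	return !fail;
}

/* Old-style caller check: any non-zero value is treated as an error. */
static bool sketch_old_check_reports_error(int rc)
{
	return rc != 0;
}

/* Map the boolean convention to "0 or negative errno" (-EIO is assumed). */
static int sketch_new_to_errno(bool ok)
{
	return ok ? 0 : -EIO;
}
```

Feeding a successful boolean result into the old-style check reports an
error, because true converts to 1. Converting the boolean explicitly
avoids that misreading.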

HPE-bug-id: LUS-10443
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: I973aa8ccae024157ad863d26afc7b1264a5c7149
Reviewed-on: https://review.whamcloud.com/45646
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago New tag 2.14.56 2.14.56 v2_14_56
Oleg Drokin [Mon, 13 Dec 2021 20:16:40 +0000 (15:16 -0500)]
New tag 2.14.56

Change-Id: I2491f69b4d4e4a7ae8ed39bef8c9806127c93d79
Signed-off-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7] 87/45687/2
Jian Yu [Tue, 30 Nov 2021 21:43:10 +0000 (13:43 -0800)]
LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.49.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I356b8a8345a4a91d6d1c1a4a9b4eab4bb5afe75b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45687
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15260 tests: numfailovers() fix 33/45633/6
Elena Gryaznova [Mon, 22 Nov 2021 15:13:07 +0000 (18:13 +0300)]
LU-15260 tests: numfailovers() fix

This patch fixes numfailovers() to handle a comma-separated
MDTS list correctly. Without this fix,
newer bash versions report the following error:
  line 69: mds1,mds2,mds3,mds4_nums: bad substitution

Fixes: a7a2133bfa ("b=18696 new RECOVERY_RANDOM_SCALE test")
Fixes: b594948509 ("TT-59 remove . and - from the node name")
Test-Parameters: trivial testlist=recovery-random-scale
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10619
Change-Id: I4c28e3c62cada60dc1241948dc4e969e0e10ce9a
Reviewed-on: https://review.whamcloud.com/45633
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15263 quota: fix bug in qmt_pool_recalc 32/45632/2
Sergey Cheremencev [Thu, 21 Oct 2021 20:28:01 +0000 (23:28 +0300)]
LU-15263 quota: fix bug in qmt_pool_recalc

env should be freed only at the end of qmt_pool_recalc(),
as it is still needed in qpi_putref(). Freeing it earlier
causes a panic if the pool's last reference is being dropped:
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a0
IP: [<ffffffffc08de2d7>] lu_context_key_get+0x17/0x30 [obdclass]
...
Call Trace:
 [<ffffffffc08de358>] lu_object_free.isra.30+0x68/0x170 [obdclass]
 [<ffffffffc08e1a35>] lu_object_put+0xc5/0x3e0 [obdclass]
 [<ffffffffc100e56c>] qmt_pool_free+0x30c/0x590 [lquota]
 [<ffffffffc10100b5>] qmt_pool_recalc+0x365/0x1260 [lquota]
 [<ffffffff8bac1c31>] kthread+0xd1/0xe0
 [<ffffffff8c176c37>] ret_from_fork_nospec_begin+0x21/0x21

HPE-bug-id: LUS-10426
Change-Id: Ic23dcb858ff811757f38948aa572c936c076e21e
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/45632
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15208 ldiskfs: add support for Ubuntu20 kernel 5.4.0.90 47/45547/4
Li Dongyang [Fri, 12 Nov 2021 12:30:43 +0000 (23:30 +1100)]
LU-15208 ldiskfs: add support for Ubuntu20 kernel 5.4.0.90

Also fix lustre-build-ldiskfs.m4 to select the correct series file.
We use -ge to check the kernel release version, so greater versions
should come first.

Change-Id: Id6b599ef5b2ea823e203aaa6a40917e49f98f4d9
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/45547
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-930 doc: update lustre.7 man page 93/45493/2
Andreas Dilger [Mon, 8 Nov 2021 21:03:24 +0000 (14:03 -0700)]
LU-930 doc: update lustre.7 man page

Update the lustre.7 man page to better describe current functionality.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I979841e597fcfa8448c708dd66d4d89d3018b1cc
Reviewed-on: https://review.whamcloud.com/45493
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Rick Mohr <mohrrf@ornl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4] 60/45460/4
Jian Yu [Tue, 30 Nov 2021 22:07:40 +0000 (14:07 -0800)]
LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45460
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15190 ptlrpc: fix duplication check 45/45445/5
Alex Zhuravlev [Wed, 3 Nov 2021 06:31:06 +0000 (09:31 +0300)]
LU-15190 ptlrpc: fix duplication check

ptlrpc_server_check_for_resend() skips duplication check if
current exp_rpc_count == 0 which is wrong as exp_rpc_count
is incremented for RPCs in progress.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4ba1600341d916871f66aceb4d6a1043dd015e55
Reviewed-on: https://review.whamcloud.com/45445
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12784 tests: fix large_xattr_enabled() for ZFS 64/45264/2
Andreas Dilger [Fri, 15 Oct 2021 17:31:57 +0000 (11:31 -0600)]
LU-12784 tests: fix large_xattr_enabled() for ZFS

Fix large_xattr_enabled() check for ZFS filesystems, since bash
functions return "0" for true.  Otherwise, all ZFS tests that
check large_xattr_enabled() will be skipped.

Fixes: 84097792f56c ("LU-12784 llite: limit max xattr size by kernel value")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie566244c6b1f46b947a96331e7623b9b863ebbe5
Reviewed-on: https://review.whamcloud.com/45264
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15057 utils: pool quota man 21/45121/4
Sergey Cheremencev [Wed, 31 Mar 2021 12:13:53 +0000 (15:13 +0300)]
LU-15057 utils: pool quota man

Add pool quota documentation to the man pages for the
setquota and quota commands.
Remove [-o <obd_uuid>|-i <mdt_idx>|-I <ost_idx>]
from the "lfs quota -t" case, since the grace period
is stored only on the quota master. Furthermore, the
command lfs quota -t -I 0 /mnt/testfs fails
with EOPNOTSUPP.

Test-Parameters: trivial
HPE-bug-id: LUS-9869
Change-Id: I368e22b782bd3626f64907059ea329e94986535b
Reviewed-on: https://es-gerrit.dev.cray.com/158556
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45121
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-13756 quota: up_read leak in qmt_pool_lookup 06/45106/7
Sergey Cheremencev [Thu, 30 Sep 2021 15:58:16 +0000 (18:58 +0300)]
LU-13756 quota: up_read leak in qmt_pool_lookup

qmt_pool_lock is not released if qti_pools_add fails in
qmt_pool_lookup.

Change-Id: Ic2adb44468d51af7aefcbb91279260ae6f85d67a
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45106
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14975 dne: dir migration in non-recursive mode 02/44802/10
Lai Siyao [Thu, 26 Aug 2021 11:37:09 +0000 (07:37 -0400)]
LU-14975 dne: dir migration in non-recursive mode

Add a "-d|--directory" option to "lfs migrate -m" to migrate
only the specified directory, similar to "ls -d".

Add sanity 230w.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ib97949e3840a3b49f7074b16e259582a9bf16e3b
Reviewed-on: https://review.whamcloud.com/44802
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14956 fld: repeat failed FLDB lookup 23/44723/13
Alex Zhuravlev [Mon, 23 Aug 2021 07:29:18 +0000 (10:29 +0300)]
LU-14956 fld: repeat failed FLDB lookup

It's possible that LWP reconnection is in progress after a remote
MDS restart. If FLDB misses an entry, then the FLDB lookup can fail
with EAGAIN and the whole RPC processing (like MDS_REINT) can fail
as well. Retry the lookup a few times in case of EAGAIN.
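
A bounded retry on EAGAIN, as described above, might look like the
following user-space sketch. The function names and the callback
signature are hypothetical, and the sketch does not wait between
attempts; whether and how the real code waits is not stated in this
commit message.

```c
#include <errno.h>

/* Hypothetical lookup callback: returns 0 or a negative errno. */
typedef int (*sketch_lookup_fn)(void *arg);

/* Call fn until it stops returning -EAGAIN or max_tries is reached. */
static int sketch_lookup_retry(sketch_lookup_fn fn, void *arg, int max_tries)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_tries && rc == -EAGAIN; i++)
		rc = fn(arg);
	return rc;
}

/* Test helper: fails with -EAGAIN while *remaining is positive. */
static int sketch_flaky_lookup(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}
```

The cap on attempts matters: if the entry never appears, the caller
still gets -EAGAIN back instead of looping forever.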

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib6aeaf7706a6465b0c8bee696d985bb440ed192e
Reviewed-on: https://review.whamcloud.com/44723
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9859 mdd: unwind md_capable() 80/44580/8
James Simmons [Tue, 17 Aug 2021 14:16:41 +0000 (10:16 -0400)]
LU-9859 mdd: unwind md_capable()

The inline function md_capable() is just a wrapper
around cap_raised() which adds little benefit. Let's
just remove the use of this wrapper.

Change-Id: I1a5f4b2e34b4cf358b52b3fc4bdeff17fdab50c9
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14754 tests: add Overstripe support to racer 64/43964/5
Elena Gryaznova [Thu, 10 Jun 2021 09:19:44 +0000 (12:19 +0300)]
LU-14754 tests: add Overstripe support to racer

The files are created with an overstripe layout if
RACER_ENABLE_OVERSTRIPE=true is set.

We would like to have the ability to use the "real"
layouts, i.e. to limit the number of stripes per OST
instead of allowing racer to achieve the max LOV_MAX_STRIPE_COUNT
value. Patch adds RACER_LOV_MAX_STRIPECOUNT equal to
LOV_MAX_STRIPE_COUNT by default.

Test-Parameters: trivial testlist=racer
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9466, LUS-9608
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: I550922938438afa121af275fd1d6f60082db9b54
Reviewed-on: https://review.whamcloud.com/43964
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14392 gnilnd: re-enable large I/o buffers 73/41373/2
Shaun Tancheff [Sun, 31 Jan 2021 16:20:54 +0000 (10:20 -0600)]
LU-14392 gnilnd: re-enable large I/o buffers

DVS on gni breaks the LNet 1M handshake of LNET_MAX_IOV.

Introduce GNILND_MAX_IOV with a 4M i/o maximum and a hint
LNET_MD_GNILND so LNet can accept the large buffer w/o complaint.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4e78c0022fdece0d6945bbcc47e2e64d4d181dca
Reviewed-on: https://review.whamcloud.com/41373
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11915 tests: fix conf-sanity 115 test 49/38849/9
Artem Blagodarenko [Fri, 5 Jun 2020 15:45:45 +0000 (11:45 -0400)]
LU-11915 tests: fix conf-sanity 115 test

Not enough xattrs were added to push them outside the inode.
Add one additional xattr. The test works only with FLAKEY=false.

Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Test-Parameters: testlist=conf-sanity env="ONLY=115"
HPE-bug-id: LUS-6966
Change-Id: Iab13ed3434effb03e1209755ac51eba2debe7387
Reviewed-on: https://review.whamcloud.com/38849
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-13542 osd: brw stats are initialized too late 54/38554/6
Andrew Perepechko [Thu, 25 Nov 2021 17:28:01 +0000 (20:28 +0300)]
LU-13542 osd: brw stats are initialized too late

Lustre crashes with the following stack trace:

 [<ffffffffc113cbac>] lprocfs_oh_tally+0x2c/0x40 [obdclass]
 [<ffffffffc169719b>] record_start_io.part.14+0x2b/0x40 [osd_zfs]
 [<ffffffffc1698322>] osd_read+0xa2/0x180 [osd_zfs]
 [<ffffffffc1167dee>] dt_record_read+0x1e/0x70 [obdclass]
 [<ffffffffc1190997>] lustre_index_restore+0x527/0x1720 [obdclass]
 [<ffffffffc16b2564>] osd_initial_OI_scrub+0xa34/0xd50 [osd_zfs]
 [<ffffffffc16b34fd>] osd_scrub_setup+0x9ed/0xb90 [osd_zfs]
 [<ffffffffc168a97b>] osd_mount+0xf4b/0x1380 [osd_zfs]

osd_procfs_init()/osd_stats_init() are called *after*
osd_initial_OI_scrub(), so osd stats are not yet initialized
when osd_read() first tries to update them.

This patch separates osd stats initialization from procfs
initialization so that osd stats are already initialized
by the time scrub starts its own initialization.
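
The ordering problem can be illustrated with a small user-space sketch.
Everything here is hypothetical (struct sketch_osd and the sketch_*
functions are not the osd code), and where the kernel would crash on
the uninitialized stats, the sketch returns -1 so the effect can be
observed safely.

```c
#include <stddef.h>

/* Hypothetical device state: stats stays NULL until initialized. */
struct sketch_osd {
	unsigned long *stats;
	unsigned long  stats_buf[1];
};

static void sketch_stats_init(struct sketch_osd *o)
{
	o->stats_buf[0] = 0;
	o->stats = o->stats_buf;
}

/* Returns 0 on success, -1 where the real code would dereference NULL. */
static int sketch_record_io(struct sketch_osd *o)
{
	if (o->stats == NULL)
		return -1;
	o->stats[0]++;
	return 0;
}

/* Scrub setup performs a read, which tallies I/O stats. */
static int sketch_scrub_setup(struct sketch_osd *o)
{
	return sketch_record_io(o);
}

/* Old order: scrub runs before stats exist. */
static int sketch_mount_old_order(struct sketch_osd *o)
{
	int rc = sketch_scrub_setup(o);

	sketch_stats_init(o);
	return rc;
}

/* New order: stats are set up first, so the scrub read can tally. */
static int sketch_mount_new_order(struct sketch_osd *o)
{
	sketch_stats_init(o);
	return sketch_scrub_setup(o);
}
```

Only the call order differs between the two mount variants, which is
the point of separating stats setup from the rest of the procfs setup.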

Change-Id: I15ab03e77eaab76e3dea8067b849c891e89aa9a8
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-8173
Reviewed-on: https://review.whamcloud.com/38554
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10640 tests: ha.sh script improvements 29/31229/9
Elena Gryaznova [Wed, 17 Nov 2021 10:13:26 +0000 (13:13 +0300)]
LU-10640 tests: ha.sh script improvements

In each load iteration, check for all created directories
that the long-format output of ls does not contain question
marks ('?').
'?'s may be reported if
  stat(2)->getattr()->ll_glimpse_size()
fails, which is expected in case of failover. The test then
waits until recovery is completed, repeats the check,
and exits with an error if '?' still exists after the second check.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-4894
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Change-Id: I88495511797aaad53c923c90f88f92f1412380ce
Reviewed-on: https://review.whamcloud.com/31229
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15252 mdc: add client tunable to disable LSOM update 19/45619/3
Alexander Boyko [Fri, 19 Nov 2021 08:08:16 +0000 (03:08 -0500)]
LU-15252 mdc: add client tunable to disable LSOM update

It seems that mdt_lsom_update() has a serious issue with a single
shared file because of its mdt-level mutex for every close request.
The patch adds an mdc_lsom parameter to mdc; based on its state, the
client does or does not send LSOM updates to the MDT. By default LSOM
is on.

lctl set_param mdc.*.mdc_lsom=[on|off]

For configurations where LSOM is not used, the patch reduces the
MDT load average under a workload in which many threads
open/read/close a single file.

HPE-bug-id: LUS-10604
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Iba0e745a94825641da6b0a1c09488b1e2f54658b
Reviewed-on: https://review.whamcloud.com/45619
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15167 quota: fallocate send UID/GID for quota 75/45475/5
Arshad Hussain [Sun, 7 Nov 2021 14:46:29 +0000 (09:46 -0500)]
LU-15167 quota: fallocate send UID/GID for quota

Calling fallocate() on a newly created file did not account quota
usage properly because the OST object did not have a UID/GID
assigned yet. Update the fallocate code in the OSC to always send
the file UID/GID/PROJID to the OST so that the object ownership
can be updated before space is allocated.

Test-case: sanity-quota/78 added

Fixes: 48457868a02a ("LU-3606 fallocate: Implement fallocate preallocate operation")
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I86d80a7f415a80100f7d2fb5f417cf47bf5b2900
Reviewed-on: https://review.whamcloud.com/45475
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14514 flr: mirror split should not make stale file 24/42024/23
Bobi Jam [Thu, 2 Sep 2021 16:27:34 +0000 (00:27 +0800)]
LU-14514 flr: mirror split should not make stale file

Mirror split could leave a file with only stale mirrors. This patch
prevents removing the last non-stale mirror from the file.
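
The rule can be sketched as a check run before a split. This is a
hypothetical user-space model, not the FLR code: the struct, the
function name, and the -EBUSY error code are all assumptions made for
illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror descriptor: only the stale flag matters here. */
struct sketch_mirror {
	bool stale;
};

/*
 * Return 0 if mirror "victim" may be split off, or a negative errno.
 * Splitting is refused when it would leave only stale mirrors behind.
 */
static int sketch_split_allowed(const struct sketch_mirror *m, size_t n,
				size_t victim)
{
	if (victim >= n)
		return -EINVAL;
	/* Removing a stale mirror cannot reduce the non-stale count. */
	if (m[victim].stale)
		return 0;
	/* Look for another non-stale mirror that would remain. */
	for (size_t i = 0; i < n; i++)
		if (i != victim && !m[i].stale)
			return 0;
	return -EBUSY; /* assumed error code for this sketch */
}
```

For a file with one non-stale and one stale mirror, only the stale one
may be split off under this check.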

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I63007784929a2cd18d2823e2250f7307ca7d8d45
Reviewed-on: https://review.whamcloud.com/42024
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11952 mdt: fix reconstruct open 12/35112/19
Andriy Skulysh [Sun, 3 Mar 2019 18:10:31 +0000 (20:10 +0200)]
LU-11952 mdt: fix reconstruct open

We shouldn't start a new transaction on resend.

Store fid of an opened object and use it during
reconstruction of the resend.

Change-Id: I8c21e9661903d3d4090ad29e43480e2ba7e35c39
Cray-bug-id: LUS-6957, LUS-7286
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/35112
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15169 Revert "LU-14668 lnet: Lock primary NID logic" 86/45386/3
Chris Horn [Tue, 26 Oct 2021 20:23:37 +0000 (15:23 -0500)]
LU-15169 Revert "LU-14668 lnet: Lock primary NID logic"

This patch breaks client mounts under certain LNet configurations.

This reverts commit 024f9303bc6f32a3113357c864765c4f9c93ed03.

Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ic1f1d07694fe49df14c803a9434d673e61c7dd67
Reviewed-on: https://review.whamcloud.com/45386
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-13601 llite: avoid needless large stats alloc 01/40901/13
Andreas Dilger [Tue, 8 Dec 2020 06:54:58 +0000 (23:54 -0700)]
LU-13601 llite: avoid needless large stats alloc

Allocate the ll_rw_extents_info (5896 bytes), ll_rw_offset_info
(6400 bytes), and ll_rw_process_info (640 bytes) structs only
when these stats are enabled, which is very rarely.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I59bbfce8d7f2422d810617d5fa712a67333ebbe5
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-on: https://review.whamcloud.com/40901
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15181 osd-ldiskfs: a typo in osd_declare_write_commit() 23/45423/3
Andrew Perepechko [Sun, 31 Oct 2021 19:03:48 +0000 (22:03 +0300)]
LU-15181 osd-ldiskfs: a typo in osd_declare_write_commit()

A typo in osd_declare_write_commit() makes UBSAN emit warnings like:
UBSAN: Undefined behaviour in /lustre/osd-ldiskfs/osd_io.c:1404:2
shift exponent 4096 is too large for 32-bit type 'int'
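
The commit message does not show the typo itself. One common way to
get this kind of UBSAN message is to shift by a size (4096) where a
shift count (12) was meant. The sketch below illustrates that class of
mistake only; it is an assumption, not the actual change, and the
SKETCH_* macros are hypothetical stand-ins for page-size constants.

```c
#include <stdint.h>

/* Hypothetical page constants for a 4 KiB page. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT) /* 4096 */

/* In C, a shift is only defined when the exponent is below the bit width. */
static int sketch_shift_is_defined(unsigned int exponent,
				   unsigned int type_bits)
{
	return exponent < type_bits;
}

/* Correct conversion: shift by the shift count, not by the page size. */
static uint64_t sketch_bytes_to_pages(uint64_t bytes)
{
	return bytes >> SKETCH_PAGE_SHIFT;
}
```

Shifting a 32-bit int by SKETCH_PAGE_SIZE fails the check above, which
matches the "shift exponent 4096 is too large" wording of the warning.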

Change-Id: Ie612d9c6655211445c00bd17fa1bf7a836af3542
Fixes: f0f92773e ("LU-14187 osd-ldiskfs: fix locking in write commit")
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-on: https://review.whamcloud.com/45423
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
9 months ago LU-15182 git: Add .gitreview file 07/44907/4
Xinliang Liu [Tue, 14 Sep 2021 01:20:32 +0000 (01:20 +0000)]
LU-15182 git: Add .gitreview file

Add a .gitreview file so that we can use "git review -s" to set up the
remote push URL and "git review" to send patches for review.

The git review command is a convenient tool for pushing changes to the
Gerrit review system. See more details here:
https://docs.opendev.org/opendev/git-review/latest/usage.html

Test-Parameters: trivial
Change-Id: Ic8223bdfcb7a696328f921159a63d625359e45a6
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/44907
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9859 gss: replace cfs_size_roundXX macros. 84/45584/3
James Simmons [Tue, 16 Nov 2021 16:07:13 +0000 (11:07 -0500)]
LU-9859 gss: replace cfs_size_roundXX macros.

Many of the cfs_size_roundX() macros are not even used so delete
them. Replace cfs_size_round4() uses in the GSS layer with
round_up(var, 4);
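
For readers outside the kernel tree, rounding up to a multiple of 4 can
be sketched in plain C as below. SKETCH_ROUND_UP and sketch_round4()
are hypothetical user-space stand-ins, not the kernel's round_up()
macro; like the kernel macro, this form assumes the alignment is a
power of two.

```c
#include <stddef.h>

/* Round x up to the next multiple of a, where a is a power of two. */
#define SKETCH_ROUND_UP(x, a) \
	(((x) + ((size_t)(a) - 1)) & ~((size_t)(a) - 1))

/* Round a length up to the next 4-byte boundary. */
static size_t sketch_round4(size_t len)
{
	return SKETCH_ROUND_UP(len, 4);
}
```

Lengths that are already multiples of 4 are returned unchanged, so the
helper can be applied unconditionally.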

Change-Id: Id35f0f7b60f8d00f425d9b15d2a76aa4d0fa5f2f
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/45584
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
9 months ago LU-15217 pcc: disable PCC for encrypted files 45/45545/4
Qian Yingjin [Fri, 30 Jul 2021 08:47:55 +0000 (16:47 +0800)]
LU-15217 pcc: disable PCC for encrypted files

When files are encrypted in Lustre using fscrypt, they should
normally not be accessible to users without the proper encryption
key. However, if a user has the encryption key loaded when they
read a file, it may be decrypted in memory and saved to the PCC
backend in unencrypted form.

For this reason, disable PCC caching for encrypted files.

DDN-bug-id: EX-3571
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6c363dcad7a6bc8520350c0295f6e221bec3abb0
Reviewed-on: https://review.whamcloud.com/45545
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14325 tests: skip replay-single 134 for older servers 50/45450/3
James Nunez [Wed, 3 Nov 2021 19:39:06 +0000 (13:39 -0600)]
LU-14325 tests: skip replay-single 134 for older servers

The fix for a PFL file lost during recovery was landed to
Lustre 2.13.53.  Servers prior to 2.13.53 will fail the
replay-single test, test_134, added to the original patch
to check that PFL files are not lost.  Thus, we need to
skip this test for Lustre servers less than 2.13.53.

Fixes: 72d45e1d344c ("LU-13809 mdc: fix lovea for replay")
Test-Parameters: trivial env=ONLY=134 testlist=replay-single
Test-Parameters: serverdistro=el7.9 serverversion=2.12.7 env=ONLY=134 testlist=replay-single
Test-Parameters: serverdistro=el7.7 serverversion=2.13.0 env=ONLY=134 testlist=replay-single
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id70f9e06f6221f88a54d696afce9de70cbcf1efa
Reviewed-on: https://review.whamcloud.com/45450
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
9 months ago LU-15186 o2iblnd: Default map_on_demand to 1 31/45431/3
Chris Horn [Mon, 1 Nov 2021 20:06:31 +0000 (15:06 -0500)]
LU-15186 o2iblnd: Default map_on_demand to 1

On kernels that provide global MR we default to using that exclusively
even if FMR/FastReg is available. This causes an interop issue if the
active side of a connection request has a higher fragment count than
the passive side, because FMR/FastReg may be needed to map the higher
fragment count. We should change the default map_on_demand to 1 so
that FMR/FastReg is used by default. map_on_demand can still be set
to 0 if needed.

Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I76010a905f151efbb0b109ae6f5fba6fb7ce1956
Reviewed-on: https://review.whamcloud.com/45431
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15166 tests: restore osp-syn threads after test_818 75/45375/4
Vladimir Saveliev [Tue, 26 Oct 2021 16:53:01 +0000 (19:53 +0300)]
LU-15166 tests: restore osp-syn threads after test_818

test_818() is supposed to leave the osp-syn threads up when it ends;
otherwise, the following tests get "logging isn't available, run LFSCK".

Use fail $SINGLEMDS for that.

Test-Parameters: trivial testlist=sanity
HPE-bug-id: LUS-10495
Change-Id: Ib4876f4c4d39fc87f86788d8611838b8078e4aac
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45375
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12857 tests: allow clients to be IDLE after recovery 18/45318/3
Andreas Dilger [Thu, 21 Oct 2021 01:47:25 +0000 (19:47 -0600)]
LU-12857 tests: allow clients to be IDLE after recovery

If clients are not connected to an OST when it fails (connection
is IDLE), they do not need to be involved in recovery, so this
should not be considered an error when checking the client state.
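
A minimal sketch of the relaxed check, assuming the client import state is available as a string. import_state_ok_after_recovery() is a hypothetical name, and treating "FULL" as the state previously required is an assumption of this sketch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Accept a client import state after OST recovery (hypothetical helper).
 * "IDLE" is now tolerated alongside the fully connected state.
 */
bool import_state_ok_after_recovery(const char *state)
{
	return strcmp(state, "FULL") == 0 || strcmp(state, "IDLE") == 0;
}
```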

Test-Parameters: trivial testlist=recovery-mds-scale env=SLOW=no
Test-Parameters: testlist=conf-sanity
Test-Parameters: testlist=replay-dual,replay-single
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6cfeb718acd233378ed1608f22061bc15c3ebbe5
Reviewed-on: https://review.whamcloud.com/45318
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15126 kernel: RHEL 8.5 server support 06/45306/9
Jian Yu [Wed, 17 Nov 2021 20:11:43 +0000 (12:11 -0800)]
LU-15126 kernel: RHEL 8.5 server support

This patch makes changes to support RHEL 8.5 release
with kernel 4.18.0-348.2.1.el8_5 for Lustre server.

Test-Parameters: trivial fstype=ldiskfs \
env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Test-Parameters: trivial fstype=zfs \
env=SANITY_EXCEPT="101j" \
clientdistro=el8.5 serverdistro=el8.5 testlist=sanity

Change-Id: Ie976d8fd3e6fcf8a564eff8a41ad0fd51b2c858c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45306
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15222 build: Update ZFS version to 2.0.6 67/45567/4
Jian Yu [Mon, 15 Nov 2021 17:01:18 +0000 (09:01 -0800)]
LU-15222 build: Update ZFS version to 2.0.6

Update ZFS version to 2.0.6. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.0.6

Change-Id: I2a7df45b79f402c3d3bce8b137edd11b5224b576
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45567
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5] 85/45285/8
Jian Yu [Wed, 17 Nov 2021 19:58:14 +0000 (11:58 -0800)]
LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial env=SANITY_EXCEPT="101j" \
clientdistro=el8.5

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45285
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
9 months ago LU-15184 llite: properly detect SELinux disabled case 01/45501/4
Sebastien Buisson [Tue, 9 Nov 2021 16:03:19 +0000 (17:03 +0100)]
LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.
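
The two cases above can be folded into one predicate, sketched below in userspace C. selinux_looks_disabled() is a hypothetical helper; it takes the return code and the context buffer as plain arguments rather than calling any kernel API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether the result of a security-context
 * lookup means "SELinux is disabled".
 */
bool selinux_looks_disabled(int rc, const void *ctx, size_t ctx_len)
{
	/* Usual case: the lookup itself reports "not supported". */
	if (rc == -EOPNOTSUPP)
		return true;

	/* Some kernels return success but leave the context empty. */
	return rc == 0 && (ctx == NULL || ctx_len == 0);
}
```

Any other error code is treated as a real failure here, which is a design choice of the sketch rather than something the message states.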

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45501
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15119 tgt: BUG at tgt_brw_read+0x16bf/0x1d80 73/45273/3
Andriy Skulysh [Wed, 6 Oct 2021 10:25:12 +0000 (13:25 +0300)]
LU-15119 tgt: BUG at tgt_brw_read+0x16bf/0x1d80

struct tgt_thread_big_cache {
  local = {{
      lnb_file_offset = 0,
      lnb_page_offset = 0,
      lnb_len = 0,
      lnb_rc = 0,
      lnb_page = 0xffffddee74fae100,
so npages_read becomes 0

Change-Id: Ie2201c9fc6f0350b1c6dcb480cff52f44d5413db
HPE-bug-id: LUS-10510
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/45273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15110 quota: cosmetic changes in PQ 58/45258/3
Sergey Cheremencev [Fri, 15 Oct 2021 14:11:47 +0000 (17:11 +0300)]
LU-15110 quota: cosmetic changes in PQ

cosmetic changes in PQ:
- make tgt_pool_free and qmt_sarr_pool_free void
- remove outdated comment from qmt_pool_lqes_lookup
- replace tabs with spaces

HPE-bug-id: LUS-9547
Change-Id: If4918b647eed1d971d00c521d010d0c72d349207
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45258
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15079 quota: include qsd_thread_info into mgs thread context 81/45181/2
Vladimir Saveliev [Tue, 24 Aug 2021 14:57:37 +0000 (17:57 +0300)]
LU-15079 quota: include qsd_thread_info into mgs thread context

mgs service thread envs do not get supplied with qsd_thread_info, which
may lead to the failure shown below:
(lu_object.h:1274:lu_env_info()) ASSERTION( info ) failed:
(lu_object.h:1274:lu_env_info()) LBUG
Pid: 146951, comm: ll_mgs_0003 3.10.0-957.1.3957.1.3.x4.3.25.x86_64 #1 SMP
Call Trace:
 libcfs_call_trace+0x8e/0xf0 [libcfs]
 lbug_with_loc+0x4c/0xa0 [libcfs]
 qsd_refresh_usage+0x25e/0x2f0 [lquota]
 qsd_op_adjust+0x2f1/0x730 [lquota]
 osd_object_delete+0x2b2/0x360 [osd_ldiskfs]
 lu_object_free.isra.32+0x68/0x170 [obdclass]
 lu_site_purge_objects+0x2fe/0x530 [obdclass]
 lu_object_find_at+0x371/0xa60 [obdclass]
 dt_locate_at+0x1d/0xb0 [obdclass]
 llog_osd_open+0x50e/0xf30 [obdclass]
 llog_open+0x15a/0x3e0 [obdclass]
 llog_origin_handle_open+0x334/0x720 [ptlrpc]
 tgt_llog_open+0x33/0xe0 [ptlrpc]
 mgs_llog_open+0x46/0x460 [mgs]
 tgt_request_handle+0x96a/0x1680 [ptlrpc]

Supply mgs service context with qsd_thread_info.

Change-Id: If8664b81e1f64df015dad46ba26c9c1d1e3f54bf
HPE-bug-id: LUS-10334
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45181
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15059 nrs: do not overwrite "cmd" in nrs_tbf_rule 42/45142/4
Etienne AUJAMES [Wed, 6 Oct 2021 20:11:17 +0000 (22:11 +0200)]
LU-15059 nrs: do not overwrite "cmd" in nrs_tbf_rule

"cmd" pointer inside ptlrpc_lprocfs_nrs_tbf_rule_seq_write() and
nrs_tbf_parse_cmd are static. This could cause a double kfree call
because "cmd" could be overwritten by another "nrs_tbf_rule" write
instance.

Let's try to remove the "static" definition.
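
The hazard can be seen in a small userspace sketch. Both functions below are hypothetical stand-ins, not the Lustre code: the static variant keeps one pointer shared by every call, so a second writer sees (and could free) what the first one stored, while the non-static variant starts fresh on each call.

```c
#include <stddef.h>

/* Hazard: "cmd" is static, so every call shares one pointer. */
const char *store_cmd_static(const char *new_cmd)
{
	static const char *cmd;
	const char *previous = cmd;

	cmd = new_cmd;
	return previous; /* what an earlier caller stored, or NULL */
}

/* Fix: "cmd" is an automatic variable, so calls do not interact. */
const char *store_cmd_local(const char *new_cmd)
{
	const char *cmd = NULL;
	const char *previous = cmd;

	cmd = new_cmd;
	(void)cmd;
	return previous; /* always NULL */
}
```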

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I8cd7d9dd0483778c82bbf8711c07e49255983f4b
Reviewed-on: https://review.whamcloud.com/45142
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
9 months ago LU-14699 mdd: proactive changelog garbage collection 68/45068/9
Mikhail Pershin [Fri, 24 Sep 2021 15:47:44 +0000 (18:47 +0300)]
LU-14699 mdd: proactive changelog garbage collection

Currently changelog garbage collection starts when a user
exceeds the maximum idle timeout. There is also a limit on the
number of idle records, but it is applied only to old changelog
users which have no cur_time field, so in practice it is never
used nowadays. Another problem is that garbage collection is
started only when the changelog is almost full. This often
leads to situations where the changelog has very old users
that stay much longer than the idle timeout, with idle
records above the maximum limit consuming space for nothing.

Patch reworks changelog GC in the following way:
- GC starts when changelog is almost full (old way) or
  either idle time or idle records limits are exceeded or
  when (idle_time * idle_records) exceeds its limit as well.
  The last limit is calculated as:
  (idle_time * idle_records) / 84600 > (1 << 32) which is a
  reasonable heuristic for deciding if a user is "too idle"
  both when lots of records are being created quickly and
  when a user has been idle for a very long time.
- to avoid processing all changelog users each time GC
  checks these conditions, both the least user record and time
  are tracked when changelog users are initialized or
  purged/canceled. Both values are stored as mdd_changelog
  fields mc_minrec and mc_mintime
- test 160g is changed to test the new approach when idle
  indexes are checked always along with idle time checks
- test 160s is added in sanity.sh to check heuristic approach
  with (idle_time * idle_records) value checking
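
The per-user idleness conditions listed above can be sketched as a single predicate. This is an illustration only: changelog_user_too_idle() and its parameters are hypothetical names, the divisor and threshold are copied as written in this message, and the "changelog almost full" trigger is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Divisor and threshold copied as written in the message above. */
#define IDLE_PRODUCT_DIVISOR	84600ULL
#define IDLE_PRODUCT_LIMIT	(1ULL << 32)

/* Hypothetical check: is this changelog user idle enough to collect? */
bool changelog_user_too_idle(uint64_t idle_time, uint64_t idle_records,
			     uint64_t max_idle_time,
			     uint64_t max_idle_records)
{
	/* Either individual limit is exceeded. */
	if (idle_time > max_idle_time || idle_records > max_idle_records)
		return true;

	/* Combined heuristic; assumes the product fits in 64 bits. */
	return (idle_time * idle_records) / IDLE_PRODUCT_DIVISOR >
	       IDLE_PRODUCT_LIMIT;
}
```

The product term catches users that stay below both individual limits but are moderately idle on both axes at once.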

Fixes: 3442db6faf68 ("LU-7340 mdd: changelogs garbage collection")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6028f3164212a2377a4fc45b60a826c64f859099
Reviewed-on: https://review.whamcloud.com/45068
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14957 mdd: prepare xattrs before migration 41/44741/2
Lai Siyao [Thu, 5 Aug 2021 15:30:22 +0000 (11:30 -0400)]
LU-14957 mdd: prepare xattrs before migration

In directory migration, the xattrs should be prepared before starting
the transaction; otherwise, if the remote MDT is down, the local MDT
will get stuck as well.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I79279e7b0c051a7542a71066fffd4ad70f559368
Reviewed-on: https://review.whamcloud.com/44741
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14941 lnet: Fix source specified to routed destination 30/44730/5
Chris Horn [Thu, 12 Aug 2021 21:16:05 +0000 (16:16 -0500)]
LU-14941 lnet: Fix source specified to routed destination

If a source NI is specified for a send then we should not modify the
destination NID that was passed to lnet_send().

HPE-bug-id: LUS-10301
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie47558d5bce97a0dca30ff7d072dcd39eb903324
Reviewed-on: https://review.whamcloud.com/44730
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14940 lnet: Fix source specified send to different net 28/44728/3
Chris Horn [Thu, 12 Aug 2021 21:08:44 +0000 (16:08 -0500)]
LU-14940 lnet: Fix source specified send to different net

The destination NI is fixed for all source-specified sends. Thus, in
order for a source-specified send to be considered "local", i.e. a
send that does not require a route, the destination NID must be on
the same net as the specified source.
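
As a rough sketch of that rule, assuming a NID can be split into a network part and an address part: struct demo_nid and src_specified_send_is_local() are hypothetical and do not reflect LNet's real NID encoding.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified NID for illustration: a network id plus an address. */
struct demo_nid {
	uint32_t net;
	uint32_t addr;
};

/* A source-specified send needs no route only if both NIDs share a net. */
bool src_specified_send_is_local(struct demo_nid src, struct demo_nid dst)
{
	return src.net == dst.net;
}
```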

HPE-bug-id: LUS-10303
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I4847db1d393bbc36def65123f260b928ebbf944e
Reviewed-on: https://review.whamcloud.com/44728
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14939 lnet: Allow specifying a source NID for lnetctl ping 27/44727/5
Chris Horn [Thu, 12 Aug 2021 16:26:07 +0000 (11:26 -0500)]
LU-14939 lnet: Allow specifying a source NID for lnetctl ping

Add a new --source option for lnetctl ping command. This allows the
user to specify a local NI from which to send the ping. This also
ensures that the specified destination NID is also used. Otherwise,
pings to multi-rail peers may end up going to a different peer NI
based on the multi-rail selection algorithm. The ability to specify
a source NI, and thus fix the destination NI, is a great help in
troubleshooting communication issues between multi-rail peers.

Add test to exercise lnetctl ping --source option.

HPE-bug-id: LUS-10296
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I454217b30a92414de537880f076a11a693b1f0b3
Reviewed-on: https://review.whamcloud.com/44727
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14867 doc: a few words about an asterisk in lfs quota 57/44357/4
Sergey Cheremencev [Thu, 10 Jun 2021 10:08:28 +0000 (13:08 +0300)]
LU-14867 doc: a few words about an asterisk in lfs quota

Clarify the difference between an asterisk printed per OST
in verbose lfs quota output and an asterisk near the whole
filesystem usage.

Change-Id: I778fe1f7b1f6f8d55c311d81bd2b311d82463390
Test-Parameters: trivial
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44357
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14437 gnilnd: use ktime_get_seconds() to get time 79/41679/4
Shaun Tancheff [Sat, 9 Oct 2021 04:33:23 +0000 (11:33 +0700)]
LU-14437 gnilnd: use ktime_get_seconds() to get time

Use ktime_get_seconds() to directly get the time instead of
getting a timespec and converting it.

Fixes: 4b0e495e3c ("LU-14080 gnilnd: updates for SUSE 15 SP2")
Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I256855ceb9e038a9960fa76fe6e3bfe63fb16580
Reviewed-on: https://review.whamcloud.com/41679
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14188 osd: remove not used iam_container lock 90/40890/7
Artem Blagodarenko [Sat, 5 Dec 2020 01:40:52 +0000 (20:40 -0500)]
LU-14188 osd: remove not used iam_container lock

There is an rw_semaphore
struct iam_container {
    ...
        /*
         * read-write lock protecting index consistency.
         */
        struct rw_semaphore     ic_sem;   <<<<<<
        struct dynlock       ic_tree_lock;
        /* Protect ic_idle_bh */
        struct mutex         ic_idle_mutex;
     ...
};

There is initialization
 2    234  lustre/osd-ldiskfs/osd_iam.c <<iam_container_init>>
             init_rwsem(&c->ic_sem);

There are wrappers
   3    622  lustre/osd-ldiskfs/osd_iam.c <<iam_container_write_lock>>
             down_write(&ic->ic_sem);
   4    627  lustre/osd-ldiskfs/osd_iam.c <<iam_container_write_unlock>>
             up_write(&ic->ic_sem);
   5    632  lustre/osd-ldiskfs/osd_iam.c <<iam_container_read_lock>>
             down_read(&ic->ic_sem);
   6    637  lustre/osd-ldiskfs/osd_iam.c <<iam_container_read_unlock>>
             up_read(&ic->ic_sem);

But these wrappers are not used and, based on the git history, have never
been used. Let's delete this useless code.

Change-Id: Ied1122f034e53fee08888e1091f700bda4507f00
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
HPE-bug-id: LUS-9545
Reviewed-on: https://review.whamcloud.com/40890
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14073 ldiskfs: update ldiskfs patches for Linux 5.9 97/40397/4
Mr NeilBrown [Tue, 9 Nov 2021 15:55:33 +0000 (10:55 -0500)]
LU-14073 ldiskfs: update ldiskfs patches for Linux 5.9

Minor conflict changes only.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If0b58fc278f6721e44e26c6052509c31068f8e78
Reviewed-on: https://review.whamcloud.com/40397
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9325 obdclass: make niduuid for lustre_stop_mgc() static 17/33617/8
James Simmons [Mon, 1 Nov 2021 18:31:24 +0000 (14:31 -0400)]
LU-9325 obdclass: make niduuid for lustre_stop_mgc() static

The process to create a proper string for niduuid can be made
simpler and avoid a memory allocation.

Change-Id: I52cb01117e41cbcf2756477e91934a42d31fd157
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/33617
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12511 ldlm: free resource when ldlm_lock_create() fails. 85/45585/2
Mr. NeilBrown [Tue, 16 Nov 2021 16:11:22 +0000 (11:11 -0500)]
LU-12511 ldlm: free resource when ldlm_lock_create() fails.

ldlm_lock_create() gets a resource, but doesn't put it on
all failure paths. It should.
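
The get/put pairing can be illustrated with a small userspace sketch. All names here (demo_resource, demo_lock, demo_lock_create()) are hypothetical and the refcount is a plain int; the point is only that every failure exit drops the reference taken at the top.

```c
#include <stddef.h>

struct demo_resource {
	int refcount;
};

struct demo_lock {
	struct demo_resource *res;
};

void demo_resource_get(struct demo_resource *res) { res->refcount++; }
void demo_resource_put(struct demo_resource *res) { res->refcount--; }

/*
 * Take a reference on res and attach it to lock. init_rc simulates a
 * later setup step: non-zero means that step failed.
 */
int demo_lock_create(struct demo_lock *lock, struct demo_resource *res,
		     int init_rc)
{
	demo_resource_get(res);
	lock->res = res;

	if (init_rc != 0)
		goto out_put;
	return 0;

out_put:
	/* Failure path: drop the reference taken above. */
	lock->res = NULL;
	demo_resource_put(res);
	return init_rc;
}
```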

Change-Id: Ib49bcafdeac834c412adad9db135034d1ea06a04
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/45585
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12511 llite: fix misuse of current->parent. 80/45580/2
Mr. NeilBrown [Mon, 15 Nov 2021 19:16:54 +0000 (14:16 -0500)]
LU-12511 llite: fix misuse of current->parent.

current->parent is used by ptrace to redirect some signal delivery
to the ptracer. It should only be used by 'ptrace' or 'signal' code.
All other users should use current->real_parent, which is the real
parent.

Change-Id: I3212fd9f3db9935b8144dba8af82eb2f387a5c45
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/45580
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15083 ldiskfs: disable xattr credits check 46/45546/2
Li Dongyang [Fri, 12 Nov 2021 12:24:05 +0000 (23:24 +1100)]
LU-15083 ldiskfs: disable xattr credits check

Add ext4-xattr-disable-credits-check.patch to the
ldiskfs series from kernel 4.15+ except the
rhel8 ones. They already have this fix.

Change-Id: I1f9818c394377cdba6b11d2b24c26d2354f41ac0
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/45546
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9162 lod: option to set max stripe count per filesystem 32/45532/7
Lei Feng [Thu, 11 Nov 2021 06:06:39 +0000 (14:06 +0800)]
LU-9162 lod: option to set max stripe count per filesystem

Add an option to set max default stripe count when the stripe count
is set to -1.
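
One way to picture the option is sketched below. effective_stripe_count() and its max_default parameter are hypothetical names, and treating a cap of 0 as "no cap" is an assumption of this sketch, not something the message states.

```c
#include <stdint.h>

/*
 * stripe_count == -1 means "stripe over all OSTs". max_default caps that
 * case only; 0 is assumed here to mean "no cap".
 */
uint32_t effective_stripe_count(int32_t stripe_count, uint32_t ost_count,
				uint32_t max_default)
{
	uint32_t count;

	/* An explicit stripe count is used as given. */
	if (stripe_count != -1)
		return (uint32_t)stripe_count;

	count = ost_count;
	if (max_default != 0 && count > max_default)
		count = max_default;
	return count;
}
```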

Change-Id: I02634a02a6f6579750fe964662b7e644af1689d6
Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/45532
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14781 osp: osp_object_free access NULL pointer 42/45442/3
Bobi Jam [Tue, 2 Nov 2021 07:27:59 +0000 (15:27 +0800)]
LU-14781 osp: osp_object_free access NULL pointer

If an osp_object is created by multiple threads at the same time,
lu_object_find_at() could allocate an osp_object without object
initialization. If, before inserting the object into the hash, it finds
that another object has been created and inserted by another thread, it
will free the uninitialized osp_object, and osp_object_free() will
access an uninitialized list_head (opo_xattr_list).

This patch initializes osp_object::opo_xattr_list in its allocation
function.
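
The idea can be sketched in userspace with a minimal list_head reimplementation. demo_object and its helpers are hypothetical names; the sketch only shows that initializing the list at allocation time makes the free path safe even when no later initialization step ran.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Minimal circular doubly linked list head, modelled on the kernel's. */
struct list_head {
	struct list_head *next, *prev;
};

void init_list_head(struct list_head *h) { h->next = h; h->prev = h; }
bool list_is_empty(const struct list_head *h) { return h->next == h; }

struct demo_object {
	struct list_head xattr_list;
};

/* Initialize the list at allocation, before any other setup can fail. */
struct demo_object *demo_object_alloc(void)
{
	struct demo_object *obj = calloc(1, sizeof(*obj));

	if (obj != NULL)
		init_list_head(&obj->xattr_list);
	return obj;
}

/* Walk the list and free the object; returns the entries seen. */
int demo_object_free(struct demo_object *obj)
{
	struct list_head *pos;
	int seen = 0;

	for (pos = obj->xattr_list.next; pos != &obj->xattr_list;
	     pos = pos->next)
		seen++;
	free(obj);
	return seen;
}
```

Without the init_list_head() call, the zeroed next pointer would be NULL and the walk in demo_object_free() would dereference it.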

Call trace:
            lu_object_free.isra.30+0xf2/0x170 [obdclass]
            lu_object_find_at+0x496/0x930 [obdclass]
            lod_initialize_objects+0x3e4/0xba0 [lod]
            lod_parse_striping+0x693/0xc20 [lod]
            lod_striping_load+0x2b2/0x660 [lod]
            lod_declare_destroy+0x12b/0x600 [lod]
            mdd_declare_finish_unlink+0x91/0x210 [mdd]
            mdd_unlink+0x48f/0xab0 [mdd]
            mdt_reint_unlink+0xc32/0x1550 [mdt]
            mdt_reint_rec+0x83/0x210 [mdt]
            mdt_reint_internal+0x6e1/0xb00 [mdt]
            mdt_reint+0x67/0x140 [mdt]
            tgt_request_handle+0xaee/0x15f0 [ptlrpc]
            ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
            ptlrpc_main+0xb34/0x1470 [ptlrpc]
            kthread+0xd1/0xe0

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib86aca5b41e94a1758f177655ea3a0f680335e0f
Reviewed-on: https://review.whamcloud.com/45442
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
9 months ago LU-15114 osp: changes queuing throttle 65/45265/8
Alexander Zarochentsev [Tue, 31 Aug 2021 05:36:30 +0000 (08:36 +0300)]
LU-15114 osp: changes queuing throttle

Prevent queue of sync changes from growing too much
by adding resends when queue size reaches
some (tunable) limit.

HPE-bug-id: LUS-10345
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I5efabb91d3700c58d9451f81c5fed9a22ae404fb
Reviewed-on: https://review.whamcloud.com/45265
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-14491 ldiskfs: do not corrupt journal with bh change rh7.6 64/45164/2
Andrew Perepechko [Fri, 8 Oct 2021 10:05:50 +0000 (17:05 +0700)]
LU-14491 ldiskfs: do not corrupt journal with bh change rh7.6

Currently, ldiskfs_xattr_delete_inode() zeroes xattr inode
references in cached buffers that haven't been prepared by
get_write_access().

When using journal checksums, it is possible that these buffers
are modified after the checksum is calculated but before the
buffer has been written to journal. Journal replay will fail
with a journal checksum error message if this transaction needs
to be replayed.

This is a port of:

Lustre-commit: d563f5140ffa183d0854cf7cd493ad884e314e3d
Lustre-change: https://review.whamcloud.com/41896

Test-Parameters: trivial
HPE-bug-id: LUS-9483
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I5ce1016737a16ddf74811c43ae74296d4f3e03b0
Reviewed-on: https://review.whamcloud.com/45164
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15068 ptlrpc: Do not unlink difficult reply until sent 38/45138/3
Chris Horn [Tue, 5 Oct 2021 19:11:29 +0000 (14:11 -0500)]
LU-15068 ptlrpc: Do not unlink difficult reply until sent

If a difficult reply is queued in LNet, or the PUT for it is
otherwise delayed, then it is possible for the commit callback
to unlink the reply MD which will abort the send. This results in
client hitting "slow reply" timeout for the associated RPC and
an unnecessary reconnect (and possibly resend).

This patch replaces the rs_on_net flag with rs_sent and rs_unlinked.
These flags indicate whether the send event for the reply MD has
been generated, and whether the MD has been unlinked, respectively.

If rs_sent is set, but rs_unlinked has not been set, then ptlrpc_hr
is free to unlink the reply MD as a result of the commit callback.
The reply-ack will simply be dropped by the server.

If ptlrpc_hr is processing the reply because of commit callback, and
rs_sent has not been set, then ptlrpc_hr will not unlink the reply
MD. This means that the reply_out_callback must also be modified to
check for this case when the send event occurs. Otherwise, if the ACK
never arrives from the client, then the MD would never be unlinked.
Thus when the send event occurs, and rs_handled is set, the
reply_out_callback will schedule the reply for handling by ptlrpc_hr.
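
The flag rules above can be condensed into a small sketch. The struct below reuses the flag names from this message, but the functions are hypothetical and the comment on rs_handled states an assumed meaning.

```c
#include <stdbool.h>

struct demo_reply_state {
	bool rs_sent;     /* send event for the reply MD was generated */
	bool rs_unlinked; /* reply MD has been unlinked */
	bool rs_handled;  /* assumed: ptlrpc_hr already processed the reply */
};

/* Commit-callback path: may ptlrpc_hr unlink the reply MD now? */
bool hr_may_unlink_md(const struct demo_reply_state *rs)
{
	return rs->rs_sent && !rs->rs_unlinked;
}

/*
 * Send-event path: record the event and report whether the reply should
 * be scheduled for handling again so the MD can then be unlinked.
 */
bool on_send_event(struct demo_reply_state *rs)
{
	rs->rs_sent = true;
	return rs->rs_handled;
}
```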

HPE-bug-id: LUS-10505
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ib8f4853c7ab35d72624fce7ee3fba9e59a746e1f
Reviewed-on: https://review.whamcloud.com/45138
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15065 quota: fix BIO write performance drop 33/45133/7
Sergey Cheremencev [Wed, 15 Sep 2021 15:05:45 +0000 (18:05 +0300)]
LU-15065 quota: fix BIO write performance drop

Before the patch qti_lqes_qunit_min used int to store the qunit
value, while the lqe_qunit type is __u64. lqe_qunit > 2G caused
an overflow in a local integer argument. For example, when the
block hard limit was set to 500TB (i.e. lqe_qunit was about
64TB in a system with 2 OSTs), qti_lqes_qunit_min returned
0 instead of 64TB in qmt_lvbo_fill. Thus the new qunit was not
set on OSTs (qsd_set_qunit wasn't called). Without the qunit,
the OST began to send a release request after each acquire. For
example, to write 10MB at the OST, 2 acquire and 2 release
requests were sent (as qunit was not set on the OST). With the
fix, i.e. in the normal case, the OST needs just one acquire
request. The issue caused a performance drop in buffered
writes of up to 15%-20% compared with a baseline without PQ
patches.
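
The truncation can be reproduced in a few lines. This sketch uses uint32_t instead of int so the narrowing is well defined in C, and it assumes the value is held in bytes; both function names are hypothetical.

```c
#include <stdint.h>

/* Narrow a 64-bit qunit to 32 bits, keeping only the low 32 bits. */
uint32_t qunit_narrowed(uint64_t qunit)
{
	return (uint32_t)qunit;
}

/* Smaller of two qunit values, kept in 64 bits throughout. */
uint64_t qunit_min64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}
```

A 64TB value (64ULL << 40) has all of its low 32 bits clear, so the narrowed result is 0, which matches the symptom described above; the 64-bit version preserves it.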

Note, the issue exists only when a hard limit is set to some
high value (>100GB). The exact hard limit value depends on the number
of OSTs in the system and on the amount of used space, but the issue
does not appear on a clean system with 2 OSTs and a hard block limit
of 100G (this case was checked).

Remove qmt_pool_hash - it is not used anywhere since
"LU-11023 quota: remove quota pool ID".

HPE-bug-id: LUS-10250
Change-Id: I2c4ce38f5b9395ed1f4868d4c8efc00751116b15
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45133
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15179 tests: add trap cleanup_quota_test 18/45418/5
Sergey Cheremencev [Fri, 29 Oct 2021 19:00:46 +0000 (22:00 +0300)]
LU-15179 tests: add trap cleanup_quota_test

Add stack_trap cleanup_quota_test to the tests that
use setup_quota_test. If a test fails without calling
cleanup_quota_test, it may cause later tests to fail
due to used space > 0.

Remove ${tdir}_dom, if exists, in cleanup_quota_test.
sanity-quota_75 doesn't remove test_dom directory.

Test-Parameters: trivial  testlist=sanity-quota
Fixes: a4fbe734 ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Change-Id: Ife4fd499b427bee79f74a5e172d233fe6a83e240
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45418
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-15025 quota: stale edquot after clearing limits 00/45000/10
Sergey Cheremencev [Fri, 18 Jun 2021 16:08:16 +0000 (19:08 +0300)]
LU-15025 quota: stale edquot after clearing limits

When the hard and soft limits are set to 0, the lqe enforced flag is
also set to false. As qmt_adjust_qunit does not handle non-enforced
lqes, edquot set for the pool continues to be true and a user
gets -EDQUOT even if all pool limits are cleared. This was OK
for the global pool lqe, since once it is turned off, zero limits are
sent to OSTs, causing OSTs to release all granted space and
avoid EDQUOT. Fix this for PQ: set edquot and qunit to zero
once the appropriate lqe becomes "not enforced".

HPE-bug-id: LUS-10146
Change-Id: I1e08929bae7e1b37b1e8cbbc44859a786b5fb090
Reviewed-on: https://es-gerrit.dev.cray.com/158915/
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45000
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months agoLU-15191 quota: set correct revoke_time 47/45447/4
Sergey Cheremencev [Tue, 12 Oct 2021 15:21:49 +0000 (18:21 +0300)]
LU-15191 quota: set correct revoke_time

When qmt_adjust_qunit handles several lqes and lqe_revoke_time
is already set for some of them, the appropriate OSTs have
already been notified with the least qunit and there is no
chance to free more space.
If the qunit of the current lqe becomes equal to the least
qunit, find the lqe with the minimum (earliest) revoke_time
and set that revoke_time on the current one.
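
The selection rule above can be sketched roughly as follows. This is a
simplified Python model, not the qmt code: the Lqe class and
inherit_revoke_time() are hypothetical names, and it assumes that a
revoke_time of 0 means "not set".

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lqe:
    """Hypothetical stand-in for a quota entry (not the real struct)."""
    qunit: int
    revoke_time: int = 0  # assumption: 0 means "not set"


def inherit_revoke_time(cur: Lqe, others: List[Lqe], least_qunit: int) -> None:
    """If cur reached the least qunit, copy the earliest revoke_time already set."""
    if cur.qunit != least_qunit:
        return
    # Only entries whose revoke_time was actually set are candidates.
    times = [lqe.revoke_time for lqe in others if lqe.revoke_time]
    if times:
        cur.revoke_time = min(times)
```

An entry whose qunit is still above the least qunit is left untouched in
this model.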

This patch fixes the following case. For example, we have
8 OSTs and 4 MDTs (i.e. 12 slaves) and a pool with just one
OST. The global hard block limit for the user is 50M, and 10M
for this user in the pool. The user's usage is 0. As the global
pool has 12 slaves, its initial qunit value is 1M, i.e. equal to
the least qunit. At the same time, the initial qunit value for the
pool with one OST is 4M. When the user begins to write, the pool's
qunit is decreased to 1M, but lqe_revoke is not set: it
should be set only after sending the new qunit to OSTs in
qmt_lvbo_update. However, it won't be sent because the appropriate
lge_qunit in the lqe global array already has the same value.
This problem caused sanity-quota_72 to hang instead of failing
with EDQUOT in test_1_check_write.

HPE-bug-id: LUS-10516
Change-Id: I5878c1e719ae83a69ad5dbc3653717bb1b4de632
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45447
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months agoLU-10391 uapi: move out kernel only code. 44/44844/4
James Simmons [Fri, 3 Sep 2021 23:22:18 +0000 (19:22 -0400)]
LU-10391 uapi: move out kernel only code.

Userland doesn't use the new IPv6 function in the UAPI nidstr.h.
So move this to the kernel header lib-types.h. Normally this
wouldn't matter but the python wrappers break with the LUTF
project. With the move to Netlink in the future for the user land
API we shouldn't need these functions anyways.

Test-parameters: trivial
Change-Id: I2ac6d102d4575f639573253c21723d62ca08abc2
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44844
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months agoLU-12347 llite: do not take mod rpc slot for getxattr 51/44151/10
Vladimir Saveliev [Thu, 9 Sep 2021 12:05:24 +0000 (15:05 +0300)]
LU-12347 llite: do not take mod rpc slot for getxattr

The following scenario may lead to client eviction:
clientA                clientB                  MDS
threadA1: write to file F1, get
and hold DoM MDC LDLM lock L1:
   ->cl_io_loop()
    ->cl_io_lock()
     :
     ->mdc_lock_granted()
      ->lock->l_writers++
      [hold ref until write done]

threadA2-A8: create files F2-F8:
   ->ll_file_open()
    ->mdc_enqueue_base()
     ->ldlm_cli_enqueue()
      ->ptlrpc_get_mod_rpc_slot()
      ->ptlrpc_queue_wait()
      [hold RPC slot until create done]

                                              OST(s) in recovery.
                                              MDS waiting on OST(s) to
                                              precreate new objects.

threadA1:
   -> cl_io_start()
    -> __generic_file_aio_write()
     -> file_remove_suid()
      -> ll_xattr_cache_refill()
       -> mdc_xattr_common()
        -> ptlrpc_get_mod_rpc_slot()
        [blocked waiting for RPC slot]

                       threadB1: write file F1,
                       enqueue DoM MDC lock L1

                                              MDS sends blocking AST
                                              to clientA for lock L1

ldlm_threadA3: cannot cancel busy lock L1:
   -> ldlm_handle_bl_callback()
   ["Lock L1 referenced, will be cancelled later"]

                                              MDS evicts clientA for
                                              not cancelling lock L1

threadA1: never completes write:
  ->cl_io_end()
   ->cl_io_unlock()
    ->osc_lock_cancel()
     ->lock->l_writers--;

The fix is to add IT_GETXATTR to list of operations which do not
need mod rpc slot.

A test to illustrate the issue is added.

wait_for_function(): total sleep time (wait) is to be equal to max
when 1 is returned.

Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
HPE-bug-id: LUS-7271
Change-Id: I1b80677df084bda141b9ac58a78b765bd0b14a41
Reviewed-on: https://review.whamcloud.com/44151
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14428 libcfs: separate daemon_list from cfs_trace_data 93/41493/10
Mr NeilBrown [Wed, 10 Feb 2021 22:49:54 +0000 (09:49 +1100)]
LU-14428 libcfs: separate daemon_list from cfs_trace_data

cfs_trace_data provides a fifo for trace messages.  To minimize
locking, there is a separate fifo for each CPU, and even for different
interrupt levels per-cpu.

When a page is removed from the fifo to be written to a file, the page
is added to a "daemon_list".  Trace messages on the daemon_list have
already been logged to a file, but can be easily dumped to the console
when a bug occurs.

The daemon_list is always accessed from a single thread at a time, so
the per-CPU facilities for cfs_trace_data are not needed.  However
daemon_list is currently managed per-cpu as part of cfs_trace_data.

This patch moves the daemon_list of pages out to a separate structure
- a simple linked list, protected by cfs_tracefile_sem.

Rather than using a 'cfs_trace_page' to hold linkage information and
content size, we use page->lru for linkage and page->private for
the size of the content in each page.

This is a step towards replacing cfs_trace_data with the Linux
ring_buffer which provides similar functionality with even less
locking.

In the current code, if the daemon which writes trace data to a file
cannot keep up with load, excess pages are moved to the daemon_list
temporarily before being discarded.  With the patch, these pages are
simply discarded immediately.
If the daemon thread cannot keep up, that is a configuration problem
and temporarily preserving a few pages is unlikely to really help.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie894f7751cadacb515215f18182163ea5d26e969
Reviewed-on: https://review.whamcloud.com/41493
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15107 mdt: Exclusive create isn't replayed 54/45254/6
Andriy Skulysh [Wed, 16 Sep 2020 11:38:41 +0000 (14:38 +0300)]
LU-15107 mdt: Exclusive create isn't replayed

mdt_finish_open() fails the exclusive create check because
there is no information that the file was created.

Set DISP_OPEN_CREATE for exclusive open replay,
as we know that the original request has succeeded.

Change-Id: Idc71db76b757670b91b3bb1718870a018a805ed2
HPE-bug-id: LUS-9358
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/45254
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9680 net: Netlink improvements 58/44358/14
James Simmons [Wed, 27 Oct 2021 16:03:10 +0000 (12:03 -0400)]
LU-9680 net: Netlink improvements

With the expansion of the use of Netlink several issues have been
encountered. This patch fixes many of the issues. The issues are:

1) Fix idx handling in lnet_genl_parse_list() function. It needs
   to always be incremented. Apply some renaming suggested by Neil
   for enum lnet_nl_scalar_attrs. Add new LN_SCALAR_ATTR_INT_VALUE
   to allow pushing integers as well as strings from userspace.

2) Create struct genl_filter_list which will be used to create
   a list of items to pass back to userland. This will be a common
   setup.

3) A normal user can't read /sys/kernel/debug/lustre, which breaks
   lctl ***_params XXX since the first function called is
   llapi_param_get_paths(). Without the ability to read the
   debugfs tree glob() will fail. The solution is to use the
   kernel's glob function and just pass the requested string to
   the kernel.

4) For the external coordinator work you create a YAML parser
   that listens for kernel generated Netlink packets. This is
   a continuous stream rather than a one-time reply, which we don't
   handle correctly. We move the handling of the completion
   of a Netlink packet series completely into the function
   yaml_netlink_msg_complete(). In yaml_netlink_msg_parse() for
   the async case add "---" and in yaml_netlink_msg_complete()
   add "..." to define the beginning and end of a YAML document.

5) We have 3 types of setups. For kernel generated events it is
   possible to use just a YAML parser to listen for events. For
   the normal request -> reply setup we need both a YAML emitter
   and YAML parser. The last case is just sending commands to
   the kernel which only needs a YAML emitter. It is possible for
   that action to fail so we need to add handling for errors to
   the YAML emitter. We keep error handling for the YAML parser
   as well to handle the case of a standalone YAML parser listener.

6) Reworked the code that translates YAML to Netlink packets to
   send to the kernel so that the key and value of each pair are sent
   separately for the mapping case. This avoids dealing with
   complex string parsing in the kernel.

7) Error message handling was incorrect. struct nlmsgerr msg field
   is also the start of the nlattrs for the ext ack handling.
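
The document framing from item 4 can be illustrated with a small Python
sketch. It is not the liblnetconfig code: frame_yaml_document() is a
hypothetical helper that only shows how "---" and "..." delimit one YAML
document inside a continuous stream of events.

```python
from typing import Iterable, List


def frame_yaml_document(lines: Iterable[str]) -> List[str]:
    """Wrap one series of already-formatted YAML lines in document markers."""
    # "---" starts a YAML document and "..." ends it, so a listener can
    # tell where one event stops and the next one begins.
    return ["---", *lines, "..."]
```

A listener reading such a stream can hand each framed chunk to a YAML
parser as a complete document instead of waiting for the stream to end.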

Test-Parameters: trivial
Change-Id: Ic8eee8fd0020b7a63565de6ef69f2c74bf4bdcd8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44358
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14713 llite: mend the trunc_sem_up_write() 44/43844/11
Bobi Jam [Thu, 2 Sep 2021 14:24:50 +0000 (22:24 +0800)]
LU-14713 llite: mend the trunc_sem_up_write()

The original lli_trunc_sem replace change (commit e5914a61ac) fixed a
lock scenario:

t1 (page fault)          t2 (dio read)              t3 (truncate)
|- vm_mmap_pgoff()       |- vvp_io_read_start()     |- vvp_io_setattr
|- down_write(mmap_sem)  |- down_read(trunc_sem)            _start()
|- do_map()              |- ll_direct_IO_impl()
|- vvp_io_fault_start    |- ll_get_user_pages()

                                                    |- down_write(
                         |- down_read(mmap_sem)        trunc_sem)
|- down_read(trunc_sem)

t1 waits for the read semaphore of trunc_sem, which is hindered by t3,
since t3 is waiting for the write semaphore while t2 holds its read
semaphore, and t2 is waiting for mmap_sem which has been taken by t1,
so a deadlock ensues.

commit e5914a61ac changes the down_read(trunc_sem) to
trunc_sem_down_read_nowait() in page fault path, to make it ignore
that there is a down_write(trunc_sem) waiting, just takes the read
semaphore if no writer has taken the semaphore, and breaks the
deadlock.
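
The difference between the two read paths can be modelled in a few lines
of Python. This is only a sketch under stated assumptions: TruncSemModel
and its methods are hypothetical, and they return True or False instead
of blocking the way the real semaphore helpers do.

```python
class TruncSemModel:
    """Toy, non-blocking model of a read/write semaphore (hypothetical)."""

    def __init__(self) -> None:
        self.readers = 0
        self.writer_active = False
        self.writers_waiting = 0

    def try_down_read(self) -> bool:
        """Ordinary reader: backs off if a writer holds or is waiting."""
        if self.writer_active or self.writers_waiting:
            return False
        self.readers += 1
        return True

    def try_down_read_nowait(self) -> bool:
        """Page-fault reader: only an active writer stops it."""
        if self.writer_active:
            return False
        self.readers += 1
        return True
```

With t2 already holding the read side and t3 queued as a writer, the
ordinary read attempt fails in this model while the nowait variant
succeeds, which is what lets t1 make progress.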

But there is a subtlety in using wake_up_var(): wake_up_var()->
__wake_up_bit()->waitqueue_active() locklessly tests for waiters on the
queue, and if it is called without an explicit smp_mb() it is possible
for the waitqueue_active() check to get hoisted before the condition
store, such that we observe an empty wait list while the waiter does
not observe the condition, and the waiter is then never woken up.

Fixes: e5914a61ac ("LU-12460 llite: replace lli_trunc_sem")
Change-Id: Ifdda2c1c8a4171466be1723923c136e84de8ce0e
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14971 test: align mirror_io resync implementation 02/45202/3
Bobi Jam [Tue, 12 Oct 2021 12:02:01 +0000 (20:02 +0800)]
LU-14971 test: align mirror_io resync implementation

Align the mirror_io resync implementation with
llapi_mirror_resync_many().

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Icf11c4c2302f36fc0f9682e0a310058081e1214f
Reviewed-on: https://review.whamcloud.com/45202
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15142 lctl: fixes for set_param -P and llog_print 32/45332/5
Mikhail Pershin [Thu, 14 Oct 2021 14:16:21 +0000 (17:16 +0300)]
LU-15142 lctl: fixes for set_param -P and llog_print

- properly handle permanent param deletion
- don't print skipped parameters in llog_print output
- add --raw option to llog_print to output all entries
  including markers

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id93a206a255dc885343efa293e1ee2672493e5e5
Reviewed-on: https://review.whamcloud.com/45332
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15197 llite: Do not count tiny write twice 76/45476/3
Patrick Farrell [Sun, 7 Nov 2021 22:04:40 +0000 (17:04 -0500)]
LU-15197 llite: Do not count tiny write twice

We accidentally count bytes written with tiny write twice
in stats.  Remove the extra count.

This also has the positive effect of improving tiny write
performance by about 4% by removing an extra call to the
stats code (the main cost is ktime_get()).

Before, 8 byte dd:
13.9 MiB/s
After:
14.3 MiB/s

Test-parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ia11e7f16e3e3d0c4012f87cde817ad7b21128fa8
Reviewed-on: https://review.whamcloud.com/45476
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15152 tests: auster reports wrong testsuite status 43/45343/3
Chris Horn [Fri, 22 Oct 2021 01:51:40 +0000 (01:51 +0000)]
LU-15152 tests: auster reports wrong testsuite status

auster always reports testsuites returned 0 even when there
are failures.

Test-Parameters: trivial testlist=sanity-lnet env=ONLY=230,ONLY_REPEAT=20
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I7a8101d6cfd854d8419edf55c18a72e211f5e5c8
Reviewed-on: https://review.whamcloud.com/45343
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14662 lnet: set eth routes needed for multi rail 65/44065/15
Serguei Smirnov [Wed, 23 Jun 2021 22:51:21 +0000 (15:51 -0700)]
LU-14662 lnet: set eth routes needed for multi rail

When ksocklnd is initialized or new ethernet interfaces
are added via lnetctl, set the routing rules using a common
shell script ksocklnd-config. This ensures control over
source interface when sending traffic.

For example, for eth0 with ip 192.168.122.142/24:
   the output of "ip route show table eth0" should be
192.168.122.0/24 dev eth0 proto kernel scope link src 192.168.122.142

This step can be omitted by specifying
   options ksocklnd skip_mr_route_setup=1
in the conf file, or by using switch
   --skip-mr-route-setup
when adding NI with lnetctl. Note that the module parameter
takes priority over the lnetctl switch: if skip-mr-route-setup
is not specified when adding NI with lnetctl, the route still
won't get created if the conf file has skip_mr_route_setup=1.

The route also won't be created if any route already exists
for the given interface, assuming advanced users who manage
routing on their own will want to continue doing so.
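
The precedence rules above can be summarized in a short Python sketch.
should_create_route() and its parameter names are hypothetical; the
function only models the decision described in this message, not the
ksocklnd-config script itself.

```python
def should_create_route(module_skip: bool, cli_skip: bool,
                        existing_routes: int) -> bool:
    """Decide whether a routing rule is created for a new interface.

    module_skip     -- skip_mr_route_setup=1 is set in the conf file
    cli_skip        -- --skip-mr-route-setup was passed to lnetctl
    existing_routes -- routes already present for the interface
    """
    # The module parameter wins even when the lnetctl switch is absent.
    if module_skip or cli_skip:
        return False
    # Leave interfaces alone when routing is already managed by hand.
    return existing_routes == 0
```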

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia14e637bd29d4bbce5dd93daad9992336b2e6b15
Reviewed-on: https://review.whamcloud.com/44065
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14448 lod: verify LOV early in lod_get_default_striping 70/45370/3
Lai Siyao [Sat, 23 Oct 2021 14:44:36 +0000 (10:44 -0400)]
LU-14448 lod: verify LOV early in lod_get_default_striping

lod_get_default_striping() will get both default LOV and default LMV,
and parse them to struct lod_default_striping one by one, however the
LOV and LMV data are both stored in lod_thread_info.lti_ea_store, so
lod_verify_striping() should verify LOV upon getting LOV, otherwise
if both exist, it's LMV that's verified, which will return -EINVAL.

Fixes: 6a08df2d0effc7a ("LU-14448 lod: verify LOV before set/inherit")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I9763d35bdbc74101fa8515d5096ec457a4cb3524
Reviewed-on: https://review.whamcloud.com/45370
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-9704 grant: ignore grant info on read resend 71/45371/5
Vladimir Saveliev [Wed, 3 Nov 2021 10:52:14 +0000 (13:52 +0300)]
LU-9704 grant: ignore grant info on read resend

The following scenario causes a message like "claims 28672 GRANT, real
grant 0" to appear:

 1. client owns X grants and runs rpcs to shrink part of those
 2. server fails over so that the shrink rpc is to be resent.
 3. on client reconnect, server and client sync on the initial amount
 of grants for the client.
 4. shrink rpc is resent; if server disk space is enough, shrink does
 not happen and the client adds the amount of grants it was going to
 shrink to its new initial amount of grants. Now the client thinks that
 it owns more grants than it does from the server's point of view.
 5. the client consumes grants and sends rpcs to server. Server avoids
 allocating new grants for the client if the current amount of grant
 is big enough:
static long tgt_grant_alloc(struct obd_export *exp, u64 curgrant,
...
        if (curgrant >= want || curgrant >= ted->ted_grant + chunk)
                RETURN(0);
 6. client continues consuming grants, which eventually leads to
 complaints like "claims 28672 GRANT, real grant 0".

When read and set_info:shrink RPCs are resent, grant info should
be ignored as it was reset on reconnect.
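
The early return quoted in step 5 can be mirrored in Python to show why
an inflated client-side value keeps the server from granting more. The
function name is hypothetical, and the numbers in the usage note other
than 28672 are made up for illustration.

```python
def grant_alloc_skipped(curgrant: int, want: int,
                        ted_grant: int, chunk: int) -> bool:
    """Mirror of the quoted check: True means no new grant is allocated."""
    return curgrant >= want or curgrant >= ted_grant + chunk
```

For example, grant_alloc_skipped(28672, 4096, 0, 8192) is true: the
client claims 28672 while the server's own count is 0, so no new grant
is handed out and the two views never converge.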

A test to illustrate the issue is added.

HPE-bug-id: LUS-7666
Change-Id: I8af1db287dc61c713e5439f4cf6bd652ce02c12c
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45371
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-930 doc: update lfs migrate usage and man page 78/43378/11
Andreas Dilger [Mon, 19 Apr 2021 21:32:31 +0000 (15:32 -0600)]
LU-930 doc: update lfs migrate usage and man page

Update the usage and man page for "lfs migrate -m", noting that
this command will recursively migrate an entire directory tree.

It is not currently possible to migrate files with DoM components
between MDTs, so provide an example of how to work around this.

Only print the command-line options for commands in the usage
message instead of the full usage, since it is otherwise much
too verbose to see the actual error message being printed.  The
user should read the lfs-migrate.1 man page for full usage.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9e34a33fcc3f0e2b90bc499fb7b946c53e6111d1
Reviewed-on: https://review.whamcloud.com/43378
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15121 llite: skip request slot for lmv_revalidate_slaves() 75/45275/2
Andriy Skulysh [Fri, 30 Aug 2019 11:43:29 +0000 (14:43 +0300)]
LU-15121 llite: skip request slot for lmv_revalidate_slaves()

Some syscalls need lmv_revalidate_slaves(). It requires a
second lock enqueue, and it can be blocked by a
lack of RPC slots.

Don't acquire an RPC slot for the second lock enqueue.

Change-Id: Ida23c648c2bd169c4d238543731796232aa490dc
HPE-bug-id: LUS-8416
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/45275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15141 quota: optimize capability check for root squash 22/45322/2
Sebastien Buisson [Thu, 21 Oct 2021 06:56:44 +0000 (08:56 +0200)]
LU-15141 quota: optimize capability check for root squash

On client side, checking for owner/group quota can be directly
bypassed if this is for root and there is no root squash.

Change-Id: If29eca428d8748df412a717615e4d0a4886ddd04
Fixes: a4fbe7341b ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/45322
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12678 ptlrpc: remove bogus LASSERT 21/45421/3
Andreas Dilger [Sat, 30 Oct 2021 00:40:40 +0000 (18:40 -0600)]
LU-12678 ptlrpc: remove bogus LASSERT

In the error case, it isn't possible for rc to be both -ENOMEM and
0 at the same time, so remove the incorrect LASSERT(rc == 0) to
avoid crashing the system on an allocation failure.

Improve error messages to conform to code style.

Fixes: ceeeae4271fd ("LU-12678 lnet: me: discard struct lnet_handle_me")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I61ac5d735d7b2658dae76213a2d40cbfd2bb8bb9
Reviewed-on: https://review.whamcloud.com/45421
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11667 tests: Fix sanity test 317 for 64K PAGE_SIZE OST 95/45395/6
Xinliang Liu [Thu, 28 Oct 2021 09:48:38 +0000 (09:48 +0000)]
LU-11667 tests: Fix sanity test 317 for 64K PAGE_SIZE OST

When creating a file, blocks are allocated aligned to PAGE_SIZE,
see function osd_ldiskfs_map_inode_pages(). E.g. on a 64K PAGE_SIZE
Arm64 OST server, creating a file smaller than 64K
actually allocates 128 blocks of 512 bytes each.

The test needs to be adjusted for a 64K PAGE_SIZE OST server.
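
The block count above follows from simple arithmetic, sketched here in
Python. allocated_512b_blocks() is a hypothetical helper that assumes
allocation is rounded up to whole pages and that an empty file uses no
blocks.

```python
def allocated_512b_blocks(file_size: int, page_size: int) -> int:
    """512-byte blocks used when allocation is rounded up to whole pages."""
    if file_size <= 0:
        return 0
    pages = -(-file_size // page_size)  # ceiling division
    return pages * page_size // 512
```

With a 64K page a one-byte file occupies 65536 / 512 = 128 blocks, while
with a 4K page it occupies 8.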

Test-Parameters: trivial
Test-Parameters: clientarch=aarch64 fstype=ldiskfs testlist=sanity \
env=PTLDEBUG=-1,ONLY=317
Test-Parameters: fstype=ldiskfs testlist=sanity \
env=PTLDEBUG=-1,ONLY=317

Change-Id: Iada701f4f424093e847fc70aa843873b75fe6b06
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/45395
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15156 kernel: back port patch for rwsem issue 83/45383/3
Yang Sheng [Tue, 26 Oct 2021 08:09:20 +0000 (16:09 +0800)]
LU-15156 kernel: back port patch for rwsem issue

RHEL7 included a defect in rwsem. It can cause a
thread to hang indefinitely while waiting on an rwsem. Backport
commit: 5c1ec49b60cdb31e51010f8a647f3189b774bddf
to fix this issue.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ic5c469ce744ad5882c13163a9bfe14faef8fd446
Reviewed-on: https://review.whamcloud.com/45383
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15168 osd: use large allocation for idc cache 82/45382/2
Alex Zhuravlev [Wed, 27 Oct 2021 05:48:03 +0000 (08:48 +0300)]
LU-15168 osd: use large allocation for idc cache

as in some cases (e.g. ofd precreate) the cache can grow to dozens
of kilobytes (sizeof(struct idc_map_cache)=40 * 1024).
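
As a rough size check in Python, reading the expression above as 40
bytes per entry times 1024 entries (an interpretation, not something the
message spells out) and assuming 4 KiB pages:

```python
def cache_bytes(entry_size: int = 40, entries: int = 1024) -> int:
    """Total bytes for the cache under the assumed entry size and count."""
    return entry_size * entries


def pages_needed(nbytes: int, page_size: int = 4096) -> int:
    """Whole pages needed to hold nbytes (ceiling division)."""
    return -(-nbytes // page_size)
```

Under those assumptions the cache is 40960 bytes, i.e. ten contiguous
4 KiB pages, which is the kind of request where a large-allocation
helper is usually preferred.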

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Id9e0996a7a1d07065f4a50c1d5be5051e756559a
Reviewed-on: https://review.whamcloud.com/45382
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
10 months agoLU-15154 kernel: kernel update SLES15 SP3 [5.3.18-59.27.1] 49/45349/4
Jian Yu [Mon, 25 Oct 2021 17:33:03 +0000 (10:33 -0700)]
LU-15154 kernel: kernel update SLES15 SP3 [5.3.18-59.27.1]

Update SLES15 SP3 kernel to 5.3.18-59.27.1 for Lustre client.

Test-Parameters: trivial

Change-Id: Ie3c369a8e93a75b4afbde55489bd3819bb39e1de
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45349
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15138 lnet: Fail peer add for existing gw peer 37/45337/5
Chris Horn [Fri, 22 Oct 2021 00:13:19 +0000 (00:13 +0000)]
LU-15138 lnet: Fail peer add for existing gw peer

If there's an existing peer entry for a peer that is being added
via CLI, and that existing peer was not created via the CLI, then
DLC will attempt to delete the existing peer before creating a new
one. The exit status of the peer deletion was not being checked.
This results in the ability to add duplicate peers for gateways,
because gateways cannot be deleted via lnetctl unless the routes
for that gateway have been removed first.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I9b7864a2bd21540336f72d96e180c89bd0aae8dc
Reviewed-on: https://review.whamcloud.com/45337
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15086 ptlrpc: fix timeout after spurious wakeup 08/45308/5
Alex Zhuravlev [Wed, 20 Oct 2021 11:10:57 +0000 (14:10 +0300)]
LU-15086 ptlrpc: fix timeout after spurious wakeup

so that the final timeout doesn't exceed the requested one
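
The general technique is to fix a deadline once and wait only for the
time still remaining after each wakeup. A minimal Python sketch, not the
ptlrpc code; wait_until() and its callback parameters are hypothetical:

```python
from typing import Callable


def wait_until(wait_once: Callable[[float], bool],
               timeout: float,
               clock: Callable[[], float]) -> bool:
    """Wait for a condition without letting spurious wakeups extend the total.

    wait_once(t) sleeps for at most t seconds and returns True if the
    condition became true; clock() returns a monotonic time in seconds.
    """
    deadline = clock() + timeout
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # the overall budget is used up
        if wait_once(remaining):
            return True
```

In real use clock could be time.monotonic. Restarting with the full
timeout after every wakeup would instead let the total wait grow without
bound.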

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iff5e08c589cbbc3c483915002f3f9df7a6f2678a
Reviewed-on: https://review.whamcloud.com/45308
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15122 osd-ldiskfs: Fix ASSERTION( iobuf->dr_rw == 0 ) with 64KB PAGE_SIZE 88/45288/7
Xinliang Liu [Tue, 19 Oct 2021 08:15:59 +0000 (08:15 +0000)]
LU-15122 osd-ldiskfs: Fix ASSERTION( iobuf->dr_rw == 0 ) with 64KB PAGE_SIZE

During a write, if a page cannot be mapped to blocks
all at once, it causes an "ASSERTION( iobuf->dr_rw == 0 )" crash
due to an overflowing access of the mapped blocks array.

This happens easily on Arm platforms with 64KB PAGE_SIZE,
and does not happen on x86_64 platforms with 4KB PAGE_SIZE.
That is because for a 4KB block size, if PAGE_SIZE is 4KB, then i == 0
and blocks_left_page == 1, which makes the inner loop handle
one block each time. Thus the outer loop condition "block_idx <
block_idx_end;" ensures that blocks[] array access does not overflow.

Check the actual mapped count so that mapped blocks array
access will not overflow.
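
The page-size dependence can be seen with a little arithmetic, sketched
in Python. Both helpers are hypothetical and do not reproduce the
osd-ldiskfs loop; they only show the blocks-per-page ratio and the idea
of bounding access by the count actually mapped.

```python
from typing import List


def blocks_per_page(page_size: int, block_size: int = 4096) -> int:
    """Number of filesystem blocks that make up one page."""
    return page_size // block_size


def take_mapped(blocks: List[int], mapped: int, wanted: int) -> List[int]:
    """Return at most `mapped` entries even if the page wants more."""
    return blocks[:min(mapped, wanted)]
```

With 4KB pages each page is one block, so a per-block bound is enough;
with 64KB pages a page spans 16 blocks and a partially mapped page must
be cut off at the mapped count.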

Fixes: 0271b17b80a8 ("LU-14134 osd-ldiskfs: reduce credits for new writing")
Change-Id: Icd46c04bea2d7930456840694d422758eebb4186
Signed-off-by: Xinliang Liu <xinliang.liu@linaro.org>
Reviewed-on: https://review.whamcloud.com/45288
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15102 lnet: Reset ni_ping_count only on receive 35/45235/3
Chris Horn [Wed, 13 Oct 2021 23:30:01 +0000 (18:30 -0500)]
LU-15102 lnet: Reset ni_ping_count only on receive

The lnet_ni:ni_ping_count is currently reset on every (healthy) tx.
We should only reset it when receiving a message over the NI. Taking
net_lock 0 on every tx results in a performance loss for certain
workloads.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: 8fdf2bc62a ("LU-13569 lnet: Recover local NI w/exponential backoff interval")
HPE-bug-id: LUS-10427
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I67ea3aa977cb5d67b04f7957120c29e9985c83e6
Reviewed-on: https://review.whamcloud.com/45235
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15092 o2iblnd: Fix logic for unaligned transfer 16/45216/4
Chris Horn [Thu, 16 Sep 2021 17:12:38 +0000 (12:12 -0500)]
LU-15092 o2iblnd: Fix logic for unaligned transfer

It's possible for there to be an offset for the first page of a
transfer. However, there are two bugs with this code in o2iblnd.

The first is that this use-case will require LNET_MAX_IOV + 1 local
RDMA fragments, but we do not specify the correct corresponding values
for the max page list to ib_alloc_fast_reg_page_list(),
ib_alloc_fast_reg_mr(), etc.

The second issue is that the logic in kiblnd_setup_rd_kiov() attempts
to obtain one more scatterlist entry than is actually needed. This
causes the transfer to fail with -EFAULT.
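
The fragment arithmetic behind the first issue can be sketched as follows. This is an illustration only: `SKETCH_PAGE_SIZE`, `SKETCH_MAX_IOV`, and `sketch_frags_needed()` are hypothetical, and the values 4096 and 256 are assumptions rather than values taken from the Lustre headers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; not taken from the Lustre headers. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_MAX_IOV   256u /* pages in an aligned maximum-size transfer */

/*
 * Number of pages (and so local RDMA fragments) touched by a transfer of
 * len bytes that starts first_page_offset bytes into its first page.
 */
static unsigned int sketch_frags_needed(size_t first_page_offset, size_t len)
{
	assert(first_page_offset < SKETCH_PAGE_SIZE);
	if (len == 0)
		return 0;
	return (unsigned int)((first_page_offset + len + SKETCH_PAGE_SIZE - 1) /
			      SKETCH_PAGE_SIZE);
}
```

Under these assumptions, a transfer of `SKETCH_MAX_IOV * SKETCH_PAGE_SIZE` bytes needs `SKETCH_MAX_IOV` fragments when it starts on a page boundary and `SKETCH_MAX_IOV + 1` when the first page has a nonzero offset, which is why the page-list limits need room for one extra entry.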

Test-Parameters: trivial
HPE-bug-id: LUS-10407
Fixes: d226464aca ("LU-8057 ko2iblnd: Replace sg++ with sg = sg_next(sg)")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifb843f11ae34a99b7d8f93d94966e3dfa1ce90e5
Reviewed-on: https://review.whamcloud.com/45216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15094 o2iblnd: map_on_demand not needed for frag interop 15/45215/2
Chris Horn [Wed, 29 Sep 2021 17:42:26 +0000 (12:42 -0500)]
LU-15094 o2iblnd: map_on_demand not needed for frag interop

The map_on_demand tunable is not used for setting max frags so don't
require that it be set in order to negotiate max frags.

HPE-bug-id: LUS-10488
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie89f1f035f4b05244feffb848c14582a8c7cf0e6
Reviewed-on: https://review.whamcloud.com/45215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15049 quota: fix a panic with pool number > 16 05/45105/3
Sergey Cheremencev [Thu, 17 Jun 2021 10:45:42 +0000 (13:45 +0300)]
LU-15049 quota: fix a panic with pool number > 16

Fix a panic that may occur when there are more than 16
pools in a system:
qti_pools_add()) ASSERTION( qti->qti_pools_num >= QMT_MAX_POOL_NUM ) failed: Forgot init? ffff91a5f9625800

HPE-bug-id: LUS-10116
Change-Id: I4f73b74d2fd3e85a51cf3c30e2eec29645f164be
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45105
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-15048 quota: check that qti_lqes has been inited 02/45102/3
Sergey Cheremencev [Thu, 22 Jul 2021 10:56:24 +0000 (13:56 +0300)]
LU-15048 quota: check that qti_lqes has been inited

qti_lqes_restore_init/fini should check that qti_lqes
has been initialized before accessing qti_lqes_count.

This fix prevents the following panic:
qti_lqes_restore_fini()) ASSERTION( qmt_info(env)->qti_lqes_rstr ) failed:

HPE-bug-id: LUS-10239
Change-Id: Ic93d87535f615fe419b2c3a2453506c515837031
Reviewed-on: https://es-gerrit.dev.cray.com/159116
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45102
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-14713 llite: tighten condition for fault not drop mmap_sem 15/44715/7
Bobi Jam [Thu, 2 Sep 2021 16:38:43 +0000 (00:38 +0800)]
LU-14713 llite: tighten condition for fault not drop mmap_sem

As __lock_page_or_retry() indicates, filemap_fault() will return
VM_FAULT_RETRY without releasing mmap_sem iff flags contains
FAULT_FLAG_ALLOW_RETRY and FAULT_FLAG_RETRY_NOWAIT.

So ll_fault0() should pass in FAULT_FLAG_ALLOW_RETRY |
FAULT_FLAG_RETRY_NOWAIT in ll_filemap_fault() so that when it
returns VM_FAULT_RETRY, we can go on to try a normal fault
under the DLM lock, as mmap_sem is still held.

Since Linux 5.1 (6b4c9f4469819), however, FAULT_FLAG_RETRY_NOWAIT alone
is enough to keep mmap_sem held when locking the page fails.
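
The condition described above can be sketched as a small predicate. This follows the commit message's description only, not the kernel source; the `SKETCH_*` constants and `sketch_mmap_sem_kept()` are hypothetical, and the flag values are placeholders rather than the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration; see the kernel headers for the real ones. */
#define SKETCH_FAULT_FLAG_ALLOW_RETRY  0x04u
#define SKETCH_FAULT_FLAG_RETRY_NOWAIT 0x08u

/*
 * Returns true when, per the description above, a VM_FAULT_RETRY result
 * leaves mmap_sem held.
 */
static bool sketch_mmap_sem_kept(unsigned int flags, bool linux_5_1_or_later)
{
	bool allow_retry = (flags & SKETCH_FAULT_FLAG_ALLOW_RETRY) != 0;
	bool retry_nowait = (flags & SKETCH_FAULT_FLAG_RETRY_NOWAIT) != 0;

	if (linux_5_1_or_later)
		return retry_nowait; /* RETRY_NOWAIT alone is enough */
	return allow_retry && retry_nowait; /* older kernels need both flags */
}

/* Usage example: passing both flags keeps mmap_sem held on either kernel. */
static void sketch_self_check(void)
{
	unsigned int both = SKETCH_FAULT_FLAG_ALLOW_RETRY |
			    SKETCH_FAULT_FLAG_RETRY_NOWAIT;

	assert(sketch_mmap_sem_kept(both, false));
	assert(sketch_mmap_sem_kept(both, true));
}
```

Passing both flags satisfies the predicate on old and new kernels alike, which matches the flag combination the commit message says ll_fault0() should use.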

Fixes: 87865e4ae9 ("LU-13182 llite: Avoid eternel retry loops with MAP_POPULATE")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9420c587301722b597155558657577349a8141e4
Reviewed-on: https://review.whamcloud.com/44715
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>