Whamcloud - gitweb
fs/lustre-release.git
19 months ago LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock)) b2_12-next
Bobi Jam [Fri, 4 Jun 2021 03:58:29 +0000 (11:58 +0800)]
LU-14780 llite: failed ASSERTION(ldlm_has_layout(lock))

When setting the layout in a layout lock, the lock could lose its
layout bits; in that case, try to fetch the layout lock again.

Lustre-change: https://review.whamcloud.com/44054
Lustre-commit: 1b166d6dd6a2f39dfe35b60be169b288665d0283

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I10f96e4cb03cfe228d3c1ea1500b1a8d8e4e5e54
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
19 months ago LU-14838 tests: skip sanityn/32a if no truncate_lock
Andreas Dilger [Tue, 7 Jun 2022 20:52:52 +0000 (14:52 -0600)]
LU-14838 tests: skip sanityn/32a if no truncate_lock

Servers no longer support truncate_lock as of 2.14.53.
Skip sanityn.sh test_32a if this feature is not available.

Test-Parameters: trivial testlist=sanityn env=ONLY=32a
Test-Parameters: clientversion=2.14 testlist=sanityn env=ONLY=32a
Fixes: 6335dba839 ("LU-14838 osc: Remove lockless truncate")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibe37b59eff2b11a1b5e6ddd7a5c0ba6dae9993f5

19 months ago LU-14644 vvp: wait for nrpages to be updated
Vitaly Fertman [Tue, 27 Apr 2021 18:43:06 +0000 (21:43 +0300)]
LU-14644 vvp: wait for nrpages to be updated

truncate_inode_pages() says a page may still be in the process of
deletion upon return. Wait for the other thread that is running
__delete_from_page_cache() to update nrpages.

Lustre-change: https://review.whamcloud.com/43464
Lustre-commit: 7d5d004506650c3739898e70d72c9a86b8aeeb88

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I165b3d0866efaf2eb7e977520ebba4ee831874ab
HPE-bug-id: LUS-8842

19 months ago LU-15137 socklnd: expect two control connections maximum
Serguei Smirnov [Thu, 4 Nov 2021 18:35:43 +0000 (11:35 -0700)]
LU-15137 socklnd: expect two control connections maximum

As a result of connecting to ourselves, e.g. pinging own nid,
two control type connections are established vs. just one
in case of connecting externally.
Fix the control connection counter to be able to handle that.

Lustre-change: https://review.whamcloud.com/45461
Lustre-commit: ee9a03d8308c5918a17e2e45fd59ee5a4c38acaf

Test-Parameters: trivial testlist=sanity-lnet
Fixes: e8842e86 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Idce01d81e3924226b5b163d2472cbcd4f6eb5819
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46034
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
19 months ago LU-15137 socklnd: decrement connection counters on close
Serguei Smirnov [Sat, 30 Oct 2021 18:39:26 +0000 (11:39 -0700)]
LU-15137 socklnd: decrement connection counters on close

To gracefully handle potential race with delayed connection create,
decrement connection counters per type as connections are being
closed.
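
The bookkeeping described above can be sketched as a toy model. This is not the ksocklnd code; the class, method names, and type strings are hypothetical and only illustrate decrementing a per-type counter when a connection closes.

```python
from collections import Counter


class PeerConnCounters:
    """Toy model of per-type connection counters (names are hypothetical)."""

    def __init__(self):
        self.counts = Counter()

    def on_create(self, conn_type):
        self.counts[conn_type] += 1

    def on_close(self, conn_type):
        # Decrement when a connection closes so that a late, delayed
        # create does not see a stale count for this type.
        if self.counts[conn_type] > 0:
            self.counts[conn_type] -= 1

    def count(self, conn_type):
        return self.counts[conn_type]
```

Guarding the decrement keeps the counter from going negative if a close is reported twice in this model.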

Lustre-change: https://review.whamcloud.com/45422
Lustre-commit: 7e26413aa85fdc931721cde36bae3bf2bb97e63f

Test-Parameters: trivial testlist=sanity-lnet
Fixes: e8842e86 ("LU-12815 socklnd: add conns_per_peer parameter")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ieb3b44701e4999ea1fe63234162dd5878d65958a
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46035
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
19 months ago LU-12815 socklnd: add conns_per_peer parameter
Serguei Smirnov [Thu, 4 Feb 2021 01:35:00 +0000 (20:35 -0500)]
LU-12815 socklnd: add conns_per_peer parameter

Introduce conns_per_peer ksocklnd module parameter.
In typed mode, this parameter shall control
the number of BULK_IN and BULK_OUT tcp connections,
while the number of CONTROL connections shall stay
at 1. In untyped mode, this parameter shall control
the number of untyped connections.
The default conns_per_peer is 1. Max is 127.
Performance scaling on 100GbE:

 conns_per_peer     speed
        1        1.7GiB/s
        2        3.3GiB/s
        4        6.4GiB/s
        8       11.5GiB/s
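
As a rough reading of the table above, the speedup over a single connection and the per-connection efficiency can be worked out as follows. This is only arithmetic on the figures quoted in the commit message; the function and variable names are illustrative.

```python
# Throughput figures (GiB/s) copied from the table in the commit message.
measured = {1: 1.7, 2: 3.3, 4: 6.4, 8: 11.5}


def scaling(measured):
    """Return {conns: (speedup, efficiency)} relative to conns_per_peer=1."""
    base = measured[1]
    result = {}
    for conns, speed in sorted(measured.items()):
        speedup = speed / base
        # Efficiency is the speedup divided by the number of connections.
        result[conns] = (round(speedup, 2), round(speedup / conns, 2))
    return result


if __name__ == "__main__":
    for conns, (speedup, eff) in scaling(measured).items():
        print(f"conns_per_peer={conns}: speedup {speedup}x, efficiency {eff}")
```

On these numbers the efficiency falls from about 0.97 at two connections to about 0.85 at eight, so the scaling is close to linear but not quite.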

Lustre-change: https://review.whamcloud.com/41056
Lustre-commit: 71b2476e4ddb95aa42f4a0ea3f23b1826017bfa5

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I1f4ef22141882224e14e18c2526554dcfa69c871
Reviewed-on: https://review.whamcloud.com/41411
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
19 months ago LU-15645 obdclass: llog to handle gaps
Alex Zhuravlev [Wed, 16 Mar 2022 09:10:38 +0000 (12:10 +0300)]
LU-15645 obdclass: llog to handle gaps

Due to old errors, an update llog can contain gaps in its index.
This shouldn't block llog processing and recovery. Actual gaps in
the transaction sequence should be caught by VBR.
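
The difference between aborting on a gap and tolerating it can be sketched with a small model. It is an assumption-laden illustration, not the llog code: records are plain (index, payload) pairs and the `strict` flag is hypothetical.

```python
def process_records(records, strict=False):
    """Process (index, payload) pairs in order.

    With strict=True an index gap aborts processing; with strict=False
    the gap is noted and processing continues.
    """
    processed, gaps = [], []
    expected = None
    for index, payload in records:
        if expected is not None and index != expected:
            if strict:
                raise ValueError(f"index {index} but expected {expected}")
            gaps.append((expected, index))
        processed.append(payload)
        expected = index + 1
    return processed, gaps
```

In the tolerant mode the caller still learns where the gaps were, so a higher layer can decide whether they matter.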

Lustre-change: https://review.whamcloud.com/46837
Lustre-commit: TBD (from b3de0d57bd0f7cd2e918aa9d3f08be1c69697b80)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I11ec817e356f9658118c34706ef3a533e7faba83
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
19 months ago LU-13195 osp: osp_send_update_req() should check generation
Alex Zhuravlev [Mon, 27 Sep 2021 13:28:50 +0000 (16:28 +0300)]
LU-13195 osp: osp_send_update_req() should check generation

and not send requests that depend on one that has just failed

Lustre-change: https://review.whamcloud.com/45042
Lustre-commit: dff1e0d21c8c6bb20d63669252190795198bc49f

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I27a2b21130e33287168204ad829c0a53002b517e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
19 months ago LU-12577 llog: protect partial updates from readers
Alex Zhuravlev [Sun, 9 May 2021 06:32:55 +0000 (09:32 +0300)]
LU-12577 llog: protect partial updates from readers

llog_osd_write_rec() adds a record in a few steps: the header is
updated first, then the record itself is appended. A per-loghandle
semaphore is used, but remote readers allocate a new, separate
loghandle for every access (header reading, blocks), so the
readers can't use the loghandle's semaphore to avoid accessing partial
updates. Use object-based locking [censored] to serialize the writer
vs the readers.
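
A minimal sketch of why the lock has to live on the shared object rather than on a per-access handle. The `Log` class is hypothetical, and a plain mutex stands in for whatever object-based lock the patch actually uses.

```python
import threading


class Log:
    """Toy log whose header (count) and records are updated in two steps."""

    def __init__(self):
        self.lock = threading.Lock()   # stands in for the object-based lock
        self.count = 0                 # "header"
        self.records = []              # "body"

    def write_rec(self, rec):
        with self.lock:
            self.count += 1            # step 1: header updated first
            self.records.append(rec)   # step 2: record appended

    def read_snapshot(self):
        # Readers take the same lock, so they never see step 1 without step 2.
        with self.lock:
            return self.count, list(self.records)
```

Because every reader and the writer share the one lock on the object, it does not matter how many separate handles the readers create.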

Lustre-change: https://review.whamcloud.com/43589
Lustre-commit: ae1404feefc1572fdafed938a3fc18131d675678

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie4e4d4a1e9a6fcdea9fcca7d80b0da920e786424
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
19 months ago LU-11861 obdclass: fix build with debug kernel
Alexey Lyashkov [Tue, 15 Jan 2019 12:24:42 +0000 (15:24 +0300)]
LU-11861 obdclass: fix build with debug kernel

Move declaration before usage.

Lustre-change: https://review.whamcloud.com/34030
Lustre-commit: 2dc87bb143e998e25585673b5f0ba7e2f317475e

Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I9a6c451bb5454b1542f0b06041f6938702e20b36
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
19 months ago LU-12739 lnet: Don't queue msg when discovery has completed
Chris Horn [Mon, 9 Sep 2019 17:54:08 +0000 (12:54 -0500)]
LU-12739 lnet: Don't queue msg when discovery has completed

In lnet_initiate_peer_discovery(), it is possible for the peer object
to change after the call to lnet_discover_peer_locked(), and it is
also possible for the peer to complete discovery between the first
call to lnet_peer_is_uptodate() and our placing the lnet_msg onto
the peer's lp_dc_pendq. After the call to lnet_discover_peer_locked()
check whether the, potentially new, peer object is up to date while
holding the lp_lock. If the peer is up to date, then we needn't
queue the message. Otherwise, we continue to hold the lock to place
the message on the peer's lp_dc_pendq.
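
The check-then-queue pattern described above can be sketched as follows. The `Peer` class and its methods are hypothetical stand-ins, not the LNet structures; the point is only that the up-to-date check and the enqueue happen under the same lock.

```python
import threading


class Peer:
    """Toy peer with a pending queue guarded by a lock (hypothetical names)."""

    def __init__(self):
        self.lock = threading.Lock()
        self.uptodate = False
        self.pending = []

    def queue_unless_uptodate(self, msg):
        """Return True if msg was queued, False if it can be sent right away."""
        with self.lock:
            # Re-check under the lock: discovery may have completed since
            # the caller's earlier, unlocked check.
            if self.uptodate:
                return False
            self.pending.append(msg)
            return True

    def complete_discovery(self):
        """Mark the peer up to date and hand back the queued messages."""
        with self.lock:
            self.uptodate = True
            drained, self.pending = self.pending, []
            return drained
```

Without the re-check, a message queued just after discovery completed would sit on the pending list with nothing left to drain it.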

Lustre-change: https://review.whamcloud.com/36139
Lustre-commit: 4ef62976448d6821df9aab3e720fd8d9d0bdefce

Test-Parameters: trivial testlist=sanity-lnet
Cray-bug-id: LUS-7596
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib3da7447588479bb35afcc3fe176b9120d915a89

19 months ago LU-xxxx - disable DOM in racer
Oleg Drokin [Tue, 23 Oct 2018 05:51:18 +0000 (01:51 -0400)]
LU-xxxx - disable DOM in racer

this is another source of timeouts?

22 months ago New release 2.12.9 2.12.9 v2_12_9
Oleg Drokin [Fri, 17 Jun 2022 18:13:35 +0000 (14:13 -0400)]
New release 2.12.9

Change-Id: I099e525b0053ec5ecdd02b231be5bfa146ade633
Signed-off-by: Oleg Drokin <green@whamcloud.com>
22 months ago New RC 2.12.9-RC1 2.12.9-RC1 v2_12_9-RC1
Oleg Drokin [Thu, 2 Jun 2022 13:02:58 +0000 (09:02 -0400)]
New RC 2.12.9-RC1

Change-Id: I63b2b223a57d26da40427502b639fe51f4f6e9d5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-14181 tests: except sanity test_64e 64f with SHARED_KEY 99/40999/2
Sebastien Buisson [Thu, 10 Dec 2020 08:37:43 +0000 (09:37 +0100)]
LU-14181 tests: except sanity test_64e 64f with SHARED_KEY

Add sanity test_64e and test_64f to ALWAYS_EXCEPT when
SHARED_KEY is used.

Lustre-change: https://review.whamcloud.com/40865
Lustre-commit: aa3bdbc23bc86bae565e78b38946f4ac8fcbeacb

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iaa9f5038a59f9ddc50dd9ac81ca81effd8bb9b1b
Reviewed-on: https://review.whamcloud.com/40999
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
22 months ago LU-14658 tests: fix conf-sanity 122b test 73/46873/2
Alexander Boyko [Mon, 13 Dec 2021 20:00:48 +0000 (15:00 -0500)]
LU-14658 tests: fix conf-sanity 122b test

Sometimes the test 122b failed with:
dd: failed to open '/mnt/lustre/d122b.conf-sanity/f122b.conf-sanity':
Numerical result out of range

ZFS readonly simulation produces OS_STATFS_READONLY flag.
It leads to zero stripe_count at lod_get_stripe_count(), and
lod_qos_prep_create() returns -34(ERANGE).

The patch fixes it by creating the file before replay_barrier.

Lustre-change: https://review.whamcloud.com/46864
Lustre-commit: 853d5e4a25f393033b132659d24b7aad6916e3b8

Test-Parameters: trivial fstype=zfs env=ONLY=122b,ONLY_REPEAT=4 testlist=conf-sanity
Fixes: 747fed818be5 ("LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I7ec04ffe09d0038bcf99e1a571f14d2bfb6a5df5
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46873
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-15854 tests: fix version check for sanity test_64 45/47345/3
Aurelien Degremont [Fri, 13 May 2022 12:42:38 +0000 (12:42 +0000)]
LU-15854 tests: fix version check for sanity test_64

Add missing or proper server version check for interop
testing for sanity test 64i and 64h.

Lustre-change: https://review.whamcloud.com/47343/
Lustre-commit: TBD (63832046a5c78a2425f1f07e2ec3f7beb9b0561e)

Test-Parameters: trivial testlist=sanity env=ONLY=64
Fixes: 38c78ac ("LU-9704 grant: ignore grant info on read resend")
Fixes: 4894683 ("LU-14124 target: set OBD_MD_FLGRANT in read's reply")
Change-Id: Iec21a407f467db3e9cb197d0a1436ea4e821bef2
Signed-off-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/47345
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-15795 kernel: new kernel [RHEL 8.6 4.18.0-372.9.1.el8] 02/47302/3
Jian Yu [Wed, 18 May 2022 02:31:14 +0000 (19:31 -0700)]
LU-15795 kernel: new kernel [RHEL 8.6 4.18.0-372.9.1.el8]

This patch makes changes to support new RHEL 8.6 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.6

Change-Id: Id738259ed94104c3a3c7bb5c1b853cfabad49405
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/47302
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
22 months ago LU-15093 libcfs: Check if param_set_uint_minmax is provided 83/47383/2
Chris Horn [Wed, 18 May 2022 02:29:12 +0000 (19:29 -0700)]
LU-15093 libcfs: Check if param_set_uint_minmax is provided

Linux kernel v5.15 commit 2a14c9ae15a38148484a128b84bff7e9ffd90d68
moved param_set_uint_minmax to common code.

Lustre-change: https://review.whamcloud.com/45214
Lustre-commit: 3337e9fe920b260e34ff62c0840279ea6bff34ca

HPE-bug-id: LUS-10469
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ifd1d72ae531f0f6c7cd96cc28fbc07c8a8b70886
Reviewed-on: https://review.whamcloud.com/47383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months ago LU-10235 mdt: mdt_create: check EEXIST without lock 74/41674/7
Dominique Martinet [Wed, 10 Jan 2018 13:08:06 +0000 (14:08 +0100)]
LU-10235 mdt: mdt_create: check EEXIST without lock

mkdir() currently gets a write lock on the parent even if the new
directory already exists.

This patch adds an initial lookup of the new directory without a DLM
lock so that other clients do not need to cancel their DLM lock if the
"new" directory already exists, but will continue as usual if directory
did not exist.

There is a small race window in which the child may be created by
another client after our check and before the parent is locked, but
this is detected later during index insert.
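
The lookup-before-lock idea, including the race check at insert time, can be sketched with a toy directory. This is an illustration under stated assumptions: a mutex stands in for the parent's DLM write lock, and the class and counter names are hypothetical.

```python
import errno
import threading


class Directory:
    """Toy parent directory; the lock stands in for the parent's write lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.entries = set()
        self.lock_acquisitions = 0

    def mkdir(self, name):
        # Fast path: unlocked lookup, so an existing name costs no lock.
        if name in self.entries:
            return -errno.EEXIST
        with self.lock:
            self.lock_acquisitions += 1
            # Race check: someone may have created the name after the
            # unlocked lookup; the insert step catches that.
            if name in self.entries:
                return -errno.EEXIST
            self.entries.add(name)
            return 0
```

Repeated mkdir calls on an existing name return EEXIST without touching the lock, which is the case the stress test in this commit message exercises.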

Performance change on two haswell 16-core VMs with ib, mean values of
mpirun -n 8 ./mdtest -D -i 8 -I 1000

test environment | directory creation | tree creation
local, no patch  | 1725/s             | 769/s
local, patch     | 1821/s             | 788/s
remote, no patch | 1729/s             | 772/s
remote, patch    | 1687/s             | 787/s

The differences are of the order of the noise here, with all mkdirs
being effective.

If directories exist, some simple stress on four nodes shows intended
improvements:
clush -w vm[0-3] 'seq 0 10000 |
    xargs -P 7 -I{} sh -c "(({}%3==0)) &&
        mkdir /mnt/lustre/testdir/foo 2>/dev/null ||
        stat /mnt/lustre/testdir > /dev/null"'

with patch: 10s
without patch: 19s
(the difference grows exponentially with number of clients and hangs
with over 60 clients without the patch; exact time was not re-measured
with patch)

Updated sanityn.sh 43a 45a to avoid race conditions.

Add sanityn.sh test_43j to verify above scenario.

Lustre-change: https://review.whamcloud.com/30880
Lustre-commit: 79acb9a9e7d3c3185a047f5b067382a814c0e9e5

Test-Parameters: envdefinitions=SLOW=yes testlist=replay-vbr,replay-vbr
Change-Id: I37fc9c8ffc7ab334c0645042beda5bef01284564
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months ago LU-13974 tests: update log corruption 64/46864/3
Alexander Boyko [Tue, 24 Nov 2020 09:05:36 +0000 (04:05 -0500)]
LU-13974 tests: update log corruption

The test case reproduces a missing object for a sub-transaction
during a set xattr operation.
The first setattr got -2; the second had already started but had not
called llog_add yet. In this case the llog osp object is stale after
top_trans_start, so the declaration phase cannot refresh the llogs.
At llog_osd_write_rec the osp object changes from stale to
valid (dt_attr_get), but the llog handle and llog header are invalid.
A new record would be added to the update log with the wrong index.
In that case processing of the update log fails with
fs1-MDT0001-osp-MDT0003: [0x2:0x400024d0:0x2] Invalid record: index
112926 but expected 112925
lod_sub_recovery_thread()) fs1-MDT0001-osp-MDT0003 get update log
failed: rc = -34
Recovery aborted, and clients are evicted.

Lustre-change: https://review.whamcloud.com/40743
Lustre-commit: 562837124ec7bffeba7edb4b4b899bc271833374

HPE-bug-id: LUS-9030
Test-Parameters: testlist=sanity  envdefinitions=ONLY="427"
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I6a47fed1bc01f4be62216d1d0787adc413df0cf5
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46864
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
23 months ago LU-13356 client: don't use OBD_CONNECT_MNE_SWAB 09/41309/2
Alexander Boyko [Wed, 11 Mar 2020 10:40:52 +0000 (06:40 -0400)]
LU-13356 client: don't use OBD_CONNECT_MNE_SWAB

OBD_CONNECT_MNE_SWAB is equal to OBD_CONNECT_MDS_MDS. It was used
by the MGC client in the past for mne swabbing during interop.
Right now the MGS interprets it as OBD_CONNECT_MDS_MDS and exempts
these clients from eviction and lock cancelling after timeout.

Lustre-change: https://review.whamcloud.com/37880
Lustre-commit: 3fe77a129e131014ff654bde616a62a1e243e322

Fixes: 1bdc4fd0594e ("LU-6307 obdclass: distinguish MGC/MDT connection properly")
Test-Parameters: testlist=runtests clientversion=2.12 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Test-Parameters: testlist=runtests serverversion=2.12 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Test-Parameters: testlist=runtests clientversion=2.10 clientdistro=el7.6 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Test-Parameters: testlist=runtests serverversion=2.10 serverdistro=el7.6 envdefinitions=MDS_MOUNT_OPTS="-orw",OST_MOUNT_OPTS="-orw"
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-8484
Change-Id: I4f8ddeb1808cfaee7507e0efcdefa24040cfcbb6
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41309
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
23 months ago LU-13195 osp: invalidate object on write error 63/46863/3
Alex Zhuravlev [Mon, 27 Apr 2020 07:24:33 +0000 (10:24 +0300)]
LU-13195 osp: invalidate object on write error

do this unconditionally, to avoid cases when the object is
on another request's invalidation list.

Lustre-change: https://review.whamcloud.com/38387
Lustre-commit: 9e1071b517578ed3752efb1412017c8f93cd333b

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8ee0c484e695e88c0ea6fb13ac377fa689150780
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
23 months ago LU-13974 llog: check stale osp object 62/46862/2
Alexander Boyko [Tue, 24 Nov 2020 05:34:11 +0000 (00:34 -0500)]
LU-13974 llog: check stale osp object

The logic of osp_attr_get has 2 paths:
1) return attributes from a cache for a healthy osp object
2) make an out update request and return attributes for a stale
osp object; the object loses its stale state.

When an out update request with llog writes fails, the osp object
becomes stale. But the llog handle stays inconsistent (bitmap, count,
last_index), and the next llog_add->llog_osd_write_rec does
dt_attr_get, gets attributes, makes the osp object valid, and uses
wrong llog handle data. The result is an index jump in the llog
file - recX, recX+2. This causes an error during update log
processing if failover takes place.
The fix adds a dt_object_stale function to check the osp_object.
llog_osd_write_rec checks it and returns ESTALE. llog_add then fails
with an ESTALE error and doesn't corrupt the update log.
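
The effect of the stale check can be sketched with a toy handle. This models the idea only; `OspObject` and `LlogHandle` here are hypothetical Python classes, not the kernel structures, and the index bookkeeping is reduced to a single counter.

```python
import errno


class OspObject:
    """Toy remote object with a stale flag (hypothetical model)."""

    def __init__(self):
        self.stale = False


class LlogHandle:
    """Toy log handle that caches the last index it wrote."""

    def __init__(self, obj):
        self.obj = obj
        self.last_index = 0
        self.records = []

    def write_rec(self, payload):
        # Refuse to write through a handle whose object went stale: the
        # cached last_index may no longer match what is on disk.
        if self.obj.stale:
            return -errno.ESTALE
        self.last_index += 1
        self.records.append((self.last_index, payload))
        return self.last_index
```

Failing the write leaves the record list untouched, so the log keeps consecutive indices and the caller sees the error instead.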

Lustre-change: https://review.whamcloud.com/40742
Lustre-commit: 82c6e42d6137f39a1f2394b7bc6e8d600eb36181

HPE-bug-id: LUS-9030
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Iadf53fd816e1c5bde0a19d4c537f0408796c864a
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/46862
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14536 o2iblnd: don't try to reconnect if there's no listener 96/45896/2
Li Dongyang [Fri, 19 Mar 2021 10:21:58 +0000 (21:21 +1100)]
LU-14536 o2iblnd: don't try to reconnect if there's no listener

For each discovery we try to reconnect up to retry_count times,
which defaults to 5. While the MDT mount processes the conf log,
multiple discoveries are made for each OST.
If the OSTs are not up, the mount will have a long timeout.

Lustre-change: https://review.whamcloud.com/42111
Lustre-commit: 67ba3ce23d32266eabd5f8c56fa78d65920455e8

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: If1d854216d2f26089c52d3fb501092b7f48a444d
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45896
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14536 o2iblnd: don't resend if there's no listener 95/45895/2
Li Dongyang [Fri, 19 Mar 2021 09:26:28 +0000 (20:26 +1100)]
LU-14536 o2iblnd: don't resend if there's no listener

If there's no listener at the remote peer, we will
get IB_CM_REJ_INVALID_SERVICE_ID. Currently we
try to resend, which makes the discovery longer
than necessary when connecting to a node which is
not up.
Use -EHOSTUNREACH instead of -ECONNREFUSED,
so we don't end up queued for resend.
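
The effect of the error-code change can be sketched as a small policy model. Both functions are hypothetical; the only facts taken from the commit message are that -ECONNREFUSED leads to a resend and that -EHOSTUNREACH is returned instead when there is no listener.

```python
import errno


def should_resend(rc):
    """Hypothetical policy: only a refused connection is queued for resend."""
    return rc == -errno.ECONNREFUSED


def reject_to_errno(no_listener):
    # With no listener on the remote side, report "host unreachable"
    # rather than "connection refused" so the message is not resent.
    # Mapping every other reject to ECONNREFUSED is an assumption here.
    return -errno.EHOSTUNREACH if no_listener else -errno.ECONNREFUSED
```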

Lustre-change: https://review.whamcloud.com/42109
Lustre-commit: 0ab06eb9d865a47ea3e09880a41a9e8f0a78b6a6

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ifaf14bc3ada2e2469669285917e366af669817e2
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45895
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-10931 lnet: handle unlink before send completes 98/45898/2
Amir Shehata [Mon, 8 Jul 2019 19:33:31 +0000 (12:33 -0700)]
LU-10931 lnet: handle unlink before send completes

If LNetMDUnlink() is called on an md with md->md_refcount > 0 then
the eq callback isn't called.
There is a scenario where the response times out before the send
completes. So we have a refcount on the MD. The Unlink callback gets
dropped on the floor. Send completes, but because we've already timed
out, the REPLY for the GET is dropped. Now we're left with a peer
that is in the following state:
LNET_PEER_MULTI_RAIL
LNET_PEER_DISCOVERING
LNET_PEER_PING_SENT
But no more events are coming to it, and the discovery never
completes.

This scenario can get RPCs stuck as well if the response times out
before the send completes.

The solution is to set the event status to -ETIMEDOUT to inform
the send event handler that it should not expect a reply.

Lustre-commit: d8fc5c23fe541e0ff6ce5bec6302957714c3f69f
Lustre-change: https://review.whamcloud.com/35444

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ica0e1a823d0d1200bb8cc42a6e058785da1d4fa4
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/45898
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15357 mdd: fix changelog context leak 32/45832/3
Mikhail Pershin [Sat, 11 Dec 2021 12:49:47 +0000 (15:49 +0300)]
LU-15357 mdd: fix changelog context leak

The mdd_changelog_clear() shouldn't skip llog_ctxt_put()
in case of error.
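
The leak pattern and its fix can be sketched in a few lines. `Ctxt` and `changelog_clear` below are hypothetical Python stand-ins that only show a reference being dropped on both the success and the error path.

```python
class Ctxt:
    """Toy reference-counted context (hypothetical stand-in)."""

    def __init__(self):
        self.refs = 0

    def get(self):
        self.refs += 1
        return self

    def put(self):
        self.refs -= 1


def changelog_clear(ctxt, do_clear):
    """Take a reference, run do_clear(), and always drop the reference."""
    ref = ctxt.get()
    try:
        return do_clear()
    finally:
        ref.put()   # runs on both the success and the error path
```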

Lustre-change: https://review.whamcloud.com/45831
Lustre-commit: TBD (from c330a73e4cffb1fb642fadfa38001275251d1f14)

Fixes: 6b183927e1 (LU-14553 changelog: eliminate mdd_changelog_clear warning)
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I9c9aa3ce0d11e8f67470b450d007f2a1081644c6
Reviewed-on: https://review.whamcloud.com/45832
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12483 tests: fix sanity test 60h running conditions 93/45993/2
Oleg Drokin [Thu, 6 Jan 2022 21:50:16 +0000 (14:50 -0700)]
LU-12483 tests: fix sanity test 60h running conditions

The test is supposed to run in DNE mode on 2.12.4 or above,
but the conditions are somehow reversed.

Lustre-change: https://review.whamcloud.com/35355
Lustre-commit: dfd64242755b2b993ad6fe177480fb391d6eb6bb

Fixes: 5b1ea58c21e ("LU-11907 dne: allow access to striped dir with broken layout")
Change-Id: I322941a6098b0dbfbabe2f5c70f40f8e81d1bbab
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
2 years ago LU-15009 ofd: continue precreate if LAST_ID is less on MDT 30/45930/2
Lai Siyao [Thu, 16 Sep 2021 21:49:33 +0000 (17:49 -0400)]
LU-15009 ofd: continue precreate if LAST_ID is less on MDT

It's possible that precreate succeeded on the OST, but the MDT didn't
get the reply and assumed failure. In this case, the LAST_ID on the MDT
is smaller than that on the OST. Instead of reporting an error and
stopping precreate, it's better to move the precreate window forward.

Lustre-change: https://review.whamcloud.com/44984
Lustre-commit: 1711e26ae861c28829870c2433caf7ee232909cf

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ia6ca418ec0ea6797b7eccc1610879331307fad07
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14688 mdt: changelog purge deletes plain llog 90/43990/4
Alexander Boyko [Mon, 17 May 2021 13:29:01 +0000 (09:29 -0400)]
LU-14688 mdt: changelog purge deletes plain llog

When cancelling a large number of records, changelog can delete a
whole plain llog file and skip cancelling the records one by one.
The patch also fixes the race between llog_destroy and llog_next_block.

Lustre-change: https://review.whamcloud.com/43719
Lustre-commit: d813c75df6798efbf3228347628c0d671ca7269c

HPE-bug-id: LUS-9950
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I47c2ed97945e979745255381f83b6a417d7ba8b1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/43990
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14606 llog: hide ENOENT for cancelling record 72/43572/5
Alexander Boyko [Mon, 12 Apr 2021 12:19:47 +0000 (08:19 -0400)]
LU-14606 llog: hide ENOENT for cancelling record

Llog allows parallel record processing. A record could be cancelled
in a callback. If two threads are processing and cancelling the same
record, one thread would get ENOENT.
The error was observed while purging changelog records. The patch
adds a reproducer, sanity test 160m.

This is a valid case, let's hide ENOENT error from a caller.
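
Hiding the error from the caller can be sketched as a thin wrapper. Both functions are hypothetical and model live records as a plain set of indices; they are not the llog API.

```python
import errno


def cancel_record(live_records, index):
    """Low-level cancel: fails with -ENOENT if the record is already gone."""
    if index not in live_records:
        return -errno.ENOENT
    live_records.remove(index)
    return 0


def cancel_record_quiet(live_records, index):
    """Caller-facing cancel: a record already cancelled counts as success."""
    rc = cancel_record(live_records, index)
    return 0 if rc == -errno.ENOENT else rc
```

Only ENOENT is swallowed; any other error code still reaches the caller unchanged.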

Lustre-change: https://review.whamcloud.com/43264
Lustre-commit: 0b60647c0382426e3b4105d82d04862d2e4831cb

HPE-bug-id: LUS-9826
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id00b959e6f329c2ad34966f8a17a52f71680f24c
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43572
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13636 obdclass: drop nlink if directory is removed 66/44466/2
Alex Zhuravlev [Fri, 5 Jun 2020 12:15:22 +0000 (15:15 +0300)]
LU-13636 obdclass: drop nlink if directory is removed

To make e2fsck happy.  Otherwise, all the features using
local directories (quota, nodemap, nid tables) can leave
orphaned objects as nlink doesn't drop to 0.

Lustre-change: https://review.whamcloud.com/38844
Lustre-commit: c6d5c6606a38e2b550a81591935b0091faba4a2e

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9e20a304d66c61f312168715e888757bc06b6ed0
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/44466
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-2233 tests: improve tests sanityn/40-47 91/44391/3
Alex Zhuravlev [Mon, 29 Apr 2019 08:21:13 +0000 (11:21 +0300)]
LU-2233 tests: improve tests sanityn/40-47

sanityn/40-46 usually take 800-900s, which is almost half
of the whole sanityn pass. 99.(9)% of the time the tests just
wait to ensure the operations execute in a specific order.

the patch changes cfs_fail_timeout_set() so that it can
interrupt waiting if fail_loc is set to 0 - polling with
1/10s frequency is used.
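
An interruptible wait of this kind can be sketched in userspace as follows. This is not cfs_fail_timeout_set(); the function name, the callable used to read fail_loc, and the parameters are assumptions, with only the 1/10s polling interval taken from the text above.

```python
import time


def fail_timeout_wait(get_fail_loc, timeout_s, poll_s=0.1):
    """Sleep up to timeout_s, returning early once get_fail_loc() is 0.

    Returns True if the wait was interrupted, False if it ran to the end.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_fail_loc() == 0:
            return True
        # Never sleep past the deadline, and never pass a negative value.
        time.sleep(min(poll_s, max(0.0, deadline - time.monotonic())))
    return False
```

Polling trades a small delay, at most one poll interval, for not needing any wakeup mechanism between the test script and the waiting thread.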

the tests themselves are modified to reset fail_loc. to be
able to do so, both operations (referenced as OP1 and OP2
in the tests) are run in the background. once they are started
and the pdo_sched() helper has ensured that the MDS threads
reached the blocking points, we can interrupt OP1 and do the
usual checks.

ONLY=40-47 sh sanityn.sh takes 1017s before and 78s after.

Lustre-change: https://review.whamcloud.com/4392
Lustre-commit: 743b85a32e24cff0c77dff739691043970a0901e

LU-12470 tests: increase pdirops timeout

There are pretty regular failures of the sanityn pdirops test_40-47.
Increase the timeout slightly to reduce the frequency of failures.

Lustre-change: https://review.whamcloud.com/37304
Lustre-commit: b35f50c96c608ba650a5b3cf29fa129e01025549

Test-Parameters: trivial testlist=sanityn,sanityn,sanityn,sanityn
Test-Parameters: testlist=sanityn,sanityn,sanityn,sanityn,sanityn
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Ib8aec2b4517a6f84402ccae66f6d5ceac6d73d85
Reviewed-on: https://review.whamcloud.com/44391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7] 07/45707/2
Jian Yu [Thu, 2 Dec 2021 08:43:38 +0000 (00:43 -0800)]
LU-15292 kernel: kernel update RHEL7.9 [3.10.0-1160.49.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.49.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I356b8a8345a4a91d6d1c1a4a9b4eab4bb5afe75b
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12661 tests: skip sanity 817 for kernel 4.12+ 63/39863/3
Andreas Dilger [Wed, 9 Sep 2020 00:29:06 +0000 (18:29 -0600)]
LU-12661 tests: skip sanity 817 for kernel 4.12+

Skip the NFS exec mode bug for kernels 4.12 and later, since this
is also being hit on SLES12/15 kernel 4.12.14+ and not just 4.14.

Lustre-change: https://review.whamcloud.com/39838
Lustre-commit: 3e2c28437404b0ccbd7bbfb8f77788678975b63d

Test-Parameters: trivial
Fixes: 4fed33473ca2 ("LU-12661 tests: skip sanity 817 if kernel >= 4.14")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ibc4ffda72bd7827e250c4583c760505b8f3ebbe5
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39863
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12579 tests: allow some margin in runtests 79/45579/2
Andreas Dilger [Fri, 30 Aug 2019 23:23:50 +0000 (17:23 -0600)]
LU-12579 tests: allow some margin in runtests

Allow some margin in the space used by runtests for internal
log files for Lustre and the underlying filesystem.

Lustre-change: https://review.whamcloud.com/36011
Lustre-commit: c05656557353954b2a9799c4e702329db2d38851

Test-Parameters: trivial testlist=runtests,runtests,runtests
Test-Parameters: mdtcount=4 mdscount=2 testlist=runtests,runtests,runtests
Test-Parameters: fstype=zfs testlist=runtests,runtests,runtests
Test-Parameters: fstype=zfs mdtcount=4 mdscount=2 testlist=runtests,runtests

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I34b47a8436c5718be311698a3f6e6d7af7ea45ad
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45579
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-12751 tests: add missing error() 78/45578/3
Alex Zhuravlev [Wed, 11 Sep 2019 14:32:21 +0000 (17:32 +0300)]
LU-12751 tests: add missing error()

nothing else I can say

Lustre-change: https://review.whamcloud.com/36159
Lustre-commit: 78f7b7709f9b45b5faae6e7c7b3093c246a08086

Test-Parameters: trivial

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I040771e57ec6f6c6bfbde5a21358c6747f4f20dc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
2 years agoLU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4] 13/45513/4
Jian Yu [Wed, 17 Nov 2021 20:43:25 +0000 (12:43 -0800)]
LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45513
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12410 lnet: Add additional output to sanity-lnet.sh 88/44188/3
Chris Horn [Thu, 19 Sep 2019 19:01:05 +0000 (14:01 -0500)]
LU-12410 lnet: Add additional output to sanity-lnet.sh

Add wrappers around ip netns exec and lnetctl commands to generate
some additional test output. This makes it easier to see what each
test case is doing from the test script output, and aids in debugging
any problems.

Lustre-change: https://review.whamcloud.com/36242
Lustre-commit: 32528a689889989607a34b21efa583429bda1422

Test-parameters: trivial testlist=sanity-lnet

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I95b18cb3a090527548a8f9e65845eb4a18dea6d6
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44188
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoNew release 2.12.8 2.12.8 v2_12_8
Oleg Drokin [Thu, 18 Nov 2021 19:04:45 +0000 (14:04 -0500)]
New release 2.12.8

Change-Id: I33decc215454eb6bc85361dfd7d68a11db4113c4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler 68/45568/2
Lei Feng [Tue, 12 Oct 2021 06:33:22 +0000 (14:33 +0800)]
LU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler

It's not necessary to LASSERT() in nrs_polices proc handler.
CERROR() and returning error is good enough.

Lustre-change: https://review.whamcloud.com/45200
Lustre-commit: 9997f94d4b6ee335d2bf86f94bd43464d5b8f061

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I09f06dc4ab90e49b2df66a9b47a74678c64cdd2f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45568
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-9704 grant: ignore grant info on read resend 74/45474/2
Vladimir Saveliev [Wed, 3 Nov 2021 10:52:14 +0000 (13:52 +0300)]
LU-9704 grant: ignore grant info on read resend

The following scenario produces a message like "claims 28672 GRANT,
real grant 0":

 1. The client owns X grants and sends an RPC to shrink part of them.
 2. The server fails over, so the shrink RPC has to be resent.
 3. On client reconnect, the server and client sync on the initial
 amount of grants for the client.
 4. The shrink RPC is resent. If the server has enough disk space, the
 shrink does not happen and the client adds the amount of grants it
 was going to shrink to its new initial amount of grants. Now the
 client thinks that it owns more grants than it does from the server's
 point of view.
 5. The client consumes grants and sends RPCs to the server. The server
 avoids allocating new grants for the client if the current amount of
 grant is big enough:
static long tgt_grant_alloc(struct obd_export *exp, u64 curgrant,
...
        if (curgrant >= want || curgrant >= ted->ted_grant + chunk)
                RETURN(0);
 6. The client keeps consuming grants, which eventually leads to
 complaints like "claims 28672 GRANT, real grant 0".

When read and set_info:shrink RPCs are resent, the grant info should
be ignored, as it was reset on reconnect.
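
The rule can be sketched as below. This is an illustration only:
struct demo_client and demo_update_grant() are hypothetical, and the
message above does not say on which side the stale value is dropped, so
placing the check on the client here is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical client-side grant state; not the real client_obd layout. */
struct demo_client {
	uint64_t avail_grant;	/* grant the client believes it owns */
};

/*
 * Apply the grant value carried in a reply.  If the request was resent
 * across a reconnect, the counters were already renegotiated, so the
 * value in the reply is stale and is ignored.
 */
static void demo_update_grant(struct demo_client *cli, uint64_t reply_grant,
			      bool resent_after_reconnect)
{
	assert(cli != NULL);
	if (resent_after_reconnect)
		return;
	cli->avail_grant += reply_grant;
}
```

Without the early return, step 4 above would leave the client with
more grant than the server accounts for.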

Tests illustrating the issue are added.

Lustre-change: https://review.whamcloud.com/45371
Lustre-commit: TBD

Change-Id: I8af1db287dc61c713e5439f4cf6bd652ce02c12c
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5] 28/45528/3
Jian Yu [Mon, 15 Nov 2021 19:12:16 +0000 (11:12 -0800)]
LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.5

Lustre-change: https://review.whamcloud.com/45285
Lustre-commit: TBD (from a1b4ee323ad650d2fdff3754596771dd0c8df507)

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14128 lov: correctly set OST obj size 48/45448/4
Bobi Jam [Wed, 3 Nov 2021 18:19:09 +0000 (14:19 -0400)]
LU-14128 lov: correctly set OST obj size

When extending a PFL file to a size located at a stripe boundary
in a component, the truncate won't set the size of the OST object
in the prior stripe.

This patch records the prior stripe in
lov_layout_raid0::lo_trunc_stripeno, adds the stripe to the
truncate IO, and enqueues the lock covering it.

Lustre-change: https://review.whamcloud.com/40581
Lustre-commit: 98015004516cad1173e2bac2a4695bdc56e4d9a4

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic5d8e3c16f950003736cd6dbd5af404613f818c7
Reviewed-on: https://review.whamcloud.com/45448
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14543 target: prevent overflowing of tgd->tgd_tot_granted 90/45490/2
Vladimir Saveliev [Fri, 19 Mar 2021 12:08:47 +0000 (15:08 +0300)]
LU-14543 target: prevent overflowing of tgd->tgd_tot_granted

If tgd->tgd_tot_granted < ted->ted_grant then there should not be:
   tgd->tgd_tot_granted -= ted->ted_grant;
which breaks tgd->tgd_tot_granted.
In case of obvious ted->ted_grant damage, recalculate
tgd->tgd_tot_granted using list of exports.

The same change is made for tgd->tgd_tot_dirty.

This patch also adds a sanity check for the
exp->exp_target_data.ted_grant increase in tgt_grant_alloc() to catch
grant counting corruption as soon as it happens.
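
A minimal sketch of the guarded subtraction, under stated assumptions:
demo_tot_granted_sub() is a hypothetical helper, the export list is
modeled as a plain array, and the array is assumed to hold only the
exports that remain after the one being removed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Subtract one export's grant from the target-wide total.  If the
 * unsigned subtraction would wrap, rebuild the total from the
 * per-export values of the remaining exports instead.
 */
static uint64_t demo_tot_granted_sub(uint64_t tot_granted,
				     uint64_t export_grant,
				     const uint64_t *grants, size_t nr)
{
	uint64_t sum = 0;
	size_t i;

	if (tot_granted >= export_grant)
		return tot_granted - export_grant;

	/* Counter is damaged: recalculate it from the export list. */
	assert(grants != NULL || nr == 0);
	for (i = 0; i < nr; i++)
		sum += grants[i];
	return sum;
}
```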

Lustre-change: https://review.whamcloud.com/45474
Lustre-commit: bb5d81ea95502fb5709e176b561b70aa5280ee07

Fixes: af2d3ac30e ("LU-11939 tgt: Do not assert during grant cleanup")
Change-Id: I36ba7496f7b72b4881e98c06ec254a8eefd4c13f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-11939 tgt: Do not assert during grant cleanup 89/45489/3
Patrick Farrell [Fri, 8 Feb 2019 17:14:06 +0000 (12:14 -0500)]
LU-11939 tgt: Do not assert during grant cleanup

Client/server grant inconsistencies discovered during
cleanup are indicative of a bug, but any problems they
would cause have already occurred at this point.

So do not assert during this cleanup.

Lustre-change: https://review.whamcloud.com/34215
Lustre-commit: af2d3ac30eafead6b47c5db20d76433c091d89de

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic9b827b1005bc321a290505a368349699ddf2f38
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45489
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15184 llite: properly detect SELinux disabled case 27/45527/3
Sebastien Buisson [Mon, 15 Nov 2021 19:06:31 +0000 (11:06 -0800)]
LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.
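
The two cases can be folded into one check, sketched here in userspace
C. struct demo_secctx and demo_normalize_secctx() are hypothetical and
do not mirror the llite code; mapping -EOPNOTSUPP to a return value of
0 is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical holder for a security context and its xattr name. */
struct demo_secctx {
	const char *name;	/* NULL means "SELinux is disabled" */
	const void *ctx;
	size_t size;
};

/*
 * Normalize the result of a security context lookup.  Both
 * -EOPNOTSUPP and "success with an empty context" clear the name;
 * other errors are passed through unchanged.
 */
static int demo_normalize_secctx(int rc, struct demo_secctx *sc)
{
	assert(sc != NULL);
	if (rc == -EOPNOTSUPP || (rc == 0 && sc->size == 0)) {
		sc->name = NULL;
		sc->ctx = NULL;
		sc->size = 0;
		return 0;
	}
	return rc;
}
```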

Lustre-change: https://review.whamcloud.com/45501
Lustre-commit: TBD (from 85779753abe0451e2b0b82dcf5d4a4d111b0bfb8)

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14413 test: test for overstriping for sanity 27M 54/44354/7
James Simmons [Wed, 28 Jul 2021 00:10:29 +0000 (20:10 -0400)]
LU-14413 test: test for overstriping for sanity 27M

The introduction of sanity 27M broke interop with 2.12 LTS since
over striping doesn't exist in that version. Adjust the test to
use over striping if the client supports it, otherwise just use
traditional striping.

Lustre-change: https://review.whamcloud.com/44340
Lustre-commit: 4e1f9c4bd1d96063a1fbb2dfaab41b15836167ab

Test-Parameters: trivial testlist=sanity env=ONLY=27M
Change-Id: I2d788a116cbb749a83d6cec36f97d06533b32421
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44340
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44354
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years agoLU-14598 ofd: fix for IDIF sequence at ofd_preprw_write 41/43541/2
Alexander Boyko [Thu, 8 Apr 2021 08:23:54 +0000 (04:23 -0400)]
LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write

During recovery, a write operation can create and load a sequence
if it arrives before the create request from MDT0. ofd_preprw_write()
uses the wrong logic to pick the sequence for IDIF FIDs: if the oid
overflows 32 bits and spills into the IDIF sequence, the write request
loads the wrong ofd sequence, which is then used for other IO. The
next create from MDT0 causes an error:
Too many FIDs to precreate OST replaced or reformatted...

Test 122b reproduces the issue where the OST uses a wrong sequence
for MDT0 IDIF. The error requires an object id that does not fit in
32 bits and a write request during recovery that is processed before
the create request from MDT0.
For the error to be visible on the console, the last object id should
be 1<<32 + (OST_MAX_PRECREATE * 5). The error is:
lustre-OST0000: Too many FIDs to precreate OST replaced or
    reformatted: LFSCK will clean up
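
To illustrate why a large object id changes the sequence value, here
is a sketch of an IDIF-style encoding. The bit layout and the DEMO_*
constants are assumptions made for the example, and demo_seq_key() is
a hypothetical helper rather than the change made by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed IDIF sequence range and MDT0 sequence for this sketch. */
#define DEMO_SEQ_IDIF		0x100000000ULL
#define DEMO_SEQ_IDIF_MAX	0x1ffffffffULL
#define DEMO_SEQ_OST_MDT0	0ULL

/*
 * Build an IDIF-style sequence: the OST index sits in bits 16-31 and
 * bits 32-47 of the object id spill into the low 16 bits.
 */
static uint64_t demo_idif_seq(uint64_t id, uint32_t ost_idx)
{
	assert(ost_idx <= 0xffff);
	return DEMO_SEQ_IDIF | ((uint64_t)ost_idx << 16) |
	       ((id >> 32) & 0xffff);
}

/* Map every IDIF sequence to one key so all such ids share a sequence. */
static uint64_t demo_seq_key(uint64_t seq)
{
	if (seq >= DEMO_SEQ_IDIF && seq <= DEMO_SEQ_IDIF_MAX)
		return DEMO_SEQ_OST_MDT0;
	return seq;
}
```

Under these assumptions, ids 5 and (1<<32) + 5 on the same OST yield
different raw sequence values, so a lookup keyed on the raw value
would load two sequences where one is expected.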

Lustre-change: https://review.whamcloud.com/43248
Lustre-commit: 747fed818be5a4e09281ab1d9fd5b3a13763ab40

HPE-bug-id: LUS-9595
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I09e6f88b1f0d03fec59b24ef096cbc7baa5388ae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/43541
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14565 ofd: Do not rely on tgd_blockbit 55/43955/9
Arshad Hussain [Mon, 29 Mar 2021 05:22:11 +0000 (10:52 +0530)]
LU-14565 ofd: Do not rely on tgd_blockbit

tgd_blockbit holds the recordsize bits set during mkfs
and does not change once set. However, 'zfs set'
can be used to change the OST blocksize. Instead
of using the cached value of 'tgd_blockbit', always
calculate the blocksize bits, which may have
changed.
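
Deriving the bits from the current blocksize is cheap. The helper
below is a hypothetical sketch, not the code added by this patch, and
it assumes the blocksize is a non-zero power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Return log2(blocksize) for a non-zero power-of-two block size. */
static unsigned int demo_blocksize_bits(uint32_t blocksize)
{
	unsigned int bits = 0;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	while ((blocksize >>= 1) != 0)
		bits++;
	return bits;
}
```

Calling this with the blocksize reported at the time of the request
follows a later 'zfs set', which a value cached at mount cannot do.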

Test-case: sanity/104c added

Conflicts:
lustre/mdt/mdt_handler.c

Lustre-change: https://review.whamcloud.com/43154/
Lustre-commit: 8ee6e1c8825c4fabfd6c39db11081839ca53d454

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icc100cca0d5ae492c41d60f0bf97512450f796bc
Reviewed-on: https://review.whamcloud.com/43955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13054 ldiskfs: split htree_lock as separate patch 21/44121/3
Yang Sheng [Sun, 26 Apr 2020 11:59:16 +0000 (19:59 +0800)]
LU-13054 ldiskfs: split htree_lock as separate patch

The htree_lock part is identical in the different
distro versions of the pdirop patch, so move it out as a
separate patch to reduce maintenance effort.

Lustre-change: https://review.whamcloud.com/38372
Lustre-commit: 42880f9502ba57b7ee35559d7b07d2f1a3adec72

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I423cc957de37ccdb097c9893f69481ce947ac78c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-13054 ldiskfs: htree_node wrongly granted 20/44120/3
Yang Sheng [Sun, 26 Apr 2020 11:56:40 +0000 (19:56 +0800)]
LU-13054 ldiskfs: htree_node wrongly granted

The thread could be woken up accidentally, so check
whether the lock was actually granted after wakeup.
Also fix an issue where major was always set to 0 because
hbit was initialized incorrectly. This could hurt
performance, especially for operations in big directories.

kernel BUG at lustre/ldiskfs/htree_lock.c:429!
 Call Trace:
 htree_node_release_all+0x5a/0x80 [ldiskfs]
 htree_unlock+0x22/0x70 [ldiskfs]
 osd_index_ea_delete+0x30e/0xb10 [osd_ldiskfs]
 lod_sub_delete+0x1c8/0x460 [lod]
 lod_delete+0x24/0x30 [lod]
 __mdd_index_delete_only+0x194/0x250 [mdd]
 __mdd_index_delete+0x46/0x290 [mdd]
 mdd_unlink+0x5f8/0xaa0 [mdd]
 mdo_unlink+0x46/0x48 [mdt]
 mdt_reint_unlink+0xbed/0x14b0 [mdt]

Lustre-change: https://review.whamcloud.com/38371
Lustre-commit: 4597a2b4fc33711f66eb1c21fc125d028bd3f2ec

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I5972961bc78b349214c6756642717d126f0c4b26
Reviewed-on: https://review.whamcloud.com/44120
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7] 54/45354/2
Jian Yu [Mon, 25 Oct 2021 18:47:37 +0000 (11:47 -0700)]
LU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.45.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I11c307bfd6a6b353bc7b6fe40bb5d604bc9b3fdc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45354
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10) 55/45355/2
Arshad Hussain [Mon, 25 Oct 2021 18:51:50 +0000 (11:51 -0700)]
LU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10)

ZFS: (2.0.0-1)
Lustre: 608cce73d51 LU-15007 tests: quota enable cmd fix
CentOS: 3.10.0-1160.15.2.el7.x86_64

This patch fixes the two build failures shown below for
the above configuration.

First
~~~~~
In file included from:
/root/zfs/zfs_git_lustre_build/zfs/include/sys/spa.h:39:0,
from libmount_utils_zfs.c:32:
/root/zfs/<path>/.../sys/zfs_context.h:110:27:
fatal error: sys/byteorder.h: No such file or directory
#include <sys/byteorder.h>

Second
~~~~~~
gcc -rdynamic -shared -export-dynamic -pthread \
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libzfs/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libnvpair/.libs
-o mount_osd_zfs.so \
`ar -t libmount_utils_zfs.a` \
-ldl -lzfs -lnvpair -lzpool
/usr/bin/ld: cannot find -lzpool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
collect2: error: ld returned 1 exit status

Lustre-change: https://review.whamcloud.com/45016
Lustre-commit: 8931f7e4e5da39389a79eff11dc04bb468beb715

Change-Id: Iaf868391e414deb7ac8df43847250bbcd0115d5e
Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45355
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14124 target: set OBD_MD_FLGRANT in read's reply 71/45471/2
Vladimir Saveliev [Wed, 20 Oct 2021 10:32:11 +0000 (13:32 +0300)]
LU-14124 target: set OBD_MD_FLGRANT in read's reply

If tgt_grant_shrink() decides not to shrink grants, a client is
supposed to restore its cl_grant_avail in osc_update_grant(). In the
read case OBD_MD_FLGRANT is not set in the reply's body->oa.o_valid,
so osc_update_grant() misses the cl_grant_avail update. As a result,
the server keeps thinking that the client has a lot of grants while
the client thinks that it is badly short of grants. That may lead to
performance degradation.
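
The dependency on the valid flag can be sketched as follows.
DEMO_MD_FLGRANT, struct demo_reply and demo_update_grant_from_reply()
are hypothetical stand-ins; the flag value is arbitrary and is not the
real OBD_MD_FLGRANT bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit standing in for OBD_MD_FLGRANT. */
#define DEMO_MD_FLGRANT	(1ULL << 0)

/* Hypothetical subset of a reply body. */
struct demo_reply {
	uint64_t valid;	/* bitmask of fields the server filled in */
	uint64_t grant;	/* grant returned to the client */
};

/* Credit the client only if the reply marks its grant field as valid. */
static uint64_t demo_update_grant_from_reply(uint64_t avail,
					     const struct demo_reply *rep)
{
	assert(rep != NULL);
	if (rep->valid & DEMO_MD_FLGRANT)
		avail += rep->grant;
	return avail;
}
```

A read reply that carries a grant value but leaves the flag clear is
skipped by this check, which matches the mismatch described above.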

A test to illustrate the issue is included.

Lustre-change: https://review.whamcloud.com/43375
Lustre-commit: 4894683342d77964daeded9fbc608fc46aa479ee

Test-Parameters: testlist=sanity
Change-Id: Ibe7ce0af5701226c8be3ae3f9ad57c354791fa0f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45471
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2] 64/45364/2
Jian Yu [Mon, 25 Oct 2021 23:40:08 +0000 (16:40 -0700)]
LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2]

Update SLES12 SP5 kernel to 4.12.14-122.91.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ia6620869fa84d72f8d22c4a8a039600037ddb2d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45364
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14696 llite: check read only mount for setquota 23/44923/3
Hongchao Zhang [Wed, 15 Sep 2021 11:44:23 +0000 (19:44 +0800)]
LU-14696 llite: check read only mount for setquota

Setting quota should fail if the mount is read-only.

Lustre-change: https://review.whamcloud.com/43765
Lustre-commit: 29e00cecc6019fbdb5bd98511970970ac5ef5318

Change-Id: I966ac71d0a4a72dcb998f09ffc0f99ae28498e27
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44923
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4] 51/44951/2
Jian Yu [Thu, 16 Sep 2021 00:53:27 +0000 (17:53 -0700)]
LU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.19.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Icedc6cf2a5678cfbce76c47507137c0ea41d0b06
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44951
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7] 76/44876/2
Jian Yu [Thu, 9 Sep 2021 00:38:05 +0000 (17:38 -0700)]
LU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.42.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9 \
testlist=sanity

Change-Id: I377ea5d1e28c50b1087dfca7cb32f44afb9bf5f5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44876
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1] 63/44863/2
Jian Yu [Tue, 7 Sep 2021 19:56:49 +0000 (12:56 -0700)]
LU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1]

Update SLES12 SP5 kernel to 4.12.14-122.83.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I2b35d129550b895324bb3e2e61910ad10e846f03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-11546 utils: enable large_dir for ldiskfs 81/36781/6
Li Dongyang [Wed, 23 Oct 2019 00:10:34 +0000 (11:10 +1100)]
LU-11546 utils: enable large_dir for ldiskfs

Format MDT with "large_dir" option by default,
to get over the 10M-entry limit for the directories.

Lustre-change: https://review.whamcloud.com/36555
Lustre-commit: cd1faa0124f21e12a5ecd83c709c13918264fc86

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie51e6ce28b5f00adc9958de24794a760d9b43b77
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36781
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-12627 ofd: reset fti_attr in ofd_lvbo_update() 69/44269/5
Wang Shilong [Sat, 3 Aug 2019 06:27:22 +0000 (14:27 +0800)]
LU-12627 ofd: reset fti_attr in ofd_lvbo_update()

This patch tries to fix the following panic:

(ofd_internal.h:440:tsi2ofd_info()) ASSERTION( info->fti_attr.la_valid == 0 ) failed:
(ofd_internal.h:440:tsi2ofd_info()) LBUG
[ 5321.108598] Call Trace:
[ 5321.109347]  [<ffffffffc06fc8bc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 5321.111342]  [<ffffffffc06fc96c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 5321.113026]  [<ffffffffc147631a>] ofd_preprw+0xcfa/0x1160 [ofd]
[ 5321.114643]  [<ffffffffc0bb934c>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[ 5321.116373]  [<ffffffffc0bbc50a>] tgt_request_handle+0x91a/0x15c0 [ptlrpc]
[ 5321.118230]  [<ffffffffc0b61636>] ptlrpc_server_handle_request+0x256/0xb00 [ptlrpc]
[ 5321.120318]  [<ffffffffc0b6516c>] ptlrpc_main+0xbac/0x1560 [ptlrpc]
[ 5321.122001]  [<ffffffff84cc1c31>] kthread+0xd1/0xe0
[ 5321.123023]  [<ffffffff85374c37>] ret_from_fork_nospec_end+0x0/0x39
[ 5321.124066]  [<ffffffffffffffff>] 0xffffffffffffffff

If this is a server lock, tgt_brw_lock() will eventually call
ofd_lvbo_update() upon lock cancel, which uses @fti_attr
and pollutes its value:

|->ptlrpc_main
 |->lu_context_enter(le_ctx)
  |->tgt_brw_write
   |->tgt_brw_lock
    |->tgt_extent_lock
     |->ldlm_cli_enqueue_local
      |->ldlm_lock_enqueue
       |->ldlm_run_ast_work
        |->ptlrpc_check_set
          |->ldlm_cb_interpret
           |->ldlm_handle_ast_error
            |->ofd_lvbo_update
             |->ofd_attr_get polluted @info->fti_attr

  |->tgt_brw_write
   |->ofd_preprw
    |->tsi2ofd_info
      |->ASSERTION(info->fti_attr.la_valid == 0)

 |->lu_context_exit(le_ctx)--->memset @fti_attr

To fix this problem, reset fti_attr->la_valid before
ofd_lvbo_update() returns, just like ofd_lvbo_init() does.
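
The pattern of clearing shared scratch state before returning can be
sketched as below. The demo_* types and functions are hypothetical and
much simpler than the ofd structures they are named after.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread scratch area reused across calls. */
struct demo_attr {
	uint64_t la_valid;	/* bitmask of fields currently filled in */
	uint64_t la_size;
};

struct demo_thread_info {
	struct demo_attr fti_attr;
};

/* Entry check modeled on the assertion in the trace above. */
static struct demo_thread_info *demo_info_get(struct demo_thread_info *info)
{
	assert(info->fti_attr.la_valid == 0);
	return info;
}

/*
 * Use the scratch attr, then clear its valid mask so the next user
 * of the same thread info starts from a clean state.
 */
static uint64_t demo_lvbo_update(struct demo_thread_info *info, uint64_t size)
{
	uint64_t result;

	info->fti_attr.la_valid = 1;	/* pretend an attr_get filled it */
	info->fti_attr.la_size = size;
	result = info->fti_attr.la_size;
	info->fti_attr.la_valid = 0;	/* reset before returning */
	return result;
}
```

If the reset line is dropped, the next demo_info_get() on the same
thread info trips its assertion, as in the call chain above.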

Lustre-change: https://review.whamcloud.com/35685
Lustre-commit: 8ffbe6b82fac1d3e4d4391bcba74dc2ee1411a69

Change-Id: Ib6b448dd21603cfe0305d8425862a96ef3f7fee8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44269
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14876 out: don't connect to busy MDS-MDS export 62/44362/6
Mikhail Pershin [Wed, 21 Jul 2021 15:14:01 +0000 (18:14 +0300)]
LU-14876 out: don't connect to busy MDS-MDS export

The MDS-MDS connection is missing a check for busy requests upon
reconnect, so a resend can be executed concurrently with the
original request.

- in ptlrpc_server_check_resend_in_progress() remove the exception
  for bulk requests; they can be compared by XID nowadays.
  This also prevents an OUT request from executing concurrently
  with its resend.
- fix messages in target_handle_connect() to report correct
  information about connection details
- in out_handle() check for last_xid only once per OUT_UPDATE
- test 110m is added to recovery-small to reproduce the issue

Lustre-change: https://review.whamcloud.com/44390
Lustre-commit: 301d76a71176c186129231ddd1323bae21100165

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2ad183674d59a2cdeab0037bd8551c607b10ffeb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-11518 ldlm: cancel LRU improvement 07/41007/3
Vitaly Fertman [Wed, 16 Dec 2020 16:54:10 +0000 (11:54 -0500)]
LU-11518 ldlm: cancel LRU improvement

Add a @batch parameter to LRU cancel, which means that if at least 1
lock is cancelled, try to cancel at least a batch of locks. This
functionality will be used in later patches.

Limit LRU cancel to 1 thread only; this does not apply to callers
that pass the @max limit (ELC), as the LRU may be left not fully
cleaned up.

Lustre-change: https://review.whamcloud.com/39561
Lustre-commit: 3d4b5dacb3053f39d79d59860a903a19e76b9318

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ide21c4a2b2209b8a721249466ea1e651c8532c8a
HPE-bug-id: LUS-8678
Reviewed-on: https://es-gerrit.dev.cray.com/157067
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41007
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
2 years agoLU-11768 test: make at_max to take effect 45/41345/2
Hongchao Zhang [Thu, 10 Oct 2019 20:22:25 +0000 (16:22 -0400)]
LU-11768 test: make at_max to take effect

In test_6 of sanity-quota, "at_max" won't affect "at_current" if
there is no RPC to be sent on that import, so the following DQACQ
request still has a larger timeout value and triggers the watchdog.

Lustre-change: https://review.whamcloud.com/36431
Lustre-commit: 550af84a91505c85824ffad2990d31c8e8ab4dd9

Fixes: d8226b93 ("LU-11768 test: limit at_max to timeout in time")
Test-Parameters: trivial testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Iccc969459647aa70da6f6ecb0d8d13a404bf8088
Reviewed-on: https://review.whamcloud.com/41345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13423 tests: cleanup_netns correctly set result 03/44203/3
Shaun Tancheff [Tue, 7 Apr 2020 23:05:06 +0000 (18:05 -0500)]
LU-13423 tests: cleanup_netns correctly set result

The existence test for 'test1pl' should not result in
cleanup_netns returning failure to the caller.

A slightly more terse if/else can be used to ensure the
caller is notified of failure only in the case of
test1pl not being deleted.

Lustre-change: https://review.whamcloud.com/38157
Lustre-commit: 410b655c71849e5a26251f7c187b19ed8f504bd7

Test-Parameters: trivial
HPE-bug-id: LUS-8713

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I85dee20ec0f0ccd0be17597431fcedda9469d9da
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44203
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14204 tests: make sure we have a single import 98/40998/2
Sebastien Buisson [Wed, 9 Dec 2020 17:53:12 +0000 (18:53 +0100)]
LU-14204 tests: make sure we have a single import

In sanity, retrieve the exact name of the import being used on the
client, in order to properly get information such as lock_count
or lru_size.

Change-Id: I065b7da7990c7171d5baa24f3400c5f8ffc12fc9
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/40998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14098 obdclass: try to skip corrupted llog records 96/44396/2
Alex Zhuravlev [Mon, 26 Jul 2021 06:18:06 +0000 (09:18 +0300)]
LU-14098 obdclass: try to skip corrupted llog records

If an llog's header or record is found to be corrupted, then
ignore the remaining records and try with the next one.

Lustre-commit: 910eb97c1b43a44a9da2ae14c3b83e28ca6342fc
Lustre-change: https://review.whamcloud.com/40754

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86a682a8874a2184e8891ff0ee8a68414d232a79
Reviewed-on: https://review.whamcloud.com/44396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14733 o2iblnd: Avoid double posting invalidate 17/44217/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:01 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Avoid double posting invalidate

When the kib_tx is provisioned during kiblnd_fmr_pool_map(), spare
WRs in the kib_fast_reg_descriptor are set up and the mapping of
pages is given to the mr.

kiblnd_post_tx_locked() then posts the spare WRs from the
kib_fast_reg_descriptor.

	if (rc == 0)
		return 0;

The code returns and the kib_fast_reg_descriptor still contains
the spare WRs.   The next time the kib_tx is used, the
now obsolete WRs will be inadvertently posted.   For rdmavt, the
obsolete invalidate will cause an -EINVAL to be returned from
the post send.

Fix by adding a state variable frd_posted to the kib_fast_reg_descriptor.
The variable is set to false in kiblnd_fmr_pool_unmap().
kiblnd_post_tx_locked() is adjusted to avoid prepending the
kib_fast_reg_descriptor WRs when frd_posted is true.   After
the post succeeds, the frd_posted is set to true.
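
The frd_posted logic above can be sketched in isolation as follows. This is a minimal userspace model under assumptions, not the o2iblnd code: struct frd_sketch, post_tx_sketch(), unmap_sketch(), SKETCH_SPARE_WRS and the post_rc argument (standing in for the result of the post send) are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SPARE_WRS 2	/* assumed number of spare WRs in the descriptor */

struct frd_sketch {
	bool frd_posted;	/* true once the spare WRs have been posted */
};

/*
 * Returns post_rc on failure, otherwise the number of WRs that were
 * handed to the (simulated) post send.
 */
static int post_tx_sketch(struct frd_sketch *frd, int tx_wrs, int post_rc)
{
	int nwrs = tx_wrs;

	/* Prepend the spare WRs only if they have not been posted yet. */
	if (frd != NULL && !frd->frd_posted)
		nwrs += SKETCH_SPARE_WRS;

	if (post_rc != 0)
		return post_rc;	/* post failed: frd_posted stays unchanged */

	if (frd != NULL)
		frd->frd_posted = true;
	return nwrs;
}

/* Unmap makes the descriptor eligible for posting again. */
static void unmap_sketch(struct frd_sketch *frd)
{
	frd->frd_posted = false;
}
```

In this model a second post on the same descriptor carries only the message WRs, which is the double-post of the invalidate that the fix is meant to avoid.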

Lustre-change: https://review.whamcloud.com/44190
Lustre-commit: 5930576791e864529e6ef9b46f3e09cc4b635fc2

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: I426dd05e635392e75d1aa48808782a229e83ce5f
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44217
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7] 77/44377/2
Jian Yu [Thu, 22 Jul 2021 07:31:50 +0000 (00:31 -0700)]
LU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.36.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ie2898b1df28c8b99ea4099e94baafe388c6aa626
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14733 o2iblnd: Move racy NULL assignment 16/44216/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:00 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Move racy NULL assignment

kiblnd_fmr_pool_unmap() can race with map and subsequent processing
because of this flaw in unmap:

	if (frd) {
		frd->frd_valid = false;
		spin_lock(&fps->fps_lock);
		list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list);
		spin_unlock(&fps->fps_lock);
		fmr->fmr_frd = NULL;
	}

The fmr can be pulled off the list in kiblnd_fmr_pool_unmap() on
another CPU and fmr_frd could be in a state of flux and
potentially be seen incorrectly later on as the kib_tx is processed.

Fix by moving the fmr_frd assignment to before the fmr is added to the
list.
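
The reordering can be illustrated with the following simplified sketch. It is an assumption-laden model rather than the actual change: the *_sketch types are hypothetical, the pool is a plain singly linked list instead of the kernel list_head, and the fps_lock locking is only indicated by comments.

```c
#include <stdbool.h>
#include <stddef.h>

struct frd_sketch {
	bool frd_valid;
	struct frd_sketch *next;	/* stand-in for frd_list */
};

struct fmr_sketch {
	struct frd_sketch *fmr_frd;
};

struct pool_sketch {
	struct frd_sketch *free_list;	/* stand-in for fpo_pool_list */
};

static void unmap_sketch(struct pool_sketch *pool, struct fmr_sketch *fmr)
{
	struct frd_sketch *frd = fmr->fmr_frd;

	if (frd != NULL) {
		frd->frd_valid = false;
		/* Clear the pointer first, while nobody else can see frd. */
		fmr->fmr_frd = NULL;
		/* The real code holds fps_lock around the list update. */
		frd->next = pool->free_list;
		pool->free_list = frd;
	}
}
```

Once the descriptor is back on the pool list another CPU may pick it up, so nothing that depends on the old owner should be written after that point.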

Lustre-change: https://review.whamcloud.com/44189
Lustre-commit: 023113fb8946f3565529e7327fdcd90ab9db3ba3

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: Ibddf132a363ecfe9db3cc06287cec873c021d2fb
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13729 osd-ldiskfs: race access to iam_formats during setup 56/44356/2
Wang Shilong [Tue, 30 Jun 2020 01:12:48 +0000 (09:12 +0800)]
LU-13729 osd-ldiskfs: race access to iam_formats during setup

During OST mounting, two targets may reach iam_format_guess() at the
same time. If @initialized is 0, they both call iam_lxx_format_init(),
but the list operations inside it are not protected by any locking,
which eventually causes list corruption.

Fix this by registering the formats in module init. Since there are
only two formats, also remove the pointless list.

Lustre-change: https://review.whamcloud.com/39213
Lustre-commit: 54d0f5de911af52e7f2a978c4b6cd158fed87dc5

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I6dd5a4d1297792b47fb4b94052465a7e0f9123aa
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/44356
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12836 osd-zfs: Catch all ZFS pool change events 29/43929/3
Tony Hutter [Fri, 12 Mar 2021 01:23:16 +0000 (17:23 -0800)]
LU-12836 osd-zfs: Catch all ZFS pool change events

This change adds the following symlinks:

  vdev_attach-lustre -> statechange-lustre.sh
  vdev_remove-lustre -> statechange-lustre.sh
  vdev_clear-lustre -> statechange-lustre.sh

This makes it so the statechange-lustre.sh script is also called on
all ZFS events that could change the pool state.

Lustre-change: https://review.whamcloud.com/43552
Lustre-commit: e11a47da71a2e2482e4c4cf582d663cd76a2ecab

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Change-Id: I18edc86749e8ab91bb45f21aafd3fd47e78cbaef
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13055 mdd: don't assert on unknown changelog lrh_type 10/43710/7
Mikhail Pershin [Fri, 14 May 2021 17:01:43 +0000 (20:01 +0300)]
LU-13055 mdd: don't assert on unknown changelog lrh_type

Supplemental patch for old server code to prevent assertion
on unknown/new changelog record and user record types

Test-Parameters: env=ONLY=160 testlist=sanity serverjob=lustre-master serverbuildno=0
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I5d45c6ef659feb2b143edf6286df9904378171ba
Reviewed-on: https://review.whamcloud.com/43710
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-7791 ldlm: signal vs CP callback race 97/44297/2
Andriy Skulysh [Tue, 3 May 2016 07:41:56 +0000 (10:41 +0300)]
LU-7791 ldlm: signal vs CP callback race

If the wait for a CP AST is interrupted,
failed_lock_cleanup() sets LDLM_FL_LOCAL_ONLY, so
the client won't cancel the lock on CP AST.

A lock isn't canceled on the server on reception

Lustre-change: https://review.whamcloud.com/19898
Lustre-commit: 7fff052c930da4822c3b2a13d130da7473a20a58

Cray-bug-id: LUS-2021
Change-Id: Id1e365b41f1fb8a0f9a32c0c929457b22ceba8ef
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/44297
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoNew release 2.12.7 2.12.7 v2_12_7
Oleg Drokin [Thu, 15 Jul 2021 04:10:42 +0000 (00:10 -0400)]
New release 2.12.7

Change-Id: I6f98d22dd887538b32dead45b037c44541103c13
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years agoNew RC 2.12.7-RC1 2.12.7-RC1 v2_12_7-RC1
Oleg Drokin [Sun, 27 Jun 2021 14:24:07 +0000 (10:24 -0400)]
New RC 2.12.7-RC1

Change-Id: I7bccb2825193ffcdf984f53db9d606c097b784bf
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14327 tests: skip sanity-sec test 55 for older servers 50/43950/6
James Nunez [Tue, 8 Jun 2021 16:34:29 +0000 (10:34 -0600)]
LU-14327 tests: skip sanity-sec test 55 for older servers

sanity-sec test 55 was added to lustre-b2_12 version
2.12.6.3.  When we run version interop testing with
Lustre servers less than 2.12.6.3, the test will fail.
Thus, skip sanity-sec test 55 for Lustre servers less
than 2.12.6.3.

Lustre-change: https://review.whamcloud.com/43949
Lustre-commit: abda4d06a41dfb526b4a66cb5fae6ff1a4c6c01b

Fixes: 355787745f21 ("LU-14121 nodemap: do not force fsuid/fsgid squashing")

Test-Parameters: trivial
Test-Parameters: serverversion=2.10.8 serverdistro=el7.6 env=ONLY=55 testlist=sanity-sec
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ie002c921e853897105396185b38485799df31b7a
Reviewed-on: https://review.whamcloud.com/43950
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
2 years agoLU-7372 tests: re-enable replay-dual test_26 78/43978/4
Andreas Dilger [Fri, 11 Jun 2021 00:52:43 +0000 (18:52 -0600)]
LU-7372 tests: re-enable replay-dual test_26

Re-enable test_26 since it was just the unfortunate victim of
either test_24 or test_25 causing MDS unmount to hang.

Lustre-change: https://review.whamcloud.com/43982
Lustre-commit: TBD (from 0f509199a25db416759c3bbcce85c6b79d623585)

Test-Parameters: trivial testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib944028e798488c425501f0c48bf812fc13ebbe5
Reviewed-on: https://review.whamcloud.com/43978
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14673 sec: annotate algorithms taking optional key 53/43653/7
Sebastien Buisson [Tue, 11 May 2021 08:59:03 +0000 (10:59 +0200)]
LU-14673 sec: annotate algorithms taking optional key

Crypto algorithms implementing a ->setkey() method but that can also
be used without a key must set the CRYPTO_ALG_OPTIONAL_KEY flag if
defined in the kernel.
In Lustre, adler32 and crc32 implementations define a ->setkey()
method, but their "key" is not actually a cryptographic key.
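
The "if defined in the kernel" condition is typically handled with a preprocessor guard. The sketch below shows the idea only; SKETCH_OPTIONAL_KEY and struct alg_sketch are hypothetical names and this is not the actual Lustre or kernel crypto structure.

```c
#include <stdint.h>

/*
 * Use the kernel flag when the headers provide it; otherwise fall back
 * to 0 so that older kernels still build.
 */
#ifdef CRYPTO_ALG_OPTIONAL_KEY
#define SKETCH_OPTIONAL_KEY CRYPTO_ALG_OPTIONAL_KEY
#else
#define SKETCH_OPTIONAL_KEY 0
#endif

/* Hypothetical, reduced algorithm descriptor. */
struct alg_sketch {
	const char *name;
	uint32_t flags;
};

static const struct alg_sketch adler32_sketch = {
	.name  = "adler32",
	.flags = SKETCH_OPTIONAL_KEY,	/* usable with or without setkey() */
};
```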

Lustre-change: https://review.whamcloud.com/43656
Lustre-commit: b161e7b777e63bb4328aeab9e50560f919fedc31

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I362211d1b1aa3763fe1481cebb3629b255f29e41
Reviewed-on: https://review.whamcloud.com/43653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
2 years agoLU-14627 lnet: Ensure ref taken when queueing for discovery 01/44001/5
Chris Horn [Thu, 22 Apr 2021 19:51:44 +0000 (14:51 -0500)]
LU-14627 lnet: Ensure ref taken when queueing for discovery

Call lnet_peer_queue_for_discovery() in
lnet_discovery_event_handler() to ensure that we take a ref on
the peer when forcing it onto the discovery queue. This also ensures
that the peer state has LNET_PEER_DISCOVERING.

Add a test to sanity-lnet.sh that can trigger the refcount loss bug
in discovery.

Lustre-change: https://review.whamcloud.com/43418
Lustre-commit: 2ce6957b69370b0ce75725d1d91866bf55c07fa8

HPE-bug-id: LUS-7651
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie2908668c4ffde0f993b5b7ea9aa58acd1d6fa9c
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44001
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14627 tests: Create unload_modules_local 60/43960/3
Chris Horn [Fri, 23 Apr 2021 19:05:02 +0000 (14:05 -0500)]
LU-14627 tests: Create unload_modules_local

t-f allows for loading modules on a single node via load_modules_local.
However, there is no corresponding unload_modules_local that can be
called to clean up after a call to load_modules_local, so we create it.
unload_modules() is refactored to use unload_modules_local.

Also address a potential issue that can prevent LND modules from
unloading. Some LNet setups (particularly those in sanity-lnet) may
require that we call lnetctl lnet unconfigure (or lctl net down)
to drop a ref on the module before it can be unloaded.

Lustre-change: https://review.whamcloud.com/43425
Lustre-commit: 32304d863ae98c641f541362f54e7b1f24b350a6

HPE-bug-id: LUS-9031
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6458a7728f5f559f8641c5a9e29dd775c8445c38
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14627 lnet: Allow delayed sends 59/43959/2
Chris Horn [Wed, 21 Apr 2021 19:22:46 +0000 (14:22 -0500)]
LU-14627 lnet: Allow delayed sends

The net_delay_add has some code related to delaying sends, but it
isn't fully implemented. Modify lnet_post_send_locked() to check
whether the message being sent matches a rule and should be delayed.

Fix some bugs with how the delay timers were set and checked.

Lustre-change: https://review.whamcloud.com/43416
Lustre-commit: ab14f3bc852e708100d21770c00235f95841708a

HPE-bug-id: LUS-7651
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Icbd9ee81d2ff0162a01a4187807ea2114a42276d
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12675 mdt: release object reference upon error 40/43940/3
Bruno Faccini [Wed, 21 Aug 2019 13:32:54 +0000 (15:32 +0200)]
LU-12675 mdt: release object reference upon error

LBUG ("(lu_object.c:1196:lu_device_fini()) ASSERTION(
atomic_read(&d->ld_ref) == 0) failed: Refcount is <x>") can
intermittently occur during umount of MDT0000, upon specific
use cases (playing with file/dir having foreign LOV/LMV), and
due to object reference set/leaked on server side.

Lustre-change: https://review.whamcloud.com/35845
Lustre-commit: 4649899fbba095c7c3eb7ce1c8893040ed6e2494

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ic49b2bb0402b1a6e51d7ba656f9957eeda1bd0fb
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43940
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-13182 llite: Avoid eternel retry loops with MAP_POPULATE 58/43958/4
Oleg Drokin [Wed, 9 Jun 2021 16:30:12 +0000 (09:30 -0700)]
LU-13182 llite: Avoid eternel retry loops with MAP_POPULATE

Kernels 5.4+ can enter an infinite retry loop with the MAP_POPULATE mmap
option. Use FAULT_FLAG_RETRY_NOWAIT to instruct filemap_fault
not to drop the mmap_sem, so that if the call fails we can use
the slow path and prevent the loop from forming.
(Idea by Neil Brown)

Lustre-change: https://review.whamcloud.com/40221
Lustre-commit: bb50c62c6f4cdd7a31145ab81e7c166e0760ed11

Test-Parameters: trivial testlist=sanity-hsm env=ONLY=1 clientdistro=ubuntu2004

Change-Id: I320ab9ca447282aea15ef2030ef8671c4260d895
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-7372 tests: skip replay-dual test_24/25 77/43977/4
Andreas Dilger [Fri, 11 Jun 2021 00:47:52 +0000 (18:47 -0600)]
LU-7372 tests: skip replay-dual test_24/25

Not sure which one of these subtests is causing problems, but
they are causing the following runtests test to hang unmounting
the MDS, just like test_26 was doing previously.

This is only a stopgap to confirm that one of these subtests is
causing the later unmount hang, and to get testing passing again.
There needs to be further isolation done to test_24 or test_25,
and to re-enable test_26, but that can be done afterward.

Test-Parameters: trivial testgroup=review-dne-part-2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6d94eb040052b4912cf29ea37ca36ca4503ebbe5
Reviewed-on: https://review.whamcloud.com/43977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-10350 lod: adjust stripe count to available ost count 76/43976/2
Bobi Jam [Fri, 28 May 2021 08:25:52 +0000 (16:25 +0800)]
LU-10350 lod: adjust stripe count to available ost count

* In ost-pools.sh, reset $MOUNT's stripe offset, so that the created
  directory will not inherit it from root directory.

* Preserve the root directory layout in replay-single (run before
  ost-pools) to avoid leaving a bad layout on the root dir.
  Lustre-change: https://review.whamcloud.com/43872

Lustre-change: https://review.whamcloud.com/43882
Lustre-commit: TBD (from c82f557324bc0048c308d1a2135699e7c83169e1)

Test-Parameters: trivial testlist=replay-single,ost-pools
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idf6884faf1271a3864710aeab0ba0eca154bf492
Reviewed-on: https://review.whamcloud.com/43976
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4] 44/43744/5
Jian Yu [Sun, 6 Jun 2021 07:38:49 +0000 (00:38 -0700)]
LU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4]

This patch makes changes to support new RHEL 8.4 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.4

Change-Id: I47d4706f9175d489ef0e6226492af20f44f0677e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43744
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13783 osc: handle removal of NR_UNSTABLE_NFS 78/43778/2
Mr NeilBrown [Fri, 3 Jul 2020 05:33:36 +0000 (15:33 +1000)]
LU-13783 osc: handle removal of NR_UNSTABLE_NFS

In Linux 5.8 the NR_UNSTABLE_NFS page counters are gone.  All pages that
have been written but are not yet safe are now counted in NR_WRITEBACK.

So change osc_page to count in NR_WRITEBACK, but if NR_UNSTABLE_NFS
still exists in the kernel, use a #define to direct the updates to
that counter.
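
A compile-time redirect of this kind might look like the sketch below. All SKETCH_* names, the counter array and sketch_account_unstable() are hypothetical stand-ins; the real patch works on the kernel's page statistics and a configure-time check, which are not reproduced here.

```c
/* Hypothetical stand-ins for the kernel's page counter items. */
enum sketch_stat_item {
	SKETCH_NR_WRITEBACK,
	SKETCH_NR_UNSTABLE_NFS,
	SKETCH_NR_ITEMS
};

/*
 * If the build system detected the old counter, direct updates to it;
 * otherwise account the pages as writeback.
 */
#ifdef SKETCH_HAVE_NR_UNSTABLE_NFS
#define SKETCH_OSC_COUNTER SKETCH_NR_UNSTABLE_NFS
#else
#define SKETCH_OSC_COUNTER SKETCH_NR_WRITEBACK
#endif

static long sketch_counters[SKETCH_NR_ITEMS];

/* Account pages that are written but not yet safe on stable storage. */
static void sketch_account_unstable(long nr_pages)
{
	sketch_counters[SKETCH_OSC_COUNTER] += nr_pages;
}
```

The callers only ever name one counter, so the kernel-version difference stays confined to a single #define.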

Conflicts:
libcfs/autoconf/lustre-libcfs.m4

Lustre-change: https://review.whamcloud.com/39260
Lustre-commit: 3e5faa441266cd8dc2ee54ae140ad0129b4affa0

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I49cbc267fafaee949f45b2e559511aedcf4d8fed
Reviewed-on: https://review.whamcloud.com/43778
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12355 llite: MS_* flags and SB_* flags split 79/40379/4
Shaun Tancheff [Thu, 18 Jul 2019 14:19:03 +0000 (09:19 -0500)]
LU-12355 llite: MS_* flags and SB_* flags split

In kernel 4.20 the MS_* flags should only be used for mount
time flags and SB_* flags for checking super_block.s_flags.
The MS_* flags have moved to a uapi header.

Conflicts:
lustre/llite/llite_lib.c

Lustre-commit: 72a84970e6d2a2d4b3a35f2ee058511be2fda82e
Lustre-change: https://review.whamcloud.com/35019

Linux-commit: e262e32d6bde0f77fb0c95d977482fc872c51996

Test-Parameters: trivial
Change-Id: Ifd64efb16c7795377ece066d01ae04dc004a13ac
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/40379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12355 llite: totalram_pages changed to atomic_long_t 76/40376/3
Shaun Tancheff [Sat, 15 Jun 2019 19:32:26 +0000 (14:32 -0500)]
LU-12355 llite: totalram_pages changed to atomic_long_t

Kernel 5.0 changed totalram_pages to atomic_long_t.
Provide an abstracted accessor, since totalram_pages
is now a function.
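
An accessor of this sort can be sketched as a macro that hides whether totalram_pages is a variable or a function. The sketch is a userspace model: SKETCH_TOTALRAM_PAGES_IS_FUNC stands in for a configure-time check, the value 1024 is a placeholder, and sketch_totalram_pages() is a hypothetical name.

```c
#ifdef SKETCH_TOTALRAM_PAGES_IS_FUNC
/* Newer kernels: totalram_pages() is a function. */
static unsigned long totalram_pages(void)
{
	return 1024;
}
#define sketch_totalram_pages() totalram_pages()
#else
/* Older kernels: totalram_pages is a plain variable. */
static unsigned long totalram_pages = 1024;
#define sketch_totalram_pages() totalram_pages
#endif

/* Callers use the accessor and never touch totalram_pages directly. */
static unsigned long sketch_half_of_ram(void)
{
	return sketch_totalram_pages() / 2;
}
```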

Conflicts:
libcfs/autoconf/lustre-libcfs.m4
libcfs/include/libcfs/libcfs.h
lustre/llite/lproc_llite.c

Lustre-commit: 5ca5b19e8efdfede8ec3405eaced7202984f396b
Lustre-change: https://review.whamcloud.com/35025

Linux-commit: ca79b0c211af63fa3276f0e3fd7dd9ada2439839

Test-Parameters: trivial
Change-Id: I558e42074004e2ee5f79deea0d363e5bea332729
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/40376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-12999 mgs: Cleanup string handling in name_create_mdt 21/43321/3
Shaun Tancheff [Mon, 2 Dec 2019 17:32:50 +0000 (11:32 -0600)]
LU-12999 mgs: Cleanup string handling in name_create_mdt

To satisfy gcc8 -Werror=format-overflow, sanity-check the mdt_idx
before calling snprintf.
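
The general pattern is to bound the index before formatting it, so that the compiler can see how many characters the output may need. The sketch below is an illustration under assumptions: name_create_mdt_sketch(), SKETCH_INDEX_MAX, the name format and the error codes are hypothetical and do not claim to match name_create_mdt().

```c
#include <errno.h>
#include <stdio.h>

#define SKETCH_INDEX_MAX 0xffffU	/* assumed upper bound for an MDT index */

/* Format "<fsname>-MDT<idx>" into buf after validating the index. */
static int name_create_mdt_sketch(char *buf, size_t len,
				  const char *fsname, unsigned int mdt_idx)
{
	int rc;

	/* Reject out-of-range indices before they reach snprintf(). */
	if (mdt_idx > SKETCH_INDEX_MAX)
		return -E2BIG;

	rc = snprintf(buf, len, "%s-MDT%04x", fsname, mdt_idx);
	if (rc < 0 || (size_t)rc >= len)
		return -ENAMETOOLONG;	/* output was truncated */

	return 0;
}
```

With the range check in place, the formatted index is at most four hex digits.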

Lustre-change: https://review.whamcloud.com/36817
Lustre-commit: 298cdb5c0b6136b91e76c9c515bfbc2df99bae0b

Test-Parameters: trivial
Cray-bug-id: LUS-8186
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I2c8764d3715290ee2bd8c96cdc98b532f50632c6
Reviewed-on: https://review.whamcloud.com/43321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14588 o2ib: make config script aware of the ofed symbols 56/43556/2
Serguei Smirnov [Tue, 6 Apr 2021 22:54:01 +0000 (15:54 -0700)]
LU-14588 o2ib: make config script aware of the ofed symbols

LNet o2ib configuration script needs to be aware of the external
ofed dkms symbols when testing for availability of o2ib features
by building "conftest" kernel objects. If this is not done,
symbols from the core kernel are used by default which is
different from what is used when actually building LNet,
at least on Ubuntu. This patch adds the check for external symbols.

Lustre-change: https://review.whamcloud.com/43223
Lustre-commit: bcc5d784826d2d7a8eece28e96fab8b0fa02ab17

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iea566f8a3feb86b8bef2f4501a3abc968d76451a
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43556
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14617 utils: llog_reader updatelog support 58/43658/2
Alexander Boyko [Fri, 16 Apr 2021 09:57:34 +0000 (05:57 -0400)]
LU-14617 utils: llog_reader updatelog support

The patch adds printing of UPDATE_REC to llog_reader. It is useful
for updatelog analysis. Here is an example record:

 [0x50001a21b:0x1233d:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x50001a211:0x475:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x3800182e3:0x475:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x200032c9a:0x245:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x200000001:0x15:0x0] type:write/12 params:2 p_0:3 p_1:4
 p_0 - 12/trusted.lov
 p_1 - 0/
 p_2 - 25972/\x0100000000000000000000000000000000000000000002000...
 p_3 - 25974/\x0800000000000000P\xD1AB006x0000000400EC^\x000000...
 p_4 - 1/

llog processing logic is based on an incrementing record index;
the fix adds checks for it. It also prints more info from the header
and drops the useless "Bit X not set" output.
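
A check on the record index could look like the sketch below. It assumes that "incrementing" means each index must be strictly greater than the previous one, which the message does not spell out; first_bad_index_sketch() is a hypothetical helper and not part of llog_reader.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Scan record indices in file order. Returns the position of the first
 * record whose index does not increase, or n if all records are in order.
 */
static size_t first_bad_index_sketch(const uint32_t *rec_idx, size_t n)
{
	size_t i;

	for (i = 1; i < n; i++) {
		if (rec_idx[i] <= rec_idx[i - 1])
			return i;
	}
	return n;
}
```

A reader could stop, or report the record, at the returned position instead of trusting the rest of the log.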

Lustre-change: https://review.whamcloud.com/43343
Lustre-commit: 9962d6f84db5fd587bbe13640a9361c2872f3728

Test-Parameters: trivial
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id50de15040526dc07ae708ac5db046832706be31
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43658
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3] 45/43545/2
Jian Yu [Wed, 5 May 2021 17:23:27 +0000 (10:23 -0700)]
LU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.22.1.el8_3 for Lustre client.

Test-Parameters: trivial clientdistro=el8.3

Change-Id: I1a3152d95822a74e05f9b44f590a6cdb1f8b02b6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43545
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7] 26/43626/2
Jian Yu [Mon, 10 May 2021 19:31:58 +0000 (12:31 -0700)]
LU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.25.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ic846d648c45476cc4886ce86577605bf3e66d935
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43626
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2] 32/43632/2
Jian Yu [Mon, 10 May 2021 21:27:19 +0000 (14:27 -0700)]
LU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2]

Update SLES12 SP5 kernel to 4.12.14-122.66.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ib2bf4795ccb21dbd0bb9202228ff32d73a203eee
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43632
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14553 changelog: eliminate mdd_changelog_clear warning 55/43555/2
Olaf Faaland [Thu, 25 Mar 2021 01:35:10 +0000 (18:35 -0700)]
LU-14553 changelog: eliminate mdd_changelog_clear warning

When handling a changelog_clear request, the user may specify a
range of indices which do not exist.  Similarly, the user may
specify a changelog user which does not exist.  Neither indicates
a problem within Lustre that justifies a console warning.

Change those cases to CDEBUG.

Lustre-change: https://review.whamcloud.com/43125
Lustre-commit: 6b183927e19715d093c80a35ebc42a1cda5e70e2

Test-Parameters: trivial
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I64bab12ef4978c4bf7139f5f36a39f9b109616fb
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43555
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14603 ptlrpc: quiet messages for unsupported opcodes 60/43260/3
Andreas Dilger [Sun, 11 Apr 2021 02:04:30 +0000 (20:04 -0600)]
LU-14603 ptlrpc: quiet messages for unsupported opcodes

Quiet messages for OST_FALLOCATE and OST_SEEK RPCs that can
be sent from 2.14.0 clients.

Lustre-change: https://review.whamcloud.com/43257
Lustre-commit: TBD (from c7427f6618308996e76718baeba492c0b09dd5b3)

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I35496168e3aa29ecb06076654ef0aa97ba2540e5
Reviewed-on: https://review.whamcloud.com/43260
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>