Whamcloud - gitweb
fs/lustre-release.git
2 years ago LU-12751 tests: add missing error() 78/45578/3
Alex Zhuravlev [Wed, 11 Sep 2019 14:32:21 +0000 (17:32 +0300)]
LU-12751 tests: add missing error()

nothing else I can say

Lustre-change: https://review.whamcloud.com/36159
Lustre-commit: 78f7b7709f9b45b5faae6e7c7b3093c246a08086

Test-Parameters: trivial

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I040771e57ec6f6c6bfbde5a21358c6747f4f20dc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45578
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alena Nikitenko <anikitenko@ddn.com>
2 years ago LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4] 13/45513/4
Jian Yu [Wed, 17 Nov 2021 20:43:25 +0000 (12:43 -0800)]
LU-15196 kernel: kernel update RHEL8.4 [4.18.0-305.25.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.25.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Ic70f7330f90a36646bb36e0c6015ea22882b20b9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45513
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12410 lnet: Add additional output to sanity-lnet.sh 88/44188/3
Chris Horn [Thu, 19 Sep 2019 19:01:05 +0000 (14:01 -0500)]
LU-12410 lnet: Add additional output to sanity-lnet.sh

Add wrappers around ip netns exec and lnetctl commands to generate
some additional test output. This makes it easier to see what each
test case is doing from the test script output, and aids in debugging
any problems.

Lustre-change: https://review.whamcloud.com/36242
Lustre-commit: 32528a689889989607a34b21efa583429bda1422

Test-parameters: trivial testlist=sanity-lnet

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I95b18cb3a090527548a8f9e65845eb4a18dea6d6
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44188
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago New release 2.12.8 2.12.8 v2_12_8
Oleg Drokin [Thu, 18 Nov 2021 19:04:45 +0000 (14:04 -0500)]
New release 2.12.8

Change-Id: I33decc215454eb6bc85361dfd7d68a11db4113c4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler 68/45568/2
Lei Feng [Tue, 12 Oct 2021 06:33:22 +0000 (14:33 +0800)]
LU-14587 ptlrpc: remove LASSERT in nrs_polices proc handler

It's not necessary to LASSERT() in the nrs_polices proc handler.
Logging with CERROR() and returning an error is good enough.

Lustre-change: https://review.whamcloud.com/45200
Lustre-commit: 9997f94d4b6ee335d2bf86f94bd43464d5b8f061

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I09f06dc4ab90e49b2df66a9b47a74678c64cdd2f
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45568
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-9704 grant: ignore grant info on read resend 74/45474/2
Vladimir Saveliev [Wed, 3 Nov 2021 10:52:14 +0000 (13:52 +0300)]
LU-9704 grant: ignore grant info on read resend

The following scenario causes a message like "claims 28672 GRANT, real
grant 0" to appear:

 1. the client owns X grants and sends an rpc to shrink part of them.
 2. the server fails over, so the shrink rpc has to be resent.
 3. on client reconnect, the server and the client sync on the initial
 amount of grants for the client.
 4. the shrink rpc is resent. If server disk space is sufficient, the
 shrink does not happen and the client adds the amount of grants it was
 going to shrink to its new initial amount of grants. Now the client
 thinks that it owns more grants than it does from the server's point
 of view.
 5. the client consumes grants and sends rpcs to the server. The server
 avoids allocating new grants for the client if the current amount of
 grant is big enough:
static long tgt_grant_alloc(struct obd_export *exp, u64 curgrant,
...
        if (curgrant >= want || curgrant >= ted->ted_grant + chunk)
                RETURN(0);
 6. the client continues consuming grants, which eventually leads to
 complaints like "claims 28672 GRANT, real grant 0".

When read and set_info:shrink RPCs are resent, grant info should
be ignored as it was reset on reconnect.
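
The rule above can be sketched in C as follows. This is only an
illustration under stated assumptions: client_grant, grant_reply and
apply_reply_grant() are hypothetical names and do not mirror the real
client code. The sketch shows that grant carried by the reply to a resent
request is skipped, because the client's grant was already reset to the
amount negotiated at reconnect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a client's grant state. */
struct client_grant {
	uint64_t avail;          /* grant the client believes it owns */
};

/* Hypothetical reply fields; not the real Lustre wire format. */
struct grant_reply {
	uint64_t returned_grant; /* grant handed back by the server */
	bool     is_resend;      /* request was resent after a reconnect */
};

/*
 * Apply the grant carried in a reply. For a resent request the value
 * refers to the pre-reconnect state, which was already replaced by the
 * amount negotiated at reconnect, so it is skipped.
 */
static uint64_t apply_reply_grant(struct client_grant *cg,
				  const struct grant_reply *rep)
{
	if (!rep->is_resend)
		cg->avail += rep->returned_grant;
	return cg->avail;
}
```

With this guard, step 4 of the scenario no longer inflates the client's
view of its grant.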

Tests to illustrate the issue are added.

Lustre-change: https://review.whamcloud.com/45371
Lustre-commit: TBD

Change-Id: I8af1db287dc61c713e5439f4cf6bd652ce02c12c
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45474
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5] 28/45528/3
Jian Yu [Mon, 15 Nov 2021 19:12:16 +0000 (11:12 -0800)]
LU-15126 kernel: new kernel [RHEL 8.5 4.18.0-348.2.1.el8_5]

This patch makes changes to support new RHEL 8.5 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.5

Lustre-change: https://review.whamcloud.com/45285
Lustre-commit: TBD (from a1b4ee323ad650d2fdff3754596771dd0c8df507)

Change-Id: I068f091817126fffc14402254f45dcd75ba7f3fc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45528
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14128 lov: correctly set OST obj size 48/45448/4
Bobi Jam [Wed, 3 Nov 2021 18:19:09 +0000 (14:19 -0400)]
LU-14128 lov: correctly set OST obj size

When a PFL file is extended to a size located at a stripe boundary
in a component, the truncate won't set the size of the OST object
in the prior stripe.

This patch records the prior stripe in
lov_layout_raid0::lo_trunc_stripeno, adds that stripe to the
truncate IO, and enqueues the lock covering it.
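
To illustrate why the prior stripe matters, here is a small C sketch of
plain RAID0 stripe arithmetic. It is an assumption-laden simplification:
sketch_stripe_of() and sketch_last_data_stripe() are hypothetical helpers,
and PFL component offsets are ignored. When the new size sits exactly on a
stripe boundary, the byte at that offset and the last byte of the file
fall into different stripes.

```c
#include <assert.h>
#include <stdint.h>

/* Stripe index that holds the byte at the given file offset (RAID0). */
static unsigned int sketch_stripe_of(uint64_t offset, uint64_t stripe_size,
				     unsigned int stripe_count)
{
	assert(stripe_size != 0 && stripe_count != 0);
	return (unsigned int)((offset / stripe_size) % stripe_count);
}

/* Stripe holding the last byte of a file of the given non-zero size. */
static unsigned int sketch_last_data_stripe(uint64_t size,
					    uint64_t stripe_size,
					    unsigned int stripe_count)
{
	assert(size != 0);
	return sketch_stripe_of(size - 1, stripe_size, stripe_count);
}
```

For a 1 MiB stripe size, 4 stripes and a new size of 2 MiB, the offset
maps to stripe 2 while the last byte lives in stripe 1, the prior stripe.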

Lustre-change: https://review.whamcloud.com/40581
Lustre-commit: 98015004516cad1173e2bac2a4695bdc56e4d9a4

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ic5d8e3c16f950003736cd6dbd5af404613f818c7
Reviewed-on: https://review.whamcloud.com/45448
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14543 target: prevent overflowing of tgd->tgd_tot_granted 90/45490/2
Vladimir Saveliev [Fri, 19 Mar 2021 12:08:47 +0000 (15:08 +0300)]
LU-14543 target: prevent overflowing of tgd->tgd_tot_granted

If tgd->tgd_tot_granted < ted->ted_grant then there should not be:
   tgd->tgd_tot_granted -= ted->ted_grant;
which breaks tgd->tgd_tot_granted.
In case of obvious ted->ted_grant damage, recalculate
tgd->tgd_tot_granted using the list of exports.

The same change is made for tgd->tgd_tot_dirty.

This patch also adds a sanity check for the exp->exp_target_data.ted_grant
increase in tgt_grant_alloc() to catch grant counting corruption as
soon as it happens.
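
The guarded subtraction can be sketched in C as below. This is a minimal
model, not the real target code: target_grant, export_grant,
recalc_tot_granted() and drop_export_grant() are hypothetical names, and
the export list is reduced to a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the per-export and per-target grant data. */
struct export_grant {
	uint64_t grant;
};

struct target_grant {
	uint64_t tot_granted;
	struct export_grant *exports;
	size_t nr_exports;
};

/* Rebuild the total from all exports except the one being dropped. */
static uint64_t recalc_tot_granted(const struct target_grant *tg, size_t skip)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < tg->nr_exports; i++)
		if (i != skip)
			sum += tg->exports[i].grant;
	return sum;
}

/* Drop one export's grant without letting the unsigned total wrap. */
static void drop_export_grant(struct target_grant *tg, size_t idx)
{
	assert(idx < tg->nr_exports);
	uint64_t g = tg->exports[idx].grant;

	if (tg->tot_granted < g)
		tg->tot_granted = recalc_tot_granted(tg, idx); /* damaged value */
	else
		tg->tot_granted -= g;
	tg->exports[idx].grant = 0;
}
```

The recalculation path is taken only when the per-export value is
obviously larger than the total, so the common case stays a single
subtraction.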

Lustre-change: https://review.whamcloud.com/45474
Lustre-commit: bb5d81ea95502fb5709e176b561b70aa5280ee07

Fixes: af2d3ac30e ("LU-11939 tgt: Do not assert during grant cleanup")
Change-Id: I36ba7496f7b72b4881e98c06ec254a8eefd4c13f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-11939 tgt: Do not assert during grant cleanup 89/45489/3
Patrick Farrell [Fri, 8 Feb 2019 17:14:06 +0000 (12:14 -0500)]
LU-11939 tgt: Do not assert during grant cleanup

Client/server grant inconsistencies discovered during
cleanup are indicative of a bug, but any problems they
would cause have already occurred at this point.

So do not assert during this cleanup.

Lustre-change: https://review.whamcloud.com/34215
Lustre-commit: af2d3ac30eafead6b47c5db20d76433c091d89de

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic9b827b1005bc321a290505a368349699ddf2f38
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45489
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-15184 llite: properly detect SELinux disabled case 27/45527/3
Sebastien Buisson [Mon, 15 Nov 2021 19:06:31 +0000 (11:06 -0800)]
LU-15184 llite: properly detect SELinux disabled case

Usually, security_dentry_init_security() returns -EOPNOTSUPP when
SELinux is disabled. But on some kernels (e.g. rhel 8.5) it returns
0 when SELinux is disabled, and in this case the security context is
empty.
So in both cases make sure the security context name is not set, which
means "SELinux is disabled" for the rest of the code.
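
A rough C sketch of this normalization follows. The names sec_ctx,
normalize_sec_ctx() and selinux_enabled() are hypothetical and only model
the decision; the real code calls security_dentry_init_security() and
keeps its own context fields.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of a security-context lookup. */
struct sec_ctx {
	const char *name; /* xattr name, NULL when SELinux is disabled */
	size_t      size; /* length of the context blob */
};

/*
 * Normalize the lookup result: both -EOPNOTSUPP and "success with an
 * empty context" mean SELinux is disabled, so clear the name and
 * report success. Other return codes are passed through unchanged.
 */
static int normalize_sec_ctx(int rc, struct sec_ctx *ctx)
{
	if (rc == -EOPNOTSUPP || (rc == 0 && ctx->size == 0)) {
		ctx->name = NULL;
		ctx->size = 0;
		return 0;
	}
	return rc;
}

/* The rest of the code only looks at whether the name is set. */
static bool selinux_enabled(const struct sec_ctx *ctx)
{
	return ctx->name != NULL;
}
```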

Lustre-change: https://review.whamcloud.com/45501
Lustre-commit: TBD (from 85779753abe0451e2b0b82dcf5d4a4d111b0bfb8)

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3b9608f9768288de89570c158e8429560fa0213f
Reviewed-on: https://review.whamcloud.com/45527
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years ago LU-14413 test: test for overstriping for sanity 27M 54/44354/7
James Simmons [Wed, 28 Jul 2021 00:10:29 +0000 (20:10 -0400)]
LU-14413 test: test for overstriping for sanity 27M

The introduction of sanity 27M broke interop with 2.12 LTS since
over striping doesn't exist in that version. Adjust the test to
use over striping if the client supports it, otherwise just use
traditional striping.

Lustre-change: https://review.whamcloud.com/44340
Lustre-commit: 4e1f9c4bd1d96063a1fbb2dfaab41b15836167ab

Test-Parameters: trivial testlist=sanity env=ONLY=27M
Change-Id: I2d788a116cbb749a83d6cec36f97d06533b32421
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44340
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44354
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years ago LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write 41/43541/2
Alexander Boyko [Thu, 8 Apr 2021 08:23:54 +0000 (04:23 -0400)]
LU-14598 ofd: fix for IDIF sequence at ofd_preprw_write

During recovery a write operation could create and load a sequence
if it arrives before the creation request from MDT0. ofd_preprw_write()
uses wrong logic for taking the sequence for IDIF fids. If the oid
overflows 32 bits and becomes part of the IDIF sequence, the write
request loads the wrong ofd sequence, which is then used for other IO.
The next create from MDT0 causes an error:
Too many FIDs to precreate OST replaced or reformatted...

Test 122b reproduces the issue where the OST uses a wrong sequence for
MDT0 IDIF. This error requires an object id greater than 32 bits and a
write request during recovery that is processed before the create
request from MDT0.
For a visible error on the console the last object id should be
1<<32 + (OST_MAX_PRECREATE * 5). The error is:
lustre-OST0000: Too many FIDs to precreate OST replaced or
    reformatted: LFSCK will clean up
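
The arithmetic behind this can be sketched in C. Everything below is an
assumption for illustration: the SKETCH_ constants (an IDIF base of
0x100000000 and OST_MAX_PRECREATE of 20000) and the sequence layout in
sketch_idif_seq() are modeled on what the Lustre headers are believed to
define and should be checked against the tree in use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, modeled on Lustre's headers; verify before relying on them. */
#define SKETCH_FID_SEQ_IDIF      0x100000000ULL
#define SKETCH_OST_MAX_PRECREATE 20000ULL

/* Assumed IDIF layout: base | OST index << 16 | bits 32..47 of the object id. */
static uint64_t sketch_idif_seq(uint64_t oid, uint32_t ost_idx)
{
	return SKETCH_FID_SEQ_IDIF | ((uint64_t)ost_idx << 16) |
	       ((oid >> 32) & 0xffff);
}

/* The last object id the message cites for a visible console error. */
static uint64_t sketch_last_id_for_error(void)
{
	return (1ULL << 32) + SKETCH_OST_MAX_PRECREATE * 5;
}
```

Under these assumptions, an object id below 1<<32 on OST 0 stays in the
base IDIF sequence, while the id from the message lands in the next one,
which is why the sequence chosen by the write path matters.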

Lustre-change: https://review.whamcloud.com/43248
Lustre-commit: 747fed818be5a4e09281ab1d9fd5b3a13763ab40

HPE-bug-id: LUS-9595
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I09e6f88b1f0d03fec59b24ef096cbc7baa5388ae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/43541
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14565 ofd: Do not rely on tgd_blockbit 55/43955/9
Arshad Hussain [Mon, 29 Mar 2021 05:22:11 +0000 (10:52 +0530)]
LU-14565 ofd: Do not rely on tgd_blockbit

tgd_blockbit holds the recordsize bits set during mkfs.
Once set, it does not change. However, 'zfs set'
can be used to change the OST blocksize. Instead
of using the cached value of 'tgd_blockbit', always
calculate the blocksize bits, which may have
changed.
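
A minimal C sketch of computing the bits on demand instead of trusting a
cached value. The struct sketch_target and both helpers are hypothetical;
the sketch assumes the block size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power-of-two block size, i.e. the number of block bits. */
static unsigned int blocksize_to_bits(uint32_t blocksize)
{
	unsigned int bits = 0;

	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	while (blocksize > 1) {
		blocksize >>= 1;
		bits++;
	}
	return bits;
}

/* Hypothetical target: current block size plus a value cached at mkfs. */
struct sketch_target {
	uint32_t     cur_blocksize;
	unsigned int cached_bits;
};

/* Always derive the bits from the current block size, never the cache. */
static unsigned int current_block_bits(const struct sketch_target *t)
{
	return blocksize_to_bits(t->cur_blocksize);
}
```

If the block size changes from 128 KiB to 1 MiB after mkfs, the cached
value stays at 17 while the computed value follows the change to 20.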

Test-case: sanity/104c added

Conflicts:
lustre/mdt/mdt_handler.c

Lustre-change: https://review.whamcloud.com/43154/
Lustre-commit: 8ee6e1c8825c4fabfd6c39db11081839ca53d454

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Icc100cca0d5ae492c41d60f0bf97512450f796bc
Reviewed-on: https://review.whamcloud.com/43955
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13054 ldiskfs: split htree_lock as separate patch 21/44121/3
Yang Sheng [Sun, 26 Apr 2020 11:59:16 +0000 (19:59 +0800)]
LU-13054 ldiskfs: split htree_lock as separate patch

The htree_lock part is identical across the different
distro versions of the pdirop patch. So move it out as a
separate patch to reduce maintenance effort.

Lustre-change: https://review.whamcloud.com/38372
Lustre-commit: 42880f9502ba57b7ee35559d7b07d2f1a3adec72

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I423cc957de37ccdb097c9893f69481ce947ac78c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44121
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-13054 ldiskfs: htree_node wrongly granted 20/44120/3
Yang Sheng [Sun, 26 Apr 2020 11:56:40 +0000 (19:56 +0800)]
LU-13054 ldiskfs: htree_node wrongly granted

The thread was woken up accidentally, so it needs to check
whether the lock was granted after wakeup.
Also fix the issue that major is always set to 0 because
hbit is initialized incorrectly. This should impact
performance, especially for operations in big directories.

kernel BUG at lustre/ldiskfs/htree_lock.c:429!
 Call Trace:
 htree_node_release_all+0x5a/0x80 [ldiskfs]
 htree_unlock+0x22/0x70 [ldiskfs]
 osd_index_ea_delete+0x30e/0xb10 [osd_ldiskfs]
 lod_sub_delete+0x1c8/0x460 [lod]
 lod_delete+0x24/0x30 [lod]
 __mdd_index_delete_only+0x194/0x250 [mdd]
 __mdd_index_delete+0x46/0x290 [mdd]
 mdd_unlink+0x5f8/0xaa0 [mdd]
 mdo_unlink+0x46/0x48 [mdt]
 mdt_reint_unlink+0xbed/0x14b0 [mdt]
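
The recheck-after-wakeup rule can be illustrated with a userspace analogy
in C using POSIX condition variables. This is not the kernel htree_lock
code: sketch_lock_waiter and the waiter_* helpers are hypothetical, and
the point is only that the waiter loops on the "granted" predicate instead
of assuming that any wakeup means the lock was granted.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical waiter: a flag protected by a mutex, plus a condition. */
struct sketch_lock_waiter {
	pthread_mutex_t mtx;
	pthread_cond_t  cond;
	bool            granted;
};

static void waiter_init(struct sketch_lock_waiter *w)
{
	pthread_mutex_init(&w->mtx, NULL);
	pthread_cond_init(&w->cond, NULL);
	w->granted = false;
}

/* Grant the lock and wake the waiter. */
static void waiter_grant(struct sketch_lock_waiter *w)
{
	pthread_mutex_lock(&w->mtx);
	w->granted = true;
	pthread_cond_signal(&w->cond);
	pthread_mutex_unlock(&w->mtx);
}

/* Sleep until granted; an accidental wakeup just re-enters the loop. */
static void waiter_wait(struct sketch_lock_waiter *w)
{
	pthread_mutex_lock(&w->mtx);
	while (!w->granted)
		pthread_cond_wait(&w->cond, &w->mtx);
	pthread_mutex_unlock(&w->mtx);
}
```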

Lustre-change: https://review.whamcloud.com/38371
Lustre-commit: 4597a2b4fc33711f66eb1c21fc125d028bd3f2ec

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I5972961bc78b349214c6756642717d126f0c4b26
Reviewed-on: https://review.whamcloud.com/44120
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7] 54/45354/2
Jian Yu [Mon, 25 Oct 2021 18:47:37 +0000 (11:47 -0700)]
LU-15099 kernel: kernel update RHEL7.9 [3.10.0-1160.45.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.45.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I11c307bfd6a6b353bc7b6fe40bb5d604bc9b3fdc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45354
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10) 55/45355/2
Arshad Hussain [Mon, 25 Oct 2021 18:51:50 +0000 (11:51 -0700)]
LU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10)

ZFS: (2.0.0-1)
Lustre: 608cce73d51 LU-15007 tests: quota enable cmd fix
CentOS: 3.10.0-1160.15.2.el7.x86_64

This patch fixes the two build failures seen below for
the above configuration:

First
~~~~~
In file included from:
/root/zfs/zfs_git_lustre_build/zfs/include/sys/spa.h:39:0,
from libmount_utils_zfs.c:32:
/root/zfs/<path>/.../sys/zfs_context.h:110:27:
fatal error: sys/byteorder.h: No such file or directory
#include <sys/byteorder.h>

Second
~~~~~~
gcc -rdynamic -shared -export-dynamic -pthread \
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libzfs/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libnvpair/.libs
-o mount_osd_zfs.so \
`ar -t libmount_utils_zfs.a` \
-ldl -lzfs -lnvpair -lzpool
/usr/bin/ld: cannot find -lzpool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
collect2: error: ld returned 1 exit status

Lustre-change: https://review.whamcloud.com/45016
Lustre-commit: 8931f7e4e5da39389a79eff11dc04bb468beb715

Change-Id: Iaf868391e414deb7ac8df43847250bbcd0115d5e
Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45355
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14124 target: set OBD_MD_FLGRANT in read's reply 71/45471/2
Vladimir Saveliev [Wed, 20 Oct 2021 10:32:11 +0000 (13:32 +0300)]
LU-14124 target: set OBD_MD_FLGRANT in read's reply

If tgt_grant_shrink() decides not to shrink grants, the client is
supposed to restore its cl_grant_avail in osc_update_grant(). In the
read case OBD_MD_FLGRANT is not set in the reply's body->oa.o_valid, so
osc_update_grant() misses the cl_grant_avail update. As a result the
server keeps thinking that the client has a lot of grants while the
client thinks that it is badly short of grants. That may lead to
performance degradation.
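
The flag dependency can be modeled in a few lines of C. All names here are
hypothetical (SKETCH_MD_FLGRANT, sketch_reply_body, sketch_client and both
functions), and the flag value is made up rather than copied from the
Lustre headers; the sketch only shows that the client ignores the grant
field unless the reply marks it valid.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up flag bit; the real OBD_MD_FLGRANT value is not reproduced here. */
#define SKETCH_MD_FLGRANT (1ULL << 0)

struct sketch_reply_body {
	uint64_t valid; /* bitmask of fields that carry meaningful data */
	uint64_t grant; /* grant returned to the client */
};

struct sketch_client {
	uint64_t grant_avail;
};

/* Client side: the grant field is only trusted when the flag is set. */
static void sketch_update_grant(struct sketch_client *cli,
				const struct sketch_reply_body *body)
{
	if (body->valid & SKETCH_MD_FLGRANT)
		cli->grant_avail += body->grant;
}

/* Server side: fill the grant field and mark it valid, for reads too. */
static void sketch_fill_reply(struct sketch_reply_body *body,
			      uint64_t unshrunk)
{
	body->grant = unshrunk;
	body->valid |= SKETCH_MD_FLGRANT;
}
```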

A test to illustrate the issue is included.

Lustre-change: https://review.whamcloud.com/43375
Lustre-commit: 4894683342d77964daeded9fbc608fc46aa479ee

Test-Parameters: testlist=sanity
Change-Id: Ibe7ce0af5701226c8be3ae3f9ad57c354791fa0f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45471
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2] 64/45364/2
Jian Yu [Mon, 25 Oct 2021 23:40:08 +0000 (16:40 -0700)]
LU-15160 kernel: kernel update SLES12 SP5 [4.12.14-122.91.2]

Update SLES12 SP5 kernel to 4.12.14-122.91.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ia6620869fa84d72f8d22c4a8a039600037ddb2d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45364
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14696 llite: check read only mount for setquota 23/44923/3
Hongchao Zhang [Wed, 15 Sep 2021 11:44:23 +0000 (19:44 +0800)]
LU-14696 llite: check read only mount for setquota

Setting quota should fail if the mount is read-only.

Lustre-change: https://review.whamcloud.com/43765
Lustre-commit: 29e00cecc6019fbdb5bd98511970970ac5ef5318

Change-Id: I966ac71d0a4a72dcb998f09ffc0f99ae28498e27
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44923
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years ago LU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4] 51/44951/2
Jian Yu [Thu, 16 Sep 2021 00:53:27 +0000 (17:53 -0700)]
LU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.19.1.el8_4 for Lustre client.

Test-Parameters: trivial clientdistro=el8.4 testlist=sanity

Change-Id: Icedc6cf2a5678cfbce76c47507137c0ea41d0b06
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44951
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7] 76/44876/2
Jian Yu [Thu, 9 Sep 2021 00:38:05 +0000 (17:38 -0700)]
LU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.42.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9 \
testlist=sanity

Change-Id: I377ea5d1e28c50b1087dfca7cb32f44afb9bf5f5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44876
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1] 63/44863/2
Jian Yu [Tue, 7 Sep 2021 19:56:49 +0000 (12:56 -0700)]
LU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1]

Update SLES12 SP5 kernel to 4.12.14-122.83.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I2b35d129550b895324bb3e2e61910ad10e846f03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44863
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11546 utils: enable large_dir for ldiskfs 81/36781/6
Li Dongyang [Wed, 23 Oct 2019 00:10:34 +0000 (11:10 +1100)]
LU-11546 utils: enable large_dir for ldiskfs

Format MDT with "large_dir" option by default,
to get over the 10M-entry limit for the directories.

Lustre-change: https://review.whamcloud.com/36555
Lustre-commit: cd1faa0124f21e12a5ecd83c709c13918264fc86

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie51e6ce28b5f00adc9958de24794a760d9b43b77
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36781
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-12627 ofd: reset fti_attr in ofd_lvbo_update() 69/44269/5
Wang Shilong [Sat, 3 Aug 2019 06:27:22 +0000 (14:27 +0800)]
LU-12627 ofd: reset fti_attr in ofd_lvbo_update()

This patch tries to fix the following panic:

(ofd_internal.h:440:tsi2ofd_info()) ASSERTION( info->fti_attr.la_valid == 0 ) failed:
(ofd_internal.h:440:tsi2ofd_info()) LBUG
[ 5321.108598] Call Trace:
[ 5321.109347]  [<ffffffffc06fc8bc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 5321.111342]  [<ffffffffc06fc96c>] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 5321.113026]  [<ffffffffc147631a>] ofd_preprw+0xcfa/0x1160 [ofd]
[ 5321.114643]  [<ffffffffc0bb934c>] tgt_brw_write+0xc7c/0x1cf0 [ptlrpc]
[ 5321.116373]  [<ffffffffc0bbc50a>] tgt_request_handle+0x91a/0x15c0 [ptlrpc]
[ 5321.118230]  [<ffffffffc0b61636>] ptlrpc_server_handle_request+0x256/0xb00 [ptlrpc]
[ 5321.120318]  [<ffffffffc0b6516c>] ptlrpc_main+0xbac/0x1560 [ptlrpc]
[ 5321.122001]  [<ffffffff84cc1c31>] kthread+0xd1/0xe0
[ 5321.123023]  [<ffffffff85374c37>] ret_from_fork_nospec_end+0x0/0x39
[ 5321.124066]  [<ffffffffffffffff>] 0xffffffffffffffff

If this is a server lock, tgt_brw_lock() will eventually call
ofd_lvbo_update() upon lock cancellation, which uses @fti_attr
and pollutes its value:

|->ptlrpc_main
 |->lu_context_enter(le_ctx)
  |->tgt_brw_write
   |->tgt_brw_lock
    |->tgt_extent_lock
     |->ldlm_cli_enqueue_local
      |->ldlm_lock_enqueue
       |->ldlm_run_ast_work
        |->ptlrpc_check_set
          |->ldlm_cb_interpret
           |->ldlm_handle_ast_error
            |->ofd_lvbo_update
             |->ofd_attr_get polluted @info->fti_attr

  |->tgt_brw_write
   |->ofd_preprw
    |->tsi2ofd_info
      |->ASSERTION(info->fti_attr.la_valid == 0)

 |->lu_context_exit(le_ctx)--->memset @fti_attr

To fix this problem, reset fti_attr->la_valid before
ofd_lvbo_update() returns, just like ofd_lvbo_init() does.
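
The pattern of resetting shared per-thread scratch state before returning
can be sketched in C as below. The sketch_ structs and functions are
hypothetical reductions of the thread info involved; they only show the
contract between a helper that borrows the scratch attribute and a later
user that asserts it is clean.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical reduction of the per-thread scratch attribute. */
struct sketch_attr {
	uint64_t la_valid;
	uint64_t la_size;
};

struct sketch_thread_info {
	struct sketch_attr fti_attr;
};

/* Helper that borrows the scratch attr and must leave it clean. */
static uint64_t sketch_lvbo_update(struct sketch_thread_info *info,
				   uint64_t size_on_disk)
{
	info->fti_attr.la_size = size_on_disk;
	info->fti_attr.la_valid = 1;

	uint64_t size = info->fti_attr.la_size;

	info->fti_attr.la_valid = 0; /* reset before returning */
	return size;
}

/* Later user of the same per-thread info; expects a clean scratch attr. */
static struct sketch_thread_info *
sketch_get_info(struct sketch_thread_info *info)
{
	assert(info->fti_attr.la_valid == 0);
	return info;
}
```

Without the reset line, the assertion in sketch_get_info() would fire on
the next use within the same context, which matches the call chain shown
in the message.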

Lustre-change: https://review.whamcloud.com/35685
Lustre-commit: 8ffbe6b82fac1d3e4d4391bcba74dc2ee1411a69

Change-Id: Ib6b448dd21603cfe0305d8425862a96ef3f7fee8
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44269
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14876 out: don't connect to busy MDS-MDS export 62/44362/6
Mikhail Pershin [Wed, 21 Jul 2021 15:14:01 +0000 (18:14 +0300)]
LU-14876 out: don't connect to busy MDS-MDS export

The MDS-MDS connection is missing a check for busy requests upon
reconnect, so a resent request can be executed concurrently with
the original request.

- in ptlrpc_server_check_resend_in_progress() remove exception
  for bulk requests, they can be compared by XID nowadays.
  This prevents OUT requests vs resent execution as well.
- fix messages in target_handle_connect() to report correct
  information about connection details
- in out_handle() check for last_xid only once per OUT_UPDATE
- test 110m is added to recovery-small to reproduce the issue

Lustre-change: https://review.whamcloud.com/44390
Lustre-commit: 301d76a71176c186129231ddd1323bae21100165

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I2ad183674d59a2cdeab0037bd8551c607b10ffeb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-11518 ldlm: cancel LRU improvement 07/41007/3
Vitaly Fertman [Wed, 16 Dec 2020 16:54:10 +0000 (11:54 -0500)]
LU-11518 ldlm: cancel LRU improvement

Add a @batch parameter to cancel LRU, which means that if at least 1
lock is cancelled, try to cancel at least a batch of locks. This
functionality will be used in later patches.

Limit LRU cancel to 1 thread only, except for callers which
have the @max limit given (ELC), as the LRU may otherwise be left not
fully cleaned up.
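
One possible reading of the @batch semantics, sketched in C. This is an
interpretation, not the ldlm implementation: sketch_cancel_lru() is a
hypothetical function, the LRU is reduced to an array of "eligible" flags
ordered oldest first, and max == 0 is assumed to mean "no limit". The
single-thread restriction is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * lru[i] is true when lock i is eligible under the normal LRU policy.
 * Walk from the oldest entry and return the number of locks cancelled.
 */
static size_t sketch_cancel_lru(const bool *lru, size_t n,
				size_t batch, size_t max)
{
	size_t cancelled = 0;

	for (size_t i = 0; i < n; i++) {
		if (max != 0 && cancelled >= max)
			break;
		/* Policy says keep: override only to fill a started batch. */
		if (!lru[i] && !(cancelled > 0 && cancelled < batch))
			break;
		cancelled++;
	}
	return cancelled;
}
```

With one eligible lock followed by ineligible ones and a batch of 3, the
walk cancels 3 locks; with nothing eligible at the head, it cancels none.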

Lustre-change: https://review.whamcloud.com/39561
Lustre-commit: 3d4b5dacb3053f39d79d59860a903a19e76b9318

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ide21c4a2b2209b8a721249466ea1e651c8532c8a
HPE-bug-id: LUS-8678
Reviewed-on: https://es-gerrit.dev.cray.com/157067
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41007
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
3 years ago LU-11768 test: make at_max to take effect 45/41345/2
Hongchao Zhang [Thu, 10 Oct 2019 20:22:25 +0000 (16:22 -0400)]
LU-11768 test: make at_max to take effect

In test_6 of sanity-quota, "at_max" won't affect
"at_current" if no RPC is sent on that import, so
the following DQACQ request still gets a larger
timeout value and triggers the watchdog.

Lustre-change: https://review.whamcloud.com/36431
Lustre-commit: 550af84a91505c85824ffad2990d31c8e8ab4dd9

Fixes: d8226b93 ("LU-11768 test: limit at_max to timeout in time")
Test-Parameters: trivial testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Iccc969459647aa70da6f6ecb0d8d13a404bf8088
Reviewed-on: https://review.whamcloud.com/41345
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13423 tests: cleanup_netns correctly set result 03/44203/3
Shaun Tancheff [Tue, 7 Apr 2020 23:05:06 +0000 (18:05 -0500)]
LU-13423 tests: cleanup_netns correctly set result

The existence test for 'test1pl' should not result in
cleanup_netns returning failure to the caller.

A slightly more terse if/else can be used to ensure the
caller is notified of failure only in the case of
test1pl not being deleted.

Lustre-change: https://review.whamcloud.com/38157
Lustre-commit: 410b655c71849e5a26251f7c187b19ed8f504bd7

Test-Parameters: trivial
HPE-bug-id: LUS-8713

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I85dee20ec0f0ccd0be17597431fcedda9469d9da
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44203
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14204 tests: make sure we have a single import 98/40998/2
Sebastien Buisson [Wed, 9 Dec 2020 17:53:12 +0000 (18:53 +0100)]
LU-14204 tests: make sure we have a single import

In sanity, retrieve the exact name of the import being used on the
client, in order to properly get information such as lock_count
or lru_size.

Change-Id: I065b7da7990c7171d5baa24f3400c5f8ffc12fc9
Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/40998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14098 obdclass: try to skip corrupted llog records 96/44396/2
Alex Zhuravlev [Mon, 26 Jul 2021 06:18:06 +0000 (09:18 +0300)]
LU-14098 obdclass: try to skip corrupted llog records

If an llog's header or record is found corrupted, then
ignore the remaining records and try with the next one.

Lustre-commit: 910eb97c1b43a44a9da2ae14c3b83e28ca6342fc
Lustre-change: https://review.whamcloud.com/40754

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86a682a8874a2184e8891ff0ee8a68414d232a79
Reviewed-on: https://review.whamcloud.com/44396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14733 o2iblnd: Avoid double posting invalidate 17/44217/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:01 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Avoid double posting invalidate

When the kib_tx is provisioned during kiblnd_fmr_pool_map(), spare
WRs in the kib_fast_reg_descriptor are setup and the mapping of
pages is given to the mr.

kiblnd_post_tx_locked() then posts the spare WRs from the
kib_fast_reg_descriptor.

	if (rc == 0)
		return 0;

The code returns and the kib_fast_reg_descriptor still contains
the spare WRs. The next time the kib_tx is used, the
now obsolete WRs will be inadvertently posted. For rdmavt, the
obsolete invalidate will cause an -EINVAL to be returned from
the post send.

Fix by adding a state variable frd_posted to the kib_fast_reg_descriptor.
The variable is set to false in kiblnd_fmr_pool_unmap().
kiblnd_post_tx_locked() is adjusted to avoid prepending the
kib_fast_reg_descriptor WRs when frd_posted is true. After
the post succeeds, frd_posted is set to true.

Lustre-change: https://review.whamcloud.com/44190
Lustre-commit: 5930576791e864529e6ef9b46f3e09cc4b635fc2

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: I426dd05e635392e75d1aa48808782a229e83ce5f
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44217
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7] 77/44377/2
Jian Yu [Thu, 22 Jul 2021 07:31:50 +0000 (00:31 -0700)]
LU-14871 kernel: kernel update RHEL7.9 [3.10.0-1160.36.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.36.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ie2898b1df28c8b99ea4099e94baafe388c6aa626
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44377
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14733 o2iblnd: Move racy NULL assignment 16/44216/2
Mike Marciniszyn [Wed, 7 Jul 2021 19:16:00 +0000 (15:16 -0400)]
LU-14733 o2iblnd: Move racy NULL assignment

kiblnd_fmr_pool_unmap() can race map and subsequent processing
because of this flaw in unmap:

	if (frd) {
		frd->frd_valid = false;
		spin_lock(&fps->fps_lock);
		list_add_tail(&frd->frd_list, &fpo->fast_reg.fpo_pool_list);
		spin_unlock(&fps->fps_lock);
		fmr->fmr_frd = NULL;
	}

The fmr can be pulled off the list in kiblnd_fmr_pool_unmap() on
another CPU and fmr_frd could be in a state of flux and
potentially be seen incorrectly later on as the kib_tx is processed.

Fix by moving the fmr_frd assignment to before the fmr is added to the
list.
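
The reordering can be sketched in userspace as follows. This is an illustration under simplifying assumptions: the lock is a placeholder flag, the pool is a singly linked list that does not preserve list_add_tail() ordering, and all type names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct frd {
	bool frd_valid;
	struct frd *next;	/* simplified pool linkage */
};

struct pool {
	struct frd *head;	/* simplified stand-in for fpo_pool_list */
	int locked;		/* placeholder for fps_lock */
};

struct fmr {
	struct frd *fmr_frd;
};

static void pool_lock(struct pool *p)   { p->locked = 1; }
static void pool_unlock(struct pool *p) { p->locked = 0; }

static void fmr_unmap(struct fmr *fmr, struct pool *p)
{
	struct frd *frd = fmr->fmr_frd;

	if (frd) {
		frd->frd_valid = false;
		/* Detach first: once on the list, another CPU may take frd. */
		fmr->fmr_frd = NULL;
		pool_lock(p);
		frd->next = p->head;
		p->head = frd;
		pool_unlock(p);
	}
}
```

After the list insertion nothing in this function touches fmr or frd again, which is the property the fix is after.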

Lustre-change: https://review.whamcloud.com/44189
Lustre-commit: 023113fb8946f3565529e7327fdcd90ab9db3ba3

Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Change-Id: Ibddf132a363ecfe9db3cc06287cec873c021d2fb
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44216
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13729 osd-ldiskfs: race access to iam_formats during setup 56/44356/2
Wang Shilong [Tue, 30 Jun 2020 01:12:48 +0000 (09:12 +0800)]
LU-13729 osd-ldiskfs: race access to iam_formats during setup

During OST mounting, two targets may reach
iam_format_guess() at the same time. If @initialized is 0,
they both call iam_lxx_format_init(), but the list operation
inside is not protected by any locking, which eventually causes
list corruption.

Fix this by registering the formats in module init. Since there
are only two formats, just remove the pointless list.

Lustre-change: https://review.whamcloud.com/39213
Lustre-commit: 54d0f5de911af52e7f2a978c4b6cd158fed87dc5

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I6dd5a4d1297792b47fb4b94052465a7e0f9123aa
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/44356
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wangshilong1991@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12836 osd-zfs: Catch all ZFS pool change events 29/43929/3
Tony Hutter [Fri, 12 Mar 2021 01:23:16 +0000 (17:23 -0800)]
LU-12836 osd-zfs: Catch all ZFS pool change events

This change adds the following symlinks:

  vdev_attach-lustre -> statechange-lustre.sh
  vdev_remove-lustre -> statechange-lustre.sh
  vdev_clear-lustre -> statechange-lustre.sh

This makes it so the statechange-lustre.sh script is also called on
all ZFS events that could change the pool state.

Lustre-change: https://review.whamcloud.com/43552
Lustre-commit: e11a47da71a2e2482e4c4cf582d663cd76a2ecab

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Change-Id: I18edc86749e8ab91bb45f21aafd3fd47e78cbaef
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13055 mdd: don't assert on unknown changelog lrh_type 10/43710/7
Mikhail Pershin [Fri, 14 May 2021 17:01:43 +0000 (20:01 +0300)]
LU-13055 mdd: don't assert on unknown changelog lrh_type

Supplemental patch for old server code to prevent assertion
on unknown/new changelog record and user record types
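
A hedged sketch of the pattern: unknown record types are reported to the caller instead of tripping an assertion. The type values and return codes are hypothetical, and whether the real patch skips the record or returns an error is not stated here:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record type values, for illustration only. */
enum { REC_CHANGELOG = 1, REC_CHANGELOG_USER = 2 };

/* Hypothetical results of handling one record. */
enum { REC_HANDLED = 0, REC_SKIPPED = 1 };

static int handle_rec(uint32_t type)
{
	switch (type) {
	case REC_CHANGELOG:
	case REC_CHANGELOG_USER:
		return REC_HANDLED;
	default:
		/* Unknown or newer type: skip it rather than assert. */
		return REC_SKIPPED;
	}
}
```

This lets an older server tolerate records written by newer code.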

Test-Parameters: env=ONLY=160 testlist=sanity serverjob=lustre-master serverbuildno=0
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I5d45c6ef659feb2b143edf6286df9904378171ba
Reviewed-on: https://review.whamcloud.com/43710
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-7791 ldlm: signal vs CP callback race 97/44297/2
Andriy Skulysh [Tue, 3 May 2016 07:41:56 +0000 (10:41 +0300)]
LU-7791 ldlm: signal vs CP callback race

In case of interrupted wait for a CP AST
failed_lock_cleanup() sets LDLM_FL_LOCAL_ONLY, so
the client wouldn't cancel the lock on CP AST.

A lock isn't canceled on the server on reception

Lustre-change: https://review.whamcloud.com/19898
Lustre-commit: 7fff052c930da4822c3b2a13d130da7473a20a58

Cray-bug-id: LUS-2021
Change-Id: Id1e365b41f1fb8a0f9a32c0c929457b22ceba8ef
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/44297
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoNew release 2.12.7 2.12.7 v2_12_7
Oleg Drokin [Thu, 15 Jul 2021 04:10:42 +0000 (00:10 -0400)]
New release 2.12.7

Change-Id: I6f98d22dd887538b32dead45b037c44541103c13
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 years agoNew RC 2.12.7-RC1 2.12.7-RC1 v2_12_7-RC1
Oleg Drokin [Sun, 27 Jun 2021 14:24:07 +0000 (10:24 -0400)]
New RC 2.12.7-RC1

Change-Id: I7bccb2825193ffcdf984f53db9d606c097b784bf
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14327 tests: skip sanity-sec test 55 for older servers 50/43950/6
James Nunez [Tue, 8 Jun 2021 16:34:29 +0000 (10:34 -0600)]
LU-14327 tests: skip sanity-sec test 55 for older servers

sanity-sec test 55 was added to lustre-b2_12 version
2.12.6.3.  When we run version interop testing with
Lustre servers less than 2.12.6.3, the test will fail.
Thus, skip sanity-sec test 55 for Lustre servers less
than 2.12.6.3.
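
The version comparison behind such a skip can be sketched as below. The real check lives in the shell test framework; this C version only illustrates comparing four-part versions, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack a four-part version into one integer; each part must be < 256. */
static uint32_t version_code(unsigned int major, unsigned int minor,
			     unsigned int patch, unsigned int fix)
{
	assert(major < 256 && minor < 256 && patch < 256 && fix < 256);
	return (major << 24) | (minor << 16) | (patch << 8) | fix;
}

/* Skip the test when the server predates 2.12.6.3. */
static bool should_skip_test_55(uint32_t server_version)
{
	return server_version < version_code(2, 12, 6, 3);
}
```

Packing the parts into one integer makes the ordering a plain integer comparison.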

Lustre-change: https://review.whamcloud.com/43949
Lustre-commit: abda4d06a41dfb526b4a66cb5fae6ff1a4c6c01b

Fixes: 355787745f21 (“LU-14121 nodemap: do not force fsuid/fsgid squashing”)

Test-Parameters: trivial
Test-Parameters: serverversion=2.10.8 serverdistro=el7.6 env=ONLY=55 testlist=sanity-sec
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ie002c921e853897105396185b38485799df31b7a
Reviewed-on: https://review.whamcloud.com/43950
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
3 years agoLU-7372 tests: re-enable replay-dual test_26 78/43978/4
Andreas Dilger [Fri, 11 Jun 2021 00:52:43 +0000 (18:52 -0600)]
LU-7372 tests: re-enable replay-dual test_26

Re-enable test_26 since it was just the unfortunate victim of
either test_24 or test_25 causing MDS unmount to hang.

Lustre-change: https://review.whamcloud.com/43982
Lustre-commit: TBD (from 0f509199a25db416759c3bbcce85c6b79d623585)

Test-Parameters: trivial testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib944028e798488c425501f0c48bf812fc13ebbe5
Reviewed-on: https://review.whamcloud.com/43978
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14673 sec: annotate algorithms taking optional key 53/43653/7
Sebastien Buisson [Tue, 11 May 2021 08:59:03 +0000 (10:59 +0200)]
LU-14673 sec: annotate algorithms taking optional key

Crypto algorithms implementing a ->setkey() method but that can also
be used without a key must set the CRYPTO_ALG_OPTIONAL_KEY flag if
defined in the kernel.
In Lustre, adler32 and crc32 implementations define a ->setkey()
method, but their "key" is not actually a cryptographic key.

Lustre-change: https://review.whamcloud.com/43656
Lustre-commit: b161e7b777e63bb4328aeab9e50560f919fedc31

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I362211d1b1aa3763fe1481cebb3629b255f29e41
Reviewed-on: https://review.whamcloud.com/43653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
3 years agoLU-14627 lnet: Ensure ref taken when queueing for discovery 01/44001/5
Chris Horn [Thu, 22 Apr 2021 19:51:44 +0000 (14:51 -0500)]
LU-14627 lnet: Ensure ref taken when queueing for discovery

Call lnet_peer_queue_for_discovery() in
lnet_discovery_event_handler() to ensure that we take a ref on
the peer when forcing it onto the discovery queue. This also ensures
that the peer state has LNET_PEER_DISCOVERING.
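
A minimal sketch of why queueing goes through one helper: the helper both sets the state flag and takes the queue's reference, so the later put is balanced. The struct, flag value and function names are hypothetical stand-ins, not the LNet code:

```c
#include <assert.h>
#include <stdbool.h>

#define PEER_DISCOVERING 0x1u	/* hypothetical flag value */

struct peer {
	int refcount;
	unsigned int state;
	bool on_queue;
};

/* Queue a peer for discovery, taking one reference for the queue. */
static bool peer_queue_for_discovery(struct peer *p)
{
	p->state |= PEER_DISCOVERING;
	if (p->on_queue)
		return false;	/* already queued, no extra reference */
	p->refcount++;
	p->on_queue = true;
	return true;
}

/* Remove the peer from the queue and drop the queue's reference. */
static void peer_discovery_done(struct peer *p)
{
	if (!p->on_queue)
		return;
	p->on_queue = false;
	p->state &= ~PEER_DISCOVERING;
	p->refcount--;
	assert(p->refcount >= 0);
}
```

Putting a peer on the queue without going through such a helper would let the dequeue path drop a reference that was never taken.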

Add a test to sanity-lnet.sh that can trigger the refcount loss bug
in discovery.

Lustre-change: https://review.whamcloud.com/43418
Lustre-commit: 2ce6957b69370b0ce75725d1d91866bf55c07fa8

HPE-bug-id: LUS-7651
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie2908668c4ffde0f993b5b7ea9aa58acd1d6fa9c
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44001
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14627 tests: Create unload_modules_local 60/43960/3
Chris Horn [Fri, 23 Apr 2021 19:05:02 +0000 (14:05 -0500)]
LU-14627 tests: Create unload_modules_local

t-f allows for loading modules on a single node via load_modules_local.
However, there is no corresponding unload_modules_local that can be
called to clean up after a call to load_modules_local, so we create it.
unload_modules() is refactored to use unload_modules_local.

Also address a potential issue that can prevent LND modules from
unloading. Some LNet setup (particularly those in sanity-lnet) may
require that we call lnetctl lnet unconfigure (or lctl net down)
to drop a ref on the module before it can be unloaded.

Lustre-change: https://review.whamcloud.com/43425
Lustre-commit: 32304d863ae98c641f541362f54e7b1f24b350a6

HPE-bug-id: LUS-9031
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6458a7728f5f559f8641c5a9e29dd775c8445c38
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43960
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14627 lnet: Allow delayed sends 59/43959/2
Chris Horn [Wed, 21 Apr 2021 19:22:46 +0000 (14:22 -0500)]
LU-14627 lnet: Allow delayed sends

The net_delay_add has some code related to delaying sends, but it
isn't fully implemented. Modify lnet_post_send_locked() to check
whether the message being sent matches a rule and should be delayed.

Fix some bugs with how the delay timers were set and checked.

Lustre-change: https://review.whamcloud.com/43416
Lustre-commit: ab14f3bc852e708100d21770c00235f95841708a

HPE-bug-id: LUS-7651
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Icbd9ee81d2ff0162a01a4187807ea2114a42276d
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43959
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12675 mdt: release object reference upon error 40/43940/3
Bruno Faccini [Wed, 21 Aug 2019 13:32:54 +0000 (15:32 +0200)]
LU-12675 mdt: release object reference upon error

LBUG ("(lu_object.c:1196:lu_device_fini()) ASSERTION(
atomic_read(&d->ld_ref) == 0) failed: Refcount is <x>") can
intermittently occur during umount of MDT0000, upon specific
use cases (playing with file/dir having foreign LOV/LMV), and
due to object reference set/leaked on server side.

Lustre-change: https://review.whamcloud.com/35845
Lustre-commit: 4649899fbba095c7c3eb7ce1c8893040ed6e2494

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ic49b2bb0402b1a6e51d7ba656f9957eeda1bd0fb
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43940
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-13182 llite: Avoid eternel retry loops with MAP_POPULATE 58/43958/4
Oleg Drokin [Wed, 9 Jun 2021 16:30:12 +0000 (09:30 -0700)]
LU-13182 llite: Avoid eternel retry loops with MAP_POPULATE

Kernels 5.4+ have an infinite retry loop from the MAP_POPULATE mmap
option. Use FAULT_FLAG_RETRY_NOWAIT to instruct filemap_fault
not to drop the mmap_sem, so that if the call fails, we can use
the slow path and prevent the loop from forming.
(Idea by Neil Brown)

Lustre-change: https://review.whamcloud.com/40221
Lustre-commit: bb50c62c6f4cdd7a31145ab81e7c166e0760ed11

Test-Parameters: trivial testlist=sanity-hsm env=ONLY=1 clientdistro=ubuntu2004

Change-Id: I320ab9ca447282aea15ef2030ef8671c4260d895
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43958
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-7372 tests: skip replay-dual test_24/25 77/43977/4
Andreas Dilger [Fri, 11 Jun 2021 00:47:52 +0000 (18:47 -0600)]
LU-7372 tests: skip replay-dual test_24/25

Not sure which one of these subtests is causing problems, but
they are causing the following runtests test to hang unmounting
the MDS, just like test_26 was doing previously.

This is only a stopgap to confirm that one of these subtests is
causing the later unmount hang, and to get testing passing again.
There needs to be further isolation done to test_24 or test_25,
and to re-enable test_26, but that can be done afterward.

Test-Parameters: trivial testgroup=review-dne-part-2
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6d94eb040052b4912cf29ea37ca36ca4503ebbe5
Reviewed-on: https://review.whamcloud.com/43977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10350 lod: adjust stripe count to available ost count 76/43976/2
Bobi Jam [Fri, 28 May 2021 08:25:52 +0000 (16:25 +0800)]
LU-10350 lod: adjust stripe count to available ost count

* In ost-pool.sh, reset $MOUNT's stripe offset, so that the created
  directory will not inherit it from root directory.

* Preserve the root directory layout in replay-single (run before
  ost-pools) to avoid leaving a bad layout on the root dir.
  Lustre-change: https://review.whamcloud.com/43872

Lustre-change: https://review.whamcloud.com/43882
Lustre-commit: TBD (from c82f557324bc0048c308d1a2135699e7c83169e1)

Test-Parameters: trivial testlist=replay-single,ost-pools
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Idf6884faf1271a3864710aeab0ba0eca154bf492
Reviewed-on: https://review.whamcloud.com/43976
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4] 44/43744/5
Jian Yu [Sun, 6 Jun 2021 07:38:49 +0000 (00:38 -0700)]
LU-14690 kernel: new kernel [RHEL 8.4 4.18.0-305.3.1.el8_4]

This patch makes changes to support new RHEL 8.4 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.4

Change-Id: I47d4706f9175d489ef0e6226492af20f44f0677e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43744
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13783 osc: handle removal of NR_UNSTABLE_NFS 78/43778/2
Mr NeilBrown [Fri, 3 Jul 2020 05:33:36 +0000 (15:33 +1000)]
LU-13783 osc: handle removal of NR_UNSTABLE_NFS

In Linux 5.8 the NR_UNSTABLE_NFS page counters are gone.  All pages that
have been written but are not yet safe are now counted in NR_WRITEBACK.

So change osc_page to count in NR_WRITEBACK, but if NR_UNSTABLE_NFS
still exists in the kernel, use a #define to direct the updates to
that counter.

Conflicts:
libcfs/autoconf/lustre-libcfs.m4

Lustre-change: https://review.whamcloud.com/39260
Lustre-commit: 3e5faa441266cd8dc2ee54ae140ad0129b4affa0

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I49cbc267fafaee949f45b2e559511aedcf4d8fed
Reviewed-on: https://review.whamcloud.com/43778
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12355 llite: MS_* flags and SB_* flags split 79/40379/4
Shaun Tancheff [Thu, 18 Jul 2019 14:19:03 +0000 (09:19 -0500)]
LU-12355 llite: MS_* flags and SB_* flags split

In kernel 4.20 the MS_* flags should only be used for mount
time flags and SB_* flags for checking super_block.s_flags.
The MS_* flags have moved to a uapi header.

Conflicts:
lustre/llite/llite_lib.c

Lustre-commit: 72a84970e6d2a2d4b3a35f2ee058511be2fda82e
Lustre-change: https://review.whamcloud.com/35019

Linux-commit: e262e32d6bde0f77fb0c95d977482fc872c51996

Test-Parameters: trivial
Change-Id: Ifd64efb16c7795377ece066d01ae04dc004a13ac
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/40379
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12355 llite: totalram_pages changed to atomic_long_t 76/40376/3
Shaun Tancheff [Sat, 15 Jun 2019 19:32:26 +0000 (14:32 -0500)]
LU-12355 llite: totalram_pages changed to atomic_long_t

Kernel 5.0 changed totalram_pages to atomic_long_t.
Provide an abstracted accessor now that totalram_pages
is a function.
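
A userspace sketch of such an accessor. The macro names HAVE_TOTALRAM_PAGES_AS_FUNC and compat_totalram_pages() are used here as assumptions for illustration, and totalram_pages is mocked with a fixed value rather than taken from the kernel:

```c
#include <assert.h>

/* Define HAVE_TOTALRAM_PAGES_AS_FUNC to model kernels 5.0 and later. */
#ifdef HAVE_TOTALRAM_PAGES_AS_FUNC
/* Mock of the newer kernel interface: a function. */
static unsigned long totalram_pages(void)
{
	return 1024;
}
#define compat_totalram_pages() totalram_pages()
#else
/* Mock of the older kernel interface: a plain variable. */
static unsigned long totalram_pages = 1024;
#define compat_totalram_pages() totalram_pages
#endif

/* Callers use the accessor and never touch totalram_pages directly. */
static unsigned long ram_bytes(unsigned long page_size)
{
	return compat_totalram_pages() * page_size;
}
```

The accessor keeps the kernel-version difference in one place, so call sites compile unchanged on both kinds of kernel.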

Conflicts:
libcfs/autoconf/lustre-libcfs.m4
libcfs/include/libcfs/libcfs.h
lustre/llite/lproc_llite.c

Lustre-commit: 5ca5b19e8efdfede8ec3405eaced7202984f396b
Lustre-change: https://review.whamcloud.com/35025

Linux-commit: ca79b0c211af63fa3276f0e3fd7dd9ada2439839

Test-Parameters: trivial
Change-Id: I558e42074004e2ee5f79deea0d363e5bea332729
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/40376
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12999 mgs: Cleanup string handling in name_create_mdt 21/43321/3
Shaun Tancheff [Mon, 2 Dec 2019 17:32:50 +0000 (11:32 -0600)]
LU-12999 mgs: Cleanup string handling in name_create_mdt

To satisfy gcc8 -Werror=format-overflow, sanity check the mdt_idx
before calling snprintf.
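
A sketch of the pattern, assuming a hypothetical helper and an assumed upper bound on the index. Bounding the value before formatting lets the compiler (and the reader) see that the "%04x" conversion cannot produce more digits than expected:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define MAX_MDT_IDX 0xffffu	/* assumed upper bound, for illustration */

/* Build "<fsname>-MDT%04x" after checking the index range. */
static int make_mdt_name(char *buf, size_t len, const char *fsname,
			 unsigned int mdt_idx)
{
	int n;

	if (mdt_idx > MAX_MDT_IDX)
		return -E2BIG;
	n = snprintf(buf, len, "%s-MDT%04x", fsname, mdt_idx);
	if (n < 0 || (size_t)n >= len)
		return -ENAMETOOLONG;	/* output was truncated */
	return 0;
}
```

Checking the snprintf return value as well covers the case where the caller's buffer is too small for the file system name.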

Lustre-change: https://review.whamcloud.com/36817
Lustre-commit: 298cdb5c0b6136b91e76c9c515bfbc2df99bae0b

Test-Parameters: trivial
Cray-bug-id: LUS-8186
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I2c8764d3715290ee2bd8c96cdc98b532f50632c6
Reviewed-on: https://review.whamcloud.com/43321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14588 o2ib: make config script aware of the ofed symbols 56/43556/2
Serguei Smirnov [Tue, 6 Apr 2021 22:54:01 +0000 (15:54 -0700)]
LU-14588 o2ib: make config script aware of the ofed symbols

LNet o2ib configuration script needs to be aware of the external
ofed dkms symbols when testing for availability of o2ib features
by building "conftest" kernel objects. If this is not done,
symbols from the core kernel are used by default which is
different from what is used when actually building LNet,
at least on Ubuntu. This patch adds the check for external symbols.

Lustre-change: https://review.whamcloud.com/43223
Lustre-commit: bcc5d784826d2d7a8eece28e96fab8b0fa02ab17

Test-Parameters: trivial
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Iea566f8a3feb86b8bef2f4501a3abc968d76451a
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43556
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14617 utils: llog_reader updatelog support 58/43658/2
Alexander Boyko [Fri, 16 Apr 2021 09:57:34 +0000 (05:57 -0400)]
LU-14617 utils: llog_reader updatelog support

The patch adds printing of UPDATE_REC to llog_reader. It is useful
for updatelog analysis. Here is an example of a record:

 [0x50001a21b:0x1233d:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x50001a211:0x475:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x3800182e3:0x475:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x200032c9a:0x245:0x0] type:xattr_set/7 params:3 p_0:0 p_1:1 p_2:2
 [0x200000001:0x15:0x0] type:write/12 params:2 p_0:3 p_1:4
 p_0 - 12/trusted.lov
 p_1 - 0/
 p_2 - 25972/\x0100000000000000000000000000000000000000000002000...
 p_3 - 25974/\x0800000000000000P\xD1AB006x0000000400EC^\x000000...
 p_4 - 1/

llog processing logic is based on an incrementing record index;
the fix adds checks for it. It also adds more info from the header
and drops the useless "Bit X not set" output.
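
A minimal sketch of an index check, assuming that "incrementing" means strictly increasing (gaps allowed). The function name is hypothetical and this is not the llog_reader code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the position of the first record whose index does not
 * increase relative to the previous one, or n if all are in order.
 */
static size_t first_bad_index(const uint32_t *idx, size_t n)
{
	assert(idx != NULL || n == 0);
	for (size_t i = 1; i < n; i++)
		if (idx[i] <= idx[i - 1])
			return i;
	return n;
}
```

A reader can stop or warn at the returned position instead of trusting records that follow an out-of-order index.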

Lustre-change: https://review.whamcloud.com/43343
Lustre-commit: 9962d6f84db5fd587bbe13640a9361c2872f3728

Test-Parameters: trivial
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Id50de15040526dc07ae708ac5db046832706be31
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43658
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3] 45/43545/2
Jian Yu [Wed, 5 May 2021 17:23:27 +0000 (10:23 -0700)]
LU-14604 kernel: kernel update RHEL8.3 [4.18.0-240.22.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.22.1.el8_3 for Lustre client.

Test-Parameters: trivial clientdistro=el8.3

Change-Id: I1a3152d95822a74e05f9b44f590a6cdb1f8b02b6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43545
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7] 26/43626/2
Jian Yu [Mon, 10 May 2021 19:31:58 +0000 (12:31 -0700)]
LU-14670 kernel: kernel update RHEL7.9 [3.10.0-1160.25.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.25.1.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: Ic846d648c45476cc4886ce86577605bf3e66d935
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43626
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2] 32/43632/2
Jian Yu [Mon, 10 May 2021 21:27:19 +0000 (14:27 -0700)]
LU-14672 kernel: kernel update SLES12 SP5 [4.12.14-122.66.2]

Update SLES12 SP5 kernel to 4.12.14-122.66.2 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: Ib2bf4795ccb21dbd0bb9202228ff32d73a203eee
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43632
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14553 changelog: eliminate mdd_changelog_clear warning 55/43555/2
Olaf Faaland [Thu, 25 Mar 2021 01:35:10 +0000 (18:35 -0700)]
LU-14553 changelog: eliminate mdd_changelog_clear warning

When handling a changelog_clear request, the user may specify a
range of indices which do not exist.  Similarly, the user may
specify a changelog user which does not exist.  Neither indicates
a problem within Lustre that justifies a console warning.

Change those cases to CDEBUG.

Lustre-change: https://review.whamcloud.com/43125
Lustre-commit: 6b183927e19715d093c80a35ebc42a1cda5e70e2

Test-Parameters: trivial
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I64bab12ef4978c4bf7139f5f36a39f9b109616fb
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43555
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14603 ptlrpc: quiet messages for unsupported opcodes 60/43260/3
Andreas Dilger [Sun, 11 Apr 2021 02:04:30 +0000 (20:04 -0600)]
LU-14603 ptlrpc: quiet messages for unsupported opcodes

Quiet messages for OST_FALLOCATE and OST_SEEK RPCs that can
be sent from 2.14.0 clients.

Lustre-change: https://review.whamcloud.com/43257
Lustre-commit: TBD (from c7427f6618308996e76718baeba492c0b09dd5b3)

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I35496168e3aa29ecb06076654ef0aa97ba2540e5
Reviewed-on: https://review.whamcloud.com/43260
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14119 lfsck: replace dt_lookup() with dt_lookup_dir() 65/43265/2
Lai Siyao [Wed, 13 Jan 2021 09:16:55 +0000 (17:16 +0800)]
LU-14119 lfsck: replace dt_lookup() with dt_lookup_dir()

Lfsck code calls dt_lookup() to look up a sub file under a directory in
many places, but this function needs the directory to be initialized with
dt_try_as_dir() first, which is missing in several places. Since
the overhead is trivial, call dt_lookup_dir() instead.

Lustre-change: https://review.whamcloud.com/41218
Lustre-commit: d525ad4bd0d5d851405e4249859a1c77378f0ee3

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I40bd8d51edece50353af1729cf867572a0abea78
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43265
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14119 osd: delete stale OI mapping entry 68/43268/2
Lai Siyao [Wed, 24 Feb 2021 03:31:06 +0000 (11:31 +0800)]
LU-14119 osd: delete stale OI mapping entry

Once the LMA check shows an OI mapping entry is stale, delete it from
the OI table, which avoids removing whole OI files.

Don't add OI mapping into cache until osd_fid_lookup(), because
the mapping in OI is not trustable until FID in LMA is checked,
otherwise it may mislead LFSCK.

Lustre-change: https://review.whamcloud.com/41741
Lustre-commit: 99d00b97ef5f209a002f250e7772055ff1a6d6d6

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I4b50dcc02149d485e4bf4a361ca2994daa280feb
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43268
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14119 osd-zfs: enable LUDA_VERIFY 67/43267/2
Lai Siyao [Tue, 19 Jan 2021 13:37:50 +0000 (21:37 +0800)]
LU-14119 osd-zfs: enable LUDA_VERIFY

In osd_dir_it_rec(), if the dirent is read successfully and the FID in
the dirent is sane, the function returns right away. However, if
LUDA_VERIFY|LUDA_VERIFY_DRYRUN is set, the FID in the dirent should be
compared with the FID in LMA, and replaced with the latter if they
are different.
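
The comparison described above can be sketched in plain C. This is only
an illustration, not the osd-zfs code: struct fid, the flag values and
verify_dirent_fid() are hypothetical stand-ins, and the sketch does not
model any difference between the verify and dry-run flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for a Lustre FID; the layout is illustrative. */
struct fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

/* Hypothetical flag values, not the real LUDA_* constants. */
#define VERIFY        0x1u
#define VERIFY_DRYRUN 0x2u

static bool fid_eq(const struct fid *a, const struct fid *b)
{
	return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/*
 * Returns true if verification was requested and the dirent FID
 * disagreed with the LMA FID; in that case the dirent FID is
 * overwritten with the LMA one. Without a verify flag the dirent
 * FID is trusted as-is.
 */
static bool verify_dirent_fid(struct fid *dirent_fid,
			      const struct fid *lma_fid,
			      unsigned int flags)
{
	if (!(flags & (VERIFY | VERIFY_DRYRUN)))
		return false;
	if (fid_eq(dirent_fid, lma_fid))
		return false;
	*dirent_fid = *lma_fid;
	return true;
}
```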

Lustre-change: https://review.whamcloud.com/41274
Lustre-commit: f5136e81957e4b67ae6ed7764d378b817fac5ee2

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I35e2a4d4606044cd37cc5847cffc577740918988
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14119 osd: add mount option "resetoi" 69/43269/2
Lai Siyao [Wed, 3 Feb 2021 03:44:15 +0000 (11:44 +0800)]
LU-14119 osd: add mount option "resetoi"

OI files on zfs are special, and they can't be deleted by user space
tools like rm. Sometimes the OI files may contain stale OI mappings,
and they need to be removed for namespace consistency. Add a mount
option 'resetoi' to recreate OI files at mount time; it will
support both ldiskfs and zfs. This should be the standard way to
recreate OI files, rather than mounting as the backend filesystem and
unlinking them manually.

Add sanity-scrub 17.

Lustre-change: https://review.whamcloud.com/41402
Lustre-commit: f37bce8a573dfc5aac1b9f51f4d5c8314ba05d30

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idc0e4c2f3b81675c49c6c005bc30b61d8fd04503
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14119 lfsck: check linkea if it's newly added 70/43270/2
Lai Siyao [Thu, 14 Jan 2021 09:14:01 +0000 (17:14 +0800)]
LU-14119 lfsck: check linkea if it's newly added

In LFSCK phase one, if a new linkea entry is added and the final
linkea entry count is more than one, add the file to the trace file,
so that the linkea sanity will be checked in phase two.

In the phase two check, if a link parent FID can't be mapped to a
valid inode, remove it from the linkea.

Add sanity-lfsck 1d, which changes the parent directory FID in LMA so
that it mismatches the parent FID in the child linkea, and verifies
that LFSCK can fix such an inconsistency.

Lustre-change: https://review.whamcloud.com/41261
Lustre-commit: afd00cacd0b6ef87282887b4e965350a9c1a6821

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I315983d262110c1e36c3893fa2e51925d96c51a7
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43270
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14119 mdc: set fid2path RPC interruptible 66/43266/2
Lai Siyao [Wed, 13 Jan 2021 09:29:50 +0000 (17:29 +0800)]
LU-14119 mdc: set fid2path RPC interruptible

Sometimes OI scrub can't fix the inconsistency in FID and name, and
server will return -EINPROGRESS for fid2path request. Upon such
failure, client will keep resending the request. Set such request
to be interruptible to avoid deadlock.

Lustre-change: https://review.whamcloud.com/41219
Lustre-commit: bf475262610671534b1b1a33cebb49d8380b74f7

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I82192cb8a8256064ca632cabfe5581b12e86423b
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43266
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14527 kernel: kernel update RHEL7.9 [3.10.0-1160.21.1.el7] 89/42089/3
Jian Yu [Fri, 9 Apr 2021 19:02:25 +0000 (12:02 -0700)]
LU-14527 kernel: kernel update RHEL7.9 [3.10.0-1160.21.1.el7]

Update RHEL7.9 kernel to 3.10.0-1160.21.1.el7.

Test-Parameters: clientdistro=el7.9 serverdistro=el7.9

Change-Id: I1a46fe492d280b19c0f93458aaac975a4c873caf
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42089
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10632 tests: recovery-small test_26 idle_timeout 37/43237/2
Andreas Dilger [Thu, 11 Mar 2021 09:39:57 +0000 (02:39 -0700)]
LU-10632 tests: recovery-small test_26 idle_timeout

In recovery-small test_26() use "lfs df" instead of plain "df"
since statfs may be fetched from the MDS cache and will not
ensure that the client->OST connections are currently active.

Also, check a few entries further back in the OSC state log for an
EVICTED message, in case the client idle disconnects from the server
again while checking all of the imports.

Lustre-change: https://review.whamcloud.com/42006
Lustre-commit: b4391fcdaf392a50bd1419342eca3b730c077ed2

Test-Parameters: trivial testlist=recovery-small env=ONLY=26a,ONLY_REPEAT=100

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8c370cb75f4e06258ef3c032630fc20354a15dcc
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43237
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13073 osp: don't block waiting for new objects 02/43202/2
Alex Zhuravlev [Fri, 16 Oct 2020 16:09:04 +0000 (19:09 +0300)]
LU-13073 osp: don't block waiting for new objects

If an OST is down, a few threads trying to get an already
precreated object may get stuck. Even worse, all QoS-based
allocations are then serialized by a single semaphore, even
those that would not try to allocate on the failed OST.

The patch introduces a noblock flag in the allocation hint,
which is passed to OSP. The QoS code then tries to allocate
objects in a non-blocking manner.

Lustre-change: https://review.whamcloud.com/40274
Lustre-commit: 2112ccb3c48ccf86aaf2a61c9f040571a6323f9c

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I38e66d7569aefecf800dbc32f1049ac87853439e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/43202
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14316 llite: quiet spurious ioctl warning 03/43103/2
Andreas Dilger [Fri, 5 Feb 2021 20:13:10 +0000 (13:13 -0700)]
LU-14316 llite: quiet spurious ioctl warning

Calling "lfs setstripe" prints a spurious warning about using the old
ioctl(LL_IOC_LOV_GETSTRIPE) when that is not actually the case.

Remove the ioctl warning for now and deal with related issues later.

Fixes: 364ec95f3688 ("LU-9367 llite: restore ll_file_getstripe in ll_lov_setstripe")

Lustre-change: https://review.whamcloud.com/41427
Lustre-commit: c6f65d8af116476d4fa62604a90b2e0d657b29b2

Test-Parameters: trivial

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I20f5a7adb60a30fce27e49827bd46229e2ce7057
Reviewed-on: https://review.whamcloud.com/43103
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14355 ptlrpc: do not output error when imp_sec is freed 54/41754/3
Sebastien Buisson [Mon, 25 Jan 2021 08:24:19 +0000 (17:24 +0900)]
LU-14355 ptlrpc: do not output error when imp_sec is freed

There is a race condition on client reconnect when the import is being
destroyed.  Some outstanding client bound requests are being processed
when the imp_sec has already been freed.
Output the error message in import_sec_validate_get() only if the
import is not already in the zombie work queue.

Lustre-change: https://review.whamcloud.com/41310
Lustre-commit: 20cbbb084b671a1e82bd9ad23f8f1a074fc41afb

Fixes: 135fea8fa9 ("LU-4423 obdclass: use workqueue for zombie management")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I4b431128e04f11b1e3ee7de47090af87538c3558
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41754
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12506 changelog: support large number of MDT 87/42087/2
Hongchao Zhang [Wed, 4 Mar 2020 14:07:19 +0000 (09:07 -0500)]
LU-12506 changelog: support large number of MDT

On the client, the changelog of each MDT is associated with
one miscdevice, but the number of miscdevices is limited
to 64 in the Linux kernel, so this fails if there are
more than 64 MDTs.

This patch replaces miscdevice with dynamic devices to
support more MDTs.

Lustre-change: https://review.whamcloud.com/37759
Lustre-commit: d0423abc1adc717b08de61be3556688cccd52ddf

Change-Id: Ie3ce76cbe1c613bf17d6350ea95546524b6d66b8
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42087
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13609 mgs: fix config_log buffer handling 77/41777/3
Stephane Thiell [Thu, 11 Feb 2021 00:15:02 +0000 (16:15 -0800)]
LU-13609 mgs: fix config_log buffer handling

Fix buffer handling in mgs_list_logs() to list all MGS config_logs
using multiple ioctl calls when we have a large number of targets.

Lustre-change: https://review.whamcloud.com/41478
Lustre-commit: e3f17defc141d8847562b610931255d37ed4dd3c

Fixes: 1d97a8b4cd3d ("LU-13609 llog: list all the log files correctly on MGS/MDT")
Signed-off-by: Stephane Thiell <sthiell@stanford.edu>
Change-Id: I1bf32e918e242f4da83c3d1624b7285a18a88d01
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41777
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13649 mdd: orphan cleanup fix 76/39776/4
Vitaly Fertman [Mon, 8 Jun 2020 20:24:12 +0000 (23:24 +0300)]
LU-13649 mdd: orphan cleanup fix

Due to a race with mdd_close(), the objects may have already been
destroyed by close, and the 2nd destroy asserts on lu_object_is_dying().

The problem appeared in LU-12846 which removed the error handling
(ENOENT) returned by dt_delete - the entry was already removed from
the parent.

Lustre-change: https://review.whamcloud.com/38866
Lustre-commit: 364a75cbb9648f6d64d948e0dd53c59899021913

Fixes: 688d5da6a8 ("LU-12846 mdd: return error while delete failed")
HPE-bug-id: LUS-8864

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I7e2f3fca7b7d4440340fd3daaf8ec528010d9117
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39776
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-1538 tests: standardize test script init - sanity 26/41226/6
Andreas Dilger [Mon, 3 Jun 2019 14:39:19 +0000 (08:39 -0600)]
LU-1538 tests: standardize test script init - sanity

Standardize the initial Lustre test script initialization of the
test-framework.sh for clarity and consistency.

The LUSTRE path is already normalized in init_test_env(), so this
doesn't need to be done in the caller.  Use $(...) subshells instead
of `...` in the affected lines.  Remove PATH, NAME, TMP, LFS, LCTL
variable initialization, since it is already done in init_test_env().

Move MACHINEFILE into init_test_env().

Move get_lustre_env() to the end of init_test_env(). All test scripts
currently call init_test_env() and this move will allow all test
scripts to use the variables defined in get_lustre_env() without
having to modify the individual test scripts.

Move all definitions of ALWAYS_EXCEPT to after init_test_env()
and init_logging() and call build_test_filter() immediately
after these and SLOW definitions.

Lustre-change: https://review.whamcloud.com/34863
Lustre-commit: 8fa23490bb5fd0df2b1def8b14d51919abde6555

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1ef6639bcb3eb5179bd44da13b35fd843c267156
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41226
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14450 kernel: kernel update RHEL8.3 [4.18.0-240.15.1.el8_3] 80/42080/2
Jian Yu [Thu, 18 Mar 2021 19:46:54 +0000 (12:46 -0700)]
LU-14450 kernel: kernel update RHEL8.3 [4.18.0-240.15.1.el8_3]

Update RHEL8.3 kernel to 4.18.0-240.15.1.el8_3 for Lustre client.

Test-Parameters: trivial clientdistro=el8.3

Change-Id: I92ca7769fac17221da376788cfe79887ecc4c19c
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/42080
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-11518 ldlm: lru code cleanup 06/41006/2
Vitaly Fertman [Wed, 16 Dec 2020 16:50:46 +0000 (11:50 -0500)]
LU-11518 ldlm: lru code cleanup

cleanup includes:
 - no need in unused locks parameter in the lru policy, better to
   take the current value right in the policy if needed;
 - no need in a special SHRINKER policy, the same as the PASSED one
 - no need in a special DEFAULT policy, the same as the PASSED one;
 - no need in a special PASSED policy, LRU is to be cleaned anyway
   according to LRU resize or AGED policy;

bug fixes:
 - if the @min amount is given, it should not be increased by the
   number of locks exceeding the limit; the max of the two is to
   be taken instead;
 - do not do ELC on enqueue if no LRU limits are reached;
 - do not keep lock in LRUR policy once we have cancelled @min locks,
   try to cancel instead until we reach the @max limit if given;
 - cancel locks from LRU with the new policy, if changed in sysfs;
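
The first bug fix in the list above (taking the max instead of the sum)
can be illustrated with a small sketch. The function and parameter names
are hypothetical and this is not the ldlm code; it only shows the
arithmetic the message describes.

```c
/*
 * Number of locks to cancel, given a requested minimum and the
 * current LRU occupancy. The excess over the limit is not added to
 * min; the larger of the two values is used.
 */
static int lru_cancel_target(int min, int unused, int limit)
{
	int excess = unused > limit ? unused - limit : 0;

	return min > excess ? min : excess;
}
```

For example, with min = 10, 120 unused locks and a limit of 100, the
target is 20 locks rather than the 30 a sum would give.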

Lustre-change: https://review.whamcloud.com/39560
Lustre-commit: 209a112eb153b4cc7429d70685a3bc2d7f51e45f

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I84369da54f680e5fbddd28089c40d1b90722d42d
HPE-bug-id: LUS-8678
Reviewed-on: https://es-gerrit.dev.cray.com/157066
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41006
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
3 years agoLU-11518 osc: cancel osc_lock list traversal once found the lock is being used 05/41005/2
Gu Zheng [Mon, 24 Jun 2019 05:51:20 +0000 (13:51 +0800)]
LU-11518 osc: cancel osc_lock list traversal once found the lock is being used

Currently, osc_ldlm_weigh_ast walks the osc_lock list (oo_ol_list)
to check whether the target dlm lock is being used. Once a match is
found, it should skip the remaining entries and stop the traversal,
but it doesn't. Fix it here.
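
The early exit can be sketched on a simplified singly linked list. The
types and names below are illustrative assumptions, not the osc_lock
structures; the visited counter exists only to make the effect of the
break observable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for an entry on the per-object lock list. */
struct lock_entry {
	const void *dlm_lock;	/* DLM lock this entry refers to */
	bool in_use;
	struct lock_entry *next;
};

/*
 * Returns true as soon as an in-use entry for dlm_lock is found.
 * *visited reports how many entries were examined.
 */
static bool dlm_lock_in_use(const struct lock_entry *head,
			    const void *dlm_lock, size_t *visited)
{
	bool found = false;

	*visited = 0;
	for (const struct lock_entry *e = head; e != NULL; e = e->next) {
		(*visited)++;
		if (e->dlm_lock == dlm_lock && e->in_use) {
			found = true;
			break;	/* no need to look at the rest */
		}
	}
	return found;
}
```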

Lustre-change: https://review.whamcloud.com/35396
Lustre-commit: eb9aa909343b95769cbf90eec36ded8821d4aa12

Change-Id: I2e64d2938cdacb6c5baca73647d74c9fb8f54f8c
Fixes: 3f3a24dc5d7d ("LU-3259 clio: cl_lock simplification")
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41005
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
3 years agoLU-14490 lmv: striped directory as subdirectory mount 46/42046/2
Lai Siyao [Fri, 5 Mar 2021 09:07:34 +0000 (17:07 +0800)]
LU-14490 lmv: striped directory as subdirectory mount

lmv_intent_lookup() will replace fid1 with the stripe FID, but if a
striped directory is mounted as a subdirectory mount, it should be
handled differently. Because fid2 is the directory master object, if
the stripe is located on a different MDT than the master object, it
will be treated as a remote object by the server. The server then
won't reply with a LOOKUP lock, so each file access needs to look up
"/".

A remote directory (either plain or striped) shouldn't be used for a
subdirectory mount, because a remote object can't get a LOOKUP lock.
Add an option "mdt_enable_remote_subdir_mount" (1 by default for
backward compatibility); mdt_get_root() will return -EREMOTE if the
user-specified subdir is a remote directory and this option is
disabled.

Add sanity 247g, updated 247f.

Lustre-change: https://review.whamcloud.com/41893
Lustre-commit: 503917278a8f1dd7dd578fea6551de6c5dc4ebb9

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I5e8f95ee95c4155336098e55b7569ed7a43865c1
Reviewed-on: https://review.whamcloud.com/42046
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14488 o2ib: Use rdma_connect_locked if it is defined 77/41977/2
Sergey Gorenko [Thu, 4 Mar 2021 12:33:16 +0000 (14:33 +0200)]
LU-14488 o2ib: Use rdma_connect_locked if it is defined

rdma_connect_locked() is added in the upstream kernel 5.10 and
MOFED-5.2-2. After that, it is not allowed to call rdma_connect()
in RDMA CM event handler; rdma_connect_locked() must be used
instead.

This commit adds configure checks to detect whether
rdma_connect_locked() is available and updates the event handler
to call the correct function.
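
The shape of such a compile-time switch can be sketched in user space as
follows. Everything here is a stand-in: the structs and both function
bodies are stubs rather than the kernel RDMA CM API, and the macro name
HAVE_RDMA_CONNECT_LOCKED is an assumption about what the configure check
defines.

```c
#include <stddef.h>

/* Stub types standing in for the kernel's RDMA CM structures. */
struct rdma_cm_id { const char *last_call; };
struct rdma_conn_param { int retry_count; };

/* Stub: in the kernel this takes the handler lock internally. */
static inline int rdma_connect(struct rdma_cm_id *id,
			       struct rdma_conn_param *param)
{
	(void)param;
	id->last_call = "rdma_connect";
	return 0;
}

#ifdef HAVE_RDMA_CONNECT_LOCKED
/* Stub: variant meant to be called from inside the event handler. */
static inline int rdma_connect_locked(struct rdma_cm_id *id,
				      struct rdma_conn_param *param)
{
	(void)param;
	id->last_call = "rdma_connect_locked";
	return 0;
}
#endif

/* Called from the CM event handler; picks the right entry point. */
static int connect_from_event_handler(struct rdma_cm_id *id,
				      struct rdma_conn_param *param)
{
#ifdef HAVE_RDMA_CONNECT_LOCKED
	return rdma_connect_locked(id, param);
#else
	return rdma_connect(id, param);
#endif
}
```

Keeping the #ifdef inside one wrapper means the event handler itself
needs no conditional code.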

Lustre-change: https://review.whamcloud.com/41887
Lustre-commit: 60d55e42ed9e043341790bf7624627c93cc99200

Test-Parameters: trivial
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Change-Id: I8068d04810bf6f0200292a55f3fdcea8c71d44c1
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41977
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
3 years agoLU-13972 o2iblnd: Don't retry indefinitely 11/42011/2
Amir Shehata [Thu, 11 Mar 2021 19:45:37 +0000 (11:45 -0800)]
LU-13972 o2iblnd: Don't retry indefinitely

If the peer is down, don't retry indefinitely. Use the retry_count
parameter to restrict the number of retries, after which the
connection fails and the error is propagated up.

This prevents long timeouts when mounting a file system with
nodes which might have their NIDs configured in the FS, but the
nodes have been taken offline.
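
A bounded retry loop of this kind might look like the sketch below. It
is not the o2iblnd implementation: the callback type, the helper peers
and the reading of retry_count as "retries after the first attempt" are
all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true when the connection succeeds. */
typedef bool (*connect_fn)(void *arg);

/*
 * Makes one initial attempt plus at most retry_count retries.
 * Returns 0 on success, or -1 so the caller can propagate the error.
 * *attempts reports how many attempts were made.
 */
static int connect_with_retries(connect_fn try_connect, void *arg,
				int retry_count, int *attempts)
{
	*attempts = 0;
	for (int i = 0; i <= retry_count; i++) {
		(*attempts)++;
		if (try_connect(arg))
			return 0;
	}
	return -1;
}

/* Example peer that never answers. */
static bool peer_down(void *arg)
{
	(void)arg;
	return false;
}

/* Example peer that answers on the third call; arg counts calls. */
static bool peer_up_on_third_try(void *arg)
{
	int *calls = arg;

	return ++(*calls) >= 3;
}
```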

Lustre-change: https://review.whamcloud.com/39981
Lustre-commit: 7c8ad11ef08f0f2f886004ae4a56f67722c16d5c

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4238323f0629f005c651adba4b384b98546514d0
Reviewed-on: https://review.whamcloud.com/42011
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14125 osc: prevent overflow of o_dropped 15/40615/8
Olaf Faaland [Wed, 11 Nov 2020 22:38:25 +0000 (14:38 -0800)]
LU-14125 osc: prevent overflow of o_dropped

In osc_announce_cached(), prevent o_dropped from overflowing.
Necessary because o_dropped AKA o_misc is 32 bits, but cl_lost_grant
is 64 bits.

Add a CDEBUG call so we can tell whether this happened.
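
A saturating conversion of this kind can be sketched as follows. The
helper name is hypothetical and the 64-bit counter is assumed to be
unsigned here; the real patch works on the fields named above.

```c
#include <stdint.h>

/*
 * Narrow a 64-bit lost-grant counter to a 32-bit wire field,
 * saturating at UINT32_MAX instead of silently truncating.
 */
static uint32_t clamp_dropped(uint64_t lost_grant)
{
	return lost_grant > UINT32_MAX ? UINT32_MAX : (uint32_t)lost_grant;
}
```

Without the clamp, a value such as 2^32 + 5 would be sent as 5.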

Lustre-change: https://review.whamcloud.com/40659
Lustre-commit: 82e9a11056a55289c880786da71d8b1125f357b2

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Ia459934c789ae9609f851ae0a2581de583c6fc1c
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/40615
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12681 osc: wrong cache of LVB attrs, part2 40/40740/3
Vitaly Fertman [Wed, 11 Sep 2019 15:22:23 +0000 (18:22 +0300)]
LU-12681 osc: wrong cache of LVB attrs, part2

It may happen that the osc oinfo lvb cache has size < kms.

This occurs if reply re-ordering happens and an older size is applied
to oinfo unconditionally.

Another possibility is RA, when osc_match_base() attaches the dlm lock
to the osc object but does not cache the lvb. The next layout change
will overwrite the lock lvb with the oinfo cache (previous LUS-7731
fix), presumably with smaller values. Therefore, the next lock re-use
may run into a problem with a partial page write that thinks the
preliminary read is not needed.

Do not let the cached oinfo lvb size become less than kms.
Also, cache the lock's lvb in the oinfo on osc_match_base().
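
The size invariant can be sketched like this. The struct and function
are simplified, hypothetical stand-ins for the oinfo lvb cache, shown
only to make the "size never below kms" rule concrete.

```c
#include <stdint.h>

/* Illustrative subset of the cached attributes. */
struct cached_attrs {
	uint64_t size;	/* cached file size from the lvb */
	uint64_t kms;	/* known minimum size */
};

/*
 * Apply a size from a (possibly re-ordered, stale) reply without
 * letting the cached size drop below kms.
 */
static void cache_lvb_size(struct cached_attrs *oinfo, uint64_t reply_size)
{
	oinfo->size = reply_size < oinfo->kms ? oinfo->kms : reply_size;
}
```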

Lustre-change: https://review.whamcloud.com/36200
Lustre-commit: 40319db5bc649adaf3dad066e2c1bb49f7f1c04a

Change-Id: I50136f57491364146ce7b6a81b814e474e3edb86
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/40740
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13344 gss: Update crypto to use sync_skcipher 94/40994/2
Shaun Tancheff [Sun, 24 May 2020 19:29:41 +0000 (14:29 -0500)]
LU-13344 gss: Update crypto to use sync_skcipher

As of linux v4.19-rc2-66-gb350bee5ea0f the change
   crypto: skcipher - Introduce crypto_sync_skcipher

Enabled the deprecation of blkcipher which was dropped
as of linux v5.4-rc1-159-gc65058b7587f
    crypto: skcipher - remove the "blkcipher" algorithm type

Based on the existence of SYNC_SKCIPHER_REQUEST_ON_STACK
use the sync_skcipher API or provide wrappers for the
blkcipher API

Lustre-change: https://review.whamcloud.com/38586
Lustre-commit: 0a65279121a5a0f5c8831dd2ebd6927a235a94c2

HPE-bug-id: LUS-8589
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I7683c20957213fd687ef5cf6dea64c842928db5b
Reviewed-on: https://review.whamcloud.com/40994
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13498 gss: update sequence in case of target disconnect 93/40993/2
Sebastien Buisson [Fri, 2 Oct 2020 12:05:55 +0000 (21:05 +0900)]
LU-13498 gss: update sequence in case of target disconnect

Client to OST connections can go idle, leading to target disconnect.
In this event, maintaining correct sequence number ensures that GSS
does not erroneously consider requests as replays.
The sequence is normally updated on export destroy, but this can occur
too late, i.e. after a new target connect request has been processed.
So explicitly update the sec context at disconnect time.

Lustre-change: https://review.whamcloud.com/40122
Lustre-commit: 1275857c178fdf6e301345c7588499451c8ffd37

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I65c27e1ab459b2a29670580121ef6e1a00f18918
Reviewed-on: https://review.whamcloud.com/40993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13754 gss: open sptlrpc init channel in R+W mode 67/40367/5
Sebastien Buisson [Fri, 30 Oct 2020 07:36:25 +0000 (00:36 -0700)]
LU-13754 gss: open sptlrpc init channel in R+W mode

Linux 5.3 changed struct cache_detail readers to writers.
As this mechanism is used by GSS authentication in Lustre via SunRPC,
we need to make sure lsvcgssd daemon does open
/proc/net/rpc/auth.sptlrpc.init/channel in R+W mode.

It also affects CentOS/RHEL 7.8, as the kernel commit was ported to
these distros.
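
In user-space terms the change amounts to the open mode. A minimal
sketch, where open_channel() is a hypothetical helper and not the
lsvcgssd code:

```c
#include <fcntl.h>

/*
 * Open a cache channel file for both reading and writing.
 * Returns a file descriptor, or -1 with errno set on failure.
 */
static int open_channel(const char *path)
{
	return open(path, O_RDWR);
}
```

The path to pass would be the channel file named above; any readable and
writable file works for trying the helper out.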

Lustre-commit: 0d59f1a2c1e88495d1d697acabb572f67ccc211e
Lustre-change: https://review.whamcloud.com/39297

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: If88802d4f2bc3168dda4f79fe57f2f44ac7ef39e
Reviewed-on: https://review.whamcloud.com/40367
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12634 gss: uid_keyring and session_keyring moved 54/40354/8
Shaun Tancheff [Fri, 30 Oct 2020 07:18:46 +0000 (00:18 -0700)]
LU-12634 gss: uid_keyring and session_keyring moved

Linux 5.3 removed uid_keyring and session_keyring from user_struct
Prefer the lookup_user_key() API when it is available (~5.0)
Prefer get_request_key_auth() when it is available (~5.0)

kernel-commit: 0f44e4d976f96c6439da0d6717238efa4b91196e
kernel-commit: 822ad64d7e46a8e2c8b8a796738d7b657cbb146d

Remove LC_HAVE_CRED_TGCRED which is no longer used.

Lustre-commit: 97301a491d46cf2cf829185b52b8690287ab7ed6
Lustre-change: https://review.whamcloud.com/35743

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I6d551cd8a9e317b717a43cba9be57f184a281c0a
Reviewed-on: https://review.whamcloud.com/40354
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14031 ptlrpc: decrease time between reconnection 38/40638/4
Alexander Boyko [Wed, 14 Oct 2020 08:20:58 +0000 (04:20 -0400)]
LU-14031 ptlrpc: decrease time between reconnection

When a connection times out or gets an error reply from a server,
the next attempt happens after PING_INTERVAL, which is equal to
obd_timeout/4. When the first reconnection fails, the second goes to
the failover pair, and the third goes to the original server.
That leaves only 3 reconnections before the server evicts the client
based on the blocking AST timeout. Sometimes the first fails and the
last is a bit late, so the client is evicted. It is better to retry
with a timeout equal to the connection request deadline, which would
increase the number of attempts by 5 times for a large obd_timeout.
For example,
    obd_timeout=200
     - [ 1597902357, CONNECTING ]
     - [ 1597902357, FULL ]
     - [ 1597902422, DISCONN ]
     - [ 1597902422, CONNECTING ]
     - [ 1597902433, DISCONN ]
     - [ 1597902473, CONNECTING ]
     - [ 1597902473, DISCONN ] <- ENODEV from a failover pair
     - [ 1597902523, CONNECTING ]
     - [ 1597902539, DISCONN ]

The patch adds logic to wake up the pinger for a connection request
that failed with ETIMEDOUT or ENODEV. It adds imp_next_ping processing
to the ptlrpc_pinger_main() time_to_next_wake calculation, and fixes
the setting of the imp_next_ping value.
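
The arithmetic behind the example can be sketched as below. Only the
obd_timeout/4 relation comes from the message; the helper names, the
150 s window and the 10 s deadline used in the note are hypothetical
values chosen to reproduce the "3 reconnections" and "5 times" figures.

```c
/* PING_INTERVAL as described above: obd_timeout / 4. */
static int ping_interval(int obd_timeout)
{
	return obd_timeout / 4;
}

/*
 * How many reconnect attempts fit into `window` seconds when one
 * attempt starts every `interval` seconds.
 */
static int attempts_in_window(int window, int interval)
{
	return interval > 0 ? window / interval : 0;
}
```

With obd_timeout=200 the interval is 50 s, which matches the roughly
50 s gaps between the later CONNECTING entries in the log above. In an
assumed 150 s window that allows 3 attempts, while retrying every 10 s
would allow 15.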

Lustre-commit: de8ed5f19f04136a4addcb3f91496f26478d03e7
Lustre-change: https://review.whamcloud.com/40244

HPE-bug-id: LUS-8520
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia0891a8ead1922810037f7d71092cd57c061dab9
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/40638
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14031 ptlrpc: remove unused code at pinger 37/40637/4
Etienne AUJAMES [Thu, 12 Nov 2020 18:12:59 +0000 (19:12 +0100)]
LU-14031 ptlrpc: remove unused code at pinger

The timeout_list was previously used for grant shrinking,
but right now is dead code.

Lustre-commit: f02266305941423a10e8e6ec33a5865e24c18432
Lustre-change: https://review.whamcloud.com/40243

HPE-bug-id: LUS-8520
Fixes: fc915a43786e ("LU-8708 osc: depart grant shrinking from pinger")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: Ia7a77b4ac19da768ebe1b0879d7123941f4490b5
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/40637
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14027 ldlm: Do not hang if recovery restarted during lock replay 24/41224/2
Oleg Drokin [Wed, 14 Oct 2020 03:55:02 +0000 (23:55 -0400)]
LU-14027 ldlm: Do not hang if recovery restarted during lock replay

LU-13600 introduced lock ratelimiting logic, but it did not take
into account that if there's a disconnection in the REPLAY_LOCKS
phase then yet unsent locks get stuck in the sending queue so
the replay locks thread hangs with imp_replay_inflight elevated
above zero.

The direct consequence is that the recovery state machine never
advances from REPLAY to REPLAY_LOCKS status while imp_replay_inflight
is non-zero.

Adjust __ldlm_replay_locks() to check if the import state changed
before attempting to send any more requests.

Add a testcase.

Lustre-change: https://review.whamcloud.com/40238
Lustre-commit: 7ca495ec67f474e10352077fc40123e4818b8e69

Change-Id: Idbaf5461f33d1884088269d67d01071c7e1bf8a5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Fixes: 3b613a442b ("LU-13600 ptlrpc: limit rate of lock replays")
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Fixes: 6b6d9c0911 ("LU-13600 ptlrpc: limit rate of lock replays")
Reviewed-on: https://review.whamcloud.com/41224
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14027 ldlm: Do not wait for lock replay sending if import disconnected 23/41223/2
Oleg Drokin [Fri, 16 Oct 2020 14:25:58 +0000 (10:25 -0400)]
LU-14027 ldlm: Do not wait for lock replay sending if import disconnected

If the import disconnected while we were preparing to send some lock
replays, the sending RPC would get stuck on the sending list and would
keep the reconnected import state from progressing from REPLAY
to REPLAY_LOCKS state, waiting for the queued replay RPCs to finish.

Set them as no_delay to ensure they don't wait.

LU-13600 exacerbated this issue somewhat, but it certainly existed
before that as well.

Lustre-change: https://review.whamcloud.com/40272
Lustre-commit: f06a4efe13faca21ae2a6afcf5718d748bb6ac5d

Change-Id: Id276a0be7657d9ad6cf40ad8e7a165d5cd341cb8
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-on: https://review.whamcloud.com/41223
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14395 kernel: kernel update RHEL7.9 [3.10.0-1160.15.2.el7] 46/41446/3
Jian Yu [Wed, 3 Mar 2021 03:35:04 +0000 (19:35 -0800)]
LU-14395 kernel: kernel update RHEL7.9 [3.10.0-1160.15.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.15.2.el7.

Change the debuginfo download location, since debuginfo.centos.org
no longer provides kernel-debuginfo-common.

The patch also reverts the following fix, which has been in the
RHEL 7.9 kernel since version 3.10.0-1160.8.1.el7:

- [kernel] timer: Fix lockup in __run_timers() caused by
  large jiffies/timer_jiffies delta (Waiman Long) [1849716]

The above fix caused a hard LOCKUP kernel panic.

Test-Parameters: clientdistro=el7.9 serverdistro=el7.9

Change-Id: Icdd9e8bf4bd595dece266f6c5a9b0de344781a93
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41446
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10157 lnet: restore an maximal fragments count 77/41277/3
Alexey Lyashkov [Wed, 20 Jan 2021 03:00:40 +0000 (22:00 -0500)]
LU-10157 lnet: restore an maximal fragments count

Lowering the number of fragments blocks connections from older
clients that want to use 256 fragments per transfer. Restore this
number to its original value.

Lustre-change: https://review.whamcloud.com/37385
Lustre-commit: 4072d863c240fa5466f0f616f7e9b1cfcdf0aa0e

Fixes: 272e49c ("LU-10157 lnet: make LNET_MAX_IOV dependent on page size")
Cray-bug-id: LUS-8139
Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I94ac16c1c75efda3e5f3f35ddfc5f39921c15873
Reviewed-on: https://review.whamcloud.com/41277
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10157 ptlrpc: fill md correctly. 76/41276/3
Alexey Lyashkov [Wed, 20 Jan 2021 02:57:28 +0000 (21:57 -0500)]
LU-10157 ptlrpc: fill md correctly.

The MD fill should be limited by the overall transfer size in
addition to the number of fragments.
Let's do this.
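
As a rough illustration of the rule above (not the actual ptlrpc code),
the sketch below fills one MD from a list of fragment sizes and stops
at whichever limit is reached first: the fragment count or the byte
budget. The function name fill_md and the parameters max_frags and
max_bytes are hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def fill_md(frag_sizes: List[int], start: int,
            max_frags: int, max_bytes: int) -> Tuple[int, int]:
    """Hypothetical helper: fill one MD starting at fragment `start`.

    Returns (fragments_used, bytes_used). Filling stops as soon as
    either the fragment limit or the byte limit would be exceeded.
    """
    used = 0
    total = 0
    for size in frag_sizes[start:]:
        # Stop on the fragment-count limit or the transfer-size limit.
        if used == max_frags or total + size > max_bytes:
            break
        used += 1
        total += size
    return used, total


# Small fragments: the fragment-count limit is reached first.
small = fill_md([4096] * 300, 0, max_frags=256, max_bytes=1 << 20)

# Large fragments: the byte limit is reached long before 256 fragments.
large = fill_md([65536] * 300, 0, max_frags=256, max_bytes=1 << 20)
```

With 4 KiB fragments both limits coincide at 256 fragments and 1 MiB;
with 64 KiB fragments only the size limit keeps the MD at 1 MiB.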

Lustre-change: https://review.whamcloud.com/37387
Lustre-commit: e1ac9e74844dc75d77ef740b3a44fad2efde30c5

Cray-bug-id: LUS-7159
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ibd3be1989c8dd5012e1b158f3942fd041f2da350
Reviewed-on: https://review.whamcloud.com/41276
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10157 ptlrpc: separate number MD and refrences for bulk 75/41275/3
Alexey Lyashkov [Wed, 20 Jan 2021 01:39:27 +0000 (20:39 -0500)]
LU-10157 ptlrpc: separate number MD and refrences for bulk

Introduce bulk desc refs, which are distinct from the MD count:
ptlrpc expects to have events from all MDs, whether they are filled
or not. So the number of MDs to post is related to the requested
transfer size, not to the number of MDs with data.
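
A minimal sketch of the arithmetic described above, assuming each MD
covers at most md_max_bytes of the transfer. The helper name md_count
and the sizes used here are hypothetical and do not come from the
patch itself.

```python
def md_count(nbytes: int, md_max_bytes: int) -> int:
    """Hypothetical helper: MDs needed to cover nbytes (ceiling division)."""
    if nbytes <= 0:
        return 0
    return (nbytes + md_max_bytes - 1) // md_max_bytes


MD_MAX_BYTES = 1 << 20          # assumed per-MD limit of 1 MiB
requested = 4 << 20             # peer asked for a 4 MiB transfer
with_data = (3 << 20) // 2      # only 1.5 MiB actually carries data

# MDs to post follows the requested size, not the amount of data.
to_post = md_count(requested, MD_MAX_BYTES)
filled = md_count(with_data, MD_MAX_BYTES)
```

In this example four MDs are posted although only two hold data, so a
counter of expected events has to follow the posted MDs rather than
the filled ones.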

Lustre-change: https://review.whamcloud.com/37386/
Lustre-commit: 8a7f2d4b11801eae4c91904da9f9750a012a6b11

Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ic7d62c5c8d30fd6b681853a65429394ed2f122f2
Reviewed-on: https://review.whamcloud.com/41275
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14125 obdclass: add grant fields to export procfile 24/39324/6
Olaf Faaland [Thu, 2 Jul 2020 21:25:32 +0000 (14:25 -0700)]
LU-14125 obdclass: add grant fields to export procfile

Add ted_{grant,reserved,dirty} to the export
procfile for OSTs, to allow comparison of the
OST's idea of grants allocated to the client with
the client's own idea.

Lustre-change: https://review.whamcloud.com/40563
Lustre-commit: 53ee416097a9a77ca0ee352714af02e77489e3f8

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: Ib34582e2be55fe2007363b52cea4dee211b7f540
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39324
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14376 kernel: kernel update SLES12 SP5 [4.12.14-122.57.1] 57/41357/2
Jian Yu [Thu, 28 Jan 2021 20:16:09 +0000 (12:16 -0800)]
LU-14376 kernel: kernel update SLES12 SP5 [4.12.14-122.57.1]

Update SLES12 SP5 kernel to 4.12.14-122.57.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 817" testlist=sanity

Change-Id: I1ad5feb6f63cbaa948226fcb4248a2a767b67ce3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41357
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>