fs/lustre-release.git
2 years ago LU-14905 lfsck: linkEA overflow handling fix 69/44469/3
Vitaly Fertman [Mon, 2 Aug 2021 17:04:44 +0000 (20:04 +0300)]
LU-14905 lfsck: linkEA overflow handling fix

A link that is absent from the linkEA is not an error, and must not
be repaired, if the EA has overflowed. lfsck should not report it as
an issue when there is no space for the link, and should not report
it as fixed when it is not (in that case linkea_add_buf() returns 0
without adding a new entry to the EA, and
lfsck_namespace_assistant_handler_p1() later reports it as repaired).
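
The failure mode described above can be sketched in plain C. This is a minimal illustration under stated assumptions, not the lfsck code: struct ea_buf, ea_add_entry() and repair_missing_link() are hypothetical names, and the only behaviour taken from the commit message is that the add routine returns 0 even when the entry did not fit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a linkEA buffer. */
struct ea_buf {
	size_t used;       /* bytes currently used */
	size_t capacity;   /* maximum size of the EA */
	bool   overflowed; /* set once an entry did not fit */
};

/*
 * Returns 0 both when the entry was added and when the EA overflowed,
 * mirroring the behaviour described in the commit message. *added
 * tells the caller which of the two happened.
 */
static int ea_add_entry(struct ea_buf *buf, size_t entry_len, bool *added)
{
	*added = false;
	if (buf->used + entry_len > buf->capacity) {
		buf->overflowed = true;
		return 0;
	}
	buf->used += entry_len;
	*added = true;
	return 0;
}

/* Count a repair only if the entry really landed in the EA. */
static int repair_missing_link(struct ea_buf *buf, size_t entry_len,
			       unsigned int *repaired)
{
	bool added;
	int rc = ea_add_entry(buf, entry_len, &added);

	if (rc == 0 && added)
		(*repaired)++;
	return rc;
}
```

With a 10-byte buffer, a first 8-byte entry is counted as repaired, while a second one overflows and leaves the counter unchanged even though the return code is still 0.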

HPE-bug-id: LUS-8810
Signed-off-by: Vitaly Fertman <vitaly.fertman@hpe.com>
Change-Id: Iba1549045c8c3889adf55c99cdd88756e5643073
Reviewed-on: https://es-gerrit.dev.cray.com/158706
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-on: https://review.whamcloud.com/44469
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14866 lod: remove duplicate OST_TGT 51/44351/4
Sergey Cheremencev [Tue, 20 Jul 2021 09:24:33 +0000 (12:24 +0300)]
LU-14866 lod: remove duplicate OST_TGT

Remove duplicate OST_TGT from lod_ost_alloc_qos.

Change-Id: I4fbe2daa057f23a60e31e59d7c0db592945a5363
Fixes: 2112ccb3c4 ("LU-13073 osp: don't block waiting for new objects")
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44351
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-14989 sec: keep encryption context in xattr cache 48/45148/2
Sebastien Buisson [Thu, 7 Oct 2021 14:04:34 +0000 (16:04 +0200)]
LU-14989 sec: keep encryption context in xattr cache

When an inode is being cleared, its xattr cache must be completely
wiped. But in case of lock cancel, we want to keep the encryption
context, as further processing might need to check it.
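
A minimal sketch of the two wipe modes, assuming a flat array as the cache (the real xattr cache is a per-inode structure, and every name below is hypothetical). The xattr name 'security.c' is the one given for the encryption context elsewhere in this log.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_XATTRS 8

/* Hypothetical flat xattr cache used only for illustration. */
struct xattr_cache {
	const char *names[MAX_XATTRS];
	int count;
};

/* 'security.c' is the xattr that stores the encryption context. */
static bool is_enc_ctx(const char *name)
{
	return strcmp(name, "security.c") == 0;
}

/*
 * Drop cached xattrs. With keep_enc_ctx set (the lock cancel case)
 * the encryption context entry survives; otherwise (the inode is
 * being cleared) every entry is dropped.
 */
static void xattr_cache_wipe(struct xattr_cache *cache, bool keep_enc_ctx)
{
	int kept = 0;

	for (int i = 0; i < cache->count; i++) {
		if (keep_enc_ctx && is_enc_ctx(cache->names[i]))
			cache->names[kept++] = cache->names[i];
	}
	cache->count = kept;
}
```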

Fixes: 1faf54e8bf ("LU-14989 sec: access to enc file's xattrs")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8a2f4497129353a7fbf86cdaaa13fae6e0988790
Reviewed-on: https://review.whamcloud.com/45148
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13941 osp: Silently lower requested create_count to maximum 67/39967/9
Shaun Tancheff [Mon, 23 Aug 2021 14:40:39 +0000 (09:40 -0500)]
LU-13941 osp: Silently lower requested create_count to maximum

When setting create_count it should silently accept a larger value
and truncate it to the current maximum.

This would avoid issues if that limit is changed in the future.
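
The intended behaviour amounts to a clamp. A minimal sketch, assuming a hypothetical limit CREATE_COUNT_MAX (the value here is illustrative and not taken from the Lustre source):

```c
/* Hypothetical limit; the real maximum is defined in the Lustre source. */
#define CREATE_COUNT_MAX 20000u

/* Accept any requested value and silently lower it to the maximum. */
static unsigned int clamp_create_count(unsigned int requested)
{
	return requested > CREATE_COUNT_MAX ? CREATE_COUNT_MAX : requested;
}
```

A caller that asks for more than the limit gets the limit rather than an error, so scripts keep working if the limit changes later.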

HPE-bug-id: LUS-5960
Test-Parameters: trivial testlist=parallel-scale,sanity
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I4727ba6fca747e1ae9850188ef63c7abb89904be
Reviewed-on: https://review.whamcloud.com/39967
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14797 nodemap: map project id 19/44119/9
Sebastien Buisson [Wed, 30 Jun 2021 16:30:57 +0000 (18:30 +0200)]
LU-14797 nodemap: map project id

Add calls to nodemap_map_id() in order to map project IDs from
client ID to server ID and conversely.
Also extend nodemap_can_setquota() to allow setquota on project
only if ID is not squashed or deny_unknown is not set.
Update sanity-sec test_27a to exercise the feature.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Id66458550d312404b1993ead8940c3d12eaadccd
Reviewed-on: https://review.whamcloud.com/44119
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15052 lnet: include linux/ethtool.h 09/45109/5
Jian Yu [Fri, 1 Oct 2021 06:27:07 +0000 (23:27 -0700)]
LU-15052 lnet: include linux/ethtool.h

Kernel 5.11.0-36 removes including linux/ethtool.h from
linux/netdevice.h, which caused the following build error:

dereferencing pointer to incomplete type 'const struct ethtool_ops'

This patch fixes the above issue by adding the include into
the file that uses the structure.

Test-Parameters: trivial

Change-Id: Ifc25de5acaebf2b5fd5bb6f1c303366ab9ea6ef6
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45109
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-8066 obdclass: Remove lprocfs_obd_short_io_bytes_* declarations 96/45096/3
Oleg Drokin [Thu, 30 Sep 2021 00:17:33 +0000 (20:17 -0400)]
LU-8066 obdclass: Remove lprocfs_obd_short_io_bytes_* declarations

The functions themselves were renamed long ago.

Change-Id: Ic97d83a56d065ff1dadfc9a01d878e246e06a847
Test-Parameters: trivial
Fixes: 32fb31f3bf ("LU-8066 osc: move suitable values from procfs to sysfs")
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/45096
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
2 years ago LU-15040 mdc: update max_easize on reconnect 73/45073/3
Sergey Cheremencev [Wed, 11 Nov 2020 08:19:29 +0000 (11:19 +0300)]
LU-15040 mdc: update max_easize on reconnect

If the MDS was restarted to enable ea_inode, clients should get the new
max_easize value. However, cl_max_mds_easize is not updated. This may
cause lfs getstripe to fail if a file has a very large stripe count
(2000 for example):

*** Error in `lfs': free(): invalid pointer: 0x0000000000de09d0 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81299)[0x7f0623c03299]
/lib64/libc.so.6(closedir+0xd)[0x7f0623c42ddd]
/lib/liblustreapi.so.1(+0xa557)[0x7f06248b5557]
/lib/liblustreapi.so.1(+0xad74)[0x7f06248b5d74]
lfs[0x4105b3]
/lib/liblustreapi.so.1(Parser_execarg+0x51)[0x7f06248c88e1]
lfs[0x40448e]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f0623ba4555]
lfs[0x4044fc]

HPE-bug-id: LUS-9478
Change-Id: If155a63e2f07536c6500b37b5e6191cb8b0d0607
Reviewed-on: https://es-gerrit.dev.cray.com/158100
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Nikitas Angelinas <nangelinas@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/45073
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12268 osd: BUG_ON for IAM corruption 72/45072/2
Alexander Boyko [Tue, 28 Sep 2021 13:27:12 +0000 (09:27 -0400)]
LU-12268 osd: BUG_ON for IAM corruption

The patch adds strict checks of buffer head overflow
for IAM dx blocks.

HPE-bug-id: LUS-10178
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I1608f6cbf00b5120fbc36d0c65fcfe37c43e375f
Reviewed-on: https://review.whamcloud.com/45072
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15038 mgc: release cl_mgc_mutex on error 63/45063/3
Andreas Dilger [Mon, 27 Sep 2021 18:29:58 +0000 (12:29 -0600)]
LU-15038 mgc: release cl_mgc_mutex on error

If local_oid_storage_init() returns an error, the cl_mgc_mutex
should be released.
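
The error path can be sketched as follows. This is a userspace illustration only: struct demo_mutex, storage_init(), fs_setup() and fs_cleanup() are hypothetical stand-ins, and the sketch assumes that on success the mutex stays held until a later cleanup call.

```c
/* Hypothetical stand-in for a mutex, enough to observe its state. */
struct demo_mutex { int locked; };

static void demo_lock(struct demo_mutex *m)   { m->locked = 1; }
static void demo_unlock(struct demo_mutex *m) { m->locked = 0; }

/* Hypothetical init step; the result is injected for the demo. */
static int storage_init(int rc)
{
	return rc;
}

static int fs_setup(struct demo_mutex *m, int init_rc)
{
	int rc;

	demo_lock(m);
	rc = storage_init(init_rc);
	if (rc) {
		/* Error path: drop the mutex before returning. */
		demo_unlock(m);
		return rc;
	}
	/* Success: the mutex stays held until fs_cleanup(). */
	return 0;
}

static void fs_cleanup(struct demo_mutex *m)
{
	demo_unlock(m);
}
```

Without the unlock in the error branch, a failed setup would leave the mutex held and the next setup attempt would block.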

Fixes: 3e38436dc09 ("LU-2059 llog: MGC to use OSD API for backup logs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I921dde4e9202733874d8e7f980e95af23739a655
Reviewed-on: https://review.whamcloud.com/45063
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15011 tests: pool spill test modifications 56/45056/3
James Nunez [Mon, 27 Sep 2021 16:59:07 +0000 (10:59 -0600)]
LU-15011 tests: pool spill test modifications

Make the following modifications to the ost-pools
test suite:
test 29 - change check for 'when striping is specified
          explicitly' file from 'file-2' to 'file-3'

test 30 - Add bad parameter check for setting the threshold
          below zero

test 31 - 'do_nodes $mdts $LCTL get_param lod.*.pool.*'
          doesn’t print anything. Change to
          'do_nodes $mdts $LCTL get_param lod.*.pool.*.spill*'

Fixes: 0a998f4723 ("LU-14825 lod: pool spilling")
Test-Parameters: trivial testlist=ost-pools
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Icbdc3d42b7f7609bc57cc37830975d831125d659
Reviewed-on: https://review.whamcloud.com/45056
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
2 years ago LU-15028 tests: improve ha.sh to be more verbose 25/45025/2
Elena Gryaznova [Wed, 22 Sep 2021 17:47:36 +0000 (20:47 +0300)]
LU-15028 tests: improve ha.sh to be more verbose

The patch adds informational messages that make it simpler to
identify the reason for a failure.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10286
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Change-Id: I3bef165f497d745c3e8ee3c8a91532096100bb99
Reviewed-on: https://review.whamcloud.com/45025
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10) 16/45016/2
Arshad Hussain [Wed, 22 Sep 2021 11:01:10 +0000 (07:01 -0400)]
LU-15026 zfs: Fix ZFS(2.0.0-1) build error on CentOS (3.10)

ZFS: (2.0.0-1)
Lustre: 608cce73d51 LU-15007 tests: quota enable cmd fix
CentOS: 3.10.0-1160.15.2.el7.x86_64

This patch fixes two build failures, seen as below, for
the above configuration:

First
~~~~~
In file included from:
/root/zfs/zfs_git_lustre_build/zfs/include/sys/spa.h:39:0,
from libmount_utils_zfs.c:32:
/root/zfs/<path>/.../sys/zfs_context.h:110:27:
fatal error: sys/byteorder.h: No such file or directory
#include <sys/byteorder.h>

Second
~~~~~~
gcc -rdynamic -shared -export-dynamic -pthread \
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libzfs/.libs/
-L/root/zfs/zfs_git_lustre_build/zfs/lib/libnvpair/.libs
-o mount_osd_zfs.so \
`ar -t libmount_utils_zfs.a` \
-ldl -lzfs -lnvpair -lzpool
/usr/bin/ld: cannot find -lzpool
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
collect2: error: ld returned 1 exit status

Test-Parameters: trivial fstype=zfs
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I5be8cc846da1ca213f1bd3c29b6b55acc22928f2
Reviewed-on: https://review.whamcloud.com/45016
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13575 lnet: Ensure round robin selection of peer NIs 04/45004/4
Chris Horn [Fri, 27 Aug 2021 21:59:33 +0000 (16:59 -0500)]
LU-13575 lnet: Ensure round robin selection of peer NIs

Use the peer net sequence number to set the peer NI sequence number to
ensure round robin selection of peer NIs on each peer net.

HPE-bug-id: LUS-10349
Test-Parameters: trivial
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I1fa14ad675ead4ae2c5b1d4edad250caa4498df2
Reviewed-on: https://review.whamcloud.com/45004
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-13575 lnet: Ensure round robin selection of local NIs 03/45003/2
Chris Horn [Fri, 27 Aug 2021 21:29:09 +0000 (16:29 -0500)]
LU-13575 lnet: Ensure round robin selection of local NIs

Use the net sequence number to set the NI sequence number to ensure
round robin selection of NIs on each net.
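
One way to picture the idea is a per-net counter that stamps each selected NI, with selection preferring the NI that carries the lowest stamp. The sketch below is an assumption about the general technique, not the LNet selection code; struct demo_ni, struct demo_net and net_select_ni() are hypothetical, and counter wraparound is ignored.

```c
#include <stddef.h>

/* Hypothetical, minimal models of an NI and the net it belongs to. */
struct demo_ni  { unsigned int seq; };
struct demo_net {
	unsigned int    seq;   /* per-net sequence counter */
	struct demo_ni *nis;
	size_t          count;
};

/*
 * Pick the NI with the lowest sequence number (the least recently
 * selected one), then stamp it with the net's incremented counter so
 * that it moves to the back of the rotation for this net.
 */
static size_t net_select_ni(struct demo_net *net)
{
	size_t best = 0;

	for (size_t i = 1; i < net->count; i++)
		if (net->nis[i].seq < net->nis[best].seq)
			best = i;

	net->nis[best].seq = ++net->seq;
	return best;
}
```

Because the stamp comes from the net rather than from a counter shared across nets, traffic on one net does not disturb the rotation on another.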

Test-Parameters: trivial
HPE-bug-id: LUS-10349
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I6ce0b088fcad6312186e6fbad4ab14283aee55eb
Reviewed-on: https://review.whamcloud.com/45003
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15005 build: Ubuntu dkms packages missing dependancies 99/44999/3
James Beal [Tue, 21 Sep 2021 14:24:39 +0000 (15:24 +0100)]
LU-15005 build: Ubuntu dkms packages missing dependancies

It was noted that on a standard Ubuntu install the dkms package
would fail to install as libnl-genl-3-dev was missing. As a
test we tried installing the dkms package on an upstream
cloud image for 18.04 and 20.04. We noted that a number of
packages were needed before dkms would install cleanly.

Test-Parameters: trivial testgroup=review-ldiskfs-ubuntu

Signed-off-by: James Beal <jb23@sanger.ac.uk>
Change-Id: I53a2f143dd2154a2d0b598db8c60fd8ff1421860
Reviewed-on: https://review.whamcloud.com/44999
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14474 llog: don't destroy next llog 98/44998/7
Alex Zhuravlev [Tue, 21 Sep 2021 12:23:56 +0000 (15:23 +0300)]
LU-14474 llog: don't destroy next llog

Do not destroy an empty llog if it is referenced as
the next one in a catalog.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I78bfeb90435aaee2b8536b647aa3acec56642ea0
Reviewed-on: https://review.whamcloud.com/44998
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15014 obdclass: lu_ref_add() called in atomic context 69/44969/4
James Simmons [Sun, 22 Aug 2021 22:40:49 +0000 (18:40 -0400)]
LU-15014 obdclass: lu_ref_add() called in atomic context

For the native Linux client testing I turn on lu_ref
debugging. When turned on the following errors occur:

[ 2885.946815] Call Trace:
[ 2885.951240]  dump_stack+0x68/0x9b
[ 2885.956523]  ___might_sleep+0x205/0x260
[ 2885.962245]  lu_ref_add+0x25/0x40 [obdclass]
[ 2885.968442]  vvp_pgcache_current+0x101/0x1a0 [lustre]
[ 2885.975370]  seq_read+0x1ab/0x3c0

and

[ 7042.102529]  dump_stack+0x68/0x9b
[ 7042.107328]  ___might_sleep+0x205/0x260
[ 7042.112647]  lu_ref_add+0x25/0x40 [obdclass]
[ 7042.118385]  mdc_lock_upcall+0x154/0x4d0 [mdc]
[ 7042.124275]  mdc_enqueue_send+0x508/0x580 [mdc]
[ 7042.130225]  ? mdc_lock_lvb_update+0x280/0x280 [mdc]

This is easily fixed by introducing a lu_object_ref_add_atomic()
function.

Test-Parameters: trivial
Change-Id: Ife7d255079a836570661f669c1e9c7c0ce6de4aa
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44969
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-15013 osc: use original cli for osc_lru_reclaim for debug msg 66/44966/5
James Simmons [Fri, 17 Sep 2021 14:30:22 +0000 (10:30 -0400)]
LU-15013 osc: use original cli for osc_lru_reclaim for debug msg

Before the list cleanup introduced in osc_lru_reclaim(), the
variable cli was both passed in and used to scan the
cl_client_cache. After the scan completed, cli was used in a
debug message. The original intent appears to have been to use
the cli that was passed in for the debug message, not the last
scanned item. Since the list cleanup patch landed, cli can be
NULL after the scan, which can crash the node. The fix is to use
a separate struct client_obd variable for the scan and to use
the original cli passed in for the debug message.
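
The shape of the fix can be sketched with a simple linked list. Everything here is hypothetical (struct demo_client, lru_reclaim(), the message format); the point is only that the scan cursor is a separate variable, so the caller's cli is still valid when the message is formatted.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical client entry on a singly linked cache list. */
struct demo_client {
	const char         *name;
	int                 reclaimable;
	struct demo_client *next;
};

/*
 * Scan the cache list on behalf of 'cli'. The cursor 'scan' is NULL
 * once the loop ends; 'cli' is never reassigned, so it remains safe
 * to dereference in the message.
 */
static int lru_reclaim(const struct demo_client *cli,
		       struct demo_client *cache, char *msg, size_t len)
{
	struct demo_client *scan;
	int freed = 0;

	for (scan = cache; scan != NULL; scan = scan->next)
		freed += scan->reclaimable;

	/* Using scan->name here would dereference NULL. */
	snprintf(msg, len, "%s: freed %d", cli->name, freed);
	return freed;
}
```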

Test-Parameters: trivial
Fixes: 8c166f6bf42c ("LU-6142 lustre: use list_first_entry() in lustre subdirectory.")
Change-Id: I5f0f1b986744fdd30af7f7856c1278b41447a373
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14927 scrub: create shared scrub_needs_check() function. 89/44689/7
James Simmons [Thu, 30 Sep 2021 16:26:19 +0000 (12:26 -0400)]
LU-14927 scrub: create shared scrub_needs_check() function.

The current functions osd_consistency_check() in both ldiskfs and
zfs use ktime_* functions which are exported for pure GPL modules.
This is not the case for ZFS. We can refactor the code to create
a new common function scrub_needs_check() that can be used
alongside osd_consistency_check(). Fix a few cases where the error
code is not checked for ZFS.

Change-Id: I0cc6cd84a35ecc10b511096f4e749a2961da3bbf
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44689
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12262 llite: harden ll_sbi ll_flags 41/44541/16
James Simmons [Thu, 30 Sep 2021 18:57:02 +0000 (14:57 -0400)]
LU-12262 llite: harden ll_sbi ll_flags

For most file systems mount flags are straightforward, but this
is not the case for Lustre. We have to consider if the server
backend supports a mount option. Additionally, it's possible to
disable or enable a feature using sysfs during run time. Some
features can't be managed with a mount option but still can
be managed with sysfs or based on what is enabled on the server
node. All these states are reported together in the debugfs
file sbi_flags. The mount specific options are reported in
the super block show_option ll_show_option().

With all this complexity it is easy for it to get out of sync
and report incorrect things. We consolidate this handling by
moving to match_table_t, which is used by various Linux
file systems to parse options. LL_SBI_FLAGS is replaced by our
match_table_t, ll_sbi_flags_name, that can be used for mount
options as well as reporting the sbi_flags in debugfs. We take
advantage of the fact that mount option parse will stop at the
first NULL in ll_sbi_flags_name and after that NULL list the
other features flags that are managed with other methods besides
mount options.

The next change is the move of ll_flags to a bitmap which gives
us two advantages. The first is that we can support more than
32 flags in the future. Second is no need to use bit shifting
math since we can use the enum LL_SBI_* values directly with
clear_bit() / set_bit() / test_bit(). All these changes should
minimize future problems with keeping all these states in sync.
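
A userspace sketch of the bitmap idea follows. The enum values and helper names are hypothetical, and the helpers are plain, non-atomic stand-ins for the kernel's set_bit() / clear_bit() / test_bit(); the sketch only shows that enum values are used as bit numbers directly and that a flag above bit 31 works.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical flag numbers: plain enum values, not shifted masks. */
enum demo_sbi_flag {
	DEMO_SBI_CHECKSUM  = 0,
	DEMO_SBI_ENCRYPT   = 40,   /* beyond what a 32-bit word can hold */
	DEMO_SBI_NUM_FLAGS = 64,
};

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define WORDS_FOR(n)  (((n) + BITS_PER_WORD - 1) / BITS_PER_WORD)

struct demo_sbi {
	unsigned long flags[WORDS_FOR(DEMO_SBI_NUM_FLAGS)];
};

/* Non-atomic userspace stand-ins for the kernel bit helpers. */
static void demo_set_bit(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_WORD] |= 1UL << (nr % BITS_PER_WORD);
}

static void demo_clear_bit(unsigned int nr, unsigned long *map)
{
	map[nr / BITS_PER_WORD] &= ~(1UL << (nr % BITS_PER_WORD));
}

static bool demo_test_bit(unsigned int nr, const unsigned long *map)
{
	return (map[nr / BITS_PER_WORD] >> (nr % BITS_PER_WORD)) & 1UL;
}
```

Callers write demo_set_bit(DEMO_SBI_ENCRYPT, sbi.flags) rather than computing a mask by hand, which is the convenience the commit message describes.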

Change-Id: Ia4f08cdde54c0fd11440dcf6b60b5fcb8bfb4d63
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44541
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14895 client: allow case-insensitive checksum types 30/44530/4
Andreas Dilger [Fri, 30 Jul 2021 21:36:19 +0000 (15:36 -0600)]
LU-14895 client: allow case-insensitive checksum types

The current t10ip4K and t10crc4K checksum types use an upper-case 'K'
in the name, unlike the other checksum types which are all lower-case.
This distinction is difficult to see in some fonts, and can cause
usage errors.  Accept upper-case variants of the checksum type names.
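
A case-insensitive lookup of this kind can be sketched with POSIX strcasecmp(). The table below holds only the two names mentioned in the text, and cksum_type_lookup() is a hypothetical helper rather than the function changed by the patch.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp() (POSIX) */

/* Only the two names from the text; a real table would hold more. */
static const char *const cksum_names[] = { "t10ip4K", "t10crc4K" };

/* Return the index of the matching checksum type, or -1 if unknown. */
static int cksum_type_lookup(const char *name)
{
	size_t n = sizeof(cksum_names) / sizeof(cksum_names[0]);

	for (size_t i = 0; i < n; i++)
		if (strcasecmp(name, cksum_names[i]) == 0)
			return (int)i;
	return -1;
}
```

With this comparison, "t10ip4k" and "T10CRC4K" both resolve to the same entries as their canonical spellings.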

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I97673ffa98cf8e5fc601ac7df5aaafb24b3ebbe5
Reviewed-on: https://review.whamcloud.com/44530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Etienne AUJAMES <eaujames@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14911 osp: release thandle if it was created
 04/44504/7
Alex Zhuravlev [Thu, 5 Aug 2021 05:52:53 +0000 (08:52 +0300)]
LU-14911 osp: release thandle if it was created

osp_statfs_update() could leak the thandle if the transaction
could not start for some reason.
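
The leak pattern is the usual one for a handle that is created in one step and started in another. A minimal sketch with hypothetical names (struct demo_handle, trans_create(), trans_start(), trans_stop(), statfs_update()), using a counter to make leaks observable:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical transaction handle and a counter of live handles. */
struct demo_handle { int started; };
static int live_handles;

static struct demo_handle *trans_create(void)
{
	struct demo_handle *th = calloc(1, sizeof(*th));

	if (th)
		live_handles++;
	return th;
}

/* The start result is injected so the failure path can be exercised. */
static int trans_start(struct demo_handle *th, int rc)
{
	if (rc == 0)
		th->started = 1;
	return rc;
}

static void trans_stop(struct demo_handle *th)
{
	free(th);
	live_handles--;
}

static int statfs_update(int start_rc)
{
	struct demo_handle *th = trans_create();
	int rc;

	if (!th)
		return -ENOMEM;

	rc = trans_start(th, start_rc);
	if (rc == 0) {
		/* ... the update itself would run here ... */
	}

	/* Reached on failure too, so the created handle is released. */
	trans_stop(th);
	return rc;
}
```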

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I541a5e4a7860008eb179d905ac57997b737f178c
Reviewed-on: https://review.whamcloud.com/44504
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14801: utils: improve performance of 'lfs find -perm' 18/44118/11
Courrier Guillaume [Wed, 23 Jun 2021 09:15:08 +0000 (11:15 +0200)]
LU-14801: utils: improve performance of 'lfs find -perm'

The current implementation of the "-perm" predicate queries the
stat information on the OST even when it is not necessary. This patch
fixes that issue by moving the check to the correct location in
`cb_find_init`.

On a simple setup with 1 MDT and 2 OSTs we observed for around 8000
files:
- 2.3s without the patch
- 1.9s with the patch

Test-Parameters: trivial
Change-Id: I30c0e89d136556058eadf6bede062577c6d36eaf
Signed-off-by: Courrier Guillaume <guillaume.courrier@cea.fr>
Reviewed-on: https://review.whamcloud.com/44118
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Rick Mohr <mohrrf@ornl.gov>
Reviewed-by: Anjus George <georgea@ornl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
2 years ago LU-14797 sec: add projid to nodemap 08/44108/8
Sebastien Buisson [Tue, 29 Jun 2021 15:54:59 +0000 (17:54 +0200)]
LU-14797 sec: add projid to nodemap

Add the ability to create id maps of a new type, projid. This also
requires adding a new value to map_mode, projid_only. Finally, a new
property named squash_projid is used to map all project IDs to a
default one.
Update lctl man pages to mention these additions.
Update sanity-sec test_12 and test_15 to exercise projid mapping and
squash_projid property.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I63eba8b0d33feaa7ece8c1788cb587fcb330357a
Reviewed-on: https://review.whamcloud.com/44108
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14739 quota: fix quota with root squash enabled 47/44347/17
Wang Shilong [Tue, 20 Jul 2021 02:36:31 +0000 (10:36 +0800)]
LU-14739 quota: fix quota with root squash enabled

This patch tries to fix several problems:

1. OSD will ignore quota if IO comes from client
cache or root, however since following change:

LU-12687 osc: consume grants for direct I/O

DIO now consumes grant too, following check for
sync IO is wrong now:

(lnb[i].lnb_flags & (OBD_BRW_FROM_GRANT | OBD_BRW_SYNC))
        == OBD_BRW_FROM_GRANT)

This was originally added to support the 1.8 client; it is
going to be 2.15 now, so let's remove this broken check.

2. Server side will clear OBD_BRW_NOQUOTA if root squash
is enabled, this will revert fixes from:

"LU-13228 clio: mmap write when overquota"

We need to separate @ci_noquota and @oi_cap_sys_resource cases,
introduce a new flag OBD_BRW_SYS_RESOURCE, and extend test_75
to cover this case.

3. LU-14739 missed case that DoM quota should be considered
as well.

4. If EDQUOT is returned for root, we check the new root squash
flag OBD_FL_ROOT_SQUASH from server side. If this flag is not set,
we bypass quota for root, otherwise all root writes become sync
writes.

5. Fix a leftover problem with LU-9671 for DOM

Fixes: a4fbe7341baf12 ("LU-14739 quota: nodemap squashed root cannot bypass quota")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3fd23da7d56acb5b485540333208e5d5b0b48023
Reviewed-on: https://review.whamcloud.com/44347
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14475 log: Rewrite some log messages 92/41892/5
Lei Feng [Fri, 5 Mar 2021 12:01:37 +0000 (20:01 +0800)]
LU-14475 log: Rewrite some log messages

Some log messages are too short to be meaningful. Rewrite them.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Test-Parameters: trivial
Change-Id: I9ae7d7da23c7e227c4e2b84010fb0c2a06b8cc87
Reviewed-on: https://review.whamcloud.com/41892
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
2 years ago LU-12362 ptlrpc: use wait_woken() in ptlrpcd() 69/45069/4
Mr NeilBrown [Tue, 28 Sep 2021 05:20:30 +0000 (15:20 +1000)]
LU-12362 ptlrpc: use wait_woken() in ptlrpcd()

Using wait_event() to wait for ptlrpcd_check() to succeed is
problematic.  ptlrpcd_check() is complex and can wait for other
events.  This nested waiting can behave differently to expectation and
generates a warning

   do not call blocking ops when !TASK_RUNNING

This happens because the task state is set to TASK_IDLE before
ptlrpcd_check() is called.

A better approach (introduced for precisely this use-case) is to use
wait_woken() and woken_wake_function().

When a wake_up is requested on the waitq, woken_wake_function() sets a
flag to record the wakeup.  wait_woken() will wait until this flag is
set.  This way, the task state doesn't need to be set until after
ptlrpcd_check() has completed.

wait_woken() was introduced in Linux 3.19, so libcfs is enhanced to
provide the functionality on older kernels.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iaddf56e2e76c204435bbef3c857e54ce0a6772bc
Reviewed-on: https://review.whamcloud.com/45069
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: xinliang <xinliang.liu@linaro.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-6142 lod: return pools_hash_params to being static. 70/45070/4
Mr NeilBrown [Tue, 28 Sep 2021 06:44:40 +0000 (16:44 +1000)]
LU-6142 lod: return pools_hash_params to being static.

A recent patch changes pools_hash_params in lod_pool.c to no longer
be 'static'.  This is not ideal.

rhashtable interfaces are mostly 'static inlines' which contain a lot
of code which is mostly optimised away providing that the 'params'
structure is const and locally visible.  When these interfaces are
called with a params structure in another file, the code produced is
quite inefficient and wasteful.

It is generally cleaner to provide accessor functions which can be
exported to other compilation units.  It is even beneficial to do that
within the one file.

This patch introduces
   lod_pool_exists()
and
   lod_pool_find()

The first is 'extern' and thus 'pools_hash_params' can now be static.
The second is used in several places in lod_pool.c, improving code
quality and maintainability.
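
The accessor pattern can be sketched in a single file. This is not the lod_pool.c code: struct demo_params, pools_params, pool_find() and pool_exists() are hypothetical, and a short array stands in for the rhashtable. What it shows is a file-local const parameter block that only helpers in the same file read, with one exported function for other compilation units.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter block standing in for hash table params. */
struct demo_params { size_t max_name_len; };

/* File-local and const, so the compiler can fold it into the helpers. */
static const struct demo_params pools_params = { .max_name_len = 15 };

static const char *const pool_names[] = { "fast", "archive" };

/* Prototype as it would appear in a shared header. */
bool pool_exists(const char *name);

/* Internal lookup, used only within this file. */
static const char *pool_find(const char *name)
{
	size_t n = sizeof(pool_names) / sizeof(pool_names[0]);

	if (strlen(name) > pools_params.max_name_len)
		return NULL;
	for (size_t i = 0; i < n; i++)
		if (strcmp(name, pool_names[i]) == 0)
			return pool_names[i];
	return NULL;
}

/* Exported accessor: other files call this, not the params directly. */
bool pool_exists(const char *name)
{
	return pool_find(name) != NULL;
}
```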

Fixes: 0a998f4723f5 ("LU-14825 lod: pool spilling")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ieafe2f23fe5cc71d9bdce73cbe7360f5cb540edf
Reviewed-on: https://review.whamcloud.com/45070
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years ago LU-14734 ldiskfs: improve message for large_dir 46/45046/3
Andreas Dilger [Fri, 24 Sep 2021 17:40:05 +0000 (11:40 -0600)]
LU-14734 ldiskfs: improve message for large_dir

Make it more clear that the large_dir feature has already been
enabled, rather than making the admin think that they need to
enable the feature themselves.

Test-Parameters: trivial
Fixes: f5967b06aac5 ("LU-14734 osd-ldiskfs: enable large_dir automatically")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ica59d3370148ed277d3541c05be065c4638daf8d
Reviewed-on: https://review.whamcloud.com/45046
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-12567 ptlrpc: handle reply and resend reorder 71/35571/16
Alexander Boyko [Fri, 19 Jul 2019 11:07:42 +0000 (07:07 -0400)]
LU-12567 ptlrpc: handle reply and resend reorder

ptlrpc can't detect a bulk transfer timeout
if the rpc and the bulk are reordered on a router.
We should fail a bulk in situations where the bulk is not
completed (after a bulk timeout LNET_EVENT_UNLINK is set).

HPE-bug-id: LUS-7445, LUS-7569
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Change-Id: Iaf099d31f8fbc68c3edbfcff77ae424862e0adc1
Reviewed-on: https://review.whamcloud.com/35571
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14711 osc: Do not attempt sending empty pages 54/44654/3
Oleg Drokin [Wed, 28 Jul 2021 18:02:19 +0000 (14:02 -0400)]
LU-14711 osc: Do not attempt sending empty pages

Do not attempt to send a lock-prolonging empty read to an old
server that does not support short reads. Otherwise the client
crashes when accessing the NULL page.

Test-Parameters: trivial
Fixes: 564070343ac4 ("LU-14711 osc: Notify server if cache discard takes a long time")
Change-Id: Icae7bf3ef16c45d33894b3c5fbac15b1a98c39d9
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44654
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
2 years ago New tag 2.14.55 2.14.55 v2_14_55
Oleg Drokin [Mon, 4 Oct 2021 16:15:08 +0000 (12:15 -0400)]
New tag 2.14.55

Change-Id: I4dcb816ff9e6b27ebc6c870fce187f8122ba2bde
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 years ago LU-14629 sec: do not block rename of topmost encrypted dir 54/45054/3
Sebastien Buisson [Mon, 27 Sep 2021 11:36:46 +0000 (13:36 +0200)]
LU-14629 sec: do not block rename of topmost encrypted dir

We intentionally forbid file and directory rename from encrypted to
unencrypted directory. But we must not block rename of the topmost
encrypted directory.

Fixes: 1158386ac9 ("LU-14629 sec: forbid file rename from enc to unencrypted dir")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I480a24b2b0327e1d9104f216da54720e4f351636
Reviewed-on: https://review.whamcloud.com/45054
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14989 sec: access to enc file's xattrs 55/44855/4
Sebastien Buisson [Tue, 7 Sep 2021 12:24:14 +0000 (14:24 +0200)]
LU-14989 sec: access to enc file's xattrs

Encryption context is stored in 'security.c' xattr. This is put in the
xattr cache via ll_xattr_cache_insert() to avoid sending a getxattr
request to the server. But this operation declares the xattr cache for
the inode as 'valid', with two consequences. It prevents any further
filling with other xattrs, and trying to read an xattr value will
directly return -ENODATA, without any attempt to fetch the xattr from
the server.
This is solved by adding a new ll_file_flags 'LLIF_XATTR_CACHE_FILLED'
that tells if the xattr cache for the inode has been filled. This bit
is set only by ll_xattr_cache_refill(), and 'valid' now just means the
xattr cache for the inode has been initialized.
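
The split between an "initialized" and a "filled" cache can be pictured
with the minimal sketch below. It is an illustration only: the struct,
the DEMO_* flag bits and the demo_* helpers are hypothetical and are not
taken from llite, and returning -EAGAIN to mean "ask the server" is an
assumption made for the example. Only the two cache states and the
-ENODATA behaviour come from the description above.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bits standing in for 'valid' and LLIF_XATTR_CACHE_FILLED. */
enum {
	DEMO_XATTR_CACHE_VALID  = 1 << 0, /* cache initialized, may be partial */
	DEMO_XATTR_CACHE_FILLED = 1 << 1, /* cache holds all xattrs of the inode */
};

struct demo_inode {
	unsigned int flags;
	bool has_cached_entry; /* is the requested xattr present in the cache? */
};

/* Insert a single xattr (e.g. the encryption context): initializes only. */
static void demo_cache_insert(struct demo_inode *inode)
{
	inode->flags |= DEMO_XATTR_CACHE_VALID;
}

/* Full refill from the server: the only place that marks the cache filled. */
static void demo_cache_refill(struct demo_inode *inode)
{
	inode->flags |= DEMO_XATTR_CACHE_VALID | DEMO_XATTR_CACHE_FILLED;
}

/*
 * Returns 0 on a cache hit, -ENODATA when a filled cache lacks the entry,
 * and -EAGAIN (assumption) when the caller still has to ask the server.
 */
static int demo_cache_lookup(const struct demo_inode *inode)
{
	if (inode->has_cached_entry)
		return 0;
	if (inode->flags & DEMO_XATTR_CACHE_FILLED)
		return -ENODATA;
	return -EAGAIN;
}
```

With this split, inserting the encryption context alone no longer makes
a later lookup of another xattr fail with -ENODATA.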

Fixes: 40d91eafe2 ("LU-12275 sec: atomicity of encryption context getting/setting")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6c2b6870df29f26f048dedeb7212d1c801ca69e1
Reviewed-on: https://review.whamcloud.com/44855
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-10848 test: wait to process inodes in phase2 58/44658/3
Hongchao Zhang [Thu, 19 Aug 2021 09:27:18 +0000 (17:27 +0800)]
LU-10848 test: wait to process inodes in phase2

In test_8 of sanity-lfsck, the "LF_INCONSISTENT" flag is set
when the inodes with corrupted LinkEA are processed in phase2,
but LFSCK may not have had a chance to process them yet because
of the OBD_FAIL_LFSCK_DELAY3 delay.

Test-Parameters: trivial testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck,sanity-lfsck
Change-Id: Id414728c998d527fbc27f877c6d31dcedcc12457
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44658
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14777 lod: fix E2BIG on create 43/44043/3
Sergey Cheremencev [Fri, 18 Dec 2020 13:18:13 +0000 (16:18 +0300)]
LU-14777 lod: fix E2BIG on create

This fix addresses two cases that caused create to fail
with -E2BIG:
1. The stripe count should be calculated depending on the
LOV_PATTERN_OVERSTRIPING flag.
2. On failover, lod_comp_entry_stripe_count may return 0
if all OST targets have been disconnected.
Return EAGAIN in that case so the count is calculated later,
when at least one OST is connected.
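
The second case can be sketched as below. This is only an illustration
under stated assumptions: demo_check_stripe_count() and its max_stripes
parameter are hypothetical names, not the lod code, and the E2BIG branch
simply stands for "layout too large".

```c
#include <errno.h>

/*
 * Hypothetical helper: decide what create should do with a computed
 * stripe count. A count of 0 means no OST is connected yet, which is
 * a transient condition rather than an oversized layout.
 */
static int demo_check_stripe_count(unsigned int stripe_count,
				   unsigned int max_stripes)
{
	if (stripe_count == 0)
		return -EAGAIN;	/* retry once at least one OST is connected */
	if (stripe_count > max_stripes)
		return -E2BIG;	/* layout really does not fit */
	return 0;
}
```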

HPE-bug-id: LUS-9485
Fixes: aa72de32 ("LU-11691 lov: Limit layout size to max ea size")
Change-Id: I26cad4903d5dd6197fe1384013fbba8b2c76487c
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-on: https://review.whamcloud.com/44043
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14543 target: prevent overflowing of tgd->tgd_tot_granted 29/42129/14
Vladimir Saveliev [Fri, 19 Mar 2021 12:08:47 +0000 (15:08 +0300)]
LU-14543 target: prevent overflowing of tgd->tgd_tot_granted

If tgd->tgd_tot_granted < ted->ted_grant then there should not be:
   tgd->tgd_tot_granted -= ted->ted_grant;
which breaks tgd->tgd_tot_granted.
In case of obvious ted->ted_grant damage, recalculate
tgd->tgd_tot_granted using list of exports.

The same change is made for tgd->tgd_tot_dirty.

This patch also adds a sanity check for the
exp->exp_target_data.ted_grant increase in tgt_grant_alloc() to catch
grant counting corruption as soon as it happens. By default, the
detected corruption is CERROR()-ed; if needed, that can be switched to
LBUG() using lctl set_param *.*.lbug_on_grant_miscount.
test-framework.sh:init_param_vars() enables LBUG().
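
A minimal sketch of the guarded subtraction follows. The structures and
the demo_* names are hypothetical simplifications (a plain array instead
of the export list, no locking), so this shows the idea rather than the
actual tgt_grant code.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_export {
	uint64_t grant;		/* stands in for ted->ted_grant */
};

struct demo_target {
	uint64_t tot_granted;	/* stands in for tgd->tgd_tot_granted */
	struct demo_export *exports;
	size_t nr_exports;
};

/* Recompute the total from the per-export values. */
static uint64_t demo_recalc_granted(const struct demo_target *tgt)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < tgt->nr_exports; i++)
		sum += tgt->exports[i].grant;
	return sum;
}

/*
 * Drop one export's grant from the total without letting the unsigned
 * counter wrap. Returns 1 if a miscount was detected, 0 otherwise.
 * exp must point into tgt->exports.
 */
static int demo_release_grant(struct demo_target *tgt, struct demo_export *exp)
{
	int miscount = tgt->tot_granted < exp->grant;

	if (miscount) {
		/* Counters disagree: rebuild the total from the other exports. */
		exp->grant = 0;
		tgt->tot_granted = demo_recalc_granted(tgt);
	} else {
		tgt->tot_granted -= exp->grant;
		exp->grant = 0;
	}
	return miscount;
}
```

The return value is where a caller could log the corruption or, in a
test setup, assert.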

Fixes: af2d3ac30e ("LU-11939 tgt: Do not assert during grant cleanup")
Change-Id: I36ba7496f7b72b4881e98c06ec254a8eefd4c13f
Signed-off-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Cray-bug-id: LUS-9875
Reviewed-on: https://review.whamcloud.com/42129
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15027 sec: initialize ll_inode_info for fake inode 23/45023/2
Sebastien Buisson [Wed, 22 Sep 2021 15:35:49 +0000 (17:35 +0200)]
LU-15027 sec: initialize ll_inode_info for fake inode

When creating an encrypted symlink, we need to make use of a fake
inode in order to be able to encrypt the target name before sending
the create request to the MDS.
This fake inode needs minimal initialization, but it is at least
required to properly initialize the ll_inode_info associated with this
fake inode.

Fixes: e735298935 ("LU-13717 sec: filename encryption - symlink support")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I20c30d873f9bffdbdc8b5f272cb8b80e5be7fbfb
Reviewed-on: https://review.whamcloud.com/45023
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15007 tests: quota enable cmd fix 19/44919/4
Alexander Zarochentsev [Mon, 13 Sep 2021 05:46:58 +0000 (08:46 +0300)]
LU-15007 tests: quota enable cmd fix

the tunable should be osd-*.*.quota_slave.enabled
not osd-*.*.quota_slave.enable.

Fixes: b9c359a70d ("LU-7004 tests: move from lctl conf_param to lctl set_param -P")

Test-Parameters: trivial testlist=ost-pools,sanity-quota
HPE-bug-id: LUS-10413
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I321436520ab48eff4bcc93611a0ada68fa33205e
Reviewed-on: https://review.whamcloud.com/44919
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4] 50/44950/2
Jian Yu [Thu, 16 Sep 2021 00:40:26 +0000 (17:40 -0700)]
LU-15008 kernel: kernel update RHEL8.4 [4.18.0-305.19.1.el8_4]

Update RHEL8.4 kernel to 4.18.0-305.19.1.el8_4.

Test-Parameters: trivial fstype=ldiskfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Test-Parameters: trivial fstype=zfs \
clientdistro=el8.4 serverdistro=el8.4 testlist=sanity

Change-Id: Icedc6cf2a5678cfbce76c47507137c0ea41d0b06
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44950
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14033 ldiskfs: Fix mballoc prefetch patch 94/44894/3
Shaun Tancheff [Sat, 11 Sep 2021 07:59:32 +0000 (02:59 -0500)]
LU-14033 ldiskfs: Fix mballoc prefetch patch

ext4-mballoc-prefetch patch was inadvertently broken during patch
rebasing.

In ext4_read_block_bitmap():
    ext4_read_block_bitmap_nowait() should not ignore locked

Test-Parameters: trivial
HPE-bug-id: LUS-9805
Fixes: fc87b01f96e8 ("LU-12477 ldiskfs: remove obsolete ext4 patches")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I6ebe9dfe48f48706da3623e3d32d33dddf35b395
Reviewed-on: https://review.whamcloud.com/44894
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
2 years agoLU-14996 lov: prefer mirrors on non-rotational OSTs 83/44883/22
Alex Zhuravlev [Thu, 9 Sep 2021 08:16:41 +0000 (11:16 +0300)]
LU-14996 lov: prefer mirrors on non-rotational OSTs

Consider non-rotational OSTs as preferred unless an explicit prefer
flag is set on a mirror.
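
One possible reading of this rule is sketched below. The struct and
demo_pick_mirror() are hypothetical, and the ordering (explicit prefer
flag first, then non-rotational, then the first mirror) is an assumption
about the intent, not a copy of the lov selection code.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_mirror {
	bool prefer;		/* explicit prefer flag set on the mirror */
	bool non_rotational;	/* mirror is placed on non-rotational OSTs */
};

/* Return the index of the mirror to use, or -1 if there are none. */
static int demo_pick_mirror(const struct demo_mirror *m, size_t n)
{
	int nonrot = -1;

	for (size_t i = 0; i < n; i++) {
		if (m[i].prefer)
			return (int)i;	/* an explicit flag always wins */
		if (m[i].non_rotational && nonrot < 0)
			nonrot = (int)i;
	}
	if (nonrot >= 0)
		return nonrot;
	return n > 0 ? 0 : -1;
}
```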

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I787bcba0b5e45842c9d4762c7f97a8f44a4fc9cb
Reviewed-on: https://review.whamcloud.com/44883
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7] 75/44875/2
Jian Yu [Thu, 9 Sep 2021 00:31:14 +0000 (17:31 -0700)]
LU-14994 kernel: kernel update RHEL7.9 [3.10.0-1160.42.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.42.2.el7.

Test-Parameters: trivial clientdistro=el7.9 serverdistro=el7.9

Change-Id: I377ea5d1e28c50b1087dfca7cb32f44afb9bf5f5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44875
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-8837 utils: move lustre_disk_data back to lustre_disk.h 29/44829/5
Jian Yu [Fri, 3 Sep 2021 06:04:52 +0000 (23:04 -0700)]
LU-8837 utils: move lustre_disk_data back to lustre_disk.h

This patch moves struct lustre_disk_data from mount_utils.h
back to lustre_disk.h so that it can be used in other code
without including mount_utils.h.

Fixes: d62efba975d2 ("LU-8837 utils: make tools lightweight for lustre clients")
Change-Id: I589da2710e3cbe7d93a59928143f2b5cac955e6e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44829
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
2 years agoLU-12058 tests: improve sanity test_51d reliability 62/44762/7
Andreas Dilger [Thu, 26 Aug 2021 21:10:58 +0000 (15:10 -0600)]
LU-12058 tests: improve sanity test_51d reliability

The original commit message (b=10671, not in git history) stated:

    When selecting which OSTs to stripe files over, for files with
    a stripe count that divides evenly into the number of OSTs,
    the MDS is always picking the same starting OST for each file.
    Return the OST selection heuristic to the original design.

This test is mainly to catch logic errors in the object allocation
code, not to achieve perfect balance across all OSTs.

Firstly, fix the test to actually verify stripe-0 precession works.
This needs stripe_count=$OSTCOUNT, which was once the test default.

Make the test more robust by disabling QOS to give a more uniform
distribution of files across OSTs, even if they are space imbalanced.

Increase the threshold of error to reduce sensitivity to allocation
imbalances due to fewer preallocated objects available on the MDS.

Test-Parameters: trivial testlist=sanity env=ONLY=51d,ONLY_REPEAT=500
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I21f80ebb6f51e72bf4a5b19abe497ee9797a616a
Reviewed-on: https://review.whamcloud.com/44762
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vikentsi Lapa <vlapa@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14889 lproc: Add server checksum_type 55/44755/6
Arshad Hussain [Thu, 26 Aug 2021 12:07:50 +0000 (08:07 -0400)]
LU-14889 lproc: Add server checksum_type

This patch adds 'checksum_type' lproc entries under server:
1. obdfilter.$FSNAME-OST${count}.checksum_type
2. mdt.$FSNAME-MDT${count}.checksum_type

Test-case: sanity/77o added

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I12a26f5c8c8f2f93d57f377626b1753fc13ffbb3
Reviewed-on: https://review.whamcloud.com/44755
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14945 lnet: don't use hops to determine the route state 74/44674/6
Serguei Smirnov [Mon, 16 Aug 2021 23:37:30 +0000 (16:37 -0700)]
LU-14945 lnet: don't use hops to determine the route state

NodeA <-tcp1-> GW1 <-tcp2-> GW2 <-tcp3-> NodeB

Assuming GW1 knows how to reach tcp3 network and GW2 knows
how to reach tcp1 network, it should be possible to add routes
without specifying hop=2 on nodes A and B to reach tcp3 and tcp1
respectively and then be able to lnetctl ping between them.
Changes introduced by LU-13785 interpret default hops to be
equivalent to hop=1 set explicitly for the purpose of determining
route aliveness, which results in the routes created as described
above to be considered "down".

Fix it so that default hop setting doesn't prevent
the multi-hop scenario from working.

Test-Parameters: trivial
Fixes: 2e07619477 ("LU-13785 lnet: Use lr_hops for avoid_asym_router_failure")
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: I341ccdfe156434b0cb306359acc91a9193b44f7b
Reviewed-on: https://review.whamcloud.com/44674
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14895 osd-ldiskfs: combine checksum functions 56/44656/2
Andreas Dilger [Wed, 4 Aug 2021 09:42:37 +0000 (03:42 -0600)]
LU-14895 osd-ldiskfs: combine checksum functions

Reduce code duplication for nearly-identical checksum calculations.
The osd_dif_type1_generate() and osd_dif_type3_generate() were nearly
the same, as were osd_dif_type1_verify() and osd_dif_type3_verify().
Combine these functions to share the code, and handle the difference
between T10-PI type 1 and type 3 with an argument.
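
The shape of such a merge can be sketched as follows. This is not the
osd-ldiskfs code: the demo_* names and the tuple layout are hypothetical,
the tuple is kept in host byte order for readability, and the guard is
computed with a plain bitwise CRC-16 (polynomial 0x8BB7). The sketch
assumes the usual T10-PI convention that type 1 reference tags follow
the sector number while type 3 leaves the reference tag unused.

```c
#include <stddef.h>
#include <stdint.h>

enum demo_dif_type { DEMO_DIF_TYPE1 = 1, DEMO_DIF_TYPE3 = 3 };

/* Simplified protection tuple (host byte order, illustration only). */
struct demo_pi_tuple {
	uint16_t guard;
	uint16_t app_tag;
	uint32_t ref_tag;
};

/* Bitwise CRC-16 with the T10-DIF polynomial 0x8BB7, initial value 0. */
static uint16_t demo_crc_t10dif(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;

	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)(buf[i] << 8);
		for (int bit = 0; bit < 8; bit++)
			crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8BB7)
					     : (uint16_t)(crc << 1);
	}
	return crc;
}

/* One generate routine for both types; 'type' selects the ref tag rule. */
static void demo_dif_generate(struct demo_pi_tuple *pi, const uint8_t *data,
			      size_t sector_size, size_t nr_sectors,
			      uint64_t start_sector, enum demo_dif_type type)
{
	for (size_t i = 0; i < nr_sectors; i++) {
		pi[i].guard = demo_crc_t10dif(data + i * sector_size,
					      sector_size);
		pi[i].app_tag = 0;
		/* The only type-specific step in this sketch. */
		pi[i].ref_tag = (type == DEMO_DIF_TYPE1) ?
				(uint32_t)(start_sector + i) : 0;
	}
}
```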

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40afb15fd80577ef6de918c90e4111e775ce7057
Reviewed-on: https://review.whamcloud.com/44656
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14895 brw: log T10 GRD tags during checksum calcs 55/44655/4
Andreas Dilger [Wed, 4 Aug 2021 08:08:12 +0000 (02:08 -0600)]
LU-14895 brw: log T10 GRD tags during checksum calcs

Log the T10 guard tags during checksum calculation on the client and
target to help identify where checksum errors are being introduced.
The added debugging is only active on RPC resend, so will not add
overhead during the normal IO path.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia4f14f2f2296da096acf629c74558386e7ce7057
Reviewed-on: https://review.whamcloud.com/44655
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14816 tests: mark sanity test_230d SLOW 16/44616/2
Andreas Dilger [Wed, 11 Aug 2021 20:59:06 +0000 (14:59 -0600)]
LU-14816 tests: mark sanity test_230d SLOW

Running sanity test_230d takes an average 15 minutes to finish,
and up to an hour in some cases, but has almost never failed.
Move it over to run only with SLOW=yes.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I02ec35c1533a6a97b5400d4419664b43ab49c502
Reviewed-on: https://review.whamcloud.com/44616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14677 sec: do not expose security.c to listxattr/getxattr 01/44101/14
Sebastien Buisson [Mon, 28 Jun 2021 18:32:16 +0000 (20:32 +0200)]
LU-14677 sec: do not expose security.c to listxattr/getxattr

security.c xattr, which contains encryption context, should not be
exposed by the xattr-related system calls such as listxattr() and
getxattr() because of its special semantics.
Update sanity-sec test_57 to test this.
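
As an illustration of hiding one name from a listxattr()-style result,
the sketch below filters a NUL-separated name list in place. The helper
name is hypothetical and the real llite change may well be structured
differently; the sketch assumes every name in the buffer is
NUL-terminated.

```c
#include <stddef.h>
#include <string.h>

/*
 * Remove every occurrence of 'hidden' from a NUL-separated list of
 * 'len' bytes and return the new length of the list.
 */
static size_t demo_filter_xattr_list(char *list, size_t len,
				     const char *hidden)
{
	size_t in = 0, out = 0;

	while (in < len) {
		size_t n = strlen(list + in) + 1; /* name plus its NUL */

		if (strcmp(list + in, hidden) != 0) {
			memmove(list + out, list + in, n);
			out += n;
		}
		in += n;
	}
	return out;
}
```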

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I919f5cbafc53f5745fbfb5b9d2d7316e892d8c9f
Reviewed-on: https://review.whamcloud.com/44101
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14677 llite: move env contexts to ll_inode_info level 98/44198/11
Sebastien Buisson [Fri, 9 Jul 2021 13:41:34 +0000 (15:41 +0200)]
LU-14677 llite: move env contexts to ll_inode_info level

Contrary to file, inode is always available, so move the list of
env contexts from the file data to the ll_inode_info level.
This is needed because we will have to handle env properties in
ll_get_context() and ll_xattr_list()/ll_listxattr().
This also requires changing lli_lock from a spinlock to an rwlock.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I478d2a8eabfcb09074ba52601f05840d047a6da2
Reviewed-on: https://review.whamcloud.com/44198
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14790 tests: Check NI status when link is downed 73/44073/3
Chris Horn [Thu, 24 Jun 2021 18:37:42 +0000 (13:37 -0500)]
LU-14790 tests: Check NI status when link is downed

Add test to check that NI status is set to down when the
ni_fatal_error_on flag is set (i.e. when a link is down).

HPE-bug-id: LUS-10167
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: If98a899b0ee8dd9637c08774109668ad06244c60
Reviewed-on: https://review.whamcloud.com/44073
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Cyril Bordage <cbordage@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14825 lod: pool spilling 89/43989/22
Alex Zhuravlev [Wed, 7 Jul 2021 08:15:27 +0000 (11:15 +0300)]
LU-14825 lod: pool spilling

To avoid the problem of the fast pool becoming full, this patch
introduces so-called pool spilling: for every OST pool a target
pool can be assigned, which will be used instead of the original
one if the original one's usage is over a specified threshold:

  lctl set_param lod.*.pool.pool1.spill_target=pool2
  lctl set_param lod.*.pool.pool1.spill_threshold_pct=80

i.e. once pool1 is 80+% used, then new files will be created on
pool2.

A chain (up to 10 at the moment) can be configured using the
settings like above when different OST pools are considered
one by one.
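
The chain walk can be sketched as below. The struct, the index-based
spill_target and demo_resolve_pool() are hypothetical; only the
threshold comparison and the limit of 10 steps come from the description
above, and treating a threshold of 0 as "spilling disabled" is an
assumption.

```c
#include <stddef.h>

#define DEMO_MAX_SPILL_HOPS 10	/* chain length limit from the description */

struct demo_pool {
	unsigned int used_pct;			/* current usage in percent */
	unsigned int spill_threshold_pct;	/* 0: spilling disabled (assumption) */
	int spill_target;			/* index of target pool, -1 if none */
};

/* Follow the spill chain from 'start' and return the pool to allocate from. */
static int demo_resolve_pool(const struct demo_pool *pools, size_t nr,
			     int start)
{
	int cur = start;

	for (int hop = 0; hop < DEMO_MAX_SPILL_HOPS; hop++) {
		const struct demo_pool *p = &pools[cur];

		if (p->spill_target < 0 || (size_t)p->spill_target >= nr)
			break;
		if (p->spill_threshold_pct == 0 ||
		    p->used_pct < p->spill_threshold_pct)
			break;
		cur = p->spill_target;	/* pool is too full, try its target */
	}
	return cur;
}
```

The hop limit also keeps a misconfigured loop (pool1 spilling to pool2
and back) from spinning forever.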

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7f6dd4931ba64f3db8a7ae6a3b185f942a629ed7
Reviewed-on: https://review.whamcloud.com/43989
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-8962 lfs: Handle non-lustre and multiple args 26/42126/14
Arshad Hussain [Mon, 22 Mar 2021 11:43:15 +0000 (17:13 +0530)]
LU-8962 lfs: Handle non-lustre and multiple args

This patch addresses:

01: Handle multiple filesystems provided to 'lfs df'
02: Correctly report 'EOPNOTSUPP' for filesystems which
    are non-Lustre.
03: Make changes to test-framework.sh to handle the modified
    return value from 'lfs df'. For compatibility reasons,
    this change ignores EOPNOTSUPP and masquerades it as success.

The final return value is 0 if _all_ arguments succeed, or the
value of the first failure if even a single failure is seen
during argument processing.
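
The return value rule amounts to "first failure wins". A minimal sketch,
with a hypothetical helper name and per-argument results passed in as an
array:

```c
#include <stddef.h>

/* Return 0 if every result is 0, otherwise the first non-zero result. */
static int demo_first_failure(const int *rcs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (rcs[i] != 0)
			return rcs[i];
	return 0;
}
```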

sanity/56e Test-case added.

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: I73287d21792d89b8cde672acdaf9c9caf829522f
Reviewed-on: https://review.whamcloud.com/42126
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13550 osd-zfs: snapshot with incompatible clients 93/38593/2
Shaun Tancheff [Wed, 13 May 2020 20:16:41 +0000 (15:16 -0500)]
LU-13550 osd-zfs: snapshot with incompatible clients

snapshot_create fails when clients are connected that do not support
barrier requests.

Log some information to help the administrator track down the
connections blocking snapshot_create from succeeding.

Test-Parameters: fstype=zfs
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia59ea3c4c1a885e2591464cd4f8f77a1071b4786
Reviewed-on: https://review.whamcloud.com/38593
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-9699 osp: don't assert on OSP duplicating 53/27753/21
Jadhav Vikram [Tue, 25 Jul 2017 07:01:37 +0000 (12:31 +0530)]
LU-9699 osp: don't assert on OSP duplicating

Writeconf on an MDT with index > 0000 will cause
"add mdc" to be added to $FSNAME-client config
and "add osp" to be added to $FSNAME-MDTXXXX configs.

However, the configs may already contain these
directives. Duplicating the OSP device will
cause the assertion failure in osp_obd_connect():
ASSERTION( osp->opd_connects == 1 ) failed

Duplicating the MDC just returns -EEXIST in a similar
situation.

A possible solution is to check configs for duplicates
before writing to them. However, sometimes we
would like to change nids which are part of
"add mdc" and "add osp".

Another solution is to mark previous entries with
SKIP flags. This patch implements this approach.
Since after revoking the config lock, the clients
and the MDTs will receive the updated log and
apply its newer entries, we still have to handle
OSP duplication, but this is only an issue
immediately after writeconf processing.

Seagate-bug-id: MRP-2634, MRP-3865
Change-Id: Idd7ad43c78d50e6bbe715850503aa0b01fcbf071
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/27753
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-9510 ldiskfs: to not verify preallocation in umount path 30/27130/12
Jadhav Vikram [Wed, 3 Feb 2021 15:24:31 +0000 (23:24 +0800)]
LU-9510 ldiskfs: to not verify preallocation in umount path

At umount time, while discarding inode preallocation space, a panic
occurred due to a mismatch between the preallocation space free blocks,
i.e. pa->pa_free, and the free blocks calculated by reading the on-disk
block bitmap within the preallocation space length. A similar crash will
occur when the user sets errors=panic in the mount options and there is
a mismatch in pa space free blocks.

Changes added to not verify the mismatch between on-disk and in-memory
preallocated space unused blocks if the file system is being
umounted.

Seagate-bug-id: MRP-3741
Signed-off-by: Jadhav Vikram <vikramjadhav87@yahoo.co.in>
Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Reviewed-by: Alexey Leonidovich Lyashkov <alexey.lyashkov@seagate.com>
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Change-Id: I6d43905d49a219d1a5b966ab405e974a1f29b2f3
Reviewed-on: https://review.whamcloud.com/27130
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
2 years agoLU-14781 osp: osp object header could be NULL 55/44055/4
Bobi Jam [Fri, 3 Sep 2021 04:03:18 +0000 (12:03 +0800)]
LU-14781 osp: osp object header could be NULL

Don't call lu_object_header_fini upon NULL header in
osp_object_free().

Call trace:
lu_object_free.isra.30+0xf2/0x170 [obdclass]
lu_object_find_at+0x496/0x930 [obdclass]
lod_initialize_objects+0x3e4/0xba0 [lod]
lod_parse_striping+0x693/0xc20 [lod]
lod_striping_load+0x2b2/0x660 [lod]
lod_declare_destroy+0x12b/0x600 [lod]
mdd_declare_finish_unlink+0x91/0x210 [mdd]
mdd_unlink+0x48f/0xab0 [mdd]
mdt_reint_unlink+0xc32/0x1550 [mdt]
mdt_reint_rec+0x83/0x210 [mdt]
mdt_reint_internal+0x6e1/0xb00 [mdt]
mdt_reint+0x67/0x140 [mdt]
tgt_request_handle+0xaee/0x15f0 [ptlrpc]
ptlrpc_server_handle_request+0x24b/0xab0 [ptlrpc]
ptlrpc_main+0xb34/0x1470 [ptlrpc]
kthread+0xd1/0xe0

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Iec23cf06dffaa64c6f5853c28382ba930ee1076b
Reviewed-on: https://review.whamcloud.com/44055
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-13397 llite: support fallocate() on selected mirror 21/44721/3
Mikhail Pershin [Sun, 22 Aug 2021 19:41:33 +0000 (22:41 +0300)]
LU-13397 llite: support fallocate() on selected mirror

- add ability to do fallocate() on designated mirror in
  FLR file
- add missing FALLOC_FL_KEEP_SIZE flag to fallocate() call
  in llapi_hole_punch(). Without that flag it was silently
  not working
- add corresponding test_50d in sanity-flr.sh

Fixes: 4126fbb30c ("LU-13397 lfs: mirror resync to keep sparseness")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I8d700fce904c84458a50650f1d3cb09d23989eba
Reviewed-on: https://review.whamcloud.com/44721
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-5369 mdt: check lock handle instead assert 05/44905/4
Yang Sheng [Mon, 13 Sep 2021 21:04:00 +0000 (05:04 +0800)]
LU-5369 mdt: check lock handle instead assert

The lock handle could be NULL in some corner cases.
We should check it instead of LBUG.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I1afa7f8c129c104b012ae23141318365c388c503
Reviewed-on: https://review.whamcloud.com/44905
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-13717 sec: filename encryption - symlink support 94/43394/19
Sebastien Buisson [Tue, 31 Aug 2021 15:30:48 +0000 (17:30 +0200)]
LU-13717 sec: filename encryption - symlink support

On client side, call the appropriate llcrypt primitives from llite,
to proceed with symlink encryption before sending requests to servers
and symlink decryption upon request receipt.
The tricky part is that llcrypt needs an inode to encrypt the target
name. But by the time we prepare the symlink creation request to be
sent to the server with the target name (in ll_new_node), we do not
have an inode yet (it will be obtained only after we get the server
reply). So we create a fake inode and associate the right encryption
context to it, so that the symlink gets encrypted properly.

In order to report the correct size for an encrypted symlink (which
ought to be the length of the symlink target), we need to read the
symlink target and decrypt or decode it in ->getattr(). This has a
performance hit, but given that the symlink target is cached in
->i_link (when the key is available), the symlink will not have to be
read and decrypted again later when it is actually followed,
readlink() is called, or lstat() is called again.
This part of the patch is adapted from kernel commit
d18760560593e5af921f51a8c9b64b6109d634c2
"fscrypt: add fscrypt_symlink_getattr() for computing st_size"

With encrypted file names, a symlink target is binary. So make sure
server side can handle that, by switching sp_symname to a
struct lu_name in struct md_op_spec.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ic6892fca8926a35001697c54aaf05d15563b139d
Reviewed-on: https://review.whamcloud.com/43394
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-15022 revert: "LU-14997 tests: Register stack_trap for sanity/104c" 08/45008/3
Andreas Dilger [Tue, 21 Sep 2021 22:58:30 +0000 (22:58 +0000)]
LU-15022 revert: "LU-14997 tests: Register stack_trap for sanity/104c"

This reverts commit 59b32113313c3566e5f3797bca404a5b19d5e305
since it caused constant test failures for ZFS backends.

Change-Id: I195cc483166294dbf97be50a9b747c8a2b534799
Test-Parameters: trivial fstype=zfs testlist=sanity env=ONLY=104,ONLY_REPEAT=20
Reviewed-on: https://review.whamcloud.com/45008
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoEX-3687 osp: do force disconnect if import is not ready 53/44753/4
Mikhail Pershin [Wed, 25 Aug 2021 17:03:47 +0000 (20:03 +0300)]
EX-3687 osp: do force disconnect if import is not ready

Send OSP_DISCONNECT only on a healthy import. Otherwise,
force local disconnect for unhealthy imports.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Icd9f171271f4e17a65503fcc710ad3aaa2b84e1e
Reviewed-on: https://review.whamcloud.com/44753
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14997 tests: Register "stack_trap" for sanity/104c 82/44882/3
Arshad Hussain [Thu, 9 Sep 2021 09:18:42 +0000 (05:18 -0400)]
LU-14997 tests: Register "stack_trap" for sanity/104c

This patch is a minor improvement for calling cleanup
through 'stack_trap' versus doing it right at the end of
the script.

Fixes: 8ee6e1c8825c ("LU-14565 ofd: Do not rely on tgd_blockbit")
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Change-Id: Iae2ca81091e0119f2117f4cd57b5cc2f6ac38c6c
Reviewed-on: https://review.whamcloud.com/44882
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years agoLU-14990 tests: Detect correct LNet interface for sanity-lnet 57/44857/2
Chris Horn [Tue, 7 Sep 2021 15:24:14 +0000 (10:24 -0500)]
LU-14990 tests: Detect correct LNet interface for sanity-lnet

Determine the names of the interfaces used for LNet by parsing the
NIDs configured after calling load_modules(). Tests which reference
eth0 are modified to use the interface associated with the primary
NID (i.e. first NID output by lctl list_nids).

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-10385
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Id715aa3e5470d9c110f6248620b1a83920875e7b
Reviewed-on: https://review.whamcloud.com/44857
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14782 kernel: new kernel [SLES15 SP3 5.3.18-59.19.1] 62/44062/5
Jian Yu [Mon, 6 Sep 2021 02:19:07 +0000 (19:19 -0700)]
LU-14782 kernel: new kernel [SLES15 SP3 5.3.18-59.19.1]

This patch makes changes to support new SLES15 SP3 release
with kernel 5.3.18-59.19.1 for Lustre client.

Test-Parameters: trivial

Change-Id: Idf6fad9773dd242c02859a5c7b14401675c4ecf4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44062
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14991 tests: Correct whitespace in sanity-lnet test_101/102 56/44856/2
Chris Horn [Tue, 7 Sep 2021 15:47:06 +0000 (10:47 -0500)]
LU-14991 tests: Correct whitespace in sanity-lnet test_101/102

sanity-lnet.sh test_100 and test_101 use tab characters in the
expected yaml output, but yaml syntax does not allow tab characters.

Test-Parameters: trivial testlist=sanity-lnet
Fixes: a5cbe7883d ("LU-12815 socklnd: allow dynamic setting of conns_per_peer")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I0814f1965414f82cdc696cfe9996b33e863df982
Reviewed-on: https://review.whamcloud.com/44856
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1] 48/44848/2
Jian Yu [Mon, 6 Sep 2021 01:47:38 +0000 (18:47 -0700)]
LU-14934 kernel: kernel update SLES12 SP5 [4.12.14-122.83.1]

Update SLES12 SP5 kernel to 4.12.14-122.83.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 430c 817" testlist=sanity

Change-Id: I2b35d129550b895324bb3e2e61910ad10e846f03
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44848
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14965 ldiskfs: hold inode mutex for ldiskfs_orphan_add() 54/44754/3
Bobi Jam [Thu, 26 Aug 2021 10:19:11 +0000 (18:19 +0800)]
LU-14965 ldiskfs: hold inode mutex for ldiskfs_orphan_add()

See following warning:

ldiskfs/namei.c:3331 ldiskfs_orphan_add+0x11e/0x290 [ldiskfs]
Call Trace:
dump_stack+0x19/0x1b
__warn+0xd8/0x100
warn_slowpath_null+0x1d/0x20
ldiskfs_orphan_add+0x11e/0x290 [ldiskfs]
ldiskfs_xattr_inode_orphan_add+0xbb/0x110 [ldiskfs]
ldiskfs_xattr_delete_inode+0x5c/0x350 [ldiskfs]
ldiskfs_evict_inode+0x1a8/0x630 [ldiskfs]
evict+0xb4/0x180
iput+0xfc/0x190
osd_object_delete+0x1f8/0x370 [osd_ldiskfs]
lu_object_free.isra.27+0xb8/0x1c0 [obdclass]
lu_object_put+0xa5/0x460 [obdclass]
mdt_object_put+0x30/0x110 [mdt]
mdt_reint_unlink+0x8e0/0x1890 [mdt]
mdt_reint_rec+0x83/0x210 [mdt]
mdt_reint_internal+0x720/0xaf0 [mdt]
mdt_reint+0x67/0x140 [mdt]
tgt_request_handle+0x7ea/0x1750 [ptlrpc]
ptlrpc_server_handle_request+0x256/0xb10 [ptlrpc]
ptlrpc_main+0xb3c/0x14e0 [ptlrpc]
kthread+0xd1/0xe0
ret_from_fork_nospec_begin+0x21/0x21

Need to hold inode mutex on the external EA for ldiskfs_orphan_add()
to soothe the warning.

Fixes: f64e9f19f68e ("LU-12977 ldiskfs: properly take inode_lock() for truncates")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I3a1abfde3289c0bbd46e0d5a5b9d2ff7d7cf9273
Reviewed-on: https://review.whamcloud.com/44754
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
2 years agoLU-14323 tests: skip sanity-flr/pfl tests for older servers 94/44494/5
James Nunez [Wed, 4 Aug 2021 14:47:50 +0000 (08:47 -0600)]
LU-14323 tests: skip sanity-flr/pfl tests for older servers

sanity-flr test 46 subtests 7, 8, 9, and 10 and sanity-pfl
test 16c were added in lustre-master version 2.13.53.205.
In version interop testing against older servers, these
sanity-flr and sanity-pfl tests fail. Skip sanity-flr test 46
subtests 7, 8, 9, and 10 and sanity-pfl test 16c when the
servers run a version older than 2.13.53.205 and the clients
run a later version.
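
The version check described above can be sketched as follows. This is a minimal Python model, not the shell code used by the test framework; `parse_version` and `should_skip` are hypothetical names chosen for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '2.13.53.205' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# First lustre-master version that contains the new subtests
# (value taken from the commit message).
MIN_SERVER_VERSION = parse_version("2.13.53.205")


def should_skip(server_version):
    """Return True when the server predates the version that added the tests."""
    # Tuples compare element by element, so 2.12.7 sorts before 2.13.53.205.
    return parse_version(server_version) < MIN_SERVER_VERSION
```

Comparing integer tuples rather than strings avoids the usual pitfall where "2.9" would sort after "2.13" lexically.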

Fixes: ee916af10de2 ("LU-13366 utils: SEL yaml and copy file support")
Test-Parameters: trivial
Test-Parameters: env=ONLY=46 testlist=sanity-flr
Test-Parameters: env=ONLY=16 testlist=sanity-pfl
Test-Parameters: serverversion=2.12.7 serverdistro=el7.9 env=ONLY=46 testlist=sanity-flr
Test-Parameters: serverversion=2.12.7 serverdistro=el7.9 env=ONLY=16 testlist=sanity-pfl
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I09b88351a10891f63dceb9a2a74c92e4fffc13c5
Reviewed-on: https://review.whamcloud.com/44494
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
2 years agoLU-14709 pcc: VM_WRITE should not trigger layout write 83/44483/7
Qian Yingjin [Sat, 31 Jul 2021 07:45:56 +0000 (15:45 +0800)]
LU-14709 pcc: VM_WRITE should not trigger layout write

A VM area marked with VM_WRITE means that pages may be written, but
an mmap page write may never happen.
The layout write should be delayed until the actual modification of
the file happens in ->page_mkwrite().
Otherwise, it will trigger a panic in PCC-RO sanity-pcc test_21f().

Fixes: f2d1c4ee4 ("LU-14647 flr: mmap write/punch does not stale other mirrors")
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1cbfef8a4ed7e2c718324fd8a21bafd6157b5f0c
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44483
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14896 utils: migrate file with only '--pool' option 65/44465/4
Etienne AUJAMES [Mon, 2 Aug 2021 10:26:58 +0000 (12:26 +0200)]
LU-14896 utils: migrate file with only '--pool' option

"lfs migrate -p pool_name test_file" initiates a migration but without
changing the layout pools (migrate from layout copy).

This patch implements the same behavior as:
"lfs setstripe -p pool_name test_file"
It sets the pool name and uses the default parameters for the plain
layout.

Add sanity test 56xg to check file migrations with pool.

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I1645eaca028974337218411d6a033f3acf9b9d6a
Reviewed-on: https://review.whamcloud.com/44465
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13055 changelog: use default mask if server has no mask 04/44404/3
Mikhail Pershin [Tue, 27 Jul 2021 10:37:01 +0000 (13:37 +0300)]
LU-13055 changelog: use default mask if server has no mask

When registering a new maskless user and the server has no specific
mask set, the effective mask is set to the DEFAULT value.

Fixes: a15eb4f132 ("LU-13055 mdd: per-user changelog names and mask")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If799cb5cc29c60cce6ef6c987f2e493145e00e31
Reviewed-on: https://review.whamcloud.com/44404
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13903 utils: separate out server code for wiretest 73/43873/9
James Simmons [Sat, 21 Aug 2021 17:54:42 +0000 (13:54 -0400)]
LU-13903 utils: separate out server code for wiretest

The wiretest, in both its kernel and userland utility forms, is used
by both client and server to validate data being sent over the network.
Make the userland wiretest buildable on the native Linux client,
which lacks server-specific data structures. Use the UAPI
values to harden testing of userland data passed to the
kernel.

Change-Id: I30efc8bf42ac461bab5a4371e940a027a23d12c9
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/43873
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13903 uapi: fixup UAPI headers for native Linux client. 64/44664/5
James Simmons [Sat, 4 Sep 2021 12:33:53 +0000 (08:33 -0400)]
LU-13903 uapi: fixup UAPI headers for native Linux client.

This covers all the UAPI problems outside of the user land
wiretest utility. One set of problems is build related and the second
is that UAPI header definitions are either user land only or never
used to validate data going to or from user land.

1) Use UAPI header definitions to validate data sent to or from
   kernel space. We check lum_hash_type using LMV_HASH_TYPE_MASK.
   This avoids a round trip to the server which will report back
   an error. The other case is we check the values returned for
   LL_IOC_HSM_ACTION. We keep the original behavior of passing
   unknown data to the user land application but add debug
   logging if the data looks corrupt to help track down bug
   issues.

2) We can use QIF_DQBLKSIZE* instead of Lustre specific values
   for our quota handling. QIF_DQBLKSIZE* is a Linux UAPI quota
   value.

3) The NOTIFY_GRACE_* macros are used only by user land. Move
   to lustreapi.h

4) A few of the UAPI definitions are used by utility code
   present on the client and the Lustre kernel server code; which
   are not sent over the wire. Handle these special cases. This
   covers the missing LCM_USER_MIRROR_FLAGS, LCME_TEMPLATE_FLAGS,
   and LQUOTA_* values. Once server code merges upstream we can
   clean this up.

5) lcfg_cmd2data() is server specific so in case of a client build
   we can have get_llog_event_name() just always return NULL.

6) Don't package OpenSFS UAPI headers when building for native
   Linux client.

Change-Id: I258ee917b005e438eb7c15fa6e0c4b72e9ea9d56
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44664
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13717 sec: filename encryption - digest support 92/43392/16
Sebastien Buisson [Fri, 22 Jan 2021 12:06:50 +0000 (21:06 +0900)]
LU-13717 sec: filename encryption - digest support

A number of operations are allowed on encrypted files without the key:
- read file metadata (stat);
- list directories;
- remove files and directories.
In order to present valid names to users, cipher text names are base64
encoded if they are short. Otherwise we compute a digested form of the
cipher text, made of the FID (16 bytes) followed by the second-to-last
cipher block (16 bytes), and we base64 encode this digested form for
presentation to user.
These transformations are carried out in the specific overlay
functions, that now need to know the fid of the file.
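
The presentation rule above can be sketched as follows. This is a minimal Python model under stated assumptions, not the Lustre overlay functions: `present_name` is a hypothetical name, `SHORT_NAME_MAX` is a placeholder threshold (the commit message does not say what counts as "short"), the standard URL-safe base64 alphabet stands in for whatever encoding the client really uses, and long cipher texts are assumed to be a whole number of 16-byte blocks.

```python
import base64

FID_SIZE = 16           # bytes, as stated in the commit message
CIPHER_BLOCK_SIZE = 16  # bytes, as stated in the commit message
SHORT_NAME_MAX = 32     # ASSUMPTION: placeholder limit for a "short" name


def present_name(fid, ciphertext):
    """Return a printable form of an encrypted name for use without the key."""
    if len(fid) != FID_SIZE:
        raise ValueError("FID must be 16 bytes")
    if len(ciphertext) <= SHORT_NAME_MAX:
        # Short names: encode the whole cipher text.
        return base64.urlsafe_b64encode(ciphertext).decode("ascii")
    if len(ciphertext) % CIPHER_BLOCK_SIZE != 0:
        # ASSUMPTION: this sketch only handles block-aligned cipher text.
        raise ValueError("cipher text is not block aligned")
    # Long names: digest = FID followed by the second-to-last cipher block.
    second_to_last = ciphertext[-2 * CIPHER_BLOCK_SIZE:-CIPHER_BLOCK_SIZE]
    return base64.urlsafe_b64encode(fid + second_to_last).decode("ascii")
```

Because the digested form drops most of the cipher text, it cannot be mapped back to the full name, which is why the server has to operate by FID as the next paragraph explains.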

As the digested form does not contain the whole cipher text name,
server side needs to proceed to an operation by FID for requests such
as lookup and getattr. It also relies on the content of the LinkEA to
verify the digested form as received from client side.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I45d10a426373c2cfe0b92a58c351da452d085d7d
Reviewed-on: https://review.whamcloud.com/43392
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
2 years agoLU-13086 tests: restore compatibility with mpich 89/38689/8
Elena Gryaznova [Thu, 21 May 2020 10:13:41 +0000 (13:13 +0300)]
LU-13086 tests: restore compatibility with mpich

The addition of the --oversubscribe MPI option to mpi_run() is
OpenMPI specific.  Patch moves --oversubscribe to MPIRUN_OPTIONS
in local.sh to restore the compatibility with MPICH.

Test-Parameters: trivial clientdistro=el8.3 serverdistro=el7.7 testlist=parallel-scale,large-scale,performance-sanity
Test-Parameters: clientdistro=el8.4 serverdistro=el7.7 testlist=parallel-scale,large-scale,performance-sanity
Fixes: 3c7aca7472 ("LU-12395 build: build mpitests for el8")
Cray-bug-id: LUS-8006
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Change-Id: I0a6fab072212781d12877d2503ae8600cfdc8c7a
Reviewed-on: https://review.whamcloud.com/38689
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@hpe.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years agoLU-13997 tests: sanity/418 to cancel all client locks 03/44803/4
Alex Zhuravlev [Wed, 1 Sep 2021 08:54:04 +0000 (11:54 +0300)]
LU-13997 tests: sanity/418 to cancel all client locks

verify idea about dirty client's data

Test-Parameters: trivial
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Test-Parameters: testlist=sanity env=ONLY=0-418 fstype=ldiskfs
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifef58a98b26c7790274d2a57aa52e4475e923dd0
Reviewed-on: https://review.whamcloud.com/44803
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 years agoLU-14959 ldlm: Check return value of ldlm_resource_get() 38/44738/4
Oleg Drokin [Tue, 24 Aug 2021 03:44:45 +0000 (23:44 -0400)]
LU-14959 ldlm: Check return value of ldlm_resource_get()

Fix the comment to properly indicate it returns ERR_PTR on
error and fix osc_req_attr_set() and mdc_get_lock_handle()
to actually check the return value before passing it on and
causing an unintended crash.

Change-Id: Ib85a62140a39744e85989c9a9c8aa2ed771d70d1
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44738
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
2 years agoLU-14951 llite: protect fd_{lease_}och 00/44700/2
Bobi Jam [Wed, 18 Aug 2021 13:24:50 +0000 (21:24 +0800)]
LU-14951 llite: protect fd_{lease_}och

Access to ll_file_data::fd_och and fd_lease_och needs
lli_och_mutex protection.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie9136aa345c6bf015aa73067acdaecf1a765b9f6
Reviewed-on: https://review.whamcloud.com/44700
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13195 osp: track destroyed OSP object 85/38385/11
Alex Zhuravlev [Mon, 27 Apr 2020 04:52:01 +0000 (07:52 +0300)]
LU-13195 osp: track destroyed OSP object

Retain destroyed OSP objects in memory to prevent races when
an in-flight destroy is passed by a read or attr_get, leading to
incorrect local states.
Also block operations on such an object with -ENOENT.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ied59f1a95458e8890249b92d4efc38e258a7e3cf
Reviewed-on: https://review.whamcloud.com/38385
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14729 osd-ldiskfs: declare should consider concurrency 16/44316/13
Wang Shilong [Thu, 15 Jul 2021 08:15:37 +0000 (16:15 +0800)]
LU-14729 osd-ldiskfs: declare should consider concurrency

Write in the Lustre OSD differs from Ext4: writes are
serialized in the local filesystem, but on the OSD side
many concurrent threads may grow the tree before the transaction starts.

Also fix to use @dirty_groups rather than @extents, remove
unnecessary @depth assignment.

Fixes: 9810341a8 ("LU-14729 osd-ldiskfs: fix to declare write commits")
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I1e0fc9069a579736a74b0ba2607056fe980574c3
Reviewed-on: https://review.whamcloud.com/44316
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14724 nrs: TBF rule list broken when change rule rank 25/43925/6
Qian Yingjin [Fri, 28 May 2021 03:56:12 +0000 (11:56 +0800)]
LU-14724 nrs: TBF rule list broken when change rule rank

When changing the rank of two adjacent rules in the TBF rule list in
@nrs_tbf_rule_change_rank():
list_move(&rule->tr_linkage, next_rule->tr_linkage.prev);

The previous pointer of @next_rule is @rule, so using list_move
directly will break the rule list.
This patch uses list_del + list_add to replace list_move to
avoid breaking the TBF rule list.
Also add a test case, sanityn test_77o, for this bug.
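
The failure mode can be reproduced with a small model of a circular doubly linked list. This is a Python sketch that mirrors the pointer updates of the kernel list helpers as I understand them; the class and function names are illustrative, and the "safe" variant shown in the usage note is one way to keep the list consistent, not necessarily the exact change made by the patch.

```python
class ListHead:
    """Node of a circular doubly linked list, modelled on struct list_head."""

    def __init__(self, name):
        self.name = name
        self.next = self
        self.prev = self


def list_add(new, head):
    """Insert new right after head."""
    nxt = head.next
    nxt.prev = new
    new.next = nxt
    new.prev = head
    head.next = new


def list_del_entry(entry):
    """Unlink entry from its neighbours; entry's own pointers stay as they were."""
    entry.prev.next = entry.next
    entry.next.prev = entry.prev


def list_move(entry, head):
    """Unlink entry and re-insert it after head."""
    list_del_entry(entry)
    list_add(entry, head)


def names(head):
    """Walk forward from head and return the node names in order."""
    out = []
    node = head.next
    while node is not head:
        out.append(node.name)
        node = node.next
    return out


def build(labels):
    """Build a list containing one node per label; return (head, nodes by label)."""
    head = ListHead("head")
    tail = head
    nodes = {}
    for label in labels:
        node = ListHead(label)
        list_add(node, tail)  # append after the current tail
        tail = node
        nodes[label] = node
    return head, nodes
```

With the list `a, rule, next_rule, d`, calling `list_move(rule, next_rule.prev)` passes `rule` as its own insertion point, so `rule` ends up linked only to itself and a forward walk no longer visits it. Unlinking first with `list_del_entry(rule)` and only then reading `next_rule.prev` yields `a` as the anchor, and the list stays intact.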

Fixes: aa14b0b9a152 ("LU-8006 ptlrpc: specify ordering of TBF policy rules")
Change-Id: Ica30d3329f07914657ac2c4089d66f934021b763
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/43925
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14711 tests: Ensure there's no eviction with long cache discard 69/43869/9
Oleg Drokin [Sat, 29 May 2021 02:42:49 +0000 (22:42 -0400)]
LU-14711 tests: Ensure there's no eviction with long cache discard

Just pause execution while doing page processing
for discard if appropriate failloc is set.

Change-Id: If0d04f3cad267cbeeab63040d63e048dcf03cd6b
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Test-Parameters: trivial testlist=sanity env=ONLY=903
Reviewed-on: https://review.whamcloud.com/43869
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
2 years agoLU-13717 sec: filename encryption 90/43390/15
Sebastien Buisson [Tue, 23 Mar 2021 13:58:50 +0000 (22:58 +0900)]
LU-13717 sec: filename encryption

On client side, call the appropriate llcrypt primitives from llite,
to proceed with filename encryption before sending requests to servers
and filename decryption upon request receipt.
Note we need specific overlay functions to handle encoding and
decoding of encrypted filenames, as we do not want server side to deal
with binary names before they reach the backend file system layer.

On server side, mainly the OSD layer, we need to know the encryption
status of files being processed.
If an object belongs to an encrypted file, the filename has been
encoded by the client because it is binary, so it needs to be decoded
before being handed over to the backend file system layer.
And conversely, the filename of an encrypted file has to be encoded
before being sent over the wire.
Note server side is osd-ldiskfs only for now.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7ac9047f5a046b8bc63afdbbb1f28e78aa5c8c7e
Reviewed-on: https://review.whamcloud.com/43390
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
2 years agoLU-14854 mdd: proper handle error in mdd_swap_layouts() 19/44319/5
Bobi Jam [Thu, 15 Jul 2021 18:20:54 +0000 (02:20 +0800)]
LU-14854 mdd: proper handle error in mdd_swap_layouts()

Only restore object's HSM xattr on error if it's for
SWAP_LAYOUTS_MDS_HSM.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I9d4c58cd3107c3900e72a0946d0ec7d7286dd43f
Reviewed-on: https://review.whamcloud.com/44319
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-9897 tests: add generated files to .gitignore 78/44778/3
James Simmons [Sat, 28 Aug 2021 23:55:49 +0000 (19:55 -0400)]
LU-9897 tests: add generated files to .gitignore

Several binaries and wrappers created by the build process
show up as candidates for git add, which they are not. Add
these files to .gitignore to avoid an accidental git addition.

Test-Parameters: trivial
Change-Id: If693ba7933c0329a333dec71ed6fb521a90435f4
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44778
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 years agoLU-14967 obdclass: EAGAIN after rhashtable_walk_next() 66/44766/3
Alex Zhuravlev [Fri, 27 Aug 2021 05:42:56 +0000 (08:42 +0300)]
LU-14967 obdclass: EAGAIN after rhashtable_walk_next()

rhashtable_walk_next() can return -EAGAIN when concurrent resizing
has happened. So the callers should check for this error and just
repeat rhashtable_walk_next().
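
The retry pattern can be sketched as follows. This is a Python model only: `walk_next` stands in for rhashtable_walk_next(), with the integer `-errno.EAGAIN` modelling the kernel's error pointer and `None` modelling the end of the table; `walk_all` is a hypothetical name.

```python
import errno


def walk_all(walk_next):
    """Collect every object produced by walk_next(), retrying on -EAGAIN."""
    found = []
    while True:
        obj = walk_next()
        if obj is None:
            # End of the table.
            break
        if obj == -errno.EAGAIN:
            # A concurrent resize happened; just call walk_next() again.
            continue
        found.append(obj)
    return found
```

The point of the sketch is only that -EAGAIN is neither an object nor the end of the walk, so treating it as either one would process garbage or stop early.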

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I15ba2cdf16c2678e18836b4f16b56a3b8bfdacd0
Reviewed-on: https://review.whamcloud.com/44766
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14776 zfs: fix Ubuntu 20 HWE build issues 49/44749/3
James Simmons [Wed, 25 Aug 2021 17:17:51 +0000 (11:17 -0600)]
LU-14776 zfs: fix Ubuntu 20 HWE build issues

Newer Ubuntu systems using ZFS dkms hit the following build
errors:

    In file included from zfs/2.0.2/source/include/sys/arc.h:32,
                 from lustre/osd-zfs/osd_internal.h:50,
                 from lustre/osd-zfs/osd_handler.c:51:
    zfs/2.0.2/source/include/sys/zfs_context.h:45:10:
                 fatal error: sys/types.h: No such file or directory
    45 | #include <sys/types.h>
       |          ^~~~~~~~~~~~~
    compilation terminated.

This is due to the layout of the tree containing the needed headers.
Include those paths in the build system.

Test-Parameters: trivial
Change-Id: I453830c4111ad88ec655d3d7d0ee51627331cb0b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44749
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14776 build: Ubuntu 20.04.2 and 20.04.3 HWE client support 48/44748/2
James Simmons [Wed, 25 Aug 2021 13:26:52 +0000 (09:26 -0400)]
LU-14776 build: Ubuntu 20.04.2 and 20.04.3 HWE client support

We now support Lustre clients on both Ubuntu 20.04.2 and
20.04.3 HWE platforms.

Change-Id: I772af876ffa8beeabb8a2002f80aa776fa373996
Test-Parameters: trivial
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44748
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14962 lnet: Check for -ESHUTDOWN in lnet_parse 43/44743/3
Chris Horn [Tue, 24 Aug 2021 16:16:17 +0000 (11:16 -0500)]
LU-14962 lnet: Check for -ESHUTDOWN in lnet_parse

The fix for LU-8106, http://review.whamcloud.com/19993, no longer
works because rc does not have the return value from
lnet_nid2peerni_locked(). Use PTR_ERR to get the return value and
restore the LU-8106 fix.

HPE-bug-id: LUS-10333
Fixes: fa8b4e6357 ("LU-7734 lnet: peer/peer_ni handling adjustments")
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I9cc2bc2d6e675d38cf06d99c524bdd95110bf0e9
Reviewed-on: https://review.whamcloud.com/44743
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14961 tests: set Pool Quotas 40/44740/4
Elena Gryaznova [Tue, 24 Aug 2021 11:23:22 +0000 (14:23 +0300)]
LU-14961 tests: set Pool Quotas

We are interested in running some tests on a filesystem with
pool quotas set for some users. For instance, setting
pool quota limits for mpiuser allows stressing the pool
quotas code with MPI tests.
The patch adds the ability to set pool quota block hard limits
for specific users via POOLS_QUOTA_USERS_SET.
Example:
  POOLS_QUOTA_USERS_SET="quota15_1:20M
                quota15_2:1G:gpool0
                quota15_4:200M:gpool0
                quota15_4:200M:gpool1"
For quota15_1, the 20M limit will be set for all existing
pools.
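
The variable format in the example can be read as whitespace-separated `user:limit[:pool]` entries. The following Python sketch expands such a value into one entry per user and pool; it is a model of the format as described here, not the shell code in the test framework, and `parse_pool_quota_users` is a hypothetical name.

```python
def parse_pool_quota_users(value, existing_pools):
    """Expand 'user:limit[:pool]' entries into (user, limit, pool) tuples."""
    result = []
    for entry in value.split():
        fields = entry.split(":")
        if len(fields) == 2:
            # No pool given: the limit applies to every existing pool.
            user, limit = fields
            pools = list(existing_pools)
        elif len(fields) == 3:
            user, limit, pool = fields
            pools = [pool]
        else:
            raise ValueError("malformed entry: " + entry)
        for pool in pools:
            result.append((user, limit, pool))
    return result
```

For the example value above with pools gpool0 and gpool1, quota15_1 expands to two entries (one per pool), while quota15_2 yields a single entry for gpool0.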

Test-Parameters: env=FS_POOL="glo",POOLS_QUOTA_USERS_SET="mpiuser:200M quota15_1:2000M:glo1",FS_NPOOLS="2",ENABLE_QUOTA="yes"
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-10059
Reviewed-by: Vladimir Saveliev <vlaidimir.saveliev@hpe.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Change-Id: Ia9ee540ca77e70f37aa849e5e555e3c057e2052d
Reviewed-on: https://review.whamcloud.com/44740
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14960 tests: enhance ha.sh to work with several test dirs 39/44739/3
Elena Gryaznova [Tue, 24 Aug 2021 10:56:58 +0000 (13:56 +0300)]
LU-14960 tests: enhance ha.sh to work with several test dirs

The patch adds the ability to work with several test directories
set via the ha_test_dirs variable.
Useful for emulating more Lustre clients.
Example:
  before the test mount Lustre on:
    /mnt/lustre, /mnt/lustre1 /mnt/lustre3
  Run ha.sh with:
  ha_test_dirs="/mnt/lustre /mnt/lustre1 /mnt/lustre3"
  The client's test directories will be created in the listed
  test directories:
  client0 works in /mnt/lustre subdirectory
  client1 works in /mnt/lustre1 subdirectory,
  etc.

The patch also adds the ability to not remove the test directories
if CLEANUP is set to false.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-9705
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Change-Id: I1d04b7deeda693c9ca1c86411b0a66c6a2315923
Reviewed-on: https://review.whamcloud.com/44739
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-9859 libcfs: change libcfs_log_* functions to inline 81/44581/3
James Simmons [Tue, 10 Aug 2021 18:05:31 +0000 (14:05 -0400)]
LU-9859 libcfs: change libcfs_log_* functions to inline

The functions libcfs_log_return() and libcfs_log_goto() don't
exist in the native Linux client. We still need them for the
special OpenSFS debugging but we can change those functions
to simple inline routines since they are just wrappers
around libcfs_debug_msg().

Test-Parameters: trivial
Change-Id: I0e2b40feb18f9f1a1ffbda39756ab64308ea6439
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/44581
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14021 llite: don't touch vma after filemap_fault 58/44558/2
Alexander Boyko [Tue, 10 Aug 2021 14:20:42 +0000 (10:20 -0400)]
LU-14021 llite: don't touch vma after filemap_fault

In case of error, filemap_fault releases vma->vm_mm->mmap_sem,
so touching vma is dangerous: it could be reused or freed.
The patch uses a local file variable to avoid accessing vma.

HPE-bug-id: LUS-10240
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I72cd086645061819fab5b8595a880db64cfb9ff7
Reviewed-on: https://review.whamcloud.com/44558
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14807 lfsck: fix race in lfsck_pos_fill 30/44130/7
Hongchao Zhang [Sun, 27 Jun 2021 21:00:20 +0000 (05:00 +0800)]
LU-14807 lfsck: fix race in lfsck_pos_fill

There is a race for lfsck->li_di_dir between lfsck_di_dir_put and
lfsck_pos_fill, which could cause lfsck_pos_fill to use freed
lfsck->li_di_dir (struct osd_it_ea) and trigger GPF.

Change-Id: Iedadf03ac15d128bb051aea8aafa24dbcd2704fb
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/44130
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-14696 llite: check read only mount for setquota 65/43765/6
Hongchao Zhang [Thu, 12 Aug 2021 11:06:45 +0000 (19:06 +0800)]
LU-14696 llite: check read only mount for setquota

Setting quota should fail if the mount is read-only.

Change-Id: I966ac71d0a4a72dcb998f09ffc0f99ae28498e27
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/43765
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 years agoLU-13668 mdt: change lock mode for lease 64/38964/23
Alex Zhuravlev [Wed, 17 Jun 2020 14:05:28 +0000 (17:05 +0300)]
LU-13668 mdt: change lock mode for lease

Make it PW so that lfs getstripe and open-for-read do not
interrupt replication.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I20f4bbbc4e7bf9055333aba1b8cca80aa899c664
Reviewed-on: https://review.whamcloud.com/38964
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>