Whamcloud - gitweb
fs/lustre-release.git
3 years ago LU-14473 test: check RUNAS and RUNAS_ID 85/41785/5
Olaf Faaland [Sat, 27 Feb 2021 00:53:38 +0000 (16:53 -0800)]
LU-14473 test: check RUNAS and RUNAS_ID

Validate RUNAS and RUNAS_ID before testing a file create, so
that the error messages can be more specific.

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I87b2c279f981b34ab979cca42a8ae06128a294cc
Reviewed-on: https://review.whamcloud.com/41785
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14291 lustre: further cleanup of acl code. 32/42032/3
Mr NeilBrown [Sun, 14 Mar 2021 22:34:55 +0000 (09:34 +1100)]
LU-14291 lustre: further cleanup of acl code.

Code in lustre/obdclass/acl.c is only used in lustre/mdd/, so move the
file there, renaming to mdd_acl.c and removing EXPORT_SYMBOL()
declarations.

The function prototypes in lustre_eacl.h are moved to mdd_internal.h,
and the remainder of that file is discarded.  The
HAVE_STRUCT_ACL_XATTR stanza, in particular, is unnecessary as it
exists in lustre_compat.h.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idb0978758640c5ad527d2c68c4fdf6dee32a731c
Reviewed-on: https://review.whamcloud.com/42032
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 years ago LU-8837 lmv: don't use lqr_alloc spinlock in lmv 49/41949/6
Mr NeilBrown [Mon, 8 Mar 2021 22:28:48 +0000 (09:28 +1100)]
LU-8837 lmv: don't use lqr_alloc spinlock in lmv

The only place the lqr_alloc spinlock is used in lmv is in
lmv_locate_tgt_rr().  The purpose here is presumably to protect
lmv_qos_rr_index from concurrent updates.  This is a field that is
only tangentially related to the structure that holds the spinlock.

lmv_qos_rr_index is directly in 'struct lmv_obd' while lqr_alloc
is in struct lu_qos_rr which is in struct lu_qos, which is in lmv_obd.

As there is a spinlock in 'struct lmv_obd' (lmv_lock) it makes more
sense to use that to protect lmv_qos_rr_index.  Then the entire
lu_qos_rr structure will be unused on the client and can be made
server-only.
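
A minimal userspace sketch of the idea follows.  It assumes a pthread
mutex in place of the kernel spinlock, and the struct and function names
are hypothetical stand-ins rather than the lmv code: the round-robin
cursor is protected by the lock that lives in the same structure.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of an object that owns both the lock and the index. */
struct rr_picker_sketch {
	pthread_mutex_t lock;		/* plays the role of lmv_lock */
	unsigned int rr_index;		/* plays the role of lmv_qos_rr_index */
	unsigned int tgt_count;		/* number of targets to cycle through */
};

static void rr_picker_init(struct rr_picker_sketch *p, unsigned int tgt_count)
{
	assert(p != NULL && tgt_count > 0);

	pthread_mutex_init(&p->lock, NULL);
	p->rr_index = 0;
	p->tgt_count = tgt_count;
}

/* Return the next target index, advancing the cursor under the lock. */
static unsigned int rr_picker_next(struct rr_picker_sketch *p)
{
	unsigned int idx;

	pthread_mutex_lock(&p->lock);
	idx = p->rr_index;
	p->rr_index = (idx + 1) % p->tgt_count;
	pthread_mutex_unlock(&p->lock);

	return idx;
}
```

Keeping the lock next to the field it protects means no other structure
has to exist just to supply a lock.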

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I926e6d31ca0ee1cbfff9905192428e28485ed448
Reviewed-on: https://review.whamcloud.com/41949
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14385 utils: add range check to strtol() in lfs.c 56/41756/4
Jian Yu [Thu, 11 Mar 2021 23:10:29 +0000 (15:10 -0800)]
LU-14385 utils: add range check to strtol() in lfs.c

Most of the strtol() and strtoll() functions called
in lfs.c did not check the range of the return value.
This patch fixes those issues.
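
For illustration, a range-checked strtol() call could look like the
sketch below.  The helper name parse_long_in_range() and its error
convention are assumptions made for this example, not code taken from
lfs.c.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a base-10 integer and reject anything
 * outside [min, max].
 * Returns 0 on success, -EINVAL for malformed input, and -ERANGE when
 * the value overflows long or falls outside the caller's range.
 */
static int parse_long_in_range(const char *str, long min, long max, long *out)
{
	char *end = NULL;
	long val;

	assert(str != NULL && out != NULL);

	errno = 0;
	val = strtol(str, &end, 10);

	/* No digits consumed, or trailing garbage after the number. */
	if (end == str || *end != '\0')
		return -EINVAL;

	/* strtol() clamps the result and sets errno on overflow. */
	if (errno == ERANGE)
		return -ERANGE;

	if (val < min || val > max)
		return -ERANGE;

	*out = val;
	return 0;
}
```

Checking the end pointer before errno keeps "not a number" distinct from
"number too large".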

Change-Id: I9ff51662bf0d2320961a7838da08f09552e9ef1e
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41756
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14428 libcfs: discard cfs_trace_copyin_string() 90/41490/4
Mr NeilBrown [Tue, 9 Feb 2021 00:49:30 +0000 (11:49 +1100)]
LU-14428 libcfs: discard cfs_trace_copyin_string()

Instead of cfs_trace_copyin_string(), use memdup_user_nul().
This combines the allocation with the copyin, and nul-terminates.

The resulting code is a lot simpler.
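
As a userspace analogue only, the "allocate, copy, nul-terminate" pattern
that memdup_user_nul() provides in the kernel can be sketched as below.
Here memcpy() stands in for the copy from user space and NULL stands in
for the kernel's error-pointer return; memdup_nul_sketch() is a
hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Allocate len + 1 bytes, copy len bytes from src, and append a NUL
 * terminator.  Returns NULL on allocation failure or size overflow.
 * The caller owns the returned buffer and must free() it.
 */
static char *memdup_nul_sketch(const void *src, size_t len)
{
	char *buf;

	assert(src != NULL);

	if (len == SIZE_MAX)
		return NULL;	/* len + 1 would wrap around */

	buf = malloc(len + 1);
	if (buf == NULL)
		return NULL;

	memcpy(buf, src, len);
	buf[len] = '\0';
	return buf;
}
```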

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I089c5da96b59ec62d177aea2f3d170bf751c6fec
Reviewed-on: https://review.whamcloud.com/41490
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14428 libcfs: discard cfs_trace_console_buffers[] 89/41489/4
Mr NeilBrown [Tue, 9 Feb 2021 00:28:45 +0000 (11:28 +1100)]
LU-14428 libcfs: discard cfs_trace_console_buffers[]

cfs_trace_console_buffers[] is a collection of buffers into which
various messages are formatted - with vscnprintf or similar - and
which are then passed to cfs_print_to_console which adds more
formatted information.

The two levels of formatting can instead be achieved using the "%pV"
format which takes a format-and-args.  If we do this, we don't need
cfs_trace_console_buffers[] any more.

One minor drawback is that cfs_tty_write_message() requires a final
string to print, not a format plus arguments.  This is only minor
because there is precisely one message that is ever sent to
cfs_tty_write_message(), and it contains no formatting.  So we now
generate a warning if the string passed with D_TTY ever contains
formatting, and just print that string ignoring any formatting.
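
The "%pV" format is kernel-only, so the sketch below is merely a
userspace approximation of the same idea: a fixed prefix and the caller's
format-plus-arguments are expanded straight into the final buffer, with
no scratch buffer per formatting level.  The name
console_format_sketch() and the "prefix: message" layout are assumptions
for this example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "<prefix>: <formatted message>" into out.
 * Returns the number of characters that would have been written
 * (snprintf semantics), or a negative value on a formatting error.
 */
static int console_format_sketch(char *out, size_t len, const char *prefix,
				 const char *fmt, ...)
{
	va_list ap;
	int head, tail;

	assert(out != NULL && len > 0);

	head = snprintf(out, len, "%s: ", prefix);
	if (head < 0)
		return head;
	if ((size_t)head >= len)
		return head;	/* the prefix alone was truncated */

	/* Expand the caller's format directly after the prefix. */
	va_start(ap, fmt);
	tail = vsnprintf(out + head, len - (size_t)head, fmt, ap);
	va_end(ap);

	return tail < 0 ? tail : head + tail;
}
```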

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic78ac3703e5b6321dade8c367753c0aec1cae60b
Reviewed-on: https://review.whamcloud.com/41489
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14398 hsm: use llapi_fid2path_at() in the copytool 08/41408/2
John L. Hammond [Wed, 3 Feb 2021 20:19:05 +0000 (14:19 -0600)]
LU-14398 hsm: use llapi_fid2path_at() in the copytool

In lhsmtool_posix.c and liblustreapi_hsm.c, convert several uses of
llapi_fid2path() to llapi_fid2path_at().

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ice64d02010b4260287be4d4e26c6b75b178bc81b
Reviewed-on: https://review.whamcloud.com/41408
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14179 lfs: avoid lfs find error with long paths 37/41337/8
Stephane Thiell [Fri, 26 Feb 2021 20:33:04 +0000 (12:33 -0800)]
LU-14179 lfs: avoid lfs find error with long paths

Test that files created in a directory having an absolute path length
of up to PATH_MAX-1 are properly found with lfs find. This change
might not cover other very deep directory trees (above PATH_MAX).

Signed-off-by: Stephane Thiell <sthiell@stanford.edu>
Change-Id: I44726efd5053c593094587e5c8a4652a3a876641
Reviewed-on: https://review.whamcloud.com/41337
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14119 lfsck: replace dt_lookup() with dt_lookup_dir() 18/41218/3
Lai Siyao [Wed, 13 Jan 2021 09:16:55 +0000 (17:16 +0800)]
LU-14119 lfsck: replace dt_lookup() with dt_lookup_dir()

Lfsck code calls dt_lookup() to look up a sub-file under a directory in
many places, but this function requires the directory to be initialized
with dt_try_as_dir() first, and that call is missing in several places.
Since the overhead is trivial, call dt_lookup_dir() instead.
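
The pattern of a combined helper that always runs the initialization
step before the lookup can be modelled in userspace as below.  All types
and names here are hypothetical stand-ins, not the Lustre dt_* API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a directory object; not a Lustre type. */
struct dir_sketch {
	bool is_dir;		/* the object really is a directory */
	bool index_ready;	/* set once the "try as dir" step has run */
	const char *entry;	/* the single name this toy directory holds */
};

/* Models the initialization step; returns true when lookups may proceed. */
static bool dir_try_as_dir_sketch(struct dir_sketch *d)
{
	if (d->is_dir)
		d->index_ready = true;
	return d->index_ready;
}

/* Models the raw lookup, which is only valid on an initialized directory. */
static int dir_lookup_sketch(const struct dir_sketch *d, const char *name)
{
	assert(d->index_ready);
	return strcmp(d->entry, name) == 0 ? 0 : -ENOENT;
}

/* Models the combined helper: initialize first, then look up. */
static int dir_lookup_dir_sketch(struct dir_sketch *d, const char *name)
{
	if (!dir_try_as_dir_sketch(d))
		return -ENOTDIR;
	return dir_lookup_sketch(d, name);
}
```

Callers that go through the combined helper cannot forget the
initialization step.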

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I40bd8d51edece50353af1729cf867572a0abea78
Reviewed-on: https://review.whamcloud.com/41218
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14110 obdclass: Protect cl_env_percpu[] 65/40565/11
Etienne AUJAMES [Tue, 3 Nov 2020 14:35:17 +0000 (15:35 +0100)]
LU-14110 obdclass: Protect cl_env_percpu[]

cl_env_percpu is not protected against multi client mounts on the
same node: "keys_fill" could be called with the same cl_env_percpu
context by several mount processes (race on lu_context.lc_value).

This patch adds a mutex for cl_env_percpu to protect context
"refill".

Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: Icfd6f3715899fa4ac5279e932f462e7cf29d98bd
Reviewed-on: https://review.whamcloud.com/40565
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-6142 llite: use d_is_symlink to test if dentry is a symlink 70/41770/5
Mr NeilBrown [Fri, 16 Oct 2020 00:07:21 +0000 (11:07 +1100)]
LU-6142 llite: use d_is_symlink to test if dentry is a symlink

Using d_is_symlink() is preferred to testing ->get_link or
->follow_link.

A recent patch made this work for foreign files/dirs by making sure
the entry type in d_flags is correct, so we can simplify the code in
ll_revalidate_dentry().

Fixes: 15d44e787e17 ("LU-12682 llite: fake symlink type of foreign file/dir")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie4c33ae1fb9a660ccbd50e2c70b6cde65cc9b990
Reviewed-on: https://review.whamcloud.com/41770
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14480 pool: wrong usage with ost list 15/41815/3
Vitaly Fertman [Wed, 16 Dec 2020 22:02:32 +0000 (01:02 +0300)]
LU-14480 pool: wrong usage with ost list

When the OST list is given on setstripe, it should have priority
over the pool. Also, at creation time we check only whether the 1st
OST is in the pool, which worked well in the past with -c and
works even with -C, but not with the OST list when some of the OSTs
are outside the pool.

Make the --pool and --ost options mutually exclusive.
Drop the pool inheritance if the OST list is given.
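
A minimal sketch of the two rules, assuming the options have already
been parsed into a small struct (the struct and function names are
hypothetical, not the lfs implementation):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical container for already-parsed setstripe options. */
struct stripe_opts_sketch {
	const char *pool_name;	/* NULL when --pool was not given */
	int ost_count;		/* 0 when --ost was not given */
};

/* Reject a command line that names both a pool and an explicit OST list. */
static int stripe_opts_validate(const struct stripe_opts_sketch *opts)
{
	assert(opts != NULL);

	if (opts->pool_name != NULL && opts->ost_count > 0)
		return -EINVAL;
	return 0;
}

/*
 * Pick the pool to apply: an explicit OST list drops any pool inherited
 * from the parent, otherwise an explicit pool wins over the inherited one.
 */
static const char *stripe_opts_effective_pool(const struct stripe_opts_sketch *opts,
					      const char *parent_pool)
{
	assert(opts != NULL);

	if (opts->ost_count > 0)
		return NULL;
	return opts->pool_name != NULL ? opts->pool_name : parent_pool;
}
```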

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I94a7fe97391f1185392f986f78ab1a372238972a
Reviewed-on: https://es-gerrit.dev.cray.com/158198
HPE-bug-id: LUS-9579
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/41815
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14182 lov: cancel layout lock on replay deadlock 67/40867/2
Vitaly Fertman [Fri, 4 Dec 2020 16:35:19 +0000 (19:35 +0300)]
LU-14182 lov: cancel layout lock on replay deadlock

Layout locks are not replayed and are instead cancelled as unused, which
requires taking lov_conf_lock. The semaphore may already be held by
cl_lock_flush(), which prepares a new IO that cannot be sent to the
MDS while it is in recovery.

HPE-bug-id: LUS-9232
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I1a1a91a81c19ad4deca9ff581107512642f0b666
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-on: https://review.whamcloud.com/40867
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14291 build: use tgt_pool for lov layer 83/39683/8
James Simmons [Fri, 26 Feb 2021 21:41:09 +0000 (16:41 -0500)]
LU-14291 build: use tgt_pool for lov layer

New general code was created for target pool handling. We can
use this new code with the lov layer. Place this tgt_pool.c in
the obdclass instead of having a special target directory just to
build this code for the client.

Change-Id: I05542c1d654d79647f5e0853bb1d587ff265fdf9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/39683
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-6142 lustre: remove ll_file_*_flag wrappers. 92/40292/6
Mr NeilBrown [Thu, 15 Oct 2020 23:34:04 +0000 (10:34 +1100)]
LU-6142 lustre: remove ll_file_*_flag wrappers.

ll_file_{test,set,clear,test_and_set}_flag are simple wrappers around
the various *_bit() functions.  They don't aid readability and the
convention in the kernel is to use the *_bit() functions directly.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I0d50f8936ad9f97882f4771dd3210cc05fe43989
Reviewed-on: https://review.whamcloud.com/40292
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-8837 ptlrpc: mark some functions as static 47/41947/5
Mr NeilBrown [Mon, 1 Mar 2021 04:38:42 +0000 (15:38 +1100)]
LU-8837 ptlrpc: mark some functions as static

The functions
 ptlrpc_start_threads,
 ptlrpc_start_thread,
 ptlrpc_stop_all_threads
 ptlrpc_nrs_policy_register
and
 ptlrpc_nrs_policy_register

are only used in the same file that defines them, so mark them as
'static' and remove the declarations from include files.

 ptlrpc_nrs_policy_unregister

is never used at all, so remove it completely.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id7862b9da3c58ab980c0fcd4d07c1f119fbf7581
Reviewed-on: https://review.whamcloud.com/41947
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 years ago LU-14289 ptlrpc: rename cfs_binheap to simply binheap 75/41375/3
Mr NeilBrown [Mon, 1 Feb 2021 02:16:12 +0000 (13:16 +1100)]
LU-14289 ptlrpc: rename cfs_binheap to simply binheap

As the binheap code is no longer part of libcfs, the cfs_ prefix is
misleading.  As this code is local to one module and doesn't conflict
with anything global, there is no need for a prefix at all.  So change
cfs_binheap to binheap.

This patch was prepared using 'sed', then fixing a few text-alignment
issues caused by the loss of those 4 characters.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I168bec50898ec7b9ab72dc91b080af4852ddb3a4
Reviewed-on: https://review.whamcloud.com/41375
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14291 lustre: clean up lustre_eacl.h and make server-only 26/41126/4
Mr NeilBrown [Thu, 29 Oct 2020 05:13:36 +0000 (16:13 +1100)]
LU-14291 lustre: clean up lustre_eacl.h and make server-only

lustre_eacl.h contains a number of declarations that are never used:
remove them.

The declarations which are used are only needed on server-side files,
so remove the #include from elsewhere.

As obdclass/acl.c is only built server-side, remove the
 #ifdef HAVE_SERVER_SUPPORT
in the file.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If1a3d908bf8357041c38ab9d335efa1e051cef16
Reviewed-on: https://review.whamcloud.com/41126
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-13783 libcfs: don't depend on sysctl support for debugfs 32/40832/4
Mr NeilBrown [Thu, 12 Nov 2020 00:16:28 +0000 (11:16 +1100)]
LU-13783 libcfs: don't depend on sysctl support for debugfs

Since Linux v5.8-rc1~55^2~6 sysctl support routines like
proc_dointvec() expect a pointer to kernel-space, not userspace.

So stop using these functions for debugfs files, and instead
provide bespoke functions.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I340a748bbfbd066054a73299ce32698aa39a0e2d
Reviewed-on: https://review.whamcloud.com/40832
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13783 libcfs: support __vmalloc with only 2 args. 28/40328/7
Mr NeilBrown [Wed, 21 Oct 2020 04:26:35 +0000 (15:26 +1100)]
LU-13783 libcfs: support __vmalloc with only 2 args.

Since v5.8-rc1~201^2~19 Commit 88dca4ca5a93 ("mm: remove the pgprot
argument to __vmalloc") __vmalloc only takes 2 arguments.

So introduce __ll_vmalloc which takes 2 args, and calls
__vmalloc with correct number of args.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2c89512a12e28b27544a891620e448a9b752b089
Reviewed-on: https://review.whamcloud.com/40328
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-13903 utils: move userland only nidstr.h handling 15/39115/5
James Simmons [Mon, 8 Mar 2021 14:09:40 +0000 (09:09 -0500)]
LU-13903 utils: move userland only nidstr.h handling

The function cfs_expand_nidlist() no longer exists for kernel
internals. We can move the function prototype from the UAPI
header to string.h which is a libcfs user land header.
The structure netstrfns that is defined in a UAPI header
has had user land only handling added. Additionally, it
uses struct list_head, which will confuse reviewers since
kernel developers see this as a kernel only thing.

Test-Parameters: trivial

Change-Id: Ifc3c87f6d3237a94d282d009455ff389278e73ea
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/39115
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10391 socklnd: convert ksocknal_add_peer to take sockaddr 08/38408/7
Mr NeilBrown [Tue, 28 Jan 2020 01:15:13 +0000 (12:15 +1100)]
LU-10391 socklnd: convert ksocknal_add_peer to take sockaddr

ksocknal_add_peer() now takes a 'struct sockaddr' which is currently
always an IPv4 address.  ksocknal_launch_packet() is the main place
where the nid is converted to an IP address.
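
As a userspace sketch of the general technique, an IPv4 address can be
wrapped in a generic socket address so that callers only deal with
'struct sockaddr'.  The helper names and the example values used with
them are assumptions, not the socklnd code.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fill a generic sockaddr_storage from a host-order IPv4 address and port. */
static void ipv4_to_sockaddr_sketch(uint32_t ip_host, uint16_t port_host,
				    struct sockaddr_storage *ss)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)ss;

	assert(ss != NULL);

	memset(ss, 0, sizeof(*ss));
	sin->sin_family = AF_INET;
	sin->sin_addr.s_addr = htonl(ip_host);	/* network byte order */
	sin->sin_port = htons(port_host);
}

/* A consumer that takes 'struct sockaddr' and inspects only the family. */
static int sockaddr_is_ipv4_sketch(const struct sockaddr *sa)
{
	return sa->sa_family == AF_INET;
}
```

Passing the generic type means a later change to support another
address family does not have to touch every function signature again.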

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I194248662798542096e5cc9af985e6c0063a038a
Reviewed-on: https://review.whamcloud.com/38408
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-12752 mdt: commitrw_write() - check dying object under lock 97/41797/5
Vladimir Saveliev [Mon, 1 Mar 2021 08:52:51 +0000 (11:52 +0300)]
LU-12752 mdt: commitrw_write() - check dying object under lock

If a process writes to an unlinked file, the following race between
mdt_commitrw_write() and mdd_close() may occur because
mdt_commitrw_write() checks whether an object is dying without a lock:

mdt_commitrw_write() checks lu_object_is_dying(&mo->mot_header) and it
is not dying yet

mdd_close() interposes and destroys the object via
  mdo_destroy()
    lod_destroy()
      lod_sub_destroy()
        osd_destroy()
          obj->oo_destroyed = 1;

mdt_commitrw_write() continues, locks the object and returns ENOENT
from

  dt_attr_get()
    osd_attr_get()
      if (unlikely(obj->oo_destroyed))
        return -ENOENT;

If the file is built of a DoM and a raid component, ll_delete_inode()
calls cl_sync_file_range(), which iterates over both the mdt and raid
components via mdc_io_fsync_start() and osc_io_fsync_start().  As
mdc_io_fsync_start() fails with -ENOENT due to the failed write rpc,
osc_io_fsync_start() does not get called. Then
truncate_inode_pages_final() finds pages that were not discarded and
fails with:

  (osc_page.c:183:osc_page_delete()) Trying to teardown failed: -16
  (osc_page.c:184:osc_page_delete()) ASSERTION( 0 ) failed:
  (osc_page.c:184:osc_page_delete()) LBUG

A test to illustrate the issue is added.

The fix is to call lu_object_is_dying() under object lock.
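
The general shape of such a fix can be sketched in userspace as follows,
with a pthread mutex standing in for the object lock.  The names are
hypothetical, and what the write path does for a dying object (here it
skips the write and reports that) is an assumption of the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical object with a lock, a "dying" flag and a write counter. */
struct obj_sketch {
	pthread_mutex_t lock;
	bool dying;
	int writes;
};

static void obj_sketch_init(struct obj_sketch *o)
{
	assert(o != NULL);

	pthread_mutex_init(&o->lock, NULL);
	o->dying = false;
	o->writes = 0;
}

/* Destroy path: mark the object as dying while holding the lock. */
static void obj_sketch_destroy(struct obj_sketch *o)
{
	pthread_mutex_lock(&o->lock);
	o->dying = true;
	pthread_mutex_unlock(&o->lock);
}

/*
 * Write path: test the flag only after taking the lock, so the destroy
 * path cannot slip in between the check and the write.
 * Returns 1 if the write was applied, 0 if it was skipped.
 */
static int obj_sketch_commit_write(struct obj_sketch *o)
{
	int applied = 0;

	pthread_mutex_lock(&o->lock);
	if (!o->dying) {
		o->writes++;
		applied = 1;
	}
	pthread_mutex_unlock(&o->lock);

	return applied;
}
```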

Change-Id: I463c8a6f85d4f5fd934b167c6194f50ae9d4b7d4
HPE-bug-id: LUS-7189
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-on: https://review.whamcloud.com/41797
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14184 tests: component-add/del tests for DOM 70/40870/3
Vitaly Fertman [Fri, 4 Dec 2020 18:55:41 +0000 (21:55 +0300)]
LU-14184 tests: component-add/del tests for DOM

make duplicates of sanity-pfl 2,3 tests for DOM layout

HPE-bug-id: LUS-8282
Test-parameters: testlist="sanity-pfl/2.* sanity-pfl/3.*"
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: If73d7a436b2fc6b6b564cc6eec14ec9e7e4d6937
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/40870
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-12828 ldlm: not freed req on enqueue 18/41818/2
Vitaly Fertman [Tue, 2 Mar 2021 20:43:08 +0000 (23:43 +0300)]
LU-12828 ldlm: not freed req on enqueue

ldlm_cli_enqueue() may allocate a req but then fail to allocate a req
slot, and it returns an error without freeing the req.
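
This class of leak and its fix can be sketched in userspace as below.
The request type, the slot counter and every function name are
hypothetical; only the error-path shape is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical request object. */
struct request_sketch {
	int slot;
};

/* Counts outstanding allocations so a leak is observable. */
static int live_requests;

static struct request_sketch *request_alloc_sketch(void)
{
	struct request_sketch *req = calloc(1, sizeof(*req));

	if (req != NULL)
		live_requests++;
	return req;
}

static void request_free_sketch(struct request_sketch *req)
{
	if (req != NULL) {
		free(req);
		live_requests--;
	}
}

/* Hypothetical slot allocator: fails once no slots are left. */
static int slot_get_sketch(int *slots_left)
{
	if (*slots_left <= 0)
		return -ENOMEM;
	(*slots_left)--;
	return 0;
}

static int enqueue_sketch(int *slots_left, struct request_sketch **out)
{
	struct request_sketch *req;
	int rc;

	assert(slots_left != NULL && out != NULL);

	req = request_alloc_sketch();
	if (req == NULL)
		return -ENOMEM;

	rc = slot_get_sketch(slots_left);
	if (rc != 0) {
		/* Release the request before returning the error. */
		request_free_sketch(req);
		return rc;
	}

	*out = req;
	return 0;
}
```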

Fixes: 85a12c6c8d ("LU-12828 ldlm: FLOCK request can be processed twice")
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I9663528bbf2bf64f6439fed6c27d0bc3f274b867
HPE-bug-id: LUS-9337
Reviewed-on: https://es-gerrit.dev.cray.com/158433
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/41818
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14183 ldlm: wrong ldlm_add_waiting_lock usage 68/40868/2
Vitaly Fertman [Fri, 4 Dec 2020 17:22:55 +0000 (20:22 +0300)]
LU-14183 ldlm: wrong ldlm_add_waiting_lock usage

exp_bl_lock_at originally accounted for the period from when the BL AST
was sent until the cancel RPC reached the server. LU-6032 started to
update l_blast_sent for expired locks which are still busy - prolonged
locks when the timeout expired. In fact, it is a good idea to cover not
the whole period but only the time until any involved RPC arrives - it
avoids excessively large lock callback timeouts - and the IO which does
the lock prolong is also able to restart the AT cycle by updating
l_blast_sent.

Unfortunately, the change seems to have been made accidentally, as the
main prolong code was not adjusted accordingly.

Fixes: 292aa42e08 ("LU-6032 ldlm: don't disable softirq for exp_rpc_lock")
HPE-bug-id: LUS-9278
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Idc598508fc13aa33ac9fce56f13310ca6fc819d4
Tested-by: Jenkins Build User <nssreleng@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/40868
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andriy Skulysh <askulysh@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-11289 ptlrpc: fix ASSERTION on scp_rqbd_posted 36/41936/3
Yang Sheng [Mon, 8 Mar 2021 14:53:13 +0000 (22:53 +0800)]
LU-11289 ptlrpc: fix ASSERTION on scp_rqbd_posted

The request may still be referenced by another target even after the
service threads have been stopped. This is caused by some portals being
shared among different services. Just wait for the request to be
released as a workaround.

LustreError: (service.c::ptlrpc_service_purge_all())
ASSERTION( list_empty(&svcpt->scp_rqbd_posted) ) failed:
LustreError: (service.c::ptlrpc_service_purge_all()) LBUG
Pid: 21, comm: umount 3.10.0 #1 SMP
Call Trace:
 [<a01c47dc>] libcfs_call_trace+0x8c/0xc0 [libcfs]
 [<a01c488c>] lbug_with_loc+0x4c/0xa0 [libcfs]
 [<a0b534dd>] ptlrpc_unregister_service+0xced/0xd90 [ptlrpc]
 [<a005e122>] ost_cleanup+0x82/0x1b0 [ost]
 [<a08e0bfa>] class_free_dev+0x1ca/0x630 [obdclass]
 [<a08e1240>] class_export_put+0x1e0/0x2b0 [obdclass]
 [<a08e2cc5>] class_unlink_export+0x135/0x170 [obdclass]
 [<a08f8030>] class_decref+0x80/0x160 [obdclass]
 [<a08f8481>] class_detach+0x1b1/0x2e0 [obdclass]
 [<a08fef21>] class_process_config+0x1a91/0x2820 [obdclass]
 [<a08ffe90>] class_manual_cleanup+0x1e0/0x6d0 [obdclass]
 [<a092a115>] server_stop_servers+0xd5/0x160 [obdclass]
 [<a092f6c6>] server_put_super+0x126/0xca0 [obdclass]
 [<8121068a>] generic_shutdown_super+0x6a/0xf0
 [<81210a62>] kill_anon_super+0x12/0x20
 [<a09027e2>] lustre_kill_super+0x32/0x50 [obdclass]
 [<81210e59>] deactivate_locked_super+0x49/0x60
 [<812115a6>] deactivate_super+0x46/0x60
 [<8123019f>] cleanup_mnt+0x3f/0x80
 [<81230232>] __cleanup_mnt+0x12/0x20
 [<810ab085>] task_work_run+0xb5/0xf0
 [<8102ac12>] do_notify_resume+0x92/0xb0
 [<81783c83>] int_signal+0x12/0x17
 Kernel panic - not syncing: LBUG

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Idfb19df123ceae177a0e447e9344bac6861166bf
Reviewed-on: https://review.whamcloud.com/41936
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14492 tests: sanity 27Cb skip condition 03/41903/3
Alexander Zarochentsev [Thu, 15 Oct 2020 08:09:09 +0000 (11:09 +0300)]
LU-14492 tests: sanity 27Cb skip condition

The test skip condition is wrong and causes the
test to be skipped if large xattrs are not supported.
Fixing other tests as well.

Test-Parameters: trivial
Fixes: 591a9b4ce ("LU-9846 lod: Add overstriping support")
HPE-bug-id: LUS-9454
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Change-Id: I7b9d96abb5e4cf2a3955e20828e57a64978e6229
Reviewed-on: https://review.whamcloud.com/41903
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
3 years ago LU-9859 libcfs: remove cfs_capable 83/41783/3
Peng Tao [Mon, 11 Jan 2021 15:49:38 +0000 (10:49 -0500)]
LU-9859 libcfs: remove cfs_capable

Use capable() directly.

Linux-commit: 2eb90a757e9d953c9e2a8fce530422189992fb1b

Change-Id: Iadaa3c743a350def37558b23d954f5dfd4e0844a
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/41783
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14430 mdd: don't assert on default ACL big buffer 75/41775/6
Mikhail Pershin [Fri, 26 Feb 2021 14:48:36 +0000 (17:48 +0300)]
LU-14430 mdd: don't assert on default ACL big buffer

Previous patch may cause situations when default ACL buffer
is bigger than ACL buffer, so that default ACL EA may fit
into the former but not in the latter, causing assertion in
mdd_acl_init().

There is actually no need for an assertion; just return -ERANGE so
that the ACL buffer will be re-allocated.
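
The "return -ERANGE and let the caller grow the buffer" pattern can be
sketched in userspace as below.  The function names and the exact retry
policy are assumptions for illustration, not the mdd code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical producer: copy src into buf, or report that buf is too
 * small instead of asserting.  Returns the number of bytes copied.
 */
static int acl_copy_sketch(void *buf, size_t buf_len,
			   const void *src, size_t src_len)
{
	if (src_len > buf_len)
		return -ERANGE;

	memcpy(buf, src, src_len);
	return (int)src_len;
}

/* Caller: grow the buffer and retry once when the producer says -ERANGE. */
static int acl_fetch_sketch(void **bufp, size_t *buf_lenp,
			    const void *src, size_t src_len)
{
	int rc;

	assert(bufp != NULL && buf_lenp != NULL);

	rc = acl_copy_sketch(*bufp, *buf_lenp, src, src_len);
	if (rc == -ERANGE) {
		void *bigger = realloc(*bufp, src_len);

		if (bigger == NULL)
			return -ENOMEM;
		*bufp = bigger;
		*buf_lenp = src_len;
		rc = acl_copy_sketch(*bufp, *buf_lenp, src, src_len);
	}
	return rc;
}
```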

Fixes: f3d03bc38a3a ("LU-14430 mdd: fix inheritance of big default ACLs")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I8c0665ba693c60506812926a8372b61095d08f78
Reviewed-on: https://review.whamcloud.com/41775
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years ago LU-14427 libcfs: restore LNET_DUMP_ON_PANIC functionality. 88/41488/2
Mr NeilBrown [Tue, 9 Feb 2021 04:30:46 +0000 (15:30 +1100)]
LU-14427 libcfs: restore LNET_DUMP_ON_PANIC functionality.

The functionality enabled by --enable-panic-dumplog was inadvertently
removed in Commit ae0704381efc ("LU-9859 libcfs: merge linux-debug.c
into debug.c")

Restore it.

While we are there, add conditional compilation for other code that is
only needed when this is enabled.

Test-Parameters: trivial
Fixes: ae0704381efc ("LU-9859 libcfs: merge linux-debug.c into debug.c")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If85882045c66e54ff8493396589d4ecbf13f8f3d
Reviewed-on: https://review.whamcloud.com/41488
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-14401 sec: fix migrate for encrypted dir 13/41413/8
Sebastien Buisson [Thu, 4 Feb 2021 08:22:56 +0000 (17:22 +0900)]
LU-14401 sec: fix migrate for encrypted dir

When setting an encryption policy on a directory that we want to
be encrypted, we need to make sure it is empty.
But, in some cases, setting the LL_XATTR_NAME_ENCRYPTION_CONTEXT xattr
should be allowed on non-empty directories, for instance when a
directory is migrated across MDTs into new shard directories.
Also, it is required for the encryption key to be available on the
client when migrating a directory so that the filenames can be
properly rehashed for the new MDT directory shard.
And, in any case, we need to prevent explicit setting of
LL_XATTR_NAME_ENCRYPTION_CONTEXT xattr outside of encryption policy
definition.

Update sanity-sec test_49 to test migration of non-empty encrypted
directory, and add sanity-sec test_57 to test security.c protection.

Fixes: e8f74fb0f5 ("LU-12275 sec: verify dir is empty when setting enc policy")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2466ea35a871c6c07bdcf9fba7191485e855e655
Reviewed-on: https://review.whamcloud.com/41413
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years ago LU-10973 lnet: initial LUTF C infrastructure 86/38086/41
Amir Shehata [Wed, 25 Mar 2020 02:07:58 +0000 (19:07 -0700)]
LU-10973 lnet: initial LUTF C infrastructure

LNet Unit test Framework is a utility that functionally tests LNet
via python scripts. It operates in a master/slave configuration.
Slaves run on multiple test nodes, while the master is responsible
for managing the slaves to perform specific tests.

The LUTF exercises the different LNet features via configuring
LNet through the lnetconfig interface or lnetctl, running traffic
and monitoring statistics and other logging to ensure that tests
have passed.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iefcc4d48d5f144a2abe1fdc0865331e9a9d27318
Reviewed-on: https://review.whamcloud.com/38086
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13107 utils: remove lctl lov_getconfig command 06/37106/11
Andreas Dilger [Fri, 26 Feb 2021 21:57:28 +0000 (16:57 -0500)]
LU-13107 utils: remove lctl lov_getconfig command

The "lctl lov_getconfig" command has been obsolete for some time,
but was kept around for sanity test_44a to work properly.  Now
that LU-11656 has landed, "lfs getstripe -d $DIR" can be used to
get the actual layout used for files created in a directory.

Remove the lov_getconfig command along with the IOC definition
it was using.

Test-Parameters: envdefinitions=ONLY=41a testlist=sanity
Change-Id: If94471b50fafc157c043d241dc19cdcd714cab07
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37106
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12885 mds: add enums for MDS_ATTR flags 12/33512/20
Andreas Dilger [Mon, 14 Oct 2019 03:13:18 +0000 (21:13 -0600)]
LU-12885 mds: add enums for MDS_ATTR flags

Add mds_attr_flags to the code to make it easier to follow the logic.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I833a6e6102f947a9276cb6bf03826fd4a53ebbe5
Reviewed-on: https://review.whamcloud.com/33512
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14491 ldiskfs: do not corrupt journal with bh modification 96/41896/2
Andrew Perepechko [Fri, 5 Mar 2021 14:10:38 +0000 (17:10 +0300)]
LU-14491 ldiskfs: do not corrupt journal with bh modification

Currently, ldiskfs_xattr_delete_inode() zeroes xattr inode
references in cached buffers that haven't been prepared by
get_write_access().

When using journal checksums, it is possible that these buffers
are modified after the checksum is calculated but before the
buffer has been written to journal. Journal replay will fail
with a journal checksum error message if this transaction needs
to be replayed.
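
The failure mode can be illustrated with a small Python sketch. It is not ldiskfs or jbd2 code: zlib.crc32 merely stands in for the journal checksum, and the buffer contents and function names are made up for illustration.

```python
import zlib

def commit_block(buf: bytearray) -> int:
    """Compute the checksum recorded for a block at commit time."""
    return zlib.crc32(bytes(buf))

def replay_ok(written: bytes, recorded_csum: int) -> bool:
    """Replay accepts a block only if its checksum still matches."""
    return zlib.crc32(written) == recorded_csum

# Hypothetical buffer holding an xattr inode reference.
buf = bytearray(b"xattr-inode-ref:1234")
csum = commit_block(buf)

# Case 1: the buffer reaches the journal unmodified.
clean_copy = bytes(buf)

# Case 2: the buffer is zeroed after the checksum was taken
# but before it is written to the journal.
buf[:] = bytes(len(buf))
dirty_copy = bytes(buf)
```

Under these assumptions, replay_ok(clean_copy, csum) holds while replay_ok(dirty_copy, csum) does not, which corresponds to the checksum error seen at journal replay.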

Change-Id: Ia3d44f24715ad97b505e08706933e4eb608c115f
Signed-off-by: Andrew Perepechko <andrew.perepechko@hpe.com>
HPE-bug-id: LUS-9483
Reviewed-on: https://review.whamcloud.com/41896
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9679 osc: simplify osc_extent_find() 91/41691/5
NeilBrown [Thu, 13 Dec 2018 00:32:56 +0000 (11:32 +1100)]
LU-9679 osc: simplify osc_extent_find()

osc_extent_find() contains some code with the same functionality as
osc_extent_merge().  So replace that code with a call to
osc_extent_merge().

This requires that we set cur->oe_grants earlier, as
osc_extent_merge() needs that.
It also requires that osc_extent_merge() allow the victim to be
OES_INV.

Also:

 - fix a pre-existing bug - osc_extent_merge() should never try to
   merge two extents with different ->oe_mppr as later alignment
   checks can get confused.
 - Remove a redundant list_del_init() which is already included in
   __osc_extent_remove().

Linux-Commit: 85ebb57ddc5b ("lustre: osc: simplify osc_extent_find()")

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8a1e0d492f583ba9baf28bafa42d4e31c29ac0da
Reviewed-on: https://review.whamcloud.com/41691
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 llite: create file_operations registration function. 08/40608/6
James Simmons [Thu, 4 Feb 2021 14:48:15 +0000 (09:48 -0500)]
LU-6142 llite: create file_operations registration function.

Create new ll_register_file_operations() to set sbi->ll_ops to the
correct struct file_operations. We can make all the struct
file_operations static.

Change-Id: I0369a4f64de5233d5272bc403f222366f9559000
Test-Parameters: trivial
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/40608
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 lustre: Make dev/body/type operations const 98/39398/7
Mr NeilBrown [Thu, 16 Jul 2020 04:14:10 +0000 (14:14 +1000)]
LU-6142 lustre: Make dev/body/type operations const

Many of
  struct md_device_operations
  struct dt_body_operations
  struct dt_object_operations
  struct dt_device_operations
  struct dt_index_operations
  struct lu_object_operations
  struct lu_device_operations
  struct lu_device_type_operations
are already const.  This patch makes the remainder 'const',
and changes a few to 'static'.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ife82c870a27a9e68e57208d49f51983a552e86ec
Reviewed-on: https://review.whamcloud.com/39398
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-14504 lod: lod_xattr_del() check obj existence 76/41976/2
Lai Siyao [Wed, 10 Mar 2021 10:13:18 +0000 (18:13 +0800)]
LU-14504 lod: lod_xattr_del() check obj existence

lod_declare_xattr_del() skips object if it doesn't exist, but
lod_xattr_del() doesn't, which may trigger assertion in
osp_xattr_del() if a stripe doesn't exist.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I00723d3b0243efd1357107c59dd86967e076e2af
Reviewed-on: https://review.whamcloud.com/41976
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14490 lmv: striped directory as subdirectory mount 93/41893/4
Lai Siyao [Fri, 5 Mar 2021 09:07:34 +0000 (17:07 +0800)]
LU-14490 lmv: striped directory as subdirectory mount

lmv_intent_lookup() will replace fid1 with stripe FID, but if striped
directory is mounted as subdirectory mount, it should be handled
differently. Because fid2 is directory master object, if stripe is
located on different MDT as master object, it will be treated as
remote object by server, thus server won't reply LOOKUP lock back,
therefore each file access needs to lookup "/".

And remote directory (either plain or striped) shouldn't be used for
subdirectory mount, because remote object can't get LOOKUP lock.
Add an option "mdt_enable_remote_subdir_mount" (1 by default for
backward compatibility), mdt_get_root() will return -EREMOTE if
user specified subdir is a remote directory and this option is
disabled.
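
The option check described above might look roughly like the following Python sketch. It is an illustration only, not the MDT code: the function signature and the boolean arguments are assumptions, and only the option name, its default of 1, and the -EREMOTE return value come from the description.

```python
import errno

def mdt_get_root(subdir_is_remote, enable_remote_subdir_mount=True):
    """Return 0 on success or a negative errno, kernel style.

    Both parameters are hypothetical simplifications of server state.
    """
    if subdir_is_remote and not enable_remote_subdir_mount:
        # Remote subdirectory mounts are refused when the option is off.
        return -errno.EREMOTE
    return 0
```

With the default left enabled, a remote subdirectory is still accepted, which matches the backward-compatible behaviour described above.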

Add sanity 247g, updated 247f.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I5e8f95ee95c4155336098e55b7569ed7a43865c1
Reviewed-on: https://review.whamcloud.com/41893
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13730 lod: don't confuse stale with primary flag 03/42003/7
Alex Zhuravlev [Thu, 11 Mar 2021 05:47:34 +0000 (08:47 +0300)]
LU-13730 lod: don't confuse stale with primary flag

There can be a few in-sync replicas which are not primary.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8b984463a2665bc88f2f76247df5366a68d74ea6
Reviewed-on: https://review.whamcloud.com/42003
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12678 o2iblnd: change some ints to bool. 04/39304/5
Mr NeilBrown [Mon, 6 Jul 2020 12:34:39 +0000 (08:34 -0400)]
LU-12678 o2iblnd: change some ints to bool.

Each of these ints can suitably be bool.

Also fix various style issues.

Change-Id: Ic956366afc945f74e692dd5f8953149730a3703e
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/39304
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13073 osp: don't block waiting for new objects 74/40274/42
Alex Zhuravlev [Fri, 16 Oct 2020 16:09:04 +0000 (19:09 +0300)]
LU-13073 osp: don't block waiting for new objects

If an OST is down, a few threads trying to get an already
precreated object may get stuck. Even worse, all QoS-based
allocations are then serialized by a single semaphore, even
those that wouldn't try to allocate on the failed OST.

The patch introduces a noblock flag in the allocation hint,
which is passed to OSP. The QoS code then tries to allocate
objects in a non-blocking manner.
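
The idea of a non-blocking allocation attempt can be sketched in Python as follows. This is not OSP or LOD code: the Target class, the allocate() helper, and the use of a plain lock per target are assumptions chosen to keep the example small.

```python
import threading

class Target:
    """Hypothetical stand-in for one OST guarded by a lock."""
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()

def allocate(targets, noblock=True):
    """Return the name of the first target whose lock can be taken.

    With noblock=True a busy (for example, failed) target is skipped
    instead of being waited for.
    """
    for t in targets:
        if t.lock.acquire(blocking=not noblock):
            try:
                return t.name
            finally:
                t.lock.release()
    return None
```

If the first target's lock is held by a stuck thread, allocate() moves on to the next target rather than stalling every other allocation behind it.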

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I38e66d7569aefecf800dbc32f1049ac87853439e
Reviewed-on: https://review.whamcloud.com/40274
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9855 lustre: use with_imp_locked() more broadly. 95/39595/11
Mr NeilBrown [Fri, 7 Aug 2020 02:05:35 +0000 (12:05 +1000)]
LU-9855 lustre: use with_imp_locked() more broadly.

Several places in lustre take u.cli.cl_sem to protect access to
u.cli.cl_import, and so could use with_imp_locked() achieving cleaner
code.

Using with_imp_locked() in functions calling
ptlrpc_set_import_active() requires care as that function gets a
write-lock on ->cl_sem.  So they need to use with_imp_locked() only to
get a counted reference on the imp, and must drop the lock before
calling ptlrpc_set_import_active().

This patch makes those changes and also:

- introduces with_imp_locked_nested() for sptlrpc_conf_client_adapt(),
- re-indents obd_cleanup_client_import(), which is only tangentially
  related to the main purpose of this patch,
- removes code in ldlm_flock_completion_ast() which takes a copy
  of cl_import, and doesn't use it.
- adds with_imp_locked() to two functions named 'active_store' which
  weren't using it but should
- removes with_imp_locked() from ping_show() and instead includes it
  in ptlrpc_obd_ping() where 'imp' is actually used.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I01a713c200a1698af222bc72cf4f955227a98305
Reviewed-on: https://review.whamcloud.com/39595
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14395 kernel: kernel update RHEL7.9 [3.10.0-1160.15.2.el7] 22/41822/4
Jian Yu [Wed, 3 Mar 2021 01:41:12 +0000 (17:41 -0800)]
LU-14395 kernel: kernel update RHEL7.9 [3.10.0-1160.15.2.el7]

Update RHEL7.9 kernel to 3.10.0-1160.15.2.el7.

Change debuginfo download location since debuginfo.centos.org
does not provide kernel-debuginfo-common anymore.

The patch also reverts the following fix from RHEL 7.9 kernel
since version 3.10.0-1160.8.1.el7:

- [kernel] timer: Fix lockup in __run_timers() caused by
  large jiffies/timer_jiffies delta (Waiman Long) [1849716]

The above fix caused Hard LOCKUP kernel panic.

Test-Parameters: clientdistro=el7.9 serverdistro=el7.9

Change-Id: Icdd9e8bf4bd595dece266f6c5a9b0de344781a93
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41822
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-14477 lnet: handle possibility of IPv6 being unavailable. 91/41791/5
Mr NeilBrown [Mon, 1 Mar 2021 00:54:25 +0000 (11:54 +1100)]
LU-14477 lnet: handle possibility of IPv6 being unavailable.

If CONFIG_IPV6 is not enabled, the attempt to create an IPv6 socket
for accepting new incoming connections will fail.  In that case
we need to create an IPv4 socket instead.

Also ipv6_dev_get_saddr will not be available, so we must not include
the code that tries to use it.
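
The fallback can be sketched in user-space Python, assuming that socket creation raises OSError when IPv6 is not available. The helper name and the injectable factory argument are hypothetical; the kernel code itself is C and is not reproduced here.

```python
import socket

def create_listen_socket(factory=socket.socket):
    """Try an IPv6 stream socket first, fall back to IPv4 on failure."""
    try:
        return factory(socket.AF_INET6, socket.SOCK_STREAM)
    except OSError:
        # IPv6 is unavailable (for example, not built into the kernel).
        return factory(socket.AF_INET, socket.SOCK_STREAM)
```

Passing the factory as a parameter is only a testing convenience: it lets the IPv6 failure be simulated on a machine where IPv6 works.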

Test-Parameters: trivial
Fixes: e4fa181abf10 ("LU-10391 lnet: allow creation of IPv6 socket")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib576c7ea498c90f549958f3c1aa0beb7fe2b66ad
Reviewed-on: https://review.whamcloud.com/41791
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14478 ldiskfs: support Ubuntu 20.04.1 kernel 5.4.0-66 86/41786/2
Jian Yu [Sun, 28 Feb 2021 05:06:56 +0000 (21:06 -0800)]
LU-14478 ldiskfs: support Ubuntu 20.04.1 kernel 5.4.0-66

This patch fixes the conflict in ext4-pdirop.patch to support
Ubuntu 20.04.1 server with kernel version greater than or
equal to 5.4.0-66.

Test-Parameters: trivial

Change-Id: I336f5bb430f87aaefc6d79a782dfd779d20e0cf7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41786
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14460 lnet: fix mismatched printf format 55/41755/4
Lei Feng [Thu, 25 Feb 2021 00:31:56 +0000 (08:31 +0800)]
LU-14460 lnet: fix mismatched printf format

Original "%llx" does not work on all platforms. Fix it.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: I2edecbf66ccb2141c72294d324ade79574f5c084
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/41755
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14431 log: Add ending newline for some messages. 23/41723/5
Lei Feng [Tue, 23 Feb 2021 03:35:18 +0000 (11:35 +0800)]
LU-14431 log: Add ending newline for some messages.

Some log messages don't have an ending newline, so two log messages
can be merged into one line and cause errors in parsing programs.
Add an ending newline to these messages.

Signed-off-by: Lei Feng <flei@whamcloud.com>
Change-Id: I79acba9fc494c148dfe2c56cdbe7694b4bbc5cf4
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/41723
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Anjus George <georgea@ornl.gov>
3 years agoLU-5170 utils: add lfs df -H for decimal units 71/41271/3
Andreas Dilger [Wed, 20 Jan 2021 00:48:28 +0000 (17:48 -0700)]
LU-5170 utils: add lfs df -H for decimal units

Running "lfs df -ih" prints a base-two suffix for inode counts,
which is somewhat unintuitive (e.g. 100000 becomes 97.2K inodes).
While this is consistent with upstream "df", it also has a "-H"
option to print the output with decimal suffixes.

Add the -H/--si option to "lfs df" also.

Document the 'f' (flash) and 'N' (noprecreate) flags for "lfs df".

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I06b8df4ae2940107720e57013bf187b3473ebbe5
Reviewed-on: https://review.whamcloud.com/41271
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14279 test: fix block soft testing failure 94/41094/4
Wang Shilong [Mon, 28 Dec 2020 02:33:24 +0000 (10:33 +0800)]
LU-14279 test: fix block soft testing failure

The soft least qunit was introduced to avoid a performance
drop when users have reached the soft limit but the timer has
not yet expired: it tries to acquire more space (not more than
the least qunit) to get reasonable performance.

Test cases need to be aware of this, which means a slave might
exceed the quota limit a bit (but by no more than the least
qunit, e.g. 4M).

Test-Parameters: trivial testlist=sanity-quota env=ONLY="3a 3b 3c"
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ia221d97d158a8da4dc1fe1611aebac2f5086440e
Reviewed-on: https://review.whamcloud.com/41094
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12477 lustre: check return status of register_shrinker() 83/40883/4
Mr NeilBrown [Mon, 7 Dec 2020 02:07:31 +0000 (13:07 +1100)]
LU-12477 lustre: check return status of register_shrinker()

register_shrinker() can fail with -ENOMEM.  We should check for that
and abort the relevant initialization functions when it happens.

For ldlm_pools, ldlm_pools_fini() can be called when ldlm_pools_init()
fails, or even in the case where it hasn't been called.  So add a static
flag to ensure ldlm_pools_fini() does not undo things that haven't been
done.

For lu_global_init() we need to add proper cleanup if anything fails.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie66326486c7738547d4211095bb1d37dc75e0b6a
Reviewed-on: https://review.whamcloud.com/40883
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12477 libcfs: Further reduce complexity for shrinkers. 31/40831/6
Mr NeilBrown [Thu, 12 Nov 2020 05:08:26 +0000 (16:08 +1100)]
LU-12477 libcfs: Further reduce complexity for shrinkers.

Commit c4c17fa4a3f5 ("LU-12477 libcfs: Remove obsolete config checks")
reduced the complexity of shrinkers by removing support for older
kernels, but could have gone a lot further.  This patch adds
further simplification.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ibcc84f61e542b503f795b16a7144e430f8b73582
Reviewed-on: https://review.whamcloud.com/40831
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13903 build: Move GLIBC/openssl checks to where needed. 53/39653/6
Mr NeilBrown [Wed, 5 Aug 2020 02:06:52 +0000 (12:06 +1000)]
LU-13903 build: Move GLIBC/openssl checks to where needed.

Two config checks on glibc support:
LC_GLIBC_SUPPORT_FHANDLES
LC_GLIBC_SUPPORT_COPY_FILE_RANGE
and two on openssl support:
LC_OPENSSL_SSK
LC_OPENSSL_GETSEPOL

are currently only run when modules are being built.
The FHANDLES test is needed when building tests.
The COPY_FILE_RANGE test is needed when building
utils, as are the OPENSSL checks.

So move the calls to these tests to a more appropriate place, so that
  ./configure --disable-modules --disable-server
can run correctly.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id7801112cd53601b3d560119784cbd062bf9610e
Reviewed-on: https://review.whamcloud.com/39653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 years agoLU-13779 lnet: Correct asymmetric route detection 49/39349/7
Chris Horn [Fri, 10 Jul 2020 17:33:50 +0000 (12:33 -0500)]
LU-13779 lnet: Correct asymmetric route detection

Failure to lookup the remote net for LNET_NIDNET(src_nid) indicates an
asymmetric route, but we do not drop the message in this case. Another
problem with this code is that there is no guarantee that we'll have a
route->lr_lnet that matches the net of ni->ni_nid.

We can move the asymmetric route detection to after we have looked up
the lpni of from_nid. Then, we can look at just the routes associated
with the gateway that owns the lpni. If one of those routes has
lr_net == LNET_NIDNET(src_nid), then the route is symmetrical.
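
The symmetry check described above can be sketched in Python. This is a simplification under stated assumptions: NIDs are treated as 'address@net' strings, each route is a dictionary with an lr_net key, and both helper names are hypothetical rather than the LNet functions.

```python
def lnet_nidnet(nid: str) -> str:
    """Return the network part of a NID such as '10.0.0.1@tcp1'."""
    return nid.split("@", 1)[1]

def route_is_symmetric(gateway_routes, src_nid):
    """True if any route through the gateway leads to the source's net."""
    src_net = lnet_nidnet(src_nid)
    return any(route["lr_net"] == src_net for route in gateway_routes)
```

Only the routes of the gateway that owns the sending peer NI are examined, so the result does not depend on unrelated routes configured elsewhere.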

Fixes: 4932febc12 ("LU-11894 lnet: check for asymmetrical route messages")
Test-Parameters: trivial
HPE-bug-id: LUS-9087
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I8044d3f53e6f000c1e4d7c4e34b3b21afe0f9711
Reviewed-on: https://review.whamcloud.com/39349
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12678 socklnd: change various ints to bool. 02/39302/3
Mr NeilBrown [Mon, 6 Jul 2020 12:34:41 +0000 (08:34 -0400)]
LU-12678 socklnd: change various ints to bool.

Each of these int variables, and one int function, are
really truth values, so change to bool.

Also convert some spaces to tabs etc.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia62a86e549c90a287a20a3b2ef7533c1b700d17e
Reviewed-on: https://review.whamcloud.com/39302
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13783 libcfs: support removal of kernel_setsockopt() 59/39259/7
Mr NeilBrown [Fri, 3 Jul 2020 03:51:50 +0000 (13:51 +1000)]
LU-13783 libcfs: support removal of kernel_setsockopt()

Linux 5.8 removes kernel_setsockopt() and kernel_getsockopt(), and
provides some helper functions for some accesses that are
not trivial.

This patch adds those helpers to libcfs when they are not available,
and changes (nearly) all calls to kernel_[gs]etsockopt() to
either use direct access or a helper call.

->keepalive() is not available before v4.11-rc1~94^2~43^2~14
and there is no helper function, so for SO_KEEPALIVE we
need to have #ifdef code in the C file.

TCP_BACKOFF* settings are not converted as they are not available in
any upstream kernel, so no conversion is possible.

Also include some minor style fixes and change lnet_sock_setbuf() and
lnet_sock_getbuf() to be 'void' functions.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I539cf8d20555ddb3565fa75130fdd3acf709c545
Reviewed-on: https://review.whamcloud.com/39259
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13636 obdclass: drop nlink if directory is removed 44/38844/14
Alex Zhuravlev [Fri, 5 Jun 2020 12:15:22 +0000 (15:15 +0300)]
LU-13636 obdclass: drop nlink if directory is removed

To make e2fsck happy.  Otherwise, all the features using
local directories (quota, nodemap, nid tables) can leave
orphaned objects as nlink doesn't drop to 0.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9e20a304d66c61f312168715e888757bc06b6ed0
Reviewed-on: https://review.whamcloud.com/38844
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12514 llite: move client mounting from obdclass to llite 93/37693/17
Mr NeilBrown [Mon, 28 Dec 2020 20:56:12 +0000 (15:56 -0500)]
LU-12514 llite: move client mounting from obdclass to llite

Mounting a lustre client is currently handled
in obdclass, using services from llite.
This requires obdclass to load the llite module
and set up inter-module linkage.

The purpose of this was for common code to support both
client and server mounts.  This isn't really a good idea
and needs to go. For lustre servers we already use a
separate filesystem type.

So move the mounting code from obdclass/obd_mount to llite/super25
and remove the inter-module linkages.
Add some EXPORT_SYMBOL() so that llite can access some helpers
that remain in obdclass.

Linux-commit: a989830c88149511ee840356d9c1b34304bac576

Change-Id: Ia33bd55a042f90b178156c745a8072b516f00568
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/37693
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11085 mdt: revise recording of hsm progress updates. 25/39725/11
Mr NeilBrown [Mon, 24 Aug 2020 22:28:18 +0000 (08:28 +1000)]
LU-11085 mdt: revise recording of hsm progress updates.

When copy tool is migrating a file for HSM purposes it can report
progress as individual intervals, and the total covered by all the
intervals can be requested.

This patch makes various changes to the code for recording the
intervals.

- switch to the Linux interval-tree implementation rather than the
  lustre one.
- detect overlapping intervals as well as duplicates.  Any overlapping
  or adjacent intervals are removed and the space which they covered
  is added to the new interval.
- keep track of the total of all current intervals, so that it can be
  returned on request without needing to examine the interval tree.
- use a spinlock rather than a mutex to protect against parallel
  updates, as all operations are non-blocking.
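
The merging and running-total bookkeeping listed above can be sketched in Python. This is a simplified illustration: it uses a sorted list where the patch uses the Linux interval tree, it omits locking, and the class name and half-open [start, end) convention are assumptions.

```python
class ProgressIntervals:
    """Track disjoint half-open [start, end) intervals and their total."""

    def __init__(self):
        self.intervals = []   # sorted list of (start, end)
        self.total = 0        # sum of lengths, maintained incrementally

    def add(self, start, end):
        kept = []
        for s, e in self.intervals:
            if e < start or s > end:
                kept.append((s, e))      # disjoint and not adjacent
            else:
                # Overlapping or adjacent: absorb into the new interval.
                self.total -= e - s
                start, end = min(start, s), max(end, e)
        kept.append((start, end))
        kept.sort()
        self.intervals = kept
        self.total += end - start
```

Because the total is adjusted on every insertion, reporting it does not require walking the stored intervals, and re-sending an interval that is already covered leaves the total unchanged.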

Also add a test to llapi_hsm_test.c to send overlapping intervals and
check the result is as expected.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id300b64949a3416ee3282c5e4ce82122c9e4e2f0
Reviewed-on: https://review.whamcloud.com/39725
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 lod: remove unnecessary variables 66/41766/3
Mr NeilBrown [Thu, 17 Dec 2020 23:18:48 +0000 (10:18 +1100)]
LU-6142 lod: remove unnecessary variables

Both lprocfs_register() and dt_tunables_init() treat a NULL pointer as
being equivalent to an empty array.

So discard the empty arrays and use the NULL pointer.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I620fbf1c771ee80d2ddff0f38a87c2f08bae0e4d
Reviewed-on: https://review.whamcloud.com/41766
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 llite: move acl code into separate file. 29/40829/6
Mr NeilBrown [Mon, 2 Nov 2020 00:44:02 +0000 (11:44 +1100)]
LU-6142 llite: move acl code into separate file.

acl code is subject to conditional compilation (only if fs acls are
enabled), so move it to a separate file.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I472fa80193ee47abbab857bcc6dd021ed42ae9a5
Reviewed-on: https://review.whamcloud.com/40829
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13974 tests: update log corruption 43/40743/4
Alexander Boyko [Tue, 24 Nov 2020 09:05:36 +0000 (04:05 -0500)]
LU-13974 tests: update log corruption

The test case reproduces a missing object for a sub-transaction during
a set xattr operation.
The first setattr got -2 and the second had already started, but had
not yet called llog_add. In this case the llog osp object is stale after
top_trans_start, so the declaration phase cannot refresh llogs. At
llog_osd_write_rec the osp object changes from the stale state to
valid (dt_attr_get), but the llog handle and llog header are invalid.
A new record would be added to the update log with the wrong index.
In that case processing of the update log fails with:

fs1-MDT0001-osp-MDT0003: [0x2:0x400024d0:0x2] Invalid record: index
112926 but expected 112925
lod_sub_recovery_thread()) fs1-MDT0001-osp-MDT0003 get update log
failed: rc = -34
Recovery aborted, and clients are evicted.

HPE-bug-id: LUS-9030
Test-Parameters: testlist=sanity  envdefinitions=ONLY="427"
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Change-Id: I6a47fed1bc01f4be62216d1d0787adc413df0cf5
Reviewed-on: https://review.whamcloud.com/40743
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13883 lnet: Lookup lpni after discovery 47/39747/3
Chris Horn [Thu, 6 Aug 2020 21:24:57 +0000 (16:24 -0500)]
LU-13883 lnet: Lookup lpni after discovery

The lpni for a nid can change as part of the discovery process (see
lnet_peer_add_nid()). As such, callers of lnet_discover_peer_locked()
need to lookup the lpni again after discovery completes to make sure
they get the correct peer.
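
The need for a second lookup can be illustrated with a Python sketch. The dictionary-based peer table and both function names are hypothetical; they only model the fact that discovery may replace the record stored for a NID.

```python
peer_table = {}  # hypothetical: NID -> peer NI record

def discover(nid):
    """Discovery may replace the record stored for a NID."""
    peer_table[nid] = {"nid": nid, "discovered": True}

def lookup_after_discovery(nid):
    lpni = peer_table[nid]     # reference taken before discovery
    discover(nid)
    lpni = peer_table[nid]     # look up again: the old reference may be stale
    return lpni
```

A caller that kept using the reference taken before discover() would be operating on a record that is no longer in the table.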

An exception is lnet_check_routers() which doesn't do anything with
the peer or peer NI after the call to lnet_discover_peer_locked().
If the router list is changed then lnet_check_routers() will already
repeat discovery.

HPE-bug-id: LUS-9167
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I8bdfcb957e87f65ce65bfad81858a4ce3362298e
Reviewed-on: https://review.whamcloud.com/39747
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13894 lnet: Transfer disc src NID when merging peers 07/39607/7
Chris Horn [Thu, 6 Aug 2020 21:39:27 +0000 (16:39 -0500)]
LU-13894 lnet: Transfer disc src NID when merging peers

If we're merging two peers in lnet_peer_data_present() then we need
to transfer the src NID stored in the peer whose ping buffer we are
processing to the peer that actually owns the NIDs in the ping
buffer. Otherwise it is possible that the subsequent push to the peer
that is being discovered will go out over an interface that the peer
does not know about and it will be dropped.

Test-Parameters: trivial
HPE-bug-id: LUS-9193
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I050c7c1c2c0eddb8d5ff12f40342a8a02efacb9c
Reviewed-on: https://review.whamcloud.com/39607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13895 lnet: Prevent discovery on deleted peer 05/39605/7
Chris Horn [Thu, 6 Aug 2020 21:21:29 +0000 (16:21 -0500)]
LU-13895 lnet: Prevent discovery on deleted peer

We needn't perform any discovery activities on a peer that has had
lnet_peer_del() called on it.

Test-Parameters: trivial
HPE-bug-id: LUS-9192
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I5c89dc89038d2c8bf4d2a29029af7720963b81a2
Reviewed-on: https://review.whamcloud.com/39605
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13895 lnet: Prevent discovery on peer marked deletion 04/39604/6
Chris Horn [Fri, 7 Aug 2020 16:02:10 +0000 (11:02 -0500)]
LU-13895 lnet: Prevent discovery on peer marked deletion

If a peer has been marked for deletion then we needn't perform any
other discovery operation on it. Integrate this peer state into the
top level of the discovery state machine so that it is checked before
any other state.

HPE-bug-id: LUS-9192
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ie9de5b0d38d720f4f49d7e4a0673a6b52f9d3d80
Reviewed-on: https://review.whamcloud.com/39604
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13708 lnet: lnet_notify sets route aliveness incorrectly 60/39160/7
Chris Horn [Tue, 23 Jun 2020 18:02:51 +0000 (13:02 -0500)]
LU-13708 lnet: lnet_notify sets route aliveness incorrectly

lnet_notify() modifies route aliveness in two ways:
1. By setting lp_alive field of the lnet_peer struct.
2. By setting lr_alive field of the lnet_route struct (via call to
   lnet_set_route_aliveness())

In both cases, the aliveness value assigned is determined by a call
to lnet_is_peer_ni_alive(), but that value only reflects the aliveness
of a particular peer NI. A gateway may have multiple peer NIs, so the
aliveness of a gateway peer (lp_alive) is not necessarily equivalent
to the aliveness of one of its NIs. Furthermore, the lr_alive field
is only used to determine route aliveness for path selection if
discovery is disabled locally or on the gateway (see
lnet_find_route_locked() and lnet_is_route_alive()).

In general, we should not set lp_alive based on an lnet_notify()
call, and we should only set lr_alive if discovery is disabled. For
lr_alive specifically, we should only set it for those routes that
have the peer NI as a next-hop.

An exception to the above exists when the reset argument to
lnet_notify() is set. The gnilnd uses this flag in its calls to
lnet_notify() because gnilnd receives out-of-band notifications of
node up and down events. Thus, when gnilnd calls lnet_notify() we
actually know whether the gateway peer is up or down and we can set
lp_alive appropriately.

net lock/EX is held by other callers of lnet_set_route_aliveness, so
we do the same in lnet_notify().

Fixes: e35be987da ("LU-12422 lnet: discovery off route state update")
Fixes: ebc9835a97 ("LU-12941 lnet: Add peer level aliveness information")
Test-Parameters: trivial
HPE-bug-id: LUS-9034
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: I2927e5f5ef849e45c233c92d2a6deca765e496eb
Reviewed-on: https://review.whamcloud.com/39160
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14488 o2ib: Use rdma_connect_locked if it is defined 87/41887/5
Sergey Gorenko [Thu, 4 Mar 2021 12:33:16 +0000 (14:33 +0200)]
LU-14488 o2ib: Use rdma_connect_locked if it is defined

rdma_connect_locked() is added in the upstream kernel 5.10 and
MOFED-5.2-2. After that, it is not allowed to call rdma_connect()
in RDMA CM event handler; rdma_connect_locked() must be used
instead.

This commit adds configure checks to detect whether
rdma_connect_locked() is available and updates the event handler
to call the correct function.

Test-Parameters: trivial
Signed-off-by: Sergey Gorenko <sergeygo@nvidia.com>
Change-Id: I8068d04810bf6f0200292a55f3fdcea8c71d44c1
Reviewed-on: https://review.whamcloud.com/41887
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14405 mdt: read LMV with mdt_stripe_get() 52/41452/3
Lai Siyao [Tue, 9 Feb 2021 14:09:09 +0000 (22:09 +0800)]
LU-14405 mdt: read LMV with mdt_stripe_get()

mdt_path_current() reads LMV into mdt_thread_info.mti_xattr_buf,
whose size is static, and returns -ERANGE if LMV contains too many
stripes. Instead, it should call mdt_stripe_get(), which allocates
memory for LMV dynamically.

Test-Parameters: mdtcount=8
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1ed78f7a7f951fa5984e604a8773143a70b419e7
Reviewed-on: https://review.whamcloud.com/41452
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14058 tests: handle more MDTs in sanity.sh 85/41485/9
Andreas Dilger [Thu, 11 Feb 2021 21:22:58 +0000 (14:22 -0700)]
LU-14058 tests: handle more MDTs in sanity.sh

Fix up sanity.sh test_160 to handle configurations with more MDTs.
The "fnv_1a_64" hash is _relatively_ uniform and harder to break
under normal (ab)use, but it doesn't leave entries totally balanced.
Even "all_chars" hash has a repeat MDT every handful of entries.
Since we need perfect balance across MDTs, use "lfs mkdir -i".

Fix a bug in test_160g that wasn't setting changelog_max_idle_indexes
properly for test systems with more than 4 MDTs.

Test-Parameters: trivial testlist=sanity env=ONLY=160,230 mdtcount=8
Fixes: 489afbe69d5b ("LU-13321 tests: force even DNE file distribution")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I08bf2274a00fe1c6e52ec1a55f50dc8662d354a9
Reviewed-on: https://review.whamcloud.com/41485
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14204 tests: make sure we have a single import 58/41758/2
Sebastien Buisson [Wed, 9 Dec 2020 17:53:12 +0000 (18:53 +0100)]
LU-14204 tests: make sure we have a single import

In sanity, retrieve the exact name of the import being used on the
client, in order to properly get information such as lock_count or
lru_size.

Change-Id: I065b7da7990c7171d5baa24f3400c5f8ffc12fc9
Test-Parameters: trivial
Test-Parameters: env=SHARED_KEY=true testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/41758
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14468 utils: improve 'lfs rmfid' error messages 27/41727/2
John L. Hammond [Tue, 23 Feb 2021 15:40:08 +0000 (09:40 -0600)]
LU-14468 utils: improve 'lfs rmfid' error messages

In lfs_rmfid_and_show_errors(), convert the error messages printed by
'lfs rmfid' from the format
  rmfid([0x20001a9f5:0x159:0x0]): rc = -39
to
  lfs rmfid: cannot remove [0x20001a9f5:0x155:0x0]: Directory not empty
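
The new message format can be sketched in user space as follows. This is
only an illustration, not the code in lfs_rmfid_and_show_errors(): the
helper name format_rmfid_error() is hypothetical, and it assumes rc is a
negative errno value that strerror() can translate.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build the human-readable message for a FID that
 * could not be removed. 'rc' is a negative errno value such as -ENOTEMPTY.
 * Returns the value of snprintf(), i.e. the length the message needs.
 */
static int format_rmfid_error(char *buf, size_t len, const char *fid, int rc)
{
	assert(buf != NULL && fid != NULL && rc < 0);

	/* strerror() expects a positive errno, so negate rc */
	return snprintf(buf, len, "lfs rmfid: cannot remove %s: %s",
			fid, strerror(-rc));
}
```

Passing the FID string and -ENOTEMPTY yields a message ending in the
system's text for that errno instead of a bare "rc = -39".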

Simplify the logic and swap rc and rc2 to follow conventions.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Iccd9e1054ed8842fc4f65dd601077cfdeaa1320c
Reviewed-on: https://review.whamcloud.com/41727
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14385 utils: add range check to strtoul() in lfs.c 26/41726/4
Jian Yu [Tue, 23 Feb 2021 17:26:37 +0000 (09:26 -0800)]
LU-14385 utils: add range check to strtoul() in lfs.c

Most of the strtoul() calls in lfs.c did not check
the range of the return value.
This patch adds the missing range checks.
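
A range-checked strtoul() call might look like the sketch below. The
helper parse_ulong_max() and its error-code choices are assumptions made
for illustration; they are not taken from lfs.c.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse an unsigned decimal value and reject
 * anything larger than 'max'.
 * Returns 0 on success, -EINVAL for malformed input, -ERANGE if the
 * value overflows unsigned long or exceeds 'max'.
 */
static int parse_ulong_max(const char *str, unsigned long max,
			   unsigned long *out)
{
	char *end;
	unsigned long val;

	assert(out != NULL);

	/* strtoul() would silently negate a value with a leading '-' */
	if (str == NULL || *str == '\0' || *str == '-')
		return -EINVAL;

	errno = 0;
	val = strtoul(str, &end, 10);
	/* no digits consumed, or trailing garbage after the number */
	if (end == str || *end != '\0')
		return -EINVAL;
	/* overflow is reported as ULONG_MAX with errno set to ERANGE */
	if (errno == ERANGE || val > max)
		return -ERANGE;

	*out = val;
	return 0;
}
```

A caller would pass the largest value the option accepts as 'max' and
report an error whenever the return value is non-zero.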

Change-Id: If1eb64750507b5fa4e22abe710e475e2f0032b4d
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41726
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14462 gss: fix support for namespace in lgss_keyring 16/41716/2
Sebastien Buisson [Mon, 22 Feb 2021 15:24:11 +0000 (00:24 +0900)]
LU-14462 gss: fix support for namespace in lgss_keyring

Fix the way lgss_keyring handles different mount namespaces,
so that we do not try to bind to a namespace that does not exist.

Fixes: 94c44c62de ("LU-7845 gss: support namespace in lgss_keyring")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia5a5213399decc683d5e9401b6594e7fe579123f
Reviewed-on: https://review.whamcloud.com/41716
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13857 obdclass: Add white space to output valid YAML. 09/41709/8
Lei Feng [Mon, 22 Feb 2021 02:06:03 +0000 (10:06 +0800)]
LU-13857 obdclass: Add white space to output valid YAML.

YAML needs a white space after the colon(:) between a pair of key and
value. In this case, if the integer is large enough, it will leave no
white space. So insert the white space forcefully.
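
The effect can be reproduced with a small user-space sketch. It assumes
the value was printed with a field width directly after the colon; the
format strings and function names here are hypothetical, not the ones
used in obdclass.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Width-only padding: a value that fills the field touches the colon */
static int fmt_padded(char *buf, size_t len, const char *key,
		      unsigned long long val)
{
	assert(buf != NULL && key != NULL);
	return snprintf(buf, len, "%s:%8llu", key, val);
}

/* Explicit space after the colon, with the field narrowed by one */
static int fmt_yaml(char *buf, size_t len, const char *key,
		    unsigned long long val)
{
	assert(buf != NULL && key != NULL);
	return snprintf(buf, len, "%s: %7llu", key, val);
}

/* YAML needs at least one space between "key:" and the value */
static int yaml_sep_ok(const char *line)
{
	const char *colon = strchr(line, ':');

	return colon != NULL && colon[1] == ' ';
}
```

For small values both formatters produce the same aligned output; only
the second stays valid YAML once the value is eight or more digits wide.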

Change-Id: I366b5399cc293a66a70ea6084c6a5fa30a58813b
Signed-off-by: Lei Feng <flei@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41709
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-9859 tracefile: convert list_for_each_entry_safe() to while(!list_empty()) 70/41670/4
NeilBrown [Mon, 15 Feb 2021 14:57:07 +0000 (09:57 -0500)]
LU-9859 tracefile: convert list_for_each_entry_safe() to while(!list_empty())

These loops are removing all elements from a list.
So using while(!list_empty()) makes the intent clearer.
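
A user-space sketch of the drain pattern is shown below. The list
helpers are minimal stand-ins for the kernel's <linux/list.h>, and
struct page_entry and drain_pages() are hypothetical names, not the
tracefile code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-ins for the kernel's doubly linked list helpers */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h->prev = h;
}

static int list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
}

/* Hypothetical payload standing in for a trace page */
struct page_entry {
	struct list_head linkage;
	int id;
};

/* Drain the list: the loop condition itself says "until nothing is left" */
static int drain_pages(struct list_head *head)
{
	int freed = 0;

	assert(head != NULL);
	while (!list_empty(head)) {
		/* open-coded list_first_entry() */
		struct page_entry *pe = (struct page_entry *)
			((char *)head->next -
			 offsetof(struct page_entry, linkage));

		list_del(&pe->linkage);
		free(pe);
		freed++;
	}
	return freed;
}
```

Unlike the _safe iterator, this form needs no second cursor variable,
because each pass always takes whatever entry is currently first.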

Linux-commit: fdafb01e2c70e6b5321d158a2ff1f20a13d9b365

Change-Id: Idda25888e424a1deaa4d7c6fad427d494b1f56e5
Signed-off-by: NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/41670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14385 tests: verify lfs setstripe comp-flags and flags options 23/41423/5
Jian Yu [Fri, 19 Feb 2021 23:33:34 +0000 (15:33 -0800)]
LU-14385 tests: verify lfs setstripe comp-flags and flags options

This patch adds more test cases to verify lfs setstripe
--comp-flags|--component-flags and --flags options.

Change-Id: Ie09089ceb65372fdf4e3b50df3771c9a355210cc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/41423
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14398 lfs: use llapi_fid2path_at() in lfs_fid2path() 07/41407/4
John L. Hammond [Wed, 3 Feb 2021 20:17:11 +0000 (14:17 -0600)]
LU-14398 lfs: use llapi_fid2path_at() in lfs_fid2path()

Use llapi_fid2path_at() in lfs_fid2path(). This avoids resolving and
opening the mount point for each FID argument passed. Make the -c,
--cur, --current option actually print the link. Add a more
descriptive long option name for this (--print-link). Update the
lfs-fid2path manpage accordingly.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: If851e4ce95f87d3188b644eb4a345ba3cfca530d
Reviewed-on: https://review.whamcloud.com/41407
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-14314 tests: skip test_16e from sanityn 53/41353/4
Vikentsi Lapa [Thu, 28 Jan 2021 16:08:38 +0000 (16:08 +0000)]
LU-14314 tests: skip test_16e from sanityn

To avoid a test failure, skip sanityn test_16e
when the Lustre version on the MDS is older than 2.13.53.

Fixes: 92d799217aea ("LU-13227 sanityn 16a FAIL: fsx with O_DIRECT failed.")
Signed-off-by: Vikentsi Lapa <vlapa@whamcloud.com>
Change-Id: I562df7e02a9484fbc037597f012943cefa480fda
Test-Parameters: trivial testlist=sanityn
Reviewed-on: https://review.whamcloud.com/41353
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 years agoLU-14370 quota: use dt_sync() to flush pending writes 22/41322/3
Alex Zhuravlev [Tue, 26 Jan 2021 14:18:59 +0000 (17:18 +0300)]
LU-14370 quota: use dt_sync() to flush pending writes

and find remaining quota.

Fixes: 18cd3e1e28 ("LU-12702 quota: wait pending write before acquiring remotely")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8f1ec334a1d1eefc385d8c6ef451de8a3f12365f
Reviewed-on: https://review.whamcloud.com/41322
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12125 mds: allow parallel regular file rename 86/41186/5
Andreas Dilger [Sat, 9 Jan 2021 09:08:06 +0000 (02:08 -0700)]
LU-12125 mds: allow parallel regular file rename

Allow rename of non-directory files in the same directory to be done
in parallel, by only taking the DLM lock on the parent FID, without
also locking the global LUSTRE_BFL_FID (Big Filesystem Lock).

Older clients may not send the renamed file mode in mds_rec_rename.
In this case, the LUSTRE_BFL_FID lock will still be taken, and is not
worse than before parallel rename was allowed.

Similarly, if (for whatever reason) there is a mix of MDS versions
running in the same filesystem, at worst older MDSes will continue to
unnecessarily lock LUSTRE_BFL_FID before doing the file rename.

If MDT0000 is on an older MDS, but newer MDSes are doing renames of
non-directories, the newer MDSes will *not* lock LUSTRE_BFL_FID first,
but there will still be proper serialization from the parent directory
FID lock for other renames affecting the parent and the source/target
entries.  That MDT0000 is unaware of the rename is the whole point.

In case of a race, where the file mode sent by the client is stale,
this is also not a concern, because the file mode is rechecked later
under lock and the rename fails if the source and target mode differ.

Test-Parameters: testlist=racer env=DURATION=3600
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If330b53eb6db46e40f50fd7834a83e80db3ebbe5
Reviewed-on: https://review.whamcloud.com/41186
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13991 ldlm: speedup flock reprocess 48/40048/2
Andriy Skulysh [Wed, 19 Feb 2020 20:06:33 +0000 (22:06 +0200)]
LU-13991 ldlm: speedup flock reprocess

We only need to check for deadlock against the
first conflicting lock; the remaining deadlock
checks will be performed after the first
conflicting lock is cancelled.

Change-Id: I18359db405ab021a4f32ac833de203254097142d
HPE-bug-id: LUS-8509
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/40048
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12885 mdd: clearly name variables for MAY_ flags 20/36520/8
Andreas Dilger [Fri, 18 Oct 2019 11:12:24 +0000 (20:12 +0900)]
LU-12885 mdd: clearly name variables for MAY_ flags

Clearly name variables for MAY_READ, MAY_WRITE, MAY_EXEC to
distinguish them from other "mask" variables.

The kernel VFS silently converts the MAY_READ, MAY_WRITE, and
MAY_EXEC flags to ACL_READ, ACL_WRITE, and ACL_EXECUTE modes for
the on-disk ACLs.  It later also converts from the ACL_* flags
to POSIX rwx bits. Verify that these values are the same.
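
The kind of check described can be sketched with compile-time
assertions. The MAY_* and ACL_* values below are reproduced by hand from
the Linux kernel headers so that the example builds in user space; the
helper may_mask_to_other_bits() is hypothetical and is not the check
added to mdd.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Values copied from the kernel's <linux/fs.h> for this sketch */
#define MAY_EXEC	0x1
#define MAY_WRITE	0x2
#define MAY_READ	0x4

/* Values copied from the kernel's <linux/posix_acl.h> for this sketch */
#define ACL_READ	0x04
#define ACL_WRITE	0x02
#define ACL_EXECUTE	0x01

/* The silent conversions only work while all three encodings agree */
_Static_assert(MAY_READ == ACL_READ && ACL_READ == S_IROTH,
	       "read bits differ");
_Static_assert(MAY_WRITE == ACL_WRITE && ACL_WRITE == S_IWOTH,
	       "write bits differ");
_Static_assert(MAY_EXEC == ACL_EXECUTE && ACL_EXECUTE == S_IXOTH,
	       "execute bits differ");

/* Map a MAY_* mask to "other" rwx bits; no shifting is needed */
static mode_t may_mask_to_other_bits(int may_mask)
{
	assert((may_mask & ~(MAY_READ | MAY_WRITE | MAY_EXEC)) == 0);
	return (mode_t)may_mask;
}
```

If any of the three encodings ever changed, the build would fail rather
than the permission check silently misbehaving.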

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Idcd91d4c467c4415f1c67a5081721393cd3ebbe5
Reviewed-on: https://review.whamcloud.com/36520
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12885 llite: mark extended attr and inode flags 19/36519/6
Andreas Dilger [Fri, 18 Oct 2019 10:43:11 +0000 (19:43 +0900)]
LU-12885 llite: mark extended attr and inode flags

Clearly name the extended attribute and inode flags so that it is
possible to distinguish them more easily.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id1ad2ba6411f69efb2de38e1019940f5fb3ebbe5
Reviewed-on: https://review.whamcloud.com/36519
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 lustre: iput() can safely be passed NULL. 91/40291/3
Mr NeilBrown [Thu, 15 Oct 2020 22:43:09 +0000 (09:43 +1100)]
LU-6142 lustre: iput() can safely be passed NULL.

iput() is a no-op when passed a NULL pointer, so there is no
need to test for NULL before calling it - doing so clutters
the code.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idcd1a6746ecc67dcfcb0713d2762ca0bdb29de19
Reviewed-on: https://review.whamcloud.com/40291
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-11085 nodemap: switch interval tree to in-kernel impl. 24/39724/9
Mr NeilBrown [Tue, 25 Aug 2020 00:03:17 +0000 (10:03 +1000)]
LU-11085 nodemap: switch interval tree to in-kernel impl.

Switch nodemap_range to use the in-kernel interval tree.
This has the same functionality, though often in a different form.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7bf119bf8cd8f14dc66deb2736c2c97562bb0743
Reviewed-on: https://review.whamcloud.com/39724
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13485 libcfs: FIELD_SIZEOF macro removed 10/39710/6
Shaun Tancheff [Sun, 30 Aug 2020 18:53:09 +0000 (13:53 -0500)]
LU-13485 libcfs: FIELD_SIZEOF macro removed

Linux v4.15-rc2-5-g4229a470175b introduced sizeof_field() macro
Linux v5.5-rc4-1-g1f07dcc459d5 removed FIELD_SIZEOF() macro

Provide a sizeof_field() macro in terms of FIELD_SIZEOF()
when sizeof_field() is not provided.
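
A compatibility definition along these lines might look as follows. The
#ifndef guards are an assumption made so that the sketch builds in user
space (the real tree may use a configure check instead), and struct demo
with demo_set_name() is purely hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Older kernels only have FIELD_SIZEOF(); reproduced here for user space */
#ifndef FIELD_SIZEOF
#define FIELD_SIZEOF(t, f) (sizeof(((t *)0)->f))
#endif

/* Fallback: provide sizeof_field() in terms of FIELD_SIZEOF() */
#ifndef sizeof_field
#define sizeof_field(type, member) FIELD_SIZEOF(type, member)
#endif

/* Hypothetical structure used only to exercise the macro */
struct demo {
	char name[16];
	unsigned int flags;
};

/* Bound the copy by the size of the destination member */
static void demo_set_name(struct demo *d, const char *src)
{
	assert(d != NULL && src != NULL);
	snprintf(d->name, sizeof_field(struct demo, name), "%s", src);
}
```

Callers can then use sizeof_field() everywhere, regardless of which of
the two macros the running kernel happens to supply.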

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I48ca9abb931d58919d788199e5089984c9e854dd
Reviewed-on: https://review.whamcloud.com/39710
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-12678 o2iblnd: convert peers hash table to hashtable.h 03/39303/4
Mr NeilBrown [Mon, 6 Jul 2020 12:34:40 +0000 (08:34 -0400)]
LU-12678 o2iblnd: convert peers hash table to hashtable.h

Using a hashtable.h hashtable, rather than bespoke code, has several
advantages:

 - the table is comprised of hlist_head, rather than list_head, so
   it consumes less memory (though we need to make it a little bigger
   as it must be a power-of-2)
 - there are existing macros for easily walking the whole table
 - it uses a "real" hash function rather than "mod a prime number".
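
The difference between the two bucket functions can be sketched in user
space. The multiplier is the 64-bit golden-ratio constant used by the
kernel's hash_64(); the table sizes (101 buckets and 7 bits) and the
function names are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

#define PEER_PRIME	101	/* hypothetical size of the old table */
#define PEER_HASH_BITS	7	/* hypothetical: 128 buckets, a power of 2 */

/* Old style: bucket index is the key modulo a prime */
static unsigned int bucket_mod_prime(uint64_t nid)
{
	return (unsigned int)(nid % PEER_PRIME);
}

/* hashtable.h style: multiplicative hash, keep the top 'bits' bits */
static unsigned int bucket_hash(uint64_t nid, unsigned int bits)
{
	assert(bits > 0 && bits < 64);
	return (unsigned int)((nid * 0x61C8864680B583EBull) >> (64 - bits));
}
```

Keys that differ by a multiple of the prime always share a bucket under
the first function, while the second mixes all key bits into the index.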

In some ways, rhashtable might be even better, but it can change the
ordering of objects in the table at arbitrary moments, and that could
hurt the user-space API.  It also does not support the partitioned
walking that ksocknal_check_peer_timeouts() depends on.

Note that new peers are inserted at the top of a hash chain, rather
than appended at the end.  I don't think that should be a problem.

Also various white-space cleanups etc.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2917024835abdd327c7da11dee3fd369570a9671
Reviewed-on: https://review.whamcloud.com/39303
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12678 lnet: discard LNET_MD_PHYS 01/39301/2
Mr NeilBrown [Mon, 6 Jul 2020 12:34:42 +0000 (08:34 -0400)]
LU-12678 lnet: discard LNET_MD_PHYS

This macro has no value and is never set.
It claims "compatibility with Cray Portals", yet cray-dvs
   git://github.com/glennklockwood/cray-dvs.git
does not use it in any non-trivial way.

Much has changed in lnet and lib-md since 2007 when this
value was added - it seems likely that this really
is dead.

So remove it.  If/when this results in problems, it can
easily be re-added and more details can be provided at
that time.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idef9389f4c0993adbdf088d0ccd9a0dc1449e86e
Reviewed-on: https://review.whamcloud.com/39301
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12678 lnet: use init_wait() rather than init_waitqueue_entry() 95/39295/3
Mr NeilBrown [Mon, 6 Jul 2020 12:34:43 +0000 (08:34 -0400)]
LU-12678 lnet: use init_wait() rather than init_waitqueue_entry()

  init_waitqueue_entry(foo, current)

is equivalent to

  init_wait(foo)

So use the shorter version.

Change-Id: Ic63e99d75986211d9655a89f56721394c7b3abb6
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/39295
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9859 libcfs: use wait_event_timeout() in tracefiled(). 93/39293/4
Mr NeilBrown [Mon, 6 Jul 2020 12:34:45 +0000 (08:34 -0400)]
LU-9859 libcfs: use wait_event_timeout() in tracefiled().

By using wait_event_timeout() we can make it more clear what is being
waited for, and when the loop terminates.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I5d015a3cbdbb342a5117e2c328680b3ec13aeb58
Reviewed-on: https://review.whamcloud.com/39293
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12678 lnet: discard WIRE_ATTR 14/37914/5
James Simmons [Sat, 26 Dec 2020 17:19:19 +0000 (12:19 -0500)]
LU-12678 lnet: discard WIRE_ATTR

This macro adds nothing of value, and makes the code harder to
read for new readers, so it was removed for the Linux client.
We still want to keep track of what data structures are
transmitted over the wire and ensure the protocol does not get
broken. Move the wire protocol structures to their own header
files and add wire checking.

Linux-commit: 3e60455d1953fcee6042c107a8d5657886aa9c58

Test-Parameters: trivial
Change-Id: I4867e80bf8e8f0598d1920865d6f1b9ba920ce5b
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37914
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13239 ldiskfs: pass inode timestamps at initial creation 56/37556/11
Shaun Tancheff [Thu, 10 Dec 2020 16:31:51 +0000 (10:31 -0600)]
LU-13239 ldiskfs: pass inode timestamps at initial creation

A previous patch https://github.com/Cray/lustre/commit/6d4fb6694
"LUS-4880 osd-ldiskfs: pass uid/gid/xtime directly to ldiskfs"
was intended to be ported to upstream lustre but was lost.

The patch https://review.whamcloud.com/34685/
"LU-12151 osd-ldiskfs: pass owner down rather than transfer it"
passed the inode UID and GID down to ldiskfs at inode allocation
time to avoid the overhead of transferring quota from the inode
(initially created as root) over to the actual user of the file.

The two patches differed slightly: LUS-4880 also passed the
a/m/ctimes from osd-ldiskfs to ldiskfs at inode creation time,
which avoids the overhead of setting the timestamps afterward.

Benchmarks using MDTEST:
  mdtest -f 32 -l 32 -n 16384 -i 5 -p 120 -t -u -v -d mdtest

                            master                 patched
   Operation                  Mean    Std Dev         Mean   Std Dev
   ---------                  ----    -------         ----   -------
   Directory creation:   17008.593     72.700    17099.863   155.461
   Directory stat    :  170513.269   1456.002   170105.207  2349.934
   Directory removal :   80796.147   2633.832    84480.222   892.536
   File creation     :   39227.419   7014.539    40429.900  6643.868
   File stat         :  101761.395   2979.802   103818.800  1146.689
   File read         :   86583.370    871.982    85725.254   965.862
   File removal      :   74923.504    761.048    75075.180   723.966
   Tree creation     :     588.570    244.534      608.332   123.939
   Tree removal      :      39.874      1.873       44.357     2.350

This patch also reorganizes the ldiskfs patch series in
order to accommodate struct iattr being added to
ldiskfs_create_inode.
All supported server platforms RHEL 7.5+, SUSE 12+ and
ubuntu 18+ are affected.

HPE-bug-id: LUS-7378, LUS-4880, LUS-8042, LUS-9157, LUS-8772, LUS-8769
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I87e9c792b5240820bfd3a7268e477970ebac8465
Reviewed-on: https://review.whamcloud.com/37556
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12780 ofd: don't use ptlrpc_thread for consistency verification 62/36262/13
Mr NeilBrown [Wed, 23 Oct 2019 00:30:50 +0000 (11:30 +1100)]
LU-12780 ofd: don't use ptlrpc_thread for consistency verification

The ofd module runs a consistency verification thread to verify parent
FID.  Rather than using ptlrpc_thread to manage this, use native
kthreads functionality.

- start-up code is moved out of the thread to before the
  thread is started, which makes error handling clearer.
  As part of this, the lfsck_req_local struct is combined with
  an lu_env and ofd_device pointer into a new oivm_args
  which is passed to the thread as arguments - now it doesn't need
  to allocate anything itself.
- Cleanup remains in the thread, so we add a completion to be
  sure the thread has started before there is any chance of
  kthread_stop() being called.

- kthread_stop() and kthread_should_stop() are used for stopping
  the thread.  wake_up_process() is used to wake it.
  The thread sets TASK_IDLE at the top of the loop, and sets
  TASK_RUNNING if anything is found to do.  At the bottom of
  the loop the 'schedule()' will only block if nothing was found
  to be done.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iec1de307ea48f7d26c60edf5d86eb0b7bf78f49a
Reviewed-on: https://review.whamcloud.com/36262
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11085 ldlm: change lock_matches() to return bool. 54/33854/12
Mr. NeilBrown [Thu, 19 Nov 2020 14:09:19 +0000 (09:09 -0500)]
LU-11085 ldlm: change lock_matches() to return bool.

The name of the function lock_matches() sounds like it
performs a test (it does) and so should return a bool.
Returning a bool gives a slight code simplification (in
search_queue) and more simplification in future patches.

Linux-commit: e16983d96c775eb4527208d3c3d13f57e6d6233c

Change-Id: I1e3a09a0768abd0ab1cfada0fd69216cb9e85df7
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/33854
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-6142 libcfs: discard cfs_strrstr() 61/40861/2
Mr NeilBrown [Tue, 24 Nov 2020 02:45:39 +0000 (13:45 +1100)]
LU-6142 libcfs: discard cfs_strrstr()

cfs_strrstr() is only used in one place, and it can easily be open
coded there without increasing code complexity.  In particular the
fact that the "needle" cannot meaningfully be at the start of the
"haystack", means a simple loop does all we need.
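
The "simple loop" can be sketched in user space as below. The function
name last_occurrence() is hypothetical and this is not the code added to
lwp_setup(); it only shows that excluding a match at offset 0 lets the
loop stop before reaching the start of the string.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the last occurrence of
 * 'needle' in 'haystack', ignoring a match at the very start, or NULL.
 */
static const char *last_occurrence(const char *haystack, const char *needle)
{
	size_t hlen, nlen;
	const char *p;

	assert(haystack != NULL && needle != NULL);
	hlen = strlen(haystack);
	nlen = strlen(needle);
	if (nlen == 0 || nlen > hlen)
		return NULL;

	/* walk backwards from the last position where needle could fit */
	for (p = haystack + hlen - nlen; p > haystack; p--)
		if (strncmp(p, needle, nlen) == 0)
			return p;
	return NULL;
}
```

Searching a name such as "lustre-MDT0000-MDT0001" for "-MDT" returns the
second, rightmost match.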

In fact, there is room to improve the code in lwp_setup()
 - sprintf isn't needed as the result is a constant that can
   be calculated at compile time
 - adding the nul termination is then not needed as the buffer
   being copied to was initialised to zeroes.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I52b4abb36cf809d3bd9eebcc752959b0a81bfc13
Reviewed-on: https://review.whamcloud.com/40861
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 libcfs: discard cfs_firststr 60/40860/2
Mr NeilBrown [Tue, 24 Nov 2020 02:28:01 +0000 (13:28 +1100)]
LU-6142 libcfs: discard cfs_firststr

The effect of cfs_firststr() can easily be achieved with
skip_spaces() and strsep().

So use that instead.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idcf8aa50b6aad052f7ee5341ce6d635495aa4990
Reviewed-on: https://review.whamcloud.com/40860
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 libcfs: discard PO2_ROUNDUP_TYPED, LOWEST_BIT_SET 59/40859/3
Mr NeilBrown [Fri, 20 Nov 2020 02:20:23 +0000 (13:20 +1100)]
LU-6142 libcfs: discard PO2_ROUNDUP_TYPED, LOWEST_BIT_SET

LOWEST_BIT_SET() is never used.

PO2_ROUNDUP_TYPED() has the same function as 'round_up()'.

osd_roundup2blocksz() can be further simplified using
DIV_ROUND_UP_ULL().
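
The arithmetic behind the two kernel macros can be sketched as plain
functions. These are user-space equivalents written for illustration,
with hypothetical names; they are not the kernel definitions nor the new
body of osd_roundup2blocksz().

```c
#include <assert.h>

/* Round x up to a multiple of y, where y must be a power of two */
static unsigned long long round_up_pow2(unsigned long long x,
					unsigned long long y)
{
	assert(y != 0 && (y & (y - 1)) == 0);
	return (x + y - 1) & ~(y - 1);
}

/* Divide n by d, rounding any remainder up to the next whole unit */
static unsigned long long div_round_up(unsigned long long n,
				       unsigned long long d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}
```

For a 4096-byte block size, 4097 bytes round up to 8192 bytes, or
equivalently to 2 blocks.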

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I075328ec935eef49d3aeaf5ea1b79b943aadfa2e
Reviewed-on: https://review.whamcloud.com/40859
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>