Whamcloud - gitweb
fs/lustre-release.git
3 years agoLU-5710 build: fix typo suggesting openssl-devel requirement 64/37564/2
Dominique Martinet [Thu, 13 Feb 2020 20:33:08 +0000 (21:33 +0100)]
LU-5710 build: fix typo suggesting openssl-devel requirement

Building without openssl-devel always prints a message suggesting to
install openssk-devel, which doesn't even exist. Fix typo.

Test-Parameters: trivial
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Change-Id: Id99b9d4dff7ed95aba30e4929a984878a7d13f0a
Reviewed-on: https://review.whamcloud.com/37564
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-9897 utils: have lfs.c use lstddef.h 21/38921/4
James Simmons [Fri, 12 Jun 2020 18:12:38 +0000 (14:12 -0400)]
LU-9897 utils: have lfs.c use lstddef.h

Instead of redefining ARRAY_SIZE in lfs.c we can use the macros
in lstddef.h
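
For readers unfamiliar with the idiom, here is a minimal sketch of an ARRAY_SIZE-style macro. The definition shown is the common form and is an assumption; the one in lstddef.h may differ. The table and helper names are hypothetical.

```c
#include <stddef.h>

/* Element count of a true array (not a pointer). Common idiom; the exact
 * definition in lstddef.h may differ (assumption). */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#endif

/* Hypothetical table, for illustration only. */
static const char *const demo_cmds[] = { "setstripe", "getstripe", "find" };

/* Returns the number of entries in demo_cmds. */
static size_t demo_cmd_count(void)
{
	return ARRAY_SIZE(demo_cmds);
}
```

Sharing one definition avoids each source file carrying its own copy of the macro.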

Test-Parameters: trivial
Change-Id: I33bca9773b609f1996ea66098edb67426273f801
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38921
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
3 years agoLU-9859 libcfs: move cfs_trace_data data to tracefile.c 14/38914/2
Mr NeilBrown [Wed, 10 Jun 2020 21:44:08 +0000 (17:44 -0400)]
LU-9859 libcfs: move cfs_trace_data data to tracefile.c

The macro cfs_tcd_for_each() is only used in tracefile.c so move
it from the header tracefile.h along with related material in
the header file.

Test-Parameters: trivial
Change-Id: I024dc0a4a1f5481cf3468c35e670096f29817c23
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38914
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9859 libcfs: remove cfs_trace_refill_stack() 13/38913/2
Mr NeilBrown [Wed, 10 Jun 2020 21:40:00 +0000 (17:40 -0400)]
LU-9859 libcfs: remove cfs_trace_refill_stack()

The function cfs_trace_refill_stack() is not used anywhere so
remove it.

Test-Parameters: trivial
Change-Id: Iade031c15a9bde091320c2fd2c66c1cd2951f649
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38913
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-9859 libcfs: Fix using smp_processor_id() in preemptible context 10/38810/3
James Simmons [Tue, 2 Jun 2020 16:48:28 +0000 (12:48 -0400)]
LU-9859 libcfs: Fix using smp_processor_id() in preemptible context

This warning shows up with kernels that enable preemption:
BUG: using smp_processor_id() in preemptible [00000000] code: ...

Change it to disable preemption around smp_processor_id().

This change is a part of:
Linux-commit: 67bc8c33ec14f8290c6883a7d6237e213709561a

Change-Id: I41f7a1d3aa22240d3669f94ae92a192d219cca52
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38810
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13501 tests: Add tests for LNet health and resends 33/38633/4
Chris Horn [Sat, 16 May 2020 16:38:06 +0000 (11:38 -0500)]
LU-13501 tests: Add tests for LNet health and resends

Simulate all LNet health error statuses and validate that LNet health
modifies NI health values or attempts resends as appropriate for both
single-rail and multi-rail configurations.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-8826
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ice705c073deefed00b20011dea5de834cf6f0984
Reviewed-on: https://review.whamcloud.com/38633
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13225 utils: fix install path for bash-completion 48/38548/3
Andreas Dilger [Fri, 8 May 2020 23:28:39 +0000 (17:28 -0600)]
LU-13225 utils: fix install path for bash-completion

Fix the default install path for bash-completion if the package is
not installed at build time.  This avoids BASH_COMPLETION_DIR being
badly formatted in the lustre.spec file.

Fixes: dfb4afc24102 ("LU-13225 utils: bash completion for lfs and lctl")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie50071c4ff86f57bc9dd53409ae339da2a3ebbe5
Reviewed-on: https://review.whamcloud.com/38548
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13501 lnet: Skip health and resends for single rail configs 48/38448/7
Chris Horn [Tue, 26 May 2020 16:31:26 +0000 (11:31 -0500)]
LU-13501 lnet: Skip health and resends for single rail configs

If the sender of a message only has a single interface it doesn't
make sense to have LNet track the health of that interface, nor
should it attempt to resend a message when it encounters a local
error. There aren't any alternative interfaces to use for a resend.

Similarly, we needn't track health values of a peer's NIs if the peer
only has a single interface. Nor do we need to attempt to resend
a message to a peer with a single interface. There's an exception for
routers. We rely on NI health to determine route aliveness, so even
if a router only has a single interface we still need to track its
health.

We can use the ln_ping_target to get the count of local NIs, and the
lnet_peer struct already contains a count of the number of peer NIs.
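
The decision rules described above can be sketched as a few predicates. This is only an illustration under stated assumptions: the struct and function names are hypothetical and are not the LNet data structures or the code of the patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for peer state; not struct lnet_peer. */
struct demo_peer {
	int  ni_count;   /* number of peer NIs */
	bool is_router;  /* peer acts as an LNet router */
};

/* Local NI health and resends only help if there is another interface. */
static bool demo_local_health_needed(int local_ni_count)
{
	return local_ni_count > 1;
}

/* Peer NI health is skipped for single-interface peers, except routers,
 * whose NI health is used to determine route aliveness. */
static bool demo_peer_health_needed(const struct demo_peer *peer)
{
	return peer->is_router || peer->ni_count > 1;
}

/* A resend to a peer needs a second peer NI to try. */
static bool demo_peer_resend_possible(const struct demo_peer *peer)
{
	return peer->ni_count > 1;
}
```

A single-interface router therefore still has its health tracked, while a resend to it is not attempted in this sketch.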

HPE-bug-id: LUS-8826
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Id89159a5d07c1668c1cbdfa9050535380f68d1f6
Reviewed-on: https://review.whamcloud.com/38448
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13344 osd-ldiskfs: timespec64 is broken 14/38314/6
Shaun Tancheff [Thu, 21 May 2020 15:30:29 +0000 (10:30 -0500)]
LU-13344 osd-ldiskfs: timespec64 is broken

Linux commit v5.5-rc1-6-gba70609d5ec6 removed timespec64_trunc,
which was being used to determine if inode times were timespec64.
Change this test to work with kernels without timespec64_trunc.

Linux-commit: ba70609d5ec664a8f36ba1c857fcd97a478adf79

Linux commit v5.4-rc3-21-g933f1c1e0b75 renamed h_buffer_credits
to h_total_credits

Add a configure test to detect the rename and a #define to handle this
change of name.
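
A configure-driven compatibility define of this kind might look like the sketch below. The macro DEMO_HAVE_TOTAL_CREDITS, the mock handle structure and the accessor are all hypothetical; the real handle lives in the kernel's jbd2 headers and the patch's own macro names are not shown in this message.

```c
/* Mock of the journal handle for a user-space build (assumption). */
struct demo_handle {
#ifdef DEMO_HAVE_TOTAL_CREDITS   /* hypothetical configure result */
	int h_total_credits;
#else
	int h_buffer_credits;
#endif
};

/* One accessor name, resolved at build time to whichever field exists. */
#ifdef DEMO_HAVE_TOTAL_CREDITS
# define demo_credits(h) ((h)->h_total_credits)
#else
# define demo_credits(h) ((h)->h_buffer_credits)
#endif

/* Callers use the accessor and never name the field directly. */
static int demo_credits_left(const struct demo_handle *h)
{
	return demo_credits(h);
}
```

Keeping the field name behind one macro means only the configure test has to know which kernel is being built against.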

Linux-commit: 933f1c1e0b75bbc29730eef07c9e196c6dfd37e5

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I112da3385e5f33cbee8aadfd3efdbb4b3b823819
Reviewed-on: https://review.whamcloud.com/38314
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13467 llite: truncate deadlock with DoM files 88/38288/3
Andriy Skulysh [Thu, 27 Feb 2020 21:15:41 +0000 (23:15 +0200)]
LU-13467 llite: truncate deadlock with DoM files

All MDT intent RPCs are sent with the inode mutex locked,
while read/write and setattr unlock the inode mutex on entry,
take the LDLM lock, lock the inode mutex again and send the RPC.
So a deadlock can occur, since the LDLM lock is the same in the DoM case.

In fact read/write and setattr take lli_trunc_sem, so the
inode mutex can be omitted in the truncate case.

Replace inode_lock with new lli_setattr_mutex to keep protection
from concurrent setattr time updates.

HPE-bug-id: LUS-8455
Change-Id: Ie294154306cc3b6cff977a2dff485e8d44145ed9
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/38288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 mdt: Fix style issues for mdt_recovery.c 32/38932/2
Arshad Hussain [Tue, 2 Jun 2020 18:53:23 +0000 (00:23 +0530)]
LU-6142 mdt: Fix style issues for mdt_recovery.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_recovery.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ib7a0795cf2d48c078c140aef8501a167fb24d74c
Reviewed-on: https://review.whamcloud.com/38932
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 lov: Fix style issues for lov_merge.c 30/38930/2
Arshad Hussain [Tue, 2 Jun 2020 19:52:12 +0000 (01:22 +0530)]
LU-6142 lov: Fix style issues for lov_merge.c

This patch fixes issues reported by checkpatch
for file lustre/lov/lov_merge.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I6fd60dbc7c48f3dc8fc2c41e924d8d088d6912f2
Reviewed-on: https://review.whamcloud.com/38930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 llite: Fix style issues for vvp_page.c 29/38929/2
Arshad Hussain [Tue, 2 Jun 2020 20:34:36 +0000 (02:04 +0530)]
LU-6142 llite: Fix style issues for vvp_page.c

This patch fixes issues reported by checkpatch
for file lustre/llite/vvp_page.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I14faceb6d2e137cf1ca2eac66864eed87052b1fe
Reviewed-on: https://review.whamcloud.com/38929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13649 mdd: orhpan cleanup fix 66/38866/2
Vitaly Fertman [Mon, 8 Jun 2020 20:24:12 +0000 (23:24 +0300)]
LU-13649 mdd: orhpan cleanup fix

Due to a race with mdd_close(), the objects may have already been
destroyed by close, and the second destroy asserts on lu_object_is_dying().

The problem appeared in LU-12846 which removed the error handling
(ENOENT) returned by dt_delete - the entry was already removed from
the parent.

Fixes: 688d5da6a8 ("LU-12846 mdd: return error while delete failed")
HPE-bug-id: LUS-8864

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I7e2f3fca7b7d4440340fd3daaf8ec528010d9117
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/38866
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12511 lov: use lov_pattern_support() to verify lmm 91/38791/5
James Simmons [Tue, 9 Jun 2020 22:39:06 +0000 (18:39 -0400)]
LU-12511 lov: use lov_pattern_support() to verify lmm

We can use lov_pattern_support(), which is used by the server
and userland code, to ensure lmm is valid instead of open coding.

Change-Id: I44051e6e2dba2f0b7e481572bb58d776724aecd8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38791
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
3 years agoLU-9441 llite: bind kthread thread to accepted node set 30/38730/4
James Simmons [Wed, 27 May 2020 17:27:59 +0000 (13:27 -0400)]
LU-9441 llite: bind kthread thread to accepted node set

Bind both the agl and statahead kernel threads to a node that is
a part of the CPT table that Lustre uses. This limits pollution
of the cache of HPC applications.

Change-Id: I1c29fb5dbbdb6a73dac0dc6c872a797c05eab1ad
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38730
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12785 dom: fix DoM component deletion code 37/38337/5
Mikhail Pershin [Thu, 23 Apr 2020 12:42:00 +0000 (15:42 +0300)]
LU-12785 dom: fix DoM component deletion code

The lod_erase_dom_stripe() function deletes the DoM entry from a
composite layout upon file creation if DoM is disabled on the server.
That code works incorrectly if DoM is not the first component
in the provided layout, e.g. in a mirror.

The patch removes the DoM entry correctly in the generic case, no
matter where it is placed in the layout. Related test 270h is added to
sanity.sh.
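
Position-independent removal can be sketched as an in-place filter over the component array. The types and names here are hypothetical and much simpler than the LOD layout structures; in particular, this sketch does not adjust the extents of the remaining components.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified layout component; not the LOD structures. */
struct demo_comp {
	unsigned int id;
	bool         is_dom;  /* true for a Data-on-MDT component */
};

/* Remove every DoM component in place, whatever its position, keeping
 * the order of the others. Returns the new component count. */
static size_t demo_erase_dom(struct demo_comp *comps, size_t count)
{
	size_t kept = 0;

	for (size_t i = 0; i < count; i++) {
		if (comps[i].is_dom)
			continue;
		comps[kept++] = comps[i];
	}
	return kept;
}
```

Because the loop inspects every entry, a DoM component in the second mirror is handled the same way as one at index 0.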

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia1b3f25db16a7b59b83cd8f58ff44ddf082cab48
Reviewed-on: https://review.whamcloud.com/38337
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
3 years agoLU-13508 mdc: chlg device could be used after free 58/38658/4
Hongchao Zhang [Tue, 19 May 2020 16:21:41 +0000 (00:21 +0800)]
LU-13508 mdc: chlg device could be used after free

There are some issues with the usage of dynamic devices by
the changelog in MDC, which could cause a device to be used
after it is freed.

Change-Id: Iacf6fa7c8b612f1a373091cf88e7082c4860cfe4
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38658
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13628 tests: replace btime with crtime for statx test 80/38880/5
Qian Yingjin [Tue, 9 Jun 2020 15:00:07 +0000 (23:00 +0800)]
LU-13628 tests: replace btime with crtime for statx test

Test sanityn/106a failed due to wrongly using 'btime' to filter
the debugfs output for file creation time, which should be
'crtime'.

This patch also replaces '-c %q' with '-c %p' in sanityn/106c to
get the statx 'stx_attributes_mask': Mask to show what's supported
in 'stx_attributes'.

Test-Parameters: trivial clientdistro=el8
Test-Parameters: trivial clientdistro=ubuntu1804
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia8273e02d4ebe7f1e9e5d6973e691c82e0524fb2
Reviewed-on: https://review.whamcloud.com/38880
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
3 years agoLU-13600 ptlrpc: limit rate of lock replays 20/38920/3
Mikhail Pershin [Fri, 12 Jun 2020 14:14:50 +0000 (17:14 +0300)]
LU-13600 ptlrpc: limit rate of lock replays

Clients send all lock replays at once, which may overwhelm the
server with a huge number of replays in the recovery queue, causing
OOM effects.

The patch adds rate control for lock replays on the client.
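
The message does not say how the rate is controlled. One simple way to bound the load on the server is to cap the number of replays in flight, sketched below; the limiter, its fields and its functions are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical limiter; names are not from the patch. */
struct demo_replay_limiter {
	unsigned int max_in_flight;  /* cap on concurrent lock replays */
	unsigned int in_flight;      /* replays sent but not yet answered */
};

/* Returns true if a replay may be sent now; the caller then sends it. */
static bool demo_replay_try_start(struct demo_replay_limiter *l)
{
	if (l->in_flight >= l->max_in_flight)
		return false;        /* caller waits for a completion */
	l->in_flight++;
	return true;
}

/* Called when the server has answered one replay. */
static void demo_replay_done(struct demo_replay_limiter *l)
{
	if (l->in_flight > 0)
		l->in_flight--;
}
```

With a cap of N, the recovery queue on the server never holds more than N replays from this client at once.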

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie557f8481c5facb690468d7136cf5feebe4e8f11
Reviewed-on: https://review.whamcloud.com/38920
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13580 tests: fix retrieval of SELinux context 48/38648/6
Sebastien Buisson [Mon, 18 May 2020 09:43:22 +0000 (11:43 +0200)]
LU-13580 tests: fix retrieval of SELinux context

Use 'stat' command instead of 'ls -lZ' to retrieve SELinux security
context, to make it more portable.

Test-Parameters: trivial clientselinux testlist=sanity-selinux mdtcount=2 clientcount=2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I61bc0efb1e8ae0427d05827e2933eb0b848fb442
Reviewed-on: https://review.whamcloud.com/38648
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: support truncate for encrypted files 94/37794/15
Sebastien Buisson [Thu, 20 Feb 2020 14:45:07 +0000 (14:45 +0000)]
LU-12275 sec: support truncate for encrypted files

Truncation of encrypted files is not a trivial operation. The page
corresponding to the point where truncation occurs must be read,
decrypted, zeroed after truncation point, re-encrypted and then
written back.
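
The decrypt, zero, re-encrypt sequence can be sketched on a single in-memory page. The XOR "cipher" below is a toy used only to make the example self-contained; it is not fscrypt and is not secure. All names and the page size are assumptions.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096  /* assumed page size for the example */

/* Toy XOR "cipher", for illustration only; NOT fscrypt and NOT secure.
 * Applying it twice with the same key restores the input. */
static void demo_xor(unsigned char *page, unsigned char key)
{
	for (size_t i = 0; i < DEMO_PAGE_SIZE; i++)
		page[i] ^= key;
}

/* Truncate inside an encrypted page: decrypt, zero the tail, re-encrypt.
 * 'offset' is the truncation point within the page. */
static void demo_truncate_page(unsigned char *enc_page, size_t offset,
			       unsigned char key)
{
	if (offset > DEMO_PAGE_SIZE)
		offset = DEMO_PAGE_SIZE;
	demo_xor(enc_page, key);                                /* decrypt */
	memset(enc_page + offset, 0, DEMO_PAGE_SIZE - offset);  /* zero tail */
	demo_xor(enc_page, key);                                /* re-encrypt */
}
```

Zeroing the ciphertext directly would not work, since the zeroed bytes would decrypt to garbage rather than to zeros.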

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I834f9372913d7051b1e0821515d3fea0873ffd78
Reviewed-on: https://review.whamcloud.com/37794
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: deal with encrypted object size 46/36146/28
Sebastien Buisson [Fri, 11 Oct 2019 08:40:37 +0000 (08:40 +0000)]
LU-12275 sec: deal with encrypted object size

The problem with the size of an encrypted file comes from the fact that
an encrypted page always contains PAGE_SIZE bytes of data,
even if the clear text page is only a few bytes. And the server infers
the object size from the content of the encrypted page.

The way to address this is the following. Upon writing, when the
client encrypts the page representing the end of the file, it puts
the size of the clear text version of the file into the o_size field
of the request's body. On the server side, this information is used to
adjust the isize of the object, while still storing the complete pages
on disk.
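
The gap between the two sizes is simple arithmetic, sketched here. The function name and the 4096-byte page size are assumptions made for the example.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL  /* assumed page size for the example */

/* Bytes stored on disk for an encrypted object: whole pages only. */
static uint64_t demo_stored_size(uint64_t clear_size)
{
	return (clear_size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE *
	       DEMO_PAGE_SIZE;
}
```

A 10-byte clear text file thus occupies one full page on disk, which is why the clear text size has to be sent separately for the server to report the right size.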

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia83424123da26920ba0e0dfb354f54b1fa0ccfbb
Reviewed-on: https://review.whamcloud.com/36146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: decryption for read path 45/36145/28
Sebastien Buisson [Thu, 22 Aug 2019 08:48:19 +0000 (08:48 +0000)]
LU-12275 sec: decryption for read path

With the support for encryption, all files need to be opened with
fscrypt_file_open(). fscrypt will retrieve encryption context if
file is encrypted, or immediately return if not.
Decryption itself is carried out in osc_brw_fini_request(), right
after the reply has been received from the server.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8f8f87eb8e07e35e1a4e6cc157ceddfef6934753
Reviewed-on: https://review.whamcloud.com/36145
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: encryption for write path 44/36144/27
Sebastien Buisson [Wed, 17 Jul 2019 14:24:26 +0000 (14:24 +0000)]
LU-12275 sec: encryption for write path

First aspect is to make sure encryption context is properly set on
files/dirs that are created or opened/looked up.
Then encryption itself is carried out in osc_brw_prep_request(), just
before pages are added to the request to be sent. Because pages in
the page cache must hold clear text data, we have to use bounce pages
for encryption. The allocation is handled by fscrypt, and for
deallocation we call fscrypt_pullback_bio_page() and/or
fscrypt_pullback_bio_page().

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ieb1355bd55b6a8740e4b549d60d1f480a5abc53f
Reviewed-on: https://review.whamcloud.com/36144
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13408 target: update in-memory per client data 55/38855/4
Lai Siyao [Sat, 6 Jun 2020 20:00:00 +0000 (04:00 +0800)]
LU-13408 target: update in-memory per client data

Some clients don't support recovery:
1. lightweight clients.
2. local clients on MDS which doesn't support "local_recovery".
3. OFD connect may cause transaction before export has valid
   last_rcvd slot.

Though such clients don't store per client data on disk, they
still need to update in-memory per client data to allow reply
reconstruction and to let saved LDLM locks (both local and remote)
be tracked by transaction number.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id0082358e7720e5ef61f366682ae91282bd66d6d
Reviewed-on: https://review.whamcloud.com/38855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9325 mdt: replace simple_strtol() with kstrtol() 46/38846/3
James Simmons [Fri, 5 Jun 2020 12:55:55 +0000 (08:55 -0400)]
LU-9325 mdt: replace simple_strtol() with kstrtol()

Someday simple_strtol() will go away. The simple_strtol() call in
mdt_init0() is very simple so we can easily replace it with
kstrtol().
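
The practical difference is strictness: kstrtol() returns an error code and rejects trailing garbage, while simple_strtol() silently stops at the first non-digit. The following user-space analog, built on strtol(), illustrates that behaviour. The function name is hypothetical and this is not the kernel implementation.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* User-space analog of strict parsing: returns 0 on success or a
 * negative errno value; *res is only written on success. */
static int demo_strict_strtol(const char *s, int base, long *res)
{
	char *end;
	long val;

	if (isspace((unsigned char)*s))
		return -EINVAL;      /* no leading whitespace */
	errno = 0;
	val = strtol(s, &end, base);
	if (end == s)
		return -EINVAL;      /* no digits at all */
	if (errno == ERANGE)
		return -ERANGE;      /* value does not fit in a long */
	if (*end == '\n')
		end++;               /* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;      /* trailing garbage */
	*res = val;
	return 0;
}
```

A caller checks the return value before using the result, so a malformed string such as "42abc" is reported instead of being parsed as 42.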

Change-Id: I37485735f0f42aa5c2c5b9fd361e4fdfa54dc8e5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38846
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
3 years agoLU-13635 lfs: add -D option back to lfs_migrate 40/38840/3
Emoly Liu [Fri, 5 Jun 2020 03:39:37 +0000 (11:39 +0800)]
LU-13635 lfs: add -D option back to lfs_migrate

Enable "-D" option with its long option "--non-direct" correctly
in lfs_migrate.
sanity.sh test_56we is added to verify this patch.

Test-Parameters: trivial
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I6ab051c0f2e0cde9de6a5b8ace8962cc293e7656
Reviewed-on: https://review.whamcloud.com/38840
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13604 doc: update e2fsprogs to 1.45.6.wc1 57/38757/2
Li Dongyang [Fri, 29 May 2020 02:38:19 +0000 (12:38 +1000)]
LU-13604 doc: update e2fsprogs to 1.45.6.wc1

Update the recommended e2fsprogs version to 1.45.6.wc1

Change-Id: I1e3d05207da954e6d7b9204fc4ed3329486f80dd
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/38757
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
3 years agoLU-8130 obd: convert obd_nid_hash to rhashtable 18/33518/21
James Simmons [Mon, 18 May 2020 22:10:10 +0000 (18:10 -0400)]
LU-8130 obd: convert obd_nid_hash to rhashtable

Linux has a resizeable hashtable implementation in lib,
so we should use that instead of having one in libcfs.

This patch converts the struct obd_export obd_nid_hash to use
rhashtable. In the process we gain lockless lookup which should
improve performance. For the nid hash we use rhltable since the
mapping can be many exports to a NID key.
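
The "many exports per NID key" relationship is what distinguishes a list-valued table from a plain one. The toy table below shows only that property; it is fixed-size, not resizable and not lockless, and it does not use the kernel rhashtable API. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_BUCKETS 8

/* Hypothetical export record keyed by a numeric NID. */
struct demo_export {
	uint64_t nid;
	struct demo_export *next;  /* chain within a bucket */
};

struct demo_nid_table {
	struct demo_export *buckets[DEMO_BUCKETS];
};

static size_t demo_bucket(uint64_t nid)
{
	return (size_t)(nid % DEMO_BUCKETS);
}

/* Insert without rejecting duplicates: several exports may share a NID. */
static void demo_insert(struct demo_nid_table *t, struct demo_export *e)
{
	size_t b = demo_bucket(e->nid);

	e->next = t->buckets[b];
	t->buckets[b] = e;
}

/* Count the exports registered under one NID. */
static size_t demo_count(const struct demo_nid_table *t, uint64_t nid)
{
	size_t n = 0;
	const struct demo_export *e;

	for (e = t->buckets[demo_bucket(nid)]; e != NULL; e = e->next)
		if (e->nid == nid)
			n++;
	return n;
}
```

A lookup for one NID can return more than one export, which a table with unique keys could not represent.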

Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I45154ceb48336b20161f771d986d8fe7333b9849
Reviewed-on: https://review.whamcloud.com/33518
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12511 utils: Move utilies specific values out of Lustre UAPI headers 90/38790/3
James Simmons [Fri, 5 Jun 2020 02:13:38 +0000 (22:13 -0400)]
LU-12511 utils: Move utilies specific values out of Lustre UAPI headers

Use FS_IOC_FS[S|G]ETXATTR directly. Move several things in the
UAPI header lustre_user.h that are only needed by userland tools
to the proper places.

Change-Id: Ie7a33742c0aba478c365c5fa44315400b28d8193
Test-Parameters: trivial
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38790
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
3 years agoLU-6142 mdt: Fix style issues for mdt_reint.c 86/38786/3
Arshad Hussain [Fri, 29 May 2020 13:45:47 +0000 (19:15 +0530)]
LU-6142 mdt: Fix style issues for mdt_reint.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_reint.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I2afd2e127c03c7f021da24ac8b9b00a059a07f0b
Reviewed-on: https://review.whamcloud.com/38786
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12511 build: ignore kmod handling in spec file for 49/38649/6
James Simmons [Thu, 28 May 2020 12:21:59 +0000 (08:21 -0400)]
LU-12511 build: ignore kmod handling in spec file for
 utilities only build

The lustre spec file handles kmod even when --disable-modules is
used. We don't need to manage any kmod in this case, so let's do
that handling only when ${with lustre_modules} is true.

Test-Parameters: trivial
Change-Id: Ifa43720aacabae5f41abf250d2e03b235c34cb4c
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13181 o2ib: fix page mapping error 88/37388/10
Alexey Lyashkov [Mon, 8 Jun 2020 00:27:18 +0000 (20:27 -0400)]
LU-13181 o2ib: fix page mapping error

IB DMA mapping can merge a physically contiguous page region into a
single one.
This confuses the kiblnd_fmr_pool_map() function, which expects to see
all fragments mapped.
It generates an error:
 (o2iblnd.c:1926:kiblnd_fmr_pool_map()) Failed to map mr 1/16 elements

From studying the IB code, it looks like the ib_map_mr_sg() return code
should be checked against the result of ib_dma_map_sg() instead of the
original fragment count, and the same value should be used as the
argument of ib_map_mr_sg().
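
The mismatch can be reproduced with a toy model in which mapping merges physically contiguous fragments. Everything below is a user-space simulation with hypothetical names; it does not call the RDMA API and only illustrates which count the result should be compared against.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_frag {
	uint64_t addr;
	uint32_t len;
};

/* Toy stand-in for DMA mapping: merges physically contiguous fragments
 * in place and returns the number of mapped entries (may be < nfrags). */
static size_t demo_dma_map(struct demo_frag *frags, size_t nfrags)
{
	size_t mapped = 0;

	for (size_t i = 0; i < nfrags; i++) {
		if (mapped > 0 && frags[mapped - 1].addr +
		    frags[mapped - 1].len == frags[i].addr)
			frags[mapped - 1].len += frags[i].len;
		else
			frags[mapped++] = frags[i];
	}
	return mapped;
}

/* Toy stand-in for MR mapping: maps every entry it is given. */
static size_t demo_map_mr(const struct demo_frag *frags, size_t n)
{
	(void)frags;
	return n;
}

/* Returns 0 on success, -1 on failure. The MR result is compared with
 * the DMA-mapped count, not with the original fragment count. */
static int demo_pool_map(struct demo_frag *frags, size_t nfrags)
{
	size_t dma_n = demo_dma_map(frags, nfrags);
	size_t mr_n = demo_map_mr(frags, dma_n);

	return mr_n == dma_n ? 0 : -1;
}
```

With 16 contiguous pages the toy mapping yields a single entry, so a check against the original count of 16 would report "1/16" even though the whole region is mapped.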

Test-Parameters: trivial
Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I3b845ae54d8659d4045921f519effcf0a4428e49
Reviewed-on: https://review.whamcloud.com/37388
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10157 lnet: restore an maximal fragments count 85/37385/11
Alexey Lyashkov [Mon, 20 Apr 2020 18:42:50 +0000 (21:42 +0300)]
LU-10157 lnet: restore an maximal fragments count

Lowering the number of fragments blocks connections from older clients
that want to use 256 fragments per transfer. Let's restore this number
to the original value.

Fixes: 272e49ce2d5d ("LU-10157 lnet: make LNET_MAX_IOV dependent on page size")

Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: Ia23aa1fb3d36a65abab6241c9ba75addc1dcce0a
Reviewed-on: https://review.whamcloud.com/37385
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11025 mdt: remove unused code 01/38801/2
Lai Siyao [Mon, 1 Jun 2020 20:17:17 +0000 (04:17 +0800)]
LU-11025 mdt: remove unused code

Remove obsolete code in dir_split_count_store() which was left over
from a code rebase.

Test-parameters: trivial

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id72385307623c7f281ea855e4c02fe110f1ed235
Reviewed-on: https://review.whamcloud.com/38801
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13488 kernel: RHEL 8.2 server support 40/38440/11
Jian Yu [Sat, 6 Jun 2020 18:29:27 +0000 (11:29 -0700)]
LU-13488 kernel: RHEL 8.2 server support

This patch makes changes to support RHEL 8.2 release with
kernel 4.18.0-193.1.2.el8 for Lustre server.

Test-Parameters: trivial \
clientdistro=el8.2 serverdistro=el8.2 \
env=SANITY_EXCEPT="130 133h" testlist=sanity

Change-Id: I350da46e1e2ff32945ef0f7106e6642821fc9ecf
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-1742 o2iblnd: 'Timed out tx' error message 35/33235/9
Sonia Sharma [Thu, 6 Sep 2018 03:39:23 +0000 (23:39 -0400)]
LU-1742 o2iblnd: 'Timed out tx' error message

Fix the error message in kiblnd_check_txs_locked()
to report the total RDMA time outstanding rather
than the number of seconds past the deadline.

This patch also adds time_on_activeq to struct kib_tx
so the time spent by tx in internal queue and active
queue can be tracked and reported. This would help
in diagnosing the issue.
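
The two quantities differ only in their reference point, as the sketch below shows. The struct and its fields are hypothetical and much simpler than struct kib_tx; times are plain seconds on a common clock.

```c
#include <stdint.h>

/* Hypothetical transmit descriptor; not struct kib_tx. */
struct demo_tx {
	int64_t queued_at;  /* when the tx was first queued */
	int64_t deadline;   /* when the tx is considered timed out */
};

/* Old-style value: how far past the deadline we are. */
static int64_t demo_secs_past_deadline(const struct demo_tx *tx, int64_t now)
{
	return now - tx->deadline;
}

/* New-style value: total time the tx has been outstanding. */
static int64_t demo_secs_outstanding(const struct demo_tx *tx, int64_t now)
{
	return now - tx->queued_at;
}
```

For a tx queued at t=100 with a deadline of t=150 and checked at t=155, the first value is 5 seconds and the second is 55 seconds; the latter says more about how long the transfer has actually been stuck.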

Change-Id: I4e486389220e383af88dbc482646e92a85bd5b14
Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33235
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephen Champion <stephen.champion@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9897 build: add binaries to .gitignore 25/38825/3
James Simmons [Wed, 3 Jun 2020 11:32:03 +0000 (07:32 -0400)]
LU-9897 build: add binaries to .gitignore

Several binaries that are built show up in git status.
Add them to the .gitignore file.

Test-Parameters: trivial
Change-Id: I7eb38d8fe725408dffaa71eb4db2d0305721367b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38825
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-9859 libcfs: merge linux-tracefile.c into tracefile.c 04/38804/3
Mr NeilBrown [Tue, 2 Jun 2020 12:31:35 +0000 (08:31 -0400)]
LU-9859 libcfs: merge linux-tracefile.c into tracefile.c

It's good to keep related code together.

Test-Parameters: trivial
Change-Id: I7708114c16b180c0f2f0e280447cd6fa4859792e
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38804
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13559 utils: fix lfs mirror delete error message 09/38609/8
Kévin Baillergeau [Fri, 15 May 2020 01:22:19 +0000 (01:22 +0000)]
LU-13559 utils: fix lfs mirror delete error message

Add different error messages depending on the option used.
Add a mirror_id variable instead of reusing the id variable
to store the result of mirror_id_of(id).

Signed-off-by: Kévin Baillergeau <kevin.baillergeau.ocre@cea.fr>
Change-Id: I5fbd307c4132c22d54470f2a1407074efe8bbc0a
Reviewed-on: https://review.whamcloud.com/38609
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 years agoLU-13503 mdc: allow setting max_mod_rpcs_in_flight larger 55/38455/10
Andreas Dilger [Wed, 27 May 2020 19:22:12 +0000 (12:22 -0700)]
LU-13503 mdc: allow setting max_mod_rpcs_in_flight larger

Allow setting mdc.*.max_mod_rpcs_in_flight > mdc.*.max_rpcs_in_flight
by increasing the latter value, rather than returning an error and
telling the user to do that.  This matches the similar behavior if
mdc.*.max_rpcs_in_flight is reduced lower than max_mod_rpcs_in_flight.

If there are multiple MDTs, the "mdc.*.max_mod_rpcs_in_flight" param
may be set from e.g. the MDT0000 config log before MDT0001 is fully
configured, catching MDT0001 with ocd_maxmodrpcs = 0 before the OCD
from the MDT has been filled in, and incorrectly trigger an error.
If seen during setup, allow ocd_maxmodrpcs = (max_rpcs_in_flight - 1),
since this will be fixed up later if mdc.*.max_rpcs_in_flight is set
smaller in the config log (if set larger it doesn't matter).
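
The symmetric adjustment described above might look like the following userspace sketch. It assumes the invariant max_mod_rpcs_in_flight < max_rpcs_in_flight (suggested by the "max_rpcs_in_flight - 1" value above); struct mdc_limits and both setters are hypothetical names, not the actual client code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pair of per-MDC tunables. */
struct mdc_limits {
	unsigned int max_rpcs_in_flight;
	unsigned int max_mod_rpcs_in_flight;
};

/*
 * Set the modifying-RPC limit. If it would reach or exceed the overall
 * limit, raise the overall limit instead of returning an error.
 */
static void set_max_mod_rpcs(struct mdc_limits *l, unsigned int max_mod)
{
	assert(l != NULL && max_mod > 0);
	if (max_mod >= l->max_rpcs_in_flight)
		l->max_rpcs_in_flight = max_mod + 1;
	l->max_mod_rpcs_in_flight = max_mod;
}

/*
 * Set the overall limit. If it drops to or below the modifying-RPC
 * limit, pull the modifying-RPC limit down with it.
 */
static void set_max_rpcs(struct mdc_limits *l, unsigned int max_rpcs)
{
	assert(l != NULL && max_rpcs > 1);
	l->max_rpcs_in_flight = max_rpcs;
	if (l->max_mod_rpcs_in_flight >= max_rpcs)
		l->max_mod_rpcs_in_flight = max_rpcs - 1;
}
```

Either setter leaves the pair consistent, so the order in which the two parameters appear in a config log does not matter in this sketch.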

Test-Parameters: env=ONLY=90 testlist=conf-sanity

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I4b20163e9e212db451738169ebdc361ab8c1c15e
Reviewed-on: https://review.whamcloud.com/38455
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11963 obd: Rename OS_STATE flags to OS_STATFS 89/34289/7
Patrick Farrell [Wed, 27 Feb 2019 21:31:11 +0000 (16:31 -0500)]
LU-11963 obd: Rename OS_STATE flags to OS_STATFS

The statfs state flags are oddly named "OS_STATE_[STATE]"
Rename them to "OS_STATFS_[STATE]" to make their role clearer
and make them easier to find.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3f43b3e73155d9fbd8b3e0fa52e7f4d26b9d2f89
Reviewed-on: https://review.whamcloud.com/34289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
3 years agoLU-10391 lnet: fix uninitialize var in choose_ipv4_src() 23/38823/2
Mr NeilBrown [Wed, 3 Jun 2020 22:57:31 +0000 (08:57 +1000)]
LU-10391 lnet: fix uninitialize var in choose_ipv4_src()

choose_ipv4_src() tests "*ret" without initializing it - and callers do
not (and should not) initialize the var.

Instead of testing "*ret", test "err" - if this is non-zero (it will
be -ENOENT) we want to use the address.  If it is zero, then we only
use the address if it is on the right subnet.

Test-Parameters: trivial
Reported-by: Amir Shehata <ashehata@whamcloud.com>
Fixes: d720fbaadad9 ("LU-10391 socklnd: use interface index to track local addr")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9b83207b790db07c06be1ee1c534a0fc63eb9ffa
Reviewed-on: https://review.whamcloud.com/38823
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13195 osp: invalidate object on write error 87/38387/4
Alex Zhuravlev [Mon, 27 Apr 2020 07:24:33 +0000 (10:24 +0300)]
LU-13195 osp: invalidate object on write error

Invalidate the object unconditionally, to avoid cases when the object
is on another request's invalidation list.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8ee0c484e695e88c0ea6fb13ac377fa689150780
Reviewed-on: https://review.whamcloud.com/38387
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 obdclass: convert calls to container_of0() 81/38381/2
Mr NeilBrown [Mon, 27 Apr 2020 05:28:23 +0000 (15:28 +1000)]
LU-6142 obdclass: convert calls to container_of0()

Most calls to container_of0() in lustre/obdclass can be safely changed
to container_of(), either because the pointer passed in is obviously
not NULL (or error) from the context, or because the pointer returned
is dereferenced without any checks.

The only exceptions are simple wrappers like dt2ls_dev(), lu2ls_obj(),
scrub_obj2dev() where there is no context, so it is safest to convert
to container_of_safe() instead.
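
The distinction between the two macros can be sketched in portable userspace C. The sketch_* macros and structs below are simplified, hypothetical stand-ins: the kernel's container_of_safe() also passes error pointers through, while this version only handles NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Plain variant: the caller guarantees that ptr is a valid member pointer. */
#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Safe variant: a NULL input yields NULL instead of a bogus pointer. */
#define sketch_container_of_safe(ptr, type, member) \
	((ptr) ? sketch_container_of(ptr, type, member) : (type *)NULL)

struct inner_sketch {
	int in_flags;
};

struct outer_sketch {
	int out_id;
	struct inner_sketch out_inner;	/* embedded at a non-zero offset */
};

/* Wrapper with no context about its argument: use the safe variant. */
static struct outer_sketch *inner2outer(struct inner_sketch *in)
{
	return sketch_container_of_safe(in, struct outer_sketch, out_inner);
}

/* Caller that dereferences the result anyway: the plain variant is enough. */
static int outer_id(struct inner_sketch *in)
{
	assert(in != NULL);
	return sketch_container_of(in, struct outer_sketch, out_inner)->out_id;
}
```

Because out_inner sits at a non-zero offset, the plain macro applied to NULL would produce a non-NULL garbage pointer, which is why context-free wrappers need the safe form.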

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ice1063f3ccb74eaec575bff85c960f3288be5ef5
Reviewed-on: https://review.whamcloud.com/38381
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: control client side encryption 33/36433/19
Sebastien Buisson [Fri, 11 Oct 2019 08:34:02 +0000 (08:34 +0000)]
LU-12275 sec: control client side encryption

Client enables encryption by default. However, this should be
possible only if server side is encryption aware.
Moreover, we want to give the ability to decide which clients can
make use of encryption, by extending the nodemap mechanism with a
new 'forbid_encryption' property, set to 0 by default.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I765e5ce555e8277319c03c770cb6e6ac73cfc9e8
Reviewed-on: https://review.whamcloud.com/36433
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13628 tests: add sanityn test_106 to ALWAYS_EXCEPT 15/38815/2
Sebastien Buisson [Wed, 3 Jun 2020 09:31:20 +0000 (11:31 +0200)]
LU-13628 tests: add sanityn test_106 to ALWAYS_EXCEPT

sanityn test_106 fails on CentOS 8 and Ubuntu 18, and is skipped
on all other distros because of lack of support for statx.

Test-Parameters: trivial
Test-Parameters: clientdistro=el8.1 testlist=sanityn
Test-Parameters: clientdistro=ubuntu1804 testlist=sanityn
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I370585138bbd05d1e4ea8f323c74659145fe7dec
Reviewed-on: https://review.whamcloud.com/38815
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13556 kernel: kernel update RHEL7.8 [3.10.0-1127.8.2.el7] 34/38634/5
Jian Yu [Tue, 2 Jun 2020 06:27:10 +0000 (23:27 -0700)]
LU-13556 kernel: kernel update RHEL7.8 [3.10.0-1127.8.2.el7]

Update RHEL7.8 kernel to 3.10.0-1127.8.2.el7.

Test-Parameters: trivial clientdistro=el7.8 serverdistro=el7.8

Change-Id: If7ac6f4b5f1fe32a15c63f51589a2e320001b4a5
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38634
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13111 kernel: new kernel [SLES12 SP5 4.12.14-122.20.1] 28/38628/4
Jian Yu [Wed, 20 May 2020 23:03:04 +0000 (16:03 -0700)]
LU-13111 kernel: new kernel [SLES12 SP5 4.12.14-122.20.1]

This patch makes changes to support new SLES12 SP5 release
for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp5 \
env=SANITY_EXCEPT="56oc 817" testlist=sanity

Change-Id: Ia4b856b03801e02da9a2e584efeb8759b4dd30c3
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38628
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12780 quota: don't use ptlrpc_thead of qmt_pool_recalc 12/38612/5
Mr NeilBrown [Fri, 15 May 2020 05:17:43 +0000 (15:17 +1000)]
LU-12780 quota: don't use ptlrpc_thead of qmt_pool_recalc

Rather than using ptlrpc_thread, use native kthreads functionality.

kthread_stop() / kthread_should_stop() is used to signal early
shutdown.

kthread_park()/kthread_unpark() is used to ensure that the thread
actually starts (else if kthread_stop() was called too early,
the thread function might not run and the ref on the pool
might not be dropped).

As the thread can stop spontaneously or on request we use xchg() on
the thread pointer to disambiguate in the case of a race and wait as
needed.
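
The xchg() idea can be illustrated with C11 atomics in userspace. This is a sketch under assumptions: struct pool_sketch, struct worker and claim_task() are hypothetical names, and atomic_exchange() stands in for the kernel's xchg().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a task handle. */
struct worker {
	int id;
};

/* Hypothetical pool that remembers its recalc thread. */
struct pool_sketch {
	_Atomic(struct worker *) recalc_task;
};

/*
 * Atomically take the task pointer and leave NULL behind.
 * Both the exiting thread and the stopper call this; only the first
 * caller sees a non-NULL value, so exactly one side "wins" the race.
 */
static struct worker *claim_task(struct pool_sketch *p)
{
	assert(p != NULL);
	return atomic_exchange(&p->recalc_task, (struct worker *)NULL);
}
```

In this sketch the stopper that gets a non-NULL pointer back knows the thread has not finished on its own and must be stopped; a NULL result means the thread already claimed the pointer and is exiting, so the stopper only has to wait.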

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I56b03a4735268bc808448a4c7b9e20c8625e2eee
Reviewed-on: https://review.whamcloud.com/38612
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13520 ldiskfs: fastpath in bitmap prefetching 13/38513/8
Alex Zhuravlev [Wed, 6 May 2020 12:25:32 +0000 (15:25 +0300)]
LU-13520 ldiskfs: fastpath in bitmap prefetching

getblk() can be very expensive if many threads are trying to find
a specific block, which can happen when threads are trying to prefetch
the same set of block bitmaps (where only one wins). Use an atomic
bitset to prevent this situation.

# mpirun -np 640 mdtest -D -C -r -u -n 1000 -vv -p 10 -i 3 -d /mdt0
               Max        Min        Mean       Std Dev
before b7cd65  76243.849  62728.925  69264.981  5525.345
after b7cd65   44270.444  42144.617  43138.707   873.040
this patch     83171.845  71796.197  77274.256  4652.121
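
The atomic-bitset fast path can be sketched with C11 atomics. This is a userspace illustration, not the ldiskfs patch itself: claim_prefetch() and GROUPS_PER_WORD are hypothetical names, and the kernel would typically use a test-and-set bit helper instead.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define GROUPS_PER_WORD 64u

/*
 * Try to claim the prefetch of one group's bitmap.
 * Returns true only for the caller that flips the bit from 0 to 1;
 * every other caller sees the bit already set and can skip the
 * expensive block lookup entirely.
 */
static bool claim_prefetch(_Atomic unsigned long long *bits,
			   unsigned int group)
{
	unsigned long long mask = 1ULL << (group % GROUPS_PER_WORD);
	unsigned long long old;

	assert(bits != NULL);
	old = atomic_fetch_or(&bits[group / GROUPS_PER_WORD], mask);
	return (old & mask) == 0;
}
```

A thread would call claim_prefetch() before looking up the block and return early when it yields false, so concurrent prefetchers of the same group do not all contend on the same buffer lookup.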

Fixes: b7cd65a3d1 ("LU-12988 ldiskfs: mballoc to prefetch groups")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3194aa0e13f22a1f34f5df846cb4b15feba5f432
Reviewed-on: https://review.whamcloud.com/38513
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@hpe.com>
3 years agoLU-13488 kernel: new kernel [RHEL 8.2 4.18.0-193.1.2.el8] 10/38410/9
Jian Yu [Wed, 20 May 2020 22:30:00 +0000 (15:30 -0700)]
LU-13488 kernel: new kernel [RHEL 8.2 4.18.0-193.1.2.el8]

This patch makes changes to support new RHEL 8.2 release
for Lustre client.

Test-Parameters: trivial clientdistro=el8.2 \
env=SANITY_EXCEPT="130 133h" testlist=sanity

Change-Id: Icb1db3afd2e94423a45354acfdd559f8f1e294cb
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38410
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13473 llite: don't check mirror info for page discard 07/38307/8
Bobi Jam [Wed, 22 Apr 2020 05:28:54 +0000 (13:28 +0800)]
LU-13473 llite: don't check mirror info for page discard

CIT_MISC is used for lock and page manipulation; it does not go
through the full I/O procedure, i.e. cl_io_loop() will not be called
for it. So don't check mirror info for a plain file, since it is not
initialized/set in this case.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I723d18260629b8f7c470d350d6d899d3bb88018a
Reviewed-on: https://review.whamcloud.com/38307
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 osd-ldiskfs: Fix style issues for osd_oi.c 89/38789/2
Arshad Hussain [Fri, 29 May 2020 11:10:15 +0000 (16:40 +0530)]
LU-6142 osd-ldiskfs: Fix style issues for osd_oi.c

This patch fixes issues reported by checkpatch
for file lustre/osd-ldiskfs/osd_oi.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I36d9551c681316836183f78d13c3e342ea8e1eb1
Reviewed-on: https://review.whamcloud.com/38789
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 ost: Fix style issues for ost_handler.c 87/38787/2
Arshad Hussain [Fri, 29 May 2020 10:28:56 +0000 (15:58 +0530)]
LU-6142 ost: Fix style issues for ost_handler.c

This patch fixes issues reported by checkpatch
for file lustre/ost/ost_handler.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ib779a2bdcd0ea3ada49e1aa2059acb17134d4a08
Reviewed-on: https://review.whamcloud.com/38787
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 mdt: Fix style issues for mdt_xattr.c 85/38785/2
Arshad Hussain [Fri, 29 May 2020 12:49:09 +0000 (18:19 +0530)]
LU-6142 mdt: Fix style issues for mdt_xattr.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_xattr.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I9c832d75f416998e1fbd9d25b5b3f0d3d6cf8b4d
Reviewed-on: https://review.whamcloud.com/38785
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10157 ptlrpc: fill md correctly. 87/37387/10
Alexey Lyashkov [Mon, 1 Jun 2020 13:00:11 +0000 (09:00 -0400)]
LU-10157 ptlrpc: fill md correctly.

MD fill should be limited by the overall transfer size in addition
to the number of fragments.
Let's do this.

Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I45219ffd8206f89f54688e7ecb0ccbb65ed3e3c1
Reviewed-on: https://review.whamcloud.com/37387
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10157 ptlrpc: separate number MD and refrences for bulk 86/37386/10
Alexey Lyashkov [Mon, 1 Jun 2020 12:57:53 +0000 (08:57 -0400)]
LU-10157 ptlrpc: separate number MD and refrences for bulk

Introduce bulk desc refs, which are different from the MD count:
ptlrpc expects to have events from all MDs, whether they are filled or
not. So the number of MDs to post is related to the requested transfer
size, not to the number of MDs with data.

Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I86a13d89eb68f469678baa842d47f5a9d910802a
Reviewed-on: https://review.whamcloud.com/37386
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: enable client side encryption 43/36143/22
Sebastien Buisson [Mon, 8 Jul 2019 14:51:39 +0000 (14:51 +0000)]
LU-12275 sec: enable client side encryption

Enable client side encryption. By default it is activated,
letting the user specify the actual encryption policy to use on
a per-directory basis. It is possible to deactivate client
side encryption by using the 'noencrypt' mount option.

Also add the test dummy encryption mode option to ease
testing.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0e8d4db7ab8a77aba0600788cca9403f7c50f8a6
Reviewed-on: https://review.whamcloud.com/36143
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12275 sec: documentation for client-side encryption 59/38759/3
Sebastien Buisson [Thu, 28 May 2020 07:11:20 +0000 (09:11 +0200)]
LU-12275 sec: documentation for client-side encryption

Add several documents about client-side encryption under
Documentation/client_side_encryption:
- threat_model.txt is the description of the threat model for Lustre
  client-side encryption;
- key_hierarchy.txt is the description of the key hierarchy for Lustre
  client-side encryption;
- modes_usage.txt is the description of the encryption modes and usage
  for Lustre client-side encryption;
- access_semantics.txt is the description of the access semantics for
  Lustre client-side encryption.

As we rely on kernel's fscrypt library for this feature, fscrypt's
concepts are largely valid. These documents are inspired by fscrypt
documentation in the Linux kernel tree, see
Documentation/filesystems/fscrypt.rst

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6c9d42e572111ed2a3388e4f58b2560f365a5853
Reviewed-on: https://review.whamcloud.com/38759
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoLU-11025 dne: directory restripe and auto split 84/37284/19
Lai Siyao [Mon, 30 Dec 2019 15:27:27 +0000 (23:27 +0800)]
LU-11025 dne: directory restripe and auto split

A dedicated restriper thread is created for each MDT. It does three
tasks in a loop:
1. If there is a directory whose total number of sub-files exceeds the
   threshold (50000 by default, can be changed with "lctl set_param
   mdt.*.dir_split_count=N"), split this directory by adding new
   stripes (4 stripes by default, which can be adjusted with
   "lctl set_param mdt.*.dir_split_delta=N").
2. If a directory stripe LMV is marked 'MIGRATION', migrate a sub file
   from the current offset, and update the offset to the next file.
3. If a directory master LMV is marked 'RESTRIPING', check whether
   the 'MIGRATION' flag is cleared on all stripe LMVs; if so, clear
   the 'RESTRIPING' flag and update the directory LMV.

In the last patch, the first part of manual directory restripe was
implemented; in this patch, sub file migration and dir layout
update are done. Directory auto-split is done in a similar way, except
that the first step is done by this thread too.

Directory auto-split can be enabled/disabled by "lctl set_param
mdt.*.enable_dir_auto_split=[0|1]", it's turned on by default.

Auto split is triggered at the end of getattr(): since now the attr
contains dirent count, check whether it exceeds threshold, if so,
add this directory into mdr_auto_split list and wake up the dir
restriper thread.
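
The auto-split decision can be sketched as a small pure function. The struct and function names below are hypothetical; only the tunable names and their defaults (50000 entries, 4 stripes, enabled) come from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the mdt.*.* tunables named above. */
struct split_tunables {
	bool enable_dir_auto_split;	/* on by default */
	uint64_t dir_split_count;	/* dirent threshold, 50000 by default */
	uint32_t dir_split_delta;	/* stripes added per split, 4 by default */
};

static const struct split_tunables split_defaults = { true, 50000, 4 };

/*
 * Return the stripe count the directory should have after the check:
 * unchanged when auto-split is disabled or the dirent count does not
 * exceed the threshold, otherwise grown by dir_split_delta.
 */
static uint32_t stripes_after_check(const struct split_tunables *t,
				    uint64_t dirent_count,
				    uint32_t cur_stripes)
{
	assert(t != NULL);
	if (!t->enable_dir_auto_split || dirent_count <= t->dir_split_count)
		return cur_stripes;
	return cur_stripes + t->dir_split_delta;
}
```

In the real code the check only queues the directory for the restriper thread; the sketch collapses that into the resulting stripe count to keep the threshold logic visible.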

Restripe migration is also triggered in getattr(): if the object is
directory stripe, and LMV 'MIGRATION' flag set, add this object into
mdr_restripe_migrate list and wake up the dir restriper thread.

Directory layout update is similar: if the current directory is striped,
and the LMV 'RESTRIPING' flag is set, add this directory into
mdr_restripe_update list and wake up the restriper thread.

By default, restripe migrates dirents only and leaves inodes unchanged;
this can be adjusted by "lctl set_param mdt.*.dir_restripe_nsonly=[0|1]".

Currently DoM file inode migration is not supported, migrate dirent
only for such files to avoid leaving dir migration/restripe
unfinished.

Add sanity.sh 230o, 230p and 230q, adjust 230j since DoM files migrate
dirent.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8c83b42e4acbaab067d0092d0b232de37f956588
Reviewed-on: https://review.whamcloud.com/37284
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9859 libcfs: discard libcfs_prim.h 73/38673/2
Mr. NeilBrown [Wed, 20 May 2020 11:43:42 +0000 (07:43 -0400)]
LU-9859 libcfs: discard libcfs_prim.h

This file no longer contains enough content
to justify a separate file.  So merge with
libcfs.h.

Linux-commit: 7673fd6b6af0c234e8ed5ec94c4da083b2f7d354

Change-Id: I4f486f0356f14e564032ed22e2e439fe4e65942c
Test-Parameters: trivial
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/38673
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11310 ldiskfs: Repair support for SUSE 15 again 11/38611/4
Mr NeilBrown [Fri, 15 May 2020 04:06:29 +0000 (14:06 +1000)]
LU-11310 ldiskfs: Repair support for SUSE 15 again

A recent patch split ext4-htree-lock.patch out from the various
ext4-pdirop.patch files, but this happened before various
files were copied and modified for extra sles15 support.

The patch makes the necessary checks to bring the
htree-lock split to sle15.

Fixes: 46ed28c0d10a ("LU-11310 ldiskfs: Repair support for SUSE 15 GA and SP1")
Fixes: 42880f9502ba ("LU-13054 ldiskfs: split htree_lock as separate patch")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I464aaebc52d5b2b73fdd748c7d4dbbaf43f1ac49
Reviewed-on: https://review.whamcloud.com/38611
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-9859 libcfs: merge linux-debug.c into debug.c 02/38602/4
Mr NeilBrown [Thu, 14 May 2020 17:31:39 +0000 (13:31 -0400)]
LU-9859 libcfs: merge linux-debug.c into debug.c

There is no important difference between the contents
of these files, so merge them into one.

Test-Parameters: trivial
Change-Id: I32ac9215f317b305092623e0743a530e18e4d9c1
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38602
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13509 ptlrpc: Clear bd_registered in ptlrpc_unregister_bulk 57/38457/4
Chris Horn [Sat, 2 May 2020 15:37:15 +0000 (10:37 -0500)]
LU-13509 ptlrpc: Clear bd_registered in ptlrpc_unregister_bulk

The patch for LU-12816 https://review.whamcloud.com/36309 has us
clearing the bd_registered flag in ptl_send_rpc(). This flag is set
in ptlrpc_register_bulk(), so it makes sense for us to clear it in
ptlrpc_unregister_bulk(). When we're cleaning up in ptl_send_rpc()
we can be sure the flag is cleared with the call to
ptlrpc_unregister_bulk().

This commit also adds a test case for the LU-12816 bug.

Fixes: e6225c07ce4c ("LU-12816 ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM")
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Iabaf109aaf72894cd5acbcacbb0299929ea1a146
Reviewed-on: https://review.whamcloud.com/38457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12477 ldiskfs: drop SUSE kernel 4.4 and earlier 68/38268/10
Shaun Tancheff [Tue, 19 May 2020 22:12:40 +0000 (17:12 -0500)]
LU-12477 ldiskfs: drop SUSE kernel 4.4 and earlier

This patch drops ldiskfs support for SLES 12 sp1 through sp3.
SLES 12 sp4 and sp5 use a 4.12.14 kernel but no specific
testing has been done for those kernels.

The SUSE 15 ga 4.12.14-150 and sp1 4.12.14-197.7 releases are
tested and known to work.

Remove unused kernel patches for sles and any patches not
referenced in a series.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia13088481b7304d4931ecbb6946a031a851cfe89
Reviewed-on: https://review.whamcloud.com/38268
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13437 lmv: check stripe FID sanity 60/38560/2
Lai Siyao [Fri, 8 May 2020 14:53:47 +0000 (22:53 +0800)]
LU-13437 lmv: check stripe FID sanity

A striped directory layout may be broken; if some stripe FID is insane,
return -ENODEV.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7ed8c7c561e34625e2cb29bfd14bc0ecf3fce46c
Reviewed-on: https://review.whamcloud.com/38560
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10973 lnet: infrastructure to build the LUTF 84/38084/8
Amir Shehata [Wed, 25 Mar 2020 01:48:55 +0000 (18:48 -0700)]
LU-10973 lnet: infrastructure to build the LUTF

Add flags to turn on/off LUTF building.
Modify the gitignore to ignore .i files, which are the
swig interface files used to create python callable APIs
from C APIs.

Add m4 files to search for the python and swig installations needed
for building the LUTF.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Idbd23dc457c95425edbf88755ae261ff4de6b0c9
Reviewed-on: https://review.whamcloud.com/38084
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13297 tests: parallel-scale enhancement 32/37732/5
Elena Gryaznova [Wed, 26 Feb 2020 16:45:26 +0000 (19:45 +0300)]
LU-13297 tests: parallel-scale enhancement

Patch changes parallel-scale tests to use t-f test_mkdir()
instead of mkdir to have the possibility to run these tests on
striped directories.

Test-Parameters: trivial testlist=parallel-scale,parallel-scale-nfsv3
Cray-bug-id: LUS-8291
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I6a0d52d7115668ef2bc7397a9a1012dbcb9e0526
Reviewed-on: https://review.whamcloud.com/37732
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-12205 tests: host_nids_address() fix for MR setup 23/34723/10
Elena Gryaznova [Fri, 19 Apr 2019 12:39:32 +0000 (15:39 +0300)]
LU-12205 tests: host_nids_address() fix for MR setup

Patch fixes t-f:host_nids_address() to work properly
on a setup with multiple networks.

Example:
With MR setup we have:
lctl list_nids
192.168.101.3@tcp
10.0.101.3@tcp1
10.0.201.3@tcp2
For NETTYPE=tcp host_nids_address() should give the result
192.168.101.3 only.
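
The exact-match rule in the example above can be illustrated in C. This is not the shell helper from the test framework, only a sketch of the matching logic; nid_address_for_net() is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * If nid has the form "<addr>@<net>" and <net> equals nettype exactly,
 * copy <addr> into buf and return true. "tcp" therefore matches
 * "192.168.101.3@tcp" but not "10.0.101.3@tcp1".
 */
static bool nid_address_for_net(const char *nid, const char *nettype,
				char *buf, size_t buflen)
{
	const char *at;
	size_t addrlen;

	assert(nid != NULL && nettype != NULL && buf != NULL);
	at = strchr(nid, '@');
	if (at == NULL || strcmp(at + 1, nettype) != 0)
		return false;
	addrlen = (size_t)(at - nid);
	if (addrlen + 1 > buflen)
		return false;	/* address does not fit in buf */
	memcpy(buf, nid, addrlen);
	buf[addrlen] = '\0';
	return true;
}
```

Comparing the whole network name, rather than matching a prefix, is what keeps tcp1 and tcp2 addresses out of the result for NETTYPE=tcp.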

Test-Parameters: trivial testlist=sanity,sanityn,sanity-sec,\
lnet-selftest,conf-sanity,obdfilter-survey

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-7150
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Change-Id: Ida397f1811be142c5aa8813f32461b83d6113fc2
Reviewed-on: https://review.whamcloud.com/34723
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-2225 tests: sanity/27 tests to poll for state 88/4388/10
Alex Zhuravlev [Mon, 29 Apr 2019 15:30:24 +0000 (18:30 +0300)]
LU-2225 tests: sanity/27 tests to poll for state

- reset_enospc() to poll when precreate_status is OK
- exhaust_precreations() to wait one time, not for every OST

ONLY=27 OSTCOUNT=7 sh sanity: 641 sec before and 373 sec after

Test-Parameters: trivial testlist=sanity fstype=zfs
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I97366cb046c50223020f2161603657056a602cd5
Reviewed-on: https://review.whamcloud.com/4388
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
3 years agoLU-6142 utils: Fix style issues for liblustreapi.c 09/38709/2
Arshad Hussain [Fri, 22 May 2020 20:43:41 +0000 (02:13 +0530)]
LU-6142 utils: Fix style issues for liblustreapi.c

This patch fixes issues reported by checkpatch
for file lustre/utils/liblustreapi.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I901dfbd7e489e4a074b329c59370f76e0d87fe31
Reviewed-on: https://review.whamcloud.com/38709
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 utils: Fix style issues for lctl.c 08/38708/2
Arshad Hussain [Fri, 22 May 2020 19:27:33 +0000 (00:57 +0530)]
LU-6142 utils: Fix style issues for lctl.c

This patch fixes issues reported by checkpatch
for file lustre/utils/lctl.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I0e9f9dd9db9f5ba64371283ce2afcabcead0b370
Reviewed-on: https://review.whamcloud.com/38708
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
3 years agoLU-6142 utils: Fix style issues for llog_reader.c 06/38706/2
Arshad Hussain [Fri, 22 May 2020 15:22:01 +0000 (20:52 +0530)]
LU-6142 utils: Fix style issues for llog_reader.c

This patch fixes issues reported by checkpatch
for file lustre/utils/llog_reader.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I02af3385be5521ef5ed9063926e846059067b8ab
Reviewed-on: https://review.whamcloud.com/38706
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
3 years agoLU-6142 utils: Fix style issues for mount_lustre.c 42/38642/3
Arshad Hussain [Sat, 16 May 2020 05:57:22 +0000 (11:27 +0530)]
LU-6142 utils: Fix style issues for mount_lustre.c

This patch fixes issues reported by checkpatch
for file lustre/utils/mount_lustre.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie664a7726805a6f699671b3703887852a1ee82f3
Reviewed-on: https://review.whamcloud.com/38642
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-6142 lustre: convert some container_of to *_safe 84/38384/2
Mr NeilBrown [Mon, 27 Apr 2020 05:42:38 +0000 (15:42 +1000)]
LU-6142 lustre: convert some container_of to *_safe

Each of these uses of container_of0() cannot be determined from local
inspection to always receive a valid pointer, so container_of()
cannot be used.
So convert them to the upstream standard container_of_safe().
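
A minimal userspace sketch of the difference, for illustration only.
The struct names and the *_sketch macros are hypothetical stand-ins,
not the Lustre or kernel definitions. The upstream kernel macro is
understood to also pass error pointers through unchanged; this sketch
only handles NULL.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not Lustre structures. */
struct list_node {
	struct list_node *next;
};

struct item {
	int id;
	struct list_node link;
};

/* Plain variant: only valid when ptr really points at the member. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * NULL-tolerant variant: a NULL member pointer yields NULL instead of
 * a bogus address computed from NULL. Note that ptr is evaluated
 * twice here, so it must not have side effects.
 */
#define container_of_safe_sketch(ptr, type, member) \
	((ptr) ? container_of_sketch(ptr, type, member) : (type *)NULL)

/* Returns the item id, or -1 when no link was supplied. */
static int item_id_from_link(struct list_node *link)
{
	struct item *it = container_of_safe_sketch(link, struct item, link);

	return it ? it->id : -1;
}
```

With the plain variant, a NULL link would produce a non-NULL garbage
pointer that a later NULL check could not catch.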

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7d5551ae4d88bc931f7edbd3447b5bb2db8ce40c
Reviewed-on: https://review.whamcloud.com/38384
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13546 pcc: exclude mmap_sanity tst8/tst9 from test list 98/38598/3
Qian Yingjin [Thu, 14 May 2020 10:16:48 +0000 (18:16 +0800)]
LU-13546 pcc: exclude mmap_sanity tst8/tst9 from test list

The current RHEL8 kernel does not strictly obey POSIX semantics for
mmap() access within the mapping but beyond the current end of the
underlying file: it does not send SIGBUS to the process.

For negative file offsets, sanity_mmap also fails on the 48-bit
ldiskfs backend on the new RHEL kernel because the offset is too large:
"Value too large for defined data type".

For these reasons, mmap_sanity tst8 and tst9 both fail on the
new RHEL8 kernel. Thus, we exclude mmap_sanity tst8 and tst9 from the
sanity-pcc and sanity test lists.

Test-Parameters: trivial clientdistro=el8 testlist=sanity-pcc,sanityn
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6252852fac08fea609444613c59ae138891d8fb8
Reviewed-on: https://review.whamcloud.com/38598
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-10401 procs: print new line based on distro 99/38699/7
Yang Sheng [Fri, 22 May 2020 04:06:47 +0000 (12:06 +0800)]
LU-10401 procs: print new line based on distro

Upstream changed to print the new line in the module
parameter callback instead of in the kernel itself, so we
need to test the output of param_get_byte to determine
whether to output the new line.
(upstream: v4.14-rc3-148-g96802e6b1dbf)
Also output the filename and file content when the test
fails for sanity 133h.

Test-Parameters: trivial testlist=sanity clientdistro=el8.1 serverdistro=el8.1
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ibb5961e4de8b05d9dd59875e4fd38a42fa07d0d6
Reviewed-on: https://review.whamcloud.com/38699
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 years agoNew tag 2.13.54 2.13.54 v2_13_54
Oleg Drokin [Wed, 27 May 2020 22:45:23 +0000 (18:45 -0400)]
New tag 2.13.54

Change-Id: Ifa907ef1d982bd4a5e777c8eacd753e4e0ba4385

3 years agoLU-13553 lnd: gracefully handle unexpected events 69/38669/2
Amir Shehata [Wed, 20 May 2020 05:21:10 +0000 (22:21 -0700)]
LU-13553 lnd: gracefully handle unexpected events

When a tx completes, the kiblnd_tx_complete() callback is invoked.
We ensure:
LASSERT (tx->tx_sending > 0);
However this assert is being triggered in some rare scenarios.
tx_sending could be 0 at this point for one of two reasons:
 1. ib_post_send() failed but OFED stack is still sending
    a tx complete event.
 2. We're getting two different events for the same tx

Instead of asserting, ignore that tx_complete event and print
the tx pointer and its status.
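
A rough sketch of the idea, assuming a much simplified tx descriptor.
struct tx_sketch and tx_complete_sketch() are hypothetical names for
illustration and do not mirror the real o2iblnd code.

```c
#include <stdio.h>

/* Hypothetical, simplified stand-in for the real tx descriptor. */
struct tx_sketch {
	int tx_sending;	/* number of sends still in flight */
	int tx_status;	/* first error seen, 0 if none */
};

/*
 * Handle a completion event. Returns 1 if the event was accounted
 * for, 0 if it was unexpected and therefore ignored.
 */
static int tx_complete_sketch(struct tx_sketch *tx, int status)
{
	if (tx->tx_sending <= 0) {
		/* Previously an assertion; now log and ignore. */
		fprintf(stderr, "unexpected completion: tx %p status %d\n",
			(void *)tx, status);
		return 0;
	}

	tx->tx_sending--;
	if (status != 0 && tx->tx_status == 0)
		tx->tx_status = status;
	return 1;
}
```

The second of two events for the same tx takes the early return, so
tx_sending never goes negative.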

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8cd192538c0c80abaef23a4b6e6906936043060b
Reviewed-on: https://review.whamcloud.com/38669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13585 tests: add mustfail check 62/38662/4
Elena Gryaznova [Tue, 19 May 2020 14:21:59 +0000 (17:21 +0300)]
LU-13585 tests: add mustfail check

This patch adds the ability to ignore MPI load failures
for particular instances.

This is useful for Quota Pools stress tests which are supposed
to randomly hit QP limits.
The subset of expected failures is set by specifying NINSTMUSTFAIL.
      0 - mpi tests from all clients must pass (default)
      1 - mpi tests from all clients must fail
      N - mpi tests from one client out of every N must fail.
Set NINSTMUSTFAIL=2 to expect every 2nd mpi instance to fail and
NINSTMUSTFAIL=3 to expect every 3rd mpi instance to fail.

For the QP test, different limits are set for users per pool: half
of the users have a small limit, which makes IOR fail:
  small limit is set for user1, user3, user5
  large limit is set for user2, user4
Run N IOR instances on N clients; each client/instance uses its own
user{1..N}. The test is considered passed if the IOR instances fail
on client1, client3, client5.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-8844, LUS-8504, LUS-8602
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Change-Id: Ia7c4e394c3724190d6cff9f086f8837e54f6110d
Reviewed-on: https://review.whamcloud.com/38662
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-11621 utils: optimize lhsmtool_posix with copy_file_range() 51/38651/5
James Simmons [Tue, 19 May 2020 18:59:40 +0000 (14:59 -0400)]
LU-11621 utils: optimize lhsmtool_posix with copy_file_range()

Newer kernels and glibc offer copy_file_range() which avoids
a context switch needed with read() + write() for file data
copying. In the future Lustre can look to optimize this copy
on the server backend. Update lhsmtool_posix to use this new
functionality.
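
A minimal sketch of such a copy loop, not the lhsmtool_posix code
itself. copy_data_sketch() and write_all() are hypothetical helpers.
As an assumption of this sketch, any copy_file_range() failure other
than EINTR switches to a read() + write() loop, so a genuine I/O
error resurfaces from the fallback path.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write(fd, buf, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Copy up to len bytes from fd_in to fd_out at their current file
 * offsets. Returns the number of bytes copied (short on EOF), or -1.
 */
static ssize_t copy_data_sketch(int fd_in, int fd_out, size_t len)
{
	char buf[65536];
	size_t done = 0;
	int use_cfr = 1;

	while (done < len) {
		size_t want = len - done;
		ssize_t n;

		if (use_cfr) {
			n = copy_file_range(fd_in, NULL, fd_out, NULL,
					    want, 0);
			if (n < 0 && errno != EINTR) {
				/* Not usable here: fall back for good. */
				use_cfr = 0;
				continue;
			}
		} else {
			if (want > sizeof(buf))
				want = sizeof(buf);
			n = read(fd_in, buf, want);
			if (n > 0 && write_all(fd_out, buf, (size_t)n) < 0)
				return -1;
		}

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0)
			break;	/* end of input */
		done += (size_t)n;
	}
	return (ssize_t)done;
}
```

Passing NULL for both offset arguments makes copy_file_range() use and
advance the file offsets of the descriptors, which keeps the two paths
interchangeable.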

Change-Id: If63d57b89902f3d5c9ddde66901f3f55d15080f5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38651
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13557 quota: remove inline declarations 95/38595/5
Alex Zhuravlev [Thu, 14 May 2020 06:22:45 +0000 (09:22 +0300)]
LU-13557 quota: remove inline declarations

which can't really be used, given that the function bodies aren't in the .h file

Fixes: 09f9fb3211 ("LU-11023 quota: quota pools for OSTs")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia8bb5d04185ec6a779c872a9825c23034030e605
Reviewed-on: https://review.whamcloud.com/38595
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13555 build: Map mainline kernel on rhel to rhel 94/38594/2
Shaun Tancheff [Wed, 13 May 2020 23:57:57 +0000 (18:57 -0500)]
LU-13555 build: Map mainline kernel on rhel to rhel

Mainline kernel builds on RHEL can map to RHEL directly and
the MAINLINE_KERNEL abstraction is not needed and can be
removed.

Test-Parameters: trivial
HPE-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I80ba0477c405b9d0f12b4d472f244bb9b15999ff
Reviewed-on: https://review.whamcloud.com/38594
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13510 lnd: Allow independent socklnd timeout 60/38460/6
Chris Horn [Sat, 2 May 2020 15:18:42 +0000 (10:18 -0500)]
LU-13510 lnd: Allow independent socklnd timeout

Allow the socklnd timeout to be set independent of
lnet_transaction_timeout and retry_count.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Iaa76e77990c8c5ce79193ae8d1f7b3a7db6b433f
Reviewed-on: https://review.whamcloud.com/38460
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13510 lnd: Allow independent ko2iblnd timeout 59/38459/6
Chris Horn [Sat, 2 May 2020 15:13:41 +0000 (10:13 -0500)]
LU-13510 lnd: Allow independent ko2iblnd timeout

Allow ko2iblnd timeout parameter to be set independent of the
lnet_transaction_timeout and retry_count.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I5b9a0da83c597c77f597db0c5cebbd933b5988fc
Reviewed-on: https://review.whamcloud.com/38459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13510 lnet: Add lnet_lnd_timeout to lnetctl 64/38464/5
Chris Horn [Sun, 3 May 2020 15:02:57 +0000 (10:02 -0500)]
LU-13510 lnet: Add lnet_lnd_timeout to lnetctl

Add lnet_lnd_timeout to lnetctl. The param is read-only since it is
calculated from transaction_timeout and retry_count.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I516d2d8082951014835c9e8c8a7ac2111f48e7ce
Reviewed-on: https://review.whamcloud.com/38464
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13510 lnet: Add lnet_lnd_timeout to sysfs 82/38482/3
Chris Horn [Mon, 4 May 2020 18:29:41 +0000 (13:29 -0500)]
LU-13510 lnet: Add lnet_lnd_timeout to sysfs

Allow lnet_lnd_timeout to be read (only) from sysfs.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I8bdbf0f6a51a798f3395238e50a2ebb1fdb64911
Reviewed-on: https://review.whamcloud.com/38482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13510 lnet: Correct the default LND timeout 81/38481/3
Chris Horn [Mon, 4 May 2020 18:24:51 +0000 (13:24 -0500)]
LU-13510 lnet: Correct the default LND timeout

Default LND timeout is currently too low. To allow for
lnet_retry_count resend attempts within a single
lnet_transaction_timeout window, the LND timeout needs to be less
than lnet_transaction_timeout / lnet_retry_count. If the retry
count is 0, we still want LND timeout to be less than the LNet
transaction timeout.
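
As an illustration of the constraint, one expression that satisfies it
is sketched below. The formula (T - 1) / (R + 1) and the helper name
are assumptions made for this example; the commit message does not
state the exact expression used.

```c
/*
 * Hypothetical helper: derive an LND timeout that is strictly less
 * than transaction_timeout / retry_count, and strictly less than
 * transaction_timeout when retry_count is 0.
 * Assumes transaction_timeout >= 1. The result can be 0 when the
 * transaction timeout is small relative to the retry count.
 */
static unsigned int lnd_timeout_sketch(unsigned int transaction_timeout,
				       unsigned int retry_count)
{
	return (transaction_timeout - 1) / (retry_count + 1);
}
```

For example, a transaction timeout of 50 with a retry count of 2
gives 49 / 3 = 16, which is below 50 / 2 = 25. With a retry count of
0 and a transaction timeout of 10 the result is 9.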

Also, be sure to update the LND timeout when health is toggled on or
off.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ifd6d97895192a321081aa09ebe9f1d0115e63305
Reviewed-on: https://review.whamcloud.com/38481
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13485 build: Enable 2 stage configure tests 47/38347/4
Shaun Tancheff [Sun, 10 May 2020 20:14:54 +0000 (15:14 -0500)]
LU-13485 build: Enable 2 stage configure tests

This idea was implemented by OpenZFS a while ago. This
is heavily inspired by the OpenZFS work.

Here we enable splitting compile tests into two
distinct parts that share an internal unique name.

The source half can then be built in parallel and
the results can be determined based on the build
artifacts.

Tests which depend on order of execution and/or the
result of a previous test are not well suited for
being converted. However the majority of lustre
compile tests can be run in parallel.

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If01ccdfdf4810ecc2d616da3fa6b7ca786fe760f
Reviewed-on: https://review.whamcloud.com/38347
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13472 lnet: set route aliveness properly 23/38323/6
Amir Shehata [Thu, 23 Apr 2020 00:54:34 +0000 (17:54 -0700)]
LU-13472 lnet: set route aliveness properly

In the case when discovery is toggled from on to off, the route
aliveness might become stale due to not updating the route->lr_alive
variable correctly. It will get updated once the gateway is pinged.
However, there is a period of up to alive_router_check_interval during
which the route can be down.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ic1754d6e7ddc9398efc7a64f823a70e5546e9ca6
Reviewed-on: https://review.whamcloud.com/38323
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13477 lnet: Force full discovery cycle 22/38322/5
Amir Shehata [Thu, 23 Apr 2020 00:47:05 +0000 (17:47 -0700)]
LU-13477 lnet: Force full discovery cycle

There are scenarios where there could be a discrepancy between
cached peer information and reality. In these cases what could
end up happening is incomplete interface information might be
cached because one side determined that the peer didn't require
a PUSH. This will lead to undesired MR behavior, where not all
the interfaces are used for a period of time.

Therefore, it is safer to always force a full discovery cycle:
GET/PUSH to ensure both sides are up-to-date.

In the NMR case, when discovery is turned off, make sure to flag
discovery as complete to avoid stalling the state machine.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie49ad11e8ff874206baa268a4ef2d58ebb536ed5
Reviewed-on: https://review.whamcloud.com/38322
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13478 lnet: handle discovery off properly 21/38321/5
Amir Shehata [Thu, 23 Apr 2020 00:26:48 +0000 (17:26 -0700)]
LU-13478 lnet: handle discovery off properly

Peers only need to be updated when discovery is toggled from
on to off. This way the peers don't attempt to send to a
non-primary NID of the node. However, when discovery is
toggled from off to on, the peer will attempt rediscovery
and the peer information will eventually consolidate.

In order to properly delete the peer only when it makes sense
we have to differentiate between the case when we get the
initial message and when we get a push for an already discovered
peer. We only want to delete our local representation if the peer
is one we have already had in our records.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id6a7353276fec82fddf90e0fa9d85d165b459c8d
Reviewed-on: https://review.whamcloud.com/38321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13464 target: abort recovery if timer fail 77/38277/7
Hongchao Zhang [Thu, 14 May 2020 10:25:46 +0000 (18:25 +0800)]
LU-13464 target: abort recovery if timer fail

During target recovery, the recovery timer should be kept
armed to ensure that recovery doesn't take too long. If the
deadline of the recovery timer has passed and recovery has not
completed yet, something has gone wrong, and the recovery should
be aborted.

Change-Id: Id44f2a2d1a3183ad8dd13f4d34392713c55a2cb3
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38277
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13258 obdclass: bind zombie export cleanup workqueue 12/38212/11
James Simmons [Mon, 11 May 2020 22:38:10 +0000 (18:38 -0400)]
LU-13258 obdclass: bind zombie export cleanup workqueue

Lustre uses a workqueue to clear out stale exports. Bind this
workqueue to the cores used by Lustre defined by the CPT setup.

Move the code handling workqueue binding to libcfs so it can be
used by everyone.

Rename CONFIG_LUSTRE_PINGER to CONFIG_LUSTRE_FS_PINGER to match
linux client.

Change-Id: Ifa109f6a93e6ec6bbdef5e91fe8ca1cde0eaea3e
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38212
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13412 llite: fix read if readahead window smaller than rpc size 32/38132/3
Wang Shilong [Fri, 3 Apr 2020 13:14:25 +0000 (21:14 +0800)]
LU-13412 llite: fix read if readahead window smaller than rpc size

Readahead always tries to align readahead with the RPC size, but this
can cause a problem if the readahead window is smaller than the RPC size.

With the current code, it falls back to many 4k reads because the
RPC-aligned window start plus the window pages ends up behind the
current read. Fix this by aligning with the readahead window rather
than the RPC size in this case.
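
One possible reading of the arithmetic, as a sketch only. The helper
and its parameters are hypothetical and work in units of pages; the
actual llite change may be structured differently.

```c
/*
 * Hypothetical helper: pick the start of the readahead window for a
 * read at page 'index'. Align to the RPC size normally, but to the
 * window size when the window is smaller than one RPC.
 */
static unsigned long ra_window_start_sketch(unsigned long index,
					    unsigned long window_pages,
					    unsigned long rpc_pages)
{
	unsigned long align = window_pages < rpc_pages ?
			      window_pages : rpc_pages;

	if (align == 0)
		return index;
	return index - index % align;
}
```

With 256 pages per RPC, a 16-page window and a read at page 100,
RPC alignment would start the window at page 0, so it would end at
page 16, behind the read. Aligning to the window starts it at page 96,
so it covers pages 96 to 111 and includes the read.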

Change-Id: I0cd33ac7f92a75f38c926db33630f3036bbfd6c7
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/38132
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13004 lnet: always pass struct lnet_md by reference. 53/37853/10
Mr NeilBrown [Wed, 4 Dec 2019 07:28:11 +0000 (18:28 +1100)]
LU-13004 lnet: always pass struct lnet_md by reference.

Both LNetMDAttach and LNetMDBind expected a struct lnet_md to be
passed by value.  This requires copying the data structure onto the
stack, which is a waste of stack space and brings no value.

So change them to expect a reference, and declare it 'const' to be
sure it doesn't get changed.
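
The shape of the change can be sketched as follows. struct md_sketch
and both functions are hypothetical stand-ins for illustration, not
the real struct lnet_md or the LNetMDAttach/LNetMDBind signatures.

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory descriptor. */
struct md_sketch {
	void *start;
	unsigned int length;
	int threshold;
	unsigned int options;
	void *user_ptr;
};

/* Before: the whole structure is copied for every call. */
static unsigned int md_attach_by_value(struct md_sketch md)
{
	return md.length;
}

/*
 * After: only a pointer is passed, and 'const' lets the compiler
 * reject accidental writes through it.
 */
static unsigned int md_attach_by_ref(const struct md_sketch *md)
{
	return md->length;
}
```

Callers change from md_attach_by_value(md) to md_attach_by_ref(&md).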

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I343797d1e70cc85fde92d544e56536e982e02973
Reviewed-on: https://review.whamcloud.com/37853
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13004 gnilnd: remove support for GNILND_BUF_VIRT_* 47/37847/8
Mr NeilBrown [Wed, 4 Dec 2019 05:26:07 +0000 (16:26 +1100)]
LU-13004 gnilnd: remove support for GNILND_BUF_VIRT_*

GNILND_BUF_VIRT_UNMAPPED and GNILND_BUF_VIRT_MAPPED are
no longer set, so remove them and any code that only
runs when they are set.
    gnd_map_nvirt  gnd_map_virtnob
can go too.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If394bc2cf64f903ed4cdb1e1e80a2a017accd562
Reviewed-on: https://review.whamcloud.com/37847
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 years agoLU-13004 gnilnd: discard struct kvec arg. 46/37846/9
Mr NeilBrown [Wed, 4 Dec 2019 04:42:17 +0000 (15:42 +1100)]
LU-13004 gnilnd: discard struct kvec arg.

The 'struct kvec *' arg to kgnilnd_setup_rdma_buffer()
and kgnilnd_setup_immediate_buffer() is now always
NULL.  So we can remove the arg and code that handles
non-NULL values.
This means that kgnilnd_setup_virt_buffer() can
disappear completely.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib38494693fba521a6e3dc4e6dc0cbb33dea1595b
Reviewed-on: https://review.whamcloud.com/37846
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>