Whamcloud - gitweb
fs/lustre-release.git
15 months agoLU-12275 sec: documentation for client-side encryption 59/38759/3
Sebastien Buisson [Thu, 28 May 2020 07:11:20 +0000 (09:11 +0200)]
LU-12275 sec: documentation for client-side encryption

Add several documents about client-side encryption under
Documentation/client_side_encryption:
- threat_model.txt is the description of the threat model for Lustre
  client-side encryption;
- key_hierarchy.txt is the description of the key hierarchy for Lustre
  client-side encryption;
- modes_usage.txt is the description of the encryption modes and usage
  for Lustre client-side encryption;
- access_semantics.txt is the description of the access semantics for
  Lustre client-side encryption.

As we rely on kernel's fscrypt library for this feature, fscrypt's
concepts are largely valid. These documents are inspired by fscrypt
documentation in the Linux kernel tree, see
Documentation/filesystems/fscrypt.rst

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6c9d42e572111ed2a3388e4f58b2560f365a5853
Reviewed-on: https://review.whamcloud.com/38759
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-11025 dne: directory restripe and auto split 84/37284/19
Lai Siyao [Mon, 30 Dec 2019 15:27:27 +0000 (23:27 +0800)]
LU-11025 dne: directory restripe and auto split

A dedicated restriper thread is created for each MDT. It performs
three tasks in a loop:
1. If there is a directory whose total number of sub-files exceeds
   the threshold (50000 by default, adjustable with "lctl set_param
   mdt.*.dir_split_count=N"), split this directory by adding new
   stripes (4 stripes by default, adjustable with
   "lctl set_param mdt.*.dir_split_delta=N").
2. If a directory stripe LMV is marked 'MIGRATION', migrate the sub
   file at the current offset, and update the offset to the next file.
3. If a directory master LMV is marked 'RESTRIPING', check whether
   the 'MIGRATION' flag is cleared on all stripe LMVs; if so, clear
   the 'RESTRIPING' flag and update the directory LMV.

In the previous patch, the first part of manual directory restripe
was implemented; this patch adds sub-file migration and the directory
layout update. Directory auto-split is done in a similar way, except
that the first step is also done by this thread.

Directory auto-split can be enabled/disabled with "lctl set_param
mdt.*.enable_dir_auto_split=[0|1]"; it is turned on by default.

Auto-split is triggered at the end of getattr(): since the attr now
contains the dirent count, check whether it exceeds the threshold; if
so, add this directory to the mdr_auto_split list and wake up the dir
restriper thread.
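
The trigger described above can be sketched roughly as follows. This is
an illustrative Python model, not the MDT code: the function name, the
queue object and the duplicate check are assumptions made for the
example; only the 50000 default and the enable switch come from the
text.

```python
DIR_SPLIT_COUNT = 50000  # default threshold quoted in the commit message


def maybe_queue_auto_split(dir_id, dirent_count, auto_split_queue,
                           enabled=True, threshold=DIR_SPLIT_COUNT):
    """Queue dir_id for splitting if its dirent count exceeds the threshold.

    Returns True when the directory was queued (the caller would then
    wake the restriper thread), False otherwise.
    """
    if not enabled or dirent_count <= threshold:
        return False
    if dir_id in auto_split_queue:
        # Assumption for this sketch: do not queue the same directory twice.
        return False
    auto_split_queue.append(dir_id)
    return True
```

A caller would invoke this at the end of its getattr handling and wake
the worker only when the function returns True.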

Restripe migration is also triggered in getattr(): if the object is a
directory stripe and the LMV 'MIGRATION' flag is set, add this object
to the mdr_restripe_migrate list and wake up the dir restriper thread.

Directory layout update is similar: if the current directory is
striped and the LMV 'RESTRIPING' flag is set, add this directory to
the mdr_restripe_update list and wake up the restriper thread.

By default, restripe migrates the dirent only and leaves the inode
unchanged; this can be adjusted with
"lctl set_param mdt.*.dir_restripe_nsonly=[0|1]".

Currently DoM file inode migration is not supported, so only the
dirent is migrated for such files, to avoid leaving dir
migration/restripe unfinished.

Add sanity.sh 230o, 230p and 230q, and adjust 230j since only the
dirent is migrated for DoM files.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8c83b42e4acbaab067d0092d0b232de37f956588
Reviewed-on: https://review.whamcloud.com/37284
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-9859 libcfs: discard libcfs_prim.h 73/38673/2
Mr. NeilBrown [Wed, 20 May 2020 11:43:42 +0000 (07:43 -0400)]
LU-9859 libcfs: discard libcfs_prim.h

This file no longer contains enough content
to justify a separate file.  So merge with
libcfs.h.

Linux-commit: 7673fd6b6af0c234e8ed5ec94c4da083b2f7d354

Change-Id: I4f486f0356f14e564032ed22e2e439fe4e65942c
Test-Parameters: trivial
Signed-off-by: Mr. NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/38673
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-11310 ldiskfs: Repair support for SUSE 15 again 11/38611/4
Mr NeilBrown [Fri, 15 May 2020 04:06:29 +0000 (14:06 +1000)]
LU-11310 ldiskfs: Repair support for SUSE 15 again

A recent patch split ext4-htree-lock.patch out from the various
ext4-pdirop.patch files, but this happened before various
files were copied and modified for extra sles15 support.

The patch makes the necessary checks to bring the
htree-lock split to sle15.

Fixes: 46ed28c0d10a ("LU-11310 ldiskfs: Repair support for SUSE 15 GA and SP1")
Fixes: 42880f9502ba ("LU-13054 ldiskfs: split htree_lock as separate patch")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I464aaebc52d5b2b73fdd748c7d4dbbaf43f1ac49
Reviewed-on: https://review.whamcloud.com/38611
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-9859 libcfs: merge linux-debug.c into debug.c 02/38602/4
Mr NeilBrown [Thu, 14 May 2020 17:31:39 +0000 (13:31 -0400)]
LU-9859 libcfs: merge linux-debug.c into debug.c

There is no important difference between the contents
of these files, so merge them into one.

Test-Parameters: trivial
Change-Id: I32ac9215f317b305092623e0743a530e18e4d9c1
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38602
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13509 ptlrpc: Clear bd_registered in ptlrpc_unregister_bulk 57/38457/4
Chris Horn [Sat, 2 May 2020 15:37:15 +0000 (10:37 -0500)]
LU-13509 ptlrpc: Clear bd_registered in ptlrpc_unregister_bulk

The patch for LU-12816 https://review.whamcloud.com/36309 has us
clearing the bd_registered flag in ptl_send_rpc(). This flag is set
in ptlrpc_register_bulk(), so it makes sense for us to clear it in
ptlrpc_unregister_bulk(). When we're cleaning up in ptl_send_rpc()
we can be sure the flag is cleared with the call to
ptlrpc_unregister_bulk().

This commit also adds a test case for the LU-12816 bug.

Fixes: e6225c07ce4c ("LU-12816 ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM")
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Iabaf109aaf72894cd5acbcacbb0299929ea1a146
Reviewed-on: https://review.whamcloud.com/38457
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12477 ldiskfs: drop SUSE kernel 4.4 and earlier 68/38268/10
Shaun Tancheff [Tue, 19 May 2020 22:12:40 +0000 (17:12 -0500)]
LU-12477 ldiskfs: drop SUSE kernel 4.4 and earlier

This patch drops ldiskfs support for SLES 12 sp1 through sp3.
SLES 12 sp4 and sp5 use a 4.12.14 kernel, but no specific
testing has been done for those kernels.

The SUSE 15 ga 4.12.14-150 and sp1 4.12.14-197.7 releases are
tested and known to work.

Remove unused kernel patches for sles and any patches not
referenced in a series.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Ia13088481b7304d4931ecbb6946a031a851cfe89
Reviewed-on: https://review.whamcloud.com/38268
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13437 lmv: check stripe FID sanity 60/38560/2
Lai Siyao [Fri, 8 May 2020 14:53:47 +0000 (22:53 +0800)]
LU-13437 lmv: check stripe FID sanity

A striped directory layout may be broken; if any stripe FID is insane,
return -ENODEV.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I7ed8c7c561e34625e2cb29bfd14bc0ecf3fce46c
Reviewed-on: https://review.whamcloud.com/38560
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-10973 lnet: infrastructure to build the LUTF 84/38084/8
Amir Shehata [Wed, 25 Mar 2020 01:48:55 +0000 (18:48 -0700)]
LU-10973 lnet: infrastructure to build the LUTF

Add flags to turn LUTF building on/off.
Modify the gitignore to ignore .i files, which are the
swig interface files used to create Python-callable APIs
from C APIs.

Add m4 files to search for the python and swig installations needed
for building the LUTF.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Idbd23dc457c95425edbf88755ae261ff4de6b0c9
Reviewed-on: https://review.whamcloud.com/38084
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13297 tests: parallel-scale enhancement 32/37732/5
Elena Gryaznova [Wed, 26 Feb 2020 16:45:26 +0000 (19:45 +0300)]
LU-13297 tests: parallel-scale enhancement

Patch changes parallel-scale tests to use t-f test_mkdir()
instead of mkdir to have the possibility to run these tests on
striped directories.

Test-Parameters: trivial testlist=parallel-scale,parallel-scale-nfsv3
Cray-bug-id: LUS-8291
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I6a0d52d7115668ef2bc7397a9a1012dbcb9e0526
Reviewed-on: https://review.whamcloud.com/37732
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12205 tests: host_nids_address() fix for MR setup 23/34723/10
Elena Gryaznova [Fri, 19 Apr 2019 12:39:32 +0000 (15:39 +0300)]
LU-12205 tests: host_nids_address() fix for MR setup

Patch fixes t-f:host_nids_address() to work properly
on multiple networks setup.

Example:
With MR setup we have:
lctl list_nids
192.168.101.3@tcp
10.0.101.3@tcp1
10.0.201.3@tcp2
For NETTYPE=tcp host_nids_address() should give the result
192.168.101.3 only.
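
The expected filtering can be illustrated with a short Python sketch.
The real host_nids_address() is a shell function in the test framework;
the function name and the exact-match rule below are assumptions made
for illustration only.

```python
def nids_for_nettype(nids, nettype):
    """Return the addresses of the NIDs whose network is exactly nettype."""
    addresses = []
    for nid in nids:
        # A NID has the form "<address>@<network>", e.g. "10.0.101.3@tcp1".
        address, _, net = nid.partition("@")
        if net == nettype:
            addresses.append(address)
    return addresses


nids = ["192.168.101.3@tcp", "10.0.101.3@tcp1", "10.0.201.3@tcp2"]
print(nids_for_nettype(nids, "tcp"))  # prints ['192.168.101.3']
```

Comparing the whole network name, rather than a prefix, is what keeps
tcp1 and tcp2 out of the result for NETTYPE=tcp.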

Test-Parameters: trivial testlist=sanity,sanityn,sanity-sec,\
lnet-selftest,conf-sanity,obdfilter-survey

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-7150
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Change-Id: Ida397f1811be142c5aa8813f32461b83d6113fc2
Reviewed-on: https://review.whamcloud.com/34723
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-2225 tests: sanity/27 tests to poll for state 88/4388/10
Alex Zhuravlev [Mon, 29 Apr 2019 15:30:24 +0000 (18:30 +0300)]
LU-2225 tests: sanity/27 tests to poll for state

- reset_enospc() to poll when precreate_status is OK
- exhaust_precreations() to wait one time, not for every OST

ONLY=27 OSTCOUNT=7 sh sanity: 641 sec before and 373 sec after

Test-Parameters: trivial testlist=sanity fstype=zfs
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I97366cb046c50223020f2161603657056a602cd5
Reviewed-on: https://review.whamcloud.com/4388
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
15 months agoLU-6142 utils: Fix style issues for liblustreapi.c 09/38709/2
Arshad Hussain [Fri, 22 May 2020 20:43:41 +0000 (02:13 +0530)]
LU-6142 utils: Fix style issues for liblustreapi.c

This patch fixes issues reported by checkpatch
for file lustre/utils/liblustreapi.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I901dfbd7e489e4a074b329c59370f76e0d87fe31
Reviewed-on: https://review.whamcloud.com/38709
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 utils: Fix style issues for lctl.c 08/38708/2
Arshad Hussain [Fri, 22 May 2020 19:27:33 +0000 (00:57 +0530)]
LU-6142 utils: Fix style issues for lctl.c

This patch fixes issues reported by checkpatch
for file lustre/utils/lctl.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I0e9f9dd9db9f5ba64371283ce2afcabcead0b370
Reviewed-on: https://review.whamcloud.com/38708
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
15 months agoLU-6142 utils: Fix style issues for llog_reader.c 06/38706/2
Arshad Hussain [Fri, 22 May 2020 15:22:01 +0000 (20:52 +0530)]
LU-6142 utils: Fix style issues for llog_reader.c

This patch fixes issues reported by checkpatch
for file lustre/utils/llog_reader.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I02af3385be5521ef5ed9063926e846059067b8ab
Reviewed-on: https://review.whamcloud.com/38706
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
15 months agoLU-6142 utils: Fix style issues for mount_lustre.c 42/38642/3
Arshad Hussain [Sat, 16 May 2020 05:57:22 +0000 (11:27 +0530)]
LU-6142 utils: Fix style issues for mount_lustre.c

This patch fixes issues reported by checkpatch
for file lustre/utils/mount_lustre.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie664a7726805a6f699671b3703887852a1ee82f3
Reviewed-on: https://review.whamcloud.com/38642
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 lustre: convert some container_of to *_safe 84/38384/2
Mr NeilBrown [Mon, 27 Apr 2020 05:42:38 +0000 (15:42 +1000)]
LU-6142 lustre: convert some container_of to *_safe

For each of these uses of container_of0(), local inspection cannot
establish that a valid pointer is always received, so container_of()
cannot be used.
Convert them to the upstream standard container_of_safe().

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I7d5551ae4d88bc931f7edbd3447b5bb2db8ce40c
Reviewed-on: https://review.whamcloud.com/38384
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13546 pcc: exclude mmap_sanity tst8/tst9 from test list 98/38598/3
Qian Yingjin [Thu, 14 May 2020 10:16:48 +0000 (18:16 +0800)]
LU-13546 pcc: exclude mmap_sanity tst8/tst9 from test list

The current RHEL8 kernel does not strictly obey POSIX semantics for
mmap() accesses within the mapping but beyond the current end of the
underlying file: it does not send SIGBUS to the process.

For a negative file offset, mmap_sanity also fails on a 48-bit
ldiskfs backend on the new RHEL kernel because the offset is too large:
"Value too large for defined data type".

For the above reasons, mmap_sanity tst8 and tst9 both fail on the
new RHEL8 kernel. Thus, we exclude mmap_sanity tst8 and tst9 from the
sanity-pcc and sanity test lists.

Test-Parameters: trivial clientdistro=el8 testlist=sanity-pcc,sanityn
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I6252852fac08fea609444613c59ae138891d8fb8
Reviewed-on: https://review.whamcloud.com/38598
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-10401 procs: print new line based on distro 99/38699/7
Yang Sheng [Fri, 22 May 2020 04:06:47 +0000 (12:06 +0800)]
LU-10401 procs: print new line based on distro

Upstream changed to print the new line in the module
parameter callback instead of in the kernel itself, so we
need to test the output of param_get_byte to determine
whether to output the new line.
(upstream: v4.14-rc3-148-g96802e6b1dbf)
Also output the filename and file content when sanity
test 133h fails.

Test-Parameters: trivial testlist=sanity clientdistro=el8.1 serverdistro=el8.1
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ibb5961e4de8b05d9dd59875e4fd38a42fa07d0d6
Reviewed-on: https://review.whamcloud.com/38699
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoNew tag 2.13.54 2.13.54 v2_13_54
Oleg Drokin [Wed, 27 May 2020 22:45:23 +0000 (18:45 -0400)]
New tag 2.13.54

Change-Id: Ifa907ef1d982bd4a5e777c8eacd753e4e0ba4385

15 months agoLU-13553 lnd: gracefully handle unexpected events 69/38669/2
Amir Shehata [Wed, 20 May 2020 05:21:10 +0000 (22:21 -0700)]
LU-13553 lnd: gracefully handle unexpected events

When a tx completes, the kiblnd_tx_complete() callback is invoked.
We ensure:
LASSERT (tx->tx_sending > 0);
However, this assert is being triggered in some rare scenarios.
tx_sending can be 0 at this point because:
 1. ib_post_send() failed but the OFED stack is still sending
    a tx complete event.
 2. We're getting two different events for the same tx.

Instead of asserting, ignore that tx_complete event and print
the tx pointer and its status.
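
The change in behavior, from asserting to logging and ignoring, can be
modeled in a few lines of Python. This is only a sketch of the idea,
not the ko2iblnd code: the Tx class, the tx_complete() function and the
logger name are hypothetical; tx_sending is the only name taken from
the text.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("tx_sketch")  # hypothetical logger name


@dataclass
class Tx:
    tx_sending: int = 0  # number of sends still in flight
    tx_status: int = 0   # first non-zero completion status seen


def tx_complete(tx, status):
    """Handle a completion event; return False if it was unexpected."""
    if tx.tx_sending <= 0:
        # Previously an assertion; now report the event and ignore it.
        log.warning("unexpected completion for %r, status %d", tx, status)
        return False
    tx.tx_sending -= 1
    if status != 0 and tx.tx_status == 0:
        tx.tx_status = status
    return True
```

A second completion for the same tx then leaves the counters untouched
instead of bringing the node down.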

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8cd192538c0c80abaef23a4b6e6906936043060b
Reviewed-on: https://review.whamcloud.com/38669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13585 tests: add mustfail check 62/38662/4
Elena Gryaznova [Tue, 19 May 2020 14:21:59 +0000 (17:21 +0300)]
LU-13585 tests: add mustfail check

Patch adds the possibility to ignore MPI load failures
for particular instances.

This is useful for Quota Pools stress tests, which are supposed
to randomly hit QP limits.
The subset of expected failures is set by specifying NINSTMUSTFAIL:
      0 - mpi tests from all clients must pass (default)
      1 - mpi tests from all clients must fail
      N - mpi tests from one client out of every N must fail.
Set NINSTMUSTFAIL=2 to expect every 2nd mpi instance to fail and
NINSTMUSTFAIL=3 to expect every 3rd mpi instance to fail.

For the QP test, different limits are set for users per pool: half
of the users have a small limit, which makes IOR fail:
  small limit is set for user1, user3, user5
  large limit is set for user2, user4
Run N IOR instances on N clients; each client/instance uses its own
user{1..N}. The test is considered passed if the IOR instances failed
on client1, client3, client5.
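
One way to read the NINSTMUSTFAIL rule is sketched below in Python. The
helper names are hypothetical, and the choice of which instance in each
group of N must fail (the first one) is an assumption taken from the QP
example, where client1, client3 and client5 fail.

```python
def must_fail(instance, ninstmustfail):
    """Return True if the 1-based MPI instance is expected to fail."""
    if ninstmustfail <= 0:
        return False  # 0: every instance must pass
    # Assumption: the first instance of every group of N must fail.
    return (instance - 1) % ninstmustfail == 0


def run_passed(results, ninstmustfail):
    """results maps a 1-based instance number to True if it succeeded."""
    return all(ok != must_fail(i, ninstmustfail)
               for i, ok in results.items())
```

With NINSTMUSTFAIL=2 and five instances, the run counts as passed only
when instances 1, 3 and 5 fail and instances 2 and 4 succeed.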

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <elena.gryaznova@hpe.com>
HPE-bug-id: LUS-8844, LUS-8504, LUS-8602
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@hpe.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Change-Id: Ia7c4e394c3724190d6cff9f086f8837e54f6110d
Reviewed-on: https://review.whamcloud.com/38662
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-11621 utils: optimize lhsmtool_posix with copy_file_range() 51/38651/5
James Simmons [Tue, 19 May 2020 18:59:40 +0000 (14:59 -0400)]
LU-11621 utils: optimize lhsmtool_posix with copy_file_range()

Newer kernels and glibc offer copy_file_range() which avoids
a context switch needed with read() + write() for file data
copying. In the future Lustre can look to optimize this copy
on the server backend. Update lhsmtool_posix to use this new
functionality.
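
The copy pattern can be sketched in Python, which exposes the same
system call as os.copy_file_range() (Python 3.8+, Linux). This is an
illustration under assumptions, not the lhsmtool_posix code: the
function name, the chunk size and the fallback policy are choices made
for the example.

```python
import os

CHUNK = 1 << 20  # 1 MiB per call; an arbitrary choice for this sketch


def copy_data(src_fd, dst_fd, length):
    """Copy length bytes from src_fd to dst_fd; return the bytes copied."""
    copied = 0
    use_cfr = hasattr(os, "copy_file_range")
    while copied < length:
        want = min(CHUNK, length - copied)
        if use_cfr:
            try:
                # The kernel moves the data without a user-space buffer.
                n = os.copy_file_range(src_fd, dst_fd, want)
            except OSError:
                use_cfr = False  # e.g. ENOSYS or EXDEV: use read/write
                continue
        else:
            buf = os.read(src_fd, want)
            n = len(buf)
            view = memoryview(buf)
            while view:  # os.write() may write fewer bytes than asked
                view = view[os.write(dst_fd, view):]
        if n == 0:
            break  # source ended before length bytes were copied
        copied += n
    return copied
```

Both paths advance the file offsets of the two descriptors, so the
fallback can resume where the failed call left off.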

Change-Id: If63d57b89902f3d5c9ddde66901f3f55d15080f5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38651
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13557 quota: remove inline declarations 95/38595/5
Alex Zhuravlev [Thu, 14 May 2020 06:22:45 +0000 (09:22 +0300)]
LU-13557 quota: remove inline declarations

which can't really be used, given that the function bodies aren't in the .h file

Fixes: 09f9fb3211 ("LU-11023 quota: quota pools for OSTs")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia8bb5d04185ec6a779c872a9825c23034030e605
Reviewed-on: https://review.whamcloud.com/38595
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13555 build: Map mainline kernel on rhel to rhel 94/38594/2
Shaun Tancheff [Wed, 13 May 2020 23:57:57 +0000 (18:57 -0500)]
LU-13555 build: Map mainline kernel on rhel to rhel

Mainline kernel builds on RHEL can map to RHEL directly, so
the MAINLINE_KERNEL abstraction is not needed and can be
removed.

Test-Parameters: trivial
HPE-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I80ba0477c405b9d0f12b4d472f244bb9b15999ff
Reviewed-on: https://review.whamcloud.com/38594
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13510 lnd: Allow independent socklnd timeout 60/38460/6
Chris Horn [Sat, 2 May 2020 15:18:42 +0000 (10:18 -0500)]
LU-13510 lnd: Allow independent socklnd timeout

Allow the socklnd timeout to be set independent of
lnet_transaction_timeout and retry_count.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Iaa76e77990c8c5ce79193ae8d1f7b3a7db6b433f
Reviewed-on: https://review.whamcloud.com/38460
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13510 lnd: Allow independent ko2iblnd timeout 59/38459/6
Chris Horn [Sat, 2 May 2020 15:13:41 +0000 (10:13 -0500)]
LU-13510 lnd: Allow independent ko2iblnd timeout

Allow ko2iblnd timeout parameter to be set independent of the
lnet_transaction_timeout and retry_count.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I5b9a0da83c597c77f597db0c5cebbd933b5988fc
Reviewed-on: https://review.whamcloud.com/38459
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13510 lnet: Add lnet_lnd_timeout to lnetctl 64/38464/5
Chris Horn [Sun, 3 May 2020 15:02:57 +0000 (10:02 -0500)]
LU-13510 lnet: Add lnet_lnd_timeout to lnetctl

Add lnet_lnd_timeout to lnetctl. The param is read-only since it is
calculated from transaction_timeout and retry_count.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I516d2d8082951014835c9e8c8a7ac2111f48e7ce
Reviewed-on: https://review.whamcloud.com/38464
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13510 lnet: Add lnet_lnd_timeout to sysfs 82/38482/3
Chris Horn [Mon, 4 May 2020 18:29:41 +0000 (13:29 -0500)]
LU-13510 lnet: Add lnet_lnd_timeout to sysfs

Allow lnet_lnd_timeout to be read (only) from sysfs.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I8bdbf0f6a51a798f3395238e50a2ebb1fdb64911
Reviewed-on: https://review.whamcloud.com/38482
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13510 lnet: Correct the default LND timeout 81/38481/3
Chris Horn [Mon, 4 May 2020 18:24:51 +0000 (13:24 -0500)]
LU-13510 lnet: Correct the default LND timeout

Default LND timeout is currently too low. To allow for
lnet_retry_count resend attempts within a single
lnet_transaction_timeout window, the LND timeout needs to be less
than lnet_transaction_timeout / lnet_retry_count. If the retry
count is 0, we still want LND timeout to be less than the LNet
transaction timeout.

Also, be sure to update the LND timeout when health is toggled on or
off.
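
The two constraints above can be checked with a small Python sketch.
The commit message does not give the formula it uses; the integer
expression below is just one choice that satisfies both constraints,
and the function name and argument names are hypothetical.

```python
def lnd_timeout(transaction_timeout, retry_count):
    """Return an LND timeout that leaves room for retry_count resends.

    Assumed formula: (transaction_timeout - 1) // (retry_count + 1).
    It is below transaction_timeout / retry_count when retry_count > 0,
    and below transaction_timeout when retry_count == 0.
    """
    if retry_count < 0:
        raise ValueError("retry_count must not be negative")
    timeout = (transaction_timeout - 1) // (retry_count + 1)
    if timeout < 1:
        raise ValueError("transaction_timeout too small for retry_count")
    return timeout
```

For example, a transaction timeout of 50 with a retry count of 2 gives
16 under this formula, which is less than 50 / 2 = 25.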

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ifd6d97895192a321081aa09ebe9f1d0115e63305
Reviewed-on: https://review.whamcloud.com/38481
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13485 build: Enable 2 stage configure tests 47/38347/4
Shaun Tancheff [Sun, 10 May 2020 20:14:54 +0000 (15:14 -0500)]
LU-13485 build: Enable 2 stage configure tests

This idea was implemented by OpenZFS a while ago. This
is heavily inspired by the OpenZFS work.

Here we enable splitting compile tests into two
distinct parts that share an internal unique name.

The source half can then be built in parallel and
the results can be determined based on the build
artifacts.

Tests which depend on order of execution and/or the
result of a previous test are not well suited for
being converted. However the majority of lustre
compile tests can be run in parallel.

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: If01ccdfdf4810ecc2d616da3fa6b7ca786fe760f
Reviewed-on: https://review.whamcloud.com/38347
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13472 lnet: set route aliveness properly 23/38323/6
Amir Shehata [Thu, 23 Apr 2020 00:54:34 +0000 (17:54 -0700)]
LU-13472 lnet: set route aliveness properly

When discovery is toggled from on to off, the route aliveness
might become stale because the route->lr_alive variable is not
updated correctly. It will get updated once the gateway is pinged.
However, there is a period of at most alive_router_check_interval
during which the route can be down.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ic1754d6e7ddc9398efc7a64f823a70e5546e9ca6
Reviewed-on: https://review.whamcloud.com/38323
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13477 lnet: Force full discovery cycle 22/38322/5
Amir Shehata [Thu, 23 Apr 2020 00:47:05 +0000 (17:47 -0700)]
LU-13477 lnet: Force full discovery cycle

There are scenarios where there could be a discrepancy between
cached peer information and reality. In these cases, incomplete
interface information might end up being cached because one side
determined that the peer didn't require a PUSH. This leads to
undesired MR behavior, where not all the interfaces are used for
a period of time.

Therefore, it is safer to always force a full discovery cycle:
GET/PUSH to ensure both sides are up-to-date.

In the NMR case, when discovery is turned off, make sure to flag
discovery as complete to avoid stalling the state machine.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie49ad11e8ff874206baa268a4ef2d58ebb536ed5
Reviewed-on: https://review.whamcloud.com/38322
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13478 lnet: handle discovery off properly 21/38321/5
Amir Shehata [Thu, 23 Apr 2020 00:26:48 +0000 (17:26 -0700)]
LU-13478 lnet: handle discovery off properly

Peers only need to be updated when discovery is toggled from
on to off. This way the peers don't attempt to send to a
non-primary NID of the node. However, when discovery is
toggled from off to on, the peer will attempt rediscovery
and the peer information will eventually consolidate.

In order to properly delete the peer only when it makes sense
we have to differentiate between the case when we get the
initial message and when we get a push for an already discovered
peer. We only want to delete our local representation if the peer
is one we have already had in our records.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id6a7353276fec82fddf90e0fa9d85d165b459c8d
Reviewed-on: https://review.whamcloud.com/38321
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13464 target: abort recovery if timer fail 77/38277/7
Hongchao Zhang [Thu, 14 May 2020 10:25:46 +0000 (18:25 +0800)]
LU-13464 target: abort recovery if timer fail

During target recovery, the recovery timer should be kept armed to
ensure that recovery doesn't take too long. If the deadline of the
recovery timer has passed and recovery has not completed yet,
something has gone wrong, and the recovery should be aborted in
this case.

Change-Id: Id44f2a2d1a3183ad8dd13f4d34392713c55a2cb3
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38277
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13258 obdclass: bind zombie export cleanup workqueue 12/38212/11
James Simmons [Mon, 11 May 2020 22:38:10 +0000 (18:38 -0400)]
LU-13258 obdclass: bind zombie export cleanup workqueue

Lustre uses a workqueue to clear out stale exports. Bind this
workqueue to the cores used by Lustre defined by the CPT setup.

Move the code handling workqueue binding to libcfs so it can be
used by everyone.

Rename CONFIG_LUSTRE_PINGER to CONFIG_LUSTRE_FS_PINGER to match
linux client.

Change-Id: Ifa109f6a93e6ec6bbdef5e91fe8ca1cde0eaea3e
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38212
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13412 llite: fix read if readahead window smaller than rpc size 32/38132/3
Wang Shilong [Fri, 3 Apr 2020 13:14:25 +0000 (21:14 +0800)]
LU-13412 llite: fix read if readahead window smaller than rpc size

Readahead always tries to align the readahead window with the RPC
size, but this causes a problem if the readahead window is smaller
than the RPC size.

With the current code, reads fall back to many 4k reads because
the RPC-aligned window start plus the window pages ends up behind
the current read position. Fix this by aligning with the readahead
window rather than the RPC size in this case.
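
The alignment problem can be sketched in userspace C. This is an
illustration only, not the llite code: ra_window_start() and
window_covers() are hypothetical helpers, and the page counts in the
usage note are made-up numbers.

```c
/* Round idx down to a multiple of align (align must be non-zero). */
static unsigned long align_down(unsigned long idx, unsigned long align)
{
	return idx - (idx % align);
}

/*
 * Hypothetical helper: pick the first page of the readahead window for
 * a read at page index pos.  Align to the RPC size unless the window is
 * smaller than one RPC, in which case align to the window size instead.
 */
static unsigned long ra_window_start(unsigned long pos,
				     unsigned long window_pages,
				     unsigned long rpc_pages)
{
	unsigned long align = window_pages < rpc_pages ?
			      window_pages : rpc_pages;

	return align_down(pos, align);
}

/* Return 1 if [start, start + window_pages) contains page pos. */
static int window_covers(unsigned long start, unsigned long window_pages,
			 unsigned long pos)
{
	return pos >= start && pos < start + window_pages;
}
```

With a 64-page window, a 1024-page RPC and a read at page 1100,
aligning to the RPC size gives a window of [1024, 1088), which lies
behind the read. Aligning to the window size gives [1088, 1152),
which covers it.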

Change-Id: I0cd33ac7f92a75f38c926db33630f3036bbfd6c7
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/38132
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 lnet: always pass struct lnet_md by reference. 53/37853/10
Mr NeilBrown [Wed, 4 Dec 2019 07:28:11 +0000 (18:28 +1100)]
LU-13004 lnet: always pass struct lnet_md by reference.

Both LNetMDAttach and LNetMDBind expected a struct lnet_md to be
passed by value.  This requires copying the data structure onto the
stack, which is a waste of stack space and brings no value.

So change them to expect a reference, and declare it 'const' to be
sure it doesn't get changed.
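
A minimal userspace sketch of the difference, assuming a hypothetical
struct md_desc as a stand-in for struct lnet_md (the fields and the
padding are invented for illustration):

```c
#include <stddef.h>

/* Hypothetical stand-in for a descriptor such as struct lnet_md. */
struct md_desc {
	void		*start;
	size_t		 length;
	unsigned int	 options;
	char		 padding[256];	/* makes a by-value copy costly */
};

/* By value: the whole structure is copied for the callee. */
static size_t md_length_by_value(struct md_desc md)
{
	return md.length;
}

/* By reference: only a pointer is passed; const forbids modification. */
static size_t md_length_by_ref(const struct md_desc *md)
{
	return md->length;
}
```

Both functions return the same result, but the second passes only a
pointer, and the compiler rejects any attempt to write through it.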

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I343797d1e70cc85fde92d544e56536e982e02973
Reviewed-on: https://review.whamcloud.com/37853
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 gnilnd: remove support for GNILND_BUF_VIRT_* 47/37847/8
Mr NeilBrown [Wed, 4 Dec 2019 05:26:07 +0000 (16:26 +1100)]
LU-13004 gnilnd: remove support for GNILND_BUF_VIRT_*

GNILND_BUF_VIRT_UNMAPPED and GNILND_BUF_VIRT_MAPPED are
no longer set, so remove them and any code that only
runs when they are set.
    gnd_map_nvirt  gnd_map_virtnob
can go too.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If394bc2cf64f903ed4cdb1e1e80a2a017accd562
Reviewed-on: https://review.whamcloud.com/37847
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 gnilnd: discard struct kvec arg. 46/37846/9
Mr NeilBrown [Wed, 4 Dec 2019 04:42:17 +0000 (15:42 +1100)]
LU-13004 gnilnd: discard struct kvec arg.

The 'struct kvec *' arg to kgnilnd_setup_rdma_buffer()
and kgnilnd_setup_immediate_buffer() is now always
NULL.  So we can remove the arg and code that handles
non-NULL values.
This means that kgnilnd_setup_virt_buffer() can
disappear completely.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib38494693fba521a6e3dc4e6dc0cbb33dea1595b
Reviewed-on: https://review.whamcloud.com/37846
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13262 ldlm: no current source if lu_ref_del not in same tsk 24/37624/3
Bruno Faccini [Wed, 19 Feb 2020 13:48:48 +0000 (14:48 +0100)]
LU-13262 ldlm: no current source if lu_ref_del not in same tsk

Running with USE_LU_REF ("configure --enable-lu_ref") configured
triggers an LBUG (because the "ref->lf_failed > 0" condition is false)
due to using "current" as the lu_ref source, while in some cases
lu_ref_del() occurs within a different task context.
To avoid this, the lu_ref source is changed to the ldlm_lock address
by this patch.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ia35e31c1a722c03f97672025e2abff40486b3f76
Reviewed-on: https://review.whamcloud.com/37624
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
15 months agoLU-10934 llite: integrate statx() API with Lustre 74/36674/18
Qian Yingjin [Fri, 1 Nov 2019 08:58:26 +0000 (16:58 +0800)]
LU-10934 llite: integrate statx() API with Lustre

The statx() system call can take a bitmask to fetch
specific attributes of a file (e.g. st_uid, st_gid, st_mode, and
st_btime = file creation time), rather than fetching all of the
normal stat() attributes (such as st_size and st_blocks). It also
has an AT_STATX_DONT_SYNC mode which allows the kernel to return
cached attributes without flushing all of the client data and
fetching an accurate result from the server.
The prerequisites for adding statx() support to Lustre are in place:
1. statx() was added in Linux 4.11;
2. glibc supports statx() (glibc 2.28+ -> RHEL 8, Ubuntu 18.10+);
3. Support for stat(1) and ls(1) using statx(3) to fetch
   only the required attributes has landed in the upstream GNU
   coreutils package.

This patch integrates statx() API with Lustre so that we can take
advantage of the efficiencies available:
- Only fetch MDS attributes if STATX_SIZE, STATX_BLOCKS and
  STATX_MTIME are not requested, and avoid OSS glimpse RPCs
  completely;
- Hook this into statahead to avoid async glimpse locks (AGL) if
  OST information not needed;
- Enhance the MDS RPC interface to return the file creation time
  stored in both ldiskfs and ZFS already, and enable STATX_BTIME;
- Better support with AT_STATX_DONT_SYNC mode. Return the "lazy"
  attributes or cached attributes (even stale) on a client if
  available without any RPCs to servers (MDS and OSS).
- statx (lustre/test/statx): port of coreutils ls/stat that uses the
  statx(3) system call if the OS supports it.
- Test scripts: use statx() to verify the btime attribute and the
  advantages described above.
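
The mask test behind the first item can be sketched as follows. This
is a userspace illustration under assumptions, not the llite code:
OST_ATTR_MASK and statx_mask_needs_ost() are hypothetical names, and
only the STATX_* bits come from the kernel headers.

```c
#include <linux/stat.h>	/* STATX_* mask bits (kernel headers 4.11+) */

/* Attributes that, per the list above, need information from the OSTs. */
#define OST_ATTR_MASK	(STATX_SIZE | STATX_BLOCKS | STATX_MTIME)

/* Hypothetical helper: return 1 if the request needs OST attributes. */
static int statx_mask_needs_ost(unsigned int request_mask)
{
	return (request_mask & OST_ATTR_MASK) != 0;
}
```

A request for STATX_UID | STATX_GID | STATX_MODE | STATX_BTIME would
return 0 here, so it could be answered from MDS attributes alone,
while STATX_BASIC_STATS includes the size and would return 1.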

Test-Parameters: clientdistro=el8
Test-Parameters: clientdistro=ubuntu1804
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I8432c9029bad9dea3e1ebc13a0d6978131d9b929
Reviewed-on: https://review.whamcloud.com/36674
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-13528 llite: prevent MAX_DIO_SIZE 32-bit truncation 26/38526/4
Sebastien Buisson [Thu, 7 May 2020 06:59:40 +0000 (08:59 +0200)]
LU-13528 llite: prevent MAX_DIO_SIZE 32-bit truncation

On 4kB PAGE_SIZE systems, kmalloc can allocate up to 4MB, which makes
MAX_DIO_SIZE up to 682MB. This number can fit into 32 bits.
But on 64kB PAGE_SIZE systems, kmalloc can allocate up to 512MB, which
then makes MAX_DIO_SIZE up to 1365GB. This needs 64 bits to fit.
Make sure that MAX_DIO_SIZE is not wrongly truncated on any
platform, by casting it to size_t.
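
The figures above can be reproduced with a small model. This is a
sketch under assumptions: dio_limit() is a hypothetical helper, and
the 24-byte per-page entry size used in the usage note is inferred
from the numbers in this message, not taken from the Lustre source.
The real macro may apply further rounding.

```c
#include <stdint.h>

/*
 * Hypothetical model of the limit: how many bytes of direct I/O can be
 * described by one kmalloc'ed array of per-page entries.
 */
static uint64_t dio_limit(uint64_t kmalloc_max, uint64_t page_size,
			  uint64_t entry_size)
{
	return kmalloc_max / entry_size * page_size;
}

/* What is left of the same value after being squeezed into 32 bits. */
static uint32_t dio_limit_truncated(uint64_t kmalloc_max,
				    uint64_t page_size,
				    uint64_t entry_size)
{
	return (uint32_t)dio_limit(kmalloc_max, page_size, entry_size);
}
```

With a 4MB kmalloc limit and 4kB pages, dio_limit(4 << 20, 4096, 24)
is about 682MB and fits in 32 bits. With a 512MB limit and 64kB pages
the result is about 1365GB, so the truncated value no longer matches
the full one.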

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9d2c8c4a1ccf0abf0b7647e569b8454365369e8a
Reviewed-on: https://review.whamcloud.com/38526
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-6142 utils: Fix style issues for lustre_cfg.c 43/38643/2
Arshad Hussain [Sat, 16 May 2020 07:15:29 +0000 (12:45 +0530)]
LU-6142 utils: Fix style issues for lustre_cfg.c

This patch fixes issues reported by checkpatch
for file lustre/utils/lustre_cfg.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I4617daabd111309cac11c975df6ee0a897379115
Reviewed-on: https://review.whamcloud.com/38643
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 utils: Fix style issues for mkfs_lustre.c 56/38556/4
Arshad Hussain [Sat, 9 May 2020 09:24:47 +0000 (14:54 +0530)]
LU-6142 utils: Fix style issues for mkfs_lustre.c

This patch fixes issues reported by checkpatch
for file lustre/utils/mkfs_lustre.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ibf5833907b38dc67f84e88e7d2c188c6eb51773e
Reviewed-on: https://review.whamcloud.com/38556
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-9679 lnet: tidy lnet_discover and fix mem accounting bug. 44/38644/3
Mr NeilBrown [Sun, 17 May 2020 23:21:27 +0000 (09:21 +1000)]
LU-9679 lnet: tidy lnet_discover and fix mem accounting bug.

A recent patch introduced a memory accounting bug because "n_ids"
can change between the ALLOC call and the FREE call.

With this patch we fix that by ensuring n_ids doesn't change - the
current change is not needed.
Also:
 - discard 'max_intf' var.  It is always exactly lnet_interfaces_max,
   so just use that directly.
 - only copy back the number of interfaces found
 - report the number of interfaces actually copied.
 - Move the copy_to_user until after all locks and references are
   dropped so there is no need to re-take any locks.
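
Why the count must not change between the ALLOC and FREE calls can
be sketched in userspace. This is an illustration only: tracked_bytes,
ptr_array_alloc() and ptr_array_free() are hypothetical stand-ins for
size-based allocation accounting, not the libcfs macros.

```c
#include <stdlib.h>

/* Hypothetical byte counter standing in for allocation accounting. */
static long tracked_bytes;

/* Allocate an array of n pointers and account for its size. */
static void **ptr_array_alloc(size_t n)
{
	void **arr = calloc(n, sizeof(*arr));

	if (arr != NULL)
		tracked_bytes += (long)(n * sizeof(*arr));
	return arr;
}

/* Free the array; n must be the same count that was passed to alloc. */
static void ptr_array_free(void **arr, size_t n)
{
	if (arr == NULL)
		return;
	tracked_bytes -= (long)(n * sizeof(*arr));
	free(arr);
}
```

If the array is allocated with 16 entries but freed with a count of
4, the memory itself is released, yet the counter stays 12 pointers
too high. That is the kind of mismatch a changing count produces.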

Test-Parameters: trivial
Fixes: b1f6f3becedc ("LU-9679 libcfs: Add CFS_ALLOC_PTR_ARRAY and free")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib807ca4ce5235b28e7ae11d90e1942aff4454cfc
Reviewed-on: https://review.whamcloud.com/38644
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 llite: use %pd to report dentry names. 42/38442/3
Mr NeilBrown [Fri, 1 May 2020 05:46:33 +0000 (15:46 +1000)]
LU-6142 llite: use %pd to report dentry names.

Since Linux 3.12, it has been possible to use the "%pd" format
specifier to print a dentry name, so use that instead of "%.*s" and
having to pass both the length and the name.

Red Hat's 3.10 kernels also have this functionality backported.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I70303a2c375d54825e89a0ba6703914528c54c05
Reviewed-on: https://review.whamcloud.com/38442
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-10467 ptlrpc: change LONG_UNLINK to PTLRPC_REQ_LONG_UNLINK 05/38405/3
Mr NeilBrown [Tue, 28 Apr 2020 23:39:26 +0000 (09:39 +1000)]
LU-10467 ptlrpc: change LONG_UNLINK to PTLRPC_REQ_LONG_UNLINK

The name "LONG_UNLINK" is vague and generic.  Change it to
  PTLRPC_REQ_LONG_UNLINK
to make it clear it is about requests taking a long time,
and of interest to PTLRPC.

Test-Parameters: trivial
Suggested-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I990657da3ec780c982a7ae31c21e4c8c9064be17
Reviewed-on: https://review.whamcloud.com/38405
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
15 months agoLU-6142 lustre: convert some container_of0 to container_of 83/38383/3
Mr NeilBrown [Mon, 27 Apr 2020 05:38:49 +0000 (15:38 +1000)]
LU-6142 lustre: convert some container_of0 to container_of

Each of these calls to container_of0() can be determined from local
context to be passed a valid pointer, so it is best to use
container_of() directly to make this clear.
Either:
 - the returned pointer is dereferenced without being tested, or
 - the passed-in pointer is dereferenced before the call, or
 - the passed-in pointer cannot be NULL, such as when
   it is a '.next' of a list_head or returned by lu_object_next()

So convert all of these to container_of()

... except one which *should* be container_of(), but cannot
be as it won't compile cleanly on older kernels.  Change
that one to container_of_safe() with a big comment.
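
The distinction can be sketched in userspace C. This is a minimal
illustration, assuming the guarded variants exist to tolerate invalid
input such as NULL: the container_of() definition below is a portable
equivalent of the kernel macro, and struct item and both helper
functions are hypothetical.

```c
#include <stddef.h>

/* Userspace equivalent of the kernel macro; assumes ptr is valid. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {
	struct list_node *next;
};

/* Hypothetical object embedding a list node as a non-first member. */
struct item {
	int		 value;
	struct list_node link;
};

/* Pointer known to be valid: plain container_of() states that intent. */
static struct item *item_from_link(struct list_node *link)
{
	return container_of(link, struct item, link);
}

/* Pointer may be NULL: check first, as a guarded variant would. */
static struct item *item_from_link_or_null(struct list_node *link)
{
	return link ? container_of(link, struct item, link) : NULL;
}
```

Using the unguarded form where the pointer is already known to be
valid tells the reader that no NULL check is expected at that call
site.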

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Idcd954f89ed366882563810ce042a5ddaba5a1e5
Reviewed-on: https://review.whamcloud.com/38383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 osd-ldiskfs: convert container_of0() to container_of() 78/38378/2
Mr NeilBrown [Mon, 27 Apr 2020 04:47:39 +0000 (14:47 +1000)]
LU-6142 osd-ldiskfs: convert container_of0() to container_of()

Every use of container_of0() in osd-ldiskfs can safely use
container_of() instead.  Doing so makes the intent of the code
clearer.

In most cases, the pointer returned is later dereferenced without any
subsequent checks.  In a few cases (e.g.  osd_obj()), the pointer
passed in is dereferenced before the container_of() call.  These
patterns assure us that the pointer is valid (not NULL or an ERR_PTR),
so container_of() is the correct interface to use.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I11cb51cd2dc459a3ab5c420ef7bf3324a28eeffc
Reviewed-on: https://review.whamcloud.com/38378
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 lustre: use BIT() macro where appropriate 77/38377/2
Mr NeilBrown [Mon, 27 Apr 2020 03:40:30 +0000 (13:40 +1000)]
LU-6142 lustre: use BIT() macro where appropriate

When accessing a bit in a bitmap/mask/flags-word it can be more
readable to use BIT(num) rather than "1 << num".

This patch makes that change to various places in lustre
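
A small userspace sketch of the idiom. The BIT() definition here is a
stand-in for the kernel macro, and the DEMO_FLAG_* names are
hypothetical bit numbers chosen for illustration:

```c
/* Userspace stand-in for the kernel's BIT() macro. */
#ifndef BIT
#define BIT(nr)	(1UL << (nr))
#endif

/* Hypothetical flag bit numbers, for illustration only. */
enum {
	DEMO_FLAG_READ	= 0,
	DEMO_FLAG_WRITE	= 1,
	DEMO_FLAG_SYNC	= 5,
};

/* Set one flag in a flags word. */
static unsigned long flags_set(unsigned long flags, int nr)
{
	return flags | BIT(nr);
}

/* Return 1 if the flag is set, 0 otherwise. */
static int flags_test(unsigned long flags, int nr)
{
	return (flags & BIT(nr)) != 0;
}
```

BIT(DEMO_FLAG_SYNC) reads as "the SYNC bit", whereas
"1 << DEMO_FLAG_SYNC" leaves the reader to work out the intent.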

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9f10e7e11e0f310c46c9b216f138799c62308279
Reviewed-on: https://review.whamcloud.com/38377
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 lustre: use BIT() macro where appropriate in include 76/38376/2
Mr NeilBrown [Mon, 27 Apr 2020 03:39:36 +0000 (13:39 +1000)]
LU-6142 lustre: use BIT() macro where appropriate in include

When accessing a bit in a bitmap/mask/flags-word it can be more
readable to use BIT(num) rather than "1 << num".

This patch makes that change to various places in lustre/include.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I6a242fa843ff26320b299b8a532b6bb4388f0233
Reviewed-on: https://review.whamcloud.com/38376
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gian-Carlo DeFazio <defazio1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-9679 ptlrpc: use OBD_ALLOC_PTR_ARRAY() and FREE 53/38253/2
Mr NeilBrown [Thu, 14 Nov 2019 03:20:01 +0000 (14:20 +1100)]
LU-9679 ptlrpc: use OBD_ALLOC_PTR_ARRAY() and FREE

Use:
  OBD_ALLOC_PTR_ARRAY
  OBD_FREE_PTR_ARRAY

for allocating and freeing arrays in ptlrpc.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic0323ce4d50d5ced3ac6e529f7477cc4779478e6
Reviewed-on: https://review.whamcloud.com/38253
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
15 months agoLU-9679 mdt: use OBD_ALLOC_PTR_ARRAY() and FREE 51/38251/2
Mr NeilBrown [Thu, 14 Nov 2019 03:20:01 +0000 (14:20 +1100)]
LU-9679 mdt: use OBD_ALLOC_PTR_ARRAY() and FREE

Use:
  OBD_ALLOC_PTR_ARRAY
  OBD_FREE_PTR_ARRAY

for allocating and freeing arrays in mdt.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I5e69f03370fc42fdfe770a4c6fc72a994a6749f6
Reviewed-on: https://review.whamcloud.com/38251
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13535 lfsck: fix possible PFL layout corruption 84/38584/4
Mikhail Pershin [Tue, 12 May 2020 20:32:22 +0000 (23:32 +0300)]
LU-13535 lfsck: fix possible PFL layout corruption

While checking lmm_oi in a composite layout, the 'lmm' pointer is
re-assigned to the component entry, but the same pointer is used
for the LOV EA buffer to update the EA. Therefore, if lmm_oi was
fixed in some component, only the current entry is saved as the new
layout.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ifbd984a71b383ab4ca35ad59ed9cd8cf57b6d7cc
Reviewed-on: https://review.whamcloud.com/38584
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
15 months agoLU-10401 tests: add -F so list_param prints entry type 67/38567/5
Yang Sheng [Mon, 11 May 2020 07:36:26 +0000 (15:36 +0800)]
LU-10401 tests: add -F so list_param prints entry type

In sanity.sh test_133g add "-F" so that the list_param
command prints '=' for entries that can be written.
Otherwise, the grep will not find any entries to modify
and the test does not really work. Also change 113f to
use lctl rather than hardcode proc and sysfs paths.

Test-Parameters: trivial testlist=sanity
Fixes: 83b6c6608e94 ("LU-11644 ptlrpc: show target name in req_history")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I3c94bb13f3345161580ef1b7944945b871ea2c60
Reviewed-on: https://review.whamcloud.com/38567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
15 months agoLU-13187 tests: improve sanity test_129 checking 50/38550/2
Andreas Dilger [Sat, 9 May 2020 01:22:59 +0000 (19:22 -0600)]
LU-13187 tests: improve sanity test_129 checking

When running sanity test_129 repeatedly, it may fail to print the
warning/error messages to the console due to throttling.  Fix the
test to only warn if the messages were not printed, so long as the
size limit is correct.

Always reset max_dir_limit on exit, and delete the test files.

Fix test_129 to use mcreate instead of multiop to be faster, and
fix mcreate to return the actual error code returned by the syscall.

Fix the console error messages to be complete strings, rather than
being dynamically generated, so that they can be more easily found.
To compensate, don't have separate messages for with/without FID.

Test-Parameters: trivial testlist=sanity env=ONLY=129,ONLY_REPEAT=100
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I06593441151fa508ef2db7a3e68be34cc0ce7057
Reviewed-on: https://review.whamcloud.com/38550
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13539 all: Cleanup LASSERTF uses missing newlines 40/38540/2
Shaun Tancheff [Fri, 8 May 2020 17:40:36 +0000 (12:40 -0500)]
LU-13539 all: Cleanup LASSERTF uses missing newlines

LASSERTF() usage that does not include a terminating newline
presents an unclear syslog entry to admins.

This adds the terminating newline for a few cases.

Test-Parameters: trivial
HPE-bug-id: LUS-8853
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I3f088601c2ad6d59e2868e86bfbd5315e36ece98
Reviewed-on: https://review.whamcloud.com/38540
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
15 months agoLU-9859 libcfs: move tcd locking across to tracefile.c 01/38601/4
Mr NeilBrown [Thu, 14 May 2020 16:23:01 +0000 (12:23 -0400)]
LU-9859 libcfs: move tcd locking across to tracefile.c

No need to have this in linux-tracefile.c

Test-Parameters: trivial
Change-Id: I3fdc70ad5f32ea7ff78c778565f01eaaa78f1e94
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38601
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13549 build: fix zfs/spl config checks 92/38592/3
Sebastien Buisson [Wed, 13 May 2020 13:58:07 +0000 (15:58 +0200)]
LU-13549 build: fix zfs/spl config checks

zfs/spl config checks should only proceed to variable substitution
if the system supports it. Otherwise, they would end up adding
'Not found' to the list of include dirs/libs.

Test-Parameters: trivial testgroup=review-zfs
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia76a67e42cc6a2f116013c142b6c0b2143838548
Reviewed-on: https://review.whamcloud.com/38592
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13481 dne: improve temp file name check 39/38539/2
Lai Siyao [Fri, 8 May 2020 04:28:52 +0000 (12:28 +0800)]
LU-13481 dne: improve temp file name check

Previously, if all but two characters in the file name suffix were
digits, the file was not treated as a temp file, which is too strict
if the suffix is short, e.g. 6 characters. Change the limit to one
non-digit character, which should not be the starting character.

Test-Parameters: trivial testlist=sanity env=ONLY=33h,ONLY_REPEAT=500
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie36e6e15c1e593f47f4d3fab7f8c567d1d587f28
Reviewed-on: https://review.whamcloud.com/38539
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
15 months agoLU-9859 libcfs: discard cfs_block_sigsinv() 30/38530/2
Mr NeilBrown [Thu, 7 May 2020 13:09:24 +0000 (09:09 -0400)]
LU-9859 libcfs: discard cfs_block_sigsinv()

cfs_block_sigsinv() and cfs_restore_sigs() are simple
wrappers which save a couple of lines of code and
hurt readability for people not familiar with them.
They aren't used often enough to be worthwhile,
so discard them and open-code the functionality.

The sigorsets() call isn't needed as or-ing with current->blocked is
exactly what sigprocmask(SIG_BLOCK) does.
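
The same property holds for the POSIX userspace sigprocmask(), which
makes a runnable sketch possible. This is an analogy to the kernel
call, not the libcfs code, and block_signal(), restore_signals() and
signal_is_blocked() are hypothetical helpers:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/*
 * Add signo to the set of blocked signals.  SIG_BLOCK ORs the new set
 * into the current mask, so the caller does not need to merge them.
 * The previous mask is saved in *old so it can be restored later.
 */
static int block_signal(int signo, sigset_t *old)
{
	sigset_t add;

	sigemptyset(&add);
	sigaddset(&add, signo);
	return sigprocmask(SIG_BLOCK, &add, old);
}

/* Restore a mask previously saved by block_signal(). */
static int restore_signals(const sigset_t *old)
{
	return sigprocmask(SIG_SETMASK, old, NULL);
}

/* Return 1 if signo is currently blocked, 0 if not, -1 on error. */
static int signal_is_blocked(int signo)
{
	sigset_t cur;

	if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
		return -1;
	return sigismember(&cur, signo);
}
```

Blocking SIGUSR1 and then SIGUSR2 with two separate calls leaves both
blocked, with no explicit merge of the old and new sets.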

Linux-commit: 6afe572bc76688cd840032254217a4877b66e916

Change-Id: Ia9189e0885dffb098df7abef09db42ecb49196cd
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/38530
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 lnet: Correct signature in gnilnd.h 91/38491/4
Shaun Tancheff [Tue, 5 May 2020 06:13:31 +0000 (01:13 -0500)]
LU-13004 lnet: Correct signature in gnilnd.h

kgnilnd_recv was updated correctly but the function prototype
was not included.

Test-Parameters: trivial
Fixes: e35b7751f49 ("LU-13004 lnet: remove the 'struct kvec' arg from lnd_send")
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I29db966ce54801391aa3c5c6426f03a327fea4b1
Reviewed-on: https://review.whamcloud.com/38491
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13493 llite: check if page truncated in ll_write_begin() 25/38425/2
Wang Shilong [Thu, 30 Apr 2020 02:27:06 +0000 (10:27 +0800)]
LU-13493 llite: check if page truncated in ll_write_begin()

See following function flows:

CPU0                                    CPU1
|->grab_cache_page_nowait
  |->find_get_page
    |->__find_get_page (page unlocked)
                                        |->truncate page
  |->trylock_page --> page might have been truncated after this.

So it is possible that the page has been truncated by the time
grab_cache_page_nowait() returns, even though the page lock is held.

We need to check whether vmpage->mapping has changed in
ll_write_begin(); otherwise, we will have a truncated page with a NULL
mapping, which will trigger assertions in vvp_set_pagevec_dirty().

This patch also fixes an assertion string that doesn't end in a newline.

Change-Id: I46e14f560378a39d8ae1353d60cc49c0f0b241c0
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/38425
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13471 lnet: use the same src nid for discovery 20/38320/5
Amir Shehata [Thu, 23 Apr 2020 00:06:23 +0000 (17:06 -0700)]
LU-13471 lnet: use the same src nid for discovery

When discovering a remote peer (not on the same network) a GET is
sent to the peer to retrieve the peer's interfaces.  This is followed
by a PUSH, if discovery is on, to push the node's interfaces. However,
if both node and peer have multiple interfaces it is likely that the
GET and the PUSH will originate on different interfaces. When the
peer receives the PUSH it will not be able to connect the two NIDs
and will not be able to consolidate the node's NIDs.  This issue is
specific for remote peers because at the time the push handler is
invoked the remote lpni has not been created yet. lnet_parse()
creates the lpni of the gateway.

Similar to the strategy already in place of using the same source NID
for all the messages of an RPC, discovery should use the same source
NID for both the GET and PUSH.

This patch stores the source NID that the GET was sent on and
uses it for the PUSH.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I5a13ab7799b2ddc47714202bcbed786b0d3940b7
Reviewed-on: https://review.whamcloud.com/38320
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13369 kernel: kernel update RHEL7.7 [3.10.0-1062.18.1.el7] 40/38240/4
Jian Yu [Thu, 7 May 2020 17:30:37 +0000 (10:30 -0700)]
LU-13369 kernel: kernel update RHEL7.7 [3.10.0-1062.18.1.el7]

Update RHEL7.7 kernel to 3.10.0-1062.18.1.el7.

Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7

Change-Id: Ifc00fca35a0ad28ba8326e56e693ea39360a8114
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38240
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13427 lmv: do not print MDTs that are inactive 65/38165/4
Andreas Dilger [Wed, 8 Apr 2020 00:37:36 +0000 (18:37 -0600)]
LU-13427 lmv: do not print MDTs that are inactive

Have lmv return -EAGAIN instead of -ENODATA for unconfigured MDTs.
That prevents "lfs df -v" from printing a long list of invalid MDTs
when trying to get the target state for non-rotational devices.

Add test for "lfs df -v" printing nonrotational state, as well as
limiting the reported OST and MDT to configured devices.

Fixes: 4c76eb64a9ff ("LU-8920 utils: don't print deactivated OSTs in 'lfs df'")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ia3dcf85f4cf52caa39848f0f6ed8d47d42ce7057
Reviewed-on: https://review.whamcloud.com/38165
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-10973 lnet: initialize buffers properly 83/38083/5
Amir Shehata [Mon, 11 May 2020 14:32:41 +0000 (10:32 -0400)]
LU-10973 lnet: initialize buffers properly

Initialize the error message buffer properly to avoid
garbage characters printed out.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9c0d08ea2baf0cda990e3abb7e7de497a5226bfe
Reviewed-on: https://review.whamcloud.com/38083
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13366 utils: SEL yaml and copy file support 60/37960/3
Vitaly Fertman [Fri, 13 Mar 2020 13:03:08 +0000 (16:03 +0300)]
LU-13366 utils: SEL yaml and copy file support

Create a SEL file from a YAML template or by copying the template
from another file (setstripe --yaml and setstripe --copy).

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I001226a9dcfa5706295cef1c13a2e7aeb5c498cc
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/156286
HPE-bug-id: LUS-8039
Reviewed-on: https://review.whamcloud.com/37960
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13366 doc: add SEL options to util's man pages 62/37962/3
Vitaly Fertman [Fri, 13 Mar 2020 14:03:32 +0000 (17:03 +0300)]
LU-13366 doc: add SEL options to util's man pages

A couple of documents were missing the SEL info:
- lfs man page
- lfs setstripe help message
- lfs find help message

Test-Parameters: trivial
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ifd4e55877309a29cafdd05d007ace6660d0f775b
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Tested-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/156672
HPE-bug-id: LUS-8149
Reviewed-on: https://review.whamcloud.com/37962
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 socklnd: discard tx_iov. 51/37851/8
Mr NeilBrown [Wed, 4 Dec 2019 06:02:13 +0000 (17:02 +1100)]
LU-13004 socklnd: discard tx_iov.

tx_iov always points to tx_hdr, so we can discard tx_iov, and just use
&tx_hdr.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9738b6f953184fdea859da7a9c4187227fa61143
Reviewed-on: https://review.whamcloud.com/37851
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 lnet: simplify ksock_tx. 50/37850/8
Mr NeilBrown [Wed, 4 Dec 2019 05:40:12 +0000 (16:40 +1100)]
LU-13004 lnet: simplify ksock_tx.

The tx_frags union in 'struct ksock_tx' is largely unnecessary.  The
payload is always lnet_kiov_t, the only kvec is a header.  So replace
the union with just those two fields.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2c4bd5397ddf412b11e9f741406de617c534c736
Reviewed-on: https://review.whamcloud.com/37850
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 lnet: remove lnet_copy_flat2iov and ..iov2flat 49/37849/8
Mr NeilBrown [Wed, 4 Dec 2019 05:34:47 +0000 (16:34 +1100)]
LU-13004 lnet: remove lnet_copy_flat2iov and ..iov2flat

These functions are no longer used, so remove them.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I748643c38430ddb21f35493f0ef704528167716f
Reviewed-on: https://review.whamcloud.com/37849
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13004 lnet: remove lnet_extract_iov() 48/37848/3
Mr NeilBrown [Wed, 4 Dec 2019 05:30:32 +0000 (16:30 +1100)]
LU-13004 lnet: remove lnet_extract_iov()

In the only place this is called, the src kvec is
NULL with length 0, so it always returns 0.
So remove it.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I72fd39d0b4330fd4f811f2ee1476a653407855b4
Reviewed-on: https://review.whamcloud.com/37848
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6006 tests: add sleep 1 after command in background 43/37343/10
Alex Zhuravlev [Tue, 28 Jan 2020 12:38:05 +0000 (15:38 +0300)]
LU-6006 tests: add sleep 1 after command in background

Otherwise a subsequent command may race with the one in the
background and fail:
 mkdir a & touch a/b

Test-Parameters: trivial env=ONLY="22-23",ONLY_REPEAT=50 testlist=replay-dual
Test-Parameters: env=ONLY="22-23",ONLY_REPEAT=50 fstype=zfs testlist=replay-dual
Test-Parameters: env=ONLY="22-23",ONLY_REPEAT=50 mdscount=2 mdtcount=4 testlist=replay-dual
Test-Parameters: env=ONLY="22-23",ONLY_REPEAT=50 fstype=zfs mdscount=2 mdtcount=4 testlist=replay-dual

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0574d315e596cd899f7c4ea20c70b4c3da99b9b4
Reviewed-on: https://review.whamcloud.com/37343
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13522 mdt: count mdt_mds_mds_conns upon reconnect 12/38512/3
Lai Siyao [Tue, 5 May 2020 01:05:08 +0000 (09:05 +0800)]
LU-13522 mdt: count mdt_mds_mds_conns upon reconnect

mdt.mdt_mds_mds_conns is the inter-MDT connection count; it should be
increased upon reconnect (MDT stop and start).

Update sanityn 33c.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ic0dabee10431f39665cb2bc4fa8a014fc78fbd60
Reviewed-on: https://review.whamcloud.com/38512
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13065 tests: clean up runtests 16/37016/8
James Nunez [Fri, 13 Dec 2019 20:29:52 +0000 (13:29 -0700)]
LU-13065 tests: clean up runtests

The runtests test suite needs to be cleaned up with the
following:

runtests used a verbose flag that initialized a ‘V’
variable, but setting the verbose option was removed.
The flag is still in the script.  The ‘V’ flag should be
replaced by the global VERBOSE flag set in
test-framework.sh.

Insert tabs at the beginning of each line of test 1

Remove ‘\’ after ||

The error() function does not need a second argument and may
be confusing when reading the error message.  Remove the
second argument from the calls to error()

Replace exit() calls with calls to error() where appropriate

Use lower case for all letters of local variables and
declare them 'local'

Test-Parameters: trivial testlist=runtests
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4e9f1eb467b004e5e2384dba4b7f047762304e8b
Reviewed-on: https://review.whamcloud.com/37016
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12100 tests: Use least qunit to set limit 97/36797/8
Nathaniel Clark [Tue, 19 Nov 2019 14:52:45 +0000 (09:52 -0500)]
LU-12100 tests: Use least qunit to set limit

Use least qunit to set lower limit for inodes in sanity-quota/2
This ensures that the limit is set at or above the minimum size.

Test-Parameters: trivial
Test-Parameters: testlist=sanity-quota
Test-Parameters: testlist=sanity-quota fstype=zfs
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I80e2c3cb66870d11f74f34c435e266a46630479b
Reviewed-on: https://review.whamcloud.com/36797
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12854 nodemap: allow boolean value for audit_mode 50/36450/2
Sebastien Buisson [Mon, 14 Oct 2019 12:50:58 +0000 (14:50 +0200)]
LU-12854 nodemap: allow boolean value for audit_mode

Allow "true/false" or "on/off" for audit_mode property
on nodemap entries.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ibc6123dc64a88a80f474908d82e00269daacac69
Reviewed-on: https://review.whamcloud.com/36450
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
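
The value parsing described in LU-12854 can be sketched in userspace as follows. This is an illustration only: parse_audit_mode() is a hypothetical helper, not the function the patch touches, and accepting "1"/"0" alongside the new spellings is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: map "1"/"true"/"on" to true and
 * "0"/"false"/"off" to false.
 * Returns 0 on success, -1 if the string is not a recognized boolean.
 */
static int parse_audit_mode(const char *val, bool *out)
{
	static const char *const yes[] = { "1", "true", "on" };
	static const char *const no[]  = { "0", "false", "off" };
	size_t i;

	for (i = 0; i < sizeof(yes) / sizeof(yes[0]); i++) {
		if (strcmp(val, yes[i]) == 0) {
			*out = true;
			return 0;
		}
		if (strcmp(val, no[i]) == 0) {
			*out = false;
			return 0;
		}
	}

	return -1;	/* leave *out untouched on bad input */
}
```

Rejecting unknown strings with an error, rather than silently treating them as false, keeps a typo from quietly disabling auditing.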
15 months agoLU-6142 libcfs: fix tab and alignment issues for libcfs_string.c 23/38623/3
James Simmons [Fri, 15 May 2020 16:54:39 +0000 (12:54 -0400)]
LU-6142 libcfs: fix tab and alignment issues for libcfs_string.c

Resolve non-code changes for checkpatch errors.

Test-Parameters: trivial
Change-Id: Ie2848a83fdf58853ec32c4b188bcae1bcf4afa89
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38623
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-6142 utils: Fix style issues for obd.c 58/38558/3
Arshad Hussain [Sat, 9 May 2020 18:24:58 +0000 (23:54 +0530)]
LU-6142 utils: Fix style issues for obd.c

This patch fixes issues reported by checkpatch
for file lustre/utils/obd.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ic350a87f27fe13909d18f735d5044cb61e1654ea
Reviewed-on: https://review.whamcloud.com/38558
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13131 osc: Ensure immediate departure of sync write pages 53/38453/4
Oleg Drokin [Fri, 1 May 2020 21:50:39 +0000 (17:50 -0400)]
LU-13131 osc: Ensure immediate departure of sync write pages

Except for the case of direct-io and server-lock, we are
potentially holding multiple locks that are next to impossible
to find and cross reference.
So instead just send it all right away - should only
be a factor in rare cases of out of quota or close to out
of space.

Change-Id: I961cd9ba7f3266d22dfc5eff758c2f4ebbe148a4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38453
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
15 months agoLU-6142 utils: Fix style issues for lustre_rsync.c 41/38441/4
Arshad Hussain [Thu, 30 Apr 2020 23:32:51 +0000 (05:02 +0530)]
LU-6142 utils: Fix style issues for lustre_rsync.c

This patch fixes issues reported by checkpatch
for file lustre/utils/lustre_rsync.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I4f4b3bd2455f2012561eb7567ba508e47c06749e
Reviewed-on: https://review.whamcloud.com/38441
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-6142 kernel: use kmem_cache_zalloc as appropriate. 39/38439/3
Mr NeilBrown [Fri, 1 May 2020 05:11:59 +0000 (15:11 +1000)]
LU-6142 kernel: use kmem_cache_zalloc as appropriate.

Rather than passing __GFP_ZERO to kmem_cache_alloc(), or calling
memset(0) after the allocation, use kmem_cache_zalloc().

Also update spelling.txt to encourage use of kmem_cache_zalloc().

kmem_cache_zalloc() has been part of Linux since 2.6.17.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic4ccbe0e223121e54699f7667e35db14d0f0da70
Reviewed-on: https://review.whamcloud.com/38439
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <chris.horn@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12222 ptlrpc: Check if NID is local, not just lolnd NID 88/38388/3
Chris Horn [Mon, 27 Apr 2020 15:07:21 +0000 (10:07 -0500)]
LU-12222 ptlrpc: Check if NID is local, not just lolnd NID

There are a couple of places where we check whether a NID is the lolnd
NID, but we really want to know whether the NID is local. Use
LNetIsPeerLocal() to accomplish this.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia17b9b4b54fd1063c42a6f8bdd0e593be1086683
Reviewed-on: https://review.whamcloud.com/38388
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
15 months agoLU-6142 utils: Fix style issues for portals.c 62/38362/4
Arshad Hussain [Tue, 21 Apr 2020 18:07:15 +0000 (23:37 +0530)]
LU-6142 utils: Fix style issues for portals.c

This patch fixes issues reported by checkpatch
for file lustre/utils/portals.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I97eb576fe12a75e26d36dcf2228d5f161712e3e5
Reviewed-on: https://review.whamcloud.com/38362
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13131 osc: Do not wait for grants for too long 83/38283/6
Oleg Drokin [Mon, 20 Apr 2020 13:51:29 +0000 (09:51 -0400)]
LU-13131 osc: Do not wait for grants for too long

obd_timeout is way too long considering we are holding a lock
that might be contended. If OST is slow to respond, we might
get evicted, so limit us to a half of the shortest possible
max wait a server might have before switching to synchronous IO.

Change-Id: I36653194c1b8b95ba3cc2ed9240df7b0888cf7ed
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38283
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
15 months agoLU-9679 target: use OBD_ALLOC_PTR_ARRAY() and FREE 55/38255/2
Mr NeilBrown [Thu, 14 Nov 2019 03:20:01 +0000 (14:20 +1100)]
LU-9679 target: use OBD_ALLOC_PTR_ARRAY() and FREE

Use:
  OBD_ALLOC_PTR_ARRAY
  OBD_FREE_PTR_ARRAY

for allocating and freeing arrays in target.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I844a12e810c8a0bd6fb13ca047000de7f265988d
Reviewed-on: https://review.whamcloud.com/38255
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
15 months agoLU-9679 lodlov: use OBD_ALLOC_PTR_ARRAY() and others 54/38254/2
Mr NeilBrown [Thu, 14 Nov 2019 03:20:01 +0000 (14:20 +1100)]
LU-9679 lodlov: use OBD_ALLOC_PTR_ARRAY() and others

Use:
  OBD_ALLOC_PTR_ARRAY
  OBD_FREE_PTR_ARRAY
  OBD_ALLOC_PTR_ARRAY_LARGE
  OBD_FREE_PTR_ARRAY_LARGE

for allocating and freeing arrays in lod and lov.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia782fc903761e21be982e4553ec40035c57f73f3
Reviewed-on: https://review.whamcloud.com/38254
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
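
A userspace sketch of the array-allocation pattern the two LU-9679 commits adopt. The ALLOC_PTR_ARRAY/FREE_PTR_ARRAY macros below are stand-ins written for this example on top of calloc()/free(); the real OBD_* macros live in the Lustre tree and may differ in detail. struct stripe_desc and sum_stripe_indices() are likewise hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in macros: the element size is derived from the pointer itself,
 * so the caller cannot pass a mismatched sizeof(). */
#define ALLOC_PTR_ARRAY(ptr, n)	((ptr) = calloc((n), sizeof(*(ptr))))
#define FREE_PTR_ARRAY(ptr, n)	\
	do { (void)(n); free(ptr); (ptr) = NULL; } while (0)

struct stripe_desc {
	int index;
};

/*
 * Allocate `count` descriptors, number them, and return the sum of the
 * indices. Returns -1 if the allocation fails.
 */
static long sum_stripe_indices(size_t count)
{
	struct stripe_desc *arr;
	long sum = 0;
	size_t i;

	ALLOC_PTR_ARRAY(arr, count);
	if (arr == NULL)
		return -1;

	for (i = 0; i < count; i++) {
		arr[i].index = (int)i;
		sum += arr[i].index;
	}

	FREE_PTR_ARRAY(arr, count);
	assert(arr == NULL);	/* the stand-in free macro clears the pointer */
	return sum;
}
```

Compared with an open-coded "count * sizeof(type)" expression, the array form keeps the count and the element type in one place at both the allocation and the free site.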
15 months agoLU-13357 lod: implement striped directory .dio_lookup 03/37903/6
Lai Siyao [Thu, 12 Mar 2020 00:35:20 +0000 (08:35 +0800)]
LU-13357 lod: implement striped directory .dio_lookup

Add function lod_striped_lookup() for
lod_striped_index_ops.dio_lookup to allow name lookup under striped
directory.

Currently this is used by subdir mount, which needs to look up the FID
of the subdir on the server side.

Function lfsck_namespace_repair_dirent() should call dt_lookup() with
the bottom object, because the child may be a shard.

Add sanity 247f.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Iba844d1a34a318bcbd42b00186ed6fa9d165effc
Reviewed-on: https://review.whamcloud.com/37903
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12003 osd: take reference to object in osd_trunc_lock() 70/37170/8
Alex Zhuravlev [Thu, 9 Jan 2020 13:28:54 +0000 (16:28 +0300)]
LU-12003 osd: take reference to object in osd_trunc_lock()

Normally the references to objects are held until a transaction
is over, but in a few cases the reference is released earlier, and then
such an object can be released. So OSD should take its own reference
to prevent early release.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I81647fdec8d42f123e990553edb5e371636f45c0
Reviewed-on: https://review.whamcloud.com/37170
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-11025 dne: support directory restripe 98/36898/14
Lai Siyao [Sat, 10 Aug 2019 05:00:01 +0000 (13:00 +0800)]
LU-11025 dne: support directory restripe

This patch adds directory restripe support:
* 'lfs setdirstripe -m -1 -c <stripe_count>' on an existing directory
  will change this directory's layout. If 'stripe_count' is larger than
  the current count, new stripes are allocated after the current stripes;
  otherwise the stripes of this directory are merged. NB: if the stripe
  count is unchanged but the hash type changed, it is treated as a merge,
  though it is actually a rehash.
* mdt_restripe() is added to restripe a directory.
* mdd_dir_declare_layout_split() is added to split a directory; it
  handles both plain and striped directory split.
* lod_dir_declare_layout_split() handles the internals of directory
  split.
* directory merge is simple compared to split: it just records the
  target stripe count in the LMV and updates it.

NB: this patch only restripes the directory and doesn't add the code to
migrate sub files, which will be implemented in a following patch.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I526f7423b909eb83cf8723e65981d713b3e42499
Reviewed-on: https://review.whamcloud.com/36898
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-11025 osd: osd_attr_get() returns dirent count 97/38097/8
Lai Siyao [Fri, 20 Mar 2020 09:59:32 +0000 (17:59 +0800)]
LU-11025 osd: osd_attr_get() returns dirent count

For osd-ldiskfs, getting the dirent count requires iterating over the
directory entries and summing them up, while for osd-zfs, zap_count()
can get it from ZAP directly.

Add a new field 'la_dirent_count' in struct lu_attr, and set it
to the directory entry count in osd_attr_get() if the object is a
directory. For osd-ldiskfs this value is cached in osd_object: if the
directory is newly created it is set to '0', and later
index_insert/delete update it, so that subsequent osd_attr_get() calls
can use the cached value.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If4225aecdba1c428d64d97c35b6c982c4932a265
Reviewed-on: https://review.whamcloud.com/38097
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
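
The caching scheme described in the LU-11025 osd_attr_get() commit can be sketched as below. This is a simplified userspace model, not the osd-ldiskfs code: struct dir_obj, its fields and all function names are hypothetical, and the "scan" is reduced to reading a counter.

```c
#include <assert.h>
#include <stdint.h>

#define COUNT_UNKNOWN (-1)

/* Hypothetical in-memory directory object; not the Lustre osd_object. */
struct dir_obj {
	int64_t  cached_count;	/* COUNT_UNKNOWN until first computed */
	int64_t  on_disk;	/* stands in for the real directory entries */
	unsigned scans;		/* number of full iterations performed */
};

/* A newly created directory is known to be empty, so cache 0 at once. */
static void dir_init_new(struct dir_obj *d)
{
	d->cached_count = 0;
	d->on_disk = 0;
	d->scans = 0;
}

/* Slow path: pretend to iterate over every entry and count them. */
static int64_t dir_scan(struct dir_obj *d)
{
	d->scans++;
	return d->on_disk;
}

/* Return the entry count, scanning only if no cached value exists. */
static int64_t dir_count(struct dir_obj *d)
{
	if (d->cached_count == COUNT_UNKNOWN)
		d->cached_count = dir_scan(d);
	return d->cached_count;
}

/* Insert and delete keep a known cached value up to date. */
static void dir_insert(struct dir_obj *d)
{
	d->on_disk++;
	if (d->cached_count != COUNT_UNKNOWN)
		d->cached_count++;
}

static void dir_delete(struct dir_obj *d)
{
	assert(d->on_disk > 0);
	d->on_disk--;
	if (d->cached_count != COUNT_UNKNOWN)
		d->cached_count--;
}
```

In this model a directory created in memory never needs a scan, while a directory loaded with an unknown count is scanned once and then tracked incrementally.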
15 months agoLU-11025 obdclass: add lu_device_operations::ldo_fid_alloc() 82/37282/10
Lai Siyao [Mon, 6 Jan 2020 11:44:33 +0000 (19:44 +0800)]
LU-11025 obdclass: add lu_device_operations::ldo_fid_alloc()

Add an interface ldo_fid_alloc in lu_device_operations to
allocate a FID by parent object and sub file name. This will be
used to migrate sub files for directory restripe.

The existing osd_fid_alloc() and osp_fid_alloc() will switch to this
interface from obd_ops::o_fid_alloc().

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ide480fe492fd2d4d5de675bbc61aee7e2a9e3ce3
Reviewed-on: https://review.whamcloud.com/37282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-12931 ldlm: use proper units for timeouts 65/38365/6
Andreas Dilger [Sat, 25 Apr 2020 09:19:04 +0000 (03:19 -0600)]
LU-12931 ldlm: use proper units for timeouts

Use timeout_t for ns_ctime_age_limit since this is a relative time
and not an absolute time.

Use ktime_t for ns_dirty_age_limit internally, even though the user
interface is in seconds, since this is frequently used together with
other ktime_t values in the kernel.

Fixes: e920be681451 ("LU-9019 ldlm: migrate the rest of the code to 64 bit time")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Idb5ece834e56a95cd781cb871b1b7c20bf3ebbe5
Reviewed-on: https://review.whamcloud.com/38365
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13411 llog: allow delete of zero size llog 31/38131/5
Alexander Boyko [Fri, 3 Apr 2020 12:34:54 +0000 (08:34 -0400)]
LU-13411 llog: allow delete of zero size llog

1) All plain logs belonging to a catalog should have the flag
LLOG_F_ZAP_WHEN_EMPTY, based on llog_cat_new_log(). When
llog_cat_process_common is processing a plain log with zero file size,
this flag is not set during llog_cat_id2handle LLOG_EMPTY, so these
plain llogs are not canceled/destroyed. They appeared during cross
MDT updates. The fix adds the flag LLOG_F_ZAP_WHEN_EMPTY for any plain
llog in a catalog.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-8634
Change-Id: Ieebee67bf9e7bebb9ecc51b858a9976a00583c7b
Reviewed-on: https://review.whamcloud.com/38131
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
15 months agoLU-13416 ldiskfs: don't corrupt data on journal replay 81/38281/5
Alexey Lyashkov [Mon, 20 Apr 2020 09:45:52 +0000 (12:45 +0300)]
LU-13416 ldiskfs: don't corrupt data on journal replay

Journalled writes need special attention on block release: revoke records
must be added to avoid replacing newly written blocks with stale data.
Marking the inode as “journal write” generates the right revoke records.
Large EA inode updates are also affected by this bug.

The large EA fix is:

commit ddfa17e4adc4bd19c32216aaa6250dc38b0579df
Author: Tahsin Erdogan <tahsin@google.com>
Date:   Wed Jun 21 21:36:51 2017 -0400
    ext4: call journal revoke when freeing ea_inode blocks

Change-Id: I605128c4ba70331a48715dc95546430909efb893
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/38281
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
16 months agoLU-11643 tests: revert new images and tests for upgrade patch 19/38619/3
James Nunez [Fri, 15 May 2020 14:14:38 +0000 (14:14 +0000)]
LU-11643 tests: revert new images and tests for upgrade patch

Revert "LU-11643 tests: add new images and tests for upgrade tests".
This patch seems to cause conf-sanity test 32a to hang.
Let's revert the patch until this issue is understood.

This reverts commit 6b979daaffc36aeef145316b41d0e2fe8abcf20f.

Test-Parameters: trivial testlist=conf-sanity env=ONLY=32
Change-Id: Ic4a354420797a5968234925960278eaab86d22ca
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38619
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
16 months agoLU-13541 llite: fix possible divide zero in ll_use_fast_io() 45/38545/2
Wang Shilong [Thu, 7 May 2020 00:58:54 +0000 (08:58 +0800)]
LU-13541 llite: fix possible divide zero in ll_use_fast_io()

ll_use_fast_io() is used to check whether we can use fast IO.
Since it is called in the fast path, we don't hold ras_lock to protect
access, so there might be a race where @ras_stride_bytes is reset after
the stride_io_mode() check.

Fixes: 9e4c5bdaaec5 ("LU-12644 llite: try fast io for stride io correctly")
Change-Id: If57ad074ecfa6560cc527f9f52e7adc2b0a456fd
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/38545
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
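
A minimal sketch of the hazard described in LU-13541 and one common way to avoid it: read the shared divisor once into a local and test that local before dividing. The struct and function names are hypothetical and this is not the actual llite change; only the idea of a stride value that may be reset concurrently is taken from the commit message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the readahead state; not the Lustre struct. */
struct ra_state {
	_Atomic uint64_t stride_bytes;	/* may be reset to 0 by another thread */
	uint64_t stride_offset;
};

/*
 * Return true when pos falls on a stride boundary. The shared field is
 * loaded once, so the zero check and the modulo use the same value even
 * if another thread resets the field in between.
 */
static bool pos_on_stride(struct ra_state *ras, uint64_t pos)
{
	uint64_t stride = atomic_load(&ras->stride_bytes);

	if (stride == 0)
		return false;	/* stride mode was reset: no fast path */

	if (pos < ras->stride_offset)
		return false;	/* before the first stride */

	return (pos - ras->stride_offset) % stride == 0;
}
```

Checking the shared field and then dividing by a second read of it would reintroduce the window in which the divisor can become zero.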
16 months agoLU-13246 osd: unlock os_lock if it was locked 47/37547/6
Alex Zhuravlev [Wed, 12 Feb 2020 05:45:09 +0000 (08:45 +0300)]
LU-13246 osd: unlock os_lock if it was locked

Do not use the global state (which can change concurrently) to decide
whether to unlock.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Idbdc3639bca50006dac00112205e1fee9c9a0e30
Reviewed-on: https://review.whamcloud.com/37547
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
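
The idea in LU-13246 can be sketched as follows: remember in a local variable whether the lock was actually taken, and release based on that, not on a global condition that may have changed in the meantime. Everything here is a hypothetical userspace model; the os_lock below is a C11 atomic_flag standing in for the real lock, and the concurrent change is simulated inside the function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag os_lock = ATOMIC_FLAG_INIT;	/* stand-in spinlock */

static void os_lock_acquire(void)
{
	while (atomic_flag_test_and_set(&os_lock))
		;	/* spin until the flag was previously clear */
}

static void os_lock_release(void)
{
	atomic_flag_clear(&os_lock);
}

/*
 * *need_lock models global state that another thread may flip at any
 * time. Returns true if this call took (and therefore released) the lock.
 */
static bool do_update(bool *need_lock, int *value)
{
	bool locked = false;

	if (*need_lock) {
		os_lock_acquire();
		locked = true;
	}

	(*value)++;
	*need_lock = !*need_lock;	/* simulated concurrent change */

	/* Decide by what we did, not by re-reading *need_lock. */
	if (locked)
		os_lock_release();

	return locked;
}
```

Re-testing the global at unlock time would either leave the lock held or release a lock that was never taken, depending on which way the state flipped.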