Whamcloud - gitweb
fs/lustre-release.git
5 years ago LU-12381 ko2iblnd: ignore down interfaces 49/35249/2
James Simmons [Mon, 17 Jun 2019 19:25:48 +0000 (12:25 -0700)]
LU-12381 ko2iblnd: ignore down interfaces

The for_each_netdev() loop in kiblnd_create_dev() scans all
network devices on a system. Currently the code exits when a
network device is down, but the device could be something other
than an IB device. Instead of exiting, just ignore any device that
is down.
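
The behaviour described above can be sketched in user space as follows. This is only an illustration: struct demo_netdev and find_up_device() are hypothetical stand-ins, not the kernel's struct net_device or the real kiblnd_create_dev() logic.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a network device entry. */
struct demo_netdev {
	const char *name;
	int up;		/* non-zero when the interface is up */
};

/*
 * Return the index of the first device that is up and whose name
 * matches ifname, or -1 if there is none. A device that is down is
 * skipped with "continue" instead of ending the whole scan.
 */
int find_up_device(const struct demo_netdev *devs, size_t ndevs,
		   const char *ifname)
{
	size_t i;

	for (i = 0; i < ndevs; i++) {
		if (!devs[i].up)
			continue;	/* ignore it, do not bail out */
		if (strcmp(devs[i].name, ifname) == 0)
			return (int)i;
	}
	return -1;
}
```

With a down Ethernet device listed before an up IB device, the scan still reaches the IB device.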

This patch is back-ported from the following one:
Lustre-commit: 1dea5aac9d9be99c4b317a491f308872b97bf0e6
Lustre-change: https://review.whamcloud.com/35098

Test-Parameters: trivial

Fixes: c4b39bf56bbc ("LU-11893 o2iblnd: add secondary IP address handling")
Change-Id: I0a3bf808d849cd00711b6ef2e4e5bbd876b64903
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35249
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 osd-ldiskfs: inode times switched to timespec64 47/35247/2
Li Dongyang [Mon, 17 Jun 2019 19:01:36 +0000 (12:01 -0700)]
LU-11838 osd-ldiskfs: inode times switched to timespec64

Since kernel 4.18 inode times switched from struct timespec
to timespec64 to make them y2038 safe.

Linux-commit: 95582b00838837fc07e042979320caf917ce3fe6

This patch is back-ported from the following one:
Lustre-commit: 3af55b3159ac2133dc35eeb2f02825848fb65548
Lustre-change: https://review.whamcloud.com/34675

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Iaddb2f2be27ec348fb97e13371aa3d7e6f6e5c9f
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35247
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10885 llite: enable flock mount option by default 87/34987/2
Andreas Dilger [Tue, 2 Oct 2018 21:52:28 +0000 (15:52 -0600)]
LU-10885 llite: enable flock mount option by default

The "flock" mount option has been optional for many years, initially
because of potential stability issues, and also to provide a choice
for administrators to select between "flock" and "localflock" options.

However, the large number of problems that users report when
trying to use applications that depend on this feature (typically
databases and other cloud stacks) shows that disabling flock by
default causes more problems than it solves.

Enable the "flock" (distributed coherent userspace locking) feature
by default.  If applications do not need this functionality, then it
will not affect them.  If applications *do* need this functionality,
they will get it.  If administrators really know what they are doing,
then they can use the "localflock" feature to enable client-local
flock functionality, possibly only on select nodes that need this.

Users wanting to disable this functionality should mount with the
existing "-o noflock" mount option, or build the client with the
"configure --disable-flock" option.

If clients are already using "-o {flock|localflock|noflock}" then
their existing options will be handled appropriately.

Lustre-change: https://review.whamcloud.com/32091
Lustre-commit: 3613af3e15cbc6091e3a16c8caeb1307be2d91f6

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I182637604fa22573b1da6b6b86d8915e3c3ebbe5
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/34987
Tested-by: Jenkins
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8365 ldiskfs: procfs entries for mballoc 42/34842/3
Lokesh Nagappa Jaliminche [Mon, 4 Jul 2016 09:04:20 +0000 (14:34 +0530)]
LU-8365 ldiskfs: procfs entries for mballoc

Export mballoc streaming block allocator variables
mb_last_group and mb_last_start through procfs.

Lustre-change: https://review.whamcloud.com/21142
Lustre-commit: 75703118588f2b23afd8c8815e5ebb768fc7a8ff

Test-Parameters: testgroup=review-ldiskfs
Change-Id: I5dd00503a81c6819751c9f99b64615b497ef4e28
Cray-bug-id: LUS-3176
Signed-off-by: Lokesh Nagappa Jaliminche <lokesh.jaliminche@seagate.com>
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34842
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12458 kernel: kernel update RHEL7.6 [3.10.0-957.21.3.el7] 69/35269/3
Jian Yu [Fri, 28 Jun 2019 18:07:19 +0000 (11:07 -0700)]
LU-12458 kernel: kernel update RHEL7.6 [3.10.0-957.21.3.el7]

Update RHEL7.6 kernel to 3.10.0-957.21.3.el7.

Test-Parameters: clientdistro=el7.6 serverdistro=el7.6

Change-Id: I78133d5bc7567d8ea56c4b1aebc3e97096495fad
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35269
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12166 test: fix broken detection on ZFS 24/34924/2
Wang Shilong [Sun, 7 Apr 2019 03:44:51 +0000 (11:44 +0800)]
LU-12166 test: fix broken detection on ZFS

We intend to run the command on the MDS, otherwise
project quota will never be tested.

Lustre-change: https://review.whamcloud.com/34609
Lustre-commit: 0f2cd5948b870c0f82a70bdf32f0c5f6d845144d

Test-Parameters:trivial fstype=zfs
Fixes: a046e87 ("LU-7991 quota: project quota against ZFS backend")
Change-Id: I8650a0e1065f0bb465da01556472d3d23b22a530
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34924
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12447 utils: specify correct size for lfs project buffer 84/35284/2
Wang Shilong [Fri, 21 Jun 2019 06:28:10 +0000 (23:28 -0700)]
LU-12447 utils: specify correct size for lfs project buffer

Environment:
Fedora release 28 (Twenty Eight)

gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20)
Copyright (C) 2018 Free Software Foundation, Inc.

Hit build failure:
lfs_project.c: In function ‘lfs_project_item_alloc’:
lfs_project.c:72:2: error: ‘strncpy’ specified bound 4096
equals destination size [-Werror=stringop-truncation]
  strncpy(lpi->lpi_pathname, pathname, sizeof(lpi->lpi_pathname));
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
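
One common way to avoid this class of warning is sketched below. It is an assumption about the general technique, not necessarily the change made in lfs_project.c; struct demo_item, DEMO_PATH_MAX and demo_item_set_path() are hypothetical names.

```c
#include <string.h>

#define DEMO_PATH_MAX 16	/* hypothetical; the report above shows 4096 */

struct demo_item {
	char pathname[DEMO_PATH_MAX];
};

/*
 * Copy at most sizeof(dst) - 1 bytes and terminate explicitly, so the
 * bound passed to strncpy() no longer equals the destination size and
 * the result is always a NUL-terminated string.
 */
void demo_item_set_path(struct demo_item *item, const char *pathname)
{
	strncpy(item->pathname, pathname, sizeof(item->pathname) - 1);
	item->pathname[sizeof(item->pathname) - 1] = '\0';
}
```

A path longer than the buffer is silently truncated here; a caller that must not truncate would need to check strlen(pathname) first.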

This patch is back-ported from the following one:
Lustre-commit: ffef6e3271ad1136d3ab1c2ee229b4690a6722a0
Lustre-change: https://review.whamcloud.com/35257

Test-Parameters: trivial testlist=sanity-quota
Change-Id: Ia6429c47391bf503546609ec6a262fe24664bdd4
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/35284
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12399 tests: avoid 'pdsh localhost' in sanity test_420 50/35250/2
Sebastien Buisson [Mon, 17 Jun 2019 19:30:26 +0000 (12:30 -0700)]
LU-12399 tests: avoid 'pdsh localhost' in sanity test_420

sanity test_420 needs a clean env to execute openfile, i.e. one not
inherited from the root user.
Replace 'pdsh localhost' with simpler 'su - $uname -c' alternative
to achieve this.

This patch is back-ported from the following one:
Lustre-commit: 1476ac047b449886a0c382b840a7b09dc0cec7eb
Lustre-change: https://review.whamcloud.com/35176

Test-Parameters: trivial envdefinitions=ONLY=420 testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ifeba7fc1eba86d74a64cca187e286adb23147e2e
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35250
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11893 o2iblnd: add secondary IP address handling 48/35248/2
James Simmons [Mon, 17 Jun 2019 19:21:53 +0000 (12:21 -0700)]
LU-11893 o2iblnd: add secondary IP address handling

Using dev_get_by_name() in kiblnd_create_dev() means we can only
discover primary IP addresses. This breaks using network
aliasing which some people use. Move away from dev_get_by_name()
to using for_ifa() so we can detect any secondary IP addresses.

This patch is back-ported from the following one:
Lustre-commit: c4b39bf56bbcacd49d7f888a0745cd4b5580b36b
Lustre-change: https://review.whamcloud.com/34476

Change-Id: I03f2f8d18118b716a5eb5fb87694000ac06fe242
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35248
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8066 sysfs: make ping sysfs file read and writable 13/35313/2
James Simmons [Wed, 12 Dec 2018 16:19:45 +0000 (11:19 -0500)]
LU-8066 sysfs: make ping sysfs file read and writable

Starting with 4.15 kernels, any read-only sysfs file is limited to
root access. To retain the ability for non-root users
to detect whether a remote server is alive using the 'ping' sysfs
file, we need to make it writable. Retain the read ability
so older tools will still work.

Lustre-change: https://review.whamcloud.com/33776
Lustre-commit: 6bbae72c6900dbd2b853d716bc4d456dc7fd586e

Change-Id: I6560c119328d723a20a2b32e1fa8c68dce5d407a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33776
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-on: https://review.whamcloud.com/35313
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
5 years ago LU-12382 llite: fix deadloop with tiny write 12/35312/2
Wang Shilong [Tue, 4 Jun 2019 12:54:01 +0000 (20:54 +0800)]
LU-12382 llite: fix deadloop with tiny write

For a small write (<4K) we use tiny write, and
__generic_file_write_iter() is called to handle it.

On newer kernels (4.14 etc.) the function is exported and
does something like the following:

|->__generic_file_write_iter
  |->generic_perform_write()

If the iov_iter_count() passed in is 0, generic_perform_write() will
loop forever, because the number of bytes copied is always calculated
as 0.

The problem is that the VFS doesn't always skip zero-count I/O before
it reaches the lower-layer read/write hook, so we must do that ourselves.

To fix this problem, always return 0 early if no real I/O is
needed.
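
The early return can be sketched in user space as follows. The names demo_tiny_write(), demo_lower_write() and demo_lower_calls are hypothetical; the lower layer here is a trivial stand-in that only counts how often it is entered.

```c
#include <stddef.h>

/* How many times the (stand-in) lower layer was entered. */
int demo_lower_calls;

/* Hypothetical lower-layer write; assumed unsafe to call with count == 0. */
long demo_lower_write(size_t count)
{
	demo_lower_calls++;
	return (long)count;
}

/*
 * Guarded entry point: a zero-length request returns 0 immediately and
 * never reaches the lower layer.
 */
long demo_tiny_write(size_t count)
{
	if (count == 0)
		return 0;
	return demo_lower_write(count);
}
```

Doing the check in the caller keeps the fix local and does not depend on what the generic code does with an empty request.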

Lustre-change: https://review.whamcloud.com/35058
Lustre-commit: e9a543b0d3039027423cb469525015f97caa3a3f

Change-Id: I765a723da79eb5fd09317c3fad47fe479b1dd4fb
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35312
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years ago LU-8066 utils: have llapi_target_iterate use sysfs tree 81/34781/5
James Simmons [Tue, 25 Jun 2019 13:29:07 +0000 (09:29 -0400)]
LU-8066 utils: have llapi_target_iterate use sysfs tree

Update llapi_target_iterate() to not use 'devices' but collect the
data from the lustre sysfs tree itself.

Lustre-change: https://review.whamcloud.com/33799
Lustre-commit: b24d69492b818457d9da0d6dce3adc0f91f18ec6

Change-Id: If100b4918bdcc8b24e72f37127048a32a808310f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34781
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
5 years ago LU-12269 build: fix hardened builds in rpm spec file 61/35161/3
Ben Menadue [Tue, 11 Jun 2019 03:38:18 +0000 (20:38 -0700)]
LU-12269 build: fix hardened builds in rpm spec file

The hardened build configure on RHEL8 has a quoted string
with spaces in it, and this breaks the construction of
%eval_configure on lustre.spec.in - the quotes end up in
the wrong place.

Moreover, the hardened build flags are only for user-space
code, and break kernel code compilation on RHEL 8.0 (they
add -fPIE, which isn't valid for kernel code).

This patch stores the %build_cflags and %build_ldflags from
rpmbuild as environment variables before turning hardened
build off to allow the kernel code to build. These
environment variables are used in the lnet/utils and
lustre/utils Makefiles so that the user-space code there
gets the benefit of any system-specific RPM build flag
(such as hardened builds).

For RHEL7 on PPC64 we then also need to define the C macro
__SANE_USERSPACE_TYPES__ so that __s64 and __u64 are long
long instead of the default long - otherwise the build will
fail with a format string error on this platform because
Lustre uses %ll when printing/scanning __s64/__u64.

The environment variables (UTILS_CFLAGS and UTILS_LDFLAGS)
could also be used for a standalone, non-RPM build to pass
flags to the user-space code, with the usual CFLAGS and
LDFLAGS still used for kernel code.

This patch is back-ported from the following one:
Lustre-commit: 5270583ae6e436e9e7ae0199312e7f50365744af
Lustre-change: https://review.whamcloud.com/34882

Signed-off-by: Ben Menadue <ben.menadue@anu.edu.au>
Change-Id: I9b4ba830bf63838fd88ef1bae5dd10dff2109a1d
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/35161
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11742 test: have libtool execute the test binaries 83/34583/7
James Simmons [Fri, 14 Jun 2019 04:52:06 +0000 (21:52 -0700)]
LU-11742 test: have libtool execute the test binaries

With the move to libtool, the ability to run all the lustre
utilities from the source tree was lost. To work around this,
the libtool -no-install flag was used to prevent the creation
of the libtool wrappers. While this restored source-tree
sandbox development, new package breakage is showing.
This is due to the rpath being hard-coded into the utilities when
-no-install is used, and some platforms disable fixed rpaths.

A very similar problem exists for people who want to use gdb to
debug their project's application. gdb does not work on libtool
wrappers either, so the recommended approach to this type of
problem is to use the libtool execute command. This command
allows the execution of an external non-project binary, like
gdb, with the project's real binary application. Apply this
approach to the lustre test suite so commands like kill can
be used to shut down lustre utilities that are not installed into
the testing environment.

Lustre-change: https://review.whamcloud.com/33947
Lustre-commit: f9e5224fbb60bb8b44753b7be10cb06108627f89

Change-Id: I74112f7250f1c43313d868c0edc7c8815d373002
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34583
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12195 tests: use sleep instead of wrapped multiop 55/34955/5
Alex Zhuravlev [Fri, 14 Jun 2019 04:43:47 +0000 (21:43 -0700)]
LU-12195 tests: use sleep instead of wrapped multiop

In the sanity/43* and sanity/14* tests multiop is not a binary
but a libtool-wrapped script, so the tests fail when started from a
build tree.

Lustre-commit: 9a1f327a76f72c7713e53d8b354ff7f0e32be870
Lustre-change: https://review.whamcloud.com/34721

LU-12261 tests: Race between exec and truncate

Execing '$tdir/sleep' with & doesn't guarantee the file is
actually open before returning, so it is sometimes losing
the race with truncate, resulting in errors like this:
/usr/lib64/lustre/tests/sanity.sh: line 4172:
/mnt/lustre/d43b.sanity/sleep: Text file busy

Where $tdir/sleep gets ETXTBSY, instead of truncate as
expected.

A 1 second delay should be enough to guarantee exec wins
the race vs truncate.

Test-Parameters: trivial
Test-Parameters: testgroup=review-ldiskfs-arm
Test-Parameters: testgroup=review-ldiskfs
Test-Parameters: testgroup=review-ldiskfs-arm

Lustre-commit: c64855fca1504bddcb0fc7ad7316d8d6b20a9c6f
Lustre-change: https://review.whamcloud.com/34791

Change-Id: Iaec3433f03aab23583052373e5f0252d9eac7f04
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34955
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11893 ksocklnd: add secondary IP address handling 59/35159/4
James Simmons [Tue, 11 Jun 2019 07:49:50 +0000 (00:49 -0700)]
LU-11893 ksocklnd: add secondary IP address handling

Because ksocknal_enumerate_interfaces() uses for_primary_ifa(), only
primary IP addresses are returned. This prevents the use of network
aliasing, which some people rely on. Change for_primary_ifa() to
for_ifa() so we can detect any secondary IP addresses. Update the
string handling since ifa_device names can be different than the
net_device name. Discard the 'j' counter and instead keep
ksnn_ninterfaces up to date. This means that we return 0 on
success, rather than a count of added interfaces. Update the
too-many-interfaces test in ksocknal_enumerate_interfaces()
with a better test using ARRAY_SIZE.
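
The counting and bounds-check convention described above can be sketched in user space like this. struct demo_net, DEMO_MAX_INTERFACES and demo_add_interface() are hypothetical names, and ARRAY_SIZE is defined locally since the kernel macro is not available outside the kernel.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

#define DEMO_MAX_INTERFACES 4	/* hypothetical limit */

struct demo_net {
	unsigned int addrs[DEMO_MAX_INTERFACES];
	size_t ninterfaces;	/* kept up to date as entries are added */
};

/*
 * Record one address. Returns 0 on success and -1 when the table is
 * full; the caller reads net->ninterfaces to learn the count instead
 * of relying on the return value.
 */
int demo_add_interface(struct demo_net *net, unsigned int addr)
{
	if (net->ninterfaces >= ARRAY_SIZE(net->addrs))
		return -1;
	net->addrs[net->ninterfaces++] = addr;
	return 0;
}
```

Deriving the limit from the array itself means the check stays correct if the array is later resized.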

This patch is back-ported from the following one:
Lustre-commit: 9a2013af0668737dc56424c5c6eaac01621f6c17
Lustre-change: https://review.whamcloud.com/34392

Change-Id: I832df89148def5088502ac92df27b8b3872f3792
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35159
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 socklnd: use for_each_netdev() instead of lnet_ipif_enumerate() 58/35158/3
NeilBrown [Tue, 11 Jun 2019 05:37:58 +0000 (22:37 -0700)]
LU-11838 socklnd:  use for_each_netdev() instead of lnet_ipif_enumerate()

for_each_netdev() is a more direct interface and doesn't require
library support.

Also get the ip address directly from the net_device, rather than
using lnet_ipif_query().

Linux-commit: f703f71afd98e6e7ec70f92ffc52ef3ffffcd849
Linux-commit: 9eb957b98aa6322abde33240bf50dd483c5d1190

This patch is back-ported from the following one:
Lustre-commit: e9d9cbb072956f2582c97263184aecd196bba14a
Lustre-change: https://review.whamcloud.com/33966

Change-Id: I82894991b9a4a250d0560af31325b6c765cc0620
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-on: https://review.whamcloud.com/35158
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 kernel: harden current_time autoconf test 70/35170/2
James Simmons [Tue, 11 Jun 2019 05:26:42 +0000 (22:26 -0700)]
LU-11838 kernel: harden current_time autoconf test

In newer kernels CURRENT_TIME was replaced by current_time(). The
return value of current_time() was struct timespec but to support
time after 2038 the return value was changed to struct timespec64.
This change broke the autoconf test. The solution is to use one
of the struct iattr field in the autoconf test since it hides
the return value type.

Test-Parameters: trivial

This patch is back-ported from the following one:
Lustre-commit: 74b3726f42b1f72e289e3c3252030a62646afa7b
Lustre-change: https://review.whamcloud.com/33963

Change-Id: I95abd2cd2b777f99cbf6ab78370ee2171e5fca67
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/35170
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11359 mdt: fix mdt_dom_discard_data() timeouts 97/35197/3
Mikhail Pershin [Wed, 31 Oct 2018 13:28:29 +0000 (16:28 +0300)]
LU-11359 mdt: fix mdt_dom_discard_data() timeouts

mdt_dom_discard_data() issues a new lock to cause data
discard for all conflicting client locks. This was done in the
context of unlink RPC processing and may cause it to be stuck
waiting for clients to cancel their locks, leading to cascading
timeouts for any other locks waiting on the same resource and
parent directory.

The patch skips waiting for the discard lock in the current context
by using its own CP callback, which doesn't wait for blocking
locks. They will be finished later by LDLM and cleaned up in
that completion callback. So the current thread just makes sure
discard locks are taken and BL ASTs are sent but doesn't wait
for lock granting, and that fixes the original problem.

At the same time this opens a window for a race with data being
flushed on the client, so it is possible that new IO from the client
will happen on a just-unlinked object, causing an error message, and
it is not possible to distinguish that case from other
possibly critical situations. To solve that, the unlinked object
is pinned in memory until the discard lock is granted.
Therefore, such objects can be easily distinguished as stale
and any IO against them can be silently ignored.

Older clients are not fully compatible with async DoM discard, so
the patch also adds a new connection flag ASYNC_DISCARD to distinguish
old clients and use the old blocking discard for them.

Lustre-change: https://review.whamcloud.com/34071
Lustre-commit: 9c028e74c2202a8a481557c4cb22225734aaf19f

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I419677af43c33e365a246fe12205b506209deace
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35197
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838: lnet: remove lnet_ipif_enumerate() 60/35160/5
NeilBrown [Tue, 11 Jun 2019 07:52:04 +0000 (00:52 -0700)]
LU-11838: lnet: remove lnet_ipif_enumerate()

Also remove lnet_ipif_query() and related functions.

There are no longer any users of these functions, so remove them.

Linux-commit: 6e659fcfab0cdd876a555a752acf9997f98acbcd

This patch is back-ported from the following one:
Lustre-commit: dedd3706945ef759d7d645cde30fa488c8ced4a1
Lustre-change: https://review.whamcloud.com/34234

Change-Id: I8183e505e3dbe12ff71ddf38f5b18a945d8a4a6c
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-on: https://review.whamcloud.com/35160
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-12270 o2iblnd: pci_unmap_addr() removed in 4.19 57/35157/4
Li Dongyang [Tue, 11 Jun 2019 07:47:14 +0000 (00:47 -0700)]
LU-12270 o2iblnd: pci_unmap_addr() removed in 4.19

Since kernel 4.19 the pci_unmap_addr() wrappers have
been removed, along with linux/pci-dma.h.
We can use the good old DEFINE_DMA_UNMAP_ADDR instead
of DECLARE_PCI_UNMAP_ADDR.

Linux-commit: 18b01b16e8bae9cd227909f6e6d2783d74855f65

This patch is back-ported from the following one:
Lustre-commit: 0cae491cc6d3cc949972366a3fdfdf32dfea5912
Lustre-change: https://review.whamcloud.com/34827

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I387bd3d1c4e8c3bc75400ce1be05132fb25f8a50
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35157
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 llite: address_space ->page_tree renamed ->i_pages 56/35156/3
Li Dongyang [Tue, 11 Jun 2019 05:55:30 +0000 (22:55 -0700)]
LU-11838 llite: address_space ->page_tree renamed ->i_pages

Kernel 4.17 renamed address_space ->page_tree to ->i_pages,
and switched to xa_lock on the radix_tree_root.

Linux-commit: b93b016313b3ba8003c3b8bb71f569af91f19fc7

This patch is back-ported from the following one:
Lustre-commit: 2d0c621d21be4e67b6075b76017af6e6fcd18c64
Lustre-change: https://review.whamcloud.com/34673

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Iadbc5eda884dbe8ad0d694e0f88255bc496dea5b
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35156
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years ago LU-11838 ldlm: struct timespec64.tv_sec type change 75/35175/2
Li Dongyang [Tue, 11 Jun 2019 05:52:12 +0000 (22:52 -0700)]
LU-11838 ldlm: struct timespec64.tv_sec type change

Since kernel 4.18 struct timespec64 is no longer defined
as struct timespec on 64-bit systems. This means tv_sec
is no longer __kernel_time_t but now time64_t.

Use %llu as the format specifier and explicitly cast it
to unsigned long long.
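
The cast-and-%llu convention can be sketched in user space as follows. demo_format_seconds() is a hypothetical helper and int64_t stands in for the kernel's time64_t.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Format a seconds value whose underlying type may differ between
 * platforms or kernel versions. Casting to unsigned long long makes
 * the argument match %llu no matter how the type is defined.
 */
int demo_format_seconds(char *buf, size_t len, int64_t tv_sec)
{
	return snprintf(buf, len, "%llu", (unsigned long long)tv_sec);
}
```

The same pattern applies to any value whose typedef changed width or signedness between kernel versions, as long as the value is known to be non-negative.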

This patch is back-ported from the following one:
Lustre-commit: f2bf0379a773c8c1659bfe018a22861784a0b9a6
Lustre-change: https://review.whamcloud.com/34677

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ib4c80c9b20854d45b1b3c04057c45ee20d5413d9
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35175
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 osp: atomic64_read() returns s64 74/35174/2
Li Dongyang [Tue, 11 Jun 2019 05:46:42 +0000 (22:46 -0700)]
LU-11838 osp: atomic64_read() returns s64

Since kernel 4.17 atomic64_read on x86_64 returns s64
instead of long.

Use %llu as the format specifier and explicitly cast it
to unsigned long long.

This patch is back-ported from the following one:
Lustre-commit: dc46952ecd1aa09e738b2de6b1a3076ecbaa740e
Lustre-change: https://review.whamcloud.com/34676

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I805d43251f24417e6405f5d087927c15cf531619
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35174
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 lnet: getname dropping addrlen argument 73/35173/2
Li Dongyang [Tue, 11 Jun 2019 05:43:55 +0000 (22:43 -0700)]
LU-11838 lnet: getname dropping addrlen argument

Since kernel 4.17 ->getname() does not take int *addrlen
argument anymore, instead it's returning the length to
the caller.
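
The two calling conventions, and a wrapper that hides the difference from callers, can be sketched in user space as follows. All demo_* names are hypothetical and do not reflect the real socket ->getname() signature.

```c
#include <string.h>

struct demo_addr {
	unsigned char bytes[16];
};

/* Older style: the length comes back through an out parameter. */
int demo_getname_old(struct demo_addr *addr, int *addrlen)
{
	memset(addr, 0, sizeof(*addr));
	*addrlen = (int)sizeof(*addr);
	return 0;
}

/* Newer style: the length (or a negative error) is the return value. */
int demo_getname_new(struct demo_addr *addr)
{
	memset(addr, 0, sizeof(*addr));
	return (int)sizeof(*addr);
}

/*
 * Compat wrapper: callers always get "length or negative error",
 * whichever underlying convention is in use.
 */
int demo_getname(struct demo_addr *addr, int use_new)
{
	int len = 0;
	int rc;

	if (use_new)
		return demo_getname_new(addr);
	rc = demo_getname_old(addr, &len);
	return rc < 0 ? rc : len;
}
```

In a real build the choice between the two branches would be made at configure time rather than by a runtime flag; the flag is used here only to keep the sketch self-contained.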

Linux-commit: 9b2c45d479d0fb8647c9e83359df69162b5fbe5f

This patch is back-ported from the following one:
Lustre-commit: dbb81e826290b2db27e24a85869c9d0736726caa
Lustre-change: https://review.whamcloud.com/34672

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I4ad5de4a22f3fb23c07a356650ea7925acf07eed
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35173
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 llite: remove assert for acl refcount 72/35172/2
James Simmons [Tue, 11 Jun 2019 05:40:39 +0000 (22:40 -0700)]
LU-11838 llite: remove assert for acl refcount

The purpose of this assert was to ensure lustre
was properly managing its posix_acl access. This test
is invalid due to the VFS layer also taking references
on the posix_acl. In reality there is no simple way to
detect this class of mistakes.

* latest kernels remove this refcount *

Linux-commit: 6a42e615a28bad49f2e04829486e94190c066390

This patch is back-ported from the following one:
Lustre-commit: df7bfbb1c7890deed15fd85e75da70d88be2ef7f
Lustre-change: https://review.whamcloud.com/34236

Change-Id: I167f2de449a2e8357517f33c2e81a25b25104d57
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35172
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 o2iblnd: get IP address more directly. 66/35166/3
NeilBrown [Tue, 11 Jun 2019 05:33:43 +0000 (22:33 -0700)]
LU-11838 o2iblnd: get IP address more directly.

Use dev_get_by_name() and for_primary_ifa() to
get IP address for a named device.  This is more
direct.

Linux-commit: 10e138e41a4343fd1a88e4543990205d134e562a
Linux-commit: 9eb957b98aa6322abde33240bf50dd483c5d1190

This patch is back-ported from the following one:
Lustre-commit: 7a40cd2c83d174ae0bb7e22d62fad9fbd247a654
Lustre-change: https://review.whamcloud.com/33970

Change-Id: Ic4562c3948934bacb8613e9f6f57f609ecc04de7
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-on: https://review.whamcloud.com/35166
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-11838 lnet: change lnet_ipaddr_enumerate() to use for_each_netdev() 71/35171/2
NeilBrown [Tue, 11 Jun 2019 05:29:31 +0000 (22:29 -0700)]
LU-11838 lnet: change lnet_ipaddr_enumerate() to use for_each_netdev()

for_each_netdev() is a more direct interface than
lnet_ipif_enumerate(), so use it instead.  Also get
address and 'up' status directly from the device.

This means we may need to re-allocate the storage
space if there are lots of IP addresses.

However there is no need to resize the allocation down if we
over-allocated.  This is only used once, and is freed soon
after it is allocated, so that is a false optimization.
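
The grow-only allocation strategy described above can be sketched in user space as follows. struct demo_ip_list and demo_ip_list_add() are hypothetical, and the doubling policy and initial capacity of 4 are assumptions made for the illustration.

```c
#include <stdlib.h>

struct demo_ip_list {
	unsigned int *addrs;
	size_t count;
	size_t capacity;
};

/*
 * Append one address, doubling the buffer when it is full. The buffer
 * is never shrunk; the caller frees it once with free(list->addrs).
 * Returns 0 on success, -1 if memory could not be allocated.
 */
int demo_ip_list_add(struct demo_ip_list *list, unsigned int addr)
{
	if (list->count == list->capacity) {
		size_t ncap = list->capacity ? list->capacity * 2 : 4;
		unsigned int *tmp = realloc(list->addrs, ncap * sizeof(*tmp));

		if (tmp == NULL)
			return -1;
		list->addrs = tmp;
		list->capacity = ncap;
	}
	list->addrs[list->count++] = addr;
	return 0;
}
```

Because the list is short-lived, any unused tail of the buffer is released together with the rest when the caller frees it, so trimming it first would gain nothing.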

Linux-commit: 0400cf406c32ac3968241cd528747d922b6c55c3

This patch is back-ported from the following one:
Lustre-commit: f5991afd8779fe747778e28e998277a10242a57d
Lustre-change: https://review.whamcloud.com/33969

Change-Id: I1c1e7722c7b2b267dcb8134ae295a54f976d96ad
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35171
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11838 lustre: discard LTIME_S macro 69/35169/2
NeilBrown [Tue, 11 Jun 2019 05:23:20 +0000 (22:23 -0700)]
LU-11838 lustre: discard LTIME_S macro

Rather than using a macro, just access the required
field directly.

Linux-commit: 5b7cc4e4ce3cc15f67462ae75c55eecc7edc3a40

This patch is back-ported from the following one:
Lustre-commit: 65a8ff5fbe8ca014bd01150ab102d8aa43f78cff
Lustre-change: https://review.whamcloud.com/33984

Change-Id: I325cac7458265d1cf6ad9f195a513f2612865906
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/35169
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12269 build: remove %{fullrelease} from Provides 62/35162/3
Ben Menadue [Tue, 11 Jun 2019 03:40:15 +0000 (20:40 -0700)]
LU-12269 build: remove %{fullrelease} from Provides

Commit 7532409 adds a version number to lustre-osd-mount
Provides lines in lustre.spec.in, but includes the
%{fullrelease} macro that was previously removed by
28c17d4. This causes an "unexpanded macro" warning when
building the RPM, and the result contains a bogus string
for that name, e.g.

    2.12.53_45_g43fc4db-%{fullrelease}

This patch simply removes the "-%{fullrelease}" suffix from
these lines in lustre.spec.in.

This patch is back-ported from the following one:
Lustre-commit: 48e16113731a1a9fff06370a0a11ea083ad290b8
Lustre-change: https://review.whamcloud.com/34883

Test-Parameters: trivial
Signed-off-by: Ben Menadue <ben.menadue@anu.edu.au>
Change-Id: Ia13f339f57b89c02443ebc2d68f0aa3b0802319a
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/35162
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12269 build: add value to definition of with_gss in spec 63/35163/3
Ben Menadue [Tue, 11 Jun 2019 03:42:50 +0000 (20:42 -0700)]
LU-12269 build: add value to definition of with_gss in spec

rpmbuild currently fails when gss_keyring is enabled (which
happens automatically if the right packages are installed).
This is due to an ill-formed %define in lustre.spec.in that
doesn't include the value to set the macro to.

This patch updates this line to set the value to 1.

This patch is back-ported from the following one:
Lustre-commit: 5bdf89f0c13eab1513d11b2e1950fba31479535d
Lustre-change: https://review.whamcloud.com/34892

Signed-off-by: Ben Menadue <ben.menadue@anu.edu.au>
Change-Id: I2f52b19795091702622eb3b4c110f09eb80654db
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/35163
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12269 kernel: new kernel [RHEL 8.0 4.18.0-80.el8] 64/35164/3
Jian Yu [Tue, 11 Jun 2019 03:48:26 +0000 (20:48 -0700)]
LU-12269 kernel: new kernel [RHEL 8.0 4.18.0-80.el8]

This patch makes changes to support new RHEL 8.0 release
for Lustre client.

Test-Parameters: trivial

This patch is back-ported from the following one:
Lustre-commit: d37b0ab99eaeeac391088848c275d2757b6ff17d
Lustre-change: https://review.whamcloud.com/34862

Change-Id: I89b4f1e59f8b25bf9d37d3564e2d05d6e87d9b38
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/35164
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11678 quota: protect quota flags at OSC 15/34915/3
Hongchao Zhang [Tue, 22 Jan 2019 08:39:21 +0000 (16:39 +0800)]
LU-11678 quota: protect quota flags at OSC

There is no protection in OSC quota hash tracking the quota flags of
different qid, which could cause the previous request to modify the
quota flags which was set by the current request because the replies
could be out of order.

This patch also adds a lock to protect the operations on the quota
hash from different requests.

Test-Parameters: testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota

Lustre-change: https://review.whamcloud.com/33747
Lustre-commit: 77d9f4e05a5c366ad0f7c2e97a338c6958676f73

Change-Id: Ia6e5141265beacb9401dd533081fa0b85fd5ea6a
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34915
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11768 test: reset qsd_time before test 12/35212/2
Hongchao Zhang [Sat, 22 Dec 2018 22:21:22 +0000 (17:21 -0500)]
LU-11768 test: reset qsd_time before test

In test_6 of sanity-quota, if the qsd_timeout is larger than
TIMEOUT*2, it will trigger the watchdog and cause the test to fail.

Lustre-change: https://review.whamcloud.com/33931
Lustre-commit: 6cb284c2ed92284454580e6e54e00ddc33530c6e

Change-Id: I3f2993ce2b88e1520b6907ae134557abcd30aa0c
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35212
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12375 scripts: Start lnet after opa 60/35260/2
Nathaniel Clark [Sun, 2 Jun 2019 13:50:53 +0000 (09:50 -0400)]
LU-12375 scripts: Start lnet after opa

Ensure ordering of lnet after opa for startup and lnet before opa on
shutdown.

Lustre-change: https://review.whamcloud.com/35032
Lustre-commit: 54adf45f2c73e542a2fcf3125781de791fac9c56

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I4c2cad2381349f866bdc08e2a81e3d8990d8752e
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Artur Novik <anovik@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35260
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12395 build: require python2 for lustre-iokit 51/35251/2
Li Dongyang [Mon, 17 Jun 2019 19:36:11 +0000 (12:36 -0700)]
LU-12395 build: require python2 for lustre-iokit

RHEL8 has split python into python2 and python3 rpms,
and neither of them provides python anymore.
We can just require python2 in the spec; other
distros all have a python rpm providing python2.

This patch is back-ported from the following one:
Lustre-commit: 6f0fcc289887737f45687b5f8a15835aeab32ef4
Lustre-change: https://review.whamcloud.com/35094

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I881c90a4e66d1a431d11d16b9e89781de2f87a7d
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35251
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11355 lustre: enable fstrim on lustre device 96/35196/2
Wang Shilong [Thu, 11 Apr 2019 00:40:23 +0000 (08:40 +0800)]
LU-11355 lustre: enable fstrim on lustre device

Pass the FITRIM ioctl through the OST/MDT
mountpoint to the underlying filesystem, which
allows us to run fstrim on the server mount point directly.

Lustre-change: https://review.whamcloud.com/33131
Lustre-commit: d5be104cc9b0c7a71b30aa5feb16873aa30803a9

Change-Id: Ia6f9b43e48245ee7907a47f05c3924b3640bc734
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35196
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12034 obdclass: put all service's env on the list 10/35110/14
Alex Zhuravlev [Wed, 3 Apr 2019 08:29:06 +0000 (11:29 +0300)]
LU-12034 obdclass: put all service's env on the list

to be able to lookup by current thread where it's too
complicated to pass env by argument.

this version has stats to see slow/fast lookups. so, in sanity-benchmark
there were 172850 fast lookups (from per-cpu cache) and 27228 slow lookups
(from rhashtable). going to see the ratio in autotest's reports.
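
For reference, the fast-path share can be computed from the two
counters quoted above. This is only an illustrative helper with a
made-up name, not code from the patch:

```c
#include <assert.h>

/* Fraction of lookups served by the fast path, in the range [0, 1]. */
static double fast_lookup_ratio(unsigned long fast, unsigned long slow)
{
    unsigned long total = fast + slow;

    assert(total > 0); /* a ratio is undefined with no lookups */
    return (double)fast / (double)total;
}
```

With 172850 fast and 27228 slow lookups, this gives roughly 86% of
lookups served from the per-cpu cache.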

Fixes: 2339e1b3b690 ("LU-11483 ldlm ofd_lvbo_init() and mdt_lvbo_fill() create env")
Fixes: e02cb40761ff ("LU-11164 ldlm: pass env to lvbo methods")

Lustre-change: https://review.whamcloud.com/34566
Lustre-commit: aa82cc83612dbd4c7d05fc101b98e8660f1373db

Change-Id: Ia760e10fa5c68e7a18284e4726d215b330fc0eed
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35110
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12314 tests: Add Missing Description to sanity test 258a 45/35245/2
Arshad Hussain [Fri, 10 May 2019 00:54:03 +0000 (06:24 +0530)]
LU-12314 tests: Add Missing Description to sanity test 258a

This patch adds missing test description to sanity test 258a.

This patch is a back port from master:
Lustre-change: https://review.whamcloud.com/34902
Lustre-commit: ef2a05a61b4201af612356f78197a57b427260b2

Test-Parameters: trivial

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I972549cd049b965c9e6da9b43aa245bab875a77a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/35245
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-8047 llite: optimizations for not granted lock processing 01/35001/3
Andrew Perepechko [Thu, 7 Mar 2019 20:18:45 +0000 (12:18 -0800)]
LU-8047 llite: optimizations for not granted lock processing

This patch removes ll_md_blocking_ast() processing for
not granted locks. The reason is ll_invalidate_negative_children()
can slow down I/O significantly without a reason if there
are thousands or millions of files in the directory
cache.

Lustre-change: https://review.whamcloud.com/19665
Lustre-commit: 2c126c5a73edea434456c6c335772daaac717f2f

Change-Id: Ic69c5f02f71c14db4b9609677d102dd2993f4feb
Seagate-bug-id: MRP-3409
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35001
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11946 build: no zlib check during configure --enable-dist 94/35194/2
Olaf Faaland [Mon, 6 May 2019 18:31:21 +0000 (11:31 -0700)]
LU-11946 build: no zlib check during configure --enable-dist

If the zlib libraries are not found, the error is fatal, and prevents
the sources from being packaged.

This check is unnecessary when sources are being packaged, so this patch
disables the test when configure is run with --enable-dist.

Lustre-change: https://review.whamcloud.com/34811
Lustre-commit: e9d9a0eeb0ff71b207e9d0095fce16af74f842b6

Change-Id: Ie262b17b63c0edc8e8bfbd0c1a466ec37d05622c
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35194
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-8130 libcfs: support latest rhashtable API 43/35143/6
James Simmons [Mon, 11 Feb 2019 16:20:04 +0000 (11:20 -0500)]
LU-8130 libcfs: support latest rhashtable API

With the broad range of distributions supported by the OpenSFS
Lustre version, pieces are missing in some distributions to properly
support using the rhashtable API as required by Lustre.

Lustre-change: https://review.whamcloud.com/34036
Lustre-commit: 8de7221201c0707245c9ee2ef7cdd1d54207b3ee

Change-Id: I7ce2949ca2f1d497dcb60a8b17b964e47cdff223
Test-Parameters: trivial
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35143
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12361 lov: fix wrong calculated length for fiemap 83/35083/3
Wang Shilong [Thu, 30 May 2019 14:46:09 +0000 (22:46 +0800)]
LU-12361 lov: fix wrong calculated length for fiemap

lov_stripe_intersects() will return a closed interval
[@obd_start, @obd_end], so to calculate the length of the interval we need

 @obd_end - @obd_start + 1

rather than

 @obd_end - @obd_start

Wrong extent length will make us return wrong fiemap information.
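
The off-by-one can be illustrated with a small standalone helper. The
function name is hypothetical; only the arithmetic mirrors the
description above:

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes in the closed interval [start, end].
 * Both endpoints are included, hence the "+ 1". */
static uint64_t closed_interval_len(uint64_t start, uint64_t end)
{
    assert(start <= end);
    return end - start + 1;
}
```

For example, the interval [0, 4095] covers 4096 bytes, while
end - start alone would report 4095.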

Lustre-change: https://review.whamcloud.com/34998
Lustre-commit: 225e7b8c70fb68bc3aa3a6d88c5e9bda322c9cc9

Change-Id: I30deb17cf5fa80a6d3046098fbac0d3faa01ad1c
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35083
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11755 ldlm: remove LASSERT() in ns_is_{client,server}() 93/35193/2
Nikitas Angelinas [Tue, 11 Dec 2018 04:49:52 +0000 (20:49 -0800)]
LU-11755 ldlm: remove LASSERT() in ns_is_{client,server}()

(ns->ns_client == LDLM_NAMESPACE_CLIENT ||
ns->ns_client == LDLM_NAMESPACE_SERVER) implies that
(!(ns->ns_client & ~(LDLM_NAMESPACE_CLIENT |LDLM_NAMESPACE_SERVER)))
is also true, so the latter LASSERT() can be removed from
ns_is_{client,server}().
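
A standalone sketch of why the second check is redundant. The enum
values below are assumptions chosen for illustration (the real
definitions live in the Lustre headers), and all names are
hypothetical stand-ins:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag values, for illustration only. */
enum ns_side {
    NS_SERVER = 0x01,
    NS_CLIENT = 0x02,
};

/* First check: the field holds exactly one of the two values. */
static bool side_is_exact(unsigned int side)
{
    return side == NS_CLIENT || side == NS_SERVER;
}

/* Second check: no bits outside the two flags are set. */
static bool side_has_no_stray_bits(unsigned int side)
{
    return !(side & ~(unsigned int)(NS_CLIENT | NS_SERVER));
}

/* True if the first check implies the second for every value in [0, limit). */
static bool first_implies_second(unsigned int limit)
{
    for (unsigned int side = 0; side < limit; side++) {
        if (side_is_exact(side) && !side_has_no_stray_bits(side))
            return false;
    }
    return true;
}

/* Mirrors the single remaining assertion: the side must be client or server. */
static bool ns_side_is_client(unsigned int side)
{
    assert(side_is_exact(side));
    return side == NS_CLIENT;
}
```

The converse does not hold: a value with both bits set passes the mask
check but fails the equality check, so the equality check is the
stricter one and the one worth keeping.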

Lustre-change: https://review.whamcloud.com/33822
Lustre-commit: 0af247b17619f586702b0ad37f0796bdf783b11c

Test-Parameters: trivial

Signed-off-by: Nikitas Angelinas <nangelinas@cray.com>
Cray-bug-id: LUS-6762
Change-Id: Ib165d97ca397d5109c046415d2ec513357b07cbd
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35193
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12323 libcfs: check if save_stack_trace_tsk is exported 85/35085/2
Chris Horn [Wed, 22 May 2019 16:21:14 +0000 (11:21 -0500)]
LU-12323 libcfs: check if save_stack_trace_tsk is exported

Lustre 2.12 commit afedf9343686504c89f2e28cf6133540166f2347 introduced
the use of save_stack_trace_tsk, but this symbol is not exported for
all architectures. When it's possible we can use save_stack_trace
instead. Otherwise skip printing stack trace.

Lustre-change: https://review.whamcloud.com/34937
Lustre-commit: ffb2b46ed7eda42530596df3d52f76250d53e506

Cray-bug-id: LUS-7352
Test-Parameters: clientarch=aarch64
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I142b542f5c5672abbad461a621aedd1e49db1bdd
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35085
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11946 build: no yaml check during configure --enable-dist 88/35088/3
Olaf Faaland [Mon, 6 May 2019 18:38:37 +0000 (11:38 -0700)]
LU-11946 build: no yaml check during configure --enable-dist

If the yaml libraries are not found, the error is fatal, and prevents
the sources from being packaged.

This check is unnecessary when sources are being packaged, so this patch
disables the test when configure is run with --enable-dist.

Lustre-change: https://review.whamcloud.com/34812
Lustre-commit: 748ede308d0eb2bb2762cae9c45978808e20a735

Change-Id: I160e0d54efc59480d2f830607467dbc9f34c9de3
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35088
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11925 hsm: attributes aren't updated after RESTORE 82/35082/2
Andriy Skulysh [Wed, 30 Jan 2019 09:33:49 +0000 (11:33 +0200)]
LU-11925 hsm: attributes aren't updated after RESTORE

MDS returns file size to a client with UPDATE lock
while file is RELEASED. It isn't cancelled after RESTORE
and the client has old file size after appending data.

Flush update lock after RESTORE completed.

Lustre-change: https://review.whamcloud.com/34180
Lustre-commit: 63493381b3a1a1ebd5293df69403ee2ff8b0440e

Change-Id: Ib956dbd075691ce5fac1ce552df9519f9fa768e4
Cray-bug-id: LUS-6945
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35082
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12374 lustre: push rcu_barrier() before destroying slab 97/35097/2
Wang Shilong [Sat, 1 Jun 2019 11:22:11 +0000 (19:22 +0800)]
LU-12374 lustre: push rcu_barrier() before destroying slab

From rcubarrier.txt:

"
We could try placing a synchronize_rcu() in the module-exit code path,
but this is not sufficient. Although synchronize_rcu() does wait for a
grace period to elapse, it does not wait for the callbacks to complete.

One might be tempted to try several back-to-back synchronize_rcu()
calls, but this is still not guaranteed to work. If there is a very
heavy RCU-callback load, then some of the callbacks might be deferred
in order to allow other processing to proceed. Such deferral is required
in realtime kernels in order to avoid excessive scheduling latencies.

We instead need the rcu_barrier() primitive. This primitive is similar
to synchronize_rcu(), but instead of waiting solely for a grace
period to elapse, it also waits for all outstanding RCU callbacks to
complete. Pseudo-code using rcu_barrier() is as follows:

   1. Prevent any new RCU callbacks from being posted.
   2. Execute rcu_barrier().
   3. Allow the module to be unloaded.
"

So using synchronize_rcu() in ldlm_exit() is not safe enough, and we might
still hit a use-after-free problem. We also missed rcu_barrier() when destroying
the inode cache; this is similar to what current local filesystems do.

Lustre-change: https://review.whamcloud.com/35030
Lustre-commit: 1f7613968c800f99ed074f17cd7ba1086847d2db

Change-Id: I76c7dfe7b6472d377fe1b60b0891c61ac8a0fbfc
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/35097
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12350 tests: Do not use background failover 86/35086/2
Patrick Farrell [Tue, 28 May 2019 21:02:49 +0000 (17:02 -0400)]
LU-12350 tests: Do not use background failover

For some reason, test 33 chooses at one point to take an
OST offline by starting failover in the background. It
seems to assume the OST will be offline during the
subsequent read, without doing anything to verify it is
offline - In fact, it could either be not offline yet or
back online with failover complete.

Just use stop like the rest of the test does.

Lustre-change: https://review.whamcloud.com/34985
Lustre-commit: 4ac0324fb9d824915b3dd11b75e81e609d9e8e84

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I9c074ff1412793b8f0d8f15dc1e2ee21bb6d9fd6
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35086
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12120 grants: prevent negative ted_grant value 84/35084/2
Mikhail Pershin [Thu, 30 May 2019 09:30:43 +0000 (12:30 +0300)]
LU-12120 grants: prevent negative ted_grant value

Add check in tgt_grant_shrink() to protect ted_grant
against negative value.
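
One possible shape for such a check, as a hedged userspace sketch. The
struct and function names are hypothetical, and the real patch may
handle an oversized request differently than the clamping shown here:

```c
#include <assert.h>

/* Hypothetical per-export grant counter, in bytes. */
struct export_grant {
    long granted;
};

/* Give back up to `shrink` bytes of grant without letting the counter
 * drop below zero. Returns the number of bytes actually released. */
static long grant_shrink(struct export_grant *exp, long shrink)
{
    assert(shrink >= 0);
    if (shrink > exp->granted)
        shrink = exp->granted;
    exp->granted -= shrink;
    return shrink;
}
```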

Lustre-change: https://review.whamcloud.com/34996
Lustre-commit: 7e08317ef5cbed5cd587017cbe343eb4cc52822c

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Iddea86f052124413ac60f5d0f26bcb68e376ede5
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35084
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12342 spec: mark lsvcgss as a config file in the rpm 87/35087/2
Götz Waschk [Tue, 28 May 2019 06:48:02 +0000 (08:48 +0200)]
LU-12342 spec: mark lsvcgss as a config file in the rpm

The file /etc/sysconfig/lsvcgss shouldn't be overwritten on package
upgrades.

Lustre-change: https://review.whamcloud.com/34978
Lustre-commit: cfe9e1a56c696bcbba24dd4041845ead12aba291

Signed-off-by: Götz Waschk <goetz.waschk@desy.de>
Change-Id: I3fa0a3a5a06d9e59699d23e652329365f38fd028
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35087
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12232 test: commit before df 27/34927/3
Hongchao Zhang [Tue, 2 Apr 2019 02:49:53 +0000 (22:49 -0400)]
LU-12232 test: commit before df

In sub_test6 of replay_ost_single, the transactions at OSTs should
be committed to cleanup the test environment.

Lustre-change: https://review.whamcloud.com/34808
Lustre-commit: f1cbfb96c820aa7e1e5a84176619679d696a117a

Test-Parameters: trivial
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Change-Id: Icbb06789855ab02252b7f1b0b9aff6bbb0f5f2e1
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34927
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12225 obdclass: improve jobid memory reclaim policy 08/35008/2
Wang Shilong [Mon, 29 Apr 2019 13:13:59 +0000 (21:13 +0800)]
LU-12225 obdclass: improve jobid memory reclaim policy

jobid_should_free_item() will be called in following three
cases to decide whether @pidmap should be deleted from hash list:

1) The normal timeout expires and the memory reclaimer is called to
try to free some items.

2) The admin echoes to the sys interface to free some jobids.

3) The client is unmounted to free all memory.

For cases 2 and 3, it makes sense to always return 1;
add a warn_on in case 3 to make sure there isn't any
bug in the code.

For case 1, we could change the policy a bit to not
return 1 if the reference count of @pidmap is larger than 1.
A common case is that a newly added @pidmap is easily freed
from the hash list with the current policy.

Actually, even if we delete @pidmap from the hash list, its memory
will eventually be freed once its reference count reaches
1, and it is very likely that we would delete and insert @pidmap
again, since this could be a hot, long-running job.

Lustre-change: https://review.whamcloud.com/34775
Lustre-commit: 3e9fedfa7ea52a03a1975572ab37cc1ae9344a8a

Change-Id: I61b894a900319953d5a3369bee69bda050102129
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35008
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12324 mdd: Do not record xattr size get in changelogs 10/35010/2
Oleg Drokin [Wed, 22 May 2019 05:22:49 +0000 (01:22 -0400)]
LU-12324 mdd: Do not record xattr size get in changelogs

It looks like if the xattr itself was not fetched there's no
need to create a changelog entry for it. The real get will come
and we'd do it there.

Lustre-change: https://review.whamcloud.com/34936
Lustre-commit: 8f2599c599011a50549cdc79f0b3379239cf4f0c

Change-Id: I5b19f9309f65da0a4c58cb79a95787dab862eb94
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35010
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12225 obdclass: fix race access vs removal of jobid_hash 07/35007/2
Wang Shilong [Mon, 29 Apr 2019 12:46:47 +0000 (20:46 +0800)]
LU-12225 obdclass: fix race access vs removal of jobid_hash

We add @pidmap into the hash and its reference count will be 1.
However, another thread might reclaim this newly added
@pidmap from the hash list, so our attempt to access this @pidmap
becomes a use-after-free operation.

Fix this problem by initializing the reference count to 1 before
adding it to the hash list, which guarantees the memory cannot be
freed during our access.

Other places where memory reclaim is used were checked and follow
a similar idea.
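
A single-threaded sketch of the reference pattern, assuming
hypothetical names throughout. Real kernel code would use atomic
reference counts and the hash table's own locking; the point here is
only that the caller's reference exists before the item is published:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct item {
    int  refcount;
    bool in_table;
};

/* Drop one reference; free on the last one. Returns true if freed. */
static bool item_put(struct item *it)
{
    assert(it->refcount > 0);
    if (--it->refcount == 0) {
        free(it);
        return true;
    }
    return false;
}

/* Allocate with the caller's reference already counted. */
static struct item *item_alloc(void)
{
    struct item *it = calloc(1, sizeof(*it));

    if (it != NULL)
        it->refcount = 1;
    return it;
}

/* Publishing takes a second reference on behalf of the table. */
static void table_insert(struct item *it)
{
    it->refcount++;
    it->in_table = true;
}

/* Reclaimer: unpublish and drop only the table's reference. */
static bool table_reclaim(struct item *it)
{
    it->in_table = false;
    return item_put(it);
}
```

Even if the reclaimer runs immediately after the insert, the item is
still valid for the caller until the caller drops its own reference.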

Lustre-change: https://review.whamcloud.com/34763
Lustre-commit: b664182e0361731fa409ac6a0a0f19637a7e5288

Change-Id: Idd5f429b97e064e29b6883243f8a012c2b4b4ae7
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35007
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11233 build: support for gcc8 02/35002/2
Alex Zhuravlev [Mon, 15 Apr 2019 12:58:59 +0000 (15:58 +0300)]
LU-11233 build: support for gcc8

this patch covers kernel portion of Lustre

Lustre-change: https://review.whamcloud.com/34660
Lustre-commit: 6601661f96325b4971d0d1cb0be0fa01cc2ddc97

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I3fac8b89eef2291b5cb91ea05ee0b6ff32d11741
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35002
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11233 tests: fix gcc8 build warnings 05/35005/2
Alex Zhuravlev [Wed, 22 May 2019 20:28:55 +0000 (13:28 -0700)]
LU-11233 tests: fix gcc8 build warnings

this patch covers Lustre tests

Lustre-change: https://review.whamcloud.com/34661
Lustre-commit: 6733fbff9a682bcec5fdca6f7062c24f0fe27cfe

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I6345d603772fb32bbc4b38a758a3e97f0361d116
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35005
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-12309 osd-zfs: Support disabled project quotas 09/35009/2
Nathaniel Clark [Thu, 16 May 2019 17:18:04 +0000 (13:18 -0400)]
LU-12309 osd-zfs: Support disabled project quotas

Allow project quotas to be compiled in but disabled in the zpool.
This would be the case for zpools created by pre-0.8.0 ZFS, but then
used with newer ZFS.

Lustre-change: https://review.whamcloud.com/34888
Lustre-commit: 291e7196d39365739f9daa02efd25535b5415174

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I79c2c4ee3b191dad4150c218b25ced2508062d51
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35009
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11851 ldiskfs: reschedule for htree thread. 06/35006/2
Yang Sheng [Fri, 1 Feb 2019 05:04:10 +0000 (13:04 +0800)]
LU-11851 ldiskfs: reschedule for htree thread.

A thread may be woken improperly in the htree code. This patch
reschedules the thread to keep locking correct.

Lustre-change: https://review.whamcloud.com/34160
Lustre-commit: 892f62c7a20d87954744c6c8937960379a870992

Change-Id: I6a8d1bbc0470b2577ca80faa304eb06f7913c218
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35006
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11233 utils: fix build warnings for gcc8 04/35004/2
Alex Zhuravlev [Mon, 15 Apr 2019 13:25:50 +0000 (16:25 +0300)]
LU-11233 utils: fix build warnings for gcc8

Quiet new build warnings that appear with GCC8, mainly related
to the length of string buffers not being long enough (in theory)
for the maximum possible string sizes, even if this never actually
is possible in practice.
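
A generic example of the kind of pattern that tends to avoid these
warnings: size the buffer for the worst case and check the snprintf()
return value. The helper below is hypothetical and is not taken from
the Lustre utilities:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Format "<dir>/<name>" into buf. Returns true only if the whole
 * string fit, so callers can handle truncation explicitly. */
static bool build_path(char *buf, size_t size, const char *dir,
                       const char *name)
{
    int n;

    assert(size > 0);
    n = snprintf(buf, size, "%s/%s", dir, name);

    /* snprintf returns the length it wanted to write, excluding the NUL. */
    return n >= 0 && (size_t)n < size;
}
```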

Lustre-change: https://review.whamcloud.com/34662
Lustre-commit: 5e0ad2afa62e9eb7cf4f48c394c6a84c74a02f2f

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I83a955fc68f3e03fe84622ddf1cedfb30d5916ac
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35004
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11233 utils: fix double-free of params fields 03/35003/2
Andreas Dilger [Thu, 18 Apr 2019 23:29:44 +0000 (17:29 -0600)]
LU-11233 utils: fix double-free of params fields

Call find_param_fini() on error so that the params are not leaked
during initialization if there is an intermediate error.

Zero out the parameters as they are freed, so if find_param_fini()
is called multiple times (as it is in some error paths) it does
not corrupt the heap by double freeing pointers.  This can be hit
by calling "lfs getstripe -m" on multiple pathnames, some of which
do not exist.
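
The free-and-zero idiom described above can be sketched as follows.
The struct and function names are hypothetical stand-ins, not the
actual find_param code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parameter structure. */
struct find_params {
    char *path;
    int  *ids;
};

/* Free every field and reset it to NULL, so a second call is harmless:
 * free(NULL) is defined to do nothing. */
static void params_fini(struct find_params *p)
{
    free(p->path);
    p->path = NULL;
    free(p->ids);
    p->ids = NULL;
}

/* On any intermediate failure, clean up through params_fini() so
 * nothing allocated so far is leaked. */
static int params_init(struct find_params *p, const char *path, size_t nids)
{
    assert(path != NULL);
    memset(p, 0, sizeof(*p));
    p->path = malloc(strlen(path) + 1);
    p->ids = calloc(nids ? nids : 1, sizeof(*p->ids));
    if (p->path == NULL || p->ids == NULL) {
        params_fini(p);
        return -1;
    }
    strcpy(p->path, path);
    return 0;
}
```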

Lustre-change: https://review.whamcloud.com/34711
Lustre-commit: 7c7c39d84c98df3c6fe33c04c9e391529a86db53

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie0d7e9ee134deb0633af2f8052b8a458333ebbe5
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35003
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-930 doc: man page for l_getsepol 33/34833/2
Sebastien Buisson [Tue, 5 Feb 2019 15:06:39 +0000 (16:06 +0100)]
LU-930 doc: man page for l_getsepol

Man page for l_getsepol.

Lustre-change: https://review.whamcloud.com/34184
Lustre-commit: e82adfcbd00fecaf588ee8ddd3d5432a5d92a51d

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I338492cebca9a088657ff8bd5122274e7e49a5c7
Reviewed-on: https://review.whamcloud.com/34833
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12178 osd: do not rebalance quota under memory pressure 26/34926/3
Alex Zhuravlev [Tue, 23 Apr 2019 14:51:28 +0000 (17:51 +0300)]
LU-12178 osd: do not rebalance quota under memory pressure

Skip quota rebalancing while under memory pressure; it will happen
eventually anyway.

Lustre-change: https://review.whamcloud.com/34741
Lustre-commit: c5e5b7cd872eb2fa0028cef8b1a5e5c51b085b44

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ibe4ef9e45deed5ea19169f3affed322351785357
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34926
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12169 llite: fill copied dentry name's ending char properly 25/34925/3
Wang Shilong [Mon, 8 Apr 2019 13:22:45 +0000 (21:22 +0800)]
LU-12169 llite: fill copied dentry name's ending char properly

A dentry name expects a trailing '\0', and dentry_len does not count
that extra '\0', so we must allocate memory for it and fill it in
when copying the dentry name ourselves.

Otherwise, lu_name_is_valid_2() will try to access @name[len]
and check whether it is '\0', which is an invalid memory access.
We may hit a crash if that byte happens to be '\0' on the first
access but is then overwritten by someone else, so that we finally
fail the sanity check in mdc_pack_name().

LustreError: 157839:0:(mdc_lib.c:137:mdc_pack_name()) LBUG
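
A minimal sketch of the copy described above, assuming a source name of len
bytes that is not guaranteed to be NUL-terminated. demo_copy_name() is a
hypothetical helper for illustration, not the llite function that was
patched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy a name of `len` bytes into a fresh buffer of len + 1 bytes,
 * so that reading copy[len] is valid and yields '\0'.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *demo_copy_name(const char *src, size_t len)
{
	char *copy;

	assert(src != NULL);

	copy = malloc(len + 1);		/* + 1 for the terminator */
	if (copy == NULL)
		return NULL;

	memcpy(copy, src, len);
	copy[len] = '\0';		/* fill the ending char explicitly */
	return copy;
}
```

Allocating only len bytes would leave copy[len] outside the buffer, which is
the access the validity check performs.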

Fixes: f575b65 ("LU-12020 llite: make sure name pack atomic")

Lustre-change: https://review.whamcloud.com/34611
Lustre-commit: bc9cc327983c45e6255e0d6475b8bdbdcd82c938

Change-Id: I533e19a0e6efb0fca5a46bcdbdb0006d1b1bedab
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34925
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-10777 dom: disable read-on-open with resend 12/34912/3
Mikhail Pershin [Mon, 22 Apr 2019 18:18:01 +0000 (21:18 +0300)]
LU-10777 dom: disable read-on-open with resend

Read-on-open can put more data in the reply buffer than the
client allocated, which causes a buffer re-allocation followed
by a resend. FIO read tests show that such resends perform
worse than a separate READ RPC. For example, FIO 8k read is
~50% better without buffer re-allocation and resend. Since the
MDC parameter 'mdc_dom_min_repsize' already controls the
read-on-open inline buffer size, there is no reason to keep the
're-allocation + resend' option on the MDT. This patch removes it.

Lustre-change: https://review.whamcloud.com/34700
Lustre-commit: e0adb618a4b0d0182419a5731fe046e9157b9f51

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I7eb9d64f5551789e93b1f7676f61c0e7a5149f76
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34912
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12093 osc: don't check capability for every page 20/34920/3
Li Dongyang [Thu, 21 Mar 2019 03:05:14 +0000 (14:05 +1100)]
LU-12093 osc: don't check capability for every page

We check CFS_CAP_SYS_RESOURCE for every page during the I/O.
This is expensive on AppArmor-enabled systems. Instead, do the
check only once for the entire I/O and use the result when
submitting the pages.

Don't init the oap_brw_flags during osc_page_init(), the flag
will be set in either osc_queue_async_io() or osc_page_submit().

Lustre-change: https://review.whamcloud.com/34478
Lustre-commit: c1cab789aaa25bbb4062208aeb2822fde3007cd4

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I0e664f43ce31c276b33476fdff11794185ab0a3b
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34920
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12152 lnet: Cleanup lnet_get_rtr_pool_cfg 22/34922/2
Chris Horn [Thu, 4 Apr 2019 02:40:58 +0000 (21:40 -0500)]
LU-12152 lnet: Cleanup lnet_get_rtr_pool_cfg

The cfs_percpt_for_each loop contains an off-by-one error that causes
memory corruption. In addition, the way these loops are nested results
in unnecessary iterations. We only need to iterate through the cpts
until we match the cpt number passed as an argument. At that point we
want to copy the router buffer pools for that cpt.

Lustre-change: https://review.whamcloud.com/34591
Lustre-commit: 187117fd94e4904c168de02fc439b41a1fcc3e48

Cray-bug-id: LUS-7240
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I8c0dc7bab7ca42dbce04a9e6efa4343da4139239
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34922
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12098 mdd: explicitly clear changelogs on deregister 21/34921/2
Sebastien Buisson [Tue, 16 Apr 2019 13:32:43 +0000 (22:32 +0900)]
LU-12098 mdd: explicitly clear changelogs on deregister

In case of MDS crash in the middle of changelog_deregister, the system
can end up with the changelogs user deregistered, but the changelog
entries not actually cleared. Then the only way to get rid of the
remaining changelogs not used anymore by any user is to register a new
changelogs user and then deregister it.
To protect from this scenario, explicitly clear changelogs used by the
user, before actually deregistering it.

Also add recovery-small test_136 for non-regression purpose.

Lustre-change: https://review.whamcloud.com/34688
Lustre-commit: 83ffa859bc629e246de9fcdfc82838b14c6d0ea3

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I14576180c9351337fc4d9ed0e1b176d352584750
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34921
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12165 quota: fix to use correct fsname array size 23/34923/2
Wang Shilong [Sat, 6 Apr 2019 13:23:23 +0000 (21:23 +0800)]
LU-12165 quota: fix to use correct fsname array size

The fsname may be up to LUSTRE_MAXFSNAME characters long, plus the
trailing '\0', so the array size should be LUSTRE_MAXFSNAME + 1.

Otherwise, we will easily hit the following crash.

[864870.292204] [<ffffffff9230e84e>] dump_stack+0x19/0x1b
[864870.293186] [<ffffffff92308b50>] panic+0xe8/0x21f
[864870.294104] [<ffffffffc0f3f805>] ? qsd_enabled_seq_write+0x205/0x210 [lquota]
[864870.295418] [<ffffffff91c91b8b>] __stack_chk_fail+0x1b/0x20
[864870.296437] [<ffffffffc0f3f805>] qsd_enabled_seq_write+0x205/0x210 [lquota]
[864870.297760] [<ffffffff91e1e418>] ? __sb_start_write+0x58/0x110
[864870.298894] [<ffffffff91e91050>] proc_reg_write+0x40/0x80
[864870.299883] [<ffffffff91e1b490>] vfs_write+0xc0/0x1f0
[864870.300765] [<ffffffff91e1c2bf>] SyS_write+0x7f/0xf0
[864870.301711] [<ffffffff92320795>] system_call_fastpath+0x1c/0x21
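
A minimal sketch of the sizing rule above. DEMO_MAXFSNAME stands in for
LUSTRE_MAXFSNAME with a value chosen for illustration, and demo_copy_fsname()
is a hypothetical helper, not the lquota code from the trace.

```c
#include <assert.h>
#include <string.h>

#define DEMO_MAXFSNAME 8	/* illustrative stand-in for LUSTRE_MAXFSNAME */

/*
 * Copy a filesystem name into a buffer sized for the longest allowed
 * name plus its terminating '\0'.
 * Returns 0 on success, -1 if the name is too long.
 */
static int demo_copy_fsname(char dst[DEMO_MAXFSNAME + 1], const char *src)
{
	size_t len;

	assert(dst != NULL && src != NULL);

	len = strlen(src);
	if (len > DEMO_MAXFSNAME)
		return -1;

	memcpy(dst, src, len + 1);	/* copies the '\0' as well */
	return 0;
}
```

A buffer declared with only DEMO_MAXFSNAME bytes would be overrun by one byte
for a maximum-length name, which is the kind of write a stack protector
reports.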

Lustre-change: https://review.whamcloud.com/34608
Lustre-commit: 4414fd5365612a5fc1243e0e83f91d01949ba7d2

Test-parameters: trivial
Change-Id: I33dd331a83ddac0e0c36a82480e7e90ad0ed2c2a
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34923
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11403 tests: Fix $tfile usage 14/34914/4
Patrick Farrell [Wed, 17 Apr 2019 16:19:09 +0000 (12:19 -0400)]
LU-11403 tests: Fix $tfile usage

We cannot just use raw $tfile - we must use something under
$DIR.  This is resulting in failures because $tfile exists.

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/34698
Lustre-commit: 4191e0cdd0d96b848c1235471179d25d37a889dc

Fixes: a8f4d1e5fd79 ("LU-11403 llite: ll_fault fixes")
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iea6356cabb1623606bf926ce80c55a3210c0b535
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34914
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11403 llite: ll_fault fixes 35/34935/2
Patrick Farrell [Tue, 12 Mar 2019 18:32:21 +0000 (14:32 -0400)]
LU-11403 llite: ll_fault fixes

Various error conditions in the fault path can cause us to
not return a page in vm_fault.  Check if it's present
before accessing it.

Additionally, it's not valid to return VM_FAULT_NOPAGE for
page faults.  The correct return when accessing a page that
does not exist is VM_FAULT_SIGBUS.  Correcting this avoids
looping infinitely in the testcase.

Lustre-change: https://review.whamcloud.com/34247
Lustre-commit: a8f4d1e5fd79e77f1347e983ec52f2ddc3e75ab9

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I53fc16d91462ac5d4555855dfa067d7fd6716c90
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/34935
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11796 lov: Remove unnecessary assert 18/34918/2
Patrick Farrell [Fri, 29 Mar 2019 19:01:01 +0000 (15:01 -0400)]
LU-11796 lov: Remove unnecessary assert

This is asserting on network data from the server, and
additionally, the LU-9846 (overstriping) work shows this
condition is not a problem if it does somehow occur.

Lustre-change: https://review.whamcloud.com/33882
Lustre-commit: 1d71044851192ee36f0841b525d3df0e3b054794

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I7b53eb63914f6e9d31a0747a40d09df9ffedaa91
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34918
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11690 lod: fix LBUG with wide striping 17/34917/2
Patrick Farrell [Thu, 2 May 2019 13:06:58 +0000 (09:06 -0400)]
LU-11690 lod: fix LBUG with wide striping

When striping extremely widely (~1600+ stripes), we reach
more than half of the theoretical limit of layout size,
and LBUG.

It is also possible to trigger this assert with
multi-component PFL files, where all the components are
below the stripe count limit, but together they exceed it.

PFL makes asserting based on LOV_MAX_STRIPE_COUNT
unworkable, so just remove the assert.  Further work is
planned to match up maximum allowed layout size with
the real maximum EA size.

Lustre-change: https://review.whamcloud.com/33708
Lustre-commit: f1ca2c0bd059e3606225127e5ff72b4db9a1ed6e

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id0240785792e7d4084ea6e53b44469a40e59043d
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34917
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12282 build: export IB_OPTIONS before build 28/34928/2
Minh Diep [Fri, 10 May 2019 03:45:51 +0000 (20:45 -0700)]
LU-12282 build: export IB_OPTIONS before build

We need to export any option before dpkg-buildpackage

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/34843
Lustre-commit: eb70bc0b63a6aae50c7b9ab9fc84b6b0e090428d

Change-Id: I683080e1872c8818ae9c391f5971b5e4488147a6
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34928
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-9625 utils: remove old lfs "cp" and "ls" sub-commands 11/34911/2
Andreas Dilger [Wed, 13 Feb 2019 07:54:42 +0000 (00:54 -0700)]
LU-9625 utils: remove old lfs "cp" and "ls" sub-commands

Remove the obsolete "lfs cp" and "lfs ls" sub-commands for handling
"remote" users in a different namespace.  They have been non-working
since commit v2_8_54_0-73-g9d06de3 and were never in use before that.

Instead we have nodemap to handle UID/GID mapping.

Test-Parameters: trivial

Lustre-change: https://review.whamcloud.com/34240
Lustre-commit: a1ecd11712a6a4e8b7819c6826a17bf8677df752

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I8db90d388f8fa621d61fc65ab677b1589b3ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nangelinas@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34911
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11566 utils: fix lctl llog_print for large configs 50/34850/2
Andreas Dilger [Mon, 10 Dec 2018 10:18:19 +0000 (03:18 -0700)]
LU-11566 utils: fix lctl llog_print for large configs

If "lctl llog_print" is called for a large configuration, it will
overflow the 8KB buffer limit for OBD ioctl commands.  The kernel
snprintf calls try to overflow the supplied buffer.  Avoid that.
If the configuration is large, fetch the configuration records in
chunks and print them incrementally.

Add --start and --end options to llog_print and deprecate the use of
positional parameters, since positional parameters are increasingly
complex to parse as options are added, and are harder to use.

The callback for the configuration records will allow "lctl pool_*"
commands to be processed directly on the MGS.

Move existing llog_print test_60aa, test_60ab to conf-sanity as
test_123aa and test_123ab (rename set_param -F test_123 to test_123F).
Add new test_123ac and test_123ad for the new llog_print --start and
--end param, and update test_123aa to test old positional parameters.

Lustre-change: https://review.whamcloud.com/33815
Lustre-commit: 3783aa285b15a811081a8de829d52f7f83e91209

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib7d2ae893033bd4594646c980b7d0ddbd2b3a089
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/34850
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12248 lov: fix ost objects calculation in lod_statfs 16/34816/3
Li Dongyang [Tue, 30 Apr 2019 05:29:19 +0000 (15:29 +1000)]
LU-12248 lov: fix ost objects calculation in lod_statfs

When OSTs report fewer free objects than MDTs, the statfs
object counts are presented with the numbers reported
by the OSTs. Fix the calculation of OST objects to make it
work with statfs aggregation via the MDT.

Make the lfs code consistent with ll_statfs_internal()
and lod_statfs().

Lustre-change: https://review.whamcloud.com/34777
Lustre-commit: 7a6ac6f91273d7b14243509266d38cbbe1eeb550

Fixes: a829595add ("LU-11721 lod: limit statfs ffree if less ...")

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I838a1527ed6411a412b63e2855ca7247755a3bcf
Reviewed-on: https://review.whamcloud.com/34816
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11721 lod: limit statfs ffree if less than OST ffree 53/34453/4
Andreas Dilger [Sun, 3 Feb 2019 00:11:00 +0000 (17:11 -0700)]
LU-11721 lod: limit statfs ffree if less than OST ffree

If the OSTs report fewer total free objects than the MDTs, then
use the free files count reported by the OSTs, since it represents
the minimum number of files that can be created in the filesystem
(creating more may be possible, but this depends on other factors).
This has always been what ll_statfs_internal() reports, but the
statfs aggregation via the MDT missed this step in lod_statfs().

Fix a minor defect in sanity test_418() that would let it loop
forever until the test was killed due to timeout if the "df -i"
and "lfs df -i" output did not converge.

Fixes: b500d5193360 ("LU-10018 protocol: MDT as a statfs proxy")
Fixes: 263e80f4572b ("LU-11721 tests: wait for statfs to update ...")

Lustre-change: https://review.whamcloud.com/34167
Lustre-commit: a829595add808d0fb09bab525c500d0aa6955883

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id8d7b7edfd854f1ec30bfbbb85f04b0c973ebbe5
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nangelinas@cray.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34453
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
5 years agoLU-11830 libcfs: quiet print format warning 05/34805/2
Andreas Dilger [Sun, 5 May 2019 07:56:25 +0000 (01:56 -0600)]
LU-11830 libcfs: quiet print format warning

The watchdog code was reworked in master but generates a warning on
b2_12 due to mismatched print format vs. time64_t.
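
A minimal sketch of one common way to avoid this class of warning: cast the
64-bit value to long long so that it matches %lld on every platform. The
typedef and demo_format_seconds() are hypothetical, and this is not
necessarily the change made in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef int64_t demo_time64_t;	/* stands in for the kernel's time64_t */

/*
 * Format a 64-bit seconds value into buf.
 * Returns the snprintf() result (length that was or would be written).
 */
static int demo_format_seconds(char *buf, size_t len, demo_time64_t secs)
{
	assert(buf != NULL && len > 0);

	/*
	 * int64_t may be "long" or "long long" depending on the platform,
	 * so cast explicitly to the type that %lld expects.
	 */
	return snprintf(buf, len, "%lld", (long long)secs);
}
```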

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I65beef9b5567c45802778afd78e3fde7299406f8
Reviewed-on: https://review.whamcloud.com/34805
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10171 lmv: avoid gratuitous 64-bit modulus 07/34807/2
Andreas Dilger [Wed, 26 Dec 2018 10:45:52 +0000 (03:45 -0700)]
LU-10171 lmv: avoid gratuitous 64-bit modulus

Fix the pct() calculation to use unsigned long arguments, since this
is what callers use.  Remove duplicate pct() definition in lproc_mdc.

Don't do a 64-bit modulus of the LNet NID to find the starting MDT
index when this isn't really needed.

Similarly, don't compute the FLD cache usage percentage for a debug
message that is never used.

Lustre-change: https://review.whamcloud.com/33922
Lustre-commit: e1b63fd21177b40d5c23cedd9e5d81b461db53c3

Fixes: 9b924e86b27d ("LU-10171 headers: define pct(a,b) once")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I34cefd269cb83f563d2f08c32dc3fa1ed5c5a5b1
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34807
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-10171 headers: define pct(a,b) once 06/34806/2
Ben Evans [Wed, 12 Dec 2018 20:20:03 +0000 (15:20 -0500)]
LU-10171 headers: define pct(a,b) once

pct is defined 6 times in different places.  Define it in one.
Also change it to a static inline to do a better job of
enforcing types.
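
A minimal sketch of a percentage helper written as a static inline function
instead of a macro. The name demo_pct(), the unsigned long types, and the
overflow assertion are assumptions for illustration and may differ from the
definition added by the patch.

```c
#include <assert.h>
#include <limits.h>

/*
 * Percentage of a relative to b, rounded down; returns 0 when b is 0.
 * Unlike a macro, the function gives the arguments a fixed type and
 * evaluates each of them exactly once.
 */
static inline unsigned long demo_pct(unsigned long a, unsigned long b)
{
	assert(a <= ULONG_MAX / 100);	/* a * 100 must not overflow */

	return b ? a * 100 / b : 0;
}
```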

Lustre-change: https://review.whamcloud.com/29852
Lustre-commit: 9b924e86b27df0cb7a6f0d4c04ff96f867413485

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: If61132a1096c351a9bcb7debb868351206267535
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34806
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew release 2.12.2 2.12.2 v2_12_2
Oleg Drokin [Sun, 26 May 2019 21:36:17 +0000 (17:36 -0400)]
New release 2.12.2

Change-Id: I9d1ec7d3ede2b9cc2060981a2d75474e2745cea4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoNew tag 2.12.2-RC3 2.12.2-RC3 v2_12_2-RC3
Oleg Drokin [Sun, 26 May 2019 21:34:44 +0000 (17:34 -0400)]
New tag 2.12.2-RC3

Change-Id: Ia745bf6cf9eff0617960750128257fbc76aa62f3
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12279 lnet: use number of wrs to calculate CQEs 33/34933/3
Amir Shehata [Tue, 21 May 2019 20:44:58 +0000 (13:44 -0700)]
LU-12279 lnet: use number of wrs to calculate CQEs

Using concurrent sends to calculate the number of CQEs results
in a small number of CQEs, which exposes an issue where under
failure scenarios, for example when a node reboots, there are not
enough CQEs available, leading to IB_EVENT_QP_FATAL.

Lustre-change: https://review.whamcloud.com/34945
Lustre-commit: 24294b843f79a1167f19d230ff1ab5c1a5cd88e7

Fixes: 83e45ead69ba ("LU-11931 lnd: bring back concurrent_sends")
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I6e2be079e11622b83fe3fb4fdb695f5a2672c9ac
Reviewed-on: https://review.whamcloud.com/34933
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoNew tag 2.12.2-RC2 2.12.2-RC2 v2_12_2-RC2
Oleg Drokin [Thu, 16 May 2019 22:35:18 +0000 (18:35 -0400)]
New tag 2.12.2-RC2

Change-Id: I354830e6b09de4cc15c8c194a3c31ea255783d4a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12298 init: Add init info to lustre sysvinit script 69/34869/2
Nathaniel Clark [Wed, 15 May 2019 18:16:40 +0000 (14:16 -0400)]
LU-12298 init: Add init info to lustre sysvinit script

This adds info to the sysvinit script that systemd can use
to build dependency graphs.

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Ied3bc05d61ba9dc33904a84c5f91bb9adc60cb01
Reviewed-on: https://review.whamcloud.com/34869
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoRevert "LU-8384 scripts: Add scripts to systemd for EL7" 68/34868/2
Nathaniel Clark [Wed, 15 May 2019 18:09:00 +0000 (14:09 -0400)]
Revert "LU-8384 scripts: Add scripts to systemd for EL7"

This reverts commit 420d8c09887ff178508be0434373f74b5ef7ae6e.

The reverted commit prevents Lustre from starting correctly, as seen
in LU-12298.

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Ib0a7e85079d1aea27b3a09496a2bf02c698c294c
Reviewed-on: https://review.whamcloud.com/34868
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoNew RC 2.12.2-RC1 2.12.2-RC1 v2_12_2-RC1
Oleg Drokin [Fri, 10 May 2019 22:52:31 +0000 (18:52 -0400)]
New RC 2.12.2-RC1

Change-Id: I260a08bb593463b64365046e4b01d57a2287f551

5 years agoLU-930 doc: man page for lctl nodemap_set_sepol 35/34835/2
Sebastien Buisson [Mon, 21 Jan 2019 16:07:48 +0000 (01:07 +0900)]
LU-930 doc: man page for lctl nodemap_set_sepol

Man page for lctl nodemap_set_sepol.

Lustre-change: https://review.whamcloud.com/34084
Lustre-commit: 4813c6afbe77137facf4579f458d34d4dda40dd5

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9e27aaa7d5653fcd6225a424bdbb920471b01555
Reviewed-on: https://review.whamcloud.com/34835
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11871 doc: man page for lctl nodemap_set_fileset 34/34834/2
Sebastien Buisson [Thu, 17 Jan 2019 15:53:50 +0000 (00:53 +0900)]
LU-11871 doc: man page for lctl nodemap_set_fileset

Man page for lctl nodemap_set_fileset.

Lustre-change: https://review.whamcloud.com/34057
Lustre-commit: 76965ebc257ef1a090df2b21f107739f6471a9af

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Icce0b1558621bca84b14f76037a5000002855881
Reviewed-on: https://review.whamcloud.com/34834
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11914 build: add a configure check for l_getsepol 32/34832/2
Sebastien Buisson [Tue, 5 Feb 2019 14:09:45 +0000 (23:09 +0900)]
LU-11914 build: add a configure check for l_getsepol

l_getsepol requires openssl-devel, so add a configure check for
openssl/evp.h header and EVP_MD_CTX_create function, and disable
building l_getsepol in case they are missing.

Lustre-change: https://review.whamcloud.com/34183
Lustre-commit: be6de3db9cace327b3d34870417c96c2ac705313

Test-Parameters: trivial
Change-Id: I31ddbc2f5300e9e38db9e00e2b7fbcac7f83d9e5
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-on: https://review.whamcloud.com/34832
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11661 test: improve sanityn test_47g 74/34574/2
Lai Siyao [Mon, 22 Oct 2018 11:51:49 +0000 (19:51 +0800)]
LU-11661 test: improve sanityn test_47g

'stat' may run before 'mkdir'. To avoid this, sync data before the
test and wait longer after starting 'mkdir' in the background.

Test-Parameters: trivial testlist=sanityn,sanityn,sanityn,sanityn mdtfilesystemtype=zfs mdscount=2 mdtcount=4 envdefinitions=ONLY=47g

Lustre-change: https://review.whamcloud.com/33647
Lustre-commit: 6b44117e9f35f6610b2c6d975c89b519aaef7730

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I314bc9d36629a5185efc5ef8281a03337ea77776
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34574
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11803 tests: don't assume obd device name 83/34783/2
James Simmons [Thu, 24 Jan 2019 18:20:48 +0000 (13:20 -0500)]
LU-11803 tests: don't assume obd device name

Several tests created to exercise Lustre were developed on the
x86 platform, and it was assumed that the device names exposed in
the sysfs tree are the same across all platforms. Additionally,
update the tests to handle the case of a UUID format for the
sysfs directory naming instead of an internal address pointer.

Lustre-change: https://review.whamcloud.com/33894
Lustre-commit: b8e6c8bdca9bd0e12d78cd4a06800c13f4293325

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I704d13059f76337fa49aab77f3e748a70a74f1bc
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34783
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11960 build: Add missing libssl-dev DEB package 00/34800/3
Thomas Stibor [Tue, 12 Feb 2019 13:30:51 +0000 (14:30 +0100)]
LU-11960 build: Add missing libssl-dev DEB package

Building Lustre client DEB packages on Debian fails due to missing
package libssl-dev and results in error:
"No such file or directory #include <openssl/evp.h>"
Add required package libssl-dev into "make debs" chain.

Lustre-change: https://review.whamcloud.com/34233
Lustre-commit: a869a4ee0c9f40f80b8f487f114dcfe6971c66bd

Test-Parameters: trivial
Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Change-Id: Ib99cd744b2d44d3f6c1915e2f2d2da7d83e07cae
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/34800
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12068 test: compare position for ZFS dot entry 96/34696/2
Hongchao Zhang [Wed, 27 Mar 2019 17:08:03 +0000 (13:08 -0400)]
LU-12068 test: compare position for ZFS dot entry

In test_6b of sanity-lfsck.sh, the position will be zero for the
special entries "." and "..", so it should not be used to
determine whether the LFSCK process is moving forward. In this
case, the otable position should be used.

Lustre-change: https://review.whamcloud.com/34525
Lustre-commit: 42adbae36f206a6ed4170e7619cd993c8fa80b1d

Change-Id: I98aee1ae92fa5ea742a8001b58e092111d646477
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34696
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-12133 tests: sanityn test_35 syntax error 13/34613/2
Elena Gryaznova [Fri, 29 Mar 2019 14:14:33 +0000 (17:14 +0300)]
LU-12133 tests: sanityn test_35 syntax error

Patch fixes test_35() trivial syntax error.

This patch is a back port from
Lustre-commit: f93c9c9c70bdb321ca5ec7dd84db9a426cb3bc06
Lustre-change: https://review.whamcloud.com/34542

Test-Parameters: trivial envdefinitions=ONLY=35 testlist=sanityn
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5882
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: Id81b9f071920a2111314c869fe2700e6ddf5981a
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/34613
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 years agoLU-11779 tests: add version check for sanity-hsm 35/34735/2
James Nunez [Wed, 3 Apr 2019 19:49:16 +0000 (13:49 -0600)]
LU-11779 tests: add version check for sanity-hsm

sanity-hsm test 255 was added to Lustre tag 2.12.0.
sanity-hsm test 260c was modified with Lustre tag 2.12.0.
Thus, we need to check that the server is 2.12.0
or later before running these tests.

This patch is a back port from
Lustre-commit: 0938b17bd93bf6cd702ddecbab790bad3a8ac1fb
Lustre-change: https://review.whamcloud.com/34589

Fixes: e7d5c1681c07 (LU-11653 hsm: copytool registration wakes the coordinator)
Fixes: b84bc6d895a0 (LU-11572 tests: make sanity-hsm test_260c reliable)

Test-Parameters: trivial serverjob=lustre-b2_10 serverbuildno=168 testlist=sanity-hsm
Test-Parameters: testlist=sanity-hsm

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1a5369ec864432a241a875c3430baa5a064b0dfe
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34735
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years agoLU-11749 tests: sanity-sec 23b exec commands on right node 23/34623/2
Sebastien Buisson [Thu, 13 Dec 2018 09:02:47 +0000 (18:02 +0900)]
LU-11749 tests: sanity-sec 23b exec commands on right node

In sanity-sec test 23b, make sure commands are executed on
client 1, then client 2.
Otherwise, ACL mapping cannot be correctly demonstrated.

This is a backported patch from:
Lustre-commit: 3c64d3310b7b46689b69091f512663bcb5aecdaf
Lustre-change: https://review.whamcloud.com/33846

Test-Parameters: trivial clientcount=2 envdefinitions=ONLY=23b testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec,sanity-sec,sanity-sec

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iffa3946be149313af696907fa6c83a6ea58cb3f6
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34623
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-10073 tests: stop running smoke test 14/34814/2
James Nunez [Fri, 29 Mar 2019 19:01:54 +0000 (13:01 -0600)]
LU-10073 tests: stop running smoke test

lnet-selftest test smoke is failing at a high rate
when tested with ARM clients and when run with Ubuntu
clients. Stop running this test for ARM and Ubuntu
clients until we find a solution.

Lustre-change: https://review.whamcloud.com/34543
Lustre-commit: ddf3c0416790f74e10abc39543843e0de49b176e

Test-Parameters: trivial clientarch=aarch64 testlist=lnet-selftest
Test-Parameters: clientdistro=ubuntu1804 testlist=lnet-selftest
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I5c59b3a5dd42c9b6afcf5e0d1ce17e49efc1b44a
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34814
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 years ago LU-8434 tests: add script language option to auster 15/34815/2
James Nunez [Mon, 22 Apr 2019 16:46:42 +0000 (10:46 -0600)]
LU-8434 tests: add script language option to auster

auster is a script that kicks off the Lustre test suites.
auster assumes that all scripts are written in bash and
runs every script using bash. We may want to run other scripts
to test Lustre, so we need to let the user choose which
scripting language is used to kick off their scripts.
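
The idea can be sketched as below. This is an assumption-laden illustration,
not auster's actual implementation: run_with_interpreter() is a hypothetical
wrapper, and the real option name and argument handling in auster are not
shown here.

```shell
# Hypothetical wrapper: run a test script with a user-selected interpreter
# instead of always using bash.
#   $1      interpreter to use (for example bash, sh, python3)
#   $2...   script path and its arguments
run_with_interpreter() {
    interpreter=$1
    shift
    # Refuse to continue if the requested interpreter is not installed.
    if ! command -v "$interpreter" >/dev/null 2>&1; then
        echo "interpreter not found: $interpreter" >&2
        return 1
    fi
    "$interpreter" "$@"
}
```

A caller would pass the chosen language first, for example
run_with_interpreter sh ./my-test.sh, where my-test.sh is a placeholder name.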

Lustre-change: https://review.whamcloud.com/34737
Lustre-commit: 90a89391f8110fbd1ff9b3041d548dac7e73a99a

Test-Parameters: trivial

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ifbd3707171de57912306cf051a98922249c4b2a9
Reviewed-on: https://review.whamcloud.com/34737
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34815
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>