Whamcloud - gitweb
fs/lustre-release.git
7 weeks ago LU-12923 utils: Use BUILD_BUG_ON() for wiretest.c 47/36647/3
Arshad Hussain [Mon, 28 Oct 2019 22:14:35 +0000 (03:44 +0530)]
LU-12923 utils: Use BUILD_BUG_ON() for wiretest.c

This patch replaces all CLASSERT() with BUILD_BUG_ON()
for file lustre/utils/wiretest.c

This is done by replacing the locally defined CLASSERT() macro
with a BUILD_BUG_ON() macro. It replicates the kernel-defined
BUILD_BUG_ON(), which fails the build when the condition
is true. Because this is user-space code, the kernel-defined
BUILD_BUG_ON() cannot be used here, so we rely on a locally
defined BUILD_BUG_ON().
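
A minimal sketch of what such a locally defined user-space macro could look like. The macro body, the structure and the function name are assumptions made for illustration; they are not taken from wiretest.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative user-space stand-in for the kernel's BUILD_BUG_ON():
 * compilation fails when `cond` is true, because the array type would
 * then have a negative size.
 */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical wire structure used only for this example. */
struct example_wire_hdr {
    uint32_t magic;
    uint32_t flags;
};

/* Returns 1; the checks themselves are evaluated at compile time. */
int example_wire_check(void)
{
    BUILD_BUG_ON(sizeof(struct example_wire_hdr) != 8);
    BUILD_BUG_ON(offsetof(struct example_wire_hdr, flags) != 4);
    return 1;
}
```

Changing either expected value in the sketch makes the build fail, which is the behaviour the commit relies on.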

This patch also fixes a few space/tab issues reported
by checkpatch.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I2f8bfdbd034a2c8059cf356dd72e4255f4999f8e
Reviewed-on: https://review.whamcloud.com/36647
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12923 mdc: Use BUILD_BUG_ON() for mdc_lib.c 46/36646/4
Arshad Hussain [Mon, 28 Oct 2019 20:35:46 +0000 (02:05 +0530)]
LU-12923 mdc: Use BUILD_BUG_ON() for mdc_lib.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/mdc/mdc_lib.c

This patch also fixes a few space/tab issues reported
by checkpatch.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I6bdc084ca73163b88b2dd105b44b9a3cb611a999
Reviewed-on: https://review.whamcloud.com/36646
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
7 weeks ago LU-12923 contrib: Update spelling.txt to add BUILD_BUG_ON() 45/36645/4
Arshad Hussain [Mon, 28 Oct 2019 18:47:50 +0000 (00:17 +0530)]
LU-12923 contrib: Update spelling.txt to add BUILD_BUG_ON()

This is the first in a series of patches that replace
CLASSERT() with the upstream kernel-defined BUILD_BUG_ON().

This specific patch updates contrib/scripts/spelling.txt
to add the line CLASSERT||BUILD_BUG_ON(). This will help
the follow-up patches: checkpatch will flag a warning
if any use of CLASSERT() is still left.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: If8fd76dd107cb53d657b7fa89bd62a9357222629
Reviewed-on: https://review.whamcloud.com/36645
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
7 weeks ago LU-12920 build: replace ed with sed 30/36630/3
Minh Diep [Thu, 31 Oct 2019 14:26:03 +0000 (07:26 -0700)]
LU-12920 build: replace ed with sed

The ed command is very old, so use sed instead.

Test-Parameters: trivial

Change-Id: I18ffe50c3fb006182e68460c03a4d34d5011e62a
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36630
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12895 mdt: check if object exists first 29/36629/5
Sebastien Buisson [Thu, 31 Oct 2019 11:33:45 +0000 (20:33 +0900)]
LU-12895 mdt: check if object exists first

Make sure object exists before trying to get its attr.

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=185a testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idb2cd5d6e3fdf7998040b933be54a001a0e5391b
Reviewed-on: https://review.whamcloud.com/36629
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
7 weeks ago LU-12910 osc: allow increasing osc.*.short_io_bytes 87/36587/16
Andreas Dilger [Sat, 26 Oct 2019 11:32:03 +0000 (05:32 -0600)]
LU-12910 osc: allow increasing osc.*.short_io_bytes

The osc.*.short_io_bytes parameter was mixing up the default and
maximum parameter values, and did not allow increasing the parameter
beyond the default.

Allow it to be increased to the maximum value, which depends on the
client PAGE_SIZE, and the amount of free space in the maximally-sized
OST RPC.  Since the maximum size is system dependent, allow some
grace when setting the parameter, so that a single tunable parameter
can work on a variety of different systems.

However, if it is larger than the maximum RDMA size (which is already
too large) return an error, as it means something is wrong.

Add a test case to exercise the osc.*.short_io_bytes parameter.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2ce73af5963a0f9e0f1079dd2f91a4495a3ebbe5
Reviewed-on: https://review.whamcloud.com/36587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12904 build: Support for gcc -Wimplicit-fallthrough 77/36577/2
Shaun Tancheff [Fri, 25 Oct 2019 13:13:26 +0000 (08:13 -0500)]
LU-12904 build: Support for gcc -Wimplicit-fallthrough

Linux 5.3 enables -Wimplicit-fallthrough
Add decorators for implicit-fallthrough compiler checks.
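
A hedged sketch of how an intentional fall-through can be marked so that -Wimplicit-fallthrough stays quiet. The macro name EXAMPLE_FALLTHROUGH and the function are hypothetical and are not the names used in the patch.

```c
/*
 * Hypothetical annotation macro.  Compilers that know the statement
 * attribute use it; all others get an empty statement.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define EXAMPLE_FALLTHROUGH __attribute__((__fallthrough__))
# endif
#endif
#ifndef EXAMPLE_FALLTHROUGH
# define EXAMPLE_FALLTHROUGH do { } while (0)
#endif

/* Number of setup steps still to run when starting from `stage`. */
int example_steps_left(int stage)
{
    int steps = 0;

    switch (stage) {
    case 0:
        steps++;
        EXAMPLE_FALLTHROUGH;
    case 1:
        steps++;
        EXAMPLE_FALLTHROUGH;
    case 2:
        steps++;
        break;
    default:
        break;
    }
    return steps;
}
```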

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I740062e60e1d19b967ec6b91970cdd3ab03cbab6
Reviewed-on: https://review.whamcloud.com/36577
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12904 build: External _module_ decorator removed 76/36576/2
Shaun Tancheff [Fri, 25 Oct 2019 13:10:59 +0000 (08:10 -0500)]
LU-12904 build: External _module_ decorator removed

As of 5.4 the _module_ decorator prefix is not used for external
kernel modules. This breaks building kernel modules for 5.4.
Prior kernels still require the _module_ decorator.

Add a configure check to detect whether the _module_ decorator is
used, and handle both cases.

Linux-commit: d7b0827f28ab3a4fd65864451ffefa695e3255fd

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I4359452cea8e32a31234b9becc2ed319954c55a4
Reviewed-on: https://review.whamcloud.com/36576
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9859 libcfs: Prevent harmless read underflow 67/36567/2
Dan Carpenter [Thu, 24 Oct 2019 17:42:53 +0000 (13:42 -0400)]
LU-9859 libcfs: Prevent harmless read underflow

Because this is a post-op instead of a pre-op, we end up checking
whether knl_buffer[-1] is a space.  It doesn't really hurt anything, but
it causes a static checker warning so let's fix it.
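
The difference between the two loop forms can be sketched as follows. The helper below is hypothetical and written for user space; only the shape of the loop is meant to mirror the fix.

```c
#include <ctype.h>
#include <string.h>

/*
 * Trim trailing whitespace from buf in place and return the new length.
 */
int example_trim_trailing(char *buf)
{
    int nob = (int)strlen(buf);

    /*
     * Pre-decrement: the index is lowered before the test, so the loop
     * stops once nob reaches -1 and buf[-1] is never read.  With
     * "while (nob-- >= 0)" the body would still run with nob == -1.
     */
    while (--nob >= 0) {
        if (!isspace((unsigned char)buf[nob]))
            break;
    }
    buf[nob + 1] = '\0';
    return nob + 1;
}
```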

Linux-commit: 134aecbc25fd77645baaea5467b2a7ed8e9d1ea7

Change-Id: I40fee264eb1ac461baa183f199b4e5e1b5eb26f5
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9325 mds: replace simple_strtol use with target_name2index() 51/36551/5
James Simmons [Thu, 7 Nov 2019 13:49:00 +0000 (08:49 -0500)]
LU-9325 mds: replace simple_strtol use with target_name2index()

With simple_strtol() going away in the future, we should move to the
kstrtoXXX functions. Looking at the simple_strtol() use in the lod and
osp layers, what it really does is the job of target_name2index(). We
can migrate to this function so there is a single place to update when
moving away from simple_strtol().
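
As a rough user-space illustration of strict index parsing, the sketch below extracts the hexadecimal index from a name of the assumed form "fsname-OSTxxxx" or "fsname-MDTxxxx". It is a hypothetical helper, not the real target_name2index(), and it does not handle suffixes such as "_UUID".

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns 0 and stores the index on success, or -EINVAL when the name
 * does not have the assumed form.
 */
int example_name2index(const char *name, unsigned long *index)
{
    const char *dash = strrchr(name, '-');
    char *end;
    unsigned long val;

    if (dash == NULL || strlen(dash + 1) < 4)
        return -EINVAL;
    if (strncmp(dash + 1, "OST", 3) != 0 &&
        strncmp(dash + 1, "MDT", 3) != 0)
        return -EINVAL;

    errno = 0;
    val = strtoul(dash + 4, &end, 16);
    /* Unlike simple_strtol(), reject empty input or trailing garbage. */
    if (errno != 0 || end == dash + 4 || *end != '\0')
        return -EINVAL;

    *index = val;
    return 0;
}
```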

Change-Id: Ia3d0208c1b1c6bfbe9aa03ce3c068d41ed2c7595
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/36551
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9859 libcfs: move remaining code from linux-module.c to module.c 10/36510/2
NeilBrown [Sun, 20 Oct 2019 15:09:10 +0000 (11:09 -0400)]
LU-9859 libcfs: move remaining code from linux-module.c to module.c

There is no longer any need to keep this code separate,
and now we can remove linux-module.c

Linux-commit: 9604c7ac2005e214cb08500c957a79c58bea5c83

Test-Parameters: trivial

Change-Id: Ie2b905f5a79be17840ddfac0661c10332dc2667d
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/36510
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
7 weeks ago LU-8130 obd: remove unused HASH_CL_ENV_[BKT]_BITS 32/36432/2
James Simmons [Fri, 11 Oct 2019 13:11:00 +0000 (09:11 -0400)]
LU-8130 obd: remove unused HASH_CL_ENV_[BKT]_BITS

There is no libcfs hash table for cl_env, so these can be removed.

Test-Parameters: trivial

Change-Id: I8d9d4f1dc683edc8fc4c14ffc8266deb178c3162
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/36432
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
7 weeks ago LU-12816 ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM 09/36309/8
Ann Koehler [Mon, 14 Oct 2019 16:30:56 +0000 (11:30 -0500)]
LU-12816 ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM

Another path through ptl_send_rpc() can cause the assert reported
in LU-10643. The assertion in ptlrpc_register_bulk() on
!desc->bd_registered fails when an rpc is resent and the first
send attempt failed to successfully attach the reply buffer. The
bulk error cleanup in ptl_send_rpc() does not reset the
bd_registered flag.

Cray-bug-id: LUS-7946
Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: I474211f196ea9bd83a036747e25c91c37c85ffbb
Reviewed-on: https://review.whamcloud.com/36309
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9859 libcfs: opencode cfs_cap_{raise,lower,raised} 04/36304/4
NeilBrown [Fri, 11 Oct 2019 14:38:45 +0000 (10:38 -0400)]
LU-9859 libcfs: opencode cfs_cap_{raise,lower,raised}

Each of these functions is used precisely once, so having
a separate exported function seems like overkill.

cfs_cap_raised() is trivial - one line.
cfs_cap_raise() and cfs_cap_lower() are used as a pair,
which is more effectively implemented with
override_creds() / revert_creds().

Linux-commit: cc738c1a69da27be8ff7885b4069fa02e45c75c1

There is a bug in the original Linux client patch.
Additionally, CAP_SYS_RESOURCE handling is used
extensively in the server code, so we create
simple inline functions that handle this, which makes
the code cleaner.

Change-Id: I3a39a855fb9718ca43e74ef4b9e749b0f43f4bc8
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/36304
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12780 lustre: remove SVC_EVENT flag 57/36257/4
Mr NeilBrown [Tue, 22 Oct 2019 14:42:29 +0000 (10:42 -0400)]
LU-12780 lustre: remove SVC_EVENT flag

This flag is never set or tested, so remove it and the
function for testing it.

Test-Parameters: trivial

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie9fc586ecd26ffce16026d53eac998e3c046d270
Reviewed-on: https://review.whamcloud.com/36257
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks ago LU-12770 test: zfs project_quota testing 97/36197/6
Shaun Tancheff [Fri, 4 Oct 2019 18:13:35 +0000 (13:13 -0500)]
LU-12770 test: zfs project_quota testing

Use zpool get all to query zfs features and check for project_quota.
Use zpool get/set to manage the project_quota feature, where possible.

Cray-bug-id: LUS-7795
Test-Parameters: trivial testlist=sanity-quota
Test-Parameters: fstype=zfs testlist=sanity-quota
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I8111820aa2f4415e8d62c472a3553fe3b9288f19
Reviewed-on: https://review.whamcloud.com/36197
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
7 weeks ago LU-12757 utils: avoid newline inside error message 76/36176/5
Andreas Dilger [Thu, 12 Sep 2019 23:31:00 +0000 (17:31 -0600)]
LU-12757 utils: avoid newline inside error message

When calling llapi_error() the format string should not end in a
newline, since the error string is appended to the output with
its own newline.

Fix several callers to not supply their own newline, and callers
that duplicate the error string in the error message itself.

In the case that there are callers that *do* include a newline,
handle this gracefully to avoid splitting the error across lines.
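
The graceful handling described above can be sketched like this. The function is a hypothetical stand-in, not llapi_error() itself: it formats the caller's message, drops one trailing newline if present, and only then appends the error string and its own newline.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "<message>: <strerror(err)>\n" in out.  A trailing newline in
 * the caller's message is removed first, so the error text stays on
 * the same line.  Returns what snprintf() returns.
 */
int example_format_error(char *out, size_t outlen, int err,
                         const char *fmt, ...)
{
    char msg[256];
    va_list ap;
    size_t len;

    va_start(ap, fmt);
    vsnprintf(msg, sizeof(msg), fmt, ap);
    va_end(ap);

    len = strlen(msg);
    if (len > 0 && msg[len - 1] == '\n')
        msg[len - 1] = '\0';

    return snprintf(out, outlen, "%s: %s\n", msg, strerror(err));
}
```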

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie8f7206d82faccb3b33e2fc62b00f5226b3ebbe5
Reviewed-on: https://review.whamcloud.com/36176
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12275 osd: make osd layer always send complete pages 38/36238/7
Sebastien Buisson [Thu, 19 Sep 2019 17:24:49 +0000 (19:24 +0200)]
LU-12275 osd: make osd layer always send complete pages

In the osd layer, instead of checking whether we go beyond isize, just
make sure we always send complete pages.
Data in the page beyond isize will be discarded by the client anyway, and
it should not be harmful to send at most PAGE_SIZE-1 more bytes for reads
at end of file.
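
The "at most PAGE_SIZE-1 more bytes" bound follows from rounding the end of the read up to a page boundary. A small sketch with hypothetical helper names, assuming the page size is a power of two:

```c
#include <stddef.h>

/* Round isize up to the next multiple of page_size (a power of two). */
size_t example_round_up_to_page(size_t isize, size_t page_size)
{
    return (isize + page_size - 1) & ~(page_size - 1);
}

/* Extra bytes sent beyond isize: between 0 and page_size - 1. */
size_t example_extra_bytes(size_t isize, size_t page_size)
{
    return example_round_up_to_page(isize, page_size) - isize;
}
```

For example, with 4096-byte pages a file of 4097 bytes is served as two full pages, 4095 bytes more than isize, while a file of exactly 8192 bytes needs no padding.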

With this new paradigm, we need to remove sanity test_246, as its sole
purpose is to actually make sure we do not send more than isize bytes
to the client.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I03dc6037a8dfa1d40d40a4b1f675e047d862d933
Reviewed-on: https://review.whamcloud.com/36238
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
7 weeks ago LU-12634 libcfs: force_sig() removed task parameter 45/35745/8
Shaun Tancheff [Tue, 5 Nov 2019 05:12:20 +0000 (23:12 -0600)]
LU-12634 libcfs: force_sig() removed task parameter

Linux 5.3 removed the task parameter from force_sig() in the commit
"signal: Remove task parameter from force_sig".

When force_sig() is not available reset the target thread
default handler to SIG_DFL and proceed to use send_sig(..., 1)
which eventually marshals the same signal to the target task.

kernel-commit: 3cf5d076fb4d48979f382bc9452765bf8b79e740

NOTE: force_sig() is used here instead of a wake_up_process() as tasks
      may be blocked on rpc activity.

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ic28f604d985f7e6c3c3dea8bc284c6f2e212f45c
Reviewed-on: https://review.whamcloud.com/35745
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12634 build: Recognize ELRepo -ml mainline kernel 42/35742/5
Shaun Tancheff [Fri, 25 Oct 2019 16:15:00 +0000 (11:15 -0500)]
LU-12634 build: Recognize ELRepo -ml mainline kernel

Add support for identifying ELRepo kernel-ml style
builds on CentOS 7 and 8 based distributions

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: If4ae7441d4d023d31b1fb42f3fe90ff9c747c0f8
Reviewed-on: https://review.whamcloud.com/35742
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-11762 ldlm: ensure the recovery timer is armed 27/35627/4
Hongchao Zhang [Wed, 10 Jul 2019 08:22:15 +0000 (04:22 -0400)]
LU-11762 ldlm: ensure the recovery timer is armed

During recovery, when the recovery timer has expired, the VBR phase
is initiated only if the current recovery timeout is less than the hard
recovery timeout; otherwise it will be stuck in "wait_event_timeout()"
because no timer is armed and it cannot be woken up.

Change-Id: I32467afa45393e37f255e2b14f160c9da710461b
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12518 llite: support page unaligned stride readahead 37/35437/12
Wang Shilong [Mon, 19 Aug 2019 06:57:29 +0000 (14:57 +0800)]
LU-12518 llite: support page unaligned stride readahead

Currently, Lustre works well for aligned IO, but performance
is poor for unaligned strided reads, and this needs to be
improved.

One of the main problems with the current stride read code is that
it is based on page index, so stride read detection does not work
well when reads are not page-aligned. To support page-unaligned
stride reads, change the page index to a byte offset so that
stride read pattern detection works well and we do not hit
many small-page RPCs and readahead window resets. At the same
time, we should preserve performance for existing cases as far
as possible and make sure there are no obvious regressions for
aligned-stride and sequential reads.
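
To see why byte offsets help, the sketch below checks whether a sequence of read positions has a constant stride, once in bytes and once after conversion to page indices. The helper names, the 4 KiB page size and the 86 KiB stride used in the note after the block are assumptions for illustration; this is not the llite detection code.

```c
#include <stddef.h>

/* Return 1 if consecutive values in v[] all differ by the same amount. */
int example_constant_stride(const unsigned long *v, size_t n)
{
    size_t i;

    for (i = 2; i < n; i++) {
        if (v[i] - v[i - 1] != v[1] - v[0])
            return 0;
    }
    return 1;
}

/* Convert byte offsets to page indices for the given page size. */
void example_to_page_index(const unsigned long *off, unsigned long *idx,
                           size_t n, unsigned long page_size)
{
    size_t i;

    for (i = 0; i < n; i++)
        idx[i] = off[i] / page_size;
}
```

With reads starting at byte offsets 0, 88064, 176128, 264192 and 352256 (an 86 KiB stride), the byte offsets have a constant stride, but the page indices are 0, 21, 43, 64 and 86, whose differences alternate between 21 and 22.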

Benchmark numbers:
iozone -w -c -i 5 -t1 -j 2 -s 1G -r 43k -F /mnt/lustre/data

Patched                 Unpatched
1386630.75 kB/sec       152002.50 kB/sec

Performance improved by more than 800%.

Benchmarked with IOR from ihara:
            FPP Read (MB/sec)   SSF Read (MB/sec)
Unpatched   44,636              7,731
Patched     44,318              20,745

SSF read throughput rose to about 2.7x the unpatched value for the
ior_hard_read workload.

Change-Id: I791745f957af84a6c790c52fbe9f5fed3fd30c77
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
7 weeks ago LU-8066 obd_type: discard obd_type_lock 96/35096/9
NeilBrown [Wed, 18 Sep 2019 02:07:56 +0000 (22:07 -0400)]
LU-8066 obd_type: discard obd_type_lock

This lock is only used to protect typ_refcnt, so change
that to an atomic_t and discard the lock.

The lock also covers calls to try_module_get and module_put,
but this serves no purpose as it does not prevent the module
from being unloaded.

Finally, the return value for the call to try_module_get is
ignored, which is not safe.
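
A user-space analogue of replacing a lock-protected counter with an atomic one, using C11 atomics in place of the kernel's atomic_t. The structure and function names are hypothetical; only the field name typ_refcnt comes from the text above.

```c
#include <stdatomic.h>

/* Hypothetical object type carrying only the reference count. */
struct example_type {
    atomic_int typ_refcnt;
};

/* Take a reference; no separate lock is needed. */
void example_type_get(struct example_type *t)
{
    atomic_fetch_add(&t->typ_refcnt, 1);
}

/* Drop a reference; returns 1 when the last reference went away. */
int example_type_put(struct example_type *t)
{
    return atomic_fetch_sub(&t->typ_refcnt, 1) == 1;
}
```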

Linux-commit: 493ae16ed39a1c9f792c3b650e2dff11ca2e73e8

Change-Id: I904c51cc4d3426ca520c0bcad9665380ce1f3c3d
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35096
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12355 ldiskfs: Update ldiskfs patches for 5.0 51/35051/3
Shaun Tancheff [Mon, 3 Jun 2019 22:55:05 +0000 (17:55 -0500)]
LU-12355 ldiskfs: Update ldiskfs patches for 5.0

Update ldiskfs patch series for 5.0
Update configure for ubuntu19 / 5.0.0 kernel

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I912d457c924c93cfcf98c0b91cd514d5d2a72bbc
Reviewed-on: https://review.whamcloud.com/35051
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12071 osd-ldiskfs: bypass pagecache if requested 22/34422/30
Alex Zhuravlev [Thu, 14 Mar 2019 14:51:31 +0000 (17:51 +0300)]
LU-12071 osd-ldiskfs: bypass pagecache if requested

In a few cases (non-rotational drive, by request, or file size)
osd-ldiskfs may want to skip caching. If so, bypass the page cache
instead of invalidating the cache later, as cache invalidation can
be quite expensive.

To set the maximum cached read/write IO size, use:
     lctl set_param osd-ldiskfs.*.readcache_max_io_mb=N
     lctl set_param osd-ldiskfs.*.writethrough_max_io_mb=N
The default maximum cached IO size is 8MiB.

ladvise() forces IO to go through the cache, and all subsequent
reads will consult the cache.

Change-Id: I37403ced7ad9553128ba168fa36315d6aa1aaf2d
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34422
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks ago LU-12542 handle: move refcount into the lustre_handle. 94/35794/15
NeilBrown [Wed, 11 Sep 2019 15:34:54 +0000 (11:34 -0400)]
LU-12542 handle: move refcount into the lustre_handle.

Most objects with a lustre_handle have a refcount. The exception
is mdt_mfd which uses locking (med_open_lock) to manage its
lifetime. The lustre_handles code currently needs a call-out to
increment its refcount. To simplify things, move the refcount
into the lustre_handle (which will be largely ignored by mdt_mfd)
and discard the call-out.

To avoid warnings when refcount debugging is enabled, the refcount
of mdt_mfd is initialized to 1, and decremented after any
class_handle2object() call which would have incremented it.

In order to preserve the same debug messages, we store an object type
name in the portals_handle_ops, and use that in a CDEBUG() when
incrementing the ref count.

Change-Id: I1920330b2aeffd4b865cb9b249997aa28b209c33
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35794
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-11380 llapi: add llapi_fid_parse() helper 84/36184/11
Andreas Dilger [Fri, 13 Sep 2019 23:27:23 +0000 (17:27 -0600)]
LU-11380 llapi: add llapi_fid_parse() helper

Split the llapi_* FID handling functions into a separate file
rather than continually increasing the size of liblustreapi.c.

Add llapi_fid_parse() to parse a string to binary struct lu_fid.
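
For illustration only, a string-to-FID conversion could look like the sketch below. It assumes the textual form "[0xSEQ:0xOID:0xVER]" and uses a local stand-in structure; the real llapi_fid_parse() signature and accepted syntax may differ.

```c
#include <errno.h>
#include <inttypes.h>
#include <stdio.h>

/* Local stand-in for struct lu_fid (sequence, object id, version). */
struct example_fid {
    uint64_t f_seq;
    uint32_t f_oid;
    uint32_t f_ver;
};

/*
 * Parse three colon-separated hexadecimal fields; the surrounding
 * brackets are optional.  Returns 0 on success or -EINVAL.
 */
int example_fid_parse(const char *str, struct example_fid *fid)
{
    int used = 0;

    if (*str == '[')
        str++;
    if (sscanf(str, "%" SCNx64 ":%" SCNx32 ":%" SCNx32 "%n",
               &fid->f_seq, &fid->f_oid, &fid->f_ver, &used) != 3)
        return -EINVAL;
    /* Only an optional closing bracket may follow the three fields. */
    if (str[used] == ']')
        used++;
    return str[used] == '\0' ? 0 : -EINVAL;
}
```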

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I15abfaf888a5474d62feebab4e8db543ba3ebbe5
Reviewed-on: https://review.whamcloud.com/36184
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9679 modules: declare zero-arg functions correctly 72/36672/2
Mr NeilBrown [Tue, 5 Nov 2019 03:04:32 +0000 (14:04 +1100)]
LU-9679 modules: declare zero-arg functions correctly

Functions that don't take any arguments should be
declared
   return-type name(void)
rather than
   return-type name()

This patch only changes functions that are included in
kernel modules.
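
A short illustration of the difference, using a hypothetical function name:

```c
/* Prototype: "(void)" states that the function takes no arguments. */
int example_get_count(void);

/*
 * With "int example_get_count();" a pre-C23 compiler treats the
 * parameter list as unspecified, so a call such as
 * example_get_count(1, 2) is typically not rejected.
 */
int example_get_count(void)
{
    return 42;
}
```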

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I327c57131c4b5008660844a8436fa27df53c16c7
Reviewed-on: https://review.whamcloud.com/36672
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks ago LU-9679 modules: Use LIST_HEAD for declaring list_heads 69/36669/2
Mr NeilBrown [Tue, 5 Nov 2019 02:19:07 +0000 (13:19 +1100)]
LU-9679 modules: Use LIST_HEAD for declaring list_heads

Rather than
  struct list_head foo = LIST_HEAD_INIT(foo);
use
  LIST_HEAD(foo);

This is shorter and more in-keeping with upstream style.
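
The equivalence of the two forms can be checked in user space with minimal copies of the macros. The definitions below are simplified stand-ins written for this example, and the variable and function names are hypothetical.

```c
/* Minimal user-space versions of the kernel list macros. */
struct list_head {
    struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Long form. */
struct list_head example_long = LIST_HEAD_INIT(example_long);
/* Short form; expands to the same declaration and initializer. */
LIST_HEAD(example_short);

/* An empty list head points at itself. */
int example_list_empty(const struct list_head *head)
{
    return head->next == head;
}
```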

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I36aa8c7e0763f3dfc88fe482cd28935184c1effa
Reviewed-on: https://review.whamcloud.com/36669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-9679 general: avoid bare return; at end of void function 54/36654/3
Mr NeilBrown [Sun, 3 Nov 2019 23:55:04 +0000 (10:55 +1100)]
LU-9679 general: avoid bare return; at end of void function

Having:
   return;
}

at the end of a void function is unnecessary noise.
Where it is the *only* statement in the function, it can
be useful, so those remain unchanged.  The rest have been removed.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If02f6f5b91d4134cf95a68ebccc83df28c360fb2
Reviewed-on: https://review.whamcloud.com/36654
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-12853 ptlrpc: zero session environment 43/36443/2
Alexander Boyko [Mon, 14 Oct 2019 07:31:35 +0000 (03:31 -0400)]
LU-12853 ptlrpc: zero session environment

handle_recovery_req() sets le_ses for request processing,
and doesn't zero it afterwards. This leads to accessing freed memory
in keys_fill() later.

The patch also cleans up the xxx_env_info functions, makes them
identical, and combines them into a single function.

Cray-bug-id: LUS-7676
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ifad95c1177258b6f71effe5fa815f68c8426c516
Reviewed-on: https://review.whamcloud.com/36443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks ago LU-6142 tests: Fix style issues for tchmod.c 41/36441/2
Arshad Hussain [Sun, 29 Sep 2019 23:10:50 +0000 (04:40 +0530)]
LU-6142 tests: Fix style issues for tchmod.c

This patch fixes issues reported by checkpatch
for file lustre/tests/tchmod.c

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I59a94c26e553a616d82ecc9a4d493511e808a82e
Reviewed-on: https://review.whamcloud.com/36441
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
7 weeks ago LU-6142 tests: Fix style issues for sendfile.c 40/36440/2
Arshad Hussain [Sun, 29 Sep 2019 23:19:16 +0000 (04:49 +0530)]
LU-6142 tests: Fix style issues for sendfile.c

This patch fixes issues reported by checkpatch
for file lustre/tests/sendfile.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Idbc8d5cd00b57da8f91b4ce39c40942a7fea8fc3
Reviewed-on: https://review.whamcloud.com/36440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
7 weeks ago LU-6142 tests: Remove file rmdirmany.c 39/36439/2
Arshad Hussain [Sun, 29 Sep 2019 23:02:52 +0000 (04:32 +0530)]
LU-6142 tests: Remove file rmdirmany.c

This patch removes file lustre/tests/rmdirmany.c
This file currently is not used at all by any
tests or binary.

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I3bd39abefb49855d70eed3be57f8e80e2439776d
Reviewed-on: https://review.whamcloud.com/36439
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
7 weeks ago LU-6142 tests: Fix style issues for openunlink.c 38/36438/2
Arshad Hussain [Sun, 29 Sep 2019 22:56:15 +0000 (04:26 +0530)]
LU-6142 tests: Fix style issues for openunlink.c

This patch fixes issues reported by checkpatch
for file lustre/tests/openunlink.c

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ibfd29751769c1b8339ac249ad1379c8d42250ae3
Reviewed-on: https://review.whamcloud.com/36438
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
7 weeks ago LU-6142 tests: Fix style issues for write_time_limit.c 90/36390/3
Arshad Hussain [Sun, 29 Sep 2019 15:34:37 +0000 (21:04 +0530)]
LU-6142 tests: Fix style issues for write_time_limit.c

This patch fixes issues reported by checkpatch
for file lustre/tests/write_time_limit.c

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Id55ff1de3a4c05f04ebe446bfa394d9d0e32997c
Reviewed-on: https://review.whamcloud.com/36390
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
7 weeks agoLU-6142 tests: Fix style issues for unlinkmany.c 89/36389/5
Arshad Hussain [Sun, 29 Sep 2019 16:23:41 +0000 (21:53 +0530)]
LU-6142 tests: Fix style issues for unlinkmany.c

This patch fixes issues reported by checkpatch
for file lustre/tests/unlinkmany.c

This patch also updates the usage message to
print information about the '-d' option.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Idd107ba7c005e0186bc39fc9bb4fc84691919178
Reviewed-on: https://review.whamcloud.com/36389
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-6142 tests: Fix style issues for test_brw.c 88/36388/2
Arshad Hussain [Sun, 29 Sep 2019 17:03:43 +0000 (22:33 +0530)]
LU-6142 tests: Fix style issues for test_brw.c

This patch fixes issues reported by checkpatch
for file lustre/tests/test_brw.c

Test-Parameters: trivial testlist=sanityn,recovery-small,recovery-single,lnet-selftest
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I888fa9289839dbdf6970685395ae17f4d6a28d44
Reviewed-on: https://review.whamcloud.com/36388
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
7 weeks agoLU-6142 tests: Fix style issues for statone.c 87/36387/2
Arshad Hussain [Sun, 29 Sep 2019 17:17:06 +0000 (22:47 +0530)]
LU-6142 tests: Fix style issues for statone.c

This patch fixes issues reported by checkpatch
for file lustre/tests/statone.c

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Idb38488116c471b8f6dbe767eaada2dc328b3d7a
Reviewed-on: https://review.whamcloud.com/36387
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
7 weeks agoLU-6142 tests: Fix style issues for rwv.c 85/36385/3
Arshad Hussain [Sun, 29 Sep 2019 18:08:40 +0000 (23:38 +0530)]
LU-6142 tests: Fix style issues for rwv.c

This patch fixes issues reported by checkpatch
for file lustre/tests/rwv.c

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I574ef984e08a413569391d67a3a27abe9502438b
Reviewed-on: https://review.whamcloud.com/36385
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
7 weeks agoLU-6142 tests: Fix style issues for runas.c 84/36384/2
Arshad Hussain [Sun, 29 Sep 2019 18:36:38 +0000 (00:06 +0530)]
LU-6142 tests: Fix style issues for runas.c

This patch fixes issues reported by checkpatch
for file lustre/tests/runas.c

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Id0b658a0d9fabb520f3f087c0901047518e9f6cf
Reviewed-on: https://review.whamcloud.com/36384
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
7 weeks agoLU-10467 llite: use wait_event in cl_object_put_last() 45/36345/2
NeilBrown [Tue, 1 Oct 2019 18:28:39 +0000 (14:28 -0400)]
LU-10467 llite: use wait_event in cl_object_put_last()

cl_object_put_last() contains an open-coded version
of wait_event().
Replace it with the library macro.

Change-Id: I878f76c9af24e827f91fe50fbeb637dda1489b8a
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36345
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 lov: use wait_event() in lov_subobject_kill() 43/36343/2
NeilBrown [Tue, 1 Oct 2019 18:12:23 +0000 (14:12 -0400)]
LU-10467 lov: use wait_event() in lov_subobject_kill()

lov_subobject_kill() has an open-coded version
of wait_event(). Change it to use the macro.

There is no need to take a spinlock just to check if a variable
has changed value. If there were, the first test would be protected too.

"lti_waiter" now has no users and can be removed from lov_thread_info.

Change-Id: Ic1126fc500c03c48c4426171e98590ef6dce3098
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36343
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 ldlm: convert l_wait_event in __ldlm_namespace_free 89/35989/9
Mr NeilBrown [Wed, 28 Aug 2019 23:35:23 +0000 (09:35 +1000)]
LU-10467 ldlm: convert l_wait_event in __ldlm_namespace_free

The l_wait_event call in __ldlm_namespace_free() can do one
of two things depending on which LWI_* setup call is in effect.
If 'force', it ignores signals and times out after 1/4 second.
If '!force', it has no timeout but allows fatal signals.

So change it to two separate calls: wait_event_idle_timeout()
or l_wait_event_abortable().

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I1ac7ff5daa80581010cd913f01650c07ac40c151
Reviewed-on: https://review.whamcloud.com/35989
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 ldlm: fix style issues in ldlm_flock_completion_ast 83/35983/13
Mr NeilBrown [Thu, 29 Aug 2019 00:45:00 +0000 (10:45 +1000)]
LU-10467 ldlm: fix style issues in ldlm_flock_completion_ast

Prior to some code changes, fix up indenting and other
style issues (particularly multi-line comments) in
ldlm_flock.c.

In a few cases, parentheses have been added so that the re-indenting
done by emacs c-mode does "the right thing".

Test-Parameters: trivial
Change-Id: Ic628acaae875bea9759fbb669c154f046a75a9fa
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35983
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 ptlrpc: fix style issues in import.c 78/35978/9
Mr NeilBrown [Thu, 29 Aug 2019 00:37:41 +0000 (10:37 +1000)]
LU-10467 ptlrpc: fix style issues in import.c

Each of the functions changed here will have a
code change in the next patch, so fix up indenting
and a few other style issues first.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I21a52d7a8a510b9e12d7b822a0abf573247e1405
Reviewed-on: https://review.whamcloud.com/35978
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 llite: style fixes prior to code change. 74/35974/9
Mr NeilBrown [Thu, 29 Aug 2019 00:12:06 +0000 (10:12 +1000)]
LU-10467 llite: style fixes prior to code change.

Next patch will make some code changes to ll_put_super(),
so fix up indenting and fix a couple of checkpatch
warnings first.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I502c81f481c1046d0943f1407a910c1fceeb7ecc
Reviewed-on: https://review.whamcloud.com/35974
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 lustre: use wait_event_idle() where appropriate. 71/35971/10
Mr NeilBrown [Mon, 26 Aug 2019 05:34:17 +0000 (15:34 +1000)]
LU-10467 lustre: use wait_event_idle() where appropriate.

When l_wait_event() is passed an 'lwi' which is initialised
to all zeroes, it behaves exactly like wait_event_idle():
 - no timeout
 - not interrupted by any signal
 - doesn't add to load average.

So change all these instances to wait_event_idle(), or in two cases,
to wait_event_idle_exclusive().

There are three ways that lwi gets set to all zeros:
struct l_wait_info lwi = { 0 };
lwi = LWI_INTR(NULL, NULL);
memset(&lwi, 0, sizeof(lwi));
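
As a rough user-space illustration of the three forms listed above, the
sketch below uses a simplified stand-in for struct l_wait_info. The
struct layout, the EX_LWI_INTR macro and the ex_* helpers are assumptions
made for this example, not the Lustre definitions. It only shows that
each form leaves every field zero, which is the case described as
equivalent to wait_event_idle().

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-in for Lustre's struct l_wait_info. */
struct ex_wait_info {
	long   timeout;                 /* 0 means "no timeout" */
	void (*on_signal)(void *data);  /* NULL means "ignore signals" */
	void  *cb_data;
};

/* Hypothetical analogue of LWI_INTR(cb, data): no timeout, optional callback. */
#define EX_LWI_INTR(cb, data) \
	((struct ex_wait_info){ .timeout = 0, .on_signal = (cb), .cb_data = (data) })

/* True when neither a timeout nor signal handling was requested. */
static bool ex_lwi_is_idle_wait(const struct ex_wait_info *lwi)
{
	return lwi->timeout == 0 && lwi->on_signal == NULL &&
	       lwi->cb_data == NULL;
}

/* Form 1: aggregate initializer. */
static struct ex_wait_info ex_lwi_from_initializer(void)
{
	struct ex_wait_info lwi = { 0 };

	return lwi;
}

/* Form 2: the INTR-style macro with no callback and no data. */
static struct ex_wait_info ex_lwi_from_macro(void)
{
	struct ex_wait_info lwi;

	lwi = EX_LWI_INTR(NULL, NULL);
	return lwi;
}

/* Form 3: explicit memset(). */
static struct ex_wait_info ex_lwi_from_memset(void)
{
	struct ex_wait_info lwi;

	memset(&lwi, 0, sizeof(lwi));
	return lwi;
}
```

Any call site whose lwi satisfies a check like ex_lwi_is_idle_wait() is a
candidate for the conversion this patch describes.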

Change-Id: Ia6723cbe248ce067331a002e5e9d54796739c08a
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35971
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 lustre: don't use l_wait_event() for poll loops. 68/35968/6
Mr NeilBrown [Mon, 26 Aug 2019 04:42:17 +0000 (14:42 +1000)]
LU-10467 lustre: don't use l_wait_event() for poll loops.

When polling without any usable wait queue, it is clearest
to have an explicit poll loop.
So don't use l_wait_event() in these two cases, but
use a while loop with ssleep(1);
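
The shape of such an explicit poll loop can be sketched in user space as
follows. This is only an analogue: sleep() stands in for the kernel's
ssleep(), and ex_poll_until(), ex_countdown_done() and the max_polls
bound are hypothetical additions for the example, not taken from the
patch.

```c
#include <stdbool.h>
#include <unistd.h>

/*
 * Poll 'done' until it reports true, sleeping between checks.
 * Returns the number of sleeps taken, or -1 if max_polls was exhausted.
 */
static int ex_poll_until(bool (*done)(void *arg), void *arg,
			 unsigned int max_polls, unsigned int interval_sec)
{
	unsigned int polls = 0;

	while (!done(arg)) {
		if (polls >= max_polls)
			return -1;
		polls++;
		sleep(interval_sec);	/* kernel code would call ssleep() */
	}
	return (int)polls;
}

/* Example condition: becomes true once the counter has reached zero. */
static bool ex_countdown_done(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return true;
	(*remaining)--;
	return false;
}
```

With no wait queue to sleep on, the loop makes the polling interval and
the exit condition visible at the call site.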

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: Ic6a203085699fb9802d32871479c822ebe3c2510
Reviewed-on: https://review.whamcloud.com/35968
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-10467 lustre: don't use l_wait_event() for simple sleep. 66/35966/6
Mr NeilBrown [Wed, 2 Oct 2019 02:19:01 +0000 (12:19 +1000)]
LU-10467 lustre: don't use l_wait_event() for simple sleep.

Passing '0' as the condition to l_wait_event() means that
it just waits for the given timeout.
This can be done more simply with ssleep(seconds) or in
one case, a schedule_timeout_killable() loop.

In most of these cases, l_wait_event() is configured to ignore signals,
so ssleep() - which also ignores signals - is appropriate.
In one case (lfsck_lib.c) l_wait_event() is configured to respond
to fatal signals, and as there is no ssleep_killable, we
need to open-code one.

ssleep() and schedule_timeout_killable() *will* add to the load
average, while l_wait_event() does not, so if these sleeps happen a
lot, it will add to the load average.  I don't think that will be a
problem for these sleeps.

So remove these l_wait_event() calls and associated variables,
and do it the simpler ways.

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I5a77e631c68f6dfb45fdd7ea01d60b13268240cc
Reviewed-on: https://review.whamcloud.com/35966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-12931 general: fix some cfs_time_seconds() inconsistencies. 68/36668/2
Mr NeilBrown [Mon, 4 Nov 2019 01:58:18 +0000 (12:58 +1100)]
LU-12931 general: fix some cfs_time_seconds() inconsistencies.

mgc_process_log:
  the value stored in 'secs' has units of 'jiffies' which is
  confusing.  So change the name to 'timeout'.
ptl_recover_import:
  the value stored in 'secs' has units of 'jiffies' which is
  confusing.  It is reported in a CDEBUG message as 'seconds'.
  So rename to 'timeout' and report 'obd_timeout', which is in
  seconds.
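
A minimal sketch of the naming issue, assuming an illustrative tick rate:
ex_time_seconds() below is a stand-in for cfs_time_seconds(), and EX_HZ,
struct ex_recover_wait and ex_recover_wait_init() are hypothetical names
used only here. The point is that the jiffies value is kept in 'timeout'
while the value reported to the user stays in seconds.

```c
/* Illustrative tick rate; the real kernel HZ is a build-time setting. */
#define EX_HZ 250L

/* Stand-in for cfs_time_seconds(): convert seconds to jiffies. */
static long ex_time_seconds(long seconds)
{
	return seconds * EX_HZ;
}

struct ex_recover_wait {
	long timeout;		/* jiffies: what a wait primitive consumes */
	long report_secs;	/* seconds: what a debug message should print */
};

/* Keep the two units in separately named fields so they cannot be mixed. */
static struct ex_recover_wait ex_recover_wait_init(long obd_timeout)
{
	struct ex_recover_wait w = {
		.timeout     = ex_time_seconds(obd_timeout),
		.report_secs = obd_timeout,
	};

	return w;
}
```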

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I1c92c3ed45dc8a7ce9b82eb823e2db8779c881fa
Reviewed-on: https://review.whamcloud.com/36668
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-12856 target: check FLFLAGS are valid while accessing them 32/36632/2
Mikhail Pershin [Thu, 31 Oct 2019 20:44:38 +0000 (23:44 +0300)]
LU-12856 target: check FLFLAGS are valid while accessing them

Before checking the OBD_FL_SHORT_IO flag, first check that
OBD_MD_FLFLAGS is valid.
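
The ordering of the two checks can be sketched as below. The struct and
the EX_* bit values are hypothetical stand-ins chosen for the example;
they are not the Lustre definitions of struct obdo, OBD_MD_FLFLAGS or
OBD_FL_SHORT_IO.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define EX_MD_FLFLAGS  (1ULL << 0)	/* "o_flags is valid" bit in o_valid */
#define EX_FL_SHORT_IO (1U << 0)	/* short-I/O bit in o_flags */

/* Reduced stand-in for the object attributes carried in a request. */
struct ex_obdo {
	uint64_t o_valid;	/* which fields below carry meaningful data */
	uint32_t o_flags;
};

/* Only trust o_flags when the sender marked the field as valid. */
static bool ex_is_short_io(const struct ex_obdo *oa)
{
	return (oa->o_valid & EX_MD_FLFLAGS) &&
	       (oa->o_flags & EX_FL_SHORT_IO);
}
```

Without the first test, stale or uninitialized bits in the flags field
could be misread as a short-I/O request.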

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I04ac61141d70883c29a113fac3985ac81cc878af
Reviewed-on: https://review.whamcloud.com/36632
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-13030 pcc: Init saved dataset flags properly 23/36923/3
Qian Yingjin [Wed, 4 Dec 2019 14:44:58 +0000 (22:44 +0800)]
LU-13030 pcc: Init saved dataset flags properly

When initializing a new inode, the saved flags are wrongly set to
PCC_DATASET_NONE, which means that the file is known to be in none
of the PCC datasets.
This patch corrects the initial value to PCC_DATASET_INVALID.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Id775a20711cbc89979e81cbb2b0fe77dc5a850d5
Reviewed-on: https://review.whamcloud.com/36923
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
7 weeks agoLU-13030 pcc: auto attach not work after client cache clear 92/36892/13
Qian Yingjin [Thu, 28 Nov 2019 14:21:12 +0000 (22:21 +0800)]
LU-13030 pcc: auto attach not work after client cache clear

When the inode of a PCC cached file in unused state was evicted
from icache due to memory pressure or manual icache cleanup (i.e.
"echo 3 > /proc/sys/vm/drop_caches"), this file will be detached
from PCC also, and all PCC state for this file is cleared.
In the current design, PCC only tries to auto attach a file that was
once attached into PCC, according to the in-memory PCC state. Thus
later IO for the file is not directed to PCC and will trigger the
data restore.

If this is not the desired result for the user, then we need to try
to auto attach a file that was never attached into PCC, or was once
attached but then detached as a result of shrinking its inode from
icache.

Although the number of candidates for auto attach increases, only
files in the HSM released state (which can be read directly from the
file layout) will be checked.

This bug is easily reproduced on rhel8. It seems that the command
"echo 3 > /proc/sys/vm/drop_caches" will drop all unused inodes
from icache, but this is not true for rhel7.

This patch also adds a check for the input parameter @rwid,
which should be a non-zero value and the same as the archive ID.

Test-Parameters: clientdistro=el8 testlist=sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibb4c7c624de089766f4a56ef08ff0e2088d2e859
Reviewed-on: https://review.whamcloud.com/36892
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-13023 pcc: Incorrect size after re-attach 84/36884/7
Qian Yingjin [Wed, 27 Nov 2019 15:28:24 +0000 (23:28 +0800)]
LU-13023 pcc: Incorrect size after re-attach

The following test case will result in incorrect size for PCC copy:
- Attach a file with size of s1 (s1 > 0) into PCC;
- Detach this file with --keep option, and the data will retain
  on PCC;
- Truncate this file locally or on a remote client to a new size
  s2 (s2 < s1);
- Re-attach the file again. The size of PCC copy is still s1.

To solve this problem, the PCC copy must be truncated to the same
size as the Lustre copy, which will be HSM released later, after the
data copy (archive) phase has finished.
This patch also adds handling for a pending signal when the
attach process is killed by an administrator.
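
A user-space analogue of the size fix, assuming the stale copy and the
authoritative size are both at hand: ex_match_copy_size() and
ex_file_size() are hypothetical helpers for this sketch and are not part
of the PCC code, which does this work inside the kernel.

```c
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the current size of the file behind fd, or -1 on error. */
static off_t ex_file_size(int fd)
{
	struct stat st;

	if (fstat(fd, &st) < 0)
		return (off_t)-1;
	return st.st_size;
}

/*
 * Make the retained copy the same size as the authoritative file.
 * Returns 0 on success and -1 on error (errno set by fstat/ftruncate).
 */
static int ex_match_copy_size(int copy_fd, off_t lustre_size)
{
	off_t copy_size = ex_file_size(copy_fd);

	if (copy_size < 0)
		return -1;
	if (copy_size == lustre_size)
		return 0;	/* nothing to fix */
	return ftruncate(copy_fd, lustre_size);
}
```

In the scenario above the copy still has size s1 after the detach with
--keep, so a call with lustre_size set to s2 shrinks it before reuse.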

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I18f2c883454450bf5dc2f2b3600e2685d8f8f130
Reviewed-on: https://review.whamcloud.com/36884
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event 99/36699/3
Wang Shilong [Thu, 7 Nov 2019 02:18:15 +0000 (10:18 +0800)]
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event

It looks like what's happening is when dm_dispatch_clone_request
dispatches the "clone" I/O request to the underlying (real) device
from the multipath device, the scsi driver can (often under load)
return BLK_MQ_RQ_QUEUE_DEV_BUSY. dm_dispatch_clone_request doesn't
have that as an exception the way it does BLK_MQ_RQ_QUEUE_BUSY and
so it calls dm_complete_request which propagates
the BLK_MQ_RQ_QUEUE_DEV_BUSY error code up the stack resulting
in multipath_end_io calling fail_path and failing the path because
there is an error value set.

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If17ea5b3ab33a89a17d49e5dfb2e9f9f19371564
Reviewed-on: https://review.whamcloud.com/36699
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12410 lnet: Add additional output to sanity-lnet.sh 42/36242/5 multi-rail
Chris Horn [Thu, 19 Sep 2019 19:01:05 +0000 (14:01 -0500)]
LU-12410 lnet: Add additional output to sanity-lnet.sh

Add wrappers around ip netns exec and lnetctl commands to generate
some additional test output. This makes it easier to see what each
test case is doing from the test script output, and aids in debugging
any problems.

Test-parameters: trivial
Test-parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I95b18cb3a090527548a8f9e65845eb4a18dea6d6
Reviewed-on: https://review.whamcloud.com/36242
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12704 lov: check all entries in lov_flush_composite 68/36368/8
Vladimir Saveliev [Thu, 24 Oct 2019 09:17:09 +0000 (12:17 +0300)]
LU-12704 lov: check all entries in lov_flush_composite

Check all layout entries for DOM layout and exit with
-ENODATA if none exists. The caller considers that a valid
case due to layout change.

Define llo_flush methods for all layouts as required
by lov_dispatch().

The patch also cleans up the cl_dom_size field in cl_layout, which
was used in the previous ll_dom_lock_cancel() implementation.

Run lov_flush_composite under down_read lov->lo_type_guard to avoid
race with layout change.

Fixes: 707bab62f5 ("LU-12296 llite: improve ll_dom_lock_cancel")

Test-Parameters testlist=racer
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I4e7b1b201bb1a669fe0d8f0f728467e579ef3512
Reviewed-on: https://review.whamcloud.com/36368
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12370 ptlrpc: grammar fix. 08/36508/4
Alexander Zarochentsev [Fri, 31 Oct 2014 18:48:45 +0000 (21:48 +0300)]
LU-12370 ptlrpc: grammar fix.

ptlrpc_invalidate_import() error message grammar fix.

Test-Parameters: trivial
Cray-bug-id: LUS-4015
Change-Id: Ic1a99440f381ed982e348267996e4523aef8ebad
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/36508
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Colin Faber <cfaber@cray.com>
2 months agoLU-12823 tests: basetest to remove A-Z in test name 22/36322/3
Alex Zhuravlev [Mon, 30 Sep 2019 11:07:03 +0000 (14:07 +0300)]
LU-12823 tests: basetest to remove A-Z in test name

this is required for bash5, otherwise test_mkdir() fails:
test-framework.sh: line 8640: 24A: value too great for
 base (error token is "24A")

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ic2ab4c1530099fbbb57d12f06fbdc761c251ce58
Reviewed-on: https://review.whamcloud.com/36322
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-12530 utils: narrow l_tunedisk udev rule 99/36599/2
Olaf Faaland [Mon, 28 Oct 2019 20:34:53 +0000 (13:34 -0700)]
LU-12530 utils: narrow l_tunedisk udev rule

Narrow the udev rule so that it runs l_tunedisk only for ext4 block
devices formatted for Lustre.

Devices which are members of ZFS pools do not need such tunings to
be provided by lustre - they are handled by ZFS.

There are currently no other OSD types in the tree.  Sites/Vendors which
support other OSDs will need to adjust the rule appropriately.

Change-Id: Iba8b20fc705da0259ab71ee33b92193cae7e8eae
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/36599
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11546 utils: enable large_dir for ldiskfs 55/36555/2
Li Dongyang [Wed, 23 Oct 2019 00:10:34 +0000 (11:10 +1100)]
LU-11546 utils: enable large_dir for ldiskfs

Format MDT with "large_dir" option by default,
to get over the 10M-entry limit for the directories.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie51e6ce28b5f00adc9958de24794a760d9b43b77
Reviewed-on: https://review.whamcloud.com/36555
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
2 months agoLU-12870 build: sanity-hsm test depends on libtool 71/36471/3
Minh Diep [Thu, 17 Oct 2019 14:11:09 +0000 (07:11 -0700)]
LU-12870 build: sanity-hsm test depends on libtool

Adding Ubuntu libtool-bin requirement

Test-Parameters: trivial clientdistro=ubuntu1804 testlist=sanity-hsm

Change-Id: I04cfffc880259e4cf1c2cba142eddd47a95a736e
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36471
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12809 llite: statfs to use NODELAY with MDS 97/36297/3
Alex Zhuravlev [Thu, 26 Sep 2019 12:39:36 +0000 (15:39 +0300)]
LU-12809 llite: statfs to use NODELAY with MDS

otherwise client umount can get stuck if MDS is down
for a reason. recovery-small/110k simulates this.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I40f6059d429b51a877deb532c1d0302dba0d5c85
Reviewed-on: https://review.whamcloud.com/36297
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
2 months agoLU-12778 tests: give time to apply nodemap 80/36280/3
Sebastien Buisson [Tue, 24 Sep 2019 12:50:29 +0000 (14:50 +0200)]
LU-12778 tests: give time to apply nodemap

As nodemap definitions can need time before they are taken into
account, retry several times before declaring the nodemap is not
updated.

Test-Parameters: trivial testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec
Test-Parameters: trivial testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec
Test-Parameters: trivial testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I632bd100ee62e3604aed3aaabc826e7a32287234
Reviewed-on: https://review.whamcloud.com/36280
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12759 osc: don't re-enable grant shrink on reconnect 77/36177/5
Alexander Zarochentsev [Wed, 10 Jul 2019 18:37:33 +0000 (21:37 +0300)]
LU-12759 osc: don't re-enable grant shrink on reconnect

client requests grant shrinking support on each
reconnect and re-enables the capability even if it was
explicitly disabled by lctl set_param.

Cray-bug-id: LUS-7585
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Change-Id: I87b1718022ee3346c9b177890a118410c5757458
Reviewed-on: https://review.whamcloud.com/36177
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12631 llite: report latency for filesystem ops 78/36078/12
Andreas Dilger [Fri, 6 Sep 2019 07:50:44 +0000 (01:50 -0600)]
LU-12631 llite: report latency for filesystem ops

Add the elapsed time of VFS operations to the llite stats
counter, instead of just tracking the number of operations,
to allow tracking of operation round-trip latency.

Update sanity test_127[ab] to check that llite.*.stats and
osc.*.stats counters show read/write stats in usec, and
fix code style nearby.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I40e188374f91c030d978a83157d8869e928cab07
Reviewed-on: https://review.whamcloud.com/36078
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12726 mdt: Fix usage of sscanf 52/36052/7
Patrick Farrell [Fri, 11 Oct 2019 01:36:13 +0000 (21:36 -0400)]
LU-12726 mdt: Fix usage of sscanf

sscanf is returning the number of items matched in the
input, but we need to return the amount of data
successfully written.
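
The distinction can be sketched with a user-space store-style handler.
ex_value_store() is a hypothetical function written for this example and
is not the code changed by the patch; it only shows why the sscanf()
result must not be passed straight back to the caller.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Parse an unsigned value from 'buf' and report how many bytes of input
 * were consumed, as a sysfs-style store handler is expected to do.
 */
static ssize_t ex_value_store(const char *buf, size_t count,
			      unsigned int *out)
{
	unsigned int val;

	/* sscanf() returns the number of converted items (here 0 or 1). */
	if (sscanf(buf, "%u", &val) != 1)
		return -EINVAL;

	*out = val;

	/* Returning 1 here would tell the caller only one byte was written. */
	return (ssize_t)count;
}
```

A caller that sees a return value smaller than 'count' may retry with
the remaining bytes, which is why the item count from sscanf() is the
wrong thing to return.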

Fixes: a408e9dd426f ("LU-8066 mdt: migrate procfs files to sysfs")

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ib90bd260e745692d11656d0b74820573aaa35550
Reviewed-on: https://review.whamcloud.com/36052
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11607 tests: replace lustre_version in mds-survey/pcc/sec 28/35928/3
James Nunez [Mon, 26 Aug 2019 21:43:54 +0000 (15:43 -0600)]
LU-11607 tests: replace lustre_version in mds-survey/pcc/sec

The routine get_lustre_env() is available to all Lustre test
suites and sets an environment variable for the Lustre
version of servers; MGS, MDS1, etc.

In mds-survey, sanity-sec, ost-pools, replay-single and
sanity-pcc, replace the calls to lustre_version_code()
and lustre_build_version() for all server types with
definitions from get_lustre_env().

While doing this, replace ‘lustre_version_code $SINGLEMDS’
with ‘MDS1_VERSION’.  If skip_env() is called based on a
Lustre version check, change this to skip().

Clean up around any modifications by removing calls to
return() or exit() after skip() or skip_env().

Test-Parameters: trivial testlist=mds-survey,sanity-pcc,sanity-sec,ost-pools,replay-single
Test-Parameters: fstype=zfs testlist=mds-survey,sanity-pcc,sanity-sec,ost-pools,replay-single
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ia4b0f426943fdc2f4bcdaa312fbb6f6113ee058f
Reviewed-on: https://review.whamcloud.com/35928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
2 months agoLU-1538 tests: standardize test script init - dne-part-2 87/35787/3
Andreas Dilger [Tue, 13 Aug 2019 21:50:03 +0000 (15:50 -0600)]
LU-1538 tests: standardize test script init - dne-part-2

Standardize the initial Lustre test script initialization for
clarity and consistency for test suites in review-dne-part-2.

The LUSTRE path is already normalized in init_test_env(), so
this doesn't need to be done in the caller.  Use $(...)
subshells instead of `...` in the affected lines.  Remove
NAME, CHECKSTAT, TMP, SAVE_PWD, SRCDIR, PATH, MULTIOP, SETUP,
CLEANUP variable initialization, since it is already done in
init_test_env() or not needed in the test script.  Remove all
calls to get_lustre_env() in the test scripts since this is
called in init_test_env().

Move all definitions of ALWAYS_EXCEPT and SLOW to after
init_test_env() and init_logging() and call build_test_filter()
immediately after the ALWAYS_EXCEPT and SLOW definitions.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-2
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I6a04c9ffc0ce965c7f170119814d6ee8a30631df
Reviewed-on: https://review.whamcloud.com/35787
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12477 lnet: Remove obsolete config options 43/35343/5
Patrick Farrell [Thu, 15 Aug 2019 15:52:37 +0000 (11:52 -0400)]
LU-12477 lnet: Remove obsolete config options

Remove a few obsolete config options, some for kernels we
no longer support, some for configs that should be unused.

Test-Parameters: trivial

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Icb949b1dd69e5cac0278965dfc5b8e3d1120f153
Reviewed-on: https://review.whamcloud.com/35343
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-1538 tests: standardize test script init dne-part-1 63/35263/7
Andreas Dilger [Tue, 18 Jun 2019 14:55:16 +0000 (08:55 -0600)]
LU-1538 tests: standardize test script init dne-part-1

Standardize the initial Lustre test script initialization for
clarity and consistency.

The LUSTRE path is already normalized in init_test_env(), so this
doesn't need to be done in the caller.  Use $(...) subshells instead
of `...` in the affected lines.  Remove NAME, SRCDIR, PATH, MULTIOP,
SETUP, CLEANUP, CHECKSTAT, TMP, SAVE_PWD variable initialization,
since it is already done in init_test_env() or not needed in the test
scripts.  Remove all calls to get_lustre_env() in the test scripts
since this is called in init_test_env().

Move all definitions of ALWAYS_EXCEPT and SLOW to after
init_test_env() and init_logging() and call build_test_filter()
immediately after the ALWAYS_EXCEPT and SLOW definitions.

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-part-1
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ia8a1b3afcca7af645eed1d0f3dcf843e5254afe6
Reviewed-on: https://review.whamcloud.com/35263
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12137 llite: use ->iterate_shared() for readdir 56/34556/14
Andreas Dilger [Fri, 27 Sep 2019 21:06:06 +0000 (17:06 -0400)]
LU-12137 llite: use ->iterate_shared() for readdir

Use the ->iterate_shared() method for readdir in llite, which has
been available since kernel commit v4.6-rc3-29-g6192269.

Remove duplicate autoconf check for the "iterate_shared" method.

Fixes: e41bdca7559 ("LU-11071 build: Add server build support for Ubuntu 18.04")
Test-Parameters: clientdistro=sles12sp3
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4dcec7c8ce1a5dbcc1e0ebc74ac47d1b7a4cab07
Reviewed-on: https://review.whamcloud.com/34556
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-10447 tests: replace $SET/$GETSTRIPE for misc tests 19/33919/12
James Nunez [Thu, 27 Dec 2018 03:26:09 +0000 (20:26 -0700)]
LU-10447 tests: replace $SET/$GETSTRIPE for misc tests

$SETSTRIPE and $GETSTRIPE were needed when we used the
standalone 'lstripe' utility. 'lstripe' hasn't been used
for years and we need to clean up all remnants of it.

Replace all instances of $SETSTRIPE with '$LFS setstripe'
and $GETSTRIPE with '$LFS getstripe' in the recovery-small,
sanity-hsm, sanity-lfsck, and sanity-scrub test suites.
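
The substitution described above is mechanical, so it can be sketched
as a small rewrite helper.  This is only an illustration in Python; the
commit message does not say how the edits were actually made, and the
name rewrite_stripe_vars() and the regular expression are assumptions.

```python
import re

# Old variable -> replacement command, as described in the commit message.
REPLACEMENTS = {
    "SETSTRIPE": "$LFS setstripe",
    "GETSTRIPE": "$LFS getstripe",
}

# Match $SETSTRIPE or $GETSTRIPE, but not longer names such as
# $SETSTRIPE_ARGS (the \b requires a word boundary after the name).
_PATTERN = re.compile(r"\$(SETSTRIPE|GETSTRIPE)\b")


def rewrite_stripe_vars(line):
    """Return 'line' with the old variables replaced (hypothetical helper)."""
    return _PATTERN.sub(lambda match: REPLACEMENTS[match.group(1)], line)
```

The sketch deliberately ignores the braced form ${SETSTRIPE}; a real
cleanup would need to check whether the test scripts use it.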

Test-Parameters: trivial testlist=recovery-small,sanity-hsm
Test-Parameters: testlist=sanity-scrub
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I3ac87e0c48f33ac40cd8631f3d403fed4090adc5
Reviewed-on: https://review.whamcloud.com/33919
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
2 months agoNew development branch for Lustre 2.14 2.13.50 v2_13_50
Oleg Drokin [Fri, 8 Nov 2019 16:47:17 +0000 (11:47 -0500)]
New development branch for Lustre 2.14

Change-Id: I5636e71d6618d23e72127c4154affec62718cdb4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12932 lod: rename qos_threshold_rr parameter 86/36686/6
James Simmons [Wed, 6 Nov 2019 18:19:59 +0000 (13:19 -0500)]
LU-12932 lod: rename qos_threshold_rr parameter

Rename the qos_thresholdrr parameter back to its original name of
qos_threshold_rr so that there is no interop breakage. Update
test to handle mdt_qos_threshold_rr which lines up with the name
of qos_* sysfs files. Since we are using kstrtouint() directly,
we have to strip the '%' that could be passed in.
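
The '%' handling can be illustrated outside the kernel.  The following
is a minimal Python sketch, not the lod code: parse_percent() is a
hypothetical name, and the 0..100 range check is an assumption that the
commit message does not state.

```python
def parse_percent(text):
    """Parse '17' or '17%' into an integer percentage (hypothetical helper)."""
    value = text.strip()
    # A strict unsigned-integer parser rejects a trailing '%', so drop
    # it before converting.
    if value.endswith("%"):
        value = value[:-1]
    if not (value.isascii() and value.isdigit()):
        raise ValueError("not an unsigned integer: %r" % text)
    percent = int(value)
    if percent > 100:  # assumed upper bound for a percentage
        raise ValueError("percentage out of range: %d" % percent)
    return percent
```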

Change-Id: I318a2ece6910e28a7a2331851d13b2269cf23e28
Fixes: c1d0a355a6a6 ("LU-12624 lod: alloc dir stripes by QoS")
Test-Parameters: trivial testlist=sanityn
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36686
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12734 misc: allow older bash_completion versions 59/36459/3
Andreas Dilger [Tue, 15 Oct 2019 23:07:53 +0000 (17:07 -0600)]
LU-12734 misc: allow older bash_completion versions

Allow the "lctl" bash_completion to work on older versions which
don't have _init_completions().  Check at runtime if this
function is available, and if not fall back to an older interface.

Has been manually tested with both bash-completion v1.3 and v2.1.

Fixes: f87a7f2656ce ("LU-12734 misc: add bash completion for lctl set/get_param")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3822c0967354d83d12f299c4be3023b2fc254035
Reviewed-on: https://review.whamcloud.com/36459
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
2 months agoLU-12932 tests: remove obsolete qos.sh test script 66/36666/2
Andreas Dilger [Mon, 4 Nov 2019 19:21:29 +0000 (12:21 -0700)]
LU-12932 tests: remove obsolete qos.sh test script

The qos.sh test script is broken for a number of reasons:
- hard coded filesystem name
- uses old positional parameters for lfs setstripe
- sets parameters on client that should be set on MDS

and is duplicated by sanity test_116a.  Remove it.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3d4b795a65f6fbb4398f76f4a533d753700cab07
Reviewed-on: https://review.whamcloud.com/36666
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-12932 lod: restore qos_thresholdrr sysfs file 67/36667/2
James Simmons [Mon, 4 Nov 2019 21:53:25 +0000 (16:53 -0500)]
LU-12932 lod: restore qos_thresholdrr sysfs file

The introduction of directory stripe allocation by space usage
renamed the lod sysfs file qos_thresholdrr to lod_qos_thresholdrr
but this breaks backwards compatibility. Restore qos_thresholdrr.

Fixes: c1d0a355a6 ("LU-12624 lod: alloc dir stripes by QoS")

Change-Id: I93bf29cbec3c3a5a7a8527353aa8005ebd340ec5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36667
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-12895 tests: stop running tests for SSK and SELinux 16/36616/7
James Nunez [Wed, 30 Oct 2019 16:47:16 +0000 (10:47 -0600)]
LU-12895 tests: stop running tests for SSK and SELinux

There are a few tests that crash consistently when Shared Secret Key
(SSK) and/or SELinux are enabled.  We need to stop running them, by
adding them to the ALWAYS_EXCEPT list, until we can find a solution.

  sanity test 185, 230b, 230d, 272a
  recovery-small test 110k, 136

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-ssk
Test-Parameters: testgroup=review-dne-selinux
Test-Parameters: testgroup=review-dne-selinux-ssk
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I0af5f0c9d0d3c56a79e6558f2ce9f4e5a0a2d4c5
Reviewed-on: https://review.whamcloud.com/36616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-12903 doc: make PCC man pages 72/36572/4
James Nunez [Thu, 24 Oct 2019 22:54:56 +0000 (16:54 -0600)]
LU-12903 doc: make PCC man pages

Several man pages for the Persistent Client Cache
feature were not included in the doc/Makefile.am
file and, thus, they do not show up on the Lustre client.

Add the following man pages to the Makefile:
lctl-pcc.8
lfs-pcc-detach.1
llapi_pcc_attach.3
llapi_pcc_attach_fid.3
llapi_pcc_attach_fid_str.3
llapi_pcc_detach_fid.3
llapi_pcc_detach_fid_fd.3
llapi_pcc_detach_fid_str.3
llapi_pcc_detach_file.3
llapi_pccdev_get.3
llapi_pccdev_set.3
llapi_pcc_state_get.3
llapi_pcc_state_get_fd.3

Test-Parameters: trivial
Fixes: f172b1168857 ("LU-10092 llite: Add persistent cache on client")
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a7accb4ab77a9fcefda9f115a751ccbc35f9b7c
Reviewed-on: https://review.whamcloud.com/36572
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
2 months agoLU-12624 lod: alloc dir stripes by QoS 25/35825/13
Lai Siyao [Sun, 4 Aug 2019 18:08:02 +0000 (02:08 +0800)]
LU-12624 lod: alloc dir stripes by QoS

Similar to file OST object allocation, introduce directory stripe
allocation by space usage. The two don't share code because of the
many differences between them: files have mirrors, PFL, and object
precreation, while for a directory the first stripe is always on the
same MDT as its master object. The changes include:
* add lod_mdt_alloc_qos() to allocate stripes by space/inode usage.
* add lod_mdt_alloc_rr() to allocate stripes round-robin.
* add lod_mdt_alloc_specific() to allocate stripes in the old way.
* add sysfs support for lmv_desc field in LOD structure, and move
  those that remain in procfs to sysfs.
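
To make the difference between the round-robin and space-based
policies concrete, here is a minimal Python sketch.  It is not the
lod_mdt_alloc_rr() or lod_mdt_alloc_qos() implementation: the function
names, the wrap-around order, and the use of free space as a plain
random weight are all assumptions for illustration.  The only detail
taken from the commit message is that the first stripe stays on the
master object's MDT.

```python
import random


def alloc_rr(master, mdt_count, stripe_count):
    """Round-robin: start at the master MDT and wrap around."""
    count = min(stripe_count, mdt_count)
    return [(master + i) % mdt_count for i in range(count)]


def alloc_qos(master, free_space, stripe_count, rng):
    """Pick the remaining stripes at random, weighted by free space.

    'free_space' maps MDT index -> free space in any consistent unit.
    """
    chosen = [master]  # first stripe is always on the master MDT
    candidates = {idx: free for idx, free in free_space.items()
                  if idx != master}
    while len(chosen) < stripe_count and candidates:
        indexes = list(candidates)
        weights = [candidates[idx] for idx in indexes]
        if sum(weights) == 0:
            break  # nothing left with free space
        pick = rng.choices(indexes, weights=weights)[0]
        chosen.append(pick)
        del candidates[pick]  # at most one stripe per MDT
    return chosen
```

For example, alloc_rr(2, 4, 3) walks MDTs 2, 3, 0, while
alloc_qos(0, {0: 50, 1: 0, 2: 100}, 2, random.Random(1)) can never
pick the full MDT 1 for the second stripe.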

This patch also changes LMV QoS code:
* mkdir by QoS if the user runs 'lfs mkdir -i -1 ...', or if the
  parent directory's default LMV starting MDT index is -1.
* with the above change, the 'space' hash flag is useless, so remove
  all related code.
* previously the 'lfs mkdir -i -1' QoS code was in lfs_setdirstripe(),
  but now it's done in LMV, so remove the old code.

Update sanity 413a 413b to support QoS mkdir of both plain and
striped directories.

Update lfs-setdirstripe man to reflect the changes.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8f5f8e46faae68ffd9a49a4ac1d450e951e979c5
Reviewed-on: https://review.whamcloud.com/35825
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12526 pcc: Auto attach for PCC during IO 05/36005/4
Qian Yingjin [Thu, 11 Jul 2019 08:37:55 +0000 (16:37 +0800)]
LU-12526 pcc: Auto attach for PCC during IO

PCC uses the layout lock to protect the cache validity. Currently
PCC only supports auto attach at the next open. However, the
layout lock can be revoked at any time by LRU/manual lock
shrinking or lock conflict callback.

For example, the layout lock can be revoked while performing I/O
after the file has been opened. When that happens, the cached file
is detached involuntarily. I/O originally directed to PCC is
redirected to the OSTs after the data is restored into the OST
objects. This unwanted behavior can be expensive.

To avoid this problem, this patch implements auto attach for PCC
even during IOs (not only at the open time).

For debugging purposes, there are now three auto attach options:
- open_attach: auto attach at the next open;
- io_attach: auto attach during IO;
- stat_attach: auto attach at stat() call.

The reason for adding the stat_attach option is that checking the
PCC state via "lfs pcc state" not only opens the file but also calls
stat() on it. To verify auto attach during IO, both open_attach and
stat_attach need to be disabled.

All of these auto attach options are enabled by default.

This patch also fixes a bug in auto cache at create time:
In current Lustre, the truncate operation revokes the LOOKUP
ibits lock and invalidates the file dentry cache. A following
open with the O_CREAT flag calls into ->atomic_open, where the
file was wrongly treated as newly created and the client tried to
auto cache it. Once the client knows the open is not a
DISP_OPEN_CREATE, it should clean up the already created PCC copy.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1e0a84ca125f00076cf88ee26f9b7da8d17a960c
Reviewed-on: https://review.whamcloud.com/36005
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12893 lnet: fix peer_ni selection 52/36552/2
Amir Shehata [Tue, 22 Oct 2019 18:27:24 +0000 (11:27 -0700)]
LU-12893 lnet: fix peer_ni selection

When selecting a peer-ni we must use the same peer NID
through all the messages which belong to the same RPC.
This is necessary in order to ensure we do the RDMA over
the optimal interface.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0391537da32bc6ac7a8a3d92e207bf172d111981
Reviewed-on: https://review.whamcloud.com/36552
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12672 tests: Correctly determine mdccli in recovery-small test 66 27/35827/2
Oleg Drokin [Mon, 19 Aug 2019 02:56:31 +0000 (22:56 -0400)]
LU-12672 tests: Correctly determine mdccli in recovery-small test 66

Apparently the filtering by awk does not work as is, and
we get errors like this:

error: get_param: param_path 'lustre-MDT0001-mdc-ffff880114af9800/mds_conn_uuid': No such file or directory
error: set_param: param_path 'mdc/lustre-MDT0000-mdc-ffff880114af9800
lustre-MDT0001-mdc-ffff880114af9800/import': No such file or directory

Test-Parameters: trivial testlist=recovery-small
Change-Id: Ibbcc79f71d2fa5966da90f0c8d0e98a3c5f2a964
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35827
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
3 months agoLU-12773 tests: sanity test_805 Use do_facet 04/36204/4
Oleg Drokin [Tue, 17 Sep 2019 05:23:26 +0000 (01:23 -0400)]
LU-12773 tests: sanity test_805 Use do_facet

do_node cannot work with $SINGLEMDS, since that is a facet
name rather than a node name.

This fixes error message below (and a following syntax error):
mds1: ssh: Could not resolve hostname mds1: Name or service not known

Fixes: 106abc184d8b ("LU-8856 osd: mark specific transactions netfree")
Test-Parameters: trivial fstype=zfs
Change-Id: I0d842dbccbfd934c524ae01cca7399dd52158064
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36204
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-10070 utils: move new SEL find_param fields to end 54/36554/4
Andreas Dilger [Tue, 22 Oct 2019 23:31:22 +0000 (17:31 -0600)]
LU-10070 utils: move new SEL find_param fields to end

Move the new fp_ext_size and fp_ext_size_units fields to the end
of struct find_param so that they don't break the ABI for the
llapi_find() and llapi_getstripe() functions that use it.

Add "unused" fields for the sign and exclude bitfields, so that
it is clear how many more can still be used before we need to
add new fields.
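
Why the position of a new field matters for the ABI can be shown with
a small layout experiment.  This Python ctypes sketch does not use the
real struct find_param: old_a and old_b are hypothetical stand-ins for
existing fields, and the type chosen for fp_ext_size is an assumption.

```python
import ctypes


class ParamV1(ctypes.Structure):
    """Layout that already-compiled callers were built against."""
    _fields_ = [
        ("old_a", ctypes.c_int),
        ("old_b", ctypes.c_ulonglong),
    ]


class ParamInserted(ctypes.Structure):
    """New field placed in the middle: later fields move."""
    _fields_ = [
        ("old_a", ctypes.c_int),
        ("fp_ext_size", ctypes.c_ulonglong),
        ("old_b", ctypes.c_ulonglong),
    ]


class ParamAppended(ctypes.Structure):
    """New field placed at the end: existing offsets are unchanged."""
    _fields_ = [
        ("old_a", ctypes.c_int),
        ("old_b", ctypes.c_ulonglong),
        ("fp_ext_size", ctypes.c_ulonglong),
    ]


def field_offsets(struct_type):
    """Map each field name to its byte offset within the structure."""
    return {name: getattr(struct_type, name).offset
            for name, *_ in struct_type._fields_}
```

Comparing field_offsets(ParamV1) with the other two shows that only
the appended layout keeps old_b where an old binary expects it.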

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib15af374774050a9e5b224f7edc7523fdae570c1
Reviewed-on: https://review.whamcloud.com/36554
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-11607 tests: replace lustre_version/fstype - large-lun 80/36380/2
James Nunez [Fri, 4 Oct 2019 17:54:56 +0000 (11:54 -0600)]
LU-11607 tests: replace lustre_version/fstype - large-lun

The routine get_lustre_env() is available to all Lustre test
suites and sets environment variables for the Lustre version
installed on servers and clients.

Replace calls to lustre_version_code() and facet_fstype()
for all server types with definitions from get_lustre_env()
for the large-lun, lfsck-performance, sanity-selinux and
scrub-performance test suites.

While doing this, replace ‘$SINGLEMDS’ with ‘MDS1_VERSION’
in lustre_version_code() and facet_fstype().

Test-Parameters: trivial fstype=ldiskfs testlist=sanity-selinux,scrub-performance
Test-Parameters: fstype=zfs testlist=ldiskfs testlist=sanity-selinux,scrub-performance
Test-Parameters: fstype=ldiskfs testlist=large-lun,lfsck-performance
Test-Parameters: fstype=zfs testlist=ldiskfs testlist=large-lun,lfsck-performance
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ie1a04103b8d721ab20992ed0a9afb3a399270937
Reviewed-on: https://review.whamcloud.com/36380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
3 months agoLU-12803 libcfs: bump module version 88/36488/2
James Simmons [Fri, 18 Oct 2019 13:31:00 +0000 (09:31 -0400)]
LU-12803 libcfs: bump module version

The Linux client version of libcfs is further ahead in its
cleanup, so its module version is higher. This prevents the
OpenSFS version of libcfs from loading, and since OpenSFS is
currently ahead of the Linux client we prefer to use it at
this time. Increase the OpenSFS libcfs module version to be
just slightly ahead of the Linux client.

Test-Parameters: trivial

Change-Id: Ie57d93529bf25d908099f7dab06d2960f9923d58
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36488
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
3 months agoLU-11768 test: make at_max to take effect 31/36431/4
Hongchao Zhang [Thu, 10 Oct 2019 20:22:25 +0000 (16:22 -0400)]
LU-11768 test: make at_max to take effect

In test_6 of sanity-quota, "at_max" won't affect "at_current"
if no RPC is sent on that import, so the following DQACQ
request still has a larger timeout value and triggers the
watchdog.

Fixes: d8226b93 ("LU-11768 test: limit at_max to timeout in time")
Test-Parameters: trivial \
testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: Iccc969459647aa70da6f6ecb0d8d13a404bf8088
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36431
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12026 mdt: MDS stores atime|mtime|ctime during close 86/36286/9
Qian Yingjin [Wed, 25 Sep 2019 09:14:12 +0000 (17:14 +0800)]
LU-12026 mdt: MDS stores atime|mtime|ctime during close

In order to make direct inode scanning on the MDT useful, in
addition to storing the file size/blocks via LSOM on the MDT, we
also need to store the atime/mtime/ctime on the MDT inodes.

Currently the atime is already lazily updated on the MDS (at
close time). In this patch, the final mtime/ctime are sent to the
MDS at close time and updated on the MDT inode, which makes
MDT-only scanning workable.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I4465281a03d70919c388cb241c16eebcb03e850f
Reviewed-on: https://review.whamcloud.com/36286
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12799 ptlrpc: return proper error code 82/36282/4
Alex Zhuravlev [Tue, 24 Sep 2019 20:29:01 +0000 (23:29 +0300)]
LU-12799 ptlrpc: return proper error code

from ptlrpc_disconnect_prep_req() using ERR_PTR()
as the callers expect.

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Change-Id: I5493194a1f18f3d0b559921b7859bf835585ba58
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36282
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-12025 osp: allow OS_STATE_* flags from OSTs 29/35029/8
Andreas Dilger [Thu, 28 Feb 2019 00:37:08 +0000 (17:37 -0700)]
LU-12025 osp: allow OS_STATE_* flags from OSTs

Allow OS_STATE_* flags to be sent from the OST, so that the
OS_STATE_NOPRECREATE can be used to prevent a newly-added OST
from being used until it is ready.  Add the "no_precreate"
parameter on the OFD that can be set from userspace.

Close a race in the cached opd_statfs.os_state handling in
osp_pre_update_statfs().  It was being overwritten by the
new statfs data from the OST, but was globally visible for a
short time to the precreate threads before the OS_STATE_*
flags were set on the cached statfs data again.

Similarly, there was a race with updating the opd_pre_status
if the OST was out of space, where it would be cleared after
a successful statfs, and wouldn't be set to -ENOSPC until a
short time later.

Split osp_pre_update_status() into osp_pre_update_msfs() that
only copies the statfs data into the cache after all of the
flags are set.  Don't clear flags from the cache, they will
only be cleared when new statfs data is sent.
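
The ordering fix described above (set all flags first, then copy into
the cache) can be sketched as a build-then-publish pattern.  This is a
Python analogy rather than the osp code: the class names, the flag
values, and the STATE_ENOSPC flag are hypothetical, and a single
reference assignment stands in for whatever synchronization the kernel
code uses.

```python
from dataclasses import dataclass, replace

OS_STATE_NOPRECREATE = 0x4  # illustrative value only
STATE_ENOSPC = 0x20         # hypothetical locally-computed flag


@dataclass(frozen=True)
class Statfs:
    free_blocks: int
    state: int = 0


class StatfsCache:
    def __init__(self):
        self._cached = Statfs(free_blocks=0)

    def update(self, fresh, extra_flags=0):
        # Build the complete snapshot first, including every flag...
        complete = replace(fresh, state=fresh.state | extra_flags)
        # ...then publish it in one step, so a reader sees either the
        # old snapshot or the new one, never data without its flags.
        self._cached = complete

    def get(self):
        return self._cached
```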

Add a test that the 'N'OPRECREATE flag appears in "lfs df".

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9c1c7a097f3de8edfdeef2b437f40936e73ebbe5
Reviewed-on: https://review.whamcloud.com/35029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12842 utils: llog_print with snapshot name 14/36414/2
Andreas Dilger [Wed, 9 Oct 2019 17:26:24 +0000 (11:26 -0600)]
LU-12842 utils: llog_print with snapshot name

The lsnapshot utility creates filesystems named with generated
hexadecimal strings.  In some cases the filesystem name may start
with a number instead of a character, which causes "lctl llog_print"
(via llog_ioctl()) to consider the filesystem name invalid.

Allow filesystem names in llog_ioctl() to start with a digit.
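
The effect of the change can be sketched as a first-character check.
The commit message does not show the actual test in llog_ioctl(), so
this Python sketch assumes it was a check on the first character; both
function names and the sample name are hypothetical.

```python
def old_name_ok(fsname):
    """Assumed old rule: the name must start with a letter."""
    first = fsname[:1]
    return first.isascii() and first.isalpha()


def new_name_ok(fsname):
    """Assumed new rule: the name may start with a letter or a digit."""
    first = fsname[:1]
    return first.isascii() and first.isalnum()
```

A generated hexadecimal name such as "0a1b2c3d" fails old_name_ok()
but passes new_name_ok(), while an ordinary name like "lustre" passes
both.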

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib2054d5afbeaa3f661148fff834c29f83f5d98ad
Reviewed-on: https://review.whamcloud.com/36414
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9629 utils: fix lfs_migrate for non-root users 83/36383/5
Andreas Dilger [Sat, 5 Oct 2019 06:03:51 +0000 (00:03 -0600)]
LU-9629 utils: fix lfs_migrate for non-root users

Allow lfs_migrate to work with non-root users even when there are
hard-linked files.  The use of "lfs fid2path" is only strictly
needed if "lfs migrate" is not working and the script falls back
to using "rsync" to migrate the hard-linked files.  In the common
case, "lfs migrate" will preserve the links to the file and all
that is needed is "path2fid" to record which FIDs have already
been migrated so that they are not migrated again.

There is no need to track files with only one link, so none of
this FID-handling infrastructure is needed in the common case.

Don't get the mountpoint (via "df") for each hard-linked file within
a single filesystem (which is normally all files).  This is only
needed if files are on different mountpoints, which can be detected
by the device number returned by stat(1) on the file.  Cache the
device number across stat calls, and if it doesn't change then use
the same mountpoint for the fid2path call.
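
The device-number caching described above can be sketched as follows.
lfs_migrate itself is a shell script, so this Python version is only
an illustration of the idea; MountpointCache and find_mountpoint() are
hypothetical names, and walking up the tree with os.path.ismount()
stands in for the "df" call.

```python
import os


def find_mountpoint(path):
    """Expensive lookup: walk up until a mountpoint is found."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


class MountpointCache:
    """Redo the lookup only when the device number changes."""

    def __init__(self, resolve):
        self._resolve = resolve
        self._last_dev = None
        self._last_mnt = None
        self.lookups = 0  # how many expensive lookups were needed

    def mountpoint(self, path):
        dev = os.stat(path).st_dev
        if dev != self._last_dev:
            self._last_mnt = self._resolve(path)
            self._last_dev = dev
            self.lookups += 1
        return self._last_mnt
```

With all files on one filesystem, repeated calls to
MountpointCache(find_mountpoint).mountpoint() perform a single lookup.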

Add named variables to index the fields in the "nlink_type" array to
make it easier to see what is being accessed and avoid bugs.

Fixes: 80a2ff7137d3 ("LU-6051 utils: allow lfs_migrate to handle hard links")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If37d9f73bd1e2ff261fdcfb5248b9e51ae42bd13
Reviewed-on: https://review.whamcloud.com/36383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12784 llite: limit max xattr size by kernel value 40/36240/9
Andreas Dilger [Sat, 5 Oct 2019 08:06:24 +0000 (02:06 -0600)]
LU-12784 llite: limit max xattr size by kernel value

Limit the maximum xattr size returned to userspace from the MDS to
what the currently-running kernel supports (XATTR_SIZE_MAX=65536
bytes typically).  While it is possible a Lustre backing filesystem
may store larger xattrs than this, it wouldn't be possible for users
to access a larger xattr via kernel xattr interfaces.
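
In other words, the value reported to userspace is the smaller of the
server limit and the kernel limit.  A one-line Python sketch, with a
hypothetical function name and the typical XATTR_SIZE_MAX value quoted
above:

```python
XATTR_SIZE_MAX = 65536  # typical kernel limit quoted in the commit message


def usable_max_xattr_size(server_max, kernel_max=XATTR_SIZE_MAX):
    """Clamp the size reported by the MDS to what the kernel can pass."""
    return min(server_max, kernel_max)
```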

This fixes interop problems when newer clients and tests are running
against older servers:

  sanity.sh: line 8946: /usr/bin/setfattr: Argument list too long

Skip subtests for new features in 2.13 so 2.12 interop testing passes.

Fix test-framework.sh::large_xattr_enabled() to return true for ZFS.
Fix test-framework.sh::max_xattr_size() to return the actual value
returned from the MDS rather than computing it locally.

Fixes: 3ec712bd183 ("LU-11868 osd: Set max ea size to XATTR_SIZE_MAX")
Test-Parameters: trivial serverversion=2.12 testlist=sanity
Test-Parameters: serverversion=2.12 testlist=conf-sanity envdefinitions=ONLY=81
Test-Parameters: testlist=sanity-pfl,replay-single
Test-Parameters: testlist=conf-sanity envdefinitions=ONLY=48,61,81
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I14232809b13886efa8f11a50ecc35e78f316810d
Reviewed-on: https://review.whamcloud.com/36240
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 months agoLU-12275 sec: reserve flags for client side encryption 60/36360/5
Sebastien Buisson [Thu, 3 Oct 2019 14:35:11 +0000 (14:35 +0000)]
LU-12275 sec: reserve flags for client side encryption

Reserve OBD_CONNECT2_ENC connection flag so that 'encrypt' or
'test_dummy_encryption' client mount options can only be used if
server side knows how to handle encrypted object size properly.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I42d0b597df3b68bd1de19394104e7fda1b76bf6c
Reviewed-on: https://review.whamcloud.com/36360
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12593 osd: zeroing a freshly allocated block buffer 29/35629/5
Alexander Boyko [Fri, 26 Jul 2019 14:13:21 +0000 (10:13 -0400)]
LU-12593 osd: zeroing a freshly allocated block buffer

Ldiskfs zeroes a new buffer only when it is not uptodate.
In rare cases we can get a new buffer head with the uptodate flag set.
This may cause file corruption for writes at non-zero offsets,
especially for internal Lustre files like update_log, CATALOGS, and
lov_objid.

od_fld_lookup()) lustre-MDT0001-mdtlov: invalid FID [0x0:0x50:0x0]

The patch adds zeroing under i_mutex for unmapped blocks.

Performance results, measured because the patch adds a mutex to the
creation path (lov_objid file), for 40 tasks and 2000000 files:

SUMMARY: (of 5 iterations)
Operation            Max         Min        Mean     Std Dev
---------            ---         ---        ----     -------
Without fix
File creation:   39990.601   19020.238   27443.823   6909.605
With fix
File creation:   37958.809   21708.187   27065.855   5900.961
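
To put the two mean values in proportion, the relative change can be
computed directly.  The helper below is a hypothetical Python sketch;
applied to the means above it gives a drop of roughly 1.4%, which is
small compared with the reported standard deviations.

```python
def relative_change_percent(before, after):
    """Return the change from 'before' to 'after' as a percentage."""
    return (after - before) / before * 100.0


# Mean file creation rates from the table, without and with the fix.
MEAN_WITHOUT_FIX = 27443.823
MEAN_WITH_FIX = 27065.855
```

For example, relative_change_percent(MEAN_WITHOUT_FIX, MEAN_WITH_FIX)
evaluates the change in the mean creation rate.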

Cray-bug-id: LUS-6132
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ica8fbe29b5a7253d553b41a41ffe5d8d8b4b2e55
Reviewed-on: https://review.whamcloud.com/35629
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12625 build: reliable detection of struct timespec64 75/35675/9
Alexey Zhuravlev [Fri, 2 Aug 2019 09:16:22 +0000 (12:16 +0300)]
LU-12625 build: reliable detection of struct timespec64

The existing configure check defines struct inode on the stack,
which may cause the following error with gcc8:
build/conftest.c: In function main
build/conftest.c:226:1: error: the frame size of 1032 bytes is
larger than 1024 bytes [-Werror=frame-larger-than=]
which makes the check return a false result, and then osd-ldiskfs
doesn't build.

Put struct inode * on the stack instead.

Change-Id: If31cfd13836e36ef59d428d3c05bf7f51319f89b
Signed-off-by: Alexey Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35675
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12859 llite: clear flock when using localflock 52/36452/4
Andreas Dilger [Mon, 14 Oct 2019 22:29:30 +0000 (06:29 +0800)]
LU-12859 llite: clear flock when using localflock

When mounting a client with "-o localflock" or equivalent option in
/etc/fstab, it does not clear out the "flock" mount option flag from
the superblock.  This results in "flock" still being the option used
and it displays both options in the /proc/mounts output:

  10.0.0.1@o2ib:/lfs on /mnt/lfs type lustre (rw,flock,localflock)

Mount a client with both "flock,localflock" as mount options and
verify that the "flock" option is cleared by "localflock", and
vice versa.  Verify that "noflock" clears both options.
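
The intended behavior, where each of the three options overrides the
others, can be sketched with two flag bits.  This is a Python
illustration rather than the llite option parser: the flag values and
the function name are hypothetical, and the default of FLOCK follows
the "enable flock mount option by default" commit referenced below.

```python
FLOCK = 0x1       # illustrative bit values only
LOCALFLOCK = 0x2


def apply_lock_options(options, flags=FLOCK):
    """Apply a comma-separated mount option string to the flag bits."""
    for opt in options.split(","):
        if opt == "flock":
            flags = (flags & ~LOCALFLOCK) | FLOCK
        elif opt == "localflock":
            # The bug was the missing clear of the FLOCK bit here.
            flags = (flags & ~FLOCK) | LOCALFLOCK
        elif opt == "noflock":
            flags &= ~(FLOCK | LOCALFLOCK)
    return flags
```

With this rule the last option wins, so "flock,localflock" leaves only
LOCALFLOCK set and "localflock,flock" leaves only FLOCK set.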

Remove the "remount_client()" helper in conf-sanity.sh, since this
shadows a helper function of the same name in test-framework.sh and
is confusing.  Instead, use "mount_client()" now that it can accept
mount options, and just pass "remount" explicitly in a few places.

Fixes: 3613af3e15cb ("LU-10885 llite: enable flock mount option by default")
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie31b0c4f6674c99d3ed5b73caa39cfc23d3ebbe5
Reviewed-on: https://review.whamcloud.com/36452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>