Whamcloud - gitweb
fs/lustre-release.git
7 hours ago LU-12899 build: rhel8 not install kernel-rpm-macros 57/36557/3 master
Qian Yingjin [Wed, 23 Oct 2019 01:43:24 +0000 (09:43 +0800)]
LU-12899 build: rhel8 not install kernel-rpm-macros

On RHEL8 kmodtool and kernel_module_package_buildreqs are not
installed with kernel-devel.

kernel_module_package_buildreqs is defined in kernel-rpm-macros.
If kernel-rpm-macros is not installed, the Lustre RPM build will
report:
"Dependency tokens must begin with alpha-numeric, '_' or '/':
BuildRequires: %kernel_module_package_buildreqs"

This patch helps the developer understand the detailed
information for the required packages when kernel-rpm-macros is
not installed.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Id9b855eeac97d780d9c572d306da3c3a1fa95ea6
Reviewed-on: https://review.whamcloud.com/36557
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-11607 tests: replace lustre_version/fstype - full 75/36375/3
James Nunez [Fri, 4 Oct 2019 17:43:05 +0000 (11:43 -0600)]
LU-11607 tests: replace lustre_version/fstype - full

The routine get_lustre_env() is available to all Lustre test
suites and sets an environment variable for the Lustre
version installed on servers and clients.

Replace calls to lustre_version_code() and facet_fstype()
for all server types with definitions from get_lustre_env()
for the racer, replay-dual, replay-vbr and sanity-lsnapshot
test suites.

While doing this, replace ‘$SINGLEMDS’ with ‘MDS1_VERSION’
in lustre_version_code() and facet_fstype().

Clean up around any modifications by removing calls to
return after skip() or skip_env() and converting spaces to
tabs.

Test-Parameters: trivial fstype=ldiskfs testlist=replay-vbr,sanity-lsnapshot
Test-Parameters: fstype=zfs testlist=replay-vbr,sanity-lsnapshot
Test-Parameters: fstype=ldiskfs testlist=racer,replay-dual
Test-Parameters: fstype=zfs testlist=racer,replay-dual
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4581ccf98b9da256a00f24a2da8cd8ff41f115ca
Reviewed-on: https://review.whamcloud.com/36375
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
21 hours ago LU-12820 osc: remove 'transient' arg from osc_enter_cache_try 19/36319/3
Mr NeilBrown [Sun, 29 Sep 2019 23:09:54 +0000 (09:09 +1000)]
LU-12820 osc: remove 'transient' arg from osc_enter_cache_try

This arg is always '0', so remove it.
Consequently, OBD_BRW_NOCACHE is never set, and
cl_dirty_transit and obd_dirty_transit_pages
are never non-zero.
So they can be removed as well.

Linux-commit: 8d1057264a75

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia047affc33fb9277e6c28a8f6d7d088c385b51a8
Reviewed-on: https://review.whamcloud.com/36319
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-8130 ldlm: separate buckets from ldlm hash table 18/36218/3
NeilBrown [Tue, 17 Sep 2019 19:33:09 +0000 (15:33 -0400)]
LU-8130 ldlm: separate buckets from ldlm hash table

ldlm maintains a per-namespace hashtable of resources.
With these hash tables it stores per-bucket 'struct adaptive_timeout'
structures.

Presumably having a single struct for the whole table results in too
much contention while having one per resource results in very little
adaptation.

A future patch will change ldlm to use rhashtable which does not
support per-bucket data, so we need to manage the data separately.

There is no need for the multiple adaptive_timeout to align with the
hash chains, and trying to do this has resulted in a rather complex
hash function.
The purpose of ldlm_res_hop_fid_hash() appears to be to keep
resources with the same fid in the same hash bucket, so they use
the same adaptive timeout.  However it fails at doing this
because it puts the fid-specific bits in the wrong part of the hash.
If that is not the purpose, then I can see no point to the
complexity.

This patch creates a completely separate array of adaptive timeouts
(and other less interesting data) and uses a hash of the fid to index
that, meaning that a simple hash can be used for the hash table.

In the previous code, two namespaces use the same value for
nsd_all_bits and nsd_bkt_bits.  This results in zero bits being
used to choose a bucket - so there is only one bucket.
This looks odd and would confuse hash_32(), so I've adjusted the
numbers so there is always at least 1 bit (2 buckets).
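
The idea of indexing a separate bucket array by a hash of the fid can be
sketched in userspace C as below. This is only an illustration under
stated assumptions, not the code from the patch: every sketch_* name is
hypothetical, the fid folding is invented for the example, and the
multiplicative hash merely imitates the style of the kernel's hash_32().
It shows why zero bucket bits are a problem: the shift count would equal
the operand width.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplicative constant in the style of hash_32(); an assumption here. */
#define SKETCH_GOLDEN_RATIO_32 0x61C88647u

/* Hypothetical stand-in for a Lustre fid. */
struct sketch_fid {
	uint64_t f_seq;
	uint32_t f_oid;
	uint32_t f_ver;
};

/*
 * Map a 32-bit key to one of 2^bits slots using the top bits of the
 * product. bits must be 1..32: with bits == 0 the shift count would be
 * 32, which is undefined for a 32-bit operand.
 */
static uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)(val * SKETCH_GOLDEN_RATIO_32) >> (32 - bits);
}

/*
 * Number of bits used to choose a bucket, never less than 1.
 * Clamping is this sketch's choice; the commit adjusts the table
 * values instead.
 */
static unsigned int sketch_bucket_bits(unsigned int all_bits,
				       unsigned int bkt_bits)
{
	if (all_bits <= bkt_bits)
		return 1;
	return all_bits - bkt_bits;
}

/* Index into the separate bucket array from a (hypothetical) fid fold. */
static uint32_t sketch_bucket_index(const struct sketch_fid *fid,
				    unsigned int bits)
{
	uint32_t folded = (uint32_t)(fid->f_seq ^ (fid->f_seq >> 32)) ^
			  fid->f_oid;

	return sketch_hash_32(folded, bits);
}
```

Resources with the same fid always land in the same bucket here, so they
would share one adaptive timeout, while the resource hash table itself is
free to use any simple hash.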

Change-Id: Ifab1b48b35b4a9a56610340556875901ad3804b2
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36218
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12718 obdclass: Allow read-ahead for write requests 00/36000/9
Mr NeilBrown [Thu, 5 Dec 2019 13:51:21 +0000 (08:51 -0500)]
LU-12718 obdclass: Allow read-ahead for write requests

cl_io_read_ahead asserts that read-ahead can only happen
due to CIT_READ or CIT_FAULT requests.
Since LU-9618, we expect CIT_WRITE requests to also
sometimes trigger read-ahead.
So the LINVRNT() needs to be extended to acknowledge
that.
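
As a rough illustration, the extended invariant amounts to a predicate
like the one below. The enum and function names are hypothetical
stand-ins, and the enum ordering is an assumption; the real check is the
LINVRNT() in cl_io_read_ahead.

```c
#include <stdbool.h>

/* Hypothetical subset of I/O types; values and order are assumptions. */
enum sketch_io_type {
	SKETCH_CIT_READ,
	SKETCH_CIT_WRITE,
	SKETCH_CIT_SETATTR,
	SKETCH_CIT_FAULT,
};

/* Read-ahead may be triggered by read, fault and, since LU-9618, write. */
static bool sketch_read_ahead_allowed(enum sketch_io_type type)
{
	return type == SKETCH_CIT_READ ||
	       type == SKETCH_CIT_FAULT ||
	       type == SKETCH_CIT_WRITE;
}
```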

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I7aa1efb4fc8bb6f8474596a6194fc39f484d7ac7
Reviewed-on: https://review.whamcloud.com/36000
Reviewed-by: Shilong Wang <wshilong@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12679 tests: large-lun test_2 check OSTSIZE or skip (zfs) 54/35854/9
Shaun Tancheff [Wed, 4 Dec 2019 18:04:19 +0000 (12:04 -0600)]
LU-12679 tests: large-lun test_2 check OSTSIZE or skip (zfs)

Ensure OSTSIZE is sufficient for test to run, or the
backing device is sufficient, otherwise skip the test.

Use the zfs pool name when invoking zdb to report the
super block details.

Cray-bug-id: LUS-6875
Test-Parameters: fstype=zfs testlist=large-lun envdefinitions=REFORMAT=yes
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I65d8bac11d437230153c6ad8d821a5372244f7fa
Reviewed-on: https://review.whamcloud.com/35854
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12677 tests: conf-sanity test_21d must keep zpool 48/35848/5
Shaun Tancheff [Fri, 4 Oct 2019 09:25:07 +0000 (04:25 -0500)]
LU-12677 tests: conf-sanity test_21d must keep zpool

Currently test_21d destroys the zpool during writeconf_or_reformat
but the zpool must be preserved for writeconf to work as intended.

Cray-bug-id: LUS-7688
Test-Parameters: fstype=zfs testlist=conf-sanity
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I8fdc123504b70e59a9ca789141e56815377f6b35
Reviewed-on: https://review.whamcloud.com/35848
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12488 tests: Fix sanityn 93 for DNE configs 66/35366/6
Patrick Farrell [Wed, 6 Nov 2019 18:55:04 +0000 (11:55 -0700)]
LU-12488 tests: Fix sanityn 93 for DNE configs

sanityn test 93 only uses MDT0, but it is getting the parameter
qos_threshold_rr for all MDTs.  This confused the test when
trying to reset the parameter value.

Just limit everything to MDT0. Also modernize lctl vs
$LCTL usage and replace hardwired file system name with
$FSNAME.

Test-Parameters: trivial
Test-Parameters: testlist=sanityn
Test-Parameters: fstype=ldiskfs clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 testlist=sanityn
Test-Parameters: fstype=zfs clientcount=2 mdscount=2 mdtcount=4 osscount=1 ostcount=8 testlist=sanityn

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iff9cca4ce3d2e5ad4e499bba0369189bea21448a
Reviewed-on: https://review.whamcloud.com/35366
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12378 ptlrpc: always reset generation for idle reconnect 52/35052/13
Wang Shilong [Mon, 17 Jun 2019 06:58:34 +0000 (14:58 +0800)]
LU-12378 ptlrpc: always reset generation for idle reconnect

Idle reconnection is a common case and reconnections will
mostly be quick, so always reset the generation for this case;
otherwise applications will fail just because of the idle
reconnection feature.

Change-Id: Ia1531df6a3288663d832865e48a30b448b225766
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35052
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-11385 lnet: check if current->nsproxy is NULL before using 77/34577/8
Sonia Sharma [Sat, 30 Mar 2019 08:32:34 +0000 (01:32 -0700)]
LU-11385 lnet: check if current->nsproxy is NULL before using

A crash is seen at a few sites in the function
rdma_create_id(current->nsproxy->net_ns, cb, dev, ps, qpt).
The issue is identified with the first param in this
function - current->nsproxy->net_ns. There is a
possibility that current->nsproxy is NULL, resulting in a
"kernel NULL pointer dereference" crash.

Handle the case of NULL value gracefully by adding
a check and using init_net if current or
current->nsproxy is NULL.
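
The fallback described above can be modelled in userspace C as follows.
This is a minimal sketch, assuming mock structures: all sketch_* types
and the sketch_init_net object are hypothetical stand-ins for the kernel's
task, nsproxy and init_net, and the helper name is invented.

```c
#include <stddef.h>

/* Mock stand-ins for the kernel structures involved. */
struct sketch_net {
	int id;
};

struct sketch_nsproxy {
	struct sketch_net *net_ns;
};

struct sketch_task {
	struct sketch_nsproxy *nsproxy;
};

/* Plays the role of init_net, the fallback network namespace. */
static struct sketch_net sketch_init_net = { 0 };

/*
 * Return the task's network namespace, or the initial namespace when
 * the task or its nsproxy is NULL, instead of dereferencing blindly.
 */
static struct sketch_net *sketch_current_net(const struct sketch_task *task)
{
	if (task != NULL && task->nsproxy != NULL)
		return task->nsproxy->net_ns;
	return &sketch_init_net;
}
```

The result of such a helper would then be passed as the first argument
to rdma_create_id() in place of the bare pointer chain.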

Change-Id: I06349e081f2c4ba0480b3924fc304f94ca765891
Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34577
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
21 hours ago LU-11185 mgc: config lock leak 90/32890/12
Alexey Lyashkov [Fri, 22 Mar 2019 08:59:35 +0000 (11:59 +0300)]
LU-11185 mgc: config lock leak

Regression introduced by "LU-580: update mgc llog process code".
It takes an additional cld reference for the lock, but the lock cancel is
forgotten during normal shutdown, so this lock holds the cld on the list
for a long time. Any config modification needs to cancel each lock separately.

Cray-bugid: LUS-6253
Fixes: 5538eee216a1 ("LU-580: update mgc llog process code")

Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: Ic83e42666bf788739a2f81ab0c66632daa329290
Reviewed-on: https://review.whamcloud.com/32890
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-9679 llite: use lli_flags instead of lli_update_atime 39/36839/3
Mr NeilBrown [Mon, 30 Sep 2019 05:10:25 +0000 (15:10 +1000)]
LU-9679 llite: use lli_flags instead of lli_update_atime

Rather than adding a new single-bit field for a flag, use the
already-existing lli_flags field.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I23abc5c7c7383dca3385958f804074baaf551567
Reviewed-on: https://review.whamcloud.com/36839
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Nikitas Angelinas <nangelinas@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 hours ago LU-13001 lnet: Wait for single discovery attempt of routers 20/36820/2
Chris Horn [Fri, 22 Nov 2019 20:19:03 +0000 (14:19 -0600)]
LU-13001 lnet: Wait for single discovery attempt of routers

Historically, check_routers_before_use would cause LNet
initialization to pause until all routers had been ping'd once.

This behavior was changed in commit
fe17e9b8370affe063769b880f02b9190584baaa from LU-11298. Now, LNet
will wait indefinitely until discovery completes on all routers.
This is problematic, because if even one router is down then LNet
will stall forever.

Introduce a new lnet_peer state to indicate whether a router has
been discovered (either successfully or not) to restore the historic
behavior.

Fixes: fe17e9b8370a ("LU-11298 lnet: use peer for gateway")

Test-Parameters: trivial
Cray-bug-id: LUS-8184
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia064ffeb3e918cdb8d5a6150f443c48aa14e7a7c
Reviewed-on: https://review.whamcloud.com/36820
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12193 quota: use rw_sem to protect lqs_hash 95/36795/3
Sergey Cheremencev [Tue, 19 Nov 2019 13:09:14 +0000 (16:09 +0300)]
LU-12193 quota: use rw_sem to protect lqs_hash

This patch introduces an rw semaphore for locking
in cfs_hash_lock. It is used to protect lqs_hash
instead of rw_lock to avoid sleeping in atomic context:

BUG: sleeping function called from invalid context at kernel/rwsem.c:51
in_atomic(): 1, irqs_disabled(): 0, pid: 11265, name: mdt00_004
CPU: 0 PID: 11265 Comm: mdt00_004 Kdump: loaded Tainted: P
Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
Call Trace:
 [<ffffffff817b5bf2>] dump_stack+0x19/0x1b
 [<ffffffff810c3bc9>] __might_sleep+0xd9/0x100
 [<ffffffff817bc470>] down_write+0x20/0x50
 [<ffffffffa0a7dad9>] qmt_set_with_lqe+0x3a9/0x750 [lquota]
 [<ffffffffa0a7dede>] qmt_entry_iter_cb+0x5e/0xa0 [lquota]
 [<ffffffffa01b327c>] cfs_hash_for_each_tight+0x10c/0x300 [libcfs]
 [<ffffffffa01b3503>] cfs_hash_for_each_safe+0x13/0x20 [libcfs]
 [<ffffffffa0a7db4f>] qmt_set_with_lqe+0x41f/0x750 [lquota]
 [<ffffffffa0a7dfa9>] qmt_set.constprop.15+0x89/0x2a0 [lquota]
 [<ffffffffa0a7e649>] qmt_quotactl+0x489/0x560 [lquota]
 [<ffffffffa0cc3a90>] mdt_quotactl+0x620/0x770 [mdt]
 [<ffffffffa06860f5>] tgt_request_handle+0x915/0x15c0 [ptlrpc]
 [<ffffffffa0628639>] ptlrpc_server_handle_request+0x259/0xad0 [ptlrpc]
 [<ffffffffa062c771>] ptlrpc_main+0xca1/0x2290 [ptlrpc]
 [<ffffffff810b4ed4>] kthread+0xe4/0xf0
 [<ffffffff817cac77>] ret_from_fork_nospec_begin+0x21/0x21
[  280.258396] BUG: scheduling while atomic: mdt00_004/11265/0x10000003

Change-Id: Id9238f9001c38105fb91d29c47fa34ad35158b40
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://review.whamcloud.com/36795
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-11025 uapi: introduce OBD_CONNECT2_CRUSH 74/36774/7
Lai Siyao [Sat, 7 Sep 2019 12:22:29 +0000 (20:22 +0800)]
LU-11025 uapi: introduce OBD_CONNECT2_CRUSH

Introduce a new connect flag OBD_CONNECT2_CRUSH to indicate whether
client or server supports new directory hash type 'crush'.

Test-parameters: trivial

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I073ab65eddf502c016f30ad535740f3d9a77459f
Reviewed-on: https://review.whamcloud.com/36774
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
21 hours ago LU-6174 obd: perform proper division 51/36751/5
James Simmons [Mon, 18 Nov 2019 20:26:39 +0000 (15:26 -0500)]
LU-6174 obd: perform proper division

Lustre stats have two fields, lc_sum and lc_count, which are both
s64, so using do_div() is completely wrong. Use div64_s64()
instead.
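
Why the choice of helper matters can be shown with a small userspace
model. This is a sketch under stated assumptions, not kernel code: the
first function imitates do_div()'s operand types (unsigned 64-bit
dividend, 32-bit divisor), the second imitates a signed 64-bit division
such as div64_s64(), and both function names are hypothetical.

```c
#include <stdint.h>

/*
 * Model of dividing with do_div()-style operand types: the dividend is
 * reinterpreted as unsigned 64-bit and the divisor truncated to 32 bits.
 * Returns 0 when the (truncated) divisor is zero.
 */
static int64_t sketch_mean_unsigned(int64_t sum, int64_t count)
{
	uint64_t n = (uint64_t)sum;
	uint32_t base = (uint32_t)count;

	if (base == 0)
		return 0;
	return (int64_t)(n / base);
}

/*
 * Model of a signed 64-bit division. count is expected to be a
 * positive sample count; returns 0 when it is not.
 */
static int64_t sketch_mean_signed(int64_t sum, int64_t count)
{
	if (count <= 0)
		return 0;
	return sum / count;
}
```

For a negative sum such as -10 over 2 samples, the signed version yields
-5, while the unsigned model yields a huge positive number because the
sign bit is treated as magnitude.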

Change-Id: Ie694c1c6bf79979bff3eae0de9791c81c355ea30
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36751
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12741 ptlrpc: do lu_env_refill for new request 14/36714/2
Mikhail Pershin [Fri, 8 Nov 2019 06:26:06 +0000 (09:26 +0300)]
LU-12741 ptlrpc: do lu_env_refill for new request

Perform lu_env_refill() prior to any new request handling.
That was done already in tgt_request_handle() and is moved
now to ptlrpc_main() to work for any handler as well,
e.g. ldlm_cancel_handler()

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic5d8bfbd845f7e131849078c016f7e13b91d072f
Reviewed-on: https://review.whamcloud.com/36714
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12937 utils: update wirecheck for new values 06/36706/5
Andreas Dilger [Thu, 7 Nov 2019 17:47:21 +0000 (10:47 -0700)]
LU-12937 utils: update wirecheck for new values

Update the wirecheck.c file to handle changes that were made directly
in the wiretest.c files.  This allows the wiretest.c file to be
regenerated properly without losing checks for new constants, and
fixes issues with some #defines that were changed to named enums.

Code under CONFIG_FS_POSIX_ACL in utils/wirecheck.c does not build in
userspace if the <linux/posix_acl_xattr.h> header is not available.
It should be enough that we are checking this in ptlrpc/wirecheck.c
since struct posix_acl_xattr_entry and posix_acl_xattr_header come
from the kernel anyway, so a userspace check is mostly redundant.

Test-Parameters: trivial
Fixes: 1fd63fcb045c ("LU-12090 utils: lfs rmfid")
Fixes: b4375f5fc66c ("LU-11444 ptlrpc: Add increasing XIDs CONNECT2 flag")
Fixes: 19b2bc9bbc25 ("LU-6142 lustre: introduce CONFIG_LUSTRE_FS_POSIX_ACL")
Fixes: 3611352b699c ("LU-11285 mdt: improve IBITS lock definitions")
Fixes: 68635c3d9b31 ("LU-11963 osd: Add nonrotational flag to statfs")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4945d6e2e8d7f4f98530e6dafd43bcf93735a5b1
Reviewed-on: https://review.whamcloud.com/36706
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12949 obdclass: don't extend timer if obd stops 03/36703/3
Alexander Boyko [Thu, 7 Nov 2019 11:13:50 +0000 (06:13 -0500)]
LU-12949 obdclass: don't extend timer if obd stops

During umount all clients became stale, so the first check at
check_for_recovery_ready() is passed, but there is no guarantee
that recovery timer was started. So, we need to check obd_stopping.

Test 138 is added to recovery-small.sh.
It reproduces the issue when MDT is waiting for clients during
recovery and MDT umount happens.
extend_recovery_timer()) ASSERTION( obd->obd_recovery_start != 0 )
failed

Cray-bug-id: LUS-7917
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I1906fdfcc10606912a1f81560bb60b9d424db149
Reviewed-on: https://review.whamcloud.com/36703
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12469 mdd: handle migrate case with SELinux 84/36684/2
Sebastien Buisson [Wed, 6 Nov 2019 12:51:55 +0000 (21:51 +0900)]
LU-12469 mdd: handle migrate case with SELinux

In case a metadata object is created for migration purpose,
its security context should not be initialized. The
security.selinux xattr will be copied after creation, just like
any other xattr, so that the migrated object has the right security
context.

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=230 testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0bc274426c003f8081da2f4d1e8e6c12a70b9930
Reviewed-on: https://review.whamcloud.com/36684
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12826 mdt: limit root to change project state by default 44/36544/10
Wang Shilong [Tue, 22 Oct 2019 06:15:02 +0000 (14:15 +0800)]
LU-12826 mdt: limit root to change project state by default

The current project quota implementation allows users to
change the Project ID of files for which they have write
permission to any value. This is not useful if the project
quota is intended to be enforced instead of only being used
for quota accounting.

Change it so that by default only root can change the projid
of a file. Setting "mdt.*.enable_chprojid_gid" will allow
users with the specified numeric Group ID (eg. 1 = "admin") to
also change the projid of a file. Use "-1" to return the previous
behavior where all users can change the projid of their files.
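
The policy described above can be sketched as a small decision function.
This is an illustration only: the function and macro names are
hypothetical, and treating a value of 0 as "no extra group, root only"
is an assumption of this sketch rather than something stated in the
commit message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Value that restores the old behaviour: everyone may change projid. */
#define SKETCH_CHPROJID_ANYONE (-1L)

/*
 * Decide whether a caller may change a file's project ID.
 * uid/gids describe the caller; enable_gid models the value of
 * "mdt.*.enable_chprojid_gid". Assumption: 0 means root only.
 */
static bool sketch_may_change_projid(unsigned int uid,
				     const unsigned int *gids, size_t ngids,
				     long enable_gid)
{
	size_t i;

	if (uid == 0)
		return true;	/* root is always allowed */
	if (enable_gid == SKETCH_CHPROJID_ANYONE)
		return true;	/* previous behaviour: all users */
	if (enable_gid <= 0)
		return false;	/* assumed default: no extra group */
	for (i = 0; i < ngids; i++)
		if ((long)gids[i] == enable_gid)
			return true;	/* member of the permitted group */
	return false;
}
```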

Change-Id: I91c138d29f4d0b9bc607528d86893451904c9892
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36544
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-11832 ldiskfs: properly handle VFS parallel locking 14/34714/30
James Simmons [Thu, 21 Nov 2019 22:15:10 +0000 (17:15 -0500)]
LU-11832 ldiskfs: properly handle VFS parallel locking

The Lustre server stack has an OSD abstraction to treat the
underlying native file system as an object store. In this case it is
ext4 being used to store internal data the OSD abstraction uses.

The area where the recent VFS parallel locking changes impact us is in
the LFSCK scrubbing code. The method the scrubbing code uses to
ensure a healthy state for the OSD data that is stored is by
using the native file system backend directory scanning code which
is set up to pass in a unique function hook to handle each OSD
object store data file.

Currently this starts with the scrub code calling code that resembles
the internals of iterate_dir() which in turn calls ext4_readdir() with
struct dir_context actor mapping to each unique callback defined
by the field olm_filldir of struct osd_lf_map. The inode_lock()
and inode_unlock() were called by osd_lookup_one_len_unlocked()
which is often present in the dir_context actor function as defined by
olm_filldir. In essence the locking was done at the lowest levels.
With newer kernels this low level locking prevents mounting
of the server back end with the following back trace:

 [  612.287442] mount.lustre    D    0  3026   3025 0x00000224
 [  612.289699] Call trace:
 [  612.290398] [<ffff000008085a8c>] __switch_to+0x8c/0xa8
 [  612.292363] [<ffff00000880b9d0>] __schedule+0x328/0x860
 [  612.294394] [<ffff00000880bf3c>] schedule+0x34/0x8c
 [  612.296285] [<ffff00000880f460>] rwsem_down_write_failed+0x134/0x238
 [  612.304803] [<ffff00000880e89c>] down_write+0x54/0x58
 [  612.307115] [<ffff0000028cae2c>] osd_ios_root_fill+0xd4/0x590 [osd_ldiskfs]
 [  612.310109] [<ffff000002281798>] call_filldir+0xd8/0x148 [ldiskfs]
 [  612.312450] [<ffff000002282170>] ldiskfs_readdir+0x670/0x7b8 [ldiskfs]
 [  612.314975] [<ffff0000082b18d0>] iterate_dir+0x150/0x1b8
 [  612.317118] [<ffff0000028c24ac>] osd_ios_general_scan+0x104/0x2b8 [osd_ldiskfs]
 [  612.320299] [<ffff0000028cb384>] osd_initial_OI_scrub+0x9c/0x13c0 [osd_ldiskfs]
 [  612.323047] [<ffff0000028cd53c>] osd_scrub_setup+0xb44/0x1118 [osd_ldiskfs]
 [  612.325758] [<ffff00000289d4ec>] osd_device_alloc+0x544/0x950 [osd_ldiskfs]
 [  612.328637] [<ffff000001e29dac>] class_setup+0x7bc/0xd20 [obdclass]
 [  612.331324] [<ffff000001e33a30>] class_process_config+0x1708/0x2e90 [obdclass]
 [  612.334259] [<ffff000001e3a368>] do_lcfg+0x2b0/0x6d8 [obdclass]
 [  612.336704] [<ffff000001e3f49c>] lustre_start_simple+0x154/0x3f8 [obdclass]
 [  612.339694] [<ffff000001e74ee0>] osd_start+0x500/0xa40 [obdclass]
 [  612.342277] [<ffff000001e80a84>] server_fill_super+0x1d4/0x1848 [obdclass]
 [  612.345078] [<ffff000001e437a4>] lustre_fill_super+0x62c/0xdb0 [obdclass]
 [  612.347655] [<ffff0000082a02e0>] mount_nodev+0x5c/0xbc
 [  612.349954] [<ffff000001e3adc4>] lustre_mount+0x4c/0x80 [obdclass]
 [  612.352263] [<ffff0000082a1324>] mount_fs+0x54/0x16c
 [  612.354159] [<ffff0000082bfb6c>] vfs_kern_mount+0x58/0x154
 [  612.356246] [<ffff0000082c2ff8>] do_mount+0x1cc/0xbac
 [  612.358192] [<ffff0000082c3d60>] SyS_mount+0x88/0xd4

This is due to ext4_readdir()'s use of dir_relax_shared(), which takes the
inode_lock and conflicts with the lock taken in our dir_context actor.
Having the dir_context actor take the lock so deep in the stack is incorrect
behavior. Since this is the case we need to migrate the locking from the
dir_context actor to before ext4_readdir() is called. This ends up involving
implementing the locking in the code osd-ldiskfs implemented that calls
fops->iterate_dir(). Instead of handling the lock changes at the osd-ldiskfs
level across many kernel versions we can use iterate_dir() directly which does
this handling for us.

To use iterate_dir() we need to work around the fact that osd-ldiskfs does
not use the VFS layer to create or manage VFS data structures such as struct file
or struct dentry. Instead osd-ldiskfs uses these VFS data structures as scratch
areas just to interface with the ext4 layer. We could use the VFS layer to
manage these data structures but that would require a massive reworking of this
OSD abstraction. Instead we look at what pieces are missing from struct file
due to the open coding that iterate_dir() expects. The first thing missing
from struct file is the initialization of struct path which is used by
file_accessed() to update the atime. The current behavior for our OSD layer
is not to update the atime so we preserve this behavior by setting the
struct file f_flags field to O_NOATIME. This prevents the calling of
touch_atime() which expects a proper struct path. The second expected field
for iterate_dir() is f_security. The function security_file_alloc() is not
exported so we do this manually. For the Lustre OSD case we are not interested
in any file security messages since our object store is not exposed in any way.
We disable the sending of the fsnotify events by setting the f_mode flag to
FMODE_NONOTIFY.

With this migration of the lock handling up the OSD stack we can no longer
just use osd_lookup_one_len_unlocked(). Replace osd_lookup_one_len_unlocked()
with osd_lookup_one_len() for the cases when the inode_lock() is taken much
earlier in the stack. This also closely follows the behavior of similar
functions in the linux VFS layer.

Change-Id: I00893f41ef5ec01835f0e58da6bd5c96a62aea88
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/34714
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
21 hours ago LU-12944 mdd: pass correct xattr size to lower layers 89/36689/3
Sebastien Buisson [Wed, 6 Nov 2019 17:31:08 +0000 (02:31 +0900)]
LU-12944 mdd: pass correct xattr size to lower layers

In mdd_iterate_xattrs(), struct lu_buf allocated to store xattr value
can be reused for multiple xattrs, because it is only reallocated if
it happens to be too small for one xattr.
As a consequence, lb_len field does not represent actual xattr's size.
It has to be adjusted when passed to lower layers.
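
The distinction between the buffer's capacity and the current xattr's
size can be sketched as follows. This is a minimal userspace model, not
the patch itself: struct sketch_buf only imitates the two fields of
struct lu_buf mentioned above, and the helper name is hypothetical.

```c
#include <stddef.h>

/* Imitates the pointer/length pair of struct lu_buf. */
struct sketch_buf {
	void *lb_buf;
	size_t lb_len;
};

/*
 * Build a view of a reused (possibly oversized) buffer whose length is
 * the size of the xattr currently stored in it, so lower layers do not
 * mistake the allocation size for the value size. The view never
 * exceeds the underlying allocation.
 */
static struct sketch_buf sketch_buf_view(const struct sketch_buf *big,
					 size_t xattr_size)
{
	struct sketch_buf view;

	view.lb_buf = big->lb_buf;
	view.lb_len = xattr_size <= big->lb_len ? xattr_size : big->lb_len;
	return view;
}
```

The original buffer keeps its full length, so it can still be reused for
the next, possibly larger, xattr without reallocation.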

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I26b54759b4e69fbac17a1032bbc724b796d78108
Reviewed-on: https://review.whamcloud.com/36689
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12931 timers: correctly offset mod_timer. 88/36688/5
James Simmons [Tue, 19 Nov 2019 00:17:49 +0000 (19:17 -0500)]
LU-12931 timers: correctly offset mod_timer.

During a high level code review of the lustre time code it was
discovered that some of the mod_timer() calls were missing the
current jiffies value added to the timeout that was converted
to jiffies from seconds. Add this proper offset.
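
Since mod_timer() takes an absolute expiry time in jiffies, the missing
offset can be illustrated with a small userspace model. This is a sketch
only: SKETCH_HZ is an assumed tick rate and the function names are
hypothetical, standing in for the kernel's seconds-to-jiffies conversion
and the jiffies counter.

```c
/* Assumed tick rate for the example; the real HZ is a kernel config value. */
#define SKETCH_HZ 1000UL

/* Convert a timeout in seconds to a relative number of ticks. */
static unsigned long sketch_secs_to_jiffies(unsigned long secs)
{
	return secs * SKETCH_HZ;
}

/*
 * Absolute expiry to hand to a mod_timer()-style interface: the current
 * tick count plus the relative timeout. Passing only the relative part
 * would name a point far in the past on a system that has been up for
 * longer than the timeout.
 */
static unsigned long sketch_timer_expiry(unsigned long now_jiffies,
					 unsigned long timeout_secs)
{
	return now_jiffies + sketch_secs_to_jiffies(timeout_secs);
}
```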

Fixes: b11be372c21d ("LU-9019 lnet: move ping an delay injection to time64_t")
Fixes: e920b3681451 ("LU-9019 ldlm: migrate the rest of the code to 64 bit time")
Change-Id: Ie4be14946032308610aff2fe72d15d4d70773da1
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36688
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12828 ldlm: FLOCK request can be processed twice 40/36340/4
Andriy Skulysh [Thu, 9 Aug 2018 14:58:17 +0000 (17:58 +0300)]
LU-12828 ldlm: FLOCK request can be processed twice

Original request can be processed after resend
request, so it can create a lock on MDT without
client lock or unlock other lock.

Make flock enqueue to use modify RPC slot.

Change-Id: Icfee202fe2e389beda1116f78f8b933c7ea182fb
Cray-bug-id: LUS-5739
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/36340
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 hours ago LU-12503 llite: file write pos mismatch 21/36021/3
Bobi Jam [Wed, 27 Nov 2019 08:48:49 +0000 (16:48 +0800)]
LU-12503 llite: file write pos mismatch

In vvp_io_write_start(), data may be written successfully but, for
some reason (e.g. out of quota), not committed or only partially
committed. The file's write position (kiocb->ki_pos) is then
pushed forward falsely, and in the next iteration of the write
loop it fails the assertion

ASSERTION( io->u.ci_rw.rw_iocb.ki_pos == range->cir_pos )

This patch corrects ki_pos if this scenario happens.
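
The arithmetic of such a correction can be sketched as below. This is a
hedged illustration of the idea only: the function name and its
parameters are hypothetical, and the actual patch may compute the
corrected position differently.

```c
/*
 * Given the position after a write attempt, the number of bytes the
 * write advanced it by, and the number of bytes actually committed,
 * return the position the next loop iteration should start from.
 * If everything was committed the position is left unchanged.
 */
static long long sketch_fix_pos(long long pos_after_write,
				long long written, long long committed)
{
	if (committed < written)
		return pos_after_write - (written - committed);
	return pos_after_write;
}
```

For example, a write of 4096 bytes starting at offset 1000 moves the
position to 5096; if only 1024 bytes were committed, the corrected
position is 2024.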

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ib85b1a777da24cc935e5976beab2390052b4cec3
Reviewed-on: https://review.whamcloud.com/36021
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 days ago LU-13057 tests: sanity-pcc test_1c: Failed to start copytool 57/36957/3
Yang Sheng [Mon, 9 Dec 2019 07:34:00 +0000 (15:34 +0800)]
LU-13057 tests: sanity-pcc test_1c: Failed to start copytool

Initialize the default value on entry to ensure the correct value is used.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I6053f4c9e9cc655fdb1501c901aab81f4bd14987
Reviewed-on: https://review.whamcloud.com/36957
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 days agoLU-10589 tests: insulate sanity-dom from test failures 69/36369/3
James Nunez [Fri, 4 Oct 2019 11:39:03 +0000 (05:39 -0600)]
LU-10589 tests: insulate sanity-dom from test failures

The test suite sanity-dom calls two other test suites.
If one of these subtest suites has a test failure, that
failure is correctly reported as a failure, but it also
causes the last test in the subtest suite to look like
a failure.

Return zero for the subtest suites sanity and sanityn.

Test-Parameters: trivial fstype=zfs testlist=sanity-dom
Test-Parameters: fstype=ldiskfs testlist=sanity-dom
Test-Parameters: mdscount=2 mdtcount=4 fstype=zfs testlist=sanity-dom
Test-Parameters: mdscount=2 mdtcount=4 fstype=ldiskfs testlist=sanity-dom
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I778990ad8724de182a96399cf09758edfd35b2e1
Reviewed-on: https://review.whamcloud.com/36369
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
9 days agoLU-12514 obdclass: discard FS_REQUIRES_DEV flag. 25/35425/13
NeilBrown [Thu, 12 Sep 2019 00:13:07 +0000 (20:13 -0400)]
LU-12514 obdclass: discard FS_REQUIRES_DEV flag.

Lustre client mounts do not need a dev, as we can see from
lustre_mount() calling mount_nodev(). So remove the flag which
could cause confusion elsewhere.

Linux-commit: 60de0ad7076260081de78346c0fee24ed8e3c5c8

Change-Id: I93adeb9594369255018de0b618b158a39b634c0b
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/35425
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 days agoLU-12833 obdclass: fix LWP config processing 91/36391/2
Alexander Boyko [Mon, 7 Oct 2019 10:39:25 +0000 (06:39 -0400)]
LU-12833 obdclass: fix LWP config processing

When the config includes SKIP records for the add mdc command,
LWP config processing interprets add_conn as valid and fails
to add the connection because there is no LWP device:
lustre_lwp_add_conn()) lustre-OST0000: can't find lwp device.
server_start_targets()) lustre-OST0000: failed to start LWP: -2

The fix checks CFG_F_MARKER and CFG_F_SKIP before adding a
connection. These flags show that the marker was not in the SKIP
state and that the device and uuid were added.

Cray-bug-id: LUS-7933
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Idbf709bb46a0be958946048fb16d3b622d2edd1f
Reviewed-on: https://review.whamcloud.com/36391
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12929 lnet: Remove field net_state in struct lnet_net 55/36655/2
Mr NeilBrown [Mon, 4 Nov 2019 00:15:42 +0000 (11:15 +1100)]
LU-12929 lnet: Remove field net_state in struct lnet_net

lnet_net.net_state is set but never used.  So the code that sets
it isn't tested and might be wrong.  Best to just remove it.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia11b8c0da0286f82c43be353944e036b5ea936c6
Reviewed-on: https://review.whamcloud.com/36655
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 mdc: Use BUILD_BUG_ON() for mdc_request.c 22/36722/2
Arshad Hussain [Tue, 29 Oct 2019 15:57:31 +0000 (21:27 +0530)]
LU-12923 mdc: Use BUILD_BUG_ON() for mdc_request.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/mdc/mdc_request.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ib04ddcb9d93c99cebdb175b39aceca0c25b12a10
Reviewed-on: https://review.whamcloud.com/36722
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 target: Use BUILD_BUG_ON() for tgt_lastrcvd.c 21/36721/2
Arshad Hussain [Tue, 29 Oct 2019 15:48:40 +0000 (21:18 +0530)]
LU-12923 target: Use BUILD_BUG_ON() for tgt_lastrcvd.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/target/tgt_lastrcvd.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I72df471f3480ae2267be08103ae4fae62c94b2ff
Reviewed-on: https://review.whamcloud.com/36721
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 ldlm: Use BUILD_BUG_ON() for ldlm_resource.c 20/36720/2
Arshad Hussain [Tue, 29 Oct 2019 15:39:56 +0000 (21:09 +0530)]
LU-12923 ldlm: Use BUILD_BUG_ON() for ldlm_resource.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/ldlm/ldlm_resource.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I16e32c16de09ef020ed0dad1775403d9d3c55ef6
Reviewed-on: https://review.whamcloud.com/36720
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 mdd: Use BUILD_BUG_ON() for mdd_object.c 19/36719/3
Arshad Hussain [Tue, 29 Oct 2019 15:30:29 +0000 (21:00 +0530)]
LU-12923 mdd: Use BUILD_BUG_ON() for mdd_object.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/mdd/mdd_object.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I6badf6cbeb2726b643f494f68b2f30752ffffa8c
Reviewed-on: https://review.whamcloud.com/36719
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 lov: Use BUILD_BUG_ON() for lov_pack.c 18/36718/2
Arshad Hussain [Tue, 29 Oct 2019 14:14:33 +0000 (19:44 +0530)]
LU-12923 lov: Use BUILD_BUG_ON() for lov_pack.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/lov/lov_pack.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ic59b482e73cf2556247153ecde658ded57b603f9
Reviewed-on: https://review.whamcloud.com/36718
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 mdc: Use BUILD_BUG_ON() for mdc_reint.c 17/36717/2
Arshad Hussain [Tue, 29 Oct 2019 14:54:22 +0000 (20:24 +0530)]
LU-12923 mdc: Use BUILD_BUG_ON() for mdc_reint.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/mdc/mdc_reint.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I70a711f15e6471f7a89577e4d7d712cf46e4e0cd
Reviewed-on: https://review.whamcloud.com/36717
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
9 days agoLU-12928 gss: crash in sec2target_str() 08/36708/5
Yang Sheng [Thu, 7 Nov 2019 18:48:43 +0000 (02:48 +0800)]
LU-12928 gss: crash in sec2target_str()

The timer_setup() API has been in use since the 3.10.0-957.x
kernel, so change gck_timer to an embedded struct to avoid
crashing with the new timer API.

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Ie12e21bca4169746016c8ac0e3ee4a125893ebf6
Reviewed-on: https://review.whamcloud.com/36708
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12942 lnet: Optimize check for routing feature flag 79/36679/2
Chris Horn [Wed, 11 Sep 2019 20:27:30 +0000 (15:27 -0500)]
LU-12942 lnet: Optimize check for routing feature flag

Check the routing feature flag outside of the loop.

Cray-bug-id: LUS-7862
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I636305532d4bd5b8157a9df1b98af0da3bba867f
Reviewed-on: https://review.whamcloud.com/36679
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
9 days agoLU-12936 lnet: remove pt_number from lnet_peer_table. 71/36671/2
Mr NeilBrown [Tue, 5 Nov 2019 02:49:16 +0000 (13:49 +1100)]
LU-12936 lnet: remove pt_number from lnet_peer_table.

This field is no longer used, except for an ASSERT().
It did have a use once, but that was removed in
Commit ffd8e881bb98 ("LU-2456 lnet: Dynamic LNet
                      Configuration (DLC)")

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I43d4c50d68fea3194f8515ac02599ef07b37ad60
Reviewed-on: https://review.whamcloud.com/36671
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 ptlrpc: Use BUILD_BUG_ON() for wiretest.c 48/36648/3
Arshad Hussain [Mon, 28 Oct 2019 23:58:34 +0000 (05:28 +0530)]
LU-12923 ptlrpc: Use BUILD_BUG_ON() for wiretest.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/ptlrpc/wiretest.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Iaf0a9074809883f6eac2274a9e9bfba26a0ac84e
Reviewed-on: https://review.whamcloud.com/36648
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 utils: Use BUILD_BUG_ON() for wiretest.c 47/36647/3
Arshad Hussain [Mon, 28 Oct 2019 22:14:35 +0000 (03:44 +0530)]
LU-12923 utils: Use BUILD_BUG_ON() for wiretest.c

This patch replaces all CLASSERT() with BUILD_BUG_ON()
for file lustre/utils/wiretest.c

This is done by replacing the locally defined CLASSERT() macro
with a locally defined BUILD_BUG_ON() macro. It replicates the
kernel's BUILD_BUG_ON(), which asserts when the condition is
true. Because this is user-space code, the kernel's
BUILD_BUG_ON() cannot be used here, so we rely on a local
definition.
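
For illustration only, one common way to write such a user-space macro
is the negative-array-size idiom below. This is an assumption about the
general technique, not the definition used in lustre/utils/wiretest.c,
and struct demo_wire_hdr is a hypothetical structure.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compilation fails when cond is true, because the array type would
 * get a negative size.  When cond is false the expression is a no-op.
 */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Hypothetical wire structure, not taken from the Lustre headers. */
struct demo_wire_hdr {
	uint32_t magic;
	uint32_t version;
};

static int demo_check_wire_hdr(void)
{
	/*
	 * CLASSERT(x) required x to hold; BUILD_BUG_ON(x) fires when x
	 * holds, so each condition is inverted during the conversion.
	 */
	BUILD_BUG_ON(sizeof(struct demo_wire_hdr) != 8);
	BUILD_BUG_ON(sizeof(((struct demo_wire_hdr *)0)->magic) != 4);

	/* The size is also checked at run time to show both styles. */
	assert(sizeof(struct demo_wire_hdr) == 8);
	return (int)sizeof(struct demo_wire_hdr);
}
```

Changing either expected size makes the file fail to compile, which is
the behaviour a wire-format check wants.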

This patch also fixes a few space/tab issues reported
by checkpatch.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I2f8bfdbd034a2c8059cf356dd72e4255f4999f8e
Reviewed-on: https://review.whamcloud.com/36647
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12923 mdc: Use BUILD_BUG_ON() for mdc_lib.c 46/36646/4
Arshad Hussain [Mon, 28 Oct 2019 20:35:46 +0000 (02:05 +0530)]
LU-12923 mdc: Use BUILD_BUG_ON() for mdc_lib.c

This patch replaces all CLASSERT() with kernel defined
BUILD_BUG_ON() for file lustre/mdc/mdc_lib.c

This patch also fixes a few space/tab issues reported
by checkpatch.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I6bdc084ca73163b88b2dd105b44b9a3cb611a999
Reviewed-on: https://review.whamcloud.com/36646
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
9 days agoLU-12923 contrib: Update spelling.txt to add BUILD_BUG_ON() 45/36645/4
Arshad Hussain [Mon, 28 Oct 2019 18:47:50 +0000 (00:17 +0530)]
LU-12923 contrib: Update spelling.txt to add BUILD_BUG_ON()

This is the first in a series of patches that replace
CLASSERT() with the upstream kernel's BUILD_BUG_ON().

This specific patch updates contrib/scripts/spelling.txt
to add the line CLASSERT||BUILD_BUG_ON(). This lets checkpatch
flag a warning in follow-up patches if any use of CLASSERT()
is still left.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: If8fd76dd107cb53d657b7fa89bd62a9357222629
Reviewed-on: https://review.whamcloud.com/36645
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
9 days agoLU-12920 build: replace ed with sed 30/36630/3
Minh Diep [Thu, 31 Oct 2019 14:26:03 +0000 (07:26 -0700)]
LU-12920 build: replace ed with sed

The ed command is very old; use sed instead.

Test-Parameters: trivial

Change-Id: I18ffe50c3fb006182e68460c03a4d34d5011e62a
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36630
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12895 mdt: check if object exists first 29/36629/5
Sebastien Buisson [Thu, 31 Oct 2019 11:33:45 +0000 (20:33 +0900)]
LU-12895 mdt: check if object exists first

Make sure object exists before trying to get its attr.

Test-Parameters: clientselinux mdtcount=4 envdefinitions=ONLY=185a testlist=sanity,sanity,sanity,sanity
Test-Parameters: clientselinux mdtcount=4 testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idb2cd5d6e3fdf7998040b933be54a001a0e5391b
Reviewed-on: https://review.whamcloud.com/36629
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 days agoLU-12910 osc: allow increasing osc.*.short_io_bytes 87/36587/16
Andreas Dilger [Sat, 26 Oct 2019 11:32:03 +0000 (05:32 -0600)]
LU-12910 osc: allow increasing osc.*.short_io_bytes

The osc.*.short_io_bytes parameter was mixing up the default and
maximum parameter values, and did not allow increasing the parameter
beyond the default.

Allow it to be increased to the maximum value, which depends on the
client PAGE_SIZE, and the amount of free space in the maximally-sized
OST RPC.  Since the maximum size is system dependent, allow some
grace when setting the parameter, so that a single tunable parameter
can work on a variety of different systems.

However, if it is larger than the maximum RDMA size (which is already
too large) return an error, as it means something is wrong.

Add a test case to exercise the osc.*.short_io_bytes parameter.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2ce73af5963a0f9e0f1079dd2f91a4495a3ebbe5
Reviewed-on: https://review.whamcloud.com/36587
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12904 build: Support for gcc -Wimplicit-fallthrough 77/36577/2
Shaun Tancheff [Fri, 25 Oct 2019 13:13:26 +0000 (08:13 -0500)]
LU-12904 build: Support for gcc -Wimplicit-fallthrough

Linux 5.3 enables -Wimplicit-fallthrough.
Add decorators for implicit-fallthrough compiler checks.
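
As a hedged illustration of what such a decorator can look like (the
macro name DEMO_FALLTHROUGH and the function are hypothetical, and the
patch itself may use a comment-based annotation instead):

```c
#include <assert.h>

/* Use the statement attribute when the compiler advertises it. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define DEMO_FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef DEMO_FALLTHROUGH
# define DEMO_FALLTHROUGH do { } while (0) /* fallthrough */
#endif

/* Counts how many cases run for a given level (2 runs three cases). */
static int demo_count_cases(int level)
{
	int n = 0;

	assert(level >= 0);
	switch (level) {
	case 2:
		n++;
		DEMO_FALLTHROUGH;
	case 1:
		n++;
		DEMO_FALLTHROUGH;
	case 0:
		n++;
		break;
	default:
		n = -1;
		break;
	}
	return n;
}
```

Each intentional fall-through is marked, so the warning only fires for
a missing break that was not intended.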

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I740062e60e1d19b967ec6b91970cdd3ab03cbab6
Reviewed-on: https://review.whamcloud.com/36577
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12904 build: External _module_ decorator removed 76/36576/2
Shaun Tancheff [Fri, 25 Oct 2019 13:10:59 +0000 (08:10 -0500)]
LU-12904 build: External _module_ decorator removed

As of 5.4 the _module_ decorator prefix is not used for external
kernel modules. This breaks building kernel modules for 5.4.
Prior kernels still require the _module_ decorator.

Add a configure check to detect whether the _module_ decorator is
used and handle both cases.

Linux-commit: d7b0827f28ab3a4fd65864451ffefa695e3255fd

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I4359452cea8e32a31234b9becc2ed319954c55a4
Reviewed-on: https://review.whamcloud.com/36576
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-9859 libcfs: Prevent harmless read underflow 67/36567/2
Dan Carpenter [Thu, 24 Oct 2019 17:42:53 +0000 (13:42 -0400)]
LU-9859 libcfs: Prevent harmless read underflow

Because this is a post-op instead of a pre-op, we end up checking
whether knl_buffer[-1] is a space.  It doesn't really hurt anything,
but it causes a static checker warning, so let's fix it.
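
A self-contained sketch of the difference, assuming a loop that strips
trailing whitespace; demo_rstrip() is a hypothetical helper and not the
libcfs function that was patched:

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Strip trailing whitespace in place and return the new length. */
static int demo_rstrip(char *buf)
{
	int nob = (int)strlen(buf);

	assert(buf != NULL);
	/*
	 * Pre-decrement: nob is decremented before the comparison, so
	 * the body only sees indexes 0..len-1.  With the post-op form
	 * "while (nob-- >= 0)" the test passes at nob == 0 and the body
	 * then reads buf[-1].
	 */
	while (--nob >= 0)
		if (!isspace((unsigned char)buf[nob]))
			break;

	buf[nob + 1] = '\0';
	return nob + 1;
}
```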

Linux-commit: 134aecbc25fd77645baaea5467b2a7ed8e9d1ea7

Change-Id: I40fee264eb1ac461baa183f199b4e5e1b5eb26f5
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-9325 mds: replace simple_strtol use with target_name2index() 51/36551/5
James Simmons [Thu, 7 Nov 2019 13:49:00 +0000 (08:49 -0500)]
LU-9325 mds: replace simple_strtol use with target_name2index()

With simple_strtol() going away in the future, we should move to the
kstrtoXXX functions. The simple_strtol() calls in the lod and osp
layers really do the job of target_name2index(), so migrate to that
function and leave a single place to update when moving away from
simple_strtol().

Change-Id: Ia3d0208c1b1c6bfbe9aa03ce3c068d41ed2c7595
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/36551
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-9859 libcfs: move remaining code from linux-module.c to module.c 10/36510/2
NeilBrown [Sun, 20 Oct 2019 15:09:10 +0000 (11:09 -0400)]
LU-9859 libcfs: move remaining code from linux-module.c to module.c

There is no longer any need to keep this code separate,
and now we can remove linux-module.c

Linux-commit: 9604c7ac2005e214cb08500c957a79c58bea5c83

Test-Parameters: trivial

Change-Id: Ie2b905f5a79be17840ddfac0661c10332dc2667d
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/36510
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
9 days agoLU-8130 obd: remove unused HASH_CL_ENV_[BKT]_BITS 32/36432/2
James Simmons [Fri, 11 Oct 2019 13:11:00 +0000 (09:11 -0400)]
LU-8130 obd: remove unused HASH_CL_ENV_[BKT]_BITS

There is no libcfs hash table for cl_env so this can be removed.

Test-Parameters: trivial

Change-Id: I8d9d4f1dc683edc8fc4c14ffc8266deb178c3162
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/36432
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 days agoLU-12816 ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM 09/36309/8
Ann Koehler [Mon, 14 Oct 2019 16:30:56 +0000 (11:30 -0500)]
LU-12816 ptlrpc: ptlrpc_register_bulk LBUG on ENOMEM

Another path through ptl_send_rpc() can cause the assert reported
in LU-10643. The assertion in ptlrpc_register_bulk() on
!desc->bd_registered fails when an rpc is resent and the first
send attempt failed to successfully attach the reply buffer. The
bulk error cleanup in ptl_send_rpc() does not reset the
bd_registered flag.
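
The failure mode can be modeled in a few lines of user-space C. This is
a simplified sketch under the assumption that the fix resets the flag in
the error path; the demo_* names are hypothetical and only bd_registered
comes from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the bulk descriptor; only the flag is modeled. */
struct demo_bulk_desc {
	bool bd_registered;
};

static void demo_register_bulk(struct demo_bulk_desc *desc)
{
	/* Mirrors the assertion on !desc->bd_registered. */
	assert(!desc->bd_registered);
	desc->bd_registered = true;
}

/*
 * Model of one send attempt.  attach_ok says whether attaching the
 * reply buffer succeeds; on failure the cleanup must undo the bulk
 * registration, including the flag.
 */
static int demo_send_rpc(struct demo_bulk_desc *desc, bool attach_ok)
{
	demo_register_bulk(desc);
	if (!attach_ok) {
		/* Without this reset, a resend would trip the assertion. */
		desc->bd_registered = false;
		return -ENOMEM;
	}
	return 0;
}
```

A failed first attempt followed by a resend on the same descriptor then
passes through demo_register_bulk() without asserting.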

Cray-bug-id: LUS-7946
Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: I474211f196ea9bd83a036747e25c91c37c85ffbb
Reviewed-on: https://review.whamcloud.com/36309
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-9859 libcfs: opencode cfs_cap_{raise,lower,raised} 04/36304/4
NeilBrown [Fri, 11 Oct 2019 14:38:45 +0000 (10:38 -0400)]
LU-9859 libcfs: opencode cfs_cap_{raise,lower,raised}

Each of these functions is used precisely once, so having
a separate exported function seems like overkill.

cfs_cap_raised() is trivial - one line.
cfs_cap_raise() and cfs_cap_lower() are used as a pair
which is more effectively implemented with
override_creds() / revert_creds().

Linux-commit: cc738c1a69da27be8ff7885b4069fa02e45c75c1

There is a bug in the original Linux client patch.
Additionally, CAP_SYS_RESOURCE handling is used
extensively in the server code, so create simple
inline functions to handle it, which makes the
code cleaner.

Change-Id: I3a39a855fb9718ca43e74ef4b9e749b0f43f4bc8
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/36304
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12780 lustre: remove SVC_EVENT flag 57/36257/4
Mr NeilBrown [Tue, 22 Oct 2019 14:42:29 +0000 (10:42 -0400)]
LU-12780 lustre: remove SVC_EVENT flag

This flag is never set or tested, so remove it and the
function for testing it.

Test-Parameters: trivial

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie9fc586ecd26ffce16026d53eac998e3c046d270
Reviewed-on: https://review.whamcloud.com/36257
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 days agoLU-12770 test: zfs project_quota testing 97/36197/6
Shaun Tancheff [Fri, 4 Oct 2019 18:13:35 +0000 (13:13 -0500)]
LU-12770 test: zfs project_quota testing

Use 'zpool get all' to query zfs features and check for project_quota.
Use zpool get/set to manage the project_quota feature where possible.

Cray-bug-id: LUS-7795
Test-Parameters: trivial testlist=sanity-quota
Test-Parameters: fstype=zfs testlist=sanity-quota
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I8111820aa2f4415e8d62c472a3553fe3b9288f19
Reviewed-on: https://review.whamcloud.com/36197
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
9 days agoLU-12757 utils: avoid newline inside error message 76/36176/5
Andreas Dilger [Thu, 12 Sep 2019 23:31:00 +0000 (17:31 -0600)]
LU-12757 utils: avoid newline inside error message

When calling llapi_error() the format string should not end in a
newline, since the error string is appended to the output with
its own newline.

Fix several callers to not supply their own newline, and callers
that duplicate the error string in the error message itself.

In the case that there are callers that *do* include a newline,
handle this gracefully to avoid splitting the error across lines.
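
A rough sketch of the graceful handling, assuming the message is built
in a buffer before the error string is appended. demo_format_error() is
a hypothetical helper and not the liblustreapi implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Build "<message>: <strerror(err)>\n" in out, on a single line. */
static int demo_format_error(char *out, size_t outlen, int err,
			     const char *fmt, ...)
{
	va_list args;
	int len;

	assert(outlen > 0);
	va_start(args, fmt);
	len = vsnprintf(out, outlen, fmt, args);
	va_end(args);
	if (len < 0)
		return len;
	if ((size_t)len >= outlen)
		len = (int)outlen - 1;

	/* Tolerate callers whose format string ended in a newline. */
	while (len > 0 && out[len - 1] == '\n')
		out[--len] = '\0';

	/* The error string and the only newline are appended here. */
	snprintf(out + len, outlen - (size_t)len, ": %s\n", strerror(err));
	return 0;
}
```

Both a format with and without a trailing newline then produce the same
single-line output.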

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie8f7206d82faccb3b33e2fc62b00f5226b3ebbe5
Reviewed-on: https://review.whamcloud.com/36176
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12275 osd: make osd layer always send complete pages 38/36238/7
Sebastien Buisson [Thu, 19 Sep 2019 17:24:49 +0000 (19:24 +0200)]
LU-12275 osd: make osd layer always send complete pages

In the osd layer, instead of checking whether we go beyond isize, just
make sure we always send complete pages.
Data in the page beyond isize will be discarded by the client anyway,
and it should not be harmful to send at most PAGE_SIZE-1 extra bytes
for reads at end of file.
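
The PAGE_SIZE-1 bound follows from rounding the end of the read up to a
page boundary. A small sketch of that arithmetic, assuming 4096-byte
pages; the demo_* names are hypothetical and this is not the osd code:

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096UL /* assumed page size for the example */

/* Round the end offset of a read up to the next page boundary. */
static unsigned long demo_full_page_end(unsigned long end)
{
	return (end + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/* Extra bytes sent beyond isize when the last page is sent whole. */
static unsigned long demo_extra_bytes(unsigned long isize)
{
	unsigned long extra = demo_full_page_end(isize) - isize;

	/* Never more than PAGE_SIZE-1 bytes past the end of file. */
	assert(extra < DEMO_PAGE_SIZE);
	return extra;
}
```

The worst case is a file that ends one byte into its last page, where
4095 extra bytes are sent.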

With this new paradigm, sanity test_246 must be removed, as its sole
purpose is to make sure we do not send more than isize bytes to the
client.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I03dc6037a8dfa1d40d40a4b1f675e047d862d933
Reviewed-on: https://review.whamcloud.com/36238
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
9 days agoLU-12634 libcfs: force_sig() removed task parameter 45/35745/8
Shaun Tancheff [Tue, 5 Nov 2019 05:12:20 +0000 (23:12 -0600)]
LU-12634 libcfs: force_sig() removed task parameter

Linux 5.3 removed the task parameter from force_sig():
signal: Remove task parameter from force_sig

When force_sig() is not available, reset the target thread's
default handler to SIG_DFL and use send_sig(..., 1), which
eventually delivers the same signal to the target task.

kernel-commit: 3cf5d076fb4d48979f382bc9452765bf8b79e740

NOTE: force_sig() is used here instead of a wake_up_process() as tasks
      may be blocked on rpc activity.

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ic28f604d985f7e6c3c3dea8bc284c6f2e212f45c
Reviewed-on: https://review.whamcloud.com/35745
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12634 build: Recognize ELRepo -ml mainline kernel 42/35742/5
Shaun Tancheff [Fri, 25 Oct 2019 16:15:00 +0000 (11:15 -0500)]
LU-12634 build: Recognize ELRepo -ml mainline kernel

Add support for identifying ELRepo kernel-ml style
builds on CentOS 7 and 8 based distributions

Test-Parameters: trivial
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: If4ae7441d4d023d31b1fb42f3fe90ff9c747c0f8
Reviewed-on: https://review.whamcloud.com/35742
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-11762 ldlm: ensure the recovery timer is armed 27/35627/4
Hongchao Zhang [Wed, 10 Jul 2019 08:22:15 +0000 (04:22 -0400)]
LU-11762 ldlm: ensure the recovery timer is armed

During recovery, when the recovery timer expires, the VBR phase
is initiated only if the current recovery timeout is less than the
hard recovery timeout; otherwise recovery will be stuck in
wait_event_timeout() because no timer is armed and it cannot be
woken up.

Change-Id: I32467afa45393e37f255e2b14f160c9da710461b
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35627
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12518 llite: support page unaligned stride readahead 37/35437/12
Wang Shilong [Mon, 19 Aug 2019 06:57:29 +0000 (14:57 +0800)]
LU-12518 llite: support page unaligned stride readahead

Currently, Lustre works well for aligned IO, but performance
is poor for unaligned stride reads, and this patch aims to
improve that.

One of the main problems with the current stride read detection
is that it is based on page index, so it does not work well when
reads are not page aligned. To support page-unaligned stride
reads, change the page index to a byte offset so that stride read
pattern detection works and we avoid many small-page RPCs and
readahead window resets. At the same time, keep as much
performance as possible for existing cases and make sure there
are no obvious regressions for aligned-stride and sequential
reads.
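
To see why page indexes hide an unaligned stride, consider this small
sketch. It assumes 4 KiB pages and a stride of two 43 KiB records; the
demo_* helpers are hypothetical and not part of the readahead code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SHIFT 12 /* assumed 4 KiB pages */

/* True when consecutive start positions are equally spaced. */
static bool demo_constant_stride(const unsigned long *pos, size_t n)
{
	size_t i;

	assert(n >= 3);
	for (i = 2; i < n; i++)
		if (pos[i] - pos[i - 1] != pos[1] - pos[0])
			return false;
	return true;
}

/* Record the start of n reads, in bytes and in page indexes. */
static void demo_fill(unsigned long stride, size_t n,
		      unsigned long *bytes, unsigned long *pages)
{
	size_t i;

	for (i = 0; i < n; i++) {
		bytes[i] = i * stride;
		pages[i] = bytes[i] >> DEMO_PAGE_SHIFT;
	}
}
```

With a stride of 88064 bytes (21.5 pages) the byte offsets are equally
spaced, while the page indexes step by 21 and 22 alternately, so a
detector comparing page indexes sees no stable pattern.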

Benchmark numbers:
iozone -w -c -i 5 -t1 -j 2 -s 1G -r 43k -F /mnt/lustre/data

Patched                 Unpatched
1386630.75 kB/sec       152002.50 kB/sec

Performance improved by more than 800%.

Benchmarked with IOR from ihara:
            FPP Read (MB/sec)   SSF Read (MB/sec)
Unpatched   44,636              7,731
Patched     44,318              20,745

Performance for the ior_hard_read workload improved by more than 2.5x.

Change-Id: I791745f957af84a6c790c52fbe9f5fed3fd30c77
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35437
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
9 days agoLU-8066 obd_type: discard obd_type_lock 96/35096/9
NeilBrown [Wed, 18 Sep 2019 02:07:56 +0000 (22:07 -0400)]
LU-8066 obd_type: discard obd_type_lock

This lock is only used to protect typ_refcnt, so change
that to an atomic_t and discard the lock.

The lock also covers calls to try_module_get and module_put,
but this serves no purpose as it does not prevent the module
from being unloaded.

Finally, the return value for the call to try_module_get is
ignored, which is not safe.
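
A minimal user-space sketch of the idea, assuming C11 <stdatomic.h>
as a stand-in for the kernel's atomic_t: once the counter itself is
atomic, no separate lock is needed to protect it. The structure and
function names are hypothetical, and the module reference handling
is left out.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the object that carries typ_refcnt. */
struct sketch_type {
	atomic_int refcnt;
};

/* Take a reference; no lock is needed around the increment. */
void sketch_type_get(struct sketch_type *t)
{
	atomic_fetch_add(&t->refcnt, 1);
}

/* Drop a reference and return the count remaining after the drop. */
int sketch_type_put(struct sketch_type *t)
{
	return atomic_fetch_sub(&t->refcnt, 1) - 1;
}
```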

Linux-commit: 493ae16ed39a1c9f792c3b650e2dff11ca2e73e8

Change-Id: I904c51cc4d3426ca520c0bcad9665380ce1f3c3d
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35096
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12355 ldiskfs: Update ldiskfs patches for 5.0 51/35051/3
Shaun Tancheff [Mon, 3 Jun 2019 22:55:05 +0000 (17:55 -0500)]
LU-12355 ldiskfs: Update ldiskfs patches for 5.0

Update ldiskfs patch series for 5.0
Update configure for ubuntu19 / 5.0.0 kernel

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I912d457c924c93cfcf98c0b91cd514d5d2a72bbc
Reviewed-on: https://review.whamcloud.com/35051
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12071 osd-ldiskfs: bypass pagecache if requested 22/34422/30
Alex Zhuravlev [Thu, 14 Mar 2019 14:51:31 +0000 (17:51 +0300)]
LU-12071 osd-ldiskfs: bypass pagecache if requested

In a few cases (non-rotational drive, by request, or file size)
osd-ldiskfs may want to skip caching. If so, bypass the page cache
instead of invalidating the cache later, as cache invalidation can
be quite expensive.

To set the maximum cached read/write IO size, use:
     lctl set_param osd-ldiskfs.*.readcache_max_io_mb=N
     lctl set_param osd-ldiskfs.*.writethrough_max_io_mb=N
The default maximum cached IO size is 8MiB.

ladvise() forces IO to go through the cache, and all subsequent
reads will consult the cache.

Change-Id: I37403ced7ad9553128ba168fa36315d6aa1aaf2d
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34422
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 days agoLU-12542 handle: move refcount into the lustre_handle. 94/35794/15
NeilBrown [Wed, 11 Sep 2019 15:34:54 +0000 (11:34 -0400)]
LU-12542 handle: move refcount into the lustre_handle.

Most objects with a lustre_handle have a refcount. The exception
is mdt_mfd which uses locking (med_open_lock) to manage its
lifetime. The lustre_handles code currently needs a call-out to
increment its refcount. To simplify things, move the refcount
into the lustre_handle (which will be largely ignored by mdt_mfd)
and discard the call-out.

To avoid warnings when refcount debugging is enabled, the refcount
of mdt_mfd is initialized to 1, and decremented after any
class_handle2object() call which would have incremented it.

In order to preserve the same debug messages, we store an object type
name in the portals_handle_ops, and use that in a CDEBUG() when
incrementing the ref count.

Change-Id: I1920330b2aeffd4b865cb9b249997aa28b209c33
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35794
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-11380 llapi: add llapi_fid_parse() helper 84/36184/11
Andreas Dilger [Fri, 13 Sep 2019 23:27:23 +0000 (17:27 -0600)]
LU-11380 llapi: add llapi_fid_parse() helper

Split the llapi_* FID handling functions to a separate file
rather than continually increasing the size of liblustreapi.c.

Add llapi_fid_parse() to parse a string to binary struct lu_fid.
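
For illustration only, a string-to-FID parser could look like the
sketch below. It is not the llapi_fid_parse() implementation: the
structure, the function name, and the assumption that a FID is
written as three hexadecimal fields "seq:oid:ver" with optional
square brackets are all hypothetical.

```c
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

/* Hypothetical mirror of a three-field FID. */
struct sketch_fid {
	uint64_t seq;	/* sequence */
	uint32_t oid;	/* object id */
	uint32_t ver;	/* version */
};

/* Parse "[seq:oid:ver]" or "seq:oid:ver" (hex fields, "0x" optional).
 * Return 0 on success, -1 on malformed input. */
int sketch_fid_parse(const char *str, struct sketch_fid *fid)
{
	int bracket = (*str == '[');
	int used = 0;

	str += bracket;
	if (sscanf(str, "%" SCNx64 ":%" SCNx32 ":%" SCNx32 "%n",
		   &fid->seq, &fid->oid, &fid->ver, &used) != 3)
		return -1;
	str += used;
	if (bracket) {
		/* an opening bracket must be matched by a closing one */
		if (*str != ']')
			return -1;
		str++;
	}
	/* reject trailing characters */
	return *str == '\0' ? 0 : -1;
}
```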

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I15abfaf888a5474d62feebab4e8db543ba3ebbe5
Reviewed-on: https://review.whamcloud.com/36184
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-9679 modules: declare zero-arg functions correctly 72/36672/2
Mr NeilBrown [Tue, 5 Nov 2019 03:04:32 +0000 (14:04 +1100)]
LU-9679 modules: declare zero-arg functions correctly

Functions that don't take any arguments should be
declared
   return-type name(void)
rather than
   return-type name()

This patch only changes functions that are included in
kernel modules.
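
The difference matters because, before C23, an empty parameter list
leaves the arguments unspecified rather than declaring that there
are none. A small sketch with hypothetical function names:

```c
/* An empty parameter list (before C23) leaves the arguments
 * unspecified, so the compiler cannot reject a call such as
 * sketch_count_old(1, 2). */
int sketch_count_old();

/* With (void) this is a real prototype: passing any argument is a
 * compile-time error. */
int sketch_count(void);

int sketch_count(void)
{
	return 42;
}
```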

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I327c57131c4b5008660844a8436fa27df53c16c7
Reviewed-on: https://review.whamcloud.com/36672
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
9 days agoLU-9679 modules: Use LIST_HEAD for declaring list_heads 69/36669/2
Mr NeilBrown [Tue, 5 Nov 2019 02:19:07 +0000 (13:19 +1100)]
LU-9679 modules: Use LIST_HEAD for declaring list_heads

Rather than
  struct list_head foo = LIST_HEAD_INIT(foo);
use
  LIST_HEAD(foo);

This is shorter and more in keeping with upstream style.
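
The two forms are equivalent, as the user-space sketch below shows.
The macro definitions are minimal copies of the kernel-style ones,
reproduced here only so the example is self-contained; the
sketch_* names are hypothetical.

```c
/* Minimal user-space copies of the kernel-style definitions. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

/* Long form: the variable name is written twice. */
struct list_head sketch_old = LIST_HEAD_INIT(sketch_old);

/* Short form: same result, name written once. */
LIST_HEAD(sketch_new);

/* An empty list head points at itself in both directions. */
int sketch_list_empty(const struct list_head *head)
{
	return head->next == head && head->prev == head;
}
```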

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I36aa8c7e0763f3dfc88fe482cd28935184c1effa
Reviewed-on: https://review.whamcloud.com/36669
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-9679 general: avoid bare return; at end of void function 54/36654/3
Mr NeilBrown [Sun, 3 Nov 2019 23:55:04 +0000 (10:55 +1100)]
LU-9679 general: avoid bare return; at end of void function

Having:
   return;
}

at the end of a void function is unnecessary noise.
Where it is the *only* statement in the function, it can
be useful, so those remain unchanged.  The rest have been removed.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: If02f6f5b91d4134cf95a68ebccc83df28c360fb2
Reviewed-on: https://review.whamcloud.com/36654
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12853 ptlrpc: zero session environment 43/36443/2
Alexander Boyko [Mon, 14 Oct 2019 07:31:35 +0000 (03:31 -0400)]
LU-12853 ptlrpc: zero session environment

handle_recovery_req() sets le_ses for request processing,
and doesn't zero it afterwards. This later leads to freed memory
being accessed in keys_fill().

The patch also cleans up the xxx_env_info helpers, making them
identical and combining them into a single function.

Cray-bug-id: LUS-7676
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ifad95c1177258b6f71effe5fa815f68c8426c516
Reviewed-on: https://review.whamcloud.com/36443
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-6142 tests: Fix style issues for tchmod.c 41/36441/2
Arshad Hussain [Sun, 29 Sep 2019 23:10:50 +0000 (04:40 +0530)]
LU-6142 tests: Fix style issues for tchmod.c

This patch fixes issues reported by checkpatch
for file lustre/tests/tchmod.c

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I59a94c26e553a616d82ecc9a4d493511e808a82e
Reviewed-on: https://review.whamcloud.com/36441
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
9 days agoLU-6142 tests: Fix style issues for sendfile.c 40/36440/2
Arshad Hussain [Sun, 29 Sep 2019 23:19:16 +0000 (04:49 +0530)]
LU-6142 tests: Fix style issues for sendfile.c

This patch fixes issues reported by checkpatch
for file lustre/tests/sendfile.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Idbc8d5cd00b57da8f91b4ce39c40942a7fea8fc3
Reviewed-on: https://review.whamcloud.com/36440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
9 days agoLU-6142 tests: Remove file rmdirmany.c 39/36439/2
Arshad Hussain [Sun, 29 Sep 2019 23:02:52 +0000 (04:32 +0530)]
LU-6142 tests: Remove file rmdirmany.c

This patch removes the file lustre/tests/rmdirmany.c.
The file is currently not used by any test or binary.

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I3bd39abefb49855d70eed3be57f8e80e2439776d
Reviewed-on: https://review.whamcloud.com/36439
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
9 days agoLU-6142 tests: Fix style issues for openunlink.c 38/36438/2
Arshad Hussain [Sun, 29 Sep 2019 22:56:15 +0000 (04:26 +0530)]
LU-6142 tests: Fix style issues for openunlink.c

This patch fixes issues reported by checkpatch
for file lustre/tests/openunlink.c

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ibfd29751769c1b8339ac249ad1379c8d42250ae3
Reviewed-on: https://review.whamcloud.com/36438
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
9 days agoLU-6142 tests: Fix style issues for write_time_limit.c 90/36390/3
Arshad Hussain [Sun, 29 Sep 2019 15:34:37 +0000 (21:04 +0530)]
LU-6142 tests: Fix style issues for write_time_limit.c

This patch fixes issues reported by checkpatch
for file lustre/tests/write_time_limit.c

Test-Parameters: trivial testlist=sanity-sec
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Id55ff1de3a4c05f04ebe446bfa394d9d0e32997c
Reviewed-on: https://review.whamcloud.com/36390
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
9 days agoLU-6142 tests: Fix style issues for unlinkmany.c 89/36389/5
Arshad Hussain [Sun, 29 Sep 2019 16:23:41 +0000 (21:53 +0530)]
LU-6142 tests: Fix style issues for unlinkmany.c

This patch fixes issues reported by checkpatch
for file lustre/tests/unlinkmany.c

This patch also updates the usage message to
print the information about the '-d' option.

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Idd107ba7c005e0186bc39fc9bb4fc84691919178
Reviewed-on: https://review.whamcloud.com/36389
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-6142 tests: Fix style issues for test_brw.c 88/36388/2
Arshad Hussain [Sun, 29 Sep 2019 17:03:43 +0000 (22:33 +0530)]
LU-6142 tests: Fix style issues for test_brw.c

This patch fixes issues reported by checkpatch
for file lustre/tests/test_brw.c

Test-Parameters: trivial testlist=sanityn,recovery-small,recovery-single,lnet-selftest
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I888fa9289839dbdf6970685395ae17f4d6a28d44
Reviewed-on: https://review.whamcloud.com/36388
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
9 days agoLU-6142 tests: Fix style issues for statone.c 87/36387/2
Arshad Hussain [Sun, 29 Sep 2019 17:17:06 +0000 (22:47 +0530)]
LU-6142 tests: Fix style issues for statone.c

This patch fixes issues reported by checkpatch
for file lustre/tests/statone.c

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Idb38488116c471b8f6dbe767eaada2dc328b3d7a
Reviewed-on: https://review.whamcloud.com/36387
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
9 days agoLU-6142 tests: Fix style issues for rwv.c 85/36385/3
Arshad Hussain [Sun, 29 Sep 2019 18:08:40 +0000 (23:38 +0530)]
LU-6142 tests: Fix style issues for rwv.c

This patch fixes issues reported by checkpatch
for file lustre/tests/rwv.c

Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I574ef984e08a413569391d67a3a27abe9502438b
Reviewed-on: https://review.whamcloud.com/36385
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
9 days agoLU-6142 tests: Fix style issues for runas.c 84/36384/2
Arshad Hussain [Sun, 29 Sep 2019 18:36:38 +0000 (00:06 +0530)]
LU-6142 tests: Fix style issues for runas.c

This patch fixes issues reported by checkpatch
for file lustre/tests/runas.c

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Id0b658a0d9fabb520f3f087c0901047518e9f6cf
Reviewed-on: https://review.whamcloud.com/36384
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Ben Evans <bevans@cray.com>
9 days agoLU-10467 llite: use wait_event in cl_object_put_last() 45/36345/2
NeilBrown [Tue, 1 Oct 2019 18:28:39 +0000 (14:28 -0400)]
LU-10467 llite: use wait_event in cl_object_put_last()

cl_object_put_last() contains an open-coded version
of wait_event().
Replace it with the library macro.

Change-Id: I878f76c9af24e827f91fe50fbeb637dda1489b8a
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36345
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 lov: use wait_event() in lov_subobject_kill() 43/36343/2
NeilBrown [Tue, 1 Oct 2019 18:12:23 +0000 (14:12 -0400)]
LU-10467 lov: use wait_event() in lov_subobject_kill()

lov_subobject_kill() has an open-coded version
of wait_event(). Change it to use the macro.

There is no need to take a spinlock just to check if a variable
has changed value. If there were, the first test would be protected too.

"lti_waiter" now has no users and can be removed from lov_thread_info.

Change-Id: Ic1126fc500c03c48c4426171e98590ef6dce3098
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/36343
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 ldlm: convert l_wait_event in __ldlm_namespace_free 89/35989/9
Mr NeilBrown [Wed, 28 Aug 2019 23:35:23 +0000 (09:35 +1000)]
LU-10467 ldlm: convert l_wait_event in __ldlm_namespace_free

The l_wait_event call in __ldlm_namespace_free() can do one
of two things depending on which LWI_* setup call is in effect.
If 'force', it ignores signals and times out after 1/4 second.
If '!force', it has no timeout but allows fatal signals.

So change it to two separate calls: wait_event_idle_timeout()
or l_wait_event_abortable().

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I1ac7ff5daa80581010cd913f01650c07ac40c151
Reviewed-on: https://review.whamcloud.com/35989
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 ldlm: fix style issues in ldlm_flock_completion_ast 83/35983/13
Mr NeilBrown [Thu, 29 Aug 2019 00:45:00 +0000 (10:45 +1000)]
LU-10467 ldlm: fix style issues in ldlm_flock_completion_ast

Prior to some code changes, fix up indenting and other
style issues (particularly multi-line comments) in
ldlm_flock.c.

In a few cases, parentheses have been added so that the re-indenting
done by emacs c-mode does "the right thing".

Test-Parameters: trivial
Change-Id: Ic628acaae875bea9759fbb669c154f046a75a9fa
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35983
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 ptlrpc: fix style issues in import.c 78/35978/9
Mr NeilBrown [Thu, 29 Aug 2019 00:37:41 +0000 (10:37 +1000)]
LU-10467 ptlrpc: fix style issues in import.c

Each of the functions changed here will have a
code change in the next patch, so fix up indenting
and a few other style issues first.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I21a52d7a8a510b9e12d7b822a0abf573247e1405
Reviewed-on: https://review.whamcloud.com/35978
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 llite: style fixes prior to code change. 74/35974/9
Mr NeilBrown [Thu, 29 Aug 2019 00:12:06 +0000 (10:12 +1000)]
LU-10467 llite: style fixes prior to code change.

Next patch will make some code changes to ll_put_super(),
so fix up indenting and fix a couple of checkpatch
warnings first.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I502c81f481c1046d0943f1407a910c1fceeb7ecc
Reviewed-on: https://review.whamcloud.com/35974
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 lustre: use wait_event_idle() where appropriate. 71/35971/10
Mr NeilBrown [Mon, 26 Aug 2019 05:34:17 +0000 (15:34 +1000)]
LU-10467 lustre: use wait_event_idle() where appropriate.

When l_wait_event() is passed an 'lwi' which is initialised
to all zeroes, it behaves exactly like wait_event_idle():
 - no timeout
 - not interrupted by any signal
 - doesn't add to load average.

So change all these instances to wait_event_idle(), or in two cases,
to wait_event_idle_exclusive().

There are three ways that lwi gets set to all zeros:
struct l_wait_info lwi = { 0 };
lwi = LWI_INTR(NULL, NULL);
memset(&lwi, 0, sizeof(lwi));
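
The equivalence of the three forms can be sketched in user space as
follows. The structure, its fields and the SKETCH_INTR() macro are
hypothetical stand-ins; in particular, the macro only assumes that
an initializer of this kind sets the callback fields and leaves the
rest zero.

```c
#include <string.h>
#include <stddef.h>

/* Hypothetical stand-in for a wait-info structure. */
struct sketch_wait_info {
	long timeout;
	void (*on_signal)(void *);
	void *cb_data;
};

/* Assumed shape of an initializer macro that sets only the callback
 * fields; all other fields are implicitly zero. */
#define SKETCH_INTR(cb, data) \
	((struct sketch_wait_info){ .on_signal = (cb), .cb_data = (data) })

/* Return 1 when neither a timeout nor a signal callback is set. */
int sketch_is_idle_wait(const struct sketch_wait_info *w)
{
	return w->timeout == 0 && w->on_signal == NULL &&
	       w->cb_data == NULL;
}
```

Any of "= { 0 }", SKETCH_INTR(NULL, NULL) or memset() to zero then
yields a structure for which sketch_is_idle_wait() returns 1.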

Change-Id: Ia6723cbe248ce067331a002e5e9d54796739c08a
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35971
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 lustre: don't use l_wait_event() for poll loops. 68/35968/6
Mr NeilBrown [Mon, 26 Aug 2019 04:42:17 +0000 (14:42 +1000)]
LU-10467 lustre: don't use l_wait_event() for poll loops.

When polling without any usable wait queue, it is clearest
to have an explicit poll loop.
So don't use l_wait_event() in these two cases, but
use a while loop with ssleep(1);

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: Ic6a203085699fb9802d32871479c822ebe3c2510
Reviewed-on: https://review.whamcloud.com/35968
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-10467 lustre: don't use l_wait_event() for simple sleep. 66/35966/6
Mr NeilBrown [Wed, 2 Oct 2019 02:19:01 +0000 (12:19 +1000)]
LU-10467 lustre: don't use l_wait_event() for simple sleep.

Passing '0' as the condition to l_wait_event() means that
it just waits for the given timeout.
This can be done more simply with ssleep(seconds) or in
one case, a schedule_timeout_killable() loop.

In most of these cases, l_wait_event() is configured to ignore signals,
so ssleep() - which also ignores signals - is appropriate.
In one case (lfsck_lib.c) l_wait_event() is configured to respond
to fatal signals, and as there is no ssleep_killable, we
need to open-code one.

ssleep() and schedule_timeout_killable() *will* add to the load
average, while l_wait_event() does not, so if these sleeps happen a
lot, it will add to the load average.  I don't think that will be a
problem for these sleeps.

So remove these l_wait_event() calls and associated variables,
and do it the simpler ways.

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I5a77e631c68f6dfb45fdd7ea01d60b13268240cc
Reviewed-on: https://review.whamcloud.com/35966
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12931 general: fix some cfs_time_seconds() inconsistencies. 68/36668/2
Mr NeilBrown [Mon, 4 Nov 2019 01:58:18 +0000 (12:58 +1100)]
LU-12931 general: fix some cfs_time_seconds() inconsistencies.

mgc_process_log:
  the value stored in 'secs' has units of 'jiffies', which is
  confusing.  So change the name to 'timeout'.
ptl_recover_import:
  the value stored in 'secs' has units of 'jiffies', which is
  confusing.  It is reported in a CDEBUG message as 'seconds'.
  So rename to 'timeout' and report 'obd_timeout', which is in
  seconds.
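
The naming issue can be sketched as below: once a value has been
converted from seconds to ticks, a name like 'secs' is misleading.
The tick rate and the helper name are assumptions made for the
example, not the real cfs_time_seconds() definition.

```c
/* Assumed tick rate; in the kernel HZ is a build-time setting. */
#define SKETCH_HZ 250L

/* Hypothetical analogue of a seconds-to-jiffies helper. */
long sketch_seconds_to_jiffies(long seconds)
{
	long timeout = seconds * SKETCH_HZ;	/* unit: jiffies, not seconds */

	return timeout;
}
```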

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I1c92c3ed45dc8a7ce9b82eb823e2db8779c881fa
Reviewed-on: https://review.whamcloud.com/36668
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-12856 target: check FLFLAGS are valid while accessing them 32/36632/2
Mikhail Pershin [Thu, 31 Oct 2019 20:44:38 +0000 (23:44 +0300)]
LU-12856 target: check FLFLAGS are valid while accessing them

Before checking the OBD_FL_SHORT_IO flag, first check that
OBD_MD_FLFLAGS is valid.
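
A minimal sketch of the check order, with placeholder bit values and
a hypothetical structure (the real constants and struct layout are
not reproduced here):

```c
#include <stdint.h>

/* Placeholder bit values for illustration, not the real constants. */
#define SKETCH_MD_FLFLAGS	(1ULL << 0)	/* flags field is valid */
#define SKETCH_FL_SHORT_IO	(1U << 1)	/* short I/O flag */

struct sketch_obdo {
	uint64_t valid;	/* which fields carry meaningful data */
	uint32_t flags;	/* meaningful only if SKETCH_MD_FLFLAGS is set */
};

/* Trust the short I/O flag only when the flags field is marked valid. */
int sketch_is_short_io(const struct sketch_obdo *oa)
{
	return (oa->valid & SKETCH_MD_FLFLAGS) &&
	       (oa->flags & SKETCH_FL_SHORT_IO);
}
```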

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I04ac61141d70883c29a113fac3985ac81cc878af
Reviewed-on: https://review.whamcloud.com/36632
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 days agoLU-13030 pcc: Init saved dataset flags properly 23/36923/3
Qian Yingjin [Wed, 4 Dec 2019 14:44:58 +0000 (22:44 +0800)]
LU-13030 pcc: Init saved dataset flags properly

When a new inode is initialized, the saved flags are wrongly set to
PCC_DATASET_NONE, which means that the file is known to be in NONE
of the PCC datasets.
This patch corrects this by using PCC_DATASET_INVALID.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Id775a20711cbc89979e81cbb2b0fe77dc5a850d5
Reviewed-on: https://review.whamcloud.com/36923
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
11 days agoLU-13030 pcc: auto attach not work after client cache clear 92/36892/13
Qian Yingjin [Thu, 28 Nov 2019 14:21:12 +0000 (22:21 +0800)]
LU-13030 pcc: auto attach not work after client cache clear

When the inode of an unused PCC cached file is evicted from icache
due to memory pressure or manual icache cleanup (i.e.
"echo 3 > /proc/sys/vm/drop_caches"), the file is also detached
from PCC, and all PCC state for this file is cleared.
In the current design, PCC only tries to auto-attach a file that
was once attached into PCC, according to the in-memory PCC state.
Thus later IO for the file is not directed to PCC and will trigger
a data restore.

If this is not the desired result for the user, then we need to
try to auto-attach files that were never attached into PCC, or were
once attached but then detached as a result of their inode being
shrunk from icache.

Although this increases the number of auto-attach candidates, only
files in HSM released state (which can be read directly from the
file layout) will be checked.

This bug is easily reproduced on rhel8. It seems that the command
"echo 3 > /proc/sys/vm/drop_caches" drops all unused inodes from
icache there, which is not the case on rhel7.

This patch also adds a check for the input parameter @rwid, which
should be non-zero and the same as the archive ID.

Test-Parameters: clientdistro=el8 testlist=sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ibb4c7c624de089766f4a56ef08ff0e2088d2e859
Reviewed-on: https://review.whamcloud.com/36892
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 days agoLU-13023 pcc: Incorrect size after re-attach 84/36884/7
Qian Yingjin [Wed, 27 Nov 2019 15:28:24 +0000 (23:28 +0800)]
LU-13023 pcc: Incorrect size after re-attach

The following test case results in an incorrect size for the PCC copy:
- Attach a file of size s1 (s1 > 0) into PCC;
- Detach this file with the --keep option, so the data is retained
  on PCC;
- Truncate this file locally or on a remote client to a new size
  s2 (s2 < s1);
- Re-attach the file. The size of the PCC copy is still s1.

To solve this problem, the PCC copy needs to be truncated to the
same size as the Lustre copy, which will be HSM released later,
after the data copy (archive) phase has finished.
This patch also adds handling for a pending signal when the
attach process is killed by an administrator.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I18f2c883454450bf5dc2f2b3600e2685d8f8f130
Reviewed-on: https://review.whamcloud.com/36884
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 weeks agoLU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event 99/36699/3
Wang Shilong [Thu, 7 Nov 2019 02:18:15 +0000 (10:18 +0800)]
LU-12946 kernel: fix to handle BLK_MQ_RQ_QUEUE_DEV_BUSY event

It appears that when dm_dispatch_clone_request dispatches the
"clone" I/O request from the multipath device to the underlying
(real) device, the scsi driver can (often under load) return
BLK_MQ_RQ_QUEUE_DEV_BUSY. dm_dispatch_clone_request does not treat
that as an exception the way it does BLK_MQ_RQ_QUEUE_BUSY, so it
calls dm_complete_request, which propagates the
BLK_MQ_RQ_QUEUE_DEV_BUSY error code up the stack. As a result,
multipath_end_io calls fail_path and fails the path because an
error value is set.
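
One plausible shape of such handling, assuming the device-busy
result should be treated as transient in the same way as the plain
busy result. The enums and the function are hypothetical and only
echo the names used in this message; this is not the kernel patch.

```c
/* Hypothetical dispatch results. */
enum sketch_rq_result {
	SKETCH_RQ_OK,
	SKETCH_RQ_BUSY,
	SKETCH_RQ_DEV_BUSY,
	SKETCH_RQ_ERROR,
};

/* Hypothetical follow-up actions. */
enum sketch_action {
	SKETCH_DONE,
	SKETCH_REQUEUE,
	SKETCH_FAIL_PATH,
};

enum sketch_action sketch_after_dispatch(enum sketch_rq_result r)
{
	switch (r) {
	case SKETCH_RQ_OK:
		return SKETCH_DONE;
	case SKETCH_RQ_BUSY:
	case SKETCH_RQ_DEV_BUSY:
		/* transient condition: retry instead of failing the path */
		return SKETCH_REQUEUE;
	default:
		return SKETCH_FAIL_PATH;
	}
}
```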

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If17ea5b3ab33a89a17d49e5dfb2e9f9f19371564
Reviewed-on: https://review.whamcloud.com/36699
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-12410 lnet: Add additional output to sanity-lnet.sh 42/36242/5 multi-rail
Chris Horn [Thu, 19 Sep 2019 19:01:05 +0000 (14:01 -0500)]
LU-12410 lnet: Add additional output to sanity-lnet.sh

Add wrappers around ip netns exec and lnetctl commands to generate
some additional test output. This makes it easier to see what each
test case is doing from the test script output, and aids in debugging
any problems.

Test-parameters: trivial
Test-parameters: testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I95b18cb3a090527548a8f9e65845eb4a18dea6d6
Reviewed-on: https://review.whamcloud.com/36242
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-12704 lov: check all entries in lov_flush_composite 68/36368/8
Vladimir Saveliev [Thu, 24 Oct 2019 09:17:09 +0000 (12:17 +0300)]
LU-12704 lov: check all entries in lov_flush_composite

Check all layout entries for a DOM layout and exit with
-ENODATA if none exists. The caller treats that as a valid
case caused by a layout change.
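
The scan described above might be sketched as follows, with a
hypothetical entry type that records only whether a component is a
DOM component. This is an illustration, not the
lov_flush_composite() code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical layout entry; only the property the check needs. */
struct sketch_entry {
	int is_dom;	/* non-zero for a DOM component */
};

/* Scan every entry; return the index of the first DOM entry, or
 * -ENODATA if the layout has none. */
int sketch_find_dom(const struct sketch_entry *e, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (e[i].is_dom)
			return (int)i;
	return -ENODATA;
}
```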

Define llo_flush methods for all layouts as required
by lov_dispatch().

The patch also cleans up the cl_dom_size field in cl_layout, which
was used by the previous ll_dom_lock_cancel() implementation.

Run lov_flush_composite under down_read lov->lo_type_guard to avoid
race with layout change.

Fixes: 707bab62f5 ("LU-12296 llite: improve ll_dom_lock_cancel")

Test-Parameters: testlist=racer
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I4e7b1b201bb1a669fe0d8f0f728467e579ef3512
Reviewed-on: https://review.whamcloud.com/36368
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks agoLU-12370 ptlrpc: grammar fix. 08/36508/4
Alexander Zarochentsev [Fri, 31 Oct 2014 18:48:45 +0000 (21:48 +0300)]
LU-12370 ptlrpc: grammar fix.

ptlrpc_invalidate_import() error message grammar fix.

Test-Parameters: trivial
Cray-bug-id: LUS-4015
Change-Id: Ic1a99440f381ed982e348267996e4523aef8ebad
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/36508
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Colin Faber <cfaber@cray.com>