Whamcloud - gitweb
fs/lustre-release.git
4 years ago LU-12651 osc: always call update_next_shrink 29/37429/4
Alexander Zarochentsev [Tue, 4 Feb 2020 17:47:06 +0000 (20:47 +0300)]
LU-12651 osc: always call update_next_shrink

Call update_next_shrink() for clients that do not support grant
shrinking and for clients with grant shrinking explicitly
disabled. Otherwise osc_grant_work_handler() reschedules itself
immediately after it completes, causing excessive CPU consumption.
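
The effect of the fix can be sketched as a small userspace model.
This is not the Lustre source: struct client_model,
grant_work_handler() and GRANT_SHRINK_INTERVAL are hypothetical
names, and only the idea of always advancing the next-shrink
deadline is taken from the description above.

```c
#include <stdbool.h>

/* Hypothetical interval between grant shrink attempts, in seconds. */
#define GRANT_SHRINK_INTERVAL 1200

/* Simplified stand-in for the per-client state (not the real struct). */
struct client_model {
	bool shrink_supported;	/* server supports grant shrinking */
	bool shrink_enabled;	/* shrinking has not been disabled */
	long next_shrink;	/* earliest time of the next attempt */
	int shrinks_done;	/* number of shrink attempts performed */
};

/* Advance the deadline; called on every path through the handler. */
static void update_next_shrink(struct client_model *cli, long now)
{
	cli->next_shrink = now + GRANT_SHRINK_INTERVAL;
}

/*
 * Model of one handler run. Returns the delay until the handler should
 * run again. If the deadline were not advanced for clients that skip
 * shrinking, the delay would stay at 0 and the handler would spin.
 */
static long grant_work_handler(struct client_model *cli, long now)
{
	if (now >= cli->next_shrink) {
		if (cli->shrink_supported && cli->shrink_enabled)
			cli->shrinks_done++;	/* a shrink RPC would go here */
		update_next_shrink(cli, now);	/* always, even when skipped */
	}
	return cli->next_shrink - now;
}
```

In this model a client without shrink support still gets a non-zero
delay back, so the handler is not rescheduled immediately.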

Fixes: 3e070e30a98d ("LU-8708 osc: enable/disable OSC grant shrink")

Cray-bug-id: LUS-8460
Change-Id: I507b3d10dd5374772456853098bc26053cbd140d
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-on: https://review.whamcloud.com/37429
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 lustre: avoid cast of file->private_data 51/36651/4
Mr NeilBrown [Sun, 3 Nov 2019 23:02:58 +0000 (10:02 +1100)]
LU-9679 lustre: avoid cast of file->private_data

Instead of
  foo = ((struct seq_file*)file->private_data)->private;
use
  struct seq_file *m = file->private_data;
  foo = m->private;

Many places in lustre use this second style already.
It is much less noisy and preferred for upstream Linux.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9a7adb102687496f43bab099b1ca584955f040c9
Reviewed-on: https://review.whamcloud.com/36651
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
4 years ago LU-12321 mdc: allow ELC for DOM file unlink 42/36442/10
Mikhail Pershin [Fri, 27 Sep 2019 18:29:00 +0000 (21:29 +0300)]
LU-12321 mdc: allow ELC for DOM file unlink

ELC skips the DOM bit to prevent a data flush when it is not
really needed. However, when lock bits are combined this slows
down unlink, because ELC is disabled for the whole lock if the
DOM bit is set.

This patch takes a simple approach: it checks whether the inode has
dirty pages and allows ELC for DOM unlink if there are none.

mdtest_easy_delete on DoM shows that unlink of zero-byte
files improves performance by 28%.

1 x AI400(4 x MDS/MDT) on 10 node challenges:
Without patch:
mdtest_easy_delete  96.564 kiops : time 649.36 seconds
With patch:
mdtest_easy_delete 123.630 kiops : time 454.82 seconds
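
The 28% figure follows from the two kiops values above. A small
check, with a hypothetical helper name:

```c
/* Relative improvement in percent when going from 'before' to 'after'. */
static double improvement_percent(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

improvement_percent(96.564, 123.630) evaluates to about 28.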

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ic5b2aed8c8c0884ee518a587a0c45ad54915f4fa
Reviewed-on: https://review.whamcloud.com/36442
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
4 years ago LU-11961 nodemap: nodemap_create() handles default nodemap 45/34245/8
Sebastien Buisson [Wed, 13 Feb 2019 15:41:47 +0000 (00:41 +0900)]
LU-11961 nodemap: nodemap_create() handles default nodemap

nodemap_create() is responsible for assigning nmc_default_nodemap
so it should not be done outside of this function.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8d0615196e32fb8e6c59ddedd421323a7d6eff7f
Reviewed-on: https://review.whamcloud.com/34245
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13235 lnet: copy the correct amount of CPTs to lnet_cpts 36/36636/4
Mr NeilBrown [Tue, 4 Feb 2020 15:52:22 +0000 (10:52 -0500)]
LU-13235 lnet: copy the correct amount of CPTs to lnet_cpts

A previous patch fixed one of three memcpy() calls in
lnet_net_append_cpts() to copy the correct number of bytes.
This patch fixes the other two.
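
The class of bug is passing an element count to memcpy() where a
byte count is needed. A minimal sketch follows, assuming 32-bit CPT
entries; append_cpts() is a hypothetical helper and not the code of
lnet_net_append_cpts().

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Append 'ncpts' entries from 'cpts' to the array '*dst' that currently
 * holds '*ndst' entries. Returns 0 on success, -1 on allocation failure.
 * Hypothetical helper illustrating the byte-count rule.
 */
static int append_cpts(uint32_t **dst, size_t *ndst,
		       const uint32_t *cpts, size_t ncpts)
{
	uint32_t *added;

	if (ncpts == 0)
		return 0;	/* nothing to append */

	added = malloc((*ndst + ncpts) * sizeof(*added));
	if (added == NULL)
		return -1;

	/* memcpy() takes a size in bytes, so scale the element counts.
	 * Passing the bare count would copy only a quarter of the data. */
	if (*ndst > 0)
		memcpy(added, *dst, *ndst * sizeof(*added));
	memcpy(added + *ndst, cpts, ncpts * sizeof(*added));

	free(*dst);
	*dst = added;
	*ndst += ncpts;
	return 0;
}
```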

Test-Parameters: trivial testlist=sanity-lnet

Fixes: 8cbb8cd3e771 ("LU-7734 lnet: Multi-Rail local NI split")

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I5a3450b0043c60b6c432db5be47f1e27ecc1fc94
Reviewed-on: https://review.whamcloud.com/36636
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years ago LU-13210 lnet: gcc8 add implicit-fallthrough decorator 66/37466/3
Shaun Tancheff [Thu, 6 Feb 2020 22:44:13 +0000 (16:44 -0600)]
LU-13210 lnet: gcc8 add implicit-fallthrough decorator

With newer compilers and newer kernels -Werror=implicit-fallthrough
is enabled.

This adds the missing decorator.
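
The message does not say which form of decorator the patch uses.
As an illustration only, one common form is the statement attribute
shown below. demo_fallthrough and cleanup_steps() are hypothetical
names.

```c
/* Use the statement attribute where the compiler provides it. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define demo_fallthrough __attribute__((fallthrough))
# endif
#endif
#ifndef demo_fallthrough
# define demo_fallthrough do {} while (0)
#endif

/* Count how many cleanup steps run when starting from 'stage'. */
static int cleanup_steps(int stage)
{
	int steps = 0;

	switch (stage) {
	case 3:
		steps++;
		demo_fallthrough;	/* intentional: continue into case 2 */
	case 2:
		steps++;
		demo_fallthrough;	/* intentional: continue into case 1 */
	case 1:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

Without the annotation, each case that runs into the next one would
trigger the warning, which -Werror turns into a build failure.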

Test-Parameters: trivial
Cray-bug-id: LUS-8476
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I47334d5a8d0bcf17489c1b15af29cd553fa01a09
Reviewed-on: https://review.whamcloud.com/37466
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago New tag 2.13.52 2.13.52 v2_13_52
Oleg Drokin [Wed, 12 Feb 2020 06:18:35 +0000 (01:18 -0500)]
New tag 2.13.52

Change-Id: Iafa9279dd716bac93851412e64ef7b7e85945353
Signed-off-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12988 ldiskfs: mballoc to prefetch groups 93/36893/15
Alex Zhuravlev [Mon, 2 Dec 2019 08:23:30 +0000 (11:23 +0300)]
LU-12988 ldiskfs: mballoc to prefetch groups

ahead of scanning. prefetching is done in 8 * flex_bg groups, so
it should be 8 read-ahead reads for a single allocating thread.
at the end of allocation the allocating thread waits for read-ahead
completion and initializes buddy information so that read-aheads
are not lost in case of memory pressure.
at cr=0 the number of prefetching IOs is limited per allocation
context to prevent a situation when mballoc loads thousands of
bitmaps looking for a perfect group and ignoring groups with
good chunks.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If86e3aff75379e064f70c0a66e2d65bdc5593651
Reviewed-on: https://review.whamcloud.com/36893
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13180 lustre: reserve bit for RDMA-only memory RPC 83/37383/3
Wang Shilong [Fri, 31 Jan 2020 07:21:30 +0000 (15:21 +0800)]
LU-13180 lustre: reserve bit for RDMA-only memory RPC

This is reserved for RDMA-only memory integrated with Lustre.
The purpose of this bit is to:

1) disable short IO if memory is not directly addressable by CPU.
2) prevent CPU memory pages and RDMA memory pages merging into one RPC.

Test-Parameters: trivial
Change-Id: I148b269c5e7d7c52e760b20a6482c259407e0898
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37383
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
4 years ago LU-13134 obdclass: use slab allocation for cl_dio_aio 27/37227/6
Wang Shilong [Tue, 14 Jan 2020 15:00:03 +0000 (23:00 +0800)]
LU-13134 obdclass: use slab allocation for cl_dio_aio

cl_dio_aio is used frequently for dio/aio, try to use
a private slab pool for it.

This could help improve aio performance.

Change-Id: Ic06523ae59eed04e55c17ac03af9187af8f695c5
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37227
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-4198 clio: AIO support for direct IO 16/32416/28
Jinshan Xiong [Mon, 29 Apr 2019 08:30:05 +0000 (16:30 +0800)]
LU-4198 clio: AIO support for direct IO

This patch adds AIO support to Lustre. AIO issues IO like DIO,
but does not wait for the IO to finish before returning. Instead
it returns -EIOCBQUEUED to the VFS to indicate that the IO has
been issued, and aio_complete() is called from the callback once
the IO is done.

  fio AIO/DIO bandwidth results:
  # numjob=4, bs=512k

  MB/s      write       read
  master      832       1806
  patched    6591      11800

  fio AIO/DIO IOPS results:
  # 32 clients, 8192 threads
  # ioengine=libaio rw=randread blocksize=4096 iodepth=128 direct=1
  # size=1g runtime=300 group_reporting numjobs=256 create_serialize=0

  IOPS      write       read
  master      99K      1239K
  patched    265K      3498K

Test-Parameters: testgroup=review-ldiskfs-arm
Signed-off-by: Jinshan Xiong <jinshan.xiong@uber.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: If2ac9283612514e10fe342fc43e95b4081347168
Reviewed-on: https://review.whamcloud.com/32416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-4198 clio: turn on lockless for some kind of IO 01/8201/46
Jinshan Xiong [Thu, 9 Mar 2017 19:30:00 +0000 (11:30 -0800)]
LU-4198 clio: turn on lockless for some kind of IO

We can safely turn on lockless mode for Direct IO
and take no lock.

Direct IO will still enqueue a lock on the server side,
but we cannot use lockless mode in the following cases:

1) If a group lock is held before DIO, using lockless mode would
deadlock, so we use the group lock instead and trust
it to protect consistency.

2) Direct IO might fall back to buffered IO in some cases,
and we then restart Direct IO holding a normal lock.

The main motivation for this patch is to support AIO.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ia004d6b39272df8159c9df3cc76662e198230b55
Reviewed-on: https://review.whamcloud.com/8201
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13145 lnet: use conservative health timeouts 30/37430/2
Andreas Dilger [Fri, 31 Jan 2020 20:00:00 +0000 (13:00 -0700)]
LU-13145 lnet: use conservative health timeouts

Use more conservative lnet_transaction_timeout and lnet_retry_count
values by default.  Currently with timeout=10 and retry=3 there is
only a 3s window for the RPC to be sent before it is timed out.
This has caused fault injection rather than fault tolerance.
Increase the default timeout to 50s with retry=2, which is hopefully
long enough to cover virtually all uses, but still allows LNet Health
to be enabled by default and resend before Lustre times out itself.
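
The 3s figure can be reproduced under the assumption that the
per-attempt window is the transaction timeout divided by the retry
count. That formula is inferred from the numbers above, and the
actual LNet computation may differ. per_attempt_window() is a
hypothetical helper.

```c
/* Per-attempt send window in seconds, under the stated assumption. */
static unsigned int per_attempt_window(unsigned int transaction_timeout,
				       unsigned int retry_count)
{
	if (retry_count == 0)
		return transaction_timeout;	/* no retries: whole timeout */
	return transaction_timeout / retry_count;
}
```

Under this assumption the old defaults (10, 3) give 3 seconds per
attempt and the new defaults (50, 2) give 25 seconds.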

Fixes: 8632e94aeb7e ("LU-11816 lnet: setup health timeout defaults")

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6bfc4d61cebab38c1554e1b42834b1f38fc34ba8
Reviewed-on: https://review.whamcloud.com/37430
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12593 osd: up i_append_sem during errors 06/37406/3
Alexander Boyko [Mon, 3 Feb 2020 09:24:40 +0000 (04:24 -0500)]
LU-12593 osd: up i_append_sem during errors

There is a potential leak of i_append_sem during errors for
buffer head read and ldiskfs_journal_get_write_access() at
osd_ldiskfs_write_record().
The patch adds up(i_append_sem) for error paths.
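
The pattern can be sketched in userspace C with a single exit label
that releases the semaphore. This is a simplified model, not the
code of osd_ldiskfs_write_record(): struct demo_sem, demo_down(),
demo_up() and write_record() are hypothetical stand-ins.

```c
#include <errno.h>

/* Minimal stand-in for a kernel semaphore; counts available slots. */
struct demo_sem {
	int count;
};

static void demo_down(struct demo_sem *sem) { sem->count--; }
static void demo_up(struct demo_sem *sem)   { sem->count++; }

/*
 * Hypothetical write path: 'read_rc' and 'journal_rc' simulate the
 * results of the buffer head read and the journal write-access call.
 * Every exit taken after demo_down() passes through demo_up().
 */
static int write_record(struct demo_sem *append_sem, int read_rc,
			int journal_rc)
{
	int rc;

	demo_down(append_sem);

	rc = read_rc;		/* e.g. -EIO if the block cannot be read */
	if (rc != 0)
		goto out_up;

	rc = journal_rc;	/* e.g. -EROFS if the journal is read-only */
	if (rc != 0)
		goto out_up;

	/* ... the copy into the buffer would happen here ... */
out_up:
	demo_up(append_sem);	/* released on success and on every error */
	return rc;
}
```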

Fixes: f832a7dc33c6 ("LU-12593 osd: zeroing a freshly allocated block buffer")
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: I245d0c45af03519c66b75731e5d57f42de41fe95
Reviewed-on: https://review.whamcloud.com/37406
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13191 osp: handle -EROFS in osp_sync_interpret() 04/37404/2
Lai Siyao [Sat, 25 Jan 2020 21:23:28 +0000 (05:23 +0800)]
LU-13191 osp: handle -EROFS in osp_sync_interpret()

Upon OST disk failure, osp_sync_interpret() may get -EROFS,
which is a valid errno.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I5c3cff3019aa47c6d5803f0f0b373bc704f18118
Reviewed-on: https://review.whamcloud.com/37404
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13163 mdc: new kernel function xa_is_value() 99/37399/3
Lai Siyao [Sat, 25 Jan 2020 00:30:44 +0000 (08:30 +0800)]
LU-13163 mdc: new kernel function xa_is_value()

xa_is_value() is added in kernel 4.19-rc6 to replace
radix_tree_entry_exceptional().

Test-Parameters: trivial clientdistro=el8.1 envdefinitions=ONLY=65i testlist=sanity,sanity,sanity,sanity,sanity
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If89aa19c37af8a67debe782d1c77f4ef4dc6f923
Reviewed-on: https://review.whamcloud.com/37399
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-8304 libcfs: convert debug_ctlwq to a completion. 98/37398/3
NeilBrown [Sun, 2 Feb 2020 02:15:17 +0000 (21:15 -0500)]
LU-8304 libcfs: convert debug_ctlwq to a completion.

kthread_run might sleep during an allocation, and so
it's considered unsafe to call with a state that's not
RUNNABLE.
Rather than move the state setting to after kthread_run, which
introduces a small race, replace the waitqueue with a completion.
This has clean semantics which perfectly match the need here.

Change-Id: Ic3bcf21dc747d73ce482e2d50bffd6c43fc04fbc
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/37398
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-13183 ldiskfs: Drop remove truncate warning patch 89/37389/3
Shaun Tancheff [Fri, 31 Jan 2020 18:28:28 +0000 (12:28 -0600)]
LU-13183 ldiskfs: Drop remove truncate warning patch

Drop the ext4-remove-truncate-warning.patch as it was
removed as part of
    f64e9f19f68e ("LU-12977 ldiskfs: properly take inode_lock ...")
and is not needed.

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I78667ba380e9e78d4972377e59fa56bc27f15bb5
Reviewed-on: https://review.whamcloud.com/37389
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11300 lnet: remove lnd_query interface. 37/37337/4
Mr NeilBrown [Tue, 28 Jan 2020 00:31:31 +0000 (11:31 +1100)]
LU-11300 lnet: remove lnd_query interface.

The ->lnd_query interface is completely unused, and has been since
commit 8e498d3f23ea ("LU-11300 lnet: peer aliveness")

So remove all mention of it.

Fixes: 8e498d3f23ea ("LU-11300 lnet: peer aliveness")
Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iff11652283b371519cf31bf66b9ba08e024d3193
Reviewed-on: https://review.whamcloud.com/37337
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12988 ldiskfs: skip non-loaded groups at cr=0/1 91/36891/7
Alex Zhuravlev [Thu, 28 Nov 2019 12:04:25 +0000 (15:04 +0300)]
LU-12988 ldiskfs: skip non-loaded groups at cr=0/1

cr=0 is supposed to be an optimization to save CPU cycles,
but if buddy data (in memory) is not initialized then all
this makes no sense as we have to do sync IO taking a lot
of cycles.  also, at cr=0 mballoc doesn't store any available
chunk. cr=1 also skips groups using a heuristic based on avg.
fragment size.
it's more useful to skip such groups and switch to cr=2 where
groups will be scanned for available chunks.

using sparse image and dm-slow virtual device of 120TB was
simulated. then the image was formatted as OST and filled
using debugfs to mark ~85% of available space as busy.
mount as OST w/o the patch couldn't complete in half an hour
(according to vmstat it would take ~10-11 hours). with the
patch applied mount took ~20 seconds.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I88c8c1b01b386af0fa438bfeb97acb6110bd00ec
Reviewed-on: https://review.whamcloud.com/36891
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
4 years ago LU-13165 mdt: MSG_RESENT can be improperly cleared. 96/37296/2
Andriy Skulysh [Wed, 9 Oct 2019 19:53:14 +0000 (22:53 +0300)]
LU-13165 mdt: MSG_RESENT can be improperly cleared.

req_can_reconstruct() can return -EPROTO, which means that the
original request was processed and the reply was received.

Change-Id: I06ba9aa24821f414777d38e9ca606652b172e92c
Fixes: 23773b32bf ("LU-11444 ptlrpc: resend may corrupt the data")
Cray-bug-id: LUS-7972
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-on: https://review.whamcloud.com/37296
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-12542 handle: discard h_lock. 63/35863/7
NeilBrown [Fri, 13 Dec 2019 15:48:18 +0000 (10:48 -0500)]
LU-12542 handle: discard h_lock.

The h_lock spinlock is now only taken while bucket->lock
is held.  As a handle is associated with precisely one bucket,
this means that h_lock can never be contended, so it isn't needed.

So discard h_lock.

Also discard an increasingly irrelevant comment in the declaration
of struct portals_handle.

Change-Id: Ib5231fb43d1bf5031d5c2426c4e1d1865544bcf5
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35863
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-11607 tests: replace version/fstype calls in sanity/n 19/35719/8
James Nunez [Wed, 7 Aug 2019 19:27:13 +0000 (13:27 -0600)]
LU-11607 tests: replace version/fstype calls in sanity/n

The routine get_lustre_env() is available to all Lustre
test suites and sets an environment variable for the file
system type for MDS1 and OST1 and sets a variable for the
Lustre version of servers.

Replace the calls to facet_fstype() and lustre_version_code()
for all server types defined in get_lustre_env().  While
doing this, replace SINGLEMDS with mds1 in these calls.

Clean up around any modifications with
- converting spaces to tabs
- removing calls to return after skip() or skip_env()

Test-Parameters: trivial testlist=sanityn
Test-Parameters: fstype=zfs testlist=sanityn,sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ibc66220ae3b57cf22395d13f5d35feceeb61adfe
Reviewed-on: https://review.whamcloud.com/35719
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
4 years ago LU-10447 tests: deprecate use of $SETSTRIPE/$GETSTRIPE 25/33925/3
James Nunez [Thu, 27 Dec 2018 16:50:48 +0000 (09:50 -0700)]
LU-10447 tests: deprecate use of $SETSTRIPE/$GETSTRIPE

$SETSTRIPE and $GETSTRIPE were needed when we used the
standalone 'lstripe' utility. 'lstripe' hasn't been used
for years and we need to clean up all remnants of it.

Remove the definition and replace all instances of
$SETSTRIPE with '$LFS setstripe' and $GETSTRIPE with
'$LFS getstripe' in test-framework library.

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ibd78b2d75b0b8fc7ff686c1b0a73ce51fe9452e2
Reviewed-on: https://review.whamcloud.com/33925
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 years ago LU-9679 lustre: use LIST_HEAD() for local lists. 55/36955/4
Mr NeilBrown [Thu, 5 Dec 2019 06:09:19 +0000 (17:09 +1100)]
LU-9679 lustre: use LIST_HEAD() for local lists.

When declaring a local list head, instead of

   struct list_head list;
   INIT_LIST_HEAD(&list);

use
   LIST_HEAD(list);

which does both steps.
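
The equivalence can be checked outside the kernel with userspace
copies of the list macros. The definitions below mirror the ones in
the kernel's list headers; both_styles_empty() is a hypothetical
helper added for the comparison.

```c
/* Userspace copies of the kernel list macros, for illustration only. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

static void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Both styles leave the head pointing at itself, i.e. an empty list. */
static int both_styles_empty(void)
{
	struct list_head old_style;
	LIST_HEAD(new_style);

	INIT_LIST_HEAD(&old_style);
	return list_empty(&old_style) && list_empty(&new_style);
}
```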

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I67bda77c04479e9b2b8c84f02bfb86d9c2ef5671
Reviewed-on: https://review.whamcloud.com/36955
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 lnet: use LIST_HEAD() for local lists. 54/36954/4
Mr NeilBrown [Thu, 5 Dec 2019 05:56:16 +0000 (16:56 +1100)]
LU-9679 lnet: use LIST_HEAD() for local lists.

When declaring a local list head, instead of

   struct list_head list;
   INIT_LIST_HEAD(&list);

use
   LIST_HEAD(list);

which does both steps.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia1f1f1abf1b8a9f50e3033976990010b1d2100db
Reviewed-on: https://review.whamcloud.com/36954
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 lnet: discard lnet_print_text_bufs() 59/36659/2
Mr NeilBrown [Mon, 4 Nov 2019 04:20:58 +0000 (15:20 +1100)]
LU-9679 lnet: discard lnet_print_text_bufs()

lnet_print_text_bufs() is unused and has
never been used since it was introduced in
Commit ed88907a96ba ("Landing b_hd_newconfig on HEAD")
So let's remove it.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ic412cdef4981a043e94060e5de5646b836bb0e36
Reviewed-on: https://review.whamcloud.com/36659
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-9679 general: add missing spaces to folded strings. 53/36653/5
Mr NeilBrown [Sun, 3 Nov 2019 23:40:56 +0000 (10:40 +1100)]
LU-9679 general: add missing spaces to folded strings.

Many places in lustre fold a long string onto multiple lines,
usually at word breaks.  Sometimes the space between those words
got lost.
In a couple of places, a newline (\n) rather than a space was lost.

This patch adds those spaces (and newlines) back in.

Where a space was added, the whole string is joined onto a
single line as this is current policy - encouraged by checkpatch.
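
The problem comes from C joining adjacent string literals with
nothing in between. A short illustration follows; the message text
and both function names are made up for this example.

```c
/* Adjacent string literals are concatenated with nothing in between. */
static const char *folded_wrong(void)
{
	return "cannot allocate memory for"
	       "request buffer";	/* yields "...forrequest buffer" */
}

/* Policy described above: keep the whole message on one line. */
static const char *joined_right(void)
{
	return "cannot allocate memory for request buffer";
}
```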

In a couple of places neighbouring strings are also joined
into a single line, and some code has been re-indented to use
TABs.

Where the missing space was in a .diff file, the string hasn't
been joined into a line, as it doesn't seem worth the churn.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I6882bb957df566da0794f4ee85133dbf8c3debc1
Reviewed-on: https://review.whamcloud.com/36653
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
4 years ago LU-10467 ptlrpc: convert use of l_wait_event_exclusive_head() 86/35986/11
Mr NeilBrown [Sat, 18 Jan 2020 14:46:59 +0000 (09:46 -0500)]
LU-10467 ptlrpc: convert use of l_wait_event_exclusive_head()

Only one place uses l_wait_event_exclusive_head().
It uses an on_timeout function that returns non-zero, so
the wait aborts after timeout.

Change this to wait_event_idle_exclusive_lifo_timeout(),
and if it times out, perform the same action as the
on_timeout handler - a simple assignment.

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I11bee6aa1eceb6564fb72e41528f2f6a80b0d207
Reviewed-on: https://review.whamcloud.com/35986
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 ldlm: convert waiting in ldlm_completion_ast() 85/35985/14
Mr NeilBrown [Sat, 18 Jan 2020 14:17:51 +0000 (09:17 -0500)]
LU-10467 ldlm: convert waiting in ldlm_completion_ast()

ldlm_completion_ast() calls l_wait_event() in two slightly different
ways depending on whether a timeout is defined.

As a non-NULL _on_signal handler is passed, the non-timed-out portion
of the wait allows signals (abortable).  As the on_timeout handler
returns zero, the timed-out portion of the wait is always followed by a
non-timed-out portion.

So if no timeout is defined, we can simply wait with
l_wait_event_abortable().

If there is a timeout, we first wait with wait_event_idle_timeout()
and if that times out, we call ldlm_expired_completion_wait(), then
wait with l_wait_event_abortable().

Change-Id: I6874010085864764f2fc0e294dc0c67152cb2ad2
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35985
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 ldlm: convert waiting in ldlm_flock_completion_ast() 84/35984/11
Mr NeilBrown [Sat, 18 Jan 2020 14:19:29 +0000 (09:19 -0500)]
LU-10467 ldlm: convert waiting in ldlm_flock_completion_ast()

The l_wait_event() call in ldlm_flock_completion_ast() sets no
timeout, and so always enables fatal signals.  So it can be converted
to l_wait_event_abortable().

It is passed an on_signal handler, so that needs to be called if
l_wait_event_abortable() returns a negative result.  As this is the
only place the handler is called, it can be inlined.  We already have an
'if' which captures the 'wait was interrupted' condition, so place the
signal handler code in there.

This makes struct ldlm_flock_wait_data redundant.  In fact the
fwd_generation field in there was already unused.

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I9cc3a3e8b593a66f46183584382dc13169ff9adf
Reviewed-on: https://review.whamcloud.com/35984
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 ptlrpc: convert waiters on set->set_waitq 82/35982/13
Mr NeilBrown [Fri, 24 Jan 2020 14:45:44 +0000 (09:45 -0500)]
LU-10467 ptlrpc: convert waiters on set->set_waitq

There are a couple of interesting aspects of waiters on ->set_waitq.

One is the only usage of LWI_TIMEOUT_INTR_ALL().  This causes
l_wait_event() to enable "fatal" signals during the timeout
part of the wait. (normally signals are completely blocked when
there is a timeout).
This can be converted to l_wait_event_abortable_timeout().

Another is that ptlrpc_expired_set() is passed as the on_timeout
handler.  As this always returns true, it causes l_wait_event()
to quit after the timeout, and not go "back to sleep".
We can instead call this explicitly after the wait_event_timeout
returns 0 - which means that it timed out.
Due to this change in call pattern, we can change the function to
take a ptlrpc_request_set* instead of a void*, and to not return
anything.

Also, ptlrpc_interrupted_set() is sometimes passed as the on_signal
function.  Instead we can explicitly call this when we get a negative
return from wait_event_abortable.  Again, we can declare it as
taking the real type and not a void*.

The wait on set_waitq in ptlrpcd() might be a timedout wait or,
if timeout == 0, it is an indefinite wait.  We make that explicit
with 2 separate cases.

So this patch:
  - changes to wait_event_idle_timeout and
    l_wait_event_abortable_timeout,
  - calls ptlrpc_*_set explicitly based on return code
  - changes signatures for ptlrpc_*_set()

Change-Id: Ieb97aa3ba9b1f988a30bb7a424588f87f75e8023
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35982
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 lustre: convert users of back_to_sleep() 80/35980/11
Mr NeilBrown [Sat, 18 Jan 2020 14:39:47 +0000 (09:39 -0500)]
LU-10467 lustre: convert users of back_to_sleep()

When back_to_sleep() is passed to l_wait_event as
the on_timeout handler, the effect is to potentially wait twice.
The first wait ignores all signals and has a timeout.
If the timeout fires without the event occurring, the l_wait_event()
goes "back to sleep" indefinitely, but this time with fatal
signals unblocked.

This pattern can be made more clear with two separate wait calls:
  wait_event_idle_timeout() followed by l_wait_event_abortable().
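
The control flow of the two separate waits can be modelled in
userspace as follows. The demo_* functions are hypothetical
stand-ins for the kernel wait primitives, driven by a scripted
outcome, and -EINTR is used here in place of the kernel-internal
error code returned on interruption.

```c
#include <errno.h>
#include <stdbool.h>

/* Scripted outcome of a simulated wait (all fields hypothetical). */
struct demo_wait {
	bool event_in_phase1;	/* event arrives before the timeout */
	bool signal_in_phase2;	/* fatal signal arrives before the event */
	int phases_run;		/* how many waits were performed */
};

/* Stand-in for wait_event_idle_timeout(): 0 means "timed out". */
static long demo_idle_timeout(struct demo_wait *w)
{
	w->phases_run++;
	return w->event_in_phase1 ? 1 : 0;
}

/* Stand-in for l_wait_event_abortable(): negative means interrupted. */
static int demo_abortable(struct demo_wait *w)
{
	w->phases_run++;
	return w->signal_in_phase2 ? -EINTR : 0;
}

/* First wait with a timeout and no signals, then wait abortably. */
static int two_phase_wait(struct demo_wait *w)
{
	if (demo_idle_timeout(w) != 0)
		return 0;		/* event occurred within the timeout */
	return demo_abortable(w);	/* "back to sleep", now interruptible */
}
```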

Change-Id: I3536e33b4d982f37c960f31df1ea0d9808f9ced7
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35980
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years ago LU-10467 lustre: convert most users of LWI_TIMEOUT_INTERVAL() 73/35973/16
Mr NeilBrown [Sat, 18 Jan 2020 14:38:03 +0000 (09:38 -0500)]
LU-10467 lustre: convert most users of LWI_TIMEOUT_INTERVAL()

When l_wait_event() is called with an lwi initialised with
LWI_TIMEOUT_INTERVAL(t1, t2, NULL, NULL),
it waits for a total of t1 jiffies, but wakes up every t2 jiffies
to check the condition - in case the condition changed without
triggering a wakeup.
In (nearly) every case, t2 is one second.
So this is effectively a poll loop around wait_event_timeout.
So replace this with

 seconds = t1;
 while (seconds > 0 &&
        wait_event_timeout(q, cond, cfs_time_seconds(1)) == 0)
     seconds -= 1;

Then if seconds is zero at the end, the whole loop timed out.

In the one exception ("nearly" above), if t1 is small, t2 is set to one
jiffy, so we always wait a little bit and check the condition.  For
that case, we count to "seconds >= 0" and adjust the timeout
accordingly when seconds == 0.
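
The common-case loop above can be exercised outside the kernel. The
following is only a userspace sketch: struct fake_event,
fake_wait_one_second() and poll_wait() are hypothetical names, time is
modelled as a counter of one-second slices, and the stand-in merely
imitates the "0 means the slice expired with the condition still
false" convention of wait_event_timeout().

```c
/* Hypothetical model of an event that becomes true after some slices. */
struct fake_event {
	int now;	/* one-second slices elapsed so far */
	int fires_at;	/* slice at which the condition becomes true */
};

/*
 * Stand-in for wait_event_timeout(q, cond, cfs_time_seconds(1)):
 * returns 0 if the slice expired and the condition is still false,
 * non-zero if the condition is (or became) true.
 */
static int fake_wait_one_second(struct fake_event *ev)
{
	if (ev->now >= ev->fires_at)
		return 1;		/* already true, no sleep needed */
	ev->now++;			/* "sleep" for one slice */
	return ev->now >= ev->fires_at;
}

/* Returns the seconds left; 0 means the whole loop timed out. */
static int poll_wait(struct fake_event *ev, int seconds)
{
	while (seconds > 0 && fake_wait_one_second(ev) == 0)
		seconds -= 1;
	return seconds;
}
```

In this model a caller only needs to compare the result with zero to
tell a timeout from a successful wait.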

Note that in one case, the on_timeout function is
target_bulk_timeout() instead of NULL.  As this always returns '1', it
behaves exactly like passing NULL.

Change-Id: I4cddbd2c28f07012cce7915489eedcb668c7e808
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35973
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12287 lnet: handling device failure by IB event handler 37/35037/7
Tatsushi Takamura [Mon, 3 Jun 2019 01:11:24 +0000 (10:11 +0900)]
LU-12287 lnet: handling device failure by IB event handler

The following IB events cannot be handled by the QP event handler:
- IB_EVENT_DEVICE_FATAL
- IB_EVENT_PORT_ERR
- IB_EVENT_PORT_ACTIVE

The IB event handler handles device errors such as hardware errors
and link down.

Test-Parameters: trivial
Signed-off-by: Tatsushi Takamura <takamr.tatsushi@jp.fujitsu.com>
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9869fb1cd1172040e0dd34828318017a0f30df81
Reviewed-on: https://review.whamcloud.com/35037
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-10198 llog: keep llog handle alive until last reference 67/37367/2
Mikhail Pershin [Wed, 29 Jan 2020 21:22:07 +0000 (00:22 +0300)]
LU-10198 llog: keep llog handle alive until last reference

The llog handle keeps the related dt_object pinned until the
llog_close() call, but meanwhile the llog handle can still have
other users which took it via llog_cat_id2handle().

The patch changes llog_handle_put() to call lop_close() upon the last
reference drop, so llog_osd_close() will put the dt_object only
when the llog_handle has no more references.
llog_handle_get() checks and reports if the llog_handle has
zero references.
The patch also modifies the checks for destroyed llogs: the llog handle
has a new lgh_destroyed flag which is set when the llog is destroyed,
and llog_osd_exist() checks both dt_object_exist() and the
lgh_destroyed flag, so destroyed llogs are considered non-existent too.
Previously it used the lu_object_is_dying() check, which is not
reliable because it means only that the object is not to be kept in
cache.
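
The close-on-last-put idea can be sketched in userspace as follows.
This is not the llog code: struct handle, handle_get(), handle_put()
and handle_exists() are hypothetical names, the counter is not atomic,
and refusing a get on a zero count is an assumption about how the
"checks and reports" step might behave.

```c
#include <stdbool.h>

/* Hypothetical handle: the object stays pinned until the last put. */
struct handle {
	int refcount;
	bool closed;		/* set when the backing object is released */
	bool destroyed;		/* models the new "destroyed" flag */
};

/* Take a reference; refuse (and let the caller report) a dead handle. */
static bool handle_get(struct handle *h)
{
	if (h->refcount == 0)
		return false;
	h->refcount++;
	return true;
}

/* Drop a reference; only the last drop releases the backing object. */
static void handle_put(struct handle *h)
{
	if (--h->refcount == 0)
		h->closed = true;
}

/* A destroyed log counts as non-existent even if the object remains. */
static bool handle_exists(const struct handle *h, bool object_exists)
{
	return object_exists && !h->destroyed;
}
```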

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: If7df41646c243c0d40b20a30a33e86c688d24508
Reviewed-on: https://review.whamcloud.com/37367
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13128 osc: glimpse and lock cancel race 15/37215/5
Alexander Zarochentsev [Thu, 9 Jan 2020 17:45:56 +0000 (20:45 +0300)]
LU-13128 osc: glimpse and lock cancel race

osc_dlm_blocking_ast0 clears l_ast_data before writing
file data to OST and opens a race window. Neither a glimpse
AST nor ldlm_cb_interpret can find correct file attributes at
that moment.

Cray-bug-id: LUS-8344
Signed-off-by: Alexander Zarochentsev <c17826@cray.com>
Change-Id: Iadac4f7da94b71639430c9a7cdd77d55e7ba2849
Reviewed-on: https://review.whamcloud.com/37215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12852 pfl: restrict the stripe count correctly 47/36947/5
Emoly Liu [Fri, 6 Dec 2019 02:08:07 +0000 (10:08 +0800)]
LU-12852 pfl: restrict the stripe count correctly

In function lod_get_stripe_count(), when restricting the stripe
count to the maximum xattr size, the xattr overhead should be
taken into account correctly.

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Ief548e47ce4d375f2e189860ccfe05d0f3c7e890
Reviewed-on: https://review.whamcloud.com/36947
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11939 tgt: Do not assert during grant cleanup 15/34215/7
Patrick Farrell [Fri, 8 Feb 2019 17:14:06 +0000 (12:14 -0500)]
LU-11939 tgt: Do not assert during grant cleanup

Client/server grant inconsistencies discovered during
cleanup are indicative of a bug, but any problems they
would cause have already occurred at this point.

So do not assert during this cleanup.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ic9b827b1005bc321a290505a368349699ddf2f38
Reviewed-on: https://review.whamcloud.com/34215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13194 tests: check server version sanityn 104 61/37461/2
James Nunez [Tue, 4 Feb 2020 04:15:10 +0000 (21:15 -0700)]
LU-13194 tests: check server version sanityn 104

Check the server version before running sanityn test 104.
If the server version is less than 2.12.4, skip the test.

Fixes: d2f7cb7934a0 ("LU-12026 mdt: MDS stores atime|mtime|ctime")

Test-Parameters: trivial serverversion=2.11.0 serverdistro=el7 envdefinitions=ONLY=104 testlist=sanityn
Test-Parameters: envdefinitions=ONLY=104 testlist=sanityn

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I625fb0163c078dc95ed670d169dc5744bc16d4e8
Reviewed-on: https://review.whamcloud.com/37461
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
4 years agoLU-12598 osd-ldiskfs: always return errors for osd_ios_lf_fill 23/37323/2
James Simmons [Fri, 24 Jan 2020 16:45:48 +0000 (11:45 -0500)]
LU-12598 osd-ldiskfs: always return errors for osd_ios_lf_fill

While working on ARM ldiskfs support it was noticed that
osd_ios_lf_fill() behaves differently than the other olm_filldir
handlers. On failure of osd_lookup_one_len(), osd_ios_lf_fill()
silently returns zero when it should return an error code. Change it
to return proper error codes and update the cdebug messages.

Change-Id: I528b18aaa7277133875cba5db3150ce34cc6431a
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37323
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13063 tests: remove checks for old RHEL versions 72/37272/4
Andreas Dilger [Fri, 17 Jan 2020 23:38:02 +0000 (16:38 -0700)]
LU-13063 tests: remove checks for old RHEL versions

There was a check in sanity test_17g for RHEL6.5, but we haven't been
testing that client version for some time, and 6.5 no longer works on
master.  Remove this check entirely.

Similarly, is_project_quota_supported() was trying to check for RHEL7,
but the lfs check was being done on the client.  It was also wrong for
RHEL8 kernels, and would incorrectly match any version with a "7" in
it.  Move lfs check to the MDS, and don't check the kernel version.

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If115b65ba8bc09b6c292ec9cf2e949c8153ebbe5
Reviewed-on: https://review.whamcloud.com/37272
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13142 lod: cleanup layout checking 67/37267/4
Sebastien Buisson [Fri, 17 Jan 2020 13:15:25 +0000 (22:15 +0900)]
LU-13142 lod: cleanup layout checking

Cleanup layout checking in lod layer and lfs command-line utility,
for DoM components.

Reported-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib8b184a31d26442ed10241dc12a0452e5243d0e8
Reviewed-on: https://review.whamcloud.com/37267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12133 osd-zfs: set blocksize to 8K for llog objects 92/37192/4
Alex Zhuravlev [Fri, 10 Jan 2020 20:15:12 +0000 (23:15 +0300)]
LU-12133 osd-zfs: set blocksize to 8K for llog objects

With ZFS-0.8+ the default blocksize is 512 bytes. As many llog
operations use 8K chunks, each one turns into 16 dbuf lookups,
which is quite expensive.

For example, sanity/60a takes 104s with blocksize=512 and
90s with blocksize=8K.
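
The figure of 16 is just the chunk size divided by the block size.
A minimal sketch of that arithmetic, with blocks_per_chunk() as a
hypothetical helper and assuming chunks start on a block boundary:

```c
#include <stddef.h>

/*
 * Number of blocks (and so, roughly, dbuf lookups) touched by one
 * chunk, assuming the chunk starts on a block boundary.
 */
static size_t blocks_per_chunk(size_t chunk_bytes, size_t block_bytes)
{
	return (chunk_bytes + block_bytes - 1) / block_bytes;
}
```

With 8K chunks this gives 16 blocks at blocksize=512 and a single
block at blocksize=8K.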

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I86e6e598899e5d09a550dff7dcb9edd5ee56abd5
Reviewed-on: https://review.whamcloud.com/37192
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11114 llite: Update mdc and lite stats on open|creat 48/36948/7
Olaf Faaland [Tue, 26 Nov 2019 23:20:11 +0000 (15:20 -0800)]
LU-11114 llite: Update mdc and lite stats on open|creat

Increment "create" counter in mdc/<instance>/md_stats, and
"mknod" counter in llite/<instance>/stats when an open with
the CREAT flag results in a newly created file.

The mknod counter is chosen for consistency with
patch http://review.whamcloud.com/20246
 "LU-8150 mdt: Track open+create as mknod"
but the mdc counter set does not include mknod.

Change-Id: Ib32d828dac35924b929f44f161cff13c99810540
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Reviewed-on: https://review.whamcloud.com/36948
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 socklnd: convert peers hash table to hashtable.h 37/36837/6
Mr NeilBrown [Wed, 15 Jan 2020 15:36:42 +0000 (10:36 -0500)]
LU-12678 socklnd: convert peers hash table to hashtable.h

Using a hashtable.h hashtable, rather than bespoke code, has several
advantages:

 - the table is comprised of hlist_head, rather than list_head, so
   it consumes less memory (though we need to make it a little bigger
   as it must be a power-of-2)
 - there are existing macros for easily walking the whole table
 - it uses a "real" hash function rather than "mod a prime number".

In some ways, rhashtable might be even better, but it can change the
ordering of objects in the table at arbitrary moments, and that could
hurt the user-space API.  It also does not support the partitioned
walking that ksocknal_check_peer_timeouts() depends on.

Note that new peers are inserted at the top of a hash chain, rather
than appended at the end.  I don't think that should be a problem.
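
The points above can be illustrated with a userspace sketch of an
hlist-style table. Nothing here is the socklnd code: struct hnode,
struct hhead, struct peer, peer_add() and peer_find() are hypothetical
names, and bucket_of() is just a multiplicative hash in the spirit of
what hashtable.h uses, not a copy of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Single-pointer head, so each bucket costs one pointer, not two. */
struct hnode { struct hnode *next, **pprev; };
struct hhead { struct hnode *first; };

#define TABLE_BITS 3
#define TABLE_SIZE (1u << TABLE_BITS)	/* must be a power of two */

struct peer {
	uint32_t id;
	struct hnode link;
};

static struct hhead table[TABLE_SIZE];

/* Multiplicative hash: keep the top TABLE_BITS bits of the product. */
static unsigned int bucket_of(uint32_t id)
{
	return (uint32_t)(id * 0x61C88647u) >> (32 - TABLE_BITS);
}

/* New entries go to the head of the chain, not the tail. */
static void peer_add(struct peer *p)
{
	struct hhead *h = &table[bucket_of(p->id)];

	p->link.next = h->first;
	if (h->first)
		h->first->pprev = &p->link.next;
	h->first = &p->link;
	p->link.pprev = &h->first;
}

/* Walk one chain; the first match is the most recently added one. */
static struct peer *peer_find(uint32_t id)
{
	struct hnode *n;

	for (n = table[bucket_of(id)].first; n; n = n->next) {
		struct peer *p = (struct peer *)
			((char *)n - offsetof(struct peer, link));
		if (p->id == id)
			return p;
	}
	return NULL;
}
```

Because insertion is at the head, a lookup sees newer entries before
older ones on the same chain, which is the ordering change noted
above.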

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I70fe64df0dd0db73666ff6fb2d2888b1d64f4be5
Reviewed-on: https://review.whamcloud.com/36837
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12634 gss: uid_keyring and session_keyring moved 43/35743/17
Shaun Tancheff [Fri, 3 Jan 2020 20:10:58 +0000 (15:10 -0500)]
LU-12634 gss: uid_keyring and session_keyring moved

Linux 5.3 removed uid_keyring and session_keyring from user_struct.
Prefer the lookup_user_key() API when it is available (~5.0).
Prefer get_request_key_auth() when it is available (~5.0).

kernel-commit: 0f44e4d976f96c6439da0d6717238efa4b91196e
kernel-commit: 822ad64d7e46a8e2c8b8a796738d7b657cbb146d

Remove LC_HAVE_CRED_TGCRED which is no longer used.

Test-Parameters: envdefinitions=SHARED_KEY=true,SANITY_SEC_EXCEPT=30b testlist=sanity,recovery-small,sanity-sec
Cray-bug-id: LUS-7689
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I6d551cd8a9e317b717a43cba9be57f184a281c0a
Reviewed-on: https://review.whamcloud.com/35743
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12889 lnet: Do not assume peers are MR capable 12/36512/8
Chris Horn [Fri, 18 Oct 2019 19:16:53 +0000 (14:16 -0500)]
LU-12889 lnet: Do not assume peers are MR capable

If a peer has discovery disabled then it will not consolidate peer
NI information. This means we need to use a consistent source NI
when sending to it just like we do for non-MR peers.

A comment in lnet_discovery_event_reply() indicates that this was a
known issue, but the situation is not handled properly.

Do not assume peers are multi-rail capable when peer objects are
allocated and initialized.

Do not mark a peer as multi-rail capable unless all of the following
conditions are satisfied:
1. The peer has the MR feature flag set.
2. The peer has discovery enabled.
3. We have discovery enabled locally.

Note: 1, 2, and 3 above are implemented in the code for
lnet_discovery_event_reply(), but code earlier in the function breaks
this behavior. Remove the offending code.
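
The three conditions amount to a single conjunction. A minimal sketch,
using a hypothetical struct peer_state and peer_is_multi_rail() rather
than the real LNet types:

```c
#include <stdbool.h>

/* Hypothetical summary of what discovery learned about a peer. */
struct peer_state {
	bool mr_feature_flag;	/* 1. peer advertises the MR feature */
	bool discovery_enabled;	/* 2. peer has discovery enabled */
};

/* A peer is treated as multi-rail only if all three conditions hold. */
static bool peer_is_multi_rail(const struct peer_state *peer,
			       bool local_discovery_enabled)
{
	return peer->mr_feature_flag &&
	       peer->discovery_enabled &&
	       local_discovery_enabled;	/* 3. discovery enabled locally */
}
```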

Update sanity-lnet tests 100 and 101 to reflect the fact that peers
added via the traffic path no longer have multi-rail by default.

Cray-bug-id: LUS-7918
Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ia02bd446f4b2143fb490f56c1ff6103198316da3
Reviewed-on: https://review.whamcloud.com/36512
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoRevert "LU-12222 lnet: Check if we're sending to ourselves" 59/37259/3
Chris Horn [Thu, 16 Jan 2020 19:25:29 +0000 (13:25 -0600)]
Revert "LU-12222 lnet: Check if we're sending to ourselves"

This reverts commit e4af756e1f428a9f7883bf883f66941defb1447f.

Commit e4af756 causes an assert when combined with patch
    https://review.whamcloud.com/36512
    LU-12889 lnet: Do not assume peers are MR capable
Since the 36512 patch is fixing a more serious bug, this
patch is reverted to allow that fix to land.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6f3c1e7f7b2858f4aa330b53880fbcc815c1e2c7
Reviewed-on: https://review.whamcloud.com/37259
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12923 libcfs: Remove CLASSERT() for libcfs_private.h 88/37188/3
Arshad Hussain [Mon, 30 Dec 2019 23:28:56 +0000 (04:58 +0530)]
LU-12923 libcfs: Remove CLASSERT() for libcfs_private.h

This patch removes the final CLASSERT() define from the file
libcfs/include/libcfs/libcfs_private.h. For compile-time
assertions, the kernel-defined BUILD_BUG_ON() is preferred.
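
For readers outside the kernel tree, the idea can be sketched with
C11 _Static_assert. MY_BUILD_BUG_ON(), struct wire_hdr and
wire_hdr_size() are hypothetical; the kernel's own BUILD_BUG_ON() is
implemented differently, but the polarity is the same (the build
breaks when the condition is true).

```c
#include <stdint.h>

/* Userspace stand-in: fail the build if 'cond' is true. */
#define MY_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/* Hypothetical on-wire header whose size must never change. */
struct wire_hdr {
	uint32_t magic;
	uint32_t len;
};

static inline uint32_t wire_hdr_size(void)
{
	/* Checked at compile time; generates no code at run time. */
	MY_BUILD_BUG_ON(sizeof(struct wire_hdr) != 8);
	return (uint32_t)sizeof(struct wire_hdr);
}
```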

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I6d7dd55489824631ae61393413598fe6dc4365a2
Reviewed-on: https://review.whamcloud.com/37188
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12861 libcfs: Cleanup use of bare printk 46/37046/5
Shaun Tancheff [Fri, 3 Jan 2020 16:21:19 +0000 (10:21 -0600)]
LU-12861 libcfs: Cleanup use of bare printk

Some users of printk(<LEVEL> "fmt") can be converted to their
pr_level("fmt") equivalents.

Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I5bb13dfa3538839cfaf81137f3cffd937ce55a92
Reviewed-on: https://review.whamcloud.com/37046
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-3606 lustre: Reserve OST_FALLOCATE(fallocate) opcode 77/37277/4
Swapnil Pimpale [Wed, 1 Jan 2020 00:42:57 +0000 (06:12 +0530)]
LU-3606 lustre: Reserve OST_FALLOCATE(fallocate) opcode

A new RPC, OST_FALLOCATE has been added for
space preallocation. This patch reserves
OST_FALLOCATE opcode for fallocate syscall.
Reserving opcode upfront would ensure consistency
and would avoid protocol interoperability issues
in the future.

Test-Parameters: trivial testlist=sanity,sanityn,sanity-dom
Signed-off-by: Swapnil Pimpale <spimpale@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Abrarahmed Momin <abrar.momin@gmail.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie109f8f5720dec6d34c5ce4f7732fe49ccb47cd9
Reviewed-on: https://review.whamcloud.com/37277
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
4 years agoLU-3606 fsx: Add fallocate operation to fsx 73/37273/2
Swapnil Pimpale [Tue, 31 Dec 2019 19:39:25 +0000 (01:09 +0530)]
LU-3606 fsx: Add fallocate operation to fsx

This patch updates Lustre fsx (File system exerciser)
to handle fallocate calls. There is no need to change
any existing test case using the "fsx" binary, as with this
fsx version the 'fallocate' call will simply be skipped
as "Operation not supported".

Test-Parameters: trivial testlist=sanity,sanityn,sanity-dom
Signed-off-by: Swapnil Pimpale <spimpale@ddn.com>
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Abrarahmed Momin <abrar.momin@gmail.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I81649d00984257b1785e763ab5c00d570eb412f9
Reviewed-on: https://review.whamcloud.com/37273
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
4 years agoLU-13164 uapi: remove unused LUSTRE_DIRECTIO_FL 95/37295/2
Andreas Dilger [Tue, 21 Jan 2020 09:32:54 +0000 (02:32 -0700)]
LU-13164 uapi: remove unused LUSTRE_DIRECTIO_FL

The LUSTRE_DIRECTIO_FL was added based on the upstream FS_DIRECTIO_FL
flag in the hopes that it might be useful, but it has since been
removed from the upstream in kernel commit v4.4-rc4-22-g68ce7bfcd995
and replaced by FS_VERITY_FL using the same value in kernel commit
v5.3-rc2-4-gfe9918d3b228, which we are much more likely to use.

Since LUSTRE_DIRECTIO_FL was unused, there is no risk to remove it.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I49e915612636a674a86d25be5d91a042693ebbe5
Reviewed-on: https://review.whamcloud.com/37295
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13130 tests: sanity-scrub to use full device size with ZFS 17/37217/4
Alex Zhuravlev [Mon, 13 Jan 2020 17:29:55 +0000 (20:29 +0300)]
LU-13130 tests: sanity-scrub to use full device size with ZFS

as on tiny devices ZFS falls back to non-cached writes (grants are
consumed too quickly) while formatting time doesn't depend on
device size with ZFS (which was the original reasoning for the limits).

also increase OST size as sometimes local testing with ldiskfs fails
due to lack of space.

Test-Parameters: trivial testlist=sanity-scrub mdscount=2 mdtcount=4
Test-Parameters: fstype=zfs testlist=sanity-scrub mdscount=2 mdtcount=4

Change-Id: I8aad6c39d23a1d4c8db07b76e9de7fa2a664b1e5
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/37217
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-9859 libcfs: move files out of libcfs/linux 91/37191/6
James Simmons [Sat, 18 Jan 2020 14:53:35 +0000 (09:53 -0500)]
LU-9859 libcfs: move files out of libcfs/linux

Files that are not used to handle various kernel versions are
promoted out of the linux directory. Loosely based on

Linux-commit: f72c3ab791ac0b2b75b5b5d4d51d8eb89ea1e515

This brings us more into sync with the Linux Lustre client.

Change-Id: I4aad42671de14b4e5ca0743d2126363c829b0d74
Test-Parameters: trivial
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37191
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12977 ldiskfs: properly take inode_lock() for truncates 16/37116/5
James Simmons [Mon, 30 Dec 2019 17:48:05 +0000 (12:48 -0500)]
LU-12977 ldiskfs: properly take inode_lock() for truncates

Originally Lustre grabbed the inode_lock() but this led to
deadlocks as described in LU-6446 and LU-4252. The recent work
of LU-10048 changed the truncate code so that it is called
asynchronously from the main transactions. This should avoid
lock ordering issues. It should be safe to take the
inode_lock() around ldiskfs_truncate() and remove the WARN().

Test-Parameters: fstype=ldiskfs testlist=racer

Change-Id: Id7b6d05d054ab041980e946989aa1effae5c7111
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37116
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13036 lnet: avoid extra memory consumption 97/36897/3
Alexey Lyashkov [Fri, 20 Dec 2019 12:40:37 +0000 (15:40 +0300)]
LU-13036 lnet: avoid extra memory consumption

Use slab allocation for the rsp_tracker and lnet_message
structs to avoid memory fragmentation.

Test-parameters: trivial

Cray-bug-id: LUS-8190
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I67ec8f8fe4da4c646241d551e0a23745cae8ed00
Reviewed-on: https://review.whamcloud.com/36897
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: me: discard struct lnet_handle_me 59/36859/8
Mr NeilBrown [Sat, 18 Jan 2020 13:41:38 +0000 (08:41 -0500)]
LU-12678 lnet: me: discard struct lnet_handle_me

The Portals API uses a cookie 'handle' to identify an ME.  This is
appropriate for a user-space API for objects maintained by the
kernel, but it brings no value when the API client and
implementation are both in the kernel, as is the case with Lustre
and LNet.

Instead of using a 'handle', a pointer to the 'struct lnet_me' can
be used.  This object is not reference counted and is always freed
correctly, so there can be no case where the cookie becomes invalid
while it is still held - as can be seen by the fact that the return
value from LNetMEUnlink() is never used except to assert that it is
zero.

So use 'struct lnet_me *' directly instead of having indirection
through a 'struct lnet_handle_me'.

Also:
 - change LNetMEUnlink() to return void as it cannot fail now.
 - have LNetMEAttach() return the pointer, using ERR_PTR() to return
   errors.
 - discard ln_me_containers and don't store the me therein.
 - store an explicit 'cpt' in each me, we no longer store one
   implicitly via the cookie.
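
The pointer-or-error return convention can be sketched in userspace.
The ERR_PTR(), PTR_ERR() and IS_ERR() helpers below are simplified
re-implementations modelled on the kernel's, and struct me,
me_attach() and me_unlink() are hypothetical stand-ins, not the LNet
functions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top, never-valid pointer range. */
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object with an explicit 'cpt' stored in it. */
struct me {
	int cpt;
};

/* Returns the object itself, or an ERR_PTR() on failure. */
static struct me *me_attach(int cpt)
{
	struct me *me;

	if (cpt < 0)
		return ERR_PTR(-EINVAL);
	me = malloc(sizeof(*me));
	if (!me)
		return ERR_PTR(-ENOMEM);
	me->cpt = cpt;
	return me;
}

/* Cannot fail, so it returns void. */
static void me_unlink(struct me *me)
{
	free(me);
}
```

A caller checks IS_ERR() on the result and uses PTR_ERR() to recover
the error code, so no separate output parameter is needed.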

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I4e73e3217a244d8d15da90a8ba80371d1fd5f61f
Reviewed-on: https://review.whamcloud.com/36859
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: fix small race in unloading klnd modules. 53/36853/4
Mr NeilBrown [Sat, 18 Jan 2020 13:48:35 +0000 (08:48 -0500)]
LU-12678 lnet: fix small race in unloading klnd modules.

Reference counting of klnd modules is handled by the module itself.
Currently, it is possible for a module to be completely unloaded
between the time when the module called module_put(), and when
it subsequently returns from the function that makes that call.
During this time there may be one or two instructions to execute,
and if the module is unmapped before they are executed, an
exception will result.

The module unload will call lnet_unregister_lnd() which takes
the_lnet.ln_lnd_mutex, so module unload cannot complete while
that is held.  lnd_startup is called with this mutex held to
avoid any races, but lnd_shutdown is not.  Adding that
protection will close the race.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I98036ef2fc939101d085bbd6d0c76a29b848ee26
Reviewed-on: https://review.whamcloud.com/36853
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: remove dead code: lnet_fini_locks() 38/36838/5
Mr NeilBrown [Sat, 18 Jan 2020 13:45:25 +0000 (08:45 -0500)]
LU-12678 lnet: remove dead code: lnet_fini_locks()

lnet_fini_locks() does nothing and appears to serve
no purpose.  Remove it.
Once long ago it contained some asserts, and there was
a parallel function that was used for user-space builds,
but all that is gone now.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I91243b1726694dfe3437a81d15d06330bf96e9c9
Reviewed-on: https://review.whamcloud.com/36838
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13004 target: convert tgt_send_buffer to use KIOV 27/36827/6
Mr NeilBrown [Sat, 18 Jan 2020 13:57:51 +0000 (08:57 -0500)]
LU-13004 target: convert tgt_send_buffer to use KIOV

Rather than BULK_BUF_KVEC, use a BULK_BUF_KIOV descriptor.

This is a step towards removing KVEC support.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I27a81f6a3ef7ba9b079b1b93f56f475a38aaa3f4
Reviewed-on: https://review.whamcloud.com/36827
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-9679 modules: use list_move were appropriate. 70/36670/4
Mr NeilBrown [Tue, 5 Nov 2019 02:40:53 +0000 (13:40 +1100)]
LU-9679 modules: use list_move were appropriate.

Rather than
  list_del(&foo);
  list_add(&foo, &bar);
use
  list_move(&foo, &bar);

Similarly for list_add_tail and list_move_tail.
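
A minimal userspace re-implementation shows why the two forms are
equivalent. These functions only mirror the shape of the kernel
helpers of the same names; they are a sketch, not the list.h code.

```c
/* Circular doubly linked list; an empty head points at itself. */
struct list_head {
	struct list_head *next, *prev;
};

static void list_init(struct list_head *h)
{
	h->next = h->prev = h;
}

/* Unlink 'e' from whatever list it is on. */
static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Insert 'e' right after 'head'. */
static void list_add(struct list_head *e, struct list_head *head)
{
	e->next = head->next;
	e->prev = head;
	head->next->prev = e;
	head->next = e;
}

/* One call instead of a list_del()/list_add() pair. */
static void list_move(struct list_head *e, struct list_head *head)
{
	list_del(e);
	list_add(e, head);
}
```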

In lnet_attach_rsp_tracker, local_rspt already has a suitably
initialised ->rspt_on_list, so the new_entry variable can
be discarded.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I81b1165e01a6f26be0ec0c2686d84502be1e0b35
Reviewed-on: https://review.whamcloud.com/36670
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12871 mdd: enable Changelog garbage collection 67/36467/5
Andreas Dilger [Thu, 17 Oct 2019 05:50:32 +0000 (14:50 +0900)]
LU-12871 mdd: enable Changelog garbage collection

Enable the Changelog garbage collection by default.

This feature was disabled by default in commit v2_10_56_0-2-g3442db6fa
(2.11.0) and was fixed in commit v2_11_52_0-59-g31fef6845e (2.12), but
was not re-enabled again by default.

Fixes: 31fef6845e8b ("LU-10680 mdd: create gc thread when no current transaction")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id4e68e0563cb2216d56bb9aec3a49c83c93ebbe5
Reviewed-on: https://review.whamcloud.com/36467
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-1538 tests: standardize test script init – failover 58/36358/3
James Nunez [Thu, 3 Oct 2019 13:58:11 +0000 (07:58 -0600)]
LU-1538 tests: standardize test script init – failover

Standardize the initial Lustre test script initialization for
clarity and consistency for recovery and replay tests in the
failover test group.

The LUSTRE path is already normalized in init_test_env(), so this
doesn't need to be done in the caller.  Use $(...) subshells instead
of `...` in the affected lines.  Remove NAME, SRCDIR, PATH, MULTIOP,
SETUP, CLEANUP, CHECKSTAT, TMP, SAVE_PWD, variable initialization,
since it is already done in init_test_env() or not needed in the test
scripts.

Move all definitions of ALWAYS_EXCEPT and SLOW to after
init_test_env() and init_logging() and call build_test_filter()
immediately after the ALWAYS_EXCEPT and SLOW definitions.

Test-Parameters: trivial
Test-Parameters: envdefinitions=SLOW=no testgroup=failover
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I6484130e70a738c2fc4962afe2b814b39ea5ed77
Reviewed-on: https://review.whamcloud.com/36358
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
4 years agoLU-10467 obdclass: convert waiting in cl_sync_io_wait(). 02/36102/11
Mr NeilBrown [Sat, 18 Jan 2020 13:55:58 +0000 (08:55 -0500)]
LU-10467 obdclass: convert waiting in cl_sync_io_wait().

The l_wait_event() call in cl_sync_io_wait() will wait indefinitely
if timeout is zero, or for a limited time if timeout is positive.
This doesn't have an exact analogue in wait_event* macros, so we
need to revise the code more broadly.

This function will *always* wait until ->csi_sync_nr reaches zero.
The effect of the timeout is:
 1/ to report an error if the count doesn't reach zero in the given
    time
 2/ to return -ETIMEDOUT instead of csi_sync_rc if the timeout was
    exceeded.

So we rearrange the code to make that more obvious.
A small extra change is that we now call wait_event_idle() again
even if there was a timeout and the first wait succeeded.
This will simply test csi_sync_nr again and not actually wait.
We could protect it with 'rc != 0 || timeout == 0' but there seems
no point.

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I3c507a7fe01bfdf2ed5e9e71ea8215e6cfd0b54e
Reviewed-on: https://review.whamcloud.com/36102
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12542 handle: use hlist for hash lists. 62/35862/5
NeilBrown [Fri, 13 Dec 2019 15:47:35 +0000 (10:47 -0500)]
LU-12542 handle: use hlist for hash lists.

hlist_head/hlist_node is the preferred data structure
for hash tables. Not only does it make the 'head' smaller,
but it also provides hlist_unhashed() which can be used to
check if an object is in the list.  This means that
we don't need h_in any more.
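
A minimal userspace sketch of why an hlist node can report its own
membership, modeled on the layout of the kernel's hlist types.  All
sketch_* names are illustrative assumptions; this is not the Lustre
handle code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The head is a single pointer, half the size of a doubly linked head. */
struct sketch_hlist_head {
	struct sketch_hlist_node *first;
};

struct sketch_hlist_node {
	struct sketch_hlist_node *next;
	struct sketch_hlist_node **pprev; /* address of the pointer to us */
};

static void sketch_hlist_node_init(struct sketch_hlist_node *n)
{
	n->next = NULL;
	n->pprev = NULL;
}

/* A node with no back-pointer is not on any list. */
static bool sketch_hlist_unhashed(const struct sketch_hlist_node *n)
{
	return n->pprev == NULL;
}

static void sketch_hlist_add_head(struct sketch_hlist_node *n,
				  struct sketch_hlist_head *h)
{
	assert(sketch_hlist_unhashed(n));
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* Unlink and reinitialise, so sketch_hlist_unhashed() is true again. */
static void sketch_hlist_del_init(struct sketch_hlist_node *n)
{
	if (sketch_hlist_unhashed(n))
		return;
	*n->pprev = n->next;
	if (n->next)
		n->next->pprev = n->pprev;
	sketch_hlist_node_init(n);
}
```

Because the node itself records whether it is linked, a separate
"is hashed" flag such as h_in carries no extra information.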

Change-Id: I18e2799a6e719b96ed47747375e4e20675d9b7cc
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35862
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12542 handle: remove locking from class_handle2object() 61/35861/7
NeilBrown [Fri, 13 Dec 2019 15:46:15 +0000 (10:46 -0500)]
LU-12542 handle: remove locking from class_handle2object()

There is limited value in this locking and test on h_in.

If the lookup could have run in parallel with
class_handle_unhash_nolock() and seen "h_in == 0", then it could
equally well have run moments earlier and not seen it - no locking
would prevent that, so the caller must be prepared to have
an object returned which has already been unhashed by the time it
sees the object.

In other words, any interlock between unhash and lookup must be
provided at a higher level than where this code is trying
to handle it.

The locking *does* prevent the refcount from being incremented if the
object has already been removed from the list.  As the final reference
is always dropped after that removal, it indirectly stops the refcount
from being incremented after the final reference is dropped.
This can be more directly achieved by using refcount_inc_not_zero().

So remove the locking, and replace it with refcount_inc_not_zero().
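
The idea behind refcount_inc_not_zero() can be sketched in userspace
with C11 atomics.  This is an illustration under assumptions, not the
kernel implementation (which also handles saturation); the sketch_*
names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct sketch_handle {
	atomic_uint refs; /* 0 means the final reference is gone */
};

/*
 * Take a reference only if the object is still live.  The
 * compare-exchange loop never moves the counter from 0 to 1, so a
 * lookup racing with the final put simply fails.
 */
static bool sketch_ref_inc_not_zero(struct sketch_handle *h)
{
	unsigned int old = atomic_load(&h->refs);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&h->refs, &old, old + 1))
			return true;
		/* 'old' was reloaded by the failed exchange; retry. */
	}
	return false;
}

/* Drop a reference; returns true when the caller dropped the last one. */
static bool sketch_ref_put(struct sketch_handle *h)
{
	unsigned int old = atomic_fetch_sub(&h->refs, 1);

	assert(old != 0);
	return old == 1;
}
```

A lookup that gets false back treats the object as not found, which is
the same outcome the old lock plus h_in test produced.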

Change-Id: Id29cee173ed0c3b060ea92e21af6e420970cfa18
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35861
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12518 llite: Accept EBUSY for page unaligned read 57/35457/14
Patrick Farrell [Thu, 8 Aug 2019 17:13:01 +0000 (13:13 -0400)]
LU-12518 llite: Accept EBUSY for page unaligned read

When doing unaligned strided reads, it's possible for the
first and last page of a stride to be read by another
thread on the same node, resulting in EBUSY.

This could also happen for sequential reads, for example when
several MPI programs split one large file at boundaries that are
not page aligned and each program reads its part sequentially.

We shouldn't stop readahead in these cases.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I4e832c8859452d0b52f14b5e4fdb64a972bf40a3
Reviewed-on: https://review.whamcloud.com/35457
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12518 llite: proper names/types for offset/pages 48/37248/8
Andreas Dilger [Wed, 15 Jan 2020 10:30:32 +0000 (03:30 -0700)]
LU-12518 llite: proper names/types for offset/pages

Use loff_t for file offsets and pgoff_t for page index values
instead of unsigned long, so that it is possible to distinguish
what type of value is being used in the byte-granular readahead
code.  Otherwise, it is difficult to determine what units "start"
or "end" in a given function are in.

Rename variables that reference page index values with an "_idx"
suffix to make this clear when reading the code.  Similarly, use
"bytes" or "pages" for variable names instead of "count" or "len".

Fix stride_page_count() to properly use loff_t for the byte_count,
which might otherwise overflow for large strides.

Cast pgoff_t vars to loff_t before PAGE_SIZE shift to avoid overflow.
Use shift and mask with PAGE_SIZE and PAGE_MASK instead of mod/div.
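
A small sketch of the overflow that the cast avoids, assuming 4 KiB
pages and using uint32_t to stand in for a 32-bit unsigned long.  The
sketch_* types and SKETCH_* macros are illustrative, not the llite
definitions.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumed 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)

typedef uint32_t sketch_pgoff_t; /* stands in for a 32-bit unsigned long */
typedef int64_t  sketch_loff_t;

/* Buggy: the shift happens in 32 bits, so the high bits are lost first. */
static sketch_loff_t idx_to_bytes_truncating(sketch_pgoff_t idx)
{
	return (sketch_loff_t)(idx << SKETCH_PAGE_SHIFT);
}

/* Fixed: widen to the 64-bit offset type, then shift. */
static sketch_loff_t idx_to_bytes(sketch_pgoff_t idx)
{
	return (sketch_loff_t)idx << SKETCH_PAGE_SHIFT;
}

/* Shift and mask replace division and modulo by the page size. */
static sketch_pgoff_t bytes_to_idx(sketch_loff_t pos)
{
	assert(pos >= 0);
	return (sketch_pgoff_t)(pos >> SKETCH_PAGE_SHIFT);
}

static unsigned int offset_in_page(sketch_loff_t pos)
{
	assert(pos >= 0);
	return (unsigned int)(pos & (SKETCH_PAGE_SIZE - 1));
}
```

With these definitions, page index 0x100000 maps to byte offset 4 GiB
in idx_to_bytes() but wraps to 0 in the truncating variant.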

Use proper 64-bit division functions for the loff_t types when
calculating stride, since they are not guaranteed to be within 4GB.

Remove unused "remainder" argument from ras_align() function.

Fixes: 91d264551508 ("LU-12518 llite: support page unaligned stride readahead")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie1e18e0766bde2a72311e25536dbb562ce3ebbe5
Reviewed-on: https://review.whamcloud.com/37248
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13154 test: skip sanity-quota 66 if MDS version < 2.12.4 76/37276/2
Wang Shilong [Sun, 19 Jan 2020 02:16:25 +0000 (10:16 +0800)]
LU-13154 test: skip sanity-quota 66 if MDS version < 2.12.4

Since LU-12826 landed after this version, add version check to
make interop test pass.

Test-Parameters: trivial envdefinitions=ONLY=66 testlist=sanity-quota
Change-Id: I829f424b9bb103e18c06de6f797827f82e1874d1
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37276
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
4 years agoLU-13063 tests: stop running sanity test 411 70/37270/6
James Nunez [Fri, 17 Jan 2020 19:24:15 +0000 (12:24 -0700)]
LU-13063 tests: stop running sanity test 411

sanity test 411 hits a kernel bug for RHEL 8.1.  Since this
is an issue with the kernel and not Lustre, let's stop
running this test until the kernel is patched.  Thus, we
need to add sanity test 411 to the ALWAYS_EXCEPT list.

Also change the ALWAYS_EXCEPT condition for test smoke for
lnet-selftest to be based on kernel version and not
architecture, so that the custom test for this patch can
pass.

Test-Parameters: trivial clientdistro=el8.1
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I60174dcd4776b53ac5b44be6c208d40e1f022445
Reviewed-on: https://review.whamcloud.com/37270
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
4 years agoLU-13152 llapi: llapi_layout_get_by_xattr groks DoM 69/37269/3
Sebastien Buisson [Fri, 17 Jan 2020 16:31:04 +0000 (17:31 +0100)]
LU-13152 llapi: llapi_layout_get_by_xattr groks DoM

llapi_layout_get_by_xattr() function must be updated to handle
lov component with LOV_PATTERN_MDT pattern.

Signed-off-by: Clement Barthelemy <clement.barthelemy@nextino.eu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I6553e66cd4f3b5acc65790da94555350c98fe179
Reviewed-on: https://review.whamcloud.com/37269
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
4 years agoLU-13147 tests: Cleanup sanity-lnet on test failure 58/37258/4
Chris Horn [Thu, 16 Jan 2020 19:31:47 +0000 (13:31 -0600)]
LU-13147 tests: Cleanup sanity-lnet on test failure

Trap EXIT so we can clean up on test failure.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I702b214046a68af2b87536dab01879c356bff2a8
Reviewed-on: https://review.whamcloud.com/37258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13136 dom: check read-on-open buffer presents in reply 49/37249/2
Mikhail Pershin [Wed, 15 Jan 2020 11:55:18 +0000 (14:55 +0300)]
LU-13136 dom: check read-on-open buffer presents in reply

ll_dom_finish_open() uses req_capsule_has_field() wrongly:
it checks only the format but not buffer presence in the reply, which
causes unneeded console errors about a missing buffer later in
req_capsule_server_get().

The patch replaces that with req_capsule_field_present() to check
whether the server packed that field in the reply, and properly skips
responses from an old server.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia6114879c90e3e6b8c5020c4912e988cad90df30
Reviewed-on: https://review.whamcloud.com/37249
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Stephane Thiell <sthiell@stanford.edu>
4 years agoLU-11644 ptlrpc: show target name in req_history 93/37193/3
Andreas Dilger [Fri, 10 Jan 2020 22:41:18 +0000 (15:41 -0700)]
LU-11644 ptlrpc: show target name in req_history

Currently the req_history tracing shows the "self" NID as the second
field.  However, this is not very useful since there may be a number
of different targets on the same server, and since the logs are all
collected directly on the server we already know the local NID.

Instead of printing the "self" NID, store the target name as the
second field, if that is available, so that we can determine which
target the RPC was intended for.  This makes it easier to debug
problems with bad clients and isolate traffic for a specific target.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4ce5b7c557c5b491bfe3bbc5ae80257f0a3ebbe5
Reviewed-on: https://review.whamcloud.com/37193
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13093 osd: fix osd_attr_set race 17/37117/4
Alexander Boyko [Tue, 31 Dec 2019 15:17:58 +0000 (10:17 -0500)]
LU-13093 osd: fix osd_attr_set race

A race between tgt_brw_write->ofd_write_attr_set and
ofd_attr_set could occur and set wrong attributes.
ofd_write_attr_set() does checks and declarations and sleeps on
ofd_read_lock. Another thread executes ofd_attr_set() and sets the
initial uid/gid. After that the first thread wakes up and sets
another uid/gid. But ofd_write_attr_set() should change attributes
only when they are set for the first time.
This also leads to a bug at the credits check, because the uid was
changed between declaration and attr_set.

osd_trans_exec_check(ATTR_SET) is in the wrong place when xattr_set
is called. Also, xattr doesn't have osd_trans_exec_op.

lustre-OST0001: opcode 0: used 9, used now 9, reserved 1
create: 0/0/0, destroy: 0/0/0
attr_set: 1/1/9, xattr_set: 2/274/0
write: 0/0/0, punch: 0/0/0, quota 6/6/0
insert: 0/0/0, delete: 0/0/0
ref_add: 0/0/0, ref_del: 0/0/0
LBUG

Cray-bug-id: LUS-8133
Fixes: 9f79d4488  ("LU-10048 ofd: take local locks within transaction")
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Id36ff633b0d97fff345ec105e0aa1b14fccafce4
Reviewed-on: https://review.whamcloud.com/37117
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13101 llite: eviction during ll_open_cleanup() 96/37096/2
Andriy Skulysh [Mon, 28 Oct 2019 19:42:42 +0000 (21:42 +0200)]
LU-13101 llite: eviction during ll_open_cleanup()

On error ll_open_cleanup() is called while
intent lock remains pinned. So eviction can
happen while close request waits for a mod rpc slot.

Release intent lock before ll_open_cleanup()

Change-Id: Ia422351f3f54fc652078f742f2ead0bf278c9d17
Cray-bug-id: LUS-8055
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-on: https://review.whamcloud.com/37096
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13099 lmv: disable statahead for remote objects 89/37089/6
Vladimir Saveliev [Mon, 23 Dec 2019 11:07:25 +0000 (14:07 +0300)]
LU-13099 lmv: disable statahead for remote objects

Statahead for remote objects is supposed to be disabled by
LU-11681 lmv: disable remote file statahead.

However, due to a typo it is not, and statahead for remote objects is
accompanied by warnings like:
  ll_set_inode()) Can not initialize inode .. without object type..
  ll_prep_inode()) new_inode -fatal: rc -12

Fix the typo.

Test to illustrate the issue is added.

Fixes: 02b5a407081c ("LU-11681 lmv: disable remote file statahead")

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Cray-bug-id: LUS-8262
Change-Id: I8055b6373fb7b9777fa888dcb09384213822a59f
Reviewed-on: https://review.whamcloud.com/37089
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12991 lnet: lnet response entries leak 96/36896/8
Alexey Lyashkov [Fri, 29 Nov 2019 10:43:15 +0000 (13:43 +0300)]
LU-12991 lnet: lnet response entries leak

LNetPut is called with the ACK flag, but LNetMDUnlink is issued before
the ACK arrives. This can be due to a timeout or an application call
(ldiskfs commit for difficult replies on the MDT).
The MD is freed but the rsp is not detached, as the ACK doesn't hold a
reference to the MD between the request being sent and the ACK
arriving. The monitor thread detects this situation and moves the RSP
entry onto the zombie list, which is never freed because no msg is
processed due to the MD's absence.

Remove the response tracking when nobody wants a reply, i.e. when
LNetMDUnlink is called.

Test-parameters: trivial

Cray-bug-id: LUS-8188
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I90ad88cea41bb28b29f909c85b8273d41464ce81
Reviewed-on: https://review.whamcloud.com/36896
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13049 lnet: peer lookup handle shutdown 25/36925/4
Amir Shehata [Wed, 4 Dec 2019 20:19:05 +0000 (12:19 -0800)]
LU-13049 lnet: peer lookup handle shutdown

When LNet is shutting down, looking up peer_nis shouldn't assert
but return NULL. Callers handle a NULL return.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia658f527719a71b2d0bed144ae03582eff54fcf9
Reviewed-on: https://review.whamcloud.com/36925
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12822 uapi: properly pack data structures 98/36798/11
James Simmons [Sat, 18 Jan 2020 15:12:38 +0000 (10:12 -0500)]
LU-12822 uapi: properly pack data structures

Linux UAPI headers use the gcc attribute __packed__ to ensure
that the data structures are the exact same size on all platforms.
This comes at the cost of potential misaligned accesses to these
data structures which at best cost performance and at worst cause
a bus error on some platforms. To detect potential misaligned
accesses, gcc version 9 introduced a new compile flag, which is
now impacting Lustre builds.

Examining the build failures shows most of the problems are due to
packed data structures in the Lustre UAPI header containing
unpacked data structure fields. Packing those missed structures
resolved many of the build issues. The second problem is that the
Lustre utilities tend to cast some of their UAPI data structures.
A good example is struct lov_user_md being cast to
struct lov_user_md_v3. To ensure this is properly handled with
packed data structures we need to use the __may_alias__ compiler
attribute. The one exception is struct statx, which is defined
outside of Lustre and is unpacked. This requires extra special
handling in userland code due to the issues described in this
comment.

Fixing this problem exposed an incorrect wiretest for
struct update_op.

The last problem addressed is the use of __swabXXp() on packed data
structure fields. Because of the potential alignment issues we
have to use __swabXX() functions instead.
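
A hedged userspace illustration of the by-value approach, assuming a
gcc-compatible compiler for the packed attribute.  sketch_wire_rec and
sketch_swab32() are hypothetical stand-ins, not Lustre UAPI types or
the kernel's __swab32().

```c
#include <assert.h>
#include <stdint.h>

/* Packed: 'value' sits at offset 1, so its address is misaligned. */
struct sketch_wire_rec {
	uint8_t  tag;
	uint32_t value;
} __attribute__((packed));

static_assert(sizeof(struct sketch_wire_rec) == 5, "no padding expected");

/* Byte-swap a 32-bit value passed by value. */
static uint32_t sketch_swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Swab the field by value.  Reading and writing rec->value through
 * the packed struct lets the compiler emit unaligned-safe accesses;
 * handing &rec->value to a helper that takes a plain uint32_t *
 * would lose that information.
 */
static void sketch_rec_swab(struct sketch_wire_rec *rec)
{
	rec->value = sketch_swab32(rec->value);
}
```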

Change-Id: I149c55d3361e893bd890f9c5e9c77c15f81acc1b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36798
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-7791 ldlm: signal vs CP callback race 98/19898/9
Andriy Skulysh [Tue, 3 May 2016 07:41:56 +0000 (10:41 +0300)]
LU-7791 ldlm: signal vs CP callback race

In case of interrupted wait for a CP AST
failed_lock_cleanup() sets LDLM_FL_LOCAL_ONLY, so
the client wouldn't cancel the lock on CP AST.

A lock isn't canceled on the server on reception

Cray-bug-id: LUS-2021
Change-Id: Id1e365b41f1fb8a0f9a32c0c929457b22ceba8ef
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/19898
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoRevert "LU-13120 build: Fix ZFS dependancies for osd-zfs-mount" 20/37320/3
Andreas Dilger [Thu, 23 Jan 2020 23:02:35 +0000 (23:02 +0000)]
Revert "LU-13120 build: Fix ZFS dependancies for osd-zfs-mount"

This reverts commit fb687e35402fa6755589657a67dbe30be09ba9c5.

All review-dne-zfs-part-[1234] sessions fail with the most recent
master landings, and this seems like the likely culprit.

Change-Id: Id0295d65a642e7c2ef6367dac72d89acfca8a6b4
Reviewed-on: https://review.whamcloud.com/37320
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
4 years agoLU-11276 ldlm: fix lock convert races 66/36466/11
Vitaly Fertman [Wed, 16 Oct 2019 16:07:56 +0000 (19:07 +0300)]
LU-11276 ldlm: fix lock convert races

The blocking cb may be triggered in parallel and the convert logic
of the DOM lock must be ready that the cancel_bits could be already
zeroed by the first executor.

As there may be several blocking cb parallel executors and several
conversion callers, each requesting for different inode bits, setup
the following logic:
- the lock keeps the aggregated set of bits requested for cancelling
  by different parties, where 0 means the whole lock is to be
  cancelled, and where the CBPENDING flag means there is a canceling
  job pending;
- once completed, the cancel_bits are zeroed and the CBPENDING flag
  is dropped, meaning the next request will be a part of the next job;
- once a local lock is converted, its state is changed appropriately
  and no cleanup is left for the interpret time as the lock is ready
  for the next usage;
- as the lock is unlocked in a process of conversion and more bits
  may appear, check it and repeat appropriately;
- let just 1 conversion executor to work at a time, others are waiting
  similar to ldlm_cli_cancel();
- there are others who may want to cancel unused locks (cancel_lru,
  cancel_resource_local), consider CANCELING as a request to cancel
  the full lock independently of the cancel_bits;

Some cleanups are done:
- move the cache drop logic to the CANCELING part of the blocking cb
  from the BLOCKING one;
- remove the convert RPC interpret, as the lock cleanups are already
  done in advance; the convert RPC is re-sendable and an error means
  there is a serious net problem;

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I901de34241704ed801152f071cb7f610fe6f4bfe
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36466
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13119 osd-ldiskfs: set f_cred for app armour 84/37184/4
James Simmons [Fri, 10 Jan 2020 14:30:47 +0000 (09:30 -0500)]
LU-13119 osd-ldiskfs: set f_cred for app armour

The function iterate_dir() interfaces with the security layer.
For some kernel versions on platforms that use AppArmor it
expects f_cred to be set. Currently osd-ldiskfs open codes the
creation of struct file so it is missing a cred. Fix this by
setting f_cred to the default current_cred().

Test-Parameters: testlist=sanity-lfsck serverdistro=sles12sp3

Change-Id: I38487e8ae99a0f70d6e430935b7d19523d414b4b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37184
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
4 years agoLU-13121 llite: fix deadlock in ll_update_lsm_md() 82/37182/2
Lai Siyao [Tue, 7 Jan 2020 13:30:38 +0000 (21:30 +0800)]
LU-13121 llite: fix deadlock in ll_update_lsm_md()

Deadlock may happen in the following scenario: a lookup process calls
ll_update_lsm_md(), finds that lli->lli_lsm_md is NULL, and then
calls down_write(&lli->lli_lsm_sem). But another lookup process
initialized lli->lli_lsm_md after this check and before the write
lock was taken, so the first lookup process calls
up_read(&lli->lli_lsm_sem) and returns. The write lock is therefore
never released, which causes subsequent lookups to deadlock.

Rearrange the code to simplify the locking:
1. take read lock.
2. if lsm was initialized and unchanged, release read lock and return.
3. otherwise release read lock and take write lock.
4. free current lsm and initialize with new lsm.
5. release write lock.
6. initialize stripes with read lock.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ifcc25a957983512db6f29105b5ca5b6ec914cb4b
Reviewed-on: https://review.whamcloud.com/37182
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13120 build: Fix ZFS dependancies for osd-zfs-mount 69/37169/3
Shaun Tancheff [Thu, 9 Jan 2020 11:57:29 +0000 (05:57 -0600)]
LU-13120 build: Fix ZFS dependancies for osd-zfs-mount

lustre-osd-zfs-mount depends on zfs
lustre-osd-zfs-mount depends on kmod-lustre-osd-zfs

SuSE packaging style prefers kmp package naming so prepare
for adopting a kmp named zfs package

Test-Parameters: trivial
Cray-bug-id: LUS-7077
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I510a46dd3d0e6d58a1e0db36226d412ee06016ec
Reviewed-on: https://review.whamcloud.com/37169
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13117 libcfs: fix to match right key in cfs_get_environ() 56/37156/4
Wang Shilong [Wed, 8 Jan 2020 01:45:27 +0000 (09:45 +0800)]
LU-13117 libcfs: fix to match right key in cfs_get_environ()

It does the memcmp() to match the environment variable
with the desired key, then accounts for the "=" when
calculating length. But it fails to check that the next
character is actually an equals sign, so a key that is
also the prefix of some other variable may match the wrong
variable.
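
The missing check can be sketched as follows.  This is an illustrative
userspace helper, not cfs_get_environ() itself; sketch_env_match() is a
hypothetical name, and strncmp() is used here instead of memcmp() so
the comparison stops at the end of a short entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the value part of a "KEY=value" entry if its name is exactly
 * 'key', or NULL otherwise.
 */
static const char *sketch_env_match(const char *entry, const char *key)
{
	size_t key_len;

	assert(entry != NULL && key != NULL);
	key_len = strlen(key);

	/* A prefix comparison alone would let "PATH" match "PATHEXT=...". */
	if (strncmp(entry, key, key_len) != 0)
		return NULL;
	/* The character after the key must be the separator. */
	if (entry[key_len] != '=')
		return NULL;
	return entry + key_len + 1;
}
```

For example, looking up "PATH" against the entry "PATHEXT=.exe" passes
the prefix comparison but is rejected by the separator check.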

Also add debug information for debugging similar issue
in the future.

Test-Parameters: trivial
Change-Id: Ia2b4ccd1f10c89059cecc224d4e2ba8d1d75b825
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37156
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13098 ptlrpc: supress connection restored message 86/37086/4
Alex Zhuravlev [Sat, 21 Dec 2019 15:40:20 +0000 (18:40 +0300)]
LU-13098 ptlrpc: supress connection restored message

if that happens on idling connection.

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I506665d427f3e77477f53e2d3059bcb1daaf0318
Reviewed-on: https://review.whamcloud.com/37086
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13092 lbuild: include lbuild-{fc,rhel,sles} to SIGNATURE 76/37076/4
Wang Shilong [Thu, 9 Jan 2020 01:34:28 +0000 (09:34 +0800)]
LU-13092 lbuild: include lbuild-{fc,rhel,sles} to SIGNATURE

We should include these files when calculating SIGNATURE since,
for example, kernel extra tags may be bumped there.

Test-Parameters: trivial
Change-Id: I2c62ad765d3c6a1b9e99affe3be95a404d6140c5
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37076
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13053 tests: fix conf-sanity call to umount_ldiskfs 49/36949/3
James Nunez [Fri, 6 Dec 2019 16:38:13 +0000 (09:38 -0700)]
LU-13053 tests: fix conf-sanity call to  umount_ldiskfs

conf-sanity test 87 calls umount_ldiskfs(), but the function
in test-framework.sh is unmount_ldiskfs().  We need to
change the function call in test 87 to unmount_ldiskfs().

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I3e0818a229341c4fab8aee923cad2253b7dd634d
Reviewed-on: https://review.whamcloud.com/36949
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: remove locking protection ln_testprotocompat 56/36856/4
Mr NeilBrown [Fri, 17 Jan 2020 19:36:51 +0000 (14:36 -0500)]
LU-12678 lnet: remove locking protection ln_testprotocompat

lnet_net_lock(LNET_LOCK_EX) is a heavy-weight lock that is not
necessary here.  The bits in this field are only set rarely - via an
ioctl - and the pattern for reading and clearing them exactly
matches test_and_clear_bit().  So change the field to "unsigned
long" (so test_and_clear_bit() can be used), and use
test_and_clear_bit(), discarding all other locking.
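
The read-and-clear pattern that test_and_clear_bit() provides can be
approximated in userspace with C11 atomics.  This is a sketch of the
contract only, not the kernel implementation; the sketch_* names are
assumptions.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Atomically clear bit 'nr' and report whether it was set.  A single
 * fetch-and does both steps, so no lock is needed around the
 * test and the clear.
 */
static bool sketch_test_and_clear_bit(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask;

	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	mask = 1UL << nr;
	return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

/* Set bit 'nr', as the rare ioctl path would. */
static void sketch_set_bit(unsigned int nr, atomic_ulong *word)
{
	assert(nr < sizeof(unsigned long) * CHAR_BIT);
	atomic_fetch_or(word, 1UL << nr);
}
```

Only one of several concurrent callers sees true for a given bit, which
is the property the exclusive lock was previously providing.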

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ie420fcb3d547d9ec04025b921d5b24bd8f2fcce3
Reviewed-on: https://review.whamcloud.com/36856
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 lnet: make "struct lnet_lnd" always "const". 32/36832/3
Mr NeilBrown [Sun, 24 Nov 2019 23:00:50 +0000 (10:00 +1100)]
LU-12678 lnet: make "struct lnet_lnd" always "const".

Every place where "struct lnet_lnd" appears, "const" is
added in front. Now all those structs can be in read-only
memory which is generally more secure.

Linux-commit 07499855083e ("lnet: make "struct lnet_lnd"
                            always "const".")

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I54a73d5b12de8c6b9a98182577c3c30d05c00222
Reviewed-on: https://review.whamcloud.com/36832
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12678 socklnd: initialize the_ksocklnd at compile-time. 31/36831/5
Mr NeilBrown [Wed, 15 Jan 2020 15:31:45 +0000 (10:31 -0500)]
LU-12678 socklnd: initialize the_ksocklnd at compile-time.

All other lnds initialize this struct at compile-time.
It is best for socklnd to do so too.

Test-Parameters: trivial testlist=sanity-lnet

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I3acd636f6f5ba783a2c60bf18ffc46c98e091c13
Reviewed-on: https://review.whamcloud.com/36831
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-11385 odbclass: Handle gracefully if nsproxy is NULL 02/36802/5
Serguei Smirnov [Tue, 19 Nov 2019 22:18:17 +0000 (14:18 -0800)]
LU-11385 odbclass: Handle gracefully if nsproxy is NULL

Gracefully handle the case if current->nsproxy is NULL:
check for the condition and return an error, avoiding attempts
to dereference the pointer.

Signed-off-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Change-Id: Ia102d2bacdb0e54b0339985396447e6d25465c56
Reviewed-on: https://review.whamcloud.com/36802
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12968 mgs: Prevent reading past end of buffer 53/36753/5
Shaun Tancheff [Thu, 14 Nov 2019 01:28:46 +0000 (19:28 -0600)]
LU-12968 mgs: Prevent reading past end of buffer

KASAN reported
  BUG: KASAN: slab-out-of-bounds in mgs_wlp_lcfg+0xb3/0x4a0 [mgs]
  Read of size 64 at addr ffff8880b8f9fe40 by task ll_mgs_0002/17603

On memory allocated here.
  mgs_write_log_target+0x2ae/0x910 [mgs]

In mgs_wlp_lcfg( ..., char *ptr) ptr is a string so use strlcpy
instead of memcpy to avoid reading past the end of the buffer
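
The difference can be sketched with a userspace stand-in for strlcpy().
sketch_strlcpy() is a hypothetical helper written for this
illustration; it is not the kernel function and not the mgs code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a NUL-terminated string into a fixed-size field.  Unlike
 * memcpy(dst, src, dst_size), this never reads beyond the source's
 * terminating NUL, so a short source allocation is safe.
 * Returns strlen(src) so callers can detect truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dst_size)
{
	size_t src_len;

	assert(dst != NULL && src != NULL);
	src_len = strlen(src);

	if (dst_size != 0) {
		size_t n = src_len < dst_size - 1 ? src_len : dst_size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return src_len;
}
```

A return value greater than or equal to dst_size tells the caller that
the string was truncated to fit.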

Cray-bug-id: LUS-8137
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I539c0b4d878d26c44f64a4cd5746a8fba1bef2fa
Reviewed-on: https://review.whamcloud.com/36753
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
4 years agoLU-12904 ldiskfs: Add ldiskfs support for linux 5.4 83/36583/6
Shaun Tancheff [Wed, 15 Jan 2020 13:47:00 +0000 (07:47 -0600)]
LU-12904 ldiskfs: Add ldiskfs support for linux 5.4

Linux 5.4 ext4 has some changes from 5.0. This fixes up
the ldiskfs patches so that they apply against 5.4.

Test-Parameters: trivial
Cray-bug-id: LUS-8042
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I116226ec9297eead4dfd3403be748f732e67f54f
Reviewed-on: https://review.whamcloud.com/36583
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-13141 ldiskfs: block alloc performance patch 50/37250/2
Shaun Tancheff [Wed, 15 Jan 2020 12:22:13 +0000 (06:22 -0600)]
LU-13141 ldiskfs: block alloc performance patch

Add the block alloc performance patch for the CentOS 7.7, 8.0 and
Ubuntu 19.04 (5.0) kernels.

Cray-bug-id: LUS-8402
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ifeb78839e5dbe8731bbb5532906708b97d4d9d33
Reviewed-on: https://review.whamcloud.com/37250
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 years agoLU-12791 kernel: kernel update RHEL 8.0 [4.18.0-80.11.2.el8_0] 27/36527/5
Jian Yu [Fri, 10 Jan 2020 18:01:04 +0000 (10:01 -0800)]
LU-12791 kernel: kernel update RHEL 8.0 [4.18.0-80.11.2.el8_0]

Update RHEL 8.0 kernel to 4.18.0-80.11.2.el8_0.

Test-Parameters: trivial clientdistro=el8 \
testlist=sanity

Change-Id: I4081719fa9a8c83ea0e8bff46dc9d54774cabb56
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36527
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>