Whamcloud - gitweb
fs/lustre-release.git
3 weeks ago New development branch for Lustre 2.14 2.13.50 v2_13_50
Oleg Drokin [Fri, 8 Nov 2019 16:47:17 +0000 (11:47 -0500)]
New development branch for Lustre 2.14

Change-Id: I5636e71d6618d23e72127c4154affec62718cdb4
Signed-off-by: Oleg Drokin <green@whamcloud.com>
3 weeks ago LU-12932 lod: rename qos_threshold_rr parameter 86/36686/6
James Simmons [Wed, 6 Nov 2019 18:19:59 +0000 (13:19 -0500)]
LU-12932 lod: rename qos_threshold_rr parameter

Rename the qos_thresholdrr parameter back to its original name of
qos_threshold_rr so that there is no interop breakage. Update the
test to handle mdt_qos_threshold_rr, which lines up with the names
of the qos_* sysfs files. Since we call kstrtouint() directly, we
have to eat the '%' that could be passed in.
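
As a rough illustration of eating the '%', here is a user-space sketch. The
helper name parse_percent() and the 0-100 range check are assumptions made
for this example, and strtoul() stands in for the kernel's kstrtouint(); it
is not the LOD sysfs code.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: accept "17", "17%" or "17%\n" and store the number
 * in *out. Returns 0 on success or -EINVAL on malformed input.
 */
static int parse_percent(const char *buf, unsigned int *out)
{
	char tmp[16];
	char *end;
	size_t len = strlen(buf);
	unsigned long val;

	/* tolerate the newline that "echo 17% > file" appends */
	if (len > 0 && buf[len - 1] == '\n')
		len--;
	/* eat a single trailing '%' */
	if (len > 0 && buf[len - 1] == '%')
		len--;
	if (len == 0 || len >= sizeof(tmp))
		return -EINVAL;

	memcpy(tmp, buf, len);
	tmp[len] = '\0';

	errno = 0;
	val = strtoul(tmp, &end, 10);
	/* assumed range: a percentage between 0 and 100 */
	if (errno != 0 || *end != '\0' || val > 100)
		return -EINVAL;

	*out = (unsigned int)val;
	return 0;
}
```

Stripping the suffix before the numeric conversion keeps the strict "no
trailing characters" check for everything else.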

Change-Id: I318a2ece6910e28a7a2331851d13b2269cf23e28
Fixes: c1d0a355a6a6 ("LU-12624 lod: alloc dir stripes by QoS")
Test-Parameters: trivial testlist=sanityn
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36686
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
4 weeks ago LU-12734 misc: allow older bash_completion versions 59/36459/3
Andreas Dilger [Tue, 15 Oct 2019 23:07:53 +0000 (17:07 -0600)]
LU-12734 misc: allow older bash_completion versions

Allow the "lctl" bash_completion to work on older versions which
don't have _init_completions().  Check at runtime if this
function is available, and if not fall back to an older interface.

Has been manually tested with both bash-completion v1.3 and v2.1.

Fixes: f87a7f2656ce ("LU-12734 misc: add bash completion for lctl set/get_param")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3822c0967354d83d12f299c4be3023b2fc254035
Reviewed-on: https://review.whamcloud.com/36459
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
4 weeks ago LU-12932 tests: remove obsolete qos.sh test script 66/36666/2
Andreas Dilger [Mon, 4 Nov 2019 19:21:29 +0000 (12:21 -0700)]
LU-12932 tests: remove obsolete qos.sh test script

The qos.sh test script is broken for a number of reasons:
- hard coded filesystem name
- uses old positional parameters for lfs setstripe
- sets parameters on client that should be set on MDS

and is duplicated by sanity test_116a.  Remove it.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3d4b795a65f6fbb4398f76f4a533d753700cab07
Reviewed-on: https://review.whamcloud.com/36666
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
4 weeks ago LU-12932 lod: restore qos_thresholdrr sysfs file 67/36667/2
James Simmons [Mon, 4 Nov 2019 21:53:25 +0000 (16:53 -0500)]
LU-12932 lod: restore qos_thresholdrr sysfs file

The introduction of directory stripe allocation by space usage
renamed the lod sysfs file qos_thresholdrr to lod_qos_thresholdrr
but this breaks backwards compatibility. Restore qos_thresholdrr.

Fixes: c1d0a355a6 ("LU-12624 lod: alloc dir stripes by QoS")

Change-Id: I93bf29cbec3c3a5a7a8527353aa8005ebd340ec5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36667
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
4 weeks ago LU-12895 tests: stop running tests for SSK and SELinux 16/36616/7
James Nunez [Wed, 30 Oct 2019 16:47:16 +0000 (10:47 -0600)]
LU-12895 tests: stop running tests for SSK and SELinux

There are a few tests that crash consistently when Shared Secret Key
(SSK) and/or SELinux are enabled.  We need to stop running them, by
adding them to the ALWAYS_EXCEPT list, until we can find a solution.

  sanity test 185, 230b, 230d, 272a
  recovery-small test 110k, 136

Test-Parameters: trivial
Test-Parameters: testgroup=review-dne-ssk
Test-Parameters: testgroup=review-dne-selinux
Test-Parameters: testgroup=review-dne-selinux-ssk
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I0af5f0c9d0d3c56a79e6558f2ce9f4e5a0a2d4c5
Reviewed-on: https://review.whamcloud.com/36616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
5 weeks ago LU-12903 doc: make PCC man pages 72/36572/4
James Nunez [Thu, 24 Oct 2019 22:54:56 +0000 (16:54 -0600)]
LU-12903 doc: make PCC man pages

Several man pages for the Persistent Client Cache
feature were not included in the doc/Makefile.am
file and, thus, they do not show up on the Lustre client.

Add the following man pages to the Makefile:
lctl-pcc.8
lfs-pcc-detach.1
llapi_pcc_attach.3
llapi_pcc_attach_fid.3
llapi_pcc_attach_fid_str.3
llapi_pcc_detach_fid.3
llapi_pcc_detach_fid_fd.3
llapi_pcc_detach_fid_str.3
llapi_pcc_detach_file.3
llapi_pccdev_get.3
llapi_pccdev_set.3
llapi_pcc_state_get.3
llapi_pcc_state_get_fd.3

Test-Parameters: trivial
Fixes: f172b1168857 ("LU-10092 llite: Add persistent cache on client")
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a7accb4ab77a9fcefda9f115a751ccbc35f9b7c
Reviewed-on: https://review.whamcloud.com/36572
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
5 weeks ago LU-12624 lod: alloc dir stripes by QoS 25/35825/13
Lai Siyao [Sun, 4 Aug 2019 18:08:02 +0000 (02:08 +0800)]
LU-12624 lod: alloc dir stripes by QoS

Similar to file OST object allocation, introduce directory stripe
allocation by space usage, but they don't share the same code because
of the many differences between them: file has mirrors, PFL, object
precreation; while for a directory, the first stripe is always on the
same MDT as its master object. The changes include:
* add lod_mdt_alloc_qos() to allocate stripes by space/inode usage.
* add lod_mdt_alloc_rr() to allocate stripes round-robin.
* add lod_mdt_alloc_specific() to allocate stripes in the old way.
* add sysfs support for lmv_desc field in LOD structure, and move
  those that remain in procfs to sysfs.
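
To illustrate the round-robin case under the constraint that the first
stripe stays on the master MDT, here is a minimal sketch. The function
alloc_stripes_rr() and its shared cursor are hypothetical and written for
this example only; this is not the body of lod_mdt_alloc_rr().

```c
#include <stddef.h>

/*
 * Hypothetical sketch: stripe 0 goes on the master MDT, the remaining
 * stripes are taken round-robin from a shared cursor, skipping the master
 * so that no MDT is used twice. Returns the number of stripes placed,
 * which is at most mdt_count.
 */
static size_t alloc_stripes_rr(unsigned int master, unsigned int mdt_count,
			       unsigned int *cursor, unsigned int *out,
			       size_t want)
{
	size_t n = 0;

	if (mdt_count == 0 || master >= mdt_count || want == 0)
		return 0;
	if (want > mdt_count)
		want = mdt_count;

	out[n++] = master;
	while (n < want) {
		unsigned int idx = *cursor % mdt_count;

		/* advance the shared cursor for the next caller */
		*cursor = (idx + 1) % mdt_count;
		if (idx == master)
			continue;
		out[n++] = idx;
	}
	return n;
}
```

Because the cursor persists between calls, consecutive directories start
their remaining stripes on different MDTs.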

This patch also changes LMV QoS code:
* mkdir by QoS if the user runs 'lfs mkdir -i -1 ...', or the
  parent directory's default LMV starting MDT index is -1.
* with the above change, 'space' hash flag is useless, remove all
  related code.
* previously 'lfs mkdir -i -1' QoS code is in lfs_setdirstripe(),
  but now it's done in LMV, remove the old code.

Update sanity 413a 413b to support QoS mkdir of both plain and
striped directories.

Update lfs-setdirstripe man to reflect the changes.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8f5f8e46faae68ffd9a49a4ac1d450e951e979c5
Reviewed-on: https://review.whamcloud.com/35825
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-12526 pcc: Auto attach for PCC during IO 05/36005/4
Qian Yingjin [Thu, 11 Jul 2019 08:37:55 +0000 (16:37 +0800)]
LU-12526 pcc: Auto attach for PCC during IO

PCC uses the layout lock to protect the cache validity. Currently
PCC only supports auto attach at the next open. However, the
layout lock can be revoked at any time by LRU/manual lock
shrinking or lock conflict callback.

For example, the layout lock can be revoked while performing I/O
after the file has been opened. At that point, the cached file is
detached involuntarily. I/O originally directed to PCC is
redirected to the OSTs after the data is restored into the OST
objects. This unwanted behavior can be expensive.

To avoid this problem, this patch implements auto attach for PCC
even during IOs (not only at the open time).

For debugging purposes, there are now three auto attach options:
- open_attach: auto attach at the next open;
- io_attach: auto attach during IO;
- stat_attach: auto attach at stat() call.

The stat_attach option is needed because checking the PCC state
via "lfs pcc state" not only opens the file but also calls stat()
on it. To verify auto attach during IO, both open_attach and
stat_attach must be disabled.

All of these auto attach options are enabled by default.

This patch also fixes a bug in auto cache at create time:
currently, the truncate operation revokes the LOOKUP ibits lock
and invalidates the file dentry cache. A following open with the
O_CREAT flag calls into ->atomic_open, where the file was wrongly
treated as newly created and auto cached. Once the client knows
the open is not a DISP_OPEN_CREATE, it should clean up the PCC
copy that was already created.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I1e0a84ca125f00076cf88ee26f9b7da8d17a960c
Reviewed-on: https://review.whamcloud.com/36005
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-12893 lnet: fix peer_ni selection 52/36552/2
Amir Shehata [Tue, 22 Oct 2019 18:27:24 +0000 (11:27 -0700)]
LU-12893 lnet: fix peer_ni selection

When selecting a peer-ni we must use the same peer NID
through all the messages which belong to the same RPC.
This is necessary in order to ensure we do the RDMA over
the optimal interface.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0391537da32bc6ac7a8a3d92e207bf172d111981
Reviewed-on: https://review.whamcloud.com/36552
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-12672 tests: Correctly determine mdccli in recovery-small test 66 27/35827/2
Oleg Drokin [Mon, 19 Aug 2019 02:56:31 +0000 (22:56 -0400)]
LU-12672 tests: Correctly determine mdccli in recovery-small test 66

As is, the filtering by awk apparently does not work and
we get errors like this:

error: get_param: param_path 'lustre-MDT0001-mdc-ffff880114af9800/mds_conn_uuid': No such file or directory
error: set_param: param_path 'mdc/lustre-MDT0000-mdc-ffff880114af9800
lustre-MDT0001-mdc-ffff880114af9800/import': No such file or directory

Test-Parameters: trivial testlist=recovery-small
Change-Id: Ibbcc79f71d2fa5966da90f0c8d0e98a3c5f2a964
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35827
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
5 weeks ago LU-12773 tests: sanity test_805 Use do_facet 04/36204/4
Oleg Drokin [Tue, 17 Sep 2019 05:23:26 +0000 (01:23 -0400)]
LU-12773 tests: sanity test_805 Use do_facet

do_node cannot really work with $SINGLEMDS, that's the
facet name.

This fixes error message below (and a following syntax error):
mds1: ssh: Could not resolve hostname mds1: Name or service not known

Fixes: 106abc184d8b ("LU-8856 osd: mark specific transactions netfree")
Test-Parameters: trivial fstype=zfs
Change-Id: I0d842dbccbfd934c524ae01cca7399dd52158064
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36204
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
5 weeks ago LU-10070 utils: move new SEL find_param fields to end 54/36554/4
Andreas Dilger [Tue, 22 Oct 2019 23:31:22 +0000 (17:31 -0600)]
LU-10070 utils: move new SEL find_param fields to end

Move the new fp_ext_size and fp_ext_size_units fields to the end
of struct find_param so that they don't break the ABI for the
llapi_find() and llapi_getstripe() functions that use it.

Add "unused" fields for the sign and exclude bitfields, so that
it is clear how many more can still be used before we need to
add new fields.
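
The ABI argument can be checked with offsetof(). The structs below are
heavily trimmed, hypothetical stand-ins for struct find_param (the field
types are assumptions); they only show that appending fields keeps the
offsets of existing fields stable, while inserting them in the middle does
not.

```c
#include <stddef.h>

/* Layout that already-compiled callers were built against. */
struct fp_old {
	int fp_max_depth;
	unsigned long long fp_size;
};

/* New fields inserted in the middle: fp_size moves. */
struct fp_inserted {
	int fp_max_depth;
	unsigned long long fp_ext_size;
	unsigned long long fp_ext_size_units;
	unsigned long long fp_size;
};

/* New fields appended: every pre-existing offset is unchanged. */
struct fp_appended {
	int fp_max_depth;
	unsigned long long fp_size;
	unsigned long long fp_ext_size;
	unsigned long long fp_ext_size_units;
};

/* Compile-time checks (C11) of the two claims above. */
_Static_assert(offsetof(struct fp_old, fp_size) ==
	       offsetof(struct fp_appended, fp_size),
	       "appending must not move existing fields");
_Static_assert(offsetof(struct fp_old, fp_size) !=
	       offsetof(struct fp_inserted, fp_size),
	       "inserting in the middle moves existing fields");
```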

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib15af374774050a9e5b224f7edc7523fdae570c1
Reviewed-on: https://review.whamcloud.com/36554
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
5 weeks ago LU-11607 tests: replace lustre_version/fstype - large-lun 80/36380/2
James Nunez [Fri, 4 Oct 2019 17:54:56 +0000 (11:54 -0600)]
LU-11607 tests: replace lustre_version/fstype - large-lun

The routine get_lustre_env() is available to all Lustre test
suites and sets environment variables for the Lustre version
installed on servers and clients.

Replace calls to lustre_version_code() and facet_fstype()
for all server types with definitions from get_lustre_env()
for the large-lun, lfsck-performance, sanity-selinux and
scrub-performance test suites.

While doing this, replace ‘$SINGLEMDS’ with ‘MDS1_VERSION’
in lustre_version_code() and facet_fstype().

Test-Parameters: trivial fstype=ldiskfs testlist=sanity-selinux,scrub-performance
Test-Parameters: fstype=zfs testlist=ldiskfs testlist=sanity-selinux,scrub-performance
Test-Parameters: fstype=ldiskfs testlist=large-lun,lfsck-performance
Test-Parameters: fstype=zfs testlist=ldiskfs testlist=large-lun,lfsck-performance
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ie1a04103b8d721ab20992ed0a9afb3a399270937
Reviewed-on: https://review.whamcloud.com/36380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
5 weeks ago LU-12803 libcfs: bump module version 88/36488/2
James Simmons [Fri, 18 Oct 2019 13:31:00 +0000 (09:31 -0400)]
LU-12803 libcfs: bump module version

The Linux client version of libcfs is further ahead in its
cleanup, so its module version is higher. This prevents the
OpenSFS version of libcfs from loading, and since OpenSFS is
currently ahead of the Linux client we prefer to use it at this
time. Just increase the OpenSFS libcfs module version to be
slightly ahead of the Linux client.

Test-Parameters: trivial

Change-Id: Ie57d93529bf25d908099f7dab06d2960f9923d58
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/36488
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
6 weeks ago LU-11768 test: make at_max to take effect 31/36431/4
Hongchao Zhang [Thu, 10 Oct 2019 20:22:25 +0000 (16:22 -0400)]
LU-11768 test: make at_max to take effect

In test_6 of sanity-quota, "at_max" won't affect "at_current"
if no RPC is sent on that import, so the following DQACQ
request still has a larger timeout value and triggers the
watchdog.

Fixes: d8226b93 ("LU-11768 test: limit at_max to timeout in time")
Test-Parameters: trivial \
testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: Iccc969459647aa70da6f6ecb0d8d13a404bf8088
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36431
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12026 mdt: MDS stores atime|mtime|ctime during close 86/36286/9
Qian Yingjin [Wed, 25 Sep 2019 09:14:12 +0000 (17:14 +0800)]
LU-12026 mdt: MDS stores atime|mtime|ctime during close

In order to make direct inode scanning on the MDT useful, in
addition to storing the file size/blocks via LSOM on the MDT, we
also need to store the atime/mtime/ctime on the MDT inodes.

Currently the atime is already lazily updated on the MDS (at
close time). In this patch, the final mtime/ctime are sent to the
MDS at close time and updated on the MDT inode, which makes
MDT-only scanning workable.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I4465281a03d70919c388cb241c16eebcb03e850f
Reviewed-on: https://review.whamcloud.com/36286
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12799 ptlrpc: return proper error code 82/36282/4
Alex Zhuravlev [Tue, 24 Sep 2019 20:29:01 +0000 (23:29 +0300)]
LU-12799 ptlrpc: return proper error code

Return a proper error code from ptlrpc_disconnect_prep_req()
using ERR_PTR(), as the callers expect.
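
For readers unfamiliar with the idiom, here is a user-space re-creation of
the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers. The struct request and
prep_req() names are hypothetical and only illustrate a function whose
callers test the result with IS_ERR() rather than against NULL.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in the top page of the address space. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct request {
	int opcode;
};

/* Hypothetical prep function: never returns NULL, only a request or an
 * encoded error, so callers need a single IS_ERR() check. */
static struct request *prep_req(int simulate_failure)
{
	struct request *req;

	if (simulate_failure)
		return ERR_PTR(-ENOMEM);

	req = malloc(sizeof(*req));
	if (req == NULL)
		return ERR_PTR(-ENOMEM);
	req->opcode = 9;
	return req;
}
```

A caller that checks IS_ERR() would treat a plain NULL return as a valid
pointer, which is why the return convention has to match what the callers
expect.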

Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Change-Id: I5493194a1f18f3d0b559921b7859bf835585ba58
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36282
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
6 weeks ago LU-12025 osp: allow OS_STATE_* flags from OSTs 29/35029/8
Andreas Dilger [Thu, 28 Feb 2019 00:37:08 +0000 (17:37 -0700)]
LU-12025 osp: allow OS_STATE_* flags from OSTs

Allow OS_STATE_* flags to be sent from the OST, so that the
OS_STATE_NOPRECREATE can be used to prevent a newly-added OST
from being used until it is ready.  Add the "no_precreate"
parameter on the OFD that can be set from userspace.

Close a race in the cached opd_statfs.os_state handling in
osp_pre_update_statfs().  It was being overwritten by the
new statfs data from the OST, but was globally visible for a
short time to the precreate threads before the OS_STATE_*
flags were set on the cached statfs data again.

Similarly, there was a race with updating the opd_pre_status
if the OST was out of space, where it would be cleared after
a successful statfs, and wouldn't be set to -ENOSPC until a
short time later.

Split osp_pre_update_status() into osp_pre_update_msfs() that
only copies the statfs data into the cache after all of the
flags are set.  Don't clear flags from the cache, they will
only be cleared when new statfs data is sent.
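
The "set all flags, then copy into the cache" ordering can be sketched as
follows. The struct, flag values and function name are hypothetical, and
the sketch ignores locking; it only shows that readers of the cache never
see fresh statfs data without its state flags.

```c
#include <stdint.h>

/* Hypothetical flag values, for illustration only. */
#define STATE_NOPRECREATE 0x1u
#define STATE_ENOSPC      0x2u

struct statfs_cache {
	uint64_t bavail;	/* free blocks reported by the target */
	uint32_t state;		/* state flags visible to readers */
};

/*
 * Build the new value in a private copy, apply the locally computed
 * flags, and only then publish it in one step.
 */
static void cache_update(struct statfs_cache *cache,
			 const struct statfs_cache *fresh,
			 uint32_t local_flags)
{
	struct statfs_cache tmp = *fresh;

	tmp.state |= local_flags;
	*cache = tmp;
}
```

The racy variant would copy *fresh into the cache first and OR in the
flags afterwards, leaving a window where the flags are missing.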

Add a test that the 'N'OPRECREATE flag appears in "lfs df".

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9c1c7a097f3de8edfdeef2b437f40936e73ebbe5
Reviewed-on: https://review.whamcloud.com/35029
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12842 utils: llog_print with snapshot name 14/36414/2
Andreas Dilger [Wed, 9 Oct 2019 17:26:24 +0000 (11:26 -0600)]
LU-12842 utils: llog_print with snapshot name

The lsnapshot utility creates filesystems named with generated
hexadecimal strings.  In some cases the filesystem name may start
with a number instead of a character, which causes "lctl llog_print"
(via llog_ioctl()) to consider the filesystem name invalid.

Allow filesystem names in llog_ioctl() to start with a digit.
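
A name check that accepts a leading digit might look like the sketch
below. The function name, the 8-character limit and the allowed character
set are assumptions for this example, not the rules implemented in
llog_ioctl().

```c
#include <ctype.h>
#include <string.h>

#define FSNAME_MAX 8	/* assumed maximum length */

/*
 * Hypothetical validator: returns 1 if every character is alphanumeric,
 * '_' or '-'. The first character gets no special treatment, so a
 * generated hexadecimal name such as "0a1b2c3d" is accepted.
 */
static int fsname_valid(const char *name)
{
	size_t i, len = strlen(name);

	if (len == 0 || len > FSNAME_MAX)
		return 0;
	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		if (!isalnum(c) && c != '_' && c != '-')
			return 0;
	}
	return 1;
}
```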

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib2054d5afbeaa3f661148fff834c29f83f5d98ad
Reviewed-on: https://review.whamcloud.com/36414
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-9629 utils: fix lfs_migrate for non-root users 83/36383/5
Andreas Dilger [Sat, 5 Oct 2019 06:03:51 +0000 (00:03 -0600)]
LU-9629 utils: fix lfs_migrate for non-root users

Allow lfs_migrate to work with non-root users even when there are
hard-linked files.  The use of "lfs fid2path" is only strictly
needed if "lfs migrate" is not working and the script falls back
to using "rsync" to migrate the hard-linked files.  In the common
case, "lfs migrate" will preserve the links to the file and all
that is needed is "path2fid" to record which FIDs have already
been migrated so that they are not migrated again.

There is no need to track files with only one link, so none of
this FID-handling infrastructure is needed in the common case.

Don't get the mountpoint (via "df") for each hard-linked file within
a single filesystem (which is normally all files).  This is only
needed if files are on different mountpoints, which can be detected
by the device number returned by stat(1) on the file.  Cache the
device number across stat calls, and if it doesn't change then use
the same mountpoint for the fid2path call.

Add named variables to index the fields in the "nlink_type" array to
make it easier to see what is being accessed and avoid bugs.

Fixes: 80a2ff7137d3 ("LU-6051 utils: allow lfs_migrate to handle hard links")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: If37d9f73bd1e2ff261fdcfb5248b9e51ae42bd13
Reviewed-on: https://review.whamcloud.com/36383
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12784 llite: limit max xattr size by kernel value 40/36240/9
Andreas Dilger [Sat, 5 Oct 2019 08:06:24 +0000 (02:06 -0600)]
LU-12784 llite: limit max xattr size by kernel value

Limit the maximum xattr size returned to userspace from the MDS to
what the currently-running kernel supports (XATTR_SIZE_MAX=65536
bytes typically).  While it is possible a Lustre backing filesystem
may store larger xattrs than this, it wouldn't be possible for users
to access a larger xattr via kernel xattr interfaces.
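
In other words, the client reports the smaller of the two limits. A
minimal sketch, with a hypothetical function name and the typical
XATTR_SIZE_MAX value quoted above defined locally:

```c
#include <stddef.h>

#ifndef XATTR_SIZE_MAX
#define XATTR_SIZE_MAX 65536	/* typical kernel value, see text above */
#endif

/* Hypothetical helper: clamp the server's limit to the local kernel's. */
static size_t client_max_xattr_size(size_t server_max)
{
	return server_max < XATTR_SIZE_MAX ? server_max : XATTR_SIZE_MAX;
}
```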

This fixes interop problems when newer clients and tests are running
against older servers:

  sanity.sh: line 8946: /usr/bin/setfattr: Argument list too long

Skip subtests for new features in 2.13 so 2.12 interop testing passes.

Fix test-framework.sh::large_xattr_enabled() to return true for ZFS.
Fix test-framework.sh::max_xattr_size() to return the actual value
returned from the MDS rather than computing it locally.

Fixes: 3ec712bd183 ("LU-11868 osd: Set max ea size to XATTR_SIZE_MAX")
Test-Parameters: trivial serverversion=2.12 testlist=sanity
Test-Parameters: serverversion=2.12 testlist=conf-sanity envdefinitions=ONLY=81
Test-Parameters: testlist=sanity-pfl,replay-single
Test-Parameters: testlist=conf-sanity envdefinitions=ONLY=48,61,81
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I14232809b13886efa8f11a50ecc35e78f316810d
Reviewed-on: https://review.whamcloud.com/36240
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
6 weeks ago LU-12275 sec: reserve flags for client side encryption 60/36360/5
Sebastien Buisson [Thu, 3 Oct 2019 14:35:11 +0000 (14:35 +0000)]
LU-12275 sec: reserve flags for client side encryption

Reserve OBD_CONNECT2_ENC connection flag so that 'encrypt' or
'test_dummy_encryption' client mount options can only be used if
server side knows how to handle encrypted object size properly.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I42d0b597df3b68bd1de19394104e7fda1b76bf6c
Reviewed-on: https://review.whamcloud.com/36360
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12593 osd: zeroing a freshly allocated block buffer 29/35629/5
Alexander Boyko [Fri, 26 Jul 2019 14:13:21 +0000 (10:13 -0400)]
LU-12593 osd: zeroing a freshly allocated block buffer

Ldiskfs zeroes a new buffer only when it is not uptodate.
In rare cases we can get a new buffer head with the uptodate flag set.
This may cause file corruption for writes at non-zero offsets,
especially for internal Lustre files like update_log, CATALOGS,
lov_objid.

od_fld_lookup()) lustre-MDT0001-mdtlov: invalid FID [0x0:0x50:0x0]

The patch adds zeroing under i_mutex for unmapped blocks.

Performance results are included because the patch adds a mutex to
the creation path (lov_objid file).
40 tasks, 2000000 files
SUMMARY: (of 5 iterations)
Operation             Max        Min       Mean   Std Dev
---------             ---        ---       ----   -------
Without fix
File creation:  39990.601  19020.238  27443.823  6909.605
With fix
File creation:  37958.809  21708.187  27065.855  5900.961

Cray-bug-id: LUS-6132
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ica8fbe29b5a7253d553b41a41ffe5d8d8b4b2e55
Reviewed-on: https://review.whamcloud.com/35629
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12625 build: reliable detection of struct timespec64 75/35675/9
Alexey Zhuravlev [Fri, 2 Aug 2019 09:16:22 +0000 (12:16 +0300)]
LU-12625 build: reliable detection of struct timespec64

The existing configure check defines struct inode on the stack, and
this may cause the following error with gcc8:
build/conftest.c: In function main
build/conftest.c:226:1: error: the frame size of 1032 bytes is
larger than 1024 bytes [-Werror=frame-larger-than=]
which produces a false result for the check, and then osd-ldiskfs
doesn't build.

Put struct inode * on the stack instead.

Change-Id: If31cfd13836e36ef59d428d3c05bf7f51319f89b
Signed-off-by: Alexey Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35675
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12859 llite: clear flock when using localflock 52/36452/4
Andreas Dilger [Mon, 14 Oct 2019 22:29:30 +0000 (06:29 +0800)]
LU-12859 llite: clear flock when using localflock

When mounting a client with "-o localflock" or equivalent option in
/etc/fstab, it does not clear out the "flock" mount option flag from
the superblock.  This results in "flock" still being the option used
and it displays both options in the /proc/mounts output:

  10.0.0.1@o2ib:/lfs on /mnt/lfs type lustre (rw,flock,localflock)

Mount a client with both "flock,localflock" as mount options and
verify that the "flock" option is cleared by "localflock", and
vice versa.  Verify that "noflock" clears both options.
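
The intended option handling can be sketched with two mutually exclusive
bits. The flag names, values and the apply_mount_option() helper are
hypothetical and stand in for the superblock flags used by llite.

```c
#include <string.h>

/* Hypothetical flag bits, for illustration only. */
#define SBI_FLOCK      0x1u
#define SBI_LOCALFLOCK 0x2u

/*
 * Apply one mount option to the current flags. Setting either lock mode
 * clears the other; "noflock" clears both. Unknown options are ignored.
 */
static unsigned int apply_mount_option(unsigned int flags, const char *opt)
{
	if (strcmp(opt, "flock") == 0)
		return (flags & ~SBI_LOCALFLOCK) | SBI_FLOCK;
	if (strcmp(opt, "localflock") == 0)
		return (flags & ~SBI_FLOCK) | SBI_LOCALFLOCK;
	if (strcmp(opt, "noflock") == 0)
		return flags & ~(SBI_FLOCK | SBI_LOCALFLOCK);
	return flags;
}
```

With this handling, whichever of "flock" and "localflock" appears last
wins, so both can never be reported at once.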

Remove the "remount_client()" helper in conf-sanity.sh, since this
shadows a helper function of the same name in test-framework.sh and
is confusing.  Instead, use "mount_client()" now that it can accept
mount options, and just pass "remount" explicitly in a few places.

Fixes: 3613af3e15cb ("LU-10885 llite: enable flock mount option by default")
Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie31b0c4f6674c99d3ed5b73caa39cfc23d3ebbe5
Reviewed-on: https://review.whamcloud.com/36452
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-11997 ptlrpc: Properly swab ll_fiemap_info_key 08/36308/7
Oleg Drokin [Fri, 27 Sep 2019 14:23:18 +0000 (10:23 -0400)]
LU-11997 ptlrpc: Properly swab ll_fiemap_info_key

It was using lustre_swab_fiemap which is incorrect since the
structures don't match.

Added lustre_swab_fiemap_info_key that swabs embedded
obdo and ll_fiemap_info_key structures.
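
The reason a swab routine must match the structure is that byte swapping
is done field by field, at each field's own width. The sketch below uses
hypothetical structs (they do not reproduce ll_fiemap_info_key or obdo) to
show a swab function for a structure with an embedded sub-structure.

```c
#include <stdint.h>

static uint32_t swab32(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

static uint64_t swab64(uint64_t v)
{
	return ((uint64_t)swab32((uint32_t)v) << 32) |
	       swab32((uint32_t)(v >> 32));
}

/* Hypothetical embedded structure. */
struct inner_key {
	uint64_t start;
	uint64_t length;
	uint32_t flags;
	uint32_t pad;
};

/* Hypothetical outer structure that embeds inner_key. */
struct outer_key {
	uint32_t name_id;
	uint32_t pad;
	struct inner_key inner;
};

/* Swab every field at its own width, including the embedded struct. */
static void swab_outer_key(struct outer_key *k)
{
	k->name_id = swab32(k->name_id);
	k->inner.start = swab64(k->inner.start);
	k->inner.length = swab64(k->inner.length);
	k->inner.flags = swab32(k->inner.flags);
}
```

Applying a swab routine written for a different layout would swap the
wrong byte ranges, which only shows up between hosts of different
endianness.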

Change-Id: Ie701163bd4c2072a0461b2d9485bc184c6548f8f
Test-Parameters: clientarch=ppc64 testlist=sanity,sanityn
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36308
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
6 weeks ago LU-12703 utils: reset rootpath in llapi_search_rootpath() 35/36335/4
Alex Zhuravlev [Mon, 30 Sep 2019 20:50:49 +0000 (23:50 +0300)]
LU-12703 utils: reset rootpath in llapi_search_rootpath()

Reset rootpath in llapi_search_rootpath(), as get_root_path() can
use it as a source and fail if the passed pathname contains garbage
(on stack).

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9f628353c872afc82a582b0a6ca960cd0e8cffcb
Reviewed-on: https://review.whamcloud.com/36335
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12624 obdclass: lu_tgt_descs cleanup 24/35824/5
Lai Siyao [Sat, 3 Aug 2019 21:00:33 +0000 (05:00 +0800)]
LU-12624 obdclass: lu_tgt_descs cleanup

This patch cleans up the lu_tgt_descs code, so that it's cleaner
to add MDT object QoS allocation support:
* rename struct ost_pool to lu_tgt_pool.
* put struct lu_qos, lmv_desc/lov_desc and lu_tgt_pool into struct
  lu_tgt_descs because it's more natural to manage these data there
  and fewer arguments are needed to pass around in related functions.
* remove lu_tgt_descs.ltd_tgtnr, use
  lu_tgt_descs.ltd_lov_desc.ld_tgt_count instead, because they are
  duplicates.
* other cleanups.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I46f2e0ff06a8e580bac1dfda9a09a549b38d487d
Reviewed-on: https://review.whamcloud.com/35824
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks ago LU-12328 flr: avoid reading unhealthy mirror 52/34952/14
Bobi Jam [Fri, 24 May 2019 17:40:25 +0000 (01:40 +0800)]
LU-12328 flr: avoid reading unhealthy mirror

* Fix an error in lov_io_mirror_init() which would wait unnecessarily
  if we're retrying the last mirror of the file.

* In osc_io_iter_init(), check the OSC import status so that the
  read path can quickly switch to another mirror.
  sanity-flr test_33b is added to test this case.

* Once all mirrors have been tried, turn off the quick switch
  so that when all mirrors contain bad OSTs, the read will still try
  its best to get partial data from a component before trying another
  mirror.
  sanity-flr test_33c is added to test this case.

Test-Parameters: envdefinitions=ONLY="33" testlist=sanity-flr,sanity-flr,sanity-flr,sanity-flr,sanity-flr,sanity-flr,sanity-flr,sanity-flr,sanity-flr,sanity-flr
Fixes: 5a6ceb664f07 ("LU-7236 ptlrpc: idle connections can disconnect")
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I5621a834e58ee1bfccf6c407d2c68357b5c3eb3b
Reviewed-on: https://review.whamcloud.com/34952
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
6 weeks agoLU-12838 ptlrpc: fix watchdog ratelimit logic 09/36409/3
Andreas Dilger [Tue, 8 Oct 2019 21:42:32 +0000 (15:42 -0600)]
LU-12838 ptlrpc: fix watchdog ratelimit logic

The ptlrpc-level watchdog ratelimiting is broken. The kernel prints:

    mdt00_009: service thread pid 18935 was inactive for 72s.
    Watchdog stack traces are limited to 3 per 300s, skipping...

even though there hasn't been any stack trace printed before.

It looks like the __ratelimit() return value is backward from
what one would expect from normal English grammar, namely that
if __ratelimit() returns true the action should NOT be limited.

Fix the logic checking the __ratelimit() return value, and add a
check in sanity test_422 (which forces a service thread timeout)
to ensure that the watchdog sometimes prints a full stack.

Fixes: fc9de679a4c2 ("LU-9859 libcfs: add watchdog for ptlrpc service threads")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4a97dd361c12ac7c7a39c251551c21506b3ebbe5
Reviewed-on: https://review.whamcloud.com/36409
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
7 weeks agoLU-12844 lnet: fix strncpy bound error 17/36417/2
Jian Yu [Wed, 9 Oct 2019 21:30:49 +0000 (14:30 -0700)]
LU-12844 lnet: fix strncpy bound error

This patch fixes the following error while using gcc 8:

liblnetconfig.c: In function ‘lustre_lnet_parse_nids’:
liblnetconfig.c:320:3: error: ‘strncpy’ specified bound depends on
the length of the source argument [-Werror=stringop-overflow=]
   strncpy(entry, cur, len - 1);
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
liblnetconfig.c:310:10: note: length computed here
    len = strlen(cur) + 1;
          ^~~~~~~~~~~
cc1: all warnings being treated as errors

Change-Id: I2d5840fd58c7b7d27ef1b2aa12f1f187d30abbfd
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36417
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
7 weeks agoLU-12845 utils: fix typos in 'lctl pcc' help 16/36416/2
James Nunez [Wed, 9 Oct 2019 20:26:20 +0000 (14:26 -0600)]
LU-12845 utils: fix typos in 'lctl pcc' help

The help message for 'lctl pcc ...' has typos that
should be corrected.

Test-Parameters: trivial
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I880a126d38a7cf9b2fc65d1a05a5d4eb554be1e5
Reviewed-on: https://review.whamcloud.com/36416
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 weeks agoLU-10496 tests: enable 39k for DoM 52/36352/4
Mikhail Pershin [Wed, 2 Oct 2019 05:26:18 +0000 (08:26 +0300)]
LU-10496 tests: enable 39k for DoM

The test was temporarily disabled; re-enable it now that all needed
fixes have landed.

This reverts commit 68dd8a8acff9ad2295a1fcba318fc8ed5f140026.

Test-Parameters: trivial testlist=sanity-dom,sanity-dom,sanity-dom
Test-Parameters: fstype=zfs testlist=sanity-dom,sanity-dom,sanity-dom
Test-Parameters: fstype=zfs testlist=sanity-dom,sanity-dom,sanity-dom
Test-Parameters: fstype=zfs testlist=sanity-dom,sanity-dom,sanity-dom
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ib230d2849450bb4642ae8a286a84e501e0dafde1
Reviewed-on: https://review.whamcloud.com/36352
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
8 weeks agoLU-12760 tests: stack_trap defaults to sigspec=EXIT 86/36186/2
Andreas Dilger [Sat, 14 Sep 2019 07:46:44 +0000 (01:46 -0600)]
LU-12760 tests: stack_trap defaults to sigspec=EXIT

If the "sigspec" argument is not specified for stack_trap(), default
to "EXIT" as the signal, since this is what we use for all callers
of stack_trap() today anyway.
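
As a rough illustration of the default described above, the bash
sketch below stacks handlers and falls back to "EXIT" when no sigspec
is given. stack_trap_sketch and the STACKED_* variables are
hypothetical names; this is a simplified stand-in under those
assumptions, not the test-framework.sh implementation.

```shell
#!/bin/bash
# Hypothetical, simplified stand-in for stack_trap(); an assumption
# for illustration, not the test-framework.sh code.
stack_trap_sketch() {
	local action="$1"
	local sigspec="${2:-EXIT}"	# default to EXIT when omitted
	local var="STACKED_${sigspec}"

	# Newest handler runs first, followed by the earlier ones.
	printf -v "$var" '%s' "${action}${!var:+; ${!var}}"
	trap "${!var}" "$sigspec"
}

# Both handlers run when the subshell exits, most recent first.
demo() (
	stack_trap_sketch "echo first"
	stack_trap_sketch "echo second"
)
demo
```

Neither call passes a sigspec, so both handlers are stacked on EXIT;
when the subshell in demo exits, "second" is printed before "first".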

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I2c8d986cdf8743e1d956cd7941a47bd4cd772592
Reviewed-on: https://review.whamcloud.com/36186
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
8 weeks agoLU-12712 lod: fix warning message for non-SEL file 51/36351/3
Andreas Dilger [Tue, 1 Oct 2019 21:26:34 +0000 (15:26 -0600)]
LU-12712 lod: fix warning message for non-SEL file

The warning message printed when the LCME_FL_EXTENSION flag is set on
a non-SEL file was incorrectly checking the component magic (usually
LOV_MAGIC_V3) instead of the file magic (should be LOV_MAGIC_SEL).
Don't redefine "magic" for each component, since this is only used in
one other place to mean the component magic, and use the actual file
or component magic in the few places where this is checked.

Fix the warning message to be rate-limited on the console by using
CWARN(...) instead of CDEBUG(D_WARNING, ...), though it is somewhat
questionable whether this should be a console message at all as there
is nothing that the administrator can do about this problem and it
doesn't appear to have any side-effects.  Also print the component
ID/index and restructure the message text so that it is more clear
about what the actual problem is.

Fix a few minor style issues in nearby code.

Test-Parameters: trivial testlist=sanity-pfl
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I81c3f9914512b1959b8483bb2b988ea4597cab07
Reviewed-on: https://review.whamcloud.com/36351
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 weeks agoLU-12800 mdd: do not cache attributes on mdd_parent_fid 87/36287/2
Andriy Skulysh [Mon, 5 Aug 2019 15:24:05 +0000 (18:24 +0300)]
LU-12800 mdd: do not cache attributes on mdd_parent_fid

mdd_is_parent() brings the link xattrs of unlocked objects into the
osp cache.

Invalidate the osp cache built during mdd_is_subdir().

Change-Id: Id9e34af3ff4712af9d4f3ae984e8082448e5fd3f
Cray-bug-id: LUS-7634
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-on: https://review.whamcloud.com/36287
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 weeks agoLU-12769 recovery: use monotonic timer 74/36274/12
Alex Zhuravlev [Mon, 23 Sep 2019 08:26:19 +0000 (11:26 +0300)]
LU-12769 recovery: use monotonic timer

Use the monotonic clock instead of the real-time (wall clock) one.
Also use absolute values for the timer.

One of the reasons for the move from jiffies based timer
to a hrtimer timer was to avoid the issue of time drift.
It was discovered due to test failures with recovery on
VMs that the high resolution wall clock can drift as well.
Moving to the monotonic clock for the hrtimer avoids this
drift completely and it is safe to use since the recovery
timestamp is not shared between nodes.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8b75121934c229dec8df7be0a4e69c1cda940d3f
Reviewed-on: https://review.whamcloud.com/36274
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 weeks agoLU-12691 ldlm: obd_max_recoverable_clients is not atomic 14/35914/3
Tatsushi Takamura [Mon, 26 Aug 2019 00:12:37 +0000 (09:12 +0900)]
LU-12691 ldlm: obd_max_recoverable_clients is not atomic

Originally, obd_max_recoverable_clients was never increased by more
than one process at a time. Since LU-3540, however, it can be
increased by multiple processes concurrently.

The type of obd_max_recoverable_clients should therefore be
atomic_t, and it should be updated with atomic operations.

Signed-off-by: Tatsushi Takamura <takamr.tatsushi@jp.fujitsu.com>
Change-Id: I9a67bbbfacab2e05858243f649e3a4e0d4b5d7f7
Reviewed-on: https://review.whamcloud.com/35914
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 weeks agoLU-12824 o2ib: Reintroduce kiblnd_dev_search 26/36326/4
Chris Horn [Mon, 30 Sep 2019 15:04:10 +0000 (10:04 -0500)]
LU-12824 o2ib: Reintroduce kiblnd_dev_search

If we add an interface to multiple nets then we need to re-use the
struct ib_dev object for each of the nets.

Cray-bug-id: LUS-7935
Fixes: 75ab841 ("LU-11893 lnet: consoldate secondary IP address handling")
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I1790e24458f47d632fd137b78de076d408fe5260
Reviewed-on: https://review.whamcloud.com/36326
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
8 weeks agoLU-12824 o2ib: Record rc in debug log on startup failure 25/36325/4
Chris Horn [Mon, 30 Sep 2019 15:03:06 +0000 (10:03 -0500)]
LU-12824 o2ib: Record rc in debug log on startup failure

Since kiblnd_startup() returns -ENETDOWN on failure, let's record the
rc value for the failure case in the debug log.

Cray-bug-id: LUS-7935
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ied934642bc567b8d3f51293d7dd095d47ff134df
Reviewed-on: https://review.whamcloud.com/36325
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12802 tests: speedup cleanup of racer 89/36289/3
Li Xi [Wed, 10 Feb 2016 14:37:00 +0000 (22:37 +0800)]
LU-12802 tests: speedup cleanup of racer

After the racer test has run for the given time, it starts to clean up,
and the parent racer.sh script waits for the child racer/racer.sh
to exit. However, this sometimes gets stuck for a long time.
Sending a signal to the remaining dd (or other) processes wakes up
the wait in the parent racer.sh script immediately.
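
The idea above can be sketched in bash roughly as follows. This is
only an illustration, not the racer.sh code: cleanup_workers is a
hypothetical helper, "sleep 300" stands in for the dd (or other)
workers, and SIGTERM is an assumed choice of signal.

```shell
#!/bin/bash
# Signal the remaining workers instead of waiting for each one to
# finish on its own, then reap them. Hypothetical helper name.
cleanup_workers() {
	local pid

	for pid in "$@"; do
		kill -TERM "$pid" 2>/dev/null
	done
	# wait returns promptly once the signalled children have exited
	wait "$@" 2>/dev/null
	return 0
}

# "sleep 300" stands in for long-running dd (or other) processes.
pids=()
sleep 300 & pids+=($!)
sleep 300 & pids+=($!)
cleanup_workers "${pids[@]}"
```

Without the kill loop, the wait would block until every worker
finished by itself, which is the delay the commit message describes.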

Lustre-change: https://review.whamcloud.com/35101

Test-Parameter: trivial testlist=racer
DDN-Bug-ID: DDN-256
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I2ff2784b76faa0532c39af29b1586a48f2b90a21
Reviewed-on: https://review.whamcloud.com/36289
Reviewed-by: Shilong Wang <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12777 test: fix to pass facet to facet_fstype 98/36298/4
Wang Shilong [Thu, 26 Sep 2019 13:21:13 +0000 (21:21 +0800)]
LU-12777 test: fix to pass facet to facet_fstype

Function facet_fstype() expects a facet name (mgs1, mds1, etc.) as
its argument, but it was wrongly passed $mds1, which causes the
following error:

line 1192: lustre-ost1/ost1_FSTYPE: bad substitution

As a result, we fail to detect that this is a ZFS-based OSD, the pool
reimport is skipped, and the mount fails.

Test-Parameters: trivial clientdistro=el8 testlist=conf-sanity \
                 fstype=zfs envdefinitions=ONLY=103
Test-Parameters: trivial clientdistro=el8 testlist=conf-sanity \
                 fstype=ldiskfs envdefinitions=ONLY=103
Change-Id: Id8fd5b9f17e666614e83e5c1a2399fde8b91b023
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36298
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12795 tests: Prefer FAILF in mpi tests 76/36276/2
Shaun Tancheff [Mon, 23 Sep 2019 17:18:21 +0000 (12:18 -0500)]
LU-12795 tests: Prefer FAILF in mpi tests

Prefer FAILF() to the sprintf()/FAIL() pattern and remove
the errmsg temporary buffer.

Clean up the use of assignment in conditional statements.

Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Icb04b188209ad3717260aa1238de1d9788d76f79
Reviewed-on: https://review.whamcloud.com/36276
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12790 obdclass: print jobid error message properly 72/36272/3
Emoly Liu [Tue, 24 Sep 2019 11:20:48 +0000 (19:20 +0800)]
LU-12790 obdclass: print jobid error message properly

Modify unlikely() condition to print error message properly when
(rc == -EOVERFLOW).

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I19bfb353c71b55a0dfb6eec78c1af915494acd71
Reviewed-on: https://review.whamcloud.com/36272
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-12795 tests: Introduce FAILF and start using it 55/36255/2
Shaun Tancheff [Sun, 22 Sep 2019 08:28:41 +0000 (03:28 -0500)]
LU-12795 tests: Introduce FAILF and start using it

Unify the sprintf(msg, ...); FAIL(msg); pattern into a single
FAILF() macro and avoid the needless use of a temporary buffer.

Remove the duplicate lp_utils.c and lp_utils.h files.

Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ia0ed463e305ec671909a049d847eb8518291f72f
Reviewed-on: https://review.whamcloud.com/36255
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12229 tests: fix "bad substitution" error 43/36243/3
Emoly Liu [Thu, 19 Sep 2019 10:26:31 +0000 (18:26 +0800)]
LU-12229 tests: fix "bad substitution" error

In newer bash versions, special characters are invalid in
indirect variable expansion ${!word}. For example,
# a=lustre,pool
# echo ${!a}
-bash: lustre,pool: bad substitution
To avoid the "bad substitution" error, the pool_new command is used
directly in test_1j and test_1k.
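
For context, indirect expansion only works when the variable holds a
legal variable name. The bash sketch below guards the expansion with
a name check; is_valid_name and indirect_get are hypothetical helpers
written for this illustration and are not part of the test framework.

```shell
#!/bin/bash
# ${!name} is only valid when $name holds a legal variable name,
# so check that before expanding. Hypothetical helper.
is_valid_name() {
	[[ "$1" =~ ^[A-Za-z_][A-Za-z0-9_]*$ ]]
}

# Print the value of the variable named by $1, or fail without
# triggering "bad substitution" when $1 is not a legal name.
indirect_get() {
	local name="$1"

	is_valid_name "$name" || return 1
	printf '%s\n' "${!name}"
}

pool=testpool
indirect_get pool
indirect_get "lustre,pool" || echo "not a variable name: lustre,pool"
```

The first call prints the value of $pool; the second is rejected by
the name check because of the comma, so ${!name} is never evaluated
for it.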

Test-Parameters:trivial clientdistro=el8 testlist=ost-pools
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: Ifce4616cd7f314416fe5fa09f8fba846ae45bcef
Reviewed-on: https://review.whamcloud.com/36243
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12764 lnet: eliminate uninitialized warning 89/36189/2
Wang Shilong [Mon, 16 Sep 2019 01:55:58 +0000 (18:55 -0700)]
LU-12764 lnet: eliminate uninitialized warning

lustre-release/lnet/lnet/router.c: In function ‘lnet_del_route’:
include/linux/compiler.h:177:26: error: ‘lp’ may be used uninitialized
in this function [-Werror=maybe-uninitialized]
  case 8: *(__u64 *)res = *(volatile __u64 *)p; break;  \
                          ^
/home/wangsl/lustre-release/lnet/lnet/router.c:754:20: note: ‘lp’ was declared here
  struct lnet_peer *lp;
                    ^
/home/wangsl/lustre-release/lnet/lnet/router.c: At top level:
cc1: error: unrecognized command line option ‘-Wno-stringop-overflow’ [-Werror]
cc1: error: unrecognized command line option ‘-Wno-stringop-truncation’ [-Werror]
cc1: error: unrecognized command line option ‘-Wno-format-truncation’ [-Werror]
cc1: all warnings being treated as errors

The code logic guarantees that @lpi and @lpni are initialized at the
same time, but initialize @lpi anyway to make gcc happy.

Change-Id: I1ccd88ca5061b5f29a530bf2b755585c92612a69
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/36189
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11467 utils: add lfs mirror delete command 85/36185/5
Andreas Dilger [Thu, 19 Sep 2019 07:45:11 +0000 (00:45 -0700)]
LU-11467 utils: add lfs mirror delete command

Add "lfs mirror delete" as an alias for "lfs mirror split -d", to
balance "lfs mirror create" and simplify the interface for users.

Add lfs-mirror-create.1 man page, and convert some tests in
sanity-flr.sh to use the new interface.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I4399878dc2fd435c517a2ff529b91480583ebbe5
Reviewed-on: https://review.whamcloud.com/36185
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12704 tests: component end must be multiple of stripesize 74/36174/3
Andreas Dilger [Thu, 12 Sep 2019 21:44:10 +0000 (15:44 -0600)]
LU-12704 tests: component end must be multiple of stripesize

In racer file_create.sh, it was generating a random multiple of 64KB
in [64K..1024K] for the stripe_size, but using a fixed 1MB component
end.  However, the component end must be a multiple of stripe_size,
or an error will be returned by "lfs setstripe":

  Invalid layout: component end must be aligned by the stripe size

Generate the component_end argument as a random multiple [1..8] of
the stripe size for PFL, DoM, and FLR files.
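
The arithmetic described above can be sketched in bash as follows.
This is an illustration only, not the file_create.sh code: the
function name random_layout_kb and the use of KiB units are
assumptions made for the example.

```shell
#!/bin/bash
# Pick a stripe size that is a random multiple of 64 KiB in
# [64K..1024K], then a component end that is a random multiple
# [1..8] of that stripe size, so the end is always aligned.
# Sizes are in KiB; the function name is hypothetical.
random_layout_kb() {
	local stripe_kb=$(( (RANDOM % 16 + 1) * 64 ))
	local comp_end_kb=$(( (RANDOM % 8 + 1) * stripe_kb ))

	echo "$stripe_kb $comp_end_kb"
}

read -r stripe_kb comp_end_kb <<< "$(random_layout_kb)"
echo "stripe_size=${stripe_kb}K component_end=${comp_end_kb}K"
```

Because the component end is derived from the stripe size rather than
fixed at 1MB, the remainder of comp_end_kb divided by stripe_kb is
always zero, which is the alignment "lfs setstripe" checks for.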

Fixes: 0b66b11523cb ("LU-3285 tests: add dom into racer test suite")
Test-Parameters: trivial testlist=racer
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib79131fc71d0eaa4a398cd8c8adf6e53473ebbe5
Reviewed-on: https://review.whamcloud.com/36174
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12724 kernel: kernel update RHEL7.7 [3.10.0-1062.1.1.el7] 74/36074/5
Jian Yu [Fri, 27 Sep 2019 18:21:14 +0000 (11:21 -0700)]
LU-12724 kernel: kernel update RHEL7.7 [3.10.0-1062.1.1.el7]

Update RHEL7.7 kernel to 3.10.0-1062.1.1.el7.

Test-Parameters: trivial clientdistro=el7.7 serverdistro=el7.7

Change-Id: Iad40fb93b8a15d875b72749a05666a23e4755fcc
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36074
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11607 tests: replace lustre_version/fstype in posix/perf 34/35934/6
James Nunez [Tue, 27 Aug 2019 15:39:02 +0000 (09:39 -0600)]
LU-11607 tests: replace lustre_version/fstype in posix/perf

The routine get_lustre_env() is available to all Lustre test
suites and sets environment variables for the Lustre
version and file system type for servers.

In posix, performance-sanity, and parallel-scale, replace
calls to lustre_version_code() and facet_fstype() for all
server types with definitions from get_lustre_env().

While doing this, replace "$SINGLEMDS" with "$MDS1_VERSION"
or "$mds1_FSTYPE" in lustre_version_code() and facet_fstype().

Clean up around any modifications by converting spaces to
tabs.

Test-Parameters: trivial testlist=posix,performance-sanity,parallel-scale
Test-Parameters: fstype=zfs testlist=posix,performance-sanity,parallel-scale
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Id7d3c2a50c7b880c1147e9b6c721fddff07861fa
Reviewed-on: https://review.whamcloud.com/35934
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
2 months agoLU-4315 doc: add separate lctl-list_param man page 49/35649/5
Andreas Dilger [Mon, 17 Jun 2019 03:37:00 +0000 (05:37 +0200)]
LU-4315 doc: add separate lctl-list_param man page

Split the lctl list_param man page from the main lctl.8 man page.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib3f125e953427ec6ace1709588f5d40b1e3ebbe5
Reviewed-on: https://review.whamcloud.com/35649
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12400 zfs: zfs mainline 0.8+ with mainline (5.2) kernel 18/35518/11
Shaun Tancheff [Fri, 30 Aug 2019 16:57:40 +0000 (11:57 -0500)]
LU-12400 zfs: zfs mainline 0.8+ with mainline (5.2) kernel

Compile tests need to resolve using Module.symvers from the zfs
kmod kernel version.

dsl_pool_config_enter signature changed

Use spa_get_hostid() if it is available (zfs 0.7.0 and later).
For zfs 0.6.x use /proc/sys/kernel/spl/hostid with a fallback
to reading the spl module parameters and manually decoding
the hostid file.

list_move_tail() in libzfs conflicts with normal kernel/lustre
list API. Move it out of the way before libcfs/util/list.h is
included.

Cray-bug-id: LUS-7600
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ia16e226239d33555ba7d906b39e37e20f012a02c
Reviewed-on: https://review.whamcloud.com/35518
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-4322 tests: re-enable 101a in dne config 27/35027/6
Patrick Farrell [Sat, 1 Jun 2019 00:03:40 +0000 (20:03 -0400)]
LU-4322 tests: re-enable 101a in dne config

We should re-enable 101a in the dne config, and also make
it more strict on discards.  This test should normally
result in 0 discards, because every page brought in by
readahead is used.

It is possible for randomness with the reads to lead to a
few discards, but no more than that.

Test-Parameters: trivial testlist=sanity,sanity,sanity,sanity
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I53c6263c33c27d36b746c8fc56c8deebb4b713c7
Reviewed-on: https://review.whamcloud.com/35027
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-8066 obd_type: discard obd_types linked list. 18/34718/15
NeilBrown [Wed, 18 Sep 2019 01:17:20 +0000 (21:17 -0400)]
LU-8066 obd_type: discard obd_types linked list.

As all obd_types are kobjects in the lustre_kset kset,
they are linked together in that kset and don't
need any extra linkage.

There are non-obd_type objects in lustre_kset, added by
class_setup_tunables().  These have a different ->ktype, so we are
careful to only return objects with the correct ->ktype.

As kset_find_obj() returns a counted reference, we need
to put that reference when done.

On the server side it is possible to have an obd_type partially
initialized by one subsystem and later fully initialized by
another subsystem. We use typ_sym_filter to tell us whether the
obd_type is only partially set up. If it is only partially set up,
then we let the original subsystem that created the obd_type
clean it up. If the obd_type was later completely set up,
then we let the later subsystem do the cleanup for us.

Linux-commit: 881bc9b58ef5e8c9be297b121187ea6c26572cf1

Change-Id: I4316644f7fb12e358b799af64deb57836e796066
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/34718
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12127 lfsck: skip orphan processing for namespace LFSCK 74/34174/8
Fan Yong [Wed, 8 May 2019 02:24:52 +0000 (10:24 +0800)]
LU-12127 lfsck: skip orphan processing for namespace LFSCK

LFSCK can reconnect a recently-deleted orphan object back
into the normal namespace when it shouldn't. This can allow
access to the deleted data (a potential security risk), and
sometimes causes an assertion if the orphan is later deleted.

The commit 077570483e75e0610fd45149b926097547c434b8 skips
the orphan object during the LFSCK scan. However, only the
namespace LFSCK logic needs to be skipped for the orphan
object; the layout LFSCK processing is still necessary
to verify the consistency between the orphan MDT-object
and the related OST-object(s). This patch does just that.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9b2809b95efa4b3c3e3b2c7d0a501624ed3ebbe5
Reviewed-on: https://review.whamcloud.com/34174
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-10496 ofd: serialize fmd check and set 28/36228/4
Alexey Zhuravlev [Wed, 18 Sep 2019 12:23:44 +0000 (15:23 +0300)]
LU-10496 ofd: serialize fmd check and set

Serialize the FMD check and set in OFD to prevent updates
with old data.
Also update sanity tests 39j,k to cancel LRU locks on
both the mdc and osc namespaces, because for DoM files both
MDT and OST stripes are used.

Test-Parameters: testlist=sanity-dom
Signed-off-by: Alexey Zhuravlev <bzzz@whamcloud.com>
Change-Id: I02e28e1e3e8e533d9c7450d798fceb9261b27ea0
Reviewed-on: https://review.whamcloud.com/36228
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for munlink.c 12/35912/2
Arshad Hussain [Tue, 6 Aug 2019 12:41:52 +0000 (18:11 +0530)]
LU-6142 tests: Fix style issues for munlink.c

This patch fixes issues reported by checkpatch
for file lustre/tests/munlink.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I1722de906c529045fc9a2c8edc877d695840677d
Reviewed-on: https://review.whamcloud.com/35912
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for openfile.c 11/35911/2
Arshad Hussain [Tue, 6 Aug 2019 13:24:13 +0000 (18:54 +0530)]
LU-6142 tests: Fix style issues for openfile.c

This patch fixes issues reported by checkpatch
for file lustre/tests/openfile.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ied280e919397d27b087ec9d5ac3c7e55d44ec0b2
Reviewed-on: https://review.whamcloud.com/35911
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for opendevunlink.c 10/35910/3
Arshad Hussain [Tue, 6 Aug 2019 12:49:59 +0000 (18:19 +0530)]
LU-6142 tests: Fix style issues for opendevunlink.c

This patch fixes issues reported by checkpatch
for file lustre/tests/opendevunlink.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I2f488b77610429e7c17ac3975a66aab271b27409
Reviewed-on: https://review.whamcloud.com/35910
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for opendirunlink.c 09/35909/3
Arshad Hussain [Tue, 6 Aug 2019 13:05:51 +0000 (18:35 +0530)]
LU-6142 tests: Fix style issues for opendirunlink.c

This patch fixes issues reported by checkpatch
for file lustre/tests/opendirunlink.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I245b37c93e53313c96a5f69d11d41bc93b3844e2
Reviewed-on: https://review.whamcloud.com/35909
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for utime.c 08/35908/2
Arshad Hussain [Tue, 6 Aug 2019 13:56:31 +0000 (19:26 +0530)]
LU-6142 tests: Fix style issues for utime.c

This patch fixes issues reported by checkpatch
for file lustre/tests/utime.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Iae92c34b2f6d68204bb3f9846227eb507c04e27d
Reviewed-on: https://review.whamcloud.com/35908
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for openfilleddirunlink.c 07/35907/3
Arshad Hussain [Tue, 6 Aug 2019 13:40:50 +0000 (19:10 +0530)]
LU-6142 tests: Fix style issues for openfilleddirunlink.c

This patch fixes issues reported by checkpatch
for file lustre/tests/openfilleddirunlink.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ic6771392bf8fdb477ce7d5632c09780667a8abe0
Reviewed-on: https://review.whamcloud.com/35907
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for writemany.c 06/35906/4
Arshad Hussain [Tue, 6 Aug 2019 14:20:35 +0000 (19:50 +0530)]
LU-6142 tests: Fix style issues for writemany.c

This patch fixes issues reported by checkpatch
for file lustre/tests/writemany.c

This patch also replaces the locally defined difftime
macro with the C library difftime() function.

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I10458cd1467639ba613b303f2fbe8cd24efc9942
Reviewed-on: https://review.whamcloud.com/35906
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for multifstat.c 05/35905/2
Arshad Hussain [Tue, 6 Aug 2019 12:23:42 +0000 (17:53 +0530)]
LU-6142 tests: Fix style issues for multifstat.c

This patch fixes issues reported by checkpatch
for file lustre/tests/multifstat.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ief81a0fb80c5863ee92b5e6e602924e8ebdd0031
Reviewed-on: https://review.whamcloud.com/35905
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for mkdirmany.c 03/35903/2
Arshad Hussain [Tue, 6 Aug 2019 10:29:11 +0000 (15:59 +0530)]
LU-6142 tests: Fix style issues for mkdirmany.c

This patch fixes issues reported by checkpatch
for file lustre/tests/mkdirmany.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I79a86ddf661c3149b1de67a29b329809dd87fbec
Reviewed-on: https://review.whamcloud.com/35903
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for mmap_sanity.c 02/35902/2
Arshad Hussain [Tue, 6 Aug 2019 10:52:59 +0000 (16:22 +0530)]
LU-6142 tests: Fix style issues for mmap_sanity.c

This patch fixes issues reported by checkpatch
for file lustre/tests/mmap_sanity.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I37a842ef9a193b554c26deb916b9fe9187b79042
Reviewed-on: https://review.whamcloud.com/35902
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for ll_sparseness_write.c 21/35821/4
Arshad Hussain [Mon, 5 Aug 2019 23:47:25 +0000 (05:17 +0530)]
LU-6142 tests: Fix style issues for ll_sparseness_write.c

This patch fixes issues reported by checkpatch
for file lustre/tests/ll_sparseness_write.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Icf1b9488f75791cd859e975c21341e67b160992c
Reviewed-on: https://review.whamcloud.com/35821
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for ll_sparseness_verify.c 20/35820/2
Arshad Hussain [Mon, 5 Aug 2019 23:19:47 +0000 (04:49 +0530)]
LU-6142 tests: Fix style issues for ll_sparseness_verify.c

This patch fixes issues reported by checkpatch
for file lustre/tests/ll_sparseness_verify.c

Change-Id: I7049be84c43169a2b21d0d2cff980f48f6fd27d0
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/35820
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for ll_dirstripe_verify.c 19/35819/2
Arshad Hussain [Mon, 5 Aug 2019 23:05:19 +0000 (04:35 +0530)]
LU-6142 tests: Fix style issues for ll_dirstripe_verify.c

This patch fixes issues reported by checkpatch
for file lustre/tests/ll_dirstripe_verify.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I552d27b9517ef36b6ad66db6222bf55239c16c04
Reviewed-on: https://review.whamcloud.com/35819
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Remove file iam_ut.c 18/35818/3
Arshad Hussain [Mon, 5 Aug 2019 22:46:53 +0000 (04:16 +0530)]
LU-6142 tests: Remove file iam_ut.c

This patch removes file lustre/tests/iam_ut.c
This file currently is not used at all by any
tests or binary.

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ia27a94a4cc7ed1a5ff788b07ce1f2487f9427c83
Reviewed-on: https://review.whamcloud.com/35818
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Remove file flock.c 17/35817/5
Arshad Hussain [Mon, 5 Aug 2019 22:29:54 +0000 (03:59 +0530)]
LU-6142 tests: Remove file flock.c

This patch removes file lustre/tests/flock.c
This file currently is not used at all by any
tests or binary.

For validating flock tests,
lustre/tests/flocks_test.c is used instead.

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ie9b55c7a93c77234ef21a5bc7827fbaf9e9b20cd
Reviewed-on: https://review.whamcloud.com/35817
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for directio.c 16/35816/2
Arshad Hussain [Mon, 5 Aug 2019 22:18:51 +0000 (03:48 +0530)]
LU-6142 tests: Fix style issues for directio.c

This patch fixes issues reported by checkpatch
for file lustre/tests/directio.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ic1b39dd84aaef75c4bd94c357b6f917b735175cb
Reviewed-on: https://review.whamcloud.com/35816
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for chownmany.c 15/35815/2
Arshad Hussain [Mon, 5 Aug 2019 21:55:40 +0000 (03:25 +0530)]
LU-6142 tests: Fix style issues for chownmany.c

This patch fixes issues reported by checkpatch
for file lustre/tests/chownmany.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I57a89193c683c12881215394fa417ab0aefa004f
Reviewed-on: https://review.whamcloud.com/35815
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for checkstat.c 14/35814/2
Arshad Hussain [Mon, 5 Aug 2019 21:01:57 +0000 (02:31 +0530)]
LU-6142 tests: Fix style issues for checkstat.c

This patch fixes issues reported by checkpatch
for file lustre/tests/checkstat.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I2ed9c9b4325f9285643a21114dbaf2e1e48c1757
Reviewed-on: https://review.whamcloud.com/35814
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-6142 tests: Fix style issues for mcreate.c 13/35813/2
Arshad Hussain [Tue, 6 Aug 2019 00:38:19 +0000 (06:08 +0530)]
LU-6142 tests: Fix style issues for mcreate.c

This patch fixes issues reported by checkpatch
for file lustre/tests/mcreate.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I51651c52e81ca64302a02cedcc97faa11fff656e
Reviewed-on: https://review.whamcloud.com/35813
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12410 tests: Add auster option to skip setup 89/35389/6
Chris Horn [Sun, 30 Jun 2019 15:40:53 +0000 (10:40 -0500)]
LU-12410 tests: Add auster option to skip setup

Add an option to auster to skip the initial setup of Lustre.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ie3de93c8a4d3f812aaf1f032e39c351827c6eaef
Reviewed-on: https://review.whamcloud.com/35389
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12824 o2ib: Fix whitespace in kiblnd_startup 24/36324/3
Chris Horn [Mon, 30 Sep 2019 15:01:28 +0000 (10:01 -0500)]
LU-12824 o2ib: Fix whitespace in kiblnd_startup

Convert whitespace to tabs where appropriate in kiblnd_startup()

Cray-bug-id: LUS-7935
Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I11aaaa8e47d754b219fb773d74e37190476e4eeb
Reviewed-on: https://review.whamcloud.com/36324
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11670 tests: do not fail the first half in sanityn test 103 03/36303/3
Jian Yu [Thu, 26 Sep 2019 17:51:22 +0000 (10:51 -0700)]
LU-11670 tests: do not fail the first half in sanityn test 103

There are two halves in sanityn test 103. The first half is to
reproduce the problem of incorrect size when using lockahead
and the second half is to verify that the fix works. Sometimes,
the problem cannot be reproduced in the first half test, so we
should not fail the whole test.

Test-Parameters: trivial fstype=zfs \
mdscount=2 mdtcount=4 testlist=sanityn,sanityn,sanityn

Test-Parameters: trivial fstype=ldiskfs \
mdscount=2 mdtcount=4 testlist=sanityn,sanityn,sanityn

Change-Id: Ib6c82bfe512ac072104abfcb406e2ef1bd6a6a02
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36303
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
2 months agoLU-11967 mdt: reint layout_change in standard way 65/35465/6
Lai Siyao [Sun, 30 Jun 2019 15:26:11 +0000 (23:26 +0800)]
LU-11967 mdt: reint layout_change in standard way

Layout_change is a reint operation, and it should be handled the
same as other reint operations, so that resent and replay can
work correctly.

Also replace the lock passed to ldlm_handle_enqueue0() with the
lock taken in mdt_layout_change(). This avoids taking the lock again
in ldlm_handle_enqueue0(), and also makes replay easier. Note that
before the replacement, the mode is downgraded from EX to CR, because
the client only needs CR mode, and this avoids an unnecessary lock
cancel later.

Add missing resent reconstructor for REINT_RESYNC.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I328044dacbf18d03232c9bbb51271f6202e9b939
Reviewed-on: https://review.whamcloud.com/35465
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11739 lod: subdir under ROOT should honor default layout 04/35204/7
Lai Siyao [Mon, 30 Sep 2019 17:14:51 +0000 (10:14 -0700)]
LU-11739 lod: subdir under ROOT should honor default layout

Though subdirectories under ROOT don't inherit the ROOT default
layout, they should honor the ROOT default layout at creation.

Add sub test in sanity.sh 65n to verify this.

Fixes: 0a988cae95 ("LU-11739 lod: don't inherit default
                    layout from root directory")

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I1edf91da7944a8871652df7bca2104d00f66163a
Reviewed-on: https://review.whamcloud.com/35204
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
2 months agoLU-12825 build: change lbuild to support MOFED 4.7 33/36333/5
Minh Diep [Mon, 30 Sep 2019 18:25:50 +0000 (11:25 -0700)]
LU-12825 build: change lbuild to support MOFED 4.7

* Remove 'alternate' name in MOFED tar
* use MLNX_LIBS to download rpms

Test-Parameters: trivial

Change-Id: Ia5a4f51455be836a7df4fa6b3e9eccc17cffef2c
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36333
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11444 ptlrpc: resend may corrupt the data 14/35114/15
Andriy Skulysh [Sat, 8 Jun 2019 11:30:55 +0000 (14:30 +0300)]
LU-11444 ptlrpc: resend may corrupt the data

A late resend that arrives much later than another modification RPC
which has already been handled on the same slot may still be applied
and therefore override the more recent one.

Send RPCs from the client in increasing order for each tag,
and check the order on the server to detect a late resend.

A slot can be reused by a client after a kill while
the server continues to rely on it.

Add a flag for such obsolete requests; here we trust the
client and perform the xid check for all in-progress requests.
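
The ordering check can be sketched as follows. This is only an
illustration of the idea, not the ptlrpc code: struct slot_state and
slot_accept_xid() are hypothetical names, and reply reconstruction for a
legitimate resend of the last request is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-tag slot state kept by the server. */
struct slot_state {
	uint64_t last_xid;	/* highest xid already handled on this tag */
};

/* Returns true if the request should be applied, false if its xid is
 * not newer than the last one handled on the slot, i.e. it is treated
 * as a late resend in this sketch. */
static bool slot_accept_xid(struct slot_state *slot, uint64_t xid)
{
	if (xid <= slot->last_xid)
		return false;

	slot->last_xid = xid;
	return true;
}
```

Because the client sends in increasing order per tag, a request whose
xid is below the slot's last handled xid can only be a stale resend.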

Cray-bug-id: LUS-6272, LUS-7277, LUS-7339
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Change-Id: I97806577cf979f49a75379ffc55947cc3dcd02b1
Reviewed-on: https://review.whamcloud.com/35114
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-9920 vvp: dirty pages with pagevec 11/28711/18
Patrick Farrell [Fri, 13 Sep 2019 19:27:40 +0000 (15:27 -0400)]
LU-9920 vvp: dirty pages with pagevec

When doing i/o from multiple writers to a single file, the
per-file page cache lock (the mapping lock) becomes a
bottleneck.

Most current uses operate on a single page at a time.  This converts
one prominent use, marking pages as dirty, to use a pagevec.

When many threads are writing to one file, this improves
write performance by around 25%.
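
The effect of batching can be illustrated with a small userspace sketch.
It does not use the kernel pagevec API; struct mapping, BATCH_SIZE and
both functions are hypothetical, and the "lock" is only a counter so the
number of acquisitions can be compared.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH_SIZE 15	/* assumed batch size, for illustration only */

/* Hypothetical stand-in for the per-file mapping lock: it only counts
 * how many times the lock would be taken. */
struct mapping {
	unsigned long lock_acquisitions;
	unsigned long dirty_pages;
};

/* One lock round trip per page. */
static void dirty_one_at_a_time(struct mapping *m, size_t npages)
{
	for (size_t i = 0; i < npages; i++) {
		m->lock_acquisitions++;		/* lock */
		m->dirty_pages++;
						/* unlock */
	}
}

/* One lock round trip per batch of up to BATCH_SIZE pages. */
static void dirty_in_batches(struct mapping *m, size_t npages)
{
	size_t done = 0;

	while (done < npages) {
		size_t n = npages - done;

		if (n > BATCH_SIZE)
			n = BATCH_SIZE;
		m->lock_acquisitions++;		/* lock once per batch */
		m->dirty_pages += n;
		done += n;			/* unlock */
	}
}
```

For 100 pages the first variant takes the lock 100 times and the second
7 times, which is the kind of contention reduction the patch relies on.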

This requires implementing our own version of the
set_page_dirty-->__set_page_dirty_nobuffers functions.

This was modeled on upstream tip of tree:
v5.2-rc4-224-ge01e060fe0 (7/13/2019)

The relevant code is unchanged since Linux 4.17, and has
changed only minimally since before Linux 2.6.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ifff9cd01f8b4e960bb4ebea560b9a9a01376698d
Reviewed-on: https://review.whamcloud.com/28711
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-10467 target: remove lwi arg from target_bulk_io 69/35969/7
Mr NeilBrown [Fri, 23 Aug 2019 07:28:47 +0000 (17:28 +1000)]
LU-10467 target: remove lwi arg from target_bulk_io

The callers of target_bulk_io() pass in an lwi pointer but never put
any information into it or take any information out of it.  Also
target_bulk_io() always re-initializes the struct before using it, so
it doesn't communicate info from one call to the next.

All that this achieves is to make stack usage slightly smaller
in the few cases where the lwi pointer is tti_wait_info in
struct tgt_thread_info.  That is not worth it, and a future
patch will remove the use of the struct completely.

So make lwi local to target_bulk_io, and remove it from
all callers.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: Ib6039006d0168393abf3995877acde2d7c796b1f
Reviewed-on: https://review.whamcloud.com/35969
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-10467 lustre: add wait_event macros suitable for upstream 62/35962/8
Mr NeilBrown [Fri, 23 Aug 2019 06:02:22 +0000 (16:02 +1000)]
LU-10467 lustre: add wait_event macros suitable for upstream

This patch adds three sorts of wait_event macros.

1/ wait_event_idle_* which are available upstream, but not
   in older kernels.
   If TASK_NOLOAD is not available, we use TASK_UNINTERRUPTIBLE,
   and block all interrupts.

   We cannot use ___wait_cond_timeout() as its signature changed
   in 3.13, so we define our own ___wait_cond_timeout1().

2/ wait_event_idle_exclusive_lifo() and
   wait_event_idle_exclusive_lifo_timeout()
    which might be accepted upstream if we can make a strong case

   prepare_to_wait_event() doesn't support this directly, but
   as it won't relink a wait_entry that is already linked, it
   is sufficient to link to the head of the queue before calling
   prepare_to_wait_event().

3/ l_wait_event_abortable
   l_wait_event_abortable_timeout
   l_wait_event_abortable_exclusive
    which are unlikely to be accepted upstream, but match the general
    approach of upstream wait_event macros, and are useful
    to lustre.
    Possibly some or all of these should become wait_event_killable_*
    LUSTRE_FATAL_SIGS is moved over to linux-wait.h.

___wait_event() and related macros are copied from upstream linux,
and modified slightly to work across all supported kernels.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I2d260fc159dbe5b1a3cc7a26e4aeedf30150d85a
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/35962
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12681 osc: wrong cache of LVB attrs, part2 00/36200/3
Vitaly Fertman [Wed, 11 Sep 2019 15:22:23 +0000 (18:22 +0300)]
LU-12681 osc: wrong cache of LVB attrs, part2

It may happen that osc oinfo lvb cache has size < kms.

It occurs if a reply re-ordering happens and an older size is applied
to oinfo unconditionally.

Another possibility is RA, when osc_match_base() attaches the dlm lock
to the osc object but does not cache the lvb. The next layout change will
overwrite the lock lvb with the oinfo cache (previous LUS-7731 fix),
presumably with smaller values. Therefore, the next lock re-use may run
into a problem with a partial page write which thinks the preliminary
read is not needed.

Do not let the cached oinfo lvb size become less than kms.
Also, cache the lock's lvb in the oinfo on osc_match_base().
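
The invariant can be sketched as a simple clamp. This is an illustration
under assumed names, not the osc code: struct cached_lvb and
cached_lvb_set_size() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the cached attributes. */
struct cached_lvb {
	uint64_t size;	/* cached file size */
	uint64_t kms;	/* known minimum size */
};

/* Apply a size from a (possibly reordered, older) reply, but never let
 * the cached size drop below the known minimum size. */
static void cached_lvb_set_size(struct cached_lvb *c, uint64_t new_size)
{
	c->size = new_size < c->kms ? c->kms : new_size;
}
```

With this clamp, an older and smaller size arriving late cannot make the
cache claim the file is shorter than data the client already knows about.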

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Cray-bug-id: LUS-7731
Change-Id: I50136f57491364146ce7b6a81b814e474e3edb86
Reviewed-on: https://review.whamcloud.com/36200
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12681 osc: wrong cache of LVB attrs 99/36199/4
Vitaly Fertman [Mon, 16 Sep 2019 13:46:40 +0000 (16:46 +0300)]
LU-12681 osc: wrong cache of LVB attrs

The osc object keeps a cache of the LVB, obtained on lock enqueue, in
lov_oinfo. This cache gets all the modifications happening on
the client, whereas the original LVB in locks does not get them.
At the same time, this cache is lost on object destroy, which
may occur on layout change in particular.

ldlm locks are left in the LRU and can be matched by subsequent
operations. The first enqueue does not match a lock in the LRU due to
@kms_ignore in enqueue_base; however, if the lock is obtained on a small
offset while some locks exist in the LRU on larger offsets, the obtained
size will be cut by the policy region when set to KMS.

The 2nd enqueue can already match and add stale data to oinfo. Thus the
OSC cache is left with a small KMS. However, the logic that prepares
a partial page write checks the KMS to decide whether to read the page,
and as it is small, the page is not read and therefore the non-read part
of the page is zeroed.

The object destroy detaches dlm locks from the osc object and offloads
the current osc oinfo cache to all the locks, so that it can be
reconstructed for the next osc oinfo. Introduce a per-lock flag to
control the cached attribute status and drop the re-enqueue after osc
object replacement.

This patch also fixes the handling of KMS_IGNORE added in LU-11964. It
is used only to skip the self lock in a search; there is no other logic
for it, and it is not needed for DOM locks at all. All the relevant
semantics are supposed to be accomplished by the cbpending flag.

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Cray-bug-id: LUS-7731
Change-Id: Iba45bb3e5ee181c82c2f22deb299228b1519cddb
Reviewed-on: https://review.whamcloud.com/36199
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12495 obdclass: qos penalties miscalculated 69/36269/2
Lai Siyao [Sat, 17 Aug 2019 22:37:33 +0000 (06:37 +0800)]
LU-12495 obdclass: qos penalties miscalculated

In lqos_calc_penalties(), the penalty_per_obj is miscalculated.

Also improve sanity test_413b: take both blocks and inodes into
account to make the test more robust.

Fixes: d3090bb ("LU-11213 lod: share object alloc QoS code with LMV")
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ie965fc3bfa3e303c27f93a6e1a428cc4a90f8548
Reviewed-on: https://review.whamcloud.com/36269
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12758 quota: clear default flag for new ID 36/36236/2
Hongchao Zhang [Tue, 17 Sep 2019 12:57:50 +0000 (08:57 -0400)]
LU-12758 quota: clear default flag for new ID

When setting the quota limits to 0 with "lfs setquota", the default
flag won't be cleared if the lquota_entry has just been created for the
quota ID for the first time, because the quota limits are the same.

Change-Id: I7f44ce0cb13783ca5bede2f55cd0707f1ccbc8ca
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36236
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shilong Wang <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11743 utils: allow lctl pool_list on separate MGS 95/35895/9
Emoly Liu [Sun, 22 Sep 2019 12:06:07 +0000 (20:06 +0800)]
LU-11743 utils: allow lctl pool_list on separate MGS

Change lctl pool_list command to parse the configuration log directly
when run on a standalone MGS node.  This also allows the pool commands
to be run when only the MGS is started.

Also, remove the test script changes from the LU-9899 patch that mounted
a client on the standalone MGS to allow OST pools to work properly.

Change-Id: Ic25931d49c2cf747da2a3f2ac3c25a21f6878991
Test-Parameters: standalonemgs=true testlist=ost-pools.sh,sanity.sh,conf-sanity.sh
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-on: https://review.whamcloud.com/35895
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
2 months agoLU-12755 ldiskfs: fix project quota upon unpatched kernel 03/36203/10
Jian Yu [Fri, 20 Sep 2019 15:15:42 +0000 (23:15 +0800)]
LU-12755 ldiskfs: fix project quota upon unpatched kernel

The value of MAXQUOTAS is the number of quota types supported
by the kernel. With the project quota patch applied, MAXQUOTAS is
equal to EXT4_MAXQUOTAS. However, on an unpatched kernel, the
project quota type is not supported and MAXQUOTAS is one less
than EXT4_MAXQUOTAS.

In ldiskfs, we need to make sure that the loop in
ext4_quota_off_umount() is limited to the kernel MAXQUOTAS
value rather than EXT4_MAXQUOTAS. Otherwise, it tries to
dereference sb_dqopt(sb)->files[2], which is not an inode at all,
and causes the kernel to get stuck on a spinlock in ext4_quota_off()
during unmount, as follows:

Call Trace:
[<ffffffffb9d733c5>] queued_spin_lock_slowpath+0xb/0xf
[<ffffffffb9d81b30>] _raw_spin_lock+0x20/0x30
[<ffffffffb9865e2e>] igrab+0x1e/0x60
[<ffffffffc08a8c4b>] ldiskfs_quota_off+0x3b/0x130 [ldiskfs]
[<ffffffffc08abcdd>] ldiskfs_put_super+0x4d/0x400 [ldiskfs]
[<ffffffffb984b13d>] generic_shutdown_super+0x6d/0x100
[<ffffffffb984b5b7>] kill_block_super+0x27/0x70
[<ffffffffb984b91e>] deactivate_locked_super+0x4e/0x70
[<ffffffffb984c0a6>] deactivate_super+0x46/0x60
[<ffffffffb986abff>] cleanup_mnt+0x3f/0x80
[<ffffffffb986ac92>] __cleanup_mnt+0x12/0x20
[<ffffffffb96c1c0b>] task_work_run+0xbb/0xe0
[<ffffffffb962cc65>] do_notify_resume+0xa5/0xc0
[<ffffffffb9d8d23b>] int_signal+0x12/0x17
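
The loop bound described above can be sketched in plain C. This is not
the ldiskfs patch itself: quota_loop_limit() and
count_quota_files_turned_off() are hypothetical names, and the array of
ints merely stands in for the per-type quota file slots.

```c
#include <assert.h>

/* The loop must not run past the number of quota types the running
 * kernel actually provides, even if the file system knows about more. */
static int quota_loop_limit(int fs_maxquotas, int kernel_maxquotas)
{
	return fs_maxquotas < kernel_maxquotas ?
	       fs_maxquotas : kernel_maxquotas;
}

/* Hypothetical stand-in for the unmount loop: visits only the slots
 * that exist in the kernel's array and counts the active ones. */
static int count_quota_files_turned_off(const int *files, int fs_max,
					int kernel_max)
{
	int limit = quota_loop_limit(fs_max, kernel_max);
	int turned_off = 0;

	for (int type = 0; type < limit; type++)
		if (files[type])
			turned_off++;

	return turned_off;
}
```

With three file system quota types and two kernel quota types, the loop
stops after index 1 and never touches the nonexistent files[2] slot.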

Test-Parameters: clientdistro=el7.7 serverdistro=el7.7

Change-Id: I18a4d97656e2f8478754943424c0fac927f843ca
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/36203
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12789 o2ib: fix configure checks 45/36245/2
Sergey Gorenko [Fri, 20 Sep 2019 13:34:48 +0000 (16:34 +0300)]
LU-12789 o2ib: fix configure checks

Fix configure checks for modern kernels / MOFED 4.7
1) sg_dma_address() and sg_dma_len() always have only one argument.
2) Run the configure checks in the proper environment.

Change-Id: I9910de888371776758376743ab4418778e1d85e4
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/36245
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12763 lnet: Use alternate ping processing for non-mr peers 82/36182/2
Chris Horn [Fri, 13 Sep 2019 21:23:43 +0000 (16:23 -0500)]
LU-12763 lnet: Use alternate ping processing for non-mr peers

Router peers without multi-rail capabilities (i.e. older Lustre
versions) or router peers that have discovery disabled need to use
the alternate ping processing introduced by LU-12422. Otherwise,
these peers go through the normal discovery processing, but their
remote network interfaces are never added to the peer object. This
causes routes through these peers to be considered down when
avoid_asym_router_failure is enabled.

Cray-bug-id: LUS-7866
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib567b66c871abdad9b39b4f29b38eca424d4cd8d
Reviewed-on: https://review.whamcloud.com/36182
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12739 lnet: Don't queue msg when discovery has completed 39/36139/3
Chris Horn [Mon, 9 Sep 2019 17:54:08 +0000 (12:54 -0500)]
LU-12739 lnet: Don't queue msg when discovery has completed

In lnet_initiate_peer_discovery(), it is possible for the peer object
to change after the call to lnet_discover_peer_locked(), and it is
also possible for the peer to complete discovery between the first
call to lnet_peer_is_uptodate() and our placing the lnet_msg onto
the peer's lp_dc_pendq. After the call to lnet_discover_peer_locked(),
check whether the (potentially new) peer object is up to date while
holding the lp_lock. If the peer is up to date, then we needn't
queue the message. Otherwise, we continue to hold the lock to place
the message on the peer's lp_dc_pendq.
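
The check-then-queue pattern can be sketched in userspace with a pthread
mutex. This is only an illustration, not the LNet code: struct
sketch_peer and queue_if_not_uptodate() are hypothetical, the mutex
stands in for lp_lock, and a counter stands in for lp_dc_pendq.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified peer object. */
struct sketch_peer {
	pthread_mutex_t lock;	/* stands in for lp_lock */
	bool uptodate;		/* discovery has completed */
	int pending;		/* messages waiting for discovery */
};

/* Returns true if the message was queued, false if the peer is already
 * up to date and the caller can send immediately.  The state check and
 * the queueing happen under the same lock, so discovery cannot complete
 * in between. */
static bool queue_if_not_uptodate(struct sketch_peer *lp)
{
	bool queued = false;

	pthread_mutex_lock(&lp->lock);
	if (!lp->uptodate) {
		lp->pending++;
		queued = true;
	}
	pthread_mutex_unlock(&lp->lock);

	return queued;
}
```

Holding the lock across both steps closes the window in which a message
could be queued on a peer whose discovery has already finished.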

Cray-bug-id: LUS-7596
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib3da7447588479bb35afcc3fe176b9120d915a89
Reviewed-on: https://review.whamcloud.com/36139
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12705 utils: cleanup unnecessary typecasting 24/36224/2
Gu Zheng [Wed, 18 Sep 2019 04:12:55 +0000 (12:12 +0800)]
LU-12705 utils: cleanup unnecessary typecasting

There are a bunch of variables typecast in utils/lfs.c where
the casts are not needed, so clean them up here.

Change-Id: I6c944f18137fd1ff1162d9b6567c9328dfa185eb
Test-Parameters: trivial
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/36224
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-11426 llog: changelog records reordering 87/36187/5
Andrew Perepechko [Tue, 17 Sep 2019 07:34:44 +0000 (10:34 +0300)]
LU-11426 llog: changelog records reordering

Changelog records can get reordered because of a race
window between cr_index generation and llog file
space allocation. This can lead to the loss of llog
records.

llog_write() holds the loghandle->lgh_lock semaphore,
so it seems an appropriate place to generate a
new changelog index.
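
A minimal userspace sketch of the idea, assuming hypothetical names
(struct sketch_log, sketch_log_write()) and a pthread mutex in place of
lgh_lock: the index is generated and the slot is allocated inside the
same critical section, so index order always matches slot order.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define SKETCH_LOG_SLOTS 8	/* arbitrary capacity for the sketch */

/* Hypothetical, simplified log handle. */
struct sketch_log {
	pthread_mutex_t lock;		/* stands in for lgh_lock */
	uint64_t last_index;		/* last record index handed out */
	uint64_t slots[SKETCH_LOG_SLOTS];
	int used;
};

/* Generate the record index and allocate the slot under one lock.
 * Returns 0 on success, -1 if the log is full. */
static int sketch_log_write(struct sketch_log *log, uint64_t *index_out)
{
	int rc = -1;

	pthread_mutex_lock(&log->lock);
	if (log->used < SKETCH_LOG_SLOTS) {
		uint64_t idx = ++log->last_index;

		log->slots[log->used++] = idx;
		*index_out = idx;
		rc = 0;
	}
	pthread_mutex_unlock(&log->lock);

	return rc;
}
```

If the index were generated before taking the lock, two writers could
obtain indices 1 and 2 and then allocate space in the opposite order.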

Change-Id: I034d1a696bde1d0f780e494ab65073e4018ceec9
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Cray-bug-id: LUS-7691
Reviewed-on: https://review.whamcloud.com/36187
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-9859 libcfs: move misc-device registration closer to related code. 18/36118/2
NeilBrown [Mon, 9 Sep 2019 17:53:56 +0000 (13:53 -0400)]
LU-9859 libcfs: move misc-device registration closer to related code.

The ioctl handler for the misc device is in lnet/libcfs/module.c
but it is registered in lnet/libcfs/linux/linux-module.c.

Keeping related code together makes maintenance easier, so move the
code.

Linux-commit: b4ded66db93bbe1f5323ad38ce51bb1be114934f

Test-Parameters: trivial

Change-Id: Ia2b3590a769214fe964dab7a63fd5edcfd6c5042
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/36118
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months agoLU-12690 llite: error handling of ll_och_fill() 13/35913/4
Bobi Jam [Sat, 24 Aug 2019 17:20:23 +0000 (01:20 +0800)]
LU-12690 llite: error handling of ll_och_fill()

The error returned by ll_och_fill() should be handled.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I4e750001cb124104836fa24e39ec8ae203b51a83
Reviewed-on: https://review.whamcloud.com/35913
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>