Whamcloud - gitweb
fs/lustre-release.git
21 months ago LU-6142 quota: Fix style issues for qsd_lock.c 64/33564/2
Arshad Hussain [Thu, 1 Nov 2018 13:42:33 +0000 (19:12 +0530)]
LU-6142 quota: Fix style issues for qsd_lock.c

This patch fixes issues reported by checkpatch for file
lustre/quota/qsd_lock.c

Change-Id: If6bd59204b05741878436d46de06aec0728d91f2
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33564
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
21 months ago LU-6142 lov: Fix style issues for lov_ea.c 35/33535/3
Arshad Hussain [Sun, 21 Oct 2018 18:16:30 +0000 (23:46 +0530)]
LU-6142 lov: Fix style issues for lov_ea.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lov_ea.c

Change-Id: Iaaba150663e980d1d9d8f06b8c41731b90b84c77
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33535
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
21 months ago LU-11468 lnet: set recovery interval from lnetctl 98/33498/4
Amir Shehata [Fri, 26 Oct 2018 18:18:10 +0000 (11:18 -0700)]
LU-11468 lnet: set recovery interval from lnetctl

Configure lnet_recovery_interval from lnetctl

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I75e3e40ea1eb87bcd6599caa4707f53cc33abea4
Reviewed-on: https://review.whamcloud.com/33498
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11508 mdt: reject DoM file migration 94/33394/7
Lai Siyao [Tue, 16 Oct 2018 21:57:30 +0000 (05:57 +0800)]
LU-11508 mdt: reject DoM file migration

Since DoM file migration between MDTs is not supported, reject
it to avoid data loss.

Add sanity.sh 230j.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Signed-off-by: Peter Jones <pjones@whamcloud.com>
Change-Id: I029446918692f911c29d2409e1398e1b147737c3
Reviewed-on: https://review.whamcloud.com/33394
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago Revert "LU-8130 ptlrpc: convert conn_hash to rhashtable" 95/33595/5
Andreas Dilger [Tue, 6 Nov 2018 17:53:02 +0000 (17:53 +0000)]
Revert "LU-8130 ptlrpc: convert conn_hash to rhashtable"

This reverts commit 7b3f9e5d6c509fabcec3cbd71e541a84987db2ff
due to crashes being seen on the MDS during client mount,
as described in LU-11624.

Change-Id: I3b39363ad1e41dee60f31466aa23d555ebe5135d
Reviewed-on: https://review.whamcloud.com/33595
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago Revert "LU-6142 obdclass: Fix style issues for obd_mount.c" 15/33615/3
John L. Hammond [Wed, 7 Nov 2018 18:45:13 +0000 (12:45 -0600)]
Revert "LU-6142 obdclass: Fix style issues for obd_mount.c"

This reverts commit cbcfb4f7ff81ce1887cb57c674b8e9d5da498dcf.

Test-Parameters: trivial
Test Parameters: ostfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs clientcount=2 mdscount=2 mdtcount=2 osscount=1 ostcount=8 testlist=ost-pools
Test-Parameters: ostfilesystemtype=zfs mdtfilesystemtype=zfs clientcount=2 mdscount=2 mdtcount=2 osscount=1 ostcount=8 testlist=ost-pools

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ib05e1ebe22b7460c39ac0da0d9ab3ab5372c64ec
Reviewed-on: https://review.whamcloud.com/33615
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Andreas Dilger <adilger@whamcloud.com>
21 months ago LU-11553 build: add download 7.5alternate for MOFED 27/33527/6
Minh Diep [Wed, 31 Oct 2018 18:49:38 +0000 (11:49 -0700)]
LU-11553 build: add download 7.5alternate for MOFED

The Mellanox download file name for aarch64 needs to use
rhel7.5alternate as the distro.

Rename the target file to state more precisely which minor version
we are using.

Change-Id: I02a31afa2a7803b3e78d975829a36a97eb64f021
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33527
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-10472 osd-ldiskfs: T10PI between RPC and BIO 66/32266/15
Li Xi [Sun, 8 Apr 2018 12:21:13 +0000 (08:21 -0400)]
LU-10472 osd-ldiskfs: T10PI between RPC and BIO

When the OST receives a bulk write RPC, the T10PI guard tags are
generated while calculating the RPC checksum with the
T10PI type. The guard tag of each sector is copied to the
BIO integrity payload to avoid recalculating the guard tags.

When the OST reads data from disk, the T10PI guard tags are
copied from the BIO integrity payload. These guard tags are
reused for calculating the RPC checksum with the T10PI type, so
no recalculation of guard tags is needed either.

However, if the data that the client is reading is cached
in memory, the guard tags need to be calculated based on the
cached data, since there is no place to attach the guard tags
to the page cache on the OSS.
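
For context, the per-sector work that the copy avoids can be sketched
as below. This is a minimal userspace sketch, not the Lustre or kernel
code. It assumes the CRC guard type (CRC-16/T10-DIF) and 512-byte
sectors, and the names crc_t10dif(), compute_guard_tags() and
SECTOR_SIZE are chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512	/* assumed sector size for this sketch */

/* Bitwise CRC-16/T10-DIF: polynomial 0x8BB7, initial value 0, no reflection. */
static uint16_t crc_t10dif(const uint8_t *data, size_t len)
{
	uint16_t crc = 0;
	size_t i;
	int bit;

	for (i = 0; i < len; i++) {
		crc ^= (uint16_t)((uint16_t)data[i] << 8);
		for (bit = 0; bit < 8; bit++) {
			if (crc & 0x8000)
				crc = (uint16_t)((crc << 1) ^ 0x8BB7);
			else
				crc = (uint16_t)(crc << 1);
		}
	}
	return crc;
}

/* Compute one guard tag per sector; buf must hold nsectors * SECTOR_SIZE bytes. */
static void compute_guard_tags(const uint8_t *buf, size_t nsectors,
			       uint16_t *guards)
{
	size_t i;

	assert(buf != NULL && guards != NULL);
	for (i = 0; i < nsectors; i++)
		guards[i] = crc_t10dif(buf + i * SECTOR_SIZE, SECTOR_SIZE);
}
```

Every byte of every sector passes through the CRC loop, which is why
carrying already-computed guard tags between the RPC checksum and the
BIO integrity payload saves work on both the write and the read path.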

Some modifications to the Linux kernel are needed:

1) We can pass “struct bio *” and  to the integrity
generate/verify methods, and struct blk_integrity_exchg
has bi_idx which is the current bio_vec index.

2) bio_integrity_prep accepts optional pointers to integrity
generation/verification methods. The optional methods take
priority over the ones registered by the device.

These two modifications enable Lustre (and other file systems) to
integrate with BIO for integrity verification/generation. Any private
data needed during the data integrity generation/verification process
can be attached to bio->bi_private. Instead of calculating guard tags,
the Lustre generation method copies the guard tags from an existing
buffer. And instead of (or in addition to) data integrity verification,
the Lustre verification method copies the guard tags to an internal
buffer for further use.

In addition to these changes, two Linux kernel patches are applied:

1) The first problem is that bio_integrity_verify() doesn't verify
the data integrity at all. In that function, after reading the data,
bio->bi_idx will be equal to bio->bi_vcnt because of bio_advance(),
so bio_for_each_segment_all() should be used, not
bio_for_each_segment(). And also, bio_advance() should not change
the integrity data bio_integrity_advance() unless the BIO is being
trimmed.
Linux-commit: 63573e359d052e506d305c263576499f06355985

2) The second patch fixes a problem in sd_dif_complete(). When the
sector offset is larger than 2^32, the mapping from the physical
reference tag to the virtual values expected by the block layer is
wrong.
Linux-commit: c611529e7cd3465ec0eada0f44200e8420c38908

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ia6c1d586284b0d9884116e1a753fd88e066366fe
Reviewed-on: https://review.whamcloud.com/32266
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11430 tests: get MDC stats by index 90/33490/4
Mikhail Pershin [Fri, 26 Oct 2018 09:07:04 +0000 (12:07 +0300)]
LU-11430 tests: get MDC stats by index

Fix test groups 270 and 271 in sanity.sh and
100 in sanityn.sh to get MDC stats using particular
MDC index to prevent multiline output before 'awk'.
In some cases use just 'grep -c' for simpler checks.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6bd36192ab800418e3ddb745e128b5ea4d5e20c4
Reviewed-on: https://review.whamcloud.com/33490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11563 build: Only add l_tunedisk udev rule to server 66/33466/3
Nathaniel Clark [Wed, 24 Oct 2018 19:49:39 +0000 (15:49 -0400)]
LU-11563 build: Only add l_tunedisk udev rule to server

Split the LU-9551 patch off into server-only udev rules.
The rule just produces errors on the client, since l_tunedisk is a
server-side-only tool.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Iee426588bcce611dc913cf89a4bcb733c364482b
Reviewed-on: https://review.whamcloud.com/33466
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Jay J Lan <jay.j.lan@nasa.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-10695 tests: fix sanity-lfsck test_23c 07/33407/7
Andreas Dilger [Fri, 19 Oct 2018 23:13:05 +0000 (17:13 -0600)]
LU-10695 tests: fix sanity-lfsck test_23c

sanity-lfsck test_23c fails intermittently with "(8) unexpected size"
while introducing the required corruption to the filesystem to run
the test.

It appears that in these cases, LFSCK has actually fixed the file
before it can be "further corrupted" as part of the test.  In
this case, the test failure is actually a sign that LFSCK is working
correctly, so it should not be considered a test failure.

Add a check that the file was repaired correctly (contains original
data) and consider the test a pass.

Test-Parameters: trivial testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck mdtfilesystemtype=zfs ostfilesystemtype=zfs
Test-Parameters: testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck mdtfilesystemtype=zfs ostfilesystemtype=zfs
Test-Parameters: testlist=sanity-lfsck,sanity-lfsck,sanity-lfsck mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9d92cd8e471426cc544293e1149ad556a33ebbe5
Reviewed-on: https://review.whamcloud.com/33407
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11514 lnet: separate ni state from recovery 61/33361/3
Amir Shehata [Fri, 12 Oct 2018 18:30:34 +0000 (11:30 -0700)]
LU-11514 lnet: separate ni state from recovery

To make the code more readable, make ni_state an
enumerated type and create a separate bit field to track
the recovery state. Both of these are protected by
lnet_ni_lock().
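
The separation can be pictured roughly as follows. This is only a
sketch: the enum values, flag names and struct layout are hypothetical
and are not the actual LNet definitions, and the locking that the real
code performs is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lifecycle states; exactly one applies at a time. */
enum ni_state {
	NI_STATE_INIT = 0,
	NI_STATE_ACTIVE,
	NI_STATE_DELETING,
};

/* Hypothetical recovery bits, tracked apart from the lifecycle state. */
#define NI_RECOVERY_PENDING	(1U << 0)
#define NI_RECOVERY_FAILED	(1U << 1)

struct ni {
	enum ni_state	state;		/* lifecycle state */
	unsigned int	recovery;	/* any combination of recovery bits */
};

/* Set a recovery bit without disturbing the lifecycle state. */
static void ni_set_recovery(struct ni *ni, unsigned int bit)
{
	assert(ni != NULL);
	ni->recovery |= bit;
}

/* Clear a recovery bit without disturbing the lifecycle state. */
static void ni_clear_recovery(struct ni *ni, unsigned int bit)
{
	assert(ni != NULL);
	ni->recovery &= ~bit;
}
```

With this split, a state comparison such as
ni->state == NI_STATE_ACTIVE stays valid no matter which recovery bits
are set.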

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I5acfccecffd5dbb07c9ad3b1c7651cf291b85cb8
Reviewed-on: https://review.whamcloud.com/33361
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11469 lnet: fix "debug recovery" output 10/33310/8
Amir Shehata [Thu, 4 Oct 2018 00:17:03 +0000 (17:17 -0700)]
LU-11469 lnet: fix "debug recovery" output

Don't print out anything from

lnetctl debug recovery [--local|--peer]

if there are no NIs on the recovery queues.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Icf4d5e2f1e3eefafce81dcc73525a4dd9a36d009
Reviewed-on: https://review.whamcloud.com/33310
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11423 osc: Do not walk full extent list 27/33227/4
Patrick Farrell [Tue, 25 Sep 2018 15:19:27 +0000 (10:19 -0500)]
LU-11423 osc: Do not walk full extent list

It is only possible to merge with the extent immediately
before or immediately after the one we are trying to add,
so do not continue to walk the extent list after passing
that extent.
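
The early exit can be sketched in plain C as below. This is a
simplified, hypothetical model (a sorted array of non-overlapping
extents, with made-up names struct extent and find_merge_candidate()),
not the actual osc code, which uses its own extent structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical extent covering the range [start, end], inclusive. */
struct extent {
	unsigned long start;
	unsigned long end;
};

/*
 * Search a list sorted by start offset for an extent directly adjacent
 * to [start, end]. Returns its index, or -1 if nothing can be merged.
 * *visited reports how many entries were examined.
 */
static int find_merge_candidate(const struct extent *list, size_t count,
				unsigned long start, unsigned long end,
				size_t *visited)
{
	size_t i;

	assert(visited != NULL);
	assert(list != NULL || count == 0);
	*visited = 0;
	for (i = 0; i < count; i++) {
		(*visited)++;
		/* Neighbor immediately before the new extent. */
		if (list[i].end + 1 == start)
			return (int)i;
		/* Neighbor immediately after the new extent. */
		if (end + 1 == list[i].start)
			return (int)i;
		/* Past the insertion point: no later entry can be adjacent. */
		if (list[i].start > end)
			break;
	}
	return -1;
}
```

In this model, a write that lands in a gap stops at the first extent
beyond it instead of visiting every remaining entry.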

This has a significant impact when writing large sparse
files, where most writes create a new extent, and many
extents are too distant to be merged with their neighbors.

Writing 2 GiB of data randomly 4K at a time, we see an
improvement of about 15% with this patch.

mpirun -n 1 $IOR -w -t 4K -b 2G -o ./file -z
w/o patch:
write         285.86 MiB/s
w/patch:
write         324.03 MiB/s

Cray-bug-id: LUS-6523
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I3da224762638aa71714cfc6dd1f0abac42e1f358
Reviewed-on: https://review.whamcloud.com/33227
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-8066 llite: Move all remaining procfs entries to debugfs 17/32517/6
James Simmons [Sat, 13 Oct 2018 19:24:46 +0000 (15:24 -0400)]
LU-8066 llite: Move all remaining procfs entries to debugfs

This moves all remaining procfs handling in llite layer to debugfs.

This is a modified version of

Linux-commit : ae7c0f4833a65b7648cceaf1a60503a89e057f0f

Change-Id: Id5c411d21a660a17a015ca9976b857e6b088c28a
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32517
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-6142 lov: Fix style issues for lovsub_object.c 44/33544/2
Arshad Hussain [Sun, 21 Oct 2018 21:14:26 +0000 (02:44 +0530)]
LU-6142 lov: Fix style issues for lovsub_object.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lovsub_object.c

Change-Id: Ia35e926c780787f168fb62a95ed79fa1a35ac6c0
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33544
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 lov: Fix style issues for lovsub_dev.c 43/33543/2
Arshad Hussain [Sun, 21 Oct 2018 20:53:21 +0000 (02:23 +0530)]
LU-6142 lov: Fix style issues for lovsub_dev.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lovsub_dev.c

Change-Id: Ideacff81ab326602f9889ef48be123851950ae8e
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33543
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 lov: Fix style issues for lov_request.c 42/33542/2
Arshad Hussain [Sun, 21 Oct 2018 20:39:42 +0000 (02:09 +0530)]
LU-6142 lov: Fix style issues for lov_request.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lov_request.c

Change-Id: I045e9f24c53c349f974c4786d741a174b326fdf8
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33542
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 lov: Fix style issues for lov_pack.c 40/33540/2
Arshad Hussain [Sun, 21 Oct 2018 20:10:58 +0000 (01:40 +0530)]
LU-6142 lov: Fix style issues for lov_pack.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lov_pack.c

Change-Id: Iabe796ce8441f722bc377e5d5af6bfe582901da6
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33540
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 lov: Fix style issues for lov_offset.c 39/33539/2
Arshad Hussain [Sun, 21 Oct 2018 19:29:38 +0000 (00:59 +0530)]
LU-6142 lov: Fix style issues for lov_offset.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lov_offset.c

Change-Id: I0aeeb161bef9247f11a91563f82bce48150001d9
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33539
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-6142 lov: Fix style issues for lov_dev.c 34/33534/2
Arshad Hussain [Sun, 21 Oct 2018 17:57:29 +0000 (23:27 +0530)]
LU-6142 lov: Fix style issues for lov_dev.c

This patch fixes issues reported by checkpatch for file
lustre/lov/lov_dev.c

Change-Id: I93dabb01b939328494146a45ffcb88edf1ed066c
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33534
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 obdclass: Fix style issues for llog_ioctl.c 32/33532/2
Arshad Hussain [Sun, 21 Oct 2018 12:59:59 +0000 (18:29 +0530)]
LU-6142 obdclass: Fix style issues for llog_ioctl.c

This patch fixes issues reported by checkpatch for file
lustre/obdclass/llog_ioctl.c

Change-Id: I404f1a0fa0a011c8b502632d10a4b66c0a927f01
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33532
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 obdclass: Fix style issues for llog_obd.c 31/33531/2
Arshad Hussain [Sun, 21 Oct 2018 12:19:06 +0000 (17:49 +0530)]
LU-6142 obdclass: Fix style issues for llog_obd.c

This patch fixes issues reported by checkpatch for file
lustre/obdclass/llog_obd.c

Change-Id: I9daf5ce029766d1d4d0c1ab0662620ba79801b6a
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33531
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-6142 obdclass: Fix style issues for dt_object.c 30/33530/2
Arshad Hussain [Sun, 21 Oct 2018 14:47:26 +0000 (20:17 +0530)]
LU-6142 obdclass: Fix style issues for dt_object.c

This patch fixes issues reported by checkpatch for file
lustre/obdclass/dt_object.c

Change-Id: I8e508b260c4fb97faa26dde37e9d61560d918f01
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33530
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-11468 lnet: configure recovery interval 09/33309/9
Amir Shehata [Thu, 4 Oct 2018 00:01:38 +0000 (17:01 -0700)]
LU-11468 lnet: configure recovery interval

Add a module parameter to configure the interval between
recovery pings. Some sites might not want to ping failed NIDs once
a second and might desire a longer interval. The interval defaults
to 1 second.
The monitor thread now wakes up according to the smallest interval
it needs to monitor.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia96fa7dea0b3925686d785b4d4dde399742c86b7
Reviewed-on: https://review.whamcloud.com/33309
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-6142 obdclass: Fix style issues for obd_mount.c 84/32984/10
Arshad Hussain [Sun, 12 Aug 2018 09:18:29 +0000 (14:48 +0530)]
LU-6142 obdclass: Fix style issues for obd_mount.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/obd_mount.c

Change-Id: Idea4f23b46c1c95928f3cf3b86cb8641344647a9
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32984
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months ago LU-11507 osd-zfs: Use zfs_refcount_add if available 59/33359/7
Tony Hutter [Thu, 11 Oct 2018 22:07:44 +0000 (15:07 -0700)]
LU-11507 osd-zfs: Use zfs_refcount_add if available

refcount_add was removed from ZFS master in:

    Linux 4.19-rc3+ compat: Remove refcount_t compat
    https://github.com/zfsonlinux/zfs/pull/7932

It is expected to be removed in zfs-0.7.12 as well.  Update Lustre
to use zfs_refcount_add if zfs supports it, and fall back to
refcount_add if not.
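
One common way to express such a fallback is a compatibility macro, as
sketched below. The macro name HAVE_ZFS_REFCOUNT_ADD, the stand-in
struct and the stub refcount_add() are assumptions made so the sketch
compiles on its own; the real types and functions come from the ZFS
headers, and the real configure check may be named differently.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the ZFS reference count type (illustration only). */
struct refcount {
	long rc_count;
};

/* Stub for the older ZFS function name (illustration only). */
static long refcount_add(struct refcount *rc, const void *holder)
{
	(void)holder;
	return ++rc->rc_count;
}

/*
 * HAVE_ZFS_REFCOUNT_ADD is assumed to be set by a configure check when
 * ZFS itself provides zfs_refcount_add(). If it is not set, map the new
 * name onto the old function so callers use a single spelling.
 */
#ifndef HAVE_ZFS_REFCOUNT_ADD
#define zfs_refcount_add(rc, holder)	refcount_add(rc, holder)
#endif

/* Callers always use the new name, regardless of the ZFS version. */
static long take_hold(struct refcount *rc, const void *holder)
{
	assert(rc != NULL);
	return zfs_refcount_add(rc, holder);
}
```

Keeping the mapping in one place means call sites need no #ifdef of
their own.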

Signed-off-by: Tony Hutter <hutter2@llnl.gov>
Change-Id: Ib1b2ff13eb4ff8c56dd49a427b9827c6649ecd31
Reviewed-on: https://review.whamcloud.com/33359
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11461 scripts: Support symlink target 77/33277/6
Nathaniel Clark [Thu, 1 Nov 2018 15:13:26 +0000 (11:13 -0400)]
LU-11461 scripts: Support symlink target

Support a configured target that is a symlink to the real device, for
instance /dev/disk/by-id/scsi-WWID.  Also check against the bare target
for ZPOOL/DEVICE, which returns an empty string when passed to
realpath.
Also fix the usage function so that it prints usage and doesn't just
error out.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I699b1fd36c1e53e99a8d0e6b691374eca42fccc9
Reviewed-on: https://review.whamcloud.com/33277
Reviewed-by: Joe Grund <jgrund@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-4684 llite: add lock for dir layout data 46/32946/15
Lai Siyao [Fri, 20 Jul 2018 04:26:41 +0000 (12:26 +0800)]
LU-4684 llite: add lock for dir layout data

Directory layout data should be accessed under a lock because
directory migration may change it; accessing it without a lock
may cause a crash.

Introduce an rw_semaphore 'lli_lsm_sem'. Any MD operation that uses
directory layout data takes the read lock, and ll_update_lsm_md()
takes the write lock when setting lsm.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ice3b15c90eefd6c9dbefbea87cd65f436bec96b1
Reviewed-on: https://review.whamcloud.com/32946
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11329 misc: Add LLNL reviewers to MAINTAINERS 67/33467/2
Olaf Faaland [Wed, 24 Oct 2018 21:13:44 +0000 (14:13 -0700)]
LU-11329 misc: Add LLNL reviewers to MAINTAINERS

Add LLNL reviewers to utilities and zfs-osd.

Test-Parameters: trivial
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I93d60d0fc438d48e2e5ceb4acd95391eb28f92cf
Reviewed-on: https://review.whamcloud.com/33467
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11329 misc: populate MAINTAINERS file 13/33413/8
Andreas Dilger [Mon, 22 Oct 2018 10:30:13 +0000 (18:30 +0800)]
LU-11329 misc: populate MAINTAINERS file

Add a relatively comprehensive set of subsystems to the MAINTAINERS
file, and assign patch reviewers to most of them.  There is room
for improvement, but at least this gives someone a chance to find
a maintainer for most of the code.

Update the get_maintainers.pl script to allow reading from stdin.
This allows the script to accept input from "git show <patch>"
to find reviewers for an existing patch.

Create a .mailmap file to map old email addresses to a fairly
current list of users (for now at least).  This allows
get_maintainers.pl to combine contributors into a single identity,
to keep their "score" from being diluted across two identities.

Some addresses were not mapped from @whamcloud.com to @intel.com,
because they moved back to the @whamcloud.com domain again.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I12d223b4e1d4841c2b6fe1da65e69cd0bb4ebbe5
Reviewed-on: https://review.whamcloud.com/33413
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11152 lnd: test fpo_fmr_pool pointer instead of special bool 08/33408/4
James Simmons [Tue, 23 Oct 2018 03:39:46 +0000 (23:39 -0400)]
LU-11152 lnd: test fpo_fmr_pool pointer instead of special bool

The ko2iblnd driver sets an fpo_is_fmr bool to tell us if
a pool was allocated. The name fpo_is_fmr is very misleading as to
its function, and it is a weak test of whether a pool was allocated
in the FMR case. It is much easier to test the actual FMR pool
pointer than to manually set a bool flag to tell us if the FMR
pool is valid.
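
The pattern is roughly the one below. The struct layout and the helper
has_fmr_pool() are simplified and hypothetical, not the real ko2iblnd
definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified pool object. */
struct fmr_pool {
	int unused;
};

/* Hypothetical owner structure holding the pool pointer. */
struct fmr_pool_owner {
	struct fmr_pool *fpo_fmr_pool;	/* NULL when no FMR pool was allocated */
};

/* The pointer is the only source of truth, so no flag can go stale. */
static int has_fmr_pool(const struct fmr_pool_owner *fpo)
{
	assert(fpo != NULL);
	return fpo->fpo_fmr_pool != NULL;
}
```

Dropping the separate flag removes the chance that the flag and the
pointer disagree after an allocation failure or a teardown.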

Test-Parameters: trivial
Change-Id: Ib5fa14f4a9d2b89efe5f453e0f243699894f3aeb
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33408
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-930 doc: update Lustre Changelog kernel versions 97/33397/5
Andreas Dilger [Thu, 18 Oct 2018 19:04:51 +0000 (13:04 -0600)]
LU-930 doc: update Lustre Changelog kernel versions

Update the Lustre ChangeLog file to reflect more current kernels
that are known to work.

Clarify that ldiskfs needs a series for that kernel; we no longer
require a patched server kernel for Lustre.

Replace Intel with Whamcloud for the release.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I13a91d5d82bfbeedcebcfa749184b69ca03ebbe5
Reviewed-on: https://review.whamcloud.com/33397
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11530 lnet: properly error check sensitivity 92/33392/2
Amir Shehata [Wed, 17 Oct 2018 19:59:13 +0000 (12:59 -0700)]
LU-11530 lnet: properly error check sensitivity

Reject setting health sensitivity greater than the maximum health
value.
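
The check amounts to a simple bound test, sketched below. The limit of
1000 and the names MAX_HEALTH_VALUE and set_health_sensitivity() are
assumptions for illustration; the real maximum and error handling are
defined by LNet.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed maximum, for illustration only. */
#define MAX_HEALTH_VALUE 1000U

/*
 * Store a new sensitivity in *setting. Returns 0 on success, or
 * -EINVAL (leaving *setting untouched) if the value is too large.
 */
static int set_health_sensitivity(unsigned int *setting, unsigned int value)
{
	assert(setting != NULL);
	if (value > MAX_HEALTH_VALUE)
		return -EINVAL;
	*setting = value;
	return 0;
}
```

Rejecting the value up front keeps the previous, valid setting in
place instead of silently clamping or storing an out-of-range number.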

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I866ff2cac2ba6b034cdbac24096e7014c66a3e2e
Reviewed-on: https://review.whamcloud.com/33392
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-10258 lfs: lfs mirror write command 19/33219/7
Bobi Jam [Sat, 22 Sep 2018 09:14:46 +0000 (17:14 +0800)]
LU-10258 lfs: lfs mirror write command

Rename "lfs mirror dump" command to "lfs mirror read".

Add "lfs mirror write" command to write the content of a specific
mirror of a mirrored file.

Usage:

lfs mirror write {--mirror-id|-N <mirror_id>}
[--inputfile|-i <input_file>] <mirrored_file>

Options:

--mirror-id|-N <mirror_id>
  This option selects the mirror, identified by mirror_id, whose
  content is to be written. The mirror_id is the unique numerical
  identifier of a mirror.

--inputfile|-i <input_file>
  The path name of the input file. If not specified, the standard
  input stream is used. The input stream or input_file cannot
  be the same file as mirrored_file.

This command issues a RESYNC lease write lock to notify the MDS to
prepare the destination mirror for the write (instantiate the
components of the mirror); the client then copies data from a file or
STDIN to the specified mirror of the mirrored file. After the data
copy, a RESYNC_DONE lease unlock is issued to the MDS to update the
layout of the mirrored file.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I02022349d4ce871319903a8714ffc4534186c0e4
Reviewed-on: https://review.whamcloud.com/33219
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11371 socklnd: dynamically set LND parameters 91/33191/4
Sonia Sharma [Sun, 2 Sep 2018 10:32:17 +0000 (06:32 -0400)]
LU-11371 socklnd: dynamically set LND parameters

Currently, the socklnd parameters cannot be set
dynamically. Only the default values are used, and they
cannot be changed even by deleting and
re-adding the net with DLC.

This patch allows setting socklnd parameters
dynamically.

Change-Id: Ied9c300833b4d1352ca4d94c3b6886ed8d5eb901
Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33191
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11163 libcfs: fix CPT NUMA memory failures 48/32848/6
Andreas Dilger [Wed, 25 Jul 2018 05:36:18 +0000 (23:36 -0600)]
LU-11163 libcfs: fix CPT NUMA memory failures

In some (mis-)configurations, NUMA nodes may not have any local RAM,
or the memory allocations are non-uniform between NUMA nodes.

In the unlikely case that a CPT-bound allocation fails, retry the
allocation without the CPT-binding.  Having some remote memory usage
is better than an allocation failure.
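
The retry logic can be sketched in userspace as below. The helper
alloc_on_node() and the node_has_mem table are hypothetical stand-ins
for a node-bound allocator; the real code uses the libcfs CPT
allocation routines.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical node-bound allocator: fails for any node that the
 * node_has_mem table marks as having no local memory.
 */
static void *alloc_on_node(size_t size, int node, const int *node_has_mem)
{
	if (!node_has_mem[node])
		return NULL;
	return malloc(size);
}

/*
 * Try the node-local allocation first; if it fails, retry without the
 * node binding. *was_local reports which path succeeded.
 */
static void *alloc_prefer_node(size_t size, int node,
			       const int *node_has_mem, int *was_local)
{
	void *ptr;

	assert(node_has_mem != NULL && was_local != NULL);
	ptr = alloc_on_node(size, node, node_has_mem);
	*was_local = (ptr != NULL);
	if (ptr == NULL)
		ptr = malloc(size);	/* unbound fallback */
	return ptr;
}
```

The caller still gets usable memory on a node with no local RAM, at
the cost of remote access.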

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib0ab84bef8ff10c43bafb48a8082b62fc544ca29
Reviewed-on: https://review.whamcloud.com/32848
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months ago LU-11101 quota: fix setattr project check 30/32730/6
Wang Shilong [Wed, 17 Oct 2018 05:55:17 +0000 (13:55 +0800)]
LU-11101 quota: fix setattr project check

This patch is motivated by a similar upstream patch:
ext4: fix setattr project check in fssetxattr ioctl

Currently, the project quota can be changed by the fssetxattr
ioctl, and the existing permission check, inode_owner_or_capable(),
is clearly not enough: ordinary users could
change the project ID of a file, which would let them
circumvent project quota easily.

This patch tries to follow the same rule as XFS project
quota:

"Project Quota ID state is only allowed to change from
within the init namespace. Enforce that restriction only
if we are trying to change the quota ID state.
Everything else is allowed in user namespaces."
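
The quoted rule can be sketched as the check below. The struct, field
names and the -EPERM return value are assumptions for illustration;
the real check operates on kernel namespace and inode structures and
may return a different error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified request context. */
struct setattr_ctx {
	int		in_init_ns;	/* nonzero if caller is in the init namespace */
	uint32_t	cur_projid;	/* project ID currently set on the inode */
};

/* Returns 0 if the request is allowed, -EPERM otherwise. */
static int check_projid_change(const struct setattr_ctx *ctx,
			       uint32_t new_projid)
{
	assert(ctx != NULL);
	/* Project ID unchanged: nothing to restrict. */
	if (new_projid == ctx->cur_projid)
		return 0;
	/* Changing the project ID is only allowed from the init namespace. */
	if (!ctx->in_init_ns)
		return -EPERM;
	return 0;
}
```

Only requests that actually change the project ID are restricted, so
other attribute updates from user namespaces keep working.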

Test-Parameters: trivial testlist=sanity-quota,sanity-quota
Change-Id: If03bb120476eca9707b1b4db64e9594bb99df59e
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/32730
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-4939 utils: allow configuration through yaml files 46/31846/13
Ben Evans [Wed, 21 Mar 2018 20:57:18 +0000 (16:57 -0400)]
LU-4939 utils: allow configuration through yaml files

Add a -F option to lctl set_param.  The file must be in YAML format.
It accepts either set_param or conf_param formats and issues the
appropriate commands.

Reorganize set_param and conf_param infrastructures to allow
for shared code.

rename jt_lcfg_mgsparam to jt_lcfg_confparam
rename jt_lcfg_mgsparam2 to jt_lcfg_setparam_perm

Add test_806 to test reconfigure after writeconf

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I8c36ea9be162112e75412fbd990a4f21e108d000
Reviewed-on: https://review.whamcloud.com/31846
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-6142 obdclass: Fix style issues for llog_test.c 01/33501/2
Arshad Hussain [Sun, 21 Oct 2018 03:27:42 +0000 (08:57 +0530)]
LU-6142 obdclass: Fix style issues for llog_test.c

This patch fixes issues reported by checkpatch for file
lustre/obdclass/llog_test.c

Change-Id: I7b148b3db29f374421bad43764ca40bf7e8d6a9f
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/33501
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
21 months agoLU-6142 lnd: create enum kib_dev_caps 09/33409/4
James Simmons [Tue, 23 Oct 2018 03:57:31 +0000 (23:57 -0400)]
LU-6142 lnd: create enum kib_dev_caps

Cleanup IBLND_DEV_CAPS_* by creating enum kib_dev_caps and using
the BIT() macros.

Change-Id: Ia3feaa0a0a98d5621686cddf9cb02af50f42f78c
Test-Parameters: trivial
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33409
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-4684 lod: parse layout for migrating directory 56/33456/3
Lai Siyao [Wed, 17 Oct 2018 23:25:29 +0000 (07:25 +0800)]
LU-4684 lod: parse layout for migrating directory

If directory migration fails, the directory is marked
LMV_HASH_FLAG_MIGRATION.  lod_parse_dir_striping() should parse the
layout of such a directory, otherwise the directory can't be
accessed correctly.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I5ee2c2e3e5aa3f9befc3ad81be4bcdc6f9267842
Reviewed-on: https://review.whamcloud.com/33456
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11536 ofd: ofd_create_hdl may return 0 in case of ENOSPC 90/33390/3
Sergey Cheremencev [Mon, 25 Jun 2018 15:52:11 +0000 (18:52 +0300)]
LU-11536 ofd: ofd_create_hdl may return 0 in case of ENOSPC

ostid_set_id rewrites ofd_precreate_objects result after
"LU-6401 uapi: fix up lustre_ostid.h and lustre_fid.h".
This breaks the logic of osp_precreate_reserve() causing
osp_precreate_send() to return ESTALE instead of ENOSPC
when OST can't precreate objects.
osp_precreate_send() returns ESTALE because the result of
create is 0 while the last created fid on the OST is still the same
as the local last_id:

fs1-OST0001-osc-MDT0000: precreate fid [0x100010000:0x571607f:0x0] <
local used fid [0x100010000:0x571607f:0x0]: rc = -116
fs1-OST0001-osc-MDT0000: precreate failed opd_pre_status -116
fs1-OST0001-osc-MDT0000: cannot precreate objects: rc = -116

Change-Id: I4dc057c201253cab14e63c1f06bd5b0d56b5ad2d
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Fixes: 34acfbc2bfe502d18c12ba35771bde7c4a0f7906
Reviewed-on: https://es-gerrit.dev.cray.com/153462
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/33390
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11472 lnet: Decrement health on timeout 08/33308/4
Amir Shehata [Thu, 4 Oct 2018 20:00:49 +0000 (13:00 -0700)]
LU-11472 lnet: Decrement health on timeout

When a response times out we want to decrement the health of the
immediate next hop peer ni, so we don't use that interface if there
are others available.

When sending a message if there is a response tracker associated
with the MD, store the next-hop-nid there. If the response times
out then we can look up the peer_ni using the cached NID, and
decrement its health value.
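
The mechanism can be sketched as below. This is an illustrative Python model only: the `ResponseTracker` class, the health table keyed by NID, the decrement size, and the selection helper are assumptions made for the example, not LNet structures.

```python
class ResponseTracker:
    """Hypothetical tracker attached to an MD; caches the next-hop NID."""

    def __init__(self, next_hop_nid):
        self.next_hop_nid = next_hop_nid


def on_response_timeout(tracker, peer_health, decrement=1):
    """Look up the peer NI by the cached NID and lower its health value."""
    nid = tracker.next_hop_nid
    if nid in peer_health:
        peer_health[nid] = max(0, peer_health[nid] - decrement)
    return peer_health.get(nid)


def pick_peer_ni(peer_health):
    """Prefer the healthiest interface when several are available."""
    return max(peer_health, key=peer_health.get)
```

After a timeout on one interface, the next send prefers another interface with a higher health value, if one exists.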

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I6c2f49a695f078ee50378c0a468c7ee058f7e712
Reviewed-on: https://review.whamcloud.com/33308
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-6142 libcfs: Enforce kernel coding style for libcfs/debug.c 05/14105/3
Robert Read [Thu, 19 Mar 2015 06:42:39 +0000 (23:42 -0700)]
LU-6142 libcfs: Enforce kernel coding style for libcfs/debug.c

This patch enforces Linux kernel coding style for file
libcfs/libcfs/debug.c reported by checkpatch.

Test-Parameters: trivial
Change-Id: I98fb26a3e4d9dde6f620692e485f1709b8257fc0
Signed-off-by: Robert Read <robert.read@intel.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/14105
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
21 months agoLU-11556 tests: fix set_persistent_param_and_check breakage 22/33422/3
Andreas Dilger [Mon, 22 Oct 2018 23:10:57 +0000 (07:10 +0800)]
LU-11556 tests: fix set_persistent_param_and_check breakage

Since patch https://review.whamcloud.com/30087 "LU-7004 tests: move
from lctl conf_param to lctl set_param -P" was landed, there are
a few places that call set_persistent_param_and_check() with a node
name as an argument instead of a facet.

Fix the few places that are doing this.  Note that the call to
t32_verify_quota() is not enabled in this patch because of LU-11558,
since it is entirely possible that this code is currently broken.
This patch is about fixing set_persistent_param_and_check() breakage,
and t32_verify_quota() re-enablement can be done in a second patch.

Test-Parameters: trivial testlist=conf-sanity
Test-Parameters: testlist=conf-sanity mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I47473844c6103efe9c73c780de24605f4e3ebbe5
Reviewed-on: https://review.whamcloud.com/33422
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11516 mdd: do not assert on missing orphan 68/33368/2
Andreas Dilger [Sat, 13 Oct 2018 23:08:28 +0000 (17:08 -0600)]
LU-11516 mdd: do not assert on missing orphan

Do not assert if an orphan being cleaned up is missing.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Icf990bf5ea6dfa2098f0b1fa90d9f546d83ebbe5
Reviewed-on: https://review.whamcloud.com/33368
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11488 test: ignore statfs from precreate in sanity 133b() 17/33517/2
John L. Hammond [Tue, 30 Oct 2018 16:44:44 +0000 (11:44 -0500)]
LU-11488 test: ignore statfs from precreate in sanity 133b()

In sanity test_133b() use obdfilter client export stats rather than
target stats to check for statfs allowing us to ignore statfs RPCs
from the MDT for precreate.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia47e535d6ce94c712da8ad8698315188c8d64d83
Reviewed-on: https://review.whamcloud.com/33517
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11564 tests: add version check sanity-hsm tests 63/33463/5
James Nunez [Wed, 24 Oct 2018 15:21:53 +0000 (09:21 -0600)]
LU-11564 tests: add version check sanity-hsm tests

sanity-hsm test 24g, 260a and 260b were added to Lustre
tag 2.11.56.15. sanity-hsm test 1d was added to Lustre
2.10.59. Thus, we need to check that the server is
2.11.56 or later before running test 24g, 260a, and 260b
and is 2.10.59 or later for test 1d.

Several tests call the lustre_version_code() routine to check
the Lustre code version of the MDS. Make this call once at the
beginning of the test suite and keep the version in a global
variable.
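
As a rough illustration of the "compute once, compare many times" approach: the test suite itself is a shell script, so the Python below is only a sketch, and the bit layout used to pack the version into an integer is an assumption made for the example.

```python
def version_code(version):
    """Pack a dotted version such as "2.11.56" into one comparable integer."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (4 - len(parts))  # pad missing components with zero
    major, minor, patch, fix = parts[:4]
    return (major << 24) | (minor << 16) | (patch << 8) | fix


# Computed once at the start of the suite (the server version here is made up).
MDS_VERSION = version_code("2.10.59")


def should_run(required):
    """Return True if the cached server version is at least `required`."""
    return MDS_VERSION >= version_code(required)
```

With the example server version above, `should_run("2.10.59")` holds while `should_run("2.11.56")` does not, so the later tests would be skipped.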

Also, remove the call to return() after all calls to skip().

Test-Parameters: trivial mdsjob=lustre-b2_10 ossjob=lustre-b2_10 serverbuildno=136 testlist=sanity-hsm
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: If89e730ba9352b5eaa2dc24686372237375a7556
Reviewed-on: https://review.whamcloud.com/33463
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11373 tests: increase debug limit sanity 60b 74/33474/3
James Nunez [Thu, 25 Oct 2018 00:40:35 +0000 (18:40 -0600)]
LU-11373 tests: increase debug limit sanity 60b

We've seen cases where the number of debug messages
on the MDS exceeds the limit of 100 lines in
sanity test 60b. Since the line limit is an approximation,
increase this limit to 120 lines.

Test-Parameters: trivial
Test-Parameters: mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity
Test-Parameters: mdscount=2 mdtcount=2 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity
Test-Parameters: mdscount=2 mdtcount=2 mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs testlist=sanity

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ibf302893f468957983c11374b8fa829802ff136c
Reviewed-on: https://review.whamcloud.com/33474
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11479 rsync: replicate attributes of file in .lustrerepl 73/33373/2
John L. Hammond [Mon, 15 Oct 2018 18:58:10 +0000 (13:58 -0500)]
LU-11479 rsync: replicate attributes of file in .lustrerepl

When lustre_rsync receives a setattr or setxattr changelog record, the
file to be replicated may still be in the .lustrerepl directory of the
archive. When this is the case, apply the attributes to the file
there.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: I6e686d5c4dbeb3acf177a061eb70807c8dd7dfb3
Reviewed-on: https://review.whamcloud.com/33373
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-7770 lov: fix statfs for conf-sanity test_50b 69/33369/6
Andreas Dilger [Mon, 8 Oct 2018 22:00:08 +0000 (16:00 -0600)]
LU-7770 lov: fix statfs for conf-sanity test_50b

Wait for the *client* to be disconnected from the OSTs, not the MDS,
to ensure that the test is actually doing what it thinks it should.
In conf-sanity.sh::lazystatfs(), sleep between statfs operations to
ensure we are not just getting locally-cached statfs results.
Print out device status to ensure that wait_osc_import_state has
actually waited long enough for OSCs to be marked disconnected.

In obd_statfs() print the device name in the debug logs for clarity.

Have "lfs df" print block stats from MDTs only if no OSTs connected.

Checkpatch should warn about get_seconds(), not ktime_get_seconds().

Test-Parameters: mdtcount=4 testlist=conf-sanity,conf-sanity,conf-sanity envdefinitions=ONLY=50b
Test-Parameters: mdtcount=4 testlist=conf-sanity,conf-sanity,conf-sanity envdefinitions=ONLY=50b
Test-Parameters: mdtcount=4 testlist=conf-sanity,conf-sanity,conf-sanity envdefinitions=ONLY=50b
Change-Id: Icbe68f0a133f04f89d44f74a5caaa6c523fcab07
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33369
Tested-by: Jenkins
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11499 tests: skip test_56ba for old server 44/33344/7
Elena Gryaznova [Wed, 10 Oct 2018 16:49:58 +0000 (19:49 +0300)]
LU-11499 tests: skip test_56ba for old server

sanity test_56ba exercises the 'lfs find' options that were expanded
for the Progressive File Layout (PFL) feature.
This patch skips test_56ba on an old MDS where the PFL feature
does not exist.

Test-Parameters: trivial
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5536
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I2439663008548b89c78d0d3f13f3c0722c8f9ba7
Reviewed-on: https://review.whamcloud.com/33344
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11498 tests: remove duplicate write_disjoint test name 43/33343/4
Elena Gryaznova [Wed, 10 Oct 2018 16:34:07 +0000 (19:34 +0300)]
LU-11498 tests: remove duplicate write_disjoint test name

This patch renames the write_disjoint() test added by LU-9409 / LUS-1705
to write_disjoint_tiny() to keep the test names unique.

Test-Parameters: trivial testlist=parallel-scale envdefinitions=ONLY="write_disjoint write_disjoint_tiny"
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5939
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: I87961e244c5f3fcfdae8263591d03685d8d4fcbd
Reviewed-on: https://review.whamcloud.com/33343
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11476 lnet: set the health status correctly 07/33307/5
Amir Shehata [Thu, 4 Oct 2018 22:41:33 +0000 (15:41 -0700)]
LU-11476 lnet: set the health status correctly

There are cases where the health status wasn't set properly.
Most notably in the tx_done we need to deal with a specific
set of errno: ENETDOWN, EHOSTUNREACH, ENETUNREACH, ECONNREFUSED,
ECONNRESET. In all those cases we can try and resend to other
available peer NIs.
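
The errno classification can be sketched like this. The set contents come from the message above; the function name and the convention of passing a negative errno as the status are assumptions for the example.

```python
import errno

# Errors after which a resend over another peer NI may be attempted.
RESENDABLE = frozenset({
    errno.ENETDOWN,
    errno.EHOSTUNREACH,
    errno.ENETUNREACH,
    errno.ECONNREFUSED,
    errno.ECONNRESET,
})


def can_resend(status):
    """Return True if a tx_done status (negative errno) is worth a resend."""
    return abs(status) in RESENDABLE
```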

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie8f0275582d434bda5e394fccc2a4d88dd538c69
Reviewed-on: https://review.whamcloud.com/33307
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11419 lfsck: lfsck_namespace_shrink_linkea() dead loop 52/33252/3
Lai Siyao [Thu, 30 Aug 2018 13:08:57 +0000 (21:08 +0800)]
LU-11419 lfsck: lfsck_namespace_shrink_linkea() dead loop

lfsck_namespace_shrink_linkea() may fall into a dead loop if it tries
to delete XATTR_NAME_LINK.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I43e6e7917f8f89eb2cc873c8521cd3fbb528f495
Reviewed-on: https://review.whamcloud.com/33252
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11427 llite: optimize read on open pages 34/33234/3
Jinshan Xiong [Tue, 25 Sep 2018 19:27:22 +0000 (12:27 -0700)]
LU-11427 llite: optimize read on open pages

The current read-on-open implementation allocates a cl_page after the
data is piggybacked on the open request, which is expensive and
unnecessary.

This patch improves the case by just adding the pages into page cache.
As long as those pages will be discarded at lock revocation, there
should be no concerns.

Signed-off-by: Jinshan Xiong <jinshan.xiong@uber.com>
Change-Id: Idef1b70483e3780790ba5b95c26ef2d4141add5f
Reviewed-on: https://review.whamcloud.com/33234
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11427 lod: create layout in mdo_create() 33/33233/3
Jinshan Xiong [Tue, 25 Sep 2018 19:13:48 +0000 (12:13 -0700)]
LU-11427 lod: create layout in mdo_create()

This patch will create MDT layout in the path of mdo_create() before
mdt_object_open_lock() is invoked. The previous implementation created
layout in mdt_create_data(), so the layout lock couldn't be packed
in the reply of the open request. Later on an extra layout request
had to be issued for the layout lock, which killed all performance
gains from DoM for small file writes.

Signed-off-by: Jinshan Xiong <jinshan.xiong@uber.com>
Change-Id: Id11ac79c89d12bbe0e925fbc89417fca3e72e479
Reviewed-on: https://review.whamcloud.com/33233
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11124 utils: add "lfs getstripe -N" option 80/33280/3
Andreas Dilger [Wed, 3 Oct 2018 22:41:56 +0000 (16:41 -0600)]
LU-11124 utils: add "lfs getstripe -N" option

Add an "lfs getstripe -N" option to print the number of mirrors on a
file.  The code for printing the mirror count was already in
liblustreapi.c, but there was no option to request only this value
to be printed.

Move the VERBOSE_* flags into an enum and change the various functions
and structures using these flags to use the enum.  Rename a few of the
constants to be more specific, but add compatibility definitions.

Use "lfs getstripe -N" in sanity-flr and sanity-lfsck for mirror count
instead of parsing the mirror count from the verbose layout.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iafd111c25e22d94153596f9bd4a16750548cab07
Reviewed-on: https://review.whamcloud.com/33280
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11080 tests: skip async update recovery tests 89/32689/3
Elena Gryaznova [Thu, 11 Oct 2018 16:06:35 +0000 (19:06 +0300)]
LU-11080 tests: skip async update recovery tests

Patch skips replay-single async update recovery
tests 110,111,112,115 for old server where this feature
is missing.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=replay-single
Cray-bug-id: LUS-5837
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: I9d7ddf348955bad0644038fe898812cbf92bbdcd
Reviewed-on: https://review.whamcloud.com/32689
Tested-by: Jenkins
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-10030 idl: use proper ATTR/MDS_ATTR/MDS_OPEN flags 07/32107/6
Andreas Dilger [Fri, 19 Oct 2018 03:43:11 +0000 (23:43 -0400)]
LU-10030 idl: use proper ATTR/MDS_ATTR/MDS_OPEN flags

Add proper MDS_ATTR_* and MDS_OPEN_* flags for different flags
namespaces.  The MDS_OPEN_OWNEROVERRIDE was being mapped into
the MDS_ATTR_* flags in some cases.  This did not conflict yet, but
add separate ATTR_OVERRIDE and MDS_ATTR_OVERRIDE flags for this use
so they don't conflict in the future.

Remove the MDS_OPEN_CROSS flag, since this was only used internally
as a hack to pass open flags to mdd_permission(), but was truncating
the u64 open flags to a 32-bit int in the process.  Do the conversion
to MAY_* flags at the MDT layer instead of inside mdd_permission()
by moving the accmode() flag conversion into lustre_mds.h code.

The ATTR_OPEN flag has existed since kernel 2.6.27, so we can always
use that directly instead of the ATTR_FROM_OPEN flag we #defined.
The ATTR_RAW flag is no longer used at all and can be removed.

Rename various "flags" uses in the code to "open_flags" so that it
is more clear which flags values are being used.  This exposed a few
places in the code where we were using an int to pass these flags, but
some of the MDS_OPEN_* flags are using 64-bit values already.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I833a6e6102f947a9276cb6bf03826fd4a5ecab07
Reviewed-on: https://review.whamcloud.com/32107
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-8066 obd: use correct names for conn_uuid 13/33213/6
James Simmons [Sat, 13 Oct 2018 02:34:58 +0000 (22:34 -0400)]
LU-8066 obd: use correct names for conn_uuid

The LUSTRE_R[OW]_ATTR() macros assume that the name of the sysfs
file to create matches the beginning of the function names. In
the case of LUSTRE_RO_ATTR(conn_uuid) this maps to the function
conn_uuid_show() and generated sysfs files "conn_uuid". While it
makes sense to standardize this interface we need to keep the
old xxx_conn_uuid. We can create these xxx_conn_uuid sysfs files
by using the base sysfs attr macro LUSTRE_ATTR().

Change-Id: I3bea85334578a07f4758f54773846d0f24a3d69a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33213
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-10337 mdt: Allow open of open orphans 05/30405/23
Patrick Farrell [Thu, 8 Mar 2018 11:47:56 +0000 (05:47 -0600)]
LU-10337 mdt: Allow open of open orphans

Standard open by handle behavior allows opening of open
unlinked files.  This currently only works in Lustre
if the file is already open on the same node, which is
insufficient.

When an open file is unlinked, we make it an orphan.
These files can be recognized by checking their open count
(mod_count).  It's enough to just make opening these files
possible, because the client cannot look them up to do an
open except when using a file handle.

Cray-bug-id: LUS-2626
Change-Id: Idd7898cefcf60b28c682e578774411e476216c9e
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/30405
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-9795 tests: exclude several tests which conflict with SSK 62/28662/37
Chris Hanna [Fri, 26 Jan 2018 14:17:39 +0000 (09:17 -0500)]
LU-9795 tests: exclude several tests which conflict with SSK

When SSK is activated by setting SHARED_KEY to true,
some tests in various suites fail, often because components
are manually halted or failed.
This patch excludes these tests under SSK and makes minor changes
for SSK compatibility.

Also reconnect client-to-OST connections if they became idle.
This new idle ability is brought by patch
https://review.whamcloud.com/16682 and it prevents idle OSCs from
taking into account new GSS flavor.

Change-Id: I998ae9bf1998f206914ff425e1f6e27741443e9c
Test-Parameters: testlist=sanity-gss envdefinitions=ONLY=1,SHARED_KEY=true
Signed-off-by: Chris Hanna <hannac@iu.edu>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/28662
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-6142 llite: move CONFIG_SECURITY handling to llite_internal.h 10/33410/2
James Simmons [Sat, 20 Oct 2018 18:42:45 +0000 (14:42 -0400)]
LU-6142 llite: move CONFIG_SECURITY handling to llite_internal.h

For the Linux kernel it's recommended to keep CONFIG_* wrapped code
in a header file instead of the source files to avoid making the
code more difficult to read. Move CONFIG_SECURITY wrapped code
to llite_internal.h in this case.

Change-Id: I60eba17181d3b57fb64e99a441163f975dbab03c
Test-Parameters: trivial
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
21 months agoLU-11474 lnet: unlink md if fail to send recovery 06/33306/3
Amir Shehata [Thu, 4 Oct 2018 21:00:37 +0000 (14:00 -0700)]
LU-11474 lnet: unlink md if fail to send recovery

MD for recovery ping should be unlinked if we fail to send the GET.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iac84ceda886f47df1b1a1d734129c8d29851886b
Reviewed-on: https://review.whamcloud.com/33306
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11288 osc: re-check target versus available grant 26/33226/10
Alex Zhuravlev [Tue, 25 Sep 2018 06:48:06 +0000 (09:48 +0300)]
LU-11288 osc: re-check target versus available grant

- under the spinlock, otherwise it's possible that available
  grant has changed since target calculation and bytes to
  shrink go negative.
- tgt_grant_alloc() should avoid negative grants
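
A minimal sketch of the first point, assuming a client object that keeps its available grant behind a lock. The class and function names are hypothetical, and a Python `threading.Lock` stands in for the spinlock.

```python
import threading


class Client:
    """Hypothetical client state: available grant protected by a lock."""

    def __init__(self, avail_grant):
        self.lock = threading.Lock()
        self.avail_grant = avail_grant


def bytes_to_shrink(cli, target):
    """Re-check the target against the available grant while holding the lock."""
    with cli.lock:
        if cli.avail_grant <= target:
            # The grant changed since the target was calculated: nothing to give back.
            return 0
        to_shrink = cli.avail_grant - target
        cli.avail_grant = target
        return to_shrink
```

Because the comparison and the subtraction happen under the same lock, the result cannot go negative even if the grant shrank after the target was computed.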

Change-Id: I35613e4e840e172977c7b866fb429c40a7fefc8f
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33226
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-6142 obdclass: Fix style issues for obdo.c 82/32982/3
Arshad Hussain [Sun, 12 Aug 2018 04:52:23 +0000 (10:22 +0530)]
LU-6142 obdclass: Fix style issues for obdo.c

This patch fixes issues reported by checkpatch
for file lustre/obdclass/obdo.c

Change-Id: Ie658d5428407f69f7654c5450464589d7ddb2282
Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/32982
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
21 months agoLU-10801 utils: fix lfs_migrate argument parsing 77/32977/10
Andreas Dilger [Fri, 10 Aug 2018 06:59:10 +0000 (14:59 +0800)]
LU-10801 utils: fix lfs_migrate argument parsing

Since the landing of the following patch, any short options with
adjacent arguments (e.g. -S1M or -E-1) are treated as separate
options (e.g. -S -1 -M or -E -1).
- Lustre-commit: 60c5bc2502591f46260e11db540c0ec2adbc8db8
- Lustre-change: https://review.whamcloud.com/20621

This patch is to fix the broken argument parsing in lfs_migrate.
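
For reference, the intended treatment of adjacent arguments can be shown with Python's standard `getopt` module. lfs_migrate itself is a shell script, so this is only an illustration of the expected parsing result, using the -S and -E options from the message above; the `parse` wrapper and the file name are made up.

```python
import getopt


def parse(argv):
    """Parse -S and -E, each of which takes an argument."""
    opts, rest = getopt.getopt(argv, "S:E:")
    return opts, rest
```

Here `parse(["-S1M", "-E-1", "file"])` yields the same options as `parse(["-S", "1M", "-E", "-1", "file"])`: the text adjacent to the option letter is its argument, not a further run of options.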

Change-Id: I99b9518a8f371c2becb6b1fc346b8a14dd02870e
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32977
Tested-by: Jenkins
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11158 mdt: grow lvb buffer to hold layout 47/32847/11
Bobi Jam [Thu, 19 Jul 2018 15:19:43 +0000 (23:19 +0800)]
LU-11158 mdt: grow lvb buffer to hold layout

A write intent RPC could generate a layout bigger than the initial
mdt_max_mdsize, so that the new layout cannot be returned to the
client.  This patch fixes the issue by:

* fix a glitch in lod_use_defined_striping(), where v3 should be
  updated along v1.
* change lvbo_fill() to return -ERANGE in this case, and store in its
  @buflen parameter the needed buffer size
* in ldlm_handle_enqueue0(), when ldlm_lvbo_fill() detects -ERANGE,
  it grows the corresponding RMF_DLM_LVB buffer and retrieves the
  layout to refill the buffer again.
* define a new MAX_MD_SIZE to hold a reasonable composite layout, and
  keeps old MAX_MD_SIZE as MAX_MD_SIZE_OLD.
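
The grow-and-retry pattern in the list above can be sketched as follows. This is a simplified Python model: the function names echo the message, but their signatures, the tuple return value, and the use of a byte string as the layout are assumptions for illustration.

```python
import errno


def lvbo_fill(layout, buflen):
    """Return (rc, needed): rc is the bytes filled, or -ERANGE if too small."""
    if len(layout) > buflen:
        return -errno.ERANGE, len(layout)
    return len(layout), len(layout)


def fill_with_retry(layout, buflen):
    """Fill the buffer, growing it once if the first attempt reports -ERANGE."""
    rc, needed = lvbo_fill(layout, buflen)
    if rc == -errno.ERANGE:
        buflen = needed  # grow the buffer to the reported size
        rc, needed = lvbo_fill(layout, buflen)
    return rc, buflen
```

A 300-byte layout with a 128-byte initial buffer ends up with a 300-byte buffer, while a layout that already fits leaves the buffer size unchanged.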

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I255b954195b3e64c3edd416c0cb209df0d9fc43a
Reviewed-on: https://review.whamcloud.com/32847
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-4684 migrate: replace PFID via source 24/33324/2
Lai Siyao [Fri, 31 Aug 2018 11:47:12 +0000 (19:47 +0800)]
LU-4684 migrate: replace PFID via source

In directory migration, when the OST object PFID needs to be updated,
it should always be done via the source object, because the target
object may be remote. In this case,
lod_obj_stripe_replace_parent_fid_cb() doesn't compare the parent FID
with that in XATTR_NAME_FID, but sets the passed-in FID directly.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Icd7f6521ecac43cfeaee3e61e662d94115d63d68
Reviewed-on: https://review.whamcloud.com/33324
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-8391 ldlm: check double grant race after resource change 75/21275/11
Li Dongyang [Wed, 13 Jul 2016 06:17:53 +0000 (16:17 +1000)]
LU-8391 ldlm: check double grant race after resource change

In ldlm_handle_cp_callback(), we call lock_res_and_lock and then
check if the ldlm lock has already been granted.
If the lock resource has changed, we release the lock and go ahead
allocating a new resource, then grab the lock again before calling
ldlm_grant_lock().
However this gives another thread an opportunity to grab the lock
and pass the check, while we change the resource. Eventually the
other thread calls ldlm_grant_lock() on the same ldlm lock and
triggers a LASSERT.

Fix the issue by doing double grant race check after changing the
lock resource.
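
The shape of the fix, a second "already granted?" check after the lock is reacquired, can be sketched as below. The `DlmLock` class, the callback parameter, and the grant counter are hypothetical and exist only to make the race window visible in a single-threaded model.

```python
import threading


class DlmLock:
    """Hypothetical stand-in for an ldlm lock."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.granted = False
        self.grant_count = 0


def handle_cp_callback(lock, change_resource=None):
    """Grant the lock unless it was already granted; return True if we granted it."""
    with lock.mutex:
        if lock.granted:
            return False
    if change_resource is not None:
        # The mutex is dropped here, so another thread may grant the lock.
        change_resource()
    with lock.mutex:
        if lock.granted:  # re-check after the resource change
            return False
        lock.granted = True
        lock.grant_count += 1
        return True
```

If another thread grants the lock while the resource is being changed, the second check returns early and the lock is granted only once.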

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Change-Id: Ib327b5e6b5f211909db5350de383d470a891e72a
Reviewed-on: https://review.whamcloud.com/21275
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11535 ldiskfs: allocate extra ldiskfs_ext_path for root 88/33388/2
Artem Blagodarenko [Wed, 17 Oct 2018 12:17:47 +0000 (15:17 +0300)]
LU-11535 ldiskfs: allocate extra ldiskfs_ext_path for root

The ext4_s_max_ext_tree_depth patch changes the path array allocation.
The maximum extent depth is computed in ext4_ext_init(), but the
extent root stored in i_data is not counted. This leads to an
out-of-bounds array write in ldiskfs_ext_remove_space() and the
following fault during transaction commit:

BUG: unable to handle kernel NULL pointer dereference at (null)
[<ffffffffa0f25acb>] osd_trans_commit_cb+0xcb/0x2b0 [osd_ldiskfs]
[<ffffffffa0ecc8e1>] ldiskfs_journal_commit_callback+0x61/0x80
[<ffffffffa03eb8ef>] jbd2_journal_commit_transaction+0x116f/0x15a0

This patch adds one extra element for the root in the path array.
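
A minimal sketch of the sizing issue, assuming the counted depth covers
only the levels below the root. The names demo_ext_path,
demo_alloc_path() and demo_fill_path() are hypothetical and are not the
ldiskfs code; the exact allocation size used by the real patch is not
shown here.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct ldiskfs_ext_path. */
struct demo_ext_path {
	int level;
};

/*
 * Allocate a path array for a tree with counted_depth levels below the
 * root. The root needs its own slot, hence the "+ 1".
 */
static struct demo_ext_path *demo_alloc_path(int counted_depth, int *nr_slots)
{
	int n = counted_depth + 1;	/* the extra element is for the root */

	*nr_slots = n;
	return calloc(n, sizeof(struct demo_ext_path));
}

/*
 * Walk from the root (index 0) down to the deepest level and record
 * each level. Returns the number of entries written.
 */
static int demo_fill_path(struct demo_ext_path *path, int counted_depth)
{
	int i;

	for (i = 0; i <= counted_depth; i++)
		path[i].level = i;
	return i;
}
```

If the array held only counted_depth entries, the last write in
demo_fill_path() would land one element past the end of the array.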

Cray-bug-id: LUS-6488
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@gmail.com>
Change-Id: I950e223f6ad68c88c1e78fc62448542fd4e78329
Reviewed-on: https://review.whamcloud.com/33388
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11450 mdd: avoid logging trusted.som xattr in changelogs 23/33323/5
Qian Yingjin [Tue, 9 Oct 2018 07:41:59 +0000 (15:41 +0800)]
LU-11450 mdd: avoid logging trusted.som xattr in changelogs

The Lazy Size on MDT feature causes the trusted.som xattr to be logged
in the changelog whenever a file's xattr data needs to be updated
because of file open/close or truncate operations.
This patch avoids logging this xattr for every file.

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I8f069afdfa84a8fc9f96819d066fd3e4d08794af
Reviewed-on: https://review.whamcloud.com/33323
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11553 kernel: new kernel [RHEL7.5 4.14.0-49.13.1.el7a] 11/33411/5
Minh Diep [Sun, 21 Oct 2018 21:27:14 +0000 (14:27 -0700)]
LU-11553 kernel: new kernel [RHEL7.5 4.14.0-49.13.1.el7a]

This patch makes changes to support new RHEL 7.5 release on ARM.

Test-Parameters: forbuildonly

Change-Id: Id0bab8510b7479d979b5d66ea96969140c509feb
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33411
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11453 class: use INIT_LIST_HEAD_RCU instead INIT_LIST_HEAD 17/33317/4
Yang Sheng [Mon, 8 Oct 2018 15:01:01 +0000 (23:01 +0800)]
LU-11453 class: use INIT_LIST_HEAD_RCU instead INIT_LIST_HEAD

Use INIT_LIST_HEAD_RCU to keep the compiler from optimizing the list
head initialization too aggressively in some cases.
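
For illustration, a user-space sketch of the difference between the two
initializers. In the Linux kernel, INIT_LIST_HEAD_RCU() stores both
pointers with WRITE_ONCE(); the DEMO_WRITE_ONCE macro and the
demo_list_head type below are simplified, hypothetical stand-ins (using
the GNU __typeof__ extension), not the kernel definitions.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's struct list_head. */
struct demo_list_head {
	struct demo_list_head *next;
	struct demo_list_head *prev;
};

/* Volatile store: the compiler may not elide, merge or tear it. */
#define DEMO_WRITE_ONCE(x, val) \
	(*(volatile __typeof__(x) *)&(x) = (val))

/* Plain initializer: stores are ordinary and free to be optimized. */
static void demo_init_list_head(struct demo_list_head *list)
{
	list->next = list;
	list->prev = list;
}

/* RCU-style initializer: each pointer is written with a volatile store. */
static void demo_init_list_head_rcu(struct demo_list_head *list)
{
	DEMO_WRITE_ONCE(list->next, list);
	DEMO_WRITE_ONCE(list->prev, list);
}
```

Both functions leave the list head pointing at itself; the difference is
only in what the compiler is allowed to do with the stores.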

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I66b340ac3147d2cb911a2b7d3e210c6847047dac
Reviewed-on: https://review.whamcloud.com/33317
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
21 months agoLU-11393 osd-zfs: time struct changes 45/33345/4
Nathaniel Clark [Wed, 10 Oct 2018 21:16:58 +0000 (17:16 -0400)]
LU-11393 osd-zfs: time struct changes

Account for the changes to the time structure in ZFS 0.7.10 and in
pre-0.8 ZFS.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I5c2373d053ea92d8bf04befe1d096159b8a34126
Reviewed-on: https://review.whamcloud.com/33345
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
21 months agoLU-11490 tests: fix rr_alloc() test to use FSNAME 33/33333/2
Elena Gryaznova [Wed, 10 Oct 2018 13:46:46 +0000 (16:46 +0300)]
LU-11490 tests: fix rr_alloc() test to use FSNAME

Patch fixes rr_alloc() test to use FSNAME instead of lustre in
the parameter path.

Test-Parameters: testlist=trivial testlist=parallel-scale envdefinitions=ONLY=rr_alloc
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5955
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Change-Id: Iebd147dd9757357bf7c8376e9271cb17f4d076a9
Reviewed-on: https://review.whamcloud.com/33333
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11482 flr: Inherit flags from template 26/33326/10
Patrick Farrell [Thu, 11 Oct 2018 11:55:42 +0000 (06:55 -0500)]
LU-11482 flr: Inherit flags from template

New files created in directories with a default layout
should inherit the per-component layout flags.

This allows us to set the prefer or nosync flags in a
default layout and apply them to files created in that
directory.

Cray-bug-id: LUS-6574
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I053ca0f3db3e0967799f469feeb4f1f12b144be7
Reviewed-on: https://review.whamcloud.com/33326
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11418 osd-zfs: call stop_cb if transaction start fail 48/33248/4
Lai Siyao [Thu, 30 Aug 2018 06:11:42 +0000 (14:11 +0800)]
LU-11418 osd-zfs: call stop_cb if transaction start fail

osd_trans_stop() should call osd_trans_stop_cb() even if the
transaction was not started successfully.
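
A rough sketch of the intended behaviour, assuming a much simplified
transaction handle. struct demo_trans, demo_stop_cb() and
demo_trans_stop() are hypothetical names for illustration and do not
mirror the osd-zfs code.

```c
#include <stdbool.h>

/* Hypothetical, simplified transaction handle. */
struct demo_trans {
	bool started;		/* did the transaction start succeed? */
	int stop_cb_calls;	/* how often the stop callback ran */
	int commits;		/* how often the transaction was committed */
};

/* Stand-in for osd_trans_stop_cb(). */
static void demo_stop_cb(struct demo_trans *th)
{
	th->stop_cb_calls++;
}

/*
 * Stand-in for osd_trans_stop(): the stop callback runs on every path,
 * but only a transaction that actually started is committed.
 */
static int demo_trans_stop(struct demo_trans *th)
{
	demo_stop_cb(th);

	if (!th->started)
		return 0;	/* nothing to commit */

	th->commits++;
	return 0;
}
```

The point of the sketch is that callers waiting on the stop callback are
notified even when the start failed and there is nothing to commit.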

Improve debug messages for distributed transactions.

Add sanity 416 for this.

Get rid of ot_write_commit which is useless.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I35da81ebd2c9e97c12ae52bd4faed60393cd67d6
Reviewed-on: https://review.whamcloud.com/33248
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11429 mdt: rename mdt_remote_permission 47/33247/2
Andreas Dilger [Thu, 27 Sep 2018 11:41:01 +0000 (13:41 +0200)]
LU-11429 mdt: rename mdt_remote_permission

Rename mdt_remote_permission() to mdt_remote_dir_permission() and
mdt_remote_permission_check() to mdt_remote_dir_permission_check()
to match the "mdt_remote_dir" and "mdt_remote_dir_gid" proc tunable
names.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I3336260b3fdc0a1ab3b12a7e2c4722c7a63ebbe5
Reviewed-on: https://review.whamcloud.com/33247
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11417 llapi: add llapi_layout_get_by_xattr(3) API 30/33230/2
Andreas Dilger [Tue, 25 Sep 2018 10:17:46 +0000 (12:17 +0200)]
LU-11417 llapi: add llapi_layout_get_by_xattr(3) API

Add new llapi_layout_get_by_xattr(3) interface to be able to extract
a layout structure from a LOV EA xattr.  This can be useful when the
xattr is retrieved from some external source (e.g. tarball, HSM, or
tools that directly access the underlying ldiskfs or ZFS filesystem).

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I9b405bf6b3119e44097f36d49ac5859ff93ebbe5
Reviewed-on: https://review.whamcloud.com/33230
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11282 tests: log skip message for sanityn test_19 14/33214/2
Andreas Dilger [Fri, 21 Sep 2018 21:20:17 +0000 (15:20 -0600)]
LU-11282 tests: log skip message for sanityn test_19

Log the skip message for sanityn test_19() rather than just echoing it
and returning success.  The current behaviour makes it difficult to
see if this test is being run or not.

Test-Parameters: trivial testlist=sanityn
Test-Parameters: testlist=sanityn mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I05e2a9ef7c9fc5fb1be6533eab8bcb885f54ca6c
Reviewed-on: https://review.whamcloud.com/33214
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11010 tests: remove calls to return after skip() 33/32733/7
James Nunez [Tue, 26 Jun 2018 22:45:04 +0000 (16:45 -0600)]
LU-11010 tests: remove calls to return after skip()

The skip() routine now contains a call to exit. All calls
to skip() and skip_env() should be reviewed and calls to
return that follow skip() should be removed.

This is the fourth patch in a series that removes calls
to return after skip() in the Lustre test suites.

Calls to return after skip() are removed for:
mmp.sh
ost-pools.sh
parallel-scale-cifs.sh
parallel-scale-nfs.sh
posix.sh

Test-Parameters: trivial testlist=mmp
Test-Parameters: ostcount=1 osscount=1 testlist=ost-pools
Test-Parameters: mdtfilesystemtype=zfs testlist=posix,mmp
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ib6e4c81069b142652a0c50c339683dca21f03199
Reviewed-on: https://review.whamcloud.com/32733
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11392 tests: check race for llog_process_thread 92/33192/3
Alexander Boyko [Tue, 18 Sep 2018 12:58:41 +0000 (08:58 -0400)]
LU-11392 tests: check race for llog_process_thread

The patch adds test 10h to llog_test, which runs as part of sanity
test 60a. It reproduces a race between llog_process_thread and llog_add.
The llog should be wrapped so that it has old data on disk and zeros in
the bitmap.
1. llog_process_thread reads part of llog at buffer.
1. process a last record, checks the next record fields
2. llog_add adds a record and marks new record at bitmap
1. check bitmap flag and process the old record from buffer

Test-Parameters: testlist=sanity envdefinitions=ONLY=60a
Cray-bug-id: LUS-6287
Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ic89c81dd918d856f441df4d3257377e09b91a8cc
Reviewed-on: https://review.whamcloud.com/33192
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11193 llog: Do not write to read-only devices 57/33157/4
Nathaniel Clark [Thu, 13 Sep 2018 17:41:09 +0000 (13:41 -0400)]
LU-11193 llog: Do not write to read-only devices

Check whether the device is read-only before trying to start a
transaction on it.  Lustre snapshots are generally read-only.

When the underlying device is read-only, we check in the following places:
* llog_destroy, llog_write - return EROFS
* llog_open_create - on the create side we return EROFS
* llog_cancel_rec - we return 0, which means success, whereas 1 means
"success & log destroyed"

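The return-code convention listed above can be sketched as follows.
The demo_* functions and struct demo_dev are hypothetical stand-ins with
simplified signatures; they only model the return values and are not the
llog API.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical device handle carrying only a read-only flag. */
struct demo_dev {
	bool read_only;
};

/* Models llog_write/llog_destroy: refuse to modify a read-only device. */
static int demo_llog_write(const struct demo_dev *dev)
{
	if (dev->read_only)
		return -EROFS;
	/* A real implementation would start a transaction here. */
	return 0;
}

/*
 * Models llog_cancel_rec: 0 means success, 1 means "success and the log
 * was destroyed". On a read-only device nothing is modified, so the
 * result is plain success.
 */
static int demo_llog_cancel_rec(const struct demo_dev *dev, bool last_record)
{
	if (dev->read_only)
		return 0;
	return last_record ? 1 : 0;
}
```

Returning 0 rather than an error from the cancel path lets callers treat
a read-only snapshot as "nothing to do" instead of as a failure.
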
Test-Parameters: trivial testlist=sanity-lsnapshot mdtcount=4 mdscount=2 ostfilesystemtype=zfs mdtfilesystemtype=zfs
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: Ia0083c57ceb589698b1422fec57e75aa6e68948a
Reviewed-on: https://review.whamcloud.com/33157
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11440 doc: recommend e2fsprogs 1.44.3.wc1 70/33370/2
Li Dongyang [Mon, 15 Oct 2018 00:07:03 +0000 (11:07 +1100)]
LU-11440 doc: recommend e2fsprogs 1.44.3.wc1

Update the recommended e2fsprogs version to 1.44.3.wc1

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I164f7c12ed718f1939d6fc392eb7bb7f9286f053
Reviewed-on: https://review.whamcloud.com/33370
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
21 months agoLU-11524 tests: fix sanity-sec test_31 for all situations 80/33380/5
Sebastien Buisson [Tue, 16 Oct 2018 12:52:46 +0000 (21:52 +0900)]
LU-11524 tests: fix sanity-sec test_31 for all situations

In case setupall() is called with server_only, this info must be
passed to init_param_vars(), and init_param_vars() must return
immediately. Otherwise, it will try to do client-specific tunings
(including quota settings) whereas no clients are mounted.

Modify cleanup_31() in sanity-sec so that client-specific
tunings are done via setupall().
Also make client umount more robust in test_31.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I2db17c139768d0842ff65ac8313a8e7d1484c4ef
Reviewed-on: https://review.whamcloud.com/33380
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
21 months agoLU-11329 utils: create tests maintainers list 60/33360/2
James Nunez [Fri, 12 Oct 2018 20:21:43 +0000 (14:21 -0600)]
LU-11329 utils: create tests maintainers list

Add the subsystem lustre/tests to the existing Lustre
subsystems in the Maintainers List and add myself as a
maintainer.

Test-Parameters: trivial
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ic78dbd151d9cdf73fe32299da048bb103ed7592f
Reviewed-on: https://review.whamcloud.com/33360
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
21 months agoLU-11276 ldlm: don't apply ELC to converting and DOM locks 25/33125/5
Mikhail Pershin [Fri, 7 Sep 2018 10:23:48 +0000 (18:23 +0800)]
LU-11276 ldlm: don't apply ELC to converting and DOM locks

Prevent ELC for locks being converted and for locks
having the DOM bit set, to avoid needless data flushes.

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Id1429412f2ccd77f037bef2a851d22874a44dce6
Reviewed-on: https://review.whamcloud.com/33125
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11347 osd: do not use pagecache for I/O 75/32875/13
Alex Zhuravlev [Wed, 25 Jul 2018 10:24:27 +0000 (14:24 +0400)]
LU-11347 osd: do not use pagecache for I/O

For testing purposes the cache is constantly disabled.

 - with non-rotational storage
 - when both read and write caches are disabled
 - sanityn/16c runs fsx with the cache disabled

Change-Id: If6ea9186485cd0aceb0372b68f4860de3a4fb124
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32875
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11199 mdt: Attempt lookup lock on open 29/32929/5
Patrick Farrell [Tue, 28 Aug 2018 14:28:29 +0000 (09:28 -0500)]
LU-11199 mdt: Attempt lookup lock on open

Commit 4f50273a (LU-10269 ldlm: fix the issues introduced
by try bits) changed the locking behavior on open to not
attempt to grant the LOOKUP lock bit.  This causes a
performance regression in open(), which is up to 75% in
some benchmarks, such as mdsrate (from lustre/tests/mpi):

First create the files:
mpirun -n 4 /usr/lib64/lustre/tests/mdsrate \
-d /mnt/lustre/mdsrate --create --nfile=30000

Then drop caches:
echo 3 > /proc/sys/vm/drop_caches

Then run the open benchmark:
mpirun -n 4 /usr/lib64/lustre/tests/mdsrate \
-d /mnt/lustre/mdsrate --open --iters 8000 --nfile=30000
[More details in LU-11199]

This patch reverts that specific part of 4f50273a, which
restores open() performance to prior levels.

It may not be 100% correct to ask for open only when
also asking for layout, but this is the earlier behavior.

There is a further patch in flight to optimize this
code:
https://review.whamcloud.com/32156

And further changes are left for that patch.

Cray-bug-id: LUS-6358

Change-Id: Iceca88807e99955f28eba6bbcb3585964f7df2f4
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/32929
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11473 doc: add lfs-getsom man page 90/33290/2
James Nunez [Thu, 4 Oct 2018 21:32:42 +0000 (15:32 -0600)]
LU-11473 doc: add lfs-getsom man page

The Lazy Size on MDT feature added a flag to 'lfs' to allow
users to get the LSOM data. Add the lfs "getsom" flag to the
lfs man page.

Also, make a minor correction to the llsom_sync man page that
corrects the order of the --device flag for changelog_register
and changelog_deregister.

Test-Parameters: trivial
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I9a5acab6de3b7f32eb246c94e9975e30f63f10e1
Reviewed-on: https://review.whamcloud.com/33290
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-1095 misc: quiet console messages at startup 81/33281/2
Andreas Dilger [Wed, 3 Oct 2018 23:16:23 +0000 (17:16 -0600)]
LU-1095 misc: quiet console messages at startup

Some modules print less-than-useful messages on every load.
Turn these into internal debug messages to reduce noise.

The message in gss_init_svc_upcall() should also be quieted,
but it exposes that this function is waiting 1.5s on each module
load for lsvcgssd to start.  This should be fixed separately.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib51ce0e9a88a94d8d2d5eb0906abef0f544cab07
Reviewed-on: https://review.whamcloud.com/33281
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11454 ptlrpc: Make CPU binding switchable 62/33262/6
Patrick Farrell [Thu, 4 Oct 2018 12:10:11 +0000 (07:10 -0500)]
LU-11454 ptlrpc: Make CPU binding switchable

LU-6325 added CPT binding to the ptlrpc worker threads on
the servers.  This is often desirable, especially where
NUMA latencies are high, but it is not always beneficial.

If NUMA latencies are low, there is little benefit, and
sometimes it can be quite costly:

In particular, if NID-CPT hashing with routers leads to an
unbalanced workload by CPT, it is easy to end up in a
situation where the CPUs in one CPT are maxed out but
others are idle.

To this end, we add module parameters to allow disabling
the strict binding behavior, allowing threads to use all
CPUs.

This is complicated a bit because we still want separate
service partitions - The existing "no affinity" behavior
places all service threads in a single service partition,
which gives only one queue for service wakeups.

So we separate binding behavior from CPT association,
allowing us to keep multiple service partitions where
desired.

Module parameters are added to ldlm, mdt, and ost, of the
form "servicename_cpu_bind", such as "mds_rdpg_cpu_bind".

Setting them to "0" will disable the strict CPU binding
behavior for the threads in that service.

Parameters were not added for certain minor services which
do not have any CPT affinity/binding behavior today.  (This
appears to be because they are not expected to be
performance sensitive.)

cray-bug-id: LUS-6518
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I1f6f9bb7a11da3a3eec7fc14c41d09ed27700f46
Reviewed-on: https://review.whamcloud.com/33262
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11448 kernel: kernel update RHEL7.5 [3.10.0-862.14.4.el7] 54/33254/5
Jian Yu [Sat, 6 Oct 2018 06:35:45 +0000 (23:35 -0700)]
LU-11448 kernel: kernel update RHEL7.5 [3.10.0-862.14.4.el7]

Update RHEL7.5 kernel to 3.10.0-862.14.4.el7.

Change-Id: I4901102347a14d23645547efc84857868acec0f7
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33254
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-11369 hsm: allow non-owner writers to set HSM state dirty 58/33158/2
John L. Hammond [Thu, 13 Sep 2018 17:52:45 +0000 (12:52 -0500)]
LU-11369 hsm: allow non-owner writers to set HSM state dirty

In mdt_add_dirty_flag(), bump up the capability so that
mdt_hsm_attr_set() will succeed even if the writer (or truncater) is
not the owner of the file.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ibd7e9e039c3a984642b4a01c63cd11d2029e93f1
Reviewed-on: https://review.whamcloud.com/33158
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-8066 llite: make llite/lov and lmv symlinks 16/32516/4
James Simmons [Mon, 8 Oct 2018 15:17:55 +0000 (11:17 -0400)]
LU-8066 llite: make llite/lov and lmv symlinks

The old proc code had /proc/sys/fs/lustre/llite/.../lov and lmv
dirs that contained the name of the dir in lustre/lov and lustre/lmv
to make it easier to find the correct obd device there, but
I imagine a better solution would be to just create a symlink with
the same name. The name is then pointless and the target dir would
have a uuid file just as if it were the old-style dir.

This is a modified version of

Linux-commit : d8ede3f1d5d94618442a61067c6b98a2afbb0962

Change-Id: I90bc1b75e07f0aaa4c3119671f6d097b0e7353b3
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/32516
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-8324 hsm: prioritize one RESTORE once in a while 23/31723/14
Quentin Bouget [Fri, 4 May 2018 12:53:12 +0000 (14:53 +0200)]
LU-8324 hsm: prioritize one RESTORE once in a while

Currently, HSM requests are run in the order they are submitted to
the coordinator. This has the undesirable side effect that the more
interactive requests (RESTORE / REMOVE) can take quite a while to be
run when a huge batch of ARCHIVE requests are already queued.

This patch is not a clean fix to LU-8324; it is merely an attempt at
making things bearable while a proper solution is being developed.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I8697ffae19b28f31901d9e61cce55b40f848fb51
Reviewed-on: https://review.whamcloud.com/31723
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
21 months agoLU-8950 tests: do not use make_custom_file_for_progress() 94/24394/17
Quentin Bouget [Wed, 20 Dec 2017 15:12:36 +0000 (15:12 +0000)]
LU-8950 tests: do not use make_custom_file_for_progress()

Do not use make_custom_file_for_progress() in sanity-hsm in tests:
12c, 26, 27b, 28, 31b, 31c, 33, 34, 35, 36, 54, 55, 56, 60, 62, 71,
104, 200, 202, 221, 223b, 225, 251, 252, 407

Before this patch, make_custom_file_for_progress() was used to create
big files (5 MB to 40 MB). There were several use cases for that:
 - in combination with the --bandwidth option of lhsmtool_posix, it
   allows synchronizing HSM operations and other things;
 - to log enough "progress events", HSM operations need to last long
   enough (the --bandwidth option comes in handy too).

It needed to be removed because:
 - archiving and restoring big files at a 1 MB/s rate takes too much
   time (admittedly, that was the point);
 - there are other ways to have HSM operations occur concurrently, for
   example suspending copytools to delay request processing;
 - the name make_custom_file_for_progress() does not correctly reflect
   what the function does and should be used for. Wherever big files
   are still needed, the patch uses create_file() instead.

Removing the need to archive or restore big files at a limited rate
represents quite a speed-up on sanity-hsm.

Test-Parameters: trivial testlist=sanity-hsm,sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: Ib9eb7599f9e16d8790630c69c9d1c7be3df416a1
Reviewed-on: https://review.whamcloud.com/24394
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jean-Baptiste Riaux <riaux.jb@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>