Whamcloud - gitweb
fs/lustre-release.git
9 months ago LU-11434 tests: add version check conf-sanity 109a/b 54/33954/17
James Nunez [Wed, 2 Jan 2019 23:02:29 +0000 (16:02 -0700)]
LU-11434 tests: add version check conf-sanity 109a/b

conf-sanity tests 109a and 109b were added to Lustre with tag 2.10.59.
Thus, we need to check that the server is 2.10.59 or later before
running conf-sanity tests 109a and 109b.
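
As an illustration only, a guard of this kind could look like the sketch below. The helper names (ver_code, server_at_least, should_run_109) are hypothetical and are not the functions used by the Lustre test framework. The packing into one integer assumes plain "a.b.c" versions whose components each fit in 8 bits.

```shell
# Hypothetical helper: pack "a.b.c" into one integer so versions compare numerically.
# Runs in a subshell so the IFS and positional-parameter changes stay local.
ver_code() (
    set -f
    IFS=.
    set -- $1
    echo $(( (${1:-0} << 16) | (${2:-0} << 8) | ${3:-0} ))
)

# Succeed when version $1 is at least version $2.
server_at_least() {
    [ "$(ver_code "$1")" -ge "$(ver_code "$2")" ]
}

# Illustrative decision: skip when the server predates the tag that added the tests.
should_run_109() {
    if server_at_least "$1" 2.10.59; then
        echo "run"
    else
        echo "skip"
    fi
}
```

With this sketch, a 2.10.8 server is skipped, while 2.10.59 and 2.12.55 servers run the tests.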

Test-Parameters: trivial envdefinitions=ONLY=109 serverjob=lustre-b2_10 serverbuildno=170 testlist=conf-sanity
Test-Parameters: envdefinitions=ONLY=109 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a653d1374973f00d283af885183621ec14628e1
Reviewed-on: https://review.whamcloud.com/33954
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago New tag 2.12.55 2.12.55 v2_12_55
Oleg Drokin [Sun, 16 Jun 2019 03:38:29 +0000 (23:38 -0400)]
New tag 2.12.55

Change-Id: I6e76cc7c06092f54b778f6e45932e21427991dcf

9 months ago LU-12395 build: require python2 for lustre-iokit 94/35094/2
Li Dongyang [Fri, 7 Jun 2019 07:34:23 +0000 (17:34 +1000)]
LU-12395 build: require python2 for lustre-iokit

RHEL8 splits python into separate python2 and python3 rpms,
and neither of them provides python anymore.
We can just require python2 in the spec; other
distros all have a python rpm that provides python2.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I881c90a4e66d1a431d11d16b9e89781de2f87a7d
Reviewed-on: https://review.whamcloud.com/35094
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12383 utils: only check project inherit bit for dir 76/35076/2
Wang Shilong [Thu, 6 Jun 2019 02:36:39 +0000 (10:36 +0800)]
LU-12383 utils: only check project inherit bit for dir

Currently, ZFS won't set the inherit bit on regular files, but
ext4 always sets it. It doesn't make sense for regular files
to have this bit, but having it does no harm either.

To make the test happy and give users a consistent view,
let's make the project check report inherit-bit errors only for directories.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: I194f3ed9d6ded69313a683995295ab8c07b4fb3a
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35076
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12267 tests: filter trailing '.' for SELinux 26/35026/6
James Nunez [Fri, 31 May 2019 21:28:20 +0000 (15:28 -0600)]
LU-12267 tests: filter trailing '.' for SELinux

When SELinux is enforced, sanity test 420 fails due to
the "ls -n" command producing an extra '.' at the end of
the file/directory permissions to indicate extra security
attributes are set.

We need to filter out the trailing '.' in the 'ls -n'
output for testing to pass when SELinux is enabled.
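
A minimal sketch of such a filter, assuming the mode string is the first 10-character field of each 'ls -n' line. strip_selinux_dot is a hypothetical name, not the helper used in the sanity test script, and the '+' marker for ACLs is deliberately left alone.

```shell
# Remove the '.' that follows the 10-character mode string (e.g. "-rw-r--r--.").
# Lines without the marker pass through unchanged.
strip_selinux_dot() {
    sed -e 's/^\([-a-zA-Z]\{10\}\)\./\1/'
}
```

It would be used as a pipe stage, for example: ls -n "$file" | strip_selinux_dot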

Test-Parameters: trivial envdefinitions=ONLY=420 testlist=sanity
Test-Parameters: clientselinux envdefinitions=ONLY=420 testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a2f199d2ef4a7b1b6a1b381041b384bb0077cc6
Reviewed-on: https://review.whamcloud.com/35026
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-12399 tests: avoid 'pdsh localhost' in sanity test_420 76/35176/2
Sebastien Buisson [Tue, 11 Jun 2019 09:50:01 +0000 (11:50 +0200)]
LU-12399 tests: avoid 'pdsh localhost' in sanity test_420

sanity test_420 needs a clean env to execute openfile, i.e. one not
inherited from the root user.
Replace 'pdsh localhost' with simpler 'su - $uname -c' alternative
to achieve this.

Test-Parameters: trivial envdefinitions=ONLY=420 testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ifeba7fc1eba86d74a64cca187e286adb23147e2e
Reviewed-on: https://review.whamcloud.com/35176
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
9 months ago LU-12412 recovery: wake all waiters of trd_finishing 41/35141/2
Sergey Cheremencev [Wed, 20 Jan 2016 20:57:01 +0000 (23:57 +0300)]
LU-12412 recovery: wake all waiters of trd_finishing

There is a small window where lctl --device abort_recovery
and umount->...->stop_recovery_thread may be called before
recovery finishes. In that case all threads need to be
woken up, so change complete to complete_all.

Cray-bug-id: LUS-2000
Change-Id: I01ef163e72c7691a2c2d5449adf55b55ec734c4d
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-on: https://review.whamcloud.com/35141
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12396 utils: lfs should not output 'nul' char 37/35137/3
Patrick Farrell [Mon, 10 Jun 2019 16:29:06 +0000 (12:29 -0400)]
LU-12396 utils: lfs should not output 'nul' char

If lfs prints a nul char, it breaks parsing of the output.
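
One way to check a command's output for stray nul bytes is sketched below. has_nul is a hypothetical helper introduced only for this illustration and is not part of lfs or the test framework.

```shell
# Succeed (exit 0) when standard input contains at least one nul byte.
has_nul() {
    # Keep only nul bytes, then count them; $(( )) tolerates wc's padding.
    n=$(( $(tr -cd '\000' | wc -c) ))
    [ "$n" -gt 0 ]
}
```

For example, piping a command's output into has_nul reports whether any nul byte slipped into it.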

Fixes: 68635c3d9b31 ("LU-11963 osd: Add nonrotational flag to statfs")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibfa77920adf3a6c62b01efb005d02ca81db7f7c1
Reviewed-on: https://review.whamcloud.com/35137
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-12355 lnet: Adjust checks for ib_device_ops 16/35016/4
Shaun Tancheff [Tue, 11 Jun 2019 12:29:49 +0000 (07:29 -0500)]
LU-12355 lnet: Adjust checks for ib_device_ops

RDMA/core: Introduce ib_device_ops

The ib_device_ops structure defines all the InfiniBand device
operations in one place

Linux-commit: 521ed0d92ab0db3edd17a5f4716b7f698f4fce61

Test-Parameters: trivial
Change-Id: Ia2a617597c75ec819f485b93a1deb368d4b5e873
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35016
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12381 ko2iblnd: ignore down interfaces 98/35098/4
James Simmons [Mon, 10 Jun 2019 13:58:29 +0000 (09:58 -0400)]
LU-12381 ko2iblnd: ignore down interfaces

The for_each_netdev() loop in kiblnd_create_dev() scans for all
network devices on a system. Currently the code exits when a
network device is down, but the device could be something other than
an IB device. Instead of exiting, just ignore any device that is
down.
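
The change in control flow can be modelled outside the kernel as follows. This is only a shell sketch of the "skip instead of abort" pattern; the real code is a C loop, and usable_interfaces with its "<name> <state>" input format is an assumption made for the illustration.

```shell
# Input: lines of "<ifname> <state>". Print the interfaces worth inspecting.
usable_interfaces() {
    while read -r name state; do
        # A down device is skipped and the scan continues,
        # rather than ending the whole loop at the first down device.
        [ "$state" = "up" ] || continue
        echo "$name"
    done
}
```

With a down Ethernet device listed first, the IB devices after it are still found.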

Test-Parameters: trivial

Fixes: c4b39bf56bbc ("LU-11893 o2iblnd: add secondary IP address handling")
Change-Id: I0a3bf808d849cd00711b6ef2e4e5bbd876b64903
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35098
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-1538 tests: standardize test script init - sanity 63/34863/3
Andreas Dilger [Mon, 3 Jun 2019 14:39:19 +0000 (08:39 -0600)]
LU-1538 tests: standardize test script init - sanity

Standardize the initial Lustre test script initialization of the
test-framework.sh for clarity and consistency.

The LUSTRE path is already normalized in init_test_env(), so this
doesn't need to be done in the caller.  Use $(...) subshells instead
of `...` in the affected lines.  Remove PATH, NAME, TMP, LFS, LCTL
variable initialization, since it is already done in init_test_env().

Move MACHINEFILE into init_test_env().

Move get_lustre_env() to the end of init_test_env(). All test scripts
currently call init_test_env() and this move will allow all test
scripts to use the variables defined in get_lustre_env() without
having to modify the individual test scripts.

Move all definitions of ALWAYS_EXCEPT to after init_test_env()
and init_logging() and call build_test_filter() immediately
after these and SLOW definitions.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1ef6639bcb3eb5179bd44da13b35fd843c267156
Reviewed-on: https://review.whamcloud.com/34863
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
9 months ago LU-11758 osp: remove assertion from statfs 32/33832/6
Sergey Cheremencev [Fri, 6 Jul 2018 19:51:14 +0000 (22:51 +0300)]
LU-11758 osp: remove assertion from statfs

Sequence can't be changed or overflowed
in case of IDIF. Thus don't trigger a kernel
panic for the case below:
last_created [0x100000001:0x15:0x0], next_fid [0x100000000:0xfffffff6:0x0]
The same assertion, with an exception for IDIFs, exists
in osp_fid_diff.
The patch also adds several optimizations
in osp_precreate_send.
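
To see why the two FIDs in the example are still comparable, the sketch below rebuilds the object ID from each FID. It assumes the IDIF layout in which the low 16 bits of the sequence hold bits 32-47 of the object ID and the OID field holds the low 32 bits; that layout is an assumption of this illustration, not something stated in this commit message, and the function names are hypothetical. It also relies on the shell doing 64-bit arithmetic.

```shell
# Rebuild the object ID from an IDIF sequence ($1) and OID ($2), both given in hex.
idif_object_id() {
    echo $(( (($1 & 0xffff) << 32) | $2 ))
}

# Difference in object IDs between FID A (seq $1, oid $2) and FID B (seq $3, oid $4).
idif_diff() {
    echo $(( $(idif_object_id "$1" "$2") - $(idif_object_id "$3" "$4") ))
}
```

Under that assumption, last_created and next_fid above differ in sequence only because the object ID crossed the 32-bit boundary; idif_diff 0x100000001 0x15 0x100000000 0xfffffff6 prints 31.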

Change-Id: I3966dfc621999d065c9b485d387938085fccb140
Cray-bug-id: LUS-2386
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/153571
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/33832
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9846 test: a test number fix 89/35089/5
Vitaly Fertman [Mon, 3 Jun 2019 16:30:06 +0000 (19:30 +0300)]
LU-9846 test: a test number fix

A wrong test number was specified originally

Test-Parameters: trivial
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I8ea31bb2e613c6e225fa7f41f405d5ee2d396a61
Reviewed-on: https://review.whamcloud.com/35089
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10948 mdt: Remove openlock compat code with 2.1 39/35039/4
Oleg Drokin [Mon, 3 Jun 2019 06:39:41 +0000 (02:39 -0400)]
LU-10948 mdt: Remove openlock compat code with 2.1

Checking openlock when doing a create does not allow us to create
a file if we want to also get openlock from it right away.

Since 2.1 is no longer something we care about wrt compatibility,
ok to kill it.

Change-Id: Ic4327be5c45ae856dfbe20291a59c5b1654dbf8f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35039
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-12387 tests: Validate l_tunedisk max_sectors_kb tuning 81/35081/10
Chris Horn [Thu, 6 Jun 2019 15:59:18 +0000 (10:59 -0500)]
LU-12387 tests: Validate l_tunedisk max_sectors_kb tuning

Add test to ensure that l_tunedisk only tunes the max_sectors_kb
value of OST devices, and that it properly tunes any slave devices.

Test-parameters: trivial
Test-parameters: fstype=ldiskfs testlist=conf-sanity \
 envdefinitions=ONLY=125
Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I414526e71fd7ac2811d7c0e8a6afd80a50788258
Reviewed-on: https://review.whamcloud.com/35081
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12387 utils: Avoid passing symlink to tune_block_dev 65/35065/4
Chris Horn [Wed, 5 Jun 2019 00:14:47 +0000 (19:14 -0500)]
LU-12387 utils: Avoid passing symlink to tune_block_dev

In tune_block_dev_slaves we iterate over the directories inside the
slaves subdirectory for the multipath device that is being tuned. For
example:

 $ /usr/sbin/l_tunedisk /dev/mapper/mpathc

Suppose mpathc maps to /dev/dm-2. tune_block_dev will initially set
the value of
/sys/devices/virtual/block/dm-2/queue/max_sectors_kb
equal to the value of
/sys/devices/virtual/block/dm-2/queue/max_hw_sectors_kb

Then it looks at the entries in /sys/devices/virtual/block/dm-2/slaves
Suppose the slave devices are as follows:

 $ ls /sys/devices/virtual/block/dm-2/slaves
 sdc  sdh  sdm  sdr
 $

It then calls tune_block_dev recursively, passing
/sys/devices/virtual/block/dm-2/slaves/sdc,
/sys/devices/virtual/block/dm-2/slaves/sdh, etc. However, these are
symlinks that point to directories and as such tune_block_dev will not
tune them because stat does not identify them as block devices.

Instead we should construct the path argument for these recursive calls
as /dev/<d_name>. In this example, /dev/sdc, /dev/sdh, etc.
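
The path construction can be sketched in shell as below. The actual fix is in the C code of l_tunedisk; slave_dev_paths is a hypothetical helper that only shows how each entry name in the slaves directory maps to a /dev path instead of the sysfs symlink.

```shell
# For every entry in the slaves directory $1, print /dev/<entry name>.
slave_dev_paths() {
    for entry in "$1"/*; do
        # Skip the unexpanded glob when the directory is empty;
        # -L keeps symlinks even if their target does not resolve.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        echo "/dev/$(basename "$entry")"
    done
}
```

For the example above, slave_dev_paths /sys/devices/virtual/block/dm-2/slaves would list /dev/sdc, /dev/sdh, /dev/sdm and /dev/sdr.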

Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I63bc073a82384d68648ff23a56b7d43d6656159b
Reviewed-on: https://review.whamcloud.com/35065
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
9 months ago LU-12387 utils: Read existing ldd data in l_tunedisk 66/35066/5
Chris Horn [Tue, 4 Jun 2019 19:34:01 +0000 (14:34 -0500)]
LU-12387 utils: Read existing ldd data in l_tunedisk

Read the lustre_disk_data from the device passed to l_tunedisk, so
we can determine whether the device is an MGT or MDT and thus skip
the tuning of the device.

Fixes: 892280742a2b ("LU-9551 utils: add l_tunedisk to fix disk tunings")
Fixes: 2f8d7b4679de ("LU-11736 utils: don't set max_sectors_kb on MDT/MGT")
Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I193fe008d5777b0e83f2be9a500eaffb1d3ca615
Reviewed-on: https://review.whamcloud.com/35066
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
9 months ago LU-12375 scripts: Start lnet after opa 32/35032/2
Nathaniel Clark [Sun, 2 Jun 2019 13:50:53 +0000 (09:50 -0400)]
LU-12375 scripts: Start lnet after opa

Ensure ordering of lnet after opa for startup and lnet before opa on
shutdown.

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I4c2cad2381349f866bdc08e2a81e3d8990d8752e
Reviewed-on: https://review.whamcloud.com/35032
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Artur Novik <anovik@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-6142 ptlrpc: Fix style issues for pinger.c 01/34701/5
Arshad Hussain [Thu, 18 Apr 2019 13:42:17 +0000 (19:12 +0530)]
LU-6142 ptlrpc: Fix style issues for pinger.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/pinger.c

Change-Id: I048a7ab7d31bc468a410ec1704c5d79a34feebb4
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/34701
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-11623 tests: Fix sanity 27E to ensure getattr RPC 67/35067/2
Oleg Drokin [Wed, 5 Jun 2019 06:28:24 +0000 (02:28 -0400)]
LU-11623 tests: Fix sanity 27E to ensure getattr RPC

While cat does perform fstat() on the file it opens,
I guess it's not guaranteed.
More importantly, we really need to ensure the locks
that the file has after creation are dropped before
we issue our stat() to ensure the RPC is actually made,
since it's this GETATTR RPC that is ensuring easize
update from MDT response.

Change-Id: Ic86229ac514e1385c665c6c0d9f6eef13d9748f5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/35067
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-12438 llite: vfs_read/write removed, use kernel_read/write 23/35223/5
Shaun Tancheff [Thu, 13 Jun 2019 18:48:10 +0000 (13:48 -0500)]
LU-12438 llite: vfs_read/write removed, use kernel_read/write

As of Linux 4.14, vfs_read() is no longer available
to kernel modules. The kernel_read() function calls
vfs_read() and will continue to be available.

Add a configure test for kernel_read(), as its
function signature changed in 4.14 to match the other file I/O
helpers.

Also remove vfs_write() in favor of the kernel_write() wrapper
cfs_kernel_write().

Fixes: f172b1168857 ("LU-10092 llite: Add persistent cache on client")
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I5e5fce0e6644ba750169f3bf11ac5c98525da0a7
Reviewed-on: https://review.whamcloud.com/35223
Tested-by: Jenkins
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
9 months ago LU-10092 First phase of persistent client cache project merging 14/35214/1
Oleg Drokin [Thu, 13 Jun 2019 04:36:36 +0000 (00:36 -0400)]
LU-10092 First phase of persistent client cache project merging

Merge remote-tracking branch 'origin/pcc'

Change-Id: I87a681c54712926d336c983dd8e56b58ebf4b612
Signed-off-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: change detach behavior and add keep option 44/33844/19
Qian Yingjin [Thu, 13 Dec 2018 02:41:14 +0000 (10:41 +0800)]
LU-10092 pcc: change detach behavior and add keep option

After the introduction of auto-attach at open, when a PCC
cached file is detached by the "pcc detach" command, it will be attached
automatically at the next open. This may not be what the user wants.

To solve this problem, we change the default detach behavior and
add an option "--keep|-k" for the detach of RW-PCC.
The manual "lfs pcc detach" command will detach the file from PCC
permanently, and it will also remove the PCC copy by default.
When the file is detached with the "keep" option, it only unmaps the
relationship between the file inode and the PCC copy, but keeps the
PCC copy. The file is allowed to be attached automatically at
the next open when the file is still valid in cache.

Note that currently an auto detach caused by inode reclaim or
revocation of the layout lock does not delete the PCC copy either.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I010df54177ae4cfeddcc0a9982c1aee58ee683de
Reviewed-on: https://review.whamcloud.com/33844
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: auto attach during open for valid cache 87/33787/21
Qian Yingjin [Wed, 5 Dec 2018 03:22:11 +0000 (11:22 +0800)]
LU-10092 pcc: auto attach during open for valid cache

In the current PCC implementation, all PCC state information is
stored in the in-memory data structure named pcc_inode (a member
of data structure ll_inode_info). Once the file inode is reclaimed
due to memory pressure or memory shrinking, the corresponding
in-memory pcc_inode will be released too, and the PCC-cached file
will be detached automatically. Revocation of the layout lock
also triggers the detach of the PCC-cached file. As a result, a
still valid PCC-cached file cannot be used.

To solve this problem, we introduce an auto-attaching mechanism
during open. During PCC attach, the L.Gen will be stored as an
extended attribute of the local copy file on the PCC device. When the
in-memory inode is reclaimed or the layout lock is revoked, and
the file is opened again, it can check whether the stored L.Gen on
the PCC copy is the same as the Lustre file's current L.Gen on the MDT. If
they are consistent, it means the cached copy on the PCC device is still
valid, and we can continue to use it after auto-attach.
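
The decision made at open time can be summarized by the sketch below. It is a plain shell model of the comparison described above, not the llite code; pcc_auto_attach_decision and its result strings are hypothetical, and reading the stored value from the extended attribute is left out.

```shell
# $1: L.Gen stored on the PCC copy (empty if there is none)
# $2: current L.Gen of the Lustre file on the MDT
pcc_auto_attach_decision() {
    stored_gen=$1
    current_gen=$2
    if [ -z "$stored_gen" ]; then
        echo "no-cached-copy"
    elif [ "$stored_gen" = "$current_gen" ]; then
        # Layout unchanged since attach: the cached copy is still valid.
        echo "attach"
    else
        # Layout changed in the meantime: the cached copy must not be used.
        echo "stale"
    fi
}
```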

Test-Parameters: testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I63be96f8d83816529983d0f97af0aaca81703fed
Reviewed-on: https://review.whamcloud.com/33787
Tested-by: Jenkins
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10918 llite: Rule based auto PCC caching when create files 51/34751/15
Qian Yingjin [Wed, 24 Apr 2019 09:51:25 +0000 (17:51 +0800)]
LU-10918 llite: Rule based auto PCC caching when create files

Configurable rule-based auto PCC caching for newly created files
can significantly benefit users of readwrite PCC. It can
determine which files can use a cache on PCC directly, without any
admission control, for high priority users/groups/projects or filenames
with wildcard support. Meanwhile, we can enforce a quota limit
on capacity usage for each user/group/project to provide caching
isolation.

Similar to the NRS TBF command line, it supports logical
conjunction and disjunction operations among different user/group/
project or filename conditions with wildcard support.

The command line to add this kind of rule is as follows:
lctl pcc add /mnt/lustre /mnt/pcc
"projid={500 1000}&fname={*.h5},uid={1001} rwid=1 roid=1"
It means that newly created files on the client with a project ID of
500 or 1000 AND a file name suffix of "h5", OR with user ID 1001,
can be auto cached on PCC. "rwid" means RW-PCC attach ID (which is
usually archive ID); "roid" means RO-PCC attach ID. By default,
the RO-PCC attach ID is set to the same value as the RW-PCC attach ID for a
shared PCC backend.
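
To make the AND/OR semantics of that example rule concrete, the sketch below hand-evaluates it for one file. It is not the rule parser: pcc_rule_matches is a hypothetical function with the example's values hard-coded, with '&' read as AND and ',' as OR.

```shell
# $1: project ID, $2: file name, $3: user ID
# Succeeds when the example rule would auto cache the newly created file.
pcc_rule_matches() {
    projid=$1
    fname=$2
    uid=$3
    # First alternative: projid={500 1000} & fname={*.h5}
    case "$projid" in
    500|1000)
        case "$fname" in
        *.h5) return 0 ;;
        esac
        ;;
    esac
    # Second alternative, after the comma: uid={1001}
    [ "$uid" = 1001 ]
}
```

A file data.h5 in project 500 matches, data.txt in the same project does not, and any file created by user 1001 matches regardless of project or name.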

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I628975b3e097e98d6b93f1c6acd855aaacdaa8b3
Reviewed-on: https://review.whamcloud.com/34751
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: security and permission for non-root user access 37/34637/20
Qian Yingjin [Thu, 11 Apr 2019 02:41:38 +0000 (10:41 +0800)]
LU-10092 pcc: security and permission for non-root user access

For current PCC, if a file is left on the PCC cache, it may be
accessible to other jobs/users who would not normally be able to
access it. (That is, they access it directly on the PCC mount via
FID as the local PCC mount is basically just a normal local file
system.)

This patch solves this by restricting access on the PCC side and
just depending on the Lustre side permissions for opening a file.
So PCC files on the local mount fs are created with a minimal
(zero) set of permissions. Then, when accessing a PCC cached
file, we do the permission check on the Lustre file and skip
it on the PCC file. This should render the PCC files
inaccessible except to root or via Lustre.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I059fa3e479fe97ef6b65db1cbeb8b7f3ea611880
Reviewed-on: https://review.whamcloud.com/34637
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: Non-blocking PCC caching 66/32966/37
Qian Yingjin [Thu, 5 Jul 2018 06:43:46 +0000 (14:43 +0800)]
LU-10092 pcc: Non-blocking PCC caching

Current PCC uses refcount of PCC inode to determine whether a
previous PCC-attached file can be detached. If a file is open
(refcount > 1), the detaching will return -EBUSY.

When another client accesses the PCC-cached file, it will trigger
the restore process as the file is HSM released. During restore,
the Agent needs to detach the PCC-cached file.
Thus, if a PCC-attached file is kept open
for a long time, the restore request will always fail.

In this patch, we implement a non-blocking PCC caching mechanism
for Lustre. After attaching the file into PCC, the client acquires
the layout lock for the file, and the layout generation is
maintained in the PCC inode. Under the layout lock protection, the
PCC caching state is valid and all I/O is directed to PCC. When
the layout lock is revoked, the blocking AST invalidates
the PCC caching state and detaches the file automatically.

This patch also helps handle the ENOSPC error for PCC
writes by falling back to the normal I/O path, which restores the file
data to the OSTs (the file is in HSM released state) and redoes the
write.

Change-Id: I9130c04dc0e6eae879ea2ff3fdda65726e74d177
Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/32966
Tested-by: Jenkins
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 llite: Add persistent cache on client 63/32963/38
Li Xi [Tue, 27 Jun 2017 12:18:14 +0000 (20:18 +0800)]
LU-10092 llite: Add persistent cache on client

PCC is a new framework which provides a group of local caches
on the Lustre client side. No global namespace will be provided
by PCC. Each client uses its own local storage as a cache for
itself. A local file system is used to manage the data on the local
caches. Cached I/O is directed to the local filesystem while
normal I/O is directed to OSTs.

PCC uses HSM for data synchronization. It uses an HSM copytool
to restore files from local caches to Lustre OSTs. Each PCC
has a copytool instance running with a unique archive number.
Any remote access from another Lustre client triggers
the data synchronization. If a client with PCC goes offline,
the cached data becomes temporarily inaccessible to other
clients. After the PCC client reboots and the copytool
restarts, the data will be accessible again.

ToDo:
1) Make PCC exclusive with HSM.
2) Strong size consistency for PCC cached files among clients.
3) Support caching partial content of a file.

Change-Id: I188ed36c48aae223380739f607cc6caf2f789298
Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/32963
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11838 osd-ldiskfs: inode times switched to timespec64 75/34675/4
Li Dongyang [Tue, 16 Apr 2019 01:14:13 +0000 (11:14 +1000)]
LU-11838 osd-ldiskfs: inode times switched to timespec64

Since kernel 4.18, inode times have switched from struct timespec
to timespec64 to make them y2038 safe.

Linux-commit: 95582b00838837fc07e042979320caf917ce3fe6

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Iaddb2f2be27ec348fb97e13371aa3d7e6f6e5c9f
Reviewed-on: https://review.whamcloud.com/34675
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-11838 ldiskfs: add rhel8 server support 74/34674/6
Li Dongyang [Mon, 15 Apr 2019 08:05:58 +0000 (18:05 +1000)]
LU-11838 ldiskfs: add rhel8 server support

This patch adds ldiskfs patch series for rhel8
kernel 4.18.0-32.el8.

Fix lustre-build-ldiskfs.m4, make
CONFIG_LDISKFS_FS_ENCRYPTION consistent with
kernel's config CONFIG_EXT4_FS_ENCRYPTION.
Otherwise ldiskfs won't build
on kernels with CONFIG_EXT4_FS_ENCRYPTION
disabled.

Note: this contains a small clean up for
ubuntu18/ext4-kill-dx-root.patch

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ib500bff2f6688405b912620c5217586c8420c6e1
Reviewed-on: https://review.whamcloud.com/34674
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11213 lmv: reuse object alloc QoS code from LOD 57/34657/15
Lai Siyao [Fri, 22 Mar 2019 00:22:37 +0000 (08:22 +0800)]
LU-11213 lmv: reuse object alloc QoS code from LOD

Reuse the same object alloc QoS code as LOD. The QoS code is
not moved to a lower layer module; instead it is copied to LMV, because
moving it involves almost all LMV code, which is too big a change and
should be done separately in the future.

For LMV round-robin object allocation, because we only need to
allocate one object, use the saved MDT index and then advance it to
the next MDT.
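
The round-robin step can be sketched as follows. This is a shell model only; next_mdt_rr is a hypothetical name, and the wrap-around with a modulo over the MDT count is an assumption about how "next MDT" is computed.

```shell
# $1: saved MDT index, $2: number of MDTs.
# Prints "<index to use now> <index to save for the next allocation>".
next_mdt_rr() {
    saved=$1
    mdt_count=$2
    echo "$saved $(( (saved + 1) % mdt_count ))"
}
```

With four MDTs, a saved index of 0 allocates on MDT 0 and saves 1, while a saved index of 3 allocates on MDT 3 and wraps back to 0.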

Add sanity 413b.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I53c3d863dafda534eebb6b95da205b395071cd25
Reviewed-on: https://review.whamcloud.com/34657
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-9846 utils: hash may be overridden in 'lfs setdirstripe' 95/35095/3
Lai Siyao [Fri, 7 Jun 2019 06:20:13 +0000 (14:20 +0800)]
LU-9846 utils: hash may be overridden in 'lfs setdirstripe'

lfs_setdirstripe() may override 'hash' if '-H hash' is specified
before '-i'. Since LMV doesn't support OVERSTRIPING, this support
can be ignored.

Fixes: 591a9b4cebc5 ("LU-9846 lod: Add overstriping support")

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If6db03d2d4f6d208da19ae064fde1d851f01beb4
Reviewed-on: https://review.whamcloud.com/35095
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months ago LU-11297 lnet: MR Routing Feature 83/34983/3
Amir Shehata [Fri, 7 Jun 2019 18:35:09 +0000 (14:35 -0400)]
LU-11297 lnet: MR Routing Feature

This is a merge commit from the multi-rail branch. It brings in
the MR Routing feature. This feature aligns the LNET Multi-Rail
behavior with routing. A gateway now is viewed as a Multi-Rail
capable node. When a route is added only one entry per gateway
should be used. That route entry should use the primary-nid of
the gateway. The multi-rail selection algorithm is then run when
sending to the gateway to select the best interface to send to.

Furthermore the gateway aliveness is now kept via the health
mechanism. And the gateway pinger now uses discovery instead
of maintaining its own pinger handler.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie2d8c6449f84860511b322ff2db3ed656a163e74

10 months ago LU-12200 lnet: check peer timeout on a router 72/34772/15
Amir Shehata [Fri, 19 Apr 2019 00:19:22 +0000 (17:19 -0700)]
LU-12200 lnet: check peer timeout on a router

On a router assume that a peer is alive and attempt to send it
messages as long as the peer_timeout hasn't expired.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0806a52c8ad7acc1c93dcf32353f1c4467c618b1
Reviewed-on: https://review.whamcloud.com/34772
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-12053 lnet: look up MR peers routes 25/34625/17
Amir Shehata [Mon, 8 Apr 2019 22:28:23 +0000 (15:28 -0700)]
LU-12053 lnet: look up MR peers routes

An MR peer can have multiple interfaces some of which we might
have a route to. The primary NID of the peer might not necessarily
specify a NID we have a route to. When looking up a route, we must
iterate over all the nets the peer is on and select the one which
we can route to. Taking into consideration the peer can exist on
multiple routed networks we also have a simple round robin algorithm
to iterate over all the networks we can reach the peer on.
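
As an illustration only, the round-robin lookup described above can be
sketched in Python. The function name, its arguments and the net names
are hypothetical and do not correspond to LNet symbols; the sketch
assumes the caller remembers the index of the net it used last.

```python
def pick_routable_net(peer_nets, routable_nets, last_index):
    """Return (net, index) for the next net of the peer that we can route to.

    peer_nets     -- ordered list of nets the peer has interfaces on
    routable_nets -- set of nets we currently have a route to
    last_index    -- index used by the previous call (-1 on first use)
    """
    count = len(peer_nets)
    # Start one past the previous choice and wrap around, so successive
    # calls rotate over every net we can reach the peer on.
    for step in range(1, count + 1):
        index = (last_index + step) % count
        if peer_nets[index] in routable_nets:
            return peer_nets[index], index
    # No net of this peer is reachable through a route.
    return None, last_index
```

Calling the function repeatedly and feeding the returned index back in
alternates between the routable nets and skips the ones without a route.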

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0651dd4f732c8b71872f73cf2512b08f34129bd9
Reviewed-on: https://review.whamcloud.com/34625
Tested-by: Jenkins
10 months agoLU-11299 lnet: discover each gateway Net 11/34511/22
Amir Shehata [Tue, 26 Mar 2019 21:16:32 +0000 (14:16 -0700)]
LU-11299 lnet: discover each gateway Net

Wake up every (gateway aliveness interval / number of local networks).
Discover each local gateway network in round-robin order.

This is done to make sure the gateway keeps its networks up.
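
A rough sketch of the resulting schedule, in Python. The names and the
example values are made up for illustration; the only thing taken from
the message above is that the wake-up period is the aliveness interval
divided by the number of local networks, with one network handled per
wake-up.

```python
def discovery_schedule(alive_interval, local_nets, wakeups):
    """Return a list of (time_offset, net) pairs for the first `wakeups` wake-ups."""
    if not local_nets:
        return []
    # Each wake-up handles one net, so every net is revisited once per
    # full aliveness interval.
    period = alive_interval / len(local_nets)
    return [(i * period, local_nets[i % len(local_nets)])
            for i in range(wakeups)]
```

With three local networks and a 60 second interval, this wakes up every
20 seconds and returns to the first network after 60 seconds.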

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4035e39c286cb599d4eb8f9df7ed5d278e6d744a
Reviewed-on: https://review.whamcloud.com/34511
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
10 months agoLU-11299 lnet: net aliveness 10/34510/22
Amir Shehata [Sat, 23 Mar 2019 01:01:51 +0000 (18:01 -0700)]
LU-11299 lnet: net aliveness

If a router is discovered on any interface on the network, then
update the network last alive time and the NI's status to UP.
If a router isn't discovered on any interface on a network,
then change the status of all the interfaces on that network to down.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I1d67eb4b3284ccb8306ad4c877a2fcbdf4958d8c
Reviewed-on: https://review.whamcloud.com/34510
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11664 lnet: push router interface updates 51/33651/30
Amir Shehata [Wed, 14 Nov 2018 02:14:36 +0000 (18:14 -0800)]
LU-11664 lnet: push router interface updates

A router can bring up/down its interfaces if it hasn't received any
messages on that interface for a configurable period
(alive_router_ping_timeout). When this event occurs the router can now
push its status change to the peers it's talking to in order to inform
them of the change in its status. This will allow the router users to
handle asymmetric router failures more quickly.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9530ed7d9bc0a86edc43e3f610cc943f1732dcfd
Reviewed-on: https://review.whamcloud.com/33651
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11297 lnet: set gw sensitivity from lnetctl 35/33635/31
Amir Shehata [Fri, 9 Nov 2018 19:24:20 +0000 (11:24 -0800)]
LU-11297 lnet: set gw sensitivity from lnetctl

Allow an optional parameter to the "lnetctl route add" command to
set the health sensitivity of the gateway:
lnetctl route add --net <net> --gateway <gw> --sensitivity <value>

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iee120c78a41b79da6ab6bdf1560f558df89233e2
Reviewed-on: https://review.whamcloud.com/33635
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11297 lnet: handle router health off 34/33634/31
Amir Shehata [Fri, 9 Nov 2018 18:31:27 +0000 (10:31 -0800)]
LU-11297 lnet: handle router health off

Routing infrastructure depends on health infrastructure to manage
route status. However, health can be turned off. Therefore, we need
to enable health for gateways in order to monitor them properly.
Each peer now has its own health sensitivity. When adding a route
the gateway's health sensitivity can be explicitly set from lnetctl
or if not specified then it'll default to 1, thereby turning health
on for that gateway, allowing peer NI recovery if there is a failure.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ibae33d595e97d0eec432ae8f5d51898ce0776f01
Reviewed-on: https://review.whamcloud.com/33634
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11641 lnet: handle discovery off 20/33620/32
Amir Shehata [Thu, 8 Nov 2018 00:51:44 +0000 (16:51 -0800)]
LU-11641 lnet: handle discovery off

When discovery is turned off locally or when the peer either has
discovery off or doesn't support MR at all then degrade discovery
behavior to a standard ping. This will allow routers to continue
using discovery mechanism even if it's turned off.
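
The fallback rule can be summarized with a small Python sketch. The
function and flag names are hypothetical and only mirror the three
conditions listed above.

```python
def discovery_mode(local_discovery_on, peer_discovery_on, peer_supports_mr):
    """Return "discovery" for full discovery, or "ping" for the degraded mode."""
    if local_discovery_on and peer_discovery_on and peer_supports_mr:
        return "discovery"
    # Any of the three conditions failing degrades to a standard ping.
    return "ping"
```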

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I7f0829d37cbff2bf9e41de251efa715fc4c97e5d
Reviewed-on: https://review.whamcloud.com/33620
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11470 lnet: drop all rule 05/33305/36
Amir Shehata [Thu, 4 Oct 2018 00:36:45 +0000 (17:36 -0700)]
LU-11470 lnet: drop all rule

Add a rule to drop all messages arriving on a specific interface.
This is useful for simulating failures on a specific router interface.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ic69f683fb2caf7a69a1d85428878c89b7b1ee3ad
Reviewed-on: https://review.whamcloud.com/33305
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11478 lnet: misleading discovery seqno. 04/33304/34
Amir Shehata [Fri, 5 Oct 2018 00:18:20 +0000 (17:18 -0700)]
LU-11478 lnet: misleading discovery seqno.

There is a sequence number used when sending discovery messages. This
sequence number is intended to detect stale messages. However it
could be misleading if the peer reboots. In this case the peer's
sequence number will reset. The node will think that all information
being sent to it is stale, while in reality the peer might've
changed configuration.

There is no reliable way to know whether a peer rebooted, so we'll
always assume that the messages we're receiving are valid. So we'll
operate on a first-come, first-served basis.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I421a00e47bc93ee60fa37c648d6d9a726d9def9c
Reviewed-on: https://review.whamcloud.com/33304
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11477 lnet: handle health for incoming messages 01/33301/34
Amir Shehata [Thu, 4 Oct 2018 23:21:48 +0000 (16:21 -0700)]
LU-11477 lnet: handle health for incoming messages

In case of routers (as well as for the general case) it's important to
update the health of the ni/lpni for incoming messages. For an lpni
specifically when we receive a message is when we know that the lpni
is up.

A percentage router health is required in order to send a message to a
gateway. That defaults to 100, meaning that a router interface has to
be absolutely healthy in order to send to it. This matches the current
behavior. So if a router interface goes down and its health goes down
significantly, but then it comes back up again; either we receive a
message from it or we discover it and get a reply, then in order to
start using that router interface again we have to boost its health
all the way up to maximum.

This behavior is special cased for routers.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ida6c23f95dbef56c2e6ed7b6d03743939d8b30a0
Reviewed-on: https://review.whamcloud.com/33301
Tested-by: Jenkins
10 months agoLU-11475 lnet: transfer routers 39/34539/20
Amir Shehata [Thu, 28 Mar 2019 02:32:45 +0000 (19:32 -0700)]
LU-11475 lnet: transfer routers

When a primary NID of a peer is about to be deleted because
it's being transferred to another peer, if that peer is a gateway
then transfer all gateway properties to the new peer.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ib475c389ca5630906416a5112b3088f6f5d03950
Reviewed-on: https://review.whamcloud.com/34539
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11475 lnet: allow deleting router primary_nid 00/33300/34
Amir Shehata [Thu, 4 Oct 2018 22:31:04 +0000 (15:31 -0700)]
LU-11475 lnet: allow deleting router primary_nid

Discovery doesn't allow deleting a primary_nid of a peer. This
is necessary because upper layers only know to reach the peer by
using the primary_nid. For routers this is not the case. So
if a router changes its interfaces and comes back up again, the
peer_ni should be adjusted.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9da056172f35a5f15eed5ba0e02fcb37ac414c54
Reviewed-on: https://review.whamcloud.com/33300
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: consider alive_router_check_interval 98/33298/34
Amir Shehata [Fri, 5 Oct 2018 01:28:49 +0000 (18:28 -0700)]
LU-11300 lnet: consider alive_router_check_interval

Consider router_check_interval when waking up the monitor thread,
to make sure you wake up the monitor thread at the earliest possible
time.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ibc4b53886b59a9bc174a29d0da711ac77db3a62c
Reviewed-on: https://review.whamcloud.com/33298
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11378 lnet: MR aware gateway selection 88/33188/36
Amir Shehata [Fri, 14 Sep 2018 18:04:44 +0000 (11:04 -0700)]
LU-11378 lnet: MR aware gateway selection

When selecting a route use the Multi-Rail Selection algorithm to
select the best available peer_ni of the best route. The selected
peer_ni can then be used to send the message or to discover it
if the gateway peer needs discovering.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I376af57611591eed2eb1edb80a1b3a68b5aefd19
Reviewed-on: https://review.whamcloud.com/33188
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: use discovery for routing 54/33454/31
Amir Shehata [Mon, 22 Oct 2018 23:03:06 +0000 (16:03 -0700)]
LU-11299 lnet: use discovery for routing

Instead of re-inventing the wheel, routing now uses discovery.
Every router interval the router is discovered. This will
update the router information locally and will serve to let the
router know that the peer is alive.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I211bf15af0b0a5d50f9e2a69a385419a1dd5096b
Reviewed-on: https://review.whamcloud.com/33454
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: modify lnd notification mechanism 53/33453/30
Amir Shehata [Mon, 22 Oct 2018 22:44:50 +0000 (15:44 -0700)]
LU-11299 lnet: modify lnd notification mechanism

LND notifies when a peer is up or down. If the LND notifies
LNet that the peer is up and sets the "reset" flag to true
then this indicates to LNet that the LND knows about the health
of the peer and is telling LNet that the peer is fully healthy.
LNet will set the health value of the peer to maximum, otherwise
it will increment the health by one.

If the LND notifies LNet that the peer is down, LNet will
decrement the health of the peer by the configured sensitivity value.

LNet then turns around and rechecks the peer aliveness and if it's
dead it'll notify the LND. This code is only used by the socklnd
because it needs to tear down connections. This is in keeping with
the original functionality.
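
The health update rules above can be sketched as follows, in Python
rather than kernel C. This is not the patch's code: the function name
is hypothetical, the value 1000 for LNET_MAX_HEALTH_VALUE and the
default sensitivity of 100 are assumptions, and clamping the result to
the range [0, max] is also an assumption.

```python
LNET_MAX_HEALTH_VALUE = 1000  # assumed value, for illustration only


def apply_lnd_notification(health, alive, reset=False, sensitivity=100):
    """Return the new health value of a peer after an LND notification."""
    if alive:
        if reset:
            # The LND vouches for the peer: jump straight to full health.
            return LNET_MAX_HEALTH_VALUE
        # Otherwise recover one step at a time.
        return min(health + 1, LNET_MAX_HEALTH_VALUE)
    # Peer reported down: penalize by the configured sensitivity.
    return max(health - sensitivity, 0)
```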

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ifa614405fb0c2cd4f6bcb1a2a97e856320eb6cbe
Reviewed-on: https://review.whamcloud.com/33453
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: Cleanup rcd 87/33187/35
Amir Shehata [Mon, 22 Oct 2018 22:09:11 +0000 (15:09 -0700)]
LU-11299 lnet: Cleanup rcd

Cleanup all code pertaining to rcd, as routing code will use
discovery going forward and there will be no need to keep its own
pinging code.

test_215 looks at the routers file which had its format changed.
Update the test to reflect the change.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If31caa3b5703df40b6ae0f758f2fe764991aa4f3
Reviewed-on: https://review.whamcloud.com/33187
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: simplify lnet_handle_local_failure() 52/33452/30
Amir Shehata [Mon, 22 Oct 2018 20:39:36 +0000 (13:39 -0700)]
LU-11300 lnet: simplify lnet_handle_local_failure()

Pass the struct lnet_ni to lnet_handle_local_failure() instead of the
message structure, since nothing else from the message is being
used. This also makes it symmetrical with lnet_handle_remote_failure().

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I10146ec5bf5f378e28a7725382f00132ada32c6e
Reviewed-on: https://review.whamcloud.com/33452
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: router aliveness 85/33185/34
Amir Shehata [Thu, 6 Sep 2018 00:03:45 +0000 (17:03 -0700)]
LU-11300 lnet: router aliveness

A route is considered alive if the gateway is able to route
messages from the local to the remote net. That means that
at least one of the network interfaces on the remote net of
the gateway is viable.

Introduced the concept of sensitivity percentage. This defaults
to 100%. It holds a dual meaning:
1. A route is considered alive if at least one of its interfaces'
health is >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage.
100 means at least one interface has to be 100% healthy.
2. On a router consider a peer_ni dead if its health is not at least
LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage.
100% means the interface has to be 100% healthy.
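
The two meanings can be illustrated with a short Python sketch. It
assumes that the percentage is applied as a fraction (divided by 100)
and that LNET_MAX_HEALTH_VALUE is 1000; the function names are
hypothetical.

```python
LNET_MAX_HEALTH_VALUE = 1000  # assumed value, for illustration only


def health_threshold(sensitivity_percentage):
    """Minimum health an interface needs at the given percentage."""
    return LNET_MAX_HEALTH_VALUE * sensitivity_percentage // 100


def route_is_alive(interface_healths, sensitivity_percentage=100):
    """Meaning 1: alive if at least one gateway interface meets the threshold."""
    threshold = health_threshold(sensitivity_percentage)
    return any(h >= threshold for h in interface_healths)


def peer_ni_is_dead(health, sensitivity_percentage=100):
    """Meaning 2: on a router, a peer_ni below the threshold is dead."""
    return health < health_threshold(sensitivity_percentage)
```

At the default of 100 a single lost health point makes an interface
unusable, while a lower percentage tolerates partially degraded
interfaces.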

Re-implemented lnet_notify() to decrement the health of the
peer interface if the LND reports a failure on that peer.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie97561fb70bf6a558bc90fa9266a6ba38fa3d293
Reviewed-on: https://review.whamcloud.com/33185
Tested-by: Jenkins
10 months agoLU-11300 lnet: peer aliveness 86/33186/34
Amir Shehata [Thu, 6 Sep 2018 01:19:35 +0000 (18:19 -0700)]
LU-11300 lnet: peer aliveness

Peer NI aliveness is now solely dependent on the health
infrastructure. With the addition of router_sensitivity_percentage,
peer NI is considered dead if its health drops below the percentage
specified of the total health. Setting the percentage to 100% means
that a peer_ni is considered dead if its interface is less than
fully healthy.

Removed obsolete code that queries the peer NI every second since
the health infrastructure introduces the recovery mechanism which
is designed to recover the health of peer NIs.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I506060fbb66c74295808891b689d7d634dc69284
Reviewed-on: https://review.whamcloud.com/33186
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: Cache the routing feature 51/33451/30
Amir Shehata [Sat, 20 Oct 2018 01:24:39 +0000 (18:24 -0700)]
LU-11300 lnet: Cache the routing feature

When processing a REPLY or a PUSH for a discovery, cache
whether the routing feature is enabled or disabled as
reported by the peer.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I69bd41fade196773af0e1004c2e7fff2fb91392d
Reviewed-on: https://review.whamcloud.com/33451
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: cache ni status 50/33450/30
Amir Shehata [Sat, 20 Oct 2018 01:02:05 +0000 (18:02 -0700)]
LU-11300 lnet: cache ni status

When processing the data in the PUSH or the REPLY make sure to cache
the ns_status. This is the status of the peer_ni as reported by the
peer itself.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I14de2460f578fb7f47d329a97b8833f49c569b74
Reviewed-on: https://review.whamcloud.com/33450
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: configure lnet router sensitivity 55/33455/29
Amir Shehata [Tue, 23 Oct 2018 04:25:33 +0000 (21:25 -0700)]
LU-11300 lnet: configure lnet router sensitivity

Allow the configuration of router_sensitivity_percentage from the
user space utility lnetctl

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If5440f30881361ebb06dafa9cadb7cbc2b934f93
Reviewed-on: https://review.whamcloud.com/33455
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: router sensitivity 49/33449/30
Amir Shehata [Sat, 20 Oct 2018 00:09:24 +0000 (17:09 -0700)]
LU-11300 lnet: router sensitivity

Introduce the router_sensitivity_percentage module parameter to
control the sensitivity of routers to failures. It defaults to 100%
which means a router interface needs to be fully healthy in order
to be used.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I3e9333033f049918c1cdca58a72604c71884acbe
Reviewed-on: https://review.whamcloud.com/33449
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
10 months agoLU-11551 lnet: Do not allow deleting of router nis 48/33448/27
Amir Shehata [Fri, 19 Oct 2018 23:40:52 +0000 (16:40 -0700)]
LU-11551 lnet: Do not allow deleting of router nis

Check the peer before deleting a peer_ni. If it's a router then do
not allow deletion of the peer_ni.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I372052b4e9b5af3a8f18a49676fc60b4c8077cbd
Reviewed-on: https://review.whamcloud.com/33448
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: lnet_add/del_route() 84/33184/31
Amir Shehata [Tue, 4 Sep 2018 23:47:54 +0000 (16:47 -0700)]
LU-11299 lnet: lnet_add/del_route()

Reimplemented lnet_add_route() and lnet_del_route() to use
the peer instead of the peer_ni.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I3734098a81ab18d1d74220c691d96a9b9817e6da
Reviewed-on: https://review.whamcloud.com/33184
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
10 months agoLU-11298 lnet: use peer for gateway 83/33183/31
Amir Shehata [Fri, 31 Aug 2018 02:04:39 +0000 (19:04 -0700)]
LU-11298 lnet: use peer for gateway

The routing code uses peer_ni for a gateway. However with Multi-Rail
a gateway could have multiple interfaces on several different
networks. Instead of using a single peer_ni as the gateway we should
be using the peer and let the MR selection code select the best
peer_ni to send to.

This patch moves the gateway from peer_ni to peer. Much of the
code needs to be rewritten in the following patches to account
for that change. This patch disables the routing features by
disabling the code to add/delete routes.

The asymmetric routing detection feature is also modified to
use MR routing.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia7dab552268c4a7fbd7b88122b9a95363d155fd7
Reviewed-on: https://review.whamcloud.com/33183
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11292 lnet: Discover routers on first use 82/33182/31
Amir Shehata [Tue, 28 Aug 2018 23:42:35 +0000 (16:42 -0700)]
LU-11292 lnet: Discover routers on first use

Discover routers on first use. This brings the behavior when
interacting with routers in line with when dealing with normal
peers.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8527e41daf2f5f6ab5f04aac1285aaa6cc4ee594
Reviewed-on: https://review.whamcloud.com/33182
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
10 months agoLU-10153 lnet: remove route add restriction 47/33447/23
Amir Shehata [Fri, 19 Oct 2018 23:23:40 +0000 (16:23 -0700)]
LU-10153 lnet: remove route add restriction

Remove the restriction on adding routes to the same remote network
via two different gateways.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iefc5aa10f73e9e7bdd283f5e933fbb8ee819df50
Reviewed-on: https://review.whamcloud.com/33447
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-12339 lnet: select LO interface for sending 57/34957/5
Amir Shehata [Sat, 25 May 2019 16:55:47 +0000 (09:55 -0700)]
LU-12339 lnet: select LO interface for sending

In the following scenario

Lustre->LNetPrimaryNID with 0@lo
Discovery is initiated on 0@lo
The peer is created with 0@lo and <addr>@<net>
The interface health of the peer's <addr>@<net> is decremented
LNetPut() to self
selection algorithm selects 0@lo to send to

This exposes an issue where we try and go through the peer credit
management algorithm, but because there are no credits associated with
0@lo we end up indefinitely queuing the message. ptlrpc will then get
stuck waiting for send completion on the message.

This was exposed via conf-sanity 32a

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I98e9d3428b594a0d041d27d8e8d8de7596825edc
Reviewed-on: https://review.whamcloud.com/34957
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12199 lnet: verify msg is committed for send/recv 97/34797/12
Amir Shehata [Tue, 30 Apr 2019 21:01:48 +0000 (14:01 -0700)]
LU-12199 lnet: verify msg is committed for send/recv

Before performing a health check make sure the message
is committed for either send or receive. Otherwise we
can just finalize it.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id7bd956f8e81e60a2d63059730973f851d4c7abe
Reviewed-on: https://review.whamcloud.com/34797
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
10 months agoLU-12199 lnet: Ensure md is detached when msg is not committed 85/34885/8
Chris Horn [Thu, 18 Apr 2019 03:49:18 +0000 (22:49 -0500)]
LU-12199 lnet: Ensure md is detached when msg is not committed

It's possible for lnet_is_health_check() to return "true" when the
message has not hit the network. In this situation the message is
freed without detaching the MD. As a result, requests do not receive
their unlink events and these requests are stuck forever.

A little cleanup is included here:
 - The value of lnet_is_health_check() is only used in one place, so
   we don't need to save the result of it in a variable.
 - We don't need separate logic to detach the md when the send was
   successful. We'll fall through to the finalizing code after
   incrementing the health counters

Test-Parameters: forbuildonly
Cray-bug-id: LUS-7239
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6301d491090b862d016eed3aac8afd7be8685e57
Reviewed-on: https://review.whamcloud.com/34885
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
10 months agoLU-12264 lnet: Protect lp_dc_pendq manipulation with lp_lock 98/34798/9
Chris Horn [Thu, 2 May 2019 22:24:32 +0000 (17:24 -0500)]
LU-12264 lnet: Protect lp_dc_pendq manipulation with lp_lock

Protect the peer discovery queue from concurrent manipulation by
acquiring the lp_lock.

Test-Parameters: forbuildonly
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: If43b877c1c7ea203f346a3d6ea846f00b8f9661f
Reviewed-on: https://review.whamcloud.com/34798
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
10 months agoLU-12254 lnet: correct discovery LNetEQFree() 96/34796/8
Amir Shehata [Tue, 30 Apr 2019 18:51:09 +0000 (11:51 -0700)]
LU-12254 lnet: correct discovery LNetEQFree()

The EQ needs to be freed after all the queues are cleaned to avoid
having unprocessed events on the event queue on free. Such events
would prevent the memory from being freed.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie38ec25e09bf6d7cf2aadc30edd91d298897c51b
Reviewed-on: https://review.whamcloud.com/34796
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12249 lnet: fix list corruption 78/34778/8
Amir Shehata [Tue, 30 Apr 2019 05:57:21 +0000 (22:57 -0700)]
LU-12249 lnet: fix list corruption

In shutdown the resend queues are cleared and freed. The monitor
thread state is set to shutdown. It is possible to get lnet_finalize()
called after the queues are freed. The code checks for ln_state to see
if we're shutting down. But in this case we should really be checking
ln_mt_state. The monitor thread is the one that matters in this case,
because it's the one which allocates and frees the resend queues.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia077cec7a52ef5cd2e1b231437c6265ba9416b1b
Reviewed-on: https://review.whamcloud.com/34778
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11297 lnet: invalidate recovery ping mdh 71/34771/8
Amir Shehata [Sat, 27 Apr 2019 22:47:42 +0000 (15:47 -0700)]
LU-11297 lnet: invalidate recovery ping mdh

For cleanliness, ensure that the recovery ping mdh is invalidated when
a peer NI or a local NI is allocated.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If06448b1602b3680831244923b6b982a555159ea
Reviewed-on: https://review.whamcloud.com/34771
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12201 lnet: detach response tracker 70/34770/8
Amir Shehata [Fri, 19 Apr 2019 00:12:49 +0000 (17:12 -0700)]
LU-12201 lnet: detach response tracker

We need to unlink the response tracker from MDs even if the
corresponding message failed to send.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4f320274576790e3332f66f30aad5c2b3450b955
Reviewed-on: https://review.whamcloud.com/34770
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12163 lnet: fix cpt locking 07/34607/9
Amir Shehata [Sat, 6 Apr 2019 00:38:38 +0000 (17:38 -0700)]
LU-12163 lnet: fix cpt locking

In lnet_select_pathway() the call to lnet_handle_send_case_locked()
can result in sd_cpt being changed. If this function returns
REPEAT_SEND, we'll go back to the again label. It is possible at
this time to initiate discovery, which will unlock the cpt.
If the local cpt isn't updated we could potentially be manipulating
the wrong cpt resulting in some form of corruption or deadlock.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ifd39b0d84f8cce859151f7cc900a082481dd7218
Reviewed-on: https://review.whamcloud.com/34607
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11816 lnet: setup health timeout defaults 52/34252/13
Amir Shehata [Wed, 19 Dec 2018 23:55:49 +0000 (15:55 -0800)]
LU-11816 lnet: setup health timeout defaults

Enable the health feature by default.
Set the transaction timeout to a default of 10 seconds and the
retry count to 3 when health is enabled. When health
is disabled set the default transaction timeout to 50.
When toggling between health enabled/disabled the defaults
will always kick in.
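
A minimal Python sketch of the toggle behaviour, using only the values
stated above. The function and key names are hypothetical. The message
does not give a unit for the value 50 or a retry count for the
disabled case, so the sketch leaves both as they are.

```python
def health_defaults(health_enabled):
    """Return the defaults that apply whenever health is toggled."""
    if health_enabled:
        # Values stated in the message: 10 second timeout, 3 retries.
        return {"transaction_timeout": 10, "retry_count": 3}
    # Only the timeout is stated for the disabled case.
    return {"transaction_timeout": 50}
```

Because the defaults are reapplied on every toggle, any manually tuned
value would be replaced when health is switched on or off.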

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I153c2822898b44e33871ec827de7e61f153bb1db
Reviewed-on: https://review.whamcloud.com/34252
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
10 months agoLU-12344 lnet: handle remote health error 67/34967/2
Amir Shehata [Mon, 27 May 2019 17:43:10 +0000 (10:43 -0700)]
LU-12344 lnet: handle remote health error

When a peer is dead set the health status to REMOTE_DROPPED
in order to handle health properly for the peer.
When dropping a routed message set REMOTE_ERROR. Routed messages
are dropped when the routing feature is turned off which could
be considered a configuration error if it happens in the middle
of traffic. Therefore, it's better to flag this issue at this
point without resending the message.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I131263215a68fc8607582643a47007ce4d04abbc
Reviewed-on: https://review.whamcloud.com/34967
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-12080 lnet: clean mt_eqh properly 77/34477/8
Amir Shehata [Wed, 20 Mar 2019 19:14:51 +0000 (12:14 -0700)]
LU-12080 lnet: clean mt_eqh properly

There is a scenario where you have a peer on your recovery queue
that's down. So you keep pinging it, but every ping times out
after 10 seconds. In the middle of these 10 seconds you perform a
shutdown. First you try to do the rsp_tracker_clean. It goes through
and calls MDUnlink on the MD related to that ping. But because the
message has a ref count on the MD, it doesn't go away. The MD gets
zombied and just waits for lnet_md_unlink() to be called in
lnet_finalize(). Then you hit clean_peer_ni_recovery. We see the peer
on the queue, we try to call Unlink on it, but when we look up the
MD using lnet_handle2md() we can't find it. Afterwards we try to clean
up the EQ and it asserts. Even if we remove the assert we end up with
a resource leak since the EQ is not actually freed since we won't call
LNetEQFree() again.

The solution is to move the EQ creation into LNetNIInit() and the
deletion into lnet_unprepare(). By this point all the remaining messages
would've been finalized and all references on the EQ are gone,
allowing us to clean it up properly.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I7fd6018ee2e57f82c649fc3658352e89a4309986
Reviewed-on: https://review.whamcloud.com/34477
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12080 lnet: recovery event handling broken 45/34445/7
Amir Shehata [Sun, 17 Mar 2019 15:16:40 +0000 (08:16 -0700)]
LU-12080 lnet: recovery event handling broken

Don't increment health on unlink event.
If a SEND fails an unlink will follow so no need to do any
special processing on SEND event. If SEND succeeds then we
wait for the reply.
When queuing a message on the NI recovery queue only do so
if the MT thread is still running.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I4877caebcac5cdfc35a59a18a3e3451b1f23cb0d
Reviewed-on: https://review.whamcloud.com/34445
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12014 llite: check correct size in ll_dom_finish_open() 95/33895/10
Mikhail Pershin [Wed, 19 Dec 2018 19:28:53 +0000 (22:28 +0300)]
LU-12014 llite: check correct size in ll_dom_finish_open()

The check in ll_dom_finish_open() for data end shouldn't
use i_size for comparison because i_size may not yet be updated
with the data just returned from the server. Use the size value
from mdt_body in the reply for that check.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I1104fbbb0eb4633869b9bf2d1803ac3e84e3853d
Reviewed-on: https://review.whamcloud.com/33895
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
10 months agoLU-11213 lmv: mkdir with balanced space usage 60/34360/15
Lai Siyao [Fri, 15 Feb 2019 14:07:56 +0000 (22:07 +0800)]
LU-11213 lmv: mkdir with balanced space usage

If a plain directory's default LMV hash type is "space", create
subdirs on all MDTs with balanced space usage:
* client mkdir allocates the FID on an MDT with balanced space usage
  (space QoS code is in next patch).
* MDT allows mkdir on a different MDT from its parent if the parent
  has "space" hash type in default LMV; this is normally rejected
  because mkdir shouldn't create a remote directory.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I284e21f334c07462211be4c8e38e965722d1e8a8
Reviewed-on: https://review.whamcloud.com/34360
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11213 mdc: add async statfs 59/34359/16
Lai Siyao [Fri, 15 Feb 2019 11:12:34 +0000 (19:12 +0800)]
LU-11213 mdc: add async statfs

Add obd_statfs_async() interface for MDC, the statfs request
is sent by ptlrpcd.

This statfs result is reported for each MDT separately, unlike
the currently cached statfs, which is an aggregate over all
MDTs.

The max age of statfs result is decided by lmv_desc.ld_qos_maxage.

It will deactivate MDC on failure, and activate MDC on success.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8e1bd104fb60ff81e2eb26e49a89a5baf8050d47
Reviewed-on: https://review.whamcloud.com/34359
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11213 ptlrpc: intent_getattr fetches default LMV 02/34802/17
Lai Siyao [Thu, 18 Apr 2019 10:01:47 +0000 (18:01 +0800)]
LU-11213 ptlrpc: intent_getattr fetches default LMV

Intent_getattr fetches default LMV, and caches it on client,
which will be used in subdir creation.

* Add RMF_DEFAULT_MDT_MD in intent_getattr reply.
* Save default LMV in ll_inode_info->lli_default_lsm_md, and
  replace lli_def_stripe_offset with it.
* take LOOKUP lock on default LMV setting to let client update
  cached default LMV.
* improve mdt_object_striped() to read from bottom device
  to avoid reading stripe FIDs.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Idb369db2c514a9c5108390f70d9284b3a87d26db
Reviewed-on: https://review.whamcloud.com/34802
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12374 lustre: push rcu_barrier() before destroying slab 30/35030/2
Wang Shilong [Sat, 1 Jun 2019 11:22:11 +0000 (19:22 +0800)]
LU-12374 lustre: push rcu_barrier() before destroying slab

From rcubarrier.txt:

"
We could try placing a synchronize_rcu() in the module-exit code path,
but this is not sufficient. Although synchronize_rcu() does wait for a
grace period to elapse, it does not wait for the callbacks to complete.

One might be tempted to try several back-to-back synchronize_rcu()
calls, but this is still not guaranteed to work. If there is a very
heavy RCU-callback load, then some of the callbacks might be deferred
in order to allow other processing to proceed. Such deferral is required
in realtime kernels in order to avoid excessive scheduling latencies.

We instead need the rcu_barrier() primitive. This primitive is similar
to synchronize_rcu(), but instead of waiting solely for a grace
period to elapse, it also waits for all outstanding RCU callbacks to
complete. Pseudo-code using rcu_barrier() is as follows:

   1. Prevent any new RCU callbacks from being posted.
   2. Execute rcu_barrier().
   3. Allow the module to be unloaded.
"

So using synchronize_rcu() in ldlm_exit() is not safe enough, and we might
still hit a use-after-free problem. We also missed rcu_barrier() when
destroying the inode cache; adding it follows what current local
filesystems do.
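
The difference can be illustrated with a toy model. This is a Python
sketch, not kernel code: ToyRcu, ToySlab and simulate_exit are
hypothetical names, and the model assumes the worst case in which no
queued callback runs before the slab is destroyed.

```python
class ToySlab:
    """Stands in for a kmem cache; freeing into it after destroy is a bug."""

    def __init__(self):
        self.destroyed = False

    def free_object(self):
        if self.destroyed:
            raise RuntimeError("use-after-free: slab already destroyed")


class ToyRcu:
    """Toy model of deferred RCU callbacks (worst case: nothing runs early)."""

    def __init__(self):
        self.pending = []

    def call_rcu(self, callback):
        self.pending.append(callback)

    def synchronize_rcu(self):
        # Models waiting for a grace period only: queued callbacks stay queued.
        pass

    def rcu_barrier(self):
        # Models waiting until every queued callback has completed.
        while self.pending:
            self.pending.pop(0)()


def simulate_exit(use_barrier):
    """Return True if module exit is safe in this toy model."""
    rcu, slab = ToyRcu(), ToySlab()
    rcu.call_rcu(slab.free_object)  # a deferred free is in flight at exit
    if use_barrier:
        rcu.rcu_barrier()
    else:
        rcu.synchronize_rcu()
    slab.destroyed = True  # module exit destroys the slab
    try:
        rcu.rcu_barrier()  # leftover callbacks eventually run
    except RuntimeError:
        return False
    return True


print(simulate_exit(use_barrier=False), simulate_exit(use_barrier=True))
```

In this model only the rcu_barrier() path drains the deferred free
before the slab goes away.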

Change-Id: I76c7dfe7b6472d377fe1b60b0891c61ac8a0fbfc
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35030
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11204 obdclass: remove unprotected access to lu_object 60/34960/2
Mikhail Pershin [Sun, 26 May 2019 17:46:43 +0000 (20:46 +0300)]
LU-11204 obdclass: remove unprotected access to lu_object

The check of lu_object_is_dying() is done after the reference
drop and without a lock, so it can access a freed object if a
concurrent thread did the final put.

The patch saves the object state right before atomic_dec_and_lock()
and checks the saved state afterwards, so the object is not accessed
after the reference drop.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: I6407cdb079777e60cc0a7aecb64e3a559210b504
Reviewed-on: https://review.whamcloud.com/34960
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12219 obdfilter: changes PAGE_SIZE variable 54/34754/3
Alexander Boyko [Wed, 24 Apr 2019 14:14:57 +0000 (10:14 -0400)]
LU-12219 obdfilter: changes PAGE_SIZE variable

obdfilter-survey uses PAGE_SIZE in KBytes. After LU-11597,
PAGE_SIZE is exported from test-framework.sh in bytes. This confuses
obdfilter-survey and leads to the error:
/usr/bin/obdfilter-survey: line 509: size * 1024 / (actual_rsz * thr):
division by 0 (error token is ")")

Patch changes the name to PAGE_SIZE_KB.

Fixes: f602b5ec7f4 ("LU-11597 tests: fix O_DIRECT test usage for ARM")
Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-7214
Change-Id: Ie8be852c9634569c59a770ba49c3d1c36f53fdb2
Reviewed-on: https://review.whamcloud.com/34754
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoNew tag 2.12.54 2.12.54 v2_12_54
Oleg Drokin [Wed, 5 Jun 2019 06:37:03 +0000 (02:37 -0400)]
New tag 2.12.54

Change-Id: I17db7c495c4419c0815398f531b6407269355892
Signed-off-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12361 lov: fix wrong calculated length for fiemap 98/34998/3
Wang Shilong [Thu, 30 May 2019 14:46:09 +0000 (22:46 +0800)]
LU-12361 lov: fix wrong calculated length for fiemap

lov_stripe_intersects() will return a closed interval
[@obd_start, @obd_end], so to calculate the length of the interval we need

 @obd_end - @obd_start + 1

rather than

 @obd_end - @obd_start

Wrong extent length will make us return wrong fiemap information.
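
A minimal illustration of the off-by-one, as a Python sketch rather
than the Lustre C code (the function name closed_interval_length is
hypothetical):

```python
def closed_interval_length(obd_start: int, obd_end: int) -> int:
    """Length of the closed interval [obd_start, obd_end], both ends included."""
    if obd_end < obd_start:
        raise ValueError("obd_end must not be smaller than obd_start")
    return obd_end - obd_start + 1


# An extent covering bytes 0..1048575 is exactly 1 MiB long.
print(closed_interval_length(0, 1048575))
```

Dropping the "+ 1" would report the same extent as one byte short.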

Change-Id: I30deb17cf5fa80a6d3046098fbac0d3faa01ad1c
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/34998
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12352 libcfs: crashes with certain cpu part numbers 91/34991/6
Andrew Perepechko [Thu, 17 Jan 2019 21:58:10 +0000 (00:58 +0300)]
LU-12352 libcfs: crashes with certain cpu part numbers

Due to a bug in the code, libcfs will crash if the
number of online cpus is not divisible by the number
of cpu partitions.

Based on the checks in cfs_cpt_table_create(), it
appears that the original intent was to push the
remaining cpus into the initial partitions.

So let's do that properly.
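
One way to picture that distribution, as a Python sketch and not the
cfs_cpt_table_create() code: the function name cpus_per_partition is
hypothetical, and handing out the leftover cpus one per partition is an
assumption made for illustration.

```python
from typing import List


def cpus_per_partition(num_cpus: int, num_parts: int) -> List[int]:
    """Split num_cpus over num_parts, giving leftovers to the first partitions."""
    if num_parts <= 0 or num_cpus < num_parts:
        raise ValueError("need at least one cpu per partition")
    base, remainder = divmod(num_cpus, num_parts)
    # The first `remainder` partitions each take one extra cpu.
    return [base + 1 if i < remainder else base for i in range(num_parts)]


print(cpus_per_partition(10, 4))
```

Every cpu ends up in some partition even when the division is not even.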

Change-Id: I3c5e2aa1fdfca4c07e7afce143c984973373f009
Signed-off-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Cray-bug-id: LUS-6455
Reviewed-on: https://review.whamcloud.com/34991
Tested-by: Jenkins
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12341 tests: Add kmemleak awareness to test-framework 59/34959/4
Oleg Drokin [Sun, 26 May 2019 18:53:21 +0000 (14:53 -0400)]
LU-12341 tests: Add kmemleak awareness to test-framework

If active kmemleak is detected, perform a clear operation to
ensure all non-Lustre related leaks are not getting in the way.

When it comes time to unload modules, first perform a scan
and then save the output if it's not empty, print to
syslog (for simplicity).
Also save /proc/modules content for the next step (we can save ogdb
from /tmp, but that seems to be getting stale and needs its own
fixing)

After modules unload perform another scan and if the result is non-empty,
output the saved /proc/modules output and the updated memleaks
into syslog as well

Test-Parameters: trivial
Change-Id: Ibba9047e4d8b98e7ab74aeb0906078549029ad43
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34959
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
10 months agoLU-12306 kernel: kernel update RHEL7.6 [3.10.0-957.12.2.el7] 97/34897/3
Jian Yu [Sat, 18 May 2019 05:21:12 +0000 (22:21 -0700)]
LU-12306 kernel: kernel update RHEL7.6 [3.10.0-957.12.2.el7]

Update RHEL7.6 kernel to 3.10.0-957.12.2.el7.

Test-Parameters: clientdistro=el7.6 serverdistro=el7.6

Change-Id: I8124f68085af2b6d8228166e84745cb94edb7fb0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34897
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11963 osd: Add nonrotational flag to statfs 35/34235/11
Patrick Farrell [Wed, 27 Feb 2019 21:29:59 +0000 (16:29 -0500)]
LU-11963 osd: Add nonrotational flag to statfs

It is potentially useful for the MDS and userspace to
know whether or not an OST is using non-rotational media.

Add a flag to obd_statfs that reflects this.

Users can override this parameter in proc.

ZFS does not currently make this information available to
Lustre, so default to rotational and allow users to
override.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Iac2b54c5d8cc1eb79cdace764e93578c7b058661
Reviewed-on: https://review.whamcloud.com/34235
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
10 months agoLU-9581 tests: remove recovery-small test from ALWAYS_EXCEPT 82/27382/8
James Nunez [Wed, 27 Jun 2018 19:34:28 +0000 (13:34 -0600)]
LU-9581 tests: remove recovery-small test from ALWAYS_EXCEPT

recovery-small test 52 is currently skipped during testing
because its test number was added to the ALWAYS_EXCEPT list
due to bugzilla bug number 5493.

Remove recovery-small test 52 from the ALWAYS_EXCEPT
list and start running this test again.

Test-Parameters: trivial testlist=recovery-small
Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I50e30831ee5af8e063dc4b6197141fed365535b6
Reviewed-on: https://review.whamcloud.com/27382
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12120 grants: prevent negative ted_grant value 96/34996/2
Mikhail Pershin [Thu, 30 May 2019 09:30:43 +0000 (12:30 +0300)]
LU-12120 grants: prevent negative ted_grant value

Add check in tgt_grant_shrink() to protect ted_grant
against negative value.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Iddea86f052124413ac60f5d0f26bcb68e376ede5
Reviewed-on: https://review.whamcloud.com/34996
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11213 dne: add new dir hash type "space" 58/34358/12
Lai Siyao [Thu, 14 Feb 2019 21:16:33 +0000 (05:16 +0800)]
LU-11213 dne: add new dir hash type "space"

Add a new hash type "space". If this is set on the default LMV of
a directory, its subdirs will be created on all MDTs with
balanced space usage.

* new hash type LMV_HASH_TYPE_SPACE.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I8edf38f94e24965b1cffb21253c3be0eef68c707
Reviewed-on: https://review.whamcloud.com/34358
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
10 months agoLU-12269 kernel: new kernel [RHEL 8.0 4.18.0-80.el8] 62/34862/8
Jian Yu [Wed, 22 May 2019 19:24:14 +0000 (12:24 -0700)]
LU-12269 kernel: new kernel [RHEL 8.0 4.18.0-80.el8]

This patch makes changes to support new RHEL 8.0 release
for Lustre client.

Test-Parameters: trivial

Change-Id: I89b4f1e59f8b25bf9d37d3564e2d05d6e87d9b38
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34862
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12140 lnet: adds checking msg len 75/34975/5
Alexander Boyko [Tue, 28 May 2019 10:07:12 +0000 (06:07 -0400)]
LU-12140 lnet: adds checking msg len

The LNET can't handle a msg with len larger than LNET_MTU.
The following error occurred for DOM 1MB:
 LNetError: 3137:0:(lib-move.c:4143:lnet_parse()) 192.168.8.1@tcp,
 src 192.168.8.1@tcp: bad PUT payload 1051832 (1048576 max expected)

The patch adds fragment size check.
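
The comparison behind that error can be sketched as follows. This is
illustrative Python, not the LNet code: payload_fits is a hypothetical
helper, and the limit is taken from the "1048576 max expected" text in
the error above.

```python
LNET_MTU = 1 << 20  # 1048576 bytes, the maximum quoted in the error message


def payload_fits(payload_len: int, max_len: int = LNET_MTU) -> bool:
    """Return True if a payload of payload_len bytes fits within max_len."""
    return 0 <= payload_len <= max_len


# The payload from the error message exceeds the limit by 3256 bytes.
print(payload_fits(1051832), 1051832 - LNET_MTU)
```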

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-7174
Change-Id: Id2d21ebd87ab0bf3a9114548900fab99b278ffb0
Reviewed-on: https://review.whamcloud.com/34975
Tested-by: Jenkins
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11623 llite: hash just created files if lock allows 84/33584/9
Oleg Drokin [Tue, 6 Nov 2018 00:26:44 +0000 (19:26 -0500)]
LU-11623 llite: hash just created files if lock allows

If open|creat (and other intent operations later) returned a lookup bit
as part of the lock, hash the resultant dentry under this lock,
not to trigger further RPCs in subsequent lookups.

Benchmark results:

This patch can significantly improve open-create + stat on the same
client.

This patch in combination with two others:

https://review.whamcloud.com/32157
https://review.whamcloud.com/33585

Improves the 'stat' side of open-create + stat by >10x.

Without patches (master branch commit 26a7abe):

mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d /cache1/out/ -F -C -T -v -w 32k

   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       3838.205       3838.204       3838.204          0.000
   File stat         :      33459.289      33459.249      33459.271          0.011
   File read         :          0.000          0.000          0.000          0.000
   File removal      :          0.000          0.000          0.000          0.000
   Tree creation     :       3146.841       3146.841       3146.841          0.000
   Tree removal      :          0.000          0.000          0.000          0.000

With the three patches:

mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d /cache1/out/ -F -C -T -v -w 32k
SUMMARY rate: (of 1 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       3822.440       3822.439       3822.440          0.000
   File stat         :     350620.140     350615.980     350617.193          1.051
   File read         :          0.000          0.000          0.000          0.000
   File removal      :          0.000          0.000          0.000          0.000
   Tree creation     :       2076.727       2076.727       2076.727          0.000
   Tree removal      :          0.000          0.000          0.000          0.000

Note 33K stats/second vs 350K stats/second.

ls -l time of the mdtest directory is also reduced from 23.5 seconds to
5.8 seconds.
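
The ratios can be checked from the mean values quoted above (a small
Python calculation; the variable names are illustrative only):

```python
# Mean rates and timings copied from the benchmark output above.
stat_before = 33459.271   # File stat mean, without patches
stat_after = 350617.193   # File stat mean, with the three patches
ls_before = 23.5          # ls -l time in seconds, without patches
ls_after = 5.8            # ls -l time in seconds, with patches

stat_speedup = stat_after / stat_before
ls_speedup = ls_before / ls_after
print(f"stat: {stat_speedup:.1f}x, ls -l: {ls_speedup:.1f}x")
```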

Change-Id: Id5140d1042af7f5ab9052922e11a7eda8f92a29a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33584
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
10 months agoLU-10948 llite: Revalidate dentries in ll_intent_file_open 57/32157/9
Oleg Drokin [Wed, 25 Apr 2018 19:04:48 +0000 (15:04 -0400)]
LU-10948 llite: Revalidate dentries in ll_intent_file_open

We might get a lookup lock in response to our open request and we
definitely want to ensure that our dentry is valid, so it could
actually be matched by dcache code in future operations.

Benchmark results:

This patch can significantly improve open-create + stat on the same
client.

This patch in combination with two others:

https://review.whamcloud.com/#/c/33584
https://review.whamcloud.com/#/c/33585

Improves the 'stat' side of open-create + stat by >10x.

Without patches (master branch commit 26a7abe):

mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d /cache1/out/ -F -C -T -v -w 32k

   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       3838.205       3838.204       3838.204          0.000
   File stat         :      33459.289      33459.249      33459.271          0.011
   File read         :          0.000          0.000          0.000          0.000
   File removal      :          0.000          0.000          0.000          0.000
   Tree creation     :       3146.841       3146.841       3146.841          0.000
   Tree removal      :          0.000          0.000          0.000          0.000

With the three patches:

mpirun -np 24 --allow-run-as-root /work/tools/bin/mdtest -n 50000 -d /cache1/out/ -F -C -T -v -w 32k
SUMMARY rate: (of 1 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   File creation     :       3822.440       3822.439       3822.440          0.000
   File stat         :     350620.140     350615.980     350617.193          1.051
   File read         :          0.000          0.000          0.000          0.000
   File removal      :          0.000          0.000          0.000          0.000
   Tree creation     :       2076.727       2076.727       2076.727          0.000
   Tree removal      :          0.000          0.000          0.000          0.000

Note 33K stats/second vs 350K stats/second.

ls -l time of the mdtest directory is also reduced from 23.5 seconds to
5.8 seconds.

Change-Id: I2cb4f94c0300897adb90cc89425e5cfb1c6fe7af
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/32157
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-11041 kernel: Enable tons of kernel debug options 93/32493/14
Minh Diep [Tue, 5 Feb 2019 20:13:27 +0000 (12:13 -0800)]
LU-11041 kernel: Enable tons of kernel debug options

Enable extra debugging options in rhel7 kernels
Create new lbuild option to build with the file config file

Test-Parameters: trivial

Change-Id: I29f503dcc97ff79e27539667e3f1d0edb33c23f4
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/32493
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-12345 ldiskfs: optimize nodelalloc mode 82/34982/2
Artem Blagodarenko [Tue, 28 May 2019 16:51:21 +0000 (19:51 +0300)]
LU-12345 ldiskfs: optimize nodelalloc mode

We found performance regression when using bigalloc with "nodelalloc"
(1MB cluster size):

1. mke2fs -C 1048576 -O ^has_journal,bigalloc /dev/sda
2. mount -o nodelalloc /dev/sda /test/
3. time dd if=/dev/zero of=/test/io bs=1048576 count=1024

The "dd" takes about 2 seconds to finish, but if we mke2fs without
"bigalloc", "dd" takes less than 1 second.

The reason is: when using ext4 with "nodelalloc", it will call
ext4_find_delalloc_cluster() nearly every time it calls
ext4_ext_map_blocks(), and ext4_find_delalloc_range() will also scan
all pages in the cluster because no buffer is "delayed".  A cluster has
256 pages (1MB cluster), so it will scan 256 * 256k pages when creating
a 1G file. That severely hurts the performance.
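
The scale of that scan can be worked out directly. The Python
calculation below assumes 4 KiB pages (implied by 256 pages per 1MB
cluster) and one full cluster scan per page of the file, which is a
simplification of the call pattern described above.

```python
PAGE_SIZE = 4096          # assumed 4 KiB pages
CLUSTER_SIZE = 1 << 20    # 1MB bigalloc cluster
FILE_SIZE = 1 << 30       # 1G file written by dd

pages_per_cluster = CLUSTER_SIZE // PAGE_SIZE  # pages scanned per call
pages_in_file = FILE_SIZE // PAGE_SIZE         # pages in the file ("256k")
pages_scanned = pages_per_cluster * pages_in_file

print(pages_per_cluster, pages_in_file, pages_scanned)
```

Under these assumptions the write touches roughly 67 million page
lookups that can never find a delayed buffer.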

Therefore, we return immediately from ext4_find_delalloc_range() in
nodelalloc mode, since by definition there can't be any delalloc
pages.

The same optimization is also added to the ldiskfs_find_delayed_extent()
function, which improves performance dramatically.

Here are the results of testing on a two-node system.
Without the patch:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00   56.30    0.06    0.00   43.63

Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sds          0.00     0.00     0.00  1174.00     0.00     4.59     8.00     0.84     0.71     0.00     0.71     0.01     1.20

With patch:
08/29/2018 01:13:22 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    4.13   30.37    0.00   65.50

Device:    rrqm/s   wrqm/s      r/s      w/s    rMB/s    wMB/s avgrq-sz avgqu-sz    await  r_await  w_await    svctm    %util
sds          0.00     0.00     0.00 54117.82     0.00   211.43     8.00   152.59     2.82     0.00     2.82     0.02    99.01

Cray-bug-id: LUS-5835
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Change-Id: Ie33410d4481778ee4f76a054ab8cfc11cc19a0ed
Reviewed-on: https://review.whamcloud.com/34982
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
10 months agoLU-10467 ptlrpc: discard SVC_SIGNAL and related functions 56/34956/2
NeilBrown [Sat, 25 May 2019 15:19:30 +0000 (11:19 -0400)]
LU-10467 ptlrpc: discard SVC_SIGNAL and related functions

This flag is never set, so remove checks and remove
the flag.

Linux-commit: 7f76eb1a6bb7587cbfee410df914bc83f717a362

Change-Id: I4f0c082392b4c140c85da2dcc149a682b2f37fea
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/34956
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
10 months agoLU-12323 libcfs: check if save_stack_trace_tsk is exported 37/34937/9
Chris Horn [Wed, 22 May 2019 16:21:14 +0000 (11:21 -0500)]
LU-12323 libcfs: check if save_stack_trace_tsk is exported

Lustre 2.12 commit afedf9343686504c89f2e28cf6133540166f2347 introduced
the use of save_stack_trace_tsk, but this symbol is not exported for
all architectures. When it's possible we can use save_stack_trace
instead. Otherwise skip printing stack trace.

Cray-bug-id: LUS-7352
Test-Parameters: clientarch=aarch64
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I142b542f5c5672abbad461a621aedd1e49db1bdd
Reviewed-on: https://review.whamcloud.com/34937
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>