Whamcloud - gitweb
fs/lustre-release.git
9 months ago LU-11673 tests: add space before ']' in test-framework 79/35079/4
James Nunez [Thu, 6 Jun 2019 13:48:13 +0000 (07:48 -0600)]
LU-11673 tests: add space before ']' in test-framework

The test command '[' expects spaces before all arguments
including the closing ']'.

Add a space before the closing ']' in the function
print_summary() in test-framework.sh.

Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: If2365cb5f2b9c003949c6224997644c61341fe35
Reviewed-on: https://review.whamcloud.com/35079
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
9 months ago LU-8066 obd: collect all resource releasing for obj_type. 16/34716/4
NeilBrown [Wed, 5 Jun 2019 16:36:23 +0000 (12:36 -0400)]
LU-8066 obd: collect all resource releasing for obj_type.

Now that obj_type is managed as a kobject, move all
the freeing and deregistering into class_sysfs_release().

Change-Id: I784287ea17e010206b5fa256c7a224d01085be92
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/34716
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11838 scrub: handle s_uuid change to uuid_t 89/34689/8
James Simmons [Thu, 20 Jun 2019 20:57:01 +0000 (16:57 -0400)]
LU-11838 scrub: handle s_uuid change to uuid_t

The 4.12 kernel changed the s_uuid field in struct super_block from
a character array to a uuid_t. ldiskfs has its own s_uuid field in
struct ext4_super_block, but that field is still a char array rather
than a uuid_t. There is an ongoing effort in the Linux kernel to move
to uuid_t, so I suspect this will change in the future. For that
reason, change all the character arrays used for uuid handling in the
scrubbing code to uuid_t. Change osd-ldiskfs to use the struct
super_block uuid, which is equivalent to the s_es version, to handle
the uuid_t changes now.

Change-Id: I40643d342b5bc17a6ef922e99b3e8524930822de
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34689
Tested-by: Jenkins
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-9859 libcfs: replace cfs_get_random_bytes calls with get_random_bytes() 34/35234/4
NeilBrown [Sat, 15 Jun 2019 01:21:15 +0000 (21:21 -0400)]
LU-9859 libcfs: replace cfs_get_random_bytes calls with get_random_bytes()

The cfs_get_random_bytes() interface adds nothing of value
to get_random_bytes() (which it uses internally).  So just use the
standard interface.

Linux-commit: e904f839cdb04d1b314753a83a6e58146e315c66

Change-Id: I48e153d7658f0f616afe4e884faeb09c2dbdcd03
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/35234
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9859 libcfs: deprecate libcfs_debug_vmsg2 24/35224/4
NeilBrown [Fri, 14 Jun 2019 01:56:34 +0000 (21:56 -0400)]
LU-9859 libcfs: deprecate libcfs_debug_vmsg2

Since 2.6.36, Linux's vsprintf has supported %pV,
which supports "recursive sprintf": exactly the task
that libcfs_debug_vmsg2 aims to perform.

Instead of calling libcfs_debug_vmsg2(), we can put the fmt and
args in a 'struct va_format', and pass the address of that structure
to the "%pV" format.

So do this to remove all users of libcfs_debug_vmsg2().

Linux-commit: 0fe922e1eca8e2850f0e6c535a14ba7414ca73c2

Change-Id: I6952ca8fdb619423639734aab1a30f4635b089cc
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35224
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12226 doc: recommend e2fsprogs 1.45.2.wc1 02/35202/2
Li Dongyang [Wed, 12 Jun 2019 06:17:00 +0000 (16:17 +1000)]
LU-12226 doc: recommend e2fsprogs 1.45.2.wc1

Update the recommended e2fsprogs version to 1.45.2.wc1.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I0eea35a0bcc24a6109d0c90254e9d071f70e8e9d
Reviewed-on: https://review.whamcloud.com/35202
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
9 months ago LU-8066 tests: use lod / osp tunables on servers 85/35185/2
James Simmons [Tue, 11 Jun 2019 15:39:17 +0000 (11:39 -0400)]
LU-8066 tests: use lod / osp tunables on servers

Before the Lustre 2.4 OSD work, the lov and osc code was used on
both servers and clients. The OSD layer work introduced the new lod
and osp layers, which are server specific. To avoid breakage,
symlinks were created that went from the lod / osp to the
lov / osc directories in the proc tree on the server side.

It has been a very long time since that change, so we can now
safely start to unwind that handling. The first step, taken here,
is to migrate the maloo test from using lov / osc for the server
tunables to using lod / osp instead.

Change-Id: I9dd562cd74d68aaa0226d5ab93042b52193604a1
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35185
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12420 utils: llog_reader handles uninitialized mountdata 78/35178/2
Li Xi [Tue, 11 Jun 2019 12:28:30 +0000 (20:28 +0800)]
LU-12420 utils: llog_reader handles uninitialized mountdata

When reading a mountdata file that has never been used, the "llog_reader
CONFIGS/mountdata" command crashes with the following output:

Header size : 500170753
Time : Wed Sep  4 00:57:37 6869
Number of records: 65534
Target uuid :
-----------------------
Segmentation fault

After applying this patch, llog_reader prints the following message
and quits in this situation:

Header size : 500170753
Time : Wed Sep  4 00:57:37 6869
Number of records: 65534
Target uuid :
-----------------------
uninitialized llog record at index 0

Change-Id: I25147f7fd09c6d59ff0049bdb20ac1979cf43ee4
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/35178
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12420 utils: llog_reader handles uninitialized llog properly 77/35177/3
Li Xi [Tue, 11 Jun 2019 08:26:22 +0000 (16:26 +0800)]
LU-12420 utils: llog_reader handles uninitialized llog properly

When reading an empty LLOG, llog_reader would crash because
the record count is zero. E.g. "llog_reader CONFIGS/nodemap" on
an MGS without nodemap configuration would fail with:

llog_reader: Error allocating -16 bytes for recs_buf: Cannot allocate memory (12)
llog_reader: Could not pack buffer.: Cannot allocate memory (12)

After applying this patch, llog_reader prints the following message
and quits if the LLOG is uninitialized:

uninitialized llog: zero record number
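
The guard described above can be sketched as follows. This is a minimal
illustration and not the actual llog_reader code: alloc_record_index()
and struct rec_index are hypothetical names, and only the error message
is taken from the commit text. The idea is to reject a zero (or negative)
record count before it is used to size an allocation.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for per-record pointers (not a Lustre type). */
struct rec_index {
	void **recs;
	int count;
};

/*
 * Validate the record count read from an llog header before sizing a
 * buffer with it. Returns 0 on success or a negative errno value.
 */
static int alloc_record_index(int recs_num, struct rec_index *out)
{
	if (recs_num <= 0) {
		fprintf(stderr, "uninitialized llog: zero record number\n");
		return -EINVAL;
	}

	out->recs = calloc((size_t)recs_num, sizeof(*out->recs));
	if (out->recs == NULL)
		return -ENOMEM;

	out->count = recs_num;
	return 0;
}
```

Checking the count first means the size calculation never sees a value
that could produce a negative or zero-byte request.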

Change-Id: I87246672e9fc992c99126134236c2e8d304df74b
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/35177
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12382 llite: fix deadloop with tiny write 58/35058/7
Wang Shilong [Tue, 4 Jun 2019 12:54:01 +0000 (20:54 +0800)]
LU-12382 llite: fix deadloop with tiny write

For a small write (<4K), we use tiny write, and
__generic_file_write_iter() is called to handle it.

On newer kernels (4.14 etc.), the function is exported and
does something like the following:

|->__generic_file_write_iter
  |->generic_perform_write()

If the iov_iter_count() passed in is 0, generic_perform_write() will
loop forever, as the number of bytes copied is always calculated as 0.

The problem is that the VFS doesn't always skip zero-count IO before it
reaches the lower-layer read/write hook, so we should do it ourselves.

To fix this problem, always return 0 early if there is no
real IO needed.
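
The early return can be sketched in userspace C as below. This is only
an illustration of the pattern, under the assumption that the caller
owns the zero-length check: tiny_write_sketch() and copy_chunks() are
hypothetical names, and copy_chunks() merely stands in for the
lower-layer helper (unlike the kernel loop, it is written to terminate
for any count).

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

#define SKETCH_CHUNK 4096

/* Hypothetical stand-in for the lower-layer copy loop. */
static ssize_t copy_chunks(char *dst, const char *src, size_t count)
{
	size_t done = 0;

	while (done < count) {
		size_t chunk = count - done;

		if (chunk > SKETCH_CHUNK)
			chunk = SKETCH_CHUNK;
		memcpy(dst + done, src + done, chunk);
		done += chunk;
	}
	return (ssize_t)done;
}

/* Return 0 before touching the lower layer when there is nothing to do. */
static ssize_t tiny_write_sketch(char *dst, const char *src, size_t count)
{
	if (count == 0)
		return 0;

	return copy_chunks(dst, src, count);
}
```

Keeping the check in the caller avoids relying on every layer above to
filter out zero-length requests.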

Change-Id: I765a723da79eb5fd09317c3fad47fe479b1dd4fb
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35058
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12330 obdclass: allow per-session jobids. 95/34995/9
NeilBrown [Thu, 23 May 2019 00:15:43 +0000 (10:15 +1000)]
LU-12330 obdclass: allow per-session jobids.

Lustre includes a jobid in all RPC messages sent to the server.  This
is used to collect per-job statistics, where a "job" can involve
multiple processes on multiple nodes in a cluster.

Nodes in a cluster can be running processes for multiple jobs, so it
is best if different processes can have different jobids, and if
processes on different nodes can have the same jobid.

The current mechanism for supporting this is to use an environment
variable which the kernel extracts from the relevant process's address
space.  Some kernel developers consider this an unacceptable design
choice, and the code is not likely to be accepted upstream.

This patch provides an alternate method, leveraging the concept of a
"session id", as set with setsid().  Each login session already gets a
unique sid which is preserved for all processes in that session unless
explicitly changed (with setsid(1)).
When a process in a session writes to
     /sys/fs/lustre/jobid_this_session
the string becomes the name for that session.
If jobid_var is set to "session", then the per-session jobid is used
for the jobid for all requests from processes in that session.

When a session ends, the jobid information will be purged within 5
minutes.
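
As a rough sketch of how a job launcher might name its session, the
helper below writes a jobid string to a file. The sysfs path comes from
the text above; set_session_jobid() is a hypothetical name, and the path
is passed as a parameter (an assumption made here) so the helper can be
tried against an ordinary file on a machine without Lustre.

```c
#include <errno.h>
#include <stdio.h>

/* Path named in the commit message. */
#define JOBID_THIS_SESSION "/sys/fs/lustre/jobid_this_session"

/*
 * Write `jobid` to `path`. Returns 0 on success or a negative errno
 * value.
 */
static int set_session_jobid(const char *path, const char *jobid)
{
	FILE *f = fopen(path, "w");
	int rc = 0;

	if (f == NULL)
		return -errno;

	if (fputs(jobid, f) == EOF)
		rc = -EIO;
	if (fclose(f) == EOF && rc == 0)
		rc = -errno;

	return rc;
}
```

A launcher would call set_session_jobid(JOBID_THIS_SESSION, name) from a
process inside the session, with jobid_var set to "session" as described
above.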

Change-Id: I6fb1a75f8f60f824e402706de0b1439464bfa05c
Signed-off-by: Mr NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/34995
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12043 llite: improve single-thread read performance 95/34095/35
Wang Shilong [Mon, 21 Jan 2019 12:23:47 +0000 (20:23 +0800)]
LU-12043 llite: improve single-thread read performance

Here is the whole history:

Currently, for sequential read IO, we grow the window
size very quickly, up to @max_readahead_per_file cached
pages. For the following command:

  dd if=/mnt/lustre/file of=/dev/null bs=1M

we will do something like the following:
...
64M bytes cached.
fast io for 16M bytes
readahead extra 16M to fill up window.
fast io for 16M bytes
readahead extra 16M to fill up window.
....

In this way, we can only use fast IO for 16M bytes before
falling back to non-fast IO mode. This is also the reason
why increasing @max_readahead_per_file doesn't improve
performance: the value only changes how much data we
cache in memory. During my testing, whatever value I set,
I could only get 2GB/s for a single-thread read.

Actually, we can do better: once more than 16M bytes of
readahead pages have been used, submit another readahead
request in the background. Ideally, we could then always
use fast IO.

Test                                                   Patched          Unpatched
dd if=file of=/dev/null bs=1M                          4.0G/s           1.9G/s
ior -np 192 r -t 1m -b 4g -F -e -vv -o /cache1/ior -k  11195.97 MB/sec  10817.02 MB/sec

OSS and client memory were dropped before every run.
max_readahead_per_mb=128M, RPC size is 16M.
The dd file size is 400G, which is roughly double the memory size.

Change-Id: I9b6be078ca24c256198488a9c1635791dafbd7e7
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/34095
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12043 llite,readahead: don't always use max RPC size 33/35033/4
Wang Shilong [Sun, 2 Jun 2019 15:17:26 +0000 (23:17 +0800)]
LU-12043 llite,readahead: don't always use max RPC size

Since 64M RPC landed, @PTLRPC_MAX_BRW_PAGES is 64M, and we
always use this maximum possible RPC size to check
whether we should avoid fast IO and trigger real context IO.

This is not good for the following reasons:

(1) The current default RPC size is still 4M, so
most systems won't use 64M most of the time.

(2) The current default readahead size per file is still 64M,
which makes fast IO always run out of readahead pages
before the next IO. This defeats what users really want from
readahead: grabbing pages in advance.

To fix this problem, we use 16M as a balanced value if the RPC size
is smaller than 16M. The patch also fixes the problem that
@ras_rpc_size could not grow, which is possible in the following case:

1) Set RPC to 16M
2) Set RPC to 64M

With the current logic, ras->ras_rpc_size stays at 16M, which is wrong.
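
The two rules above (a 16M floor, and a cached size that follows the
configured RPC size upward as well as downward) can be sketched as
below. This is a simplified illustration, not the llite code: struct
ras_sketch and both function names are hypothetical, and sizes are kept
in bytes here for readability.

```c
#define MIB (1024UL * 1024UL)
#define RAS_BALANCE_BYTES (16 * MIB)	/* the 16M value from the text */

/* Hypothetical, simplified readahead state (not the llite structure). */
struct ras_sketch {
	unsigned long rpc_bytes;
};

/* Use the configured RPC size, but never less than 16M. */
static unsigned long ras_effective_rpc_bytes(unsigned long configured)
{
	return configured < RAS_BALANCE_BYTES ? RAS_BALANCE_BYTES : configured;
}

/* Resync on every change so the cached value can grow as well as shrink. */
static void ras_update_rpc_size(struct ras_sketch *ras,
				unsigned long configured)
{
	unsigned long want = ras_effective_rpc_bytes(configured);

	if (ras->rpc_bytes != want)
		ras->rpc_bytes = want;
}
```

With this shape, setting the RPC size to 16M and then to 64M leaves the
cached value at 64M rather than stuck at 16M.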

Change-Id: Ida9f839f7c692cd88d32dc0909503f6ae991d909
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35033
Tested-by: Jenkins
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9010 lnet: Change static defines to use macro for module.c 32/33932/12
Arshad Hussain [Thu, 27 Dec 2018 12:26:17 +0000 (07:26 -0500)]
LU-9010 lnet: Change static defines to use macro for module.c

This patch replaces the statically defined mutexes in
lnet/lnet/module.c with the kernel-provided macro.

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I59de4514dc332c3c59e0d816720a81394521881c
Reviewed-on: https://review.whamcloud.com/33932
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11775 osc: reduce lock contention in osc_unreserve_grant 58/33858/6
Li Dongyang [Fri, 14 Dec 2018 01:19:09 +0000 (12:19 +1100)]
LU-11775 osc: reduce lock contention in osc_unreserve_grant

In osc_queue_async_io() the cl_loi_list_lock is acquired to reserve
and consume the grant and is then released. Right after we expand the
extent, the same lock is taken again to unreserve the grant.
We can keep holding the spinlock until we are done with the grant to
improve throughput.

mpirun  -np 32 /root/ior-openmpi/src/ior -w -t 1m -b 8g -F -e -vv
-o /scratch0/file -i 1
master:
Max Write: 13799.70 MiB/sec (14470.04 MB/sec)
master with 33858:
Max Write: 14339.57 MiB/sec (15036.13 MB/sec)

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ic61af84c7b98b5a189d7adabe33ae687954b2ed4
Reviewed-on: https://review.whamcloud.com/33858
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12411 lnet: Do not allow gateways on remote nets 98/35198/2
Chris Horn [Tue, 11 Jun 2019 19:59:31 +0000 (14:59 -0500)]
LU-12411 lnet: Do not allow gateways on remote nets

A gateway needs to be reachable over some local interface.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib66d4f8fd48d8863097280c480648ab8e29d2767
Reviewed-on: https://review.whamcloud.com/35198
Tested-by: Jenkins
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10070 ldlm: layout lock fixes 32/35232/7
Vitaly Fertman [Tue, 11 Jun 2019 23:18:45 +0000 (02:18 +0300)]
LU-10070 ldlm: layout lock fixes

As the intent_layout operation becomes more frequent with SEL,
cancel existing layout locks in advance and reuse ELC to deliver
cancels to the MDS.

As clients are given LCK_EX layout locks, take this mode into
account as well in ldlm_lock_match.

Cray-bug-id: LUS-2528
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I1525153b3a07385fc17ef5416ded7b6d4378b2ec
Reviewed-on: https://review.whamcloud.com/35232
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12447 utils: specify correct size for lfs project buffer 57/35257/2
Wang Shilong [Tue, 18 Jun 2019 00:38:17 +0000 (20:38 -0400)]
LU-12447 utils: specify correct size for lfs project buffer

Environment:
Fedora release 28 (Twenty Eight)

gcc (GCC) 8.0.1 20180324 (Red Hat 8.0.1-0.20)
Copyright (C) 2018 Free Software Foundation, Inc.

Hit build failure:
lfs_project.c: In function ‘lfs_project_item_alloc’:
lfs_project.c:72:2: error: ‘strncpy’ specified bound 4096
equals destination size [-Werror=stringop-truncation]
  strncpy(lpi->lpi_pathname, pathname, sizeof(lpi->lpi_pathname));
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
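
The warning fires because a strncpy() bound equal to the destination
size can leave the buffer without a terminating NUL. One common way to
satisfy it is sketched below; this is not necessarily what the patch
itself does, and copy_pathname() is a hypothetical helper name.

```c
#include <errno.h>
#include <string.h>

/*
 * Copy `src` into a fixed-size buffer, always leaving it NUL-terminated.
 * Returns 0 on success, -EINVAL for an empty buffer, or -ENAMETOOLONG
 * if `src` does not fit.
 */
static int copy_pathname(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return -EINVAL;
	if (strlen(src) >= dst_size)
		return -ENAMETOOLONG;

	/* Bound is one less than the buffer size, leaving room for NUL. */
	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';
	return 0;
}
```

Rejecting over-long input up front also avoids silently truncating a
pathname.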

Test-Parameters: trivial testlist=sanity-quota
Change-Id: Ia6429c47391bf503546609ec6a262fe24664bdd4
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35257
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12392 utils: specify correct size for the buffer 70/35070/6
Alex Zhuravlev [Wed, 5 Jun 2019 13:38:27 +0000 (16:38 +0300)]
LU-12392 utils: specify correct size for the buffer

Otherwise gcc8 emits a warning which breaks the build.

Change-Id: I6a94c6cd63473df9fc88b1867bbda1353fa10247
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35070
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
9 months ago LU-12355 ldiskfs: bio_phys_segments symbol is not exported 40/35040/3
Shaun Tancheff [Sat, 8 Jun 2019 17:05:04 +0000 (12:05 -0500)]
LU-12355 ldiskfs: bio_phys_segments symbol is not exported

As of kernel 5.0, bio_phys_segments is not exported.
It is only used in one CDEBUG(D_INODE, ...) call, so use
bio->bi_phys_segments directly.

Linux-commit: 6c210aa596d0ecf6f3eea65c02ac807877385a18

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I19cf7cab86ccebe4fccf7a34a945a4150069d18b
Reviewed-on: https://review.whamcloud.com/35040
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-8066 obd: cleanup server sysfs symlinks handling 15/34715/14
James Simmons [Wed, 5 Jun 2019 16:32:38 +0000 (12:32 -0400)]
LU-8066 obd: cleanup server sysfs symlinks handling

Rename class_setup_tunables() to class_add_symlinks(). Move all the
special sysfs and debugfs symlink handling into the function
class_add_symlinks(). Now that the obd_type is created using sysfs
handling, we can complete the initialization if the real obd device is
registered later. For example, if lod is registered first, it
creates the "lov" obd_type. If the lov module is loaded
later, class_register_type() will use the "lov" obd_type created
by the lod module.

Change-Id: I754ec15a88458b170422b988d783efbe20141b87
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34715
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12333 ptlrpc: Add more flags to DEBUG_REQ_FLAGS macro 90/35090/5
Vitaly Fertman [Wed, 5 Jun 2019 21:17:34 +0000 (00:17 +0300)]
LU-12333 ptlrpc: Add more flags to DEBUG_REQ_FLAGS macro

Add the rq_no_reply flag to the DEBUG_REQ_FLAGS macro for debug purposes.
Also, add another debug message to check_write_rcs.

Test-Parameters: trivial
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I39ea7e9359a377ad46f7600edad14375f9935793
Reviewed-on: https://review.whamcloud.com/35090
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-11264 llapi: reduce llapi_stripe_limit_check() overhead 91/35091/2
Andreas Dilger [Fri, 7 Jun 2019 03:34:49 +0000 (21:34 -0600)]
LU-11264 llapi: reduce llapi_stripe_limit_check() overhead

There is no need to check PAGE_SIZE in llapi_stripe_limit_check()
every time, since this cannot change between calls.

Always set errno if an error is returned.
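
One way to read the two points above is sketched here: look up the page
size once, keep it in a static variable, and set errno on every error
return. This is an assumption-laden illustration, not the liblustreapi
implementation; stripe_size_check_sketch() is a hypothetical name and
the alignment rule is simplified to "non-zero multiple of the page size".

```c
#include <errno.h>
#include <unistd.h>

/*
 * Return 0 if `stripe_size` is a non-zero multiple of the page size,
 * otherwise set errno to EINVAL and return -1.
 */
static int stripe_size_check_sketch(unsigned long long stripe_size)
{
	static long page_size;	/* 0 until the first call, then cached */

	if (page_size == 0)
		page_size = sysconf(_SC_PAGESIZE);

	/* Page sizes are powers of two, so a mask test is sufficient. */
	if (page_size <= 0 || stripe_size == 0 ||
	    (stripe_size & ((unsigned long long)page_size - 1)) != 0) {
		errno = EINVAL;
		return -1;
	}
	return 0;
}
```

Because the page size cannot change while the process runs, later calls
skip the sysconf() lookup entirely.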

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib377c1cc734c9e683f75eeb509e220c4ea3ebbe5
Reviewed-on: https://review.whamcloud.com/35091
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11089 obd: rename lu_keys_guard to lu_context_remembered_guard 73/33673/8
NeilBrown [Wed, 5 Jun 2019 13:26:36 +0000 (09:26 -0400)]
LU-11089 obd: rename lu_keys_guard to lu_context_remembered_guard

The only remaining use of lu_keys_guard is to protect the
lu_context_remembers linked list, and write_lock() is
always used.
So rename it to reflect this, and change to a spinlock.
We move keys_fini() out of the locked region in
lu_context_fini() - once we have removed the context from
the lc_remembers list, there can no longer be a race.

Linux-commit: dc8aaaca0062878c2fbad9df1b9ac3e85cad8630

Change-Id: Id66930e073d5351a96b139f2fc1a8007841de728
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/33673
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11213 lod: default LMV can't be deleted 07/35207/3
Lai Siyao [Wed, 12 Jun 2019 10:33:12 +0000 (18:33 +0800)]
LU-11213 lod: default LMV can't be deleted

When the 'space' hash type was introduced, default LMV deletion added
a check for the hash type, but it only checks whether the type is 'none',
while by default it's 'fnv_1a_64', so the default LMV can't
be deleted.

Change check to !LMV_HASH_TYPE_SPACE and update test 413b.

Fixes: a24f6153292 ("LU-11213 dne: add new dir hash type 'space'")

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I88c630dc8d339ddeb9dc03d6f8987d8783062a13
Reviewed-on: https://review.whamcloud.com/35207
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11213 doc: update lfs-setdirstripe man 'space' hash 08/35208/3
Lai Siyao [Wed, 12 Jun 2019 10:54:06 +0000 (18:54 +0800)]
LU-11213 doc: update lfs-setdirstripe man 'space' hash

Add a description of the 'space' hash to the lfs-setdirstripe man page.

Test-Parameters: trivial
Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I847618c5c98925ebc7b1bd124173ffb5adfe90dd
Reviewed-on: https://review.whamcloud.com/35208
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Joseph Gmitter <jgmitter@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12355 lnet: ib_fmr_pool_unmap returns void 17/35017/3
Shaun Tancheff [Thu, 13 Jun 2019 19:04:54 +0000 (14:04 -0500)]
LU-12355 lnet: ib_fmr_pool_unmap returns void

Historically, ib_fmr_pool_unmap only ever returned 0.
Linux kernel 4.20 changed the return type of ib_fmr_pool_unmap to void.

Linux-commit: 3eeeb7a59acddaa326b03efdf6dce61c120449a3

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I49d91a49c452dad5c7d9b153fdbc011f2f25743a
Reviewed-on: https://review.whamcloud.com/35017
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12355 llite: vfs atomic_open change with FMODE_CREATED 20/35020/6
Shaun Tancheff [Thu, 13 Jun 2019 13:57:33 +0000 (08:57 -0500)]
LU-12355 llite: vfs atomic_open change with FMODE_CREATED

Kernel 4.19 introduced FMODE_CREATED and removed the last
argument to vfs atomic_open; the f_mode flags are now used
to indicate the created state on return.

Linux-commit: 73a09dd94377e4b186b300bd5461920710c7c3d5

Test-Parameters: trivial
Change-Id: I26d4aadb123bb1d1bc0aa1d78a64a75b94276ffb
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35020
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12355 ldiskfs: Added ext4_iget_flags to ext4_iget 23/35023/5
Shaun Tancheff [Thu, 13 Jun 2019 14:14:41 +0000 (09:14 -0500)]
LU-12355 ldiskfs: Added ext4_iget_flags to ext4_iget

Kernel 4.19 introduced ext4_iget_flags and changed ext4_iget
to require a flags argument.

Use EXT4_IGET_HANDLE for stale inode checking by ext4/ldiskfs

Linux-commit: 8a363970d1dc38c4ec4ad575c862f776f468d057

Test-Parameters: trivial
Change-Id: Ic6cbe3eabcdbb1a506813d145fb2980dbed95e66
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35023
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12355 osd-ldiskfs: timespec_trunc removed 50/35050/7
Shaun Tancheff [Thu, 13 Jun 2019 18:27:19 +0000 (13:27 -0500)]
LU-12355 osd-ldiskfs: timespec_trunc removed

As part of the y2038 changes:

timespec64_trunc was added in kernel 4.17
inode i_Xtime values are timespec64 in kernel 4.18
timespec_trunc was removed in kernel 4.19

Harden the existing LC_INODE_TIMESPEC64 check and stop
using the deprecated timespec_trunc().

oti_time is set but never used, so remove it.

Linux-commit: 8efd6894ff089adeeac7cb9f32125b85d963d1bc
Linux-commit: 95582b00838837fc07e042979320caf917ce3fe6
Linux-commit: 976516404ff3fab2a8caa8bd6f5efc1437fed0b8

Test-Parameters: trivial
Change-Id: Idadbf6ddd9aedbd251d7bbbb5b1486d4aa757ac5
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35050
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
9 months ago LU-12430 build: remove misleading help message 05/35205/3
Li Xi [Wed, 12 Jun 2019 10:41:31 +0000 (18:41 +0800)]
LU-12430 build: remove misleading help message

build/README.kernel-source was deleted nine years ago, but
the build help message still says: "Consult build/README.kernel-source
for details."

This patch removes these messages.

Test-Parameters: trivial
Change-Id: I2ff3afc363205b13f7f5ef645ade2854bfc3fa47
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/35205
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-12436 lov: return error if cl_env_get fails 29/35229/2
Shaun Tancheff [Thu, 13 Jun 2019 22:51:38 +0000 (17:51 -0500)]
LU-12436 lov: return error if cl_env_get fails

When cl_env_get() fails with an error, return the error.

Test-Parameters: trivial
Cray-bug-id: LUS-7310
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Ia065aeb142a772f4d620b84111af423e27c06b90
Reviewed-on: https://review.whamcloud.com/35229
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11434 tests: add version check conf-sanity 109a/b 54/33954/17
James Nunez [Wed, 2 Jan 2019 23:02:29 +0000 (16:02 -0700)]
LU-11434 tests: add version check conf-sanity 109a/b

conf-sanity test 109a and 109b were added to Lustre with tag 2.10.59.
Thus, we need to check that the server is 2.10.59 or later before
running conf-sanity test 109a and 109b.
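
The actual check lives in the shell test scripts; the comparison it
relies on can be illustrated in C as below. Both function names are
hypothetical, and the packing assumes each version component is below
256.

```c
/* Pack major.minor.patch into one integer that compares in version order. */
static unsigned int version_code(unsigned int major, unsigned int minor,
				 unsigned int patch)
{
	return (major << 16) | (minor << 8) | patch;
}

/* Non-zero if the server is at 2.10.59 or later. */
static int server_has_test_109(unsigned int major, unsigned int minor,
			       unsigned int patch)
{
	return version_code(major, minor, patch) >= version_code(2, 10, 59);
}
```

Packing the components avoids comparing version strings lexically,
where "2.9" would sort after "2.10".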

Test-Parameters: trivial envdefinitions=ONLY=109 serverjob=lustre-b2_10 serverbuildno=170 testlist=conf-sanity
Test-Parameters: envdefinitions=ONLY=109 testlist=conf-sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a653d1374973f00d283af885183621ec14628e1
Reviewed-on: https://review.whamcloud.com/33954
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago New tag 2.12.55 2.12.55 v2_12_55
Oleg Drokin [Sun, 16 Jun 2019 03:38:29 +0000 (23:38 -0400)]
New tag 2.12.55

Change-Id: I6e76cc7c06092f54b778f6e45932e21427991dcf

9 months ago LU-12395 build: require python2 for lustre-iokit 94/35094/2
Li Dongyang [Fri, 7 Jun 2019 07:34:23 +0000 (17:34 +1000)]
LU-12395 build: require python2 for lustre-iokit

RHEL8 has split python into python2 and python3 rpms,
and neither of them provides python anymore.
We can just require python2 in the spec; other
distros all have a python rpm providing python2.

Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: I881c90a4e66d1a431d11d16b9e89781de2f87a7d
Reviewed-on: https://review.whamcloud.com/35094
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12383 utils: only check project inherit bit for dir 76/35076/2
Wang Shilong [Thu, 6 Jun 2019 02:36:39 +0000 (10:36 +0800)]
LU-12383 utils: only check project inherit bit for dir

Currently, ZFS won't set the inherit bit on regular files, but
ext4 always sets it. It doesn't make sense for regular files to
have this bit, but having it won't do any harm either.

To keep the tests happy and give users a consistent view,
fix the project check to only complain about errors for directories.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: I194f3ed9d6ded69313a683995295ab8c07b4fb3a
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35076
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12267 tests: filter trailing '.' for SELinux 26/35026/6
James Nunez [Fri, 31 May 2019 21:28:20 +0000 (15:28 -0600)]
LU-12267 tests: filter trailing '.' for SELinux

When SELinux is enforced, sanity test 420 fails due to
the "ls -n" command producing an extra '.' at the end of
the file/directory permissions to indicate extra security
attributes are set.

We need to filter out the trailing '.' in the 'ls -n'
output for testing to pass when SELinux is enabled.
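
The filtering described above can be sketched in Python as follows. This is
only an illustration: the real test is a shell script, and the helper name
strip_selinux_dot and the sample line are hypothetical.

```python
import re

# Mode field of `ls -n` output: 10 non-space characters, optionally followed
# by '.' when the file carries an SELinux security context.
_MODE_WITH_DOT = re.compile(r"^(\S{10})\.(?=\s|$)")


def strip_selinux_dot(ls_line):
    """Return ls_line with the trailing '.' removed from the mode field."""
    return _MODE_WITH_DOT.sub(r"\1", ls_line)


# Hypothetical sample line; the rest of the line is left untouched.
print(strip_selinux_dot("-rw-r--r--. 1 500 500 0 Jun  1 10:00 f"))
```

Lines whose mode field has no trailing '.' pass through unchanged, so the
same comparison works with SELinux enabled or disabled.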

Test-Parameters: trivial envdefinitions=ONLY=420 testlist=sanity
Test-Parameters: clientselinux envdefinitions=ONLY=420 testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I4a2f199d2ef4a7b1b6a1b381041b384bb0077cc6
Reviewed-on: https://review.whamcloud.com/35026
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-12399 tests: avoid 'pdsh localhost' in sanity test_420 76/35176/2
Sebastien Buisson [Tue, 11 Jun 2019 09:50:01 +0000 (11:50 +0200)]
LU-12399 tests: avoid 'pdsh localhost' in sanity test_420

sanity test_420 needs a clean env to execute openfile, ie not
inherited from root user.
Replace 'pdsh localhost' with simpler 'su - $uname -c' alternative
to achieve this.

Test-Parameters: trivial envdefinitions=ONLY=420 testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ifeba7fc1eba86d74a64cca187e286adb23147e2e
Reviewed-on: https://review.whamcloud.com/35176
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
9 months ago LU-12412 recovery: wake all waiters of trd_finishing 41/35141/2
Sergey Cheremencev [Wed, 20 Jan 2016 20:57:01 +0000 (23:57 +0300)]
LU-12412 recovery: wake all waiters of trd_finishing

There is a small window where lctl --device abort_recovery
and umount->...->stop_recovery_thread may be called before
recovery finishes. In such a case all threads need to be
woken up, so change complete to complete_all.
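
The difference between the two calls can be illustrated with a toy,
single-threaded Python model of a kernel completion counter. The class below
is a hypothetical sketch, not the kernel implementation: complete() releases
one waiter, while complete_all() releases every current and future waiter.

```python
class Completion:
    """Toy, single-threaded model of a Linux kernel completion counter."""

    ALL = float("inf")  # stands in for the kernel's "done forever" value

    def __init__(self):
        self.done = 0

    def complete(self):
        # Release exactly one waiter.
        if self.done != self.ALL:
            self.done += 1

    def complete_all(self):
        # Release every current and future waiter.
        self.done = self.ALL

    def try_wait(self):
        """Return True if a waiter would be released now."""
        if self.done == 0:
            return False
        if self.done != self.ALL:
            self.done -= 1
        return True


one = Completion()
one.complete()
print(one.try_wait(), one.try_wait())  # second waiter stays blocked

everyone = Completion()
everyone.complete_all()
print(everyone.try_wait(), everyone.try_wait())  # both waiters proceed
```

With two callers waiting on the same completion, a single complete() leaves
one of them blocked, which is the situation the patch avoids.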

Cray-bug-id: LUS-2000
Change-Id: I01ef163e72c7691a2c2d5449adf55b55ec734c4d
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-on: https://review.whamcloud.com/35141
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12396 utils: lfs should not output 'nul' char 37/35137/3
Patrick Farrell [Mon, 10 Jun 2019 16:29:06 +0000 (12:29 -0400)]
LU-12396 utils: lfs should not output 'nul' char

If lfs prints a nul char, it breaks parsing of the output.

Fixes: 68635c3d9b31 ("LU-11963 osd: Add nonrotational flag to statfs")
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Ibfa77920adf3a6c62b01efb005d02ca81db7f7c1
Reviewed-on: https://review.whamcloud.com/35137
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-12355 lnet: Adjust checks for ib_device_ops 16/35016/4
Shaun Tancheff [Tue, 11 Jun 2019 12:29:49 +0000 (07:29 -0500)]
LU-12355 lnet: Adjust checks for ib_device_ops

RDMA/core: Introduce ib_device_ops

The ib_device_ops structure defines all the InfiniBand device
operations in one place

Linux-commit: 521ed0d92ab0db3edd17a5f4716b7f698f4fce61

Test-Parameters: trivial
Change-Id: Ia2a617597c75ec819f485b93a1deb368d4b5e873
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35016
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12381 ko2iblnd: ignore down interfaces 98/35098/4
James Simmons [Mon, 10 Jun 2019 13:58:29 +0000 (09:58 -0400)]
LU-12381 ko2iblnd: ignore down interfaces

The for_each_netdev() loop in kiblnd_create_dev() scans all
network devices on a system. Currently the code exits when a
network device is down, but the device could be something besides
an IB device. Instead of exiting, just ignore any device that is
down.
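
The change in loop behaviour can be sketched in Python as below. This is a
simplified model, not the kernel code: NetDev and find_dev are hypothetical
names, and the real loop checks more than the interface name.

```python
from dataclasses import dataclass


@dataclass
class NetDev:
    name: str
    is_up: bool


def find_dev(devices, ifname):
    """Return the first device named ifname that is up, or None."""
    for dev in devices:
        if not dev.is_up:
            continue  # the old behaviour would abort the whole scan here
        if dev.name == ifname:
            return dev
    return None


# A down Ethernet device listed first no longer hides the IB device.
devs = [NetDev("eth0", False), NetDev("ib0", True)]
print(find_dev(devs, "ib0"))
```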

Test-Parameters: trivial

Fixes: c4b39bf56bbc ("LU-11893 o2iblnd: add secondary IP address handling")
Change-Id: I0a3bf808d849cd00711b6ef2e4e5bbd876b64903
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35098
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-1538 tests: standardize test script init - sanity 63/34863/3
Andreas Dilger [Mon, 3 Jun 2019 14:39:19 +0000 (08:39 -0600)]
LU-1538 tests: standardize test script init - sanity

Standardize the initial Lustre test script initialization of the
test-framework.sh for clarity and consistency.

The LUSTRE path is already normalized in init_test_env(), so this
doesn't need to be done in the caller.  Use $(...) subshells instead
of `...` in the affected lines.  Remove PATH, NAME, TMP, LFS, LCTL
variable initialization, since it is already done in init_test_env().

Move MACHINEFILE into init_test_env().

Move get_lustre_env() to the end of init_test_env(). All test scripts
currently call init_test_env() and this move will allow all test
scripts to use the variables defined in get_lustre_env() without
having to modify the individual test scripts.

Move all definitions of ALWAYS_EXCEPT to after init_test_env()
and init_logging() and call build_test_filter() immediately
after these and SLOW definitions.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I1ef6639bcb3eb5179bd44da13b35fd843c267156
Reviewed-on: https://review.whamcloud.com/34863
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
9 months ago LU-11758 osp: remove assertion from statfs 32/33832/6
Sergey Cheremencev [Fri, 6 Jul 2018 19:51:14 +0000 (22:51 +0300)]
LU-11758 osp: remove assertion from statfs

Sequence can't be changed or overflowed
in case of IDIF. Thus don't trigger a kernel
panic for the case below:
last_created [0x100000001:0x15:0x0], next_fid [0x100000000:0xfffffff6:0x0]
The same assertion that excepts IDIFs exists
in osp_fid_diff.
The patch also adds several optimizations
in osp_precreate_send.

Change-Id: I3966dfc621999d065c9b485d387938085fccb140
Cray-bug-id: LUS-2386
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://es-gerrit.dev.cray.com/153571
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-on: https://review.whamcloud.com/33832
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-9846 test: a test number fix 89/35089/5
Vitaly Fertman [Mon, 3 Jun 2019 16:30:06 +0000 (19:30 +0300)]
LU-9846 test: a test number fix

A wrong test number was specified originally

Test-Parameters: trivial
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I8ea31bb2e613c6e225fa7f41f405d5ee2d396a61
Reviewed-on: https://review.whamcloud.com/35089
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10948 mdt: Remove openlock compat code with 2.1 39/35039/4
Oleg Drokin [Mon, 3 Jun 2019 06:39:41 +0000 (02:39 -0400)]
LU-10948 mdt: Remove openlock compat code with 2.1

Checking openlock when doing a create does not allow us to create
a file if we want to also get openlock from it right away.

Since 2.1 is no longer something we care about wrt compatibility,
ok to kill it.

Change-Id: Ic4327be5c45ae856dfbe20291a59c5b1654dbf8f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35039
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-12387 tests: Validate l_tunedisk max_sectors_kb tuning 81/35081/10
Chris Horn [Thu, 6 Jun 2019 15:59:18 +0000 (10:59 -0500)]
LU-12387 tests: Validate l_tunedisk max_sectors_kb tuning

Add test to ensure that l_tunedisk only tunes the max_sectors_kb
value of OST devices, and that it properly tunes any slave devices.

Test-parameters: trivial
Test-parameters: fstype=ldiskfs testlist=conf-sanity \
 envdefinitions=ONLY=125
Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I414526e71fd7ac2811d7c0e8a6afd80a50788258
Reviewed-on: https://review.whamcloud.com/35081
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-12387 utils: Avoid passing symlink to tune_block_dev 65/35065/4
Chris Horn [Wed, 5 Jun 2019 00:14:47 +0000 (19:14 -0500)]
LU-12387 utils: Avoid passing symlink to tune_block_dev

In tune_block_dev_slaves we iterate over the directories inside the
slaves subdirectory for the multipath device that is being tuned. For
example:

 $ /usr/sbin/l_tunedisk /dev/mapper/mpathc

Suppose mpathc maps to /dev/dm-2. tune_block_dev will initially set
the value of
/sys/devices/virtual/block/dm-2/queue/max_sectors_kb
equal to the value of
/sys/devices/virtual/block/dm-2/queue/max_hw_sectors_kb

Then it looks at the entries in /sys/devices/virtual/block/dm-2/slaves
Suppose the slave devices are as follows:

 $ ls /sys/devices/virtual/block/dm-2/slaves
 sdc  sdh  sdm  sdr
 $

It then calls tune_block_dev recursively, passing
/sys/devices/virtual/block/dm-2/slaves/sdc,
/sys/devices/virtual/block/dm-2/slaves/sdh, etc. However, these are
symlinks that point to directories and as such tune_block_dev will not
tune them because stat does not identify them as block devices.

Instead we should construct the path argument for these recursive calls
as /dev/<d_name>. In this example, /dev/sdc, /dev/sdh, etc.
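
The path construction can be sketched in Python as follows. The helper names
are hypothetical and the list of slave names is taken from the example above;
the actual utility is written in C.

```python
import posixpath


def dev_path_for_slave(d_name, dev_root="/dev"):
    """Build the block device path for a slaves/ directory entry name."""
    return posixpath.join(dev_root, d_name)


def slave_dev_paths(slave_names):
    """Map slaves/ entry names to the paths passed to the recursive call."""
    return [dev_path_for_slave(name) for name in slave_names]


print(slave_dev_paths(["sdc", "sdh", "sdm", "sdr"]))
```

Using only the entry name avoids the symlink under slaves/ entirely, so stat
sees the block device node rather than a link to a sysfs directory.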

Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I63bc073a82384d68648ff23a56b7d43d6656159b
Reviewed-on: https://review.whamcloud.com/35065
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
9 months ago LU-12387 utils: Read existing ldd data in l_tunedisk 66/35066/5
Chris Horn [Tue, 4 Jun 2019 19:34:01 +0000 (14:34 -0500)]
LU-12387 utils: Read existing ldd data in l_tunedisk

Read the lustre_disk_data from the device passed to l_tunedisk, so
we can determine whether the device is an MGT or MDT and thus skip
the tuning of the device.

Fixes: 892280742a2b ("LU-9551 utils: add l_tunedisk to fix disk tunings")
Fixes: 2f8d7b4679de ("LU-11736 utils: don't set max_sectors_kb on MDT/MGT")
Cray-bug-id: LUS-7358
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I193fe008d5777b0e83f2be9a500eaffb1d3ca615
Reviewed-on: https://review.whamcloud.com/35066
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
9 months ago LU-12375 scripts: Start lnet after opa 32/35032/2
Nathaniel Clark [Sun, 2 Jun 2019 13:50:53 +0000 (09:50 -0400)]
LU-12375 scripts: Start lnet after opa

Ensure ordering of lnet after opa for startup and lnet before opa on
shutdown.

Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Change-Id: I4c2cad2381349f866bdc08e2a81e3d8990d8752e
Reviewed-on: https://review.whamcloud.com/35032
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Artur Novik <anovik@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-6142 ptlrpc: Fix style issues for pinger.c 01/34701/5
Arshad Hussain [Thu, 18 Apr 2019 13:42:17 +0000 (19:12 +0530)]
LU-6142 ptlrpc: Fix style issues for pinger.c

This patch fixes issues reported by checkpatch
for file lustre/ptlrpc/pinger.c

Change-Id: I048a7ab7d31bc468a410ec1704c5d79a34feebb4
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/34701
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-11623 tests: Fix sanity 27E to ensure getattr RPC 67/35067/2
Oleg Drokin [Wed, 5 Jun 2019 06:28:24 +0000 (02:28 -0400)]
LU-11623 tests: Fix sanity 27E to ensure getattr RPC

While cat does perform fstat() on the file it opens,
I guess it's not guaranteed.
More importantly, we really need to ensure the locks
that the file has after creation are dropped before
we issue our stat() to ensure the RPC is actually made,
since it's this GETATTR RPC that is ensuring easize
update from MDT response.

Change-Id: Ic86229ac514e1385c665c6c0d9f6eef13d9748f5
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/35067
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-12438 llite: vfs_read/write removed, use kernel_read/write 23/35223/5
Shaun Tancheff [Thu, 13 Jun 2019 18:48:10 +0000 (13:48 -0500)]
LU-12438 llite: vfs_read/write removed, use kernel_read/write

As of Linux 4.14 the vfs_read() is no longer available
to kernel modules. The kernel_read() function calls
vfs_read() and will continue to be available.

Add a configure test for kernel_read(), as its
function signature changed in 4.14 to match the other file I/O
helpers.

Also remove vfs_write() in favor of kernel_write() wrapper
cfs_kernel_write().

Fixes: f172b1168857 ("LU-10092 llite: Add persistent cache on client")
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I5e5fce0e6644ba750169f3bf11ac5c98525da0a7
Reviewed-on: https://review.whamcloud.com/35223
Tested-by: Jenkins
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
9 months ago LU-10092 First phase of persistent client cache project merging 14/35214/1
Oleg Drokin [Thu, 13 Jun 2019 04:36:36 +0000 (00:36 -0400)]
LU-10092 First phase of persistent client cache project merging

Merge remote-tracking branch 'origin/pcc'

Change-Id: I87a681c54712926d336c983dd8e56b58ebf4b612
Signed-off-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: change detach behavior and add keep option 44/33844/19
Qian Yingjin [Thu, 13 Dec 2018 02:41:14 +0000 (10:41 +0800)]
LU-10092 pcc: change detach behavior and add keep option

After introducing the auto-attach at open feature, when a PCC
cached file is detached by the "pcc detach" command, it will be attached
automatically at the next open. This may not be what the user wants.

To solve this problem, we change the default detach behavior and
add an option "--keep|-k" for the detach of RW-PCC.
The manual "lfs pcc detach" command will detach the file from PCC
permanently. It will also remove the PCC copy by default.
When the file is detached with the "keep" option, it only unmaps the
relationship between the file inode and the PCC copy, but keeps the
PCC copy. The file is allowed to be attached automatically at
the next open when the file is still valid in cache.

Note that currently an auto detach caused by inode reclaim or
revocation of the layout lock does not delete the PCC copy either.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I010df54177ae4cfeddcc0a9982c1aee58ee683de
Reviewed-on: https://review.whamcloud.com/33844
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: auto attach during open for valid cache 87/33787/21
Qian Yingjin [Wed, 5 Dec 2018 03:22:11 +0000 (11:22 +0800)]
LU-10092 pcc: auto attach during open for valid cache

In current PCC implementation, all PCC state information is
stored in the in-memory data structure named pcc_inode (a member
of data structure ll_inode_info). Once the file inode is reclaimed
due to the memory pressure or memory shrinking, the corresponding
in-memory pcc_inode will be released too, and the PCC-cached file
will be detached automatically. And the revocation of layout lock
will also trigger the detach of the PCC-cached file. These all lead
that the still valid PCC-cached file can not be used.

To solve this problem, we introduce an auto-attaching mechanism
during open. During PCC attach, the L.Gen will be stored as
extented attribute of the local copy file on PCC device. When the
in-memory inode is reclaimed or the layout lock is revoked, and
the file is opend again, it can check whether the stored L.Gen on
the PCC copy is same as the Lustre file current L.Gen on MDT. If
they are consistent, it means the cached copy on PCC device is still
valid, we can continue to use it after auto-attach.
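
The validity check described above amounts to comparing two layout
generations. A minimal Python sketch, assuming the stored value is None when
the PCC copy carries no L.Gen attribute (the function name and that
convention are hypothetical):

```python
def should_auto_attach(stored_lgen, current_lgen):
    """Decide whether a PCC copy may be re-attached at open.

    stored_lgen:  L.Gen saved with the PCC copy at attach time, or None
                  if no value was stored (assumed convention).
    current_lgen: the Lustre file's current L.Gen as reported by the MDT.
    """
    if stored_lgen is None:
        return False
    return stored_lgen == current_lgen


print(should_auto_attach(7, 7))  # layout unchanged since attach
print(should_auto_attach(7, 8))  # layout changed, cached copy is stale
```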

Test-Parameters: testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I63be96f8d83816529983d0f97af0aaca81703fed
Reviewed-on: https://review.whamcloud.com/33787
Tested-by: Jenkins
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10918 llite: Rule based auto PCC caching when create files 51/34751/15
Qian Yingjin [Wed, 24 Apr 2019 09:51:25 +0000 (17:51 +0800)]
LU-10918 llite: Rule based auto PCC caching when create files

Configurable rule based auto PCC caching for newly created files
can significantly benefit users of readwrite PCC. It can
determine which files can use a cache on PCC directly without any
admission control for high priority user/group/project or filename
with wildcard support. Meanwhile, we can enforce a quota limit
on capacity usage for each user/group/project to provide caching
isolation.

Similar to NRS TBF command line, it supports logical conditional
conjunction and disjunction operations among different user/group/
project or filename with the wildcard support.

The command line to add this kind of rule is as follows:
lctl pcc add /mnt/lustre /mnt/pcc
"projid={500 1000}&fname={*.h5},uid={1001} rwid=1 roid=1"
It means that files with a Project ID of 500 or 1000 AND a file name
suffix of "h5", OR with a User ID of 1001, can be auto cached on PCC
when newly created on the client. "rwid" means RW-PCC attach ID (which
is usually the archive ID); "roid" means RO-PCC attach ID. By default,
the RO-PCC attach ID is set to the same value as the RW-PCC attach ID
for a shared PCC backend.
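
The rule semantics can be illustrated with a small Python sketch. It covers
only the matching expression (not the rwid/roid part), assumes ',' separates
OR branches, '&' joins AND terms, and values inside braces are alternatives.
The functions parse_rule and matches are hypothetical; the real parser lives
in the client code and may treat edge cases differently.

```python
import re
from fnmatch import fnmatchcase

_TERM = re.compile(r"\s*(\w+)=\{([^}]*)\}\s*")


def parse_rule(expr):
    """Parse an expression into a list of AND-groups (dict: field -> values)."""
    rule = []
    for conj in expr.split(","):  # ',' separates OR branches
        terms = {}
        for term in conj.split("&"):  # '&' joins AND terms
            m = _TERM.fullmatch(term)
            if m is None:
                raise ValueError(f"bad term: {term!r}")
            terms[m.group(1)] = m.group(2).split()
        rule.append(terms)
    return rule


def matches(rule, attrs):
    """Return True if the file attributes satisfy any AND-group of the rule."""

    def term_ok(field, values):
        actual = attrs.get(field)
        if actual is None:
            return False
        if field == "fname":
            return any(fnmatchcase(actual, pat) for pat in values)
        return str(actual) in values

    return any(all(term_ok(f, v) for f, v in conj.items()) for conj in rule)


rule = parse_rule("projid={500 1000}&fname={*.h5},uid={1001}")
print(matches(rule, {"projid": 500, "fname": "data.h5", "uid": 0}))
print(matches(rule, {"projid": 500, "fname": "data.txt", "uid": 0}))
print(matches(rule, {"projid": 7, "fname": "data.txt", "uid": 1001}))
```

Splitting on ',' before parsing keeps the sketch short, at the cost of not
supporting commas inside a brace list.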

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I628975b3e097e98d6b93f1c6acd855aaacdaa8b3
Reviewed-on: https://review.whamcloud.com/34751
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: security and permission for non-root user access 37/34637/20
Qian Yingjin [Thu, 11 Apr 2019 02:41:38 +0000 (10:41 +0800)]
LU-10092 pcc: security and permission for non-root user access

For current PCC, if a file is left on the PCC cache, it may be
accessible to other jobs/users who would not normally be able to
access it. (That is, they access it directly on the PCC mount via
FID as the local PCC mount is basically just a normal local file
system.)

This patch solves this by restricting access on the PCC side and
just depending on the Lustre side permissions for opening a file.
So PCC files on the local mount fs are created with some minimal
(zero) set of permissions. Then, when accessing a PCC cached
file, we do the permission check on the Lustre file, then do not
do it on the PCC file. This should render the PCC files
inaccessible except to root or via Lustre.

Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I059fa3e479fe97ef6b65db1cbeb8b7f3ea611880
Reviewed-on: https://review.whamcloud.com/34637
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 pcc: Non-blocking PCC caching 66/32966/37
Qian Yingjin [Thu, 5 Jul 2018 06:43:46 +0000 (14:43 +0800)]
LU-10092 pcc: Non-blocking PCC caching

Current PCC uses refcount of PCC inode to determine whether a
previous PCC-attached file can be detached. If a file is open
(refcount > 1), the detaching will return -EBUSY.

When another client accesses the PCC-cached file, it will trigger
the restore process as the file is HSM released. During restore,
the Agent needs to detach the PCC-cached file.
Thus, if a PCC-attached file is kept open
for a long time, the restore request will always fail.

In this patch, we implement a non-blocking PCC caching mechanism
for Lustre. After attaching the file into PCC, the client acquires
the layout lock for the file, and the layout generation is
maintained in the PCC inode. Under the layout lock protection, the
PCC caching state is valid and all I/O will direct into PCC. When
the layout lock is revoked, in the blocking AST it will invalidate
the PCC caching state and detach the file automatically.

This patch also helps handle the ENOSPC error for PCC
writes by falling back to the normal I/O path, which will restore
the file data into OSTs (the file is in HSM released state) and
redo the write.

Change-Id: I9130c04dc0e6eae879ea2ff3fdda65726e74d177
Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/32966
Tested-by: Jenkins
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-10092 llite: Add persistent cache on client 63/32963/38
Li Xi [Tue, 27 Jun 2017 12:18:14 +0000 (20:18 +0800)]
LU-10092 llite: Add persistent cache on client

PCC is a new framework which provides a group of local caches
on the Lustre client side. No global namespace will be provided
by PCC. Each client uses its own local storage as a cache for
itself. Local file system is used to manage the data on local
caches. Cached I/O is directed to local filesystem while
normal I/O is directed to OSTs.

PCC uses HSM for data synchronization. It uses HSM copytool
to restore file from local caches to Lustre OSTs. Each PCC
has a copytool instance running with unique archive number.
Any remote access from another Lustre client would trigger
the data synchronization. If a client with PCC goes offline,
the cached data becomes temporarily inaccessible to other
clients. After the PCC client reboots and the copytool
restarts, the data will be accessible again.

ToDo:
1) Make PCC exclusive with HSM.
2) Strong size consistency for PCC cached file among clients.
3) Support to cache partial content of a file.

Change-Id: I188ed36c48aae223380739f607cc6caf2f789298
Test-Parameters: clientcount=3 testlist=sanity-pcc,sanity-pcc,sanity-pcc
Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/32963
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11838 osd-ldiskfs: inode times switched to timespec64 75/34675/4
Li Dongyang [Tue, 16 Apr 2019 01:14:13 +0000 (11:14 +1000)]
LU-11838 osd-ldiskfs: inode times switched to timespec64

Since kernel 4.18 inode times switched from struct timespec
to timespec64 to make them y2038 safe.

Linux-commit: 95582b00838837fc07e042979320caf917ce3fe6

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Iaddb2f2be27ec348fb97e13371aa3d7e6f6e5c9f
Reviewed-on: https://review.whamcloud.com/34675
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
9 months ago LU-11838 ldiskfs: add rhel8 server support 74/34674/6
Li Dongyang [Mon, 15 Apr 2019 08:05:58 +0000 (18:05 +1000)]
LU-11838 ldiskfs: add rhel8 server support

This patch adds ldiskfs patch series for rhel8
kernel 4.18.0-32.el8.

Fix lustre-build-ldiskfs.m4, make
CONFIG_LDISKFS_FS_ENCRYPTION consistent with
kernel's config CONFIG_EXT4_FS_ENCRYPTION.
Otherwise ldiskfs won't build
on kernels with CONFIG_EXT4_FS_ENCRYPTION
disabled.

Note: this contains a small clean up for
ubuntu18/ext4-kill-dx-root.patch

Test-Parameters:trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ib500bff2f6688405b912620c5217586c8420c6e1
Reviewed-on: https://review.whamcloud.com/34674
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
9 months ago LU-11213 lmv: reuse object alloc QoS code from LOD 57/34657/15
Lai Siyao [Fri, 22 Mar 2019 00:22:37 +0000 (08:22 +0800)]
LU-11213 lmv: reuse object alloc QoS code from LOD

Reuse the same object alloc QoS code as LOD, but the QoS code is
not moved to lower layer module, instead it's copied to LMV, because
it involves almost all LMV code, which is too big a change and should
be done separately in the future.

And for LMV round-robin object allocation, because we only need to
allocate one object, use the MDT index saved and update it to next
MDT.

Add sanity 413b.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I53c3d863dafda534eebb6b95da205b395071cd25
Reviewed-on: https://review.whamcloud.com/34657
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
9 months ago LU-9846 utils: hash may be overridden in 'lfs setdirstripe' 95/35095/3
Lai Siyao [Fri, 7 Jun 2019 06:20:13 +0000 (14:20 +0800)]
LU-9846 utils: hash may be overridden in 'lfs setdirstripe'

lfs_setdirstripe() may override 'hash' if '-H hash' is specified
before '-i'. Since LMV doesn't support OVERSTRIPING, this support
can be ignored.

Fixes: 591a9b4cebc5 ("LU-9846 lod: Add overstriping support")

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: If6db03d2d4f6d208da19ae064fde1d851f01beb4
Reviewed-on: https://review.whamcloud.com/35095
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
10 months ago LU-11297 lnet: MR Routing Feature 83/34983/3
Amir Shehata [Fri, 7 Jun 2019 18:35:09 +0000 (14:35 -0400)]
LU-11297 lnet: MR Routing Feature

This is a merge commit from the multi-rail branch. It brings in
the MR Routing feature. This feature aligns the LNET Multi-Rail
behavior with routing. A gateway now is viewed as a Multi-Rail
capable node. When a route is added only one entry per gateway
should be used. That route entry should use the primary-nid of
the gateway. The multi-rail selection algorithm is then run when
sending to the gateway to select the best interface to send to.

Furthermore the gateway aliveness is now kept via the health
mechanism. And the gateway pinger now uses discovery instead
of maintaining its own pinger handler.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie2d8c6449f84860511b322ff2db3ed656a163e74

10 months ago LU-12200 lnet: check peer timeout on a router 72/34772/15
Amir Shehata [Fri, 19 Apr 2019 00:19:22 +0000 (17:19 -0700)]
LU-12200 lnet: check peer timeout on a router

On a router assume that a peer is alive and attempt to send it
messages as long as the peer_timeout hasn't expired.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0806a52c8ad7acc1c93dcf32353f1c4467c618b1
Reviewed-on: https://review.whamcloud.com/34772
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months ago LU-12053 lnet: look up MR peers routes 25/34625/17
Amir Shehata [Mon, 8 Apr 2019 22:28:23 +0000 (15:28 -0700)]
LU-12053 lnet: look up MR peers routes

An MR peer can have multiple interfaces some of which we might
have a route to. The primary NID of the peer might not necessarily
specify a NID we have a route to. When looking up a route, we must
iterate over all the nets the peer is on and select the one which
we can route to. Taking into consideration the peer can exist on
multiple routed networks we also have a simple round robin algorithm
to iterate over all the networks we can reach the peer on.
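
The selection described above can be sketched in Python as a filter followed
by a round-robin pick. The function name, the network names and the way the
last index is carried between calls are all hypothetical; the kernel code
keeps its own per-peer state.

```python
def pick_routed_net(peer_nets, routable_nets, last_index):
    """Pick the next peer network we have a route to, in round-robin order.

    Returns (net, new_index), or (None, last_index) if no net is routable.
    """
    candidates = [net for net in peer_nets if net in routable_nets]
    if not candidates:
        return None, last_index
    index = (last_index + 1) % len(candidates)
    return candidates[index], index


peer_nets = ["tcp1", "o2ib2", "tcp3"]  # nets the MR peer is on
routable = {"tcp1", "tcp3"}            # nets we have a route to
index = -1
for _ in range(3):
    net, index = pick_routed_net(peer_nets, routable, index)
    print(net)
```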

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I0651dd4f732c8b71872f73cf2512b08f34129bd9
Reviewed-on: https://review.whamcloud.com/34625
Tested-by: Jenkins
10 months ago LU-11299 lnet: discover each gateway Net 11/34511/22
Amir Shehata [Tue, 26 Mar 2019 21:16:32 +0000 (14:16 -0700)]
LU-11299 lnet: discover each gateway Net

Wake up every (gateway aliveness interval / number of local networks).
Discover each local gateway network in round robin.

This is done to make sure the gateway keeps its networks up.
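
The resulting schedule can be worked through with a short Python sketch. The
function name, the 60 second interval and the network names are assumptions
chosen for illustration only.

```python
def discovery_schedule(alive_interval, local_nets, ticks):
    """Return (wake-up time, network) pairs for the first `ticks` wake-ups.

    The wake-up period is the aliveness interval divided by the number of
    local networks, and networks are visited in round-robin order.
    """
    period = alive_interval / len(local_nets)
    return [(i * period, local_nets[i % len(local_nets)]) for i in range(ticks)]


# With two local networks, each one is discovered once per full interval.
for when, net in discovery_schedule(60, ["tcp", "o2ib"], 4):
    print(when, net)
```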

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehat <ashehata@whamcloud.com>
Change-Id: I4035e39c286cb599d4eb8f9df7ed5d278e6d744a
Reviewed-on: https://review.whamcloud.com/34511
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
10 months ago LU-11299 lnet: net aliveness 10/34510/22
Amir Shehata [Sat, 23 Mar 2019 01:01:51 +0000 (18:01 -0700)]
LU-11299 lnet: net aliveness

If a router is discovered on any interface on the network, then
update the network last alive time and the NI's status to UP.
If a router isn't discovered on any interface on a network,
then change the status of all the interfaces on that network to down.
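
A minimal Python sketch of this rule is shown below. The dictionary layout
and the function name are hypothetical, and it assumes that only the
interfaces on which a router was discovered are marked up.

```python
def update_net_aliveness(net, discovered_on, now):
    """Apply the aliveness rule to one network.

    net:           {"last_alive": float, "ni_status": {ni_name: "up"|"down"}}
    discovered_on: names of the NIs on which a router was discovered
    now:           current time in seconds
    """
    if discovered_on:
        net["last_alive"] = now
        for ni in discovered_on:
            net["ni_status"][ni] = "up"
    else:
        for ni in net["ni_status"]:
            net["ni_status"][ni] = "down"
    return net


net = {"last_alive": 0.0, "ni_status": {"ni0": "down", "ni1": "up"}}
print(update_net_aliveness(net, ["ni0"], 10.0))
print(update_net_aliveness(net, [], 20.0))
```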

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I1d67eb4b3284ccb8306ad4c877a2fcbdf4958d8c
Reviewed-on: https://review.whamcloud.com/34510
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months ago LU-11664 lnet: push router interface updates 51/33651/30
Amir Shehata [Wed, 14 Nov 2018 02:14:36 +0000 (18:14 -0800)]
LU-11664 lnet: push router interface updates

A router can bring up/down its interfaces if it hasn't received any
messages on that interface for a configurable period
(alive_router_ping_timeout). When this event occurs the router can now
push its status change to the peers it's talking to in order to inform
them of the change in its status. This will allow the router users to
handle asymmetric router failures more quickly.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9530ed7d9bc0a86edc43e3f610cc943f1732dcfd
Reviewed-on: https://review.whamcloud.com/33651
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11297 lnet: set gw sensitivity from lnetctl 35/33635/31
Amir Shehata [Fri, 9 Nov 2018 19:24:20 +0000 (11:24 -0800)]
LU-11297 lnet: set gw sensitivity from lnetctl

Allow an optional parameter on the
lnetctl route add
command to set the health sensitivity of the gateway:
lnetctl route add --net <net> --gateway <gw> --sensitivity <value>

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iee120c78a41b79da6ab6bdf1560f558df89233e2
Reviewed-on: https://review.whamcloud.com/33635
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11297 lnet: handle router health off 34/33634/31
Amir Shehata [Fri, 9 Nov 2018 18:31:27 +0000 (10:31 -0800)]
LU-11297 lnet: handle router health off

Routing infrastructure depends on health infrastructure to manage
route status. However, health can be turned off. Therefore, we need
to enable health for gateways in order to monitor them properly.
Each peer now has its own health sensitivity. When adding a route
the gateway's health sensitivity can be explicitly set from lnetctl
or if not specified then it'll default to 1, thereby turning health
on for that gateway, allowing peer NI recovery if there is a failure.
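
A minimal sketch of the defaulting rule described above, assuming a
route is just a record kept in a list. The add_route() helper and the
NID used in the usage note are hypothetical; only the default value
of 1 comes from the text.

```python
DEFAULT_GW_SENSITIVITY = 1  # default stated in the commit message


def add_route(routes, net, gateway, sensitivity=None):
    """Record a route; an unspecified sensitivity falls back to 1."""
    if sensitivity is None:
        sensitivity = DEFAULT_GW_SENSITIVITY
    routes.append({"net": net, "gateway": gateway,
                   "sensitivity": sensitivity})
    return routes[-1]
```

For example, add_route([], "o2ib1", "10.0.0.1@tcp") yields a route
whose gateway sensitivity is 1, so health stays on for that gateway
without any explicit setting.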

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ibae33d595e97d0eec432ae8f5d51898ce0776f01
Reviewed-on: https://review.whamcloud.com/33634
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11641 lnet: handle discovery off 20/33620/32
Amir Shehata [Thu, 8 Nov 2018 00:51:44 +0000 (16:51 -0800)]
LU-11641 lnet: handle discovery off

When discovery is turned off locally or when the peer either has
discovery off or doesn't support MR at all then degrade discovery
behavior to a standard ping. This will allow routers to continue
using the discovery mechanism even if it's turned off.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I7f0829d37cbff2bf9e41de251efa715fc4c97e5d
Reviewed-on: https://review.whamcloud.com/33620
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11470 lnet: drop all rule 05/33305/36
Amir Shehata [Thu, 4 Oct 2018 00:36:45 +0000 (17:36 -0700)]
LU-11470 lnet: drop all rule

Add a rule to drop all messages arriving on a specific interface.
This is useful for simulating failures on a specific router interface.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ic69f683fb2caf7a69a1d85428878c89b7b1ee3ad
Reviewed-on: https://review.whamcloud.com/33305
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11478 lnet: misleading discovery seqno. 04/33304/34
Amir Shehata [Fri, 5 Oct 2018 00:18:20 +0000 (17:18 -0700)]
LU-11478 lnet: misleading discovery seqno.

There is a sequence number used when sending discovery messages. This
sequence number is intended to detect stale messages. However it
could be misleading if the peer reboots. In this case the peer's
sequence number will reset. The node will think that all information
being sent to it is stale, while in reality the peer might've
changed configuration.

There is no reliable way to know whether a peer rebooted, so we'll
always assume that the messages we're receiving are valid. So we'll
operate on a first-come, first-served basis.
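
The reboot problem can be illustrated with a small sketch. The form
of the old staleness check is an assumption made for illustration,
and both function names are hypothetical.

```python
def accept_by_seqno(last_seqno, incoming_seqno):
    """Assumed form of a staleness check: drop anything not newer."""
    return incoming_seqno > last_seqno


def accept_first_come(last_seqno, incoming_seqno):
    """Behaviour described above: treat every message as valid."""
    return True


# A peer that had reached seqno 57 reboots and restarts from 1.
last_seen, after_reboot = 57, 1
stale_under_old_check = not accept_by_seqno(last_seen, after_reboot)
accepted_now = accept_first_come(last_seen, after_reboot)
```

Under the assumed old check the rebooted peer's update is discarded
even though it may carry a new configuration; the first-come rule
accepts it.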

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I421a00e47bc93ee60fa37c648d6d9a726d9def9c
Reviewed-on: https://review.whamcloud.com/33304
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11477 lnet: handle health for incoming messages 01/33301/34
Amir Shehata [Thu, 4 Oct 2018 23:21:48 +0000 (16:21 -0700)]
LU-11477 lnet: handle health for incoming messages

In case of routers (as well as for the general case) it's important to
update the health of the ni/lpni for incoming messages. For an lpni
specifically, receiving a message is how we know that the lpni
is up.

A percentage router health is required in order to send a message to a
gateway. That defaults to 100, meaning that a router interface has to
be absolutely healthy in order to send to it. This matches the current
behavior. So if a router interface goes down and its health goes down
significantly, but then it comes back up again; either we receive a
message from it or we discover it and get a reply, then in order to
start using that router interface again we have to boost its health
all the way up to maximum.

This behavior is special cased for routers.
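
A minimal sketch of the special case, assuming health is an integer
between 0 and a maximum. The value 1000 for LNET_MAX_HEALTH_VALUE and
both function names are assumptions made for illustration; recovery
of non-gateway peers is deliberately not modeled.

```python
LNET_MAX_HEALTH_VALUE = 1000  # assumed value, for illustration only


def gateway_usable(health, required_pct=100):
    """A gateway interface may be sent to only at or above the
    required percentage of maximum health."""
    return health * 100 >= LNET_MAX_HEALTH_VALUE * required_pct


def on_message_received(health, is_gateway):
    """Receiving a message shows the lpni is up; for a gateway, boost
    health straight to maximum. Other peers are left unchanged here."""
    return LNET_MAX_HEALTH_VALUE if is_gateway else health
```

With the default of 100, a gateway interface at health 999 is not
usable, which is why a single received message has to restore it to
the maximum rather than increment it gradually.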

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ida6c23f95dbef56c2e6ed7b6d03743939d8b30a0
Reviewed-on: https://review.whamcloud.com/33301
Tested-by: Jenkins
10 months agoLU-11475 lnet: transfer routers 39/34539/20
Amir Shehata [Thu, 28 Mar 2019 02:32:45 +0000 (19:32 -0700)]
LU-11475 lnet: transfer routers

When a primary NID of a peer is about to be deleted because
it's being transferred to another peer, if that peer is a gateway
then transfer all gateway properties to the new peer.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ib475c389ca5630906416a5112b3088f6f5d03950
Reviewed-on: https://review.whamcloud.com/34539
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11475 lnet: allow deleting router primary_nid 00/33300/34
Amir Shehata [Thu, 4 Oct 2018 22:31:04 +0000 (15:31 -0700)]
LU-11475 lnet: allow deleting router primary_nid

Discovery doesn't allow deleting a primary_nid of a peer. This
is necessary because upper layers only know to reach the peer by
using the primary_nid. For routers this is not the case. So
if a router changes its interfaces and comes back up again, the
peer_ni should be adjusted.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I9da056172f35a5f15eed5ba0e02fcb37ac414c54
Reviewed-on: https://review.whamcloud.com/33300
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: consider alive_router_check_interval 98/33298/34
Amir Shehata [Fri, 5 Oct 2018 01:28:49 +0000 (18:28 -0700)]
LU-11300 lnet: consider alive_router_check_interval

Consider router_check_interval when waking up the monitor thread,
to make sure the monitor thread wakes up at the earliest possible
time.
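
One way to picture this: the next wakeup is the minimum over all
pending deadlines, with the router check now included as one of
them. The function and parameter names below are hypothetical.

```python
def next_wakeup(now, pending_deadlines, router_check_interval):
    """Earliest time the monitor thread should run next."""
    candidates = list(pending_deadlines)
    candidates.append(now + router_check_interval)
    return min(candidates)
```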

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ibc4b53886b59a9bc174a29d0da711ac77db3a62c
Reviewed-on: https://review.whamcloud.com/33298
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11378 lnet: MR aware gateway selection 88/33188/36
Amir Shehata [Fri, 14 Sep 2018 18:04:44 +0000 (11:04 -0700)]
LU-11378 lnet: MR aware gateway selection

When selecting a route use the Multi-Rail Selection algorithm to
select the best available peer_ni of the best route. The selected
peer_ni can then be used to send the message or to discover it
if the gateway peer needs discovering.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I376af57611591eed2eb1edb80a1b3a68b5aefd19
Reviewed-on: https://review.whamcloud.com/33188
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: use discovery for routing 54/33454/31
Amir Shehata [Mon, 22 Oct 2018 23:03:06 +0000 (16:03 -0700)]
LU-11299 lnet: use discovery for routing

Instead of re-inventing the wheel, routing now uses discovery.
Every router interval the router is discovered. This will
update the router information locally and will serve to let the
router know that the peer is alive.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I211bf15af0b0a5d50f9e2a69a385419a1dd5096b
Reviewed-on: https://review.whamcloud.com/33454
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: modify lnd notification mechanism 53/33453/30
Amir Shehata [Mon, 22 Oct 2018 22:44:50 +0000 (15:44 -0700)]
LU-11299 lnet: modify lnd notification mechanism

LND notifies when a peer is up or down. If the LND notifies
LNet that the peer is up and sets the "reset" flag to true
then this indicates to LNet that the LND knows about the health
of the peer and is telling LNet that the peer is fully healthy.
LNet will set the health value of the peer to maximum, otherwise
it will increment the health by one.

If the LND notifies the LNet that the peer is down, LNet will
decrement the health of the peer by the configured sensitivity value.

LNet then turns around and rechecks the peer aliveness and if it's
dead it'll notify the LND. This code is only used by the socklnd
because it needs to tear down connections. This is in keeping with
the original functionality.
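
The health update rules above can be sketched as a pure function.
The maximum of 1000 and the clamping to the range 0..maximum are
assumptions, and notify() is a hypothetical name, not the LNet entry
point.

```python
LNET_MAX_HEALTH_VALUE = 1000  # assumed value, for illustration only


def notify(health, alive, reset, sensitivity):
    """Return the peer's new health after an LND notification."""
    if alive:
        if reset:
            # The LND vouches for the peer: jump to full health.
            return LNET_MAX_HEALTH_VALUE
        # Otherwise recover by one step (clamping is assumed).
        return min(health + 1, LNET_MAX_HEALTH_VALUE)
    # Peer reported down: decrement by the configured sensitivity.
    return max(health - sensitivity, 0)
```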

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ifa614405fb0c2cd4f6bcb1a2a97e856320eb6cbe
Reviewed-on: https://review.whamcloud.com/33453
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: Cleanup rcd 87/33187/35
Amir Shehata [Mon, 22 Oct 2018 22:09:11 +0000 (15:09 -0700)]
LU-11299 lnet: Cleanup rcd

Cleanup all code pertaining to rcd, as routing code will use
discovery going forward and there will be no need to keep its own
pinging code.

test_215 looks at the routers file which had its format changed.
Update the test to reflect the change.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If31caa3b5703df40b6ae0f758f2fe764991aa4f3
Reviewed-on: https://review.whamcloud.com/33187
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: simplify lnet_handle_local_failure() 52/33452/30
Amir Shehata [Mon, 22 Oct 2018 20:39:36 +0000 (13:39 -0700)]
LU-11300 lnet: simplify lnet_handle_local_failure()

Pass the struct lnet_ni to lnet_handle_local_failure() instead of the
message structure, since nothing else from the message is being
used. This also makes it symmetrical with lnet_handle_remote_failure()

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I10146ec5bf5f378e28a7725382f00132ada32c6e
Reviewed-on: https://review.whamcloud.com/33452
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: router aliveness 85/33185/34
Amir Shehata [Thu, 6 Sep 2018 00:03:45 +0000 (17:03 -0700)]
LU-11300 lnet: router aliveness

A route is considered alive if the gateway is able to route
messages from the local to the remote net. That means that
at least one of the network interfaces on the remote net of
the gateway is viable.

Introduced the concept of sensitivity percentage. This defaults
to 100%. It holds a dual meaning:
1. A route is considered alive if at least one of its interfaces'
health is >= LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage
100 means at least one interface has to be 100% healthy
2. On a router consider a peer_ni dead if its health is not at least
LNET_MAX_HEALTH_VALUE * router_sensitivity_percentage.
100% means the interface has to be 100% healthy.

Re-implemented lnet_notify() to decrement the health of the
peer interface if the LND reports a failure on that peer.
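
The two meanings of the sensitivity percentage can be sketched as
below. The sketch assumes the percentage is divided by 100 before it
scales the maximum, and that LNET_MAX_HEALTH_VALUE is 1000; the
function names are hypothetical.

```python
LNET_MAX_HEALTH_VALUE = 1000  # assumed value, for illustration only


def meets_threshold(health, pct):
    """True if health >= LNET_MAX_HEALTH_VALUE * pct / 100."""
    return health * 100 >= LNET_MAX_HEALTH_VALUE * pct


def route_is_alive(gateway_ni_healths, pct=100):
    """Meaning 1: alive if at least one gateway interface on the
    remote net meets the threshold."""
    return any(meets_threshold(h, pct) for h in gateway_ni_healths)


def peer_ni_is_dead(health, pct=100):
    """Meaning 2: on a router, a peer_ni below the threshold is dead."""
    return not meets_threshold(health, pct)
```

With the default of 100, route_is_alive([400, 999]) is false because
no interface is fully healthy; lowering the percentage to 90 makes
the same route count as alive.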

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie97561fb70bf6a558bc90fa9266a6ba38fa3d293
Reviewed-on: https://review.whamcloud.com/33185
Tested-by: Jenkins
10 months agoLU-11300 lnet: peer aliveness 86/33186/34
Amir Shehata [Thu, 6 Sep 2018 01:19:35 +0000 (18:19 -0700)]
LU-11300 lnet: peer aliveness

Peer NI aliveness is now solely dependent on the health
infrastructure. With the addition of router_sensitivity_percentage,
peer NI is considered dead if its health drops below the percentage
specified of the total health. Setting the percentage to 100% means
that a peer_ni is considered dead if its interface is less than
fully healthy.

Removed obsolete code that queries the peer NI every second since
the health infrastructure introduces the recovery mechanism which
is designed to recover the health of peer NIs.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I506060fbb66c74295808891b689d7d634dc69284
Reviewed-on: https://review.whamcloud.com/33186
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: Cache the routing feature 51/33451/30
Amir Shehata [Sat, 20 Oct 2018 01:24:39 +0000 (18:24 -0700)]
LU-11300 lnet: Cache the routing feature

When processing a REPLY or a PUSH for a discovery, cache
whether the routing feature is enabled or disabled as
reported by the peer.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I69bd41fade196773af0e1004c2e7fff2fb91392d
Reviewed-on: https://review.whamcloud.com/33451
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: cache ni status 50/33450/30
Amir Shehata [Sat, 20 Oct 2018 01:02:05 +0000 (18:02 -0700)]
LU-11300 lnet: cache ni status

When processing the data in the PUSH or the REPLY make sure to cache
the ns_status. This is the status of the peer_ni as reported by the
peer itself.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I14de2460f578fb7f47d329a97b8833f49c569b74
Reviewed-on: https://review.whamcloud.com/33450
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: configure lnet router sensitivity 55/33455/29
Amir Shehata [Tue, 23 Oct 2018 04:25:33 +0000 (21:25 -0700)]
LU-11300 lnet: configure lnet router sensitivity

Allow the configuration of router_sensitivity_percentage from the
user space utility lnetctl

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: If5440f30881361ebb06dafa9cadb7cbc2b934f93
Reviewed-on: https://review.whamcloud.com/33455
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11300 lnet: router sensitivity 49/33449/30
Amir Shehata [Sat, 20 Oct 2018 00:09:24 +0000 (17:09 -0700)]
LU-11300 lnet: router sensitivity

Introduce the router_sensitivity_percentage module parameter to
control the sensitivity of routers to failures. It defaults to 100%
which means a router interface needs to be fully healthy in order
to be used.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I3e9333033f049918c1cdca58a72604c71884acbe
Reviewed-on: https://review.whamcloud.com/33449
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
10 months agoLU-11551 lnet: Do not allow deleting of router nis 48/33448/27
Amir Shehata [Fri, 19 Oct 2018 23:40:52 +0000 (16:40 -0700)]
LU-11551 lnet: Do not allow deleting of router nis

Check the peer before deleting a peer_ni. If it's a router then do
not allow deletion of the peer-ni.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I372052b4e9b5af3a8f18a49676fc60b4c8077cbd
Reviewed-on: https://review.whamcloud.com/33448
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11299 lnet: lnet_add/del_route() 84/33184/31
Amir Shehata [Tue, 4 Sep 2018 23:47:54 +0000 (16:47 -0700)]
LU-11299 lnet: lnet_add/del_route()

Reimplemented lnet_add_route() and lnet_del_route() to use
the peer instead of the peer_ni.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I3734098a81ab18d1d74220c691d96a9b9817e6da
Reviewed-on: https://review.whamcloud.com/33184
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
10 months agoLU-11298 lnet: use peer for gateway 83/33183/31
Amir Shehata [Fri, 31 Aug 2018 02:04:39 +0000 (19:04 -0700)]
LU-11298 lnet: use peer for gateway

The routing code uses peer_ni for a gateway. However with Multi-Rail
a gateway could have multiple interfaces on several different
networks. Instead of using a single peer_ni as the gateway we should
be using the peer and let the MR selection code select the best
peer_ni to send to.

This patch moves the gateway from peer_ni to peer. Much of the
code needs to be rewritten in the following patches to account
for that change. This patch disables the routing features by
disabling the code to add/delete routes.

The asymmetric routing detection feature is also modified to
use the MR routing.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ia7dab552268c4a7fbd7b88122b9a95363d155fd7
Reviewed-on: https://review.whamcloud.com/33183
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-11292 lnet: Discover routers on first use 82/33182/31
Amir Shehata [Tue, 28 Aug 2018 23:42:35 +0000 (16:42 -0700)]
LU-11292 lnet: Discover routers on first use

Discover routers on first use. This brings the behavior when
interacting with routers in line with when dealing with normal
peers.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I8527e41daf2f5f6ab5f04aac1285aaa6cc4ee594
Reviewed-on: https://review.whamcloud.com/33182
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
10 months agoLU-10153 lnet: remove route add restriction 47/33447/23
Amir Shehata [Fri, 19 Oct 2018 23:23:40 +0000 (16:23 -0700)]
LU-10153 lnet: remove route add restriction

Remove the restriction on adding routes to the same remote network
via two different gateways.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iefc5aa10f73e9e7bdd283f5e933fbb8ee819df50
Reviewed-on: https://review.whamcloud.com/33447
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
10 months agoLU-12339 lnet: select LO interface for sending 57/34957/5
Amir Shehata [Sat, 25 May 2019 16:55:47 +0000 (09:55 -0700)]
LU-12339 lnet: select LO interface for sending

In the following scenario

Lustre->LNetPrimaryNID with 0@lo
Discover is initiated on 0@lo
The peer is created with 0@lo and <addr>@<net>
The interface health of the peer's <addr>@<net> is decremented
LNetPut() to self
selection algorithm selects 0@lo to send to

This exposes an issue where we try and go through the peer credit
management algorithm, but because there are no credits associated with
0@lo we end up indefinitely queuing the message. ptlrpc will then get
stuck waiting for send completion on the message.

This was exposed via conf-sanity 32a

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I98e9d3428b594a0d041d27d8e8d8de7596825edc
Reviewed-on: https://review.whamcloud.com/34957
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
10 months agoLU-12199 lnet: verify msg is committed for send/recv 97/34797/12
Amir Shehata [Tue, 30 Apr 2019 21:01:48 +0000 (14:01 -0700)]
LU-12199 lnet: verify msg is committed for send/recv

Before performing a health check make sure the message
is committed for either send or receive. Otherwise we
can just finalize it.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Id7bd956f8e81e60a2d63059730973f851d4c7abe
Reviewed-on: https://review.whamcloud.com/34797
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
10 months agoLU-12199 lnet: Ensure md is detached when msg is not committed 85/34885/8
Chris Horn [Thu, 18 Apr 2019 03:49:18 +0000 (22:49 -0500)]
LU-12199 lnet: Ensure md is detached when msg is not committed

It's possible for lnet_is_health_check() to return "true" when the
message has not hit the network. In this situation the message is
freed without detaching the MD. As a result, requests do not receive
their unlink events and these requests are stuck forever.

A little cleanup is included here:
 - The value of lnet_is_health_check() is only used in one place, so
   we don't need to save the result of it in a variable.
 - We don't need separate logic to detach the md when the send was
   successful. We'll fall through to the finalizing code after
   incrementing the health counters

Test-Parameters: forbuildonly
Cray-bug-id: LUS-7239
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I6301d491090b862d016eed3aac8afd7be8685e57
Reviewed-on: https://review.whamcloud.com/34885
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Maloo <maloo@whamcloud.com>
Tested-by: Jenkins
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
10 months agoLU-12264 lnet: Protect lp_dc_pendq manipulation with lp_lock 98/34798/9
Chris Horn [Thu, 2 May 2019 22:24:32 +0000 (17:24 -0500)]
LU-12264 lnet: Protect lp_dc_pendq manipulation with lp_lock

Protect the peer discovery queue from concurrent manipulation by
acquiring the lp_lock.

Test-Parameters: forbuildonly
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: If43b877c1c7ea203f346a3d6ea846f00b8f9661f
Reviewed-on: https://review.whamcloud.com/34798
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
10 months agoLU-12254 lnet: correct discovery LNetEQFree() 96/34796/8
Amir Shehata [Tue, 30 Apr 2019 18:51:09 +0000 (11:51 -0700)]
LU-12254 lnet: correct discovery LNetEQFree()

The EQ needs to be freed after all the queues are cleaned to avoid
having non-processed events on the event queue on free. Such events
prevent the memory from being freed.

Test-Parameters: forbuildonly
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie38ec25e09bf6d7cf2aadc30edd91d298897c51b
Reviewed-on: https://review.whamcloud.com/34796
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Jenkins