Whamcloud - gitweb
fs/lustre-release.git
2 months ago LU-13476 llite: Fix lock ordering in pagevec_dirty 17/38317/5
Shaun Tancheff [Wed, 6 May 2020 08:19:48 +0000 (03:19 -0500)]
LU-13476 llite: Fix lock ordering in pagevec_dirty

In vvp_set_pagevec_dirty the lock order between i_pages and
lock_page_memcg was inverted, with the expectation that
no other users would conflict.

However, in vvp_page_completion_write the call to
test_clear_page_writeback expects to be able to take
lock_page_memcg and then lock i_pages, which appears
to conflict with the original analysis.

The reported case shows up as RCU stalls, with
vvp_set_pagevec_dirty blocked attempting to lock i_pages.

Fixes: a7299cb012f ("LU-9920 vvp: dirty pages with pagevec")
HPE-bug-id: LUS-8798
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I47c2107ddbef4a76325928e982abfc0ea666f39b
Reviewed-on: https://review.whamcloud.com/38317
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
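
Illustration (not part of the commit): the LU-13476 fix above comes
down to every path taking the two locks in the same order. A minimal
userspace sketch, assuming pthread mutexes as stand-ins for
lock_page_memcg and the i_pages lock. All demo_* names are
hypothetical and this is not the Lustre or kernel code.

```c
#include <pthread.h>

/* Stand-ins for the two kernel locks; for illustration only. */
static pthread_mutex_t demo_memcg_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t demo_i_pages_lock = PTHREAD_MUTEX_INITIALIZER;

static int demo_dirty_pages;
static int demo_writeback_pages = 1;

/* Dirtying path: memcg lock first, then the i_pages lock. */
void demo_set_page_dirty(void)
{
	pthread_mutex_lock(&demo_memcg_lock);
	pthread_mutex_lock(&demo_i_pages_lock);
	demo_dirty_pages++;
	pthread_mutex_unlock(&demo_i_pages_lock);
	pthread_mutex_unlock(&demo_memcg_lock);
}

/*
 * Writeback-completion path: the same order as above.  If one path
 * took the locks in the opposite order, two threads could each hold
 * one lock and wait forever for the other.
 */
void demo_clear_page_writeback(void)
{
	pthread_mutex_lock(&demo_memcg_lock);
	pthread_mutex_lock(&demo_i_pages_lock);
	if (demo_writeback_pages > 0)
		demo_writeback_pages--;
	pthread_mutex_unlock(&demo_i_pages_lock);
	pthread_mutex_unlock(&demo_memcg_lock);
}
```

Both functions can be called from any number of threads; because the
acquisition order is identical, neither can block the other
indefinitely.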
2 months ago LU-13134 osc: re-declare ops_from/to to shrink osc_page 87/37487/8
Wang Shilong [Sat, 8 Feb 2020 11:41:46 +0000 (19:41 +0800)]
LU-13134 osc: re-declare ops_from/to to shrink osc_page

@ops_from and @ops_to are always within PAGE_SIZE, so PAGE_SHIFT
bits are enough to hold each of them. On the x86_64 platform this
patch saves another 8 bytes.

Note that @ops_to was previously exclusive and could be PAGE_SIZE.
This patch makes it inclusive, so its maximum value is
PAGE_SIZE - 1; take care when calculating the length.

After this patch, cl_page size is reduced from 320 to 312 bytes,
and we are able to allocate 13 objects per 4K page from the slab pool.

Change-Id: Ic260c0a6580292301b5397276042e399c0f07e11
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37487
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
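
Illustration (not part of the commit): why an inclusive @ops_to fits
in PAGE_SHIFT bits while an exclusive one does not. A minimal sketch,
assuming a 4K page (PAGE_SHIFT of 12); the struct and helper names
are hypothetical.

```c
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (1u << DEMO_PAGE_SHIFT)

/*
 * 12 bits hold 0..4095.  An exclusive end could be 4096 and would
 * overflow the field; an inclusive end tops out at PAGE_SIZE - 1.
 */
struct demo_osc_page {
	unsigned int ops_from : DEMO_PAGE_SHIFT; /* first byte, inclusive */
	unsigned int ops_to   : DEMO_PAGE_SHIFT; /* last byte, inclusive */
};

/* With an inclusive end the length needs the extra "+ 1". */
static unsigned int demo_ops_len(const struct demo_osc_page *p)
{
	return p->ops_to - p->ops_from + 1;
}
```

A whole page is described by ops_from = 0 and ops_to = PAGE_SIZE - 1,
and demo_ops_len() then returns PAGE_SIZE.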
2 months ago LU-13134 obdclass: re-declare cl_page variables to reduce its size 80/37480/11
Wang Shilong [Sat, 8 Feb 2020 02:19:07 +0000 (10:19 +0800)]
LU-13134 obdclass: re-declare cl_page variables to reduce its size

With the following changes:
1) Make the CPS_CACHED declaration start from 1, consistent with
CPT_CACHED.
2) Add CPT_NR to indicate the max allowed CPT state value.
3) Reserve 4 bits for @cp_state, which allows 15 kinds of states.
4) Reserve 2 bits for @cp_type, which allows 3 kinds of cl_page types.
5) Use short int for @cp_kmem_index; we still have another 16 bits
reserved for future extension.
6) Move @cp_lov_index after @cp_ref to fill a 4-byte hole.

After this patch, cl_page size is reduced from 336 bytes to 320 bytes.

Change-Id: I92d5652a42850890ac6ce61e54884450dda25cc7
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37480
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
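
Illustration (not part of the commit): how the bit budget in items 3)
and 4) works out when enum values start from 1. A minimal sketch; the
DEMO_* enumerators and the struct are hypothetical and only mimic the
idea, not the real cl_page layout.

```c
/* Values start at 1, so an N-bit field can hold 2^N - 1 of them. */
enum demo_page_state {
	DEMO_CPS_CACHED = 1,
	DEMO_CPS_OWNED,
	DEMO_CPS_FREEING,
	DEMO_CPS_NR		/* one past the last valid state */
};

enum demo_page_type {
	DEMO_CPT_CACHEABLE = 1,
	DEMO_CPT_TRANSIENT,
	DEMO_CPT_NR		/* one past the last valid type */
};

#define DEMO_STATE_BITS 4	/* values 1..15 */
#define DEMO_TYPE_BITS  2	/* values 1..3 */

struct demo_page_bits {
	unsigned int cp_state : DEMO_STATE_BITS;
	unsigned int cp_type  : DEMO_TYPE_BITS;
};

/* Fail the build if a new enumerator no longer fits its field. */
_Static_assert(DEMO_CPS_NR - 1 <= (1 << DEMO_STATE_BITS) - 1,
	       "state does not fit in cp_state");
_Static_assert(DEMO_CPT_NR - 1 <= (1 << DEMO_TYPE_BITS) - 1,
	       "type does not fit in cp_type");
```

The compile-time checks are an addition of this sketch; they show one
way a "_NR" sentinel can guard the field widths.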
2 months ago LU-13134 obdclass: use offset instead of cp_linkage 28/37428/12
Wang Shilong [Tue, 4 Feb 2020 13:44:23 +0000 (21:44 +0800)]
LU-13134 obdclass: use offset instead of cp_linkage

Since we have fixed-size cl_page allocations, we can use an
offset array to locate each slice of a cl_page.

With this patch, cl_page size is reduced from 392 bytes to
336 bytes, which means we can allocate 12 objects instead of 10.

Change-Id: I323bd589941125bfddf104f53a335d0cfee5c548
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37428
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
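
Illustration (not part of the commit): locating slices through a
small offset array instead of per-slice list linkage. A minimal
sketch under the assumption that all slices live inside the same
fixed-size allocation as the page; every name is hypothetical, and a
single slice type is used to keep it short.

```c
#include <stddef.h>

#define DEMO_MAX_SLICES 3

struct demo_slice {
	int layer_id;
};

struct demo_page {
	unsigned char nr_slices;
	/* Byte offset of each slice from the start of the page. */
	unsigned char slice_off[DEMO_MAX_SLICES];
	/* Slice storage inside the same allocation. */
	struct demo_slice slices[DEMO_MAX_SLICES];
};

/* Record where a slice lives; one byte instead of two list pointers. */
static void demo_slice_add(struct demo_page *pg, struct demo_slice *s)
{
	pg->slice_off[pg->nr_slices++] =
		(unsigned char)((char *)s - (char *)pg);
}

/* Turn the stored offset back into a pointer. */
static struct demo_slice *demo_slice_get(struct demo_page *pg, int i)
{
	if (i < 0 || i >= pg->nr_slices)
		return NULL;
	return (struct demo_slice *)((char *)pg + pg->slice_off[i]);
}
```

The offsets only work because the slices never move relative to the
page, which is what the fixed-size allocation guarantees.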
2 months ago LU-9679 lustre: remove some "#ifdef CONFIG*" from .c files. 31/39131/4
Mr NeilBrown [Sun, 7 Jun 2020 23:24:26 +0000 (19:24 -0400)]
LU-9679 lustre: remove some "#ifdef CONFIG*" from .c files.

It is Linux policy to avoid #ifdef in C files where
convenient - .h files are OK.

This patch defines a few inline functions which differ
depending on CONFIG_LUSTRE_FS_POSIX_ACL, and removes
some #ifdefs from .c files.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I680bcf568d3a09d3768cc992a53671352bd125fd
Reviewed-on: https://review.whamcloud.com/39131
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
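
Illustration (not part of the commit): the pattern of moving an
#ifdef out of a .c file into inline helpers. A minimal sketch;
DEMO_CONFIG_ACL is a hypothetical stand-in for
CONFIG_LUSTRE_FS_POSIX_ACL and the functions are invented for the
example.

```c
/* ---- this part would live in a .h file ---- */
struct demo_inode {
	int has_acl;
};

#ifdef DEMO_CONFIG_ACL
/* Feature enabled: do the real work. */
static inline int demo_acl_release(struct demo_inode *inode)
{
	int had = inode->has_acl;

	inode->has_acl = 0;
	return had;
}
#else
/* Feature disabled: an empty stub the compiler can optimize away. */
static inline int demo_acl_release(struct demo_inode *inode)
{
	(void)inode;
	return 0;
}
#endif

/* ---- this part would live in a .c file: no #ifdef needed ---- */
int demo_clear_inode(struct demo_inode *inode)
{
	return demo_acl_release(inode);
}
```

Compiled without -DDEMO_CONFIG_ACL, demo_clear_inode() leaves the
inode untouched and returns 0; the call site reads the same either
way.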
2 months ago LU-9679 nrs: remove ts_opcodes bitmap. 75/38975/2
Mr NeilBrown [Wed, 29 Apr 2020 02:46:43 +0000 (12:46 +1000)]
LU-9679 nrs: remove ts_opcodes bitmap.

This bitmap is never used.  There is one place that tests if it has
been allocated or not, but that place can easily be satisfied by a
strcmp().

So discard the field.

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I97ca6753236f106d1865d514034b72a679f9b28a
Reviewed-on: https://review.whamcloud.com/38975
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13609 llog: list all the log files correctly on MGS/MDT 17/38917/6
Emoly Liu [Fri, 12 Jun 2020 08:12:00 +0000 (16:12 +0800)]
LU-13609 llog: list all the log files correctly on MGS/MDT

"lctl --device xxx llog_catlist" should list all the config logs on
the MGS and all the catalogs on the MDT correctly, without any buffer
size limit. If the data can't be fetched in one call, data->ioc_count
is used to save the number of logs fetched so far, and the listing
then continues from there.

conf-sanity.sh test_123af is added to verify this patch.

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I364d563446833751b1f017fa2bef0351dab56235
Reviewed-on: https://review.whamcloud.com/38917
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
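
Illustration (not part of the commit): listing an arbitrary number of
log names through a fixed-size buffer by carrying a running count
between calls. A minimal sketch; the running count plays the role the
commit assigns to data->ioc_count, and the demo_* functions are
hypothetical, not the lctl or ioctl code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy names, one per line, starting at index 'start' until the
 * buffer is full.  Returns how many names were copied.
 */
static size_t demo_catlist_chunk(const char *const *logs, size_t nlogs,
				 size_t start, char *buf, size_t buflen)
{
	size_t used = 0, copied = 0;

	if (buflen > 0)
		buf[0] = '\0';
	for (size_t i = start; i < nlogs; i++) {
		size_t len = strlen(logs[i]) + 1;	/* name + '\n' */

		if (used + len + 1 > buflen)		/* keep room for '\0' */
			break;
		memcpy(buf + used, logs[i], len - 1);
		buf[used + len - 1] = '\n';
		used += len;
		buf[used] = '\0';
		copied++;
	}
	return copied;
}

/* Keep fetching, resuming at the running count, until nothing is left. */
static size_t demo_catlist_all(const char *const *logs, size_t nlogs,
			       char *buf, size_t buflen, size_t *ncalls)
{
	size_t count = 0;

	*ncalls = 0;
	for (;;) {
		size_t got = demo_catlist_chunk(logs, nlogs, count,
						buf, buflen);
		if (got == 0)
			break;
		count += got;	/* resume point for the next call */
		(*ncalls)++;
	}
	return count;
}
```

A name longer than the whole buffer would stop the loop early in this
sketch; real code would have to report that case as an error.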
2 months ago LU-11814 obdcalss: ensure LCT_QUIESCENT take sync 16/38416/5
Yang Sheng [Wed, 29 Apr 2020 15:16:43 +0000 (23:16 +0800)]
LU-11814 obdcalss: ensure LCT_QUIESCENT take sync

Add locking in lu_device_init to ensure that LCT_QUIESCENT
operations are seen by other threads during parallel
mounting. Also add an extra check before unsetting the
flag to make sure we don't do it after the device has
been started.

(osd_handler.c:7730:osd_device_init0()) ASSERTION( info ) failed:
(osd_handler.c:7730:osd_device_init0()) LBUG
Pid: 28098, comm: mount.lustre 3.10.0-1062.9.1.el7_lustre.x86_64
Call Trace:
 libcfs_call_trace+0x8c/0xc0 [libcfs]
 lbug_with_loc+0x4c/0xa0 [libcfs]
 osd_device_alloc+0x778/0x8f0 [osd_ldiskfs]
 obd_setup+0x129/0x2f0 [obdclass]
 class_setup+0x48f/0x7f0 [obdclass]
 class_process_config+0x190f/0x2830 [obdclass]
 do_lcfg+0x258/0x500 [obdclass]
 lustre_start_simple+0x88/0x210 [obdclass]
 server_fill_super+0xf55/0x1890 [obdclass]
 lustre_fill_super+0x498/0x990 [obdclass]
 mount_nodev+0x4f/0xb0
 lustre_mount+0x18/0x20 [obdclass]
 mount_fs+0x3e/0x1b0
 vfs_kern_mount+0x67/0x110
 do_mount+0x1ef/0xce0
 SyS_mount+0x83/0xd0
 system_call_fastpath+0x25/0x2a
 0xffffffffffffffff
 Kernel panic - not syncing: LBUG

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: Iccf3d545a5fc7c4a3b2320f1c7c7edcfbc1d17bb
Reviewed-on: https://review.whamcloud.com/38416
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 lov: convert container_of0() calls. 82/38382/4
Mr NeilBrown [Mon, 27 Apr 2020 05:31:55 +0000 (15:31 +1000)]
LU-6142 lov: convert container_of0() calls.

Most calls to container_of0() in lustre/lov/ are preceded by an
LINVRNT() which assures us that the pointer is valid, so
container_of() can be used instead.

Only in lov2obd() is there no such context, so that call is changed
to use container_of_safe().

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I8b3ccc8f0ac1a32122f043e8feec078fcfe2452b
Reviewed-on: https://review.whamcloud.com/38382
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-6142 lustre: convert use of container_of0 in include/ 80/38380/5
Mr NeilBrown [Mon, 27 Apr 2020 04:54:13 +0000 (14:54 +1000)]
LU-6142 lustre: convert use of container_of0 in include/

Most uses of container_of0() are changed to the upstream-standard
interface container_of_safe().  There is no clear context suggesting
that the pointer is known to be valid, so it is consistent with the
current code to use the _safe() version.

In a few cases it is clear that the pointer must be valid.
This may be because:
 - it is a '.next' for a struct list_head
 - it is from lo_object_next(), which is a special case of the above
 - the returned value is confirmed to be valid by an LINVRNT()

So change all container_of0() to either container_of_safe() or
container_of().

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9fdb8b7216667e58d3837ea555889f9346e4b10a
Reviewed-on: https://review.whamcloud.com/38380
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
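
Illustration (not part of the commit): the difference between
container_of() and a "safe" variant when the member pointer may not
be valid. A minimal userspace sketch; the demo_* macros are
simplified re-implementations that handle only NULL, not the kernel
macros, and the structs are hypothetical.

```c
#include <stddef.h>

/* Step back from a member pointer to the enclosing structure. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Simplified safe variant: NULL stays NULL instead of turning into
 * a bogus non-NULL pointer.  Note that 'ptr' is evaluated twice.
 */
#define demo_container_of_safe(ptr, type, member) \
	((ptr) ? demo_container_of(ptr, type, member) : (type *)NULL)

struct demo_lu_object {
	int lo_flags;
};

struct demo_lov_object {
	int lo_type;
	struct demo_lu_object lo_cl;	/* embedded, not first */
};

/* Caller may pass NULL, so the safe form is required here. */
static struct demo_lov_object *demo_lu2lov(struct demo_lu_object *o)
{
	return demo_container_of_safe(o, struct demo_lov_object, lo_cl);
}
```

Where a preceding check already guarantees a valid pointer, the plain
demo_container_of() is enough and skips the extra test.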
2 months ago LU-13653 mdt: ignore quota when creating slave stripe 75/38875/7
Hongchao Zhang [Wed, 24 Jun 2020 09:53:55 +0000 (17:53 +0800)]
LU-13653 mdt: ignore quota when creating slave stripe

When creating a striped directory, the quota limit has already been
checked on the master MDT, so the quota should be ignored when
creating the slave stripe objects.

Change-Id: Ia53b1975a8d66c78725feb313659f7a9b889e735
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38875
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13721 utils: fix 'lfs find --pool' for PFL files 96/39196/3
Andreas Dilger [Sat, 27 Jun 2020 06:04:42 +0000 (00:04 -0600)]
LU-13721 utils: fix 'lfs find --pool' for PFL files

Fix "lfs find --pool" to check the lov_user_md_v3 for the specified
pool name in find_check_pool() for composite files (PFL or FLR).
The v3 pointer was initialized from the main layout xattr, but was
not being refreshed for each of the components in the file.

Add a test case for "lfs find --pool" usage.

Fixes: 5a76aee24476 ("LU-8998 lfs: user space tools for PFL")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I6b6f21fd2fdf58f46972704cb6fb425a943ebbe5
Reviewed-on: https://review.whamcloud.com/39196
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13600 ptlrpc: re-enterable signal_completed_replay() 40/39140/4
Mikhail Pershin [Mon, 22 Jun 2020 18:04:34 +0000 (21:04 +0300)]
LU-13600 ptlrpc: re-enterable signal_completed_replay()

signal_completed_replay() can race while checking the
imp_replay_inflight counter, so remove the assertion and
check for the race instead.

Fixes: 3b613a442b ("LU-13600 ptlrpc: limit rate of lock replays")
Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ib7c372b1757556b7285f380b40167742f9b71ec6
Reviewed-on: https://review.whamcloud.com/39140
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
2 months ago LU-13731 llite: include linux/mm_types.h for VM_FAULT_RETRY 22/39222/2
Jian Yu [Wed, 1 Jul 2020 04:52:11 +0000 (21:52 -0700)]
LU-13731 llite: include linux/mm_types.h for VM_FAULT_RETRY

In RHEL 8.2 kernel 4.18.0-193.6.3.el8_2, VM_FAULT_RETRY is
defined in linux/mm_types.h instead of linux/mm.h. This
patch adds the #include in llite_internal.h to define
VM_FAULT_RETRY.

Test-Parameters: clientdistro=el8.2 serverdistro=el8.2 \
testlist=sanity

Test-Parameters: clientdistro=el8.1 serverdistro=el8.1 \
testlist=sanity

Change-Id: I0e48b32f661dab25392abf7924b5ac44f334d639
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/39222
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-12678 socklnd: don't fall-back to tcp_sendpage. 34/39134/2
Mr NeilBrown [Sun, 7 Jun 2020 23:24:23 +0000 (19:24 -0400)]
LU-12678 socklnd: don't fall-back to tcp_sendpage.

sk_prot->sendpage is never NULL, so there is no
need for a fallback to tcp_sendpage.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Iaabf47790f2809fe98a0f09da31aa441021b26ab
Reviewed-on: https://review.whamcloud.com/39134
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-12678 lnet: Fix some out-of-date comments. 27/39127/2
Mr NeilBrown [Sun, 7 Jun 2020 23:24:30 +0000 (19:24 -0400)]
LU-12678 lnet: Fix some out-of-date comments.

The structures these comments describe have changed or been removed,
but the comments weren't updated at the time.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I342edb56790290a0158d4907a5e775e88361ce08
Reviewed-on: https://review.whamcloud.com/39127
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-12678 o2iblnd: allocate init_qp_attr on stack. 22/39122/2
Mr NeilBrown [Sun, 7 Jun 2020 23:24:35 +0000 (19:24 -0400)]
LU-12678 o2iblnd: allocate init_qp_attr on stack.

'struct ib_qp_init_attr' is not so large that it cannot be allocated
on the stack.  It is about 100 bytes, various other functions in Linux
allocate it on the stack, and the stack isn't as constrained as it
once was.

So allocate on stack instead of using kmalloc and handling errors.

Test-Parameters: trivial testlist=sanity-lnet
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Id1f2f695f298d1883a5d6817092a6f89f1e072ef
Reviewed-on: https://review.whamcloud.com/39122
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-9859 libcfs: fold cfs_tracefile_*_arch into their only callers. 17/39117/2
Mr NeilBrown [Sat, 20 Jun 2020 02:47:19 +0000 (22:47 -0400)]
LU-9859 libcfs: fold cfs_tracefile_*_arch into their only callers.

There is no need to separate "arch" init/fini from
the rest, so fold it all in.
This requires some slightly subtle changes to clean-up
to make sure we don't walk lists before they are
initialized.

Test-Parameters: trivial

Change-Id: Id78385cfc29ab3d5162a5fb84283e7a6ce6bf91a
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/39117
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-930 misc: update URLs in README 85/38985/3
Andreas Dilger [Thu, 18 Jun 2020 22:39:15 +0000 (16:39 -0600)]
LU-930 misc: update URLs in README

Update the URLs in the README file to reference the pages at
https://wiki.lustre.org/ instead of the WC pages.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I209956b74c19e8b1fd1c0f6d93b8b217073ebbe5
Reviewed-on: https://review.whamcloud.com/38985
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Peter Jones <pjones@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-9859 libcfs: move tgt_descs to standard Linux bitmaps. 81/38981/3
James Simmons [Fri, 19 Jun 2020 13:39:34 +0000 (09:39 -0400)]
LU-9859 libcfs: move tgt_descs to standard Linux bitmaps.

Originally the Linux kernel lacked a uniform bitmap API, so
Lustre created its own. Today's kernels support a standard
bitmap API, so migrate tgt_descs to the standard API.

Change-Id: If43b520c29d16355189b1eb7f9fdab7309446545
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38981
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
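
Illustration (not part of the commit): the shape of a bitmap kept in
an array of unsigned long, which is how the standard Linux bitmap API
stores bits. A minimal userspace sketch; the demo_* helpers are
non-atomic analogues written for this example, not the kernel's
set_bit()/test_bit() implementations, and DEMO_MAX_TGTS is an
arbitrary size.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITS_TO_LONGS(n) \
	(((n) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)
#define DEMO_MAX_TGTS		128

/* One bit per target index. */
static unsigned long demo_tgt_bitmap[DEMO_BITS_TO_LONGS(DEMO_MAX_TGTS)];

static void demo_set_bit(size_t nr, unsigned long *map)
{
	map[nr / DEMO_BITS_PER_LONG] |= 1UL << (nr % DEMO_BITS_PER_LONG);
}

static void demo_clear_bit(size_t nr, unsigned long *map)
{
	map[nr / DEMO_BITS_PER_LONG] &= ~(1UL << (nr % DEMO_BITS_PER_LONG));
}

static int demo_test_bit(size_t nr, const unsigned long *map)
{
	return (map[nr / DEMO_BITS_PER_LONG] >>
		(nr % DEMO_BITS_PER_LONG)) & 1UL;
}

/* Lowest clear bit below 'size', or 'size' if every bit is set. */
static size_t demo_find_first_zero_bit(const unsigned long *map,
				       size_t size)
{
	for (size_t i = 0; i < size; i++)
		if (!demo_test_bit(i, map))
			return i;
	return size;
}
```

Finding the first zero bit is the typical way such a bitmap hands out
the next free target index.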
3 months ago LU-13595 scripts: Add a debug option to lustre_rmmod 41/38941/5
Etienne AUJAMES [Thu, 11 Jun 2020 14:38:18 +0000 (16:38 +0200)]
LU-13595 scripts: Add a debug option to lustre_rmmod

The option is "-d" or "--debug-kernel". It uses "lctl debug_kernel" to
dump the debug messages before unloading libcfs.
For each module, the script adds a debug mark before unloading it.

Example:
$ lustre_rmmod -d > /tmp/lustre_rmmod.log
$ lustre_rmmod -d lustre lnet > /tmp/lustre_rmmod2.log

Test-Parameters: trivial
Signed-off-by: Etienne AUJAMES <eaujames@ddn.com>
Change-Id: I6e44a24f2e786c08faf1db27de94e0f88ca65dc7
Reviewed-on: https://review.whamcloud.com/38941
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months ago LU-13566 socklnd: fix local interface binding 43/38743/6
Amir Shehata [Wed, 17 Jun 2020 22:25:36 +0000 (15:25 -0700)]
LU-13566 socklnd: fix local interface binding

When a node is configured with multiple interfaces in
Multi-Rail config, socklnd was not utilizing the local interface
requested by LNet. In essence LNet was using all the NIDs in round
robin, however the socklnd module was not binding to the correct
interface. Traffic was thus sent on a subset of the interfaces.

The reason is that the route interface number was not being set.
In most cases lnet_connect() is called to create a socket. The
socket is bound to the interface provided and then
ksocknal_create_conn() is called to create the socklnd connection.
ksocknal_create_conn() calls ksocknal_associate_route_conn_locked()
at which point the route's local interface is assigned. However,
this is already too late as the socket has already been created
and bound to a local interface.

Therefore, it's important to assign the route's interface before
calling lnet_connect() to ensure the socket is bound to the correct
local interface.

To address this issue, the route's interface index is initialized
to the NI's interface index when it's added to the peer_ni.

Another bug fixed:
The interface index was not being initialized in the startup
routine.

Note: We're strictly assuming that there is one interface for each
NI. This is because tcp bonding will be removed from the socklnd as
it has been deprecated by LNet multi-rail.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ibfa202ba009e07dbd69b19f1180790f1ea978ab1
Reviewed-on: https://review.whamcloud.com/38743
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-13437 mdt: rename misses remote LOOKUP lock revoke 81/38181/16
Lai Siyao [Wed, 8 Apr 2020 14:55:22 +0000 (22:55 +0800)]
LU-13437 mdt: rename misses remote LOOKUP lock revoke

In rename, all objects but the target may be remote, so to check
whether the source is a remote object on the source parent, we need
to compare the MDTs on which they are located if both are remote.
Add a helper function mdt_rename_source_lock() to handle all possible
combinations. If the target parent is remote, take a remote LOOKUP
lock for the target on the MDT where the target parent is located.

Add sanityn.sh 81c.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I2c134970d6abc8761528d01950b23495292cdf93
Reviewed-on: https://review.whamcloud.com/38181
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-930 doc: update James Simmons contact info 56/37756/3
James Simmons [Sun, 21 Jun 2020 14:50:17 +0000 (10:50 -0400)]
LU-930 doc: update James Simmons contact info

Update my email address to the one I have had for over 20 years.

Test-Parameters: trivial
Change-Id: Id29e9ef0ffb49fc1b568913d47bb24686b3175fc
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/37756
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
3 months ago LU-13180 osc: disable ext merging for rdma only pages and non-rdma 67/37567/2
Wang Shilong [Fri, 14 Feb 2020 06:50:11 +0000 (14:50 +0800)]
LU-13180 osc: disable ext merging for rdma only pages and non-rdma

This patch adds logic to prevent CPU memory pages and RDMA memory
pages from being merged into one RPC. The code that sets
OBD_BRW_RDMA_ONLY will be added later, together with the RDMA-only
code.

Change-Id: I11e8beda52cc533f17b2a40c34713f441e93d5b6
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37567
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
3 months ago LU-8130 lu_object: convert lu_object cache to rhashtable 07/36707/32
Mr NeilBrown [Thu, 14 May 2020 12:03:33 +0000 (08:03 -0400)]
LU-8130 lu_object: convert lu_object cache to rhashtable

The lu_object cache is a little more complex than the other lustre
hash tables for two reasons.
1/ there is a debugfs file which displays the contents of the cache,
  so we need to use rhashtable_walk in a way that works for seq_file.

2/ There is a (sharded) lru list for objects which are no longer
   referenced, so finding an object needs to consider races with the
   lru as well as with the hash table.

The debugfs file already manages walking the libcfs hash table keeping
a current-position in the private data.  We can fairly easily convert
that to a struct rhashtable_iter.  The debugfs file actually reports
pages, and there are multiple pages per hashtable object.  So as well
as rhashtable_iter, we need the current page index.

For the double-locking, the current code uses direct-access to the
bucket locks that libcfs_hash provides.  rhashtable doesn't provide
that access - callers must provide their own locking or use rcu
techniques.

The lsb_waitq.lock is still used to manage the lru list, but with
this patch it is no longer nested *inside* the hashtable locks, but
instead is outside.  It is used to protect an object with a refcount
of zero.

When purging old objects from an lru, we first set
LU_OBJECT_HEARD_BANSHEE while holding the lsb_waitq.lock,
then remove all the entries from the hashtable separately.

When removing the last reference from an object, we first take the
lsb_waitq.lock, then decrement the reference and add to the lru list
or discard it setting LU_OBJECT_UNHASHED.

When we find an object in the hashtable with a refcount of zero, we
take the corresponding lsb_waitq.lock and check that neither
LU_OBJECT_HEARD_BANSHEE nor LU_OBJECT_UNHASHED is set.  If neither is,
we can safely increment the refcount.  If either is, the object is
gone.

This way, we only ever manipulate an object with a refcount of zero
while holding the lsb_waitq.lock.

As there is nothing to stop us using the resizing capabilities of
rhashtable, the code to try to guess the perfect hash size has been
removed.

Also: the "is_dying" variable in lu_object_put() is racy - the value
could change the moment it is sampled.  It is also not needed as it is
only used to avoid a wakeup, which is not particularly expensive.
In the same code a comment says that 'top' could not be accessed, but
the code then immediately accesses 'top' to calculate 'bkt'.
So move the initialization of 'bkt' to before 'top' becomes unsafe.

Also: Change "wake_up_all()" to "wake_up()".  wake_up_all() is only
relevant when an exclusive wait is used.

Moving from the libcfs hashtable to rhashtable also gives the
benefit of a very large performance boost.

Before patch:

SUMMARY rate: (of 5 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   Directory creation:      12036.610      11091.880      11452.978        318.829
   Directory stat:          25871.734      24232.310      24935.661        574.996
   Directory removal:       12698.769      12239.685      12491.008        149.149
   File creation:           11722.036      11673.961      11692.157         15.966
   File stat:               62304.540      58237.124      60282.003       1479.103
   File read:               24204.811      23889.091      24048.577        110.245
   File removal:             9412.930       9111.828       9217.546        120.894
   Tree creation:            3515.536       3195.627       3442.609        123.792
   Tree removal:              433.917        418.935        428.038          5.545

After patch:

SUMMARY rate: (of 5 iterations)
   Operation                      Max            Min           Mean        Std Dev
   ---------                      ---            ---           ----        -------
   Directory creation:      11873.308        303.626       9371.860       4539.539
   Directory stat:          31116.512      30190.574      30568.091        335.545
   Directory removal:       13082.121      12645.228      12943.239        157.695
   File creation:           12607.135      12293.319      12466.647        138.307
   File stat:              124419.347     105240.996     116919.977       7847.165
   File read:               39707.270      36295.477      38266.011       1328.857
   File removal:             9614.333       9273.931       9477.299        140.201
   Tree creation:            3572.602       3017.580       3339.547        207.061
   Tree removal:              487.687          0.004        282.188        230.659

Change-Id: I618dc2e2da003c240a887126f600e7eac5df951c
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/36707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
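
Illustration (not part of the commit): the rule from LU-8130 above
that an object with a refcount of zero is only touched while holding
the lru lock. A minimal userspace sketch, assuming a pthread mutex in
place of lsb_waitq.lock and a plain int refcount that is always
changed under that lock (the real code only needs the lock for the
zero case). All demo_* names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

#define DEMO_HEARD_BANSHEE	0x1	/* being purged from the lru */
#define DEMO_UNHASHED		0x2	/* discarded on last put */

struct demo_bucket {
	pthread_mutex_t lock;	/* stand-in for lsb_waitq.lock */
	int lru_len;
};

struct demo_object {
	int ref;
	unsigned int flags;
	struct demo_bucket *bkt;
};

/* Lookup hit: take a reference unless the object is already gone. */
static bool demo_object_get(struct demo_object *o)
{
	bool ok = true;

	pthread_mutex_lock(&o->bkt->lock);
	if (o->ref == 0) {
		if (o->flags & (DEMO_HEARD_BANSHEE | DEMO_UNHASHED))
			ok = false;		/* lost the race */
		else
			o->bkt->lru_len--;	/* leaves the lru */
	}
	if (ok)
		o->ref++;
	pthread_mutex_unlock(&o->bkt->lock);
	return ok;
}

/* Last put either parks the object on the lru or discards it. */
static void demo_object_put(struct demo_object *o, bool discard)
{
	pthread_mutex_lock(&o->bkt->lock);
	if (--o->ref == 0) {
		if (discard)
			o->flags |= DEMO_UNHASHED;
		else
			o->bkt->lru_len++;
	}
	pthread_mutex_unlock(&o->bkt->lock);
}

/* Purge: mark under the lock; hashtable removal happens separately. */
static void demo_object_purge(struct demo_object *o)
{
	pthread_mutex_lock(&o->bkt->lock);
	if (o->ref == 0 &&
	    !(o->flags & (DEMO_HEARD_BANSHEE | DEMO_UNHASHED))) {
		o->flags |= DEMO_HEARD_BANSHEE;
		o->bkt->lru_len--;
	}
	pthread_mutex_unlock(&o->bkt->lock);
}
```

Because the flag is set and tested under the same lock, a lookup that
finds a purged object still in the hashtable reliably refuses it.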
3 months ago LU-8130 ptlrpc: convert conn_hash to rhashtable 16/33616/20
Mr NeilBrown [Tue, 7 Apr 2020 16:20:47 +0000 (12:20 -0400)]
LU-8130 ptlrpc: convert conn_hash to rhashtable

Linux has a resizeable hashtable implementation in lib,
so we should use that instead of having one in libcfs.

This patch converts the ptlrpc conn_hash to use rhashtable.
In the process we gain lockless lookup.

As connections are never deleted until the hash table is destroyed,
there is no need to count the reference in the hash table.  There
is also no need to enable automatic_shrinking.

Linux-commit: ac2370ac2bc5215daf78546cd8d925510065bb7f

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I0537564ba544ed06be42ba243606a884a1290f20
Reviewed-on: https://review.whamcloud.com/33616
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-9679 obdclass: remove init to 0 from lustre_init_lsi() 35/39135/2
Mr NeilBrown [Sun, 7 Jun 2020 23:24:21 +0000 (19:24 -0400)]
LU-9679 obdclass: remove init to 0 from lustre_init_lsi()

After allocating a struct with OBD_ALLOC, there is no value in setting
a few of the fields to zero.

Test-Parameters: trivial
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ib1291661ba9124219e69c7a4d3c6ee4dcf14e021
Reviewed-on: https://review.whamcloud.com/39135
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months ago LU-13525 sec: better struct sepol_downcall_data 80/38580/7
Sebastien Buisson [Tue, 12 May 2020 15:58:15 +0000 (00:58 +0900)]
LU-13525 sec: better struct sepol_downcall_data

struct sepol_downcall_data is badly formed for several reasons:
- it uses a __kernel_time_t field, which can be variably sized,
  depending on the size of __kernel_long_t. Replace it with a
  fixed-size __s64 type;
- it has __u32 sdd_magic that is immediately before a potentially
  64-bit field, whereas the 64-bit fields in a structure should
  always be naturally aligned on 64-bit boundaries to avoid potential
  incompatibility in the structure definition;
- it has __u16 sdd_sepol_len which may be followed by padding.

So create a better struct sepol_downcall_data, while maintaining
compatibility with 2.12 by keeping a struct sepol_downcall_data_old.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I07c573c2eef64fb0c796d8af4acdc3428e0761a8
Reviewed-on: https://review.whamcloud.com/38580
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
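
Illustration (not part of the commit): the three layout problems
listed in LU-13525 above and one way to avoid them. A minimal sketch;
only sdd_magic and sdd_sepol_len are names taken from the commit
text, and the remaining fields, their order and the struct names are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Problematic shape: ABI-dependent size and implicit padding. */
struct demo_sepol_old {
	uint32_t sdd_magic;	/* followed by a hole on 64-bit ABIs */
	long	 mtime;		/* stand-in for __kernel_time_t: 4 or 8 bytes */
	uint16_t sdd_sepol_len;	/* may be followed by padding */
	char	 sepol[];
};

/* Fixed-size fields, explicit padding, 64-bit field at offset 8. */
struct demo_sepol_new {
	uint32_t sdd_magic;
	uint16_t sdd_sepol_len;
	uint16_t padding;	/* explicit, so nothing is left implicit */
	int64_t	 mtime;		/* same size on every ABI */
	char	 sepol[];
};

/* The new layout is identical on 32-bit and 64-bit builds. */
_Static_assert(offsetof(struct demo_sepol_new, mtime) == 8,
	       "mtime must sit on a 64-bit boundary");
_Static_assert(sizeof(struct demo_sepol_new) == 16,
	       "no hidden padding expected");
```

Keeping an "_old" definition next to the new one, as the commit does,
lets a reader decide by size or magic which layout a sender used.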
3 months ago LU-13365 ldlm: check slv and limit before updating 69/37969/3
Wang Shilong [Wed, 18 Mar 2020 07:51:17 +0000 (15:51 +0800)]
LU-13365 ldlm: check slv and limit before updating

slv and limit do not change most of the time, but
ldlm_cli_update_pool() may be called for each RPC reply.
Checking them first under the read lock avoids taking the
heavier write lock in the hot path.

Change-Id: I4ef33132a463b3bac23863c09a2652f2d2782aae
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/37969
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months ago LU-9679 osc: simplify osc_extent_find() 07/37607/7
NeilBrown [Thu, 13 Dec 2018 00:32:56 +0000 (11:32 +1100)]
LU-9679 osc: simplify osc_extent_find()

osc_extent_find() contains some code with the same functionality as
osc_extent_merge().  So replace that code with a call to
osc_extent_merge().

This requires that we set cur->oe_grants earlier, as
osc_extent_merge() needs that.

Also:

 - fix a pre-existing bug - osc_extent_merge() should never try to
   merge two extents with different ->oe_mppr as later alignment
   checks can get confused.
 - Remove a redundant list_del_init() which is already included in
   __osc_extent_remove().

Linux-Commit: 85ebb57ddc5b ("lustre: osc: simplify osc_extent_find()")

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I5fa56e04ed707ee91f99179030dae4bd45456061
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/37607
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13711 build: fix typo on SSL dependency for Ubuntu 67/39167/3
Sebastien Buisson [Wed, 24 Jun 2020 14:57:54 +0000 (16:57 +0200)]
LU-13711 build: fix typo on SSL dependency for Ubuntu

On Ubuntu, SSL dependency is "libssl1.1".

Fixes: e1bf37870d ("LU-12214 build: fix build with gss enabled")
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I186631ac2f08a3f5be0fb54fadec3d1455960e8c
Reviewed-on: https://review.whamcloud.com/39167
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-13701 tests: stop running sanity-lnet tests 55/39155/3
James Nunez [Tue, 23 Jun 2020 14:31:29 +0000 (08:31 -0600)]
LU-13701 tests: stop running sanity-lnet tests

sanity-lnet tests 204, 205, 206, 207 and 208 are failing at
a very high rate for ARM client testing.  Stop running these
tests by adding them to the ALWAYS_EXCEPT list until we
understand why they are failing.

Test-Parameters: trivial testgroup=review-ldiskfs-arm
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: Ia3e970748d719f25b156666a0b37ac9d89535532
Reviewed-on: https://review.whamcloud.com/39155
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-13693 lfs: avoid opening regular files for getstripe 79/38979/3
John L. Hammond [Thu, 18 Jun 2020 15:16:52 +0000 (10:16 -0500)]
LU-13693 lfs: avoid opening regular files for getstripe

In get_mds_md_size() just return a large enough size for all striping
attributes. This saves lfs getstripe from opening the file which was
interfering with leases used for mirroring.

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia45eb8f6aa942507a55965afccbc28375788dff2
Reviewed-on: https://review.whamcloud.com/38979
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
3 months agoLU-13690 mdd: remove warnings in obd_lookup() 71/38971/2
John L. Hammond [Wed, 17 Jun 2020 21:07:58 +0000 (16:07 -0500)]
LU-13690 mdd: remove warnings in obd_lookup()

In obf_lookup() remove CWARN()s about invalid FID formats. Return
-ENOENT instead of -EINVAL for invalid FIDs. This removes noisy
console messages around volatile file handling in .lustre/fid/.
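
A hedged sketch of the error-code choice, assuming names under .lustre/fid/ are FID strings of the form [0xSEQ:0xOID:0xVER]. The fid_sketch type and fid_name_parse() are hypothetical and do not reproduce obf_lookup().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

struct fid_sketch {
	unsigned long long seq;
	unsigned int oid;
	unsigned int ver;
};

/* Parse "[0xSEQ:0xOID:0xVER]".  A malformed name is reported quietly
 * as "no such entry" instead of -EINVAL plus a console warning. */
static int fid_name_parse(const char *name, struct fid_sketch *fid)
{
	unsigned long long seq;
	unsigned int oid, ver;
	int end = 0;

	assert(name != NULL && fid != NULL);
	/* %n is only stored if the closing ']' matched. */
	if (sscanf(name, "[0x%llx:0x%x:0x%x]%n", &seq, &oid, &ver, &end) != 3 ||
	    end == 0 || name[end] != '\0')
		return -ENOENT;

	fid->seq = seq;
	fid->oid = oid;
	fid->ver = ver;
	return 0;
}
```

With this shape a lookup of an arbitrary name in the directory looks like any other missing entry to the caller.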

Signed-off-by: John L. Hammond <jhammond@whamcloud.com>
Change-Id: Ia76f0da38baa86b8d1173cfe0ede52e275f68a28
Reviewed-on: https://review.whamcloud.com/38971
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12780 scrub: all update to bitfields must be protected. 74/38974/2
Mr NeilBrown [Thu, 18 Jun 2020 01:38:25 +0000 (11:38 +1000)]
LU-12780 scrub: all update to bitfields must be protected.

When a structure contains bitfields, these are updated by
a read-modify-write of the whole word.
If two of these updates can race, corruption can occur - updates can
be lost.

To avoid this, it is best to protect updates with a spinlock.
Many updates to the os_* bit fields are already protected by
->os_lock.  This patch adds the lock to the remaining updates.
In many cases, this only requires moving a 'lock' earlier, or an
'unlock' later.
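
The rule can be illustrated in userspace, with a pthread mutex standing in for the ->os_lock spinlock. The struct, the flag names and the helpers below are hypothetical. The sketch only shows that two writers of different bits in the same word must share one lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define SKETCH_LOOPS 100000u

struct scrub_sketch {
	pthread_mutex_t lock;		/* stands in for ->os_lock */
	unsigned int    paused:1,	/* both flags share one word */
			full_speed:1;
};

static void set_paused(struct scrub_sketch *s, unsigned int v)
{
	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	s->paused = v & 1;		/* read-modify-write of the word */
	pthread_mutex_unlock(&s->lock);
}

static void set_full_speed(struct scrub_sketch *s, unsigned int v)
{
	assert(s != NULL);
	pthread_mutex_lock(&s->lock);
	s->full_speed = v & 1;
	pthread_mutex_unlock(&s->lock);
}

static void *paused_writer(void *arg)
{
	for (unsigned int i = 0; i < SKETCH_LOOPS; i++)
		set_paused(arg, i);
	set_paused(arg, 1);		/* final value is 1 */
	return NULL;
}

static void *full_speed_writer(void *arg)
{
	for (unsigned int i = 0; i < SKETCH_LOOPS; i++)
		set_full_speed(arg, i);
	set_full_speed(arg, 1);		/* final value is 1 */
	return NULL;
}

/* Run both writers concurrently; returns 0 or a pthread error code. */
static int run_writers(struct scrub_sketch *s)
{
	pthread_t a, b;
	int rc;

	rc = pthread_create(&a, NULL, paused_writer, s);
	if (rc != 0)
		return rc;
	rc = pthread_create(&b, NULL, full_speed_writer, s);
	if (rc == 0)
		pthread_join(b, NULL);
	pthread_join(a, NULL);
	return rc;
}
```

Because every update holds the lock, neither writer can overwrite the other's bit with a stale copy of the word, so both flags end up set after run_writers() returns.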

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I2335d08cd53dcda98d8046d730829347456a6c5d
Reviewed-on: https://review.whamcloud.com/38974
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12905 tests: wrappers for createmany and unlinkmany 85/36585/15
Alex Zhuravlev [Fri, 29 May 2020 09:38:43 +0000 (12:38 +0300)]
LU-12905 tests: wrappers for createmany and unlinkmany

Add wrappers which set debug=0 if the number of operations is high
enough. This is to speed up testing.

According to Maloo, sanity in review-ldiskfs takes ~9min less.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I0e0a0ef6cf217ecddd1b780103d01e2109fc33d9
Reviewed-on: https://review.whamcloud.com/36585
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-10934 tests: fix compilation without SELINUX 73/38973/2
Mr NeilBrown [Wed, 17 Jun 2020 23:46:39 +0000 (09:46 +1000)]
LU-10934 tests: fix compilation without SELINUX

lustre/tests/statx.c does not compile if SELINUX support
is not installed.
So add some #ifdefs to fix it.

Test-Parameters: trivial
Fixes: 3f7853b31ef6 ("LU-10934 llite: integrate statx() API with Lustre")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ia0f285e1bb04270aff753bea71ffbe15a911db5f
Reviewed-on: https://review.whamcloud.com/38973
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
3 months agoLU-13657 kernel: kernel update RHEL8.2 [4.18.0-193.6.3.el8_2] 02/38902/2
Jian Yu [Thu, 11 Jun 2020 20:01:43 +0000 (13:01 -0700)]
LU-13657 kernel: kernel update RHEL8.2 [4.18.0-193.6.3.el8_2]

Update RHEL8.2 kernel to 4.18.0-193.6.3.el8_2.

Test-Parameters: trivial \
clientdistro=el8.2 serverdistro=el8.2 \
testlist=sanity

Change-Id: I7092768f227f93260f5a12c131154f1ec6fda1fd
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38902
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13662 lnet: handle undefined parameters 94/38894/2
Amir Shehata [Wed, 10 Jun 2020 22:27:23 +0000 (15:27 -0700)]
LU-13662 lnet: handle undefined parameters

If peer_tx_credits or peer_credits are 0, they should be
defaulted to the system defaults 8 and 256 respectively

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I351ff37cba0a9adaa1a6c25ff9c7da701724db2b
Reviewed-on: https://review.whamcloud.com/38894
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13597 ofd: add more information to job_stats 16/38816/7
Emoly Liu [Wed, 3 Jun 2020 10:48:00 +0000 (18:48 +0800)]
LU-13597 ofd: add more information to job_stats

Request processing times/latency and basic IO size information
is added to the job_stats output. This allows monitoring per-job
request processing performance.
Except for read_bytes and write_bytes, which are in bytes units,
all the others use "usecs" units and show min/max/sum values.
In addition, two new counters for read and write time are added
so that bandwidth can be calculated. The output format is like:
write_bytes: { samples: 1, unit: bytes, min: x, max: x, sum: x,
sumsq: x}

sanity.sh test_205b is modified to verify this patch.

Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I7a5b77ca0ba464f6330a4bc56735c7762e167019
Reviewed-on: https://review.whamcloud.com/38816
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13562 build: SUSE build support for azure, cray_ari_s 20/38620/2
Shaun Tancheff [Fri, 15 May 2020 14:21:22 +0000 (09:21 -0500)]
LU-13562 build: SUSE build support for azure, cray_ari_s

The lustre build for SUSE is hard coded for the default flavor
(aka kernel-default-devel). However, it is useful to be able to
build for other flavors so the resulting packages can resolve
package dependencies as well as follow the expected naming
convention.

Test-Parameters: trivial
HPE-bug-id: LUS-8554
Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: Id83e89ff4a2b9bf86b3f40c7a217440aa2b4fe94
Reviewed-on: https://review.whamcloud.com/38620
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@hpe.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13345 kernel: kernel update SLES12 SP4 [4.12.14-95.48.1] 47/38247/6
Jian Yu [Mon, 8 Jun 2020 22:33:10 +0000 (15:33 -0700)]
LU-13345 kernel: kernel update SLES12 SP4 [4.12.14-95.48.1]

Update SLES12 SP4 kernel to 4.12.14-95.48.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp4 \
env=LNET_SELFTEST_EXCEPT=smoke,SANITY_EXCEPT="56oc 817"

Change-Id: I1c6971001e813807d37cc177c9969bad78048cf0
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38247
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13437 mdt: don't fetch LOOKUP lock for remote object 61/38561/9
Lai Siyao [Sun, 10 May 2020 07:22:36 +0000 (15:22 +0800)]
LU-13437 mdt: don't fetch LOOKUP lock for remote object

Pack parent FID in getattr by FID, which will be used to check whether
child is remote object on parent. The helper function is called
mdt_is_remote_object(). NB, a directory shard is not treated as a remote
object, because otherwise the client would need to revalidate shards
whenever the dir is accessed, which would hurt performance badly.

For getattr by FID, if object is remote file on parent, don't fetch
LOOKUP lock, otherwise client may see stale dir entries.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id181ecc053579ee394080381a82706334503ced0
Reviewed-on: https://review.whamcloud.com/38561
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12214 build: fix build with gss enabled 30/36430/18
Alexey Lyashkov [Fri, 4 Oct 2019 13:04:50 +0000 (16:04 +0300)]
LU-12214 build: fix build with gss enabled

Provide the right dependencies for GSS-enabled Lustre.

Cray-bug-id: LUS-6033, LUS-7204
Test-parameters: trivial
Change-Id: Ib530a112f7f1629f7aea35cd4bad7c3f89e781ff
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-on: https://review.whamcloud.com/36430
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13675 o2iblnd: revert 'Timed out tx' patch 58/38958/4
Andreas Dilger [Wed, 17 Jun 2020 02:19:52 +0000 (02:19 +0000)]
LU-13675 o2iblnd: revert 'Timed out tx' patch

Revert "LU-1742 o2iblnd: 'Timed out tx' error message" patch
as this is causing crashes in o2iblnd consistently.

This reverts commit 7308662efc02fde077216f54728ecf278f31311b.

Test-Parameters: trivial
Change-Id: I470023f41eb1123de92aa3d86b32c7893363bc4e
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38958
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Shuichi Ihara <sihara@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 mdt: Fix style issues for mdt_identity.c 28/38928/2
Arshad Hussain [Tue, 2 Jun 2020 19:27:02 +0000 (00:57 +0530)]
LU-6142 mdt: Fix style issues for mdt_identity.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_identity.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ifb7f51ae4bdc8ab7dd816411a8017c379b298763
Reviewed-on: https://review.whamcloud.com/38928
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13617 llite: don't hold inode_lock for security notify 92/38792/3
Alexander Boyko [Mon, 1 Jun 2020 12:32:11 +0000 (08:32 -0400)]
LU-13617 llite: don't hold inode_lock for security notify

With SELinux enabled, the client can deadlock, which leads to
client eviction from the MDS.
thread 1                    thread 2
do file open                do stat
inode_lock(parent dir)
                            got LDLM_PR(parent dir)
enqueue LDLM_CW(parent dir) waits on inode_lock to notify security
waits
timeout on enqueue
and client eviction because client didn't cancel a LDLM_PR lock

security_inode_notifysecctx()->selinux_inode_notifysecctx()->
selinux_inode_setsecurity()
The call of selinux_inode_setsecurity doesn't need to hold
inode_lock.

Fixes: 1d44980bcb ("LU-8956 llite: set sec ctx on client's inode at create time")
Signed-off-by: Alexander Boyko <alexander.boyko@hpe.com>
Cray-bug-id: LUS-8924
Change-Id: I4727da45590734bde57bee9d378b61c30b5d515a
Reviewed-on: https://review.whamcloud.com/38792
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 utils: Fix style issues for lfs.c 07/38707/4
Arshad Hussain [Fri, 22 May 2020 18:58:14 +0000 (00:28 +0530)]
LU-6142 utils: Fix style issues for lfs.c

This patch fixes issues reported by checkpatch
for file lustre/utils/lfs.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Icc9ca0967c937e1fcd7b64f36f1e36f1a1f04f01
Reviewed-on: https://review.whamcloud.com/38707
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9812 spec: Fail rpmbuild if with servers but unconfigured 82/28282/9
Nathaniel Clark [Mon, 11 Mar 2019 18:10:30 +0000 (14:10 -0400)]
LU-9812 spec: Fail rpmbuild if with servers but unconfigured

Fail rpm build if "--with servers" used, but servers are not being
built after ./configure.  This would happen if --without ldiskfs but
zfs isn't found and thus configure turns off server support.

Test-Parameters: trivial
Change-Id: I57b07c7b7c5ad5bd73165238969ad3b9d2f3a5ab
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/28282
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 sec: ioctls to handle encryption policies 73/37673/22
Sebastien Buisson [Thu, 20 Feb 2020 14:53:22 +0000 (14:53 +0000)]
LU-12275 sec: ioctls to handle encryption policies

Introduce support for fscrypt IOCTLs that handle encryption
policies v2. It enables setting/getting encryption policies on
individual directories, letting users decide how they want to
encrypt specific directories.

fscrypt encryption policies v2 are supported from Linux 5.4.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0dc8c9ca1291ddd9c44617feb5df845b57d7dcc9
Reviewed-on: https://review.whamcloud.com/37673
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 tests: exercise file content encryption/decryption 91/36191/36
Sebastien Buisson [Mon, 16 Sep 2019 15:00:24 +0000 (15:00 +0000)]
LU-12275 tests: exercise file content encryption/decryption

Add new tests to sanity-sec in order to exercise file content
encryption/decryption. Also test encrypted file length, especially
when multiple threads are writing to the same file in parallel.

Test-Parameters: trivial
Test-Parameters: clientdistro=el8.1 testgroup=review-ldiskfs
Test-Parameters: clientdistro=el8.1 testgroup=review-ldiskfs-dne
Test-Parameters: clientdistro=el8.1 testgroup=review-ldiskfs-arm
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-selinux
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-part-1 env=SANITYN_EXCEPT=106
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-part-2 env=SANITY_PCC_EXCEPT="4"
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-part-3
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-part-4
Test-Parameters: clientdistro=el8.1 testgroup=review-zfs
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-zfs-part-1 env=SANITYN_EXCEPT=106
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-zfs-part-2 env=SANITY_PCC_EXCEPT="4"
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-zfs-part-3
Test-Parameters: clientdistro=el8.1 testgroup=review-dne-zfs-part-4
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48" clientdistro=el8.1 fstype=ldiskfs
Test-Parameters: testlist=sanity-sec envdefinitions=ONLY="36 37 38 39 40 41 42 43 44 45 46 47 48" clientdistro=el8.1 fstype=zfs
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I339e748e0213da980fe7779bb06ae9b3bd91bf5c
Reviewed-on: https://review.whamcloud.com/36191
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-5710 build: fix typo suggesting openssl-devel requirement 64/37564/2
Dominique Martinet [Thu, 13 Feb 2020 20:33:08 +0000 (21:33 +0100)]
LU-5710 build: fix typo suggesting openssl-devel requirement

Building without openssl-devel always prints a message suggesting to
install openssk-devel, which doesn't even exist. Fix typo.

Test-Parameters: trivial
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Change-Id: Id99b9d4dff7ed95aba30e4929a984878a7d13f0a
Reviewed-on: https://review.whamcloud.com/37564
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-9897 utils: have lfs.c use lstddef.h 21/38921/4
James Simmons [Fri, 12 Jun 2020 18:12:38 +0000 (14:12 -0400)]
LU-9897 utils: have lfs.c use lstddef.h

Instead of redefining ARRAY_SIZE in lfs.c we can use the macros
in lstddef.h
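
For reference, a common form of such a macro is sketched below. The exact definition in lstddef.h is not quoted in this message, so the macro body and the sketch_cmds table are assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Guarded so that a definition from a shared header wins. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#endif

/* Hypothetical table; only its element count matters here. */
static const char *const sketch_cmds[] = { "getstripe", "setstripe", "df" };

static_assert(ARRAY_SIZE(sketch_cmds) == 3, "element count, not byte size");

static size_t sketch_cmd_count(void)
{
	return ARRAY_SIZE(sketch_cmds);
}
```

Taking the macro from one shared header avoids a second, possibly diverging, definition in each utility.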

Test-Parameters: trivial
Change-Id: I33bca9773b609f1996ea66098edb67426273f801
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38921
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
3 months agoLU-9859 libcfs: move cfs_trace_data data to tracefile.c 14/38914/2
Mr NeilBrown [Wed, 10 Jun 2020 21:44:08 +0000 (17:44 -0400)]
LU-9859 libcfs: move cfs_trace_data data to tracefile.c

The macro cfs_tcd_for_each() is only used in tracefile.c so move
it from the header tracefile.h along with related material in
the header file.

Test-Parameters: trivial
Change-Id: I024dc0a4a1f5481cf3468c35e670096f29817c23
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38914
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9859 libcfs: remove cfs_trace_refill_stack() 13/38913/2
Mr NeilBrown [Wed, 10 Jun 2020 21:40:00 +0000 (17:40 -0400)]
LU-9859 libcfs: remove cfs_trace_refill_stack()

The function cfs_trace_refill_stack() is not used anywhere so
remove it.

Test-Parameters: trivial
Change-Id: Iade031c15a9bde091320c2fd2c66c1cd2951f649
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38913
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-9859 libcfs: Fix using smp_processor_id() in preemptible context 10/38810/3
James Simmons [Tue, 2 Jun 2020 16:48:28 +0000 (12:48 -0400)]
LU-9859 libcfs: Fix using smp_processor_id() in preemptible context

This warning shows up on kernels with preemption enabled:
BUG: using smp_processor_id() in preemptible [00000000] code: ...

Change it to disable preemption around smp_processor_id().

Change is a part of:
Linux-commit: 67bc8c33ec14f8290c6883a7d6237e213709561a

Change-Id: I41f7a1d3aa22240d3669f94ae92a192d219cca52
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38810
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13501 tests: Add tests for LNet health and resends 33/38633/4
Chris Horn [Sat, 16 May 2020 16:38:06 +0000 (11:38 -0500)]
LU-13501 tests: Add tests for LNet health and resends

Simulate all LNet health error statuses and validate that LNet health
modifies NI health values or attempts resends as appropriate for both
single-rail and multi-rail configurations.

Test-Parameters: trivial testlist=sanity-lnet
HPE-bug-id: LUS-8826
Signed-off-by: Chris Horn <chris.horn@hpe.com>
Change-Id: Ice705c073deefed00b20011dea5de834cf6f0984
Reviewed-on: https://review.whamcloud.com/38633
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13225 utils: fix install path for bash-completion 48/38548/3
Andreas Dilger [Fri, 8 May 2020 23:28:39 +0000 (17:28 -0600)]
LU-13225 utils: fix install path for bash-completion

Fix the default install path for bash-completion if the package is
not installed at build time.  This avoids BASH_COMPLETION_DIR being
badly formatted in the lustre.spec file.

Fixes: dfb4afc24102 ("LU-13225 utils: bash completion for lfs and lctl")
Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie50071c4ff86f57bc9dd53409ae339da2a3ebbe5
Reviewed-on: https://review.whamcloud.com/38548
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13501 lnet: Skip health and resends for single rail configs 48/38448/7
Chris Horn [Tue, 26 May 2020 16:31:26 +0000 (11:31 -0500)]
LU-13501 lnet: Skip health and resends for single rail configs

If the sender of a message only has a single interface it doesn't
make sense to have LNet track the health of that interface, nor
should it attempt to resend a message when it encounters a local
error. There aren't any alternative interfaces to use for a resend.

Similarly, we needn't track health values of a peer's NIs if the peer
only has a single interface. Nor do we need to attempt to resend
a message to a peer with a single interface. There's an exception for
routers. We rely on NI health to determine route aliveness, so even
if a router only has a single interface we still need to track its
health.

We can use the ln_ping_target to get the count of local NIs, and the
lnet_peer struct already contains a count of the number of peer NIs.

HPE-bug-id: LUS-8826
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Id89159a5d07c1668c1cbdfa9050535380f68d1f6
Reviewed-on: https://review.whamcloud.com/38448
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13344 osd-ldiskfs: timespec64 is broken 14/38314/6
Shaun Tancheff [Thu, 21 May 2020 15:30:29 +0000 (10:30 -0500)]
LU-13344 osd-ldiskfs: timespec64 is broken

Linux commit v5.5-rc1-6-gba70609d5ec6 removed timespec64_trunc,
which was being used to determine if inode times were timespec64.
Change this test to work with kernels without timespec64_trunc.

Linux-commit: ba70609d5ec664a8f36ba1c857fcd97a478adf79

Linux commit v5.4-rc3-21-g933f1c1e0b75 renamed h_buffer_credits
to h_total_credits

Add a configure test to determine and #define to handle this
change of name.
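
A renamed field of this kind is typically hidden behind one accessor, sketched here in userspace. HAVE_JOURNAL_TOTAL_CREDITS, handle_sketch and handle_credits() are hypothetical names chosen for this illustration. The macro actually defined by the configure test is not given in this message.

```c
#include <assert.h>
#include <stddef.h>

/*
 * HAVE_JOURNAL_TOTAL_CREDITS is a hypothetical macro that a configure
 * test would define when the kernel uses the new field name.
 */
struct handle_sketch {		/* userspace stand-in for the journal handle */
#ifdef HAVE_JOURNAL_TOTAL_CREDITS
	int h_total_credits;
#else
	int h_buffer_credits;
#endif
};

/* One accessor, so callers never spell either field name. */
#ifdef HAVE_JOURNAL_TOTAL_CREDITS
# define handle_credits(h)	((h)->h_total_credits)
#else
# define handle_credits(h)	((h)->h_buffer_credits)
#endif

static int handle_credits_get(const struct handle_sketch *h)
{
	assert(h != NULL);
	return handle_credits(h);
}
```

Callers then build unchanged against kernels from before and after the rename.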

Linux-commit: 933f1c1e0b75bbc29730eef07c9e196c6dfd37e5

Signed-off-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Change-Id: I112da3385e5f33cbee8aadfd3efdbb4b3b823819
Reviewed-on: https://review.whamcloud.com/38314
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13467 llite: truncate deadlock with DoM files 88/38288/3
Andriy Skulysh [Thu, 27 Feb 2020 21:15:41 +0000 (23:15 +0200)]
LU-13467 llite: truncate deadlock with DoM files

All MDT intent RPCs are sent with the inode mutex locked,
while read/write and setattr unlock the inode mutex on entry,
take the LDLM lock, lock the inode mutex again and send the RPC.
So a deadlock can occur since the LDLM lock is the same in case of DoM.

In fact read/write and setattr take lli_trunc_sem, so the
inode mutex can be omitted in the truncate case.

Replace inode_lock with new lli_setattr_mutex to keep protection
from concurrent setattr time updates.

HPE-bug-id: LUS-8455
Change-Id: Ie294154306cc3b6cff977a2dff485e8d44145ed9
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/38288
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 mdt: Fix style issues for mdt_recovery.c 32/38932/2
Arshad Hussain [Tue, 2 Jun 2020 18:53:23 +0000 (00:23 +0530)]
LU-6142 mdt: Fix style issues for mdt_recovery.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_recovery.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ib7a0795cf2d48c078c140aef8501a167fb24d74c
Reviewed-on: https://review.whamcloud.com/38932
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 lov: Fix style issues for lov_merge.c 30/38930/2
Arshad Hussain [Tue, 2 Jun 2020 19:52:12 +0000 (01:22 +0530)]
LU-6142 lov: Fix style issues for lov_merge.c

This patch fixes issues reported by checkpatch
for file lustre/lov/lov_merge.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I6fd60dbc7c48f3dc8fc2c41e924d8d088d6912f2
Reviewed-on: https://review.whamcloud.com/38930
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 llite: Fix style issues for vvp_page.c 29/38929/2
Arshad Hussain [Tue, 2 Jun 2020 20:34:36 +0000 (02:04 +0530)]
LU-6142 llite: Fix style issues for vvp_page.c

This patch fixes issues reported by checkpatch
for file lustre/llite/vvp_page.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I14faceb6d2e137cf1ca2eac66864eed87052b1fe
Reviewed-on: https://review.whamcloud.com/38929
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13649 mdd: orhpan cleanup fix 66/38866/2
Vitaly Fertman [Mon, 8 Jun 2020 20:24:12 +0000 (23:24 +0300)]
LU-13649 mdd: orhpan cleanup fix

Due to a race with mdd_close(), the objects may have already been
destroyed by close, and the 2nd destroy asserts on lu_object_is_dying()

The problem appeared in LU-12846 which removed the error handling
(ENOENT) returned by dt_delete - the entry was already removed from
the parent.

Fixes: 688d5da6a8 ("LU-12846 mdd: return error while delete failed")
HPE-bug-id: LUS-8864

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I7e2f3fca7b7d4440340fd3daaf8ec528010d9117
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Tested-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-on: https://review.whamcloud.com/38866
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12511 lov: use lov_pattern_support() to verify lmm 91/38791/5
James Simmons [Tue, 9 Jun 2020 22:39:06 +0000 (18:39 -0400)]
LU-12511 lov: use lov_pattern_support() to verify lmm

We can use lov_pattern_support(), which is used by the server
and userland code, to ensure lmm is valid instead of open coding.

Change-Id: I44051e6e2dba2f0b7e481572bb58d776724aecd8
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38791
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
3 months agoLU-9441 llite: bind kthread thread to accepted node set 30/38730/4
James Simmons [Wed, 27 May 2020 17:27:59 +0000 (13:27 -0400)]
LU-9441 llite: bind kthread thread to accepted node set

Bind both the agl and statahead kernel threads to a node that is
part of the CPT table that Lustre uses. This limits pollution of
the cache of HPC applications.

Change-Id: I1c29fb5dbbdb6a73dac0dc6c872a797c05eab1ad
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38730
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12785 dom: fix DoM component deletion code 37/38337/5
Mikhail Pershin [Thu, 23 Apr 2020 12:42:00 +0000 (15:42 +0300)]
LU-12785 dom: fix DoM component deletion code

The lod_erase_dom_stripe() deletes DoM entry from composite
layout upon file create if DoM is disabled on server.
That code works incorrectly if DoM is not the first component
in provided layout, e.g. in mirror.

Patch does correct DoM entry removal in generic case no matter
where it was placed in layout. Related test 270h is added into
sanity.sh

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ia1b3f25db16a7b59b83cd8f58ff44ddf082cab48
Reviewed-on: https://review.whamcloud.com/38337
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
3 months agoLU-13508 mdc: chlg device could be used after free 58/38658/4
Hongchao Zhang [Tue, 19 May 2020 16:21:41 +0000 (00:21 +0800)]
LU-13508 mdc: chlg device could be used after free

There are some issues with how the changelog in MDC uses dynamic
devices, which could cause a device to be used after it is freed.

Change-Id: Iacf6fa7c8b612f1a373091cf88e7082c4860cfe4
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38658
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13628 tests: replace btime with crtime for statx test 80/38880/5
Qian Yingjin [Tue, 9 Jun 2020 15:00:07 +0000 (23:00 +0800)]
LU-13628 tests: replace btime with crtime for statx test

Test sanityn/106a failed due to wrongly using 'btime' to filter
the debugfs output for file creation time, which should be
'crtime'.

This patch also replaces '-c %q' with '-c %p' in sanityn/106c to
get the statx 'stx_attributes_mask': Mask to show what's supported
in 'stx_attributes'.

Test-Parameters: trivial clientdistro=el8
Test-Parameters: trivial clientdistro=ubuntu1804
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: Ia8273e02d4ebe7f1e9e5d6973e691c82e0524fb2
Reviewed-on: https://review.whamcloud.com/38880
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
3 months agoLU-13600 ptlrpc: limit rate of lock replays 20/38920/3
Mikhail Pershin [Fri, 12 Jun 2020 14:14:50 +0000 (17:14 +0300)]
LU-13600 ptlrpc: limit rate of lock replays

Clients send all lock replays at once, which may overwhelm the
server with a huge number of replays in the recovery queue and
cause OOM effects.

The patch adds rate control for lock replays on the client.

Signed-off-by: Mikhail Pershin <mpershin@whamcloud.com>
Change-Id: Ie557f8481c5facb690468d7136cf5feebe4e8f11
Reviewed-on: https://review.whamcloud.com/38920
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13580 tests: fix retrieval of SELinux context 48/38648/6
Sebastien Buisson [Mon, 18 May 2020 09:43:22 +0000 (11:43 +0200)]
LU-13580 tests: fix retrieval of SELinux context

Use 'stat' command instead of 'ls -lZ' to retrieve SELinux security
context, to make it more portable.

Test-Parameters: trivial clientselinux testlist=sanity-selinux mdtcount=2 clientcount=2
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I61bc0efb1e8ae0427d05827e2933eb0b848fb442
Reviewed-on: https://review.whamcloud.com/38648
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 sec: support truncate for encrypted files 94/37794/15
Sebastien Buisson [Thu, 20 Feb 2020 14:45:07 +0000 (14:45 +0000)]
LU-12275 sec: support truncate for encrypted files

Truncation of encrypted files is not a trivial operation. The page
corresponding to the point where truncation occurs must be read,
decrypted, zeroed after truncation point, re-encrypted and then
written back.
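
The decrypt/zero/re-encrypt step can be sketched in userspace as
follows. This is only an illustration, not the code from the patch:
toy_crypt_page() is a hypothetical XOR placeholder standing in for
fscrypt, SKETCH_PAGE_SIZE is an assumed page size, and reading the
page in and writing it back are left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/* Toy XOR "cipher", a placeholder only: real code would go through fscrypt.
 * XOR is its own inverse, so the same routine encrypts and decrypts. */
static void toy_crypt_page(unsigned char *page, unsigned char key)
{
	for (size_t i = 0; i < SKETCH_PAGE_SIZE; i++)
		page[i] ^= key;
}

/* Truncate inside one encrypted page: decrypt it, zero everything from the
 * truncation offset to the end of the page, then re-encrypt it in place. */
static void truncate_encrypted_page_sketch(unsigned char *page,
					    size_t offset_in_page,
					    unsigned char key)
{
	assert(page != NULL);
	assert(offset_in_page <= SKETCH_PAGE_SIZE);

	toy_crypt_page(page, key);			/* decrypt */
	memset(page + offset_in_page, 0,
	       SKETCH_PAGE_SIZE - offset_in_page);	/* zero the tail */
	toy_crypt_page(page, key);			/* re-encrypt */
}
```

Zeroing the stored ciphertext directly would not work, because the
encrypted form of a zeroed tail is not itself zero.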

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I834f9372913d7051b1e0821515d3fea0873ffd78
Reviewed-on: https://review.whamcloud.com/37794
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 sec: deal with encrypted object size 46/36146/28
Sebastien Buisson [Fri, 11 Oct 2019 08:40:37 +0000 (08:40 +0000)]
LU-12275 sec: deal with encrypted object size

The problem with the size of an encrypted file comes from the fact
that an encrypted page always contains PAGE_SIZE bytes of data,
even if the clear text page is only a few bytes, and the server
infers the object size from the content of the encrypted page.

This is addressed as follows. Upon writing, when the client
encrypts the page representing the end of the file, it puts the
size of the clear text version of the file into o_size in the
request's body. On the server side, this information is used to
adjust the isize of the object, while still storing the complete
pages on disk.
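
The gap between the two sizes is plain arithmetic, sketched below.
The helper names and SKETCH_PAGE_SIZE are hypothetical and chosen for
illustration only; they are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for the sketch */

/* Bytes actually stored for an encrypted object: whole pages only, so the
 * clear text size is rounded up to a multiple of the page size. */
static uint64_t stored_bytes_sketch(uint64_t clear_size)
{
	return (clear_size + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE *
	       SKETCH_PAGE_SIZE;
}

/* Padding kept on disk beyond the clear text end of file. This is the
 * amount by which a size inferred from stored pages would be too large. */
static uint64_t padding_bytes_sketch(uint64_t clear_size)
{
	uint64_t stored = stored_bytes_sketch(clear_size);

	assert(stored >= clear_size);
	return stored - clear_size;
}
```

For a 5-byte clear text file, one full page is stored, so a size taken
from the stored data alone would be off by 4091 bytes.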

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ia83424123da26920ba0e0dfb354f54b1fa0ccfbb
Reviewed-on: https://review.whamcloud.com/36146
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 sec: decryption for read path 45/36145/28
Sebastien Buisson [Thu, 22 Aug 2019 08:48:19 +0000 (08:48 +0000)]
LU-12275 sec: decryption for read path

With the support for encryption, all files need to be opened with
fscrypt_file_open(). fscrypt will retrieve encryption context if
file is encrypted, or immediately return if not.
Decryption itself is carried out in osc_brw_fini_request(), right
after the reply has been received from the server.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8f8f87eb8e07e35e1a4e6cc157ceddfef6934753
Reviewed-on: https://review.whamcloud.com/36145
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 sec: encryption for write path 44/36144/27
Sebastien Buisson [Wed, 17 Jul 2019 14:24:26 +0000 (14:24 +0000)]
LU-12275 sec: encryption for write path

First aspect is to make sure encryption context is properly set on
files/dirs that are created or opened/looked up.
Then encryption itself is carried out in osc_brw_prep_request(), just
before pages are added to the request to be sent. Because pages in
the page cache must hold clear text data, we have to use bounce pages
for encryption. The allocation is handled by fscrypt, and for
deallocation we call fscrypt_pullback_bio_page() and/or
fscrypt_pullback_bio_page().

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ieb1355bd55b6a8740e4b549d60d1f480a5abc53f
Reviewed-on: https://review.whamcloud.com/36144
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13408 target: update in-memory per client data 55/38855/4
Lai Siyao [Sat, 6 Jun 2020 20:00:00 +0000 (04:00 +0800)]
LU-13408 target: update in-memory per client data

Some clients don't support recovery:
1. lightweight clients.
2. local clients on MDS which doesn't support "local_recovery".
3. OFD connect may cause transaction before export has valid
   last_rcvd slot.

Though such clients don't store per client data on disk, they
still need to update in memory per client data to allow reply
reconstruct and track saved LDLM locks (both local and remote) be
tracked by transaction number.

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id0082358e7720e5ef61f366682ae91282bd66d6d
Reviewed-on: https://review.whamcloud.com/38855
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9325 mdt: replace simple_strtol() with kstrtol() 46/38846/3
James Simmons [Fri, 5 Jun 2020 12:55:55 +0000 (08:55 -0400)]
LU-9325 mdt: replace simple_strtol() with kstrtol()

Someday simple_strtol() will go away. The simple_strtol() call in
mdt_init0() is very simple so we can easily replace it with
kstrtol().
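
For readers unfamiliar with the difference, here is a userspace sketch
of the stricter kstrtol()-style contract: the result is returned
through a pointer, the return value is 0 or a negative errno, and
trailing characters are rejected rather than silently ignored.
strict_strtol_sketch() is a hypothetical helper built on strtol(), not
the kernel implementation; unlike kstrtol(), strtol() also skips
leading whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for kstrtol(): returns 0 on success or a
 * negative errno, and rejects empty input, trailing characters and overflow. */
static int strict_strtol_sketch(const char *s, int base, long *res)
{
	char *end;
	long val;

	assert(s != NULL && res != NULL);

	errno = 0;
	val = strtol(s, &end, base);
	if (end == s)
		return -EINVAL;		/* no digits at all */
	if (errno == ERANGE)
		return -ERANGE;		/* value does not fit in a long */
	if (*end == '\n')
		end++;			/* kstrtol() tolerates one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage, e.g. "12abc" */
	*res = val;
	return 0;
}
```

A simple_strtol()-style parser would accept "12abc" and yield 12; the
strict form reports the malformed input to the caller instead.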

Change-Id: I37485735f0f42aa5c2c5b9fd361e4fdfa54dc8e5
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38846
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
3 months agoLU-13635 lfs: add -D option back to lfs_migrate 40/38840/3
Emoly Liu [Fri, 5 Jun 2020 03:39:37 +0000 (11:39 +0800)]
LU-13635 lfs: add -D option back to lfs_migrate

Enable "-D" option with its long option "--non-direct" correctly
in lfs_migrate.
sanity.sh test_56we is added to verify this patch.

Test-Parameters: trivial
Signed-off-by: Emoly Liu <emoly@whamcloud.com>
Change-Id: I6ab051c0f2e0cde9de6a5b8ace8962cc293e7656
Reviewed-on: https://review.whamcloud.com/38840
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13604 doc: update e2fsprogs to 1.45.6.wc1 57/38757/2
Li Dongyang [Fri, 29 May 2020 02:38:19 +0000 (12:38 +1000)]
LU-13604 doc: update e2fsprogs to 1.45.6.wc1

Update the recommended e2fsprogs version to 1.45.6.wc1

Change-Id: I1e3d05207da954e6d7b9204fc4ed3329486f80dd
Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-on: https://review.whamcloud.com/38757
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
3 months agoLU-8130 obd: convert obd_nid_hash to rhashtable 18/33518/21
James Simmons [Mon, 18 May 2020 22:10:10 +0000 (18:10 -0400)]
LU-8130 obd: convert obd_nid_hash to rhashtable

Linux has a resizeable hashtable implementation in lib,
so we should use that instead of having one in libcfs.

This patch converts the struct obd_export obd_nid_hash to use
rhashtable. In the process we gain lockless lookup which should
improve performance. For the nid hash we use rhltable since the
mapping can be many exports to a NID key.

Signed-off-by: James Simmons <jsimmons@infradead.org>
Change-Id: I45154ceb48336b20161f771d986d8fe7333b9849
Reviewed-on: https://review.whamcloud.com/33518
Reviewed-by: Neil Brown <neilb@suse.de>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12511 utils: Move utilities specific values out of Lustre UAPI headers 90/38790/3
James Simmons [Fri, 5 Jun 2020 02:13:38 +0000 (22:13 -0400)]
LU-12511 utils: Move utilities specific values out of Lustre UAPI headers

Use FS_IOC_FS[S|G]ETXATTR directly. Move several things in the
UAPI header lustre_user.h that are only needed by userland tools
to the proper places.

Change-Id: Ie7a33742c0aba478c365c5fa44315400b28d8193
Test-Parameters: trivial
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38790
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
3 months agoLU-6142 mdt: Fix style issues for mdt_reint.c 86/38786/3
Arshad Hussain [Fri, 29 May 2020 13:45:47 +0000 (19:15 +0530)]
LU-6142 mdt: Fix style issues for mdt_reint.c

This patch fixes issues reported by checkpatch
for file lustre/mdt/mdt_reint.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I2afd2e127c03c7f021da24ac8b9b00a059a07f0b
Reviewed-on: https://review.whamcloud.com/38786
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <beevans@whamcloud.com>
Reviewed-by: Petros Koutoupis <petros.koutoupis@hpe.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12511 build: ignore kmod handling in spec file for 49/38649/6
James Simmons [Thu, 28 May 2020 12:21:59 +0000 (08:21 -0400)]
LU-12511 build: ignore kmod handling in spec file for
 utilities only build

The lustre spec file handles kmod even when --disable-modules is
used. We don't need to manage any kmod in this case, so let's do
that handling only when ${with lustre_modules} is true.

Test-Parameters: trivial
Change-Id: Ifa43720aacabae5f41abf250d2e03b235c34cb4c
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38649
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13181 o2ib: fix page mapping error 88/37388/10
Alexey Lyashkov [Mon, 8 Jun 2020 00:27:18 +0000 (20:27 -0400)]
LU-13181 o2ib: fix page mapping error

IB DMA mapping can merge a physically contiguous page region into a
single one.
This confuses the kiblnd_fmr_pool_map() function, which expects to see
all fragments mapped, and it generates the error
 (o2iblnd.c:1926:kiblnd_fmr_pool_map()) Failed to map mr 1/16 elements

From studying the IB code, it looks like the ib_map_mr_sg() return code
should be checked against the result of ib_dma_map_sg() instead of the
original fragment count, and the same value should be passed as the
argument to ib_map_mr_sg().
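
The counting issue can be modelled in userspace as below. This is a
sketch under stated assumptions, not the o2iblnd code: frag_sketch,
dma_map_sketch() and map_mr_sketch() are hypothetical stand-ins for
the scatterlist, the DMA mapping step and the MR mapping step, and the
merge rule is simplified to strict physical adjacency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment: a physical address range. */
struct frag_sketch {
	unsigned long addr;
	unsigned long len;
};

/* Stand-in for the DMA mapping step: merges physically contiguous
 * fragments in place and returns how many entries remain. */
static int dma_map_sketch(struct frag_sketch *f, int nfrags)
{
	int out = 0;

	assert(f != NULL && nfrags > 0);
	for (int i = 1; i < nfrags; i++) {
		if (f[out].addr + f[out].len == f[i].addr)
			f[out].len += f[i].len;	/* contiguous: merge */
		else
			f[++out] = f[i];	/* gap: start a new entry */
	}
	return out + 1;
}

/* Stand-in for the MR mapping step: maps every entry it is given. */
static int map_mr_sketch(const struct frag_sketch *f, int nents)
{
	(void)f;
	return nents;
}

/* Returns 0 on success, -1 on a short mapping. */
static int map_fragments_sketch(struct frag_sketch *f, int nfrags)
{
	int dma_nents = dma_map_sketch(f, nfrags);
	int mapped = map_mr_sketch(f, dma_nents);	/* pass the DMA count */

	/* Compare against dma_nents, not nfrags: after merging, fewer
	 * entries than original fragments is the expected outcome. */
	return mapped == dma_nents ? 0 : -1;
}
```

With 16 adjacent 4 KiB fragments the DMA step yields one entry, which
matches the "1/16" in the quoted error: comparing the mapped count with
the original 16 would flag a failure that is not one.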

Test-Parameters: trivial
Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I3b845ae54d8659d4045921f519effcf0a4428e49
Reviewed-on: https://review.whamcloud.com/37388
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-10157 lnet: restore a maximal fragments count 85/37385/11
Alexey Lyashkov [Mon, 20 Apr 2020 18:42:50 +0000 (21:42 +0300)]
LU-10157 lnet: restore a maximal fragments count

Lowering the number of fragments blocks connections from older clients
that want to use 256 fragments for a transfer. Let's restore this
number to the original value.

Fixes: 272e49ce2d5d ("LU-10157 lnet: make LNET_MAX_IOV dependent on page size")

Test-Parameters: trivial testlist=lnet-selftest,sanity-lnet
Cray-bug-id: LUS-8139
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: Ia23aa1fb3d36a65abab6241c9ba75addc1dcce0a
Reviewed-on: https://review.whamcloud.com/37385
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Serguei Smirnov <ssmirnov@whamcloud.com>
Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-11025 mdt: remove unused code 01/38801/2
Lai Siyao [Mon, 1 Jun 2020 20:17:17 +0000 (04:17 +0800)]
LU-11025 mdt: remove unused code

Remove obsolete code in dir_split_count_store() that was left over
from a code rebase.

Test-parameters: trivial

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Id72385307623c7f281ea855e4c02fe110f1ed235
Reviewed-on: https://review.whamcloud.com/38801
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13488 kernel: RHEL 8.2 server support 40/38440/11
Jian Yu [Sat, 6 Jun 2020 18:29:27 +0000 (11:29 -0700)]
LU-13488 kernel: RHEL 8.2 server support

This patch makes changes to support RHEL 8.2 release with
kernel 4.18.0-193.1.2.el8 for Lustre server.

Test-Parameters: trivial \
clientdistro=el8.2 serverdistro=el8.2 \
env=SANITY_EXCEPT="130 133h" testlist=sanity

Change-Id: I350da46e1e2ff32945ef0f7106e6642821fc9ecf
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/38440
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-1742 o2iblnd: 'Timed out tx' error message 35/33235/9
Sonia Sharma [Thu, 6 Sep 2018 03:39:23 +0000 (23:39 -0400)]
LU-1742 o2iblnd: 'Timed out tx' error message

Fix the error message in kiblnd_check_txs_locked()
to report the total RDMA time outstanding rather
than the number of seconds past the deadline.

This patch also adds time_on_activeq to struct kib_tx
so the time spent by tx in internal queue and active
queue can be tracked and reported. This would help
in diagnosing the issue.

Change-Id: I4e486389220e383af88dbc482646e92a85bd5b14
Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/33235
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Stephen Champion <stephen.champion@hpe.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-9897 build: add binaries to .gitignore 25/38825/3
James Simmons [Wed, 3 Jun 2020 11:32:03 +0000 (07:32 -0400)]
LU-9897 build: add binaries to .gitignore

Several binaries are built that show up with git status.
Add them to the .gitignore file

Test-Parameters: trivial
Change-Id: I7eb38d8fe725408dffaa71eb4db2d0305721367b
Signed-off-by: James Simmons <jsimmons@infradead.org>
Reviewed-on: https://review.whamcloud.com/38825
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
3 months agoLU-9859 libcfs: merge linux-tracefile.c into tracefile.c 04/38804/3
Mr NeilBrown [Tue, 2 Jun 2020 12:31:35 +0000 (08:31 -0400)]
LU-9859 libcfs: merge linux-tracefile.c into tracefile.c

It's good to keep related code together.

Test-Parameters: trivial
Change-Id: I7708114c16b180c0f2f0e280447cd6fa4859792e
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Reviewed-on: https://review.whamcloud.com/38804
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13559 utils: fix lfs mirror delete error message 09/38609/8
Kévin Baillergeau [Fri, 15 May 2020 01:22:19 +0000 (01:22 +0000)]
LU-13559 utils: fix lfs mirror delete error message

Add different error messages depending on the option used.
Add a mirror_id variable instead of reusing the id variable
to store the result of mirror_id_of(id).

Signed-off-by: Kévin Baillergeau <kevin.baillergeau.ocre@cea.fr>
Change-Id: I5fbd307c4132c22d54470f2a1407074efe8bbc0a
Reviewed-on: https://review.whamcloud.com/38609
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Dominique Martinet <dominique.martinet@cea.fr>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
3 months agoLU-13503 mdc: allow setting max_mod_rpcs_in_flight larger 55/38455/10
Andreas Dilger [Wed, 27 May 2020 19:22:12 +0000 (12:22 -0700)]
LU-13503 mdc: allow setting max_mod_rpcs_in_flight larger

Allow setting mdc.*.max_mod_rpcs_in_flight > mdc.*.max_rpcs_in_flight
by increasing the latter value, rather than returning an error and
telling the user to do that.  This matches the similar behavior if
mdc.*.max_rpcs_in_flight is reduced lower than max_mod_rpcs_in_flight.

If there are multiple MDTs, the "mdc.*.max_mod_rpcs_in_flight" param
may be set from e.g. the MDT0000 config log before MDT0001 is fully
configured, catching MDT0001 with ocd_maxmodrpcs = 0 before the OCD
from the MDT has been filled in, and incorrectly trigger an error.
If seen during setup, allow ocd_maxmodrpcs = (max_rpcs_in_flight - 1),
since this will be fixed up later if mdc.*.max_rpcs_in_flight is set
smaller in the config log (if set larger it doesn't matter).
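
The adjustment described in the first paragraph can be sketched as
follows. The struct and function names are hypothetical, and the sketch
assumes the invariant max_mod_rpcs < max_rpcs, which is inferred from
the "(max_rpcs_in_flight - 1)" wording above rather than taken from
the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-client tunables for the sketch. */
struct rpc_limits_sketch {
	unsigned int max_rpcs;		/* all RPCs in flight */
	unsigned int max_mod_rpcs;	/* modifying RPCs in flight */
};

/* Set the modify limit; grow the overall limit when needed instead of
 * rejecting the request. */
static void set_max_mod_rpcs_sketch(struct rpc_limits_sketch *l,
				    unsigned int new_mod)
{
	assert(l != NULL && new_mod > 0);
	if (new_mod >= l->max_rpcs)
		l->max_rpcs = new_mod + 1;
	l->max_mod_rpcs = new_mod;
}

/* The mirror case: shrinking the overall limit pulls the modify limit
 * down with it. */
static void set_max_rpcs_sketch(struct rpc_limits_sketch *l,
				unsigned int new_max)
{
	assert(l != NULL && new_max > 1);
	if (l->max_mod_rpcs >= new_max)
		l->max_mod_rpcs = new_max - 1;
	l->max_rpcs = new_max;
}
```

Either setter leaves the pair consistent, so the order in which the two
parameters arrive from a config log does not matter in this model.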

Test-Parameters: env=ONLY=90 testlist=conf-sanity

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Change-Id: I4b20163e9e212db451738169ebdc361ab8c1c15e
Reviewed-on: https://review.whamcloud.com/38455
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-11963 obd: Rename OS_STATE flags to OS_STATFS 89/34289/7
Patrick Farrell [Wed, 27 Feb 2019 21:31:11 +0000 (16:31 -0500)]
LU-11963 obd: Rename OS_STATE flags to OS_STATFS

The statfs state flags are oddly named "OS_STATE_[STATE]"
Rename them to "OS_STATFS_[STATE]" to make their role clearer
and make them easier to find.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3f43b3e73155d9fbd8b3e0fa52e7f4d26b9d2f89
Reviewed-on: https://review.whamcloud.com/34289
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <farr0186@gmail.com>
3 months agoLU-10391 lnet: fix uninitialized var in choose_ipv4_src() 23/38823/2
Mr NeilBrown [Wed, 3 Jun 2020 22:57:31 +0000 (08:57 +1000)]
LU-10391 lnet: fix uninitialized var in choose_ipv4_src()

choose_ipv4_src() tests "*ret" without initializing it - and callers do
not (and should not) initialize the var.

Instead of testing "*ret", test "err" - if this is non-zero (it will
be -ENOENT) we want to use the address.  If it is zero, then we only
use the address if it is on the right subnet.
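
The selection rule can be sketched in userspace like this. The struct
and function names are hypothetical and the interface list is modelled
as a plain array; only the "test err, not *ret" logic is meant to
mirror the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface address entry (host byte order). */
struct ifaddr_sketch {
	uint32_t local;
	uint32_t mask;
};

/* Pick a source address for dst from addrs[0..n). *ret is written only
 * when an address is chosen and is never read, so callers need not
 * initialize it. Returns 0 on success or -ENOENT if nothing was found. */
static int choose_src_sketch(uint32_t *ret, const struct ifaddr_sketch *addrs,
			     int n, uint32_t dst)
{
	int err = -ENOENT;

	assert(ret != NULL);
	for (int i = 0; i < n; i++) {
		/* err != 0: nothing chosen yet, so take any address.
		 * err == 0: replace only with one on dst's subnet. */
		if (err || ((dst ^ addrs[i].local) & addrs[i].mask) == 0) {
			*ret = addrs[i].local;
			err = 0;
		}
	}
	return err;
}
```

Because the decision depends on err alone, the previous contents of
*ret never influence which address is picked.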

Test-Parameters: trivial
Reported-by: Amir Shehata <ashehata@whamcloud.com>
Fixes: d720fbaadad9 ("LU-10391 socklnd: use interface index to track local addr")
Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: I9b83207b790db07c06be1ee1c534a0fc63eb9ffa
Reviewed-on: https://review.whamcloud.com/38823
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Aurelien Degremont <degremoa@amazon.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13195 osp: invalidate object on write error 87/38387/4
Alex Zhuravlev [Mon, 27 Apr 2020 07:24:33 +0000 (10:24 +0300)]
LU-13195 osp: invalidate object on write error

do this unconditionally, to avoid cases when the object is
on another request's invalidation list.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8ee0c484e695e88c0ea6fb13ac377fa689150780
Reviewed-on: https://review.whamcloud.com/38387
Reviewed-by: Alexander Boyko <alexander.boyko@hpe.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-6142 obdclass: convert calls to container_of0() 81/38381/2
Mr NeilBrown [Mon, 27 Apr 2020 05:28:23 +0000 (15:28 +1000)]
LU-6142 obdclass: convert calls to container_of0()

Most calls to container_of0() in lustre/obdclass can be safely changed
to container_of(), either because the pointer passed in is obviously
not NULL (or error) from the context, or because the pointer returned
is dereferenced without any checks.

The only exceptions are simple wrappers like dt2ls_dev(), lu2ls_obj(),
scrub_obj2dev() where there is no context, so it is safest to convert
to container_of_safe() instead.
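
The distinction between the two conversions can be illustrated with a
simplified userspace restatement of the macros. All names below are
hypothetical, and the sketch models only the NULL case: the kernel's
container_of_safe() also passes error pointers through unchanged,
which is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace versions of the two macros, for illustration. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define container_of_safe_sketch(ptr, type, member) \
	((ptr) ? container_of_sketch(ptr, type, member) : (type *)NULL)

/* Hypothetical types standing in for an embedding object and its base. */
struct base_dev {
	int bd_id;
};

struct outer_dev {
	long od_flags;
	struct base_dev od_base;	/* embedded at a non-zero offset */
};

/* Caller context guarantees a valid pointer: plain conversion is enough. */
static struct outer_dev *base2outer_nonnull_sketch(struct base_dev *b)
{
	assert(b != NULL);
	return container_of_sketch(b, struct outer_dev, od_base);
}

/* Wrapper with no caller context: the argument may be NULL, so the
 * NULL-preserving variant is the safe choice. */
static struct outer_dev *base2outer_sketch(struct base_dev *b)
{
	return container_of_safe_sketch(b, struct outer_dev, od_base);
}
```

Applying the plain macro to NULL would produce a bogus non-NULL pointer
(NULL minus the member offset), which is why context-free wrappers keep
the checking form.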

Signed-off-by: Mr NeilBrown <neilb@suse.de>
Change-Id: Ice1063f3ccb74eaec575bff85c960f3288be5ef5
Reviewed-on: https://review.whamcloud.com/38381
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-12275 sec: control client side encryption 33/36433/19
Sebastien Buisson [Fri, 11 Oct 2019 08:34:02 +0000 (08:34 +0000)]
LU-12275 sec: control client side encryption

Client enables encryption by default. However, this should be
possible only if server side is encryption aware.
Moreover, we want to give the ability to decide which clients can
make use of encryption, by extending the nodemap mechanism with a
new 'forbid_encryption' property, set to 0 by default.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I765e5ce555e8277319c03c770cb6e6ac73cfc9e8
Reviewed-on: https://review.whamcloud.com/36433
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: John L. Hammond <jhammond@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
3 months agoLU-13628 tests: add sanityn test_106 to ALWAYS_EXCEPT 15/38815/2
Sebastien Buisson [Wed, 3 Jun 2020 09:31:20 +0000 (11:31 +0200)]
LU-13628 tests: add sanityn test_106 to ALWAYS_EXCEPT

sanityn test_106 fails on CentOS 8 and Ubuntu 18, and is skipped
on all other distros because of lack of support for statx.

Test-Parameters: trivial
Test-Parameters: clientdistro=el8.1 testlist=sanityn
Test-Parameters: clientdistro=ubuntu1804 testlist=sanityn
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I370585138bbd05d1e4ea8f323c74659145fe7dec
Reviewed-on: https://review.whamcloud.com/38815
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>