fs/lustre-release.git
13 months ago LU-12561 kernel: Remove RHEL6 series and targets 45/35545/4
Patrick Farrell [Wed, 17 Jul 2019 22:40:15 +0000 (18:40 -0400)]
LU-12561 kernel: Remove RHEL6 series and targets

Remove the RHEL6 series and target files.

Also remove the RHEL5 (wow!) targets, and the outdated
Fedora (fc) targets.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I00fc47cac656bc3b6f220f3994f0a25ed73879f9
Reviewed-on: https://review.whamcloud.com/35545
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months ago LU-12561 kernel: Remove RHEL6 kernel configs 44/35544/3
Patrick Farrell [Wed, 17 Jul 2019 22:33:08 +0000 (18:33 -0400)]
LU-12561 kernel: Remove RHEL6 kernel configs

First in a series of patches to remove RHEL6 support, this
removes the kernel configs.

This should be a build only change, so trivial testing
should be OK.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I0b68754f8921b82e8d0c5eaa321d58187da49a70
Reviewed-on: https://review.whamcloud.com/35544
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months ago LU-12561 kernel: Remove RHEL6 from which_patch 43/35543/3
Patrick Farrell [Wed, 17 Jul 2019 22:42:13 +0000 (18:42 -0400)]
LU-12561 kernel: Remove RHEL6 from which_patch

RHEL6 is not supported any more, and no longer builds, so
remove it from which_patch.

Test-Parameters: trivial
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id71e41ea76e96542cfa090e9a72bca024353601f
Reviewed-on: https://review.whamcloud.com/35543
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
13 months ago LU-12296 llite: improve ll_dom_lock_cancel 58/34858/7
Vladimir Saveliev [Wed, 5 Jun 2019 01:46:42 +0000 (04:46 +0300)]
LU-12296 llite: improve ll_dom_lock_cancel

ll_dom_lock_cancel() should zero kms attribute similar to
mdc_ldlm_blocking_ast0().

In order to avoid code duplication between mdc_ldlm_blocking_ast0()
and ll_dom_lock_cancel() - add cl_object_operations method to be able
to reach mdc's blocking ast from llite level.

Test illustrating the issue is added.

Cray-bug-id: LUS-7118
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I2b100ead6d420dbf561bc61be973d64dad317214
Reviewed-on: https://review.whamcloud.com/34858
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months ago LU-12516 mdd: support for volatile creation in .lustre 58/35258/5
Alex Zhuravlev [Tue, 18 Jun 2019 09:18:27 +0000 (13:18 +0400)]
LU-12516 mdd: support for volatile creation in .lustre

this is useful to enable striping manipulation by FIDs.

Change-Id: I4d5b1b13acdfef21ac46bf3557e9ab6d5ccc796b
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35258
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months ago LU-12484 tests: correct typo in test-framework::is_project_quota_supported() 62/35362/5
Oleg Drokin [Fri, 28 Jun 2019 15:57:07 +0000 (11:57 -0400)]
LU-12484 tests: correct typo in test-framework::is_project_quota_supported()

do_facet was misspelled as do_fact on the zfs side.

Change-Id: Idc112802a6817e5128799cb6059040c6b3021791
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Fixes: f172b116885 (LU-10092 llite: Add persistent cache on client)
Reviewed-on: https://review.whamcloud.com/35362
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
13 months ago LU-12489 tests: Recategorize some "Slow" tests as normal. 67/35367/3
Oleg Drokin [Fri, 28 Jun 2019 18:11:40 +0000 (14:11 -0400)]
LU-12489 tests: Recategorize some "Slow" tests as normal.

replay-single 44b is really fast now
replay-ost-single 8[ab] both take under a minute in my testing
ost-pools 23b only takes a minute now, and test 18 takes under
          10 minutes, too little to bother excluding it.

Change-Id: I57a35c39cadd1728c38d332d83cbe50da6d6a8fe
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35367
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
13 months ago LU-4423 obdclass: use list_sort() to sort a list. 12/35512/4
NeilBrown [Mon, 15 Jul 2019 01:11:56 +0000 (21:11 -0400)]
LU-4423 obdclass: use list_sort() to sort a list.

Rather than a bespoke bubble-sort, use list_sort() to
sort this linked list.

As this would become a 1-line function that is only called once,
call list_sort() directly from the one call site.
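
The gain from replacing a bubble sort with a merge-based sort can be sketched
in userspace. This is only an illustration under assumptions: the node type,
sort_list() and cmp_key() are hypothetical stand-ins, not the kernel's
list_sort() or struct list_head. The shape is the same, though: the caller
supplies a comparison callback and the sort does the merging.

```c
#include <stddef.h>

/* Hypothetical node type; the kernel version sorts struct list_head lists. */
struct node {
	int key;
	struct node *next;
};

typedef int (*cmp_fn)(const struct node *a, const struct node *b);

/* Merge two sorted lists; taking from 'a' on ties keeps the sort stable. */
static struct node *merge(struct node *a, struct node *b, cmp_fn cmp)
{
	struct node head = { 0, NULL };
	struct node *tail = &head;

	while (a && b) {
		if (cmp(a, b) <= 0) {
			tail->next = a;
			a = a->next;
		} else {
			tail->next = b;
			b = b->next;
		}
		tail = tail->next;
	}
	tail->next = a ? a : b;
	return head.next;
}

/* Top-down merge sort: O(n log n) comparisons, versus O(n^2) for bubble sort. */
static struct node *sort_list(struct node *list, cmp_fn cmp)
{
	struct node *slow = list, *fast, *right;

	if (!list || !list->next)
		return list;

	/* Find the midpoint with slow/fast pointers and split the list there. */
	fast = list->next;
	while (fast && fast->next) {
		slow = slow->next;
		fast = fast->next->next;
	}
	right = slow->next;
	slow->next = NULL;

	return merge(sort_list(list, cmp), sort_list(right, cmp), cmp);
}

/* Example comparator: ascending by key. */
static int cmp_key(const struct node *a, const struct node *b)
{
	return (a->key > b->key) - (a->key < b->key);
}
```

With a generic sort available, the call site only has to provide the
comparator, which is why the wrapper function could be dropped.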

Linux-commit: e714d3559e964d1547d20b54ad5fd6bbb3401f56

Signed-off-by: NeilBrown <neilb@suse.com>
Change-Id: Ied197a4fdba43d793c5ebbb9afc837a986609469
Reviewed-on: https://review.whamcloud.com/35512
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months ago LU-12355 llite: MS_* flags and SB_* flags split 19/35019/4
Shaun Tancheff [Thu, 18 Jul 2019 14:19:03 +0000 (09:19 -0500)]
LU-12355 llite: MS_* flags and SB_* flags split

In kernel 4.20 the MS_* flags should only be used for mount
time flags and the SB_* flags for checking super_block.s_flags.
The MS_* flags have moved to a uapi header.

Linux-commit: e262e32d6bde0f77fb0c95d977482fc872c51996

Test-Parameters: trivial
Change-Id: Ifd64efb16c7795377ece066d01ae04dc004a13ac
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35019
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
13 months ago LU-12510 osd: osd-zfs to release zrlock quickly 24/35524/3
Alexey Zhuravlev [Mon, 15 Jul 2019 18:01:59 +0000 (21:01 +0300)]
LU-12510 osd: osd-zfs to release zrlock quickly

Otherwise a few threads trying to access the same dnode can get stuck.
This patch is a quick workaround for the issue; it is supposed
to be replaced with a better patch using the regular DMU API.

Change-Id: I24d9ed7f8e68080c6a46409476a80799dbb45230
Signed-off-by: Alexey Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35524
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months ago LU-9019 libcfs: avoid using HZ and msecs_to_jiffies() 20/35520/2
James Simmons [Mon, 15 Jul 2019 15:28:32 +0000 (11:28 -0400)]
LU-9019 libcfs: avoid using HZ and msecs_to_jiffies()

HZ is a constant selected with the configuration of the kernel
and msecs_to_jiffies() is an inline function in jiffies.h. Because
we are out of tree, prebuilt lustre packages could be
installed on a node with different values, which impacts the
behavior of the file system. This was addressed earlier but
regressions have crept back in. Fix up all those instances by
replacing msecs_to_jiffies() and HZ with cfs_time_seconds(), which
translates seconds to jiffies with nsecs_to_jiffies(). Add to
spelling.txt to avoid new regressions.
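
The difference between a tick rate baked in at build time and one resolved on
the running node can be sketched in userspace. Everything here is a
hypothetical stand-in: running_hz, BUILD_HZ and the sketch_* helpers are
illustration names, not the libcfs or kernel definitions.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Tick rate the package was compiled against (assumed value). */
#define BUILD_HZ 1000

/* Tick rate of the node the package is installed on (assumed value). */
static uint64_t running_hz = 250;

/* Compile-time conversion: silently wrong when BUILD_HZ != running_hz. */
static uint64_t baked_in_seconds(uint64_t seconds)
{
	return seconds * BUILD_HZ;
}

/* Run-time conversion: stands in for a function provided by the running kernel. */
static uint64_t sketch_nsecs_to_jiffies(uint64_t nsecs)
{
	return nsecs / (NSEC_PER_SEC / running_hz);
}

/* Seconds to ticks by way of nanoseconds, in the style the commit describes. */
static uint64_t sketch_time_seconds(uint64_t seconds)
{
	return sketch_nsecs_to_jiffies(seconds * NSEC_PER_SEC);
}
```

With these assumed rates, a 5 second timeout becomes 5000 ticks through the
baked-in path but 1250 ticks through the run-time path, so the prebuilt code
would wait four times too long.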

Change-Id: Ie13efe83774db498bb5475ed47a057bbd42d47bf
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35520
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
13 months ago LU-12462 osc: Do not assert for first extent 25/35525/3
Patrick Farrell [Tue, 16 Jul 2019 16:28:25 +0000 (12:28 -0400)]
LU-12462 osc: Do not assert for first extent

In the discard case, the OSC fsync/writeback code asserts
that each OSC extent is fully covered by the fsync request.

This is not valid for the DOM case, because OSC extent
alignment requirements can create OSC extents which start
before the OST region of the layout (i.e., they cross into
the DOM region).  This is OK because the layout prevents
them from ever being used for i/o, but this same behavior
means that the OSC fsync start/end is aligned with the
layout, and so does not necessarily cover that first
extent.

The simplest solution is just to not assert on the first
extent.  (There is no way at the OSC layer to recognize the
DOM case.)

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: If66f8d81fb9dd4546a5647a10f6ca551e2cf98e3
Reviewed-on: https://review.whamcloud.com/35525
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12326 tests: recognize Bochs as VM 42/34942/5
Alex Zhuravlev [Wed, 22 May 2019 17:29:34 +0000 (20:29 +0300)]
LU-12326 tests: recognize Bochs as VM

Bochs is reported by qemu-system-x86_64 and a few tests (e.g. 399a)
depend on that.

Change-Id: I3c6cfca1c0cb811425a09b3958cd0626891e73da
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34942
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months ago LU-12545 llite: cleanup stats of LPROC_LL_* 14/35514/3
Li Xi [Mon, 15 Jul 2019 03:31:05 +0000 (11:31 +0800)]
LU-12545 llite: cleanup stats of LPROC_LL_*

Some LPROC_LL_ stats have not been used for a long time. This patch
removes them. LPROC_LL_STAFS is changed to LPROC_LL_STATFS in
this patch too.

Change-Id: I64b033e186733147e2a2984d8afec97240d03573
Signed-off-by: Li Xi <lixi@ddn.com>
Reviewed-on: https://review.whamcloud.com/35514
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-4423 ptlrpc: make ptlrpc_last_xid an atomic64_t 10/35510/2
NeilBrown [Sun, 14 Jul 2019 23:39:31 +0000 (19:39 -0400)]
LU-4423 ptlrpc: make ptlrpc_last_xid an atomic64_t

This variable is treated like an atomic64_t,
so change its type and simplify the code.
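
A userspace analogue of an atomic 64-bit counter, using C11 atomics in place
of the kernel's atomic64_t. The names last_xid and next_xid() are
hypothetical, and the step of 1 is an assumption made for the sketch rather
than the increment ptlrpc actually uses.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a global "last issued id" counter. */
static _Atomic uint64_t last_xid;

/*
 * Reserve and return the next id. atomic_fetch_add() returns the previous
 * value, so adding 1 yields the value this caller now owns. No separate
 * lock is needed around the read-modify-write.
 */
static uint64_t next_xid(void)
{
	return atomic_fetch_add(&last_xid, 1) + 1;
}
```

Because the whole update is a single atomic operation, two threads calling
next_xid() concurrently can never receive the same value.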

Change-Id: I4b219342222fd784ac1d7dc17660feb816bacd57
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35510
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-4423 ptlrpc: Fix using smp_processor_id() in preemptible context 95/35495/2
James Simmons [Sat, 13 Jul 2019 15:00:32 +0000 (11:00 -0400)]
LU-4423 ptlrpc: Fix using smp_processor_id() in preemptible context

This warning shows up with kernels that enable preemption:
[ 1877.516799] BUG: using smp_processor_id() in preemptible [00000000] code: mount.lustre/14077

Change it to disable preemption around smp_processor_id().

Linux-commit: c369772e78a7383ba4e68673128fe2d6ef2863ee

Fixes: ef94e4d1bb ("LU-8710 ptlrpc: use current CPU instead of hardcoded 0")

Change-Id: If66107f0843c5d0c4bcf874e64a20251fdf9704e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35495
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12542 ldlm: simplify lock_mode_to_index() 86/35486/2
NeilBrown [Fri, 12 Jul 2019 17:27:28 +0000 (13:27 -0400)]
LU-12542 ldlm: simplify lock_mode_to_index()

This function has the same effect as ilog2(), so just use ilog2
directly.
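
The equivalence can be sketched in userspace, assuming lock modes are
single-bit values (1, 2, 4, ...). The loop below is an assumed shape for the
kind of function being replaced, not the original code, and ilog2_u32() uses
the GCC/Clang __builtin_clz() builtin as a stand-in for the kernel's ilog2().
It also assumes a 32-bit unsigned int.

```c
/* Assumed shape of a hand-rolled lookup: count shifts until one bit is left. */
static int mode_to_index_loop(unsigned int mode)
{
	int index = 0;

	while (mode > 1) {
		mode >>= 1;
		index++;
	}
	return index;
}

/* Floor of log2 for a non-zero 32-bit value (undefined for v == 0). */
static int ilog2_u32(unsigned int v)
{
	return 31 - __builtin_clz(v);
}
```

For every power of two both functions return the bit position, so the loop
adds nothing over the generic helper.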

Change-Id: If90207c328b549e85cb6d38a6604dfb8c7b6c8a0
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35486
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-11672 ldlm: always cancel aged locks regardless of enabling or disabling lru resize 67/35467/3
Gu Zheng [Thu, 11 Jul 2019 05:52:38 +0000 (13:52 +0800)]
LU-11672 ldlm: always cancel aged locks regardless of enabling or disabling lru resize

Currently, cancelling aged locks is handled by the ldlm_pool_recalc routine,
and it only works when lru resize is enabled. This means that if lru
resize is disabled, old aged locks stay cached even after they reach
ns_max_age.

But in theory, even with lru resize disabled, lru_max_age should behave
the same as with lru resize enabled. In the end, lru_size is a hard limit on
the number of locks, while ns_max_age/lru_max_age is an elimination mechanism:
regardless of whether lru resize is enabled, once a lock reaches
lru_max_age it needs to be cancelled.

So fix it here by changing the lru flags when invoking ldlm_cancel_lru
to do the real cancel work: if lru resize is enabled, set the flag to
LDLM_LRU_FLAG_LRUR, otherwise to LDLM_LRU_FLAG_AGED.

Change-Id: Ic2df2550af87fd7209fdb31ca3730683d727a74d
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/35467
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12137 osd-ldiskfs: use ERR_CAST in osd_oi_index_open() 55/35455/2
James Simmons [Wed, 10 Jul 2019 14:18:32 +0000 (10:18 -0400)]
LU-12137 osd-ldiskfs: use ERR_CAST in osd_oi_index_open()

In osd_oi_index_open(), when the dentry is invalid, the invalid
dentry is returned through a void cast. This is what ERR_CAST was
invented for, so use it.
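
The error-pointer idiom can be sketched in userspace. The helpers below are
simplified re-implementations written for this illustration, not the kernel's
<linux/err.h>, and the *_sketch types and functions are hypothetical,
including the assumption that the caller returns a different pointer type
than the lookup does.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno in a pointer, and decode or test it again. */
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Pass an error pointer on under another pointer type, stating the intent. */
static inline void *ERR_CAST(const void *ptr) { return (void *)ptr; }

struct dentry_sketch { int unused; };
struct inode_sketch { int unused; };

/* Hypothetical lookup that either succeeds or reports -ENOENT. */
static struct dentry_sketch *lookup_sketch(int fail)
{
	static struct dentry_sketch d;

	return fail ? ERR_PTR(-ENOENT) : &d;
}

/* Hypothetical caller with a different return type than the lookup. */
static struct inode_sketch *index_open_sketch(int fail)
{
	static struct inode_sketch inode;
	struct dentry_sketch *dentry = lookup_sketch(fail);

	if (IS_ERR(dentry))
		return ERR_CAST(dentry);	/* rather than (void *)dentry */
	return &inode;
}
```

The generated code is the same as with a bare cast; the named helper only
documents that an error value, not an object, is being forwarded.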

Test-Parameters: trivial

Change-Id: I71016ff9c9dfc4408db8ab14576ba87dd6dc352d
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35455
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-12137 osd-ldiskfs: migrate ll_lookup_one_len to osd_compat.c 54/35454/2
James Simmons [Wed, 10 Jul 2019 14:18:03 +0000 (10:18 -0400)]
LU-12137 osd-ldiskfs: migrate ll_lookup_one_len to osd_compat.c

The function ll_lookup_one_len() in lvfs.h is only used for the
osd-ldiskfs layer so relocate it to osd_compat.c and rename it
to osd_lookup_one_len_unlocked(). We use unlocked in the name
since this signifies it must always be called with the inode lock
not taken since it will take the inode lock itself.

Test-Parameters: trivial

Change-Id: I507100a676f98448f4cacc94e902294903a67efb
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35454
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-12431 clio: issue wake with waitqueue lock held 81/35381/9
Shaun Tancheff [Tue, 16 Jul 2019 06:05:00 +0000 (01:05 -0500)]
LU-12431 clio: issue wake with waitqueue lock held

Remove the barrier and rely on the wait queue lock for wake
synchronization.

Leave csi_end_io empty but available for future customization.

Inspired by c3973b4aca6df794c492f6856ffbf02f2f8a9592

Cray-bug-id: LUS-7330
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: Idc632140256cccfa6046a52cbd1c6432955e2b11
Reviewed-on: https://review.whamcloud.com/35381
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-10070 lod: SEL cleanup 14/35414/4
Vitaly Fertman [Wed, 3 Jul 2019 17:10:53 +0000 (20:10 +0300)]
LU-10070 lod: SEL cleanup

some cleanups
- dt_statfs with an extra parameter to be dt_statfs_info;
- lod_statfs_and_check does not need an extra parameter and
to be static again;
- move asserts to a better place;
- test component-add with wrong parameters;
- print out the layout sanity errors wherever needed;
- make an array of layout_sanity errors;
- an HSM sanity test is added;

and one defect:
- the last component cannot be 0-length;

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: If832579ce27cb6ab87d36a594c04363deaea8711
Reviewed-on: https://review.whamcloud.com/35414
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-10070 utils: setstripe component-add support for SEL 14/35314/4
Vitaly Fertman [Fri, 21 Jun 2019 22:45:49 +0000 (01:45 +0300)]
LU-10070 utils: setstripe component-add support for SEL

the math of the SEL component sizes calculation does not work for the
component end when some components already exist.

Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I793ea7e5a5ac8f4639f694f83af7d413bd2b982c
Reviewed-on: https://review.whamcloud.com/35314
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-4423 lustre: convert rsi_sem to a spinlock. 79/35279/4
NeilBrown [Thu, 4 Jul 2019 21:14:21 +0000 (17:14 -0400)]
LU-4423 lustre: convert rsi_sem to a spinlock.

This lock is never held over code that sleeps, and is
only ever held for short periods of time.
So a simple spinlock is best.

Change-Id: I3280f52bf64ae2b896bd67436d8d8a42cab38ac2
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35279
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-12400 libcfs: save_stack_trace_tsk if ARCH_STACKWALK 39/35239/3
Shaun Tancheff [Mon, 15 Jul 2019 17:30:43 +0000 (12:30 -0500)]
LU-12400 libcfs: save_stack_trace_tsk if ARCH_STACKWALK

With CONFIG_ARCH_STACKWALK, save_stack_trace_tsk is not
directly available. Try using symbol_get() to acquire it.

Linux-commit: 214d8ca6ee854f696f75e75511fe66b409e656db

Test-Parameters: trivial
Cray-bug-id: LUS-7600
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I923b718eadc6c58fa2676a6d2fbd48523c615f62
Reviewed-on: https://review.whamcloud.com/35239
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12400 lnet: SO_SNDTIMEO, SO_RCVTIMEO removed 37/35237/4
Shaun Tancheff [Mon, 15 Jul 2019 17:28:01 +0000 (12:28 -0500)]
LU-12400 lnet: SO_SNDTIMEO, SO_RCVTIMEO removed

Y2038 64-bit time removed socket options that specify time.
The previous interface is available under _OLD and versions
expecting timespec64 are _NEW.

Linux-commit: 7f1bc6e95d7840d4305595b3e4025cddda88cee5

Test-Parameters: trivial
Cray-bug-id: LUS-7600
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I73ad3aa61afdfeba6160d95470b22cea03ed17f9
Reviewed-on: https://review.whamcloud.com/35237
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12427 lnet: warn if discovery is off 00/35200/5
Amir Shehata [Wed, 12 Jun 2019 00:58:09 +0000 (17:58 -0700)]
LU-12427 lnet: warn if discovery is off

Output a warning if discovery is off and the admin is
either trying to add a route or enable routing.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Iacd7762c5d19c6e0c45ff6a58693a05761f1336f
Reviewed-on: https://review.whamcloud.com/35200
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12331 llite: create obd_device with usercopy whitelist 46/34946/5
Li Dongyang [Thu, 23 May 2019 06:48:15 +0000 (16:48 +1000)]
LU-12331 llite: create obd_device with usercopy whitelist

Kernel 4.16 added usercopy whitelisting to hardened usercopy;
whitelist struct obd_device to silence the warning:

 Bad or missing usercopy whitelist? Kernel memory exposure attempt
 detected from SLUB object 'll_obd_dev_cache' (offset 1256, size 40)!
 WARNING: CPU: 1 PID: 17534 at mm/usercopy.c:83 usercopy_warn+0x7d/0xa0
 Call Trace:
   __check_object_size+0xfa/0x181
   lmv_iocontrol+0x1146/0x1880 [lmv]
   ll_obd_statfs+0x356/0x860 [lustre]
   ll_dir_ioctl+0x1e37/0x6760 [lustre]
   do_vfs_ioctl+0xa4/0x630

Linux-commit: 8eb8284b412906181357c2b0110d879d5af95e52

Test-Parameters: trivial
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Change-Id: Ie863e8a5e2cebd3fd716e7ccc4e0491f83f6fabc
Reviewed-on: https://review.whamcloud.com/34946
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12501 utils: fix 'lfs df' printing loop 56/35456/3
Andreas Dilger [Wed, 10 Jul 2019 16:53:59 +0000 (10:53 -0600)]
LU-12501 utils: fix 'lfs df' printing loop

If the OS_STATE_NONROT flag is set for a device, the showdf() state
printing loop will spin endlessly because this bit is not printed,
so it is never cleared from the loop's state mask.

Declaring the obd_statfs_state_names[] array indexed by OS_STATE_*
flags also is problematic because the array will double in size as
new binary flags are added (already OS_STATE_NONROT results in an
array size of 0x200 = 512 entries).  Instead, declare a struct that
is indexed linearly and stores the OS_STATE_* flag in a field,
along with the name and whether the flag indicates a problem state.

The flag printing loop can iterate over the array of flags instead
of the os_state bits, which clarifies the for-loop iteration and is
equally efficient.
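
A minimal sketch of the table-driven approach described above. Only the value
0x200 for the non-rotational flag comes from this message; the SK_* names,
the other flag values, the one-character names and format_state() are
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define SK_STATE_DEGRADED 0x001	/* assumed value */
#define SK_STATE_READONLY 0x002	/* assumed value */
#define SK_STATE_NONROT   0x200	/* value given in the commit message */

/* One entry per flag, indexed linearly instead of by flag value. */
struct state_name_sketch {
	unsigned int flag;
	char name;	/* placeholder one-character label */
	bool error;	/* true if the flag indicates a problem state */
};

static const struct state_name_sketch state_names[] = {
	{ SK_STATE_DEGRADED, 'D', true },
	{ SK_STATE_READONLY, 'R', true },
	{ SK_STATE_NONROT,   'f', false },
};

/*
 * Write one character per reported flag into buf and return the count.
 * The loop walks the table, so it finishes after a fixed number of steps
 * even if 'state' contains bits the table does not know about.
 */
static size_t format_state(unsigned int state, bool verbose,
			   char *buf, size_t len)
{
	size_t n = 0;

	for (size_t i = 0; i < sizeof(state_names) / sizeof(state_names[0]); i++) {
		const struct state_name_sketch *s = &state_names[i];

		if (!(state & s->flag))
			continue;
		if (!s->error && !verbose)
			continue;	/* informational flags only when verbose */
		if (n + 1 < len)
			buf[n++] = s->name;
	}
	if (len > 0)
		buf[n] = '\0';
	return n;
}
```

The table holds three entries here rather than 0x200 + 1 slots, and an
unknown bit in the state mask can no longer keep the loop spinning.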

This also allows printing informational flags with "lfs df -v" so
that OS_STATE_NONROT and similar flags can be visible to users.

Fixes: 68635c3d9b3 ("LU-11963 osd: Add nonrotational flag to statfs")
Change-Id: Ib62e949ca56d691c4699d5f2d9439c42643ebbe5
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35456
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-6142 obdecho: Fix style issues for echo_client.c 89/34489/8
Arshad Hussain [Wed, 20 Mar 2019 17:14:33 +0000 (22:44 +0530)]
LU-6142 obdecho: Fix style issues for echo_client.c

This patch fixes issues reported by checkpatch
for file lustre/obdecho/echo_client.c

Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: Ia6c218e3ccf35e9ea91b323eac09fb21284bb1aa
Reviewed-on: https://review.whamcloud.com/34489
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months ago LU-6142 utils: Fix style issues for libiam.c 38/34438/9
Arshad Hussain [Sat, 9 Mar 2019 17:31:19 +0000 (23:01 +0530)]
LU-6142 utils: Fix style issues for libiam.c

This patch fixes issues reported by checkpatch for
file lustre/utils/libiam.c

Change-Id: I441edce554e84cdab9eb874666b86fa929ef9f67
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-on: https://review.whamcloud.com/34438
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months ago LU-6504 socklnd: fix indentation issue highlighted by smatch 99/14599/5
Oleg Drokin [Mon, 27 Apr 2015 02:20:40 +0000 (22:20 -0400)]
LU-6504 socklnd: fix indentation issue highlighted by smatch

lnet/klnds/socklnd/socklnd.c:1459 ksocknal_close_conn_locked() warn: inconsistent indenting

Change-Id: I66139687e82ea526ad9df1a973dc4a408b490ff4
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/14599
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-10756 ptlrpc: change IMPORT_SET_* macros into real functions 63/35463/2
James Simmons [Thu, 11 Jul 2019 00:52:34 +0000 (20:52 -0400)]
LU-10756 ptlrpc: change IMPORT_SET_* macros into real functions

Make the IMPORT_SET_STATE_NOLOCK and IMPORT_SET_STATE macros into
normal functions. Since import_set_state_nolock() is basically a
wrapper around __import_set_state() we can merge both functions.

Change-Id: Idaa6aeb81ff2282e2f83d758a267129e686bd794
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35463
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12477 lustre: use delete_from_page_cache() for directory pages. 62/35462/3
NeilBrown [Thu, 11 Jul 2019 00:50:22 +0000 (20:50 -0400)]
LU-12477 lustre: use delete_from_page_cache() for directory pages.

lustre sometimes uses the internal function truncate_complete_page()
to remove a page of a directory.
Much of what this function does, does not apply to directory pages
as there is no invalidatepage function, and at these times, the
page is not dirty.
The only useful part of the function is delete_from_page_cache(),
so just call that directly.

Linux-commit: d17fa2f3a0b9b40be48e0c3cc88eb3b3cea1b701

Change-Id: I54795e71e107c50f662bd2015c6f621bfe436e0a
Acked-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35462
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12477 llite: use generic_error_remove_page() 61/35461/3
NeilBrown [Thu, 11 Jul 2019 00:47:40 +0000 (20:47 -0400)]
LU-12477 llite: use generic_error_remove_page()

lustre's internal ll_invalidate_page() is behaviourally identical to
generic_error_remove_page().
In the case of lustre it isn't a memory hardware error that requires
the page being invalidated, it is the loss of a lock, which will likely
result in the data changing on the server.
In either case, we don't want the page to be accessed any more, so the
same removal is appropriate.

Linux-commit: d5419b40599b4d6e030695dad30f15347679be66

Change-Id: I92686b5332eec02580563c1bee779688e8e591a3
Acked-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35461
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months ago LU-12137 osd-ldiskfs: migrate osd_ios_lookup_one_len() to osd_compat.c 53/35453/2
James Simmons [Wed, 10 Jul 2019 14:15:31 +0000 (10:15 -0400)]
LU-12137 osd-ldiskfs: migrate osd_ios_lookup_one_len() to osd_compat.c

The function osd_ios_lookup_one_len() was created for the LFSCK code
to look for a dentry by name and if the inode of that dentry was
NULL treat it as an -ENOENT so LFSCK would repair the file. This
function will be used for more of the scrub infrastructure in future
patches so move it to osd_compat.c.

Test-Parameters: trivial

Change-Id: Ic34c1110f8ced7a4a2f7c0fa3b8a9403be9940ca
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35453
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-12524 libcfs: Reduce memory frag due to HA debug msg 49/35449/4
Ann Koehler [Mon, 8 Jul 2019 21:36:24 +0000 (16:36 -0500)]
LU-12524 libcfs: Reduce memory frag due to HA debug msg

The dynamic allocation and freeing of Lustre trace pages has been
shown to cause memory fragmentation that sometimes prevents
applications from getting the contiguous memory they need to run. In
one such occurrence over 99% of the messages were the matched open
trace messages issued by mdc_close():

DEBUG_REQ(D_HA, mod->mod_open_req, "matched open; tag %d", tag);

D_HA is included in the default set of debug flags. This has proven
to be quite useful in debugging connection issues particularly at
mount time. So removing all HA messages from the default tracing is
not a good option.

However, the matched open debug message has not proven itself to be
as generally useful. So moving the message under a different debug
flag, one that must be explicitly enabled, reduces the amount of
default tracing and thereby helps reduce fragmentation without
causing much loss of functionality. Use D_RPCTRACE to match the
corresponding open debug message in mdc_set_open_replay_data.
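
The effect of moving a message to a flag outside the default mask can be
sketched as follows. The SK_* bit values, debug_mask, trace_entries and
debug_msg() are all placeholders for illustration; only the flag names D_HA
and D_RPCTRACE come from the message.

```c
/* Placeholder bit values; the real flag values are not taken from the source. */
#define SK_D_HA       0x1u
#define SK_D_RPCTRACE 0x2u

/* Assumed default: D_HA enabled, D_RPCTRACE only when explicitly enabled. */
static unsigned int debug_mask = SK_D_HA;

/* Stands in for trace pages consumed by logged messages. */
static unsigned int trace_entries;

/* Record a message only if its flag is enabled in the current mask. */
static void debug_msg(unsigned int flag)
{
	if (debug_mask & flag)
		trace_entries++;
}
```

A frequent message tagged with the second flag costs nothing under the
default mask, yet it is still available once an administrator turns that
flag on.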

Test-Parameters: trivial
Cray-bug-id: LUS-7560
Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: Iee267fba517c20b82dccf8d2ac10f8e7f15354f8
Reviewed-on: https://review.whamcloud.com/35449
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months ago LU-12523 ptlrpc: Add jobid to rpctrace debug messages 45/35445/6
Ann Koehler [Mon, 8 Jul 2019 20:17:07 +0000 (15:17 -0500)]
LU-12523 ptlrpc: Add jobid to rpctrace debug messages

This mod adds the jobid string found in the ptlrpc_body of an rpc
to the output of rpctrace messages. If jobids are not in use the
string will be empty. If jobids are in use, the string can be
useful in analyzing Lustre activity.

Test-Parameters: trivial
Cray-bug-id: LUS-7557
Signed-off-by: Ann Koehler <amk@cray.com>
Change-Id: Ib7ec75e28581f3ac420314812e2521fa49f021dd
Reviewed-on: https://review.whamcloud.com/35445
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months agoLU-12513 utils: add handling YAML_NO_TOKEN 28/35428/5
Ben Evans [Sat, 6 Jul 2019 23:23:42 +0000 (19:23 -0400)]
LU-12513 utils: add handling YAML_NO_TOKEN

YAML_NO_TOKEN marks something that is not a key-value pair: either
an entry into or an exit from a sub-structure. The parser equates it
to an error, so a valid YAML document is treated as invalid. Handle
YAML_NO_TOKEN as a valid case. This flaw was discovered in the patch
for the LU-6081 work.

Test-Parameters: trivial

Change-Id: Ibe0c0a2bea22b26a0dd2d900b7a4a5957b96e3da
Signed-off-by: Ben Evans <bevans@cray.com>
Reviewed-on: https://review.whamcloud.com/35428
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months agoLU-12513 utils: change cy_valueint to 64 bits 23/35423/4
James Simmons [Sat, 6 Jul 2019 22:51:40 +0000 (18:51 -0400)]
LU-12513 utils: change cy_valueint to 64 bits

While testing the Lustre YAML implementation with netlink I found
that it didn't support 64-bit integer values. Change cy_valueint
to a long long.
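
As a rough illustration of why the wider type matters, the sketch below
parses a decimal string into a long long field. The struct and function
names (yaml_value_sketch, parse_valueint) are hypothetical and only
mirror the cy_valueint field named above; this is not the actual
parser code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a parsed YAML scalar; only the field name
 * cy_valueint comes from the commit message. */
struct yaml_value_sketch {
	long long cy_valueint;	/* wide enough for 64-bit values */
};

/* Parse a decimal string into the 64-bit field.
 * Returns 0 on success, -1 if the text is not a valid in-range integer. */
int parse_valueint(const char *text, struct yaml_value_sketch *out)
{
	char *end;

	assert(text != NULL && out != NULL);
	errno = 0;
	out->cy_valueint = strtoll(text, &end, 10);
	if (errno != 0 || end == text || *end != '\0')
		return -1;
	return 0;
}
```

A value such as 4294967296 (2^32) does not fit in a 32-bit int but is
stored intact in the long long field.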

Test-Parameter: trivial

Change-Id: I1746a4907a83e7c5733a0681a66e7f4a54a4c392
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35423
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
14 months agoLU-12482 tests: use MDS1_VERSION in sanity-hsm 53/35353/2
Oleg Drokin [Fri, 28 Jun 2019 01:49:28 +0000 (21:49 -0400)]
LU-12482 tests: use MDS1_VERSION in sanity-hsm

MDS_VERSION_CODE is long deprecated and undefined.

Change-Id: I007cea54cb4ddbc5705656a84d12448dd84aa0a1
Test-Parameters: trivial
Test-Parameters: testlist=sanity-hsm,sanity-hsm
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35353
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
14 months agoLU-12043 llite: make sure readahead cover current read 15/35215/4
Wang Shilong [Thu, 13 Jun 2019 01:35:12 +0000 (09:35 +0800)]
LU-12043 llite: make sure readahead cover current read

When doing readahead, @ria_end_min is used to indicate
how far we are expected to read to cover the current
read.

Update @ria_end_min unconditionally with the IO end.
Also, @ria_end_min is the end of a closed interval, which
should be calculated as start + count - 1.
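
The closed-interval arithmetic can be sketched as follows. The helper
name closed_interval_end is hypothetical and is not taken from the
llite code; it only shows the "- 1" that the message describes.

```c
#include <assert.h>

/* Last index covered by a read of 'count' units starting at 'start'.
 * The interval [start, end] is closed, hence the "- 1".
 * The caller must pass count > 0. */
unsigned long closed_interval_end(unsigned long start, unsigned long count)
{
	assert(count > 0);
	return start + count - 1;
}
```

For example, a read of 4 pages starting at page 10 covers pages 10
through 13, so the end is 13 rather than 14.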

Change-Id: If7f8da44da31623a73b363d5a18c1ec8b54da745
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/35215
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12401 gss: fix checksum for Kerberos and SSK 99/35099/7
Sebastien Buisson [Fri, 7 Jun 2019 14:45:26 +0000 (23:45 +0900)]
LU-12401 gss: fix checksum for Kerberos and SSK

When computing checksum for Kerberos, krb5 wire token header is
appended to the plain text. Make sure the actual header is appended
in gss_digest_hash().
For interop with older clients, introduce a new server-side tunable
'sptlrpc.gss.krb5_allow_old_client_csum'. When not set, servers refuse
Kerberos connections from older clients.

In gss_crypt_generic(), protect against undefined behavior by
switching from memcpy to memmove.
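
A minimal sketch of the memcpy/memmove distinction, assuming an
in-place shift within one buffer. The function shift_payload is
hypothetical and is not the gss_crypt_generic() code; it only shows the
kind of overlapping copy for which memmove is required.

```c
#include <assert.h>
#include <string.h>

/* Shift the first 'len' bytes of 'buf' forward by 'shift' bytes, in place.
 * Source and destination overlap whenever shift < len, so memcpy() would
 * be undefined behavior here; memmove() is specified to handle overlap.
 * The caller must ensure 'buf' holds at least len + shift bytes. */
void shift_payload(char *buf, size_t len, size_t shift)
{
	assert(buf != NULL);
	memmove(buf + shift, buf, len);
}
```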

When computing checksum for SSK, make sure the actual token is used
to store the checksum.

Fixes: a21c13d4df ("LU-8602 gss: Properly port gss to newer crypto api.")
Test-Parameters: envdefinitions=SHARED_KEY=true testlist=sanity,recovery-small,sanity-sec
Test-Parameters: envdefinitions=SHARED_KEY=true clientbuildno=6308 clientjob=lustre-reviews-patchless testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0233ada481f132af112bf88c065f5421902c942e
Reviewed-on: https://review.whamcloud.com/35099
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12131 tests: fix test_802a for GSS 35/35335/2
Sebastien Buisson [Thu, 27 Jun 2019 10:08:17 +0000 (12:08 +0200)]
LU-12131 tests: fix test_802a for GSS

test_802a should not overwrite already existing client mount options
when trying to mount client as read-only.

Test-Parameters: trivial
Test-Parameters: envdefinitions=ONLY=802a testlist=sanity
Test-Parameters: envdefinitions=SHARED_KEY=true,ONLY=802a testlist=sanity
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8189c245870fb0caf48006db11621f0af48e1878
Reviewed-on: https://review.whamcloud.com/35335
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12131 tests: properly handle GSS in server failover 41/35041/6
Sebastien Buisson [Mon, 3 Jun 2019 14:30:50 +0000 (23:30 +0900)]
LU-12131 tests: properly handle GSS in server failover

In case of server failover, a number of aspects must be handled when
GSS based features (SSK or Kerberos) are activated:
- lsvcgssd daemon must be restarted;
- targets must be mounted with proper skpath option;
- permissions on keys must be adjusted.
When service is initially started, all that is managed in setupall().
fail() and facet_failover() have to be improved to take GSS aspects
into account.

Test-Parameters: envdefinitions=SHARED_KEY=true testlist=sanity,recovery-small,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I8db686f406629c7eec655496cf83c0539c1bfb33
Reviewed-on: https://review.whamcloud.com/35041
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-6142 tests: Fix style issues for cascading_rw.c 33/35433/3
Arshad Hussain [Fri, 5 Jul 2019 20:56:26 +0000 (02:26 +0530)]
LU-6142 tests: Fix style issues for cascading_rw.c

This patch fixes issues reported by checkpatch
for file lustre/tests/mpi/cascading_rw.c

Test-Parameters: trivial
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Change-Id: I033db1a7ee23042cc3ec0dedced48830e00d1230
Reviewed-on: https://review.whamcloud.com/35433
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11761 fld: let's caller to retry FLD_QUERY 62/34962/10
Hongchao Zhang [Thu, 4 Jul 2019 13:39:24 +0000 (09:39 -0400)]
LU-11761 fld: let's caller to retry FLD_QUERY

In fld_client_rpc(), if the FLD_QUERY request between MDTs fails
with -EWOULDBLOCK because the connection is lost, return -EAGAIN
to notify the caller to retry.

It also reverts the patch https://review.whamcloud.com/12586/, which
was landed on b2_6_90_0-5-g6db07f0 to avoid returning -EAGAIN from
lod_object_init(), since that confused lu_object_find_at() (which
thought the object was dying when it encountered -EAGAIN). In the
current Lustre version, lu_object_find_at() just returns the found
object and lets the caller check whether it is dying.

Fixes: 6db07f095fba ("LU-5871 lod: Do not return EAGAIN in lod_object_init")
Change-Id: Ie83ebfdae2bd50c96a59a065f7f3c3dcfad04e42
Signed-off-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34962
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12538 lod: Add missed qos_rr_init 90/35490/4
Patrick Farrell [Fri, 12 Jul 2019 19:24:30 +0000 (15:24 -0400)]
LU-12538 lod: Add missed qos_rr_init

The new lmv space hash code uses the lu_qos_rr struct, but
forgot to init it fully.  Specifically, the spin lock isn't
inited, causing failures.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: Id410a8dc61980b880eab7e151b85c417a8439fd5
Reviewed-on: https://review.whamcloud.com/35490
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months agoNew tag 2.12.56 2.12.56 v2_12_56
Oleg Drokin [Tue, 16 Jul 2019 17:08:26 +0000 (13:08 -0400)]
New tag 2.12.56

Change-Id: Iad3fd72a2720f5dba2b7dae667b088eb73199d6a
Signed-off-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-7912 osc: Remove stale comment in osc_page_transfer_add 15/19115/8
Oleg Drokin [Thu, 24 Mar 2016 00:19:46 +0000 (20:19 -0400)]
LU-7912 osc: Remove stale comment in osc_page_transfer_add

Ever since LU-3321 the fields were not shared,
but then ops_inflight was completely deleted too.

Test-Parameters: trivial
Change-Id: I7b235f10ddb26a7ddbd4de7e502d33ee81a4f2e3
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/19115
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
14 months agoLU-12410 tests: ignore return status of removal of 99-lustre-test.rules 98/35398/2
Oleg Drokin [Mon, 1 Jul 2019 17:46:59 +0000 (13:46 -0400)]
LU-12410 tests: ignore return status of removal of 99-lustre-test.rules

We don't care if we cannot remove it on the server side.

Change-Id: Id6833505c1e7cd39df9845a16b01c31c9d65e794
Test-Parameters: trivial
Test-Parameters: testlist=sanity-dlc
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35398
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Chris Horn <hornc@cray.com>
14 months agoLU-9862 lov: Correct bounds checking 84/28484/16
Nathaniel Clark [Thu, 4 Jul 2019 15:34:05 +0000 (11:34 -0400)]
LU-9862 lov: Correct bounds checking

While Dan Carpenter ran his smatch tool against the lustre code
base he encountered the following static checker warning:

lustre/lov/lov_ea.c:207 lsm_unpackmd_common()
warn: signed overflow undefined. 'min_stripe_maxbytes * stripe_count < min_stripe_maxbytes'

The current code doesn't properly handle the potential overflow
with the min_stripe_maxbytes * stripe_count. This fixes the
overflow detection for maxbytes in lsme_unpack().
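
One common way to detect this kind of overflow without relying on
signed wraparound is to compare against a quotient before multiplying.
The sketch below is an assumption about the general technique, not the
code from lov_ea.c; the function name maxbytes_for_stripes and the
choice to saturate at LLONG_MAX are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Compute min_stripe_maxbytes * stripe_count without signed overflow.
 * The check divides instead of multiplying, so the test itself can
 * never overflow. On overflow the result saturates at LLONG_MAX
 * (an illustrative choice, not necessarily what Lustre does). */
long long maxbytes_for_stripes(long long min_stripe_maxbytes,
			       unsigned int stripe_count)
{
	assert(min_stripe_maxbytes >= 0);
	if (stripe_count != 0 &&
	    min_stripe_maxbytes > LLONG_MAX / stripe_count)
		return LLONG_MAX;
	return min_stripe_maxbytes * stripe_count;
}
```

The flagged pattern, "a * b < a", multiplies first and inspects the
result afterwards, which is exactly the step that is undefined for
signed types.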

Change-Id: I34646df3d59cadcb42a4defb58e16cb840acc99
Fixes: 3ddcf5b4a138 ("LU-7890 lov: Ensure correct operation for large object sizes")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nathaniel Clark <nclark@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/28484
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12495 test: shorten qos_maxage to update statfs 95/35395/4
Lai Siyao [Fri, 28 Jun 2019 19:07:09 +0000 (03:07 +0800)]
LU-12495 test: shorten qos_maxage to update statfs

sanity test_413b() should shorten lmv->desc.qos_maxage to update
cached statfs in time.

Test-Parameter: trivial envdefinitions=ONLY=413b
Test-Parameter: testlist=sanity,sanity,sanity,sanity,sanity,sanity

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I58672590669be5eaa5c0d679c51cb6cd533bc0d7
Reviewed-on: https://review.whamcloud.com/35395
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
14 months agoLU-12514 obdclass: Drop FS_HAS_FIEMAP compat macro 24/35424/2
Oleg Drokin [Fri, 5 Jul 2019 17:13:26 +0000 (13:13 -0400)]
LU-12514 obdclass: Drop FS_HAS_FIEMAP compat macro

FS_HAS_FIEMAP was some sort of old RHEL5 construct that's not
really important anymore

Linux-commit: 5c8eae72ff46f0e70d03ae2e86e631d7a1ca4fe6

Change-Id: Ia9941fa32eeb6114f9404014b78c29465d524d07
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/35424
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
14 months agoLU-12368 obdclass: don't send multiple statfs RPCs 80/35380/5
Andreas Dilger [Sat, 29 Jun 2019 01:10:41 +0000 (19:10 -0600)]
LU-12368 obdclass: don't send multiple statfs RPCs

If multiple threads are racing to send a non-cached OST_STATFS or
MDS_STATFS RPC, this can cause a significant RPC storm for systems
with many-core clients and many OSTs due to amplification of the
requests, and the fact that STATFS RPCs are sent asynchronously.
Some logs have shown a few 96-core clients with 20k+ OST_STATFS RPCs
in flight concurrently, which can overload the network if many OSTs
are on the same OSS nodes (osc.*.max_rpcs_in_flight is per OST).

This was not previously a significant issue when core counts were
smaller on the clients, or with fewer OSTs per OSS.

If a thread can't use the cached statfs values, limit statfs to one
thread at a time, since the thread(s) would be blocked waiting for
the RPC replies anyway, which can't finish faster if many are sent.

Also add a llite.*.statfs_max_age parameter that can be tuned to
control the maximum age (in seconds) of the statfs cache.  This
can avoid overhead for workloads that are statfs heavy, given that
the filesystem is _probably_ not running out of space this second,
and even so "statfs" does not guarantee space in parallel workloads.
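
The age check behind such a cache can be sketched as below. The struct
and function names are hypothetical and do not come from the obdclass
code; serializing the refresh to one thread at a time is omitted here
and would need a lock around the slow path.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical cache record for the last statfs reply. */
struct statfs_cache_sketch {
	time_t cached_at;	/* when the values were last refreshed */
	bool   valid;		/* false until the first reply arrives */
};

/* True when the cached values may be returned without sending an RPC,
 * i.e. they exist and are no older than max_age seconds. */
bool statfs_cache_usable(const struct statfs_cache_sketch *c,
			 time_t now, time_t max_age)
{
	assert(c != NULL);
	return c->valid && now - c->cached_at <= max_age;
}
```

A larger max_age lets more callers take the cached path, at the cost of
staler free-space figures.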

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I95690e37aecbac08ac5768a5e5c6c70ca258a832
Reviewed-on: https://review.whamcloud.com/35380
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12481 osd-ldiskfs: allow full 64KB xattr size 52/35352/2
Andreas Dilger [Fri, 28 Jun 2019 00:53:04 +0000 (18:53 -0600)]
LU-12481 osd-ldiskfs: allow full 64KB xattr size

When the 'ea_inode' feature is enabled, allow the full 64KB xattr
size, since the xattr data is stored directly in the ea_inode data
blocks, while the ext4_xattr_entry and ext4_xattr_hdr structures are
stored separately in the parent inode or external xattr block.

This avoids errors on the client trying to set a full-sized xattr:

    setfattr: /mnt/lustre/f61.conf-sanity: Argument list too long

Fixes: 3ec712bd183a ("LU-11868 osd: Set max ea size to XATTR_SIZE_MAX")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1320c32af98ab0feeeb147d8dbbc66ec7d1b8e1f
Reviewed-on: https://review.whamcloud.com/35352
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-9859 libcfs: remove prng 51/35351/4
NeilBrown [Fri, 28 Jun 2019 00:59:07 +0000 (20:59 -0400)]
LU-9859 libcfs: remove prng

The cfs prng is no longer used, so discard it.

Linux-commit: 508d5e0f4d45a815a0759c6aea69fef62359cf74

Test-Parameters: trivial

Change-Id: If780690dba196c8bc5935be223a952442f6a33ae
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-on: https://review.whamcloud.com/35351
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12473 llapi: fix pool_list by path 20/35320/7
Dominique Martinet [Tue, 25 Jun 2019 16:11:30 +0000 (18:11 +0200)]
LU-12473 llapi: fix pool_list by path

lfs/lctl pool_list <fs_path> would print the FS path as the pool
prefix. Print the fsname properly instead.

Fixes: 8813fdf2a4f2 ("LU-5030 util: migrate liblustreapi to use cfs_get_paths()")
Change-Id: I016b794fabd3d161d4651b41989637aebdf31f36
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-on: https://review.whamcloud.com/35320
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-4423 ptlrpc: remove inline on non-inlined functions. 96/35296/5
NeilBrown [Mon, 1 Jul 2019 17:10:26 +0000 (13:10 -0400)]
LU-4423 ptlrpc: remove inline on non-inlined functions.

These three functions are never inlined.  The only time they
are used, their address is taken, and this forces them to
be compiled as stand-alone functions.  So having the "inline"
declaration is misleading.

Move the functions to the place where their address is used, and
remove the 'inline' tag.

Test-Parameters: trivial

Change-Id: I0824d362f05e7397dd828f06464ad5aa156673d4
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/35296
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12400 osd-ldiskfs: get rid of legacy 'get_ds()' function 38/35238/3
Shaun Tancheff [Mon, 10 Jun 2019 20:48:54 +0000 (15:48 -0500)]
LU-12400 osd-ldiskfs: get rid of legacy 'get_ds()' function

Get rid of the legacy 'get_ds()' function.

Every in-kernel use of this function defined it to KERNEL_DS
(either as an actual define, or as an inline function).

Linux-commit: 736706bee3298208343a76096370e4f6a5c55915

Test-Parameters: trivial
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Change-Id: I3aabe74802f1a953b140728f22c83125dae270c3
Reviewed-on: https://review.whamcloud.com/35238
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12423 lnet: honor discovery setting 92/35192/3
Amir Shehata [Tue, 11 Jun 2019 19:02:15 +0000 (12:02 -0700)]
LU-12423 lnet: honor discovery setting

If discovery is off do not push out any updates. This could be
triggered in case of a gateway's interface changing.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: Ie421318ae85b895327ec170ffb436c9b679f6866
Reviewed-on: https://review.whamcloud.com/35192
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: interoperability support 44/35144/8
Vitaly Fertman [Wed, 5 Jun 2019 14:23:40 +0000 (17:23 +0300)]
LU-10070 lod: SEL: interoperability support

Add a new SEL magic for storing SEL components on disk.
It never gets out of LOD; it is converted to COMP_V1 on read/write.
As a result, an old MDS is not able to open SEL files.
At the same time, old clients are able to work with existing files
seamlessly. Old clients still lack Lustre utils support, so it is
not possible for them to create new SEL files, etc.

Cray-bug-id: LUS-2528
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ib3f0b1402cd920e56beaad78a74da485bd7ad342
Reviewed-on: https://review.whamcloud.com/35144
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12355 llite: totalram_pages changed to atomic_long_t 25/35025/5
Shaun Tancheff [Sat, 15 Jun 2019 19:32:26 +0000 (14:32 -0500)]
LU-12355 llite: totalram_pages changed to atomic_long_t

Kernel 5.0 changed totalram_pages to atomic_long_t.
Provide an abstracted accessor now that totalram_pages
is a function.

Linux-commit: ca79b0c211af63fa3276f0e3fd7dd9ada2439839

Test-Parameters: trivial
Change-Id: I558e42074004e2ee5f79deea0d363e5bea332729
Signed-off-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-on: https://review.whamcloud.com/35025
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 utils: SEL: lfs find & getstripe support 09/34909/16
Vitaly Fertman [Mon, 3 Jun 2019 16:34:05 +0000 (19:34 +0300)]
LU-10070 utils: SEL: lfs find & getstripe support

The support includes:
- add --extension-size option to lfs find & getstripe along
  with +/- functionality;
- do not take the extension components into account for
  lfs find --stripe-size and --stripe-count;
- add appropriate tests;

Cray-bug-id: LUS-2528
Signed-off-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ic3ad4c713e8c676998cf7d02b524ba266c992924
Reviewed-on: https://review.whamcloud.com/34909
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-8066 obd_type: use typ_kobj.name as typ_name 17/34717/7
NeilBrown [Thu, 27 Jun 2019 21:55:26 +0000 (17:55 -0400)]
LU-8066 obd_type: use typ_kobj.name as typ_name

As the kobject has a name (after kobject_add has been called),
we don't need to also store it in typ_name.
So use typ_kobj.name instead of typ_name.

This requires changing some "char *" to "const char *" as
typ_kobj.name is const.

Change-Id: Iaf0ef192e91ba1b4bd1c1b124dc1068de632d341
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-on: https://review.whamcloud.com/34717
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Petros Koutoupis <pkoutoupis@cray.com>
14 months agoLU-12101 socklnd: fix infinite loop in ksocknal_push() 99/34499/4
NeilBrown [Thu, 27 Jun 2019 15:18:36 +0000 (11:18 -0400)]
LU-12101 socklnd: fix infinite loop in ksocknal_push()

If the list_for_each_entry() loop in ksocknal_push()
ever finds a match, then it will increment 'i', and the outer
loop will continue.

Once peer_off becomes larger than the number of matches
in a given chain, 'peer_ni' will be an invalid pointer, and
ksocknal_push_peer() will probably crash when called on it.

To abort the outer loop properly, we need to test if
"i <= peer_off", which indicates that all matching peers
have been found.

This bug can easily be reproduced by running
  lctl --net tcp push
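
The termination condition can be illustrated with a simplified,
userspace model of one hash chain. Everything here is a hypothetical
stand-in: an int array replaces the peer list, and a counter replaces
ksocknal_push_peer(). It is a sketch of the loop shape described above,
not the socklnd code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one peer hash chain: match[k] != 0 means
 * entry k matches the requested id. Walks the chain looking for the
 * peer_off-th match and returns how many matches were counted ('i'). */
int count_until_nth_match(const int *match, size_t n, int peer_off)
{
	int i = 0;
	size_t k;

	for (k = 0; k < n; k++) {
		if (!match[k])
			continue;
		if (i++ == peer_off)
			break;	/* found the match to push this round */
	}
	return i;
}

/* Push every matching peer on the chain once; returns the number pushed. */
int push_chain(const int *match, size_t n)
{
	int pushed = 0;
	int peer_off;

	assert(match != NULL || n == 0);
	for (peer_off = 0; ; peer_off++) {
		int i = count_until_nth_match(match, n, peer_off);

		/* "i <= peer_off" means this round found no new match.
		 * A weaker test such as "i == 0" (assumed here only for
		 * contrast) would never fire once the chain has a match. */
		if (i <= peer_off)
			break;
		pushed++;	/* stands in for ksocknal_push_peer() */
	}
	return pushed;
}
```

With a chain holding two matches, the loop pushes twice and stops on
the third round, when the walk counts only two matches for
peer_off == 2.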

Signed-off-by: Mr NeilBrown <neilb@suse.com>
Change-Id: I9468214c7e1a0154213586cac0deb61afaa1d53d
Reviewed-on: https://review.whamcloud.com/34499
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Sonia Sharma <sharmaso@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: Repeated components 86/33786/28
Patrick Farrell [Thu, 6 Jun 2019 16:34:21 +0000 (19:34 +0300)]
LU-10070 lod: SEL: Repeated components

This changes behavior when there is no next component to
spill over to.  Currently, in that case, we just extend
the current component regardless of available space.

Now, if there is no following component, we try repeating
the current component, creating a new component using the
current one as a striping template.  We try assigning
striping for this component.  If there is sufficient free
space on the OSTs chosen for this component, it is
instantiated and i/o continues there.

If there is not sufficient space on the OSTs chosen for the
new component, we remove it & extend the current component.
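
The decision described above can be summarized in a small sketch. The
enum and function names are hypothetical, and the two boolean inputs
stand in for checks the server would perform; this is an outline of the
described policy, not the LOD implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes when i/o reaches the end of the current component. */
enum sel_action_sketch {
	SEL_SPILL_TO_NEXT,	/* a following component exists: use it */
	SEL_REPEAT_COMPONENT,	/* no following component; the OSTs chosen
				 * for the repeated component have space */
	SEL_EXTEND_CURRENT,	/* no following component; the chosen OSTs
				 * are short on space, so extend in place */
};

/* Pick an action from the two conditions named in the commit message. */
enum sel_action_sketch sel_choose_action(bool has_next_component,
					 bool new_osts_have_space)
{
	if (has_next_component)
		return SEL_SPILL_TO_NEXT;
	if (new_osts_have_space)
		return SEL_REPEAT_COMPONENT;
	return SEL_EXTEND_CURRENT;
}
```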

This is a behavioral improvement, with no implications for
layout sanity checking.

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: If9f364b4105a4bb892dfe673c724e04781c46336
Reviewed-on: https://review.whamcloud.com/33786
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: Add FLR support 85/33785/24
Patrick Farrell [Tue, 11 Dec 2018 19:34:27 +0000 (13:34 -0600)]
LU-10070 lod: SEL: Add FLR support

Add FLR support for self-extending layouts.

The basic model is that when a layout intent would modify
an FLR replica, we first run the extent_update code to
perform any layout extent changes for self extending
layouts.

This treats the FLR operations (stale, resync, etc)
similarly to i/o, creating initialized layout where those
operations need it.  This makes the interaction between SEL
and FLR fairly simple.

Add FLR tests for self-extending layouts

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ia23df8e226955f64e9b19df993b66d2d4f820f33
Reviewed-on: https://review.whamcloud.com/33785
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: Layout sanity checking 84/33784/25
Patrick Farrell [Mon, 3 Jun 2019 17:43:16 +0000 (20:43 +0300)]
LU-10070 lod: SEL: Layout sanity checking

Add layout sanity checking for self-extending layouts.

This requires a more complex method of checking layouts,
checking the entire layout rather than just individual
components against their immediate neighbors.  This is
implemented with a layout sanity callback which walks the
layout.

Incorporate mirror sanity checks from lfs.c.

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I960a4ce96ace54f7fe4305b9197e27c540f81211
Reviewed-on: https://review.whamcloud.com/33784
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: Implement basic spillover space 83/33783/24
Patrick Farrell [Mon, 3 Jun 2019 16:24:29 +0000 (19:24 +0300)]
LU-10070 lod: SEL: Implement basic spillover space

This is a barebones implementation of spillover space.
This allows the creation of extendable layout components,
which are normal layout components followed by "extension
components".  These extension components are never
initialized, instead, when i/o reaches them, the server
checks if there is sufficient space on the preceding normal
layout component, and if so, it modifies the extent of the
component to give space to the preceding component.

If there is not sufficient space on those OSTs, the special
extension space component can be removed, and the next
component of the layout is moved down to meet the existing
component.  This allows i/o to "spill over" to this new
layout component, which is expected to be on different
OSTs.

For multi-tiered systems, this makes it possible to avoid
the situation where an inner tier is low on space, but
an outer tier has plenty, and PFL files cannot use the
space in the outer tier because the inner is full.

This patch requires the next patch in the series for FLR
support, but does not depend on the other subsequent
patches in this series.

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I8f6c6df8ee155033d5278535dc456e604552e409
Reviewed-on: https://review.whamcloud.com/33783
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: Add flag & setstripe support 82/33782/22
Patrick Farrell [Mon, 3 Jun 2019 16:22:09 +0000 (19:22 +0300)]
LU-10070 lod: SEL: Add flag & setstripe support

The self-extending layouts feature adds a new layout flag
and also uses the stripe size field differently.

This patch implements this basic functionality, to be used
in subsequent patches.

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I4392b70266cbab5bc8fa42afc3c360b954d5918a
Reviewed-on: https://review.whamcloud.com/33782
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 lod: SEL: split lod_del_layout 80/33780/22
Patrick Farrell [Wed, 19 Jun 2019 21:29:12 +0000 (00:29 +0300)]
LU-10070 lod: SEL: split lod_del_layout

SEL deletes layout components as part of other operations,
rather than only as a separate delete operation.

So we split lod_layout_del into a function that prepares
the layout and one that writes it out.  The prep_layout
function will be used in later patches.

Cray-bug-id:  LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I5d7270db9a8d9bc94f4571906ed9e2d4a17a151b
Reviewed-on: https://review.whamcloud.com/33780
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-10070 tests: New test-framework functionality 78/33778/21
Patrick Farrell [Thu, 13 Jun 2019 22:30:24 +0000 (01:30 +0300)]
LU-10070 tests: New test-framework functionality

The self-extending layout tests will make heavy use of
setting OST low & high watermarks to simulate low/out of
space conditions.  To this end, add improved ways of
working with these to the test framework and use them in
sanity 253.

Add a component-count helper in sanity-pfl.

Fix pool_add_targets so it can add only 1 target.

Also move one helper from sanity to test-framework so it
can be used from sanity-pfl.

Cray-bug-id: LUS-2528
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I4e75c7db07b201ff2c410734d5daa991e74bd5c1
Reviewed-on: https://review.whamcloud.com/33778
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11762 ldlm: don't exceed hard timeout 08/34408/9
James Simmons [Thu, 4 Jul 2019 16:47:09 +0000 (12:47 -0400)]
LU-11762 ldlm: don't exceed hard timeout

For recovery, Lustre has both a soft timeout, obd_recovery_timeout,
and a hard timeout, obd_recovery_time_hard. When the recovery
timer is adjusted with extend_recovery_timer(), the caller
can control whether it takes into account what is left of the
timer. The current code is not very clear about its intent, so this
patch attempts to make the code understandable. No functional
change should happen with this patch.
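
As a rough illustration of the soft/hard relationship described above,
here is a minimal userspace sketch. It is not the Lustre code: the
struct, its fields, and extend_timer() are hypothetical names, and the
only point is that an extension may or may not account for the time
still left, and is always capped by the hard limit.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of the recovery window (seconds). */
struct recovery_timer {
	long start;	/* when recovery started */
	long soft;	/* current soft timeout, measured from start */
	long hard;	/* hard limit, measured from start */
};

/*
 * Ask for 'timeout' seconds.
 * extend == true:  'timeout' is counted from 'now', so only the part
 *                  that exceeds the time still left is added.
 * extend == false: 'timeout' is counted from the start of recovery.
 * Either way the result never exceeds the hard limit.
 * Returns the new soft timeout.
 */
static long extend_timer(struct recovery_timer *t, long now,
			 long timeout, bool extend)
{
	long end = t->start + t->soft;
	long left = end > now ? end - now : 0;
	long to = t->soft;

	if (extend) {
		if (timeout > left)
			to += timeout - left;
	} else if (timeout > to) {
		to = timeout;
	}

	if (to > t->hard)
		to = t->hard;	/* never exceed the hard timeout */

	t->soft = to;
	return to;
}
```

With start 0, soft 100 and hard 300, a request at time 50 for 30 more
seconds changes nothing because 50 seconds are still left, while a
request for 80 seconds grows the soft timeout by the missing 30.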

Change-Id: I5701a6cd813ad64b6b4422863767af135eb8e94b
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34408
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12491 obdclass: add comment for rcu handling in lu_env_remove 47/35447/4
James Simmons [Mon, 8 Jul 2019 20:47:40 +0000 (16:47 -0400)]
LU-12491 obdclass: add comment for rcu handling in lu_env_remove

During the review, the reason the RCU lock was dropped in
lu_env_remove() was discussed, but the code itself doesn't explain it.
Add a comment giving the details of why RCU locking is not needed.

Test-parameters: trivial

Change-Id: I4fd761d2e1b4adad8e970904d56cdcd057dfe7d5
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/35447
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11518 osc: cancel osc_lock list traversal once found the lock is being used 96/35396/4
Gu Zheng [Mon, 24 Jun 2019 05:51:20 +0000 (13:51 +0800)]
LU-11518 osc: cancel osc_lock list traversal once found the lock is being used

Currently, osc_ldlm_weigh_ast walks the osc_lock list (oo_ol_list)
to check whether the target dlm lock is being used. Once a match is
found it should skip the remaining entries and stop the traversal,
but it doesn't. Fix it here.
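
The intended pattern can be sketched in plain C as below. This is only
an illustration under assumptions: struct lock_entry and
lock_is_in_use() are hypothetical stand-ins for the osc_lock list, and
the visited counter exists only to make the early exit observable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an osc_lock entry on an object's list. */
struct lock_entry {
	const void *dlm_lock;		/* the DLM lock this entry wraps */
	bool in_use;			/* true while I/O holds the lock */
	struct lock_entry *next;
};

/*
 * Return true if 'target' is in use by any entry on the list.
 * '*visited' counts the entries examined, to show that the walk
 * stops at the first match instead of scanning the whole list.
 */
static bool lock_is_in_use(const struct lock_entry *head,
			   const void *target, size_t *visited)
{
	bool found = false;

	*visited = 0;
	for (const struct lock_entry *e = head; e != NULL; e = e->next) {
		(*visited)++;
		if (e->dlm_lock == target && e->in_use) {
			found = true;
			break;	/* nothing more to learn from the rest */
		}
	}
	return found;
}
```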

Change-Id: I2e64d2938cdacb6c5baca73647d74c9fb8f54f8c
Fixes: 3f3a24dc5d7d ("LU-3259 clio: cl_lock simplification")
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/35396
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11213 uapi: change "space" hash type to hash flag 18/35318/3
Lai Siyao [Fri, 21 Jun 2019 06:47:42 +0000 (14:47 +0800)]
LU-11213 uapi: change "space" hash type to hash flag

Change LMV_HASH_TYPE_SPACE to LMV_HASH_FLAG_SPACE to make directory
layout inheritance more flexible in the future. It is still exposed
to the user as hash type "space" in the "lfs setdirstripe" command to
keep it easy to understand.
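
The difference between a hash type and a hash flag can be sketched as
follows. The macro names and numeric values here are illustrative
assumptions, not copied from the Lustre headers; the point is that a
flag bit can be combined with any type value, whereas a type value
excludes all other types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; not taken from the Lustre uapi headers. */
#define HASH_TYPE_MASK	0x0000ffffu	/* low bits: exactly one hash type */
#define HASH_FLAG_SPACE	0x08000000u	/* high bit: independent modifier */

enum hash_type {
	HASH_ALL_CHARS = 1,
	HASH_FNV_1A_64 = 2,
};

/* Extract the hash type, ignoring any flag bits. */
static uint32_t hash_type_of(uint32_t hash)
{
	return hash & HASH_TYPE_MASK;
}

/* Test for the "space" flag regardless of the hash type. */
static bool hash_is_space(uint32_t hash)
{
	return (hash & HASH_FLAG_SPACE) != 0;
}

/* Add the "space" flag while keeping the hash type intact. */
static uint32_t hash_set_space(uint32_t hash)
{
	return hash | HASH_FLAG_SPACE;
}
```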

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: Ifa302204ed62dff8cc9d12fdc1f9ea86f8491d40
Reviewed-on: https://review.whamcloud.com/35318
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Hongchao Zhang <hongchao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11264 llapi: clean up llapi_search_tgt() code 92/35092/5
Andreas Dilger [Fri, 7 Jun 2019 03:39:36 +0000 (21:39 -0600)]
LU-11264 llapi: clean up llapi_search_tgt() code

Clean up llapi_search_tgt() and helper functions llapi_search_ost()
and llapi_search_mdt() to set errno on return.

Add man pages for all of these functions.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ieb2e93208fbc1b1492f632d8ce1383ca9fdec5f2
Reviewed-on: https://review.whamcloud.com/35092
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Arshad Hussain <arshad.super@gmail.com>
Reviewed-by: Emoly Liu <emoly@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11285 mdt: improve IBITS lock definitions 45/35045/6
Andreas Dilger [Mon, 3 Jun 2019 18:21:53 +0000 (12:21 -0600)]
LU-11285 mdt: improve IBITS lock definitions

Move MDS_INODELOCK_* flags into a named enum, and add the definitions
for the newer flags into wirecheck/wiretest to ensure consistency.

Rename MDS_INODELOCK_MAXSHIFT to MDS_INODELOCK_NUMBITS to hold current
number of lockbits, rather than one less than the number of lockbits,
since the only two places that use it expect it to be one larger than
it is.  Fix uses of MDS_INODELOCK_NUMBITS to be number of locks.  This
does not change the value of MDS_INODELOCK_FULL, which is used in the
protocol to exchange supported lock bits between client and server.
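
A small sketch of the MAXSHIFT versus NUMBITS distinction, using made-up
lock bits (the LOCK_* names and the count of four are assumptions for
illustration only). Both conventions yield the same "full" mask, but
code that loops over the bits reads more naturally with a count than
with the index of the highest bit.

```c
#include <stdint.h>

/* Illustrative lock bits; the names and the count are assumptions. */
enum inode_lock_bits {
	LOCK_LOOKUP = 1u << 0,
	LOCK_UPDATE = 1u << 1,
	LOCK_OPEN   = 1u << 2,
	LOCK_LAYOUT = 1u << 3,
};

#define LOCK_MAXSHIFT	3	/* old style: index of the highest bit */
#define LOCK_NUMBITS	4	/* new style: number of bits defined */

/* The full mask is identical under either convention. */
#define LOCK_FULL_OLD	((1u << (LOCK_MAXSHIFT + 1)) - 1)
#define LOCK_FULL_NEW	((1u << LOCK_NUMBITS) - 1)

/* Count the locks set in 'bits' by visiting every defined bit. */
static int count_lock_bits(uint32_t bits)
{
	int n = 0;

	for (int i = 0; i < LOCK_NUMBITS; i++)
		if (bits & (1u << i))
			n++;
	return n;
}
```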

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I0c2985bcc602b7182d5db2cf8d590923be2cab07
Reviewed-on: https://review.whamcloud.com/35045
Reviewed-by: Patrick Farrell <pfarrell@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12491 obdclass: use RCU to release lu_env_item 38/35038/5
Alex Zhuravlev [Mon, 3 Jun 2019 02:52:42 +0000 (05:52 +0300)]
LU-12491 obdclass: use RCU to release lu_env_item

as rhashtable_lookup_fast() is lockless and can
find just released objects.

Fixes: aa82cc8361 ("obdclass: put all service's env on the list")
Change-Id: I6ed8ccc5bb5b192eed90b55103d11b822ec90692
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35038
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Neil Brown <neilb@suse.com>
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11893 lnet: consoldate secondary IP address handling 93/34993/8
James Simmons [Tue, 2 Jul 2019 13:11:15 +0000 (09:11 -0400)]
LU-11893 lnet: consoldate secondary IP address handling

The last piece of code with broken secondary IP address
support is lnet_parse_ip2nets(). We could fix it the way
o2iblnd and socklnd were fixed, but since the LND drivers
have already resolved those issues, we can instead move the
handling out of the LND drivers into one place in the LNet core.
To do this we introduce struct lnet_inetdev, which is
a collection of the data that the current LNet layer requires.
The new function lnet_inet_enumerate() is used to collect
this information.

Change-Id: I0c532caa3cf6b2178eb1ab65e55e5883d408a185
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/34993
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-1732 utils: allow ldiskfs wide striping with ea_inode 15/4315/16
Patrick Farrell [Sat, 25 May 2019 13:48:06 +0000 (09:48 -0400)]
LU-1732 utils: allow ldiskfs wide striping with ea_inode

Format the MDT filesystem with the "ea_inode" option by default
so that files can have more than 160 stripes, and large xattrs
over one filesystem block in size (normally 4096 bytes).

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I71589c13e59a9d13db3bf075282cf6334b86be30
Reviewed-on: https://review.whamcloud.com/4315
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12462 llite: Remove old fsync versions 39/35339/2
Patrick Farrell [Thu, 27 Jun 2019 15:41:29 +0000 (11:41 -0400)]
LU-12462 llite: Remove old fsync versions

The old two-arg and three-arg versions of fsync were last
used in 2.6.3X kernels, and we no longer support those.

Remove them to clean up our fsync defines, and extend the debug
message to capture all the arguments of the current fsync.

Signed-off-by: Patrick Farrell <pfarrell@whamcloud.com>
Change-Id: I3ab18fa5a7a4a6d3b0714570a8ff3f2ad820e5ad
Reviewed-on: https://review.whamcloud.com/35339
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Mike Pershin <mpershin@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Simmons <jsimmons@infradead.org>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-9971 lnet: fix peer ref counting 46/35446/2
Amir Shehata [Mon, 8 Jul 2019 19:51:05 +0000 (12:51 -0700)]
LU-9971 lnet: fix peer ref counting

Exit from the loop after the peer ref count has been incremented
to avoid a wrong ref count.

The code makes sure that a peer is queued for discovery at most
once if discovery is disabled. This is done to use discovery
as a standard ping for gateways which do not have the discovery
feature or have discovery disabled.

Signed-off-by: Amir Shehata <ashehata@whamcloud.com>
Change-Id: I2cc4c8f9d780f5c438d9b51bb2d1106fec553f39
Reviewed-on: https://review.whamcloud.com/35446
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoRevert "LU-11760 ofd: formatted OST recognition change" 88/35388/4
Andreas Dilger [Sun, 30 Jun 2019 07:54:13 +0000 (07:54 +0000)]
Revert "LU-11760 ofd: formatted OST recognition change"

This is causing conf-sanity test_69 failures in LU-12404 due to
the increased limit on the number of objects precreated after
recovery.  The issue will be fixed in a different way.

This reverts commit d07d9c5ed0aa1d6614944c7d1e0ca55cba301dc4.

Change-Id: I437889f20699207fa15eff6685b0992292555f19
Reviewed-on: https://review.whamcloud.com/35388
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12494 kernel: kernel update SLES12 SP4 [4.12.14-95.19.1] 91/35391/2
Jian Yu [Sun, 30 Jun 2019 07:01:18 +0000 (00:01 -0700)]
LU-12494 kernel: kernel update SLES12 SP4 [4.12.14-95.19.1]

Update SLES12 SP4 kernel to 4.12.14-95.19.1 for Lustre client.

Test-Parameters: trivial clientdistro=sles12sp4 \
envdefinitions=LNET_SELFTEST_EXCEPT=smoke,SANITY_EXCEPT=103a

Change-Id: I6a101dc2637945192cf8aca661e23c3bccb47609
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35391
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12458 kernel: kernel update RHEL7.6 [3.10.0-957.21.3.el7] 68/35268/5
Jian Yu [Wed, 19 Jun 2019 20:19:53 +0000 (13:19 -0700)]
LU-12458 kernel: kernel update RHEL7.6 [3.10.0-957.21.3.el7]

Update RHEL7.6 kernel to 3.10.0-957.21.3.el7.

Test-Parameters: clientdistro=el7.6 serverdistro=el7.6

Change-Id: I10c5708a412022e6066ff7ce2801375049f188d9
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35268
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12456 kernel: kernel update RHEL 8.0 [4.18.0-80.4.2.el8_0] 67/35267/4
Jian Yu [Wed, 19 Jun 2019 18:42:53 +0000 (11:42 -0700)]
LU-12456 kernel: kernel update RHEL 8.0 [4.18.0-80.4.2.el8_0]

Update RHEL 8.0 kernel to 4.18.0-80.4.2.el8_0 for Lustre client.

Change-Id: I1ff433f6ef3433dae54def0e89bc035d25ff15a4
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35267
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12474 tests: Do not run check_progs_installed for racer 27/35327/2
Oleg Drokin [Wed, 26 Jun 2019 02:22:23 +0000 (22:22 -0400)]
LU-12474 tests: Do not run check_progs_installed for racer

It's run from within racer, so racer is already there for sure.

Change-Id: Ifd78cd051842c9663130b650c6e35d60332250e7
Test-Parameters: testlist=racer
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35327
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
14 months agoLU-9792 tests: t-f do not create empty files for ZFS 05/34105/7
Alex Zhuravlev [Thu, 24 Jan 2019 13:24:26 +0000 (16:24 +0300)]
LU-9792 tests: t-f do not create empty files for ZFS

as zpool doesn't like empty device files.

Change-Id: I3a224bf2e60c6f20e13013caf827fe29641a8c5c
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/34105
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Nathaniel Clark <nclark@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12395 build: build mpitests for el8 74/35374/9
Minh Diep [Fri, 28 Jun 2019 21:28:39 +0000 (14:28 -0700)]
LU-12395 build: build mpitests for el8

RHEL8 has rpm-mpi-hooks, which requires binaries
to be in a specific MPI bin directory to generate the
correct requires.

See https://fedoraproject.org//wiki/Changes/RpmMPIReqProv
and https://fedoraproject.org/wiki/Packaging:MPI

Test-Parameters: trivial clientdistro=el8 serverdistro=el7.6 testgroup=regression-mpi

Change-Id: Id9fa50e15b48b9da846083b9e9cd894ad1eac967
Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35374
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12483 tests: fix sanity test 60h running conditions 55/35355/7
Oleg Drokin [Fri, 28 Jun 2019 01:59:37 +0000 (21:59 -0400)]
LU-12483 tests: fix sanity test 60h running conditions

The test is supposed to run in DNE mode on 2.12.52 or above,
but the conditions are somehow reversed.

Change-Id: I322941a6098b0dbfbabe2f5c70f40f8e81d1bbab
Test-Parameters: trivial
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/35355
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: James Nunez <jnunez@whamcloud.com>
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: Lai Siyao <lai.siyao@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
14 months agoRevert "LU-12328 flr: preserve last read mirror" 50/35450/2
Oleg Drokin [Tue, 9 Jul 2019 17:27:41 +0000 (17:27 +0000)]
Revert "LU-12328 flr: preserve last read mirror"

This is causing somewhat frequent crashes tracked in
LU-12525

This reverts commit 810f2a5fef577b4f0f6a58ab234cf29afd96c748.

Change-Id: If7604ad4ca1d4ddc63a20fa2ec7d9467ee7bb5f9
Reviewed-on: https://review.whamcloud.com/35450
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12495 obdclass: generate random u64 max correctly 94/35394/5
Lai Siyao [Fri, 28 Jun 2019 17:36:27 +0000 (01:36 +0800)]
LU-12495 obdclass: generate random u64 max correctly

Generate random u64 max number correctly, and make it an obdclass
function lu_prandom_u64_max().

Fixes: 7a707d4828 (libcfs: replace cfs_rand() with prandom_u32_max())

Signed-off-by: Lai Siyao <lai.siyao@whamcloud.com>
Change-Id: I2b94a42b42539be319f358d7af2a82dc8b26117c
Reviewed-on: https://review.whamcloud.com/35394
Tested-by: jenkins <devops@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-9859 libcfs: remove libcfs_debug_vmsg2 25/35225/3
NeilBrown [Thu, 13 Jun 2019 19:10:21 +0000 (15:10 -0400)]
LU-9859 libcfs: remove libcfs_debug_vmsg2

Now that libcfs_debug_vmsg2 has no (external) users, we can remove it.
It is used to implement libcfs_debug_msg(), so simply move
the body of the function (suitably modified) into that one caller.

Linux-commit: d42a3aded317c97594c19995879999428de53c46

Signed-off-by: NeilBrown <neilb@suse.com>
Change-Id: I80d24abcc23a8f6e2f8b995d1337ba5038318d5a
Reviewed-on: https://review.whamcloud.com/35225
Reviewed-by: Shaun Tancheff <stancheff@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12328 flr: preserve last read mirror 11/35111/2
Jinshan Xiong [Sat, 8 Jun 2019 05:34:03 +0000 (22:34 -0700)]
LU-12328 flr: preserve last read mirror

This patch preserves the mirror that has been read successfully
so that all subsequent I/O can take advantage of it and avoid
trying to read unavailable OSTs.

Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: I806f936340db7c73228048edf21d5ecbed4b3c6c
Reviewed-on: https://review.whamcloud.com/35111
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-9971 lnet: use after free in lnet_discover_peer_locked() 44/28944/5
Olaf Weber [Tue, 12 Sep 2017 12:07:50 +0000 (14:07 +0200)]
LU-9971 lnet: use after free in lnet_discover_peer_locked()

When the lnet_net_lock is unlocked, the peer attached to an
lnet_peer_ni (found via lnet_peer_ni::lpni_peer_net->lpn_peer)
can change, and the old peer deallocated. If we are really
unlucky, then all the churn could give us a new, different,
peer at the same address in memory.

Change the reference counting on the lnet_peer lp so that it
is guaranteed to be alive when we relock the lnet_net_lock for
the cpt. When the reference count is dropped lp may go away if
it was unlinked, but the new peer is guaranteed to have a
different address, so we can still correctly determine whether
the peer changed and discovery should be redone.

Signed-off-by: Olaf Weber <olaf.weber@hpe.com>
Change-Id: Ia44dce20074b27ec0e77d7c1908c6a44ec73d326
Reviewed-on: https://review.whamcloud.com/28944
Reviewed-by: Amir Shehata <ashehata@whamcloud.com>
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-930 doc: update MAINTAINERS file 41/35241/4
Andreas Dilger [Mon, 17 Jun 2019 03:23:50 +0000 (05:23 +0200)]
LU-930 doc: update MAINTAINERS file

Update Patrick's email address, and remove John Hammond from the list.

Add some existing files to existing subsystems, and add a few new
subsystems for recently-landed code.

Re-order a couple of the entries to be in alphabetical order.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I297ec31abf65d54dc363b5d5fd460b7b3e3ebbe5
Reviewed-on: https://review.whamcloud.com/35241
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-12045 tests: honor EXCEPT tests when using ONLY list 38/34938/9
James Nunez [Wed, 22 May 2019 16:22:19 +0000 (10:22 -0600)]
LU-12045 tests: honor EXCEPT tests when using ONLY list

The Lustre test framework allows a user to specify a subset
of tests to run using the ONLY parameter or --only flag.
The test framework also allows the user to specify a list of
tests to skip using the EXCEPT or ALWAYS_EXCEPT parameters.
By default, if the ONLY parameter or --only flag is used,
the EXCEPT and ALWAYS_EXCEPT lists are ignored.

Add a flag to auster, -H, and an environment variable,
HONOR_EXCEPT, to skip the tests on the ALWAYS_EXCEPT,
EXCEPT and SLOW lists when using the ONLY/--only parameter.
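
The selection rule described above can be written as a small decision
function. The real logic lives in the shell test framework; this C
sketch only models it, and should_run() and its parameters are
hypothetical names. The 'excluded' argument stands for a test that
would currently be skipped via the ALWAYS_EXCEPT, EXCEPT or SLOW lists.

```c
#include <stdbool.h>

/*
 * Decide whether a test should run.
 *   only_set     - an ONLY list (or --only) was given
 *   in_only      - the test is on the ONLY list
 *   excluded     - the test would be skipped by ALWAYS_EXCEPT,
 *                  EXCEPT or SLOW
 *   honor_except - HONOR_EXCEPT (auster -H) was requested
 */
static bool should_run(bool only_set, bool in_only,
		       bool excluded, bool honor_except)
{
	if (only_set) {
		if (!in_only)
			return false;
		/* By default ONLY overrides the exclusion lists. */
		return !(honor_except && excluded);
	}
	return !excluded;
}
```

For example, a test named in both ONLY and EXCEPT runs by default, but
is skipped once honor_except is set.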

Test-Parameters: trivial
Test-Parameters: envdefinitions=ONLY="40-43" testlist=sanity
Test-Parameters: envdefinitions=ONLY="40-43" austeroptions=-H testlist=sanity
Test-Parameters: envdefinitions=SLOW="no",ONLY="27" testlist=sanity
Test-Parameters: envdefinitions=SLOW="no",ONLY="27" austeroptions=-H testlist=sanity
Test-Parameters: envdefinitions=SLOW="yes",ONLY="27" testlist=sanity
Test-Parameters: envdefinitions=SLOW="yes",ONLY="27" austeroptions=-H testlist=sanity
Signed-off-by: James Nunez <jnunez@whamcloud.com>
Change-Id: I173e48e1d2dc3b404d148146639a13148bc48a3d
Reviewed-on: https://review.whamcloud.com/34938
Tested-by: Jenkins
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Jian Yu <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
14 months agoLU-11518 ptlrpc: don't reset lru_resize on idle reconnect 85/35285/7
Andriy Skulysh [Tue, 11 Jun 2019 14:44:32 +0000 (17:44 +0300)]
LU-11518 ptlrpc: don't reset lru_resize on idle reconnect

ptlrpc_disconnect_idle_interpret() clears imp_remote_handle,
so reconnect has pcaa_initial_connect set to 1.

Update only changed ns_connect_flags bits.

Fixes: 5a6ceb664f0 ("LU-7236 ptlrpc: idle connections can disconnect")
Change-Id: I2368708b6381c1d772c47dc6e61c8fb39a14a2cc
Cray-bug-id: LUS-7471
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Alexander Boyko <c17825@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-on: https://review.whamcloud.com/35285
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>