Whamcloud - gitweb
fs/lustre-release.git
3 years ago LU-9682 nodemap: delete nids range from nodemap correctly 22/28922/4
Emoly Liu [Fri, 15 Sep 2017 07:56:11 +0000 (15:56 +0800)]
LU-9682 nodemap: delete nids range from nodemap correctly

In function nodemap_del_range(), we should check whether the current
nodemap has the specified range before deleting it from the global
range tree.
Also, test_10b is added to sanity-sec.sh to verify this patch.
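
The ownership check described above can be sketched in plain C as follows.
This is only an illustration under stated assumptions: struct nid_range,
global_ranges, range_add(), range_del() and the error codes are hypothetical
stand-ins, not the actual nodemap code or its return values.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real Lustre structures. */
struct nid_range {
	unsigned int start_nid;
	unsigned int end_nid;
	int owner_id;	/* id of the nodemap that owns this range */
	int in_use;
};

#define MAX_RANGES 8

/* Stand-in for the global range tree shared by all nodemaps. */
struct nid_range global_ranges[MAX_RANGES];

/* Look up an exact range in the global table, or return NULL. */
struct nid_range *range_find(unsigned int start, unsigned int end)
{
	size_t i;

	for (i = 0; i < MAX_RANGES; i++) {
		struct nid_range *r = &global_ranges[i];

		if (r->in_use && r->start_nid == start && r->end_nid == end)
			return r;
	}
	return NULL;
}

/* Register a range for the given nodemap in the first free slot. */
int range_add(int nodemap_id, unsigned int start, unsigned int end)
{
	size_t i;

	assert(start <= end);
	for (i = 0; i < MAX_RANGES; i++) {
		struct nid_range *r = &global_ranges[i];

		if (!r->in_use) {
			r->start_nid = start;
			r->end_nid = end;
			r->owner_id = nodemap_id;
			r->in_use = 1;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Delete a range only if it belongs to the nodemap named by the caller. */
int range_del(int nodemap_id, unsigned int start, unsigned int end)
{
	struct nid_range *r = range_find(start, end);

	if (r == NULL)
		return -ENOENT;
	if (r->owner_id != nodemap_id)
		return -EINVAL;	/* range exists, but in another nodemap */
	r->in_use = 0;
	return 0;
}
```

Without the owner comparison, a delete request issued against one nodemap
could remove a range that was registered by a different one.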

Change-Id: Ibab79056509d14d52f99b1ebe3319c301dbe45d9
Test-Parameters: testlist=sanity-sec
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/28922
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9963 test: add parallel-scale tests to ALWAYS_EXCEPT 14/28914/4
dilip krishnagiri [Fri, 8 Sep 2017 18:30:57 +0000 (12:30 -0600)]
LU-9963 test: add parallel-scale tests to ALWAYS_EXCEPT

Add the following parallel-scale test to the ALWAYS_EXCEPT list:

parallel_grouplock :
       test_parallel_grouplock: test failed to respond and timed out
The associated Jira ticket LU-9429 is still open.

Test-Parameters: trivial testlist=parallel-scale

Change-Id: I25709af1ab49a30498a89e5369521582c5ab6cf8
Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Reviewed-on: https://review.whamcloud.com/28914
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Casper <jamesx.casper@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-8541 ldlm: don't use jiffies as sysfs parameter 70/28370/10
James Simmons [Thu, 31 Aug 2017 20:17:42 +0000 (16:17 -0400)]
LU-8541 ldlm: don't use jiffies as sysfs parameter

The ldlm sysfs file handles lru_max_age in jiffies, which is wrong
because jiffies are not consistent across machines: HZ is
configurable at compile time. In talking to users, most thought
lru_max_age was in seconds, which is incorrect. The best way to
fix this is to move lru_max_age to milliseconds, since most systems
Lustre deals with set HZ to 1000. To make it clear that the value is
in milliseconds, print lru_max_age with "ms". Since users tend
to think in seconds, allow passing in seconds as well as milliseconds,
and internally convert them to nanoseconds. Since we have to
support milliseconds, move to ktime_t, since we can't use time64_t.
Unfortunately, this makes a relatively large patch, but I could
not find a way to split it up further without breaking the atomicity
of the change.
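
To illustrate why a value expressed in jiffies is ambiguous, here is a small
userspace sketch. The helper names (ticks_to_ms, secs_to_ns, ms_to_ns) are
hypothetical and only model the arithmetic; they are not the kernel or Lustre
functions touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a tick count to milliseconds for a given tick rate (HZ). */
uint64_t ticks_to_ms(uint64_t ticks, unsigned int hz)
{
	assert(hz > 0);
	return ticks * 1000ULL / hz;
}

/* Both accepted input units end up in the same internal unit (ns). */
int64_t secs_to_ns(int64_t secs)
{
	return secs * 1000000000LL;
}

int64_t ms_to_ns(int64_t ms)
{
	return ms * 1000000LL;
}

/* The same raw number means very different durations as HZ changes. */
void demo_ticks(void)
{
	assert(ticks_to_ms(10000, 1000) == 10000);	/* 10 s  */
	assert(ticks_to_ms(10000, 250) == 40000);	/* 40 s  */
	assert(ticks_to_ms(10000, 100) == 100000);	/* 100 s */
	assert(secs_to_ns(65) == ms_to_ns(65000));
}
```

A value stored in milliseconds (or nanoseconds internally) keeps the same
meaning regardless of how the kernel was configured.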

Change-Id: I0b1814fd9d903767f62fe141d2c95845b75fb95a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28370
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9802 pfl: swapping lcm_entry_count correctly 56/28256/4
Jinshan Xiong [Thu, 27 Jul 2017 16:49:42 +0000 (09:49 -0700)]
LU-9802 pfl: swapping lcm_entry_count correctly

It's a u16 integer so it should use le16_to_cpu() instead of
le32_to_cpu().
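
The effect of using the wrong width can be modelled in portable C. The sketch
below simulates a big-endian host with hand-written byte swaps; swap16,
swap32 and the other helper names are hypothetical and stand in for what the
kernel macros would do on such a host.

```c
#include <assert.h>
#include <stdint.h>

/* Plain byte swaps, written out so the sketch runs on any host. */
uint16_t swap16(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

uint32_t swap32(uint32_t v)
{
	return ((v & 0xffu) << 24) | ((v & 0xff00u) << 8) |
	       ((v >> 8) & 0xff00u) | (v >> 24);
}

/* What a big-endian host sees when it loads a little-endian u16 field. */
uint16_t be_host_raw_load(uint16_t on_disk_value)
{
	return swap16(on_disk_value);
}

/* Models a 16-bit conversion on a big-endian host. */
uint16_t convert_with_16(uint16_t raw)
{
	return swap16(raw);
}

/* Models a 32-bit conversion whose result is stored back into a u16. */
uint16_t convert_with_32(uint16_t raw)
{
	return (uint16_t)swap32(raw);
}

/* Walk through the example: an entry count of 3 stored little-endian. */
void demo_entry_count(void)
{
	uint16_t raw = be_host_raw_load(3);	/* bytes 03 00 seen as 0x0300 */

	assert(convert_with_16(raw) == 3);	/* correct */
	assert(convert_with_32(raw) == 0);	/* count silently lost */
}
```

On a little-endian host both conversions are no-ops, so a mistake like this
would only show up on big-endian machines.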

Test-Parameters: trivial
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I43c31a76d78aa294a3e3296a1bb69f4d6fb1423d
Reviewed-on: https://review.whamcloud.com/28256
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years ago LU-8066 obd: make ldebugfs_remove recursive 18/28818/7
Oleg Drokin [Tue, 12 Sep 2017 14:24:28 +0000 (10:24 -0400)]
LU-8066 obd: make ldebugfs_remove recursive

ldebugfs_remove is usually called on directories with files passed in
as attributes, so a simple debugfs_remove fails on them because they
are not empty. Switch to debugfs_remove_recursive.

This fixes a number of problems where a new filesystem is mounted after
being unmounted first.

Linux-commit: 6a491f2b80f2806221ba3a5a3e26fbe945f82d83

Change-Id: I49878ea9e28365d7d834497e715eeee21e698eea
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28818
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9221 jobstats: Create a pid-based hash for jobid values 08/25208/26
Ben Evans [Wed, 1 Feb 2017 22:06:36 +0000 (16:06 -0600)]
LU-9221 jobstats: Create a pid-based hash for jobid values

Use cfs_hash_table to create a pid to jobID based mapping.
Change default behavior of JobIDs to default to procname_uid if
a suitable value cannot be found in the environment.

All entries older than RESCAN_INTERVAL seconds are refreshed
on access.
Items can be purged by writing to procfs_name.
Writing "" will remove all entries.
When purging the cache, items older than DELETE_INTERVAL are
deleted.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I22e9d73c4585d7c5496829bc20bce191304e0d58
Reviewed-on: https://review.whamcloud.com/25208
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9574 llite: pipeline readahead better with large I/O 88/27388/4
Jinshan Xiong [Thu, 1 Jun 2017 19:53:35 +0000 (12:53 -0700)]
LU-9574 llite: pipeline readahead better with large I/O

Fix a bug where the next readahead is not set correctly when
the application issues large I/O.
Extend the readahead window length to at least cover the size of
the current I/O.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I43c5e4f25ea30d4a36263db2588bde0401122990
Reviewed-on: https://review.whamcloud.com/27388
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-3308 tests: fix sanity/sanityn test_mkdir() usage 12/26212/18
Andreas Dilger [Tue, 21 Mar 2017 03:45:28 +0000 (23:45 -0400)]
LU-3308 tests: fix sanity/sanityn test_mkdir() usage

Remove "-p" option from test_mkdir() calls that do not need it.
test_mkdir() has its own error checking, so no need for duplicate
error checking in the caller as well.

Clean up script style for tests related to test_mkdir() changes:
- use $(...) instead of `...` for subshells
- use $tdir and $tfile for test filenames
- add useful messages to error() calls
- replace use of $SETSTRIPE wrapper with $LFS setstripe
- remove trailing "===" from test names
- use tabs for indentation instead of spaces

Combine sanity test_99[a-f] into test_99 to avoid duplicate checks.

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I38d47f0c2e18aa20a0468f354ed88b740b3e17b8
Reviewed-on: https://review.whamcloud.com/26212
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9960 osd-zfs: don't auto-upgrade quota 24/28924/3
Nathaniel Clark [Mon, 11 Sep 2017 14:14:18 +0000 (10:14 -0400)]
LU-9960 osd-zfs: don't auto-upgrade quota

To preserve the ability to down-grade from 0.7.x to 0.6.x,
don't auto-upgrade quotas.
Print a warning if quotas haven't been upgraded when mounting with 0.7.0.
Check based on the zpool feature in sanity-quota instead of just the
version.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I2b0dcba3a230c9b2dec3d07d1b4ca6f1a1717d47
Reviewed-on: https://review.whamcloud.com/28924
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years ago LU-6179 llite: Implement ladvise lockahead 64/13564/102
Patrick Farrell [Thu, 14 Sep 2017 15:24:50 +0000 (10:24 -0500)]
LU-6179 llite: Implement ladvise lockahead

Ladvise lockahead is a new feature allowing userspace to
request extent locks in advance of the IO which will use
them. These locks are not expanded beyond the size
requested by userspace.  They are intended to make it
possible to address lock contention between multiple
clients resulting from lock expansion.  They should allow
optimizing various IO patterns, notably strided writing.
(Further information in LU-6179)

Asynchronous glimpse locks are a speculative version of
glimpse locks, and already implement the required behavior.
Lockahead requests share this behavior.

Additionally, lockahead creates extent locks in advance
of IO, and so breaks the assumption that the holder of the
highest lock knows the current file size.

So we also modify the ofd_intent_policy code to glimpse
PW locks until it finds one it knows to be in use, taking
care to send only one glimpse to each client.

The current patch allows asynchronous non-blocking lock
ahead requests and synchronous blocking requests.  We
cannot do asynchronous blocking requests, because of
deadlocks that occur in having ptlrpcd threads handle
blocking lock requests.

Finally, this patch also adds another advice to disable
lock expansion, setting a per-file descriptor flag.  This
allows user space to control whether or not lock requests
on this file descriptor will undergo lock expansion.

This means if lockahead locks are not created ahead of IO
(due to inherent raciness) or are cancelled by a competing
IO request, the IO requests that should have used the
manually requested locks will not result in expanded locks.
This avoids lock ping-pong, and because the resulting locks
will not extend to the end of the file, future lockahead
requests can be granted.  Effectively, this means that if
lockahead usage for strided IO is interrupted by a
competing request, it can re-assert itself.

lockahead is implemented via the ladvise interface from
userspace.  As lockahead results in a DLM lock request
rather than file advice, we do not use the lower levels of
the ladvise implementation.

Note this patch has one oddity:
Cray released an earlier version of lockahead without
FL_SPECULATIVE support.  That version uses
OBD_CONNECT_LOCKAHEAD_OLD, this new one uses
OBD_CONNECT_LOCKAHEAD.

The client code in this patch is interoperable with that
version, so it also advertises OBD_CONNECT_LOCKAHEAD_OLD
support, but the server version is not, so the server
advertises only OBD_CONNECT_LOCKAHEAD support.

Client support for the original lockahead is slated for
removal after the release of 2.12.  This is enforced with
a compile time version test that will remove support.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I1e80286f54946a0df08b19b1339829fcfd1117e7
Reviewed-on: https://review.whamcloud.com/13564
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9966 test: add a skip test to test_411 74/28974/5
Bob Glossman [Sun, 10 Sep 2017 15:15:32 +0000 (08:15 -0700)]
LU-9966 test: add a skip test to test_411

Since the recently added test_411 needs a /sys entry that doesn't exist
in sles12, extend the existing skip logic to skip the test when
the entry is missing.

Test-Parameters: trivial clientdistro=sles12sp2 \
  testgroup=review-ldiskfs

Change-Id: I1f1bf05affdc1cec9957624506dac65e59f5b4ad
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/28974
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years ago LU-9980 tests: save specific facet in save_lustre_params() 63/28963/2
Elena Gryaznova [Wed, 13 Sep 2017 05:58:07 +0000 (22:58 -0700)]
LU-9980 tests: save specific facet in save_lustre_params()

In save_lustre_params(), when multiple server facets share
the same host and the parameter has a wildcard, duplicate
parameters with wrong facets will be saved.

This patch fixes the above issue by grepping for the service name to
save the parameter with the specific facet.

Test-Parameters: clientcount=4 osscount=2 mdscount=2 mdtcount=1 \
austeroptions=-R failover=true iscsi=1 testlist=replay-vbr

Change-Id: Icba3fc532f4c67f02272c39e8e64d49325dad0e7
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/28963
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9888 tests: Update disk2_7-zfs.tar.bz2 for quota 20/28820/4
Nathaniel Clark [Thu, 31 Aug 2017 19:54:53 +0000 (15:54 -0400)]
LU-9888 tests: Update disk2_7-zfs.tar.bz2 for quota

Add a blimit file with the larger 40960 limit in it, as this seems to
be how the image was created, but the default (without this file) is
20K.
Also add an ilimit file with 4 (default for new images).  Old default
value is 2.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I33921ac58a5252f3259145d5e00faedcd21559f9
Reviewed-on: https://review.whamcloud.com/28820
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9930 llite: only clear lli_sai if the setter 94/28794/4
Bruno Faccini [Wed, 30 Aug 2017 09:37:03 +0000 (11:37 +0200)]
LU-9930 llite: only clear lli_sai if the setter

Prior to this patch, start_statahead_thread() was
unconditionally clearing lli->lli_sai upon error, leading
to a crash in a racy scenario where it had just been set in/by
another thread context.
Now, only clear lli_sai if the current thread has set it.
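
The "only clear it if you set it" rule can be sketched as below. This is a
single-threaded model under stated assumptions: the struct and function names
are hypothetical, the real code would perform these steps under the
appropriate lock, and the actual patch may track ownership differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real llite structures. */
struct statahead_info {
	int refs;
};

struct inode_info {
	struct statahead_info *sai;	/* shared between threads */
};

/*
 * Try to start statahead with our own 'mine' object.
 * Returns 0 on success and -1 on failure. 'simulate_error' forces the
 * error path so that the cleanup rule can be exercised.
 */
int start_statahead(struct inode_info *lli, struct statahead_info *mine,
		    int simulate_error)
{
	int installed = 0;

	assert(lli != NULL && mine != NULL);

	if (lli->sai == NULL) {
		lli->sai = mine;
		installed = 1;
	}

	if (simulate_error || !installed) {
		/* Clear the shared pointer only if this caller set it. */
		if (installed)
			lli->sai = NULL;
		return -1;
	}
	return 0;
}
```

An unconditional reset on the error path would wipe out a pointer that some
other thread had just installed and still relies on.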

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I555febfad3494c9dd90eeb72d6dd9157428179ea
Reviewed-on: https://review.whamcloud.com/28794
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9995 lfsck: Have LMV_HASH_FLAG_DEAD defined for a while longer
Oleg Drokin [Fri, 15 Sep 2017 17:41:06 +0000 (13:41 -0400)]
LU-9995 lfsck: Have LMV_HASH_FLAG_DEAD defined for a while longer

LMV_HASH_FLAG_DEAD is still used in lfsck. To avoid making any hasty
moves, just move the version check around that define further away
while we are examining what really needs to be done there.

This fixes the build breakage from the 2.10.53 tag.

Change-Id: I87b25136f8fc03e59aed97352567757d2460ab3a
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago New tag 2.10.53 2.10.53 v2_10_53 v2_10_53_0
Oleg Drokin [Fri, 15 Sep 2017 04:55:41 +0000 (00:55 -0400)]
New tag 2.10.53

Change-Id: I32787e50eab953a1f4a6f13723d777b3d7daea01
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9595 tests: remove sanityn test 18c from ALWAYS_EXCEPT 14/27414/6
dilip krishnagiri [Mon, 7 Aug 2017 16:27:41 +0000 (10:27 -0600)]
LU-9595 tests: remove sanityn test 18c from ALWAYS_EXCEPT

Remove sanityn test 18c from the ALWAYS_EXCEPT list
because LU-1205 is resolved.

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: Id1a67f4ce734949d446a97379cc297ddfd68e958
Reviewed-on: https://review.whamcloud.com/27414
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9950 build: add support for Ubuntu(debian) arm64 70/28870/2
Gu Zheng [Wed, 6 Sep 2017 03:14:35 +0000 (21:14 -0600)]
LU-9950 build: add support for Ubuntu(debian) arm64

Add arm64 to the supported arch list of the debian control file.

Change-Id: I9c39a4d8c1896c1255432380bd956330c2edf476
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/28870
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9921 lnet: resolve unsafe list access 23/28723/6
Amir Shehata [Sat, 26 Aug 2017 04:18:16 +0000 (21:18 -0700)]
LU-9921 lnet: resolve unsafe list access

Use list_for_each_entry_safe() when accessing messages on pending
queue. Remove the message from the list before calling lnet_finalize()
or lnet_send().

When destroying the peer make sure to queue all pending messages on
a global list. We can not resend them at this point because the
cpt lock is held. Unlocking the cpt lock could lead to an inconsistent
state. Use the discovery thread to check if the global list is not
empty and if so resend all messages on the list. Use a new spin
lock to protect the resend message list. I steered clear of reusing
an existing lock because LNet locking is complex and reusing a lock
will add to this complexity. Using a new lock makes the code easier
to understand.

Verify that all lists are empty before destroying the peer_ni
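
The "remove before finalizing" pattern from the first paragraph can be
sketched in userspace C. This is a minimal model with a singly linked queue
and hypothetical names (struct msg, drain_queue, finalize_msg); the kernel
code uses struct list_head and list_for_each_entry_safe(), which cache the
next entry in the same way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pending-message node. */
struct msg {
	int id;
	struct msg *next;
};

/*
 * Detach every message from *queue and hand it to finalize().
 * The next pointer is saved before the current node is touched, so
 * finalize() is free to release or re-queue the node it receives.
 * Returns the number of messages processed.
 */
int drain_queue(struct msg **queue, void (*finalize)(struct msg *))
{
	struct msg *cur;
	struct msg *tmp;
	int count = 0;

	assert(queue != NULL && finalize != NULL);

	cur = *queue;
	while (cur != NULL) {
		tmp = cur->next;	/* remember the successor first */
		*queue = tmp;		/* unlink cur from the queue */
		cur->next = NULL;
		finalize(cur);		/* cur is no longer on any list */
		cur = tmp;
		count++;
	}
	return count;
}

/* Example finalizer: marks the message as completed. */
void finalize_msg(struct msg *m)
{
	m->id = -1;
}
```

Reading cur->next after finalize() has run would be unsafe if the finalizer
freed the node or placed it on another list, which is the hazard the safe
iterator avoids.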

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ia081419ec5ed2be5823cfbca7e050138a229ab9c
Reviewed-on: https://review.whamcloud.com/28723
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-7746 tests: skip tests for older (upstream) client 18/28718/3
Andreas Dilger [Fri, 25 Aug 2017 19:02:01 +0000 (13:02 -0600)]
LU-7746 tests: skip tests for older (upstream) client

Skip some tests when running newer sanity.sh on an older client.
This typically only happens when testing the upstream client,
since otherwise the tests will always match the client version.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I78e1b0a6ae98879a2039817696c3a0dd15621fcc
Reviewed-on: https://review.whamcloud.com/28718
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9891 tests: Increase space not released for ZFS 82/28682/4
James Nunez [Thu, 24 Aug 2017 14:51:15 +0000 (08:51 -0600)]
LU-9891 tests: Increase space not released for ZFS

Several Lustre tests calculate the free space on the
object storage servers. For servers running ZFS, the amount
of space released by ZFS is not 100% deterministic. Thus,
fs_log_size() will return the buffer size that we allow
the space to be off by. For ZFS, increase this buffer
from 400 to 512 KB.

Test-Parameters: trivial testgroup=review-zfs-part-2
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I32e0ae3752d0ee0e9f0091ea779f8b53ba969a26
Reviewed-on: https://review.whamcloud.com/28682
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years ago LU-9870 build: handle SNMP missing on build box 94/28494/3
James Simmons [Fri, 11 Aug 2017 18:50:52 +0000 (14:50 -0400)]
LU-9870 build: handle SNMP missing on build box

Currently the lustre spec file doesn't handle the case when SNMP
is missing. So even if the user runs configure with --disable-snmp, our
rpm build process will ignore this and fail to build rpms. Pass
the missing SNMP case on to the rpm build process.

Test-Parameters: trivial

Change-Id: Ia6dcfd31b50f4f67afe7a4545fe417c32df6e559
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28494
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9044 test: remove conf-sanity tests from ALWAYS_EXCEPT 59/25059/8
dilip krishnagiri [Mon, 28 Aug 2017 15:39:57 +0000 (09:39 -0600)]
LU-9044 test: remove conf-sanity tests from ALWAYS_EXCEPT

Remove the following conf-sanity test from the ALWAYS_EXCEPT list:

Bugzilla ticket 23954 added test
24b "Multiple MGSs on a single node (should return err)"
to the ALWAYS_EXCEPT list. Bugzilla 23954 is resolved.

Test-Parameters: trivial combinedmdsmgs=false testlist=conf-sanity

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: If379ac75921563412121e96439f49ab49dfb5fbc
Reviewed-on: https://review.whamcloud.com/25059
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-8342 utils: Set dnodesize/recordsize at zfs dataset create 55/21055/11
Giuseppe Di Natale [Tue, 14 Jun 2016 15:29:31 +0000 (08:29 -0700)]
LU-8342 utils: Set dnodesize/recordsize at zfs dataset create

After the zfs dataset is created, attempt to set the
dnodesize and recordsize properties. Moved xattr=sa to be
consistent with the new method of setting dataset properties.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Change-Id: I12e5863e4602496b85f8512ea780be4589489d01
Reviewed-on: https://review.whamcloud.com/21055
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9941 lov: lsm_is_composite isn't right 45/28845/2
Bobi Jam [Sat, 2 Sep 2017 08:34:23 +0000 (16:34 +0800)]
LU-9941 lov: lsm_is_composite isn't right

LOVEA magic containing LOV_MAGIC_MAGIC will also be regarded as
a composite magic.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I3ef37ee80364b2a8f27831e3c53fb88b464f2039
Reviewed-on: https://review.whamcloud.com/28845
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9260 test: Use the correct mount device when test against lustre 61/28661/4
Wei Liu [Wed, 23 Aug 2017 16:49:29 +0000 (09:49 -0700)]
LU-9260 test: Use the correct mount device when test against lustre

The changes pass the MGSNID:/FSNAME into test, instead of
using the default loop device when testing against lustre.
The corresponding changes to the Posix test suites are also needed
to make the testing pass. Related changes apply to toolkit.

Test-Parameters: trivial testlist=posix

Change-Id: I32fc5a401fdc53ed133a78dc4c84b4a7e2a5ad19
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/28661
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9810 lnet: prefer Fast Reg 78/28278/5
Alexey Lyashkov [Mon, 4 Sep 2017 14:25:55 +0000 (17:25 +0300)]
LU-9810 lnet: prefer Fast Reg

The FastReg memory model has less CPU overhead than the default.
Therefore prefer it if the HW supports it.  This
applies in particular to the MLX4 HW which supports both memory
models.

Seagate-bug-id: MRP-4508
Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: I09a85a3724d78b61a40fe18c72dbcc4a87da3013
Reviewed-on: https://review.whamcloud.com/28278
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9810 lnet: fix build with M-OFED 4.1 77/28277/4
Alexey Lyashkov [Mon, 4 Sep 2017 14:25:54 +0000 (17:25 +0300)]
LU-9810 lnet: fix build with M-OFED 4.1

Add uapi path into includes to make build happy

Seagate-bug-id: MRP-4508
Test-Parameters: trivial
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: If9c61a303de24c78261a7b6fdafec77f52efa0d3
Reviewed-on: https://review.whamcloud.com/28277
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-7001 osp: fix llog processing 32/26132/27
Alexander Boyko [Wed, 22 Mar 2017 11:39:48 +0000 (14:39 +0300)]
LU-7001 osp: fix llog processing

The osp_sync_thread relies on the fact that llog_cat_process
will not end until umount. This is wrong when processing reaches
the bottom of the catalog, or if the catalog is wrapped.
The patch fixes this issue.

For a wrapped catalog, llog_process_thread could process an old
record:
1. Thread 1 (llog_process_thread) reads a chunk and processes the
   first record.
2. Thread 2 adds a record to this catalog in this chunk and
   updates the bitmap.
3. Thread 1 checks the bitmap for the next index and processes the
   old record.

Test conf-sanity 106 was added.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Seagate-bug-id: MRP-4235
Change-Id: Ifc983018e3a325622ef3215bec4b69f5c9ac2ba2
Reviewed-on: https://review.whamcloud.com/26132
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andriy Skulysh
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-7988 hsm: update many cookie status at once 84/19584/39
Bruno Faccini [Tue, 18 Jul 2017 08:21:53 +0000 (10:21 +0200)]
LU-7988 hsm: update many cookie status at once

Instead of calling mdt_agent_record_update, which calls
cdt_llog_process, once for every HAL, build a list of the cookies to
update with their status and call mdt_agent_record_update at most
once per second.

Update mdt_agent_record_update to take a status for every cookie.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ie4afd667727e07570ed6a2d51e8dfaea8302b97b
Signed-off-by: Ben Evans <bevans@cray.com>
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Reviewed-on: https://review.whamcloud.com/19584
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9907 build: add patchless server for lbuild 72/28672/21
Minh Diep [Wed, 23 Aug 2017 23:07:28 +0000 (16:07 -0700)]
LU-9907 build: add patchless server for lbuild

Add lbuild support for building a patchless server.
Clean up unused TARGET_ARCHS and BUILD_ARCHS.

Test-Parameters: trivial

Change-Id: I946352fa243c86d5729779406264e6ee37856145
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/28672
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9917 lnet: rediscover peer if it changed 72/28772/2
Amir Shehata [Mon, 28 Aug 2017 22:22:57 +0000 (15:22 -0700)]
LU-9917 lnet: rediscover peer if it changed

If the peer has changed after we unlocked the cpt then
we'll need to discover the new peer.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ib880746d5e67bbea1aa43122fa3aa115261c8664
Reviewed-on: https://review.whamcloud.com/28772
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9918 lnet: decref on peer after use 22/28722/3
Amir Shehata [Sat, 26 Aug 2017 04:26:00 +0000 (21:26 -0700)]
LU-9918 lnet: decref on peer after use

After looking up the peer for both ping and discover
we need to decref the peer so we don't lose a reference
on it. This needs to be done while the mutex_lock is held
to ensure the peer list remains stable.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ic57e67d21b8afe17a239cc496621bc4abf681077
Reviewed-on: https://review.whamcloud.com/28722
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-6051 utils: Remove incorrect request for getstripe help 97/28597/3
Steve Guminski [Fri, 18 Aug 2017 12:37:29 +0000 (08:37 -0400)]
LU-6051 utils: Remove incorrect request for getstripe help

The option flag for stripe-size in the getstripe command was changed
in Lustre 1.8.  To detect the correct flag to use, the help was
parsed.  However, the help was incorrectly invoked by using the
"--help" option, instead of the correct "lfs help getstripe".
Since interoperability with 1.8 is no longer needed, the incorrect
code is removed and the correct flag is hard-coded.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I29ae644c7d6b2ed247573d83c943cb556cfb6325
Reviewed-on: https://review.whamcloud.com/28597
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years ago LU-6210 utils: Use C99 struct initializers in lnetctl 23/28423/4
Steve Guminski [Mon, 7 Aug 2017 14:55:35 +0000 (10:55 -0400)]
LU-6210 utils: Use C99 struct initializers in lnetctl

This patch makes no functional changes.  The option struct
initializers in lnetctl are updated to C99 syntax.  The short and
long options are renamed to short_opts and long_opts for consistency.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.
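
As an illustration of the C99 style described above, here is a small option
table for getopt_long(). The option names and the parse_net() helper are
hypothetical examples and are not taken from lnetctl itself.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical options: -n/--net takes an argument, -h/--help does not. */
const char *const short_opts = "n:h";

const struct option long_opts[] = {
	{ .name = "net",  .has_arg = required_argument, .val = 'n' },
	{ .name = "help", .has_arg = no_argument,       .val = 'h' },
	{ .name = NULL }	/* terminator: every other field is zeroed */
};

/* Return the argument given to --net/-n, or NULL if it was not passed. */
const char *parse_net(int argc, char **argv)
{
	const char *net = NULL;
	int c;

	assert(argc > 0 && argv != NULL);

	while ((c = getopt_long(argc, argv, short_opts, long_opts,
				NULL)) != -1) {
		if (c == 'n')
			net = optarg;
	}
	return net;
}
```

The .flag member is never mentioned in the table, yet it is guaranteed to be
NULL in every entry, and the entries would stay correct even if the fields of
struct option were declared in a different order.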

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I1c1483a57aea918dce84afd0c7e94e31324c189e
Reviewed-on: https://review.whamcloud.com/28423
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9679 ldlm: remove flock accessor macros 56/27856/3
Andreas Dilger [Tue, 27 Jun 2017 22:15:18 +0000 (16:15 -0600)]
LU-9679 ldlm: remove flock accessor macros

Remove old flock wrapper functions that were never used in the kernel:
flock_type(), flock_set_type(), flock_pid(), flock_set_pid(),
flock_start(), flock_set_start(), flock_end(), flock_set_end()

so that our code is closer to that in the upstream kernel.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8709da925f4aa4650088f72d7e26f5e6281cab07
Reviewed-on: https://review.whamcloud.com/27856
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-2776 tests: waiting multiop finished in sanityn:51a 62/27662/3
Yang Sheng [Thu, 15 Jun 2017 15:29:25 +0000 (23:29 +0800)]
LU-2776 tests: waiting multiop finished in sanityn:51a

The test would fail if multiop was delayed, so we
should wait long enough for it to finish.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I9a329857230e3c49a5c78017ed385f20b3554d98
Reviewed-on: https://review.whamcloud.com/27662
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
3 years agoLU-5965 tests: fix parsing for older Lustre versions 93/28793/2
Andreas Dilger [Wed, 30 Aug 2017 06:57:03 +0000 (00:57 -0600)]
LU-5965 tests: fix parsing for older Lustre versions

Fix parsing of the Lustre version generated by "lctl get_param version"
before release 2.7.  Because "$ver" was not double-quoted properly, it
was treated as a single line and "head -n 1" did nothing, so the
"build:" line was parsed instead of the "lustre:" line.  This was
offset by sed dropping everything before the _last_ ":" instead of
before the _first_ ":", and then using the "build: " line.  As a
result, the old code generated a valid version number even for older
releases, except when the "build:" line did not start with a numeric
value.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ifd7dc95aaf0d6edf3558e18b85a78bea861248d0
Reviewed-on: https://review.whamcloud.com/28793
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9791 obd: always call lprocfs_obd_setup 47/28747/3
James Simmons [Mon, 28 Aug 2017 15:31:01 +0000 (11:31 -0400)]
LU-9791 obd: always call lprocfs_obd_setup

In the case of Lustre running on a single node, the function
lprocfs_obd_setup() was not being called for lov/osc. This
was preventing sysfs from being registered. So always call
lprocfs_obd_setup(). Update lprocfs_obd_setup() to check whether
obd->obd_proc_entry has already been set and, if so, return right
away.

Change-Id: Idbd99ea6a2e59eeee3991048d54c532df7d849ad
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28747
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoRevert "LU-5541 build: build static and dynamic liblustreapi" 83/28783/3
Oleg Drokin [Tue, 29 Aug 2017 17:54:12 +0000 (17:54 +0000)]
Revert "LU-5541 build: build static and dynamic liblustreapi"

This broke the Ubuntu 16 build, which was not caught by the current review builders.

This reverts commit ab1df50e73ff838053fff62302c3b884e4e19552.

Change-Id: Ie916869267e370791f13c53ceac8e6b1e3de97e9
Reviewed-on: https://review.whamcloud.com/28783
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoRevert "LU-5541 lustreapi: only export the API symbols" 82/28782/3
Oleg Drokin [Tue, 29 Aug 2017 17:33:01 +0000 (17:33 +0000)]
Revert "LU-5541 lustreapi: only export the API symbols"

This commit breaks the Ubuntu 16 build, which was not caught by the review builder.

This reverts commit b36c377ff25c20417c481eab3798e67d042ec3a3.

Change-Id: Ibe9da0d7cd91dbf8a1d51ca3e531af1850af1fab
Reviewed-on: https://review.whamcloud.com/28782
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8653 lod: use stripe_count instead of stripe_nr 81/26681/6
Andreas Dilger [Sat, 1 Oct 2016 00:07:17 +0000 (18:07 -0600)]
LU-8653 lod: use stripe_count instead of stripe_nr

Replace the use of stripenr and stripecnt in the code with
stripe_count to be consistent with the rest of the code.

Introduce LOV_PATTERN_NONE instead of using "0" around the
code to indicate no layout has been specified.

Introduce LOV_PATTERN_DEFAULT to indicate the entire layout
is unset, instead of using "0xffffffff" in the code.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I6056aebc1a381b09c1a436fb4a7986a51f3ebbe5
Reviewed-on: https://review.whamcloud.com/26681
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9915 build: remove LC_CONFIG_OBD_BUFFER_SIZE 10/28710/2
James Simmons [Fri, 25 Aug 2017 14:22:39 +0000 (10:22 -0400)]
LU-9915 build: remove LC_CONFIG_OBD_BUFFER_SIZE

One last piece of the CONFIG_LUSTRE_OBD_MAX_IOCTL_BUFFER
removal was missed.

Test-Parameters: trivial

Change-Id: I37970459b1d9427edf52938a6c15f36901c8a462
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28710
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9909 lnet: fix memory leak and lnet_interfaces_max 02/28702/2
Amir Shehata [Fri, 25 Aug 2017 01:06:57 +0000 (18:06 -0700)]
LU-9909 lnet: fix memory leak and lnet_interfaces_max

Free buffer allocated for discover command.

Set lnet_interfaces_max to LNET_INTERFACES_MAX_DEFAULT if
it's not defined or if it's being set to something below
LNET_INTERFACES_MIN.

For lnet_ping() and lnet_discover() if the provided space
can fit more NIDs than lnet_interfaces_max then ensure only
lnet_interfaces_max is copied into the buffer.
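
The two rules above can be sketched as follows.  This is an
illustration only: the DEMO_ constants and both helper functions are
hypothetical, the numeric limits are placeholder values rather than
the ones in the LNet headers, and "not defined" is modeled as 0.

```c
#include <stddef.h>

/* Placeholder limits for illustration; not the real LNet values. */
#define DEMO_INTERFACES_MIN		16
#define DEMO_INTERFACES_MAX_DEFAULT	200

/* Fall back to the default when the tunable is unset (0) or is
 * below the allowed minimum; otherwise keep the requested value. */
static unsigned int demo_sanitize_interfaces_max(unsigned int requested)
{
	if (requested < DEMO_INTERFACES_MIN)
		return DEMO_INTERFACES_MAX_DEFAULT;
	return requested;
}

/* Number of NIDs to copy out: bounded by what is available, by the
 * caller's buffer, and by interfaces_max, whichever is smallest. */
static size_t demo_nids_to_copy(size_t buf_capacity, size_t available,
				unsigned int interfaces_max)
{
	size_t n = available < buf_capacity ? available : buf_capacity;

	if (n > interfaces_max)
		n = interfaces_max;
	return n;
}
```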

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I19aed712f40a8bf44d2fb112588e9ae07257469f
Reviewed-on: https://review.whamcloud.com/28702
Tested-by: Jenkins
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9899 tests: mount client on MGS for ost-pools 38/28638/4
James Nunez [Mon, 21 Aug 2017 22:26:30 +0000 (16:26 -0600)]
LU-9899 tests: mount client on MGS for ost-pools

When a Lustre configuration has the MGS and MDS on separate
nodes, the file system must be mounted on the MGS to allow
OST pools to work properly.

Add the ability to mount the file system on the MGS when
necessary for the Lustre test suite ost-pools.sh.

Test-Parameters: trivial testlist=ost-pools
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iff0663a38b92bb8e71c313897b12fca98fdae932
Reviewed-on: https://review.whamcloud.com/28638
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Arshad Hussain <arshad.hussain@seagate.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8066 libcfs: call kernel_param_unlock on error 12/28612/3
Hongchao Zhang [Tue, 22 Aug 2017 21:13:38 +0000 (17:13 -0400)]
LU-8066 libcfs: call kernel_param_unlock on error

In libcfs_param_debug_mb_set, kernel_param_unlock should be
called in case of an error.
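
A minimal user-space sketch of the pattern, assuming a single exit
label so that every path releases the lock.  The demo_ functions and
the depth counter are hypothetical stand-ins for the kernel parameter
lock and are not the libcfs code.

```c
#include <errno.h>

/* Stand-in for the kernel parameter lock: counts lock depth so an
 * unbalanced lock/unlock pair is easy to detect. */
static int demo_lock_depth;
static unsigned int demo_debug_mb = 5;

static void demo_param_lock(void)   { demo_lock_depth++; }
static void demo_param_unlock(void) { demo_lock_depth--; }

/* Set the (hypothetical) debug buffer size under the lock. */
static int demo_debug_mb_set(long val)
{
	int rc = 0;

	demo_param_lock();
	if (val <= 0) {
		rc = -EINVAL;
		goto out;	/* the error path must still unlock */
	}
	demo_debug_mb = (unsigned int)val;
out:
	demo_param_unlock();
	return rc;
}
```

Returning directly from the error branch instead of jumping to the
label would leave the lock held, which is the kind of bug this patch
addresses.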

Change-Id: Iafeeb21b2d891f4ed7432e4d1ddd3c383fe33d5a
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/28612
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9347 ioctl: Add BLKSSZGET ioctl support 78/28578/5
Emoly Liu [Thu, 17 Aug 2017 07:36:49 +0000 (15:36 +0800)]
LU-9347 ioctl: Add BLKSSZGET ioctl support

Add BLKSSZGET ioctl and return PAGE_SIZE for the minimum
alignment from ll_file_ioctl() for this call.
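
From user space, the call might be exercised as in the sketch below.
The helper name demo_query_alignment() is hypothetical; on a file
system without BLKSSZGET support for regular files the ioctl fails
and the helper returns -1.

```c
#include <fcntl.h>
#include <linux/fs.h>	/* BLKSSZGET */
#include <sys/ioctl.h>
#include <unistd.h>

/* Return the alignment reported by BLKSSZGET for the file at path,
 * or -1 if the file cannot be opened or the ioctl is unsupported. */
static int demo_query_alignment(const char *path)
{
	int fd = open(path, O_RDONLY);
	int size = 0;

	if (fd < 0)
		return -1;
	if (ioctl(fd, BLKSSZGET, &size) < 0)
		size = -1;
	close(fd);
	return size;
}
```

With this patch applied, a file on a Lustre client is expected to
report the page size here.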

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Id8a77e77cd7e1807aa90474ca6d3d1fea4d7c269
Reviewed-on: https://review.whamcloud.com/28578
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
3 years agoLU-9042 test: Remove conf-sanity tests from ALWAYS_EXCEPT 39/28539/3
dilip krishnagiri [Mon, 14 Aug 2017 19:03:25 +0000 (13:03 -0600)]
LU-9042 test: Remove conf-sanity tests from ALWAYS_EXCEPT

Removing the following conf-sanity tests:

LU-2181 added conf-sanity tests
23a "interrupt client during recovery mount delay"
34b "force umount with failed mds should be normal"
from the ALWAYS_EXCEPT list. LU-2181 is resolved.

Test-parameters: trivial testlist=conf-sanity clientdistro=sles11sp4 mdsdistro=sles11sp4 ossdistro=sles11sp4

Change-Id: Iea35039cc1de57bc3109e678c3a52bd2b9fa12f7
Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Reviewed-on: https://review.whamcloud.com/28539
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-6210 lnet: Use C99 struct initializer in framework.c 36/28436/2
Steve Guminski [Mon, 7 Aug 2017 17:17:31 +0000 (13:17 -0400)]
LU-6210 lnet: Use C99 struct initializer in framework.c

This patch makes no functional changes.  The struct initializer in
framework.c is updated to C99 syntax.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Id54894c6f9476a5bf3b9cb5077ca324703c28da4
Reviewed-on: https://review.whamcloud.com/28436
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-5170 lfs: Standardize error messages in lfs_setdirstripe() 86/28086/2
Steve Guminski [Tue, 11 Jul 2017 20:10:52 +0000 (16:10 -0400)]
LU-5170 lfs: Standardize error messages in lfs_setdirstripe()

Error and warning messages in lfs_setdirstripe() are updated to a
standard format.  Messages are prefixed with the name of the utility
and the command that caused the error.  User-provided values are
delimited with single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I1dcc60aef3eab33610cc5f1e2b2d7e570568aca4
Reviewed-on: https://review.whamcloud.com/28086
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years agoLU-9588 tests: remove replay-ost-single test from ALWAYS_EXCEPT 02/27402/5
dilip krishnagiri [Wed, 9 Aug 2017 19:11:52 +0000 (13:11 -0600)]
LU-9588 tests: remove replay-ost-single test from ALWAYS_EXCEPT

Removing replay-ost-single tests
 for ZFS,   3 "Fail OST during write, with verification"
from ALWAYS_EXCEPT list.

Test-Parameters: trivial testlist=replay-ost-single mdtfilesystemtype=zfs ostfilesystemtype=zfs

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: I6d928c374adaab47288368c533c2455549d4be17
Reviewed-on: https://review.whamcloud.com/27402
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9580 tests: remove performance-sanity tests from ALWAYS_EXCEPT 75/27375/4
dilip krishnagiri [Wed, 9 Aug 2017 20:14:01 +0000 (14:14 -0600)]
LU-9580 tests: remove performance-sanity tests from ALWAYS_EXCEPT

Remove performance-sanity tests 1 and 2 from the ALWAYS_EXCEPT
list, and remove test_1 and test_2 themselves because all they
contain are calls to echo.
Tests 1 and 2 are associated with bugzilla ticket 15266, which
is fixed. However, reviewing all comments in that ticket
reveals that tests 1 and 2 were never implemented.

Test-Parameters: trivial testlist=performance-sanity

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: I402474f9db0d1875bf9c4b5c071e9c27bd47ba28
Reviewed-on: https://review.whamcloud.com/27375
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9519 utils: liblustreapi header cleanup 55/27155/4
Henri Doreau [Wed, 17 May 2017 08:50:50 +0000 (10:50 +0200)]
LU-9519 utils: liblustreapi header cleanup

Remove superfluous 'extern' qualifier from liblustreapi method prototypes.
Remove superfluous 'const' qualifier.

Test-Parameters: trivial
Change-Id: I818d5d2c9ae69d947f72c9306125715547714770
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-on: https://review.whamcloud.com/27155
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years agoLU-8691 tests: add mdtest to ha.sh 70/23070/5
Elena Gryaznova [Mon, 29 May 2017 19:19:03 +0000 (22:19 +0300)]
LU-8691 tests: add mdtest to ha.sh

Patch adds:
- mdtest mpi load;
- ha_simultaneous mode, which allows victim nodes to be
  rebooted simultaneously.

Test-Parameters: trivial
Seagate-bug-id: MRP-3896
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Change-Id: I2c37f2a383ce2ed475ae14dcfa50a7f7357cb1bf
Reviewed-on: https://review.whamcloud.com/23070
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8435 tests: slab alloc error does not LBUG 45/21745/6
Aurelien Degremont [Tue, 30 May 2017 21:56:06 +0000 (23:56 +0200)]
LU-8435 tests: slab alloc error does not LBUG

Under memory pressure, for instance using a memory cgroup
and kmem.limit_in_bytes enforced (SLURM does this),
osc_extent_alloc() could fail and error handling will
hit an LBUG.

Add a test for this.

Test-Parameters: trivial testlist=sanity,sanity,sanity

Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Change-Id: I135f05ee4be14521522c949e50bd4c8deb1f099a
Reviewed-on: https://review.whamcloud.com/21745
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years agoLU-7988 hsm: added coordinator housekeeping flag 82/19582/38
Frank Zago [Fri, 8 Apr 2016 17:59:06 +0000 (13:59 -0400)]
LU-7988 hsm: added coordinator housekeeping flag

When the coordinator is not performing housekeeping, only the requests
in the ARS_WAITING state will be processed as they are new
requests. The other requests, in states ARS_FAILED, ARS_CANCELED,
ARS_SUCCEED and ARS_STARTED can wait a few more seconds until the
housekeeping starts.

Also, when not performing housekeeping, as soon as hsd.request is
full, exit from the loop as there is enough potential work queued;
there's no need to examine all the HSM records, thus shortening the
time spent in cdt_llog_process() holding the critical lock
cdt_llog_lock.
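
The scan described above can be sketched as below.  The ARS_ state
names come from this message; the enum ordering, the demo_scan()
helper, and its parameters are hypothetical and only model the
control flow, not the coordinator code itself.

```c
#include <stdbool.h>
#include <stddef.h>

enum demo_ars {
	ARS_WAITING,
	ARS_STARTED,
	ARS_FAILED,
	ARS_CANCELED,
	ARS_SUCCEED
};

/* Walk the records, queue up to max_requests waiting requests, and
 * return how many records were examined. */
static size_t demo_scan(const enum demo_ars *recs, size_t nrecs,
			bool housekeeping, size_t max_requests,
			size_t *queued)
{
	size_t i;

	*queued = 0;
	for (i = 0; i < nrecs; i++) {
		if (!housekeeping) {
			/* Enough work is queued: stop scanning early. */
			if (*queued >= max_requests)
				break;
			/* Other states wait for the next housekeeping. */
			if (recs[i] != ARS_WAITING)
				continue;
		}
		if (recs[i] == ARS_WAITING && *queued < max_requests)
			(*queued)++;
		/* In housekeeping mode, the remaining states would be
		 * handled here (omitted in this sketch). */
	}
	return i;
}
```

Outside housekeeping the loop can stop well before the end of the
log, which is what shortens the time spent holding the lock.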

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ib73c97d29ca2f86b912aeb8d055c004cff14d5cf
Reviewed-on: https://review.whamcloud.com/19582
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9890 osd-zfs: dmu_objset_own/disown changes 93/28593/3
Giuseppe Di Natale [Thu, 17 Aug 2017 17:16:49 +0000 (10:16 -0700)]
LU-9890 osd-zfs: dmu_objset_own/disown changes

ZFS 0.8.0 will introduce ZFS encryption. The interfaces
to 'dmu_objset_own' and 'dmu_objset_disown' have changed.
Add configure checks to determine which versions of these
functions are available and call them appropriately.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Test-Parameters: trivial ostfilesystemtype=zfs mdtfilesystemtype=zfs testlist=sanity
Change-Id: Ide1a712858770e373404445b06596130a574d85b
Reviewed-on: https://review.whamcloud.com/28593
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
3 years agoLU-9882 kernel: kernel update RHEL7.4 [3.10.0-693.1.1.el7] 55/28555/4
Bob Glossman [Tue, 15 Aug 2017 14:21:36 +0000 (07:21 -0700)]
LU-9882 kernel: kernel update RHEL7.4 [3.10.0-693.1.1.el7]

update RHEL 7.4 kernel to 3.10.0-693.1.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I48c1907b0db9f97fbebc8b8276cc27124433b482
Reviewed-on: https://review.whamcloud.com/28555
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9857 lmv: stripe dir page may be released mistakenly 48/28548/2
Lai Siyao [Tue, 15 Aug 2017 03:13:30 +0000 (11:13 +0800)]
LU-9857 lmv: stripe dir page may be released mistakenly

stripe_dirent_next() may call put_stripe_dirent() while its dirent
is still in use, e.g. when lmv_dirent_next() has popped the last
dirent of a stripe: sd_ent can't be pointed to a next entry, but
the stripe dir page shouldn't be released yet.

stripe_dirent->sd_ent should be set to NULL when its dir page
is released, to avoid misuse.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I6d0e119d598e468d6a080b2072514a6bf1d4f786
Reviewed-on: https://review.whamcloud.com/28548
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9874 osd-ldiskfs: simplify project transfer codes 10/28510/4
Wang Shilong [Tue, 27 Jun 2017 05:51:09 +0000 (13:51 +0800)]
LU-9874 osd-ldiskfs: simplify project transfer codes

Currently, osd-ldiskfs calls __ldiskfs_ioctl_project()
to transfer project quota.  This is the user ioctl path for ext4,
which starts a transaction and reserves credits, and that is not
the right logic for Lustre.

Lustre has already started a transaction handle, and credits should
be reserved during the declare phase, so calling _ldiskfs_ioctl_project()
here causes a nested handle start.  This is not a problem for
JBD2, because it attaches to the current thread's handle if a
transaction has already been started, but in this case it ignores
the credits reservation.

Also, Lustre doesn't need inode mutex protection for
project transfer, and we don't need to write the inode in the
transfer code, since that is done when dirty inode is called.
Setting attr has already reserved enough credits for project
transfer, but we need to fix agent inode transferring.

This patch makes the code logic clear, and also fixes credits
reservation for DNE agent inode transferring.

Change-Id: I6ab3c0fdc4cf456b102e49d9326840fd0e12ade0
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/28510
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9866 kernel: kernel update [SLES12 SP2 4.4.74-92.35] 09/28509/2
Bob Glossman [Fri, 11 Aug 2017 15:25:03 +0000 (08:25 -0700)]
LU-9866 kernel: kernel update [SLES12 SP2 4.4.74-92.35]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ibd5e7e931a6055c1b0d2a52359d4f4527843dec0
Reviewed-on: https://review.whamcloud.com/28509
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8066 libcfs: test for both __kernel_param_[un]lock and kernel_param_[un]lock 98/28498/3
James Simmons [Fri, 11 Aug 2017 19:47:19 +0000 (15:47 -0400)]
LU-8066 libcfs: test for both __kernel_param_[un]lock and kernel_param_[un]lock

In earlier kernels like RHEL6 no locking is available. Later the
function __kernel_param_[un]lock() was introduced. In the most recent
kernels per-module locking was introduced with the functions
kernel_param_[un]lock(), and __kernel_param_[un]lock() is no longer
visible to modules. Since this is the case we need to make sure
both HAVE_MODULE_PARAM_LOCKING and HAVE_KERNEL_PARAM_LOCK are not
set in the case of RHEL6.

Change-Id: I0957a16352c4fb49fb5d96c0ff4d331a8be9703a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28498
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9860 tests: Add conf-sanity tests to ALWAYS_EXCEPT list 97/28497/10
James Nunez [Fri, 11 Aug 2017 19:45:52 +0000 (13:45 -0600)]
LU-9860 tests: Add conf-sanity tests to ALWAYS_EXCEPT list

The following tests fail when run with a separate MDS and MGS:
conf-sanity tests 33a, 43b, 53b, 54b, 70e, 80, 84, 87, 100,
102, 103, 104, 105 and 107.
We need to add these tests to the ALWAYS_EXCEPT list
when running with a separate MDS and MGS.

Test-Parameters: trivial combinedmdsmgs=false testlist=conf-sanity envdefinitions=SLOW=yes
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I1b17714216e14ad04eb9a492cb5f1aa4ed82bd1a
Reviewed-on: https://review.whamcloud.com/28497
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Dilip Krishnagiri <dilipx.krishnagiri@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9869 lnet: fix incorrect arguments order calling lstcon_session_new 87/28487/3
Colin Ian King [Fri, 11 Aug 2017 17:17:57 +0000 (13:17 -0400)]
LU-9869 lnet: fix incorrect arguments order calling lstcon_session_new

The arguments args->lstio_ses_force and args->lstio_ses_timeout are
in the incorrect order. Fix this by swapping them around.

Detected by CoverityScan, CID#1226833 ("Arguments in wrong order")

Test-Parameters: trivial testlist=lnet-selftest

Change-Id: If11c574655425db5bbf21ba2264be8d83a7e8bf8
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28487
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-6210 ptlrpc: Use C99 initializer in ptlrpc_register_rqbd() 79/28479/3
Steve Guminski [Mon, 7 Aug 2017 18:01:32 +0000 (14:01 -0400)]
LU-6210 ptlrpc: Use C99 initializer in ptlrpc_register_rqbd()

This patch makes no functional changes.  The struct initializer in
ptlrpc_register_rqbd() is updated to C99 syntax.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I7c24bac3ba6be6732b206406cd74b0d4f8a1f9c2
Reviewed-on: https://review.whamcloud.com/28479
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9856 mdd: handle NULL buffer in mdd_xattr_list() 69/28469/2
John L. Hammond [Thu, 10 Aug 2017 19:44:24 +0000 (14:44 -0500)]
LU-9856 mdd: handle NULL buffer in mdd_xattr_list()

The upper layer may call mdd_xattr_list() with a NULL buffer to get
the length of the xattr name list. Handle this case safely by skipping
the removal of the link xattr for unlinked objects.
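
The calling convention can be sketched as below, in the style of
listxattr(2): a NULL buffer is a pure size query.  The
demo_xattr_list() helper and the two attribute names are
hypothetical and do not reproduce the mdd code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>	/* ssize_t */

/* Hypothetical NUL-separated name list, including the final NUL. */
static const char demo_names[] = "trusted.link\0user.demo";

/* Return the list length; copy the list only when buf is usable. */
static ssize_t demo_xattr_list(char *buf, size_t size)
{
	size_t need = sizeof(demo_names);

	if (buf == NULL)
		return (ssize_t)need;	/* size query: touch nothing */
	if (size < need)
		return -ERANGE;
	memcpy(buf, demo_names, need);
	/* Any filtering of the copied list (such as dropping a name)
	 * is only safe on this path, where buf is valid. */
	return (ssize_t)need;
}
```

A caller would typically invoke it once with NULL to size the
buffer, then a second time with the allocated buffer.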

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iae87fba20325b228ef75ee762acfa49353932b1b
Reviewed-on: https://review.whamcloud.com/28469
Tested-by: Jenkins
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-6210 utils: Use C99 struct initializers in lfs_getdirstripe 21/28421/2
Steve Guminski [Tue, 8 Aug 2017 17:46:24 +0000 (13:46 -0400)]
LU-6210 utils: Use C99 struct initializers in lfs_getdirstripe

This patch makes no functional changes.  The option struct
initializer in lfs_getdirstripe() is updated to C99 syntax.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I6f2d4a82e5a9ef2c76946746d6c46b1202e8c278
Reviewed-on: https://review.whamcloud.com/28421
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-832 test: Add error check when running run-llog.sh 12/28412/2
Wei Liu [Mon, 7 Aug 2017 19:03:15 +0000 (12:03 -0700)]
LU-832 test: Add error check when running run-llog.sh

Add error status check in sanity test_60a when calling
run-llog.sh

Test-Parameters: trivial

Change-Id: I1296907c8892b7dd54dac37045d8a7c4e03b1f52
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/28412
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9803 tests: cast st_blksize for printf 62/28262/4
Chris Horn [Thu, 27 Jul 2017 20:10:15 +0000 (15:10 -0500)]
LU-9803 tests: cast st_blksize for printf

Compilation with -Werror=format complains about this printf: it
expects unsigned long, but st_blksize has type __blksize_t. Cast it
to unsigned long for printf.
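
A minimal sketch of the cast, assuming a "%lu" conversion as
described above.  The helper demo_format_blksize() is hypothetical
and is not the test program touched by this patch.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Format st_blksize of path into out; return the snprintf() result,
 * or -1 if stat() fails. */
static int demo_format_blksize(const char *path, char *out, size_t outlen)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	/* The underlying type of st_blksize varies by platform, so cast
	 * explicitly to match the %lu conversion. */
	return snprintf(out, outlen, "%lu", (unsigned long)st.st_blksize);
}
```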

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I1eeb5613e485132de8f0bce08bd4d89887e52cf6
Reviewed-on: https://review.whamcloud.com/28262
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9781 llog: Improve catalog full warning 93/28093/4
Giuseppe Di Natale [Tue, 18 Jul 2017 21:57:18 +0000 (14:57 -0700)]
LU-9781 llog: Improve catalog full warning

When warning that a catalog file is full, provide the name
of the catalog file. If the name of the catalog file isn't
defined, print its FID.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Change-Id: I559e43d08febfd8a1512ceb58fd3030b06372e9f
Reviewed-on: https://review.whamcloud.com/28093
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-6210 utils: Use C99 initializers in lfs_changelog() 22/27522/3
Steve Guminski [Fri, 14 Apr 2017 19:33:23 +0000 (15:33 -0400)]
LU-6210 utils: Use C99 initializers in lfs_changelog()

This patch makes no functional changes.  Struct initializers that
use C89 or GCC-only syntax are updated to C99 syntax.  Variables of
type struct option are renamed to long_opts for consistency.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.

This patch updates lfs_changelog() to use the C99 syntax.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I4f9d82974f68742d788f00d58c5e3d61449fc5bb
Reviewed-on: https://review.whamcloud.com/27522
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-9593 tests: remove sanity-sec tests from ALWAYS_EXCEPT 11/27411/4
dilip krishnagiri [Mon, 7 Aug 2017 17:29:12 +0000 (11:29 -0600)]
LU-9593 tests: remove sanity-sec tests from ALWAYS_EXCEPT

sanity-sec tests 2, 5 and 6 no longer exist. Test 2 was
removed by LU-6971 patch change ID I06f4348b. Tests
5 and 6 were removed by LU-3105 patch change I865a92b57.

Remove sanity-sec tests 2, 5 and 6 from the ALWAYS_EXCEPT
list.

Test-Parameters: trivial testlist=sanity-sec

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: Ia0377ff0da41c4ba9df6c90bc26f0469cb9de9a6
Reviewed-on: https://review.whamcloud.com/27411
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Chris Hanna <hannac@iu.edu>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9591 tests: remove replay-vbr tests 12a from ALWAYS_EXCEPT 06/27406/3
dilip krishnagiri [Tue, 8 Aug 2017 21:17:42 +0000 (15:17 -0600)]
LU-9591 tests: remove replay-vbr tests 12a from ALWAYS_EXCEPT

Removing replay-vbr test 12a - lock replay with VBR from
the ALWAYS_EXCEPT list. It is associated with bugzilla
ticket 16356 which is in NEW state.
This test has not been run for years.

Test-Parameters: trivial testlist=replay-vbr

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: I251bbaeea744a11fdf3e34870a00fc6b53fae3b1
Reviewed-on: https://review.whamcloud.com/27406
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8276 ldlm: Make lru clear always discard read lock pages 85/20785/7
Patrick Farrell [Mon, 14 Aug 2017 10:09:35 +0000 (05:09 -0500)]
LU-8276 ldlm: Make lru clear always discard read lock pages

A significant amount of time is sometimes spent during
lru clearing (i.e., echo 'clear' > lru_size) checking
pages to see if they are covered by another read lock.
Since all unused read locks will be destroyed by this
operation, the pages will be freed momentarily anyway,
and this is a waste of time.

This patch sets the LDLM_FL_DISCARD_DATA flag on all the PR
locks which are slated for cancellation by
ldlm_prepare_lru_list when it is called from
ldlm_ns_drop_cache.

The case where another lock covers those pages (and is in
use and so does not get cancelled by lru clear) is safe for
a few reasons:

1. When discarding pages, we wait (discard_cb->cl_page_own)
until they are in the cached state before invalidating.
So if they are actively in use, we'll wait until that use
is done.

2. Removal of pages under a read lock is something that can
happen due to memory pressure, since these are VFS cache
pages. If a client reads something which is then removed
from the cache and goes to read it again, this will simply
generate a new read request.

This has a performance cost for that reader, but if anyone
is clearing the ldlm lru while actively doing I/O in that
namespace, then they cannot expect good performance.

In the case of many read locks on a single resource, this
improves cleanup time dramatically.  In internal testing at
Cray with ~80,000 read locks on a single file, this improves
cleanup time from ~60 seconds to ~0.5 seconds.  This also
slightly improves cleanup speed in the case of 1 or a few
read locks on a file.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I0c076b31ea474bb5f012373ed2033de3e447b62d
Reviewed-on: https://review.whamcloud.com/20785
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-5541 lustreapi: only export the API symbols 43/11643/19
frank zago [Sun, 13 Aug 2017 18:17:17 +0000 (14:17 -0400)]
LU-5541 lustreapi: only export the API symbols

By default, all kinds of symbols are exported from the library (dump,
libcfs_ukuc_start, l_ioctl, set_ioctl_dump, ...), which may create
external conflicts. Use the linker version-script options to only
export the API symbols, and prevent the export of internal symbols.

Only the symbols declared in the global section of liblustreapi.map
will be seen by applications.

Fix lshowmount to use libcfs and not internal liblustreapi symbol.

Change-Id: Ica4226c1ea9b6b159a056ad22bacaa2ffcf4b171
Signed-off-by: frank zago <fzago@cray.com>
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-on: https://review.whamcloud.com/11643
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
3 years agoLU-5541 build: build static and dynamic liblustreapi 25/11625/31
frank zago [Sun, 13 Aug 2017 18:11:26 +0000 (14:11 -0400)]
LU-5541 build: build static and dynamic liblustreapi

libtool knows how to build both, so no need to hack the Makefile. As
two added benefits, the utilities will now use the dynamic version,
thus reducing their footprint, and calling make twice in a row won't
rebuild objects already built.

Test-Parameters: trivial

Change-Id: If4191e1ff1564793c476ffe03f5d4b6ad5295421
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: https://review.whamcloud.com/11625
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9848 llog: check padding size for update reclen 54/28554/2
Lai Siyao [Tue, 15 Aug 2017 11:51:08 +0000 (19:51 +0800)]
LU-9848 llog: check padding size for update reclen

Update log only checks padding size for split case, which should also
be done if it's less than chunk size.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ie7819f67dd9bcbfb060713bb208c9777420c5178
Reviewed-on: https://review.whamcloud.com/28554
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9828 ptlrpc: Do not assert when bd_nob_transferred != 0 91/28491/2
Doug Oucharek [Wed, 31 May 2017 21:39:12 +0000 (14:39 -0700)]
LU-9828 ptlrpc: Do not assert when bd_nob_transferred != 0

There is a case in the routine ptlrpc_register_bulk() where we were
asserting if bd_nob_transferred != 0 when not resending.  There is
evidence that network errors can create a situation where
this does happen.  So we should not be asserting!

This patch changes that assert to an error return code of -EIO.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I6a73ca1b04a86f187744d3b8b5d46df71d95e416
Reviewed-on: https://review.whamcloud.com/28491
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9863 lmv: Off by two in lmv_fid2path() 77/28477/2
Dan Carpenter [Fri, 11 Aug 2017 00:26:39 +0000 (20:26 -0400)]
LU-9863 lmv: Off by two in lmv_fid2path()

We want to concatenate string one, a '/' character, string two and
then a NUL terminator. The destination buffer holds ori_gf->gf_pathlen
characters. The strlen() function returns the number of characters not
counting the NUL terminator. So we should be adding two extra spaces,
one for the forward slash and one for the NUL.
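
The length arithmetic above can be sketched in plain C. The helper name
join_path() and the choice of -EOVERFLOW are assumptions made for this
example; only the "+ 2" accounting comes from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Join "a" and "b" as "a/b" into dst, which holds dst_len bytes.
 * The two extra bytes account for the '/' separator and the NUL terminator.
 * Returns 0 on success or -EOVERFLOW if the result would not fit.
 */
int join_path(char *dst, size_t dst_len, const char *a, const char *b)
{
	assert(dst != NULL && a != NULL && b != NULL);

	if (strlen(a) + strlen(b) + 2 > dst_len)
		return -EOVERFLOW;

	snprintf(dst, dst_len, "%s/%s", a, b);
	return 0;
}
```

With an 8-byte buffer, "abc" and "def" fit exactly (3 + 3 + 2 = 8), while
"abcd" and "def" are rejected.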

Change-Id: Ia96461a2d1b3331f44d3791ca0148f6e836caf0d
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28477
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9019 mdd: migrate from jiffies64 to ktime 07/28407/4
James Simmons [Mon, 14 Aug 2017 18:25:19 +0000 (14:25 -0400)]
LU-9019 mdd: migrate from jiffies64 to ktime

The mdd layer uses cfs_time_xxx_64() for 64 bit time precision.
This was written before ktime_t came into existence and it uses the
64 bit version of jiffies, which can vary between nodes due to
HZ being configurable. Moving to ktime provides a consistent format
with nanosecond precision on any node.
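
To illustrate why a raw tick count is not comparable between nodes, here
is a small user-space sketch (not kernel code; the helper name is
hypothetical) that converts the same number of ticks to nanoseconds for
different HZ values.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Convert a tick count to nanoseconds for a node whose tick rate is hz.
 * The division is exact for common HZ values such as 100, 250 and 1000.
 */
uint64_t ticks_to_ns(uint64_t ticks, unsigned int hz)
{
	assert(hz != 0);
	return ticks * (SKETCH_NSEC_PER_SEC / hz);
}
```

1000 ticks is one second on a node with HZ=1000 but four seconds on a
node with HZ=250, whereas a nanosecond value means the same everywhere.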

Change-Id: Ibec17227fd70a148c83296e8d1c41668f67e9201
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28407
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9657 pfl: llapi_layout_comp_usei should handle non-pfl file 65/27865/4
Emoly Liu [Tue, 15 Aug 2017 08:41:39 +0000 (16:41 +0800)]
LU-9657 pfl: llapi_layout_comp_usei should handle non-pfl file

This patch improves llapi_layout_comp_use() to treat a non-composite
file as a single-component file. When doing the "is composite" check,
"1" is returned when LLAPI_LAYOUT_COMP_USE_NEXT/PREV is specified.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I3ba4f07ec843d9b61273af331060d5f8827c2f8b
Reviewed-on: https://review.whamcloud.com/27865
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-8993 utils: Use absolute pathname for debug_daemon log file 85/25485/9
Steve Guminski [Mon, 13 Feb 2017 20:24:08 +0000 (15:24 -0500)]
LU-8993 utils: Use absolute pathname for debug_daemon log file

The lctl debug_daemon command is changed to always provide an
absolute pathname to the kernel.  The kernel code will return EINVAL
if the pathname does not begin with '/', leading to the confusing
error "Invalid argument". This patch allows the user to provide a
relative pathname to the command without generating this error.

The absolute_path function has been moved to string.c and renamed to
cfs_abs_path, so that it may be used by all utilities.

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I35af242bfcfcb9a56135aeabe0423e28e9634bab
Reviewed-on: https://review.whamcloud.com/25485
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9410 ldiskfs: no check mb bitmap if flex_bg enabled 66/28566/4
Fan Yong [Wed, 9 Aug 2017 18:30:02 +0000 (02:30 +0800)]
LU-9410 ldiskfs: no check mb bitmap if flex_bg enabled

When the filesystem is initialized (reformatted), the number of
free blocks in the group descriptor is calculated by
ext2fs_reserve_super_and_bgd() (e2fsprogs). As the comment
in that function says: "This is not necessarily the case when
the flex_bg feature is enabled, so callers should take care!".

So it is normal to find a block group descriptor that has the
LDISKFS_BG_BLOCK_UNINIT flag set but 0 free blocks.
ldiskfs_mb_check_ondisk_bitmap() should NOT report an error
for such a block group; instead, it should skip the check.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iba0fb2bf0632a6e54222472bc724a8ea0478e9ae
Reviewed-on: https://review.whamcloud.com/28566
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9841 lov: do not split IO for single striped file 51/28451/2
Jinshan Xiong [Wed, 9 Aug 2017 23:31:17 +0000 (16:31 -0700)]
LU-9841 lov: do not split IO for single striped file

The stripe size of a single-striped file is not reliable, so it
shouldn't be used to split I/O.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I47c31d59b46b07d4a6760b8985e1c19da4765a5c
Reviewed-on: https://review.whamcloud.com/28451
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9842 osd: return ENODATA for XATTR_NAME_FID on MDT 34/28434/2
Fan Yong [Tue, 8 Aug 2017 23:18:21 +0000 (07:18 +0800)]
LU-9842 osd: return ENODATA for XATTR_NAME_FID on MDT

The XATTR_NAME_FID xattr is an OST-side EA. If someone calls
getxattr() for XATTR_NAME_FID on an MDT, return -ENODATA.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I18b1466cf62d10fa28f7ed9731490e963b6274f4
Reviewed-on: https://review.whamcloud.com/28434
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9767 utils: validate filesystem name for mkfs.lustre 70/28070/7
James Simmons [Mon, 21 Aug 2017 03:24:20 +0000 (23:24 -0400)]
LU-9767 utils: validate filesystem name for mkfs.lustre

The patch "LU-6401 uapi: turn lustre_param.h into a proper
UAPI header" removed various user land functions used to
validate pool names and file system names. The
checks were instead enforced on the kernel side to ensure
that user land software interfacing directly with the
kernel wouldn't be able to break things badly. When
formatting the backend file system, no kernel interaction
happens until the MDT/OST/MGT is mounted, which
is very late in the process. So for this case let's add back
the file system name verification to mkfs.lustre to warn
users long before they try to mount anything.
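
A sketch of what such an early user-space check might look like. The
8-character limit and the allowed character set (alphanumerics, '_' and
'-') are assumptions for this example, as is the helper name; the real
limit is whatever LUSTRE_MAXFSNAME is defined as.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <string.h>

/* Assumed limit for this sketch; the real constant is LUSTRE_MAXFSNAME. */
#define SKETCH_MAXFSNAME 8

/* Returns 0 if fsname looks valid, otherwise a negative errno. */
int check_fsname(const char *fsname)
{
	size_t len;

	assert(fsname != NULL);

	len = strlen(fsname);
	if (len == 0)
		return -EINVAL;
	if (len > SKETCH_MAXFSNAME)
		return -ENAMETOOLONG;

	for (size_t i = 0; i < len; i++) {
		unsigned char c = (unsigned char)fsname[i];

		if (!isalnum(c) && c != '_' && c != '-')
			return -EINVAL;
	}
	return 0;
}
```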

Secondly, we remove verify_poolname() from lfs.c since
it duplicates extract_fsname_poolname() in obd.c. There
is no need to do the same test twice. The function
pool_cmd() calls the ioctl for pool handling, which in
turn returns an error code. Use this error code to tell
the user what mistake they made in their pool command.
For the MGS kernel code mgs_extract_fs_pool() was checking
MTI_NAME_MAXLEN instead of LUSTRE_MAXFSNAME. Also use
LUSTRE_MAXFSNAME instead of the raw number in the function
server_name2fsname() located in obd_mount.c.

Change-Id: If094644e56a70b6dd8e6b0378adc8736911aeef1
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28070
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9913 lnet: balance references in lnet_discover_peer_locked() 95/28695/2
John L. Hammond [Thu, 24 Aug 2017 20:01:34 +0000 (15:01 -0500)]
LU-9913 lnet: balance references in lnet_discover_peer_locked()

In lnet_discover_peer_locked() avoid a leaked reference to the peer in
the non-blocking discovery case.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic48414859c923af1ebb197b0b0f2f8d6752043ac
Reviewed-on: https://review.whamcloud.com/28695
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9480 lnet: Multi-Rail Dynamic Discovery feature
Oleg Drokin [Tue, 22 Aug 2017 16:32:16 +0000 (12:32 -0400)]
LU-9480 lnet: Multi-Rail Dynamic Discovery feature

Merge remote-tracking branch 'origin/multi-rail'

Change-Id: I63d21d1085f4bf665480d29d5d14c065b6a22191
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9480 lnet: cleanup lnetctl and cyaml 49/27349/15
Sonia [Wed, 31 May 2017 08:48:15 +0000 (01:48 -0700)]
LU-9480 lnet: cleanup lnetctl and cyaml

lnetctl set commands result in a segmentation fault
if no values are provided. This patch makes them
show help if no values are provided with set commands.

Made general cleanups in the lnetctl code to consolidate
where the help is being printed. Created a function
check_cmd() which checks for the expected number of
arguments and for the -h/--help option and prints
the help string if either scenario is encountered.
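
The kind of check described above might look roughly like this. The
helper below is a hypothetical illustration and does not reproduce the
signature or behaviour of the real check_cmd().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Return true when the help text should be printed: either fewer than
 * min_args arguments were given, or one of them is -h / --help.
 * argv[0] is the subcommand name and is not counted as an argument.
 */
bool args_need_help(int argc, char *const argv[], int min_args)
{
	assert(argc >= 1 && argv != NULL);

	if (argc - 1 < min_args)
		return true;

	for (int i = 1; i < argc; i++) {
		if (strcmp(argv[i], "-h") == 0 ||
		    strcmp(argv[i], "--help") == 0)
			return true;
	}
	return false;
}
```

Checking the argument count before any value is dereferenced is what
avoids the segmentation fault on an empty "set" command.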

Fixed the FSM transition in cyaml to allow proper
parsing of empty cyaml documents.

Change-Id: Ia081e9304ba2d6baa804e4c8890fb1988d860c1c
Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-on: https://review.whamcloud.com/27349
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
3 years agoLU-9480 lnet: show peer state 30/26130/21
Amir Shehata [Wed, 22 Mar 2017 20:34:23 +0000 (13:34 -0700)]
LU-9480 lnet: show peer state

It is important to show the peer state when debugging.
This patch exports the peer state from the kernel to
user space, and shows it when the detail level requested
in the peer show command is >= 3.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I1e169b2b7bf80671ea302f04c6fb948bbcbbb245
Reviewed-on: https://review.whamcloud.com/26130
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
3 years agoLU-9480 lnet: add enhanced statistics 95/25795/27
Amir Shehata [Thu, 2 Feb 2017 22:01:15 +0000 (14:01 -0800)]
LU-9480 lnet: add enhanced statistics

Added statistics to track the different types of
LNet messages which are sent/received/dropped

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I7e1fc991a56df20181f9e55a794765349a4d2cb9
Reviewed-on: https://review.whamcloud.com/25795

3 years agoLU-9480 lnet: add "lnetctl discover" 93/25793/29
Sonia Sharma [Mon, 13 Feb 2017 20:40:19 +0000 (12:40 -0800)]
LU-9480 lnet: add "lnetctl discover"

Add a "discover" subcommand to lnetctl

jt_discover() in lnetctl.c calls lustre_lnet_discover_nid()
to implement "lnetctl discover". The output is similar to
"lnetctl ping" command.
This patch also does some cleanup in liblnetconfig.c.
For parameters under global settings, the common code
is pulled into the functions ioctl_set_value() and
ioctl_show_global_values().

Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I98ebb0b27de4b32ea07421f7dd71a4a1c96f3e05
Reviewed-on: https://review.whamcloud.com/25793

3 years agoLU-9077 lnet: fix for static analysis issues 92/25792/29
sharmaso [Wed, 8 Feb 2017 22:42:01 +0000 (14:42 -0800)]
LU-9077 lnet: fix for static analysis issues

fixes the 11 static analysis issues found in
v2_9_52_0-66-gec839d4.

1. lustre_lnet_show_numa_range - fixed
2. lnet_select_pathway - fixed
3. lustre_lnet_show_discovery - fixed
4. lnet_discover_peer_locked - false positive
5. lustre_lnet_ping_nid - fixed
6. lustre_lnet_ping_nid - false positive
7. lustre_lnet_show_discovery - duplicate of 3
8. lustre_lnet_show_max_intf - fixed
9. lustre_lnet_show_max_intf - duplicate of 8
10. lnet_peer_set_primary_data - false positive
11. lustre_lnet_show_numa_range - fixed

Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I4cb03e4f64cd0c743ee3646f4628d34533b2d4ba
Reviewed-on: https://review.whamcloud.com/25792
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
3 years agoLU-9480 lnet: add "lnetctl ping" command 91/25791/31
Olaf Weber [Thu, 6 Apr 2017 09:43:20 +0000 (11:43 +0200)]
LU-9480 lnet: add "lnetctl ping" command

Adds function jt_ping() in lnetctl.c and
lustre_lnet_ping_nid() in liblnetconfig.c file.
The output of "lnetctl ping" is similar to
"lnetctl peer show".

Function jt_ping() in lnetctl.c calls lustre_lnet_ping_nid()
to implement "lnetctl ping". Adds a function infra_ping_nid()
to be reused later by similar ping-like lnetctl commands.
Uses a new ioctl call, IOC_LIBCFS_PING_PEER, for "lnetctl ping".
With "lnetctl ping", multiple nids can be pinged. Uses a new
struct (lnet_ioctl_ping_data in lib-dlc.h) to pass the data
from kernel to user space for ping. Also changes the lnet_ping()
function and its input parameters in lnet/lnet/api-ni.c.

Test-Parameters: trivial
Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I67024d87fa5cca6aa7ff7a8099d4400a795f3a83
Reviewed-on: https://review.whamcloud.com/25791
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
3 years agoLU-9480 lnet: add "lnetctl peer list" 90/25790/26
Olaf Weber [Fri, 27 Jan 2017 15:36:47 +0000 (16:36 +0100)]
LU-9480 lnet: add "lnetctl peer list"

Add IOC_LIBCFS_GET_PEER_LIST to obtain a list of the primary
NIDs of all peers known to the system. The list is written
into a userspace buffer by the kernel. The typical usage is
to make a first call to determine the required buffer size,
then a second call to obtain the list.

Extend the "lnetctl peer" set of commands with a "list"
subcommand that uses this interface.

Modify the IOC_LIBCFS_GET_PEER_NI ioctl (which is new in the
Multi-Rail code) to use a NID to indicate the peer to look
up, and then pass out the data for all NIDs of that peer.

Re-implement "lnetctl peer show" to obtain the list of NIDs
using IOC_LIBCFS_GET_PEER_LIST followed by one or more
IOC_LIBCFS_GET_PEER_NI calls to get information for each
peer.

Make sure to copy the structure from kernel space to
user space even if the ioctl handler returns an error.
This is needed because if the buffer passed in by the
user space is not big enough to copy the data, we want
to pass the requested size to user space in the structure
passed in. The return code in this case is -E2BIG.
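
The two-call pattern can be sketched in user space as below. The request
structure and function are hypothetical stand-ins, not the real ioctl
layout; the point is that the required size is written back even when
the call fails with -E2BIG.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical request structure; not the real ioctl layout. */
struct toy_list_req {
	size_t size;          /* in: buffer size in bytes; out: size required */
	unsigned long *buf;   /* out: list of primary NIDs */
};

/* Toy "kernel side": the required size is always written back. */
int toy_get_peer_list(struct toy_list_req *req,
		      const unsigned long *nids, size_t count)
{
	size_t need = count * sizeof(*nids);
	size_t have;

	assert(req != NULL);

	have = req->size;
	req->size = need;     /* reported even on failure */
	if (have < need)
		return -E2BIG;
	if (need > 0)
		memcpy(req->buf, nids, need);
	return 0;
}
```

A caller would first pass size 0, read the required size back after
-E2BIG, allocate a buffer of that size and call again.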

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I522c11e6ec09bec46121496d526bb258e10295f1
Reviewed-on: https://review.whamcloud.com/25790
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
3 years agoLU-9480 lnet: implement Peer Discovery 89/25789/24
Olaf Weber [Tue, 28 Mar 2017 13:05:03 +0000 (15:05 +0200)]
LU-9480 lnet: implement Peer Discovery

Implement Peer Discovery.

A peer is queued for discovery by lnet_peer_queue_for_discovery().
This sets LNET_PEER_DISCOVERING to indicate that discovery is in
progress.

The discovery thread lnet_peer_discovery() checks the peer and
updates its state as appropriate.

If LNET_PEER_DATA_PRESENT is set, then a valid Push message or
Ping reply has been received. The peer is updated in accordance
with the data, and LNET_PEER_NIDS_UPTODATE is set.

If LNET_PEER_PING_FAILED is set, then an attempt to send a Ping
message failed, and peer state is updated accordingly. The discovery
thread can do some cleanup like unlinking an MD that cannot be done
from the message event handler.

If LNET_PEER_PUSH_FAILED is set, then an attempt to send a Push
message failed, and peer state is updated accordingly. The discovery
thread can do some cleanup like unlinking an MD that cannot be done
from the message event handler.

If LNET_PEER_PING_REQUIRED is set, we must Ping the peer in order to
correctly update our knowledge of it. This is set, for example, if
we receive a Push message for a peer, but cannot handle it because
the Push target was too small. In such a case we know that the
state of the peer is incorrect, but need to do extra work to obtain
the required information.

If discovery is not enabled, then the discovery process stops here
and the peer is marked with LNET_PEER_UNDISCOVERED. This tells the
discovery process that it doesn't need to revisit the peer while
discovery remains disabled.

If LNET_PEER_NIDS_UPTODATE is not set, then we have reason to think
the lnet_peer is not up to date, and will Ping it.

The peer needs a Push if it is multi-rail and the ping buffer
sequence number for this node is newer than the sequence number it
has acknowledged receiving by sending an Ack of a Push.

If none of the above is true, then discovery has completed its work
on the peer.

Discovery signals that it is done with a peer by clearing the
LNET_PEER_DISCOVERING flag, and setting LNET_PEER_DISCOVERED or
LNET_PEER_UNDISCOVERED as appropriate. It then dequeues the peer
and clears the LNET_PEER_QUEUED flag.

When the local node is discovered via the loopback network, the
peer structure that is created will have an lnet_peer_ni for the
local loopback interface. Subsequent traffic from this node to
itself will use the loopback net.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I30acd1e046604013025b231b5806be25468a2286
Reviewed-on: https://review.whamcloud.com/25789
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
3 years agoLU-9480 lnet: add the Push target 88/25788/23
Olaf Weber [Tue, 28 Mar 2017 12:48:44 +0000 (14:48 +0200)]
LU-9480 lnet: add the Push target

Peer Discovery will send a Push message (same format as an
LNet Ping) to Multi-Rail capable peers to give the peer the
list of local interfaces.

Set up a target buffer for these pushes in the_lnet. The
size of this buffer defaults to LNET_MIN_INTERFACES, but it
is resized if required.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I09b5ad8ae504ba8368d908539001fb8afc2c2778
Reviewed-on: https://review.whamcloud.com/25788
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
3 years agoLU-9480 lnet: tune lnet_peer_discovery_disabled with lnetctl 87/25787/21
Olaf Weber [Tue, 28 Mar 2017 09:09:32 +0000 (11:09 +0200)]
LU-9480 lnet: tune lnet_peer_discovery_disabled with lnetctl

A new tunable, lnet_peer_discovery_disabled, has been introduced.
Make it tunable with lnetctl. Note that the state of discovery is
reported as 1/enabled, or 0/disabled, which is the inverse of the
module parameter.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I67333d86520c5b6db8ff99c924054c4b487c8029
Reviewed-on: https://review.whamcloud.com/25787
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
3 years agoLU-9480 lnet: add discovery thread 86/25786/23
Olaf Weber [Fri, 27 Jan 2017 15:32:11 +0000 (16:32 +0100)]
LU-9480 lnet: add discovery thread

Add the discovery thread, which will be used to handle peer
discovery. This change adds the thread and the infrastructure
that starts and stops it. The thread itself does trivial work.

Peer Discovery gets its own event queue (ln_dc_eqh), a queue
for peers that are to be discovered (ln_dc_request), a queue
for peers waiting for an event (ln_dc_working), a wait queue
head so the thread can sleep (ln_dc_waitq), and start/stop
state (ln_dc_state).

Peer discovery is started from lnet_select_pathway(), for
GET and PUT messages not sent to the LNET_RESERVED_PORTAL.
This criterion means that discovery will not be triggered by
the messages used in discovery, and neither will an LNet ping
trigger it.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I38a48ab7f61c8ef1b994cd17069729f243912bdf
Reviewed-on: https://review.whamcloud.com/25786
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
3 years agoLU-9480 lnet: add msg_type to lnet_event 85/25785/23
Olaf Weber [Fri, 27 Jan 2017 15:31:57 +0000 (16:31 +0100)]
LU-9480 lnet: add msg_type to lnet_event

Add a msg_type field to the lnet_event structure. This makes
it possible for an event handler to tell whether LNET_EVENT_SEND
corresponds to a GET or a PUT message.

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: If9ecc42c26eb078c19697f399a17f80b2e225639
Reviewed-on: https://review.whamcloud.com/25785
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>