Whamcloud - gitweb
fs/lustre-release.git
2 years ago LU-9118 o2iblnd: handle MOFED libcfs time api collision 64/25564/6
James Simmons [Wed, 22 Feb 2017 22:31:06 +0000 (17:31 -0500)]
LU-9118 o2iblnd: handle MOFED libcfs time api collision

Both libcfs and the MOFED 4 stack define
ktime_get_real_ns() for platforms that lack it.
The solution is to reverse the logic of testing
for ktime_get_real_ns() done by lustre. This way
we avoid the HAVE_KTIME_GET_REAL_NS collision.
Also, to ensure that older platforms with an older OFED
stack still build, only turn off NEED_KTIME_GET_REAL_NS
(set by libcfs) in o2iblnd.h when the OFED stack
has defined LINUX_3_17_COMPAT_H. The compat-3.17.h
OFED header is where ktime_get_real_ns() gets defined
when it is lacking on the native platform.
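
The guard described above can be sketched in user space as follows. This is
only an illustration under assumptions: the two #define lines stand in for
what the libcfs configure step and the OFED compat-3.17.h header would
provide, and ktime_fallback_provider() is a hypothetical helper that exists
only to make the outcome visible.

```c
#include <string.h>

/* Assumption: libcfs sets this when the kernel lacks ktime_get_real_ns(). */
#define NEED_KTIME_GET_REAL_NS 1

/* Assumption: stands in for including the OFED compat-3.17.h header,
 * which defines this include guard and its own ktime_get_real_ns(). */
#define LINUX_3_17_COMPAT_H

/* o2iblnd.h side: when the OFED compat header is present it already
 * supplies the function, so drop the request for the libcfs fallback. */
#ifdef LINUX_3_17_COMPAT_H
#undef NEED_KTIME_GET_REAL_NS
#endif

/* Hypothetical helper: reports who would provide the fallback. */
static const char *ktime_fallback_provider(void)
{
#ifdef NEED_KTIME_GET_REAL_NS
    return "libcfs";
#else
    return "ofed-or-kernel";
#endif
}
```

Removing the LINUX_3_17_COMPAT_H define in this sketch leaves
NEED_KTIME_GET_REAL_NS set, which corresponds to the older-OFED case where
libcfs must still supply the function.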

Test-Parameters: trivial

Change-Id: I44966f22cfbb6138fa7bc3fa47148a6f0a94ebd4
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25564
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8642 build: support building various OFED 58/22758/17
Minh Diep [Tue, 27 Sep 2016 16:03:33 +0000 (09:03 -0700)]
LU-8642 build: support building various OFED

* Remove the 01-remove-mlx4-erroneous-modprobe-config-file:rhel6.ed
* Differentiate each type of OFED to allow different ways
of downloading, unpacking and building with different options
* Symlink SLES linux-obj after unpacking the rpm

Test-Parameters: trivial

Change-Id: I7fcd50a6b747dbb5419bb029087967f809ef2485
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/22758
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9109 ldlm: restore missing newlines in ldlm sysfs files 22/25522/3
John L. Hammond [Fri, 17 Feb 2017 17:23:49 +0000 (11:23 -0600)]
LU-9109 ldlm: restore missing newlines in ldlm sysfs files

Restore the missing trailing newlines in
/sys/fs/lustre/ldlm/namespaces/*/lru_{max_age,size}.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib9acf8ea6126a16f89da86cfaeadf7685d5c802c
Reviewed-on: https://review.whamcloud.com/25522
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8686 osd: add few more credits if debugging is enabled 44/23044/4
Alex Zhuravlev [Mon, 10 Oct 2016 10:07:13 +0000 (13:07 +0300)]
LU-8686 osd: add few more credits if debugging is enabled

This keeps JBD happy, prevents a panic, and lets OSD detect
credit overuse.

Change-Id: I93fae9bd0d8208af888b75232eb9b9cde205a98f
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/23044
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-9132 utils: tuning max_sectors_kb on mount 83/25483/4
Niu Yawei [Thu, 16 Feb 2017 02:45:23 +0000 (21:45 -0500)]
LU-9132 utils: tuning max_sectors_kb on mount

Sometimes the user doesn't want max_sectors_kb to be
overwritten on MDT/OST mount. To give the user more
control over this important parameter, a new mount
option 'max_sectors_kb' is introduced:

- When max_sectors_kb isn't specified on mount, change
  max_sectors_kb to max_hw_sectors_kb. This is the default
  behavior, suited for most users.
- When max_sectors_kb is specified as zero, leave the old
  setting of max_sectors_kb untouched.
- When max_sectors_kb is specified as a positive number,
  set max_sectors_kb to that number as given.
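
The three cases listed above can be sketched as a small decision function.
This is a minimal illustration, not the mount utility's actual code: the
function name pick_max_sectors_kb() and its parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper: returns the value to write to max_sectors_kb.
 *   opt_given   - whether the max_sectors_kb mount option was specified
 *   opt_value   - the value given with the option (ignored if !opt_given)
 *   current_kb  - the device's current max_sectors_kb
 *   hw_max_kb   - the device's max_hw_sectors_kb
 */
static unsigned long pick_max_sectors_kb(bool opt_given,
                                         unsigned long opt_value,
                                         unsigned long current_kb,
                                         unsigned long hw_max_kb)
{
    if (!opt_given)
        return hw_max_kb;      /* default: raise to the hardware limit */
    if (opt_value == 0)
        return current_kb;     /* zero: leave the old setting untouched */
    return opt_value;          /* positive: use the requested value */
}
```

For example, a device with a current value of 512 and a hardware limit of
4096 would end up at 4096 by default, stay at 512 when the option is zero,
and move to 1024 when the option is 1024.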

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I251d75bbdf16d8cb6d503bbbfc69fb18993f7a3e
Reviewed-on: https://review.whamcloud.com/25483
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-9127 target: tgt_cb_last_committed is too noisy 69/25469/4
Andrew Perepechko [Wed, 15 Feb 2017 09:02:20 +0000 (12:02 +0300)]
LU-9127 target: tgt_cb_last_committed is too noisy

tgt_cb_last_committed() prints a D_HA message even if
last_committed was not updated. Print the message only
when last_committed was updated, which keeps mostly the
same information while saving debug log space and
CPU time.
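
The idea of logging only on a real update can be sketched as below. This is
an assumption-laden illustration: struct target_state, its fields, and
update_last_committed() are hypothetical names, and fprintf() stands in for
the D_HA debug message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the target's commit tracking state. */
struct target_state {
    uint64_t     last_committed;
    unsigned int messages_logged;   /* counts debug lines, for illustration */
};

/* Advance last_committed and log only when the value actually changes. */
static bool update_last_committed(struct target_state *t, uint64_t transno)
{
    if (transno <= t->last_committed)
        return false;               /* nothing new: stay quiet */

    t->last_committed = transno;
    t->messages_logged++;
    fprintf(stderr, "last_committed updated to %llu\n",
            (unsigned long long)transno);
    return true;
}
```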

Change-Id: Ic2784e6a3652ca1851cde1313d6985ed2f90e36b
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-on: https://review.whamcloud.com/25469
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alexander Boyko <alexander.boyko@seagate.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
2 years ago LU-9116 libcfs: avoid overflow of crypto bandwidth calculation 36/25436/4
Gu Zheng [Tue, 14 Feb 2017 03:26:11 +0000 (11:26 +0800)]
LU-9116 libcfs: avoid overflow of crypto bandwidth calculation

bcount and buf_len are both int, and the calculation has no explicit
cast to a wider type:
tmp = ((bcount * buf_len / jiffies_to_msecs(end - start)) *
1000) / (1024 * 1024);
This may overflow on modern fast machines.

Change-Id: I1e5abccad3e4df62907317a09de02beb6d831e13
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/25436
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Xi <lixi@ddn.com>
2 years ago LU-4423 llite: use 64-bit times in another debug print 04/25404/2
Arnd Bergmann [Sun, 12 Feb 2017 17:14:07 +0000 (12:14 -0500)]
LU-4423 llite: use 64-bit times in another debug print

The ll_setattr_raw() function prints the new inode timestamps
along with the current time using '%lu', which overflows in
2106. This changes the printing of the current time for
now; the other two will change when we migrate the VFS code
to use 64-bit timestamps.

Linux-commit: 8d7eed54a2391db16f184b18cde5c1824775ebdc

Change-Id: I96b5b1599b8af8446ee68f88fc739843291b304c
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25404
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9103 tests: SKIP recovery-small/110g for old MDS versions 67/25367/2
Parinay Kondekar [Fri, 10 Feb 2017 05:58:43 +0000 (11:28 +0530)]
LU-9103 tests: SKIP recovery-small/110g for old MDS versions

LU-2430 is not present on 2.5.x, in particular the change
"LU-2430 mdd: add lfs mv to migrate inode.",
so for MDS versions older than 2.6.57 this test needs
to be SKIPped.

Test-Parameters: testlist=recovery-small
Signed-off-by: Parinay Kondekar <Parinay.Kondekar@seagate.com>
Change-Id: I21d685062733a9d8a633ff208bcb83d3ec146ca0
Reviewed-on: https://review.whamcloud.com/25367
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
2 years ago LU-9067 utils: ensure debugfs is mounted 82/25182/8
James Simmons [Fri, 17 Feb 2017 17:17:19 +0000 (12:17 -0500)]
LU-9067 utils: ensure debugfs is mounted

With the move of lustre to sysfs and tracepoints it
will become critical to have debugfs mounted. On
older platforms like RHEL6 it's not mounted by default.
It is also possible that debugfs could become unmounted
by accident, thus disabling functionality needed
to control lustre. Add to libcfs.a a function that
is always called to ensure debugfs is mounted.
If debugfs is not mounted then mount it if the caller
is root.
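
The detection half of this check can be sketched as a scan over text in the
/proc/mounts format. This is only an illustration under assumptions: the
helper name mounts_has_fstype() is hypothetical, the input is assumed to be
well formed (at least three whitespace-separated fields per line), and the
actual mount step for root callers is left out.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns true if any line of 'mounts' (text in
 * /proc/mounts format: device, mount point, fs type, ...) has the given
 * filesystem type in its third field. */
static bool mounts_has_fstype(const char *mounts, const char *fstype)
{
    char dev[256], dir[256], type[64];
    const char *line = mounts;

    while (line != NULL && *line != '\0') {
        if (sscanf(line, "%255s %255s %63s", dev, dir, type) == 3 &&
            strcmp(type, fstype) == 0)
            return true;

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return false;
}
```

A caller would read /proc/mounts into a buffer, pass it in with "debugfs" as
the type, and attempt the mount only when the result is false and the
effective user is root.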

Change-Id: I21f85ba252b67bfbc22b23920e2ccaffc196074b
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25182
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-8981 test: sanity 311 check is too strict 14/24814/4
Lai Siyao [Fri, 6 Jan 2017 19:29:59 +0000 (03:29 +0800)]
LU-8981 test: sanity 311 check is too strict

The system may be too busy to destroy unlinked objects in time, which
causes sanity to fail. Use a smaller value so that autotest does not fail.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ibbe543f279c3548176c53b3fdb7b8048ea08931f
Reviewed-on: https://review.whamcloud.com/24814
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8911 tests: sanity-hsm test_24d fails on a local setup 85/24185/11
Quentin Bouget [Tue, 6 Dec 2016 14:41:16 +0000 (15:41 +0100)]
LU-8911 tests: sanity-hsm test_24d fails on a local setup

In test_24d, do not use the default mountpoint of copytool_setup()
as it is also the one the test mounts in read-only mode.

This patch also removes:
 - the "continuing fast path" in copytool_setup (calling
   "pkill -CONT -x lhsmtool_posix" instead of launching a new
   copytool) as it is not explicitly used by any test
 - as well as the first copytool_setup located outside of any
   tests.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I86bc41213bc656d7a83d63fb8e9bc595ba6b73ca
Reviewed-on: https://review.whamcloud.com/24185
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9038 obdclass: handle early requests vs CT registering 50/25050/9
Bruno Faccini [Tue, 24 Jan 2017 15:19:31 +0000 (16:19 +0100)]
LU-9038 obdclass: handle early requests vs CT registering

This patch addresses cases where CDT may start to send requests
before CT has fully registered with all MDTs and thus when the KUC
pipe kernel side has still not been initialized in
lmv_hsm_ct_register().
This will avoid Oopses due to kkuc_groups[KUC_GRP_HSM] being
uninitialized/zeroed, and we rely on CDT to retry later.
sanity-hsm/test_402b has been added to verify.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ibccf2627aebe8da52128da5d90d24751394bf61d
Reviewed-on: https://review.whamcloud.com/25050
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-6210 mdd: Change positional struct initializers to C99 47/23747/3
Steve Guminski [Mon, 9 Jan 2017 13:33:13 +0000 (08:33 -0500)]
LU-6210 mdd: Change positional struct initializers to C99

This patch makes no functional changes.  Struct initializers in the
mdd directory that use C89 or GCC-only syntax are updated to C99
syntax.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, allows changes in the struct
definition, and clears any members that are not given an explicit
value.
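
The difference between the two styles can be sketched as follows. The
struct here is a hypothetical stand-in (example_name and its members are
not the real lu_name definition); it only illustrates positional versus
C99 designated initializers.

```c
#include <string.h>

/* Hypothetical struct used only for illustration. */
struct example_name {
    const char *en_name;
    int         en_namelen;
    int         en_flags;
};

static const char dotdot[] = "..";

/* Positional (C89) form: correctness depends on member order. */
static struct example_name name_positional = {
    dotdot, sizeof(dotdot) - 1
};

/* Designated (C99) form: members are named, order is free, and any
 * member without an explicit value (en_flags here) is zeroed. */
static struct example_name name_designated = {
    .en_namelen = sizeof(dotdot) - 1,
    .en_name    = dotdot,
};
```

If a new member were later inserted at the front of the struct, the
positional form would silently assign values to the wrong members, while the
designated form would keep working unchanged.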

The following struct initializer has been updated:

lustre/mdd/mdd_dir.c:
static struct lu_name lname_dotdot

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I390a67c9c7182ca1e1aee7aa926e7d2509f73cbf
Reviewed-on: https://review.whamcloud.com/23747
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8403 obd: remove OBD_NOTIFY_SYNC{,_NONBLOCK} 21/21421/5
John L. Hammond [Tue, 19 Jul 2016 14:21:15 +0000 (09:21 -0500)]
LU-8403 obd: remove OBD_NOTIFY_SYNC{,_NONBLOCK}

None of the OBD notify handlers listen for OBD_NOTIFY_SYNC{,_NONBLOCK}
events so remove them and related code in lov_notify().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I3bc2bd34b268b28777241555dae8896577150c91
Reviewed-on: https://review.whamcloud.com/21421
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9019 obd: use 64-bit timestamps for rpc stats 10/25410/4
James Simmons [Fri, 17 Feb 2017 15:45:58 +0000 (10:45 -0500)]
LU-9019 obd: use 64-bit timestamps for rpc stats

The debugfs rpc stats interface contains timestamps that are
computed from timeval, which overflows in 2038 on 32-bit systems.

This changes the output to use a timespec64 type to avoid the
overflow. I also change the format to print the sub-second portion
as 9 digits (nanoseconds) for clarity, rather than printing six
digits without leading zeroes.
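
The formatting change can be sketched in user space as below. This is an
illustration under assumptions: struct ts64 is a hypothetical stand-in for
the kernel's timespec64, and format_snapshot_time() is not a function from
the patch.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a 64-bit timespec. */
struct ts64 {
    int64_t tv_sec;
    long    tv_nsec;
};

/* Print seconds as a 64-bit value and nanoseconds zero-padded to nine
 * digits, so the fractional part has a fixed width. */
static int format_snapshot_time(char *buf, size_t len,
                                const struct ts64 *ts)
{
    return snprintf(buf, len, "%lld.%09ld",
                    (long long)ts->tv_sec, ts->tv_nsec);
}
```

Without the zero padding, 5000 ns after the second would print as ".5000",
which reads as half a second; with "%09ld" it prints as ".000005000".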

Change-Id: I8fca45ef62672f3880a444961cb068d8c436e2c7
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Grégoire Pichon <gregoire.pichon@bull.net>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9094 o2iblnd: kill timedout txs from ibp_tx_queue 76/25376/2
Sergey Cheremencev [Tue, 27 Dec 2016 20:29:52 +0000 (23:29 +0300)]
LU-9094 o2iblnd: kill timedout txs from ibp_tx_queue

Sometimes a connection can't be established for a long time
due to rejections, which produces a cycle of reconnections.
Unlike the connection, the peer is not removed in each iteration.
Thus, until the connection becomes established, txs live in
peer->ibp_tx_queue. This patch adds tx_deadline checking
for txs from the peer tx_queue.
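
The deadline check can be sketched on a plain array instead of the kernel
list the driver uses. This is an illustration only: struct tx here is a
hypothetical simplification, expire_queued_txs() is not a function from the
patch, and time is an abstract counter rather than jiffies.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified transmit descriptor. */
struct tx {
    uint64_t tx_deadline;   /* time at or after which the tx has timed out */
    int      tx_id;
};

/* Drop every queued tx whose deadline has passed, keeping the rest in
 * their original order. Returns the number of txs still queued. */
static size_t expire_queued_txs(struct tx *queue, size_t count, uint64_t now)
{
    size_t kept = 0;

    for (size_t i = 0; i < count; i++) {
        if (now >= queue[i].tx_deadline)
            continue;               /* timed out: do not keep it */
        queue[kept++] = queue[i];
    }
    return kept;
}
```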

Change-Id: Id2623285c735d1dff40ec755a5c8d20e9c62e60a
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Seagate-bug-id: MRP-4056
Reviewed-on: https://review.whamcloud.com/25376
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9094 lnet: remove ni from lnet_finalize 75/25375/5
Sergey Cheremencev [Fri, 13 Jan 2017 16:35:40 +0000 (19:35 +0300)]
LU-9094 lnet: remove ni from lnet_finalize

Remove ni from lnet_finalize and kiblnd_txlist_done
input arguments. Also small code cleanup.

Change-Id: I509bfb21629e3dc0b4c80ec083e3953b78fdf874
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Seagate-bug-id: MRP-4056
Reviewed-on: https://review.whamcloud.com/25375
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7441 nrs: Free hash table if failed to start a nrs policy 24/17224/6
Li Xi [Tue, 17 Nov 2015 09:07:00 +0000 (17:07 +0800)]
LU-7441 nrs: Free hash table if failed to start a nrs policy

The hash table should be freed if starting an nrs policy fails,
otherwise it will cause a memory leak.
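
The cleanup pattern can be sketched in user space as follows. This is an
illustration under assumptions: struct policy_data, policy_start(), the
fail_late flag and the live_tables counter are all hypothetical, and
calloc()/free() stand in for the kernel hash table allocation and release.

```c
#include <stdlib.h>

/* Hypothetical per-policy data holding an allocated table. */
struct policy_data {
    int *hash;
    int  started;
};

/* Counts tables currently allocated, for illustration only. */
static int live_tables;

/* Start a policy. 'fail_late' simulates a failure that happens after
 * the table was allocated; the error path must release it. */
static int policy_start(struct policy_data *pd, int fail_late)
{
    pd->hash = calloc(64, sizeof(*pd->hash));
    if (pd->hash == NULL)
        return -1;
    live_tables++;

    if (fail_late)
        goto out_free;

    pd->started = 1;
    return 0;

out_free:
    /* Without this block the table would leak on every failed start. */
    free(pd->hash);
    pd->hash = NULL;
    live_tables--;
    return -1;
}
```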

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Ib5380fab1843af129f4cbf7ac396fb620bb8a617
Reviewed-on: https://review.whamcloud.com/17224
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8773 llite: refactor lov_object_fiemap() 61/23461/3
Bobi Jam [Thu, 27 Oct 2016 08:39:11 +0000 (16:39 +0800)]
LU-8773 llite: refactor lov_object_fiemap()

* Change loff_t to u64 in lov_object_fiemap() since loff_t is a
  signed value type.
* Add fiemap_for_stripe() to get file map extent from each stripe
  device.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic8ac98747eb32f4be90e602a0995fad8ef211bb8
Reviewed-on: https://review.whamcloud.com/23461
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8767 llite: Improve proc file text in lproc_llite.c 42/23942/5
Steve Guminski [Thu, 24 Nov 2016 17:17:30 +0000 (12:17 -0500)]
LU-8767 llite: Improve proc file text in lproc_llite.c

Improves the instructions displayed when reading from the stats
files.  Several repeated code blocks are consolidated into a new
function to reduce duplication.

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Ic04ca76016e6e6568ea9f81a8af6153d99053412
Reviewed-on: https://review.whamcloud.com/23942
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9100 lnet: lctl net down success when lnet not loaded 59/25359/4
Giuseppe Di Natale [Thu, 9 Feb 2017 18:59:00 +0000 (10:59 -0800)]
LU-9100 lnet: lctl net down success when lnet not loaded

lctl network down|unconfigure will no longer issue an
error when the lnet module is not loaded. There is no
network to unconfigure. Therefore, we can let the user
know and exit without error.

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Change-Id: I1eda08bd5bb0198e9d374b3b7f00306286da25cc
Reviewed-on: https://review.whamcloud.com/25359
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8703 libcfs: remove useless abstraction 48/25048/2
Dmitry Eremin [Tue, 24 Jan 2017 12:49:28 +0000 (15:49 +0300)]
LU-8703 libcfs: remove useless abstraction

Remove additional abstraction of cfs_cpu_ht_nsiblings().
Replace it with direct call to original function.

Change-Id: I63fa4a197519431dcf76c66cf22328e8b4410681
Test-Parameters: trivial
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/25048
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years ago LU-8888 clio: remove unused members from struct cl_thread_info 62/24062/5
Dmitry Eremin [Wed, 9 Nov 2016 12:49:00 +0000 (15:49 +0300)]
LU-8888 clio: remove unused members from struct cl_thread_info

The pointer to the topmost ongoing IO in the thread and
other members are not used any more.

Change-Id: I4875fe7d0e5f64fd3a4a60fc1bd1877c9d6e3340
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/24062
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7670 mdt: allow changelog commands to return errors 30/18030/28
Ben Evans [Thu, 14 Jan 2016 20:58:15 +0000 (14:58 -0600)]
LU-7670 mdt: allow changelog commands to return errors

Return errors to lctl/lfs users for out of range, and invalid IDs.
Currently only 0 is ever returned, regardless of outcome.  Some
information is printed in the MDS logs, but nothing on the client
to indicate success or failure.

Split purge and clear code into 2 separate paths.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: Ie139b170da0c8bef1c315ddb5361783230bb51ad
Reviewed-on: https://review.whamcloud.com/18030
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9019 mdt: use 64-bit timestamps for rename stats 12/25412/2
James Simmons [Sun, 12 Feb 2017 17:39:33 +0000 (12:39 -0500)]
LU-9019 mdt: use 64-bit timestamps for rename stats

The rename stats interface contains timestamps that are
computed from timeval, which overflows in 2038 on 32-bit systems.

This changes the output to use a timespec64 type to avoid the
overflow. I also change the format to print the sub-second portion
as 9 digits (nanoseconds) for clarity, rather than printing six
digits without leading zeroes.

Change-Id: I3b395160d8f5c76553f20dc4ca1047ae2a3df2b6
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25412
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8560 build: announce linux kernel 4.6.7 support 74/25174/4
James Simmons [Wed, 8 Feb 2017 20:26:49 +0000 (15:26 -0500)]
LU-8560 build: announce linux kernel 4.6.7 support

Bump the kernel version in ChangeLog to the latest supported
kernel, which is 4.6.7.

Test-Parameters: trivial

Change-Id: I2f829a967fffedcaef0d98dddf9c9bb6582ee5c5
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25174
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-9125 test: Correct setstripe -s option 81/25481/5
James Nunez [Wed, 15 Feb 2017 23:27:21 +0000 (16:27 -0700)]
LU-9125 test: Correct setstripe -s option

Some flags for 'lfs setstripe' were deprecated and
tests were not updated. recovery-small test 24b calls
'lfs setstripe' with a '-s' flag which needs to be '-S'.

Test-Parameters: trivial testlist=recovery-small

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iaaaecf0a73a9b659ff44d3c0a50e4386540ba0f3
Reviewed-on: https://review.whamcloud.com/25481
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
2 years ago LU-9123 test: correct setstripe options in layout test 71/25471/4
John L. Hammond [Wed, 15 Feb 2017 16:07:11 +0000 (10:07 -0600)]
LU-9123 test: correct setstripe options in layout test

In llapi_layout_test.c, use '-S' instead of the now removed '-s'
option to lfs setstripe.

Test-Parameters: trivial

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I74f020f29b1a274e6fe63aac29685b88e541e649
Reviewed-on: https://review.whamcloud.com/25471
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-9094 o2iblnd: reconnect peer for REJ_INVALID_SERVICE_ID 78/25378/3
Sergey Cheremencev [Fri, 16 Dec 2016 12:08:56 +0000 (15:08 +0300)]
LU-9094 o2iblnd: reconnect peer for REJ_INVALID_SERVICE_ID

Don't kill the peer in case of INVALID_SERVICE_ID. Doing so produces
a huge number of peers for the same nid and may cause an OOM.

The OOM was frequently seen with mlnx-ofa-kernel-2.3, which uses an
RCU mechanism in mlx4_cq_free. In older mlnx4 versions, RCU was
replaced with spin locks to mitigate the issue.

Change-Id: Ib609232242c45bc9819e1cb4c593da3a490c63a0
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Seagate-bug-id: MRP-4056
Reviewed-on: https://review.whamcloud.com/25378
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9115 llite: buggy special handling on MULTIMODRPCS 35/25435/2
Niu Yawei [Tue, 14 Feb 2017 03:04:56 +0000 (22:04 -0500)]
LU-9115 llite: buggy special handling on MULTIMODRPCS

There is some special handling of the MULTIMODRPCS flag in
client_connect_import(). It looks unnecessary and buggy:
the MULTIMODRPCS flag would be cleared unexpectedly from
imp_connect_data on reconnect.

This patch removes the special handling code and treats
MULTIMODRPCS normally, just like other flags.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Icba5a1413349f7a4c61dcac0bb4a39f1d1b0128d
Reviewed-on: https://review.whamcloud.com/25435
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
2 years ago LU-4423 mdc: use 64-bit timestamps for mdc 05/25405/2
Arnd Bergmann [Sun, 12 Feb 2017 17:18:25 +0000 (12:18 -0500)]
LU-4423 mdc: use 64-bit timestamps for mdc

These three are timestamps that are sent over the wire in mdc_lib
and the obd logging 64-bit values, but are generated using the 32-bit
get_seconds() function, which will eventually overflow.

Changing them to use 64-bit ktime_get_real_seconds() solves the problem.

Linux-commit: 14e3f92a4c46eedfe745b0dec42a4dcb1b16a989

Change-Id: I2063ae51a8335cd6887f800f0f30e8d90cfe7d2b
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25405
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
2 years ago LU-9101 kernel: kernel update [SLES11 SP4 3.0.101-94] 74/25374/2
Bob Glossman [Thu, 9 Feb 2017 23:18:42 +0000 (15:18 -0800)]
LU-9101 kernel: kernel update [SLES11 SP4 3.0.101-94]

Update SLES11 SP4 kernel to 3.0.101-94

Test-Parameters: mdsdistro=sles11sp4 ossdistro=sles11sp4 \
  clientdistro=sles11sp4 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Change-Id: Ia2d60239d990972ecf2ade567793a933f4491f28
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/25374
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8066 ldlm: move server side /proc/fs/lustre/ldlm to sysfs 60/25160/5
James Simmons [Fri, 17 Feb 2017 20:54:48 +0000 (15:54 -0500)]
LU-8066 ldlm: move server side /proc/fs/lustre/ldlm to sysfs

Move the rest of the simple proc files for ldlm that
only appear on server nodes to sysfs.

Change-Id: I0e4b72dfc4fe3be72b005b5b075005c11e4197d9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25160
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8947 test: fix getting OST name at sanity test_253 87/24387/4
Alexander Boyko [Fri, 16 Dec 2016 06:59:22 +0000 (09:59 +0300)]
LU-8947 test: fix getting OST name at sanity test_253

The test gets the OST name the wrong way, and if the system
has more than ten OSTs, the test fails. This patch
resolves the issue.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Seagate-bug-id: MRP-3330
Test-Parameters: trivial
Change-Id: Ic6080f725c6e693f6cca49d9aebf06c6add4610b
Reviewed-on: https://review.whamcloud.com/24387
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
2 years ago LU-9125 utils: Postpone deprecation of some options. 03/25503/3
Oleg Drokin [Thu, 16 Feb 2017 21:10:18 +0000 (16:10 -0500)]
LU-9125 utils: Postpone deprecation of some options.

This postpones deprecation of some setstripe options
to 2.9.59, since there is still plenty of code using
them in the test framework and we need to update all
of it in an orderly way.

Test-Parameters: trivial

Change-Id: Ia6c06ba9298c71b012da5d678731a62e677600e4
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/25503
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years ago LU-9081 config: don't attach sub logs for LWP 93/25293/2
Niu Yawei [Tue, 7 Feb 2017 07:51:12 +0000 (02:51 -0500)]
LU-9081 config: don't attach sub logs for LWP

A Lustre target processes the client log to retrieve MDT NIDs and start
LWPs. This goes through the same mgc_process_config() code path as
processing the target config log, so sub clds for security,
nodemap, param & recovery are attached unnecessarily.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I926ec4f33d3899c73a9c000b3ad0c0e5c102dfde
Reviewed-on: https://review.whamcloud.com/25293
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-5170 utils: Add support for --list-commands option 02/24902/4
Steve Guminski [Fri, 13 Jan 2017 15:43:35 +0000 (10:43 -0500)]
LU-5170 utils: Add support for --list-commands option

A --list-commands option has been added to lfs, lctl, lnetctl and
lst to output a list of the commands supported by each utility. The
commands are printed in a multi-column format to produce more compact
output that is easier to read.  The appropriate man pages have been
updated to include the new options.

The obsolete ACL and join commands have been removed from lfs.

Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I3715049539c76e0cd03accfccfbf7eda6f4bf2ff
Reviewed-on: https://review.whamcloud.com/24902
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
2 years ago LU-6210 utils: Change positional struct initializers to C99 37/23537/15
Steve Guminski [Tue, 8 Nov 2016 21:38:15 +0000 (16:38 -0500)]
LU-6210 utils: Change positional struct initializers to C99

This patch makes no functional changes.  Struct initializers in the
utils directory that use C89 or GCC-only syntax are updated to C99
syntax.  Whitespace is corrected to match the coding style guidelines.
Variables of type struct option have been renamed to long_opts for
consistency.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, allows changes in the struct
definition, and clears any members that are not given an explicit
value.

The following struct initializers have been updated:

utils/gss/gssd.c:
struct sembuf op (2 occurrences)
utils/gss/lgss_sk.c:
static struct option long_opt[]
utils/gss/lsupport.c:
static struct convert_struct converter[]
static struct user_mapping mapping
utils/gss/svcgssd_mech2file.c:
static const struct oid2mech o2m[]

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I9bee8bbea817afc369a34252513ad3bdee947851
Reviewed-on: https://review.whamcloud.com/23537
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6499 obdclass: obdclass module cleanup upon load error 44/22544/13
Bruno Faccini [Fri, 16 Sep 2016 08:15:06 +0000 (10:15 +0200)]
LU-6499 obdclass: obdclass module cleanup upon load error

Fix obdclass_init() error paths so that they proceed with cleanup.
In particular, this avoids a crash on the next load attempt, which
happened because the previous miscdevice had not been
deregistered and was thus still referenced in misc_list when
unmapped.
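
The general pattern can be sketched as follows. This is not the actual
obdclass_init() code; every function name below is a hypothetical
stand-in, and the sketch only shows how each failure path unwinds the
steps that already succeeded:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real registration steps. */
static bool misc_registered;
static bool procfs_registered;

static int register_misc(void)
{
	misc_registered = true;
	return 0;
}

static void deregister_misc(void)
{
	misc_registered = false;
}

static int register_procfs(bool fail)
{
	if (fail)
		return -1;
	procfs_registered = true;
	return 0;
}

static void deregister_procfs(void)
{
	procfs_registered = false;
}

static int init_caches(bool fail)
{
	return fail ? -1 : 0;
}

/* Each failure jumps to a label that undoes the earlier steps. */
static int module_init_sketch(bool fail_procfs, bool fail_caches)
{
	int rc;

	rc = register_misc();
	if (rc)
		return rc;

	rc = register_procfs(fail_procfs);
	if (rc)
		goto out_misc;

	rc = init_caches(fail_caches);
	if (rc)
		goto out_procfs;

	return 0;

out_procfs:
	deregister_procfs();
out_misc:
	deregister_misc();
	return rc;
}
```

After a failed call nothing stays registered, so a later load attempt
starts from a clean state.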

conf-sanity/test_102 has been created to verify correct
behavior/cleanup.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I6f65cf651251cf92427e2bab3ff6b51ebb98d699
Reviewed-on: https://review.whamcloud.com/22544
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8550 test: fix problems of conf-sanity test_32 46/22146/3
Li Xi [Mon, 22 Aug 2016 08:22:30 +0000 (16:22 +0800)]
LU-8550 test: fix problems of conf-sanity test_32

1) test_32a: The MGS should be unmounted, otherwise the separate
MGS will cause conf-sanity 32a to fail. Restart the MGS in
t32_test_cleanup() because the MGS must always be started.
2) test_32c: Format and run tunefs on the same MDS in the conf-sanity 32
tests. If mds1 and mds2 are different machines, conf-sanity 32c
will fail.

    Test-Parameters: trivial combinedmdsmgs=false envdefinitions=ONLY=32 testlist=conf-sanity

    Test-Parameters: trivial envdefinitions=ONLY=32 testlist=conf-sanity

Change-Id: I16e259d16912f894341a17855aca1e3fbbe6a8f3
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/22146
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5969 lustreapi: allow "version" without "lustre:" 24/25324/2
Andreas Dilger [Wed, 8 Feb 2017 12:11:32 +0000 (05:11 -0700)]
LU-5969 lustreapi: allow "version" without "lustre:"

The upstream kernel /sys/fs/lustre/version file does not have the
"lustre:" prefix in it, only the version string.  Since we no longer
have multiple version strings in this file, if "lustre:" is not
found then just assume the whole string is the version, otherwise
continue to strip out the "lustre:" prefix as we did before.
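
A minimal sketch of that parsing rule, assuming the file contents have
already been read into a NUL-terminated buffer (the function name is
hypothetical and this is not the lustreapi implementation):

```c
#include <stddef.h>
#include <string.h>

/*
 * Return a pointer to the version text inside buf. If the "lustre:"
 * prefix is present, skip it and any blanks that follow; otherwise
 * treat the whole string as the version.
 */
static const char *version_from_buf(const char *buf)
{
	const char *p = strstr(buf, "lustre:");

	if (p == NULL)
		return buf; /* no prefix: the whole string is the version */

	p += strlen("lustre:");
	while (*p == ' ' || *p == '\t')
		p++;
	return p;
}
```

The sketch does not trim a trailing newline; a caller reading the
sysfs file would likely want to do that separately.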

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ife50f83ad422a8e3c9ae857f519a5cf4ca3ebbe5
Reviewed-on: https://review.whamcloud.com/25324
Tested-by: Jenkins
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9059 utils: skip label check for client 34/25234/2
Hongchao Zhang [Wed, 12 Oct 2016 11:05:13 +0000 (19:05 +0800)]
LU-9059 utils: skip label check for client

When mounting a Lustre client, the device label should not be checked.

Change-Id: I5018e5361545ed6c4e31e3d85360bb8fa5670b5e
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/25234
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-8526 tests: ensure all OSTs active for allocations 48/23148/2
Bruno Faccini [Thu, 13 Oct 2016 21:53:53 +0000 (23:53 +0200)]
LU-8526 tests: ensure all OSTs active for allocations

This patch ensures that all OSTs are active so that they can be
selected for objid allocations. This allows the random selection
of OSTs during setstripe to be effective.
To do so, wait_osts_up() has been exported from conf-sanity
to test-framework to become part of the generic function
set and be usable from any test suite, in particular the
replay-single/test_90 sub-test.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Id5db31948ab86c1b3c2bc289191917ed3e8aadf8
Reviewed-on: https://review.whamcloud.com/23148
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6142 lnet: remove most of typedefs from LNet headers 31/20831/10
James Simmons [Tue, 7 Feb 2017 15:55:49 +0000 (10:55 -0500)]
LU-6142 lnet: remove most of typedefs from LNet headers

Remove the majority of typedefs from the LNet headers.
Change them into structures instead. Currently only
lnet_nid_t, lnet_pid_t, and lnet_kiov_t are left.

Test-Parameters: trivial

Change-Id: Ib083d305ab945bab8d78ac96d17015550c0f9486
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/20831
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9040 scrub: handle group boundary properly 05/25105/4
Fan Yong [Thu, 6 Oct 2016 09:06:05 +0000 (17:06 +0800)]
LU-9040 scrub: handle group boundary properly

If the last bit in the current inode bitmap is set, then
osd_iit_param::offset will be set to the per-group inode
count (LDISKFS_INODES_PER_GROUP). Unfortunately, the
original logic in osd_inode_iteration() did not handle this
boundary value properly, so the iteration would scan the current
group again and again.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie555838cd782a9378c6305485dd737b6bc6b2d46
Reviewed-on: https://review.whamcloud.com/25105
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoNew tag 2.9.53 2.9.53 v2_9_53 v2_9_53_0
Oleg Drokin [Tue, 14 Feb 2017 03:58:24 +0000 (22:58 -0500)]
New tag 2.9.53

Change-Id: Ib1bde03bdfb7819e534f3fa290f8284dfb788770
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9032 tests: syntax error in cleanup_test32_mount 42/24942/4
Dmitry Eremin [Wed, 18 Jan 2017 19:05:58 +0000 (22:05 +0300)]
LU-9032 tests: syntax error in cleanup_test32_mount

Fix regular expression in sed parameter.

sed: unmatched '/'

Also make cleanup function common for all test_32*.
Style cleanup.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I6245d0b3c00b0f0f79213263b6c81e9ab6000d11
Reviewed-on: https://review.whamcloud.com/24942
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8843 client: fix all less than 0 comparison for unsigned values 11/23811/8
James Simmons [Tue, 24 Jan 2017 13:42:04 +0000 (08:42 -0500)]
LU-8843 client: fix all less than 0 comparison for unsigned values

Remove all tests of less than zero for unsigned values
found with -Wtype-limits.
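
As an illustration of the kind of check involved (the function names
are hypothetical and not from the patch), a sketch of range checks
that do not rely on an unsigned value being negative:

```c
#include <stddef.h>

/*
 * A test such as "count < 0" on a size_t is always false, which is
 * what -Wtype-limits reports. Only the upper bound is meaningful.
 */
static int count_is_valid(size_t count, size_t max)
{
	return count <= max;
}

/*
 * When a value can really be negative, keep it signed, test the sign
 * first, and only then compare against the unsigned limit.
 */
static int index_in_range(int idx, unsigned int nr)
{
	return idx >= 0 && (unsigned int)idx < nr;
}
```

Note that a negative value converted to an unsigned type becomes a
very large number, so the upper-bound check still rejects it.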

Change-Id: Ia0a5cc5fc280b3856397cc8a494014368a04bf75
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23811
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9078 lnet: Fix route hops print 50/25250/3
Amir Shehata [Fri, 3 Feb 2017 20:13:00 +0000 (12:13 -0800)]
LU-9078 lnet: Fix route hops print

The default number of hops for a route is -1. This is
currently being printed as %u. Change that to %d to
make it print out properly.
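
A small sketch of the formatting difference; the helper name and the
"hops:" label are assumptions for illustration only. With %u, a value
of -1 is shown as 4294967295 on platforms where unsigned int is 32
bits wide, while %d prints it as -1:

```c
#include <stddef.h>
#include <stdio.h>

/* Format a signed hop count; -1 means "default" and prints as -1. */
static int format_hops(char *buf, size_t len, int hops)
{
	return snprintf(buf, len, "hops: %d", hops);
}
```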

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I13c416b1ff86b55d72ffa124441dc358c9cea97b
Reviewed-on: https://review.whamcloud.com/25250
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8974 osd-ldiskfs: increase supported ldiskfs fs size 24/24524/5
Artem Blagodarenko [Mon, 26 Dec 2016 09:53:47 +0000 (12:53 +0300)]
LU-8974 osd-ldiskfs: increase supported ldiskfs fs size

Change "force_over_256tb" mount option to "force_over_512tb".

Seagate-bug-id: MRP-4077
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Change-Id: I4b0c1d9faf2139f2b4fc4ff94ed78af5e218110d
Reviewed-on: https://review.whamcloud.com/24524
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5620 ptlrpc: Add QoS for opcode in NRS-TBF 18/11918/37
Qian Yingjin [Mon, 6 Jun 2016 08:40:35 +0000 (16:40 +0800)]
LU-5620 ptlrpc: Add QoS for opcode in NRS-TBF

This patch adds a new QoS feature to the TBF policy which can
limit the rate based on opcode.

The syntax is:
    lctl set_param x.x.x.nrs_tbf_rule=
         "[reg|hp] start <rule_name> <arguments>..."
Start the tbf opcode QoS:
    lctl set_param ost.OSS.ost_io.nrs_policies="tbf opcode"
Limit ost_read and ost_write separately:
    lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
         "start ost_r opcode={ost_read} rate=100"
    lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
         "start ost_w opcode={ost_write} rate=200"
Limit both ost_read and ost_write:
    lctl set_param ost.OSS.ost_io.nrs_tbf_rule=
         "start ost_rw opcode={ost_read ost_write} rate=200"

The rate values, such as 100 and 200, are the number of
requests per second.

The opcode-based policy cannot currently be combined with the
NID-based or JobID-based policies.
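
To illustrate what a requests-per-second limit means, here is a
minimal user-space token bucket sketch. It is not the NRS-TBF
implementation; the struct, the function names and the depth (burst)
parameter are all assumptions made for this example:

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

struct tbf_bucket {
	uint64_t rate;    /* requests allowed per second (1..NSEC_PER_SEC) */
	uint64_t depth;   /* burst size: maximum number of stored tokens */
	uint64_t tokens;  /* whole tokens currently available */
	uint64_t last_ns; /* time up to which tokens have been credited */
};

static void tbf_init(struct tbf_bucket *b, uint64_t rate,
		     uint64_t depth, uint64_t now_ns)
{
	b->rate = rate;
	b->depth = depth;
	b->tokens = depth;
	b->last_ns = now_ns;
}

/* Return 1 if a request may proceed at time now_ns, 0 if it must wait. */
static int tbf_allow(struct tbf_bucket *b, uint64_t now_ns)
{
	uint64_t ns_per_token = NSEC_PER_SEC / b->rate;
	uint64_t earned = (now_ns - b->last_ns) / ns_per_token;

	if (earned > 0) {
		b->tokens += earned;
		if (b->tokens > b->depth)
			b->tokens = b->depth;
		/* advance by whole tokens only, keeping the remainder */
		b->last_ns += earned * ns_per_token;
	}

	if (b->tokens == 0)
		return 0;

	b->tokens--;
	return 1;
}
```

With rate=100 and a depth of 1, one request is admitted every 10 ms;
a rule such as rate=200 would halve that interval.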

Test-Parameters: alwaysuploadlogs
Signed-off-by: Wu Libin <lwu@ddn.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I4ff93972df560ad1ebc8e38e942d503518a835c7
Reviewed-on: https://review.whamcloud.com/11918
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9019 osd: migrate osd-ldiskfs thandle stats to 64 bit time 61/25161/4
James Simmons [Sun, 5 Feb 2017 01:10:37 +0000 (20:10 -0500)]
LU-9019 osd: migrate osd-ldiskfs thandle stats to 64 bit time

To avoid the 2038 overflow issue, we migrate the
thandle stats from using jiffies to ktime.

Change-Id: I57260d55ca7c0921a9115f4717b13724b870826d
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25161
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8702 tests: parallel execution of IOR and MDTEST added. 26/23126/4
Aditya Pandit [Wed, 9 Dec 2015 09:58:58 +0000 (15:28 +0530)]
LU-8702 tests: parallel execution of IOR and MDTEST added.

Added test case for execution of mdtest and IOR in parallel.

Test-Parameters: trivial testlist=parallel-scale
Change-Id: I3b8a74a94739417467cc04bcc5e688b487d0cfe7
Seagate-bug-id: MRP-3149
Signed-off-by: Ashish Maurya <ashish.maurya@seagate.com>
Signed-off-by: Aditya Pandit <aditya.pandit@seagate.com>
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/10376
Tested-by: Jenkins
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: https://review.whamcloud.com/23126
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8066 obdclass: Get rid of remaining /proc/sys/lustre plumbing 34/24034/4
Oleg Drokin [Sun, 1 Jan 2017 19:54:47 +0000 (14:54 -0500)]
LU-8066 obdclass: Get rid of remaining /proc/sys/lustre plumbing

Since all of the variables from /proc/sys/lustre were moved to
/sys/fs/lustre, get rid of the remaining infrastructure.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Change-Id: I6facdb8f52b86efb1e85a4d43ca2532a2f460a85
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24034
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9073 gss: quiet insecure key file warning 01/25201/2
Andreas Dilger [Thu, 2 Feb 2017 05:55:15 +0000 (22:55 -0700)]
LU-9073 gss: quiet insecure key file warning

Quiet spurious warning about insecure file access mode, because the
st_mode contains file type as well.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: If347eb3de67074269de4fe279ba4a849e03ebbe5
Reviewed-on: https://review.whamcloud.com/25201
Tested-by: Jenkins
Reviewed-by: Nathan Lavender <nblavend@iu.edu>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8066 ldlm: move /proc/fs/lustre/ldlm to sysfs 49/25049/4
Oleg Drokin [Mon, 30 Jan 2017 20:10:47 +0000 (15:10 -0500)]
LU-8066 ldlm: move /proc/fs/lustre/ldlm to sysfs

Unregister ldlm namespace from sysfs on free

ldlm_namespace_sysfs_unregister needs to be called in
ldlm_namespace_free_post so that we don't have this dangling object
there after the namespace has disappeared.

Linux-commit: 9c7e397c98d646a3a23ffd304def1750be916803

move procfs ldlm pool stats to sysfs

Suitable contents of /proc/fs/lustre/ldlm/namespaces/.../pools/
are moved to /sys/fs/lustre/ldlm/namespaces/.../pools/:
cancel_rate grant_plan grant_speed lock_volume_factor
server_lock_volume granted grant_rate limit recalc_period

Linux-commit: 24b8c88a7122df35ce6a413cd76e9581411eab8f

Add infrastructure to move ldlm pool controls to sysfs

This adds registration of /sys/fs/lustre/ldlm/namespaces/.../pool
dir.

Linux-commit: f2825e039e1a6b58411087e1e17638f872d00a93

move namespaces/lru_max_age to sysfs

Move ldlm display of lru_max_age from procfs to sysfs

Linux-commit: c841236dda9aa334f7e241e3c526360328f77343

move namespaces/lock_unused_count to sysfs

Move ldlm display of lock_unused_count from procfs to sysfs

Linux-commit: 3dd4598271fc119a4e3c5589be03f88a41c31e64

move namespaces/early_lock_cancel to sysfs

Move ldlm display of early_lock_cancel from procfs to sysfs

Linux-commit: 87d32094efc208f31e4e3b226d25e58058352208

move namespaces/lru_size to sysfs

Move ldlm display of lru_size from procfs to sysfs

Linux-commit: 6784096b4818636ad512575c701e164e8e6a09d3

move namespace/lock_count to sysfs

Move ldlm display of lock_count from procfs to sysfs

Linux-commit: 63af1f57474fac888116d896a0c5f17aeb6a702d

move namespaces/resource_count to sysfs

Move ldlm display of resource_count from procfs to sysfs

Linux-commit: 0f53c823f9664683ce1aadab2d6a4cee950d6f62

move cancel_unused_locks_before_replay to sysfs

/proc/fs/lustre/ldlm/cancel_unused_locks_before_replay is
moved to /sys/fs/lustre/ldlm/cancel_unused_locks_before_replay

Linux-commit: 0f53c823f9664683ce1aadab2d6a4cee950d6f62

Preparation to move /proc/fs/lustre/ldlm to sysfs

Add necessary infrastructure, register /sys/fs/lustre/ldlm,
/sys/fs/lustre/ldlm/namespaces and /sys/fs/lustre/ldlm/services

Linux-commit: 18fd8850a4c8177ecf4870ff38c208d329a21ed0

Change-Id: I2bb6e925bb95336b79a265ef46ebdd29d47b957c
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25049
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
2 years agoLU-9029 kernel: kernel update [SLES12 SP2 4.4.38-93] 38/24938/5
Bob Glossman [Wed, 14 Dec 2016 16:49:05 +0000 (08:49 -0800)]
LU-9029 kernel: kernel update [SLES12 SP2 4.4.38-93]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia72bf9e6542627efdd946c7213dee2c77fa73e57
Reviewed-on: https://review.whamcloud.com/24938
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8995 tests: set debug size correctly 82/24782/5
Alexey Lyashkov [Tue, 24 Jan 2017 12:05:35 +0000 (17:35 +0530)]
LU-8995 tests: set debug size correctly

Use library function to set the debug log size

Seagate-bug-id: MRP-4055
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Parinay Kondekar <Parinay.Kondekar@seagate.com>
Change-Id: I125ce7f5f7f7754e82f913ef8cf6944f40f631d6
Reviewed-on: https://review.whamcloud.com/24782
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4423 libcfs: remove IS_PO2 and __is_po2 77/24577/3
Aya Mahfouz [Tue, 17 Jan 2017 17:30:12 +0000 (12:30 -0500)]
LU-4423 libcfs: remove IS_PO2 and __is_po2

Removes IS_PO2 and __is_po2 since the uses of IS_PO2 have
been replaced by is_power_of_2

Linux-commit: d4891039904fa25edf1ca793a0469633ed81df3f

The following commit message is the same for the following
patches:

hash.c: Replace IS_PO2 by is_power_of_2

Linux-commit: 71872e9cc2af4dca1903ebc57daa15f08c795d86

selftest.h: replace IS_PO2 by is_power_of_2

Linux-commit: b3367164f4ff8ff2c1aa8bd79c7548f113b62b83

workitem.c: replace IS_PO2 by is_power_of_2

Linux-commit: 57b573d14b0fb9f83575a2cf155862d251c8f0d1

ldlm_extent.c: replace IS_PO2 by is_power_of_2

Linux-commit: 5f4179e04b31441b0b7995d14320a457aafba01b

Replaces IS_PO2 by is_power_of_2. It is more accurate to use
is_power_of_2 since it returns 1 for numbers that are powers
of 2 only whereas IS_PO2 returns 1 for 0 and numbers that are
powers of 2.
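
The difference can be sketched in user space as follows. Both
functions are modeled on the behavior described above rather than
copied from libcfs or the kernel headers, and their names are chosen
for this example:

```c
#include <stdbool.h>

/* Old-style check: no special case for zero, so 0 is accepted. */
static bool is_po2_old(unsigned long long val)
{
	return (val & (val - 1)) == 0;
}

/* Stricter check in the style of is_power_of_2(): zero is rejected. */
static bool is_power_of_2_sketch(unsigned long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}
```

For zero, val - 1 wraps to all ones, the bitwise AND is zero, and the
old check reports a power of two; the explicit n != 0 test removes
that case.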

Change-Id: Ic8bb40394b46ea433e3096c878abe467eacc7996
Signed-off-by: Aya Mahfouz <mahfouz.saif.elyazal@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24577
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
2 years agoLU-6245 libcfs: replace IS_PO2 with is_power_of_2 in server code 75/24575/10
James Simmons [Sat, 28 Jan 2017 03:47:24 +0000 (22:47 -0500)]
LU-6245 libcfs: replace IS_PO2 with is_power_of_2 in server code

Replaces IS_PO2 by is_power_of_2. It is more accurate to use
is_power_of_2 since it returns 1 for numbers that are powers
of 2 only whereas IS_PO2 returns 1 for 0 and numbers that are
powers of 2.

Change-Id: I595053a658a96818ac9b434377c275d3ed7143ec
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24575
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
2 years agoLU-8066 obdclass: move lustre server sysctl to sysfs 33/24033/3
James Simmons [Sun, 1 Jan 2017 19:42:27 +0000 (14:42 -0500)]
LU-8066 obdclass: move lustre server sysctl to sysfs

A few of the lustre sysctls are server side, so let's
move those to sysfs as well. Both memused and memused_max
are missing upstream but we need to keep them around
for now.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: Ie2fd2408d79aede4e40272a86f63f1e55311d1b9
Reviewed-on: https://review.whamcloud.com/24033
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8928 osd: convert osd-zfs to reference dnode, not db 93/24293/7
Alex Zhuravlev [Fri, 9 Dec 2016 16:38:34 +0000 (19:38 +0300)]
LU-8928 osd: convert osd-zfs to reference dnode, not db

This will be used later with methods like zap_add_by_dnode()
and similar, which are significantly faster as they don't
need to look up the dnode by dnode#.

Change-Id: Idc5341e9a472bbf0e5088b1bee784e4ddb6d635b
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/24293
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8769 lnet: removal of obsolete LNDs 21/23621/6
Sonia Sharma [Mon, 7 Nov 2016 17:32:00 +0000 (09:32 -0800)]
LU-8769 lnet: removal of obsolete LNDs

Obsolete LNDs were already removed. Commented out the name<->network
number mapping for the obsolete LNDs. Removed their initialization
from the array in the nidstrings.c file and occurrences of the constants
for these LNDs in other files.

Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Change-Id: I5c6ba0e88f5cabf0e875dc76bc5fccfbb16e9ab8
Reviewed-on: https://review.whamcloud.com/23621
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9019 o2iblnd: use 64-bit ibn_incarnation computation 67/23267/7
James Simmons [Sat, 28 Jan 2017 03:53:33 +0000 (22:53 -0500)]
LU-9019 o2iblnd: use 64-bit ibn_incarnation computation

ibn_incarnation is a 64-bit value, but using timeval to compute
it will cause an overflow in 2038. This changes it to use
ktime_get_real_ns() instead.
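
A user-space sketch of the same idea, using clock_gettime() in place
of the kernel's ktime_get_real_ns(); the function names here are
assumptions for illustration. The point is that the seconds value is
widened to 64 bits before it is scaled, so dates past January 2038 do
not overflow:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <stdint.h>
#include <time.h>

/* Combine seconds and nanoseconds using 64-bit arithmetic throughout. */
static uint64_t ns_from_timespec(int64_t sec, long nsec)
{
	return (uint64_t)sec * 1000000000ULL + (uint64_t)nsec;
}

/* Wall-clock nanoseconds since the epoch, usable as a 64-bit stamp. */
static uint64_t incarnation_now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_REALTIME, &ts);
	return ns_from_timespec((int64_t)ts.tv_sec, ts.tv_nsec);
}
```

A seconds count of 2147483648, the first value that no longer fits in
a signed 32-bit field, is handled without wrapping.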

Change-Id: I4698a046ece30a85c93ac1f12e541d81fcfd70f2
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23267
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8457 pacemaker: Pacemaker script to monitor LNet 66/22266/5
Gabriele Paciucci [Thu, 1 Sep 2016 15:54:55 +0000 (16:54 +0100)]
LU-8457 pacemaker: Pacemaker script to monitor LNet

A new script for Pacemaker that monitors LNet and is compatible
with ZFS and LDISKFS based Lustre server installations.
This RA is able to monitor a single LNet device using
Pacemaker's clone technology.

pcs resource create [Resource Name] ocf:lustre:healthLNET
dampen=[seconds 5s]
multiplier=[number 1000]
lctl=[true|false]
device=[device name ib0]
host_list=[list of NIDs, space separated]
--clone

where:
* dampen The time to wait (dampening) for further changes to occur
* multiplier The number by which to multiply the number of
  connected ping nodes
* attempts Number of ping attempts, per host, before
  declaring it dead
* timeout How long, in seconds, to wait before declaring
  a ping lost
* lctl Option to enable lctl ping instead of the normal ping.
  The default is true
* device Device used for the LNET network. We assume the
  same device across the cluster

This script should be located in /usr/lib/ocf/resource.d/lustre/
on both Lustre servers with permission 755.

Test-Parameters: trivial
Signed-off-by: Gabriele Paciucci <gabriele.paciucci@intel.com>
Change-Id: I6292ce36dde0083fa95cb1d047fe582bd7d53116
Reviewed-on: https://review.whamcloud.com/22266
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8420 ldlm: take at_current change into account on prolong 48/21448/9
Vladimir Saveliev [Fri, 2 Dec 2016 00:40:40 +0000 (02:40 +0200)]
LU-8420 ldlm: take at_current change into account on prolong

Prolong timeout is calculated based upon estimated service time. When
prolong is called after bulk transfer timeout there is a chance that
service estimate on server side was reset recently due to more time than
at_history passed since the worst rpc time.  If rpc timeout was
initially based on bigger service estimate, it may happen that prolonged
timeout will be smaller than the original one, and the lock callback
timer will not get prolonged which may result in client's eviction.

When trying to prolong lock callback timer take into account that the
worst server estimate might get reset. In that case calculate prolong
timeout based upon service estimate set by client on sending the rpc.

A test that illustrates the issue is included.

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Seagate-bug-id: MRP-3582
Change-Id: I79988c8e82967d8eef077f42cd6331999294ea50
Reviewed-on: https://review.whamcloud.com/21448
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4121 tests: Enable zfs tests dependent on ost,mgs ordering 13/7113/16
Nathaniel Clark [Thu, 25 Jul 2013 13:32:11 +0000 (09:32 -0400)]
LU-4121 tests: Enable zfs tests dependent on ost,mgs ordering

This enables tests that were marked as skipped for bug LU-2059, now
tracked as LU-4274.  The skipped tests are ones that fail because
mounting OSTs without the MGS started causes the OST mount to hang
and wait for the MGS.

Test-Parameters: trivial osscount=2 mdscount=2 ostcount=2 mdtcount=1 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=conf-sanity,insanity,sanity-quota
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I3e27a7583c857d416ef3a0bd2d5ee74814975def
Reviewed-on: https://review.whamcloud.com/7113
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-1573 recovery: Avoid data corruption for DIO during FOFB 80/16680/10
Parinay Kondekar [Thu, 1 Dec 2016 20:16:52 +0000 (01:46 +0530)]
LU-1573 recovery: Avoid data corruption for DIO during FOFB

When a userland app is doing DIO and the OST fails over,
obd->obd_no_transno is set to 1 and last_committed on the server
is not sent to the client.  Thus the client cannot tell whether
the request was _committed_ to disk. So it removes the request
from the resend queue and adds it to the replay queue.

Now trans_no > last_committed, so after reconnect the request
is replayed as part of the recovery process. The userland app
refills the DIO buffer with different data, so invalid data
is committed, resulting in corruption.

This change avoids the client replay by dropping the reply to
the client rather than sending a reply without any transno.
This ensures the client will resend the RPC before returning
to userspace instead of putting it in the replay queue, and
thus avoids the corruption.

The test changes require replay-ost-single test_9 to be modified
with an additional write to the file to increase the grants
available and avoid a sync write.

Seagate-Bug-Id: MRP-542, MRP-2418
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/3358

Change-Id: Ia30783c99e6c16a0c7ab70841eb98ed75dba1de9
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Tested-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Reviewed-on: https://review.whamcloud.com/16680
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
2 years agoLU-8979 ldlm: disable brw lock request in recovery 28/24528/5
Jinshan Xiong [Tue, 13 Dec 2016 18:20:47 +0000 (10:20 -0800)]
LU-8979 ldlm: disable brw lock request in recovery

The local brw lock shouldn't be acquired in recovery, otherwise it
may cause ldlm_reprocess_all() to be invoked in the lock replay
phase due to the async lock cancellation in ldlm_lock_decref(),
which will eventually cause the problem described in LU-8437.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ie54d84d154f025918d5196a5d2ecc4956bd57953
Reviewed-on: https://review.whamcloud.com/24528
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7734 gnilnd: update GNI lnd driver to handle multirail api changes 22/25122/2
James Simmons [Thu, 7 Jul 2016 18:07:50 +0000 (14:07 -0400)]
LU-7734 gnilnd: update GNI lnd driver to handle multirail api changes

The multirail changes moved several parameters in struct lnet_ni
to the new data structure called struct lnet_net. This patch
updates the Gemini driver to handle the API changes.

Test-Parameters: trivial
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I75830c570ed56c5b1b665115e8ac96a733a7e57e
Reviewed-on: https://review.whamcloud.com/21192
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-on: https://review.whamcloud.com/25122
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9034 mgc: relate sptlrpc & params to MGC 88/24988/7
Hongchao Zhang [Tue, 11 Oct 2016 17:41:55 +0000 (01:41 +0800)]
LU-9034 mgc: relate sptlrpc & params to MGC

If sptlrpc or params config logs come from different MGCs,
they should be regarded as different logs. This patch binds
these config logs to the MGC obd device to separate them.

Change-Id: Ib4f55c7b20bfe722a6a6f7511324a37e98cf9c66
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/24988
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9031 osd: handle jinode change for ldiskfs 41/24941/11
Yang Sheng [Mon, 23 Jan 2017 19:31:27 +0000 (03:31 +0800)]
LU-9031 osd: handle jinode change for ldiskfs

We need to take care of jinode for ldiskfs. Since we
don't get the inode from a syscall like sys_open(), we
have to initialize it in the OSD ourselves.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Iec6db290c3779a8f7c98e5d1356b71fd928d7c88
Reviewed-on: https://review.whamcloud.com/24941
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9030 kernel: kernel update RHEL7.3 [3.10.0-514.6.1.el7] 36/24936/5
Bob Glossman [Tue, 17 Jan 2017 22:31:27 +0000 (14:31 -0800)]
LU-9030 kernel: kernel update RHEL7.3 [3.10.0-514.6.1.el7]

update RHEL 7.3 kernel to 3.10.0-514.6.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ieffb0e4868afbd4f15932a850c17b6e16c1e84f8
Reviewed-on: https://review.whamcloud.com/24936
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8602 gss: Support GSS on linux 4.6+ kernels 89/23289/7
James Simmons [Sun, 1 Jan 2017 17:51:14 +0000 (12:51 -0500)]
LU-8602 gss: Support GSS on linux 4.6+ kernels

Currently the GSS code for Lustre directly uses the Linux crypto API.
The GSS code uses struct crypto_hash, which has been removed in
newer kernels in favor of struct crypto_ahash. We could run into
this issue again in the future, so to make porting easier
let's move the GSS code to the libcfs crypto API. That way, when
the Linux crypto API changes, the libcfs layer will handle
these changes and GSS will not need further patches. This patch also
exposes some of the libcfs crypto functions to userland.

Change-Id: I7baed64d0340ad864732a782ea401e2e0e9ae1b7
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23289
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9033 llite: don't zero timestamps internally 84/24984/2
Niu Yawei [Thu, 19 Jan 2017 02:58:51 +0000 (21:58 -0500)]
LU-9033 llite: don't zero timestamps internally

In ll_md_blocking_ast(), we zero all timestamps to keep these
'leftovers' from interfering with the new timestamps from the MDS,
especially when the timestamps are set back by other clients. It's not
quite right to change timestamps in this way, because:

1. The pending lock can be matched by getattr, so these zero
   timestamps can be fetched by an application in a small race window.

2. It doesn't make sense to zero the mtime and ctime, because we
   always use the newest ctime and mtime from the MDS when merging
   attributes, so they won't interfere with new timestamps set by
   other clients.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ieb9577abe4938bc47dc0577454a4a1bbf4796876
Reviewed-on: https://review.whamcloud.com/24984
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-8954 kernel: kernel update [SLES12 SP1 3.12.67-60.64.24] 27/24427/6
Bob Glossman [Mon, 28 Nov 2016 20:50:04 +0000 (12:50 -0800)]
LU-8954 kernel: kernel update [SLES12 SP1 3.12.67-60.64.24]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12 testgroup=review-ldiskfs \
  mdsdistro=sles12 ossdistro=sles12 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I6c474a939e3d6e8853388d645d82dbfe3038edee
Reviewed-on: https://review.whamcloud.com/24427
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8903 tests: racer test_1 to drop all error messages 39/24139/4
Chennaiah Palla [Mon, 5 Dec 2016 13:05:09 +0000 (18:35 +0530)]
LU-8903 tests: racer test_1 to drop all error messages

Filtered and dropped all Segmentation fault and Bus error
messages. Used "export LANG=C" so that messages are displayed in
English instead of the local language.

Test-Parameters: trivial mdtcount=1 testlist=racer,racer

Seagate-bug-id: MRP-3009
Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Change-Id: Ibef083870634f4c8dd6b86e6aa91b5978f27c656
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-on: https://review.whamcloud.com/24139
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8865 tests: add fs_test test 32/23932/4
Chennaiah Palla [Mon, 19 Dec 2016 16:28:07 +0000 (21:58 +0530)]
LU-8865 tests: add fs_test test

Patch adds parallel-scale fs_test test.

fs_test is an MPI-IO test that reports I/O performance
as effective bandwidth. We use fs_test to generate
a maximum write and read bandwidth scenario in which a lot of
data is written to amortize away the overhead of open/sync/close
operations.

Test-Parameters: trivial testlist=parallel-scale
Seagate-bug-id: MRP-3914
Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Change-Id: I6057c8269fa72a151a792dcf1d05d30f4882204d
Reviewed-on: https://review.whamcloud.com/23932
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6210 gss: Change positional struct initializers to C99 77/23677/3
Steve Guminski [Mon, 31 Oct 2016 17:46:10 +0000 (13:46 -0400)]
LU-6210 gss: Change positional struct initializers to C99

This patch makes no functional changes.  Struct initializers in the
gss directory that use C89 or GCC-only syntax are updated to C99
syntax.  Whitespace is changed to match the coding style guidelines.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, allows changes in the struct
definition, and clears any members that are not given an explicit
value.
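
A minimal sketch of the difference, for illustration only: struct
demo_cred and its members are hypothetical and are not taken from the
gss code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, for illustration only; not a Lustre type. */
struct demo_cred {
	unsigned int	 uid;
	unsigned int	 gid;
	const char	*label;
};

/* C89 positional form: each value must match the member order exactly. */
const struct demo_cred positional = { 500, 100, "positional" };

/* C99 designated form: order-independent; .gid is omitted and is
 * therefore zero-initialized. */
const struct demo_cred designated = {
	.label	= "designated",
	.uid	= 500,
};

/* Sanity check: the omitted member really is zero. */
void demo_check(void)
{
	assert(positional.gid == 100);
	assert(designated.gid == 0);
}
```

Swapping the first two members of struct demo_cred would silently
change the meaning of the positional initializer, while the designated
one would keep working.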

The following struct initializers have been updated:

lustre/ptlrpc/gss/gss_keyring.c:
struct vfs_cred vcred (2 occurrences)
lustre/ptlrpc/gss/gss_krb5_mech.c:
static struct krb5_enctype enctypes[]
lustre/ptlrpc/gss/gss_sk_mech.c:
static struct gss_api_mech gss_sk_mech

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I87aecee88dc8c97df5f6892c08c914732d455356
Reviewed-on: https://review.whamcloud.com/23677
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6210 lnet: Change positional struct initializers to C99 93/23493/7
Steve Guminski [Thu, 27 Oct 2016 13:06:06 +0000 (09:06 -0400)]
LU-6210 lnet: Change positional struct initializers to C99

This patch makes no functional changes.  Struct initializers in the
lnet directory that use C89 or GCC-only syntax are updated to C99
syntax.  Whitespace is corrected to match coding style guidelines.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.
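
As a sketch of what this looks like for a getopt_long() option table
(the option names and the helper demo_parse_net() are hypothetical and
are not copied from lnetctl or lst):

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option table; names are illustrative only. Members not
 * named here (.flag) are zeroed, so getopt_long() returns .val. */
const struct option demo_opts[] = {
	{ .name = "net",  .has_arg = required_argument, .val = 'n' },
	{ .name = "help", .has_arg = no_argument,       .val = 'h' },
	{ .name = NULL }	/* terminator; all other members are zero */
};

/* Parse argv and return the argument of --net, or NULL if absent. */
const char *demo_parse_net(int argc, char *const argv[])
{
	const char *net = NULL;
	int c;

	assert(argv != NULL);
	optind = 1;	/* restart scanning so the helper can be reused */
	while ((c = getopt_long(argc, argv, "n:h", demo_opts, NULL)) != -1) {
		if (c == 'n')
			net = optarg;
	}
	return net;
}
```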

The following struct initializers have been updated:

lnet/include/lnet/lib-lnet.h:
struct libcfs_ioctl_handler ident
struct kvec diov (2 occurrences)
struct kvec siov (2 occurrences)
lnet/klnds/gnilnd/gnilnd_stack.c:
static struct rcadata rd[RCA_EVENTS]
lnet/utils/lnetconfig/liblnetconfig.c:
static struct lookup_cmd_hdlr_tbl lookup_config_tbl[]
static struct lookup_cmd_hdlr_tbl lookup_del_tbl[]
static struct lookup_cmd_hdlr_tbl lookup_show_tbl[]
lnet/utils/lnetctl.c:
const struct option long_options[] (10 occurrences)
lnet/utils/lst.c:
static struct option session_opts[]
static struct option ping_opts[]
static struct option update_group_opts[]
static struct option list_group_opts[]
static struct option stat_opts[]
static struct option  show_error_opts[]
struct option start_batch_opts[]
struct option stop_batch_opts[]
struct option list_batch_opts[]
struct option query_batch_opts[]
struct option add_test_opts[]
struct lst_sid LST_INVALID_SID
lnet/utils/portals.c:
struct option opts[] (2 occurrences)
lnet/lnet/api-ni.c:
lnet_process_id_t id
lnet/lnet/lo.c:
lnd_t the_lolnd
lnet/selftest/framework.c:
struct lst_sid LST_INVALID_SID
lnet/utils/debug.c:
struct mod_paths

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I73cb72de2f084f572a3cf6b3ba5cd34805f39c5d
Reviewed-on: https://review.whamcloud.com/23493
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9041 test: Add version check to sanity test_402 10/23410/3
Wei Liu [Wed, 26 Oct 2016 17:55:04 +0000 (10:55 -0700)]
LU-9041 test: Add version check to sanity test_402

Skip sanity test_402 if server version is older than 2.7.3
or older than 2.7.66 or older than 2.7.18.4

Test-Parameters: trivial testlist=sanity

Change-Id: Ib47a5ab1e0f436661077d75b67bc9e7b2728b929
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/23410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8687 tests: list pool on mds when mgs is separate 46/23046/12
Jadhav Vikram [Mon, 19 Dec 2016 12:36:36 +0000 (18:06 +0530)]
LU-8687 tests: list pool on mds when mgs is separate

When the MGS and MDS are set up separately, listing pools on
the MGS shows nothing, so list pools from the MDS instead of
the MGS.

Test-Parameters: testlist=conf-sanity,ost-pools

Seagate-bug-id: MRP-3327
Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Change-Id: If5e9e6a7303059ab79e14967d2ea86b6d61c8aba
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/11895
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: https://review.whamcloud.com/23046
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-7910 osd: do not lookup child objects in osd_dir_insert() 33/21333/11
Alex Zhuravlev [Fri, 15 Jul 2016 13:05:29 +0000 (17:05 +0400)]
LU-7910 osd: do not lookup child objects in osd_dir_insert()

Instead, cache the FID->dnode mapping in @env at declaration time.

Change-Id: I2c2ab17cd6e158e9462715f12c21da2c2b8402db
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/21333
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8382 hsm: reorder coordinator's cleanup functions 07/21207/8
Quentin Bouget [Wed, 10 Aug 2016 21:02:14 +0000 (23:02 +0200)]
LU-8382 hsm: reorder coordinator's cleanup functions

The functions to initialize the coordinator and its proc entries
were called in the same order as the cleanup ones.
This patch reorders the cleanup functions called in mdt_fini()
according to the error path of mdt_init0().
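
The general pattern can be sketched as follows. All demo_* names are
hypothetical, and the assumption that the coordinator is initialized
before its proc entries is for illustration only; the point is that
teardown runs in the reverse order of initialization, matching the
error path.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trace of calls so the ordering can be inspected. */
char demo_trace[64];

static void demo_log(const char *step)
{
	assert(strlen(demo_trace) + strlen(step) < sizeof(demo_trace));
	strcat(demo_trace, step);
}

/* Hypothetical stand-ins for the coordinator and its proc entries. */
static int demo_coordinator_init(void) { demo_log("C"); return 0; }
static void demo_coordinator_fini(void) { demo_log("c"); }
static void demo_procfs_fini(void) { demo_log("p"); }
static int demo_procfs_init(int fail)
{
	if (fail)
		return -1;
	demo_log("P");
	return 0;
}

/* The error path unwinds in reverse order of initialization. */
int demo_init(int fail_procfs)
{
	int rc;

	rc = demo_coordinator_init();
	if (rc)
		return rc;
	rc = demo_procfs_init(fail_procfs);
	if (rc)
		goto out_coordinator;
	return 0;

out_coordinator:
	demo_coordinator_fini();
	return rc;
}

/* Normal teardown mirrors the error path: last set up, first torn down. */
void demo_fini(void)
{
	demo_procfs_fini();
	demo_coordinator_fini();
}
```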

Signed-off-by: Quentin Bouget <quentin.bouget.ocre@cea.fr>
Change-Id: Ic242b8f02cf44f900541446964297982ad6fc178
Reviewed-on: https://review.whamcloud.com/21207
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6319 tests: Resume parallel-grouplock testing 07/19107/8
James Nunez [Wed, 23 Mar 2016 20:34:34 +0000 (14:34 -0600)]
LU-6319 tests: Resume parallel-grouplock testing

The parallel_grouplock test from the parallel-scale test suite
was added to the ALWAYS_EXCEPT list in 2009. We need to resume
testing of parallel_grouplock.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I70eba62f433d280f7117aea63d7c1b56cd1fb676
Reviewed-on: https://review.whamcloud.com/19107
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8972 osp: skip subsequent orphan cleanups 79/25079/3
Alex Zhuravlev [Wed, 25 Jan 2017 04:51:40 +0000 (07:51 +0300)]
LU-8972 osp: skip subsequent orphan cleanups

Orphan cleanup should be done once; after that we need to recreate
missing precreated objects (due to OST failures). Otherwise
we risk hitting a deadlock (if we block creations during orphan
cleanup) or destroying objects being allocated (which results in
data corruption).

Change-Id: Ie8bc301ae4463c170b0cf5fc5ddd52e41fa88638
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/25079
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9019 mdt: use ktime_t for calculating elapsed time 23/24923/2
James Simmons [Tue, 17 Jan 2017 19:36:44 +0000 (14:36 -0500)]
LU-9019 mdt: use ktime_t for calculating elapsed time

mdt_identity_do_upcall() tries to print how much time has passed
across a call_usermodehelper() function, and uses struct timeval
for that.

We want to remove this structure, so this is better expressed
in terms of ktime_t and ktime_us_delta().
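
ktime_t and ktime_us_delta() are kernel-only, so the sketch below is a
userspace analogue rather than the patch itself: it uses
clock_gettime(CLOCK_MONOTONIC) and a hypothetical helper
demo_us_delta() that computes "later minus earlier" in microseconds,
the same quantity ktime_us_delta() returns. Whether the patch uses a
monotonic clock is an assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Userspace stand-in for ktime_us_delta(): later - earlier, in usec. */
int64_t demo_us_delta(const struct timespec *later,
		      const struct timespec *earlier)
{
	int64_t ns;

	assert(later != NULL && earlier != NULL);
	ns = (int64_t)(later->tv_sec - earlier->tv_sec) * 1000000000LL +
	     (later->tv_nsec - earlier->tv_nsec);
	return ns / 1000;
}

/* Time a callback with a monotonic clock, in the way the upcall is
 * timed around call_usermodehelper(). */
int64_t demo_time_call(void (*fn)(void))
{
	struct timespec start, end;

	clock_gettime(CLOCK_MONOTONIC, &start);
	fn();
	clock_gettime(CLOCK_MONOTONIC, &end);
	return demo_us_delta(&end, &start);
}
```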

Change-Id: I2d167a50c537c525600622977b8cb422f0a88ba4
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24923
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6245 libcfs: use libcfs_private.h only for kernel space 38/22138/11
James Simmons [Wed, 11 Jan 2017 16:24:35 +0000 (11:24 -0500)]
LU-6245 libcfs: use libcfs_private.h only for kernel space

The current lustre userland code no longer uses special
macros which are present in libcfs_private.h.
We can then eliminate those macros and only use the
libcfs_private.h header for kernel space. The special
macros no longer needed for userland are:

1) LOGL and LOGU which were used in the UAPI header
   lustre_cfg.h and lustre_ioctl.h.

2) [un]likely macros used in the UAPI header lustre_ostid.h

3) The special libcfs assert macros

Change-Id: Iaa54bdcfb6104d13f2aabae63335e041481244a5
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/22138
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8411 ofd: handle last_rcvd file can't update properly 98/21398/13
Alexey Lyashkov [Mon, 18 Jul 2016 14:28:18 +0000 (17:28 +0300)]
LU-8411 ofd: handle last_rcvd file can't update properly

A last_rcvd update may fail, yet a "no fail" return code will
be sent to the client. A DIO request may be replayed in that case instead
of resent, but because no failure code was sent to the client, the user
application will free its buffer, so the replay will be sent with
incorrect data.

Write should fail if last_rcvd can't be updated properly.

This patch causes sanity test 407 to fail or has brought out an
existing bug in Lustre. sanity test 407 is added to the
ALWAYS_EXCEPT list.

Seagate-bug-id: MRP-3609
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Idcbff5fd990edbc84539197da9876748b33795dd
Reviewed-on: https://review.whamcloud.com/21398
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8900 mgs: use reference count for fs_db 15/24415/11
Fan Yong [Tue, 27 Sep 2016 08:30:50 +0000 (16:30 +0800)]
LU-8900 mgs: use reference count for fs_db

That will prevent an in-use 'fs_db' from being freed/erased
by others. The user (in subsequent patches) can then hold
the 'fs_db' for a long time without holding the related lock.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Icf5d548a2c51548aae2c05b1b34f003e725f4e02
Reviewed-on: https://review.whamcloud.com/24415
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6455 tests: Re-enable replay-vbr and replay-single tests 65/21565/5
James Nunez [Thu, 28 Jul 2016 15:13:31 +0000 (09:13 -0600)]
LU-6455 tests: Re-enable replay-vbr and replay-single tests

Tests replay-vbr 4i, 4j, 4k and 10b and replay-single test 28
were added to the ALWAYS_EXCEPT list because they were failing
on el7. The issues causing those failures have been fixed and
landed in http://review.whamcloud.com/#/c/14928/ .

The replay-vbr and replay-single tests need to be re-enabled.

Test-Parameters: trivial testlist=replay-vbr,replay-vbr,replay-vbr
Test-Parameters: testlist=replay-single,replay-single,replay-single

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic6911849ee96673072e4e1a7abe96706c7c9f87f
Reviewed-on: https://review.whamcloud.com/21565
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9045 osp: Revert "LU-8840 osp: handle EA cache properly" 34/25134/4
Jian Yu [Fri, 27 Jan 2017 11:36:02 +0000 (19:36 +0800)]
LU-9045 osp: Revert "LU-8840 osp: handle EA cache properly"

The patch caused test failures tracked in LU-9045 and LU-9048.

This reverts commit 555d02f47401340182b47b3245a657b52fc3e68a.

Test-Parameters: mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
mdscount=2 mdtcount=4 \
testlist=conf-sanity,conf-sanity,sanity-lfsck,sanity-lfsck,sanity-hsm,sanity-hsm

Change-Id: I3d922abd76b441f10ed0446e5528644a38211949
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/25134
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7734 lnet: multi-rail feature 87/25087/1
Amir Shehata [Wed, 25 Jan 2017 20:28:42 +0000 (12:28 -0800)]
LU-7734 lnet: multi-rail feature

Merge branch 'multi-rail'

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I88d3d86d81681802387fc70dba2b9315a9720470

2 years agoLU-7734 lnet: Fix setting numa range
Amir Shehata [Thu, 12 Jan 2017 21:57:11 +0000 (13:57 -0800)]
LU-7734 lnet: Fix setting numa range

Call the correct API when setting numa_range.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Test-Parameters: trivial
Change-Id: I1f9f8f1aabc277dff1fddd678cd360a9c49af4a5
Reviewed-on: https://review.whamcloud.com/24861
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: Update lnetctl usage
Stephen Champion [Fri, 9 Dec 2016 20:31:49 +0000 (12:31 -0800)]
LU-7734 lnet: Update lnetctl usage

Bring lnetctl help descriptions, man page, and usage in line
with changes to peer functions.

Signed-off-by: Stephen Champion <schamp@sgi.com>
Test-Parameters: trivial
Change-Id: Idf115319727d92f23e50a97585f2f2c1e8c1b7b8
Reviewed-on: https://review.whamcloud.com/24279
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: cpt locking
Amir Shehata [Fri, 2 Dec 2016 05:23:20 +0000 (21:23 -0800)]
LU-7734 lnet: cpt locking

When source nid is specified it is necessary to also
use the destination nid. Otherwise bulk transfer will end up
on a different interface than the nearest interface to the
memory. This has significant performance impact on NUMA
systems such as the SGI UV.

The CPT which the MD describing the bulk buffers belongs to
is not the same CPT of the actual pages of memory.
Therefore, it is necessary to communicate the CPT of the pages
to LNet, in order for LNet to select the nearest interface.

The MD which describes the pages of memory gets attached to
an ME, to be matched later on. The MD which describes the
message to be sent is different and this patch adds the
handle of the bulk MD into the MD which ends up being
accessible by lnet_select_pathway(). In that function
a new API, lnet_cpt_of_md_page(), is called which returns the
CPT of the buffers used for the bulk transfer.
lnet_select_pathway() proceeds to use this CPT to select
the nearest interface.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I4117ef912835f16dcdcaafb70703f92d74053b9b
Reviewed-on: https://review.whamcloud.com/24085

2 years agoLU-7734 lnet: rename peer key_nid to prim_nid
Amir Shehata [Thu, 27 Oct 2016 23:49:27 +0000 (16:49 -0700)]
LU-7734 lnet: rename peer key_nid to prim_nid

To make the interface clear, renamed key_nid to
prim_nid to indicate that this parameter refers to
the peer's primary nid.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I74bd17cdd55ba8d2c52bc28557db149d23ecbfb5
Reviewed-on: http://review.whamcloud.com/23460
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: Enhance DLC ip2nets
Amir Shehata [Thu, 8 Sep 2016 01:32:34 +0000 (18:32 -0700)]
LU-7734 lnet: Enhance DLC ip2nets

If the interfaces YAML block is specified then commission
the interfaces which match the ip-range if it is defined.
Otherwise commission the interfaces as long as they exist
and are up.

If the interfaces YAML block is not specified but an
ip-range is specified then configure all interfaces
in the system that match the ip-range.

If neither interfaces nor an ip-range is specified, then
commission the first interface that exists and is UP.
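
The three cases above can be sketched as a small decision function.
The enum, struct and function names are hypothetical and only model
the selection rule described here, not the DLC implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of the selection rule. */
enum demo_ip2nets_action {
	DEMO_LISTED_MATCHING_RANGE,	/* listed interfaces matching ip-range */
	DEMO_LISTED_IF_UP,		/* listed interfaces that exist and are up */
	DEMO_ALL_MATCHING_RANGE,	/* all system interfaces matching ip-range */
	DEMO_FIRST_UP,			/* first interface that exists and is up */
};

/* Hypothetical summary of what the YAML block contained. */
struct demo_ip2nets_cfg {
	bool have_interfaces;
	bool have_ip_range;
};

enum demo_ip2nets_action
demo_ip2nets_select(const struct demo_ip2nets_cfg *cfg)
{
	assert(cfg != NULL);
	if (cfg->have_interfaces)
		return cfg->have_ip_range ? DEMO_LISTED_MATCHING_RANGE
					  : DEMO_LISTED_IF_UP;
	if (cfg->have_ip_range)
		return DEMO_ALL_MATCHING_RANGE;
	return DEMO_FIRST_UP;
}
```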

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I01b2ced6f50fed2528f626166154be874f394e8b
Reviewed-on: http://review.whamcloud.com/22372
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: fix NULL access in lnet_peer_aliveness_enabled
Amir Shehata [Fri, 26 Aug 2016 19:39:27 +0000 (12:39 -0700)]
LU-7734 lnet: fix NULL access in lnet_peer_aliveness_enabled

When a peer is not on a local network, lpni->lpni_net is NULL.
lpni_net is accessed in lnet_peer_aliveness_enabled() without
checking whether it is NULL. Fixed.
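
A simplified sketch of the guarded check. The structs are hypothetical
stand-ins, and the peer_timeout condition is an assumption used only to
give the dereference something to test; the real check in LNet may
include other conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the LNet structures. */
struct demo_net {
	int peer_timeout;
};

struct demo_peer_ni {
	struct demo_net *lpni_net;	/* NULL when the peer is not local */
};

bool demo_peer_aliveness_enabled(const struct demo_peer_ni *lpni)
{
	assert(lpni != NULL);
	/* Test lpni_net before dereferencing it. */
	return lpni->lpni_net != NULL && lpni->lpni_net->peer_timeout > 0;
}
```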

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: If328728e2bda2a19b273140a20c04b22bdda6bc4
Reviewed-on: http://review.whamcloud.com/22183
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>