Whamcloud - gitweb
fs/lustre-release.git
2 years ago LU-8702 tests: parallel execution of IOR and MDTEST added. 26/23126/4
Aditya Pandit [Wed, 9 Dec 2015 09:58:58 +0000 (15:28 +0530)]
LU-8702 tests: parallel execution of IOR and MDTEST added.

Added test case for execution of mdtest and IOR in parallel.

Test-Parameters: trivial testlist=parallel-scale
Change-Id: I3b8a74a94739417467cc04bcc5e688b487d0cfe7
Seagate-bug-id: MRP-3149
Signed-off-by: Ashish Maurya <ashish.maurya@seagate.com>
Signed-off-by: Aditya Pandit <aditya.pandit@seagate.com>
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/10376
Tested-by: Jenkins
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: https://review.whamcloud.com/23126
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8066 obdclass: Get rid of remaining /proc/sys/lustre plumbing 34/24034/4
Oleg Drokin [Sun, 1 Jan 2017 19:54:47 +0000 (14:54 -0500)]
LU-8066 obdclass: Get rid of remaining /proc/sys/lustre plumbing

Since all of the variables from /proc/sys/lustre were moved to
/sys/fs/lustre, get rid of the remaining infrastructure.

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Change-Id: I6facdb8f52b86efb1e85a4d43ca2532a2f460a85
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24034
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9073 gss: quiet insecure key file warning 01/25201/2
Andreas Dilger [Thu, 2 Feb 2017 05:55:15 +0000 (22:55 -0700)]
LU-9073 gss: quiet insecure key file warning

Quiet spurious warning about insecure file access mode, because the
st_mode contains file type as well.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: If347eb3de67074269de4fe279ba4a849e03ebbe5
Reviewed-on: https://review.whamcloud.com/25201
Tested-by: Jenkins
Reviewed-by: Nathan Lavender <nblavend@iu.edu>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8066 ldlm: move /proc/fs/lustre/ldlm to sysfs 49/25049/4
Oleg Drokin [Mon, 30 Jan 2017 20:10:47 +0000 (15:10 -0500)]
LU-8066 ldlm: move /proc/fs/lustre/ldlm to sysfs

Unregister ldlm namespace from sysfs on free

ldlm_namespace_sysfs_unregister needs to be called in ldlm_namespace_free_post
so that we don't have this dangling object there after the namespace
has disappeared.

Linux-commit: 9c7e397c98d646a3a23ffd304def1750be916803

move procfs ldlm pool stats to sysfs

Suitable contents of /proc/fs/lustre/ldlm/namespaces/.../pools/
are moved to /sys/fs/lustre/ldlm/namespaces/.../pools/:
cancel_rate grant_plan grant_speed lock_volume_factor
server_lock_volume granted grant_rate limit recalc_period

Linux-commit: 24b8c88a7122df35ce6a413cd76e9581411eab8f

Add infrastructure to move ldlm pool controls to sysfs

This adds registration of /sys/fs/lustre/ldlm/namespaces/.../pool
dir.

Linux-commit: f2825e039e1a6b58411087e1e17638f872d00a93

move namespaces/lru_max_age to sysfs

Move ldlm display of lru_max_age from procfs to sysfs

Linux-commit: c841236dda9aa334f7e241e3c526360328f77343

move namespaces/lock_unused_count to sysfs

Move ldlm display of lock_unused_count from procfs to sysfs

Linux-commit: 3dd4598271fc119a4e3c5589be03f88a41c31e64

move namespaces/early_lock_cancel to sysfs

Move ldlm display of early_lock_cancel from procfs to sysfs

Linux-commit: 87d32094efc208f31e4e3b226d25e58058352208

move namespaces/lru_size to sysfs

Move ldlm display of lru_size from procfs to sysfs

Linux-commit: 6784096b4818636ad512575c701e164e8e6a09d3

move namespace/lock_count to sysfs

Move ldlm display of lock_count from procfs to sysfs

Linux-commit: 63af1f57474fac888116d896a0c5f17aeb6a702d

move namespaces/resource_count to sysfs

Move ldlm display of resource_count from procfs to sysfs

Linux-commit: 0f53c823f9664683ce1aadab2d6a4cee950d6f62

move cancel_unused_locks_before_replay to sysfs

/proc/fs/lustre/ldlm/cancel_unused_locks_before_replay is
moved to /sys/fs/lustre/ldlm/cancel_unused_locks_before_replay
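
For tooling that still refers to the old location, the path change can be
sketched as a prefix rewrite. The helper name below is hypothetical, and the
sketch assumes the entry in question was actually moved; it does not check
that the sysfs file exists.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define OLD_LDLM_ROOT "/proc/fs/lustre/ldlm"
#define NEW_LDLM_ROOT "/sys/fs/lustre/ldlm"

/* Hypothetical helper: rewrite an old procfs ldlm path to its sysfs
 * location. Returns 0 on success, -1 if the path is not under the old
 * root or the output buffer is too small. */
static int ldlm_proc_to_sysfs(const char *old_path, char *out, size_t out_len)
{
	size_t root_len = strlen(OLD_LDLM_ROOT);
	int n;

	if (strncmp(old_path, OLD_LDLM_ROOT, root_len) != 0)
		return -1;
	/* Require a path component boundary after the root. */
	if (old_path[root_len] != '\0' && old_path[root_len] != '/')
		return -1;

	n = snprintf(out, out_len, "%s%s", NEW_LDLM_ROOT, old_path + root_len);
	return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Small self-check using the path named in this commit message. */
static void check_ldlm_path(void)
{
	char buf[128];

	assert(ldlm_proc_to_sysfs(OLD_LDLM_ROOT
				  "/cancel_unused_locks_before_replay",
				  buf, sizeof(buf)) == 0);
	assert(strcmp(buf, NEW_LDLM_ROOT
		      "/cancel_unused_locks_before_replay") == 0);
}
```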

Linux-commit: 0f53c823f9664683ce1aadab2d6a4cee950d6f62

Preparation to move /proc/fs/lustre/ldlm to sysfs

Add necessary infrastructure, register /sys/fs/lustre/ldlm,
/sys/fs/lustre/ldlm/namespaces and /sys/fs/lustre/ldlm/services

Linux-commit: 18fd8850a4c8177ecf4870ff38c208d329a21ed0

Change-Id: I2bb6e925bb95336b79a265ef46ebdd29d47b957c
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/25049
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
2 years ago LU-9029 kernel: kernel update [SLES12 SP2 4.4.38-93] 38/24938/5
Bob Glossman [Wed, 14 Dec 2016 16:49:05 +0000 (08:49 -0800)]
LU-9029 kernel: kernel update [SLES12 SP2 4.4.38-93]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia72bf9e6542627efdd946c7213dee2c77fa73e57
Reviewed-on: https://review.whamcloud.com/24938
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8995 tests: set debug size correctly 82/24782/5
Alexey Lyashkov [Tue, 24 Jan 2017 12:05:35 +0000 (17:35 +0530)]
LU-8995 tests: set debug size correctly

Use library function to set the debug log size

Seagate-bug-id: MRP-4055
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Parinay Kondekar <Parinay.Kondekar@seagate.com>
Change-Id: I125ce7f5f7f7754e82f913ef8cf6944f40f631d6
Reviewed-on: https://review.whamcloud.com/24782
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-4423 libcfs: remove IS_PO2 and __is_po2 77/24577/3
Aya Mahfouz [Tue, 17 Jan 2017 17:30:12 +0000 (12:30 -0500)]
LU-4423 libcfs: remove IS_PO2 and __is_po2

Removes IS_PO2 and __is_po2 since the uses of IS_PO2 have
been replaced by is_power_of_2

Linux-commit: d4891039904fa25edf1ca793a0469633ed81df3f

The following commit message is the same for the following
patches:

hash.c: Replace IS_PO2 by is_power_of_2

Linux-commit: 71872e9cc2af4dca1903ebc57daa15f08c795d86

selftest.h: replace IS_PO2 by is_power_of_2

Linux-commit: b3367164f4ff8ff2c1aa8bd79c7548f113b62b83

workitem.c: replace IS_PO2 by is_power_of_2

Linux-commit: 57b573d14b0fb9f83575a2cf155862d251c8f0d1

ldlm_extent.c: replace IS_PO2 by is_power_of_2

Linux-commit: 5f4179e04b31441b0b7995d14320a457aafba01b

Replaces IS_PO2 by is_power_of_2. It is more accurate to use
is_power_of_2 since it returns 1 for numbers that are powers
of 2 only whereas IS_PO2 returns 1 for 0 and numbers that are
powers of 2.
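
The difference described above can be sketched in plain C. The first function
is an assumed form of the removed macro, written from the behaviour this
message describes (true for 0); the second mirrors the documented behaviour of
the kernel's is_power_of_2(). Both names are local to this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed form of the old check: the bit trick alone, which is also
 * true for 0 because 0 & (0 - 1) == 0. */
static bool is_po2_old(unsigned long long n)
{
	return (n & (n - 1)) == 0;
}

/* Same bit trick with the zero case excluded. */
static bool is_power_of_2_sketch(unsigned long long n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Small self-check: the two only disagree on 0. */
static void check_po2(void)
{
	assert(is_po2_old(0));
	assert(!is_power_of_2_sketch(0));
	assert(is_po2_old(8) && is_power_of_2_sketch(8));
	assert(!is_po2_old(12) && !is_power_of_2_sketch(12));
}
```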

Change-Id: Ic8bb40394b46ea433e3096c878abe467eacc7996
Signed-off-by: Aya Mahfouz <mahfouz.saif.elyazal@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24577
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
2 years ago LU-6245 libcfs: replace IS_PO2 with is_power_of_2 in server code 75/24575/10
James Simmons [Sat, 28 Jan 2017 03:47:24 +0000 (22:47 -0500)]
LU-6245 libcfs: replace IS_PO2 with is_power_of_2 in server code

Replaces IS_PO2 by is_power_of_2. It is more accurate to use
is_power_of_2 since it returns 1 for numbers that are powers
of 2 only whereas IS_PO2 returns 1 for 0 and numbers that are
powers of 2.

Change-Id: I595053a658a96818ac9b434377c275d3ed7143ec
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24575
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
2 years ago LU-8066 obdclass: move lustre server sysctl to sysfs 33/24033/3
James Simmons [Sun, 1 Jan 2017 19:42:27 +0000 (14:42 -0500)]
LU-8066 obdclass: move lustre server sysctl to sysfs

A few of the lustre sysctls are server side, so let's
move those to sysfs as well. Both memused and memused_max
are missing upstream, but we need to keep them around
for now.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: Ie2fd2408d79aede4e40272a86f63f1e55311d1b9
Reviewed-on: https://review.whamcloud.com/24033
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8928 osd: convert osd-zfs to reference dnode, not db 93/24293/7
Alex Zhuravlev [Fri, 9 Dec 2016 16:38:34 +0000 (19:38 +0300)]
LU-8928 osd: convert osd-zfs to reference dnode, not db

This will be used later with methods like zap_add_by_dnode()
and similar, which are significantly faster as they don't
need to look up the dnode by dnode#.

Change-Id: Idc5341e9a472bbf0e5088b1bee784e4ddb6d635b
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/24293
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8769 lnet: removal of obsolete LNDs 21/23621/6
Sonia Sharma [Mon, 7 Nov 2016 17:32:00 +0000 (09:32 -0800)]
LU-8769 lnet: removal of obsolete LNDs

Obsolete LNDs were already removed. Commented out the name<->network
number mapping for the obsolete LNDs. Removed their initialization
from the array in nidstrings.c and the occurrences of the constants
for these LNDs in other files.

Signed-off-by: Sonia Sharma <sonia.sharma@intel.com>
Change-Id: I5c6ba0e88f5cabf0e875dc76bc5fccfbb16e9ab8
Reviewed-on: https://review.whamcloud.com/23621
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9019 o2iblnd: use 64-bit ibn_incarnation computation 67/23267/7
James Simmons [Sat, 28 Jan 2017 03:53:33 +0000 (22:53 -0500)]
LU-9019 o2iblnd: use 64-bit ibn_incarnation computation

ibn_incarnation is a 64-bit value, but using timeval to compute
it will cause an overflow in 2038. This changes it to use
ktime_get_real_ns() instead.
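
A userspace sketch of the same idea follows: build one 64-bit nanosecond
value from the wall clock instead of combining fields that may be 32 bits
wide. clock_gettime() stands in here for the kernel's ktime_get_real_ns();
the function names are hypothetical and this is not the o2iblnd code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Combine seconds and nanoseconds into one 64-bit nanosecond count. */
static uint64_t ns_from_parts(int64_t sec, long nsec)
{
	return (uint64_t)sec * 1000000000ULL + (uint64_t)nsec;
}

/* Userspace stand-in for ktime_get_real_ns(): wall-clock time in
 * nanoseconds since the epoch, or 0 if the clock cannot be read. */
static uint64_t real_time_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
		return 0;
	return ns_from_parts((int64_t)ts.tv_sec, ts.tv_nsec);
}

/* Small self-check: a second count just past the 32-bit signed limit
 * (January 2038) still fits comfortably in 64 bits. */
static void check_incarnation(void)
{
	assert(ns_from_parts(1, 5) == 1000000005ULL);
	assert(ns_from_parts(2147483648LL, 0) == 2147483648000000000ULL);
}
```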

Change-Id: I4698a046ece30a85c93ac1f12e541d81fcfd70f2
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23267
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8457 pacemaker: Pacemaker script to monitor LNet 66/22266/5
Gabriele Paciucci [Thu, 1 Sep 2016 15:54:55 +0000 (16:54 +0100)]
LU-8457 pacemaker: Pacemaker script to monitor LNet

A new script to be used in Pacemaker to monitor LNet, compatible
with ZFS and LDISKFS based Lustre server installations.
This RA is able to monitor a single LNet device using
Pacemaker's clone technology.

pcs resource create [Resource Name] ocf:lustre:healthLNET
    dampen=[seconds 5s]
    multiplier=[number 1000]
    lctl=[true|false]
    device=[device name ib0]
    host_list=[list of NIDs, space separated]
    --clone

where:
* dampen      The time to wait (dampening) for further changes to occur
* multiplier  The number by which to multiply the number of
              connected ping nodes
* attempts    Number of ping attempts, per host, before
              declaring it dead
* timeout     How long, in seconds, to wait before declaring
              a ping lost
* lctl        Option to enable lctl ping instead of the normal ping.
              The default is true
* device      Device used for the LNET network. We assume the
              same device across the cluster

This script should be located in /usr/lib/ocf/resource.d/lustre/
on both Lustre servers with permission 755.

Test-Parameters: trivial
Signed-off-by: Gabriele Paciucci <gabriele.paciucci@intel.com>
Change-Id: I6292ce36dde0083fa95cb1d047fe582bd7d53116
Reviewed-on: https://review.whamcloud.com/22266
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8420 ldlm: take at_current change into account on prolong 48/21448/9
Vladimir Saveliev [Fri, 2 Dec 2016 00:40:40 +0000 (02:40 +0200)]
LU-8420 ldlm: take at_current change into account on prolong

Prolong timeout is calculated based upon estimated service time. When
prolong is called after bulk transfer timeout there is a chance that
service estimate on server side was reset recently due to more time than
at_history passed since the worst rpc time.  If rpc timeout was
initially based on bigger service estimate, it may happen that prolonged
timeout will be smaller than the original one, and the lock callback
timer will not get prolonged which may result in client's eviction.

When trying to prolong lock callback timer take into account that the
worst server estimate might get reset. In that case calculate prolong
timeout based upon service estimate set by client on sending the rpc.
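
One way to picture the fix is to never let the prolong timeout drop below the
estimate the client used when it sent the RPC. Taking the larger of the two
values is an assumption of this sketch, as are the function and parameter
names; the actual patch may compute the timeout differently.

```c
#include <assert.h>

/* Hypothetical sketch: choose the prolong timeout (in seconds) as the
 * larger of the server's current service estimate and the estimate the
 * client set when sending the RPC. */
static unsigned int prolong_timeout_sketch(unsigned int server_estimate,
					   unsigned int client_estimate)
{
	return server_estimate > client_estimate ? server_estimate
						 : client_estimate;
}

/* Small self-check: after the server estimate was reset to a low value,
 * the client's original estimate still bounds the timeout from below. */
static void check_prolong(void)
{
	assert(prolong_timeout_sketch(5, 40) == 40);
	assert(prolong_timeout_sketch(60, 40) == 60);
}
```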

A test that illustrates the issue is included.

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Seagate-bug-id: MRP-3582
Change-Id: I79988c8e82967d8eef077f42cd6331999294ea50
Reviewed-on: https://review.whamcloud.com/21448
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-4121 tests: Enable zfs tests dependent on ost,mgs ordering 13/7113/16
Nathaniel Clark [Thu, 25 Jul 2013 13:32:11 +0000 (09:32 -0400)]
LU-4121 tests: Enable zfs tests dependent on ost,mgs ordering

This enables tests that were marked as skipped for bug LU-2059, now
tracked as LU-4274.  The skipped tests are the ones that fail because
mounting OSTs before the MGS is started causes the OST mount to hang
and wait for the MGS.

Test-Parameters: trivial osscount=2 mdscount=2 ostcount=2 mdtcount=1 mdtfilesystemtype=zfs ostfilesystemtype=zfs testlist=conf-sanity,insanity,sanity-quota
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I3e27a7583c857d416ef3a0bd2d5ee74814975def
Reviewed-on: https://review.whamcloud.com/7113
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-1573 recovery: Avoid data corruption for DIO during FOFB 80/16680/10
Parinay Kondekar [Thu, 1 Dec 2016 20:16:52 +0000 (01:46 +0530)]
LU-1573 recovery: Avoid data corruption for DIO during FOFB

When a userland app is doing DIO and the OST fails over,
obd->obd_no_transno is set to 1 and last_committed on the server
is not sent to the client.  Thus the client is not sure whether the
req is _committed_ to disk or not. So it removes the req
from the resend queue and adds it to the replay queue.

Now trans_no > last_committed, so after reconnect the request
is replayed as part of the recovery process. Meanwhile the userland
app refills the DIO buffer with different data, so invalid data
is committed, resulting in corruption.

This change avoids the client replay by dropping the reply to
the client rather than sending a reply without any transno.
This ensures the client will resend the RPC before returning
to userspace instead of putting it in the replay queue, and
thus avoids the corruption.

The test changes require replay-ost-single test_9 to be modified
with an additional write to the file to increase the grants
available and avoid a sync write.

Seagate-Bug-Id: MRP-542, MRP-2418
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/3358

Change-Id: Ia30783c99e6c16a0c7ab70841eb98ed75dba1de9
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Tested-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Vitaly Fertman <vitaly.fertman@seagate.com>
Reviewed-on: https://review.whamcloud.com/16680
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
2 years ago LU-8979 ldlm: disable brw lock request in recovery 28/24528/5
Jinshan Xiong [Tue, 13 Dec 2016 18:20:47 +0000 (10:20 -0800)]
LU-8979 ldlm: disable brw lock request in recovery

The local brw lock shouldn't be acquired in recovery, otherwise it may
cause the invocation of ldlm_reprocess_all() in the lock replay
phase, due to the async lock cancellation in ldlm_lock_decref(), and
eventually cause the problem described in LU-8437.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ie54d84d154f025918d5196a5d2ecc4956bd57953
Reviewed-on: https://review.whamcloud.com/24528
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7734 gnilnd: update GNI lnd driver to handle multirail api changes 22/25122/2
James Simmons [Thu, 7 Jul 2016 18:07:50 +0000 (14:07 -0400)]
LU-7734 gnilnd: update GNI lnd driver to handle multirail api changes

The multirail changes moved several parameters in struct lnet_ni
to the new data structure called struct lnet_net. This patch
updates the Gemini driver to handle the API changes.

Test-Parameters: trivial
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I75830c570ed56c5b1b665115e8ac96a733a7e57e
Reviewed-on: https://review.whamcloud.com/21192
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-on: https://review.whamcloud.com/25122
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9034 mgc: relate sptlrpc & params to MGC 88/24988/7
Hongchao Zhang [Tue, 11 Oct 2016 17:41:55 +0000 (01:41 +0800)]
LU-9034 mgc: relate sptlrpc & params to MGC

If sptlrpc or params config logs come from different MGCs,
they should be regarded as different logs. This patch binds
these config logs to the MGC obd device to separate them.

Change-Id: Ib4f55c7b20bfe722a6a6f7511324a37e98cf9c66
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/24988
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-9031 osd: handle jinode change for ldiskfs 41/24941/11
Yang Sheng [Mon, 23 Jan 2017 19:31:27 +0000 (03:31 +0800)]
LU-9031 osd: handle jinode change for ldiskfs

We need to take care of the jinode for ldiskfs. Since we
don't get the inode from a syscall like sys_open(), we
have to initialize it in OSD ourselves.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Iec6db290c3779a8f7c98e5d1356b71fd928d7c88
Reviewed-on: https://review.whamcloud.com/24941
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9030 kernel: kernel update RHEL7.3 [3.10.0-514.6.1.el7] 36/24936/5
Bob Glossman [Tue, 17 Jan 2017 22:31:27 +0000 (14:31 -0800)]
LU-9030 kernel: kernel update RHEL7.3 [3.10.0-514.6.1.el7]

update RHEL 7.3 kernel to 3.10.0-514.6.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ieffb0e4868afbd4f15932a850c17b6e16c1e84f8
Reviewed-on: https://review.whamcloud.com/24936
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8602 gss: Support GSS on linux 4.6+ kernels 89/23289/7
James Simmons [Sun, 1 Jan 2017 17:51:14 +0000 (12:51 -0500)]
LU-8602 gss: Support GSS on linux 4.6+ kernels

Currently the GSS code for Lustre directly uses the linux crypto API.
The GSS code uses struct crypto_hash, which has been removed in
newer kernels in favor of struct crypto_ahash. We could run into this
kind of issue again in the future, so to make porting easier
let's move the GSS code to the libcfs crypto api. That way, when the
linux crypto api changes again, the libcfs layer will handle
these changes and GSS will not need further patches. This patch also
exposes some of the libcfs crypto functions to user land.

Change-Id: I7baed64d0340ad864732a782ea401e2e0e9ae1b7
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23289
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9033 llite: don't zero timestamps internally 84/24984/2
Niu Yawei [Thu, 19 Jan 2017 02:58:51 +0000 (21:58 -0500)]
LU-9033 llite: don't zero timestamps internally

In ll_md_blocking_ast(), we zero all timestamps to keep these
'leftovers' from interfering with the new timestamps from the MDS,
especially when the timestamps are set back by other clients. It's not
quite right to change timestamps in this way, because:

1. The pending lock can be matched by getattr, so these zero
   timestamps can be fetched by an application in a small race window.

2. It doesn't make sense to zero the mtime and ctime, because we
   always use the newest ctime and mtime from the MDS when merging
   attributes, so they won't interfere with new timestamps set by
   other clients.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ieb9577abe4938bc47dc0577454a4a1bbf4796876
Reviewed-on: https://review.whamcloud.com/24984
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-8954 kernel: kernel update [SLES12 SP1 3.12.67-60.64.24] 27/24427/6
Bob Glossman [Mon, 28 Nov 2016 20:50:04 +0000 (12:50 -0800)]
LU-8954 kernel: kernel update [SLES12 SP1 3.12.67-60.64.24]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12 testgroup=review-ldiskfs \
  mdsdistro=sles12 ossdistro=sles12 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I6c474a939e3d6e8853388d645d82dbfe3038edee
Reviewed-on: https://review.whamcloud.com/24427
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8903 tests: racer test_1 to drop all error messages 39/24139/4
Chennaiah Palla [Mon, 5 Dec 2016 13:05:09 +0000 (18:35 +0530)]
LU-8903 tests: racer test_1 to drop all error messages

Filtered and dropped all Segmentation fault and Bus error
messages. Used "export LANG=C" so that messages are displayed in
English instead of the local language.

Test-Parameters: trivial mdtcount=1 testlist=racer,racer

Seagate-bug-id: MRP-3009
Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Change-Id: Ibef083870634f4c8dd6b86e6aa91b5978f27c656
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-on: https://review.whamcloud.com/24139
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8865 tests: add fs_test test 32/23932/4
Chennaiah Palla [Mon, 19 Dec 2016 16:28:07 +0000 (21:58 +0530)]
LU-8865 tests: add fs_test test

Patch adds parallel-scale fs_test test.

fs_test is an MPI-IO test that provides I/O performance
results for effective bandwidth. We use fs_test to generate
a maximum IO write and read bandwidth scenario where a lot of
data is written to amortize away the overhead of open/sync/close
operations.

Test-Parameters: trivial testlist=parallel-scale
Seagate-bug-id: MRP-3914
Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Change-Id: I6057c8269fa72a151a792dcf1d05d30f4882204d
Reviewed-on: https://review.whamcloud.com/23932
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-6210 gss: Change positional struct initializers to C99 77/23677/3
Steve Guminski [Mon, 31 Oct 2016 17:46:10 +0000 (13:46 -0400)]
LU-6210 gss: Change positional struct initializers to C99

This patch makes no functional changes.  Struct initializers in the
gss directory that use C89 or GCC-only syntax are updated to C99
syntax.  Whitespace is changed to match the coding style guidelines.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, allows changes in the struct
definition, and clears any members that are not given an explicit
value.

The following struct initializers have been updated:

lustre/ptlrpc/gss/gss_keyring.c:
  struct vfs_cred vcred (2 occurrences)
lustre/ptlrpc/gss/gss_krb5_mech.c:
  static struct krb5_enctype enctypes[]
lustre/ptlrpc/gss/gss_sk_mech.c:
  static struct gss_api_mech gss_sk_mech

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I87aecee88dc8c97df5f6892c08c914732d455356
Reviewed-on: https://review.whamcloud.com/23677
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-6210 lnet: Change positional struct initializers to C99 93/23493/7
Steve Guminski [Thu, 27 Oct 2016 13:06:06 +0000 (09:06 -0400)]
LU-6210 lnet: Change positional struct initializers to C99

This patch makes no functional changes.  Struct initializers in the
lnet directory that use C89 or GCC-only syntax are updated to C99
syntax.  Whitespace is corrected to match coding style guidelines.

C89 positional initializers require values to be placed in the
correct order. This will cause errors if the fields of the struct
definition are reordered or fields are added or removed. C99 named
initializers avoid this problem, and also automatically clear any
values that are not explicitly set.
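
The two initializer styles can be compared on a small example. The struct
below is hypothetical and is not one of the structs touched by this patch; it
only shows that the named form is independent of field order and leaves
omitted members zeroed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct for illustration only. */
struct demo_opt {
	const char *name;
	int has_arg;
	int *flag;
	int val;
};

/* C89 positional form: every value must follow the field order. */
static const struct demo_opt positional = { "help", 0, NULL, 'h' };

/* C99 designated form: fields may appear in any order, and the omitted
 * members (has_arg, flag) are zero-initialized. */
static const struct demo_opt designated = { .val = 'h', .name = "help" };

/* Small self-check: both forms describe the same value. */
static void check_initializers(void)
{
	assert(designated.has_arg == 0);
	assert(designated.flag == NULL);
	assert(designated.val == positional.val);
	assert(designated.name[0] == positional.name[0]);
}
```

If a field were inserted at the front of struct demo_opt, the positional
initializer would silently assign values to the wrong members, while the
designated one would keep working.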

The following struct initializers have been updated:

lnet/include/lnet/lib-lnet.h:
  struct libcfs_ioctl_handler ident
  struct kvec diov (2 occurrences)
  struct kvec siov (2 occurrences)
lnet/klnds/gnilnd/gnilnd_stack.c:
  static struct rcadata rd[RCA_EVENTS]
lnet/utils/lnetconfig/liblnetconfig.c:
  static struct lookup_cmd_hdlr_tbl lookup_config_tbl[]
  static struct lookup_cmd_hdlr_tbl lookup_del_tbl[]
  static struct lookup_cmd_hdlr_tbl lookup_show_tbl[]
lnet/utils/lnetctl.c:
  const struct option long_options[] (10 occurrences)
lnet/utils/lst.c:
  static struct option session_opts[]
  static struct option ping_opts[]
  static struct option update_group_opts[]
  static struct option list_group_opts[]
  static struct option stat_opts[]
  static struct option show_error_opts[]
  struct option start_batch_opts[]
  struct option stop_batch_opts[]
  struct option list_batch_opts[]
  struct option query_batch_opts[]
  struct option add_test_opts[]
  struct lst_sid LST_INVALID_SID
lnetutils/portals.c:
  struct option opts[] (2 occurrences)
lnet/lnet/api-ni.c:
  lnet_process_id_t id
lnet/lnet/lo.c:
  lnd_t the_lolnd
lnet/selftest/framework.c:
  struct lst_sid LST_INVALID_SID
lnet/utils/debug.c:
  struct mod_paths

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I73cb72de2f084f572a3cf6b3ba5cd34805f39c5d
Reviewed-on: https://review.whamcloud.com/23493
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9041 test: Add version check to sanity test_402 10/23410/3
Wei Liu [Wed, 26 Oct 2016 17:55:04 +0000 (10:55 -0700)]
LU-9041 test: Add version check to sanity test_402

Skip sanity test_402 if server version is older than 2.7.3
or older than 2.7.66 or older than 2.7.18.4

Test-Parameters: trivial testlist=sanity

Change-Id: Ib47a5ab1e0f436661077d75b67bc9e7b2728b929
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/23410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8687 tests: list pool on mds when mgs is separate 46/23046/12
Jadhav Vikram [Mon, 19 Dec 2016 12:36:36 +0000 (18:06 +0530)]
LU-8687 tests: list pool on mds when mgs is separate

With a separate MGS and MDS setup, listing pools on the MGS
shows nothing, so list pools from the MDS instead of the
MGS.

Test-Parameters: testlist=conf-sanity,ost-pools

Seagate-bug-id: MRP-3327
Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Change-Id: If5e9e6a7303059ab79e14967d2ea86b6d61c8aba
Reviewed-on: http://es-gerrit.xyus.xyratex.com:8080/11895
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: https://review.whamcloud.com/23046
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-7910 osd: do not lookup child objects in osd_dir_insert() 33/21333/11
Alex Zhuravlev [Fri, 15 Jul 2016 13:05:29 +0000 (17:05 +0400)]
LU-7910 osd: do not lookup child objects in osd_dir_insert()

Instead, cache the FID->dnode mapping in @env at declaration time.

Change-Id: I2c2ab17cd6e158e9462715f12c21da2c2b8402db
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/21333
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-8382 hsm: reorder coordinator's cleanup functions 07/21207/8
Quentin Bouget [Wed, 10 Aug 2016 21:02:14 +0000 (23:02 +0200)]
LU-8382 hsm: reorder coordinator's cleanup functions

The functions to initialize the coordinator and its proc entries
were called in the same order as the cleanup ones.
This patch reorders the cleanup functions called in mdt_fini()
according to the error path of mdt_init0().
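
The ordering idea can be sketched as follows. This is a minimal illustration, assuming the error path of mdt_init0() unwinds in the reverse order of initialization (the usual kernel convention); the step names are hypothetical and are not the real calls made by mdt_init0() or mdt_fini().

```python
# Illustrative sketch only: the step names are hypothetical stand-ins,
# not the actual calls made by mdt_init0() or mdt_fini().
class Teardown:
    """Collect cleanup callbacks and run them in reverse registration order."""

    def __init__(self):
        self._cleanups = []

    def register(self, cleanup):
        self._cleanups.append(cleanup)

    def run(self):
        # Pop from the end so the last thing initialized is cleaned up first.
        while self._cleanups:
            self._cleanups.pop()()


def init_then_fini(steps):
    """Initialize each step in order, tear everything down, return the log."""
    log = []
    teardown = Teardown()
    for step in steps:
        log.append("init " + step)
        teardown.register(lambda s=step: log.append("fini " + s))
    teardown.run()
    return log
```

Registering each cleanup at the moment its init step succeeds keeps the normal teardown and the error path in step by construction.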

Signed-off-by: Quentin Bouget <quentin.bouget.ocre@cea.fr>
Change-Id: Ic242b8f02cf44f900541446964297982ad6fc178
Reviewed-on: https://review.whamcloud.com/21207
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6319 tests: Resume parallel-grouplock testing 07/19107/8
James Nunez [Wed, 23 Mar 2016 20:34:34 +0000 (14:34 -0600)]
LU-6319 tests: Resume parallel-grouplock testing

The parallel_grouplock test from the parallel-scale test suite
was added to the ALWAYS_EXCEPT list in 2009. We need to resume
testing of parallel_grouplock.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I70eba62f433d280f7117aea63d7c1b56cd1fb676
Reviewed-on: https://review.whamcloud.com/19107
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8972 osp: skip subsequent orphan cleanups 79/25079/3
Alex Zhuravlev [Wed, 25 Jan 2017 04:51:40 +0000 (07:51 +0300)]
LU-8972 osp: skip subsequent orphan cleanups

Orphan cleanup should be done once; after that we only need to
recreate missing precreated objects (due to OST failures). Otherwise
we risk hitting a deadlock (if we block creations during orphan
cleanup) or destroying objects that are being allocated (which
results in data corruption).

Change-Id: Ie8bc301ae4463c170b0cf5fc5ddd52e41fa88638
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/25079
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9019 mdt: use ktime_t for calculating elapsed time 23/24923/2
James Simmons [Tue, 17 Jan 2017 19:36:44 +0000 (14:36 -0500)]
LU-9019 mdt: use ktime_t for calculating elapsed time

mdt_identity_do_upcall() tries to print how much time has passed
across a call_usermodehelper() function, and uses struct timeval
for that.

We want to remove this structure, so this is better expressed
in terms of ktime_t and ktime_us_delta().

Change-Id: I2d167a50c537c525600622977b8cb422f0a88ba4
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24923
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6245 libcfs: use libcfs_private.h only for kernel space 38/22138/11
James Simmons [Wed, 11 Jan 2017 16:24:35 +0000 (11:24 -0500)]
LU-6245 libcfs: use libcfs_private.h only for kernel space

The current lustre userland code no longer uses special
macros which are present in libcfs_private.h.
We can then eliminate those macros and only use the
libcfs_private.h header for kernel space. The special
macros no longer needed for userland are:

1) LOGL and LOGU which were used in the UAPI header
   lustre_cfg.h and lustre_ioctl.h.

2) [un]likely macros used in the UAPI header lustre_ostid.h

3) The special libcfs assert macros

Change-Id: Iaa54bdcfb6104d13f2aabae63335e041481244a5
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/22138
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8411 ofd: handle last_rcvd file can't update properly 98/21398/13
Alexey Lyashkov [Mon, 18 Jul 2016 14:28:18 +0000 (17:28 +0300)]
LU-8411 ofd: handle last_rcvd file can't update properly

A last_rcvd update may fail, yet a "no fail" return code is still
sent to the client. In that case a DIO request may be replayed instead
of resent, but because a "no fail" return code was sent to the client,
the user application will free the buffer, so the replay will be sent
with incorrect data.

Write should fail if last_rcvd can't update properly.

This patch either causes sanity test 407 to fail or has brought out
an existing bug in Lustre. sanity test 407 is added to the
ALWAYS_EXCEPT list.

Seagate-bug-id: MRP-3609
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Idcbff5fd990edbc84539197da9876748b33795dd
Reviewed-on: https://review.whamcloud.com/21398
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8900 mgs: use reference count for fs_db 15/24415/11
Fan Yong [Tue, 27 Sep 2016 08:30:50 +0000 (16:30 +0800)]
LU-8900 mgs: use reference count for fs_db

This prevents an in-use 'fs_db' from being freed/erased
by others. The user (in subsequent patches) can then hold
the 'fs_db' for a long time without holding the related lock.
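
A minimal sketch of the reference-counting idea, not the MGS implementation: the class name FsDb and its get()/put() methods are hypothetical, and the "freed" flag merely stands in for releasing the structure.

```python
import threading


class FsDb:
    """Hypothetical stand-in for a reference-counted 'fs_db' object."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()
        self._refs = 1          # reference held by the owning table
        self.freed = False      # stands in for actually freeing the object

    def get(self):
        """Take an extra reference so the object outlives the table lock."""
        with self._lock:
            if self._refs == 0:
                raise RuntimeError("fs_db already freed")
            self._refs += 1
        return self

    def put(self):
        """Drop one reference; the last put releases the object."""
        with self._lock:
            self._refs -= 1
            last = self._refs == 0
        if last:
            self.freed = True
```

A holder that took its own reference can keep using the object after the table drops its reference; the object is released only on the final put().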

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Icf5d548a2c51548aae2c05b1b34f003e725f4e02
Reviewed-on: https://review.whamcloud.com/24415
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6455 tests: Re-enable replay-vbr and replay-single tests 65/21565/5
James Nunez [Thu, 28 Jul 2016 15:13:31 +0000 (09:13 -0600)]
LU-6455 tests: Re-enable replay-vbr and replay-single tests

Tests replay-vbr 4i, 4j, 4k and 10b and replay-single test 28
were added to the ALWAYS_EXCEPT list because they were failing
on el7. The issues causing those failures have been fixed and
landed in http://review.whamcloud.com/#/c/14928/ .

The replay-vbr and replay-single tests need to be re-enabled.

Test-Parameters: trivial testlist=replay-vbr,replay-vbr,replay-vbr
Test-Parameters: testlist=replay-single,replay-single,replay-single

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic6911849ee96673072e4e1a7abe96706c7c9f87f
Reviewed-on: https://review.whamcloud.com/21565
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9045 osp: Revert "LU-8840 osp: handle EA cache properly" 34/25134/4
Jian Yu [Fri, 27 Jan 2017 11:36:02 +0000 (19:36 +0800)]
LU-9045 osp: Revert "LU-8840 osp: handle EA cache properly"

The patch caused test failures tracked in LU-9045 and LU-9048.

This reverts commit 555d02f47401340182b47b3245a657b52fc3e68a.

Test-Parameters: mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
mdscount=2 mdtcount=4 \
testlist=conf-sanity,conf-sanity,sanity-lfsck,sanity-lfsck,sanity-hsm,sanity-hsm

Change-Id: I3d922abd76b441f10ed0446e5528644a38211949
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/25134
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7734 lnet: multi-rail feature 87/25087/1
Amir Shehata [Wed, 25 Jan 2017 20:28:42 +0000 (12:28 -0800)]
LU-7734 lnet: multi-rail feature

Merge branch 'multi-rail'

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I88d3d86d81681802387fc70dba2b9315a9720470

2 years agoLU-7734 lnet: Fix setting numa range
Amir Shehata [Thu, 12 Jan 2017 21:57:11 +0000 (13:57 -0800)]
LU-7734 lnet: Fix setting numa range

Call the correct API when setting numa_range.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Test-Parameters: trivial
Change-Id: I1f9f8f1aabc277dff1fddd678cd360a9c49af4a5
Reviewed-on: https://review.whamcloud.com/24861
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: Update lnetctl usage
Stephen Champion [Fri, 9 Dec 2016 20:31:49 +0000 (12:31 -0800)]
LU-7734 lnet: Update lnetctl usage

Bring lnetctl help descriptions, man page, and usage in line
with changes to peer functions.

Signed-off-by: Stephen Champion <schamp@sgi.com>
Test-Parameters: trivial
Change-Id: Idf115319727d92f23e50a97585f2f2c1e8c1b7b8
Reviewed-on: https://review.whamcloud.com/24279
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: cpt locking
Amir Shehata [Fri, 2 Dec 2016 05:23:20 +0000 (21:23 -0800)]
LU-7734 lnet: cpt locking

When source nid is specified it is necessary to also
use the destination nid. Otherwise bulk transfer will end up
on a different interface than the nearest interface to the
memory. This has significant performance impact on NUMA
systems such as the SGI UV.

The CPT to which the MD describing the bulk buffers belongs
is not the same as the CPT of the actual pages of memory.
Therefore, it is necessary to communicate the CPT of the pages
to LNet, in order for LNet to select the nearest interface.

The MD which describes the pages of memory gets attached to
an ME, to be matched later on. The MD which describes the
message to be sent is different and this patch adds the
handle of the bulk MD into the MD which ends up being
accessible by lnet_select_pathway(). In that function
a new API, lnet_cpt_of_md_page(), is called which returns the
CPT of the buffers used for the bulk transfer.
lnet_select_pathway() proceeds to use this CPT to select
the nearest interface.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I4117ef912835f16dcdcaafb70703f92d74053b9b
Reviewed-on: https://review.whamcloud.com/24085

2 years agoLU-7734 lnet: rename peer key_nid to prim_nid
Amir Shehata [Thu, 27 Oct 2016 23:49:27 +0000 (16:49 -0700)]
LU-7734 lnet: rename peer key_nid to prim_nid

To make the interface clear, renamed key_nid to
prim_nid to indicate that this parameter refers to
the peer's primary nid.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I74bd17cdd55ba8d2c52bc28557db149d23ecbfb5
Reviewed-on: http://review.whamcloud.com/23460
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: Enhance DLC ip2nets
Amir Shehata [Thu, 8 Sep 2016 01:32:34 +0000 (18:32 -0700)]
LU-7734 lnet: Enhance DLC ip2nets

If the interfaces YAML block is specified then commission
the interfaces which match the ip-range if it is defined.
Otherwise commission the interfaces as long as they exist
and are up.

If the interfaces YAML block is not specified but an
ip-range is specified then configure all interfaces
in the system that match the ip-range.

If no interfaces and no ip-range is specified, then
commission the first interface that exists and is UP.
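
The three cases above can be sketched as a small decision function. This is only an illustration under stated assumptions: select_interfaces() and its arguments are hypothetical, the ip-range is simplified to a CIDR network rather than the real ip2nets expression syntax, and the "exists and is up" check is applied to listed interfaces whether or not an ip-range is given.

```python
import ipaddress


def select_interfaces(system, requested=None, ip_range=None):
    """Pick the interfaces to commission.

    system:    dict of interface name -> (IPv4 address string, is_up),
               in system order (hypothetical representation)
    requested: names from the interfaces block, or None if absent
    ip_range:  CIDR string standing in for the ip-range, or None
    """
    def in_range(name):
        if ip_range is None:
            return True
        addr = ipaddress.ip_address(system[name][0])
        return addr in ipaddress.ip_network(ip_range)

    def exists_and_up(name):
        return name in system and system[name][1]

    if requested:
        # Interfaces listed explicitly: filter by ip-range when defined.
        return [n for n in requested if exists_and_up(n) and in_range(n)]
    if ip_range is not None:
        # No interfaces listed: every interface matching the ip-range.
        return [n for n in system if in_range(n)]
    # Neither given: the first interface that exists and is up.
    for name, (_, is_up) in system.items():
        if is_up:
            return [name]
    return []
```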

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I01b2ced6f50fed2528f626166154be874f394e8b
Reviewed-on: http://review.whamcloud.com/22372
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: fix NULL access in lnet_peer_aliveness_enabled
Amir Shehata [Fri, 26 Aug 2016 19:39:27 +0000 (12:39 -0700)]
LU-7734 lnet: fix NULL access in lnet_peer_aliveness_enabled

When a peer is not on a local network, lpni->lpni_net is NULL.
The lpni_net is accessed in lnet_peer_aliveness_enabled() without
checking whether it is NULL. Fixed.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: If328728e2bda2a19b273140a20c04b22bdda6bc4
Reviewed-on: http://review.whamcloud.com/22183
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: set primary NID in ptlrpc_connection_get()
Olaf Weber [Thu, 4 Aug 2016 11:27:01 +0000 (13:27 +0200)]
LU-7734 lnet: set primary NID in ptlrpc_connection_get()

Set the NID in ptlrpc_connection::c_peer to the primary NID of a peer.
This ensures that regardless of the NID used to start a connection, we
consistently use the same NID (the primary NID) to identify a peer. It
also means that PtlRPC will not create multiple connections to a peer.

The primary NID is obtained by calling LNetPrimaryNID(), an addition
to the exported symbols of the LNet module. The name was chosen to
match the existing naming pattern.
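
The effect on connection lookup can be sketched like this. It is a simplified model, not PtlRPC code: ConnectionCache is hypothetical, and the primary_of dictionary stands in for the lookup that LNetPrimaryNID() performs.

```python
class ConnectionCache:
    """Hypothetical connection table keyed by a peer's primary NID."""

    def __init__(self, primary_of):
        # primary_of: dict mapping any known NID to its peer's primary NID;
        # a stand-in for the lookup done by LNetPrimaryNID().
        self._primary_of = primary_of
        self._conns = {}

    def get(self, nid):
        # Unknown NIDs are treated as their own primary NID.
        primary = self._primary_of.get(nid, nid)
        conn = self._conns.get(primary)
        if conn is None:
            conn = {"peer": primary, "refs": 0}
            self._conns[primary] = conn
        conn["refs"] += 1
        return conn
```

Because every NID of a peer resolves to the same key, a second lookup through a different NID returns the existing connection instead of creating a new one.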

Test-Parameters: trivial
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Idc0605d17a58678b634db246221028cf81ad2407
Reviewed-on: http://review.whamcloud.com/21710
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: fix string.h header inclusion
Amir Shehata [Thu, 11 Aug 2016 01:07:17 +0000 (18:07 -0700)]
LU-7734 lnet: fix string.h header inclusion

string.h is intended for user space and libcfs_string.h
is intended for kernel space. Use string.h in liblnetconfig
library.

Add cfs_expr_list_values() in the string.h header file since
it's used in liblnetconfig library.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I50b1bb1aff6fe176cfbe28f039f34d063c9265e4
Reviewed-on: http://review.whamcloud.com/21874
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-7734 lnet: minor fixes
Amir Shehata [Wed, 20 Jul 2016 09:11:50 +0000 (02:11 -0700)]
LU-7734 lnet: minor fixes

Fixed some issues that the Gatekeeper helper robot pointed out.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Id33b4c9e94b22bddc0bfddf8f51235b81d3d86dc
Reviewed-on: http://review.whamcloud.com/21450

2 years agoLU-7734 lnet: double free in lnet_add_net_common()
Olaf Weber [Wed, 20 Jul 2016 12:57:36 +0000 (14:57 +0200)]
LU-7734 lnet: double free in lnet_add_net_common()

lnet_startup_lndnet() always consumes its net parameter, so we
should not free net after the function has been called. This
fixes a double free triggered by adding a network twice.

Eliminate the netl local variable.

Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I1cfc3494eada4660b792f6a1ebd96b5dc80d9945
Reviewed-on: http://review.whamcloud.com/21446
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: Fix crash in router_proc.c
Amir Shehata [Thu, 14 Jul 2016 23:51:32 +0000 (16:51 -0700)]
LU-7734 lnet: Fix crash in router_proc.c

Fixed NULL access in the case when a peer is a remote
peer. In that case lpni_net is NULL.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ida234ff016b2bdc305acf74df0f99600d2555e27
Reviewed-on: http://review.whamcloud.com/21327

2 years agoLU-7734 lnet: fix routing selection
Amir Shehata [Thu, 14 Jul 2016 23:50:07 +0000 (16:50 -0700)]
LU-7734 lnet: fix routing selection

Always prefer locally connected networks over routed networks.
If there are multiple routed networks and no connected networks
pick the best gateway to use. If all gateways are equal then
round robin through them.
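
A rough sketch of that preference order, under assumptions: PathSelector is hypothetical, and each gateway carries an illustrative numeric score (lower is better) in place of whatever criteria LNet actually uses to rank gateways.

```python
import itertools


class PathSelector:
    """Hypothetical model: local networks first, then the best gateway."""

    def __init__(self, local_nets):
        self.local_nets = set(local_nets)
        self._rr = itertools.count()    # round-robin sequence for ties

    def select(self, peer_nets, gateways):
        """peer_nets: networks the peer is on.
        gateways:  list of (name, score); lower score is better (assumed).
        """
        # A locally connected network always wins over a routed one.
        for net in peer_nets:
            if net in self.local_nets:
                return ("local", net)
        if not gateways:
            return None
        best = min(score for _, score in gateways)
        tied = [name for name, score in gateways if score == best]
        # Equal gateways are used in turn.
        return ("routed", tied[next(self._rr) % len(tied)])
```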

Renamed dev_cpt to ni_dev_cpt to maintain naming convention.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie6a3aaa7a9ec4f5474baf5e1ec0258d481418cb1
Reviewed-on: http://review.whamcloud.com/21326

2 years agoLU-7734 lnet: power8 compile fix
James Simmons [Wed, 29 Jun 2016 17:19:41 +0000 (13:19 -0400)]
LU-7734 lnet: power8 compile fix

On Power8 the following error occurred:

error: inlining failed in call to always_inline ‘lnet_get_numa_range’:
function body not available inline __u32 lnet_get_numa_range(void);

The reason for this is that in the Linux kernel the body of an
inline function must be provided. Replace this inline function
by exposing the lnet_numa_range module parameter, as we do for
portal_rotor. Also treat all the lnet_numa_range handling as
unsigned int.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Ic566e56eb1e333d145de21d1c197218c6425dc5b
Reviewed-on: http://review.whamcloud.com/21078
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: Routing fixes part 2
Amir Shehata [Wed, 6 Jul 2016 02:36:08 +0000 (19:36 -0700)]
LU-7734 lnet: Routing fixes part 2

Fix lnet_select_pathway() to handle the routing cases correctly.
The following general cases are handled:
. Non-MR directly connected
. Non-MR not directly connected
. MR Directly connected
. MR Not directly connected
  . No gateway
  . Gateway is non-mr
  . Gateway is mr

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: If2d16b797b94421e78a9f2a254a250a440f8b244
Reviewed-on: http://review.whamcloud.com/21167

2 years agoLU-7734 lnet: Routing fixes part 1
Amir Shehata [Mon, 4 Jul 2016 21:51:06 +0000 (14:51 -0700)]
LU-7734 lnet: Routing fixes part 1

This is the first part of a routing fix.
- Fix crash in lnet_parse_get()
- Resolve deadlock when adding a route.
- Fix an issue with dynamically turning on routing
- Set the final destination NID properly when routing a msg

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I68d0e4d52192aa96e37c77952a1ebe75c1b770c5
Reviewed-on: http://review.whamcloud.com/21166

2 years agoLU-7734 lnet: fix lnet_select_pathway()
Amir Shehata [Mon, 20 Jun 2016 21:21:13 +0000 (14:21 -0700)]
LU-7734 lnet: fix lnet_select_pathway()

Fixed the selection algorithm to work properly with more than
one local network. The behavior now is to iterate through all
interfaces on all networks.

Also removed the health variable from struct lnet_peer_net since
it's never used.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ib91748e80446585b6a9e1bc0f3af6894599d8aaa
Reviewed-on: http://review.whamcloud.com/20890
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Jenkins
2 years agoLU-7734 lnet: configuration fixes
Amir Shehata [Fri, 17 Jun 2016 22:55:13 +0000 (15:55 -0700)]
LU-7734 lnet: configuration fixes

Fix cpt configuration from DLC to configure the proper list
of cpts in LNet. Check in LNet that no CPTs are outside the
available CPTs in the system.

Fix peer_rtr_credits name to peer_tx_credits to reflect the
actual value.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ic4a3985a470ed901be6166df4079205677921817
Reviewed-on: http://review.whamcloud.com/20862
Tested-by: Jenkins
2 years agoLU-7734 lnet: fix lnet_peer_table_cleanup_locked()
Olaf Weber [Thu, 16 Jun 2016 10:27:46 +0000 (12:27 +0200)]
LU-7734 lnet: fix lnet_peer_table_cleanup_locked()

In lnet_peer_table_cleanup_locked() we delete the entire peer if the
lnet_peer_ni for the primary NID of the peer is deleted. If the next
lnet_peer_ni in the list belongs to the peer being deleted, then the
next pointer kept by list_for_each_entry_safe() ends up pointing at
freed memory.

Add a list_for_each_entry_from() loop to advance next to a peer_ni
that does not belong to the peer being deleted and will therefore
remain present in the list.
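
The hazard and the fix can be modelled with a small linked list. This is a sketch only: PeerNiList and its fields are hypothetical, a "freed" flag stands in for freed memory, and Python's memory model hides the real use-after-free, so the assertion inside cleanup() plays that role.

```python
class Node:
    def __init__(self, nid, primary):
        self.nid, self.primary = nid, primary
        self.prev = self.next = None
        self.freed = False


class PeerNiList:
    """Hypothetical circular list of peer_ni entries with a sentinel head."""

    def __init__(self, pairs):
        self.head = Node(None, None)
        self.head.prev = self.head.next = self.head
        for nid, primary in pairs:
            node, last = Node(nid, primary), self.head.prev
            last.next, node.prev = node, last
            node.next, self.head.prev = self.head, node

    def _unlink(self, node):
        node.prev.next, node.next.prev = node.next, node.prev
        node.freed = True

    def _delete_peer(self, primary):
        node = self.head.next
        while node is not self.head:
            following = node.next
            if node.primary == primary:
                self._unlink(node)
            node = following

    def cleanup(self, doomed):
        cur = self.head.next
        while cur is not self.head:
            nxt = cur.next
            if cur.nid in doomed:
                if cur.nid == cur.primary:
                    # Deleting the primary NID deletes the whole peer, so
                    # move nxt past entries that are about to be freed.
                    while nxt is not self.head and nxt.primary == cur.primary:
                        nxt = nxt.next
                    self._delete_peer(cur.primary)
                else:
                    self._unlink(cur)
            assert nxt is self.head or not nxt.freed
            cur = nxt

    def nids(self):
        out, node = [], self.head.next
        while node is not self.head:
            out.append(node.nid)
            node = node.next
        return out
```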

Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I92bf219dc93a79f7d90035ccfbb38cd251138c04
Reviewed-on: http://review.whamcloud.com/20824
Tested-by: Jenkins
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: Fix lnet_msg_free()
Amir Shehata [Fri, 10 Jun 2016 22:07:06 +0000 (15:07 -0700)]
LU-7734 lnet: Fix lnet_msg_free()

Remove the ni_decref in lnet_msg_free(), since this function
gets called with no lnet_net_lock() held.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ibfcbcea25287f4d22ae6146d7aa01f4279ffe969
Reviewed-on: http://review.whamcloud.com/20729

2 years agoLU-7734 lnet: simplify and fix lnet_select_pathway()
Amir Shehata [Fri, 10 Jun 2016 06:43:35 +0000 (23:43 -0700)]
LU-7734 lnet: simplify and fix lnet_select_pathway()

In lnet_select_pathway() we restart selection if the DLC seq
counter changes. Provided we take a hold on the preferred
lnet_peer_ni, we only need to restart if an lnet_ni was added
or removed. Update the locations where lnet_incr_dlc_seq() is
called to take this into account.

A number of local variables must be reset whenever we goto
again. Do this immediately after the label for the global
variables, and immediately before the block that uses them
for the helper variables.

In the loop where NUMA distances are compared, use the NUMA
range for distances smaller than the NUMA range, simplifying
the subsequent comparisons between distances.

Remove the lo_sent output parameter. Instead do an early
return with LNET_CREDIT_OK.

Move the increment of the best_lpni->lpni_seq number after
the check that best_lpni isn't NULL.

When routing, the best_gw should be treated as the best_lpni
for the purpose of determining the CPT to lock.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Ie71eebc2301601cf1c85c6248dbed06951b89274
Reviewed-on: http://review.whamcloud.com/20720

2 years agoLU-7734 lnet: protect peer_ni credits
Amir Shehata [Thu, 9 Jun 2016 08:17:45 +0000 (01:17 -0700)]
LU-7734 lnet: protect peer_ni credits

Currently multiple NIs can talk to the same peer_ni. The per-CPT
lnet_net_lock therefore no longer protects the lpni against
concurrent updates. To resolve this issue a spinlock is added
to the lnet_peer_ni, which must be locked when the peer NI
credits, delayed message queue, and delayed routed message queue
are modified. The lock is not taken when reporting credits.
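
A simplified model of that locking rule, using a mutex in place of a kernel spinlock. The PeerNi class and its credit accounting are hypothetical and much simpler than LNet's; the point is only which operations take the lock.

```python
import threading


class PeerNi:
    """Hypothetical peer NI with a per-object lock guarding its credits."""

    def __init__(self, credits):
        self._lock = threading.Lock()   # stands in for the new spinlock
        self.credits = credits
        self.delayed = []               # messages waiting for a credit

    def send(self, msg):
        """Consume a credit, or queue the message if none is available."""
        with self._lock:
            if self.credits > 0:
                self.credits -= 1
                return True
            self.delayed.append(msg)
            return False

    def return_credit(self):
        """Give a credit back; a queued message takes it over directly."""
        with self._lock:
            if self.delayed:
                return self.delayed.pop(0)
            self.credits += 1
            return None

    def report_credits(self):
        # Reporting reads the value without taking the lock.
        return self.credits
```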

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: I52153680a74d43e595314b63487026cc3f6a5a8f
Reviewed-on: http://review.whamcloud.com/20702

2 years agoLU-7734 lnet: proper cpt locking
Amir Shehata [Thu, 9 Jun 2016 06:08:06 +0000 (23:08 -0700)]
LU-7734 lnet: proper cpt locking

1. Add a per-NI credits count, which is just the total credits
   assigned on NI creation.
2. Whenever per-CPT credits are added or decremented, we
   mirror that in the NI credits.
3. We use the NI credits to determine the best NI.
4. After we have completed the peer_ni/NI selection we
   determine the CPT to use for locking:
   cpt_of_nid(lpni->nid, ni)

The lpni_cpt is not enough to protect all the fields in the
lnet_peer_ni structure. This is due to the fact that multiple
NIs can talk to the same peer, and functions can be called with
different cpts locked. To properly protect the fields in the
lnet_peer_ni structure, a spin lock is introduced for the
purpose.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ief7868c3c8ff7e00ea9e908dd50d8cef77d9f9a4
Reviewed-on: http://review.whamcloud.com/20701

2 years agoLU-7734 lnet: peer/peer_ni handling adjustments
Amir Shehata [Thu, 26 May 2016 22:42:39 +0000 (15:42 -0700)]
LU-7734 lnet: peer/peer_ni handling adjustments

A peer can be added by specifying a list of NIDs.
The first NID shall be used as the primary NID. The rest of
the NIDs will be added under the primary NID.

A peer can be added by explicitly specifying the key NID, and then
by adding a set of other NIDs, all done through one API call.

If a key NID already exists, but it's not an MR NI, then adding that
key NID from DLC shall convert that NI to an MR NI.

If a key NID already exists, and it is an MR NI, then re-adding the
key NID shall have no effect.

If a key NID already exists as part of another peer, then adding that
NID as part of another peer or as primary shall fail.

If a NID is being added to a peer NI and that NID is non-MR, then
that NID is moved under the peer and is made to be MR capable.

If a NID is being added to a peer and that NID is an MR NID and part
of another peer, then the operation shall fail.

If a NID is being added to a peer and it is already part of that peer
then the operation is a no-op.

Moreover, the code is structured to consider the addition of Dynamic
Discovery in later patches.
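
The rules above can be sketched as a small model. Everything here is hypothetical (PeerDb, PeerError, the two dictionaries); it only mirrors the stated rules, treats a non-MR peer as holding exactly one NID, and does not roll back a partially applied add.

```python
class PeerError(Exception):
    pass


class PeerDb:
    """Hypothetical model of the peer/NID rules; not the LNet data structures."""

    def __init__(self):
        self.owner = {}        # NID -> primary (key) NID of its peer
        self.multi_rail = {}   # primary NID -> True if the peer is MR

    def add_non_mr(self, nid):
        """A non-MR peer with a single NID, as created without DLC."""
        self.owner[nid] = nid
        self.multi_rail[nid] = False

    def add_peer(self, nids):
        """Configure a peer from DLC; the first NID is the primary NID."""
        primary, rest = nids[0], nids[1:]
        if primary in self.owner and self.owner[primary] != primary:
            raise PeerError("key NID is part of another peer")
        # New key NID, conversion of a non-MR NI, or a no-op if already MR.
        self.owner[primary] = primary
        self.multi_rail[primary] = True
        for nid in rest:
            self.add_nid(primary, nid)

    def add_nid(self, primary, nid):
        current = self.owner.get(nid)
        if current == primary:
            return                      # already part of this peer: no-op
        if current is not None:
            if self.multi_rail[current]:
                raise PeerError("NID is MR and part of another peer")
            del self.multi_rail[current]   # non-MR NID moves under this peer
        self.owner[nid] = primary
```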

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I71f740192a31ae00f83014ca3e9e06b61ae4ecd5
Reviewed-on: http://review.whamcloud.com/20531

2 years agoLU-7734 lnet: Add peer_ni and NI stats for DLC
Doug Oucharek [Fri, 13 May 2016 00:25:21 +0000 (17:25 -0700)]
LU-7734 lnet: Add peer_ni and NI stats for DLC

This patch adds three stats to the peer_ni and NI structures:
send_count, recv_count, and drop_count. These stats get printed
when you do an "lnetctl net show -v" (for NI) and
"lnetctl peer show" (for peer_ni).

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: Ic41c88cbc68dba677151d87a1fab53a48d36ea29
Reviewed-on: http://review.whamcloud.com/20170
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Amir Shehata <amir.shehata@intel.com>
2 years agoLU-7734 lnet: rename LND peer to peer_ni
Amir Shehata [Fri, 1 Apr 2016 19:28:58 +0000 (12:28 -0700)]
LU-7734 lnet: rename LND peer to peer_ni

Patch to rename LND peers to peer_ni to reflect the fact that these
constructs reflect an actual connection between a local NI and remote
peer NI.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I1c25a12eae61d8822a8c4ada2e077a5b2011ba22
Reviewed-on: http://review.whamcloud.com/19307
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: handle N NIs to 1 LND peer
Amir Shehata [Fri, 1 Apr 2016 17:41:34 +0000 (10:41 -0700)]
LU-7734 lnet: handle N NIs to 1 LND peer

This patch changes o2iblnd only, as socklnd already handles this
case. In the new design you can have multiple NIs communicating
to one peer. In the o2iblnd the kib_peer has a pointer to the NI
which implies a 1:1 relationship.

This patch changes kiblnd_find_peer_locked() to use the peer NID
and the NI NID as the key. This way a new peer will be created for
each unique NI/peer_NI pair.

This is similar to how socklnd handles this case.
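
The keying change amounts to looking peers up by a pair rather than by the peer NID alone. A minimal sketch, with a hypothetical LndPeerTable standing in for the o2iblnd peer table:

```python
class LndPeerTable:
    """Hypothetical peer table keyed by (peer NID, local NI NID)."""

    def __init__(self):
        self._peers = {}

    def find(self, peer_nid, ni_nid):
        return self._peers.get((peer_nid, ni_nid))

    def find_or_create(self, peer_nid, ni_nid):
        key = (peer_nid, ni_nid)
        if key not in self._peers:
            # One entry per unique NI/peer_NI pair.
            self._peers[key] = {"peer_nid": peer_nid, "ni_nid": ni_nid}
        return self._peers[key]

    def __len__(self):
        return len(self._peers)
```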

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ifab7764489757ea473b15c46c1a22ef9ceeeceea
Reviewed-on: http://review.whamcloud.com/19306
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: handle non-MR peers
Amir Shehata [Thu, 31 Mar 2016 23:53:02 +0000 (16:53 -0700)]
LU-7734 lnet: handle non-MR peers

Add the ability to declare a peer to be non-MR from the DLC
interface. By default if a peer is configured from DLC it is
assumed to be MR capable, except when the non-mr flag is set.

For a non-MR peer, always use the same NI to communicate with it.
If multiple NIs are used to communicate with a non-MR peer, the
peer will consider that it's talking to different peers, which could
confuse upper layers.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie3ec45f5f44fa7d72e3e0335b1383f9c3cc92627
Reviewed-on: http://review.whamcloud.com/19305
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: Primary NID and traffic distribution
Amir Shehata [Tue, 15 Mar 2016 21:44:07 +0000 (14:44 -0700)]
LU-7734 lnet: Primary NID and traffic distribution

When receiving messages from a multi-rail peer we must keep track of
both the source NID and the primary NID of the peer. When sending a
reply message or RPC response, the source NID is preferred. But most
other uses require identification of the peer regardless of which
source NID the message came from, and so the primary NID of the peer
must then be used.

An example for this is the creation of match entries. Another occurs
when an event is created: the initiator should be the primary NID, to
ensure upper layers (PtlRPC and Lustre) always see the same NID for
that peer.

This change also contains code to have PtlRPC use LNET_NID_ANY for
the 'self' parameter of LNetPut() and LNetGet() when it doesn't care
which NI it sends from, and to provide a local/peer NID pair when it
does. This can be broken out into a separate change.

Signed-off-by: Olaf Weber <olaf@sgi.com>
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: If4391f2537a94f5784e8c61ae03aad266b2f8e7d
Reviewed-on: http://review.whamcloud.com/18938
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: NUMA support
Amir Shehata [Tue, 15 Mar 2016 10:13:03 +0000 (03:13 -0700)]
LU-7734 lnet: NUMA support

This patch adds NUMA node support. NUMA node information is stored
in the CPT table. A NUMA node mask is maintained for the entire table
as well as for each CPT to track the NUMA nodes related to each of
the CPTs. The following key APIs are added:

cfs_cpt_of_node(): returns the CPT of a particular NUMA node
cfs_cpt_distance(): calculates the distance between two CPTs

When the LND device is started it finds the NUMA node of the physical
device and then from there it finds the CPT, which is subsequently
stored in the NI structure.

When selecting the NI, the MD CPT is determined and the distance
between the MD CPT and the device CPT is calculated. The NI
with the shortest distance is preferred.

If the device or system is not NUMA aware then the CPT for the
device will default to CFS_CPT_ANY, and the distance calculated
when CFS_CPT_ANY is used is the largest in the system. That is,
non-NUMA-aware devices are least preferred.

A NUMA range value can be set. If the value is large enough
it amounts to basically turning off NUMA criterion completely.
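
A sketch of distance-based NI selection as described above. The function names, the distance table, and the tie-break on credits are assumptions for illustration; treating any distance below the NUMA range as equal to the range follows the description in the related lnet_select_pathway() change in this series.

```python
CPT_ANY = -1    # stand-in for CFS_CPT_ANY


def cpt_distance(table, a, b):
    """Distance between two CPTs; CPT_ANY yields the largest distance."""
    if a == CPT_ANY or b == CPT_ANY:
        return max(max(row) for row in table)
    return table[a][b]


def select_ni(nis, md_cpt, table, numa_range=0):
    """Pick the NI nearest to the memory described by the MD.

    nis:        list of (name, device CPT, credits)  (hypothetical layout)
    md_cpt:     CPT of the MD's pages
    table:      square matrix of CPT-to-CPT distances
    numa_range: distances below this value are treated as equal to it
    """
    best_key, best_name = None, None
    for name, dev_cpt, credits in nis:
        dist = max(cpt_distance(table, md_cpt, dev_cpt), numa_range)
        key = (dist, -credits)      # nearest first, then most credits (assumed)
        if best_key is None or key < best_key:
            best_key, best_name = key, name
    return best_name
```

With a NUMA range larger than every distance in the table, all NIs compare equal on distance and only the secondary criterion decides.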

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I2d7c63f8e8fc8e8a6a249b0d6bfdd08fd090a837
Reviewed-on: http://review.whamcloud.com/18916
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: configure local NI from DLC
Amir Shehata [Fri, 29 Jan 2016 01:00:09 +0000 (17:00 -0800)]
LU-7734 lnet: configure local NI from DLC

This patch adds the ability to configure multiple network interfaces
on the same network. This can be done via the lnetctl CLI interface
or through a YAML configuration. Refer to the multi-rail HLD for
more details on the syntax.

It also deprecates ip2nets kernel parsing. All string parsing and
network matching now happens in the DLC userspace library.

New IOCTLs are added for adding/deleting local NIs, to keep backwards
compatibility with older versions of the DLC and lnetctl.

The changes also include parsing and matching ip2nets syntax at the
user level and then passing the network interfaces down to the
kernel to be configured.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I19ee7dc76514beb6f34de6517d19654d6468bcec
Reviewed-on: http://review.whamcloud.com/18886
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-7734 lnet: configure peers from DLC
Amir Shehata [Wed, 13 Jan 2016 01:09:31 +0000 (17:09 -0800)]
LU-7734 lnet: configure peers from DLC

This patch adds the ability to configure peers from the DLC
interface.

When a peer is added a primary NID should be provided. If none is
provided then the first NID in the list of NIDs will be used
as the primary NID.

Basic error checking is done at the DLC level to ensure properly
formatted NIDs. However, if a NID is a duplicate, this will be
detected when adding it in the kernel. Operation is halted, which
means some peer NIDs might have already been added, but not the
entire set. It's the role of the caller to backtrack and remove that
peer that failed to add.

When deleting a peer, either a primary NID or a normal NID can be
provided. If a normal NID is provided, then the peer is found, and the
primary NID is compared to the peer ni. If they are the same, the
entire peer is deleted. Otherwise, only the identified peer ni is
deleted. If a set of NIDs is provided, each one is removed in turn
from the peer identified by the peer NID.
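
As a rough user-space illustration of the delete rule described above
(not the kernel code), the sketch below uses hypothetical names:
nid_t, struct peer and peer_del_nid() are stand-ins invented for this
example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NIDS 8

typedef uint64_t nid_t;          /* hypothetical stand-in for an LNet NID */

struct peer {
	nid_t  primary;          /* primary NID identifying the peer */
	nid_t  nids[MAX_NIDS];   /* all NIDs, including the primary */
	size_t count;            /* 0 means the whole peer is deleted */
};

/*
 * Returns 1 if the whole peer was deleted (nid was the primary NID),
 * 0 if only that one NID was removed, -1 if the NID was not found.
 */
int peer_del_nid(struct peer *p, nid_t nid)
{
	size_t i;

	if (p->count > 0 && nid == p->primary) {
		p->count = 0;
		return 1;
	}
	for (i = 0; i < p->count; i++) {
		if (p->nids[i] == nid) {
			/* fill the hole with the last entry */
			p->nids[i] = p->nids[p->count - 1];
			p->count--;
			return 0;
		}
	}
	return -1;
}
```

Deleting a non-primary NID leaves the rest of the peer intact, while
deleting the primary NID drops everything.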

The existing show peer credits API can be used to show peer
information.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Iaf588a062b44d74305aa9aa7d31c7341c6c384b9
Reviewed-on: http://review.whamcloud.com/18476
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoLU-7734 lnet: Multi-Rail local_ni/peer_ni selection
Amir Shehata [Tue, 5 Jan 2016 00:02:25 +0000 (16:02 -0800)]
LU-7734 lnet: Multi-Rail local_ni/peer_ni selection

This patch implements the local_ni/peer_ni selection algorithm.
It adds APIs to the peer module to encapsulate
iterating through the peer_nis in a peer and creating a peer.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ifc0e5ebf84ab25753adfcfcb433b024100f35ace
Reviewed-on: http://review.whamcloud.com/18383
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Jenkins
Tested-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: Multi-Rail peer split
Amir Shehata [Sat, 12 Dec 2015 04:02:54 +0000 (20:02 -0800)]
LU-7734 lnet: Multi-Rail peer split

Split the peer structure into peer/peer_net/peer_ni, as
described in the Multi-Rail HLD.

Removed the deathrow list in peers; peers are now deleted
immediately. The deathrow list complicates memory management for
peers to little gain.

Moved to LNET_LOCK_EX for any operations which modify the peer
tables, and to CPT locks for any operations which read the peer
tables. Therefore there is no need to use lnet_cpt_of_nid() to
calculate the CPT of the peer NID; instead we use lnet_nid_cpt_hash()
to distribute peers across multiple CPTs.

It is no longer true that peers and NIs exist on
the same CPT. In the new design peers and NIs don't have a 1-1
relationship. You can send to the same peer from several NIs, which
can exist on separate CPTs.
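
To illustrate the idea of spreading peers over CPTs by hashing the NID,
here is a minimal user-space sketch. The function nid_to_cpt() and its
mixing constants are assumptions made for this example; the actual
lnet_nid_cpt_hash() in the kernel may be implemented differently.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a NID-to-CPT hash: mix the 64-bit NID and
 * reduce it to a partition index in [0, ncpts).
 */
unsigned int nid_to_cpt(uint64_t nid, unsigned int ncpts)
{
	uint64_t h = nid;

	if (ncpts <= 1)
		return 0;
	h ^= h >> 33;
	h *= 0xff51afd7ed558ccdULL;
	h ^= h >> 33;
	return (unsigned int)(h % ncpts);
}
```

The result depends only on the NID and the number of partitions, so a
peer always maps to the same CPT regardless of which local NI is used.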

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ida41d830d38d0ab2bb551476e4a8866d52a25fe2
Reviewed-on: http://review.whamcloud.com/18293
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
2 years agoLU-7734 lnet: Multi-Rail local NI split
Amir Shehata [Sat, 12 Dec 2015 04:02:54 +0000 (20:02 -0800)]
LU-7734 lnet: Multi-Rail local NI split

This patch allows the configuration of multiple NIs under one Net.
It is now possible to have multiple NIDs on the same network:
   Ex: <ip1>@tcp, <ip2>@tcp.
This can be configured using the following syntax:
   Ex: tcp(eth0, eth1)

The data structures for the example above can be visualized
as follows

               NET(tcp)
                |
        -----------------
        |               |
      NI(eth0)        NI(eth1)

For more details refer to the Multi-Rail Requirements and HLD
documents.
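
The one-Net-to-many-NIs layout can be sketched in user space as
follows. This is only an illustration: struct net, parse_net() and the
size limits are hypothetical and do not reflect the real LNet
structures or parser.

```c
#include <string.h>

#define MAX_NI   4
#define NAME_LEN 16

struct net {
	char name[NAME_LEN];          /* e.g. "tcp" */
	char ni[MAX_NI][NAME_LEN];    /* e.g. "eth0", "eth1" */
	int  ni_count;
};

/*
 * Parse "net(if0, if1, ...)" into *out. Returns 0 on success, -1 on a
 * malformed or oversized string.
 */
int parse_net(const char *spec, struct net *out)
{
	const char *open = strchr(spec, '(');
	const char *close = strrchr(spec, ')');
	const char *p;

	memset(out, 0, sizeof(*out));
	if (!open || !close || close < open || open == spec ||
	    (size_t)(open - spec) >= NAME_LEN)
		return -1;
	memcpy(out->name, spec, (size_t)(open - spec));

	p = open + 1;
	while (p < close) {
		size_t len;

		/* skip separators between interface names */
		while (p < close && (*p == ' ' || *p == ','))
			p++;
		len = strcspn(p, ", )");
		if (len == 0)
			break;
		if (len >= NAME_LEN || out->ni_count == MAX_NI)
			return -1;
		memcpy(out->ni[out->ni_count++], p, len);
		p += len;
	}
	return out->ni_count > 0 ? 0 : -1;
}
```

Parsing "tcp(eth0, eth1)" with this sketch yields one net named "tcp"
holding two NIs, matching the diagram.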

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Id7c73b9b811a3082b61e53b9e9f95743188cbd51
Reviewed-on: http://review.whamcloud.com/18274
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
2 years agoNew tag 2.9.52 2.9.52 v2_9_52 v2_9_52_0
Oleg Drokin [Tue, 24 Jan 2017 05:26:21 +0000 (00:26 -0500)]
New tag 2.9.52

Change-Id: I7fd2714e14825f5966751a6d0a66313c1e3088b5
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8587 utils: Fix incorrect indenting in llapi_hsm_log_ct_progress 70/24570/3
Oleg Drokin [Sun, 1 Jan 2017 18:45:06 +0000 (13:45 -0500)]
LU-8587 utils: Fix incorrect indenting in llapi_hsm_log_ct_progress

gcc6 highlights this case of incorrect indenting:
        if (progress_type == CT_RUNNING)
                rc = llapi_json_add_item(&json_items, "current_bytes",
                                         LLAPI_JSON_BIGNUM, &current);
                if (rc < 0)
                        goto err;

Just add the braces around, though logic-wise it's all fine.

Change-Id: I770857fe2f9ce29817558247ce0987842b8d06b4
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/24570
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-8996 kernel: kernel update RHEL6.8 [2.6.32-642.13.1.el6] 78/24878/2
Bob Glossman [Tue, 10 Jan 2017 17:06:03 +0000 (09:06 -0800)]
LU-8996 kernel: kernel update RHEL6.8 [2.6.32-642.13.1.el6]

Update RHEL6.8 kernel to 2.6.32-642.13.1.el6

Test-Parameters: trivial clientdistro=el6.8 mdsdistro=el6.8 ossdistro=el6.8 \
  mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I3d8ffa55ff050503a69c4db260ac6b915564349a
Reviewed-on: https://review.whamcloud.com/24878
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8929 lfsck: dumper gets current position properly 69/24869/2
Fan Yong [Tue, 27 Sep 2016 15:12:12 +0000 (23:12 +0800)]
LU-8929 lfsck: dumper gets current position properly

It is normal for the LFSCK iteration to have finished when the
dump handler finds the status as LS_SCANNING_PHASE1; this may be
because of a race, or because the LFSCK failed to update the status.
In such cases, the dump handler will use the position in
the last checkpoint as the current position. It may not be
100% accurate, but that is not a serious issue.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I672258baa9d0b0aa8ec12249c13b2b147a274ab4
Reviewed-on: https://review.whamcloud.com/24869
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7481 utils: label lustre device correctly 45/24845/3
Hongchao Zhang [Sat, 8 Oct 2016 19:49:33 +0000 (03:49 +0800)]
LU-7481 utils: label lustre device correctly

Currently, the device label is read before mounting Lustre, and
the flags LDD_F_VIRGIN and LDD_F_WRITECONF are set according
to the label (the corresponding original flags contained in the
lustre_disk_data are ignored). But the device label could be
changed during mount by the journal recovery, and the device should
also be labeled to indicate the target device is supposed to start.

Test-Parameters: testlist=conf-sanity,conf-sanity,conf-sanity

Change-Id: I2df1d81f764a7d1ffa26afb0197d137c057a25e9
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/24845
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8932 lnet: define new network driver ptl4lnd 68/24768/3
Gregoire Pichon [Tue, 13 Dec 2016 13:41:42 +0000 (14:41 +0100)]
LU-8932 lnet: define new network driver ptl4lnd

Assign an ID to the new network driver ptl4lnd developed by Bull
that implements a LND based on Portals 4 API. It is intended to be
used with BXI, the Bull interconnect hardware.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: I38e505916899dc7f01b3ad0372c9f068fa06f308
Reviewed-on: https://review.whamcloud.com/24768
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years agoLU-7429 tests: generate dangling name entry properly 63/24763/2
Fan Yong [Sun, 25 Sep 2016 00:24:10 +0000 (08:24 +0800)]
LU-7429 tests: generate dangling name entry properly

There may be some creations after the dangling injection, such as
creating ".lustre/lost+found/MDTxxxx" for LFSCK or creating some quota
related local files. These creations may reuse the just released
local object/inode that is referenced by the dangling name entry.
That will defeat the dangling injection, because the subsequent LFSCK
will not find the dangling name entry. So before deleting the target
object for the dangling name entry, remove some other objects to
avoid the target object being reused by some potential creations.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0e1bb60c1095119e10009fe2a8ce38687e3e7692
Reviewed-on: https://review.whamcloud.com/24763
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4423 ptlrpc: use 64-bit times for ptlrpc_sec 10/24710/3
Arnd Bergmann [Wed, 11 Jan 2017 01:50:19 +0000 (20:50 -0500)]
LU-4423 ptlrpc: use 64-bit times for ptlrpc_sec

Here we use an unsigned long to store the timeout for gc,
which is probably safe until 2106, but this patch converts it
to use ktime_get_real_seconds() and time64_t for consistency.
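
The 2106 figure follows from simple arithmetic on a 32-bit unsigned
seconds counter starting in 1970. The helper below is a hypothetical
user-space illustration of that calculation, using the mean Gregorian
year length as an approximation.

```c
#include <stdint.h>

/*
 * Approximate calendar year in which a seconds-since-1970 counter
 * with the given maximum value wraps around.
 */
int wrap_year(uint64_t max_seconds)
{
	const double secs_per_year = 365.2425 * 24 * 60 * 60;

	return 1970 + (int)((double)max_seconds / secs_per_year);
}
```

wrap_year(UINT32_MAX) gives 2106 for an unsigned 32-bit counter, while
wrap_year(INT32_MAX) gives 2038 for a signed one.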

Linux-commit: 8cc980713ec9e6847896891c54562ad815c33424

Change-Id: I9c66ac818239debe676b78fbee5764cd5b69028c
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24710
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-4423 ptlrpc: use 64-bit times in ptlrpc_enc_page_pool 08/24708/2
Arnd Bergmann [Wed, 4 Jan 2017 21:40:12 +0000 (16:40 -0500)]
LU-4423 ptlrpc: use 64-bit times in ptlrpc_enc_page_pool

ptlrpc_enc_page_pool computes time deltas using 'long' values from
get_seconds(). This is probably safe beyond y2038, but it's better
to use monotonic times and 64-bit values here for consistency.

Linux-commit: 80018a9edbc3180ae31a7197f9dacab975a7f5e2

Change-Id: I15511b04d8b4f7353ce109d8d9bb7887f551d880
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24708
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-4423 lnet: Better cookie gen 82/24682/6
Tina Ruchandani [Thu, 5 Jan 2017 14:53:30 +0000 (09:53 -0500)]
LU-4423 lnet: Better cookie gen

api-ni.c uses do_gettimeofday to get a 'cookie' or timestamp.
This patch replaces it with ktime_get_ns for the following reasons:

1. ktime_get_ns returns a __u64 which is safer than 'struct timeval'
   which will overflow on 32-bit systems in year 2038 and beyond.
2. Improved resolution: nsecs instead of usecs.
3. Reduced compute: ktime_get_ns is faster than the multiply/add
   combination used in this function

Linux-commit: 9056be30542bfff51190bdda67088f319cf4c9f5

Drop unneeded wrapper function. Remove the function
lnet_create_interface_cookie() and replace its call
with the function ktime_get_ns().

Linux-commit: 7bcd831b8579212303ec7c30e975432b914493dc

The ln_interface_cookie is used to ensure that a node can tell whether
the following sequence of events has happened:

1. node sends GET or PUT to peer
2. node is rebooted
3. peer sends REPLY or ACK to node

The ln_interface_cookie is set once, when LNet starts, and remains
unchanged afterwards. To avoid accidentally obtaining the same cookie
after a reboot, the code generated this cookie using ktime_get_ns().
Once generated, the value of the cookie is not interpreted, only
compared for equality. Olaf Weber reported that due to the use of
ktime_get_ns() a small chance exists of generating a cookie of identical
value across reboots. Using ktime_get_real_ns() removes any chance of
this happening.
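
The difference between the two clocks can be seen from user space with
the POSIX analogues CLOCK_MONOTONIC and CLOCK_REALTIME. The sketch
below is not the LNet code; cookie_monotonic() and cookie_real() are
hypothetical names used only to contrast the two sources.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Read the given clock as a single 64-bit nanosecond count. */
static uint64_t clock_ns(clockid_t id)
{
	struct timespec ts;

	clock_gettime(id, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Monotonic time counts from an arbitrary starting point (typically
 * boot), so similar values can recur after a reboot.
 */
uint64_t cookie_monotonic(void)
{
	return clock_ns(CLOCK_MONOTONIC);
}

/* Wall-clock time counts from 1970 and keeps advancing across reboots. */
uint64_t cookie_real(void)
{
	return clock_ns(CLOCK_REALTIME);
}
```

Because the cookie is only compared for equality, any source that does
not repeat across reboots is sufficient; the wall clock is one such
source as long as it is not set backwards.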

Change-Id: I159a0ff2573afb87f279a8e8f282b0ac076d9bf3
Signed-off-by: Tina Ruchandani <ruchandani.tina@gmail.com>
Signed-off-by: Shivani Bhardwaj <shivanib134@gmail.com>
Suggested-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24682
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4423 libcfs: Use swap() in cfs_hash_bd_order() 76/24576/2
Amitoj Kaur Chawla [Mon, 2 Jan 2017 00:19:16 +0000 (19:19 -0500)]
LU-4423 libcfs: Use swap() in cfs_hash_bd_order()

Use swap() function instead of using a temporary variable for swapping
two variables.

The Coccinelle semantic patch used to make this change is as follows:
//<smpl>
@@
type T;
T a,b,c;
@@
- a = b;
- b = c;
- c = a;
+ swap(b, c);
//</smpl>
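
For readers outside the kernel tree, a user-space approximation of
such a swap helper is sketched below. SWAP() and order_pair() are
hypothetical names for this example, and the macro relies on the
GCC/Clang __typeof__ extension.

```c
/* Type-generic swap of two lvalues through a temporary. */
#define SWAP(a, b) \
	do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Example caller: order two values so that *lo <= *hi. */
void order_pair(int *lo, int *hi)
{
	if (*lo > *hi)
		SWAP(*lo, *hi);
}
```

The macro replaces the three hand-written assignments that the
semantic patch matches, which avoids declaring a temporary at every
call site.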

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Change-Id: I19aa52fe4e05fed2e03c1d8515731a5ce01b3d09
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24576
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-4315 docs: Fix Makefile.am to have one man page per line 71/24371/4
Steve Guminski [Thu, 15 Dec 2016 15:20:05 +0000 (10:20 -0500)]
LU-4315 docs: Fix Makefile.am to have one man page per line

The man pages for lfs(1) and lctl(8) are quite large. Splitting
them into one page per subcommand will allow for more detailed
information for each subcommand.

This patch modifies the Makefile.am so that the source man pages
are listed one per line.  This will make it easier for subsequent
patches to add new pages.  Existing pages that were missing from
the Makefile.am have been added to it, and an obsolete page has
been removed from the Makefile.am and been deleted.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: If312fff9bc5e68176caf0a70a51876e69b1614d8
Reviewed-on: https://review.whamcloud.com/24371
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years agoLU-8926 llite: reduce jobstats race window 53/24253/5
Patrick Farrell [Tue, 13 Dec 2016 15:43:34 +0000 (09:43 -0600)]
LU-8926 llite: reduce jobstats race window

In the current code, lli_jobid is set to zero on every call
to lustre_get_jobid.  This causes problems, because it's
used asynchronously to set the job id in RPCs, and some
RPCs will falsely get no jobid set.  (For small IO sizes,
this can be up to 60% of RPCs.)

It would be very expensive to put hard synchronization
between this and every outbound RPC, and it's OK to very
rarely get an RPC without correct job stats info.

This patch only updates the lli_jobid when the job id has
changed, which leaves only a very small window for reading
an inconsistent job id.
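
The update-only-on-change idea can be sketched in user space as
follows. JOBID_SIZE and jobid_update() are hypothetical names for this
illustration and do not mirror the llite code.

```c
#include <string.h>

#define JOBID_SIZE 32

/*
 * Copy new_id into cached only if it differs from what is already
 * there. Returns 1 if cached was rewritten, 0 if it was left alone.
 */
int jobid_update(char cached[JOBID_SIZE], const char *new_id)
{
	if (strncmp(cached, new_id, JOBID_SIZE - 1) == 0)
		return 0;       /* unchanged: the buffer is never touched */
	strncpy(cached, new_id, JOBID_SIZE - 1);
	cached[JOBID_SIZE - 1] = '\0';
	return 1;
}
```

In the common case where the job id stays the same, the cached buffer
is never written, so a concurrent reader cannot observe it empty.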

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I6c3a7f8683dc5f5d467940920938db18b0c20462
Reviewed-on: https://review.whamcloud.com/24253
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
2 years agoLU-8855 llite: return small device numbers for compat stat() 77/23877/3
John L. Hammond [Mon, 21 Nov 2016 15:22:52 +0000 (09:22 -0600)]
LU-8855 llite: return small device numbers for compat stat()

The compat_sys_*stat*() syscalls will fail unless the device major
and minor numbers are both less than 256. So in ll_getattr_it(), if we are in
32 bit compat mode then coerce the device numbers into the expected
format.
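
One plausible way to perform such a coercion is to keep only the low
8 bits of each number and pack them in the old 16-bit layout. The
sketch below is an assumption made for illustration;
compat_encode_dev() is a hypothetical name and the patch itself may
coerce the values differently.

```c
#include <stdint.h>

/*
 * Keep the low 8 bits of major and minor and pack them as a 16-bit
 * device number with the major in the high byte.
 */
uint16_t compat_encode_dev(unsigned int major, unsigned int minor)
{
	return (uint16_t)(((major & 0xff) << 8) | (minor & 0xff));
}
```

With this masking, both fields are always below 256, at the cost of
distinct large device numbers possibly mapping to the same value.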

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1bf13258902e13c76b9ebf3476fd1767712de0b3
Reviewed-on: https://review.whamcloud.com/23877
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-8821 mdt: avoid double find in mdt_path_current() 01/23701/2
John L. Hammond [Thu, 12 Nov 2015 15:49:09 +0000 (09:49 -0600)]
LU-8821 mdt: avoid double find in mdt_path_current()

In mdt_path_current() avoid finding the object we are already holding
a reference to.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iae4796047d2c5b02989d29baf2e7620545f7e45c
Reviewed-on: https://review.whamcloud.com/23701
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8066 obdclass: move lustre sysctl to sysfs 28/23428/6
Oleg Drokin [Sun, 1 Jan 2017 19:11:04 +0000 (14:11 -0500)]
LU-8066 obdclass: move lustre sysctl to sysfs

Backport from upstream the changes to port lustre
sysctl to sysfs. Needed to re-export the function
lprocfs_read_frac_helper for later work. The
following patches were backported:

Fix class_procfs_init error return value. Dan Carpenter noticed
that procfs conversion patches introduced a bug where, should
kobject_create_and_add fail, an error is not returned from
class_procfs_init.

Linux-commit: 3c4872f94359ed38a1392c0a9238c48a9aee6f8f

Move sysctl timeout to sysfs. This is the first step of
moving lustre sysctls from /proc/sys/lustre to /sys/fs/lustre

Linux-commit: e2424a1265f2772b66f068c205256e2aef5f74a0

Move max_dirty_mb from sysctl to sysfs. max_dirty_mb is
now a parameter in /sys/fs/lustre.

Linux-commit: df476a4d5de09d9324b108fc9c5ff2c00a0850d0

Move debug controls to sysfs. The debug_peer_on_timeout,
dump_on_timeout and dump_on_eviction controls move from
/proc/sys/lustre to /sys/fs/lustre.

Linux-commit: 9e7fa14935901bcd09576b2866d5dd15f69caf83

Move AT controls from sysctl to sysfs. Adaptive Timeouts
controls are being moved from /proc/sys/lustre to
/sys/fs/lustre

Linux-commit: bcef118e7ed67e28edcaab9be9ca11412176c540

Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Change-Id: Id1b00bebf9ecca5284e9c71f4c0f91e56cbf391b
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23428
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-4423 ldlm: use 64-bit time for pl_recalc 50/23350/4
Arnd Bergmann [Sun, 1 Jan 2017 16:41:33 +0000 (11:41 -0500)]
LU-4423 ldlm: use 64-bit time for pl_recalc

The ldlm pool calculates elapsed time by comparing the previous and
current get_seconds() values, which is unsafe on 32-bit machines
after 2038.

This changes the code to use time64_t and ktime_get_real_seconds(),
keeping the 'real' instead of 'monotonic' time because of the
debug prints.

Linux-commit: 8f83409cf2382c968f96877368cd5b542b92af1d
Linux-commit: b8cb86fd95bb461c3496e1f4b4083b198c963a9c

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Change-Id: I81cca5b529dbf5615cf46461ad1c9179fdee7835
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23350
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-8710 ptlrpc: use current CPU instead of hardcoded 0 05/23305/3
Dmitry Eremin [Fri, 21 Oct 2016 12:38:46 +0000 (15:38 +0300)]
LU-8710 ptlrpc: use current CPU instead of hardcoded 0

Fix a crash when CPU 0 is disabled.

Change-Id: I8ac5a10f544a1c8fc454bc64a6bb1d3607240be9
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/23305
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-8703 libcfs: remove useless CPU partition code 03/23303/3
Dmitry Eremin [Fri, 21 Oct 2016 11:56:31 +0000 (14:56 +0300)]
LU-8703 libcfs: remove useless CPU partition code

 * remove scratch buffer and the mutex which guards it.
 * remove global cpumask and the spinlock which guards it.
 * remove cpt_version, which checked for CPU state changes during
   setup; CPU state changes are now simply disabled during setup.
 * remove the whole global struct cfs_cpt_data cpt_data.
 * remove a few unused APIs.

Change-Id: I0cc853d57952e76cf32801838a19e6872905aaa0
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/23303
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-4423 obd: use ktime_t for calculating elapsed time 46/23146/9
Arnd Bergmann [Tue, 10 Jan 2017 23:16:11 +0000 (18:16 -0500)]
LU-4423 obd: use ktime_t for calculating elapsed time

process_param2_config() tries to print how much time has passed
across a call_usermodehelper() function, and uses struct timeval
for that.

We want to remove this structure, so this is better expressed
in terms of ktime_t and ktime_us_delta().
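
The same elapsed-time calculation can be sketched in user space with a
single 64-bit nanosecond value. ns_time_t and us_delta() are
hypothetical names for this illustration, loosely modelled on the way
ktime_us_delta() takes the later time first.

```c
#include <stdint.h>

/* Nanosecond timestamp held in one signed 64-bit value. */
typedef int64_t ns_time_t;

/* Microseconds elapsed from 'earlier' to 'later'. */
int64_t us_delta(ns_time_t later, ns_time_t earlier)
{
	return (later - earlier) / 1000;
}
```

Working on one scalar avoids the separate seconds and microseconds
fields of struct timeval and the borrow handling they require.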

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Change-Id: I90c61d9d49ee0d500772f1b370790e37859f18b2
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/23146
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-7471 tests: Modified make_custom_file_for_progress fn 46/17346/7
Aditya Pandit [Tue, 24 Nov 2015 09:35:40 +0000 (15:05 +0530)]
LU-7471 tests: Modified make_custom_file_for_progress fn

After executing tests like test_200, if there is not enough
space, the function make_custom_file_for_progress returns
1 on error, but this error was not being caught properly.
Modified the code to catch the error properly.

Test-Parameters: trivial testlist=sanity-hsm

Seagate-bug-id: MRP-3026
Signed-off-by: Mikhail V. Pridushchenko <mikhail.v.pridushchenko@seagate.com>
Signed-off-by: Aditya Pandit <aditya.pandit@seagate.com>
Change-Id: I0db45e327d6f49d066b5c631096a459b8acd2758
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-on: https://review.whamcloud.com/17346
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-3764 tests: clean up sanity test_116a code style 82/7882/4
Andreas Dilger [Tue, 8 Oct 2013 18:56:06 +0000 (12:56 -0600)]
LU-3764 tests: clean up sanity test_116a code style

Clean up the code style for sanity.sh test_116a():
- proper indentation
- don't use $ for variables inside $((...))
- one statement per line

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I5fe502b85365e00ce34b90f8fabfb58320a07e51
Reviewed-on: https://review.whamcloud.com/7882
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9019 osd : remove struct timeval use in osd-ldiskfs 96/24896/2
James Simmons [Fri, 13 Jan 2017 21:39:23 +0000 (16:39 -0500)]
LU-9019 osd : remove struct timeval use in osd-ldiskfs

For brw_stats_show change the output to use a timespec64
type to avoid the overflow. Also change the format to
print the sub-second portion as 9 digits (nanoseconds)
for clarity, rather than printing six digits without
leading zeroes.
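
The effect of zero-padding the sub-second field can be shown with a
small user-space sketch. format_ts() is a hypothetical helper, not the
brw_stats_show code.

```c
#include <stdio.h>

/*
 * Format seconds and nanoseconds as "sec.nnnnnnnnn" with the
 * nanosecond field always padded to nine digits.
 * Returns the value returned by snprintf().
 */
int format_ts(char *buf, size_t len, long long sec, long nsec)
{
	return snprintf(buf, len, "%lld.%09ld", sec, nsec);
}
```

Without the padding, 12 seconds plus 5000 ns would print as "12.5000",
which reads as a much larger fraction than "12.000005000".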

Both osd_write_prep() and osd_read_prep() want to report
the time it took to perform their operations in the
osd_get_page counter. This is currently done with
a call to do_gettimeofday with struct timeval which is
not 2038 safe. Move this operation to 64 bit time
handling.

Change-Id: I457ac799d855d2596220b6e0d0c5039e8b00021f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/24896
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8562 osp: osp_precreate_thread gets stuck after disconnect 58/24758/3
Ned Bass [Sat, 7 Jan 2017 01:43:47 +0000 (17:43 -0800)]
LU-8562 osp: osp_precreate_thread gets stuck after disconnect

osp_precreate_thread() can get stuck because d->opd_got_disconnected
never gets reset. When opd_got_disconnected is set,
osp_precreate_cleanup_orphans() returns early with EAGAIN and can't
clear d->opd_pre_recovering. And because d->opd_pre_recovering can't
be cleared we always break out of the while loop where
d->opd_got_disconnected normally gets reset. So
osp_precreate_cleanup_orphans() is stuck always failing.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: I0b4f4e2e55e7a8d7ffae633a4d3c578b4a484ae2
Reviewed-on: https://review.whamcloud.com/24758
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8840 osp: handle EA cache properly 82/23782/10
Fan Yong [Thu, 22 Sep 2016 08:54:55 +0000 (16:54 +0800)]
LU-8840 osp: handle EA cache properly

In the success case, dt_xattr_get() should return the EA size
instead of zero. If the EA does not exist, return -ENODATA.
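
The return convention can be sketched with a small user-space lookup.
struct ea_entry and ea_get() are hypothetical names for this example,
and the -ERANGE case for a short buffer is an assumption added for
completeness rather than something stated in this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct ea_entry {
	const char *name;
	const char *value;
	size_t      size;
};

/*
 * Look up 'name' in a small table. Returns the EA size on success,
 * -ENODATA if the EA does not exist, -ERANGE if buf is too small.
 */
int ea_get(const struct ea_entry *tbl, size_t n, const char *name,
	   void *buf, size_t buflen)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (strcmp(tbl[i].name, name) != 0)
			continue;
		if (tbl[i].size > buflen)
			return -ERANGE;
		memcpy(buf, tbl[i].value, tbl[i].size);
		return (int)tbl[i].size;
	}
	return -ENODATA;
}
```

Returning the size lets a caller distinguish an existing EA from a
missing one through the return value alone.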

More code cleanup for OSP EA cache to avoid potential reference
leak, buffer overflow, and so on.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I352b99b1ed08f1b15bdb8da2bf28689ae2d61c23
Reviewed-on: https://review.whamcloud.com/23782
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>