Whamcloud - gitweb
fs/lustre-release.git
2 years ago LU-4277 scripts: ofd status integrated with zpool status 07/30907/7
Nathaniel Clark [Wed, 24 Jan 2018 13:35:05 +0000 (08:35 -0500)]
LU-4277 scripts: ofd status integrated with zpool status

Add a zedlet to the ZFS ZED that marks the OFD as degraded/undegraded
when a zpool is degraded or online, respectively.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ia8ec3cf3a31ce24d8598d690bcb0356245712858
Reviewed-on: https://review.whamcloud.com/30907
Tested-by: Jenkins
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10452 lnet: cleanup YAML output 45/30845/6
Amir Shehata [Fri, 12 Jan 2018 03:29:54 +0000 (19:29 -0800)]
LU-10452 lnet: cleanup YAML output

The level of verbosity is high when exporting the YAML configuration
for the purpose of storing it to reconfigure a node. This patch
eliminates the YAML lines that are not needed when reconfiguring
a node, such as statistics, status, etc.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie57c761415cfb0ceee8b2dbc0b293e85ae415685
Reviewed-on: https://review.whamcloud.com/30845
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9158 quota: adjust quota ASAP 65/30765/8
Hongchao Zhang [Wed, 6 Dec 2017 08:49:54 +0000 (16:49 +0800)]
LU-9158 quota: adjust quota ASAP

In qsd_upd_thread, a quota adjust request is only scheduled
to run when the current time (in seconds) is larger than the
queued time (in seconds). The transactions in subtest 12b of
sanity_quota are all committed within the same second, which
causes the quota not to be freed.
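
A minimal sketch of the timing issue described above, not the actual
qsd_upd_thread code. With whole-second timestamps, a strict "now > queued"
test never fires for a request queued in the same second. The function
names are hypothetical, and the relaxed comparison is only one possible
fix, not necessarily the one this patch applies.

```python
def ready_strict(now_sec: int, queued_sec: int) -> bool:
    """Run only once the clock has moved past the queued second."""
    return now_sec > queued_sec


def ready_asap(now_sec: int, queued_sec: int) -> bool:
    """Assumed relaxation: run as soon as the queued second is reached."""
    return now_sec >= queued_sec
```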

Test-Parameters: alwaysuploadlogs \
envdefinitions=ENABLE_QUOTA=yes,DEBUG_SIZE=64,PTLDEBUG=rpctrace \
clientcount=2 osscount=2 mdscount=2 mdtcount=4 \
austeroptions=-R mdtfilesystemtype=zfs ostfilesystemtype=zfs \
testlist=sanity-quota,sanity-quota,sanity-quota,sanity-quota

Change-Id: I9310237d58a21ee8d47daab8901892bd12016339
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/30765
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9378 utils: split getstripe and find from lfs.1 64/30464/10
Andreas Dilger [Sat, 9 Dec 2017 11:26:54 +0000 (04:26 -0700)]
LU-9378 utils: split getstripe and find from lfs.1

Split the getstripe and find commands from the lfs.1 man page into
their own lfs-getstripe.1 and lfs-find.1 man pages.

While updating the lfs-find.1 man page I realized that the short
options for "-print" and "-print0" were incorrectly documented
in both the usage message as well as the man page, which implies
that the short options were rarely, if ever, used.

Fix the "--print" option to be correctly documented as "-P" instead
of "-p", and deprecate the usage of "-p" for "--print0" in favour
of "-0".  This gives us the opportunity to reclaim "-p" for "--pool",
which is already used as such for "lfs df", "lfs getstripe", and
"lfs setstripe", after some period of printing a deprecation warning.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I9aa7a415d109d269c646fd034ea77785a94cab07
Reviewed-on: https://review.whamcloud.com/30464
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-5152 quota: enforce block quota for chgrp 46/30146/23
Hongchao Zhang [Sat, 27 Jan 2018 21:21:35 +0000 (05:21 +0800)]
LU-5152 quota: enforce block quota for chgrp

When an unprivileged user calls chgrp to change the group
of one of their files, the block quota limit of the new group
should be checked to ensure that it is not exceeded.

The side effects of this patch could be:
1. chgrp from a non-privileged user will be very slow, whether
or not quota is enabled. Since we assume that chgrp issued by a
non-privileged user is very rare, the performance impact is
probably acceptable.
2. If the MDT crashes while performing chgrp, an inconsistency
(in group ownership between MDT and OST objects) will be created.
This should be acceptable.

This patch also fixes a bug in calculating the disk space used by
a file on ldiskfs and zfs: the block unit is always 512.

Change-Id: I4b781e94493fe63c8cbd5700dc68293b2504c2ac
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/30146
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10193 tests: test migration between ldiskfs and zfs 06/30106/10
Fan Yong [Mon, 22 Jan 2018 05:44:14 +0000 (13:44 +0800)]
LU-10193 tests: test migration between ldiskfs and zfs

New test cases in conf-sanity.sh

test_108a: migrate from ldiskfs to zfs
test_108b: migrate from zfs to ldiskfs

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibb4749c316f51b0820648e59235a03a9656f762e
Reviewed-on: https://review.whamcloud.com/30106
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10193 osd-ldiskfs: backup index object with plain format 11/30911/11
Fan Yong [Thu, 25 Jan 2018 08:48:00 +0000 (16:48 +0800)]
LU-10193 osd-ldiskfs: backup index object with plain format

This patch is mainly for migrating a filesystem between the ZFS
backend and the ldiskfs backend via server file-level backup
and restore. It dumps the specially formatted ldiskfs index
object to the local '/index_backup' directory, named with the
source index's FID string and a ".lbx" suffix, when the device
is unmounted.

The format of the backup is as follows (same as the ZFS case):
1) header: 512 bytes, including:
   magic:       4 bytes
   count:       4 bytes
   keysize:     4 bytes
   recsize:     4 bytes
   owner_fid:   16 bytes
   padding:     480 bytes

2) body: after the header, <key, rec> pairs one by one.
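
The header layout above can be sketched with Python's struct module.
This is an illustration only: the byte order (little-endian here) and the
magic value are assumptions, since the commit message specifies only the
field sizes.

```python
import struct

# Four 4-byte fields, a 16-byte owner FID, and 480 bytes of padding.
# "<" (little-endian, no alignment) is an assumption.
HEADER = struct.Struct("<IIII16s480x")

EXAMPLE_MAGIC = 0x12345678  # hypothetical value, not the real magic


def pack_header(magic: int, count: int, keysize: int, recsize: int,
                owner_fid: bytes) -> bytes:
    """Build a 512-byte backup header from its fields."""
    return HEADER.pack(magic, count, keysize, recsize, owner_fid)
```

The field sizes add up to 4 + 4 + 4 + 4 + 16 + 480 = 512 bytes, matching
the stated header size.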

The backup is done when the server is unmounted. The backup
behavior is controlled via the new OSD lproc interface
"index_backup". It is off by default. You can enable backup on
server unmount by writing a non-zero value to this lproc interface.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I5ac81dd470f3cb29eb3c9ec0e01935c9b1a0fda9
Reviewed-on: https://review.whamcloud.com/30911
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10193 osd-zfs: backup index object with plain format 10/30910/16
Fan Yong [Thu, 25 Jan 2018 08:47:20 +0000 (16:47 +0800)]
LU-10193 osd-zfs: backup index object with plain format

Lustre uses ZAP to implement index objects. When an index object
is archived with tar via the backend ZPL for backup, it is treated
as a regular file. When it is later extracted, it is no longer ZAP
formatted, so Lustre cannot recognize the 'bad' formatted index
object.

On the other hand, each backend FS has its own special format
for index objects, so we cannot migrate the index files from
one backend to another directly.

To resolve this issue, the patch backs up the index object
in plain format to the local '/index_backup' directory, named
with the source index's FID string and a ".lbx" suffix, when
the device is unmounted.

The format of the backup is as follows:
1) header: 512 bytes, including:
   magic:       4 bytes
   count:       4 bytes
   keysize:     4 bytes
   recsize:     4 bytes
   owner_fid:   16 bytes
   padding:     480 bytes

2) body: after the header, <key, rec> pairs one by one.
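
As a rough illustration of how such a backup might be read back, the
sketch below parses the header and then walks the <key, rec> pairs.
It assumes little-endian fields and fixed-size pairs packed back to back
with no per-pair padding; neither detail is stated in the commit message.

```python
import struct

HEADER = struct.Struct("<IIII16s480x")  # byte order is an assumption


def read_backup(blob: bytes):
    """Return (owner_fid, [(key, rec), ...]) from a backup image."""
    _magic, count, keysize, recsize, owner_fid = HEADER.unpack_from(blob, 0)
    pairs = []
    offset = HEADER.size  # the body starts right after the 512-byte header
    for _ in range(count):
        key = blob[offset:offset + keysize]
        rec = blob[offset + keysize:offset + keysize + recsize]
        pairs.append((key, rec))
        offset += keysize + recsize
    return owner_fid, pairs
```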

The backup is done when the server is unmounted. The backup
behavior is controlled via the new OSD lproc interface
"index_backup". It is off by default. You can enable backup on
server unmount by writing a non-zero value to this lproc interface.

Test-Parameters: envdefinitions=SLOW=yes testlist=sanity-scrub mdtfilesystemtype=zfs ostfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I01730bc9cfa3ae597f2d8652df9fb76418cf55ce
Reviewed-on: https://review.whamcloud.com/30910
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0 15/29215/12
Martin Schroeder [Tue, 13 Jun 2017 14:42:57 +0000 (10:42 -0400)]
LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0

This enables compatibility with the current LTS flavours of Ubuntu.
Do note that you need the Xenial HWE Kernel for Ubuntu 14.04.5, as
that distribution originally used a 3.x series Kernel.

The patches have been developed to apply cleanly to the kernel versions
4.4.0-45.66 to 4.4.0-85.108 from Ubuntu Xenial (and its Trusty backports).

This change also adjusts the Debian scripting to produce the
ldiskfs modules and the server utilities.

To create the server modules run "./configure" with "--enable-server"
and specify "--enable-ldiskfs" and "--with-zfs/-spl" as appropriate.

The call to "make debs" will then produce the server modules and
utils instead of their client versions.

NOTE: This contains a small hack taken from LU-9995 / #29130

Test-Parameters: trivial
Signed-off-by: Martin Schroeder <martin.h.schroeder@intel.com>
Change-Id: I02cd5e9314367ad4e1f8f3d81712f84441a8bc71
Reviewed-on: https://review.whamcloud.com/29215
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9919 lnet: safe access in debug print 71/28771/3
Amir Shehata [Mon, 28 Aug 2017 22:09:21 +0000 (15:09 -0700)]
LU-9919 lnet: safe access in debug print

Move debug print within the cpt lock to keep
peer access safe.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ic37ff0973367b3eb9cbc0059ffee9c31ecf98c34
Reviewed-on: https://review.whamcloud.com/28771
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7501 utils: clean up lfs argument handling/docs 92/28592/15
Andreas Dilger [Thu, 17 Aug 2017 23:33:47 +0000 (17:33 -0600)]
LU-7501 utils: clean up lfs argument handling/docs

Change "mdt-hash" option in lfs_setstripe() and lfs_setdirstripe
to use C99 formatting as used for other options.

Add comments for already-used options to lfs_find(), lfs_getstripe(),
and lfs_setstripe() to avoid conflicts in the future.

A few initializers can fit onto the same line with minor formatting
changes, better to be more compact than a slave to exact formatting.

Remove options that are obsoleted by LUSTRE_VERSION_CODE after 2.10.
Remove over-zealous deprecation of "lfs mkdir -c".

Sort options to be in mostly alphabetical order, unless the long
option parsing would return a deprecated short option.

Add deprecation warnings for short/long options that were deprecated
already in commit cdeb2f3a56e8 (http://review.whamcloud.com/22581).

Fix up lfs-setdirstripe.1 and lfs-getdirstripe.1 man pages to list
preferred option names.  Also, lfs-getdirstripe.1 listed some options
that never existed, and others that were named incorrectly.

Move test scripts over to use preferred command and option names.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I74a59ce372115ae0906d0feb37c539a450bed6bd
Reviewed-on: https://review.whamcloud.com/28592
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9727 nodemap: add audit_mode flag to nodemap 13/28313/15
Sebastien Buisson [Wed, 2 Aug 2017 09:44:33 +0000 (18:44 +0900)]
LU-9727 nodemap: add audit_mode flag to nodemap

Give the ability to specify an audit_mode flag on a nodemap.
When set to 1, a client belonging to this nodemap will be able to
record file system access events to the Changelogs, if Changelogs are
otherwise activated.
When set to 0, events are not logged into the Changelogs, regardless
of whether Changelogs are activated.
By default, the audit_mode flag is set to 1 in newly created nodemap
entries. It is also set to 1 on the 'default' nodemap.

The idea of disabling audit on a per-nodemap basis is that it would
be possible to have some nodes (e.g. backup, HSM agent nodes) that do
not flood the audit logs.
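
The resulting decision can be summarized in a small sketch. The function
name and arguments are hypothetical and only mirror the rules stated
above; this is not the nodemap code itself.

```python
def should_record(changelogs_active: bool, audit_mode: int) -> bool:
    """Record an access event only if Changelogs are active and the
    client's nodemap has audit_mode set to 1."""
    return changelogs_active and audit_mode == 1
```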

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ieb6c461c443b1734312afef44680d903deee5398
Reviewed-on: https://review.whamcloud.com/28313
Reviewed-by: Jean-Baptiste Riaux <riaux.jb@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10513 acl: prepare small buffer for ACL RPC reply 16/28116/11
Fan Yong [Mon, 15 Jan 2018 14:45:37 +0000 (22:45 +0800)]
LU-10513 acl: prepare small buffer for ACL RPC reply

For most files, the ACL entries are very limited. In that
case, it is unnecessary to prepare a very large reply buffer
to hold unknown-sized ACL entries for the getattr/open RPCs.
Instead, we can prepare a relatively small buffer, such as
LUSTRE_POSIX_ACL_MAX_SIZE_OLD (260) bytes, which is equal to the
ACL size before patch 64b2fad22a4eb4727315709e014d8f74c5a7f289.
If the target file has too many ACL entries and exceeds the
prepared reply buffer, the MDT replies with an -ERANGE failure
to the client, and the client can then prepare a larger buffer
and try again. Since a file with a large ACL is a rare case, such
retried getattr/open RPCs will not affect real performance
much.
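
The retry scheme described above can be sketched as follows. This is a
simplified model, not the client or MDT code: server_getattr() is a
stand-in for the MDT, and the larger buffer size of 4096 is a hypothetical
value chosen for the example.

```python
import errno

SMALL_ACL_BUF = 260  # LUSTRE_POSIX_ACL_MAX_SIZE_OLD, per the text above


def server_getattr(acl: bytes, reply_buf_size: int):
    """Stand-in for the MDT: fail with -ERANGE if the ACL does not fit."""
    if len(acl) > reply_buf_size:
        return -errno.ERANGE, b""
    return 0, acl


def client_getattr(acl_on_server: bytes, large_buf: int = 4096):
    """Try the small buffer first and retry once with a larger one."""
    attempts = 1
    rc, acl = server_getattr(acl_on_server, SMALL_ACL_BUF)
    if rc == -errno.ERANGE:
        attempts += 1
        rc, acl = server_getattr(acl_on_server, large_buf)
    return rc, acl, attempts
```

In the common case the small buffer suffices and only one RPC is sent;
only files with large ACLs pay for the second round trip.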

The advantage is that it reduces the client side RAM pressure.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4c01b19520cab1cc712e36f3b0225973fba00410
Reviewed-on: https://review.whamcloud.com/28116
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9727 lustre: record CLOSE if OPEN was recorded 29/27929/21
Sebastien Buisson [Tue, 4 Jul 2017 15:21:44 +0000 (00:21 +0900)]
LU-9727 lustre: record CLOSE if OPEN was recorded

Record CL_CLOSE events in changelogs only if the file was opened in
write mode, or if CL_OPEN was recorded.
The Changelogs mask may change between open and close operations,
but it is not a big deal if we have a CL_CLOSE entry with no
matching CL_OPEN. Besides, the Changelogs mask is unlikely to
change often.
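
The rule amounts to a simple predicate, sketched here with hypothetical
names rather than the actual MDD code:

```python
def should_record_close(opened_for_write: bool, open_was_recorded: bool) -> bool:
    """Record CL_CLOSE if the file was opened for write or CL_OPEN was logged."""
    return opened_for_write or open_was_recorded
```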

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5984a4b07b84d84c3860b9b21abc3b19b7fd9b1a
Reviewed-on: https://review.whamcloud.com/27929
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Matthew S <matthew.sanderson@anu.edu.au>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years ago LU-9727 lustre: implement CL_OPEN for Changelogs 14/28214/23
Sebastien Buisson [Tue, 25 Jul 2017 13:45:58 +0000 (09:45 -0400)]
LU-9727 lustre: implement CL_OPEN for Changelogs

Record OPEN events in Changelogs, and add a new changelog
extension named changelog_ext_openmode to hold open mode.
An OPEN changelog entry is in the form:
7 10OPEN  13:38:51.510728296 2017.07.25 0x242
t=[0x200000401:0x2:0x0] ef=0x7 u=500:500 nid=10.128.11.159@tcp m=-w-
By default, recording of OPEN events in Changelogs is disabled.
Note that CREAT events are still recorded even if OPEN events are
disabled.
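
For illustration, a record in the form shown above can be split into its
fields with a few lines of Python. The dictionary keys "index" and "flags"
are this sketch's own labels for the leading positional fields, not names
taken from the patch.

```python
SAMPLE = ("7 10OPEN  13:38:51.510728296 2017.07.25 0x242 "
          "t=[0x200000401:0x2:0x0] ef=0x7 u=500:500 "
          "nid=10.128.11.159@tcp m=-w-")


def parse_changelog(line: str) -> dict:
    """Split a changelog record into positional and key=value fields."""
    tokens = line.split()
    record = {
        "index": int(tokens[0]),
        "type": tokens[1],
        "time": tokens[2],
        "date": tokens[3],
        "flags": int(tokens[4], 16),
    }
    for token in tokens[5:]:
        if "=" in token:
            key, value = token.split("=", 1)
            record[key] = value
    return record
```

The open mode carried by changelog_ext_openmode appears here as the
"m" field.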

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I72c479938ab4782523f1b16aef19fbbc96f43c7f
Reviewed-on: https://review.whamcloud.com/28214
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9727 lustre: add client NID to Changelogs entries 13/28213/18
Sebastien Buisson [Mon, 24 Jul 2017 15:58:05 +0000 (11:58 -0400)]
LU-9727 lustre: add client NID to Changelogs entries

Add a new changelog extension named changelog_ext_nid to hold
client's NID information.
NID info is added to every Changelog entry type except MARK, in
the form 'nid=<nid>':
1 01CREAT 15:50:20.834838318 2017.07.24 0x0 t=[0x200000401:0x2:0x0]
ef=0x3 u=500:500 nid=10.128.11.159@tcp p=[0x200000007:0x1:0x0] fileA

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I1049a699c17d3829d38abfade3187a28ca457bd1
Reviewed-on: https://review.whamcloud.com/28213
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-3397 lprocfs: create "export" /proc file on server 13/6713/17
Emoly Liu [Thu, 9 Oct 2014 16:50:00 +0000 (00:50 +0800)]
LU-3397 lprocfs: create "export" /proc file on server

Similar to the "import" file on the client for each client-to-server
connection, it would be useful to have a file on the server in the
per-nid directory obdfilter/*/exports/$NID/export. This contains
export connection information as in the "import" file, like:
 a793e354-49c0-aa11-8c4f-a4f2b1a1a92b:
     name: MGS
     client: 10.211.55.10@tcp
     connect_flags: [ version, barrier, adaptive_timeouts, ... ]
     connect_data:
        flags: 0x2000011005002020
        instance: 0
        target_version: 2.10.51.0
        export_flags: [ ... ]

Also, sanity.sh test_0d is added to verify this patch.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I60896090e3a8ad872141a8d4299f0698f0a5636a
Reviewed-on: https://review.whamcloud.com/6713
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years ago LU-10003 lnet: clarify lctl deprecation message 30/31030/2
Amir Shehata [Thu, 25 Jan 2018 22:18:10 +0000 (14:18 -0800)]
LU-10003 lnet: clarify lctl deprecation message

Print out the lctl command which is deprecated

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I2d07a609718c205fba172530202e6f0c1b1d2119
Reviewed-on: https://review.whamcloud.com/31030
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-6349 idl: add PTLRPC definitions to enum 58/30958/5
Andreas Dilger [Sat, 11 Nov 2017 07:30:51 +0000 (00:30 -0700)]
LU-6349 idl: add PTLRPC definitions to enum

Add PTLRPC definitions to enums for cleanliness, and use them:

    enum lustre_msg_version for LUSTRE_*_VERSION
    enum lustre_msg_magic for LUSTRE_MSG_MAGIC_*
    enum lustre_msghdr for MSGHDR_*

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ie17619bfc2433339ab32e257d596adf64c2c4ff0
Reviewed-on: https://review.whamcloud.com/30958
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10418 flr: revise lease API 63/30363/11
Jinshan Xiong [Sat, 20 Jan 2018 19:45:50 +0000 (19:45 +0000)]
LU-10418 flr: revise lease API

Introduce two lease APIs, llapi_lease_{acquire,release}(), to replace
the confusing llapi_lease_{get,put}(). Rename llapi_lease_ext_get() to
llapi_lease_set(). Implement a new mirror_extend_layout() to separate
it from lfs_migrate().

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I88ab1f7f27aa81c44418aecf31c9e89e494fc01e
Reviewed-on: https://review.whamcloud.com/30363
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-9972 osd: cache OI mapping in dt_declare_ref_add 09/29709/11
Alex Zhuravlev [Mon, 23 Oct 2017 16:32:15 +0000 (19:32 +0300)]
LU-9972 osd: cache OI mapping in dt_declare_ref_add

Cache the OI mapping so that subsequent calls to manipulate /PENDING
don't need to consult the OI.
Use the OIC in osd_index_ea_delete() to optimize FLDB lookups.

Change-Id: I779bbddb429b577aecf1ad092d74d0802e43d567
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/29709
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7934 tests: compatibility check 47/20847/5
Alexander Boyko [Fri, 17 Jun 2016 06:59:23 +0000 (09:59 +0300)]
LU-7934 tests: compatibility check

The patch adds compatibility check for test 103 replay-single.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Test-Parameters: trivial testlist=replay-single envdefinitions="ONLY=103"
Change-Id: I8c3457a2d18c32d9d1e701003c282e893692f5fc
Reviewed-on: https://review.whamcloud.com/20847
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10531 gss: fix GSS support for DNE 84/30984/2
Sebastien Buisson [Tue, 23 Jan 2018 12:33:52 +0000 (21:33 +0900)]
LU-10531 gss: fix GSS support for DNE

Part of the patch https://review.whamcloud.com/24236 was
inadvertently reverted by patch https://review.whamcloud.com/27823.
So re-apply the missing part to have full DNE support in the GSS code.

This patch is necessary because with DNE, an OSP can be used on MDT
for connections to other MDTs. So to determine exactly every
connection's purpose, use client_obd's cl_sp_to in
import_to_gss_svc().

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I029dbfc52eeabb2c6f0f6d2c972ceea1879405e7
Reviewed-on: https://review.whamcloud.com/30984
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years ago LU-10562 tests: fix non-portable bash redirection 50/31050/2
Andreas Dilger [Fri, 26 Jan 2018 22:20:51 +0000 (15:20 -0700)]
LU-10562 tests: fix non-portable bash redirection

Replace non-portable bash redirection with a simple(r) array
assignment.  It was reporting the following error with at least
bash 4.4.26:

   line 5211: syntax error near unexpected token `<'

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8289a4a3a1518c3cb6f33e2d8e52ba22db3ebbe5
Reviewed-on: https://review.whamcloud.com/31050
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10536 build: add -lnvpair to ZFS LDFLAGS 47/30947/2
John L. Hammond [Fri, 19 Jan 2018 21:37:27 +0000 (15:37 -0600)]
LU-10536 build: add -lnvpair to ZFS LDFLAGS

The ZFS mount utils plugin directly depends on libnvpair so add
-lnvpair to the relevant LDFLAGS. This makes building with
--disable-shared succeed.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie7d7043b011dab3e139786ed8fffb09e1078a34b
Reviewed-on: https://review.whamcloud.com/30947
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10531 obd: handle case tgt equals fsname for obdname2fsname 37/30937/2
James Simmons [Fri, 19 Jan 2018 07:55:11 +0000 (02:55 -0500)]
LU-10531 obd: handle case tgt equals fsname for obdname2fsname

The function obdname2fsname() was updated to handle the case
where tgt has the format:

llite.lustre*.xattr_cache
llite.lustre-*.xattr_cache

but the case of tgt being exactly equal to the fsname is also
valid, as in the case of sptlrpc: fsname.srpc.flavor.default.cli2ost

Change-Id: I9bb4082ee5f194842c7d72ae25e14f6998c0b8ed
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30937
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-9859 libcfs: merge UMP and SMP libcfs cpu header code 73/30873/3
James Simmons [Sat, 20 Jan 2018 23:05:44 +0000 (18:05 -0500)]
LU-9859 libcfs: merge UMP and SMP libcfs cpu header code

Currently we have two headers: linux-cpu.h, which contains the SMP
version, and libcfs_cpu.h, which contains the UMP version. We can
simplify these into a single header that handles both cases. Many
cleanups for checkpatch violations are done as well.

Change-Id: Ifb92e59ad370e991220b5957538b6bab89423e2f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30873
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10485 lustre: move LA_* flags to lustre_user.h 25/30825/4
Sebastien Buisson [Wed, 10 Jan 2018 16:24:47 +0000 (01:24 +0900)]
LU-10485 lustre: move LA_* flags to lustre_user.h

The LA_* flags are written to disk as part of the ChangeLog records
in mdd_attr_set_changelog(), which means they are now part of the
on-disk and network protocol, and cannot be changed (at least not the
first 12 bits that are written).
They need to be moved to lustre_user.h.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9fc92e01301e70f0f4e5cd74135b9b2079d63658
Reviewed-on: https://review.whamcloud.com/30825
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
2 years ago LU-10464 kernel: kernel update [SLES12 SP2 4.4.103-92.56] 48/30748/7
Bob Glossman [Thu, 4 Jan 2018 18:57:07 +0000 (10:57 -0800)]
LU-10464 kernel: kernel update [SLES12 SP2 4.4.103-92.56]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I0eb5dc783bcbcd4ee8643431345daefb9cb89eb1
Reviewed-on: https://review.whamcloud.com/30748
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10155 recovery: support setstripe replay 04/30704/8
Lai Siyao [Thu, 4 Jan 2018 01:38:43 +0000 (09:38 +0800)]
LU-10155 recovery: support setstripe replay

A regular file open always reserves space for the LOV EA, which is
used to store the user-specified lov_user_md, or lov_mds_md for
replay. However, if this open is the first open in 'lfs setstripe',
it has neither lov_user_md specified nor lov_mds_md for replay,
because O_LOV_DELAY_CREATE is set. The MDT nevertheless treats the
EA field in the request as valid, so the magic check fails in this
open replay.

This patch fixes this issue on both sides:
1. The client doesn't reserve space for the LOV EA in
   open(O_LOV_DELAY_CREATE). This change is not strictly necessary,
   but it cleans up the code.
2. The server doesn't create OST objects for open(O_LOV_DELAY_CREATE)
   replay.

Add setstripe/setdirstripe replay tests in replay-single.sh, which
are 2c, 2d.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Ia7971523710822308328239f36f7d690314e0e45
Reviewed-on: https://review.whamcloud.com/30704
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10287 flr: lfs mirror verify command 87/30387/20
Jian Yu [Sat, 20 Jan 2018 18:24:25 +0000 (10:24 -0800)]
LU-10287 flr: lfs mirror verify command

This patch adds "lfs mirror verify" command to verify
that each SYNC mirror of a mirrored file contains exactly
the same data. It supports specifying multiple mirrored
files in one command line.

Usage:

lfs mirror verify [--only <mirror_id,mirror_id2[,...]>]
  [--verbose|-v]
  <mirrored_file> [<mirrored_file2> ...]

Options:

--only <mirror_id,mirror_id2[,...]>
Only verify that the mirrors specified by mirror_ids contain
exactly the same data. This option cannot be used when
multiple mirrored files are specified.

--verbose|-v
With this option, the command will print where the
differences are if the data do not match. Otherwise,
the command will just return an error in that case.
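
The comparison that the command performs can be modeled in a few lines.
This sketch works on in-memory byte strings rather than real mirrors and
only reports the first differing offset per mirror; the function names and
message format are hypothetical, and the real command's output may differ.

```python
def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))  # one copy is a prefix of the other
    return None


def verify_mirrors(mirrors: dict, verbose: bool = False):
    """mirrors maps mirror_id -> data. Returns (ok, messages)."""
    ids = sorted(mirrors)
    if not ids:
        return True, []
    reference = ids[0]
    ok = True
    messages = []
    for mirror_id in ids[1:]:
        offset = first_difference(mirrors[reference], mirrors[mirror_id])
        if offset is not None:
            ok = False
            if verbose:
                messages.append(
                    f"mirror {mirror_id} differs from mirror {reference} "
                    f"at offset {offset}")
    return ok, messages
```

Without verbose, a mismatch only flips the result to a failure, mirroring
the "just return an error" behavior described above.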

Test-Parameters: testlist=sanity-flr
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ib436dea73a8e7a0f8e30b246bb0062023a97cb81
Reviewed-on: https://review.whamcloud.com/30387
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-10123 lnet: ensure peer put back on dc request queue 47/30147/10
Bruno Faccini [Fri, 17 Nov 2017 11:57:42 +0000 (12:57 +0100)]
LU-10123 lnet: ensure peer put back on dc request queue

Upon an async PUT request received from a peer already in the
discovery process, lnet_peer_push_event() was not handling the case
where the peer could be on the working/ln_dc_working queue. This
could lead to the peer not being rediscovered as expected, but left
on the working queue and eventually timed out.

Also ensure that the peer will not be put back on the request queue
by the event handler if discovery is already completed.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ic74a313c00edc1b8fdd14794d2c88411d12e0979
Reviewed-on: https://review.whamcloud.com/30147
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years ago LU-7585 zfs: OI scrub for ZFS 09/30909/2
Fan Yong [Thu, 18 Jan 2018 01:34:50 +0000 (09:34 +0800)]
LU-7585 zfs: OI scrub for ZFS

The ZFS backend OI scrub is used for verifying the consistency
of OI mappings. ZFS has its own mechanisms to maintain data
integrity, but data corruption is still possible, especially
after data migration from another backend, such as ldiskfs, or
after server-side backup and restore. The OI scrub can check
the consistency of the OI mappings and rebuild them when needed.

The ZFS backend OI scrub shares the same control interface
as ldiskfs backend. It can be triggered manually via the
lctl command:
lctl lfsck_start -M $device -t scrub

It can also be triggered automatically when an inconsistency is
detected, unless you disable 'auto_scrub', which is controlled
via:
lctl set_param -n osd-zfs.*.auto_scrub_interval=xxx

You can check the OI scrub status in the same way as for the
ldiskfs backend:
lctl get_param -n osd-zfs.*.oi_scrub

Test-Parameters: envdefinitions=SLOW=yes testlist=sanity-scrub mdtfilesystemtype=zfs ostfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I59ae3142ecd7b27f48b14f2a2d1d110d9c8296e3
Reviewed-on: https://review.whamcloud.com/30909
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9372 ptlrpc: allow to limit number of service's rqbds 64/29064/4
Bruno Faccini [Mon, 18 Sep 2017 22:55:01 +0000 (00:55 +0200)]
LU-9372 ptlrpc: allow to limit number of service's rqbds

This patch provides a way to limit the number of rqbds per
service.
This should help to avoid OOM under heavy client request load,
such as during target failover/recovery with thousands of
clients.
This change is still required after the first patch for
LU-9372 (ptlrpc: drain "ptlrpc_request_buffer_desc" objects),
which already allowed unused rqbds allocated during heavy load
to be drained, but was not effective when the load lasted too
long.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ib43f3e07741b9fcecdfae24a3753128a939d2196
Reviewed-on: https://review.whamcloud.com/29064
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9735 compat: heed the fs_struct::seq 07/28907/7
Bobi Jam [Fri, 8 Sep 2017 13:56:03 +0000 (21:56 +0800)]
LU-9735 compat: heed the fs_struct::seq

2.6.37 kernel uses a seqlock in the fs_struct to enable us to take
an atomic copy of the complete cwd and root paths.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I35384b8f5c468a8c142a59032f3148b698a0c79e
Reviewed-on: https://review.whamcloud.com/28907
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10045 obdclass: multiple try when register target 61/30761/7
Fan Yong [Thu, 11 Jan 2018 15:27:19 +0000 (23:27 +0800)]
LU-10045 obdclass: multiple try when register target

It is possible that the connection between MGC and MGS has not
been established when registering a target with the MGS for a
server mount. At that time, ptlrpcd may be trying to (re-)connect
to the MGS in the background. In that case, the mount process
should not report failure (-ESHUTDOWN or -EIO); instead, it can
retry the MGS_TARGET_REG RPC after some time (such as 2 seconds).
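
The retry behaviour described above can be sketched roughly as
follows. This is an illustration in Python, not the obdclass code:
register_target, send_reg, max_tries and the injected sleep are
hypothetical names, and the retry limit is an assumption. Only the
two error codes and the 2 second delay come from this message.

```python
import errno
import time

# Errors this message names as "MGC not connected to the MGS yet".
RETRYABLE = {errno.ESHUTDOWN, errno.EIO}


def register_target(send_reg, max_tries=5, delay=2.0, sleep=time.sleep):
    """Call send_reg() until it succeeds or fails with a non-retryable error.

    send_reg is a hypothetical callable standing in for the
    MGS_TARGET_REG RPC; it returns 0 on success or a negative errno.
    """
    rc = -errno.ESHUTDOWN
    for attempt in range(max_tries):
        rc = send_reg()
        if rc == 0 or -rc not in RETRYABLE:
            return rc  # success, or an error that retrying cannot fix
        if attempt + 1 < max_tries:
            sleep(delay)  # give ptlrpcd time to (re-)connect to the MGS
    return rc
```

Passing sleep as a parameter only keeps the sketch testable without
real delays.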

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I44e53a9d1de037907bdb5148b8c44d332439a50c
Reviewed-on: https://review.whamcloud.com/30761
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10504 flr: check layout pointer before using it 15/30915/2
Jian Yu [Thu, 18 Jan 2018 06:44:14 +0000 (22:44 -0800)]
LU-10504 flr: check layout pointer before using it

This patch fixes mirror_create() to check layout pointer
before using it.

Change-Id: Ia1454b5c7fcfcee227d0b954a477cefe5d7bb5f7
Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/30915
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10503 flr: fix error handling in mirror_extend_file() 14/30914/2
Jian Yu [Thu, 18 Jan 2018 06:20:16 +0000 (22:20 -0800)]
LU-10503 flr: fix error handling in mirror_extend_file()

This patch fixes the error handling issues in mirror_extend_file().

Change-Id: I388295886657cf9b9b072017002be937a9e657c0
Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/30914
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10515 utils: remove obd.c dependencies from lustre_rsync 70/30870/4
John L. Hammond [Mon, 15 Jan 2018 19:15:07 +0000 (13:15 -0600)]
LU-10515 utils: remove obd.c dependencies from lustre_rsync

lustre_rsync does not depend on /dev/obd and therefore should not call
register_ioc_dev(OBD_DEV_ID, OBD_DEV_PATH). So remove the
call. Replace erroneous uses of D_TRACE with the locally defined
DTRACE constant. Remove unneeded includes.

Test-Parameters: trivial testlist=lustre-rsync-test

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I3251469d48f4c60106dd14d1bf97e159a147688c
Reviewed-on: https://review.whamcloud.com/30870
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5541 lustreapi: only export the API symbols 65/30865/3
frank zago [Sun, 14 Jan 2018 17:48:21 +0000 (12:48 -0500)]
LU-5541 lustreapi: only export the API symbols

By default, all kinds of symbols are exported from the library (dump,
libcfs_ukuc_start, l_ioctl, set_ioctl_dump, ...), which may create
external conflicts. Use the linker version-script options to only
export the API symbols, and prevent the export of internal symbols.

Only the symbols declared in the global section of liblustreapi.map
will be seen by applications.

Test-Parameters: trivial

Change-Id: I4fc2febd2528fc85f546426e08e3ab67e1305c40
Signed-off-by: frank zago <fzago@cray.com>
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-on: https://review.whamcloud.com/30865
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10462 lod: Correct lfs --component-add striping 90/30790/4
Giuseppe Di Natale [Mon, 8 Jan 2018 16:50:39 +0000 (08:50 -0800)]
LU-10462 lod: Correct lfs --component-add striping

lfs --component-add -E <end> -c -1 would apply the default
stripe count to a file when it should be striping across
all OSTs. This was due to incorrect handling of sentinel
values in various portions of lod_object.
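
A rough sketch of the intended sentinel handling, in Python rather
than the lod_object code. The function and constant names are
hypothetical, and treating 0 as "use the default" is an assumption;
only the meaning of -1 (all OSTs) is taken from this message.

```python
STRIPE_COUNT_DEFAULT = 0  # assumed sentinel: use the default stripe count
STRIPE_COUNT_ALL = -1     # sentinel from this message: stripe over all OSTs


def resolve_stripe_count(requested, default_count, ost_count):
    """Map a requested stripe count, sentinels included, to a real count."""
    if requested == STRIPE_COUNT_ALL:
        return ost_count  # the case that wrongly fell back to the default
    if requested == STRIPE_COUNT_DEFAULT:
        return min(default_count, ost_count)
    return min(requested, ost_count)
```

With 8 OSTs and a default of 1, a request of -1 resolves to 8
rather than 1.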

Signed-off-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Change-Id: I178f13be81ee546f44edcdadcf7086a1a57f7e9a
Reviewed-on: https://review.whamcloud.com/30790
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10456 kernel: kernel update RHEL6.9 [2.6.32-696.18.7.el6] 36/30736/5
Bob Glossman [Thu, 4 Jan 2018 16:26:03 +0000 (08:26 -0800)]
LU-10456 kernel: kernel update RHEL6.9 [2.6.32-696.18.7.el6]

Update RHEL6.9 kernel to 2.6.32-696.18.7.el6

Test-Parameters: clientdistro=el6.9 mdsdistro=el6.9 \
  ossdistro=el6.9 mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Change-Id: Ib0b7d5ff7f50f96c8bbc064541ae5c8bf2a4367f
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/30736
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10505 scripts: avoid version dependent shell script syntax 60/30860/3
Bob Glossman [Fri, 12 Jan 2018 21:51:26 +0000 (13:51 -0800)]
LU-10505 scripts: avoid version dependent shell script syntax

Stop using -1 as an array index in lfs_migrate.
This isn't a valid construct in all supported bash versions.

Test-Parameters: trivial clientdistro=el6.9

Change-Id: I5500af40926e0fdb2a432c6bae7fbbe05097ec7c
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/30860
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9411 tests: remove spaces around '+=' 25/30725/3
James Nunez [Thu, 4 Jan 2018 21:13:37 +0000 (14:13 -0700)]
LU-9411 tests: remove spaces around '+='

In sanity test 27D, the variable that collects tests to
skip needs to have spaces removed when the variable is
being concatenated with new tests to skip.

Test-Parameters: trivial
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ie57efecd2867b748f56998bc2fc375ea9d566611
Reviewed-on: https://review.whamcloud.com/30725
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10405 lov: fill no-extent fiemap on object with no stripe. 91/30591/6
Dominique Martinet [Tue, 19 Dec 2017 09:00:04 +0000 (10:00 +0100)]
LU-10405 lov: fill no-extent fiemap on object with no stripe.

This is useful for cp of large sparse files with no stripe info,
as cp relies on fiemap to detect what to read.

Add test 130f: fiemap for unstriped file

Change-Id: I12802c5b1cdec6baf5ee3cec6c706d92d9be4f63
Signed-off-by: Dominique Martinet <dominique.martinet@cea.fr>
Reviewed-on: https://review.whamcloud.com/30591
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-9934 build: address issues raised by gcc7 76/30376/15
James Simmons [Sun, 14 Jan 2018 04:23:22 +0000 (23:23 -0500)]
LU-9934 build: address issues raised by gcc7

Starting with gcc version 7, several platforms have enabled new
flags to report potential problems when compiling code. For lustre,
many of the reported problems deal with potential buffer overruns.
We also have unused data structures and are not properly
initializing some data structures.

Change-Id: I10243ea88f2c726032d179febdbf26f28de13715
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30376
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9833 utils: resolve buffer over runs in lustre_rsync 73/30373/9
James Simmons [Fri, 12 Jan 2018 19:05:52 +0000 (14:05 -0500)]
LU-9833 utils: resolve buffer over runs in lustre_rsync

Newer versions of gcc will report if snprintf is used in an
incorrect way. In the lustre_rsync application, two buffers of
size PATH_MAX are in many places written into one buffer of
size PATH_MAX. This can easily lead to a buffer overrun. This
patch resolves those bugs.

Test-Parameters: trivial testlist=lustre-rsync-test

Change-Id: I035b4a3b1d9695a16822649c2165e492e9f2879d
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30373
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10290 tests: properly set fileset with combined MGT/MDT 93/30293/5
Sebastien Buisson [Tue, 28 Nov 2017 09:25:31 +0000 (10:25 +0100)]
LU-10290 tests: properly set fileset with combined MGT/MDT

We need to make sure MDS receives updated fileset info from MGS.
In case of combined MGT/MDT, directly setting fileset on the node
will mask llog-based info retrieval mechanism.
This patch also removes sanity-sec test_27 from ALWAYS_EXCEPT.

Test-Parameters: trivial testlist=sanity-sec,sanity-sec,sanity-sec,sanity-sec
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I7f25f03a213833f15d082a871ac6368a0e11aa82
Reviewed-on: https://review.whamcloud.com/30293
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10526 build: Ubuntu Kernel 4.4.0 lacks symbols used by o2iblnd.c 93/30893/2
Martin Schroeder [Wed, 17 Jan 2018 10:19:09 +0000 (11:19 +0100)]
LU-10526 build: Ubuntu Kernel 4.4.0 lacks symbols used by o2iblnd.c

Recently, a change has been merged to "lnet/klnds/o2iblnd/o2iblnd.c" which
introduces the usage of IB_DEVICE_SG_GAPS_REG and IB_MR_TYPE_SG_GAPS.

Unfortunately, these symbols are not available in the 4.4.0 Kernels as used
by Ubuntu 14/16.

Additionally, there seems to be a general warning against their use:
 - https://patchwork.kernel.org/patch/9573483/
 - https://lkml.org/lkml/2017/3/13/206

Also, there is a related performance issue as reported in LU-10394.

The solution is to create a preprocessor guard around their use, so that
Kernels lacking these symbols will not use them and revert to using the older
IB_MR_TYPE_MEM_REG, instead.

Test-Parameters: trivial
Signed-off-by: Martin Schroeder <martin.h.schroeder@intel.com>
Change-Id: Ie835d6e04f3859634ba508c24dff1f27f1b24cf6
Reviewed-on: https://review.whamcloud.com/30893
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8346 obdclass: protect key_set_version 48/27448/10
Hongchao Zhang [Sat, 13 Jan 2018 10:08:29 +0000 (18:08 +0800)]
LU-8346 obdclass: protect key_set_version

In lu_context_refill, the key_set_version should be protected
before comparing it to version stored in the lu_context.

This patch is a supplement of the previous patch
https://review.whamcloud.com/#/c/28405/, which adds protection
for key_set_version from modification in lu_context_refill
and lu_context_key_degister.

Change-Id: I201f56214382a717cfc31ba573e06fec9fbedae4
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/27448
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9422 tests: generalize SLES version check 06/26906/8
James Nunez [Mon, 6 Nov 2017 16:03:40 +0000 (09:03 -0700)]
LU-9422 tests: generalize SLES version check

Some tests in the Lustre test suites cannot run on all
versions of SuSE Linux and need to be skipped based on
the SuSE version.

Generalize the function that compiles the version of SLES
and skip tests based on this new routine.

Test-Parameters: trivial

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ia61022c4a477da1968210550fb7a628d31c062ce
Reviewed-on: https://review.whamcloud.com/26906
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
2 years agoLU-9228 nrs: TBF realtime policies under congestion 87/26087/8
Qian Yingjin [Mon, 6 Mar 2017 07:05:01 +0000 (15:05 +0800)]
LU-9228 nrs: TBF realtime policies under congestion

During TBF evaluation, we found that when the sum of the I/O
bandwidth requirements of all classes exceeds the system capacity,
classes with the same rate limit all get evenly less bandwidth
than configured.

The reason is as follows: under heavy load on a congested server,
some classes miss their deadlines. The calculated number of tokens
may then be larger than 1 during dequeuing. In the original
implementation, all classes are handled equally and the excess
tokens are simply discarded.

Thus, a Hard Token Compensation (HTC) strategy is proposed. A class
can be configured with HTC feature by the rule it matches. This
feature means that requests in this kind of class queues have high
real-time requirements and that the bandwidth assignment must be
satisfied as well as possible. When deadlines are missed, the
class keeps the deadline unchanged and the time residue (the
remainder of elapsed time divided by 1/r) is compensated to the
next round. This ensures that the next idle I/O thread will always
select this class to serve until all accumulated exceeding tokens
are handled or there are no pending requests in the class queue.
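
The token arithmetic can be illustrated with a small sketch. It is
written in Python and is not the NRS TBF code: the refill function,
its parameters and the use of nanoseconds are assumptions made for
the illustration. Dropping the residue in the non-realtime case is
also an assumption.

```python
NSEC_PER_SEC = 1_000_000_000


def refill(elapsed_ns, rate, realtime):
    """Return (tokens, carried_ns) for one dequeue after elapsed_ns.

    rate is in requests per second, so one token accrues every
    NSEC_PER_SEC // rate nanoseconds (the "1/r" in this message).
    """
    interval_ns = NSEC_PER_SEC // rate
    tokens, residue_ns = divmod(elapsed_ns, interval_ns)
    if realtime or tokens <= 1:
        # HTC sketch: keep every accumulated token and carry the
        # time residue over to the next round.
        return tokens, residue_ns
    # Without HTC the surplus beyond one token is discarded
    # (and, in this sketch, the residue with it).
    return 1, 0
```

For example, with rate=100 one token accrues every 10 ms. After
35 ms a class without HTC keeps 1 token, while a realtime class
keeps 3 tokens and carries 5 ms into the next round.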

A new command format is added to enable realtime feature for a rule:
start $ruleName jobid={dd.0} rate=100 realtime=1

Change-Id: I3c867052c27e57a30ccdfe649e0905d141792663
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/26087
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10460 osd-zfs: Add tunables to disable sync 61/7761/9
Brian Behlendorf [Fri, 12 May 2017 15:05:13 +0000 (08:05 -0700)]
LU-10460 osd-zfs: Add tunables to disable sync

This patch allows replacing the call to txg_wait_synced(),
which blocks waiting for a full pool sync, with a smaller
tunable delay.  This delay is intended to stand in for the time
it would have taken to synchronously write the dirty data to
the intent log.

This allows testing ZFS behaviour as if there were a low-latency
ZIL device enabled to handle sync IO operations.  Setting the
delay to zero disables sync operations on the server completely.
However, be aware that no data is guaranteed to be written to
disk if the tunables are enabled, and this patch is solely for
performance analysis.  By default the tunables are set to -1,
which leaves the system using the normal sync behaviour.

Two new tunables are introduced to control the delay, the
osd_object_sync_delay_us and osd_txg_sync_delay_us module options.
These values default to -1 which preserves the safe full sync
pool behavior.  Setting these values to zero or larger will
replace the pool sync with a delay of N microseconds.
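
The tunable semantics can be sketched as follows, in Python rather
than the osd-zfs code. wait_for_sync, full_sync and the injected
sleep are hypothetical names; only the meaning of -1, 0 and N
microseconds is taken from this message.

```python
import time


def wait_for_sync(delay_us, full_sync, sleep=time.sleep):
    """Either run the full pool sync or stand in for it with a delay.

    full_sync is a hypothetical callable representing the blocking
    txg_wait_synced() call.
    """
    if delay_us < 0:
        full_sync()  # default (-1): the safe full pool sync
        return "sync"
    if delay_us > 0:
        sleep(delay_us / 1_000_000)  # N microseconds instead of a sync
    return "delay"  # delay_us == 0: no sync and no wait at all
```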

The initial test results obtained by running sanityN test 16
(fsx) are encouraging.  If the zil_commit() time can be kept to
less than 10ms we should see a significant performance improvement.
These tests were run in a pristine centos 6.4 VM and the results
are averaged over four runs.

osd_txg_sync_delay_us     -1    -1     -1     -1      -1
osd_obj_sync_delay_us     -1     0   1000  10000  100000
--------------------------------------------------------
SanityN test 16 (secs)  24.3   7.3    7.6   10.1    34.4

Change-Id: Iff9b66888edc79a5e1585fa3ce8377be068748f2
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Darby Vicker <darby.vicker-1@nasa.gov>
Reviewed-on: https://review.whamcloud.com/7761
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
2 years agoLU-9019 libcfs: remove cfs_time_XXX_64 wrappers 67/30867/2
James Simmons [Mon, 15 Jan 2018 00:17:01 +0000 (19:17 -0500)]
LU-9019 libcfs: remove cfs_time_XXX_64 wrappers

In an attempt to support 64 bit time handling before the linux
kernel developed time64_t and ktime, lustre used 64 bit jiffies
with a libcfs abstraction. Let's remove these
wrappers and replace them with modern 64 bit time support. The
lustre code that used these wrappers needs time resolution at
the seconds level so replace the code with time64_t handling.

Change-Id: I2bd53c4ce83830bedd4448678dffce9f2b2173b1
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30867
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-9019 lnet: move ping and delay injection to time64_t 58/30658/4
James Simmons [Fri, 12 Jan 2018 16:50:46 +0000 (11:50 -0500)]
LU-9019 lnet: move ping and delay injection to time64_t

Migrate the pinger away from jiffies to time64_t, first to make
it clear it is for time keeping and second to ensure the behavior
is consistent across all platforms. Besides the lnet pinger code,
move the lnet delay injection code to time64_t as well.

Change-Id: If363523893fc1dcce4eaa866501946edd6558751
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30658
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10055 mdt: use max_mdsize in reply for layout intent 04/30004/10
Mikhal Pershin [Mon, 30 Oct 2017 16:45:42 +0000 (19:45 +0300)]
LU-10055 mdt: use max_mdsize in reply for layout intent

The LAYOUT intent reply LVB buffer size is set to the size of the
current file layout, which does not work when the layout changes;
mdt_max_mdsize is a better size for the reply buffer. The buffer
will be shrunk to the new layout size afterwards.

Without this change the new layout may be bigger than the buffer,
so the layout is not returned, causing an extra RPC from the client.
mdt_lvbo_fill() is also changed to update mdt_max_mdsize if a
larger layout is found. The related message level is decreased
from D_ERROR to D_INFO.

Signed-off-by: Mikhal Pershin <mike.pershin@intel.com>
Change-Id: Iaac5dcb8b4c5aa2c050dddb5b3fb2662c59f133b
Reviewed-on: https://review.whamcloud.com/30004
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10453 lnet: support gni net configuration 29/30829/5
Amir Shehata [Thu, 11 Jan 2018 00:38:33 +0000 (16:38 -0800)]
LU-10453 lnet: support gni net configuration

GNI interfaces don't have IP addresses so when configuring GNI
interfaces there is no point in trying to query the IP. There is
also only one GNI interface, therefore the net configuration
command shouldn't enforce an interface name.

This patch also adds more descriptive error commands. It also allows
deleting an entire network without having to specify an interface.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I549647675fe5530db7d86272a7dc79892117847d
Reviewed-on: https://review.whamcloud.com/30829
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10429 lod: LBUG lod_comp_ost_in_use() 89/30889/2
Bobi Jam [Wed, 17 Jan 2018 07:46:23 +0000 (15:46 +0800)]
LU-10429 lod: LBUG lod_comp_ost_in_use()

* print more debug info in lod_comp_ost_in_use().
* lod_alloc_qos() could roll back too many items in the
  inuse array, leading to a negative inuse array count.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ie3787f193468c6b783776e7df2ed4a6d54d8a12b
Reviewed-on: https://review.whamcloud.com/30889
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10419 lfsck: no delay for notify RPC 68/30768/4
Fan Yong [Thu, 18 Jan 2018 02:56:11 +0000 (10:56 +0800)]
LU-10419 lfsck: no delay for notify RPC

It is possible that the current MDT has trouble with its connection
to some other MDT(s) or OST(s). In that case, the LFSCK on the
current MDT should skip the related MDT(s) or OST(s), by setting
the LFSCK notify RPC as rq_no_delay, to avoid the whole LFSCK
process being blocked by the troubled connection or remote targets.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib35080cedcbe49f4ae8c4b3690a4743d5afe41b1
Reviewed-on: https://review.whamcloud.com/30768
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10422 lfsck: misc fixes to avoid unexpected repairing 12/30612/5
Fan Yong [Wed, 20 Dec 2017 13:57:09 +0000 (21:57 +0800)]
LU-10422 lfsck: misc fixes to avoid unexpected repairing

There are several issues that mislead LFSCK into triggering
unexpected RPCs or wrong repairs, including:

1) object_update_result_insert() should pack the OUT RPC
   result (not the return value) into the reply buffer via
   object_update_result::our_data. But it packed it at the
   wrong address.

2) out_xattr_get() used the wrong index to obtain the EA buffer,
   so it may overwrite the results of a former update (such as
   OUT_XATTR_GET).

3) osp_declare_xattr_get() does not count the trailing '\0'
   of the EA name in the length parameter for
   osp_insert_async_request().

4) osp_xattr_get_interpterer() failed to handle a positive
   value of the given parameter @rc. That causes the PFID
   EA to be read twice when the target OST-object has it.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibf0e095ae2735c60b9b88e4b0992389c906728f9
Reviewed-on: https://review.whamcloud.com/30612
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9836 osd-ldiskfs: read directory completely 70/30770/8
Fan Yong [Tue, 16 Jan 2018 03:21:05 +0000 (11:21 +0800)]
LU-9836 osd-ldiskfs: read directory completely

For the ldiskfs backend, the return of readdir() does NOT mean
that the whole directory has been read. Instead, it is the caller's
duty to check whether the last readdir() read any new items and
from that determine whether or not the whole directory
has been read.

Unfortunately, some old osd-ldiskfs logic, such as OI scrub,
did not handle that properly, so some directories, such as
lost+found, may be only partly scanned. That is why some orphans
cannot be recovered.
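
The caller-side loop can be sketched as follows. This is a Python
illustration, not the osd-ldiskfs code; read_whole_dir and
readdir_once are hypothetical names.

```python
def read_whole_dir(readdir_once):
    """Collect entries by calling readdir_once() until a call adds nothing.

    readdir_once is a hypothetical callable that returns the (possibly
    partial) list of entries produced by one readdir() pass and resumes
    where the previous pass stopped.
    """
    entries = []
    while True:
        batch = readdir_once()
        if not batch:  # no new items: the directory is fully read
            return entries
        entries.extend(batch)
```

Stopping after the first successful pass, as the old logic
effectively did, would return only the first batch.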

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib328643c4cdcdb14b548807ed05e8835f80bbf6a
Reviewed-on: https://review.whamcloud.com/30770
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8264 lod: lfs setstripe fix for pool. 49/20849/13
Hongchao Zhang [Sat, 13 Jan 2018 09:01:54 +0000 (17:01 +0800)]
LU-8264 lod: lfs setstripe fix for pool.

If a file is created (with lfs) in a directory associated with
a pool, without the -p pool_name option, the stripe count should
be limited to the number of OSTs in the pool, since the directory
is associated with that pool. This patch fixes this problem.

Also remove the wrong check from ost-pools.sh test_20, where
we were creating a file in a directory associated with a pool and
checking that it was not part of the pool.

Add test cases in ost-pools.sh test_20.

Signed-off-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Seagate-bug-id: MRP-3615
Change-Id: Id6dd5126856db7fc773a1fe9c837a214db8d6d70
Reviewed-on: https://review.whamcloud.com/20849
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10514 utils: statically link l_getidentity with libcfs.a 72/30872/3
John L. Hammond [Tue, 16 Jan 2018 00:50:46 +0000 (18:50 -0600)]
LU-10514 utils: statically link l_getidentity with libcfs.a

l_getidentity runs in a restricted environment which is not compatible
with the libtool wrapper script so statically link it with libcfs.a.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I4d3455003d48a11bad4570c3ad23de65c95e5b2c
Reviewed-on: https://review.whamcloud.com/30872
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10480 liblustre: suppress progname prefix in output 19/30819/3
Bobi Jam [Wed, 10 Jan 2018 00:59:32 +0000 (08:59 +0800)]
LU-10480 liblustre: suppress progname prefix in output

Make the liblustre tool stop prefixing its output with the progname.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic8d4cfa19e739ae15048152ec63d90f4b2959d20
Reviewed-on: https://review.whamcloud.com/30819
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10052 test: relate fs_log_size to recordsize 16/30916/5
Hongchao Zhang [Thu, 18 Jan 2018 08:58:42 +0000 (16:58 +0800)]
LU-10052 test: relate fs_log_size to recordsize

If the backend filesystem is ZFS, the block usage difference is
related to its recordsize; the maximum difference is 2 blocks.
This affects several different tests that have intermittent failures.

    replay-dual test_14b, replay-single test_20b, test_89

Change-Id: I36b184587306bd2b9221e5771bf1adfe071653ca
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/30916
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10516 doc: recommend e2fsprogs 1.42.13.wc6 71/30871/3
Andreas Dilger [Mon, 15 Jan 2018 23:04:55 +0000 (16:04 -0700)]
LU-10516 doc: recommend e2fsprogs 1.42.13.wc6

Update the recommended e2fsprogs version to 1.42.13.wc6 in
lustre/ChangeLog as this has been released for some time already.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Id4663e02849675f1c8a4b9c13e191ed9d735ab56
Reviewed-on: https://review.whamcloud.com/30871
Reviewed-by: Peter Jones <peter.a.jones@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10488 tests: fix sub-test return value issue in sanity-dom.sh 42/30842/3
Jian Yu [Thu, 11 Jan 2018 20:05:13 +0000 (12:05 -0800)]
LU-10488 tests: fix sub-test return value issue in sanity-dom.sh

This patch fixes test_sanity() and test_sanityn() in sanity-dom.sh
to return the actual exit values of sanity.sh and sanityn.sh.

In bash, a variable assignment preceding a command affects only that
command. So we can simply change sh to bash, and do not need to save
and restore the value of ONLY.

Test-Parameters: trivial

Change-Id: I1edb1022f856552cb19cb6bd713aa9b6fce37b73
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/30842
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10476 tests: add version check to sanity-dom and sanity-flr 16/30816/3
Jian Yu [Tue, 9 Jan 2018 23:10:58 +0000 (15:10 -0800)]
LU-10476 tests: add version check to sanity-dom and sanity-flr

This patch adds Lustre version check codes into sanity-dom.sh
and sanity-flr.sh to make the tests interoperate with servers
that do not support the DOM and FLR features.
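
A version gate of this kind might look like the sketch below. It assumes a four-part version packed one byte per field; the helper names `demo_version_code()` and `demo_server_supports()` are hypothetical and are not taken from the patch, which does this in the shell test scripts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: pack major.minor.patch.fix into one comparable value. */
static uint32_t demo_version_code(unsigned major, unsigned minor,
                                  unsigned patch, unsigned fix)
{
    assert(major < 256 && minor < 256 && patch < 256 && fix < 256);
    return ((uint32_t)major << 24) | ((uint32_t)minor << 16) |
           ((uint32_t)patch << 8) | (uint32_t)fix;
}

/* A feature test should run only when the server is at least the
 * version that introduced the feature; otherwise it is skipped. */
static bool demo_server_supports(uint32_t server, uint32_t feature_min)
{
    return server >= feature_min;
}
```

A test would compute the server version once and skip every DOM or FLR subtest for which `demo_server_supports()` returns false.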

Test-Parameters: trivial \
mdsjob=lustre-b2_10 ossjob=lustre-b2_10 serverbuildno=52 \
testlist=sanity-dom,sanity-flr

Change-Id: If36125e84a424976a60b9bcc1e2c94c5fab2ac7d
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/30816
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10437 lod: clear layout header when generating layout 85/30785/2
Jinshan Xiong [Mon, 8 Jan 2018 21:36:35 +0000 (21:36 +0000)]
LU-10437 lod: clear layout header when generating layout

LOD needs to clear the layout header; otherwise lcm_flags and
lcm_padding will contain random data, which will cause problems when
those fields are used by future modules.

This has already confused FLR, because it uses lcm_flags and
mirror_count for sanity checks.
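
The idea can be illustrated with a minimal userspace sketch. The struct `demo_layout_hdr` is a hypothetical stand-in, not the real on-disk layout header; only the zero-before-fill pattern is the point.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a composite layout header. */
struct demo_layout_hdr {
    uint32_t magic;
    uint16_t flags;
    uint16_t mirror_count;
    uint32_t padding[2];
};

/* Zero the whole header first, so every field that is not explicitly
 * set below reads as 0 instead of whatever was left in the buffer. */
static void demo_layout_hdr_init(struct demo_layout_hdr *hdr, uint32_t magic)
{
    assert(hdr != NULL);
    memset(hdr, 0, sizeof(*hdr));
    hdr->magic = magic;
}
```

Without the `memset()`, a consumer that sanity-checks `flags` or `mirror_count` would see stale buffer contents.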

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: If9511e6691144debd51ccab575ef4479d0c9b865
Reviewed-on: https://review.whamcloud.com/30785
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-10435 tests: add version check to conf-sanity test 32e 66/30766/2
Jian Yu [Mon, 8 Jan 2018 05:25:42 +0000 (21:25 -0800)]
LU-10435 tests: add version check to conf-sanity test 32e

This patch adds Lustre version check codes into conf-sanity
test 32e to make the test interoperate with servers that do
not support the DOM feature.

Test-Parameters: trivial envdefinitions=ONLY=32e \
mdsjob=lustre-b2_10 ossjob=lustre-b2_10 serverbuildno=52 \
testlist=conf-sanity

Change-Id: I6a561d2972dfc1071c0722af5cb265de0423626c
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/30766
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10458 kernel: kernel update [SLES12 SP3 4.4.103-6.38] 38/30738/4
Bob Glossman [Tue, 9 Jan 2018 15:45:42 +0000 (07:45 -0800)]
LU-10458 kernel: kernel update [SLES12 SP3 4.4.103-6.38]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ib7a308dbce58d94c5f5775cd54f33563cf067e7
Reviewed-on: https://review.whamcloud.com/30738
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5955 utils: lfs shouldn't skip .lustre directory 63/30463/4
Andreas Dilger [Sat, 9 Dec 2017 08:23:36 +0000 (01:23 -0700)]
LU-5955 utils: lfs shouldn't skip .lustre directory

Before Lustre 2.5.3 the MDS returned the .lustre directory to clients
with readdir in the root directory.  This has always been masked out
for "lfs find" and "lfs getstripe" by llapi_semantic_traverse(), but
had the side-effect of also skipping a real .lustre directory that may
exist in the filesystem (for whatever reason, I'm not sure).

Since 2.5.3-84-g2976f91 the /.lustre directory is no longer returned
by the MDS, so there is no need to exclude it in the tools anymore.
Add a sanity test to confirm that the .lustre directory is not listed
(there already are many tests that verify it can be accessed).

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I7ec6ee94b6012445d3bfd9a8a47497dacdbcab07
Reviewed-on: https://review.whamcloud.com/30463
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10282 flr: comp-flags support when creating mirrors 60/30360/13
Jinshan Xiong [Mon, 4 Dec 2017 19:52:25 +0000 (11:52 -0800)]
LU-10282 flr: comp-flags support when creating mirrors

This patch allows flags to be set when creating mirrors.
The flags are set on individual components, which makes it
possible to set flags based on the location of components. Also, the
'stale' and 'prefer' flags can be set on individual components later
on.

This patch also revises component flags matching rules to allow
flags and inverted flags to be set at the same time in the command
lfs-find(1) and lfs-getstripe(1).
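
One plausible reading of matching on flags and inverted flags together is sketched below. The flag bit values and the function `demo_comp_flags_match()` are hypothetical; the actual matching rules are defined by the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define DEMO_COMP_INIT   0x1u
#define DEMO_COMP_STALE  0x2u
#define DEMO_COMP_PREFER 0x4u

/* A component matches when every wanted flag is set and every
 * inverted ("must not be set") flag is clear. */
static bool demo_comp_flags_match(uint32_t comp_flags, uint32_t want,
                                  uint32_t want_not)
{
    /* Asking for a flag and its inverse at once can never match. */
    assert((want & want_not) == 0);
    return (comp_flags & want) == want && (comp_flags & want_not) == 0;
}
```

With this rule a query such as "init and not stale" is expressed as `want = DEMO_COMP_INIT`, `want_not = DEMO_COMP_STALE`.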

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ia077ca5454d49eb411bd82bd451c9dfc426d780c
Reviewed-on: https://review.whamcloud.com/30360
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10175 ldlm: remove obsoleted lock convert code 91/30491/3
Mikhail Pershin [Tue, 12 Dec 2017 11:17:21 +0000 (14:17 +0300)]
LU-10175 ldlm: remove obsoleted lock convert code

Remove the lock mode convert mechanism from Lustre. It is
obsolete and not functional at the moment, and there are
no plans to restore it and use it again.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I477caf24927768dfcdc15888e59a7d5e62d5b577
Reviewed-on: https://review.whamcloud.com/30491
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9618 clio: Use readahead for partial page write 44/27544/8
Patrick Farrell [Mon, 26 Jun 2017 16:07:38 +0000 (11:07 -0500)]
LU-9618 clio: Use readahead for partial page write

When writing to a region of a file less than file size
(either an existing file or a shared file with multiple
writers), writes of less than one page in size must first
read in that page.
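
The condition described above can be modelled as follows. This is a simplified sketch for a write confined to a single page; `demo_needs_page_read()` and its parameters are hypothetical and do not reflect the signature of ll_prepare_partial_page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a write of len bytes at pos must read the page
 * first, because the page holds existing file data that the write
 * does not fully overwrite. The write must not cross a page boundary. */
static bool demo_needs_page_read(uint64_t pos, uint64_t len,
                                 uint64_t file_size, uint64_t page_size)
{
    uint64_t page_start, page_end, valid_end;

    assert(page_size > 0 && len > 0);
    page_start = pos - (pos % page_size);
    page_end = page_start + page_size;
    assert(pos + len <= page_end);

    /* End of the existing data inside this page. */
    valid_end = file_size < page_end ? file_size : page_end;
    if (valid_end <= page_start)
        return false;   /* page lies entirely beyond the file size */

    /* No read is needed only if the write covers all existing data. */
    return !(pos == page_start && pos + len >= valid_end);
}
```

A 1KiB write into the middle of an existing file returns true here, which is the case the patch speeds up for sequential I/O.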

This results in extremely poor performance. For random I/O,
there are no easy improvements available, but the sequential
case can benefit enormously by using readahead to bring in
those pages.

This patch connects ll_prepare_partial_page to the readahead
infrastructure.

This does not affect random I/O or large unaligned writes,
where readahead does not detect I/O.

Benchmarks are from a small VM system, files are NOT in
cache when rewriting.

Write numbers are in MB/s.

File per process:
    access             = file-per-process
    ordering in a file = sequential offsets
    ordering inter file= no tasks offsets
    clients            = 1 (1 per node)
    repetitions        = 1
    blocksize          = 1000 MiB
    aggregate filesize = 1000 MiB

New file (best case):
    xfsize  ppr  write
    1KiB    n/a  59.44
    5KiB    n/a  164.5

Rewrite of existing file:
    xfsize  ppr  re-write
    1KiB    off  4.65
    1KiB    on   48.40
    5KiB    off  12.95
    5KiB    on   143.3

Shared file writing:
    access             = single-shared-file
    ordering in a file = sequential offsets
    ordering inter file= no tasks offsets
    clients            = 4 (4 per node)
    repetitions        = 1
    blocksize          = 1000 MiB
    aggregate filesize = 4000 MiB

    xfsize  ppr  write
    1KiB    off  11.26
    1KiB    on   58.72
    5KiB    off  18.7
    5KiB    on   127.3

Cray-bug-id: LUS-188
Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Change-Id: I822395995ee23b1c9ca289ae982e5294b69a0cff
Reviewed-on: https://review.whamcloud.com/27544
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10463 osd-zfs: use 1MB RPC size by default 57/30757/3
Andreas Dilger [Sat, 6 Jan 2018 01:39:06 +0000 (18:39 -0700)]
LU-10463 osd-zfs: use 1MB RPC size by default

Revert back to using 1MB RPC size for ZFS back-end storage, if it
is not otherwise specified, and as long as the ZFS recordsize is
1MB or smaller.  Continue to use the ZFS recordsize if it is larger.

For ldiskfs, continue to use 4MB RPC size, unless the bigalloc
feature is enabled and has a larger chunksize.
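
The selection rule described in the two paragraphs above can be sketched as below. The enum, the function `demo_default_rpc_size()`, and the single `block_size` parameter are hypothetical simplifications, not the osd code.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MIB (1024u * 1024u)

enum demo_backend { DEMO_ZFS, DEMO_LDISKFS };

/* block_size is the ZFS recordsize, or the ldiskfs bigalloc chunk size
 * (0 when bigalloc is not enabled). The default RPC size is the
 * per-backend base, unless the backend block size is larger. */
static uint32_t demo_default_rpc_size(enum demo_backend be,
                                      uint32_t block_size)
{
    uint32_t base;

    assert(be == DEMO_ZFS || be == DEMO_LDISKFS);
    base = (be == DEMO_ZFS) ? 1 * DEMO_MIB : 4 * DEMO_MIB;
    return block_size > base ? block_size : base;
}
```

This covers only the default; an explicitly configured RPC size would take precedence, as the commit message notes.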

Testing has shown that while 4MB RPC size is good for ldiskfs, it
does not improve ZFS performance, and increases IO variability in
some cases.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I4b306843667bfd960ad07ecc3886a696fd3ebbe5
Reviewed-on: https://review.whamcloud.com/30757
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10003 lnet: deprecate lctl net commands 55/30755/2
Amir Shehata [Fri, 5 Jan 2018 22:20:04 +0000 (14:20 -0800)]
LU-10003 lnet: deprecate lctl net commands

Add a deprecation message for commands which are implemented in
lnetctl. The lctl commands will continue to function.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I3f528d0145f7958106a2fc6842fcd1670c9b9d7c
Reviewed-on: https://review.whamcloud.com/30755
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10454 mdd: check return value of lu_ucred() 07/30707/4
Sebastien Buisson [Mon, 8 Jan 2018 14:28:27 +0000 (23:28 +0900)]
LU-10454 mdd: check return value of lu_ucred()

In mdd_changelog_data_store_by_fid(), part of the function checked
the return value of lu_ucred() and part of it did not. This led
to a NULL pointer dereference.
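
The pattern of the fix, checking a possibly-NULL lookup once before any use, can be sketched in userspace as below. All `demo_*` names are hypothetical, and the choice of -EINVAL as the error is an assumption made for the sketch, not what the patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct demo_cred { uint32_t uid; uint32_t gid; };
struct demo_env { const struct demo_cred *cred; /* may be NULL */ };
struct demo_record { uint32_t uid; uint32_t gid; };

/* Hypothetical stand-in for a credential lookup that may fail. */
static const struct demo_cred *demo_ucred(const struct demo_env *env)
{
    return env->cred;
}

/* Check the lookup result once, up front, so that no later part of
 * the function can dereference a NULL credential. */
static int demo_fill_record(const struct demo_env *env,
                            struct demo_record *rec)
{
    const struct demo_cred *uc;

    assert(env != NULL && rec != NULL);
    uc = demo_ucred(env);
    if (uc == NULL)
        return -EINVAL;   /* assumed error code for this sketch */

    rec->uid = uc->uid;
    rec->gid = uc->gid;
    return 0;
}
```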

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Iefe9d10191e499aec94415fb6fe0d5d2064f86f0
Reviewed-on: https://review.whamcloud.com/30707
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
2 years agoLU-10201 tests: Fix overly greedy grep in conf_sanity test 20 57/29957/4
Oleg Drokin [Tue, 7 Nov 2017 00:59:20 +0000 (19:59 -0500)]
LU-10201 tests: Fix overly greedy grep in conf_sanity test 20

Better ensure the mountpoint matching so that only
/mnt/lustre is matched, but not /mnt/lustre-{mds,ost}.
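
The difference between a greedy substring match and a whole-token match can be shown with a small sketch. The test itself uses grep in a shell script; `demo_has_mountpoint()` below is a hypothetical C equivalent of matching the mountpoint as a complete whitespace-delimited field.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if text contains mnt as a complete whitespace-delimited token,
 * so "/mnt/lustre" does not match inside "/mnt/lustre-mds1". */
static bool demo_has_mountpoint(const char *text, const char *mnt)
{
    size_t n = strlen(mnt);
    const char *p = text;

    assert(n > 0);
    while ((p = strstr(p, mnt)) != NULL) {
        bool start_ok = (p == text) || p[-1] == ' ' || p[-1] == '\t';
        char after = p[n];
        bool end_ok = after == '\0' || after == ' ' ||
                      after == '\t' || after == '\n';

        if (start_ok && end_ok)
            return true;
        p++;   /* keep scanning past this partial match */
    }
    return false;
}
```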

Change-Id: I0ca274a358de3a38542e05bb5682641459fea93d
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/29957
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-10459 lnd: throttle tx based on queue depth 51/30751/3
Amir Shehata [Fri, 5 Jan 2018 20:22:45 +0000 (12:22 -0800)]
LU-10459 lnd: throttle tx based on queue depth

Throttle the transmits based on the negotiated conn queue depth
to ensure we keep the number of outstanding transmits below the
negotiated queue depth.
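
A minimal model of this kind of throttling is sketched below. The struct and function names are hypothetical, and treating the queue depth as an inclusive upper bound on outstanding transmits is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-connection accounting. */
struct demo_conn {
    unsigned queue_depth;   /* negotiated with the peer */
    unsigned outstanding;   /* transmits posted but not yet completed */
};

/* Start a transmit only if it keeps outstanding within the queue
 * depth; otherwise the caller leaves it queued for later. */
static bool demo_tx_try_start(struct demo_conn *c)
{
    assert(c != NULL);
    if (c->outstanding >= c->queue_depth)
        return false;
    c->outstanding++;
    return true;
}

/* Called when a transmit completes, freeing one slot. */
static void demo_tx_complete(struct demo_conn *c)
{
    assert(c != NULL && c->outstanding > 0);
    c->outstanding--;
}
```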

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I27190364904d6c79c0cd6d382228f8b8d2b11ba0
Reviewed-on: https://review.whamcloud.com/30751
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoNew tag 2.10.57 2.10.57 v2_10_57 v2_10_57_0
Oleg Drokin [Wed, 17 Jan 2018 07:25:01 +0000 (02:25 -0500)]
New tag 2.10.57

Change-Id: Ic2704bf2256afdf0800a30c1a979e3d99f2c208a
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-7004 obd: make LCFG_SET_PARAM functional 90/28590/26
James Simmons [Wed, 13 Dec 2017 20:05:56 +0000 (15:05 -0500)]
LU-7004 obd: make LCFG_SET_PARAM functional

The LCFG_SET_PARAM infrastructure was meant to replace the
class_process_proc_param() functionality but various software
bugs have prevented its adoption. This patch does the following:

1) Take the better print_lustre_cfg() of the mgs module and use
   that in llog_swab.c instead with the intent of exporting this
   function. I add to process_param2_config() a call to
   print_lustre_cfg() for debugging purposes.

2) Move obdname2fsname to obd_mount.c and make it exportable.
   Expand the functionality to work for both lctl conf_param
   and lctl set_param -P.

3) Split mgs_setparam() into two functions since the differences
   between LCFG_SET_PARAM and LCFG_PARAM are large enough.

Currently virtual attributes failover.nid, sptlrpc, and quota
are not fully supported. They will be addressed in later patches.

Change-Id: Iced6505f39a3270139c1630270cfe1dc4a2e49ed
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28590
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10455 kernel: kernel update RHEL7.4 [3.10.0-693.11.6.el7] 34/30734/3
Bob Glossman [Thu, 4 Jan 2018 15:57:37 +0000 (07:57 -0800)]
LU-10455 kernel: kernel update RHEL7.4 [3.10.0-693.11.6.el7]

update RHEL 7.4 kernel to 3.10.0-693.11.6.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Id3428aa00e4b1501b642587db7911b6adafd51ef
Reviewed-on: https://review.whamcloud.com/30734
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8999 test: ignore unrelated quota id 30/30730/3
Hongchao Zhang [Wed, 6 Dec 2017 04:59:23 +0000 (12:59 +0800)]
LU-8999 test: ignore unrelated quota id

In test_38 of sanity_quota, quota IDs larger than 9999
should be ignored.

Change-Id: I12e7936c0c1abc2dcaad7646a048c98bb37de254
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/30730
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9859 libcfs: delete libcfs/linux/libcfs.h 06/30706/4
James Simmons [Tue, 9 Jan 2018 05:52:16 +0000 (00:52 -0500)]
LU-9859 libcfs: delete libcfs/linux/libcfs.h

Lustre uses libcfs.h as the header to include all headers. This
approach has drawbacks like colliding with MOFED compat headers
that do the same thing. This patch is the first step to unwind
including libcfs.h everywhere. This starts with eliminating
linux/libcfs.h.

Test-Parameters: trivial

Change-Id: Id2040d4295c16135561c8251e160cb2117ee21b8
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30706
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
2 years agoLU-10052 tests: wait for OST objects to be deleted 78/30678/4
Hongchao Zhang [Mon, 4 Dec 2017 19:47:30 +0000 (03:47 +0800)]
LU-10052 tests: wait for OST objects to be deleted

In test_20b of replay-single, the used space difference after
the file creation and deletion shows that a block is not freed.
Wait for OST objects to be destroyed after recovery is done.

Test-Parameters: trivial testlist=replay-single.sh ostfilesystemtype=zfs mdtfilesystemtype=zfs
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Change-Id: I6311d8b8fa4cea713a9755cfb6a3d63e693c8344
Reviewed-on: https://review.whamcloud.com/30678
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
2 years agoLU-10383 hsm: flatten mdt_cdt_started_cb() 61/30561/4
John L. Hammond [Fri, 15 Dec 2017 20:19:46 +0000 (14:19 -0600)]
LU-10383 hsm: flatten mdt_cdt_started_cb()

Rewrite mdt_cdt_started_cb() to avoid creating a fake progress kernel
for mdt_hsm_update_request_state() and handle the cleanup from the
timed-out action directly. Cancel cancel actions that have timed out
rather than leaving them in the log indefinitely. The code is improved
in several places to clean up all resources associated with the action
rather than having the cleanup depend on unnecessary assumptions.

Since mdt_hsm_coordinator_update() is then only called from the
MDS_HSM_PROGRESS handler, the update_record parameter can be removed
as well as the now useless wrapper function
mdt_hsm_coordinator_update().

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic6663b29b2a87de0da59085ccbe297b50abd049d
Reviewed-on: https://review.whamcloud.com/30561
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10383 hsm: consolidate CDT restore handle handling 57/30557/4
John L. Hammond [Fri, 15 Dec 2017 19:24:32 +0000 (13:24 -0600)]
LU-10383 hsm: consolidate CDT restore handle handling

Consolidate duplicated HSM coordinator restore handle handling into
new functions cdt_restore_handle_{add,del}(). Rename
mdt_hsm_restore_hdl_find() and some struct members for consistency.

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I9798ed93ea26a9d61d4786540c6dae95cdc38c4b
Reviewed-on: https://review.whamcloud.com/30557
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10383 hsm: refactor mdt_coordinator_cb() 52/30552/3
John L. Hammond [Fri, 15 Dec 2017 16:14:11 +0000 (10:14 -0600)]
LU-10383 hsm: refactor mdt_coordinator_cb()

Split the ARS_WAITING and ARS_STARTED cases of mdt_coordinator_cb()
into subfunctions, mdt_cdt_waiting_cb() and mdt_cdt_started_cb().

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I734e10e4db72f76a6b0de76c383ad0b03efd76d8
Reviewed-on: https://review.whamcloud.com/30552
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-6051 lfs: Update lfs_migrate man page for in-use files 50/29950/2
Steve Guminski [Mon, 6 Nov 2017 17:46:30 +0000 (12:46 -0500)]
LU-6051 lfs: Update lfs_migrate man page for in-use files

Update man page to state that it is safe to use the script on in-use
files for versions at or above 2.5.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I412ee7681db3860ca395c4afc2a30c87f1f49d6d
Reviewed-on: https://review.whamcloud.com/29950
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-5541 build: move libcfs and liblustreapi over to libtool 62/30562/6
James Simmons [Mon, 8 Jan 2018 22:30:23 +0000 (17:30 -0500)]
LU-5541 build: move libcfs and liblustreapi over to libtool

Change libcfs into a convenience library using libtool. This allows
us to embed the libcfs library into both liblnetconfig and liblustreapi
so there is no longer a need to link applications to libcfs.a.
With this change we need to migrate liblustreapi to libtool.

libtool knows how to build both static and dynamic libraries for
liblustreapi, so there is no need to hack the Makefile. As two added
benefits, the utilities will now use the dynamic version, thus reducing
their footprint, and calling make twice in a row won't rebuild objects
already built.

Test-Parameters: trivial

Change-Id: Icc1e5d42df503b9bf393396fe09f4e4f1f242486
Signed-off-by: frank zago <fzago@cray.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30562
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-8358 vvp: Print discarded page warning on -EIO 11/21111/3
Patrick Farrell [Thu, 27 Jul 2017 15:18:08 +0000 (10:18 -0500)]
LU-8358 vvp: Print discarded page warning on -EIO

On client eviction, the client sometimes has dirty pages
outstanding, which are then discarded.  The client is
supposed to print an error when this happens,
from vvp_vmpage_error->ll_dirty_page_discard_warn.

However, the client looks for specific errors, and newer
Lustre clients will sometimes return -EIO to I/O requests
on eviction, instead of -EINTR.  Since they can still
return -EINTR, we must add -EIO as a new condition and
keep -EINTR.
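
The resulting condition can be sketched as a small predicate. `demo_should_warn_discard()` is a hypothetical name; the real check lives in vvp_vmpage_error and may consider further state that is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Warn about discarded dirty pages for either error an evicted
 * client may see: the older -EINTR or the newer -EIO. */
static bool demo_should_warn_discard(int ioret)
{
    assert(ioret <= 0);   /* kernel-style: 0 or a negative errno */
    return ioret == -EINTR || ioret == -EIO;
}
```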

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I22ac82570a3840782c3fc6db40281b4a2c1cba1c
Reviewed-on: https://review.whamcloud.com/21111
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-3846 test: Fix sanity test_56* with different layouts 79/7479/13
Andreas Dilger [Fri, 7 Jul 2017 16:51:17 +0000 (10:51 -0600)]
LU-3846 test: Fix sanity test_56* with different layouts

Fix a bug in "lfs getstripe --obd" and "lfs find --obd" where they
tried to access objects of uninitialized components to check if
they were on specified OSTs.  Those components have no objects, so
skip uninitialized components when searching for specific OSTs.
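
The skip described above can be sketched as follows. The struct `demo_comp` and the function `demo_layout_uses_ost()` are hypothetical simplifications of a composite layout, not the liblustreapi data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_STRIPES 4

/* Hypothetical layout component: only initialized components have
 * objects, so only their ost_idx entries are meaningful. */
struct demo_comp {
    bool initialized;
    size_t stripe_count;
    uint32_t ost_idx[DEMO_MAX_STRIPES];
};

/* True if any initialized component has an object on the given OST. */
static bool demo_layout_uses_ost(const struct demo_comp *comps,
                                 size_t ncomps, uint32_t ost)
{
    assert(comps != NULL || ncomps == 0);
    for (size_t i = 0; i < ncomps; i++) {
        if (!comps[i].initialized)
            continue;   /* no objects yet: nothing valid to inspect */
        for (size_t j = 0; j < comps[i].stripe_count; j++) {
            if (comps[i].ost_idx[j] == ost)
                return true;
        }
    }
    return false;
}
```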

Sanity test_56s and test_56u will fail when the default stripe count
is not 1 and the test is run with more than one OST.  Explicitly
set stripe_count=1 for the default directory layout for these tests.

For PFL layout testing, test_56a needs to fix its output parsing, as
it is using this to verify that lfs getstripe is returning valid data
by counting occurrences of "obdidx" and not the new "l_ost_idx".

Do not delete the filesystem default striping in 56a, 56g, 56h, as
this will silently cause failures with default PFL layout testing.

The sanity test_56* subdirectories were allowed to be shared long ago,
but when $tdir became a per-subtest directory this was lost.  Allow
the subdirectories to be shared again, where possible, to avoid
duplicate setup of the test directory for each subtest if not needed.

Pass test directory explicitly to setup_56() and not via global $TDIR.

Clean up test_56* to better match modern test code style.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I1bcdeb80fc6e39227a87365f823879db70eec652
Reviewed-on: https://review.whamcloud.com/7479
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
2 years agoLU-10444 utils: Don't remount debugfs every time 75/30675/3
Oleg Drokin [Sat, 30 Dec 2017 03:16:30 +0000 (22:16 -0500)]
LU-10444 utils: Don't remount debugfs every time

Check if debugfs is mounted at /sys/kernel/debug and only
mount if it is not.
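
A check of this kind might be sketched as below. It assumes a mount table in the /proc/mounts format (device, mount point, filesystem type, ...); `demo_is_mounted()` is a hypothetical helper and not the code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Scan a mount table stream and report whether a filesystem of the
 * given type is mounted exactly at dir. */
static bool demo_is_mounted(FILE *mounts, const char *dir,
                            const char *fstype)
{
    char line[1024], dev[256], mnt[256], type[64];

    assert(mounts != NULL && dir != NULL && fstype != NULL);
    while (fgets(line, sizeof(line), mounts) != NULL) {
        if (sscanf(line, "%255s %255s %63s", dev, mnt, type) != 3)
            continue;   /* skip malformed lines */
        if (strcmp(mnt, dir) == 0 && strcmp(type, fstype) == 0)
            return true;
    }
    return false;
}
```

A caller would open /proc/mounts, call `demo_is_mounted(f, "/sys/kernel/debug", "debugfs")`, and attempt the mount only when it returns false.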

Change-Id: Ib31bd8f7c5c93ab942c6708ed3a4d17a11159e95
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: https://review.whamcloud.com/30675
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
2 years agoLU-9019 osp: migrate to 64 bit time 74/30674/3
James Simmons [Thu, 4 Jan 2018 03:35:41 +0000 (22:35 -0500)]
LU-9019 osp: migrate to 64 bit time

Change opd_statfs_maxage from int to time64_t to make it clear
this field is in units of seconds. Change the last libcfs-specific
cfs_time_t, which maps to jiffies, to ktime_t since it gives better
than second resolution, which is needed in this case.

Change-Id: I31baa73d5f6bd53dbcce4fc9f90462b11c6457a3
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30674
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9019 osc: migrate to 64 bit time 07/30607/4
James Simmons [Thu, 4 Jan 2018 03:24:57 +0000 (22:24 -0500)]
LU-9019 osc: migrate to 64 bit time

Change od_contention_time from int to time64_t to make it clear
this field is in units of seconds. Change the *_contention_time
fields from jiffies to ktime_t to make it clear we are dealing
with time and ktime_t is consistent on any platform unlike
jiffies.
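
Why a jiffies count is platform dependent can be shown with a short userspace sketch: the same tick count means a different duration for each tick rate. `demo_ticks_to_ns()` is a hypothetical helper, not a kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* A tick count is only meaningful together with the tick rate (HZ),
 * whereas a nanosecond value means the same thing everywhere.
 * Assumes hz divides one billion evenly, as common HZ values do. */
static int64_t demo_ticks_to_ns(uint64_t ticks, unsigned hz)
{
    assert(hz > 0 && 1000000000u % hz == 0);
    return (int64_t)(ticks * (UINT64_C(1000000000) / hz));
}
```

For example, 500 ticks is 5 seconds at HZ=100 but half a second at HZ=1000, which is the ambiguity that storing a ktime_t value avoids.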

Change-Id: Ieb240e40cc4d56050607314db057004db00aae13
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30607
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-9892 test: fix SuSe nfsserver setup 76/30476/23
Minh Diep [Mon, 11 Dec 2017 18:12:20 +0000 (10:12 -0800)]
LU-9892 test: fix SuSe nfsserver setup

Check for SuSE-release and use nfsserver.
Add export info to /etc/exports.

Test-Parameters: trivial testlist=parallel-scale-nfsv4

Change-Id: Id12370ae35d878e51bdf6f71a77b1b82b5e82c33
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/30476
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10350 tests: make parsing routines pattern aware 36/30636/2
James Nunez [Thu, 21 Dec 2017 21:23:40 +0000 (14:23 -0700)]
LU-10350 tests: make parsing routines pattern aware

'lfs getstripe' now returns the pattern for each component
of a directory and files. The routines that parse
parameters, parse_layout_param() and parse_plain_param(),
need to look for the component pattern when parsing the output
of 'lfs getstripe'.

Test-Parameters: trivial testlist=sanity-pfl,ost-pools
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iab605f58e9c8f501fa0889806c511e3310cb6dd7
Reviewed-on: https://review.whamcloud.com/30636
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10237 mdc: interruptable during RPC retry for EINPROGRESS 66/30166/2
Fan Yong [Sun, 19 Nov 2017 05:55:11 +0000 (13:55 +0800)]
LU-10237 mdc: interruptable during RPC retry for EINPROGRESS

Sometimes a system resource may be temporarily inaccessible,
for example when the related OI mapping is corrupted and has not
yet been rebuilt. In such a case, the server replies to the client
with "-EINPROGRESS", and the client retries the RPC some time later.

Currently, the client retries indefinitely until the RPC
succeeds or fails with another error. But we do not know how long
it will be before the related resource becomes available. It may
take so long that the RPC sponsor (the application or the
user) does not want to retry any more, so we need to make the
retry logic interruptible. This patch serves that purpose.
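
The retry logic can be modelled in userspace as below. This is a sketch under stated assumptions: the callback types, the fake send and interrupt functions, and returning -EINTR on interruption are all hypothetical and are not taken from the mdc code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*demo_rpc_fn)(void *arg);
typedef bool (*demo_intr_fn)(void *arg);

/* Resend while the server answers -EINPROGRESS, but give up with
 * -EINTR as soon as the caller has been interrupted. */
static int demo_rpc_retry(demo_rpc_fn send, demo_intr_fn interrupted,
                          void *arg)
{
    assert(send != NULL && interrupted != NULL);
    for (;;) {
        int rc = send(arg);

        if (rc != -EINPROGRESS)
            return rc;        /* success or a different failure */
        if (interrupted(arg))
            return -EINTR;    /* assumed error code for this sketch */
        /* A real client would sleep interruptibly here before resending. */
    }
}

/* Fake server for illustration: replies -EINPROGRESS busy_left times,
 * then succeeds; a signal "arrives" once signal_after sends were made. */
struct demo_fake { int busy_left; int signal_after; int sends; };

static int demo_fake_send(void *arg)
{
    struct demo_fake *f = arg;

    f->sends++;
    if (f->busy_left > 0) {
        f->busy_left--;
        return -EINPROGRESS;
    }
    return 0;
}

static bool demo_fake_interrupted(void *arg)
{
    struct demo_fake *f = arg;

    return f->sends >= f->signal_after;
}
```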

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4f939f9a350d3a99ce3d3af37d0dea8ab8030fee
Reviewed-on: https://review.whamcloud.com/30166
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
2 years agoLU-10468 tests: sync zfs dataset before reading blocks 28/30828/5
Jinshan Xiong [Thu, 11 Jan 2018 01:21:25 +0000 (17:21 -0800)]
LU-10468 tests: sync zfs dataset before reading blocks

Synchronize the ZFS dataset before reading blocks, so that
the block count is accurate.

Test-Parameters: trivial envdefinitions=SLOW=yes,ENABLE_QUOTA=yes mdtfilesystemtype=zfs ostfilesystemtype=zfs mdscount=2 mdtcount=4 testlist=sanity-flr,sanity-flr,sanity-flr
Test-Parameters: trivial envdefinitions=SLOW=yes,ENABLE_QUOTA=yes mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4 testlist=sanity-flr,sanity-flr,sanity-flr
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I663d689296e5847b460de9a491b551a56bfbc77d
Reviewed-on: https://review.whamcloud.com/30828
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>