Whamcloud - gitweb
fs/lustre-release.git
3 years ago LU-10007 pacemaker: Use lctl and load lustre 44/29144/2
Nathaniel Clark [Thu, 21 Sep 2017 17:09:18 +0000 (13:09 -0400)]
LU-10007 pacemaker: Use lctl and load lustre

When scripts are started, load lustre module.
Use lctl instead of directly accessing health_check file.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I4a81248939464e498006dc2c4072d44685add018
Reviewed-on: https://review.whamcloud.com/29144
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years ago LU-9611 lov: allow lov.*.stripe{size,count}=-1 param 46/27946/5
Andreas Dilger [Thu, 6 Jul 2017 06:55:08 +0000 (00:55 -0600)]
LU-9611 lov: allow lov.*.stripe{size,count}=-1 param

Since LU-7344 patch http://review.whamcloud.com/16930 was landed,
lov_stripeoffset_seq_write() and lov_stripecount_seq_write() have
incorrectly checked that lov.*.stripecount and lov.*.stripeoffset
are not negative.  In fact they can both be "-1" to indicate that
the filesystem-wide default value should be used. These parameters
can also be set internally if using "lfs setstripe -c -1 $MOUNT"
or "lfs setstripe -i -1 $MOUNT" to set the system wide default,
generating console errors on the MDS from class_process_proc_param():

    lov.: error writing proc entry 'stripecount': rc = -34
    lov.: error writing proc entry 'stripeoffset': rc = -34

Fix these functions to allow "-1" as a valid value.
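
The intended check can be sketched as follows. This is a Python illustration only, not the C code of the lov handlers; `validate_stripe_param` is a hypothetical name, and the assumption that other negative values are still rejected with -ERANGE (the rc = -34 in the console errors above) is not confirmed by this message.

```python
import errno

USE_DEFAULT = -1  # "-1" means: use the filesystem-wide default value


def validate_stripe_param(value: int) -> int:
    """Return 0 if the value is acceptable, else a negative errno.

    Hypothetical helper: it models only the sign check described above.
    """
    if value < USE_DEFAULT:
        # Assumed behaviour: anything below -1 is still out of range.
        return -errno.ERANGE
    return 0
```

With this check, `validate_stripe_param(-1)` returns 0 instead of an error, which is the behaviour the fix restores.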

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I295d2591d535b039634689524a29725e963ebbe5
Reviewed-on: https://review.whamcloud.com/27946
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-7988 hsm: wake up cdt when requests are empty 42/29742/4
Ben Evans [Tue, 24 Oct 2017 15:34:06 +0000 (11:34 -0400)]
LU-7988 hsm: wake up cdt when requests are empty

The coordinator only runs once per second, so we need a mechanism
to send more work when everything is done (cdt_request_count
goes to zero).

Without this, there is a hard limit of max_requests requests
processed per second, causing performance issues with small
files.
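
To see why the once-per-second pass caps throughput, here is a toy model in Python. It is a sketch under simplifying assumptions, not the coordinator code: `drain_time_ms` and its parameters are hypothetical, every request is assumed to take the same time, and each pass is assumed to dispatch a full batch.

```python
def drain_time_ms(total, max_requests, service_ms, wake_on_empty,
                  period_ms=1000):
    """Return the time (ms) at which the last of `total` requests completes.

    Toy model: each coordinator pass dispatches up to `max_requests`
    requests, and every request takes exactly `service_ms` to finish.
    """
    if total <= 0:
        return 0
    now = 0
    remaining = total
    while True:
        remaining -= min(remaining, max_requests)
        done = now + service_ms
        if remaining == 0:
            return done
        if wake_on_empty:
            # Woken as soon as the in-flight count drops to zero.
            now = done
        else:
            # Wait for the next periodic pass at or after completion.
            now = -(-done // period_ms) * period_ms
```

With made-up numbers (30 requests, max_requests of 10, 50 ms per request), the periodic-only model finishes at 2050 ms while the wake-on-empty model finishes at 150 ms.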

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I563666a1a3e53f0ec5908de593de71ff4d925467
Reviewed-on: https://review.whamcloud.com/29742
Reviewed-by: Frank Zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sergey Cheremencev <cherementsev@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10098 scripts: Fix mounted check in Lustre RA 51/29351/3
Nathaniel Clark [Fri, 6 Oct 2017 16:49:23 +0000 (12:49 -0400)]
LU-10098 scripts: Fix mounted check in Lustre RA

The "Lustre" resource agent for pacemaker can mis-identify a resource
as being mounted if its mountpoint is a substring match for anything
else in /proc/mounts.  Change the lustre_is_mounted() function to
check that it is a lustre fs mounted at mountpoint and that the
"source" (i.e. device) is the target we expect.

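The difference between a substring match and a field-by-field match can be sketched as below. The real resource agent is a shell script; this Python version only illustrates the idea, and the sample mount table and device names are made up.

```python
def lustre_is_mounted(mounts_text: str, device: str, mountpoint: str) -> bool:
    """Return True only if `device` is mounted at `mountpoint` as type lustre."""
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        source, target, fstype = fields[:3]
        # Compare whole fields, never substrings of the line.
        if source == device and target == mountpoint and fstype == "lustre":
            return True
    return False


# Made-up sample in /proc/mounts layout: source, target, fstype, options...
SAMPLE_MOUNTS = """\
/dev/sda1 / ext4 rw 0 0
/dev/sdc /mnt/ost01 lustre ro 0 0
"""
```

A plain substring test reports "/mnt/ost0" as present in the sample because it is a prefix of "/mnt/ost01", while the field comparison does not.
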
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib877b0dc3d3ce0d93fd4663aa2418ac21d670428
Reviewed-on: https://review.whamcloud.com/29351
Tested-by: Jenkins
Reviewed-by: Malcolm Cowe <malcolm.j.cowe@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9994 obdclass: fix llog_cat_id2handle() error handling 70/29370/3
Bruno Faccini [Mon, 9 Oct 2017 15:21:50 +0000 (17:21 +0200)]
LU-9994 obdclass: fix llog_cat_id2handle() error handling

The patch for LU-9153 ("llog: consolidate common error checking")
introduced a regression in the llog_cat_id2handle() error handling
path. It added a call to the common routine llog_cat_process_common()
in the sequence and allowed that routine to return zero even when the
catalog entry was invalid and had been cleared instead of populating
llog_handle. This caused an exception when the handle was later
dereferenced in llog_process_thread().

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I50153b931e3c1567bfe9c15564ba29fabe3a2d4c
Reviewed-on: https://review.whamcloud.com/29370
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
3 years ago LU-10132 llite: handle xattr cache refill race 54/29654/2
John L. Hammond [Tue, 17 Oct 2017 20:32:52 +0000 (15:32 -0500)]
LU-10132 llite: handle xattr cache refill race

In ll_xattr_cache_refill() if the xattr cache was invalid (and no
request was sent) then return -EAGAIN so that ll_getxattr_common()
caller will fetch the xattr from the MDT.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ia9ec7424e8786d92bdecf4897fafcf71d5061fb1
Reviewed-on: https://review.whamcloud.com/29654
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9951 lustre_compat: add wrapper function for posix_acl_update_mode 71/28871/9
Gu Zheng [Wed, 18 Oct 2017 19:10:53 +0000 (15:10 -0400)]
LU-9951 lustre_compat: add wrapper function for posix_acl_update_mode

posix_acl_update_mode() was introduced in kernel 4.9. Add a check
for it and, if it does not exist, use an inline wrapper function instead.

Change-Id: I8a1476d611c387a88efef5d5b8707edf5feacca8
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/28871
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9558 llite: port lustre to unified handling of bdi 11/28511/7
Jan Kara [Wed, 18 Oct 2017 16:26:19 +0000 (12:26 -0400)]
LU-9558 llite: port lustre to unified handling of bdi

For the linux 4.12 kernel the bdi handling was unified for all
file systems. This was done by allocating struct backing_dev_info
separately instead of embedding it inside superblock. For older
kernels we move all the bdi handling lustre does to the function
super_setup_bdi_name() which is what exists in the latest kernels.

Linux-commit: 9594caf216dc0fe3e318b34af0127276db661241

Change-Id: I5af60ea3661e3d3a97973fd99a79c28dcd1ce1cc
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28511
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-7251 osp: do not assign commit callback to every thandle 70/17270/36
Alex Zhuravlev [Fri, 19 Oct 2012 14:18:05 +0000 (18:18 +0400)]
LU-7251 osp: do not assign commit callback to every thandle

With OSP there is a risk of getting a lot of commit callbacks.
Say, 10K unlinks/sec on 4-striped files could result in 4*10K*5
= 200K commit callbacks. This patch implements another scheme:
every OSP registers its own callback every second. This should result
in 4*5 commit callbacks in the same situation. In case of forced
sync the commit callback is registered unconditionally.
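
The arithmetic above, spelled out as a small Python check. The variable names are illustrative, and the factor of 5 is taken as-is from the message without interpreting it.

```python
STRIPES = 4               # 4-striped files
UNLINKS_PER_SEC = 10_000  # 10K unlinks/sec
FACTOR = 5                # factor of 5, copied from the commit message

# One commit callback per thandle (old scheme).
callbacks_per_thandle_scheme = STRIPES * UNLINKS_PER_SEC * FACTOR

# One callback per OSP per second (new scheme), same situation.
callbacks_per_osp_scheme = STRIPES * FACTOR
```

That is 200,000 callbacks under the old scheme against 20 under the new one, a reduction by a factor equal to the unlink rate.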

The patch removes th_tags and th_ctx from struct thandle as they
are not used anymore. This eliminates 3 allocations from every
transaction:
(lu_object.c:1714:keys_init()) kmalloced 'ctx->lc_value': 320
(update_records.c:1217:update_key_init()) kmalloced 'value': 408
(osp_dev.c:1807:osp_txn_key_init()) kmalloced 'value': 4

Change-Id: I460d5eccb585b166423d84d5c142af2e27751d8b
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/17270
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
3 years ago LU-9019 ldlm: migrate the rest of the code to 64 bit time 95/29295/4
James Simmons [Wed, 18 Oct 2017 17:26:00 +0000 (13:26 -0400)]
LU-9019 ldlm: migrate the rest of the code to 64 bit time

Replace the last cfs_time_current_sec() with ktime_get_real_seconds()
to avoid the overflow issues in 2038. Reduce the jiffies usage to the
bare minimum, which is its use for mod_timer() and
schedule_timeout(). This makes the ldlm totally 64 bit time
compliant.
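
For readers unfamiliar with the 2038 issue, the following Python sketch shows where a signed 32-bit seconds counter runs out. It is an illustration of the general overflow, not of the ldlm code; `wrap_int32` is a hypothetical helper.

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1

# Last second representable in a signed 32-bit seconds-since-epoch value.
last_second = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)


def wrap_int32(seconds: int) -> int:
    """Interpret `seconds` as a signed 32-bit value (two's complement wrap)."""
    return (seconds + 2**31) % 2**32 - 2**31
```

One second after `last_second` (2038-01-19 03:14:07 UTC) the wrapped value becomes negative, whereas a 64-bit seconds value keeps counting.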

Change-Id: Iaee92c17d51fdfc55bd26e9e813e30a6ce794856
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29295
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10015 o2iblnd: fix race at kiblnd_connect_peer 34/29134/6
Alexander Boyko [Thu, 21 Sep 2017 13:13:27 +0000 (16:13 +0300)]
LU-10015 o2iblnd: fix race at kiblnd_connect_peer

The cmid will be destroyed in OFED if kiblnd_cm_callback() returns an
error. If the error happens before the end of kiblnd_connect_peer(), it
will touch the destroyed cmid and fail with:
(o2iblnd_cb.c:1315:kiblnd_connect_peer())
            ASSERTION( cmid->device != ((void *)0) ) failed:

Seagate-bug-id: MRP-4592
Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Change-Id: I83eb5bceeb567acef0316498b936d25d6c6ccd95
Reviewed-on: https://review.whamcloud.com/29134
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-8066 obd: migrate to ksets 48/28948/11
James Simmons [Mon, 9 Oct 2017 14:39:23 +0000 (10:39 -0400)]
LU-8066 obd: migrate to ksets

Lustre's sysfs only uses kobjects but with the introduction of
ksets we can use functionality like kset_find_kobj() and uevents.
Currently lustre is layered as lustre_kobj -> class -> obd device.
This patch changes the obd devices and the top level lustre_kobj
into ksets. The class level is kept as kobjects, but they are bound
to the top level lustre kset so they are searchable and uevents can
be created for them. Also much of the class functionality can be
replaced with what ksets can do. With obd devices now ksets we
can replace lprocfs_kset_register with lprocfs_obd_setup. Some
of the sysfs attributes were not cleaned up, so proper removal was
added. Reversed what the default sysfs attributes are. This
will be needed for the replacement functionality for
class_process_proc_param().

Change-Id: I3ced5f69ace6a0a9a6bc51957f20a0caecdbafc9
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28948
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10041 osd: osd-zfs to choose dnode size 42/29242/3
Alex Zhuravlev [Thu, 28 Sep 2017 11:30:55 +0000 (14:30 +0300)]
LU-10041 osd: osd-zfs to choose dnode size

depending on dnodesize property it can be:
legacy (512 bytes), auto (512 bytes to 16K) or absolute
size (512, 1024, 2048, 4096, 8192, 16384).

Change-Id: Iea35d8ae850523440272467320410850821f484c
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/29242
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
3 years ago LU-10131 llite: Update i_nlink on unlink 51/29651/4
Patrick Farrell [Wed, 18 Oct 2017 10:24:41 +0000 (05:24 -0500)]
LU-10131 llite: Update i_nlink on unlink

Currently, the client inode link count is not updated on
last unlink.  This is fine because the dentries are all
gone and the inode is eligible for reclaim, but it's still
incorrect.  This causes two problems:

1. Inode is not immediately reclaimed
2. i_nlink count is > 0 for a fully unlinked file, which
confuses wrapfs

On last unlink, the MDT sends back attributes.  Use the
nlink count from these to update the client inode.

Remove null check inherited from ll_get_child_fid, because
the inode should never be null on an unlink.

Re-enabled test 76, which passes with this patch.
Removed slab allocator tuning from test_76, because slab is
no longer the default Linux allocator.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ib253b5cf3d35188554cf8fc33a8a3d4b8bb237e8
Reviewed-on: https://review.whamcloud.com/29651
Tested-by: Jenkins
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10089 o2iblnd: use IB_MR_TYPE_SG_GAPS 51/29551/3
Amir Shehata [Tue, 10 Oct 2017 19:26:56 +0000 (12:26 -0700)]
LU-10089 o2iblnd: use IB_MR_TYPE_SG_GAPS

When allocating fastreg buffers use IB_MR_TYPE_SG_GAPS
instead of IB_MR_TYPE_MEM_REG, since the fragments we provide
the fast registration API can have gaps. MEM_REG doesn't handle
that case.

Test-Parameters: trivial
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I36dba6f676fdbc60730aed7c50d71f2a6b7c2549
Reviewed-on: https://review.whamcloud.com/29551
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-8475 target: use slab allocation 54/21654/5
Alexander Boyko [Wed, 3 Aug 2016 08:12:19 +0000 (11:12 +0300)]
LU-8475 target: use slab allocation

The patch adds kmem slabs for target threads and session info
to improve allocation and better accounting.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: Ia0a93d410618c5d7724f2dcc86f1bcb9ae32e572
Seagate-bug-id: MRP-2836
Reviewed-on: https://review.whamcloud.com/21654
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10101 tests: correct sanity-quota call to quota_error 58/29358/2
James Nunez [Fri, 6 Oct 2017 19:53:03 +0000 (13:53 -0600)]
LU-10101 tests: correct sanity-quota call to quota_error

sanity-quota test 7e calls quota_error with a "-u"
argument. Input to quota_error should not be prefaced
with "-".

Test-Parameters: trivial testgroup=review-zfs-part-1,review-dne-part-2
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iac84e3fb2348588a157beefcf4d554b1ac3171ed
Reviewed-on: https://review.whamcloud.com/29358
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10046 misc: replace LASSERT() with CLASSERT() 56/29256/2
Andreas Dilger [Fri, 29 Sep 2017 03:17:15 +0000 (21:17 -0600)]
LU-10046 misc: replace LASSERT() with CLASSERT()

Some code consistency checks are being done at runtime with LASSERT()
when they could be done at compile time with CLASSERT(). This might
miss defects introduced into the code if that particular code path is
not exercised during testing.

Replace LASSERT() with CLASSERT() in such cases.

Style cleanup for related code.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8dca903109b7151de59afe17fe9ca311119d1b36
Reviewed-on: https://review.whamcloud.com/29256
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10141 llapi: integer overflow in llapi_changelog_start 74/29674/2
Henri Doreau [Thu, 19 Oct 2017 08:14:18 +0000 (10:14 +0200)]
LU-10141 llapi: integer overflow in llapi_changelog_start

Use the appropriate type to store and check the return value from lseek.
This prevents high offsets from being misinterpreted as errors.
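
The failure mode can be reproduced in a few lines of Python by truncating an offset to 32 bits. This is a sketch, not the llapi code: the function names are hypothetical, and it assumes a 32-bit int and a 64-bit off_t.

```python
import ctypes


def looks_like_error_int(offset: int) -> bool:
    """Model `int rc = lseek(...); if (rc < 0)` with a 32-bit int."""
    return ctypes.c_int32(offset).value < 0


def looks_like_error_off_t(offset: int) -> bool:
    """Model the same check with the result kept in a 64-bit off_t."""
    return ctypes.c_int64(offset).value < 0
```

An offset of 2 GiB (2**31) is reported as an error by the 32-bit check but not by the 64-bit one, while a genuine -1 return is still detected.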

Change-Id: I15e92be3454af20ee6611c2a7ddfc1b597d639c2
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-on: https://review.whamcloud.com/29674
Tested-by: Jenkins
Reviewed-by: Thomas LEIBOVICI <thomas.leibovici@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-10023 kernel: kernel update [SLES12 SP3 4.4.82-6.9] 60/29560/4
Bob Glossman [Fri, 22 Sep 2017 15:29:55 +0000 (08:29 -0700)]
LU-10023 kernel: kernel update [SLES12 SP3 4.4.82-6.9]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Change-Id: Ie65afb3d7356c4679f4f37f4af324e955261b5af
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/29560
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-4705 mdc: improve mdc_enqueue() error message 78/28978/3
Andreas Dilger [Wed, 13 Sep 2017 16:47:07 +0000 (10:47 -0600)]
LU-4705 mdc: improve mdc_enqueue() error message

Include the parent/child FIDs and name in the mdc_enqueue()
debug message.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I7b84921a52a4650be70fe87eea691ba2217bb3a6
Reviewed-on: https://review.whamcloud.com/28978
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9897 utils: remove libcfsutils.a and libptlctl.a 52/28752/7
James Simmons [Mon, 9 Oct 2017 14:37:14 +0000 (10:37 -0400)]
LU-9897 utils: remove libcfsutils.a and libptlctl.a

Currently lustre creates many libraries and combines them in
redundant ways. Broke up libptlctl.a and merged debug.c and
portals.c into lctl. The application lustre_rsync pulled in
way too much extra code that is not needed just for the function
obd_initialize(). Instead just call register_ioc_dev() directly
for lustre_rsync. This removes the libptlctl.a/portals.c
dependency. In time the portals.c code can be replaced by the
work from liblnetconfig. Integrated cyaml into liblnetconfig
instead of directly linking cyaml.c into lnetctl. This way
we can take advantage of YAML in the future for lnet selftest
and lustre utilities. Only a small change was needed for lnet
selftest to be dependent on liblnetconfig instead of libptlctl.a.

Change-Id: Ic1caaa01b3faedf90dc7ae8bc26ee40396a52a07
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28752
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9578 llite: use security context if it's enabled in the kernel 64/27364/4
Alex Zhuravlev [Thu, 1 Jun 2017 07:20:12 +0000 (11:20 +0400)]
LU-9578 llite: use security context if it's enabled in the kernel

If it's disabled, then Lustre stops working properly (cannot create
files, etc.).

Change-Id: I1e431ec95a2b0613b43893567eb6d1a64ec832de
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/27364
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Chris Horn <hornc@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9452 ldlm: remove MSG_CONNECT_LIBCLIENT support 72/26972/9
Andreas Dilger [Fri, 5 May 2017 23:39:48 +0000 (17:39 -0600)]
LU-9452 ldlm: remove MSG_CONNECT_LIBCLIENT support

Remove old server code that handled liblustre client connections,
marked with MSG_CONNECT_LIBCLIENT and associated code checking for
exp_libclient.  Servers will now outright refuse connections from
liblustre clients with a clear message, rather than allowing the
connection and pretending to work.  Liblustre client support was
broken and removed years ago.

There are still some liblustre remnants in the code (e.g. blocked
lock handling for LDLM_FL_CANCEL_ON_BLOCK), but that has more
complex semantics and should be removed in a separate patch.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ifbea507e82d758f849db24094c5cc0a8003ebbe5
Reviewed-on: https://review.whamcloud.com/26972
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-7990 llite: increase whole-file readahead to RPC size 55/26955/3
Andreas Dilger [Wed, 4 May 2016 05:29:42 +0000 (23:29 -0600)]
LU-7990 llite: increase whole-file readahead to RPC size

Increase the default whole-file readahead limit to match the current
RPC size.  That ensures that files smaller than the RPC size will be
read in a single round-trip instead of sending multiple smaller RPCs.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3bdb1c7f92c546d58951a9e6b783af23c83ebbe5
Reviewed-on: https://review.whamcloud.com/26955
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9405 utils: remove device path parsing from mount.lustre 09/26909/3
John L. Hammond [Mon, 1 May 2017 21:54:16 +0000 (16:54 -0500)]
LU-9405 utils: remove device path parsing from mount.lustre

In mount_utils_ldiskfs.c remove code that analyzes the device path
(/dev/mdXX, /dev/mdXXpX, /dev/mapper/XXX, /dev/loopX ...) to determine
the device type and replace with use of stat(). Locate the device's
sysfs directory using the /sys/dev/block/<major>:<minor> symlink
rather than globbing. Rename set_blockdev_tunables() to
tune_block_dev() and break into subfunctions to handle setting
md/stripe_cache_size, queue/max_sectors_kb, and queue/scheduler.
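
The stat()-based lookup can be sketched as follows. This is a Python illustration rather than the C code in mount_utils_ldiskfs.c; the helper names are hypothetical, and only the /sys/dev/block/<major>:<minor> layout is taken from the description above.

```python
import os
import stat


def sysfs_dir_for_rdev(rdev: int) -> str:
    """Build the /sys/dev/block/<major>:<minor> path for a device number."""
    return f"/sys/dev/block/{os.major(rdev)}:{os.minor(rdev)}"


def block_device_sysfs_dir(path: str):
    """Return the sysfs symlink for `path`, or None if it is not a block device.

    The device type comes from stat(), not from parsing the path name.
    """
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        return None
    return sysfs_dir_for_rdev(st.st_rdev)
```

Because the decision is based on `st_mode` and `st_rdev`, the same code handles any device naming scheme without knowing its name pattern.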

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ibd272bdf2e76bdec4c207c29dff76a96a65ea333
Reviewed-on: https://review.whamcloud.com/26909
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years ago LU-9019 ofd: migrate to 64 bit time 76/28976/9
James Simmons [Tue, 17 Oct 2017 00:40:19 +0000 (20:40 -0400)]
LU-9019 ofd: migrate to 64 bit time

Change fmd_expire and ofd_fmd_max_age to time64_t fields since we
don't need more than seconds resolution. Move several parts of the
code away from jiffies handling since it can be different across a
set of nodes. We leave l_last_used and the stats timeout code alone
since it affects more than the ofd layer.

Change-Id: I505ad9b1c553bf1769241a5920cf146595a3812c
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28976
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years ago LU-9814 ldiskfs: restore simple_strtol in prealloc 53/28553/5
Yang Sheng [Tue, 15 Aug 2017 10:42:09 +0000 (18:42 +0800)]
LU-9814 ldiskfs: restore simple_strtol in prealloc

Since kstrtol() needs a null-terminated string, go back to using
simple_strtol() in the prealloc patches.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I331308d830fbeef9c00156bb8c14b43651d66420
Reviewed-on: https://review.whamcloud.com/28553
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-4923 osd-ldiskfs: dirdata is not needed on MGS 74/29274/2
Andreas Dilger [Fri, 29 Sep 2017 21:25:04 +0000 (15:25 -0600)]
LU-4923 osd-ldiskfs: dirdata is not needed on MGS

Don't print a warning message if the "dirdata" feature is not enabled
on MGS devices.  It is only needed for ldiskfs MDT devices.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I1d8ccfc9c60eff128b480ea8efa298c1212c041a
Reviewed-on: https://review.whamcloud.com/29274
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9782 osd-ldiskfs: avoid extra search 45/28145/8
Alexey Lyashkov [Thu, 20 Jul 2017 09:06:18 +0000 (14:36 +0530)]
LU-9782 osd-ldiskfs: avoid extra search

The extent tree grows greatly during a random IO test with a small block size.
osd_is_mapped() is responsible for large CPU consumption in this case.

|          |
|          |--94.49%-- ldiskfs_es_find_delayed_extent_range
|          |          ldiskfs_fiemap
|          |          osd_is_mapped
|          |          osd_declare_write_commit
|          |
|          |--5.49%-- ldiskfs_fiemap
|          |          osd_is_mapped
|          |          osd_declare_write_commit
|
|--21.80%-- ldiskfs_es_find_delayed_extent_range
|          |
|          |--100.00%-- ldiskfs_fiemap
|          |          osd_is_mapped

Cache the osd_is_mapped() result to avoid the extra search in the extent
tree.
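
The caching idea can be sketched in Python as below. This is an assumption about the general approach, not the actual osd-ldiskfs change: the class, the `(start, end, mapped)` tuple shape, and the lookup callback are all hypothetical.

```python
class MappedCache:
    """Remember the last extent looked up and reuse it for nearby offsets."""

    def __init__(self, lookup):
        # `lookup(offset)` is the expensive search; it is assumed to return
        # (start, end, mapped) for the extent or hole containing `offset`.
        self._lookup = lookup
        self._cached = None
        self.lookups = 0  # number of expensive searches performed

    def is_mapped(self, offset: int) -> bool:
        cached = self._cached
        if cached is None or not (cached[0] <= offset < cached[1]):
            self.lookups += 1
            cached = self._cached = self._lookup(offset)
        return cached[2]
```

Repeated queries that fall inside the cached range skip the expensive search, which is where the profile above shows the time going.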

Seagate-bug-id: MRP-4474
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Change-Id: I63d480bfc7c6b7599b80ceeec9447b227a1610c8
Reviewed-on: https://review.whamcloud.com/28145
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-4134 obdclass: obd_device improvement 45/8045/30
Alexander Boyko [Thu, 15 Jun 2017 14:25:58 +0000 (17:25 +0300)]
LU-4134 obdclass: obd_device improvement

The patch removes self exports from the obd's reference counting,
which avoids the freeing of self exports by the zombie thread.
The pair of functions class_register_device()/class_unregister_device()
makes sure that an obd cannot be referenced again once its
refcount has reached 0. For target_handle_connect(), take a reference
on the obd_device while finding it by name.

Fix grant mismatch message
"tot_granted 4194304 != fo_tot_granted".

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Signed-off-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Seagate-bug-id: MRP-2139 MRP-3267
Change-Id: I9cc6860431c6bb7db6983e0d15a5d3d2b564265e
Reviewed-on: https://review.whamcloud.com/8045
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9990 lnet: add backwards compatibility for YAML config 33/29333/4
Amir Shehata [Thu, 12 Oct 2017 23:43:08 +0000 (19:43 -0400)]
LU-9990 lnet: add backwards compatibility for YAML config

In 2.10 YAML configuration had a numa block, which was
removed post 2.10. The YAML parser needs to continue handling
the numa block to maintain backwards compatibility.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ic6ff6033631c5bec82323d9f8d5d4f2a19fd8d1b
Reviewed-on: https://review.whamcloud.com/29333
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9968 tests: correct stripe index sanity 300g 35/28935/3
James Nunez [Mon, 11 Sep 2017 21:16:42 +0000 (15:16 -0600)]
LU-9968 tests: correct stripe index sanity 300g

In sanity test 300g, we set the starting stripe index to MDT 1
and MDT 2 using 'lfs setstripedir -iN' for N = 1 and 2. At the
beginning of the test, we check that there are two or more MDTs
in the Lustre file system being tested. If there are only two MDTs,
then this test will fail when we try to set the starting stripe
index to 2 because the MDT indexes are zero based.

For sanity test 300g only use MDT start index 0 and 1.
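
The off-by-one is easy to see in a short Python sketch (`valid_mdt_indexes` is a hypothetical helper, not part of the test framework):

```python
def valid_mdt_indexes(mdt_count: int) -> list:
    """MDT indexes are zero based, so N MDTs have indexes 0 .. N-1."""
    return list(range(mdt_count))
```

With the minimum of two MDTs the valid start indexes are 0 and 1, so a start index of 2 only exists once there are at least three MDTs.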

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: If2c252ad9bb7249aa777764f212ee40523aee82f
Reviewed-on: https://review.whamcloud.com/28935
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
3 years ago LU-9860 tests: Run command on MGS for conf-sanity 33a 78/28478/7
James Nunez [Fri, 11 Aug 2017 00:57:53 +0000 (18:57 -0600)]
LU-9860 tests: Run command on MGS for conf-sanity 33a

When conf-sanity test 33a is run on a Lustre configuration
with separate MGS and MDS, the 'lctl set_param' command for
timeout must be run on the MGS.

For the same test, adding the '--mgs' flag when formatting
the MDS of the second file system should be based on
if there is a combined or separate MDS and MGS.

Test-Parameters: combinedmdsmgs=false testlist=conf-sanity

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Iadb9e0e3ab4f64edba2c0bbc938e64ff3bce9468
Reviewed-on: https://review.whamcloud.com/28478
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9956 kernel: kernel upgrade [SLES12 SP3 4.4.82-6.3] 10/28910/9
Bob Glossman [Thu, 7 Sep 2017 19:37:27 +0000 (12:37 -0700)]
LU-9956 kernel: kernel upgrade [SLES12 SP3 4.4.82-6.3]

Minor linux version upgrade, but SP2 and SP3 use linux 4.4 versions.
Some new kernel patches, a few revised ldiskfs patches.
All new target and config files.
Some autoconf changes to adapt to new or altered kernel APIs.

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Change-Id: I99e2b6848197ea19402fa415fdb562d03e87d947
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-on: https://review.whamcloud.com/28910
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years ago LU-9140 nrs: measure the runtime of dd directly 78/27878/10
Qian Yingjin [Thu, 29 Jun 2017 09:03:44 +0000 (17:03 +0800)]
LU-9140 nrs: measure the runtime of dd directly

This patch changes the way to measure the runtime of "dd". Instead
of parsing the output of "dd", we use date command to calculate
the runtime of dd directly, avoiding the parsing failure caused
by changed output format of "dd".

Change-Id: Ibd2e3963f791404ee927981238227012cf4dbf2c
Test-Parameters: trivial testlist=sanityn
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/27878
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10119 scripts: Correct shebang/hashpling format 05/29605/2
Chris Horn [Fri, 13 Oct 2017 18:58:42 +0000 (13:58 -0500)]
LU-10119 scripts: Correct shebang/hashpling format

Shebang/hashpling should not have a space between the number sign and
exclamation mark.

Test-Parameters: trivial
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I4a31d7dd7579bfaf231284c678e4100acea77d9b
Reviewed-on: https://review.whamcloud.com/29605
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-8344 test: fix sanity 256 98/29598/3
Alexander Boyko [Fri, 13 Oct 2017 08:57:49 +0000 (04:57 -0400)]
LU-8344 test: fix sanity 256

The test error
 Changelog catalog has wrong number of slots 1
is a result of the debugfs dump happening before previous
changes were committed to disk.

The patch adds mds sync before debugfs command.
Also it fixes temp file removal.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Change-Id: Ic14a58956642f419b0f6d695027f88a0cad9fd39
Test-Parameters: trivial testlist=sanity
Reviewed-on: https://review.whamcloud.com/29598
Reviewed-by: Sergey Cheremencev <cherementsev@gmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years agoLU-10029 osd-ldiskfs: make project inherit attr removeable 89/29189/5
Wang Shilong [Mon, 25 Sep 2017 12:35:26 +0000 (20:35 +0800)]
LU-10029 osd-ldiskfs: make project inherit attr removeable

Inherit attribute should be clearable now

Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: I878fde0dc134c9820436ee80979d87e6dacfb70d
Reviewed-on: https://review.whamcloud.com/29189
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9416 hsm: add kkuc before sending registration RPCs 51/28751/5
Henri Doreau [Wed, 23 Aug 2017 15:16:25 +0000 (17:16 +0200)]
LU-9416 hsm: add kkuc before sending registration RPCs

This avoids a situation where the registration completes and the CDT
sends HSM actions just before the kkuc registration happens. In this
case the client drops the actions because there are no CT pipes in the
kkuc list.

Change-Id: Icbd6575f04c0ca7e8f731ee8481ec72a9ff4f2e1
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-on: https://review.whamcloud.com/28751
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9752 man: Reference zgenhostid instead of genhostid 27/29327/2
Nathaniel Clark [Thu, 5 Oct 2017 11:43:26 +0000 (07:43 -0400)]
LU-9752 man: Reference zgenhostid instead of genhostid

In ZFS 0.7.0, they added zgenhostid(8) to be used in place of
Redhat's genhostid, so that there would be a platform agnostic
way to generate /etc/hostid.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I691266d04f91d5fa7c50b72948c801afa69d647d
Reviewed-on: https://review.whamcloud.com/29327
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-9469 ldiskfs: add additional attach_jinode call 65/28665/17
Bob Glossman [Wed, 23 Aug 2017 18:51:24 +0000 (11:51 -0700)]
LU-9469 ldiskfs: add additional attach_jinode call

In some execution paths jinode data structures are not
getting initialized. Add an extra init call to fix that.

Test-Parameters: clientdistro=sles12sp2 testlist=conf-sanity \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I087eb06b9c5122be1cfd0aabbc04ea1db7ec765a
Reviewed-on: https://review.whamcloud.com/28665
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9908 tests: force umount client in test 70e, 41b, and 105 67/28767/10
Yang Sheng [Mon, 28 Aug 2017 19:30:13 +0000 (03:30 +0800)]
LU-9908 tests: force umount client in test 70e, 41b, and 105

In test_70e, the import state may not be updated while the
MDS is stopping. Since statfs is invoked on sles12 before
unmounting, umount with the force flag to avoid waiting a
long time.
Add -f to the stopall call in test-framework for the same reason,
so that it also does the umount with the force flag.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I3d1e73b3501e98008ef18c05f7b5498d12cb46fb
Reviewed-on: https://review.whamcloud.com/28767
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9983 ko2iblnd: allow for discontiguous fragments 90/29290/5
John L. Hammond [Mon, 2 Oct 2017 21:23:15 +0000 (16:23 -0500)]
LU-9983 ko2iblnd: allow for discontiguous fragments

In the IOVEC case the buffers passed to the LND may not span
complete pages, therefore the RDMA descriptor needs to describe
all the buffers.  Moreover for the FMR case, the addresses that get
set in the RDMA descriptor need to be relative addresses. This
issue was exposed after:
LU-9026 o2iblnd: Adapt to the removal of ib_get_dma_mr()

Fastreg still expects only one fragment with the total nob.
Otherwise there is a dump_cqe error from MLX5

Test-Parameters: trivial
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ie1cf52677f65af83357a3dd25fd1a45f3466a96e
Reviewed-on: https://review.whamcloud.com/29290
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10051 build: Build with ZFS 0.7.2 72/29272/2
Nathaniel Clark [Fri, 29 Sep 2017 20:51:01 +0000 (16:51 -0400)]
LU-10051 build: Build with ZFS 0.7.2

Update ZFS and SPL version to 0.7.2
Changelog: https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.7.2

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I7043503a9d00be8db13acf2239d22549a16a728c
Reviewed-on: https://review.whamcloud.com/29272
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9158 test: Use project ID for project quota for quota_scan 57/28957/4
Wei Liu [Tue, 12 Sep 2017 19:30:14 +0000 (12:30 -0700)]
LU-9158 test: Use project ID for project quota for quota_scan

Use project ID instead of quota user for project quota in
function quota_scan. This does not fix the rebalancing error;
it only corrects the test script issue.

Change-Id: I9165ada17de0f32ac38f720a5e9e9be46363b41f
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/28957
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9660 ptlrpc: do not wakeup every second 76/28776/7
Alex Zhuravlev [Tue, 29 Aug 2017 12:08:42 +0000 (15:08 +0300)]
LU-9660 ptlrpc: do not wakeup every second

Even if there are no RPC requests on the set, there is no need to
wake up every second. The thread is woken up when a request is added
to the set or when the STOP bit is set, so it is sufficient to only
wake up when there are requests on the set to worry about.

Change-Id: Iac01d8c46e8645ecb6303ce72e0f6c59f16dcd5d
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/28776
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-5170 lfs: Standardize error messages in lfs_setstripe() 49/28049/5
Steve Guminski [Wed, 21 Jun 2017 12:28:35 +0000 (08:28 -0400)]
LU-5170 lfs: Standardize error messages in lfs_setstripe()

Error and warning messages in lfs_setstripe() are updated to a
standard format.  Messages are prefixed with the name of the utility
and the command that caused the error.  User-provided values are
delimited with single quotes.

Test-Parameters: trivial
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I578e1adb94736a3d22aee4a85a3d7994fc78c6f0
Reviewed-on: https://review.whamcloud.com/28049
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9741 test: Correct check of stripe count for directories 80/27980/3
James Nunez [Mon, 21 Aug 2017 23:43:19 +0000 (17:43 -0600)]
LU-9741 test: Correct check of stripe count for directories

With the progressive file layout feature, the 'lfs getstripe -d'
output will report stripe count for each component. Since
sanity test 27w checks that there is only one "stripe_count"
line in the 'lfs getstripe -d' output, this needs to be changed
to check that there is at least one "stripe_count" reported.
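
The change in the check can be sketched in C as follows. This is only an
illustration, not the shell code used by sanity test 27w: the helper names
are hypothetical, and the sketch counts occurrences of the string rather
than matching lines, which is equivalent only when each line contains the
string at most once.

```c
#include <assert.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in text. */
int count_occurrences(const char *text, const char *needle)
{
	int count = 0;
	size_t len = strlen(needle);
	const char *p;

	assert(len > 0);
	for (p = strstr(text, needle); p != NULL; p = strstr(p + len, needle))
		count++;
	return count;
}

/* Old style of check: passes only when exactly one "stripe_count" is seen. */
int check_exactly_one(const char *output)
{
	return count_occurrences(output, "stripe_count") == 1;
}

/* New style of check: passes when at least one "stripe_count" is seen. */
int check_at_least_one(const char *output)
{
	return count_occurrences(output, "stripe_count") >= 1;
}
```

With a layout that reports one stripe count per component, the first check
rejects the output while the second accepts it.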

Test-Parameters: trivial
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I51f141b2542e65d1bc296cb2cb14c12f22afdcec
Reviewed-on: https://review.whamcloud.com/27980
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
3 years agoLU-9672 gss: fix expiration time of sunrpc cache 67/27667/9
Sebastien Buisson [Mon, 2 Oct 2017 20:00:52 +0000 (16:00 -0400)]
LU-9672 gss: fix expiration time of sunrpc cache

Expiration time of sunrpc cache is misinterpreted. Downcall
and response from user space must provide an epoch time,
not a duration.
On the kernel side, expiry must always be expressed in seconds
from boot, as set when retrieved from get_expiry().
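
The two time bases can be sketched in userspace C as follows. This is a
minimal illustration under stated assumptions, not the Lustre GSS code: all
function names are hypothetical, and the conversion simply subtracts the
boot time (as an epoch value) from the epoch expiry.

```c
#include <assert.h>
#include <stdint.h>

/* User-space side: turn a lifetime (a duration) into an absolute epoch expiry. */
int64_t expiry_epoch_from_lifetime(int64_t now_epoch, int64_t lifetime)
{
	assert(lifetime >= 0);
	return now_epoch + lifetime;
}

/* Kernel side: re-express an epoch expiry as seconds since boot. */
int64_t expiry_since_boot(int64_t expiry_epoch, int64_t boot_epoch)
{
	return expiry_epoch - boot_epoch;
}

/* An entry is expired once the uptime reaches its boot-relative expiry. */
int entry_expired(int64_t expiry_boot, int64_t uptime)
{
	return uptime >= expiry_boot;
}
```

For example, with a node that booted at epoch 1500000000 and has been up
for 3600 seconds, a 600 second lifetime gives an epoch expiry of 1500004200
and a boot-relative expiry of 4200. If the bare duration 600 were instead
taken as the boot-relative expiry, the entry would already look expired.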

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I35c58a040a62410374dee0be3ae5bed7956cd985
Reviewed-on: https://review.whamcloud.com/27667
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9590 tests: remove replay-single tests from ALWAYS_EXCEPT 04/27404/5
dilip krishnagiri [Thu, 17 Aug 2017 16:10:25 +0000 (10:10 -0600)]
LU-9590 tests: remove replay-single tests from ALWAYS_EXCEPT

Removing replay-single tests
61d "error in llog_setup should cleanup the llog context correctly"
73b "open(O_CREAT), unlink, replay, reconnect at open_replay reply, close"
89 "no disk space leak on late ost connection"
from ALWAYS_EXCEPT list.

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: Iead44525d19fa09b44e49486fabdc0487eff1d10
Reviewed-on: https://review.whamcloud.com/27404
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9462 doc: update lfs setstripe man page and usage 66/27066/8
Andreas Dilger [Thu, 11 May 2017 10:43:14 +0000 (04:43 -0600)]
LU-9462 doc: update lfs setstripe man page and usage

Update the lfs-setstripe.1 man page formatting and content.
Update the "lfs setstripe" usage message to be in "common use" order.

Test-Parameters: trivial
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: Ia761c7562fa398a8c1bb4354d09757cd5f3ebbe5
Reviewed-on: https://review.whamcloud.com/27066
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8721 tests: add parallel-scale fio test 26/23226/4
Elena Gryaznova [Thu, 25 May 2017 14:59:28 +0000 (17:59 +0300)]
LU-8721 tests: add parallel-scale fio test

Patch adds parallel-scale fio test.

Fio (https://git.kernel.org/pub/scm/linux/kernel/git/axboe/fio.git)
is a flexible generator of I/O of various modes: memory mapped,
multithreaded, asynchronous read/write I/O, etc.
Any fio job file, including those created by third parties, can be
supplied via $fio_jobFile.
By default, a random 128 MB write is performed.

Test-Parameters: trivial envdefinitions=ONLY=fio testlist=parallel-scale
Seagate-bug-id: MRP-3707
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Change-Id: I36b471e47e366caae5709f109c28ba57c89f22c7
Reviewed-on: https://review.whamcloud.com/23226
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10047 tests: stop skipping test_102 subtests 88/29288/7
Bob Glossman [Fri, 29 Sep 2017 22:27:12 +0000 (15:27 -0700)]
LU-10047 tests: stop skipping test_102 subtests

Remove obsolete version check on tar.
All tar versions in supported distros are now capable of --xattrs.
However, not all tar versions have --xattrs-include, so the test must
check for it and adapt.

Test-Parameters: trivial envdefinitions=ONLY=102 testlist=sanity
Test-Parameters: trivial clientdistro=el6.9 envdefinitions=ONLY=102 testlist=sanity
Test-Parameters: trivial clientdistro=sles12sp2 envdefinitions=ONLY=102 testlist=sanity

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I5e76bd1a762c4e01cf8a3a33789ca3a30c15abb0
Reviewed-on: https://review.whamcloud.com/29288
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoNew tag 2.10.54 2.10.54 v2_10_54 v2_10_54_0
Oleg Drokin [Thu, 12 Oct 2017 05:29:01 +0000 (01:29 -0400)]
New tag 2.10.54

Change-Id: I1523714ee6e80ffb1e80c1bf6464e5e4c98b7f62
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoRevert "LU-9810 lnet: prefer Fast Reg"
Oleg Drokin [Thu, 12 Oct 2017 05:27:15 +0000 (01:27 -0400)]
Revert "LU-9810 lnet: prefer Fast Reg"

This patch seems to be breaking local mounts as documented in
LU-10068. While this might be an mlx5 problem and not this patch
per se, we need to have the repo in working order.

This reverts commit 8f0d0f052a516a5dd3e588ced6b49c840584855c.

Change-Id: Ib54a958018875b368410e98863a5199bf1b89604
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10011 utils: suppress annoying messages for project quota 07/29107/8
Wang Shilong [Tue, 19 Sep 2017 13:10:48 +0000 (21:10 +0800)]
LU-10011 utils: suppress annoying messages for project quota

See following output:

[wsl@w003 ~]$ lfs quota /lustre1
Disk quotas for usr wsl (uid 14434):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /lustre1       0       0       0       -       0       0       0       -
Disk quotas for grp se (gid 1000):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /lustre1   44028       0       0       -      40       0       0       -
Unexpected quotactl error: Operation not supported
Disk quotas for prj <unknown> (pid 0):
      Filesystem  kbytes   quota   limit   grace   files   quota   limit   grace
        /lustre1     [0]     [0]     [0]       -     [0]     [0]     [0]       -
Some errors happened when getting quota info. Some devices may be not
working or deactivated. The data in "[]" is inaccurate.

Since project quota is disabled by default, messages about
unsupported project quota are annoying here.

Even if project quota is enabled, project ID 0 will be output here,
which does not make much sense either. Instead, lfs quota <mnt> will
only output the current user/group quota information by default.

Change-Id: Ie76ba14c4c4486a9246aafed0e8538eaae85ee98
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/29107
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9967 tests: run-llog to cleanup properly 57/29057/2
Alex Zhuravlev [Mon, 18 Sep 2017 20:17:29 +0000 (23:17 +0300)]
LU-9967 tests: run-llog to cleanup properly

Move ignore_errors so that setup's error is ignored and the script
continues with the detach command needed to release the module.

Change-Id: I899de7a321fa927ca9884a6e4f9e8785bbc46f7c
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/29057
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9996 build: include MOFED lib 20/29020/3
Minh Diep [Mon, 11 Sep 2017 17:02:19 +0000 (10:02 -0700)]
LU-9996 build: include MOFED lib

Test-Parameters: trivial

Change-Id: Ia1b5a97f59d5055311786934c9165a81f2af7cae
Signed-off-by: Minh Diep <minh.diep@intel.com>
Reviewed-on: https://review.whamcloud.com/29020
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9978 kernel: kernel update RHEL7.4 [3.10.0-693.2.2.el7] 99/28999/5
Bob Glossman [Thu, 14 Sep 2017 13:24:28 +0000 (06:24 -0700)]
LU-9978 kernel: kernel update RHEL7.4 [3.10.0-693.2.2.el7]

update RHEL 7.4 kernel to 3.10.0-693.2.2.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I2fe4e038f185f8d4acec74d9cc398da987bcad6b
Reviewed-on: https://review.whamcloud.com/28999
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-7988 hsm: split mdt_hsm_add_actions() 72/20272/32
Frank Zago [Tue, 3 May 2016 20:13:05 +0000 (16:13 -0400)]
LU-7988 hsm: split mdt_hsm_add_actions()

Split a portion of mdt_hsm_add_actions() to the new function
mdt_hsm_process_hal() in order to re-use it in a subsequent patch.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I32785c549a744592d4e84787fd16d12e7ffd7322
Reviewed-on: https://review.whamcloud.com/20272
Tested-by: Jenkins
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9995 lfsck: keep the LMV_HASH_FLAG_DEAD flag 30/29130/4
Fan Yong [Tue, 26 Sep 2017 04:41:21 +0000 (12:41 +0800)]
LU-9995 lfsck: keep the LMV_HASH_FLAG_DEAD flag

Since Lustre 2.8, the flag LMV_HASH_FLAG_DEAD is not needed; instead
the DEAD and orphan flags are stored in the LMA (see LMAI_ORPHAN).
At that time, we kept the flag so that LFSCK could handle filesystems
upgraded from an old release (Lustre 2.7 or older). Lustre 2.11 may
still need to support upgrading from Lustre 2.7 (mainly for the
EE users), so we will keep the flag for a while longer.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I11c1fa5a354d5f244514c10e3ca2c9e952eef975
Reviewed-on: https://review.whamcloud.com/29130
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-8400 mdd: remove OBD_IOC_GET_MNTOPT 77/28777/2
Henri Doreau [Tue, 29 Aug 2017 13:13:13 +0000 (15:13 +0200)]
LU-8400 mdd: remove OBD_IOC_GET_MNTOPT

This ioctl was only used between MDT and MDD.

Replace it, as well as mdo_maxeasize_get, with a generic mdo_dtconf_get
function.

Change-Id: Idb080007a0b7311d6fc95f06cf24db8e9c3265d6
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-on: https://review.whamcloud.com/28777
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9855 obdclass: Code cleanup 58/28458/9
Ben Evans [Mon, 7 Aug 2017 19:41:03 +0000 (14:41 -0500)]
LU-9855 obdclass: Code cleanup

Remove uuid.c, and rewrite class_uuid_unparse
Remove always-on #defines and #ifdefs
Unwrap DECLARE_LU_VARS and remove #define
Rewrite CL_ENV_INC and DEC
Rewrite cs_page_inc/dec and cs_pagestate_inc/dec
Unwrap lustre_(get|put)_group_info #defines
Remove D_KUC #define

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I1ebaa47dbf749cedcf3d33028906a35caf7db694
Reviewed-on: https://review.whamcloud.com/28458
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8066 lnet: port lnet router to debugfs 30/26430/8
Oleg Drokin [Tue, 22 Aug 2017 17:22:02 +0000 (13:22 -0400)]
LU-8066 lnet: port lnet router to debugfs

Move all the lnet procfs variables to debugfs.

Linux-commit: b03f395a3e9bf4874fb58f4fe6033866d3b9f105

This brings the OpenSFS branch into sync with the upstream
Lustre client. Technically debugfs is not the proper
place for stats, but because the upstream client is
used on production systems we have to keep them :-(
New work will be done in the future to properly handle
stats using sysfs instead.

Test-Parameters: trivial envdefinitions=ONLY=215 testlist=sanity

Change-Id: Id9244d8525f0844d321f29af4a01e1fbce8e5884
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/26430
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9019 ptlrpc: migrate pinger to 64 bit time 35/28035/12
James Simmons [Wed, 13 Sep 2017 15:44:56 +0000 (11:44 -0400)]
LU-9019 ptlrpc: migrate pinger to 64 bit time

Replace cfs_time_current_sec() with ktime_get_real_seconds()
to avoid the overflow issues in 2038. Remove the
cfs_timeout_cap() and CFS_DURATION abstractions. Change
imp_next_ping, obd_eviction_timer, and ti_timeout to
time64_t. With these changes the pinger will be 64
bit time compliant.
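
The overflow being avoided can be sketched in userspace C. This is not the
pinger code: the helper names are hypothetical, and int64_t stands in for
the kernel's time64_t. The largest value a signed 32-bit seconds counter can
hold is 2147483647, which corresponds to 2038-01-19 03:14:07 UTC.

```c
#include <assert.h>
#include <stdint.h>

/* Would this number of seconds still fit in a signed 32-bit time value? */
int fits_time32(int64_t seconds)
{
	return seconds >= INT32_MIN && seconds <= INT32_MAX;
}

/* Compute the next ping deadline using 64-bit seconds throughout. */
int64_t next_ping_time64(int64_t now, int64_t interval)
{
	assert(interval >= 0);
	return now + interval;
}
```

For instance, a deadline computed 25 seconds after time 2147483640 is
2147483665, which is representable in 64 bits but no longer fits in a
32-bit time value.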

Change-Id: I447dfa5d47ce947a5afa7203326b6486b8855912
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28035
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-9904 lnet: reduce logging severity 26/29026/2
Amir Shehata [Fri, 15 Sep 2017 23:55:39 +0000 (16:55 -0700)]
LU-9904 lnet: reduce logging severity

On shutdown a push event can be triggered for a
non-existent peer. Reduce the severity of the log message.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I88afafbd9a5bb2baa8c6ced7e6428af8cde2fdd2
Reviewed-on: https://review.whamcloud.com/29026
Tested-by: Jenkins
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9992 lnet: don't discover loopback interface 07/29007/2
Amir Shehata [Thu, 14 Sep 2017 22:49:38 +0000 (15:49 -0700)]
LU-9992 lnet: don't discover loopback interface

Messages destined to the loopback interface should always go
over the loopback interface. To achieve that there is no real
need to initiate discovery on the loopback. This results in a
non-MR peer being created for the loopback, which makes sense:
when sending messages to ourselves we do not want to use the
different interfaces, but rather keep sending over the lolnd.
In effect this is a special case where we want to behave as a
non-MR node.

When sending a message destined for the loopback interface
there is no need to go through the selection process, it
is sufficient to shortcut all the MR logic and send directly
over the lolnd.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I821a74e9dbe35481a0168389b857f07397cee126
Reviewed-on: https://review.whamcloud.com/29007
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9931 tests: : fix REQFAIL calculation 97/28797/4
Elena Gryaznova [Wed, 30 Aug 2017 13:53:47 +0000 (16:53 +0300)]
LU-9931 tests: : fix REQFAIL calculation

REQFAIL is the number of times that a sleep is allowed to be
less than $MINSLEEP before the test is considered a fail.
The result of
  "DURATION / SERVER_FAILOVER_PERIOD * REQFAIL_PERCENT / 100"
may not be an integer (165.6) and test fails with :
  "Failed to load with for a minimum
  period of 166 times ( REQFAIL=165 )".
Patch rounds up REQFAIL value for such case.
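
The rounding can be sketched in C with integer arithmetic. This is an
illustration only, not the shell code from the test scripts: the function
names are hypothetical, and the sketch assumes non-negative inputs and that
the division by the failover period happens first, as written in the
formula above.

```c
#include <assert.h>

/* Truncating form: plain integer arithmetic drops the fractional part. */
long reqfail_truncated(long duration, long failover_period, long percent)
{
	assert(failover_period > 0);
	return duration / failover_period * percent / 100;
}

/* Rounded-up form: adding 99 before dividing by 100 gives the ceiling. */
long reqfail_rounded_up(long duration, long failover_period, long percent)
{
	assert(failover_period > 0);
	return (duration / failover_period * percent + 99) / 100;
}
```

With hypothetical inputs of duration 331200, failover period 600, and 30
percent, the exact value is 165.6: the truncating form yields 165 and the
rounded-up form yields 166. When the exact value is already an integer,
both forms agree.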

Test-Parameters: clients=3 testlist=recovery-mds-scale,recovery-random-scale
Seagate-bug-id: MRP-4412
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Reviewed-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Change-Id: I9234de09bc156b1580cab01fcae80a57722fa9d7
Reviewed-on: https://review.whamcloud.com/28797
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Arshad Hussain <arshad.hussain@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9214 llite: enable readahead for small read_ahead_per_file 96/25996/7
Erich Focht [Wed, 15 Mar 2017 09:51:29 +0000 (10:51 +0100)]
LU-9214 llite: enable readahead for small read_ahead_per_file

Fixes for a regression introduced by http://review.whamcloud.com/19368
for the case that max_read_ahead_per_file_mb is smaller than
max_pages_per_rpc. With 16MB RPCs this happens pretty easily. In that
case the readahead window stayed zero and the backend saw only
requests of the size of the user IOs.

This patch restores the previous behavior for this corner case while
keeping the fix for large RPCs introduced by the alignment. When
max_read_ahead_per_file_mb is smaller than max_pages_per_rpc the
RPC size will not be optimal, but will be at least 1MB and the
readahead window will be as large as expected instead of zero.

Change-Id: Ie8f4da90da56439fd1466844c6db877849adea82
Signed-off-by: Erich Focht <efocht@hpce.nec.com>
Reviewed-on: https://review.whamcloud.com/25996
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9929 nodemap: add default ACL unmapping handling 10/29010/10
Emoly Liu [Fri, 22 Sep 2017 03:31:28 +0000 (11:31 +0800)]
LU-9929 nodemap: add default ACL unmapping handling

This patch adds default ACL unmapping code to mdt_getxattr
functions so that clients can get a correctly unmapped id.
Also, test_23b is added to sanity-sec.sh to verify this fix.

Change-Id: I6562372c58ca9772f16f7d6b0b98b45ada87971a
Test-Parameters: testlist=sanity-sec
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/29010
Tested-by: Jenkins
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9887 lfsck: calculate LFSCK speed properly 17/28617/6
Fan Yong [Tue, 19 Sep 2017 15:55:59 +0000 (11:55 -0400)]
LU-9887 lfsck: calculate LFSCK speed properly

Originally, we used do_div(a,b) to calculate the LFSCK average
speed, and got the result from the parameter @a. But later, we
replaced do_div(a,b) with div_u64(a,b). The latter does not
store the quotient in the parameter @a; instead, the quotient
is returned via the function return value. The patch fixes the
LFSCK logic to obtain the LFSCK average speed from the div_u64(a,b)
return value.
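
The difference between the two calling conventions can be sketched in
userspace C. The sketch_* functions below are stand-ins written for this
example, not the kernel's do_div() macro or div_u64() function, and the
speed_* helpers are hypothetical rather than the LFSCK code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for do_div(): divides *n in place and returns the remainder. */
uint32_t sketch_do_div(uint64_t *n, uint32_t base)
{
	uint32_t rem;

	assert(base != 0);
	rem = (uint32_t)(*n % base);
	*n /= base;
	return rem;
}

/* Stand-in for div_u64(): leaves its argument alone, returns the quotient. */
uint64_t sketch_div_u64(uint64_t dividend, uint32_t divisor)
{
	assert(divisor != 0);
	return dividend / divisor;
}

/* Buggy pattern: the quotient is discarded, so the total is returned. */
uint64_t speed_buggy(uint64_t checked, uint32_t seconds)
{
	uint64_t speed = checked;

	(void)sketch_div_u64(speed, seconds);
	return speed;
}

/* Fixed pattern: take the average speed from the return value. */
uint64_t speed_fixed(uint64_t checked, uint32_t seconds)
{
	return sketch_div_u64(checked, seconds);
}
```

With 1000 items checked in 10 seconds, the buggy pattern reports 1000
while the fixed pattern reports 100.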

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I442fb8f7e6c51a4853ea37694e3c221f97e26b19
Reviewed-on: https://review.whamcloud.com/28617
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
3 years agoLU-9933 lnet: Handle ping buffer with only loopback NID 11/28811/3
Olaf Weber [Wed, 13 Sep 2017 15:42:36 +0000 (11:42 -0400)]
LU-9933 lnet: Handle ping buffer with only loopback NID

During startup lnet_peer_data_resent() can see a ping buffer
for the local node which contains only the loopback NID. This
shows up as pi_nnis == 1, and there is nothing to be done (or
that needs to be done) in that case.

Signed-off-by: Olaf Weber <olaf.weber@hpe.com>
Change-Id: Ie5454b58f01ded17d0cff47ad5358e5d4bcebcd8
Reviewed-on: https://review.whamcloud.com/28811
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-10024 tests: sanity 120f restore stripe count option 04/29204/2
James Nunez [Mon, 25 Sep 2017 21:22:04 +0000 (15:22 -0600)]
LU-10024 tests: sanity 120f restore stripe count option

The patch for LU-3308 with commit ID c75aa6c74cd86c6 removed
the stripe count, '-c', input parameter for three calls to
test_mkdir in sanity test 120f. The stripe count parameter
should be restored so the test will work correctly.

Test-Parameters: trivial testgroup=review-dne-part-1
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I33c98259c0ee7a209afa4e9002b3aa8cab42d55c
Reviewed-on: https://review.whamcloud.com/29204
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Gu Zheng <gzheng@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
3 years agoLU-7802 ldlm: No -EINVAL for canceled != unused 60/28560/4
Patrick Farrell [Mon, 18 Sep 2017 11:05:48 +0000 (06:05 -0500)]
LU-7802 ldlm: No -EINVAL for canceled != unused

If any locks are removed from or added to the lru, the
check of "number unused vs number cancelled" may be wrong.
This is fine - do not return an error or print debug in
this case.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I2ee9b8d86fbd6c9bd2c29e3472e3d410ee303374
Reviewed-on: https://review.whamcloud.com/28560
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9379 tests: Improve get_version to handle empty return value 67/26767/9
Arshad Hussain [Fri, 28 Apr 2017 20:11:36 +0000 (01:41 +0530)]
LU-9379 tests: Improve get_version to handle empty return value

replay-vbr fails when get_version() returns an empty value for
either the pre or post variable. This patch modifies replay-vbr
get_version() to return '-1' in case getobjversion fails.

This patch also adds chk_get_version(), which is a wrapper
around get_version(). chk_get_version() helps in verifying
the return code of get_version().

Signed-off-by: Arshad Hussain <arshad.hussain@seagate.com>
Reviewed-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Parinay Kondekar <parinay.kondekar@seagate.com>
Seagate-bug-id: MRP-3589
Test-Parameters: trivial testlist=replay-vbr
Change-Id: Ief4d42d1433d890086122218ff6f53d4ea9489e1
Reviewed-on: https://review.whamcloud.com/26767
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9558 tests: add completion.h header to kinode.c 98/28998/2
James Simmons [Thu, 14 Sep 2017 13:40:56 +0000 (09:40 -0400)]
LU-9558 tests: add completion.h header to kinode.c

When building Lustre on an ARM Fedora box running a 4.11 kernel,
kinode.c failed to build because the completion.h header was missing.
Simply including this header fixes the build.

Test-Parameters: trivial

Change-Id: I2a193346b214616a6f5dee5049466fb55567e396
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28998
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9979 kernel: kernel update RHEL6.9 [2.6.32-696.10.2.el6] 67/28967/2
Bob Glossman [Tue, 12 Sep 2017 16:36:41 +0000 (09:36 -0700)]
LU-9979 kernel: kernel update RHEL6.9 [2.6.32-696.10.2.el6]

Update RHEL6.9 kernel to 2.6.32-696.10.2.el6

Test-Parameters: clientdistro=el6.9 mdsdistro=el6.9 \
  ossdistro=el6.9 mdtfilesystemtype=ldiskfs \
  ostfilesystemtype=ldiskfs testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ie86a84fda1ec391c4e0b9ab18a82d4a5b0bd25d1
Reviewed-on: https://review.whamcloud.com/28967
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9840 lod: add ldo_dir_stripe_loaded 62/28962/3
Di Wang [Wed, 13 Sep 2017 00:53:13 +0000 (00:53 +0000)]
LU-9840 lod: add ldo_dir_stripe_loaded

Add ldo_dir_stripe_loaded flag to avoid loading
stripes multiple times, especially for non-striped
directories.

Change-Id: Ia9360aac9e24706e401184c75fae4ec7f8ec46d9
Signed-off-by: Di Wang <di.wang@intel.com>
Reviewed-on: https://review.whamcloud.com/28962
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
3 years agoLU-9682 nodemap: delete nids range from nodemap correctly 22/28922/4
Emoly Liu [Fri, 15 Sep 2017 07:56:11 +0000 (15:56 +0800)]
LU-9682 nodemap: delete nids range from nodemap correctly

In function nodemap_del_range(), we should check if the current
nodemap has the specified range before deleting it from the global
range tree.
Also, test_10b is added to sanity-sec.sh to verify this patch.
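
The ownership check described above can be sketched as a toy Python model.
This is not the kernel code: the NodemapRegistry class, its dictionary-based
range table, and the exception types are all illustrative assumptions; only
the idea of checking the owner before touching the global table comes from
the commit message.

```python
class NodemapRegistry:
    """Toy model: one global range table shared by all nodemaps."""

    def __init__(self):
        # (start_nid, end_nid) -> name of the nodemap that owns the range
        self.global_ranges = {}

    def add_range(self, nodemap, start, end):
        key = (start, end)
        if key in self.global_ranges:
            raise ValueError("range already assigned")
        self.global_ranges[key] = nodemap

    def del_range(self, nodemap, start, end):
        key = (start, end)
        owner = self.global_ranges.get(key)
        if owner is None:
            raise KeyError("no such range")
        # Check ownership first; only then remove from the global table.
        if owner != nodemap:
            raise PermissionError("range belongs to nodemap %r" % owner)
        del self.global_ranges[key]
```

Without the ownership check, a delete issued against one nodemap could
silently remove a range that belongs to another.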

Change-Id: Ibab79056509d14d52f99b1ebe3319c301dbe45d9
Test-Parameters: testlist=sanity-sec
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-on: https://review.whamcloud.com/28922
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9963 test: add parallel-scale tests to ALWAYS_EXCEPT 14/28914/4
dilip krishnagiri [Fri, 8 Sep 2017 18:30:57 +0000 (12:30 -0600)]
LU-9963 test: add parallel-scale tests to ALWAYS_EXCEPT

Add the following parallel-scale test to the ALWAYS_EXCEPT list:

parallel_grouplock :
       test_parallel_grouplock: test failed to respond and timed out

The associated Jira ticket, LU-9429, is still open.

Test-Parameters: trivial testlist=parallel-scale

Change-Id: I25709af1ab49a30498a89e5369521582c5ab6cf8
Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Reviewed-on: https://review.whamcloud.com/28914
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Casper <jamesx.casper@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-8541 ldlm: don't use jiffies as sysfs parameter 70/28370/10
James Simmons [Thu, 31 Aug 2017 20:17:42 +0000 (16:17 -0400)]
LU-8541 ldlm: don't use jiffies as sysfs parameter

The ldlm sysfs file handles lru_max_age in jiffies, which is wrong
because jiffies are not consistent across machines: HZ is
configurable at compile time. Most users we talked to thought
lru_max_age was in seconds, which is incorrect. The best way to
fix this is to move lru_max_age to milliseconds, since most systems
Lustre deals with set HZ to 1000. To make it clear that the value is
in milliseconds, print out lru_max_age with "ms". Since users tend
to think in seconds, allow passing in seconds as well as milliseconds,
converting them internally to nanoseconds. Since we have to
support milliseconds, move to ktime_t since we can't use time64_t.
Unfortunately, this makes a relatively large patch, but I could
not find a way to split it up some more without breaking atomicity
of the change.
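
A small Python sketch of the unit problem, under stated assumptions. The
jiffies_to_ms() helper shows that the same jiffies count means different
wall-clock times for different HZ values. parse_max_age_ns() and
format_max_age() are hypothetical helpers: the "<n>s" / "<n>ms" input syntax
is an assumption made for illustration, not the parsing done by the patch.

```python
NSEC_PER_MSEC = 1_000_000
NSEC_PER_SEC = 1_000_000_000

def jiffies_to_ms(jiffies, hz):
    """Wall-clock milliseconds represented by a jiffies count at a given HZ."""
    return jiffies * 1000 // hz

def parse_max_age_ns(text):
    """Hypothetical parser: accept '<n>ms' or '<n>s', return nanoseconds."""
    text = text.strip()
    if text.endswith("ms"):
        return int(text[:-2]) * NSEC_PER_MSEC
    if text.endswith("s"):
        return int(text[:-1]) * NSEC_PER_SEC
    raise ValueError("expected a value ending in 'ms' or 's': %r" % text)

def format_max_age(ns):
    """Report the value in milliseconds with an explicit unit suffix."""
    return "%dms" % (ns // NSEC_PER_MSEC)

# The same raw count interpreted on kernels built with different HZ values.
same_count = 10000
ages_ms = {hz: jiffies_to_ms(same_count, hz) for hz in (100, 250, 1000)}
```

Storing the value internally in a fixed unit (nanoseconds here) and printing
it with an explicit suffix removes the dependency on the build-time HZ.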

Change-Id: I0b1814fd9d903767f62fe141d2c95845b75fb95a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28370
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9802 pfl: swapping lcm_entry_count correctly 56/28256/4
Jinshan Xiong [Thu, 27 Jul 2017 16:49:42 +0000 (09:49 -0700)]
LU-9802 pfl: swapping lcm_entry_count correctly

It's a u16 integer so it should use le16_to_cpu() instead of
le32_to_cpu().
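
A minimal sketch of why the conversion width matters, written as a Python
simulation rather than kernel code. swab16() and swab32() are illustrative
stand-ins for the byte swaps that le16_to_cpu() and le32_to_cpu() perform on
a big-endian host, and the entry count of 3 is an arbitrary example value.

```python
import struct

def swab16(x):
    """Swap the two bytes of a 16-bit value."""
    return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF)

def swab32(x):
    """Swap the four bytes of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")

# Little-endian encoding of a 16-bit entry count of 3.
on_disk = struct.pack("<H", 3)

# What a big-endian CPU sees when it loads those two bytes as a u16.
raw_on_be_host = struct.unpack(">H", on_disk)[0]

# Correct: a 16-bit swap recovers the count.
count_16 = swab16(raw_on_be_host)

# Incorrect: the u16 is widened to 32 bits and swapped at that width, which
# moves the value into the upper half; storing it in a u16 then truncates it.
count_32 = swab32(raw_on_be_host)
count_32_truncated = count_32 & 0xFFFF
```

On a little-endian host both conversions are no-ops, so a mistake of this
kind only shows up on big-endian machines.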

Test-Parameters: trivial
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I43c31a76d78aa294a3e3296a1bb69f4d6fb1423d
Reviewed-on: https://review.whamcloud.com/28256
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-8066 obd: make ldebugfs_remove recursive 18/28818/7
Oleg Drokin [Tue, 12 Sep 2017 14:24:28 +0000 (10:24 -0400)]
LU-8066 obd: make ldebugfs_remove recursive

ldebugfs_remove is usually called on directories with files passed in
as attributes, so a simple debugfs_remove fails on them as not empty.
Switch to debugfs_remove_recursive.

This fixes a number of problems where a new filesystem is mounted after
being unmounted first.

Linux-commit: 6a491f2b80f2806221ba3a5a3e26fbe945f82d83

Change-Id: I49878ea9e28365d7d834497e715eeee21e698eea
Signed-off-by: Oleg Drokin <green@linuxhacker.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28818
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9221 jobstats: Create a pid-based hash for jobid values 08/25208/26
Ben Evans [Wed, 1 Feb 2017 22:06:36 +0000 (16:06 -0600)]
LU-9221 jobstats: Create a pid-based hash for jobid values

Use cfs_hash_table to create a pid to jobID based mapping.
Change default behavior of JobIDs to default to procname_uid if
a suitable value cannot be found in the environment.

All entries older than RESCAN_INTERVAL seconds are refreshed
on access.
Items can be purged by writing to procfs_name.
"" will remove all entries
When purging the cache, items older than DELETE_INTERVAL are
deleted.
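
The caching behaviour described above can be sketched as a toy Python class.
Everything here is an illustrative assumption: the patch uses cfs_hash_table
in C, the interval values below are placeholders, and the lookup and clock
callables are hypothetical hooks added so the sketch can be exercised. The
fallback is formatted here as "procname.uid", which is also an assumption.

```python
RESCAN_INTERVAL = 30    # seconds; placeholder value, not taken from the patch
DELETE_INTERVAL = 300   # seconds; placeholder value, not taken from the patch

class JobidCache:
    """Toy pid -> jobid cache with refresh-on-access and age-based purge."""

    def __init__(self, lookup, clock):
        self._lookup = lookup      # callable(pid) -> jobid string or None
        self._clock = clock        # callable() -> current time in seconds
        self._entries = {}         # pid -> (jobid, time of last refresh)

    def get(self, pid, procname, uid):
        now = self._clock()
        entry = self._entries.get(pid)
        # Missing or older than RESCAN_INTERVAL: look the value up again.
        if entry is None or now - entry[1] > RESCAN_INTERVAL:
            jobid = self._lookup(pid)
            if not jobid:
                # Fall back to procname_uid when no suitable value is found.
                jobid = "%s.%d" % (procname, uid)
            entry = (jobid, now)
            self._entries[pid] = entry
        return entry[0]

    def purge(self):
        """Drop entries not refreshed within DELETE_INTERVAL; return count."""
        now = self._clock()
        stale = [pid for pid, (_, ts) in self._entries.items()
                 if now - ts > DELETE_INTERVAL]
        for pid in stale:
            del self._entries[pid]
        return len(stale)
```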

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I22e9d73c4585d7c5496829bc20bce191304e0d58
Reviewed-on: https://review.whamcloud.com/25208
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9574 llite: pipeline readahead better with large I/O 88/27388/4
Jinshan Xiong [Thu, 1 Jun 2017 19:53:35 +0000 (12:53 -0700)]
LU-9574 llite: pipeline readahead better with large I/O

Fix a bug where the next readahead is not set correctly when the
application issues large I/O.
Extend the readahead window length to cover at least the size of
the current I/O.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I43c5e4f25ea30d4a36263db2588bde0401122990
Reviewed-on: https://review.whamcloud.com/27388
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-3308 tests: fix sanity/sanityn test_mkdir() usage 12/26212/18
Andreas Dilger [Tue, 21 Mar 2017 03:45:28 +0000 (23:45 -0400)]
LU-3308 tests: fix sanity/sanityn test_mkdir() usage

Remove "-p" option from test_mkdir() calls that do not need it.
test_mkdir() has its own error checking, so no need for duplicate
error checking in the caller as well.

Clean up script style for tests related to test_mkdir() changes:
- use $(...) instead of `...` for subshells
- use $tdir and $tfile for test filenames
- add useful messages to error() calls
- replace use of $SETSTRIPE wrapper with $LFS setstripe
- remove trailing "===" from test names
- use tabs for indentation instead of spaces

Combine sanity test_99[a-f] into test_99 to avoid duplicate checks.

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Steve Guminski <stephenx.guminski@intel.com>
Change-Id: I38d47f0c2e18aa20a0468f354ed88b740b3e17b8
Reviewed-on: https://review.whamcloud.com/26212
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9960 osd-zfs: don't auto-upgrade quota 24/28924/3
Nathaniel Clark [Mon, 11 Sep 2017 14:14:18 +0000 (10:14 -0400)]
LU-9960 osd-zfs: don't auto-upgrade quota

To preserve the ability to down-grade from 0.7.x to 0.6.x,
don't auto-upgrade quotas.
Print a warning if quotas haven't been upgraded when mounting with 0.7.0.
Do check based on zpool feature in sanity-quota instead of just
version.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I2b0dcba3a230c9b2dec3d07d1b4ca6f1a1717d47
Reviewed-on: https://review.whamcloud.com/28924
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
3 years agoLU-6179 llite: Implement ladvise lockahead 64/13564/102
Patrick Farrell [Thu, 14 Sep 2017 15:24:50 +0000 (10:24 -0500)]
LU-6179 llite: Implement ladvise lockahead

Ladvise lockahead is a new feature allowing userspace to
request extent locks in advance of the IO which will use
them. These locks are not expanded beyond the size
requested by userspace.  They are intended to make it
possible to address lock contention between multiple
clients resulting from lock expansion.  They should allow
optimizing various IO patterns, notably strided writing.
(Further information in LU-6179)

Asynchronous glimpse locks are a speculative version of
glimpse locks, and already implement the required behavior.
Lockahead requests share this behavior.

Additionally, lockahead creates extent locks in advance
of IO, and so breaks the assumption that the holder of the
highest lock knows the current file size.

So we also modify the ofd_intent_policy code to glimpse
PW locks until it finds one it knows to be in use, taking
care to send only one glimpse to each client.

The current patch allows asynchronous non-blocking lock
ahead requests and synchronous blocking requests.  We
cannot do asynchronous blocking requests, because of
deadlocks that occur in having ptlrpcd threads handle
blocking lock requests.

Finally, this patch also adds another advice to disable
lock expansion, setting a per-file descriptor flag.  This
allows user space to control whether or not lock requests
on this file descriptor will undergo lock expansion.

This means if lockahead locks are not created ahead of IO
(due to inherent raciness) or are cancelled by a competing
IO request, the IO requests that should have used the
manually requested locks will not result in expanded locks.
This avoids lock ping-pong, and because the resulting locks
will not extend to the end of the file, future lockahead
requests can be granted.  Effectively, this means that if
lockahead usage for strided IO is interrupted by a
competing request, it can re-assert itself.

Lockahead is implemented via the ladvise interface from
userspace.  As lockahead results in a DLM lock request
rather than file advice, we do not use the lower levels of
the ladvise implementation.

Note this patch has one oddity:
Cray released an earlier version of lockahead without
FL_SPECULATIVE support.  That version uses
OBD_CONNECT_LOCKAHEAD_OLD, this new one uses
OBD_CONNECT_LOCKAHEAD.

The client code in this patch is interoperable with that
version, so it also advertises OBD_CONNECT_LOCKAHEAD_OLD
support, but the server version is not, so the server
advertises only OBD_CONNECT_LOCKAHEAD support.

Client support for the original lockahead is slated for
removal after the release of 2.12.  This is enforced with
a compile time version test that will remove support.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I1e80286f54946a0df08b19b1339829fcfd1117e7
Reviewed-on: https://review.whamcloud.com/13564
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9966 test: add a skip test to test_411 74/28974/5
Bob Glossman [Sun, 10 Sep 2017 15:15:32 +0000 (08:15 -0700)]
LU-9966 test: add a skip test to test_411

Since the recently added test_411 needs a /sys entry that doesn't exist
in sles12, extend the existing skip logic to skip the test when
the entry is missing.

Test-Parameters: trivial clientdistro=sles12sp2 \
  testgroup=review-ldiskfs

Change-Id: I1f1bf05affdc1cec9957624506dac65e59f5b4ad
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/28974
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
3 years agoLU-9980 tests: save specific facet in save_lustre_params() 63/28963/2
Elena Gryaznova [Wed, 13 Sep 2017 05:58:07 +0000 (22:58 -0700)]
LU-9980 tests: save specific facet in save_lustre_params()

In save_lustre_params(), while there are multiple server facets
having the same host, and the parameter has wildcard, duplicate
parameters with wrong facets will be saved.

This patch fixes the above issue by grepping the service name to save
the parameter with specific facet.

Test-Parameters: clientcount=4 osscount=2 mdscount=2 mdtcount=1 \
austeroptions=-R failover=true iscsi=1 testlist=replay-vbr

Change-Id: Icba3fc532f4c67f02272c39e8e64d49325dad0e7
Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/28963
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9888 tests: Update disk2_7-zfs.tar.bz2 for quota 20/28820/4
Nathaniel Clark [Thu, 31 Aug 2017 19:54:53 +0000 (15:54 -0400)]
LU-9888 tests: Update disk2_7-zfs.tar.bz2 for quota

Add a blimit file with the larger 40960 limit in it, as this seems to
be how the image was created, but the default (without this file) is
20K.
Also add an ilimit file with 4 (default for new images).  Old default
value is 2.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I33921ac58a5252f3259145d5e00faedcd21559f9
Reviewed-on: https://review.whamcloud.com/28820
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9930 llite: only clear lli_sai if the setter 94/28794/4
Bruno Faccini [Wed, 30 Aug 2017 09:37:03 +0000 (11:37 +0200)]
LU-9930 llite: only clear lli_sai if the setter

Prior to this patch, start_statahead_thread() was
unconditionally clearing lli->lli_sai upon error, leading
to a crash in a racy scenario where it had just been set in/by
another thread context.
Now, only clear lli_sai if current thread has set it.
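
The "only the setter clears" rule can be sketched in Python. This is a
simplified model, not the llite code: the Inode class, the make_sai and
launch callables, and the use of RuntimeError for failures are hypothetical
stand-ins; inode.sai plays the role of lli->lli_sai.

```python
import threading

class Inode:
    def __init__(self):
        self.lock = threading.Lock()
        self.sai = None            # stands in for lli->lli_sai

def start_statahead(inode, make_sai, launch):
    """Try to start statahead; on error, undo only what this call set."""
    installed = None
    try:
        sai = make_sai()           # may fail before anything is installed
        with inode.lock:
            if inode.sai is None:
                inode.sai = sai
                installed = sai
        if installed is None:
            raise RuntimeError("statahead already set by another context")
        launch()                   # may fail after the pointer is installed
        return True
    except RuntimeError:
        with inode.lock:
            # Clear the pointer only if this call installed it.
            if installed is not None and inode.sai is installed:
                inode.sai = None
        return False
```

An unconditional clear in the error path would wipe a pointer that another
context had just installed, which is the race the commit message describes.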

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I555febfad3494c9dd90eeb72d6dd9157428179ea
Reviewed-on: https://review.whamcloud.com/28794
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9995 lfsck: Have LMV_HASH_FLAG_DEAD defined for a while longer
Oleg Drokin [Fri, 15 Sep 2017 17:41:06 +0000 (13:41 -0400)]
LU-9995 lfsck: Have LMV_HASH_FLAG_DEAD defined for a while longer

LMV_HASH_FLAG_DEAD is still used in lfsck. Rather than make any hasty
moves, just move the version check around that define further away
while we examine what really needs to be done there.

This fixes the build breakage from the 2.10.53 tag.

Change-Id: I87b25136f8fc03e59aed97352567757d2460ab3a
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoNew tag 2.10.53 2.10.53 v2_10_53 v2_10_53_0
Oleg Drokin [Fri, 15 Sep 2017 04:55:41 +0000 (00:55 -0400)]
New tag 2.10.53

Change-Id: I32787e50eab953a1f4a6f13723d777b3d7daea01
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9595 tests: remove sanityn test 18c from ALWAYS_EXCEPT 14/27414/6
dilip krishnagiri [Mon, 7 Aug 2017 16:27:41 +0000 (10:27 -0600)]
LU-9595 tests: remove sanityn test 18c from ALWAYS_EXCEPT

Remove sanityn test 18c from the ALWAYS_EXCEPT list
because LU-1205 is resolved.

Signed-off-by: dilip krishnagiri <dilipx.krishnagiri@intel.com>
Change-Id: Id1a67f4ce734949d446a97379cc297ddfd68e958
Reviewed-on: https://review.whamcloud.com/27414
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9950 build: add support for Ubuntu(debian) arm64 70/28870/2
Gu Zheng [Wed, 6 Sep 2017 03:14:35 +0000 (21:14 -0600)]
LU-9950 build: add support for Ubuntu(debian) arm64

Add arm64 to the supported arch list of the debian control file.

Change-Id: I9c39a4d8c1896c1255432380bd956330c2edf476
Signed-off-by: Gu Zheng <gzheng@ddn.com>
Reviewed-on: https://review.whamcloud.com/28870
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yingjin Qian <qian@ddn.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Thomas Stibor <t.stibor@gsi.de>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9921 lnet: resolve unsafe list access 23/28723/6
Amir Shehata [Sat, 26 Aug 2017 04:18:16 +0000 (21:18 -0700)]
LU-9921 lnet: resolve unsafe list access

Use list_for_each_entry_safe() when accessing messages on pending
queue. Remove the message from the list before calling lnet_finalize()
or lnet_send().
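
A loose Python analogue of the pattern described above. The kernel change
uses list_for_each_entry_safe() on a C linked list; here a deque stands in
for the pending queue, and drain_pending() and its handler callback are
hypothetical names. The point illustrated is only the ordering: detach the
message first, then hand it off.

```python
from collections import deque

def drain_pending(pending, handler):
    """Detach each message from the queue before handing it to handler."""
    handled = 0
    # Fix the number of iterations up front so that messages re-queued by
    # the handler are not processed again in this pass.
    for _ in range(len(pending)):
        msg = pending.popleft()    # message is off the queue from here on
        handler(msg, pending)      # handler may safely re-queue msg
        handled += 1
    return handled
```

Because the message is already off the queue when the handler runs, the
handler is free to re-queue it or drop it without disturbing the traversal.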

When destroying the peer make sure to queue all pending messages on
a global list. We can not resend them at this point because the
cpt lock is held. Unlocking the cpt lock could lead to an inconsistent
state. Use the discovery thread to check if the global list is not
empty and if so resend all messages on the list. Use a new spin
lock to protect the resend message list. I steered clear of reusing
an existing lock because LNet locking is complex and reusing a lock
will add to this complexity. Using a new lock makes the code easier
to understand.

Verify that all lists are empty before destroying the peer_ni.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ia081419ec5ed2be5823cfbca7e050138a229ab9c
Reviewed-on: https://review.whamcloud.com/28723
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-7746 tests: skip tests for older (upstream) client 18/28718/3
Andreas Dilger [Fri, 25 Aug 2017 19:02:01 +0000 (13:02 -0600)]
LU-7746 tests: skip tests for older (upstream) client

Skip some tests when running newer sanity.sh on an older client.
This typically only happens when testing the upstream client,
since otherwise the tests will always match the client version.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I78e1b0a6ae98879a2039817696c3a0dd15621fcc
Reviewed-on: https://review.whamcloud.com/28718
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Steve Guminski <stephenx.guminski@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
3 years agoLU-9891 tests: Increase space not released for ZFS 82/28682/4
James Nunez [Thu, 24 Aug 2017 14:51:15 +0000 (08:51 -0600)]
LU-9891 tests: Increase space not released for ZFS

Several Lustre tests calculate the free space on the
object storage servers. For servers running ZFS, the amount
of space released by ZFS is not 100% deterministic. Thus,
fs_log_size() will return the buffer size that we allow
the space to be off by. For ZFS, increase this buffer
from 400 to 512 KB.

Test-Parameters: trivial testgroup=review-zfs-part-2
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I32e0ae3752d0ee0e9f0091ea779f8b53ba969a26
Reviewed-on: https://review.whamcloud.com/28682
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>