Whamcloud - gitweb
fs/lustre-release.git
22 months agoLU-10794 lfs: make quota work for grace time 06/31606/3
Wang Shilong [Fri, 9 Mar 2018 05:25:07 +0000 (13:25 +0800)]
LU-10794 lfs: make quota work for grace time

The following commit:
LU-10011 utils: refactor lfs quota codes

introduced a regression that makes 'lfs quota -t'
output nothing. Fix this bug and also add a test
case to sanity-quota.sh in case it breaks again
in the future.

Test-Parameters: trivial testlist=sanity-quota
Change-Id: I2063552505cf07464d9924f66c29fc2504bc56ce
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/31606
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10783 kernel: kernel update RHEL7.4 [3.10.0-693.21.1.el7] 12/31612/2
Bob Glossman [Tue, 6 Mar 2018 22:06:18 +0000 (14:06 -0800)]
LU-10783 kernel: kernel update RHEL7.4 [3.10.0-693.21.1.el7]

Update the RHEL 7.4 kernel to 3.10.0-693.21.1.el7.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ib7d5233d438798e1cdd1c31bb6728f8ea6697959
Reviewed-on: https://review.whamcloud.com/31612
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10750 mdd: declare changelogs only when enabled 77/31477/8
John L. Hammond [Thu, 1 Mar 2018 16:02:09 +0000 (10:02 -0600)]
LU-10750 mdd: declare changelogs only when enabled

In the mdd layer, rename recording_changelog() to
mdd_changelog_enabled() and add the changelog record type as a
parameter. In mdd_changelog_enabled(), test whether the type is
enabled in addition to checking whether changelogs are generally
enabled, and only look up the ucred if the other tests pass. Add a
type parameter to mdd_declare_changelog_store() so that this
information can be passed on to mdd_changelog_enabled(). In
mdd_close(), check if CLOSE changelogs are enabled before opening a
transaction and declaring the record.
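
The check order described here (cheap tests first, ucred lookup last)
can be sketched roughly as follows. This is an illustrative stand-in,
not the mdd code: the struct, its fields, and lookup_ucred_allows()
are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device changelog state. */
struct changelog_state {
	bool     on;        /* changelogs generally enabled */
	uint32_t type_mask; /* one bit per enabled record type */
};

/* Counts ucred lookups so the ordering can be observed. */
static int ucred_lookups;

/* Hypothetical stand-in for the comparatively expensive ucred lookup. */
static bool lookup_ucred_allows(void)
{
	ucred_lookups++;
	return true;
}

/* Cheap tests first; only look up the ucred if they pass. */
static bool changelog_enabled(const struct changelog_state *cl,
			      unsigned int type)
{
	if (!cl->on)
		return false;
	if (!(cl->type_mask & (1U << type)))
		return false;
	return lookup_ucred_allows();
}
```

With this shape, a caller asking about a record type that is masked
off never pays for the ucred lookup.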

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Idd7604de5e97bad72a802cb4b49dae4668b2644a
Reviewed-on: https://review.whamcloud.com/31477
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
22 months agoLU-10465 lov: decrease default stripe size to 1MB 89/31589/2
Jian Yu [Thu, 8 Mar 2018 19:08:59 +0000 (11:08 -0800)]
LU-10465 lov: decrease default stripe size to 1MB

Commit 3f5abc6fa30e7c0256077ccf6a149d1809450465 increased
the default stripe size from 1MB to 4MB. However, this
caused a usability issue in LU-10786 for PFL/DoM files.

This patch changes the default stripe size back to 1MB
until we have a better method of handling DoM components.
Otherwise, DoM files cannot easily be created with the
default settings.

Change-Id: Ie6b6fe97596ed65abec771b3f37afd950dc821c8
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31589
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
22 months agoRevert "LU-10419 lfsck: skip dead target" 00/31600/2
Oleg Drokin [Fri, 9 Mar 2018 00:19:51 +0000 (00:19 +0000)]
Revert "LU-10419 lfsck: skip dead target"

This is causing uninterruptible lfsck instances in soak testing, as documented in LU-10419 by Cliff.

This reverts commit 012834c5e7c7be50ff117cee4ac473d7fee4294d.

Change-Id: I119d21c7ce3375140fbbb25a300e65b4c6aa9e73
Reviewed-on: https://review.whamcloud.com/31600
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10722 test: Add version check to sanity-quota test_55 31/31531/2
Wei Liu [Mon, 5 Mar 2018 18:38:43 +0000 (10:38 -0800)]
LU-10722 test: Add version check to sanity-quota test_55

Skip sanity-quota test_55 if server is older than 2.10.58

Test-Parameters: trivial testlist=sanity-quota
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Change-Id: Ia8a129298d75fb019699adda07fecd2f4d9eb46a
Reviewed-on: https://review.whamcloud.com/31531
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10705 utils: add "lfs find --blocks" 93/31393/4
Andreas Dilger [Fri, 23 Feb 2018 07:34:22 +0000 (00:34 -0700)]
LU-10705 utils: add "lfs find --blocks"

Add support for "lfs find --blocks|-b <block>" to be able to find
files with the specified number of allocated blocks (in kilobytes or
other specified units). This is distinct from "--size <size>" since
that doesn't properly check the space used for sparse files.
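
The distinction between apparent size and allocated blocks is the
same one that stat(2) exposes through st_size and st_blocks. A minimal
sketch, assuming the Linux convention of 512-byte st_blocks units; the
helper names are made up here, and this is not how "lfs find" itself
is implemented:

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Allocated space in KiB; st_blocks counts 512-byte units on Linux. */
static long long allocated_kib(const struct stat *st)
{
	return (long long)st->st_blocks * 512 / 1024;
}

/* A file is sparse when it allocates less space than its apparent size. */
static bool is_sparse(const struct stat *st)
{
	return (long long)st->st_blocks * 512 < (long long)st->st_size;
}
```

A 1MB file with only 16 allocated 512-byte blocks would match a size
test for 1MB but a blocks test for 8KB.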

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I7d48f919d95242c11ef7d3075ecc3f7e963ebbe5
Reviewed-on: https://review.whamcloud.com/31393
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10596 tests: skip tests require remote server with nodsh 21/31121/6
Elena Gryaznova [Sun, 4 Mar 2018 18:17:38 +0000 (21:17 +0300)]
LU-10596 tests: skip tests require remote server with nodsh

Patch fixes the following tests to be skipped for remote
servers with nodsh set:
sanity 56c, 60aa, 77c, 101g, 160f, 160g, 161d
Patch skips 160f and 160g for old MDS.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity
Cray-bug-id: MRP-4757, LUS-5710
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I44f35129df5bc5c8c6e6ace3e68f3f2d400db86c
Reviewed-on: https://review.whamcloud.com/31121
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
22 months agoLU-10336 osp: wakeup opd_pre_waitq when decrement opd_pre_reserved 97/30397/4
Sergey Cheremencev [Wed, 6 Dec 2017 13:52:33 +0000 (16:52 +0300)]
LU-10336 osp: wakeup opd_pre_waitq when decrement opd_pre_reserved

osp_precreate_cleanup_orphans could be blocked due to
reserved objects. In that case it sets the opd_pre_recovering
flag and waits until opd_pre_reserved becomes 0.
Thus we need to wake it up when opd_pre_reserved is reset
to 0.

Change-Id: Ib8d4708685c3c9675872577985a4c6897e3ee385
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Cray-bug-id: MRP-3623
Reviewed-on: https://review.whamcloud.com/30397
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9160 ldiskfs: preload block group descriptors 22/25722/7
Artem Blagodarenko [Sat, 18 Feb 2017 09:00:13 +0000 (12:00 +0300)]
LU-9160 ldiskfs: preload block group descriptors

With a 300TB OST, we saw a slow mount that took 13
minutes. With this patch applied, the mount time dropped
to 30s, so this patch greatly reduces mount time. Backport
it from upstream Linux.

Linux-commit: 85c8f176a6111ecde9c158109989dbd445a0e59a

With the meta_bg option enabled, the IO that reads block
group descriptors is not sequential and requires optimization.

Seagate-bug-id: MRP-4129
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Change-Id: Iaa621c11ff88364021887d9f9dcec250dd5fd955
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Reviewed-on: https://review.whamcloud.com/25722
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
22 months agoLU-10723 tests: disable sanity 232b before 2.10.58 87/31487/2
Quentin Bouget [Fri, 2 Mar 2018 08:22:25 +0000 (08:22 +0000)]
LU-10723 tests: disable sanity 232b before 2.10.58

The fix that allows test_232b of sanity.sh to pass was introduced in
lustre 2.10.58 so the test should not be run before this version.
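
A version gate like this boils down to comparing packed version
numbers. The sketch below is an assumption-laden illustration in C,
not the test framework's code: version_code() and server_too_old()
are hypothetical helpers, and the one-byte-per-component packing is an
assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Pack a four-part version into one comparable integer (one byte each). */
static uint32_t version_code(unsigned major, unsigned minor,
			     unsigned patch, unsigned fix)
{
	return (major << 24) | (minor << 16) | (patch << 8) | fix;
}

/* True when the server predates the release carrying the fix. */
static bool server_too_old(uint32_t server_version)
{
	return server_version < version_code(2, 10, 58, 0);
}
```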

Signed-off-by: Quentin Bouget <quentin.bouget@cea.fr>
Change-Id: I7c625e916bfd0d4a614cc9924670bffe4ba3b8b0
Reviewed-on: https://review.whamcloud.com/31487
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10483 lustre: replace FMODE_{READ,WRITE} with MDS_* equivs 24/30824/10
Sebastien Buisson [Wed, 10 Jan 2018 14:37:24 +0000 (23:37 +0900)]
LU-10483 lustre: replace FMODE_{READ,WRITE} with MDS_* equivs

In file lustre/include/uapi/linux/lustre/lustre_user.h, replace direct
use of FMODE_READ and FMODE_WRITE with MDS_* equivalents.
That will avoid name clashes with the kernel symbols, and avoid
problems if their values ever change.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I07e77d8d025c5ddb3dc4e085738645e20fb77d0c
Reviewed-on: https://review.whamcloud.com/30824
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10003 lnet: remove lctl deprecation messages 34/31534/3
John L. Hammond [Mon, 5 Mar 2018 23:11:25 +0000 (17:11 -0600)]
LU-10003 lnet: remove lctl deprecation messages

Defer deprecation of these commands for now.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I09b97bacded9ac65a8c5df3ba47867a6a19fbf7b
Reviewed-on: https://review.whamcloud.com/31534
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
22 months agoLU-10419 lfsck: skip dead target 75/31475/2
Fan Yong [Thu, 1 Mar 2018 06:30:36 +0000 (14:30 +0800)]
LU-10419 lfsck: skip dead target

Do not send LFSCK RPCs to dead targets, to avoid being blocked.
The patch also adds a warning message when trying to send an
LFSCK RPC over a non-full connection, which helps to explain
why the LFSCK may be blocked.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I0599eb961f1aabd58d0de53fd51f25ca1ec8ff34
Reviewed-on: https://review.whamcloud.com/31475
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10769 osd-zfs: fix deadlock on osd_object::oo_guard 11/31511/4
Fan Yong [Mon, 5 Mar 2018 11:35:02 +0000 (19:35 +0800)]
LU-10769 osd-zfs: fix deadlock on osd_object::oo_guard

There is a race condition inside osd-zfs that may cause a deadlock.
Consider the following scenario:

1) Thread1 calls osd_attr_set() to set flags on the object.
   osd_attr_set() will call osd_xattr_get() while holding
   the read mode semaphore on object::oo_guard.

2) Thread2 calls osd_declare_destroy() to destroy the same
   object. It will down_write() on object::oo_guard, but be
   blocked by Thread1's granted read mode semaphore.

3) The osd_xattr_get() triggered by osd_attr_set() will also
   down_read() on object::oo_guard. But it will be blocked by
   Thread2's pending down_write() request.

Then Thread1 and Thread2 deadlock.
This patch makes osd_attr_set() call the lockless version of
xattr_get, osd_xattr_get_internal(), to avoid such a deadlock.
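
The shape of the fix (a locked wrapper plus a lockless internal
variant for callers that already hold the guard) can be sketched as
below. Everything here is a toy: toy_rwsem only counts read holds and
stands in for a real read/write semaphore, and the struct and function
names are hypothetical, not the osd-zfs ones.

```c
/* Toy stand-in for a read/write semaphore that only counts read holds. */
struct toy_rwsem {
	int read_holds;
};

struct object {
	struct toy_rwsem guard;
	int xattr;
};

static void toy_down_read(struct toy_rwsem *s) { s->read_holds++; }
static void toy_up_read(struct toy_rwsem *s)   { s->read_holds--; }

/* Lockless variant: the caller must already hold the guard. */
static int xattr_get_internal(const struct object *o)
{
	return o->xattr;
}

/* Locked variant for callers that do not hold the guard. */
static int xattr_get(struct object *o)
{
	int v;

	toy_down_read(&o->guard);
	v = xattr_get_internal(o);
	toy_up_read(&o->guard);
	return v;
}

/*
 * attr_set() already holds the guard for read, so it calls the lockless
 * variant. Re-acquiring here is what can deadlock behind a queued writer.
 * Returns the deepest read nesting seen, which should stay at 1.
 */
static int attr_set(struct object *o, int flags)
{
	int depth;

	toy_down_read(&o->guard);
	depth = o->guard.read_holds;
	o->xattr = xattr_get_internal(o) | flags;
	toy_up_read(&o->guard);
	return depth;
}
```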

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iaac2e414b5f1fd197303bb7ec7d5e2763b6f3e9a
Reviewed-on: https://review.whamcloud.com/31511
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
22 months agoLU-10681: Disable tiny writes for append 53/31353/8
Patrick Farrell [Sat, 3 Mar 2018 22:59:43 +0000 (16:59 -0600)]
LU-10681: Disable tiny writes for append

Unfortunately, tiny writes do not work correctly with
appending to files.  When appending to a file, we must take
DLM locks to EOF on all stripes, in order to protect file
size so we can append correctly.

If we dirty a page with a normal write then append to it
with a tiny write, these DLM locks are not present, and we
can use an incorrect size if another client writes to a
different stripe, increasing the size without cancelling
the lock which is protecting our dirty page.

We could theoretically check to make sure the required DLM
locks are held, but this would be time consuming.

The simplest solution is to just not allow tiny writes when
appending.

Also add an option to disable tiny writes at runtime.
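
The resulting decision can be sketched as a small predicate. This is
a hypothetical illustration only: tiny_write_enabled and
may_use_tiny_write() are invented names, and what counts as a "small"
write is left to the caller because the message does not define it.

```c
#include <stdbool.h>
#include <fcntl.h>

/* Hypothetical runtime switch, mirroring the option the patch adds. */
static bool tiny_write_enabled = true;

/* Decide whether the fast path may be used for this write. */
static bool may_use_tiny_write(int open_flags, bool is_small)
{
	if (!tiny_write_enabled)
		return false;
	if (open_flags & O_APPEND)
		return false; /* append needs DLM locks to EOF on all stripes */
	return is_small;
}
```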

Cray-bug-id: LUS-5723

Change-Id: Ic9421faa3d0268d907040881e8ba3c894261fd49
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31353
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10520 mkfs: enable extents for big MDT 37/31037/13
Yang Sheng [Fri, 26 Jan 2018 13:35:33 +0000 (21:35 +0800)]
LU-10520 mkfs: enable extents for big MDT

Enable extents when the MDT size is bigger than 16T.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Iccd39c48e715a3f084cb5ee803be0541563f5d10
Reviewed-on: https://review.whamcloud.com/31037
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
22 months agoLU-10680 mdd: disable changelog garbage collection by default 52/31552/2
John L. Hammond [Tue, 6 Mar 2018 19:25:50 +0000 (13:25 -0600)]
LU-10680 mdd: disable changelog garbage collection by default

Changelog garbage collection has introduced some instability so
disable it by default.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I708198d76af060cb796de89266ee74a968f92ac1
Reviewed-on: https://review.whamcloud.com/31552
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10786 tests: add stripe size to lfs setstripe 69/31569/3
James Nunez [Wed, 7 Mar 2018 16:27:59 +0000 (09:27 -0700)]
LU-10786 tests: add stripe size to lfs setstripe

Since the default stripe size increased from one to four
MB, we need to add the stripe size parameter to calls
to 'lfs setstripe' for composite files when the component
size is less than the file system stripe size. Thus, add
the stripe size parameter to calls to 'lfs setstripe' for
sanity-flr tests 45 and 46 and sanity-pfl test 16.

Test-Parameters: trivial testlist=sanity-flr,sanity-pfl

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic169eaebd922175467f010b159a2b065fb91b3fb
Reviewed-on: https://review.whamcloud.com/31569
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8066 fid: move all files from procfs to debugfs 66/28366/10
James Simmons [Sun, 21 Jan 2018 16:55:10 +0000 (11:55 -0500)]
LU-8066 fid: move all files from procfs to debugfs

Linux-commit: f3aa79fbef7942971825fb2084a88e9527c6b04c

Besides the client port from upstream, also port the server
side proc entries to debugfs.

Change-Id: I934fc5a39c8c407799abd0d6154240d3a579c93e
Signed-off-by: Dmitry Eremin <dmiter4ever@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28366
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8066 obd: final pieces for sysfs/debugfs support. 08/28108/24
James Simmons [Thu, 22 Feb 2018 17:26:16 +0000 (12:26 -0500)]
LU-8066 obd: final pieces for sysfs/debugfs support.

This patch puts in place the basics needed for debugfs.
It also creates class_setup_tunables so sysfs kobject
creation is handled for both obd_devices and llite. Add a
special LDEBUGFS_FOPS_WR_ONLY since often in this case
i_private is not set so any attempt to call PDE_DATA(inode)
will cause it to crash. Make lprocfs_obd_setup select either
debugfs or procfs but not both.

Handle the special symlinks needed for both debugfs
and sysfs in the server case. For lod we need to
create "lov" and for osp we create "osc" for both sysfs
and debugfs. Handle the complex case of when a node
is both a server and client. For debugfs we can take
advantage of d_lookup() and for sysfs kset_find_obj()
to avoid special access to struct obd_type. This also
places the burden on the server lod/osp modules instead
of the client lov/osc modules.

Change-Id: I87090859db4da2300ab9e2aa3c23cb3773276103
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28108
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10551 lod: obd_fid_alloc() could start a nested trans 68/31268/10
Bobi Jam [Mon, 12 Feb 2018 05:44:48 +0000 (13:44 +0800)]
LU-10551 lod: obd_fid_alloc() could start a nested trans

* obd_fid_alloc() could possibly start a nested transaction, which
  would reset the OI cache. So we add an
  osd_thread_info::oti_ins_cache_depth to prevent clearing the OI
  cache in the nested transaction.

* Add more debug messages in osd_idc_find_or_init()/
  osd_idc_find_and_init()
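
The depth-counter idea in the first item can be sketched as follows.
Only the notion of a depth counter comes from the message; the struct,
its fields, and the trans_start()/trans_stop() helpers are hypothetical
stand-ins rather than the osd code.

```c
#include <stdbool.h>

/* Hypothetical per-thread state; only the depth idea comes from the patch. */
struct thread_info {
	int  cache_depth;   /* number of transactions currently open */
	bool cache_valid;   /* stands in for the cached OI entries */
};

/* Reset the cache only when the outermost transaction starts. */
static void trans_start(struct thread_info *ti)
{
	if (ti->cache_depth++ == 0)
		ti->cache_valid = false;
}

static void trans_stop(struct thread_info *ti)
{
	ti->cache_depth--;
}
```

A nested trans_start() sees a non-zero depth and leaves the cache
populated by the outer transaction alone.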

Test-Parameters: alwaysuploadlogs envdefinitions=PTLDEBUG=-1 testlist=sanity-pfl ostfilesystemtype=zfs mdtfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Id75fd1787ffc0f47bbf110d460f23db6c34670da
Reviewed-on: https://review.whamcloud.com/31268
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10607 osd-zfs: skip io stat for OI scrub 80/31180/5
Fan Yong [Wed, 28 Feb 2018 03:36:52 +0000 (11:36 +0800)]
LU-10607 osd-zfs: skip io stat for OI scrub

It is unnecessary to collect io stats for requests triggered by
OI scrub. On the other hand, the OI setup logic may read/write
the OI scrub file. At that time, the related lproc (including
io stat) for such OSD is not initialized yet. So this patch
skips io stat for OI scrub.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9498c1351c1875ac9aa46eed5189cb61a6d102ac
Reviewed-on: https://review.whamcloud.com/31180
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-5991 obd: fix mount error handling 59/12959/9
Vladimir Saveliev [Thu, 15 Feb 2018 15:40:16 +0000 (18:40 +0300)]
LU-5991 obd: fix mount error handling

lustre_fill_super() allocates lsi and assumes that on failure lsi
will be freed by server_fill_super() or ll_fill_super().
- server_fill_super() does not free lsi when lsi_prepare() fails.
- ll_fill_super() does not free lsi when OBD_ALLOC_PTR(cfg) or
  ll_init_sbi() fails.

osd_device_fini() needs osd_index_backup(). Otherwise
struct lustre_index_backup_unit-s leak if server_fill_super() fails
after osd_start().

Cray-bug-id: MRP-2229
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I366dc2b46a504a65b030bcbf687998dd0676f404
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/12959
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10737 misc: Wrong checksum return value 48/31448/2
Qian Yingjin [Wed, 28 Feb 2018 09:22:01 +0000 (17:22 +0800)]
LU-10737 misc: Wrong checksum return value

In the checksum calculation functions tgt_checksum_niobuf() and
osc_checksum_bulk(), the error return value of
cfs_crypto_hash_init() is wrongly taken as the checksum value.
This patch fixes the problem.
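
The class of bug can be sketched as below. This is a generic
illustration, not the Lustre code: hash_init() is a hypothetical
stand-in that returns a negative errno on failure, and the byte sum is
a placeholder for a real checksum algorithm.

```c
#include <stdint.h>

/* Hypothetical hash-init stand-in: 0 on success, negative errno on failure. */
static int hash_init(int fail)
{
	return fail ? -12 /* -ENOMEM */ : 0;
}

/* Buggy shape: the error code is returned where a checksum is expected. */
static uint32_t checksum_buggy(const uint8_t *buf, int len, int fail)
{
	int rc = hash_init(fail);
	uint32_t sum = 0;
	int i;

	if (rc < 0)
		return rc; /* silently becomes a bogus checksum value */
	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}

/* Fixed shape: status and checksum travel on separate channels. */
static int checksum_fixed(const uint8_t *buf, int len, int fail,
			  uint32_t *out)
{
	int rc = hash_init(fail);
	uint32_t sum = 0;
	int i;

	if (rc < 0)
		return rc;
	for (i = 0; i < len; i++)
		sum += buf[i];
	*out = sum;
	return 0;
}
```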

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I647c402deeab00ec5c6437423b0cab250b42c3e5
Reviewed-on: https://review.whamcloud.com/31448
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
22 months agoLU-4423 lov: use correct env in lov_io_data_version_end() 18/31418/4
NeilBrown [Wed, 28 Feb 2018 02:50:23 +0000 (21:50 -0500)]
LU-4423 lov: use correct env in lov_io_data_version_end()

lov - the logical object volume manager - is responsible for
striping data across multiple volumes.

So when it is given a request, it creates one or more
sub-requests, one for each target volume.  Each sub_io
request has a sub_env environment which it operates in.

When lov_io_data_version_end() calls lov_io_end_wrapper() to
wait for and close off a sub_io, it passes the wrong
environment.

This causes an LINVRNT() to fail in cl2osc_io(), and may
cause other problems.

This patch changes the call to use ->sub_env, much like
other code in the same file.

Change-Id: Id120929f4189196232d18103007e45ba89195fff
Fixes: fcd45488711a (LU-5683 clio: add CIT_DATA_VERSION)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31418
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10421 echo: use echo layer when finding stripe object 38/31338/3
John L. Hammond [Fri, 16 Feb 2018 18:55:05 +0000 (12:55 -0600)]
LU-10421 echo: use echo layer when finding stripe object

In echo_md_dir_stripe_choose(), find the stripe object using the echo
device rather than the down layer (mdd) device. mdd objects are not
equipped to be top layer objects and should not be found in this way.

Test-Parameters: trivial testlist=mds-survey
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ibb396ae64b6d542c64697336d227e06163a0bb39
Reviewed-on: https://review.whamcloud.com/31338
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
22 months agoLU-10662 llite: Add exit for filedata allocation failed 96/31296/3
Ben Evans [Tue, 13 Feb 2018 19:20:18 +0000 (14:20 -0500)]
LU-10662 llite: Add exit for filedata allocation failed

When the filedata allocation fails, we need to exit to
a later point than out_openerr, which calls
deauthorize_statahead and ll_file_data_put, neither of
which is valid. (This leads to a panic.)
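
The cleanup-label ordering described here can be sketched as follows.
The function, the struct, and the out_nofiledata label name are
illustrative assumptions; only out_openerr and the idea of exiting to
a later point come from the message.

```c
#include <stdlib.h>

struct file_data { int unused; };

/* Counts how often the put path runs, to show which labels execute. */
static int puts_done;

static void file_data_put(struct file_data *fd)
{
	puts_done++;
	free(fd);
}

/* Hypothetical open path; fail_alloc/fail_open inject the two failures. */
static int open_sketch(int fail_alloc, int fail_open)
{
	struct file_data *fd;
	int rc = 0;

	fd = fail_alloc ? NULL : malloc(sizeof(*fd));
	if (fd == NULL) {
		rc = -12; /* -ENOMEM */
		goto out_nofiledata; /* nothing was set up yet: skip the puts */
	}

	if (fail_open) {
		rc = -5; /* -EIO */
		goto out_openerr;
	}

	file_data_put(fd); /* normal teardown, kept trivial for the sketch */
	return 0;

out_openerr:
	file_data_put(fd); /* only valid when fd was allocated */
out_nofiledata:
	return rc;
}
```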

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I670d578f01b2731761e3149db36dd8da1551a30a
Cray-bug-id: LUS-1321
Reviewed-on: https://review.whamcloud.com/31296
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10656 ldlm: fix export reference 39/31139/5
Hongchao Zhang [Sun, 28 Jan 2018 19:25:42 +0000 (03:25 +0800)]
LU-10656 ldlm: fix export reference

In ptlrpc_connect_interpret(), the export reference could be
leaked if there is an error before the subsequent class_exp_put().

Change-Id: I9ddd82fa1bbf8e17079e9746202be63e6233c052
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31139
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10582 out: can't obtain remote acl xattr 88/31088/5
Andriy Skulysh [Tue, 12 Dec 2017 13:39:14 +0000 (15:39 +0200)]
LU-10582 out: can't obtain remote acl xattr

osp_xattr_get() fails due to a hardcoded
reply size limitation.

With large_xattr enabled, ddp_max_ea_size can be
almost 1MB, and out_handles fails to send such a big
reply.

Limit the maximum ACL buffer size to XATTR_SIZE_MAX.
Limit ddp_max_ea_size so that the resulting reply
request fits into LNET_MTU.

Cray-bug-id: MRP-4724
Change-Id: I6405330605809911c3f814fe5cb9d476d7ac40ed
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31088
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9934 build: support for gcc7 10/28810/10
Alex Zhuravlev [Fri, 23 Feb 2018 17:50:12 +0000 (12:50 -0500)]
LU-9934 build: support for gcc7

Suppress a few false warnings with a compiler option when
building the Lustre kernel modules. Also include a few other
fixes to make Lustre buildable on Fedora.

Change-Id: If14d226e5d92ae9ce54e216d032df94d9398654e
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/28810
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10465 lov: increase default stripe size to 4MB 51/27151/13
Jian Yu [Tue, 27 Feb 2018 08:28:40 +0000 (00:28 -0800)]
LU-10465 lov: increase default stripe size to 4MB

Increase the default stripe size from 1MB to 4MB
so that widely-striped files can generate full RPCs
without pinning so much memory on the client.

The patch also renames STRIPE_BYTES and STRIPES_PER_OBJ
to DEF_STRIPE_SIZE and DEF_STRIPE_COUNT in cfg/local.sh,
and unsets them to support formatting Lustre filesystem
with default stripe size and count.

Change-Id: I59d1fdb3e30599c125e0e5e800d168921bd69098
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/27151
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
22 months agoLU-9727 mdd: properly call recording_changelog() 56/31456/2
Sebastien Buisson [Wed, 28 Feb 2018 16:18:32 +0000 (01:18 +0900)]
LU-9727 mdd: properly call recording_changelog()

recording_changelog() must be called everywhere in the code instead
of directly checking (mdd->mdd_cl.mc_flags & CLM_ON).

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9ed5aac4871573e6aea94cfd4dc46b95d5df1e4a
Reviewed-on: https://review.whamcloud.com/31456
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-6142 UAPI: replace cfs_size_* macros with __ALIGN_KERNEL 79/30379/6
James Simmons [Mon, 26 Feb 2018 16:11:02 +0000 (11:11 -0500)]
LU-6142 UAPI: replace cfs_size_* macros with __ALIGN_KERNEL

The Lustre-specific cfs_size_* macros can be easily replaced with
the __ALIGN_KERNEL macro provided by the Linux kernel for our
userland code. This brings us closer to building against the
upstream client.
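
For readers unfamiliar with the macro, the rounding it performs can be
sketched with a local equivalent. ALIGN_UP and padded_len() are names
invented for this example so it compiles without kernel headers; the
sketch relies on the __typeof__ extension of GCC and Clang and assumes
the alignment is a power of two.

```c
#include <stddef.h>

/*
 * Local equivalent of the kernel's __ALIGN_KERNEL(x, a): round x up to the
 * next multiple of a, where a must be a power of two.
 */
#define ALIGN_UP_MASK(x, mask)	(((x) + (mask)) & ~(mask))
#define ALIGN_UP(x, a)		ALIGN_UP_MASK(x, (__typeof__(x))(a) - 1)

/* Example use: size of a record padded to an 8-byte boundary. */
static size_t padded_len(size_t len)
{
	return ALIGN_UP(len, 8);
}
```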

Change-Id: I5cd261807f60296eaac884b66f084c128adc5b01
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30379
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9324 lfs: add setstripe --copy=lustre_file_or_dir parameter 79/26879/6
Bobi Jam [Fri, 28 Apr 2017 06:18:47 +0000 (14:18 +0800)]
LU-9324 lfs: add setstripe --copy=lustre_file_or_dir parameter

Add a "lfs setstripe --copy=<lustre_src> <lustre_file_or_dir_dst>"
usage to set stripe using stripe info from a source lustre file/dir.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ibcd80f98c53bdff5b41ba9b1010fceefd6c9d8b7
Reviewed-on: https://review.whamcloud.com/26879
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
22 months agoLU-8910 osp: Add correct handling of errors to osp_statfs_interpret 67/24167/5
Sergey Cheremencev [Tue, 6 Dec 2016 09:16:05 +0000 (12:16 +0300)]
LU-8910 osp: Add correct handling of errors to osp_statfs_interpret

The MDT's statfs info could disagree with the OST's info for a very
long time. If osp_statfs_update() is called and extends the timeout
1000*obd_timeout into the future, but osp_statfs_interpret() then
hits an error, it will never reset the timeout.
Now, when an osp_update_statfs request fails, osp_statfs_interpret()
causes osp_precreate_cleanup_orphans to send a new one after 10
seconds.
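
The timeout handling can be pictured with the following sketch. It is
a simplified model under stated assumptions: the struct and both
helpers are hypothetical, times are plain seconds, and only the
1000*obd_timeout extension and the 10-second retry are taken from the
message.

```c
/* Hypothetical stand-ins; only the 10-second retry comes from the text. */
#define STATFS_RETRY_SECONDS 10

struct statfs_state {
	long next_update; /* time, in seconds, of the next statfs request */
};

/* update: push the deadline far out while a request is in flight. */
static void statfs_update(struct statfs_state *s, long now, long obd_timeout)
{
	s->next_update = now + 1000 * obd_timeout;
}

/* interpret: on error, pull the deadline back in so a retry happens soon. */
static void statfs_interpret(struct statfs_state *s, long now, int rc)
{
	if (rc != 0)
		s->next_update = now + STATFS_RETRY_SECONDS;
}
```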

Change-Id: Ib282d806ba4932db5c72df34905988f96de99297
Cray-bug-id: MRP-3892
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://review.whamcloud.com/24167
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10212 test: ESTALE read 01/31101/7
Alexander Boyko [Wed, 31 Jan 2018 11:17:42 +0000 (06:17 -0500)]
LU-10212 test: ESTALE read

The patch reproduces the issue in which a read RPC arrives
at the OST with a lock handle that has the LDLM_FL_DESTROY
flag set, and the client then gets an ESTALE error for the
read operation.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: MRP-4604
Change-Id: I0722fc57a61153b25a05bf7aebce5d7f32bbc95b
Reviewed-on: https://review.whamcloud.com/31101
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
22 months agoLU-10653 kernel: kernel update [SLES12 SP2 4.4.114-92.64] 55/31255/5
Bob Glossman [Fri, 9 Feb 2018 18:26:17 +0000 (10:26 -0800)]
LU-10653 kernel: kernel update [SLES12 SP2 4.4.114-92.64]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I98f3784eddc05e4faf00091e05b751d78090f66d
Reviewed-on: https://review.whamcloud.com/31255
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10655 tests: eliminate 'ssh exited with exit code 1' 63/31263/5
Vladimir Saveliev [Sat, 10 Feb 2018 10:21:44 +0000 (13:21 +0300)]
LU-10655 tests: eliminate 'ssh exited with exit code 1'

Eliminate meaningless 'ssh exited with exit code 1' issued by stop()
and wait_exit_ST()

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Elena V. Gryaznova <c17455@cray.com>
Cray-bug-id: MRP-1483
Change-Id: Ie1af3cda0b48b7bf482ea35b84c93e38d0f6c0a9
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/31263
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Jenkins
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10716 tests: skip sanity 56xb for old server 16/31416/2
Elena Gryaznova [Mon, 26 Feb 2018 13:15:20 +0000 (16:15 +0300)]
LU-10716 tests: skip sanity 56xb for old server

Patch skips sanity test_56xb for servers which
do not contain LU-6051.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2609
Test-Parameters: trivial testlist=sanity envdefinitions="ONLY=56bx"
Change-Id: I1b03f5e1c144dedef2bdd7b0c46e431d6761eb47
Reviewed-on: https://review.whamcloud.com/31416
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10712 tests: skip conf-sanity 108[a,b] for old server 13/31413/2
Elena Gryaznova [Mon, 26 Feb 2018 00:59:22 +0000 (03:59 +0300)]
LU-10712 tests: skip conf-sanity 108[a,b] for old server

Patch skips test_108a and test_108b for old servers.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5691
Test-Parameters: trivial testlist=conf-sanity
Change-Id: I7494e15710ac2663a0e01a9d3568fa5bcd590a6a
Reviewed-on: https://review.whamcloud.com/31413
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10684 tests: skip recovery-small 110[h-j] 50/31350/3
Elena Gryaznova [Tue, 20 Feb 2018 15:12:15 +0000 (18:12 +0300)]
LU-10684 tests: skip recovery-small 110[h-j]

Skip test_110h, test_110i, and test_110j for old
servers.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2589
Test-Parameters: trivial testlist=recovery-small envdefinitions=ONLY=110
Change-Id: I9d89fcdc55b5d1d1fd4004d8e09e0297eb4bc595
Reviewed-on: https://review.whamcloud.com/31350
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10672 lnet: pass in only time64_t to lnet_notify 39/31339/3
James Simmons [Wed, 21 Feb 2018 18:22:02 +0000 (13:22 -0500)]
LU-10672 lnet: pass in only time64_t to lnet_notify

With the migration to 64-bit seconds, some calls to lnet_notify
did not get updated to use time64_t. Update those call sites.
Also, for the ioctl IOC_LIBCFS_NOTIFY_ROUTER we pass in the number
of seconds since the epoch. We subtracted the current epoch time but
missed adding in the current number of seconds since boot, which
is what the lnet ping code expects to be used.
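
The conversion described above can be sketched in user-space C. This is
only an illustration of the arithmetic, not the actual lnet code: the
function name and its parameters are hypothetical, and the two "now"
values are assumed to be sampled at the same instant by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's time64_t; user-space sketch only. */
typedef int64_t time64_t;

/*
 * Convert a "seconds since the epoch" timestamp into "seconds since
 * boot". now_epoch and now_boot are hypothetical parameters holding
 * the current time on each clock.
 */
time64_t epoch_to_boot_seconds(time64_t when_epoch, time64_t now_epoch,
                               time64_t now_boot)
{
        /* How long ago the event happened, in seconds. */
        time64_t age = now_epoch - when_epoch;

        assert(now_boot >= 0);

        /* The same event expressed on the since-boot clock. */
        return now_boot - age;
}
```

Dropping the now_boot term leaves only the (negative) age, which is the
kind of mistake the message describes.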

Change-Id: I5a92df08cdaf3b747fd17721a92038df05669a81
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31339
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10604 osd: define couple fields as bitfield 97/31097/5
Alex Zhuravlev [Wed, 31 Jan 2018 06:08:02 +0000 (09:08 +0300)]
LU-10604 osd: define couple fields as bitfield

redefine oo_compat_dot_created and oo_compat_dotdot_created
to save 8 bytes per object.
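
A rough sketch of why turning whole-int flags into bitfields shrinks a
structure. The two structs below are hypothetical stand-ins, not the real
osd object; oo_owner and oo_destroyed are placeholder fields, and the exact
number of bytes saved depends on the ABI and compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "before" layout: each flag occupies a full int. */
struct obj_plain {
        void         *oo_owner;         /* placeholder field */
        unsigned int  oo_destroyed:1;   /* placeholder existing bitfield */
        int           oo_compat_dot_created;
        int           oo_compat_dotdot_created;
};

/* Hypothetical "after" layout: the flags share one bitfield word. */
struct obj_packed {
        void         *oo_owner;         /* placeholder field */
        unsigned int  oo_destroyed:1,
                      oo_compat_dot_created:1,
                      oo_compat_dotdot_created:1;
};

/* Bytes saved per object by the packed layout on this platform. */
size_t bytes_saved(void)
{
        assert(sizeof(struct obj_packed) <= sizeof(struct obj_plain));
        return sizeof(struct obj_plain) - sizeof(struct obj_packed);
}
```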

Change-Id: I92dafc693f1d118debc251d7d064b206e36624f0
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31097
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-7787 mdd: clean up orphan object handling 47/30547/8
Andreas Dilger [Thu, 14 Dec 2017 21:53:58 +0000 (14:53 -0700)]
LU-7787 mdd: clean up orphan object handling

There was a potential problem in the orphan object naming because
it had an embedded space in the filename before the "operation",
which might cause issues if they are accessed for other reasons.
It turns out that there is no need for the "operation" to be
embedded into the filename, since it was always ORPH_OP_UNLINK.

Use standard DFID formatting for the orphan object names, which
is a bit shorter and more efficient on disk, without the embedded
operation type.

Remove the use of "ORPH_OP_UNLINK" in the code, except in the
compatibility code for handling orphans left over after upgrades
from older Lustre versions.  This can be removed at some point
in the future when there are no longer upgrades from pre-2.11
versions.

Rename the orphan handling functions to start with mdd_orphan_*
for consistency with other MDD functions:
orph_index_init -> mdd_orphan_index_init
orph_index_iterate -> mdd_orphan_index_iterate
orph_index_fini -> mdd_orphan_index_fini
orph_declare_index_insert -> mdd_orphan_declare_insert
orph_key_test_and_del -> mdd_orphan_key_test_and_delete
orph_key_fill -> mdd_orphan_key_fill
orph_key_fill_18 -> mdd_orphan_key_fill_20
__mdd_orphan_add -> mdd_orphan_insert
__mdd_orphan_del -> mdd_orphan_delete
__mdd_orphan_cleanup -> mdd_orphan_cleanup_thread

Remove single-line wrapper functions to clarify actual code:
mdd_orphan_write_lock -> dt_write_lock
mdd_orphan_write_unlock -> dt_write_unlock
mdd_orphan_delete_obj -> dt_delete
mdd_orphan_ref_add -> dt_ref_add
mdd_orphan_ref_del -> dt_ref_del

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ica90cc03c3212103c39cba11c4566584bf9cab07
Reviewed-on: https://review.whamcloud.com/30547
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9325 obd: replace lprocfs_str_to_s64 39/30539/16
James Simmons [Fri, 9 Feb 2018 23:46:39 +0000 (18:46 -0500)]
LU-9325 obd: replace lprocfs_str_to_s64

The original goal of lprocfs_str_to_s64[_with_units] was to allow
passing in values with different unit suffixes, e.g. 64K, to a proc file.
There are a few problems with the implementation that prevent its
direct use with sysfs/debugfs. The first problem is that
lprocfs_str_to_s64() was used in a lot of cases where it doesn't
make sense to use it. Often it was used to pass in bool values, or
to retrieve a value as a signed 64-bit integer and then ensure it is
in range of some other unit size. For these cases we can simply move
to kstrtoXXX_from_user(). To handle the case of bool values we
add in support for kstrtobool_from_user().

Replace the lprocfs_rd_uint() and lprocfs_wr_uint() generic callbacks
with a simpler, more direct implementation of ldlm_rw_uint_fops.

There's a slight change in lustre debugfs write semantics: Using kstrtox
causes EINVAL when the written number is followed by other (garbage)
characters, whereas previously the garbage would be ignored and such a
write would succeed.
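
The change in write semantics can be illustrated with a user-space analogy
built on strtoll(). This is a sketch, not the kernel code: both function
names are hypothetical, and strtoll() differs from kstrtox in details such
as accepting leading whitespace.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Strict parse: the whole string must be a number, optionally followed
 * by one trailing newline. Returns 0, -EINVAL, or -ERANGE.
 */
int parse_strict_ll(const char *s, long long *out)
{
        char *end;
        long long val;

        assert(s != NULL && out != NULL);
        errno = 0;
        val = strtoll(s, &end, 0);
        if (end == s)
                return -EINVAL;         /* no digits at all */
        if (errno == ERANGE)
                return -ERANGE;         /* value does not fit */
        if (*end == '\n')
                end++;                  /* tolerate one trailing newline */
        if (*end != '\0')
                return -EINVAL;         /* trailing garbage */
        *out = val;
        return 0;
}

/* Lenient parse: anything after the number is silently ignored. */
int parse_lenient_ll(const char *s, long long *out)
{
        char *end;
        long long val;

        assert(s != NULL && out != NULL);
        errno = 0;
        val = strtoll(s, &end, 0);
        if (end == s)
                return -EINVAL;
        if (errno == ERANGE)
                return -ERANGE;
        *out = val;
        return 0;
}
```

With these helpers, the input "64K" is rejected by the strict variant and
accepted as 64 by the lenient one, which roughly mirrors the before and
after behavior described above.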

Linux-commit: 8b23093269c84b0da1201e1949c91d0beb9892ef

Change-Id: I39f0ba3dc72685fe6e29c7077f37ad4e69a20b4a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Mathias Rav <mathiasrav@gmail.com>
Reviewed-on: https://review.whamcloud.com/30539
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10260 hsm: enable max archive_id posix copytool 71/30171/6
Thomas Stibor [Mon, 20 Nov 2017 15:36:57 +0000 (16:36 +0100)]
LU-10260 hsm: enable max archive_id posix copytool

The current maximum archive-id in the posix copytool is
limited to id < LL_HSM_MAX_ARCHIVE. However, the Lustre HSM
implementation checks the maximum as follows:
if (id > LL_HSM_MAX_ARCHIVE) then flag ERROR.
Thus the valid archive ids are in the
range 0,1,..,32 = LL_HSM_MAX_ARCHIVE, and therefore
32 = LL_HSM_MAX_ARCHIVE should be included.
Note, archive-id = 0 is reserved to mean listening to
ANY archive id and is used as the default when no archive-id
option is provided.
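
The off-by-one between the two checks can be sketched as follows. The
function names are hypothetical; only LL_HSM_MAX_ARCHIVE and its value of
32 come from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value as stated in the commit message. */
#define LL_HSM_MAX_ARCHIVE 32

/* Old copytool-style check: wrongly rejects id == LL_HSM_MAX_ARCHIVE. */
bool archive_id_ok_old(unsigned int id)
{
        return id < LL_HSM_MAX_ARCHIVE;
}

/* Check consistent with "if (id > LL_HSM_MAX_ARCHIVE) then ERROR". */
bool archive_id_ok_new(unsigned int id)
{
        return id <= LL_HSM_MAX_ARCHIVE;
}

/* Count how many ids in 0..LL_HSM_MAX_ARCHIVE+8 a check accepts. */
unsigned int count_accepted(bool (*ok)(unsigned int))
{
        unsigned int id, n = 0;

        assert(ok != NULL);
        for (id = 0; id <= LL_HSM_MAX_ARCHIVE + 8; id++)
                if (ok(id))
                        n++;
        return n;
}
```

The old check accepts ids 0 through 31, while the new one also accepts 32.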

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Change-Id: I6289c8c0e7d86b05f1f2d821b7f6b3127e5fa352
Reviewed-on: https://review.whamcloud.com/30171
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10244 osc: add a bit to indicate osc_page in cache tree 96/30096/6
Bobi Jam [Wed, 15 Nov 2017 07:02:30 +0000 (15:02 +0800)]
LU-10244 osc: add a bit to indicate osc_page in cache tree

Add osc_page::ops_intree to indicate whether the osc_page is in the
osc_object's cache tree, so that when a page cannot be inserted into
the cache because of a race, the cleanup code won't try to remove it
from the cache.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ifcfe158d10c23a40c116414c7f4f86b257e1fa76
Reviewed-on: https://review.whamcloud.com/30096
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9669 tests: check required nrs availability on a facet 60/27660/7
Elena Gryaznova [Fri, 16 Feb 2018 16:59:34 +0000 (19:59 +0300)]
LU-9669 tests: check required nrs availability on a facet

sanityn/77[abcdefg], 78 failed with interop testing due to
missing NRS policy related proc entries on the OSS/MGS/MDS node.

The fix is to check for availability of a required NRS policy on a
facet. The patch removes the version-based checks from the basic NRS
policy regression tests to make interop testing possible with
old servers that have the NRS feature backported.

Author: Jadhav Vikram <jadhav.vikram@seagate.com>

Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanityn
Cray-bug-id: LUS-5259
Seagate-bug-id: MRP-3999
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Parinay Vijayprakash Kondekar <parinay.kondekar@seagate.com>
Change-Id: If0eca183ac388d481ddb3b1d39e0c9def5dd0c37
Reviewed-on: https://review.whamcloud.com/27660
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9624 tests: fix pre-DNE test exceptions/llog usage 35/27535/31
Andreas Dilger [Thu, 8 Jun 2017 20:27:50 +0000 (14:27 -0600)]
LU-9624 tests: fix pre-DNE test exceptions/llog usage

Remove some test skips when running with multiple MDTs in DNE mode,
or fix tests to work better with multiple MDTs.  Tests updated are:
recovery-small: 60
sanity: 17hi, 154ab, 160abcde, 161abcd, 162a, 205, 225ab, 254, 256

In particular, sanity.sh test_160, test_161, test_162 ignored test
failures in DNE mode.  Fix test_160* to work with ChangeLogs stored
on multiple MDTs.  This adds test coverage both because we are no
longer skipping these tests when running in DNE mode and because we
are now validating ChangeLogs running on multiple MDTs at once.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity,recovery-small,sanity-hsm
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3fc3ce85b46f34e507c1e28b4c76574a698cab07
Reviewed-on: https://review.whamcloud.com/27535
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8912 nodemap: fix contiguous range support 97/24397/4
Kit Westneat [Thu, 15 Dec 2016 23:45:00 +0000 (07:45 +0800)]
LU-8912 nodemap: fix contiguous range support

This patch fixes the contiguous range check to allow the addition of
multiple "full" ([0-255]) ranges. As part of this change,
is_contiguous and find_min_max are combined as they were always
called together and the logic is fairly similar. This also removes
the multiple range expression support, since it was broken.

Also, sanity-sec.sh test_10c is added to verify this patch.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I3c49a077039327fcbde87196f82db140f67a74d0
Reviewed-on: https://review.whamcloud.com/24397
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8672 tests: Fix error handling in replay-single test_89 74/22974/7
Abrarahmed Momin [Thu, 22 Feb 2018 16:50:16 +0000 (19:50 +0300)]
LU-8672 tests: Fix error handling in replay-single test_89

Update replay-single test_89() to error out on wait_mds_ost_sync and
wait_delete_completed timeout.

Correct error handling in wait_delete_completed_mds and
wait_delete_completed.

Signed-off-by: Abrarahmed Momin <abrar.habib@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Cray-bug-id: MRP-1680
Test-Parameters: trivial
Change-Id: I54e30221361e73a17ba857cb19b1efcc019b412f
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/22974
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-6863 tests: change obdfilter-survey.sh for CLIENTONLY mode 31/15631/11
Elena Gryaznova [Fri, 23 Feb 2018 18:17:35 +0000 (21:17 +0300)]
LU-6863 tests: change obdfilter-survey.sh for CLIENTONLY mode

obdfilter-survey.sh requires server access and can not be
used for CLIENTONLY mode:
get_devs $oss -> do_nodes $oss "lctl dl"
host_nids_address $oss -> do_nodes $oss "$LCTL list_nids"
Patch fixes the script to compose the targets list
without access to servers.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: MRP-1757
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I910afc940a29ea4f5d8928131652f9b6ef809ce7
Reviewed-on: https://review.whamcloud.com/15631
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10657 utils: fd leak in mirror_split() 10/31410/2
Bobi Jam [Sat, 24 Feb 2018 05:17:17 +0000 (13:17 +0800)]
LU-10657 utils: fd leak in mirror_split()

fd could be leaked in some error handling path for mirror_split().

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I54b06191bd337ca7a9e6b58bdc4ab8197f29ed22
Reviewed-on: https://review.whamcloud.com/31410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10682 lnd: pending transmits dropped silently 74/31374/3
Amir Shehata [Thu, 22 Feb 2018 00:21:02 +0000 (16:21 -0800)]
LU-10682 lnd: pending transmits dropped silently

list_add was being used erroneously. The logic should be to move
the txs on ibp_tx_queue on a local list which is then processed.
The code, however, did the reverse, which would result in the
pending txs not processed and thus dropped silently. This in turn
would lead to peers reference counts at the LNet layer not
decremented since lnet_finalize() might not be called for a message.

Initialize local list and use list_splice_init() to move
transmits on the ibp_tx_queue to the local list.
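
A minimal user-space sketch of the "move everything to a local list"
pattern. It reimplements a tiny circular doubly linked list rather than
using the kernel's list.h, so all helper names here are hypothetical;
list_splice_init_sketch() is meant to mirror what the kernel's
list_splice_init() does.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
        struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
        h->next = h;
        h->prev = h;
}

int list_is_empty(const struct list_head *h)
{
        return h->next == h;
}

/* Append 'node' at the tail of the list headed by 'head'. */
void list_add_tail_node(struct list_head *node, struct list_head *head)
{
        node->prev = head->prev;
        node->next = head;
        head->prev->next = node;
        head->prev = node;
}

/* Move every entry of 'src' to the front of 'dst'; 'src' ends up empty. */
void list_splice_init_sketch(struct list_head *src, struct list_head *dst)
{
        struct list_head *first, *last, *at;

        assert(src != NULL && dst != NULL);
        if (list_is_empty(src))
                return;

        first = src->next;
        last = src->prev;
        at = dst->next;

        first->prev = dst;
        dst->next = first;
        last->next = at;
        at->prev = last;

        list_init(src);         /* the source queue is now empty */
}

size_t list_count(const struct list_head *h)
{
        const struct list_head *p;
        size_t n = 0;

        for (p = h->next; p != h; p = p->next)
                n++;
        return n;
}
```

After the splice, the pending entries sit on the local list in their
original order and the source queue is empty, so the caller can walk the
local list and finalize each entry.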

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I6b36f709db2c89e53e0b3354883a8a1b1052a1dd
Reviewed-on: https://review.whamcloud.com/31374
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9019 selftest: remove remaining cfs_time wrappers 41/31041/3
James Simmons [Thu, 22 Feb 2018 18:00:25 +0000 (13:00 -0500)]
LU-9019 selftest: remove remaining cfs_time wrappers

Remove remaining libcfs time wrappers from lnet selftest. Migrate
crp_stamp to nanoseconds and both timestamps nd_stamp, sn_start to
ktime. This moves away from jiffies, which can vary across platforms,
to something that is consistent on every node. This will ensure
that the results reported to the user will always be correct.

Test-Parameters: trivial testlist=lnet-selftest

Change-Id: Id8d1b195f690c69635de60dd9b501f6d97f90f4d
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31041
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10318 dom: support DATA_VERSION IO type 49/30449/14
Mikhal Pershin [Tue, 5 Dec 2017 20:10:02 +0000 (23:10 +0300)]
LU-10318 dom: support DATA_VERSION IO type

Add support for the DATA_VERSION IO type: return the data version
and layout version from the MDT if requested by CLIO.
Also ensure that the version is changed on punch and write
operations.
This fixes HSM archive with DOM files.

Change-Id: Id7b63697ffc48c889370638682625ea04a0348c5
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-on: https://review.whamcloud.com/30449
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9431 obd: resolve config log sysfs issues 43/30143/8
James Simmons [Tue, 6 Feb 2018 16:43:07 +0000 (11:43 -0500)]
LU-9431 obd: resolve config log sysfs issues

This resolves long-standing issues with modifying sysfs settings
on multiple nodes simultaneously by running a single command on
the backend MGS server. There are two ways to change the settings,
LCFG_PARAM and LCFG_SET_PARAM. For the LCFG_PARAM case we create
a new function class_modify_config() that grabs the attributes
from the passed in kobject. We can use those attributes to
modify the sysfs settings. If we can't find the attribute then
send a uevent to let userland resolve the change. For the
LCFG_SET_PARAM case we handle two classes of settings. The function
class_set_global() was modified to handle the top lustre sysfs
files since they are not searchable with kset_find_obj.
To make the new version of class_set_global() work both sets of
sysfs attributes for the top level sysfs entries have been merged.
If we can find a kobject with kset_find_obj then we can send
a uevent so userland can manage the change.

Change-Id: I4e7f19c4a232767119355c3c96e5752a10000da8
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30143
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10676 dkms: Provide lustre-dkms for lustre-zfs-dkms 29/31329/2
Nathaniel Clark [Thu, 15 Feb 2018 20:36:43 +0000 (15:36 -0500)]
LU-10676 dkms: Provide lustre-dkms for lustre-zfs-dkms

To facilitate upgrading from old lustre-dkms style package to new
lustre-zfs-dkms, provide the old package in lustre-zfs-dkms.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ia6f0fffad35ad8e219bfbe05527865ccd1904ff7
Reviewed-on: https://review.whamcloud.com/31329
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9761 dkms: Add ldiskfs dkms support 90/27990/10
Nathaniel Clark [Fri, 5 Aug 2016 15:22:21 +0000 (11:22 -0400)]
LU-9761 dkms: Add ldiskfs dkms support

This breaks out lustre-dkms into lustre-zfs-dkms and
lustre-ldiskfs-dkms (or lustre-all-dkms) as "flavours" of lustre
server dkms.  The reason for the flavours is to prevent lustre
ldiskfs dkms build from having ZFS dependencies, and to maintain
lustre zfs dkms build ordering when rebuilding for new kernels.
This also prevents building of tests and utils when --disable-tests
and --disable-utils (respectively) are passed to configure.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Iba500d9830a8f57662066141a176c381151861f4
Reviewed-on: https://review.whamcloud.com/27990
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9324 lfs: add setstripe --yaml=template parameter 60/26860/34
Bobi Jam [Tue, 25 Apr 2017 01:18:44 +0000 (09:18 +0800)]
LU-9324 lfs: add setstripe --yaml=template parameter

Add a "lfs setstripe --yaml=<yaml_template> <lustre_file_or_dir>"
usage to set stripe using stripe info from a YAML template file.

The YAML template file can be obtained from
$ lfs getstripe --yaml <lustre_file_or_dir>
and the user can manually edit it to tweak stripe options.

This patch fixes two cyaml issues:
1. a YAML_BLOCK_ENTRY_TOKEN can follow a YAML_VALUE_TOKEN.
2. free_node() has a memory leak; it needs to free
   cYAML::cy_valuestring and cYAML::cy_string if possible.
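
The second issue can be sketched with a simplified node type. This is not
the real cYAML structure or free_node(): struct yaml_node and every
function below are hypothetical, and only the field names cy_string and
cy_valuestring are taken from the message above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified, hypothetical node: a key, a string value, and links. */
struct yaml_node {
        struct yaml_node *cy_next;
        struct yaml_node *cy_child;
        char *cy_string;        /* key name, may be NULL */
        char *cy_valuestring;   /* string value, may be NULL */
};

static char *dup_string(const char *s)
{
        char *copy;

        if (s == NULL)
                return NULL;
        copy = malloc(strlen(s) + 1);
        if (copy != NULL)
                strcpy(copy, s);
        return copy;
}

struct yaml_node *node_new(const char *key, const char *value)
{
        struct yaml_node *n = calloc(1, sizeof(*n));

        if (n == NULL)
                return NULL;
        n->cy_string = dup_string(key);
        n->cy_valuestring = dup_string(value);
        return n;
}

/* Link 'child' at the head of 'parent's child list. */
void node_add_child(struct yaml_node *parent, struct yaml_node *child)
{
        assert(parent != NULL && child != NULL);
        child->cy_next = parent->cy_child;
        parent->cy_child = child;
}

/* Free a node, its siblings and children; returns the nodes released. */
size_t node_free_tree(struct yaml_node *n)
{
        size_t freed = 0;

        while (n != NULL) {
                struct yaml_node *next = n->cy_next;

                freed += node_free_tree(n->cy_child);
                free(n->cy_valuestring);        /* free(NULL) is a no-op */
                free(n->cy_string);
                free(n);
                freed++;
                n = next;
        }
        return freed;
}
```

Freeing only the node itself, without the two owned strings, is the kind
of leak the message describes.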

Test-Parameters: testlist=sanity-pfl,sanity-flr
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I78149bb011fbc03387cbe3d057eb030550dd75ae
Reviewed-on: https://review.whamcloud.com/26860
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8727 mgs: remove skip records from config file 45/23245/23
Vladimir Saveliev [Mon, 19 Feb 2018 13:14:28 +0000 (16:14 +0300)]
LU-8727 mgs: remove skip records from config file

Configuration logs are append-only files of limited size.  Over the
course of time the logs may grow beyond the size limit.  Usually,
configuration logs keep needless records marked as SKIP. The new lctl
command "clear_conf" is added to allow administrators to clear
configuration files by removing these SKIP records. The lctl man page
is updated.
A conf-sanity test (for ldiskfs only) is added to test the new command.

Change-Id: I274cb48138c16e536cfca56836c3313e944eba56
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Cray-bug-id: MRP-2091
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Alexey Leonidovich Lyashkov <c17817@cray.com>
Tested-by: Elena V. Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/23245
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9437 lfsck: handle LMV EA for migrating directory 66/31266/5
Fan Yong [Tue, 20 Feb 2018 19:25:55 +0000 (03:25 +0800)]
LU-9437 lfsck: handle LMV EA for migrating directory

For a directory that is being migrated, its LMV EA contains not only
the LMV header, but also the FIDs for both source and target. So the
LMV EA size is larger. The lfsck_read_stripe_lmv() logic needs to
handle this case properly.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ic43853fb5ca058042fafa0f6c81fa99d4b8d8897
Reviewed-on: https://review.whamcloud.com/31266
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
22 months agoLU-10615 osd: stop OI scrub before FLDB closed 41/31241/3
Fan Yong [Fri, 23 Feb 2018 11:44:05 +0000 (19:44 +0800)]
LU-10615 osd: stop OI scrub before FLDB closed

OI scrub may check the FLDB when it scans the device. When unmounting
the device, we need to stop OI scrub before closing the FLDB
to avoid invalid memory accesses.

Also includes some code optimization and cleanup.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib358abd77f970c12b0c29a603f9bcaf8e310cc98
Reviewed-on: https://review.whamcloud.com/31241
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoRevert "LU-8856 osd: mark specific transactions netfree" 42/31442/2
Oleg Drokin [Tue, 27 Feb 2018 18:27:40 +0000 (18:27 +0000)]
Revert "LU-8856 osd: mark specific transactions netfree"

This patch caused very frequent sanity-lfsck 9a failures
reported in LU-10732

This reverts commit 8d1639b5cf1edbc885876956dcd6189173c00955.

Change-Id: Ibf353042d2d37d37eccbf3895453f51ca07ea6d3
Reviewed-on: https://review.whamcloud.com/31442
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8878 tests: skip several tests for CLIENTONLY mode 28/24028/4
Elena Gryaznova [Wed, 7 Feb 2018 12:01:35 +0000 (15:01 +0300)]
LU-8878 tests: skip several tests for CLIENTONLY mode

tests 107, 300, 301, 302 fail SINGLEMDS, so they are to be
skipped for CLIENTONLY mode.

Author: Chennaiah Palla <chennaiah.palla@seagate.com>

Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity-hsm
Cray-bug-id: LUS-4966
Seagate-bug-id: MRP-3529
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Change-Id: I8286bbaa403089a4a85fcf0c4d9451fe24e67836
Reviewed-on: https://review.whamcloud.com/24028
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-4423 lnet: free a struct kib_conn outside of the kiblnd_destroy_conn() 73/31273/3
Dmitry Eremin [Mon, 12 Feb 2018 12:37:18 +0000 (15:37 +0300)]
LU-4423 lnet: free a struct kib_conn outside of the kiblnd_destroy_conn()

To avoid confusion, this fix moves the freeing of struct kib_conn outside of
the function kiblnd_destroy_conn().

Change-Id: Iae28802f5d319570064a504feb14dffd13a22b84
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/31273
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10652 tests: restructure sanity 133[f,g] 45/31245/4
Elena Gryaznova [Wed, 14 Feb 2018 14:24:06 +0000 (17:24 +0300)]
LU-10652 tests: restructure sanity 133[f,g]

sanity 133f and 133g both get skipped in CLIENTONLY mode,
but the tests are meant to run on clients in this mode.

The fix separates code of the tests so that 133f tests
clients, while 133g runs on servers. Then in CLIENTONLY mode
only 133g is skipped.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: envdefinitions=ONLY=133 testlist=sanity
Seagate-bug-id: MRP-2438
Cray-bug-id: LUS-4289
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ibba69a3fd4fd4a9f8d90729ec2a294443dd4f29e
Reviewed-on: https://review.whamcloud.com/31245
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10639 tests: rename the tests 30/31230/4
Elena Gryaznova [Fri, 9 Feb 2018 09:01:54 +0000 (12:01 +0300)]
LU-10639 tests: rename the tests

The following tests are renamed to be run separately
from other tests in the groups:
sanity-hsm:
    test_1 to test_1A
    test_9 to test_9A
    test_26 to test_26A
    test_220 to test_220A
    test_224 to test_224A

conf-sanity:
    test_28 to test_28A

lustre-rsync-test.sh:
    test_1 to test_1A

sanity.sh:
    test_239 to test_239A

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ajay Nair <ajay.nair@seagate.com>
Cray-bug-id: LUS-2608, LUS-5328
Seagate-bug-id: MRP-4695, MRP-4121
Test-Parameters: testlist=sanity,sanity-hsm,conf-sanity
Test-Parameters: testlist=lustre-rsync-test
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: Ib1542d55328c0fb60c0c2c59257fa9f5742a57dc
Reviewed-on: https://review.whamcloud.com/31230
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10617 tests: Dir's and file's stripe counts are mismatched 93/31193/3
Elena Gryaznova [Tue, 13 Feb 2018 18:17:01 +0000 (21:17 +0300)]
LU-10617 tests: Dir's and file's stripe counts are mismatched

Add a case to test_24 of ost-pools.sh: when the stripe count of
a directory equals -1, the stripe count of files in the directory
must equal the OST count.

Author: Alyona Romanenko <alyona.romanenko@seagate.com>

Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=ost-pools envdefinitions="ONLY=24"
Cray-bug-id: LUS-4467
Seagate-bug-id: MRP-2746
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: I91e7c65e178c7706f53a95a2807e06b1bc8e0d24
Reviewed-on: https://review.whamcloud.com/31193
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10612 tests: reply_single.sh,test_48: No space left 82/31182/2
Elena Gryaznova [Tue, 6 Feb 2018 14:20:53 +0000 (17:20 +0300)]
LU-10612 tests: reply_single.sh,test_48: No space left

The MDS needs time to discover the OST state, attempt to
recover, fail and recover again.

Author: gaurav mahajan <gaurav.mahajan@seagate.com>

Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=replay-single envdefinitions="ONLY=48"
Cray-bug-id: LUS-4384
Seagate-bug-id: MRP-2616
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Change-Id: I2b3cca70872b7c9f13c64b50e1b4373096fbc147
Reviewed-on: https://review.whamcloud.com/31182
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10600 tests: clean up sanity tests 64d and 65k 59/31159/4
James Nunez [Fri, 2 Feb 2018 23:53:51 +0000 (16:53 -0700)]
LU-10600 tests: clean up sanity tests 64d and 65k

Several sanity tests create files or modify the environment
and do not clean up or return the environment to its
original state. sanity test 64d fills an OST and does not
clean up the file after the OST is full. sanity test 65k
sets OSTs to be inactive and, on error, does not set the OST
back to active.

These two tests need to clean up after themselves.

Test-Parameters: trivial testlist=sanity,sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I01bc376680798815c9dd398da7781c92c6b70b2f
Reviewed-on: https://review.whamcloud.com/31159
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10570 obd: fix statfs handling 58/31158/3
James Simmons [Sun, 4 Feb 2018 19:38:25 +0000 (14:38 -0500)]
LU-10570 obd: fix statfs handling

The function lod_qos_statfs_updates() refreshes statfs
data every N seconds. Taking lq_rw_sem can take a very long
time so the testing for stale stats had to be done again after
taking the semaphore. Now that we are using only seconds
resolution it is more likely that max_age and obd_osfs_age
will be equal compared to when the code was using jiffies.
So only release the lock right away when osfs_age has passed
the max_age.

The comment 'use the value of cfs_time_current + HZ' for
obd_statfs() and obd_statfs_async() needs to be updated for
the time64_t case.

Simplify llite_statfs_internal() handling by calculating
max_age inside of llite_statfs_internal(). This makes the
code cleaner.

Change-Id: I22aa5d4d78b30d6480e73998e05ec6582a316d4f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31158
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8854 llapi: remove lustre specific strlcpy & strlcat functions 98/29798/6
James Simmons [Sat, 10 Feb 2018 16:18:53 +0000 (11:18 -0500)]
LU-8854 llapi: remove lustre specific strlcpy & strlcat functions

In the days when Lustre supported many more platforms, some of those
platforms natively supported strl[cpy|cat], but Linux has always lacked
these functions. So Lustre ended up providing its own versions of
these functions to fill the gap. Today Lustre only
supports Linux platforms, which have a version of libc that will
most likely never support strl[cat|cpy]. Since this is the case we
can remove the AC_CHECK_FUNCS since they only test against libc.
We could support detecting strl[cpy|cat] in another library, but
many libraries provide their own version, so the chances of collision
are high. The best solution is to remove strlcpy and strlcat by
replacing those functions with string functions that are always
provided by the standard C library.
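
As an illustration only, one way to get a bounded, always-terminated
copy from standard C alone is snprintf(). The helper name copy_string()
is hypothetical, and the message does not say which standard functions
the patch actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: bounded string copy using only standard C. */
static int copy_string(char *dst, size_t dst_size, const char *src)
{
	int len;

	assert(dst != NULL && src != NULL && dst_size > 0);

	/* snprintf() writes at most dst_size bytes including the NUL */
	len = snprintf(dst, dst_size, "%s", src);

	/* report an error or truncation to the caller */
	if (len < 0 || (size_t)len >= dst_size)
		return -1;
	return 0;
}
```

Unlike strncpy(), this always terminates the destination, and the return
value lets the caller detect truncation much as strlcpy() did.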

Change-Id: I72df93c8f83ed1aad80653fe0d1c4d54d1d8e2f2
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9727 lustre: record if enable_audit is set on nodemap 14/28314/18
Sebastien Buisson [Wed, 2 Aug 2017 14:47:47 +0000 (23:47 +0900)]
LU-9727 lustre: record if enable_audit is set on nodemap

Record changelogs from a client only if it belongs to a nodemap
on which enable_audit is set, and changelogs are activated.
If the client is not explicitly assigned to a nodemap, the enable_audit
value from the default nodemap is used.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31d361cfd8cc69db68b60298934cbbef4af0d75d
Reviewed-on: https://review.whamcloud.com/28314
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
22 months agoLU-9250 tests: add parallel-scale xdd test 76/26176/5
Elena Gryaznova [Fri, 16 Feb 2018 10:11:04 +0000 (13:11 +0300)]
LU-9250 tests: add parallel-scale xdd test

Patch adds parallel-scale xdd test.

Our customers report Lustre issues hit during
xdd tests. We need a flexible way to reproduce the
failures.

Author: Chennaiah Palla <chennaiah.palla@seagate.com>

Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5206
Seagate-bug-id: MRP-3915
Test-Parameters: testlist=parallel-scale
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Change-Id: Ia4823aa8ce64aad3d43b2611b24f48a532b8796c
Reviewed-on: https://review.whamcloud.com/26176
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-6867 test: detect active facet based on current state 38/15638/17
Elena Gryaznova [Fri, 17 Jul 2015 16:32:49 +0000 (19:32 +0300)]
LU-6867 test: detect active facet based on current state

Lustre failover tests cannot be run test-by-test
on a setup with ${facet}_HOST != ${facet}failover_HOST
because t-f does not restore the facet state.
t-f keeps this info in "${facet}active" files, which are created
when facet_failover() is executed for the first time in the test session.
Before facet_failover() is executed these files are empty and
the active facet is ${facet} by default.
When tests are executed test-by-test, the active facet is
${facet}failover after the 1st test completes, and the 2nd test is started
with ${facet}failover active without this info stored in the
${facet}active files.

Patch contains the following changes:
- add the active facet detection based on current lustre state;
- fix sanity-hsm defect: exit with error if agt${n}1_HOST is empty.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-2680
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Change-Id: Ie42baaa55a6433596e6004d16eb5c18ae2ef7479
Reviewed-on: https://review.whamcloud.com/15638
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10680 mdd: fix run_gc_task uninitialized 47/31347/2
Bruno Faccini [Sun, 18 Feb 2018 19:13:04 +0000 (20:13 +0100)]
LU-10680 mdd: fix run_gc_task uninitialized

run_gc_task was mistakenly left uninitialized in the previous
patch for LU-7340. This was silently ignored by gcc even
though the -Wall option is used during the build, possibly because no
optimization level/option is requested, and the -Wuninitialized
check may only be performed with optimization.
The side effect is that the generated assembly code completely
avoids the run_gc_task usage from the source, and thus a kthread
for ChangeLogs garbage-collection is created upon each
record creation, even when none of the garbage-collection
conditions is triggered.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ieb9ce062ba6ebf0c365c1e6f8a57f89dd39e0a9d
Reviewed-on: https://review.whamcloud.com/31347
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
22 months agoLU-10561 flr: remove "--parent" option from lfs mirror command 98/31298/5
Jian Yu [Tue, 20 Feb 2018 22:35:52 +0000 (14:35 -0800)]
LU-10561 flr: remove "--parent" option from lfs mirror command

The "--parent" option for the "lfs mirror create/extend" command was
originally designed to use default stripe options inherited
from the parent directory. However, if the parent directory has a
composite layout, it is ambiguous which component the stripe
options should be inherited from. And if any other option
is specified, it is also inconsistent to
inherit the layout of the parent directory.

So, this patch removes "--parent" option to eliminate ambiguity.
For "--pool|-p" option, this patch supports specifying "none" to
clear the pool name and inherit from parent directory.

Unspecified stripe count, stripe size and OST pool name will
inherit from previous component. If there is no previous component,
then unspecified stripe count and stripe size attributes will
inherit from filesystem-wide default values. Unspecified or
cleared OST pool name will inherit from parent directory.

Change-Id: Ib0ec3cbc65fb307c42881f35dc676090ab8319ff
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31298
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10663 utils: clear errno before check 05/31305/4
John L. Hammond [Wed, 14 Feb 2018 18:27:54 +0000 (12:27 -0600)]
LU-10663 utils: clear errno before check

In jt_obd_destroy() clear errno before calling strtoull() and checking
it.
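
A minimal sketch of the pattern, assuming a hypothetical helper
parse_u64() rather than the actual jt_obd_destroy() code. strtoull()
sets errno on overflow but never resets it to zero on success, so a
stale value from an earlier call would otherwise look like a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the Lustre source. */
static int parse_u64(const char *str, unsigned long long *out)
{
	char *end;

	assert(str != NULL && out != NULL);

	errno = 0;	/* clear any stale error before the call */
	*out = strtoull(str, &end, 0);

	/* reject overflow, empty input and trailing garbage */
	if (errno != 0 || end == str || *end != '\0')
		return -1;
	return 0;
}
```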

Test-Parameters: trivial testlist=obdfilter-survey

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I686cd6eb0a57248177e5b0878df5e3f450fbc942
Reviewed-on: https://review.whamcloud.com/31305
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9724 ldiskfs: update ext4-large-eas.patch to match upstream ext4 33/31033/9
Emoly Liu [Fri, 26 Jan 2018 07:26:00 +0000 (15:26 +0800)]
LU-9724 ldiskfs: update ext4-large-eas.patch to match upstream ext4

In order to match the enhanced ea_inode functionality being landed
to the upstream ext4 kernel tree, ext4-large-eas.patch is modified
to start properly initializing some of the fields we don't
currently use to minimize the interoperability issues.

In particular, the new EA inode refcount is initialized to 1, and
hash field is computed based on the xattr value as it is in the
upstream kernel patch.

However, ext4_xattr_inode_get_hash() has not been added to the
ldiskfs code, so this hash value is not used anywhere. If the
new checksum driver (sbi->s_chksum_driver) is not available, the hash
value will be 0 in the current implementation, until we find a way
to calculate it properly based on the xattr value.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I2bcf45c67a580f2f545816e1a70a6322c6ccc368
Reviewed-on: https://review.whamcloud.com/31033
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
22 months agoLU-10277 utils: 'lfs mkdir -i -1' pick the less full MDTs 98/30598/11
Lai Siyao [Mon, 4 Dec 2017 07:38:25 +0000 (15:38 +0800)]
LU-10277 utils: 'lfs mkdir -i -1' pick the less full MDTs

If 'lfs mkdir -i -1 -c count' is specified, it will 'df' first,
and then randomly pick 'count' less full MDTs as specific MDTs.

Add sanity test 413.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I2ce1720479d37b1ae397054743afae865129fee3
Reviewed-on: https://review.whamcloud.com/30598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9019 obd: migrate upcall cache to time64_t 64/31064/3
James Simmons [Wed, 7 Feb 2018 06:20:54 +0000 (01:20 -0500)]
LU-9019 obd: migrate upcall cache to time64_t

Move all the upcall cache time handling from jiffies to time64_t.

Change-Id: I86039c6e6e35ac83b773753c952936f1b2f5e14a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31064
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10270 lnet: remove an early rx code 54/30254/7
Alexey Lyashkov [Thu, 23 Nov 2017 11:28:18 +0000 (14:28 +0300)]
LU-10270 lnet: remove an early rx code

Early RX was added to the o2ib LND as an attempt to handle a reordering
problem, where messages arrive before the actual connection is set up.
But this code can fill the whole incoming queue, so that a normal
connect will not be processed.

Cray-bug-id: MRP-4638
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I2efc73534a20c4628ed462ee5055c901dbf44278
Reviewed-on: https://review.whamcloud.com/30254
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10181 tests: add FIO as test for DOM 59/30059/12
Mikhal Pershin [Mon, 13 Nov 2017 15:23:54 +0000 (18:23 +0300)]
LU-10181 tests: add FIO as test for DOM

Add FIO test for basic DOM performance tracking,
- remove unused smallfileio test,
- make parameter setting compatible with DNE,
- turn off extra stats output by default
- format test output

Test-Parameters: trivial mdssizegb=20 testlist=sanity-dom,dom-performance
Signed-off-by: Mikhal Pershin <mike.pershin@intel.com>
Change-Id: Id4236643e841165d35e7d3f0c1ab64ae8f9e1751
Reviewed-on: https://review.whamcloud.com/30059
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-9019 osd-ldiskfs: migrate to 64 bit time 57/29857/10
James Simmons [Mon, 5 Feb 2018 17:14:51 +0000 (12:14 -0500)]
LU-9019 osd-ldiskfs: migrate to 64 bit time

Replace cfs_time_current_sec() with ktime_get_real_seconds() to
avoid the overflow issues in 2038. Besides changing the struct
scrub_file sf_time_* fields to time64_t for use with
ktime_get_real_seconds(), the other fields can also be moved to
time64_t since we don't need precision better than one
second for the scrubbing code. The dr_* time fields in struct
osd_iobuf are jiffies, which do get reported in the histograms.
This was done on the assumption that jiffies equal milliseconds, which
is not always the case. Since we need better than one second
resolution, move the dr_* time fields to ktime. This way the value
passed to lprocfs_oh_tally_log() will always be in milliseconds.
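
To illustrate why jiffies cannot be treated as milliseconds, here is a
small user-space model of the conversion. The function ticks_to_msecs()
and its hz parameter are hypothetical stand-ins for the kernel's tick
rate and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model; hz stands in for the kernel tick rate (HZ). */
static uint64_t ticks_to_msecs(uint64_t ticks, unsigned int hz)
{
	assert(hz > 0);

	/* one tick lasts 1000 / hz milliseconds */
	return ticks * 1000 / hz;
}
```

With hz = 1000 a tick is exactly one millisecond, but with hz = 250 the
same tick count covers four times as much wall-clock time, which is why
a fixed-unit clock source gives consistent histogram values.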

Change-Id: Ibce7f7d9f972c8d3188271950f68dcda7663676f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29857
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-5695 libcfs: watchdog dispatch thread fix 55/12155/8
Alexander Zarochentsev [Fri, 16 Feb 2018 18:06:57 +0000 (13:06 -0500)]
LU-5695 libcfs: watchdog dispatch thread fix

lc_watchdogd may stop immediately after start
because nobody clears the stop flag.

Xyratex-bug-id: MRP-2108 MRP-1913
Change-Id: I1eaaf0330c111b7f2b17081c716ef8c200677d6b
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-on: https://review.whamcloud.com/12155
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10650 obd: add check to obd_statfs 43/31243/7
Alexander Boyko [Fri, 9 Feb 2018 12:07:19 +0000 (07:07 -0500)]
LU-10650 obd: add check to obd_statfs

A race could happen between mount and lctl get_param,
because procfs files are ready before obd initialization is complete.
For example:
3372:0:(dt_object.h:2509:dt_statfs()) ASSERTION( dev )
3372:0:(dt_object.h:2509:dt_statfs()) LBUG
Pid: 3372, comm: lctl
Call Trace:
libcfs_call_trace+0x4e/0x60[libcfs]
lbug_with_loc+0x4c/0xb0[libcfs]
tgt_statfs_internal+0x2ea/0x350[ptlrpc]
ofd_statfs+0x66/0x470 [ofd]
lprocfs_filesfree_seq_show+0xf6/0x520 [obdclass]
ofd_filesfree_seq_show+0x12/0x20 [ofd]

The patch adds a check of completed obd_setup to obd_statfs().
The patch adds the sanity 276 test.

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-2665
Change-Id: I55a9ffa7e036f486388a8f548051d28974d47951
Reviewed-on: https://review.whamcloud.com/31243
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8990 lod: put root at cleanup 43/31143/3
Lai Siyao [Fri, 2 Feb 2018 15:00:15 +0000 (23:00 +0800)]
LU-8990 lod: put root at cleanup

'lod_md_root' was put at precleanup, but the soak test shows that a race
exists and some ongoing request may re-initialize it, so move this put
to cleanup.

Also add debug code to dump remaining objects if lod device is still
referenced at lod_device_free().

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I6f1ab0ba149ccf95279c1182c90a5588607ad8fa
Reviewed-on: https://review.whamcloud.com/31143
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10550 flr: resync RDONLY state FLR file 10/31010/8
Bobi Jam [Wed, 24 Jan 2018 15:32:37 +0000 (23:32 +0800)]
LU-10550 flr: resync RDONLY state FLR file

When some components fail to resync for various reasons,
those components will still have the STALE bit set but the file state may
become RDONLY.

This patch makes it possible to resync an RDONLY FLR file.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I2e3b518bb969aedd7f214e6b09b895079cab69ab
Reviewed-on: https://review.whamcloud.com/31010
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10356 llite: have ll_write_end to sync for DIO 59/30659/2
Vladimir Saveliev [Tue, 26 Dec 2017 19:49:58 +0000 (22:49 +0300)]
LU-10356 llite: have ll_write_end to sync for DIO

Direct IO write uses buffered write for pages which could not be
released. If non-adjacent pages are not releasable, the
vio->u.write.vui_queue list becomes non-contiguous, which makes
page_list_sanity_check() fail.

Have ll_write_commit do vvp_io_write_commit() when it is called in
the course of direct IO.

Cray-bug-id: MRP-4415
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I21e653c4d45553c85ff5ded8edf22017966c7ba4
Reviewed-on: https://review.whamcloud.com/30659
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-8856 osd: mark specific transactions netfree 30/26930/20
Alex Zhuravlev [Wed, 3 May 2017 12:45:13 +0000 (15:45 +0300)]
LU-8856 osd: mark specific transactions netfree

osd-zfs should mark some transactions netfree. This means those transactions
are expected to release space (rather than consume it), and for this kind of
transaction half of the reserved space is available.

Change-Id: I71605bc224882aafac26b3dfb0f3d7e82af8fde8
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/26930
Tested-by: Jenkins
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10670 test: make sanity-flr test_43 more reliable 15/31315/3
Bobi Jam [Thu, 15 Feb 2018 07:59:14 +0000 (15:59 +0800)]
LU-10670 test: make sanity-flr test_43 more reliable

Make sanity-flr test_43 more reliable by setting the active
state of the OSP device instead of the OSC device to simulate an OST's
unavailability.

Test-Parameters: testlist=sanity-flr
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ibfb4a54479a7dafff251dd3645b03ec172b6884e
Reviewed-on: https://review.whamcloud.com/31315
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10443 test: Handle file lifecycle correctly 54/31254/2
Patrick Farrell [Fri, 9 Feb 2018 15:00:13 +0000 (09:00 -0600)]
LU-10443 test: Handle file lifecycle correctly

The current lockahead_test.c removes the test file on exit,
which will destroy the locks which sanity.sh counts to
verify correct operation.  This usually works because
sanity.sh wins the race with the object destroy command
from the MDS to the OSS.

Change lockahead_test.c to remove the test file on entry,
and to use $tfile rather than its own file, so it is
automatically cleaned up by sanity.

Change-Id: I3cd1fdb7f33da167ca21476a7b3cbe5f57fd5782
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31254
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10634 kernel: kernel update [SLES12 SP3 4.4.114-94.11] 24/31224/3
Bob Glossman [Wed, 7 Feb 2018 23:04:40 +0000 (15:04 -0800)]
LU-10634 kernel: kernel update [SLES12 SP3 4.4.114-94.11]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I3ffcd4c368b2976cffa6a517f9fabcf674781ac9
Reviewed-on: https://review.whamcloud.com/31224
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10603 ptlrpc: export req_buffers_max via procfs 62/31162/2
Alex Zhuravlev [Mon, 5 Feb 2018 10:03:17 +0000 (13:03 +0300)]
LU-10603 ptlrpc: export req_buffers_max via procfs

after LU-9372 gcc7 complains:
lustre/ptlrpc/lproc_ptlrpc.c:382:16: error: ‘ptlrpc_lprocfs_req_buffers_max_fops’ defined but not used [-Werror=unused-const-variable=]
 LPROC_SEQ_FOPS(ptlrpc_lprocfs_req_buffers_max);
                 ^

Change-Id: Ie4806b79d104c7ea9aa34b6a8a280587fccef689
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31162
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10560 libcfs: handle rename to wait_queue_entry_t 53/31153/11
Mike Marciniszyn [Fri, 9 Feb 2018 18:22:50 +0000 (13:22 -0500)]
LU-10560 libcfs: handle rename to wait_queue_entry_t

The 4.13 kernel renames wait_queue_t to wait_queue_entry_t.

Add a probe and handle rename across the code base and have
a define to translate to the new name when indicated.
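
A user-space sketch of what such a compatibility define could look like.
The macro name HAVE_WAIT_QUEUE_ENTRY is assumed here as the result of
the configure probe, and the struct below is only a stand-in so the
example compiles outside the kernel; neither is taken from the message.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the pre-4.13 kernel type (illustration only). */
typedef struct wait_queue_stub {
	int flags;
} wait_queue_t;

/*
 * Assumed probe result: when the kernel does not provide the new
 * name, map it onto the old one so callers can use the new name.
 */
#ifndef HAVE_WAIT_QUEUE_ENTRY
#define wait_queue_entry_t wait_queue_t
#endif

/* Code written against the new name builds on both kernel variants. */
static int entry_flags(const wait_queue_entry_t *entry)
{
	assert(entry != NULL);
	return entry->flags;
}
```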

Test-Parameters: trivial

Change-Id: I8f0f5ec4d02ccb270acb72ccffe13f0ecf6bd2f7
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31153
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
22 months agoLU-10560 lustre_compat: Convert GFP_TEMPORARY to GFP_KERNEL 52/31152/6
Mike Marciniszyn [Fri, 2 Feb 2018 16:45:54 +0000 (08:45 -0800)]
LU-10560 lustre_compat: Convert GFP_TEMPORARY to GFP_KERNEL

The 4.14 kernel removes this gfp.h define.

Adjust the code to use GFP_KERNEL as the upstream
patch does.

Change-Id: I40fff2724499fa17aa285507e0fd9b21f4afc070
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31152
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
22 months agoLU-10574 tests: remove useless check from sanity-dom.sh 74/31074/3
Elena Gryaznova [Wed, 7 Feb 2018 20:09:58 +0000 (23:09 +0300)]
LU-10574 tests: remove useless check from sanity-dom.sh

Tests test_sanity() and test_sanityn() are skipped if not started
from the lustre/tests directory because of an incorrect check
that ./sanity.sh exists.
The patch removes the check for files which are part of
lustre/tests.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2594
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Test-Parameters: testlist=sanity-dom
Change-Id: I51ad517fbf3ff653d9a11994eb280daee589a886
Reviewed-on: https://review.whamcloud.com/31074
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
22 months agoLU-10449 nrs: Generic TBF policy can't be shown correctly 96/30696/6
Qian Yingjin [Wed, 3 Jan 2018 09:21:10 +0000 (17:21 +0800)]
LU-10449 nrs: Generic TBF policy can't be shown correctly

After setting a TBF NID/OPCode/JobID policy and switching to the generic
policy, the output of "lctl get_param ost.OSS.ost.nrs_policies"
is not displayed correctly.

Change-Id: If8dcb7ae6ade634ec7ec4dfcb5887501cda90cdf
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/30696
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>