Whamcloud - gitweb
fs/lustre-release.git
6 years ago LU-10520 mkfs: enable extents for big MDT 37/31037/13
Yang Sheng [Fri, 26 Jan 2018 13:35:33 +0000 (21:35 +0800)]
LU-10520 mkfs: enable extents for big MDT

Enable extents when the MDT size is larger than 16T.

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Iccd39c48e715a3f084cb5ee803be0541563f5d10
Reviewed-on: https://review.whamcloud.com/31037
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
6 years ago LU-10680 mdd: disable changelog garbage collection by default 52/31552/2
John L. Hammond [Tue, 6 Mar 2018 19:25:50 +0000 (13:25 -0600)]
LU-10680 mdd: disable changelog garbage collection by default

Changelog garbage collection has introduced some instability so
disable it by default.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I708198d76af060cb796de89266ee74a968f92ac1
Reviewed-on: https://review.whamcloud.com/31552
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10786 tests: add stripe size to lfs setstripe 69/31569/3
James Nunez [Wed, 7 Mar 2018 16:27:59 +0000 (09:27 -0700)]
LU-10786 tests: add stripe size to lfs setstripe

Since the default stripe size increased from one to four
MB, we need to add the stripe size parameter to calls
to 'lfs setstripe' for composite files when the component
size is less than the file system stripe size. Thus, add
the stripe size parameter to calls to 'lfs setstripe' for
sanity-flr tests 45 and 46 and sanity-pfl test 16.

Test-Parameters: trivial testlist=sanity-flr,sanity-pfl

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic169eaebd922175467f010b159a2b065fb91b3fb
Reviewed-on: https://review.whamcloud.com/31569
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8066 fid: move all files from procfs to debugfs 66/28366/10
James Simmons [Sun, 21 Jan 2018 16:55:10 +0000 (11:55 -0500)]
LU-8066 fid: move all files from procfs to debugfs

Linux-commit: f3aa79fbef7942971825fb2084a88e9527c6b04c

Besides the client port from upstream, also port the server
side proc entries to debugfs.

Change-Id: I934fc5a39c8c407799abd0d6154240d3a579c93e
Signed-off-by: Dmitry Eremin <dmiter4ever@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28366
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8066 obd: final pieces for sysfs/debugfs support. 08/28108/24
James Simmons [Thu, 22 Feb 2018 17:26:16 +0000 (12:26 -0500)]
LU-8066 obd: final pieces for sysfs/debugfs support.

This patch puts in place the basics needed for debugfs.
It also creates class_setup_tunables so sysfs kobject
creation is handled for both obd_devices and llite. Add a
special LDEBUGFS_FOPS_WR_ONLY since often in this case
i_private is not set so any attempt to call PDE_DATA(inode)
will cause it to crash. Make lprocfs_obd_setup select either
debugfs or procfs but not both.

Handle the special symlinks needed for both debugfs
and sysfs with the server case. For lod we need to
create "lov" and for osp we create "osc" for both sysfs
and debugfs. Handle the complex case of when a node
is both a server and client. For debugfs we can take
advantage of d_lookup() and for sysfs kset_find_obj()
to avoid special access to struct obd_type. This also
places the burden on the server lod/osp modules instead
of the client lov/osc modules.

Change-Id: I87090859db4da2300ab9e2aa3c23cb3773276103
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/28108
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10551 lod: obd_fid_alloc() could start a nested trans 68/31268/10
Bobi Jam [Mon, 12 Feb 2018 05:44:48 +0000 (13:44 +0800)]
LU-10551 lod: obd_fid_alloc() could start a nested trans

* obd_fid_alloc() could possibly start a nested transaction, which
  would reset the OI cache. So we add
  osd_thread_info::oti_ins_cache_depth to prevent clearing the OI cache
  in the nested transaction.

* Add more debug messages in osd_idc_find_or_init()/
  osd_idc_find_and_init()
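
The depth-counter idea can be sketched in userspace as follows. This is
only an illustration under assumptions: struct thread_info, trans_start()
and trans_stop() are hypothetical names, and only the notion of a
per-thread depth counter comes from the description above.

```c
#include <stdbool.h>

/* Hypothetical per-thread state; only the depth counter mirrors the commit. */
struct thread_info {
    int ins_cache_depth;    /* number of transactions currently open */
    int ins_cache_used;     /* entries held in the toy cache */
};

/* Entering a transaction (outer or nested) bumps the depth. */
static void trans_start(struct thread_info *ti)
{
    ti->ins_cache_depth++;
}

/* Returns true only when the cache was actually cleared. */
static bool trans_stop(struct thread_info *ti)
{
    if (--ti->ins_cache_depth > 0)
        return false;       /* nested: the outer transaction still needs it */

    ti->ins_cache_used = 0; /* outermost transaction: safe to reset */
    return true;
}
```

With this shape, a nested start/stop pair leaves the cache entries of the
outer transaction untouched.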

Test-Parameters: alwaysuploadlogs envdefinitions=PTLDEBUG=-1 testlist=sanity-pfl ostfilesystemtype=zfs mdtfilesystemtype=zfs mdscount=2 mdtcount=4
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Id75fd1787ffc0f47bbf110d460f23db6c34670da
Reviewed-on: https://review.whamcloud.com/31268
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10607 osd-zfs: skip io stat for OI scrub 80/31180/5
Fan Yong [Wed, 28 Feb 2018 03:36:52 +0000 (11:36 +0800)]
LU-10607 osd-zfs: skip io stat for OI scrub

It is unnecessary to collect io stats for requests triggered by OI
scrub. In addition, the OI setup logic may read/write the OI
scrub file before the related lproc entries (including io stats)
for that OSD have been initialized. So this patch skips io
stat for OI scrub.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9498c1351c1875ac9aa46eed5189cb61a6d102ac
Reviewed-on: https://review.whamcloud.com/31180
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-5991 obd: fix mount error handing 59/12959/9
Vladimir Saveliev [Thu, 15 Feb 2018 15:40:16 +0000 (18:40 +0300)]
LU-5991 obd: fix mount error handing

lustre_fill_super() allocates lsi and assumes that on failures lsi
will be freed by server_fill_super() or ll_fill_super().
- server_fill_super() does not free lsi when lsi_prepare() fails.
- ll_fill_super() does not free lsi when OBD_ALLOC_PTR(cfg) or
ll_init_sbi() fail.

osd_device_fini() needs osd_index_backup(). Otherwise
struct lustre_index_backup_unit-s leak if server_fill_super() fails
after osd_start().

Cray-bug-id: MRP-2229
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I366dc2b46a504a65b030bcbf687998dd0676f404
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/12959
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10737 misc: Wrong checksum return value 48/31448/2
Qian Yingjin [Wed, 28 Feb 2018 09:22:01 +0000 (17:22 +0800)]
LU-10737 misc: Wrong checksum return value

In the checksum calculation functions tgt_checksum_niobuf and
osc_checksum_bulk, the error return value of cfs_crypto_hash_init
is wrongly taken as the checksum value.
This patch fixes the problem.
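
The class of bug described above can be sketched in userspace as follows.
This is an illustration, not the Lustre code: hash_init_stub() and
checksum_buf() are hypothetical names, the stub merely stands in for a
setup call that returns a negative errno on failure, and the toy checksum
is not any algorithm Lustre uses.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hash setup call that may fail. */
static int hash_init_stub(int fail)
{
    return fail ? -ENOMEM : 0;
}

/*
 * Compute a toy checksum into *cksum.  The status travels in the return
 * value and the checksum in the out parameter, so a setup error can never
 * be mistaken for a checksum.
 */
static int checksum_buf(const uint8_t *buf, size_t len, int fail_init,
                        uint32_t *cksum)
{
    uint32_t sum = 0;
    size_t i;
    int rc;

    rc = hash_init_stub(fail_init);
    if (rc < 0)
        return rc;          /* propagate the error, do not use it as data */

    for (i = 0; i < len; i++)
        sum = sum * 31 + buf[i];

    *cksum = sum;
    return 0;
}
```

Keeping status and data in separate channels is one way to avoid the
confusion; whether the actual patch does it this way is not stated here.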

Signed-off-by: Qian Yingjin <qian@ddn.com>
Change-Id: I647c402deeab00ec5c6437423b0cab250b42c3e5
Reviewed-on: https://review.whamcloud.com/31448
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years ago LU-4423 lov: use correct env in lov_io_data_version_end() 18/31418/4
NeilBrown [Wed, 28 Feb 2018 02:50:23 +0000 (21:50 -0500)]
LU-4423 lov: use correct env in lov_io_data_version_end()

lov - the logical object volume manager - is responsible for
striping data across multiple volumes.

So when it is given a request, it creates one or more
sub-requests, one for each target volume.  Each sub_io
request has a sub_env environment which it operates in.

When lov_io_data_version_end() calls lov_io_end_wrapper() to
wait for and close off a sub_io, it passes the wrong
environment.

This causes an LINVRNT() to fail in cl2osc_io(), and may
cause other problems.

This patch changes the call to use ->sub_env, much like
other code in the same file.

Change-Id: Id120929f4189196232d18103007e45ba89195fff
Fixes: fcd45488711a (LU-5683 clio: add CIT_DATA_VERSION)
Signed-off-by: NeilBrown <neilb@suse.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31418
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10421 echo: use echo layer when finding stripe object 38/31338/3
John L. Hammond [Fri, 16 Feb 2018 18:55:05 +0000 (12:55 -0600)]
LU-10421 echo: use echo layer when finding stripe object

In echo_md_dir_stripe_choose(), find the stripe object using the echo
device rather than the down layer (mdd) device. mdd objects are not
equipped to be top layer objects and should not be found in this way.

Test-Parameters: trivial testlist=mds-survey
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ibb396ae64b6d542c64697336d227e06163a0bb39
Reviewed-on: https://review.whamcloud.com/31338
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
6 years ago LU-10662 llite: Add exit for filedata allocation failed 96/31296/3
Ben Evans [Tue, 13 Feb 2018 19:20:18 +0000 (14:20 -0500)]
LU-10662 llite: Add exit for filedata allocation failed

When the filedata allocation fails, we need to exit to
a later point than out_openerr, which calls
deauthorize_statahead and ll_file_data_put, neither of
which is valid. (This leads to a panic.)

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I670d578f01b2731761e3149db36dd8da1551a30a
Cray-bug-id: LUS-1321
Reviewed-on: https://review.whamcloud.com/31296
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10656 ldlm: fix export reference 39/31139/5
Hongchao Zhang [Sun, 28 Jan 2018 19:25:42 +0000 (03:25 +0800)]
LU-10656 ldlm: fix export reference

In ptlrpc_connect_interpret, the export reference could be
leaked if an error occurs before the subsequent class_exp_put.

Change-Id: I9ddd82fa1bbf8e17079e9746202be63e6233c052
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: https://review.whamcloud.com/31139
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10582 out: can't obtain remote acl xattr 88/31088/5
Andriy Skulysh [Tue, 12 Dec 2017 13:39:14 +0000 (15:39 +0200)]
LU-10582 out: can't obtain remote acl xattr

osp_xattr_get() fails due to hardcoded
reply size limitation.

With large_xattr enabled, ddp_max_ea_size can be
almost 1MB and out_handles fails to send such a big
reply.

Limit maximum ACL buffer size by XATTR_SIZE_MAX.
Limit ddp_max_ea_size to fit resulting reply
request into LNET_MTU.

Cray-bug-id: MRP-4724
Change-Id: I6405330605809911c3f814fe5cb9d476d7ac40ed
Signed-off-by: Andriy Skulysh <c17819@cray.com>
Reviewed-on: https://review.whamcloud.com/31088
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9934 build: support for gcc7 10/28810/10
Alex Zhuravlev [Fri, 23 Feb 2018 17:50:12 +0000 (12:50 -0500)]
LU-9934 build: support for gcc7

Suppress a few false warnings with a compiler option when
building the Lustre kernel modules. Also include a few other
fixes to make Lustre buildable on Fedora.

Change-Id: If14d226e5d92ae9ce54e216d032df94d9398654e
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/28810
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10465 lov: increase default stripe size to 4MB 51/27151/13
Jian Yu [Tue, 27 Feb 2018 08:28:40 +0000 (00:28 -0800)]
LU-10465 lov: increase default stripe size to 4MB

Increase the default stripe size from 1MB to 4MB
so that widely-striped files can generate full RPCs
without pinning so much memory on the client.

The patch also renames STRIPE_BYTES and STRIPES_PER_OBJ
to DEF_STRIPE_SIZE and DEF_STRIPE_COUNT in cfg/local.sh,
and unsets them to support formatting Lustre filesystem
with default stripe size and count.

Change-Id: I59d1fdb3e30599c125e0e5e800d168921bd69098
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/27151
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years ago LU-9727 mdd: properly call recording_changelog() 56/31456/2
Sebastien Buisson [Wed, 28 Feb 2018 16:18:32 +0000 (01:18 +0900)]
LU-9727 mdd: properly call recording_changelog()

recording_changelog() must be called everywhere in the code instead
of directly checking (mdd->mdd_cl.mc_flags & CLM_ON).

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9ed5aac4871573e6aea94cfd4dc46b95d5df1e4a
Reviewed-on: https://review.whamcloud.com/31456
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-6142 UAPI: replace cfs_size_* macros with __ALIGN_KERNEL 79/30379/6
James Simmons [Mon, 26 Feb 2018 16:11:02 +0000 (11:11 -0500)]
LU-6142 UAPI: replace cfs_size_* macros with __ALIGN_KERNEL

The lustre specific cfs_size_* macros can be easily replaced with
the __ALIGN_KERNEL macro provided by the linux kernel for our
user land code. This brings us closer to building against the
upstream client.
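
As an illustration of the kind of rounding such alignment macros perform,
here is a small userspace sketch. ALIGN_UP_SKETCH and padded_size() are
hypothetical names written for this example; the macro assumes the
alignment is a power of two and is not copied from the kernel headers.

```c
#include <stddef.h>

/*
 * Round x up to the next multiple of a, where a is a power of two.
 * Userspace stand-in for illustration only.
 */
#define ALIGN_UP_SKETCH(x, a) (((x) + ((a) - 1)) & ~((size_t)(a) - 1))

/* Pad a length to an 8-byte boundary (the boundary is an assumption). */
static size_t padded_size(size_t len)
{
    return ALIGN_UP_SKETCH(len, 8);
}
```

For example, a 13-byte length is padded to 16 bytes, while a length that
is already a multiple of 8 is returned unchanged.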

Change-Id: I5cd261807f60296eaac884b66f084c128adc5b01
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30379
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9324 lfs: add setstripe --copy=lustre_file_or_dir parameter 79/26879/6
Bobi Jam [Fri, 28 Apr 2017 06:18:47 +0000 (14:18 +0800)]
LU-9324 lfs: add setstripe --copy=lustre_file_or_dir parameter

Add a "lfs setstripe --copy=<lustre_src> <lustre_file_or_dir_dst>"
usage to set stripe using stripe info from a source lustre file/dir.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ibcd80f98c53bdff5b41ba9b1010fceefd6c9d8b7
Reviewed-on: https://review.whamcloud.com/26879
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
6 years ago LU-8910 osp: Add correct handling of errors to osp_statfs_interpret 67/24167/5
Sergey Cheremencev [Tue, 6 Dec 2016 09:16:05 +0000 (12:16 +0300)]
LU-8910 osp: Add correct handling of errors to osp_statfs_interpret

The MDT's statfs info could disagree with the OST's info for a very long time.
If osp_statfs_update() is called and extends the timeout 1000*obd_timeout
into the future, but osp_statfs_interpret() then hits an error, it
will never reset the timeout.
Now, when an osp_update_statfs request fails, osp_statfs_interpret causes
osp_precreate_cleanup_orphans to send a new one after 10 seconds.

Change-Id: Ib282d806ba4932db5c72df34905988f96de99297
Cray-bug-id: MRP-3892
Signed-off-by: Sergey Cheremencev <c17829@cray.com>
Reviewed-on: https://review.whamcloud.com/24167
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10212 test: ESTALE read 01/31101/7
Alexander Boyko [Wed, 31 Jan 2018 11:17:42 +0000 (06:17 -0500)]
LU-10212 test: ESTALE read

The patch reproduces the issue in which a read RPC arrives
at the OST with a lock handle that has the LDLM_FL_DESTROY
flag set, and the client then gets an ESTALE error for the
read operation.

Test-Parameters: trivial testlist=sanity
Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: MRP-4604
Change-Id: I0722fc57a61153b25a05bf7aebce5d7f32bbc95b
Reviewed-on: https://review.whamcloud.com/31101
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
6 years ago LU-10653 kernel: kernel update [SLES12 SP2 4.4.114-92.64] 55/31255/5
Bob Glossman [Fri, 9 Feb 2018 18:26:17 +0000 (10:26 -0800)]
LU-10653 kernel: kernel update [SLES12 SP2 4.4.114-92.64]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp2 testgroup=review-ldiskfs \
  mdsdistro=sles12sp2 ossdistro=sles12sp2 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I98f3784eddc05e4faf00091e05b751d78090f66d
Reviewed-on: https://review.whamcloud.com/31255
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10655 tests: eliminate 'ssh exited with exit code 1' 63/31263/5
Vladimir Saveliev [Sat, 10 Feb 2018 10:21:44 +0000 (13:21 +0300)]
LU-10655 tests: eliminate 'ssh exited with exit code 1'

Eliminate meaningless 'ssh exited with exit code 1' issued by stop()
and wait_exit_ST()

Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Elena V. Gryaznova <c17455@cray.com>
Cray-bug-id: MRP-1483
Change-Id: Ie1af3cda0b48b7bf482ea35b84c93e38d0f6c0a9
Test-Parameters: trivial
Reviewed-on: https://review.whamcloud.com/31263
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Tested-by: Jenkins
Reviewed-by: Sergey Cheremencev <c17829@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10716 tests: skip sanity 56xb for old server 16/31416/2
Elena Gryaznova [Mon, 26 Feb 2018 13:15:20 +0000 (16:15 +0300)]
LU-10716 tests: skip sanity 56xb for old server

Patch skips sanity test_56xb for servers which
do not contain LU-6051.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2609
Test-Parameters: trivial testlist=sanity envdefinitions="ONLY=56bx"
Change-Id: I1b03f5e1c144dedef2bdd7b0c46e431d6761eb47
Reviewed-on: https://review.whamcloud.com/31416
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10712 tests: skip conf-sanity 108[a,b] for old server 13/31413/2
Elena Gryaznova [Mon, 26 Feb 2018 00:59:22 +0000 (03:59 +0300)]
LU-10712 tests: skip conf-sanity 108[a,b] for old server

Patch skips test_108a and test_108b for old servers.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5691
Test-Parameters: trivial testlist=conf-sanity
Change-Id: I7494e15710ac2663a0e01a9d3568fa5bcd590a6a
Reviewed-on: https://review.whamcloud.com/31413
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10684 tests: skip recovery-small 110[h-j] 50/31350/3
Elena Gryaznova [Tue, 20 Feb 2018 15:12:15 +0000 (18:12 +0300)]
LU-10684 tests: skip recovery-small 110[h-j]

Skip test_110h, test_110i, and test_110j for old
servers.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2589
Test-Parameters: trivial testlist=recovery-small envdefinitions=ONLY=110
Change-Id: I9d89fcdc55b5d1d1fd4004d8e09e0297eb4bc595
Reviewed-on: https://review.whamcloud.com/31350
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Reviewed-by: Alexandr Boyko <c17825@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10672 lnet: pass in only time64_t to lnet_notify 39/31339/3
James Simmons [Wed, 21 Feb 2018 18:22:02 +0000 (13:22 -0500)]
LU-10672 lnet: pass in only time64_t to lnet_notify

With the migration to 64-bit time in seconds, some calls to lnet_notify
did not get updated to use time64_t. Update those call sites.
Also, for the ioctl IOC_LIBCFS_NOTIFY_ROUTER we pass in the number
of seconds since the epoch. We subtract the current epoch time, but
we missed adding in the current number of seconds since boot,
which the lnet ping code expects to be used.
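
The arithmetic behind this conversion can be sketched as follows. This is
a userspace illustration under assumptions: time64_sketch_t and
epoch_to_since_boot() are hypothetical names, and the three inputs would
come from the respective kernel clocks in real code.

```c
#include <stdint.h>

typedef int64_t time64_sketch_t;    /* stand-in for the kernel's time64_t */

/*
 * Convert an event time given in seconds since the epoch into seconds
 * since boot: compute how long ago it happened on the wall clock, then
 * subtract that age from the current uptime.
 */
static time64_sketch_t epoch_to_since_boot(time64_sketch_t when_epoch,
                                           time64_sketch_t now_epoch,
                                           time64_sketch_t now_since_boot)
{
    time64_sketch_t age = now_epoch - when_epoch;

    return now_since_boot - age;
}
```

Subtracting only the current epoch time yields the negated age; the
uptime term is what turns it into a time on the since-boot scale.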

Change-Id: I5a92df08cdaf3b747fd17721a92038df05669a81
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31339
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10604 osd: define couple fields as bitfield 97/31097/5
Alex Zhuravlev [Wed, 31 Jan 2018 06:08:02 +0000 (09:08 +0300)]
LU-10604 osd: define couple fields as bitfield

Redefine oo_compat_dot_created and oo_compat_dotdot_created
as bitfields to save 8 bytes per object.
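
Why folding two flags into a bitfield shrinks the object can be seen in a
small sketch. The two structs below are hypothetical layouts invented for
illustration, not the real osd object; the exact saving depends on the ABI
and on the neighbouring fields.

```c
#include <stddef.h>

/* Hypothetical layout with each flag stored as a full int. */
struct obj_plain {
    void *data;
    unsigned int other_flag:1;
    int dot_created;
    int dotdot_created;
};

/* Same information with the two flags folded into the bitfield word. */
struct obj_packed {
    void *data;
    unsigned int other_flag:1,
                 dot_created:1,
                 dotdot_created:1;
};

/* Number of bytes saved per object by the packed layout. */
static size_t bytes_saved(void)
{
    return sizeof(struct obj_plain) - sizeof(struct obj_packed);
}
```

On a typical LP64 ABI the two int fields plus padding account for 8
bytes, which is consistent with the saving quoted above.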

Change-Id: I92dafc693f1d118debc251d7d064b206e36624f0
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31097
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-7787 mdd: clean up orphan object handling 47/30547/8
Andreas Dilger [Thu, 14 Dec 2017 21:53:58 +0000 (14:53 -0700)]
LU-7787 mdd: clean up orphan object handling

There was a potential problem in the orphan object naming because
it had an embedded space in the filename before the "operation",
which might cause issues if they are accessed for other reasons.
It turns out that there is no need for the "operation" to be
embedded into the filename, since it was always ORPH_OP_UNLINK.

Use standard DFID formatting for the orphan object names, which
is a bit shorter and more efficient on disk, without the embedded
operation type.

Remove the use of "ORPH_OP_UNLINK" in the code, except in the
compatibility code for handling orphans left over after upgrades
from older Lustre versions.  This can be removed at some point
in the future when there are no longer upgrades from pre-2.11
versions.

Rename the orphan handling functions to start with mdd_orphan_*
for consistency with other MDD functions:
orph_index_init -> mdd_orphan_index_init
orph_index_iterate -> mdd_orphan_index_iterate
orph_index_fini -> mdd_orphan_index_fini
orph_declare_index_insert -> mdd_orphan_declare_insert
orph_declare_index_insert -> mdd_orphan_declare_insert
orph_key_test_and_del -> mdd_orphan_key_test_and_delete
orph_key_fill -> mdd_orphan_key_fill
orph_key_fill_18 -> mdd_orphan_key_fill_20
__mdd_orphan_add -> mdd_orphan_insert
__mdd_orphan_del -> mdd_orphan_delete
__mdd_orphan_cleanup -> mdd_orphan_cleanup_thread

Remove single-line wrapper functions to clarify actual code:
mdd_orphan_write_lock -> dt_write_lock
mdd_orphan_write_unlock -> dt_write_unlock
mdd_orphan_delete_obj -> dt_delete
mdd_orphan_ref_add -> dt_ref_add
mdd_orphan_ref_del -> dt_ref_del

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ica90cc03c3212103c39cba11c4566584bf9cab07
Reviewed-on: https://review.whamcloud.com/30547
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9325 obd: replace lprocfs_str_to_s64 39/30539/16
James Simmons [Fri, 9 Feb 2018 23:46:39 +0000 (18:46 -0500)]
LU-9325 obd: replace lprocfs_str_to_s64

The original goal of lprocfs_str_to_s64[_with_units] was to allow
passing in values of different unit sizes, e.g. 64K, to a proc file.
There are a few problems with the implementation that prevent its
direct use with sysfs/debugfs. The first problem is that
lprocfs_str_to_s64() was used in a lot of cases where it doesn't
make sense to use it. Often it was used for passing in bool values,
or a value was retrieved as a signed 64-bit number and then checked
to be in the range of some other unit size. For these cases we can
simply move to kstrtoXXX_from_user(). To handle the case of bool
values we add in support for kstrtobool_from_user().

Replace the lprocfs_rd_uint() and lprocfs_wr_uint() generic callbacks
with a simpler, more direct implementation of ldlm_rw_uint_fops.

There's a slight change in lustre debugfs write semantics: Using kstrtox
causes EINVAL when the written number is followed by other (garbage)
characters, whereas previously the garbage would be ignored and such a
write would succeed.
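
The difference in write semantics can be sketched in userspace with
strtoll(). This is an approximation under assumptions: parse_lenient() and
parse_strict() are hypothetical helpers, and details such as leading
whitespace handling differ from the kernel's kstrtox family.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Lenient parse: take the leading number and ignore whatever follows,
 * mimicking the old behaviour described above.
 */
static int parse_lenient(const char *s, long long *val)
{
    char *end;

    errno = 0;
    *val = strtoll(s, &end, 0);
    if (errno != 0)
        return -errno;
    if (end == s)
        return -EINVAL;     /* no digits at all */
    return 0;
}

/*
 * Strict parse: additionally reject trailing characters, allowing one
 * trailing newline as kstrtox-style helpers do.
 */
static int parse_strict(const char *s, long long *val)
{
    char *end;

    errno = 0;
    *val = strtoll(s, &end, 0);
    if (errno != 0)
        return -errno;
    if (end == s)
        return -EINVAL;
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -EINVAL;     /* garbage after the number */
    return 0;
}
```

A write such as "64K" is accepted as 64 by the lenient variant and
rejected with EINVAL by the strict one.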

Linux-commit: 8b23093269c84b0da1201e1949c91d0beb9892ef

Change-Id: I39f0ba3dc72685fe6e29c7077f37ad4e69a20b4a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Mathias Rav <mathiasrav@gmail.com>
Reviewed-on: https://review.whamcloud.com/30539
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10260 hsm: enable max archive_id posix copytool 71/30171/6
Thomas Stibor [Mon, 20 Nov 2017 15:36:57 +0000 (16:36 +0100)]
LU-10260 hsm: enable max archive_id posix copytool

The current maximum archive-id in the posix copytool is
limited to id < LL_HSM_MAX_ARCHIVE. However, the Lustre HSM
implementation checks for the maximum as follows:
if (id > LL_HSM_MAX_ARCHIVE) then flag ERROR.
Thus the archive ids are in the
range 0,1,..,32 = LL_HSM_MAX_ARCHIVE, and therefore
32 = LL_HSM_MAX_ARCHIVE should be included.
Note, archive-id = 0 is reserved to specify listening to
ANY archive id and is used as the default when no archive-id
option is provided.
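
The off-by-one between the two checks can be made concrete with a short
sketch. The helper names archive_id_ok_old() and archive_id_ok_new() are
hypothetical; the value 32 for LL_HSM_MAX_ARCHIVE is taken from the text
above.

```c
#include <stdbool.h>

/* Value taken from the commit message: 32 = LL_HSM_MAX_ARCHIVE. */
#define LL_HSM_MAX_ARCHIVE 32

/* Copytool check as described: rejects id == LL_HSM_MAX_ARCHIVE. */
static bool archive_id_ok_old(unsigned int id)
{
    return id < LL_HSM_MAX_ARCHIVE;
}

/* Inclusive check matching "if (id > LL_HSM_MAX_ARCHIVE) then ERROR". */
static bool archive_id_ok_new(unsigned int id)
{
    return id <= LL_HSM_MAX_ARCHIVE;
}
```

Only id 32 is treated differently by the two checks; 0 stays valid in
both, since it is the reserved "any archive" value.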

Test-Parameters: trivial testlist=sanity-hsm
Signed-off-by: Thomas Stibor <t.stibor@gsi.de>
Change-Id: I6289c8c0e7d86b05f1f2d821b7f6b3127e5fa352
Reviewed-on: https://review.whamcloud.com/30171
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-10244 osc: add a bit to indicate osc_page in cache tree 96/30096/6
Bobi Jam [Wed, 15 Nov 2017 07:02:30 +0000 (15:02 +0800)]
LU-10244 osc: add a bit to indicate osc_page in cache tree

Add osc_page::ops_intree to indicate whether the osc_page is in the
osc_object's cache tree, so that when a page cannot be inserted into
the cache because of a race, the cleanup code won't try to remove it
from the cache.
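
The role of such an "in tree" bit can be sketched with a toy cache. All
names below (page_sketch, tree_sketch, tree_insert(), page_cleanup()) are
hypothetical, and a fixed slot array stands in for the real cache tree.

```c
#include <stdbool.h>
#include <stddef.h>

#define TREE_SLOTS 4

struct page_sketch {
    unsigned int index;
    unsigned int intree:1;  /* set only once the insert has succeeded */
};

struct tree_sketch {
    struct page_sketch *slot[TREE_SLOTS];
};

/* Insert fails if another page already won the race for this index. */
static bool tree_insert(struct tree_sketch *t, struct page_sketch *p)
{
    if (p->index >= TREE_SLOTS || t->slot[p->index] != NULL)
        return false;
    t->slot[p->index] = p;
    p->intree = 1;
    return true;
}

/* Cleanup removes the page only if this page is the one in the tree. */
static void page_cleanup(struct tree_sketch *t, struct page_sketch *p)
{
    if (!p->intree)
        return;             /* never inserted: leave the winner alone */
    t->slot[p->index] = NULL;
    p->intree = 0;
}
```

Without the bit, cleaning up the page that lost the race would evict the
page that won it.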

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ifcfe158d10c23a40c116414c7f4f86b257e1fa76
Reviewed-on: https://review.whamcloud.com/30096
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9669 tests: check required nrs availability on a facet 60/27660/7
Elena Gryaznova [Fri, 16 Feb 2018 16:59:34 +0000 (19:59 +0300)]
LU-9669 tests: check required nrs availability on a facet

sanityn/77[abcdefg], 78 failed with interop testing due to
missing NRS policy related proc entries on the OSS/MGS/MDS node.

The fix is to check for availability of a required NRS policy on a facet.
The patch removes the version-based checks from the basic NRS policy
regression tests to make interop testing possible with
old servers that have the NRS feature backported.

Author: Jadhav Vikram <jadhav.vikram@seagate.com>

Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanityn
Cray-bug-id: LUS-5259
Seagate-bug-id: MRP-3999
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Parinay Vijayprakash Kondekar <parinay.kondekar@seagate.com>
Change-Id: If0eca183ac388d481ddb3b1d39e0c9def5dd0c37
Reviewed-on: https://review.whamcloud.com/27660
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-9624 tests: fix pre-DNE test exceptions/llog usage 35/27535/31
Andreas Dilger [Thu, 8 Jun 2017 20:27:50 +0000 (14:27 -0600)]
LU-9624 tests: fix pre-DNE test exceptions/llog usage

Remove some test skips when running with multiple MDTs in DNE mode,
or fix tests to work better with multiple MDTs.  Tests updated are:
recovery-small: 60
sanity: 17hi, 154ab, 160abcde, 161abcd, 162a, 205, 225ab, 254, 256

In particular, sanity.sh test_160, test_161, test_162 ignored test
failures in DNE mode.  Fix test_160* to work with ChangeLogs stored
on multiple MDTs.  This adds test coverage both because we aren't
skipping these tests when running in DNE mode, but also because we
are now validating ChangeLogs running on multiple MDTs at once.

Test-Parameters: trivial mdscount=2 mdtcount=4 testlist=sanity,recovery-small,sanity-hsm
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3fc3ce85b46f34e507c1e28b4c76574a698cab07
Reviewed-on: https://review.whamcloud.com/27535
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8912 nodemap: fix contiguous range support 97/24397/4
Kit Westneat [Thu, 15 Dec 2016 23:45:00 +0000 (07:45 +0800)]
LU-8912 nodemap: fix contiguous range support

This patch fixes the contiguous range check to allow the addition of
multiple "full" ([0-255]) ranges. As part of this change,
is_contiguous and find_min_max are combined as they were always
called together and the logic is fairly similar. This also removes
the multiple range expression support, since it was broken.

Also, sanity-sec.sh test_10c is added to verify this patch.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I3c49a077039327fcbde87196f82db140f67a74d0
Reviewed-on: https://review.whamcloud.com/24397
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years ago LU-8672 tests: Fix error handling in replay-single test_89 74/22974/7
Abrarahmed Momin [Thu, 22 Feb 2018 16:50:16 +0000 (19:50 +0300)]
LU-8672 tests: Fix error handling in replay-single test_89

Update replay-single test_89() to error out on wait_mds_ost_sync and
wait_delete_completed timeout.

Correct error handling in wait_delete_completed_mds and
wait_delete_completed.

Signed-off-by: Abrarahmed Momin <abrar.habib@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ashish Purkar <ashish.purkar@seagate.com>
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Cray-bug-id: MRP-1680
Test-Parameters: trivial
Change-Id: I54e30221361e73a17ba857cb19b1efcc019b412f
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Reviewed-by: Ujjwal Lanjewar <ujjwal.lanjewar@seagate.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/22974
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6863 tests: change obdfilter-survey.sh for CLIENTONLY mode 31/15631/11
Elena Gryaznova [Fri, 23 Feb 2018 18:17:35 +0000 (21:17 +0300)]
LU-6863 tests: change obdfilter-survey.sh for CLIENTONLY mode

obdfilter-survey.sh requires server access and cannot be
used in CLIENTONLY mode:
get_devs $oss -> do_nodes $oss "lctl dl"
host_nids_address $oss -> do_nodes $oss "$LCTL list_nids"
This patch fixes the script to compose the target list
without access to the servers.

Test-Parameters: trivial testlist=obdfilter-survey
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: MRP-1757
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: I910afc940a29ea4f5d8928131652f9b6ef809ce7
Reviewed-on: https://review.whamcloud.com/15631
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10657 utils: fd leak in mirror_split() 10/31410/2
Bobi Jam [Sat, 24 Feb 2018 05:17:17 +0000 (13:17 +0800)]
LU-10657 utils: fd leak in mirror_split()

An fd could be leaked in some error handling paths of mirror_split().

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I54b06191bd337ca7a9e6b58bdc4ab8197f29ed22
Reviewed-on: https://review.whamcloud.com/31410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10682 lnd: pending transmits dropped silently 74/31374/3
Amir Shehata [Thu, 22 Feb 2018 00:21:02 +0000 (16:21 -0800)]
LU-10682 lnd: pending transmits dropped silently

list_add was being used erroneously. The logic should be to move
the txs on ibp_tx_queue to a local list which is then processed.
The code, however, did the reverse, which would result in the
pending txs not being processed and thus dropped silently. This in
turn would lead to peer reference counts at the LNet layer not being
decremented, since lnet_finalize() might not be called for a message.

Initialize local list and use list_splice_init() to move
transmits on the ibp_tx_queue to the local list.
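
The intended drain pattern can be modeled as follows. This is a Python
sketch of the idea only, not the o2iblnd code: splice_init and
drain_pending are hypothetical names, and a deque stands in for the
kernel list_head. The local list starts out empty, so the order in which
entries are joined to it does not matter here.

```python
from collections import deque


def splice_init(src: deque, dst: deque) -> None:
    """Move every entry from src onto dst, leaving src empty."""
    dst.extend(src)
    src.clear()


def drain_pending(tx_queue: deque, finalize) -> int:
    """Detach all pending transmits onto a local list, then finalize each."""
    local = deque()                 # initialized, empty local list
    splice_init(tx_queue, local)    # source queue -> local list, not the reverse
    done = 0
    while local:
        finalize(local.popleft())
        done += 1
    return done
```

Every transmit that was pending ends up passed to the finalize callback,
and the source queue is left empty and reusable.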

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I6b36f709db2c89e53e0b3354883a8a1b1052a1dd
Reviewed-on: https://review.whamcloud.com/31374
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9019 selftest: remove remaining cfs_time wrappers 41/31041/3
James Simmons [Thu, 22 Feb 2018 18:00:25 +0000 (13:00 -0500)]
LU-9019 selftest: remove remaining cfs_time wrappers

Remove remaining libcfs time wrappers from lnet selftest. Migrate
crp_stamp to nanoseconds and both timestamps nd_stamp, sn_start to
ktime. This moves away from jiffies, which can vary across platforms,
to something that is consistent on every node. This will ensure
that the results reported to the user will always be correct.

Test-Parameters: trivial testlist=lnet-selftest

Change-Id: Id8d1b195f690c69635de60dd9b501f6d97f90f4d
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31041
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10318 dom: support DATA_VERSION IO type 49/30449/14
Mikhal Pershin [Tue, 5 Dec 2017 20:10:02 +0000 (23:10 +0300)]
LU-10318 dom: support DATA_VERSION IO type

Add support for the DATA_VERSION IO type: return the data version
and layout version from the MDT if requested by CLIO.
Also ensure that the version is changed on punch and write
operations.
This fixes HSM archive with DOM files.

Change-Id: Id7b63697ffc48c889370638682625ea04a0348c5
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Reviewed-on: https://review.whamcloud.com/30449
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9431 obd: resolve config log sysfs issues 43/30143/8
James Simmons [Tue, 6 Feb 2018 16:43:07 +0000 (11:43 -0500)]
LU-9431 obd: resolve config log sysfs issues

This resolves long standing issues with modifying sysfs settings
on multiple nodes simultaneously by running a single command on
the backend MGS server. There are two ways to change the settings,
LCFG_PARAM and LCFG_SET_PARAM. For the LCFG_PARAM case we create
a new function class_modify_config() that grabs the attributes
from the passed in kobject. We can use those attributes to
modify the sysfs settings. If we can't find the attribute then
send a uevent to let userland resolve the change. For the
LCFG_SET_PARAM case we handle two classes of settings. The function
class_set_global() was modified to handle the top lustre sysfs
files since they are not searchable with kset_find_obj.
To make the new version of class_set_global() work both sets of
sysfs attributes for the top level sysfs entries have been merged.
If we can find a kobject with kset_find_obj then we can send
a uevent so userland can manage the change.

Change-Id: I4e7f19c4a232767119355c3c96e5752a10000da8
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/30143
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10676 dkms: Provide lustre-dkms for lustre-zfs-dkms 29/31329/2
Nathaniel Clark [Thu, 15 Feb 2018 20:36:43 +0000 (15:36 -0500)]
LU-10676 dkms: Provide lustre-dkms for lustre-zfs-dkms

To facilitate upgrading from old lustre-dkms style package to new
lustre-zfs-dkms, provide the old package in lustre-zfs-dkms.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ia6f0fffad35ad8e219bfbe05527865ccd1904ff7
Reviewed-on: https://review.whamcloud.com/31329
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9761 dkms: Add ldiskfs dkms support 90/27990/10
Nathaniel Clark [Fri, 5 Aug 2016 15:22:21 +0000 (11:22 -0400)]
LU-9761 dkms: Add ldiskfs dkms support

This breaks out lustre-dkms into lustre-zfs-dkms and
lustre-ldiskfs-dkms (or lustre-all-dkms) as "flavours" of lustre
server dkms.  The reason for the flavours is to prevent lustre
ldiskfs dkms build from having ZFS dependencies, and to maintain
lustre zfs dkms build ordering when rebuilding for new kernels.
This also prevents building of tests and utils when --disable-tests
and --disable-utils (respectively) are passed to configure.

Test-Parameters: trivial
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Iba500d9830a8f57662066141a176c381151861f4
Reviewed-on: https://review.whamcloud.com/27990
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9324 lfs: add setstripe --yaml=template parameter 60/26860/34
Bobi Jam [Tue, 25 Apr 2017 01:18:44 +0000 (09:18 +0800)]
LU-9324 lfs: add setstripe --yaml=template parameter

Add a "lfs setstripe --yaml=<yaml_template> <lustre_file_or_dir>"
usage to set stripe using stripe info from a YAML template file.

The YAML template file can be obtained from
$ lfs getstripe --yaml <lustre_file_or_dir>
and the user can manually edit it to tweak stripe options.

This patch fixes two cyaml issues:
1. a YAML_BLOCK_ENTRY_TOKEN can follow a YAML_VALUE_TOKEN.
2. free_node() has a memory leak; it needs to free
   cYAML::cy_valuestring and cYAML::cy_string if possible.

Test-Parameters: testlist=sanity-pfl,sanity-flr
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I78149bb011fbc03387cbe3d057eb030550dd75ae
Reviewed-on: https://review.whamcloud.com/26860
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8727 mgs: remove skip records from config file 45/23245/23
Vladimir Saveliev [Mon, 19 Feb 2018 13:14:28 +0000 (16:14 +0300)]
LU-8727 mgs: remove skip records from config file

Configuration logs are append-only files of limited size.  Over the
course of time the logs may grow beyond the size limit.  Usually,
configuration logs keep needless records marked as SKIP. The new lctl
command "clear_conf" is added to allow administrators to clear
configuration files by removing these SKIP records. The lctl man page
is updated.
A conf-sanity test (for ldiskfs only) is added to test the new command.
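
The effect of dropping SKIP records from an append-only log can be
illustrated with a small Python model. It is a sketch under stated
assumptions, not the MGS implementation: the Record type, its skip flag
and the renumbering of the surviving records are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    index: int
    payload: str
    skip: bool = False   # hypothetical flag standing in for the SKIP mark


def clear_skipped(log: List[Record]) -> List[Record]:
    """Build a new log holding only live records, renumbered from 1."""
    live = [r for r in log if not r.skip]
    return [Record(i, r.payload, False) for i, r in enumerate(live, start=1)]
```

The original log is left untouched and a compacted copy is returned, so
the result is smaller by exactly the number of skipped records.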

Change-Id: I274cb48138c16e536cfca56836c3313e944eba56
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Signed-off-by: Arshad Hussain <arshad.super@gmail.com>
Cray-bug-id: MRP-2091
Reviewed-by: Artem Blagodarenko <c17828@cray.com>
Reviewed-by: Alexey Leonidovich Lyashkov <c17817@cray.com>
Tested-by: Elena V. Gryaznova <c17455@cray.com>
Reviewed-on: https://review.whamcloud.com/23245
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9437 lfsck: handle LMV EA for migrating directory 66/31266/5
Fan Yong [Tue, 20 Feb 2018 19:25:55 +0000 (03:25 +0800)]
LU-9437 lfsck: handle LMV EA for migrating directory

For an in-migration directory, its LMV EA contains not only the
LMV header, but also the FIDs for both source and target. So the
LMV EA size is larger. The lfsck_read_stripe_lmv() logic needs to
handle this case properly.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ic43853fb5ca058042fafa0f6c81fa99d4b8d8897
Reviewed-on: https://review.whamcloud.com/31266
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
6 years agoLU-10615 osd: stop OI scrub before FLDB closed 41/31241/3
Fan Yong [Fri, 23 Feb 2018 11:44:05 +0000 (19:44 +0800)]
LU-10615 osd: stop OI scrub before FLDB closed

OI scrub may check the FLDB when it scans the device. When unmounting
the device, we need to stop OI scrub before closing the FLDB
to avoid invalid memory access.

Some code optimization and cleanup.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib358abd77f970c12b0c29a603f9bcaf8e310cc98
Reviewed-on: https://review.whamcloud.com/31241
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoRevert "LU-8856 osd: mark specific transactions netfree" 42/31442/2
Oleg Drokin [Tue, 27 Feb 2018 18:27:40 +0000 (18:27 +0000)]
Revert "LU-8856 osd: mark specific transactions netfree"

This patch caused very frequent sanity-lfsck 9a failures
reported in LU-10732

This reverts commit 8d1639b5cf1edbc885876956dcd6189173c00955.

Change-Id: Ibf353042d2d37d37eccbf3895453f51ca07ea6d3
Reviewed-on: https://review.whamcloud.com/31442
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8878 tests: skip several tests for CLIENTONLY mode 28/24028/4
Elena Gryaznova [Wed, 7 Feb 2018 12:01:35 +0000 (15:01 +0300)]
LU-8878 tests: skip several tests for CLIENTONLY mode

Tests 107, 300, 301 and 302 fail over SINGLEMDS, so they must be
skipped in CLIENTONLY mode.

Author: Chennaiah Palla <chennaiah.palla@seagate.com>

Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=sanity-hsm
Cray-bug-id: LUS-4966
Seagate-bug-id: MRP-3529
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Change-Id: I8286bbaa403089a4a85fcf0c4d9451fe24e67836
Reviewed-on: https://review.whamcloud.com/24028
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-4423 lnet: free a struct kib_conn outside of the kiblnd_destroy_conn() 73/31273/3
Dmitry Eremin [Mon, 12 Feb 2018 12:37:18 +0000 (15:37 +0300)]
LU-4423 lnet: free a struct kib_conn outside of the kiblnd_destroy_conn()

To avoid confusion, this fix moves the freeing of struct kib_conn outside
of the function kiblnd_destroy_conn().

Change-Id: Iae28802f5d319570064a504feb14dffd13a22b84
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: https://review.whamcloud.com/31273
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Sonia Sharma <sonia.sharma@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10652 tests: restructure sanity 133[f,g] 45/31245/4
Elena Gryaznova [Wed, 14 Feb 2018 14:24:06 +0000 (17:24 +0300)]
LU-10652 tests: restructure sanity 133[f,g]

sanity 133f and 133g both get skipped in CLIENTONLY mode,
but the tests are meant to run on clients in this mode.

The fix separates the code of the tests so that 133f tests
clients, while 133g runs on servers. Then in CLIENTONLY mode
only 133g is skipped.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: envdefinitions=ONLY=133 testlist=sanity
Seagate-bug-id: MRP-2438
Cray-bug-id: LUS-4289
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Vitaly Fertman <c17818@cray.com>
Change-Id: Ibba69a3fd4fd4a9f8d90729ec2a294443dd4f29e
Reviewed-on: https://review.whamcloud.com/31245
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10639 tests: rename the tests 30/31230/4
Elena Gryaznova [Fri, 9 Feb 2018 09:01:54 +0000 (12:01 +0300)]
LU-10639 tests: rename the tests

The following tests are renamed to be run separately
from other tests in the groups:
sanity-hsm:
    test_1 to test_1A
    test_9 to test_9A
    test_26 to test_26A
    test_220 to test_220A
    test_224 to test_224A

conf-sanity:
    test_28 to test_28A

lustre-rsync-test.sh:
    test_1 to test_1A

sanity.sh:
    test_239 to test_239A

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Signed-off-by: Ajay Nair <ajay.nair@seagate.com>
Cray-bug-id: LUS-2608, LUS-5328
Seagate-bug-id: MRP-4695, MRP-4121
Test-Parameters: testlist=sanity,sanity-hsm,conf-sanity
Test-Parameters: testlist=lustre-rsync-test
Reviewed-by: Ashish Purkar <ashish.purkar@seagate.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: Ib1542d55328c0fb60c0c2c59257fa9f5742a57dc
Reviewed-on: https://review.whamcloud.com/31230
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10617 tests: Dir's and file's stripe counts are mismatched 93/31193/3
Elena Gryaznova [Tue, 13 Feb 2018 18:17:01 +0000 (21:17 +0300)]
LU-10617 tests: Dir's and file's stripe counts are mismatched

Add to test_24 of ost-pools.sh the case where the stripe count of a
directory equals -1, in which case the stripe count of files in the
directory must be equal to the OST count.

Author: Alyona Romanenko <alyona.romanenko@seagate.com>

Signed-off-by: Alyona Romanenko <alyona.romanenko@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=ost-pools envdefinitions="ONLY=24"
Cray-bug-id: LUS-4467
Seagate-bug-id: MRP-2746
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: I91e7c65e178c7706f53a95a2807e06b1bc8e0d24
Reviewed-on: https://review.whamcloud.com/31193
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10612 tests: reply_single.sh,test_48: No space left 82/31182/2
Elena Gryaznova [Tue, 6 Feb 2018 14:20:53 +0000 (17:20 +0300)]
LU-10612 tests: reply_single.sh,test_48: No space left

The MDS needs time to discover the OST state, attempt to
recover, fail and recover again.

Author: gaurav mahajan <gaurav.mahajan@seagate.com>

Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Test-Parameters: trivial testlist=replay-single envdefinitions="ONLY=48"
Cray-bug-id: LUS-4384
Seagate-bug-id: MRP-2616
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Rahul Deshmukh <rahul.deshmukh@seagate.com>
Change-Id: I2b3cca70872b7c9f13c64b50e1b4373096fbc147
Reviewed-on: https://review.whamcloud.com/31182
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10600 tests: clean up sanity tests 64d and 65k 59/31159/4
James Nunez [Fri, 2 Feb 2018 23:53:51 +0000 (16:53 -0700)]
LU-10600 tests: clean up sanity tests 64d and 65k

Several sanity tests create files or modify the environment
and do not clean up or return the environment to its
original state. sanity test 64d fills an OST and does not
clean up the file after the OST is full. sanity test 65k
sets OSTs to be inactive and, on error, does not set the OST
back to active.

These two tests need to clean up after themselves.

Test-Parameters: trivial testlist=sanity,sanity
Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I01bc376680798815c9dd398da7781c92c6b70b2f
Reviewed-on: https://review.whamcloud.com/31159
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10570 obd: fix statfs handling 58/31158/3
James Simmons [Sun, 4 Feb 2018 19:38:25 +0000 (14:38 -0500)]
LU-10570 obd: fix statfs handling

The function lod_qos_statfs_updates() refreshes statfs
data every N seconds. Taking lq_rw_sem can take a very long
time so the testing for stale stats had to be done again after
taking the semaphore. Now that we are using only seconds
resolution it is more likely that max_age and obd_osfs_age
will be equal compared to when the code was using jiffies.
So only release the lock right away when osfs_age has passed
the max_age.
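
The re-check after taking the lock can be sketched as below. This is a
Python model under stated assumptions, not the LOD code: StatfsCache and
its fields are hypothetical, a threading.Lock stands in for lq_rw_sem,
and a stamp equal to the cutoff is treated as stale, which is one reading
of the description above.

```python
import threading
import time


class StatfsCache:
    """Toy cache whose age is tracked in whole seconds."""

    def __init__(self, fetch, max_age_s: int, clock=time.monotonic):
        self._fetch = fetch          # callable returning fresh statfs data
        self._max_age_s = max_age_s
        self._clock = clock
        self._lock = threading.Lock()
        self.data = None
        self.stamp = None            # second at which data was last fetched
        self.refreshes = 0

    def _is_fresh(self, cutoff: int) -> bool:
        # Strictly newer than the cutoff counts as fresh; a stamp equal to
        # the cutoff is treated as stale and refreshed.
        return self.stamp is not None and self.stamp > cutoff

    def get(self):
        cutoff = int(self._clock()) - self._max_age_s
        if self._is_fresh(cutoff):
            return self.data
        with self._lock:
            # Waiting for the lock may have taken long; another caller may
            # have refreshed the data meanwhile, so test again.
            if not self._is_fresh(cutoff):
                self.data = self._fetch()
                self.stamp = int(self._clock())
                self.refreshes += 1
            return self.data
```

With whole-second timestamps the equality case is common, which is why
the sketch makes the comparison explicit in a single helper.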

The comment 'use the value of cfs_time_current + HZ' for
obd_statfs() and obd_statfs_async() needs to be updated for
the time64_t case.

Simplify llite_statfs_internal() handling by calculating
max_age inside of llite_statfs_internal(). This makes the
code cleaner.

Change-Id: I22aa5d4d78b30d6480e73998e05ec6582a316d4f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31158
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8854 llapi: remove lustre specific strlcpy & strlcat functions 98/29798/6
James Simmons [Sat, 10 Feb 2018 16:18:53 +0000 (11:18 -0500)]
LU-8854 llapi: remove lustre specific strlcpy & strlcat functions

In the days when Lustre supported many more platforms, some of those
platforms natively supported strl[cpy|cat], but Linux has always lacked
these functions. So Lustre ended up providing its own versions of
these functions to fill in this functionality. Today Lustre only
supports Linux platforms, whose libc will most likely never
support strl[cat|cpy]. Since this is the case we
can remove the AC_CHECK_FUNCS since they only test against libc.
We could support detecting strl[cpy|cat] in another library but
many libraries provide their own version so the chances of collision
are high. The best solution is to remove strlcpy and strlcat by
replacing those functions with string functions that are always
provided by the standard C library.

Change-Id: I72df93c8f83ed1aad80653fe0d1c4d54d1d8e2f2
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29798
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 lustre: record if enable_audit is set on nodemap 14/28314/18
Sebastien Buisson [Wed, 2 Aug 2017 14:47:47 +0000 (23:47 +0900)]
LU-9727 lustre: record if enable_audit is set on nodemap

Record changelogs from a client only if it belongs to a nodemap
on which enable_audit is set, and changelogs are activated.
If the client is not explicitly assigned to a nodemap, the enable_audit
value from the default nodemap is used.
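
The decision described above can be modeled in a few lines. This is a
Python sketch only: the Nodemap type, the NID-keyed assignment table and
the should_record() name are assumptions for illustration, not the MDD
or nodemap API.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Nodemap:
    name: str
    enable_audit: bool


def should_record(client_nid: str,
                  assignments: Dict[str, Nodemap],
                  default_nodemap: Nodemap,
                  changelogs_active: bool) -> bool:
    """Decide whether an operation from client_nid goes into the changelog."""
    if not changelogs_active:
        return False
    nodemap: Optional[Nodemap] = assignments.get(client_nid)
    if nodemap is None:
        nodemap = default_nodemap   # client not explicitly assigned
    return nodemap.enable_audit
```

Both conditions must hold: changelogs are active and the nodemap that
applies to the client has enable_audit set.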

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I31d361cfd8cc69db68b60298934cbbef4af0d75d
Reviewed-on: https://review.whamcloud.com/28314
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
6 years agoLU-9250 tests: add parallel-scale xdd test 76/26176/5
Elena Gryaznova [Fri, 16 Feb 2018 10:11:04 +0000 (13:11 +0300)]
LU-9250 tests: add parallel-scale xdd test

Patch adds parallel-scale xdd test.

Our customers report Lustre issues hit during
xdd tests. We need a flexible way to reproduce the
failures.

Author: Chennaiah Palla <chennaiah.palla@seagate.com>

Signed-off-by: Chennaiah Palla <chennaiah.palla@seagate.com>
Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-5206
Seagate-bug-id: MRP-3915
Test-Parameters: testlist=parallel-scale
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Elena Gryaznova <c17455@cray.com>
Change-Id: Ia4823aa8ce64aad3d43b2611b24f48a532b8796c
Reviewed-on: https://review.whamcloud.com/26176
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6867 test: detect active facet based on current state 38/15638/17
Elena Gryaznova [Fri, 17 Jul 2015 16:32:49 +0000 (19:32 +0300)]
LU-6867 test: detect active facet based on current state

Lustre failover tests cannot be run test-by-test
on a setup with ${facet}_HOST != ${facet}failover_HOST
because t-f does not restore the facet state.
t-f keeps this info in "${facet}active" files, which are created
when facet_failover() is executed for the first time in the test session.
Before facet_failover() is executed these files are empty and
the active facet is ${facet} by default.
When tests are executed test-by-test, the active facet is
${facet}failover after the 1st test completes, and the 2nd test is started
with ${facet}failover active but without this info stored in the
${facet}active files.

Patch contains the following changes:
- add the active facet detection based on current lustre state;
- fix sanity-hsm defect: exit with error if agt${n}1_HOST is empty.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-2680
Reviewed-by: Alexander Lezhoev <alexander.lezhoev@seagate.com>
Reviewed-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Change-Id: Ie42baaa55a6433596e6004d16eb5c18ae2ef7479
Reviewed-on: https://review.whamcloud.com/15638
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Lezhoev <c17454@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10680 mdd: fix run_gc_task uninitialized 47/31347/2
Bruno Faccini [Sun, 18 Feb 2018 19:13:04 +0000 (20:13 +0100)]
LU-10680 mdd: fix run_gc_task uninitialized

run_gc_task was mistakenly left uninitialized in the previous
patch for LU-7340. This was silently ignored by gcc even
though the -Wall option is used during the build, possibly because no
optimization level/option was requested, and the -Wuninitialized
check may only be performed with optimization enabled.
The side effect is that the generated assembly code completely
omits the run_gc_task check from the source, and thus a kthread
for ChangeLogs garbage collection is created upon each
record creation, even when none of the garbage-collection
conditions is triggered.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ieb9ce062ba6ebf0c365c1e6f8a57f89dd39e0a9d
Reviewed-on: https://review.whamcloud.com/31347
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Stephan Thiell <sthiell@stanford.edu>
6 years agoLU-10561 flr: remove "--parent" option from lfs mirror command 98/31298/5
Jian Yu [Tue, 20 Feb 2018 22:35:52 +0000 (14:35 -0800)]
LU-10561 flr: remove "--parent" option from lfs mirror command

The "--parent" option for the "lfs mirror create/extend" command was
originally designed to use default stripe options inherited
from the parent directory. However, if the parent directory has a
composite layout, it is ambiguous which component the stripe
options should be inherited from. And if any other option is
specified, it is also inconsistent to
inherit the layout of the parent directory.

So, this patch removes "--parent" option to eliminate ambiguity.
For "--pool|-p" option, this patch supports specifying "none" to
clear the pool name and inherit from parent directory.

Unspecified stripe count, stripe size and OST pool name will
inherit from previous component. If there is no previous component,
then unspecified stripe count and stripe size attributes will
inherit from filesystem-wide default values. Unspecified or
cleared OST pool name will inherit from parent directory.
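
The inheritance rules above can be sketched as a small resolver. This is
a Python model of one reading of the rules, not the lfs code: the
Component type and resolve() are hypothetical names, and the sketch
assumes a cleared pool ("none") always falls back to the parent
directory, while an unspecified pool falls back to the previous
component first.

```python
from dataclasses import dataclass
from typing import List, Optional

POOL_NONE = "none"   # value that clears the pool name, per the commit text


@dataclass
class Component:
    stripe_count: Optional[int] = None
    stripe_size: Optional[int] = None
    pool: Optional[str] = None


def resolve(components: List[Component], fs_default: Component,
            parent_pool: Optional[str]) -> List[Component]:
    """Fill in unspecified attributes of each component in order."""
    resolved: List[Component] = []
    for comp in components:
        prev = resolved[-1] if resolved else None
        count = comp.stripe_count
        size = comp.stripe_size
        pool = comp.pool
        if count is None:
            count = prev.stripe_count if prev else fs_default.stripe_count
        if size is None:
            size = prev.stripe_size if prev else fs_default.stripe_size
        if pool == POOL_NONE:
            pool = parent_pool                 # cleared: inherit from parent
        elif pool is None:
            pool = prev.pool if prev else parent_pool
        resolved.append(Component(count, size, pool))
    return resolved
```

Resolving components in order means each one sees the already resolved
values of its predecessor, so a setting given once carries forward until
it is overridden.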

Change-Id: Ib0ec3cbc65fb307c42881f35dc676090ab8319ff
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31298
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10663 utils: clear errno before check 05/31305/4
John L. Hammond [Wed, 14 Feb 2018 18:27:54 +0000 (12:27 -0600)]
LU-10663 utils: clear errno before check

In jt_obd_destroy() clear errno before calling strtoull() and checking
it.

Test-Parameters: trivial testlist=obdfilter-survey

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I686cd6eb0a57248177e5b0878df5e3f450fbc942
Reviewed-on: https://review.whamcloud.com/31305
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9724 ldiskfs: update ext4-large-eas.patch to match upstream ext4 33/31033/9
Emoly Liu [Fri, 26 Jan 2018 07:26:00 +0000 (15:26 +0800)]
LU-9724 ldiskfs: update ext4-large-eas.patch to match upstream ext4

In order to match the enhanced ea_inode functionality being landed
to the upstream ext4 kernel tree, ext4-large-eas.patch is modified
to start properly initializing some of the fields we don't
currently use to minimize the interoperability issues.

In particular, the new EA inode refcount is initialized to 1, and
hash field is computed based on the xattr value as it is in the
upstream kernel patch.

However, ext4_xattr_inode_get_hash() has not been added to the
ldiskfs code, so this hash value is not used anywhere. If the
new checksum driver (sbi->s_chksum_driver) is not available, the hash
value will be 0 in the current implementation, until we find a way
to calculate it properly based on the xattr value.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I2bcf45c67a580f2f545816e1a70a6322c6ccc368
Reviewed-on: https://review.whamcloud.com/31033
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10277 utils: 'lfs mkdir -i -1' pick the less full MDTs 98/30598/11
Lai Siyao [Mon, 4 Dec 2017 07:38:25 +0000 (15:38 +0800)]
LU-10277 utils: 'lfs mkdir -i -1' pick the less full MDTs

If 'lfs mkdir -i -1 -c count' is specified, lfs will 'df' first,
and then randomly pick 'count' of the less full MDTs as the target MDTs.
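
A selection of this kind could look like the sketch below. It is a
Python illustration under loud assumptions: the commit text does not say
how "less full" is decided, so the mean-based cutoff, the widening
fallback and the pick_less_full_mdts() name are all hypothetical and not
taken from lfs.

```python
import random
from typing import Dict, List, Optional


def pick_less_full_mdts(free_fraction: Dict[int, float], count: int,
                        rng: Optional[random.Random] = None) -> List[int]:
    """Randomly choose `count` MDT indices from the less-full candidates.

    Assumption: "less full" means free space at or above the mean across
    MDTs; if that leaves fewer than `count` candidates, the pool is widened
    in order of decreasing free space.
    """
    if not 0 < count <= len(free_fraction):
        raise ValueError("count must be between 1 and the number of MDTs")
    rng = rng or random.Random()
    by_free = sorted(free_fraction, key=free_fraction.get, reverse=True)
    mean = sum(free_fraction.values()) / len(free_fraction)
    pool = [i for i in by_free if free_fraction[i] >= mean]
    if len(pool) < count:
        pool = by_free[:count]
    return rng.sample(pool, count)
```

Sampling without replacement guarantees distinct MDT indices, which a
striped directory needs.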

Add sanity test 413.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I2ce1720479d37b1ae397054743afae865129fee3
Reviewed-on: https://review.whamcloud.com/30598
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9019 obd: migrate upcall cache to time64_t 64/31064/3
James Simmons [Wed, 7 Feb 2018 06:20:54 +0000 (01:20 -0500)]
LU-9019 obd: migrate upcall cache to time64_t

Move all the upcall cache time handling from jiffies to time64_t.

Change-Id: I86039c6e6e35ac83b773753c952936f1b2f5e14a
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31064
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10270 lnet: remove an early rx code 54/30254/7
Alexey Lyashkov [Thu, 23 Nov 2017 11:28:18 +0000 (14:28 +0300)]
LU-10270 lnet: remove an early rx code

Early RX was added to the o2ib LND as an attempt to handle a reordering
problem, where messages arrive before the connection is actually set up.
But this code can fill the whole incoming queue, so that a normal
connect will not be processed.

Cray-bug-id: MRP-4638
Signed-off-by: Alexey Lyashkov <c17817@cray.com>
Change-Id: I2efc73534a20c4628ed462ee5055c901dbf44278
Reviewed-on: https://review.whamcloud.com/30254
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <dougso@me.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10181 tests: add FIO as test for DOM 59/30059/12
Mikhal Pershin [Mon, 13 Nov 2017 15:23:54 +0000 (18:23 +0300)]
LU-10181 tests: add FIO as test for DOM

Add FIO test for basic DOM performance tracking. Also:
- remove unused smallfileio test
- make parameter setting compatible with DNE
- turn off extra stats output by default
- format test output

Test-Parameters: trivial mdssizegb=20 testlist=sanity-dom,dom-performance
Signed-off-by: Mikhal Pershin <mike.pershin@intel.com>
Change-Id: Id4236643e841165d35e7d3f0c1ab64ae8f9e1751
Reviewed-on: https://review.whamcloud.com/30059
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9019 osd-ldiskfs: migrate to 64 bit time 57/29857/10
James Simmons [Mon, 5 Feb 2018 17:14:51 +0000 (12:14 -0500)]
LU-9019 osd-ldiskfs: migrate to 64 bit time

Replace cfs_time_current_sec() with ktime_get_real_seconds() to
avoid the overflow issues in 2038. Besides changing the struct
scrub_file sf_time_* fields to time64_t for usage with
ktime_get_real_seconds(), the other fields can also be moved to
time64_t since we don't need precision better than one
second for the scrubbing code. The dr_* time fields in struct
osd_iobuf are in jiffies, which get reported in the histograms.
This assumed that jiffies equal milliseconds, which
is not always the case. Since we need better than one second
resolution, move the dr_* time fields to ktime. This way the value
passed to lprocfs_oh_tally_log() will always be in milliseconds.
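
The jiffies-versus-milliseconds point can be illustrated with a small
worked calculation. This is a Python sketch rather than Lustre code;
jiffies_to_ms() is a hypothetical helper and the HZ values are only
example kernel tick rates.

```python
def jiffies_to_ms(jiffies: int, hz: int) -> int:
    """Convert a jiffies count to milliseconds for a tick rate of hz."""
    return jiffies * 1000 // hz


# One jiffy is one millisecond only when HZ == 1000.
elapsed = 250  # jiffies between two timestamps (example value)

ms_hz1000 = jiffies_to_ms(elapsed, 1000)  # 250 ms
ms_hz250 = jiffies_to_ms(elapsed, 250)    # 1000 ms
ms_hz100 = jiffies_to_ms(elapsed, 100)    # 2500 ms
```

With the same raw count, a histogram that treats jiffies as milliseconds
would be off by a factor of 4 at HZ=250 and 10 at HZ=100.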

Change-Id: Ibce7f7d9f972c8d3188271950f68dcda7663676f
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/29857
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-5695 libcfs: watchdog dispatch thread fix 55/12155/8
Alexander Zarochentsev [Fri, 16 Feb 2018 18:06:57 +0000 (13:06 -0500)]
LU-5695 libcfs: watchdog dispatch thread fix

lc_watchdogd may stop immediately after start
because nobody clears the stop flag.

Xyratex-bug-id: MRP-2108 MRP-1913
Change-Id: I1eaaf0330c111b7f2b17081c716ef8c200677d6b
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-on: https://review.whamcloud.com/12155
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10650 obd: add check to obd_statfs 43/31243/7
Alexander Boyko [Fri, 9 Feb 2018 12:07:19 +0000 (07:07 -0500)]
LU-10650 obd: add check to obd_statfs

A race can happen between mount and lctl get_param,
because procfs files are ready before obd initialization is complete.
For example:
3372:0:(dt_object.h:2509:dt_statfs()) ASSERTION( dev )
3372:0:(dt_object.h:2509:dt_statfs()) LBUG
Pid: 3372, comm: lctl
Call Trace:
libcfs_call_trace+0x4e/0x60[libcfs]
lbug_with_loc+0x4c/0xb0[libcfs]
tgt_statfs_internal+0x2ea/0x350[ptlrpc]
ofd_statfs+0x66/0x470 [ofd]
lprocfs_filesfree_seq_show+0xf6/0x520 [obdclass]
ofd_filesfree_seq_show+0x12/0x20 [ofd]

The patch adds a check of completed obd_setup to obd_statfs().
The patch adds the sanity 276 test.
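
The shape of such a guard can be sketched as follows. This is a Python
model, not the patch itself: the ObdDevice class, its set_up flag and
the -ENODEV return value are assumptions made for illustration.

```python
import errno


class ObdDevice:
    """Hypothetical stand-in for an obd device during mount."""

    def __init__(self):
        self.set_up = False   # becomes True once setup has completed
        self.backend = None   # callable that returns statfs data

    def setup(self, backend):
        self.backend = backend
        self.set_up = True


def obd_statfs(obd):
    """Return (rc, stats); refuse to touch a device that is not set up."""
    if not obd.set_up:
        # Assumed error code: fail cleanly instead of asserting.
        return -errno.ENODEV, None
    return 0, obd.backend()


dev = ObdDevice()
rc_early, stats_early = obd_statfs(dev)   # reader races with mount
dev.setup(lambda: {"files_free": 1000})
rc_ready, stats_ready = obd_statfs(dev)   # reader after setup
```

The early reader gets an error code back, while the later reader sees
the statfs data.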

Signed-off-by: Alexander Boyko <c17825@cray.com>
Cray-bug-id: LUS-2665
Change-Id: I55a9ffa7e036f486388a8f548051d28974d47951
Reviewed-on: https://review.whamcloud.com/31243
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8990 lod: put root at cleanup 43/31143/3
Lai Siyao [Fri, 2 Feb 2018 15:00:15 +0000 (23:00 +0800)]
LU-8990 lod: put root at cleanup

'lod_md_root' was put at precleanup, but soak testing shows that a
race exists and some ongoing request may re-initialize it. Move this
put to cleanup.

Also add debug code to dump remaining objects if lod device is still
referenced at lod_device_free().

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I6f1ab0ba149ccf95279c1182c90a5588607ad8fa
Reviewed-on: https://review.whamcloud.com/31143
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10550 flr: resync RDONLY state FLR file 10/31010/8
Bobi Jam [Wed, 24 Jan 2018 15:32:37 +0000 (23:32 +0800)]
LU-10550 flr: resync RDONLY state FLR file

When some components fail to resync for various reasons,
those components still have the STALE bit set, but the file state may
become RDONLY.

This patch makes it possible to resync an RDONLY FLR file.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I2e3b518bb969aedd7f214e6b09b895079cab69ab
Reviewed-on: https://review.whamcloud.com/31010
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10356 llite: have ll_write_end to sync for DIO 59/30659/2
Vladimir Saveliev [Tue, 26 Dec 2017 19:49:58 +0000 (22:49 +0300)]
LU-10356 llite: have ll_write_end to sync for DIO

Direct IO write uses buffered write for pages which could not be
released. If non-adjacent pages are not releasable, the
vio->u.write.vui_queue list becomes non-contiguous, which makes
page_list_sanity_check() fail.

Have ll_write_commit do vvp_io_write_commit() when it is called in
the course of direct IO.

Cray-bug-id: MRP-4415
Signed-off-by: Vladimir Saveliev <c17830@cray.com>
Change-Id: I21e653c4d45553c85ff5ded8edf22017966c7ba4
Reviewed-on: https://review.whamcloud.com/30659
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Reviewed-by: Andrew Perepechko <c17827@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8856 osd: mark specific transactions netfree 30/26930/20
Alex Zhuravlev [Wed, 3 May 2017 12:45:13 +0000 (15:45 +0300)]
LU-8856 osd: mark specific transactions netfree

osd-zfs should mark some transactions netfree. This means those transactions
are expected to release space (rather than consume it), and for this kind of
transaction half of the reserved space is available.

Change-Id: I71605bc224882aafac26b3dfb0f3d7e82af8fde8
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/26930
Tested-by: Jenkins
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10670 test: make sanity-flr test_43 more reliable 15/31315/3
Bobi Jam [Thu, 15 Feb 2018 07:59:14 +0000 (15:59 +0800)]
LU-10670 test: make sanity-flr test_43 more reliable

Make sanity-flr test_43 more reliable by setting the active
state of the OSP device instead of the OSC device to simulate the
OST's unavailability.

Test-Parameters: testlist=sanity-flr
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ibfb4a54479a7dafff251dd3645b03ec172b6884e
Reviewed-on: https://review.whamcloud.com/31315
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10443 test: Handle file lifecycle correctly 54/31254/2
Patrick Farrell [Fri, 9 Feb 2018 15:00:13 +0000 (09:00 -0600)]
LU-10443 test: Handle file lifecycle correctly

The current lockahead_test.c removes the test file on exit,
which will destroy the locks which sanity.sh counts to
verify correct operation.  This usually works because
sanity.sh wins the race with the object destroy command
from the MDS to the OSS.

Change lockahead_test.c to remove the test file on entry,
and to use $tfile rather than its own file, so it is
automatically cleaned up by sanity.

Change-Id: I3cd1fdb7f33da167ca21476a7b3cbe5f57fd5782
Signed-off-by: Patrick Farrell <paf@cray.com>
Reviewed-on: https://review.whamcloud.com/31254
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10634 kernel: kernel update [SLES12 SP3 4.4.114-94.11] 24/31224/3
Bob Glossman [Wed, 7 Feb 2018 23:04:40 +0000 (15:04 -0800)]
LU-10634 kernel: kernel update [SLES12 SP3 4.4.114-94.11]

Update target and kernel_config files for new version

Test-Parameters: clientdistro=sles12sp3 testgroup=review-ldiskfs \
  mdsdistro=sles12sp3 ossdistro=sles12sp3 \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I3ffcd4c368b2976cffa6a517f9fabcf674781ac9
Reviewed-on: https://review.whamcloud.com/31224
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10603 ptlrpc: export req_buffers_max via procfs 62/31162/2
Alex Zhuravlev [Mon, 5 Feb 2018 10:03:17 +0000 (13:03 +0300)]
LU-10603 ptlrpc: export req_buffers_max via procfs

after LU-9372 gcc7 complains:
lustre/ptlrpc/lproc_ptlrpc.c:382:16: error: ‘ptlrpc_lprocfs_req_buffers_max_fops’ defined but not used [-Werror=unused-const-variable=]
 LPROC_SEQ_FOPS(ptlrpc_lprocfs_req_buffers_max);
                 ^

Change-Id: Ie4806b79d104c7ea9aa34b6a8a280587fccef689
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: https://review.whamcloud.com/31162
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10560 libcfs: handle rename to wait_queue_entry_t 53/31153/11
Mike Marciniszyn [Fri, 9 Feb 2018 18:22:50 +0000 (13:22 -0500)]
LU-10560 libcfs: handle rename to wait_queue_entry_t

The 4.13 kernel renames wait_queue_t to wait_queue_entry_t.

Add a probe and handle rename across the code base and have
a define to translate to the new name when indicated.

Test-Parameters: trivial

Change-Id: I8f0f5ec4d02ccb270acb72ccffe13f0ecf6bd2f7
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31153
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10560 lustre_compat: Convert GFP_TEMPORARY to GFP_KERNEL 52/31152/6
Mike Marciniszyn [Fri, 2 Feb 2018 16:45:54 +0000 (08:45 -0800)]
LU-10560 lustre_compat: Convert GFP_TEMPORARY to GFP_KERNEL

The 4.14 kernel removes this gfp.h define.

Adjust the code to use GFP_KERNEL as the upstream
patch does.

Change-Id: I40fff2724499fa17aa285507e0fd9b21f4afc070
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-on: https://review.whamcloud.com/31152
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
6 years agoLU-10574 tests: remove useless check from sanity-dom.sh 74/31074/3
Elena Gryaznova [Wed, 7 Feb 2018 20:09:58 +0000 (23:09 +0300)]
LU-10574 tests: remove useless check from sanity-dom.sh

Tests test_sanity() and test_sanityn() are skipped if not started
from the lustre/tests directory, because of an incorrect check
that ./sanity.sh exists.
The patch removes the check for files which are part of
lustre/tests.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2594
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Test-Parameters: testlist=sanity-dom
Change-Id: I51ad517fbf3ff653d9a11994eb280daee589a886
Reviewed-on: https://review.whamcloud.com/31074
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10449 nrs: Generic TBF policy can't be shown correctly 96/30696/6
Qian Yingjin [Wed, 3 Jan 2018 09:21:10 +0000 (17:21 +0800)]
LU-10449 nrs: Generic TBF policy can't be shown correctly

After setting a TBF NID/OPCode/JobID policy and switching to the
generic policy, the output of "lctl get_param ost.OSS.ost.nrs_policies"
is not displayed correctly.

Change-Id: If8dcb7ae6ade634ec7ec4dfcb5887501cda90cdf
Signed-off-by: Qian Yingjin <qian@ddn.com>
Reviewed-on: https://review.whamcloud.com/30696
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9452 tests: remove sanityn test_29 from ALWAYS_EXCEPT 46/30646/2
Andreas Dilger [Fri, 22 Dec 2017 10:19:18 +0000 (03:19 -0700)]
LU-9452 tests: remove sanityn test_29 from ALWAYS_EXCEPT

There is no longer a sanityn.sh test_29() so it shouldn't be
listed in ALWAYS_EXCEPT.

Test-Parameters: trivial testlist=sanityn
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia8112e20cbd3203b69e85e586e2400551b94de81
Reviewed-on: https://review.whamcloud.com/30646
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10278 utils: allow to migrate without direct io 01/30301/4
Daniel Kobras [Tue, 28 Nov 2017 16:26:00 +0000 (00:26 +0800)]
LU-10278 utils: allow to migrate without direct io

Using direct i/o to copy file contents during migration minimizes
cache interference, but may significantly reduce performance.
Introduce new option -D/--non-direct to lfs migrate/lfs_migrate that
leaves the tradeoff at the discretion of the caller.

Signed-off-by: Daniel Kobras <d.kobras@science-computing.de>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I9c2935ff204ea5385bfc38006c5476b956deb6a7
Reviewed-on: https://review.whamcloud.com/30301
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-6142 uapi: remove remaining typedef in lustre UAPI headers 75/31175/2
James Simmons [Mon, 5 Feb 2018 20:42:25 +0000 (15:42 -0500)]
LU-6142 uapi: remove remaining typedef in lustre UAPI headers

Remove the remaining typedefs in the lustre UAPI headers to make them
Linux kernel compliant.

Test-Parameters: trivial

Change-Id: I13a24deed348e06c1c63bd0c332f63d4f77a0d76
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31175
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ben Evans <bevans@cray.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-7004 quota: make lctl set_param -P functional for quota 81/31081/5
James Simmons [Tue, 6 Feb 2018 15:39:30 +0000 (10:39 -0500)]
LU-7004 quota: make lctl set_param -P functional for quota

Currently setting up quota permanently can only be done with a
command like lctl conf_param $FSNAME.quota.ost=ug. To see if those
settings take hold we examine the 'enabled' proc file located in
the quota_slave directory in the proc tree. To make this workable
with lctl set_param -P we can make the 'enabled' proc file
writable, and lustre can treat the config log change from set_param
-P for quota like any other tunable to be set permanently.

Change-Id: I6a4c1fdc9d16658930f48d21e4f79e6f36047511
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: https://review.whamcloud.com/31081
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wang Shilong <wshilong@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10576 tests: sleep seconds to avoid using cached statfs 02/31102/8
Fan Yong [Wed, 14 Feb 2018 00:28:55 +0000 (08:28 +0800)]
LU-10576 tests: sleep seconds to avoid using cached statfs

In sanity test_803, we check the object usage via "lfs df -i".
But the MDT may return cached statfs if two "df" calls arrive
too close to each other (about 1 second). Sleep 3 seconds between
two "df" calls to avoid such trouble.

Test-Parameters: trivial envdefinitions=SLOW=yes testlist=sanity mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs mdscount=2 mdtcount=4
Signed-off-by: Fan Yong <fan.yong@intel.com>
Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I9ce4cb6c069a88fe2b93d2d5a6304c96bdb5a0c1
Reviewed-on: https://review.whamcloud.com/31102
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8444 tests: test for unsigned xattr inode number 47/21547/24
Artem Blagodarenko [Wed, 27 Jul 2016 16:01:06 +0000 (19:01 +0300)]
LU-8444 tests: test for unsigned xattr inode number

The patch for "MRP-3025 Hitting "LDISKFS-fs error (device md66):
ldiskfs_xattr_inode_iget: error while reading EA inode -2147483347" on
large MDT volumes with large_xattr feature enabled."

Added test:
1. MDS should have more than 2G inodes
2. mdt fs should be created with large_xattr flag.
3. set inode_goal to get higher inode number allocated:
   echo 2147483947 > /sys/fs/ldiskfs/<device>/inode_goal
4. create a file
5. start adding hard links to that file
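
Why an inode_goal above 2G exercises the bug can be shown with a short
calculation. This is a Python sketch; as_signed32() is a hypothetical
helper that mimics storing an unsigned 32-bit inode number in a signed
32-bit variable.

```python
def as_signed32(value: int) -> int:
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


INT32_MAX = (1 << 31) - 1      # 2147483647

goal = 2147483947              # inode_goal used by the test steps
goal_signed = as_signed32(goal)

# The inode number in the quoted error message, read back as unsigned.
logged = -2147483347
logged_unsigned = logged & 0xFFFFFFFF
```

Here goal exceeds INT32_MAX, so goal_signed is negative, and the
negative number from the error message maps back to an inode number
just above 2G.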

Test-Parameters: trivial testlist=conf-sanity
Signed-off-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Seagate-bug-id: MRP-3378
Change-Id: Id9c0fe9d8047935e5cf5be1b9209a74588565f2e
Reviewed-on: https://review.whamcloud.com/21547
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alexander Zarochentsev <c17826@cray.com>
Reviewed-by: Alexander Lezhoev <garson2@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-8602 gss: autoconf check missing "test" keyword 91/31191/2
Olaf Faaland [Tue, 6 Feb 2018 23:04:12 +0000 (15:04 -0800)]
LU-8602 gss: autoconf check missing "test" keyword

Change https://review.whamcloud.com/31095 introduced an error in the
autoconf, omitting the command "test" in an autoconf check.  Add it.

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Change-Id: I525805801d9e8166ec1064dccbf6cec6f97efdfa
Reviewed-on: https://review.whamcloud.com/31191
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10611 autoconf: check zlib library and zlib.h header file 86/31186/2
Jian Yu [Tue, 6 Feb 2018 20:37:23 +0000 (12:37 -0800)]
LU-10611 autoconf: check zlib library and zlib.h header file

After landing commit f1daa8fc6575e5b9e4a2f1f2ae4ceaefb889a694,
zlib library and zlib.h header file are required to compile lfs.c.
This patch adds the check in configure script.

Change-Id: Id3a8acfc780fb4fcdec0bb99b79b550c5c9e957a
Test-Parameters: trivial testlist=sanity-flr
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31186
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Quentin Bouget <quentin.bouget@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10577 tests: fix lfsck-performance for separate MGT and MDT 75/31075/2
Elena Gryaznova [Mon, 29 Jan 2018 17:30:54 +0000 (20:30 +0300)]
LU-10577 tests: fix lfsck-performance for separate MGT and MDT

The lfsck-performance tests 0, 1, 2 and 3 run stopall and then mount
the MDT, which causes test failures on configurations where the
MGS and MDS are not combined.
The patch fixes these test defects.

Signed-off-by: Elena Gryaznova <c17455@cray.com>
Cray-bug-id: LUS-2534
Test-Parameters: testlist=lfsck-performance
Reviewed-by: Vladimir Saveliev <c17830@cray.com>
Reviewed-by: Andriy Skulysh <c17819@cray.com>
Change-Id: I24c15b9998511bab3dc6fdd3445793e70281c890
Reviewed-on: https://review.whamcloud.com/31075
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10482 flr: enhance "lfs find" to add mirror options 69/31069/6
Jian Yu [Thu, 8 Feb 2018 06:45:02 +0000 (22:45 -0800)]
LU-10482 flr: enhance "lfs find" to add mirror options

This patch adds the following mirror related search
options to "lfs find" command:

[[!] --mirror-count|-N [+-]n]
[[!] --mirror-state <[^]state>]

--mirror-count|-N indicates mirror count.
--mirror-state indicates mirrored file state.

A mirrored file can be one of the following states:
ro indicates the mirrored file is in read-only state.
   All of the mirrors contain the up-to-date data.
wp indicates the mirrored file is being written.
sp indicates the mirrored file is being resynchronized.
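
The option syntax above can be modelled with a small matcher. This is a
Python sketch under stated assumptions, not the lfs implementation: it
assumes "+n" means more than n and "-n" means fewer than n, as in
find(1), and that a leading "^" negates the state. Both function names
are hypothetical.

```python
def count_matches(spec: str, count: int) -> bool:
    """Match a mirror count against an "[+-]n" specification."""
    if spec.startswith("+"):
        return count > int(spec[1:])    # assumed: more than n
    if spec.startswith("-"):
        return count < int(spec[1:])    # assumed: fewer than n
    return count == int(spec)           # exactly n


def state_matches(spec: str, state: str) -> bool:
    """Match a mirrored file state (ro, wp, sp) against "[^]state"."""
    if spec.startswith("^"):
        return state != spec[1:]        # assumed: "^" negates
    return state == spec


more_than_one = count_matches("+1", 3)
exactly_two = count_matches("2", 3)
not_read_only = state_matches("^ro", "wp")
```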

Change-Id: I3c8f5c8bb6518ba4bd73fc2f164dd52afdfac211
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: https://review.whamcloud.com/31069
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 doc: update llog_reader man page for Changelogs 70/30970/9
Sebastien Buisson [Mon, 22 Jan 2018 17:07:01 +0000 (02:07 +0900)]
LU-9727 doc: update llog_reader man page for Changelogs

Add new paragraph in llog_reader's man page to explain how to read
Changelogs with llog_reader, and add an example.

Test-Parameters: trivial
Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I3e1123b9a5ac88334a370fd69c1d9d63597e16f7
Reviewed-on: https://review.whamcloud.com/30970
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Olaf Weber <olaf.weber@hpe.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9906 osd: use pagevec for putting pages 31/30531/5
Patrick Farrell [Mon, 5 Feb 2018 12:16:58 +0000 (06:16 -0600)]
LU-9906 osd: use pagevec for putting pages

Using a pagevec instead of individual page puts is much
more efficient.  This should reduce contention on the page
cache allocation/freeing, which becomes a bottleneck with
high speed OSTs.

Cray-bug-id: LUS-5670
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic15cb8e30887ec55e9348e50af307bfd7108c7e4
Reviewed-on: https://review.whamcloud.com/30531
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Alexey Lyashkov <c17817@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-10377 build: Update ZFS Version to 0.7.6 22/30522/6
Nathaniel Clark [Fri, 26 Jan 2018 14:02:02 +0000 (09:02 -0500)]
LU-10377 build: Update ZFS Version to 0.7.6

Update SPL and ZFS version that is built against.

https://github.com/zfsonlinux/zfs/releases/tag/zfs-0.7.6

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: If010d3a7e78b66a2acbd70242fe517218a438c02
Reviewed-on: https://review.whamcloud.com/30522
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 utils: make llog_reader decode changelog fields 15/30315/13
Sebastien Buisson [Wed, 29 Nov 2017 17:18:32 +0000 (02:18 +0900)]
LU-9727 utils: make llog_reader decode changelog fields

Make llog_reader decode all Changelogs fields and extra fields.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Idfb41607fc5664cb99b254aece4625d1796331af
Reviewed-on: https://review.whamcloud.com/30315
Reviewed-by: Li Xi <lixi@ddn.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
6 years agoLU-9727 lustre: record denied OPEN in Changelogs 12/28812/24
Sebastien Buisson [Tue, 29 Aug 2017 08:45:30 +0000 (17:45 +0900)]
LU-9727 lustre: record denied OPEN in Changelogs

Record denied OPEN events in Changelogs, in the same format as
successful OPEN events.
Recording denied OPEN events is useful for security audit,
in order to find out who tried to get access to some data.
An NOPEN changelog entry is in the form:
4 24NOPEN 15:45:44.947406626 2017.08.31 0x2 t=[0x200000402:0x1:0x0]
ef=0xf u=500:500 nid=10.128.11.158@tcp m=-w-
By default, disable recording of NOPEN events in Changelogs.
NOPEN entries in Changelogs are rate limited: no more than one
entry per user per file per minute, configurable via
/proc/fs/lustre/mdd/<fsname>-MDTXXX/changelog_deniednext
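
The rate limit described above can be sketched as follows. This is a
Python model, not the MDD code: the DeniedOpenLimiter class, its
per-(uid, fid) dictionary and the 60-second default are assumptions
chosen to match "one entry per user per file per minute".

```python
class DeniedOpenLimiter:
    """Allow at most one NOPEN record per (uid, fid) per interval."""

    def __init__(self, interval: float = 60.0):
        self.interval = interval      # seconds between recorded entries
        self._next_allowed = {}       # (uid, fid) -> earliest next time

    def should_record(self, uid: int, fid: str, now: float) -> bool:
        key = (uid, fid)
        if now < self._next_allowed.get(key, 0.0):
            return False              # still inside the quiet period
        self._next_allowed[key] = now + self.interval
        return True


limiter = DeniedOpenLimiter()
fid = "[0x200000402:0x1:0x0]"
first = limiter.should_record(500, fid, now=0.0)
repeat = limiter.should_record(500, fid, now=10.0)
other_user = limiter.should_record(501, fid, now=10.0)
after_minute = limiter.should_record(500, fid, now=61.0)
```

The time is passed in explicitly so that the behaviour is easy to check
without sleeping.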

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: Ib33651dda63735e21fffeed34cb1adc803ff7eca
Reviewed-on: https://review.whamcloud.com/28812
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Matthew S <matthew.sanderson@anu.edu.au>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
6 years agoLU-9727 lustre: limit OPEN and CLOSE rates in Changelogs 99/28299/31
Sebastien Buisson [Mon, 31 Jul 2017 11:50:22 +0000 (20:50 +0900)]
LU-9727 lustre: limit OPEN and CLOSE rates in Changelogs

Record OPEN only once in the Changelogs per UID/GID, for a given
open mode, as long as the file is not closed by this UID/GID.
Similarly, only record the last CLOSE per UID/GID.
For instance, it avoids flooding the Changelogs if there is an MPI
job opening the same file thousands of times from different threads.
It reduces the ChangeLog load significantly, without significantly
affecting the audit information.

To achieve this, add a list to struct mdd_object, containing uid/gid
of clients opening files. Record OPEN only if client uid/gid is not
already in list. And record CLOSE only if client uid/gid was just
removed from list.
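
The bookkeeping described above can be sketched as follows. This is a
Python model, not the mdd_object code: the OpenTracker class is
hypothetical, it uses a per-(uid, gid) counter where the patch
describes a list, and it ignores the open mode for brevity.

```python
class OpenTracker:
    """Decide which OPEN and CLOSE events of one file get recorded."""

    def __init__(self):
        self._opens = {}              # (uid, gid) -> number of open handles

    def on_open(self, uid: int, gid: int) -> bool:
        """Return True if this OPEN should go into the Changelog."""
        key = (uid, gid)
        first = key not in self._opens
        self._opens[key] = self._opens.get(key, 0) + 1
        return first

    def on_close(self, uid: int, gid: int) -> bool:
        """Return True if this CLOSE should go into the Changelog."""
        key = (uid, gid)
        remaining = self._opens.get(key, 0) - 1
        if remaining > 0:
            self._opens[key] = remaining
            return False              # the same uid/gid still has it open
        self._opens.pop(key, None)    # last close: entry leaves the list
        return True


tracker = OpenTracker()
opens = [tracker.on_open(500, 500) for _ in range(3)]
closes = [tracker.on_close(500, 500) for _ in range(3)]
```

Three opens and three closes from the same uid/gid yield one recorded
OPEN (the first) and one recorded CLOSE (the last).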

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I0fa08d11f0284d63e531ab48c03a8af6f3928487
Reviewed-on: https://review.whamcloud.com/28299
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Li Xi <lixi@ddn.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>