Whamcloud - gitweb
fs/lustre-release.git
4 years agoLU-7623 lmv: Mark lmv_hsm_ct_register/unregister uarg as __user 83/17783/3
Oleg Drokin [Sun, 3 Jan 2016 21:07:16 +0000 (16:07 -0500)]
LU-7623 lmv: Mark lmv_hsm_ct_register/unregister uarg as __user

Since it is a userspace pointer, this makes things neater and
sparse happier.
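
A minimal userspace sketch of what the annotation buys, assuming the classic kernel definition of __user under sparse. The demo_* functions are hypothetical stand-ins, not the real lmv code or the kernel's copy_from_user():

```c
#include <assert.h>
#include <string.h>

/* Under sparse (__CHECKER__ defined) __user places the pointer in a
 * separate address space and forbids dereferencing it directly.
 * For a normal compile it expands to nothing. */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical stand-in for copy_from_user(): returns the number of
 * bytes that could NOT be copied (always 0 in this userspace model). */
unsigned long demo_copy_from_user(void *to, const void __user *from,
				  unsigned long n)
{
	memcpy(to, (const void __force *)from, n);
	return 0;
}

/* Hypothetical handler: uarg is annotated, so it may only be read
 * through the copy helper, never dereferenced directly. */
int demo_ct_register(void __user *uarg, unsigned int *archives_out)
{
	unsigned int archives;

	assert(archives_out != NULL);
	if (demo_copy_from_user(&archives, uarg, sizeof(archives)) != 0)
		return -1;	/* kernel code would return -EFAULT */
	*archives_out = archives;
	return 0;
}
```

With the annotation in place, sparse can warn when an annotated pointer is dereferenced directly or passed to a function expecting a kernel pointer.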

Change-Id: I3249ecba20a2018b6ebba4d257ce918b4bd9aed1
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17783
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-7623 lmv: Properly mark lmv_fid2path uarg argument as __user 82/17782/4
Oleg Drokin [Tue, 2 Feb 2016 15:29:11 +0000 (10:29 -0500)]
LU-7623 lmv: Properly mark lmv_fid2path uarg argument as __user

This makes sparse happy too.

Change-Id: Ice8067168af9a6d13900e6224d3224dbb6bf0541
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17782
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years agoLU-7623 Update obd iocontrol methods with __user attribute 81/17781/5
Oleg Drokin [Thu, 4 Feb 2016 14:21:22 +0000 (09:21 -0500)]
LU-7623 Update obd iocontrol methods with __user attribute

lmv_iocontrol, osc_iocontrol, mdt_iocontrol, mgs_iocontrol, ofd_iocontrol,
osp_iocontrol and echo_client_brw_ioctl were somehow missing
the __user attribute for uarg.

Change-Id: I10603823f5856fee6ca48c2aea03273e9d29144e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17781
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-6587 obdclass: use OBD_FREE_LARGE with OBD_ALLOC_LARGE 34/18034/4
Andreas Dilger [Mon, 18 Jan 2016 19:54:46 +0000 (12:54 -0700)]
LU-6587 obdclass: use OBD_FREE_LARGE with OBD_ALLOC_LARGE

The change to use is_vmalloc_addr() instead of checking the allocation
size was introduced in commit 919b85d796f8, which allows trying
kmalloc() before vmalloc(), but OBD_FREE_LARGE() should not have been
deprecated, since this adds needless overhead.

Use OBD_FREE_LARGE() for memory allocated with OBD_ALLOC_LARGE() so
that we only need to check is_vmalloc_addr() in OBD_FREE_LARGE()
instead of every call to OBD_FREE().

Add comments to data structures using OBD_ALLOC_LARGE() memory so
that it is clear to the users that OBD_FREE_LARGE() must be used
when freeing that memory.
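
The pairing rule can be modelled in userspace roughly as follows. This is only a sketch: the demo_* names, the size cutoff, and the header tag that stands in for is_vmalloc_addr() are all assumptions, not the real OBD macros:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_KMALLOC_MAX 4096	/* hypothetical cutoff for the "kmalloc" path */

/* Header recording which allocator produced the buffer. The kernel
 * answers the same question with is_vmalloc_addr() on the address. */
union demo_hdr {
	int from_vmalloc;
	max_align_t align;	/* keeps the payload suitably aligned */
};

int demo_vmalloc_checks;	/* how often the "is vmalloc?" test ran */

static void *demo_alloc_tagged(size_t size, int from_vmalloc)
{
	union demo_hdr *hdr = calloc(1, sizeof(*hdr) + size);

	if (hdr == NULL)
		return NULL;
	hdr->from_vmalloc = from_vmalloc;
	return hdr + 1;
}

/* Small allocation: always the "kmalloc" path. */
void *demo_alloc(size_t size)
{
	return demo_alloc_tagged(size, 0);
}

/* Large allocation: sizes above the cutoff stand in for a failed
 * kmalloc() and are tagged as coming from "vmalloc". */
void *demo_alloc_large(size_t size)
{
	return demo_alloc_tagged(size, size > DEMO_KMALLOC_MAX);
}

/* Plain free: no allocator check is needed. */
void demo_free(void *ptr)
{
	assert(ptr != NULL);
	free((union demo_hdr *)ptr - 1);
}

/* Large free: the only place that pays for the allocator check.
 * Returns 1 if the buffer came from the "vmalloc" path. */
int demo_free_large(void *ptr)
{
	union demo_hdr *hdr;
	int was_vmalloc;

	assert(ptr != NULL);
	hdr = (union demo_hdr *)ptr - 1;
	demo_vmalloc_checks++;
	was_vmalloc = hdr->from_vmalloc;
	free(hdr);	/* kernel code would pick vfree() or kfree() here */
	return was_vmalloc;
}
```

The point of the split is visible in the counter: freeing small buffers never touches the check, while each large free performs it exactly once.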

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ief38142f6f777eec4ec0dae4ec64bfbf78b804ed
Reviewed-on: http://review.whamcloud.com/18034
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
4 years agoLU-6719 osd-zfs: Ignore EEXIST during object init 54/18054/4
Nathaniel Clark [Wed, 20 Jan 2016 16:16:00 +0000 (11:16 -0500)]
LU-6719 osd-zfs: Ignore EEXIST during object init

ZFS can return EEXIST if object exists but is being destroyed.

Specifically, see dnode_hold_impl().

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Id99b406b2f02a1337b9f1566fba30dbced755d5d
Reviewed-on: http://review.whamcloud.com/18054
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7578 gnilnd: Return correct error on GNI_RC_ERROR_NOMEM 66/17666/2
Chuck Fossen [Fri, 11 Dec 2015 15:02:44 +0000 (15:02 +0000)]
LU-7578 gnilnd: Return correct error on GNI_RC_ERROR_NOMEM

gni_mem_register() can now return GNI_RC_ERROR_NOMEM.
The upper layers need GNI_RC_ERROR_RESOURCE returned so that the
registration will retry.
In kgnilnd_mem_register, convert GNI_RC_ERROR_NOMEM to
GNI_RC_ERROR_RESOURCE.
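
A small sketch of the conversion, using a hypothetical demo enum rather than the real gni_return_t values, and a scripted sequence of return codes in place of real registration calls:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GNI return codes named above. */
enum demo_gni_rc {
	DEMO_RC_SUCCESS = 0,
	DEMO_RC_ERROR_RESOURCE,
	DEMO_RC_ERROR_NOMEM,
	DEMO_RC_INVALID_PARAM
};

/* Fold the new NOMEM code into RESOURCE so that callers which only
 * retry on RESOURCE keep working. */
enum demo_gni_rc demo_mem_register_rc(enum demo_gni_rc raw)
{
	return raw == DEMO_RC_ERROR_NOMEM ? DEMO_RC_ERROR_RESOURCE : raw;
}

/* Walk a scripted sequence of raw return codes and report how many
 * attempts were made before a non-retryable result was seen. */
int demo_attempts(const enum demo_gni_rc *raw, int count)
{
	int i;

	assert(raw != NULL && count > 0);
	for (i = 0; i < count; i++)
		if (demo_mem_register_rc(raw[i]) != DEMO_RC_ERROR_RESOURCE)
			return i + 1;
	return count;
}
```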

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I117acbe7ed24447bb2cf6d36b7f4814eea05ac2d
Reviewed-on: http://review.whamcloud.com/17666
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7578 gnilnd: Handle new return code in gni_mem_register() 65/17665/2
Chuck Fossen [Tue, 1 Dec 2015 22:50:32 +0000 (22:50 +0000)]
LU-7578 gnilnd: Handle new return code in gni_mem_register()

gni_mem_register() can now return GNI_RC_ERROR_NOMEM. Add
GNI_RC_ERROR_NOMEM to the case statement of handled return codes.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: Ib591212f070b5eb15240fa4bdd247aa3deb4357a
Reviewed-on: http://review.whamcloud.com/17665
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7578 gnilnd: Add module parameter reg_fail_timeout 64/17664/4
Chuck Fossen [Mon, 1 Feb 2016 23:46:00 +0000 (18:46 -0500)]
LU-7578 gnilnd: Add module parameter reg_fail_timeout

During network outages on very large machines, it is possible to use
up all of GART space with connections that are in purgatory waiting
to be freed when we finally make a new connection.
This mod adds a timeout parameter so that when we fail registering
memory for fma blocks for a period of time, we can bring the node down
so it is not stuck in a state of being up but unusable.
This can only happen on service nodes, as there can potentially be tens
of thousands of connections.
A recommended setting for reg_fail_timeout would be 60 - 300 seconds.
The default setting for reg_fail_timeout is -1 (disabled).

Set fail_loc 0xf002 which fails memory registrations and see that we
BUG after the required timeout.
Test that transient registration failures within the timeout period
do not cause BUG.
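
The timeout logic described above might look roughly like this. It is a sketch under assumptions: the struct, the function name, and the use of plain seconds are hypothetical, and the real code would BUG instead of returning 1:

```c
#include <assert.h>
#include <stddef.h>

struct demo_reg_state {
	long reg_fail_timeout;	/* seconds; -1 disables the check */
	long first_failure;	/* start of the current failure run, -1 if none */
};

/* Record one registration attempt at time `now` (in seconds).
 * Returns 1 when registrations have been failing continuously for at
 * least reg_fail_timeout seconds, 0 otherwise. */
int demo_reg_should_bug(struct demo_reg_state *st, int failed, long now)
{
	assert(st != NULL);
	if (!failed) {
		st->first_failure = -1;	/* a success ends the failure run */
		return 0;
	}
	if (st->first_failure < 0)
		st->first_failure = now;
	if (st->reg_fail_timeout < 0)
		return 0;	/* feature disabled (the default) */
	return now - st->first_failure >= st->reg_fail_timeout;
}
```

A transient failure followed by a success resets the window, which matches the second test case in the commit message.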

Signed-off-by: Chris Horn <hornc@cray.com>
Signed-off-by: Chuck Fossen <chuckf@cray.com>
Change-Id: I214b5e5a297c547f3c4675fcc263e5dd8aaed24f
Reviewed-on: http://review.whamcloud.com/17664
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7646 o2iblnd: connrace protocol improvement 37/18037/4
Liang Zhen [Thu, 7 Jan 2016 16:50:51 +0000 (00:50 +0800)]
LU-7646 o2iblnd: connrace protocol improvement

This patch allows a peer that has the lower NID to win the connection
race if it has already lost the race many times.
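
A sketch of such a tie-break, assuming a per-peer counter of lost races. The struct, the function, and the threshold value are hypothetical and not taken from the o2iblnd source:

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_RACES 20	/* hypothetical threshold */

struct demo_peer {
	unsigned long long nid;	/* NID of the remote peer */
	int connecting;		/* we have our own connect in flight */
	unsigned int races;	/* races this peer has lost in a row */
};

/* Decide whether an incoming connection request from `peer` is
 * rejected as the loser of a connection race. The lower NID normally
 * loses, unless it has already lost DEMO_MAX_RACES times in a row. */
int demo_reject_as_race(struct demo_peer *peer, unsigned long long my_nid)
{
	assert(peer != NULL);
	if (peer->connecting && peer->nid < my_nid &&
	    peer->races < DEMO_MAX_RACES) {
		peer->races++;
		return 1;
	}
	peer->races = 0;	/* race resolved, start counting afresh */
	return 0;
}
```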

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I49c8151469ff9c4019213117396c49231f6b6948
Reviewed-on: http://review.whamcloud.com/18037
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7729 target: fix process_req_last_xid() return value 45/18245/3
Niu Yawei [Mon, 1 Feb 2016 16:15:44 +0000 (11:15 -0500)]
LU-7729 target: fix process_req_last_xid() return value

process_req_last_xid() returns ptlrpc_error() on error, which
mistakenly returns 0 to the caller.
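
The bug pattern can be illustrated with a hypothetical reply helper that reports whether the send succeeded rather than the error being sent. The demo_* names, the XID comparison, and the choice of -EPROTO are assumptions for illustration only:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_req {
	unsigned long long xid;
	int status;		/* status carried in the reply */
};

/* Stand-in for a "send error reply" helper: it records the error in
 * the reply and returns 0 because the send itself succeeded. */
int demo_send_error(struct demo_req *req, int rc)
{
	assert(req != NULL);
	req->status = rc;
	return 0;
}

/* Buggy pattern: the helper's 0 is handed back, so the caller
 * believes the check passed. */
int demo_check_xid_buggy(struct demo_req *req, unsigned long long last_xid)
{
	if (req->xid <= last_xid)
		return demo_send_error(req, -EPROTO);
	return 0;
}

/* Fixed pattern: send the error reply, then return the error itself. */
int demo_check_xid_fixed(struct demo_req *req, unsigned long long last_xid)
{
	if (req->xid <= last_xid) {
		demo_send_error(req, -EPROTO);
		return -EPROTO;
	}
	return 0;
}
```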

Test-Parameters: envdefinitions=ONLY=failover_ost \
clientcount=4 osscount=2 mdscount=2 mdtcount=1 \
austeroptions=-R failover=true iscsi=1 \
testlist=recovery-mds-scale

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I136a8ef153a3ea08dcbf05e11fb412e31947be20
Reviewed-on: http://review.whamcloud.com/18245
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7584 tests: create file on single MDS in sanity test 129 92/18192/8
Jian Yu [Thu, 28 Jan 2016 22:54:35 +0000 (14:54 -0800)]
LU-7584 tests: create file on single MDS in sanity test 129

Sanity test 129 creates only one more file to check
whether the directory size exceeds the limit or not. However,
with a DNE configuration, the new file might be created in a
different stripe from the previous one that hit ENOSPC.
So the directory size might not exceed the limit, which causes
the test to fail.

Since the test is for checking ldiskfs dir size parameters, the
patch just fixes it to create files on a single MDS so as to make
sure creating new files will increase the directory size.

Test-Parameters: envdefinitions=ONLY=129 clientdistro=el7 ossdistro=el7 mdsdistro=el7 mdscount=2 mdtcount=4 testlist=sanity
Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I75a2437fe3a4f6b160651d8704799ce8478a0041
Reviewed-on: http://review.whamcloud.com/18192
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6824 ldiskfs: add dir htree growing warning patch 69/18169/6
Jian Yu [Wed, 27 Jan 2016 02:25:03 +0000 (18:25 -0800)]
LU-6824 ldiskfs: add dir htree growing warning patch

RHEL 7.2 and SLES 12 were supported after landing commit
07660ad33a7d109cced29b6400f99f25adab3f54. This patch adds
the missing ext4-give-warning-with-dir-htree-growing.patch
into the series files for both distros.

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I40b4d34de467bc933dd43e175d78e37f59d91b16
Reviewed-on: http://review.whamcloud.com/18169
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7564 osp: Do not match the lock for OSP 06/18206/3
Di Wang [Thu, 28 Jan 2016 11:20:07 +0000 (06:20 -0500)]
LU-7564 osp: Do not match the lock for OSP

In DNE operations, we do not need to match the lock
in the OSP cache. This way the remote object is locked
exclusively on the master MDT, and other threads on the
master MDT cannot access the remote object at the
same time.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I69a4f243fb26f4e37857fea6fd63b650b6ad046e
Reviewed-on: http://review.whamcloud.com/18206
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6225 test: test-framework does not cleanup for failed tests 92/13692/5
gaurav mahajan [Mon, 9 Feb 2015 12:53:21 +0000 (18:23 +0530)]
LU-6225 test: test-framework does not cleanup for failed tests

Add reset_fail_loc to the error_noexit() function in test-framework.
This resets fail_loc and makes sure that the next test
starts with no error injected.

Xyratex-bug-id: MRP-2079
Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Change-Id: I8cadd21a794d0eb429aee4734d47bd56caf0b8fe
Signed-off-by: gaurav mahajan <gaurav.mahajan@seagate.com>
Reviewed-on: http://review.whamcloud.com/13692
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-3953 build: Fix duplicate snmp directory packaging 91/18191/2
Christopher J. Morrone [Thu, 28 Jan 2016 01:55:48 +0000 (20:55 -0500)]
LU-3953 build: Fix duplicate snmp directory packaging

The %{_datadir}/lustre/snmp/mibs is in conflict with the later
%{_datadir}/lustre in the %files section.  Fortunately, it just
prints a warning rather than aborting the process.  But we can
fix that warning.

We remove the more specific %{_datadir}/lustre/snmp/mibs since
the files are already included with the more general form.

Change-Id: I293f0bf07760719f7cf3e1a963e49c007a483311
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/18191
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7578 gnilnd: Modify allocator flags to prevent waiting 63/17663/2
James Shimek [Tue, 10 Nov 2015 19:18:01 +0000 (19:18 +0000)]
LU-7578 gnilnd: Modify allocator flags to prevent waiting

kgnilnd currently uses several flags to try to prevent specific
things from causing the node to hang. This has not been enough to
prevent OOM conditions from stalling all network traffic on compute
nodes during periods where memory-filling tests are run doing IO.
Based on discussions with the kernel group, we are adding a new flag,
__GFP_NORETRY, to the allocator flags in the hope that it prevents the
allocator from spinning forever. Change GFP_NOFS to GFP_NOIO to fully
protect against any "IO" occurring in an IO path.

Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I2bcc71ebf6e8ff75d2ac41cae44387294328c74c
Reviewed-on: http://review.whamcloud.com/17663
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7468 tests: update maloo_upload.sh to create upload.tar.gz 44/17344/4
Leonel Ochoa [Mon, 23 Nov 2015 23:17:51 +0000 (15:17 -0800)]
LU-7468 tests: update maloo_upload.sh to create upload.tar.gz

Uploaded files are now expected to have the '.tar.gz' extension.
This patch updates maloo_upload.sh to create upload.tar.gz before
uploading.

Signed-off-by: Leonel Ochoa <leonel.ochoa@intel.com>
Change-Id: Id8b6dd08dde873fad9e85438360e451945903e9c
Reviewed-on: http://review.whamcloud.com/17344
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7703 mdd: linkea should be updated properly at migration 09/18109/6
Alex Zhuravlev [Sat, 23 Jan 2016 22:17:41 +0000 (01:17 +0300)]
LU-7703 mdd: linkea should be updated properly at migration

When migrating a directory and fixing the children's linkeas,
do this correctly: search for the old fid and replace it with the new one.
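
As a rough illustration of "search for the old fid, replace with a new one", assuming a link EA can be modelled as an array of (parent fid, name) entries. The demo_* types are hypothetical and much simpler than the real linkea format:

```c
#include <assert.h>
#include <stddef.h>

struct demo_fid {
	unsigned long long seq;
	unsigned int oid;
	unsigned int ver;
};

/* One entry of a simplified link EA: parent directory plus name. */
struct demo_link {
	struct demo_fid parent;
	const char *name;
};

int demo_fid_eq(const struct demo_fid *a, const struct demo_fid *b)
{
	return a->seq == b->seq && a->oid == b->oid && a->ver == b->ver;
}

/* Replace every entry whose parent is old_fid with new_fid and leave
 * entries under other parents untouched. Returns the number of
 * entries that were rewritten. */
int demo_linkea_replace(struct demo_link *links, size_t count,
			const struct demo_fid *old_fid,
			const struct demo_fid *new_fid)
{
	size_t i;
	int replaced = 0;

	assert(links != NULL || count == 0);
	for (i = 0; i < count; i++) {
		if (demo_fid_eq(&links[i].parent, old_fid)) {
			links[i].parent = *new_fid;
			replaced++;
		}
	}
	return replaced;
}
```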

Change-Id: Ib48f73d51ca635083d733202c59a9bdcdfe116fb
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/18109
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7443 llog: remove unused and empty llog 27/17227/9
Alexander Boyko [Tue, 17 Nov 2015 12:22:05 +0000 (15:22 +0300)]
LU-7443 llog: remove unused and empty llog

This patch adds the ability to remove a plain llog during record
cancellation for an inactive plain llog. Previously, such files
were removed only during the mount operation, which is not enough
for changelogs: the current marker of the catalog could reach the
undeleted record, which causes changelog problems.

Signed-off-by: Alexander Boyko <alexander.boyko@seagate.com>
Seagate-bug-id: MRP-2897
Change-Id: Ic24a1643f2fb264ad1212668e382a0bbc9b735b7
Reviewed-on: http://review.whamcloud.com/17227
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 util: migrate lctl params functions to use cfs_get_paths() 66/17466/18
James Simmons [Thu, 28 Jan 2016 16:08:48 +0000 (11:08 -0500)]
LU-5030 util: migrate lctl params functions to use cfs_get_paths()

Make the normal lctl set_param, list_param, and get_param
operations use the new cfs_get_paths() function, which
enables sysfs support alongside procfs.

Change-Id: I5817e96c3172de53930776f0891f2a642907bfde
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Wang Chao <chao.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/17466
Reviewed-by: Ryan Haasken <haasken@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7715 out: fix err_serious in out_handle 87/18187/3
Di Wang [Wed, 27 Jan 2016 14:16:52 +0000 (09:16 -0500)]
LU-7715 out: fix err_serious in out_handle

Only return err_serious before out_handle() packs the reply.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I088501019c3b79561e8a0c43609e33f3a5a7d746
Reviewed-on: http://review.whamcloud.com/18187
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7716 mdt: No is_subdir check for same dir rename 72/18172/2
Di Wang [Wed, 27 Jan 2016 00:13:01 +0000 (19:13 -0500)]
LU-7716 mdt: No is_subdir check for same dir rename

In rename, if the source and target are in the same
directory, the is_subdir check is not needed.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I03a4aff71b2c284197a8f78f6306568249162aca
Reviewed-on: http://review.whamcloud.com/18172
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7705 ldlm: make round_timeout() static 19/18119/3
Alex Zhuravlev [Mon, 25 Jan 2016 11:47:35 +0000 (14:47 +0300)]
LU-7705 ldlm: make round_timeout() static

to make gcc5 happy.

Change-Id: I5e92facd497c04b2595dea3782935f2cc5791de1
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/18119
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7576 llapi: use dirname() in opendir_parent() 96/17796/4
John L. Hammond [Mon, 4 Jan 2016 18:24:00 +0000 (12:24 -0600)]
LU-7576 llapi: use dirname() in opendir_parent()

In opendir_parent() pass the path through dirname() so that the
resulting directory may be used with basename().
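
A short sketch of why this helps with trailing slashes, using the POSIX dirname() and basename() from <libgen.h>. The helper name demo_split_path() is hypothetical; both libc calls may modify their argument, so each works on its own copy:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>

/* Split `path` into its parent directory and final component.
 * Both results are malloc()ed copies owned by the caller.
 * Returns 0 on success, -1 on allocation failure. */
int demo_split_path(const char *path, char **parent, char **name)
{
	char *copy1, *copy2;
	int rc = -1;

	assert(path != NULL && parent != NULL && name != NULL);
	*parent = NULL;
	*name = NULL;
	copy1 = strdup(path);
	copy2 = strdup(path);
	if (copy1 != NULL && copy2 != NULL) {
		*parent = strdup(dirname(copy1));
		*name = strdup(basename(copy2));
		if (*parent != NULL && *name != NULL)
			rc = 0;
	}
	if (rc != 0) {
		free(*parent);
		free(*name);
		*parent = NULL;
		*name = NULL;
	}
	free(copy1);
	free(copy2);
	return rc;
}
```

For "/mnt/lustre/dir/" this yields the parent "/mnt/lustre" and the name "dir", so the trailing slash no longer ends up in either part.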

Add test_230i() to sanity.sh to ensure that lfs migrate -m tolerates
trailing slashes.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I330717da540618052bc5efbb5df9cbe6c4194050
Reviewed-on: http://review.whamcloud.com/17796
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7229 hsm: relax time check of sanity-hsm test_60 42/17742/3
Li Xi [Wed, 30 Dec 2015 10:40:40 +0000 (18:40 +0800)]
LU-7229 hsm: relax time check of sanity-hsm test_60

If the copytool and test script round clock time in different
ways, a strict time check would cause a failure.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I97ebe02d6a0cdd9425ef68e5770e63ac9968ebaa
Reviewed-on: http://review.whamcloud.com/17742
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7578 gnilnd: Revert max_immediate setting 67/17667/3
James Shimek [Mon, 18 Jan 2016 15:55:16 +0000 (10:55 -0500)]
LU-7578 gnilnd: Revert max_immediate setting

max_immediate was changed based on performance testing for
5.2UP04 and 6.0. This caused the eager_recv path to always use vmalloc
when allocating space for new eager messages. The vmalloc path is very
slow, especially when constantly freeing at the same time across all
CPUs.

This change will also cause more messages to be governed by the
service nodes rdma engine.

Modifications:
* max_immediate default is now 2048.
* max_immediate is now read only.
* eager_credits is now writeable at run time.

Signed-off-by: James Shimek <jshimek@cray.com>
Signed-off-by: Chris Horn <hornc@cray.com>
Change-Id: I2754d28b1f05a7aeaaeac7fc5f41f1f36568d79c
Reviewed-on: http://review.whamcloud.com/17667
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6245 libcfs: remove userland headers from libcfs.h 14/16914/8
James Simmons [Thu, 7 Jan 2016 21:16:20 +0000 (16:16 -0500)]
LU-6245 libcfs: remove userland headers from libcfs.h

Currently libcfs.h is used as a master header that
contains all the needed headers. Since Lustre user
land utilities and applications no longer have a
strong dependency on libcfs.h we can remove all
the added user land headers contained in libcfs.h.

Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Change-Id: I6403d109875a1d42d8490a3a1c7635f2dac9fc90
Reviewed-on: http://review.whamcloud.com/16914
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6245 libcfs: make libcfs_ioctl.h and lnetctl.h uapi compliant 43/17643/9
James Simmons [Wed, 27 Jan 2016 15:20:33 +0000 (10:20 -0500)]
LU-6245 libcfs: make libcfs_ioctl.h and lnetctl.h uapi compliant

For UAPI headers, the policy is to only have data
structures shared between user land and kernel
space. Everything that is not a data structure, except a reference
to libcfs_ioctl_data_adjust(), has been removed.
libcfs_ioctl_data_adjust() can go away when the two
module.c files for libcfs are merged. For lnetctl.h,
we remove userland-only function prototypes.

Change-Id: I4e09041a7f0b590d7eb81eda32f0bccdfb9d28ac
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17643
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6684 lfsck: set the lfsck notify as interruptable 82/18082/8
Fan Yong [Tue, 24 Nov 2015 22:47:59 +0000 (06:47 +0800)]
LU-6684 lfsck: set the lfsck notify as interruptable

If the LFSCK engine is notifying the remote LFSCK engine about some
LFSCK event, such as LE_PHASE1_DONE, and the remote server (MDT
or OST) is offline, then the notification RPC will be blocked until
the remote server is online. Anyone who wants to stop the LFSCK in
the meantime has to wait.

To avoid this, make the LFSCK notification RPC
interruptable. Then, even if some remote server is offline, the
running LFSCK can still be stopped.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie9220bc578eb9fe1b1b804a6732fe8ecfba4affb
Reviewed-on: http://review.whamcloud.com/18082
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoNew tag 2.7.66 2.7.66 v2_7_66 v2_7_66_0
Oleg Drokin [Mon, 1 Feb 2016 18:50:06 +0000 (13:50 -0500)]
New tag 2.7.66

Change-Id: I540150c9567b137ea14fb4799fa1e2e942ac6b52
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7710 test: sync all clients in recovery-small 130[acb] 38/18138/3
John L. Hammond [Mon, 25 Jan 2016 20:41:49 +0000 (14:41 -0600)]
LU-7710 test: sync all clients in recovery-small 130[acb]

In recovery-small test_130[abc]() call sync on all clients rather than
just on the client where the test script is running.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1d7aec650d08a6fb417a5df3509b657e9ccda902
Reviewed-on: http://review.whamcloud.com/18138
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7070 tests: Skip sanity 24x based on server version 90/17990/2
James Nunez [Thu, 14 Jan 2016 00:10:36 +0000 (17:10 -0700)]
LU-7070 tests: Skip sanity 24x based on server version

sanity test 24x tests cross MDT rename and link. Cross-MDT
rename and link was added to Lustre after the 2.7.55 tag. Thus,
only run sanity 24x for server version 2.7.56 or later.
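
The gate amounts to a packed version comparison. The test itself is a shell script, so the following is only a sketch of the logic, with a hypothetical packing scheme and function names:

```c
#include <assert.h>

/* Pack a three-part version into one comparable integer.
 * The layout is an assumption of this sketch. */
unsigned int demo_version(unsigned int major, unsigned int minor,
			  unsigned int patch)
{
	assert(minor < 256 && patch < 256);
	return (major << 16) | (minor << 8) | patch;
}

/* Cross-MDT rename and link landed after the 2.7.55 tag, so the
 * test only makes sense against servers at 2.7.56 or later. */
int demo_should_run_24x(unsigned int server_version)
{
	return server_version >= demo_version(2, 7, 56);
}
```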

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I0c2e8c8581b8499ec7f1a25092b17be29aa49c1e
Reviewed-on: http://review.whamcloud.com/17990
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7273 tests: dump stacks upon CT stop failure 82/16782/4
Bruno Faccini [Fri, 9 Oct 2015 12:35:01 +0000 (14:35 +0200)]
LU-7273 tests: dump stacks upon CT stop failure

This patch adds a dump of all thread stacks upon copytool stop failure
at the end of the grace period, in sanity-hsm/wait_copytools().

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I3da4876b55fbc72c941bbf75cc89819acecc82c0
Reviewed-on: http://review.whamcloud.com/16782
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6788 build: Remove build/lbuild backwards compatibility symlink 64/15464/10
Christopher J. Morrone [Wed, 1 Jul 2015 21:34:37 +0000 (14:34 -0700)]
LU-6788 build: Remove build/lbuild backwards compatibility symlink

Enough time has passed since lbuild was moved to contrib to remove
the symlink that we left behind in the build directory to accommodate
Intel's build farm.

Change-Id: I4d3b6038aad0663c3030590d161b6d71d05e6d43
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/15464
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5147 doc: design docs in documentation dir 18/10618/9
Richard Henwood [Thu, 2 Apr 2015 21:13:40 +0000 (16:13 -0500)]
LU-5147 doc: design docs in documentation dir

Move design documents into the ./Documentation directory.
Update references to design documentation in the source code.
Minor readability updates to ldiskfs.txt.

Signed-off-by: Richard Henwood <richard.henwood@intel.com>
Change-Id: Ia4d1662225d019358876caade6f564c48f450fff
Reviewed-on: http://review.whamcloud.com/10618
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
4 years agoLU-7309 lod: notify client retry creation 39/17839/5
Hongchao Zhang [Fri, 27 Nov 2015 08:07:29 +0000 (16:07 +0800)]
LU-7309 lod: notify client retry creation

In lod_alloc_rr, if there are no available OSTs on which to allocate
the object required by a client, and an OSP is connecting
to an OST at the same time, then the client should be told to
retry the creation request later.

Change-Id: I6740edf830dbe736e33e24c92387df371f070570
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/17839
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7623 lov: Get rid of an ugly statfs hack in lov_iocontrol 80/17780/3
Oleg Drokin [Sun, 3 Jan 2016 20:53:50 +0000 (15:53 -0500)]
LU-7623 lov: Get rid of an ugly statfs hack in lov_iocontrol

For some crazy reason ll_obd_statfs decided to decode async flag
passed from userspace and then pass it via a userspace pointer
argument to lov_iocontrol.
This patch moves flags decoding to lov_iocontrol where it belongs.

Change-Id: I1b54e778d60b878fc3fc463c256aad360b2cab21
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17780
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-7623 lnet: Get rid of IOC_LIBCFS_PORTALS_COMPATIBILITY ioctl 79/17779/3
Oleg Drokin [Sun, 3 Jan 2016 20:45:06 +0000 (15:45 -0500)]
LU-7623 lnet: Get rid of IOC_LIBCFS_PORTALS_COMPATIBILITY ioctl

This has been unused for ages and could be safely removed now.

Change-Id: I89af1bcce77119780de623b69ee1c74da1bfcce2
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17779
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-7623 lnet: Get rid of IOC_LIBCFS_DEBUG_PEER hack 78/17778/3
Oleg Drokin [Sun, 3 Jan 2016 20:42:29 +0000 (15:42 -0500)]
LU-7623 lnet: Get rid of IOC_LIBCFS_DEBUG_PEER hack

IOC_LIBCFS_DEBUG_PEER was added back in the stone ages to print debug
statistics on a peer when peer timeout happens.
Redo it properly as a separate LNet API call,
also get rid of "ioctl" forwarding into the underlying LNDs,
since no current LNDs implement this function anymore.

Change-Id: I3ec68a28faf840eb67d6084aa0fa5dcbbe2d7567
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17778
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-3538 dne: Commit-on-Sharing for DNE 30/12530/48
Lai Siyao [Sun, 2 Nov 2014 15:30:23 +0000 (23:30 +0800)]
LU-3538 dne: Commit-on-Sharing for DNE

This patch contains three parts:
1. Sync-on-Cancel for cross-MDT locks, which eliminates the dependency
   between transactions and a distributed transaction that modified a
   remote object. This guarantees that the change made by the
   distributed transaction will not be lost.
2. Enable Commit-on-Sharing for DNE. PW/EX locks will be converted
   to COS locks, but by default they are ignored. When an operation
   finds that it is a distributed transaction, it will lock with the
   LDLM_FL_COS_INCOMPAT flag to check against existing COS locks.
   This eliminates the dependency between a distributed transaction
   and transactions that modify the same local object, and it
   guarantees that a distributed transaction can always be recovered.
3. Striped directory creation needs to ensure that its parent is
   permanent on disk. To ensure this, cache child locks in mkdir.

Sync-on-Cancel for cross-MDT lock

When two operations have a dependency on an object, and the first
operation holds a PW/EX cross-MDT lock on this object, trigger a
transaction commit on the MDT where the object resides to
eliminate the dependency. In short, this patch eliminates the
dependency between locks and an existing PW/EX cross-MDT lock.

This patch contains the following changes:
* enable Sync on Cancel for DNE by default.
* save the cross-MDT lock into tgt_uncommitted_soc_locks after use;
  it will be released upon transaction commit. Note that only
  a lock refcount is taken when the lock is saved; the read/write
  count is released in mdt_object_unlock().
* the saved cross-MDT lock will be discarded upon BAST,
  because the MDT where the object resides will do sync on lock
  cancel.
* use the existing BLOCKING_SYNC_ON_CANCEL mechanism to commit
  the transaction upon cross-MDT lock cancel.

Commit-on-Sharing for DNE

On DNE, Commit-on-Sharing is disabled by default, but an MDT-local
PW/EX lock will be saved as a COS lock, and such a lock will be
ignored in the compatibility check by default unless it is required.
There are two situations where it is required:
1. when a distributed transaction locks a local object, it will
   conflict with COS locks.
2. when a distributed transaction enqueues a cross-MDT lock, it will
   conflict with COS locks.

This patch contains the following changes:
* on DNE, a local PW/EX lock is converted to COS and saved like
  before even when COS is not enabled.
* the above COS locks will be ignored in the lock compatibility check
  by default, so for local operations COS won't take effect. But if
  an operation finds that it may modify a remote MDT object, it will
  take all local locks with COS checked.
* a cross-MDT lock will always conflict with COS locks.
* if the operation is a reint, it will check whether it is a
  distributed operation (involved objects are remote or striped);
  if so, it checks against COS locks when enqueuing locks.

Eliminate dependency in dir creation

Mkdir needs to take a lock on child, so that any subsequent
distributed operation using that directory would observe a conflict
and ensure that the original mkdir is committed.

Benchmark results with createmany/unlinkmany (ops/sec):
         mkdir  rmdir  open  unlink  mknod  unlink
2.6       1194   1310  1314    1185   2242    1396
master     978   1166   937    1028   1681    1202
current    930   1161   918    1018   1691    1202

* 10 createmany/unlinkmany processes running on local client
  (on MDS), 4M dirs/files created/unlinked, and the numbers are
  average of 10 processes.
* for 2.6, each process is running on a separate mountpoint.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I91928d097cbb26bd1e1089c3f8851ac6a6440a69
Reviewed-on: http://review.whamcloud.com/12530
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7638 recovery: do not abort update recovery. 85/17885/6
Di Wang [Thu, 7 Jan 2016 22:40:09 +0000 (17:40 -0500)]
LU-7638 recovery: do not abort update recovery.

When normal recovery times out, if there are update
replays in the queue, it should still keep the
exports of other MDTs and continue update replay
until recovery is manually aborted.

Add tdtd_recovery_threads_count/waitq to manage
the update recovery threads (which retrieve the update
log), so that during abort these recovery threads
are stopped first, and then the update replay
requests in the list can be cleaned up.

Fix the negative recovery time console message.

Add test cases replay-single 119 and 120 to verify
these cases.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Iedcc4922f1500aedec664ff70266b6d2e9f812de
Reviewed-on: http://review.whamcloud.com/17885
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7490 recovery: abort update recovery once fails 99/17199/23
Di Wang [Fri, 13 Nov 2015 16:55:07 +0000 (08:55 -0800)]
LU-7490 recovery: abort update recovery once fails

If update or MDT-MDT recovery fails, then we abort
the replay and resend, because further updates might
cause filesystem or llog corruption.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: Icc7241e94159f7f46a99fb003643605fe2a13c8d
Reviewed-on: http://review.whamcloud.com/17199
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7054 o2iblnd: less intense allocating retry 70/16470/4
Liang Zhen [Wed, 16 Sep 2015 17:50:07 +0000 (01:50 +0800)]
LU-7054 o2iblnd: less intense allocating retry

ko2iblnd may retry too frequently when growing pools: all schedulers
spin if another thread is in the process of allocating a new
pool and can't finish right away because of high system load.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I21be43c6f77b1ae13d500ecbd6795b6d0099d2f1
Reviewed-on: http://review.whamcloud.com/16470
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7635 utils: Fix lhsmtool_posix interval reporting 78/17878/4
Nathaniel Clark [Thu, 7 Jan 2016 18:25:28 +0000 (13:25 -0500)]
LU-7635 utils: Fix lhsmtool_posix interval reporting

At specified time intervals lhsmtool_posix reports how much data it's
written.  It should report how much data has been written since last
update, but it reports total data written.
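
The intended behavior can be sketched as follows. This is only an
illustrative Python model, not the lhsmtool_posix C code; the class,
method, and field names are hypothetical.

```python
class IntervalReporter:
    """Tracks bytes written and reports the amount since the last report."""

    def __init__(self):
        self.total_written = 0   # bytes written since the copy began
        self.last_reported = 0   # value of total_written at the last report

    def record_write(self, nbytes):
        self.total_written += nbytes

    def report(self):
        # Report the delta since the previous report, not the running total.
        delta = self.total_written - self.last_reported
        self.last_reported = self.total_written
        return delta


reporter = IntervalReporter()
reporter.record_write(100)
first = reporter.report()
reporter.record_write(50)
second = reporter.report()
print(first, second)
```

The second report covers only the 50 bytes written after the first one,
which is the per-interval figure the message describes.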

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I0e85b81fa2a8cf16474cc832bca30bf1425fa81c
Reviewed-on: http://review.whamcloud.com/17878
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Robert Read <robert.read@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7666 llog: use correct size when freeing log header 09/18009/3
John L. Hammond [Thu, 14 Jan 2016 19:33:29 +0000 (13:33 -0600)]
LU-7666 llog: use correct size when freeing log header

In llog_cat_new_log() pass the allocated size of the llog header to
OBD_FREE_LARGE().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib8a2ae8608918a9913b01dda967365cd9f7a3925
Reviewed-on: http://review.whamcloud.com/18009
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7482 tests: fix uninitialized value in llapi_hsm_test 64/17364/3
Frank Zago [Wed, 25 Nov 2015 18:15:32 +0000 (12:15 -0600)]
LU-7482 tests: fix uninitialized value in llapi_hsm_test

llapi_hsm_user_request_alloc doesn't zero the memory, so all the
fields in the returned structure must be set.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ib2d99138a5bab6253c00da5d48ebb90e9679e235
Reviewed-on: http://review.whamcloud.com/17364
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7039 llog: update llog header and size 69/16969/35
Di Wang [Mon, 26 Oct 2015 10:02:29 +0000 (03:02 -0700)]
LU-7039 llog: update llog header and size

Once an update request fails due to eviction or other failures,
all update requests in the sending list should fail as well,
because after the failure the update logs in the following
requests will have a wrong llog bitmap. So once this happens:

1. All requests in the sending list are invalidated.
2. lod_sub updates the llog header from the remote target.
3. The sending list can then accept new requests.

Also a few other fixes for llog corruption

1. The size in the OSP cache is not safe, because no lock
protects it. So add lgh_write_offset to the loghandle to
track the write offset for the remote update llog, and revalidate
the offset while updating the llog header.

2. Roll back the lgh_index and bitmap if adding new records
fails.
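
The rollback in item 2 can be illustrated with a toy model. This is a
sketch under stated assumptions, not the llog code: LogHandle,
add_record, and the field names are hypothetical stand-ins for the
lgh_index and header bitmap mentioned above.

```python
class LogHandle:
    """Toy stand-in for an llog handle; field names are illustrative."""

    def __init__(self, nbits=8):
        self.last_index = 0             # models lgh_index
        self.bitmap = [False] * nbits   # models the llog header bitmap

    def add_record(self, write_fn):
        saved_index = self.last_index
        index = saved_index + 1
        # Reserve the slot before attempting the write.
        self.last_index = index
        self.bitmap[index] = True
        try:
            write_fn(index)
        except OSError:
            # Roll back so a failed write leaves index and bitmap unchanged.
            self.bitmap[index] = False
            self.last_index = saved_index
            raise
        return index


def failing_write(index):
    raise OSError("simulated write failure")


handle = LogHandle()
try:
    handle.add_record(failing_write)
except OSError:
    pass
print(handle.last_index, any(handle.bitmap))
```

After the failed write the handle looks exactly as it did before the
attempt, so the next record reuses the same index.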

Add replay-single.sh 118 to verify the case.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I2d3a700d3363867ac60aeb6b7641eceb65dfe12a
Reviewed-on: http://review.whamcloud.com/16969
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7324 lnet: Use after free in lnet_ptl_match_delay() 40/17840/3
Olaf Weber [Wed, 6 Jan 2016 12:50:03 +0000 (13:50 +0100)]
LU-7324 lnet: Use after free in lnet_ptl_match_delay()

In lnet_ptl_match_delay() we check msg->msg_rx_delayed to see whether
the message has been added to the delay queue. But this check is done
after lnet_ptl_unlock() and lnet_res_unlock(), and the message can be
processed and freed before the check.

Replace the check with checking rc against LNET_MATCHMD_NONE, which
is how the callers of lnet_ptl_match_delay() know whether the message
was added to the delay queue. To make this work we reset rc in the
loop when there was no match and the message hasn't been delayed. In
addition reorganize the code and add comments to clarify the logic.

In lnet_ptl_match_md() a similar msg->msg_rx_delayed is replaced for
the same reason.

Signed-off-by: Olaf Weber <olaf@sgi.com>
Change-Id: Ifbc6573664fdc4849b9155b6102c8589e692996b
Reviewed-on: http://review.whamcloud.com/17840
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7591 test: Applying filtering in t32_test 73/17973/6
Saurabh Tandan [Wed, 13 Jan 2016 00:32:36 +0000 (17:32 -0700)]
LU-7591 test: Applying filtering in t32_test

With the SELinux feature enabled on the client, conf-sanity
test_32b failed with 'list verification failed' and
'Host key verification failed'. With SELinux enabled
there was an extra '.' in the permission
column.

Hence, filtering is applied to remove that extra
'.' from the permission column. t32_test within conf-sanity
is modified by applying the filter to it.
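
A minimal sketch of such a filter, written in Python for illustration
(the test itself is a shell script, and the regular expression and
function name here are assumptions, not the filter used in t32_test):

```python
import re

# A 10-character "ls -l" mode string followed by the '.' that ls appends
# when the file carries an SELinux security context.
MODE_WITH_DOT = re.compile(r"^([bcdlps-][rwxsStT-]{9})\.(?=\s)")


def strip_selinux_dot(line):
    """Drop the trailing '.' from the permission column, if present."""
    return MODE_WITH_DOT.sub(r"\1", line)


print(strip_selinux_dot("-rw-r--r--. 1 root root 0 Jan  1 00:00 a.txt"))
```

The pattern is anchored at the start of the line, so dots in file names
are left alone.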

Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I62fbb79ecf29bb09fd382ab1ebc62d41f6678280
Reviewed-on: http://review.whamcloud.com/17973
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7649 mgs: skip single OST conf update 24/17924/3
Hongchao Zhang [Mon, 23 Nov 2015 22:22:09 +0000 (06:22 +0800)]
LU-7649 mgs: skip single OST conf update

If some OST is marked to regenerate its configuration by
"tunefs.lustre --writeconf", the newly added OSC entry for
this OST should be marked as "skip" just as the new OSP entry
in MDT(LOD) configuration.

Change-Id: I1f680cb81a4c77fcf48ab2df441c821db5d456b5
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/17924
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7628 lfs: fix NULL pointer check in cb_migrate_mdt_init() 14/17814/3
Emoly Liu [Tue, 5 Jan 2016 03:55:40 +0000 (11:55 +0800)]
LU-7628 lfs: fix NULL pointer check in cb_migrate_mdt_init()

After calling opendir() at the end of cb_migrate_mdt_init(), we should
check *dirp rather than dirp, and we should close the temporary parent
directory before returning an error.
Also, this patch improves the code a little to make it more readable.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I2507cbd6464ea56fba7793431905cdf51136413f
Reviewed-on: http://review.whamcloud.com/17814
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6037 test: skip s-q 37 for old server 48/17648/2
Niu Yawei [Thu, 17 Dec 2015 02:47:38 +0000 (21:47 -0500)]
LU-6037 test: skip s-q 37 for old server

Old servers (prior to 2.6.93) don't have the fix for LU-5006, so the
s-q 37 test should be skipped in interoperability testing.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ic794c3dead462f27fde9ab4b9dfcd2fd1d68db69
Reviewed-on: http://review.whamcloud.com/17648
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6939 nrs: add lock to protect TBF rule linkage 96/17596/3
Li Xi [Thu, 3 Dec 2015 02:49:03 +0000 (10:49 +0800)]
LU-6939 nrs: add lock to protect TBF rule linkage

The lack of lock protection for operations on a rule's list of
clients (nrs_tbf_rule->tr_cli_list) leads to the assertion panic
"ASSERTION( cli->tc_rule == ((void *)0) ) failed". Add a spinlock
to control concurrent access to the list of clients for a rule to
avoid this assertion failure.

Also, some of the fields of the rule are initialized earlier in
nrs_tbf_rule_start() so as to prevent a crash in nrs_tbf_rule_put().

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: Ic9ecb4318329400ac0f5900d185c60d4b9a98c48
Reviewed-on: http://review.whamcloud.com/17596
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Nikitas Angelinas <nikitasangelinas@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7483 test: Modifying filtering in acl for SElinux feature 29/17529/4
Saurabh Tandan [Wed, 9 Dec 2015 19:50:09 +0000 (12:50 -0700)]
LU-7483 test: Modifying filtering in acl for SElinux feature

With the SELinux feature enabled on the client, sanity.sh test_103a
failed with 'Host key verification failure'. The "ls -l" command was
producing a '.' at the end of the permission column to indicate extra
security attributes when SELinux is enabled.

Hence, modify the filtering so that '.' is not allowed in the output.
The modified files are: cp.test, misc.test, permission.test,
setfacl.test, 974.test, 4924.test.

Signed-off-by: Saurabh Tandan <saurabh.tandan@intel.com>
Change-Id: I725c1aa095f1a9feac521675cf29faa0a750598e
Reviewed-on: http://review.whamcloud.com/17529
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6961 ldiskfs: buffer head leak in mmp 41/17841/5
Jadhav Vikram [Tue, 5 Jan 2016 09:32:24 +0000 (15:02 +0530)]
LU-6961 ldiskfs: buffer head leak in mmp

Release bh_check in case of error.
The patch is added for the following kernels:
RHEL7.2 3.10.0-327.3.1.el7
SLES11SP3 3.0.101-0.47.55
SLES12 - since the code is much the same across RHEL7 versions,
the patch from RHEL7 is added to the SLES12 patch series.

Seagate-bug-id: MRP-2337
Signed-off-by: Jadhav Vikram <jadhav.vikram@seagate.com>
Change-Id: Ib3712abcc3e754077882f9302b6065e38e1f014c
Reviewed-on: http://review.whamcloud.com/17841
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7480 tests: sanityn test_14 should be renamed 66/17366/3
Kyrylo Shatskyy [Wed, 21 Oct 2015 09:04:14 +0000 (14:34 +0530)]
LU-7480 tests: sanityn test_14 should be renamed

Made test_14 be run separately from others in this group.
Following sanityn tests have been renamed:
- 14 to 14aa
- 14a to 14ab

Change-Id: Ib407db9b0ab032dd8368a0602b8cb9cc468b3178
Seagate-bug-id: MRP-824
Signed-off-by: Kshipra Namjoshi <kshipra.namjoshi@seagate.com>
Signed-off-by: Kyrylo Shatskyy <kyrylo_shatskyy@xyratex.com>
Reviewed-on: http://review.whamcloud.com/17366
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7329 tests: increase sanity test_60b llog_test line count 70/17670/2
Andreas Dilger [Fri, 18 Dec 2015 08:29:25 +0000 (01:29 -0700)]
LU-7329 tests: increase sanity test_60b llog_test line count

With the increased number of tests in sanity test_60a due to the
landing of LU-6556, LU-6714, and LU-7329, the check test_60b that
detects CWARN/CERROR() rate limiting can be exceeded in some cases,
up to 81 llog_test lines in my testing.

Increase the maximum line count correspondingly.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I08e899f0faed10430aae0d90a54f43aad7500c1e
Reviewed-on: http://review.whamcloud.com/17670
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 snmp: migrate lustre SNMP utilities to use cfs_get_paths 65/17465/9
James Simmons [Thu, 14 Jan 2016 19:43:22 +0000 (14:43 -0500)]
LU-5030 snmp: migrate lustre SNMP utilities to use cfs_get_paths

Move the SNMP lustre tools to use cfs_get_paths so
the tools function with the upstream client as well.

Change-Id: I18d76e10c45a9c8c582e77917a33bb2afa86aac4
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Wang Chao <chao.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/17465
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 utils: migrate most lustre utils to use cfs_get_paths() 62/17462/14
James Simmons [Mon, 18 Jan 2016 14:39:43 +0000 (09:39 -0500)]
LU-5030 utils: migrate most lustre utils to use cfs_get_paths()

Move all the lustre test applications and utilities
except for the code to handle lctl [s|get]_param
to use cfs_get_paths() instead of accessing the
proc file system directly. This automatically
enables support for sysfs used in the upstream
client.

Change-Id: Iecb4a269c00ac8a5e0633d69d0370e9881ee6e33
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Wang Chao <chao.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/17462
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 tests: remove /proc paths from tests 82/17082/17
Andreas Dilger [Mon, 18 Jan 2016 14:41:46 +0000 (09:41 -0500)]
LU-5030 tests: remove /proc paths from tests

Remove the hard-coded /proc paths from cancel_lru_locks().  This was
only kept around because of a legacy implementation of that function
where it listed all lru_size files and then clear'd them manually.

If the filenames contained '.' then "lctl {get_param,set_param}" get
unhappy, but if the "lctl {get_param,set_param}" built-in globbing is
used with a wildcard parameter like "MGC*" then the '.' in the param
does not get replaced by '/' during processing and it just works.

Fix sanity-sec.sh test_24 to use "lctl get_param -R".  I couldn't see
a functional difference between the two read checks in that test,
so only left a single check.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I0aff1dd29d44b9272f56e18f0e443c7fab3ebbe5
Reviewed-on: http://review.whamcloud.com/17082
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Ryan Haasken <haasken@cray.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7661 mgs: restrict MGS_SET_INFO to stripe parameters 82/17982/2
John L. Hammond [Wed, 13 Jan 2016 17:16:28 +0000 (11:16 -0600)]
LU-7661 mgs: restrict MGS_SET_INFO to stripe parameters

The MGS_SET_INFO RPC is only needed for setting the default striping
on a filesystem, so in mgs_set_info() reject attempts to set a
parameter that is not of the form *.lov.stripe{count,size,offset}=*.
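
As an illustration only, the accepted form can be modeled with glob
matching in Python. The real check lives in the C function
mgs_set_info() and may be implemented differently; the function name
and the example parameter strings below are assumptions.

```python
from fnmatch import fnmatchcase

# fnmatch has no brace expansion, so {count,size,offset} is spelled out.
ALLOWED_PATTERNS = [
    "*.lov.stripecount=*",
    "*.lov.stripesize=*",
    "*.lov.stripeoffset=*",
]


def is_allowed_param(param):
    """True if param looks like <something>.lov.stripe{count,size,offset}=<value>."""
    return any(fnmatchcase(param, pattern) for pattern in ALLOWED_PATTERNS)


print(is_allowed_param("lustre.lov.stripecount=4"))
print(is_allowed_param("lustre.mdt.identity_upcall=NONE"))
```

Anything outside the three striping parameters, or a string without an
'=' assignment, is rejected by this model.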

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I5ac693f886fd035ab639e03628849e791e1e2e9a
Reviewed-on: http://review.whamcloud.com/17982
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sbuisson@ddn.com>
Reviewed-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7089 tests: sanity test 99b: find: invalid mode ‘+4’ 77/16177/21
Yang Sheng [Wed, 2 Sep 2015 13:48:44 +0000 (21:48 +0800)]
LU-7089 tests: sanity test 99b: find: invalid mode ‘+4’

Change the -perm +4 pattern to -perm /4; the '+4' form has been deprecated.
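
For reference, GNU find documents "-perm /mode" as matching when any of
the given permission bits are set. A small Python model of that
semantics (the helper name is hypothetical, and this is not part of the
test script):

```python
import stat


def perm_any(file_mode, mask):
    """True if any permission bit in mask is set, like `find -perm /mask`."""
    return (stat.S_IMODE(file_mode) & mask) != 0


print(perm_any(0o100644, 0o4))   # regular file, mode 644
print(perm_any(0o100640, 0o4))   # regular file, mode 640
```

With mask 4 the check is simply "readable by others", which is what the
test relies on.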

Test-Parameters: envdefinitions=ONLY=99 clientdistro=sles12 testlist=sanity
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: If5e04fece8d90193acd89cd42f61e3ee5e79064e
Reviewed-on: http://review.whamcloud.com/16177
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 util: migrate liblustreapi to use cfs_get_paths() 68/17468/8
James Simmons [Thu, 14 Jan 2016 20:15:18 +0000 (15:15 -0500)]
LU-5030 util: migrate liblustreapi to use cfs_get_paths()

Move liblustreapi library to use cfs_get_paths() so
it can be used with the upstream client as well.

Change-Id: I7c2ee3ded74dec0827f1aeaa631b1c334d39c917
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Signed-off-by: Wang Chao <chao.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/17468
Reviewed-by: Ben Evans <bevans@cray.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7174 build: add mmap_cat to .gitignore 45/17845/3
James Simmons [Wed, 6 Jan 2016 16:27:04 +0000 (11:27 -0500)]
LU-7174 build: add mmap_cat to .gitignore

While testing patches, the new mmap_cat application, which is a
build by-product and not part of any patch, shows up in git status.
To avoid adding it by accident, place these by-product files in the
proper .gitignore files.

Change-Id: I11ae5fd73ef0c5238fedf2d7a5673944af421ead
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17845
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7630 mdt: keep FS capability for getattr_name 15/17815/2
Li Dongyang [Tue, 5 Jan 2016 05:43:39 +0000 (16:43 +1100)]
LU-7630 mdt: keep FS capability for getattr_name

This is a follow up of LU-6528.

When "no_subtree_check" is set for an NFS export, nfsd_set_fh_dentry()
doesn't set the correct fsuid explicitly, but raises a capability to
allow exportfs_decode_fh() to reconnect a disconnected dentry into
the dcache.

The patch of LU-6528 fixed the issue for mdt_reint_getattr() but
missed the case for mdt_getattr_name().

LU-6528 added drop_fs_cap to old_init_ucred() to preserve
the capability, but the logic was removed by LU-7199 commit
2aea469a3a; this patch reverts that removal.

This patch also makes sure old_init_ucred() won't fail identity check
when we have a raised capability but not a valid fsuid.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Change-Id: Ia41a8243eb18b1e469529bef186e3239fe9ebc1d
Reviewed-on: http://review.whamcloud.com/17815
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7556 kernel: kernel update RHEL 6.7 [2.6.32-573.12.1.el6] 33/17633/4
Bob Glossman [Tue, 15 Dec 2015 22:59:02 +0000 (14:59 -0800)]
LU-7556 kernel: kernel update RHEL 6.7 [2.6.32-573.12.1.el6]

Update RHEL 6.7 kernel to 2.6.32-573.12.1.el6

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I0a18a4dc18bf43ca12036b8f711390b79f034b45
Reviewed-on: http://review.whamcloud.com/17633
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7276 utils: fix llog_reader loop on empty log 27/17627/4
Andreas Dilger [Thu, 24 Jul 2014 20:19:08 +0000 (14:19 -0600)]
LU-7276 utils: fix llog_reader loop on empty log

If llog_reader tries to read an empty log file (0 bytes) it does not
detect this as an error, since it tries to read 0 bytes from the llog
and then tries to parse an empty llog header.

Return an error if the llog file is smaller than struct llog_log_hdr.
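
The size check can be sketched as follows. This is an illustrative
Python model rather than the llog_reader C code; LLOG_HDR_SIZE is a
placeholder value standing in for sizeof(struct llog_log_hdr), and the
function name is hypothetical.

```python
import os
import tempfile

# Placeholder for illustration; the real value is
# sizeof(struct llog_log_hdr) in the Lustre sources.
LLOG_HDR_SIZE = 8192


def check_llog_size(path, hdr_size=LLOG_HDR_SIZE):
    """Raise ValueError if the file cannot even hold an llog header."""
    size = os.path.getsize(path)
    if size < hdr_size:
        raise ValueError(
            f"{path}: file too small for llog header "
            f"({size} < {hdr_size} bytes)")
    return size


# An empty (0-byte) file is rejected up front instead of being parsed.
with tempfile.NamedTemporaryFile() as empty_log:
    try:
        check_llog_size(empty_log.name)
    except ValueError as err:
        print("rejected:", err)
```

Checking the size before reading avoids parsing a header that was
never actually read from disk.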

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ife482eb6d73b35f09cc70ca4b79bfd7b053ebb35
Reviewed-on: http://review.whamcloud.com/17627
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7569 o2iblnd: avoid intensive reconnecting 92/17892/3
Liang Zhen [Fri, 8 Jan 2016 15:30:47 +0000 (23:30 +0800)]
LU-7569 o2iblnd: avoid intensive reconnecting

When there is a connection race between two nodes and one side
of the connection is rejected by the other side, o2iblnd will
reconnect immediately. This generates a lot of
thrashing if:

- the race winner is slow and can't send out its connection
  request in a short time.
- the remote side leaves a cmid in TIMEWAIT state, which will reject
  future connection requests.

To resolve this problem, this patch changes the reconnection
behavior: a reconnection is submitted by connd only if a zombie
connection is being destroyed and there is a pending
reconnection request for the corresponding peer.

Also, after a few rejections, reconnection will have a time
interval between each attempt.
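
The delay policy in the last paragraph might look like the sketch
below. The threshold and interval values are invented for
illustration, as is the function name; the patch itself defines the
actual numbers in the o2iblnd C code.

```python
def reconnect_delay(rejections, free_retries=3, interval=1.0):
    """Seconds to wait before the next reconnect attempt.

    The first few rejections are retried without delay; after that
    every attempt is spaced out by a fixed interval.
    """
    if rejections <= free_retries:
        return 0.0
    return interval


for count in range(1, 6):
    print(count, reconnect_delay(count))
```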

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I45f010a27ef0b337c225a2354845db8cb6bb9969
Reviewed-on: http://review.whamcloud.com/17892
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Tested-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 util: delete no longer functional lltrack_stats application 04/17904/3
James Simmons [Mon, 11 Jan 2016 14:30:29 +0000 (09:30 -0500)]
LU-5030 util: delete no longer functional lltrack_stats application

While cleaning up the lustre utilities to remove direct
procfs access, one of the applications, lltrack_stats, had
to be updated. While testing I discovered that this application
actually calls an external perl script, llstat.pl, which
was removed a very long time ago, which means this app
doesn't even work. Since this is the case we can just
delete this non-functional application.

Change-Id: I39189393060d592272dd7112904cd598912d8271
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17904
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 utils: fix lnet/utils/debug.c compile issue 00/17900/3
James Simmons [Mon, 11 Jan 2016 14:38:27 +0000 (09:38 -0500)]
LU-5030 utils: fix lnet/utils/debug.c compile issue

The source file debug.c will fail to compile if gcc
uses the flag -Werror=format-security. The solution
is to add "%s" to cfs_get_param_path().

Change-Id: If0fb438010f692e11432aa1539218d16c9e4548e
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17900
Tested-by: Jenkins
Reviewed-by: Ryan Haasken <haasken@cray.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6401 headers: move swab functions to new header files 39/16339/11
Ben Evans [Fri, 4 Sep 2015 14:26:02 +0000 (09:26 -0500)]
LU-6401 headers: move swab functions to new header files

Create headers for pack_generic.c and llog_swab.c
Reference only where needed.

Signed-off-by: Ben Evans <bevans@cray.com>
Change-Id: I0942d2b7e3d60994c43832a94625fa300bac6617
Reviewed-on: http://review.whamcloud.com/16339
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7382 llite: Fix iovec references accounting in ll_file_aio_read/write 32/17632/6
Andriy Skulysh [Wed, 16 Dec 2015 14:05:31 +0000 (16:05 +0200)]
LU-7382 llite: Fix iovec references accounting in ll_file_aio_read/write

lti_local_iov is used to store the iovec in the single-segment case.

A reference on lu_env needs to be held during the
call to ll_file_write_iter() or ll_file_read_iter().

Otherwise an assertion fails:
vvp_io_update_iov()) ASSERTION( vio->vui_tot_nrsegs >= vio->vui_iter->nr_segs ) failed

Change-Id: Iaff4c81c6ced9ac0e1557dd0eb1fab5205b48e28
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/17632
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7586 test: wait for remove in sanity-hsm test_406() 28/17828/2
John L. Hammond [Tue, 5 Jan 2016 20:28:00 +0000 (14:28 -0600)]
LU-7586 test: wait for remove in sanity-hsm test_406()

In sanity-hsm test_406() wait for the remove to complete before
continuing and call 'lfs mv' with the parent directory instead of the
individual file.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Idc22747c3e9da35ca40671735f18889345af5d9d
Reviewed-on: http://review.whamcloud.com/17828
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7647 lnet: remove annoying message in parse_nidrange 16/17916/2
Li Xi [Sun, 10 Jan 2016 13:26:39 +0000 (21:26 +0800)]
LU-7647 lnet: remove annoying message in parse_nidrange

When setting TBF rules of jobid, parse_nidrange() prints warning
messages. However, this is unnecessary and annoying since parsing
a TBF rule will always try to parse the jobid like a nid.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: Idaa9525991bceadaafbb9d72b723e4f54b6dbe14
Reviewed-on: http://review.whamcloud.com/17916
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6684 lfsck: stop lfsck even if some servers offline 32/17032/6
Fan Yong [Wed, 23 Sep 2015 05:40:46 +0000 (13:40 +0800)]
LU-6684 lfsck: stop lfsck even if some servers offline

During LFSCK scanning, some servers (MDT or OST) may be offline.
If the LFSCK needs to talk to such an offline server, the related
RPC triggers a reconnect to it, and the LFSCK engine has to wait
until the server comes back online or someone deactivates it by
force. If the admin wants to stop the LFSCK in that situation,
the stop request is blocked, which is undesirable.

This patch allows the lfsck_stop sponsor to send a SIGINT signal
to the LFSCK engine to wake it from the indefinite wait, so that
the LFSCK can be stopped even if some servers are offline.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I07e7ae7ca98ebf213888b58d615ae8001d28afbe
Reviewed-on: http://review.whamcloud.com/17032
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7422 mdt: fix ENOENT handling in mdt_intent_reint 77/17177/5
Sergey Cheremencev [Fri, 23 Oct 2015 16:13:51 +0000 (19:13 +0300)]
LU-7422 mdt: fix ENOENT handling in mdt_intent_reint

In the DISP_OPEN_CREATE case, the client expects a valid
fid value in the reply when it_status == 0.
When reint_open returns ENOENT, the fid is not set and the
client gets a fid filled with zeros. This may cause the following
panic:
ll_prep_inode()) ASSERTION( fid_is_sane(&md.body->fid1) )

Change-Id: I1c8821f547de11709663565ce509044613564bc5
Signed-off-by: Sergey Cheremencev <sergey.cheremencev@seagate.com>
Xyratex-bug-id: MRP-3073
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-by: Alexey Leonidovich Lyashkov <alexey.lyashkov@seagate.com>
Tested-by: Elena V. Gryaznova <elena.gryaznova@seagate.com>
Reviewed-on: http://review.whamcloud.com/17177
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7624 fld: copy userspace buffer 97/17797/4
Bob Glossman [Mon, 4 Jan 2016 19:28:43 +0000 (11:28 -0800)]
LU-7624 fld: copy userspace buffer

Copy the userspace buffer into kernel space before use.

Based on:
 Linux-commit: 48f46e74dc7d1770a69b1dc9ef9a54ab7c3aedc0

    staging: lustre: lustre: fld: lproc_fld.c fixed warning

    fixed warning for line over 80 characters by moving the struct init
    onto a diff line.

Signed-off-by: Anil Belur <askb23@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 Linux-commit: e84962e3afc1665756bd4854c63da662696fb687

    staging: lustre: fix sparse warning on LPROC_SEQ_FOPS macros

    ...

    The patch also fixes one __user pointer direct dereference by
    strncmp() in function fld_proc_hash_seq_write().

Signed-off-by: Tristan Lelong <tristan@lelong.xyz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 Linux-commit: 41dff7ac1a7c97f5532931154bfdf505d7ce1631

    staging: lustre: remove kmalloc from fld_proc_hash_seq_write

    This patch simplifies the fld_proc_hash_seq_write() function
    by removing the dynamic memory allocation.
    The longest fh_name used so far in lustre is 4 characters.
    We use a 8 bytes variable to be on the safe side.

Signed-off-by: Tristan Lelong <tristan@lelong.xyz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I3ca796f12d340753c6fd952587d2592dcfbc80c8
Reviewed-on: http://review.whamcloud.com/17797
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
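
The LU-7624 change above copies the user-supplied name into a small kernel-side buffer before comparing it. The following is a minimal userspace sketch of that pattern, not the actual fld code: copy_from_user_sim(), select_hash() and the two hash names are hypothetical stand-ins, and only the 8-byte buffer size is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's copy_from_user(): returns the
 * number of bytes that could NOT be copied (0 on success). */
static unsigned long copy_from_user_sim(void *dst, const void *src, size_t n)
{
	memcpy(dst, src, n);
	return 0;
}

/* Illustrative names only; these are not the real fld hash names. */
static const char *const hash_names[] = { "abc", "wxyz" };

/* Returns the index of the selected name, or a negative errno. */
static int select_hash(const char *ubuf, size_t count)
{
	char name[8];	/* fixed-size copy, so no dynamic allocation */
	size_t i;

	if (count == 0 || count > sizeof(name))
		return -EINVAL;
	if (copy_from_user_sim(name, ubuf, count) != 0)
		return -EFAULT;

	for (i = 0; i < sizeof(hash_names) / sizeof(hash_names[0]); i++) {
		/* Compare the first count bytes of the local copy,
		 * never the user pointer itself. */
		if (strncmp(name, hash_names[i], count) == 0)
			return (int)i;
	}
	return -EINVAL;
}
```

With these assumptions, select_hash("wxyz", 4) selects index 1, while an over-long or unknown name yields -EINVAL. The comparison is a prefix match over count bytes.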
4 years agoLU-7636 gss: avoid useless sec debug log flooding 69/17869/2
Sebastien Buisson [Thu, 7 Jan 2016 10:39:19 +0000 (11:39 +0100)]
LU-7636 gss: avoid useless sec debug log flooding

Avoid useless sec debug log flooding by printing gss context deletion
messages only when context is not null.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I9a93813c52c6edadac435e5ba53ed292b81d6566
Reviewed-on: http://review.whamcloud.com/17869
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7525 mdd: mdd_migrate_create() should not set nlink 96/17496/5
Alex Zhuravlev [Mon, 7 Dec 2015 13:35:40 +0000 (16:35 +0300)]
LU-7525 mdd: mdd_migrate_create() should not set nlink

mdd_migrate_create() should not set nlink at object creation.
The migration process itself takes care of nlink, incrementing
it atomically with dt_insert().

Change-Id: Ia2c8f6cdd77e0808a8060d6e1c542c596612a3ce
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/17496
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7587 tests: fix sanity test_231a on SLES12 22/17722/6
Yang Sheng [Thu, 24 Dec 2015 02:43:30 +0000 (10:43 +0800)]
LU-7587 tests: fix sanity test_231a on SLES12

Flush out the write RPC to ensure it has finished before
we read statistics from proc.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I8b40654cb95b151fe82067ec979dcf9f604355dc
Reviewed-on: http://review.whamcloud.com/17722
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7479 tests: fix lustre-rsync-test test_2a on SLES12 75/17475/8
Yang Sheng [Fri, 4 Dec 2015 06:22:12 +0000 (14:22 +0800)]
LU-7479 tests: fix lustre-rsync-test test_2a on SLES12

On SLES12, the ldd output for linux-vdso.so.1 has the wrong format.
Since it is a virtual file, stripping it out is reasonable.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I335139c4278db771a0c313573dde41eb9b96d649
Reviewed-on: http://review.whamcloud.com/17475
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6627 llite: remove extraneous export parameter 53/14953/8
Andreas Dilger [Wed, 27 May 2015 08:52:36 +0000 (02:52 -0600)]
LU-6627 llite: remove extraneous export parameter

The ll_close_inode_openhandle() and ll_md_close() functions passed an
extra "obd_export *md_exp" parameter, but it turns out that all of the
callers already pass inode->i_sb->s_fs_info->lsi_llsbi->ll_md_exp in
one form or another, so it can just be extracted from "inode" directly
as needed.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Id9c841b11e05d2d16845998ec3f4e09fdf3ebbe5
Reviewed-on: http://review.whamcloud.com/14953
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-519 o2iblnd: check wr_id returned by ib_poll_cq 47/12747/3
Liang Zhen [Mon, 17 Nov 2014 06:35:25 +0000 (14:35 +0800)]
LU-519 o2iblnd: check wr_id returned by ib_poll_cq

If ib_poll_cq returns a positive value without initialising
ib_wc::wr_id (a driver bug), o2iblnd runs into an unpredictable
situation, because ib_wc::wr_id may refer to a stale tx/rx pointer
on the stack.

This indicates a bug in the HCA driver. If it happens, ko2iblnd should
print a console error and then close the current connection.

This patch could also be helpful for LU-5271

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I7851e009bb6cd7df3c299b23b6f338b86ba73b68
Reviewed-on: http://review.whamcloud.com/12747
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
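
One way to detect the uninitialised wr_id described in LU-519 above is to preset the field to a sentinel before polling and check whether the driver overwrote it. This is a userspace sketch under that assumption: struct fake_wc, poll_fn, WRID_INVALID and the three fake drivers are hypothetical stand-ins for struct ib_wc, ib_poll_cq() and a real HCA driver.

```c
#include <errno.h>
#include <stdint.h>

#define WRID_INVALID UINT64_MAX	/* hypothetical sentinel, never a real wr_id */

struct fake_wc {		/* stand-in for struct ib_wc */
	uint64_t wr_id;
};

/* Stand-in for ib_poll_cq(): returns the number of completions. */
typedef int (*poll_fn)(struct fake_wc *wc);

static int poll_checked(poll_fn poll, struct fake_wc *wc)
{
	int rc;

	wc->wr_id = WRID_INVALID;	/* poison before polling */
	rc = poll(wc);
	if (rc > 0 && wc->wr_id == WRID_INVALID)
		return -EINVAL;	/* completion reported, wr_id never set */
	return rc;
}

/* Fake drivers used to exercise the check. */
static int good_driver(struct fake_wc *wc) { wc->wr_id = 42; return 1; }
static int buggy_driver(struct fake_wc *wc) { (void)wc; return 1; }
static int idle_driver(struct fake_wc *wc) { (void)wc; return 0; }
```

A caller that receives -EINVAL from poll_checked() would log the error and close the connection rather than dereference a stale pointer.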
4 years agoLU-4825 osp: rename variables to match /proc entry 32/16032/5
Andreas Dilger [Thu, 20 Aug 2015 05:58:25 +0000 (23:58 -0600)]
LU-4825 osp: rename variables to match /proc entry

Rename the opd_pre_min_grow_count, opd_pre_max_grow_count,
opd_pre_grow_slow, and opd_pre_grow_count to opd_pre_min_create_count,
opd_pre_max_create_count, opd_pre_create_slow and opd_pre_create_count
respectively, for consistency with /proc files osp.*.max_create_count
and osp.*.create_count to make them easier to find, and also many
other internal functions are named "precreate" instead of "grow".

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I3a3ef3a0f13d593e9d0f8cedb53cc690528c0c7f
Reviewed-on: http://review.whamcloud.com/16032
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5030 tests: handle missing common_name for upstream client 75/14675/50
James Simmons [Tue, 5 Jan 2016 23:30:02 +0000 (18:30 -0500)]
LU-5030 tests: handle missing common_name for upstream client

The upstream client has moved from procfs to sysfs,
except for llite.*.lov.common_name, which does not exist
for the upstream client. In the current test suite, common_name
is used to gather information such as stripe count and OST count.
This is a leftover from the days before lfs [s|g]et_stripe existed
and before OSTCOUNT was defined in the test suite configuration
files. Instead, we can replace the use of common_name with
OSTCOUNT and lfs [g|s]et_stripe.

Change-Id: I7fbb7348e009cb9f77ba9847aeb4879dc302e4ee
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/14675
Tested-by: Jenkins
Reviewed-by: Ryan Haasken <haasken@cray.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoNew tag 2.7.65 2.7.65 v2_7_65 v2_7_65_0
Oleg Drokin [Mon, 11 Jan 2016 19:08:05 +0000 (14:08 -0500)]
New tag 2.7.65

Change-Id: I051d19ad984a6d9b2d02305765e4f86b9aa2846a
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7601 build: fix typo for spl/zfs added case handler 29/17829/2
Bruno Faccini [Tue, 5 Jan 2016 20:55:57 +0000 (21:55 +0100)]
LU-7601 build: fix typo for spl/zfs added case handler

A typo was introduced in the handler that detects SPL/ZFS DKMS
versions even when they are still in the "added" state.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I51c269c6c35991cf1b7c26fa44769c1bc73fe20d
Reviewed-on: http://review.whamcloud.com/17829
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7223 test: check mmp feature properly in mmp.sh 84/17884/4
Niu Yawei [Fri, 8 Jan 2016 03:33:43 +0000 (22:33 -0500)]
LU-7223 test: check mmp feature properly in mmp.sh

mmp.sh checks whether the mmp feature is enabled by looking at
the failover_HOST variables, which is not reliable. This patch
changes it to look at the features on the device directly.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ia86d1bfbfa56e01efa06955a36fa34ca62997a54
Reviewed-on: http://review.whamcloud.com/17884
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7504 kernel: kernel update [SLES11 SP3 3.0.101-0.47.71] 13/17413/6
Bob Glossman [Tue, 1 Dec 2015 17:04:02 +0000 (09:04 -0800)]
LU-7504 kernel: kernel update [SLES11 SP3 3.0.101-0.47.71]

Update SLES11 SP3 kernel to 3.0.101-0.47.71

Test-Parameters: mdsdistro=sles11sp3 ossdistro=sles11sp3 \
  clientdistro=sles11sp3 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I84e817f1a9f30070ce74b3a313f4b22696a11cff
Reviewed-on: http://review.whamcloud.com/17413
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6142 lnet: return proper error code 26/17626/3
James Simmons [Mon, 21 Dec 2015 20:56:47 +0000 (15:56 -0500)]
LU-6142 lnet: return proper error code

It is considered bad style in the Linux kernel to
return -1 or a positive number for an error.
Return the appropriate error codes instead.

Change-Id: Icd9729de84f162b07df17caab48e7693459a03e8
Signed-off-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-on: http://review.whamcloud.com/17626
Tested-by: Jenkins
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
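
The convention behind LU-6142 above can be illustrated with a small sketch: return 0 on success and a negative errno value on failure, never -1 or a positive number. parse_port() is a hypothetical helper written for this example, not an LNet function.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical parser: returns 0 on success, a negative errno otherwise. */
static int parse_port(const char *str, int *port)
{
	int val = 0;

	if (str == NULL || *str == '\0')
		return -EINVAL;		/* not a bare -1 */

	for (; *str != '\0'; str++) {
		if (*str < '0' || *str > '9')
			return -EINVAL;
		val = val * 10 + (*str - '0');
		if (val > 65535)
			return -ERANGE;	/* distinct code for a distinct failure */
	}

	*port = val;
	return 0;
}
```

Callers can then propagate the return value unchanged and tell a malformed string (-EINVAL) apart from an out-of-range one (-ERANGE).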
4 years agoLU-7609 tests: fix sanity-krb5 27/17727/5
Sebastien Buisson [Thu, 24 Dec 2015 17:07:06 +0000 (18:07 +0100)]
LU-7609 tests: fix sanity-krb5

Fix various issues with sanity-krb5 test script:
- replace elan with $NETTYPE in test 99;
- start gss daemon on MGS in test 151.

Signed-off-by: Sebastien Buisson <sbuisson@ddn.com>
Change-Id: I5056240f733e82a59d96543b08476a959449f29d
Reviewed-on: http://review.whamcloud.com/17727
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7412 osp_md_read() may pass an ERR_PTR() to osp_update_request_destroy() 22/17522/3
akam [Wed, 9 Dec 2015 07:59:53 +0000 (13:29 +0530)]
LU-7412 osp_md_read() may pass an ERR_PTR() to osp_update_request_destroy()

In osp_md_read(), if osp_update_request_create() fails with an ERR_PTR(),
it should return rather than passing the ERR_PTR() on to
osp_update_request_destroy().

Change-Id: Id4c0c5b3e0619a4e657c22bf27a5679e02164007
Signed-off-by: akam kumar bharathi <azurelustre@gmail.com>
Reviewed-on: http://review.whamcloud.com/17522
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
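
The early return described in LU-7412 above can be sketched in userspace as follows. ERR_PTR(), PTR_ERR() and IS_ERR() are simplified re-implementations of the kernel helpers of the same names; struct request, request_create(), request_destroy() and do_read() are hypothetical stand-ins for the osp functions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Simplified userspace versions of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct request { int payload; };

static int destroy_calls;	/* counts calls, for demonstration only */

static struct request *request_create(int fail)
{
	struct request *req;

	if (fail)
		return ERR_PTR(-ENOMEM);
	req = calloc(1, sizeof(*req));
	return req != NULL ? req : ERR_PTR(-ENOMEM);
}

static void request_destroy(struct request *req)
{
	destroy_calls++;
	free(req);
}

static int do_read(int fail)
{
	struct request *req = request_create(fail);

	if (IS_ERR(req))
		return (int)PTR_ERR(req);	/* never hand an ERR_PTR to destroy */

	req->payload = 1;
	request_destroy(req);
	return 0;
}
```

In this sketch a failed create returns -ENOMEM without ever calling request_destroy(), which would otherwise try to free an encoded error value.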
4 years agoLU-3953 build: Only chmod dkms.mkconf once 16/17516/2
Christopher J. Morrone [Tue, 8 Dec 2015 21:55:52 +0000 (16:55 -0500)]
LU-3953 build: Only chmod dkms.mkconf once

With AC_CONFIG_FILES, the "commands" parameter (the second one) is
applied for _each_ command, not just once.  That means the existing
chmod command was run many times, and several of the times it runs
it complains because the dkms.mkconf file does not yet exist.

This patch fixes that by giving the dkms.mkconf file its own
AC_CONFIG_FILES macro.

Change-Id: Ic71cc5d8c3555d28ff16efa23d564dce28662443
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/17516
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7503 utils: add “--verbose|-v” option to “lfs migrate -m” 20/17420/3
Jian Yu [Wed, 2 Dec 2015 05:54:07 +0000 (21:54 -0800)]
LU-7503 utils: add “--verbose|-v” option to “lfs migrate -m”

“lfs mv” has a -v option to track the migration progress, which is
very useful when migrating a big directory. However, the option went
missing when “lfs mv” was changed to “lfs migrate -m” in commit
849d7d5b1b4cabb7578c3ab5aaf271e90dd33864. This patch adds the option.

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I0729f74f46943736c6ed6ade46ca26aee905f550
Reviewed-on: http://review.whamcloud.com/17420
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
4 years agoLU-6710 tests: Constrain stripe index conf-sanity test 82a 24/15824/3
James Nunez [Fri, 31 Jul 2015 21:00:03 +0000 (15:00 -0600)]
LU-6710 tests: Constrain stripe index conf-sanity test 82a

conf-sanity test 82a specifies the OSTs to stripe a file over.
The OST index is computed as RANDOM * 2 for a maximum of 65534.
Yet, the maximum stripe count is 65532. Thus, the OST index in
conf-sanity test 82a needs to be limited.

Also, change the single use of the deprecated
llapi_stripe_offset_is_valid() to llapi_stripe_index_is_valid().

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I4ceb44d639b88527105c1e8812cbd7590d041316
Reviewed-on: http://review.whamcloud.com/15824
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Saurabh Tandan <saurabh.tandan@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
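
The arithmetic in LU-6710 above can be checked with a short sketch. The limit of 65532 comes from the commit message; RANDOM_MAX is the largest value bash's $RANDOM can return. Wrapping with a modulo, and treating the limit as an exclusive upper bound, are assumptions made for illustration and not necessarily how the test constrains the index.

```c
#include <stdint.h>

#define RANDOM_MAX 32767u		/* largest value of bash's $RANDOM */
#define MAX_STRIPE_COUNT 65532u	/* limit quoted in the commit message */

/* Index as originally computed: RANDOM * 2. */
static uint32_t raw_index(uint32_t random)
{
	return random * 2u;
}

/* One possible constraint: wrap the index into [0, MAX_STRIPE_COUNT). */
static uint32_t constrained_index(uint32_t random)
{
	return raw_index(random) % MAX_STRIPE_COUNT;
}
```

With RANDOM at its maximum, raw_index() gives 65534, which exceeds the limit, while constrained_index() stays below it.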
4 years agoLU-5717 ptlrpc: fix deadlock problem of nrs_tbf_timer_cb 28/12228/5
Li Xi [Wed, 8 Oct 2014 12:11:37 +0000 (20:11 +0800)]
LU-5717 ptlrpc: fix deadlock problem of nrs_tbf_timer_cb

When the TBF timer callback is triggered, nrs_lock could already be
held by the current CPU, which would cause a deadlock. This
patch removes the unnecessary nrs_lock to fix this problem.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I6329e3e71da30a415dbb35b37d79ade118917c6a
Reviewed-on: http://review.whamcloud.com/12228
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
4 years agoLU-7210 lnet: Change connect peer failed cleanup order 04/17004/2
Doug Oucharek [Sat, 31 Oct 2015 01:11:31 +0000 (18:11 -0700)]
LU-7210 lnet: Change connect peer failed cleanup order

A race condition has been found where connd is cleaning up failed
connections, the peer ref counter goes to zero, but we still have
a connecting counter > 0.

One possible race is when we are retrying a connection by
calling kiblnd_connect_peer() which itself fails and decrements
the peer ref counter and gets swapped out before it can decrement
the connecting counter.  connd swaps in and cleans up the
connection where it sees a peer ref counter of 1 and a connecting
counter of 1.  This will trigger the assert seen in LU-7210 when
it decrements the peer counter.

The solution: be sure to decrement the connecting counter
before decrementing the peer counter in the peer connect
failure path.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I2d6ddeae80ac72492a4323a730e3e61c876ebb36
Reviewed-on: http://review.whamcloud.com/17004
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
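
The ordering problem in LU-7210 above can be modelled with a single-threaded sketch that replays the two interleavings by hand. struct peer, peer_decref() and simulate() are hypothetical; the starting counts (two references, one connect in progress) are assumptions chosen to match the description, and a return value of -1 marks the point where the real code would hit its assert.

```c
struct peer {
	int refs;	/* peer reference counter */
	int connecting;	/* connection attempts in progress */
};

/* Drop one reference. Returns -1 where the real code would assert:
 * the last reference is gone while a connect is still counted. */
static int peer_decref(struct peer *p)
{
	if (--p->refs == 0 && p->connecting > 0)
		return -1;
	return 0;
}

/* Replays the failure path of a connect attempt together with connd
 * dropping its own reference. */
static int simulate(int fixed_order)
{
	struct peer p = { .refs = 2, .connecting = 1 };
	int rc;

	if (fixed_order) {
		p.connecting--;		/* connect path: counter first */
		rc = peer_decref(&p);	/* connect path: then the ref */
		if (rc != 0)
			return rc;
		return peer_decref(&p);	/* connd drops the last ref */
	}

	rc = peer_decref(&p);		/* connect path: ref first */
	if (rc != 0)
		return rc;
	rc = peer_decref(&p);		/* connd runs before the counter drops */
	p.connecting--;			/* connect path resumes, too late */
	return rc;
}
```

In this model simulate(1) completes cleanly and simulate(0) reports the assert condition, which is why the connecting counter is decremented first.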
4 years agoLU-5718 o2iblnd: Revert original fix 99/17699/2
Doug Oucharek [Mon, 21 Dec 2015 21:37:57 +0000 (13:37 -0800)]
LU-5718 o2iblnd: Revert original fix

The original fix for this ticket introduced a regression
where bit flags could interfere with each other triggering
asserts.  Also, the focus was on addressing connection
races, but the fix should be expanded to include all
reconnects.

The updated fix is being done under ticket: LU-7569.

Signed-off-by: Doug Oucharek <doug.s.oucharek@intel.com>
Change-Id: I455e43f8a5134f7896ad14c3cd0888b8c08d38d2
Reviewed-on: http://review.whamcloud.com/17699
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7577 mdt: root inode checking for migration 69/17669/5
Di Wang [Wed, 16 Dec 2015 10:46:31 +0000 (02:46 -0800)]
LU-7577 mdt: root inode checking for migration

Do not migrate the root inode, and add a test case
to verify it.

Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I8b7a4211d76cbfc1e1b095c6e8f94841d42bc50f
Reviewed-on: http://review.whamcloud.com/17669
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-7623 Add __user to seq_write buffer arguments 88/17788/2
Oleg Drokin [Sun, 3 Jan 2016 22:28:02 +0000 (17:28 -0500)]
LU-7623 Add __user to seq_write buffer arguments

Updates whole tree and adds forgotten __user attribute,
syncs up prototypes and such.
This keeps sparse happy and helps to ensure user/kernel pointers
correctness.

Change-Id: I54cf7479fffbd8ce211b28f9f3a9de81f600a32e
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/17788
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@yahoo.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Frank Zago <fzago@cray.com>