Whamcloud - gitweb
fs/lustre-release.git
9 years ago LU-6235 scrub: replace the stale OI mapping 45/13745/3
Fan Yong [Thu, 20 Nov 2014 04:54:59 +0000 (12:54 +0800)]
LU-6235 scrub: replace the stale OI mapping

If the OI mapping on the OST contains an invalid entry, the OI
lookup via osd_obj_map_lookup() may return -ENOENT. From the point
of view of OI scrub, this is indistinguishable from the case where
no such OI mapping exists, so the OI scrub wrongly uses the "INSERT"
@ops for osd_scrub_refresh_mapping() to repair the inconsistency.
osd_obj_map_lookup() should therefore return -ESTALE when an invalid
OI mapping exists, so that the OI scrub can use the "UPDATE" @ops
for osd_scrub_refresh_mapping() to repair it.

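The decision described above can be sketched as follows. This is a
minimal illustration, not the osd-ldiskfs code: the enum and the helper
name are hypothetical, and only the -ENOENT/-ESTALE distinction and the
INSERT/UPDATE choice come from the commit message.

```c
#include <errno.h>

/* Hypothetical stand-ins for the scrub repair operations. */
enum sketch_scrub_op {
	SKETCH_OP_NONE,		/* mapping is valid, nothing to repair */
	SKETCH_OP_INSERT,	/* no mapping exists: add one */
	SKETCH_OP_UPDATE,	/* a stale mapping exists: overwrite it */
	SKETCH_OP_FAIL,		/* unrelated error: do not repair */
};

/* Pick the repair operation from the lookup return code. */
static enum sketch_scrub_op sketch_op_for_lookup(int rc)
{
	switch (rc) {
	case 0:
		return SKETCH_OP_NONE;
	case -ENOENT:
		return SKETCH_OP_INSERT;
	case -ESTALE:
		return SKETCH_OP_UPDATE;
	default:
		return SKETCH_OP_FAIL;
	}
}
```

With distinct return codes the caller no longer has to guess whether a
mapping is missing or merely stale.
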
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I013125eb0aaec683ac8f56ec32a30e7858262f87
Reviewed-on: http://review.whamcloud.com/13745
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6239 doc: include missing lnetctl.8 49/13749/2
James Simmons [Thu, 12 Feb 2015 19:14:50 +0000 (14:14 -0500)]
LU-6239 doc: include missing lnetctl.8

Running man lnetctl currently doesn't work on a system
with Lustre installed, because lnetctl.8 is not included
in the generated RPMs. This simple fix ensures lnetctl.8
is included in the RPMs.

Change-Id: I72e2ef2841f5936e1d0def538c239ee2da32d7c3
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13749
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5873 ldiskfs: osd_do_bio()) ASSERTION( iobuf->dr_rw == 0 ) failed 00/12600/9
Andriy Skulysh [Mon, 10 Nov 2014 10:48:11 +0000 (12:48 +0200)]
LU-5873 ldiskfs: osd_do_bio()) ASSERTION( iobuf->dr_rw == 0 ) failed

The bug happens when the 16TB-4KB limit is exceeded during a write.

Add a check for the maximum file size on both the client and server sides.
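
A size check of this kind can be sketched as below. This is only an
illustration under stated assumptions: the macro and function names are
hypothetical, and returning -EFBIG is an assumption, not something the
commit message states. Only the 16TB-4KB limit comes from the text.

```c
#include <errno.h>
#include <stdint.h>

/* 16TB - 4KB, the limit named in the commit message. */
#define SKETCH_MAX_FILE_BYTES ((1ULL << 44) - 4096ULL)

/* Return 0 if [offset, offset + count) fits below the limit. */
static int sketch_check_write(uint64_t offset, uint64_t count)
{
	if (offset > SKETCH_MAX_FILE_BYTES)
		return -EFBIG;
	/* Compare against the remaining room so the sum cannot overflow. */
	if (count > SKETCH_MAX_FILE_BYTES - offset)
		return -EFBIG;
	return 0;
}
```

Checking count against the remaining room, rather than computing
offset + count first, keeps the test correct for offsets near the top
of the 64-bit range.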

Xyratex-bug-id: MRP-2131
Change-Id: I73f0ee803670ada869c2618f275049948668848e
Signed-off-by: Andriy Skulysh <andriy.skulysh@seagate.com>
Reviewed-on: http://review.whamcloud.com/12600
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6222 statahead: add to list before make ready 08/13708/2
Lai Siyao [Tue, 10 Feb 2015 13:44:44 +0000 (21:44 +0800)]
LU-6222 statahead: add to list before make ready

__sa_make_ready() sets the entry ready before adding it to the list,
so revalidate_statahead_dentry()->sa_kill() may free an entry that
is not on any list yet.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: I0b5f7200fb74c88450133d66bf7bf38d9355036f
Reviewed-on: http://review.whamcloud.com/13708
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5549 mdc: cl_default_mds_easize not refreshed 14/11614/12
Ned Bass [Wed, 17 Dec 2014 00:05:42 +0000 (16:05 -0800)]
LU-5549 mdc: cl_default_mds_easize not refreshed

The client_obd::cl_default_mds_easize field should track the largest
observed EA size advertised by the MDT, subject to a reasonable upper
bound.  The MDC uses cl_default_mds_easize to calculate the initial
size of request buffers.  The default value should be small enough to
avoid wasted memory and excessive use of vmalloc(), yet large enough
to accommodate the common use case.

In the current code, the default value is only updated if
client_obd::cl_max_mds_easize is strictly less than
mdt_body::mbo_max_mdsize. This condition is almost never met, because
client_obd::cl_max_mds_easize is computed at client mount-time based
on the number of OSTs in the filesystem, so the MDT won't ever observe
and advertise an EA size larger than that.

As a result, client_obd::cl_default_mds_easize indefinitely retains
its initial value, which is computed at client mount-time based on
the filesystem's default stripe width. Any getattr() requests for
widely striped files will consequently allocate a request buffer
that is too small, forcing reallocations on both the client and
server side. To avoid this, update client_obd::cl_default_mds_easize
independently of the value of client_obd::cl_max_mds_easize.

In addition, this patch includes these changes:

- Add comments to the client_obd structure to clarify what the
  cl_{default,max}_mds_{cookie,ea}size values mean.

- Prevent mdc_get_info() from storing uninitialized data in
  client_obd::cl_max_mds_cookiesize.

- Use 4096 as an upper bound for the default values.  The former
  bound of PAGE_CACHE_SIZE is too large on 64k-page platforms
  (e.g. PPC), so it fails to prevent the vmalloc() spinlock
  contention described in LU-3338. The new value was chosen to
  be large enough to accommodate common use cases while staying
  well below the 16k threshold at which allocations start using
  vmalloc().

- Add test case 27E to ./lustre/tests/sanity.sh.
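
The update rule described above can be sketched as follows. The struct
and function names are hypothetical models of the client_obd fields,
not the MDC code; the 4096 cap and the independence from the maximum
EA size are taken from the commit message.

```c
#include <stdint.h>

/* Upper bound for the default value, from the commit message. */
#define SKETCH_MAX_DEFAULT_EASIZE 4096U

struct sketch_client {
	uint32_t default_easize;	/* models cl_default_mds_easize */
	uint32_t max_easize;		/* models cl_max_mds_easize */
};

/* Raise the default toward the size advertised by the MDT, capped at 4096. */
static void sketch_update_default_easize(struct sketch_client *cli,
					 uint32_t advertised)
{
	uint32_t capped = advertised < SKETCH_MAX_DEFAULT_EASIZE ?
			  advertised : SKETCH_MAX_DEFAULT_EASIZE;

	/* max_easize is deliberately not consulted here. */
	if (cli->default_easize < capped)
		cli->default_easize = capped;
}
```

The default only ever grows, so it tracks the largest EA size observed
so far without exceeding the bound.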

Signed-off-by: Ned Bass <bass6@llnl.gov>
Signed-off-by: Kyle Blatter <kyleblatter@llnl.gov>
Change-Id: I363017844d6af3e6b67b7c03bd206226f9495116
Reviewed-on: http://review.whamcloud.com/11614
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5549 llite: make default_easize writeable in /proc 12/13112/7
Ned Bass [Wed, 17 Dec 2014 00:03:10 +0000 (16:03 -0800)]
LU-5549 llite: make default_easize writeable in /proc

Allow default_easize to be tuned via /proc. A system administrator
might want this if a rare access to widely striped files drives up the
value on a filesystem where narrowly striped files are the more common
case. In practice, however, this is wanted primarily to facilitate
a test case for LU-5549.

- Plumb the necessary interfaces through the LMV and MDC layers
  to expose write access to this value by higher layers.

- Add block comments to modified functions.

- Correct misspelling of "default" in /proc handler function names
  in lustre/llite/lproc_llite.c. The file names in /proc were already
  spelled correctly so there are no issues with backward
  compatibility.

- Convert remaining space-indented lines in lmv_set_info_async()
  to tabs.

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: Iae2c8d0ca28cccf12af9372b1a10a0f9d170fddf
Reviewed-on: http://review.whamcloud.com/13112
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
9 years ago LU-5523 mdt: add --index option to default dir stripe 60/13360/10
wang di [Fri, 16 Jan 2015 00:23:44 +0000 (16:23 -0800)]
LU-5523 mdt: add --index option to default dir stripe

Add an --index option to the default dirstripe EA. If the MDT
finds that the client sent the create request to the wrong MDT
because of the default stripe EA, it returns -EREMOTE; the client
then retrieves the default stripe EA through the xattr cache and
re-creates the object.

Add a delete option (-d) to remove a directory's default
stripe EA.

Add ldo_dir_def_striping_cached and ldo_def_striping_cached
to indicate whether the default striping EA has been cached
in the ldo_object.

ldo_striping_cached indicates whether the object's own striping
has been loaded from disk.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ic2896e9050f1581344db9368b8f7b25bfded3d7d
Reviewed-on: http://review.whamcloud.com/13360
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4647 lctl: add nodemap man pages to lctl 78/13478/3
Kit Westneat [Wed, 21 Jan 2015 02:55:13 +0000 (21:55 -0500)]
LU-4647 lctl: add nodemap man pages to lctl

This patch adds separate man pages for the 8 lctl nodemap commands,
and updates the lctl man page to point to them.

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Ia1350471a2878a8f4057d66a91141ad8dd132bc2
Reviewed-on: http://review.whamcloud.com/13478
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6086 obdclass: check peer's version for MDT-MDT connection 85/13285/5
Fan Yong [Tue, 4 Nov 2014 09:32:15 +0000 (17:32 +0800)]
LU-6086 obdclass: check peer's version for MDT-MDT connection

Because the new DNE/LFSCK changed part of the wire protocol, we cannot
support interoperation between MDTs of different versions. The basic
rules for a permitted connection are:
1) The @major in the connection version should be the same;
2) The @minor in the connection version should be the same;
3) The difference between the @patch values in the connection versions
   must NOT be more than 3.
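
The three rules can be sketched as a small predicate. The struct and
function names are hypothetical; how the real code extracts @major,
@minor and @patch from the connection version is not shown here.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical decoded connection version. */
struct sketch_version {
	int major;
	int minor;
	int patch;
};

/* Apply rules 1) to 3) to a pair of MDT versions. */
static bool sketch_mdt_compatible(struct sketch_version a,
				  struct sketch_version b)
{
	if (a.major != b.major)
		return false;
	if (a.minor != b.minor)
		return false;
	/* The patch levels may differ by at most 3 in either direction. */
	return abs(a.patch - b.patch) <= 3;
}
```

For example, 2.6.94 and 2.6.91 would be allowed to connect under these
rules, while 2.6.94 and 2.6.90 would not.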

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9e77f305c7552ad01e92c97f1eda0756f1291d30
Reviewed-on: http://review.whamcloud.com/13285
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years ago LU-6199 ldiskfs: delete bad WARN_ON_ONCE from ldiskfs 04/13604/5
Bob Glossman [Tue, 3 Feb 2015 00:39:07 +0000 (16:39 -0800)]
LU-6199 ldiskfs: delete bad WARN_ON_ONCE from ldiskfs

Lustre needs to call certain ext4/ldiskfs entry points without locking
i_mutex in order to avoid deadlocks.  This triggers a warning check
in ext4 code that is new in el6.6 and not present in el6.5.  It is
already fixed in the ldiskfs patches for future kernel versions, but
was not fixed for el6.6.

This change adds an ldiskfs patch to eliminate the warning.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia375a6d851a5262c578d722e2f8f4db2ea5249b7
Reviewed-on: http://review.whamcloud.com/13604
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago new tag 2.6.94 2.6.94 v2_6_94 v2_6_94_0
Oleg Drokin [Mon, 9 Feb 2015 06:24:32 +0000 (01:24 -0500)]
new tag 2.6.94

Change-Id: I7cbdaaa209cb5c3db1612f0f9f36ac6668906962

9 years ago LU-6084 ptlrpc: prevent request timeout grow due to recovery 20/13520/9
Mikhail Pershin [Tue, 3 Feb 2015 18:30:14 +0000 (10:30 -0800)]
LU-6084 ptlrpc: prevent request timeout grow due to recovery

This patch fixes the growing request timeout that occurred after
commit 1d889090 for LU-5079. While that commit itself is correct,
it reveals another issue. If a request is processed for a long
time on the server, the client's adaptive timeouts adapt to that
after receiving the reply, and new requests get a bigger timeout.
Another problem is that the server AT history is corrupted by the
recovery request processing time, which is not pure service time
but also includes the time spent waiting for clients to recover.

The patch prevents AT stats updates from early replies on the client
and from the processing time of recovering requests on the server.
ptlrpc_at_recv_early_reply() still updates the current request
timeout as asked by the server, but does not include it in the AT
stats. The real reply will bring that data from the server anyway.

Test-Parameters: alwaysuploadlogs testlist=replay-vbr,replay-dual

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ifcadfd669162013b6ccb386eb2b508bd9f0b22d9
Reviewed-on: http://review.whamcloud.com/13520
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6082 tests: fix too slow nodemap SLOW test 05/13605/5
Kit Westneat [Sat, 7 Feb 2015 08:38:42 +0000 (00:38 -0800)]
LU-6082 tests: fix too slow nodemap SLOW test

The SLOW test for nodemap is too slow to complete. This patch changes
the test to do 000-007, 010-070, 100-700 (octal) instead of testing
all modes, as was done before.
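
To show the size of the reduction, the sketch below enumerates one
reading of the three ranges: every value of the last octal digit, then
the non-zero values of the group digit, then the non-zero values of the
owner digit. The function name is hypothetical and the exact set used
by sanity-sec may differ.

```c
#include <stddef.h>

/*
 * Fill modes[] with 000-007, 010-070 and 100-700 (octal) and return
 * how many entries were written, never more than cap.
 */
static size_t sketch_nodemap_modes(unsigned int *modes, size_t cap)
{
	size_t n = 0;
	unsigned int shift, digit;

	for (shift = 0; shift <= 6; shift += 3) {
		/* Only the first range (000-007) includes the all-zero mode. */
		for (digit = (shift == 0) ? 0 : 1; digit <= 7; digit++) {
			if (n == cap)
				return n;
			modes[n++] = digit << shift;
		}
	}
	return n;
}
```

Under this reading the test covers 22 modes instead of all 512
permission combinations.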

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
mdtcount=1 testlist=sanity-sec

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: Ic92a3718de078ccfd13cf0b6580ab078dfedb144
Reviewed-on: http://review.whamcloud.com/13605
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years ago LU-6109 lfsck: check FID validity before locating object 11/13511/4
Fan Yong [Mon, 10 Nov 2014 09:46:33 +0000 (17:46 +0800)]
LU-6109 lfsck: check FID validity before locating object

It is possible that a FID obtained from iteration or from the linkEA
is corrupted. LFSCK needs to check its validity before locating the
object with it, to avoid hanging or ending up in some other unexpected
state.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I1df8d085bf5abf926d03882457cb8b221633d3aa
Reviewed-on: http://review.whamcloud.com/13511
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
9 years ago LU-6010 lnet: prevent assert on LNet module unload 10/13110/16
Amir Shehata [Wed, 17 Dec 2014 17:35:15 +0000 (09:35 -0800)]
LU-6010 lnet: prevent assert on LNet module unload

There is a use case where lnet can be unloaded while there are
no NIs configured.  Removing lnet in this case will cause
LNetFini() to be called without a prior call to LNetNIFini().
This will cause the LASSERT(the_lnet.ln_refcount == 0) to be
triggered.

To deal with this use case, a reference count on the module is
taken using try_module_get() when LNet is configured.  This way
LNet must be unconfigured before it can be removed, which avoids
the above case.  When LNet is unconfigured, module_put() is called
to release the reference.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I0f283eeb395fa9a076a4d65ab3edd5e7807fc169
Reviewed-on: http://review.whamcloud.com/13110
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6175 ha: add health_check routine to the MDS, MGS and OSD 58/13558/2
Mikhail Pershin [Tue, 27 Jan 2015 23:25:04 +0000 (02:25 +0300)]
LU-6175 ha: add health_check routine to the MDS, MGS and OSD

The patch adds obd_health_check() methods to the MDS and MGS to check
the health of ptlrpc services, as the OST does. It also adds a
health_check() routine directly to the OSD to check that it is mounted
and not read-only.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ib4af652b08e7e3616ebb3b99ce3e4ad03bdd5ab5
Reviewed-on: http://review.whamcloud.com/13558
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6166 utils: fix bug in lr_link 46/13546/2
Wu Libin [Wed, 28 Jan 2015 06:17:06 +0000 (14:17 +0800)]
LU-6166 utils: fix bug in lr_link

When creating a hard link to a file in the root directory, the path
and the file name are the same, so the lengths of the path and the
name are equal in this case.

Signed-off-by: Wu Libin <lwu@ddn.com>
Change-Id: I3a72491efdc041ad0e96d036b04600b76bb646fe
Reviewed-on: http://review.whamcloud.com/13546
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6167 utils: fix bugs in lustre_sync 45/13545/3
Wu Libin [Wed, 28 Jan 2015 05:55:20 +0000 (13:55 +0800)]
LU-6167 utils: fix bugs in lustre_sync

lustre_rsync can run into endless loops and core dumps; this
patch fixes these problems. In lr_cascade_move(), the "curr"
node should be deleted from the "parents" list first, before
moving on to the next lr_cascade_move.

Signed-off-by: Wu Libin <lwu@ddn.com>
Change-Id: I5a5686ab89379da37453d07a5a00df4fd217ee59
Reviewed-on: http://review.whamcloud.com/13545
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6125 test: sanity test_27i defect: missing test_mkdir() 07/13407/4
Elena Gryaznova [Tue, 3 Feb 2015 00:10:14 +0000 (04:10 +0400)]
LU-6125 test: sanity test_27i defect: missing test_mkdir()

Fix sanity test_27i() to call test_mkdir()

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-1194
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Change-Id: I093cb44590b98857189d69d1b8f6e9e9c423d3bc
Reviewed-on: http://review.whamcloud.com/13407
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5510 scrub: ldiskfs_create_inode returns locked inode 87/13187/6
Fan Yong [Thu, 13 Nov 2014 16:45:52 +0000 (00:45 +0800)]
LU-5510 scrub: ldiskfs_create_inode returns locked inode

There was a race condition between creating a new inode and OI scrub:
the OI scrub may find the newly created inode just after the creator
has created it but before the LMA EA is set. Originally, to resolve
this, the creator set the newly created inode's state to
LDISKFS_STATE_LUSTRE_NOSCRUB. But that state is set after the new
inode is unlocked, so the OI scrub still has some chance of finding
the newly created inode with neither LDISKFS_STATE_LUSTRE_NOSCRUB
nor the LMA EA.

As an improvement, this patch makes ldiskfs_create_inode() return
the newly created inode locked. The caller can set more state (not
only for LFSCK, but also for other purposes in future) on the newly
created inode before unlocking it.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Idc1a8fbd3701f7e431ef4b7858cfdf4674d74add
Reviewed-on: http://review.whamcloud.com/13187
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years ago LU-5722 obdclass: reorganize busy object accounting 68/12468/5
Frank Zago [Tue, 28 Oct 2014 22:02:14 +0000 (17:02 -0500)]
LU-5722 obdclass: reorganize busy object accounting

Due to an accounting bug, lsb_busy of a hash bucket can become
larger than the total number of objects in said bucket. A busy object
can be counted more than once. When that happens, a negative value is
returned by the shrinker to Linux's shrink_slab() function. In older
kernels, such as 2.6.32 used in RHEL 6, this will cause an endless
loop inside shrink_slab(), in essence hanging the host.

Instead of trying (and failing) to count the busy objects, count the
objects that are not busy, i.e. the objects that are present on the
lsb_lru list. The number of busy objects is then the difference
between the number of objects in the hash and the objects on the
lsb_lru list.
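
The accounting scheme can be sketched as below. The struct and helper
names are hypothetical and model a single hash bucket; the real
lu_site code is not reproduced here.

```c
#include <assert.h>

/* Hypothetical model of one hash bucket. */
struct sketch_bucket {
	unsigned int nr_total;	/* objects hashed into the bucket */
	unsigned int nr_lru;	/* objects on the LRU list, i.e. not busy */
};

/* A newly hashed object starts out busy (not on the LRU). */
static void sketch_obj_insert(struct sketch_bucket *b)
{
	b->nr_total++;
}

/* The last user dropped the object: it moves onto the LRU. */
static void sketch_obj_idle(struct sketch_bucket *b)
{
	assert(b->nr_lru < b->nr_total);
	b->nr_lru++;
}

/* An idle object is picked up again: it leaves the LRU. */
static void sketch_obj_reuse(struct sketch_bucket *b)
{
	assert(b->nr_lru > 0);
	b->nr_lru--;
}

/* Busy objects are derived, never counted directly. */
static unsigned int sketch_busy(const struct sketch_bucket *b)
{
	return b->nr_total - b->nr_lru;
}
```

Because each object is on the LRU list at most once, the derived busy
count cannot exceed the number of objects in the bucket.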

Change-Id: Ia6973991a1ff7fc53cdf8132bf2aab532934cf94
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12468
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6120 lfsck: notify ever failed server to exit LFSCK 25/13525/3
Fan Yong [Mon, 10 Nov 2014 08:48:08 +0000 (16:48 +0800)]
LU-6120 lfsck: notify ever failed server to exit LFSCK

During the first-stage scanning, the local LFSCK instance records
which OSTs have ever failed to respond to LFSCK verification requests
(perhaps because of network issues or trouble on the OST itself).
Then, before starting the second-stage scanning, the local LFSCK
instance notifies those OSTs via la_sync_failures() to skip orphan
handling, since they missed some OST-object verification.

Originally, after la_sync_failures(), the related OSTs were removed
from the LFSCK targets list regardless of whether la_sync_failures()
succeeded, so subsequent LFSCK notification RPCs were not sent to
those OSTs. That could leave some OSTs unable to exit LFSCK as
expected, and a subsequent LFSCK start would then fail because the
former LFSCK instance had not exited.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Id0283c78527d6a3a6c563de7ce6af1fe2d3f1a30
Reviewed-on: http://review.whamcloud.com/13525
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6050 target: control OST-index in IDIF via ROCOMPAT flag 16/13516/5
Fan Yong [Mon, 10 Nov 2014 20:48:24 +0000 (04:48 +0800)]
LU-6050 target: control OST-index in IDIF via ROCOMPAT flag

Introduce a new flag, OBD_ROCOMPAT_IDX_IN_IDIF, that is stored in the
last_rcvd file. For a newly formatted OST device it is set
automatically; when upgrading from an old OST device, it can be
enabled via the lproc interface osd-ldiskfs.index_in_idif. With the
flag enabled, the IDIF-in-LMA of a newly created OST-object contains
the OST-index; for an existing OST-object, the OSD converts the
old-format IDIF to the new-format IDIF, with the OST-index stored in
the LMA EA, when the OST-object is accessed or via OI scrub. Once the
flag is enabled it cannot be reverted, so the system cannot be
downgraded to the original incompatible version.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9e6e089d54fdb3970bb201eedac8dc09be2cc1c1
Reviewed-on: http://review.whamcloud.com/13516
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6063 kernel: use proper flags for call_usermodehelper 77/13677/2
James Simmons [Fri, 6 Feb 2015 17:43:05 +0000 (12:43 -0500)]
LU-6063 kernel: use proper flags for call_usermodehelper

When a parameter is permanently changed on the MGS, the
MGS sends a changelog packet to the nodes that are
affected by the change. Once the nodes receive the
change, they call the userland utility lctl to change
their local value. When calling a userland application
from the kernel, you specify a flag to control the
interaction with the application. Originally the flag
was set to 0, which is UMH_NO_WAIT, which meant lctl was
being called asynchronously. In older kernels this was
fine since UMH_NO_WAIT and UMH_WAIT_PROC had nearly the
same logic. This changed with newer kernels, which broke
updating our parameters. In addition, UMH_NO_WAIT doesn't
report back an error if something goes wrong with lctl.
The fix is to set the flag to UMH_WAIT_PROC so kernel
space waits until lctl has finished and we get a proper
error code if something does go wrong with lctl.
Secondly, the patch uses the proper flag name instead of
a number in the call_usermodehelper call in
mdt_identity.c so the code is more readable.
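
The difference between waiting and not waiting can be illustrated with
a userspace analogy. This is not kernel code and does not use
call_usermodehelper(); the helper name is hypothetical. It only shows
why waiting for the helper process is what makes its exit status
available to the caller.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run argv[0] with its arguments and wait for it to finish.
 * Returns the helper's exit status, or -1 if it could not be
 * started or did not exit normally.
 */
static int sketch_run_and_wait(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	}
	/* Without this wait the caller would never see the status. */
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A fire-and-forget variant would return right after fork(), which is
the userspace equivalent of losing the error code described above.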

Change-Id: I016fd4342315e9db6ec3ef544bcfb3a477c97b52
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13677
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years ago LU-6154 zfs: striped directory and migration on ZFS 18/13518/5
wang di [Sun, 11 Jan 2015 20:19:30 +0000 (12:19 -0800)]
LU-6154 zfs: striped directory and migration on ZFS

1. Increase/decrease the refcount for the sub_stripe object,
because the refcount for a ZFS directory must be
increased/decreased explicitly.

2. Set up/clean up the sequence service for osd-zfs, so it can
create FIDs for the local OSD.

3. Do not zero dah_eadata in the OSD layer; set it in the MDD
layer instead, so that the striping create process is not
interfered with.

4. Put a terminating 0 at the end of the link data during
migration, since osd-zfs does not do so when reading a link.

5. Create the orphan object with linkEA data, so that if migration
is interrupted, other threads are still able to read entries
from the half-migrated directory; osd-zfs needs to retrieve the
parent FID from the linkEA data when reading directory entries
(see osd_dir_it_rec()).

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I67cbd0b09d2716b163277425066dcf155df68039
Reviewed-on: http://review.whamcloud.com/13518
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6162 kernel: kernel update RHEL6.6 [2.6.32-504.8.1.el6] 60/13560/5
Bob Glossman [Tue, 27 Jan 2015 22:39:59 +0000 (14:39 -0800)]
LU-6162 kernel: kernel update RHEL6.6 [2.6.32-504.8.1.el6]

Update RHEL6.6 kernel to 2.6.32-504.8.1.el6

Test-Parameters: clientdistro=el6.6 mdsdistro=el6.6\
  ossdistro=el6.6 mdsfilesystemtype=ldiskfs\
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: If1bf2bca5f70e305be4859d8f5f196b3574abed3
Reviewed-on: http://review.whamcloud.com/13560
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-6106 tests: Skip test_16 to test_23 if MDS version older than 2.6.90 09/13509/4
Wei Liu [Fri, 23 Jan 2015 01:12:44 +0000 (17:12 -0800)]
LU-6106 tests: Skip test_16 to test_23 if MDS version older than 2.6.90

Skip sanity-sec test_16 to test_23 if MDS version older than 2.6.90

Change-Id: I0f95dae3a7a0bdef52160a3ca76fefac6765007c
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/13509
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago Revert "LU-1214 ptlrpc: start minimum service threads" 47/13647/2
Oleg Drokin [Wed, 4 Feb 2015 18:11:53 +0000 (18:11 +0000)]
Revert "LU-1214 ptlrpc: start minimum service threads"

This seems to have broken something and causes widespread conf-sanity
failures. See LU-6206 for more info.

This reverts commit 43f96aa9cc3cec66d9b9e0a03e5fc23e094525e7.

Change-Id: Ie0d7124c72c7e590581ec92c2ab49c3d7bfa09fe
Reviewed-on: http://review.whamcloud.com/13647
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5829 ptlrpc: remove unnecessary EXPORT_SYMBOL 10/12510/13
Frank Zago [Fri, 9 Jan 2015 18:21:12 +0000 (12:21 -0600)]
LU-5829 ptlrpc: remove unnecessary EXPORT_SYMBOL

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Change-Id: I5dad1093f136577fa268cd7ecbebd1d660cfa8ef
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12510
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4870 lfsck: lock old MDT-object in migrating 82/13182/6
Fan Yong [Tue, 21 Oct 2014 13:54:21 +0000 (21:54 +0800)]
LU-4870 lfsck: lock old MDT-object in migrating

According to the current metadata migration implementation, before the
old MDT-object is removed, both the new MDT-object and the old
MDT-object reference the same LOV layout. If the layout LFSCK finds
the new MDT-object in that window, it will regard the related
OST-object(s) as a multiple-referenced case and will try to create new
OST-object(s) for the new MDT-object. To avoid this, the layout LFSCK
needs to lock the old MDT-object before confirming the
multiple-referenced case.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9e42cb86683c33bedfef01ae7f6e2cc305f1137d
Reviewed-on: http://review.whamcloud.com/13182
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-4712 llite: lock the inode to be migrated 89/9689/8
wang di [Mon, 17 Mar 2014 18:23:02 +0000 (11:23 -0700)]
LU-4712 llite: lock the inode to be migrated

Because the inode and its connected dentries will be cleared
out of the cache after migration, the inode needs to be locked
during the migration.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ibbbb33473de1a67df85ef8930debcf22cd775bcb
Reviewed-on: http://review.whamcloud.com/9689
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5242 osd-zfs: umount hang in sanity 133g 00/13600/2
Isaac Huang [Mon, 2 Feb 2015 23:43:30 +0000 (16:43 -0700)]
LU-5242 osd-zfs: umount hang in sanity 133g

Disable tests 78, 79 and 80, which are known to trigger a
txg_wait_open() hang that would block umount forever.

Change-Id: I3770c11120790f55ecc021cc054971e00acc951b
Signed-off-by: Isaac Huang <he.huang@intel.com>
Test-Parameters: mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity,sanity,sanity,sanity,sanity,sanity
Reviewed-on: http://review.whamcloud.com/13600
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years ago LU-5820 lfsck: use multiple namespace LFSCK trace files 09/12809/19
Fan Yong [Thu, 6 Nov 2014 11:59:27 +0000 (19:59 +0800)]
LU-5820 lfsck: use multiple namespace LFSCK trace files

The namespace LFSCK uses a trace file to record the FIDs of objects
that have multiple hard links, have a remote name entry, contain
some uncertain inconsistency, and so on. A single namespace LFSCK
trace file may not be efficient, especially when there are millions
of FIDs to record. So use multiple namespace LFSCK trace files, with
a per-trace-file semaphore to control concurrent access to each
trace file.

For Lustre-2.x (x <= 6), the LFSCK used LFSCK_NAMESPACE_MAGIC_V1 as
the namespace trace file magic. After a downgrade to such an old
release, the old LFSCK will not recognize the new
LFSCK_NAMESPACE_MAGIC_V2 in the new trace file; it will then reset
the whole LFSCK rather than fail to start. A similar reset happens
when upgrading from such an old release.

This patch also drops some repeated FID recording in the namespace
LFSCK trace file. The related FID should already have been recorded
in the trace file via lfsck_namespace_exec_oit(), so it is
unnecessary to record it again when scanning the directory.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iec27c52b21789dbde1e4c1153f61162f028ceac3
Reviewed-on: http://review.whamcloud.com/12809
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
9 years agoLU-6095 tests: define TRUNCATE program for racer 01/13501/6
Jinshan Xiong [Thu, 22 Jan 2015 20:52:12 +0000 (12:52 -0800)]
LU-6095 tests: define TRUNCATE program for racer

In file_truncate.sh of racer, TRUNCATE was not defined for remote
clients. Let it point to tests/truncate in case it's not defined.

The same problem affects MCREATE and LFS; fix them as well and do
some cleanup.

Test-Parameters: alwaysuploadlogs testlist=racer
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ie6898f1573bd19810a2d8f14dc0aa375d3774e08
Reviewed-on: http://review.whamcloud.com/13501
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5357 lod: hold thandle during lod_trans_stop 20/13420/2
wang di [Wed, 14 Jan 2015 13:25:31 +0000 (05:25 -0800)]
LU-5357 lod: hold thandle during lod_trans_stop

Hold thandle during lod_trans_stop, to avoid the thandle
being freed in local transaction stop.
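
A minimal sketch of the idea, assuming a simple reference-counted
handle. All names here (sketch_handle, sketch_get(), sketch_put(),
sketch_local_stop()) are hypothetical and are not the lod or thandle
API; the point is only that an extra reference taken before the local
stop keeps the handle valid until the caller is done with it.

```c
#include <stdlib.h>

/* Hypothetical reference-counted handle, not the real struct thandle. */
struct sketch_handle {
    int refcount;
    int stopped;
};

static int sketch_freed;    /* counts how many handles were freed */

static void sketch_get(struct sketch_handle *h)
{
    h->refcount++;
}

static void sketch_put(struct sketch_handle *h)
{
    if (--h->refcount == 0) {
        free(h);
        sketch_freed++;
    }
}

/* The local stop consumes one reference, which may be the last one. */
static void sketch_local_stop(struct sketch_handle *h)
{
    h->stopped = 1;
    sketch_put(h);
}

static int sketch_trans_stop(struct sketch_handle *h)
{
    int stopped;

    sketch_get(h);          /* pin the handle for the whole stop */
    sketch_local_stop(h);   /* drops a reference, but not ours */
    stopped = h->stopped;   /* still safe to touch the handle */
    sketch_put(h);          /* release the pin; handle may go away now */
    return stopped;
}
```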

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I2448d725e35b119a61bbfb2e9567446d203bec16
Reviewed-on: http://review.whamcloud.com/13420
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6115 test: sanity 133g defect: missing return after "skip" 89/13389/3
Elena Gryaznova [Wed, 14 Jan 2015 21:00:51 +0000 (01:00 +0400)]
LU-6115 test: sanity 133g defect: missing return after "skip"

Patch fixes test_133g(): add return() after skip()

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-2153
Change-Id: I1787e1300930542c5a34c5a7e8bd277df28bf17a
Reviewed-on: http://review.whamcloud.com/13389
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
9 years agoLU-5829 obdclass: remove unnecessary EXPORT_SYMBOL 23/13323/4
Frank Zago [Fri, 9 Jan 2015 18:25:18 +0000 (12:25 -0600)]
LU-5829 obdclass: remove unnecessary EXPORT_SYMBOL

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Removed now unused function cat_cancel_cb() and fixed 3 comments in
test code mentioning this function.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ia0fa1e8e65f197235c04997f56b49d8fd87d4fd6
Reviewed-on: http://review.whamcloud.com/13323
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5829 misc: remove unnecessary EXPORT_SYMBOL 21/13321/2
Frank Zago [Fri, 9 Jan 2015 18:28:13 +0000 (12:28 -0600)]
LU-5829 misc: remove unnecessary EXPORT_SYMBOL

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ibb6dd722c47c7c76275ac24f1a6d8a4a988f433a
Reviewed-on: http://review.whamcloud.com/13321
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-2430 utils: fix "lfs mv" command parsing 61/13161/2
Andreas Dilger [Sat, 20 Dec 2014 00:03:34 +0000 (17:03 -0700)]
LU-2430 utils: fix "lfs mv" command parsing

Fix the lfs_mv() long option parsing so that it uses "--mdt-index"
instead of incorrectly requiring "----mdt-index" for the short "-M"
option.
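
For illustration, a getopt_long() table of the kind involved might
look like the sketch below. The function name and the surrounding code
are hypothetical; the old lfs_mv() table is not shown in this message,
so the assumption here is that the name field itself contained the
dashes. getopt_long() expects the bare name and matches the leading
"--" from the command line on its own.

```c
#include <getopt.h>
#include <stdlib.h>

/* Returns the index given by -M or --mdt-index, or -1 if it is
 * missing or an unknown option is seen. Hypothetical helper. */
static int sketch_parse_mdt_index(int argc, char *const argv[])
{
    /* The name is written without leading dashes. */
    static const struct option long_opts[] = {
        { "mdt-index", required_argument, NULL, 'M' },
        { NULL, 0, NULL, 0 }
    };
    int index = -1;
    int c;

    optind = 1;     /* restart scanning from the first argument */
    opterr = 0;     /* keep getopt quiet on unknown options */
    while ((c = getopt_long(argc, argv, "M:", long_opts, NULL)) != -1) {
        if (c != 'M')
            return -1;
        index = atoi(optarg);
    }
    return index;
}
```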

Fix up some error messages in lfs_mv() as well, and change a test
case to use the long option form.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I20ffde97fb5d31364e91d6b21d407eb3323ebbe5
Reviewed-on: http://review.whamcloud.com/13161
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5478 lov: get rid of obd_* typedefs 44/13144/5
Dmitry Eremin [Fri, 19 Dec 2014 13:42:51 +0000 (16:42 +0300)]
LU-5478 lov: get rid of obd_* typedefs

We have a bunch of typedefs for common things that made no sense
and hid the actual type from plain view.
Replace them with proper uXX or sXX types.
The exception is lustre_idl.h and lustre_ioctl.h, where
they are replaced with __uXX and __sXX so that the headers can be
included in userspace. Replace obd_off with loff_t.

patch 3 in series: modify lov/lmv

Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I9dfcc0bac691160c64ef8a120887b160c0c6986f
Reviewed-on: http://review.whamcloud.com/13144
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
9 years agoLU-2675 lnet: assume a kernel build 21/13121/6
John L. Hammond [Wed, 28 Jan 2015 16:40:23 +0000 (11:40 -0500)]
LU-2675 lnet: assume a kernel build

In lnet/lnet/ and lnet/selftest/ assume a kernel build (assume that
__KERNEL__ is defined). Remove some common code only needed for user
space LNet.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I79d6f50bac895116628c93c35e23f64dd102780f
Reviewed-on: http://review.whamcloud.com/13121
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5957 mdt: Update MDT flags after layout swap 77/12877/2
Henri Doreau [Thu, 27 Nov 2014 13:51:09 +0000 (14:51 +0100)]
LU-5957 mdt: Update MDT flags after layout swap

Swap MOF_LOV_CREATED flags between MDT objects after a layout swap to
guarantee that layout will be re-created on next write if its LOV has
been deleted.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: I3d0497d8be2a7335c1fb43e10af2b222243e6a81
Reviewed-on: http://review.whamcloud.com/12877
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-2445 lfs: fixed support for lfs migrate -b 27/12627/4
Frank Zago [Fri, 7 Nov 2014 21:15:15 +0000 (15:15 -0600)]
LU-2445 lfs: fixed support for lfs migrate -b

-b is the short alias for --block to the lfs migrate command, but
wasn't set in the call to getopt_long().

Change-Id: Ie7397b994a34de71b9978cf51b55961b4c9ded69
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12627
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5521 grant: quiet message on grant waiting timeout 46/12146/6
Johann Lombardi [Mon, 1 Sep 2014 10:38:31 +0000 (12:38 +0200)]
LU-5521 grant: quiet message on grant waiting timeout

Use at_max in osc_enter_cache() to bound how long we wait for grant
space before switching to synchronous I/Os. Do not print a message
on the console when the timeout is hit, since such a long wait can
be legitimate on a flaky network (i.e. BRW is resent multiple times).

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I63b40783381f6133e2f77dbc0f827e13f571ccd2
Reviewed-on: http://review.whamcloud.com/12146
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5023 tests: check FID seq properly for sanity-lfsck t_11b 76/10276/4
Fan Yong [Fri, 10 Oct 2014 18:14:04 +0000 (02:14 +0800)]
LU-5023 tests: check FID seq properly for sanity-lfsck t_11b

Guarantee that the right FID seq is checked.

Also improve error handling in the scripts.

Try to collect more logs.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I51cb75c15cc7421721ea0bc149fc2a5a72c13cc6
Reviewed-on: http://review.whamcloud.com/10276
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6081 user: use random() instead of /dev/urandom 77/13277/12
Patrick Farrell [Tue, 9 Dec 2014 04:26:28 +0000 (22:26 -0600)]
LU-6081 user: use random() instead of /dev/urandom

/dev/urandom gives good random numbers, but using it is very prone to
error, and opening/closing the device every time a number is needed
takes time.

Instead, initialize the library with our seed by calling srandom(),
and then use random(). Export a boolean variable
liblustreapi_initialized to let applications check that the library
was properly initialized by the loader.
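
A minimal sketch of the seed-once pattern, using the POSIX srandom()
and random() calls. The names sketch_init(), sketch_next(), and
sketch_initialized are hypothetical stand-ins (the last one loosely
mirrors liblustreapi_initialized); how the library derives its seed is
not shown in this message, so the seed is simply a parameter here.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* for srandom() and random() */
#endif
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical flag that callers can check before relying on the
 * generator. */
static bool sketch_initialized;

/* Seed the generator once, e.g. from a library constructor. */
static void sketch_init(unsigned int seed)
{
    srandom(seed);
    sketch_initialized = true;
}

/* Cheap per-call path: no device open/read/close is needed. */
static long sketch_next(void)
{
    return random();
}
```

With the same seed the sequence repeats, so the quality of the seed
still matters even though each call is cheap.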

Signed-off-by: frank zago <fzago@cray.com>
Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: Ie6ced0d39df29d7054919e239add58a23115ec35
Reviewed-on: http://review.whamcloud.com/13277
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-1214 ptlrpc: start minimum service threads 76/2876/18
Andreas Dilger [Wed, 14 Jan 2015 14:55:44 +0000 (09:55 -0500)]
LU-1214 ptlrpc: start minimum service threads

If the ptlrpc_min_threads parameter is changed via /proc after the
service has started, then at least the requested number of service
threads should be started.  Otherwise this parameter would only be
used at initial thread startup and ignored if changed via /proc.

Fix conf-sanity.sh test_52[ab] to verify that at least the minimum
number of threads has been started when threads_min parameter is
changed, instead of just checking the parameter itself.  Also fix
test code style for 80-column line wrapping and tabs for indents.

The head utility does not always support the shortcut "-1" option. It
should be specified as "-n1".

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I6e4bb4131d7500a93952b64102f885c76558cab0
Reviewed-on: http://review.whamcloud.com/2876
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5816 target: don't trigger watchdog waiting in recovery 72/12672/7
Hongchao Zhang [Thu, 9 Oct 2014 22:43:31 +0000 (06:43 +0800)]
LU-5816 target: don't trigger watchdog waiting in recovery

In target_recovery_thread, the process should not be considered
to be in the "blocked" state while it is waiting for something to
happen; otherwise, the kernel watchdog will print:

task tgt_recov:19764 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
tgt_recov     D 0000000000000003     0 19764      2 0x00000000
Call Trace:
check_for_clients+0x0/0x70 [ptlrpc]
target_recovery_overseer+0x9d/0x230 [ptlrpc]
exp_connect_healthy+0x0/0x20 [ptlrpc]
autoremove_wake_function+0x0/0x40
target_recovery_thread+0x0/0x1920 [ptlrpc]

Change-Id: Ic1ad4dce1df974dd99e0b28cee211de173d178e5
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/12672
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
9 years agoLU-6147 lfsck: NOT purge object by OI scrub 93/13493/8
Fan Yong [Sun, 9 Nov 2014 04:00:41 +0000 (12:00 +0800)]
LU-6147 lfsck: NOT purge object by OI scrub

Originally, when the OI scrub found an inconsistent FID mapping,
it would repair the FID mapping and ask others to reload the object
by purging it. Such behavior may cause others to hang:
if the object corresponding to the FID has already been
established in RAM and someone else holds a reference to it
(for example, the LFSCK engine holds .lustre/lost+found/MDTxxxx),
then purging the object sets LU_OBJECT_HEARD_BANSHEE on it, and
any subsequent lookup of that FID is blocked until the object's
reference count drops to zero and the object is re-established
in RAM. Unfortunately, if the reference holder itself tries
to find the same object, it blocks on itself forever.

On the other hand, on the server side, the OI scrub will repair
the bad OI mapping. If the object was established in RAM before
its bad FID mapping was repaired, then it must be marked as
non-existent and should not be cached in RAM after the last
reference is released.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I651ef5f5e8f4f478f07bcbb5622b345deed7cb31
Reviewed-on: http://review.whamcloud.com/13493
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6031 test: Check server version in recovery-small test 10d 57/13557/2
Mikhail Pershin [Wed, 28 Jan 2015 21:33:02 +0000 (00:33 +0300)]
LU-6031 test: Check server version in recovery-small test 10d

Test should check server version for interoperability needs.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I3b46ba9291c8c64cc3d3c235c0985f88df23f633
Reviewed-on: http://review.whamcloud.com/13557
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-6171 kernel: kernel update [RHEL7 3.10.0-123.20.1.el7] 70/13570/2
Bob Glossman [Fri, 30 Jan 2015 15:46:55 +0000 (07:46 -0800)]
LU-6171 kernel: kernel update [RHEL7 3.10.0-123.20.1.el7]

update RHEL7 kernel to 3.10.0-123.20.1.el7

Test-Parameters: clientdistro=el7 mdsfilesystemtype=ldiskfs\
        mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ieb1e8a2bb4cd86268721af91dd15d2c5bc69d0bf
Reviewed-on: http://review.whamcloud.com/13570
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-3536 osd: allocate it for each iteration. 23/13223/8
wang di [Mon, 22 Dec 2014 23:08:41 +0000 (15:08 -0800)]
LU-3536 osd: allocate it for each iteration.

Add the osd iteration structure (osd_it_ea) to a dedicated slab
cache, and allocate a new osd_it_ea for each iteration, so iterations
can be nested, which will help DNE and LFSCK.
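
The benefit of a per-iteration allocation can be sketched as below.
The sketch_it type and helpers are hypothetical and only loosely
analogous to osd_it_ea; plain malloc() stands in for the slab
allocation. Because each walk owns its own cursor, an inner walk
started in the middle of an outer walk does not disturb the outer
position.

```c
#include <stdlib.h>

/* Hypothetical iterator: each walk owns its own heap-allocated
 * cursor instead of sharing one piece of state. */
struct sketch_it {
    const int *items;
    size_t count;
    size_t pos;
};

static struct sketch_it *sketch_it_init(const int *items, size_t count)
{
    struct sketch_it *it = malloc(sizeof(*it));

    if (it != NULL) {
        it->items = items;
        it->count = count;
        it->pos = 0;
    }
    return it;
}

/* Stores the next item in *out and returns 1, or returns 0 at the end. */
static int sketch_it_next(struct sketch_it *it, int *out)
{
    if (it->pos >= it->count)
        return 0;
    *out = it->items[it->pos++];
    return 1;
}

static void sketch_it_fini(struct sketch_it *it)
{
    free(it);
}

/* Nested walk: returns the number of (outer, inner) pairs visited,
 * or -1 on allocation failure. */
static long sketch_count_pairs(const int *items, size_t count)
{
    struct sketch_it *outer = sketch_it_init(items, count);
    long pairs = 0;
    int a, b;

    if (outer == NULL)
        return -1;
    while (sketch_it_next(outer, &a)) {
        struct sketch_it *inner = sketch_it_init(items, count);

        if (inner == NULL) {
            sketch_it_fini(outer);
            return -1;
        }
        while (sketch_it_next(inner, &b))
            pairs++;
        sketch_it_fini(inner);
    }
    sketch_it_fini(outer);
    return pairs;
}
```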

Since iteration over iam and quota is infrequent,
just allocate those iterators with the normal OBD_ALLOC_PTR.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I6402259708264f9341f314e7a2f6afe16cc66481
Reviewed-on: http://review.whamcloud.com/13223
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5971 llite: rename ccc_req to vvp_req 77/13377/3
John L. Hammond [Tue, 13 Jan 2015 16:06:42 +0000 (10:06 -0600)]
LU-5971 llite: rename ccc_req to vvp_req

Rename struct ccc_req to struct vvp_req and move related functions
from lustre/llite/lcommon_cl.c to the new file lustre/llite/vvp_req.c.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I6589cd1e039b41e55fcd833476f6a58ff2492900
Reviewed-on: http://review.whamcloud.com/13377
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6003 lnet: improvement to router checker 35/13035/6
Amir Shehata [Thu, 11 Dec 2014 18:52:26 +0000 (10:52 -0800)]
LU-6003 lnet: improvement to router checker

This patch always starts the router checker thread.

The router checker only checks routes by ping if
live_router_check_interval or dead_router_check_interval are set
to something other than 0, and there are routes configured.

If these conditions are not met the router checker sleeps until woken
up when a route is added.  It is also woken up whenever the RC is
being stopped to ensure the thread doesn't hang.
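
Read literally, the condition above is "(live interval set OR dead
interval set) AND routes configured". A small sketch of that
predicate, with a hypothetical function name and plain integers
standing in for the module parameters and the route count:

```c
#include <stdbool.h>

/* True when the router checker should actively ping routers;
 * otherwise it sleeps until a route is added or it is stopped. */
static bool sketch_rc_should_ping(int live_interval, int dead_interval,
                                  int nroutes)
{
    return (live_interval != 0 || dead_interval != 0) && nroutes > 0;
}
```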

In the future when DLC starts configuring the live and dead
router_check_interval parameters, then by manipulating them
the router checker can be turned on and off by the user.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I778690755e7121abd575f1a261637cb6dc754edd
Reviewed-on: http://review.whamcloud.com/13035
Tested-by: Jenkins
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5823 clio: add cl_object_find_cbdata() 94/12494/13
Bobi Jam [Thu, 30 Oct 2014 07:00:22 +0000 (15:00 +0800)]
LU-5823 clio: add cl_object_find_cbdata()

* Delete obsolete obd_ops::o_find_cbdata interface.
* Delete obsolete obd_ops::o_change_cbdata interface.
* Add cl_object_find_cbdata().

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I2e64e2e9a112783cb5c66bf4580fd1aec794417b
Reviewed-on: http://review.whamcloud.com/12494
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5971 llite: move vvp_io functions to vvp_io.c 76/13376/3
John L. Hammond [Tue, 13 Jan 2015 15:29:14 +0000 (09:29 -0600)]
LU-5971 llite: move vvp_io functions to vvp_io.c

Move all vvp_io related functions from lustre/llite/lcommon_cl.c to
the sole file where they are used, lustre/llite/vvp_io.c.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I5b7d9671a32aaff7a2ebce42b0f5ff10e2eeb4ab
Reviewed-on: http://review.whamcloud.com/13376
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoNew tag 2.6.93 2.6.93 2_6_93 2_6_93_0 v2_6_93 v2_6_93_0
Oleg Drokin [Tue, 27 Jan 2015 18:04:42 +0000 (13:04 -0500)]
New tag 2.6.93

Change-Id: I826747da53ed1d9b0b2417b7b597dab3b76088a3

9 years agoLU-6114 test: add $mbench_OPTIONS to run_metabench() 88/13388/3
Elena Gryaznova [Wed, 14 Jan 2015 21:32:29 +0000 (01:32 +0400)]
LU-6114 test: add $mbench_OPTIONS to run_metabench()

Cray's metabench version requires -p <dictionary> parameter.
Patch adds mbench_OPTIONS to metabench call.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-1113
Reviewed-by: Vladimir Saveliev <vladimir_saveliev@xyratex.com>
Reviewed-by: Alexander Lezhoev <Alexander_Lezhoev@xyratex.com>
Change-Id: Id00f96c034f3d2d501421c0dd435354becea7512
Reviewed-on: http://review.whamcloud.com/13388
Tested-by: Jenkins
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6081 lfs: fixed bad return value 93/13293/2
Frank Zago [Thu, 8 Jan 2015 16:09:14 +0000 (10:09 -0600)]
LU-6081 lfs: fixed bad return value

When a command parameter line is invalid, CMD_HELP should be returned.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Icca4886ca2d6497837ea359b3a96398253467e19
Reviewed-on: http://review.whamcloud.com/13293
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5499 tests: use grep -w to search /proc/mounts 09/12409/4
Andreas Dilger [Fri, 24 Oct 2014 00:42:45 +0000 (18:42 -0600)]
LU-5499 tests: use grep -w to search /proc/mounts

When searching for /sbin/mount.lustre in /proc/mounts, use "grep -qw"
instead of using a trailing space, because if the mount.lustre binary
is deleted while it is mounted (e.g. by "make clean") it may have a
non-printable character following it and not be found and unmounted.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia33c4d4b7efa73f543999f73da198fa0698cab07
Reviewed-on: http://review.whamcloud.com/12409
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-1095 llite: improve max_readahead console messages 99/12399/3
Andreas Dilger [Thu, 23 Oct 2014 11:47:30 +0000 (05:47 -0600)]
LU-1095 llite: improve max_readahead console messages

Improve the max_readahead_mb, max_readahead_per_file_mb, and
max_read_ahead_whole_mb console error messages to print the
parameters properly in MB instead of PAGE_SIZE units, and include
the filesystem name and bad parameters in the output.
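
As a worked example of the unit conversion involved, the sketch below
turns a page count into megabytes. It assumes 4 KiB pages (a shift of
12); the macro and function names are hypothetical and the kernel's
own PAGE_SHIFT would be used in real code.

```c
/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/* 1 MB is 2^20 bytes, so a page count becomes MB by shifting right
 * by (20 - page shift) bits. */
static unsigned long sketch_pages_to_mb(unsigned long pages)
{
    return pages >> (20 - SKETCH_PAGE_SHIFT);
}
```

Under that assumption, 10240 pages is reported as 40 MB rather than
as the raw page count.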

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ifae8bd7012c2b5e11306fd8ecb53ef7fe500c1e2
Reviewed-on: http://review.whamcloud.com/12399
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5997 mdd: initialize mdd's obd->obd_vars 80/12980/6
Vladimir Saveliev [Mon, 5 Jan 2015 15:01:53 +0000 (10:01 -0500)]
LU-5997 mdd: initialize mdd's obd->obd_vars

mdd_procfs_init() initializes obd->obd_vars of the mdt's obd rather
than the mdd's. Leaving the mdd's obd->obd_vars uninitialized causes
conf_param to fail when setting mdd parameters.

Xyratex-bug-id: MRP-2277
Signed-off-by: Vladimir Saveliev <vladimir.saveliev@seagate.com>
Change-Id: I065dc9e4577816ce08f22787116fae4f7e971db5
Reviewed-on: http://review.whamcloud.com/12980
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-2675 lustre: remove lustre/include/linux for debian 95/13495/2
Li Xi [Thu, 22 Jan 2015 08:35:02 +0000 (16:35 +0800)]
LU-2675 lustre: remove lustre/include/linux for debian

The lustre/include/linux directory has been removed. The build
system for Debian shouldn't package that directory any more.

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I7d28681f574a990b8c54261567a3f107f9a9d159
Reviewed-on: http://review.whamcloud.com/13495
Tested-by: Jenkins
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5275 build: add LPROCFS to the deprecated symbol list 63/13463/2
John L. Hammond [Tue, 20 Jan 2015 02:29:29 +0000 (20:29 -0600)]
LU-5275 build: add LPROCFS to the deprecated symbol list

In contrib/scripts/checkpatch.pl deprecate LPROCFS and suggest use of
CONFIG_PROC_FS instead.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I801acd18b97c5c1aa474aa3960c9bfc0758e3652
Reviewed-on: http://review.whamcloud.com/13463
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6087 lod: use correct attrs in striped directory create 73/13473/2
John L. Hammond [Tue, 20 Jan 2015 22:02:45 +0000 (16:02 -0600)]
LU-6087 lod: use correct attrs in striped directory create

In lod_xattr_set_lmv() use the times, ownership, and mode of the local
object when creating the shards. Add test_33f to sanity.sh to check
that the ownership is handled properly.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Icc511d0f56888bcc8c095f0da4a6bdf99ccdeab5
Reviewed-on: http://review.whamcloud.com/13473
Tested-by: Jenkins
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5863 tests: add a separate MGS/MDS test case into conf-sanity 91/13391/3
Jian Yu [Wed, 14 Jan 2015 01:16:54 +0000 (17:16 -0800)]
LU-5863 tests: add a separate MGS/MDS test case into conf-sanity

In conf-sanity.sh, test 21d is a basic test case that verifies
separate MGS/MDS. However, it's always skipped under a combined
MGS/MDS configuration. This patch adds a new test case 21e to
set up another Lustre filesystem to verify separate MGS/MDS without
depending on the configuration of the original Lustre filesystem.

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ONLY=21 \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs \
ostfilesystemtype=ldiskfs testlist=conf-sanity

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ONLY=21 \
mdtfilesystemtype=zfs mdsfilesystemtype=zfs \
ostfilesystemtype=zfs testlist=conf-sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I3defa936a9b4f97dc3849c3a4a9626332da53d0f
Reviewed-on: http://review.whamcloud.com/13391
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6146 tests: race condition for check/use cfs_fail_val 81/13481/9
Fan Yong [Tue, 4 Nov 2014 08:02:22 +0000 (16:02 +0800)]
LU-6146 tests: race condition for check/use cfs_fail_val

There are some race conditions between checking and using cfs_fail_val.
For example, consider the failure injection stub for an LFSCK test:

764   if (OBD_FAIL_CHECK(OBD_FAIL_LFSCK_DELAY2) &&
765       cfs_fail_val > 0) {
766           struct l_wait_info lwi;
767
768           lwi = LWI_TIMEOUT(cfs_time_seconds(cfs_fail_val),
769                             NULL, NULL);
770           l_wait_event(thread->t_ctl_waitq,
771                        !thread_is_running(thread),
772                        &lwi);
773
774           if (unlikely(!thread_is_running(thread))) {
775                   CDEBUG(D_LFSCK, "%s: scan dir exit for engine "
776                          "stop, parent "DFID", cookie "LPX64"\n",
777                          lfsck_lfsck2name(lfsck),
778                          PFID(lfsck_dto2fid(dir)),
779                          lfsck->li_cookie_dir);
780                   RETURN(0);
781           }
782   }

The "cfs_fail_val" may be changed to zero by others after the check
at line 765 but before it is used at line 768. The LFSCK engine will
then fall into "wait" until someone runs "lfsck_stop".
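
One common way to close this kind of check-then-use window is to read
the shared value once into a local variable and use that snapshot for
both the test and the timeout. The sketch below illustrates the
pattern only; sketch_fail_val and sketch_get_delay() are hypothetical
names, and this message does not say that the patch is implemented
this way.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for cfs_fail_val, shared with other threads. */
static _Atomic unsigned int sketch_fail_val;

/* Returns 1 and stores a non-zero timeout in *timeout when a delay
 * is requested, or returns 0 when no delay should happen. */
static int sketch_get_delay(unsigned int *timeout)
{
    /* Single read: the check and the use see the same value. */
    unsigned int val = atomic_load(&sketch_fail_val);

    if (val == 0)
        return 0;
    *timeout = val;     /* cannot have become zero in between */
    return 1;
}
```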

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I418621faaf6a1f42ba1d541b37374c1dc21831be
Reviewed-on: http://review.whamcloud.com/13481
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5859 llog: do not cleanup orphans in remote catalogs 14/13414/3
Alex Zhuravlev [Thu, 15 Jan 2015 10:26:43 +0000 (13:26 +0300)]
LU-5859 llog: do not cleanup orphans in remote catalogs

When a catalog is being processed by the client, just ignore
empty llogs; do not try to clean them, as the client has no
direct access to the storage.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ida933d44475fd392fe3db96bcdd4a05076b63881
Reviewed-on: http://review.whamcloud.com/13414
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-3716 obdecho: create a separate root object for echo access 30/10130/12
Jian Yu [Sat, 17 Jan 2015 07:00:47 +0000 (23:00 -0800)]
LU-3716 obdecho: create a separate root object for echo access

Currently, when an echo client and a normal client are attached at
the same time, both md echo objects and normal objects are created
and looked up under the same root object (ROOT), which causes an
ASSERTION( lu_device_is_mdt(o->lo_dev) ) failure.

This patch fixes the issue by creating a separate root object
(ROOT_ECHO) for echo access. The md echo objects created under
this root object can only be accessed by the echo client. A normal
client will never see these echo objects.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
mdtcount=1 testlist=mds-survey

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I8d8a9bd2c467bb40a7993d492aa3d4ba6676ac8f
Reviewed-on: http://review.whamcloud.com/10130
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6088 lmv: Do not revalidate stripes with master lock 32/13432/5
wang di [Wed, 14 Jan 2015 22:47:53 +0000 (14:47 -0800)]
LU-6088 lmv: Do not revalidate stripes with master lock

Do not revalidate slave stripes while holding master lock.
Otherwise if the revalidating slaves are blocked, then the
master lock can not be released in time.

Remove some unnecessary merging in ll_revalidate_slave(). The
attributes are stored in each stripe and are merged only
if required.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I57c43236894e2bbbf9a20b1d90c5ab2a5dc62ef1
Reviewed-on: http://review.whamcloud.com/13432
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-4951 scripts: remove nodemap.ko in dkms.conf 84/12784/2
Bruno Faccini [Wed, 19 Nov 2014 15:45:25 +0000 (16:45 +0100)]
LU-4951 scripts: remove nodemap.ko in dkms.conf

This second patch for a similar cause (a new or removed Lustre module)
fixes the dkms.conf creation script to comply with the nodemap.ko
removal introduced by the LU-4647 patch (Gerrit change #9299, at
http://review.whamcloud.com/9299, commit
83f04354ff68a14d7492e35a9576c91492a1206c), which landed in
master (between tags 2.6.54 and 2.6.90).

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I9d0eb4dd3da31c46d7eda54e0ced998edb837741
Reviewed-on: http://review.whamcloud.com/12784
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6128 lnet: handle lnet_check_routes() errors 45/13445/2
Amir Shehata [Fri, 16 Jan 2015 20:42:55 +0000 (12:42 -0800)]
LU-6128 lnet: handle lnet_check_routes() errors

After adding a route, lnet_check_routes() is called to ensure that
the route added doesn't invalidate the routing configuration.  If
lnet_check_routes() fails, then the route just added, which
invalidated the current configuration, is deleted and an error
is returned to the user.
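
The add, validate, and roll back sequence can be sketched as follows.
Everything here is hypothetical: the route table, the helper names,
and in particular the validity rule (no duplicate net/gateway pair),
which is invented for the example and is not what lnet_check_routes()
actually verifies.

```c
#include <stddef.h>

#define SKETCH_MAX_ROUTES 8

struct sketch_route {
    int net;
    int gateway;
};

static struct sketch_route sketch_routes[SKETCH_MAX_ROUTES];
static size_t sketch_nroutes;

/* Invented rule for illustration: reject duplicate (net, gateway)
 * pairs. Returns 0 if the table is consistent, -1 otherwise. */
static int sketch_check_routes(void)
{
    size_t i, j;

    for (i = 0; i < sketch_nroutes; i++)
        for (j = i + 1; j < sketch_nroutes; j++)
            if (sketch_routes[i].net == sketch_routes[j].net &&
                sketch_routes[i].gateway == sketch_routes[j].gateway)
                return -1;
    return 0;
}

/* Adds a route, then validates the whole table. On failure the new
 * route is removed again and the error is passed to the caller. */
static int sketch_add_route(int net, int gateway)
{
    if (sketch_nroutes == SKETCH_MAX_ROUTES)
        return -1;
    sketch_routes[sketch_nroutes++] = (struct sketch_route){ net, gateway };

    if (sketch_check_routes() != 0) {
        sketch_nroutes--;   /* delete the route just added */
        return -1;
    }
    return 0;
}
```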

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I9b0cc105f97e7ddb0e4549626606c91118ca3ff5
Reviewed-on: http://review.whamcloud.com/13445
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5971 llite: use vui prefix for struct vvp_io members 63/13363/2
John L. Hammond [Tue, 13 Jan 2015 14:21:51 +0000 (08:21 -0600)]
LU-5971 llite: use vui prefix for struct vvp_io members

Rename members of struct vvp_io to start with vui_ rather than
cui_.  Rename several instances of struct vvp_io * from cio to vio.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iacdd982f82469c120bf801570b1bc152034d2a11
Reviewed-on: http://review.whamcloud.com/13363
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6074 doc: synchronize clio.txt with current implementation 35/13335/4
Jinshan Xiong [Tue, 13 Jan 2015 23:55:20 +0000 (15:55 -0800)]
LU-6074 doc: synchronize clio.txt with current implementation

Obsolete material has been deleted in this patch, and the document
has been updated to reflect the current CLIO implementation.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I55b137ff0df527c50f96148e4418394d4fcbfd38
Reviewed-on: http://review.whamcloud.com/13335
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5648 tests: a new test to avoid reallocated object IDs 09/13309/3
Emoly Liu [Mon, 12 Jan 2015 02:23:06 +0000 (10:23 +0800)]
LU-5648 tests: a new test to avoid reallocated object IDs

Add test_101 in replay-single.sh to verify that precreated
objects are not reassigned to other files after recovery.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: If4368a4baa7f9ba72432d5e3558b5a0645d02014
Reviewed-on: http://review.whamcloud.com/13309
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6102 osd: log message when trigger OI scrub 14/13314/3
Fan Yong [Wed, 22 Oct 2014 21:30:23 +0000 (05:30 +0800)]
LU-6102 osd: log message when trigger OI scrub

These log messages can give us more information about why
the OI scrub is triggered.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4229fd41c61b4de9a8ad00c486ba92b5037db3a1
Reviewed-on: http://review.whamcloud.com/13314
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6071 client: Fix mkdir -i 1 from DNE2 client to DNE1 server 89/13189/3
Artem Blagodarenko [Mon, 29 Dec 2014 11:32:51 +0000 (14:32 +0300)]
LU-6071 client: Fix mkdir -i 1 from DNE2 client to DNE1 server

Since DNE phase 2 was added to the client, it sends the create
request to the slave MDT.  A DNE1-only server doesn't expect a
request to a slave MDT from a client; it expects only cross-MDT
requests from the master MDT.  Thus if a DNE2 client tries to
"mkdir -i 1" on a DNE1 server, an LBUG is triggered.

This patch adds an OBD_CONNECT_DIR_STRIPE connection flag check
on the client side.  If striped directories are not supported by
the server, the create request is sent to the master MDT.
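
The flag check can be pictured with the sketch below.  It is only an
illustration under assumptions: CONNECT_DIR_STRIPE uses a made-up bit
value (not the one Lustre assigns to OBD_CONNECT_DIR_STRIPE), and
pick_create_mdt() is a hypothetical helper, not a function in the
client code.

```c
#include <stdint.h>

/* Illustrative bit value only; not the value used by Lustre. */
#define CONNECT_DIR_STRIPE (UINT64_C(1) << 0)

/*
 * Pick the MDT index that should receive the create request.
 * server_flags are the connect flags the server agreed to.
 */
static int pick_create_mdt(uint64_t server_flags, int master_mdt,
			   int requested_mdt)
{
	/* Server understands striped directories: talk to the target MDT. */
	if (server_flags & CONNECT_DIR_STRIPE)
		return requested_mdt;

	/* Older server: let the master MDT issue the cross-MDT request. */
	return master_mdt;
}
```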

Signed-off-by: Artem Blagodarenko <artem_blagodarenko@xyratex.com>
Xyratex-bug-id: MRP-2319
Change-Id: I837a7ac144bf4aaf5039c55d03cabc0cd9847faa
Reviewed-on: http://review.whamcloud.com/13189
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6046 clio: update comments after cl_lock simplification 37/13137/4
Bobi Jam [Fri, 19 Dec 2014 02:03:02 +0000 (10:03 +0800)]
LU-6046 clio: update comments after cl_lock simplification

Update comments to reflect current cl_lock situations.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic5f904cd2ea10005a6f4e13546d7a2e4b5ba8eb2
Reviewed-on: http://review.whamcloud.com/13137
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-1154 clio: pass fid for OST setattr 02/12902/5
Bobi Jam [Tue, 2 Dec 2014 06:55:45 +0000 (14:55 +0800)]
LU-1154 clio: pass fid for OST setattr

Store the inode's FID in cl_setattr_ost(); OSC packs this info on
the wire (via lustre_set_wire_obdo) so that the OST can use it.

NOTE: currently lu_fid::f_ver and obdo::o_parent_ver are not used on
OFD device, and we use obdo::o_stripe_idx as
filter_fid::ff_parent::f_ver and save it to the device.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ib396d19da0a7049f76b80e4d73bcad82b73f06df
Reviewed-on: http://review.whamcloud.com/12902
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
9 years agoLU-1154 clio: rename coo_attr_set to coo_attr_update 88/12888/8
Bobi Jam [Mon, 1 Dec 2014 09:20:06 +0000 (17:20 +0800)]
LU-1154 clio: rename coo_attr_set to coo_attr_update

coo_attr_set() is used to update an object's attributes, but its
name is confusing: people intuitively assume that it passes the
object's attributes down to the server side.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I4fb45ff1467f37c571b3acbb9465c787d0c5f261
Reviewed-on: http://review.whamcloud.com/12888
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5817 tests: Several tests for group locks. 38/12838/11
Frank Zago [Thu, 13 Nov 2014 21:20:05 +0000 (15:20 -0600)]
LU-5817 tests: Several tests for group locks.

Stress test on group locks (take many, take invalid locks, ...)

Change-Id: Ibf94455d484acf9d48863f14df1adea86ee6d2ea
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12838
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5726 ldiskfs: missed brelse() in large EA patch 52/13452/3
Niu Yawei [Mon, 19 Jan 2015 16:00:13 +0000 (11:00 -0500)]
LU-5726 ldiskfs: missed brelse() in large EA patch

A brelse() call is missing in ldiskfs_xattr_delete_inode(); this
defect was introduced by the ldiskfs large EA patch.
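
As a generic illustration of this class of bug (not the ldiskfs code
itself), the sketch below uses a mock reference-counted buffer and
routes every exit path through a single release.  struct mock_bh,
mock_bread(), mock_brelse() and delete_xattr_block() are all
hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>

/* Mock buffer with a reference count, standing in for a buffer head. */
struct mock_bh {
	int refcount;
	int valid;
};

/* Take a reference, as a successful block read would. */
static struct mock_bh *mock_bread(struct mock_bh *bh)
{
	bh->refcount++;
	return bh;
}

/* Drop a reference; a NULL pointer is ignored. */
static void mock_brelse(struct mock_bh *bh)
{
	if (bh != NULL)
		bh->refcount--;
}

/* Every exit path after the read drops the reference exactly once. */
static int delete_xattr_block(struct mock_bh *blk)
{
	struct mock_bh *bh = mock_bread(blk);
	int rc = 0;

	if (!bh->valid) {
		rc = -EIO;
		goto out;	/* an early return here would leak the reference */
	}

	/* Real code would process the block contents at this point. */
out:
	mock_brelse(bh);
	return rc;
}
```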

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Icfe6015ce9d518b11ec448fe32673ef76ebf4c85
Reviewed-on: http://review.whamcloud.com/13452
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
9 years agoLU-6127 test: add new racer scripts to Makefile 10/13410/2
Elena Gryaznova [Thu, 15 Jan 2015 01:05:07 +0000 (05:05 +0400)]
LU-6127 test: add new racer scripts to Makefile

Patch fixes LU-3072: add new scripts to Makefile.

Signed-off-by: Elena Gryaznova <elena.gryaznova@seagate.com>
Xyratex-bug-id: MRP-2364
Reviewed-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Change-Id: If144f0493454d11bc0e1208624ee4be443ab1932
Reviewed-on: http://review.whamcloud.com/13410
Tested-by: Jenkins
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5275 lprocfs: sync names to upstream kernel lustre client 30/13330/4
James Simmons [Tue, 13 Jan 2015 15:44:03 +0000 (10:44 -0500)]
LU-5275 lprocfs: sync names to upstream kernel lustre client

When seq_file handling was introduced to lustre we had to
create duplicate functions that handle a different type of
struct lprocfs_vars, which was named struct lprocfs_seq_vars.
Now that lustre has moved to using only seq_file we can
rename the special *seq* functions and structures to match
exactly what is in the upstream Linux kernel. This greatly
reduces the difference between upstream and the Intel branch.

Change-Id: Ic4f7eac105736c691ea4b37438352e5542ce344c
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13330
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
9 years agoLU-5275 lprocfs: remove all non-seq file functions 35/12235/11
James Simmons [Tue, 13 Jan 2015 00:54:52 +0000 (19:54 -0500)]
LU-5275 lprocfs: remove all non-seq file functions

With the completion of the move to seq_file based
proc handling for lustre we can remove all the
no longer used non-seq_file handling routines.
Rename lprocfs_try_remove_proc_entry to match the
new function in newer kernels and RHEL6.6 that
does a similar thing (remove_proc_subtree).

Change-Id: Ieff19f0216770da94f29562d51abcbf5869bad34
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/12235
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5275 libcfs: merge params_tree.h into lprocfs_status.h 41/13341/4
James Simmons [Mon, 12 Jan 2015 15:37:21 +0000 (10:37 -0500)]
LU-5275 libcfs: merge params_tree.h into lprocfs_status.h

The macros in params_tree.h are only used for proc handling
in the lustre layer. Since this is the case we move all the
handling from params_tree.h to lprocfs_status.h

Change-Id: I590c1f2525bdd748450008af38510d19cd68f917
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13341
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5275 lprocfs: replace LPROCFS with CONFIG_PROC_FS 99/13299/3
John L. Hammond [Sun, 11 Jan 2015 16:56:32 +0000 (11:56 -0500)]
LU-5275 lprocfs: replace LPROCFS with CONFIG_PROC_FS

Instead of defining LPROCFS if CONFIG_PROC_FS is defined and testing
for LPROCFS just test for CONFIG_PROC_FS. This reduces the need to
ensure that params_tree.h has been included everywhere.
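
A rough sketch of the idea, assuming nothing about the real headers:
testing the kernel configuration symbol directly removes the need for
a header that defines a private alias first.  proc_support_enabled()
is a hypothetical function used only to make the conditional visible.

```c
/*
 * Before (sketch): some header had to provide the alias, and every
 * user had to include that header before testing LPROCFS:
 *
 *   #ifdef CONFIG_PROC_FS
 *   # define LPROCFS
 *   #endif
 */

/* After: test the configuration symbol directly, no extra header needed. */
static int proc_support_enabled(void)
{
#ifdef CONFIG_PROC_FS
	return 1;
#else
	return 0;
#endif
}
```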

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1da970c5c932d329833d433615322a95b0c14011
Reviewed-on: http://review.whamcloud.com/13299
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6075 osd: race for check/chance od_dirent_journal 11/13311/4
Fan Yong [Wed, 22 Oct 2014 17:34:26 +0000 (01:34 +0800)]
LU-6075 osd: race for check/chance od_dirent_journal

Originally, osd_device::od_dirent_journal was a bitfield.  Such a
field can be modified while another thread is modifying a different
bit in the same integer in parallel.  Because no lock protects these
bit changes, one thread's update may overwrite another's.  So split
osd_device::od_dirent_journal out into an independent variable.
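
The lost update can be replayed deterministically in a single thread,
as in the sketch below.  It is an illustration only: the flag names
and struct split_flags are hypothetical and do not mirror the layout
of struct osd_device.

```c
#include <stdint.h>

/* Two logical flags packed into one word, as adjacent bitfields are. */
#define FLAG_DIRENT_JOURNAL 0x1u
#define FLAG_OTHER          0x2u

/*
 * Replay a racy interleaving by hand: both "threads" load the word
 * before either stores, so the second store discards the first bit.
 */
static uint32_t racy_interleaving(uint32_t word)
{
	uint32_t seen_by_a = word;	/* thread A loads */
	uint32_t seen_by_b = word;	/* thread B loads the same old value */

	word = seen_by_a | FLAG_DIRENT_JOURNAL;	/* A stores its update */
	word = seen_by_b | FLAG_OTHER;		/* B stores, losing A's bit */
	return word;
}

/*
 * After the split: a plain int member is its own memory location, so
 * storing to it does not read and rewrite the neighbouring bitfield.
 */
struct split_flags {
	int dirent_journal;
	unsigned int other:1;
};
```

Giving the flag its own variable avoids the shared read-modify-write
without adding a lock.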

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I859932290ebf3b94be4f588f8e3e9635fe204d49
Reviewed-on: http://review.whamcloud.com/13311
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-4808 tests: don't skip sanity test_116b incorrectly 63/13263/2
Andreas Dilger [Wed, 7 Jan 2015 08:22:57 +0000 (01:22 -0700)]
LU-4808 tests: don't skip sanity test_116b incorrectly

An exception was added in http://review.whamcloud.com/9766 (commit
e217648d50da) to skip sanity test_116b if running against an MDS
that does not have the qos_threshold_rr tunable, but this was
incorrectly checking the local node (the client) for the tunable
instead of the MDS.  Fix the check to run on the MDS.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia1ae05373af9e92d91710761f1da6470f8500c1e
Reviewed-on: http://review.whamcloud.com/13263
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6081 lfs: check that pool name is not too long 41/13241/8
Frank Zago [Wed, 3 Dec 2014 22:42:18 +0000 (16:42 -0600)]
LU-6081 lfs: check that pool name is not too long

There was no check on the length of the pool name, so it could be
silently truncated when used.
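
A length check of this kind might look like the sketch below.  The
limit POOL_NAME_MAX and the helper check_pool_name() are assumptions
made for illustration; the real constant and error code should be
taken from the Lustre headers and the patch itself.

```c
#include <errno.h>
#include <string.h>

/* Assumed limit for illustration; check the real constant in Lustre. */
#define POOL_NAME_MAX 15

/* Reject names that would otherwise be silently truncated. */
static int check_pool_name(const char *name)
{
	size_t len;

	if (name == NULL)
		return -EINVAL;

	len = strlen(name);
	if (len == 0)
		return -EINVAL;
	if (len > POOL_NAME_MAX)
		return -ENAMETOOLONG;

	return 0;
}
```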

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ic18d28a4572ce54c39b35c3ea130ccbfdf33b34d
Reviewed-on: http://review.whamcloud.com/13241
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
9 years agoLU-5990 lnet: fix for lnet_prepare failure handling 73/12973/4
Liang Zhen [Fri, 5 Dec 2014 08:52:55 +0000 (16:52 +0800)]
LU-5990 lnet: fix for lnet_prepare failure handling

lnet_prepare() should return errno on failure

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Ie86b4ba71550628e293352d8bff4d17af80e8e05
Reviewed-on: http://review.whamcloud.com/12973
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6027 mdt: Allow user EAs with empty values 00/12900/9
Li Wei [Mon, 1 Dec 2014 07:05:56 +0000 (15:05 +0800)]
LU-6027 mdt: Allow user EAs with empty values

Setting a user EA with an empty value is a valid case, according to
attr(5) and some experiments with ext4.  Doing so with Lustre
currently results in what appears to be a no-op---the EA name won't be
added or updated (if an EA with the same name already existed).

Change-Id: Ic8950963baeceada99c4607631ecd2a6510ae3ed
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/12900
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6060 lnet: set downis to 1 if there's no NI for remote net 17/13417/3
Liang Zhen [Thu, 15 Jan 2015 17:34:24 +0000 (09:34 -0800)]
LU-6060 lnet: set downis to 1 if there's no NI for remote net

lnet_route_t::lr_downis is set to zero even if there is no NI for
the target network.  This is wrong and breaks the ARF logic.  This
patch fixes the problem.
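
One way to picture the intended behaviour is the sketch below.  It is
an assumption-laden illustration, not the LNet code: struct ni_status
and count_downis() are hypothetical, and the real accounting of
lr_downis may differ in detail.

```c
#include <stddef.h>

/* Hypothetical per-NI status as a router might report it. */
struct ni_status {
	int net;	/* network this NI is attached to */
	int up;		/* non-zero when the NI is up */
};

/* Return the number of down NIs for target_net, or 1 when no NI serves it. */
static int count_downis(const struct ni_status *nis, size_t n, int target_net)
{
	int down = 0;
	int found = 0;

	for (size_t i = 0; i < n; i++) {
		if (nis[i].net != target_net)
			continue;
		found = 1;
		if (!nis[i].up)
			down++;
	}

	/* No NI for the target net: the route must not look healthy. */
	return found ? down : 1;
}
```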

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I7831e258d0098e2a4bee650e69bb4f1a12429f46
Reviewed-on: http://review.whamcloud.com/13417
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5875 lnet: return -EEXIST if NI is not unique 56/13056/6
Amir Shehata [Fri, 12 Dec 2014 22:43:03 +0000 (14:43 -0800)]
LU-5875 lnet: return -EEXIST if NI is not unique

Return -EEXIST and not -EINVAL when trying to add a
network interface which is not unique.

Some minor cleanup in api-ni.c
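
The choice of -EEXIST over -EINVAL lets a caller tell "already
configured" apart from "bad argument".  A minimal sketch, assuming a
hypothetical fixed-size table and an add_ni() helper that is not the
LNet API:

```c
#include <errno.h>
#include <stddef.h>

#define MAX_NIS 4

/* Hypothetical table of configured networks, one NI per net. */
static int ni_nets[MAX_NIS];
static size_t ni_count;

/* Add an NI for a net; a duplicate is reported as -EEXIST. */
static int add_ni(int net)
{
	for (size_t i = 0; i < ni_count; i++)
		if (ni_nets[i] == net)
			return -EEXIST;

	if (ni_count == MAX_NIS)
		return -ENOSPC;

	ni_nets[ni_count++] = net;
	return 0;
}
```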

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Ic74a768c0e7688ba0e35740e2ca2ac9ae4f999ea
Reviewed-on: http://review.whamcloud.com/13056
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5874 lnet: reject invalid net configuration 12/12912/7
Amir Shehata [Tue, 2 Dec 2014 02:06:39 +0000 (18:06 -0800)]
LU-5874 lnet: reject invalid net configuration

Currently if there exists a route that goes over a
remote net and then this net is added dynamically as
a local net, then traffic stops because the code in
lnet_send() determines that the destination nid
can be reached from another local_ni, but the src_nid
is still stuck on the earlier NI, because the src_nid
is stored in the ptlrpc layer and is not updated
when a local NI is configured.

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Idd4ea9e131db127f541dd8d75b90ac509c16e2c3
Reviewed-on: http://review.whamcloud.com/12912
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6025 lctl: remove bad error checking from nodemap_cmd() 91/13191/3
Kit Westneat [Sat, 20 Dec 2014 00:25:31 +0000 (19:25 -0500)]
LU-6025 lctl: remove bad error checking from nodemap_cmd()

There was some error checking that didn't really make sense in
nodemap_cmd() and was causing false errors. memcpy() doesn't have an
error condition, so there's no need to check it.
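
For reference, memcpy() returns its destination pointer, so any
validation has to happen before the call.  The sketch below shows
that shape; copy_param() is a hypothetical helper and does not
reproduce the code in nodemap_cmd().

```c
#include <errno.h>
#include <string.h>

/* Validate the size first; the copy itself cannot fail. */
static int copy_param(char *dst, size_t dst_size, const char *src, size_t len)
{
	if (len > dst_size)
		return -E2BIG;	/* the only failure is one we detect ourselves */

	memcpy(dst, src, len);	/* returns dst; there is no error to test */
	return 0;
}
```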

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I58225f8ebb1d8dc0941534503f939b213d57c27f
Reviewed-on: http://review.whamcloud.com/13191
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
9 years agoLU-5767 lfsck: use OUT RPC to create remote orphan parent 72/13172/8
Fan Yong [Wed, 22 Oct 2014 07:00:20 +0000 (15:00 +0800)]
LU-5767 lfsck: use OUT RPC to create remote orphan parent

When the namespace LFSCK tries to repair a missing name entry,
that is, to insert the lost name entry back into its parent
directory, it may find that the parent MDT-object was also lost.
In that case, the namespace LFSCK first creates the missing parent
MDT-object as an orphan and inserts it into the
.lustre/lost+found/MDTxxxx/ directory remotely. It then inserts the
lost name entry into the orphan parent according to the linkEA.
Originally, the namespace LFSCK used the LFSCK RPC to create the
orphan parent MDT-object on a remote MDT. But that is not the
normal way to make cross-MDT modifications, which are usually
handled via the OUT RPC. This patch replaces the LFSCK RPC
with the normal OUT RPC to create the orphan parent on a remote MDT.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I8b192e51f4c159cbf28e266f22ec487a8c6a68f0
Reviewed-on: http://review.whamcloud.com/13172
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6018 conf: replace RHEL_KERNEL_VERSION with RHEL_RELEASE_NO 33/13033/3
Minh Diep [Thu, 11 Dec 2014 15:58:54 +0000 (07:58 -0800)]
LU-6018 conf: replace RHEL_KERNEL_VERSION with RHEL_RELEASE_NO

We no longer use the kernel version to detect the kernel when
compiling OFED compat-rdma.  We now use RHEL_RELEASE_NO instead.

Signed-off-by: Minh Diep <minh.diep@intel.com>
Change-Id: Iab9a316570b61dfecf1c65c5da5e7af6a68601a5
Reviewed-on: http://review.whamcloud.com/13033
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-6002 lnet: startup acceptor thread dynamically 10/13010/5
Amir Shehata [Tue, 13 Jan 2015 01:14:33 +0000 (17:14 -0800)]
LU-6002 lnet: startup acceptor thread dynamically

With DLC it's possible to start up a system with no NIs that require
the acceptor thread, and thus it won't start.  Later on the user
can add an NI that requires the acceptor thread, at which point the
thread must be started.

If the user removes an NI and as a result there are no more
NIs that require the acceptor thread, then it should be stopped.
This patch adds logic to the code that dynamically adds and removes
NIs to implement the behaviour above.
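
The start-on-first, stop-on-last behaviour can be sketched with a
simple counter.  This is an assumption about one possible shape, not
the LNet implementation: ni_added(), ni_removed() and the boolean
standing in for the thread are all hypothetical.

```c
#include <stdbool.h>

/* Number of configured NIs that need the acceptor thread. */
static int nis_needing_acceptor;

/* Stands in for the running state of the acceptor thread. */
static bool acceptor_running;

/* Called after an NI has been added dynamically. */
static void ni_added(bool needs_acceptor)
{
	if (!needs_acceptor)
		return;
	if (nis_needing_acceptor++ == 0)
		acceptor_running = true;	/* first user: start it */
}

/* Called after an NI has been removed dynamically. */
static void ni_removed(bool needs_acceptor)
{
	if (!needs_acceptor)
		return;
	if (--nis_needing_acceptor == 0)
		acceptor_running = false;	/* last user gone: stop it */
}
```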

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: Iecada597d417dcb8991e9fb98f6844382295246a
Reviewed-on: http://review.whamcloud.com/13010
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
9 years agoLU-5423 llite: pack suppgid to MDS correctly 76/12476/17
Fan Yong [Sat, 27 Sep 2014 15:38:20 +0000 (23:38 +0800)]
LU-5423 llite: pack suppgid to MDS correctly

ll_lookup_it() may trigger an IT_OPEN RPC to open a file by name.
At that time, the client does not know the target file's GID,
so it cannot pack the necessary supplementary group ID in the RPC.
Without the supplementary group ID, the RPC may fail the open
permission check on the MDS. In that case, the MDS should return
the target file's GID; if the current thread on the client is in
the right group (according to the file's GID), the client will
retry the IT_OPEN RPC with the right supplementary group ID.

This patch also helps if someone else changes the file's GID
after the current RPC has been sent to the MDS with the original
GID as the suppgid, which is a race.
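
The retry flow can be sketched against a mock server as below.  All
names are hypothetical, the mock only models the group check, and
-EACCES is an assumed error code; the real RPC path and permission
logic are considerably more involved.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Mock MDS: open succeeds only if the packed suppgid matches the file GID. */
static int mock_mds_open(unsigned int file_gid, long suppgid,
			 unsigned int *gid_out)
{
	*gid_out = file_gid;	/* the reply always carries the file's GID */
	return suppgid == (long)file_gid ? 0 : -EACCES;
}

/* Is gid one of the caller's supplementary groups? */
static bool in_group(const unsigned int *groups, size_t n, unsigned int gid)
{
	for (size_t i = 0; i < n; i++)
		if (groups[i] == gid)
			return true;
	return false;
}

/* First attempt without a suppgid; retry once if the reply's GID is ours. */
static int open_with_retry(unsigned int file_gid,
			   const unsigned int *groups, size_t n)
{
	unsigned int gid;
	int rc = mock_mds_open(file_gid, -1, &gid);	/* GID unknown yet */

	if (rc == -EACCES && in_group(groups, n, gid))
		rc = mock_mds_open(file_gid, (long)gid, &gid);

	return rc;
}
```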

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Icaf1ae72b64a27c64c42830d231bae4bca4acb66
Reviewed-on: http://review.whamcloud.com/12476
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>