Whamcloud - gitweb
fs/lustre-release.git
10 years ago LU-3336 lfsck: orphan OST-objects iteration 03/8303/21
Fan Yong [Wed, 12 Feb 2014 09:21:32 +0000 (17:21 +0800)]
LU-3336 lfsck: orphan OST-objects iteration

During the second stage scanning, the LFSCK on the MDT(s) will scan
the orphan OST-objects via OSP level iteration which fetches remote
orphan OST-objects information via OBD_IDX_READ RPC, and shares the
existing framework/functions with others, such as quota.

Implement the sponsor (the master LFSCK engine on the MDT) logic
for the orphan OST-objects iteration.

Implement LFSCK layout rbtree iteration - lfsck_orphan_index_ops,
for slave LFSCK on OST. The lfsck_orphan_index_ops is registered
onto the rbtree object. The incoming OBD_IDX_READ RPC for orphan
OST-object scanning will iterate the rbtree via dt_index_read to
call the registered lfsck_orphan_index_ops.

Others:
1) Speed control during the second-phase scanning.
2) The LFSCK layout trace file (on the MDT) should have the
   LF_INCOMPLETE flag set if the LFSCK slave on an OST restarts or fails.
3) Some code cleanup.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I67d5d870dbf9b80530f4d61ed1a3e5b5df70b1a0
Reviewed-on: http://review.whamcloud.com/8303
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4357 libcfs: restore __GFP_WAIT flag to memalloc calls 23/9223/5
Ann Koehler [Wed, 12 Feb 2014 17:14:00 +0000 (01:14 +0800)]
LU-4357 libcfs: restore __GFP_WAIT flag to memalloc calls

In 2.4, the flags passed to the memory allocation functions are
translated from CFS enumeration values to the kernel GFP values
by calling cfs_alloc_flags_to_gfp(). This function adds
__GFP_WAIT to all flags except CFS_ALLOC_ATOMIC. In 2.5, when
the cfs wrappers were dropped, cfs_alloc_flags_to_gfp() was
removed and each CFS_ALLOC_xxxx was simply replaced with __GFP_xxxx.
This means that most memory allocation calls are missing the
__GFP_WAIT flag. The result is that Lustre experiences more ENOMEM
errors, many of which the higher levels of Lustre do not handle
robustly.
Note that GFP_NOFS = __GFP_WAIT | __GFP_IO, so the patch replaces
__GFP_IO with GFP_NOFS.
The patch does not add __GFP_WAIT to GFP_IOFS. GFP_IOFS was not used
in 2.4, so it has never been used with __GFP_WAIT.
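
The flag substitution described above can be modelled roughly as follows.
This is only a sketch: the bit values are illustrative stand-ins for the
kernel's definitions, and restore_wait() and may_sleep() are hypothetical
helpers, not functions from the patch.

```python
# Illustrative bit values standing in for the kernel's GFP bits.
GFP_WAIT = 0x10  # stands in for __GFP_WAIT
GFP_IO = 0x40    # stands in for __GFP_IO
GFP_FS = 0x80    # stands in for __GFP_FS

GFP_NOFS = GFP_WAIT | GFP_IO  # identity quoted in the commit message
GFP_IOFS = GFP_IO | GFP_FS    # left untouched by the patch


def restore_wait(flags):
    """Hypothetical model of the patch: a bare __GFP_IO becomes GFP_NOFS."""
    if flags == GFP_IO:
        return GFP_NOFS
    return flags


def may_sleep(flags):
    """An allocation may block only if the wait bit is present."""
    return bool(flags & GFP_WAIT)
```

In this model a call site that passed only the I/O bit can block again
after the substitution, while GFP_IOFS stays exactly as it was.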

Signed-off-by: Ann Koehler <amk@cray.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Ib241b39674129a27fea53c23c8ce3e74d165372a
Reviewed-on: http://review.whamcloud.com/9223
Tested-by: Jenkins
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years ago LU-4460 mount: fix lmd_parse() to handle comma-separated NIDs 18/8918/9
Jian Yu [Wed, 12 Feb 2014 15:48:58 +0000 (23:48 +0800)]
LU-4460 mount: fix lmd_parse() to handle comma-separated NIDs

This patch reverts commit 3917e62018878dfffac59ceed70f20b0419945d3,
which cannot handle the upgrade case in which old mountdata already
contains comma-separated NIDs. The correct way to fix the original
issue is to parse comma-separated NIDs in lmd_parse().
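
As a rough illustration of what parsing a comma-separated NID list
involves, consider the sketch below. It does not reproduce lmd_parse();
split_nids() and the sample NIDs are assumptions made for the example.

```python
def split_nids(value):
    """Split a comma-separated NID list, dropping empty entries."""
    return [nid.strip() for nid in value.split(",") if nid.strip()]


# Hypothetical mountdata value holding two NIDs for one server.
nids = split_nids("192.168.1.10@tcp,10.0.0.10@o2ib")
```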

The patch also updates disk2_4-ldiskfs.tar.bz2 so that the mountdata
of the OST contains comma-separated NIDs, in order to verify the patch
in the upgrade case.

Test-Parameters: alwaysuploadlogs \
envdefinitions=SLOW=yes,ENABLE_QUOTA=yes testlist=conf-sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: If179618c9c89dc2168f748aeba59384ea31197ff
Reviewed-on: http://review.whamcloud.com/8918
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4471 mdd: mdd_unlink: do trans_start after sanity check 27/8827/11
Patrick Farrell [Wed, 22 Jan 2014 19:04:26 +0000 (13:04 -0600)]
LU-4471 mdd: mdd_unlink: do trans_start after sanity check

Currently, mdd_trans_start is called before
mdd_unlink_sanity_check. This means a remote directory
which has files in it can be removed on MDT0 before the
sanity check on MDT1 finds the files and returns an error,
which orphans the files on MDT1. This patch moves the sanity
check before mdd_trans_create and mdd_trans_start.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I08882b682b9f0016577214821efec4759ee5c184
Reviewed-on: http://review.whamcloud.com/8827
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
10 years ago New tag 2.5.56 2.5.56 v2_5_56 v2_5_56_0
Oleg Drokin [Tue, 25 Feb 2014 17:44:39 +0000 (12:44 -0500)]
New tag 2.5.56

Change-Id: I9f095e4102fc46f7ed9829a6056ed83c0ac3f692
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3531 mdt: delete striped directory 45/7445/42
wang di [Fri, 23 May 2014 08:44:33 +0000 (01:44 -0700)]
LU-3531 mdt: delete striped directory

Add support for deleting a striped directory. This includes:

1. Enable the sync log between MDTs, so slave objects will
be deleted via the unlink log, which is similar to deleting
an OST object.

2. Retrieve the layout information of the striped directory on
the MDT, then lock all of the slave objects before unlink.

3. Remove a few unnecessary cfs_size_round calls, because update_size
and update_buf_size already round the size internally.

4. Add sanity test 300 for the striped directory test.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ib461156bbff9e416ac9d2500a0b9491427542340
Reviewed-on: http://review.whamcloud.com/7445
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3950 lfsck: control all LFSCK nodes via single command (2) 57/9257/5
Fan Yong [Tue, 11 Feb 2014 04:54:06 +0000 (12:54 +0800)]
LU-3950 lfsck: control all LFSCK nodes via single command (2)

The single command should work for not only layout LFSCK, but also for
other LFSCK components, such as namespace LFSCK, OI scrub on each node
and DNE LFSCK in the future.

Introduce another lfsck_start option "-o" to enable orphan handling.
Currently it is used for handling orphan OST-objects. When it is
enabled, the layout LFSCK will be triggered on all servers by default.

Code cleanup and more log information.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iaed9ee61d3d0fced32f9dd6b2a7f6663de6d2dc7
Reviewed-on: http://review.whamcloud.com/9257
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3336 lfsck: use rbtree to record OST-object accessing 43/7743/21
Fan Yong [Tue, 11 Feb 2014 04:53:47 +0000 (12:53 +0800)]
LU-3336 lfsck: use rbtree to record OST-object accessing

To find orphan OST-objects, the LFSCK on the OST side maintains
two bitmaps in RAM for the OST-objects accessed during the LFSCK.
After the first cycle of system scanning, the LFSCK has one bitmap
for the known OST-objects and another bitmap for the OST-objects
that have been referenced by MDT-objects. The LFSCK can then
determine which OST-objects are not referenced by any MDT-object
by comparing the two bitmaps.

The two bitmaps above are organized via a single rbtree. The rbtree
is maintained by the LFSCK on the OST side. Every OST-object scanned
by the LFSCK is recorded in the known-bitmap, and every OST-object
accessed by any RPC during the scanning is recorded in the
accessed-bitmap.
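
The bitmap comparison can be sketched as below. This is a simplified
model that uses plain integers as bitmaps; find_orphans() is a
hypothetical helper and the rbtree that organizes the real bitmaps is
left out.

```python
def find_orphans(known, accessed):
    """Return the indices set in `known` but clear in `accessed`."""
    candidates = known & ~accessed
    return [i for i in range(candidates.bit_length())
            if (candidates >> i) & 1]


# Objects 0, 2, 3 and 5 were scanned; only 0 and 3 were referenced.
orphans = find_orphans(0b101101, 0b001001)
```

Objects 2 and 5 are known to the OST but were never referenced, so they
are the orphan candidates in this example.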

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If399584a8617e7c368e48922a3582294ac98d5f4
Reviewed-on: http://review.whamcloud.com/7743
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3594 lfsck: repair inconsistent owner and multiple referenced cases 24/7524/30
Fan Yong [Mon, 10 Feb 2014 13:16:50 +0000 (21:16 +0800)]
LU-3594 lfsck: repair inconsistent owner and multiple referenced cases

Sometimes, the OST-object owner information is inconsistent with the
MDT-object owner information because of an incomplete chown/chgrp or
a system crash. In such cases, the MDT-object owner information
is trusted over the OST-object's. Because the chown/chgrp processing
order is client => MDT => OST, it is the OST-object owner information
that may be stale, not the MDT-object's. Also, the
MDT-object's owner information is visible to users and can be directly
repaired by the system administrator, while the OST-object's owner
information is only used internally by quota. So the LFSCK will update
the OST-object owner information according to the MDT-object's owner.

If both MDT-object1 and MDT-object2 claim the OST-object1 as one
of their child OST-objects, but the OST-object1 only recognizes the
MDT-object1, then the LFSCK will create a new OST-object and fix
the MDT-object2's layout information to reference the newly created
OST-object.

Replace is_remote_th() with is_only_remote_trans(), then drop the
compat patch http://review.whamcloud.com/9361

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I6b148180b5a2d68650b291250c03aac651e5f6e9
Reviewed-on: http://review.whamcloud.com/7524
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3591 lfsck: repair unmatched MDT-OST objects pairs 19/7519/26
Fan Yong [Fri, 7 Feb 2014 01:26:49 +0000 (09:26 +0800)]
LU-3591 lfsck: repair unmatched MDT-OST objects pairs

Sometimes, the MDT-object1 claims that the OST-object1 is one of its
child objects, but the OST-object1 reports inconsistent information:

1. It claims invalid parent information, such as empty or bad parent
   FID information.
2. It claims that its parent is the MDT-object2, but the MDT-object2
   does not exist, or
3. The MDT-object2 exists, but it does not recognize the OST-object1.

In such cases, the MDT-object layout information is trusted over
the OST-object back-pointer because it relates to user-visible file
data. The OST-object back-pointer is only used for internal recovery
purposes and is not visible to the user, so it does not affect proper
file usage information, nor was it kept consistent for Lustre 1.8.x
MDT file-level backup/restore. The LFSCK will update the OST-object to
make it recognize the new parent.
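
The three cases can be told apart roughly as in the sketch below. The
function name, the string labels and the mdt_layouts mapping are
assumptions made for illustration, not structures from the LFSCK code.

```python
def classify_backpointer(claiming_mdt, ost_object, ost_parent, mdt_layouts):
    """Classify the parent recorded on ost_object.

    mdt_layouts maps each existing MDT-object id to the set of
    OST-object ids its layout references (a hypothetical structure).
    """
    if ost_parent == claiming_mdt:
        return "consistent"
    if not ost_parent:
        return "invalid parent"                   # case 1
    if ost_parent not in mdt_layouts:
        return "parent does not exist"            # case 2
    if ost_object not in mdt_layouts[ost_parent]:
        return "parent does not recognize child"  # case 3
    return "claimed by two MDT-objects"           # outside the three cases


layouts = {"mdt1": {"ost1"}, "mdt2": set()}
```

In all three inconsistent cases the repair described above is the same:
the back-pointer on the OST-object is rewritten to name the claiming
MDT-object.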

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I01e67baf661b0a9e1c3de37a35de86699b07d049
Reviewed-on: http://review.whamcloud.com/7519
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-3590 lfsck: repair MDT-object with dangling reference 17/7517/28
Fan Yong [Wed, 5 Feb 2014 17:46:42 +0000 (01:46 +0800)]
LU-3590 lfsck: repair MDT-object with dangling reference

If the OST-object referenced by the MDT-object is lost, then the
LFSCK needs to recreate the OST-object with the specified FID and
initialize it with the given parent MDT-object FID and owner attr.
Although the newly created OST-object is initialized, the SUID+SGID
mode will be kept, which will be dropped by the first modification
RPC, like write/punch/setattr. Then we can distinguish whether the
recreated OST-object has been modified or not.
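
The marker scheme can be illustrated with ordinary mode bits. This is a
sketch under the assumption that the marker is simply both bits set at
once; the helper names are hypothetical.

```python
import stat

RECREATED_MARK = stat.S_ISUID | stat.S_ISGID


def new_recreated_mode(base_mode):
    """Mode given to an OST-object recreated by LFSCK (marker bits set)."""
    return base_mode | RECREATED_MARK


def on_first_modification(mode):
    """A write/punch/setattr drops the marker bits."""
    return mode & ~RECREATED_MARK


def is_untouched_recreation(mode):
    """True while the recreated object has not been modified."""
    return (mode & RECREATED_MARK) == RECREATED_MARK
```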

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ic45254695e7b1902020c133bb23fd32685b9a414
Reviewed-on: http://review.whamcloud.com/7517
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3590 osp: compat macro for is_remote_trans 61/9361/2
Oleg Drokin [Sat, 22 Feb 2014 18:22:48 +0000 (13:22 -0500)]
LU-3590 osp: compat macro for is_remote_trans

There is a clash between a recently landed DNE patch and
a ready to land LFSCK series.

Change-Id: I0b3f9805fbf892e4ad0eb4b7fc736a871c438d77
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/9361
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years ago LU-4629 gss: fix few issues found by Klocwork Insight tool 74/9274/3
Dmitry Eremin [Wed, 12 Feb 2014 11:02:58 +0000 (15:02 +0400)]
LU-4629 gss: fix few issues found by Klocwork Insight tool

Array 'message_buf' of size 500 may use index value(s) -1

Object 'enc_key.data' was freed at line 164 after being freed
by calling 'free' at line 150. Also there are 3 similar errors
on line(s) 164.

Suspicious dereference of pointer 'vmsg' before NULL check at
line 187. Also there are 2 similar errors on line(s) 196, 205.

Suspicious dereference of pointer 'rmsg' before NULL check at
line 191. Also there are 2 similar errors on line(s) 200, 209.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I50905ea99d904123df30ba7078b180b44b8a6e06
Reviewed-on: http://review.whamcloud.com/9274
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4629 llite: fix suspicious dereference (merge issue) 73/9273/3
Dmitry Eremin [Wed, 12 Feb 2014 11:06:34 +0000 (15:06 +0400)]
LU-4629 llite: fix suspicious dereference (merge issue)

Suspicious dereference of pointer 'lfd' before NULL check at line 286

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I64c652279abb8fa1e720d23d645f74f07e5237ca
Reviewed-on: http://review.whamcloud.com/9273
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3963 libcfs: remove cfs_hash_long 68/9268/2
Peng Tao [Fri, 14 Feb 2014 01:58:50 +0000 (09:58 +0800)]
LU-3963 libcfs: remove cfs_hash_long

Replace the name with the Linux-defined hash_long.
A similar patch has already been submitted upstream.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
Change-Id: Ia96bb703284ed4843c4433a1a50539d9c68ed6d1
Reviewed-on: http://review.whamcloud.com/9268
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years ago LU-4598 quota: fix s-q test_30 77/9177/2
Niu Yawei [Fri, 7 Feb 2014 12:13:51 +0000 (07:13 -0500)]
LU-4598 quota: fix s-q test_30

After LU-4139 landed, the block grace time is not as accurate as
before, so s-q test_30 should now write more bytes to make sure the
spare quota allocated on the slave is used up.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Iee14721051b9e41074a13f11afb11a7b286352c2
Reviewed-on: http://review.whamcloud.com/9177
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
10 years ago LU-4423 obdclass: fix return value check in capa_hmac() 81/8681/3
Oleg Drokin [Tue, 31 Dec 2013 01:38:49 +0000 (20:38 -0500)]
LU-4423 obdclass: fix return value check in capa_hmac()

In case of error, the function crypto_alloc_hash() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I4889387752d1eb5400649cd5f4da172d64c054e2
Reviewed-on: http://review.whamcloud.com/8681
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-1538 tests: delete test files from /tmp after use 15/8615/3
Andreas Dilger [Wed, 23 Oct 2013 04:03:52 +0000 (22:03 -0600)]
LU-1538 tests: delete test files from /tmp after use

Delete files created for tests in /tmp after testing has finished.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8c3fb62f844cb50d82eba81f274d86c73e3d2e08
Reviewed-on: http://review.whamcloud.com/8615
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3943 utils: fix lfs df -i summary line 14/8614/2
Andreas Dilger [Wed, 18 Dec 2013 06:59:23 +0000 (23:59 -0700)]
LU-3943 utils: fix lfs df -i summary line

If the number of free objects on the OSTs is fewer than on the MDT,
use the number of free OST objects in the filesystem summary, so
it matches "df -i" (limited internally by ll_statfs_internal()).
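
The rule for the summary line amounts to taking the smaller of the two
counts. The sketch below assumes the OST free-object counts are simply
summed, which the commit message does not state; the function name is
hypothetical.

```python
def summary_free_inodes(mdt_free, ost_free_counts):
    """Free-inode figure for the filesystem summary line."""
    ost_free_total = sum(ost_free_counts)  # assumed aggregation
    return min(mdt_free, ost_free_total)
```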

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Icfaaba3f9c39f12174e6681e9fb68c1f7a2540e5
Reviewed-on: http://review.whamcloud.com/8614
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4413 ptlrpc: don't try to recover no_recov connection 96/8996/4
Andreas Dilger [Sat, 25 Jan 2014 01:16:43 +0000 (18:16 -0700)]
LU-4413 ptlrpc: don't try to recover no_recov connection

If a connection has been stopped with ptlrpc_pinger_del_import() and
marked obd_no_recov, don't reconnect in ptlrpc_disconnect_import() if
the import is already disconnected.  Otherwise, without the pinger it
will just wait there indefinitely for the reconnection that will never
happen.

Put the obd_no_recov check inside ptlrpc_import_in_recovery() so that
any threads waiting on the connection to recover would also be broken
out of their sleep if obd_no_recov is set.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Icd8041be0ce344add8d67b026353df1b1e0cab07
Reviewed-on: http://review.whamcloud.com/8996
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4515 tests: set fail_loc only once per node 78/8978/4
Andreas Dilger [Thu, 23 Jan 2014 19:03:45 +0000 (12:03 -0700)]
LU-4515 tests: set fail_loc only once per node

Since fail_loc and fail_val are common for all services on a node,
it is only necessary to set it once per node instead of once per
facet.  That avoids a bunch of extra remote commands and spew in
the test output.
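
The once-per-node idea can be sketched as a deduplication of the facet
list. The facet and node names below are made up for the example, and
nodes_for_facets() is not a function from the test framework.

```python
def nodes_for_facets(facet_to_node):
    """Return each node once, in first-seen order."""
    seen = []
    for node in facet_to_node.values():
        if node not in seen:
            seen.append(node)
    return seen


# Two OST facets share one node, so only two remote commands are needed.
targets = nodes_for_facets({"mds1": "nodeA", "ost1": "nodeB", "ost2": "nodeB"})
```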

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I9a69eb325fa80a90d929ab9a258cce21973ebbe5
Reviewed-on: http://review.whamcloud.com/8978
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4356 kernel: kernel update [SLES11 SP2 3.0.101-0.7] 51/8951/3
Bob Glossman [Mon, 6 Jan 2014 23:02:16 +0000 (15:02 -0800)]
LU-4356 kernel: kernel update [SLES11 SP2 3.0.101-0.7]

update target and config files for new version

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ia5a9472029e7134b3f6b997506371f5e5e624797
Reviewed-on: http://review.whamcloud.com/8951
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4102 utils: add dry-run to ll_recover_lost_found_objs 61/8061/6
Andreas Dilger [Thu, 24 Oct 2013 08:11:54 +0000 (02:11 -0600)]
LU-4102 utils: add dry-run to ll_recover_lost_found_objs

Add the dry-run (-n) option to ll_recover_lost_found_objs.  This
allows scanning an OST filesystem without modifying it.

It is now possible to test both the "lost+found/" directory, as well
as the "O/" directory to verify the LMA and FID xattrs on existing
objects that are in the normal filesystem hierarchy.

Fix verbose (-v) option to print all of the inodes being checked.
This option previously did nothing.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8e2b427046c6acf5cf7429f41ccd57496c500c1e
Reviewed-on: http://review.whamcloud.com/8061
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4597 clio: clear nowait flag agl lock re-enqueue 49/9249/2
Niu Yawei [Thu, 13 Feb 2014 07:07:14 +0000 (02:07 -0500)]
LU-4597 clio: clear nowait flag agl lock re-enqueue

The LDLM_FL_BLOCK_NOWAIT flag should be cleared when re-enqueuing
the AGL lock as a normal glimpse; otherwise, it won't get the size
back if there are conflicting locks on another client.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ifd311606e824d6574bfbf3256841061e8867214a
Reviewed-on: http://review.whamcloud.com/9249
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Ned Bass <bass6@llnl.gov>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4529 quota: call qsd_op_end() after trasn stop 68/8968/3
Niu Yawei [Thu, 23 Jan 2014 04:09:54 +0000 (23:09 -0500)]
LU-4529 quota: call qsd_op_end() after trasn stop

qsd_op_end() shouldn't be called before the transaction is stopped,
because qsd_op_end() is quite a heavy operation that may allocate
memory with the standard allocator flag (__GFP_IO). Allocating
memory can result in a dirty flush on other filesystems, which
leads to opening a transaction on a different journal and
eventually triggers the assertion in jbd2_journal_start():
J_ASSERT(handle->h_transaction->t_journal == journal).

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I4ea3ff011fa7e44460b9912050e90b174813e01a
Reviewed-on: http://review.whamcloud.com/8968
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
10 years ago LU-4488 build: fix compilation with --enable-invariants 53/8853/10
Dmitry Eremin [Wed, 15 Jan 2014 08:46:04 +0000 (12:46 +0400)]
LU-4488 build: fix compilation with --enable-invariants

Fix the build which was broken since the following commit:

    commit 0a259bd7dbac76d75b89a389bc317720153aa452
    Author: Jinshan Xiong <jinshan.xiong@intel.com>
    Date:   Mon Sep 30 15:00:38 2013 -0700

    LU-3321 clio: collapse layer of cl_page

    Move radix tree to osc layer to for performance improvement.

    Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
    Change-Id: I93e3cb8352f7be41c23465b12945874316aa1809
    Reviewed-on: http://review.whamcloud.com/7892
    Tested-by: Jenkins
    Tested-by: Maloo <hpdd-maloo@intel.com>
    Reviewed-by: Lai Siyao <lai.siyao@intel.com>
    Reviewed-by: Bobi Jam <bobijam@gmail.com>
    Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ie8543513852d98d0aa82bca0f227d286cdf8ebd2
Reviewed-on: http://review.whamcloud.com/8853
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4422 tests: disable sanity-quota test_6 temporary 03/9203/3
Fan Yong [Sat, 8 Feb 2014 01:08:18 +0000 (09:08 +0800)]
LU-4422 tests: disable sanity-quota test_6 temporary

To keep other patches from failing because of LU-4422 under DNE.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia48f7bb14733f6bdc5ef34bde1ce91b5762f9192
Reviewed-on: http://review.whamcloud.com/9203
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4482 grant: don't use cache data in osd_statfs() 11/8911/5
Niu Yawei [Fri, 17 Jan 2014 02:56:29 +0000 (21:56 -0500)]
LU-4482 grant: don't use cache data in osd_statfs()

osd_statfs() shouldn't cache statfs data anymore. The statfs data
is already cached in the ofd layer, so another cache in the osd
layer is redundant. More importantly, the grant mechanism relies
on dt_statfs() returning fresh statfs data, so caching statfs data
in the osd layer would break grant.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I89b6384cc59d77b1edb0412f24b5c8e823532170
Reviewed-on: http://review.whamcloud.com/8911
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
10 years ago LU-3593 lfsck: repair inconsistent layout EA 56/7456/26
Fan Yong [Fri, 31 Jan 2014 06:55:02 +0000 (14:55 +0800)]
LU-3593 lfsck: repair inconsistent layout EA

The layout EA stored on the MDT-object records not only the file
layout but also some information which indicates the layout owner,
such as lov_mds_md.lmm_oi. This information is generated from the
MDT-object FID, and it tells us which file the layout EA belongs to.
In LFSCK phase II, we need to verify whether this information in the
layout EA is correct by recalculating it from the MDT-object
FID. If an inconsistency is found, trust the MDT-object FID rather
than the FID information in the layout EA, and repair the latter.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I3c31e19e9fabe66fe7ffdba2fe8569795ae49b4a
Reviewed-on: http://review.whamcloud.com/7456
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4472 mdc: Fix mdc_page_locate ASSERT 21/8821/7
Nathaniel Clark [Fri, 10 Jan 2014 22:08:19 +0000 (17:08 -0500)]
LU-4472 mdc: Fix mdc_page_locate ASSERT

Storing hash 0 in the same place as hash 1 should be okay; don't
ASSERT in this case.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I9b9488e5374d22dcfc9e7f2c969da3b02778097a
Reviewed-on: http://review.whamcloud.com/8821
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3531 llite: fix "lfs getdirstripe" to show stripe info 28/7228/42
wang di [Wed, 31 Jul 2013 17:46:09 +0000 (10:46 -0700)]
LU-3531 llite: fix "lfs getdirstripe" to show stripe info

Fix "lfs getdirstripe" so that it can show the layout information
of a striped directory:

[root@testnode tests]# ../utils/lfs getdirstripe /mnt/lustre/test1
/mnt/lustre/test1
lmv_stripe_count: 2
lmv_stripe_offset: 0
mdtidx  FID[seq:oid:ver]
     0  [0x280000400:0x1:0x0]
     1  [0x2c0000400:0x1:0x0]

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I586f78ee2e0c35d8c3ed10726d5f5e12a4b543e7
Reviewed-on: http://review.whamcloud.com/7228
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-3529 lod: create striped directory 96/7196/45
wang di [Wed, 31 Jul 2013 07:00:40 +0000 (00:00 -0700)]
LU-3529 lod: create striped directory

1. Add "lfs setdirstripe -i -c" to create a striped
directory.

2. The client sends the create request to the master MDT, which
will allocate FIDs for all of the slaves and create them.

3. The client needs to revalidate the slaves during intent getattr
and open requests.

4. lmv_stripe_md will include attributes (size, nlink, etc.)
from all of the stripes, which will be protected by the UPDATE lock.
The client needs to merge these attributes when updating the inode.

5. Send the create request to the MDT where the file is located,
which helps with creating the master stripe of the striped directory.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I7ac560e39dcb415e310dc5e6ade531d76227ffae
Reviewed-on: http://review.whamcloud.com/7196
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
10 years ago LU-4196 build: Reenable OFED-3.5 support on SLES11 84/8884/7
James Simmons [Wed, 12 Feb 2014 16:43:15 +0000 (11:43 -0500)]
LU-4196 build: Reenable OFED-3.5 support on SLES11

With the merge of LU-4266, support for SLES11 with
OFED-3.5 was accidentally removed. This patch restores
that support.

Change-Id: If70a8815d90e7d1fa998ba82c4dac5a384216353
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/8884
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4269 ldlm: Hold lock when clearing flag 72/8772/5
Li Xi [Wed, 8 Jan 2014 09:13:16 +0000 (17:13 +0800)]
LU-4269 ldlm: Hold lock when clearing flag

This patch moves the clearing of the lock's skip flag from the
lru-delete code to the lru-add code, to prevent clearing the lock's
flag without resource lock protection.

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I5cce4699833c2a935e418bdd7181a2151612a8be
Reviewed-on: http://review.whamcloud.com/8772
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years ago LU-4290 llog: discard unavailable records and keep going 81/9281/2
Alex Zhuravlev [Fri, 14 Feb 2014 19:07:58 +0000 (23:07 +0400)]
LU-4290 llog: discard unavailable records and keep going

If llog can't process some records due to I/O errors or
corruption, just discard them from the header and keep
going. A new test is added to verify this behavior.
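
The discard-and-continue behavior can be sketched as a loop that records
failures instead of aborting. This is a simplified model: process_llog()
and the choice of exception types are assumptions, and the real code
works on llog headers rather than Python lists.

```python
def process_llog(records, handler):
    """Run handler on every record, discarding the ones that fail."""
    processed, discarded = [], []
    for index, record in enumerate(records):
        try:
            handler(record)
        except (OSError, ValueError):
            # Stand-ins for an I/O error or a corrupted record.
            discarded.append(index)
            continue
        processed.append(index)
    return processed, discarded


def sample_handler(record):
    """Hypothetical handler that rejects one corrupted record."""
    if record == "bad":
        raise ValueError("corrupted record")
```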

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Id0dc83ae6239cd55a43eec128b3c750bb9f0894a
Reviewed-on: http://review.whamcloud.com/9281
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4505 quota: race of edquot updating 54/8954/3
Niu Yawei [Wed, 22 Jan 2014 04:24:00 +0000 (23:24 -0500)]
LU-4505 quota: race of edquot updating

The slave edquot flag could be set mistakenly as follows:

- slave A acquires quota from the master; the master finds that the
  user is running out of quota and sets edquot in the reply;
- another slave deletes files and releases quota to the master;
  the master clears edquot and notifies all slaves by glimpse;
- the glimpse reaches slave A before the reply to dqacq, so the
  edquot flag ends up set on slave A.

Given that edquot can't be fully trusted, it should only be
revalidated every 5 seconds on the sync acquire path.
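
A time-based revalidation of this kind might look like the sketch below.
The class, its fields and the explicit `now` argument are hypothetical;
only the 5-second interval comes from the text above.

```python
EDQUOT_REVALIDATE_SECONDS = 5  # interval named in the commit message


class SlaveQuotaEntry:
    """Hypothetical per-ID state kept by a quota slave."""

    def __init__(self):
        self.edquot = False
        self.edquot_set_at = None

    def set_edquot(self, value, now):
        """Record the flag together with the time it was last updated."""
        self.edquot = value
        self.edquot_set_at = now

    def needs_revalidation(self, now):
        """True when a set edquot flag is too old to trust on a sync acquire."""
        if not self.edquot:
            return False
        return now - self.edquot_set_at >= EDQUOT_REVALIDATE_SECONDS
```

With this model a stale edquot left behind by the race above is checked
against the master again once it is at least five seconds old, instead
of being trusted indefinitely.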

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Id4db47462bdf620a42cd31f75726fbcaff869179
Reviewed-on: http://review.whamcloud.com/8954
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years ago LU-4620 kernel: kernel update [RHEL6.5 2.6.32-431.5.1.el6] 53/9253/3
Bob Glossman [Thu, 13 Feb 2014 01:08:18 +0000 (17:08 -0800)]
LU-4620 kernel: kernel update [RHEL6.5 2.6.32-431.5.1.el6]

update RHEL6.5 kernel to 2.6.32-431.5.1.el6

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I5b11d0b0e1c3749232caf8d21d365ae351f538aa
Reviewed-on: http://review.whamcloud.com/9253
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4613 tests: purge older request result in test_12o 35/9235/2
Bruno Faccini [Wed, 12 Feb 2014 09:52:07 +0000 (10:52 +0100)]
LU-4613 tests: purge older request result in test_12o

The sanity-hsm/test_12o sub-test, which was introduced as part
of LU-3834, submits 2 RESTORE requests for the same FID and thus
needs to purge the 1st result from the log before checking the 2nd.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ia2a0ead487b29a68c8a920bae2aa1d654eac4051
Reviewed-on: http://review.whamcloud.com/9235
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2687 test: add b2_4 zfs image for conf-sanity test_32a 93/7193/17
Wei Liu [Mon, 22 Jul 2013 22:07:08 +0000 (15:07 -0700)]
LU-2687 test: add b2_4 zfs image for conf-sanity test_32a

In order to ensure that we do not break ZFS upgrades
in the future, add 2.4.0 zfs filesystem test image for
conf-sanity.sh test_32a.

Test-Parameters: mdtfilesystemtype=zfs \
ostfilesystemtype=zfs mdsfilesystemtype=zfs \
envdefinitions=SLOW=yes testlist=conf-sanity

Change-Id: Iae560e05b428907409dc7069d30b601b52750cca
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/7193
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1778 libcfs: add a service that prints a nidlist 21/9221/2
Gregoire Pichon [Tue, 11 Feb 2014 09:40:54 +0000 (10:40 +0100)]
LU-1778 libcfs: add a service that prints a nidlist

The libcfs already provides services to parse a string into a nidlist
and to match a nid into a nidlist. This patch implements a service
that prints a nidlist into a buffer.

This is required for instance to print the nosquash_nids parameter
of the MDT procfs component.

Additionally, this patch fixes a bug in return code of
parse_addrange() routine, so that parsing of nids including
a * character works fine ('*@elan' for instance).

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: I518845b03b34ab5a1e2cbc673c58c5a384702930
Reviewed-on: http://review.whamcloud.com/9221
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1032 build: Honor --disable-modules option in spec file 20/6020/8
Brian Behlendorf [Tue, 9 Apr 2013 21:21:46 +0000 (14:21 -0700)]
LU-1032 build: Honor --disable-modules option in spec file

All the way back to 2004 Lustre has supported an option to
disable the compilation of the kernel modules.  This can be useful
because there are situations where only the user space components
are required.

For example, the Lustre kernel modules may be either a) provided
by the kernel, or b) provided as a dkms package.  In both of these
cases it's desirable to be able to build the lustre package without
building the lustre-modules subpackage.

The patch adds that missing functionality to the existing lustre
spec file by leveraging the existing --disable-modules configure
option.

Additionally, a small fix was made to lustre/quota/autoMakefile.am
because it didn't properly support the --disable-modules option.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ic4f4f7f19da9951b47c587399a71c42fb0e720d0
Reviewed-on: http://review.whamcloud.com/6020
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
10 years agoLU-4577 lnet: Dropped messages are not accounted correctly 96/9096/3
Matt Ezell [Mon, 3 Feb 2014 18:19:48 +0000 (13:19 -0500)]
LU-4577 lnet: Dropped messages are not accounted correctly

LNET messages that are dropped are not accounted for correctly in
/proc/sys/lnet/stats. What I assume to be a simple typo is causing
drop_length to be double-counted and drop_count to never be
incremented.
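
For illustration, a small Python model of the symptom (the exact
offending line is not quoted here, so account_drop_buggy() is only one
way such a typo could look; DropStats is a hypothetical stand-in for
the LNET counters):

```python
class DropStats:
    """Hypothetical stand-in for the two drop counters."""

    def __init__(self):
        self.drop_count = 0
        self.drop_length = 0


def account_drop_buggy(stats, nob):
    # Symptom described above: the length is added twice and the
    # count is never touched.
    stats.drop_length += nob
    stats.drop_length += nob


def account_drop_fixed(stats, nob):
    # One dropped message of nob bytes: bump the count once and add
    # the length once.
    stats.drop_count += 1
    stats.drop_length += nob
```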

Signed-off-by: Matt Ezell <ezellma@ornl.gov>
Change-Id: I761c62b1f3c4c4ceffbe47008b79692a6e643458
Reviewed-on: http://review.whamcloud.com/9096
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4382 ldiskfs: add quota credit for ldiskfs_delete_inode 87/9187/3
Bobi Jam [Sat, 8 Feb 2014 07:02:13 +0000 (15:02 +0800)]
LU-4382 ldiskfs: add quota credit for ldiskfs_delete_inode

In ldiskfs_delete_inode() we missed the journal credits that may be
needed for a journaled quota change; this patch adds them.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic4ef8030b5d9743b18f0417dde702f60ccdaf5d7
Reviewed-on: http://review.whamcloud.com/9187
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4293 mdd: Allow layout swap for IGIF FIDs 37/8737/9
Bruno Faccini [Mon, 6 Jan 2014 09:25:47 +0000 (10:25 +0100)]
LU-4293 mdd: Allow layout swap for IGIF FIDs

Patch to also allow layout swap for pre-2.x migrated
files (ie, IGIF FID with linkEA).

Root user special case has also been added to lfs/migrate
command to map owner/group of original file to
volatile, in order to comply with other layout_swap rules.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ia7e7cb2e6e36ba67a57474b8a806a53257a3e014
Reviewed-on: http://review.whamcloud.com/8737
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
10 years agoLU-4543 osd: return actual hash value for a record 18/9218/3
Alex Zhuravlev [Tue, 11 Feb 2014 10:06:47 +0000 (14:06 +0400)]
LU-4543 osd: return actual hash value for a record

hash value should be fetched only once we've got a record.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: If02a5ba0c85c0230dea799445a8b985ed1a6fbae
Reviewed-on: http://review.whamcloud.com/9218
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4386 osc: don't activate deactivated obd_import 47/8747/4
Hongchao Zhang [Thu, 5 Sep 2013 13:50:48 +0000 (21:50 +0800)]
LU-4386 osc: don't activate deactivated obd_import

ptlrpc_activate_import() should check obd_import->imp_deactive
and not activate a deactivated import; otherwise it will trigger an
LBUG in ptlrpc_invalidate_import():

  ptlrpc_invalidate_import() ASSERTION(imp->imp_invalid) failed

Change-Id: I4c16f166c0c2cf60664119bf438dfd8606d71a2f
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/8747
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4625 gss: fixup for shared key mechanism & flavors 87/9287/2
Dmitry Eremin [Mon, 17 Feb 2014 09:36:09 +0000 (13:36 +0400)]
LU-4625 gss: fixup for shared key mechanism & flavors

Fixup for commit 6323d52abfe4cf1eda06b4ac3a5b325d9fa13276.
The new file lustre/ptlrpc/gss/gss_sk_mech.c was added but
not listed in the Makefile.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I7cc1f0df848877e2ad07ad89b0ad1a0182374a96
Reviewed-on: http://review.whamcloud.com/9287
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-4319 build: Clean up rpms/srpm Make targets 26/8426/7
Christopher J. Morrone [Wed, 27 Nov 2013 22:05:50 +0000 (14:05 -0800)]
LU-4319 build: Clean up rpms/srpm Make targets

The "rpms" and "srpm" targets were unnecessarily complicated.  The rpms
target in particular has a very long shell script embedded in the
autoMakefile, which is not especially desirable.  Because of the embedded
shell script with its associated backslashes, we didn't use standard
autoconf/automake macros because we didn't want shell comments to appear
after line continuation.  To get around that, we need another layer of
variables to convert autoconf/automake variables into Make variables.

It gets rather difficult to read and modify.

Instead we move the scripting into autoconf m4 files, where scripting
is much easier (little line continuations necessary, far fewer escapes needed).
We also have direct access to the original variables, so we don't need
to hop through two or three files before we eventually find where
a variable gets set.

All of the decisions are made at configure time anyway, so constructing
the command line options for rpmbuild at configure time is the Right Thing
to do.

A nice side effect of this change is that one can now easily look at
the autoMakefile after running "./configure" and see exactly the command
line that will be passed to rpmbuild.

Change-Id: I10fcfa740d9e901805615c2262263cc1ea8552bf
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-on: http://review.whamcloud.com/8426
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4525 lfsck: distinguish objects visibility by LFSCK 86/9186/4
Fan Yong [Fri, 31 Jan 2014 03:55:59 +0000 (11:55 +0800)]
LU-4525 lfsck: distinguish objects visibility by LFSCK

Originally, the ldiskfs backend otable-based iteration only returned
namespace-visible FIDs, which meant that the OSD had to distinguish
object visibility. But the OSD should not have knowledge of
object visibility; it is up to the iteration caller, LFSCK, to
make that distinction itself.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I1eef4041170e856af00a4b222d053ccb3d8d0023
Reviewed-on: http://review.whamcloud.com/9186
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4369 build: make --disable-ldiskfs option workable 83/8883/4
Dmitry Eremin [Thu, 16 Jan 2014 16:49:22 +0000 (20:49 +0400)]
LU-4369 build: make --disable-ldiskfs option workable

Building ldiskfs is enabled by default, so we need to disable it
if --disable-ldiskfs option is specified to ./configure

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I7ffb013b976e870de32d38b669a1437f8388bbda
Reviewed-on: http://review.whamcloud.com/8883
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4612 lvfs: correct call to pop_ctxt 24/9224/4
Bob Glossman [Tue, 11 Feb 2014 16:51:13 +0000 (08:51 -0800)]
LU-4612 lvfs: correct call to pop_ctxt

Earlier commit 3e7573cc14a331f01150814495e2345793e22f06, which converted
a call of osd_pop_ctxt() to pop_ctxt(), ignored the fact that the argument
order of these routines was different. This led to panics.

This patch fixes the pop_ctxt() call by putting the arguments
in the correct order.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I352250fe3ed91cb8a23a5b8e88b944dc8309b481
Reviewed-on: http://review.whamcloud.com/9224
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3373 osd-ldiskfs: export ext4_truncate 74/9174/2
Bobi Jam [Fri, 7 Feb 2014 06:37:23 +0000 (14:37 +0800)]
LU-3373 osd-ldiskfs: export ext4_truncate

The latest kernel removes the inode_operations::truncate member, while
the SLES11 kernel still keeps the member but does not fill it
for regular files.

This patch exports symbol for ext4_truncate, and calls it directly
in osd_punch() if the regular file inode operation does not fill its
truncate function.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I42477ca3f1a56e9c0870a641431936298f6d71b5
Reviewed-on: http://review.whamcloud.com/9174
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3289 gss: Shared key mechanism & flavors 29/8629/5
Andrew Korty [Thu, 19 Dec 2013 22:13:17 +0000 (14:13 -0800)]
LU-3289 gss: Shared key mechanism & flavors

Implement security flavors and GSSAPI mechanism to perform shared key
authentication (ski) and encryption (skpi).

Signed-off-by: Andrew Korty <ajk@iu.edu>
Change-Id: I48855c098965fcf527b3949c6dfb181d457b4ca5
Reviewed-on: http://review.whamcloud.com/8629
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Ken Hornstein <kenh@cmf.nrl.navy.mil>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1267 lfsck: enhance API for MDT-OST consistency 56/7156/33
Fan Yong [Thu, 30 Jan 2014 19:05:05 +0000 (03:05 +0800)]
LU-1267 lfsck: enhance API for MDT-OST consistency

Introduce new dt_object method ::do_declare_attr_get(). The caller
can use such method to notify low layer that "It will need the OST
object attribute very soon, please help to prepare in advance". For
the LFSCK layout consistency verification, the osp_declare_attr_get()
will use UPDATE_OBJ RPC with sub_opcode OBJ_ATTR_GET.

Similarly, another new dt_object method ::do_declare_xattr_get()
is used to notify low layer that "It will need the OST object xattr
very soon, please help to prepare in advance", which uses UPDATE_OBJ
RPC with sub_opcode OBJ_XATTR_GET.

These idempotent requests can be batched together during the phase
of declaration and sent out via single OUT RPC. It can be shared by
any thread that wants to send idempotent requests to the same OST.

Introduce cache in OSP for remote object's attribute and extended
attribute. Currently, it is mainly used to hold those pre-fetched
OST-objects' kinds of attr/parent FID. But it also can be used by
DNE for other purposes in the future.

For performance, the batched idempotent OUT RPC uses asynchronous
mode. Every sub-operation needs to register its own interpreter.
These interpreters will be called one by one after the OST replies
to the batched OUT RPC.
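
The batching and per-sub-operation interpreters can be pictured with
the following Python sketch. It is a synchronous toy model under
stated assumptions: BatchedRequest, add() and send() are hypothetical
names, and the transport callable stands in for the single OUT RPC.

```python
class BatchedRequest:
    """Collects sub-operations and sends them as one request."""

    def __init__(self):
        self._subs = []

    def add(self, op, interpreter):
        """Declare a sub-operation and register its own interpreter."""
        self._subs.append((op, interpreter))

    def send(self, transport):
        """Send all declared sub-operations in a single round trip.

        transport takes the list of ops and returns one reply per op,
        in the same order. After the reply arrives, the interpreters
        are called one by one.
        """
        ops = [op for op, _ in self._subs]
        replies = transport(ops)
        for (_, interpreter), reply in zip(self._subs, replies):
            interpreter(reply)
        self._subs = []
```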

Implement do_attr_get() against OSP-object for the MDT to get OST
object attribute.

Implement do_xattr_get() against OSP-object for the MDT to get OST
object parent FID attribute.

Implement do_xattr_set() against OSP-object for the MDT to set OST
object parent FID extended attribute.

Some code cleanup and re-organization, such as moving transaction
related code from osp/osp_md_object.c to osp/osp_trans.c, moving
common OUT code from osp/osp_md_object.c to target/out_lib.c.

Originally, only DNE operations used OUT RPCs, so they used sync
mode transactions. But with LFSCK phase II introduced, sync mode OUT
RPC processing performs badly. The patch improves the related functions
to allow LFSCK to use async mode transactions for OUT RPC processing,
while DNE related operations still use sync mode transactions.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4fe99f96ad24d43c1edea3a4a16b7ed206c38c4f
Reviewed-on: http://review.whamcloud.com/7156
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4209 utils: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict 12/8312/10
Andreas Dilger [Mon, 18 Nov 2013 09:47:26 +0000 (02:47 -0700)]
LU-4209 utils: fix O_TMPFILE/O_LOV_DELAY_CREATE conflict

In kernel 3.11 O_TMPFILE was introduced, but the open flag value
conflicts with the O_LOV_DELAY_CREATE flag 020000000 added to fix
LU-812 in Lustre 2.4.  O_LOV_DELAY_CREATE allows applications
to defer file layout and object creation from open time (the default)
until it can instead be specified by the application using an ioctl.

Instead of trying to find a non-conflicting O_LOV_DELAY_CREATE flag
or define a Lustre-specific flag that isn't of use to most/any other
filesystems, use (O_NOCTTY|FASYNC) as the new value.  These flags
are not meaningful for newly-created regular files and should be
ok since O_LOV_DELAY_CREATE is only meaningful for new files.

I looked into using O_ACCMODE/FMODE_WRITE_IOCTL, which allows calling
ioctl() on the minimally-opened fd and is close to what is needed,
but that doesn't allow specifying the actual read or write mode for
the file, and fcntl(F_SETFL) doesn't allow O_RDONLY/O_WRONLY/O_RDWR
to be set after the file is opened.

We will keep the 0100000000 flag for backward compatibility until
3.13 is the oldest client kernel that is supported, but drop the
conflicting __O_TMPFILE value of 020000000 since that will cause an
error when running on newer kernels.  The 020000000 has only been
used since Lustre 2.4.0 and always in conjunction with 0100000000,
so any apps that used O_LOV_DELAY_CREATE directly instead of calling
llapi_file_create*() will still work until Linux 3.13 is used.
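
The bit overlap can be checked with a few lines of Python. This is a
sketch under assumptions: the octal values are the x86 Linux ones, the
constant names are illustrative, and the composition of the new value
(legacy flag plus O_NOCTTY|FASYNC) is one reading of the text above.

```python
# Octal open-flag values as on x86 Linux; other architectures may differ.
O_NOCTTY = 0o400
FASYNC = 0o20000
TMPFILE_BIT = 0o20000000       # __O_TMPFILE, introduced in kernel 3.11
LEGACY_FLAG = 0o100000000      # older flag kept for backward compatibility

# Value used since Lustre 2.4.0: legacy flag plus the bit that now
# collides with __O_TMPFILE.
OLD_DELAY_CREATE = LEGACY_FLAG | 0o20000000
# New value: legacy flag plus (O_NOCTTY|FASYNC).
NEW_DELAY_CREATE = LEGACY_FLAG | O_NOCTTY | FASYNC


def conflicts_with_tmpfile(flags):
    """Return True if flags would be seen as carrying __O_TMPFILE."""
    return bool(flags & TMPFILE_BIT)
```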

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I565f3454616edc60c6acee01034aa5d773500c1e
Reviewed-on: http://review.whamcloud.com/8312
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4554 lfsck: old single-OI MDT always scrubbed 67/9067/2
Ned Bass [Thu, 30 Jan 2014 22:56:20 +0000 (14:56 -0800)]
LU-4554 lfsck: old single-OI MDT always scrubbed

Old ldiskfs MDTs that contain a single OI container named "oi.16"
trigger an automatic OI scrub on each restart.  This is because
osd_oi_table_open() gets ENOENT opening "oi.16.0" and consequently
sets bit 0 in scrub_file::sf_oi_bitmap.  This bit indicates the OI
container 0 needs to be recreated, and it triggers a scrub in
osd_fid_lookup() for lookups that fail with ENOENT.  Fix this by
clearing the bit in osd_oi_init() after a successful open of
"oi.16".
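
A toy Python model of the bitmap handling (open_oi_table() and
oi_init() here are simplified, hypothetical stand-ins that take the
set of container names present on disk; they are not the osd code):

```python
def open_oi_table(existing, bitmap=0):
    """Try the multi-OI name; a missing "oi.16.0" sets bit 0."""
    if "oi.16.0" not in existing:
        bitmap |= 1 << 0  # container 0 flagged as needing recreation
    return bitmap


def oi_init(existing):
    """Fall back to the old single-OI name and clear the stale bit."""
    bitmap = open_oi_table(existing)
    if "oi.16" in existing:
        # The single container opened fine, so nothing needs to be
        # recreated and no scrub should be triggered.
        bitmap &= ~(1 << 0)
    return bitmap
```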

Signed-off-by: Ned Bass <bass6@llnl.gov>
Change-Id: Ie69223d3f8289c90de46f9afe0a2de0e0625b0f6
Reviewed-on: http://review.whamcloud.com/9067
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
10 years agoLU-1032 build: Add Lustre DKMS spec file 19/6019/5
Brian Behlendorf [Tue, 9 Apr 2013 04:46:40 +0000 (21:46 -0700)]
LU-1032 build: Add Lustre DKMS spec file

Add a lustre-dkms.spec file which can be used to distribute dkms
style Lustre modules.  The spec file is originally based on the
generic dkms template and the default behavior is as follows:

* Disable ldiskfs osd support.  The ldiskfs packages currently
  cannot be built reliably against arbitrary kernels and are
  therefore disabled by default.

* Enable zfs osd support.  ZFS dkms packages are hosted at
  http://archive.zfsonlinux.org/{epel,fedora}/{release}/ and
  are compatible once LU-3117 is merged in to the Lustre source.

* Some of the default Lustre build options can be changed by
  setting parameters in the /etc/sysconfig/lustre config file.
  Going forward the options can be extended as needed.  The
  currently supported options are:

    * LUSTRE_DKMS_DISABLE_CDEBUG=y|N
    * LUSTRE_DKMS_DISABLE_TRACE=y|N
    * LUSTRE_DKMS_DISABLE_ASSERT=y|N
    * LUSTRE_DKMS_DISABLE_STRIP=y|N

* A build target was not added for the lustre-dkms.spec file.
  To create lustre dkms packages you must manually invoke rpmbuild.

    ./configure --enable-dist
    make dist
    rpmbuild -bs lustre-dkms.spec lustre-x.y.z.tar.gz
    rpmbuild --rebuild lustre-dkms-x.y.z-r.dist.src.rpm

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I870f362b8948d5cd28a8dccd98b565e38ad2da7c
Reviewed-on: http://review.whamcloud.com/6019
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
10 years agoLU-4512 hsm: Fix lhsmtool_posix --report option 34/9034/2
Michael MacDonald [Mon, 20 Jan 2014 17:08:28 +0000 (12:08 -0500)]
LU-4512 hsm: Fix lhsmtool_posix --report option

The --report option is intended to allow an override of the
default copytool progress reporting interval, but it doesn't
work. This commit implements the intended functionality and
renames the option to "--update-progress", or "-u" for short.

Also fixes the progress display in hsm/active_requests to
reflect the change from percentage complete to bytes moved.

Signed-off-by: Michael MacDonald <michael.macdonald@intel.com>
Change-Id: Id6ead1b33868e3454f00053165944bc3900cabb4
Reviewed-on: http://review.whamcloud.com/9034
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4287 kernel: kernel update RHEL6.5 [2.6.32-431.3.1.el6] 49/8549/25
yangsheng [Wed, 8 Jan 2014 16:03:17 +0000 (00:03 +0800)]
LU-4287 kernel: kernel update RHEL6.5 [2.6.32-431.3.1.el6]

Add RHEL6.5 support [2.6.32-431.3.1.el6]

ext4 in RHEL6.5's kernel version 2.6.32-431.3.1.el6 no longer contains
the required function ext4_ext_walk_space(). We start a new rhel6.5
ldiskfs patch series and reintroduce ext4_ext_walk_space() through a
new patch, copying ext4_ext_walk_space() from the older rhel6.4 kernel
2.6.32-358.23.2.el6.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: I7112747970343b1264910aa21d7a62c45b5ca1ea
Reviewed-on: http://review.whamcloud.com/8549
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4571 fld: resend seq lookup RPC if it is on LWP 06/9106/4
wang di [Mon, 3 Feb 2014 21:19:21 +0000 (13:19 -0800)]
LU-4571 fld: resend seq lookup RPC if it is on LWP

Because a Light Weight connection might be evicted after a
restart, causing in-flight RPCs to fail, we need to resend the
seq lookup RPC.

Remove "-f" from "stop mdt" in sanity 17m so that umount can
keep the connection; otherwise the OSP might be
evicted.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I032dfb95e65da56b198129c6d6d6039bad08ab9c
Reviewed-on: http://review.whamcloud.com/9106
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
10 years agoLU-4590 ptlrpc: Remove log message about export timer update 47/9147/2
Cheng Shao [Wed, 5 Feb 2014 20:32:48 +0000 (12:32 -0800)]
LU-4590 ptlrpc: Remove log message about export timer update

The function ptlrpc_update_export_timer generates lots of D_HA level log
messages whenever the export timer gets updated. Those log messages
are of little use for issue investigation and take up space
in the Lustre log buffer, so we are removing them.

Xyratex-bug-id: MRP-733
Signed-off-by: Cheng Shao <cheng_shao@xyratex.com>
Change-Id: I3699e81fd4bf0b8677c1fbd09ced5d81ffba3f81
Reviewed-on: http://review.whamcloud.com/9147
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4589 kernel: kernel update [SLES11 SP3 3.0.101-0.15] 49/9149/2
Bob Glossman [Wed, 5 Feb 2014 19:04:54 +0000 (11:04 -0800)]
LU-4589 kernel: kernel update [SLES11 SP3 3.0.101-0.15]

update target and config files for new kernel version

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I91a003a5b1947287265b06f16eb5cb9c817d5758
Reviewed-on: http://review.whamcloud.com/9149
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4586 ptlrpc: cast type in the switch op 30/9130/2
Alex Zhuravlev [Wed, 5 Feb 2014 11:08:52 +0000 (15:08 +0400)]
LU-4586 ptlrpc: cast type in the switch op

This should allow building with gcc-4.7.2.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I489ea927d3dc87a7b01f57c5d390612c015b8c47
Reviewed-on: http://review.whamcloud.com/9130
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <pkuelelixi@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoRevert "LU-1778 libcfs: add a service that prints a nidlist" 78/9178/2
Oleg Drokin [Fri, 7 Feb 2014 14:08:12 +0000 (14:08 +0000)]
Revert "LU-1778 libcfs: add a service that prints a nidlist"

Whoops, this patch broke build: http://build.whamcloud.com/job/lustre-master/arch=x86_64,build_type=client,distro=ubuntu1004,ib_stack=inkernel/1879/changes

So I am reverting it.

This reverts commit 874f67c06da8304a194df5fc0dd5a2c61937076c.

Change-Id: Ieb36ba5c909bc3731dc4a925d89773be89ab64ec
Reviewed-on: http://review.whamcloud.com/9178
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1778 libcfs: add a service that prints a nidlist 79/8479/6
Gregoire Pichon [Wed, 4 Dec 2013 13:57:10 +0000 (14:57 +0100)]
LU-1778 libcfs: add a service that prints a nidlist

The libcfs already provides services to parse a string into a nidlist
and to match a nid into a nidlist. This patch implements a service
that prints a nidlist into a buffer.

This is required for instance to print the nosquash_nids parameter
of the MDT procfs component.

Additionally, this patch fixes a bug in return code of
parse_addrange() routine, so that parsing of nids including
a * character works fine ('*@elan' for instance).

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: I5dbc405e02b8f0f90d45e1a7e44589d5972cc384
Reviewed-on: http://review.whamcloud.com/8479
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3950 lfsck: control LFSCK on all devices via single command 65/7665/28
Fan Yong [Fri, 24 Jan 2014 19:45:42 +0000 (03:45 +0800)]
LU-3950 lfsck: control LFSCK on all devices via single command

Under DNE mode, it is more convenient for the administrator to control
the LFSCK (start/stop) on all the MDT devices via single command. Such
functionality is not only useful for DNE consistency verification, but
also for layout consistency (Phase II). It is also required for orphan
OST-objects scanning.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie0d4611f969e51b80faf27b52dbdaee41caf5187
Reviewed-on: http://review.whamcloud.com/7665
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4540 llite: deadlock for page write 36/9036/2
Jinshan Xiong [Tue, 28 Jan 2014 22:31:36 +0000 (14:31 -0800)]
LU-4540 llite: deadlock for page write

The writing thread has already locked page #1 and then waits for the
Writeback bit of page #2;

The ptlrpc thread is composing a write RPC, so it sets Writeback on
page #2 and tries to lock page #1 to make it ready.

Deadlocked.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I2da547b4c93c3464e520a1f593985adae9360bc9
Reviewed-on: http://review.whamcloud.com/9036
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3321 clio: optimize read ahead code 23/8523/12
Jinshan Xiong [Fri, 3 Jan 2014 17:58:56 +0000 (09:58 -0800)]
LU-3321 clio: optimize read ahead code

The readahead code used to check that each page in the readahead window
was covered by a lock underneath; now cpo_page_is_under_lock() provides
@max_index to help decide the maximum ra window. @max_index can be
modified by OSC to extend the maximum lock region, to align to the stripe
boundary at LOV, and to make sure the readahead region at least covers
the read region at the LLITE layer.

After this is done, usually readahead code calls
cpo_page_is_under_lock() for each stripe.
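
To show why this reduces the number of checks, here is a hedged Python
sketch (count_lock_checks() and max_index_for are hypothetical; the
callable stands in for a lock check that also reports the last page
index it covers):

```python
def count_lock_checks(start, end, max_index_for):
    """Count lock checks needed to cover pages start..end inclusive.

    max_index_for(index) returns the last page index covered by the
    lock over index, or None if the page is not under a lock.
    """
    checks = 0
    index = start
    while index <= end:
        max_index = max_index_for(index)
        checks += 1
        if max_index is None:
            break  # page not under a lock: stop extending the window
        # Skip every page already known to be covered.
        index = max_index + 1
    return checks
```

With 256-page stripes and a 1024-page window, the callable is consulted
once per stripe rather than once per page.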

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Iecce020d01b804b799ad234f623498cc6f2f3fb2
Reviewed-on: http://review.whamcloud.com/8523
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4470 build: wrong linux symbol file search 56/9056/2
Bob Glossman [Wed, 29 Jan 2014 20:00:42 +0000 (12:00 -0800)]
LU-4470 build: wrong linux symbol file search

Long standing build flaw just discovered.  The autoconf function
LB_CHECK_SYMBOL_EXPORT looks for the linux symbol table in the wrong place.
In most builds this doesn't matter as the wrong path being used exactly
matches the correct path.  In SLES builds it does matter a lot.
Failing to find the linux symbol table can lead to incorrect autoconf results.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iab43a2c118c9b8be54a9596b4682b68a11946a94
Reviewed-on: http://review.whamcloud.com/9056
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoNew tag 2.5.55 2.5.55 v2_5_55 v2_5_55_0
Oleg Drokin [Mon, 3 Feb 2014 06:52:59 +0000 (01:52 -0500)]
New tag 2.5.55

Change-Id: I080c434ada778bf15c7b361072abef97b693734b

10 years agoLU-4442 test: add version check for replay-vbr.sh test_7g 73/8973/3
Emoly Liu [Thu, 23 Jan 2014 09:40:15 +0000 (17:40 +0800)]
LU-4442 test: add version check for replay-vbr.sh test_7g

In replay-vbr.sh test_7g.3, because mdt_object_exists() was added
in http://review.whamcloud.com/#/c/8371, client will not be evicted
without object version check.

Test-Parameters: envdefinitions=SLOW=yes,ONLY=7g testlist=replay-vbr
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Ie3c727aba8bd8bf65460a005412fb217ced341ec
Reviewed-on: http://review.whamcloud.com/8973
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3189 tests: add version check code into sanity test 53 33/8833/5
Jian Yu [Tue, 14 Jan 2014 08:52:28 +0000 (16:52 +0800)]
LU-3189 tests: add version check code into sanity test 53

This patch adds Lustre version check code to sanity test
53 to make the test work with servers that do not have the
following patch:

Lustre-commit: 6c4c51e3079e6c257fbf86536e4739110c166e3b
Lustre-change: http://review.whamcloud.com/4789

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=53 \
ossjob=lustre-b2_3 mdsjob=lustre-b2_3 ossbuildno=41 mdsbuildno=41 \
mdtcount=1 testlist=sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ibc759aeedb0023113d9acbdda6b4db5207775aa1
Reviewed-on: http://review.whamcloud.com/8833
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4431 lnet: 1/3/2014 update for Cray interconnects 44/8744/5
Chuck Fossen [Fri, 3 Jan 2014 23:35:40 +0000 (17:35 -0600)]
LU-4431 lnet: 1/3/2014 update for Cray interconnects

This is a rollup of changes for gnilnd containing bug fixes and
enhancements since the LU-3008 submission.
The new header file gni_pub.h contains code that will allow gnilnd to
be built upstream. It will not pass the checkpatch.pl script since it
was developed previously for gni drivers and ugni.

To build a lustre client including gnilnd for Aries (XC30):
sh autogen.sh
./configure --with-linux=/path_to_centos6.4_kernel
--disable-ldiskfs-build --disable-doc --disable-liblustre
--disable-server --with-o2ib=no --without-sysio --disable-checksum
--enable-utils --enable-gni
GNICPPFLAGS='-DCONFIG_CRAY_ARIES -I$PWD/lnet/klnds/gnilnd'

Included changes:
----------------------------------------------------------------------
Subject: Unused VIRT fma block does not get cleaned up during stack
reset.
Description:
This is an edge case where we allocate a GNILND_FMABLK_VIRT fma block
but before using it, a GNILND_FMABLK_PHYS is freed up for use. The
GNILND_FMABLK_VIRT fma block didn't get associated with a conn thus,
during a stack reset, the fma block will not be cleaned up and we
assert.
Changed kgnilnd_unmap_phys_fmablk to unmap all fma blocks instead of
just PHYS blocks.
Renamed kgnilnd_unmap_phys_fmablk to kgnilnd_unmap_fma_blocks.
---------------------------------------------------------------------
Subject: New LND type "gip" gnilnd use of IP addresses.
Description:
Add a new LND type to the libcfs_netstrfns array that converts IP
address to/from addresses.
This change also allows us to use hostnames or IP addresses to specify
the direct attached file systems in the /etc/fstab file.
----------------------------------------------------------------------
Subject: Changes to gnilnd for non-Cray modified kernel.
Description:
Get MAC address from arp table for generating a nicaddr.
Change the rca_inject proc file name to peer_state.
Add the GNIIPLND type to lnd_type for Apollo builds.
Add TO_GNILND_timeout for building upstream without gni-headers.
Add gni_pub.h for use by upstream builds.
Fix slab-freed debug statement which references a freed structure.
Remove unused code that was needed for the gemini simulator.
----------------------------------------------------------------------
Subject: Gnilnd upstream sync LU-4069
Description:
LU-4069 build: cleanup from GOTO(label, -ERRNO)
Cleanup the code from GOTO(label, -ERRNO) and other bad GOTOs.
----------------------------------------------------------------------
Subject: Merge gnilnd changes from LU-2800.
Description:
Upstream changes from LU-2800 need merging to gnilnd.
----------------------------------------------------------------------
Subject: Adjust to Cray-master cfs changes.
Description:
ll_proc_doxxxx macros have been removed in the cfs layer. Use the
corresponding proc_doxxxx function instead.
----------------------------------------------------------------------
Subject: Fix offset problem in reverse rdma edge case.
Description:
The call to lnet_copy_flat2kiov() was used incorrectly passing in an
offset to the source buffer being copied from. The offset is used to
decide how many bytes will be copied from the first iov which causes
the routine to only copy the difference between the nob and the offset
to the first iov. Since only one iov is ever passed in, all the bytes
need to come from that first iov.
----------------------------------------------------------------------
Subject: Remove CFS kernel abstraction dependencies from GniLND
Description:
The CFS kernel abstractions are being removed upstream.
Incorporate LU-1346 changes to GniLND.
We should not call daemonize as we are using kthread_create/
kthread_run.
----------------------------------------------------------------------
Subject: Fix race in closing connection in response to EFAULT error
Description:
The previous mod causes a regression in closing a conn in two threads
at once. We should use kgnilnd_close_conn() instead of
kgnilnd_close_conn_locked() since it checks the state of the conn
before actually closing.
----------------------------------------------------------------------
Subject: Close connection in response to EFAULT error
Description:
After the companion node's GPU fell off the bus, we get an mdd invalid
hardware error even though the mdd's that have been inspected look ok.
The hardware error is returned in the rdma cq event in
kgnilnd_check_rdma_cq() and we respond by nak'ing the message.
The plan is to close the connection since the connection is still
alive and subsequent rdma's will continue to fail.
If the connection cannot be reestablished then the communication to
this node will cease so at least jobs will not continue to be
scheduled on this node.
----------------------------------------------------------------------
Subject: Fix outstanding conns issue during kgnilnd_base_shutdown
Description:
Currently in kgnilnd_base_shutdown there is a small race with the
datagram thread that can cause a wildcard dgram to match while in the
process of shutting down, which causes a nak datagram to be
generated. This new datagram needs to be canceled, but currently we go
straight into full shutdown without doing so, causing us to assert.

This mod adds a cancel function that iterates over all outstanding
non-wc dgrams regardless of net and cancels them. It then schedules
the device to clean up the remaining conns.
----------------------------------------------------------------------
Subject: Canceled dgram deadlock
Description:
When adding conns of canceled dgrams to purgatory, a call was made to
kgnilnd_destroy_conn_ep().
This is inappropriate since we are inside the kgn_peer_conn_lock and
kgnilnd_destroy_conn_ep() takes a mutex lock.
Avoid this behavior by setting the conn state to CLOSED instead of
DONE and allowing the scheduler thread to finish the conn's
processing.
----------------------------------------------------------------------
Subject: LND obtains node up/down information when creating peer.
Description:
Before creating a peer, check the state of the node to see if it is up
or down.
This is done by calling krca_get_sysnodes() and walking through the
array for the nid of interest and checking its state.
kgnilnd_finish_connect() also creates peers but we do not need to
check in this instance since the request is from the peer.
----------------------------------------------------------------------
Subject: debug and lbug capability for kgnilnd client EFAULT errors
Description:
Currently when kgnilnd encounters an EFAULT within a nak message it
kills the TX and prints a message to the screen. It does not crash or
print enough information for us to diagnose if the problem is hardware
or software.
This patch allows us to programmatically LBUG a compute node when it
starts getting a large number of EFAULTs. It also prints out the
memhandle of the mdd that we should be inspecting for validity.
----------------------------------------------------------------------
Subject: Fix kgnilnd q_time setting
Description:
When we receive a GNILND_MSG_PUT_REQ we send a GNILND_MSG_PUT_ACK in
response. When sending that response we were not setting the tx's
q_time.
This mod fixes that problem and allows us to see the correct tx age
when calling kgnilnd_tx_done.
Changed a cast from long to unsigned long.
Corrected a tab issue.
----------------------------------------------------------------------
Subject: Add gnilnd eager receive limit
Description:
Add module parameter eager_credits which limits the amount of messages
that can be eager received.
Currently, we continue to allocate memory with each message, which can
cause out of memory issues if an IB interface goes down.
Add counter to track eager allocations. Return -ENOMEM to lnet if we
exceed the number of credits allocated. Lnet will drop messages when
eager receive returns with an error.
Set the default eager_credits to 256k - this limits us to using 512 MB
of memory. This path is used mainly when there is an imbalance on
either "side" of the router. There should be no performance impact
provided the normal tuning is done.
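
The credit accounting described above can be modeled roughly as follows.
This is a Python sketch, not the gnilnd C code: the class and method names
are hypothetical, and the 2 KB per-message size is only inferred from the
stated relation between 256k credits and 512 MB.

```python
import errno


class EagerCreditPool:
    """Toy model of an eager-receive limit (all names are hypothetical)."""

    def __init__(self, credits=256 * 1024):
        self.credits = credits      # maximum outstanding eager receives
        self.in_use = 0             # counter tracking eager allocations

    def eager_recv(self):
        """Return 0 on success, -ENOMEM once the credit limit is reached."""
        if self.in_use >= self.credits:
            return -errno.ENOMEM    # the caller is expected to drop the message
        self.in_use += 1
        return 0

    def release(self):
        """Give a credit back once the eager buffer has been consumed."""
        if self.in_use > 0:
            self.in_use -= 1
```

With the default of 256k credits and an assumed 2 KB per eager message, the
worst case is 256 * 1024 * 2048 bytes, which matches the 512 MB bound above.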
----------------------------------------------------------------------
Subject: kgnilnd static analysis fixes
Description:
Static analysis has found different bugs to fix.
This mod is a package of minor static analysis fixes.

1. Remove unsigned compare against 0 in kgnilnd_setup_immediate_buffer
2. Fix unintentional integer overflow in kgnilnd_proc_run_cksum_test
3. Fix Nesting issue in kgnilnd_map_phys_fmablk
4. Fix kgnilnd_process_nak return code not being used.
5. Remove unneeded code in kgnilnd_del_conn_or_peer
6. Fix uninitialized value in kgnilnd_queue_tx
----------------------------------------------------------------------
Subject: kgnilnd_probe_for_dgram() race during shutdown.
Description:
Canceling dgrams while shutting down can cause an assertion in
kgnilnd_probe_for_dgram().
If the shutdown thread calls kgnilnd_probe_for_dgram concurrently with
the dgram mover thread, both may get the same dgram from the
postdata_probe_by_id kgni function.
Move the lock release to after postdata_test_by_id which actually
removes the dgram from the list.
Added fail_loc to test fix.
----------------------------------------------------------------------
Subject: Mailbox corruption fix
Description:
Canceled dgrams could have been completed at the peer during the
cancelation.
The mailbox could then be used for another peer therefore allowing two
peers to use the same mailbox.
The conns for canceled dgrams need to be put in purgatory so they
don't get reused until a connection has been established for the peer.
During release of a canceled dgram, we hook up the conn to the peer
then put it in purgatory.
Added flag to kgnilnd_release_dgram to indicate we are shutting down
or going through a stack reset.
Added some tracing of gnd_ndgrams.
----------------------------------------------------------------------
Subject: LND support for knc
Description:
For knc nodes, use GNI_PTAG_LND_KNC. Use two scheduler threads for
better performance.
libcfs includes calls to cfs_crypto_crc32_pclmul_register() and
cfs_crypto_crc32_pclmul_unregister() but those files are not built for
k1om architecture.
----------------------------------------------------------------------

Signed-off-by: Chuck Fossen <chuckf@cray.com>
Change-Id: Ie8be6d7e8b6623a49d7a75ec878a23cf5385cc46
Reviewed-on: http://review.whamcloud.com/8744
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4527 utils: deprecate old version lfs command opts 31/8631/6
Andreas Dilger [Thu, 19 Dec 2013 23:34:53 +0000 (16:34 -0700)]
LU-4527 utils: deprecate old version lfs command opts

The build version checking in lfs_getstripe() and lfs_find() was
incorrectly using LUSTRE_VERSION instead of LUSTRE_VERSION_CODE.
The old "positional" parameters for "lfs setstripe" have long been
deprecated and are now being removed.  The "--offset" and "--index"
options were not correctly being deprecated since 2.4.50 as intended.

Remove the code and conditions for already-passed build versions,
and fix the remaining checks to use LUSTRE_VERSION_CODE.  Fix one
test that was using a deprecated option.
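
As an illustration of why the check must compare packed version codes, here
is a small Python sketch. The packing (one byte per field) is an assumption
modeled on the OBD_OCD_VERSION() macro, and option_is_deprecated() is a
hypothetical helper, not a function from lfs.

```python
def obd_ocd_version(major, minor, patch, fix):
    """Pack a four-part version into one integer, one byte per field."""
    return (major << 24) + (minor << 16) + (patch << 8) + fix


def option_is_deprecated(build_version_code, deprecated_since):
    """True when the running build is at or past the deprecation point."""
    return build_version_code >= deprecated_since


# Hypothetical build at 2.5.55.0, checked against the 2.4.50 cutoff.
LUSTRE_VERSION_CODE = obd_ocd_version(2, 5, 55, 0)
DEPRECATED_SINCE = obd_ocd_version(2, 4, 50, 0)
```

Because each field occupies its own byte, an ordinary integer comparison
orders versions correctly as long as every field stays below 256.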

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I086c869ea5b3ba6c1f83cc2b6ce2c866b43ebbe5
Reviewed-on: http://review.whamcloud.com/8631
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1267 lfsck: enhance RPCs (3) for MDT-OST consistency 08/7108/36
Fan Yong [Fri, 24 Jan 2014 19:44:54 +0000 (03:44 +0800)]
LU-1267 lfsck: enhance RPCs (3) for MDT-OST consistency

The LFSCK on the OST uses LFSCK_NOTIFY RPC to notify the LFSCK
on the MDT about the LFSCK progress for the layout consistency
verification. And uses the LFSCK_QUERY RPC to query the LFSCK
status on the MDT.

The LFSCK RPC from OST to MDT is sent via the reverse connection
from OST-x to MDT-y.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I138fa9b9ad8ab539379f25bb59ec04a1a482fddb
Reviewed-on: http://review.whamcloud.com/7108
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4509 ptlrpc: re-enqueue ptlrpcd worker 22/8922/4
Liang Zhen [Mon, 20 Jan 2014 12:52:51 +0000 (20:52 +0800)]
LU-4509 ptlrpc: re-enqueue ptlrpcd worker

osc_extent_wait can be stuck in a scenario like this:

1) thread-1 held an active extent
2) thread-2 called flush cache, and marked this extent as "urgent"
   and "sync_wait"
3) thread-3 wants to write to the same extent, osc_extent_find will
   get "conflict" because this extent is "sync_wait", so it starts
   to wait...
4) cl_writeback_work has been scheduled by thread-4 to write some
   other extents, it has sent RPCs but not returned yet.
5) thread-1 finished his work, and called osc_extent_release()->
   osc_io_unplug_async()->ptlrpcd_queue_work(), but found
   cl_writeback_work is still running, so it's ignored (-EBUSY)
6) thread-3 is stuck because nobody will wake him up.

This patch allows ptlrpcd_work to be rescheduled, so it will not
miss requests anymore.
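
The difference between dropping a request with -EBUSY and re-enqueueing it
can be sketched as below. This is a single-threaded Python model with
hypothetical names (Work, allow_requeue); it only illustrates the idea and
does not reflect the actual ptlrpcd_work data structures.

```python
import errno


class Work:
    """Toy model of a work item that may be queued while it is running."""

    def __init__(self, func, allow_requeue=True):
        self.func = func
        self.allow_requeue = allow_requeue
        self.running = False
        self.pending = False    # a run has been requested and not yet started

    def queue(self):
        """Request a run; returns 0 if accepted, -EBUSY if dropped."""
        if self.running and not self.allow_requeue:
            return -errno.EBUSY     # old behavior: the request is lost
        self.pending = True         # new behavior: remember to run again
        return 0

    def run(self):
        """Run until no request is pending; returns the number of passes."""
        passes = 0
        while self.pending:
            self.pending = False
            self.running = True
            self.func(self)
            self.running = False
            passes += 1
        return passes
```

With allow_requeue=False, a request that arrives during a pass is lost, which
corresponds to step 5 of the scenario; with the default, it triggers one more
pass so the waiter is eventually woken.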

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I4929d52b2d409c2ce081147bb5ee3dd380a86c43
Reviewed-on: http://review.whamcloud.com/8922
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4406 osd-zfs: Correct number of integers for zap key 57/8857/4
Nathaniel Clark [Tue, 14 Jan 2014 21:42:35 +0000 (16:42 -0500)]
LU-4406 osd-zfs: Correct number of integers for zap key

All zap_*_uint64 functions take a key size that is the number of
uint64s.  This corrects the osd_prepare_key to account for that, and
changes the name to make it more consistent with zap functions.
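
To make the unit concrete, the following Python sketch converts a key length
in bytes into a count of uint64 words. The round-up and zero-padding policy,
and both function names, are assumptions for illustration; the commit only
states that the size argument is a number of uint64s.

```python
import struct


def zap_key_num_ints(key_bytes):
    """Number of uint64 words needed to hold a key of key_bytes bytes."""
    return (key_bytes + 7) // 8


def pack_key(name):
    """Zero-pad a byte string to whole uint64 words; return (words, count)."""
    count = zap_key_num_ints(len(name))
    padded = name.ljust(count * 8, b"\0")
    return struct.unpack("<%dQ" % count, padded), count
```

For example, a 16-byte key has a size of 2 in these units, not 16.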

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I8ee5ee6e955016fc4340025cede21aaf5bd034b7
Reviewed-on: http://review.whamcloud.com/8857
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-1267 lfsck: enhance RPCs (2) for MDT-OST consistency 87/7087/39
Fan Yong [Fri, 24 Jan 2014 19:44:32 +0000 (03:44 +0800)]
LU-1267 lfsck: enhance RPCs (2) for MDT-OST consistency

The LFSCK on the MDT uses LFSCK_NOTIFY RPC to control the LFSCK
on the OSTs (or other MDTs) to start/stop/fail/pause the layout
consistency verification. And uses LFSCK_QUERY RPC to query the
LFSCK status on the OSTs (or other MDTs).

Introduce new connection flag: OBD_CONNECT_LFSCK to indicate
whether the target server (MDT/OST) supports online LFSCK or not.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia605f25d0ca0224af3ee543d72a1e9f0cae918e3
Reviewed-on: http://review.whamcloud.com/7087
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1267 lfsck: enhance RPCs (1) for MDT-OST consistency 23/8623/21
Fan Yong [Fri, 24 Jan 2014 19:42:48 +0000 (03:42 +0800)]
LU-1267 lfsck: enhance RPCs (1) for MDT-OST consistency

Introduce new RPC LFSCK_NOTIFY for the LFSCK instance on the server_1
to notify the LFSCK instance on the server_2 about the event such as:
lfsck start/stop/pause/fail/phaseX_done, and so on.

Introduce new RPC LFSCK_QUERY for the LFSCK instance on the server_1
to query the LFSCK status on the server_2.

The two new RPCs are used not only for MDT-OST consistency, but also
for DNE consistency in the future.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I8529f1a3f5f7f9589101f456f0397c8ebe11df18
Reviewed-on: http://review.whamcloud.com/8623
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3344 tests: Verify file handle system calls 47/7247/23
James Simmons [Mon, 13 Jan 2014 14:08:36 +0000 (09:08 -0500)]
LU-3344 tests: Verify file handle system calls

The system calls name_to_handle_at() and open_by_handle_at() were added
in Linux kernel 2.6.39. Add a test to verify that they work correctly
with Lustre.

Signed-off-by: Swapnil Pimpale <spimpale@ddn.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Icbfc9642cd550ac44d379263836782ffbf4a74f4
Reviewed-on: http://review.whamcloud.com/7247
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
10 years agoLU-2524 tests: run sanity test_51ba in test_51b dir 21/9021/2
Andreas Dilger [Mon, 27 Jan 2014 20:01:19 +0000 (13:01 -0700)]
LU-2524 tests: run sanity test_51ba in test_51b dir

Run the test_51ba directory cleanup in the same directory as the
test_51b subtest created its subdirectories.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib1118046dba13351c59bc39db3e85ef8583ebbe5
Reviewed-on: http://review.whamcloud.com/9021
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4551 tests: add range support in ONLY 22/9022/3
wang di [Mon, 27 Jan 2014 15:20:40 +0000 (07:20 -0800)]
LU-4551 tests: add range support in ONLY

Add range support in ONLY, so we can indicate
the range of test cases when running the test.

For example, ONLY="12-30" sh sanity will run sanity
test cases 12 through 30.
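
The range expansion can be sketched in Python as below. The test framework
itself is shell, so this is only a model: expand_only() is a hypothetical
name, and treating the range as inclusive at both ends is an assumption based
on the example above.

```python
def expand_only(only):
    """Expand space-separated test names, turning 'A-B' into A..B inclusive."""
    tests = []
    for token in only.split():
        first, sep, last = token.partition("-")
        if sep and first.isdigit() and last.isdigit():
            tests.extend(str(n) for n in range(int(first), int(last) + 1))
        else:
            tests.append(token)     # plain names such as "51ba" pass through
    return tests
```

Under these assumptions, expand_only("12-30") yields the 19 test numbers from
12 to 30, and non-numeric names are left untouched.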

Test-Parameters: allwaysuploadlogs envdefinitions=ONLY="12-20" testlist=sanity
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I4c6dd62f0524ece388ccde3f1e4469a1219f11d2
Reviewed-on: http://review.whamcloud.com/9022
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1267 lfsck: framework (3) for MDT-OST consistency 62/7062/37
Fan Yong [Fri, 24 Jan 2014 19:42:07 +0000 (03:42 +0800)]
LU-1267 lfsck: framework (3) for MDT-OST consistency

Introduce an assistant kernel thread to help to handle MDT-OST
consistency verification. The LFSCK main engine thread and the
assistant kernel thread compose an async mode pipeline:

For a given MDT-object, the LFSCK main engine thread reads its
layout EA, and for each stripe, it prefetches the OST-object's
attribute asynchronously. The LFSCK main engine thread doesn't
wait for the OST-object's attribute reply; instead, it adds
the request structure to the shared list.

The LFSCK assistant kernel thread scans the shared list, and
for each replied request, checks whether the OST-object's attr
is consistent with its MDT-object's attr or not. If it finds an
inconsistency, the LFSCK assistant kernel thread will fix it.

To keep the LFSCK main engine thread from running too far ahead of
the LFSCK assistant kernel thread, which would prefetch too many
objects and cause memory pressure, use an async window size to
control how many objects the LFSCK main engine thread can be
ahead of the LFSCK assistant kernel thread at most. It is also
used to control how many objects the assistant kernel thread
can be ahead of the backend ptlrpcd threads at most. The window
size can be specified via the "lctl lfsck_start" command "-w"
option and can be adjusted dynamically via the proc interface
"lfsck_async_windows".

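The two-thread pipeline with a bounded window can be modeled with a bounded
queue, as in the Python sketch below. It is an illustration only: the names
are hypothetical, the "check" is a stand-in that just records each object,
and the real LFSCK threads and request structures are not represented.

```python
import queue
import threading


def run_pipeline(objects, window):
    """Main engine prefetches; assistant verifies; at most `window` queued."""
    shared = queue.Queue(maxsize=window)    # the shared list, bounded by the window
    checked = []

    def assistant():
        while True:
            item = shared.get()
            if item is None:                # sentinel: the main engine is done
                break
            checked.append(item)            # stand-in for the consistency check

    worker = threading.Thread(target=assistant)
    worker.start()
    for obj in objects:
        shared.put(obj)                     # blocks while the window is full
    shared.put(None)
    worker.join()
    return checked
```

The blocking put() is what keeps the producer from getting more than `window`
objects ahead of the consumer. Note that window must be positive here, since
queue.Queue treats a maxsize of 0 as unbounded.
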
Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I41efd93bc614591a9aabe1099a13fbcc1275d2d9
Reviewed-on: http://review.whamcloud.com/7062
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3951 lfsck: LWP connection from OST-x to MDT-y 66/7666/24
Fan Yong [Fri, 24 Jan 2014 19:41:42 +0000 (03:41 +0800)]
LU-3951 lfsck: LWP connection from OST-x to MDT-y

When client sends object-based RPC to the OST-x, the RPC service
thread on the OST-x needs to verify whether the given parent FID
information in the client RPC matches the parent FID information
stored in the OST-object. If they do not match, it will query the MDT-y
according to the client given parent FID information. The query
RPC from the OST-x to the MDT-y is sent via LWP connection.

The other use case is that the LFSCK on the OST-x needs to talk
with the LFSCK on the MDT-y, such control/query RPC will be via
above LWP connection from OST-x to MDT-y.

Currently, we only support LWP connection from OST-x to the MDT-0.
This patch enhances that to enable LWP connection from any OST to
any MDT.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie98be82b3af90456d1838d53b6d77c12956f7bd7
Reviewed-on: http://review.whamcloud.com/7666
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4484 lbuild: add support for fresh versions of MPSS 3.x.x 36/8836/5
Dmitry Eremin [Tue, 14 Jan 2014 11:36:55 +0000 (15:36 +0400)]
LU-4484 lbuild: add support for fresh versions of MPSS 3.x.x

* Adopt lbuild script for new version of MPSS with x.x.x notation.
* Remove dependency from MPSS package to avoid renaming issue in
  the future. The name of package which was used for dependency
  was renamed in MPSS.
* Use new server with MPSS released packages for download.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ie4407ad00177ad6d22770230a4dc6bde967d91ef
Reviewed-on: http://review.whamcloud.com/8836
Tested-by: Jenkins
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4456 osp: extra check for opd_pre 90/8890/8
wang di [Thu, 16 Jan 2014 23:26:56 +0000 (15:26 -0800)]
LU-4456 osp: extra check for opd_pre

1. Add extra check for opd_pre in statfs_interpret, in case
opd_pre has been freed before the callback.

2. switch the sync_fini and pre_fini, so opd_pre will be freed
after all of the possible access has been stopped.

3. opd_pre_waitq will be accessed in several update threads
(osp_precreate, osp_statfs_timer_cb, statfs_interpret), so move
it to osp_device to make sure it is accessible even after
osp_pre is freed.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I5c73cb52e2406ed03570fc3471111c409e6fe08f
Reviewed-on: http://review.whamcloud.com/8890
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4517 tests: get params directly in _wait_osc_import_state 89/8989/3
Emoly Liu [Sun, 26 Jan 2014 02:27:20 +0000 (10:27 +0800)]
LU-4517 tests: get params directly in _wait_osc_import_state

In _wait_osc_import_state(), if the facet is not a client node,
go to get osc.*.ost_server_uuid params directly and quickly,
without waiting $maxtime seconds.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Idf5989f53d050edcb69690b7f24a6e86df233bef
Reviewed-on: http://review.whamcloud.com/8989
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-4515 tests: disable sanity-quota test_34 temporary 81/8981/5
Fan Yong [Fri, 24 Jan 2014 19:19:33 +0000 (03:19 +0800)]
LU-4515 tests: disable sanity-quota test_34 temporary

To avoid other patches failing because of LU-4515.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I8a6949d18a1ff4f5d229ed083f4f12a667eb3329
Reviewed-on: http://review.whamcloud.com/8981
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
10 years agoLU-4413 osp: move seq allocation out of osp_import_event 97/8997/3
wang di [Fri, 24 Jan 2014 22:21:07 +0000 (14:21 -0800)]
LU-4413 osp: move seq allocation out of osp_import_event

Because seq allocation (osp_init_pre_fid) might be stuck
during RPC, move it out of osp_import_event, which is
inside ptlrpcd_rcv. Otherwise, some other import RPCs (like
connect req) might be blocked in ptlrpcd_rcv.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ib4014f8b0088ea3613fa4d53d3e274f5bdfe70c7
Reviewed-on: http://review.whamcloud.com/8997
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3531 mdc: release dir page cache after accessing 35/8935/4
wang di [Mon, 20 Jan 2014 23:49:34 +0000 (15:49 -0800)]
LU-3531 mdc: release dir page cache after accessing

Release the dir page cache in llite/lmv, so the page
will be held until the entries have been filled by filldir.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I8b24bec74b14ff2b65130c02294821fc16ca1421
Reviewed-on: http://review.whamcloud.com/8935
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4409 tests: disable insanity 10 for DNE 50/8650/5
wang di [Sat, 21 Dec 2013 15:02:52 +0000 (07:02 -0800)]
LU-4409 tests: disable insanity 10 for DNE

Disable insanity 10 for DNE.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I4b67cf745a18a09335e21e1e6e457134ac47f224
Reviewed-on: http://review.whamcloud.com/8650
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
10 years agoLU-1267 lfsck: framework (2) for MDT-OST consistency 02/8302/15
Fan Yong [Wed, 15 Jan 2014 05:20:59 +0000 (13:20 +0800)]
LU-1267 lfsck: framework (2) for MDT-OST consistency

The LFSCK can talk with OSP directly, then the LFSCK on the MDT can
control/monitor the specified LFSCK instance on other targets (MDTs
or/and OSTs) without breaking dt_device APIs nor making OSP to know
the LFSCK things, and simplify the handling of remote OST-object or
MDT-object. For that, each OSP will register with the LFSCK when it
is added into the system. The LFSCK maintains such a target table in
RAM with logic similar to that of the LOD.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ifa14db68925a0cd2afe0c3566382dbb6176d50b2
Reviewed-on: http://review.whamcloud.com/8302
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1267 lfsck: rebuild LAST_ID 97/6997/35
Fan Yong [Sat, 18 Jan 2014 01:04:09 +0000 (09:04 +0800)]
LU-1267 lfsck: rebuild LAST_ID

The /O/<seq>/LAST_ID records the last oid of the object allocated
within the sequence. The LAST_ID file can be corrupted or lost while
the system is running. The LFSCK for layout consistency verification
can detect a lost or corrupted LAST_ID, and can rebuild it by
scanning the whole device.
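
One way to picture the rebuild is as a maximum over all objects found during
the scan. The Python sketch below assumes the scan yields (seq, oid) pairs
and that the rebuilt value is simply the largest oid seen per sequence; both
points, and the function name, are assumptions rather than details taken
from the patch.

```python
def rebuild_last_ids(objects):
    """Map each sequence to the largest oid seen among (seq, oid) pairs."""
    last_id = {}
    for seq, oid in objects:
        if oid > last_id.get(seq, -1):
            last_id[seq] = oid
    return last_id
```

A sequence with no objects on the device does not appear in the result, so a
real implementation would need a policy for that case.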

This functionality is also part of LU-14 live replacement of OST.

Introduce lfsck_notify callback - the LFSCK events notification
channel from the LFSCK to the registered users (MDD/OFD).

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iee85056e2fda1ecba9424c9f0e822643e9f029a8
Reviewed-on: http://review.whamcloud.com/6997
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2818 mdt: Properly handle ENOMEM 47/8947/2
Oleg Drokin [Tue, 21 Jan 2014 18:53:26 +0000 (13:53 -0500)]
LU-2818 mdt: Properly handle ENOMEM

When osd_keys_init fails in mdt_lvbo_fill, properly bail out with
error instead of asserting.

Change-Id: I832742ed49cc7740d8e709bc4b87e5d5aa100d39
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8947
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1538 tests: sanity/101d to check available space 12/4312/6
Alex Zhuravlev [Fri, 19 Oct 2012 18:30:49 +0000 (22:30 +0400)]
LU-1538 tests: sanity/101d to check available space

Fix the check which compared MBs (in size variable) with KBs
(reported by df).  Also avoid failure if the read took under
two seconds, since the timing is inaccurate at that resolution.
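
The two fixes amount to a unit conversion and a timing guard, sketched here
in Python. The test itself is shell, and both helper names are hypothetical.

```python
def has_enough_space(size_mb, avail_kb):
    """Compare a size given in MB against free space reported in KB."""
    return size_mb * 1024 <= avail_kb


def timing_is_meaningful(elapsed_seconds, minimum=2):
    """Only trust the measured read time at or above the minimum duration."""
    return elapsed_seconds >= minimum
```

Without the conversion, a 500 MB request would wrongly appear to fit in
1000 KB of free space, because 500 is less than 1000.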

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I43335d699ad2b7e4c5db00c36a7795683f3b04f7
Reviewed-on: http://review.whamcloud.com/4312
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4416 llite: struct kiocb ki_left removed 01/8801/2
yangsheng [Wed, 1 Jan 2014 15:53:38 +0000 (23:53 +0800)]
LU-4416 llite: struct kiocb ki_left removed

struct kiocb has no ki_left member since 3.12.

Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: Iea1fb67ebb03430b5dc8f71ed2652967ff60b84d
Reviewed-on: http://review.whamcloud.com/8801
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4423 ptlrpc: fix potential NULL pointer dereference 82/8682/2
Oleg Drokin [Tue, 31 Dec 2013 01:50:28 +0000 (20:50 -0500)]
LU-4423 ptlrpc: fix potential NULL pointer dereference

The rest of the code seems to imply that rmf_dumper may indeed be
NULL.  Change the code so that dumping is not even considered if
rmf_dumper callback is not set.
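
A minimal sketch of the guard described above, assuming a descriptor with an optional dump callback. The struct and function names are hypothetical and do not reproduce the ptlrpc layout code; only the idea of an optional callback comes from the message.

```c
#include <stddef.h>

/* Hypothetical field descriptor with an optional dump callback. */
struct sketch_field {
	const char *name;
	void (*dumper)(const void *data);	/* may be NULL */
};

/* Counter so the example callback has an observable effect. */
static int sketch_dump_calls;

static void sketch_count_dumper(const void *data)
{
	(void)data;
	sketch_dump_calls++;
}

/* Returns 1 if the field was dumped, 0 if dumping was skipped.
 * The callback is checked before anything else happens, so a field
 * without a dumper never reaches the call site. */
int sketch_maybe_dump(const struct sketch_field *field, const void *data)
{
	if (field->dumper == NULL)
		return 0;

	field->dumper(data);
	return 1;
}
```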

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Iaea16aaf799976d08ebb51322021cc879db1c6d8
Reviewed-on: http://review.whamcloud.com/8682
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4442 test: fix wrong usage of wait_mds_ost_sync() 96/8796/6
Emoly Liu [Wed, 1 Jan 2014 21:26:08 +0000 (05:26 +0800)]
LU-4442 test: fix wrong usage of wait_mds_ost_sync()

Fix the wrong usage of wait_mds_ost_sync() in replay_vbr.sh
test_7_cycle(). The first parameter should be a timeout in seconds,
not a facet.

Test-Parameters: testlist=replay-vbr envdefinitions=ONLY=7
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I4e6de62049b473deeaf5c75e1136d76d67a02053
Reviewed-on: http://review.whamcloud.com/8796
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoRevert "LU-3319 procfs: move osp proc handling to seq_files" 31/8931/2
Oleg Drokin [Mon, 20 Jan 2014 23:10:06 +0000 (23:10 +0000)]
Revert "LU-3319 procfs: move osp proc handling to seq_files"

This seems to be causing issues like LU-45-13 and LU-4510
This reverts commit a97e4898ad9e0b65f457b01bdfa954f7d7cd272d.

Change-Id: I6066a255ded24dbdb76b4804e82a377f1069af5f
Reviewed-on: http://review.whamcloud.com/8931
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3680 ptlrpc: Fix assertion failure of null_alloc_rs() 00/8200/6
Patrick Farrell [Fri, 22 Nov 2013 16:47:54 +0000 (10:47 -0600)]
LU-3680 ptlrpc: Fix assertion failure of null_alloc_rs()

lustre_get_emerg_rs() set the size of the reply buffer to zero
by mistake, which will cause LBUG in null_alloc_rs() when memory
pressure is high. This patch fixes this problem and adds a size
check to avoid the problem of insufficient buffer size.
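
The fix described above could be sketched as below, assuming a preallocated emergency buffer of known capacity. The names, the struct layout, and the choice of -EINVAL are all hypothetical and are not taken from the ptlrpc code.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical reply-state descriptor; not the real ptlrpc structure. */
struct sketch_reply_state {
	size_t size;	/* usable bytes in the reply buffer */
};

/* Hand out a preallocated emergency buffer of `capacity` bytes for a
 * reply that needs `needed` bytes. Returns 0 on success, or -EINVAL
 * (an arbitrary choice for this sketch) if the buffer is too small. */
int sketch_get_emerg_reply(struct sketch_reply_state *rs,
			   size_t capacity, size_t needed)
{
	/* Size check: refuse a buffer that cannot hold the reply. */
	if (needed > capacity)
		return -EINVAL;

	/* Record the real capacity instead of resetting it to zero. */
	rs->size = capacity;
	return 0;
}
```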

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I9fbd4f14e8e1263de2af564c4f2e420f5f2b43bc
Reviewed-on: http://review.whamcloud.com/8200
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>