Whamcloud - gitweb
fs/lustre-release.git
12 years agoLU-985 lprocfs: verify user buffer access
Bobi Jam [Fri, 13 Jan 2012 05:46:07 +0000 (13:46 +0800)]
LU-985 lprocfs: verify user buffer access

lprocfs_xxx_evict_client() must verify the user's buffer before
accessing it.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I702e22f8d432edce200c6d91a0af8a1eac792008
Reviewed-on: http://review.whamcloud.com/1961
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1061 agl: cl_locks_prune() waits for the last user
Fan Yong [Thu, 2 Feb 2012 11:17:41 +0000 (19:17 +0800)]
LU-1061 agl: cl_locks_prune() waits for the last user

The AGL sponsor holds a user reference on the cl_lock before
triggering the AGL RPC. That user reference is released by the AGL
RPC reply upcall. This mechanism conflicts with cl_locks_prune(),
which requires that no lock is in active use when the last iput()
is called.

So cl_locks_prune() should wait for the last user reference to be
released by the enqueue reply upcall.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I8998c0a247448b1b6c6e99995c9d956b1666279b
Reviewed-on: http://review.whamcloud.com/2079
Tested-by: Hudson
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1017 handle -EAGAIN properly in lu_object_find_try()
Niu Yawei [Tue, 31 Jan 2012 06:06:47 +0000 (22:06 -0800)]
LU-1017 handle -EAGAIN properly in lu_object_find_try()

htable_lookup() can return -EAGAIN for a dying object; handle it
properly in lu_object_find_try().
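
For illustration only (this is not the code from the patch): a minimal
userspace C sketch of retrying a lookup that reports -EAGAIN while an
object is dying. The demo_* names and the "dying" countdown are
hypothetical stand-ins, not Lustre structures or functions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached object; not the Lustre structure. */
struct demo_object {
	int dying;	/* > 0 while the object is still being torn down */
	int id;
};

/*
 * Hypothetical lookup: returns -EAGAIN while the cached object is dying.
 * Each call decrements the counter to model teardown making progress.
 */
int demo_lookup(struct demo_object *slot, struct demo_object **out)
{
	if (slot->dying > 0) {
		slot->dying--;
		*out = NULL;
		return -EAGAIN;
	}
	*out = slot;
	return 0;
}

/* Retry on -EAGAIN instead of passing the transient error to the caller. */
int demo_find(struct demo_object *slot, struct demo_object **out,
	      int max_tries)
{
	int rc = -EAGAIN;

	assert(max_tries > 0);
	while (max_tries-- > 0) {
		rc = demo_lookup(slot, out);
		if (rc != -EAGAIN)
			break;
	}
	return rc;
}
```

The sketch bounds the retries with max_tries; whether the real code
retries, waits, or allocates a fresh object is not stated in this
message.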

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I616928cb983d4f64a64f25b7d296b6c9bd18e4ea
Reviewed-on: http://review.whamcloud.com/2066
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-874 ptlrpc: handle in-flight hqreq correctly
Jinshan Xiong [Wed, 25 Jan 2012 19:27:55 +0000 (11:27 -0800)]
LU-874 ptlrpc: handle in-flight hqreq correctly

If there are in-flight requests pending, we shouldn't timeout the
covering dlm locks; neither should we remove the requests from export
exp_hp_rpcs list until the requests are handled.

In this patch, the following things are improved:
1. leave IO RPCs in the export's hp list until they are handled;
2. use an interval tree to find locks that overlap an RPC;
3. refresh the lock again after IO RPCs finish, to leave a time
   window for clients to cancel covering dlm locks;
4. rework repbody in ost_handler.c so as not to modify the original obdo;
5. clean up the code.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I33e2d113d7929a56389741c06dffb5efb6bf28a3
Reviewed-on: http://review.whamcloud.com/1918
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: <alexander_boyko@xyratex.com>
12 years agoLU-874 osc: prioritize writeback pages
Jinshan Xiong [Tue, 10 Jan 2012 07:34:06 +0000 (23:34 -0800)]
LU-874 osc: prioritize writeback pages

When a lock is being canceled, we should prioritize those covering
pages which have already been submitted by page writeback daemon;
otherwise, this client may be evicted because there is no active IO
for that lock for a long time.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: If14eff6361f55d2b2eeb2db7146789dda4c32060
Reviewed-on: http://review.whamcloud.com/1938
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1067 obdecho: Recheck client env ctx for echo md client.
wangdi [Sat, 4 Feb 2012 08:12:48 +0000 (00:12 -0800)]
LU-1067 obdecho: Recheck client env ctx for echo md client.

During an echo md test, if real clients are being mounted
at the same time, the cl env obtained from the cache might
contain the wrong context, so we need to recheck the ctx and
forcibly refill the env.

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: Iddbceee80966b5ca9284c886731386a97d089d53
Reviewed-on: http://review.whamcloud.com/2092
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1073 llite: no CERROR on unknown flock type
Alexander Zarochentsev [Sat, 11 Feb 2012 07:19:31 +0000 (11:19 +0400)]
LU-1073 llite: no CERROR on unknown flock type

Xyratex-bug-id: MRP-215

Signed-off-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Change-Id: I435c3282bc87750f6ebd19e618a1ee8e229834d2
Reviewed-on: http://review.whamcloud.com/2109
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1013 obdclass: lu_object_find miss to unlink object from LRU
Fan Yong [Sat, 11 Feb 2012 01:26:29 +0000 (09:26 +0800)]
LU-1013 obdclass: lu_object_find miss to unlink object from LRU

There is a race condition between lu_object_find and lu_object_put.
When two threads concurrently try to find an object with the same
FID and the object is not yet allocated in memory, the first
thread may add the object to the LRU list before the second thread
re-searches the object hash table to insert the new object it
created itself. The second thread then finds the object created by
the first thread, and in that case it should unlink the object
from the LRU.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Iadec96c27d5285f8b859b0060a6f611e87585789
Reviewed-on: http://review.whamcloud.com/2134
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-687 ldiskfs: resolve section dynlocks mismatch
Andreas Dilger [Thu, 26 Jan 2012 10:06:30 +0000 (03:06 -0700)]
LU-687 ldiskfs: resolve section dynlocks mismatch

Fix __init/__exit section mismatch.

  WARNING: lustre-2.1.0/ldiskfs/ldiskfs/ldiskfs.o(.init.text+0x1bc):
  Section mismatch in reference from the function init_module() to the
  function .exit.text:dynlock_cache_exit().  An __init init_module()
  function references a function __exit dynlock_cache_exit().

  This is often seen when error handling in the init function uses
  functionality in the exit path.  The fix is often to remove the
  __exit annotation of dynlock_cache_exit() so it may be used outside
  an exit section.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie0bdde7f78c18bca1127151175cf56bfa6ad500c
Reviewed-on: http://review.whamcloud.com/2019
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1046 ldlm: hold ldlm_cb_set_arg until the last user exit
Fan Yong [Tue, 31 Jan 2012 05:51:41 +0000 (13:51 +0800)]
LU-1046 ldlm: hold ldlm_cb_set_arg until the last user exit

There is a race condition between ldlm_cb_interpret() and
ldlm_run_ast_work() when testing/releasing "ldlm_cb_set_arg":
ldlm_run_ast_work() can exit before ldlm_cb_interpret() has
processed "ldlm_cb_set_arg::waitq", which causes an invalid
memory access.
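
For illustration only (this is not the code from the patch): one common
way to keep a shared structure alive until its last user exits is a
reference count. The sketch below uses C11 atomics in userspace; the
demo_* names and fields are hypothetical and do not reflect the real
ldlm_cb_set_arg layout.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical shared argument; not the real ldlm_cb_set_arg layout. */
struct demo_set_arg {
	atomic_int refcount;	/* one reference per user of the structure */
	int restart;
};

struct demo_set_arg *demo_arg_alloc(void)
{
	struct demo_set_arg *arg = calloc(1, sizeof(*arg));

	if (arg != NULL)
		atomic_init(&arg->refcount, 1);	/* held by the creator */
	return arg;
}

/* Take an extra reference before handing the structure to another user. */
void demo_arg_get(struct demo_set_arg *arg)
{
	atomic_fetch_add(&arg->refcount, 1);
}

/* Drop a reference; returns 1 when this call freed the structure. */
int demo_arg_put(struct demo_set_arg *arg)
{
	assert(atomic_load(&arg->refcount) > 0);
	if (atomic_fetch_sub(&arg->refcount, 1) == 1) {
		free(arg);
		return 1;
	}
	return 0;
}
```

With this pattern, whichever of the two paths finishes last performs the
free, so neither can touch the structure after it is released.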

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Idfb3f41739e02a79fe4310ecbc7bb842e75a82d8
Reviewed-on: http://review.whamcloud.com/2065
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-993 osd: code cleanup for directory nlink count
Niu Yawei [Mon, 16 Jan 2012 06:52:12 +0000 (22:52 -0800)]
LU-993 osd: code cleanup for directory nlink count

- LDISKFS_LINK_MAX is specific to ldiskfs, so we should not handle it in
  the mdd layer. This patch moves it down to the osd layer.

- Remove the declared operation count assertion in OSD_EXEC_OP(). Undo
  operations are not declared, so if some operation fails, the undo
  operation triggers the assertion unnecessarily.

- Restore the sub-directory count to 70000 for the sanity test_51b, and
  add test_51ba for testing unlink > LDISKFS_LINK_MAX sub-directories.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Id02e7d2ac3a7664a55566b1de783f0a73162339b
Reviewed-on: http://review.whamcloud.com/1971
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-928 llite: fix comment for lustre_client_ocd
Andreas Dilger [Thu, 15 Dec 2011 08:24:33 +0000 (01:24 -0700)]
LU-928 llite: fix comment for lustre_client_ocd

ll_ocd_update() was renamed to cl_ocd_update() before 2.0.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I834935d2dc9ef0edebbd5badb4bc54283a325730
Reviewed-on: http://review.whamcloud.com/1868
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1053 build: allow deleted/chmod in commit-msg
Andreas Dilger [Sun, 29 Jan 2012 07:20:45 +0000 (00:20 -0700)]
LU-1053 build: allow deleted/chmod in commit-msg

Allow files that were deleted or chmod when verifying a commit
message that contains a diff generated with "commit -v".  The
checks added for excluding the diff were previously too strict.
Add a test case for each of these situations, as well as a case
for a commit message that contains "diff" at the start of a line
in the middle of a commit message.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I74fe98682342e445da7fe64fd42e91df4564500c
Reviewed-on: http://review.whamcloud.com/2053
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1025 checksum: add final bit inversion for crc32c
Alexander.Boyko [Fri, 3 Feb 2012 07:20:01 +0000 (11:20 +0400)]
LU-1025 checksum: add final bit inversion for crc32c

The linux kernel implementations of crc32c perform final bit
inversion after loop calculation of checksum.
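
For illustration only (this is not the code from the patch): a minimal
bitwise CRC-32C (Castagnoli) in C showing where the final bit inversion
sits relative to the update loop. The demo_* function names are
hypothetical; the polynomial and initial value are the standard
reflected CRC-32C parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Update loop only: bitwise CRC-32C, reflected polynomial 0x82F63B78. */
uint32_t demo_crc32c_update(uint32_t crc, const void *buf, size_t len)
{
	const unsigned char *p = buf;

	assert(buf != NULL || len == 0);
	while (len-- > 0) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Full checksum: seed with all ones, run the loop, then invert the bits. */
uint32_t demo_crc32c(const void *buf, size_t len)
{
	uint32_t crc = demo_crc32c_update(0xFFFFFFFFu, buf, len);

	return ~crc;	/* the final bit inversion */
}
```

For the standard check string "123456789", demo_crc32c() returns
0xE3069283; without the final inversion the same input yields
0x1CF96D7C, so two sides that disagree on this step never match.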

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Change-Id: I5fa6af60c51f6f86f394f3cc71aa2672be614f7b
Reviewed-on: http://review.whamcloud.com/2018
Tested-by: Hudson
Reviewed-by: Shuichi Ihara <sihara@ddn.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-817 lustre-iokit: sgpdd-survey is encountering r/w errors on arrays using 2TB...
Shuichi Ihara [Mon, 7 Nov 2011 09:45:33 +0000 (18:45 +0900)]
LU-817 lustre-iokit: sgpdd-survey is encountering r/w errors on arrays using 2TB drives

There are two fixes/improvements in sgpdd-survey:
(1) add the --lba option to sg_readcap for large LUNs;
(2) support a configurable boundary block size between
concurrent regions per device.

Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: Ib3cdb051cf55e096919ad63a42640aaacfe511c4
Reviewed-on: http://review.whamcloud.com/1658
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Cliff White <cliffw@whamcloud.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-746 test: obdfilter-survey FAIL hndls expected >8, have 2
Lai Siyao [Fri, 4 Nov 2011 15:08:25 +0000 (08:08 -0700)]
LU-746 test: obdfilter-survey FAIL hndls expected >8, have 2

obdfilter-survey.sh checks the jbd proc stats after the survey, but it
doesn't take obd cleanup time into account, so the stats collected
may not reflect the test data. To fix this, save the run time of the
survey test, and collect the stats from (run_time/4) ago.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: I597a6279353c203b50b1c2b17543bd34d86d8806
Reviewed-on: http://review.whamcloud.com/1645
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
12 years agoLU-1023 utils: Time counting fix for obdfilter-survey
wangdi [Tue, 24 Jan 2012 03:56:09 +0000 (19:56 -0800)]
LU-1023 utils: Time counting fix for obdfilter-survey

1. Assign the start and end time of test_brw so that it
   reports the correct time usage.
2. mds_survey adds a new output line for thread utils
   (Total: total xxxx threads X sec ...); skip this line
   in get_stats (obdfilter-survey) so that it reports the
   correct min/max time.

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: I58c13bde299c254ed10e1405bfed3e1dce4ef216
Reviewed-on: http://review.whamcloud.com/1999
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-874 ldlm: prioritize LDLM_CANCEL requests
Jinshan Xiong [Thu, 22 Dec 2011 00:33:51 +0000 (16:33 -0800)]
LU-874 ldlm: prioritize LDLM_CANCEL requests

If a lock cancel request has already reached the server, there is no
reason to evict the client when the waiting list times out.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I1a300916b4b3b592ca565ffc06cb3658d699d7a0
Reviewed-on: http://review.whamcloud.com/1900
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-980 llog: cleanup return value in llog_client_create
Hongchao Zhang [Thu, 12 Jan 2012 13:29:00 +0000 (21:29 +0800)]
LU-980 llog: cleanup return value in llog_client_create

In llog_client_create, the newly allocated llog_handle is
returned through the parameter res, but res is not cleared
if the following operations fail and the corresponding
llog_handle has already been freed.
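
For illustration only (this is not the code from the patch): a minimal
userspace C sketch of clearing an out-parameter on the error path so the
caller is never left with a pointer to freed memory. demo_create(),
struct demo_handle, and the fail_later flag are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical handle type; not the real llog_handle. */
struct demo_handle {
	int id;
};

/* fail_later != 0 simulates a step failing after the handle is published. */
int demo_create(struct demo_handle **res, int fail_later)
{
	struct demo_handle *h;

	assert(res != NULL);
	*res = NULL;		/* defined value on every error path */

	h = malloc(sizeof(*h));
	if (h == NULL)
		return -ENOMEM;
	h->id = 1;
	*res = h;		/* handle published through the out-parameter */

	if (fail_later) {
		free(h);
		*res = NULL;	/* do not leave a dangling pointer behind */
		return -EIO;
	}
	return 0;
}
```

A caller can then test either the return code or the pointer and reach
the same conclusion.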

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: I8b59dfde91da2c20881f29e8ff46a0a93f0ee1b2
Reviewed-on: http://review.whamcloud.com/1958
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-998 acl: declare acl operation for setattr
Lai Siyao [Wed, 18 Jan 2012 07:46:24 +0000 (15:46 +0800)]
LU-998 acl: declare acl operation for setattr

Setattr on ATTR_MODE may set acl if acl is enabled, and it should be
declared in advance.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Id9c94c5498f8fee0a79986bb424444f658c98e60
Reviewed-on: http://review.whamcloud.com/1984
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-1048 ldiskfs: fix dynlock cache entry freeing
Oleg Drokin [Sat, 28 Jan 2012 01:26:53 +0000 (20:26 -0500)]
LU-1048 ldiskfs: fix dynlock cache entry freeing

The update to RHEL6.2 broke the dynlocks patch, and the last hunk was
ignored by quilt/patch.
This commit rediffs the proper patch against 2.6.32-220.el6.

Change-Id: If9020458d07c7c2dc714b3e38587f66c4846f806
Signed-off-by: Oleg Drokin <green@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2034
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-797 tests: speed up ost-pools tests
Andreas Dilger [Thu, 26 Jan 2012 12:39:06 +0000 (05:39 -0700)]
LU-797 tests: speed up ost-pools tests

The test time of the ost-pools subtests is unreasonably long.

test_14 fills an OST to 90% full, regardless of the OST size.
Skip the test if the amount of data to be written is too large
to run in a practical time.

test_18 creates 3x3x30000 files to compare performance with/without
pools enabled.  Instead of creating a fixed number of files, use
createmany to run for a specific (short) time to measure performance.

test_23 tried to fill all OSTs 100% full.  Split this test into two:
- test_23a to test quota with a file in a pool
- test_23b to test OOS with a file striped over pool

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ib851939a210ab68f52da1ac777781c2a922c500c
Reviewed-on: http://review.whamcloud.com/2028
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-972 ldlm: reduce server log message everytime client connect
Minh Diep [Tue, 10 Jan 2012 21:16:17 +0000 (13:16 -0800)]
LU-972 ldlm: reduce server log message everytime client connect

Changed to D_HA to reduce the messages.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ief4507ef484369b6a7aafb7ce9b1f1e0b1f0967e
Reviewed-on: http://review.whamcloud.com/1944
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoTag 2.1.55 2.1.55 v2_1_55_0
Oleg Drokin [Wed, 25 Jan 2012 14:49:22 +0000 (09:49 -0500)]
Tag 2.1.55

Change-Id: I7f6bf0b88a7dc5c5a2eb08ccff090c2a17dfce22
Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-898 ptlrpc: fix ptlrpc request race.
Alexander.Boyko [Fri, 6 Jan 2012 06:38:00 +0000 (10:38 +0400)]
LU-898 ptlrpc: fix ptlrpc request race.

Allow request reordering from the export only if the request has been
added to the list.

There is a race condition with request error handling in
ptlrpc_server_handle_req_in():
1. req is added to rq_export->exp_queued_rpc by ptlrpc_hpreq_init();
2. ptlrpc_server_request_add() returns an error (ost_validate_obdo() with
   one of these conditions: !(fid_seq_is_rsvd(oa->o_seq) ||
   fid_seq_is_idif(oa->o_seq)));
3. ptlrpc_server_drop_request(req) drops the request (disconnects the
   export), but req is still in rq_export->exp_queued_rpc;
4. ldlm_server_blocking_ast handles rq_export->exp_queued_rpc and fails
   because req->rq_export is NULL.

The fix allows request reordering from the export only if the request has
been added to the list, so no workaround is needed for the situation where
a request from the export was processed before ptlrpc_server_request_add.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-by: Denis Kondratenko <denis_kondratenko@xyratex.com>
Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
Xyratex-bug-id: MRP-284
Change-Id: I5f763b3c4f19b6af5f803b50b43a5570dab3dc76
Reviewed-on: http://review.whamcloud.com/1799
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Denis Kondratenko <Denis_Kondratenko@xyratex.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-931 mdd: store lu_fid instead of pointer in md_capainfo
Hongchao Zhang [Tue, 17 Jan 2012 04:10:25 +0000 (12:10 +0800)]
LU-931 mdd: store lu_fid instead of pointer in md_capainfo

In md_capainfo, mc_fid contains at most 5 pointers to lu_fid.
If the corresponding lu_fid is freed, the pointer is not updated
and ends up pointing to freed memory!
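
For illustration only (this is not the code from the patch): a minimal
C sketch of storing identifiers by value rather than by pointer, so the
container no longer depends on the lifetime of the original object. The
demo_* types are hypothetical; only the limit of 5 entries comes from
the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical identifier type; not the real lu_fid definition. */
struct demo_fid {
	uint64_t seq;
	uint32_t oid;
	uint32_t ver;
};

#define DEMO_CAPA_MAX 5		/* "at most 5" entries, as in the text */

struct demo_capainfo {
	int count;
	struct demo_fid fid[DEMO_CAPA_MAX];	/* copies, not pointers */
};

/* Copy the fid into the container; returns -1 when the table is full. */
int demo_capainfo_add(struct demo_capainfo *ci, const struct demo_fid *fid)
{
	assert(ci != NULL && fid != NULL);
	if (ci->count >= DEMO_CAPA_MAX)
		return -1;
	ci->fid[ci->count++] = *fid;	/* struct assignment copies by value */
	return 0;
}
```

After demo_capainfo_add() returns, the caller may modify or free its own
fid without affecting the stored copy.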

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: I00088cbfeb145ceac0477467a8b2436f6cf1e530
Reviewed-on: http://review.whamcloud.com/1979
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-919 obdclass: remove hard coded 0x5a5a5a
Niu Yawei [Wed, 11 Jan 2012 03:59:10 +0000 (19:59 -0800)]
LU-919 obdclass: remove hard coded 0x5a5a5a

We assert atomic_t values against the hard-coded 0x5a5a5a in several places,
which could result in a false assertion failure when the reference count
gets very large in some extreme cases.

The hard-coded 0x5a5a5a should be replaced by LI_POISON.

Signed-off-by: Bruno Faccini <bruno.faccini@bull.net>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Idc271621017d071b3e2dce5d0ec6fb854127a955
Reviewed-on: http://review.whamcloud.com/1953
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-814 test: automate NFS over lustre testing
Minh Diep [Thu, 5 Jan 2012 16:55:48 +0000 (08:55 -0800)]
LU-814 test: automate NFS over lustre testing

Provide NFS setup within the auster framework.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Icfd61bf6772807a344576b92b5268a83a7b79e4b
Reviewed-on: http://review.whamcloud.com/1664
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-879 tests: Add test for checking rename_stats
wangdi [Sat, 14 Jan 2012 07:15:46 +0000 (23:15 -0800)]
LU-879 tests: Add test for checking rename_stats

Add 133d in sanity for checking rename_stats.

Signed-off-by: Di Wang <di.wang@whamcloud.com>
Change-Id: If9a57b9ac458fdf729c19f597d6197f410966e91
Reviewed-on: http://review.whamcloud.com/1970
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
12 years agoLU-169 lov: add basic infrastructure for layout lock
Johann Lombardi [Tue, 10 Jan 2012 16:41:26 +0000 (17:41 +0100)]
LU-169 lov: add basic infrastructure for layout lock

This patch adds some basic infrastructure to support the layout lock
in the near future. This includes defining a new inode lock bit to lock
the file layout (namely MDS_INODELOCK_LAYOUT) as well as a new lookup
intent (IT_LAYOUT).

Signed-off-by: Jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Change-Id: Ibf1c3c166b5def4654684febbcf3a99ea7e482eb
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1854
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-781 kernel: kernel update [RHEL6.2 2.6.32-220]
yangsheng [Thu, 5 Jan 2012 10:30:42 +0000 (18:30 +0800)]
LU-781 kernel: kernel update [RHEL6.2 2.6.32-220]

Add support for RHEL6.2. The version is 2.6.32-220.el6.

Change-Id: Icc03a2f5d8b377aa1b1180ae09056989bbc84a9d
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1892
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-948 clio: add a callback to cl_page_gang_lookup()
Jinshan Xiong [Thu, 12 Jan 2012 00:03:41 +0000 (16:03 -0800)]
LU-948 clio: add a callback to cl_page_gang_lookup()

Add a callback to cl_page_gang_lookup() so that it will be easier to
fix this issue and be helpful for new IO engine.

If a read lock is being canceled, we used to grab the page lock and
then check whether the pages are covered by another lock; otherwise
they are discarded. This is unnecessary because we can do the check
without grabbing the page lock.

With the above fix, when a read-ahead page is in IO during recovery,
and one of covering locks is being canceled by early cancel for
recovery, it will detect that this page is being covered by another
one, and then this page will be skipped w/o trying to grab page lock.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I22a3ea0790f5c0e01c12c29208b6d60c38058f12
Reviewed-on: http://review.whamcloud.com/1955
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-459 debug: quiet overly verbose debug messages
Andreas Dilger [Fri, 16 Dec 2011 02:20:28 +0000 (19:20 -0700)]
LU-459 debug: quiet overly verbose debug messages

Some debugging messages are being printed to the console, but
do not provide any particular value.  Turn these into kernel
debug messages.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Id8b0624b281ce67501d0d81cd0e89cc020cd669a
Reviewed-on: http://review.whamcloud.com/1876
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-412 tests: Fix test script lookup in auster
Li Wei [Mon, 11 Jul 2011 05:23:33 +0000 (13:23 +0800)]
LU-412 tests: Fix test script lookup in auster

When asked to run "lfsck", auster found "/usr/sbin/lfsck" instead of
"$LUSTRE/tests/lfsck.sh".  This patch restricts auster to look for test
scripts in only $LUSTRE/tests.

Change-Id: Iea521d06cdf1cea1a9bd80224f847275732d3447
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1078
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-966 mdd: mdd object may not be exist
Bobi Jam [Mon, 9 Jan 2012 10:23:41 +0000 (18:23 +0800)]
LU-966 mdd: mdd object may not be exist

If MDT device has been checked with fsck, some mdd objects
could be removed, so that recovery replay could act on non-existing
objects.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: If3a3b72ee70ab2513978ed968c9598ddde11c085
Reviewed-on: http://review.whamcloud.com/1928
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-807 utils: remove old service tags support
Andreas Dilger [Mon, 31 Oct 2011 19:17:01 +0000 (13:17 -0600)]
LU-807 utils: remove old service tags support

Remove obsolete and unused service tags infrastructure from the
build system and mount utilities.  This reverts the changes from:

    14765d2816bafa2a08879ece0e33bf8c97f84948
    805392ae4c4a4295d0f027234c83a670dfdc2268

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I1f2bd75dfd285fa1cebb5cb8eb4772cf2fc9ad7a
Reviewed-on: http://review.whamcloud.com/1634
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-812 synchronize_rcu no longer a define
Wally Wang [Fri, 13 Jan 2012 00:12:54 +0000 (16:12 -0800)]
LU-812 synchronize_rcu no longer a define

synchronize_kernel() from old kernels (pre-2.6.12) is no longer supported,
and synchronize_rcu is no longer a define after 2.6.33. Remove the old
code that is no longer used.

Change-Id: Iac668efc35d82e6218b1924eca009f92d19a0c7a
Signed-off-by: Wally Wang <wang@cray.com>
Reviewed-on: http://review.whamcloud.com/1952
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-883 build: allow blank after signoff, email fix
Andreas Dilger [Thu, 15 Dec 2011 07:16:20 +0000 (00:16 -0700)]
LU-883 build: allow blank after signoff, email fix

Allow blank lines after the signoff section, since there will usually
be a blank line between the signoff and the checkpatch.pl output.

Improve the EMAILPAT regex to detect bad email addresses that do not
have a full name (at least 2 parts) and a full domain (also with
at least 2 parts).

Allow a generic "{Organization}-bug-id:" line in the signoff section
to allow linking the patch commit into arbitrary bug databases for
ease of tracking patches.

Add test cases for all of these changes.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Iaf14f6c985d3dd4837064d29071bd9acd8031d67
Reviewed-on: http://review.whamcloud.com/1867
Reviewed-by: <bruce.korb@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-80 utils: Large stripe support
James Simmons [Tue, 10 Jan 2012 13:59:23 +0000 (08:59 -0500)]
LU-80 utils: Large stripe support

Currently a file can be striped across OSTs up to a limit of
160 stripes due to the ldiskfs xattr size limit of 4096 bytes.
This limit will be increased to 2000 stripes or more by increasing
the maximum xattr size for the MDT.

During testing, issues emerged with clients that are not expecting
more than 160 stripes. This patch allows clients to interoperate with
servers that do not support large xattrs, and also with servers that
support at least 2000 stripes, though it is intended not to have
any hard upper limit.

Change-Id: Idbaeb98919aea6b4cd375b881ed87661034d9394
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/1194
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Yu Jian <yujian@whamcloud.com>
12 years agoLU-960 utils: bad stripe count report, and validate stripe size
Minh Diep [Fri, 6 Jan 2012 01:40:57 +0000 (17:40 -0800)]
LU-960 utils: bad stripe count report, and validate stripe size

Use %d instead of %u to print -1.
Check for -1 in the stripe size input.
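
For illustration only (this is not the code from the patch): a minimal
C sketch of why a count of -1 must be printed with %d. The
demo_format_count() helper is hypothetical, and the unsigned result
assumes a 32-bit unsigned int.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a stripe count into buf. With as_signed set, -1 prints as "-1";
 * otherwise the value is reinterpreted as unsigned, the way %u would
 * display it. Returns the number of characters snprintf() reports.
 */
int demo_format_count(char *buf, size_t len, int count, int as_signed)
{
	assert(buf != NULL && len > 0);
	if (as_signed)
		return snprintf(buf, len, "%d", count);
	return snprintf(buf, len, "%u", (unsigned int)count);
}
```

With a 32-bit unsigned int, the unsigned form of -1 is 4294967295, which
reads like a huge stripe count rather than the -1 the user set.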

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ia9593497764f1d3c0110ad22d70b5da8f7b07a21
Reviewed-on: http://review.whamcloud.com/1922
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Richard Henwood <rhenwood@whamcloud.com>
12 years agoLU-233 test: Remove taring of log files in gather logs.
Chris Gearing [Tue, 4 Oct 2011 17:31:59 +0000 (18:31 +0100)]
LU-233 test: Remove taring of log files in gather logs.

gather_logs in test-framework zips the logs for no apparent purpose,
just adding to the file storage. In fact it creates multiple bz2
files which contain one another. This change removes all the bz2
stuff.

The change removes the Russian dolls whilst keeping the behaviour of
the recovery-*-scale tests.

Signed-off-by: Chris Gearing <chris@whamcloud.com>
Change-Id: I76a15a44395c0bdf4ee01e9240b64bdcdf8b25ed
Reviewed-on: http://review.whamcloud.com/1398
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-822 osd: use bitmask to calculate seq hash
Liang Zhen [Tue, 10 Jan 2012 17:18:52 +0000 (01:18 +0800)]
LU-822 osd: use bitmask to calculate seq hash

We are using mod to hash seq to different OI files, which is not
allowed on 32-bit architectures because seq is 64-bit. This can be
resolved by limiting oi_count to a power of 2 and using a bitmask to
calculate the seq hash. This patch also renames the module parameter
osd_oi_num to osd_oi_count.
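
For illustration only (this is not the code from the patch): a minimal
C sketch of hashing a 64-bit sequence with a bitmask when the bucket
count is a power of two. The demo_* names are hypothetical. For a
power-of-two count the mask gives the same result as seq % count, but
needs no 64-bit division.

```c
#include <assert.h>
#include <stdint.h>

/* Non-zero with exactly one bit set. */
int demo_is_power_of_two(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Map a 64-bit sequence to a bucket index in [0, oi_count). */
unsigned int demo_seq_hash(uint64_t seq, unsigned int oi_count)
{
	assert(demo_is_power_of_two(oi_count));
	return (unsigned int)(seq & (oi_count - 1));
}
```

For example, demo_seq_hash(0x200000401, 8) selects bucket 1, the same
bucket that 0x200000401 % 8 would select.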

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I61c1fac65a33c78b5b5196d2b2d6fd5519deffda
Reviewed-on: http://review.whamcloud.com/1941
Tested-by: Hudson
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-867 gss: adapt to 2.6.32 kernel changes cache_detail
Bobi Jam [Mon, 19 Dec 2011 01:35:27 +0000 (09:35 +0800)]
LU-867 gss: adapt to 2.6.32 kernel changes cache_detail

2.6.32 kernel changes cache_detail's member cache_request to
cache_upcall.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I3ddec24e88fd271c8e88e2c649be52542651b8cc
Reviewed-on: http://review.whamcloud.com/1885
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-874 ldlm: Fix ldlm_bl_* thread creation
Christopher J. Morrone [Thu, 5 Jan 2012 19:49:50 +0000 (11:49 -0800)]
LU-874 ldlm: Fix ldlm_bl_* thread creation

Always create a new ldlm_bl_ thread when all threads
are busy, not just after returning from sleep.

Change-Id: I2fa99a0f09a42e1333589fc7bc2a6eebef4924b6
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/1926
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
12 years agoLU-976 tests: skip MMP test for non-ldiskfs backfs
Andreas Dilger [Mon, 9 Jan 2012 19:04:54 +0000 (12:04 -0700)]
LU-976 tests: skip MMP test for non-ldiskfs backfs

Skip the Multi-Mount Protection tests for non-ldiskfs backing
filesystems.  MMP is currently only implemented for ldiskfs,
so until that changes this test cannot pass for any other FSTYPE.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I4dc9936fa6f34b4dc4cc465a6352daae2198788f
Reviewed-on: http://review.whamcloud.com/1934
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-987 build: Fail to create ldisk rpms
James Simmons [Fri, 13 Jan 2012 16:11:29 +0000 (11:11 -0500)]
LU-987 build: Fail to create ldisk rpms

The autoMakefile.am in ldiskfs does not define the
BUILD_SERVER flag, so make rpms fails. This patch
simply sets the flag to true, since ldiskfs will most
likely be used only on servers.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I51e31d1ce0fd1d7e8639426852af1888a9a93f4f
Reviewed-on: http://review.whamcloud.com/1964
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-590 tests: obdfilter-survey.sh fails test 2a.
Andriy Skulysh [Thu, 5 Jan 2012 13:04:14 +0000 (15:04 +0200)]
LU-590 tests: obdfilter-survey.sh fails test 2a.

obdfilter-survey.sh fails to connect during test 2a.
Interrupting and restarting the test leads to a panic.
This patch sets the correct target for the netdisk case, allows
signals to be handled in test scripts, and sets the correct
parameters for obdfilter-survey cleanup.

Xyratex-bug-id: MRP-118
Change-Id: I057610ba51e9a9afb704b4467b8600fd61652a71
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-on: http://review.whamcloud.com/1288
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-952 quota: follow locking order of quota code
Niu Yawei [Fri, 6 Jan 2012 09:18:35 +0000 (01:18 -0800)]
LU-952 quota: follow locking order of quota code

The locking order of quota code is: i_mutex > dqonoff_sem >
journal_lock > dqptr_sem > dquot->dq_lock > dqio_mutex, so we
should call ll_vfs_dq_init() after the journal has started to avoid
a deadlock.
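
The ordering rule above can be sketched as a small rank check. The lock names are taken from the commit message; the checker itself is purely illustrative and is not part of Lustre.

```python
# Lock ranks follow the order quoted in the commit message
# (outermost lock first). The checker is an illustration only.
LOCK_ORDER = [
    "i_mutex",
    "dqonoff_sem",
    "journal_lock",
    "dqptr_sem",
    "dquot->dq_lock",
    "dqio_mutex",
]
RANK = {name: i for i, name in enumerate(LOCK_ORDER)}


def follows_lock_order(acquired):
    """Return True if the locks are taken in strictly increasing rank."""
    ranks = [RANK[name] for name in acquired]
    return all(a < b for a, b in zip(ranks, ranks[1:]))
```

Taking journal_lock before a later lock such as dqptr_sem passes the check, while the reverse sequence is flagged as an ordering violation.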

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia88a2eb8c9dc3827afd4828e0160ee376a1f041e
Reviewed-on: http://review.whamcloud.com/1923
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoRevert "LU-633 iokit: mdt-survey script for MD echo client test"
Oleg Drokin [Fri, 13 Jan 2012 22:27:45 +0000 (17:27 -0500)]
Revert "LU-633 iokit: mdt-survey script for MD echo client test"

Broke build somehow, even though passed gerrit, so reverting meanwhile.

This reverts commit 4f5fc8f9f0a93274c7b7ba7ab79d7a338cd0dfb8.

Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-633 iokit: mdt-survey script for MD echo client test
Minh Diep [Wed, 7 Dec 2011 00:38:54 +0000 (16:38 -0800)]
LU-633 iokit: mdt-survey script for MD echo client test

Create a mdt-survey script to run echo client for MD

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I2d75de5c28b70a3b8474ed8512389d850c77e638
Reviewed-on: http://review.whamcloud.com/1803
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 tests: check lustre.conf for modprobe
Andreas Dilger [Fri, 16 Dec 2011 02:24:57 +0000 (19:24 -0700)]
LU-506 tests: check lustre.conf for modprobe

The newer userspace modprobe requires that the /etc/modprobe.d/
files end in ".conf", so that there is no confusion between config
files that are supposed to be parsed, and files that are left
after editing (e.g. lustre.conf.orig or lustre.conf~).

Add a check for /etc/modprobe.d/lustre.conf to get lnet module
parameters during testing.
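
A minimal sketch of the naming rule described above, assuming a plain list of file names as input. The helper names are hypothetical and do not come from the test framework.

```python
import posixpath


def modprobe_config_files(names):
    """Keep only names that modprobe would parse: those ending in '.conf'."""
    return sorted(n for n in names if n.endswith(".conf"))


def lustre_conf_path(confdir="/etc/modprobe.d"):
    """Path checked for lnet module parameters, per the commit message."""
    return posixpath.join(confdir, "lustre.conf")
```

Editor leftovers such as lustre.conf.orig or lustre.conf~ are filtered out because they do not end in ".conf".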

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: Ie408f5334b2570f13e225038488df06ff7b524c4
Reviewed-on: http://review.whamcloud.com/1877
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-954 build: make lbuild to build lustre-iokit
Minh Diep [Fri, 30 Dec 2011 00:43:44 +0000 (16:43 -0800)]
LU-954 build: make lbuild to build lustre-iokit

Add additional code to lbuild to build the lustre-iokit rpm.
It should be noarch because it is independent of any arch.
Add info to the Changelog and change the iokit version to 1.3.0.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I8b3c0bbaf348c350155d7c50e9ffcb4ec92fe1ff
Reviewed-on: http://review.whamcloud.com/1907
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Michael MacDonald <mjmac@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agonew tag 2.1.54 2.1.54 v2_1_54_0
Oleg Drokin [Wed, 11 Jan 2012 18:00:36 +0000 (13:00 -0500)]
new tag 2.1.54

Change-Id: I62f089d9c945452336b7d22b2b0bca06890ebe7f
Signed-off-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-868 ptlrpc: Fix the timeout for waiting next replay
wangdi [Mon, 21 Nov 2011 06:57:03 +0000 (22:57 -0800)]
LU-868 ptlrpc: Fix the timeout for waiting next replay

During recovery, when setting the timeout for waiting for the next
replay, it should consider net latency (added into the timeout) and
early replies as well, so that if the server sends an early reply for
the request, the client might extend the timeout according to the
current estimated service time.

Signed-off-by: Wang di <di.wang@whamcloud.com>
Change-Id: I23ebf1dc3f525f78573890be26474b2c79c65a6d
Reviewed-on: http://review.whamcloud.com/1716
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 kernel: FC15 - Introduce simple_setattr().
yangsheng [Wed, 31 Aug 2011 15:52:28 +0000 (23:52 +0800)]
LU-506 kernel: FC15 - Introduce simple_setattr().

Since 2.6.35, simple_setattr() has been introduced to
replace inode_setattr().
kernel-commit: 7bb46a6734a7e1ad4beaecc11cae7ed3ff81d30f

Signed-off-by: Yang Sheng <ys@whamcloud.com>
Change-Id: I0e002f20fa4bafb18e0c7d3c55924800265658f6
Reviewed-on: http://review.whamcloud.com/1863
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 kernel: Add config check for sb_any_quota_loaded
yangsheng [Tue, 10 Jan 2012 18:59:11 +0000 (02:59 +0800)]
LU-506 kernel: Add config check for sb_any_quota_loaded

Add a config check for sb_any_quota_loaded(), since
OFED only backports DQUOT_USAGE_ENABLED.

Signed-off-by: yangsheng <ys@whamcloud.com>
Change-Id: I2f5fe42408bed1be7110944e7a47e0af16d896db
Reviewed-on: http://review.whamcloud.com/1943
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Michael MacDonald <mjmac@whamcloud.com>
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-884 osc: async osc_check_rpcs()
Jinshan Xiong [Wed, 4 Jan 2012 18:28:29 +0000 (10:28 -0800)]
LU-884 osc: async osc_check_rpcs()

Add a new "async" parameter to osc_check_rpcs(); if it is called with
async, it will compose a fake ptlrpc_request so that RPCs will be
composed and issued in ptlrpcd context.

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I1c2f8ae43da8146428c474f17ddf3dc23a2df9ef
Reviewed-on: http://review.whamcloud.com/1825
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-925 agl: async glimpse lock process in CLIO stack
Fan Yong [Fri, 6 Jan 2012 06:49:42 +0000 (14:49 +0800)]
LU-925 agl: async glimpse lock process in CLIO stack

Adjust CLIO lock state machine for supporting:
1. unuse lock in non-hold state.
2. re-enqueue non-granted glimpse lock.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I9de8939a398d7b4c7062e6c5859bca06deddd089
Reviewed-on: http://review.whamcloud.com/1243
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-962 ptlrpc: feature to run callback in ptlrpcd context
Jinshan Xiong [Wed, 4 Jan 2012 07:56:22 +0000 (23:56 -0800)]
LU-962 ptlrpc: feature to run callback in ptlrpcd context

In this patch, a feature is added to run a callback in ptlrpcd
context. We need a ptlrpc work for this purpose. There are three
functions exported:
  1. ptlrpc_alloc_work() to allocate work;
  2. ptlrpc_run_work() to run an allocated work, this function can
     be executed many times;
  3. ptlrpc_destroy_work() to destroy the work;
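
The three-step lifecycle above can be modelled roughly as follows. This is a Python stand-in, not the exported C interface: the class, the method signatures, and the choice to coalesce repeated run requests are all assumptions made for illustration.

```python
from collections import deque


class WorkDaemon:
    """Stand-in for the daemon context that executes queued callbacks."""

    def __init__(self):
        self._queue = deque()

    def alloc_work(self, callback, data):
        """Step 1: allocate a work item bound to a callback."""
        return {"cb": callback, "data": data, "queued": False, "alive": True}

    def run_work(self, work):
        """Step 2: schedule the work; may be called many times."""
        if not work["alive"]:
            raise RuntimeError("work was destroyed")
        if not work["queued"]:  # already pending: nothing more to do
            work["queued"] = True
            self._queue.append(work)

    def destroy_work(self, work):
        """Step 3: destroy the work so it can no longer run."""
        work["alive"] = False

    def poll(self):
        """Run every pending callback in the daemon's context."""
        while self._queue:
            work = self._queue.popleft()
            work["queued"] = False
            if work["alive"]:
                work["cb"](work["data"])
```

The caller only requests execution; the callback itself runs when the daemon polls its queue, which is the point of handing work to another context.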

Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I2bce5a17003855468eab9075fb50ed02d7bcc208
Reviewed-on: http://review.whamcloud.com/1917
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 kernel: FC15 - small changes
yangsheng [Thu, 5 Jan 2012 17:14:34 +0000 (01:14 +0800)]
LU-506 kernel: FC15 - small changes

   -- stacktrace_ops.{warning(), warning_symbol()} removed.
   -- sk_sleep() helper added.
   -- quota_on() 4 parameter change to use 'struct path'.
   -- fs_struct.lock change to use spin_lock.
   -- other trivial changes.

Change-Id: Ic9bf47454b19c1cfc3e41cd3aebbabb074f6110f
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1864
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 kernel: FC15 - reiserfs remove & cleanup
yangsheng [Wed, 4 Jan 2012 11:22:04 +0000 (19:22 +0800)]
LU-506 kernel: FC15 - reiserfs remove & cleanup

 -- Remove reiserfs support entirely.
 -- Stop using PATCHLEVEL, since the kernel has
    moved forward to version 3.1.

Signed-off-by: yangsheng <ys@whamcloud.com>
Change-Id: I86d185ba522b5c9fc5e16bfe1d34be8720573e58
Reviewed-on: http://review.whamcloud.com/1915
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-822 osd: multiple Object Index files
Liang Zhen [Thu, 8 Dec 2011 16:48:29 +0000 (00:48 +0800)]
LU-822 osd: multiple Object Index files

A single OI container could be a performance bottleneck on the server
side, because many service threads may contend on the same OI htree.
Even though the OI htree can support parallel operations, there is
still a lot of spinlock and cacheline contention.
Also, parallel operations on the OI htree can't scale very well if
there are hundreds or thousands of threads, because of the limitations
of dynlock. Instead of fixing the scalability of dynlock, the long-term
solution is more straightforward: we can simply support multiple OI
containers and hash service threads to different OIs by lu_fid::f_seq.
We need to make sure this feature can support a single OI created
by 2.1 or earlier versions. Also, the user can specify the number of
OIs with the module parameter osd_oi_num when creating a new
filesystem; this parameter is ignored if the OSD is loaded on an
existing filesystem.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Iaa5ef9e43b80301150608802e40b4ef506467457
Reviewed-on: http://review.whamcloud.com/1822
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-879 mds: Add a few rename stats under /proc
wangdi [Fri, 16 Dec 2011 01:19:31 +0000 (17:19 -0800)]
LU-879 mds: Add a few rename stats under /proc

1. Add samedir_rename in /proc/fs/lustre/mds/lustre-MDT0000/stats
to collect stats of same dir rename.
2. Add crossdir_rename in /proc/fs/lustre/mds/lustre-MDT0000/stats
to collect stats of cross dir rename.
3. Add /proc/fs/lustre/mds/lustre-MDT0000/rename_stats (YAML format)
to collect stats of renames that happen in directories of different
sizes.
The size of directories under which files are being removed.
With these patches, it is possible to find out how many renames take
place in the same directory compared to how many renames are between
directories. So during DNE implementation, we can know how rename may
be affected by DNE remote directories and large striped directories.
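
As a sketch of the two counters described above, assuming renames are given as (source, destination) path pairs. The helper names are hypothetical; only the counter names come from the commit message.

```python
import posixpath
from collections import Counter


def classify_rename(src, dst):
    """Return the counter name for a rename from src to dst."""
    same = posixpath.dirname(src) == posixpath.dirname(dst)
    return "samedir_rename" if same else "crossdir_rename"


def count_renames(renames):
    """Tally same-directory and cross-directory renames."""
    return Counter(classify_rename(s, d) for s, d in renames)
```

Comparing the two totals gives the ratio the commit message is interested in for planning DNE.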

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: I4452ce196802c5724607455e0a9b4b372b06f159
Reviewed-on: http://review.whamcloud.com/1878
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-169 lov: add generation number to LOV EA
Johann Lombardi [Thu, 15 Dec 2011 00:00:00 +0000 (01:00 +0100)]
LU-169 lov: add generation number to LOV EA

This patch shrinks lov_mds_md_v*::lmm_stripe_count to 16 bits and uses
the remaining 16 bits to store a generation number for the layout.
This generation will be used in conjunction with the layout lock to
allow clients to detect when the file layout has changed.
The layout generation starts at 0 and will be bumped each time the
file layout is altered.
For backward compatibility, the layout generation is set to 0 when the
layout is sent to a client that does not support
OBD_CONNECT_LAYOUTLOCK.
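
A minimal sketch of the 16/16 split described above. The field order and the little-endian byte order are assumptions for illustration, and the function names are hypothetical; the real structure layout should be checked against the Lustre headers.

```python
import struct

MASK16 = 0xFFFF


def pack_stripe_and_gen(stripe_count, layout_gen):
    """Pack two 16-bit values into the 4 bytes of the former 32-bit count.

    ASSUMPTION: stripe count first, generation second, little-endian.
    """
    if not (0 <= stripe_count <= MASK16 and 0 <= layout_gen <= MASK16):
        raise ValueError("both fields must fit in 16 bits")
    return struct.pack("<HH", stripe_count, layout_gen)


def unpack_for_client(raw, supports_layout_lock):
    """Return (stripe_count, layout_gen) as a client would see them.

    Clients without layout-lock support get generation 0.
    """
    stripe_count, layout_gen = struct.unpack("<HH", raw)
    if not supports_layout_lock:
        layout_gen = 0
    return stripe_count, layout_gen
```

Under these assumptions a generation of 0 yields the same bytes as the old 32-bit stripe count, which is one way to see why zeroing the generation keeps older clients compatible.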

This patch also stores OBD_INCOMPAT_LMM_VER in the MDS last_rcvd file
to prevent older versions of Lustre that cannot deal with a 16-bit
lmm_stripe from mounting the filesystem and exporting layouts to older
clients without setting the layout to 0.  This flag will be set in the
last_rcvd file only once we start modifying file layouts, but at least
the current version should not be confused by a 16-bit stripe count
and a non-zero generation.

Signed-off-by: Jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Change-Id: Ic40083227057eba565287d1a10890875b8a96c13
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1866
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-593 obdclass: echo client for MDS stack
wangdi [Wed, 16 Nov 2011 22:55:23 +0000 (14:55 -0800)]
LU-593 obdclass: echo client for MDS stack

1. Add interfaces and tools for exercising a local MDT
   device for performance reasons, in a similar manner
   to obdfilter-survey.
2. add test_create, test_mkdir, test_lookup, test_destroy,
   test_rmdir, test_setxattr, test_md_getattr in lctl for
   md echo client test.

Signed-off-by: Wang di <di.wang@whamcloud.com>
Change-Id: Ibf774a567820ff36b3624e44371c63a9428d82a5
Reviewed-on: http://review.whamcloud.com/1287
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-955 build: fix bad lustre-backend-fs dependency
Minh Diep [Thu, 5 Jan 2012 17:12:29 +0000 (09:12 -0800)]
LU-955 build: fix bad lustre-backend-fs dependency

Fix an incorrect RPM package dependency if Lustre RPMs are built
with "make rpms" with client only

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: Ib8691a37fc230cc63b7aca48bc5146a67e10a2f0
Reviewed-on: http://review.whamcloud.com/1576
Reviewed-by: Brian J. Murrell <brian@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-909 osd: changes to osd api
Alex Zhuravlev [Fri, 6 Jan 2012 05:01:40 +0000 (13:01 +0800)]
LU-909 osd: changes to osd api

The main purpose of the patch is to get declare methods into the API
and to teach osd-based devices to use them.

- new declaration methods for each changing method
- explicit destroy method:
  ->do_ref_del() never destroys the object
- methods to access data in 0-copy manner:
  no actual implementation in this patch
- mdd/fld use new methods to create/declare/start transactions
- specific method to change/access version are removed:
  use xattr methods
- ldiskfs osd tracks all declarations and asserts if caller
  is trying to call changing method w/o proper declaration
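
The declare-then-execute discipline in the list above can be sketched as a toy transaction handle. The class and method names are hypothetical and are not the OSD API.

```python
class Transaction:
    """Toy transaction handle that records declared operations."""

    def __init__(self):
        self.declared = set()
        self.started = False

    def declare(self, op):
        """Announce an upcoming change; only allowed before start()."""
        if self.started:
            raise RuntimeError("cannot declare after the transaction started")
        self.declared.add(op)

    def start(self):
        """Begin the transaction; no further declarations are accepted."""
        self.started = True

    def execute(self, op):
        """Perform a change, asserting that it was declared first."""
        assert self.started, "transaction not started"
        assert op in self.declared, f"{op} executed without declaration"
        return f"{op} done"
```

Declaring everything up front lets the backend know what a transaction will touch before it starts, and the assertion catches callers that skip the declaration.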

Change-Id: I473c0c2950c1920abb2fef1dac465c08f35522ea
Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1669
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-847 quota: move lmv/mdc/osc/lov quota code out of lquota
Johann Lombardi [Fri, 30 Sep 2011 13:12:59 +0000 (06:12 -0700)]
LU-847 quota: move lmv/mdc/osc/lov quota code out of lquota

All quota code was initially put into a separate kernel module since
quota might be released under a different license. That's not relevant
any more and we can now move the client-side quota code back to the
regular lustre modules.

This removes useless indirections and makes the code easier to read
and to maintain.

Change-Id: If898a46db9158edb8e4eaf855f1ed98db97330f0
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1613
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-80 tests: large xattr support
Yu Jian [Fri, 30 Dec 2011 06:28:44 +0000 (14:28 +0800)]
LU-80 tests: large xattr support

This patch adds and updates some test cases to verify the
large xattr feature. To enable this feature, the "-O large_xattr"
option needs to be set on the filesystem either with --mkfsoptions
at format time or via tune2fs.

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: Ie62e6e2194fdaa7239ec4e1451a4e696888dca8c
Reviewed-on: http://review.whamcloud.com/1880
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-80 lov: large stripe count support
Alexander Boyko [Fri, 30 Dec 2011 06:24:10 +0000 (14:24 +0800)]
LU-80 lov: large stripe count support

Currently a file can be striped across OSTs up to a limit of
160 stripes. This patch expands that limit to 2000, and it is
possible to go to even larger stripe counts.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Change-Id: I42e1aad35dd056faac23a0d5b025e0a23fc4ec2f
Reviewed-on: http://review.whamcloud.com/1111
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-80 ldiskfs: large EA support
James Simmons [Fri, 30 Dec 2011 06:02:25 +0000 (14:02 +0800)]
LU-80 ldiskfs: large EA support

This patch implements the large EA support in ext4. If the size of
an EA value is larger than the blocksize, then the EA value would
not be saved in the external EA block, instead it would be saved
in an external EA inode. So, the patch also helps support a larger
number of EAs.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ic08921cf13483c9b28560c987773d7aa36c62fac
Reviewed-on: http://review.whamcloud.com/1708
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
12 years agoLU-250 lnet: Router random shuffling support
James Simmons [Wed, 28 Dec 2011 16:00:56 +0000 (11:00 -0500)]
LU-250 lnet: Router random shuffling support

This is the last of the router improvements that were
developed at ORNL. This feature allows routes to be
randomly placed. Our results show a 20 percent
improvement with random shuffling.
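
One way to picture random placement is inserting each new route at a random position rather than appending it. This is only a sketch of the general idea; how LNet actually places routes is not stated in the commit message, and the function name is hypothetical.

```python
import random


def add_route_shuffled(routes, new_route, rng=random):
    """Insert new_route at a random position instead of appending it."""
    position = rng.randint(0, len(routes))  # inclusive on both ends
    routes.insert(position, new_route)
    return position
```

Passing a seeded random.Random instance as rng makes the placement reproducible for testing.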

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I6862febebd6797fb35c003646f6d90f5d5d2b014
Reviewed-on: http://review.whamcloud.com/249
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Isaac Huang <Isaac_Huang@xyratex.com>
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 kernel: FC15 - file_operations relate changes.
yangsheng [Wed, 4 Jan 2012 15:23:16 +0000 (23:23 +0800)]
LU-506 kernel: FC15 - file_operations relate changes.

   -- file_operations.ioctl() has been removed.
   -- file_operations.fsync() now takes 2 arguments.

Change-Id: I7776593497dd988fbf860221e2dcea61c6c4870f
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1862
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
12 years agoLU-925 agl: trigger async glimpse lock when statahead
Fan Yong [Fri, 16 Dec 2011 07:34:25 +0000 (15:34 +0800)]
LU-925 agl: trigger async glimpse lock when statahead

Client will send async glimpse lock RPCs to OSTs for file size
attribute before stat files to accelerate traversing large dir.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I9bf850abdd3c02c8470ddfcd91a3dc3ef7819c6d
Reviewed-on: http://review.whamcloud.com/1692
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-506 kernel: FC15 - vfsmount.mnt_count doesn't use atomic_t
yangsheng [Fri, 2 Sep 2011 02:20:02 +0000 (10:20 +0800)]
LU-506 kernel: FC15 - vfsmount.mnt_count doesn't use atomic_t

vfsmount.mnt_count uses a per-cpu variable instead of atomic_t.

Change-Id: I5eba5a67839719e03d0b44be6312b452bf4dcb98
Signed-off-by: Yang Sheng <ys@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1845
Tested-by: Hudson
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-639 obdclass: wait obd cleanup before mount
Bobi Jam [Tue, 20 Dec 2011 09:59:07 +0000 (17:59 +0800)]
LU-639 obdclass: wait obd cleanup before mount

Obd device cleanup is executed by the obd zombie thread, and the umount
thread can return successfully before the obd zombie finishes its job.
In some cases, especially in test cases, a test may start before the
previous test finishes obd cleanup. This patch makes the mount thread
wait for the obd zombie to finish its job.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I881be5de18960867c36e8c4e4180c0c594d88a01
Reviewed-on: http://review.whamcloud.com/1896
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-176 obdclass: disable mb_cache by default
Liang Zhen [Tue, 29 Mar 2011 03:37:28 +0000 (11:37 +0800)]
LU-176 obdclass: disable mb_cache by default

We are supposed to disable mb_cache by default (bug 22771),
but we never did because it's not in default mount option.
This patch will add "no_mbcache" to default mount option.

Change-Id: I13c9db8e98ad305d26d887d8bd069eff92c20763
Signed-off-by: Liang Zhen <liang@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/373
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-748 test: shorten the runtime of sanity subtest_220
Hongchao Zhang [Mon, 12 Dec 2011 18:28:08 +0000 (02:28 +0800)]
LU-748 test: shorten the runtime of sanity subtest_220

In sanity.sh, test_220 tries to exhaust all of the inodes on the OSTs
in order to verify that when an OST returns -ENOSPC to an inode
precreate request but there are still free blocks, the MDS continues to
use the precreated inodes on the OSTs.

Signed-off-by: Hongchao Zhang <hongchao.zhang@whamcloud.com>
Change-Id: Icaad07311125f362f0efb26da76534c7dca27b6a
Reviewed-on: http://review.whamcloud.com/1676
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
12 years agoLU-888 lctl: remove perilous lctl {get,set,list}_param behavior.
Richard Henwood [Sat, 3 Dec 2011 02:44:45 +0000 (21:44 -0500)]
LU-888 lctl: remove perilous lctl {get,set,list}_param behavior.

This patch stops {get,set}_param from potentially reading and
writing to files in the current working directory. Now lctl
{get,set,list}_param visits /proc/{fs,sys}/{lnet,lustre} for
parameter values. Specifying a file path to {get,set,list}_param
is deprecated behavior.

Using a path with lctl {get,set,list}_param fails if the path does
not begin with '/proc/'. If the path begins with '/proc/', a warning
is printed and the command executes using the given path.

A new helper function lprocfs_param_pattern is introduced to
provide checking and constructing the proc path.
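
The path rules above might be sketched as follows. This is not the lprocfs_param_pattern helper itself: the function name, the return shape, and the treatment of dots as path separators are assumptions for illustration.

```python
PROC_ROOTS = [
    "/proc/fs/lnet", "/proc/fs/lustre",
    "/proc/sys/lnet", "/proc/sys/lustre",
]


def param_patterns(param):
    """Return (patterns, warning) for a parameter name or /proc path."""
    if "/" in param:
        if not param.startswith("/proc/"):
            raise ValueError("paths must begin with /proc/")
        return [param], "specifying a path is deprecated"
    # ASSUMPTION: dots in a parameter name separate path components.
    relative = param.replace(".", "/")
    return [f"{root}/{relative}" for root in PROC_ROOTS], None
```

A bare parameter name expands to one candidate pattern per root, so nothing in the current working directory is ever consulted.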

Test suite has been searched for {get,set,list}_param lctl calls.
All specified parameters have been checked to ensure they are
not using the deprecated path interface.

lctl man page is updated to remove ambiguity around using paths
to specify parameters.

Signed-off-by: Richard Henwood <rhenwood@whamcloud.com>
Change-Id: I39e355b28fb1337f5b3a53f9e7265f4e969ddd2d
Reviewed-on: http://review.whamcloud.com/1765
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: John Hammond <jhammond@tacc.utexas.edu>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-844 test: limit max IO data size for obdfilter test
Bobi Jam [Thu, 24 Nov 2011 09:55:26 +0000 (17:55 +0800)]
LU-844 test: limit max IO data size for obdfilter test

obdfilter-survey disk case test only supports maximum 1M IO data.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Id974a3cacf50ffe760771224a285b0b7cd308840
Reviewed-on: http://review.whamcloud.com/1741
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-613 clio: Client dead-lock during binary exec
Jinshan Xiong [Fri, 19 Aug 2011 22:24:15 +0000 (15:24 -0700)]
LU-613 clio: Client dead-lock during binary exec

The root cause is clio takes attr lock and i_size_sem in reverse
order.

In my patch, I find out there is no deadlock issue any more in fault
and truncate path, so it holds i_size_sem in fault path to fix this
problem.

Change-Id: I04cca9324158a34fded6651692410e29aae2e402
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1281
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-932 llite: cleanup ll_inode_info to reduce inode size
Fan Yong [Fri, 16 Dec 2011 05:58:37 +0000 (13:58 +0800)]
LU-932 llite: cleanup ll_inode_info to reduce inode size

ll_inode_info contains many special-purpose members; some of them are
only used for directory objects, and some are only used for
non-directory objects. Share memory between those members that never
coexist to reduce inode size.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: Ic10014b2121f64718667addb77ed009aa29c1b4c
Reviewed-on: http://review.whamcloud.com/1691
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-927 ptlrpc: common interfaces for ptlrpc_thread::t_flags
Fan Yong [Thu, 15 Dec 2011 01:50:20 +0000 (09:50 +0800)]
LU-927 ptlrpc: common interfaces for ptlrpc_thread::t_flags

Build some common interfaces for ptlrpc_thread::t_flags processing:
thread_is_xxx and thread_yyy_flags. ptlrpc_thread::t_flags may only be
operated on through these interfaces.
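
The accessor pattern above could look roughly like this. The flag names and values are illustrative and are not Lustre's, and the Python functions only mimic the thread_is_xxx / thread_yyy_flags naming scheme.

```python
FLAG_STOPPED = 1 << 0  # illustrative flag values, not Lustre's
FLAG_RUNNING = 1 << 1


class Thread:
    """Minimal stand-in holding only the flag word."""

    def __init__(self):
        self.t_flags = 0


def thread_set_flags(thread, flags):
    """Replace the whole flag word."""
    thread.t_flags = flags


def thread_add_flags(thread, flags):
    """Turn on the given bits, leaving the others untouched."""
    thread.t_flags |= flags


def thread_clear_flags(thread, flags):
    """Turn off the given bits, leaving the others untouched."""
    thread.t_flags &= ~flags


def thread_is_running(thread):
    """Query helper: True when the running bit is set."""
    return bool(thread.t_flags & FLAG_RUNNING)
```

Routing every read and write through helpers like these keeps the flag encoding in one place, so it can change without touching callers.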

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I285878ffeff8810ecbaf7a42b8d9381f392c0c9a
Reviewed-on: http://review.whamcloud.com/1690
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-805 quota: lfs quota doesn't print grace time correctly
Niu Yawei [Tue, 8 Nov 2011 13:07:05 +0000 (05:07 -0800)]
LU-805 quota: lfs quota doesn't print grace time correctly

Lustre always triggers the grace time when the allocated qunit exceeds
the softlimit. However, the user tool 'lfs quota' only prints the grace
time when the total usage is greater than the softlimit, so sometimes
the user can't tell from the 'lfs quota' output whether the softlimit
has already been exceeded.

This patch changes 'lfs quota' to use the data obtained from the kernel
instead of comparing usage with the softlimit.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia564c803ca33b2cf925759b6a6e4e4df2692f28d
Reviewed-on: http://review.whamcloud.com/1674
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-847 quota: client retrieve quota usage directly
Niu Yawei [Thu, 15 Sep 2011 03:36:19 +0000 (20:36 -0700)]
LU-847 quota: client retrieve quota usage directly

Currently 'lfs quota' sends a getquota RPC to the MDS, and the MDS is
responsible for retrieving disk usage from all targets. This scheme
will be changed so that the client retrieves disk usage from all
targets directly.

This patch addresses the compatibility issue as well: if the getquota
reply returned by the MDS has QIF_SPACE set, the client just trusts
the disk usage returned by the MDS; otherwise, the client has to issue
RPCs to all OSTs to collect disk usage by itself.
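
The compatibility fallback above can be sketched as a single decision. The reply layout, the query_ost callable, and the flag value used here are assumptions for illustration rather than the client's real data structures.

```python
QIF_SPACE = 2  # ASSUMPTION: illustrative bit meaning "space usage is valid"


def total_space_usage(mds_reply, query_ost, osts):
    """Return disk usage, preferring the MDS figure when it is marked valid.

    mds_reply: dict with 'valid' (flag bits) and 'curspace' (bytes).
    query_ost: callable returning usage for a single OST (hypothetical RPC).
    """
    if mds_reply["valid"] & QIF_SPACE:
        return mds_reply["curspace"]
    return sum(query_ost(ost) for ost in osts)
```

When the flag is set no OST is contacted at all; otherwise the client sums the per-OST answers itself.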

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Ia94d3b3ba280d5b31d2d3c508412d662f4e95321
Reviewed-on: http://review.whamcloud.com/1382
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
12 years agoLU-924 test: sync client data before reboot server
Niu Yawei [Mon, 19 Dec 2011 06:51:26 +0000 (22:51 -0800)]
LU-924 test: sync client data before reboot server

Lustre doesn't write client data synchronously (to avoid flooding sync
writes when there are many clients connecting, see exp_need_sync), so
if the server reboots before client data reaches disk, the client data
will be lost and the client will be evicted after recovery, which is
not what we expect.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I21bcafe1f2630285c9108c0528467eb177e3449b
Reviewed-on: http://review.whamcloud.com/1888
Tested-by: Hudson
Reviewed-by: Chris Gearing <chris@whamcloud.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-32 osd: keep root node BH ref of IAM container
Liang Zhen [Mon, 27 Dec 2010 05:35:20 +0000 (13:35 +0800)]
LU-32 osd: keep root node BH ref of IAM container

IAM in ldiskfs-osd always consumes some slots in bh_lru (see
fs/buffer.c). If we keep a buffer_head reference on the root node,
we save one slot in bh_lru, which may help overall performance.
In my tests, the LRU hit rate increased by 5%-10% while creating
files when this reference was always kept.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I954f26932462169c9bfc6bbfe1a57b5348624179
Reviewed-on: http://review.whamcloud.com/1826
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-617 recovery: setattr from open breaks recovery
Niu Yawei [Fri, 26 Aug 2011 02:58:36 +0000 (19:58 -0700)]
LU-617 recovery: setattr from open breaks recovery

The setattr from open (open(O_TRUNC)) is currently serialized with
'cl_setattr_lock' on the client and goes to a dedicated portal, which
differs from other reint operations. Consequently, the setattr RPC can
run in parallel with other reint RPCs, and that results in a race when
updating last_transno/last_xid on the server.

This patch removes the 'cl_setattr_lock' code so that all reint
operations are serialized by 'cl_rpc_lock'. The server-side code that
assumes the client is holding a DLM lock during setattr from open is
also removed, since that assumption is not true.

The MDS_SETATTR_PORTAL service is preserved to keep compatibility with
old clients, and MDS_SETATTR_FROM_OPEN is also preserved, since we use
this flag to check write access for open(O_TRUNC), and it can probably
be used for some optimization in the future.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I45f83f8f05022ff0d31f8e7784381821c835785d
Reviewed-on: http://review.whamcloud.com/1654
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Mikhail Pershin <tappro@whamcloud.com>
12 years agoLU-867 ptlrpc: Resolve section mismatch
Bobi Jam [Sat, 5 Nov 2011 07:55:39 +0000 (15:55 +0800)]
LU-867 ptlrpc: Resolve section mismatch

The __init and __exit attributes appear to have been overused in
the ptlrpc module.  They should only be attached to ptlrpc_init and
ptlrpc_exit respectively, and those functions should call the required
cleanup functions.  By using the attributes too broadly, the following
section mismatch warning will be hit.  This identifies cases where
you're calling an __exit function from an __init function, which is in
the wrong section.

WARNING: ~/src/git/lustre/lustre/ptlrpc/ptlrpc.o(.init.text+0x354):

Section mismatch in reference from the function ptlrpc_init() to the
function .exit.text:sptlrpc_fini() The function __init ptlrpc_init()
references a function __exit sptlrpc_fini().

This is often seen when error handling in the init function uses
functionality in the exit path.  The fix is often to remove the
__exit annotation of sptlrpc_fini() so it may be used outside an
exit section.
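
A minimal sketch of the pattern, assuming hypothetical names (demo_init,
demo_exit, sub_init, sub_fini) rather than the actual ptlrpc functions.
The annotations are defined away outside the kernel so the sketch also
compiles in user space:

```c
/* Outside the kernel these annotations do not exist; make them no-ops. */
#ifndef __KERNEL__
#define __init
#define __exit
#endif

static int sub_state;	/* hypothetical subsystem state: 1 = set up */

static int sub_init(void)
{
	sub_state = 1;
	return 0;
}

/*
 * Deliberately NOT marked __exit: it is called from the init error path
 * as well as from the exit path, so it must live in ordinary text.
 */
static void sub_fini(void)
{
	sub_state = 0;
}

static int __init demo_init(int fail_late)
{
	int rc = sub_init();

	if (rc)
		return rc;
	if (fail_late) {	/* simulated failure of a later setup step */
		sub_fini();	/* fine: no reference into .exit.text */
		return -1;
	}
	return 0;
}

static void __exit demo_exit(void)
{
	sub_fini();
}
```

Had sub_fini() carried __exit, the call from demo_init() would be the
kind of reference the warning above reports.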

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I94a4cb5fb7deea41fce5ee939470dc2e40908f98
Reviewed-on: http://review.whamcloud.com/1653
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-882 quota: Quota code compares unsigned < 0
Niu Yawei [Mon, 19 Dec 2011 10:01:36 +0000 (02:01 -0800)]
LU-882 quota: Quota code compares unsigned < 0

Port from b23858.

In check_cur_qunit(), it checks "if (limit + record < 0)"; however,
limit is unsigned, so this check is always false, and when limit is
smaller than -record, the following "limit += record" makes limit an
unreasonably large value.

This patch also fixed a similar defect in dqacq_handler().
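
The defect can be illustrated with a small sketch. demo_adjust_limit()
is a hypothetical helper, not the code in check_cur_qunit(), and
clamping to zero is an assumption made here for illustration:

```c
#include <stdint.h>

/*
 * Apply a signed adjustment to an unsigned limit without wrapping.
 *
 * The buggy form "if (limit + record < 0)" is evaluated in unsigned
 * arithmetic, so it can never be true; the sum wraps around instead.
 */
static uint64_t demo_adjust_limit(uint64_t limit, int64_t record)
{
	if (record < 0) {
		/* Magnitude of record, computed without signed overflow. */
		uint64_t dec = (uint64_t)0 - (uint64_t)record;

		/* Compare before subtracting instead of testing "< 0". */
		return limit < dec ? 0 : limit - dec;
	}
	return limit + (uint64_t)record;
}
```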

Signed-off-by: Vladimir Saveliev <vladimir.saveliev@oracle.com>
Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Iea02143dae5542f1a9f9cc823a684a18031b8a03
Reviewed-on: http://review.whamcloud.com/1889
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-728 tests: pdsh HOSTLIST test suite support
James Simmons [Thu, 29 Dec 2011 17:10:41 +0000 (12:10 -0500)]
LU-728 tests: pdsh HOSTLIST test suite support

For large systems it becomes tedious to list all the nodes
or devices. The application pdsh uses a HOSTLIST form to
make listing nodes more compact. This patch takes data in
the HOSTLIST format and creates an expanded list for the
shell scripts in the test suite to use. This also gives the
option for the pdsh commands to use the node list in
HOSTLIST format directly. It is also possible to use the
HOSTLIST format for devices as well as nodes.
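
To show what expanding a HOSTLIST means, here is a rough sketch. The
patch itself targets the shell test scripts; demo_expand_hostlist() is
a hypothetical C helper that handles only a single "name[lo-hi]" range,
with no zero padding and no comma-separated lists.

```c
#include <stdio.h>
#include <string.h>

/*
 * Expand "name[lo-hi]" into "namelo ... namehi", separated by spaces.
 * A name without brackets is copied unchanged.
 * Returns 0 on success, -1 on a parse error or if out is too small.
 */
static int demo_expand_hostlist(const char *in, char *out, size_t outsz)
{
	const char *lb = strchr(in, '[');
	size_t used = 0;
	int lo, hi, n;

	if (outsz == 0)
		return -1;
	out[0] = '\0';

	if (lb == NULL) {	/* plain host name, nothing to expand */
		n = snprintf(out, outsz, "%s", in);
		return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
	}

	if (sscanf(lb, "[%d-%d]", &lo, &hi) != 2 || lo < 0 || lo > hi)
		return -1;

	for (int i = lo; i <= hi; i++) {
		/* Prefix is the part of the input before '['. */
		n = snprintf(out + used, outsz - used, "%s%.*s%d",
			     used ? " " : "", (int)(lb - in), in, i);
		if (n < 0 || (size_t)n >= outsz - used)
			return -1;
		used += (size_t)n;
	}
	return 0;
}
```

With this sketch, "oss[1-3]" expands to "oss1 oss2 oss3".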

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ia7a6c95ab4e701a17fc2b80d8443a5ff79da3f3c
Reviewed-on: http://review.whamcloud.com/1462
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-81 deadlock of changelog adding vs. changelog cancelling
Niu Yawei [Thu, 18 Aug 2011 04:22:19 +0000 (21:22 -0700)]
LU-81 deadlock of changelog adding vs. changelog cancelling

This is a workaround for the deadlock of changelog adding vs.
changelog cancelling. Changelog adding always starts a transaction
before acquiring the catlog lock (lgh_lock), whereas changelog
cancelling starts a transaction after taking the catlog lock.

We start the transaction earlier to avoid the above deadlock.

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I9647b9a559f68a27dc0d4b4885857d3cf73b5b8e
Reviewed-on: http://review.whamcloud.com/1260
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-625 test: sanityn.sh test file size should be dynamic
James Simmons [Thu, 29 Dec 2011 16:08:51 +0000 (11:08 -0500)]
LU-625 test: sanityn.sh test file size should be dynamic

For large OSTCOUNT, a small write can't take extent locks on all OSTs.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ic3fb8e88444767db70545292c4b81b9fe9f1f813
Reviewed-on: http://review.whamcloud.com/1901
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-884 clio: client in memory checksum
Jinshan Xiong [Wed, 19 Oct 2011 23:34:26 +0000 (16:34 -0700)]
LU-884 clio: client in memory checksum

Use page_mkwrite() method from latest kernels to correctly implement
RPC checksum functionality. Also OBD_FL_MMAP is removed because it
won't be used any more.

Change-Id: I6ec5aae14f56c95b1ac6936d21b5a273582fa4e8
Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1609
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-571 ldlm: add parallel ast flow control
Jinshan Xiong [Wed, 26 Oct 2011 19:48:17 +0000 (13:48 -0600)]
LU-571 ldlm: add parallel ast flow control

Commit {hash: 8c83e7d75989ef527e43a824a0dbe46bffabd07d} removed the
parallel AST limit on the server. However, if there are too many locks
to revoke, it will have to allocate too many RPCs.

Return to having an upper limit on the number of AST RPCs in flight by
adding a flow control algorithm that allows a configurable upper limit on
the number of RPCs in flight.
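
The idea of a bounded in-flight window can be sketched as below. This
is not the ldlm algorithm: struct demo_window and its helpers are
hypothetical, and completing a request is simulated synchronously
where real code would wait for a reply.

```c
/* Hypothetical bookkeeping for a window of outstanding AST RPCs. */
struct demo_window {
	int max_inflight;	/* configurable upper limit */
	int inflight;		/* RPCs sent but not yet completed */
	int peak;		/* highest inflight value observed */
};

/* Try to send one RPC; returns 1 if sent, 0 if the window is full. */
static int demo_try_send(struct demo_window *w)
{
	if (w->inflight >= w->max_inflight)
		return 0;
	w->inflight++;
	if (w->inflight > w->peak)
		w->peak = w->inflight;
	return 1;
}

/* One outstanding RPC has completed, freeing a slot. */
static void demo_complete(struct demo_window *w)
{
	if (w->inflight > 0)
		w->inflight--;
}

/*
 * Send 'total' RPCs without ever exceeding the window. Returns how
 * many times the sender had to wait for a slot to become free.
 */
static int demo_send_all(struct demo_window *w, int total)
{
	int sent = 0, waits = 0;

	while (sent < total) {
		if (demo_try_send(w)) {
			sent++;
		} else {
			demo_complete(w);	/* simulated wait for a reply */
			waits++;
		}
	}
	while (w->inflight > 0)		/* drain the remaining replies */
		demo_complete(w);
	return waits;
}
```

However many locks need revoking, the number of RPCs allocated at any
one time stays bounded by max_inflight.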

Change-Id: Ifb68991acf7a33119b334447aec50f7717ed546e
Signed-off-by: Jinshan Xiong <jay@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1608
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-635 tests: conf-sanity 27 28 29 30 35 43 failures
James Simmons [Thu, 29 Dec 2011 14:40:36 +0000 (09:40 -0500)]
LU-635 tests: conf-sanity 27 28 29 30 35 43 failures

Several tests in conf-sanity would fail with the error
"This command must be run on the MGS". This was due
to separating the MGS and MDS. I tracked down the fix
to doing a pdsh for several lctl commands to the MGS
instead of the MDS.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I721472b15821e2dbcc636292a8e82c1a1b5e0149
Reviewed-on: http://review.whamcloud.com/1869
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-424 tests: conf-sanity test 55, 56, 58 failure fixes
James Simmons [Thu, 29 Dec 2011 14:46:43 +0000 (09:46 -0500)]
LU-424 tests: conf-sanity test 55, 56, 58 failure fixes

The MGS service was not started in conf-sanity tests 55,
56 and 58 with a separate MGS and MDT configuration. This
patch fixes the issue.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I0c01955b4baf81535959cdbf38bf84e7acb04ddf
Reviewed-on: http://review.whamcloud.com/1870
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoNew tag 2.1.53 2.1.53 v2_1_53_0
Oleg Drokin [Sun, 1 Jan 2012 13:08:38 +0000 (08:08 -0500)]
New tag 2.1.53

Signed-off-by: Oleg Drokin <green@whamcloud.com>
Change-Id: I0d5a2dc7af8e866f59b5d1dcf5b7ccc6e4268567

12 years agoLU-417 llite: report non-zero blocks on writing client
Bobi Jam [Fri, 4 Nov 2011 07:22:41 +0000 (15:22 +0800)]
LU-417 llite: report non-zero blocks on writing client

A writing client may not report an accurate allocated block count when
dirty pages have not been written back to OSTs; some "cp" or "tar"
versions may skip the file because they think it is completely sparse.

This patch makes the writing client consider dirty pages when reporting
allocated blocks, lest the file be treated as a completely sparse one.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I985d50a44ea1e917bf8e1cba3b5cb770eec35c3f
Reviewed-on: http://review.whamcloud.com/1647
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
12 years agoLU-935 quota: break early when b/i_unit_sz exceeded upper limit
Niu Yawei [Mon, 19 Dec 2011 10:18:28 +0000 (02:18 -0800)]
LU-935 quota: break early when b/i_unit_sz exceeded upper limit

While expanding b/i_unit_sz in dquot_create_oqaq(), we should break out
of the loop early when b/i_unit_sz exceeds the upper limit; otherwise,
qaq_b/iunit_sz could overflow and result in an endless loop.
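
A rough sketch of the failure mode and the early break, assuming a
doubling loop. demo_expand_unit() and its parameters are hypothetical
and do not reproduce dquot_create_oqaq(); clamping to the limit is an
assumption made for illustration.

```c
#include <stdint.h>

/*
 * Grow 'unit' by doubling until it reaches 'want', but never beyond
 * 'max_sz'. Without the cap, a large 'want' lets unit wrap around to
 * 0, after which "unit < want" stays true forever.
 */
static uint32_t demo_expand_unit(uint32_t unit, uint32_t want,
				 uint32_t max_sz)
{
	if (unit == 0)		/* doubling zero would never progress */
		return 0;

	while (unit < want) {
		if (unit > max_sz / 2) {
			/* Next doubling would pass the limit: stop here. */
			unit = max_sz;
			break;
		}
		unit *= 2;
	}
	return unit;
}
```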

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0bf069e9259627426d7a87ec42844eaed7a733b4
Reviewed-on: http://review.whamcloud.com/1890
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>