Whamcloud - gitweb
fs/lustre-release.git
11 years agoLU-1600 lnet: peer creation has race with shutdown
Liang Zhen [Thu, 5 Jul 2012 05:10:08 +0000 (13:10 +0800)]
LU-1600 lnet: peer creation has race with shutdown

lnet_nid2peer_locked()->lnet_find_peer_locked() returns NULL while
LNet is shutting down, so the caller goes on to allocate a new peer
and insert it into the peer table. If another thread has already
finalized all of LNet by then, the first thread will crash the
system after the allocation.

The solution is to take an extra refcount on the peer table (the
number of peers) before allocating a new peer, because the shutdown
thread always waits for the number of peers to drop to zero before
going on to the next step of finalization.

This bug was not introduced by the new LNet, but the new LNet
exposes it easily.
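
The fix described above can be modelled in a few lines. This is only a
sketch in Python, not the LNet code: the class, its fields, and the
helper names (PeerTable, npeers, begin_shutdown, drop_peer) are
hypothetical. It shows the one idea from the message: the peer count is
bumped before the lock is dropped for allocation, so shutdown cannot see
a zero count while a creator is still in flight.

```python
import threading


class PeerTable:
    """Toy peer table whose peer count doubles as a shutdown barrier."""

    def __init__(self):
        self.lock = threading.Lock()
        self.peers = {}            # nid -> peer object
        self.npeers = 0            # shutdown waits for this to reach zero
        self.shutting_down = False

    def nid2peer(self, nid):
        with self.lock:
            peer = self.peers.get(nid)
            if peer is not None:
                return peer
            if self.shutting_down:
                return None
            # Reserve a slot BEFORE dropping the lock, so a concurrent
            # shutdown cannot finish while we are allocating.
            self.npeers += 1

        new_peer = {"nid": nid}    # stands in for an allocation that may sleep

        with self.lock:
            if self.shutting_down:
                self.npeers -= 1   # give the reservation back
                return None
            existing = self.peers.get(nid)
            if existing is not None:
                self.npeers -= 1   # another creator won the race
                return existing
            self.peers[nid] = new_peer
            return new_peer

    def begin_shutdown(self):
        with self.lock:
            self.shutting_down = True

    def drop_peer(self, nid):
        with self.lock:
            if self.peers.pop(nid, None) is not None:
                self.npeers -= 1

    def can_finalize(self):
        with self.lock:
            return self.shutting_down and self.npeers == 0
```

In this model the shutdown path would poll can_finalize() and only tear
the table down once it returns True.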

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5c8d26f08ce56092bee1b4bae5111fdfe1e9c12b
Reviewed-on: http://review.whamcloud.com/3274
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoNew tag 2.2.60 2.2.60 v2_2_60 v2_2_60_0
Oleg Drokin [Fri, 6 Jul 2012 16:23:58 +0000 (12:23 -0400)]
New tag 2.2.60

Change-Id: Ib962d669a8f3d0cbeeb92b430a0f86eb0d1b6283
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 ptlrpc: post rqbd with flag LNET_INS_LOCAL
Liang Zhen [Wed, 4 Jul 2012 04:21:38 +0000 (12:21 +0800)]
LU-56 ptlrpc: post rqbd with flag LNET_INS_LOCAL

LNet has a new flag, LNET_INS_LOCAL, which CPT affinity threads can
use when posting buffers
(commit 279bbc81e03dc74d273ec12b4d9e703ca94404c4).
A buffer posted with this flag is attached to the local partition
only, so LND threads can find/match it by taking just the local
partition lock, which is good for performance.
This patch applies the flag to the ptlrpc service.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I75cee15f125b033642195a71921dbc6ad4db5dfd
Reviewed-on: http://review.whamcloud.com/3268
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-957 lfsck: user space tools for LFSCK/scrub
Fan Yong [Wed, 4 Jul 2012 11:54:00 +0000 (19:54 +0800)]
LU-957 lfsck: user space tools for LFSCK/scrub

Control LFSCK/scrub by lctl commands:

1) lfsck_start: start LFSCK/scrub with specified parameters.

2) lfsck_stop: stop LFSCK/scrub on the specified MDT device.

3) The LFSCK/scrub status can be obtained through some special
lproc interface. For example: check OI scrub status by:
lctl get_param -n osd-ldiskfs.*.oi_scrub

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I5828c18453c92162fa0dc211324b69d15ecd9fbc
Reviewed-on: http://review.whamcloud.com/3170
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1581 mount: do not pass osd= option
Alex Zhuravlev [Thu, 5 Jul 2012 19:40:20 +0000 (23:40 +0400)]
LU-1581 mount: do not pass osd= option

until it's supported by the kernel part

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia3868e1c2dffb6a0ae8bbe4f8b38018ffcdcc6f4
Reviewed-on: http://review.whamcloud.com/3285
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
11 years agoLU-521 lnet: make LST support variable page size
Liang Zhen [Mon, 5 Sep 2011 15:12:32 +0000 (23:12 +0800)]
LU-521 lnet: make LST support variable page size

LNet selftest can't support variable page sizes because it sends
the number of pages over the wire.
Fixing this requires a wire format change, which raises a
compatibility issue, so this patch also implements "session features"
for LST to resolve it: only new-version LST nodes understand the new
bulk RPC format, but new LST nodes can still communicate with old
LST nodes.

The variable page size feature can be turned on by setting
LST_FEATURES to LST_FEAT_BULK_LEN (1). This feature is off by default.
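
To illustrate why a page count on the wire breaks between nodes with
different page sizes, here is a small Python model. It is a sketch
only: the request fields ("npages", "len"), the function names and
LST_FEAT_NONE are hypothetical; only LST_FEAT_BULK_LEN and its value
(1) come from the message above.

```python
LST_FEAT_NONE = 0        # hypothetical name for "no features negotiated"
LST_FEAT_BULK_LEN = 1    # value given in the commit message


def bulk_request(nbytes, local_page_size, session_features):
    """Build a toy bulk RPC descriptor for nbytes of data."""
    if session_features & LST_FEAT_BULK_LEN:
        # New format: send the byte length, independent of page size.
        return {"len": nbytes}
    # Old format: send a page count computed with the SENDER's page size.
    npages = (nbytes + local_page_size - 1) // local_page_size
    return {"npages": npages}


def bulk_bytes_expected(request, local_page_size):
    """How many bytes the RECEIVER thinks the bulk transfer carries."""
    if "len" in request:
        return request["len"]
    return request["npages"] * local_page_size
```

With a 64 KiB-page sender and a 4 KiB-page receiver, the old format
makes the receiver expect far fewer bytes than were sent, while the
length-based format agrees on both sides.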

Please see the Jira ticket (LU-521) for more details.

Change-Id: I4a552a3310cf0ed0a2f5ae29eaf789469e1c245a
Signed-off-by: Liang Zhen <liang@whamcloud.com>
Signed-off-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/1338
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1239 ldlm: cascading client reconnects
Vitaly Fertman [Tue, 26 Jun 2012 11:37:47 +0000 (15:37 +0400)]
LU-1239 ldlm: cascading client reconnects

It may happen that
- the MDS is overloaded with enqueues, which consume all the threads
  on the MDS_REQUEST portal while waiting for a lock that a client holds;
- that client tries to reconnect, but the MDS is out of threads and
  the reconnection fails;
- other clients are waiting for their enqueue completions and ping
  the MDS to check that it is still alive, but although the ping is an
  HP-rpc, no thread is reserved for it. Thus, the other clients time
  out as well.

Ensure that each service which handles HP-rpcs has an extra thread
reserved for them; make MDS_CONNECT and OST_CONNECT HP-rpcs.
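
The reserved-thread rule can be sketched as a simple admission check.
This is a Python model under assumptions, not the ptlrpc code: the
Service class and its methods are hypothetical, and it assumes exactly
one thread is held back for high-priority (HP) requests.

```python
class Service:
    """Toy thread pool that keeps one thread back for HP requests."""

    def __init__(self, nthreads):
        if nthreads < 2:
            raise ValueError("need at least one normal and one reserved thread")
        self.nthreads = nthreads
        self.busy = 0

    def try_start(self, high_priority):
        # Normal requests may never occupy the last thread.
        limit = self.nthreads if high_priority else self.nthreads - 1
        if self.busy >= limit:
            return False
        self.busy += 1
        return True

    def finish(self):
        assert self.busy > 0
        self.busy -= 1
```

Even when normal enqueues have saturated the pool, a connect or ping
marked high-priority still finds a thread in this model.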

Change-Id: I295aec6a2d2fb614d4b5f037068a3dcdda1a8b09
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Andrew Perepechko <Andrew_Perepechko@xyratex.com>
Xyratex-bug-id: MRP-455
Reviewed-on: http://review.whamcloud.com/2355
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1422 lnet: eliminate obsolete Cray Catamount support
James Simmons [Mon, 2 Jul 2012 17:18:17 +0000 (13:18 -0400)]
LU-1422 lnet: eliminate obsolete Cray Catamount support

Remove the bulk of code for the no longer supported Catamount
platform on Cray. This was conditionally compiled under CRAY_XT3.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I011058fb0bc74aaf01ec34ea6385e54bdee2356f
Reviewed-on: http://review.whamcloud.com/3064
Reviewed-by: Cory Spitz <spitzcor@cray.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1576 llite: correct page usage count
Bobi Jam [Mon, 2 Jul 2012 08:56:07 +0000 (16:56 +0800)]
LU-1576 llite: correct page usage count

If kernel has add_to_page_cache_lru(), the ll_pagevec_add() is defined
as an empty function, while page_cache_get(page) only makes sense if
ll_pagevec_add() is defined.

This patch moves page_cache_get into ll_pagevec_add() macro
definition.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Iad98aacff43beec3e7a64fd1a778f549250aa5b8
Reviewed-on: http://review.whamcloud.com/3255
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 ptlrpc: CPT affinity ptlrpc RS handlers
Liang Zhen [Tue, 19 Jun 2012 07:56:45 +0000 (15:56 +0800)]
LU-56 ptlrpc: CPT affinity ptlrpc RS handlers

This patch covers a couple of things:
- reimplement the RS handler using the CPU partition APIs
- instead of always choosing an RS handler thread round-robin, choose
  an RS handler thread directly on the partition of rs::rs_svcpt
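
A rough Python model of the second point: the reply state carries the
partition it belongs to, and the handler thread is picked from that
partition only. The class, the thread names, and the choice to rotate
among threads within a partition are assumptions made for the sketch;
the message does not say how a thread is picked inside the partition.

```python
import itertools


class ReplyHandlers:
    """Toy dispatcher: one group of handler threads per CPU partition."""

    def __init__(self, threads_per_cpt):
        # threads_per_cpt[i] is the list of handler threads bound to CPT i.
        self.rotors = [itertools.cycle(threads) for threads in threads_per_cpt]

    def pick(self, rs_cpt):
        # Only threads of the reply state's own partition are candidates.
        return next(self.rotors[rs_cpt])
```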

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5fdebb116630d073d41b39fc4271c4cebb429965
Reviewed-on: http://review.whamcloud.com/3135
Reviewed-by: wangdi <di.wang@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1581 ofd: revert learn osd to be used from lmi
Johann Lombardi [Thu, 5 Jul 2012 11:16:18 +0000 (07:16 -0400)]
LU-1581 ofd: revert learn osd to be used from lmi

This reverts commit 92ff6ccdfb8b069d36840db97820d3ebe44dfd5b
This patch has broken builds:
ofd_dev.c>:106: error: 'struct lustre_sb_info' has no member
named 'lsi_osd_type'

Change-Id: Id3aa27c513575689c3b33754639b64f716d651f0
Reviewed-on: http://review.whamcloud.com/3279
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Johann Lombardi <johann@whamcloud.com>
11 years agoLU-1581 ofd: learn osd to be used from lmi
Alex Zhuravlev [Wed, 27 Jun 2012 12:10:19 +0000 (16:10 +0400)]
LU-1581 ofd: learn osd to be used from lmi

so that OSS can work with different type of backends (ldiskfs, zfs)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ia630d010b77a8d7ca74d67122d77a3ceb98e95e1
Reviewed-on: http://review.whamcloud.com/3236
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: mount to recognize backend type
Alex Zhuravlev [Wed, 27 Jun 2012 05:35:50 +0000 (09:35 +0400)]
LU-1581 utils: mount to recognize backend type

using mount utils introduced earlier.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I18ed4d1840f0b9577b54bc9aa0e7d34e56287ed6
Reviewed-on: http://review.whamcloud.com/3233
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: mt_type macro
Alex Zhuravlev [Wed, 27 Jun 2012 05:33:18 +0000 (09:33 +0400)]
LU-1581 utils: mt_type macro

to convert ldd_mount_type into string

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I7ea21958584384b68ccde546c1472eb3053f6e22
Reviewed-on: http://review.whamcloud.com/3231
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: struct mount_opts
Alex Zhuravlev [Tue, 26 Jun 2012 17:37:03 +0000 (21:37 +0400)]
LU-1581 utils: struct mount_opts

a structure to pass mount options all around

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ide67df554c49752bb06c1f0ed81531925332808a
Reviewed-on: http://review.whamcloud.com/3230
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: mkfs's usage to show zfs support
Alex Zhuravlev [Mon, 18 Jun 2012 06:52:27 +0000 (10:52 +0400)]
LU-1581 utils: mkfs's usage to show zfs support

now everyone is aware of zfs support

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie03aa55360e61b7db1e45a27ea9f0c040827c4d2
Reviewed-on: http://review.whamcloud.com/3229
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1445 fld: Checking lsr_flage after gotten from the cache.
wangdi [Mon, 11 Jun 2012 21:32:59 +0000 (14:32 -0700)]
LU-1445 fld: Checking lsr_flage after gotten from the cache.

Check lsr_flags after getting the FLD entry from the local
cache to make sure that the correct entry was found.

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: I42c01bdb6521b69d6ce6b79e2b5eeec512d0d657
Reviewed-on: http://review.whamcloud.com/3164
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years agoLU-988 clio: use OSC object's m/a/ctime when build write RPC
Bobi Jam [Wed, 27 Jun 2012 13:29:05 +0000 (21:29 +0800)]
LU-988 clio: use OSC object's m/a/ctime when build write RPC

When building OST write RPC, inode's m/a/ctime are out-dated until
lov_merge_lvb() is called, so we need OSC object's m/a/ctime to set
the OST object.

Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ie4466ad646fd87c36577ea2aad5fa19ad97c5be7
Reviewed-on: http://review.whamcloud.com/3200
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1581 utils: introduce osd_init() wrapper
Alex Zhuravlev [Sun, 17 Jun 2012 07:33:00 +0000 (11:33 +0400)]
LU-1581 utils: introduce osd_init() wrapper

to load and initialize libraries/modules for different backends

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Iae4ac1bbf87093731686f9362d7f6a71cc704030
Reviewed-on: http://review.whamcloud.com/3226
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: unknown fstype
Alex Zhuravlev [Sun, 17 Jun 2012 06:18:15 +0000 (10:18 +0400)]
LU-1581 utils: unknown fstype

scream and exit if specific fstype is not recognized

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib5e897bdfe9609be81d74a41fd65ec57b6cf5d37
Reviewed-on: http://review.whamcloud.com/3225
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: support for multiple underlying devices
Alex Zhuravlev [Thu, 7 Jun 2012 07:22:22 +0000 (11:22 +0400)]
LU-1581 utils: support for multiple underlying devices

will be used by zfs backend

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I90c3630d993811d702f81c838aae7344775cc115
Reviewed-on: http://review.whamcloud.com/3224
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: osd_write_ldd() wrapper
Alex Zhuravlev [Tue, 5 Jun 2012 08:54:41 +0000 (12:54 +0400)]
LU-1581 utils: osd_write_ldd() wrapper

used to write mountdata to backend

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I8cb0275372f00f12f54461ce9afb243283aff4f3
Reviewed-on: http://review.whamcloud.com/3223
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-445 lnet: Make previous change backwards compatible
Doug Oucharek [Wed, 27 Jun 2012 00:16:39 +0000 (17:16 -0700)]
LU-445 lnet: Make previous change backwards compatible

With the previous patch as is, a 2.3 system running against
a 2.2 or 2.1 system will have an invalid timestamp for doing
bandwidth calculations.

This patch checks the value of the timestamp field to
determine if it is a valid value (>= 2.3) or is too small
to be a timestamp (< 2.3). If it is a valid timestamp, it
will be used. If not, the local timestamp will be used.
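
The check described above amounts to a plausibility test on the
received value. A minimal Python sketch follows; the cutoff constant,
its unit (seconds since the epoch) and the function name are
assumptions, since the message does not state the actual threshold.

```python
# Hypothetical cutoff: values below this cannot be a wall-clock timestamp.
# 1_000_000_000 seconds since the epoch is in September 2001.
MIN_PLAUSIBLE_TIMESTAMP = 1_000_000_000


def effective_timestamp(wire_value, local_now):
    """Use the peer's timestamp if it looks real, else the local clock."""
    if wire_value >= MIN_PLAUSIBLE_TIMESTAMP:
        return wire_value      # peer is new enough to send a timestamp
    return local_now           # older peer: the field holds something else
```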

Signed-off-by: Doug Oucharek <doug@whamcloud.com>
Change-Id: If655144de72d3e46ab32acfcfc35ed79f58d00b2
Reviewed-on: http://review.whamcloud.com/3192
Tested-by: Hudson
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: osd_make_lustre() wrapper
Alex Zhuravlev [Tue, 5 Jun 2012 08:50:41 +0000 (12:50 +0400)]
LU-1581 utils: osd_make_lustre() wrapper

a wrapper to prepare backend (basically mkfs for given type)

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I5ba5c7b5905de524bd0089671c753c3a5c44a9ec
Reviewed-on: http://review.whamcloud.com/3222
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: osd_read_ldd() wrapper
Alex Zhuravlev [Tue, 5 Jun 2012 08:40:22 +0000 (12:40 +0400)]
LU-1581 utils: osd_read_ldd() wrapper

used to fetch mountdata from backend

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I785c43ff95ae3e8faecf9231495529e0d2da8246
Reviewed-on: http://review.whamcloud.com/3221
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1339 libcfs: add crc32 pclmulqdq implementation
Alexander.Boyko [Fri, 29 Jun 2012 08:51:56 +0000 (12:51 +0400)]
LU-1339 libcfs: add crc32 pclmulqdq implementation

Use the hardware-provided PCLMULQDQ instruction to accelerate the
CRC32 calculation. This instruction is available on Intel CPUs
starting with Westmere and on AMD CPUs starting with Bulldozer.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyartex.com>
Xyratex-bug-id: MRP-314
Change-Id: Id6c88629f77cc5d389db49b7ee6e7111294c4a14
Reviewed-on: http://review.whamcloud.com/2586
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: introduce osd_is_lustre() wrapper
Alex Zhuravlev [Tue, 5 Jun 2012 07:55:36 +0000 (11:55 +0400)]
LU-1581 utils: introduce osd_is_lustre() wrapper

the purpose of that is to verify whether underlying device
can be used for lustre and identify backend filesystem

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie95cf8b88811be12feb4af67e15ff26ecae9a447
Reviewed-on: http://review.whamcloud.com/3220
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: osd_prepare_lustre
Alex Zhuravlev [Tue, 5 Jun 2012 07:26:38 +0000 (11:26 +0400)]
LU-1581 utils: osd_prepare_lustre

wrapper to prepare backend, load modules, etc

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I9f841889cdb950aa9a6650369d73fdc1f0a57f55
Reviewed-on: http://review.whamcloud.com/3219
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: more common helpers
Alex Zhuravlev [Tue, 5 Jun 2012 05:40:19 +0000 (09:40 +0400)]
LU-1581 utils: more common helpers

check_mtab_entry() and update_mtab_entry() this time

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Idcb4d75c4fe202d41f468085f792519b1c431666
Reviewed-on: http://review.whamcloud.com/3218
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: move common funcs to mount_utils.c
Alex Zhuravlev [Tue, 5 Jun 2012 05:01:01 +0000 (09:01 +0400)]
LU-1581 utils: move common funcs to mount_utils.c

like functions to manipulate loop devices, etc

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I96c6347fbda130b81e7da267547af8a3a1c8b567
Reviewed-on: http://review.whamcloud.com/3217
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 utils: extract ldiskfs specifics from mkfs_lustre.c
Alex Zhuravlev [Mon, 4 Jun 2012 11:44:15 +0000 (15:44 +0400)]
LU-1581 utils: extract ldiskfs specifics from mkfs_lustre.c

into mount_utils_ldiskfs.c

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ib63258604f9f38b37bb232bc48a33ff867ec1ef0
Reviewed-on: http://review.whamcloud.com/3216
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 ptlrpc: partitioned ptlrpc service
Liang Zhen [Sun, 17 Jun 2012 02:57:03 +0000 (10:57 +0800)]
LU-56 ptlrpc: partitioned ptlrpc service

The current ptlrpc service has only a single per-service instance, and
all service threads share the locks and request queue of this
instance. This causes many performance issues, such as heavy lock
contention and data/thread migration between CPUs/NUMA nodes.

This patch creates per-partition data for the ptlrpc service: each
service has locks/request queues on each partition, and the service
also has CPT (CPU partition) affinity threads on each partition.
Threads bound to a CPT only access data on the local partition, which
should decrease lock contention and data/thread migration and improve
server-side performance.

Another change is that cfs_hash replaces the big array fo_iobuf_pool
in obdfilter; a filter_iobuf can be found in the hash by thread ID.
We made this change because we removed the absolute limit on the
number of ptlrpc threads, which means we have no idea how big
fo_iobuf_pool should be. Also, even with an absolute limit on the
number of threads, it would still be dangerous to use a pre-allocated
array, because it is difficult to guarantee that thread IDs stay
contiguous if we want to shrink the number of threads in the future.
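
The array-to-hash change can be pictured with a short Python model. A
dict stands in for cfs_hash here, and the class, method names and
buffer size are hypothetical; the point is only that lookups keyed by
thread ID need neither a known upper bound nor contiguous IDs.

```python
class IobufPool:
    """Toy per-thread buffer pool keyed by thread ID instead of array index."""

    def __init__(self):
        self._bufs = {}            # thread id -> buffer, grows on demand

    def get(self, thread_id):
        buf = self._bufs.get(thread_id)
        if buf is None:
            buf = bytearray(16)    # placeholder for a real I/O buffer
            self._bufs[thread_id] = buf
        return buf

    def put(self, thread_id):
        # Called when a thread exits; leaves no hole to manage.
        self._bufs.pop(thread_id, None)

    def __len__(self):
        return len(self._bufs)
```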

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5f8dce7bcf389f9f076f5ce2d4685a03f910260b
Reviewed-on: http://review.whamcloud.com/3133
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1581 defs: new param
Alex Zhuravlev [Sun, 17 Jun 2012 09:40:02 +0000 (13:40 +0400)]
LU-1581 defs: new param

identity_upcall param

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie5ed6fea5de4f7dd6ed1f8e2a8333e158d0d147b
Reviewed-on: http://review.whamcloud.com/3215
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1581 defs: zfs string
Alex Zhuravlev [Fri, 25 May 2012 11:56:42 +0000 (15:56 +0400)]
LU-1581 defs: zfs string

a string corresponding to ZFS type

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I491041e22680f051016e5f782843b88580bf841e
Reviewed-on: http://review.whamcloud.com/3214
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1305 lprocfs: osd-ldiskfs to use common helpers
Alex Zhuravlev [Wed, 27 Jun 2012 12:08:25 +0000 (16:08 +0400)]
LU-1305 lprocfs: osd-ldiskfs to use common helpers

introduced with osd-zfs

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I504fa697f6502ddfbac1e0ebc524337a02a74328
Reviewed-on: http://review.whamcloud.com/3198
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 o2iblnd: CPT affinity o2iblnd
Liang Zhen [Thu, 10 May 2012 13:44:51 +0000 (21:44 +0800)]
LU-56 o2iblnd: CPT affinity o2iblnd

This patch covers a few things:
- implement percpt scheduler threads for o2iblnd
- decrease the overall number of threads on fat-core machines
- increase the number of threads only if there is more than one NIC

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ic4b72258f73baabed2e59746639e271cab4467fc
Reviewed-on: http://review.whamcloud.com/2725
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: re-finalize failed ACK or routed message
Liang Zhen [Sun, 1 Jul 2012 14:21:39 +0000 (22:21 +0800)]
LU-56 lnet: re-finalize failed ACK or routed message

lnet_finalize() should restart the finalizing process for a failed ACK
or failed forwarding, because the message could have been committed
for sending and then failed before delivery to the LND (e.g. ENOMEM).
In that case we can't just continue to call lnet_msg_decommit():

- The rule is that a message must decommit for sending first if
  it is committed for both sending and receiving

- The CPT for sending can differ from the CPT for receiving,
  so we should return to lnet_finalize() to make
  sure we are locking the correct partition.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I0b35434762225fcb0dccad7d23bcd63740484e0a
Reviewed-on: http://review.whamcloud.com/3252
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1589 lnet: read peers of /proc could be endless
Liang Zhen [Sun, 1 Jul 2012 07:17:34 +0000 (15:17 +0800)]
LU-1589 lnet: read peers of /proc could be endless

There is a chance that reading /proc/sys/lnet/peers never ends,
because the end condition was not set correctly. This bug was
introduced by commit a07e9d350b3e500c7be877f6dcf54380b86a9cbe of LU-56

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I25f6dad4a926bb7c62a4b1b1d4c3a86c783e3f7a
Reviewed-on: http://review.whamcloud.com/3251
Tested-by: Hudson
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1588 lnet: deadlock while shutting down router
Liang Zhen [Sun, 1 Jul 2012 06:12:24 +0000 (14:12 +0800)]
LU-1588 lnet: deadlock while shutting down router

lnet_prune_rc_data() should release the lock on exit, otherwise
a later attempt to take lnet_net_lock() will deadlock.
Also, a wrong condition check in lnet_prune_rc_data()
can prevent the router checker from shutting down.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5292075453e61f300384043e2346df714c530303
Reviewed-on: http://review.whamcloud.com/3250
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 ksocklnd: CPT affinity socklnd
Liang Zhen [Fri, 11 May 2012 02:51:58 +0000 (10:51 +0800)]
LU-56 ksocklnd: CPT affinity socklnd

This patch covers a few things:
- implement percpt scheduler threads for socklnd
- decrease the overall number of threads on fat-core machines
- create more threads only if there is more than one NIC
- remove the IRQ affinity implementation from socklnd
  IRQ affinity is not very helpful because CPUs in modern computers
  are very powerful. Also, users can still set up IRQ affinity via
  /proc and cpu_pattern of libcfs

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Idfa19037a529fe96cb1432cbd7f55a5dfac89d29
Reviewed-on: http://review.whamcloud.com/2718
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-169 lov: add lsm refcounting
Jinshan Xiong [Tue, 19 Jun 2012 13:09:48 +0000 (21:09 +0800)]
LU-169 lov: add lsm refcounting

This patch adds a reference counter to the lsm structure.
Each time a lsm is used, a reference must be taken with lsm_get/addref
(lsm_addref() can be used when we already own a reference on the lsm).
This reference can be released via obd_free_memmd(). The lsm is freed
when the last reference is dropped.

This patch also moves lov_stripe_md into a private data of
lov_object.
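
The reference-counting rule in the first paragraph can be modelled as
follows. This is a Python sketch, not the lov code: StripeMd and
lsm_put are hypothetical (the commit releases references via
obd_free_memmd()), and "freed" is just a flag standing in for the
actual free.

```python
class StripeMd:
    """Toy stand-in for an lsm with a reference counter."""

    def __init__(self):
        self.refc = 1          # the creator holds the first reference
        self.freed = False


def lsm_addref(lsm):
    # Only legal when the caller already owns a reference.
    assert lsm.refc > 0, "caller must already own a reference"
    lsm.refc += 1
    return lsm


def lsm_put(lsm):
    """Hypothetical release helper: drop one reference, free on the last."""
    assert lsm.refc > 0
    lsm.refc -= 1
    if lsm.refc == 0:
        lsm.freed = True
    return lsm.freed
```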

Signed-off-by: Jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Signed-off-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Change-Id: I156b4cc2dc82bb15ae8107cf9842f100048c03d4
Reviewed-on: http://review.whamcloud.com/1874
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-957 lfsck: LFSCK main engine
Fan Yong [Sun, 1 Jul 2012 02:16:10 +0000 (10:16 +0800)]
LU-957 lfsck: LFSCK main engine

Implement the main engine for Lustre online fsck. The kernel
thread "lfsck" scans the system object table through the low layer DT
iteration APIs, and drives the registered scrub component(s)
to check/repair the Lustre system.

It is the controller for the whole LFSCK, and controls the
speed, including that of the low layer OI scrub (for osd-ldiskfs).

In urgent mode, e.g. when the MDT is restored from a file-level backup
against ldiskfs, the OI files are invalid. In that case
we need to rebuild the OI files ASAP, so the low layer OI scrub
inside osd-ldiskfs ignores the main engine speed control and
runs at full speed.
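
The interaction between speed control and urgent mode can be sketched
as a throttle calculation. This Python function is an illustration
only: its name, its parameters, the unit (objects per second) and the
convention that a non-positive limit means "unlimited" are assumptions,
not the LFSCK interface.

```python
def scrub_delay(objects_done, elapsed, speed_limit, urgent):
    """Seconds to sleep so objects_done / elapsed stays under speed_limit."""
    if urgent or speed_limit <= 0:
        # Urgent mode ignores the engine's speed control entirely.
        return 0.0
    target_elapsed = objects_done / speed_limit
    return max(0.0, target_elapsed - elapsed)
```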

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: If1bd3cfac1f299e964c029e5e9c4cce6432edfa5
Reviewed-on: http://review.whamcloud.com/3169
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-957 scrub: Proc interfaces and tests for OI scrub
Fan Yong [Fri, 29 Jun 2012 11:34:50 +0000 (19:34 +0800)]
LU-957 scrub: Proc interfaces and tests for OI scrub

1) Control/trace OI scrub running.

2) Verify whether the OI scrub basic functions work or not.

3) Test OI scrub performance.

For autotest:
Test-Parameters: testlist=sanity-scrub,scrub-performance

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I5be3d1a521f5f7875f56e9455ff2010016e6a344
Reviewed-on: http://review.whamcloud.com/3168
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoNew tag 2.2.59 2.2.59 v2_2_59 v2_2_59_0
Oleg Drokin [Mon, 2 Jul 2012 16:37:50 +0000 (12:37 -0400)]
New tag 2.2.59

Change-Id: Ia3d5edcc5032e33ce4b90914a6c00f06ae73aa00
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1182 tests: run accounting tests when OFD is used
Johann Lombardi [Tue, 26 Jun 2012 14:46:31 +0000 (16:46 +0200)]
LU-1182 tests: run accounting tests when OFD is used

Skip all quota enforcement tests in sanity-quota.sh and only run
the accounting tests when OFD is used.

This patch also enables the ext4 quota feature when OFD is in use.
Although it is not done in a very efficient way, it still allows
us to run tests.

Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Change-Id: Ie35bba9fb064acd8e878e4c31f1b43e90e164e15
Reviewed-on: http://review.whamcloud.com/3191
Tested-by: Hudson
Reviewed-by: Li Wei <liwei@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1182 osd-zfs: lu_quota.h was renamed to lquota.h
Johann Lombardi [Mon, 2 Jul 2012 07:35:29 +0000 (09:35 +0200)]
LU-1182 osd-zfs: lu_quota.h was renamed to lquota.h

Fix build issue due to lu_quota.h which was renamed to lquota.h.

Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Change-Id: I01f5b159eeae932f1985546b7ba6b6a064e77a01
Reviewed-on: http://review.whamcloud.com/3254
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
11 years agoLU-1182 osd-zfs: add accounting support
Johann Lombardi [Wed, 27 Jun 2012 14:36:34 +0000 (16:36 +0200)]
LU-1182 osd-zfs: add accounting support

Create QSD instance in ZFS OSD so that space accounting data
are exported via procfs.

Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Change-Id: Ie44a8032d437beadcd0cf337f6271247538946c4
Reviewed-on: http://review.whamcloud.com/3201
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
11 years agoLU-56 lnet: wrong assertion for optimized GET
Liang Zhen [Fri, 29 Jun 2012 04:22:00 +0000 (12:22 +0800)]
LU-56 lnet: wrong assertion for optimized GET

The event type of an optimized GET/REPLY will not match the message
type. lnet_msg_decommit_rx() had the correct assertion for optimized
REPLY, but a wrong assertion for optimized GET.
This patch also removes another wrong assertion in lnet_finalize().

Both wrong assertions were introduced by this commit:
75a8f4b4aa9ad6bf697aedece539e62111e9029a

Test-Parameters: no_virtualization=true nettype=o2ib
Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I089c091819c05e27a8b13cf378a60d6fe64cb849
Reviewed-on: http://review.whamcloud.com/3238
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 lnet: tuning wildcard portals rotor
Liang Zhen [Wed, 27 Jun 2012 01:49:48 +0000 (09:49 +0800)]
LU-56 lnet: tuning wildcard portals rotor

By default, PUT messages arriving on wildcard portals always prefer
to match buffers on the current CPT. This patch adds a few more
options for dispatching message matching, called the portals rotor:

- OFF: turn off the message rotor
- ON: round-robin dispatch all PUT messages to different CPTs
- RR_RT: round-robin dispatch routed PUT messages to different CPTs
- HASH_RT: dispatch routed PUT messages to CPTs by hashing source NID

Users can change the portals rotor at runtime through the /proc
interface.
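
The four rotor modes can be sketched roughly as follows. This is an
illustrative model only, not the LNet code: the Rotor and Dispatcher
names, the integer NIDs, and the modulo "hash" are assumptions made for
the example, as is the fallback to the current CPT for non-routed
messages in the RR_RT and HASH_RT modes.

```python
from enum import Enum
from itertools import count


class Rotor(Enum):
    OFF = 0      # always match on the current CPT
    ON = 1       # round-robin every PUT message
    RR_RT = 2    # round-robin routed PUT messages only
    HASH_RT = 3  # pick a CPT from the source NID of routed PUT messages


class Dispatcher:
    """Hypothetical model of choosing a CPT for an incoming PUT."""

    def __init__(self, ncpts, mode=Rotor.OFF):
        self.ncpts = ncpts
        self.mode = mode
        self._cursor = count()  # round-robin position

    def pick_cpt(self, current_cpt, src_nid, routed):
        if self.mode is Rotor.ON or (self.mode is Rotor.RR_RT and routed):
            return next(self._cursor) % self.ncpts
        if self.mode is Rotor.HASH_RT and routed:
            # A plain modulo stands in for the real hash function.
            return src_nid % self.ncpts
        # OFF, or a non-routed message in a routed-only mode.
        return current_cpt
```

With HASH_RT, messages from the same source NID always land on the same
CPT, while the round-robin modes spread load regardless of the sender.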

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ie2f03255d9b00b005ca5365315fff350d45a6933
Reviewed-on: http://review.whamcloud.com/3193
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1182 ofd: add space accounting support
Johann Lombardi [Mon, 25 Jun 2012 23:57:54 +0000 (01:57 +0200)]
LU-1182 ofd: add space accounting support

Add space accounting support to OFD. This is done by calling into the
lquota library which looks up space usage via the accounting objects.
Report success on quotacheck, quotaon & quotaoff for the time being,
otherwise space accounting can't be tested with current MDT stack
which requires quotacheck to be run for space accounting to be
enabled.

Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Change-Id: I442f3d1bcc9e7c95bb3f505f7a00665419d0f517
Reviewed-on: http://review.whamcloud.com/3186
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 lnet: SMP improvements for LNet selftest
Liang Zhen [Wed, 16 May 2012 06:53:43 +0000 (14:53 +0800)]
LU-56 lnet: SMP improvements for LNet selftest

LNet selftest uses a global pool of WI threads to handle all RPCs.
This performs poorly on machines with many cores because all threads
contend on global queues protected by a single spinlock.
This patch fixes this by creating a WI scheduler for each CPT. RPCs
are dispatched to WI schedulers on different CPTs, and there is no
contention between threads in different WI schedulers.

Another major change in this patch is the creation of percpt data for
LST services. In the current implementation each service has a global
data structure to store the buffer list, RPC list, etc., and all
operations on them are protected by a per-service lock. Again, this
could be a serious performance issue if the service is busy enough.
Percpt service data resolves this issue because threads running in
one CPT only need to take the local lock and access local data.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I8035faf2e87d8e424a8c2fac903bf3b241668e00
Reviewed-on: http://review.whamcloud.com/2805
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1182 ldiskfs-osd: space accounting support
Mikhail Pershin [Tue, 19 Jun 2012 19:49:35 +0000 (23:49 +0400)]
LU-1182 ldiskfs-osd: space accounting support

Add space accounting support to ldiskfs OSD.

This patch also sets initial attributes in do_create().
mdd_attr_set_internal() from mdd_object_initialize() is kept until
EDQUOT is returned in lquota itself.
Attributes of new inodes are now initialized in osd_object_create().
All LA_MODE bits are now passed to ldiskfs_create_inode().
(original patch from LiWei, see ORI-46)

Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Change-Id: I77a621c76343c2633810bb3cef9859ee30b7b23a
Reviewed-on: http://review.whamcloud.com/3160
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1201 checksum: add libcfs crypto hash
Alexander.Boyko [Thu, 28 Jun 2012 06:04:06 +0000 (10:04 +0400)]
LU-1201 checksum: add libcfs crypto hash

Add libcfs crypto hash and clean up all Lustre hash checksumming.
Lustre hash calculations are now based on the Linux kernel crypto API
in the kernel, and on the libcfs crypto implementation at user level.
Any improvement to hashes in the Linux kernel therefore also improves
Lustre.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Xyratex-bug-id: MRP-337, MRP-471
Change-Id: I2fbf0f4d0c8ce7e7c3c7ea411c6ccd9dcfc7e03a
Reviewed-on: http://review.whamcloud.com/3098
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1305 build: support for osd-zfs
Alex Zhuravlev [Wed, 23 May 2012 08:17:44 +0000 (12:17 +0400)]
LU-1305 build: support for osd-zfs

changes to enable building of spl, zfs and osd-zfs.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I01dcc211b6943b3ed19eaf8756397a97095bbc59
Reviewed-on: http://review.whamcloud.com/2885
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1305 osd: Makefiles for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 11:26:11 +0000 (15:26 +0400)]
LU-1305 osd: Makefiles for osd-zfs

Makefiles ...

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I1fdf9eecf98306b0cfe0bbc0611df3a977f48b10
Reviewed-on: http://review.whamcloud.com/2972
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years agoLU-1577 lnet: should export lnet_net2ni
Liang Zhen [Thu, 28 Jun 2012 01:52:38 +0000 (09:52 +0800)]
LU-1577 lnet: should export lnet_net2ni

We don't want to export lnet_net2ni_locked() because it requires an
extra parameter, the CPT number, which is internal to LNet and
confusing to users. No LND uses it or should use it. However, LNDs
do need lnet_net2ni(), an inline function that refers to
lnet_net2ni_locked(), which forces us to export lnet_net2ni_locked().
The fix is simple: make lnet_net2ni() a non-inline function and
export it instead of exporting lnet_net2ni_locked().

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I251d42d47770211ae6238ee56972fb9aacd63599
Reviewed-on: http://review.whamcloud.com/3207
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1415 tests: Basic support for ZFS-based servers
Li Wei [Tue, 12 Jun 2012 13:54:47 +0000 (21:54 +0800)]
LU-1415 tests: Basic support for ZFS-based servers

This patch extends Test Framework to support formatting and mounting
ZFS-based servers.

The current global FSTYPE is no longer flexible enough, as different
facets can use different types of back ends.  Such "mixed-fstype"
configurations result firstly from the incremental landing of
Orion changes.  For example, when we have OFD and ZFS OSD, although
OSTs will be able to use ZFS, MGS and MDTs can still only use LDiskFS.
Secondly, to reduce the test matrix, we may also want to have both
OSTs using LDiskFS and OSTs using ZFS in the same test cluster.
Hence, this patch makes back end file system type a per-facet
attribute.  The flexible configuration variables in local.sh are
reserved for users; all other scripts shall use facet_fstype()
instead.

The major differences between using an LDiskFS-based target and a
ZFS-based one are on the mkfs.lustre(8) command line:

  - ZFS-based targets shall have "--backfstype=zfs", while
    LDiskFS-based ones shall have "--backfstype=ldiskfs".

  - LDiskFS-specific "--mkfsoptions" arguments shall not be given to
    ZFS-based targets, and vice versa.

  - ZFS-based targets have different device specifications.  See
    mkfs.lustre(8).

In addition, we will make "--index" mandatory.  Naturally, format
options have to be generated for each facet dynamically, based on its
back end type, index, etc.  This patch changes mkfs_opts() to do
exactly that.  All test scripts shall use mkfs_opts() instead of
reading related environment variables directly.
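
A rough model of per-facet option generation is shown below. It is a
sketch under stated assumptions, not the Test Framework's mkfs_opts():
the function name facet_mkfs_opts, its arguments, and the placeholder
option strings are hypothetical. Only the "--backfstype", "--index" and
"--mkfsoptions" flags come from the description above.

```python
def facet_mkfs_opts(fstype, index, mkfsoptions=None):
    """Build a hypothetical mkfs.lustre option list for one facet.

    mkfsoptions maps a back-end type to its own extra arguments, so
    LDiskFS-specific options never reach a ZFS target and vice versa.
    """
    if fstype not in ("ldiskfs", "zfs"):
        raise ValueError(f"unknown back end: {fstype}")
    # The back-end type and the (now mandatory) index are always given.
    opts = [f"--backfstype={fstype}", f"--index={index}"]
    extra = (mkfsoptions or {}).get(fstype)
    if extra:
        opts.append(f"--mkfsoptions={extra}")
    return opts


# Placeholder value; real LDiskFS options would go here.
extra = {"ldiskfs": "LDISKFS_OPTS"}
print(facet_mkfs_opts("zfs", 0, extra))
print(facet_mkfs_opts("ldiskfs", 1, extra))
```

Keying the extra options by back-end type is one simple way to keep the
two kinds of target from receiving each other's arguments.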

Similarly, mount options should also be generated on a per-facet, or
at least per-fstype, basis.  Nevertheless, this patch takes a shortcut
by keeping the current per-facet-type {MGS,MDS,OST}_MOUNT_OPTS
variables and generating the "loop" mount option dynamically in
mount_facet().  I believe this solution introduces fewer changes
compared to a pure per-facet one and is sufficient for current use
cases.

This patch is based on Brian Behlendorf's work under ORI-155.  See
http://review.whamcloud.com/1417.

Change-Id: Ifcce9b10179dd1b4992a30d10df13ea10bc34548
Whamcloud-bug-id: ORI-155
Test-Parameters: testgroup=full
Test-Parameters: testgroup=full envdefinitions=USE_OFD=yes,LOAD_MODULES_REMOTE=true
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/2907
Tested-by: Hudson
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1305 osd: dmu helpers
Alex Zhuravlev [Tue, 29 May 2012 10:56:48 +0000 (14:56 +0400)]
LU-1305 osd: dmu helpers

A few simple functions to wrap native DMU functions,
to be replaced with direct calls at some point.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ibf9331b5668b1e7db22670caf395391bca002061
Reviewed-on: http://review.whamcloud.com/2971
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
11 years agoLU-1561 tests: Fix ALWAYS_EXCEPT in mds-survey.sh
Li Wei [Mon, 25 Jun 2012 13:24:53 +0000 (21:24 +0800)]
LU-1561 tests: Fix ALWAYS_EXCEPT in mds-survey.sh

Modifying the ALWAYS_EXCEPT assignment in mds-survey.sh does not take
effect, because the test filters have already been built.  This patch
moves the build_test_filter() call below the ALWAYS_EXCEPT assignment.
Also moved downward is the check_and_setup_lustre() call, which
usually goes after environment checks.

Test-Parameters: testlist=mds-survey
Change-Id: I547310e4fccfaaa4fbe7010e14baefa2ab9bbdd4
Signed-off-by: Li Wei <liwei@whamcloud.com>
Reviewed-on: http://review.whamcloud.com/3183
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1528 test: run_one_logged should return error when test fail
Minh Diep [Fri, 15 Jun 2012 11:14:22 +0000 (04:14 -0700)]
LU-1528 test: run_one_logged should return error when test fail

run_one_logged needs to return an error code even when
FAIL_ON_ERROR=false.

Signed-off-by: Minh Diep <mdiep@whamcloud.com>
Change-Id: I1ac59ddfb58b25e7292aacc8c5a31fc218371520
Reviewed-on: http://review.whamcloud.com/3112
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Wei Liu <sarah@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1448 llite: Prevent NULL pointer dereference on disabled OSC
Jeremy Filizetti [Thu, 31 May 2012 12:26:28 +0000 (08:26 -0400)]
LU-1448 llite: Prevent NULL pointer dereference on disabled OSC

When a file system is mounted with a disabled OSC, reading the import
information from the proc file system can result in a NULL pointer
dereference. The Lustre import on a disabled OSC will remain
in the LUSTRE_IMP_NEW state and imp_connection will remain NULL.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: Ia6df51a36efbcd5a7fc7668bb23455b253ae4855
Reviewed-on: http://review.whamcloud.com/2995
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1479 build: enable srp initiator module with lbuild
Shuichi Ihara [Wed, 6 Jun 2012 15:08:20 +0000 (00:08 +0900)]
LU-1479 build: enable srp initiator module with lbuild

kernel-ib doesn't include the ib_srp module if the RPMs are built by
lbuild. The module is required for InfiniBand-based storage.

Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: I16e940592f93abc9150f391316b8f55eb60c663a
Reviewed-on: http://review.whamcloud.com/3049
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Minh Diep <mdiep@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 ldlm: SMP improvement for ldlm_lock_cancel
Liang Zhen [Tue, 19 Jun 2012 14:43:31 +0000 (22:43 +0800)]
LU-56 ldlm: SMP improvement for ldlm_lock_cancel

ldlm_del_waiting_lock() is always called twice in ldlm_lock_cancel(),
even if the ldlm_lock is not on the expiring waiter list. This is bad
for performance because it needs to grab a global lock and disable
softirq.
This patch adds a new bit, l_waited, to ldlm_lock. It is set only if
ldlm_add_waiting_lock() has ever been called, and ldlm_lock_cancel()
can bypass ldlm_del_waiting_lock() when the bit is not set.
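
The idea of skipping a global lock with a per-object "was ever queued"
bit can be modelled as below. This is a simplified sketch, not the ldlm
code: WaitingList, LdlmLock and lock_cancel are hypothetical stand-ins,
and guard_acquisitions exists only to make the saving visible.

```python
import threading


class WaitingList:
    """Stand-in for the global expiring-waiter list and its lock."""

    def __init__(self):
        self._guard = threading.Lock()
        self._locks = set()
        self.guard_acquisitions = 0  # counts trips through the global lock

    def add(self, lock):
        with self._guard:
            self.guard_acquisitions += 1
            self._locks.add(lock)
        lock.l_waited = True  # remember that this lock was ever queued

    def remove(self, lock):
        with self._guard:
            self.guard_acquisitions += 1
            self._locks.discard(lock)


class LdlmLock:
    def __init__(self):
        self.l_waited = False


def lock_cancel(waiting, lock):
    # Skip the global lock entirely for locks that were never queued.
    if lock.l_waited:
        waiting.remove(lock)
```

Cancelling a lock that never waited touches no shared state, which is
the case the patch description identifies as the common one to optimize.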

I also adopted patch(32562) on BZ 20509 because it's necessary for
this improvement.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ie48d527b8da187a88646aa070be3aa4beb304b1d
Reviewed-on: http://review.whamcloud.com/3141
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1203 mdt: recognize old rootsquash and nosquash_nid params
Yu Jian [Tue, 8 May 2012 10:12:14 +0000 (18:12 +0800)]
LU-1203 mdt: recognize old rootsquash and nosquash_nid params

Change mdt_process_config() to make it capable of recognizing
old "mdt.rootsquash" and "mdt.nosquash_nid" parameters.

The new parameters are "mdt.root_squash" and "mdt.nosquash_nids".
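
One way to picture the compatibility handling is a small alias table
that maps each old name to its new one. This is only an illustration;
how mdt_process_config() actually recognizes the old names is not
described here, and LEGACY_PARAMS and canonical_param are hypothetical.

```python
# Old parameter name -> new parameter name, as listed above.
LEGACY_PARAMS = {
    "mdt.rootsquash": "mdt.root_squash",
    "mdt.nosquash_nid": "mdt.nosquash_nids",
}


def canonical_param(name):
    """Return the new-style name for a parameter, old or new."""
    return LEGACY_PARAMS.get(name, name)


print(canonical_param("mdt.rootsquash"))
print(canonical_param("mdt.root_squash"))
```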

Signed-off-by: Yu Jian <yujian@whamcloud.com>
Change-Id: I9fc71e2a997a3b0835a3adad2ccfde6d2c64ca56
Reviewed-on: http://review.whamcloud.com/2680
Reviewed-by: Liang Zhen <liang@whamcloud.com>
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1095 debug: no console message for long symlink
Andreas Dilger [Mon, 4 Jun 2012 22:03:28 +0000 (16:03 -0600)]
LU-1095 debug: no console message for long symlink

Some minor fixups from running local sanity.sh tests:
- don't print a console message for long-but-legal symlinks,
  which could be hit during normal usage as well

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I24e3e8ea9d8db0ca9c59597f3170f2b18359500c
Reviewed-on: http://review.whamcloud.com/3130
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Reviewed-by: Li Wei <liwei@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1305 osd: osd-zfs to use correct objset name
Alex Zhuravlev [Wed, 27 Jun 2012 12:11:08 +0000 (16:11 +0400)]
LU-1305 osd: osd-zfs to use correct objset name

Before the utils were fixed, we had to add '/' to the objset name.
This is no longer required.

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ice60fa6c30ede531965ec0032f7134c6c412e4c2
Reviewed-on: http://review.whamcloud.com/3199
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 ptlrpc: Reduce at_lock dance
Liang Zhen [Fri, 25 May 2012 09:30:25 +0000 (17:30 +0800)]
LU-56 ptlrpc: Reduce at_lock dance

Some lock dances on the service at_lock are unnecessary. This patch
cleans them up to reduce useless lock/unlock operations.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I004729692254d1200a8b18c0c4494ff437233caf
Reviewed-on: http://review.whamcloud.com/2911
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1182 quota: quota accounting library
Mikhail Pershin [Tue, 19 Jun 2012 04:28:53 +0000 (08:28 +0400)]
LU-1182 quota: quota accounting library

Library functions for quota accounting.
Introduce basic support for Quota Slave Driver (aka QSD) which just
registers procfs file to dump the accounting data for the time being.

Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Change-Id: I35495062b82f43f2bc26c263e08a656f9cbd9c2b
Reviewed-on: http://review.whamcloud.com/3147
Tested-by: Hudson
Reviewed-by: Niu Yawei <niu@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
11 years agoLU-1305 osd: xattr support for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 10:54:10 +0000 (14:54 +0400)]
LU-1305 osd: xattr support for osd-zfs

xattr set/get/list, including SA

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ifae82853911a1891ad4b3e6e127e29f6d83bba7b
Reviewed-on: http://review.whamcloud.com/2969
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
11 years agoLU-1396 osc: control the RPC rate between MDS and OST
wangdi [Fri, 22 Jun 2012 14:08:35 +0000 (07:08 -0700)]
LU-1396 osc: control the RPC rate between MDS and OST

1. Limit the RPC rate by allowing a maximum of 50 RPCs in flight
between MDS and OST.

2. Add a specific flag in oa to tell whether the destroy comes from
an echo_md destroy or from an MDT close/unlink orphan, where we cannot
throttle the destroy RPC, since it might block the ptlrpcd thread.
See Bug 16006.

Signed-off-by: Wang Di <di.wang@whamcloud.com>
Change-Id: Ie3b555eecd2716a82b058f645d002f46d8c4dd36
Reviewed-on: http://review.whamcloud.com/2899
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1305 osd: body operations for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 10:53:35 +0000 (14:53 +0400)]
LU-1305 osd: body operations for osd-zfs

read/write, get/put bufs, punch

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: Ie2ccd127b52b00904ec0ce5ba143466ca0328f95
Reviewed-on: http://review.whamcloud.com/2968
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
11 years agoLU-1305 osd: index operations for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 10:52:49 +0000 (14:52 +0400)]
LU-1305 osd: index operations for osd-zfs

index insert/delete/lookup, iterator support

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I46819a5dd4da601716e516fb54942aa611497aef
Reviewed-on: http://review.whamcloud.com/2967
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
11 years agoLU-56 libcfs: CPT affinity workitem scheduler
Liang Zhen [Mon, 14 May 2012 07:27:41 +0000 (15:27 +0800)]
LU-56 libcfs: CPT affinity workitem scheduler

This patch covers multiple changes:
- flexible APIs for creating WI schedulers
  a) theoretically, users can create any number of WI schedulers, and
     each scheduler can have its own thread pool
  b) users can create CPT affinity WI schedulers for each CPT; this
     is reserved for LNet selftest.
- rehashing and LNet selftest no longer share WI schedulers
- libcfs only starts a WI scheduler with a small number of threads
  for cfs_hash rehashing
- LNet selftest creates its own schedulers when the module starts,
  and destroys them when the module shuts down

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Idf66a83817fe847ed29e052e0ddc2a4fed498f1a
Reviewed-on: http://review.whamcloud.com/2729
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 obdclass: SMP improvement for lu_key
Liang Zhen [Thu, 17 May 2012 06:47:15 +0000 (14:47 +0800)]
LU-56 obdclass: SMP improvement for lu_key

ptlrpc service threads need to call lu_context_init/fini in each
loop (for each RPC). This could be a big performance issue on
fat SMP machines if we always add the lu_context to the remember list,
because operations on lu_context with LCT_REMEMBER are serialized
by one spinlock and we need to lock/unlock it multiple times
for each RPC.

But we found that LCT_REMEMBER is abused here. It is impossible for
the server stack to be unloaded while service threads are still
running, so we can simply remove the LCT_REMEMBER flag from
->rq_session, and make some small changes to bypass the global lock in
lu_context_init and lu_context_fini when LCT_REMEMBER is not set.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I5875a90365a103707526483047ec7628f6964a56
Reviewed-on: http://review.whamcloud.com/2824
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: multiple cleanups for inspection
Liang Zhen [Sun, 24 Jun 2012 14:52:19 +0000 (22:52 +0800)]
LU-56 lnet: multiple cleanups for inspection

This patch covers multiple cleanups for previous patches:
- use "features" to replace "versions" of router checker ping
- code cleanup for lnet_ni_alloc
- fix a loading issue on 32-bit system
- comments cleanup and some small changes
- coding style cleanup

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I96e5fa260d93082851c5146883df1a5b8a96ef42
Reviewed-on: http://review.whamcloud.com/3180
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agonew tag 2.2.58 2.2.58 v2_2_58 v2_2_58_0
Oleg Drokin [Wed, 27 Jun 2012 16:58:57 +0000 (12:58 -0400)]
new tag 2.2.58

Change-Id: I22a3a84ce6e2ed4260c3a8652131afc43df22e46
Signed-off-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: allow user to bind NI on CPTs
Liang Zhen [Fri, 15 Jun 2012 15:56:44 +0000 (23:56 +0800)]
LU-56 lnet: allow user to bind NI on CPTs

By default, an NI is bound to all CPTs, which means messages for an
NI could be handled by LND threads on any CPT (hashed by NID).
This patch adds a new parameter for NI configuration that allows users
to bind an NI to specified CPT(s):

- tcp0(eth1)[0,1]
  bind NI (tcp0) on CPT0 and CPT1
- o2ib(ib0)[2-5]
  bind NI (o2ib) on CPT2,3,4,5

The expression between square brackets lists the CPTs that the user
wants this NI bound to. If the user provides this expression, messages
for the NI will only be handled by LND threads running on the
specified CPTs.
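
The bracket syntax in the two examples above can be parsed along these
lines. This is a minimal sketch, not the LNet parser: the parse_ni
function and its regular expression are assumptions that cover only the
forms shown (comma-separated CPT numbers and inclusive ranges).

```python
import re

# net(interface) with an optional [cpt-list] suffix.
_NI_RE = re.compile(
    r"^(?P<net>\w+)\((?P<iface>\w+)\)(?:\[(?P<cpts>[\d,\-]+)\])?$")


def parse_ni(spec):
    """Return (net, interface, cpts); cpts is None when unbound."""
    m = _NI_RE.match(spec)
    if m is None:
        raise ValueError(f"cannot parse NI spec: {spec!r}")
    cpts = None  # no bracket expression: NI stays bound to all CPTs
    if m.group("cpts"):
        cpts = []
        for part in m.group("cpts").split(","):
            lo, _, hi = part.partition("-")
            # "2-5" expands to 2,3,4,5; a bare "3" expands to just 3.
            cpts.extend(range(int(lo), int(hi or lo) + 1))
    return m.group("net"), m.group("iface"), cpts


print(parse_ni("tcp0(eth1)[0,1]"))
print(parse_ni("o2ib(ib0)[2-5]"))
```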

This is an intermediate patch; the feature also needs upcoming LND
patches.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I706a92c6da181ed0fec857cc25b5ae27a7a7c36b
Reviewed-on: http://review.whamcloud.com/3114
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: Partitioned LNet networks
Liang Zhen [Wed, 13 Jun 2012 12:37:39 +0000 (20:37 +0800)]
LU-56 lnet: Partitioned LNet networks

We have implemented partitioned LNet resources (MD/ME/EQ).
This patch creates partitioned data for other LNet objects:
- Peer-tables
  Peers are hashed into peer-table on different partitions by NID
- NI refcount and message queue
  NI will have refcount and message queue for each partition
- counters for each partition

These objects are protected by the percpt lock lnet_t::ln_net_lock,
which replaces the original LNET_LOCK.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I7c8c1359aca04a7f859672ccd3268f0282505dd5
Reviewed-on: http://review.whamcloud.com/3113
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1341 tests: recovery-small test_19* fixes
Vitaly Fertman [Tue, 26 Jun 2012 10:44:18 +0000 (14:44 +0400)]
LU-1341 tests: recovery-small test_19* fixes

- test 19a uses ELC, so no eviction happens;
- test 19b cancels conflicting lock in advance due to CLIO logic,
  no eviction again;

fix tests to check what is expected - eviction must take place.

Change-Id: Ib5365c9f90d1fc388fe81196661b262da5e7ad78
Xyratex-bug-id: MRP-482
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Reviewed-by: Alexander Lezhoev <Alexander_Lezhoev@xyratex.com>
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-on: http://review.whamcloud.com/2592
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@whamcloud.com>
Reviewed-by: Yu Jian <yujian@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1157 ldlm: replace waiting flock lists by hashes
Vitaly Fertman [Mon, 25 Jun 2012 21:47:50 +0000 (01:47 +0400)]
LU-1157 ldlm: replace waiting flock lists by hashes

Replace the per-export list with a per-export hash to locate a lock
with blocking export & owner.

Change-Id: I9c4089579bbf126781e232ea7021317fd10223e9
Xyratex-Bug-ID: MRP-385
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-on: http://review.whamcloud.com/2240
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: cleanup for rtrpool and LNet counter
Liang Zhen [Tue, 12 Jun 2012 09:15:20 +0000 (17:15 +0800)]
LU-56 lnet: cleanup for rtrpool and LNet counter

This patch covers a few things:
- code cleanup for router buffer pools
- code cleanup for error handling in lnet_prepare()
- code cleanup for LNet counters

This is an intermediate patch for LNet SMP improvements.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I554d6acb79a55dd77f709d3b6633f157f50a8cee
Reviewed-on: http://review.whamcloud.com/3091
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: Partitioned LNet resources (ME/MD/EQ)
Liang Zhen [Mon, 11 Jun 2012 14:28:23 +0000 (22:28 +0800)]
LU-56 lnet: Partitioned LNet resources (ME/MD/EQ)

We already have a new lock, lnet_res_lock, to protect LNet resources,
but it is still a global lock and could cause performance issues.
This patch creates partitioned data for LNet, so resources are
spread across different partitions. Also, lnet_res_lock is no longer
a single spinlock; it is now a percpt lock, which means LNet only
needs to lock one partition at a time while operating on an MD/ME
belonging to that partition.

A few things are still serialized by the exclusive lock:
- EQ allocation/free
- LNetEQPoll (non-zero size EQ)
- delaying messages on a lazy portal
- stealing MDs between partitions

These operations are either rare or deprecated, so they shouldn't
become a performance problem.
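
The difference between locking one partition and taking the exclusive
lock can be modelled as follows. This is a user-space sketch of the
general idea, not the libcfs percpt lock: the PercptLock class and its
method names are hypothetical, and threading.Lock stands in for a
spinlock.

```python
import threading
from contextlib import contextmanager


class PercptLock:
    """One lock per partition, plus an exclusive mode over all of them."""

    def __init__(self, ncpts):
        self._locks = [threading.Lock() for _ in range(ncpts)]

    @contextmanager
    def lock(self, cpt):
        # Common case: only the partition that owns the MD/ME is locked.
        with self._locks[cpt]:
            yield

    @contextmanager
    def lock_exclusive(self):
        # Rare case: take every partition, always in index order so two
        # exclusive lockers cannot deadlock against each other.
        for part in self._locks:
            part.acquire()
        try:
            yield
        finally:
            for part in reversed(self._locks):
                part.release()
```

Threads working on different partitions never contend, while the rare
operations listed above pay the cost of acquiring every partition.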

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: If5e88b92dd508b84c0fd91725b3aaed424dd3108
Reviewed-on: http://review.whamcloud.com/3078
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 ptlrpc: cleanup of ptlrpc_unregister_service
Liang Zhen [Sat, 26 May 2012 06:40:05 +0000 (14:40 +0800)]
LU-56 ptlrpc: cleanup of ptlrpc_unregister_service

This patch is only a cleanup of ptlrpc_unregister_service(),
it's an intermediate patch for partitioned ptlrpc service.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I334fa3813f77711defd598bbb7f688d3ea6026be
Reviewed-on: http://review.whamcloud.com/2917
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 ptlrpc: svc thread starting/stopping cleanup
Liang Zhen [Fri, 25 May 2012 12:40:24 +0000 (20:40 +0800)]
LU-56 ptlrpc: svc thread starting/stopping cleanup

This patch covers two things:
- serialize creation of ptlrpc service threads
  In the current version service threads can be created in parallel,
  so there could be a "hole" in the thread IDs if one creation fails.
  This could be problematic because some modules require thread IDs
  to be strictly contiguous and unique. Serializing thread creation
  resolves this issue.
- code cleanup for stopping ptlrpc service threads; this only
  prepares for the next step, partitioned ptlrpc service.
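
Why serialized creation avoids holes can be seen in a small model. This
is an illustration only, not the ptlrpc code: ServiceThreads, start_one
and the spawn callback are hypothetical, and the "thread" is reduced to
a success or failure result.

```python
import threading


class ServiceThreads:
    """Hands out contiguous thread IDs, one creation at a time."""

    def __init__(self):
        self._create_guard = threading.Lock()
        self.ids = []

    def start_one(self, spawn):
        """Create one service thread; spawn(tid) returns True on success."""
        with self._create_guard:
            tid = len(self.ids)  # next contiguous ID
            if not spawn(tid):
                return None      # a failure consumes no ID, so no hole
            self.ids.append(tid)
            return tid
```

Because the ID is assigned and committed under the same guard, a failed
creation leaves the next ID unchanged for the following attempt.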

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ied8ad89003aa9d53fa73a4e5166a2c8d07a1aae9
Reviewed-on: http://review.whamcloud.com/2912
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-506 kernel: 2.6.38 kernel macro check cleanup
Lai Siyao [Thu, 21 Jun 2012 07:40:50 +0000 (15:40 +0800)]
LU-506 kernel: 2.6.38 kernel macro check cleanup

* HAVE_DCACHE_LOCK is used to mark kernel >= 2.6.38, instead of other
kernel macros, this makes it easier to maintain kernel support in
the future.
* minor cleanups.

Signed-off-by: Lai Siyao <laisiyao@whamcloud.com>
Change-Id: Ic9a6a8c339a10b34e2b1b06b47fd9e252cd3ec2c
Reviewed-on: http://review.whamcloud.com/3159
Reviewed-by: Yang Sheng <ys@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years agoLU-723 ldiskfs: Drop support for ext3 based ldiskfs
Prakash Surya [Mon, 11 Jun 2012 16:27:18 +0000 (09:27 -0700)]
LU-723 ldiskfs: Drop support for ext3 based ldiskfs

Building ldiskfs from ext3 is no longer supported, so this change
cleans things up and only supports building ldiskfs from ext4
sources. The main changes involved replacing the @BACKFS@ autoconf
variable with 'ext4' and removing the HAVE_EXT4_LDISKFS macro. It is
now safe to assume that ldiskfs is built entirely on ext4.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Change-Id: Ia0eff6e4c1b755806ccfbf554e4a36201f9f1a64
Reviewed-on: http://review.whamcloud.com/1643
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1194 llog: fix for not sync llcd at thread stop
Alexander.Boyko [Tue, 15 May 2012 08:55:40 +0000 (12:55 +0400)]
LU-1194 llog: fix for not sync llcd at thread stop

If llog_obd_repl_cancel() happened between llog_sync() and
class_import_put() in filter_llog_finish(), llog_recov_thread_stop()
threw an LBUG. This patch fixes the issue by adding new flags to
llog_ctxt.

Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Reviewed-by: Andriy Skulysh <andriy_skulysh@xyratex.com>
Reviewed-by: Alexander Zarochentsev <alexander_zarochentsev@xyratex.com>
Xyratex-bug-id: MRP-456
Change-Id: Ife79adfe6cde0f2090776cd27cd87f65c1e988e2
Reviewed-on: http://review.whamcloud.com/2789
Reviewed-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <tappro@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: reduce stack usage of "match" functions
Liang Zhen [Sun, 10 Jun 2012 06:01:31 +0000 (14:01 +0800)]
LU-56 lnet: reduce stack usage of "match" functions

Use new structure lnet_match_info to transfer the parameters
of LNet "match" functions and reduce stack usage.
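
A minimal sketch of the idea, not the actual LNet code: the values a
"match" call chain needs are packed into one structure and passed by
pointer, so each nested call pushes one pointer instead of a long
argument list. The structure name, field names and helper functions
below are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical parameter block; field names are illustrative only. */
struct match_info {
	uint64_t     mbits;    /* match bits carried by the incoming message */
	uint32_t     src_id;   /* sender id (simplified) */
	unsigned int portal;   /* target portal index */
	unsigned int opc;      /* operation code */
	unsigned int rlength;  /* requested length */
	unsigned int roffset;  /* requested offset */
};

/* One entry matches when all non-ignored bits are equal. */
static int match_one(const struct match_info *info, uint64_t me_bits,
		     uint64_t ignore_bits)
{
	return ((info->mbits ^ me_bits) & ~ignore_bits) == 0;
}

/* Returns the index of the first matching entry, or -1 if none match. */
int match_any(const struct match_info *info, const uint64_t *me_bits,
	      const uint64_t *ignore_bits, int count)
{
	assert(info != NULL);
	for (int i = 0; i < count; i++) {
		if (match_one(info, me_bits[i], ignore_bits[i]))
			return i;
	}
	return -1;
}
```

Every level of the call chain takes the same `const struct match_info *`,
so adding a parameter later does not grow the stack frame of each caller.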

This is an intermediate patch for LNet SMP improvements.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I710a78c58add8609606f5d6de1f975ffc5200439
Reviewed-on: http://review.whamcloud.com/3070
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-56 lnet: Granulate LNet lock
Liang Zhen [Thu, 7 Jun 2012 08:43:29 +0000 (16:43 +0800)]
LU-56 lnet: Granulate LNet lock

LNet uses a global lock, LNET_LOCK, to serialize all operations
and event callbacks of LNet. This is a big performance issue on fat
SMP machines because of high lock contention.

We have submitted many changes to separate the critical logic of LNet,
and this patch is the key step toward finer-grained LNet locking.
This patch adds a new lock, "lnet_res_lock"; all operations on LNet
resources (ME, MD, EQ) are protected by this lock. We still
keep LNET_LOCK for now, but it is only taken to serialize
operations on NI, peer, credits and routers.

This is still an intermediate patch for LNet SMP improvements. Both
LNET_LOCK and lnet_res_lock are plain spinlocks for now; they will be
replaced by percpt locks in upcoming patches.
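
The split can be pictured with a small userspace sketch. It is an
illustration under assumptions, not the LNet implementation: pthread
mutexes stand in for kernel spinlocks, and the counters and function
names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Stand-in for LNET_LOCK: guards NI, peer, credit and router state. */
static pthread_mutex_t net_lock = PTHREAD_MUTEX_INITIALIZER;
/* Stand-in for lnet_res_lock: guards ME, MD and EQ resources. */
static pthread_mutex_t res_lock = PTHREAD_MUTEX_INITIALIZER;

static int nr_mds;          /* hypothetical resource counter (res_lock) */
static int tx_credits = 8;  /* hypothetical credit counter (net_lock) */

/* Resource path: takes only the resource lock. */
int md_attach(void)
{
	int n;

	pthread_mutex_lock(&res_lock);
	n = ++nr_mds;
	pthread_mutex_unlock(&res_lock);
	return n;
}

/* Network path: takes only the network lock. */
int credit_take(void)
{
	int rc = -1;

	pthread_mutex_lock(&net_lock);
	if (tx_credits > 0) {
		tx_credits--;
		rc = 0;
	}
	pthread_mutex_unlock(&net_lock);
	return rc;
}
```

A thread attaching an MD and a thread consuming a credit no longer
contend on the same lock, which is the point of the split.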

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I313caffd21776ee3474c2a1391ea78f002b47790
Reviewed-on: http://review.whamcloud.com/3056
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Lai Siyao <laisiyao@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
11 years agoLU-1305 osd: object operations for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 10:52:00 +0000 (14:52 +0400)]
LU-1305 osd: object operations for osd-zfs

allocate/release, create/destroy, attr set/get, etc

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I4e56a175b6fd5008d89cdd509b47f8bc00ce5552
Reviewed-on: http://review.whamcloud.com/2966
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1305 osd: object index for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 10:47:02 +0000 (14:47 +0400)]
LU-1305 osd: object index for osd-zfs

functionality to map fids to dnodes

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: If8153907342a8c0ac3fe8f657f7dc740db6b2753
Reviewed-on: http://review.whamcloud.com/2964
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1305 osd: osd_handler.c for osd-zfs
Alex Zhuravlev [Tue, 29 May 2012 10:45:32 +0000 (14:45 +0400)]
LU-1305 osd: osd_handler.c for osd-zfs

functions to setup/mount zfs backend

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>
Change-Id: I200709e0c86e4f7aae35528cbae7f7b08e094f47
Reviewed-on: http://review.whamcloud.com/2963
Tested-by: Hudson
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 ptlrpc: partition data for ptlrpc service
Liang Zhen [Wed, 23 May 2012 06:34:18 +0000 (14:34 +0800)]
LU-56 ptlrpc: partition data for ptlrpc service

We will have multiple partition data & threads for each ptlrpc service.
This patch is the first step: it moves many members
of ptlrpc_service to a new structure, ptlrpc_service_part.
For now we only create one instance of ptlrpc_service_part for each
service, but we will have multiple instances for each service
very soon (one instance per CPT, CPU partition).
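
As a rough sketch of this layout, assuming hypothetical names and
fields rather than the real ptlrpc structures: per-partition state
lives in its own structure, the service keeps an array of them, and a
lookup by CPT already works when there is only one part.

```c
#include <assert.h>
#include <stddef.h>

#define SVC_MAX_PARTS 8  /* arbitrary bound for this sketch */

struct service;

/* Hypothetical per-partition state split out of the service. */
struct service_part {
	struct service *svc;   /* back pointer to the owning service */
	int cpt;               /* partition this part belongs to */
	int nthreads_running;
	int nreqs_queued;
};

/* Hypothetical service: shared fields plus an array of parts. */
struct service {
	const char *name;
	int nparts;
	struct service_part parts[SVC_MAX_PARTS];
};

void service_init(struct service *svc, const char *name, int nparts)
{
	assert(nparts >= 1 && nparts <= SVC_MAX_PARTS);
	svc->name = name;
	svc->nparts = nparts;
	for (int i = 0; i < nparts; i++) {
		svc->parts[i].svc = svc;
		svc->parts[i].cpt = i;
		svc->parts[i].nthreads_running = 0;
		svc->parts[i].nreqs_queued = 0;
	}
}

/* With a single part every CPT maps to parts[0]. */
struct service_part *service_part_for_cpt(struct service *svc, int cpt)
{
	assert(cpt >= 0);
	return &svc->parts[cpt % svc->nparts];
}
```

Callers that go through `service_part_for_cpt()` do not need to change
when the number of parts grows beyond one.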

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I63d816bdf44a22528c6097fe348060f57d862df3
Reviewed-on: http://review.whamcloud.com/2895
Tested-by: Hudson
Reviewed-by: wangdi <di.wang@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-957 scrub: trigger OI scrub if found bad OI entry
Fan Yong [Thu, 14 Jun 2012 07:41:22 +0000 (15:41 +0800)]
LU-957 scrub: trigger OI scrub if found bad OI entry

If some RPC involves OI lookup and finds inconsistent OI mapping,
it should trigger OI scrub to check and repair the inconsistency.

Known issues:
When the fid is returned to the client, the OI mapping corresponding
to that fid may not be updated, or not committed to disk yet. If the
server crashes before OI scrub completes, then recovery with the
fid corresponding to the inconsistent OI mapping may fail or block.

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I9709386aa6d42954b619f6b1342adae59a2ec5a9
Reviewed-on: http://review.whamcloud.com/2554
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1459 llite: Don't LBUG when import has LUSTRE_IMP_NEW state
Jeremy Filizetti [Thu, 31 May 2012 14:30:00 +0000 (10:30 -0400)]
LU-1459 llite: Don't LBUG when import has LUSTRE_IMP_NEW state

When a disabled OSC/OST is configured in the system at mount
time, a client will LBUG when "lfs check servers" is called.
Disabling the LBUG causes the client to return -EIO instead.

Signed-off-by: Jeremy Filizetti <jeremy.filizetti@gmail.com>
Signed-off-by: Bobi Jam <bobijam@whamcloud.com>
Change-Id: I1844b66e56259da28129df2c60d2542e9c95aeee
Reviewed-on: http://review.whamcloud.com/2998
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-1493 quota: extra release caused by race
Niu Yawei [Mon, 11 Jun 2012 10:55:32 +0000 (03:55 -0700)]
LU-1493 quota: extra release caused by race

There is a race between check_cur_qunit() and
dqacq_completion(): check_cur_qunit() reads the hardlimit
and calculates how much quota needs to be acquired/released
based on the hardlimit; however, the hardlimit can be
changed by dqacq_completion() at any time. That
could result in an extra quota acquire/release when there
is an inflight dqacq.

In general, such an extra dqacq doesn't cause a fatal error,
unless an extra release is going to release more than
'hardlimit' quota.

To minimize the code changes (anyway, it'll be totally
rewritten in the new quota design), we just do one more
check here to avoid the extra release which could cause a
fatal error. A better solution could be calculating the
qd_count here and removing the lqs_blk/ino_rec stuff.
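
The extra check can be sketched as follows. This is an illustration
only: the function names are hypothetical, and the planning policy
(keep usage plus one qunit, release the rest) is an assumption made to
get concrete numbers, not the logic of check_cur_qunit().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Plan a release from a snapshot of the hardlimit. Assumed policy:
 * keep usage + qunit and release whatever is above that.
 */
uint64_t plan_release(uint64_t hardlimit_snapshot, uint64_t usage,
		      uint64_t qunit)
{
	uint64_t keep = usage + qunit;

	return hardlimit_snapshot > keep ? hardlimit_snapshot - keep : 0;
}

/*
 * The one additional check: if the hardlimit changed after the plan
 * was made and the planned release now exceeds it, drop the release.
 */
uint64_t safe_release(uint64_t planned, uint64_t hardlimit_now)
{
	return planned > hardlimit_now ? 0 : planned;
}
```

For example, a plan of 70 made against a hardlimit of 100 is dropped if
a concurrent completion has already lowered the hardlimit to 40.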

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: I0ad5ff0f32e39f32872c201ad1d545fbd9d1a57d
Reviewed-on: http://review.whamcloud.com/3074
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
11 years agoLU-1438 quota: quota active checking is missed on slave
Niu Yawei [Wed, 13 Jun 2012 02:42:01 +0000 (19:42 -0700)]
LU-1438 quota: quota active checking is missed on slave

On the quota slave, we missed checking whether quota is enabled in
quota_check_common() and several other places. This could cause the
slave to retry acquiring quota in quota_chk_acq_common() indefinitely
when quota has already been turned off on the master.
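
A small simulation of the failure mode, with hypothetical names and a
simulated master: the retry loop only terminates because it re-checks
the "enabled" flag on every pass.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slave state for this sketch. */
struct quota_slave {
	int enabled;    /* cleared once the master turns quota off */
	int off_after;  /* simulation: master disables quota at this attempt */
	int attempts;   /* number of acquire attempts made */
};

/* Simulated acquire: the master never grants quota in this sketch. */
static int try_acquire(struct quota_slave *s)
{
	s->attempts++;
	if (s->attempts >= s->off_after)
		s->enabled = 0;  /* master turned quota off */
	return -1;
}

/*
 * Returns 0 if quota was acquired, 1 if quota is disabled and there
 * is nothing to enforce. Without the check in the loop condition this
 * would spin forever once the master stops granting.
 */
int acquire_quota(struct quota_slave *s)
{
	assert(s != NULL);
	while (s->enabled) {
		if (try_acquire(s) == 0)
			return 0;
	}
	return 1;
}
```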

Signed-off-by: Niu Yawei <niu@whamcloud.com>
Change-Id: Iaa48c7cca05daf595b6d3b7e4025c7650e460918
Reviewed-on: http://review.whamcloud.com/3097
Tested-by: Hudson
Reviewed-by: Fan Yong <yong.fan@whamcloud.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann@whamcloud.com>
11 years agoLU-957 scrub: osd-ldiskfs itable based iteration
Fan Yong [Thu, 14 Jun 2012 07:40:39 +0000 (15:40 +0800)]
LU-957 scrub: osd-ldiskfs itable based iteration

Implement inode table based object iteration in osd-ldiskfs.
It is implemented as DT iteration APIs, which can be used by
the upper-layer LFSCK to scan the whole device sequentially.
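
To illustrate what a sequential, table-driven iterator looks like, here
is a simplified sketch. It walks an in-memory allocation bitmap and is
not the osd-ldiskfs code or the DT iteration interface; all names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator over an inode allocation bitmap. */
struct itable_iter {
	const unsigned char *bitmap;  /* one bit per inode slot */
	unsigned int nbits;           /* number of slots in the bitmap */
	unsigned int pos;             /* next slot to examine */
};

void itable_iter_init(struct itable_iter *it, const unsigned char *bitmap,
		      unsigned int nbits)
{
	assert(it != NULL && bitmap != NULL);
	it->bitmap = bitmap;
	it->nbits = nbits;
	it->pos = 0;
}

/* Returns the next in-use slot in ascending order, or -1 at the end. */
long itable_iter_next(struct itable_iter *it)
{
	while (it->pos < it->nbits) {
		unsigned int i = it->pos++;

		if (it->bitmap[i / 8] & (1u << (i % 8)))
			return (long)i;
	}
	return -1;
}
```

A scanner built on such an interface visits objects in on-disk order,
which is what makes a whole-device pass sequential.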

Signed-off-by: Fan Yong <yong.fan@whamcloud.com>
Change-Id: I3d80bb4a174d47429764e5cca35e4f07be52d50b
Reviewed-on: http://review.whamcloud.com/2553
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 lnet: code cleanup for lib-move.c
Liang Zhen [Wed, 6 Jun 2012 14:15:59 +0000 (22:15 +0800)]
LU-56 lnet: code cleanup for lib-move.c

Most changes in this patch are just code cleanup:
- remove one unnecessary lock dance for message forwarding
- move some code blocks to make functions cleaner
- rename lnet_ni_peer_alive to lnet_ni_query_locked

It's an intermediate patch for LNet SMP improvements.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: Ia7cfe37a5a4f896be4fcc7f6c5cf9c27268de9ba
Reviewed-on: http://review.whamcloud.com/3048
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
11 years agoLU-56 lnet: match-table for Portals
Liang Zhen [Tue, 5 Jun 2012 10:02:55 +0000 (18:02 +0800)]
LU-56 lnet: match-table for Portals

Create a sub-object named "match-table" for each Portal; MEs will
be attached to the match-table instead of the Portal.
Although we only have one match-table for each Portal in this patch,
in upcoming changes we will create multiple match-tables
for each Portal:
- unique-match Portal
  MEs will be scattered to different match-tables by match info
- wildcard Portal
  LND threads just grab ME/MD from match-table corresponding to
  current CPT (CPU partition).
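
The two planned policies can be sketched as a table-selection
function. This is a guess at the shape of the idea, not the upcoming
implementation: the enum, the function name and the use of a plain
modulo on the match bits are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

enum portal_type {
	PORTAL_UNIQUE,    /* each ME has distinct match bits */
	PORTAL_WILDCARD   /* MEs match any request */
};

/*
 * Pick the match-table index for a request.
 *  - unique-match portal: scatter by match info (here: mbits modulo
 *    the table count, an assumed hash)
 *  - wildcard portal: use the table of the current CPT
 */
unsigned int pick_match_table(enum portal_type type, uint64_t mbits,
			      unsigned int cur_cpt, unsigned int ntables)
{
	assert(ntables > 0);
	if (type == PORTAL_UNIQUE)
		return (unsigned int)(mbits % ntables);
	return cur_cpt % ntables;
}
```

With a single match-table per Portal, as in this patch, both branches
return index 0.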

We also did some code cleanup for delayed messages in this patch.

Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I2b24723c3bd2a6664f2b241840de19d5f43be11f
Reviewed-on: http://review.whamcloud.com/3043
Reviewed-by: Doug Oucharek <doug@whamcloud.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@whamcloud.com>
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>