Whamcloud - gitweb
fs/lustre-release.git
James Simmons [Tue, 5 Mar 2013 12:50:54 +0000 (07:50 -0500)]
LU-812 ldiskfs: super_operations->dirty_inode now takes a flag

Currently this flag is unused by ext4, so just pass in 0.  This change
happened in kernel commit aa38572954ade525817fe88c54faebf85e5a61c0.
Apparently the flag is used to tell the difference between timestamp
updates and anything else.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: chas williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Change-Id: I24536546256f5f043c1f53e15220b0c956be343f
Reviewed-on: http://review.whamcloud.com/4966
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Jeff Mahoney [Thu, 21 Feb 2013 15:05:53 +0000 (10:05 -0500)]
LU-1994 kernel: fix reference counting with l_dentry_open

Commit 78b1d1bd (LU-1994 kernel: 3.6 dentry_open uses struct path
as first arg) added support for the new dentry_open call that
accepts struct path instead of a dentry/vfsmount pair, but missed
the new reference counting rules that go along with it.

Upstream commit 765927b2 also makes dentry_open grab references itself
so it no longer frees references that weren't passed to it.

On failure, we'll end up with an extra reference to the dentry
that was passed in.

Since new dentry_open is the one that will be around for the
foreseeable future, let's just map to that directly for the path case.

For the other two cases, we'll take the references ourselves in
ll_dentry_open, free them on failure, and adjust callers to expect
that it won't free any references passed to it.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Change-Id: I05a95cf735a5b2d70273a485335d571fcda7a6b0
Reviewed-on: http://review.whamcloud.com/5330
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Andreas Dilger [Sat, 1 Sep 2012 01:15:42 +0000 (19:15 -0600)]
LU-1798 tests: don't set jobid if not changing

Don't set jobid_var conf_param if the value would not change.  This
avoids setting the same parameter in the config log multiple times,
and avoids the extra delay on each startup waiting for the conf_param
to propagate from the MGS to the client each time.
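
The "set only on change" idea can be sketched as follows, in Python for illustration only. The names set_if_changed, fake_get and fake_set are hypothetical stand-ins, not the test-framework functions touched by this patch, and the parameter values are just examples.

```python
def set_if_changed(name, wanted, get_param, set_param):
    """Write `wanted` to parameter `name` only if it differs from the current value.

    Returns True when a write was issued, False when it was skipped.
    """
    if get_param(name) == wanted:
        return False  # unchanged: no extra config-log record, no propagation wait
    set_param(name, wanted)
    return True


# In-memory stand-in for the parameter store (an assumption for this demo).
params = {"jobid_var": "procname_uid"}
writes = []

def fake_get(name):
    return params.get(name)

def fake_set(name, value):
    writes.append((name, value))
    params[name] = value

first = set_if_changed("jobid_var", "procname_uid", fake_get, fake_set)  # skipped
second = set_if_changed("jobid_var", "disable", fake_get, fake_set)      # written
```

Only the second call reaches the setter, so the log of writes holds a single record.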

Fix the console error printing if the jobid_env does not exist.

Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Change-Id: I36ed60451a875bcd46fd0d6f3d7068d1b1398df5
Reviewed-on: http://review.whamcloud.com/3867
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
John L. Hammond [Tue, 12 Mar 2013 03:44:09 +0000 (22:44 -0500)]
LU-2484 mgs: remove unused md_stats

Do not allocate md_stats or create /proc/.../MGS/md_stats for mgs
devices since they do not implement the md_ops interface.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I8b417bc2433484a0e008ff9bf7fef69ba1e62416
Reviewed-on: http://review.whamcloud.com/4813
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
John L. Hammond [Thu, 28 Feb 2013 09:48:13 +0000 (03:48 -0600)]
LU-2513 osc: compute grant targets in bytes

In osc_shrink_grant() and osc_shrink_grant_to_target() convert page
unit target values to bytes before comparing to cl_avail_grant.
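
A minimal sketch of why the unit conversion matters, in Python for illustration. PAGE_SIZE, the helper names, the figures and the direction of the comparison are all assumptions made for the example; this is not the code from osc_shrink_grant().

```python
PAGE_SIZE = 4096  # assumed page size in bytes; the real value is platform-dependent

def pages_to_bytes(pages, page_size=PAGE_SIZE):
    """Convert a count of pages into bytes."""
    return pages * page_size

def exceeds_target(avail_grant_bytes, target_pages, page_size=PAGE_SIZE):
    """True if the available grant (bytes) is above the target (given in pages)."""
    return avail_grant_bytes > pages_to_bytes(target_pages, page_size)

avail = 1_000_000     # bytes of grant currently available (made-up figure)
target_pages = 256    # target expressed in pages (made-up figure)

mixed_units = avail > target_pages                # bytes compared to pages
same_units = exceeds_target(avail, target_pages)  # 256 * 4096 = 1048576 bytes
```

With mixed units the available grant looks far above the target; once both sides are in bytes it is actually below it.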

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie8e6e8b4b3245efa3b14777608f3a48bbab7e4e2
Reviewed-on: http://review.whamcloud.com/5495
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
John L. Hammond [Fri, 1 Feb 2013 04:00:07 +0000 (22:00 -0600)]
LU-2675 cleanup: remove obsolete llog-test crud

Do not create the llog-test.c symlink as this Linux 2.4-ism is no
longer needed. Also remove the now defunct llog-test.sh.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I72ba710659e168ead206a2b9b28633724d0e44c5
Reviewed-on: http://review.whamcloud.com/5239
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Peng Tao [Wed, 27 Feb 2013 10:50:54 +0000 (18:50 +0800)]
LU-2916 llite: call ll_permission rather than inode_permission

Call ll_permission so that we can build on kernels older than 2.6.27.

Signed-off-by: Peng Tao <tao.peng@emc.com>
Change-Id: I288030c3ee37ccb909d45121d457adb4dccafe0a
Reviewed-on: http://review.whamcloud.com/5607
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
James Simmons [Tue, 5 Mar 2013 12:55:54 +0000 (07:55 -0500)]
LU-2642 osd-ldiskfs: fix typo in REQ_WRITE check

Commit eecb3086 defines __REQ_WRITE as BIO_RW if __REQ_WRITE is
not defined. However, __REQ_WRITE is an enum value, so the preprocessor
never sees it as defined, whereas REQ_WRITE is a #define.

Let's use REQ_WRITE in the check instead.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I500c771e6f3e69e1be3fab872150019103515d30
Reviewed-on: http://review.whamcloud.com/5503
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Kyrylo Shatskyy [Wed, 27 Feb 2013 02:40:24 +0000 (04:40 +0200)]
LU-2873 tests: t-f generates numerous dots garbage

Remove 'echo -n "."' from the run_test t-f function when skipping a test.

Signed-off-by: Kyrylo Shatskyy <kyrylo_shatskyy@xyratex.com>
Reviewed-by: Roman Grigoryev <Roman_Grigoryev@xyratex.com>
Xyratex-bug-id: MRP-863
Change-Id: I2c6de9fff3676aee7a506a2c2f6b6f9fe7a3b472
Reviewed-on: http://review.whamcloud.com/5540
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Prakash Surya [Thu, 31 Jan 2013 23:06:07 +0000 (15:06 -0800)]
LU-2731 scripts: Speed up /etc/init.d/lustre stop

This patch parallelizes the shutdown of multiple services running on the
same node. This has been empirically shown to drastically reduce the
runtime of the script for an OSS with many OSTs.

This patch was tested on a Lustre 2.1 ldiskfs OSS node with 32 OSTs
attached, by recording startup and shutdown times for the OSS. The
number of OSTs used was varied, ranging from a single one, up to all 32,
incrementing by powers of two (i.e. timed startup/shutdown of 1 OST,
then of 2 OSTs, then 4, etc.).

Results of startup and shutdown times *without* this patch applied:

    +------------------------------------------------+
    | $ time /etc/init.d/lustre start # (w/o patch)  |
    +-----------+------------+-----------+-----------+
    | # of OSTs |    real    |    user   |    sys    |
    +-----------+------------+-----------+-----------+
    |      1    | 0m  2.184s | 0m 0.162s | 0m 0.077s |
    |      2    | 0m  4.285s | 0m 0.281s | 0m 0.148s |
    |      4    | 0m  8.508s | 0m 0.500s | 0m 0.302s |
    |      8    | 0m 16.961s | 0m 1.017s | 0m 0.568s |
    |     16    | 0m 33.884s | 0m 1.964s | 0m 1.176s |
    |     32    | 1m  7.744s | 0m 3.986s | 0m 2.280s |
    +-----------+------------+-----------+-----------+

    +------------------------------------------------+
    | $ time /etc/init.d/lustre stop # (w/o patch)   |
    +-----------+------------+-----------+-----------+
    | # of OSTs |    real    |    user   |    sys    |
    +-----------+------------+-----------+-----------+
    |     1     | 0m  4.758s | 0m 0.072s | 0m 0.030s |
    |     2     | 0m  9.018s | 0m 0.118s | 0m 0.049s |
    |     4     | 0m 18.813s | 0m 0.185s | 0m 0.083s |
    |     8     | 0m 37.586s | 0m 0.337s | 0m 0.141s |
    |    16     | 1m 16.092s | 0m 0.597s | 0m 0.263s |
    |    32     | 2m 37.550s | 0m 1.181s | 0m 0.403s |
    +-----------+------------+-----------+-----------+

Results of startup and shutdown time *with* this patch:

    +------------------------------------------------+
    | $ time /etc/init.d/lustre start # (w/ patch)   |
    +-----------+------------+-----------+-----------+
    | # of OSTs |    real    |    user   |    sys    |
    +-----------+------------+-----------+-----------+
    |      1    | 0m  2.183s | 0m 0.158s | 0m 0.083s |
    |      2    | 0m  4.282s | 0m 0.274s | 0m 0.153s |
    |      4    | 0m  8.519s | 0m 0.510s | 0m 0.303s |
    |      8    | 0m 16.966s | 0m 1.019s | 0m 0.583s |
    |     16    | 0m 33.878s | 0m 1.984s | 0m 1.154s |
    |     32    | 1m  7.745s | 0m 3.944s | 0m 2.322s |
    +-----------+------------+-----------+-----------+

    +------------------------------------------------+
    | $ time /etc/init.d/lustre stop # (w/ patch)    |
    +-----------+------------+-----------+-----------+
    | # of OSTs |    real    |    user   |    sys    |
    +-----------+------------+-----------+-----------+
    |      1    | 0m  4.566s | 0m 0.075s | 0m 0.023s |
    |      2    | 0m  4.857s | 0m 0.105s | 0m 0.070s |
    |      4    | 0m  4.777s | 0m 0.175s | 0m 0.064s |
    |      8    | 0m  5.449s | 0m 0.323s | 0m 0.153s |
    |     16    | 0m  5.862s | 0m 0.606s | 0m 0.208s |
    |     32    | 0m  6.307s | 0m 1.183s | 0m 0.811s |
    +-----------+------------+-----------+-----------+
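
The effect of stopping services in parallel rather than one after another can be modelled as below. This is a Python sketch for illustration only: the init script itself is shell, and stop_service, the 0.05s delay and the service names are assumptions, not taken from the patch.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def stop_service(name, delay=0.05):
    """Stand-in for stopping one service; the sleep models the shutdown wait."""
    time.sleep(delay)
    return name

def stop_serial(services):
    """Stop services one after another."""
    return [stop_service(s) for s in services]

def stop_parallel(services):
    """Stop all services at once, one worker thread per service."""
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        return list(pool.map(stop_service, services))

services = [f"OST{i:04x}" for i in range(8)]

t0 = time.monotonic()
serial_result = stop_serial(services)
serial_elapsed = time.monotonic() - t0

t0 = time.monotonic()
parallel_result = stop_parallel(services)
parallel_elapsed = time.monotonic() - t0
```

The serial time grows with the number of services, while the parallel time stays close to that of a single stop. That matches the shape of the tables above, where stopping 32 OSTs goes from 2m 37.550s to 0m 6.307s, roughly 25 times faster.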

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Change-Id: I90c1f6a265a8d86bbc8ddfb88aa635e5b96fd975
Reviewed-on: http://review.whamcloud.com/5235
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
James Simmons [Tue, 5 Mar 2013 17:42:35 +0000 (12:42 -0500)]
LU-2760 ldiskfs: update rhel6.3 series to handle mkdir errors

ext4_mkdir can fail in several paths to dirtying an inode but
the errors aren't caught. This patch adds the upstream commit
that handles the errors and adjusts the dependent patches
accordingly.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If86538e88c7386a06016ffae6893bacc8ba131e4
Reviewed-on: http://review.whamcloud.com/5279
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Chris Gearing [Thu, 15 Nov 2012 18:12:32 +0000 (10:12 -0800)]
LU-913  test:  Framework needs to record the test filesystem.

Add a section

file_system: XXX

to the node info in yaml.sh.

Because this runs on the node being recorded, it can use
node_fstypes $HOSTNAME
to fetch the filesystem type of the local machine.

Signed-off-by: Chris Gearing <chris.gearing@intel.com>
Change-Id: I721e4084096c75b69290959190526ca27b573e1b
Reviewed-on: http://review.whamcloud.com/4591
Tested-by: Hudson
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
John L. Hammond [Fri, 1 Feb 2013 03:10:24 +0000 (21:10 -0600)]
LU-2675 cleanup: remove unused mkdirdeep.c and lltrace.h

That's all.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I547623a98541d545057776a01424fa6a362f06ee
Reviewed-on: http://review.whamcloud.com/5177
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Nathaniel Clark [Sun, 10 Mar 2013 21:27:02 +0000 (17:27 -0400)]
LU-2342 tests: account for log size in replay-single/20b

Account for larger ondisk log size in ZFS (256) vs. ldiskfs (50).

Test-Parameters: mdsfilesystemtype=zfs ostfilesystemtype=zfs   mdtfilesystemtype=zfs testlist=replay-single
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I4150d58556240b128c02ba667c4c390c79b8a463
Reviewed-on: http://review.whamcloud.com/5666
Tested-by: Hudson
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Nathaniel Clark [Fri, 8 Mar 2013 20:54:21 +0000 (15:54 -0500)]
LU-2903 tests: calculation of available space

Better debug logging for statfs path.
Account for log size in space calculation.
Sleep slightly longer to wait for good statfs because zfs doesn't
count uncommitted changes in available space.

Test-Parameters: mdsfilesystemtype=zfs ostfilesystemtype=zfs   mdtfilesystemtype=zfs testlist=replay-ost-single
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib4b5a7c645d6b2a630dcc729483422c8b3a095db
Reviewed-on: http://review.whamcloud.com/5662
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Emoly Liu [Fri, 15 Mar 2013 04:40:42 +0000 (12:40 +0800)]
LU-2911 llite: add obd_fid_init/fini() back to llite

Without obd_fid_init() in llite, when a filesystem is upgraded from
branch 1.8 to 2.4, obd_fid_init() in lmv is never triggered because
the branch 1.8 based config log contains no lmv. This causes an LBUG
when running mkdir after the upgrade, such as:

seq_client_alloc_fid()) ASSERTION( seq != ((void *)0) ) failed.
seq_client_alloc_fid()) LBUG
Call Trace:
[<ffffffffa0371895>] libcfs_debug_dumpstack+0x55/0x80 [libcfs]
[<ffffffffa0371e97>] lbug_with_loc+0x47/0xb0 [libcfs]
[<ffffffffa080bea9>] seq_client_alloc_fid+0x379/0x440 [fid]
[<ffffffffa03822e1>] ? libcfs_debug_msg+0x41/0x50 [libcfs]
[<ffffffffa082470b>] mdc_fid_alloc+0xbb/0xf0 [mdc]
[<ffffffffa0832b1c>] mdc_create+0xcc/0x780 [mdc]
[<ffffffffa09c487b>] ll_new_node+0x19b/0x6a0 [lustre]
[<ffffffffa09c50a7>] ll_mkdir+0x97/0x1f0 [lustre]

Signed-off-by: Liu Ying <emoly.liu@intel.com>
Change-Id: I0eab1298b8d02ca08ecd4ac8bb422a2de12b7f83
Reviewed-on: http://review.whamcloud.com/5733
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Hudson
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Jinshan Xiong [Mon, 18 Mar 2013 23:05:45 +0000 (16:05 -0700)]
LU-2948 osc: keep trying to shrink more LRU slots

Previously it tried only once, which can lead to a livelock, as
seen in this ticket.
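
A generic sketch of "keep trying" versus "try once", in Python for illustration. The function shrink_lru, the free_some callback, the stop conditions and the numbers are all assumptions made for the example; the actual loop in the osc code may differ.

```python
def shrink_lru(free_some, wanted, max_rounds=100):
    """Call `free_some(n)` repeatedly until `wanted` slots have been freed.

    `free_some(n)` returns how many slots it actually freed (0..n).
    Stops early if a round frees nothing, or after `max_rounds` rounds.
    """
    freed = 0
    for _ in range(max_rounds):
        if freed >= wanted:
            break
        got = free_some(wanted - freed)
        if got == 0:
            break  # no progress possible right now
        freed += got
    return freed


# Stand-in LRU that can release at most two slots per call (made-up limit).
pool = {"busy": 5}

def free_up_to_two(n):
    got = min(2, n, pool["busy"])
    pool["busy"] -= got
    return got

single_try = free_up_to_two(4)           # one attempt frees only 2 of 4
pool["busy"] = 5                         # reset the stand-in
retried = shrink_lru(free_up_to_two, 4)  # two rounds free all 4
```

A single attempt leaves the caller short of slots, while the retry loop continues until the request is met or no further progress is possible.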

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: If603686d1d9fa7cf6513143fcc6ef962cfea9863
Reviewed-on: http://review.whamcloud.com/5760
Tested-by: Hudson
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Johann Lombardi [Thu, 21 Feb 2013 15:17:43 +0000 (16:17 +0100)]
LU-2849 hsm: don't run sanity-hsm for ver < 2.3.61

Add version check in sanity-hsm.
Now that sanity-hsm has been added, we should consider running it by
default.
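
A minimal sketch of a numeric version check, in Python for illustration. The test script itself is shell; version_tuple and should_run_sanity_hsm are hypothetical names, and only the 2.3.61 threshold comes from the commit subject.

```python
def version_tuple(version):
    """'2.3.61' -> (2, 3, 61); only plain dotted integers are handled."""
    return tuple(int(part) for part in version.split("."))

MIN_HSM_VERSION = (2, 3, 61)  # threshold taken from the commit subject

def should_run_sanity_hsm(version):
    """True if `version` is at least 2.3.61."""
    return version_tuple(version) >= MIN_HSM_VERSION

naive = "2.3.9" >= "2.3.61"               # string comparison: '9' sorts after '6'
correct = should_run_sanity_hsm("2.3.9")  # numeric comparison: 9 < 61
```

Comparing the components as integers avoids the trap where "2.3.9" sorts after "2.3.61" as a string.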

Test-Parameters: testlist=sanity-hsm

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I8cec304bed46ac24354a27716ad12f5233c75a3f
Reviewed-on: http://review.whamcloud.com/5502
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Bobi Jam [Fri, 22 Feb 2013 03:08:00 +0000 (11:08 +0800)]
LU-2845 osp: fix osp precreate thread init error handling

If the osp device hasn't connected to the OST, osp_precreate_thread()
should take that into account and bypass the normal exit path.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I27d18dfd4d7d55c97eeb169b5d7dc7042a42fd33
Reviewed-on: http://review.whamcloud.com/5508
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Isaac Huang [Wed, 27 Feb 2013 22:15:17 +0000 (15:15 -0700)]
LU-2884 libcfs: SMP optimizations cleanups

Miscellaneous cleanups for the SMP optimizations code:
- Fixed typos.
- Fixed resource leak in lnet_create_locks().
- Fixed incorrect symbols in EXPORT_SYMBOL().

Signed-off-by: Isaac Huang <he.huang@intel.com>
Change-Id: I3b617367e5ed6b11ae327e477fc2c201c453e347
Reviewed-on: http://review.whamcloud.com/5547
Tested-by: Hudson
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Jinshan Xiong [Wed, 6 Feb 2013 23:42:28 +0000 (15:42 -0800)]
LU-1876 hsm: revise ll_setattr_raw to not check stripe data

ll_setattr_raw used to check whether the file has stripe data, and
would not send the file size to the MDT otherwise. Now that the layout
lock has been introduced, checking stripe data at the llite layer is
unreliable.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ib947c568232b4025210d1e2e4e05bcf3514fd36a
Reviewed-on: http://review.whamcloud.com/5291
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Fri, 8 Feb 2013 05:46:30 +0000 (13:46 +0800)]
LU-2868 lfsck: finish otable iteration if no more objects

If there are no objects to be scanned right at the start of LFSCK,
then LFSCK should mark the otable-based iteration as finished
to avoid accessing uninitialized data via dt_it_ops::rec().

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I67e880ccfa13e867c4ca7f1858d67909ba0415b3
Reviewed-on: http://review.whamcloud.com/5622
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
wangdi [Sun, 22 Dec 2013 04:07:23 +0000 (20:07 -0800)]
LU-2912 mdd: move fid2path from MDD to MDT

1. Move some linkEA APIs to obdclass, so that both MDD (set linkEA)
and MDT (fid2path) can access these linkEA operations.

2. Move fid2path from MDD to MDT, so that it can detect a remote
object and return -EREMOTE to the client. The client then
tries fid2path on another MDT, and finally
assembles the path fragments from the different MDTs.
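
The retry-and-assemble flow in item 2 can be modelled as below, in Python for illustration only. Every name here (ROOT, mdt_fid2path, client_fid2path, the fid_to_mdt map and the string FIDs) is hypothetical, and the way the client picks the next MDT is an assumption; the sketch only shows how fragments from several MDTs combine into one path.

```python
ROOT = "root"  # hypothetical marker for the filesystem root

def mdt_fid2path(local_links, fid):
    """Walk the parent links held by one MDT.

    Returns (names, remote_fid). `names` is the fragment resolved locally,
    leaf first. `remote_fid` is None when the root was reached, otherwise
    the first FID this MDT does not hold (the -EREMOTE case).
    """
    names = []
    while fid != ROOT:
        if fid not in local_links:
            return names, fid
        parent, name = local_links[fid]
        names.append(name)
        fid = parent
    return names, None

def client_fid2path(mdts, fid_to_mdt, fid):
    """Retry on other MDTs and join the fragments into one path."""
    parts = []
    while fid is not None:
        names, remote = mdt_fid2path(mdts[fid_to_mdt[fid]], fid)
        if remote == fid:
            raise LookupError(f"no MDT resolved {fid}")
        parts.extend(names)
        fid = remote
    return "/" + "/".join(reversed(parts))


# Two MDT stand-ins: each maps a FID to (parent FID, name).
mdts = {
    0: {"A": (ROOT, "dir0")},
    1: {"B": ("A", "dir1"), "C": ("B", "file")},
}
fid_to_mdt = {"A": 0, "B": 1, "C": 1}

fragment = mdt_fid2path(mdts[1], "C")
path = client_fid2path(mdts, fid_to_mdt, "C")
```

The first MDT resolves only "dir1/file" and reports that "A" is remote; the second lookup supplies "dir0", and the client joins the two fragments.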

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I464cd1fdd44ebbe02c94821f910294829f3d9b94
Reviewed-on: http://review.whamcloud.com/5676
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
James Simmons [Thu, 14 Mar 2013 17:33:15 +0000 (13:33 -0400)]
LU-2748 fsfilt: ext4_map_inode_page in osd and ldisk out of sync

The function ext4_map_inode_page is mismatched in its use between
osd-ldiskfs and ldiskfs. The integer array is no longer used so
we remove its handling from the ldiskfs layer.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I9b5522ce187f06983f328408cbcd0ce077e72ea1
Reviewed-on: http://review.whamcloud.com/5708
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Keith Mannthey [Fri, 15 Mar 2013 00:41:02 +0000 (17:41 -0700)]
LU-2902 test: Debug for roc_hit v1

This will output all the stats and help sort out what
is happening on the systems.  I would like to know
what the proc values are when roc_hit is blank.

More debug may be needed but I want to start here.

This patch should be dropped when LU-2902 is closed.

I have used this patch with auster without issue; it
just adds noise to the test results.

Signed-off-by: Keith Mannthey <keith.mannthey@intel.com>
Change-Id: I3efe9aa87e8c51909667f9fbc3b9c2d6779d8d0d
Reviewed-on: http://review.whamcloud.com/5648
Tested-by: Hudson
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Vitaly Fertman [Mon, 4 Feb 2013 13:27:03 +0000 (17:27 +0400)]
LU-1565 ldlm: make blocking threads async wherenever possible

There is no need to wait for the LRU lock cancel to complete in the
client-side pool recalculation, so make it asynchronous.

Also make all the ldlm_cli_cancel() calls from blocking callbacks async.

Change-Id: Ie510c7361f1025a78c693a11b457baf1652f8c90
Xyratex-bug-id: MRP-690
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: Bruce Korb <bruce_korb@xyratex.com>
Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-on: http://review.whamcloud.com/4181
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
wangdi [Sat, 30 Nov 2013 12:22:55 +0000 (04:22 -0800)]
LU-2802 lmv: get tgt by checking ltd_idx in lmv_find_target.

Currently, lmv_find_target returns lmv->tgts[mds]
according to mdt_index, which is not correct. The LMV
index is assigned by mount sequence, while mdt_index
is specified by --index. So we should check ltd_idx
in lmv_find_target.
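
The difference between indexing by array position and matching on ltd_idx can be sketched as follows, in Python for illustration. The Target class and both helper functions are hypothetical; only the ltd_idx field name comes from the text above.

```python
from dataclasses import dataclass

@dataclass
class Target:
    name: str
    ltd_idx: int  # MDT index as given with --index

def find_target_by_position(tgts, mdt_index):
    """Behaviour described as incorrect: treat mdt_index as an array position."""
    return tgts[mdt_index]

def find_target_by_idx(tgts, mdt_index):
    """Scan for the target whose ltd_idx matches; None if there is none."""
    for tgt in tgts:
        if tgt.ltd_idx == mdt_index:
            return tgt
    return None

# Targets stored in mount order, which here differs from the --index order.
tgts = [Target("MDT0001", 1), Target("MDT0000", 0)]

by_position = find_target_by_position(tgts, 0)  # picks MDT0001
by_idx = find_target_by_idx(tgts, 0)            # picks MDT0000
```

When the mount order and the --index values disagree, only the ltd_idx match returns the intended target.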

Signed-off-by: Wang Di <di.wang@intel.com>
Change-Id: I67a941cca00eb80ba91af6eb3f3441982d4fcab3
Reviewed-on: http://review.whamcloud.com/5412
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Wang Di [Mon, 11 Jun 2012 22:13:55 +0000 (15:13 -0700)]
LU-1187 out: add resend check for update.

1. Add update resend check between MDTs.

2. Even when creating a new object, osp_init_object
   still needs to get the remote object attrs, because the
   object might already have been created by some partially failed
   operation.

3. During resend handling, OSP needs to delete the existing
   orphan objects first, then do remote create.

4. MDT will check whether the name has been deleted during
   the resend of remote unlink, and only delete the local
   directory if the name on the remote MDT has been deleted.

5. Fix the fail_id assignment location to only fail real
   update RPC.

6. Some fixes for replay DNE test scripts.

Change-Id: Ia9ba8091b6622b0e2fd1f1b4fd355b5ff3eb9758
Signed-off-by: Wang Di <di.wang@intel.com>
Reviewed-on: http://review.whamcloud.com/4343
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Emoly Liu [Tue, 5 Feb 2013 14:27:06 +0000 (22:27 +0800)]
LU-2571 lfsck: run lfsck on server node

Usually we run e2fsck on OST and MDS nodes, and run lfsck on clients,
but it's not necessary to install e2fsprogs rpm on clients.
So, if lfsck is not found on client, we will try server node instead.
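
The fallback can be sketched as below, in Python for illustration. The test framework is shell; pick_lfsck_node, the has_lfsck callback and the node names are hypothetical, and which server node the real test tries is not stated here.

```python
def pick_lfsck_node(client, servers, has_lfsck):
    """Return the first node that has lfsck: the client if possible, else a server."""
    for node in [client, *servers]:
        if has_lfsck(node):
            return node
    return None

# Stand-in for "is lfsck installed on this node?" (made-up inventory).
installed_on = {"mds1"}

def has_lfsck(node):
    return node in installed_on

chosen = pick_lfsck_node("client1", ["mds1", "oss1"], has_lfsck)
```

Here the client has no lfsck, so the lookup falls through to the first server that does.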

Test-Parameters: testlist=lfsck
Signed-off-by: Liu Ying <emoly.liu@intel.com>
Change-Id: I3979dc0236e81163f3283eac3148c36c8ddccf63
Reviewed-on: http://review.whamcloud.com/5139
Tested-by: Hudson
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Hongchao Zhang [Fri, 8 Feb 2013 21:24:48 +0000 (05:24 +0800)]
LU-2194 test: avoid wrong eviction in recovery_small

In subtests 19a and 19b of recovery_small, the locks from the OST/MDT
should be cancelled before "drop_ldlm_cancel" to avoid a wrong eviction
by the OST/MDT, since it will also drop the lock cancellation request.

Change-Id: I49d1d189f6791224a3f132dc8cf2dd8b3d51d43e
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/5679
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Hudson
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Fan Yong [Wed, 13 Feb 2013 13:09:39 +0000 (21:09 +0800)]
LU-2640 osd: conditional enable transaction debug

The current transaction debug mechanism will be disabled
when Lustre-2.4 is released.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I516a9026b2d887935c5714dc0b777d65d487dac7
Reviewed-on: http://review.whamcloud.com/5698
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Bruno Faccini [Sat, 9 Mar 2013 16:39:01 +0000 (17:39 +0100)]
LU-2882 build: build broken with no zfs libraries installed

Revert the default values (originally set in the LU-2391 patch) for
ldiskfs/zfs OSD RPM builds, to support more of the ways
("rpmbuild -tb <lustre_source_tarball>") used to build Lustre.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I07eca22d03f597942284bbbc0bfd1b680ecb199b
Reviewed-on: http://review.whamcloud.com/5661
Tested-by: Hudson
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Alex Zhuravlev [Wed, 13 Feb 2013 11:17:39 +0000 (15:17 +0400)]
LU-2803 osd: osd-zfs to handle echo sequence (2) properly

Use the visible OI (/O directory) to map objects from this sequence
into dnodes. This lets obdecho work with zfs.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I63e50789df502d11863c69658c9524fbb3cd9f22
Reviewed-on: http://review.whamcloud.com/5414
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Niu Yawei [Mon, 18 Mar 2013 03:08:48 +0000 (23:08 -0400)]
LU-2910 clio: skip iov update when tot_nrsegs is zero

When tot_nrsegs is zero, we should skip the iov update too.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I52d4d52a65802f3967dd6a96ad46ec40fd4ef355
Reviewed-on: http://review.whamcloud.com/5747
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Christopher J. Morrone [Thu, 14 Feb 2013 00:06:29 +0000 (16:06 -0800)]
LU-1032 build: Support --disable-maintainer mode

We add "AM_MAINTAINER_MODE([enable])" to all configure
scripts to allow us to use --disable-maintainer-mode.

By default, without the AM_MAINTAINER_MODE macro, autotools
"maintainer mode" is enabled.  By specifying "enable" we
maintain our previous default behaviour.

Change-Id: I88366ad658795145af80ed96c6e708c385799ffa
Signed-off-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-on: http://review.whamcloud.com/5423
Tested-by: Hudson
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Andreas Dilger [Tue, 19 Feb 2013 19:05:40 +0000 (12:05 -0700)]
LU-2468 libcfs: quiet spurious debug message

When cfs_trace_get_tage_try()->cfs_tage_alloc() is allocating a debug
buffer, since e2a2fab993d01597010cb2b44df44a522af0eec8 (b=21776) this
allocation is denied when the allocation is happening in a memory
freeing path.  This caused a spurious "cannot allocate a tage" message
on the console each time.  Quiet that message, since it is expected.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ic800c474dc33f62843b74e06d9ca642cad3ebbe5
Reviewed-on: http://review.whamcloud.com/5470
Tested-by: Hudson
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Dmitry Eremin [Thu, 14 Mar 2013 17:28:50 +0000 (21:28 +0400)]
LU-2424 ptlrpc: reduce initial buffers count

Separate the buffer counts for server and client services.
Reduce the initial allocation for the client ("ldlm_cbd") because
it does not need many buffers. This reduced unreclaimable memory
usage just after mount from 233036kB to 97386kB.
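
The size of the saving follows directly from the two figures above; a short Python calculation for illustration (the variable names are arbitrary):

```python
before_kb = 233036  # unreclaimable memory after mount, before the patch
after_kb = 97386    # same measurement with the patch applied

saved_kb = before_kb - after_kb
saved_pct = 100 * saved_kb / before_kb

print(f"saved {saved_kb} kB ({saved_pct:.1f}% of the original)")
```

That is a reduction of 135650 kB, a little over 58% of the original usage.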

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I56fcbe6c45c61ba4876bce5482169ea06a03638c
Reviewed-on: http://review.whamcloud.com/5719
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Niu Yawei [Sun, 4 Nov 2012 02:42:06 +0000 (10:42 +0800)]
LU-2836 quota: improve test_3 & test_6 of s-q

When approaching the quota limit, the client switches to sync writes, so
tests which assume I/O can finish quickly could fail. We need
to enlarge the time margin to make sure those tests can pass on slow
systems.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ib14a905344eb78cdbd0cd79e3bfd8e50ab21a4d8
Reviewed-on: http://review.whamcloud.com/5539
Tested-by: Hudson
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2922 changelog: recommend e2fsprogs-1.42.6.wc2
Jian Yu [Thu, 14 Mar 2013 16:58:41 +0000 (00:58 +0800)]
LU-2922 changelog: recommend e2fsprogs-1.42.6.wc2

This patch updates lustre/ChangeLog to recommend
a newer e2fsprogs-1.42.6.wc2 release instead of the
old 1.41.90.wc4 version.

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ib56092500545f0373a307d7645d41d91079ae086
Reviewed-on: http://review.whamcloud.com/5718
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-2883 hsm: Mark file DIRTY as soon as pages are written
Aurelien Degremont [Wed, 27 Feb 2013 13:57:34 +0000 (14:57 +0100)]
LU-2883 hsm: Mark file DIRTY as soon as pages are written

Since the dirty flag has to be packed in close, it should be set when
pages are written and not when building BRWs like SOM which relies on
MDS_DONE_WRITING.

Signed-off-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Change-Id: I9cf0a71cf3228a7aadb8205cff2735a7abff5ef0
Reviewed-on: http://review.whamcloud.com/5543
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
7 years ago LU-2622 obdclass: Remove the global cl_env list
Prakash Surya [Wed, 30 Jan 2013 00:26:59 +0000 (16:26 -0800)]
LU-2622 obdclass: Remove the global cl_env list

New cl_env structures are allocated using a SLAB cache specifically
created for allocating and freeing these structures. Without this patch,
when a thread is finished with a cl_env structure it places it on a
global list instead of freeing it back to the SLAB cache. With this
patch, this global list is completely removed, and cl_env structures are
released immediately back to the SLAB cache.

The motivation for this change essentially boils down to this secondary
global list cache being completely unnecessary, and only serving to
serialize any calls to cl_env_get and cl_env_put. This has proven to
cause a severe performance impact on large core count systems,
specifically during memory reclamation (i.e. ll_releasepage).
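
The contention described above can be sketched in a few lines. This is
only a toy model under stated assumptions: Env, CachedListPool,
DirectPool and churn are hypothetical names, a Python lock stands in
for the spinlock, and plain object allocation stands in for the SLAB
cache. It counts how often a get/put pair touches a shared lock.

```python
import threading


class Env:
    """Stand-in for a cl_env structure (hypothetical, for illustration)."""


class CachedListPool:
    """Old scheme: released envs are parked on one global, locked list."""

    def __init__(self):
        self.lock = threading.Lock()
        self.free_list = []
        self.lock_acquisitions = 0

    def get(self):
        with self.lock:
            self.lock_acquisitions += 1
            if self.free_list:
                return self.free_list.pop()
        return Env()

    def put(self, env):
        with self.lock:
            self.lock_acquisitions += 1
            self.free_list.append(env)


class DirectPool:
    """New scheme: get allocates, put frees, and no list is shared."""

    lock_acquisitions = 0

    def get(self):
        return Env()

    def put(self, env):
        del env  # handed straight back to the allocator


def churn(pool, rounds):
    """Run get/put pairs and report how many times the lock was taken."""
    for _ in range(rounds):
        pool.put(pool.get())
    return pool.lock_acquisitions
```

In this model every get and every put on the cached list takes the
same lock, so the number of serialized sections grows with the number
of calls, while the direct scheme takes none.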

For example, on BG/Q Sequoia IO nodes, we've experienced nearly all
68 cores of a machine spinning on the lock protecting this global
list. Some example stack traces showcasing this problem were gathered
using sysrq-l, and are displayed below:

    CPU56:
    Call Trace:
    [c00000000fe3bb30] [c000000000008d1c] .show_stack+0x7c/0x184 (unreliable)
    [c00000000fe3bbe0] [c00000000027604c] .showacpu+0x64/0x94
    [c00000000fe3bc70] [c000000000068b30] .generic_smp_call_function_interrupt+0x10c/0x230
    [c00000000fe3bd40] [c00000000001d11c] .smp_message_recv+0x34/0x78
    [c00000000fe3bdc0] [c00000000002526c] .bgq_ipi_dispatch+0x118/0x18c
    [c00000000fe3be50] [c00000000007b20c] .handle_IRQ_event+0x88/0x18c
    [c00000000fe3bf00] [c00000000007dc90] .handle_percpu_irq+0x8c/0x100
    [c00000000fe3bf90] [c00000000001b808] .call_handle_irq+0x1c/0x2c
    [c0000003e1c4a4c0] [c0000000000059f0] .do_IRQ+0x154/0x1e0
    [c0000003e1c4a570] [c0000000000144dc] exc_external_input_book3e+0x110/0x114
    --- Exception: 501 at ._raw_spin_lock+0xd8/0x1a8
        LR = ._raw_spin_lock+0x104/0x1a8
    [c0000003e1c4a860] [8000000000b04f38] libcfs_nidstrings+0x2acc/0xfffffffffffe5824 [libcfs] (unreliable)
    [c0000003e1c4a910] [c00000000042d4cc] ._spin_lock+0x10/0x24
    [c0000003e1c4a980] [80000000024c2f4c] .cl_env_get+0xec/0x480 [obdclass]
    [c0000003e1c4aa60] [80000000024c336c] .cl_env_nested_get+0x8c/0xf0 [obdclass]
    [c0000003e1c4aaf0] [800000000692070c] .ll_releasepage+0xbc/0x200 [lustre]
    [c0000003e1c4aba0] [c000000000094110] .try_to_release_page+0x68/0x8c
    [c0000003e1c4ac10] [c0000000000a4190] .shrink_page_list.clone.0+0x3d8/0x63c
    [c0000003e1c4adc0] [c0000000000a47d8] .shrink_inactive_list+0x3e4/0x690
    [c0000003e1c4af90] [c0000000000a4f54] .shrink_zone+0x4d0/0x4d4
    [c0000003e1c4b0c0] [c0000000000a5a68] .try_to_free_pages+0x204/0x3d0
    [c0000003e1c4b220] [c00000000009d044] .__alloc_pages_nodemask+0x460/0x738
    [c0000003e1c4b3a0] [c000000000095af4] .grab_cache_page_write_begin+0x7c/0xec
    [c0000003e1c4b450] [8000000006920964] .ll_write_begin+0x94/0x270 [lustre]
    [c0000003e1c4b520] [c0000000000968c8] .generic_file_buffered_write+0x148/0x374
    [c0000003e1c4b660] [c000000000097050] .__generic_file_aio_write+0x374/0x3d8
    [c0000003e1c4b760] [c00000000009712c] .generic_file_aio_write+0x78/0xe8
    [c0000003e1c4b810] [800000000693ed4c] .vvp_io_write_start+0xfc/0x3e0 [lustre]
    [c0000003e1c4b8e0] [80000000024d9c6c] .cl_io_start+0xcc/0x220 [obdclass]
    [c0000003e1c4b980] [80000000024e1a84] .cl_io_loop+0x194/0x2c0 [obdclass]
    [c0000003e1c4ba30] [80000000068ba1d8] .ll_file_io_generic+0x498/0x670 [lustre]
    [c0000003e1c4bb30] [80000000068ba834] .ll_file_aio_write+0x1d4/0x3a0 [lustre]
    [c0000003e1c4bc00] [80000000068bab50] .ll_file_write+0x150/0x320 [lustre]
    [c0000003e1c4bce0] [c0000000000d1ba8] .vfs_write+0xd0/0x1c4
    [c0000003e1c4bd80] [c0000000000d1d98] .SyS_write+0x54/0x98
    [c0000003e1c4be30] [c000000000000580] syscall_exit+0x0/0x2c

    CPU63:
    Call Trace:
    [c00000000fe03b30] [c000000000008d1c] .show_stack+0x7c/0x184 (unreliable)
    [c00000000fe03be0] [c00000000027604c] .showacpu+0x64/0x94
    [c00000000fe03c70] [c000000000068b30] .generic_smp_call_function_interrupt+0x10c/0x230
    [c00000000fe03d40] [c00000000001d11c] .smp_message_recv+0x34/0x78
    [c00000000fe03dc0] [c00000000002526c] .bgq_ipi_dispatch+0x118/0x18c
    [c00000000fe03e50] [c00000000007b20c] .handle_IRQ_event+0x88/0x18c
    [c00000000fe03f00] [c00000000007dc90] .handle_percpu_irq+0x8c/0x100
    [c00000000fe03f90] [c00000000001b808] .call_handle_irq+0x1c/0x2c
    [c0000003c4f0a510] [c0000000000059f0] .do_IRQ+0x154/0x1e0
    [c0000003c4f0a5c0] [c0000000000144dc] exc_external_input_book3e+0x110/0x114
    --- Exception: 501 at ._raw_spin_lock+0xdc/0x1a8
        LR = ._raw_spin_lock+0x104/0x1a8
    [c0000003c4f0a8b0] [800000000697a578] msgdata.87439+0x20/0xfffffffffffccf88 [lustre] (unreliable)
    [c0000003c4f0a960] [c00000000042d4cc] ._spin_lock+0x10/0x24
    [c0000003c4f0a9d0] [80000000024c17e8] .cl_env_put+0x178/0x420 [obdclass]
    [c0000003c4f0aa70] [80000000024c1ab0] .cl_env_nested_put+0x20/0x40 [obdclass]
    [c0000003c4f0aaf0] [8000000006920794] .ll_releasepage+0x144/0x200 [lustre]
    [c0000003c4f0aba0] [c000000000094110] .try_to_release_page+0x68/0x8c
    [c0000003c4f0ac10] [c0000000000a4190] .shrink_page_list.clone.0+0x3d8/0x63c
    [c0000003c4f0adc0] [c0000000000a47d8] .shrink_inactive_list+0x3e4/0x690
    [c0000003c4f0af90] [c0000000000a4f54] .shrink_zone+0x4d0/0x4d4
    [c0000003c4f0b0c0] [c0000000000a5a68] .try_to_free_pages+0x204/0x3d0
    [c0000003c4f0b220] [c00000000009d044] .__alloc_pages_nodemask+0x460/0x738
    [c0000003c4f0b3a0] [c000000000095af4] .grab_cache_page_write_begin+0x7c/0xec
    [c0000003c4f0b450] [8000000006920964] .ll_write_begin+0x94/0x270 [lustre]
    [c0000003c4f0b520] [c0000000000968c8] .generic_file_buffered_write+0x148/0x374
    [c0000003c4f0b660] [c000000000097050] .__generic_file_aio_write+0x374/0x3d8
    [c0000003c4f0b760] [c00000000009712c] .generic_file_aio_write+0x78/0xe8
    [c0000003c4f0b810] [800000000693ed4c] .vvp_io_write_start+0xfc/0x3e0 [lustre]
    [c0000003c4f0b8e0] [80000000024d9c6c] .cl_io_start+0xcc/0x220 [obdclass]
    [c0000003c4f0b980] [80000000024e1a84] .cl_io_loop+0x194/0x2c0 [obdclass]
    [c0000003c4f0ba30] [80000000068ba1d8] .ll_file_io_generic+0x498/0x670 [lustre]
    [c0000003c4f0bb30] [80000000068ba834] .ll_file_aio_write+0x1d4/0x3a0 [lustre]
    [c0000003c4f0bc00] [80000000068bab50] .ll_file_write+0x150/0x320 [lustre]
    [c0000003c4f0bce0] [c0000000000d1ba8] .vfs_write+0xd0/0x1c4
    [c0000003c4f0bd80] [c0000000000d1d98] .SyS_write+0x54/0x98
    [c0000003c4f0be30] [c000000000000580] syscall_exit+0x0/0x2c

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Change-Id: Ief4b524784e07d7677ecb8a9ce97a7b54ccc6f75
Reviewed-on: http://review.whamcloud.com/5204
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2240 osd: change FID of /ROOT on zfs
wangdi [Sun, 29 Dec 2013 08:08:07 +0000 (00:08 -0800)]
LU-2240 osd: change FID of /ROOT on zfs

Pre-production 2.4 code used FID_SEQ_LOCAL_FILE for /ROOT. With
ldiskfs that FID turns into an IGIF, which is mapped to MDT0 permanently.
With ZFS the original local sequence was used, which makes existing
setups incompatible with DNE. The intention of the patch is to fix this
on existing ZFS setups and replace the FID with one from the special
FID_SEQ_ROOT sequence, which is mapped to MDT0 as well. For simplicity
this is done in a few steps:

 - osd-zfs replaces direntry for /ROOT with the new FID and fixes OI
   so that the new FID is mapped to the same dnode
 - MDD finds all objects listed in /ROOT and updates linkEA properly
 - MDD removes ./.. which may be on disk for pre-production setups
 - finally MDD resets LMA on /ROOT with the new FID, which is later
   used to recognize already converted filesystems (or ones created
   with correct FID from the beginning) and skip this conversion code

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Di Wang <di.wang@intel.com>
Change-Id: I03062c6909146f9a3aed72f41c0708f9ef92bb82
Reviewed-on: http://review.whamcloud.com/5249
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
7 years ago LU-951 test: waiting import state in fail().
yangsheng [Tue, 26 Feb 2013 08:40:24 +0000 (16:40 +0800)]
LU-951 test: waiting import state in fail().

There is still a rare chance that a request meets an invalid
import after fail() returns, so we should wait for the import
to be restored to a certain state before doing the next operation.
Add wait_import_state_mount() to check the import state while
the client has a mount point.

Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: I55ccb0fd30d69eae651978804ef1e303d9939a71
Reviewed-on: http://review.whamcloud.com/5531
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
7 years ago LU-2712 tests: enable sanity/sanityn SLOW tests
Andreas Dilger [Wed, 30 Jan 2013 22:07:04 +0000 (15:07 -0700)]
LU-2712 tests: enable sanity/sanityn SLOW tests

Enable a number of subtests in sanity.sh and sanityn.sh that are
currently skipped for normal "review" builds because of "SLOW=no".
Checking the maloo test history for these tests, they are verified
as always passing for "full" builds in the SLOW=yes case.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I340ccdf1323d14a34dc87a5217b3256c8f62cab0
Reviewed-on: http://review.whamcloud.com/5218
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-1346 libcfs: remove cfs_ file wrappers
John L. Hammond [Wed, 13 Mar 2013 19:46:08 +0000 (14:46 -0500)]
LU-1346 libcfs: remove cfs_ file wrappers

Replace file-related wrappers with the kernel API.

Affected primitives:
file, dentry, dirent, kstatfs, filp_size, filp_poff,
filp_open, do_fsync, filp_close, filp_read, filp_write,
filp_fsync, get_file, fget, fput, file_count, flock_t,
flock_type, flock_set_type, flock_pid, flock_set_pid,
flock_start, flock_set_start, flock_end, flock_set_end.

completion, init_completion, fini_completion,
wait_for_completion, complete

Change some API implementations of darwin/winnt to make them
consistent with the Linux kernel API, such as filp_open/filp_close etc.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ibe00df71c658aeb5dda854481f6ab5c181b3de7b
Reviewed-on: http://review.whamcloud.com/2830
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years ago LU-2910 clio: restore iov when restart io
Niu Yawei [Fri, 8 Mar 2013 04:58:11 +0000 (23:58 -0500)]
LU-2910 clio: restore iov when restart io

The iovector needs to be restored on restarted io. This patch also
adds some LASSERTs and debug messages to make it easier to debug
this kind of problem later.

Test-Parameters: testlist=racer
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I5d737cfa083ae3b3d9f040a2dc36d6b8693b548b
Reviewed-on: http://review.whamcloud.com/5652
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2871 lod: stripe data across the OSTs correctly
Emoly Liu [Wed, 13 Mar 2013 15:51:37 +0000 (23:51 +0800)]
LU-2871 lod: stripe data across the OSTs correctly

Since the ost-in-use array is initialized with 0 and isn't set with
OST index number correctly, OST0 is always skipped when allocating
objects on OSTs with specific stripe offset (offset > 0).

For example, when running command "lfs setstripe -c -1 -i 2 testfile"
on 4 OSTs, we will get a wrong layout, like
        obdidx           objid           objid           group
             2               3            0x3                0
             3               3            0x3                0
             1               4            0x4                0
             2               4            0x4                0

To fix the problem, we initialize the array with -1 instead, and store
the correct OST index number in it.
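
The effect can be reproduced with a small model. This is a simplified
sketch, not the lod allocator itself: allocate_stripes and its
parameters are hypothetical, and the loop structure (walk the OSTs
round-robin from the start offset, skip any index found in the in-use
array) is an assumption made for illustration.

```python
def allocate_stripes(ost_count, stripe_count, start, fixed):
    """Pick OST indices round-robin from 'start', skipping in-use OSTs.

    fixed=False models the old behaviour: the in-use array starts as
    all zeros and the chosen index is never recorded in it.
    fixed=True models the patch: start with -1 and record each index.
    """
    in_use = [-1 if fixed else 0] * stripe_count
    layout = []
    idx = start
    attempts = 0
    # Bound the walk so the model always terminates.
    while len(layout) < stripe_count and attempts < 2 * ost_count:
        attempts += 1
        ost = idx % ost_count
        idx += 1
        if ost in in_use:
            continue  # looks "already used", so skip it
        if fixed:
            in_use[len(layout)] = ost
        layout.append(ost)
    return layout
```

With 4 OSTs, 4 stripes and a start offset of 2, the unfixed model
skips OST0 (it matches the zero filler) and lands on OST2 twice, which
matches the obdidx column in the wrong layout shown above.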

Signed-off-by: Liu Ying <emoly.liu@intel.com>
Change-Id: I5eeced6b66ae40771f8896204c5a6ed8e6663e57
Reviewed-on: http://review.whamcloud.com/5554
Tested-by: Hudson
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
7 years ago LU-2701 osp: wake up sync thread
Alex Zhuravlev [Tue, 19 Feb 2013 08:02:14 +0000 (12:02 +0400)]
LU-2701 osp: wake up sync thread

Make osp_sync_process_committed() wake up the sync thread when it
is requested to stop (e.g. umount) and there is no pending
work left. The patch also adds a sanity check to ensure this
process is not taking too long.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I5251013afc2aee55627c806a11eb826a9d3dbec9
Reviewed-on: http://review.whamcloud.com/5463
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2925 out: increase reqbuf size for OUT
Liang Zhen [Fri, 8 Mar 2013 05:20:44 +0000 (13:20 +0800)]
LU-2925 out: increase reqbuf size for OUT

OUT service for DNE can have request size up to 9K, however, it's
using default definition of MDS request buffer size which is 5K.
This patch added individual definitions for OUT reqsize and
req_bufsize.
This patch also made some changes to MDS_BUFSIZE and MDS_LOV_BUFSIZE
to unify style of macros.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I102d6b2aed1e0ed495f055fa3de0c7de7de8c28d
Reviewed-on: http://review.whamcloud.com/5653
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years ago LU-2735 test: disable LRU to avoid page deletion
Hongchao Zhang [Tue, 5 Feb 2013 05:07:38 +0000 (13:07 +0800)]
LU-2735 test: disable LRU to avoid page deletion

In sanity.sh, subtest 151 checks whether the page written previously
is in cache or not. This patch disables the LRU of objects on the OST
to avoid the object and its content (pages) being dropped.

Change-Id: Ie481d13215dac599e0b7e122fcc7e9819a053af5
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/5475
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2852 script: sanity 27u should cleanup test dir
Lai Siyao [Mon, 11 Mar 2013 16:11:27 +0000 (00:11 +0800)]
LU-2852 script: sanity 27u should cleanup test dir

sanity 27u should clean up the test dir before the test in case there
are remaining files from previous tests.

Remove 27u from ALWAYS_EXCEPT since it can pass now.

Signed-off-by: Lai Siyao <lai.siyao@intel.com>
Change-Id: Id21589c1e1ecf0f4143fad3574cd779c73adb7aa
Reviewed-on: http://review.whamcloud.com/5670
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2449 osd: lookup(..) to fetch fid from parent's LMA
Alex Zhuravlev [Thu, 7 Mar 2013 11:09:52 +0000 (15:09 +0400)]
LU-2449 osd: lookup(..) to fetch fid from parent's LMA

In case LinkEA is not accessible for some reason, osd-zfs
will try to get the fid for ".." using the parent's dnode,
which is stored in regular ZFS attributes. The parent's
LMA can be used to get the fid for "..".

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I34e9f884eb60f036c2f941013bf22e154efc2ff4
Reviewed-on: http://review.whamcloud.com/5629
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
7 years ago LU-2829 tests: sanityn/33a cleanup messages for zfs
Nathaniel Clark [Thu, 7 Mar 2013 15:26:19 +0000 (10:26 -0500)]
LU-2829 tests: sanityn/33a cleanup messages for zfs

Remove ldiskfs specific checks from running on zfs.
Account for lvm disks for ldiskfs.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I15cdb33d095c465b117761d40d61579eb3fbf52a
Reviewed-on: http://review.whamcloud.com/5693
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
7 years ago LU-2926 ldiskfs: crash in is_bad_inode()
Andriy Skulysh [Thu, 7 Mar 2013 13:02:57 +0000 (15:02 +0200)]
LU-2926 ldiskfs: crash in is_bad_inode()

Fix error handling in ldiskfs_xattr_inode_iget

Xyratex-bug-id: MRP-883
Signed-off-by: Andriy Skulysh <Andriy_Skulysh@xyratex.com>
Change-Id: I9840b7bd32f2c96763cae402a7bcb51d5798ea6c
Reviewed-on: http://review.whamcloud.com/5631
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years ago LU-2809 llite: Do not return layout_gen for getxattr
Jinshan Xiong [Sat, 9 Mar 2013 01:07:13 +0000 (17:07 -0800)]
LU-2809 llite: Do not return layout_gen for getxattr

The problem is that layout_gen and stripe_offset are sharing the
same field in lov_user_md{}. Layout gen would be wrongly interpreted
as stripe_offset when a backup is restored.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: If2c120bf861eeffd2db5f92e1d23cb1b9a2f5c63
Reviewed-on: http://review.whamcloud.com/5664
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com>
7 years ago LU-2816 llite: Set RAS_INCREASE_STEP correctly
Jinshan Xiong [Tue, 12 Mar 2013 22:52:55 +0000 (15:52 -0700)]
LU-2816 llite: Set RAS_INCREASE_STEP correctly

RAS_INCREASE_STEP is by pages instead of by bytes.

However, I found that setting it to 4MB caused a performance loss, so
I set it back to 1MB. After 4MB RPC is enabled by default, more work
should be done to pick the right value of max_read_ahead_mb to
maximize the performance gain of 4MB RPC.
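
The unit mix-up is easy to check with a worked conversion. This is a
minimal sketch: ras_increase_step_pages is a hypothetical helper, and
the 4096-byte page size is an assumption (the real value depends on
the architecture).

```python
MIB = 1024 * 1024


def ras_increase_step_pages(step_bytes, page_size=4096):
    """Convert a read-ahead step given in bytes into a page count."""
    return step_bytes // page_size


# A 1MB step is 256 pages with 4KB pages, and a 4MB step is 1024 pages.
one_mb_step = ras_increase_step_pages(1 * MIB)
four_mb_step = ras_increase_step_pages(4 * MIB)

# If a byte count is used where a page count is expected, a "1MB" step
# silently becomes 1048576 pages, which is 4GB worth of 4KB pages.
mistaken_bytes = (1 * MIB) * 4096
```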

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ic000f266cfefc827e112e03f76cf467c73ba88ad
Reviewed-on: http://review.whamcloud.com/5691
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2775 fid: allow FID-on-OST in fid_seq_is_mdt()
Andreas Dilger [Mon, 18 Feb 2013 06:58:14 +0000 (23:58 -0700)]
LU-2775 fid: allow FID-on-OST in fid_seq_is_mdt()

The LASSERT_SEQ_IS_MDT() macro used fid_seq_is_mdt() in several
places to verify that a FID was "sane" for where it was being used.
However, there should never be LASSERTs for data from the network
or disk.

The use of LASSERT_SEQ_IS_MDT() is removed from clients, since this
is "validating" data from the network.  This macro is no longer used,
so remove it.  The old CMD objseq_to_mdsno() and mdt_to_obd_objseq()
helpers are also no longer needed for DNE, and can be removed.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I74bc9198799045b8bd91510cb45e8f876012cab0
Reviewed-on: http://review.whamcloud.com/5456
Tested-by: Hudson
Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2918 tests: sanity 184c to start copy before layout swap
Jinshan Xiong [Wed, 6 Mar 2013 18:25:38 +0000 (10:25 -0800)]
LU-2918 tests: sanity 184c to start copy before layout swap

The test script has to make sure dd has already started copying the
file before swapping the layout, otherwise the attempt to open file1
would fail.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I8217c9a38f1d09830c0ab259f65c9716c06736d1
Reviewed-on: http://review.whamcloud.com/5617
Tested-by: Hudson
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2791 ldlm: release reference against failed lock
Fan Yong [Fri, 8 Feb 2013 19:33:52 +0000 (03:33 +0800)]
LU-2791 ldlm: release reference against failed lock

On the client side, when ldlm_cli_enqueue_fini() gets a reply from
the server that contains an unexpected LVB size, it marks the
lock as failed but does not release one reference. The leaked
reference prevents the lock from being freed, and unmounting the
client then blocks.

The ldlm_lvbo_fill() caller should handle failure cases.
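
The leak can be modelled with a plain counter. This is a sketch under
stated assumptions, not the ldlm code: Lock and enqueue_fini are
hypothetical, and where exactly the reference is taken is assumed for
illustration. It only shows why an error path that skips the put
leaves the lock pinned.

```python
class Lock:
    """Hypothetical lock object with a plain reference count."""

    def __init__(self):
        self.refcount = 0
        self.failed = False

    def get(self):
        self.refcount += 1

    def put(self):
        self.refcount -= 1


def enqueue_fini(lock, lvb_size, expected_lvb_size, release_on_failure):
    """Process an enqueue reply while holding one reference on the lock."""
    lock.get()
    if lvb_size != expected_lvb_size:
        lock.failed = True
        if release_on_failure:
            lock.put()  # the release that the error path was missing
        return False
    lock.put()
    return True
```

A lock whose count never returns to zero can never be freed, which is
the condition that blocks the unmount in the description above.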

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I197759a0b964e028627ecb6025820db9517fad7e
Reviewed-on: http://review.whamcloud.com/5634
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years ago LU-2942 ptlrpc: Fix an unswabbed status check in after_reply()
Li Wei [Mon, 11 Mar 2013 07:18:49 +0000 (15:18 +0800)]
LU-2942 ptlrpc: Fix an unswabbed status check in after_reply()

The -EINPROGRESS handling in after_reply() checks reply status while
pb_status still contains unswabbed data.  This patch moves the block
below the unpack_reply() call.
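
Why the ordering matters can be shown with a four-byte status field.
This is an illustrative sketch only: swab32 is a hypothetical helper,
the reply is reduced to a single 32-bit integer, and 115 is assumed as
the Linux value of EINPROGRESS.

```python
import struct

EINPROGRESS = 115  # Linux errno value, assumed for this sketch


def swab32(buf):
    """Reverse the byte order of a 4-byte field."""
    return buf[::-1]


# A big-endian peer sends -EINPROGRESS; the reader is little-endian.
wire = struct.pack(">i", -EINPROGRESS)

# Checking the field before it is swabbed sees a meaningless number.
status_before_swab = struct.unpack("<i", wire)[0]

# Checking after the swab sees the status the peer actually sent.
status_after_swab = struct.unpack("<i", swab32(wire))[0]
```

A comparison against -EINPROGRESS made on status_before_swab can never
match for such a peer, so the check only works once the reply has been
unpacked.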

Change-Id: I619b26d213708b8f5c250cdf085f359bba31ffae
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/5667
Tested-by: Hudson
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years ago LU-2713 hsm: limit HSM RPC count from client
John L. Hammond [Fri, 8 Mar 2013 18:42:31 +0000 (12:42 -0600)]
LU-2713 hsm: limit HSM RPC count from client

Put HSM RPCs under the control of the max_rpcs_in_flight param which
limits the number of concurrent RPCs in flight between a single client
and an MDT.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I760333c7391ffd5aafca396b5fd97ef139799076
Reviewed-on: http://review.whamcloud.com/5616
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2936 ptlrpc: Do not try to fetch hp request blindly
Oleg Drokin [Sat, 9 Mar 2013 23:44:36 +0000 (18:44 -0500)]
LU-2936 ptlrpc: Do not try to fetch hp request blindly

ptlrpc_svcpt_health_check tries to blindly fetch an hp request from
any service it happens to be called on, but some services don't have
any hp policies registered, resulting in an underlying assertion
in nrs_svcpt2nrs.
Make sure there are in fact pending hp requests on a service before
attempting to fetch them.
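
The guard can be sketched as follows. This is a toy model, not the
ptlrpc code: ServicePartition, its fields and both health-check
functions are hypothetical, and a RuntimeError stands in for the
assertion.

```python
class ServicePartition:
    """Hypothetical stand-in for a ptlrpc service partition."""

    def __init__(self, has_hp_policy, hp_pending=0):
        self.has_hp_policy = has_hp_policy
        self.hp_pending = hp_pending

    def fetch_hp_request(self):
        # Fetching is only legal when an hp policy is registered.
        if not self.has_hp_policy:
            raise RuntimeError("no hp policy registered")
        return "hp-request"


def health_check_blind(svcpt):
    """Old behaviour: always try to fetch an hp request."""
    return svcpt.fetch_hp_request()


def health_check_guarded(svcpt):
    """Patched behaviour: fetch only when hp requests are pending."""
    if svcpt.hp_pending > 0:
        return svcpt.fetch_hp_request()
    return None
```

A service without hp policies never has pending hp requests in this
model, so the guarded check returns without reaching the failing path.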

Change-Id: Ia38dcce758db948a1e4c187d009da4a8d5f2cbc6
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/5665
Tested-by: Hudson
Reviewed-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
7 years ago LU-2190 utils: Use LDD_PREFIX in LDD_*_PROP macros
Prakash Surya [Mon, 11 Mar 2013 16:27:39 +0000 (09:27 -0700)]
LU-2190 utils: Use LDD_PREFIX in LDD_*_PROP macros

This patch replaces the hard coded "lustre:" string used in the
LDD_*_PROP macro definitions with the LDD_PREFIX macro. This is
functionally equivalent, but eases the burden if the prefix string
is ever changed (and subjectively makes the code easier to read).

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Change-Id: Ief0e710b90cf922eb4d4cc8a162ee1b0d21317b4
Reviewed-on: http://review.whamcloud.com/5671
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years ago LU-2523 mdt: handle -ENOENT && !MDS_OPEN_CREAT in reint open
John L. Hammond [Wed, 27 Feb 2013 07:12:39 +0000 (01:12 -0600)]
LU-2523 mdt: handle -ENOENT && !MDS_OPEN_CREAT in reint open

If mdt_open_by_fid_lock() returns -ENOENT and MDS_OPEN_CREAT is not
set in create flags then bail out and return -ENOENT from
mdt_reint_open().  In mdt_open_by_fid_lock() if -ENOENT is returned
then ensure that DISP_IT_EXECD is set in the reply disposition.  Add
sanity test_27B to call the LL_IOC_LOV_SETSTRIPE ioctl() on an open
unlinked file, a situation which triggers the first case.  In sanityn
test 30, remove the requirement that opening an unlinked file via
/proc/PID/exe return -ESTALE.  In racer's file_create.sh, reinstate
the call to 'lfs setstripe' before dd is started, effectively
reverting commit 0be1c87fe2d4ffddaca9e568cc137518b7368b2d.
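
The first case above reduces to one condition. This is a minimal
sketch: open_by_fid_result is a hypothetical helper, the bit value
used for MDS_OPEN_CREAT is a placeholder for this sketch, and None
stands for "continue with the rest of the open path", which is not
modelled here.

```python
import errno

MDS_OPEN_CREAT = 0o100  # placeholder bit value for this sketch


def open_by_fid_result(lookup_rc, create_flags):
    """Return -ENOENT to bail out, or None to continue the open."""
    if lookup_rc == -errno.ENOENT and not (create_flags & MDS_OPEN_CREAT):
        return -errno.ENOENT
    return None
```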

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I74e2297c4210ff18acbf60efdf51049d9a88cbea
Reviewed-on: http://review.whamcloud.com/5417
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Hudson
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-1468 o2iblnd: Support OFED-3.5 for o2ib
Shuichi Ihara [Sun, 6 Jan 2013 09:40:39 +0000 (18:40 +0900)]
LU-1468 o2iblnd: Support OFED-3.5 for o2ib

OFED has a new structure based on Linux kernel code plus backports
and packaging. Detailed information is here:
http://lists.openfabrics.org/pipermail/ewg/2011-December/017156.html

These patches are the Lustre build improvements to support OFED 3.5
and whatever 3.x OFED releases follow.

Signed-off-by: Shuichi Ihara <sihara@ddn.com>
Change-Id: Id4ffc39bc7fc24cc591bf6fb47e9b0e662993bda
Reviewed-on: http://review.whamcloud.com/3011
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2449 osd: osd-zfs to initialize parent attribute
Alex Zhuravlev [Thu, 7 Mar 2013 19:06:18 +0000 (23:06 +0400)]
LU-2449 osd: osd-zfs to initialize parent attribute

To follow the ZFS on-disk format, osd_object_create() should
initialize the regular attribute storing the parent dnode.
The parent dnode is taken from the object passed to create as an
allocation hint.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I1b4b4b5d7c6989c39f6bdecd52af48f270ad5beb
Reviewed-on: http://review.whamcloud.com/5642
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2748 osd: allocate buffers on demand
Alex Zhuravlev [Fri, 15 Feb 2013 09:19:29 +0000 (13:19 +0400)]
LU-2748 osd: allocate buffers on demand

Instead of putting a lot of buffers statically within osd_thread_info,
we can allocate them on first demand within this thread.
We can also allocate not the maximum but some optimal amount, and
reallocate if really needed. dr_created is not used, so it is removed.
The number of blocks is calculated using the actual blocksize, not the
smallest one, so there is no need to multiply by 8 in 99.9% of cases.

with PTLRPC_MAX_BRW_PAGES=1024 (as default in master branch) and
regular 1MB IO,
before: sizeof(struct osd_thread_info) = 82104
after:  sizeof(struct osd_thread_info) = 4328 + 4K (if IO thread)

This should improve threads not doing IO: all MDS threads, LDLM
threads, MGS threads.
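
The allocate-on-demand pattern and the size figures above can be
sketched together. This is an illustrative model only: ThreadInfo and
its iobuf method are hypothetical, the 4096-byte starting size is taken
from the "+ 4K" figure, and the buffer is treated as scratch space
whose old contents need not survive a reallocation.

```python
class ThreadInfo:
    """Per-thread state that allocates its IO buffer on first use."""

    def __init__(self):
        self._iobuf = None
        self.allocations = 0

    def iobuf(self, needed, initial=4096):
        """Return a scratch buffer of at least 'needed' bytes."""
        if self._iobuf is None or len(self._iobuf) < needed:
            # Start from a modest size and grow only when required.
            self._iobuf = bytearray(max(needed, initial))
            self.allocations += 1
        return self._iobuf


# Sizes quoted in the message above.
BEFORE = 82104
AFTER_BASE = 4328
AFTER_IO = AFTER_BASE + 4096

saved_non_io_thread = BEFORE - AFTER_BASE
saved_io_thread = BEFORE - AFTER_IO
```

A thread that never calls iobuf() allocates nothing, which is the case
for the threads listed as not doing IO.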

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ie07780537a4598c6a888ed9be4ef0bbb0d9b3d54
Reviewed-on: http://review.whamcloud.com/5444
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years ago LU-2899 lod: get ldo_stripenr correctly
Bobi Jam [Mon, 4 Mar 2013 11:21:15 +0000 (19:21 +0800)]
LU-2899 lod: get ldo_stripenr correctly

Current code relies on lod_statfs_and_check() to count the number of
activated LOD targets, while lod::ldo_stripenr derivation happens
before calling lod_statfs_and_check(), and that makes
lod::ldo_stripenr not accurate.

This patch makes sure lod_statfs_and_check() is called before updating
::ldo_stripenr. If an OST target is [de]activated, the client needs
to wait 2*lod_qos_maxage seconds to get an accurate
ld_active_tgt_count number.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I37bebc69f876dd68da609fb5180bc6db36f01e84
Reviewed-on: http://review.whamcloud.com/5573
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years ago LU-2804 mdd: move ACL mode handling to MDD
wangdi [Sat, 30 Nov 2013 18:13:40 +0000 (10:13 -0800)]
LU-2804 mdd: move ACL mode handling to MDD

Move ACL mode handling from OSD to MDD, so that the mode can be set
correctly on both ldiskfs and zfs. This also avoids transferring
the local mask to the remote MDT for fixing the mode.

Move sanity 103 out of the sanity SLOW list so that it runs in the
normal review/landing check, to avoid ACL regressions.
It also takes only 15 seconds in my local VM run, so it probably
should not be on the SLOW list at all.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I73146425baa7d8b712ce46e18955ecaa2a3fd9a4
Reviewed-on: http://review.whamcloud.com/5421
Tested-by: Hudson
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2694 test: improve lfsck.sh
Niu Yawei [Wed, 6 Mar 2013 08:50:32 +0000 (03:50 -0500)]
LU-2694 test: improve lfsck.sh

The test directory of lfsck.sh contains some files referencing the
same object, which could cause an error when removing the directory
on test cleanup.

Test-Parameters: testlist=lfsck
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I334ff2b7b5f77498eed940f009e4bc18728bb5da
Reviewed-on: http://review.whamcloud.com/5606
Tested-by: Hudson
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2694 test: fix is_empty_fs in t-f
Niu Yawei [Mon, 4 Mar 2013 14:25:25 +0000 (09:25 -0500)]
LU-2694 test: fix is_empty_fs in t-f

The original is_empty_fs is incorrect, which can cause unexpected
errors when running lfsck.sh, because lfsck sets the max error
level based on whether the filesystem is empty.

Test-Parameters: testlist=lfsck
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I2269cc41744e3c9fe228898323a1508a03616efe
Reviewed-on: http://review.whamcloud.com/5576
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Hudson
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2730 mdt: fix erroneous LASSERT in mdt_reint_opcode
Nathaniel Clark [Wed, 13 Feb 2013 14:33:30 +0000 (09:33 -0500)]
LU-2730 mdt: fix erroneous LASSERT in mdt_reint_opcode

Only set return code err_serious(-EFAULT) when necessary and do not
run an unknown opcode through err_serious.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Iea616d8afd676ee5ccda52cf09b398198f38f992
Reviewed-on: http://review.whamcloud.com/5416
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years agoLU-2859 osc: unplug IO queue async to avoid stack overflow
Jinshan Xiong [Tue, 26 Feb 2013 01:31:54 +0000 (09:31 +0800)]
LU-2859 osc: unplug IO queue async to avoid stack overflow

Otherwise, there is a stack overflow problem with the following
stacktrace:

18 cfs_trace_unlock_tcd at ffffffffa034336c [libcfs]
19 libcfs_debug_vmsg2 at ffffffffa0354038 [libcfs]
20 libcfs_debug_msg at ffffffffa03545f1 [libcfs]
21 osc_key_init at ffffffffa0a90e17 [osc]
22 keys_fill at ffffffffa06d8c3f [obdclass]
23 lu_context_init at ffffffffa06dcc6b [obdclass]
24 lu_env_init at ffffffffa06dce3e [obdclass]
25 cl_env_new at ffffffffa06e40bd [obdclass]
26 cl_env_get at ffffffffa06e4acb [obdclass]
27 lov_sub_get at ffffffffa0b3777d [lov]
28 lov_page_subio at ffffffffa0b37c5d [lov]
29 lov_page_own at ffffffffa0b311af [lov]
30 cl_page_own0 at ffffffffa06e8b5b [obdclass]
31 cl_page_own_try at ffffffffa06e8db3 [obdclass]
32 discard_pagevec at ffffffffa0a926a9 [osc]
33 osc_lru_shrink at ffffffffa0a935d9 [osc]
34 osc_lru_del at ffffffffa0a94aa6 [osc]
35 osc_page_delete at ffffffffa0a951b4 [osc]
36 cl_page_delete0 at ffffffffa06e99e5 [obdclass]
37 cl_page_delete at ffffffffa06e9e62 [obdclass]
38 ll_releasepage at ffffffffa0bfe41b [lustre]
39 try_to_release_page at ffffffff81110070
40 shrink_page_list.clone.0 at ffffffff8112a501
41 shrink_inactive_list at ffffffff8112a8cb
42 shrink_zone at ffffffff8112b5df
43 zone_reclaim at ffffffff8112c384
44 get_page_from_freelist at ffffffff81122834
45 __alloc_pages_nodemask at ffffffff81123ab1
46 kmem_getpages at ffffffff8115e2f2
47 cache_grow at ffffffff8115e95f
48 cache_alloc_refill at ffffffff8115ebb2
49 __kmalloc at ffffffff8115f8d9
50 cfs_alloc at ffffffffa0344c40 [libcfs]
51 ptlrpc_request_alloc_internal at ffffffffa085d407 [ptlrpc]
52 ptlrpc_request_alloc_pool at ffffffffa085d66e [ptlrpc]
53 osc_brw_prep_request at ffffffffa0a8451b [osc]
54 osc_build_rpc at ffffffffa0a8a513 [osc]
55 osc_io_unplug0 at ffffffffa0aa642d [osc]
56 osc_io_unplug at ffffffffa0aa7ce1 [osc]
57 osc_enter_cache at ffffffffa0aa8473 [osc]
58 osc_queue_async_io at ffffffffa0aae916 [osc]
59 osc_page_cache_add at ffffffffa0a94fc9 [osc]
60 cl_page_cache_add at ffffffffa06e61d7 [obdclass]
61 lov_page_cache_add at ffffffffa0b31325 [lov]
62 cl_page_cache_add at ffffffffa06e61d7 [obdclass]
63 vvp_io_commit_write at ffffffffa0c1161d [lustre]
64 cl_io_commit_write at ffffffffa06f5b1d [obdclass]
65 ll_commit_write at ffffffffa0be68be [lustre]
66 ll_write_end at ffffffffa0bfe4e0 [lustre]
67 generic_file_buffered_write at ffffffff81111684
68 __generic_file_aio_write at ffffffff81112f70
69 generic_file_aio_write at ffffffff8111320f
70 vvp_io_write_start at ffffffffa0c11f3c [lustre]
71 cl_io_start at ffffffffa06f244a [obdclass]
72 cl_io_loop at ffffffffa06f6d54 [obdclass]
73 ll_file_io_generic at ffffffffa0bbda7b [lustre]
74 ll_file_aio_write at ffffffffa0bbdce2 [lustre]
75 ll_file_write at ffffffffa0bbeaac [lustre]
76 vfs_write at ffffffff81177b98
77 sys_write at ffffffff811785a1

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Id61b481fe2036a4d7adb7140c39e50fe61c264ba
Reviewed-on: http://review.whamcloud.com/5526
Tested-by: Hudson
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2190 utils: Map ldd_params to individual props
Prakash Surya [Fri, 8 Mar 2013 01:18:08 +0000 (17:18 -0800)]
LU-2190 utils: Map ldd_params to individual props

At mkfs/tunefs/mount time the configuration utilities will access
a .ldd_params configuration string.  This string consists of
space-delimited key-value pairs which must be saved somewhere.

For ldiskfs it makes sense to store this string on disk as part
of the binary lustre_disk_data structure.  But the logical place
for ZFS is to use the user dataset properties where all the other
lustre_disk_data fields are stored.

This patch updates the ZFS portions of the utilities to take the
.ldd_params string and break it apart into individual key-value
pairs which can be stored as user dataset properties.  This is
done by simply prepending 'lustre:' to the provided param key.

The original .ldd_params field can then be reconstructed by
fetching all the dataset properties prefixed by 'lustre:' and
appending them with their keys.

The advantage of this approach is that new parameters can be added
without the need to update the utilities.
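
The mapping described above can be sketched as follows. This is only an
illustration in Python, not the utilities' actual code: the function
names are hypothetical, a plain dict stands in for the ZFS user dataset
properties, and the "key=value" pair syntax is an assumption, since the
message only says the pairs are space delimited.

```python
PREFIX = "lustre:"  # property prefix named in the commit message


def params_to_props(ldd_params: str) -> dict[str, str]:
    """Split a space-delimited parameter string into prefixed properties.

    Assumes each pair is written as key=value (hypothetical syntax).
    """
    props = {}
    for pair in ldd_params.split():
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"not a key=value pair: {pair!r}")
        props[PREFIX + key] = value
    return props


def props_to_params(props: dict[str, str]) -> str:
    """Rebuild the parameter string from every property with the prefix."""
    pairs = [
        f"{name[len(PREFIX):]}={value}"
        for name, value in props.items()
        if name.startswith(PREFIX)
    ]
    return " ".join(pairs)
```

Because the reverse step selects properties by prefix only, a parameter
the code has never seen before round-trips without any change to it,
which is the advantage the message points out.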

Change-Id: Ida1b0e845b27154b209cc807726ae6ed3d22d189
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Prakash Surya <surya1@llnl.gov>
Reviewed-on: http://review.whamcloud.com/5220
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
7 years agoLU-2738 mdt: add proc entry to enable lfs mkdir for non-admin
wangdi [Sun, 1 Dec 2013 10:25:27 +0000 (02:25 -0800)]
LU-2738 mdt: add proc entry to enable lfs mkdir for non-admin

Add enable_remote_dir_gid to enable lfs mkdir for non-admin users:
1. enable_remote_dir_gid = group: only users whose gid == group
   can create a remote dir.
2. enable_remote_dir_gid = -1: all users can create a remote dir.
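
The two settings listed above amount to a small permission check. The
sketch below is illustrative Python, not the MDT code: the function
name and the is_admin argument are hypothetical, and it assumes an
admin user is always allowed, which the message implies but does not
state.

```python
ALL_USERS = -1  # value the commit message gives for "all users"


def may_create_remote_dir(user_gid: int, enable_remote_dir_gid: int,
                          is_admin: bool = False) -> bool:
    """Return True if the user may create a remote directory."""
    if is_admin:
        # Assumption: admins are never restricted by this setting.
        return True
    if enable_remote_dir_gid == ALL_USERS:
        return True
    # Otherwise only members of the configured group are allowed.
    return user_gid == enable_remote_dir_gid
```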

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I6605016aa89ef3e2763ca14f94a763685fc689b6
Reviewed-on: http://review.whamcloud.com/5442
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years agoLU-2840 tests: Clean the environment for test.
yangsheng [Thu, 28 Feb 2013 08:31:02 +0000 (16:31 +0800)]
LU-2840 tests: Clean the environment for test.

Use a unique directory for sanityn test_21 so that it is not
affected by previous tests.

Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: I36a06605732bb0048c25c622d4c8f1192d761a88
Reviewed-on: http://review.whamcloud.com/5556
Tested-by: Hudson
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2148 kernel: Kernel update for latest FC18 kernel
yangsheng [Mon, 17 Dec 2012 17:42:25 +0000 (01:42 +0800)]
LU-2148 kernel: Kernel update for latest FC18 kernel

Add fc18 support for build system

Signed-off-by: yang sheng <ys@whamcloud.com>
Change-Id: Iaeb24b5e44f969eb23a55d115b866c926b25bd55
Reviewed-on: http://review.whamcloud.com/5194
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Hudson
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2922 ldiskfs: depend on e2fsprogs-1.42.6.wc2
Jian Yu [Thu, 7 Mar 2013 05:45:41 +0000 (13:45 +0800)]
LU-2922 ldiskfs: depend on e2fsprogs-1.42.6.wc2

This patch updates lustre-ldiskfs.spec.in to depend on
a newer e2fsprogs-1.42.6.wc2 release instead of the
old 1.41.12.2.ora1 version.

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: I58828fa5427e5f50cfb3b285f980e946e7eb62e7
Reviewed-on: http://review.whamcloud.com/5623
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years agoLU-2858 mdt: check .lustre before reint operation.
wangdi [Sat, 14 Dec 2013 07:41:11 +0000 (23:41 -0800)]
LU-2858 mdt: check .lustre before reint operation.

1. Check .lustre in the MDT layer to make sure any attempt to change
.lustre will return EPERM.

2. send parent fid for remote open, so mdt can check whether it
will create lov object for the fid under OBF.

3. add a few test cases in sanity 154(OBF checking) to check .lustre
and remote fid

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ia708856e026a959ca30b06fdd3acd907e5d1b913
Reviewed-on: http://review.whamcloud.com/5555
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com>
7 years agoLU-2724 ptlrpc: skip NULL obd_svc_stats in lprocfs_rd_import()
John L. Hammond [Thu, 28 Feb 2013 09:39:45 +0000 (03:39 -0600)]
LU-2724 ptlrpc: skip NULL obd_svc_stats in lprocfs_rd_import()

In lprocfs_rd_import() don't print trivial obd_svc_stats if they were
not allocated for this import. Add a general proc file read check to
sanity.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic94056eb4f02d71491cdaf948cbe27b82de2153d
Reviewed-on: http://review.whamcloud.com/5234
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Hudson
Reviewed-by: Li Wei <wei.g.li@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2628 tests: disable test_40 of replay-single
Jinshan Xiong [Thu, 17 Jan 2013 23:11:37 +0000 (15:11 -0800)]
LU-2628 tests: disable test_40 of replay-single

This test case assumes that IO to OSTs can continue even when the
connection to the MDT is lost. This is no longer true because clients
have to verify that the layout is correct before operating on OST
objects.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I08eaad7b97da7ee152c066426f24bc1d15db5738
Reviewed-on: http://review.whamcloud.com/5056
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years agoLU-2714 hsm: limit MDT-side allocations for HSM RPCs
John L. Hammond [Thu, 21 Feb 2013 22:43:33 +0000 (16:43 -0600)]
LU-2714 hsm: limit MDT-side allocations for HSM RPCs

Limit the amount of memory an MDT will allocate for a single HSM RPC
to 1 MB and add some sanity checking to the HSM handlers that use
variable length buffers.  In hur_len() compute the size of an HSM
request in a portable way.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie03b85a8524cb377bf43446be429cc60c2fe39a7
Reviewed-on: http://review.whamcloud.com/5507
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
7 years agoLU-2523 tests: disable racer setstripe until fixed
Andreas Dilger [Thu, 14 Feb 2013 01:04:37 +0000 (17:04 -0800)]
LU-2523 tests: disable racer setstripe until fixed

Disable the use of "lfs setstripe" in racer file_create.sh until
LU-2523 and LU-2789 are fixed, so that racer can be added to
the "review" test workload instead of only in the "full" workload.

Test-Parameters: testlist=racer,racer,racer
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I8b0814ef7864efd817ff585f715183fbf73ebbe5
Reviewed-on: http://review.whamcloud.com/5424
Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-903 ldlm: inode references moved to resource
Artem Blagodarenko [Wed, 2 May 2012 14:05:33 +0000 (18:05 +0400)]
LU-903 ldlm: inode references moved to resource

There is a race condition in get_attr after cancel_lru_locks
and sysctl drop_caches. ll_clear_inode clears l_ast_data for the
ldlm lock, and this lock can't be canceled because "inode == NULL".
ll_mdc_blocking_ast finds such a lock. As a result,
DCACHE_LUSTRE_INVALID is not set and lookup returns the wrong inode.

This patch moves the inode structure reference to
"ldlm_lock::l_resource::lr_lvb_inode". This prevents locks on the
same resource from holding different inode references.

Xyratex-bug-id: MRP-338
Reviewed-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Reviewed-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Signed-off-by: Artem Blagodarenko <artem_blagodarenko@xyratex.com>
Change-Id: I4105b2aec38c90d3f5a20d1498a563192a74de55
Reviewed-on: http://review.whamcloud.com/2627
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2896 mgs: fix handling non IR targets.
Alexey Lyashkov [Mon, 4 Mar 2013 11:42:22 +0000 (13:42 +0200)]
LU-2896 mgs: fix handling non IR targets.

Non-IR targets don't know about the extra commands because they have
only a single "register target" command, so we need to treat
"no command" as registering a new target.

Xyratex-bug-id: MRP-880
Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Change-Id: I22a8fd034772c9355d2d56a166fdf3766edec719
Reviewed-on: http://review.whamcloud.com/5574
Tested-by: Hudson
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
7 years agoLU-2855 kuc: error management in KUC broadcast delivery
Thomas Leibovici [Sun, 24 Feb 2013 13:08:47 +0000 (14:08 +0100)]
LU-2855 kuc: error management in KUC broadcast delivery

In KUC broadcast groups, the message delivery fails if
a userland process terminates without closing its kuc
channel properly.
This patch improves the behavior of KUC broadcast groups
to suit broadcast usage better: the message delivery is
successful if at least 1 userland process is present in
the group and receives the message successfully.
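
The delivery rule described above can be sketched as follows. This is
illustrative Python rather than the kernel code: the broadcast()
function is hypothetical, each group member is modeled as a callable,
and a dead peer is modeled as one that raises OSError.

```python
from typing import Callable, Iterable


def broadcast(receivers: Iterable[Callable[[bytes], None]],
              message: bytes) -> int:
    """Send message to every receiver; succeed if at least one accepts it.

    Returns the number of successful deliveries and raises OSError only
    when no receiver in the group got the message.
    """
    delivered = 0
    for send in receivers:
        try:
            send(message)
            delivered += 1
        except OSError:
            # A peer that died without closing its channel is skipped
            # instead of failing the whole broadcast.
            continue
    if delivered == 0:
        raise OSError("no receiver in the group accepted the message")
    return delivered
```

With one live and one dead member, the call returns 1 instead of
failing, which matches the "at least 1 userland process" condition.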

Signed-off-by: Thomas Leibovici <thomas.leibovici@cea.fr>
Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Change-Id: I2a121c7cfcaa6c2ee5ac48b721668bf2f254d848
Reviewed-on: http://review.whamcloud.com/5521
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoNew tag 2.3.62 2.3.62 v2_3_62 v2_3_62_0
Oleg Drokin [Wed, 6 Mar 2013 23:00:06 +0000 (18:00 -0500)]
New tag 2.3.62

Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: I5671b6f298d9f0850d937a443c163758df866341

7 years agoLU-2885 tests: don't e2fsck zfs partition
Nathaniel Clark [Thu, 28 Feb 2013 01:52:32 +0000 (20:52 -0500)]
LU-2885 tests: don't e2fsck zfs partition

Skip check of ext4 extent info on zfs.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I14d98d1d5dd0400c4d6c166a0b4a12c58992f62b
Reviewed-on: http://review.whamcloud.com/5548
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years agoLU-2863 tests: fix lfsck/OI_scrub test scripts issues
Fan Yong [Wed, 6 Feb 2013 01:21:21 +0000 (09:21 +0800)]
LU-2863 tests: fix lfsck/OI_scrub test scripts issues

1) sanity-scrub.sh checks Lustre version after Lustre initialization.

2) Re-calculate the expectation for lfsck/OI_scrub speed test:
   2.1) 1.1 * the theoretical value for speed upper limit.
   2.2) 0.9 * the theoretical value for speed lower limit.

3) Reformat the device before running sanity-scrub test_11.

4) Other cleanup.
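
The tolerance in item 2 can be read as a band around a single
theoretical speed. The sketch below is illustrative Python, not the
test script itself: the function names are hypothetical, and the
reading that both factors apply to the same theoretical value is an
assumption.

```python
def speed_bounds(theoretical: float) -> tuple[float, float]:
    """Return (lower, upper) accepted speeds for a theoretical value."""
    return 0.9 * theoretical, 1.1 * theoretical


def speed_ok(measured: float, theoretical: float) -> bool:
    """True if the measured speed falls inside the accepted band."""
    lower, upper = speed_bounds(theoretical)
    return lower <= measured <= upper
```

The 10% margin on each side keeps scheduling jitter from failing the
speed test while still catching a limit that is not being enforced.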

Test-Parameters: testlist=sanity-scrub,sanity-lfsck

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib811fa24ac7581c341d596afbed064a4fc3e7357
Reviewed-on: http://review.whamcloud.com/5551
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2688 tests: verify quota in conf-sanity 32a
Niu Yawei [Wed, 6 Feb 2013 13:46:20 +0000 (08:46 -0500)]
LU-2688 tests: verify quota in conf-sanity 32a

Verify quota in the upgrade test of conf-sanity 32a

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I4b22b16ea31f124f5e39f1b2554df9d4b7dd4b43
Reviewed-on: http://review.whamcloud.com/5293
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
7 years agoLU-2424 ptlrpc: buffer utilization of rqbd
Liang Zhen [Tue, 1 Jan 2013 08:44:41 +0000 (16:44 +0800)]
LU-2424 ptlrpc: buffer utilization of rqbd

This patch covers a few things:
- different request buffer sizes for different MDS services
  The size of an MDS request/reply without LOV EA can be much smaller
  than the request/reply size with LOV EA.
  This patch defines four buffer sizes for the different
  MDS services: MDS_MAXREQSIZE, MDS_MAXREPSIZE, MDS_LOV_MAXREQSIZE
  and MDS_LOV_MAXREPSIZE.

- add an extra 128K to MDS_LOV_BUFSIZE
  MDS_LOV_BUFSIZE should be at least (max_reqsize + max sptlrpc
  payload size), which is (MDS_LOV_MAXREQSIZE + 1024). But if
  MDS_LOV_BUFSIZE is only a little larger than MDS_LOV_MAXREQSIZE,
  it can fit only one request even when 48K bytes are left in
  a rqbd, so memory utilization is very low.
  Meanwhile, a rqbd can't be too large, because it can't be
  reused until all requests that fit in it have been processed
  and released, which means one long-blocked request can prevent
  the rqbd from being reused.
  Now we add an extra 128K to the buffer size, so even if each rqbd
  is unlinked from LNet with 48K unused, buffer utilization will be
  above 70%.
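
The 70% figure can be checked with a little arithmetic. The sketch
below is illustrative Python: it assumes the largest request is about
48K, which the figures in the message suggest but do not state, and it
ignores the 1024-byte sptlrpc payload for simplicity.

```python
KIB = 1024


def rqbd_utilization(buf_size: int, unused: int) -> float:
    """Fraction of a request buffer that carried requests."""
    return (buf_size - unused) / buf_size


max_req = 48 * KIB               # assumed largest request size
buf_size = max_req + 128 * KIB   # buffer with the extra 128K
worst_case = rqbd_utilization(buf_size, unused=48 * KIB)
print(f"worst-case utilization: {worst_case:.1%}")
```

Under these assumptions the buffer is 176K and at most 48K is wasted,
so 128K of 176K is used, which is a little under 73%.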

Xyratex-bug-id: MRP-689
Signed-off-by: Liang Zhen <liang@whamcloud.com>
Change-Id: I19107918e62f9de59dd88652f3513234c30e56ce
Reviewed-on: http://review.whamcloud.com/4940
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
7 years agoLU-2083 build: install git commit hooks automatically
Dmitry Eremin [Fri, 22 Feb 2013 20:14:08 +0000 (00:14 +0400)]
LU-2083 build: install git commit hooks automatically

Fix for previous commit http://review.whamcloud.com/4175

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I014455aa4e73b7a384356c8d6e65f3feb28e7e1c
Reviewed-on: http://review.whamcloud.com/5494
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
7 years agoLU-2874 tests: mark slow sync zfs tests as EXCEPT_SLOW
Nathaniel Clark [Thu, 28 Feb 2013 06:16:18 +0000 (01:16 -0500)]
LU-2874 tests: mark slow sync zfs tests as EXCEPT_SLOW

For tests that take a very long time on ZFS but not on ldiskfs, place
them in EXCEPT_SLOW when running on zfs.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I52193d9a1c51e1276a5ac86c46f32e5ba5dd6299
Reviewed-on: http://review.whamcloud.com/5553
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2877 tests: sanity test_34h needs to flush cache after write
Oleg Drokin [Wed, 27 Feb 2013 03:30:24 +0000 (22:30 -0500)]
LU-2877 tests: sanity test_34h needs to flush cache after write

We need to ensure that the cache is clean after dd in the test,
otherwise subsequent multiop might block trying to flush the
dirty pages for more than 2 seconds and trigger a false failure.

Change-Id: Ifb5a0aa0f9c627b353abe0d402c42a4e14d67609
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/5541
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
7 years agoLU-2645 ldlm: use correct lvb size to reply 1.8 lock enqueue
Fan Yong [Tue, 5 Feb 2013 16:38:04 +0000 (00:38 +0800)]
LU-2645 ldlm: use correct lvb size to reply 1.8 lock enqueue

The 1.8 client does not support variable-sized LVB.
The 2.4 server should correctly determine whether the
client supports it, and fill the reply buffer with a
suitable LVB size when processing lock enqueue.

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes \
clientjob=lustre-b1_8 clientbuildno=256 testlist=runtests

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9241efe25dc64b26e86c4e75da72ab74bb1bc750
Reviewed-on: http://review.whamcloud.com/5459
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Keith Mannthey <keith.mannthey@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2832 ptlrpc: cleanup bulk for resend case
Hongchao Zhang [Sun, 3 Feb 2013 07:27:49 +0000 (15:27 +0800)]
LU-2832 ptlrpc: cleanup bulk for resend case

When a request with bulk (ptlrpc_bulk_desc) is resent or replayed,
the stats of the bulk should be cleaned up because it will be reused.

Change-Id: Ie340c8aa43e1a19595c50bed05134537d8d07d74
Signed-off-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-on: http://review.whamcloud.com/5532
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2776 tests: sleep longer time to yield CPU
Jinshan Xiong [Mon, 11 Feb 2013 20:55:52 +0000 (12:55 -0800)]
LU-2776 tests: sleep longer time to yield CPU

It used to be 0.1 seconds, which turned out to be too short.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I2776d8387aee8a55f325459999c8c9454dd2b4fa
Reviewed-on: http://review.whamcloud.com/5321
Tested-by: Hudson
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2793 fld: send special seq lookup request to MDT0
wangdi [Thu, 28 Nov 2013 12:03:17 +0000 (04:03 -0800)]
LU-2793 fld: send special seq lookup request to MDT0

Since almost all special sequences are located on MDT0, we should
send all seq lookup requests to MDT0, especially in an environment
where other MDTs might not be started at all.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ifda9cb434a217d1e54dc2ef4fcb7628fca049d9d
Reviewed-on: http://review.whamcloud.com/5319
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
7 years agoLU-2683 lov: release all locks in closure to release sublock
Jinshan Xiong [Wed, 30 Jan 2013 00:35:49 +0000 (16:35 -0800)]
LU-2683 lov: release all locks in closure to release sublock

We used to release only the current parent lock, which may cause a
deadlock if the sublock is shared. See the stack traces of LU-2683
and LU-874 for details.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ibe5fc364ef22a279f23bb24ad1311a1ad09be369
Reviewed-on: http://review.whamcloud.com/5208
Tested-by: Hudson
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Prakash Surya <surya1@llnl.gov>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
7 years agoLU-2805 tests: compare the file correctly for layout swapping
Jinshan Xiong [Wed, 13 Feb 2013 21:34:53 +0000 (13:34 -0800)]
LU-2805 tests: compare the file correctly for layout swapping

The size of file ref2 may be less than the number of copied bytes, so
cmp will report a failure due to EOF. Check for the "differ" string
instead.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I344f90bf4b75f535961f1acdc5574aeca8e85126
Reviewed-on: http://review.whamcloud.com/5420
Tested-by: Hudson
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <whamcloud.maloo@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>