Whamcloud - gitweb
fs/lustre-release.git
5 years ago LU-4943 obdclass: detach MGC dev on error 29/10129/14
Bobi Jam [Mon, 28 Apr 2014 15:40:27 +0000 (23:40 +0800)]
LU-4943 obdclass: detach MGC dev on error

lustre_start_mgc() creates the MGC device. If an error happens later in
ll_fill_super(), this device is still attached, and later mounts fail,
complaining that the MGC device is already in the client node.

It turns out that the device was referenced by mgc config llog data,
which is arranged in the mgc lock requeue thread that retries getting
its mgc lock. In the normal case, this llog reference is only released
in mgc_blocking_ast() when the system is unmounted.

This patch makes mgc_precleanup() wake up the requeue thread to handle
the config llog data.

This patch also makes mgc_setup() wait for mgc_requeue_thread() to start
before moving on.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I83df8c68c1dbe4ef4ee879e04ab20df46fea9062
Reviewed-on: http://review.whamcloud.com/10129
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Ryan Haasken <haasken@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4839 utils: fix bandwidth ctl in lhsmtool 93/12093/7
Nathaniel Clark [Sat, 27 Sep 2014 18:22:55 +0000 (14:22 -0400)]
LU-4839 utils: fix bandwidth ctl in lhsmtool

Use nanosleep for bandwidth control because usleep can fail for
sleeps of a second or longer (see the EINVAL return value in
usleep(3)). Add a loop in case the sleep is interrupted by a signal.
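
A minimal sketch of the approach described above, not the actual lhsmtool code: convert the delay to a struct timespec, call nanosleep(), and resume with the remaining time if a signal interrupts the sleep. The helper name sleep_usec() is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper (not taken from lhsmtool): sleep for usec
 * microseconds, which may be one second or longer.
 * Returns 0 on success, -1 with errno set on a non-EINTR failure.
 */
static int sleep_usec(long long usec)
{
	struct timespec req, rem;

	assert(usec >= 0);
	req.tv_sec = (time_t)(usec / 1000000);
	req.tv_nsec = (long)(usec % 1000000) * 1000;

	/* On EINTR, nanosleep() stores the unslept time in rem. */
	while (nanosleep(&req, &rem) == -1) {
		if (errno != EINTR)
			return -1;
		req = rem;	/* resume with the remaining time */
	}
	return 0;
}
```

Splitting the delay into seconds and nanoseconds keeps tv_nsec below one second, which nanosleep() requires.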

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ib58676b878678eb399bf58bb9873d8fb411b3316
Reviewed-on: http://review.whamcloud.com/12093
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5530 mdt: Properly match open lock and unlock 41/11841/2
Oleg Drokin [Wed, 10 Sep 2014 01:38:35 +0000 (21:38 -0400)]
LU-5530 mdt: Properly match open lock and unlock

It seems that when request resend with a corresponding lock match is
in play (thanks to large striping + llnl patch for the client to send
small requests only), after all the suffering and fixing coming from
LU-2827, here is another casualty in open/lease locking.
mdt_reint_open() and mdt_open_by_fid_lock() might match the lock
on resend and not call mdt_object_open_lock(), yet call
mdt_object_open_unlock() before exit.
Since mdt_object_open_lock/unlock also plays with a semaphore,
hilarity ensues, usually most visible as a rw_sem lockup
in mdt_object_open_lock.

This patch adds tracking of whether we actually called
mdt_object_open_lock() and only calls mdt_object_open_unlock() if we did.
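
The tracking idea can be sketched roughly as follows. This is a user-space illustration under stated assumptions, not the MDT code: struct open_lock_sketch, the helper names, and the lock_matched parameter are all hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-object open lock state. */
struct open_lock_sketch {
	int held;
};

static void open_lock_sketch_acquire(struct open_lock_sketch *l)
{
	assert(l->held == 0);
	l->held = 1;
}

static void open_lock_sketch_release(struct open_lock_sketch *l)
{
	assert(l->held == 1);
	l->held = 0;
}

/*
 * lock_matched simulates the resend case, where an existing lock was
 * matched and the acquire step is skipped.
 */
static int do_open_sketch(struct open_lock_sketch *l, bool lock_matched)
{
	bool took_lock = false;

	if (!lock_matched) {
		open_lock_sketch_acquire(l);
		took_lock = true;	/* remember that we own the lock */
	}

	/* ... the open processing would happen here ... */

	if (took_lock)		/* release only what this call acquired */
		open_lock_sketch_release(l);
	return 0;
}
```

Keeping the flag local to the call means the release decision cannot drift out of sync with the acquire path.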

Change-Id: I73c529229acec98cac4ad73f7b487e759ad9a763
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/11841
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
5 years ago LU-5579 ldlm: re-sent enqueue vs lock destroy race 39/11839/4
Vitaly Fertman [Thu, 9 Oct 2014 15:18:34 +0000 (11:18 -0400)]
LU-5579 ldlm: re-sent enqueue vs lock destroy race

Upon lock enqueue re-send, lock is pinned by ldlm_handle_enqueue0,
however it may race with client eviction or even lock cancel (if
a reply for the original RPC finally reached the client) and the
lock cannot be found by cookie anymore:

ASSERTION( lock != NULL ) failed: Invalid lock handle

Signed-off-by: Vitaly Fertman <vitaly_fertman@xyratex.com>
Change-Id: I9d8156bf78a1b83ac22ffaa1148feb43bef37b1a
Xyratex-bug-id: MRP-2094
Reviewed-on: http://review.whamcloud.com/11839
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5711 tests: add version check for recovery-small subtest 09/12209/3
Bob Glossman [Tue, 7 Oct 2014 15:51:06 +0000 (08:51 -0700)]
LU-5711 tests: add version check for recovery-small subtest

Add a version check for recovery-small test 10b.
It is needed to avoid interop failures.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Idb579576d412150d06be20e14ca3517abeb89b31
Reviewed-on: http://review.whamcloud.com/12209
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-2872 tests: Reenable sanity-quota/1 for ZFS 57/12157/2
Nathaniel Clark [Wed, 1 Oct 2014 16:18:14 +0000 (12:18 -0400)]
LU-2872 tests: Reenable sanity-quota/1 for ZFS

Because of performance improvements on ZFS this should now pass
consistently.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I5fe3f2062cfa3cdf6ff92e4b214019bc907ce448
Reviewed-on: http://review.whamcloud.com/12157
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5141 hsm: Only regular files should be archived 36/12036/3
vinayak_clogeny [Wed, 24 Sep 2014 06:37:11 +0000 (12:07 +0530)]
LU-5141 hsm: Only regular files should be archived

It is currently possible to ask lfs to hsm_archive a directory,
although this doesn't appear to make any sense, and the posix
copytool rejects it.

Ideally this should be caught early and not be forwarded to
the copytool at all.

So add a regular-file check in lustre/utils/lfs.c that reports an
error if the file provided for hsm_archive is not a regular file.
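
A check of this kind might look like the sketch below. It is an illustration only: the function name check_regular_file() and its return convention are assumptions, not the code added to lfs.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns 0 if path names a regular file,
 * -EINVAL if it exists but is not a regular file (for example a
 * directory), or -errno if stat() fails.
 */
static int check_regular_file(const char *path)
{
	struct stat st;

	assert(path != NULL);
	if (stat(path, &st) != 0)
		return -errno;
	if (!S_ISREG(st.st_mode))
		return -EINVAL;
	return 0;
}
```

A caller would run this for each path argument and skip or reject the request before anything is sent to the copytool.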

Signed-off-by: vinayakswami hariharmath <vinayakswami.hariharmath@clogeny.com>
Change-Id: Ie6cdc05a8517853675167640d76d4d7b5ea9dccf
Reviewed-on: http://review.whamcloud.com/12036
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-2675 libcfs: remove {linux,posix}-tracefile.h 83/11983/3
John L. Hammond [Tue, 7 Oct 2014 01:11:33 +0000 (21:11 -0400)]
LU-2675 libcfs: remove {linux,posix}-tracefile.h

Move the definition of the trace buffer type enum into
libcfs/libcfs/tracefile.h.  Remove the then unneeded headers
libcfs/libcfs/linux/linux-tracefile.h and
libcfs/libcfs/posix/posix-tracefile.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ied6fe04e98ba4d91197956ecf6566f73eabb7114
Reviewed-on: http://review.whamcloud.com/11983
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5700 llite: handle concurrent use of cob_transient_pages 79/12179/3
Stephen Champion [Fri, 3 Oct 2014 13:08:49 +0000 (06:08 -0700)]
LU-5700 llite: handle concurrent use of cob_transient_pages

With the lockless __generic_file_aio_write introduced in LU-1669,
ll_direct_IO_26 is no longer protected by the inode i_isem.

This renders obsolete the checks that all transient pages have been
handled before and after entry, and requires atomic access to their
counter.

Signed-off-by: Stephen Champion <schamp@sgi.com>
Change-Id: Ie1545b2123a5ca7d9a8cac130ff387fc06955629
Reviewed-on: http://review.whamcloud.com/12179
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-2610 tests: Reenable sanity/40 for ZFS 53/12153/2
Nathaniel Clark [Wed, 1 Oct 2014 13:39:20 +0000 (09:39 -0400)]
LU-2610 tests: Reenable sanity/40 for ZFS

Due to the fix in LU-2804 (change c8d5aa14e50be2a85491783f169a8f4e646b9594),
this test should no longer be skipped for ZFS and should pass.

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I1e464b8874b0968f038ec4c88c7b24d2fb03207b
Reviewed-on: http://review.whamcloud.com/12153
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5654 osd-ldiskfs: Handle holes in osd_ldiskfs_read() 45/12145/3
Li Wei [Wed, 24 Sep 2014 04:12:37 +0000 (12:12 +0800)]
LU-5654 osd-ldiskfs: Handle holes in osd_ldiskfs_read()

Current osd_ldiskfs_read() incorrectly returns zero and leaves the
corresponding portion of the buffer untouched when a block to be read
is not allocated.

Test-Parameters: testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity,conf-sanity
Change-Id: I3044f771fccfdddf1fee0c931e8996cbdd8df0c8
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/12145
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-2675 lustre: remove linux/obd_class.h 05/11505/3
John L. Hammond [Mon, 18 Aug 2014 18:42:51 +0000 (13:42 -0500)]
LU-2675 lustre: remove linux/obd_class.h

Move some obdo handling declarations from
lustre/include/linux/obd_class.h to lustre/include/obd_class.h. Remove
lustre/include/linux/obd_class.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1e04c97ec7f4bb97b23f57298f52f5efd2c576a2
Reviewed-on: http://review.whamcloud.com/11505
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
5 years ago LU-2675 lustre: remove linux/lustre_quota.h 04/11504/2
John L. Hammond [Mon, 18 Aug 2014 18:35:43 +0000 (13:35 -0500)]
LU-2675 lustre: remove linux/lustre_quota.h

Remove the (mostly) empty header lustre/include/linux/lustre_quota.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: If550ac07e442c640c11f7d23b38f2ec6c166f194
Reviewed-on: http://review.whamcloud.com/11504
Reviewed-by: frank zago <fzago@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
5 years ago LU-2675 lustre: remove linux/lustre_net.h 03/11503/6
John L. Hammond [Mon, 18 Aug 2014 18:33:29 +0000 (13:33 -0500)]
LU-2675 lustre: remove linux/lustre_net.h

Remove the (mostly) empty header lustre/include/linux/lustre_net.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I6ad41b94cf21803ac2f6602c5befea511d02937a
Reviewed-on: http://review.whamcloud.com/11503
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4974 doc: comments to lprocfs_lod.c 59/11159/4
Alex Zhuravlev [Mon, 21 Jul 2014 13:19:57 +0000 (17:19 +0400)]
LU-4974 doc: comments to lprocfs_lod.c

comments to the functions in lod/lprocfs_lod.c.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I8722347aae3262db187b4a6fc30d83d18fee2978
Reviewed-on: http://review.whamcloud.com/11159
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5474 hsm: Add test 90 to ALWAYS_EXCEPT list 36/12136/3
James Nunez [Tue, 30 Sep 2014 18:30:08 +0000 (12:30 -0600)]
LU-5474 hsm: Add test 90 to ALWAYS_EXCEPT list

Disable sanity-hsm test 90 from executing with the
ALWAYS_EXCEPT parameter.

Test 90 should be enabled in the patch that fixes
the issues with the test.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ia0ed9925a311f3e24510e62842de88cf9c824f7a
Reviewed-on: http://review.whamcloud.com/12136
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Andreas Dilger <andreas.dilger@intel.com>
5 years ago LU-2675 libcfs: remove unused libcfs.a code 68/10668/7
John L. Hammond [Mon, 6 Oct 2014 13:40:12 +0000 (09:40 -0400)]
LU-2675 libcfs: remove unused libcfs.a code

Remove unused code from libcfs.a. Delete the files:

  libcfs/libcfs/posix/posix-adler.c
  libcfs/libcfs/posix/posix-crc32.c
  libcfs/libcfs/posix/posix-debug.c
  libcfs/libcfs/posix/posix-proc.c
  libcfs/libcfs/posix/rbtree.c
  libcfs/libcfs/user-bitops.c
  libcfs/libcfs/user-crc32pclmul.c
  libcfs/libcfs/user-crypto.c
  libcfs/libcfs/user-lock.c
  libcfs/libcfs/user-mem.c
  libcfs/libcfs/user-prim.c
  libcfs/libcfs/user-tcpip.c

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ic7b6539147fc66a3cefa0caddfc995474014e098
Reviewed-on: http://review.whamcloud.com/10668
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4976 doc: comments to osp_precreate.c 52/11152/6
Alex Zhuravlev [Sun, 20 Jul 2014 08:13:39 +0000 (12:13 +0400)]
LU-4976 doc: comments to osp_precreate.c

comments to the functions in osp/osp_precreate.c

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I657661ea13f1e455f8a01b15f5262049e9764521
Reviewed-on: http://review.whamcloud.com/11152
Tested-by: Jenkins
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5396 lnet: add sparse annotation __user wherever needed 19/11819/8
Frank Zago [Wed, 23 Jul 2014 21:32:32 +0000 (16:32 -0500)]
LU-5396 lnet: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.
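
For readers unfamiliar with the annotation, the sketch below shows the general pattern in a user-space setting. It assumes the usual kernel-style definitions of __user and __force (active only when sparse defines __CHECKER__); struct ids_sketch and get_ids_sketch() are hypothetical and are not Lustre code.

```c
#include <assert.h>
#include <string.h>

/*
 * Under sparse (__CHECKER__ defined) the annotation marks a pointer as
 * belonging to the user address space; for a normal compiler it
 * expands to nothing, so the generated code is unchanged.
 */
#ifdef __CHECKER__
# define __user		__attribute__((noderef, address_space(1)))
# define __force	__attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical payload type, for illustration only. */
struct ids_sketch {
	int count;
};

/*
 * Stand-in for a kernel copy_from_user()-style helper: the source is
 * annotated __user, so sparse would warn if a caller passed a plain
 * kernel pointer or dereferenced src directly.
 */
static int get_ids_sketch(struct ids_sketch *dst,
			  const struct ids_sketch __user *src)
{
	assert(dst != NULL);
	/* In this user-space sketch a plain memcpy() does the copy. */
	memcpy(dst, (__force const void *)src, sizeof(*dst));
	return 0;
}
```

Because the macros expand to nothing outside of sparse, adding them changes only what the static checker sees, which matches the "no code change" note above.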

Change-Id: Iff822b1e6036709e3bd660fe14d8faa58b308995
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11819
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5396 libcfs: add sparse annotation __user wherever needed 17/11817/4
Frank Zago [Wed, 23 Jul 2014 21:03:30 +0000 (16:03 -0500)]
LU-5396 libcfs: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: I07a40b3e2f5bf179923c57d9eda86b4921cd7699
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11817
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5659 hsm: add timestamps to lhsmtool log messages 42/12042/3
John L. Hammond [Wed, 24 Sep 2014 17:33:10 +0000 (12:33 -0500)]
LU-5659 hsm: add timestamps to lhsmtool log messages

Add epoch timestamps (secs.usecs) to the log messages emitted by
lhsmtool_posix. Change the single use of CT_DEBUG() to
CT_TRACE(). Remove the unused macro CT_PRINTF().
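
As an illustration of the secs.usecs format, here is a small sketch. The function name format_log_line() and the exact layout (timestamp, one space, message) are assumptions; the real lhsmtool_posix macros may format their output differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/time.h>

/*
 * Hypothetical formatter: writes "<secs>.<usecs> <msg>" into buf, with
 * the microseconds zero-padded to six digits. Returns the snprintf()
 * result. The timestamp is passed in so the output is reproducible.
 */
static int format_log_line(char *buf, size_t len,
			   const struct timeval *tv, const char *msg)
{
	assert(buf != NULL && len > 0 && tv != NULL && msg != NULL);
	return snprintf(buf, len, "%ld.%06ld %s",
			(long)tv->tv_sec, (long)tv->tv_usec, msg);
}
```

In a real logger the timeval would typically come from gettimeofday(&tv, NULL) just before formatting.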

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I227d00c97f93a09ac9322213d255102b8a3cf612
Reviewed-on: http://review.whamcloud.com/12042
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5557 mdt: track reint operations in MDS service stats 24/11924/2
John L. Hammond [Mon, 15 Sep 2014 20:50:33 +0000 (15:50 -0500)]
LU-5557 mdt: track reint operations in MDS service stats

In mdt_reint_rec() tally the appropriate ptlrpc service stat
(MDS_REINT_{CREATE,SETATTR,...}) for the requested operation.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Idda44baa5720b1b3d7d38402c9abe322f3645b7f
Reviewed-on: http://review.whamcloud.com/11924
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-991 lmv: remove dead code 80/11880/3
Dmitry Eremin [Thu, 11 Sep 2014 19:25:56 +0000 (23:25 +0400)]
LU-991 lmv: remove dead code

The member lmv_obd->server_timeout and function lmv_set_timeouts()
are not used.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I61f8a59d72244ca41e76c277002bce3e2c850f0a
Reviewed-on: http://review.whamcloud.com/11880
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5587 lustre: require HAVE_SERVER_SUPPORT in md_object.h 56/11756/4
John L. Hammond [Thu, 4 Sep 2014 20:03:30 +0000 (15:03 -0500)]
LU-5587 lustre: require HAVE_SERVER_SUPPORT in md_object.h

Move the definition of struct seq_server_site from md_object.h to
lustre_fid.h. Uninclude md_object.h from files that don't need it. In
md_object.h generate a preprocessor error if HAVE_SERVER_SUPPORT is
not defined.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ie3fd464ef4a71f09ccaffae409901cb48705301b
Reviewed-on: http://review.whamcloud.com/11756
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5432 fld: don't loop forever on bogus FID sequences 05/11605/2
John L. Hammond [Tue, 26 Aug 2014 16:28:30 +0000 (11:28 -0500)]
LU-5432 fld: don't loop forever on bogus FID sequences

In fld_client_rpc() if the FLD query RPC returns -ENOENT then break
the retry loop and return -ENOENT.
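
The control flow can be sketched as below. This is not fld_client_rpc(): the callback type, the bounded max_tries argument, and the stub query functions are assumptions added so the example is self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical query callback; returns 0 or a negative errno. */
typedef int (*query_fn_sketch)(void *arg);

/*
 * Retry a query until it succeeds, reports a definitive -ENOENT, or
 * max_tries attempts have been made. Any other error is retried.
 */
static int query_with_retry_sketch(query_fn_sketch query, void *arg,
				   int max_tries)
{
	int rc = -EAGAIN;
	int i;

	assert(query != NULL && max_tries > 0);
	for (i = 0; i < max_tries; i++) {
		rc = query(arg);
		if (rc == 0 || rc == -ENOENT)
			break;	/* success or "no such entry": stop retrying */
	}
	return rc;
}

/* Stub queries for demonstration; each counts its calls through arg. */
static int always_enoent_sketch(void *arg)
{
	++*(int *)arg;
	return -ENOENT;
}

static int always_eagain_sketch(void *arg)
{
	++*(int *)arg;
	return -EAGAIN;
}
```

With the -ENOENT stub the loop stops after one call, while the -EAGAIN stub is retried until the attempt limit is reached.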

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Id70a5d8f6c2105509149e72e8910fcb6c51732f0
Reviewed-on: http://review.whamcloud.com/11605
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5458: libcfs: protect kkuc_groups from write access 55/11355/4
Frank Zago [Wed, 6 Aug 2014 20:03:42 +0000 (15:03 -0500)]
LU-5458: libcfs: protect kkuc_groups from write access

Since reg->kr_fp can be changed inside the foreach loop,
kkuc_groups must be write protected, and not just read protected.

This should fix the following oops, which could happen if two different
threads simultaneously execute the function, and EPIPE is returned.

PID: 24385  TASK: ffff88012da5f500  CPU: 1   COMMAND: "ldlm_cb00_056"
 #0 [ffff88012db55810] machine_kexec at ffffffff81038f3b
 #1 [ffff88012db55870] crash_kexec at ffffffff810c59f2
 #2 [ffff88012db55940] oops_end at ffffffff8152b7f0
 #3 [ffff88012db55970] no_context at ffffffff8104a00b
 #4 [ffff88012db559c0] __bad_area_nosemaphore at ffffffff8104a295
 #5 [ffff88012db55a10] bad_area_nosemaphore at ffffffff8104a363
 #6 [ffff88012db55a20] __do_page_fault at ffffffff8104aabf
 #7 [ffff88012db55b40] do_page_fault at ffffffff8152d73e
 #8 [ffff88012db55b70] page_fault at ffffffff8152aaf5
    [exception RIP: fput+9]
    RIP: ffffffff8118a509  RSP: ffff88012db55c20  RFLAGS: 00010246
    RAX: 00000000ffffffe0  RBX: ffff8800a8ea4fc0  RCX: 0000000000000000
    RDX: ffffffffa03c9eb0  RSI: 0000000000000000  RDI: 0000000000000000
    RBP: ffff88012db55c20   R8: 00000000ffffff0a   R9: 00000000fffffffc
    R10: 0000000000000001  R11: 282064656c696166  R12: ffffffffa03c9c60
    R13: ffff88005df240f8  R14: 0000000000000000  R15: ffff88013b4ca000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #9 [ffff88012db55c28] libcfs_kkuc_group_put at ffffffffa0388044 [libcfs]
[ptlrpc]

Change-Id: Ifaa861c0778e745f262cba5ab5f5661a3ad4fa9f
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11355
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4976 doc: comments to osp_sync.c 50/11150/5
Alex Zhuravlev [Sat, 19 Jul 2014 19:46:09 +0000 (23:46 +0400)]
LU-4976 doc: comments to osp_sync.c

comments to the functions in osp/osp_sync.c

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ia261c8d4412f7cdfcb9938be132c53f5f5ba45cd
Reviewed-on: http://review.whamcloud.com/11150
Tested-by: Jenkins
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-3703 tests: skip test for getfattr 2.4.44-6 or less 67/10867/3
Andreas Dilger [Fri, 27 Jun 2014 06:15:27 +0000 (00:15 -0600)]
LU-3703 tests: skip test for getfattr 2.4.44-6 or less

Non-SLES distros may also have an older version of getfattr,
so sanity.sh test_234 should also be skipped on those systems.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I012554dd840715d336d3286cc46c56f53d500c1e
Reviewed-on: http://review.whamcloud.com/10867
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5666 llapi: LLAPI helpers for group lock. 53/12053/4
Henri Doreau [Thu, 25 Sep 2014 06:50:20 +0000 (08:50 +0200)]
LU-5666 llapi: LLAPI helpers for group lock.

Introduced llapi_group_{lock,unlock} functions to abstract the ioctl
implementation to manipulate group lock. Updated lfs accordingly.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: I9c0c75029cd0d159e353a2c925cecc9b4ba9a6de
Reviewed-on: http://review.whamcloud.com/12053
Tested-by: Jenkins
Reviewed-by: frank zago <fzago@cray.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4821 osd: cleanups in osd-zfs 21/9721/16
Alex Zhuravlev [Wed, 19 Mar 2014 08:20:16 +0000 (12:20 +0400)]
LU-4821 osd: cleanups in osd-zfs

many small changes to get rid of udmu wrappers.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: Ic8746345da1e6695149bacf066be10bf284aecdf
Reviewed-on: http://review.whamcloud.com/9721
Tested-by: Jenkins
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5003 llog: do not fix remote llogs 55/11955/3
Alexander Zarochentsev [Tue, 16 Sep 2014 15:41:46 +0000 (19:41 +0400)]
LU-5003 llog: do not fix remote llogs

prevent llog_process_thread() from trying to fix
remote llog by a direct write to the header.

Xyratex-bug-id: MRP-2076
Change-Id: I4f8b758a38ce3f51c24fa7397ad5c4b341e27ed0
Signed-off-by: Alexander Zarochentsev <alexander.zarochentsev@seagate.com>
Reviewed-on: http://review.whamcloud.com/11955
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5613 lustre: unused variable in tgt_brw_read() 88/11888/2
Isaac Huang [Fri, 12 Sep 2014 03:24:42 +0000 (21:24 -0600)]
LU-5613 lustre: unused variable in tgt_brw_read()

Local variable niocount in tgt_brw_read() is assigned a value
but never used again, and thus can be removed.

Signed-off-by: Isaac Huang <he.huang@intel.com>
Change-Id: Ide032596e6877a7bc6d3cbd6eaf3cfd7e582eb23
Reviewed-on: http://review.whamcloud.com/11888
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4768 tests: Update ost-survey script 71/11971/4
James Nunez [Wed, 17 Sep 2014 20:40:14 +0000 (14:40 -0600)]
LU-4768 tests: Update ost-survey script

Currently, ost-survey hangs due to calling
'lfs setstripe' in an old (positional) style and
setting max_cached_mb to zero.

The call to 'lfs setstripe' is updated to use the
'-S', '-i' and '-c' flags. max_cached_mb is now
set to pagesize * 256 (in MB). The patch also gets
parameters for the correct file system if more
than one Lustre file system is mounted, and
corrects a few typos in comments.

In ll_max_cached_mb_seq_write(), the number of
pages requested is set to the max of pages requested
or PTLRPC_MAX_BRW_PAGES to allow the client to make
well formed RPCs.
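
The two calculations mentioned above can be sketched as follows. Both helpers are hypothetical: MAX_BRW_PAGES_SKETCH is a placeholder value rather than the real PTLRPC_MAX_BRW_PAGES, and reading "pagesize * 256 (in MB)" as a byte count converted to megabytes is an assumption.

```c
#include <assert.h>

/* Placeholder value for illustration; not the real PTLRPC_MAX_BRW_PAGES. */
#define MAX_BRW_PAGES_SKETCH 256UL

/* Never request fewer pages than one full-sized bulk RPC needs. */
static unsigned long pages_to_request_sketch(unsigned long requested)
{
	return requested > MAX_BRW_PAGES_SKETCH ?
	       requested : MAX_BRW_PAGES_SKETCH;
}

/* pagesize * 256 bytes, expressed in megabytes (1 MB = 2^20 bytes). */
static unsigned long survey_cache_mb_sketch(unsigned long page_size)
{
	assert(page_size > 0);
	return (page_size * 256UL) >> 20;
}
```

Under these assumptions a 4096-byte page size gives a 1 MB cache setting, and a request for zero pages is raised to the placeholder minimum.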

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ib0adb0dc363a3b885d1566fda2ac3b9da013c238
Reviewed-on: http://review.whamcloud.com/11971
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Cliff White <cliff.white@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago Revert "LU-5521 grant: quiet message on grant waiting timeout"
Oleg Drokin [Wed, 1 Oct 2014 01:12:48 +0000 (21:12 -0400)]
Revert "LU-5521 grant: quiet message on grant waiting timeout"

This is causing problems with LU-5656 and a lot of dmesg spam about
unterminated strings like
format at osc_cache.c:1524:osc_enter_cache doesn't end in newline

This reverts commit 150246e73c925d628ce9cbbd8184c0b0eefc9a16.

Conflicts:
lustre/osc/osc_cache.c

Change-Id: Ie6dee548a8984b5cb51effcfa9427ffcfbf31f74

5 years ago Revert "LU-5654 osd-ldiskfs: Handle holes in osd_ldiskfs_read()"
Oleg Drokin [Wed, 1 Oct 2014 01:09:05 +0000 (21:09 -0400)]
Revert "LU-5654 osd-ldiskfs: Handle holes in osd_ldiskfs_read()"

Adds problems documented in LU-5684

This reverts commit b5485d307568af92e1a940fa4a7859e6db5b7a97.

5 years ago LU-5512 lfsck: repair dangling name entry 30/11330/29
Fan Yong [Wed, 6 Aug 2014 07:02:54 +0000 (15:02 +0800)]
LU-5512 lfsck: repair dangling name entry

If the MDT-object referenced by the name entry is lost, then the
namespace LFSCK needs to repair the inconsistency as required:

1) Keep the inconsistency there and report the inconsistency case,
   then give the chance to the application to find related issues,
   and the users can make the decision about how to handle it with
   more human knowledge. (by default)

2) Re-create the missed MDT-object with the FID (in the name entry)

The LFSCK will allow the administrator to specify how to handle the
dangling name entry via a new option "-C" when triggering the LFSCK:

[-Coff] or [--create_mdtobj=off]:
Report the inconsistency via log, but keep the dangling name entry
there without repairing. (by default)

-C[on] or --create_mdtobj[=on]:
Create the lost MDT-object.
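
An option with an optional on/off argument like this one can be parsed with getopt_long(), as in the sketch below. It is a hypothetical stand-alone parser, not the lctl implementation; the function name and return convention are assumptions, and it relies on the GNU "C::" optional-argument syntax.

```c
#include <getopt.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical parser for "-C[on|off]" / "--create_mdtobj[=on|off]".
 * Returns 1 if creation is enabled, 0 if disabled (the default),
 * or -1 on an unrecognized option or argument.
 */
static int parse_create_mdtobj_sketch(int argc, char *const argv[])
{
	static const struct option long_opts[] = {
		{ "create_mdtobj", optional_argument, NULL, 'C' },
		{ NULL, 0, NULL, 0 },
	};
	int create = 0;		/* off by default */
	int c;

	/* Setting optind to 0 asks glibc to restart option scanning. */
	optind = 0;
	while ((c = getopt_long(argc, argv, "C::", long_opts, NULL)) != -1) {
		switch (c) {
		case 'C':
			/* A bare -C or --create_mdtobj means "on". */
			if (optarg == NULL || strcmp(optarg, "on") == 0)
				create = 1;
			else if (strcmp(optarg, "off") == 0)
				create = 0;
			else
				return -1;
			break;
		default:
			return -1;
		}
	}
	return create;
}
```

With optional arguments the value must be attached to the option ("-Coff", "--create_mdtobj=off"); a separate word would be treated as a non-option argument.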

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I78231914023b8d02daf4f6cde6176c1ef655f862
Reviewed-on: http://review.whamcloud.com/11330
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-5511 lfsck: repair unmatched parent-child pairs 86/11486/22
Fan Yong [Mon, 4 Aug 2014 12:33:58 +0000 (20:33 +0800)]
LU-5511 lfsck: repair unmatched parent-child pairs

If the parent directory_A references the child directory_B via the
name entry_C, but the child directory_B back-references another
parent directory_D via its ".." name entry, then the namespace
LFSCK should be able to find such an inconsistency in the second-stage
scanning and repair it, basically by trusting the name entry_B.

Usually, for a local filesystem or a Lustre backend filesystem such as
ldiskfs, the local filesystem consistency verification tools,
such as e2fsck, can guarantee such consistency. So the namespace
LFSCK will mainly focus on verifying the consistency of cross-MDT
parent-child pairs.

Lustre does not support hard links on directory objects. So if a
directory contains multiple linkEA entries or a bad linkEA entry,
then the namespace LFSCK should be able to find and repair the related
inconsistency.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: If8cb02f76329fe04fe3a6c280e6926d014654322
Reviewed-on: http://review.whamcloud.com/11486
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years ago LU-4975 ofd: documenting ofd_io.c 56/11156/5
Mikhail Pershin [Mon, 21 Jul 2014 10:30:41 +0000 (14:30 +0400)]
LU-4975 ofd: documenting ofd_io.c

Fix up GPL header block to reference proper GPLv2 license URL.
Remove mention of contacting Sun, since they don't exist anymore.
Add introductory comment block for the ofd_io.c file and add
function comment blocks to all functions in it.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Ic45e8a53f6df127c815f332ef2f5e219506727aa
Reviewed-on: http://review.whamcloud.com/11156
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
5 years ago LU-4974 doc: comments to lod_qos.c 38/11138/5
Alex Zhuravlev [Fri, 18 Jul 2014 10:00:19 +0000 (14:00 +0400)]
LU-4974 doc: comments to lod_qos.c

Add comments to the functions in doxygen style.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I2224170759b9bfb852b467c4950a6d0c73374d53
Reviewed-on: http://review.whamcloud.com/11138
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
5 years ago New tag 2.6.53 2.6.53 v2_6_53 v2_6_53_0
Oleg Drokin [Fri, 26 Sep 2014 23:20:03 +0000 (19:20 -0400)]
New tag 2.6.53

Change-Id: Ia13f23d3ffdafa88be9aab5642347ec370b470c8

5 years ago LU-5396 lnet/klnds: add sparse annotation __user wherever needed 24/11824/5
frank zago [Mon, 1 Sep 2014 15:39:14 +0000 (10:39 -0500)]
LU-5396 lnet/klnds: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: I34620392863d622ea419e777f9e4110f26135853
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11824
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
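
The annotation this series adds can be illustrated with a minimal, self-contained sketch. It assumes the older kernel definition of __user (address space 1, which matches the <asn:1> in the warning above); demo_ids, demo_copy_from_user() and demo_ioctl_get_ids() are hypothetical names for illustration, not Lustre or LNet functions. A regular compiler sees an empty macro, so the generated code is unchanged, which is consistent with "There is no code change."

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Only sparse (which defines __CHECKER__) sees the address-space attribute. */
#ifdef __CHECKER__
# define __user  __attribute__((noderef, address_space(1)))
# define __force __attribute__((force))
#else
# define __user
# define __force
#endif

/* Hypothetical payload passed in from user space. */
struct demo_ids {
	int pid;
	int nid;
};

/* Hypothetical stand-in for copy_from_user(): returns bytes NOT copied. */
static unsigned long demo_copy_from_user(void *dst, const void __user *src,
					 size_t len)
{
	memcpy(dst, (__force const void *)src, len);
	return 0;
}

/* The __user marker documents that uarg must not be dereferenced directly. */
static int demo_ioctl_get_ids(struct demo_ids *out,
			      const struct demo_ids __user *uarg)
{
	if (demo_copy_from_user(out, uarg, sizeof(*out)) != 0)
		return -EFAULT;
	return 0;
}
```

With the annotation in place, sparse can flag a caller that passes a plain kernel pointer where a user pointer is expected, or code that dereferences uarg without going through a copy helper.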
5 years agoLU-5396 obd: add sparse annotation __user wherever needed 21/11821/2
frank zago [Fri, 29 Aug 2014 23:49:15 +0000 (18:49 -0500)]
LU-5396 obd: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: I9040163eedf5ee89be58d78aa7c8d374ed22b981
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11821
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5396 ioctl: add sparse annotation __user wherever needed 20/11820/2
frank zago [Fri, 29 Aug 2014 23:49:01 +0000 (18:49 -0500)]
LU-5396 ioctl: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: I5dae6291e4d22353973088f440083b197010528b
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11820
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5396: ni: make local functions static 06/11306/4
Frank Zago [Thu, 31 Jul 2014 21:22:19 +0000 (16:22 -0500)]
LU-5396: ni: make local functions static

This reduces the code size by about 400 bytes.

Change-Id: I158c3b2c5c68f92938c29090faab1c0153d19e2e
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11306
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5647 kernel: kernel update RHEL7 [3.10.0-123.8.1.el7] 24/12024/2
Bob Glossman [Mon, 22 Sep 2014 22:43:31 +0000 (15:43 -0700)]
LU-5647 kernel: kernel update RHEL7 [3.10.0-123.8.1.el7]

Update RHEL7 kernel to 3.10.0-123.8.1.el7

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I0483fa61ff675158f6c367cb4845dd7480162bdf
Reviewed-on: http://review.whamcloud.com/12024
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5547 mdt: handle NULL parent in mdt_getattr_name_lock() 10/11610/2
John L. Hammond [Tue, 26 Aug 2014 22:11:22 +0000 (17:11 -0500)]
LU-5547 mdt: handle NULL parent in mdt_getattr_name_lock()

In mdt_getattr_name_lock() if the parent object (from mti_object) is
NULL then return -ENOENT rather than asserting.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I081cb54068bba7825e2a5593f293f48cb2eda874
Reviewed-on: http://review.whamcloud.com/11610
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5654 osd-ldiskfs: Handle holes in osd_ldiskfs_read() 35/12035/3
Li Wei [Wed, 24 Sep 2014 04:12:37 +0000 (12:12 +0800)]
LU-5654 osd-ldiskfs: Handle holes in osd_ldiskfs_read()

Current osd_ldiskfs_read() incorrectly returns zero and leaves the
corresponding portion of the buffer untouched when a block to be read
is not allocated.

Change-Id: Idfd441656b99aa039a6bb4f7141b5407553855da
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/12035
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
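
The commit message above describes only the bug. As a rough sketch of the expected read semantics (a read over a hole returns zeros), the following uses a hypothetical in-memory block map; demo_file, demo_read() and DEMO_BLOCK_SIZE are illustrative assumptions, not the osd-ldiskfs code.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_BLOCK_SIZE	8	/* tiny block size, for illustration only */
#define DEMO_NR_BLOCKS	4

/* Hypothetical block map: a NULL entry is a hole (block not allocated). */
struct demo_file {
	const char *blocks[DEMO_NR_BLOCKS];
};

/* Read len bytes starting at off; returns the number of bytes produced. */
static size_t demo_read(const struct demo_file *f, char *buf, size_t len,
			size_t off)
{
	size_t done = 0;

	while (done < len) {
		size_t blk = (off + done) / DEMO_BLOCK_SIZE;
		size_t boff = (off + done) % DEMO_BLOCK_SIZE;
		size_t chunk = DEMO_BLOCK_SIZE - boff;

		if (blk >= DEMO_NR_BLOCKS)
			break;		/* past the end of the demo file */
		if (chunk > len - done)
			chunk = len - done;
		if (f->blocks[blk] != NULL)
			memcpy(buf + done, f->blocks[blk] + boff, chunk);
		else
			/* Hole: zero-fill and count the bytes as read. */
			memset(buf + done, 0, chunk);
		done += chunk;
	}
	return done;
}
```

The point of the sketch is the else branch: the hole still advances the byte count and overwrites the caller's buffer, rather than leaving stale data behind.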
5 years agoLU-4539 mdt: return errors from mdt_unpack_req_pack_rep() 50/11650/2
John L. Hammond [Fri, 29 Aug 2014 14:02:03 +0000 (09:02 -0500)]
LU-4539 mdt: return errors from mdt_unpack_req_pack_rep()

In mdt_intent_opc() if mdt_unpack_req_pack_rep() fails then return its
return value rather than -EPROTO. Also convert mdt_intent_opc()
to the usual error handling style.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I20189171a0d31b6264b680344a8290e604371fe4
Reviewed-on: http://review.whamcloud.com/11650
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5584 llite: call truncate_inode_pages_final() in evict_inode 09/12009/3
Yang Sheng [Mon, 22 Sep 2014 18:12:11 +0000 (02:12 +0800)]
LU-5584 llite: call truncate_inode_pages_final() in evict_inode

In evict_inode() we should invoke truncate_inode_pages_final(),
since the inode is being cleaned up.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ifba1014cf997475a248bf3d0898ae373bd3a7f9c
Reviewed-on: http://review.whamcloud.com/12009
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-4856 misc: Reduce exposure to overflow on page counters. 37/10537/8
Stephen Champion [Tue, 26 Aug 2014 12:12:40 +0000 (05:12 -0700)]
LU-4856 misc: Reduce exposure to overflow on page counters.

When the number of objects in use or in circulation is tied to the
memory size of the system, very large memory systems can overflow
32-bit counters.  This patch addresses overflow on page counters in
the osc LRU and obd accounting.

Signed-off-by: Stephen Champion <schamp@sgi.com>
Change-Id: I8deff55d7cc9774722f3dc38a2c7b15877e698f0
Reviewed-on: http://review.whamcloud.com/10537
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
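
To make the overflow concrete, here is a small arithmetic sketch. It assumes 4 KiB pages and a signed 32-bit counter; the helper names are hypothetical and are not taken from the osc or obd code.

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT	12	/* assumption: 4 KiB pages */

/* Number of whole pages needed to cover the given amount of memory. */
static uint64_t demo_pages_for_bytes(uint64_t bytes)
{
	return bytes >> DEMO_PAGE_SHIFT;
}

/* Non-zero if a page count still fits in a signed 32-bit counter. */
static int demo_fits_in_s32(uint64_t pages)
{
	return pages <= (uint64_t)INT32_MAX;
}
```

Under these assumptions 4 TiB of memory is 2^30 pages and still fits, while 8 TiB is 2^31 pages, one more than INT32_MAX, so a signed 32-bit page counter can no longer represent it.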
5 years agoLU-3963 libcfs: remove proc handler wrappers 63/11963/3
James Simmons [Wed, 17 Sep 2014 21:50:16 +0000 (17:50 -0400)]
LU-3963 libcfs: remove proc handler wrappers

Libcfs has wrappers for the proc handling of other platforms,
which are no longer needed. This patch unwinds those wrappers.

Change-Id: I529a5c3ac0fe6eed9e967782d9d24e40215d3840
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/11963
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5633 ptlrpc: hold rq_lock when modify rq_flags 57/11957/2
Niu Yawei [Wed, 17 Sep 2014 08:00:01 +0000 (04:00 -0400)]
LU-5633 ptlrpc: hold rq_lock when modify rq_flags

In after_reply(), take rq_lock when changing rq_resend.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Id2025f0d5c6e7b0991f9ced031433a4d69dd1a16
Reviewed-on: http://review.whamcloud.com/11957
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5577 lmv: change type of lmv_obd->tgts_size to __u32 81/11881/2
Dmitry Eremin [Thu, 11 Sep 2014 19:39:44 +0000 (23:39 +0400)]
LU-5577 lmv: change type of lmv_obd->tgts_size to __u32

tgts_size is used as unsigned.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ia910f034605a787b45539952a9d1f0c0d04e8891
Reviewed-on: http://review.whamcloud.com/11881
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5577 obd: change type of lmv_tgt_desc->ltd_idx to __u32 79/11879/3
Dmitry Eremin [Thu, 11 Sep 2014 19:03:38 +0000 (23:03 +0400)]
LU-5577 obd: change type of lmv_tgt_desc->ltd_idx to __u32

ltd_idx is used as unsigned.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Iae119e03b4cb58dddca2ce230d963d380255a57a
Reviewed-on: http://review.whamcloud.com/11879
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-4665 tests: support specifying arbitrary OST indices 22/9722/8
Jian Yu [Sat, 20 Sep 2014 00:05:55 +0000 (20:05 -0400)]
LU-4665 tests: support specifying arbitrary OST indices

This patch improves Lustre test framework to support specifying
arbitrary OST indices via OST_INDEX_LIST or OSTINDEX${num}.

For example,

OSTINDEX1="1"
OSTINDEX2="2"
OSTINDEX3="4"
......
or
OST_INDEX_LIST="[1,2,4-6,8]" #[n-m,l-k,...], where n < m and l < k

The default index of an individual OST is its facet number minus 1.

Function facet_index() is added to get index for a facet.

Test-Parameters: alwaysuploadlogs \
envdefinitions=OST_INDEX_LIST=[0-5] ostcount=7 \
testlist=sanity,conf-sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Icf5d377d4d10995c2fb501bf7762ec8937466fdc
Reviewed-on: http://review.whamcloud.com/9722
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
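
The list syntax described above can be sketched as a small parser. The test framework itself is shell, so this C version only illustrates the "[n-m,l-k,...]" format and the default rule (facet number minus 1); demo_expand_index_list() and demo_default_index() are hypothetical names.

```c
#include <ctype.h>
#include <stdlib.h>

/* Default index of an OST facet: its facet number minus 1. */
static int demo_default_index(int facet_num)
{
	return facet_num - 1;
}

/*
 * Expand a list such as "[1,2,4-6,8]" into out[].
 * Returns the number of indices, or -1 on malformed input or overflow.
 */
static int demo_expand_index_list(const char *s, unsigned int *out, int max)
{
	int n = 0;
	char *end;

	if (*s++ != '[')
		return -1;
	while (*s != ']') {
		unsigned long lo, hi;

		if (!isdigit((unsigned char)*s))
			return -1;
		lo = hi = strtoul(s, &end, 10);
		s = end;
		if (*s == '-') {
			s++;
			if (!isdigit((unsigned char)*s))
				return -1;
			hi = strtoul(s, &end, 10);
			s = end;
			if (hi <= lo)	/* a range n-m requires n < m */
				return -1;
		}
		for (; lo <= hi; lo++) {
			if (n >= max)
				return -1;
			out[n++] = (unsigned int)lo;
		}
		if (*s == ',')
			s++;
		else if (*s != ']')
			return -1;
	}
	return n;
}
```

For the example in the commit message, "[1,2,4-6,8]" expands to the six indices 1, 2, 4, 5, 6 and 8.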
5 years agoLU-5040 osd: fix osd declare credit for quota 85/11085/16
Bobi Jam [Mon, 14 Jul 2014 05:51:05 +0000 (13:51 +0800)]
LU-5040 osd: fix osd declare credit for quota

osd_attr_set() always calls ll_vfs_dq_init() to initialize dquot for
the inode, while in some cases osd_declare_attr_set() does not reserve
credit for it.

This patch fixes this issue. This patch also corrects the quota credit
accounting in osd_declare_qid() by checking whether the inode's
i_dquot[quota_type] has been allocated; the old code based its check
on the existence of the file uid/gid, which is not correct.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I16555cb1097e1a3e75cdcb4852a2c5e1772ddd88
Reviewed-on: http://review.whamcloud.com/11085
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-3536 osp: move update packing into out_lib.c 21/9321/23
Wang Di [Sat, 7 Jun 2014 19:57:47 +0000 (12:57 -0700)]
LU-3536 osp: move update packing into out_lib.c

Move osp_update_insert to out_lib.c, so OSP/LOD/OUT
can all pack updates into an OUT RPC.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: If087a0bc6a858e9e6c128311ed0d80e4392ffaff
Reviewed-on: http://review.whamcloud.com/9321
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5468 llite: don't call make_bad_inode() on an old inode 09/11609/2
John L. Hammond [Tue, 26 Aug 2014 21:36:03 +0000 (16:36 -0500)]
LU-5468 llite: don't call make_bad_inode() on an old inode

In ll_iget() if ll_update_inode() fails then do not call
make_bad_inode() on the inode since it may still be in use.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I10bb0ad606fad2eff6f6cf5cc7da157e9db59c94
Reviewed-on: http://review.whamcloud.com/11609
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5639 lnet: portal spreading rotor should be unsigned 36/11936/3
Liang Zhen [Tue, 16 Sep 2014 03:49:40 +0000 (11:49 +0800)]
LU-5639 lnet: portal spreading rotor should be unsigned

Portal spreading rotor should be unsigned, otherwise lnet may get
negative CPT number and access invalid addresses.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: Id7f40da241af3b01483fdedd366b09329f530163
Reviewed-on: http://review.whamcloud.com/11936
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
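
Why the signedness matters can be shown with plain C remainder arithmetic. This is only a sketch with hypothetical helper names; it does not reproduce the LNet portal code.

```c
/* A signed rotor that has gone negative yields a negative remainder. */
static int demo_cpt_signed(int rotor, int ncpts)
{
	return rotor % ncpts;	/* C99: result takes the sign of rotor */
}

/* An unsigned rotor always maps into the range 0..ncpts-1. */
static unsigned int demo_cpt_unsigned(unsigned int rotor, unsigned int ncpts)
{
	return rotor % ncpts;
}
```

With four CPTs, a signed rotor value of -3 gives -3, which is not a valid array index, while the same bit pattern taken as unsigned gives 1.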
5 years agoLU-4788 lfsck: enable verification for remote object 17/11317/25
Fan Yong [Fri, 1 Aug 2014 01:00:31 +0000 (09:00 +0800)]
LU-4788 lfsck: enable verification for remote object

Based on the LFSCK 1.5 framework, enable the namespace LFSCK
scanning for remote objects.

During the first-stage scanning, if the object contains a remote
linkEA entry or multiple linkEA entries, or claims to have multiple
links, then it will be recorded in the namespace LFSCK tracing
file for double scanning.

Some cleanup for the namespace LFSCK tracing file (lfsck_namespace)
and other code cleanup.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibc87ae9a5c6b7f67a9215140cf2cb89640bce0a9
Reviewed-on: http://review.whamcloud.com/11317
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5506 lfsck: skip orphan OST-object handling for failed OSTs 96/10996/21
Fan Yong [Tue, 29 Jul 2014 10:32:07 +0000 (18:32 +0800)]
LU-5506 lfsck: skip orphan OST-object handling for failed OSTs

The layout LFSCK records the failed OSTs in the LFSCK tracing
file (lfsck_layout) during the first-stage scanning. When it moves
to the second-stage scanning, the layout LFSCK then knows which OSTs
contain OST-objects that have not been verified or failed to be
verified during the first-stage scanning, and it skips the orphan
OST-object handling on those OSTs. Other OSTs are handled as
normal, unaffected by the failed OSTs.

This patch also builds the framework for recording failed MDTs for
the namespace LFSCK.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I4d7a9fc2e22cb8c6ef1c4cf73383ec588c95da53
Reviewed-on: http://review.whamcloud.com/10996
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-4788 lfsck: namespace LFSCK uses assistant thread 03/10603/24
Fan Yong [Tue, 29 Jul 2014 19:02:18 +0000 (03:02 +0800)]
LU-4788 lfsck: namespace LFSCK uses assistant thread

Move the LFSCK assistant thread from layout.c to engine.c, and
share it between the layout LFSCK and the namespace LFSCK.

With the assistant thread, the namespace LFSCK can build an
async pipeline for scanning directories, as the layout LFSCK
does for scanning stripes, so the LFSCK main engine is
not blocked by cross-MDT verification.

The namespace LFSCK assistant thread is necessary because both
the layout LFSCK and the namespace LFSCK are driven by the same
LFSCK main engine. If the LFSCK main engine is blocked by
namespace handling, then the layout LFSCK is blocked as well.
Currently, the LFSCK main engine and the layout LFSCK assistant
thread form an async pipeline, so the LFSCK main engine
is not blocked by layout-related remote operations. A second
pipeline is therefore needed for namespace-related handling, to
keep the LFSCK main engine from being blocked by namespace-related
remote operations.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I99e18ab1d85ad4d74b16b2387767422907781d5e
Reviewed-on: http://review.whamcloud.com/10603
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-3331 llite: remove llite proc root on init failure 20/6420/6
John L. Hammond [Wed, 17 Sep 2014 22:00:21 +0000 (18:00 -0400)]
LU-3331 llite: remove llite proc root on init failure

In init_lustre_lite() ensure that /proc/fs/lustre/llite is removed in
case of failure. Generally rework the cleanup code in this function.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I95844622dd39d2529d227b99fffbc938fd17910a
Reviewed-on: http://review.whamcloud.com/6420
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
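
The kind of cleanup rework described above usually follows the kernel's goto-unwind idiom. The following is a generic sketch of that idiom, not the actual init_lustre_lite() code; the three setup steps and all demo_* names are hypothetical.

```c
#include <errno.h>

static int demo_state;	/* bitmask of the steps currently set up */

/* Hypothetical setup step: fails with -ENOMEM when asked to. */
static int demo_step(int bit, int fail)
{
	if (fail)
		return -ENOMEM;
	demo_state |= bit;
	return 0;
}

static void demo_undo(int bit)
{
	demo_state &= ~bit;
}

/* fail_at selects which step (1..3) fails; 0 means all steps succeed. */
static int demo_init(int fail_at)
{
	int rc;

	rc = demo_step(1, fail_at == 1);	/* e.g. create the proc root */
	if (rc != 0)
		goto out;
	rc = demo_step(2, fail_at == 2);	/* e.g. allocate a cache */
	if (rc != 0)
		goto out_proc;
	rc = demo_step(4, fail_at == 3);	/* e.g. register the fs type */
	if (rc != 0)
		goto out_cache;
	return 0;

out_cache:
	demo_undo(2);
out_proc:
	demo_undo(1);	/* the proc root is removed on every failure path */
out:
	return rc;
}
```

Each label undoes exactly the steps that had succeeded before the failure, in reverse order, so a late failure cannot leave the first step's state behind.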
5 years agoLU-5522 ldlm: remove expired lock from per-export list 34/11634/2
Johann Lombardi [Thu, 28 Aug 2014 12:42:56 +0000 (14:42 +0200)]
LU-5522 ldlm: remove expired lock from per-export list

Expired locks processed by the ldlm_elt thread might still be
referenced in the per-export BL AST list. Most of the time this
has no impact, except when a request for this export is still
being processed and might scan the per-export BL AST list.

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I23745770983507ffd986de9ba056a03b11199a78
Reviewed-on: http://review.whamcloud.com/11634
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-2675 obd: rename LUSTRE_STRIPE_MAXBYTES 00/11800/2
John L. Hammond [Mon, 8 Sep 2014 14:01:14 +0000 (09:01 -0500)]
LU-2675 obd: rename LUSTRE_STRIPE_MAXBYTES

Rename LUSTRE_STRIPE_MAXBYTES to LUSTRE_EXT3_STRIPE_MAXBYTES and
correct the comment describing its use.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I320c86b3f90e8b0c12fd91d6ca95684fb84fe36c
Reviewed-on: http://review.whamcloud.com/11800
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
5 years agoLU-2675 lustre: remove linux/lustre_log.h 02/11502/2
John L. Hammond [Mon, 18 Aug 2014 18:28:48 +0000 (13:28 -0500)]
LU-2675 lustre: remove linux/lustre_log.h

Remove lustre/include/linux/lustre_log.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ibc4d042df9d95301b8d2a8e3df39d7075752ab9d
Reviewed-on: http://review.whamcloud.com/11502
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
5 years agoLU-2675 lustre: remove lustre_lite.h 01/11501/3
John L. Hammond [Mon, 18 Aug 2014 18:24:00 +0000 (13:24 -0500)]
LU-2675 lustre: remove lustre_lite.h

Remove the unused struct lustre_rw_params and the unused function
lustre_build_lock_params(). Move several definitions only used in
lustre/llite/ to lustre/llite/llite_internal.h. Remove
lustre/include/{,linux/}lustre_lite.h and fix up the missing includes
in other headers that this exposes.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Iac27e0e50407e39122b121d6040244a5fe1b9f15
Reviewed-on: http://review.whamcloud.com/11501
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
5 years agoLU-5416 ofd: improve error handling in ofd_precreate_objects() 70/11370/2
John L. Hammond [Thu, 7 Aug 2014 19:37:17 +0000 (14:37 -0500)]
LU-5416 ofd: improve error handling in ofd_precreate_objects()

In ofd_precreate_objects() fix two invalid assertions triggered by
errors in the first iterations of the declare loop and the create
loop.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I8bee1528dc58d5b8b35ac89056c667370cc347c5
Reviewed-on: http://review.whamcloud.com/11370
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-4974 doc: comments to the functions in lod_dev.c 15/11115/5
Alex Zhuravlev [Wed, 16 Jul 2014 10:52:47 +0000 (14:52 +0400)]
LU-4974 doc: comments to the functions in lod_dev.c

Add comments to lod_dev.c using doxygen markup.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Signed-off-by: Richard Henwood <richard.henwood@intel.com>
Change-Id: I60cbb7ac3d8edb225c75ac4c7fe93fb17a87994c
Reviewed-on: http://review.whamcloud.com/11115
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
5 years agoLU-4975 ofd: documenting ofd_obd.c 58/10658/7
Mikhail Pershin [Tue, 10 Jun 2014 05:35:24 +0000 (09:35 +0400)]
LU-4975 ofd: documenting ofd_obd.c

Fix up GPL header block to reference proper GPLv2 license URL.
Remove mention of contacting Sun, since they don't exist anymore.
Add introductory comment block for the ofd_obd.c file and add
function comment blocks to all functions in it.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I3a4e48ffbc6dffac6647e357f16e3aac2902fd23
Reviewed-on: http://review.whamcloud.com/10658
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
5 years agoLU-4975 ofd: documenting ofd_dev.c 39/10639/6
Mikhail Pershin [Wed, 28 May 2014 08:55:48 +0000 (12:55 +0400)]
LU-4975 ofd: documenting ofd_dev.c

Fix up GPL header block to reference proper GPLv2 license URL.
Remove mention of contacting Sun, since they don't exist anymore.
Add introductory comment block for the ofd_dev.c file and add
function comment blocks to all functions in it.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I8ed91764cff36f51eae78a025efbc34a61fd825c
Reviewed-on: http://review.whamcloud.com/10639
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
5 years agoLU-4974 doc: comments to lod_lov.c 31/10431/7
Alex Zhuravlev [Fri, 23 May 2014 09:05:17 +0000 (13:05 +0400)]
LU-4974 doc: comments to lod_lov.c

Add comments to lod_lov.c and fix up the GPL header block to
reference the proper GPLv2 license URL.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I6d087980cf80b418cd92974cd9eab7a55bda6c80
Reviewed-on: http://review.whamcloud.com/10431
Tested-by: Jenkins
Reviewed-by: Richard Henwood <richard.henwood@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
5 years agoLU-4416 tests: small fixes for conf_sanity 83/10783/8
Yang Sheng [Thu, 18 Sep 2014 16:11:18 +0000 (12:11 -0400)]
LU-4416 tests: small fixes for conf_sanity

- In util-linux 2.23.2, umount calls stat() on the
  mountpoint, so test_5a has to use 'umount -f';
  otherwise the test blocks forever.
- Invoke cleanup_fs2 in test_24b.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ie7540d6cfd6790417c63ea1efb50381cacc8345f
Reviewed-on: http://review.whamcloud.com/10783
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5607 ptlrpc: restore posix_acl_xattr checks 33/11933/2
John L. Hammond [Mon, 15 Sep 2014 23:49:57 +0000 (18:49 -0500)]
LU-5607 ptlrpc: restore posix_acl_xattr checks

Restore the wiretest checks on posix_acl_xattr_entry and
posix_acl_xattr_header.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I3935ac2f8b7e894896949ff8a5c44fe6f78e1cae
Reviewed-on: http://review.whamcloud.com/11933
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
5 years agoLU-4749 mgs: Recognize colons in failover.node values 56/11956/3
Li Wei [Wed, 17 Sep 2014 05:55:33 +0000 (13:55 +0800)]
LU-4749 mgs: Recognize colons in failover.node values

Current mgs assumes that all the NIDs in a single failover.node
parameter belong to one node.  This is not the case, even prior to
http://review.whamcloud.com/11161, because mkfs.lustre options like

  --servicenode=<nid1>,<nid2>:<nid3>,<nid4>

are allowed.

Change-Id: If3e2ebc0e81093e9c8304e496afdca24edf456ef
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/11956
Tested-by: Jenkins
Reviewed-by: Johann Lombardi <johann.lombardi@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
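
The grouping implied by the --servicenode syntax above can be sketched as follows: ':' separates nodes and ',' separates the NIDs of one node. This is an illustration only, with a hypothetical helper name, and it assumes the NID strings themselves contain neither ':' nor ','; it is not the mgs parsing code.

```c
/*
 * Count the failover nodes in a value such as "nid1,nid2:nid3,nid4".
 * nids_per_node[i] receives the number of NIDs of node i.
 * Returns the number of nodes, 0 for an empty value, -1 if max is exceeded.
 */
static int demo_count_nodes(const char *val, int *nids_per_node, int max)
{
	int node = 0;

	if (*val == '\0' || max <= 0)
		return 0;
	nids_per_node[0] = 1;
	for (; *val != '\0'; val++) {
		if (*val == ':') {
			/* A colon starts the NID list of the next node. */
			if (++node >= max)
				return -1;
			nids_per_node[node] = 1;
		} else if (*val == ',') {
			/* A comma adds another NID to the current node. */
			nids_per_node[node]++;
		}
	}
	return node + 1;
}
```

A value with four NIDs and one colon therefore describes two nodes with two NIDs each, rather than one node with four NIDs.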
5 years agoLU-5396 mdc: add sparse annotation __user wherever needed 22/11822/3
frank zago [Fri, 29 Aug 2014 23:50:31 +0000 (18:50 -0500)]
LU-5396 mdc: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: Ie3acf675d284a595d4aa871b522a0c409973da18
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11822
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5396 llite: add sparse annotation __user wherever needed 18/11818/3
Frank Zago [Wed, 23 Jul 2014 21:30:22 +0000 (16:30 -0500)]
LU-5396 llite: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: Ia56bf32ab880c34f8a121f7ba3a7cce546308448
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11818
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5531 osc: Debug to match extent to brw RPC 48/11548/3
Patrick Farrell [Thu, 21 Aug 2014 20:59:23 +0000 (15:59 -0500)]
LU-5531 osc: Debug to match extent to brw RPC

Currently, it's difficult to match brw RPCs to objects and
extents from client logs.  This patch adds a D_RPCTRACE
debug message giving the necessary information.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: Ic3470a805f00b5d9c0f6f72470d3f3eb48a10b7a
Reviewed-on: http://review.whamcloud.com/11548
Tested-by: Jenkins
Reviewed-by: Alexey Lyashkov <alexey.lyashkov@seagate.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Ann Koehler <amk@cray.com>
Reviewed-by: Ryan Haasken <haasken@cray.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5631 obdclass: Proper swabbing of llog_rec_tail. 37/11937/2
Henri Doreau [Tue, 16 Sep 2014 10:34:07 +0000 (12:34 +0200)]
LU-5631 obdclass: Proper swabbing of llog_rec_tail.

A variable-length structure precedes llog_rec_tail within an llog
block. Thus cr_tail shouldn't be accessed directly as a structure
member but its actual location should be computed dynamically.

Signed-off-by: Henri Doreau <henri.doreau@cea.fr>
Change-Id: I2d244797d107cf52f647e19b2db780138e910925
Reviewed-on: http://review.whamcloud.com/11937
Tested-by: Jenkins
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
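
Computing the tail location from the record length, as described above, can be sketched like this. The structs are simplified, hypothetical stand-ins (demo_rec_hdr and demo_rec_tail) rather than the real llog definitions, and the sketch assumes the header length is already in host byte order when the tail is located.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified record header: len covers header, payload and tail. */
struct demo_rec_hdr {
	uint32_t len;
	uint32_t index;
	uint32_t type;
	uint32_t id;
};

struct demo_rec_tail {
	uint32_t len;
	uint32_t index;
};

/* The tail occupies the last bytes of the record, after the payload. */
static struct demo_rec_tail *demo_rec_tail(struct demo_rec_hdr *hdr)
{
	return (struct demo_rec_tail *)((char *)hdr + hdr->len -
					sizeof(struct demo_rec_tail));
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t demo_swab32(uint32_t v)
{
	return (v >> 24) | ((v >> 8) & 0xff00) |
	       ((v << 8) & 0xff0000) | (v << 24);
}

/* Swab the tail in place; hdr->len must be in host order already. */
static void demo_swab_tail(struct demo_rec_hdr *hdr)
{
	struct demo_rec_tail *tail = demo_rec_tail(hdr);

	tail->len = demo_swab32(tail->len);
	tail->index = demo_swab32(tail->index);
}
```

Because the payload between header and tail varies in size, a fixed struct member offset would point into the payload for any record whose payload is longer than the struct declares.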
5 years agoLU-3331 obdclass: return errors from class_procfs_init() 35/11935/2
John L. Hammond [Tue, 16 Sep 2014 00:02:49 +0000 (19:02 -0500)]
LU-3331 obdclass: return errors from class_procfs_init()

In class_procfs_init() if /proc/fs/lustre/ or /proc/fs/lustre/devices
cannot be created then cleanup and return an error.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I236a4af31c44b73207d411c02873c10ee7478ca7
Reviewed-on: http://review.whamcloud.com/11935
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
5 years agoLU-2456 lnet: DLC user/kernel space glue code 23/8023/48
Amir Shehata [Fri, 18 Oct 2013 02:24:02 +0000 (19:24 -0700)]
LU-2456 lnet: DLC user/kernel space glue code

This is the sixth patch of a set of patches that enables DLC.

This patch enables user space to call into the kernel-space
DLC code.  It adds handlers in the LNetCtl function that call
the new functions added for Dynamic LNet Configuration (DLC).

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I19af05abaee827ce8ce7be38ffb2f80611a9f0ca
Reviewed-on: http://review.whamcloud.com/8023
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5104 build: utilities cleanup 54/10654/19
Dmitry Eremin [Fri, 30 May 2014 14:38:24 +0000 (18:38 +0400)]
LU-5104 build: utilities cleanup

- remove unused utilities: lustre_createcsv, lustre_config, lustre_up14
- move req_layout, loadgen, wirecheck and wiretest to tests RPM
- add dependency tests from utils (multiop from liblustreapi)
- make correct mapping for configure parameters
  "--enable-*/--disable-*" to "--with/--without *"
- fix find command for empty results
- fix utils building (add ifdef UTILS and fix .spec file)
- don't install some server scripts and utils for client builds

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I947ea4739526f74a18427dd853c4198871dd0e21
Reviewed-on: http://review.whamcloud.com/10654
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5396 lov: add sparse annotation __user wherever needed 23/11823/2
frank zago [Mon, 1 Sep 2014 01:16:09 +0000 (20:16 -0500)]
LU-5396 lov: add sparse annotation __user wherever needed

This fixes sparse warnings such as:

  .../api-ni.c:1639:33: warning: incorrect type in argument 3
                             (different address spaces)
  .../api-ni.c:1639:33:    expected struct lnet_process_id_t
                             [noderef] [usertype] <asn:1>*ids
  .../api-ni.c:1639:33:    got struct lnet_process_id_t
                             [usertype] *<noident>

There is no code change.

Change-Id: I67c89e141a30d175091673034d48c1aa75148891
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/11823
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-3963 libcfs: remove last of cfs list wrappers 97/11797/2
James Simmons [Sun, 7 Sep 2014 15:19:24 +0000 (11:19 -0400)]
LU-3963 libcfs: remove last of cfs list wrappers

Delete the cfs wrappers for linux list and hlist api.
Removal of these wrappers exposed the last remaining
unconverted list operators.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If72e68e93663174c5c3540e2444cc50d37d04a48
Reviewed-on: http://review.whamcloud.com/11797
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5588 mdt: fix NULL pointer 'obd' can be dereferenced 68/11768/2
Dmitry Eremin [Fri, 5 Sep 2014 12:59:03 +0000 (16:59 +0400)]
LU-5588 mdt: fix NULL pointer 'obd' can be dereferenced

Pointer 'obd' checked for NULL at line 3992 will be dereferenced
at line 4040. Pointer 'obd' checked for NULL at line 4120 will be
dereferenced at line 4170.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I1478cb67f3b17646e631ab38b4a9bbadf82fac31
Reviewed-on: http://review.whamcloud.com/11768
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
5 years agoLU-5556 target: limit bulk transfer time 17/11717/5
Johann Lombardi [Mon, 1 Sep 2014 13:03:51 +0000 (15:03 +0200)]
LU-5556 target: limit bulk transfer time

Messages lost during bulk transfer are not resent, so there is no
point in waiting for a very long time (up to at_max/600s has been
seen). This patch adds a new static timeout for the bulk transfer
(100s by default).
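
As a rough illustration of the capping described above, the sketch below
takes the smaller of an adaptive-timeout estimate and the static bulk
limit. The function name, the parameter names and the use of a plain
min() are assumptions made for this example; only the 100s default comes
from the commit message.

```python
BULK_TIMEOUT = 100  # seconds; the default named in the commit message


def bulk_wait_limit(adaptive_timeout, bulk_timeout=BULK_TIMEOUT):
    """Return how long to wait for a bulk transfer, in seconds.

    Hypothetical helper: the wait is never longer than the static
    bulk limit, however large the adaptive estimate has grown.
    """
    return min(adaptive_timeout, bulk_timeout)
```

With these assumptions an adaptive estimate of 600s is capped at 100s,
while a 30s estimate is left unchanged.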

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I3926a7a8f2bce4cbd00b8fe54094a8e9cbec1508
Reviewed-on: http://review.whamcloud.com/11717
Tested-by: Jenkins
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Li Wei <wei.g.li@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5521 grant: quiet message on grant waiting timeout 16/11716/3
Johann Lombardi [Mon, 1 Sep 2014 10:38:31 +0000 (12:38 +0200)]
LU-5521 grant: quiet message on grant waiting timeout

Use at_max in osc_enter_cache() to bound how long we wait for grant
space before switching to synchronous I/Os. Do not print a message
on the console when the timeout is hit, since such a long wait can
be legitimate with a flaky network (i.e. BRW is resent multiple times).
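
The bounded wait can be pictured with the user-space sketch below. It is
not the osc_enter_cache() code: the function signature, the Event object
standing in for "grant space became available" and the string return
values are all assumptions made for illustration.

```python
import threading


def enter_cache(grant_available, at_max):
    """Wait up to at_max seconds for grant, then pick the I/O mode.

    grant_available is a threading.Event set by whoever frees grant
    space (hypothetical stand-in for the real wakeup).
    """
    if grant_available.wait(timeout=at_max):
        return "async"  # grant arrived in time, the cached write proceeds
    # Timed out: no console message, quietly fall back to synchronous I/O.
    return "sync"
```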

Signed-off-by: Johann Lombardi <johann.lombardi@intel.com>
Change-Id: I9c507a11ea4a3612e932e22bebb5087dbdcf248a
Reviewed-on: http://review.whamcloud.com/11716
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-2675 lustre: remove linux/lustre_lib.h 00/11500/2
John L. Hammond [Mon, 18 Aug 2014 17:58:29 +0000 (12:58 -0500)]
LU-2675 lustre: remove linux/lustre_lib.h

Remove lustre/include/linux/lustre_lib.h. Remove some unused
declarations from lustre/include/lustre_lib.h and move some others to
more natural headers.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I30b032e5e766cc052dbf6af34eb2e5d3f1af589e
Reviewed-on: http://review.whamcloud.com/11500
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5453 mdt: mdt_reint_setattr() and mdt_attr_set() fixes 53/11353/4
John L. Hammond [Wed, 6 Aug 2014 19:38:55 +0000 (14:38 -0500)]
LU-5453 mdt: mdt_reint_setattr() and mdt_attr_set() fixes

In mdt_reint_setattr() add early checks that the object exists and is
local. Remove the invalid assertion in mdt_attr_set() that the object
is local. Replace several assertions on wire data with -EPROTO
returns. In mdt_attr_set() when locking the slaves of a striped
directory use a PW lock for slave 0. Remove the unused flags parameter
of mdt_attr_set().
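
The idea of answering malformed wire data with -EPROTO instead of
asserting can be sketched as follows. The function name and the bitmask
check are hypothetical and chosen only to show the pattern; they are not
taken from the MDT code.

```python
import errno


def check_wire_attr(valid_mask, known_mask):
    """Return 0 if every flag sent by the peer is understood.

    Data from the wire is untrusted, so an unknown flag yields
    -EPROTO for the caller to return rather than tripping an assertion.
    """
    if valid_mask & ~known_mask:
        return -errno.EPROTO
    return 0
```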

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I2aa5a645b8398c1e2890291cb61ffa280ec82443
Reviewed-on: http://review.whamcloud.com/11353
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-2675 llite: remove generic operations from llite/namei.c 69/10769/6
John L. Hammond [Fri, 20 Jun 2014 17:32:19 +0000 (12:32 -0500)]
LU-2675 llite: remove generic operations from llite/namei.c

Remove the llite inode operation wrappers ll_mknod(), ll_unlink(),
ll_mkdir(), ll_rmdir(), ll_symlink(), ll_link(), and ll_rename(),
replacing them with calls to ll_mknod_generic(), but rename these
functions to drop the _generic suffix. Remove several variables that
are always NULL after this transformation. Simplify some control flow,
using the knowledge that certain variables are never NULL. Replace
calls to ll_d_mountpoint() with equivalent calls to d_mountpoint() and
remove the then unused function ll_d_mountpoint(). Remove the
effectively unused lookup_flags parameter from ll_create_it().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib1757399c66a119e842d3361f184128ce2b96a78
Reviewed-on: http://review.whamcloud.com/10769
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-4788 lfsck: verify .lustre/lost+found at the LFSCK start 87/10987/18
Fan Yong [Sat, 26 Jul 2014 23:49:29 +0000 (07:49 +0800)]
LU-4788 lfsck: verify .lustre/lost+found at the LFSCK start

/ROOT/.lustre/lost+found/ is a special directory to hold the
objects that the LFSCK does not exactly know how to handle,
such as orphans. So before the LFSCK scans the system,
the consistency of this directory needs to be verified first
so that it can be used during the LFSCK.

fid_seq_is_dot_lustre() is a duplicate of fid_seq_is_dot(),
so drop it and clean up the code.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I95cac84bed1ae16c8c86e495db0120d964395b5e
Reviewed-on: http://review.whamcloud.com/10987
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5509 osd: get PFID from linkEA for remote dir on ldiskfs 85/11485/12
Fan Yong [Sat, 26 Jul 2014 22:20:22 +0000 (06:20 +0800)]
LU-5509 osd: get PFID from linkEA for remote dir on ldiskfs

On the ldiskfs backend, for a directory whose parent resides on a
remote MDT, to satisfy the local e2fsck, we insert it into the
/REMOTE_PARENT_DIR locally. On the other hand, so that lookup(..)
on the directory can return the real parent FID, we append the real
parent FID after its ".." name entry in the /REMOTE_PARENT_DIR.

Unfortunately, such a PFID-in-dirent cannot be preserved via file-level
backup. So after the restore, we cannot get the right parent FID from
its ".." name entry in the /REMOTE_PARENT_DIR. In that case, since
we have stored the real parent FID in the directory object's linkEA,
we can parse the linkEA for the real parent FID.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Icf1e24ec911818b3a49a253f67c72334a4b75712
Reviewed-on: http://review.whamcloud.com/11485
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5508 osp: RPC adjustment for remote transaction 82/11382/12
Fan Yong [Sat, 26 Jul 2014 22:15:42 +0000 (06:15 +0800)]
LU-5508 osp: RPC adjustment for remote transaction

1) For a remote transaction, the set_attr/set_xattr RPC should not be
   prepared in the declare phase. According to our current transaction/
   dt_object_lock framework, the transaction sponsor starts the
   transaction first, then tries to acquire the related dt_object_lock
   if needed. That is a general rule, and the LFSCK needs to follow
   it when repairing an inconsistent linkEA, whether the MDT-object
   is local or remote.

   For the linkEA repair case, before the LFSCK thread has obtained
   the dt_object_lock on the target MDT-object, it cannot know whether
   the MDT-object has a linkEA, nor whether that linkEA is valid.

   Since the LFSCK cannot hold the dt_object_lock before the (remote)
   transaction starts (otherwise there will be a potential deadlock),
   it cannot prepare the related RPC for repairing during the declare
   phase as other normal transactions do.

   To resolve this, we make the OSP prepare the related RPC
   (set_attr/set_xattr/del_xattr) after the remote transaction has
   started, and trigger the remote update (RPC sending) at trans_stop.
   Then the upper layer users, such as the LFSCK, can follow the general
   rule to handle trans_start/dt_object_lock for repairing linkEA
   inconsistency without distinguishing remote MDT-objects.

2) Some adjustments to the OSP object attribute cache maintenance to make
   the logic clearer and more reasonable.
2.1) Update the cached attribute in osp_attr_set(), but not in the
     osp_declare_attr_set().
2.2) Update the cached extended attribute in osp_xattr_set(), but not
     in the osp_declare_xattr_set().
2.3) Drop the cached extended attribute in osp_xattr_del().

3) Typo fixing and code cleanup.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I43c88a8fd3b184c91a4b3cbd4104e35f9915ee24
Reviewed-on: http://review.whamcloud.com/11382
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5610 tests: Handle quoted module options 87/11887/3
Stephen Champion [Fri, 12 Sep 2014 00:03:01 +0000 (17:03 -0700)]
LU-5610 tests: Handle quoted module options

When test-framework.sh translates module options to environment
variables for remote nodes, quotes should be escaped to the subshell.
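
The quoting problem can be illustrated outside test-framework.sh with
Python's shlex module. The sketch below is not the patch itself: the
helper name, the variable name MODOPTS_EXAMPLE and the option string are
all hypothetical.

```python
import shlex


def remote_env_command(var, value, command):
    """Build a shell command line that sets var=value for command.

    shlex.quote() wraps the value so that embedded spaces and double
    quotes reach the remote shell as part of a single word.
    """
    return f"{var}={shlex.quote(value)} {command}"


if __name__ == "__main__":
    # Hypothetical module option containing a quoted, space-separated value.
    print(remote_env_command("MODOPTS_EXAMPLE", 'opt="a b"', "modprobe example"))
```

Without the quoting step the remote shell would split the value at the
embedded space and treat the second half as a separate word.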

Signed-off-by: Stephen Champion <schamp@sgi.com>
Change-Id: I937cc28b96b54ea75082c7d8789c762b4db16c5f
Reviewed-on: http://review.whamcloud.com/11887
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-5596 lnet: Remove obsolete LNET variable 25/11825/3
James Nunez [Mon, 8 Sep 2014 23:08:14 +0000 (17:08 -0600)]
LU-5596 lnet: Remove obsolete LNET variable

For Lustre version 2.6.50 and later, the variable
session_features is defined as "LST_FEATS_MASK". For
earlier versions of Lustre, the same variable is defined
as "LST_FEATS_EMPTY".

Since Lustre master is at 2.6.52 or later, the second
definition of "session_features" can be removed.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I52e7a914880509cfcd4961032ab7775bbaf626a8
Reviewed-on: http://review.whamcloud.com/11825
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-4746 libcfs: Use Linux kernel current_umask() function 42/11642/2
James Simmons [Thu, 28 Aug 2014 18:58:50 +0000 (14:58 -0400)]
LU-4746 libcfs: Use Linux kernel current_umask() function

Lustre not using kernel current_umask() function breaks GRSecurity
umask handling. This is also needed for the linux api cleanup.

Replace current->fs->umask with the more secure current_umask() function.

Change-Id: Ide0b83eb3e6c69e1e2178ede37ce708227f1c107
Signed-off-by: Andrew Prout <ajprout@hotmail.com>
Signed-off-by: Cliff White <cliffwhi@intel.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/11642
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
5 years agoLU-5499 tests: keep /sbin/mount.lustre until cleanup 59/11259/7
Andreas Dilger [Wed, 28 May 2014 23:15:25 +0000 (17:15 -0600)]
LU-5499 tests: keep /sbin/mount.lustre until cleanup

Don't unmount /sbin/mount.lustre in the middle of running tests
on a local test system if it is not doing final cleanup.  Otherwise,
later mounts may fail.

The current /sbin/mount.lustre mountpoint is an empty stub that
returns success (0) if executed, but doesn't mount the filesystem.
Instead, create a mountpoint that prints a message if executed and
returns an error to the caller, so it is easier to debug problems.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ied7b69f536bad87333cf5c543384723412500c1e
Reviewed-on: http://review.whamcloud.com/11259
Reviewed-by: frank zago <fzago@cray.com>
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-1669 vvp: Use lockless __generic_file_aio_write 72/6672/12
Prakash Surya [Thu, 3 Oct 2013 00:16:51 +0000 (17:16 -0700)]
LU-1669 vvp: Use lockless __generic_file_aio_write

Testing multi-threaded single shared file write performance has shown
the inode mutex to be a limiting factor when using the
generic_file_aio_write function. To work around this bottleneck, this
change replaces the locked version of that call with the lockless
version, specifically, __generic_file_aio_write.

In order to maintain POSIX consistency, Lustre must now employ its
own locking mechanism in the higher layers. Currently writes are
protected using the lli_write_mutex in the ll_inode_info structure.
To protect against simultaneous write and truncate operations, since
we no longer take the inode mutex during writes, we must down the
lli_trunc_sem semaphore.

Unfortunately, this change by itself does not garner any performance
benefits. Using FIO on a single machine with 32 GB of RAM, write
performance tests were run with and without this change applied; the
results are below:

    +---------+-----------+---------+--------+--------+
    |     fio v2.0.13     |   Write Bandwidth (KB/s)  |
    +---------+-----------+---------+--------+--------+
    | # Tasks | GB / Task | Test 1  | Test 2 | Test 3 |
    +---------+-----------+---------+--------+--------+
    |    1    |    64     |  452446 | 454623 | 457653 |
    |    2    |    32     |  850318 | 565373 | 602498 |
    |    4    |    16     | 1058900 | 463546 | 529107 |
    |    8    |     8     | 1026300 | 468190 | 576451 |
    |   16    |     4     | 1065500 | 503160 | 462902 |
    |   32    |     2     | 1068600 | 462228 | 466963 |
    |   64    |     1     |  991830 | 556618 | 557863 |
    +---------+-----------+---------+--------+--------+

 * Test 1: Lustre client running 04ec54f. File per process write
           workload. This test was used as a baseline for what we
           _could_ achieve in the single shared file tests if the
           bottlenecks were removed.

 * Test 2: Lustre client running 04ec54f. Single shared file
           workload, each task writing to a unique region.

 * Test 3: Lustre client running 04ec54f + this patch. Single shared
           file workload, each task writing to a unique region.

In order to garner any real performance benefits out of a single
shared file workload, the lli_write_mutex needs to be broken up into a
range lock. That would allow write operations to unique regions of a
file to be executed concurrently. This work is left to be done in a
follow up patch.

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Change-Id: I0023132b5d941b3304f39f015f95106542998072
Reviewed-on: http://review.whamcloud.com/6672
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoLU-1669 llite: Replace write mutex with range lock 20/6320/20
Prakash Surya [Wed, 19 Jun 2013 17:30:36 +0000 (10:30 -0700)]
LU-1669 llite: Replace write mutex with range lock

Testing has shown the ll_inode_info's lli_write_mutex to be a
limiting factor with single shared file write performance, when using
many writing threads on a single machine. Even if each thread is
writing to a unique portion of the file, the lli_write_mutex will
allow no more than a single thread to write to the file at any one
time.

This change attempts to remove this bottleneck by replacing this
mutex with a range lock. This should allow multiple threads to write
to a single file simultaneously iff the threads are writing to unique
regions of the file.
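
The behaviour described above can be sketched with a toy user-space
range lock. This is only an illustration of the idea: the class and
method names are invented, it keeps held ranges in a plain list, and it
makes no attempt to mirror the data structure or fairness policy of the
kernel code in this patch.

```python
import threading


class RangeLock:
    """Toy range lock: overlapping [start, end] ranges exclude each other."""

    def __init__(self):
        self._cond = threading.Condition()
        self._held = []  # inclusive (start, end) ranges currently locked

    def _overlaps(self, start, end):
        return any(s <= end and start <= e for s, e in self._held)

    def try_lock(self, start, end):
        """Take [start, end] without blocking; return False on overlap."""
        with self._cond:
            if self._overlaps(start, end):
                return False
            self._held.append((start, end))
            return True

    def lock(self, start, end):
        """Block until [start, end] overlaps no held range, then take it."""
        with self._cond:
            while self._overlaps(start, end):
                self._cond.wait()
            self._held.append((start, end))

    def unlock(self, start, end):
        """Release a range previously taken with lock() or try_lock()."""
        with self._cond:
            self._held.remove((start, end))
            self._cond.notify_all()
```

Two writers covering bytes 0-4095 and 4096-8191 can both hold the lock
at once, while a third writer covering 4000-5000 has to wait until both
of those ranges are released.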

Performance testing shows this change to garner a significant
performance boost to write bandwidth. Using FIO on a single machine
with 32 GB of RAM, write performance tests were run with and without
this change applied; the results are below:

    +---------+-----------+---------+--------+--------+--------+
    |     fio v2.0.13     |        Write Bandwidth (KB/s)      |
    +---------+-----------+---------+--------+--------+--------+
    | # Tasks | GB / Task | Test 1  | Test 2 | Test 3 | Test 4 |
    +---------+-----------+---------+--------+--------+--------+
    |    1    |    64     |  452446 | 454623 | 457653 | 463737 |
    |    2    |    32     |  850318 | 565373 | 602498 | 733027 |
    |    4    |    16     | 1058900 | 463546 | 529107 | 976284 |
    |    8    |     8     | 1026300 | 468190 | 576451 | 963404 |
    |   16    |     4     | 1065500 | 503160 | 462902 | 830065 |
    |   32    |     2     | 1068600 | 462228 | 466963 | 749733 |
    |   64    |     1     |  991830 | 556618 | 557863 | 710912 |
    +---------+-----------+---------+--------+--------+--------+

 * Test 1: Lustre client running 04ec54f. File per process write
           workload. This test was used as a baseline for what we
           _could_ achieve in the single shared file tests if the
           bottlenecks were removed.

 * Test 2: Lustre client running 04ec54f. Single shared file
           workload, each task writing to a unique region.

 * Test 3: Lustre client running 04ec54f + I0023132b. Single shared
           file workload, each task writing to a unique region.

 * Test 4: Lustre client running 04ec54f + this patch.
           Single shared file workload, each task writing to a unique
           region.
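
One way to read the table is as the ratio of Test 4 to Test 2 for each
task count. The short script below does that arithmetic; the bandwidth
figures are copied from the table above, and the names ROWS and
speedups are made up for this sketch.

```python
# (tasks, Test 2 KB/s, Test 4 KB/s), copied from the table above.
ROWS = [
    (1, 454623, 463737),
    (2, 565373, 733027),
    (4, 463546, 976284),
    (8, 468190, 963404),
    (16, 503160, 830065),
    (32, 462228, 749733),
    (64, 556618, 710912),
]


def speedups(rows):
    """Map task count to the Test 4 / Test 2 bandwidth ratio."""
    return {tasks: after / before for tasks, before, after in rows}


if __name__ == "__main__":
    for tasks, ratio in speedups(ROWS).items():
        print(f"{tasks:>2} tasks: {ratio:.2f}x")
```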

Signed-off-by: Prakash Surya <surya1@llnl.gov>
Change-Id: I71e060c190065d87a20dc8df3104f898883d0583
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: http://review.whamcloud.com/6320
Tested-by: Jenkins
Reviewed-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
5 years agoRevert "LU-5261 osc: use wait_for_completion_killable() instead" 92/11892/2
Oleg Drokin [Fri, 12 Sep 2014 16:17:31 +0000 (16:17 +0000)]
Revert "LU-5261 osc: use wait_for_completion_killable() instead"

This is causing LU-5446

This reverts commit 2b3663dda896f669c87feb49e7f3c7d85a89cefe.

Change-Id: I8bd254137ad0d402bad5f5aac85aa52cd3d47f63
Reviewed-on: http://review.whamcloud.com/11892
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>