fs/lustre-release.git
4 years agoLU-6245 libcfs: create userland and kernel string operations 35/13835/8
James Simmons [Mon, 30 Mar 2015 19:03:21 +0000 (15:03 -0400)]
LU-6245 libcfs: create userland and kernel string operations

Additional string handling and NID string parsing code are both
used by kernel space and user land. This prevents us from
moving forward for cleaning up the userland and kernel space
headers. With the code duplicated for both environments we
can then clean up the headers independently. Since NID string
handling is only done for LNET we move it there.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I5fccdae61322d0bace7094a36d2e551d719c4982
Reviewed-on: http://review.whamcloud.com/13835
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5710 corrected some typos and grammar errors 01/12201/13
frank zago [Sun, 29 Mar 2015 15:35:19 +0000 (11:35 -0400)]
LU-5710 corrected some typos and grammar errors

Most of them are in comments, but there were a few in
user-visible help text.

Change-Id: Ib050e9042f9f2728cd3eeedeee679739b48419db
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12201
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6285: o2iblnd: Do not use cpus_weight, it's deprecated 54/13954/2
Oleg Drokin [Tue, 3 Mar 2015 19:44:10 +0000 (14:44 -0500)]
LU-6285: o2iblnd: Do not use cpus_weight, it's deprecated

Replace cpus_weight and for_each_cpu_mask with
cpumask_weight and for_each_cpu respectively as per latest kernel guidelines.

Change-Id: I038eb38234c0a209a68ca24c8a860d0e84522c27
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/13954
Tested-by: Jenkins
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-6405 kernel: kernel update [RHEL7.1 3.10.0-229.1.2.el7] 93/14293/2
Bob Glossman [Tue, 31 Mar 2015 18:53:48 +0000 (11:53 -0700)]
LU-6405 kernel: kernel update [RHEL7.1 3.10.0-229.1.2.el7]

Update RHEL7.1 kernel to 3.10.0-229.1.2.el7

Test-Parameters: clientdistro=el7 testgroup=review-ldiskfs \
  mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I020419971dda62b2997e33a624c6e6113d465afb
Reviewed-on: http://review.whamcloud.com/14293
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5478 lustre: get rid of obd_* typedefs 56/14256/3
Dmitry Eremin [Mon, 30 Mar 2015 10:21:46 +0000 (13:21 +0300)]
LU-5478 lustre: get rid of obd_* typedefs

We have a bunch of typedefs for common things that made no sense
and hid the actual type from plain view.
Replace them with proper uXX or sXX types.
Exception is in lustre_idl.h and lustre_ioctl.h where
they are replaced with __uXX and __sXX to be able to be included
in userspace

final patch in series.

Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I2b714ec673a004561d45ad46041191bef3ec9a8e
Reviewed-on: http://review.whamcloud.com/14256
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
4 years agoLU-6394 all: fix compilation errors with FORTIFY_SOURCE 26/14126/7
Frank Zago [Fri, 20 Mar 2015 20:37:13 +0000 (15:37 -0500)]
LU-6394 all: fix compilation errors with FORTIFY_SOURCE

When Lustre is configured with CFLAGS="-D_FORTIFY_SOURCE=2 -O2" on
CentOS 6, the compilation fails with errors such as this one:

 cacheio.c: In function ‘qword_printhex’:
 cacheio.c:174: error: ignoring return value of ‘fwrite’, declared
   with attribute warn_unused_result

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ie06fd5a26b62daf62bfd0133a2d7ebc66ece5be6
Reviewed-on: http://review.whamcloud.com/14126
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
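
The kind of fix LU-6394 calls for can be sketched as follows. This is only an illustration, not the actual change to cacheio.c: write_all() is a hypothetical helper, and the point is simply that the result of fwrite() is compared against the requested length instead of being discarded, which is what warn_unused_result asks for.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the Lustre tree): write len bytes from
 * buf to fp and report a short write instead of ignoring the value
 * returned by fwrite().
 */
int write_all(FILE *fp, const void *buf, size_t len)
{
	/* With an element size of 1, fwrite() returns the byte count. */
	size_t written = fwrite(buf, 1, len, fp);

	if (written != len) {
		fprintf(stderr, "short write: %zu of %zu bytes\n",
			written, len);
		return -1;
	}
	return 0;
}
```

A caller can then propagate the -1 upward rather than continuing with a partially written record.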
4 years agoLU-5319 ptlrpc: Add a tag field to ptlrpc messages 95/14095/2
Gregoire Pichon [Tue, 17 Mar 2015 15:28:24 +0000 (16:28 +0100)]
LU-5319 ptlrpc: Add a tag field to ptlrpc messages

The new tag field is used as a virtual index for multiple modifying
RPCs management. It is set by the client and allows the target to
release in-memory reply data when the tag is reused by a new RPC.

The tag field replaces the unused last_seen field of the ptlrpc_body
structure.

Additionally, the last_xid field is used to transfer the highest XID
for which a reply has been received and does not have an unreplied
lower-numbered XID.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: I4a57f3710ffffe21d8d655af6ac222b65051a12d
Reviewed-on: http://review.whamcloud.com/14095
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
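
The last_xid rule described in LU-5319 (the highest XID with a reply and no unreplied lower-numbered XID) can be sketched like this. The struct rpc_state type and the function name are assumptions made for the illustration and do not come from the ptlrpc code; the input is assumed to be sorted by ascending XID.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-request bookkeeping, for illustration only. */
struct rpc_state {
	uint64_t xid;		/* transfer identifier of the request */
	int	 replied;	/* non-zero once a reply was received */
};

/*
 * Walk the requests in ascending XID order and stop at the first one
 * that is still unreplied. The last XID seen before that point is the
 * highest replied XID with no unreplied lower-numbered XID.
 * Returns 0 when the lowest XID itself has no reply yet.
 */
uint64_t highest_replied_xid(const struct rpc_state *reqs, size_t n)
{
	uint64_t last = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!reqs[i].replied)
			break;
		last = reqs[i].xid;
	}
	return last;
}
```

Note that a replied XID above an unreplied one does not count, which is why the walk stops at the first gap.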
4 years agoLU-6285 ptlrpc: Do not recalculate siblings of CPU 0 in a loop 05/13905/3
Oleg Drokin [Fri, 27 Feb 2015 07:59:05 +0000 (02:59 -0500)]
LU-6285 ptlrpc: Do not recalculate siblings of CPU 0 in a loop

ptlrpc_hr_init seems to be recalculating number of siblings of CPU0
in a loop which is wasteful. Just precalculate the value before the
loop and use it in every iteration instead.

Change-Id: I807fddf29a75af4d268829e37dc91b7512cfcc50
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/13905
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
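
The change described in LU-6285 for ptlrpc_hr_init is a loop-invariant hoist. A minimal userspace sketch follows, assuming a stand-in cpu0_sibling_count() in place of the real kernel lookup; all names here are hypothetical and the counter exists only to make the saving visible.

```c
#include <stddef.h>

/* Counts how many times the (stand-in) sibling lookup runs. */
unsigned int lookup_calls;

/* Stand-in for the sibling lookup of CPU 0; the value 2 is arbitrary. */
unsigned int cpu0_sibling_count(void)
{
	lookup_calls++;
	return 2;
}

/* Before: the same value is recomputed on every iteration. */
void threads_per_partition_naive(const unsigned int *cpus,
				 unsigned int *nthrs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		nthrs[i] = cpus[i] / cpu0_sibling_count();
}

/* After: compute the value once and reuse it inside the loop. */
void threads_per_partition_hoisted(const unsigned int *cpus,
				   unsigned int *nthrs, size_t n)
{
	unsigned int weight = cpu0_sibling_count();
	size_t i;

	for (i = 0; i < n; i++)
		nthrs[i] = cpus[i] / weight;
}
```

Both variants produce the same results; only the number of lookups differs.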
4 years agoLU-6285 libcfs: Do not unnecessarily copy cpumask 04/13904/3
Oleg Drokin [Fri, 27 Feb 2015 07:57:07 +0000 (02:57 -0500)]
LU-6285 libcfs: Do not unnecessarily copy cpumask

Copying a mask just to calculate a static number of siblings
is overkill, especially if CPUMASK_OFFSTACK is set and the mask is huge.
Avoiding the copy also works around the incorrect kernel code
that deals with such a case until that is actually fixed upstream.

Change-Id: I860cc7d5b54adcde2e7622bfbc64b4e27046b083
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/13904
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
4 years agoLU-5823 clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS 22/13422/8
John Hammond [Thu, 26 Mar 2015 05:10:59 +0000 (22:10 -0700)]
LU-5823 clio: use CIT_SETATTR for FSFILT_IOC_SETFLAGS

Add handling of inode flags to the handlers of CIT_SETATTR in lov and
osc. In the FSFILT_IOC_SETFLAGS case of ll_iocontrol() use
cl_setattr_ost() rather than obd_setattr_rqset() to set inode flags on
OST objects. Remove the then unused OBD API methods
obd_setattr_rqset() and obd_setattr_async() along with their
supporting functions.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I3ccdb139f2e9aa376fb69e353c0cc6d399bf0857
Reviewed-on: http://review.whamcloud.com/13422
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5823 clio: get rid of lov_stripe_md reference 39/12639/10
Bobi Jam [Thu, 6 Nov 2014 12:44:27 +0000 (20:44 +0800)]
LU-5823 clio: get rid of lov_stripe_md reference

Get rid of lov_stripe_md reference in setting file's stripe info.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I303bfc98113bf1f086053225959001377879637a
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-on: http://review.whamcloud.com/12639
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-632 utils: fix problems of llog_reader 64/10764/7
Li Xi [Fri, 20 Jun 2014 08:19:40 +0000 (16:19 +0800)]
LU-632 utils: fix problems of llog_reader

When the input file of llog_reader is invalid, it is easy to
crash or loop infinitely. This patch fixes these problems.

Signed-off-by: Li Xi <pkuelelixi@gmail.com>
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Change-Id: Ifd6bbc5e857f6910bb4103d85742ba33a843d080
Reviewed-on: http://review.whamcloud.com/10764
Tested-by: Jenkins
Reviewed-by: Artem Blagodarenko <artem.blagodarenko@seagate.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Olaf Faaland-LLNL <faaland1@llnl.gov>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
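
The two failure modes named in LU-632 (a crash and an infinite loop on invalid input) are typical of length-prefixed record walkers. The sketch below uses a made-up record layout, a host-endian 32-bit length that includes its own header, and is not the llog format or the llog_reader code; it only shows the two checks that prevent each failure.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: each record starts with its own total length. */
#define REC_HDR_SIZE sizeof(uint32_t)

/*
 * Count the records in buf, or return -1 if the input is invalid.
 * A length below the header size (including zero) would never advance
 * the offset and loop forever; a length larger than the remaining
 * input would read past the end of the buffer.
 */
long count_records(const unsigned char *buf, size_t size)
{
	size_t off = 0;
	long count = 0;

	while (off < size) {
		uint32_t len;

		if (size - off < REC_HDR_SIZE)
			return -1;	/* truncated header */
		memcpy(&len, buf + off, sizeof(len));
		if (len < REC_HDR_SIZE)
			return -1;	/* would not make progress */
		if (len > size - off)
			return -1;	/* record overruns the input */
		off += len;
		count++;
	}
	return count;
}
```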
4 years agoLU-6403 quota: fix soft lockup in qmt_adjust_qunit 87/14187/2
Li Dongyang [Wed, 25 Mar 2015 22:33:54 +0000 (09:33 +1100)]
LU-6403 quota: fix soft lockup in qmt_adjust_qunit

If the user sets the quota limits >= ULLONG_MAX using lfs setquota,
we set the limits to ULLONG_MAX and get stuck in an infinite loop
when trying to increase qunit.

Break the loop when that happens and set qunit to
limit / (2 * slv_cnt) instead.

Signed-off-by: Li Dongyang <dongyang.li@anu.edu.au>
Change-Id: I6fb842c62ad46d8765f6c4c41187cf0dcd543c53
Reviewed-on: http://review.whamcloud.com/14187
Tested-by: Jenkins
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
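
The soft lockup in LU-6403 can be illustrated with a simplified growth loop. This is a sketch under stated assumptions, not the code of qmt_adjust_qunit(): the function name, the growth factor of 4, and the loop condition are all hypothetical. What it shows is how an unsigned 64-bit qunit can wrap to zero near ULLONG_MAX, and how breaking out with limit / (2 * slv_cnt) avoids that.

```c
#include <stdint.h>

/*
 * Grow qunit until it reaches limit / (2 * slv_cnt).
 * Assumes slv_cnt > 0. Without the overflow guard, multiplying a
 * qunit above UINT64_MAX / 4 would wrap around (possibly to 0) and
 * the loop condition would stay true forever for huge limits.
 */
uint64_t grow_qunit(uint64_t qunit, uint64_t limit, uint64_t slv_cnt)
{
	uint64_t cap = limit / (2 * slv_cnt);

	while (qunit < cap) {
		if (qunit > UINT64_MAX / 4) {
			/* Next step would overflow: stop at the cap. */
			qunit = cap;
			break;
		}
		qunit *= 4;
	}
	return qunit;
}
```

With a limit of UINT64_MAX and a single slave, the guard fires once qunit reaches 2^62 and the function returns UINT64_MAX / 2 instead of spinning.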
4 years agoLU-6142 lnet: enforce Linux kernel coding style 30/14130/3
wang di [Sat, 21 Mar 2015 07:12:17 +0000 (00:12 -0700)]
LU-6142 lnet: enforce Linux kernel coding style

This patch enforces Linux kernel coding style in
lnet codes. The changes are:
- convert spaces to tabs for indentation
- align variable and struct declarations
- sprintf is deprecated, use snprintf instead
- labels should not be indented
- wrap lines at 80 columns

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I46685fec1703c39fc8e9e6ee7bb6e5cc152aac4f
Signed-off-by: Jian Yu <jian.yu@intel.com>
Reviewed-on: http://review.whamcloud.com/14130
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
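
The sprintf-to-snprintf item in the LU-6142 list can be sketched in plain C as below. The format_id() helper and the "addr@net" format are assumptions chosen for the example, not code from lnet; the useful part is that snprintf() bounds the write and its return value reveals truncation.

```c
#include <stdio.h>

/*
 * Hypothetical helper: format "<addr>@<net>" into buf.
 * Returns 0 on success, or -1 if the output did not fit (in which
 * case buf still holds a NUL-terminated, truncated string).
 */
int format_id(char *buf, size_t size, const char *addr, const char *net)
{
	int n = snprintf(buf, size, "%s@%s", addr, net);

	/* snprintf() returns the length it would have written. */
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

With sprintf() the same call would write past the end of a buffer that is too small.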
4 years agoLU-5971 llite: reorganize variable and data structures 14/13714/5
John Hammond [Thu, 26 Mar 2015 05:08:49 +0000 (22:08 -0700)]
LU-5971 llite: reorganize variable and data structures

Rename struct ccc_grouplock to ll_grouplock and move the definition
from vvp_internal.h to llite_internal.h.

struct vvp_thread_info is used in the non-VVP parts of llite so rename
it struct ll_thread_info. Rename supporting functions accordingly.

struct ccc_thread_info is used in the VVP parts of llite so rename
it struct vvp_thread_info. Rename supporting functions accordingly.

Remove ccc_global_{init,fini}(), merging their contents into
vvp_global_{init,fini}() and {init,exit}_lustre_lite(). Rename
ccc_inode_fini_* to cl_inode_fini_*.

Move several declarations between llite_internal.h and vvp_internal.h
with the goal of reserving the latter header for functions that
pertain to vvp_{device,object,page,...}.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I79b51e6f58dee9e9488c983b4a0759fa4117d2a6
Reviewed-on: http://review.whamcloud.com/13714
Tested-by: Jenkins
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-3105 osd: remove capa related stuff from servers 72/5572/47
Alex Zhuravlev [Wed, 10 Dec 2014 11:11:22 +0000 (14:11 +0300)]
LU-3105 osd: remove capa related stuff from servers

The capability feature is broken. It's not going to be used
anytime soon. The related tests are removed as well.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I865a92b57abaae679d7ff8319e0e3fda603beff9
Reviewed-on: http://review.whamcloud.com/5572
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6416 ldlm: no canceled lock on waiting list 85/14085/5
Liang Zhen [Mon, 16 Mar 2015 01:25:17 +0000 (09:25 +0800)]
LU-6416 ldlm: no canceled lock on waiting list

If a lock was not granted straight away on the server, but is granted
with LDLM_FL_AST_SENT set before ldlm_handle_enqueue0 sends out the
reply, the client side will know it needs to cancel this lock.

In the meantime, this lock can be added to a long granting list
by another server thread.

When lock cancel request arrives at server and server calls into
  ldlm_lock_cancel()->
      ldlm_cancel_callback()->
          tgt_blocking_ast(...LDLM_CB_CANCELING)->
              tgt_sync()

The other server thread eventually gets a chance to send the completion
AST for this lock with LDLM_FL_AST_SENT set, and adds this lock to the
waiting list again.

However, tgt_sync() may take an arbitrary amount of time, which is
unrelated to the adaptive timeout (AT) of lock revocation on the client,
so the server could evict the client only because the server itself has
slow IO.

To resolve this race, this patch no longer puts a canceled lock on the
waiting list.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I86c1097d3ccbaa614b8811c1d9f37b39f019c61e
Reviewed-on: http://review.whamcloud.com/14085
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6396 kernel: kernel update [SLES11 SP3 3.0.101-0.47.52] 64/14164/5
Bob Glossman [Tue, 24 Mar 2015 19:44:57 +0000 (12:44 -0700)]
LU-6396 kernel: kernel update [SLES11 SP3 3.0.101-0.47.52]

Update target and config files for new version

Test-Parameters: envdefinitions=SANITY_EXCEPT=170\
  mdsdistro=sles11sp3 ossdistro=sles11sp3 \
  clientdistro=sles11sp3 mdsfilesystemtype=ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
  testgroup=review-ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I4c8d24e45b0e3fbbc31a188ee416860f54371930
Reviewed-on: http://review.whamcloud.com/14164
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-2675 lnet: remove unnecessary goto 98/13698/6
Isaac Huang [Sun, 29 Mar 2015 15:45:23 +0000 (11:45 -0400)]
LU-2675 lnet: remove unnecessary goto

The 'out' label and the goto's in brw_client_done_rpc()
are no longer needed, since the user space code has been
removed.

Change-Id: I0f478de2778ec25eea9f7adbd08f4caf0a29668e
Signed-off-by: Isaac Huang <he.huang@intel.com>
Reviewed-on: http://review.whamcloud.com/13698
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5319 ptlrpc: Add OBD_CONNECT_MULTIMODRPCS flag 60/13960/6
Gregoire Pichon [Wed, 3 Sep 2014 08:53:00 +0000 (10:53 +0200)]
LU-5319 ptlrpc: Add OBD_CONNECT_MULTIMODRPCS flag

The new OBD_CONNECT_MULTIMODRPCS connection flag indicates the support
of multiple modify RPCs in parallel. It can be specified by the client
within the connection request and by the server within the connection
reply.
The new ocd_maxmodrpcs connection data specifies the maximum modify
RPCs in parallel supported by the server.

To allow the MDS to send the new ocd_maxmodrpcs field, it has been
required to modify RMF_CONNECT_DATA so that its size includes the new
field. This change leads to remove the ocd_connect_data_v1 structure.
Note that the client has been allocating an extra 16*sizeof(__u64) for
the obd_connect_data reply since 2.0 (commit fd908da9, and even in
later versions of 1.8) so there is no problem for the MDS to just send
the full reply size.

This patch fixes a bug in __req_capsule_get() since it wasn't checking
RMF_F_NO_SIZE_CHECK when receiving the message. This allows legacy
clients (with version lower than this commit) to send connection
request with ocd_connect_data structure size smaller (actually size is
ocd_connect_data_v1 structure size) than new server ocd_connect_data
structure size.

This patch also fixes a bug in the routine that displays the import's
connect data.

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: I4b8a567241f8986d967240efff94c7f407fdd864
Reviewed-on: http://review.whamcloud.com/13960
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
4 years agoLU-5478 osc: get rid of obd_* typedefs 48/13148/16
Dmitry Eremin [Wed, 11 Feb 2015 18:58:11 +0000 (21:58 +0300)]
LU-5478 osc: get rid of obd_* typedefs

We have a bunch of typedefs for common things that made no sense
and hid the actual type from plain view.
Replace them with proper uXX or sXX types.
Exception is in lustre_idl.h and lustre_ioctl.h where
they are replaced with __uXX and __sXX to be able to be included
in userspace

Also fix valid flags to u64.

patch 6 in series: modify osc/osp

Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ib18583489ed79eaded6339a283bc48e4fda319f6
Reviewed-on: http://review.whamcloud.com/13148
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years agoLU-4647 tests: modify sanity-sec to test multiple MDSes 43/13343/10
Kit Westneat [Mon, 12 Jan 2015 06:51:03 +0000 (01:51 -0500)]
LU-4647 tests: modify sanity-sec to test multiple MDSes

This patch modifies sanity-sec to distribute the nodemap to all the
MDSes in the system. It also uses test_mkdir to create directories on
multiple MDTs, allowing DNE and nodemapping to be tested.

Test-Parameters: alwaysuploadlogs envdefinitions=SLOW=yes \
mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs \
mdtcount=4 testlist=sanity-sec

Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I48f47b72e404614167c715a682d99351295d8189
Reviewed-on: http://review.whamcloud.com/13343
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6325 libcfs: shortcut to create CPT from NUMA topology 49/14049/5
Liang Zhen [Thu, 12 Mar 2015 04:16:10 +0000 (12:16 +0800)]
LU-6325 libcfs: shortcut to create CPT from NUMA topology

If a user wants to create a CPT table that matches the NUMA topology,
she has to query the CPU and NUMA topology and then provide a pattern
string to describe it, which is inconvenient.

To improve this, this patch supports the shortcut expression "N" or "n"
to create a CPT table from the NUMA and CPU topology.

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I608f47ad6856ded5bf2f5f223b77b02906ebc8cc
Reviewed-on: http://review.whamcloud.com/14049
Tested-by: Jenkins
Reviewed-by: Olaf Weber <olaf@sgi.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6376 osp: add RPC lock during RPC send 98/14098/2
wang di [Wed, 11 Mar 2015 04:33:51 +0000 (21:33 -0700)]
LU-6376 osp: add RPC lock during RPC send

Take the RPC lock in osp_remote_sync(), so that only one modifying
UPDATE RPC is allowed at a time, as with a normal MDC client.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I6c766625adc355d5c8827bffb78a1820efc46fd6
Reviewed-on: http://review.whamcloud.com/14098
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6373 llite: default dir stripe index only for mkdir 96/14096/2
wang di [Thu, 12 Mar 2015 14:13:42 +0000 (07:13 -0700)]
LU-6373 llite: default dir stripe index only for mkdir

The default dir stripe index should only apply during mkdir,
otherwise it will cause other open/create requests to be
sent to the wrong MDT.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Id95d7218196d52950eceea38be5612ffe4a6b080
Reviewed-on: http://review.whamcloud.com/14096
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6247 tests: fix nodemap quota test to use correct blocksize 44/13844/4
Kit Westneat [Mon, 23 Feb 2015 16:23:17 +0000 (11:23 -0500)]
LU-6247 tests: fix nodemap quota test to use correct blocksize

The nodemap quota test needs to take into account indirect blocks and
other overhead when deciding if the quota used is excessive. It's
supposed to add the fs blocksize * 2, so 8k on ldiskfs and 256k on
zfs, but it was using the osc block size instead of the osd
blocksize. This patch modifies the test to use the fs_log_size helper
to determine an appropriate fuzz size.

Test-Parameters: mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs testlist=sanity-sec
Signed-off-by: Kit Westneat <kit.westneat@gmail.com>
Change-Id: I3bf6380fd0a0e6b3246343da8a139c6b4ea120ae
Reviewed-on: http://review.whamcloud.com/13844
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5319 mdt: pass __u64 for storing opdata 97/13297/2
Andreas Dilger [Thu, 8 Jan 2015 19:47:52 +0000 (12:47 -0700)]
LU-5319 mdt: pass __u64 for storing opdata

While lcd_last_data is only __u32, the internal disposition handling
is done with __u64, so it doesn't make sense to drop the high bits in
mdt_get_disposition(), mdt_set_disposition(), mdt_clear_disposition().

Minor whitespace and prototype cleanups.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ia4880b7256564a6799f8c686fd8690ebbf3ebbe5
Reviewed-on: http://review.whamcloud.com/13297
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Grégoire Pichon <gregoire.pichon@bull.net>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
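
Why a 32-bit parameter drops the high bits, as LU-5319 notes for the disposition helpers, can be shown with a small sketch. The flag value and both function names below are hypothetical and are not the mdt_*_disposition() signatures; the narrow variant makes the truncation explicit with a cast.

```c
#include <stdint.h>

/* Hypothetical disposition flag that lives above bit 31. */
#define DISP_HIGH_BIT (1ULL << 40)

/* Narrow variant: the flag passes through 32 bits and loses bit 40. */
uint64_t set_flag_narrow(uint64_t opdata, uint64_t flag)
{
	uint32_t narrowed = (uint32_t)flag;	/* high bits dropped here */

	return opdata | narrowed;
}

/* Wide variant: the flag stays 64-bit from caller to storage. */
uint64_t set_flag_wide(uint64_t opdata, uint64_t flag)
{
	return opdata | flag;
}
```

Flags below bit 32 behave the same in both variants, which is why such truncation can go unnoticed until a high flag is introduced.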
4 years agoLU-6081 lfs: split setstripe and migrate help 17/13317/8
Frank Zago [Fri, 9 Jan 2015 16:58:39 +0000 (10:58 -0600)]
LU-6081 lfs: split setstripe and migrate help

The setstripe and migrate share all but one of their parameters
(--block). However the help is common to both, so the --block also
appears for the setstripe command but is not valid. So split the help
message for each command, while still sharing the same text to avoid
duplication.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: If8d110b04b26090d0540bb1628de75a94d92727e
Reviewed-on: http://review.whamcloud.com/13317
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-6387 build: add support for power8 15/14115/4
James Simmons [Mon, 23 Mar 2015 15:33:43 +0000 (11:33 -0400)]
LU-6387 build: add support for power8

Expand lustre to natively support the little endian Power8
platform. The architecture is reported as powerpc64le so
add that support to lustre-build-linux.m4. For the Ubuntu
packaging the platform is reported as ppc64el so include
that to the debian build configuration.

Change-Id: I506fd7228579fe1bd6e5c9e9d39db6ca06e4768d
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/14115
Tested-by: Maloo <hpdd-maloo@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-4688 mdt: remove export_put() from mdt_export_evict() 06/13706/6
Mikhail Pershin [Tue, 10 Feb 2015 05:49:24 +0000 (08:49 +0300)]
LU-4688 mdt: remove export_put() from mdt_export_evict()

Dropping this export reference is not needed here: the export is
referenced at the ptlrpc level and will be dropped there as
well. It looks like the call to class_export_put()
was added by analogy with other places where
class_fail_export() is called, but in those cases the export
reference is taken in the same function, while here
it is taken by the external caller and will be dropped there.

Test-Parameters: envdefinitions=SLOW=yes alwaysuploadlogs testlist=replay-dual
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I560271668c10715c7f6caa02bfbf2fccab3eeade
Reviewed-on: http://review.whamcloud.com/13706
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoNew tag 2.7.51 2.7.51 v2_7_51 v2_7_51_0
Oleg Drokin [Thu, 26 Mar 2015 00:15:30 +0000 (20:15 -0400)]
New tag 2.7.51

Change-Id: If0b670b630c0df80474f5a77e159da1752db39db

4 years agoLU-6020 kerberos: readdir bulk replies are not wrapped 20/14020/5
Andrew Perepechko [Mon, 9 Mar 2015 18:20:37 +0000 (21:20 +0300)]
LU-6020 kerberos: readdir bulk replies are not wrapped

target_bulk_io() does not wrap readdir replies causing readdir errors:
gss_cli_ctx_unwrap_bulk() bulk security descriptor mismatch: (0,0,2) != (0,0,0)
ll_get_dir_page() read cache page: [0x200000007:0x1:0x0] at 0: rc -71
ll_dir_read() error reading dir [0x200000007:0x1:0x0] at 0: rc -71

Change-Id: Ifa11e1f6bbc6ae8a3a7c4f296f055fd38dabd6aa
Xyratex-bug-id: SNT-15
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-on: http://review.whamcloud.com/14020
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6020 kerberos: bulk nob is not corrected on bulk writes 18/14018/3
Andrew Perepechko [Mon, 9 Mar 2015 17:42:57 +0000 (20:42 +0300)]
LU-6020 kerberos: bulk nob is not corrected on bulk writes

The real transferred block in the end of a file can
have a size different from the original block in
privacy mode. The proper fix should probably just
mimic the behaviour of the client.

Change-Id: I594c116c78b7746f4881e0de8b7cc63b37268381
Xyratex-bug-id: SNT-15
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-on: http://review.whamcloud.com/14018
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6081 lib: don't ignore the return of read() 54/13654/2
Frank Zago [Wed, 4 Feb 2015 21:31:31 +0000 (15:31 -0600)]
LU-6081 lib: don't ignore the return of read()

The return value of read() must be checked on some platforms, possibly
with a recent version of glibc. Otherwise this gives an "ignoring return
value of 'read'" warning, which breaks the build. It happens on Ubuntu
14.04.

Mix the return value of read() into the random entropy to fix the issue.
This doesn't do much for the entropy, but it doesn't hurt either.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Iee7c1bce818e2f163db8f860acf8be075f5a543e
Reviewed-on: http://review.whamcloud.com/13654
Reviewed-by: Patrick Farrell <paf@cray.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
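
The approach described in LU-6081 for lib, consuming the result of read() rather than discarding it, might look roughly like this. read_seed() is a hypothetical helper and the XOR mixing is an assumption for the sketch, not the code from the patch.

```c
#include <unistd.h>

/*
 * Hypothetical helper: read an unsigned int from fd and fold both the
 * data and the byte count returned by read() into *seed.
 * Returns 0 on success and -1 if read() fails.
 */
int read_seed(int fd, unsigned int *seed)
{
	unsigned int value = 0;
	ssize_t rc = read(fd, &value, sizeof(value));

	if (rc < 0)
		return -1;
	/* Using rc keeps warn_unused_result quiet and harms nothing. */
	*seed ^= value ^ (unsigned int)rc;
	return 0;
}
```

A short read simply leaves some bytes of value at zero, which is acceptable for seeding purposes in this sketch.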
4 years agoLU-6159 hsm: add CL_CLOSE to default changelog mask 26/13526/8
Frank Zago [Mon, 26 Jan 2015 17:53:59 +0000 (11:53 -0600)]
LU-6159 hsm: add CL_CLOSE to default changelog mask

There's no point in ignoring CL_CLOSE by default in changelogs.
Robinhood needs these events else the database quickly
becomes out of sync. So let's have it by default.

Note that CL_CLOSE is not issued when the file was opened in read-only
mode.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ie5f42bc4413259e5079801a204e15125cde0c48b
Reviewed-on: http://review.whamcloud.com/13526
Tested-by: Jenkins
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5823 clio: remove IOC_LOV_GETINFO 48/12748/7
Bobi Jam [Fri, 7 Nov 2014 09:50:15 +0000 (17:50 +0800)]
LU-5823 clio: remove IOC_LOV_GETINFO

* In cb_find_init() (lfs find) use some variant of stat() to get file
  size and times.
* Remove the then unused IOC_LOV_GETINFO ioctl.
* Remove ll_glimpse_ioctl() and ll_lsm_getattr().
* Remove the OBD API method obd_getattr_async().

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I8fb6b69b1c94f0522a1405f1105ab0b7a2041601
Reviewed-on: http://review.whamcloud.com/12748
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5823 clio: add coo_obd_info_get and coo_data_version 38/12638/14
Bobi Jam [Tue, 4 Nov 2014 13:41:56 +0000 (21:41 +0800)]
LU-5823 clio: add coo_obd_info_get and coo_data_version

* Add coo_obd_info_get to retrieve object attributes from servers.
* Add coo_data_version to retrieve object's data_version.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ia66a3ba1eee5c6478f3aa5c9f942ada77b2b5fe9
Reviewed-on: http://review.whamcloud.com/12638
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6047 mdt: remove Size on MDS support 42/13442/3
John L. Hammond [Fri, 16 Jan 2015 19:33:46 +0000 (13:33 -0600)]
LU-6047 mdt: remove Size on MDS support

Remove size on MDS support from lustre/mdt/. In struct mdt_object
change the struct mutex mot_ioepoch_mutex member to spinlock_t
mot_write_lock and rename mot_writecount to mot_write_count.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I271117618f7b88a22ddbcca4db5a4723ab48e3ea
Reviewed-on: http://review.whamcloud.com/13442
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6313 tests: more robust for scrub test_11 57/13957/3
Fan Yong [Tue, 3 Mar 2015 21:13:28 +0000 (05:13 +0800)]
LU-6313 tests: more robust for scrub test_11

For sanity-scrub test_11, besides the objects known to be created by
the test scripts, other objects (such as those for llog) may have
been created before the first OI scrub scanning. So it is not easy
to estimate how many objects should be skipped during the first
OI scrub scanning. So we only check that the number of skipped
files is more than the number of known created files.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iced9fb255559394117880514c5e716d05a81a177
Reviewed-on: http://review.whamcloud.com/13957
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6335 kernel: kernel upgrade [RHEL7.1 3.10.0-229.el7] 90/14090/4
Bob Glossman [Thu, 12 Mar 2015 17:05:26 +0000 (10:05 -0700)]
LU-6335 kernel: kernel upgrade [RHEL7.1 3.10.0-229.el7]

upgrade from RHEL7.0 to RHEL7.1 3.10.0-229.el7 kernel

Test-Parameters: clientdistro=el7 testgroup=review-ldiskfs \
  mdtfilesystemtype=ldiskfs mdsfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I6b733eb4571b57339889e927c4658c02e7ac7f34
Reviewed-on: http://review.whamcloud.com/14090
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6221 utils: hsm_root is also required for --dry-run 73/13673/2
Bruno Faccini [Fri, 6 Feb 2015 12:13:02 +0000 (13:13 +0100)]
LU-6221 utils: hsm_root is also required for --dry-run

Not specifying hsm_root in copytool/lhsmtool_posix command line for
--dry-run mode can lead to failure/error.
This patch ensures that hsm_root will be required even for --dry-run.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Icaa2af6d1365751d9e77b2be3f60aacc9c1f6a5c
Reviewed-on: http://review.whamcloud.com/13673
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6378 kernel: simplify quota-avoid-dqget-call.patch 35/14135/3
Niu Yawei [Mon, 23 Mar 2015 05:23:32 +0000 (01:23 -0400)]
LU-6378 kernel: simplify quota-avoid-dqget-call.patch

Backport the patch from the upstream kernel, which doesn't rely
on I_NEW to skip dqget()/dqput() calls and checks
the i_dquot directly instead.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: I10e2e8284704bc7cf9ffae4ee88f06fafef14b1a
Reviewed-on: http://review.whamcloud.com/14135
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
4 years agoLU-6356 ptlrpc: ret -ECONNREFUSED if not context found in req 43/14043/4
Sebastien Buisson [Wed, 11 Mar 2015 10:31:08 +0000 (11:31 +0100)]
LU-6356 ptlrpc: ret -ECONNREFUSED if not context found in req

Return -ECONNREFUSED instead of -ENOMEM in sptlrpc_req_get_ctx()
if no context is found in req.
Is it more graceful?

Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net>
Change-Id: If1b142199a94d1976093a7d26a05e49a63f50469
Reviewed-on: http://review.whamcloud.com/14043
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6245 libcfs: cleanup libcfs lock handling 93/13793/4
James Simmons [Mon, 9 Mar 2015 20:53:41 +0000 (16:53 -0400)]
LU-6245 libcfs: cleanup libcfs lock handling

Previously, with libcfs being built for both user land and kernel
space, wrappers were created to transparently handle locking.
Now that user land support has been removed we delete all
those locking wrappers with this patch.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Icbd9b5c0918cb01202439416b220b6f327144a91
Reviewed-on: http://review.whamcloud.com/13793
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-6245 lnet: remove kernel defines in userland headers 92/13792/6
James Simmons [Thu, 19 Mar 2015 15:57:34 +0000 (11:57 -0400)]
LU-6245 lnet: remove kernel defines in userland headers

Currently the lnet headers used for user land applications
contain various kernel definitions. This is due to the
fact that libcfs contains kernel wrappers for user land, which
will be going away. This patch sorts the header data
so all structures containing kernel definitions are moved out of
headers that user land will use.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I3904cd692bf2debd3123cbf8ca98dfc518ce0a97
Reviewed-on: http://review.whamcloud.com/13792
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-6395 mgc: one byte shorter for logname allocation 46/14146/2
wang di [Sun, 22 Mar 2015 15:15:19 +0000 (08:15 -0700)]
LU-6395 mgc: one byte shorter for logname allocation

The logname allocation in mgc_llog_local_copy() is one byte too short,
which might cause a buffer overflow in the following sprintf().

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ie758c3650c1cf7848874d9fd3a02a5618043eb8f
Reviewed-on: http://review.whamcloud.com/14146
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years agoLU-5953 build: use installed OFED by default 86/12686/19
Bruno Faccini [Wed, 12 Nov 2014 14:23:06 +0000 (15:23 +0100)]
LU-5953 build: use installed OFED by default

During the LNET autoconf phase, if OFED is installed and its devel
headers are available, default to using it instead of the in-kernel
IB driver.

Also handle the error case where OFED is installed but its devel
package is not, which prevents building against OFED. Had to add the
new "patches" vs "kernel_patches" dir name used in recent OFED
versions, and to keep its check from colliding with the in-kernel IB
build case.

The current OFED headers detection mechanism allows for a non-standard
prefix but relies on the "ofed_info" command and on the "%prefix/openib"
link (both are OK for 1.5.x and 3.x versions), and should work
for both source and DKMS Lustre builds.

Test-Parameters: nettypes=o2ib
Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I82639f8392d5fe707a3b1a1719d53ab937e918b5
Reviewed-on: http://review.whamcloud.com/12686
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6218 osd-zfs: increase redundancy for meta data 41/13741/4
Isaac Huang [Thu, 12 Feb 2015 01:45:13 +0000 (18:45 -0700)]
LU-6218 osd-zfs: increase redundancy for meta data

Use DMU_OTN_UINT8_METADATA for local objects so their
data blocks would get an additional ditto copy. This
increases redundancy and hence the chance of recovery
by zpool scrub in the event of corruption.

Change-Id: I502da680521027733ea53744905c47f569a1b531
Signed-off-by: Isaac Huang <he.huang@intel.com>
Test-Parameters: mdtfilesystemtype=zfs mdsfilesystemtype=zfs ostfilesystemtype=zfs
Reviewed-on: http://review.whamcloud.com/13741
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5757 hsm: strengthen checks for flags and archive id 37/13337/9
Bruno Faccini [Sat, 10 Jan 2015 11:33:49 +0000 (12:33 +0100)]
LU-5757 hsm: strengthen checks for flags and archive id

Prior to this patch, undefined flag bits and an out-of-range
archive id could be set.
Also changed the related error handling that was
recently added (LU-5732) as part of sanity-hsm/test_500.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I64403de4529f0214bab55c2fc13281b0a3d30a11
Reviewed-on: http://review.whamcloud.com/13337
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6245 libcfs: move lucache from libcfs to lustre 83/13783/8
James Simmons [Mon, 9 Mar 2015 20:06:38 +0000 (16:06 -0400)]
LU-6245 libcfs: move lucache from libcfs to lustre

The lucache handling in libcfs is used only for
idmap handling in the obdclass and mdt layers.
Since this is the case we can move the lucache
handling into the lustre stack. As a bonus the
lucache will only be built when we enable server
support instead of the current state of it being
built for clients as well.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I812f2c5952ea79bd023435e5fac1955316c9c59f
Reviewed-on: http://review.whamcloud.com/13783
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6245 libcfs: remove tcpip abstraction from libcfs 60/13760/19
James Simmons [Fri, 13 Mar 2015 17:11:19 +0000 (13:11 -0400)]
LU-6245 libcfs: remove tcpip abstraction from libcfs

Since libcfs no longer builds for user land we can
move the existing tcpip abstraction to the LNET
layer, which is the only place that uses it. Also
the migrated code will use native Linux kernel
APIs directly instead of wrappers.

Change-Id: Iaa39e4f581f18cfe586feb5bfbf4233a2f2335c7
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13760
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5823 clio: add cl_object_fiemap() 35/12535/19
Bobi Jam [Mon, 3 Nov 2014 10:52:29 +0000 (18:52 +0800)]
LU-5823 clio: add cl_object_fiemap()

* Add cl_object_operations::coo_fiemap().
* Add cl_object_fiemap() to get FIEMAP mappings.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ie32eb5ddb8d2daa1a66055f347cef4757d039e75
Reviewed-on: http://review.whamcloud.com/12535
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6319 tests: Clean up sanityn ALWAYS_EXCEPT list 53/13953/2
James Nunez [Tue, 3 Mar 2015 17:39:45 +0000 (10:39 -0700)]
LU-6319 tests: Clean up sanityn ALWAYS_EXCEPT list

At some point between Lustre 1.8 and 2.1, sanityn test 22 was removed.
Test number 22 is still included in the ALWAYS_EXCEPT list and in the
EXCEPT list; the list of tests that will not be run under normal
(autotest) testing. Remove test 22 from the ALWAYS_EXCEPT and EXCEPT
lists.

Also, tests 11 and 14 are skipped for SUSE10. All Lustre branches from
b2_4 to current master are not built for and are no longer tested on
SLES10. Remove the check for SUSE10 and remove tests 11 and 14 from
the ALWAYS_EXCEPT list.
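
As a rough illustration of how an exclusion list of this kind is usually consulted, here is a minimal sketch. The list contents and the helper is_excepted are hypothetical and are not taken from the sanityn script.

```shell
# Illustrative values only; the real list lives in the test script.
ALWAYS_EXCEPT="11 14 22"

# Hypothetical helper: succeed if test number $1 is in ALWAYS_EXCEPT.
is_excepted() {
	case " $ALWAYS_EXCEPT " in
	*" $1 "*) return 0 ;;
	esac
	return 1
}

for t in 11 12 22; do
	if is_excepted "$t"; then
		echo "test $t: skipped"
	else
		echo "test $t: run"
	fi
done
```

An entry for a test that no longer exists does nothing at run time, which is why such entries tend to linger until someone cleans the list up.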

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I2222770874c6c2da816cfd54b371ddc9c0da370b
Reviewed-on: http://review.whamcloud.com/13953
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5929 tests: conf-sanity test 72 call tune2fs on MDSs 61/12761/3
James Nunez [Tue, 18 Nov 2014 01:32:29 +0000 (18:32 -0700)]
LU-5929 tests: conf-sanity test 72 call tune2fs on MDSs

The call to tune2fs tuning MDTs with "-O extents" is now
run on the MDS node(s) and not the client.

Test-Parameters: alwaysuploadlogs testlist=conf-sanity

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ie4e1bb2d9447d86c9b0144e8f2564b5c2444842d
Reviewed-on: http://review.whamcloud.com/12761
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5504 utils: add const qualifier to changelog accessors. 87/13787/3
Frank Zago [Tue, 17 Feb 2015 21:11:40 +0000 (15:11 -0600)]
LU-5504 utils: add const qualifier to changelog accessors.

Commit 6e1365 changed 4 functions, and commit 0f22e4 accidentally
reverted them.

This patch puts them back, as well as adding new ones to
changelog_rec_size(), changelog_rec_varsize(), changelog_rec_rename()
and changelog_rec_jobid().

6e1365 also changed changelog_rec_name() and changelog_rec_sname() to
return a const, but that is not possible anymore.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I19e389c422795c2ece4d7af369099cb733d3cb1a
Reviewed-on: http://review.whamcloud.com/13787
Tested-by: Jenkins
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years agoLU-6322 lfsck: show start/complete time directly 48/13948/4
Fan Yong [Tue, 3 Mar 2015 21:00:32 +0000 (05:00 +0800)]
LU-6322 lfsck: show start/complete time directly

It is easier for users to see when the LFSCK was started
and/or completed if the related times are shown
directly.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ibdacccf1abba6041eaddd6bb5456fb122e9ca994
Reviewed-on: http://review.whamcloud.com/13948
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6317 lfsck: NOT count the objects repeatedly 33/13933/4
Fan Yong [Sun, 7 Dec 2014 04:41:18 +0000 (12:41 +0800)]
LU-6317 lfsck: NOT count the objects repeatedly

The namespace LFSCK uses object-table based iteration plus namespace
based directory traversing to scan the system. So one object will be
returned twice by them. Counting the objects repeatedly will confuse
the users. So the namespace LFSCK should only count the objects that
are scanned via namespace based directory traversing.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I36459743843e1db1e9372d46d3aafddef033d699
Reviewed-on: http://review.whamcloud.com/13933
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6316 lfsck: skip dot name entry 23/13923/2
Fan Yong [Thu, 4 Dec 2014 15:19:57 +0000 (23:19 +0800)]
LU-6316 lfsck: skip dot name entry

It is unnecessary for the namespace LFSCK to verify the dot
entry since it is always on the local MDT and has no linkEA.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I01289b04c8807e930c6f777007f1e1fb3295431d
Reviewed-on: http://review.whamcloud.com/13923
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6235 osd-ldiskfs: NOT skip LMAC_NOT_IN_OI in osd_check_lma 22/13922/2
Fan Yong [Thu, 4 Dec 2014 14:21:13 +0000 (22:21 +0800)]
LU-6235 osd-ldiskfs: NOT skip LMAC_NOT_IN_OI in osd_check_lma

Sometimes, the ost-object may reference a wrong inode because of
an invalid OI mapping. Usually, the OSD can detect that automatically
via osd_check_lma(). For old systems, if the inode's LMA contains the
flag LMAC_NOT_IN_OI, it would skip the related checking. But such
behavior is wrong: it may cause the osd-object to reference some
important system inode, and cause a system crash via subsequent
modification.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ib5cdf2ee4d9893a87fde3caf81109eabdad9ecfa
Reviewed-on: http://review.whamcloud.com/13922
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5682 lfsck: optimize ldlm lock used by LFSCK 66/12766/11
Fan Yong [Tue, 25 Nov 2014 14:24:59 +0000 (22:24 +0800)]
LU-5682 lfsck: optimize ldlm lock used by LFSCK

When LFSCK repairs some inconsistency, it first needs to take the
related ldlm lock(s) to prevent concurrent modifications or to purge
the client-side cache. Originally, to simplify the implementation, the
LFSCK simply acquired LCK_EX mode ibits lock(s) on the related
object(s). But such a coarse-grained lock policy may not be efficient
for some directory-based modifications, such as inserting a name entry
into the directory.

This patch introduces the LFSCK PDO (Parallel Directory Operations)
lock for directory-based LFSCK modification. It only locks the part of
the directory with the given <object, name> pairs, which allows others
to access or modify different part(s) of the directory in parallel,
and also avoids purging the client-side cache unnecessarily.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I29bad81112c14e3aaecaa2b808e60ea74c10a702
Reviewed-on: http://review.whamcloud.com/12766
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6129 lnet: DLC design doc 19/13419/6
Amir Shehata [Thu, 15 Jan 2015 18:30:24 +0000 (10:30 -0800)]
LU-6129 lnet: DLC design doc

Add a DLC design document to lustre/doc/

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I3ec283f960aba7b56afadcc1d2a7770604efb023
Reviewed-on: http://review.whamcloud.com/13419
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-6384 mdt: propagate find errors in mdt_fid2path() 08/14108/2
John L. Hammond [Thu, 19 Mar 2015 14:41:19 +0000 (09:41 -0500)]
LU-6384 mdt: propagate find errors in mdt_fid2path()

In mdt_fid2path() propagate the specific error from mdt_object_find()
rather than returning -EINVAL.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: Ib09f3741f95c0f3484f9a7839e31c583ecc34761
Reviewed-on: http://review.whamcloud.com/14108
Tested-by: Jenkins
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-6134 utils: lfs should only open/stat files if needed 22/13822/3
Andreas Dilger [Fri, 20 Feb 2015 12:25:43 +0000 (05:25 -0700)]
LU-6134 utils: lfs should only open/stat files if needed

Since (commit 322968acf183) "lfs find" would needlessly open() and
fstat() every file if the --ost, -uid/user, -gid/group, -[amc]time,
or -size options were used, to get the MDT index for each file.
This was causing "lfs find --ost" to fail if an OST was offline, and
added needless overhead that "lfs find" was meant to avoid.

The MDT index is only needed if --mdt is used, so only get it in
that case.  It also wasn't necessary to call fstat() in this case
either because the file type was already known at this point.

Some other minor cleanups related to fetching the MDT index:
- don't use ret in cb_get_dirstripe() as it isn't needed
- fix cb_get_mdt_index() to avoid a Coverity false positive due to
  initializing rc and having a conditional branch that is always taken
- convert spaces to tabs for related code, other minor style fixes

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib41ced742fe5068f504f540479e6b4718d2540e5
Reviewed-on: http://review.whamcloud.com/13822
Reviewed-by: wangdi <di.wang@intel.com>
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6345 test: compare /bin/sleep in sanity-hsm.sh test_30c 25/14025/3
Emoly Liu [Wed, 4 Mar 2015 22:07:21 +0000 (06:07 +0800)]
LU-6345 test: compare /bin/sleep in sanity-hsm.sh test_30c

In case /bin/sleep is modified during the test, we do a checksum
at the beginning and the end of the test respectively, and won't
mark the test a failure if the checksum has changed.
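
The before/after checksum idea can be sketched as follows. This is only an assumed shape, not the code in sanity-hsm.sh; the helpers file_checksum and unchanged_during are hypothetical names, and POSIX cksum stands in for whatever checksum tool the test really uses.

```shell
# Hypothetical helper: print a checksum (CRC and size) of file $1.
file_checksum() {
	cksum < "$1"
}

# Hypothetical helper: run a command and report whether the watched
# file stayed the same while it ran.
# $1 is the file to watch, the remaining arguments are the command.
# Returns 0 if the checksum is unchanged, non-zero otherwise.
unchanged_during() {
	watched=$1
	shift
	before=$(file_checksum "$watched")
	"$@"
	after=$(file_checksum "$watched")
	[ "$before" = "$after" ]
}
```

A caller could wrap the test body, for example `unchanged_during /bin/sleep run_the_test`, where run_the_test is a placeholder, and treat a changed checksum as "environment changed" rather than as a test failure.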

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I1ff472ea6052e2df9ba9fd4c78a4cf53686e1ccd
Reviewed-on: http://review.whamcloud.com/14025
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6014 mdt: remove unused function 26/12326/7
Alex Zhuravlev [Fri, 17 Oct 2014 12:13:25 +0000 (16:13 +0400)]
LU-6014 mdt: remove unused function

mdt_trans_stop() is not used.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I34da179288bef9a928173aa79e53cc74082abed7
Reviewed-on: http://review.whamcloud.com/12326
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
4 years agoLU-6283 ptlrpc: re-add NRS policy registration symbol exports 03/14003/2
Nikitas Angelinas [Fri, 6 Mar 2015 21:38:34 +0000 (13:38 -0800)]
LU-6283 ptlrpc: re-add NRS policy registration symbol exports

Export the ptlrpc_nrs_policy_(register|unregister)() functions, in
order to allow modules other than ptlrpc to load NRS policies on
demand.

These symbols were unexported as part of a subsystem-wide
symbol-unexporting effort for PTLRPC by commit 3ee0e09.

Signed-off-by: Nikitas Angelinas <nikitas.angelinas@seagate.com>
Xyratex-bug-id: MRP-2489
Change-Id: Ic294a94202fb644f997f11b931f7f9bc36d221ba
Reviewed-on: http://review.whamcloud.com/14003
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: frank zago <fzago@cray.com>
Reviewed-by: Chris Horn <hornc@cray.com>
4 years agoLU-6219 utils: remove O_NONBLOCK usage for archive file 72/13672/2
Bruno Faccini [Fri, 6 Feb 2015 11:03:32 +0000 (12:03 +0100)]
LU-6219 utils: remove O_NONBLOCK usage for archive file

In the first implementations of the POSIX copytool, file archive/restore
used a loop around select() with handling/retry on EAGAIN.
This was later found to be useless for regular files and was changed
in the patch for LU-3971 (http://review.whamcloud.com/7583, commit
397ebc93cef378e6d77450cdd095e2737b94f2f6).
But the O_NONBLOCK flag usage during open() of the file in the archive
has been kept since then and must be removed, even if ineffective, to
improve code clarity.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Ie6350e6f3951545f50783c9fff6753793b7a9a33
Reviewed-on: http://review.whamcloud.com/13672
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Robert Read <robert.read@intel.com>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6205 tests: fix bash expansion of fid 18/13618/5
Frank Zago [Tue, 3 Feb 2015 20:27:54 +0000 (14:27 -0600)]
LU-6205 tests: fix bash expansion of fid

When calling lfs path2fid, the FID is returned between brackets. When
that fid variable is used unquoted, it may be expanded by the shell to
something else. For instance:

  $ touch x
  $ ../utils/lfs fid2path lustre [0x200000be7:0xb:0x0]
  bad FID format [x], should be [0x1:0x2:0x0]

  fid2path: error on FID x: Invalid argument

This will cause some tests, like 154A or 238, to sometimes fail.

Use quotes where the FIDs are used.

Replace "$(lfs ..." with "$($LFS ..." and made a couple variables
local.
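
The expansion problem can be reproduced without Lustre. The sketch below uses only echo in a scratch directory; the function name demo_fid_expansion is made up for the illustration.

```shell
# Show how the shell treats a bracketed FID with and without quotes.
# Runs in a scratch directory that contains a single file named "x".
demo_fid_expansion() {
	workdir=$(mktemp -d) || return 1
	(
		cd "$workdir" || exit 1
		touch x
		fid='[0x200000be7:0xb:0x0]'
		# Unquoted: the brackets form a glob character class that
		# includes "x", so the word matches the file "x" and is
		# replaced by it.
		echo "unquoted:" $fid
		# Quoted: the FID is passed through unchanged.
		echo "quoted: $fid"
	)
	rm -rf "$workdir"
}

demo_fid_expansion
```

The failure is intermittent in practice because it only happens when the current directory contains a single-character file name that appears inside the brackets.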

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I3c1a34585ebaa596d66063f5ada3ccfc4d202ade
Reviewed-on: http://review.whamcloud.com/13618
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
4 years agoLU-5829 ldlm: remove unnecessary EXPORT_SYMBOL 24/13324/2
Frank Zago [Fri, 9 Jan 2015 18:25:58 +0000 (12:25 -0600)]
LU-5829 ldlm: remove unnecessary EXPORT_SYMBOL

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ic182e844d621e6ba3c22e685c72b3702ccbb793b
Reviewed-on: http://review.whamcloud.com/13324
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5829 lnet: remove unnecessary EXPORT_SYMBOL 20/13320/5
Frank Zago [Fri, 9 Jan 2015 18:24:18 +0000 (12:24 -0600)]
LU-5829 lnet: remove unnecessary EXPORT_SYMBOL

A lot of symbols don't need to be exported at all because they are
only used in the module they belong to.

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: I8374d2da55d839e361be5721425d7270425f2286
Reviewed-on: http://review.whamcloud.com/13320
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6078 utils: fix copytool file bounds checking 26/13226/3
Bruno Faccini [Fri, 2 Jan 2015 12:37:37 +0000 (13:37 +0100)]
LU-6078 utils: fix copytool file bounds checking

Strengthen copytool file bounds checking in either full
or partial/extent mode.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I36939f9a18362ca3131e39b8d390978dfc79405a
Reviewed-on: http://review.whamcloud.com/13226
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Hongchao Zhang <hongchao.zhang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-4820 osd: drop memcpy in zfs osd 91/12991/12
Alex Zhuravlev [Mon, 24 Mar 2014 15:30:19 +0000 (19:30 +0400)]
LU-4820 osd: drop memcpy in zfs osd

dmu_read() was called from osd_read_prep(), copying from
ARC bufs into the same ARC bufs. This seems to be a remnant
of the pre-zerocopy age.

Change-Id: I87c10a2d484b7fe0be370349a2bfeb857ddd74e9
Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-on: http://review.whamcloud.com/12991
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-4239 tests: test FID related APIs 45/12545/11
Frank Zago [Wed, 15 Oct 2014 19:19:44 +0000 (14:19 -0500)]
LU-4239 tests: test FID related APIs

This adds a few stress tests to the user lustre API, related to FIDs.

Change-Id: I34144a8f4c446e55c6630d31cae6a133d61eb304
Signed-off-by: Frank Zago <fzago@cray.com>
Test-Parameters: alwaysuploadlogs envdefinitions=ONLY=154g testlist=sanity
Reviewed-on: http://review.whamcloud.com/12545
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5657 doc: allow the use of rst2man to build man pages 40/12040/6
frank zago [Thu, 25 Sep 2014 00:39:59 +0000 (19:39 -0500)]
LU-5657 doc: allow the use of rst2man to build man pages

The man page sources can now be written in reStructuredText (rst), and
the man pages will automatically be generated with rst2man.

Added a build dependency on rst2man and the package python-docutils.

Converted lustreapi.7 to ReST to validate the solution.

Change-Id: I69e9892a238a002eb86769ed65b758cba55543bb
Signed-off-by: frank zago <fzago@cray.com>
Reviewed-on: http://review.whamcloud.com/12040
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6340 lnet: LNet startup script fix 00/14000/5
Amir Shehata [Fri, 6 Mar 2015 20:17:49 +0000 (12:17 -0800)]
LU-6340 lnet: LNet startup script fix

When starting up LNet via the startup script, check if the default
LNet yaml configuration file is present, if it is, then make sure
to use "lnetctl lnet configure" to bring up LNet instead of
"lctl network up".  The latter configures networks and routes
defined in the modparams, while the former does not, since the
configuration defined in the YAML file will be used.
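
The decision described above can be sketched as a small function. This is an assumed shape, not the startup script itself: lnet_start_command is a hypothetical name, the configuration path is passed in by the caller because the default path is not given here, and the function only prints the command instead of running it.

```shell
# Hypothetical helper: print the command that would bring up LNet.
# $1 is the path to the YAML configuration file (supplied by the caller).
lnet_start_command() {
	if [ -f "$1" ]; then
		# A YAML configuration exists, so skip the modparams-based
		# networks and routes.
		echo "lnetctl lnet configure"
	else
		# No YAML configuration: configure from the modparams.
		echo "lctl network up"
	fi
}
```

Printing the command keeps the sketch safe to run on a machine without Lustre installed; a real script would execute it.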

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I1a05cba2a9a6b7a2179b541f1ea5db6d2e89b243
Reviewed-on: http://review.whamcloud.com/14000
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6107 tests: Skip sanityn test_82 if server version is older than 2.6.92 10/13510/3
Wei Liu [Fri, 23 Jan 2015 06:11:31 +0000 (22:11 -0800)]
LU-6107 tests: Skip sanityn test_82 if server version is older than 2.6.92

Skip sanityn test_82 if server version is older than 2.6.92
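
A version gate like this can be sketched as below. The helpers version_to_int and needs_skip are hypothetical names for the illustration and are not the functions the test framework uses; the sketch assumes plain numeric major.minor.patch strings without leading zeros.

```shell
# Hypothetical helper: pack "major.minor.patch" into one integer so
# that versions can be compared numerically.
version_to_int() {
	IFS=. read -r major minor patch <<EOF
$1
EOF
	echo $(( (${major:-0} << 16) | (${minor:-0} << 8) | ${patch:-0} ))
}

# Hypothetical helper: succeed if server version $1 is older than 2.6.92.
needs_skip() {
	[ "$(version_to_int "$1")" -lt "$(version_to_int 2.6.92)" ]
}

for v in 2.5.3 2.6.91 2.6.92 2.7.0; do
	if needs_skip "$v"; then
		echo "server $v: skip test_82"
	else
		echo "server $v: run test_82"
	fi
done
```

Comparing packed integers avoids the usual trap of comparing version strings lexically, where "2.10.0" would sort before "2.6.92".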

Change-Id: I2361a333ad1edfb546f18d1a1bb34c9d9173c2e5
Signed-off-by: Wei Liu <wei3.liu@intel.com>
Reviewed-on: http://review.whamcloud.com/13510
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-1593 tests: Remove sanity 34h from ALWAYS_EXCEPT 18/13918/4
James Nunez [Fri, 27 Feb 2015 23:27:30 +0000 (16:27 -0700)]
LU-1593 tests: Remove sanity 34h from ALWAYS_EXCEPT

Sanity test 34h is skipped for all ZFS testing. Since the
issue in LU-1593 is resolved, test 34h needs to be removed
from the ALWAYS_EXCEPT list.

Test-Parameters: alwaysuploadlogs

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: I922a179f76fba7643f9bd7251509433848f384ec
Reviewed-on: http://review.whamcloud.com/13918
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6357 kernel: kernel update RHEL6.6 [2.6.32-504.12.2.el6] 58/14058/2
Bob Glossman [Wed, 11 Mar 2015 18:02:40 +0000 (11:02 -0700)]
LU-6357 kernel: kernel update RHEL6.6 [2.6.32-504.12.2.el6]

Update RHEL6.6 kernel to 2.6.32-504.12.2.el6

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I75af90922aac0e3e06aa7952ad87aec8b57bc1d2
Reviewed-on: http://review.whamcloud.com/14058
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5022 ldiskfs: enable support for RHEL7 49/10249/39
Yang Sheng [Mon, 9 Mar 2015 15:05:52 +0000 (23:05 +0800)]
LU-5022 ldiskfs: enable support for RHEL7

This patch adds support for RHEL7.1 [3.10.0-229.el7] kernel.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ifbc294a53bd21eb35d373637d3326fc3c611c9f0
Reviewed-on: http://review.whamcloud.com/10249
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-3259 clio: Revise read ahead implementation 59/10859/17
Jinshan Xiong [Fri, 16 Jan 2015 19:23:38 +0000 (11:23 -0800)]
LU-3259 clio: Revise read ahead implementation

In this implementation, read ahead holds the underlying DLM lock
while adding read ahead pages. A new cl_io operation, cio_read_ahead(),
is added for this purpose. It takes a cl_read_ahead{} parameter that
each layer can adjust according to its own requirements. For example,
the OSC layer makes sure the read ahead region is covered by an
LDLM lock; the LOV layer makes sure that the region does not
cross a stripe boundary.

Legacy callback cpo_is_under_lock() is removed.
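
A minimal sketch of the per-layer adjustment described above, assuming
the read ahead window is an inclusive page-index range. The struct and
function names are hypothetical stand-ins; they are not the actual
cl_read_ahead{} layout or the real cio_read_ahead() methods.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cl_read_ahead{}: an inclusive page window. */
struct ra_window {
	uint64_t start;	/* first page index to read ahead */
	uint64_t end;	/* last page index to read ahead (inclusive) */
};

/* OSC-like step: never read ahead past the last page the lock covers. */
void ra_clamp_to_lock(struct ra_window *ra, uint64_t lock_end)
{
	if (ra->end > lock_end)
		ra->end = lock_end;
}

/* LOV-like step: stay inside the stripe that contains ra->start. */
void ra_clamp_to_stripe(struct ra_window *ra, uint64_t stripe_pages)
{
	uint64_t stripe_end;

	assert(stripe_pages > 0);
	stripe_end = (ra->start / stripe_pages + 1) * stripe_pages - 1;
	if (ra->end > stripe_end)
		ra->end = stripe_end;
}
```

Each layer only ever shrinks the window, so the order in which the
layers run does not change the final result.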

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Ic388e3a3f744ea5a8352cc8529e32a71073bddb3
Reviewed-on: http://review.whamcloud.com/10859
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Bobi Jam <bobijam@hotmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6324 osd: allow larger osd_thread_info for debug 55/13955/2
John L. Hammond [Tue, 3 Mar 2015 20:18:13 +0000 (14:18 -0600)]
LU-6324 osd: allow larger osd_thread_info for debug

In osd_mod_init() skip the CLASSERT() on the size of struct
osd_thread_info if CONFIG_DEBUG_MUTEXES or CONFIG_DEBUG_SPINLOCK is
defined.
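
The guard can be sketched as below, under the assumption that a
compile-time size check is simply compiled out when either debug
option is defined. The struct, the size budget and the
EX_SIZE_CHECK_ENABLED macro are hypothetical; the real code uses
CLASSERT() on struct osd_thread_info inside osd_mod_init().

```c
#include <assert.h>

/* Hypothetical per-thread scratch structure and size budget. */
struct ex_thread_info {
	char scratch[1024];
};

#define EX_THREAD_INFO_MAX 4096

#if defined(CONFIG_DEBUG_MUTEXES) || defined(CONFIG_DEBUG_SPINLOCK)
/* Debug build: the size check is skipped entirely. */
#define EX_SIZE_CHECK_ENABLED 0
#else
/* Normal build: fail the compile if the structure outgrows its budget. */
#define EX_SIZE_CHECK_ENABLED 1
static_assert(sizeof(struct ex_thread_info) <= EX_THREAD_INFO_MAX,
	      "ex_thread_info exceeds its size budget");
#endif
```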

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1320403a345886cefaf538dbf80d7c49fa226183
Reviewed-on: http://review.whamcloud.com/13955
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5155 scripts: added lustre/scripts/zfsobj2fid 21/13721/5
Christopher J. Morrone [Wed, 11 Feb 2015 04:13:45 +0000 (21:13 -0700)]
LU-5155 scripts: added lustre/scripts/zfsobj2fid

The zfsobj2fid script converts ZFS object xattr FID to
standard Lustre FID format so it can be used with Lustre
tools like "lfs fid2path".
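
As an illustration of the target format only: the sketch below prints
a FID in the bracketed [0xseq:0xoid:0xver] form that Lustre tools
accept. It assumes the raw value is 16 little-endian bytes (64-bit
sequence, 32-bit object id, 32-bit version). That byte layout and the
function names are assumptions for this example, not a description of
how the zfsobj2fid script itself works.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Decode n little-endian bytes into an integer. */
uint64_t ex_le_bytes(const unsigned char *p, int n)
{
	uint64_t v = 0;

	for (int i = n - 1; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Format an assumed 16-byte raw FID as "[0xseq:0xoid:0xver]". */
int ex_fid_to_string(const unsigned char raw[16], char *buf, size_t len)
{
	uint64_t seq = ex_le_bytes(raw, 8);
	uint32_t oid = (uint32_t)ex_le_bytes(raw + 8, 4);
	uint32_t ver = (uint32_t)ex_le_bytes(raw + 12, 4);

	assert(buf != NULL);
	return snprintf(buf, len, "[0x%" PRIx64 ":0x%" PRIx32 ":0x%" PRIx32 "]",
			seq, oid, ver);
}
```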

Change-Id: Id87ff0533a5431a292bca24a76815642f4318083
Signed-off-by: Isaac Huang <he.huang@intel.com>
Reviewed-on: http://review.whamcloud.com/13721
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5848 lfsck: debug log for sanity-lfsck test_18e 50/13950/3
Fan Yong [Tue, 3 Mar 2015 21:09:52 +0000 (05:09 +0800)]
LU-5848 lfsck: debug log for sanity-lfsck test_18e

More debug information for sanity-lfsck test_18e.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I18682ef13c0a12063e3cb595b2e16961451bbe89
Reviewed-on: http://review.whamcloud.com/13950
Tested-by: Jenkins
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6030 osd-ldiskfs: improve mount option handling 72/13572/13
Yang Sheng [Thu, 12 Feb 2015 18:24:54 +0000 (02:24 +0800)]
LU-6030 osd-ldiskfs: improve mount option handling

--handle force-over-128tb option to osd layer
--handle bigendian-check option to osd layer
--strip out extents option & remove extents-mount-options patch
--strip out iopen & mballoc mount options
--back LDISKFS_SUPER_MAGIC to EXT4_SUPER_MAGIC

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: Ic9bf431d0826d6279fc76f7fd1d7e356e421f292
Reviewed-on: http://review.whamcloud.com/13572
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
4 years agoLU-6047 obd: remove client Size on MDS support 69/13169/3
John L. Hammond [Mon, 22 Dec 2014 18:40:21 +0000 (12:40 -0600)]
LU-6047 obd: remove client Size on MDS support

Remove the unused OBD MD API method md_done_writing(). Remove the
unused logcookie and struct md_open_data ** parameters from
md_setattr(). Remove the unused functions iattr_from_obdo(),
md_from_obdo(), and obdo_refresh_inode().

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I59bf2b101807f5b582eb7ab27e5a742284800979
Reviewed-on: http://review.whamcloud.com/13169
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6047 llite: remove client Size on MDS support 26/13126/10
John L. Hammond [Mon, 15 Dec 2014 18:47:21 +0000 (12:47 -0600)]
LU-6047 llite: remove client Size on MDS support

Size on MDS support has been in preview since at least 2.0.0. Remove
support for it from lustre/llite/.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I0b31d893453ef57e54cc9052d4fb6a669a11e28f
Reviewed-on: http://review.whamcloud.com/13126
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6040 lnet: remove messages from lazy portal on NI shutdown 36/13836/4
Amir Shehata [Sat, 21 Feb 2015 00:05:31 +0000 (16:05 -0800)]
LU-6040 lnet: remove messages from lazy portal on NI shutdown

When shutting down an NI in a busy system, some messages received
on this NI might still be queued on the lazy portal. Each of them
holds a ref count on the NI, so the NI will not be removed until
those messages are processed.

To avoid this scenario, when an NI is shut down, go through
all messages queued on the lazy portal and drop the messages for the
NI being shut down.
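
A simplified sketch of the idea, assuming the queued messages sit in
a plain array and each one holds a single reference on its NI. All
types and names here are hypothetical; the real lazy portal queue and
LNet reference counting are more involved.

```c
#include <assert.h>
#include <stddef.h>

struct ex_ni {
	int id;
	int refcount;		/* one reference per queued message */
};

struct ex_msg {
	struct ex_ni *ni;	/* NI the message was received on */
};

/*
 * Drop every queued message that belongs to ni, releasing the
 * reference it held. Messages for other NIs keep their order.
 * Returns the number of messages dropped.
 */
size_t ex_drop_msgs_for_ni(struct ex_msg *queue, size_t *count,
			   struct ex_ni *ni)
{
	size_t kept = 0, dropped = 0;

	assert(queue != NULL && count != NULL && ni != NULL);
	for (size_t i = 0; i < *count; i++) {
		if (queue[i].ni == ni) {
			ni->refcount--;
			dropped++;
		} else {
			queue[kept++] = queue[i];
		}
	}
	*count = kept;
	return dropped;
}
```

Once the references held by the dropped messages are gone, nothing on
the queue keeps the NI alive any longer.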

Signed-off-by: Amir Shehata <amir.shehata@intel.com>
Change-Id: I67c8b720a6eb62fded4f084c1acea69dcdc8d2b6
Reviewed-on: http://review.whamcloud.com/13836
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Isaac Huang <he.huang@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6001 build: cleanup build scripts after reorganization 87/12987/10
Dmitry Eremin [Mon, 8 Dec 2014 15:25:45 +0000 (18:25 +0300)]
LU-6001 build: cleanup build scripts after reorganization

After passing a few configuration parameters as "--with/--without"
options to rpmbuild, some code became useless.

Don't pass options through configure_args that can be passed through
rpmbuild options. This avoids unexpected behavior during
the build from a source rpm.

Change the module-dist-hook: target according to the coding guidelines.

Remove obsolete liblustre.{a,so} from .spec file that were actually
removed in commit cdfbc722f4d63d3ed3740cbb549062f712010d90.

Don't add the version of kernel to .src.rpm.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ib5f50d257b5d95efe9c45d1865f9dab9ccc3c19a
Reviewed-on: http://review.whamcloud.com/12987
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5231 hsm: display file size in decimal not hex 78/13678/3
Frank Zago [Fri, 6 Feb 2015 19:55:07 +0000 (13:55 -0600)]
LU-5231 hsm: display file size in decimal not hex

'lfs hsm_action' displays the file sizes in hex:
  somebigfile: ARCHIVE running (0xf1c00000 bytes moved)

This is not user friendly. Use decimal instead.

Remove the last occurrences of LPX64 in lfs.
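
For illustration, the change amounts to printing the 64-bit byte
count with a decimal conversion instead of a hex one. The helper name
below is hypothetical. With the value from the example above,
0xf1c00000 prints as 4055891968.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a byte count in decimal rather than hex. */
int ex_format_bytes_moved(char *buf, size_t len, uint64_t bytes)
{
	assert(buf != NULL);
	return snprintf(buf, len, "%" PRIu64 " bytes moved", bytes);
}
```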

Signed-off-by: frank zago <fzago@cray.com>
Change-Id: Ib964c162b275bc836104cec3500a2f03c73dffeb
Reviewed-on: http://review.whamcloud.com/13678
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
4 years agoLU-6209 lnet: Delete all obsolete LND drivers 63/13663/4
James Simmons [Tue, 10 Feb 2015 02:28:45 +0000 (21:28 -0500)]
LU-6209 lnet: Delete all obsolete LND drivers

Remove ralnd, mxlnd, qswlnd drivers. They are no
longer supported and have not even been buildable
for a long time.

Change-Id: I9c88b446028e79122b5847448fdd23fb6cb5c530
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/13663
Tested-by: Jenkins
Reviewed-by: Isaac Huang <he.huang@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6020 kerberos: proper sg list initialization 31/13631/3
Andrew Perepechko [Wed, 4 Feb 2015 13:50:10 +0000 (16:50 +0300)]
LU-6020 kerberos: proper sg list initialization

This patch adds sg_init_table() calls in order
to have proper sg list initialization, including
magics, table sizes, etc.

Without it, when using kernels with the CONFIG_DEBUG_SG
option, the following crash can happen:

kernel BUG at include/linux/scatterlist.h:65!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/system/cpu/online
CPU 0

Pid: 4911, comm: ptlrpcd_3 Not tainted 2.6.32-431 #7                  /D525MWV
RIP: 0010:[<ffffffffa0b60170>]  [<ffffffffa0b60170>] krb5_make_checksum+0x750/0x770 [ptlrpc_gss]

Change-Id: Ic6c52c8b15393d8d7f67f4bf675c1f57cf27004a
Signed-off-by: Andrew Perepechko <andrew.perepechko@seagate.com>
Reviewed-on: http://review.whamcloud.com/13631
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
4 years agoLU-6030 ldiskfs: clean up ext4-fiemap patch 71/13571/11
Yang Sheng [Thu, 29 Jan 2015 10:13:55 +0000 (18:13 +0800)]
LU-6030 ldiskfs: clean up ext4-fiemap patch

Move the ext4-fiemap patch to osd-ldiskfs, so that we can
remove this patch entirely.

Signed-off-by: Yang Sheng <yang.sheng@intel.com>
Change-Id: I639733f6f106398bbc3d5e2ffc6fa8a06ffe867f
Reviewed-on: http://review.whamcloud.com/13571
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6215 osc: use list_for_each_entry_safe() when delete items 56/13956/3
Andreas Dilger [Tue, 3 Mar 2015 20:22:51 +0000 (13:22 -0700)]
LU-6215 osc: use list_for_each_entry_safe() when delete items

Since we will remove items off the list using list_del_init() we need
to use a safe version of the list_for_each_entry() macro aptly named
list_for_each_entry_safe().
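
A userland sketch of why the safe variant is needed, using a
hand-rolled circular list instead of the kernel's list.h. All names
are hypothetical. After ex_list_del_init() a removed node points at
itself, so a loop that advances through pos->next after the removal
would never reach the head again. Saving the next node before the
loop body runs avoids that.

```c
#include <assert.h>
#include <stddef.h>

struct ex_node {
	struct ex_node *prev, *next;
	int value;
};

void ex_list_init(struct ex_node *head)
{
	head->prev = head->next = head;
}

void ex_list_add_tail(struct ex_node *n, struct ex_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and make it point at itself, like list_del_init(). */
void ex_list_del_init(struct ex_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
}

/* Remove all even-valued nodes; returns how many were removed. */
int ex_remove_even(struct ex_node *head)
{
	struct ex_node *pos, *tmp;
	int removed = 0;

	assert(head != NULL);
	/* tmp holds the next node before pos may be unlinked. */
	for (pos = head->next, tmp = pos->next; pos != head;
	     pos = tmp, tmp = pos->next) {
		if (pos->value % 2 == 0) {
			ex_list_del_init(pos);
			removed++;
		}
	}
	return removed;
}
```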

Linux-commit: f13ab92effb94c8fc5eade75f6f246facd7ef5be

Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I6ec6d8073da6e0aa45e9d8a6ee7cde84ed9cab07
Reviewed-on: http://review.whamcloud.com/13956
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6261 gnilnd: Cray interconnect rollup 12/13812/4
Chuck Fossen [Thu, 19 Feb 2015 21:21:42 +0000 (15:21 -0600)]
LU-6261 gnilnd: Cray interconnect rollup

I am leaving a few lines in structure definitions that are
longer than 80 columns. It's not the time to reformat the
whole structure.
-------------------------------------------------------------
Subject: Update debug messages for rca and quiesce events
Description:
Change informational message when receiving down event for
better tracking of RCA event issues to display under console
logging.
Clarify the message printed when we receive connection request
from a down node.
Simplify quiesce messages to just report the start and end of
quiesce.
-------------------------------------------------------------
Subject: Limit fma block allocations.
Description:
Under network pressure whereby thousands of nodes need to
reconnect all at the same time, routers can run out of memory
allocating fma blocks for mailboxes since the previous ones
cannot be cleaned up until a new connection is established.
Limit the amount of fma blocks that can be allocated to three
quarters of total memory. This leaves memory free for other
allocations, which tend to be much smaller than the mailboxes.
This should only be needed on service nodes.
Clean up some whitespace in kgn_data_t.
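
The fma block limit can be pictured with the check below. This is a
sketch under the assumption that the limit is enforced by comparing
byte counts; the function and parameter names are hypothetical, and
the actual accounting in gnilnd may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Allow a new fma block allocation only if the running total stays
 * within three quarters of total memory. Written to avoid overflow
 * when adding allocated and request.
 */
bool ex_fma_alloc_allowed(uint64_t allocated, uint64_t request,
			  uint64_t total_mem)
{
	uint64_t limit = total_mem / 4 * 3;

	return request <= limit && allocated <= limit - request;
}
```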
-------------------------------------------------------------
Subject: Double deregistration error.
Description:
lustre:18920 introduced a bug which causes us to deregister
the same memory twice when the transfer is unaligned.
Clean up the tx_buffer_copy after a deregistration so that
kgnilnd_rdma can properly register the memory on the retry.
-------------------------------------------------------------
Subject: Stack reset is causing pings to timeout instead of
failing immediately.
Description:
It is possible to register with the same MDD after a stack
reset causing pings to timeout instead of failing right away.
During a stack reset, we need to deregister with a hold
timeout set so we don't use the same mdd after the stack reset
is complete.
This was found by gnilnd regression test 110c.
-------------------------------------------------------------
Subject: Post rdma resource error
Description:
Handle kgni_post_rdma resource error by unmapping the tx and
putting it back on the TX_MAPQ.
Also fixed:
 - The fast_reconn variable check was using the pointer instead of
   its value.
 - A bug that causes a stall when calling
   kgnilnd_wakeup_rca_thread() when a regression test causes
   startup failures and the rca thread has not started yet.
 - Only call sock_release if the socket was created.
 - Changed some stats prints to print unsigned values so they
   don't show as negative.
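
The pointer-versus-value mistake mentioned for fast_reconn looks
roughly like this. The sketch assumes the tunable is reached through
a pointer to an int; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Buggy: true whenever the pointer is set, even if the value is 0. */
bool ex_fast_reconnect_buggy(const int *tunable)
{
	return tunable != NULL;
}

/* Fixed: look at the value the pointer refers to. */
bool ex_fast_reconnect_fixed(const int *tunable)
{
	return tunable != NULL && *tunable != 0;
}
```

With the buggy form the feature appears enabled even when the tunable
has been set to 0.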
-------------------------------------------------------------
Subject: limit kgnilnd conns in purgatory
Description:
Currently kgnilnd allows for an infinite number of connections
in purgatory, which in the face of a missed rca event can
cause nodes to slowly run out of memory from continued timed
out connection requests to those halted or dead nodes.
This mod makes the following changes to alleviate this issue:
1. Add a module parameter and live tunable allowing us to limit the
   number of connections per peer held in purgatory.
2. Remove the fast reconnect path on the server by making
   that tunable contain different settings for computes
   and service nodes. fast_reconnect is on for computes and
   off for service nodes. This setting can be changed on a
   live system.
3. In the kgnilnd reaper code utilize the tunable and remove
   the oldest purgatoried connections as new connections are
   put into purgatory. This will keep memory usage down and
   allow a system to stay up in the face of nodes being down
   and rca not informing us that they are down.
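
Item 3 can be sketched as a bounded, oldest-first queue. This assumes
a fixed compile-time limit standing in for the tunable and integer
connection ids; the names and the array-based storage are
hypothetical, not the reaper's real data structures.

```c
#include <assert.h>
#include <stddef.h>

#define EX_PURGATORY_MAX 3	/* hypothetical per-peer limit */

struct ex_purgatory {
	int conn_ids[EX_PURGATORY_MAX];	/* oldest entry first */
	size_t count;
};

/*
 * Put a connection into purgatory. If the limit is already reached,
 * evict the oldest entry first. Returns the evicted id, or -1 if
 * nothing had to be evicted.
 */
int ex_purgatory_add(struct ex_purgatory *p, int conn_id)
{
	int evicted = -1;

	assert(p != NULL);
	if (p->count == EX_PURGATORY_MAX) {
		evicted = p->conn_ids[0];
		for (size_t i = 1; i < p->count; i++)
			p->conn_ids[i - 1] = p->conn_ids[i];
		p->count--;
	}
	p->conn_ids[p->count++] = conn_id;
	return evicted;
}
```

Memory held by purgatoried connections is then bounded by the limit,
no matter how many reconnect attempts to a dead peer time out.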
-------------------------------------------------------------
Subject: Update kgnilnd to be KNC aware.
Description:
Kgnilnd currently ignores rt_accel nodetype events coming from
RCA. This is incorrect as KNC's down and up events are
reported as rt_accel.
Since we currently ignore rt_accel events this causes us to
continually attempt to talk to down KNC nodes.
With this mod we now recognize rt_accel events allowing us to
prevent communications with down KNC nodes.
-------------------------------------------------------------
Subject: Always notify LNET on GNILND_RCA_NODE_DOWN
Description:
When an LNET router fails it can take router_ping_timeout +
live_router_check_interval seconds for all peers to detect the
down router. For peers on a gni network this can be over two
minutes. During this time peers will continue to use the
failed router.
In some situations gnilnd will receive an event from RCA
notifying that the node is down within 30 seconds of the node
failure. This is much faster than relying on the router
pinger, so gnilnd should call lnet_notify() to notify LNET,
upon receipt of the RCA event, that a peer is down.
-------------------------------------------------------------
Subject: Add fast reconnect path and update lnet_notify last
alive timestamp.
Description:
A lustre client can time out a router during a blade failure
which causes multiple quiesce cycles.
When we time out a connection, reconnect even if there are no
tx's waiting to be sent. This causes an lnet_notify up
notification so we don't need to wait for the router pinger to
bring the connection back up.
At the end of a quiesce, call lnet_notify that the peer is
still up which updates the last alive timestamp.
Various debug message cleanup.
-------------------------------------------------------------
Subject: gnilnd proc_dir_entry port - part 2
Description:
PDE_DATA is defined by libcfs in Cray-master and therefore
only needed by b2_5
-------------------------------------------------------------
Subject: gnilnd proc_dir_entry port
Description:
In SLES12 create_proc_entry and create_proc_read_entry have
been removed, and struct proc_dir_entry is no longer public.
This mod ports all proc functions to use seq_file.
-------------------------------------------------------------
Subject: Remove system.h from gnilnd
Description:
There is no longer system.h for x86 and gnilnd doesn't seem to
need it.
Remove it from gnilnd include.
-------------------------------------------------------------

Signed-off-by: Chuck Fossen <chuckf@cray.com>
Change-Id: Iad14538751cc50fbd03fd3d4876ca41f4c0a223f
Reviewed-on: http://review.whamcloud.com/13812
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Patrick Farrell <paf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-4727 hsm: use IOC_MDC_GETFILEINFO in restore 50/13750/6
John L. Hammond [Thu, 12 Feb 2015 19:53:18 +0000 (13:53 -0600)]
LU-4727 hsm: use IOC_MDC_GETFILEINFO in restore

Use IOC_MDC_GETFILEINFO rather than fstatat() to get the original file
attributes during restore. Add test_12p to sanity-hsm to check that
triggering an implicit restore from the copytool's own mount point
does not wedge the copytool.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I1b1eeb703c60907a2759fdb6d8fb8728a13f8918
Reviewed-on: http://review.whamcloud.com/13750
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6049 obdclass: Add synchro in lu_context_key_degister() 64/13164/7
Patrick Valentin [Mon, 22 Dec 2014 10:11:54 +0000 (11:11 +0100)]
LU-6049 obdclass: Add synchro in lu_context_key_degister()

When unloading a module, it may happen that lu_context_key_degister()
removes a key while a thread is either registering it in a new
context (lu_context_init(), lu_context_refill()), or using it when
exiting from a context (lu_context_exit(), lu_context_fini()).

In these cases, we reference a key which no longer exists, and
the system crashes either because we use a *POISON'ed* pointer
in key_fini() -> key->lct_fini(), or because one of the following
assertions fails:
 - lu_context_key_degister():
        ASSERTION(cfs_atomic_read(&key->lct_used) == 1)
                  failed: key has instances: 2

 - lu_context_exit():
        ASSERTION(key != NULL)

 - key_fini():
        ASSERTION(atomic_read(&key->lct_used) > 1)

This can also lead to SLAB objects which are not freed:
        slab error in kmem_cache_destroy(): cache `echo_thread_kmem':
                   Can't free all objects

Note: ptlrpc service threads need to call lu_context_init/fini in
each loop (for each RPC), and this could be a big performance issue
on fat SMP machines if we add serialization by a spinlock and need
to lock/unlock it multiple times for each RPC.

So the aim of this patch, which only impacts some infrequently used
functions, is:
 1) to add a synchronization in lu_context_key_quiesce(), also called
    by lu_context_key_degister(), to wait until all key::lct_init()
    methods have completed, by serializing with keys_fill()
 2) to add a synchronization in lu_context_key_degister(), to wait
    until all transient contexts referencing this key have run
    key::lct_fini() method

Signed-off-by: Patrick Valentin <patrick.valentin@bull.net>
Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: Id4ad974e8c7b8053d6e35ebce60cfbcf91dc230b
Reviewed-on: http://review.whamcloud.com/13164
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6203 tests: early lock cancel to allow early copytool death 46/13646/2
Bruno Faccini [Wed, 4 Feb 2015 16:39:38 +0000 (17:39 +0100)]
LU-6203 tests: early lock cancel to allow early copytool death

Since the copytool death check and timing were introduced with the
patch for LU-5622, sanity-hsm/test_251() has experienced several
failures due to copytool death being delayed, and to timeouts,
because of lock cancel.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I399b37854b98626c4c92a367d543b79aebf9eb4e
Reviewed-on: http://review.whamcloud.com/13646
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Henri Doreau <henri.doreau@cea.fr>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6321 lfsck: make lfsck_namespace trace file as index 45/13945/2
Fan Yong [Sun, 7 Dec 2014 01:00:55 +0000 (09:00 +0800)]
LU-6321 lfsck: make lfsck_namespace trace file as index

Originally, the "lfsck_namespace" file stored both the namespace
LFSCK statistics information and the FIDs to be double scanned.
But to improve the namespace LFSCK performance (since Lustre-2.7),
we used multiple trace files with the name "lfsck_namespace_xx".
At that time, the original "lfsck_namespace" file only needed to
record the namespace LFSCK statistics information, so we made it
a regular file, NOT an index file. Such a change causes trouble
when downgrading to Lustre-2.6 or older, because the old namespace
LFSCK needs an index trace file instead of a regular file. To avoid
the compatibility issues, we will keep the "lfsck_namespace" file
as an index file on b2_7 and newer releases.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I76d8b1416c4c507793aa9bbab2d52cc7d8daa440
Reviewed-on: http://review.whamcloud.com/13945
Tested-by: Jenkins
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-6307 obdclass: distinguish MGC/MDT connection properly 27/13927/2
Fan Yong [Thu, 4 Dec 2014 16:58:06 +0000 (00:58 +0800)]
LU-6307 obdclass: distinguish MGC/MDT connection properly

In commit 5f8847bca12afb798de600299356ed2e3655a53e, we introduced
version checking for the MDT-MDT connection. But there is a corner
case: the MGC sets OBD_CONNECT_MNE_SWAB (which is defined as the
same value as OBD_CONNECT_MDS_MDS) in the connection flags for
Imperative Recovery interoperability issues with the MGS. So the
server needs to know whether the connection is really from another
MDT or from the MGC by checking OBD_CONNECT_FID (which is not set
for the MGC-MGS connection).
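
The check described above can be sketched as follows. The EX_ flag
names and their bit values are placeholders chosen for this example;
only the relationship between them (MNE_SWAB sharing a bit with
MDS_MDS, FID being a separate bit) is taken from the description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, not the real OBD_CONNECT_* definitions. */
#define EX_CONNECT_FID		0x1ULL
#define EX_CONNECT_MDS_MDS	0x2ULL
#define EX_CONNECT_MNE_SWAB	EX_CONNECT_MDS_MDS	/* same bit */

/*
 * The shared bit alone cannot tell an MDT from an MGC. Treat the
 * peer as an MDT only when the FID flag is set as well, since the
 * MGC-MGS connection does not set it.
 */
bool ex_is_mdt_mdt_connection(uint64_t flags)
{
	return (flags & EX_CONNECT_MDS_MDS) != 0 &&
	       (flags & EX_CONNECT_FID) != 0;
}
```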

Test-Parameters: envdefinitions=ONLY=105 clientjob=lustre-b2_6 clientbuildno=19 testlist=recovery-small
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I9cee743d5474702b77adbb8c3dedd6c19faef15f
Reviewed-on: http://review.whamcloud.com/13927
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Lai Siyao <lai.siyao@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
4 years agoLU-5760 rpm: remove Red Hat specific check for init scripts 77/12377/6
Dmitry Eremin [Wed, 22 Oct 2014 12:05:18 +0000 (16:05 +0400)]
LU-5760 rpm: remove Red Hat specific check for init scripts

The issue with builds under mock-based environments is related to
a sloppy heuristic that checks for the existence of
two files under /etc and assumes that this is a good way to identify
a Red Hat system. We had a concern about this for other systems.

So, let's remove this Red Hat specific check of /etc files.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ibc6af75ebea51b39d5ff4c8473db2e3828ffea68
Reviewed-on: http://review.whamcloud.com/12377
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Christopher J. Morrone <morrone2@llnl.gov>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>