Whamcloud - gitweb
fs/lustre-release.git
10 years agoLU-4586 ptlrpc: cast type in the swith op 30/9130/2
Alex Zhuravlev [Wed, 5 Feb 2014 11:08:52 +0000 (15:08 +0400)]
LU-4586 ptlrpc: cast type in the swith op

This should allow building with gcc-4.7.2.

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I489ea927d3dc87a7b01f57c5d390612c015b8c47
Reviewed-on: http://review.whamcloud.com/9130
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Li Xi <pkuelelixi@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoRevert "LU-1778 libcfs: add a service that prints a nidlist" 78/9178/2
Oleg Drokin [Fri, 7 Feb 2014 14:08:12 +0000 (14:08 +0000)]
Revert "LU-1778 libcfs: add a service that prints a nidlist"

Whoops, this patch broke the build: http://build.whamcloud.com/job/lustre-master/arch=x86_64,build_type=client,distro=ubuntu1004,ib_stack=inkernel/1879/changes

So I am reverting it.

This reverts commit 874f67c06da8304a194df5fc0dd5a2c61937076c.

Change-Id: Ieb36ba5c909bc3731dc4a925d89773be89ab64ec
Reviewed-on: http://review.whamcloud.com/9178
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1778 libcfs: add a service that prints a nidlist 79/8479/6
Gregoire Pichon [Wed, 4 Dec 2013 13:57:10 +0000 (14:57 +0100)]
LU-1778 libcfs: add a service that prints a nidlist

libcfs already provides services to parse a string into a nidlist
and to match a nid against a nidlist. This patch implements a service
that prints a nidlist into a buffer.

This is required, for instance, to print the nosquash_nids parameter
of the MDT procfs component.

Additionally, this patch fixes a bug in the return code of the
parse_addrange() routine, so that parsing of nids including
a * character works fine ('*@elan' for instance).

Signed-off-by: Gregoire Pichon <gregoire.pichon@bull.net>
Change-Id: I5dbc405e02b8f0f90d45e1a7e44589d5972cc384
Reviewed-on: http://review.whamcloud.com/8479
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3950 lfsck: control LFSCK on all devices via single command 65/7665/28
Fan Yong [Fri, 24 Jan 2014 19:45:42 +0000 (03:45 +0800)]
LU-3950 lfsck: control LFSCK on all devices via single command

Under DNE mode, it is more convenient for the administrator to control
the LFSCK (start/stop) on all the MDT devices via a single command. Such
functionality is useful not only for DNE consistency verification, but
also for layout consistency (Phase II). It is also required for orphan
OST-object scanning.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie0d4611f969e51b80faf27b52dbdaee41caf5187
Reviewed-on: http://review.whamcloud.com/7665
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4540 llite: deadlock for page write 36/9036/2
Jinshan Xiong [Tue, 28 Jan 2014 22:31:36 +0000 (14:31 -0800)]
LU-4540 llite: deadlock for page write

The writing thread has already locked page #1 and then waits for the
Writeback bit of page #2 to clear.

The ptlrpc thread is composing a write RPC, so it sets Writeback on
page #2 and tries to lock page #1 to make it ready.

The two threads deadlock.
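
The circular wait described above can be modelled as a tiny wait-for
graph. This is only an illustrative sketch: the thread and resource
names are labels taken from the description, not Lustre identifiers,
and find_cycle() is a hypothetical helper.

```python
def find_cycle(wait_for, start):
    """Follow 'waits for' edges from start; return the cycle as a list, or None."""
    path = []
    seen = set()
    node = start
    while node is not None and node not in seen:
        seen.add(node)
        path.append(node)
        node = wait_for.get(node)
    if node is None:
        return None          # the chain ended, so nobody waits in a circle
    return path[path.index(node):]


# Who currently holds each resource (illustrative names).
holder = {"page1.lock": "writer", "page2.writeback": "ptlrpc"}
# Which resource each thread is blocked on.
blocked_on = {"writer": "page2.writeback", "ptlrpc": "page1.lock"}

# A thread waits for whichever thread holds the resource it is blocked on.
wait_for = {thread: holder[res] for thread, res in blocked_on.items()}
cycle = find_cycle(wait_for, "writer")
print(cycle)
```

Any non-empty cycle means neither thread can make progress; breaking
either edge (for example, not holding the page #1 lock while waiting on
page #2) removes the deadlock in this model.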

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: I2da547b4c93c3464e520a1f593985adae9360bc9
Reviewed-on: http://review.whamcloud.com/9036
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3321 clio: optimize read ahead code 23/8523/12
Jinshan Xiong [Fri, 3 Jan 2014 17:58:56 +0000 (09:58 -0800)]
LU-3321 clio: optimize read ahead code

The readahead code used to check that each page in the readahead window
was covered by a lock. Now cpo_page_is_under_lock() provides @max_index
to help decide the maximum readahead window. @max_index can be modified
by OSC to extend to the maximum lock region, by LOV to align to the
stripe boundary, and by LLITE to make sure the readahead region covers
at least the read region.

After this change, the readahead code usually calls
cpo_page_is_under_lock() once for each stripe.
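
A rough model of why reporting @max_index reduces the number of lock
checks. This sketch assumes the lock covers the whole window and that
the returned index is clamped to the end of the stripe chunk containing
the page; ra_window_calls() and stripe_pages are hypothetical names, not
part of the clio API.

```python
def ra_window_calls(start, end, stripe_pages):
    """Count lock checks needed to validate pages [start, end] when each
    check also reports max_index, the last page known to be covered."""
    calls = 0
    index = start
    while index <= end:
        calls += 1
        # Assumption: max_index is clamped to the end of the stripe chunk
        # that contains `index`, mimicking the LOV stripe alignment.
        max_index = (index // stripe_pages + 1) * stripe_pages - 1
        # Skip every page the check already vouched for.
        index = min(max_index, end) + 1
    return calls


start, end = 0, 1023
per_page_calls = end - start + 1                      # old: one check per page
per_stripe_calls = ra_window_calls(start, end, stripe_pages=256)
print(per_page_calls, per_stripe_calls)
```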

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Change-Id: Iecce020d01b804b799ad234f623498cc6f2f3fb2
Reviewed-on: http://review.whamcloud.com/8523
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4470 build: wrong linux symbol file search 56/9056/2
Bob Glossman [Wed, 29 Jan 2014 20:00:42 +0000 (12:00 -0800)]
LU-4470 build: wrong linux symbol file search

Long standing build flaw just discovered.  The autoconf function
LB_CHECK_SYMBOL_EXPORT looks for the linux symbol table in the wrong place.
In most builds this doesn't matter as the wrong path being used exactly
matches the correct path.  In SLES builds it does matter a lot.
Failing to find the linux symbol table can lead to incorrect autoconf results.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Iab43a2c118c9b8be54a9596b4682b68a11946a94
Reviewed-on: http://review.whamcloud.com/9056
Tested-by: Jenkins
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoNew tag 2.5.55 2.5.55 v2_5_55 v2_5_55_0
Oleg Drokin [Mon, 3 Feb 2014 06:52:59 +0000 (01:52 -0500)]
New tag 2.5.55

Change-Id: I080c434ada778bf15c7b361072abef97b693734b

10 years agoLU-4442 test: add version check for replay-vbr.sh test_7g 73/8973/3
Emoly Liu [Thu, 23 Jan 2014 09:40:15 +0000 (17:40 +0800)]
LU-4442 test: add version check for replay-vbr.sh test_7g

In replay-vbr.sh test_7g.3, because mdt_object_exists() was added
in http://review.whamcloud.com/#/c/8371, client will not be evicted
without object version check.

Test-Parameters: envdefinitions=SLOW=yes,ONLY=7g testlist=replay-vbr
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Ie3c727aba8bd8bf65460a005412fb217ced341ec
Reviewed-on: http://review.whamcloud.com/8973
Tested-by: Jenkins
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3189 tests: add version check code into sanity test 53 33/8833/5
Jian Yu [Tue, 14 Jan 2014 08:52:28 +0000 (16:52 +0800)]
LU-3189 tests: add version check code into sanity test 53

This patch adds Lustre version check code to sanity test
53 to make the test work with servers that do not have the
following patch:

Lustre-commit: 6c4c51e3079e6c257fbf86536e4739110c166e3b
Lustre-change: http://review.whamcloud.com/4789

Test-Parameters: envdefinitions=SLOW=yes,ENABLE_QUOTA=yes,ONLY=53 \
ossjob=lustre-b2_3 mdsjob=lustre-b2_3 ossbuildno=41 mdsbuildno=41 \
mdtcount=1 testlist=sanity

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ibc759aeedb0023113d9acbdda6b4db5207775aa1
Reviewed-on: http://review.whamcloud.com/8833
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4431 lnet: 1/3/2014 update for Cray interconnects 44/8744/5
Chuck Fossen [Fri, 3 Jan 2014 23:35:40 +0000 (17:35 -0600)]
LU-4431 lnet: 1/3/2014 update for Cray interconnects

This is a rollup of changes for gnilnd containing bug fixes and
enhancements since the LU-3008 submission.
The new header file gni_pub.h contains code that will allow gnilnd to
be built upstream. It will not pass the checkpatch.pl script since it
was developed previously for gni drivers and ugni.

To build a lustre client including gnilnd for Aries (XC30):
  sh autogen.sh
  ./configure --with-linux=/path_to_centos6.4_kernel \
    --disable-ldiskfs-build --disable-doc --disable-liblustre \
    --disable-server --with-o2ib=no --without-sysio --disable-checksum \
    --enable-utils --enable-gni \
    GNICPPFLAGS='-DCONFIG_CRAY_ARIES -I$PWD/lnet/klnds/gnilnd'

Included changes:
----------------------------------------------------------------------
Subject: Unused VIRT fma block does not get cleaned up during stack
reset.
Description:
This is an edge case where we allocate a GNILND_FMABLK_VIRT fma block
but before using it, a GNILND_FMABLK_PHYS is freed up for use. The
GNILND_FMABLK_VIRT fma block didn't get associated with a conn thus,
during a stack reset, the fma block will not be cleaned up and we
assert.
Changed kgnilnd_unmap_phys_fmablk to unmap all fma blocks instead of
just PHYS blocks.
Renamed kgnilnd_unmap_phys_fmablk to kgnilnd_unmap_fma_blocks.
---------------------------------------------------------------------
Subject: New LND type "gip" gnilnd use of IP addresses.
Description:
Add a new LND type to the libcfs_netstrfns array that converts IP
address to/from addresses.
This change also allows us to use hostnames or IP addresses to specify
the direct attached file systems in the /etc/fstab file.
----------------------------------------------------------------------
Subject: Changes to gnilnd for non-Cray modified kernel.
Description:
Get MAC address from arp table for generating a nicaddr.
Change the rca_inject proc file name to peer_state.
Add the GNIIPLND type to lnd_type for Apollo builds.
Add TO_GNILND_timeout for building upstream without gni-headers.
Add gni_pub.h for use by upstream builds.
Fix slab-freed debug statement which references a freed structure.
Remove unused code that was needed for the gemini simulator.
----------------------------------------------------------------------
Subject: Gnilnd upstream sync LU-4069
Description:
LU-4069 build: cleanup from GOTO(label, -ERRNO)
Cleanup the code from GOTO(label, -ERRNO) and other bad GOTOs.
----------------------------------------------------------------------
Subject: Merge gnilnd changes from LU-2800.
Description:
Upstream changes from LU-2800 need merging to gnilnd.
----------------------------------------------------------------------
Subject: Adjust to Cray-master cfs changes.
Description:
ll_proc_doxxxx macros have been removed in the cfs layer. Use the
corresponding proc_doxxxx function instead.
----------------------------------------------------------------------
Subject: Fix offset problem in reverse rdma edge case.
Description:
The call to lnet_copy_flat2kiov() was used incorrectly passing in an
offset to the source buffer being copied from. The offset is used to
decide how many bytes will be copied from the first iov which causes
the routine to only copy the difference between the nob and the offset
to the first iov. Since only one iov is ever passed in, all the bytes
need to come from that first iov.
----------------------------------------------------------------------
Subject: Remove CFS kernel abstraction dependencies from GniLND
Description:
The CFS kernel abstractions are being removed upstream.
Incorporate LU-1346 changes to GniLND.
We should not call daemonize as we are using kthread_create/
kthread_run.
----------------------------------------------------------------------
Subject: Fix race in closing connection in response to EFAULT error
Description:
The previous mod causes a regression when closing a conn in two threads
at once. We should use kgnilnd_close_conn() instead of
kgnilnd_close_conn_locked() since it checks the state of the conn
before actually closing.
----------------------------------------------------------------------
Subject: Close connection in response to EFAULT error
Description:
After the companion node's GPU fell off the bus, we get an mdd invalid
hardware error even though the mdd's that have been inspected look ok.
The hardware error is returned in the rdma cq event in
kgnilnd_check_rdma_cq() and we respond by nak'ing the message.
The plan is to close the connection since the connection is still
alive and subsequent rdma's will continue to fail.
If the connection cannot be reestablished then the communication to
this node will cease so at least jobs will not continue to be
scheduled on this node.
----------------------------------------------------------------------
Subject: Fix outstanding conns issue during kgnilnd_base_shutdown
Description:
Currently in kgnilnd_base_shutdown there is a small race with the
datagram thread that can cause a wildcard dgram to match while we are
in the process of shutting down, which causes a nak datagram to be
generated. This new datagram needs to be canceled; currently we go
straight into full shutdown without doing so, causing us to assert.

This mod adds a cancel function that iterates over all outstanding
non-wc dgrams regardless of net and cancels them. It then schedules
the device to clean up the remaining conns.
----------------------------------------------------------------------
Subject: Canceled dgram deadlock
Description:
When adding conns of canceled dgrams to purgatory, a call was made to
kgnilnd_destroy_conn_ep().
This is inappropriate since we are inside the kgn_peer_conn_lock and
kgnilnd_destroy_conn_ep() takes a mutex lock.
Avoid this behavior by setting the conn state to CLOSED instead of
DONE and allowing the scheduler thread to finish the conn's
processing.
----------------------------------------------------------------------
Subject: LND obtains node up/down information when creating peer.
Description:
Before creating a peer, check the state of the node to see if it is up
or down.
This is done by calling krca_get_sysnodes() and walking through the
array for the nid of interest and checking its state.
kgnilnd_finish_connect() also creates peers but we do not need to
check in this instance since the request is from the peer.
----------------------------------------------------------------------
Subject: debug and lbug capability for kgnilnd client EFAULT errors
Description:
Currently when kgnilnd encounters an EFAULT within a nak message it
kills the TX and prints a message to the screen. It does not crash or
print enough information for us to diagnose whether the problem is
hardware or software.
This patch will allow us to bug a compute node programmatically when it
starts getting a large number of EFAULTs. It also prints out the
memhandle of the mdd that we should be inspecting for validity.
----------------------------------------------------------------------
Subject: Fix kgnilnd q_time setting
Description:
When we receive a GNILND_MSG_PUT_REQ we send a GNILND_MSG_PUT_ACK in
response. When we sent that response we were not setting the tx's
q_time.
This mod fixes that problem and allows us to see the correct tx age
when calling kgnilnd_tx_done.
Changed a cast from long to unsigned long.
Corrected a tab issue.
----------------------------------------------------------------------
Subject: Add gnilnd eager receive limit
Description:
Add module parameter eager_credits which limits the number of messages
that can be eager received.
Currently, we continue to allocate memory with each message, which can
cause out of memory issues if an IB interface goes down.
Add a counter to track eager allocations. Return -ENOMEM to lnet if we
exceed the number of credits allocated. Lnet will drop messages when
eager receive returns with an error.
Set the default eager_credits to 256k - this limits us to using 512 MB
of memory. This path is used mainly when there is an imbalance between
the two "sides" of the router. There should be no performance impact
provided the normal tuning is done.
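
A minimal sketch of the credit accounting described above, for
illustration only. The class and method names are hypothetical, the real
code would need atomic counters, and the 2 KB per-message figure is
simply what the stated numbers (256k credits, 512 MB) imply, not a value
taken from the gnilnd source.

```python
import errno


class EagerCredits:
    """Counts outstanding eager receives against a fixed credit limit."""

    def __init__(self, credits):
        self.credits = credits
        self.in_use = 0

    def eager_recv(self):
        # Refuse the allocation once every credit is taken; the caller
        # (lnet in the description) is expected to drop the message.
        if self.in_use >= self.credits:
            return -errno.ENOMEM
        self.in_use += 1
        return 0

    def release(self):
        # Called when an eagerly received message has been consumed.
        assert self.in_use > 0
        self.in_use -= 1


DEFAULT_EAGER_CREDITS = 256 * 1024
MEMORY_LIMIT = 512 * 1024 * 1024
# Implied worst-case buffer size per eager message, in bytes.
BYTES_PER_MSG = MEMORY_LIMIT // DEFAULT_EAGER_CREDITS
print(BYTES_PER_MSG)
```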
----------------------------------------------------------------------
Subject: kgnilnd static analysis fixes
Description:
Static analysis has found different bugs to fix.
This mod is a package of minor static analysis fixes.

1. Remove unsigned compare against 0 in kgnilnd_setup_immediate_buffer
2. Fix unintentional integer overflow in kgnilnd_proc_run_cksum_test
3. Fix Nesting issue in kgnilnd_map_phys_fmablk
4. Fix kgnilnd_process_nak return code not being used.
5. Remove unneeded code in kgnilnd_del_conn_or_peer
6. Fix uninitialized value in kgnilnd_queue_tx
----------------------------------------------------------------------
Subject: kgnilnd_probe_for_dgram() race during shutdown.
Description:
Canceling dgrams while shutting down can cause an assertion in
kgnilnd_probe_for_dgram().
If the shutdown thread calls kgnilnd_probe_for_dgram concurrently with
the dgram mover thread, both may get the same dgram from the
postdata_probe_by_id kgni function.
Move the lock release to after postdata_test_by_id which actually
removes the dgram from the list.
Added fail_loc to test fix.
----------------------------------------------------------------------
Subject: Mailbox corruption fix
Description:
Canceled dgrams could have been completed at the peer during the
cancelation.
The mailbox could then be used for another peer therefore allowing two
peers to use the same mailbox.
The conns for canceled dgrams need to be put in purgatory so they
don't get reused until a connection has been established for the peer.
During release of a canceled dgram, we hook up the conn to the peer
then put it in purgatory.
Added flag to kgnilnd_release_dgram to indicate we are shutting down
or going through a stack reset.
Added some tracing of gnd_ndgrams.
----------------------------------------------------------------------
Subject: LND support for knc
Description:
For knc nodes, use GNI_PTAG_LND_KNC. Use two scheduler threads for
better performance.
libcfs includes calls to cfs_crypto_crc32_pclmul_register() and
cfs_crypto_crc32_pclmul_unregister() but those files are not built for
k1om architecture.
----------------------------------------------------------------------

Signed-off-by: Chuck Fossen <chuckf@cray.com>
Change-Id: Ie8be6d7e8b6623a49d7a75ec878a23cf5385cc46
Reviewed-on: http://review.whamcloud.com/8744
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4527 utils: deprecate old version lfs command opts 31/8631/6
Andreas Dilger [Thu, 19 Dec 2013 23:34:53 +0000 (16:34 -0700)]
LU-4527 utils: deprecate old version lfs command opts

The build version checking in lfs_getstripe() and lfs_find() was
incorrectly using LUSTRE_VERSION instead of LUSTRE_VERSION_CODE.
The old "positional" parameters for "lfs setstripe" have long been
deprecated and are now being removed.  The "--offset" and "--index"
options were not correctly being deprecated since 2.4.50 as intended.

Remove the code and conditions for already-passed build versions,
and fix the remaining checks to use LUSTRE_VERSION_CODE.  Fix one
test that was using a deprecated option.
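
For illustration, a version code can be modelled as an integer with one
byte per component, so that versions compare numerically. This sketch
assumes a layout like the one the OBD_OCD_VERSION() macro uses in the
Lustre headers; version_code() and option_is_deprecated() are
hypothetical helpers, not functions from lfs.

```python
def version_code(major, minor, patch, fix=0):
    """Pack a dotted version into one integer, one byte per component."""
    return (major << 24) | (minor << 16) | (patch << 8) | fix


def option_is_deprecated(build, deprecated_since):
    """True once the build version has reached the deprecation version."""
    return build >= deprecated_since


build = version_code(2, 5, 55)
deprecated_since = version_code(2, 4, 50)
print(hex(build), option_is_deprecated(build, deprecated_since))
```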

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: I086c869ea5b3ba6c1f83cc2b6ce2c866b43ebbe5
Reviewed-on: http://review.whamcloud.com/8631
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1267 lfsck: enhance RPCs (3) for MDT-OST consistency 08/7108/36
Fan Yong [Fri, 24 Jan 2014 19:44:54 +0000 (03:44 +0800)]
LU-1267 lfsck: enhance RPCs (3) for MDT-OST consistency

The LFSCK on the OST uses the LFSCK_NOTIFY RPC to notify the LFSCK
on the MDT about the LFSCK progress for the layout consistency
verification. It uses the LFSCK_QUERY RPC to query the LFSCK
status on the MDT.

The LFSCK RPC from OST to MDT is sent via the reverse connection
from OST-x to MDT-y.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I138fa9b9ad8ab539379f25bb59ec04a1a482fddb
Reviewed-on: http://review.whamcloud.com/7108
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4509 ptlrpc: re-enqueue ptlrpcd worker 22/8922/4
Liang Zhen [Mon, 20 Jan 2014 12:52:51 +0000 (20:52 +0800)]
LU-4509 ptlrpc: re-enqueue ptlrpcd worker

osc_extent_wait can get stuck in a scenario like this:

1) thread-1 held an active extent
2) thread-2 called flush cache, and marked this extent as "urgent"
   and "sync_wait"
3) thread-3 wants to write to the same extent, osc_extent_find will
   get "conflict" because this extent is "sync_wait", so it starts
   to wait...
4) cl_writeback_work has been scheduled by thread-4 to write some
   other extents, it has sent RPCs but not returned yet.
5) thread-1 finished its work, and called osc_extent_release()->
   osc_io_unplug_async()->ptlrpcd_queue_work(), but found
   cl_writeback_work is still running, so the request is ignored
   (-EBUSY)
6) thread-3 is stuck because nobody will wake it up.

This patch allows ptlrpcd_work to be rescheduled, so it will not
miss requests anymore.
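
The rescheduling idea can be sketched as a work item that remembers a
request arriving while it is running and runs once more instead of
dropping it. This is a single-threaded model with hypothetical names
(Work, pending, runs); the real ptlrpcd code is concurrent and is not
reproduced here.

```python
class Work:
    """A work item that is re-run if it is queued while already running."""

    def __init__(self, callback):
        self.callback = callback
        self.running = False
        self.pending = False
        self.runs = 0

    def queue(self):
        if self.running:
            # The old behaviour in the scenario above was to give up with
            # -EBUSY here; instead, remember that another pass is needed.
            self.pending = True
            return
        self.running = True
        try:
            while True:
                self.pending = False
                self.runs += 1
                self.callback(self)
                if not self.pending:
                    break        # nobody asked for another pass
        finally:
            self.running = False


def writeback(work):
    # Simulate thread-1 queuing the work while the first pass is running.
    if work.runs == 1:
        work.queue()


work = Work(writeback)
work.queue()
print(work.runs)
```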

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I4929d52b2d409c2ce081147bb5ee3dd380a86c43
Reviewed-on: http://review.whamcloud.com/8922
Tested-by: Jenkins
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4406 osd-zfs: Correct number of integers for zap key 57/8857/4
Nathaniel Clark [Tue, 14 Jan 2014 21:42:35 +0000 (16:42 -0500)]
LU-4406 osd-zfs: Correct number of integers for zap key

All zap_*_uint64 functions take a key size that is the number of
uint64s.  This corrects the osd_prepare_key to account for that, and
changes the name to make it more consistent with zap functions.
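
To illustrate the difference between a key's byte length and its length
in uint64s, here is a small sketch. prepare_uint64_key() is a
hypothetical stand-in, not the renamed osd function, and the sample
values are arbitrary.

```python
import struct


def prepare_uint64_key(*values):
    """Pack integers into a key of 64-bit words.

    Returns the raw key and its size expressed as a number of uint64s,
    which is the unit the zap_*_uint64 functions expect.
    """
    key = struct.pack("<%dQ" % len(values), *values)
    return key, len(key) // 8


key, key_numints = prepare_uint64_key(0x200000401, 7)
# The byte length (16) is not a valid key size here; the word count (2) is.
print(len(key), key_numints)
```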

Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: I8ee5ee6e955016fc4340025cede21aaf5bd034b7
Reviewed-on: http://review.whamcloud.com/8857
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-1267 lfsck: enhance RPCs (2) for MDT-OST consistency 87/7087/39
Fan Yong [Fri, 24 Jan 2014 19:44:32 +0000 (03:44 +0800)]
LU-1267 lfsck: enhance RPCs (2) for MDT-OST consistency

The LFSCK on the MDT uses the LFSCK_NOTIFY RPC to control the LFSCK
on the OSTs (or other MDTs) to start/stop/fail/pause the layout
consistency verification. It uses the LFSCK_QUERY RPC to query the
LFSCK status on the OSTs (or other MDTs).

Introduce a new connection flag, OBD_CONNECT_LFSCK, to indicate
whether the target server (MDT/OST) supports online LFSCK or not.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia605f25d0ca0224af3ee543d72a1e9f0cae918e3
Reviewed-on: http://review.whamcloud.com/7087
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1267 lfsck: enhance RPCs (1) for MDT-OST consistency 23/8623/21
Fan Yong [Fri, 24 Jan 2014 19:42:48 +0000 (03:42 +0800)]
LU-1267 lfsck: enhance RPCs (1) for MDT-OST consistency

Introduce a new RPC, LFSCK_NOTIFY, for the LFSCK instance on server_1
to notify the LFSCK instance on server_2 about events such as
lfsck start/stop/pause/fail/phaseX_done, and so on.

Introduce new RPC LFSCK_QUERY for the LFSCK instance on the server_1
to query the LFSCK status on the server_2.

The two new RPCs are used not only for MDT-OST consistency, but also
for DNE consistency in the future.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I8529f1a3f5f7f9589101f456f0397c8ebe11df18
Reviewed-on: http://review.whamcloud.com/8623
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3344 tests: Verify file handle system calls 47/7247/23
James Simmons [Mon, 13 Jan 2014 14:08:36 +0000 (09:08 -0500)]
LU-3344 tests: Verify file handle system calls

The new system calls name_to_handle_at() and open_by_handle_at() were
added in Linux kernel 2.6.39. Add a test to verify that they work
correctly with Lustre.

Signed-off-by: Swapnil Pimpale <spimpale@ddn.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Icbfc9642cd550ac44d379263836782ffbf4a74f4
Reviewed-on: http://review.whamcloud.com/7247
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
10 years agoLU-2524 tests: run sanity test_51ba in test_51b dir 21/9021/2
Andreas Dilger [Mon, 27 Jan 2014 20:01:19 +0000 (13:01 -0700)]
LU-2524 tests: run sanity test_51ba in test_51b dir

Run the test_51ba directory cleanup in the same directory where the
test_51b subtest created its subdirectories.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ib1118046dba13351c59bc39db3e85ef8583ebbe5
Reviewed-on: http://review.whamcloud.com/9021
Reviewed-by: James Nunez <james.a.nunez@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4551 tests: add range support in ONLY 22/9022/3
wang di [Mon, 27 Jan 2014 15:20:40 +0000 (07:20 -0800)]
LU-4551 tests: add range support in ONLY

Add range support in ONLY, so we can specify
a range of test cases when running the test.

For example, ONLY="12-30" sh sanity will run sanity
test cases 12 through 30.
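
The expansion can be sketched as follows. The real change lives in the
shell test framework; this Python model only illustrates the idea,
assumes the range is inclusive at both ends, and does not model lettered
sub-tests inside a range. expand_only() is a hypothetical name.

```python
def expand_only(only):
    """Expand tokens such as "12-14" into individual test numbers."""
    tests = []
    for token in only.split():
        first, sep, last = token.partition("-")
        if sep and first.isdigit() and last.isdigit():
            # Inclusive numeric range.
            tests.extend(str(n) for n in range(int(first), int(last) + 1))
        else:
            # Plain test names (for example "7g") pass through unchanged.
            tests.append(token)
    return tests


print(expand_only("12-14 7g"))
```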

Test-Parameters: allwaysuploadlogs envdefinitions=ONLY="12-20" testlist=sanity
Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I4c6dd62f0524ece388ccde3f1e4469a1219f11d2
Reviewed-on: http://review.whamcloud.com/9022
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1267 lfsck: framework (3) for MDT-OST consistency 62/7062/37
Fan Yong [Fri, 24 Jan 2014 19:42:07 +0000 (03:42 +0800)]
LU-1267 lfsck: framework (3) for MDT-OST consistency

Introduce an assistant kernel thread to help to handle MDT-OST
consistency verification. The LFSCK main engine thread and the
assistant kernel thread compose an async mode pipeline:

For a given MDT-object, the LFSCK main engine thread reads its
layout EA and, for each stripe, prefetches the OST-object's
attributes asynchronously. The LFSCK main engine thread does not
wait for the reply with the OST-object's attributes; instead, it
adds the request structure to the shared list.

The LFSCK assistant kernel thread scans the shared list and,
for each replied request, checks whether the OST-object's attributes
are consistent with its MDT-object's attributes. If it finds an
inconsistency, the LFSCK assistant kernel thread fixes it.

To prevent the LFSCK main engine thread from getting so far ahead of
the LFSCK assistant kernel thread that too many prefetched objects
cause memory pressure, an async window size controls how many objects
the LFSCK main engine thread can be ahead of the LFSCK assistant
kernel thread at most. It also controls how many objects the
assistant kernel thread can be ahead of the backend ptlrpcd threads
at most. The window size can be specified via the "-w" option of the
"lctl lfsck_start" command and can be adjusted dynamically via the
proc interface "lfsck_async_windows".
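
The windowed pipeline can be modelled with a bounded queue: the producer
(standing in for the main engine) blocks once it is a full window ahead
of the consumer (standing in for the assistant thread). This is a
user-space sketch with hypothetical names, not the kernel
implementation.

```python
import queue
import threading


def run_pipeline(objects, window):
    """Feed objects through a bounded queue; return what was checked and
    the largest queue depth the producer observed."""
    shared = queue.Queue(maxsize=window)   # put() blocks when the window is full
    checked = []
    max_depth = 0

    def assistant():
        while True:
            obj = shared.get()
            if obj is None:                # sentinel: no more work
                break
            checked.append(obj)            # stand-in for the consistency check

    worker = threading.Thread(target=assistant)
    worker.start()
    for obj in objects:
        shared.put(obj)                    # stand-in for an async prefetch
        max_depth = max(max_depth, shared.qsize())
    shared.put(None)
    worker.join()
    return checked, max_depth


checked, depth = run_pipeline(list(range(100)), window=4)
print(len(checked), depth <= 4)
```

A smaller window bounds memory use more tightly at the cost of stalling
the producer more often.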

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I41efd93bc614591a9aabe1099a13fbcc1275d2d9
Reviewed-on: http://review.whamcloud.com/7062
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3951 lfsck: LWP connection from OST-x to MDT-y 66/7666/24
Fan Yong [Fri, 24 Jan 2014 19:41:42 +0000 (03:41 +0800)]
LU-3951 lfsck: LWP connection from OST-x to MDT-y

When a client sends an object-based RPC to the OST-x, the RPC service
thread on the OST-x needs to verify whether the parent FID
information given in the client RPC matches the parent FID information
stored in the OST-object. If they do not match, it will query the MDT-y
according to the parent FID information given by the client. The query
RPC from the OST-x to the MDT-y is sent via the LWP connection.

The other use case is that the LFSCK on the OST-x needs to talk
with the LFSCK on the MDT-y; such control/query RPCs will go via
the above LWP connection from OST-x to MDT-y.

Currently, we only support the LWP connection from OST-x to the MDT-0.
This patch enhances that to enable LWP connections from any OST to
any MDT.

Test-Parameters: allwaysuploadlogs
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ie98be82b3af90456d1838d53b6d77c12956f7bd7
Reviewed-on: http://review.whamcloud.com/7666
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4484 lbuild: add support for fresh versions of MPSS 3.x.x 36/8836/5
Dmitry Eremin [Tue, 14 Jan 2014 11:36:55 +0000 (15:36 +0400)]
LU-4484 lbuild: add support for fresh versions of MPSS 3.x.x

* Adapt the lbuild script for the new version of MPSS with x.x.x notation.
* Remove the dependency on the MPSS package to avoid renaming issues in
  the future. The package that was used for the dependency
  was renamed in MPSS.
* Use the new server with MPSS released packages for download.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: Ie4407ad00177ad6d22770230a4dc6bde967d91ef
Reviewed-on: http://review.whamcloud.com/8836
Tested-by: Jenkins
Reviewed-by: Brian J. Murrell <brian.murrell@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4456 osp: extra check for opd_pre 90/8890/8
wang di [Thu, 16 Jan 2014 23:26:56 +0000 (15:26 -0800)]
LU-4456 osp: extra check for opd_pre

1. Add an extra check for opd_pre in statfs_interpret, in case
opd_pre has been freed before the callback.

2. Switch the order of sync_fini and pre_fini, so opd_pre will be freed
after all possible accesses have been stopped.

3. opd_pre_waitq will be accessed in several update threads
(osp_precreate, osp_statfs_timer_cb, statfs_interpret), so move
it to osp_device to make sure it is accessible even after
osp_pre is freed.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I5c73cb52e2406ed03570fc3471111c409e6fe08f
Reviewed-on: http://review.whamcloud.com/8890
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4517 tests: get params directly in _wait_osc_import_state 89/8989/3
Emoly Liu [Sun, 26 Jan 2014 02:27:20 +0000 (10:27 +0800)]
LU-4517 tests: get params directly in _wait_osc_import_state

In _wait_osc_import_state(), if the facet is not a client node,
get the osc.*.ost_server_uuid params directly,
without waiting $maxtime seconds.

Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: Idf5989f53d050edcb69690b7f24a6e86df233bef
Reviewed-on: http://review.whamcloud.com/8989
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-4515 tests: disable sanity-quota test_34 temporary 81/8981/5
Fan Yong [Fri, 24 Jan 2014 19:19:33 +0000 (03:19 +0800)]
LU-4515 tests: disable sanity-quota test_34 temporary

Keep other patches from failing because of LU-4515.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I8a6949d18a1ff4f5d229ed083f4f12a667eb3329
Reviewed-on: http://review.whamcloud.com/8981
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
10 years agoLU-4413 osp: move seq allocation out of osp_import_event 97/8997/3
wang di [Fri, 24 Jan 2014 22:21:07 +0000 (14:21 -0800)]
LU-4413 osp: move seq allocation out of osp_import_event

Because seq allocation (osp_init_pre_fid) might get stuck
during an RPC, move it out of osp_import_event, which runs
inside ptlrpcd_rcv. Otherwise, other import RPCs (such as
connect requests) might be blocked in ptlrpcd_rcv.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ib4014f8b0088ea3613fa4d53d3e274f5bdfe70c7
Reviewed-on: http://review.whamcloud.com/8997
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3531 mdc: release dir page cache after accessing 35/8935/4
wang di [Mon, 20 Jan 2014 23:49:34 +0000 (15:49 -0800)]
LU-3531 mdc: release dir page cache after accessing

Release the dir page cache in llite/lmv, so the page
is held until the entries have been filled by filldir.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I8b24bec74b14ff2b65130c02294821fc16ca1421
Reviewed-on: http://review.whamcloud.com/8935
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4409 tests: disable insanity 10 for DNE 50/8650/5
wang di [Sat, 21 Dec 2013 15:02:52 +0000 (07:02 -0800)]
LU-4409 tests: disable insanity 10 for DNE

Disable insanity 10 for DNE.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I4b67cf745a18a09335e21e1e6e457134ac47f224
Reviewed-on: http://review.whamcloud.com/8650
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
10 years agoLU-1267 lfsck: framework (2) for MDT-OST consistency 02/8302/15
Fan Yong [Wed, 15 Jan 2014 05:20:59 +0000 (13:20 +0800)]
LU-1267 lfsck: framework (2) for MDT-OST consistency

The LFSCK can talk with OSP directly, so the LFSCK on the MDT can
control/monitor the specified LFSCK instance on other targets (MDTs
and/or OSTs) without breaking dt_device APIs or making OSP aware of
LFSCK internals, which simplifies the handling of remote OST-objects
or MDT-objects. For that, each OSP registers with the LFSCK when it
is added into the system. The LFSCK maintains this target table in
RAM with logic similar to what the LOD uses.

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ifa14db68925a0cd2afe0c3566382dbb6176d50b2
Reviewed-on: http://review.whamcloud.com/8302
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1267 lfsck: rebuild LAST_ID 97/6997/35
Fan Yong [Sat, 18 Jan 2014 01:04:09 +0000 (09:04 +0800)]
LU-1267 lfsck: rebuild LAST_ID

The /O/<seq>/LAST_ID file records the last oid of the objects allocated
within the sequence. The LAST_ID file can become corrupted or go
missing while the system is running. The LFSCK for layout consistency
verification can detect a lost or corrupted LAST_ID and rebuild it by
scanning the whole device.

This functionality is also part of LU-14 live replacement of OST.

Introduce lfsck_notify callback - the LFSCK events notification
channel from the LFSCK to the registered users (MDD/OFD).

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Iee85056e2fda1ecba9424c9f0e822643e9f029a8
Reviewed-on: http://review.whamcloud.com/6997
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2818 mdt: Properly handle ENOMEM 47/8947/2
Oleg Drokin [Tue, 21 Jan 2014 18:53:26 +0000 (13:53 -0500)]
LU-2818 mdt: Properly handle ENOMEM

When osd_keys_init fails in mdt_lvbo_fill, properly bail out with
error instead of asserting.
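
The change from asserting to bailing out can be pictured with a minimal user-space C sketch. Everything here is an assumption made for illustration: keys_init_sketch(), lvbo_fill_sketch() and the simulate_failure switch are hypothetical stand-ins, not the real osd_keys_init() or mdt_lvbo_fill() code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for an init routine that can run out of memory. */
static int keys_init_sketch(void **keys, size_t size, int simulate_failure)
{
	*keys = simulate_failure ? NULL : malloc(size);
	return *keys == NULL ? -ENOMEM : 0;
}

/* Hypothetical caller: propagate the error instead of asserting on it. */
static int lvbo_fill_sketch(int simulate_failure)
{
	void *keys;
	int rc;

	rc = keys_init_sketch(&keys, 64, simulate_failure);
	if (rc != 0)
		return rc;	/* bail out; an assertion here would crash */

	free(keys);
	return 0;
}
```

With this shape, an allocation failure reaches the caller as -ENOMEM and can be handled like any other error.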

Change-Id: I832742ed49cc7740d8e709bc4b87e5d5aa100d39
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8947
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-1538 tests: sanity/101d to check available space 12/4312/6
Alex Zhuravlev [Fri, 19 Oct 2012 18:30:49 +0000 (22:30 +0400)]
LU-1538 tests: sanity/101d to check available space

Fix the check which compared MBs (in size variable) with KBs
(reported by df).  Also avoid failure if the read took under
two seconds, since the timing is inaccurate at that resolution.
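
The unit mismatch is easy to reproduce. The test itself is a shell script; the C sketch below only illustrates the idea, and the function name, the direction of the comparison and the sample values are assumptions rather than the test's real logic.

```c
#include <assert.h>

/*
 * Hypothetical helper: need_mb is a size in MB, avail_kb is free space
 * in KB (the unit df reports). Convert to a common unit before comparing.
 */
static int enough_space(unsigned long long need_mb,
			unsigned long long avail_kb)
{
	return avail_kb >= need_mb * 1024ULL;
}
```

Comparing the raw numbers would accept 2000 KB as enough room for a 500 MB file, because 2000 is greater than 500; converting first rejects it.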

Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Change-Id: I43335d699ad2b7e4c5db00c36a7795683f3b04f7
Reviewed-on: http://review.whamcloud.com/4312
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4416 llite: struct kiocb ki_left removed 01/8801/2
yangsheng [Wed, 1 Jan 2014 15:53:38 +0000 (23:53 +0800)]
LU-4416 llite: struct kiocb ki_left removed

struct kiocb has had no ki_left member since kernel 3.12.

Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: Iea1fb67ebb03430b5dc8f71ed2652967ff60b84d
Reviewed-on: http://review.whamcloud.com/8801
Tested-by: Jenkins
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4423 ptlrpc: fix potential NULL pointer dereference 82/8682/2
Oleg Drokin [Tue, 31 Dec 2013 01:50:28 +0000 (20:50 -0500)]
LU-4423 ptlrpc: fix potential NULL pointer dereference

The rest of the code seems to imply that rmf_dumper may indeed be
NULL.  Change the code so that dumping is not even considered if
rmf_dumper callback is not set.
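
A minimal sketch of this kind of guard follows. It assumes a simplified field descriptor: only the rmf_dumper name comes from the commit message, while struct msg_field_sketch, field_dump() and count_dumper() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified field descriptor. */
struct msg_field_sketch {
	const char *rmf_name;
	void (*rmf_dumper)(const void *data);	/* optional, may be NULL */
};

static int dump_calls;

/* Example dumper that only counts how often it was invoked. */
static void count_dumper(const void *data)
{
	(void)data;
	dump_calls++;
}

/* Returns 1 if the field was dumped, 0 if dumping was skipped. */
static int field_dump(const struct msg_field_sketch *field, const void *data)
{
	assert(field != NULL);
	if (field->rmf_dumper == NULL)
		return 0;	/* no callback set: do not even try to dump */

	field->rmf_dumper(data);
	return 1;
}
```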

Signed-off-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Change-Id: Iaea16aaf799976d08ebb51322021cc879db1c6d8
Reviewed-on: http://review.whamcloud.com/8682
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4442 test: fix wrong usage of wait_mds_ost_sync() 96/8796/6
Emoly Liu [Wed, 1 Jan 2014 21:26:08 +0000 (05:26 +0800)]
LU-4442 test: fix wrong usage of wait_mds_ost_sync()

Fix the wrong usage of wait_mds_ost_sync() in replay_vbr.sh
test_7_cycle(). The first parameter should be a timeout in seconds
not a facet.

Test-Parameters: testlist=replay-vbr envdefinitions=ONLY=7
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I4e6de62049b473deeaf5c75e1136d76d67a02053
Reviewed-on: http://review.whamcloud.com/8796
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoRevert "LU-3319 procfs: move osp proc handling to seq_files" 31/8931/2
Oleg Drokin [Mon, 20 Jan 2014 23:10:06 +0000 (23:10 +0000)]
Revert "LU-3319 procfs: move osp proc handling to seq_files"

This seems to be causing issues like LU-45-13 and LU-4510
This reverts commit a97e4898ad9e0b65f457b01bdfa954f7d7cd272d.

Change-Id: I6066a255ded24dbdb76b4804e82a377f1069af5f
Reviewed-on: http://review.whamcloud.com/8931
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3680 ptlrpc: Fix assertion failure of null_alloc_rs() 00/8200/6
Patrick Farrell [Fri, 22 Nov 2013 16:47:54 +0000 (10:47 -0600)]
LU-3680 ptlrpc: Fix assertion failure of null_alloc_rs()

lustre_get_emerg_rs() set the size of the reply buffer to zero
by mistake, which causes an LBUG in null_alloc_rs() when memory
pressure is high. This patch fixes the problem and adds a size
check to guard against an insufficient buffer size.
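
As a rough illustration of such a size check, here is a user-space C sketch. The structure, the 512-byte capacity and emerg_reply_get() are hypothetical and do not mirror the real lustre_get_emerg_rs() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EMERG_BUF_SIZE 512	/* hypothetical capacity of the spare buffer */

/* Hypothetical preallocated emergency reply buffer. */
struct emerg_reply {
	size_t er_size;			/* size handed out to the caller */
	char   er_buf[EMERG_BUF_SIZE];
};

/*
 * Hand out the emergency buffer only if it can hold the reply.
 * A zero or oversized request is refused instead of being passed on.
 */
static struct emerg_reply *emerg_reply_get(struct emerg_reply *er,
					   size_t needed)
{
	if (needed == 0 || needed > sizeof(er->er_buf))
		return NULL;

	memset(er->er_buf, 0, sizeof(er->er_buf));
	er->er_size = needed;	/* record the real size, not zero */
	return er;
}
```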

Signed-off-by: Li Xi <lixi@ddn.com>
Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I9fbd4f14e8e1263de2af564c4f2e420f5f2b43bc
Reviewed-on: http://review.whamcloud.com/8200
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4416 mem: truncate_pagecache oldsize removed 00/8800/2
yangsheng [Fri, 10 Jan 2014 19:46:06 +0000 (03:46 +0800)]
LU-4416 mem: truncate_pagecache oldsize removed

truncate_pagecache() no longer needs the oldsize parameter.
In fact, oldsize is unused in every kernel we support, so
remove everything related to it.

Signed-off-by: yang sheng <yang.sheng@intel.com>
Change-Id: Iba9ccebb73e7e8df3179ef1d68507b7403b117a7
Reviewed-on: http://review.whamcloud.com/8800
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4353 strncmp: Replace incorrect strncmp()s with strcmp() 02/8702/2
Swapnil Pimpale [Thu, 2 Jan 2014 16:50:36 +0000 (22:20 +0530)]
LU-4353 strncmp: Replace incorrect strncmp()s with strcmp()

This patch fixes a few places where strncmp() is used but strcmp()
was meant. It also corrects calls to strncmp() in functions
mdd_declare_xattr_del() and mgs_setparam().
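
One common way strncmp() goes wrong is shown below. This is a generic sketch, not the code touched by the patch; both helper names and the sample strings are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Buggy pattern: only strlen(key) bytes are compared, so any name
 * that merely starts with key is reported as a match.
 */
static int name_matches_prefix_bug(const char *name, const char *key)
{
	return strncmp(name, key, strlen(key)) == 0;
}

/* Intended behaviour: the whole string has to match. */
static int name_matches_exact(const char *name, const char *key)
{
	return strcmp(name, key) == 0;
}
```

For name "user.foo_extra" and key "user.foo", the first helper reports a match and the second does not.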

Change-Id: I6f0fa4230bca10e8e7b310783bb89628a6eb788f
Signed-off-by: Swapnil Pimpale <spimpale@ddn.com>
Reviewed-on: http://review.whamcloud.com/8702
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4154 lfsck: skip old lfsck test in DNE mode 06/8206/6
Emoly Liu [Wed, 1 Jan 2014 11:35:25 +0000 (19:35 +0800)]
LU-4154 lfsck: skip old lfsck test in DNE mode

The old e2fsck/lfsck tool will not be allowed to run on a DNE
filesystem. This patch updates generate_db() to pass master MDS
parameters only, so that the old lfsck does not corrupt it or
delete all of the files on other MDTs.
This patch also fixes a typo in run_lfsck_remote().

Test-Parameters: mdscount=2 mdtcount=2 testlist=lfsck mdsfilesystemtype=ldiskfs mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: I28caaa5ca4a564aabbd6116325257c905ff22861
Reviewed-on: http://review.whamcloud.com/8206
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-3319 procfs: move osp proc handling to seq_files 29/8029/7
James Simmons [Fri, 3 Jan 2014 15:16:08 +0000 (10:16 -0500)]
LU-3319 procfs: move osp proc handling to seq_files

With Linux kernel 3.10 and above, proc handling uses only
struct seq_file. This patch migrates the osp
layer proc entries over to using seq_files.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: If58826e11524a5fffd2e491c1386e3795015bc7e
Reviewed-on: http://review.whamcloud.com/8029
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3319 procfs: update shared server side core proc handling to seq_files 33/7933/11
James Simmons [Thu, 2 Jan 2014 14:03:26 +0000 (09:03 -0500)]
LU-3319 procfs: update shared server side core proc handling to seq_files

Several of the server side abstract layers, such as mdt and mgs,
share common proc handling routines. This patch
adds the seq_file versions so that the stack can gradually
be ported over to these new methods.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I2dd64046fdd4d2bb6f7550bb49cf1c9ef703c157
Reviewed-on: http://review.whamcloud.com/7933
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2880 ldiskfs: Added mount option to enable dirdata. 95/6495/12
Manisha Salve [Fri, 10 Jan 2014 16:25:50 +0000 (11:25 -0500)]
LU-2880 ldiskfs: Added mount option to enable dirdata.

Added the code to set mount option for enabling the dirdata in
osd_mount(). This will set the dirdata option for lustre
filesystems only. The dirdata mount option would be checked in
get_dtype() to decide whether to pass dirdata flag as well while
reading the file type.

Signed-off-by: Manisha Salve <msalve@ddn.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I07f430c5cd7ad92b81746085b05b53c1202dd725
Reviewed-on: http://review.whamcloud.com/6495
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
10 years agoLU-3467 ofd: remove obsoleted OBD methods 80/8580/4
Mikhail Pershin [Sat, 14 Dec 2013 09:42:47 +0000 (13:42 +0400)]
LU-3467 ofd: remove obsoleted OBD methods

Remove unused OBD methods from OFD, keep those needed
for echo client and rename them to indicate that.

ofd_get_info() and ofd_set_info_async() are only needed
for local calls with single key.

Clean the usage of fti_fid, don't copy FID to it when
it is possible to point on FID in request buffers directly.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I5622e14bccacb809bca1c10499c23bcaf72e2a68
Reviewed-on: http://review.whamcloud.com/8580
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
10 years agoLU-4403 mds: extra lock during resend lock lookup. 80/8680/4
wang di [Mon, 30 Dec 2013 16:59:10 +0000 (08:59 -0800)]
LU-4403 mds: extra lock during resend lock lookup.

1. If the request does not require an open lock, the MDT does not
need to search for the lock in the exp_hash_lock list, because the MDT
will release the lock anyway, and it can always re-enqueue the
request during resend.

2. Lock the resource when the MDS is trying to look up the lock for
a resent request (in mdt_intent_fixup_resent). Otherwise, if the
lock is being released, fixup_resent will return a released lock,
which will cause an LBUG when this lock is released later.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: Ice5a75f9a4cc3bcb45f8d54cf809b4a272805804
Reviewed-on: http://review.whamcloud.com/8680
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4349 target: count disconnected exports as stale 85/8785/3
Mikhail Pershin [Thu, 9 Jan 2014 16:01:19 +0000 (20:01 +0400)]
LU-4349 target: count disconnected exports as stale

If an export is disconnected during recovery, the stale client
counter needs to be increased.

Test-Parameters: envdefinitions=SLOW=yes testlist=conf-sanity,conf-sanity,conf-sanity,conf-sanity,conf-sanity
Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Id4f2a668dbe8d5cd2fa3d392b7df7b4d67a4265a
Reviewed-on: http://review.whamcloud.com/8785
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3618 ptlrpc: rq_commit_cb is called for twice 15/8815/3
Liang Zhen [Sun, 12 Jan 2014 16:11:47 +0000 (00:11 +0800)]
LU-3618 ptlrpc: rq_commit_cb is called for twice

If a ptlrpc_request is already on imp::imp_replay_list when it is
replayed and replied to, after_reply() calls req::rq_commit_cb
for the request, and ptlrpc_free_committed() then calls it again.
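
One generic way to keep a callback from running twice is a per-request guard flag, sketched below. This is an assumption for illustration and not necessarily how the patch solves the problem; only the rq_commit_cb name comes from the commit message, and the other fields and functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified request. */
struct request_sketch {
	void (*rq_commit_cb)(struct request_sketch *req);
	int    rq_commit_cb_done;	/* hypothetical guard flag */
	int    rq_commits_seen;		/* counts callback invocations */
};

/* Example callback that only counts how often it ran. */
static void count_commit(struct request_sketch *req)
{
	req->rq_commits_seen++;
}

/* Run the commit callback at most once, whichever path gets here first. */
static void commit_cb_once(struct request_sketch *req)
{
	if (req->rq_commit_cb == NULL || req->rq_commit_cb_done)
		return;

	req->rq_commit_cb_done = 1;
	req->rq_commit_cb(req);
}
```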

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I796c3351ad896aa3e1d0c2147ca7f775b7c14bfc
Reviewed-on: http://review.whamcloud.com/8815
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3319 procfs: update nodemap proc handling to seq_files 16/8816/2
James Simmons [Sun, 12 Jan 2014 20:48:36 +0000 (15:48 -0500)]
LU-3319 procfs: update nodemap proc handling to seq_files

Migrate all nodemap proc handling to using strictly
seq_files.

Change-Id: I3dd338789de1cc266cdb37df7f4083b2cb9b6da5
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Reviewed-on: http://review.whamcloud.com/8816
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Joshua Walgenbach <jjw@iu.edu>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3319 procfs: move llite proc handling over to seq_file 90/7290/17
James Simmons [Fri, 10 Jan 2014 16:44:59 +0000 (11:44 -0500)]
LU-3319 procfs: move llite proc handling over to seq_file

Lustre clients have a special abstraction layer that allows a
Lustre client to be mounted. To support 3.10+ kernels, the proc
handling of this client code (the llite, vvp, and clio layers)
has been ported to use seq_files only.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Id2ac0956dbdf586ab1200e2edb00d489c15c5d50
Reviewed-on: http://review.whamcloud.com/7290
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-4193 ldiskfs: increase max journal to 4GB 11/8111/3
Andreas Dilger [Thu, 31 Oct 2013 08:06:06 +0000 (02:06 -0600)]
LU-4193 ldiskfs: increase max journal to 4GB

Increase the maximum journal size to 4GB to handle the increased
metadata operation rate possible since MDT SMP scaling is done.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I025b68f9f8b7c9d1084f704e2ebf8722753ebbe5
Reviewed-on: http://review.whamcloud.com/8111
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4444 tests: Skip conf-sanity/69 on zfs 41/8841/2
Nathaniel Clark [Tue, 14 Jan 2014 14:36:50 +0000 (09:36 -0500)]
LU-4444 tests: Skip conf-sanity/69 on zfs

Because file creates happen slowly on ZFS and the number of files
required to run the test is 100K, this test cannot run in a
reasonable amount of time.

Also bail out of test if createmany fails (possible if MDS or OST is
too small), this prevents the test from just timing out instead.

Test-Parameters: envdefinitions=SLOW=yes testlist=conf-sanity
Test-Parameters: envdefinitions=SLOW=yes testlist=conf-sanity mdsfilesystemtype=zfs mdtfilesystemtype=zfs ostfilesystemtype=zfs
Signed-off-by: Nathaniel Clark <nathaniel.l.clark@intel.com>
Change-Id: Ic8286c970332e0c53525e4d89e4c5f0e32cf57cb
Reviewed-on: http://review.whamcloud.com/8841
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Jian Yu <jian.yu@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-4384 target: don't set OBD_INCOMPAT_FID for OST 10/8810/2
Mikhail Pershin [Sat, 11 Jan 2014 20:21:13 +0000 (00:21 +0400)]
LU-4384 target: don't set OBD_INCOMPAT_FID for OST

This causes downgrade to fail, even though there is no actual
incompatibility.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: I71de76496441dd4627cf235166f866404edd6cec
Reviewed-on: http://review.whamcloud.com/8810
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
10 years agoLU-1267 lfsck: framework (1) for MDT-OST consistency 46/7146/25
Fan Yong [Thu, 9 Jan 2014 17:35:32 +0000 (01:35 +0800)]
LU-1267 lfsck: framework (1) for MDT-OST consistency

1) New LFSCK component - lfsck_layout, its data structure, tracing
   file, LFSCK APIs, internal shared functions, and so on.

2) Extend the existing LFSCK:
2.1) New flag - LF_INCOMPLETE, for the case where some target (MDT/OST)
     did not join the LFSCK or crashed during the LFSCK.
2.2) New status - LS_PARTIAL, corresponding to LF_INCOMPLETE above:
     if some target (MDT/OST) does not join the LFSCK processing or
     crashes during the LFSCK, the status is set to LS_PARTIAL when
     the LFSCK finishes.

3) Control the LFSCK speed by each component itself during the
   second-phase scanning.

4) Prevent lfsck engines from accessing a freed lfsck_instance.

5) Some code cleanup.
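
The relation between the LF_INCOMPLETE flag and the LS_PARTIAL status in item 2 can be sketched as follows. The numeric values, the _SKETCH suffixes and the helper are hypothetical; only the flag and status names come from the commit message.

```c
#include <assert.h>

/* Hypothetical value; the real flag is defined in the Lustre headers. */
#define LF_INCOMPLETE_SKETCH 0x1u

/* Hypothetical subset of the final states. */
enum lfsck_status_sketch {
	LS_COMPLETED_SKETCH,
	LS_PARTIAL_SKETCH,
};

/*
 * When a scan finishes, report a partial result if any target did not
 * join the LFSCK or crashed while it was running.
 */
static enum lfsck_status_sketch lfsck_final_status(unsigned int flags)
{
	if (flags & LF_INCOMPLETE_SKETCH)
		return LS_PARTIAL_SKETCH;
	return LS_COMPLETED_SKETCH;
}
```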

Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I6dc7bb881dc831f6c760be14aac2a066ad75ffec
Reviewed-on: http://review.whamcloud.com/7146
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4106 scrub: Trigger OI scrub properly 02/8002/15
Fan Yong [Mon, 13 Jan 2014 07:46:51 +0000 (15:46 +0800)]
LU-4106 scrub: Trigger OI scrub properly

There is the following race between osd_fid_lookup() and object
unlink/destroy:

Both RPC service thread_1 and RPC service thread_2 try to find the
same obj_A at the same time. At the beginning, obj_A is not in
cache. thread_1 is in osd_fid_lookup() and finds the OI mapping
for obj_A. But before thread_1 finds the related inode_A,
thread_2 moves faster, finds inode_A and unlinks it.
So thread_1 fails to find inode_A. In that case,
thread_1 checks the OI again to see whether the related OI
mapping is still there. If there is no OI mapping, this is normal
because someone has unlinked the file in a race; otherwise, it may be
caused by file-level backup/restore, and thread_1 triggers OI
scrub to rebuild the OI files.

But we ignored a corner case: thread_1 may recheck the OI files
just after thread_2 has dropped inode_A's reference count to
zero but before it removes the related OI mapping from the OI file.
Then thread_1 is misled and triggers OI scrub unexpectedly.

Also run more initial OI scrub on the /ROOT/.lustre directory to make
sure the necessary files/directories for mount are ready before they
are used.

This patch also enhances the ls_locate()/dt_locate_at() interface
to allow the caller to pass some hints to the lower layer, such as
the flag LOC_F_NEW for create, to help the lower layer handle the
request efficiently and properly.

Test-Parameters: mdtcount=4 testlist=sanity-scrub
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: Ia0fa198fc4fa31056f6a32a6d3e75cf905832cd1
Reviewed-on: http://review.whamcloud.com/8002
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-4263 osd-zfs: Avoid converting last ID FIDs to OST IDs 01/8301/4
Li Wei [Fri, 30 Aug 2013 07:12:40 +0000 (15:12 +0800)]
LU-4263 osd-zfs: Avoid converting last ID FIDs to OST IDs

When obdfilter-survey first creates an object on a fresh ZFS OST, the
last ID object for FID_SEQ_ECHO has to be created in the first place.
The last ID FID, [FID_SEQ_ECHO:0:0], can not be converted to an OST ID
because the resulting OST ID would be indistinguishable from an
FID_SEQ_OST_MDT0 OST ID and would confuse ostid_id().  This patch
checks for last ID FIDs before converting them to OST IDs in
osd_get_idx_for_ost_obj().

Change-Id: I96cdf85b4725e4882cecabaf90466c7f77a5e0a6
Intel-bug-id: FF-182
Signed-off-by: Li Wei <wei.g.li@intel.com>
Reviewed-on: http://review.whamcloud.com/8301
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4454 libcfs: warn if all HTs in a core are gone 70/8770/3
Liang Zhen [Wed, 8 Jan 2014 06:51:17 +0000 (14:51 +0800)]
LU-4454 libcfs: warn if all HTs in a core are gone

libcfs CPU partitions do not support CPU hotplug, but it is safe
to plug in a new CPU or to enable/disable hyper-threading.
There is a potential risk only when a CPU is unplugged, because that
may break the CPU affinity of Lustre threads.

Currently libcfs prints a warning for every CPU notification. This
patch changes that behavior and only prints a warning when all
HTs in a CPU core are lost, which may have broken the affinity of
Lustre threads.
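
The "all HTs in a core are gone" condition can be expressed with plain bitmasks. This is a user-space sketch under the assumption that CPUs are numbered 0 to 63; the kernel code works on cpumasks, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * core_siblings: bit i is set if CPU i is a hyper-thread of the core.
 * online:        bit i is set if CPU i is currently online.
 * Returns 1 only when no thread of the core is left online, which is
 * the case that deserves a warning.
 */
static int core_lost_all_threads(uint64_t core_siblings, uint64_t online)
{
	return (core_siblings & online) == 0;
}
```

Taking one of two hyper-threads offline leaves the core reachable, so no warning is needed; losing both does warrant one.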

Signed-off-by: Liang Zhen <liang.zhen@intel.com>
Change-Id: I62267b62871c129beeb1593c4f69e7b81a79999d
Reviewed-on: http://review.whamcloud.com/8770
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-4372 gss: Compatibility cache_register_net 2.6.x/3.3 kernel 28/8528/6
Thomas Stibor [Mon, 30 Dec 2013 19:00:32 +0000 (14:00 -0500)]
LU-4372 gss: Compatibility cache_register_net 2.6.x/3.3 kernel

Since 3.4 cache_register/cache_unregister are removed.
This patch provides a compatibility function to support
kernels 2.6.x to 3.3. Note, since 2.6.37
cache_register_net/cache_unregister_net are defined,
but not exported (no EXPORT_SYMBOL) and thus cannot
be loaded by the module loader. In 3.3
cache_register_net/cache_unregister_net
are exported. This patch is related to LU-4012.

Signed-off-by: Thomas Stibor <thomas@stibor.net>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Iaee5aa7e60e1bd08735c345f413a2344c2850f57
Reviewed-on: http://review.whamcloud.com/8528
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
10 years agoLU-3834 mdt: handle swap_layouts failures during restore 31/7631/14
Bruno Faccini [Tue, 10 Dec 2013 09:55:59 +0000 (10:55 +0100)]
LU-3834 mdt: handle swap_layouts failures during restore

Currently nothing is done after a swap_layouts failure during restore,
which can leave the file in an incoherent state and thus
unavailable, because HS_RELEASED is cleared but LOV_PATTERN_F_RELEASED
is still set.
This patch allows the original layout to be recovered by using the
SWAP_LAYOUTS_MDS_HSM flag. Additionally, this requires the HSM xattr of
the data FID to be set.
It also adds layout-swap failure injection and a related test.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: Id0e9a005362e4a3854b33f6ce1888197d20e7dbf
Reviewed-on: http://review.whamcloud.com/7631
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Aurelien Degremont <aurelien.degremont@cea.fr>
Reviewed-by: jacques-Charles Lafoucriere <jacques-charles.lafoucriere@cea.fr>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3963 libcfs: convert DT objects atomic primitives 76/7076/7
Peng Tao [Tue, 17 Dec 2013 19:55:13 +0000 (14:55 -0500)]
LU-3963 libcfs: convert DT objects atomic primitives

This patch converts all cfs_atomic primitives in
ofd, osc, osd-ldiskfs and osd-zfs to the linux
atomic api.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I235cd45503115a936cf502e5469daf806cf16078
Reviewed-on: http://review.whamcloud.com/7076
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3963 libcfs: convert md[d/t]/mg[c/s] to linux atomic primitives 74/7074/10
Peng Tao [Tue, 31 Dec 2013 15:22:23 +0000 (10:22 -0500)]
LU-3963 libcfs: convert md[d/t]/mg[c/s] to linux atomic primitives

This patch converts all cfs_atomic primitives in
mdd, mdt, mgc and mgs.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I052ca4da80b0cbd36df02a50f2ae0651eec28ea8
Reviewed-on: http://review.whamcloud.com/7074
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
10 years agoLU-3963 libcfs: convert include/lclient/ldlm/lfsck cfs_atomic 72/7072/9
James Simmons [Fri, 10 Jan 2014 17:02:12 +0000 (12:02 -0500)]
LU-3963 libcfs: convert include/lclient/ldlm/lfsck cfs_atomic

This patch converts all cfs_atomic primitives in
lustre/include, lclient, ldlm and lfsck directories.

Signed-off-by: Liu Xuezhao <xuezhao.liu@emc.com>
Signed-off-by: Peng Tao <tao.peng@emc.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ie746993b917bd6ea8c2403a47488ef0e5a06d6fb
Reviewed-on: http://review.whamcloud.com/7072
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2675 lov: remove unused lov obd functions 81/5581/9
John L. Hammond [Wed, 11 Dec 2013 17:28:24 +0000 (11:28 -0600)]
LU-2675 lov: remove unused lov obd functions

Remove the unused liblustre functions llu_extent_lock,
llu_extent_unlock, and llu_glimpse_callback.

Remove the unused lov functions lov_getattr, lov_setattr, lov_brw,
lov_merge_lvb, lov_punch, lov_sync, lov_enqueue, lov_cancel,
lov_cancel_unused, lov_extent_calc, lov_setstripe, lov_setea,
lov_find_pool, and supporting functions.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I59daa224cfedc9e105a3ebe4d3bacecd5a9aa739
Reviewed-on: http://review.whamcloud.com/5581
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-3529 osp: init FID client for OSP on MDT. 58/7158/19
wang di [Mon, 12 May 2014 05:33:17 +0000 (22:33 -0700)]
LU-3529 osp: init FID client for OSP on MDT.

1. Initialize FID client for OSP on MDT.
2. Pull out OSP precreate stuff into a separate structure,
so OSP on MDT does not have to allocate precreate.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I3c4a87e2fbea2a733fc6a4a296a50ba67652aa32
Reviewed-on: http://review.whamcloud.com/7158
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3772 ptlrpc: fix nrs cleanup 10/7410/4
Niu Yawei [Thu, 14 Nov 2013 04:48:00 +0000 (23:48 -0500)]
LU-3772 ptlrpc: fix nrs cleanup

When a service fails to start because memory is short, the cleanup code
could operate on an uninitialized structure and eventually crash.

This patch fixes nrs_svcpt_cleanup_locked() to perform cleanup only
on the nrs which has been properly initialized.
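
The "clean up only what was initialized" rule can be sketched in user-space C as below. The structure and both functions are hypothetical and much simpler than the real nrs_svcpt_cleanup_locked() code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified scheduler head. */
struct nrs_sketch {
	int  initialized;	/* set only after a successful init */
	int *queue;
};

static int nrs_sketch_init(struct nrs_sketch *nrs, size_t nr)
{
	nrs->queue = calloc(nr, sizeof(*nrs->queue));
	if (nrs->queue == NULL)
		return -1;	/* allocation failed: stays uninitialized */

	nrs->initialized = 1;
	return 0;
}

/* Returns 1 if resources were released, 0 if there was nothing to do. */
static int nrs_sketch_cleanup(struct nrs_sketch *nrs)
{
	if (!nrs->initialized)
		return 0;	/* never set up: leave it untouched */

	free(nrs->queue);
	nrs->queue = NULL;
	nrs->initialized = 0;
	return 1;
}
```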

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ieafa5b144133490b662f5a80a7b99311a9970de3
Reviewed-on: http://review.whamcloud.com/7410
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4314 build: no dependency from ldiskfs sources 04/8404/5
Dmitry Eremin [Tue, 26 Nov 2013 20:18:53 +0000 (00:18 +0400)]
LU-4314 build: no dependency from ldiskfs sources

Add the missing dependency of the modules on the ldiskfs sources.
Otherwise, in a parallel build (with the -j <n> option), the module
build can begin before the sources are prepared in the ldiskfs
directory.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I24d3630d9132235a2dc5433eb5b8d3379c04acaa
Reviewed-on: http://review.whamcloud.com/8404
Tested-by: Jenkins
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Christopher J. Morrone <chris.morrone.llnl@gmail.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3289 gss: gssnull security flavor 75/8475/10
Andrew Korty [Tue, 3 Dec 2013 20:07:13 +0000 (12:07 -0800)]
LU-3289 gss: gssnull security flavor

This change implements the gssnull security flavor for the purpose of
testing the Lustre GSS code.  It provides and uses a null GSS
mechanism so this testing doesn't have to involve any code related to
Kerberos or any other authentication method.

Signed-off-by: Andrew Korty <ajk@iu.edu>
Change-Id: Ic8378a052fd2a0f5a84877476a4a29aef7b0412a
Reviewed-on: http://review.whamcloud.com/8475
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Thomas Stibor <thomas@stibor.net>
10 years agoLU-4299 kernel: kernel update [SLES11 SP3 3.0.101-0.8] 62/8762/2
Bob Glossman [Mon, 6 Jan 2014 23:13:37 +0000 (15:13 -0800)]
LU-4299 kernel: kernel update [SLES11 SP3 3.0.101-0.8]

update target and config files for new kernel version

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Change-Id: Ie8d7078768129bf19ce7dfd129ba509389d8a232
Reviewed-on: http://review.whamcloud.com/8762
Tested-by: Jenkins
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4360 Fix use after free in ksocknal_send 67/8667/3
Oleg Drokin [Sat, 28 Dec 2013 03:31:15 +0000 (22:31 -0500)]
LU-4360 Fix use after free in ksocknal_send

A call to ksocknal_launch_packet might schedule a callback that
might free the just-sent message, so a subsequent access to it
via lntmsg->msg_vmflush touches freed memory.

Instead we'll just remember whether we are in the vmflush thread and
only restore if we happened to set the mempressure flag.

Change-Id: I2f0f8b27e26e11b37ad60fde4c98e86c39768349
Signed-off-by: Oleg Drokin <oleg.drokin@intel.com>
Reviewed-on: http://review.whamcloud.com/8667
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Amir Shehata <amir.shehata@intel.com>
10 years agoNew tag 2.5.54 2.5.54 v2_5_54 v2_5_54_0
Oleg Drokin [Sat, 11 Jan 2014 05:42:07 +0000 (00:42 -0500)]
New tag 2.5.54

Change-Id: I6471a15e07373c2cc2c023840ddb31a492eb9d4c

10 years agoLU-2753 lvfs: cleanup lvfs.h and collateral 60/6660/6
John L. Hammond [Mon, 30 Dec 2013 16:05:36 +0000 (11:05 -0500)]
LU-2753 lvfs: cleanup lvfs.h and collateral

Remove the unused struct lvfs_ucred. Remove the ucred pointer argument
from {push,pop}_ctxt() to which all callers passed NULL. Remove the now
unused functions {push,pop}_group_info(). Delete lvfs_linux.h. Remove
the cb_ops member from lvfs_run_ctxt which was read but never set.
Remove the unused functions {l,ll}_dentry_open(). Reduce the number
of gratuitous includes of lvfs.h.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Ia7f61e4c5a6af73381a39740dc76367655d18985
Reviewed-on: http://review.whamcloud.com/6660
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Mike Pershin <mike.pershin@intel.com>
10 years agoLU-4237 lfs: Include lfs mkdir in lfs man page 21/8221/5
Patrick Farrell [Thu, 2 Jan 2014 15:44:25 +0000 (09:44 -0600)]
LU-4237 lfs: Include lfs mkdir in lfs man page

This patch adds the usage information for lfs mkdir to the lfs
man page.

Signed-off-by: Patrick Farrell <paf@cray.com>
Change-Id: I33a7ad1f48d2a14ad6b0c212a38140622f47c8a7
Reviewed-on: http://review.whamcloud.com/8221
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
10 years agoLU-1422 lnet: eliminate obsolete Cray SeaStar support 69/7469/8
James Simmons [Tue, 31 Dec 2013 15:59:38 +0000 (10:59 -0500)]
LU-1422 lnet: eliminate obsolete Cray SeaStar support

Remove the bulk of code for the no longer supported
SeaStar interconnect found on older Cray systems.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I29d07df9e7a5d33a700f7c9a14a49a9b3bf61dbe
Reviewed-on: http://review.whamcloud.com/7469
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Liang Zhen <liang.zhen@intel.com>
Reviewed-by: Doug Oucharek <doug.s.oucharek@intel.com>
Reviewed-by: Chuck Fossen <chuckf@cray.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-946 lprocfs: List open files in filesystem 86/6386/22
Girish Shilamkar [Sun, 19 May 2013 08:27:00 +0000 (16:27 +0800)]
LU-946 lprocfs: List open files in filesystem

Add an lprocfs file on the MDT that lists open files in the
per-export directory.

Test-Parameters: testlist=sanity,sanityn
Signed-off-by: Girish Shilamkar <gshilamkar@ddn.com>
Signed-off-by: Emoly Liu <emoly.liu@intel.com>
Change-Id: If8f233d95dca4cd4c4044d85bd117a027dabd80e
Reviewed-on: http://review.whamcloud.com/6386
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Swapnil Pimpale <spimpale@ddn.com>
10 years agoLU-1095 debug: clean up console messages 17/8617/2
Andreas Dilger [Fri, 10 May 2013 04:52:01 +0000 (22:52 -0600)]
LU-1095 debug: clean up console messages

Clean up overly verbose console error messages, improve others.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I480d61dd6febb81ca58709ff939a7807bc3ebbe5
Reviewed-on: http://review.whamcloud.com/8617
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4293 utils: handle lfs migrate failure in lfs_migrate 16/8616/2
Andreas Dilger [Wed, 18 Dec 2013 08:50:56 +0000 (01:50 -0700)]
LU-4293 utils: handle lfs migrate failure in lfs_migrate

If "lfs migrate" returns an error, possibly because it is refusing
to migrate an IGIF FID, fall back to using rsync to copy the file
and rename it.  Print a message in this case so the user knows it
is not a fatal error yet.
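
The fallback logic described above can be sketched like this. It is a hedged illustration rather than the lfs_migrate script itself: migrate_one() is a hypothetical helper, and the two callables stand in for running "lfs migrate" and for the rsync copy plus rename, each returning an exit code.

```python
def migrate_one(path, lfs_migrate, rsync_and_rename, log=print):
    """Try the native migration first, then fall back to copy and rename.

    lfs_migrate and rsync_and_rename are caller-supplied callables
    (assumptions for this sketch) that return 0 on success.
    """
    rc = lfs_migrate(path)
    if rc == 0:
        return "migrated"
    # Not fatal yet: tell the user and try the slower path.
    log(f"{path}: lfs migrate failed (rc={rc}), falling back to rsync")
    rc = rsync_and_rename(path)
    return "copied" if rc == 0 else "failed"
```

Only a failure of both steps is reported as an error for the file.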

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I114006afb93d8c8d78923a874f3b914200500c1e
Reviewed-on: http://review.whamcloud.com/8616
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4429 llite: fix open lock matching in ll_md_blocking_ast() 18/8718/2
John L. Hammond [Fri, 3 Jan 2014 23:31:53 +0000 (17:31 -0600)]
LU-4429 llite: fix open lock matching in ll_md_blocking_ast()

In ll_md_blocking_ast() match open locks before all others, ensuring
that MDS_INODELOCK_OPEN is not cleared from bits by another open lock
with a different mode. Change the int flags parameter of
ll_md_real_close() to fmode_t fmode. Clean up various style issues in
both functions.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I790cbbdbab75b25016c938b5f6340b20e09fc82e
Reviewed-on: http://review.whamcloud.com/8718
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Niu Yawei <yawei.niu@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4405 mdc: use ibits_known mask for lock match 36/8636/5
Alexey Lyashkov [Thu, 2 Jan 2014 22:03:26 +0000 (16:03 -0600)]
LU-4405 mdc: use ibits_known mask for lock match

Before revalidating a lock on the client, mask the lock bits against
the lock bits supported by the server (ibits_known), so newer clients
will find valid locks given by older server versions.
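
A small sketch of the masking idea, under stated assumptions: the bit values below are illustrative constants, and lock_matches() is a hypothetical helper rather than the MDC code. It shows why a newer client that asks for a bit the server does not know would otherwise never find a matching lock.

```python
# Illustrative inode lock bits (values are assumptions for this sketch).
MDS_INODELOCK_LOOKUP = 0x01
MDS_INODELOCK_UPDATE = 0x02
MDS_INODELOCK_OPEN = 0x04
MDS_INODELOCK_LAYOUT = 0x08


def lock_matches(lock_bits, wanted, ibits_known):
    """Match a cached lock after dropping bits the server cannot grant."""
    need = wanted & ibits_known
    return need != 0 and (lock_bits & need) == need


# An older server that knows nothing about the LAYOUT bit.
OLD_SERVER = MDS_INODELOCK_LOOKUP | MDS_INODELOCK_UPDATE | MDS_INODELOCK_OPEN
ALL_BITS = OLD_SERVER | MDS_INODELOCK_LAYOUT
```

A lock granted with only the UPDATE bit then satisfies a request for UPDATE and LAYOUT against the old server, but not against a server that knows all four bits.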

Xyratex-bug-id: MRP-1583

Signed-off-by: Patrick Farrell <paf@cray.com>
Signed-off-by: Alexey Lyashkov <alexey_lyashkov@xyratex.com>
Change-Id: I359b87e4bdc7b930e51538a4a854c47e87dd0520
Reviewed-on: http://review.whamcloud.com/8636
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2524 test: Modify tdir to be single directory 23/8123/5
James Nunez [Sat, 21 Dec 2013 03:13:19 +0000 (20:13 -0700)]
LU-2524 test: Modify tdir to be single directory

Currently, the tdir variable is a directory with a subdirectory.
This requires the '-p' option when calling mkdir. We've made tdir
be a single directory so calls to mkdir and test_mkdir do not
require the '-p' option in most cases.

tdir was changed from d0.${TESTSUITE}/d${base} to
d${testnum}.${TESTSUITE} and tfile was changed from
f.${TESTSUITE}.${testnum} to f${testnum}.${TESTSUITE}. Now tdir and
tfile are consistent in their format and the call to remove files
and directories at the beginning of many test scripts will remove
these files and directories.
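
The naming change can be illustrated with a short sketch. The real definitions live in the shell test framework; old_names() and new_names() are hypothetical helpers that only reproduce the string formats quoted above.

```python
def old_names(testsuite, testnum, base):
    """Formats before the patch: tdir is a directory inside a directory."""
    tdir = f"d0.{testsuite}/d{base}"
    tfile = f"f.{testsuite}.{testnum}"
    return tdir, tfile


def new_names(testsuite, testnum):
    """Formats after the patch: tdir is a single path component."""
    tdir = f"d{testnum}.{testsuite}"
    tfile = f"f{testnum}.{testsuite}"
    return tdir, tfile
```

Because the new tdir contains no slash, a plain mkdir is enough, and both names share the same prefix-plus-suite layout.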

Once this patch lands, we can remove the "-p" option from many
of the calls to mkdir.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ib49d7102a49ff6b5f3ec539a5b2f2f5186231a04
Reviewed-on: http://review.whamcloud.com/8123
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
10 years agoLU-3558 ptlrpc: Add the NRS TBF policy 01/6901/20
Li Xi [Sun, 3 Nov 2013 17:49:54 +0000 (09:49 -0800)]
LU-3558 ptlrpc: Add the NRS TBF policy

The TBF (Token Bucket Filter) policy schedules and throttles all
types of RPCs for traffic control purposes. It divides RPCs into
different types according to their NIDs or job IDs, and enforces
an RPC rate limit on every type. The handling of an RPC is delayed
until there are enough tokens for its type. Different types are
scheduled according to their deadlines, so that none of them starves
even when the service cannot satisfy the RPC rate requirements of
all types. RPCs of the same type are queued in FIFO order.
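
The scheduling idea can be sketched as below. This is a simplified model under stated assumptions, not the ptlrpc implementation: TokenBucket, next_rpc(), the depth parameter, and the use of plain floats for time are all choices made for the example.

```python
from collections import deque


class TokenBucket:
    """One bucket per RPC type (for example one NID or one job ID)."""

    def __init__(self, rate, depth=1):
        self.rate = rate            # tokens added per second
        self.depth = depth          # maximum tokens the bucket can hold
        self.tokens = float(depth)
        self.last = 0.0             # time of the last refill
        self.queue = deque()        # pending RPCs of this type, FIFO

    def refill(self, now):
        gained = (now - self.last) * self.rate
        self.tokens = min(self.depth, self.tokens + gained)
        self.last = now

    def deadline(self, now):
        """Earliest time at which the head RPC has a token available."""
        self.refill(now)
        if self.tokens >= 1.0:
            return now
        return now + (1.0 - self.tokens) / self.rate


def next_rpc(buckets, now):
    """Serve the type with the earliest deadline; None if none is ready."""
    pending = [b for b in buckets if b.queue]
    if not pending:
        return None
    best = min(pending, key=lambda b: b.deadline(now))
    if best.deadline(now) > now:
        return None  # every type is throttled; the caller should wait
    best.tokens -= 1.0
    return best.queue.popleft()
```

Choosing by deadline rather than by arrival order is what keeps a low-rate type from being starved by a busy one.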

Signed-off-by: Li Xi <lixi@ddn.com>
Change-Id: I3f73dfbfb451cc44dfe5e0a575ec7ab5b90ac47e
Reviewed-on: http://review.whamcloud.com/6901
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Tested-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3527 nodemap: add nodemap kernel module 34/8034/16
Joshua Walgenbach [Thu, 28 Nov 2013 20:18:53 +0000 (21:18 +0100)]
LU-3527 nodemap: add nodemap kernel module

The nodemap kernel module manages groups of mounted clients
and allows policies to be applied to them. A nodemap will be
defined as a nodemap name, an ID, a tree of NID ranges that
represents the clients that are included in the nodemap, and
values that represent a policy for the behavior of the
filesystem with respect to those clients.

Initially, the policies implemented for nodemap will be static
UID/GID mapping from client IDs to canonical filesystem IDs.
Additional flags are provided to allow the filesystem to trust
the IDs from the client (apply no mapping) and allow UID 0
(rootsquash applied to an entire nodemap).

A default nodemap is automatically provided which implicitly
contains all the clients not otherwise specified by an NID
range.

Nodemap will allow the unmapped UID/GIDs to be specified on a
case by case basis.
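
The concepts above can be sketched as a small data model. Everything here is an assumption for illustration: NIDs are plain integers, the range tree is a list scanned linearly, and the field names (trusted, allow_root, squash_uid) and the squash value 99 are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Nodemap:
    name: str
    nm_id: int
    ranges: list = field(default_factory=list)    # (first, last) NIDs, inclusive
    uid_map: dict = field(default_factory=dict)   # client UID -> filesystem UID
    trusted: bool = False      # True: trust client IDs, apply no mapping
    allow_root: bool = False   # False: UID 0 is squashed for the whole nodemap
    squash_uid: int = 99       # ID used for unmapped UIDs

    def contains(self, nid):
        return any(lo <= nid <= hi for lo, hi in self.ranges)

    def map_uid(self, client_uid):
        if self.trusted:
            return client_uid
        if client_uid == 0:
            return 0 if self.allow_root else self.squash_uid
        return self.uid_map.get(client_uid, self.squash_uid)


# Implicitly holds every client that no explicit NID range claims.
DEFAULT = Nodemap("default", 0)


def classify(nodemaps, nid):
    """Return the nodemap whose ranges include nid, else the default."""
    for nm in nodemaps:
        if nm.contains(nid):
            return nm
    return DEFAULT
```

A client is classified once by NID, and every ID it presents is then translated through the nodemap it landed in.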

Nodemaps will be managed on the MGS, and configurations pushed
out to the other servers in the filesystem.

This patch adds nodemap to the build system, and defines the
basic operation for managing the addition and deletion of
the nodemap definitions.

The ioctl for management of the nodemaps from userspace was
added. The commands for adding and removing nodemaps were
added to lctl, and the handlers in the mgs.

The data structures for the module were added in
lustre/include/lustre_nodemap.

Additions were made to the make system and the m4 macros
to build nodemap with the rest of lustre.

Unit tests for adding and removing nodemaps were added
to sanity-sec.sh.

Range and idmap management functions will be added in the
two subsequent patches.

Signed-off-by: Joshua Walgenbach <jjw@iu.edu>
Change-Id: I2d34fde7aae3c2b4af512ff2c1ace5115ed40a6a
Reviewed-on: http://review.whamcloud.com/8034
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Andrew Perepechko <andrew_perepechko@xyratex.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: James Simmons <uja.ornl@gmail.com>
10 years agoLU-3569 ofd: packing ost_idx in IDIF 53/7053/10
Fan Yong [Sun, 24 Nov 2013 08:59:00 +0000 (16:59 +0800)]
LU-3569 ofd: packing ost_idx in IDIF

For a normal FID, we can find out on which target the related object
is allocated by querying the FLDB, but that is not true for an IDIF.

To locate the OST via the given IDIF, when the IDIF is generated,
we pack the OST index in it. Then for any given FID, whether or not
it is a normal FID, we have a way to know which target
it belongs to. That is useful for LFSCK.
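
Packing an OST index into an IDIF can be sketched with plain bit operations. The layout used here (a 16-bit OST index and the top 16 bits of a 48-bit object ID stored in the sequence, the low 32 bits in the object ID field) and the FID_SEQ_IDIF base value are assumptions for illustration; the helper names are hypothetical.

```python
FID_SEQ_IDIF = 0x100000000  # assumed base of the IDIF sequence range


def idif_pack(ost_idx, objid):
    """Return (seq, oid) for a 48-bit object ID on the given OST."""
    assert 0 <= ost_idx < (1 << 16)
    assert 0 <= objid < (1 << 48)
    seq = FID_SEQ_IDIF | (ost_idx << 16) | (objid >> 32)
    oid = objid & 0xFFFFFFFF
    return seq, oid


def idif_ost_idx(seq):
    """Recover the OST index from the sequence, with no FLDB lookup."""
    return (seq >> 16) & 0xFFFF


def idif_objid(seq, oid):
    """Recover the original 48-bit object ID."""
    return ((seq & 0xFFFF) << 32) | oid
```

The target is then derived from the identifier itself, which is the property the text says LFSCK relies on.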

For an old IDIF, the OST index is not part of the IDIF, which means
that different OSTs may have the same IDIFs, so the IFID in the LMA
may not match the read FID. In such a case, we need to make some
compatibility checks to avoid triggering unexpected behavior.

tgt_validate_obdo() converts the ostid contained in the RPC body
to fid and changes the "struct ost_id" union, then the users can
access ost_id::oi_fid directly without calling ostid_to_fid() again.

This patch also contains some other fixes and cleanup.

Test-Parameters: testlist=sanity-scrub
Signed-off-by: wang di <di.wang@intel.com>
Signed-off-by: Fan Yong <fan.yong@intel.com>
Change-Id: I228f2f6cd9310193a1724046cee15e3b2103c8e2
Reviewed-on: http://review.whamcloud.com/7053
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Alex Zhuravlev <alexey.zhuravlev@intel.com>
10 years agoLU-3467 target: generic hpreq handler in target 83/7383/41
Mikhail Pershin [Sun, 18 Aug 2013 12:53:24 +0000 (16:53 +0400)]
LU-3467 target: generic hpreq handler in target

Make high-priority request handling generic. Each request handler
may now initialize not only a generic handler but also a high-priority
handler. Move the OST-specific hp callbacks to the OFD.

Remove rq_recovery_session from ptlrpc_request and always use
rq_session. The additional session was needed when the recovery request
was copied, so the normal session might be freed. Now the request
is referenced rather than copied, so a single session is enough.

Signed-off-by: Mikhail Pershin <mike.pershin@intel.com>
Change-Id: Iabf36d0828a86974bfe0638957f6018c919ac13b
Reviewed-on: http://review.whamcloud.com/7383
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
10 years agoLU-3939 tests: sanity-hsm/test_40 needs a local HSM_ARCHIVE 03/7703/7
Bruno Faccini [Wed, 2 Oct 2013 13:39:38 +0000 (15:39 +0200)]
LU-3939 tests: sanity-hsm/test_40 needs a local HSM_ARCHIVE

sanity-hsm/test_40 suffers frequent failures during auto-test because
a remote/NFS-mounted HSM_ARCHIVE causes the 400 archive requests
to take more than 100s to drain from the copytool request queue.
This patch lets the copytool_setup function accept a non-default
hsm-root/HSM_ARCHIVE dir from each sub-test, and test_40 uses it.
Also remove test_40 from the exception list.

Signed-off-by: Bruno Faccini <bruno.faccini@intel.com>
Change-Id: I00e7df7b3cc5530cb96177a61ca6e07f1c784297
Reviewed-on: http://review.whamcloud.com/7703
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Reviewed-by: Jian Yu <jian.yu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4430 mdt: check for MDS_FMODE_EXEC in mdt_mfd_open() 19/8719/2
John L. Hammond [Fri, 3 Jan 2014 23:42:08 +0000 (17:42 -0600)]
LU-4430 mdt: check for MDS_FMODE_EXEC in mdt_mfd_open()

In the error path of mdt_mfd_open() check for MDS_FMODE_EXEC rather
than FMODE_EXEC in the open flags.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I04c53eb1af0fdeeb2c2b0c2f2ef1340b247921d8
Reviewed-on: http://review.whamcloud.com/8719
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3531 llite: move dir cache to MDC layer 43/7043/26
wang di [Thu, 21 Nov 2013 08:00:04 +0000 (00:00 -0800)]
LU-3531 llite: move dir cache to MDC layer

Move the directory entry cache from llite to MDC, so that client-side
dir striping uses an independent hash function (in LMV),
which does not need to be tightly coupled with the backend
storage dir-entry hash function. With a striped directory, the
hash is 2-tier: LMV calculates the hash value according to the
name and the hash type in the layout, then each MDT stores these
entries on disk using its own hash.
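
The two tiers can be sketched as two unrelated hash functions. This is only an illustration: CRC32 and SHA-256 are stand-ins chosen because they are in the Python standard library, not the hash types Lustre uses, and the function names are hypothetical.

```python
import hashlib
import zlib


def lmv_pick_stripe(name, stripe_count):
    """Tier 1 (client, LMV): map an entry name to a stripe index."""
    return zlib.crc32(name.encode()) % stripe_count


def mdt_entry_hash(name):
    """Tier 2 (server, MDT): the backend's own hash for on-disk ordering."""
    digest = hashlib.sha256(name.encode()).digest()
    return int.from_bytes(digest[:4], "big")


def place(name, stripe_count):
    """Return (stripe index, backend hash) for one directory entry."""
    return lmv_pick_stripe(name, stripe_count), mdt_entry_hash(name)
```

Since the first tier never looks at the second, the backend hash can change without affecting which stripe a name belongs to.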

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I14bb6bd81aad6fd59dcc22cf4bcea9d341dca2a1
Reviewed-on: http://review.whamcloud.com/7043
Tested-by: Jenkins
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4222 mdt: extra checking for getattr RPC. 99/8599/5
wang di [Wed, 18 Dec 2013 08:01:45 +0000 (00:01 -0800)]
LU-4222 mdt: extra checking for getattr RPC.

Check whether the getattr RPC can hold the layout MD (RMF_MDT_MD),
in case the client sends an invalid RPC, which can
cause a panic on the MDT.

Client will retrieve cl_max_md_size/cl_default_md_size
from MDS during mount process, so it will initialize
cl_max_md_size/cl_default_md_size before sending getattr
to MDS.

Signed-off-by: wang di <di.wang@intel.com>
Change-Id: I43bbe54c37360242bb7a3cd2aa8d90c2b9e0baf1
Reviewed-on: http://review.whamcloud.com/8599
Reviewed-by: Fan Yong <fan.yong@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
10 years agoLU-4343 tests: mkdir failing in sanity-hsm test 228 42/8542/4
James Nunez [Wed, 11 Dec 2013 16:50:21 +0000 (09:50 -0700)]
LU-4343 tests: mkdir failing in sanity-hsm test 228

sanity-hsm test 228 calls mkdir on $tdir. Currently, the tdir
variable is two directories. This is changed in LU-2524. Until
LU-2524 lands, any call to mkdir with the tdir variable needs
the "-p" flag.

Also added removal of two files that the test creates and a new
routine to create small files with dd using the sync flag.

Signed-off-by: James Nunez <james.a.nunez@intel.com>
Change-Id: Ic78a14625d8c45f405c013b96c84da909b9b9244
Reviewed-on: http://review.whamcloud.com/8542
Reviewed-by: John L. Hammond <john.hammond@intel.com>
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Faccini Bruno <bruno.faccini@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4318 lbuild: build failed if kernel source in command line 21/8421/3
Dmitry Eremin [Wed, 27 Nov 2013 19:24:54 +0000 (23:24 +0400)]
LU-4318 lbuild: build failed if kernel source in command line

For OFED and ZFS builds the function $(find_linux_release) is used.
But it is defined in $LBUILD_DIR/lbuild-$DISTRO, which is not included
when the kernel source tree is given on the command
line.

Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com>
Change-Id: I5f7ae64f560c76e745dc7314465c7fa7d9ec275f
Reviewed-on: http://review.whamcloud.com/8421
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-2675 lod: remove lov and lod stuff from obd.h 87/8687/2
John L. Hammond [Tue, 10 Dec 2013 21:21:01 +0000 (15:21 -0600)]
LU-2675 lod: remove lov and lod stuff from obd.h

Move QOS related data structures from obd.h to
lod_internal.h. Remove the unused functions lov_stripe_md_cmp() and
lov_lum_lsm_cmp(). Remove the declarations of several functions that
no longer exist. Move lov_lum_swab_if_needed() to the one file that
uses it.

Signed-off-by: John L. Hammond <john.hammond@intel.com>
Change-Id: I59874eda50aa333e5e991090fa3ac538ff8dc0f3
Reviewed-on: http://review.whamcloud.com/8687
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com>
10 years agoLU-2850 ptlrpc: handle sunrpc_cache_pipe_upcall change 96/8396/3
James Simmons [Tue, 31 Dec 2013 16:44:02 +0000 (11:44 -0500)]
LU-2850 ptlrpc: handle sunrpc_cache_pipe_upcall change

Currently the ptlrpc GSS code has a wrapper to call
sunrpc_cache_pipe_upcall which takes three arguments.
The cache_request argument is already stored in the
cache_detail structure which is passed in already. So
for 3.8 the cache_request was removed with commit
21cd1254d3402a72927ed744e8ac1a7cf532f1ea. This patch
enables Lustre to detect this change and run on newer
kernels.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I2be613d22aab5a0b8aa207a86e99fc63132affa0
Reviewed-on: http://review.whamcloud.com/8396
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Thomas Stibor <thomas@stibor.net>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3889 osc: Allow lock to be canceled at ENQ time 05/8405/5
Alexander.Boyko [Tue, 3 Dec 2013 06:00:22 +0000 (10:00 +0400)]
LU-3889 osc: Allow lock to be canceled at ENQ time

A cl_lock can be canceled when it's in CLS_ENQUEUED state.
We can't unuse this kind of lock in lov_lock_unuse() because
it will bring this lock into CLS_NEW state and then confuse
osc_lock_upcall().

Add a regression test case by Alexander Boyko.

Signed-off-by: Jinshan Xiong <jinshan.xiong@intel.com>
Signed-off-by: Alexander Boyko <alexander_boyko@xyratex.com>
Change-Id: I2acc7fd0176280062eb0d25dbe929b5d0144db50
Reviewed-on: http://review.whamcloud.com/8405
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bobi Jam <bobijam@gmail.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3952 nfs: don't panic NFS server if MDS fails to find FID 59/8459/5
Bobi Jam [Tue, 5 Nov 2013 09:14:40 +0000 (17:14 +0800)]
LU-3952 nfs: don't panic NFS server if MDS fails to find FID

When the MDS fails to retrieve the parent's FID, handle it without
crashing the NFS server.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Ic15c7f8f99aed38fc77c46d24da7775e1a12b4ff
Reviewed-on: http://review.whamcloud.com/8459
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Fan Yong <fan.yong@intel.com>
Reviewed-by: wangdi <di.wang@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoNew tag 2.5.53 2.5.53 v2_5_53 v2_5_53_0
Oleg Drokin [Wed, 1 Jan 2014 06:17:18 +0000 (01:17 -0500)]
New tag 2.5.53

Change-Id: I226166f123c0c9b6cc6d655db6a6424b3f4b390b

10 years agoLU-4217 build: bump build warnings to 2.7 development 27/8627/2
Andreas Dilger [Thu, 19 Dec 2013 20:57:08 +0000 (13:57 -0700)]
LU-4217 build: bump build warnings to 2.7 development

The "acl" mount option on the client has been deprecated since
Lustre 1.8 (using ACLs is enforced by the admin on the MDS).
Unfortunately, there are still configs using this option, so
it cannot be removed yet, or it would cause an error at mount time.

The DNE cross-MDT locking depends on cross-MDT rename support, so
is also pushed to 2.6.53, but is expected to be fixed sooner.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Idcfe68eaec12df08538f1479a13c2c208c3ebbe5
Reviewed-on: http://review.whamcloud.com/8627
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-4304 tests: fix auster to detect "SKIP" test status 81/8381/3
Jian Yu [Mon, 25 Nov 2013 03:41:28 +0000 (11:41 +0800)]
LU-4304 tests: fix auster to detect "SKIP" test status

This patch fixes run_suite() in auster to detect "SKIP"
test status for one test suite. If all of the sub-tests
in one test suite were skipped, then the status of the
test suite would be "SKIP" instead of "PASS".
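
The status rule can be sketched as a small function. auster itself is a shell script; suite_status() is a hypothetical helper, and the handling of "FAIL" and of an empty list are assumptions added so the example is complete.

```python
def suite_status(subtest_statuses):
    """Derive one status for a test suite from its sub-test statuses."""
    statuses = list(subtest_statuses)
    if any(s == "FAIL" for s in statuses):
        return "FAIL"  # assumption: any failure fails the suite
    if statuses and all(s == "SKIP" for s in statuses):
        return "SKIP"  # every sub-test was skipped
    return "PASS"
```

A suite with a mix of passed and skipped sub-tests still reports "PASS"; only a suite that ran nothing reports "SKIP".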

Signed-off-by: Jian Yu <jian.yu@intel.com>
Change-Id: Ice971aa4b15675c8a5f70f5b32092db69358565e
Reviewed-on: http://review.whamcloud.com/8381
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Wei Liu <wei3.liu@intel.com>
Reviewed-by: Minh Diep <minh.diep@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3319 procfs: move ost proc handling over to seq_file 28/7928/6
James Simmons [Thu, 14 Nov 2013 14:48:08 +0000 (09:48 -0500)]
LU-3319 procfs: move ost proc handling over to seq_file

Most of the current proc handling of the OST is already
based on seq_file handling except for the reporting of
the UUID of the OST. This patch moves this last piece
so that the OST layer will use strictly proc files with
seq_files.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: Idf2bc014ada9292d545f761aa27c777412a66671
Reviewed-on: http://review.whamcloud.com/7928
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-1095 debug: quiet overly verbose info message 18/7918/3
Andreas Dilger [Thu, 10 Oct 2013 18:19:52 +0000 (12:19 -0600)]
LU-1095 debug: quiet overly verbose info message

The client doesn't need to print a message on every client mount saying
that the layout lock feature is enabled.  This can be found at runtime via
the "import" proc file.

I also noticed that deleting OST objects logs into the debug log with
D_HA status, which is enabled by default.  Move this over to D_INODE
so it doesn't fill the OST debug logs.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: Ibd7b39fdd36020e62cd40883d1eac5cc7a0885dc
Reviewed-on: http://review.whamcloud.com/7918
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Emoly Liu <emoly.liu@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3974 llite: dentry d_compare changes in 3.11 46/7746/5
James Simmons [Wed, 11 Dec 2013 15:29:41 +0000 (10:29 -0500)]
LU-3974 llite: dentry d_compare changes in 3.11

In the Linux 3.11 kernel the d_compare function no longer
takes any struct inode arguments. This
patch provides support to handle this case.

Signed-off-by: James Simmons <uja.ornl@gmail.com>
Change-Id: I363057e4d0a119ad43a9907ec26e7e0079f7c305
Reviewed-on: http://review.whamcloud.com/7746
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Peng Tao <bergwolf@gmail.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-by: Yang Sheng <yang.sheng@intel.com>
Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
10 years agoLU-3762 build: allow longer JIRA ticket letter 38/7338/2
Bobi Jam [Thu, 15 Aug 2013 02:28:53 +0000 (10:28 +0800)]
LU-3762 build: allow longer JIRA ticket letter

Allow the leading JIRA ticket letters to be up to 9 characters long.
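
A check of this kind could look like the sketch below. The pattern is an assumption for illustration and is not the regular expression used by the build scripts; in particular the lower bound of two letters is a guess, and has_ticket() is a hypothetical name.

```python
import re

# Hypothetical pattern: 2 to 9 capital letters, a dash, a number, a space.
TICKET_RE = re.compile(r"^[A-Z]{2,9}-[0-9]+ ")


def has_ticket(subject):
    """Return True if the commit subject starts with a JIRA ticket ID."""
    return TICKET_RE.match(subject) is not None
```

With this pattern a subject such as "LU-3762 build: ..." is accepted, while a prefix of ten letters is rejected.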

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: I063ed7dbf6a3f648500463b0478768cc4a684a88
Reviewed-on: http://review.whamcloud.com/7338
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Bob Glossman <bob.glossman@intel.com>