Whamcloud - gitweb
adilger [Wed, 27 Aug 2003 22:39:27 +0000 (22:39 +0000)]
Fixes from Martin to allow write_append_truncate to run on multiple nodes
(albeit only two at a time, in rotation).
braam [Wed, 27 Aug 2003 05:53:38 +0000 (05:53 +0000)]
- open replay should NOT go to the OSC which is unreachable during
recovery
- minor fix to the test
- remove the wrong code that did osc create replay
alex [Tue, 26 Aug 2003 15:57:56 +0000 (15:57 +0000)]
- extents support for ext3 added
- O_EXTENTS flag support (one must pass the flag to vfs_create() in order
to force ext3 to use extents)
NOTE: extents support is disabled by default. To enable it, use the mount
option 'extents'.
braam [Tue, 26 Aug 2003 13:17:56 +0000 (13:17 +0000)]
- basic replay code for open; passes sanity as well as without it I think
adilger [Mon, 25 Aug 2003 23:09:20 +0000 (23:09 +0000)]
Script to set MDS, OST, client to wildly different dates for mtime testing.
Then, dates in the filesystem that are 1973 are MDS, 1976 are OST, and those
in Aug (1979) are the client. These also conveniently appear as ~101010101,
~202020202, and ~303030303 in unix time format (seconds since epoch).
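The mapping between those epoch values and calendar years can be checked with a short sketch. This is not the script from the commit; the `STAMPS` table and `to_utc` helper are hypothetical names, and only the three numbers and their MDS/OST/client roles come from the message above.

```python
from datetime import datetime, timezone

# Epoch values quoted in the commit message; role labels follow its text.
STAMPS = {"MDS": 101010101, "OST": 202020202, "client": 303030303}


def to_utc(seconds):
    """Convert seconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


if __name__ == "__main__":
    for role, seconds in STAMPS.items():
        # Print the role with the year and month its timestamp falls in.
        print(role, to_utc(seconds).strftime("%Y-%m"))
```

Spacing the three clocks years apart means the year of any mtime seen in the filesystem is enough to tell which node set it.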
phil [Sun, 24 Aug 2003 21:05:02 +0000 (21:05 +0000)]
fix many bugs that prevented previously-mounted filesystems from
mounting a second time:
- set the oscc_next_id (via obd_setattr) before we use that value to
recover orphans
- when recovering orphans, send the actual last-used ID (oscc_next_id - 1),
instead of oscc_next_id
- in filter_destroy_precreated, call filter_destroy with a NULL oti
to avoid saving locks for the reply-ack
- remove the boot count component of the filter objid; it is no
longer required and makes orphan recovery harder
- start the filter objid count at 1
- orphan recovery is a dangerous process which if done incorrectly
will delete the wrong objects; with that in mind, we now call a
dramatically simpler lov_create_orphans() instead of the normal
out-of-control lov_create
alex [Sun, 24 Aug 2003 19:08:56 +0000 (19:08 +0000)]
- using this .config I could build 2.6.0-test3 for UML
alex [Sun, 24 Aug 2003 17:38:16 +0000 (17:38 +0000)]
- chaos-2.4.18 series have right ext3-no-write-super patch now
phil [Fri, 22 Aug 2003 22:51:24 +0000 (22:51 +0000)]
merge HEAD into b_devel, including socknal autoconnect and new v24 kernel
cvs2svn [Fri, 22 Aug 2003 21:40:00 +0000 (21:40 +0000)]
This commit was manufactured by cvs2svn to create branch 'unlabeled-1.1.2'.
phil [Fri, 22 Aug 2003 21:39:57 +0000 (21:39 +0000)]
fix socknal build on vanilla kernels by adding socket exports
bumped the kernel patch version to 23, but really only vanilla-2.4.20 changed
zab [Fri, 22 Aug 2003 18:09:06 +0000 (18:09 +0000)]
- pass ecp into io funcs that need it when we already have it, avoiding some
atomic_incs which are currently a bottleneck
shaver [Fri, 22 Aug 2003 17:50:39 +0000 (17:50 +0000)]
stop on pinger-error
shaver [Fri, 22 Aug 2003 16:28:59 +0000 (16:28 +0000)]
Add /proc/fs/lustre/pinger (contents "on" or "off") so that tests which require
the pinger be present can check and error.
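A test could perform that check roughly as follows. This is a sketch only: the path and the "on"/"off" contents come from the commit message, while the helper name `pinger_enabled` and its error handling are assumptions rather than anything in the Lustre test scripts.

```python
from pathlib import Path

# Path taken from the commit message; everything else here is hypothetical.
PINGER_PROC = "/proc/fs/lustre/pinger"


def pinger_enabled(path=PINGER_PROC):
    """Return True if the file reads "on", False if it reads "off".

    Raises FileNotFoundError if the file is missing and ValueError if the
    contents are anything other than "on" or "off".
    """
    text = Path(path).read_text().strip()
    if text not in ("on", "off"):
        raise ValueError(f"unexpected pinger state: {text!r}")
    return text == "on"
```

A test that needs the pinger would call `pinger_enabled()` first and report an error when it returns False or raises.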
wangdi [Fri, 22 Aug 2003 09:05:20 +0000 (09:05 +0000)]
add lustre_lite.h and lustre_idl.h to rpm
wangdi [Fri, 22 Aug 2003 09:05:16 +0000 (09:05 +0000)]
file Makefile.am was initially added on branch b_devel.
rread [Fri, 22 Aug 2003 02:18:59 +0000 (02:18 +0000)]
b=1803
r=shaver
New import state machine, as documented on the lustre wiki in
ImportStates.
A new function, ptlrpc_connect_import, performs all import connects
and moves the import from the DISCON state to either FULL, EVICTED,
REPLAY, or RECOVER, depending on the situation. Unlike the levels, the
states are now exact, and the request->rq_send_state must match the
import state to be sent.
Passes recovery/01, replay-small, and replay-dual.
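The exact-match rule can be illustrated with a small model. This is a sketch under stated assumptions, not the code in import.c: the state names are the ones listed in the message, while `ImportState`, `may_send`, and `CONNECT_TARGETS` are hypothetical names, and the conditions that choose between the target states are not modelled because the message does not give them.

```python
from enum import Enum


class ImportState(Enum):
    """Import states named in the commit message."""
    DISCON = "DISCON"
    FULL = "FULL"
    EVICTED = "EVICTED"
    REPLAY = "REPLAY"
    RECOVER = "RECOVER"


# States a connect may move a DISCON import to, per the commit message.
CONNECT_TARGETS = {
    ImportState.FULL,
    ImportState.EVICTED,
    ImportState.REPLAY,
    ImportState.RECOVER,
}


def may_send(import_state, rq_send_state):
    """Exact-match rule: a request is sent only in the state it names."""
    return import_state is rq_send_state
```

Under the older level scheme a request could presumably go out at any level at or above its own; with exact states, a request tagged for `REPLAY` is held back once the import reaches `FULL`.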
rread [Fri, 22 Aug 2003 02:18:55 +0000 (02:18 +0000)]
file import.c was initially added on branch b_devel.
shaver [Thu, 21 Aug 2003 23:32:17 +0000 (23:32 +0000)]
Make sure that transnos and status make it into the repmsg, not just the
server-side ptlrpc_request. While I'm in there, unify a bunch of common code
and remove some overly pessimistic LBUGs.
rread [Thu, 21 Aug 2003 19:15:54 +0000 (19:15 +0000)]
b=1817 add a replay-single test that uses touch, which triggers 1817.
teach the upcall about different upcall types.
ericm [Thu, 21 Aug 2003 08:57:39 +0000 (08:57 +0000)]
[liblustre]: merge LiuKai's cygwin patch
phil [Thu, 21 Aug 2003 06:40:50 +0000 (06:40 +0000)]
if the connection fails for whatever reason and we pass an invalid
export handle into oscc_init, don't dereference a NULL
phil [Thu, 21 Aug 2003 06:32:25 +0000 (06:32 +0000)]
fix double-unlock in osccd_main
phil [Thu, 21 Aug 2003 06:10:08 +0000 (06:10 +0000)]
another spinlock typo
phil [Thu, 21 Aug 2003 06:08:25 +0000 (06:08 +0000)]
- initialize spinlocks
- fix spinlock name typo
- uninitialized variable warning
phil [Thu, 21 Aug 2003 04:18:18 +0000 (04:18 +0000)]
oscc_init should not be calling oscc_precreate, because this happens
before recovery.
phil [Wed, 20 Aug 2003 23:32:38 +0000 (23:32 +0000)]
one big fix:
- in osc_create, set oa->o_id AFTER copying oscc_oa overtop of it, so
that we don't give every new file object 6
lots of little fixes:
- call mds_lov_set_nextid() from the end of mds_postsetup()
- ...make that safe by calling mds_lov_connect() at the beginning of
mds_lov_set_nextid()
- in filter_create, stop holding the directory lock for reply acks.
it's not needed anymore, and causes obvious deadlock during precreation
- clean up the error handling in filter_precreate
- start precreating when there are fewer than "kick_barrier" objects
remaining, and then create "grow_count" additional objects
moving on to larger tests:
- initial_create_count is now 100
- kick_barrier is now 50
- grow_count is now 100
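The threshold rule and the three values above can be sketched as a small simulation. The constants come from the commit message; the function names and the one-object-at-a-time loop are assumptions made for illustration and do not reflect how osc_create or filter_precreate are actually structured.

```python
# Values from the commit message.
INITIAL_CREATE_COUNT = 100
KICK_BARRIER = 50
GROW_COUNT = 100


def objects_to_precreate(remaining):
    """Return how many objects to precreate given the unused count."""
    # Precreation starts only when fewer than KICK_BARRIER objects remain.
    return GROW_COUNT if remaining < KICK_BARRIER else 0


def simulate(creates):
    """Consume `creates` objects one at a time; return the unused count."""
    remaining = INITIAL_CREATE_COUNT
    for _ in range(creates):
        remaining -= 1
        remaining += objects_to_precreate(remaining)
    return remaining
```

With these values the pool never drops below 49 unused objects before another batch of 100 is requested.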
braam [Wed, 20 Aug 2003 15:38:56 +0000 (15:38 +0000)]
- a bunch of "fixes" on b_llpmd
- hook initialization and orphan removal functions
[this needs review and redo, after some config issues are settled.]
[nasty hackery continues in this area until lconf sees a change.]
- things that would help:
1. pass the lov_uuid to the MDS at setup time
2. call the mds ioctl to do lovinfo setup during setup
[add yet another parameter to mds_setup?]
[currently this is a race]
3. get lov/osc devices set up before the mds is set up.
All 3 look doable to me.
- moved code from the class ioctl handler to genops, to begin a
framework for in-kernel configuration. For now we have:
- class_newdev,
- class_attach
- class_setup functions.
- added Alex's pre-creation code
NOTE: oscc_precreate doesn't wake up unless has_objects returns
true. That prevents pre-create from being called multiple times. Should be
fixed, but it's unlikely we will see this bug again
girishc [Wed, 20 Aug 2003 07:43:26 +0000 (07:43 +0000)]
NFS export of Lustre FS
-Some of the review comments addressed
girishc [Wed, 20 Aug 2003 07:43:23 +0000 (07:43 +0000)]
file nfs_export_kernel-2.4.20.pc was initially added on branch b_nfsdevel.
adilger [Tue, 19 Aug 2003 23:34:57 +0000 (23:34 +0000)]
Remove "ERR" trap, which doesn't work in my UML, nor at PNNL. r=robert
Use "MOUNT" and "DIR" to be more in line with sanity.sh. This also
allows us to run these tests in a subdir instead of the root, in case
it makes a difference (it doesn't currently).
Use checkstat instead of "ls" so that we can do some verification of results.
Don't pause for GDB debugging symbols to be loaded.
adilger [Tue, 19 Aug 2003 22:22:50 +0000 (22:22 +0000)]
Close and unlink test file if we didn't encounter any errors.
Set default iteration count to 50k instead of 100k.
adilger [Tue, 19 Aug 2003 22:19:25 +0000 (22:19 +0000)]
Add truncate-to-zero to multiop commands.
adilger [Tue, 19 Aug 2003 22:15:19 +0000 (22:15 +0000)]
Remove unused variable.
behlendo [Tue, 19 Aug 2003 19:13:26 +0000 (19:13 +0000)]
- #1642
- backport some small cleanups to the dentry refcounts and cleanup
- Zach's fix from llpio which fixes a race condition in the
partial read
- #1592 fixes trying to read a single page that was entirely past
EOF, we would later oops in commitrw on the OST trying to dput
a NULL dentry. This fixes the 'make' issue Richard discovered.
- #1505 patch to quiet four messages
- "lustre_commitd quitting"
- "processing error -107" after server is restarted
- no need to print statfs() errors to the console
- print one reconnect message instead of two
- Adjusted version_tag.pl to use portals version string if no
/CVS/Tag exists. This resolves rpms using only the HEAD tag.
behlendo [Tue, 19 Aug 2003 19:13:24 +0000 (19:13 +0000)]
file LLNL_Changelog was initially added on branch b_llnl_stable.
rread [Tue, 19 Aug 2003 19:04:17 +0000 (19:04 +0000)]
b=1777
* Add test to replay open after chmod
phil [Tue, 19 Aug 2003 17:38:30 +0000 (17:38 +0000)]
merge b_multinet into HEAD
phil [Tue, 19 Aug 2003 17:10:03 +0000 (17:10 +0000)]
b=1505
Disable some console messages from failed statfs(), reconnect
ericm [Tue, 19 Aug 2003 12:08:52 +0000 (12:08 +0000)]
[liblustre]: merge LiuKai's cygwin patch
- add portals/include/cygwin-ioctl.h
- don't compile accepter.c in liblustre
- add missing le-cpu convert macro
- use windows API directly at some places
ericm [Tue, 19 Aug 2003 12:08:49 +0000 (12:08 +0000)]
file cygwin-ioctl.h was initially added on branch b_eq.
braam [Tue, 19 Aug 2003 08:28:36 +0000 (08:28 +0000)]
- remainder of the pre-create code (more or less)
- routines to set next_id
- routines to clear orphans
- routines to set growth count
- minor cleanups
braam [Tue, 19 Aug 2003 07:08:47 +0000 (07:08 +0000)]
- more small fixes for pre-creation, to help bzzz make progress.
- set o_id as a hint upon creation.
alex [Mon, 18 Aug 2003 09:40:35 +0000 (09:40 +0000)]
- new series uml_2.6.0_test3
- latest uml patch against 2.6.0-test3
alex [Mon, 18 Aug 2003 09:40:33 +0000 (09:40 +0000)]
file uml_2.6.0_test3 was initially added on branch b_llpmd.
alex [Mon, 18 Aug 2003 09:40:19 +0000 (09:40 +0000)]
file uml-patch-2.6.0-test3-1.pc was initially added on branch b_llpmd.
alex [Mon, 18 Aug 2003 09:40:10 +0000 (09:40 +0000)]
file uml-patch-2.6.0-test3-1.patch was initially added on branch b_llpmd.
phil [Sun, 17 Aug 2003 20:52:58 +0000 (20:52 +0000)]
- oscc_has_objects needs to do a signed comparison, or osc_create will
spin forever
- print some lines to the log about which objects are being returned,
and which were preallocated
- default to starting at 2 instead of 0, because that's the first
object that the OST will actually create. This will be replaced
shortly with an actual understanding of which objects are available
at startup
- exclude tests 34c-34e until I fix "create with existing size"; the
rest of sanity passes if you start with a newly formatted OST (see
last point)
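The signed-comparison point in the first item can be illustrated generically. The real body of oscc_has_objects is not shown in this log, so the field names, the 64-bit width, and the form of the comparison below are all assumptions; the sketch only shows how an unsigned difference wraps when the next ID has passed the last precreated ID.

```python
MASK64 = (1 << 64) - 1


def as_unsigned64(value):
    """Reduce an integer to its unsigned 64-bit representation."""
    return value & MASK64


def as_signed64(value):
    """Reinterpret the low 64 bits of an integer as a signed value."""
    value &= MASK64
    return value - (1 << 64) if value >= (1 << 63) else value


def has_objects_unsigned(last_id, next_id, count=1):
    # Hypothetical buggy form: the difference wraps to a huge positive number.
    return as_unsigned64(last_id - next_id) >= count


def has_objects_signed(last_id, next_id, count=1):
    # Hypothetical fixed form: a negative difference means nothing is left.
    return as_signed64(last_id - next_id) >= count
```

With `last_id=5` and `next_id=6` the unsigned form reports that objects are available while the signed form does not, which is the kind of disagreement that can leave a caller looping.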
phil [Sun, 17 Aug 2003 17:42:36 +0000 (17:42 +0000)]
Fix some tab damage and missing modeline
braam [Sun, 17 Aug 2003 15:25:32 +0000 (15:25 +0000)]
several fixes to the pre-create daemon. I think it survives a few creates
now.
ericm [Sun, 17 Aug 2003 07:20:00 +0000 (07:20 +0000)]
[liblustre]: liblustre sanity script fix.
ericm [Sun, 17 Aug 2003 07:19:04 +0000 (07:19 +0000)]
[liblustre]: set intent pointer to NULL after intent released.
shaver [Sat, 16 Aug 2003 19:38:35 +0000 (19:38 +0000)]
b=1541: fix LASSERT on mismatched transnos during open-unlink testing
r=phik
- the open request now keeps a pair of pointers in its replay data:
- the och that has always been there for fixing up open filehandles
- the close request that has been created to balance it, which may or may
not have made it over the wire yet, for fixing up the filehandle cookie it
sends to the MDS
- the explicit updating of the close req replaces the previous walking of the
sending list, which was both insufficient _and_ excessive
- suppress the "asked for EA, got none" message in the case of a 0-nlink file
- when processing an INTENT_ONLY enqueue request, return a successful 0 rather
than ELDLM_LOCK_ABORTED, which only confused the clients
- when opening by FID, be sure to set the intent disposition flags
appropriately
- triple the timeout, instead of doubling it, for the replay-completed PING, to
better suit the numerology of the pinger and timeout values
jacob [Sat, 16 Aug 2003 19:05:29 +0000 (19:05 +0000)]
Use compat. macros to fix building w/ rh-2.4.20 series.
r=phil
braam [Sat, 16 Aug 2003 14:49:46 +0000 (14:49 +0000)]
- pre-creation is beginning to work with these small changes to the
state machine.
braam [Sat, 16 Aug 2003 05:08:19 +0000 (05:08 +0000)]
- initial check in of pre-creation code including:
- alex server side code (still needs to update last allocated id on disk)
- peter's client code (still needs debugging and some finishing touches)
- the whole thing needs some thought vis-a-vis:
- recovery
- error notification from the daemon to the caller.
- removed unused commit_cbd, maybe one day we will use it and resurrect it,
but I think we need this at the osc/mdc level, not at llite level.
braam [Sat, 16 Aug 2003 05:08:17 +0000 (05:08 +0000)]
file osc_create.c was initially added on branch b_llpmd.
adilger [Fri, 15 Aug 2003 20:28:09 +0000 (20:28 +0000)]
Fix for truncate/write inversion.
b=1639
r=phil
behlendo [Fri, 15 Aug 2003 17:59:35 +0000 (17:59 +0000)]
- #1763 fixes timestamps on files getting advanced incorrectly.
- #1639 client VFS truncate/write lock inversion. This caused
the bug in Joe Koning music code.
zab [Thu, 14 Aug 2003 21:00:37 +0000 (21:00 +0000)]
- give llite moderately more reasonable default watermarks so that it can
fill a fair sized lov in a burst.
- be more careful about unlocking writepage's page as it returns an error
- allocate writeback page state with NOFS so we don't descend into filesystems
when memory is short
adilger [Thu, 14 Aug 2003 17:41:21 +0000 (17:41 +0000)]
Update header to give compilation and running instructions.
girishc [Thu, 14 Aug 2003 17:20:41 +0000 (17:20 +0000)]
NFS Export of Lustre File System
-Contains kernel patch nfs_export_kernel-2.4.20.patch
-Lustre Modifications
fh_to_dentry/dentry_to_fh implementation
IT_CREAT Implementation (As zero sized file creation on NFS client never calls nfsd_open)
girishc [Thu, 14 Aug 2003 17:20:39 +0000 (17:20 +0000)]
file nfs_export_kernel-2.4.20.patch was initially added on branch b_nfsdevel.
adilger [Thu, 14 Aug 2003 16:26:58 +0000 (16:26 +0000)]
Check for MAP_FAILED return from mmap instead of NULL.
#define _GNU_SOURCE to get O_DIRECT/O_DIRECTORY
In test_brw define a default pg_vec if unset.
mdoyle [Thu, 14 Aug 2003 14:07:53 +0000 (14:07 +0000)]
utility to get GM global nid from node name
mdoyle [Thu, 14 Aug 2003 14:07:51 +0000 (14:07 +0000)]
file lgmnalnid.c was initially added on branch b_myrinet.
nikke [Thu, 14 Aug 2003 09:13:33 +0000 (09:13 +0000)]
Paged IO support for scimacnal.
Closes bug#1347.
adilger [Thu, 14 Aug 2003 06:30:35 +0000 (06:30 +0000)]
Zach and I both agreed that we should keep short IO sanity tests in sanity.sh
or sanityN.sh so that they are always run during tests.
Remove tests from ALWAYS_EXCEPT now that their bugs have been fixed and the
tests pass. Please keep ALWAYS_EXCEPT and the comment above it in sync.
phil [Thu, 14 Aug 2003 05:54:47 +0000 (05:54 +0000)]
merge b_filterio into b_llpio; b_filterio soon to be deleted, given
that b_llpio is a superset
rread [Wed, 13 Aug 2003 20:16:21 +0000 (20:16 +0000)]
b=1513
r=shaver
* add conn_cnt to import, export, and lustre_msg. The conn_cnt is
increased by the client whenever it connects or reconnects. The
server ignores failed BRWs with an old conn_cnt.
Also, when a bulk is resent, the xid is changed, so the previous one
will definitely fail.
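The conn_cnt check can be sketched as follows. This is a minimal model, not the ptlrpc code: the class and function names are hypothetical, and the only behaviour taken from the message is that the client increases the count on every connect or reconnect and the server ignores failed BRWs carrying an older count.

```python
class ClientImport:
    """Client side: bumps conn_cnt on every connect or reconnect."""

    def __init__(self):
        self.conn_cnt = 0

    def connect(self):
        self.conn_cnt += 1
        return self.conn_cnt


class ServerExport:
    """Server side: remembers the conn_cnt of the latest connection."""

    def __init__(self):
        self.conn_cnt = 0

    def handle_connect(self, conn_cnt):
        self.conn_cnt = conn_cnt

    def should_ignore_failed_brw(self, msg_conn_cnt):
        # A failed bulk read/write from an earlier connection is stale.
        return msg_conn_cnt < self.conn_cnt
```

A bulk request stamped with count 1 that fails after the client has reconnected with count 2 is treated as stale, so its failure does not affect the new connection.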
phil [Wed, 13 Aug 2003 19:53:17 +0000 (19:53 +0000)]
merge HEAD changes into b_devel
phil [Wed, 13 Aug 2003 19:02:31 +0000 (19:02 +0000)]
Fix size validation with getattr intents. b=1768
adilger [Wed, 13 Aug 2003 19:00:52 +0000 (19:00 +0000)]
Update write_append_truncate program:
- check all return codes
- add understandable error messages
- can select the test file without recompiling.
- can select iteration count (default 100000 loops)
- have some progress output during run (ala fsx)
behlendo [Wed, 13 Aug 2003 18:37:23 +0000 (18:37 +0000)]
- #1592 fixes trying to read a single page that was entirely past
EOF, we would later oops in commitrw on the OST trying to dput
a NULL dentry. This fixes the 'make' issue Richard discovered.
- #1765 fixes MDS cleanup bug where we have an outstanding reply
at disconnect time. Remove all exp_outstanding_reply
manipulation in mds_disconnect.
- #1732 fixes the lprocfs unresolved symbol error.
phil [Wed, 13 Aug 2003 18:19:25 +0000 (18:19 +0000)]
b=1592
Read past EOF would clear res->dentry in preprw; we would oops trying
to dput it in commitrw. Fixed.
zab [Wed, 13 Aug 2003 17:20:32 +0000 (17:20 +0000)]
- trivial sanity asserts in llite page accounting
- don't leak ocp's in some failure paths
- make sure read-ahead doesn't orphan a locked page
- add some mmap goo to multiop (this is going to conflict, I bet)
adilger [Wed, 13 Aug 2003 17:00:05 +0000 (17:00 +0000)]
First test for io sanity - force a zero-length read from a sparse stripe.
adilger [Wed, 13 Aug 2003 17:00:00 +0000 (17:00 +0000)]
file iosanity.sh was initially added on branch b_devel.
phil [Tue, 12 Aug 2003 23:24:08 +0000 (23:24 +0000)]
b=1642
r=zab
- Land a cleanup of the preprw_read/commitrw_read path from b_filterio.
The old dentry cleanup in error cases could not be overseen.
- Backport a copy of Zach's fix for bug 1741, believed to be the
cause of the intermittent partial-page corruption
adilger [Tue, 12 Aug 2003 23:18:11 +0000 (23:18 +0000)]
MPI distributed write/append/truncate coherency stress test - original version.
adilger [Tue, 12 Aug 2003 23:18:08 +0000 (23:18 +0000)]
file write_append_truncate.c was initially added on branch b_devel.
behlendo [Tue, 12 Aug 2003 22:45:14 +0000 (22:45 +0000)]
- backport some small cleanups to the dentry refcounts and cleanup
- Zach's fix from llpio which fixes a race condition in the partial read path
behlendo [Tue, 12 Aug 2003 22:33:09 +0000 (22:33 +0000)]
- Bug #1749 patch
rread [Tue, 12 Aug 2003 20:27:41 +0000 (20:27 +0000)]
b=1720
r=adilger
* free reply buffer whether we have ack locks or not.
behlendo [Tue, 12 Aug 2003 19:46:08 +0000 (19:46 +0000)]
- Patch for bug #1600, MDS server data isn't written at setup. This is
only an issue when mounting the MDS for the first time.
adilger [Tue, 12 Aug 2003 17:45:39 +0000 (17:45 +0000)]
Revert changes committed to b_devel for MDS-creates-objects.
Changes will be committed into b_llpmd instead.
We could, however, remove the OST open/close RPCs from b_devel as a
starting point.
behlendo [Tue, 12 Aug 2003 17:03:09 +0000 (17:03 +0000)]
- version_tag.pl now uses the portals version string if no CVS/Tag exists.
This resolves building rpms which all claim to have the HEAD tag.
- LLNL_ChangeLog updates
- Tagged version llnl_4devel
adilger [Tue, 12 Aug 2003 16:59:47 +0000 (16:59 +0000)]
Add sleeping write/verification test to CVS.
usage: sleeptest [basename [sleeptime]]
Fixes a couple of bugs in the current test:
- assumes that there are 100 bytes after offset & ~4095 (fails at iter 233)
- assumes that "Line" appears in the output (fails if < 60 bytes read)
adilger [Tue, 12 Aug 2003 16:59:45 +0000 (16:59 +0000)]
file sleeptest.c was initially added on branch b_devel.
adilger [Tue, 12 Aug 2003 16:18:37 +0000 (16:18 +0000)]
Exit early from mds_open() if we get an error.
b=1749
r=phil
adilger [Tue, 12 Aug 2003 16:13:58 +0000 (16:13 +0000)]
Fix import levels when a reconnect happens without a previous timeout.
b=1597
r=shaver
behlendo [Tue, 12 Aug 2003 16:02:30 +0000 (16:02 +0000)]
- #1751 fix LBUG/bad LOV EA if lov_create fails because of inactive OSCs.
We hit this bug on ALC Monday Afternoon (8/11/03), the fix has been
tested in b_devel since 07/28.
braam [Tue, 12 Aug 2003 10:22:33 +0000 (10:22 +0000)]
- this contains much of the pre-create code
- extensive changes to mds_open to call obd_create
- important changes to the client to avoid obd_create
- changes to intent locking to avoid giving a client a lock
- deficiencies:
- there is an open unlink problem (sanity 31). I think this was lurking and is now exposed
- lstripe needs some thought; doesn't work right now
- further refactoring of the object creation in mds_open desirable and forthcoming.
adilger [Tue, 12 Aug 2003 06:39:26 +0000 (06:39 +0000)]
Return an error from lov_create() if all OSCs are inactive.
b=1751
r=phil,jacob
adilger [Tue, 12 Aug 2003 06:26:29 +0000 (06:26 +0000)]
Don't LBUG if we get bad stripe data back from the MDS (normally a bug, but
not one that we want to crash on).
zab [Mon, 11 Aug 2003 22:04:51 +0000 (22:04 +0000)]
- make sure the sync commit write gets the rpc's error code
- be sure to set the page up to date if commit_write succeeds
- get a page ref when we start async ocp io and drop the ref as the io
completes
- drop some asserts left over from early refactoring passes
- add a simple io data consistency check to sanity.sh
rread [Mon, 11 Aug 2003 20:06:27 +0000 (20:06 +0000)]
r=adilger
* Add a flag to mds_mfd_close to prevent unlinking files when we are
cleaning up with the --failover flag.
rread [Mon, 11 Aug 2003 19:58:33 +0000 (19:58 +0000)]
* test_6 fails if any files are found at the end.
* test_8 does fchmod after unlink
rread [Mon, 11 Aug 2003 18:15:24 +0000 (18:15 +0000)]
* yank back stat help
rread [Mon, 11 Aug 2003 18:01:20 +0000 (18:01 +0000)]
* new: t fchmod (set mode to 0)
* add help for new options