Whamcloud - gitweb
shaver [Thu, 22 Aug 2002 13:30:11 +0000 (13:30 +0000)]
* Chain granted locks off the export.
braam [Thu, 22 Aug 2002 06:04:36 +0000 (06:04 +0000)]
minor but important lov bug fixes Robert and I found out about.
braam [Thu, 22 Aug 2002 05:00:39 +0000 (05:00 +0000)]
- fix sending wrong attributes... Are there more of these?
pschwan [Thu, 22 Aug 2002 04:56:24 +0000 (04:56 +0000)]
- abstracted some parts of ll_size into ll_size_lock and ll_size_unlock
- added file size locking around the mds_setattr in ll_file_release
pschwan [Wed, 21 Aug 2002 23:52:51 +0000 (23:52 +0000)]
in getattr requests, get the authoritative file size from the OST
braam [Wed, 21 Aug 2002 22:36:29 +0000 (22:36 +0000)]
Change lov.xml file a little so that it works with localhost.
Remove the namespace locks from ll_file_release. Basic parallel I/O
now works, it seems.
adilger [Wed, 21 Aug 2002 22:25:14 +0000 (22:25 +0000)]
Add target data for symlinks to intent.
adilger [Wed, 21 Aug 2002 22:24:21 +0000 (22:24 +0000)]
Fix symlink creation a bit more.
Fix ll_create() error handling (at least it won't oops).
eeb [Wed, 21 Aug 2002 22:17:39 +0000 (22:17 +0000)]
optional highmem buffers in obdecho/test_brw
adilger [Wed, 21 Aug 2002 21:49:01 +0000 (21:49 +0000)]
Symlink support.
rread [Wed, 21 Aug 2002 21:06:37 +0000 (21:06 +0000)]
new option: --nosetup disables device setup/cleanup. Useful for testing
module load/unload without any device config.
shaver [Wed, 21 Aug 2002 20:49:11 +0000 (20:49 +0000)]
* Add timeouts for blocking-AST callbacks.
* Add fail_loc support for dropping a blocking AST reply or callback on the
OSC.
rread [Wed, 21 Aug 2002 20:47:24 +0000 (20:47 +0000)]
Add module loading support to lconf. By default, lconf will load and
unload the modules needed based on what devices are configured for a node.
The path to load modules from is determined based on the directory lconf is
run from. If a Makefile is found, then it is assumed lconf is in lustre/utils
and modules will be searched for in ../../lustre and ../../portals.
Module support can be turned off with --nomod, if desired.
Use option --gdb to create a gdb module script. Lconf will print the path
of the script and pause for a few seconds.
rread [Wed, 21 Aug 2002 20:40:15 +0000 (20:40 +0000)]
- change default tcp port to 988
braam [Wed, 21 Aug 2002 19:30:13 +0000 (19:30 +0000)]
No dput's, the VFS does this. Some sanity with rename appears to
begin surfacing now.
braam [Wed, 21 Aug 2002 19:05:40 +0000 (19:05 +0000)]
- redo rename to instantiate d_new on the client, instead of
trying to bypass that step (and run into troubles).
braam [Wed, 21 Aug 2002 17:05:43 +0000 (17:05 +0000)]
- give inodes more metadata for objects. Per stripe we now maintain:
- id
- size
- soon: also flags indicating if we have a "size" lock on the object
- fix two lov bugs (open/close) - both used the wrong oa.
braam [Wed, 21 Aug 2002 04:13:38 +0000 (04:13 +0000)]
Change uml1 to localhost. Now it works.
braam [Wed, 21 Aug 2002 03:53:09 +0000 (03:53 +0000)]
- put Robert's file back...
braam [Wed, 21 Aug 2002 03:41:13 +0000 (03:41 +0000)]
- In ordinary writes (not O_DIRECT) do not round the data object file
size to a page boundary. Objects are now as long as they should
be.
braam [Wed, 21 Aug 2002 03:15:21 +0000 (03:15 +0000)]
- add a name to ptlrpc_svc_init in preparation for eliminating
request->rq_obd
- fix typo in osc_enqueue
pschwan [Wed, 21 Aug 2002 02:37:36 +0000 (02:37 +0000)]
- Do an additional getattr in ll_lookup2 after we get the lock, to refresh the
inode attributes
- mds_blocking_ast didn't wait for all lock holders to finish with the lock
before cancelling it; fixed.
- adjusted more of the wildly inconsistent error code reporting, this time in
mds_getattr
- removed the bad array-walking code that I introduced yesterday to osc_enqueue
adilger [Tue, 20 Aug 2002 21:47:17 +0000 (21:47 +0000)]
Fix input of objid.
adilger [Tue, 20 Aug 2002 21:02:02 +0000 (21:02 +0000)]
Test program to generate the "test_brw" pattern from user-space for a file.
eeb [Tue, 20 Aug 2002 19:41:35 +0000 (19:41 +0000)]
de-serialised getattr, brw test ioctls
adilger [Tue, 20 Aug 2002 17:47:33 +0000 (17:47 +0000)]
Pass the private descriptor pointer between preprw and commitrw for the
ost_brw_read() case also. Thanks again to Eric for finding.
rread [Tue, 20 Aug 2002 17:43:48 +0000 (17:43 +0000)]
- allow lconf to continue to run in debug mode even if acceptor and lctl
are not available.
adilger [Tue, 20 Aug 2002 17:09:54 +0000 (17:09 +0000)]
Fix missing increment for multi-page I/O cleanup/verification, found by Eric.
gord-fig [Tue, 20 Aug 2002 16:58:53 +0000 (16:58 +0000)]
Add simple test of mount and unmount.
adilger [Tue, 20 Aug 2002 06:29:31 +0000 (06:29 +0000)]
- Add create and destroy operations to lctl.
- Make the help options a bit more descriptive.
adilger [Tue, 20 Aug 2002 05:39:49 +0000 (05:39 +0000)]
Remove bogus assertion. The dentry isn't instantiated until later, and
we are guaranteed to have a valid inode if we return from ll_lookup2()
without an error code.
adilger [Tue, 20 Aug 2002 05:27:45 +0000 (05:27 +0000)]
Comment out the rename part of runtests to try and get something working.
adilger [Tue, 20 Aug 2002 04:26:26 +0000 (04:26 +0000)]
Remove call to set_page_clean() from lustre_commit_write().
gord-fig [Tue, 20 Aug 2002 03:14:32 +0000 (03:14 +0000)]
Tweak distribution files.
adilger [Mon, 19 Aug 2002 23:57:32 +0000 (23:57 +0000)]
Return an error code if the test_brw read check failed.
adilger [Mon, 19 Aug 2002 23:45:00 +0000 (23:45 +0000)]
Minor cleanups to test_brw path + debugging to see what is wrong with vectors.
rread [Mon, 19 Aug 2002 23:42:06 +0000 (23:42 +0000)]
- looks like osc_enqueue is now looking at the stripe_count, so it needs to be
initialized. Not sure why osc needs this info.
rread [Mon, 19 Aug 2002 22:36:44 +0000 (22:36 +0000)]
- fix return code in mds_getattr_internal
- add support for --format flag to lmc
rread [Mon, 19 Aug 2002 22:16:56 +0000 (22:16 +0000)]
- generate some simple configs
adilger [Mon, 19 Aug 2002 21:51:33 +0000 (21:51 +0000)]
Add some basic data integrity checking to obdecho.
This puts the offset and objid into the first 16 bytes and last 16 bytes
of the bulk transfer. These are in HTON__u64() format.
The lctl command for test_brw now takes an objid instead of an obdo count,
so that you can (potentially) use lctl with test_brw on a real OBD instead
of just obdecho. That has not been tested yet.
shaver [Mon, 19 Aug 2002 21:18:28 +0000 (21:18 +0000)]
Replace ldlm_lock's connection handle with an export handle. (Always
NULL on the client side.)
adilger [Mon, 19 Aug 2002 20:46:56 +0000 (20:46 +0000)]
Fix case where rc was not set.
pschwan [Mon, 19 Aug 2002 17:50:18 +0000 (17:50 +0000)]
- Maintain a list in the ll_inode_data of data (OST) locks held by this client
- in ll_file_release, cancel any remaining locks in that list
- refactored mds_getattr_name and mds_getattr into two functions with a common
sub-function; this fixed bugs in mds_getattr, and helps prevent them from
drifting apart again
rread [Sat, 17 Aug 2002 23:16:08 +0000 (23:16 +0000)]
- writing data to lov stripes is beginning to work. still much to check
shaver [Sat, 17 Aug 2002 22:06:32 +0000 (22:06 +0000)]
* l_wait_event can now do interrupts without a timeout, if we're feeling brave.
* Big doc comment for l_wait_event.
* Only fire the timeout once from l_wait_event.
* Made timeout and the recovery-upcall path configurable via sysctl.
* Added OBD_FAIL_OSC codes for simulating simple client failure.
* Tentative rewiring of recovd into client connections, needs more thought
and then more typing. We do fire the upcall, at least.
* Use the provided cluuid instead of NULL wherever it's handy already.
* Protect (feebly) against waiting for recovery that will never happen,
in sync_io_timeout.
* Add timeouts to bulk operations in MDS and OST -- a recovery stub is now
triggered, but nothing else.
* Document the unpleasant business in osc_brw_{read,write} as pertains to
errors in the callbacks and cleanup of descriptors.
* Remove now-unused ptlrpc_check_bulk_{sent,received}.
rread [Sat, 17 Aug 2002 22:02:33 +0000 (22:02 +0000)]
Some architectures (like ppc) need linux/init.h to define things like
__init.
rread [Sat, 17 Aug 2002 00:50:58 +0000 (00:50 +0000)]
- cleanup callback data to get striping to work
adilger [Fri, 16 Aug 2002 21:25:17 +0000 (21:25 +0000)]
Add --with-linuxdir= to match howto (and be more correct).
adilger [Fri, 16 Aug 2002 20:53:08 +0000 (20:53 +0000)]
Fix yet one more hidden-by-missing-kmap compile error.
adilger [Fri, 16 Aug 2002 20:37:47 +0000 (20:37 +0000)]
Fix harmless compile warning. Not sure the code is correct, though, since
it would appear to preclude unlinking a non-regular, non-directory file.
adilger [Fri, 16 Aug 2002 16:54:31 +0000 (16:54 +0000)]
Finally appear to have a bug free "locked page" handler. Still need to
worry about multiple writer problems (I have seen this happen in dbench,
for instance). Maybe we should only use the locked-page-copy fallback
if there is the possibility of a deadlock (i.e. we already have another
page locked?).
adilger [Fri, 16 Aug 2002 16:51:11 +0000 (16:51 +0000)]
Remove old journal callback compatibility support.
Use a more "magical" number than 4711 for the LOV EA data, since this is
the magic we use in many other places as well.
adilger [Fri, 16 Aug 2002 16:49:22 +0000 (16:49 +0000)]
Remove open-coded list walking.
adilger [Fri, 16 Aug 2002 16:48:00 +0000 (16:48 +0000)]
Remove some grossness I previously put in the error handling path. It
turns out inode can only ever be NULL and not IS_ERR().
adilger [Fri, 16 Aug 2002 16:45:35 +0000 (16:45 +0000)]
Set the o_valid flag for valid fields in the obdo. We also need to start
checking for these at the target side to ensure we are using good data
and not an unset or corrupt field.
rread [Fri, 16 Aug 2002 09:30:29 +0000 (09:30 +0000)]
- fix mount by adding UUIDs requested by Mike
- add HTREE to lconf
- fix various brainos in lov
shaver [Fri, 16 Aug 2002 01:51:03 +0000 (01:51 +0000)]
* Fix interrupt-pending-when-timeout-occurs handling in l_wait_event.
* If timeout specified, but no handler, wake up with -ETIMEDOUT instead of
going back to sleep.
* Export a class_signal_client_failure hook-symbol from obdclass, to be filled
in by recovd.o and used by various obdclass bits (avoiding sour dependencies
on recovd.o).
* Add OBD_FAIL_OST_BRW_{READ,WRITE}_BULK fail_loc values, for testing of
bulk-xfer timeouts and interrupts.
* Fix the timeout in ll_sync_io_cb to scale by HZ.
* Rip out some leftovers from ptlrpc_check_reply.
adilger [Thu, 15 Aug 2002 20:40:49 +0000 (20:40 +0000)]
Fix minor divergence between the -chaos and non-chaos patches.
adilger [Thu, 15 Aug 2002 16:56:03 +0000 (16:56 +0000)]
Don't change local inode size if we had a write error (that causes vmtruncate
to try and truncate the file, which calls osc_punch() to remove blocks we
didn't write in the first place).
adilger [Thu, 15 Aug 2002 03:48:11 +0000 (03:48 +0000)]
Hopefully final fix for kunmap problem - will check with a chaos build...
adilger [Thu, 15 Aug 2002 00:14:34 +0000 (00:14 +0000)]
Fix strange non-complaining error for missing page_array and pagearray
declarations.
pschwan [Wed, 14 Aug 2002 22:45:18 +0000 (22:45 +0000)]
- comment out the noisy get/put LDLM_DEBUGs; I'll remove them when I'm sure
that we're free of refcount bugs
- make mds_connect not crash when cluuid is NULL
- in ldlm_intent_policy, return a write lock if the client is opening a file
with no EA
- in mds_extN_get_md, allow MD to be NULL
- fix resource ID corruption leading to infinite locks (b=595247)
- in lctl, make ptl_initialize failure non-fatal, so that I can run debugctl
functions on non-Lustre systems
rread [Wed, 14 Aug 2002 19:37:44 +0000 (19:37 +0000)]
- updated to match tools
adilger [Wed, 14 Aug 2002 10:55:57 +0000 (10:55 +0000)]
Fix obvious breakage in ll_direct_IO.
adilger [Wed, 14 Aug 2002 10:21:43 +0000 (10:21 +0000)]
Add some renames to the runtest, since nobody seems to have noticed that
it didn't work.
adilger [Wed, 14 Aug 2002 10:19:26 +0000 (10:19 +0000)]
Fix annoying bugs in filter_write_locked_page() - we were not unlocking
the page, and this appears to have left stray locked pages around?
In filter_commitrw() we were never writing out the "locked" pages because
we didn't reset the loop variables. Couldn't have found this bug without
the other one ;-).
adilger [Wed, 14 Aug 2002 09:21:57 +0000 (09:21 +0000)]
Fix most obvious breakage due to rename. Still not 100% clean.
adilger [Wed, 14 Aug 2002 05:58:21 +0000 (05:58 +0000)]
Fix minor patch breakage.
adilger [Wed, 14 Aug 2002 05:15:02 +0000 (05:15 +0000)]
Minor changes to bring patch-2.4.18 and patch-2.4.18-chaos12 into sync:
- named initializers for the intent structs
- whitespace cleanups
- diff chunk ordering (so it is easy to compare the two)
- add path_lookup_it() to patch-2.4.18, even though we don't use it there yet
- add intent_release() for rename dentries to -chaos12
Still does not contain sync of link_path_walk_it() differences.
gord-fig [Wed, 14 Aug 2002 01:24:27 +0000 (01:24 +0000)]
Run latex twice.
gord-fig [Wed, 14 Aug 2002 00:29:45 +0000 (00:29 +0000)]
Updated changebar generation to preserve nesting.
adilger [Tue, 13 Aug 2002 21:53:30 +0000 (21:53 +0000)]
Minor fixups for error handling case.
adilger [Tue, 13 Aug 2002 21:42:35 +0000 (21:42 +0000)]
Clean up bulk descriptor refcounting now that we do not have both
callback and non-callback cases. Fix comments to reflect current reality.
pschwan [Tue, 13 Aug 2002 15:32:25 +0000 (15:32 +0000)]
Added a chaos12 patch, with all updates
adilger [Mon, 12 Aug 2002 22:43:48 +0000 (22:43 +0000)]
Miscellaneous minor changes as a result of code audit.
- write lastino to disk in little-endian format (still needs to be written
to disk for each update, so we don't try to re-allocate existing objects,
see bug #594147).
- don't call class_conn2export() just to check "conn" validity, and then call
class_conn2obd() to get the obd - just use obd to determine conn validity.
pschwan [Mon, 12 Aug 2002 22:25:53 +0000 (22:25 +0000)]
Add an echo_cleanup, for the namespace
pschwan [Mon, 12 Aug 2002 21:57:43 +0000 (21:57 +0000)]
- James Newsome's dlm stress test
- Small warning fix in filter.c
pschwan [Mon, 12 Aug 2002 21:26:05 +0000 (21:26 +0000)]
b=585183
We weren't telling the MDS what kind of unlink we were doing (unlink vs.
rmdir), so, for example, if you called rmdir() on a file, the MDS would
remove it and then the client VFS would return -ENOTDIR. Not so good.
We send a 'mode' flag along with the unlink request now, that must be one of
S_IFDIR or S_IFREG.
I also fixed some unaligned structures in the MDS protocol, so if you update
one node you must UPDATE THEM ALL.
Minutiae:
- in the intent policy function, if mds_reint returns EISDIR or ENOTDIR, still
go ahead and send back the file attributes
- in mds_reint_unlink, use the mode sent over the wire instead of the actual
inode mode to determine which vfs unlink function to call
shaver [Mon, 12 Aug 2002 19:59:59 +0000 (19:59 +0000)]
I know, let's actually set the desc field, since we're going to use it later.
shaver [Mon, 12 Aug 2002 19:53:27 +0000 (19:53 +0000)]
First steps at getting recovery back off the ground:
* make the callback data parameter to brw functions be strongly typed as
cb_io_data. LOV and other non-OSC users of these facilities should
"inherit" from this struct, see lov_callback_data for an example.
* replace l_wait_event_killable and some wait_event calls with
l_wait_event, an all-singing, all-dancing timeout- and interrupt-handling
event waiting macro. More such replacement to come.
* interrupt and timeout handling of bulk data will probably crash at present,
but it didn't really work before either -- I'll fix it up ASAP.
pschwan [Mon, 12 Aug 2002 19:45:02 +0000 (19:45 +0000)]
New patches for 2.4.18 and 2.4.18-um (2.4.18-chaos to follow):
- Fixes, or at least makes no worse, a race condition in open()
- Fixes a handful of lookup_intent initialization bugs that eventually lead
to LBUGs trying to free nonexistent locks (b=593739)
braam [Mon, 12 Aug 2002 02:45:32 +0000 (02:45 +0000)]
More documentation:
- object API
- first three flow charts
- lock api
braam [Sun, 11 Aug 2002 08:15:46 +0000 (08:15 +0000)]
- change I/O to use a pagearray
- implement remaining striping function in LOV:
- read/write
- locking
- truncate
- minor protocol cleanup for MDS
- change documentation to include design / architecture / manual /
appendix parts
- add design documents:
- management api
- network format
-
gord-fig [Sat, 10 Aug 2002 19:26:10 +0000 (19:26 +0000)]
Silence remaining 4 warnings.
gord-fig [Sat, 10 Aug 2002 18:30:39 +0000 (18:30 +0000)]
Back out for a different solution.
gord-fig [Sat, 10 Aug 2002 17:34:56 +0000 (17:34 +0000)]
Clean up extN patch and dist rules.
adilger [Sat, 10 Aug 2002 10:43:59 +0000 (10:43 +0000)]
Commit prototype changes in headers.
adilger [Sat, 10 Aug 2002 10:06:44 +0000 (10:06 +0000)]
Another list_add() abuse.
adilger [Sat, 10 Aug 2002 10:05:34 +0000 (10:05 +0000)]
Do proper setup/cleanup of MDS exports and client data.
This also changes the behaviour of MDS connections so that the export
and client data is set up immediately at connect time, rather than at
"getinfo" time. I am ASSUMING that at recovery time the client does
another connect to the MDS, or that some other mechanism is in place
so that it will get the correct export back (it looks like this is
correct, but I didn't follow the whole code path through for recovery).
I was torn on whether to zap the on-disk MDS client record in the case
where it does a proper disconnect. In the end I decided against it,
because it was too difficult to pass a parameter to the mds_disconnect()
call telling whether we should zap or not. We don't want to change the
disk if there is some error in restarting after a failure or if we are
forcibly shutting down the MDS, but only on a clean disconnect by the
client.
So far, the only potential harm that comes from not doing the zapping
of the client record is that we get an (empty) export for each client
that shut down cleanly (and was not overwritten) on the last MDS incarnation.
On the following MDS incarnation this client export will be dropped
because the incarnation number is too low (assuming it remains unused).
Another thing of note is that we pass "struct file *" back to the client
upon open, and dereference this at close time. We need to move this
into the export struct and pass a cookie to the client (and validate
the cookie) instead of (or in addition to) passing the pointer directly.
This is needed for recovery of the client open state anyways...
adilger [Sat, 10 Aug 2002 09:50:12 +0000 (09:50 +0000)]
Fix LDLM namespace leak in client_obd_connect() if there is an error
connecting (i.e. bad UUID given for the target).
Fix semaphore imbalance in client_obd_disconnect() if we try to disconnect
an already disconnected device.
adilger [Sat, 10 Aug 2002 09:46:25 +0000 (09:46 +0000)]
Use list_heads properly. In a few places we were using them incorrectly:
BAD: list_add(&list_item, list_head.prev); Overwrites the adjacent field.
GOOD: list_add_tail(&list_item, list_head);
While the following usage is technically correct, it does encourage people
to use the above (incorrect) usage pattern, so it should be avoided:
BAD: list_add(&list_item, list_head.next);
GOOD: list_add_tail(&list_item, &list_head);
Just FYI - list_add() is like a stack (LIFO), while list_add_tail() is like a queue
(FIFO), when iterating over items via list_for_each().
adilger [Sat, 10 Aug 2002 09:38:46 +0000 (09:38 +0000)]
Make the UUIDs more unique by appending the hostname, otherwise the MDS
gets confused about exports when there are multiple clients.
Robert - the same thing (or better) needs to be done for lctl also.
rread [Fri, 9 Aug 2002 19:57:54 +0000 (19:57 +0000)]
- update sample uml.xml
rread [Fri, 9 Aug 2002 19:56:52 +0000 (19:56 +0000)]
- support mounting osc or lov
rread [Fri, 9 Aug 2002 19:55:07 +0000 (19:55 +0000)]
- ensure lovconfig returns 0 on success
gord-fig [Fri, 9 Aug 2002 03:03:54 +0000 (03:03 +0000)]
Remove lctl.h from distribution.
gord-fig [Thu, 8 Aug 2002 21:49:26 +0000 (21:49 +0000)]
Changebar support for lustre.pdf. Check out the old doc directory into doc/doc.old, then run `make chbar' to produce lustre-chbar.pdf.
pschwan [Thu, 8 Aug 2002 11:53:15 +0000 (11:53 +0000)]
It would be nice to properly type obd_enqueue() to avoid future bugs of this
nature, but the header dependency problem is tricky.
pschwan [Thu, 8 Aug 2002 11:22:35 +0000 (11:22 +0000)]
Fix stupid deadlock-causing mistake in llite file locking callback
pschwan [Thu, 8 Aug 2002 10:33:56 +0000 (10:33 +0000)]
My work goes more smoothly when I think.