Whamcloud - gitweb
fs/lustre-release.git
20 years agoReverted #974 for now as it causes problems for people.
green [Sun, 21 Dec 2003 10:26:13 +0000 (10:26 +0000)]
Reverted #974 for now as it causes problems for people.
Approved by Andreas Dilger.

20 years agoMake the namespace/resource/lock dumping somewhat more compact, so
phil [Sun, 21 Dec 2003 07:51:42 +0000 (07:51 +0000)]
Make the namespace/resource/lock dumping somewhat more compact, so
that less log space is wasted, and it's easier to visually scan.

20 years agob=2425
phil [Sun, 21 Dec 2003 07:46:39 +0000 (07:46 +0000)]
b=2425
Jacob reported that when MDS/OST recovery requires new objects to be
created, the OST throws an assertion.

Bug 2425 remains open to track the creation of many more tests for
missing MDS/OST recovery cases.

20 years agoRemove pesky $Id tag which only causes conflicts
phil [Sun, 21 Dec 2003 07:41:47 +0000 (07:41 +0000)]
Remove pesky $Id tag which only causes conflicts

20 years agob=2353
rread [Fri, 19 Dec 2003 19:45:29 +0000 (19:45 +0000)]
b=2353
r=shaver

Delete IOC_CONNECT,DISCONNECT and use obd_self_export instead
of creating connections for lctl. Also delete the IOC_DEVICE command
and make the ioctl interface stateless.  The lctl probe command is now
a noop, and lctl device is still used to set the device, although the
current device state is only saved in lctl now, and not the kernel.

20 years agob=2420: don't acquire a duplicate lock when processing a resent GETATTR, just
shaver [Fri, 19 Dec 2003 14:17:00 +0000 (14:17 +0000)]
b=2420: don't acquire a duplicate lock when processing a resent GETATTR, just
        grab the dchild directly and sample the data. Fixes recovery-small.sh.
r=phik,buffalo

20 years ago- tcp_sendpage_zccd() must be exported always
alex [Fri, 19 Dec 2003 11:16:11 +0000 (11:16 +0000)]
- tcp_sendpage_zccd() must be exported always

20 years agob=2383
phil [Thu, 18 Dec 2003 10:21:23 +0000 (10:21 +0000)]
b=2383
Stop taking a PR lock in mds_readpage; a PR is already held by the
client, so if there is a PW in the queue, deadlock will result.  Just
assume that the client has a lock.

20 years agoPrint the service name in the mds RECOVERY: message
phil [Thu, 18 Dec 2003 09:45:44 +0000 (09:45 +0000)]
Print the service name in the mds RECOVERY: message

20 years agob=2252
zab [Thu, 18 Dec 2003 04:13:42 +0000 (04:13 +0000)]
b=2252
r=adilger
(didn't see regressions in buffalo, confirmed read throughput increases
with sf and fpp multi-node IOR)

This cleans up llite's readpage path and implements our own read-ahead window
that hangs off of ll_file_data.  The broad goal is to keep a fair amount of
read-ahead pages issued and queued which can be fired off into read rpcs as
read-ahead rpcs are completed.

20 years ago- put llite page cache pages in a list_head for the duration
zab [Thu, 18 Dec 2003 03:59:08 +0000 (03:59 +0000)]
- put llite page cache pages in a list_head for the duration
  of their stay in the page cache.  This lets us display the contents
  of the page cache via llite/*/dump_pgcache file.  This was done as part
  of b=2252 and is being committed separately from the read-ahead work.

20 years agoSilence bogus compiler warning.
adilger [Wed, 17 Dec 2003 19:49:18 +0000 (19:49 +0000)]
Silence bogus compiler warning.

20 years agoWe can never hit the end of mds_finish_open() with a non-zero error code
adilger [Wed, 17 Dec 2003 19:48:16 +0000 (19:48 +0000)]
We can never hit the end of mds_finish_open() with a non-zero error code
because we exit early on error, so the mds_destroy_mfd() is bogus.  I left
RETURN(rc) in case things change in the future though.

We don't use request_body() anywhere inside mds_put_write_access(), but
since all of that code is just commented out I didn't do a real cleanup.
Just a bogus compiler warning fixed.

20 years agofile dir.c was initially added on branch b_eq.
ericm [Wed, 17 Dec 2003 13:42:47 +0000 (13:42 +0000)]
file dir.c was initially added on branch b_eq.

20 years ago- move the osc histogram helpers into lprocfs and rename accordingly
zab [Wed, 17 Dec 2003 00:04:24 +0000 (00:04 +0000)]
- move the osc histogram helpers into lprocfs and rename accordingly
- export brw histograms from the filter that record discontiguous offsets
  in the brw request and discontiguous blocks that satisfy the request
  (seen as /proc/fs/lustre/obdfilter/$name/brw_stats)

20 years ago- get rid of some ancient unused left-overs
zab [Tue, 16 Dec 2003 22:13:03 +0000 (22:13 +0000)]
- get rid of some ancient unused left-overs

20 years agor=zab,phil
green [Tue, 16 Dec 2003 17:46:23 +0000 (17:46 +0000)]
r=zab,phil
Fix for bug 974. Also adds a test to check for OOM (modified script from
bug 1135), fixes to sanity.sh's test 45 to obtain a grant (closes 2387).

20 years agob=1557/2316
phil [Tue, 16 Dec 2003 17:01:07 +0000 (17:01 +0000)]
b=1557/2316
Back out patch from bug 1557, because it causes the crash described in
bug 2316.

20 years ago- large kernel address space support against vanilla-2.4.22
alex [Mon, 15 Dec 2003 20:42:09 +0000 (20:42 +0000)]
- large kernel address space support against vanilla-2.4.22

20 years agofile sanity.c was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:33 +0000 (12:03 +0000)]
file sanity.c was initially added on branch b_eq.

20 years agofile echo_test.c was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:32 +0000 (12:03 +0000)]
file echo_test.c was initially added on branch b_eq.

20 years agofile Makefile.am was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:31 +0000 (12:03 +0000)]
file Makefile.am was initially added on branch b_eq.

20 years agofile test_lock_cancel.c was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:30 +0000 (12:03 +0000)]
file test_lock_cancel.c was initially added on branch b_eq.

20 years agofile test_common.h was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:29 +0000 (12:03 +0000)]
file test_common.h was initially added on branch b_eq.

20 years agofile test_common.c was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:28 +0000 (12:03 +0000)]
file test_common.c was initially added on branch b_eq.

20 years agofile replay_single.c was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:27 +0000 (12:03 +0000)]
file replay_single.c was initially added on branch b_eq.

20 years agofile recovery_small.c was initially added on branch b_eq.
ericm [Mon, 15 Dec 2003 12:03:26 +0000 (12:03 +0000)]
file recovery_small.c was initially added on branch b_eq.

20 years agoImplement saving of previous value of max_dirty_mb, as suggested by Andreas
green [Mon, 15 Dec 2003 10:36:15 +0000 (10:36 +0000)]
Implement saving of previous value of max_dirty_mb, as suggested by Andreas

20 years ago b: 2356
tianying [Mon, 15 Dec 2003 06:22:42 +0000 (06:22 +0000)]
 b: 2356
     r: Andreas and Phil
     To increase the mount count of mds.

20 years agochange debug_client_off from 0 to the minimal but still useful 0x3f0400
phil [Mon, 15 Dec 2003 06:14:25 +0000 (06:14 +0000)]
change debug_client_off from 0 to the minimal but still useful 0x3f0400

20 years ago- fix iopentest*.c to produce error messages with filenames
phil [Mon, 15 Dec 2003 04:39:38 +0000 (04:39 +0000)]
- fix iopentest*.c to produce error messages with filenames
- remove sanity test 55

20 years agoWhoops, the just-added test for #2319 was a bit flawed and failed for no good reason
green [Sun, 14 Dec 2003 22:05:30 +0000 (22:05 +0000)]
Whoops, the just-added test for #2319 was a bit flawed and failed for no good reason

20 years agor=shaver
green [Sun, 14 Dec 2003 21:39:09 +0000 (21:39 +0000)]
r=shaver
Fix for #2319: allocate osic separately and implement proper
refcounting for it.
Also adds a test to sanity.sh that checks for the (fixed) crash.

20 years agor=phik
green [Sun, 14 Dec 2003 17:42:07 +0000 (17:42 +0000)]
r=phik
fix for #2348

20 years ago- xattr-related fixes against chaos-2.4.21
alex [Sun, 14 Dec 2003 12:49:50 +0000 (12:49 +0000)]
- xattr-related fixes against chaos-2.4.21

20 years agofix "empty case at end of compound statement" warning in newer GCCs
phil [Sun, 14 Dec 2003 05:15:10 +0000 (05:15 +0000)]
fix "empty case at end of compound statement" warning in newer GCCs

20 years agochange default debug level to a more reasonable production setting
phil [Sun, 14 Dec 2003 03:59:16 +0000 (03:59 +0000)]
change default debug level to a more reasonable production setting

20 years agob=2371
phil [Sun, 14 Dec 2003 02:50:28 +0000 (02:50 +0000)]
b=2371
Updated the BUILDING file, to at least remove the lies, and point
people at more helpful documentation

20 years agoignore generated files
phil [Sat, 13 Dec 2003 06:10:30 +0000 (06:10 +0000)]
ignore generated files

20 years agob=2368
phil [Sat, 13 Dec 2003 04:28:44 +0000 (04:28 +0000)]
b=2368
fix a useless error message

20 years ago- chaos-2.4.21 series against 2.4.21-p4smp-12chaos
alex [Fri, 12 Dec 2003 17:11:48 +0000 (17:11 +0000)]
- chaos-2.4.21 series against 2.4.21-p4smp-12chaos

20 years agob=1792
wangchao [Fri, 12 Dec 2003 06:43:41 +0000 (06:43 +0000)]
b=1792
r=Chris

add sanity test for "iopen_connect_dentry() on already-connected dentry"

20 years agoFix path to include lctl (was already in PATH at LLNL).
adilger [Fri, 12 Dec 2003 01:38:33 +0000 (01:38 +0000)]
Fix path to include lctl (was already in PATH at LLNL).

20 years agoAllow sanityN.sh to run with a zconf-mounted setup.
adilger [Fri, 12 Dec 2003 00:20:51 +0000 (00:20 +0000)]
Allow sanityN.sh to run with a zconf-mounted setup.
Be more verbose about what the specific error is.

20 years agoMake ONLY=setup not do cleanup at the end, while we use replay-dual.sh as
adilger [Fri, 12 Dec 2003 00:17:08 +0000 (00:17 +0000)]
Make ONLY=setup not do cleanup at the end, while we use replay-dual.sh as
a proxy for mount2.sh.

20 years ago- silence trivial unused variable warning
zab [Fri, 12 Dec 2003 00:01:42 +0000 (00:01 +0000)]
- silence trivial unused variable warning

20 years agoAdd lock-order regression test.
adilger [Thu, 11 Dec 2003 22:33:40 +0000 (22:33 +0000)]
Add lock-order regression test.
b=1844

20 years ago- fix up rc = typo spotted by adilger
zab [Thu, 11 Dec 2003 20:06:24 +0000 (20:06 +0000)]
- fix up rc = typo spotted by adilger

20 years agob=2339
zab [Thu, 11 Dec 2003 19:04:49 +0000 (19:04 +0000)]
b=2339
filter_precreate() was setting the oid returned based on the last_id for the
requested object group, but was always creating objects in group 0 by virtue of
passing NULL in as the obdo to the _next_id functions.  In the process of
fixing this we stop NULLing out the obdo in the loop and get rid of the
_setattr() and obdo_from_inode() which are artifacts from when the client
performed obd_create().

Also some cleanup_phase beautification.

20 years agob:2316 Save the owner of f_op before replacing it with the llite special file operation
wangdi [Thu, 11 Dec 2003 08:30:50 +0000 (08:30 +0000)]
b:2316 Save the owner of f_op before replacing it with the llite special file operation

20 years agoa trivial fix to add description for lfs commands
wangchao [Thu, 11 Dec 2003 08:29:09 +0000 (08:29 +0000)]
a trivial fix to add description for lfs commands

20 years agob=1135
wangchao [Thu, 11 Dec 2003 02:19:12 +0000 (02:19 +0000)]
b=1135
r=Andreas

Add a regression test script to test OST out-of-space.

20 years ago- ignore write_disjoint
ccooper [Thu, 11 Dec 2003 00:01:27 +0000 (00:01 +0000)]
- ignore write_disjoint

20 years ago- kernel_text_address patch against chaos-2.4.18 series
alex [Wed, 10 Dec 2003 23:26:10 +0000 (23:26 +0000)]
- kernel_text_address patch against chaos-2.4.18 series

20 years ago- list_for_each_entry_safe(), list_move() and list_move_tail() have been added
alex [Wed, 10 Dec 2003 21:40:11 +0000 (21:40 +0000)]
- list_for_each_entry_safe(), list_move() and list_move_tail() have been added

20 years ago- list_for_each_entry() added
alex [Wed, 10 Dec 2003 19:10:15 +0000 (19:10 +0000)]
- list_for_each_entry() added

20 years agob: 1991
niu [Wed, 10 Dec 2003 10:13:55 +0000 (10:13 +0000)]
b: 1991
r: Peter

lfs catinfo <keyword>
Fetches log information from the client node. Keywords currently include
config and deletions; others will be added in the future.

20 years agob=2237
wangchao [Wed, 10 Dec 2003 09:51:51 +0000 (09:51 +0000)]
b=2237
A small fix. We should use 0 instead of 1 as the stripe_start parameter,
because OST numbering starts at 0. With only one OST, 1 will fail.
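
The off-by-one above can be sketched as a bounds check. This is only an illustration: the function name and the exact comparison are assumptions drawn from the statement that OST numbering starts at 0, not from the lfs source.

```c
#include <assert.h>

/* Hypothetical check: with zero-based numbering, valid OST indices run
 * from 0 to ost_count - 1, so a single-OST setup accepts only 0. */
int stripe_start_valid(unsigned int stripe_start, unsigned int ost_count)
{
    assert(ost_count > 0);      /* nothing to stripe over otherwise */
    return stripe_start < ost_count;
}
```

With one OST, `stripe_start_valid(0, 1)` succeeds and `stripe_start_valid(1, 1)` fails, which matches the failure described in the message.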

20 years agob=2237
wangchao [Wed, 10 Dec 2003 07:05:25 +0000 (07:05 +0000)]
b=2237
r=phil

lstripe should fail when offset > numobd

20 years agoDoing endian conversion on constant instead of variable according to andreas advices...
wangdi [Wed, 10 Dec 2003 03:23:30 +0000 (03:23 +0000)]
Do endian conversion on the constant instead of the variable, per Andreas's advice (bug 1989)
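
One common reading of this advice is that a constant can be byte-swapped at compile time, so a little-endian field can be compared without swapping it at run time. The sketch below assumes that reading; `EXAMPLE_MAGIC`, `SWAB32_CONST`, `CPU_TO_LE32_CONST`, `load_raw32` and `magic_matches` are hypothetical names, not identifiers from the Lustre tree.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical magic number, chosen only for illustration. */
#define EXAMPLE_MAGIC 0x10203040u

/* Byte swap written as a macro so the compiler can fold it for constants. */
#define SWAB32_CONST(x) ((uint32_t)(                \
        (((uint32_t)(x) & 0x000000ffu) << 24) |     \
        (((uint32_t)(x) & 0x0000ff00u) <<  8) |     \
        (((uint32_t)(x) & 0x00ff0000u) >>  8) |     \
        (((uint32_t)(x) & 0xff000000u) >> 24)))

/* Assumption: a compiler that does not define __BYTE_ORDER__ targets a
 * little-endian host. */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define CPU_TO_LE32_CONST(x) SWAB32_CONST(x)
#else
#define CPU_TO_LE32_CONST(x) ((uint32_t)(x))
#endif

/* Load four raw bytes without converting them. */
uint32_t load_raw32(const unsigned char *p)
{
    uint32_t v;

    assert(p != NULL);
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Compare a little-endian field against the converted constant; the
 * variable itself is never swapped. */
int magic_matches(uint32_t field_le)
{
    return field_le == CPU_TO_LE32_CONST(EXAMPLE_MAGIC);
}
```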

20 years agob=2230
zab [Wed, 10 Dec 2003 02:03:24 +0000 (02:03 +0000)]
b=2230
Allocation failures during heavy bulk IO load were causing timeouts.  Using
GFP_NOFS throughout lustre, and in particular instead of 0 as sk->allocation,
is our most recent attempt to appease the VM.  Make lots of noise if you see
allocation failures or deadlocks involving threads waiting for memory.

20 years agob: 1988
niu [Wed, 10 Dec 2003 01:51:01 +0000 (01:51 +0000)]
b: 1988
r: Andreas

Make log record alignment 8 bytes.
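
Aligning records to 8 bytes usually comes down to rounding each record length up to the next multiple of 8. A minimal sketch, assuming a hypothetical helper name `rec_len_align8` that does not come from the llog code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round a record length up to a multiple of 8.
 * Because 8 is a power of two, masking with ~7 clears the low three bits. */
size_t rec_len_align8(size_t len)
{
    assert(len <= SIZE_MAX - 7);    /* the addition below must not wrap */
    return (len + 7) & ~(size_t)7;
}
```

Lengths that are already a multiple of 8 are returned unchanged, so applying the helper twice is harmless.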

20 years agob: 2226
niu [Wed, 10 Dec 2003 01:36:14 +0000 (01:36 +0000)]
b: 2226
r: Phil

Remove all orphans on OST during MDS startup, and set last_id correctly.

20 years agoThe newly added "jt_llog_check" function was not declared here.
radhika [Tue, 9 Dec 2003 19:39:56 +0000 (19:39 +0000)]
The newly added "jt_llog_check" function was not declared here.

20 years ago- bring the filter_survey script up to date with recent lctl interface changes
zab [Tue, 9 Dec 2003 18:37:20 +0000 (18:37 +0000)]
- bring the filter_survey script up to date with recent lctl interface changes

20 years agob=2330
phil [Tue, 9 Dec 2003 16:26:16 +0000 (16:26 +0000)]
b=2330
Add sanity test #62 for obd_match error checking, to avoid regression

20 years agoadd llog_check, and add removal of the catalog's logs in llog_remove r:peter
wangdi [Tue, 9 Dec 2003 13:00:35 +0000 (13:00 +0000)]
add llog_check, and add removal of the catalog's logs in llog_remove r:peter

20 years agob=2284
wangchao [Tue, 9 Dec 2003 11:42:17 +0000 (11:42 +0000)]
b=2284
r=Robert
scsi support for dev_read_only

20 years agob=2284
wangchao [Tue, 9 Dec 2003 04:14:25 +0000 (04:14 +0000)]
b=2284
r=Robert
scsi support for dev_read_only

20 years agob=2321
phil [Mon, 8 Dec 2003 15:22:47 +0000 (15:22 +0000)]
b=2321
Fix two rare exit paths which will leak an l_lock() reference:
 - an allocation failure in ldlm_server_blocking_ast
 - an unlikely race condition in ldlm_resource_add_lock

I blame the latter for the problem reported in bug 2321.

20 years agoFix confusing MDC error message
phil [Mon, 8 Dec 2003 14:48:36 +0000 (14:48 +0000)]
Fix confusing MDC error message

20 years ago- bring the generic_hweight32 x86_64 insmod fix over from b_eq
zab [Fri, 5 Dec 2003 23:51:27 +0000 (23:51 +0000)]
- bring the generic_hweight32 x86_64 insmod fix over from b_eq

20 years agob=2330
zab [Fri, 5 Dec 2003 20:28:01 +0000 (20:28 +0000)]
b=2330
minor state cleanup from matching error return paths

20 years agob=2334
phil [Fri, 5 Dec 2003 17:46:41 +0000 (17:46 +0000)]
b=2334
A slight reorganization of ll_intent_release, so we can drop the MDS
lock early.

20 years agob=2334
phil [Fri, 5 Dec 2003 15:18:18 +0000 (15:18 +0000)]
b=2334
Break cyclic locking deadlock by dropping the MDC read lock before we
take the OSC read lock during getattr intents

20 years agob=1897: use the rpcd to send closes, so that we can resend in the case of a
shaver [Fri, 5 Dec 2003 14:45:23 +0000 (14:45 +0000)]
b=1897: use the rpcd to send closes, so that we can resend in the case of a
        reconnect after user interruption, and avoid leaking an open-count.

        Also, allocate repmsg _before_ reconstructing a close into it.
r=phik

20 years agob=2313
phil [Fri, 5 Dec 2003 11:49:50 +0000 (11:49 +0000)]
b=2313
My fix to bug 2313 accidentally created a lot of noise by returning
non-zero return codes when multiple clients had a file open for write.

20 years agoI am very stupid. I put the extra debugging code in the wrong path.
phil [Fri, 5 Dec 2003 11:01:10 +0000 (11:01 +0000)]
I am very stupid.  I put the extra debugging code in the wrong path.

20 years agob=2306
phil [Fri, 5 Dec 2003 09:31:53 +0000 (09:31 +0000)]
b=2306
r=alex
Replace i_sem with BKL in ext3_fsfilt_write_record

20 years agob=2333
phil [Fri, 5 Dec 2003 05:35:10 +0000 (05:35 +0000)]
b=2333
Fix i_sem/journal inversion in mds_client_add, which was never updated
when we decided to re-order these a few months ago.  This became much
easier to hit after we fixed bug 2306.

20 years agob=2330
phil [Fri, 5 Dec 2003 05:33:12 +0000 (05:33 +0000)]
b=2330
Be more careful about the return codes from obd_match, lest we try to
cancel a lock which was never granted.

20 years agob=1505
phil [Fri, 5 Dec 2003 03:20:24 +0000 (03:20 +0000)]
b=1505
r=shaver
Print a much more meaningful error when a client is rejected because a
service node is waiting for recoverable clients.

20 years agob=2313
phil [Fri, 5 Dec 2003 03:18:46 +0000 (03:18 +0000)]
b=2313
r=shaver
This bug happens when a file is opened twice for write, then both close it
at the same time.  If they both drop the writecount, then race to
compare it against 0, one will free the fsdata and the other will assert.

This looks like a big patch, but it's mostly plumbing.  I had to do some
different argument passing, in order to keep everything protected under the
same lock.

I removed the writecount spinlock, and use the epoch semaphore for all three
things: management of the epoch, protection of the writecount, and atomicity of
writecount modifications which result in allocation or freeing of the fsdata.
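
The race described above goes away when the decrement, the comparison against 0, and the free all happen under one lock. The following is a user-space sketch of that idea only: `struct file_state` and `put_write_access` are hypothetical names, and a pthread mutex stands in for the epoch semaphore.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-file state discussed above. */
struct file_state {
    pthread_mutex_t lock;   /* plays the role of the epoch semaphore */
    int writecount;         /* number of writers holding the file open */
    void *fsdata;           /* freed when the last writer closes */
};

/* Drop one write reference. Only one closer can observe the count
 * reaching zero, because the test happens under the same lock as the
 * decrement. Returns 1 if this call freed fsdata, 0 otherwise. */
int put_write_access(struct file_state *fs)
{
    int freed = 0;

    pthread_mutex_lock(&fs->lock);
    assert(fs->writecount > 0);
    if (--fs->writecount == 0) {
        free(fs->fsdata);
        fs->fsdata = NULL;
        freed = 1;
    }
    pthread_mutex_unlock(&fs->lock);
    return freed;
}
```

With two writers, the first close returns 0 and leaves `fsdata` alone; the second returns 1 and frees it.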

20 years agob=2313
phil [Fri, 5 Dec 2003 03:15:19 +0000 (03:15 +0000)]
b=2313
r=shaver
This bug happens when a file is opened twice for write, then both close it
at the same time.  If they both drop the writecount, then race to
compare it against 0, one will free the fsdata and the other will assert.

This looks like a big patch, but it's mostly plumbing. I had to do some
different argument passing, in order to keep everything protected under the
same lock.

I removed the writecount spinlock, and use the epoch semaphore for all three
things: management of the epoch, protection of the writecount, and atomicity of
writecount modifications which result in allocation or freeing of the fsdata.

20 years ago- suse-2.4.21 builds on x86_64 now
alex [Thu, 4 Dec 2003 20:17:51 +0000 (20:17 +0000)]
- suse-2.4.21 builds on x86_64 now

20 years ago- more error checking and less verbosity for insanity
rread [Thu, 4 Dec 2003 20:12:45 +0000 (20:12 +0000)]
- more error checking and less verbosity for insanity
- fixed shell brainos preventing client failures from working

20 years agob=2329: move osc_rpcd into ptlrpc as ptlrpcd, for non-OSC applications. Largely
shaver [Thu, 4 Dec 2003 20:07:30 +0000 (20:07 +0000)]
b=2329: move osc_rpcd into ptlrpc as ptlrpcd, for non-OSC applications.  Largely
        mechanical, plus a tiny Makefile.am cleanup in ptlrpc.
r=zab

20 years ago- tcp_sendpage_zccd() must be exported always
alex [Thu, 4 Dec 2003 17:58:26 +0000 (17:58 +0000)]
- tcp_sendpage_zccd() must be exported always

20 years ago- tcp_sendpage_zccd() must be exported always
alex [Thu, 4 Dec 2003 17:48:22 +0000 (17:48 +0000)]
- tcp_sendpage_zccd() must be exported always

20 years ago* Added ENOMEM detection and retry on socknal sends
eeb [Thu, 4 Dec 2003 14:00:06 +0000 (14:00 +0000)]
*   Added ENOMEM detection and retry on socknal sends
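
Detect-and-retry on ENOMEM can be sketched as a bounded loop around the send call. Everything here is an assumption made for illustration: the callback type, the retry limit, and the `flaky_send` demo are not taken from the socknal code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical send callback: returns 0 on success or a negative errno. */
typedef int (*send_fn)(void *ctx);

/* Retry only when the send reports -ENOMEM; any other result is final. */
int send_with_retry(send_fn send, void *ctx, int max_tries)
{
    int rc = -ENOMEM;

    assert(max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        rc = send(ctx);
        if (rc != -ENOMEM)
            break;
        /* A real implementation would back off or reschedule here. */
    }
    return rc;
}

/* Demo callback: ctx points to an int holding how many more times the
 * send should pretend to run out of memory. */
int flaky_send(void *ctx)
{
    int *failures_left = ctx;

    if (*failures_left > 0) {
        (*failures_left)--;
        return -ENOMEM;
    }
    return 0;
}
```

If the limit is reached while the send still fails, the caller sees `-ENOMEM` and can decide what to do next.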

20 years agor: TianYing
niu [Thu, 4 Dec 2003 08:18:25 +0000 (08:18 +0000)]
r: TianYing

Close all opened logs before the user changes logs via lctl, and reopen
them after the lctl operation finishes.

20 years ago- insanity cleanups.
rread [Thu, 4 Dec 2003 05:25:40 +0000 (05:25 +0000)]
- insanity cleanups.
 - Call the right function to shutdown the osts
 - just sleep when powering off the machine.
 - use checkstat, instead of ls -ld

20 years agoforce D_OTHER on for the duration of the ldlm_namespace_dump, then restore
phil [Thu, 4 Dec 2003 04:55:28 +0000 (04:55 +0000)]
force D_OTHER on for the duration of the ldlm_namespace_dump, then restore
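
Forcing a debug flag on and then restoring is a save/modify/restore pattern. In the sketch below the flag value, the mask variable, and both function names are hypothetical; the real D_OTHER bit and debug mask are not shown in this log.

```c
#include <assert.h>

/* Hypothetical flag value and mask variable, for illustration only. */
#define EXAMPLE_D_OTHER 0x20u
unsigned int example_debug_mask;

/* Turn `flag` on and return the previous mask so it can be restored. */
unsigned int debug_force_on(unsigned int flag)
{
    unsigned int saved = example_debug_mask;

    assert(flag != 0);
    example_debug_mask |= flag;
    return saved;
}

/* Put back exactly what was there before the dump. */
void debug_restore(unsigned int saved)
{
    example_debug_mask = saved;
}
```

Restoring the saved mask, rather than clearing the flag, keeps the flag set if the administrator had already enabled it before the dump.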

20 years agob=2328
phil [Thu, 4 Dec 2003 04:41:32 +0000 (04:41 +0000)]
b=2328
r=shaver
- Make sure that all locks which have been marked as receiving a blocking AST
  are eventually added to the waiting list, to evict badly-behaving clients.
- If a service node times out waiting for a lock, dump the namespace
  to the log, no more than once every 5 minutes
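
The "no more than once every 5 minutes" rule is a simple rate limit. A minimal sketch, assuming a hypothetical helper that is handed the current time and remembers the time of the last dump:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define DUMP_MIN_INTERVAL 300   /* seconds: 5 minutes, per the text above */

/* Hypothetical helper: returns 1 if a dump may happen at `now` and
 * records it, or 0 if the previous dump was too recent.
 * *last == 0 means no dump has happened yet. */
int dump_allowed(time_t now, time_t *last)
{
    assert(last != NULL);
    if (*last != 0 && now - *last < DUMP_MIN_INTERVAL)
        return 0;
    *last = now;
    return 1;
}
```

Passing the time in as a parameter keeps the check easy to exercise without waiting five minutes.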

20 years agowhen kernel_thread fails, print the return code instead of 0 (or nothing)
phil [Thu, 4 Dec 2003 02:56:32 +0000 (02:56 +0000)]
when kernel_thread fails, print the return code instead of 0 (or nothing)

20 years agob: 2226
niu [Thu, 4 Dec 2003 01:10:02 +0000 (01:10 +0000)]
b: 2226
r: Phil

Set the correct last object id when cleaning up orphans during MDS setup.

20 years ago- fix against wrong lock order in fsfilt_ext3_write_record()
alex [Wed, 3 Dec 2003 22:35:57 +0000 (22:35 +0000)]
- fix against wrong lock order in fsfilt_ext3_write_record()
  bug 2306

20 years agoInstrumentation for reproducing and verifying 1897 (open-count leaked if close
shaver [Wed, 3 Dec 2003 21:39:49 +0000 (21:39 +0000)]
Instrumentation for reproducing and verifying 1897 (open-count leaked if close
is interrupted on the client). r=robert.

20 years agofile llmount-upcall.sh was initially added on branch b_devel.
shaver [Wed, 3 Dec 2003 21:24:55 +0000 (21:24 +0000)]
file llmount-upcall.sh was initially added on branch b_devel.

20 years agob=2322
phil [Wed, 3 Dec 2003 10:58:26 +0000 (10:58 +0000)]
b=2322
In ldlm_process_{plain,extent}_lock, we used to remove and re-add the
lock to the waiting list after every -ERESTART loop.  But because of
the logic in the ldlm_*_compat_queue functions, in a very rare case,
this could lead to lock re-ordering and subsequent deadlock.