Whamcloud - gitweb
Theodore Ts'o [Sat, 9 Aug 2014 16:24:54 +0000 (12:24 -0400)]
libext2fs: avoid buffer overflow if s_first_meta_bg is too big
If s_first_meta_bg is greater than the of number block group
descriptor blocks, then reading or writing the block group descriptors
will end up overruning the memory buffer allocated for the
descriptors. Fix this by limiting first_meta_bg to no more than
fs->desc_blocks. This doesn't correct the bad s_first_meta_bg value,
but it avoids causing the e2fsprogs userspace programs from
potentially crashing.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Fri, 8 Aug 2014 20:42:05 +0000 (16:42 -0400)]
libext2fs: have UNIX IO manager use pread64/pwrite64
Commit
baa3544609da3c ("libext2fs: have UNIX IO manager use
pread/pwrite) causes a breakage on 32-bit systems where off_t is
32-bits for file systems larger than 4GB. Fix this by using
pread64/pwrite64 if possible, and if pread64/pwrite64 is not present,
using pread/pwrite only if the size of off_t is at least as big as
ext2_loff_t.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Aaron Crane [Mon, 4 Aug 2014 01:54:14 +0000 (21:54 -0400)]
debugfs: teach rdump to take multiple source arguments
[ modified to update man page by tytso ]
Signed-off-by: Aaron Crane <arc@aaroncrane.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Aaron Crane [Mon, 4 Aug 2014 01:53:24 +0000 (21:53 -0400)]
debugfs: refactor do_rdump()
No behaviour changes. This will simplify the next commit.
Signed-off-by: Aaron Crane <arc@aaroncrane.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Aaron Crane [Mon, 4 Aug 2014 01:52:11 +0000 (21:52 -0400)]
debugfs: fix double-close bug in "rdump" and "dump -p"
Previously, both of these usages called dump_file() with a true value as
the "preserve" argument, which caused it to in turn call fix_perms() to
make the permissions on the locally-dumped file match those found on the
ext2 filesystem. fix_perms() then attempted to close(2) the file descriptor
(if any) before returning (though it didn't attempt to report on any errors
found while doing so).
However, in both of these situations, the local file being dumped had been
opened by the caller of dump_file(), which also closes it (and reports on
any errors detected when closing). This meant that both "rdump" and "dump
-p" would then emit a spurious EBADF message when trying to re-close the
local file descriptor.
Deleting the spurious close(2) call in fix_perms() fixes the problem in both
commands.
Signed-off-by: Aaron Crane <arc@aaroncrane.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Aaron Crane [Mon, 4 Aug 2014 01:51:04 +0000 (21:51 -0400)]
debugfs: be more specific in error messages
Signed-off-by: Aaron Crane <arc@aaroncrane.co.uk>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 3 Aug 2014 18:00:47 +0000 (14:00 -0400)]
libext2fs: place metadata blocks in the last flex_bg so they are contiguous
Place the allocation bitmaps and inode table blocks so they are
adjacent, even in the last flexbg.
Previously, after running "mke2fs -t ext4 DEV 286720", the layout of
the last few block groups would look like this:
Group 32: (Blocks 262145-270336) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262145 (+0), Inode bitmap at 262161 (+16)
Inode table at 262177-262432 (+32)
Group 33: (Blocks 270337-278528) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 262146 (bg #32 + 1), Inode bitmap at 262162 (bg #32 + 17)
Inode table at 262433-262688 (bg #32 + 288)
Group 34: (Blocks 278529-286719) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262147 (bg #32 + 2), Inode bitmap at 262163 (bg #32 + 18)
Inode table at 262689-262944 (bg #32 + 544)
Now, they look like this:
Group 32: (Blocks 262145-270336) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262145 (+0), Inode bitmap at 262148 (+3)
Inode table at 262151-262406 (+6)
Group 33: (Blocks 270337-278528) [INODE_UNINIT, BLOCK_UNINIT, ITABLE_ZEROED]
Block bitmap at 262146 (bg #32 + 1), Inode bitmap at 262149 (bg #32 + 4)
Inode table at 262407-262662 (bg #32 + 262)
Group 34: (Blocks 278529-286719) [INODE_UNINIT, ITABLE_ZEROED]
Block bitmap at 262147 (bg #32 + 2), Inode bitmap at 262150 (bg #32 + 5)
Inode table at 262663-262918 (bg #32 + 518)
This reduces the free space fragmentation in a freshly created file
system. It also allows the following mke2fs command to succeed:
mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
(Note that while this allows people to run mke2fs with insanely large
flexbg sizes, this is not a recommended practice, as the kernel may
refuse to resize such a file system while mounted, since it currently
tries to allocate an in-memory data structure based on the size of the
flexbg, and so a file system with a very large flexbg size will cause
the memory allocation to fail. This will hopefully be fixed in a
future kernel release, but if the goal is to force all of the metadata
blocks to be at the beginning of the file system, it's better to use
the packed_meta_blocks configuration parameter in mke2fs.conf.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 3 Aug 2014 16:22:27 +0000 (12:22 -0400)]
Revert "mke2fs: prevent creation of unmountable ext4 with large flex_bg count"
This reverts commit
d988201ef9cb6f7b521e544061976ab4270a3f89.
The problem with this commit is that causes common small file system
configurations to fail. For example:
mke2fs -O flex_bg -b 4096 -I 1024 -F /tmp/tt 79106
mke2fs 1.42.11 (09-Jul-2014)
/tmp/tt: Invalid argument passed to ext2 library while setting
up superblock
This check in ext2fs_initialize() was added to prevent the metadata
from being allocated beyond the end of the filesystem, but it is
also causing a wide range of failures for small filesystems.
We'll address this in a different way, by using a smarter algorithm
for deciding the layout of metadata blocks for the last flex block
group.
Reported-by: Andreas Dilger <andreas.dilger@intel.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Artemiy Volkov [Sat, 2 Aug 2014 23:53:04 +0000 (19:53 -0400)]
debugfs: fix argument parsing in do_freefrag()
When do_freefrag() is called from debugfs, the value of optind is
not reset. Rectify that by calling reset_getopt().
Signed-off-by: Artemiy Volkov <artemiyv@acm.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 2 Aug 2014 23:43:10 +0000 (19:43 -0400)]
misc: fix Makefile for profiled build
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Thu, 1 May 2014 23:15:05 +0000 (16:15 -0700)]
libext2fs: when appending to a file, don't split an index block in equal halves
When we're appending an extent to the end of a file and the index
block is full, don't split the index block into two half-full index
blocks because this leaves us with under utilized index blocks, at
least in the fallocate case. Instead, copy the last extent from the
full block into the new block. This isn't perfect utilization, but
there's a lot of work involved in teaching extent.c to be able to goto
a nonexistent node in a newly allocated (and empty) extent block.
This patch does not fix the general problem of keeping the extent tree
balanced.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Sat, 2 Aug 2014 23:18:03 +0000 (19:18 -0400)]
libext2fs: have UNIX IO manager use pread/pwrite
If pread/pwrite are present, have the UNIX IO manager use them for
aligned IOs (instead of the current seek -> read/write), thereby
saving us a (minor) amount of system call overhead.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Wed, 30 Jul 2014 20:25:49 +0000 (14:25 -0600)]
filefrag: minor code fixes and cleanups
Print filefrag_fiemap() error message to stderr instead of stdout.
Only call ioctl(EXT3_IOC_GETFLAGS) for ext{2,3,4} filesystems to
decide if the ext2 indirect block allocation heuristic shold be used.
Properly handle the the force_bmap (-B) option.
Exit with a positive error number instead of a negative one.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Tue, 29 Jul 2014 23:30:40 +0000 (17:30 -0600)]
tests: fix f_badcluster output formatting
The f_badcluster output format depends on how libreadline formats
and outputs the commands read from stdin. Instead of trying to
handle these differences, use an input command file, which does
not depend on external components to be consistent.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Tue, 29 Jul 2014 23:30:39 +0000 (17:30 -0600)]
misc: quiet signed/unsigned charactr compiler warnings
Quiet warnings about signed vs. unsigned character mismatch.
Use __u8 for storing UUIDs instead of char to match the superblock
s_uuid field.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 31 Jul 2014 15:49:48 +0000 (11:49 -0400)]
tune2fs: fix uninitialized variable in remove_journal_device
This bug was introduced by commit
7dfefaf413bbd ("tune2fs: update
journal super block when changing UUID for fs").
Fixes-Coverity-Bug: 1229243
Reported-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Azat Khuzhin [Mon, 28 Jul 2014 07:43:25 +0000 (11:43 +0400)]
tune2fs: update journal users while updating fs UUID (with external journal)
When we have fs with external journal device, and updating it's UUID, we
should update UUID in users list for that external journal device.
Before:
$ tune2fs -U clear /tmp/dev
tune2fs 1.42.10 (18-May-2014)
$ dumpe2fs /tmp/dev | fgrep UUID
dumpe2fs 1.42.10 (18-May-2014)
Filesystem UUID: <none>
Journal UUID:
da1f2ed0-60f6-aaaa-92fd-
738701418523
$ dumpe2fs /tmp/journal | fgrep users -A10
dumpe2fs 1.42.10 (18-May-2014)
Journal number of users: 2
Journal users:
0707762d-638e-4bc6-944e-
ae8ee7a3359e
0ad849df-1041-4f0a-b1c1-
2f949d6a1e37
After:
$ sudo tune2fs -U clear /tmp/dev
tune2fs 1.43-WIP (18-May-2014)
$ dumpe2fs /tmp/dev | fgrep UUID
dumpe2fs 1.42.10 (18-May-2014)
Filesystem UUID: <none>
Journal UUID:
da1f2ed0-60f6-aaaa-92fd-
738701418523
$ dumpe2fs /tmp/journal | fgrep users -A10
dumpe2fs 1.42.10 (18-May-2014)
Journal number of users: 2
Journal users:
0707762d-638e-4bc6-944e-
ae8ee7a3359e
00000000-0000-0000-0000-
000000000000
Also add some consts to avoid *magic numbers*:
- UUID_STR_SIZE
- UUID_SIZE
- JFS_USERS_MAX
- JFS_USERS_SIZE
Proposed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Azat Khuzhin [Mon, 28 Jul 2014 07:43:24 +0000 (11:43 +0400)]
tune2fs: update journal super block when changing UUID for fs.
Using -U option you can change the UUID for fs, however it will not work
for journal device, since it have a copy of this UUID inside jsb (i.e.
journal super block). So copy UUID on change into that block.
Here is the initial thread:
http://comments.gmane.org/gmane.comp.file-systems.ext4/44532
You can reproduce this by executing following commands:
$ fallocate -l100M /tmp/dev
$ fallocate -l100M /tmp/journal
$ sudo /sbin/losetup /dev/loop1 /tmp/dev
$ sudo /sbin/losetup /dev/loop0 /tmp/journal
$ mke2fs -O journal_dev /tmp/journal
$ tune2fs -U
da1f2ed0-60f6-aaaa-92fd-
738701418523 /tmp/journal
$ sudo mke2fs -t ext4 -J device=/dev/loop0 /dev/loop1
$ dumpe2fs -h /tmp/dev | fgrep UUID
dumpe2fs 1.43-WIP (18-May-2014)
Filesystem UUID:
8a776be9-12eb-411f-8e88-
b873575ecfb6
Journal UUID:
e3d02151-e776-4865-af25-
aecb7291e8e5
$ sudo e2fsck /dev/vdc
e2fsck 1.43-WIP (18-May-2014)
External journal does not support this filesystem
/dev/loop1: ********** WARNING: Filesystem still has errors **********
Reported-by: Chin Tzung Cheng <chintzung@gmail.com>
Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Azat Khuzhin [Mon, 28 Jul 2014 07:43:23 +0000 (11:43 +0400)]
tune2fs: remove_journal_device(): use the correct block to find jsb
Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Azat Khuzhin [Mon, 28 Jul 2014 07:43:22 +0000 (11:43 +0400)]
journal: use consts instead of 1024 and add helper for journal with 1k blocksize
Use EXT2_MIN_BLOCK_SIZE, JFS_MIN_JOURNAL_BLOCKS, SUPERBLOCK_SIZE, and
SUPERBLOCK_OFFSET instead of hardcoded 1024 when it is okay, and also
add a helper ext2fs_journal_sb_start() that will return start of
journal sb with special case for fs with 1k block size.
Signed-off-by: Azat Khuzhin <a3at.mail@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Mon, 28 Jul 2014 19:37:03 +0000 (15:37 -0400)]
tests: add the f_badcluster test
This should have been part of commit
9a1d614df21 ("e2fsck: fix
rule-violating lblk->pblk mappings on bigalloc filesystems") but it
accidentally got dropped when the patch was applied.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Rakesh Pandit [Mon, 28 Jul 2014 00:04:48 +0000 (20:04 -0400)]
filefrag: fix block size value
ioctl(FIGETBSZ) was used to get block size earlier but
2508eaa7
(filefrag: improvements to filefrag FIEMAP handling) moved to fstatfs
f_bsize which doesn't work well for many files systems.
Block size returned using fstatfs isn't block size but "optimal
transfer block size" as per man page. Even stat st_blksize is
"preferred I/O block size" and in may file systems it may even vary
from file to file (POSIX). This patch changes filefrag to use
FIGETBSZ preferentially over f_bsize.
[ Modified by tytso to add the fallback to f_bsize if FIGETBSZ fails
for some reason ]
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Rakesh Pandit [Sun, 27 Jul 2014 23:56:27 +0000 (19:56 -0400)]
filefrag: fix -B option and extents calculation for FIBMAP
29758d2 broke -B option which is useful for filesystems not supporting
FIEMAP. Also, fix extents calculation for -B which is broken since
2508eaa7.
Signed-off-by: Rakesh Pandit <rakesh@tuxera.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Sat, 26 Jul 2014 20:28:58 +0000 (16:28 -0400)]
e2fsck: during pass1b delete_file, only free a cluster once
If we're forced to delete a crosslinked file, only call
ext2fs_block_alloc_stats2() on cluster boundaries, since the block
bitmaps are all cluster bitmaps at this point. It's safe to do this
only once per cluster since we know all the blocks are going away.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Sat, 26 Jul 2014 00:34:04 +0000 (17:34 -0700)]
e2fsck: fix rule-violating lblk->pblk mappings on bigalloc filesystems
As far as I can tell, logical block mappings on a bigalloc filesystem are
supposed to follow a few constraints:
* The logical cluster offset must match the physical cluster offset.
* A logical cluster may not map to multiple physical clusters.
Since the multiply-claimed block recovery code can be used to fix these
problems, teach e2fsck to find these transgressions and fix them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Sat, 26 Jul 2014 00:33:57 +0000 (17:33 -0700)]
e2fsck: perform implied cluster allocations when filling a directory hole
If we're filling a directory hole, we need to perform an implied
cluster allocation to satisfy the bigalloc rule of mapping only one
pblk to a logical cluster.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Sat, 26 Jul 2014 00:33:45 +0000 (17:33 -0700)]
e2fsck: reserve blocks for root/lost+found directory repair
If we think we're going to need to repair either the root directory or
the lost+found directory, reserve a block at the end of pass 1 to
reduce the likelihood of an e2fsck abort while reconstructing
root/lost+found during pass 3.
If / and/or /lost+found are corrupt and duplicate processing in pass
1b allocates all the free blocks in the FS, fsck aborts with an
unusable FS since pass 3 can't recreate / or /lost+found. If either
of those directories are missing, an admin can't easily mount the FS
and access the directory tree to move files off the injured FS and
free up space; this in turn prevents subsequent runs of e2fsck from
being able to continue repairs of the FS.
(One could migrate files manually with debugfs without the help of
path names, but it seems easier if users can simply mount the FS and
use regular FS management tools.)
[ Fixed up an obvious C trap: const char * and const char [] are not
the same thing when you are taking the size of the parameter.
People, run your regression tests! Like spinach, it's good for you. :-)
-- tytso ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Sat, 26 Jul 2014 18:34:56 +0000 (14:34 -0400)]
libext2fs: provide a function to set inode size
Provide an API to set i_size in an inode and take care of all required
feature flag modifications. Refactor the code to use this new
function.
[ Moved the function to lib/ext2fs/blk_num.c, which is the rest of
these sorts of functions live, and renamed it to be
ext2fs_inode_size_set() instead of ext2fs_inode_set_size() to be
consistent with the other functions in in blk_num.c -- tytso ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 26 Jul 2014 13:25:40 +0000 (09:25 -0400)]
libext2fs: fix free block accounting for 64-bit file systems
We rely on a nasty hack to adjust the free block count where we pass
signed value into ext2fs_free_blocks_count_add(), which takes an
64-bit unsigned value, and relies on overflow and C's signed->unsigned
semantics to do the subtraction. This works, so long as a 64-bit
signed value is used.
Unfortunately, ext2fs_block_alloc_stats2() and
ext2fs_block_alloc_stats_range(), this is not true, so on a 64-bit
file system, the free blocks accounting can get screwed up.
A simple way to demonstrate the problem is:
mke2fs -F -t ext4 -O 64bit /tmp/foo.img 1M
e2fsck -fy /tmp/foo.img
... which will result in the following e2fsck complaint:
Pass 5: Checking group summary information
Free blocks count wrong (
4294968278, counted=982).
Fix? yes
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 26 Jul 2014 11:40:36 +0000 (07:40 -0400)]
Fix 32/64-bit overflow when multiplying by blocks/clusters per group
There are a number of places where we need convert groups to blocks or
clusters by multiply the groups by blocks/clusters per group.
Unfortunately, both quantities are 32-bit, but the result needs to be
64-bit, and very often the cast to 64-bit gets lost.
Fix this by adding new macros, EXT2_GROUPS_TO_BLOCKS() and
EXT2_GROUPS_TO_CLUSTERS().
This should fix a bug where resizing a 64bit file system can result in
calculate_minimum_resize_size() looping forever.
Addresses-Launchpad-Bug: #1321958
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 26 Jul 2014 04:49:14 +0000 (00:49 -0400)]
libext2fs: use C99 initializers for the io_manager structure
Using C99 initializers makes the code a bit more readable, and it
avoids some gcc -Wall warnings regarding missing initializers.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Fri, 25 Jul 2014 17:38:50 +0000 (13:38 -0400)]
resize2fs: radically reduce memory utilization by using rbtree bitmaps
When resizing an empty 21T file system to 28T, resize2fs was using
this much CPU time and memory:
216.98user 19.77system 4:02.92elapsed 97%CPU (0avgtext+0avgdata 4485664maxresident)k
8inputs+1068680outputs (0major+800745minor)pagefaults 0swaps
After this one-line change:
222.29user 0.49system 3:48.79elapsed 97%CPU (0avgtext+0avgdata 30080maxresident)k
8inputs+1068552outputs (0major+2497minor)pagefaults 0swaps
So this reduces the max memory utilized from 4.2GB to 29MB!
For future work, the primary place where we are spending the most cpu
time (from resize2fs -d 16) are these two places:
blocks_to_move: Memory used: 2508k/25096k (1903k/606k), time: 91.42/91.53/ 0.00
and
calculate_summary_stats: Memory used: 2508k/25612k (1908k/601k), time: 95.33/95.45/ 0.00
The calculate_summary_stats pass can be sped up by using
ext2fs_find_first_{zero,set}_block_bitmap2(), instead of iterating
over the entire block bitmap one bit at a time.
The blocks_to_move pass can be sped up by using a bitmap to store the
location of fs metadata blocks, to avoid an O(N**2) algorithm where N
is the number of groups in the file system.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 26 Jul 2014 04:45:28 +0000 (00:45 -0400)]
libext2fs: fix rb_resize_bmap to handle the padding bits
The bits between end and real_end are set as a safety measure for the
kernel when it uses the bit scan instructions. We need to take this
into account when shrinking or growing the block allocation bitmap,
before we can safely use rbtree bitmaps in resize2fs.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sat, 26 Jul 2014 04:47:37 +0000 (00:47 -0400)]
tests: use e2fsck -f instead of -p for resize tests
Using e2sck -f provides better debugging information if things go
wrong.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Sat, 26 Jul 2014 01:43:08 +0000 (21:43 -0400)]
build: fix unused/uninitialized variable warnings
Fix a few warnings about unused and uninitialized variables.
Also fix util/subst.c to include <sys/time.h> to avoid using
undeclared functions gettimeofday() and futimes().
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:55:21 +0000 (15:55 -0700)]
e2fsck: clear uninit flag on directory extents
Directories can't have uninitialized extents, so offer to clear the
uninit flag when we find this situation. The actual directory blocks
will be checked in pass 2 and 3 regardless of the uninit flag.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 25 Jul 2014 12:41:11 +0000 (08:41 -0400)]
e2fsck: pass2 should not process directory blocks that are impossibly large
Currently, directories cannot be fallocated, which means that the only
way they get bigger is for the kernel to append blocks one by one.
Therefore, if we encounter a logical block offset that is too big, we
needn't bother adding it to the dblist for pass2 processing, because
it's unlikely to contain a valid directory block. The code that
handles extent based directories also does not add toobig blocks to
the dblist.
Note that we can easily cause e2fsck to fail with ENOMEM if we start
feeding it really large logical block offsets, as the dblist
implementation will try to realloc() an array big enough to hold it.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 25 Jul 2014 12:39:45 +0000 (08:39 -0400)]
e2fsck: always submit logical block 0 of a directory for pass 2
Always iterate logical block 0 in a directory, even if no physical
block has been allocated. Pass 2 will notice the lack of mapping and
offer to allocate a new directory block; this enables us to link the
directory into lost+found.
Previously, if there were no logical blocks mapped, we would fail to
pick up even block 0 of the directory for processing in pass 2. This
meant that e2fsck never allocated a block 0 and therefore wouldn't fix
the missing . and .. entries for the directory; subsequent e2fsck runs
would complain about (yet never fix) the problem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:54:30 +0000 (15:54 -0700)]
e2fsck: collapse holes in extent-based directories
If we notice a hole in the block map of an extent-based directory,
offer to collapse the hole by decreasing the logical block # of the
extent. This saves us from pass 3's inefficient strategy, which fills
the holes by mapping in a lot of empty directory blocks.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:54:15 +0000 (15:54 -0700)]
e2fsck: don't crash during rehash
If a user crafts a carefully constructed filesystem containing a
single directory entry block with an invalid checksum and fewer than
two entries, and then runs e2fsck to fix the filesystem, fsck will
crash when it tries to "compress" the short dir and passes a negative
dirent array length to qsort. Therefore, don't allow directory
"compression" in this situation.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:54:07 +0000 (15:54 -0700)]
misc: fix problems with strncat
The third argument to strncat is the maximum number of characters to
copy out of the second argument; it is not the maximum length of the
first argument.
Therefore, code in a check just in case we ever find a /sys/block/X
path long enough to hit the end of the buffer. FWIW the longest path
I could find on my machine was 133 bytes.
Fixes-Coverity-Bug: 1252003
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 25 Jul 2014 11:11:57 +0000 (07:11 -0400)]
libext2fs: fix bounds check of the bitmap test range in get_free_blocks2
In the loop in ext2fs_get_free_blocks2, we ask the bitmap if there's a
range of free blocks starting at "b" and ending at "b + num - 1".
That quantity is the number of the last block in the range. Since
ext2fs_blocks_count() returns the number of blocks and not the number
of the last block in the filesystem, the check is incorrect.
Put in a shortcut to exit the loop if finish > start, because in that
case it's obvious that we don't need to reset to the beginning of the
FS to continue the search for blocks. This is needed to terminate the
loop because the broken test meant that b could get large enough to
equal finish, which would end the while loop.
The attached testcase shows that with the off by one error, it is
possible to throw e2fsck into an infinite loop while it tries to
find space for the inode table even though there's no space for one.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 25 Jul 2014 02:19:27 +0000 (22:19 -0400)]
e2fsck: fix off-by-one bounds check on group number
Since fs->group_desc_count is the number of block groups, the number
of the last group is always one less than this count. Fix the bounds
check to reflect that.
This flaw shouldn't have any user-visible side effects, since the
block bitmap test based on last_grp later on can handle overbig block
numbers.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:53:41 +0000 (15:53 -0700)]
e2fsck: force all block allocations to use block_found_map
During the later passes of efsck, we sometimes need to allocate and
map blocks into a file. This can happen either by fsck directly
calling new_block() or indirectly by the library calling new_block
because it needs to allocate a block for lower level metadata (bmap2()
with BMAP_SET; block_iterate3() with BLOCK_CHANGED).
We need to force new_block to allocate blocks from the found block
map, because the FS block map could be inaccurate for various reasons:
the map is wrong, there are missing blocks, the checksum failed, etc.
Therefore, any time fsck does something that could to allocate blocks,
we need to intercept allocation requests so that they're sourced from
the found block map. Remove the previous code that swapped bitmap
pointers as this is now unneeded.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 25 Jul 2014 01:03:54 +0000 (21:03 -0400)]
e2fsck: free ctx->fs, not fs, at the end of fsck
When we call ext2fs_close_free at the end of main(), we need to supply
the address of ctx->fs, because the subsequent e2fsck_free_context
call will try to access ctx->fs (which is now set to a freed block) to
see if it should free the directory block list. This is clearly not
desirable, so fix the problem.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:53:11 +0000 (15:53 -0700)]
e2fsck: fix inode coherency issue when iterating an inode's blocks
When we're about to iterate the blocks of a block-map file, we need to
write the inode out to disk if it's dirty because block_iterate3()
will re-read the inode from disk. (In practice this won't happen
because nothing dirties block-mapped inodes before the iterate call,
but we can program defensively).
More importantly, we need to re-read the inode after the iterate()
operation because it's possible that mappings were changed (or erased)
during the iteration. If we then dirty or clear the inode, we'll
mistakenly write the old inode values back out to disk!
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Tue, 22 Jul 2014 18:48:41 +0000 (14:48 -0400)]
e2fsck: check error return from ext2fs_extent_fix_parents in pass 1
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Fri, 18 Jul 2014 22:53:04 +0000 (15:53 -0700)]
e2fsck: skip clearing bad extents if bitmaps are unreadable
If the bitmaps are known to be unreadable, don't bother clearing them;
just mark fsck to restart itself after pass 5, by which time the
bitmaps should be fixed.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Tue, 22 Jul 2014 17:54:54 +0000 (13:54 -0400)]
e2fsck: don't offer to recreate the journal if fsck is aborting due to bad block bitmaps
If e2fsck knows the bitmaps are bad at the exit (probably because they
were bad at the start and have not been fixed), don't offer to
recreate the journal because doing so causes e2fsck to abort a second
time.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Tue, 22 Jul 2014 17:52:33 +0000 (13:52 -0400)]
e2fsck: report correct inode number in pass1b
If there's a problem with the inode scan during pass 1b, report the
inode that we were trying to examine when the error happened, not the
inode that just went through the checker.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Tue, 22 Jul 2014 16:44:42 +0000 (12:44 -0400)]
debugfs: allow bmap to allocate blocks
Allow set_inode_field's bmap command in debugfs to allocate blocks,
which enables us to allocate blocks for indirect blocks and internal
extent tree blocks. True, we could do this manually, but seems like
unnecessary bookkeeping activity for humans.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Tue, 22 Jul 2014 16:44:42 +0000 (12:44 -0400)]
debugfs: create inode_dump command to dump an inode in hex
Create a command that will dump an entire inode's space in hex.
[ Modified by tytso to add a description to the man page, and to add
the more formal command name, inode_dump, in addition to short
command name of "idump". ]
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Darrick J. Wong [Tue, 22 Jul 2014 16:40:56 +0000 (12:40 -0400)]
e4defrag: backwards-allocated files should be defragmented too
Currently, e4defrag avoids increasing file fragmentation by comparing
the number of runs of physical extents of both the original and the
donor files. Unfortunately, there is a bug in the routine that counts
physical extents, since it doesn't look at the logical block offsets
of the extents. Therefore, a file whose blocks were allocated in
reverse order will be seen as only having one big physical extent, and
therefore will not be defragmented.
Fix the counting routine to consider logical extent offset so that we
defragment backwards-allocated files. This could be problematic if we
ever gain the ability to lay out logically sparse extents in a
physically contiguous manner, but presumably one wouldn't call defrag
on such a file.
Reported-by: Xiaoguang Wang <wangxg.fnst@cn.fujitsu.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 13 Jul 2014 20:18:38 +0000 (16:18 -0400)]
debian: update changelog for 1.42.10-2 release
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Samuel Thibault [Sun, 13 Jul 2014 17:12:51 +0000 (13:12 -0400)]
po: update fr.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Paul Wise [Sat, 12 Jul 2014 15:56:49 +0000 (11:56 -0400)]
Use a wildcard for static libs in the git ignore list
Signed-off-by: Paul Wise <pabs3@bonedaddy.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Paul Wise [Sat, 12 Jul 2014 15:56:49 +0000 (11:56 -0400)]
Add more generated files to the git ignore list
Signed-off-by: Paul Wise <pabs3@bonedaddy.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 20:26:14 +0000 (16:26 -0400)]
lib/ext2fs: Only build tst_libext2fs for make check
It's only necessary to build tst_libext2fs when running "make check".
Also make sure the links of the tst_* programs are done with
$(ALL_LDFLAGS).
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 19:54:42 +0000 (15:54 -0400)]
Use sys/syscall.h instead of syscall.h
Most systems have a backwards compatibility symlink in
/usr/include/syscall.h to /usr/include/sys/syscall.h, but
sys/syscall.h is the documented location of the header file. Fix two
locations where we were using <syscall.h> instead of <sys/syscall.h>.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 19:33:57 +0000 (15:33 -0400)]
mke2fs: fix fencepost error when calling strncat
There were other protections which would prevent a buffer overflow
from happening, but we should fix this nevertheless.
Addresses-Coverity-Bug: #1225003
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 03:44:51 +0000 (23:44 -0400)]
Update release notes, etc. for final 1.42.11 release
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 04:47:40 +0000 (00:47 -0400)]
Fix nroff macro issue in chattr man page
The single quote character must not be in the first character in a
line, or else it can get mistaken as a macro call.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 04:17:05 +0000 (00:17 -0400)]
Fix up configure so it finds mkinstalldirs
As an object lesson in why autoreconf is fundamentally unsafe, the
newer version of nls.m4 no longer handles @MKINSTALLDIRS@. So add
this back, since our Makefiles depend on it.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Thu, 10 Jul 2014 03:30:49 +0000 (23:30 -0400)]
Update translation files
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Trần Ngọc Quân [Thu, 10 Jul 2014 03:13:31 +0000 (23:13 -0400)]
po: update vi.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Yuri Chornoivan [Thu, 10 Jul 2014 03:13:31 +0000 (23:13 -0400)]
po: update uk.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jakub Bogusz [Thu, 10 Jul 2014 03:13:31 +0000 (23:13 -0400)]
po: update pl.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Benno Schulenberg [Thu, 10 Jul 2014 03:13:31 +0000 (23:13 -0400)]
po: update nl.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Benno Schulenberg [Thu, 10 Jul 2014 03:13:30 +0000 (23:13 -0400)]
po: update es.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Benno Schulenberg [Thu, 10 Jul 2014 03:13:30 +0000 (23:13 -0400)]
po: update eo.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Philipp Thomas [Thu, 10 Jul 2014 03:13:30 +0000 (23:13 -0400)]
po: update de.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Petr Pisar [Thu, 10 Jul 2014 03:13:30 +0000 (23:13 -0400)]
po: update cs.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Wed, 9 Jul 2014 00:02:48 +0000 (20:02 -0400)]
mke2fs: add support to align hugefiles relative to beginning of the disk
Add the mke2fs.conf configuration option which causes the hugefiles to
be aligned to the beginning of the disk. This is important if the the
reason for aligning the hugefiles is to support hard-drive specific
features such as Shingled Magnetic Recording (SMR).
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 6 Jul 2014 03:44:28 +0000 (23:44 -0400)]
Update translation files for upcoming 1.42.11 release
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Benno Schulenberg [Sun, 6 Jul 2014 03:39:54 +0000 (23:39 -0400)]
po: add Esperanto translation
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Yuri Chornoivan [Sun, 6 Jul 2014 03:31:42 +0000 (23:31 -0400)]
po: add Ukrainian translation
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Trần Ngọc Quân [Sun, 6 Jul 2014 03:18:05 +0000 (23:18 -0400)]
po: update vi.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Göran Uddeborg [Sun, 6 Jul 2014 03:18:05 +0000 (23:18 -0400)]
po: update sv.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Jakub Bogusz [Sun, 6 Jul 2014 03:18:05 +0000 (23:18 -0400)]
po: update pl.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Benno Schulenberg [Sun, 6 Jul 2014 03:18:05 +0000 (23:18 -0400)]
po: update nl.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Milo Casagrande [Sun, 6 Jul 2014 03:18:04 +0000 (23:18 -0400)]
po: update it.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Samuel Thibault [Sun, 6 Jul 2014 03:18:04 +0000 (23:18 -0400)]
po: update fr.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Philipp Thomas [Sun, 6 Jul 2014 03:18:04 +0000 (23:18 -0400)]
po: update de.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Petr Pisar [Sun, 6 Jul 2014 03:18:04 +0000 (23:18 -0400)]
po: update cs.po (from translationproject.org)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Theodore Ts'o [Sun, 6 Jul 2014 02:53:29 +0000 (22:53 -0400)]
e2fsck: reopen the file system with saved flags after a journal replay
After a journal replay, we close and reopen the file system so that
any changes in the superblock can get reflected in the libext2fs's
internal data structures. We need to save the flags passed to
ext2fs_open() that we used when we originally opened the file system.
Otherwise we will end up not be able to repair a file system which
requires a journal replay and which has bigalloc enabled or which has
more than 2**32 blocks; e2fsck will abort with the error message:
fsck.ext4: Filesystem too large to use legacy bitmaps while trying to re-open
Addresses-Debian-Bug: 744953
Cc: Андрей Василишин <a.vasilishin@kpi.ua>
Cc: Jon Severinsson <jon@severinsson.net>
Cc: 744953@bugs.debian.org
Akira Fujita [Sun, 6 Jul 2014 02:42:36 +0000 (22:42 -0400)]
mke2fs: prevent creation of unmountable ext4 with large flex_bg count
In mke2fs command, if flex_bg count is too large to filesystem blocks
count, unmountable ext4 which has the out of filesystem block offset
is created (Case1). Moreover this large flex_bg count causes an
unintentional metadata layout (bmap-imap-itable-bmap-imap-itable .. in
block group) (Case2).
To fix these issues and keep healthy flex_bg layout, disallow creating
ext4 with obviously large flex_bg count to filesystem blocks count.
Steps to reproduce:
(Case1)
1.
# mke2fs -t ext4 -b 4096 -O ^resize_inode -G $((2**20)) DEV 2130483
2.
# mount -t ext4 DEV MP
mount: wrong fs type, bad option, bad superblock on /dev/sdb4,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so
3.
# dumpe2fs DEV
...
Block count: 2130483
...
Flex block group size: 1048576
...
Group 65: (Blocks 2129920-2130482) [INODE_UNINIT]
Checksum 0x4cb3, unused inodes 8080
Block bitmap at 67 (bg #0 + 67), Inode bitmap at 1048643 (bg #32 + 67)
Inode table at 2129979-2130483 (+59)
^^^^^^^ 2130483 is out of FS!
65535 free blocks, 8080 free inodes, 0 directories, 8080 unused inodes
Free blocks:
Free inodes: 525201-533280
(Case2)
1.
# mke2fs -t ext4 -G
2147483648 DEV 3145728
2.
# debugfs -R stats DEV
...
Block count: 786432
...
Flex block group size:
2147483648
...
Group 0: block bitmap at 193, inode bitmap at 194, inode table at 195
...
Group 1: block bitmap at 707, inode bitmap at 708, inode table at 709
...
Group 2: block bitmap at 1221, inode bitmap at 1222, inode table at 1223
...
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Akira Fujita [Sun, 6 Jul 2014 02:34:07 +0000 (22:34 -0400)]
mke2fs: add get_uint_from_profile to mke2fs.c
We can set flex_bg count only up to 2^30 with profile
because get_int_from_profile can handle it to 2^31-1.
Add get_uint_from_profile to read unsigned int value
so that mke2fs with profile can handle up to 2^31 flex_bg same as -G option.
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Akira Fujita [Sun, 6 Jul 2014 02:14:08 +0000 (22:14 -0400)]
mke2fs: set upper limit to flex_bg count
mke2fs -G option allows root user to set flex_bg count (power of 2).
However ext4 has bad metadata layout if we specify more than or equal to
2^32 to mke2fs -G, because of the 32bit shift operation
in ext2fs_allocate_group_table().
And the maximum block group count of ext4 is 2^32 -1 (ext4_group_t
s_groups_count), so diallow more than 2^32 flex_bg count.
Steps to reproduce:
# mke2fs -t ext4 -G
4294967296 DEV
# dumpe2fs DEV
...
Flex block group size: 1 <----- flex_bg is 1!
...
Group 0: (Blocks 0-32767)
Checksum 0x4afd, unused inodes 7541
Primary superblock at 0, Group descriptors at 1-1
Reserved GDT blocks at 2-59
Block bitmap at 60 (+60), Inode bitmap at 61 (+61)
Inode table at 62-533 (+62)
32228 free blocks, 7541 free inodes, 2 directories, 7541 unused inodes
Free blocks: 540-32767
Free inodes: 12-7552
Group 1: (Blocks 32768-65535) [INODE_UNINIT]
Checksum 0xc890, unused inodes 7552
Backup superblock at 32768, Group descriptors at 32769-32769
Reserved GDT blocks at 32770-32827
Block bitmap at 32828 (+60), Inode bitmap at 32829 (+61)
Inode table at 32830-33301 (+62)
32234 free blocks, 7552 free inodes, 0 directories, 7552 unused inodes
Free blocks: 33302-65535
Free inodes: 7553-15104
...
Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: "Darrick J. Wong" <darrick.wong@oracle.com>
Andreas Dilger [Fri, 28 Feb 2014 20:15:45 +0000 (13:15 -0700)]
mke2fs: handle flex_bg collision with backup descriptors
If a large flex_bg factor is specified and the block allocator was
laying out block or inode bitmaps or inode tables, and collides with
previously allocated metadata (for example the backup superblock or
group descriptors) it would reset the allocator back to the beginning
of the flex_bg instead of continuing past the obstruction.
For example, with "-G 131072" the inode table will hit the backup
descriptors in groups 1, 3, 5, 7, 9 and start interleaving with the
block and inode bitmaps. That results in poorly allocated bitmaps
and inode tables that are interleaved and not contiguous as was
intended for flex_bg:
Group 0: (Blocks 0-32767)
Primary superblock at 0, Group descriptors at 1-2048
Block bitmap 2049 (+2049), Inode bitmap at 133121 (bg #4+2049)
Inode table 264193-264200 (bg #8+2049)
:
:
Group 3838: (Blocks
125763584-
125796351) [INODE_UNINIT, BLOCK_UNINIT]
Block bitmap 5887 (bg #0+5887), Inode bitmap 136959 (bg #4+5887)
Inode table 294897-294904 (bg #8 + 32753)
Group 3839: (Blocks
125796352-
125829119) [INODE_UNINIT, BLOCK_UNINIT]
Block bitmap 5888 (bg #0+5888), Inode bitmap 136960 (bg #4+5888)
Inode table 5889-5896 (bg #0 + 5889)
Group 3840: (Blocks
125829120-
125861887) [INODE_UNINIT, BLOCK_UNINIT]
Block bitmap 5897 (bg #0+5897), Inode bitmap 136961 (bg #4+5889)
Inode table 5898-5905 (bg #0 + 5898)
:
:
Instead, skip the intervening blocks if there aren't too many of them.
That mostly keeps the flex_bg allocations from colliding, though still
not perfect because there is still some overlap with the backups.
This patch addresses the majority of the problem, allowing about 124k
groups to be layed out perfectly, instead of less than 4k groups with
the previous code.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Lukas Czerner [Sun, 6 Jul 2014 01:08:34 +0000 (21:08 -0400)]
mke2fs: enable lazy_itable_init on newer kernel by default
Currently is used did not specified lazy_itable_init option we rely on
information from ext4 module exported via sysfs interface. However if
the ext4 module is not loaded it will not be enabled even though kernel
might support it.
With this commit we set the default according to the kernel version,
however we still allow it to be set manually via extended option or be
enabled in case that ext4 module advertise that it supports this
feature.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Lukas Czerner [Sun, 6 Jul 2014 01:07:55 +0000 (21:07 -0400)]
mke2fs: add revision to the is_before_linux_ver()
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Lukas Czerner [Sun, 6 Jul 2014 01:06:29 +0000 (21:06 -0400)]
e2fsprogs: introduce ext2fs_close_free() helper
Currently there are many uses of ext2fs_close() which might be wrong.
First of all ext2fs_close() does not set the ext2_filsys pointer to NULL
so the caller is responsible for clearing it, however there are some
cases there we do not do it.
Second of all very small number of users of ext2fs_close() actually
check the return value. If there is a problem in ext2fs_close() it will
not even free the ext2_filsys structure, but majority of users expect it
to do so.
To fix both problems this commit introduces a new helper
ext2fs_close_free() which will not only check for the return value and
free the ext2_filsys structure if the call to ext2fs_close2() failed,
but it will also set the ext2_filsys pointer to NULL.
Replace every use of ext2fs_close() in e2fsprogs tools with
ext2fs_close_free() - there is no real reason to keep using
ext2fs_close().
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Theodore Ts'o [Sun, 6 Jul 2014 00:23:23 +0000 (20:23 -0400)]
fix cross-compilation support
Commit
2500ebfc89 (util: fix make dependencies for subst) broke cross
compilation because it unconditionally used config.h without setting a
includes path so that the config.h file could be found.
The proposed fix of adding the include path (such as was proposed at
http://patchwork.ozlabs.org/patch/355662/ or in Debian Bug #753375)
isn't really the right way to go, since the information in config.h is
for the target environment, and not the build environment. So using
config.h when building helper programs used as part of the build can
potentially cause more problems than it solves.
In general, build helpers must be written to be as portable as
possible, and to not require any autoconf defined #ifdef's whenever
possible. The subst program broke this rule to (1) address a Coverity
security complaint by using futimes(2) instad of utimes(2) if present,
and (2) to preserve the nanosecond portion of the file timestamp.
Oh, well. We won't be able to do the latter when cross compiling, and
as to the former, if an attacker has write access to your build tree
while you are building programs that will be run as root, you've got
bigger problems. :-)
Fix the problem that commit
2500ebfc89 was trying to address by
explicitly adding @DEFS@ to CFLAGS, so that -DHAVE_CONFIG_H is passed
to make depend. This fixes up the make depend without forcing the use
of config.h when cross-compiling.
Addresses-Debian-Bug: #753375
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Tested-by: Gustavo Zacarias <gustavo@zacarias.com.ar>
Cc: Helmut Grohne <helmut@subdivi.de>
Cc: 753375@bugs.debian.org
Theodore Ts'o [Sat, 5 Jul 2014 04:27:02 +0000 (00:27 -0400)]
configure.in: fix external libblkid test for static link
External libblkid needs -luuid when linking statically.
Also fix up the bogus other-lib parameter in the libuuid test;
$LIBUUID is the null string, so it doesn't do anything other than
obfuscate the use of AC_CHECK_LIB.
Reported-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Sat, 5 Jul 2014 03:34:10 +0000 (23:34 -0400)]
blkid,ext2fs: avoid name clash with __u{8,16,32,64}
Try to avoid name clashes with definitions of __u8, __u16, __u32,
and __u64 in userspace, in case other headers also define these
types. Define HAVE___{S,U}{8,16,32,64} preprocessor macros to
show that these types are already defined.
This would avoid the need to check for _BLKID_TYPES_H in ext2_types.h
and _EXT2_TYPES_H in blkid_types.h, but since older versions of these
headers did not use HAVE___U8 et.al. keep these checks around for now.
Report an error if there are no 64-bit types available. The code
will not compile if these are not available.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Andreas Dilger [Wed, 11 Jun 2014 18:59:05 +0000 (12:59 -0600)]
blkid: remove unnecessary header and comment
The LIST_HEAD macro is not directly used in getsize.c, so
<sys/queue.h> is not needed at all, and could cause confusion at
some later point if the Linux-style list macros are ever used.
Build was verified on MacOS which defined HAVE_SYS_DISK_H true.
I manually inspected the sources for recent *BSD headers to check
if this was needed there or not. MacOS and FreeBSD <sys/disk.h>
do not use lists at all. NetBSD and OpenBSD <sys/disk.h> and all
of the <sys/mount.h> headers include <sys/queue.h> internally.
I used http://fxr.watson.org/fxr/source/sys/mount.h?v={OSTYPE}
as a reference, checking both old and new *BSD versions.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Eric Sandeen [Sat, 5 Jul 2014 03:07:36 +0000 (23:07 -0400)]
e2fsprogs: add mount options to ext4.5
This is a straight cut and paste from the util-linux
mount manpage to ext4.5 (with commented-out lines
removed).
It's pretty much impossible for util-linux to keep up
with every filesystem out there, and Karel has more than
once expressed a wish that mount options move into fs-specific
manpages.
So, here we go.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Eric Sandeen [Sat, 5 Jul 2014 03:03:14 +0000 (23:03 -0400)]
e2fsprogs: revise and extend chattr(1) and chattr usage()
The chattr(1) manpage and chattr usage() output were missing some flags.
Add those, and make some other minor cosmetic fixes.
(I've left out the 'B' (EXT2_COMPRBLK_FL) flag, because
it's not actually used anywhere, and I can't figure out
how it differs from 'c' (EXT2_COMPR_FL))
Also, because the matrix of filesystems & flags is quite large,
refer to filesystem-specific manpages for detailed discussion
of flags supported by those filesystems, rather than trying to
cover it all in this manpage. I'll send those manpage
updates to the appropriate lists a bit later.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Eric Sandeen [Sat, 5 Jul 2014 03:02:59 +0000 (23:02 -0400)]
e2fsprogs: reorder flags in chattr(1)
The flags described in chattr usage() and the chattr(1) manpage
were in semi-random order, which makes it hard to ascertain
which flags might be missing or undocumented, and to locate
flags within the manpage.
Re-order the list of flags in alphanumeric order, and do
the same for the flag descriptions in the body of the manpage.
There should be no content changes here, just reordering
for consistency.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Jan Kara [Fri, 4 Jul 2014 20:24:18 +0000 (16:24 -0400)]
e2fsck: fix last mount time and last write time in preen mode
Fixing last mount time and last write time is safe - there's no risk of
loosing any important information or making corruption significantly
worse even if we get it wrong. So let's just fix these times in preen
mode. This allows initrd to automatically check and mount root
filesystem in case system clock is wrong without having to manually set
broken_system_clock variable (openSUSE uses broken_system_clock by default
to avoid these problems during boot but this disables time-based checks
even on systems where clock is fine so that's not ideal either).
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>