Whamcloud - gitweb
tools/e2fsprogs.git
11 years ago LU-0000 build: fix miscellaneous build warnings 44/9744/3 v1.42.9.wc1
Andreas Dilger [Fri, 21 Mar 2014 05:30:51 +0000 (23:30 -0600)]
LU-0000 build: fix miscellaneous build warnings

Fix various unused variable and use-uninitialized warnings.

Add misc files generated by upstream code into .gitignore.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I7a34499a8665de82b54f1393cf2199101b500c1e

11 years ago LU-1540 e2fsck: add missing symlink NUL terminator
Andreas Dilger [Sat, 14 Jul 2012 02:33:01 +0000 (20:33 -0600)]
LU-1540 e2fsck: add missing symlink NUL terminator

If a long symbolic link target is written into an external block
without a NUL terminator, its length is decided by the inode's size.
Make symlink check add a NUL termination in such cases if needed.

Such faulty symlinks were generated by osd-ldiskfs on the MDS until
Lustre 2.1.3 and Lustre 2.3.  The in-kernel code would handle such
unterminated symlinks correctly, since it used the inode size to
determine the symlink length, but e2fsck would assume the symlink
is broken if there wasn't a trailing NUL.
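A minimal sketch of the repair described above, assuming the symlink target sits in a block buffer and the inode size gives its length. The helper name terminate_symlink() and its return codes are hypothetical and are not taken from the e2fsck source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the e2fsck code): make a symlink target held in
 * a block buffer NUL-terminated, using the inode size as the target length.
 * Returns 1 if a terminator was added, 0 if one was already in place,
 * and -1 if the target cannot be valid.
 */
static int terminate_symlink(char *buf, size_t buf_size, size_t inode_size)
{
    assert(buf != NULL);

    /* The target plus its terminator must fit in the buffer. */
    if (inode_size == 0 || inode_size >= buf_size)
        return -1;

    /* A NUL before inode_size means the recorded length is inconsistent. */
    if (memchr(buf, '\0', inode_size) != NULL)
        return -1;

    if (buf[inode_size] == '\0')
        return 0;

    buf[inode_size] = '\0';
    return 1;
}
```

The point of the sketch is that the inode size, not the presence of a trailing NUL, decides where the target ends.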

  LU-2627 e2fsck: check_symlink() SIGSEGV

  Since e2fsck_pass1_check_symlink() calls into check_symlink()
  with pctx == NULL, we should use 'ino' instead of 'pctx->ino'
  in check_symlink().

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
  Change-Id: If9c16f96d0655d5a886ef607f1f47ced6176f8d8

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I4419b30f1adb4a7d273796a936427aa351510213

11 years ago LU-1502 build: enable quota when building
Niu Yawei [Mon, 11 Jun 2012 13:33:55 +0000 (06:33 -0700)]
LU-1502 build: enable quota when building

The quota support is disabled by default, but we need to enable it
explicitly when building e2fsprogs for Lustre 2.4.

Signed-off-by: Niu Yawei <yawei.niu@intel.com>
Change-Id: Ic09f7c100b254559a5223460242b3bf465ff0802

11 years ago tests: add basic test case for e2scan
Andreas Dilger [Fri, 13 Apr 2012 19:24:30 +0000 (13:24 -0600)]
tests: add basic test case for e2scan

Add a simple test to verify that e2scan is detecting the correct
files in the filesystem based on the modification time.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2scan: a tool for fast namespace/inode scanning
Andreas Dilger [Fri, 13 Apr 2012 19:17:37 +0000 (13:17 -0600)]
e2scan: a tool for fast namespace/inode scanning

e2scan is a tool for efficiently scanning inodes for changes,
or all inodes, and then generating pathnames for the inodes
of interest.

  LU-3612 e2scan: fix missing header for e2scan to enable SQLite
    SQLite function is always disabled due to missing config.h
Signed-off-by: Shuichi Ihara <sihara@ddn.com>
  Change-Id: Ie4b49bd6d04a6f1e25610df51d17b97d3dc4fe5f

  LU-4328 e2scan: Missing copyright attribution
    Copyright headers are added to the files for e2scan.
    A minor correction was made to the header of tst_read_ea.c
Signed-off-by: James Nunez <james.a.nunez@intel.com>
  Change-Id: Icc359e717a0177b83a08f2471d84e68487dafee3

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago ext2fs: add readahead method to improve scanning
Andreas Dilger [Fri, 13 Apr 2012 18:58:53 +0000 (12:58 -0600)]
ext2fs: add readahead method to improve scanning

Add a readahead method for prefetching ranges of disk blocks.
This is useful for inode table scanning, and other large
contiguous ranges of blocks, and may also prove useful for
random block prefetch, since it will allow reordering of the
IO without waiting synchronously for the reads to complete.

It is currently using the posix_fadvise(POSIX_FADV_WILLNEED)
interface, as this proved most efficient during our testing

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: add Lustre lfsck tool
Andreas Dilger [Fri, 13 Apr 2012 08:32:19 +0000 (02:32 -0600)]
e2fsck: add Lustre lfsck tool

The lfsck tool, in conjunction with e2fsck, builds a DB4 database
of all the inodes and objects on the MDT and OST filesystems.

The lfsck tool combines the databases on the Lustre client,
and can verify that all of the objects referenced by inodes
exist, are not referenced by two inodes, and have a parent
inode.

  LU-266 e2fsck: regenerate LAST_ID file
    e2fsck should be able to regenerate the LAST_ID file if it gets
    corrupted.  This patch will create a new LAST_ID file if it was
    deleted, and removes the unnecessary lfsck_get_last_id function.
    The last_id is then set as before in e2fsck_pass6_ost to either
    the max objid on disk, or MDS' max ost id, whichever is larger.
Reported-by: Bernd Schubert <aakef@fastmail.fm>
Signed-off-by: Kit Westneat <kwestneat@ddn.com>
  Change-Id: Ic5396da000909b826b76da2fd5a0b5ce88b06944

  LU-2682 lfsck: fix access to ost_id structures
    Changes in upstream lov_mds_md use lmm_oi and l_ost_oi instead
    of direct _id and _seq access, in preparation for FID-on-OST
    changes.  Update lfsck code to handle new structures.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
  Change-Id: If18f5ed34744b8372687472843ccc09108500c1e
  Change-Id: I05b5da92efbedb7b92c6de736c05beef30500c1e

  LU-2677 lfsck: handle smaller lustre_mdt_attrs
    In 2.4 the lustre_mdt_attrs (LMA) structure was shrunk to move out
    the unused SOM fields into a separate structure.  The filter_fid
    structure was shrunk to allow both FF and LMA on 256-byte OST
    inodes. Allow reading both old and new LMA and FF structures,
    since we only care about the initial fields in both of them.
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
  Change-Id: If6c75d5ee3192ef3761aa9f645175698ebe5ee36

  LU-3132 lfsck: fix the wrong data pointer
    The patch is to pass a correct data pointer of struct lu_fid or
    struct ll_recreate_obj to ioctl in lfsck_recreate_obj().
Signed-off-by: Liu Ying <emoly.liu@intel.com>
  Change-Id: I8301f311cc5aaf57ae51ceaeb74db25fe61b5cd6

  LU-3838 lfsck: various defects in lfsck
    - In lfsck_get_fid(), sizeof(buf) should be passed as buffer
      length, but not sizeof(*buf);
    - In lfsck_mds_dirs(), we should follow into directory if the
      directory doesn't contain self fid;
    - In ext2fs_attr_ibody_get(), returns EXT2_ET_EA_TOO_BIG only
      when the buffer isn't large enough;
    - In ext2fs_attr_ibody_get(), do type cast before operating on
      pointer!
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
  Change-Id: I8ba4139fcc51b23bfa33c82c71f9b69b2b033a37
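The first item in the list above comes down to the difference between the size of an array and the size of one of its elements. A small sketch, assuming buf is a character array; FID_BUF_LEN is a made-up size used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define FID_BUF_LEN 64 /* hypothetical buffer size, for illustration only */

/* Size of the whole array: what a buffer-length argument should carry. */
static size_t whole_buffer_size(void)
{
    char buf[FID_BUF_LEN];

    assert(sizeof(buf) >= sizeof(*buf));
    return sizeof(buf);
}

/* Size of a single element: passing this truncates the buffer to 1 byte. */
static size_t one_element_size(void)
{
    char buf[FID_BUF_LEN];

    return sizeof(*buf);
}
```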

  LU-3837 lfsck: Abort lfsck in DNE mode
    lfsck doesn't support DNE mode yet, so abort lfsck once objects
    in a remote directory are detected; otherwise, those objects could
    be cleared by lfsck mistakenly.
Signed-off-by: Niu Yawei <yawei.niu@intel.com>
  Change-Id: If66f04e261522e6216f524a703d427fce2a5938a

  LU-4288 e2fsprogs: process EA information in e2fsck_lfsck_find_ea
    The function e2fsck_lfsck_find_ea is designed to save information
    about an EA in the correct tables. This patch addresses a bug that
    prevents any EA from being looked at.
Signed-off-by: James Simmons <uja.ornl@gmail.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago ext2fs: check if Lustre filesystem is mounted
Andreas Dilger [Fri, 13 Apr 2012 08:16:24 +0000 (02:16 -0600)]
ext2fs: check if Lustre filesystem is mounted

Add a check to ext2fs_check_mount_point() to look in /proc/fs/lustre/*
to see if Lustre is mounted, since st_rdev of the mountpoint does not
match st_rdev of the block device itself, which confuses libext2fs.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago debugfs: dump "fid" and "lma" xattrs on inode stat
Andreas Dilger [Fri, 13 Apr 2012 18:55:45 +0000 (12:55 -0600)]
debugfs: dump "fid" and "lma" xattrs on inode stat

Print out the Lustre "fid" and "lma" object xattr contents,
if present, with debugfs stat to simplify debugging.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago LU-1774 tests: e2fsck -D does not change dirdata
Bobi Jam [Sat, 25 Aug 2012 07:08:59 +0000 (15:08 +0800)]
LU-1774 tests: e2fsck -D does not change dirdata

Add test case for directory optimization to preserve dirdata
content for dot and dotdot entries.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
Change-Id: Iae190794da75a2080a8e5cc5b95a49e0c894f72f

11 years ago tests: add basic tests for dirdata feature
Andreas Dilger [Fri, 13 Apr 2012 19:42:42 +0000 (13:42 -0600)]
tests: add basic tests for dirdata feature

Signed-off-by: Pravin Shelar <pravin@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: add support for dirdata feature
Andreas Dilger [Fri, 13 Apr 2012 19:39:26 +0000 (13:39 -0600)]
e2fsck: add support for dirdata feature

Add support for the INCOMPAT_DIRDATA feature, which allows
storing extra data in the directory entry beyond the name.
This allows the Lustre File IDentifier to be accessed in
an efficient manner, and would be useful for expanding a
filesystem to allow more than 2^32 inodes in the future.

  LU-1774 e2fsck: e2fsck -D does not change dirdata content

  Fix dir optimization to preserve dirdata content for dot
  and dotdot entries.

Signed-off-by: Bobi Jam <bobijam.xu@intel.com>
  Change-Id: Iae190794da75a2080a8e5cc5b95a49e0c894f72f

  LU-2462 e2fsprogs Consider DIRENT_LUFID flag in link_proc().

  While adding the new file entry in a directory block, link_proc()
  calculates the minimum record length of the existing directory entry
  without considering the dirent data size, which leads to
  corruption. Changed the code to use EXT2_DIR_REC_LEN(), which will
  return the correct record length including the dirent data size.

Signed-off-by: Manisha Salve <msalve@ddn.com>
  Change-Id: Ic593c558c47a78183143ec8e99d8385ac94d06f7

  LU-4677 libext2fs, e2fsck: don't use ext2_dir_entry_2

  Due to endian issues, do not use ext2_dir_entry_2 because it will
  have the wrong byte order on directory entries that are swabbed.
  Instead, use the standard practice of mask-and-shift to access the
  file_type and dirdata flags.
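A sketch of the mask-and-shift approach, assuming the 16-bit name_len field carries the name length in its low byte and the file type plus dirdata flags in its high byte. The mask values and helper names below are assumptions for illustration, not definitions copied from libext2fs.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the 16-bit name_len field (illustrative values). */
#define NAME_LEN_MASK   0x00ffu /* low byte: length of the name */
#define FILE_TYPE_SHIFT 8       /* high byte: file type and flags */
#define FILE_TYPE_MASK  0x0fu   /* low nibble of high byte: file type */
#define DIRDATA_MASK    0xf0u   /* high nibble of high byte: dirdata flags */

static unsigned int dirent_name_len(uint16_t name_len)
{
    return name_len & NAME_LEN_MASK;
}

static unsigned int dirent_file_type(uint16_t name_len)
{
    return (name_len >> FILE_TYPE_SHIFT) & FILE_TYPE_MASK;
}

static unsigned int dirent_dirdata_flags(uint16_t name_len)
{
    unsigned int flags = (name_len >> FILE_TYPE_SHIFT) & DIRDATA_MASK;

    assert((flags & FILE_TYPE_MASK) == 0);
    return flags;
}
```

Since every accessor works on the already byte-swapped 16-bit value, the result does not depend on the in-memory order of two separate 8-bit fields.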

Signed-off-by: Pravin Shelar <pravin@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: verify large xattr inode support
Andreas Dilger [Fri, 13 Apr 2012 08:14:16 +0000 (02:14 -0600)]
tests: verify large xattr inode support

Verify that inodes with large EAs in a secondary inode are working:
* EA inode needs to have EA_INODE_FL set
* EA inode should reference parent inode number+generation

Signed-off-by: Kalpak Shah <kalpak@sun.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: add support for xattrs in external inodes
Andreas Dilger [Fri, 13 Apr 2012 08:04:53 +0000 (02:04 -0600)]
e2fsck: add support for xattrs in external inodes

Add support for the INCOMPAT_EA_INODE feature, which stores large
extended attributes into an external inode instead of data blocks.
The inode is referenced by the e_value_inum field (formerly the
unused e_value_block field) from the xattr entry, and stores the
xattr data starting at byte offset 0 in the inode data block.

The xattr inode stores the referring inode number in its i_mtime,
and the parent i_generation in its own i_generation, so that there
is a solid linkage between the two that e2fsck can verify.  The
xattr inode is itself marked with EXT4_EA_INODE_FL as well.

Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: clean up xattr checking code, add test
Andreas Dilger [Fri, 13 Apr 2012 08:01:12 +0000 (02:01 -0600)]
e2fsck: clean up xattr checking code, add test

Clean up xattr header/list processing for in-inode xattrs instead
of doing lots of explicit pointer math.  Add a regression test for
in-inode xattrs.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: add checks for journal checksum feature
Andreas Dilger [Fri, 13 Apr 2012 07:50:29 +0000 (01:50 -0600)]
tests: add checks for journal checksum feature

f_jchksum_bblk: journal checksum feature where there is a corrupt
block in an uncommitted transaction
f_jchksum_blast_trans: incomplete last trans not considered bad
f_jchksum_remount: check journal mounted by a kernel without
CHECKSUM support after CHECKSUM is in journal superblock

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: create random filesystem, corrupt, e2fsck
Andreas Dilger [Fri, 13 Apr 2012 07:53:39 +0000 (01:53 -0600)]
tests: create random filesystem, corrupt, e2fsck

The f_random_corruption test enables a random subset of filesystem
features, picks one of the valid filesystem block and inode sizes,
and a random device size and creates a new filesystem with those
parameters.

It is possible to disable the running of the test by setting the
environment variable F_RANDOM_CORRUPTION=skip.  By default the test
script is run only one time, but setting the LOOP_COUNT variable
allows the test to run multiple times.

The resulting filesystem is corrupted with both random data and
by shifting data from one part of the device to another and then
repaired by e2fsck.  In some rare cases the random corruption is
severe enough that the filesystem is not recoverable (e.g. small
filesystem with no backup superblock has bad superblock corruption)
but in most cases "e2fsck -fy" should be able to fix all errors
in some way.

A second e2fsck run is done to verify that all of the errors are
fixed in the first pass, and that the filesystem is free of errors.

Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: add test cases for inode badness
Andreas Dilger [Fri, 13 Apr 2012 07:23:17 +0000 (01:23 -0600)]
tests: add test cases for inode badness

Signed-off-by: Girish Shilamkar <girish@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: track errors/badness found for each inode
Andreas Dilger [Fri, 13 Apr 2012 07:13:58 +0000 (01:13 -0600)]
e2fsck: track errors/badness found for each inode

The present e2fsck code checks the inode on a per-field basis.  It
doesn't take into consideration the total sanity of the inode.
This may cause e2fsck to turn a garbage inode into an apparently
sane inode ("It is a vessel of fertilizer, and none may abide
its strength.").

The following patch adds a heuristic to detect the degree of
badness of an inode. icount mechanism is used to keep track of
the badness of every inode.  The badness is increased as various
fields in inode are found to be corrupt.  Badness above a certain
threshold value results in deletion of the inode.  The default
badness threshold value is 7, it can be specified to e2fsck
using "-E inode_badness_threshold=<value>"

This can avoid lengthy pass1b shared block processing, where a
corrupt chunk of the inode table has resulted in a bunch of
garbage inodes suddenly having shared blocks with a lot of good
inodes (or each other).

Signed-off-by: Girish Shilamkar <girish@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: add tests for expanding inode extra size
Andreas Dilger [Fri, 13 Apr 2012 00:05:03 +0000 (18:05 -0600)]
tests: add tests for expanding inode extra size

Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: add support for expanding the inode size
Andreas Dilger [Fri, 13 Apr 2012 00:03:37 +0000 (18:03 -0600)]
e2fsck: add support for expanding the inode size

This patch adds a "-E expand_extra_isize" feature which makes sure
that _every_ used inode has i_extra_isize >= s_min_extra_isize if
s_min_extra_isize is set. Else it makes sure that i_extra_isize
of every inode is equal to sizeof(ext2_inode_large) - 128.
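A sketch of how the target value might be chosen, following the rule stated above. The function names are hypothetical, and the size of the large inode structure is passed in as a parameter rather than taken from the real ext2_inode_large definition.

```c
#include <assert.h>
#include <stdint.h>

#define GOOD_OLD_INODE_SIZE 128u /* the "128" in the rule above */

/*
 * Pick the i_extra_isize every used inode should reach: the superblock
 * minimum if it is set, otherwise the full extra area of the large inode.
 */
static uint16_t target_extra_isize(uint16_t s_min_extra_isize,
                                   uint32_t large_inode_struct_size)
{
    assert(large_inode_struct_size >= GOOD_OLD_INODE_SIZE);

    if (s_min_extra_isize != 0)
        return s_min_extra_isize;
    return (uint16_t)(large_inode_struct_size - GOOD_OLD_INODE_SIZE);
}

/* An inode needs work if its extra size is below the target. */
static int needs_expansion(uint16_t i_extra_isize, uint16_t target)
{
    return i_extra_isize < target;
}
```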

This is useful for the case where nanosecond timestamps or 64-bit
inode version fields are required for all inodes in the filesystem.

Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: ignore xattr feature in backup superblocks
Andreas Dilger [Fri, 13 Apr 2012 18:25:25 +0000 (12:25 -0600)]
e2fsck: ignore xattr feature in backup superblocks

Since the xattr feature is enabled automatically by the kernel,
it can cause spurious e2fsck runs on a clean filesystem due to
differences between the primary and backup superblocks.

Signed-off-by: Eric Sandeen <esandeen@redhat.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: PAGE_SIZE larger than blocksize with hole
Andreas Dilger [Thu, 9 Aug 2012 05:23:22 +0000 (23:23 -0600)]
tests: PAGE_SIZE larger than blocksize with hole

Verify correct operation in the case of writing files with allocated
blocks at the end of the file beyond i_size.  This can happen for
PAGE_SIZE > blocksize, or through fallocate().

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: handle preallocation for large PAGE_SIZE
Andreas Dilger [Fri, 13 Apr 2012 07:59:31 +0000 (01:59 -0600)]
e2fsck: handle preallocation for large PAGE_SIZE

Fix handling of block preallocation support in cases where the kernel
PAGE_SIZE is larger than the filesystem blocksize.

Signed-off-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: add tests for uninitialized bitmaps
Andreas Dilger [Thu, 12 Apr 2012 23:52:44 +0000 (17:52 -0600)]
tests: add tests for uninitialized bitmaps

Various tests for handling uninitialized block and inode bitmaps,
and inodes beyond the in-use high watermark.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tune2fs: warn if the filesystem journal is dirty
Andreas Dilger [Thu, 12 Apr 2012 23:38:13 +0000 (17:38 -0600)]
tune2fs: warn if the filesystem journal is dirty

Running tune2fs on a filesystem with an unrecovered journal can
cause the tune2fs settings to be reverted when the journal is
replayed.  Print a warning if this is detected so that the user
isn't surprised if it happens.

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: allow deleting or zeroing shared blocks
Andreas Dilger [Thu, 12 Apr 2012 23:32:53 +0000 (17:32 -0600)]
e2fsck: allow deleting or zeroing shared blocks

E2fsck fixes files that are found to be sharing blocks by cloning
the shared blocks and giving each file a private copy in pass 1D.

Allowing all files claiming the shared blocks to have copies can
inadvertently bypass access restrictions.  Deleting all the files,
zeroing the cloned blocks, or placing the files in the /lost+found
directory after cloning may be preferable in some secure environments.

The following patches implement config file and command line options
in e2fsck that allow pass 1D behavior to be tuned according to site
policy.  It adds two extended options and config file counterparts.
On the command line:

 -E clone=dup|zero

    Select the block cloning method.  "dup" is old behavior,
    and is the default.  "zero" is a new method that substitutes
    zero-filled blocks for the shared blocks in all the files
    that claim them.

 -E shared=preserve|lost+found|delete

    Select the disposition of files containing shared blocks.
    "preserve" is the old behavior which remains the default.
    "lost+found" causes files to be unlinked after cloning so
    they will be reconnected to /lost+found in pass 3.
    "delete" skips cloning entirely and simply deletes the files.

In the config file:
  [options]
      clone=dup|zero
      shared=preserve|lost+found|delete

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: parse config file before command-line opts
Andreas Dilger [Thu, 12 Apr 2012 23:24:55 +0000 (17:24 -0600)]
e2fsck: parse config file before command-line opts

The patch changes the order that the config file and command line
are parsed so that command line has precedence.  It also parses
the -E option for every occurrence, otherwise the -E option is
not cumulative.
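The precedence rule can be illustrated with a small sketch: defaults first, then config file entries, then every -E occurrence in order, so the last writer wins. The struct and function names are invented for this example; the clone and shared option names are borrowed from the pass 1D commit above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for two extended options. */
struct fsck_opts {
    const char *clone;
    const char *shared;
};

/* Apply one "key=value" setting; returns -1 for an unknown key. */
static int apply_opt(struct fsck_opts *o, const char *kv)
{
    assert(o != NULL && kv != NULL);

    if (strncmp(kv, "clone=", 6) == 0) {
        o->clone = kv + 6;
        return 0;
    }
    if (strncmp(kv, "shared=", 7) == 0) {
        o->shared = kv + 7;
        return 0;
    }
    return -1;
}

/* Defaults, then config file, then each -E argument: later values win. */
static void parse_all(struct fsck_opts *o,
                      const char *const *config, size_t nconfig,
                      const char *const *cmdline, size_t ncmdline)
{
    size_t i;

    o->clone = "dup";
    o->shared = "preserve";
    for (i = 0; i < nconfig; i++)
        apply_opt(o, config[i]);
    for (i = 0; i < ncmdline; i++)
        apply_opt(o, cmdline[i]);
}
```

Looping over every command-line entry, instead of keeping only the last one, is what makes repeated -E options cumulative.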

Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: invalid value of in-inode EA offset
Andreas Dilger [Thu, 12 Apr 2012 22:57:40 +0000 (16:57 -0600)]
tests: invalid value of in-inode EA offset

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: extent pointing to non-existent block
Andreas Dilger [Fri, 25 May 2012 07:01:28 +0000 (01:01 -0600)]
tests: extent pointing to non-existent block

Signed-off-by: Girish Shilamkar <girish@clusterfs.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: workaround for old extents tests
Andreas Dilger [Thu, 12 Apr 2012 22:15:07 +0000 (16:15 -0600)]
e2fsck: workaround for old extents tests

The e2fsck_ext2fs_extent_get() part of this patch is a workaround
to handle problems with old Lustre extents patches that didn't
clear the ee_start_hi or ei_leaf_hi fields.

That has been fixed for a long time and could be removed as soon
as the f_extent_* tests are fixed to clear these _hi fields.
Otherwise the extents are all marked as corrupt, which ruins the
value of those tests.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago tests: verify > 65000 subdirectories
Andreas Dilger [Thu, 12 Apr 2012 22:02:12 +0000 (16:02 -0600)]
tests: verify > 65000 subdirectories

Add test case to verify nlink handling of large directories.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago TT-177 build: add .spec file for SLES11 packaging
Andreas Dilger [Fri, 13 Apr 2012 08:23:12 +0000 (02:23 -0600)]
TT-177 build: add .spec file for SLES11 packaging

Include the upstream SLES11 .spec file to ensure the packages we
build match the upstream packages.  Any later patches that change
the packaging should patch the .spec file appropriately.

Add in the SLES-specific patches, excluding the replacement de.po
file, since the original SLES11 de.po file is only against 1.41.4,
and is missing a large number of changes to the translated messages
related to 64-bit format specifiers.

  LU-4284 build: add missing Provides line in SLES spec file

  Need to add a line in the SUSE spec file for Provides: ldiskfsprogs.
  This is present in the RHEL spec file and is needed to resolve
  dependencies in lustre server rpms at rpm install time.

Signed-off-by: Bob Glossman <bob.glossman@intel.com>
  Change-Id: Ib4821004d27c9a7271ffdbd7403990e586d6c9ca

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I783d58bd78d7c4c66cc85ec5557ae1aaf64016ba

11 years ago build: add RHEL6 .spec file for packaging
Andreas Dilger [Fri, 13 Apr 2012 08:19:19 +0000 (02:19 -0600)]
build: add RHEL6 .spec file for packaging

Include the upstream RHEL6 .spec file to ensure the packages we
build match the upstream packages.  Any later patches that change
the packaging should patch the .spec file appropriately.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago build: update e2fsprogs.spec for distro builds
Andreas Dilger [Thu, 12 Apr 2012 21:39:04 +0000 (15:39 -0600)]
build: update e2fsprogs.spec for distro builds

Add the distro version to the RPM release number, so that the
RPM names do not conflict.

Allow the RPM built from upstream to replace the split packages
provided by the distros.  At some point in the future it may be
desirable to also split the RPM built by this spec file, but this
is complicated by the fact that SLES and RHEL have different splits.

Signed-off-by: Girish Shilamkar <girish.shilamkar@sun.com>
Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago filefrag: Lustre changes to filefrag FIEMAP handling
Andreas Dilger [Thu, 12 Apr 2012 21:31:35 +0000 (15:31 -0600)]
filefrag: Lustre changes to filefrag FIEMAP handling

Add support for multiple-device filesystems by defining a new
fe_device field in the fiemap_extent structure.  This allows
printing the filesystem-relative or linux block device number
associated with each extent of a file.  If a single filesystem
extent is mirrored to multiple block devices, the fe_device
field can be used to disambiguate the multiple copies.

If the "-l" (device-logical) option is given to filefrag, then
all extents for a particular device of a file are returned
before returning extents for the next device.  This makes it
easier to see if extent allocation within a single device is
contiguous, instead of returning all of the blocks of a file
interleaved in file-logical-offset order.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago blkid: fix ZFS device detection
Andreas Dilger [Thu, 12 Apr 2012 21:26:49 +0000 (15:26 -0600)]
blkid: fix ZFS device detection

Fix the ZFS device detection by looking at multiple uberblocks to
see if any are present, rather than looking for the ZFS boot block
which is not always present.

There may be up to 128 uberblocks, but the first 4 are not written
to disk on a newly-formatted filesystem so check several of them at
different offsets within the uberblock array.
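A sketch of the probing strategy, operating on an in-memory copy of the uberblock array. The magic value 0x00bab10c is the ZFS uberblock magic; the 1 KiB slot size, the list of probed slots, and the function names are assumptions for illustration and are not taken from the blkid code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define UBERBLOCK_MAGIC 0x00bab10cULL /* ZFS uberblock magic */
#define UBERBLOCK_SIZE  1024u         /* assumed size of one slot */
#define UBERBLOCK_COUNT 128u          /* slots in the array, per the text */

/* Portable 64-bit byte swap, so the magic matches in either endianness. */
static uint64_t swab64_sketch(uint64_t v)
{
    uint64_t r = 0;
    int i;

    for (i = 0; i < 8; i++) {
        r = (r << 8) | (v & 0xff);
        v >>= 8;
    }
    return r;
}

/* Count how many of the probed slots start with the uberblock magic. */
static int count_uberblocks(const unsigned char *array, size_t len)
{
    /* Hypothetical probe list: several slots spread across the array. */
    static const unsigned int probes[] = { 0, 4, 8, 16, 32, 64, 127 };
    size_t i;
    int found = 0;

    assert(array != NULL);
    for (i = 0; i < sizeof(probes) / sizeof(probes[0]); i++) {
        size_t off = (size_t)probes[i] * UBERBLOCK_SIZE;
        uint64_t magic;

        if (probes[i] >= UBERBLOCK_COUNT || off + sizeof(magic) > len)
            continue;
        memcpy(&magic, array + off, sizeof(magic));
        if (magic == UBERBLOCK_MAGIC ||
            magic == swab64_sketch(UBERBLOCK_MAGIC))
            found++;
    }
    return found;
}
```

Probing slots beyond the first few matters because, as noted above, the first 4 are not written on a newly-formatted filesystem.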

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago e2fsck: improve in-inode xattr checks
Andreas Dilger [Thu, 12 Apr 2012 21:23:47 +0000 (15:23 -0600)]
e2fsck: improve in-inode xattr checks

Add check for in-inode xattr to make sure that it is not referencing
an offset that is beyond the end of the inode.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago LU-0000 tests: fix resize test tmpfs max-file-size checking 46/9746/4
Andreas Dilger [Fri, 21 Mar 2014 07:05:51 +0000 (01:05 -0600)]
LU-0000 tests: fix resize test tmpfs max-file-size checking

Old distros may not have the "truncate" tool, so use "dd" instead.

If tmpfs cannot handle a 2GB temp file (e.g. old RHEL5 and SLES 11
kernels) then skip the test instead of failing it.  If this fails,
try to report better error messages instead of failing silently.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
Change-Id: I575ebe1c902e34876a6f507a1a28eb30f4500c1e

11 years ago tests: make generated test scripts read-only
Andreas Dilger [Wed, 23 May 2012 21:05:21 +0000 (15:05 -0600)]
tests: make generated test scripts read-only

Make generated test scripts read-only, to avoid errors by developers
editing the generated test scripts and then having them accidentally
clobbered when "make" is run again.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago build: update version for Lustre build
Andreas Dilger [Thu, 12 Apr 2012 20:00:07 +0000 (14:00 -0600)]
build: update version for Lustre build

Add Lustre-specific build version to distinguish packages from
upstream packages.

Signed-off-by: Andreas Dilger <andreas.dilger@intel.com>
11 years ago mke2fs: disable resize_inode feature if 64bit feature is enabled
Eryu Guan [Thu, 4 Jul 2013 09:05:10 +0000 (17:05 +0800)]
mke2fs: disable resize_inode feature if 64bit feature is enabled

Since auto_64-bit_support is on by default, resize_inode feature will
be disabled when creating a >16T ext4 according to mke2fs.conf(5).

This should also be done when making ext4 with "-O 64bit" to enable
the 64bit feature explicitly. Otherwise online resize to enlarge an
over-16T fs would fail.

[root@localhost resize]# truncate -s 50t fs.img
[root@localhost resize]# losetup /dev/loop0 fs.img
[root@localhost resize]# mkfs -t ext4 -O 64bit /dev/loop0 30t
[root@localhost resize]# mount /dev/loop0 mnt
[root@localhost resize]# resize2fs /dev/loop0
resize2fs 1.42.7 (21-Jan-2013)
Filesystem at /dev/loop0 is mounted on /root/resize/mnt; on-line resizing required
old_desc_blocks = 3840, new_desc_blocks = 6400
resize2fs: Invalid argument While checking for on-line resizing support

And dmesg shows
[688378.442623] EXT4-fs (loop0): resizing filesystem from 6710886400 to 13421772800 blocks
[688378.443216] EXT4-fs warning (device loop0): verify_reserved_gdb:700: reserved GDT 3201 missing grp 177147 (5804756097)
[688378.443222] EXT4-fs (loop0): resized filesystem to 8858370048
[688378.528451] EXT4-fs warning (device loop0): ext4_group_extend:1710: can't shrink FS - resize aborted

With this fix resize2fs could do the online enlarge correctly.

Signed-off-by: Eryu Guan <guaneryu@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years ago e2fsck: don't use e2fsck_global_ctx in e2fsck_set_bitmap_type()
Theodore Ts'o [Wed, 5 Mar 2014 00:10:26 +0000 (19:10 -0500)]
e2fsck: don't use e2fsck_global_ctx in e2fsck_set_bitmap_type()

There is no reason to use e2fsck_global_ctx in
e2fsck_set_bitmap_type(), since we can get the context structure from
fs->priv_data.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years ago e2fsck: always make sure e2fsck_global_ctx is set
Theodore Ts'o [Wed, 5 Mar 2014 00:05:00 +0000 (19:05 -0500)]
e2fsck: always make sure e2fsck_global_ctx is set

The e2fsck_global_ctx variable was only being set if HAVE_SIGNAL_H is
defined.  There are systems, such as Android, where this is not true.

This was causing e2fsck_set_bitmap_type() to seg fault since
e2fsck_global_ctx was not set.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: JP Abgrall <jpa@google.com>
11 years ago debian: fix udeb package support
Filipe Brandenburger [Tue, 25 Feb 2014 06:33:00 +0000 (01:33 -0500)]
debian: fix udeb package support

Previous commit which introduced SKIP_UDEB variable had typos in the
variable name in the m4 macros of control.in (UDEV vs. UDEB.) Fix those
typos and fix m4 quoting problem in "Don't".

Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years ago resize2fs: don't free in-use clusters when moving blocks
Darrick J. Wong [Mon, 24 Feb 2014 01:54:54 +0000 (20:54 -0500)]
resize2fs: don't free in-use clusters when moving blocks

When we're moving blocks around the filesystem, ensure that freeing
the old blocks only frees the clusters if they're not in use by other
metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years ago resize2fs: during shrink, don't free in-use bg data clusters
Darrick J. Wong [Mon, 24 Feb 2014 01:33:59 +0000 (20:33 -0500)]
resize2fs: during shrink, don't free in-use bg data clusters

When freeing a block group descriptor block, be careful not to free
metadata clusters belonging to other groups!

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years ago e2fsck: don't add a UUID on a mounted filesystem with csums
Michael Marineau [Sun, 19 Jan 2014 22:09:34 +0000 (14:09 -0800)]
e2fsck: don't add a UUID on a mounted filesystem with csums

This fix is similar to 66457fcb for tune2fs. When booting from a root
filesystem with an empty UUID, which fsck fixes, the following remount
step reliably fails, leaving the filesystem in an inconsistent state.
Like the tune2fs fix this patch resolves the issue by simply refusing to
update the UUID if the filesystem is mounted.

Signed-off-by: Michael Marineau <michael.marineau@coreos.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years ago Add coverage testing using gcov
Theodore Ts'o [Sun, 23 Feb 2014 05:17:09 +0000 (00:17 -0500)]
Add coverage testing using gcov

To check the coverage of e2fsprogs's regression test, do the
following:

configure --enable-gcov
make -j8 ; make -j8 check ; make coverage.txt

The coverage information will be the coverage.txt and *.gcov files in
the build directories.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years ago Set pointer to NULL after ext2fs_free
Lukas Czerner [Fri, 21 Feb 2014 01:54:29 +0000 (20:54 -0500)]
Set pointer to NULL after ext2fs_free

ext2fs_free() does not set the ext2_filsys pointer to NULL, so the
caller is responsible for setting it if needed.

This patch fixes some places where caller did not set ext2_filsys
pointer to NULL after ext2fs_free() which might result in use after
free.  Fix it.
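
The pattern applied here can be sketched as follows.  The toy_fs type
and both helpers are hypothetical stand-ins, not the libext2fs API; the
only property borrowed from ext2fs_free() is that it takes the pointer
by value and therefore cannot clear the caller's copy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for ext2_filsys; not the libext2fs type. */
struct toy_fs {
	int dummy;
};

/* Like ext2fs_free(), takes the pointer by value, so the caller's
 * variable still holds the old (now dangling) address afterwards. */
static void toy_free(struct toy_fs *fs)
{
	free(fs);
}

/* Frees *fsp and clears the caller's pointer, so that a later
 * "if (fs)" test cannot lead to a use after free. */
static void toy_free_and_clear(struct toy_fs **fsp)
{
	assert(fsp != NULL);
	toy_free(*fsp);
	*fsp = NULL;
}
```

A caller would write toy_free_and_clear(&fs), which is equivalent to
calling toy_free(fs) and then assigning fs = NULL by hand.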

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agotune2fs: allow removal of dirty journal with two "-f" options
Eric Sandeen [Fri, 21 Feb 2014 01:18:41 +0000 (20:18 -0500)]
tune2fs: allow removal of dirty journal with two "-f" options

Jim pointed out that "tune2fs -f -O ^has_journal" won't remove the
journal if the needs_recovery flag is set; the manpage seems to indicate
that it should.  And if you've lost an external journal and can no longer
replay it, how should one proceed?

Change tune2fs so that two "-f" options will allow removal of a dirty
journal from a filesystem, even if the filesystem needs recovery.

e2fsck can then do its best to pick up the pieces.

Addresses-Debian-Bug: #559301

Reported-by: Jim Faulkner <james.faulkner@yale.edu>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: delete unused "handle" variable
jon ernst [Fri, 21 Feb 2014 00:59:24 +0000 (19:59 -0500)]
libext2fs: delete unused "handle" variable

After commit 62f17f36031102a2a40fac338e063c556f73b94a, variable
"handle" has no use.  So delete it.

Signed-off-by: Jon Ernst <jonernst07@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoe4defrag: remove local sync_file_range and fallocate
Baruch Siach [Wed, 19 Feb 2014 01:02:49 +0000 (20:02 -0500)]
e4defrag: remove local sync_file_range and fallocate

The locally defined versions of both sync_file_range and fallocate are broken
on 32bit systems. On these systems two 32bit registers are needed for each
64bit parameter. Also, sync_file_range on MIPS32 needs a dummy parameter
after the fd parameter. Just leave all these subtleties to the C library.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agoext2fs: declare struct_io_manager at end of file
Andreas Dilger [Tue, 18 Feb 2014 23:31:36 +0000 (18:31 -0500)]
ext2fs: declare struct_io_manager at end of file

Declare struct_io_manager at the end of unix_io.c, undo_io.c, and
test_io.c files so that there isn't a need to forward declare every
member of this structure.  That avoids a lot of redundant code
at the start of every one of these files.

Move the test_flush() function above test_abort() to avoid the need
for a forward declaration.

Fix a few instances of space before tab in these files.

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agotests: skip unsupported tests on MacOS systems
Andreas Dilger [Tue, 18 Feb 2014 23:29:55 +0000 (18:29 -0500)]
tests: skip unsupported tests on MacOS systems

The "mkswap" program is not available on MacOS, so just use the
existing swap0.img.bz2 and swap1.img.bz2 files directly.

Because MacOS HFS+ doesn't support sparse files (welcome to the 80's)
the m_bigjournal test takes forever to zero out the whole 42GB test
filesystem.  Skip this test for Darwin kernels for now.

Unfortunately, neither "df -T" nor "stat -f -c %T" is available on
MacOS to directly determine the filesystem type, and I'm too lazy
to parse the output of "mount" and match it to the path of the test
directory in shell, so it just checks the kernel type and assumes
the filesystem type is HFS and skips the test.

Since this test runs on Linux the majority of the time, the loss of
test coverage is minimal.  If MacOS should ever get a real filesystem,
this can be revisited.

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agobuild: fix LLVM compiler warnings
Andreas Dilger [Tue, 18 Feb 2014 17:12:32 +0000 (12:12 -0500)]
build: fix LLVM compiler warnings

Fix a number of non-literal string format warnings from LLVM due
to the use of _() that were not fixed in commit 45ff69ffeb.

Fix mismatched int vs. __u64 format warnings in blkmap64_rb.c.
There were also some comparisons of __u64 start or count <= 0.
Change them to be comparisons == 0, or start + count overflow.
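
A minimal sketch of that kind of check, using uint64_t in place of
__u64.  The function name is hypothetical and this is not the code in
blkmap64_rb.c; it only shows why "<= 0" on an unsigned value reduces to
"== 0" and how a wrapped sum can be detected.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if [start, start + count) is a usable range.  An unsigned
 * count can never be negative, so "count <= 0" means "count == 0". */
static int range_is_valid(uint64_t start, uint64_t count)
{
	if (count == 0)
		return 0;
	/* Unsigned addition wraps; a sum smaller than start overflowed. */
	if (start + count < start)
		return 0;
	return 1;
}
```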

Fix operator precedence warning for (value & (value - 1) != 0)
introduced in 11d1116a7c0b.  It seems "&" is lower precedence
than "!=", so the check did not reject every non-power-of-two value,
but only odd values.  Fortunately, neither s_desc_size nor
s_inode_size is valid if odd.
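
The difference can be illustrated with two small helpers.  The names
are made up for this sketch; the first one spells out the grouping the
compiler applies to the unparenthesized expression.

```c
#include <assert.h>

/* What "value & (value - 1) != 0" actually means: "!=" binds tighter
 * than "&", so the comparison is evaluated first. */
static int buggy_not_power_of_two(unsigned int value)
{
	return value & ((value - 1) != 0);
}

/* The intended test: clear the lowest set bit and see what is left. */
static int fixed_not_power_of_two(unsigned int value)
{
	return (value & (value - 1)) != 0;
}

/* Sample values: 6 is even but not a power of two, so only the fixed
 * version flags it; 3 is odd, so both versions flag it. */
static void precedence_demo(void)
{
	assert(buggy_not_power_of_two(6) == 0);
	assert(fixed_not_power_of_two(6) == 1);
	assert(buggy_not_power_of_two(3) == 1);
	assert(fixed_not_power_of_two(3) == 1);
	assert(buggy_not_power_of_two(64) == 0);
	assert(fixed_not_power_of_two(64) == 0);
}
```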

Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agochattr: improve the description for 'j' option in manpage
Zheng Liu [Wed, 12 Feb 2014 17:28:29 +0000 (12:28 -0500)]
chattr: improve the description for 'j' option in manpage

The ext4 file system also supports setting/clearing the 'j' attribute, but
the manpage says that this option is only useful for ext3.  This commit
fixes it.

Signed-off-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
11 years agolibe2p: allow libe2p.h to be used in C++ programs
Theodore Ts'o [Fri, 7 Feb 2014 22:25:28 +0000 (17:25 -0500)]
libe2p: allow libe2p.h to be used in C++ programs

In C++, "private" is a reserved keyword, so don't use it in the header
file as a function parameter name.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: try to roll back when splitting an extent fails
Darrick J. Wong [Thu, 6 Feb 2014 20:34:00 +0000 (15:34 -0500)]
libext2fs: try to roll back when splitting an extent fails

If a client asks us to remap a block in the middle of an extent, we
potentially have to allocate a fair number of blocks to handle extent
tree splits.  A failure in either of the ext2fs_extent_insert calls
leaves us with an extent tree that no longer maps the logical block in
question and everything that came after it!  Therefore, try to roll
back the extent tree changes before returning an error code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agolibext2fs: don't hang on to unmapped block if extent tree update fails
Darrick J. Wong [Thu, 6 Feb 2014 20:32:18 +0000 (15:32 -0500)]
libext2fs: don't hang on to unmapped block if extent tree update fails

If we're doing a BMAP_ALLOC allocation and the extent tree update
fails, there's no point in hanging on to the newly allocated block.
So, free it to make fsck happy.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agolibext2fs: during punch, fix parent extents after modifying extent
Darrick J. Wong [Thu, 6 Feb 2014 20:30:59 +0000 (15:30 -0500)]
libext2fs: during punch, fix parent extents after modifying extent

When modifying/removing an extent during punch, don't forget to update
the extent's parents.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agolibext2fs: iterate past lower extents during punch
Darrick J. Wong [Thu, 6 Feb 2014 20:29:15 +0000 (15:29 -0500)]
libext2fs: iterate past lower extents during punch

When we're iterating extents during a punch operation, the loop exits
if the punch region is entirely to the right of the extent we're
looking at.  This can happen if the punch region starts in the middle
of a hole and covers mapped extents.  When this happens, we want to
skip to the next extent, because it might be punchable.

Also, if we've totally passed the punch range, stop.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agomke2fs: clean up kernel version tests
Darrick J. Wong [Thu, 6 Feb 2014 20:24:01 +0000 (15:24 -0500)]
mke2fs: clean up kernel version tests

Refactor the running kernel version checks to hide the details of
version code checking, etc.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Zheng Liu <wenqing.lz@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agoe2fsprogs: Disallow tune2fs enabling sparse_super with ext4 meta_bg enabled
Akira Fujita [Thu, 6 Feb 2014 20:11:52 +0000 (15:11 -0500)]
e2fsprogs: Disallow tune2fs enabling sparse_super with ext4 meta_bg enabled

When the meta_bg feature is enabled, a group descriptor block is allocated
every 128 block groups (or every 64 block groups if the 64bit feature is
enabled).
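
The 128 and 64 figures are consistent with the number of descriptors
that fit in one block.  The sketch below assumes a 4096-byte block and
descriptor sizes of 32 bytes (64 bytes with 64bit); those sizes are
assumptions of this example and are not stated in the commit message.

```c
#include <assert.h>

/* Number of group descriptors that fit in one block.  With meta_bg,
 * this is also how many block groups share one descriptor block. */
static unsigned int groups_per_desc_block(unsigned int block_size,
					  unsigned int desc_size)
{
	assert(desc_size != 0 && block_size % desc_size == 0);
	return block_size / desc_size;
}
```

groups_per_desc_block(4096, 32) gives 128 and
groups_per_desc_block(4096, 64) gives 64.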

In such a situation, files in block groups numbered #128 or higher will
be removed if the sparse_super feature is enabled with tune2fs and
e2fsck is then run, as it must be.

This happens because tune2fs does not reallocate group descriptor blocks;
it only sets the sparse_super feature flag.  Once sparse_super is set,
ext2fs_descriptor_block_loc2(), called by e2fsck, assumes that such a
block group (e.g. #128) has its group descriptor block at the start of
the group.  But that location was previously used for a backup
superblock.  So e2fsck repairs the filesystem based on invalid group
descriptor blocks, and this causes data loss.

The patch avoids this problem simply by disallowing tune2fs from enabling
sparse_super if meta_bg is enabled.

Steps to reproduce:

1. Create ext4 which has meta_bg, ^sparse_super and 129+ block groups.
# mke2fs -t ext4 -O meta_bg,^resize_inode,^sparse_super DEV 17G
# mount DEV /MP

2. Create a directory and files which use block group #128's metadata.
# echo $((8192*128+1)) > /sys/fs/ext4/DEV/inode_goal
# mkdir /MP/DIR
# for i in $(seq 1 100); do dd if=/dev/urandom of=/MP/DIR/file$i bs=1024 count=10; done

3. Enable sparse_super with tune2fs then execute e2fsck.
   Data in block group #128 will be lost!!
# umount DEV
# tune2fs -O sparse_super DEV
# e2fsck/e2fsck -yf DEV

Signed-off-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agodebian: fix spelling typo in debian/control.in
Theodore Ts'o [Thu, 6 Feb 2014 20:00:44 +0000 (15:00 -0500)]
debian: fix spelling typo in debian/control.in

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agodebian: add make variable to prevent building udeb packages
Theodore Ts'o [Thu, 6 Feb 2014 19:59:25 +0000 (14:59 -0500)]
debian: add make variable to prevent building udeb packages

Setting SKIP_UDEBS=yes in rules.custom will prevent the debian/rules
makefile from building the udeb files for the debian installer.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agomke2fs: minor bugfixes for mk_hugefiles
Theodore Ts'o [Thu, 6 Feb 2014 19:34:12 +0000 (14:34 -0500)]
mke2fs: minor bugfixes for mk_hugefiles

Interpret "zero_hugefiles" relation in mke2fs.conf as a boolean value,
as documented in the man page.

If the hugefile is larger than 2GB, set the large_file file system
feature so e2fsck doesn't complain.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoconfigure: support biarch builds with --multiarch=lib64
Theodore Ts'o [Wed, 5 Feb 2014 20:45:36 +0000 (15:45 -0500)]
configure: support biarch builds with --multiarch=lib64

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agodebian: update debian/changelog and e2fslibs.symbols for 1.42.9-3 release
Theodore Ts'o [Wed, 5 Feb 2014 03:59:25 +0000 (22:59 -0500)]
debian: update debian/changelog and e2fslibs.symbols for 1.42.9-3 release

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agodebian: fix dpkg-buildpackage for Debian Squeeze
Theodore Ts'o [Wed, 5 Feb 2014 02:45:51 +0000 (21:45 -0500)]
debian: fix dpkg-buildpackage for Debian Squeeze

Commit becb01ce84d breaks building e2fsprogs with dpkg 1.15.8 which is
used in Debian 6.0 (Squeeze), since it doesn't support package
specifications qualified with an architecture (i.e., "dpkg-query -W
libblkid1:amd64").

Debian only needs to use its own version of libblkid and libuuid for
versions of Debian 5.0 (Lenny) or before.  So default to using
util-linux-ng, instead of trying to test the version number of
libblkid1.

Lenny was released in February, 2009, and the current stable Debian
release is 7.x, so it is two stable releases back as of February 2014.
In the unlikely case someone needs to build a modern version of
e2fsprogs on a version of Debian which is five years old or older, they
can create the file debian/rules.custom with the line:

UTIL_LINUX_NG = no

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agomke2fs: add support for hugefiles_align
Theodore Ts'o [Tue, 4 Feb 2014 17:30:00 +0000 (12:30 -0500)]
mke2fs: add support for hugefiles_align

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoFix up the Makefiles dependencies in lib/ext2fs and lib/quota
Theodore Ts'o [Thu, 30 Jan 2014 23:48:23 +0000 (18:48 -0500)]
Fix up the Makefiles dependencies in lib/ext2fs and lib/quota

Also use angle brackets for the #include of dirpaths.h to avoid the
need to manually massage the Makefile.in for the util directory.  This
is needed because we have to create a fake dirpaths.h file in the util
directory.  The fake dirpaths.h file is required to break the circular
dependency caused by util/subst creating dirpaths.h, while
util/subst.c is including config.h, which includes dirpaths.h.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoblkid: suppress Coverity warning
Theodore Ts'o [Thu, 30 Jan 2014 23:02:37 +0000 (18:02 -0500)]
blkid: suppress Coverity warning

The getopt() function will never let optarg be NULL (at least without
using the GNU double-colon extension, which we don't use because it's
not portable), so don't bother checking for that case.  It's harmless,
but it triggers a Coverity warning elsewhere, since it thinks optarg
could in fact be NULL.

Addresses-Coverity-Id: #1049156

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibss: fix potential buffer overrun in list_rqs
Theodore Ts'o [Thu, 30 Jan 2014 22:45:36 +0000 (17:45 -0500)]
libss: fix potential buffer overrun in list_rqs

Addresses-Coverity-Bug: #709516

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoquota: fix uninitialized memory reference in mke2fs with quota enabled
Theodore Ts'o [Thu, 30 Jan 2014 22:10:46 +0000 (17:10 -0500)]
quota: fix uninitialized memory reference in mke2fs with quota enabled

Initialize the on-disk structure before we fill it in, to avoid the
following valgrind warning:

   Conditional jump or move depends on uninitialised value(s)
      at 0x4323A8: qtree_entry_unused (quotaio_tree.c:40)
      by 0x431218: v2r1_mem2diskdqblk (quotaio_v2.c:85)
      by 0x432409: qtree_write_dquot (quotaio_tree.c:336)
      by 0x431136: v2_commit_dquot (quotaio_v2.c:264)
      by 0x42FB63: quota_write_inode (mkquota.c:126)
      by 0x408BE6: create_quota_inodes (mke2fs.c:2466)
      by 0x409A2D: main (mke2fs.c:2850)
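
The general shape of the fix can be sketched as below.  The structure
and function names are hypothetical, not the quota library's own; the
byte-wise scan is only assumed to resemble what a check such as
qtree_entry_unused() does, which is why a field left unassigned would
be read uninitialized.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical on-disk record; "pad" is never assigned explicitly. */
struct toy_disk_dqblk {
	unsigned int id;
	unsigned int pad;
	unsigned long long space;
};

/* Fills the on-disk record, zeroing it first so that every byte,
 * including "pad" and any compiler padding, has a defined value. */
static void toy_mem2disk(struct toy_disk_dqblk *d, unsigned int id,
			 unsigned long long space)
{
	assert(d != NULL);
	memset(d, 0, sizeof(*d));
	d->id = id;
	d->space = space;
}

/* Returns 1 if every byte of the record is zero. */
static int toy_entry_unused(const struct toy_disk_dqblk *d)
{
	const unsigned char *p = (const unsigned char *)d;
	size_t i;

	for (i = 0; i < sizeof(*d); i++)
		if (p[i])
			return 0;
	return 1;
}
```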

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoblkid: avoid potential integer overflow issues identified by Coverity
Theodore Ts'o [Thu, 30 Jan 2014 21:19:01 +0000 (16:19 -0500)]
blkid: avoid potential integer overflow issues identified by Coverity

Addresses-Coverity-Id: #1049157
Addresses-Coverity-Id: #1049158

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agomke2fs: add make_hugefile feature
Theodore Ts'o [Tue, 21 Jan 2014 04:06:07 +0000 (23:06 -0500)]
mke2fs: add make_hugefile feature

This feature is enabled via settings in /etc/mke2fs.conf.  For
example:

hugefile = {
features = extent,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize,^resize_inode,sparse_super2
inode_size = 128
num_backup_sb = 0
packed_meta_blocks = 1
make_hugefiles = 1
inode_ratio = 4194304
hugefiles_dir = /database
hugefiles_uid = 120
hugefiles_gid = 50
hugefiles_name = storage
hugefiles_digits = 4
hugefiles_size = 1G
num_hugefiles = 0
}

Then "mke2fs -T hugefile /dev/sdXX" will create as many 1G files as
needed to fill the file system.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoe2fsck, mke2fs: enable octal integers in the profile/config file
Theodore Ts'o [Tue, 21 Jan 2014 05:59:03 +0000 (00:59 -0500)]
e2fsck, mke2fs: enable octal integers in the profile/config file

If an integer in the config file starts with a 0, interpret it as an
octal number.
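
One way to get this behavior in C is strtol() with base 0, sketched
below.  The helper name is hypothetical and this is not claimed to be
how the profile code parses integers; note that base 0 also accepts a
"0x" prefix for hexadecimal.

```c
#include <assert.h>
#include <stdlib.h>

/* Parses an integer with strtol() base 0: a leading "0x" selects
 * hexadecimal, a leading "0" selects octal, anything else decimal.
 * Returns -1 for empty input or trailing garbage in this sketch. */
static long parse_config_int(const char *text)
{
	char *end;
	long value;

	assert(text != NULL);
	value = strtol(text, &end, 0);
	if (end == text || *end != '\0')
		return -1;
	return value;
}
```

With this helper, "0755" parses as 493 while "755" stays 755.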

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agomke2fs: allow metadata blocks to be at the beginning of the file system
Theodore Ts'o [Tue, 28 Jan 2014 19:44:23 +0000 (14:44 -0500)]
mke2fs: allow metadata blocks to be at the beginning of the file system

Add the extended options packed_meta_blocks and journal_location_front
which cause mke2fs to place the metadata blocks at the beginning of
the file system.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoAdd support for new compat feature "sparse_super2"
Theodore Ts'o [Sun, 12 Jan 2014 03:11:42 +0000 (22:11 -0500)]
Add support for new compat feature "sparse_super2"

In practice, it is **extremely** rare for users to try to use more
than the first backup superblock located at the beginning of block
group #1.  (i.e., at block number 32768 for file systems with a 4k
block size).  This new compat feature restricts the backup superblock
to block group #1 and the last block group in the file system.
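
The block number quoted above can be reproduced with a small
calculation.  This sketch assumes the common default of 8 * block_size
blocks per group, and a first data block of 1 for 1k-block file systems
and 0 otherwise; the function name is made up for the example.

```c
#include <assert.h>

/* First block of a block group, assuming default-sized groups. */
static unsigned long long group_first_block(unsigned int block_size,
					    unsigned int group)
{
	unsigned long long blocks_per_group = 8ULL * block_size;
	unsigned long long first_data_block = (block_size == 1024) ? 1 : 0;

	assert(block_size >= 1024);
	return first_data_block + group * blocks_per_group;
}
```

For a 4k block size, group_first_block(4096, 1) is 32768, the location
of the first backup superblock.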

Aside from reducing the overhead of the file system by a small number
of blocks, by eliminating the rest of the backup superblocks, it
allows us to have a much more flexible metadata layout.  For example,
we can force all of the allocation bitmaps and inode table blocks to
the beginning of the disk, which allows most of the disk to be
exclusively used for contiguous data blocks.

This simplifies taking advantage of certain HDD specific features,
such as Shingled Magnetic Recording (aka Shingled Drives), and the
TCG's OPAL Storage Specification where having a simple mapping between
LBA block ranges and the data blocks used by the file system can make
life much simpler.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agotune2fs, mke2fs: add the ability to control the location of the journal
Theodore Ts'o [Tue, 28 Jan 2014 17:58:56 +0000 (12:58 -0500)]
tune2fs, mke2fs: add the ability to control the location of the journal

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: add new function ext2fs_add_journal_inode2()
Theodore Ts'o [Tue, 28 Jan 2014 17:16:35 +0000 (12:16 -0500)]
libext2fs: add new function ext2fs_add_journal_inode2()

This new function has a parameter which allows the caller to specify
the location of the journal.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: factor out get_midpoint_journal_block() in mkjournal.c
Theodore Ts'o [Tue, 28 Jan 2014 17:12:27 +0000 (12:12 -0500)]
libext2fs: factor out get_midpoint_journal_block() in mkjournal.c

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agomke2fs: optimize fix_cluster_bg_counts()
Theodore Ts'o [Mon, 20 Jan 2014 00:44:45 +0000 (19:44 -0500)]
mke2fs: optimize fix_cluster_bg_counts()

Instead of iterating over the allocation bitmap using
ext2fs_test_block_bitmap2(), bit by bit, use
ext2fs_find_first_set_block_bitmap2() instead.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: optimize ext2fs_new_block2()
Theodore Ts'o [Mon, 20 Jan 2014 00:35:33 +0000 (19:35 -0500)]
libext2fs: optimize ext2fs_new_block2()

If there are hundreds of thousands of blocks which are in use before
the first free block, it is much, MUCH faster to use
ext2fs_find_first_zero_block_bitmap2() instead of searching the
allocation bitmap bit by bit.
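
The idea can be illustrated on a plain byte-array bitmap.  This is a
simplified sketch with made-up function names; the real libext2fs
bitmaps have bitarray and red-black tree backends and are not
structured like this.

```c
#include <assert.h>
#include <stddef.h>

/* Bit-by-bit search: one test per bit before the first zero. */
static long find_first_zero_slow(const unsigned char *map, size_t nbits)
{
	size_t i;

	assert(map != NULL);
	for (i = 0; i < nbits; i++)
		if (!(map[i / 8] & (1u << (i % 8))))
			return (long)i;
	return -1;
}

/* Skips fully set bytes with one comparison each, then examines
 * individual bits only from the first byte that is not 0xff. */
static long find_first_zero_fast(const unsigned char *map, size_t nbits)
{
	size_t byte = 0, nbytes = (nbits + 7) / 8, i;

	assert(map != NULL);
	while (byte < nbytes && map[byte] == 0xff)
		byte++;
	for (i = byte * 8; i < nbits; i++)
		if (!(map[i / 8] & (1u << (i % 8))))
			return (long)i;
	return -1;
}
```

Both functions return the same answer; the second does roughly one
eighth of the comparisons over a long run of in-use blocks, and
word-sized steps would reduce that further.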

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: optimize ext2fs_allocate_group_table()
Theodore Ts'o [Sun, 19 Jan 2014 21:47:21 +0000 (16:47 -0500)]
libext2fs: optimize ext2fs_allocate_group_table()

By using ext2fs_mark_block_bitmap_range2 and/or
ext2fs_block_alloc_stats_range(), we can significantly speed up the
time needed by mke2fs to allocate the inode table.

For example, the CPU time needed to run the command "mke2fs -t ext4
/tmp/foo.img 32T" (where tmpfs was mounted on /tmp) was decreased from
21.7 CPU seconds down to under 1.7 seconds.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: add ext2fs_block_alloc_stats_range()
Theodore Ts'o [Sun, 19 Jan 2014 21:35:50 +0000 (16:35 -0500)]
libext2fs: add ext2fs_block_alloc_stats_range()

This function is more efficient than using ext2fs_block_alloc_stats2()
for each block in a range.  The efficiencies come from being able to
set a block range in the block bitmap at once, and from being able to
update the block group descriptors once per block group.  Especially now
that we are checksumming the block group descriptors, and we are using red
black trees for the allocation bitmaps, these changes can make a huge
difference in the CPU time used by mke2fs when creating very large
file systems.
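
To give a feel for the saving, the sketch below counts how many block
groups a range touches, which is the number of descriptor updates a
per-range approach needs, versus one update per block otherwise.  The
helper is hypothetical and assumes blocks are numbered from 0.

```c
#include <assert.h>

/* Number of block groups overlapped by [start, start + count). */
static unsigned long long groups_touched(unsigned long long start,
					 unsigned long long count,
					 unsigned long long blocks_per_group)
{
	assert(count > 0 && blocks_per_group > 0);
	return (start + count - 1) / blocks_per_group
	       - start / blocks_per_group + 1;
}
```

For example, a range of 3 * 32768 blocks starting at block 0 with 32768
blocks per group touches 3 groups, so 3 descriptor updates replace
98304 per-block updates.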

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: further clean up and rename check_block_uninit
Theodore Ts'o [Sun, 19 Jan 2014 06:24:30 +0000 (01:24 -0500)]
libext2fs: further clean up and rename check_block_uninit

Commit 8e44eb64bb (libext2fs: mark group data blocks when loading
block bitmap) simplified check_block_uninit since we are now
initializing the bitmap when it is loaded from disk.  It left some
variables which were being set but never used, however.  In addition,
since we only need check_block_uninit() to clear the block bitmap's
uninit flag, rename it to clear_block_uninit(), and only call it once
we have found a free block in ext2fs_new_block2().

This cleans up the code some and optimizes things if we need to search
multiple block groups trying to find a free block.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
11 years agolibext2fs: optimize find_first_{zero,set}() for red-black tree based bitmaps
Theodore Ts'o [Mon, 13 Jan 2014 02:46:46 +0000 (21:46 -0500)]
libext2fs: optimize find_first_{zero,set}() for red-black tree based bitmaps

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: optimize find_first_set() for bitarray-based bitmaps
Theodore Ts'o [Mon, 13 Jan 2014 02:45:04 +0000 (21:45 -0500)]
libext2fs: optimize find_first_set() for bitarray-based bitmaps

Basically just a trivial adaption of the find_first_zero() function
for bitarray-based bitmaps.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: build tst_bitmaps with rep invariants checking enabled
Theodore Ts'o [Mon, 13 Jan 2014 01:18:55 +0000 (20:18 -0500)]
libext2fs: build tst_bitmaps with rep invariants checking enabled

When building tst_bitmaps, enable #define DEBUG_RB, so we are
always testing the sanity of the in-memory representation of the
bitmap when using red-black trees as part of a "make check" run.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: clean up generic handling of ext2fs_find_first_{set,zero}_*()
Theodore Ts'o [Mon, 13 Jan 2014 00:45:43 +0000 (19:45 -0500)]
libext2fs: clean up generic handling of ext2fs_find_first_{set,zero}_*()

Move the error checking into the generic bitmap code, and add
support for bitmaps with cluster_bits set.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: fix off-by-one bug in ext2fs_extent_insert()
Theodore Ts'o [Thu, 16 Jan 2014 04:29:21 +0000 (23:29 -0500)]
libext2fs: fix off-by-one bug in ext2fs_extent_insert()

When inserting the first extent into an empty inode, the
ext2fs_extent_insert() leaves path->left set to 1 instead of 0.  Since
path->curr is pointing at the last (only) extent in the file,
path->left should be 0.

This is mostly harmless, and gets corrected fairly quickly if the
calling application jumps to a different part of the extent tree ---
for example, by calling ext2fs_extent_goto(), or calling
ext2fs_extent_get with the flags argument set to EXT2_EXTENT_ROOT.
Which is why we hadn't noticed this problem until now.

However, if you insert four extents using ext2fs_extent_insert, the
fourth insert will end up copying too many bytes in the i_block[]
array, since path->left is one larger than it should be.  This results
in the inode fields i_generation, i_file_acl, and i_size_high getting
zeroed out.
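
The overrun can be modelled with a toy buffer.  In the on-disk inode,
i_block[] is 60 bytes: a 12-byte extent header followed by room for
four 12-byte extents, and the three fields named above occupy the next
12 bytes.  The helper below is a hypothetical model of "shift the
entries to the right of the insertion point", not the code in
ext2fs_extent_insert().

```c
#include <assert.h>
#include <string.h>

enum { HDR_SIZE = 12, EXT_SIZE = 12, I_BLOCK_SIZE = 60, TAIL_SIZE = 12 };

/* buf models i_block[60] followed by the 12 bytes that hold
 * i_generation, i_file_acl and i_size_high.  Moves "left" entries,
 * starting at entry "slot", one entry to the right. */
static void shift_entries_right(unsigned char *buf, int slot, int left)
{
	unsigned char *src = buf + HDR_SIZE + slot * EXT_SIZE;

	assert(slot >= 0 && left >= 0);
	assert(HDR_SIZE + (slot + 1 + left) * EXT_SIZE
	       <= I_BLOCK_SIZE + TAIL_SIZE);
	memmove(src + EXT_SIZE, src, (size_t)left * EXT_SIZE);
}
```

With slot 3 (the fourth and last entry) and left == 0, nothing moves.
With left == 1, twelve bytes are copied from the still-empty fourth
entry onto the bytes just past i_block[], which matches the zeroed
fields described above.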

This problem can be replicated as follows:

% cp /dev/null /tmp/foo.img
% mke2fs -F -t ext4 /tmp/foo.img 100
% debugfs -w /tmp/foo.img
debugfs: write /dev/null foo
debugfs: set_inode_field foo i_size_high 1
debugfs: stat foo
 <----- note that the inode's size is 4294967296
debugfs: extent_open foo
debugfs (extent ino 12): insert --after 0 1 100
debugfs (extent ino 12): insert --after 1 1 101
debugfs (extent ino 12): insert --after 2 1 102
debugfs (extent ino 12): insert --after 3 1 103
debugfs (extent ino 12): extent_close
debugfs: stat foo
 <----- note that the inode's size is now 0
debugfs: quit

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agolibext2fs: add ext2fs_find_first_set_{block,inode}_bitmap2()
Theodore Ts'o [Sun, 12 Jan 2014 20:57:31 +0000 (15:57 -0500)]
libext2fs: add ext2fs_find_first_set_{block,inode}_bitmap2()

Add functions which try to find the first set block or inode in a
bitmap.  This is useful when trying to allocate a range of blocks
efficiently.

Like the find_first_zero family of functions, provide a generic O(N)
search function which will be used if there is no optimized version
provided by the red-black tree or bitarray functions.

Also, expand the test cases for ext2fs_find_first_zero_*() functions.

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
11 years agoext4.5: remove duplicate .TP in man page
Theodore Ts'o [Sun, 12 Jan 2014 03:07:24 +0000 (22:07 -0500)]
ext4.5: remove duplicate .TP in man page

Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
11 years agotests: adjust test output to reflect block_uninit calculated block bitmaps
Darrick J. Wong [Sat, 11 Jan 2014 19:15:52 +0000 (14:15 -0500)]
tests: adjust test output to reflect block_uninit calculated block bitmaps

Now that libext2fs marks group metadata in the fs block bitmap, adjust
the expected test output to reflect expanded use of block_uninit and
the fact debugfs no longer prints block bitmap data that fails to
account for group data blocks.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agolibext2fs: no need to clear BLOCK_UNINIT during ext2fs_reserve_super_and_bgd
Darrick J. Wong [Sat, 11 Jan 2014 19:15:51 +0000 (14:15 -0500)]
libext2fs: no need to clear BLOCK_UNINIT during ext2fs_reserve_super_and_bgd

Since the beginning of the uninit_bg feature, the kernel[1] and
e2fsck[2] have always been careful to detect the presence of the
BLOCK_UNINIT flag, and compute a block bitmap with any group metadata
blocks marked in that bitmap.  With that in mind, I think it's safe to
say that this is a design feature of uninit_bg.

Now that we've trained libext2fs to have this same behavior whenever
it's loading a block bitmap, we no longer need to unset BLOCK_UNINIT
for a group that contains only its own group metadata -- kernel,
e2fsck, and e2fsprogs will handle this correctly.

[1] kernel git 717d50e4971b81b96c0199c91cdf0039a8cb181a
    "Ext4: Uninitialized Block Groups"
[2] e2fsprogs git f5fa20078bfc05b554294fe9c5505375d7913e8c
    "Add support for EXT2_FEATURE_COMPAT_LAZY_BG"

Reported-by: Akira Fujita <a-fujita@rs.jp.nec.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agoe2fsck: remove uninit block bitmap calculation
Darrick J. Wong [Sat, 11 Jan 2014 19:05:02 +0000 (14:05 -0500)]
e2fsck: remove uninit block bitmap calculation

Since libext2fs now detects a BLOCK_UNINIT group and calculates the
group's block bitmap, we no longer need to emulate this behavior in
e2fsck.  We can simply compare the found block map against the
filesystem's, and proceed from there.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agolibext2fs: mark group data blocks when loading block bitmap
Darrick J. Wong [Sat, 11 Jan 2014 19:04:48 +0000 (14:04 -0500)]
libext2fs: mark group data blocks when loading block bitmap

The kernel[1] and e2fsck[2] both react to a BLOCK_UNINIT group by
calculating the block bitmap that's needed to show all the group
blocks for that group (if any) and using that.  However, when reading
bitmaps from disk, libext2fs simply imports a block of zeroes into the
bitmap, without bothering to check for group blocks.  This erroneous
behavior results in the filesystem having a block bitmap that does not
accurately reflect disk contents, and worse yet makes it seem as
though superblocks, group descriptors, bitmaps, and inode tables are
"free" space on disk.

So, fix the block bitmap loading routines to calculate the correct
block bitmap for all groups and load it into the main fs block bitmap.

This also fixes bogus debugfs output such as:

Group 1: (Blocks 8193-16384) [INODE_UNINIT, BLOCK_UNINIT]
  Checksum 0x1310, unused inodes 512
  Backup superblock at 8193, Group descriptors at 8194-8217
  Reserved GDT blocks at 8218-8473
  Block bitmap at 283 (bg #0 + 282), Inode bitmap at 299 (bg #0 + 298)
  Inode table at 442-569 (bg #0 + 441)
  7911 free blocks, 512 free inodes, 0 directories, 512 unused inodes
  Free blocks: 8193-16384
  Free inodes: 513-1024

Notice how the "free blocks" range includes the backup sb & GDT area
and doesn't match the free block count.
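
The mismatch can be checked by hand from the numbers in that output.
The helper names below are made up; the block ranges are the ones
printed by debugfs above.

```c
#include <assert.h>

/* Number of blocks in the inclusive range [first, last]. */
static unsigned long range_len(unsigned long first, unsigned long last)
{
	assert(last >= first);
	return last - first + 1;
}

/* Free blocks in group 1 once its own metadata is marked in use. */
static unsigned long group1_free_blocks(void)
{
	unsigned long total = range_len(8193, 16384);	/* whole group */
	unsigned long used = range_len(8193, 8193)	/* backup superblock */
			   + range_len(8194, 8217)	/* group descriptors */
			   + range_len(8218, 8473);	/* reserved GDT blocks */

	return total - used;
}
```

The result is 7911, which agrees with the "7911 free blocks" count, so
a consistent free range would be 8474-16384 rather than 8193-16384.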

Worse yet, debugfs' testb command will report those group descriptor
blocks as not being in use unless the user also instructs debugfs to
find a free block first.  That is a rather surprising result:

debugfs:  testb 8194
Block 8194 not in use
debugfs:  ffb 1 16380
Free blocks found: 16380
debugfs:  testb 8194
Block 8194 marked in use

Also, remove the part of check_block_uninit() that "fixes" the bitmap
since we're doing that at bitmap load time now.

[1] kernel git 717d50e4971b81b96c0199c91cdf0039a8cb181a
    "Ext4: Uninitialized Block Groups"
[2] e2fsprogs git f5fa20078bfc05b554294fe9c5505375d7913e8c
    "Add support for EXT2_FEATURE_COMPAT_LAZY_BG"

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
11 years agolibext2fs: don't always read backup group descriptors on a 1k-block meta_bg fs
Darrick J. Wong [Sat, 11 Jan 2014 18:58:15 +0000 (13:58 -0500)]
libext2fs: don't always read backup group descriptors on a 1k-block meta_bg fs

On a filesystem with 1K blocks and meta_bg enabled, opening a
filesystem with automatic superblock detection tries to compensate for
the fact that the superblock lives in block 1.  However, the method by
which this is done is later misinterpreted to mean "read the backup
group descriptors", which is not what we want in this case.

Therefore, in ext2fs_open3() separate the 'group zero' adjustment into
its own variable so that we don't get fed backup group descriptors
when we try to load meta_bg group descriptors.

Furthermore, enhance ext2fs_descriptor_block_loc2() to perform its own
group zero correction.  The other callers of this function neglect to
do any group-zero correction of their own, so this fixes them too.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>