LU-12477 ldiskfs: last cleanups The patch to clean up ldiskfs collided with the landing of the ext4-mballoc-prefetch patch. Remove the last unsupported rhel7 bits. With the new Ubuntu 20 coming out we can drop Ubuntu16 support. Drop 3.12 kernel versions of SUSE. Test-Parameters: trivial Fixes: fc87b01f96e8 ("LU-12477 ldiskfs: remove obsolete ext4 patches") Change-Id: I15f9f59ffb1275e2eaabf7ca543fd4c4829aaf9e Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/38139 Reviewed-by: Shaun Tancheff <shaun.tancheff@hpe.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-12477 ldiskfs: remove obsolete ext4 patches Drop support for ldiskfs kernels earlier than RHEL7.6. Test-Parameters: trivial Change-Id: I30450904c508ec8aa5388cbfd9bd967028f88b28 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/37352 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-12977 ldiskfs: properly take inode_lock() for truncates Originally Lustre grabbed the inode_lock() but this led to deadlocks as described in LU-6446 and LU-4252. The recent work of LU-10048 changed the truncate code so that it is called asynchronously from the main transactions. This should avoid lock ordering issues. It should be safe to take the inode_lock() around ldiskfs_truncate() and remove the WARN(). Test-Parameters: fstype=ldiskfs testlist=racer Change-Id: Id7b6d05d054ab041980e946989aa1effae5c7111 Signed-off-by: James Simmons <jsimmons@infradead.org> Reviewed-on: https://review.whamcloud.com/37116 Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Yang Sheng <ys@whamcloud.com> Tested-by: jenkins <devops@whamcloud.com> Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Oleg Drokin <green@whamcloud.com>
LU-12345 ldiskfs: optimize nodelalloc mode

We found a performance regression when using bigalloc with "nodelalloc" (1MB cluster size):

1. mke2fs -C 1048576 -O ^has_journal,bigalloc /dev/sda
2. mount -o nodelalloc /dev/sda /test/
3. time dd if=/dev/zero of=/test/io bs=1048576 count=1024

The "dd" takes about 2 seconds to finish, but if we run mke2fs without "bigalloc", "dd" takes less than 1 second.

The reason is: when using ext4 with "nodelalloc", it calls ext4_find_delalloc_cluster() nearly every time it calls ext4_ext_map_blocks(), and ext4_find_delalloc_range() also scans all pages in the cluster because no buffer is "delayed". A cluster has 256 pages (1MB cluster), so it scans 256 * 256k pages when creating a 1G file. That severely hurts performance.

Therefore, we return immediately from ext4_find_delalloc_range() in nodelalloc mode, since by definition there can't be any delalloc pages.

The same optimization is also added to the ldiskfs_find_delayed_extent() function, which improves performance dramatically. Here are the results of testing on a two-node system.

Without the patch:
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00   56.30    0.06    0.00   43.63

Device:  rrqm/s  wrqm/s    r/s       w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sds        0.00    0.00   0.00   1174.00   0.00    4.59      8.00      0.84   0.71     0.00     0.71   0.01   1.20

With patch:
08/29/2018 01:13:22 AM
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    4.13   30.37    0.00   65.50

Device:  rrqm/s  wrqm/s    r/s       w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sds        0.00    0.00   0.00  54117.82   0.00  211.43      8.00    152.59   2.82     0.00     2.82   0.02  99.01

Cray-bug-id: LUS-5835
Signed-off-by: Artem Blagodarenko <c17828@cray.com>
Change-Id: Ie33410d4481778ee4f76a054ab8cfc11cc19a0ed
Reviewed-on: https://review.whamcloud.com/34982
Tested-by: Jenkins
Reviewed-by: Andreas Dilger <adilger@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Li Dongyang <dongyangli@ddn.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
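The scan cost described in LU-12345 can be checked with a little arithmetic. The following is a userspace sketch, not the ldiskfs patch itself: it assumes 4 KiB pages and one block-mapping call per page, and the names pages_per_cluster(), delalloc_scan_lookups(), find_delalloc_range_model() and struct scan_stats are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Pages in one bigalloc cluster, e.g. 1 MiB cluster / 4 KiB page = 256. */
static uint64_t pages_per_cluster(uint64_t cluster_bytes, uint64_t page_bytes)
{
	assert(page_bytes != 0 && cluster_bytes % page_bytes == 0);
	return cluster_bytes / page_bytes;
}

/*
 * Estimated page lookups when every page-sized mapping call rescans
 * its whole cluster looking for delayed buffers (assumed model).
 */
static uint64_t delalloc_scan_lookups(uint64_t file_bytes,
				      uint64_t cluster_bytes,
				      uint64_t page_bytes)
{
	uint64_t file_pages = file_bytes / page_bytes;

	return file_pages * pages_per_cluster(cluster_bytes, page_bytes);
}

/* Counter used by the model below to record how many pages were scanned. */
struct scan_stats {
	uint64_t pages_scanned;
};

/*
 * Model of the early return: with delalloc disabled no buffer can be
 * delayed, so the cluster scan is skipped entirely.
 */
static int find_delalloc_range_model(int delalloc_enabled,
				     uint64_t cluster_pages,
				     struct scan_stats *st)
{
	if (!delalloc_enabled)
		return 0;	/* nodelalloc: nothing to find */
	st->pages_scanned += cluster_pages;
	return 0;		/* model: no delayed buffer found */
}
```

Under these assumptions, a 1 GiB file with 1 MiB clusters gives 262144 pages times 256 pages per cluster, matching the "256 * 256k" figure in the commit message, while the early-return path scans nothing.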
LU-11790 ldiskfs: add terminating u32 when expanding inodes In ext4_expand_extra_isize_ea(), we calculate the total size of the xattr header, plus the xattr entries so we know how much of the beginning part of the xattrs to move when expanding the inode extra size. We need to include the terminating u32 at the end of the xattr entries, or else if there are uninitialized, non-zero bytes after the xattr entries and before the xattr values, the list of xattr entries won't be properly terminated. Signed-off-by: Li Dongyang <dongyangli@ddn.com> Change-Id: I247b935b3cf315481dc4658133a7eee02b6350e9 Reviewed-on: https://review.whamcloud.com/33893 Tested-by: Jenkins Tested-by: Maloo <maloo@whamcloud.com> Reviewed-by: Yang Sheng <ys@whamcloud.com> Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
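The termination problem in LU-11790 can be illustrated with a simplified userspace model. This is a sketch under stated assumptions, not the kernel code: it assumes each entry has a 16-byte fixed part whose first byte is the name length, that names are padded to 4 bytes, and that the list ends at the first all-zero 32-bit word. All model_* names and MODEL_* constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout for this model only. */
#define MODEL_ENTRY_FIXED 16u	/* fixed part of one entry */
#define MODEL_PAD 4u		/* entries are padded to 4 bytes */

/* Padded length of one entry with the given name length. */
static size_t model_entry_len(uint8_t name_len)
{
	return ((size_t)MODEL_ENTRY_FIXED + name_len + MODEL_PAD - 1) &
	       ~((size_t)MODEL_PAD - 1);
}

/* The entry list ends at the first 4-byte word that is all zero. */
static int model_is_last_entry(const uint8_t *p)
{
	uint32_t word;

	memcpy(&word, p, sizeof(word));
	return word == 0;
}

/*
 * Bytes occupied by the entry list starting at 'first'. With
 * include_terminator set, the trailing zero u32 is counted as well,
 * so copying that many bytes keeps the list terminated at the
 * destination even if the destination holds stale non-zero bytes.
 */
static size_t model_entries_size(const uint8_t *first, int include_terminator)
{
	const uint8_t *p = first;

	assert(first != NULL);
	while (!model_is_last_entry(p))
		p += model_entry_len(p[0]);	/* byte 0 holds the name length */
	return (size_t)(p - first) +
	       (include_terminator ? sizeof(uint32_t) : 0);
}
```

If the entries are moved into a region that still contains old non-zero bytes, copying only the size without the terminator leaves the walk running past the real end of the list; counting the extra u32 carries the zero word along with the entries.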
LU-11187 ldiskfs: don't mark mmp buffer head dirty Marking the mmp bh dirty before writing it makes writeback pick up the mmp block later and submit a write. We don't want the duplicate write, as the kmmpd thread should have full control of reading and writing the mmp block. Another reason is that we also get random I/O errors on the writeback request when blk integrity is enabled, because kmmpd could modify the content of the mmp block (e.g. setting new seq and time) while the mmp block is under I/O requested by writeback. Linux-commit: fe18d649891d813964d3aaeebad873f281627fbc Test-Parameters: testgroup=review-ldiskfs testlist=mmp Signed-off-by: Li Dongyang <dongyangli@ddn.com> Change-Id: I5aa9fd384a4ea25ee52f1198528fae4ecc9c28c7 Reviewed-on: https://review.whamcloud.com/33038 Tested-by: Jenkins Reviewed-by: Andreas Dilger <adilger@whamcloud.com> Reviewed-by: Wang Shilong <wshilong@ddn.com> Tested-by: Maloo <hpdd-maloo@intel.com>
LU-10048 osd: async truncate osd-ldiskfs should execute truncate outside of main transaction handle. This avoids restarting truncate transaction handles in main transaction, and allows "transaction first, locking second" model on OST. Change-Id: Iffe45c42834c26ca72b65e068ad25ac61d0607c8 Signed-off-by: Alex Zhuravlev <alexey.zhuravlev@intel.com> Reviewed-on: https://review.whamcloud.com/27488 Tested-by: Jenkins Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Tested-by: Maloo <hpdd-maloo@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com>
LU-10859 ldiskfs: fix deadlock with heavy memory pressure

On one customer site, we hit the following deadlock:

Thread 1:
 ofd_object_punch
  osd_punch
   ldiskfs_truncate
    ldiskfs_inode_attach_jinode
    ...
     do_try_to_free_pages
      lu_cache_shrink
       mutex_lock --> try to hold @lu_sites_guard

kswapd Thread 2:
 kthread
  shrink_slab
   lu_cache_shrink
    mutex_lock --> held already
    ...
    dqget
     ldiskfs_acquire_dquot
      jbd2__journal_start --> blocked waiting for more credits

Thread 3:
 kthread
  kjournald2
   jbd2_journal_commit_transaction --> blocked waiting for Thread 2 to finish, since Thread 1 added a handle to the transaction

So the deadlock happens because Thread 1 waits for Thread 2 and Thread 2 waits for Thread 3, but Thread 3 waits for Thread 1.

This problem still exists even though we have switched @lu_sites_guard to a read/write lock, since we hold the write lock in lu_cache_shrink().

Fix the problem by making ldiskfs_inode_attach_jinode() use GFP_NOFS.

Test-Parameters: testgroup=review-ldiskfs \
  mdtfilesystemtype=ldiskfs ostfilesystemtype=ldiskfs
Change-Id: I0ab143fc0cdb8e1b0c490c2c25e8af483c491a81
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Bob Glossman <bob.glossman@intel.com>
Reviewed-on: https://review.whamcloud.com/31806
Tested-by: Jenkins
Tested-by: Maloo <hpdd-maloo@intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger@intel.com>
Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com>
LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0 This enables compatibility with the current LTS flavours of Ubuntu. Do note that you need the Xenial HWE Kernel for Ubuntu 14.04.5, as that distribution originally used a 3.x series Kernel. The patches have been developed to apply cleanly to the kernel versions 4.4.0-45.66 to 4.4.0-85.108 from the Ubuntu Xenial (and its Trusty backports). This change also adjusts the Debian scripting to produce the ldiskfs modules and the server utilities. To create the server modules run "./configure" with "--enable-server" and specify "--enable-ldiskfs" and "--with-zfs/-spl" as appropriate. The call to "make debs" will then produce the server modules and utils instead of their client versions. NOTE: This contains a small hack taken from LU-9995 / #29130 Test-Parameters: trivial Signed-off-by: Martin Schroeder <martin.h.schroeder@intel.com> Change-Id: I02cd5e9314367ad4e1f8f3d81712f84441a8bc71 Reviewed-on: https://review.whamcloud.com/29215 Tested-by: Jenkins Tested-by: Maloo <hpdd-maloo@intel.com> Reviewed-by: James Simmons <uja.ornl@yahoo.com> Reviewed-by: Thomas Stibor <t.stibor@gsi.de> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>
Revert "LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0" This actually breaks our Ubuntu builds which prevents any sort of full testing, so I am reverting this now. This reverts commit 86c3f90d3ab8dbd21dc6fa325aa3a0556fb94035. Change-Id: I14d242bfde1efb0144080b882e63542fc2190465 Reviewed-on: https://review.whamcloud.com/28293 Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Tested-by: Oleg Drokin <oleg.drokin@intel.com> Tested-by: Jenkins
LU-9564 build: Add server-build for Ubuntu with Kernel 4.4.0 This enables compatibility with the current LTS flavours of Ubuntu. Do note that you need the Xenial HWE Kernel for Ubuntu 14.04.5, as that distribution originally used a 3.x series Kernel. The patches have been developed to apply cleanly to the kernel versions 4.4.0-45.66 to 4.4.0-85.108 from the Ubuntu Xenial (and its Trusty backports). This change also adjusts the Debian scripting to produce the ldiskfs modules and the server utilities. To create the server modules run "./configure" with "--enable-server" and specify "--enable-ldiskfs" and "--with-zfs/-spl" as appropriate. The call to "make debs" will then produce the server modules and utils instead of their client versions. Test-Parameters: trivial Signed-off-by: Martin Schroeder <martin.h.schroeder@intel.com> Change-Id: Ia78702da304f735bb932738784f1346be0f39026 Reviewed-on: https://review.whamcloud.com/27323 Tested-by: Jenkins Tested-by: Maloo <hpdd-maloo@intel.com> Reviewed-by: Gu Zheng <gzheng@ddn.com> Reviewed-by: Yang Sheng <yang.sheng@intel.com> Reviewed-by: Thomas Stibor <t.stibor@gsi.de> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com>