From c2d257dfa4505ab45324daad2ca08ddc0d8977c0 Mon Sep 17 00:00:00 2001
From: Jian Yu <yujian@whamcloud.com>
Date: Wed, 7 May 2025 23:48:47 -0700
Subject: [PATCH] LU-18931 build: Update ZFS version to 2.2.7

Update ZFS version to 2.2.7. The changes are listed in:
https://github.com/openzfs/zfs/releases/tag/zfs-2.2.7

LU-18886 zfs-osd: za_name flexible array OI scrub fix

Initialize zap_attribute_t.za_name_len to resolve the OI scrub
failures observed with zfs-2.3.  This is a follow-up to commit
d47a71f7a894a193957fe7771d43c5767979c117, which made an identical
fix at the other sites where a zap_attribute_t is used.

Lustre-change: https://review.whamcloud.com/58762
Lustre-commit: 18f7a2e9ff3536a6d7cefebb9e9a58ffbf460627

Was-Change-Id: Ib4295cbb7ce7e7efe0f7e67b82bc8c73a1f9df8d

Fixes: d47a71f7a894a ("LU-18360 zfs-osd: za_name flexible array")
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>

LU-17924 tests: disable obdfilter-survey for ZFS

Starting with ZFS 2.2.7, the obdfilter-survey test fails due to
memory exhaustion. For now, disable the test for ZFS.

Lustre-change: https://review.whamcloud.com/58796
Lustre-commit: b0bbd61f8d7411560b6aa4ae84b8c680d7bd5c3d

Was-Change-Id: Ibebc637a9b733cf0b262d18de1baeef09108cd36

Signed-off-by: James Simmons <jsimmons@infradead.org>

LU-18153 osd: don't release uninitialized SA

If an osd-zfs object has no initialized SA, do not try to
release it.

Lustre-change: https://review.whamcloud.com/57989
Lustre-commit: fa0e99f28aa015a721de6eea41019a58c25f8606

Was-Change-Id: I210ae1eb9cae0bfb02161efeee2f897d9c37294b

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>

LU-18624 zfs: disable compression by default

By default, ZFS enables compression, which causes test failures
for us. It should be the administrator's choice to use ZFS with
compression turned on or off. Change the mount.lustre zfs backend
to handle the compression=val mount option. By default we turn
it off.

For the test suite we can use <facet_type>_FS_MKFS_OPTS to
turn on compression for testing.

Disable many conf-sanity tests which are broken with ZFS 2.X.

Lustre-change: https://review.whamcloud.com/57990
Lustre-commit: de4f30d5c862e4887d001e8c29cf920e04c5f737

Was-Change-Id: I752c883f6f912a340aa346e1dfb8bf7bdef24939

Signed-off-by: James Simmons <jsimmons@infradead.org>

LU-17763 tests: use urandom in sanity/66

Use /dev/urandom because zfs-2.2.6 compresses data by default,
which breaks the test when /dev/zero is used as the source.

Lustre-change: https://review.whamcloud.com/57987
Lustre-commit: 5751da80ecd57dfae9ad3c080b8c984605da47ec

Was-Change-Id: I6853c693e7cb560ae737025507e5377da82c787b

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>

LU-18728 tests: use urandom to really consume ZFS space

It appears newer ZFS is using data compression by default, so reading
from /dev/zero results in files not consuming the expected amount of
space.  Instead, read from /dev/urandom for ZFS to write files in
sanity and conf-sanity to ensure they fill the OSTs, or the image
to be used for target creation, as expected.

Lustre-change: https://review.whamcloud.com/58115
Lustre-commit: 6153eed3ee180e8695c9e2e3d9ad9db8a6b73ad8

Was-Change-Id: I7b4e95032608d8db82c75e4b6dd1ec5beb6f8d99

Signed-off-by: Bruno Faccini <bfaccini@nvidia.com>

LU-18360 zfs-osd: za_name flexible array

zfs-2.3 commit 3cf2bfa57008af7f0690f73491d7f9b4ac4ed65a
  Allocate zap_attribute_t from kmem instead of stack

The za_name maximum size increased to ZAP_MAXNAMELEN_NEW.

Ensure that space for zap_attribute_t.za_name is available, and
initialize zap_attribute_t.za_name_len to the allocated size,
MAXNAMELEN.

Also mkfs should disable longname support as it would break
MAX_NAME interop.
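
A minimal userspace sketch of the flexible-array pattern described
above. It is an illustration only: struct demo_attr, DEMO_MAXNAMELEN
and both helpers are hypothetical stand-ins, not the ZFS or Lustre
definitions, and the rejection of names when the length field is
unset is an assumed behavior chosen to show why the owner has to
record the capacity it allocated.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAXNAMELEN 256	/* stand-in for MAXNAMELEN */

/* hypothetical stand-in for an attribute with a flexible name array */
struct demo_attr {
	uint64_t da_first_integer;
	uint32_t da_name_len;	/* capacity of da_name[], set by the owner */
	uint32_t da_pad;
	char	 da_name[];	/* flexible array: has no storage of its own */
};

/* allocate the header plus name space and record the capacity */
struct demo_attr *demo_attr_alloc(uint32_t name_cap)
{
	struct demo_attr *da = calloc(1, sizeof(*da) + name_cap);

	if (da)
		da->da_name_len = name_cap;
	return da;
}

/* producer side: never writes past the capacity the owner declared */
int demo_attr_set_name(struct demo_attr *da, const char *name)
{
	size_t need;

	assert(da != NULL && name != NULL);
	need = strlen(name) + 1;
	if (need > da->da_name_len)
		return -1;	/* capacity 0 (uninitialized) rejects all */
	memcpy(da->da_name, name, need);
	return 0;
}
```

The diff below takes a different route for storage: it places a
MAXNAMELEN char buffer directly after each embedded zap_attribute_t
rather than allocating it, but the length field is set the same way.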

Lustre-change: https://review.whamcloud.com/56656
Lustre-commit: d47a71f7a894a193957fe7771d43c5767979c117

Was-Change-Id: I6c48c66a42a36ea6816b37ffce7e17f45eed3dbf

HPE-bug-id: LUS-12561

LU-14094 tests: give zfs more wait time to unlink

In sanity.sh test_311, give zfs more wait time to unlink more
files as expected.

Lustre-change: https://review.whamcloud.com/56952
Lustre-commit: faf7b20eeae7550083842c810a3941f07859ae1c

Was-Change-Id: I17f278df3826fa38b71713c610d644cc7676c1ad

Signed-off-by: Emoly Liu <emoly@whamcloud.com>

LU-14094 tests: improve sanity.sh test_311

Improve sanity.sh test_311 to see why the number of the objects
doesn't decrease as expected.

Lustre-change: https://review.whamcloud.com/55566
Lustre-commit: c1bc42821d36f9ec5630e43c142abda60515d9e3

Was-Change-Id: Iabbaed42c5654ef31bc9f98fe9868785f8ff2f18

Signed-off-by: Emoly Liu <emoly@whamcloud.com>

LU-18272 build: remove Summary line from osd-zfs

Resolve spurious warning:
   warning: line 390: second Summary
when building the source RPM.

Lustre-change: https://review.whamcloud.com/56506
Lustre-commit: 69e67fb582f846af45ac608b32be716e435d34e4

Was-Change-Id: I6aa591aae3ae4dc07a36740e12ef3520cea035ef

HPE-bug-id: LUS-12538

LU-15963 osd-zfs: use contiguous chunk to grow blocksize

Use a contiguous chunk to choose the new blocksize; otherwise a
sparse OST_WRITE can grow the blocksize far too large.
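
The selection rule can be sketched in plain C as follows. This is a
simplified model of osd_choose_next_blocksize() from the diff below,
not the kernel code: the demo_* names are hypothetical, the dnode
checks are left out, the len == 0 guard is an addition for the sketch,
and max_blksz is assumed to be a power of two no larger than 2^31.

```c
#include <assert.h>
#include <stdint.h>

/* round v up to the next power of two; requires 0 < v <= 2^31 */
uint32_t demo_roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v > 0 && v <= (1u << 31));
	while (p < v)
		p <<= 1;
	return p;
}

/*
 * Return the updated blocksize candidate for an object.
 * Only a write starting at offset 0 is treated as a contiguous
 * chunk; any other write leaves the candidate untouched, so a
 * sparse write cannot inflate it.
 */
uint32_t demo_choose_next_blocksize(uint32_t cur_next, uint32_t max_blksz,
				    uint64_t off, uint64_t len)
{
	uint32_t blksz;

	if (off != 0 || len == 0)
		return cur_next;

	/* clamp to the dataset maximum, then round up to a power of two */
	blksz = (uint32_t)(len < max_blksz ? len : max_blksz);
	blksz = demo_roundup_pow2(blksz);

	/* the candidate only ever grows */
	return blksz > cur_next ? blksz : cur_next;
}
```

With a 1 MiB maximum, a 70000-byte write at offset 0 yields a
candidate of 131072, while a later 1 MiB write at offset 4096 leaves
that candidate unchanged.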

Lustre-change: https://review.whamcloud.com/47768
Lustre-commit: dacc4b6d384cbe6376a4cf106cc63ad1ac0cd23d

Was-Change-Id: I729775490f9a0c8262708931f321297af943f3c0

Signed-off-by: Alex Zhuravlev <bzzz@whamcloud.com>

LU-6142 osd-zfs: Fix style issues for osd_io.c

This patch fixes issues reported by checkpatch
for file lustre/osd-zfs/osd_io.c

Lustre-change: https://review.whamcloud.com/54264
Lustre-commit: 4c0e328c9417ff196dc8e69f75c187dd21809ce7

Was-Change-Id: Ia9153be34a1d583195e3ecfc56ca4ab279781566

Signed-off-by: Arshad Hussain <arshad.hussain@aeoncomputing.com>

LU-14692 tests: restore sanity/312 to always_except

The sanity test_312 was incorrectly removed from ALWAYS_EXCEPT.

Lustre-change: https://review.whamcloud.com/49720
Lustre-commit: 8767d2e44110fc19e624e963d5ebc788409339d3

Was-Change-Id: I6e8ed42561809b28fd6d5b4f7ee1104080ebe756

Fixes: eaae465556 ("LU-14692 tests: allow FID_SEQ_NORMAL for MDT0000")
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>

Test-Parameters: fstype=zfs
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el8.10 testgroup=full-dne-zfs-part-1
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el8.10 testgroup=full-dne-zfs-part-2
Test-Parameters: optional fstype=zfs mdtcount=4 mdscount=2 \
  clientdistro=el9.5 serverdistro=el8.10 testgroup=full-dne-zfs-part-3

Change-Id: Ic9df52fc7933cc9129f3b6cb630199c8c44d6d59
Signed-off-by: Jian Yu <yujian@whamcloud.com>
Reviewed-on: https://review.whamcloud.com/c/fs/lustre-release/+/58834
Tested-by: jenkins <devops@whamcloud.com>
Tested-by: Maloo <maloo@whamcloud.com>
Reviewed-by: Alex Zhuravlev <bzzz@whamcloud.com>
Reviewed-by: Oleg Drokin <green@whamcloud.com>
---
 contrib/lbuild/lbuild             |   2 +-
 lustre.spec.in                    |   2 +-
 lustre/osd-zfs/osd_handler.c      |  12 +-
 lustre/osd-zfs/osd_index.c        |   7 ++
 lustre/osd-zfs/osd_internal.h     |  13 ++
 lustre/osd-zfs/osd_io.c           | 244 ++++++++++++++++++++++----------------
 lustre/osd-zfs/osd_object.c       |   3 +-
 lustre/osd-zfs/osd_scrub.c        |   5 +
 lustre/tests/conf-sanity.sh       |  10 +-
 lustre/tests/obdfilter-survey.sh  |   7 +-
 lustre/tests/sanity.sh            |  32 ++---
 lustre/tests/test-framework.sh    |  21 +++-
 lustre/utils/libmount_utils_zfs.c |  30 ++++-
 rpm/kmp-lustre-osd-zfs.preamble   |   1 -
 14 files changed, 257 insertions(+), 132 deletions(-)

diff --git a/contrib/lbuild/lbuild b/contrib/lbuild/lbuild
index b08bf12..e4e2484 100755
--- a/contrib/lbuild/lbuild
+++ b/contrib/lbuild/lbuild
@@ -1029,7 +1029,7 @@ build_spl_zfs() {
     # The spl/zfs spec files expect RPM_BUILD_ROOT to point to the root of the
     # destination for the rpms
     export RPM_BUILD_ROOT=$TOPDIR
-    SPLZFSVER=${SPLZFSVER:-2.1.15}
+    SPLZFSVER=${SPLZFSVER:-2.2.7}
     SPLZFSTAG=${SPLZFSTAG:-}
     # "spl zfs" prior to 0.8.0
     # "zfs" for 0.8.0 and later
diff --git a/lustre.spec.in b/lustre.spec.in
index ef340a9..52e60d9 100644
--- a/lustre.spec.in
+++ b/lustre.spec.in
@@ -313,7 +313,7 @@ Requires: libmount
 Provides: lustre-osd-mount = %{version}
 Obsoletes: lustre-osd-mount < %{version}
 %if 0%{confzfsdobjpath} != 0
-BuildRequires: (libzfs-devel or libzfs4-devel or libzfs5-devel)
+BuildRequires: (libzfs-devel or libzfs4-devel or libzfs5-devel or libzfs6-devel)
 %endif
 # end confzfsdobjpath
 # Tests also require zpool from zfs package:
diff --git a/lustre/osd-zfs/osd_handler.c b/lustre/osd-zfs/osd_handler.c
index c87d1c3..71f4715 100644
--- a/lustre/osd-zfs/osd_handler.c
+++ b/lustre/osd-zfs/osd_handler.c
@@ -757,10 +757,14 @@ static void *osd_key_init(const struct lu_context *ctx,
 	struct osd_thread_info *info;
 
 	OBD_ALLOC_PTR(info);
-	if (info != NULL)
-		info->oti_env = container_of(ctx, struct lu_env, le_ctx);
-	else
-		info = ERR_PTR(-ENOMEM);
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	info->oti_env = container_of(ctx, struct lu_env, le_ctx);
+#ifdef ZAP_MAXNAMELEN_NEW
+	info->oti_za.za_name_len = MAXNAMELEN;
+	info->oti_za2.za_name_len = MAXNAMELEN;
+#endif
 	return info;
 }
 
diff --git a/lustre/osd-zfs/osd_index.c b/lustre/osd-zfs/osd_index.c
index 8b32422..03ee107 100644
--- a/lustre/osd-zfs/osd_index.c
+++ b/lustre/osd-zfs/osd_index.c
@@ -158,6 +158,9 @@ static struct dt_it *osd_index_it_init(const struct lu_env *env,
 
 	it->ozi_obj   = obj;
 	it->ozi_reset = 1;
+#ifdef ZAP_MAXNAMELEN_NEW
+	it->ozi_za.za_name_len = MAXNAMELEN;
+#endif
 	lu_object_get(lo);
 
 	RETURN((struct dt_it *)it);
@@ -1330,7 +1333,11 @@ static int osd_dir_it_next(const struct lu_env *env, struct dt_it *di)
 	ENTRY;
 
 	/* temp. storage should be enough for any key supported by ZFS */
+#ifdef ZAP_MAXNAMELEN_NEW
+	LASSERT(za->za_name_len <= sizeof(it->ozi_name));
+#else
 	BUILD_BUG_ON(sizeof(za->za_name) > sizeof(it->ozi_name));
+#endif
 
 	/*
 	 * the first ->next() moves the cursor to .
diff --git a/lustre/osd-zfs/osd_internal.h b/lustre/osd-zfs/osd_internal.h
index e11f84d..15c41dd 100644
--- a/lustre/osd-zfs/osd_internal.h
+++ b/lustre/osd-zfs/osd_internal.h
@@ -165,6 +165,10 @@ struct osd_zap_it {
 	enum osd_zap_pos	 ozi_pos;
 	struct luz_direntry	 ozi_zde;
 	zap_attribute_t		 ozi_za;
+#ifdef ZAP_MAXNAMELEN_NEW
+	/* flexible array: zap_attribute_t.za_name[], ensure space allocated */
+	char			 ozi_za_name_buffer[MAXNAMELEN];
+#endif
 	union {
 		char		 ozi_name[MAXNAMELEN]; /* file name for dir */
 		__u64		 ozi_key; /* binary key for index files */
@@ -258,7 +262,15 @@ struct osd_thread_info {
 	struct lu_attr		 oti_la;
 	struct osa_attr		 oti_osa;
 	zap_attribute_t		 oti_za;
+#ifdef ZAP_MAXNAMELEN_NEW
+	/* flexible array: zap_attribute_t.za_name[], ensure space allocated */
+	char			 oti_za_name_buffer[MAXNAMELEN];
+#endif
 	zap_attribute_t		 oti_za2;
+#ifdef ZAP_MAXNAMELEN_NEW
+	/* flexible array: zap_attribute_t.za_name[], ensure space allocated */
+	char			 oti_za2_name_buffer[MAXNAMELEN];
+#endif
 	dmu_object_info_t	 oti_doi;
 	struct luz_direntry	 oti_zde;
 
@@ -466,6 +478,7 @@ struct osd_object {
 
 	/* the i_flags in LMA */
 	__u32			 oo_lma_flags;
+	__u32			 oo_next_blocksize;
 	union {
 		int		oo_ea_in_bonus; /* EA bytes we expect */
 		struct {
diff --git a/lustre/osd-zfs/osd_io.c b/lustre/osd-zfs/osd_io.c
index ac98fba..ef867e9 100644
--- a/lustre/osd-zfs/osd_io.c
+++ b/lustre/osd-zfs/osd_io.c
@@ -62,9 +62,13 @@
 
 char osd_0copy_tag[] = "zerocopy";
 
+static void osd_choose_next_blocksize(struct osd_object *obj,
+				      loff_t off, ssize_t len);
+
 static void dbuf_set_pending_evict(dmu_buf_t *db)
 {
 	dmu_buf_impl_t *dbi = (dmu_buf_impl_t *)db;
+
 	dbi->db_pending_evict = TRUE;
 }
 
@@ -168,19 +172,20 @@ static ssize_t osd_declare_write(const struct lu_env *env, struct dt_object *dt,
 				const struct lu_buf *buf, loff_t pos,
 				struct thandle *th)
 {
-	struct osd_object  *obj  = osd_dt_obj(dt);
-	struct osd_device  *osd = osd_obj2dev(obj);
+	struct osd_object *obj = osd_dt_obj(dt);
+	struct osd_device *osd = osd_obj2dev(obj);
 	loff_t _pos = pos, max = 0;
 	struct osd_thandle *oh;
-	uint64_t            oid;
-	ENTRY;
+	uint64_t oid;
 
+	ENTRY;
 	oh = container_of(th, struct osd_thandle, ot_super);
 
 	/* in some cases declare can race with creation (e.g. llog)
 	 * and we need to wait till object is initialized. notice
 	 * LOHA_EXISTs is supposed to be the last step in the
-	 * initialization */
+	 * initialization
+	 */
 
 	/* size change (in dnode) will be declared by dmu_tx_hold_write() */
 	if (dt_object_exists(dt))
@@ -190,7 +195,8 @@ static ssize_t osd_declare_write(const struct lu_env *env, struct dt_object *dt,
 
 	/* XXX: we still miss for append declaration support in ZFS
 	 *	-1 means append which is used by llog mostly, llog
-	 *	can grow upto LLOG_MIN_CHUNK_SIZE*8 records */
+	 *	can grow upto LLOG_MIN_CHUNK_SIZE*8 records
+	 */
 	max = max_t(loff_t, 256 * 8 * LLOG_MIN_CHUNK_SIZE,
 		    obj->oo_attr.la_size + (2 << 20));
 	if (pos == -1)
@@ -233,7 +239,8 @@ static ssize_t osd_declare_write(const struct lu_env *env, struct dt_object *dt,
 
 	/* dt_declare_write() is usually called for system objects, such
 	 * as llog or last_rcvd files. We needn't enforce quota on those
-	 * objects, so always set the lqi_space as 0. */
+	 * objects, so always set the lqi_space as 0.
+	 */
 	RETURN(osd_declare_quota(env, osd, obj->oo_attr.la_uid,
 				 obj->oo_attr.la_gid, obj->oo_attr.la_projid,
 				 0, oh, NULL, OSD_QID_BLK));
@@ -248,6 +255,7 @@ static dmu_buf_t *osd_get_dbuf(struct osd_object *obj, uint64_t offset)
 	blkid = dbuf_whichblock(obj->oo_dn, 0, offset);
 	for (i = 0; i < OSD_MAX_DBUFS; i++) {
 		dmu_buf_impl_t *dbi = (void *)dbs[i];
+
 		if (!dbs[i])
 			continue;
 		if (dbi->db_blkid == blkid)
@@ -320,14 +328,13 @@ static ssize_t osd_write(const struct lu_env *env, struct dt_object *dt,
 			const struct lu_buf *buf, loff_t *pos,
 			struct thandle *th)
 {
-	struct osd_object  *obj  = osd_dt_obj(dt);
-	struct osd_device  *osd = osd_obj2dev(obj);
+	struct osd_object *obj = osd_dt_obj(dt);
+	struct osd_device *osd = osd_obj2dev(obj);
 	struct osd_thandle *oh;
-	uint64_t            offset = *pos;
-	int                 rc;
+	uint64_t offset = *pos;
+	int rc;
 
 	ENTRY;
-
 	LASSERT(dt_object_exists(dt));
 	LASSERT(obj->oo_dn);
 
@@ -350,7 +357,8 @@ static ssize_t osd_write(const struct lu_env *env, struct dt_object *dt,
 		write_unlock(&obj->oo_attr_lock);
 		/* osd_object_sa_update() will be copying directly from oo_attr
 		 * into dbuf.  any update within a single txg will copy the
-		 * most actual */
+		 * most actual
+		 */
 		rc = osd_object_sa_update(obj, SA_ZPL_SIZE(osd),
 					&obj->oo_attr.la_size, 8, oh);
 		if (unlikely(rc))
@@ -382,8 +390,8 @@ static int osd_bufs_put(const struct lu_env *env, struct dt_object *dt,
 {
 	struct osd_object *obj  = osd_dt_obj(dt);
 	struct osd_device *osd = osd_obj2dev(obj);
-	unsigned long      ptr;
-	int                i;
+	unsigned long ptr;
+	int i;
 
 	LASSERT(dt_object_exists(dt));
 	LASSERT(obj->oo_dn);
@@ -405,10 +413,12 @@ static int osd_bufs_put(const struct lu_env *env, struct dt_object *dt,
 				atomic_dec(&osd->od_zerocopy_pin);
 			} else if (lnb[i].lnb_data != NULL) {
 				int j, apages, abufsz;
+
 				abufsz = arc_buf_size(lnb[i].lnb_data);
 				apages = abufsz >> PAGE_SHIFT;
 				/* these references to pages must be invalidated
-				 * to prevent access in osd_bufs_put() */
+				 * to prevent access in osd_bufs_put()
+				 */
 				for (j = 0; j < apages; j++)
 					lnb[i + j].lnb_page = NULL;
 				dmu_return_arcbuf(lnb[i].lnb_data);
@@ -521,7 +531,8 @@ static int osd_bufs_get_read(const struct lu_env *env, struct osd_object *obj,
 				lnb->lnb_page = kmem_to_page(dbp[i]->db_data +
 							     bufoff);
 				/* mark just a single slot: we need this
-				 * reference to dbuf to be released once */
+				 * reference to dbuf to be released once
+				 */
 				lnb->lnb_data = dbf;
 				dbf = NULL;
 
@@ -537,8 +548,7 @@ static int osd_bufs_get_read(const struct lu_env *env, struct osd_object *obj,
 			if (drop_cache)
 				dbuf_set_pending_evict(dbp[i]);
 
-			/* steal dbuf so dmu_buf_rele_array() can't release
-			 * it */
+			/* steal dbuf so dmu_buf_rele_array() can't free it */
 			dbp[i] = NULL;
 		}
 
@@ -587,13 +597,16 @@ static int osd_bufs_get_write(const struct lu_env *env, struct osd_object *obj,
 			      int maxlnb)
 {
 	struct osd_device *osd = osd_obj2dev(obj);
-	int                poff, plen, off_in_block, sz_in_block;
-	int                rc, i = 0, npages = 0;
+	int poff, plen, off_in_block, sz_in_block;
+	int rc, i = 0, npages = 0;
 	dnode_t *dn = obj->oo_dn;
 	arc_buf_t *abuf;
 	uint32_t bs = dn->dn_datablksz;
+
 	ENTRY;
 
+	osd_choose_next_blocksize(obj, off, len);
+
 	/*
 	 * currently only full blocks are subject to zerocopy approach:
 	 * so that we're sure nobody is trying to update the same block
@@ -617,7 +630,8 @@ static int osd_bufs_get_write(const struct lu_env *env, struct osd_object *obj,
 			atomic_inc(&osd->od_zerocopy_loan);
 
 			/* go over pages arcbuf contains, put them as
-			 * local niobufs for ptlrpc's bulks */
+			 * local niobufs for ptlrpc's bulks
+			 */
 			while (sz_in_block > 0) {
 				plen = min_t(int, sz_in_block, PAGE_SIZE);
 
@@ -704,7 +718,7 @@ static int osd_bufs_get(const struct lu_env *env, struct dt_object *dt,
 			int maxlnb, enum dt_bufs_type rw)
 {
 	struct osd_object *obj  = osd_dt_obj(dt);
-	int                rc;
+	int rc;
 
 	LASSERT(dt_object_exists(dt));
 	LASSERT(obj->oo_dn);
@@ -747,21 +761,21 @@ static int osd_declare_write_commit(const struct lu_env *env,
 				    struct niobuf_local *lnb, int npages,
 				    struct thandle *th)
 {
-	struct osd_object  *obj = osd_dt_obj(dt);
-	struct osd_device  *osd = osd_obj2dev(obj);
+	struct osd_object *obj = osd_dt_obj(dt);
+	struct osd_device *osd = osd_obj2dev(obj);
 	struct osd_thandle *oh;
-	uint64_t            offset = 0;
-	uint32_t            size = 0;
+	uint64_t offset = 0;
+	uint32_t size = 0;
 	uint32_t blksz = obj->oo_dn->dn_datablksz;
-	int		    i, rc;
+	int i, rc;
 	bool synced = false;
-	long long	    space = 0;
-	struct page	   *last_page = NULL;
-	unsigned long	    discont_pages = 0;
+	long long space = 0;
+	struct page *last_page = NULL;
+	unsigned long discont_pages = 0;
 	enum osd_quota_local_flags local_flags = 0;
 	enum osd_qid_declare_flags declare_flags = OSD_QID_BLK;
-	ENTRY;
 
+	ENTRY;
 	LASSERT(dt_object_exists(dt));
 	LASSERT(obj->oo_dn);
 
@@ -778,13 +792,15 @@ static int osd_declare_write_commit(const struct lu_env *env,
 			/* ENOSPC, network RPC error, etc.
 			 * We don't want to book space for pages which will be
 			 * skipped in osd_write_commit(). Hence we skip pages
-			 * with lnb_rc != 0 here too */
+			 * with lnb_rc != 0 here too
+			 */
 			continue;
 		/* ignore quota for the whole request if any page is from
 		 * client cache or written by root.
 		 *
 		 * XXX we could handle this on per-lnb basis as done by
-		 * grant. */
+		 * grant.
+		 */
 		if ((lnb[i].lnb_flags & OBD_BRW_NOQUOTA) ||
 		    (lnb[i].lnb_flags & OBD_BRW_SYS_RESOURCE) ||
 		    !(lnb[i].lnb_flags & OBD_BRW_SYNC))
@@ -809,7 +825,8 @@ static int osd_declare_write_commit(const struct lu_env *env,
 		 * indirect blocks and just use as a rough estimate the worse
 		 * case where the old space is being held by a snapshot. Quota
 		 * overrun will be adjusted once the operation is committed, if
-		 * required. */
+		 * required.
+		 */
 		space += osd_roundup2blocksz(size, offset, blksz);
 
 		offset = lnb[i].lnb_file_offset;
@@ -822,8 +839,7 @@ static int osd_declare_write_commit(const struct lu_env *env,
 		space += osd_roundup2blocksz(size, offset, blksz);
 	}
 
-	/* backend zfs filesystem might be configured to store multiple data
-	 * copies */
+	/* backend zfs FS might be configured to store multiple data copies */
 	space  *= osd->od_os->os_copies;
 	space   = toqb(space);
 	CDEBUG(D_QUOTA, "writing %d pages, reserving %lldK of quota space\n",
@@ -847,7 +863,8 @@ retry:
 
 	/* we need only to store the overquota flags in the first lnb for
 	 * now, once we support multiple objects BRW, this code needs be
-	 * revised. */
+	 * revised.
+	 */
 	if (local_flags & QUOTA_FL_OVER_USRQUOTA)
 		lnb[0].lnb_flags |= OBD_BRW_OVER_USRQUOTA;
 	if (local_flags & QUOTA_FL_OVER_GRPQUOTA)
@@ -866,57 +883,71 @@ retry:
  * maximum blocksize the dataset can support. Otherwise, it will pick a
  * a block size by the writing region of this I/O.
  */
-static int osd_grow_blocksize(struct osd_object *obj, struct osd_thandle *oh,
-			      uint64_t start, uint64_t end)
+static int osd_grow_blocksize(struct osd_object *obj, struct osd_thandle *oh)
 {
-	struct osd_device	*osd = osd_obj2dev(obj);
+	struct osd_device *osd = osd_obj2dev(obj);
 	dnode_t *dn = obj->oo_dn;
-	uint32_t		 blksz;
-	int			 rc = 0;
+	int rc = 0;
 
 	ENTRY;
 
+	if (obj->oo_next_blocksize == 0)
+		return 0;
 	if (dn->dn_maxblkid > 0) /* can't change block size */
 		GOTO(out, rc);
-
 	if (dn->dn_datablksz >= osd->od_max_blksz)
 		GOTO(out, rc);
+	if (dn->dn_datablksz == obj->oo_next_blocksize)
+		GOTO(out, rc);
 
 	down_write(&obj->oo_guard);
-
-	blksz = dn->dn_datablksz;
-	if (blksz >= osd->od_max_blksz) /* check again after grabbing lock */
-		GOTO(out_unlock, rc);
-
-	/* now ZFS can support up to 16MB block size, and if the write
-	 * is sequential, it just increases the block size gradually */
-	if (start <= blksz) { /* sequential */
-		blksz = (uint32_t)min_t(uint64_t, osd->od_max_blksz, end);
-	} else { /* sparse, pick a block size by write region */
-		blksz = (uint32_t)min_t(uint64_t, osd->od_max_blksz,
-					end - start);
-	}
-
-	if (!is_power_of_2(blksz))
-		blksz = size_roundup_power2(blksz);
-
-	if (blksz > dn->dn_datablksz) {
+	if (dn->dn_datablksz < obj->oo_next_blocksize) {
+		CDEBUG(D_INODE, "set blksz to %u\n", obj->oo_next_blocksize);
 		rc = -dmu_object_set_blocksize(osd->od_os, dn->dn_object,
-					       blksz, 0, oh->ot_tx);
-		LASSERT(ergo(rc == 0, dn->dn_datablksz >= blksz));
+					       obj->oo_next_blocksize, 0,
+					       oh->ot_tx);
 		if (rc < 0)
-			CDEBUG(D_INODE, "object "DFID": change block size"
+			CDEBUG(D_ERROR, "object "DFID": change block size"
 			       "%u -> %u error rc = %d\n",
 			       PFID(lu_object_fid(&obj->oo_dt.do_lu)),
-			       dn->dn_datablksz, blksz, rc);
+			       dn->dn_datablksz, obj->oo_next_blocksize, rc);
 	}
 	EXIT;
-out_unlock:
 	up_write(&obj->oo_guard);
 out:
 	return rc;
 }
 
+static void osd_choose_next_blocksize(struct osd_object *obj,
+				      loff_t off, ssize_t len)
+{
+	struct osd_device *osd = osd_obj2dev(obj);
+	dnode_t *dn = obj->oo_dn;
+	uint32_t blksz;
+
+	if (dn->dn_maxblkid > 0)
+		return;
+
+	if (dn->dn_datablksz >= osd->od_max_blksz)
+		return;
+
+	/*
+	 * client sends data from own writeback cache after local
+	 * aggregation. there is a chance this is a "unit of write"
+	 * so blocksize.
+	 */
+	if (off != 0)
+		return;
+
+	blksz = (uint32_t)min_t(uint64_t, osd->od_max_blksz, len);
+	if (!is_power_of_2(blksz))
+		blksz = size_roundup_power2(blksz);
+
+	/* XXX: locking? */
+	if (blksz > obj->oo_next_blocksize)
+		obj->oo_next_blocksize = blksz;
+}
+
 static void osd_evict_dbufs_after_write(struct osd_object *obj,
 					loff_t off, ssize_t len)
 {
@@ -938,14 +969,14 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 			struct niobuf_local *lnb, int npages,
 			struct thandle *th, __u64 user_size)
 {
-	struct osd_object  *obj  = osd_dt_obj(dt);
-	struct osd_device  *osd = osd_obj2dev(obj);
+	struct osd_object *obj = osd_dt_obj(dt);
+	struct osd_device *osd = osd_obj2dev(obj);
 	struct osd_thandle *oh;
-	uint64_t            new_size = 0;
-	int                 i, abufsz, rc = 0, drop_cache = 0;
-	unsigned long	   iosize = 0;
-	ENTRY;
+	uint64_t new_size = 0;
+	int i, abufsz, rc = 0, drop_cache = 0;
+	unsigned long iosize = 0;
 
+	ENTRY;
 	LASSERT(dt_object_exists(dt));
 	LASSERT(obj->oo_dn);
 
@@ -953,9 +984,7 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 	oh = container_of(th, struct osd_thandle, ot_super);
 
 	/* adjust block size. Assume the buffers are sorted. */
-	(void)osd_grow_blocksize(obj, oh, lnb[0].lnb_file_offset,
-				 lnb[npages - 1].lnb_file_offset +
-				 lnb[npages - 1].lnb_len);
+	(void)osd_grow_blocksize(obj, oh);
 
 	if (obj->oo_attr.la_size >= osd->od_readcache_max_filesize ||
 	    lnb[npages - 1].lnb_file_offset + lnb[npages - 1].lnb_len >=
@@ -1003,13 +1032,14 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 
 	for (i = 0; i < npages; i++) {
 		CDEBUG(D_INODE, "write %u bytes at %u\n",
-			(unsigned) lnb[i].lnb_len,
-			(unsigned) lnb[i].lnb_file_offset);
+			(unsigned int) lnb[i].lnb_len,
+			(unsigned int) lnb[i].lnb_file_offset);
 
 		if (lnb[i].lnb_rc) {
 			/* ENOSPC, network RPC error, etc.
 			 * Unlike ldiskfs, zfs allocates new blocks on rewrite,
-			 * so we skip this page if lnb_rc is set to -ENOSPC */
+			 * so we skip this page if lnb_rc is set to -ENOSPC
+			 */
 			CDEBUG(D_INODE, "obj "DFID": skipping lnb[%u]: rc=%d\n",
 				PFID(lu_object_fid(&dt->do_lu)), i,
 				lnb[i].lnb_rc);
@@ -1030,30 +1060,35 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 			abufsz = lnb[i].lnb_len; /* to drop cache below */
 		} else if (lnb[i].lnb_data) {
 			int j, apages;
+
 			LASSERT(((unsigned long)lnb[i].lnb_data & 1) == 0);
 			/* buffer loaned for zerocopy, try to use it.
 			 * notice that dmu_assign_arcbuf() is smart
 			 * enough to recognize changed blocksize
-			 * in this case it fallbacks to dmu_write() */
+			 * in this case it fallbacks to dmu_write()
+			 */
 			abufsz = arc_buf_size(lnb[i].lnb_data);
 			LASSERT(abufsz & PAGE_MASK);
 			apages = abufsz >> PAGE_SHIFT;
 			LASSERT(i + apages <= npages);
 			/* these references to pages must be invalidated
-			 * to prevent access in osd_bufs_put() */
+			 * to prevent access in osd_bufs_put()
+			 */
 			for (j = 0; j < apages; j++)
 				lnb[i + j].lnb_page = NULL;
 			dmu_assign_arcbuf(&obj->oo_dn->dn_bonus->db,
 					  lnb[i].lnb_file_offset,
 					  lnb[i].lnb_data, oh->ot_tx);
 			/* drop the reference, otherwise osd_put_bufs()
-			 * will be releasing it - bad! */
+			 * will be releasing it - bad!
+			 */
 			lnb[i].lnb_data = NULL;
 			atomic_dec(&osd->od_zerocopy_loan);
 			iosize += abufsz;
 		} else {
 			/* we don't want to deal with cache if nothing
-			 * has been send to ZFS at this step */
+			 * has been send to ZFS at this step
+			 */
 			continue;
 		}
 
@@ -1062,7 +1097,8 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 
 		/* we have to mark dbufs for eviction here because
 		 * dmu_assign_arcbuf() may create a new dbuf for
-		 * loaned abuf */
+		 * loaned abuf
+		 */
 		osd_evict_dbufs_after_write(obj, lnb[i].lnb_file_offset,
 					    abufsz);
 	}
@@ -1071,7 +1107,8 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 		/* no pages to write, no transno is needed */
 		th->th_local = 1;
 		/* it is important to return 0 even when all lnb_rc == -ENOSPC
-		 * since ofd_commitrw_write() retries several times on ENOSPC */
+		 * since ofd_commitrw_write() retries several times on ENOSPC
+		 */
 		up_read(&obj->oo_guard);
 		record_end_io(osd, WRITE, 0, 0, 0);
 		RETURN(0);
@@ -1086,7 +1123,8 @@ static int osd_write_commit(const struct lu_env *env, struct dt_object *dt,
 		write_unlock(&obj->oo_attr_lock);
 		/* osd_object_sa_update() will be copying directly from
 		 * oo_attr into dbuf. any update within a single txg will copy
-		 * the most actual */
+		 * the most actual
+		 */
 		rc = osd_object_sa_update(obj, SA_ZPL_SIZE(osd),
 					  &obj->oo_attr.la_size, 8, oh);
 	} else {
@@ -1104,8 +1142,8 @@ static int osd_read_prep(const struct lu_env *env, struct dt_object *dt,
 			struct niobuf_local *lnb, int npages)
 {
 	struct osd_object *obj  = osd_dt_obj(dt);
-	int                i;
-	loff_t		   eof;
+	int i;
+	loff_t eof;
 
 	LASSERT(dt_object_exists(dt));
 	LASSERT(obj->oo_dn);
@@ -1156,12 +1194,9 @@ static int __osd_object_punch(struct osd_object *obj, objset_t *os,
 	uint64_t size = obj->oo_attr.la_size;
 	int rc = 0;
 
-	/* Assert that the transaction has been assigned to a
-	   transaction group. */
+	/* Confirm if transaction has been assigned to a transaction group */
 	LASSERT(tx->tx_txg != 0);
-	/*
-	 * Nothing to do if file already at desired length.
-	 */
+	/* Nothing to do if file already at desired length. */
 	if (len == DMU_OBJECT_END && size == off)
 		return 0;
 
@@ -1188,13 +1223,13 @@ static int __osd_object_punch(struct osd_object *obj, objset_t *os,
 static int osd_punch(const struct lu_env *env, struct dt_object *dt,
 			__u64 start, __u64 end, struct thandle *th)
 {
-	struct osd_object  *obj = osd_dt_obj(dt);
-	struct osd_device  *osd = osd_obj2dev(obj);
+	struct osd_object *obj = osd_dt_obj(dt);
+	struct osd_device *osd = osd_obj2dev(obj);
 	struct osd_thandle *oh;
-	__u64               len;
-	int                 rc = 0;
-	ENTRY;
+	__u64 len;
+	int rc = 0;
 
+	ENTRY;
 	LASSERT(dt_object_exists(dt));
 	LASSERT(osd_invariant(obj));
 
@@ -1234,9 +1269,9 @@ static int osd_declare_punch(const struct lu_env *env, struct dt_object *dt,
 	struct osd_object  *obj = osd_dt_obj(dt);
 	struct osd_device  *osd = osd_obj2dev(obj);
 	struct osd_thandle *oh;
-	__u64		    len;
-	ENTRY;
+	__u64 len;
 
+	ENTRY;
 	oh = container_of(handle, struct osd_thandle, ot_super);
 
 	read_lock(&obj->oo_attr_lock);
@@ -1270,9 +1305,9 @@ static int osd_declare_punch(const struct lu_env *env, struct dt_object *dt,
 static int osd_ladvise(const struct lu_env *env, struct dt_object *dt,
 		       __u64 start, __u64 end, enum lu_ladvise_type advice)
 {
-	int	rc;
-	ENTRY;
+	int rc;
 
+	ENTRY;
 	switch (advice) {
 	default:
 		rc = -ENOTSUPP;
@@ -1286,8 +1321,8 @@ static int osd_fallocate(const struct lu_env *env, struct dt_object *dt,
 			 __u64 start, __u64 end, int mode, struct thandle *th)
 {
 	int rc = -EOPNOTSUPP;
-	ENTRY;
 
+	ENTRY;
 	 /*
 	  * space preallocation is not supported for ZFS
 	  * Returns -EOPNOTSUPP for now
@@ -1300,8 +1335,8 @@ static int osd_declare_fallocate(const struct lu_env *env,
 				 int mode, struct thandle *th)
 {
 	int rc = -EOPNOTSUPP;
-	ENTRY;
 
+	ENTRY;
 	 /*
 	  * space preallocation is not supported for ZFS
 	  * Returns -EOPNOTSUPP for now
@@ -1320,7 +1355,6 @@ static loff_t osd_lseek(const struct lu_env *env, struct dt_object *dt,
 	boolean_t hole = whence == SEEK_HOLE;
 
 	ENTRY;
-
 	LASSERT(dt_object_exists(dt));
 	LASSERT(osd_invariant(obj));
 	LASSERT(offset >= 0);
diff --git a/lustre/osd-zfs/osd_object.c b/lustre/osd-zfs/osd_object.c
index a421ff8..5906076 100644
--- a/lustre/osd-zfs/osd_object.c
+++ b/lustre/osd-zfs/osd_object.c
@@ -142,7 +142,8 @@ void osd_object_sa_dirty_rele(const struct lu_env *env, struct osd_thandle *oh)
 			}
 			up_write(&obj->oo_guard);
 		}
-		sa_spill_rele(obj->oo_sa_hdl);
+		if (obj->oo_sa_hdl)
+			sa_spill_rele(obj->oo_sa_hdl);
 	}
 }
 
diff --git a/lustre/osd-zfs/osd_scrub.c b/lustre/osd-zfs/osd_scrub.c
index b2ce389..cb1d270 100644
--- a/lustre/osd-zfs/osd_scrub.c
+++ b/lustre/osd-zfs/osd_scrub.c
@@ -1852,6 +1852,11 @@ static int osd_scan_dir(const struct lu_env *env, struct osd_device *dev,
 
 	za = &it->ozi_za;
 	zde = &it->ozi_zde;
+
+#ifdef ZAP_MAXNAMELEN_NEW
+	za->za_name_len = MAXNAMELEN;
+#endif
+
 	while (1) {
 		rc = -zap_cursor_retrieve(it->ozi_zc, za);
 		if (unlikely(rc)) {
diff --git a/lustre/tests/conf-sanity.sh b/lustre/tests/conf-sanity.sh
index 69e54b1..228e220 100755
--- a/lustre/tests/conf-sanity.sh
+++ b/lustre/tests/conf-sanity.sh
@@ -33,6 +33,12 @@ fi
 #                                  8  22  40  165  (min)
 [ "$SLOW" = "no" ] && EXCEPT_SLOW="45 69 106 111"
 
+if [[ "$mds1_FSTYPE" == "zfs" ]]; then
+	always_except LU-18652 108a 112a 112b 113 117 119 121 122a
+	always_except LU-18652 123aa 123ab 123ac 123ad 123ae 123af 123ag 123ah 123ahi
+	always_except LU-18652 123F 123G 123H 126 129 132 133 135 136 137 150 152 153a 153b 153c 155 802a
+fi
+
 build_test_filter
 
 # use small MDS + OST size to speed formatting time
@@ -1796,8 +1802,8 @@ t32_verify_quota() {
 		"$fsname.quota.ost" ug
 
 	chmod 0777 $mnt
-	runas -u $T32_QID -g $T32_QID dd if=/dev/zero of=$mnt/t32_qf_new \
-		bs=1M count=$((img_blimit / 1024)) oflag=sync && {
+	runas -u $T32_QID -g $T32_QID $DD of=$mnt/t32_qf_new \
+		count=$((img_blimit / 1024)) oflag=sync && {
 		echo "Write succeed, but expect -EDQUOT"
 		return 1
 	}
diff --git a/lustre/tests/obdfilter-survey.sh b/lustre/tests/obdfilter-survey.sh
index 059694e..5a1f655 100644
--- a/lustre/tests/obdfilter-survey.sh
+++ b/lustre/tests/obdfilter-survey.sh
@@ -8,7 +8,12 @@ init_logging
 
 # bug number for skipped test:
 ALWAYS_EXCEPT="$OBDFILTER_SURVEY_EXCEPT "
-# UPDATE THE COMMENT ABOVE WITH BUG NUMBERS WHEN CHANGING ALWAYS_EXCEPT!
+
+# it would be nice to have an "all" option :-)
+if [[ $mds1_FSTYPE == zfs ]] &&
+    (( $(zfs_version_code mds1) >= $(version_code 2.2.7) )); then
+	always_except LU-18889 1a 1b 1c 2a 2b 3a
+fi
 
 build_test_filter
 
diff --git a/lustre/tests/sanity.sh b/lustre/tests/sanity.sh
index 9f968b6..c5ada84 100755
--- a/lustre/tests/sanity.sh
+++ b/lustre/tests/sanity.sh
@@ -75,19 +75,17 @@ if (( $LINUX_VERSION_CODE >= $(version_code 4.18.0) &&
 	ALWAYS_EXCEPT+=" 411"
 fi
 
-#                                  5              12     8   12  15   (min)"
-[ "$SLOW" = "no" ] && EXCEPT_SLOW="27m 60i 64b 68 71 115 135 136 230d 300o"
+# minutes runtime:                   5              12     8   12   15
+[[ "$SLOW" = "no" ]] && EXCEPT_SLOW="27m 60i 64b 68 71 115 135 136 230d 300o"
 
-if [ "$mds1_FSTYPE" = "zfs" ]; then
-	# bug number for skipped test:
-	ALWAYS_EXCEPT+="              "
+if [[ "$mds1_FSTYPE" == "zfs" ]]; then
 	#                                               13    (min)"
-	[ "$SLOW" = "no" ] && EXCEPT_SLOW="$EXCEPT_SLOW 51b"
+	[[ "$SLOW" == "no" ]] && EXCEPT_SLOW="$EXCEPT_SLOW 51b"
 fi
 
-if [ "$ost1_FSTYPE" = "zfs" ]; then
-	# bug number for skipped test:	LU-1941  LU-1941  LU-1941  LU-1941
-	ALWAYS_EXCEPT+="                130b 130c 130d 130e 130f 130g"
+if [[ "$ost1_FSTYPE" = "zfs" ]]; then
+	always_except LU-1941 130b 130c 130d 130e 130f 130g
+	always_except LU-9054 312
 fi
 
 proc_regexp="/{proc,sys}/{fs,sys,kernel/debug}/{lustre,lnet}/"
@@ -9554,7 +9552,7 @@ test_66() {
 	[ $PARALLEL == "yes" ] && skip "skip parallel run"
 
 	COUNT=${COUNT:-8}
-	dd if=/dev/zero of=$DIR/f66 bs=1k count=$COUNT
+	dd if=/dev/urandom of=$DIR/f66 bs=1k count=$COUNT
 	sync; sync_all_data; sync; sync_all_data
 	cancel_lru_locks osc
 	BLOCKS=`ls -s $DIR/f66 | awk '{ print $1 }'`
@@ -24204,6 +24202,7 @@ test_311() {
 	remote_mds_nodsh && skip "remote MDS with nodsh"
 
 	local old_iused=$($LFS df -i | awk '/OST0000/ { print $3; exit; }')
+	echo "old_iused=$old_iused"
 	local mdts=$(comma_list $(mdts_nodes))
 
 	mkdir -p $DIR/$tdir
@@ -24212,11 +24211,13 @@ test_311() {
 
 	# statfs data is not real time, let's just calculate it
 	old_iused=$((old_iused + 1000))
+	echo "estimated current old_iused=$old_iused"
 
 	local count=$(do_facet $SINGLEMDS "$LCTL get_param -n \
 			osp.*OST0000*MDT0000.create_count")
 	local max_count=$(do_facet $SINGLEMDS "$LCTL get_param -n \
 				osp.*OST0000*MDT0000.max_create_count")
+	echo "create_count=$count, max_create_count=$max_count"
 	do_nodes $mdts "$LCTL set_param -n osp.*OST0000*.max_create_count=0"
 
 	$LFS setstripe -i 0 $DIR/$tdir/$tfile || error "setstripe failed"
@@ -24224,6 +24225,8 @@ test_311() {
 	[ $index -ne 0 ] || error "$tfile stripe index is 0"
 
 	unlinkmany $DIR/$tdir/$tfile. 1000
+	wait_delete_completed
+	wait_zfs_commit $SINGLEMDS 10
 
 	do_nodes $mdts "$LCTL set_param -n \
 			osp.*OST0000*.max_create_count=$max_count"
@@ -24236,14 +24239,15 @@ test_311() {
 	local new_iused
 	for i in $(seq 120); do
 		new_iused=$($LFS df -i | awk '/OST0000/ { print $3; exit; }')
+		echo -n "$new_iused "
 		# system may be too busy to destroy all objs in time, use
 		# a somewhat small value to not fail autotest
-		[ $((old_iused - new_iused)) -gt 400 ] && break
+		((old_iused - new_iused > 400)) && break
 		sleep 1
 	done
 
-	echo "waited $i sec, old Iused $old_iused, new Iused $new_iused"
-	[ $((old_iused - new_iused)) -gt 400 ] ||
+	echo -e "\nwaited $i sec, old Iused $old_iused, new Iused $new_iused"
+	((old_iused - new_iused > 400)) ||
 		error "objs not destroyed after unlink"
 }
 run_test 311 "disable OSP precreate, and unlink should destroy objs"
@@ -25694,7 +25698,7 @@ generate_uneven_mdts() {
 	if check_fallocate_supported mds$((min_index + 1)); then
 		cmd="fallocate -l 128K "
 	else
-		cmd="dd if=/dev/zero bs=128K count=1 of="
+		cmd="$DD bs=128K count=1 of="
 	fi
 
 	echo "using cmd $cmd"
diff --git a/lustre/tests/test-framework.sh b/lustre/tests/test-framework.sh
index 9a385a2..075ee67 100755
--- a/lustre/tests/test-framework.sh
+++ b/lustre/tests/test-framework.sh
@@ -429,6 +429,13 @@ init_test_env() {
 	export MACHINEFILE=${MACHINEFILE:-$TMP/$(basename $0 .sh).machines}
 	. ${CONFIG:=$LUSTRE/tests/cfg/$NAME.sh}
 	get_lustre_env
+	# use /dev/urandom when consuming space on ZFS to avoid compression
+	if [[ "$ost1_FSTYPE" == "zfs" ]]; then
+		DD_DEV="/dev/urandom"
+	else
+		DD_DEV="/dev/zero"
+	fi
+	DD="dd if=$DD_DEV bs=1M"
 
 	# use localrecov to enable recovery for local clients, LU-12722
 	[[ $MDS1_VERSION -lt $(version_code 2.13.52) ]] || {
@@ -540,6 +547,18 @@ lustre_version_code() {
 	version_code $(lustre_build_version $1)
 }
 
+zfs_version_code() {
+	local facet=$1
+	local facet_version=${facet}_ZFS_VERSION
+
+	if [[ -z "${!facet_version}" ]]; then
+		local zfs_ver=$(do_facet $facet "modinfo --field version zfs")
+
+		export $facet_version=$(version_code ${zfs_ver%-*})
+	fi
+	echo ${!facet_version}
+}
+
 module_loaded () {
 	/sbin/lsmod | grep -q "^\<$1\>"
 }
@@ -3185,7 +3204,7 @@ wait_zfs_commit() {
 	# the occupied disk space will be released
 	# only after TXGs are committed
 	if [[ $(facet_fstype $1) == zfs ]]; then
-		echo "sleep $zfs_wait for ZFS $(facet_fstype $1)"
+		echo "sleep $zfs_wait for ZFS $(facet_type $1)"
 		sleep $zfs_wait
 	fi
 }
diff --git a/lustre/utils/libmount_utils_zfs.c b/lustre/utils/libmount_utils_zfs.c
index 3f203c4..e6d30ca 100644
--- a/lustre/utils/libmount_utils_zfs.c
+++ b/lustre/utils/libmount_utils_zfs.c
@@ -631,6 +631,12 @@ static int zfs_create_vdev(struct mkfs_opts *mop, char *vdev)
 
 	return ret;
 }
+/* interop will break if we change MAX_NAME from 255 */
+#ifdef ZAP_MAXNAMELEN_NEW
+#define ZFS_LONGNAME_FEATURE	" -o feature@longname=disabled"
+#else
+#define ZFS_LONGNAME_FEATURE	""
+#endif
 
 int zfs_make_lustre(struct mkfs_opts *mop)
 {
@@ -708,7 +714,8 @@ int zfs_make_lustre(struct mkfs_opts *mop)
 
 		memset(mkfs_cmd, 0, PATH_MAX);
 		snprintf(mkfs_cmd, PATH_MAX,
-			"zpool create -f -O canmount=off %s", pool);
+			"zpool create%s -f -O canmount=off %s",
+			ZFS_LONGNAME_FEATURE, pool);
 
 		/* Append the vdev config and create file vdevs as required */
 		while (*mop->mo_pool_vdevs != NULL) {
@@ -774,6 +781,7 @@ int zfs_make_lustre(struct mkfs_opts *mop)
 	 * zfs 0.6.1 - system attribute based xattrs
 	 * zfs 0.6.5 - large block support
 	 * zfs 0.7.0 - large dnode support
+	 * zfs 2.2.6 - compression handling
 	 *
 	 * Check if zhp is NULL as a defensive measure. Any dataset
 	 * validation errors that would cause zfs_open() to fail
@@ -781,6 +789,8 @@ int zfs_make_lustre(struct mkfs_opts *mop)
 	 */
 	zhp = zfs_open(g_zfs, ds, ZFS_TYPE_FILESYSTEM);
 	if (zhp) {
+		char *opt;
+
 		/* zfs 0.6.1 - system attribute based xattrs */
 		if (!strstr(mop->mo_mkfsopts, "xattr="))
 			zfs_set_prop_str(zhp, "xattr", "sa");
@@ -797,6 +807,24 @@ int zfs_make_lustre(struct mkfs_opts *mop)
 				zfs_set_prop_str(zhp, "recordsize", "1M");
 		}
 
+		/* zfs 2.2.6 - compression handling */
+		opt = strstr(mop->mo_mkfsopts, "compression=");
+		if (opt) {
+			char *end;
+
+			/* pass only the value, not the "compression=" key */
+			opt += strlen("compression=");
+			end = index(opt, ',');
+			if (end)
+				end = strndup(opt, end - opt);
+			zfs_set_prop_str(zhp, "compression", end ? end : opt);
+			if (end)
+				free(end);
+		} else {
+			/* By default turn off compression */
+			zfs_set_prop_str(zhp, "compression", "off");
+		}
+
 		zfs_close(zhp);
 	}
 
diff --git a/rpm/kmp-lustre-osd-zfs.preamble b/rpm/kmp-lustre-osd-zfs.preamble
index 34a36b1..11fa5e8 100644
--- a/rpm/kmp-lustre-osd-zfs.preamble
+++ b/rpm/kmp-lustre-osd-zfs.preamble
@@ -1,4 +1,3 @@
-Summary:        Lustre osd-zfs feature support
 Requires:       %{name}-osd-zfs-mount = %{version}
 %if 0%{confzfsdobjpath} != 0
 BuildRequires:  kmod-zfs-devel
-- 
1.8.3.1