Dave Jones 8c3a2d0
Path: news.gmane.org!not-for-mail
From: Niels de Vos <ndevos@redhat.com>
Newsgroups: gmane.linux.kernel,gmane.linux.file-systems
Subject: [PATCH v2] fs: Invalidate the cache for a parent block-device if fsync() is called for a partition
Date: Mon, 23 Jan 2012 10:38:29 +0000
Lines: 58
Approved: news@gmane.org
Message-ID: <1327315109-7740-1-git-send-email-ndevos@redhat.com>
References: <4F19356E.3020708@redhat.com>
NNTP-Posting-Host: lo.gmane.org
X-Trace: dough.gmane.org 1327315263 30652 80.91.229.12 (23 Jan 2012 10:41:03 GMT)
X-Complaints-To: usenet@dough.gmane.org
NNTP-Posting-Date: Mon, 23 Jan 2012 10:41:03 +0000 (UTC)
Cc: linux-kernel@vger.kernel.org, Niels de Vos <ndevos@redhat.com>,
	"Bryn M. Reeves" <bmr@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>
To: linux-fsdevel@vger.kernel.org
Original-X-From: linux-kernel-owner@vger.kernel.org Mon Jan 23 11:40:58 2012
Return-path: <linux-kernel-owner@vger.kernel.org>
Envelope-to: glk-linux-kernel-3@lo.gmane.org
Original-Received: from vger.kernel.org ([209.132.180.67])
	by lo.gmane.org with esmtp (Exim 4.69)
	(envelope-from <linux-kernel-owner@vger.kernel.org>)
	id 1RpHKb-0008Bu-Fh
	for glk-linux-kernel-3@lo.gmane.org; Mon, 23 Jan 2012 11:40:57 +0100
Original-Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753263Ab2AWKkt (ORCPT <rfc822;glk-linux-kernel-3@m.gmane.org>);
	Mon, 23 Jan 2012 05:40:49 -0500
Original-Received: from mx1.redhat.com ([209.132.183.28]:58739 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751990Ab2AWKks (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Mon, 23 Jan 2012 05:40:48 -0500
Original-Received: from int-mx12.intmail.prod.int.phx2.redhat.com (int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id q0NAelMx027033
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Mon, 23 Jan 2012 05:40:47 -0500
Original-Received: from ndevos.usersys.redhat.com (dhcp-1-51.fab.redhat.com [10.33.1.51])
	by int-mx12.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id q0NAejLn013691;
	Mon, 23 Jan 2012 05:40:46 -0500
In-Reply-To: <4F19356E.3020708@redhat.com>
X-Scanned-By: MIMEDefang 2.68 on 10.5.11.25
Original-Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org
Xref: news.gmane.org gmane.linux.kernel:1242432 gmane.linux.file-systems:60751
Archived-At: <http://permalink.gmane.org/gmane.linux.kernel/1242432>

Executing an fsync() on a file-descriptor of a partition flushes the
caches for that partition by calling blkdev_issue_flush(). However, it
seems that reading data through the parent device will still return the
old cached data.

The cache for the block-device is not synced if the block-device is kept
open (due to a mounted partition, for example). Only when all users of
the disk have exited is the cache for the disk made consistent again.

Calling invalidate_bdev() on the parent block-device when blkdev_fsync()
is called for a partition fixes this.

The problem can be worked around by forcing the caches to be flushed
with either
	# blockdev --flushbufs ${dev_disk}
or
	# echo 3 > /proc/sys/vm/drop_caches

CC: Bryn M. Reeves <bmr@redhat.com>
CC: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>

---
v2:
- Do not call invalidate_bdev() from blkdev_issue_flush(), to prevent
  performance degradation with journalled filesystems.

  It was suggested to call invalidate_bdev() in fsync_bdev(), but that is
  not in the call-path of mkfs.ext3 and similar tools, so the issue
  would persist.

- Correct the phrasing a little; changing ioctl-BLKFLSBUF is not required.

- This issue also occurs when doing an ioctl-BLKFLSBUF on a partition.
  Reading the whole disk will still return cached data. If this is an
  issue, it will need a separate patch.
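
For anyone who wants to check the behaviour described in the commit
message from user space, the following is a rough sketch and not part
of the patch. The helper name read_back_matches() and the 4096-byte
limit are made up for illustration. It keeps the "disk" path open,
reads once to populate its cache, writes through the "partition" path,
calls fsync(), and then reads again through the still-open "disk"
descriptor.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Writes `len` bytes of `pattern` at offset 0 of `part_path`, calls
 * fsync() on it, and then reads `len` bytes at `part_offset` through
 * `disk_path`, which is kept open for the whole sequence.
 *
 * Returns 1 if the data read back matches the pattern, 0 if it does
 * not (stale data), and -1 on any error. `len` is limited to 4096.
 */
int read_back_matches(const char *part_path, const char *disk_path,
                      off_t part_offset, const void *pattern, size_t len)
{
	char buf[4096];
	int pfd, dfd, ret = -1;

	if (len > sizeof(buf))
		return -1;

	dfd = open(disk_path, O_RDONLY);
	if (dfd < 0)
		return -1;

	/* Read once so the old contents are cached for the disk path. */
	if (pread(dfd, buf, len, part_offset) < 0)
		goto out_disk;

	pfd = open(part_path, O_WRONLY);
	if (pfd < 0)
		goto out_disk;

	/* Write the new data through the partition path and fsync() it. */
	if (pwrite(pfd, pattern, len, 0) != (ssize_t)len || fsync(pfd) != 0)
		goto out_part;

	/* Read the same range again through the still-open disk path. */
	if (pread(dfd, buf, len, part_offset) != (ssize_t)len)
		goto out_part;

	ret = memcmp(buf, pattern, len) == 0;

out_part:
	close(pfd);
out_disk:
	close(dfd);
	return ret;
}
```

To try this on real hardware, pass a partition and its parent disk
(placeholders: /dev/sdX1 and /dev/sdX) together with the partition's
start offset in bytes, which is the value in
/sys/block/sdX/sdX1/start multiplied by 512. Going by the description
above, a return value of 0 would indicate the stale read. Note that
this overwrites the first bytes of the partition and needs root, so
only use it on a scratch device.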
---
 fs/block_dev.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 0e575d1..433c4de 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -424,6 +424,10 @@ int blkdev_fsync(struct file *filp, loff_t start, loff_t end, int datasync)
 	if (error == -EOPNOTSUPP)
 		error = 0;
 
+	/* invalidate parent block_device */
+	if (!error && bdev != bdev->bd_contains)
+		invalidate_bdev(bdev->bd_contains);
+
 	return error;
 }
 EXPORT_SYMBOL(blkdev_fsync);
-- 
1.7.6.5