From 4c21e2f2441dc5fbb957b030333f5a3f2d02dea7 Mon Sep 17 00:00:00 2001 From: Hugh Dickins Date: Sat, 29 Oct 2005 18:16:40 -0700 Subject: [PATCH] mm: split page table lock Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with a many-threaded application which concurrently initializes different parts of a large anonymous area. This patch corrects that, by using a separate spinlock per page table page, to guard the page table entries in that page, instead of using the mm's single page_table_lock. (But even then, page_table_lock is still used to guard page table allocation, and anon_vma allocation.) In this implementation, the spinlock is tucked inside the struct page of the page table page: with a BUILD_BUG_ON in case it overflows - which it would in the case of 32-bit PA-RISC with spinlock debugging enabled. Splitting the lock is not quite for free: another cacheline access. Ideally, I suppose we would use split ptlock only for multi-threaded processes on multi-cpu machines; but deciding that dynamically would have its own costs. So for now enable it by config, at some number of cpus - since the Kconfig language doesn't support inequalities, let preprocessor compare that with NR_CPUS. But I don't think it's worth being user-configurable: for good testing of both split and unsplit configs, split now at 4 cpus, and perhaps change that to 8 later. There is a benefit even for singly threaded processes: kswapd can be attacking one part of the mm while another part is busy faulting. Signed-off-by: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/buffer.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) (limited to 'fs/buffer.c') diff --git a/fs/buffer.c b/fs/buffer.c index b1667986442f..2066e4cb700c 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -96,7 +96,7 @@ static void __clear_page_buffers(struct page *page) { ClearPagePrivate(page); - page->private = 0; + set_page_private(page, 0); page_cache_release(page); } -- cgit From aaa4059bc2dca7fa816624a28db1958c3a22df9b Mon Sep 17 00:00:00 2001 From: Jan Kara Date: Sun, 30 Oct 2005 15:00:16 -0800 Subject: [PATCH] ext3: Fix unmapped buffers in transaction's lists Fix the problem (BUG 4964) with unmapped buffers in transaction's t_sync_data list. The problem is we need to call filesystem's own invalidatepage() from block_write_full_page(). block_write_full_page() must call filesystem's invalidatepage(). Otherwise following nasty race can happen: proc 1 proc 2 ------ ------ - write some new data to 'offset' => bh gets to the transactions data list - starts truncate => i_size set to new size - mpage_writepages() - ext3_ordered_writepage() to 'offset' - block_write_full_page() - page->index > end_index+1 - block_invalidatepage() - discard_buffer() - clear_buffer_mapped() - commit triggers and finds unmapped buffer - BOOM! Signed-off-by: Jan Kara Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/buffer.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) (limited to 'fs/buffer.c') diff --git a/fs/buffer.c b/fs/buffer.c index 2066e4cb700c..75cac9ada026 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1637,6 +1637,15 @@ out: } EXPORT_SYMBOL(block_invalidatepage); +int do_invalidatepage(struct page *page, unsigned long offset) +{ + int (*invalidatepage)(struct page *, unsigned long); + invalidatepage = page->mapping->a_ops->invalidatepage; + if (invalidatepage == NULL) + invalidatepage = block_invalidatepage; + return (*invalidatepage)(page, offset); +} + /* * We attach and possibly dirty the buffers atomically wrt * __set_page_dirty_buffers() via private_lock. try_to_free_buffers @@ -2696,7 +2705,7 @@ int block_write_full_page(struct page *page, get_block_t *get_block, * they may have been added in ext3_writepage(). Make them * freeable here, so the page does not leak. */ - block_invalidatepage(page, 0); + do_invalidatepage(page, 0); unlock_page(page); return 0; /* don't care */ } -- cgit From a3e713b5fdd0e54c2e3c8909ccde2a98839e3a52 Mon Sep 17 00:00:00 2001 From: Andrew Morton Date: Sun, 30 Oct 2005 15:03:15 -0800 Subject: [PATCH] __bread oops fix If a filesystem passes an idiotic blocksize into bread(), __getblk_slow() will warn and will return NULL. We have a report (from Hubert Tonneau ) of isofs_fill_super() doing this (passing in a silly block size) against an unplugged CDROM drive. But a couple of __getblk_slow() callers forgot to check for the NULL bh, hence oops. Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- fs/buffer.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) (limited to 'fs/buffer.c') diff --git a/fs/buffer.c b/fs/buffer.c index 75cac9ada026..35fa34977e81 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -1478,8 +1478,10 @@ EXPORT_SYMBOL(__getblk); void __breadahead(struct block_device *bdev, sector_t block, int size) { struct buffer_head *bh = __getblk(bdev, block, size); - ll_rw_block(READA, 1, &bh); - brelse(bh); + if (likely(bh)) { + ll_rw_block(READA, 1, &bh); + brelse(bh); + } } EXPORT_SYMBOL(__breadahead); @@ -1497,7 +1499,7 @@ __bread(struct block_device *bdev, sector_t block, int size) { struct buffer_head *bh = __getblk(bdev, block, size); - if (!buffer_uptodate(bh)) + if (likely(bh) && !buffer_uptodate(bh)) bh = __bread_slow(bh); return bh; } -- cgit