linux

mirror of https://github.com/torvalds/linux.git synced 2026-01-25 15:03:52 +08:00

Go to file

Coly Li 80265d8dfd bcache: acquire bch_register_lock later in cached_dev_free()

When enable lockdep engine, a lockdep warning can be observed when
reboot or shutdown system,

[ 3142.764557][    T1] bcache: bcache_reboot() Stopping all devices:
[ 3142.776265][ T2649]
[ 3142.777159][ T2649] ======================================================
[ 3142.780039][ T2649] WARNING: possible circular locking dependency detected
[ 3142.782869][ T2649] 5.2.0-rc4-lp151.20-default+ #1 Tainted: G        W
[ 3142.785684][ T2649] ------------------------------------------------------
[ 3142.788479][ T2649] kworker/3:67/2649 is trying to acquire lock:
[ 3142.790738][ T2649] 00000000aaf02291 ((wq_completion)bcache_writeback_wq){+.+.}, at: flush_workqueue+0x87/0x4c0
[ 3142.794678][ T2649]
[ 3142.794678][ T2649] but task is already holding lock:
[ 3142.797402][ T2649] 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache]
[ 3142.801462][ T2649]
[ 3142.801462][ T2649] which lock already depends on the new lock.
[ 3142.801462][ T2649]
[ 3142.805277][ T2649]
[ 3142.805277][ T2649] the existing dependency chain (in reverse order) is:
[ 3142.808902][ T2649]
[ 3142.808902][ T2649] -> #2 (&bch_register_lock){+.+.}:
[ 3142.812396][ T2649]        __mutex_lock+0x7a/0x9d0
[ 3142.814184][ T2649]        cached_dev_free+0x17/0x120 [bcache]
[ 3142.816415][ T2649]        process_one_work+0x2a4/0x640
[ 3142.818413][ T2649]        worker_thread+0x39/0x3f0
[ 3142.820276][ T2649]        kthread+0x125/0x140
[ 3142.822061][ T2649]        ret_from_fork+0x3a/0x50
[ 3142.823965][ T2649]
[ 3142.823965][ T2649] -> #1 ((work_completion)(&cl->work)#2){+.+.}:
[ 3142.827244][ T2649]        process_one_work+0x277/0x640
[ 3142.829160][ T2649]        worker_thread+0x39/0x3f0
[ 3142.830958][ T2649]        kthread+0x125/0x140
[ 3142.832674][ T2649]        ret_from_fork+0x3a/0x50
[ 3142.834915][ T2649]
[ 3142.834915][ T2649] -> #0 ((wq_completion)bcache_writeback_wq){+.+.}:
[ 3142.838121][ T2649]        lock_acquire+0xb4/0x1c0
[ 3142.840025][ T2649]        flush_workqueue+0xae/0x4c0
[ 3142.842035][ T2649]        drain_workqueue+0xa9/0x180
[ 3142.844042][ T2649]        destroy_workqueue+0x17/0x250
[ 3142.846142][ T2649]        cached_dev_free+0x52/0x120 [bcache]
[ 3142.848530][ T2649]        process_one_work+0x2a4/0x640
[ 3142.850663][ T2649]        worker_thread+0x39/0x3f0
[ 3142.852464][ T2649]        kthread+0x125/0x140
[ 3142.854106][ T2649]        ret_from_fork+0x3a/0x50
[ 3142.855880][ T2649]
[ 3142.855880][ T2649] other info that might help us debug this:
[ 3142.855880][ T2649]
[ 3142.859663][ T2649] Chain exists of:
[ 3142.859663][ T2649]   (wq_completion)bcache_writeback_wq --> (work_completion)(&cl->work)#2 --> &bch_register_lock
[ 3142.859663][ T2649]
[ 3142.865424][ T2649]  Possible unsafe locking scenario:
[ 3142.865424][ T2649]
[ 3142.868022][ T2649]        CPU0                    CPU1
[ 3142.869885][ T2649]        ----                    ----
[ 3142.871751][ T2649]   lock(&bch_register_lock);
[ 3142.873379][ T2649]                                lock((work_completion)(&cl->work)#2);
[ 3142.876399][ T2649]                                lock(&bch_register_lock);
[ 3142.879727][ T2649]   lock((wq_completion)bcache_writeback_wq);
[ 3142.882064][ T2649]
[ 3142.882064][ T2649]  *** DEADLOCK ***
[ 3142.882064][ T2649]
[ 3142.885060][ T2649] 3 locks held by kworker/3:67/2649:
[ 3142.887245][ T2649]  #0: 00000000e774cdd0 ((wq_completion)events){+.+.}, at: process_one_work+0x21e/0x640
[ 3142.890815][ T2649]  #1: 00000000f7df89da ((work_completion)(&cl->work)#2){+.+.}, at: process_one_work+0x21e/0x640
[ 3142.894884][ T2649]  #2: 000000004fcf89c5 (&bch_register_lock){+.+.}, at: cached_dev_free+0x17/0x120 [bcache]
[ 3142.898797][ T2649]
[ 3142.898797][ T2649] stack backtrace:
[ 3142.900961][ T2649] CPU: 3 PID: 2649 Comm: kworker/3:67 Tainted: G        W         5.2.0-rc4-lp151.20-default+ #1
[ 3142.904789][ T2649] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/13/2018
[ 3142.909168][ T2649] Workqueue: events cached_dev_free [bcache]
[ 3142.911422][ T2649] Call Trace:
[ 3142.912656][ T2649]  dump_stack+0x85/0xcb
[ 3142.914181][ T2649]  print_circular_bug+0x19a/0x1f0
[ 3142.916193][ T2649]  __lock_acquire+0x16cd/0x1850
[ 3142.917936][ T2649]  ? __lock_acquire+0x6a8/0x1850
[ 3142.919704][ T2649]  ? lock_acquire+0xb4/0x1c0
[ 3142.921335][ T2649]  ? find_held_lock+0x34/0xa0
[ 3142.923052][ T2649]  lock_acquire+0xb4/0x1c0
[ 3142.924635][ T2649]  ? flush_workqueue+0x87/0x4c0
[ 3142.926375][ T2649]  flush_workqueue+0xae/0x4c0
[ 3142.928047][ T2649]  ? flush_workqueue+0x87/0x4c0
[ 3142.929824][ T2649]  ? drain_workqueue+0xa9/0x180
[ 3142.931686][ T2649]  drain_workqueue+0xa9/0x180
[ 3142.933534][ T2649]  destroy_workqueue+0x17/0x250
[ 3142.935787][ T2649]  cached_dev_free+0x52/0x120 [bcache]
[ 3142.937795][ T2649]  process_one_work+0x2a4/0x640
[ 3142.939803][ T2649]  worker_thread+0x39/0x3f0
[ 3142.941487][ T2649]  ? process_one_work+0x640/0x640
[ 3142.943389][ T2649]  kthread+0x125/0x140
[ 3142.944894][ T2649]  ? kthread_create_worker_on_cpu+0x70/0x70
[ 3142.947744][ T2649]  ret_from_fork+0x3a/0x50
[ 3142.970358][ T2649] bcache: bcache_device_free() bcache0 stopped

Here is how the deadlock happens.
1) bcache_reboot() calls bcache_device_stop(), then inside
   bcache_device_stop() BCACHE_DEV_CLOSING bit is set on d->flags.
   Then closure_queue(&d->cl) is called to invoke cached_dev_flush().
2) In cached_dev_flush(), cached_dev_free() is called by continu_at().
3) In cached_dev_free(), when stopping the writeback kthread of the
   cached device by kthread_stop(), dc->writeback_thread will be waken
   up to quite the kthread while-loop, then cached_dev_put() is called
   in bch_writeback_thread().
4) Calling cached_dev_put() in writeback kthread may drop dc->count to
   0, then dc->detach kworker is scheduled, which is initialized as
   cached_dev_detach_finish().
5) Inside cached_dev_detach_finish(), the last line of code is to call
   closure_put(&dc->disk.cl), which drops the last reference counter of
   closrure dc->disk.cl, then the callback cached_dev_flush() gets
   called.
Now cached_dev_flush() is called for second time in the code path, the
first time is in step 2). And again bch_register_lock will be acquired
again, and a A-A lock (lockdep terminology) is happening.

The root cause of the above A-A lock is in cached_dev_free(), mutex
bch_register_lock is held before stopping writeback kthread and other
kworkers. Fortunately now we have variable 'bcache_is_reboot', which may
prevent device registration or unregistration during reboot/shutdown
time, so it is unncessary to hold bch_register_lock such early now.

This is how this patch fixes the reboot/shutdown time A-A lock issue:
After moving mutex_lock(&bch_register_lock) to a later location where
before atomic_read(&dc->running) in cached_dev_free(), such A-A lock
problem can be solved without any reboot time registration race.

Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2019-06-28 07:39:16 -06:00

arch

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

2019-06-14 06:16:47 -10:00

block

block: Remove unused code

2019-06-27 07:34:25 -06:00

certs

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36

2019-05-24 17:27:11 +02:00

crypto

Merge tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

2019-06-08 12:52:42 -07:00

Documentation

Documentation: nvme: add an example for nvme fault injection

2019-06-21 11:15:50 +02:00

drivers

bcache: acquire bch_register_lock later in cached_dev_free()

2019-06-28 07:39:16 -06:00

blkcg, writeback: dead memcgs shouldn't contribute to writeback ownership arbitration

2019-06-15 10:39:42 -06:00

include

block: Remove unused code

2019-06-27 07:34:25 -06:00

init

block: rename CONFIG_DEBUG_BLK_CGROUP to CONFIG_BFQ_CGROUP_DEBUG

2019-06-20 10:32:35 -06:00

ipc

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 441

2019-06-05 17:37:17 +02:00

kernel

cgroup: export css_next_descendant_pre for bfq

2019-06-21 02:48:34 -06:00

lib

lib/genalloc: introduce chunk owners

2019-06-13 17:34:56 -10:00

LICENSES

LICENSES: Rename other to deprecated

2019-05-03 06:34:32 -06:00

mm/devm_memremap_pages: fix final page put race

2019-06-13 17:34:56 -10:00

net

Merge tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

2019-06-08 12:52:42 -07:00

samples

Merge tag 'spdx-5.2-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core

2019-06-08 12:52:42 -07:00

scripts

scripts/decode_stacktrace.sh: prefix addr2line with $CROSS_COMPILE

2019-06-13 17:34:56 -10:00

security

Smack: Restore the smackfsdef mount option and add missing prefixes

2019-06-14 14:25:04 -10:00

sound

Merge tag 'sound-5.2-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

2019-06-14 05:37:06 -10:00

tools

mm/devm_memremap_pages: fix final page put race

2019-06-13 17:34:56 -10:00

usr

user/Makefile: Fix typo and capitalization in comment section

2018-12-11 00:18:03 +09:00

virt

treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 333

2019-06-05 17:37:06 +02:00

.clang-format

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

2019-04-17 11:26:25 -07:00

.cocciconfig

…

.get_maintainer.ignore

Opt out of scripts/get_maintainer.pl

2019-05-16 10:53:40 -07:00

.gitattributes

…

.gitignore

.gitignore: exclude .get_maintainer.ignore and .gitattributes

2019-05-18 11:49:54 +09:00

.mailmap

Merge tag 'docs-5.2' of git://git.lwn.net/linux

2019-05-08 12:42:50 -07:00

COPYING

COPYING: use the new text with points to the license files

2018-03-23 12:41:45 -06:00

CREDITS

MAINTAINERS: Farewell Martin Schwidefsky

2019-05-31 10:14:11 +02:00

Kbuild

Merge tag 'kbuild-v5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

2019-03-10 17:48:21 -07:00

Kconfig

kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt

2018-08-02 08:06:55 +09:00

MAINTAINERS

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux

2019-06-08 12:48:49 -07:00

Makefile

Linux 5.2-rc4

2019-06-08 20:24:46 -07:00

README

Drop all 00-INDEX files from Documentation/

2018-09-09 15:08:58 -06:00

README

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.

Languages

C 97.1%

Assembly 1%

Shell 0.6%

Rust 0.4%

Python 0.4%

Other 0.3%