Commit Graph

931895 Commits

Author SHA1 Message Date
Linus Torvalds
9716e57a01 Merge tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull atomics rework from Thomas Gleixner:
 "Peter Zijlstras rework of atomics and fallbacks. This solves two
  problems:

   1) Compilers uninline small atomic_* static inline functions which
      can expose them to instrumentation.

   2) The instrumentation of atomic primitives was done at the
      architecture level while composites or fallbacks were provided at
      the generic level. As a result there are no uninstrumented
      variants of the fallbacks.

  Both issues were in the way of fully isolating fragile entry code
  pathes and especially the text poke int3 handler which is prone to an
  endless recursion problem when anything in that code path is about to
  be instrumented. This was always a problem, but got elevated due to
  the new batch mode updates of tracing.

  The solution is to mark the functions __always_inline and to flip the
  fallback and instrumentation so the non-instrumented variants are at
  the architecture level and the instrumentation is done in generic
  code.

  The latter introduces another fallback variant which will go away once
  all architectures have been moved over to arch_atomic_*"

* tag 'locking-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/atomics: Flip fallbacks and instrumentation
  asm-generic/atomic: Use __always_inline for fallback wrappers
2020-06-11 18:27:19 -07:00
Linus Torvalds
b1a6274994 Merge branch 'akpm' (patches from Andrew)
Pull updates from Andrew Morton:
 "A few fixes and stragglers.

  Subsystems affected by this patch series: mm/memory-failure, ocfs2,
  lib/lzo, misc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  amdgpu: a NULL ->mm does not mean a thread is a kthread
  lib/lzo: fix ambiguous encoding bug in lzo-rle
  ocfs2: fix build failure when TCP/IP is disabled
  mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread
  mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill
2020-06-11 18:18:50 -07:00
Christoph Hellwig
8449d150e6 amdgpu: a NULL ->mm does not mean a thread is a kthread
Use the proper API instead.

Fixes: 70539bd795 ("drm/amd: Update MEC HQD loading code for KFD")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Jens Axboe <axboe@kernel.dk>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: http://lkml.kernel.org/r/20200404094101.672954-1-hch@lst.de
Link: http://lkml.kernel.org/r/20200404094101.672954-2-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 18:17:48 -07:00
Dave Rodgman
b5265c813c lib/lzo: fix ambiguous encoding bug in lzo-rle
In some rare cases, for input data over 32 KB, lzo-rle could encode two
different inputs to the same compressed representation, so that
decompression is then ambiguous (i.e.  data may be corrupted - although
zram is not affected because it operates over 4 KB pages).

This modifies the compressor without changing the decompressor or the
bitstream format, such that:

 - there is no change to how data produced by the old compressor is
   decompressed

 - an old decompressor will correctly decode data from the updated
   compressor

 - performance and compression ratio are not affected

 - we avoid introducing a new bitstream format

In testing over 12.8M real-world files totalling 903 GB, three files
were affected by this bug.  I also constructed 37M semi-random 64 KB
files totalling 2.27 TB, and saw no affected files.  Finally I tested
over files constructed to contain each of the ~1024 possible bad input
sequences; for all of these cases, updated lzo-rle worked correctly.

There is no significant impact to performance or compression ratio.

Signed-off-by: Dave Rodgman <dave.rodgman@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Dave Rodgman <dave.rodgman@arm.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
Cc: Markus F.X.J. Oberhumer <markus@oberhumer.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>
Cc: Chao Yu <yuchao0@huawei.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 18:17:47 -07:00
Tom Seewald
fce1affe4e ocfs2: fix build failure when TCP/IP is disabled
After commit 12abc5ee78 ("tcp: add tcp_sock_set_nodelay") and commit
c488aeadcb ("tcp: add tcp_sock_set_user_timeout"), building the kernel
with OCFS2_FS=y but without INET=y causes it to fail with:

  ld: fs/ocfs2/cluster/tcp.o: in function `o2net_accept_many':
  tcp.c:(.text+0x21b1): undefined reference to `tcp_sock_set_nodelay'
  ld: tcp.c:(.text+0x21c1): undefined reference to `tcp_sock_set_user_timeout'
  ld: fs/ocfs2/cluster/tcp.o: in function `o2net_start_connect':
  tcp.c:(.text+0x2633): undefined reference to `tcp_sock_set_nodelay'
  ld: tcp.c:(.text+0x2643): undefined reference to `tcp_sock_set_user_timeout'

This is due to tcp_sock_set_nodelay() and tcp_sock_set_user_timeout()
being declared in linux/tcp.h and defined in net/ipv4/tcp.c, which
depend on TCP/IP being enabled.

To fix this, make OCFS2_FS depend on INET=y which already requires
NET=y.

Fixes: 12abc5ee78 ("tcp: add tcp_sock_set_nodelay")
Fixes: c488aeadcb ("tcp: add tcp_sock_set_user_timeout")
Signed-off-by: Tom Seewald <tseewald@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Jason Gunthorpe <jgg@mellanox.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Link: http://lkml.kernel.org/r/20200606190827.23954-1-tseewald@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 18:17:47 -07:00
Naoya Horiguchi
03151c6e0b mm/memory-failure: send SIGBUS(BUS_MCEERR_AR) only to current thread
Action Required memory error should happen only when a processor is
about to access to a corrupted memory, so it's synchronous and only
affects current process/thread.

Recently commit 872e9a205c ("mm, memory_failure: don't send
BUS_MCEERR_AO for action required error") fixed the issue that Action
Required memory could unnecessarily send SIGBUS to the processes which
share the error memory.  But we still have another issue that we could
send SIGBUS to a wrong thread.

This is because collect_procs() and task_early_kill() fails to add the
current process to "to-kill" list.  So this patch is suggesting to fix
it.  With this fix, SIGBUS(BUS_MCEERR_AR) is never sent to non-current
process/thread.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Tony Luck <tony.luck@intel.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/1591321039-22141-3-git-send-email-naoya.horiguchi@nec.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 18:17:47 -07:00
Naoya Horiguchi
4e018b450a mm/memory-failure: prioritize prctl(PR_MCE_KILL) over vm.memory_failure_early_kill
Patch series "hwpoison: fixes signaling on memory error"

This is a small patchset to solve issues in memory error handler to send
SIGBUS to proper process/thread as expected in configuration.  Please
see descriptions in individual patches for more details.

This patch (of 2):

Early-kill policy is controlled from two types of settings, one is
per-process setting prctl(PR_MCE_KILL) and the other is system-wide
setting vm.memory_failure_early_kill.  Users expect per-process setting
to override system-wide setting as many other settings do, but
early-kill setting doesn't work as such.

For example, if a system configures vm.memory_failure_early_kill to 1
(enabled), a process receives SIGBUS even if it's configured to
explicitly disable PF_MCE_KILL by prctl().  That's not desirable for
applications with their own policies.

This patch is suggesting to change the priority of these two types of
settings, by checking sysctl_memory_failure_early_kill only when a given
process has the default kill policy.

Note that this patch is solving a thread choice issue too.

Originally, collect_procs() always chooses the main thread when
vm.memory_failure_early_kill is 1, even if the process has a dedicated
thread for memory error handling.  SIGBUS should be sent to the
dedicated thread if early-kill is enabled via
vm.memory_failure_early_kill as we are doing for PR_MCE_KILL_EARLY
processes.

Signed-off-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Link: http://lkml.kernel.org/r/1591321039-22141-1-git-send-email-naoya.horiguchi@nec.com
Link: http://lkml.kernel.org/r/1591321039-22141-2-git-send-email-naoya.horiguchi@nec.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 18:17:47 -07:00
Linus Torvalds
b961f8dc89 Merge tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
 "A few late stragglers in here. In particular:

   - Validate full range for provided buffers (Bijan)

   - Fix bad use of kfree() in buffer registration failure (Denis)

   - Don't allow close of ring itself, it's not fully safe. Making it
     fully safe would require making the system call more expensive,
     which isn't worth it.

   - Buffer selection fix

   - Regression fix for O_NONBLOCK retry

   - Make IORING_OP_ACCEPT honor O_NONBLOCK (Jiufei)

   - Restrict opcode handling for SQ/IOPOLL (Pavel)

   - io-wq work handling cleanups and improvements (Pavel, Xiaoguang)

   - IOPOLL race fix (Xiaoguang)"

* tag 'io_uring-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
  io_uring: fix io_kiocb.flags modification race in IOPOLL mode
  io_uring: check file O_NONBLOCK state for accept
  io_uring: avoid unnecessary io_wq_work copy for fast poll feature
  io_uring: avoid whole io_wq_work copy for requests completed inline
  io_uring: allow O_NONBLOCK async retry
  io_wq: add per-wq work handler instead of per work
  io_uring: don't arm a timeout through work.func
  io_uring: remove custom ->func handlers
  io_uring: don't derive close state from ->func
  io_uring: use kvfree() in io_sqe_buffer_register()
  io_uring: validate the full range of provided buffers for access
  io_uring: re-set iov base/len for buffer select retry
  io_uring: move send/recv IOPOLL check into prep
  io_uring: deduplicate io_openat{,2}_prep()
  io_uring: do build_open_how() only once
  io_uring: fix {SQ,IO}POLL with unsupported opcodes
  io_uring: disallow close of ring itself
2020-06-11 16:10:08 -07:00
Linus Torvalds
a58dfea297 Merge tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "Some followup fixes for this merge window. In particular:

   - Seqcount write missing preemption disable for stats (Ahmed)

   - blktrace fixes (Chaitanya)

   - Redundant initializations (Colin)

   - Various small NVMe fixes (Chaitanya, Christoph, Daniel, Max,
     Niklas, Rikard)

   - loop flag bug regression fix (Martijn)

   - blk-mq tagging fixes (Christoph, Ming)"

* tag 'block-5.8-2020-06-11' of git://git.kernel.dk/linux-block:
  umem: remove redundant initialization of variable ret
  pktcdvd: remove redundant initialization of variable ret
  nvmet: fail outstanding host posted AEN req
  nvme-pci: use simple suspend when a HMB is enabled
  nvme-fc: don't call nvme_cleanup_cmd() for AENs
  nvmet-tcp: constify nvmet_tcp_ops
  nvme-tcp: constify nvme_tcp_mq_ops and nvme_tcp_admin_mq_ops
  nvme: do not call del_gendisk() on a disk that was never added
  blk-mq: fix blk_mq_all_tag_iter
  blk-mq: split out a __blk_mq_get_driver_tag helper
  blktrace: fix endianness for blk_log_remap()
  blktrace: fix endianness in get_pdu_int()
  blktrace: use errno instead of bi_status
  block: nr_sects_write(): Disable preemption on seqcount write
  block: remove the error argument to the block_bio_complete tracepoint
  loop: Fix wrong masking of status flags
  block/bio-integrity: don't free 'buf' if bio_integrity_add_page() failed
2020-06-11 16:07:33 -07:00
David Howells
b3597945c8 afs: Fix afs_store_data() to set mtime in new operation descriptor
Fix afs_store_data() so that it sets the mtime in the new operation
descriptor otherwise the mtime on the server gets set to 0 when a write is
stored to the server.

Fixes: e49c7b2f6d ("afs: Build an abstraction around an "operation" concept")
Reported-by: Dave Botsch <botsch@cnf.cornell.edu>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 16:04:30 -07:00
Linus Torvalds
6a45a65888 Merge tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull more x86 updates from Thomas Gleixner:
 "A set of fixes and updates for x86:

   - Unbreak paravirt VDSO clocks.

     While the VDSO code was moved into lib for sharing a subtle check
     for the validity of paravirt clocks got replaced. While the
     replacement works perfectly fine for bare metal as the update of
     the VDSO clock mode is synchronous, it fails for paravirt clocks
     because the hypervisor can invalidate them asynchronously.

     Bring it back as an optional function so it does not inflict this
     on architectures which are free of PV damage.

   - Fix the jiffies to jiffies64 mapping on 64bit so it does not
     trigger an ODR violation on newer compilers

   - Three fixes for the SSBD and *IB* speculation mitigation maze to
     ensure consistency, not disabling of some *IB* variants wrongly and
     to prevent a rogue cross process shutdown of SSBD. All marked for
     stable.

   - Add yet more CPU models to the splitlock detection capable list
     !@#%$!

   - Bring the pr_info() back which tells that TSC deadline timer is
     enabled.

   - Reboot quirk for MacBook6,1"

* tag 'x86-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Unbreak paravirt VDSO clocks
  lib/vdso: Provide sanity check for cycles (again)
  clocksource: Remove obsolete ifdef
  x86_64: Fix jiffies ODR violation
  x86/speculation: PR_SPEC_FORCE_DISABLE enforcement for indirect branches.
  x86/speculation: Prevent rogue cross-process SSBD shutdown
  x86/speculation: Avoid force-disabling IBPB based on STIBP and enhanced IBRS.
  x86/cpu: Add Sapphire Rapids CPU model number
  x86/split_lock: Add Icelake microserver and Tigerlake CPU models
  x86/apic: Make TSC deadline timer detection message visible
  x86/reboot/quirks: Add MacBook6,1 reboot quirk
2020-06-11 15:54:31 -07:00
Linus Torvalds
92ac971219 Merge tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull timer fix from Thomas Gleixner:
 "A small fix for the VDSO code to force inline
  __cvdso_clock_gettime_common() so the compiler
  can't generate horrible code"

* tag 'timers-urgent-2020-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  lib/vdso: Force inlining of __cvdso_clock_gettime_common()
2020-06-11 15:36:28 -07:00
Linus Torvalds
623f6dc593 Merge branch 'akpm' (patches from Andrew)
Merge some more updates from Andrew Morton:

 - various hotfixes and minor things

 - hch's use_mm/unuse_mm clearnups

Subsystems affected by this patch series: mm/hugetlb, scripts, kcov,
lib, nilfs, checkpatch, lib, mm/debug, ocfs2, lib, misc.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  kernel: set USER_DS in kthread_use_mm
  kernel: better document the use_mm/unuse_mm API contract
  kernel: move use_mm/unuse_mm to kthread.c
  kernel: move use_mm/unuse_mm to kthread.c
  stacktrace: cleanup inconsistent variable type
  lib: test get_count_order/long in test_bitops.c
  mm: add comments on pglist_data zones
  ocfs2: fix spelling mistake and grammar
  mm/debug_vm_pgtable: fix kernel crash by checking for THP support
  lib: fix bitmap_parse() on 64-bit big endian archs
  checkpatch: correct check for kernel parameters doc
  nilfs2: fix null pointer dereference at nilfs_segctor_do_construct()
  lib/lz4/lz4_decompress.c: document deliberate use of `&'
  kcov: check kcov_softirq in kcov_remote_stop()
  scripts/spelling: add a few more typos
  khugepaged: selftests: fix timeout condition in wait_for_scan()
2020-06-11 13:25:53 -07:00
Joerg Roedel
5cd221e837 alpha: Fix build around srm_sysrq_reboot_op
The patch introducing the struct was probably never compile tested,
because it sets a handler with a wrong function signature. Wrap the
handler into a functions with the correct signature to fix the build.

Fixes: 0f1c9688a1 ("tty/sysrq: alpha: export and use __sysrq_get_key_op()")
Cc: Emil Velikov <emil.l.velikov@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-11 13:05:24 -07:00
Linus Torvalds
cd16ed33c3 Merge tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull more RISC-V updates from Palmer Dabbelt:

 - Kconfig select statements are now sorted alphanumerically

 - first-level interrupts are now handled via a full irqchip driver

 - CPU hotplug is fixed

 - vDSO calls now use the common vDSO infrastructure

* tag 'riscv-for-linus-5.8-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: set the permission of vdso_data to read-only
  riscv: use vDSO common flow to reduce the latency of the time-related functions
  riscv: fix build warning of missing prototypes
  RISC-V: Don't mark init section as non-executable
  RISC-V: Force select RISCV_INTC for CONFIG_RISCV
  RISC-V: Remove do_IRQ() function
  clocksource/drivers/timer-riscv: Use per-CPU timer interrupt
  irqchip: RISC-V per-HART local interrupt controller driver
  RISC-V: Rename and move plic_find_hart_id() to arch directory
  RISC-V: self-contained IPI handling routine
  RISC-V: Sort select statements alphanumerically
2020-06-11 12:55:20 -07:00
Linus Torvalds
55d728b2b0 Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "arm64 fixes that came in during the merge window.

  There will probably be more to come, but it doesn't seem like it's
  worth me sitting on these in the meantime.

   - Fix SCS debug check to report max stack usage in bytes as advertised

   - Fix typo: CONFIG_FTRACE_WITH_REGS => CONFIG_DYNAMIC_FTRACE_WITH_REGS

   - Fix incorrect mask in HiSilicon L3C perf PMU driver

   - Fix compat vDSO compilation under some toolchain configurations

   - Fix false UBSAN warning from ACPI IORT parsing code

   - Fix booting under bootloaders that ignore TEXT_OFFSET

   - Annotate debug initcall function with '__init'"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: warn on incorrect placement of the kernel by the bootloader
  arm64: acpi: fix UBSAN warning
  arm64: vdso32: add CONFIG_THUMB2_COMPAT_VDSO
  drivers/perf: hisi: Fix wrong value for all counters enable
  arm64: ftrace: Change CONFIG_FTRACE_WITH_REGS to CONFIG_DYNAMIC_FTRACE_WITH_REGS
  arm64: debug: mark a function as __init to save some memory
  scs: Report SCS usage in bytes rather than number of entries
2020-06-11 12:53:23 -07:00
Linus Torvalds
d3ea693439 Merge tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu
Pull m68knommu updates from Greg Ungerer:

 - casting clean up in the user access macros

 - memory leak on error case fix for PCI probing

 - update of a defconfig

* tag 'm68knommu-for-v5.8' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu:
  m68k,nommu: fix implicit cast from __user in __{get,put}_user_asm()
  m68k,nommu: add missing __user in uaccess' __ptr() macro
  m68k: Drop CONFIG_MTD_M25P80 in stmark2_defconfig
  m68k/PCI: Fix a memory leak in an error handling path
2020-06-11 12:50:54 -07:00
Linus Torvalds
2dca74a40e Merge tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:
 "qcom:
   - new controller driver for IPCC
   - reorg the of_device data
   - add support for ipq6018 platform

  spreadtrum:
   - new sprd controller driver

  imx:
   - implement suspend/resume PM support

  misc:
   - make pcc driver struct static
   - fix return value in imx_mu_scu
   - disable clock before bailout in imx probe
   - remove duplicate error mssg in zynqmp probe
   - fix header size in imx.scu
   - check for null instead of is-err in zynqmp"

* tag 'mailbox-v5.8' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  mailbox: qcom: Add ipq6018 apcs compatible
  mailbox: qcom: Add clock driver name in apcs mailbox driver data
  dt-bindings: mailbox: Add YAML schemas for QCOM APCS global block
  mailbox: imx: ONLY IPC MU needs IRQF_NO_SUSPEND flag
  mailbox: imx: Add runtime PM callback to handle MU clocks
  mailbox: imx: Add context save/restore for suspend/resume
  MAINTAINERS: Add entry for Qualcomm IPCC driver
  mailbox: Add support for Qualcomm IPCC
  dt-bindings: mailbox: Add devicetree binding for Qcom IPCC
  mailbox: zynqmp-ipi: Fix NULL vs IS_ERR() check in zynqmp_ipi_mbox_probe()
  mailbox: imx-mailbox: fix scu msg header size check
  mailbox: sprd: Add Spreadtrum mailbox driver
  dt-bindings: mailbox: Add the Spreadtrum mailbox documentation
  mailbox: ZynqMP IPI: Delete an error message in zynqmp_ipi_probe()
  mailbox: imx: Disable the clock on devm_mbox_controller_register() failure
  mailbox: imx: Fix return in imx_mu_scu_xlate()
  mailbox: imx: Support runtime PM
  mailbox: pcc: make pcc_mbox_driver static
2020-06-11 12:42:14 -07:00
Linus Torvalds
e0154bd478 Merge tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "Here are last-minute fixes gathered before merge window close; a few
  fixes are for the core while the rest majority are driver fixes.

   - PCM locking annotation fixes and the possible self-lock fix

   - ASoC DPCM regression fixes with multi-CPU DAI

   - A fix for inconsistent resume from system-PM on USB-audio

   - Improved runtime-PM handling with multiple USB interfaces

   - Quirks for HD-audio and USB-audio

   - Hardened firmware handling in max98390 codec

   - A couple of fixes for meson"

* tag 'sound-fix-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (21 commits)
  ASoC: rt5645: Add platform-data for Asus T101HA
  ASoC: Intel: bytcr_rt5640: Add quirk for Toshiba Encore WT10-A tablet
  ASoC: SOF: nocodec: conditionally set dpcm_capture/dpcm_playback flags
  ASoC: Intel: boards: replace capture_only by dpcm_capture
  ASoC: core: only convert non DPCM link to DPCM link
  ASoC: soc-pcm: dpcm: fix playback/capture checks
  ASoC: meson: add missing free_irq() in error path
  ALSA: pcm: disallow linking stream to itself
  ALSA: usb-audio: Manage auto-pm of all bundled interfaces
  ALSA: hda/realtek - add a pintbl quirk for several Lenovo machines
  ALSA: pcm: fix snd_pcm_link() lockdep splat
  ALSA: usb-audio: Use the new macro for HP Dock rename quirks
  ALSA: usb-audio: Add vendor, product and profile name for HP Thunderbolt Dock
  ALSA: emu10k1: delete an unnecessary condition
  dt-bindings: ASoc: Fix tdm-slot documentation spelling error
  ASoC: meson: fix memory leak of links if allocation of ldata fails
  ALSA: usb-audio: Fix inconsistent card PM state after resume
  ASoC: max98390: Fix potential crash during param fw loading
  ASoC: max98390: Fix incorrect printf qualifier
  ASoC: fsl-asoc-card: Defer probe when fail to find codec device
  ...
2020-06-11 12:38:11 -07:00
Linus Torvalds
d4e181f204 Merge tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
 "One sun4i fix and a connector hotplug race The ast fix is for a
  regression in 5.6, and one of the i915 ones fixes an oops reported by
  dhowells.

  core:
   - fix race in connectors sending hotplug

  i915:
   - Avoid use after free in cmdparser
   - Avoid NULL dereference when probing all display encoders
   - Fixup to module parameter type

  sun4i:
   - clock divider fix

  ast:
   - 24/32 bpp mode setting fix"

* tag 'drm-next-2020-06-11-1' of git://anongit.freedesktop.org/drm/drm:
  drm/ast: fix missing break in switch statement for format->cpp[0] case 4
  drm/sun4i: hdmi ddc clk: Fix size of m divider
  drm/i915/display: Only query DP state of a DDI encoder
  drm/i915/params: fix i915.reset module param type
  drm/i915/gem: Mark the buffer pool as active for the cmdparser
  drm/connector: notify userspace on hotplug after register complete
2020-06-11 12:27:06 -07:00
Linus Torvalds
a539568299 Merge tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
Pull NFS client updates from Anna Schumaker:
 "New features and improvements:
   - Sunrpc receive buffer sizes only change when establishing a GSS credentials
   - Add more sunrpc tracepoints
   - Improve on tracepoints to capture internal NFS I/O errors

  Other bugfixes and cleanups:
   - Move a dprintk() to after a call to nfs_alloc_fattr()
   - Fix off-by-one issues in rpc_ntop6
   - Fix a few coccicheck warnings
   - Use the correct SPDX license identifiers
   - Fix rpc_call_done assignment for BIND_CONN_TO_SESSION
   - Replace zero-length array with flexible array
   - Remove duplicate headers
   - Set invalid blocks after NFSv4 writes to update space_used attribute
   - Fix direct WRITE throughput regression"

* tag 'nfs-for-5.8-1' of git://git.linux-nfs.org/projects/anna/linux-nfs: (27 commits)
  NFS: Fix direct WRITE throughput regression
  SUNRPC: rpc_xprt lifetime events should record xprt->state
  xprtrdma: Make xprt_rdma_slot_table_entries static
  nfs: set invalid blocks after NFSv4 writes
  NFS: remove redundant initialization of variable result
  sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
  NFS: Add a tracepoint in nfs_set_pgio_error()
  NFS: Trace short NFS READs
  NFS: nfs_xdr_status should record the procedure name
  SUNRPC: Set SOFTCONN when destroying GSS contexts
  SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
  SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
  SUNRPC: trace RPC client lifetime events
  SUNRPC: Trace transport lifetime events
  SUNRPC: Split the xdr_buf event class
  SUNRPC: Add tracepoint to rpc_call_rpcerror()
  SUNRPC: Update the RPC_SHOW_SOCKET() macro
  SUNRPC: Update the rpc_show_task_flags() macro
  SUNRPC: Trace GSS context lifetimes
  SUNRPC: receive buffer size estimation values almost never change
  ...
2020-06-11 12:22:41 -07:00
Linus Torvalds
e8de4575cf Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
 "A number of fixes to the omap and nitrox drivers"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: cavium/nitrox - Fix 'nitrox_get_first_device()' when ndevlist is fully iterated
  crypto: omap-sham - add proper load balancing support for multicore
  crypto: omap-aes - prevent unregistering algorithms twice
  crypto: omap-sham - fix very small data size handling
  crypto: omap-sham - huge buffer access fixes
  crypto: omap-crypto - fix userspace copied buffer access
  crypto: omap-sham - force kernel driver usage for sha algos
  crypto: omap-aes - avoid spamming console with self tests
2020-06-11 11:02:13 -07:00
Linus Torvalds
7cf035cc83 Merge tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull DAX updates part three from Darrick Wong:
 "Now that the xfs changes have landed, this third piece changes the
  FS_XFLAG_DAX ioctl code in xfs to request that the inode be reloaded
  after the last program closes the file, if doing so would make a S_DAX
  change happen. The goal here is to make dax access mode switching
  quicker when possible.

  Summary:

   - Teach XFS to ask the VFS to drop an inode if the administrator
     changes the FS_XFLAG_DAX inode flag such that the S_DAX state would
     change. This can result in files changing access modes without
     requiring an unmount cycle"

* tag 'vfs-5.8-merge-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fs/xfs: Update xfs_ioctl_setattr_dax_invalidate()
  fs/xfs: Combine xfs_diflags_to_linux() and xfs_diflags_to_iflags()
  fs/xfs: Create function xfs_inode_should_enable_dax()
  fs/xfs: Make DAX mount option a tri-state
  fs/xfs: Change XFS_MOUNT_DAX to XFS_MOUNT_DAX_ALWAYS
  fs/xfs: Remove unnecessary initialization of i_rwsem
2020-06-11 10:48:12 -07:00
Chuck Lever
ba838a75e7 NFS: Fix direct WRITE throughput regression
I measured a 50% throughput regression for large direct writes.

The observed on-the-wire behavior is that the client sends every
NFS WRITE twice: once as an UNSTABLE WRITE plus a COMMIT, and once
as a FILE_SYNC WRITE.

This is because the nfs_write_match_verf() check in
nfs_direct_commit_complete() fails for every WRITE.

Buffered writes use nfs_write_completion(), which sets req->wb_verf
correctly. Direct writes use nfs_direct_write_completion(), which
does not set req->wb_verf at all. This leaves req->wb_verf set to
all zeroes for every direct WRITE, and thus
nfs_direct_commit_completion() always sets NFS_ODIRECT_RESCHED_WRITES.

This fix appears to restore nearly all of the lost performance.

Fixes: 1f28476dcb ("NFS: Fix O_DIRECT commit verifier handling")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
94afd9c489 SUNRPC: rpc_xprt lifetime events should record xprt->state
Help troubleshoot the logic that uses these flags.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Zou Wei
5bffb00621 xprtrdma: Make xprt_rdma_slot_table_entries static
Fix the following sparse warning:

net/sunrpc/xprtrdma/transport.c:71:14: warning: symbol 'xprt_rdma_slot_table_entries'
was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Zheng Bin
3a39e77869 nfs: set invalid blocks after NFSv4 writes
Use the following command to test nfsv4(size of file1M is 1MB):
mount -t nfs -o vers=4.0,actimeo=60 127.0.0.1/dir1 /mnt
cp file1M /mnt
du -h /mnt/file1M  -->0 within 60s, then 1M

When write is done(cp file1M /mnt), will call this:
nfs_writeback_done
  nfs4_write_done
    nfs4_write_done_cb
      nfs_writeback_update_inode
        nfs_post_op_update_inode_force_wcc_locked(change, ctime, mtime
nfs_post_op_update_inode_force_wcc_locked
   nfs_set_cache_invalid
   nfs_refresh_inode_locked
     nfs_update_inode

nfsd write response contains change, ctime, mtime, the flag will be
clear after nfs_update_inode. Howerver, write response does not contain
space_used, previous open response contains space_used whose value is 0,
so inode->i_blocks is still 0.

nfs_getattr  -->called by "du -h"
  do_update |= force_sync || nfs_attribute_cache_expired -->false in 60s
  cache_validity = READ_ONCE(NFS_I(inode)->cache_validity)
  do_update |= cache_validity & (NFS_INO_INVALID_ATTR    -->false
  if (do_update) {
        __nfs_revalidate_inode
  }

Within 60s, does not send getattr request to nfsd, thus "du -h /mnt/file1M"
is 0.

Add a NFS_INO_INVALID_BLOCKS flag, set it when nfsv4 write is done.

Fixes: 16e1437517 ("NFS: More fine grained attribute tracking")
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Colin Ian King
86b936672e NFS: remove redundant initialization of variable result
The variable result is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Xiongfeng Wang
2ac3ddc723 sunrpc: add missing newline when printing parameter 'auth_hashtable_size' by sysfs
When I cat parameter
'/sys/module/sunrpc/parameters/auth_hashtable_size', it displays as
follows. It is better to add a newline for easy reading.

[root@hulk-202 ~]# cat /sys/module/sunrpc/parameters/auth_hashtable_size
16[root@hulk-202 ~]#

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
cd2ed9bdc0 NFS: Add a tracepoint in nfs_set_pgio_error()
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
fd2b612141 NFS: Trace short NFS READs
A short read can generate an -EIO error without there being an error
on the wire. This tracepoint acts as an eyecatcher when there is no
obvious I/O error.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
5be5945864 NFS: nfs_xdr_status should record the procedure name
When sunrpc trace points are not enabled, the recorded task ID
information alone is not helpful.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
841a2ed9a1 SUNRPC: Set SOFTCONN when destroying GSS contexts
Move the RPC_TASK_SOFTCONN flag into rpc_call_null_helper(). The
only minor behavior change is that it is now also set when
destroying GSS contexts.

This gives a better guarantee that gss_send_destroy_context() will
not hang for long if a connection cannot be established.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
6fc3737aac SUNRPC: rpc_call_null_helper() should set RPC_TASK_SOFT
Clean up.

All of rpc_call_null_helper() call sites assert RPC_TASK_SOFT, so
move that setting into rpc_call_null_helper() itself.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
eefc536dbd SUNRPC: rpc_call_null_helper() already sets RPC_TASK_NULLCREDS
Clean up.

Commit a52458b48a ("NFS/NFSD/SUNRPC: replace generic creds with
'struct cred'.") made rpc_call_null_helper() set RPC_TASK_NULLCREDS
unconditionally. Therefore there's no need for
rpc_call_null_helper()'s call sites to set RPC_TASK_NULLCREDS.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
42aad0d7f9 SUNRPC: trace RPC client lifetime events
The "create" tracepoint records parts of the rpc_create arguments,
and the shutdown tracepoint records when the rpc_clnt is about to
signal pending tasks and destroy auths.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
911813d7a1 SUNRPC: Trace transport lifetime events
Refactor: Hoist create/destroy/disconnect tracepoints out of
xprtrdma and into the generic RPC client. Some benefits include:

- Enable tracing of xprt lifetime events for the socket transport
  types

- Expose the different types of disconnect to help run down
  issues with lingering connections

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
c509f15a58 SUNRPC: Split the xdr_buf event class
To help tie the recorded xdr_buf to a particular RPC transaction,
the client side version of this class should display task ID
information and the server side one should show the request's XID.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
0125ecbb52 SUNRPC: Add tracepoint to rpc_call_rpcerror()
Add a tracepoint in another common exit point for failing RPCs.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
82909dc546 SUNRPC: Update the RPC_SHOW_SOCKET() macro
Clean up: remove unnecessary commas, and fix a white-space nit.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:48 -04:00
Chuck Lever
7a34c8e0c3 SUNRPC: Update the rpc_show_task_flags() macro
Recent additions to the RPC_TASK flags neglected to update
the tracepoint ENUM definitions.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:47 -04:00
Chuck Lever
74fb8fecee SUNRPC: Trace GSS context lifetimes
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:47 -04:00
Chuck Lever
53bc19f17f SUNRPC: receive buffer size estimation values almost never change
Avoid unnecessary cache sloshing by placing the buffer size
estimation update logic behind an atomic bit flag.

The size of GSS information included in each wrapped Reply does
not change during the lifetime of a GSS context. Therefore, the
au_rslack and au_ralign fields need to be updated only once after
establishing a fresh GSS credential.

Thus a slack size update must occur after a cred is created,
duplicated, renewed, or expires. I'm not sure I have this exactly
right. A trace point is introduced to track updates to these
variables to enable troubleshooting the problem if I missed a spot.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
2020-06-11 13:33:47 -04:00
Linus Torvalds
c742b63473 Merge tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
 "Highlights:

   - Keep nfsd clients from unnecessarily breaking their own
     delegations.

     Note this requires a small kthreadd addition. The result is Tejun
     Heo's suggestion (see link), and he was OK with this going through
     my tree.

   - Patch nfsd/clients/ to display filenames, and to fix byte-order
     when displaying stateid's.

   - fix a module loading/unloading bug, from Neil Brown.

   - A big series from Chuck Lever with RPC/RDMA and tracing
     improvements, and lay some groundwork for RPC-over-TLS"

Link: https://lore.kernel.org/r/1588348912-24781-1-git-send-email-bfields@redhat.com

* tag 'nfsd-5.8' of git://linux-nfs.org/~bfields/linux: (49 commits)
  sunrpc: use kmemdup_nul() in gssp_stringify()
  nfsd: safer handling of corrupted c_type
  nfsd4: make drc_slab global, not per-net
  SUNRPC: Remove unreachable error condition in rpcb_getport_async()
  nfsd: Fix svc_xprt refcnt leak when setup callback client failed
  sunrpc: clean up properly in gss_mech_unregister()
  sunrpc: svcauth_gss_register_pseudoflavor must reject duplicate registrations.
  sunrpc: check that domain table is empty at module unload.
  NFSD: Fix improperly-formatted Doxygen comments
  NFSD: Squash an annoying compiler warning
  SUNRPC: Clean up request deferral tracepoints
  NFSD: Add tracepoints for monitoring NFSD callbacks
  NFSD: Add tracepoints to the NFSD state management code
  NFSD: Add tracepoints to NFSD's duplicate reply cache
  SUNRPC: svc_show_status() macro should have enum definitions
  SUNRPC: Restructure svc_udp_recvfrom()
  SUNRPC: Refactor svc_recvfrom()
  SUNRPC: Clean up svc_release_skb() functions
  SUNRPC: Refactor recvfrom path dealing with incomplete TCP receives
  SUNRPC: Replace dprintk() call sites in TCP receive path
  ...
2020-06-11 10:33:13 -07:00
Xiaoguang Wang
65a6543da3 io_uring: fix io_kiocb.flags modification race in IOPOLL mode
While testing io_uring in arm, we found sometimes io_sq_thread() keeps
polling io requests even though there are not inflight io requests in
block layer. After some investigations, found a possible race about
io_kiocb.flags, see below race codes:
  1) in the end of io_write() or io_read()
    req->flags &= ~REQ_F_NEED_CLEANUP;
    kfree(iovec);
    return ret;

  2) in io_complete_rw_iopoll()
    if (res != -EAGAIN)
        req->flags |= REQ_F_IOPOLL_COMPLETED;

In IOPOLL mode, io requests still maybe completed by interrupt, then
above codes are not safe, concurrent modifications to req->flags, which
is not protected by lock or is not atomic modifications. I also had
disassemble io_complete_rw_iopoll() in arm:
   req->flags |= REQ_F_IOPOLL_COMPLETED;
   0xffff000008387b18 <+76>:    ldr     w0, [x19,#104]
   0xffff000008387b1c <+80>:    orr     w0, w0, #0x1000
   0xffff000008387b20 <+84>:    str     w0, [x19,#104]

Seems that the "req->flags |= REQ_F_IOPOLL_COMPLETED;" is  load and
modification, two instructions, which obviously is not atomic.

To fix this issue, add a new iopoll_completed in io_kiocb to indicate
whether io request is completed.

Signed-off-by: Xiaoguang Wang <xiaoguang.wang@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11 09:45:21 -06:00
Colin Ian King
9a6a5738ab umem: remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Addresses-Coverity: ("Unused value")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11 09:16:17 -06:00
Colin Ian King
8d20319e29 pktcdvd: remove redundant initialization of variable ret
The variable ret is being initialized with a value that is never read
and it is being updated later with a new value.  The initialization is
redundant and can be removed.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Addresses-Coverity: ("Unused value")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11 09:15:48 -06:00
Chaitanya Kulkarni
819f7b88b4 nvmet: fail outstanding host posted AEN req
In function nvmet_async_event_process() we only process AENs iff
there is an open slot on the ctrl->async_event_cmds[] && aen
event list posted by the target is not empty. This keeps host
posted AEN outstanding if target generated AEN list is empty.
We do cleanup the target generated entries from the aen list in
nvmet_ctrl_free()-> nvmet_async_events_free() but we don't
process AEN posted by the host. This leads to following problem :-

When processing admin sq at the time of nvmet_sq_destroy() holds
an extra percpu reference(atomic value = 1), so in the following code
path after switching to atomic rcu, release function (nvmet_sq_free())
is not getting called which blocks the sq->free_done in
nvmet_sq_destroy() :-

nvmet_sq_destroy()
 percpu_ref_kill_and_confirm()
 - __percpu_ref_switch_mode()
 --  __percpu_ref_switch_to_atomic()
 ---   call_rcu() -> percpu_ref_switch_to_atomic_rcu()
 ----     /* calls switch callback */
 - percpu_ref_put()
 -- percpu_ref_put_many(ref, 1)
 --- else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
 ----   ref->release(ref); <---- Not called.

This results in indefinite hang:-

  void nvmet_sq_destroy(struct nvmet_sq *sq)
...
          if (ctrl && ctrl->sqs && ctrl->sqs[0] == sq) {
                  nvmet_async_events_process(ctrl, status);
                  percpu_ref_put(&sq->ref);
          }
          percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq);
          wait_for_completion(&sq->confirm_done);
          wait_for_completion(&sq->free_done); <-- Hang here

Which breaks the further disconnect sequence. This problem seems to be
introduced after commit 64f5e9cdd7 ("nvmet: fix memory leak when
removing namespaces and controllers concurrently").

This patch processes ctrl->async_event_cmds[] in the admin sq destroy()
context irrespetive of aen_list. Also we get rid of the controller's
aen_list processing in the nvmet_sq_destroy() context and just ignore
ctrl->aen_list.

This results in nvmet_async_events_process() being called from workqueue
context so we adjust the code accordingly.

Fixes: 64f5e9cdd7 ("nvmet: fix memory leak when removing namespaces and controllers concurrently ")
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11 09:10:06 -06:00
Christoph Hellwig
b97120b15e nvme-pci: use simple suspend when a HMB is enabled
While the NVMe specification allows the device to access the host memory
buffer in host DRAM from all power states, hosts will fail access to
DRAM during S3 and similar power states.

Fixes: d916b1be94 ("nvme-pci: use host managed power state for suspend")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11 09:10:06 -06:00
Daniel Wagner
c9c12e51b8 nvme-fc: don't call nvme_cleanup_cmd() for AENs
Asynchronous event notifications do not have an associated request.
When fcp_io() fails we unconditionally call nvme_cleanup_cmd() which
leads to a crash.

Fixes: 16686f3a6c ("nvme: move common call to nvme_cleanup_cmd to core layer")
Signed-off-by: Daniel Wagner <dwagner@suse.de>
Reviewed-by: Himanshu Madhani <hmadhani2024@gmail.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-06-11 09:10:05 -06:00