Commit Graph

1336146 Commits

Author SHA1 Message Date
Thomas Weißschuh
7787bfb3b0 drm/sysfs: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://lore.kernel.org/r/20241216-sysfs-const-bin_attr-drm-v1-1-210f2b36b9bf@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:31 +01:00
Thomas Weißschuh
1c83b02c91 firmware: dmi: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250125-sysfs-const-bin_attr-dmi-v2-3-ece1895936f4@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Thomas Weißschuh
80d3989b9c firmware: dmi: Define bin_attributes through macro
The macro makes the code shorter and simplifies constification of the
callback arguments.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250125-sysfs-const-bin_attr-dmi-v2-2-ece1895936f4@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Thomas Weißschuh
14e694dbf2 firmware: dmi: Mark bin_attributes as __ro_after_init
The attributes are only modified during the __init phase.
Protect them against accidental or intentional modifications afterwards.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250125-sysfs-const-bin_attr-dmi-v2-1-ece1895936f4@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Thomas Weißschuh
7de24e20a7 cxl/port: Constify 'struct bin_attribute'
The sysfs core now allows instances of 'struct bin_attribute' to be
moved into read-only memory. Make use of that to protect them against
accidental or malicious modifications.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20250114-sysfs-const-bin_attr-cxl-v1-1-5afa23fe2a52@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Thorsten Blum
1d2d45b627 driver core: location: Use str_yes_no() helper function
Remove hard-coded strings by using the str_yes_no() helper function.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://lore.kernel.org/r/20250211132409.700073-2-thorsten.blum@linux.dev
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Lucas De Marchi
177cbd5249 drivers: base: component: Allow more space for device name
Some drivers use <BDF>-<UUID> as the aggregate device name which uses
more than 20 chars, causing the status not to be aligned correctly.
Example for mei_gsc_proxy on LNL:

Before:
	aggregate_device name                                  status
	-------------------------------------------------------------
	0000:00:16.0-0f73db04-97ab-4125-b893-e904ad0d5464                bound

After:
	aggregate_device name                                            status
	-----------------------------------------------------------------------
	0000:00:16.0-0f73db04-97ab-4125-b893-e904ad0d5464                 bound

Give it 10 more chars for proper alignment.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://lore.kernel.org/r/20250205205851.2355820-2-lucas.demarchi@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Zijun Hu
0514059ca0 MAINTAINERS: Add driver core headers to DRIVER CORE maintainers
According to get_maintainer.pl output, there are neither maintainer
nor supporter for the following driver core headers:

include/linux/device.h
include/linux/device/

Add them to DRIVER CORE maintainers.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250208-drv_core_hdr-v1-1-8205b0483e3f@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Bharadwaj Raju
6fb1ee255e drivers/base/bus.c: fix spelling of "subsystem"
Fix spelling, "subystem" -> "subsystem"

Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
Link: https://lore.kernel.org/r/20250203220312.1052986-1-bharadwaj.raju777@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Bagas Sanjaya
b1b620bfa9 kernel: Fix "select" wording on HZ_250 description
HZ_250 config description contains alternative choice for NTSC
media users (HZ_300), which is written as "selected 300Hz". This is
incorrect, as it implies that HZ_300 is automatically selected
whereas the user has chosen HZ_250 instead.

Fix the wording to "select 300Hz".

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/r/20250203025000.17953-1-bagasdotme@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-21 09:20:30 +01:00
Zijun Hu
a44073c28b driver core: Remove needless return in void API device_remove_group()
Remove return since both device_remove_group() and device_remove_groups()
are void functions.

Fixes: e323b2dddc ("driver core: add device_{add|remove}_group() helpers")
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250208-fix_device_remove_group-v1-1-8a5b0ac0ce5c@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-20 12:48:01 +01:00
Zijun Hu
8fd74a31ea driver core: class: Remove needless return in void API class_remove_file()
Remove return since both class_remove_file() and class_remove_file_ns()
are void functions.

Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
Link: https://lore.kernel.org/r/20250208-cls_rmv_return-v1-1-091b37945aac@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-20 12:47:56 +01:00
Sebastian Andrzej Siewior
6ef5b6fae3 kernfs: Drop kernfs_rwsem while invoking lookup_positive_unlocked().
syzbot reported two warnings:
- kernfs_node::name was accessed outside of a RCU section so it created
  warning. The kernfs_rwsem was held so it was okay but it wasn't seen.

- While kernfs_rwsem was held invoked lookup_positive_unlocked()->
  kernfs_dop_revalidate() which acquired kernfs_rwsem.

kernfs_rwsem was both acquired as a read lock so it can be acquired
twice. However if a writer acquires the lock after the first reader then
neither the writer nor the second reader can obtain the lock so it
deadlocks.

The reason for the lock is to ensure that kernfs_node::name remain
stable during lookup_positive_unlocked()'s invocation. The function can
not be invoked within a RCU section because it may sleep.

Make a temporary copy of the kernfs_node::name under the lock so
GFP_KERNEL can be used and use this instead.

Reported-by: syzbot+ecccecbc636b455f9084@syzkaller.appspotmail.com
Fixes: 5b2fabf7fe ("kernfs: Acquire kernfs_rwsem in kernfs_node_dentry().")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20250218163938.xmvjlJ0K@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-19 17:07:41 +01:00
Greg Kroah-Hartman
2ce177e9b3 Merge 6.14-rc3 into driver-core-next
We need the faux_device changes in here for future work.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-17 07:24:33 +01:00
Linus Torvalds
0ad2507d5d Linux 6.14-rc3 v6.14-rc3 2025-02-16 14:02:44 -08:00
Linus Torvalds
224e745110 Merge tag 'kbuild-fixes-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix annoying logs when building tools in parallel

 - Fix the Debian linux-headers package build again

 - Fix the target triple detection for userspace programs on Clang

* tag 'kbuild-fixes-v6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  modpost: Fix a few typos in a comment
  kbuild: userprogs: fix bitsize and target detection on clang
  kbuild: fix linux-headers package build when $(CC) cannot link userspace
  tools: fix annoying "mkdir -p ..." logs when building tools in parallel
2025-02-16 12:58:51 -08:00
Linus Torvalds
ae5fa8ce7e Merge tag 'driver-core-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core api addition from Greg KH:
 "Here is a driver core new api for 6.14-rc3 that is being added to
  allow platform devices from stop being abused.

  It adds a new 'faux_device' structure and bus and api to allow almost
  a straight or simpler conversion from platform devices that were not
  really a platform device. It also comes with a binding for rust, with
  an example driver in rust showing how it's used.

  I'm adding this now so that the patches that convert the different
  drivers and subsystems can all start flowing into linux-next now
  through their different development trees, in time for 6.15-rc1.

  We have a number that are already reviewed and tested, but adding
  those conversions now doesn't seem right. For now, no one is using
  this, and it passes all build tests from 0-day and linux-next, so all
  should be good"

* tag 'driver-core-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
  rust/kernel: Add faux device bindings
  driver core: add a faux bus for use when a simple device/bus is needed
2025-02-16 12:54:42 -08:00
Linus Torvalds
56400391b1 Merge tag 'tty-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull serial driver fixes from Greg KH:
 "Here are some small serial driver fixes for some reported problems.
  Nothing major, just:

   - sc16is7xx irq check fix

   - 8250 fifo underflow fix

   - serial_port and 8250 iotype fixes

  Most of these have been in linux-next already, and all have passed
  0-day testing"

* tag 'tty-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  serial: 8250: Fix fifo underflow on flush
  serial: 8250_pnp: Remove unneeded ->iotype assignment
  serial: 8250_platform: Remove unneeded ->iotype assignment
  serial: 8250_of: Remove unneeded ->iotype assignment
  serial: port: Make ->iotype validation global in __uart_read_properties()
  serial: port: Always update ->iotype in __uart_read_properties()
  serial: port: Assign ->iotype correctly when ->iobase is set
  serial: sc16is7xx: Fix IRQ number check behavior
2025-02-16 12:50:44 -08:00
Linus Torvalds
6bfcc5fb2f Merge tag 'usb-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
 "Here are some small USB driver fixes, and new device ids, for
  6.14-rc3. Lots of tiny stuff for reported problems, including:

   - new device ids and quirks

   - usb hub crash fix found by syzbot

   - dwc2 driver fix

   - dwc3 driver fixes

   - uvc gadget driver fix

   - cdc-acm driver fixes for a variety of different issues

   - other tiny bugfixes

  Almost all of these have been in linux-next this week, and all have
  passed 0-day testing"

* tag 'usb-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (25 commits)
  usb: typec: tcpm: PSSourceOffTimer timeout in PR_Swap enters ERROR_RECOVERY
  usb: roles: set switch registered flag early on
  usb: gadget: uvc: Fix unstarted kthread worker
  USB: quirks: add USB_QUIRK_NO_LPM quirk for Teclast dist
  usb: gadget: core: flush gadget workqueue after device removal
  USB: gadget: f_midi: f_midi_complete to call queue_work
  usb: core: fix pipe creation for get_bMaxPacketSize0
  usb: dwc3: Fix timeout issue during controller enter/exit from halt state
  USB: Add USB_QUIRK_NO_LPM quirk for sony xperia xz1 smartphone
  USB: cdc-acm: Fill in Renesas R-Car D3 USB Download mode quirk
  usb: cdc-acm: Fix handling of oversized fragments
  usb: cdc-acm: Check control transfer buffer size before access
  usb: xhci: Restore xhci_pci support for Renesas HCs
  USB: pci-quirks: Fix HCCPARAMS register error for LS7A EHCI
  USB: serial: option: drop MeiG Smart defines
  USB: serial: option: fix Telit Cinterion FN990A name
  USB: serial: option: add Telit Cinterion FN990B compositions
  USB: serial: option: add MeiG Smart SLM828
  usb: gadget: f_midi: fix MIDI Streaming descriptor lengths
  usb: dwc2: gadget: remove of_node reference upon udc_stop
  ...
2025-02-16 11:15:50 -08:00
Linus Torvalds
ba643b6d84 Merge tag 'irq_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq Kconfig cleanup from Borislav Petkov:

 - Remove an unused config item GENERIC_PENDING_IRQ_CHIPFLAGS

* tag 'irq_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Remove unused CONFIG_GENERIC_PENDING_IRQ_CHIPFLAGS
2025-02-16 10:55:17 -08:00
Linus Torvalds
ff1848d81c Merge tag 'perf_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fixes from Borislav Petkov:

 - Explicitly clear DEBUGCTL.LBR to prevent LBRs continuing being
   enabled after handoff to the OS

 - Check CPUID(0x23) leaf and subleafs presence properly

 - Remove the PEBS-via-PT feature from being supported on hybrid systems

 - Fix perf record/top default commands on systems without a raw PMU
   registered

* tag 'perf_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Ensure LBRs are disabled when a CPU is starting
  perf/x86/intel: Fix ARCH_PERFMON_NUM_COUNTER_LEAF
  perf/x86/intel: Clean up PEBS-via-PT on hybrid
  perf/x86/rapl: Fix the error checking order
2025-02-16 10:41:50 -08:00
Linus Torvalds
ff3b373ecc Merge tag 'sched_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fix from Borislav Petkov:

 - Clarify what happens when a task is woken up from the wake queue and
   make clear its removal from that queue is atomic

* tag 'sched_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Clarify wake_up_q()'s write to task->wake_q.next
2025-02-16 10:38:24 -08:00
Linus Torvalds
592c358ea9 Merge tag 'objtool_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:

 - Move a warning about a lld.ld breakage into the verbose setting as
   said breakage has been fixed in the meantime

 - Teach objtool to ignore dangling jump table entries added by Clang

* tag 'objtool_urgent_for_v6.14_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Move dodgy linker warn to verbose
  objtool: Ignore dangling jump table entries
2025-02-16 10:30:58 -08:00
Linus Torvalds
82ff316456 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
 "ARM:

   - Large set of fixes for vector handling, especially in the
     interactions between host and guest state.

     This fixes a number of bugs affecting actual deployments, and
     greatly simplifies the FP/SIMD/SVE handling. Thanks to Mark Rutland
     for dealing with this thankless task.

   - Fix an ugly race between vcpu and vgic creation/init, resulting in
     unexpected behaviours

   - Fix use of kernel VAs at EL2 when emulating timers with nVHE

   - Small set of pKVM improvements and cleanups

  x86:

   - Fix broken SNP support with KVM module built-in, ensuring the PSP
     module is initialized before KVM even when the module
     infrastructure cannot be used to order initcalls

   - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being
     emulated by KVM to fix a NULL pointer dereference

   - Enter guest mode (L2) from KVM's perspective before initializing
     the vCPU's nested NPT MMU so that the MMU is properly tagged for
     L2, not L1

   - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as
     the guest's value may be stale if a VM-Exit is handled in the
     fastpath"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (25 commits)
  x86/sev: Fix broken SNP support with KVM module built-in
  KVM: SVM: Ensure PSP module is initialized if KVM module is built-in
  crypto: ccp: Add external API interface for PSP module initialization
  KVM: arm64: vgic: Hoist SGI/PPI alloc from vgic_init() to kvm_create_vgic()
  KVM: arm64: timer: Drop warning on failed interrupt signalling
  KVM: arm64: Fix alignment of kvm_hyp_memcache allocations
  KVM: arm64: Convert timer offset VA when accessed in HYP code
  KVM: arm64: Simplify warning in kvm_arch_vcpu_load_fp()
  KVM: arm64: Eagerly switch ZCR_EL{1,2}
  KVM: arm64: Mark some header functions as inline
  KVM: arm64: Refactor exit handlers
  KVM: arm64: Refactor CPTR trap deactivation
  KVM: arm64: Remove VHE host restore of CPACR_EL1.SMEN
  KVM: arm64: Remove VHE host restore of CPACR_EL1.ZEN
  KVM: arm64: Remove host FPSIMD saving for non-protected KVM
  KVM: arm64: Unconditionally save+flush host FPSIMD/SVE/SME state
  KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop
  KVM: nSVM: Enter guest mode before initializing nested NPT MMU
  KVM: selftests: Add CPUID tests for Hyper-V features that need in-kernel APIC
  KVM: selftests: Manage CPUID array in Hyper-V CPUID test's core helper
  ...
2025-02-16 10:25:12 -08:00
Linus Torvalds
b878a1c072 Merge tag 'mips-fixes_6.14_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:
 "Fix for o32 ptrace/get_syscall_info"

* tag 'mips-fixes_6.14_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: fix mips_get_syscall_arg() for o32
  MIPS: Export syscall stack arguments properly for remote use
2025-02-16 10:19:41 -08:00
Linus Torvalds
ad1b832bf1 Merge tag 'devicetree-fixes-for-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:

 - Add bindings for QCom QCS8300 clocks, QCom SAR2130P qfprom, and
   powertip,{st7272|hx8238a} displays

 - Fix compatible for TI am62a7 dss

 - Add a kunit test for __of_address_resource_bounds()

* tag 'devicetree-fixes-for-6.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  dt-bindings: display: Add powertip,{st7272|hx8238a} as DT Schema description
  dt-bindings: nvmem: qcom,qfprom: Add SAR2130P compatible
  dt-bindings: display: ti: Fix compatible for am62a7 dss
  of: address: Add kunit test for __of_address_resource_bounds()
  dt-bindings: clock: qcom: Add QCS8300 video clock controller
  dt-bindings: clock: qcom: Add CAMCC clocks for QCS8300
  dt-bindings: clock: qcom: Add GPU clocks for QCS8300
2025-02-15 17:20:39 -08:00
Linus Torvalds
ad73b9a17d Merge tag 'uml-for-linus-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux
Pull UML fixes from Richard Weinberger:

 - Align signal stack correctly

 - Convert to raw spinlocks where needed (irq and virtio)

 - FPU related fixes

* tag 'uml-for-linus-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux:
  um: convert irq_lock to raw spinlock
  um: virtio_uml: use raw spinlock
  um: virt-pci: don't use kmalloc()
  um: fix execve stub execution on old host OSs
  um: properly align signal stack on x86_64
  um: avoid copying FP state from init_task
  um: add back support for FXSAVE registers
2025-02-15 17:14:53 -08:00
Linus Torvalds
5784d8c93e Merge tag 'trace-ring-buffer-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull trace ring buffer fixes from Steven Rostedt:

 - Enable resize on mmap() error

   When a process mmaps a ring buffer, its size is locked and resizing
   is disabled. But if the user passes in a wrong parameter, the mmap()
   can fail after the resize was disabled and the mmap() exits with
   error without reenabling the ring buffer resize. This prevents the
   ring buffer from ever being resized after that. Reenable resizing of
   the ring buffer on mmap() error.

 - Have resizing return proper error and not always -ENOMEM

   If the ring buffer is mmapped by one task and another task tries to
   resize the buffer it will error with -ENOMEM. This is confusing to
   the user as there may be plenty of memory available. Have it return
   the error that actually happens (in this case -EBUSY) where the user
   can understand why the resize failed.

 - Test the sub-buffer array to validate persistent memory buffer

   On boot up, the initialization of the persistent memory buffer will
   do a validation check to see if the content of the data is valid, and
   if so, it will use the memory as is, otherwise it re-initializes it.
   There's meta data in this persistent memory that keeps track of which
   sub-buffer is the reader page and an array that states the order of
   the sub-buffers. The values in this array are indexes into the
   sub-buffers. The validator checks to make sure that all the entries
   in the array are within the sub-buffer list index, but it does not
   check for duplications.

   While working on this code, the array got corrupted and had
   duplicates, where not all the sub-buffers were accounted for. This
   passed the validator as all entries were valid, but the link list was
   incorrect and could have caused a crash. The corruption only produced
   incorrect data, but it could have been more severe. To fix this,
   create a bitmask that covers all the sub-buffer indexes and set it to
   all zeros. While iterating the array checking the values of the array
   content, have it set a bit corresponding to the index in the array.
   If the bit was already set, then it is a duplicate and mark the
   buffer as invalid and reset it.

 - Prevent mmap()ing persistent ring buffer

   The persistent ring buffer uses vmap() to map the persistent memory.
   Currently, the mmap() logic only uses virt_to_page() to get the page
   from the ring buffer memory and use that to map to user space. This
   works because a normal ring buffer uses alloc_page() to allocate its
   memory. But because the persistent ring buffer use vmap() it causes a
   kernel crash.

   Fixing this to work with vmap() is not hard, but since mmap() on
   persistent memory buffers never worked, just have the mmap() return
   -ENODEV (what was returned before mmap() for persistent memory ring
   buffers, as they never supported mmap. Normal buffers will still
   allow mmap(). Implementing mmap() for persistent memory ring buffers
   can wait till the next merge window.

 - Fix polling on persistent ring buffers

   There's a "buffer_percent" option (default set to 50), that is used
   to have reads of the ring buffer binary data block until the buffer
   fills to that percentage. The field "pages_touched" is incremented
   every time a new sub-buffer has content added to it. This field is
   used in the calculations to determine the amount of content is in the
   buffer and if it exceeds the "buffer_percent" then it will wake the
   task polling on the buffer.

   As persistent ring buffers can be created by the content from a
   previous boot, the "pages_touched" field was not updated. This means
   that if a task were to poll on the persistent buffer, it would block
   even if the buffer was completely full. It would block even if the
   "buffer_percent" was zero, because with "pages_touched" as zero, it
   would be calculated as the buffer having no content. Update
   pages_touched when initializing the persistent ring buffer from a
   previous boot.

* tag 'trace-ring-buffer-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ring-buffer: Update pages_touched to reflect persistent buffer content
  tracing: Do not allow mmap() of persistent ring buffer
  ring-buffer: Validate the persistent meta data subbuf array
  tracing: Have the error of __tracing_resize_ring_buffer() passed to user
  ring-buffer: Unlock resize on mmap error
2025-02-15 16:34:41 -08:00
Steven Rostedt
97937834ae ring-buffer: Update pages_touched to reflect persistent buffer content
The pages_touched field represents the number of subbuffers in the ring
buffer that have content that can be read. This is used in accounting of
"dirty_pages" and "buffer_percent" to allow the user to wait for the
buffer to be filled to a certain amount before it reads the buffer in
blocking mode.

The persistent buffer never updated this value so it was set to zero, and
this accounting would take it as it had no content. This would cause user
space to wait for content even though there's enough content in the ring
buffer that satisfies the buffer_percent.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/20250214123512.0631436e@gandalf.local.home
Fixes: 5f3b6e839f ("ring-buffer: Validate boot range memory events")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-15 14:00:59 -05:00
Steven Rostedt
129fe71881 tracing: Do not allow mmap() of persistent ring buffer
When trying to mmap a trace instance buffer that is attached to
reserve_mem, it would crash:

 BUG: unable to handle page fault for address: ffffe97bd00025c8
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 2862f3067 P4D 2862f3067 PUD 0
 Oops: Oops: 0000 [#1] PREEMPT_RT SMP PTI
 CPU: 4 UID: 0 PID: 981 Comm: mmap-rb Not tainted 6.14.0-rc2-test-00003-g7f1a5e3fbf9e-dirty #233
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
 RIP: 0010:validate_page_before_insert+0x5/0xb0
 Code: e2 01 89 d0 c3 cc cc cc cc 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 <48> 8b 46 08 a8 01 75 67 66 90 48 89 f0 8b 50 34 85 d2 74 76 48 89
 RSP: 0018:ffffb148c2f3f968 EFLAGS: 00010246
 RAX: ffff9fa5d3322000 RBX: ffff9fa5ccff9c08 RCX: 00000000b879ed29
 RDX: ffffe97bd00025c0 RSI: ffffe97bd00025c0 RDI: ffff9fa5ccff9c08
 RBP: ffffb148c2f3f9f0 R08: 0000000000000004 R09: 0000000000000004
 R10: 0000000000000000 R11: 0000000000000200 R12: 0000000000000000
 R13: 00007f16a18d5000 R14: ffff9fa5c48db6a8 R15: 0000000000000000
 FS:  00007f16a1b54740(0000) GS:ffff9fa73df00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffe97bd00025c8 CR3: 00000001048c6006 CR4: 0000000000172ef0
 Call Trace:
  <TASK>
  ? __die_body.cold+0x19/0x1f
  ? __die+0x2e/0x40
  ? page_fault_oops+0x157/0x2b0
  ? search_module_extables+0x53/0x80
  ? validate_page_before_insert+0x5/0xb0
  ? kernelmode_fixup_or_oops.isra.0+0x5f/0x70
  ? __bad_area_nosemaphore+0x16e/0x1b0
  ? bad_area_nosemaphore+0x16/0x20
  ? do_kern_addr_fault+0x77/0x90
  ? exc_page_fault+0x22b/0x230
  ? asm_exc_page_fault+0x2b/0x30
  ? validate_page_before_insert+0x5/0xb0
  ? vm_insert_pages+0x151/0x400
  __rb_map_vma+0x21f/0x3f0
  ring_buffer_map+0x21b/0x2f0
  tracing_buffers_mmap+0x70/0xd0
  __mmap_region+0x6f0/0xbd0
  mmap_region+0x7f/0x130
  do_mmap+0x475/0x610
  vm_mmap_pgoff+0xf2/0x1d0
  ksys_mmap_pgoff+0x166/0x200
  __x64_sys_mmap+0x37/0x50
  x64_sys_call+0x1670/0x1d70
  do_syscall_64+0xbb/0x1d0
  entry_SYSCALL_64_after_hwframe+0x77/0x7f

The reason was that the code that maps the ring buffer pages to user space
has:

	page = virt_to_page((void *)cpu_buffer->subbuf_ids[s]);

And uses that in:

	vm_insert_pages(vma, vma->vm_start, pages, &nr_pages);

But virt_to_page() does not work with vmap()'d memory which is what the
persistent ring buffer has. It is rather trivial to allow this, but for
now just disable mmap() of instances that have their ring buffer from the
reserve_mem option.

If an mmap() is performed on a persistent buffer it will return -ENODEV
just like it would if the .mmap field wasn't defined in the
file_operations structure.

Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Vincent Donnefort <vdonnefort@google.com>
Link: https://lore.kernel.org/20250214115547.0d7287d3@gandalf.local.home
Fixes: 9b7bdf6f6e ("tracing: Have trace_printk not use binary prints if boot buffer")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-02-15 13:59:52 -05:00
Linus Torvalds
496659003d Merge tag 'i2c-for-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "MAINTAINERS maintenance.

  Changed email, added entry, deleted entry falling back to a generic
  one"

* tag 'i2c-for-6.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  MAINTAINERS: Add maintainer for Qualcomm's I2C GENI driver
  MAINTAINERS: delete entry for AXXIA I2C
  MAINTAINERS: Use my kernel.org address for I2C ACPI work
2025-02-15 10:20:47 -08:00
Linus Torvalds
f3d8b0ebae Merge tag 's390-6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Fix isolated VFs handling by verifying that a VF’s parent PF is
   locally owned before registering it in an existing PCI domain

 - Disable arch_test_bit() optimization for PROFILE_ALL_BRANCHES to
   workaround gcc failure in handling __builtin_constant_p() in this
   case

 - Fix CHPID "configure" attribute caching in CIO by not updating the
   cache when SCLP returns no data, ensuring consistent sysfs output

 - Remove CONFIG_LSM from default configs and rely on defaults, which
   enables BPF LSM hook

* tag 's390-6.14-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/pci: Fix handling of isolated VFs
  s390/pci: Pull search for parent PF out of zpci_iov_setup_virtfn()
  s390/bitops: Disable arch_test_bit() optimization for PROFILE_ALL_BRANCHES
  s390/cio: Fix CHPID "configure" attribute caching
  s390/configs: Remove CONFIG_LSM
2025-02-15 10:15:24 -08:00
Uwe Kleine-König
b28fb1f2ef modpost: Fix a few typos in a comment
Namely: s/becasue/because/ and s/wiht/with/ plus an added article.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-02-16 03:10:58 +09:00
Thomas Weißschuh
1b71c2fb04 kbuild: userprogs: fix bitsize and target detection on clang
scripts/Makefile.clang was changed in the linked commit to move --target from
KBUILD_CFLAGS to KBUILD_CPPFLAGS, as that generally has a broader scope.
However that variable is not inspected by the userprogs logic,
breaking cross compilation on clang.

Use both variables to detect bitsize and target arguments for userprogs.

Fixes: feb843a469 ("kbuild: add $(CLANG_FLAGS) to KBUILD_CPPFLAGS")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-02-16 03:10:58 +09:00
Linus Torvalds
243899076c Merge tag 'rust-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux
Pull rust fixes from Miguel Ojeda:

 - Fix objtool warning due to future Rust 1.85.0 (to be released in a
   few days)

 - Clean future Rust 1.86.0 (to be released 2025-04-03) Clippy warning

* tag 'rust-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux:
  rust: rbtree: fix overindented list item
  objtool/rust: add one more `noreturn` Rust function
2025-02-15 09:54:46 -08:00
Linus Torvalds
d440148418 tegra210-adma: fix 32-bit x86 build
The Tegra210 Audio DMA controller driver did a plain divide:

	page_no = (res_page->start - res_base->start) / cdata->ch_base_offset;

which causes problems on 32-bit x86 configurations that have 64-bit
resource sizes:

  x86_64-linux-ld: drivers/dma/tegra210-adma.o: in function `tegra_adma_probe':
  tegra210-adma.c:(.text+0x1322): undefined reference to `__udivdi3'

because gcc doesn't generate the trivial code for a 64-by-32 divide,
turning it into a function call to do a full 64-by-64 divide.  And the
kernel intentionally doesn't provide that helper function, because 99%
of the time all you want is the narrower version.

Of course, tegra210 is a 64-bit architecture and the 32-bit x86 build is
purely for build testing, so this really is just about build coverage
failure.

But build coverage is good.

Side note: div_u64() would be suboptimal if you actually have a 32-bit
resource_t, so our "helper" for divides are admittedly making it harder
than it should be to generate good code for all the possible cases.

At some point, I'll consider 32-bit x86 so entirely legacy that I can't
find it in myself to care any more, and we'll just add the __udivdi3
library function.

But for now, the right thing to do is to use "div_u64()" to show that
you know that you are doing the simpler divide with a 32-bit number.
And the build error enforces that.

While fixing the build issue, also check for division-by-zero, and for
overflow.  Which hopefully cannot happen on real production hardware,
but the value of 'ch_base_offset' can definitely be zero in other
places.

Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-02-15 09:28:55 -08:00
Sebastian Andrzej Siewior
741c10b096 kernfs: Use RCU to access kernfs_node::name.
Using RCU lifetime rules to access kernfs_node::name can avoid the
trouble with kernfs_rename_lock in kernfs_name() and kernfs_path_from_node()
if the fs was created with KERNFS_ROOT_INVARIANT_PARENT. This is usefull
as it allows to implement kernfs_path_from_node() only with RCU
protection and avoiding kernfs_rename_lock. The lock is only required if
the __parent node can be changed and the function requires an unchanged
hierarchy while it iterates from the node to its parent.
The change is needed to allow the lookup of the node's path
(kernfs_path_from_node()) from context which runs always with disabled
preemption and or interrutps even on PREEMPT_RT. The problem is that
kernfs_rename_lock becomes a sleeping lock on PREEMPT_RT.

I went through all ::name users and added the required access for the lookup
with a few extensions:
- rdtgroup_pseudo_lock_create() drops all locks and then uses the name
  later on. resctrl supports rename with different parents. Here I made
  a temporal copy of the name while it is used outside of the lock.

- kernfs_rename_ns() accepts NULL as new_parent. This simplifies
  sysfs_move_dir_ns() where it can set NULL in order to reuse the current
  name.

- kernfs_rename_ns() is only using kernfs_rename_lock if the parents are
  different. All users use either kernfs_rwsem (for stable path view) or
  just RCU for the lookup. The ::name uses always RCU free.

Use RCU lifetime guarantees to access kernfs_node::name.

Suggested-by: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Reported-by: syzbot+6ea37e2e6ffccf41a7e6@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/67251dc6.050a0220.529b6.015e.GAE@google.com/
Reported-by: Hillf Danton <hdanton@sina.com>
Closes: https://lore.kernel.org/20241102001224.2789-1-hdanton@sina.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-7-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Sebastian Andrzej Siewior
633488947e kernfs: Use RCU to access kernfs_node::parent.
kernfs_rename_lock is used to obtain stable kernfs_node::{name|parent}
pointer. This is a preparation to access kernfs_node::parent under RCU
and ensure that the pointer remains stable under the RCU lifetime
guarantees.

For a complete path, as it is done in kernfs_path_from_node(), the
kernfs_rename_lock is still required in order to obtain a stable parent
relationship while computing the relevant node depth. This must not
change while the nodes are inspected in order to build the path.
If the kernfs user never moves the nodes (changes the parent) then the
kernfs_rename_lock is not required and the RCU guarantees are
sufficient. This "restriction" can be set with
KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required.

Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU
access and use RCU accessor while accessing the node.
Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can
not change.

Acked-by: Tejun Heo <tj@kernel.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Sebastian Andrzej Siewior
9aab10a024 kernfs: Don't re-lock kernfs_root::kernfs_rwsem in kernfs_fop_readdir().
The readdir operation iterates over all entries and invokes dir_emit()
for every entry passing kernfs_node::name as argument.
Since the name argument can change, and become invalid, the
kernfs_root::kernfs_rwsem lock should not be dropped to prevent renames
during the operation.

The lock drop around dir_emit() has been initially introduced in commit
   1e5289c97b ("sysfs: Cache the last sysfs_dirent to improve readdir scalability v2")

to avoid holding a global lock during a page fault. The lock drop is
wrong since the support of renames and not a big burden since the lock
is no longer global.

Don't re-acquire kernfs_root::kernfs_rwsem while copying the name to the
userpace buffer.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-5-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Sebastian Andrzej Siewior
5b2fabf7fe kernfs: Acquire kernfs_rwsem in kernfs_node_dentry().
kernfs_node_dentry() passes kernfs_node::name to
lookup_positive_unlocked().

Acquire kernfs_root::kernfs_rwsem to ensure the node is not renamed
during the operation.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-4-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Sebastian Andrzej Siewior
122ab92dee kernfs: Acquire kernfs_rwsem in kernfs_get_parent_dentry().
kernfs_get_parent_dentry() passes kernfs_node::parent to
kernfs_get_inode().

Acquire kernfs_root::kernfs_rwsem to ensure kernfs_node::parent isn't
replaced during the operation.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-3-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Sebastian Andrzej Siewior
400188ae36 kernfs: Acquire kernfs_rwsem in kernfs_notify_workfn().
kernfs_notify_workfn() dereferences kernfs_node::name and passes it
later to fsnotify(). If the node is renamed then the previously observed
name pointer becomes invalid.

Acquire kernfs_root::kernfs_rwsem to block renames of the node.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20250213145023.2820193-2-bigeasy@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15 17:46:32 +01:00
Linus Torvalds
6452feaf29 Merge tag 'gpio-fixes-for-v6.14-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux
Pull gpio fixes from Bartosz Golaszewski:

 - fix interrupt handling issues in gpio-bcm-kona

 - add an ACPI quirk for Acer Nitro ANV14 fixing an issue with spurious
   wake up events

 - add missing return value checks to gpio-stmpe

 - fix a crash in error path in gpiochip_get_ngpios()

* tag 'gpio-fixes-for-v6.14-rc3-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux:
  gpiolib: Fix crash on error in gpiochip_get_ngpios()
  gpio: stmpe: Check return value of stmpe_reg_read in stmpe_gpio_irq_sync_unlock
  gpiolib: acpi: Add a quirk for Acer Nitro ANV14
  gpio: bcm-kona: Add missing newline to dev_err format string
  gpio: bcm-kona: Make sure GPIO bits are unlocked when requesting IRQ
  gpio: bcm-kona: Fix GPIO lock/unlock for banks above bank 0
2025-02-15 08:13:45 -08:00
Masahiro Yamada
140332b6ed kbuild: fix linux-headers package build when $(CC) cannot link userspace
Since commit 5f73e7d038 ("kbuild: refactor cross-compiling
linux-headers package"), the linux-headers Debian package fails to
build when $(CC) cannot build userspace applications, for example,
when using toolchains installed by the 0day bot.

The host programs in the linux-headers package should be rebuilt using
the disto's cross-compiler, ${DEB_HOST_GNU_TYPE}-gcc instead of $(CC).
Hence, the variable 'CC' must be expanded in this shell script instead
of in the top-level Makefile.

Commit f354fc88a7 ("kbuild: install-extmod-build: add missing
quotation marks for CC variable") was not a correct fix because
CC="ccache gcc" should be unrelated when rebuilding userspace tools.

Fixes: 5f73e7d038 ("kbuild: refactor cross-compiling linux-headers package")
Reported-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
Closes: https://lore.kernel.org/linux-kbuild/CAK7LNARb3xO3ptBWOMpwKcyf3=zkfhMey5H2KnB1dOmUwM79dA@mail.gmail.com/T/#t
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Jeff Johnson <jeff.johnson@oss.qualcomm.com>
2025-02-15 22:40:52 +09:00
Masahiro Yamada
d1d0963121 tools: fix annoying "mkdir -p ..." logs when building tools in parallel
When CONFIG_OBJTOOL=y or CONFIG_DEBUG_INFO_BTF=y, parallel builds
show awkward "mkdir -p ..." logs.

  $ make -j16
    [ snip ]
  mkdir -p /home/masahiro/ref/linux/tools/objtool && make O=/home/masahiro/ref/linux subdir=tools/objtool --no-print-directory -C objtool
  mkdir -p /home/masahiro/ref/linux/tools/bpf/resolve_btfids && make O=/home/masahiro/ref/linux subdir=tools/bpf/resolve_btfids --no-print-directory -C bpf/resolve_btfids

Defining MAKEFLAGS=<value> on the command line wipes out command line
switches from the resultant MAKEFLAGS definition, even though the command
line switches are active. [1]

MAKEFLAGS puts all single-letter options into the first word, and that
word will be empty if no single-letter options were given. [2]
However, this breaks if MAKEFLAGS=<value> is given on the command line.

The tools/ and tools/% targets set MAKEFLAGS=<value> on the command
line, which breaks the following code in tools/scripts/Makefile.include:

    short-opts := $(firstword -$(MAKEFLAGS))

If MAKEFLAGS really needs modification, it should be done through the
environment variable, as follows:

    MAKEFLAGS=<value> $(MAKE) ...

That said, I question whether modifying MAKEFLAGS is necessary here.
The only flag we might want to exclude is --no-print-directory, as the
tools build system changes the working directory. However, people might
find the "Entering/Leaving directory" logs annoying.

I simply removed the offending MAKEFLAGS=<value>.

[1]: https://savannah.gnu.org/bugs/?62469
[2]: https://www.gnu.org/software/make/manual/make.html#Testing-Flags

Fixes: ea01fa9f63 ("tools: Connect to the kernel build system")
Fixes: a50e433327 ("perf tools: Honor parallel jobs")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: Daniel Xu <dxu@dxuuu.xyz>
2025-02-15 22:36:10 +09:00
Linus Torvalds
7ff71e6d92 Merge tag 'alpha-fixes-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha
Pull alpha fixes from Matt Turner:
 "A few changes for alpha, including some important fixes for kernel
  stack alignment"

* tag 'alpha-fixes-v6.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mattst88/alpha:
  alpha: Use str_yes_no() helper in pci_dac_dma_supported()
  alpha: Replace one-element array with flexible array member
  alpha: align stack for page fault and user unaligned trap handlers
  alpha: make stack 16-byte aligned (most cases)
  alpha: replace hardcoded stack offsets with autogenerated ones
2025-02-14 19:56:12 -08:00
Linus Torvalds
78a632a208 Merge tag 'pci-v6.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:

 - Update a BUILD_BUG_ON() usage that works on current compilers, but
   breaks compilation on gcc 5.3.1 (Alex Williamson)

 - Avoid use of FLR for Mediatek MT7922 WiFi; the device previously
   worked after a long timeout and fallback to SBR, but after a recent
   RRS change it doesn't work at all after FLR (Bjorn Helgaas)

* tag 'pci-v6.14-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI: Avoid FLR for Mediatek MT7922 WiFi
  PCI: Fix BUILD_BUG_ON usage for old gcc
2025-02-14 16:49:07 -08:00
Paolo Bonzini
d3d0b8dfe0 Merge tag 'kvm-x86-fixes-6.14-rcN' of https://github.com/kvm-x86/linux into HEAD
KVM fixes for 6.14 part 1

 - Reject Hyper-V SEND_IPI hypercalls if the local APIC isn't being emulated
   by KVM to fix a NULL pointer dereference.

 - Enter guest mode (L2) from KVM's perspective before initializing the vCPU's
   nested NPT MMU so that the MMU is properly tagged for L2, not L1.

 - Load the guest's DR6 outside of the innermost .vcpu_run() loop, as the
   guest's value may be stale if a VM-Exit is handled in the fastpath.
2025-02-14 19:08:35 -05:00
Ashish Kalra
409f45387c x86/sev: Fix broken SNP support with KVM module built-in
Fix issues with enabling SNP host support and effectively SNP support
which is broken with respect to the KVM module being built-in.

SNP host support is enabled in snp_rmptable_init() which is invoked as
device_initcall(). SNP check on IOMMU is done during IOMMU PCI init
(IOMMU_PCI_INIT stage). And for that reason snp_rmptable_init() is
currently invoked via device_initcall() and cannot be invoked via
subsys_initcall() as core IOMMU subsystem gets initialized via
subsys_initcall().

Now, if kvm_amd module is built-in, it gets initialized before SNP host
support is enabled in snp_rmptable_init() :

[   10.131811] kvm_amd: TSC scaling supported
[   10.136384] kvm_amd: Nested Virtualization enabled
[   10.141734] kvm_amd: Nested Paging enabled
[   10.146304] kvm_amd: LBR virtualization supported
[   10.151557] kvm_amd: SEV enabled (ASIDs 100 - 509)
[   10.156905] kvm_amd: SEV-ES enabled (ASIDs 1 - 99)
[   10.162256] kvm_amd: SEV-SNP enabled (ASIDs 1 - 99)
[   10.171508] kvm_amd: Virtual VMLOAD VMSAVE supported
[   10.177052] kvm_amd: Virtual GIF supported
...
...
[   10.201648] kvm_amd: in svm_enable_virtualization_cpu

And then svm_x86_ops->enable_virtualization_cpu()
(svm_enable_virtualization_cpu) programs MSR_VM_HSAVE_PA as following:
wrmsrl(MSR_VM_HSAVE_PA, sd->save_area_pa);

So VM_HSAVE_PA is non-zero before SNP support is enabled on all CPUs.

snp_rmptable_init() gets invoked after svm_enable_virtualization_cpu()
as following :
...
[   11.256138] kvm_amd: in svm_enable_virtualization_cpu
...
[   11.264918] SEV-SNP: in snp_rmptable_init

This triggers a #GP exception in snp_rmptable_init() when snp_enable()
is invoked to set SNP_EN in SYSCFG MSR:

[   11.294289] unchecked MSR access error: WRMSR to 0xc0010010 (tried to write 0x0000000003fc0000) at rIP: 0xffffffffaf5d5c28 (native_write_msr+0x8/0x30)
...
[   11.294404] Call Trace:
[   11.294482]  <IRQ>
[   11.294513]  ? show_stack_regs+0x26/0x30
[   11.294522]  ? ex_handler_msr+0x10f/0x180
[   11.294529]  ? search_extable+0x2b/0x40
[   11.294538]  ? fixup_exception+0x2dd/0x340
[   11.294542]  ? exc_general_protection+0x14f/0x440
[   11.294550]  ? asm_exc_general_protection+0x2b/0x30
[   11.294557]  ? __pfx_snp_enable+0x10/0x10
[   11.294567]  ? native_write_msr+0x8/0x30
[   11.294570]  ? __snp_enable+0x5d/0x70
[   11.294575]  snp_enable+0x19/0x20
[   11.294578]  __flush_smp_call_function_queue+0x9c/0x3a0
[   11.294586]  generic_smp_call_function_single_interrupt+0x17/0x20
[   11.294589]  __sysvec_call_function+0x20/0x90
[   11.294596]  sysvec_call_function+0x80/0xb0
[   11.294601]  </IRQ>
[   11.294603]  <TASK>
[   11.294605]  asm_sysvec_call_function+0x1f/0x30
...
[   11.294631]  arch_cpu_idle+0xd/0x20
[   11.294633]  default_idle_call+0x34/0xd0
[   11.294636]  do_idle+0x1f1/0x230
[   11.294643]  ? complete+0x71/0x80
[   11.294649]  cpu_startup_entry+0x30/0x40
[   11.294652]  start_secondary+0x12d/0x160
[   11.294655]  common_startup_64+0x13e/0x141
[   11.294662]  </TASK>

This #GP exception is getting triggered due to the following errata for
AMD family 19h Models 10h-1Fh Processors:

Processor may generate spurious #GP(0) Exception on WRMSR instruction:
Description:
The Processor will generate a spurious #GP(0) Exception on a WRMSR
instruction if the following conditions are all met:
- the target of the WRMSR is a SYSCFG register.
- the write changes the value of SYSCFG.SNPEn from 0 to 1.
- One of the threads that share the physical core has a non-zero
value in the VM_HSAVE_PA MSR.

The document being referred to above:
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf

To summarize, with kvm_amd module being built-in, KVM/SVM initialization
happens before host SNP is enabled and this SVM initialization
sets VM_HSAVE_PA to non-zero, which then triggers a #GP when
SYSCFG.SNPEn is being set and this will subsequently cause
SNP_INIT(_EX) to fail with INVALID_CONFIG error as SYSCFG[SnpEn] is not
set on all CPUs.

Essentially SNP host enabling code should be invoked before KVM
initialization, which is currently not the case when KVM is built-in.

Add fix to call snp_rmptable_init() early from iommu_snp_enable()
directly and not invoked via device_initcall() which enables SNP host
support before KVM initialization with kvm_amd module built-in.

Add additional handling for `iommu=off` or `amd_iommu=off` options.

Note that IOMMUs need to be enabled for SNP initialization, therefore,
if host SNP support is enabled but late IOMMU initialization fails
then that will cause PSP driver's SNP_INIT to fail as IOMMU SNP sanity
checks in SNP firmware will fail with invalid configuration error as
below:

[    9.723114] ccp 0000:23:00.1: sev enabled
[    9.727602] ccp 0000:23:00.1: psp enabled
[    9.732527] ccp 0000:a2:00.1: enabling device (0000 -> 0002)
[    9.739098] ccp 0000:a2:00.1: no command queues available
[    9.745167] ccp 0000:a2:00.1: psp enabled
[    9.805337] ccp 0000:23:00.1: SEV-SNP: failed to INIT rc -5, error 0x3
[    9.866426] ccp 0000:23:00.1: SEV API:1.53 build:5

Fixes: c3b86e61b7 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature")
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Acked-by: Joerg Roedel <jroedel@suse.de>
Message-ID: <138b520fb83964782303b43ade4369cd181fdd9c.1739226950.git.ashish.kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-14 18:39:19 -05:00
Sean Christopherson
44e70718df KVM: SVM: Ensure PSP module is initialized if KVM module is built-in
The kernel's initcall infrastructure lacks the ability to express
dependencies between initcalls, whereas the modules infrastructure
automatically handles dependencies via symbol loading.  Ensure the
PSP SEV driver is initialized before proceeding in sev_hardware_setup()
if KVM is built-in as the dependency isn't handled by the initcall
infrastructure.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
Message-ID: <f78ddb64087df27e7bcb1ae0ab53f55aa0804fab.1739226950.git.ashish.kalra@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2025-02-14 18:39:19 -05:00