Commit Graph

1324895 Commits

Author SHA1 Message Date
Christian Brauner
1623bc27a8 Merge branch 'vfs-6.14.poll' into vfs.fixes
Bring in the fixes for __pollwait() and waitqueue_active() interactions.

Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 12:01:21 +01:00
Christian Brauner
67cd2e23c0 Merge patch series "poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()"
Oleg Nesterov <oleg@redhat.com> says:

The waitqueue_active() helper can only be used if both waker and waiter
have memory barriers that pair with each other. But __pollwait() is
broken in this respect. Fix it.

* patches from https://lore.kernel.org/r/20250107162649.GA18886@redhat.com:
  poll: kill poll_does_not_wait()
  sock_poll_wait: kill the no longer necessary barrier after poll_wait()
  io_uring_poll: kill the no longer necessary barrier after poll_wait()
  poll_wait: kill the obsolete wait_address check
  poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()

Link: https://lore.kernel.org/r/20250107162649.GA18886@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:59:45 +01:00
Oleg Nesterov
f005bf18a5 poll: kill poll_does_not_wait()
It no longer has users.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162743.GA18947@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:59:00 +01:00
Oleg Nesterov
b2849867b3 sock_poll_wait: kill the no longer necessary barrier after poll_wait()
Now that poll_wait() provides a full barrier we can remove smp_mb() from
sock_poll_wait().

Also, the poll_does_not_wait() check before poll_wait() just adds the
unnecessary confusion, kill it. poll_wait() does the same "p && p->_qproc"
check.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162736.GA18944@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:59:00 +01:00
Oleg Nesterov
4e15fa8305 io_uring_poll: kill the no longer necessary barrier after poll_wait()
Now that poll_wait() provides a full barrier we can remove smp_rmb() from
io_uring_poll().

In fact I don't think smp_rmb() was correct, it can't serialize LOADs and
STOREs.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162730.GA18940@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:58:59 +01:00
Oleg Nesterov
10b02a2cfe poll_wait: kill the obsolete wait_address check
This check is historical and no longer needed, wait_address is never NULL.
These days we rely on the poll_table->_qproc check. NULL if select/poll
is not going to sleep, or it already has a data to report, or all waiters
have already been registered after the 1st iteration.

However, poll_table *p can be NULL, see p9_fd_poll() for example, so we
can't remove the "p != NULL" check.

Link: https://lore.kernel.org/all/20250106180325.GF7233@redhat.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162724.GA18926@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:58:59 +01:00
Oleg Nesterov
cacd9ae4bf poll_wait: add mb() to fix theoretical race between waitqueue_active() and .poll()
As the comment above waitqueue_active() explains, it can only be used
if both waker and waiter have mb()'s that pair with each other. However
__pollwait() is broken in this respect.

This is not pipe-specific, but let's look at pipe_poll() for example:

	poll_wait(...); // -> __pollwait() -> add_wait_queue()

	LOAD(pipe->head);
	LOAD(pipe->head);

In theory these LOAD()'s can leak into the critical section inside
add_wait_queue() and can happen before list_add(entry, wq_head), in this
case pipe_poll() can race with wakeup_pipe_readers/writers which do

	smp_mb();
	if (waitqueue_active(wq_head))
		wake_up_interruptible(wq_head);

There are more __pollwait()-like functions (grep init_poll_funcptr), and
it seems that at least ep_ptable_queue_proc() has the same problem, so the
patch adds smp_mb() into poll_wait().

Link: https://lore.kernel.org/all/20250102163320.GA17691@redhat.com/
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Link: https://lore.kernel.org/r/20250107162717.GA18922@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-10 11:58:59 +01:00
Lizhi Xu
17a4fde81d afs: Fix merge preference rule failure condition
syzbot reported a lock held when returning to userspace[1].  This is
because if argc is less than 0 and the function returns directly, the held
inode lock is not released.

Fix this by store the error in ret and jump to done to clean up instead of
returning directly.

[dh: Modified Lizhi Xu's original patch to make it honour the error code
from afs_split_string()]

[1]
WARNING: lock held when returning to user space!
6.13.0-rc3-syzkaller-00209-g499551201b5f #0 Not tainted
------------------------------------------------
syz-executor133/5823 is leaving the kernel with locks still held!
1 lock held by syz-executor133/5823:
 #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: inode_lock include/linux/fs.h:818 [inline]
 #0: ffff888071cffc00 (&sb->s_type->i_mutex_key#9){++++}-{4:4}, at: afs_proc_addr_prefs_write+0x2bb/0x14e0 fs/afs/addr_prefs.c:388

Reported-by: syzbot+76f33569875eb708e575@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=76f33569875eb708e575
Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/20241226012616.2348907-1-lizhi.xu@windriver.com/
Link: https://lore.kernel.org/r/529850.1736261552@warthog.procyon.org.uk
Tested-by: syzbot+76f33569875eb708e575@syzkaller.appspotmail.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 17:21:41 +01:00
David Howells
904abff4b1 netfs: Fix read-retry for fs with no ->prepare_read()
Fix netfslib's read-retry to only call ->prepare_read() in the backing
filesystem such a function is provided.  We can get to this point if a
there's an active cache as failed reads from the cache need negotiating
with the server instead.

Fixes: ee4cdf7ba8 ("netfs: Speed up buffered reading")
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/529329.1736261010@warthog.procyon.org.uk
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 17:20:04 +01:00
David Howells
3f6bc9e3ab netfs: Fix kernel async DIO
Netfslib needs to be able to handle kernel-initiated asynchronous DIO that
is supplied with a bio_vec[] array.  Currently, because of the async flag,
this gets passed to netfs_extract_user_iter() which throws a warning and
fails because it only handles IOVEC and UBUF iterators.  This can be
triggered through a combination of cifs and a loopback blockdev with
something like:

        mount //my/cifs/share /foo
        dd if=/dev/zero of=/foo/m0 bs=4K count=1K
        losetup --sector-size 4096 --direct-io=on /dev/loop2046 /foo/m0
        echo hello >/dev/loop2046

This causes the following to appear in syslog:

        WARNING: CPU: 2 PID: 109 at fs/netfs/iterator.c:50 netfs_extract_user_iter+0x170/0x250 [netfs]

and the write to fail.

Fix this by removing the check in netfs_unbuffered_write_iter_locked() that
causes async kernel DIO writes to be handled as userspace writes.  Note
that this change relies on the kernel caller maintaining the existence of
the bio_vec array (or kvec[] or folio_queue) until the op is complete.

Fixes: 153a9961b5 ("netfs: Implement unbuffered/DIO write support")
Reported-by: Nicolas Baranger <nicolas.baranger@3xo.fr>
Closes: https://lore.kernel.org/r/fedd8a40d54b2969097ffa4507979858@3xo.fr/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/608725.1736275167@warthog.procyon.org.uk
Tested-by: Nicolas Baranger <nicolas.baranger@3xo.fr>
Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
cc: Steve French <smfrench@gmail.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: netfs@lists.linux.dev
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 17:18:51 +01:00
Christian Brauner
482d520d86 Merge tag 'vfs-6.14-rc7.mount.fixes'
Bring in the fix for the mount namespace rbtree. It is used as the base
for the vfs mount work for this cycle and so shouldn't be applied
directly.

Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 17:03:21 +01:00
Christian Brauner
344bac8f0d fs: kill MNT_ONRB
Move mnt->mnt_node into the union with mnt->mnt_rcu and mnt->mnt_llist
instead of keeping it with mnt->mnt_list. This allows us to use
RB_CLEAR_NODE(&mnt->mnt_node) in umount_tree() as well as
list_empty(&mnt->mnt_node). That in turn allows us to remove MNT_ONRB.

This also fixes the bug reported in [1] where seemingly MNT_ONRB wasn't
set in @mnt->mnt_flags even though the mount was present in the mount
rbtree of the mount namespace.

The root cause is the following race. When a btrfs subvolume is mounted
a temporary mount is created:

btrfs_get_tree_subvol()
{
        mnt = fc_mount()
        // Register the newly allocated mount with sb->mounts:
        lock_mount_hash();
        list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
        unlock_mount_hash();
}

and registered on sb->s_mounts. Later it is added to an anonymous mount
namespace via mount_subvol():

-> mount_subvol()
   -> mount_subtree()
      -> alloc_mnt_ns()
         mnt_add_to_ns()
         vfs_path_lookup()
         put_mnt_ns()

The mnt_add_to_ns() call raises MNT_ONRB in @mnt->mnt_flags. If someone
concurrently does a ro remount:

reconfigure_super()
-> sb_prepare_remount_readonly()
   {
           list_for_each_entry(mnt, &sb->s_mounts, mnt_instance) {
   }

all mounts registered in sb->s_mounts are visited and first
MNT_WRITE_HOLD is raised, then MNT_READONLY is raised, and finally
MNT_WRITE_HOLD is removed again.

The flag modification for MNT_WRITE_HOLD/MNT_READONLY and MNT_ONRB race
so MNT_ONRB might be lost.

Fixes: 2eea9ce431 ("mounts: keep list of mounts in an rbtree")
Cc: <stable@kernel.org> # v6.8+
Link: https://lore.kernel.org/r/20241215-vfs-6-14-mount-work-v1-1-fd55922c4af8@kernel.org
Link: https://lore.kernel.org/r/ec6784ed-8722-4695-980a-4400d4e7bd1a@gmx.com [1]
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 16:58:50 +01:00
Marco Nelissen
c13094b894 iomap: avoid avoid truncating 64-bit offset to 32 bits
on 32-bit kernels, iomap_write_delalloc_scan() was inadvertently using a
32-bit position due to folio_next_index() returning an unsigned long.
This could lead to an infinite loop when writing to an xfs filesystem.

Signed-off-by: Marco Nelissen <marco.nelissen@gmail.com>
Link: https://lore.kernel.org/r/20250109041253.2494374-1-marco.nelissen@gmail.com
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-09 16:09:20 +01:00
David Howells
8fd56ad6e7 afs: Fix the maximum cell name length
The kafs filesystem limits the maximum length of a cell to 256 bytes, but a
problem occurs if someone actually does that: kafs tries to create a
directory under /proc/net/afs/ with the name of the cell, but that fails
with a warning:

        WARNING: CPU: 0 PID: 9 at fs/proc/generic.c:405

because procfs limits the maximum filename length to 255.

However, the DNS limits the maximum lookup length and, by extension, the
maximum cell name, to 255 less two (length count and trailing NUL).

Fix this by limiting the maximum acceptable cellname length to 253.  This
also allows us to be sure we can create the "/afs/.<cell>/" mountpoint too.

Further, split the YFS VL record cell name maximum to be the 256 allowed by
the protocol and ignore the record retrieved by YFSVL.GetCellName if it
exceeds 253.

Fixes: c3e9f88826 ("afs: Implement client support for the YFSVL.GetCellName RPC op")
Reported-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/6776d25d.050a0220.3a8527.0048.GAE@google.com/
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://lore.kernel.org/r/376236.1736180460@warthog.procyon.org.uk
Tested-by: syzbot+7848fee1f1e5c53f912b@syzkaller.appspotmail.com
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-07 15:55:25 +01:00
Christian Brauner
3ff93c5935 Merge tag 'fuse-fixes-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Pull fuse fixes from Miklos Szeredi <mszeredi@redhat.com>:

- Fix fuse_get_user_pages() allocation failure handling.

- Fix direct-io folio offset and length calculation.

* tag 'fuse-fixes-6.13-rc7' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: Set *nbytesp=0 in fuse_get_user_pages on allocation failure
  fuse: fix direct io folio offset and length calculation

Link: https://lore.kernel.org/r/CAJfpegu7o_X%3DSBWk_C47dUVUQ1mJZDEGe1MfD0N3wVJoUBWdmg@mail.gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-07 15:43:07 +01:00
Linus Torvalds
fbfd64d25c Merge tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:

 - Relax assertions on failure to encode file handles

   The ->encode_fh() method can fail for various reasons. None of them
   warrant a WARN_ON().

 - Fix overlayfs file handle encoding by allowing encoding an fid from
   an inode without an alias

 - Make sure fuse_dir_open() handles FOPEN_KEEP_CACHE. If it's not
   specified fuse needs to invaludate the directory inode page cache

 - Fix qnx6 so it builds with gcc-15

 - Various fixes for netfslib and ceph and nfs filesystems:
     - Ignore silly rename files from afs and nfs when building header
       archives
     - Fix read result collection in netfslib with multiple subrequests
     - Handle ENOMEM for netfslib buffered reads
     - Fix oops in nfs_netfs_init_request()
     - Parse the secctx command immediately in cachefiles
     - Remove a redundant smp_rmb() in netfslib
     - Handle recursion in read retry in netfslib
     - Fix clearing of folio_queue
     - Fix missing cancellation of copy-to_cache when the cache for a
       file is temporarly disabled in netfslib

 - Sanity check the hfs root record

 - Fix zero padding data issues in concurrent write scenarios

 - Fix is_mnt_ns_file() after converting nsfs to path_from_stashed()

 - Fix missing declaration of init_files

 - Increase I/O priority when writing revoke records in jbd2

 - Flush filesystem device before updating tail sequence in jbd2

* tag 'vfs-6.13-rc7.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits)
  ovl: support encoding fid from inode with no alias
  ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
  fuse: respect FOPEN_KEEP_CACHE on opendir
  netfs: Fix is-caching check in read-retry
  netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled
  netfs: Fix ceph copy to cache on write-begin
  netfs: Work around recursion by abandoning retry if nothing read
  netfs: Fix missing barriers by using clear_and_wake_up_bit()
  netfs: Remove redundant use of smp_rmb()
  cachefiles: Parse the "secctx" immediately
  nfs: Fix oops in nfs_netfs_init_request() when copying to cache
  netfs: Fix enomem handling in buffered reads
  netfs: Fix non-contiguous donation between completed reads
  kheaders: Ignore silly-rename files
  fs: relax assertions on failure to encode file handles
  fs: fix missing declaration of init_files
  fs: fix is_mnt_ns_file()
  iomap: fix zero padding data issue in concurrent append writes
  iomap: pass byte granular end position to iomap_add_to_ioend
  jbd2: flush filesystem device before updating tail sequence
  ...
2025-01-06 10:26:39 -08:00
Linus Torvalds
13563da6ff Merge tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfio
Pull vfio fix from Alex Williamson:

 - Fix a missed order alignment requirement of the pfn when inserting
   mappings through the new huge fault handler introduced in v6.12 (Alex
   Williamson)

* tag 'vfio-v6.13-rc7' of https://github.com/awilliam/linux-vfio:
  vfio/pci: Fallback huge faults for unaligned pfn
2025-01-06 06:56:23 -08:00
Christian Brauner
368fcc5d3f Merge patch series "Fix encoding overlayfs fid for fanotify delete events"
Amir Goldstein <amir73il@gmail.com> says:

This is a followup fix to the reported regression [1] that was
introduced by overlayfs non-decodable file handles support in v6.6.

The first fix posted two weeks ago [2] was a quick band aid which is
justified on its own and is still queued on your vfs.fixes branch.

This followup fix fixes the root cause of overlayfs file handle encoding
failure and it also solves a bug with fanotify FAN_DELETE_SELF events on
overlayfs, that was discovered from analysis of the first report.

The fix to fanotify delete events was verified with a new LTP test [3].

[1] https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/
[2] https://lore.kernel.org/linux-fsdevel/20241219115301.465396-1-amir73il@gmail.com/
[3] https://github.com/amir73il/ltp/commits/ovl_encode_fid/

* patches from https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com:
  ovl: support encoding fid from inode with no alias
  ovl: pass realinode to ovl_encode_real_fh() instead of realdentry

Link: https://lore.kernel.org/r/20250105162404.357058-1-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06 15:43:58 +01:00
Amir Goldstein
c45beebfde ovl: support encoding fid from inode with no alias
Dmitry Safonov reported that a WARN_ON() assertion can be trigered by
userspace when calling inotify_show_fdinfo() for an overlayfs watched
inode, whose dentry aliases were discarded with drop_caches.

The WARN_ON() assertion in inotify_show_fdinfo() was removed, because
it is possible for encoding file handle to fail for other reason, but
the impact of failing to encode an overlayfs file handle goes beyond
this assertion.

As shown in the LTP test case mentioned in the link below, failure to
encode an overlayfs file handle from a non-aliased inode also leads to
failure to report an fid with FAN_DELETE_SELF fanotify events.

As Dmitry notes in his analyzis of the problem, ovl_encode_fh() fails
if it cannot find an alias for the inode, but this failure can be fixed.
ovl_encode_fh() seldom uses the alias and in the case of non-decodable
file handles, as is often the case with fanotify fid info,
ovl_encode_fh() never needs to use the alias to encode a file handle.

Defer finding an alias until it is actually needed so ovl_encode_fh()
will not fail in the common case of FAN_DELETE_SELF fanotify events.

Fixes: 16aac5ad1f ("ovl: support encoding non-decodable file handles")
Reported-by: Dmitry Safonov <dima@arista.com>
Closes: https://lore.kernel.org/linux-fsdevel/CAOQ4uxiie81voLZZi2zXS1BziXZCM24nXqPAxbu8kxXCUWdwOg@mail.gmail.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250105162404.357058-3-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06 15:43:55 +01:00
Amir Goldstein
07aeefae7f ovl: pass realinode to ovl_encode_real_fh() instead of realdentry
We want to be able to encode an fid from an inode with no alias.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06 15:43:55 +01:00
Linus Torvalds
5428dc1906 Merge tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat fixes from Namjae Jeon:
 "All fixes are for issues reported by syzbot:

   - Fix wrong error return in exfat_find_empty_entry()

   - Fix a endless loop by self-linked chain

   - fix a KMSAN uninit-value issue in exfat_extend_valid_size()"

* tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
  exfat: fix the infinite loop in __exfat_free_cluster()
  exfat: fix the new buffer was not zeroed before writing
  exfat: fix the infinite loop in exfat_readdir()
  exfat: fix exfat_find_empty_entry() not returning error on failure
2025-01-06 06:19:36 -08:00
Linus Torvalds
cd6313beae Revert "vmstat: disable vmstat_work on vmstat_cpu_down_prep()"
This reverts commit adcfb264c3.

It turns out this just causes a different warning splat instead that
seems to be much easier to trigger, so let's revert ASAP.

Reported-and-bisected-by: Borislav Petkov <bp@alien8.de>
Tested-by: Breno Leitao <leitao@debian.org>
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Link: https://lore.kernel.org/all/20250106131817.GAZ3vYGVr3-hWFFPLj@fat_crate.local/
Cc: Koichiro Den <koichiro.den@canonical.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-01-06 06:10:24 -08:00
Linus Torvalds
9d89551994 Linux 6.13-rc6 v6.13-rc6 2025-01-05 14:13:40 -08:00
Linus Torvalds
9244696b34 Merge tag 'kbuild-fixes-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix escaping of '$' in scripts/mksysmap

 - Fix a modpost crash observed with the latest binutils

 - Fix 'provides' in the linux-api-headers pacman package

* tag 'kbuild-fixes-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: pacman-pkg: provide versioned linux-api-headers package
  modpost: work around unaligned data access error
  modpost: refactor do_vmbus_entry()
  modpost: fix the missed iteration for the max bit in do_input()
  scripts/mksysmap: Fix escape chars '$'
2025-01-05 10:52:47 -08:00
Linus Torvalds
5635d8bad2 Merge tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull hotfixes from Andrew Morton:
 "25 hotfixes.  16 are cc:stable.  18 are MM and 7 are non-MM.

  The usual bunch of singletons and two doubletons - please see the
  relevant changelogs for details"

* tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits)
  MAINTAINERS: change Arınç _NAL's name and email address
  scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity
  mm/util: make memdup_user_nul() similar to memdup_user()
  mm, madvise: fix potential workingset node list_lru leaks
  mm/damon/core: fix ignored quota goals and filters of newly committed schemes
  mm/damon/core: fix new damon_target objects leaks on damon_commit_targets()
  mm/list_lru: fix false warning of negative counter
  vmstat: disable vmstat_work on vmstat_cpu_down_prep()
  mm: shmem: fix the update of 'shmem_falloc->nr_unswapped'
  mm: shmem: fix incorrect index alignment for within_size policy
  percpu: remove intermediate variable in PERCPU_PTR()
  mm: zswap: fix race between [de]compression and CPU hotunplug
  ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv
  fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit
  kcov: mark in_softirq_really() as __always_inline
  docs: mm: fix the incorrect 'FileHugeMapped' field
  mailmap: modify the entry for Mathieu Othacehe
  mm/kmemleak: fix sleeping function called from invalid context at print message
  mm: hugetlb: independent PMD page table shared count
  maple_tree: reload mas before the second call for mas_empty_area
  ...
2025-01-05 10:37:45 -08:00
Linus Torvalds
7a5b6fc8bd Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "A randconfig build fix and a performance fix:

   - Fix the CONFIG_RESET_CONTROLLER=n path signature of
     clk_imx8mp_audiomix_reset_controller_register() to appease
     randconfig

   - Speed up the sdhci clk on TH1520 by a factor of 4 by adding
     a fixed factor clk"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: clk-imx8mp-audiomix: fix function signature
  clk: thead: Fix TH1520 emmc and shdci clock rate
2025-01-05 10:28:34 -08:00
Thomas Weißschuh
385443057f kbuild: pacman-pkg: provide versioned linux-api-headers package
The Arch Linux glibc package contains a versioned dependency on
"linux-api-headers". If the linux-api-headers package provided by
pacman-pkg does not specify an explicit version this dependency is not
satisfied.
Fix the dependency by providing an explicit version.

Fixes: c8578539de ("kbuild: add script and target to generate pacman package")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-01-05 23:19:17 +09:00
Linus Torvalds
ab75170520 Merge tag 'linux-watchdog-6.13-rc6' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fix from Wim Van Sebroeck:

 - fix error message during stm32 driver probe

* tag 'linux-watchdog-6.13-rc6' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: stm32_iwdg: fix error message during driver probe
2025-01-04 10:59:10 -08:00
Amir Goldstein
03f275adb8 fuse: respect FOPEN_KEEP_CACHE on opendir
The re-factoring of fuse_dir_open() missed the need to invalidate
directory inode page cache with open flag FOPEN_KEEP_CACHE.

Fixes: 7de64d521b ("fuse: break up fuse_open_common()")
Reported-by: Prince Kumar <princer@google.com>
Closes: https://lore.kernel.org/linux-fsdevel/CAEW=TRr7CYb4LtsvQPLj-zx5Y+EYBmGfM24SuzwyDoGVNoKm7w@mail.gmail.com/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://lore.kernel.org/r/20250101130037.96680-1-amir73il@gmail.com
Reviewed-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-04 09:58:14 +01:00
Linus Torvalds
63676eefb7 Merge tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext
Pull sched_ext fixes from Tejun Heo:

 - Fix a bug where bpf_iter_scx_dsq_new() was not initializing the
   iterator's flags and could inadvertently enable e.g. reverse
   iteration

 - Fix a bug where scx_ops_bypass() could call irq_restore twice

 - Add Andrea and Changwoo as maintainers for better review coverage

 - selftests and tools/sched_ext build and other fixes

* tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Fix dsq_local_on selftest
  sched_ext: initialize kit->cursor.flags
  sched_ext: Fix invalid irq restore in scx_ops_bypass()
  MAINTAINERS: add me as reviewer for sched_ext
  MAINTAINERS: add self as reviewer for sched_ext
  scx: Fix maximal BPF selftest prog
  sched_ext: fix application of sizeof to pointer
  selftests/sched_ext: fix build after renames in sched_ext API
  sched_ext: Add __weak to fix the build errors
2025-01-03 15:09:12 -08:00
Linus Torvalds
f9aa1fb9f8 Merge tag 'wq-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq
Pull workqueue fixes from Tejun Heo:

 - Suppress a corner case spurious flush dependency warning

 - Two trivial changes

* tag 'wq-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: add printf attribute to __alloc_workqueue()
  workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker
  rust: add safety comment in workqueue traits
2025-01-03 15:03:56 -08:00
Linus Torvalds
2ae3aab557 Merge tag 'block-6.13-20250103' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
 "Collection of fixes for block. Particularly the target name overflow
  has been a bit annoying, as it results in overwriting random memory
  and hence shows up as triggering various other bugs.

   - NVMe pull request via Keith:
      - Fix device specific quirk for PRP list alignment (Robert)
      - Fix target name overflow (Leo)
      - Fix target write granularity (Luis)
      - Fix target sleeping in atomic context (Nilay)
      - Remove unnecessary tcp queue teardown (Chunguang)

   - Simple cdrom typo fix"

* tag 'block-6.13-20250103' of git://git.kernel.dk/linux:
  cdrom: Fix typo, 'devicen' to 'device'
  nvme-tcp: remove nvme_tcp_destroy_io_queues()
  nvmet-loop: avoid using mutex in IO hotpath
  nvmet: propagate npwg topology
  nvmet: Don't overflow subsysnqn
  nvme-pci: 512 byte aligned dma pool segment quirk
2025-01-03 14:58:57 -08:00
Linus Torvalds
a984e234fc Merge tag 'io_uring-6.13-20250103' of git://git.kernel.dk/linux
Pull io_uring fixes from Jens Axboe:

 - Fix an issue with the read multishot support and posting of CQEs from
   io-wq context

 - Fix a regression introduced in this cycle, where making the timeout
   lock a raw one uncovered another locking dependency. As a result,
   move the timeout flushing outside of the timeout lock, punting them
   to a local list first

 - Fix use of an uninitialized variable in io_async_msghdr. Doesn't
   really matter functionally, but silences a valid KMSAN complaint that
   it's not always initialized

 - Fix use of incrementally provided buffers for read on non-pollable
   files, where the buffer always gets committed upfront. Unfortunately
   the buffer address isn't resolved first, so the read ends up using
   the updated rather than the current value

* tag 'io_uring-6.13-20250103' of git://git.kernel.dk/linux:
  io_uring/kbuf: use pre-committed buffer address for non-pollable file
  io_uring/net: always initialize kmsg->msg.msg_inq upfront
  io_uring/timeout: flush timeouts outside of the timeout lock
  io_uring/rw: fix downgraded mshot read
2025-01-03 14:45:59 -08:00
Linus Torvalds
aba74e639f Merge tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from wireles and netfilter.

  Nothing major here. Over the last two weeks we gathered only around
  two-thirds of our normal weekly fix count, but delaying sending these
  until -rc7 seemed like a really bad idea.

  AFAIK we have no bugs under investigation. One or two reverts for
  stuff for which we haven't gotten a proper fix will likely come in the
  next PR.

  Current release - fix to a fix:

   - netfilter: nft_set_hash: unaligned atomic read on struct
     nft_set_ext

   - eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup

  Previous releases - regressions:

   - net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets

   - mptcp:
      - fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust
      - fix TCP options overflow
      - prevent excessive coalescing on receive, fix throughput

   - net: fix memory leak in tcp_conn_request() if map insertion fails

   - wifi: cw1200: fix potential NULL dereference after conversion to
     GPIO descriptors

   - phy: micrel: dynamically control external clock of KSZ PHY, fix
     suspend behavior

  Previous releases - always broken:

   - af_packet: fix VLAN handling with MSG_PEEK

   - net: restrict SO_REUSEPORT to inet sockets

   - netdev-genl: avoid empty messages in NAPI get

   - dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X

   - eth:
      - gve: XDP fixes around transmit, queue wakeup etc.
      - ti: icssg-prueth: fix firmware load sequence to prevent time
        jump which breaks timesync related operations

  Misc:

   - netlink: specs: mptcp: add missing attr and improve documentation"

* tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
  net: ti: icssg-prueth: Fix firmware load sequence.
  mptcp: prevent excessive coalescing on receive
  mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
  mptcp: fix recvbuffer adjust on sleeping rcvmsg
  ila: serialize calls to nf_register_net_hooks()
  af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK
  af_packet: fix vlan_get_tci() vs MSG_PEEK
  net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init()
  net: restrict SO_REUSEPORT to inet sockets
  net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets
  net: sfc: Correct key_len for efx_tc_ct_zone_ht_params
  net: wwan: t7xx: Fix FSM command timeout issue
  sky2: Add device ID 11ab:4373 for Marvell 88E8075
  mptcp: fix TCP options overflow.
  net: mv643xx_eth: fix an OF node reference leak
  gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup
  eth: bcmsysport: fix call balance of priv->clk handling routines
  net: llc: reset skb->transport_header
  netlink: specs: mptcp: fix missing doc
  ...
2025-01-03 14:36:54 -08:00
Linus Torvalds
ee063c23e4 Merge tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux
Pull nios2 fixlet from Dinh Nguyen:

 - Use str_yes_no() helper function

* tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux:
  nios2: Use str_yes_no() helper in show_cpuinfo()
2025-01-03 14:16:25 -08:00
Linus Torvalds
dea3165f98 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "A lot of fixes accumulated over the holiday break:

   - Static tool fixes, value is already proven to be NULL, possible
     integer overflow

   - Many bnxt_re fixes:
      - Crashes due to a mismatch in the maximum SGE list size
      - Don't waste memory for user QPs by creating kernel-only
        structures
      - Fix compatability issues with older HW in some of the new HW
        features recently introduced: RTS->RTS feature, work around 9096
      - Do not allow destroy_qp to fail
      - Validate QP MTU against device limits
      - Add missing validation on madatory QP attributes for RTR->RTS
      - Report port_num in query_qp as required by the spec
      - Fix creation of QPs of the maximum queue size, and in the
        variable mode
      - Allow all QPs to be used on newer HW by limiting a work around
        only to HW it affects
      - Use the correct MSN table size for variable mode QPs
      - Add missing locking in create_qp() accessing the qp_tbl
      - Form WQE buffers correctly when some of the buffers are 0 hop
      - Don't crash on QP destroy if the userspace doesn't setup the
        dip_ctx
      - Add the missing QP flush handler call on the DWQE path to avoid
        hanging on error recovery
      - Consistently use ENXIO for return codes if the devices is
        fatally errored

   - Try again to fix VLAN support on iwarp, previous fix was reverted
     due to breaking other cards

   - Correct error path return code for rdma netlink events

   - Remove the seperate net_device pointer in siw and rxe which
     syzkaller found a way to UAF

   - Fix a UAF of a stack ib_sge in rtrs

   - Fix a regression where old mlx5 devices and FW were wrongly
     activing new device features and failing"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits)
  RDMA/mlx5: Enable multiplane mode only when it is supported
  RDMA/bnxt_re: Fix error recovery sequence
  RDMA/rtrs: Ensure 'ib_sge list' is accessible
  RDMA/rxe: Remove the direct link to net_device
  RDMA/hns: Fix missing flush CQE for DWQE
  RDMA/hns: Fix warning storm caused by invalid input in IO path
  RDMA/hns: Fix accessing invalid dip_ctx during destroying QP
  RDMA/hns: Fix mapping error of zero-hop WQE buffer
  RDMA/bnxt_re: Fix the locking while accessing the QP table
  RDMA/bnxt_re: Fix MSN table size for variable wqe mode
  RDMA/bnxt_re: Add send queue size check for variable wqe
  RDMA/bnxt_re: Disable use of reserved wqes
  RDMA/bnxt_re: Fix max_qp_wrs reported
  RDMA/siw: Remove direct link to net_device
  RDMA/nldev: Set error code in rdma_nl_notify_event
  RDMA/bnxt_re: Fix reporting hw_ver in query_device
  RDMA/bnxt_re: Fix to export port num to ib_query_qp
  RDMA/bnxt_re: Fix setting mandatory attributes for modify_qp
  RDMA/bnxt_re: Add check for path mtu in modify_qp
  RDMA/bnxt_re: Fix the check for 9060 condition
  ...
2025-01-03 11:09:35 -08:00
Linus Torvalds
f274fffbc2 Merge tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:

 - A small Kconfig fixup for the i.MX.

   In principle this could come in from the SoC tree but the bug was
   introduced from the pin control tree so let's fix it from here.

 - Fix a sleep in atomic context in the MCP23xxx GPIO expander by
   disabling the regmap locking and using explicit mutex locks.

* tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking
  ARM: imx: Re-introduce the PINCTRL selection
2025-01-03 10:57:57 -08:00
Linus Torvalds
4f5d3da619 Merge tag 'sound-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "The first new year pull request: no surprises, all small fixes,
  including:

   - Follow-up fixes for the new compress-offload API extension

   - A couple of fixes for MIDI 2.0 UMP handling

   - A trivial race fix for OSS sequencer emulation ioctls

   - USB-audio and HD-audio fixes / quirks"

* tag 'sound-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq: Check UMP support for midi_version change
  ALSA hda/realtek: Add quirk for Framework F111:000C
  Revert "ALSA: ump: Don't enumeration invalid groups for legacy rawmidi"
  ALSA: seq: oss: Fix races at processing SysEx messages
  ALSA: compress_offload: fix remaining descriptor races in sound/core/compress_offload.c
  ALSA: compress_offload: Drop unneeded no_free_ptr()
  ALSA: hda/tas2781: Ignore SUBSYS_ID not found for tas2563 projects
  ALSA: usb-audio: US16x08: Initialize array before use
2025-01-03 10:54:51 -08:00
Linus Torvalds
92c3bb3d2e Merge tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel
Pull drm fixes from Dave Airlie:
 "Happy New Year.

  It was fairly quiet for holidays period, certainly nothing that worth
  getting off the couch before I needed to, this is for the past two
  weeks, i915, xe and some adv7511, I expect we will see some amdgpu etc
  happening next week, but otherwise all quiet.

  i915:
   - Fix C10 pll programming sequence [cx0_phy]
   - Fix power gate sequence. [dg1]

  xe:
   - uapi: Revert some devcoredump file format changes breaking a mesa
     debug tool
   - Fixes around waits when moving to system
   - Fix a typo when checking for LMEM provisioning
   - Fix a fault on fd close after unbind
   - A couple of OA fixes squashed for stable backporting

  adv7511:
   - fix UAF
   - drop single lane support
   - audio infoframe fix"

* tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel:
  xe/oa: Fix query mode of operation for OAR/OAC
  drm/i915/dg1: Fix power gate sequence.
  drm/i915/cx0_phy: Fix C10 pll programming sequence
  drm/xe: Fix fault on fd close after unbind
  drm/xe/pf: Use correct function to check LMEM provisioning
  drm/xe: Wait for migration job before unmapping pages
  drm/xe: Use non-interruptible wait when moving BO to system
  drm/xe: Revert some changes that break a mesa debug tool
  drm: adv7511: Drop dsi single lane support
  dt-bindings: display: adi,adv7533: Drop single lane support
  drm: adv7511: Fix use-after-free in adv7533_attach_dsi()
  drm/bridge: adv7511_audio: Update Audio InfoFrame properly
2025-01-03 10:06:44 -08:00
Linus Torvalds
e30dd219c7 Merge tag 'ftrace-v6.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ftrace fixes from Steven Rostedt:

 - Add needed READ_ONCE() around access to the fgraph array element

   The updates to the fgraph array can happen when callbacks are
   registered and unregistered. The __ftrace_return_to_handler() can
   handle reading either the old value or the new value. But once it
   reads that value it must stay consistent otherwise the check that
   looks to see if the value is a stub may show false, but if the
   compiler decides to re-read after that check, it can be true which
   can cause the code to crash later on.

 - Make function profiler use the top level ops for filtering again

   When function graph became available for instances, its filter ops
   became independent from the top level set_ftrace_filter. In the
   process the function profiler received its own filter ops as well.
   But the function profiler uses the top level set_ftrace_filter file
   and does not have one of its own. In giving it its own filter ops, it
   lost any user interface it once had. Make it use the top level
   set_ftrace_filter file again. This fixes a regression.

* tag 'ftrace-v6.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  ftrace: Fix function profiler's filtering functionality
  fgraph: Add READ_ONCE() when accessing fgraph_array[]
2025-01-03 10:04:43 -08:00
Jens Axboe
ed123c948d io_uring/kbuf: use pre-committed buffer address for non-pollable file
For non-pollable files, buffer ring consumption will commit upfront.
This is fine, but io_ring_buffer_select() will return the address of the
buffer after having committed it. For incrementally consumed buffers,
this is incorrect as it will modify the buffer address.

Store the pre-committed value and return that. If that isn't done, then
the initial part of the buffer is not used and the application will
correctly assume the content arrived at the start of the userspace
buffer, but the kernel will have put it later in the buffer. Or it can
cause a spurious -EFAULT returned in the CQE, depending on the buffer
size. As bounds are suitably checked for doing the actual IO, no adverse
side effects are possible - it's just a data misplacement within the
existing buffer.

Reported-by: Gwendal Fernet <gwendalfernet@gmail.com>
Cc: stable@vger.kernel.org
Fixes: ae98dbf43d ("io_uring/kbuf: add support for incremental buffer consumption")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-03 09:38:37 -07:00
Alex Williamson
09dfc8a5f2 vfio/pci: Fallback huge faults for unaligned pfn
The PFN must also be aligned to the fault order to insert a huge
pfnmap.  Test the alignment and fallback when unaligned.

Fixes: f9e54c3a2f ("vfio/pci: implement huge_fault support")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=219619
Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com>
Reported-by: Precific <precification@posteo.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Precific <precification@posteo.de>
Link: https://lore.kernel.org/r/20250102183416.1841878-1-alex.williamson@redhat.com
Cc: stable@vger.kernel.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-01-03 08:49:05 -07:00
Mark Zhang
45d339fefa RDMA/mlx5: Enable multiplane mode only when it is supported
Driver queries vport_cxt.num_plane and enables multiplane when it is
greater then 0, but some old FWs (versions from x.40.1000 till x.42.1000),
report vport_cxt.num_plane = 1 unexpectedly.

Fix it by querying num_plane only when HCA_CAP2.multiplane bit is set.

Fixes: 2a5db20fa5 ("RDMA/mlx5: Add support to multi-plane device and port")
Link: https://patch.msgid.link/r/1ef901acdf564716fcf550453cf5e94f343777ec.1734610916.git.leon@kernel.org
Cc: stable@vger.kernel.org
Reported-by: Francesco Poli <invernomuto@paranoici.org>
Closes: https://lore.kernel.org/all/nvs4i2v7o6vn6zhmtq4sgazy2hu5kiulukxcntdelggmznnl7h@so3oul6uwgbl/
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-01-03 09:17:19 -04:00
David S. Miller
ce21419b55 Merge branch 'net-iep-clock-module-fixes'
Meghana Malladi says:

====================
IEP clock module bug fixes

This series has some bug fixes for IEP module needed by PPS and
timesync operations.

Patch 1/2 fixes firmware load sequence to run all the firmwares
when either of the ethernet interfaces is up. Move all the code
common for firmware bringup under common functions.

Patch 2/2 fixes distorted PPS signal when the ethernet interfaces
are brough down and up. This patch also fixes enabling PPS signal
after bringing the interface up, without disabling PPS.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-03 11:54:06 +00:00
Meghana Malladi
9b11536124 net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init
When ICSSG interfaces are brought down and brought up again, the
pru cores are shut down and booted again, flushing out all the memories
and start again in a clean state. Hence it is expected that the
IEP_CMP_CFG register needs to be flushed during iep_init() to ensure
that the existing residual configuration doesn't cause any unusual
behavior. If the register is not cleared, existing IEP_CMP_CFG set for
CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values.

After bringing the interface up, calling PPS enable doesn't work as
the driver believes PPS is already enabled, (iep->pps_enabled is not
cleared during interface bring down) and driver will just return true
even though there is no signal. Fix this by disabling pps and perout.

Fixes: c1e0230eea ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-03 11:54:06 +00:00
MD Danish Anwar
9facce84f4 net: ti: icssg-prueth: Fix firmware load sequence.
Timesync related operations are ran in PRU0 cores for both ICSSG SLICE0
and SLICE1. Currently whenever any ICSSG interface comes up we load the
respective firmwares to PRU cores and whenever interface goes down, we
stop the resective cores. Due to this, when SLICE0 goes down while
SLICE1 is still active, PRU0 firmwares are unloaded and PRU0 core is
stopped. This results in clock jump for SLICE1 interface as the timesync
related operations are no longer running.

As there are interdependencies between SLICE0 and SLICE1 firmwares,
fix this by running both PRU0 and PRU1 firmwares as long as at least 1
ICSSG interface is up. Add new flag in prueth struct to check if all
firmwares are running and remove the old flag (fw_running).

Use emacs_initialized as reference count to load the firmwares for the
first and last interface up/down. Moving init_emac_mode and fw_offload_mode
API outside of icssg_config to icssg_common_start API as they need
to be called only once per firmware boot.

Change prueth_emac_restart() to return error code and add error prints
inside the caller of this functions in case of any failures.

Move prueth_emac_stop() from common to sr1 driver.
sr1 and sr2 drivers have different logic handling for stopping
the firmwares. While sr1 driver is dependent on emac structure
to stop the corresponding pru cores for that slice, for sr2
all the pru cores of both the slices are stopped and is not
dependent on emac. So the prueth_emac_stop() function is no
longer common and can be moved to sr1 driver.

Fixes: c1e0230eea ("net: ti: icss-iep: Add IEP driver")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Signed-off-by: Meghana Malladi <m-malladi@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-03 11:54:06 +00:00
Jakub Kicinski
3473020d0f Merge branch 'mptcp-rx-path-fixes'
Matthieu Baerts says:

====================
mptcp: rx path fixes

Here are 3 different fixes, all related to the MPTCP receive buffer:

- Patch 1: fix receive buffer space when recvmsg() blocks after
  receiving some data. For a fix introduced in v6.12, backported to
  v6.1.

- Patch 2: mptcp_cleanup_rbuf() can be called when no data has been
  copied. For 5.11.

- Patch 3: prevent excessive coalescing on receive, which can affect the
  throughput badly. It looks better to wait a bit before backporting
  this one to stable versions, to get more results. For 5.10.
====================

Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-02 18:44:06 -08:00
Paolo Abeni
56b824eb49 mptcp: prevent excessive coalescing on receive
Currently the skb size after coalescing is only limited by the skb
layout (the skb must not carry frag_list). A single coalesced skb
covering several MSS can potentially fill completely the receive
buffer. In such a case, the snd win will zero until the receive buffer
will be empty again, affecting tput badly.

Fixes: 8268ed4c9d ("mptcp: introduce and use mptcp_try_coalesce()")
Cc: stable@vger.kernel.org # please delay 2 weeks after 6.13-final release
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-3-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-02 18:44:03 -08:00
Paolo Abeni
551844f26d mptcp: don't always assume copied data in mptcp_cleanup_rbuf()
Under some corner cases the MPTCP protocol can end-up invoking
mptcp_cleanup_rbuf() when no data has been copied, but such helper
assumes the opposite condition.

Explicitly drop such assumption and performs the costly call only
when strictly needed - before releasing the msk socket lock.

Fixes: fd8976790a ("mptcp: be careful on MPTCP-level ack.")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-2-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-02 18:44:03 -08:00
Paolo Abeni
449e6912a2 mptcp: fix recvbuffer adjust on sleeping rcvmsg
If the recvmsg() blocks after receiving some data - i.e. due to
SO_RCVLOWAT - the MPTCP code will attempt multiple times to
adjust the receive buffer size, wrongly accounting every time the
cumulative of received data - instead of accounting only for the
delta.

Address the issue moving mptcp_rcv_space_adjust just after the
data reception and passing it only the just received bytes.

This also removes an unneeded difference between the TCP and MPTCP
RX code path implementation.

Fixes: 5813022985 ("mptcp: error out earlier on disconnect")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241230-net-mptcp-rbuf-fixes-v1-1-8608af434ceb@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-02 18:44:03 -08:00