linux

mirror of https://github.com/torvalds/linux.git synced 2026-01-25 15:03:52 +08:00

Author	SHA1	Message	Date
Jeremy Kerr	269936db5e	net: mctp: separate routing database from routing operations This change adds a struct mctp_dst, representing the result of a routing lookup. This decouples the struct mctp_route from the actual implementation of a routing operation. This will allow for future routing changes which may require more involved lookup logic, such as gateway routing - which may require multiple traversals of the routing table. Since we only use the struct mctp_route at lookup time, we no longer hold routes over a routing operation, as we only need it to populate the dst. However, we do hold the dev while the dst is active. This requires some changes to the route test infrastructure, as we no longer have a mock route to handle the route output operation, and transient dsts are created by the routing code, so we can't override them as easily. Instead, we use kunit->priv to stash a packet queue, and a custom dst_output function queues into that packet queue, which we can use for later expectations. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-3-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	fc2b87d036	net: mctp: test: make cloned_frag buffers more appropriately-sized In our input_cloned_frag test, we currently allocate our test buffers arbitrarily-sized at 100 bytes. We only expect to receive a max of 15 bytes from the socket, so reduce to a more appropriate size. There are some upcoming changes to the routing code which hit a frame-size limit on s390, so reduce the usage before that lands. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-2-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Jeremy Kerr	e0f3c79cc0	net: mctp: don't use source cb data when forwarding, ensure pkt_type is set In the output path, only check the skb->cb data when we know it's from a local socket; input packets will have source address information there instead. In order to detect when we're forwarding, set skb->pkt_type on input/output. Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20250702-dev-forwarding-v5-1-1468191da8a4@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 12:39:23 +02:00
Paolo Abeni	05cc60ef27	Merge branch 'add-broadcast_neighbor-for-no-stacking-networking-arch' Tonghao Zhang says: ==================== add broadcast_neighbor for no-stacking networking arch For no-stacking networking arch, and enable the bond mode 4(lacp) in datacenter, the switch require arp/nd packets as session synchronization. More details please see patch. Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Cc: Zengbing Tu <tuzengbing@didiglobal.com> ==================== Link: https://patch.msgid.link/cover.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:58 +02:00
Tonghao Zhang	2f9afffc39	net: bonding: send peer notify when failure recovery In LACP mode with broadcast_neighbor enabled, after LACP protocol recovery, the port can transmit packets. However, if the bond port doesn't send gratuitous ARP/ND packets to the switch, the switch won't return packets through the current interface. This causes traffic imbalance. To resolve this issue, when LACP protocol recovers, send ARP/ND packets if broadcast_neighbor is enabled. Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/3993652dc093fffa9504ce1c2448fb9dea31d2d2.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:42 +02:00
Tonghao Zhang	3d98ee5265	net: bonding: add broadcast_neighbor netlink option User can config or display the bonding broadcast_neighbor option via iproute2/netlink. Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/76b90700ba5b98027dfb51a2f3c5cfea0440a21b.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:42 +02:00
Tonghao Zhang	ce7a381697	net: bonding: add broadcast_neighbor option for 802.3ad Stacking technology is a type of technology used to expand ports on Ethernet switches. It is widely used as a common access method in large-scale Internet data center architectures. Years of practice have proved that stacking technology has advantages and disadvantages in high-reliability network architecture scenarios. For instance, in stacking networking arch, conventional switch system upgrades require multiple stacked devices to restart at the same time. Therefore, it is inevitable that the business will be interrupted for a while. It is for this reason that "no-stacking" in data centers has become a trend. Additionally, when the stacking link connecting the switches fails or is abnormal, the stack will split. Although it is not common, it still happens in actual operation. The problem is that after the split, it is equivalent to two switches with the same configuration appearing in the network, causing network configuration conflicts and ultimately interrupting the services carried by the stacking system. To improve network stability, "non-stacking" solutions have been increasingly adopted, particularly by public cloud providers and tech companies like Alibaba, Tencent, and Didi. "non-stacking" is a method of mimicing switch stacking that convinces a LACP peer, bonding in this case, connected to a set of "non-stacked" switches that all of its ports are connected to a single switch (i.e., LACP aggregator), as if those switches were stacked. This enables the LACP peer's ports to aggregate together, and requires (a) special switch configuration, described in the linked article, and (b) modifications to the bonding 802.3ad (LACP) mode to send all ARP/ND packets across all ports of the active aggregator. Note that, with multiple aggregators, the current broadcast mode logic will send only packets to the selected aggregator(s). +-----------+ +-----------+ \| switch1 \| \| switch2 \| +-----------+ +-----------+ ^ ^ \| \| +-----------------+ \| bond4 lacp \| +-----------------+ \| \| \| NIC1 \| NIC2 +-----------------+ \| server \| +-----------------+ - https://www.ruijie.com/fr-fr/support/tech-gallery/de-stack-data-center-network-architecture/ Cc: Jay Vosburgh <jv@jvosburgh.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Simon Horman <horms@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andrew Lunn <andrew+netdev@lunn.ch> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Tonghao Zhang <tonghao@bamaicloud.com> Signed-off-by: Zengbing Tu <tuzengbing@didiglobal.com> Link: https://patch.msgid.link/84d0a044514157bb856a10b6d03a1028c4883561.1751031306.git.tonghao@bamaicloud.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2025-07-08 10:59:41 +02:00
Jakub Kicinski	0234362d0a	Merge branch 'net-mlx5-hws-optimize-matchers-icm-usage' Mark Bloch says: ==================== net/mlx5: HWS, Optimize matchers ICM usage This series optimizes ICM usage for unidirectional rules and empty matchers and with the last patch we make hardware steering the default FDB steering provider for NICs that don't support software steering. Hardware steering (HWS) uses a type of rule table container (RTC) that is unidirectional, so matchers consist of two RTCs to accommodate bidirectional rules. This small series enables resizing the two RTCs independently by tracking the number of rules separately. For extreme cases where all rules are unidirectional, this results in saving close to half the memory footprint. Results for inserting 1M unidirectional rules using a simple module: Pages Memory Before this patch: 300k 1.5GiB After this patch: 160k 900MiB The 'Pages' column measures the number of 4KiB pages the device requests for itself (the ICM). The 'Memory' column is the difference between peak usage and baseline usage (before starting the test) as reported by `free -h`. In addition, second to last patch of the series handles a case where all the matcher's rules were deleted: the large RTCs of the matcher are no longer required, and we can save some more ICM by shrinking the matcher to its initial size. Finally the last patch makes hardware steering the default mode when in swichdev for NICs that don't have software steering support. ==================== Link: https://patch.msgid.link/20250703185431.445571-1-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:19 -07:00
Moshe Shemesh	a9aec713d0	net/mlx5: Add HWS as secondary steering mode Add HW Steering (HWS) as a secondary option for device steering mode. If the device does not support SW Steering (SWS), HW Steering will be used as the default, provided it is supported. FW Steering will now be selected as the default only if both HWS and SWS are unavailable. Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703185431.445571-11-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:17 -07:00
Yevgeny Kliteynik	96e4c4a1a5	net/mlx5: HWS, Shrink empty matchers Matcher size is dynamic: it starts at initial size, and then it grows through rehash as more and more rules are added to this matcher. When rules are deleted, matcher's size is not decreased. Rehash approach is greedy. The idea is: if the matcher got to a certain size at some point, chances are - it will get to this size again, so it is better to avoid costly rehash operations whenever possible. However, when all the rules of the matcher are deleted, this should be viewed as special case. If the matcher actually got to the point where it has zero rules, it might be an indication that some usecase from the past is no longer happening. This is where some ICM can be freed. This patch handles this case: when a number of rules in a matcher goes down to zero, the matcher's tables are shrunk to the initial size. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703185431.445571-10-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:17 -07:00
Yevgeny Kliteynik	29063103f8	net/mlx5: HWS, Rearrange to prevent forward declaration As a preparation for the following patch that will add support for shrinking empty matchers, rearrange the code to prevent forward declaration of functions. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703185431.445571-9-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:17 -07:00
Vlad Dogaru	6b44fffdc7	net/mlx5: HWS, Track matcher sizes individually Track and grow matcher sizes individually for RX and TX RTCs. This allows RX-only or TX-only use cases to effectively halve the device resources they use. For testing we used a simple module that inserts 1M RX-only rules and measured the number of pages the device requests, and memory usage as reported by `free -h`. Pages Memory Before this patch: 300k 1.5GiB After this patch: 160k 900MiB Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703185431.445571-8-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Vlad Dogaru	c8332ce096	net/mlx5: HWS, Decouple matcher RX and TX sizes Kernel HWS only uses FDB tables and, as such, creates two lower level containers (RTCs) for each matcher: one for RX and one for TX. Allow these RTCs to differ in size by converting the size part of the matcher attribute to a two element array. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250703185431.445571-7-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Vlad Dogaru	59807d0717	net/mlx5: HWS, Create STEs directly from matcher Matchers were using the pool abstraction solely as a convenience to allocate two STE ranges. The pool's core functionality, that of allocating individual items from the range, was unused. Matchers rely either on the hardware to hash rules into a table, or on a user-provided index. Remove the STE pool from the matcher and allocate the STE ranges manually instead. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250703185431.445571-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Vlad Dogaru	3dcac700d2	net/mlx5: HWS, Refactor rule skip logic Reduce nesting by adding a couple of early return statements. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703185431.445571-5-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Vlad Dogaru	d8e7ab591b	net/mlx5: HWS, Export rule skip logic The bwc layer will use `mlx5hws_rule_skip` to keep track of numbers of RX and TX rules individually, so export this function for future usage. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250703185431.445571-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Yevgeny Kliteynik	26b06579d5	net/mlx5: HWS, remove incorrect comment Removing incorrect comment section that is probably some copy-paste artifact. Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250703185431.445571-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Vlad Dogaru	60afb51c89	net/mlx5: HWS, remove unused create_dest_array parameter `flow_source` is not used anywhere in mlx5hws_action_create_dest_array. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250703185431.445571-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:12:16 -07:00
Jakub Kicinski	4b62261def	Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-07-03 Vladimir Oltean converts Intel drivers (ice, igc, igb, ixgbe, i40e) to utilize new timestamping API (ndo_hwtstamp_get() and ndo_hwtstamp_set()). For ixgbe: Paul, Don, Slawomir, and Radoslaw add Malicious Driver Detection (MDD) support for X550 and E610 devices to detect, report, and handle potentially malicious VFs. Simon Horman corrects spelling mistakes. For igbvf: Kohei Enju removes a couple of unreported counters and adds reporting of Tx timeouts. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igbvf: add tx_timeout_count to ethtool statistics igbvf: remove unused interrupt counter fields from struct igbvf_adapter ixgbe: spelling corrections ixgbe: turn off MDD while modifying SRRCTL ixgbe: add Tx hang detection unhandled MDD ixgbe: check for MDD events ixgbe: add MDD support i40e: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() ixgbe: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() igb: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() igc: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() ice: convert to ndo_hwtstamp_get() and ndo_hwtstamp_set() ==================== Link: https://patch.msgid.link/20250703174242.3829277-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:06:13 -07:00
Jakub Kicinski	95f6fedd62	Merge branch 'net-phylink-support-autoneg-configuration-for-sfps' Russell King says: ==================== net: phylink: support !autoneg configuration for SFPs This series comes from discussion during a patch series that was posted at the beginning of April, but these patches were never posted (I was too busy!) We restrict ->sfp_interfaces to those that the host system supports, and ensure that ->sfp_interfaces is cleared when a SFP is removed. We then add phylink_sfp_select_interface_speed() which will select an appropriate interface from ->sfp_interfaces for the speed, and use that in our phylink_ethtool_ksettings_set() when a SFP bus is present on a directly connected host (not with a PHY.) ==================== Link: https://patch.msgid.link/aGT_hoBELDysGbrp@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:02:39 -07:00
Russell King (Oracle)	320164a6e1	net: phylink: add phylink_sfp_select_interface_speed() Add phylink_sfp_select_interface_speed() which attempts to select the SFP interface based on the ethtool speed when autoneg is turned off. This allows users to turn off autoneg for SFPs that support multiple interface modes, and have an appropriate interface mode selected. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1uWu14-005KXo-IO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:02:36 -07:00
Russell King (Oracle)	b0fdff22d5	net: phylink: clear SFP interfaces when not in use Clear the SFP interfaces bitmap when we're not using it - in other words, when a module is unplugged, or we're using a PHY on the module. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1uWu0z-005KXi-EM@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:02:36 -07:00
Russell King (Oracle)	ff1fce1bdd	net: phylink: restrict SFP interfaces to those that are supported When configuring an optical SFP interface, restrict the bitmap of SFP interfaces (pl->sfp_interfaces) to those that are supported by the host, rather than calculating this in a local variable. This will allow us to avoid recomputing this in the phylink_ethtool_ksettings_set() path. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1uWu0u-005KXc-A4@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 19:02:36 -07:00
Jakub Kicinski	8d5d927d96	Merge branch 'introducing-broadcom-bnge-ethernet-driver' Vikas Gupta says: ==================== Introducing Broadcom BNGE Ethernet Driver This patch series introduces the Ethernet driver for Broadcom’s BCM5770X chip family, which supports 50/100/200/400/800 Gbps link speeds. The driver is built as the bng_en.ko kernel module. To keep the series within a reviewable size (~5K lines of code), this initial submission focuses on the core infrastructure and initialization, including: 1) PCIe support (device IDs, probe/remove) 2) Devlink support 3) Firmware communication mechanism 4) Creation of network device 5) PF Resource management (rings, IRQs, etc. for netdev & aux dev) Support for Tx/Rx datapaths, link management, ethtool/devlink operations and additional features will be introduced in the subsequent patch series. The bng_en driver shares the bnxt_hsi.h file with the bnxt_en driver, as the bng_en driver leverages the hardware communication protocol used by the bnxt_en driver. ==================== Link: https://patch.msgid.link/20250701143511.280702-1-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:09 -07:00
Vikas Gupta	13a68c1ed7	bng_en: Add a network device Add a network device with netdev features enabled. Some features are enabled based on the capabilities advertised by the firmware. Add the skeleton of minimal netdev operations. Additionally, initialize the parameters for rings (TX/RX/Completion). Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-11-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:01 -07:00
Vikas Gupta	3fa9e977a0	bng_en: Initialize default configuration Query resources from the firmware and, based on the availability of resources, initialize the default settings. The default settings include: 1. Rings and other resource reservations with the firmware. This ensures that resources are reserved before network and auxiliary devices are created. 2. Mapping the BAR, which helps access doorbells since its size is known after querying the firmware. 3. Retrieving the TCs and hardware CoS queue mappings. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-10-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:01 -07:00
Vikas Gupta	18a975389f	bng_en: Add irq allocation support Add irq allocation functions. This will help to allocate IRQs to both netdev and RoCE aux devices. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-9-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:01 -07:00
Vikas Gupta	627c67f038	bng_en: Add resource management support Get the resources and capabilities from the firmware. Add functions to manage the resources with the firmware. These functions will help netdev reserve the resources with the firmware before registering the device in future patches. The resources and their information, such as the maximum available and reserved, are part of the members present in the bnge_hw_resc struct. The bnge_reserve_rings() function also populates the RSS table entries once the RX rings are reserved with the firmware. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-8-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:01 -07:00
Vikas Gupta	29c5b358f3	bng_en: Add backing store support Backing store or context memory on the host helps the device to manage rings, stats and other resources. Context memory is allocated with the help of ring alloc/free functions. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-7-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:01 -07:00
Vikas Gupta	27544c0ecb	bng_en: Add ring memory allocation support Add ring allocation/free mechanism which help to allocate rings (TX/RX/Completion) and backing stores memory on the host for the device. Future patches will use these functions. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-6-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:00 -07:00
Vikas Gupta	fb7d8b61c1	bng_en: Add initial interaction with firmware Query firmware with the help of basic firmware commands and cache the capabilities. With the help of basic commands start the initialization process of the driver with the firmware. Since basic information is available from the firmware, register with devlink. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-5-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:00 -07:00
Vikas Gupta	7037d1d897	bng_en: Add firmware communication mechanism Add support to communicate with the firmware. Future patches will use these functions to send the messages to the firmware. Functions support allocating request/response buffers to send a particular command. Each command has certain timeout value to which the driver waits for response from the firmware. In error case, commands may be either timed out waiting on response from the firmware or may return a specific error code. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-4-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:00 -07:00
Vikas Gupta	9099bfa115	bng_en: Add devlink interface Allocate a base device and devlink interface with minimal devlink ops. Add dsn and board related information. Map PCIe BAR (bar0), which helps to communicate with the firmware. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-3-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:00 -07:00
Vikas Gupta	74715c4ab0	bng_en: Add PCI interface Add basic pci interface to the driver which supports the BCM5770X NIC family. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Rajashekar Hudumula <rajashekar.hudumula@broadcom.com> Link: https://patch.msgid.link/20250701143511.280702-2-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:54:00 -07:00
Jakub Kicinski	11bd57844f	Merge branch 'netpoll-factor-out-functions-from-netpoll_send_udp-and-add-ipv6-selftest' Breno Leitao says: ==================== netpoll: Factor out functions from netpoll_send_udp() and add ipv6 selftest Refactors the netpoll UDP transmit path to improve code clarity, maintainability, and protocol-layer encapsulation. Function netpoll_send_udp() has more than 100 LoC, which is hard to understand and review. After this patchset, it has only 32 LoC, which is more manageable. The series systematically moves the construction of protocol headers (UDP, IPv4, IPv6, Ethernet) out of the core `netpoll_send_udp()` function into dedicated static helpers: - `push_udp()` for UDP header setup - `push_ipv4()` and `push_ipv6()` for IP header setup - `push_eth()` for Ethernet header setup This results in a clean, layered abstraction that mirrors the protocol stack, reduces code duplication, and improves readability. Also, to make sure this is not breaking anything, add IPv6 selftest to netconsole tests, which will exercise this code. This test would also pick problems similiar to the one fixed by `f599020702` ("net: netpoll: Initialize UDP checksum field before checksumming"), which was embarrassin we didn't have a selftest catch it. Anyway, there are no functional changes intended in this patchset. v1: https://lore.kernel.org/20250627-netpoll_untagle_ip-v1-0-61a21692f84a@debian.org ==================== Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-0-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:53:00 -07:00
Breno Leitao	3dc6c76391	selftests: net: Add IPv6 support to netconsole basic tests Add IPv6 support to the netconsole basic functionality tests by: - Introducing separate IPv4 and IPv6 address variables (SRCIP4/SRCIP6, DSTIP4/DSTIP6) to replace the single SRCIP/DSTIP variables - Adding select_ipv4_or_ipv6() function to choose protocol version - Updating socat configuration to use UDP6-LISTEN for IPv6 tests - Adding wait_for_port() wrapper to handle protocol-specific port waiting - Expanding test matrix to run both basic and extended formats against both IPv4 and IPv6 protocols - Improving cleanup to kill any remaining socat processes - Adding sleep delays for better IPv6 packet handling reliability The test now validates netconsole functionality across both IP versions, improving test coverage for dual-stack network environments. This test would avoid the regression fixed by commit `f599020702` ("net: netpoll: Initialize UDP checksum field before checksumming") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-7-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:57 -07:00
Breno Leitao	eb4e773f13	netpoll: move Ethernet setup to push_eth() helper Refactor Ethernet header population into dedicated function, completing the layered abstraction with: - push_eth() for link layer - push_udp() for transport - push_ipv4()/push_ipv6() for network Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-6-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:56 -07:00
Breno Leitao	cacfb1f4e9	netpoll: factor out UDP header setup into push_udp() helper Move UDP header construction from netpoll_send_udp() into a new static helper function push_udp(). This completes the protocol layer refactoring by: 1. Creating a dedicated helper for UDP header assembly 2. Removing UDP-specific logic from the main send function 3. Establishing a consistent pattern with existing IPv4/IPv6 helpers: - push_udp() - push_ipv4() - push_ipv6() The change improves code organization and maintains the encapsulation pattern established in previous refactorings. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-5-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:56 -07:00
Breno Leitao	8c27639dbe	netpoll: factor out IPv4 header setup into push_ipv4() helper Move IPv4 header construction from netpoll_send_udp() into a new static helper function push_ipv4(). This completes the refactoring started with IPv6 header handling, creating symmetric helper functions for both IP versions. Changes include: 1. Extracting IPv4 header setup logic into push_ipv4() 2. Replacing inline IPv4 code with helper call 3. Moving eth assignment after helper calls for consistency The refactoring reduces code duplication and improves maintainability by isolating IP version-specific logic. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-4-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:56 -07:00
Breno Leitao	839388f39a	netpoll: factor out IPv6 header setup into push_ipv6() helper Move IPv6 header construction from netpoll_send_udp() into a new static helper function, push_ipv6(). This refactoring reduces code duplication and improves readability in netpoll_send_udp(). Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-3-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:55 -07:00
Breno Leitao	01dae7a61c	netpoll: factor out UDP checksum calculation into helper Extract UDP checksum calculation logic from netpoll_send_udp() into a new static helper function netpoll_udp_checksum(). This reduces code duplication and improves readability for both IPv4 and IPv6 cases. No functional change intended. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-2-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:55 -07:00
Breno Leitao	4b52cdfcce	netpoll: Improve code clarity with explicit struct size calculations Replace pointer-dereference sizeof() operations with explicit struct names for improved readability and maintainability. This change: 1. Replaces `sizeof(udph)` with `sizeof(struct udphdr)` 2. Replaces `sizeof(ip6h)` with `sizeof(struct ipv6hdr)` 3. Replaces `sizeof(*iph)` with `sizeof(struct iphdr)` This will make it easy to move code in the upcoming patches. No functional changes are introduced by this patch. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250702-netpoll_untagle_ip-v2-1-13cf3db24e2b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:52:55 -07:00
Jakub Kicinski	49402a628e	Merge branch 'net-ethernet-mtk_eth_soc-improve-device-tree-handling' Daniel Golle says: ==================== net: ethernet: mtk_eth_soc: improve device tree handling This series further improves the mtk_eth_soc driver in preparation to complete upstream support for the MediaTek MT7988 SoC family. Frank Wunderlich's previous attempt to have the ethernet node included in mt7988a.dtsi and cover support for MT7988 in the device tree bindings was criticized for the way mtk_eth_soc references SRAM in device tree[1]. Having a 2nd 'reg' property, like introduced by commit `ebb1e4f9cf` ("net: ethernet: mtk_eth_soc: add support for in-SoC SRAM") isn't acceptable and a dedicated "mmio-sram" node should be used instead. In order to make the code more clean and readable, the existing hardcoded offsets for the scratch ring, RX and TX rings are dropped in favor of using the generic allocator. However, support for the hardcoded offset of the SRAM itself being included as part of the Ethernet's "reg" MMIO space is kept as it will still be required in order to support existing legacy device trees of the MT7986 SoC family. While at it also replace confusing error messages when using legacy device trees without "interrupt-names" with a warning informing users that they are using a legacy device tree. [1]: https://patchwork.ozlabs.org/comment/3533543/ ==================== Link: https://patch.msgid.link/cover.1751461762.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:50:49 -07:00
Daniel Golle	04c7aaccdc	net: ethernet: mtk_eth_soc: use generic allocator for SRAM Use a dedicated "mmio-sram" node and the generic allocator instead of open-coding SRAM allocation for DMA rings. Keep support for legacy device trees but notify the user via a warning to update, and let the ethernet driver create the gen_pool in this case. Co-developed-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/c2b9242229d06af4e468204bcf42daa1535c3a72.1751461762.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:50:45 -07:00
Daniel Golle	d717d32f51	net: ethernet: mtk_eth_soc: fix kernel-doc comment Fix and add some missing field descriptions to kernel-doc comment of struct mtk_eth. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/748e7de848e45ecdc84fbb78e34e9e13b9aa4329.1751461762.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:50:45 -07:00
Daniel Golle	e81d36d488	net: ethernet: mtk_eth_soc: improve support for named interrupts Use platform_get_irq_byname_optional() to avoid outputting error messages when using legacy device trees which rely identifying interrupts only by index. Instead, output a warning notifying the user to update their device tree. Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/aeccd00eccb7186d39d2c16292019b3b22ec53b8.1751461762.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:50:45 -07:00
Eric Dumazet	6058099da5	net: remove RTNL use for /proc/sys/net/core/rps_default_mask Use a dedicated mutex instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250702061558.1585870-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:42:12 -07:00
Byungchul Park	d8bf56a0ca	page_pool: make page_pool_get_dma_addr() just wrap page_pool_get_dma_addr_netmem() The page pool members in struct page cannot be removed unless it's not allowed to access any of them via struct page. Do not access 'page->dma_addr' directly in page_pool_get_dma_addr() but just wrap page_pool_get_dma_addr_netmem() safely. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/20250702053256.4594-6-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:40:10 -07:00
Byungchul Park	4369d40da2	netmem: use _Generic to cover const casting for page_to_netmem() The current page_to_netmem() doesn't cover const casting resulting in trying to cast const struct page * to const netmem_ref fails. To cover the case, change page_to_netmem() to use macro and _Generic. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://patch.msgid.link/20250702053256.4594-5-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:40:10 -07:00
Byungchul Park	b56ce86846	page_pool: rename __page_pool_alloc_pages_slow() to __page_pool_alloc_netmems_slow() Now that __page_pool_alloc_pages_slow() is for allocating netmem, not struct page, rename it to __page_pool_alloc_netmems_slow() to reflect what it does. Signed-off-by: Byungchul Park <byungchul@sk.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250702053256.4594-4-byungchul@sk.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-07-07 18:40:09 -07:00

1 2 3 4 5 ...

1369309 Commits