Russell King suggested that felix_vsc9959, seville_vsc9953 and
ocelot_ext have a large portion of duplicated init and teardown code,
which could be made common [1]. The teardown code could even be
simplified away if we made use of devres, something which is used here
and there in the felix driver, just not very consistently.
[1] https://lore.kernel.org/all/Zh1GvcOTXqb7CpQt@shell.armlinux.org.uk/
Prepare the ground in the ocelot_ext driver, by allocating the data
structures using devres and deleting the kfree() calls. This also
deletes the "Failed to allocate ..." message, since memory allocation
errors are extremely loud anyway, and it's hard to miss them.
Suggested-by: "Russell King (Oracle)" <linux@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Colin Foster <colin.foster@in-advantage.com>
Tested-by: Colin Foster <colin.foster@in-advantage.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guangguan Wang says:
====================
net/smc: Change the upper boundary of SMC-R's snd_buf and rcv_buf to 512MB
SMCR_RMBE_SIZES is the upper boundary of SMC-R's snd_buf and rcv_buf.
The maximum bytes of snd_buf and rcv_buf can be calculated by 2^SMCR_
RMBE_SIZES * 16KB. SMCR_RMBE_SIZES = 5 means the upper boundary is 512KB.
TCP's snd_buf and rcv_buf max size is configured by net.ipv4.tcp_w/rmem[2]
whose default value is 4MB or 6MB, is much larger than SMC-R's upper
boundary.
In some scenarios, such as Recommendation System, the communication
pattern is mainly large size send/recv, where the size of snd_buf and
rcv_buf greatly affects performance. Due to the upper boundary
disadvantage, SMC-R performs poor than TCP in those scenarios. So it
is time to enlarge the upper boundary size of SMC-R's snd_buf and rcv_buf,
so that the SMC-R's snd_buf and rcv_buf can be configured to larger size
for performance gain in such scenarios.
The SMC-R rcv_buf's size will be transferred to peer by the field
rmbe_size in clc accept and confirm message. The length of the field
rmbe_size is four bits, which means the maximum value of SMCR_RMBE_SIZES
is 15. In case of frequently adjusting the value of SMCR_RMBE_SIZES
in different scenarios, set the value of SMCR_RMBE_SIZES to the maximum
value 15, which means the upper boundary of SMC-R's snd_buf and rcv_buf
is 512MB. As the real memory usage is determined by the value of
net.smc.w/rmem, not by the upper boundary, set the value of SMCR_RMBE_SIZES
to the maximum value has no side affects.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
SMCR_RMBE_SIZES is the upper boundary of SMC-R's snd_buf and rcv_buf.
The maximum bytes of snd_buf and rcv_buf can be calculated by 2^SMCR_
RMBE_SIZES * 16KB. SMCR_RMBE_SIZES = 5 means the upper boundary is 512KB.
TCP's snd_buf and rcv_buf max size is configured by net.ipv4.tcp_w/rmem[2]
whose default value is 4MB or 6MB, is much larger than SMC-R's upper
boundary.
In some scenarios, such as Recommendation System, the communication
pattern is mainly large size send/recv, where the size of snd_buf and
rcv_buf greatly affects performance. Due to the upper boundary
disadvantage, SMC-R performs poor than TCP in those scenarios. So it
is time to enlarge the upper boundary size of SMC-R's snd_buf and rcv_buf,
so that the SMC-R's snd_buf and rcv_buf can be configured to larger size
for performance gain in such scenarios.
The SMC-R rcv_buf's size will be transferred to peer by the field
rmbe_size in clc accept and confirm message. The length of the field
rmbe_size is four bits, which means the maximum value of SMCR_RMBE_SIZES
is 15. In case of frequently adjusting the value of SMCR_RMBE_SIZES
in different scenarios, set the value of SMCR_RMBE_SIZES to the maximum
value 15, which means the upper boundary of SMC-R's snd_buf and rcv_buf
is 512MB. As the real memory usage is determined by the value of
net.smc.w/rmem, not by the upper boundary, set the value of SMCR_RMBE_SIZES
to the maximum value has no side affects.
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Co-developed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
SG_MAX_SINGLE_ALLOC is used to limit maximum number of entries that
will be allocated in one piece of scatterlist. When the entries of
scatterlist exceeds SG_MAX_SINGLE_ALLOC, sg chain will be used. From
commit 7c703e54cc ("arch: switch the default on ARCH_HAS_SG_CHAIN"),
we can know that the macro CONFIG_ARCH_NO_SG_CHAIN is used to identify
whether sg chain is supported. So, SMC-R's rmb buffer should be limited
by SG_MAX_SINGLE_ALLOC only when the macro CONFIG_ARCH_NO_SG_CHAIN is
defined.
Fixes: a3fe3d01bd ("net/smc: introduce sg-logic for RMBs")
Signed-off-by: Guangguan Wang <guangguan.wang@linux.alibaba.com>
Co-developed-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: Wen Gu <guwen@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When splice() support was added in commit 2b514574f7 ("net:
af_unix: implement splice for stream af_unix sockets"), we had
to release unix_sk(sk)->readlock (current iolock) before calling
splice_to_pipe().
Due to the unlock, commit 73ed5d25dc ("af-unix: fix use-after-free
with concurrent readers while splicing") added a safeguard in
unix_stream_read_generic(); we had to bump the skb refcount before
calling ->recv_actor() and then check if the skb was consumed by a
concurrent reader.
However, the pipe side locking was refactored, and since commit
25869262ef ("skb_splice_bits(): get rid of callback"), we can
call splice_to_pipe() without releasing unix_sk(sk)->iolock.
Now, the skb is always alive after the ->recv_actor() callback,
so let's remove the unnecessary drop_skb logic.
This is mostly the revert of commit 73ed5d25dc ("af-unix: fix
use-after-free with concurrent readers while splicing").
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20240529144648.68591-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Rengarajan says:
====================
lan78xx: Enable 125 MHz CLK and Auto Speed configuration for LAN7801 if NO EEPROM is detected
This patch series adds the support for 125 MHz clock, Auto speed and
auto duplex configuration for LAN7801 in the absence of EEPROM.
====================
Link: https://lore.kernel.org/r/20240529140256.1849764-1-rengarajan.s@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Matteo Croce says:
====================
net: visibility of memory limits in netns
Some programs need to know the size of the network buffers to operate
correctly, export the following sysctls read-only in network namespaces:
- net.core.rmem_default
- net.core.rmem_max
- net.core.wmem_default
- net.core.wmem_max
====================
Link: https://lore.kernel.org/r/20240530232722.45255-1-technoboy85@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jacob Keller says:
====================
ice: Introduce ETH56G PHY model for E825C products
E825C products have a different PHY model than E822, E823 and E810 products.
This PHY is ETH56G and its support is necessary to have functional PTP stack
for E825C products.
This series refactors the ice driver to add support for the new PHY model.
Karol introduces the ice_ptp_hw structure. This is used to replace some
hard-coded values relating to the PHY quad and port numbers, as well as to
hold the phy_model type.
Jacob refactors the driver code that converts between the ice_ptp_tmr_cmd
enumeration and hardware register values to better re-use logic and reduce
duplication when introducing another PHY type.
Sergey introduces functions to help enable and disable the Tx timestamp
interrupts. This makes the ice_ptp.c code more generic and encapsulates the
PHY specifics into ice_ptp_hw.c
Karol introduces helper functions to clear the valid bits for Tx and Rx
timestamps. This enables informing hardware to discard stale timestamps
after performing clock operations.
Sergey moves the Clock Generation Unit (CGU) logic out of the E822 specific
area of the ice_ptp_hw.c file as it will be re-used for other device PHY
models.
Jacob introduces a helper function for obtaining the base increment values,
moving this logic out of ice_ptp.c and into the ice_ptp_hw.c file to better
encapsulate hardware differences.
Sergey builds on these refactors to introduce the new ETH56G PHY model used
by the E825C products. This includes introducing the required helpers,
constants, and PHY model checks.
Karol simplifies the CGU logic by using anonymous structures, dropping an
unnecessary ".field" name for accessing the CGU data.
Michal Michalik updates the CGU logic to support the E825C hardware,
ensuring that the clock generation is configured properly.
Grzegorz Nitka adds support to read the NAC topology data from the device.
This is in preparation for supporting devices which combine two NACs
together, connecting all ports to the same clock source. This enables the
driver to determine if its operating on such a device, or if its operating
on the standard 1-NAC configuration.
Grzsecgorz Nitka adjusts the PTP initialization to prepare for the 2x50G
E825C devices, introducing special mapping for the PHY ports to prepare for
support of the 2-NAC devices.
With this, the ice driver is capable of handling PTP for the single-NAC
E825C devices. Complete support for the 2-NAC devices requirs some work on
how the ports connect to the clock owner. During review of this work, it
was pointed out that our existing use of auxiliary bus is disliked, and
Jiri requested that we change it. We are currently working on developing a
replacement solution for the auxiliary bus implementation and have dropped
the relevant changes out of this series. A future series will refactor the
port to clock connection, at which time we will finish the support for
2-NAC E825C devices.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
====================
Link: https://lore.kernel.org/r/20240528-next-2024-05-28-ptp-refactors-v1-0-c082739bb6f6@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
>From FW/HW perspective, 2 port topology in E825C devices requires
merging of 2 port mapping internally and breakout mapping externally.
As a consequence, it requires different port numbering from PTP code
perspective.
For that topology, pf_id can not be used to index PTP ports. Even if
the 2nd port is identified as port with pf_id = 1, all PHY operations
need to be performed as it was port 2. Thus, special mapping is needed
for the 2nd port.
This change adds detection of 2x50G topology and applies 'custom'
mapping on the 2nd port.
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240528-next-2024-05-28-ptp-refactors-v1-11-c082739bb6f6@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Multiple places in the driver code need to convert enum ice_ptp_tmr_cmd
values into register bits for both the main timer and the PHY port
timers. The main MAC register has one bit scheme for timer commands,
while the PHY commands use a different scheme.
The E810 and E830 devices use the same scheme for port commands as used
for the main timer. However, E822 and ETH56G hardware has a separate
scheme used by the PHY.
Introduce helper functions to convert the timer command enumeration into
the register values, reducing some code duplication, and making it
easier to later refactor the individual port write commands.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20240528-next-2024-05-28-ptp-refactors-v1-2-c082739bb6f6@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Similar to what is done in other 'sysctl' pages: it looks clearer from a
readability perspective.
This might cause troubles in the short/mid-term with the backports, but
by not putting new entries at the end, this can help to reduce conflicts
in case of backports in the long term. We don't change the information
here too often, so it looks OK to do that.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240530-upstream-net-20240520-mptcp-doc-v3-2-e94cdd9f2673@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: phylink: rearrange ovr_an_inband support
This series addresses the use of the ovr_an_inband flag, which is used
by two drivers to indicate to phylink that they wish to use inband mode
without firmware specifying inband mode.
The issue with ovr_an_inband is that it overrides not only PHY mode,
but also fixed-link mode. Both of the drivers that set this flag
contain code to detect when fixed-link mode will be used, and then
either avoid setting it or explicitly clear the flag. This is
wasteful when phylink already knows this.
Therefore, the approach taken in this patch set is to replace the
ovr_an_inband flag with a default_an_inband flag which means that
phylink defaults to MLO_AN_INBAND instead of MLO_AN_PHY, and will
allow that default to be overriden if firmware specifies a fixed-link.
This allows users of ovr_an_inband to be simplified.
What's more is this requires minimal changes in phylink to allow this
new mode of operation.
This series changes phylink, and also updates the two drivers
(fman_memac and stmmac), and then removes the unnecessary complexity
from the drivers.
This series may depend on the stmmac cleanup series I've posted
earlier - this is something I have not checked, but I currently have
these patches on top of that series.
====================
Link: https://lore.kernel.org/r/ZlctinnTT8Xhemsm@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Of the two users of phylink_config->ovr_an_inband, both manually check
for a fixed link before setting this flag (or clearing it if they find
a fixed link.) This is unnecessary complication.
Test ovr_an_inband before checking for the fixed-link properties, which
will allow ovr_an_inband to be overriden by a fixed link specification.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Link: https://lore.kernel.org/r/E1sCJMq-00Ecqv-P8@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Of the two users of phylink_config->ovr_an_inband, both manually check
for a fixed link before setting this flag (or clearing it if they find
a fixed link.) This is unnecessary complication.
Rearrange phylink_parse_mode() a little so we can change how
phylink_config->ovr_an_inband works. This will allow the flag to be
tested before checking for the fixed link properties in the next patch.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Link: https://lore.kernel.org/r/E1sCJMl-00Ecqp-K0@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Russell King says:
====================
net: stmmac: cleanups
This series removes various redundant items in the stmmac driver:
- the unused TBI and RTBI PCS flags
- the NULL pointer initialisations for PCS methods in dwxgmac2
- the stmmac_pcs_rane() method which is never called, and it's
associated implementations
- the redundant netif_carrier_off()s
Finally, it replaces asm/io.h with the preferred linux/io.h.
====================
Link: https://lore.kernel.org/r/Zlbp7xdUZAXblOZJ@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It is incorrect to call netif_carrier_off(), or in fact any driver
teardown, before unregister_netdev() has been called.
unregister_netdev() unpublishes the network device from userspace, and
takes the interface down if it was up prior to returning. Therefore,
once the call has returned, we are guaranteed that .ndo_stop() will
have been called for an interface that was up. Phylink will take the
carrier down via phylink_stop(), making any manipulation of the carrier
in the remove path unnecessary.
In the stmmac_release() path, the netif_carrier_off() call follows the
call to phylink_stop(), so this call is redundant.
Reviewed-by: Andrew Halaney <ahalaney@redhat.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/E1sCErZ-00EOPx-PF@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>