Commit Graph

1074187 Commits

Author SHA1 Message Date
Vladimir Oltean
36fac35b29 net: mscc: ocelot: delete OCELOT_MRP_CPUQ
MRP frames are configured to be trapped to the CPU queue 7, and this
number is reflected in the extraction header. However, the information
isn't used anywhere, so just leave MRP frames to go to CPU queue 0
unless needed otherwise.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
c518afec28 net: mscc: ocelot: consolidate cookie allocation for private VCAP rules
Every use case that needed VCAP filters (in order: DSA tag_8021q, MRP,
PTP traps) has hardcoded filter identifiers that worked well enough for
that use case alone. But when two or more of those use cases would be
used together, some of those identifiers would overlap, leading to
breakage.

Add definitions for each cookie and centralize them in ocelot_vcap.h,
such that the overlaps are more obvious.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
e3c02b7c65 net: mscc: ocelot: use a consistent cookie for MRP traps
The driver uses an identifier equal to (ocelot->num_phys_ports + port)
for MRP traps installed when the system is in the role of an MRC, and an
identifier equal to (port) otherwise.

Use the same identifier in both cases as a consolidation for the various
cookie values spread throughout the driver.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:50 +00:00
David S. Miller
c8b441d2fb Merge tag 'mlx5-updates-2022-02-16' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:

====================
mlx5-updates-2022-02-16

Misc updates for mlx5:
1) Alex Liu Adds support for using xdp->data_meta

2) Aya Levin Adds PTP counters and port time stamp mode for representors and
switchdev mode.

3) Tariq Toukan, Striding RQ simple improvements.

4) Roi Dayan (7): Create multiple attr instances per flow

Some TC actions use post actions for their implementation.
For example CT and sample actions.

Create a new flow attr instance after each multi table action and
create a post action rule for it as a generic parsing step.
Now multi table actions like CT, sample don't require to do it.

When flow has multiple attr instances, the first flow attr is being
offloaded normally and linked to the next attr (post action rule) by
setting an id on reg_c for matching.
Post action rule (rule created from second attr instance) match the
id on reg_c and does rest of the actions.

Example rule with actions CT,goto will be created with 2 attr instances
as following: attr1(CT)->attr2(goto)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 10:57:21 +00:00
Roi Dayan
b070e70381 net/mlx5e: TC, Allow sample action with CT
Allow sample+CT actions but still block sample+CT NAT
as it is not supported.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:17 -08:00
Roi Dayan
7843bd6040 net/mlx5e: TC, Make post_act parse CT and sample actions
Before this commit post_act can be used for normal rules
and didn't handle special cases like CT and sample.
With this commit post_act rule can also handle the special cases
when needed.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:17 -08:00
Roi Dayan
2a829fe25d net/mlx5e: TC, Clean redundant counter flag from tc action parsers
When tc actions being parsed only the last flow attr created needs the
counter flag and the previous flags being reset.
Clean the flag from the tc action parsers.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:17 -08:00
Roi Dayan
a81283263b net/mlx5e: Use multi table support for CT and sample actions
CT and sample actions use post actions for their implementation.
Flag those actions as multi table actions so the post act infrastructure
will handle the post actions allocation.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:16 -08:00
Roi Dayan
8300f22526 net/mlx5e: Create new flow attr for multi table actions
Some TC actions use post actions for their implementation.
For example CT and sample actions.

Create a new flow attr after each multi table action and
create a post action rule for it.

First flow attr being offloaded normally and linked to the next
attr (post action rule) with setting an id on reg_c.
Post action rules match the id on reg_c and continue to the next one.

The flow counter is allocated on the last rule.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:16 -08:00
Roi Dayan
314e110583 net/mlx5e: Add post act offload/unoffload API
Introduce mlx5e_tc_post_act_offload() and mlx5e_tc_post_act_unoffload()
to be able to unoffload and reoffload existing post action rules handles.
For example in neigh update events, the driver removes and readds rules in
hardware.

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:16 -08:00
Roi Dayan
0610f8dc03 net/mlx5e: Pass actions param to actions_match_supported()
Currently the mlx5_flow object contains a single mlx5_attr instance.
However, multi table actions (e.g. CT) instantiate multiple attr instances.

Currently action_match_supported() reads the actions flag from the
flow's attribute instance. Modify the function to receive the action
flags as a parameter which is set by the calling function and
pass the aggregated actions to actions_match_supported().

Signed-off-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:15 -08:00
Paul Blakey
d1a3138f79 net/mlx5e: TC, Move flow hashtable to be per rep
To allow shared tc block offload between two or more reps of the
same eswitch, move the tc flow hashtable to be per rep, instead
of per eswitch.

Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:15 -08:00
Aya Levin
bfbdd77ac5 net/mlx5e: E-Switch, Add support for tx_port_ts in switchdev mode
When turning on tx_port_ts (private flag) a PTP-SQ is created. Consider
this queue when adding rules matching SQs to VPORTs. Otherwise the
traffic on this queue won't reach the wire.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:15 -08:00
Aya Levin
7c5f940d26 net/mlx5e: E-Switch, Add PTP counters for uplink representor
There is a configuration where the uplink interface is the synchronizer.
Add PTP counters for this interface for monitoring.

Signed-off-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:14 -08:00
Tariq Toukan
4b5fba4a3a net/mlx5e: RX, Restrict bulk size for small Striding RQs
In RQs of type multi-packet WQE (Striding RQ), each WQE is relatively
large (typically 256KB) but their number is relatively small (8 in
default).

Re-mapping the descriptors' buffers before re-posting them is done via
UMR (User-Mode Memory Registration) operations.

On the one hand, posting UMR WQEs in bulks reduces communication overhead
with the HW and better utilizes its processing units.
On the other hand, delaying the WQE repost operations for a small RQ
(say, of 4 WQEs) might drastically hit its performance, causing packet
drops due to no receive buffer, for high or bursty incoming packets rate.

Here we restrict the bulk size for too small RQs. Effectively, with the current
constants, RQ of size 4 (minimum allowed) would have no bulking, while larger
RQs will continue working with bulks of 2.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:14 -08:00
Tariq Toukan
1d5024f88d net/mlx5e: Default to Striding RQ when not conflicting with CQE compression
CQE compression is turned on by default on slow pci systems to help
reduce the load on pci.
In this case, Striding RQ was turned off as CQEs of packets that span
several strides were not compressed, significantly reducing the compression
effectiveness.
This issue does not exist when using the newer mini_cqe format "stride_index".
Hence, allow defaulting to Striding RQ in this case.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:14 -08:00
Tariq Toukan
b98d2d722f net/mlx5e: Generalize packet merge error message
Update the old error message for LRO state modify with the new
general name.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Khalid Manaa <khalidm@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:14 -08:00
Alex Liu
c1e80bf4ad net/mlx5e: Add support for using xdp->data_meta
Add support for using xdp->data_meta for cross-program communication

Pass "true" to the last argument of xdp_prepare_buff().

After SKB is built, call skb_metadata_set() if metadata was pushed.

Signed-off-by: Alex Liu <liualex@fb.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:13 -08:00
Colin Ian King
9625bf39bd net/mlx5e: Fix spelling mistake "supoported" -> "supported"
There is a spelling mistake in a NL_SET_ERR_MSG_MOD error
message.  Fix it.

Fixes: 3b49a7edec ("net/mlx5e: TC, Reject rules with multiple CT actions")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-02-16 23:55:13 -08:00
Petr Machata
22b67d1719 net: rtnetlink: rtnl_stats_get(): Emit an extack for unset filter_mask
Both get and dump handlers for RTM_GETSTATS require that a filter_mask, a
mask of which attributes should be emitted in the netlink response, is
unset. rtnl_stats_dump() does include an extack in the bounce,
rtnl_stats_get() however does not. Fix the omission.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/01feb1f4bbd22a19f6629503c4f366aed6424567.1645020876.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:56:21 -08:00
Jakub Kicinski
4d078475bd Merge branch 'mptcp-so_sndtimeo-and-misc-cleanup'
Mat Martineau says:

====================
mptcp: SO_SNDTIMEO and misc. cleanup

Patch 1 adds support for the SO_SNDTIMEO socket option on MPTCP sockets.

The remaining patches are various small cleanups:

Patch 2 removes an obsolete declaration.

Patches 3 and 5 remove unnecessary function parameters.

Patch 4 removes an extra cast.

Patches 6 and 7 add some const and ro_after_init modifiers.

Patch 8 removes extra storage of TCP helpers.
====================

Link: https://lore.kernel.org/r/20220216021130.171786-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:10 -08:00
Florian Westphal
952382c648 mptcp: don't save tcp data_ready and write space callbacks
Assign the helpers directly rather than save/restore in the context
structure.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:06 -08:00
Florian Westphal
51fa7f8ebf mptcp: mark ops structures as ro_after_init
These structures are initialised from the init hooks, so we can't make
them 'const'.  But no writes occur afterwards, so we can use ro_after_init.

Also, remove bogus EXPORT_SYMBOL, the only access comes from ip
stack, not from kernel modules.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:05 -08:00
Paolo Abeni
90d9308821 mptcp: constify a bunch of of helpers
A few pm-related helpers don't touch arguments which lacking
the const modifier, let's constify them.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:05 -08:00
Geliang Tang
af7939f390 mptcp: drop port parameter of mptcp_pm_add_addr_signal
Drop the port parameter of mptcp_pm_add_addr_signal() and reflect it to
avoid passing too many parameters.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:05 -08:00
Geliang Tang
742e2f36c0 mptcp: drop unneeded type casts for hmac
Drop the unneeded type casts to 'unsigned long long' for printing out the
hmac values in add_addr_hmac_valid() and subflow_thmac_valid().

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:04 -08:00
Geliang Tang
0799e21b5a mptcp: drop unused sk in mptcp_get_options
The parameter 'sk' became useless since the code using it was dropped
from mptcp_get_options() in the commit 8d548ea1dd ("mptcp: do not set
unconditionally csum_reqd on incoming opt"). Let's drop it.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:04 -08:00
Matthieu Baerts
f8e9ce4a6e mptcp: mptcp_parse_option is no longer exported
Options parsing in now done from mptcp_incoming_options().

mptcp_parse_option() has been removed from mptcp.h when CONFIG_MPTCP is
defined but not when it is not.

Fixes: cfde141ea3 ("mptcp: move option parsing into mptcp_incoming_options()")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:04 -08:00
Geliang Tang
d6ab5ea2a3 mptcp: add SNDTIMEO setsockopt support
Add setsockopt support for SO_SNDTIMEO_OLD and SO_SNDTIMEO_NEW to fix this
error reported by the mptcp bpf selftest:

 (network_helpers.c:64: errno: Operation not supported) Failed to set SO_SNDTIMEO
 test_mptcp:FAIL:115

 All error logs:

 (network_helpers.c:64: errno: Operation not supported) Failed to set SO_SNDTIMEO
 test_mptcp:FAIL:115
 Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:52:03 -08:00
Yang Li
8c16baa51b net: Fix an ignored error return from dm9051_get_regs()
The return from the call to dm9051_get_regs() is int, it can be
a negative error code, however this is being assigned to an unsigned
int variable 'ret', so making 'ret' an int.

Eliminate the following coccicheck warning:
./drivers/net/ethernet/davicom/dm9051.c:527:5-8: WARNING: Unsigned
expression compared with zero: ret < 0

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220216014507.109117-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:47:52 -08:00
Vladimir Oltean
c862033595 net: dsa: tag_8021q: only call skb_push/skb_pull around __skb_vlan_pop
__skb_vlan_pop() needs skb->data to point at the mac_header, while
skb_vlan_tag_present() and skb_vlan_tag_get() don't, because they don't
look at skb->data at all.

So we can avoid uselessly moving around skb->data for the case where the
VLAN tag was offloaded by the DSA master.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220215204722.2134816-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:35:35 -08:00
Tom Rix
a5e516d026 net: ethernet: altera: cleanup comments
Replacements:
queueing to queuing
trasfer to transfer
aditional to additional
adaptor to adapter
transactino to transaction

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20220215213802.3043178-1-trix@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:33:04 -08:00
D. Wythe
1ce2204706 net/smc: return ETIMEDOUT when smc_connect_clc() timeout
When smc_connect_clc() times out, it will return -EAGAIN(tcp_recvmsg
retuns -EAGAIN while timeout), then this value will passed to the
application, which is quite confusing to the applications, makes
inconsistency with TCP.

From the manual of connect, ETIMEDOUT is more suitable, and this patch
try convert EAGAIN to ETIMEDOUT in that case.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
Reviewed-by: Karsten Graul <kgraul@linux.ibm.com>
Link: https://lore.kernel.org/r/1644913490-21594-1-git-send-email-alibuda@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:31:39 -08:00
YueHaibing
8aa69d3482 net: hns3: Remove unused inline function hclge_is_reset_pending()
This is unused since commit 8e2288cad6 ("net: hns3: refactor PF
cmdq init and uninit APIs with new common APIs").

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Jie Wang <wangjie125@huawei.com>
Link: https://lore.kernel.org/r/20220216113507.22368-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 18:34:48 -08:00
David S. Miller
f0ead99e62 Merge branch 'Replay-and-offload-host-VLAN-entries-in-DSA'
Vladimir Oltean says:

====================
Replay and offload host VLAN entries in DSA

v2->v3:
- make the bridge stop notifying switchdev for !BRENTRY VLANs
- create precommit and commit wrappers around __vlan_add_flags().
- special-case the BRENTRY transition from false to true, instead of
  treating it as a change of flags and letting drivers figure out that
  it really isn't.
- avoid setting *changed unless we know that functions will not error
  out later.
- drop "old_flags" from struct switchdev_obj_port_vlan, nobody needs it
  now, in v2 only DSA needed it to filter out BRENTRY transitions, that
  is now solved cleaner.
- no BRIDGE_VLAN_INFO_BRENTRY flag checks and manipulations in DSA
  whatsoever, use the "bool changed" bit as-is after changing what it
  means.
- merge dsa_slave_host_vlan_{add,del}() with
  dsa_slave_foreign_vlan_{add,del}(), since now they do the same thing,
  because the host_vlan functions no longer need to mangle the vlan
  BRENTRY flags and bool changed.

v1->v2:
- prune switchdev VLAN additions with no actual change differently
- no longer need to revert struct net_bridge_vlan changes on error from
  switchdev
- no longer need to first delete a changed VLAN before readding it
- pass 'bool changed' and 'u16 old_flags' through switchdev_obj_port_vlan
  so that DSA can do some additional post-processing with the
  BRIDGE_VLAN_INFO_BRENTRY flag
- support VLANs on foreign interfaces
- fix the same -EOPNOTSUPP error in mv88e6xxx, this time on removal, due
  to VLAN deletion getting replayed earlier than FDB deletion

The motivation behind these patches is that Rafael reported the
following error with mv88e6xxx when the first switch port joins a
bridge:

mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95 (-EOPNOTSUPP)

The FDB entry that's added is the MAC address of the bridge, in VID 1
(the default_pvid), being replayed as part of br_add_if() -> ... ->
nbp_switchdev_sync_objs().

-EOPNOTSUPP is the mv88e6xxx driver's way of saying that VID 1 doesn't
exist in the VTU, so it can't program the ATU with a FID, something
which it needs.

It appears to be a race, but it isn't, since we only end up installing
VID 1 in the VTU by coincidence. DSA's approximation of programming
VLANs on the CPU port together with the user ports breaks down with
host FDB entries on mv88e6xxx, since that strictly requires the VTU to
contain the VID. But the user may freely add VLANs pointing just towards
the bridge, and FDB entries in those VLANs, and DSA will not be aware of
them, because it only listens for VLANs on user ports.

To create a solution that scales properly to cross-chip setups and
doesn't leak entries behind, some changes in the bridge driver are
required. I believe that these are for the better overall, but I may be
wrong. Namely, the same refcounting procedure that DSA has in place for
host FDB and MDB entries can be replicated for VLANs, except that it's
garbage in, garbage out: the VLAN addition and removal notifications
from switchdev aren't balanced. So the first 2 patches attempt to deal
with that.

This patch set has been superficially tested on a board with 3 mv88e6xxx
switches in a daisy chain and appears to produce the primary desired
effect - the driver no longer returns -EOPNOTSUPP when the first port
joins a bridge, and is successful in performing local termination under
a VLAN-aware bridge.
As an additional side effect, it silences the annoying "p%d: already a
member of VLAN %d\n" warning messages that the mv88e6xxx driver produces
when coupled with systemd-networkd, and a few VLANs are configured.
Furthermore, it advances Florian's idea from a few years back, which
never got merged:
https://lore.kernel.org/lkml/20180624153339.13572-1-f.fainelli@gmail.com/
v2 has also been tested on the NXP LS1028A felix switch.

Some testing:

root@debian:~# bridge vlan add dev br0 vid 101 pvid self
[  100.709220] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 9 vlan 101
[  100.873426] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 10 vlan 101
[  100.892314] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 9 vlan 101
[  101.053392] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 10 vlan 101
[  101.076994] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_add: port 9 vlan 101
root@debian:~# bridge vlan add dev br0 vid 101 pvid self
root@debian:~# bridge vlan add dev br0 vid 101 pvid self
root@debian:~# bridge vlan
port              vlan-id
eth0              1 PVID Egress Untagged
lan9              1 PVID Egress Untagged
lan10             1 PVID Egress Untagged
lan11             1 PVID Egress Untagged
lan12             1 PVID Egress Untagged
lan13             1 PVID Egress Untagged
lan14             1 PVID Egress Untagged
lan15             1 PVID Egress Untagged
lan16             1 PVID Egress Untagged
lan17             1 PVID Egress Untagged
lan18             1 PVID Egress Untagged
lan19             1 PVID Egress Untagged
lan20             1 PVID Egress Untagged
lan21             1 PVID Egress Untagged
lan22             1 PVID Egress Untagged
lan23             1 PVID Egress Untagged
lan24             1 PVID Egress Untagged
sfp               1 PVID Egress Untagged
lan1              1 PVID Egress Untagged
lan2              1 PVID Egress Untagged
lan3              1 PVID Egress Untagged
lan4              1 PVID Egress Untagged
lan5              1 PVID Egress Untagged
lan6              1 PVID Egress Untagged
lan7              1 PVID Egress Untagged
lan8              1 PVID Egress Untagged
br0               1 Egress Untagged
                  101 PVID
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
[  108.340487] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 9 vlan 101
[  108.379167] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 10 vlan 101
[  108.402319] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_del: port 9 vlan 101
[  108.425866] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 9 vlan 101
[  108.452280] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 10 vlan 101
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
root@debian:~# bridge vlan
port              vlan-id
eth0              1 PVID Egress Untagged
lan9              1 PVID Egress Untagged
lan10             1 PVID Egress Untagged
lan11             1 PVID Egress Untagged
lan12             1 PVID Egress Untagged
lan13             1 PVID Egress Untagged
lan14             1 PVID Egress Untagged
lan15             1 PVID Egress Untagged
lan16             1 PVID Egress Untagged
lan17             1 PVID Egress Untagged
lan18             1 PVID Egress Untagged
lan19             1 PVID Egress Untagged
lan20             1 PVID Egress Untagged
lan21             1 PVID Egress Untagged
lan22             1 PVID Egress Untagged
lan23             1 PVID Egress Untagged
lan24             1 PVID Egress Untagged
sfp               1 PVID Egress Untagged
lan1              1 PVID Egress Untagged
lan2              1 PVID Egress Untagged
lan3              1 PVID Egress Untagged
lan4              1 PVID Egress Untagged
lan5              1 PVID Egress Untagged
lan6              1 PVID Egress Untagged
lan7              1 PVID Egress Untagged
lan8              1 PVID Egress Untagged
br0               1 Egress Untagged
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00
Vladimir Oltean
164f861bd4 net: dsa: offload bridge port VLANs on foreign interfaces
DSA now explicitly handles VLANs installed with the 'self' flag on the
bridge as host VLANs, instead of just replicating every bridge port VLAN
also on the CPU port and never deleting it, which is what it did before.

However, this leaves a corner case uncovered, as explained by
Tobias Waldekranz:
https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260

Forwarding towards a bridge port VLAN installed on a bridge port foreign
to DSA (separate NIC, Wi-Fi AP) used to work by virtue of the fact that
DSA itself needed to have at least one port in that VLAN (therefore, it
also had the CPU port in said VLAN). However, now that the CPU port may
not be member of all VLANs that user ports are members of, we need to
ensure this isn't the case if software forwarding to a foreign interface
is required.

The solution is to treat bridge port VLANs on standalone interfaces in
the exact same way as host VLANs. From DSA's perspective, there is no
difference between local termination and software forwarding; packets in
that VLAN must reach the CPU in both cases.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00
Vladimir Oltean
134ef2388e net: dsa: add explicit support for host bridge VLANs
Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it
does so on user ports. This is good for basic functionality but has
several limitations:

- the VLAN group which must reach the CPU may be radically different
  from the VLAN group that must be autonomously forwarded by the switch.
  In other words, the admin may want to isolate noisy stations and avoid
  traffic from them going to the control processor of the switch, where
  it would just waste useless cycles. The bridge already supports
  independent control of VLAN groups on bridge ports and on the bridge
  itself, and when VLAN-aware, it will drop packets in software anyway
  if their VID isn't added as a 'self' entry towards the bridge device.

- Replaying host FDB entries may depend, for some drivers like mv88e6xxx,
  on replaying the host VLANs as well. The 2 VLAN groups are
  approximately the same in most regular cases, but there are corner
  cases when timing matters, and DSA's approximation of replicating
  VLANs on shared ports simply does not work.

- If a user makes the bridge (implicitly the CPU port) join a VLAN by
  accident, there is no way for the CPU port to isolate itself from that
  noisy VLAN except by rebooting the system. This is because for each
  VLAN added on a user port, DSA will add it on shared ports too, but
  for each VLAN deletion on a user port, it will remain installed on
  shared ports, since DSA has no good indication of whether the VLAN is
  still in use or not.

Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN
addition and removal events, DSA has a simple and straightforward task
of separating the bridge port VLANs (these have an orig_dev which is a
DSA slave interface, or a LAG interface) from the host VLANs (these have
an orig_dev which is a bridge interface), and to keep a simple reference
count of each VID on each shared port.

Forwarding VLANs must be installed on the bridge ports and on all DSA
ports interconnecting them. We don't have a good view of the exact
topology, so we simply install forwarding VLANs on all DSA ports, which
is what has been done until now.

Host VLANs must be installed primarily on the dedicated CPU port of each
bridge port. More subtly, they must also be installed on upstream-facing
and downstream-facing DSA ports that are connecting the bridge ports and
the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB
entry may be absent from VTU) is still addressed even if that switch is
in a cross-chip setup, and it has no local CPU port.

Therefore:
- user ports contain only bridge port (forwarding) VLANs, and no
  refcounting is necessary
- DSA ports contain both forwarding and host VLANs. Refcounting is
  necessary among these 2 types.
- CPU ports contain only host VLANs. Refcounting is also necessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00
Vladimir Oltean
c4076cdd21 net: switchdev: introduce switchdev_handle_port_obj_{add,del} for foreign interfaces
The switchdev_handle_port_obj_add() helper is good for replicating a
port object on the lower interfaces of @dev, if that object was emitted
on a bridge, or on a bridge port that is a LAG.

However, drivers that use this helper limit themselves to a box from
which they can no longer intercept port objects notified on neighbor
ports ("foreign interfaces").

One such driver is DSA, where software bridging with foreign interfaces
such as standalone NICs or Wi-Fi APs is an important use case. There, a
VLAN installed on a neighbor bridge port roughly corresponds to a
forwarding VLAN installed on the DSA switch's CPU port.

To support this use case while also making use of the benefits of the
switchdev_handle_* replication helper for port objects, introduce a new
variant of these functions that crawls through the neighbor ports of
@dev, in search of potentially compatible switchdev ports that are
interested in the event.

The strategy is identical to switchdev_handle_fdb_event_to_device():
if @dev wasn't a switchdev interface, then go one step upper, and
recursively call this function on the bridge that this port belongs to.
At the next recursion step, __switchdev_handle_port_obj_add() will
iterate through the bridge's lower interfaces. Among those, some will be
switchdev interfaces, and one will be the original @dev that we came
from. To prevent infinite recursion, we must suppress reentry into the
original @dev, and just call the @add_cb for the switchdev_interfaces.

It looks like this:

                br0
               / | \
              /  |  \
             /   |   \
           swp0 swp1 eth0

1. __switchdev_handle_port_obj_add(eth0)
   -> check_cb(eth0) returns false
   -> eth0 has no lower interfaces
   -> eth0's bridge is br0
   -> switchdev_lower_dev_find(br0, check_cb, foreign_dev_check_cb))
      finds br0

2. __switchdev_handle_port_obj_add(br0)
   -> check_cb(br0) returns false
   -> netdev_for_each_lower_dev
      -> check_cb(swp0) returns true, so we don't skip this interface

3. __switchdev_handle_port_obj_add(swp0)
   -> check_cb(swp0) returns true, so we call add_cb(swp0)

(back to netdev_for_each_lower_dev from 2)
      -> check_cb(swp1) returns true, so we don't skip this interface

4. __switchdev_handle_port_obj_add(swp1)
   -> check_cb(swp1) returns true, so we call add_cb(swp1)

(back to netdev_for_each_lower_dev from 2)
      -> check_cb(eth0) returns false, so we skip this interface to
         avoid infinite recursion

Note: eth0 could have been a LAG, and we don't want to suppress the
recursion through its lowers if those exist, so when check_cb() returns
false, we still call switchdev_lower_dev_find() to estimate whether
there's anything worth a recursion beneath that LAG. Using check_cb()
and foreign_dev_check_cb(), switchdev_lower_dev_find() not only figures
out whether the lowers of the LAG are switchdev, but also whether they
actively offload the LAG or not (whether the LAG is "foreign" to the
switchdev interface or not).

The port_obj_info->orig_dev is preserved across recursive calls, so
switchdev drivers still know on which device was this notification
originally emitted.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
7b465f4cf3 net: switchdev: rename switchdev_lower_dev_find to switchdev_lower_dev_find_rcu
switchdev_lower_dev_find() assumes RCU read-side critical section
calling context, since it uses netdev_walk_all_lower_dev_rcu().

Rename it appropriately, in preparation of adding a similar iterator
that assumes writer-side rtnl_mutex protection.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
b28d580e29 net: bridge: switchdev: replay all VLAN groups
The major user of replayed switchdev objects is DSA, and so far it
hasn't needed information about anything other than bridge port VLANs,
so this is all that br_switchdev_vlan_replay() knows to handle.

DSA has managed to get by through replicating every VLAN addition on a
user port such that the same VLAN is also added on all DSA and CPU
ports, but there is a corner case where this does not work.

The mv88e6xxx DSA driver currently prints this error message as soon as
the first port of a switch joins a bridge:

mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95

where a6:ef:77:c8:5f:3d vid 1 is a local FDB entry corresponding to the
bridge MAC address in the default_pvid.

The -EOPNOTSUPP is returned by mv88e6xxx_port_db_load_purge() because it
tries to map VID 1 to a FID (the ATU is indexed by FID not VID), but
fails to do so. This is because ->port_fdb_add() is called before
->port_vlan_add() for VID 1.

The abridged timeline of the calls is:

br_add_if
-> netdev_master_upper_dev_link
   -> dsa_port_bridge_join
      -> switchdev_bridge_port_offload
         -> br_switchdev_vlan_replay (*)
         -> br_switchdev_fdb_replay
            -> mv88e6xxx_port_fdb_add
-> nbp_vlan_init
   -> nbp_vlan_add
      -> mv88e6xxx_port_vlan_add

and the issue is that at the time of (*), the bridge port isn't in VID 1
(nbp_vlan_init hasn't been called), therefore br_switchdev_vlan_replay()
won't have anything to replay, therefore VID 1 won't be in the VTU by
the time mv88e6xxx_port_fdb_add() is called.

This happens only when the first port of a switch joins. For further
ports, the initial mv88e6xxx_port_vlan_add() is sufficient for VID 1 to
be loaded in the VTU (which is switch-wide, not per port).

The problem is somewhat unique to mv88e6xxx by chance, because most
other drivers offload an FDB entry by VID, so FDBs and VLANs can be
added asynchronously with respect to each other, but addressing the
issue at the bridge layer makes sense, since what mv88e6xxx requires
isn't absurd.

To fix this problem, we need to recognize that it isn't the VLAN group
of the port that we're interested in, but the VLAN group of the bridge
itself (so it isn't a timing issue, but rather insufficient information
being passed from switchdev to drivers).

As mentioned, currently nbp_switchdev_sync_objs() only calls
br_switchdev_vlan_replay() for VLANs corresponding to the port, but the
VLANs corresponding to the bridge itself, for local termination, also
need to be replayed. In this case, VID 1 is not (yet) present in the
port's VLAN group but is present in the bridge's VLAN group.

So to fix this bug, DSA is now obligated to explicitly handle VLANs
pointing towards the bridge in order to "close this race" (which isn't
really a race). As Tobias Waldekranz notices, this also implies that it
must explicitly handle port VLANs on foreign interfaces, something that
worked implicitly before:
https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260

So in the end, br_switchdev_vlan_replay() must replay all VLANs from all
VLAN groups: all the ports, and the bridge itself.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
263029ae31 net: bridge: make nbp_switchdev_unsync_objs() follow reverse order of sync()
There may be switchdev drivers that can add/remove a FDB or MDB entry
only as long as the VLAN it's in has been notified and offloaded first.
The nbp_switchdev_sync_objs() method satisfies this requirement on
addition, but nbp_switchdev_unsync_objs() first deletes VLANs, then
deletes MDBs and FDBs. Reverse the order of the function calls to cater
to this requirement.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
8d23a54f5b net: bridge: switchdev: differentiate new VLANs from changed ones
br_switchdev_port_vlan_add() currently emits a SWITCHDEV_PORT_OBJ_ADD
event with a SWITCHDEV_OBJ_ID_PORT_VLAN for 2 distinct cases:

- a struct net_bridge_vlan got created
- an existing struct net_bridge_vlan was modified

This makes it impossible for switchdev drivers to properly balance
PORT_OBJ_ADD with PORT_OBJ_DEL events, so if we want to allow that to
happen, we must provide a way for drivers to distinguish between a
VLAN with changed flags and a new one.

Annotate struct switchdev_obj_port_vlan with a "bool changed" that
distinguishes the 2 cases above.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
27c5f74c7b net: bridge: vlan: notify switchdev only when something changed
Currently, when a VLAN entry is added multiple times in a row to a
bridge port, nbp_vlan_add() calls br_switchdev_port_vlan_add() each
time, even if the VLAN already exists and nothing about it has changed:

bridge vlan add dev lan12 vid 100 master static

Similarly, when a VLAN is added multiple times in a row to a bridge,
br_vlan_add_existing() doesn't filter at all the calls to
br_switchdev_port_vlan_add():

bridge vlan add dev br0 vid 100 self

This behavior makes driver-level accounting of VLANs impossible, since
it is enough for a single deletion event to remove a VLAN, but the
addition event can be emitted an unlimited number of times.

The cause for this can be identified as follows: we rely on
__vlan_add_flags() to retroactively tell us whether it has changed
anything about the VLAN flags or VLAN group pvid. So we'd first have to
call __vlan_add_flags() before calling br_switchdev_port_vlan_add(), in
order to have access to the "bool *changed" information. But we don't
want to change the event ordering, because we'd have to revert the
struct net_bridge_vlan changes we've made if switchdev returns an error.

So to solve this, we need another function that tells us whether any
change is going to occur in the VLAN or VLAN group, _prior_ to calling
__vlan_add_flags().

Split __vlan_add_flags() into a precommit and a commit stage, and rename
it to __vlan_flags_update(). The precommit stage,
__vlan_flags_would_change(), will determine whether there is any reason
to notify switchdev due to a change of flags (note: the BRENTRY flag
transition from false to true is treated separately: as a new switchdev
entry, because we skipped notifying the master VLAN when it wasn't a
brentry yet, and therefore not as a change of flags).

With this lookahead/precommit function in place, we can avoid notifying
switchdev if nothing changed for the VLAN and VLAN group.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
cab2cd7700 net: bridge: vlan: make __vlan_add_flags react only to PVID and UNTAGGED
Currently there is a very subtle aspect to the behavior of
__vlan_add_flags(): it changes the struct net_bridge_vlan flags and
pvid, yet it returns true ("changed") even if none of those changed,
just a transition of br_vlan_is_brentry(v) took place from false to
true.

This can be seen in br_vlan_add_existing(), however we do not actually
rely on this subtle behavior, since the "if" condition that checks that
the vlan wasn't a brentry before had a useless (until now) assignment:

	*changed = true;

Make things more obvious by actually making __vlan_add_flags() do what's
written on the box, and be more specific about what is actually written
on the box. This is needed because further transformations will be done
to __vlan_add_flags().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
3116ad0696 net: bridge: vlan: don't notify to switchdev master VLANs without BRENTRY flag
When a VLAN is added to a bridge port and it doesn't exist on the bridge
device yet, it gets created for the multicast context, but it is
'hidden', since it doesn't have the BRENTRY flag yet:

ip link add br0 type bridge && ip link set swp0 master br0
bridge vlan add dev swp0 vid 100 # the master VLAN 100 gets created
bridge vlan add dev br0 vid 100 self # that VLAN becomes brentry just now

All switchdev drivers ignore switchdev notifiers for VLAN entries which
have the BRENTRY unset, and for good reason: these are merely private
data structures used by the bridge driver. So we might just as well not
notify those at all.

Cleanup in the switchdev drivers that check for the BRENTRY flag is now
possible, and will be handled separately, since those checks just became
dead code.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Vladimir Oltean
b2bc58d41f net: bridge: vlan: check early for lack of BRENTRY flag in br_vlan_add_existing
When a VLAN is added to a bridge port, a master VLAN gets created on the
bridge for context, but it doesn't have the BRENTRY flag.

Then, when the same VLAN is added to the bridge itself, that enters
through the br_vlan_add_existing() code path and gains the BRENTRY flag,
thus it becomes "existing".

It seems natural to check for this condition early, because the current
code flow is to notify switchdev of the addition of a VLAN that isn't a
brentry, just to delete it immediately afterwards.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
Haiyue Wang
b0471c2610 gve: enhance no queue page list detection
The commit
a5886ef4f4 ("gve: Introduce per netdev `enum gve_queue_format`")
introduces three queue format type, only GVE_GQI_QPL_FORMAT queue has
page list. So it should use the queue page list number to detect the
zero size queue page list. Correct the design logic.

Using the 'queue_format == GVE_GQI_RDA_FORMAT' may lead to request zero
sized memory allocation, like if the queue format is GVE_DQO_RDA_FORMAT.

The kernel memory subsystem will return ZERO_SIZE_PTR, which is not NULL
address, so the driver can run successfully. Also the code still checks
the queue page list number firstly, then accesses the allocated memory,
so zero number queue page list allocation will not lead to access fault.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Reviewed-by: Bailey Forrest <bcf@google.com>
Link: https://lore.kernel.org/r/20220215051751.260866-1-haiyue.wang@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-15 18:01:06 -08:00
Colin Ian King
2c955856da net: dm9051: Fix spelling mistake "eror" -> "error"
There are spelling mistakes in debug messages. Fix them.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-15 14:56:57 +00:00
Yang Li
99cd6a64e1 dpaa2-eth: Simplify bool conversion
Fix the following coccicheck warnings:
./drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1199:42-47: WARNING:
conversion to bool not needed here
./drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c:1218:54-59: WARNING:
conversion to bool not needed here

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-15 14:39:12 +00:00
Vladimir Oltean
5454f5c28e net: bridge: vlan: check for errors from __vlan_del in __vlan_flush
If the following call path returns an error from switchdev:

nbp_vlan_flush
-> __vlan_del
   -> __vlan_vid_del
      -> br_switchdev_port_vlan_del
-> __vlan_group_free
   -> WARN_ON(!list_empty(&vg->vlan_list));

then the deletion of the net_bridge_vlan is silently halted, which will
trigger the WARN_ON from __vlan_group_free().

The WARN_ON is rather unhelpful, because nothing about the source of the
error is printed. Add a print to catch errors from __vlan_del.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-15 14:37:28 +00:00