summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-10-08net: sched: cls_u32: get rid of tp_cAl Viro
Both hnode ->tp_c and tp_c argument of u32_set_parms() the latter is redundant, the former - never read... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: the tp_c argument of u32_set_parms() is always tp->dataAl Viro
It must be tc_u_common associated with that tp (i.e. tp->data). Proof: * both ->ht_up and ->tp_c are assign-once * ->tp_c of anything inserted into tp_c->hlist is tp_c * hnodes never get reinserted into the lists or moved between those, so anything found by u32_lookup_ht(tp->data, ...) will have ->tp_c equal to tp->data. * tp->root->tp_c == tp->data. * ->ht_up of anything inserted into hnode->ht[...] is equal to hnode. * knodes never get reinserted into hash chains or moved between those, so anything returned by u32_lookup_key(ht, ...) will have ->ht_up equal to ht. * any knode returned by u32_get(tp, ...) will have ->ht_up->tp_c point to tp->data Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: pass tc_u_common to u32_set_parms() instead of tc_u_hnodeAl Viro
the only thing we used ht for was ht->tp_c and callers can get that without going through ->tp_c at all; start with lifting that into the callers, next commits will massage those, eventually removing ->tp_c altogether. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: clean tc_u_common hashtableAl Viro
* calculate key *once*, not for each hash chain element * let tc_u_hash() return the pointer to chain head rather than index - callers are cleaner that way. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: get rid of tc_u_common ->rcuAl Viro
unused Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: get rid of tc_u_knode ->tpAl Viro
not used anymore Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: get rid of unused argument of u32_destroy_key()Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: make sure that divisor is a power of 2Al Viro
Tested by modifying iproute2 to allow sending a divisor > 255 Tested-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: disallow linking to root hnodeAl Viro
Operation makes no sense. Nothing will actually break if we do so (depth limit in u32_classify() will prevent infinite loops), but according to maintainers it's best prohibited outright. NOTE: doing so guarantees that u32_destroy() will trigger the call of u32_destroy_hnode(); we might want to make that unconditional. Test: tc qdisc add dev eth0 ingress tc filter add dev eth0 parent ffff: protocol ip prio 100 u32 \ link 800: offset at 0 mask 0f00 shift 6 plus 0 eat match ip protocol 6 ff should fail with Error: cls_u32: Not linking to root node Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: sched: cls_u32: mark root hnode explicitlyAl Viro
... and produce consistent error on attempt to delete such. Existing check in u32_delete() is inconsistent - after tc qdisc add dev eth0 ingress tc filter add dev eth0 parent ffff: protocol ip prio 100 handle 1: u32 \ divisor 1 tc filter add dev eth0 parent ffff: protocol ip prio 200 handle 2: u32 \ divisor 1 both tc filter delete dev eth0 parent ffff: protocol ip prio 100 handle 801: u32 and tc filter delete dev eth0 parent ffff: protocol ip prio 100 handle 800: u32 will fail (at least with refcounting fixes), but the former will complain about an attempt to remove a busy table, while the latter will recognize it as root and yield "Not allowed to delete root node" instead. The problem with the existing check is that several tcf_proto instances might share the same tp->data and handle-to-hnode lookup will be the same for all of them. So comparing an hnode to be deleted with tp->root won't catch the case when one tp is used to try deleting the root of another. Solution is trivial - mark the root hnodes explicitly upon allocation and check for that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08Merge branch ↵David S. Miller
'net-phy-mscc-add-support-for-VSC8584-and-VSC8574-Microsemi-quad-port-PHYs' Quentin Schulz says: ==================== net: phy: mscc: add support for VSC8584 and VSC8574 Microsemi quad-port PHYs RESEND: rebased on top of latest net-next and on top of latest version of "net: phy: mscc: various improvements to Microsemi PHY driver" patch series. Both PHYs are 4-port PHY that are 10/100/1000BASE-T, 100BASE-FX, 1000BASE-X and triple-speed copper SFP capable, can communicate with the MAC via SGMII, QSGMII or 1000BASE-X, supports downshifting and can set the blinking pattern of each of its 4 LEDs, supports SyncE as well as HP Auto-MDIX detection. VSC8574 supports WOL and VSC8584 supports hardware offloading of MACsec. This patch series add support for 10/100/1000BASE-T, SGMII/QSGMII link with the MAC, downshifting, HP Auto-MDIX detection and blinking pattern for their 4 LEDs. They have also an internal Intel 8051 microcontroller whose firmware needs to be patched when the PHY is reset. If the 8051's firmware has the expected CRC, its patching can be skipped. The microcontroller can be accessed from any port of the PHY, though the CRC function can only be done through the PHY that is the base PHY of the package (internal address 0) due to a limitation of the firmware. The GPIO register bank is a set of registers that are common to all PHYs in the package. So any modification in any register of this bank affects all PHYs of the package. If the PHYs haven't been reset before booting the Linux kernel and were configured to use interrupts for e.g. link status updates, it is required to clear the interrupts mask register of all PHYs before being able to use interrupts with any PHY. The first PHY of the package that will be init will take care of clearing all PHYs interrupts mask registers. Thus, we need to keep track of the init sequence in the package, if it's already been done or if it's to be done. Most of the init sequence of a PHY of the package is common to all PHYs in the package, thus we use the SMI broadcast feature which enables us to propagate a write in one register of one PHY to all PHYs in the same package. We also introduce a new development board called PCB120 which exists in variants for VSC8584 and VSC8574 (and that's the only difference to the best of my knowledge). I suggest patches 1 to 3 go through net tree and patches 4 and 5 go through MIPS tree. Patches going through net tree and those going through MIPS tree do not depend on one another. This patch series depends on this patch series: (https://lore.kernel.org/lkml/20181008100728.24959-1-quentin.schulz@bootlin.com/) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: add support for VSC8574 PHYQuentin Schulz
The VSC8574 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX, 1000BASE-X and triple-speed copper SFP capable, can communicate with the MAC via SGMII, QSGMII or 1000BASE-X, supports WOL, downshifting and can set the blinking pattern of each of its 4 LEDs, supports SyncE as well as HP Auto-MDIX detection. This adds support for 10/100/1000BASE-T, SGMII/QSGMII link with the MAC, WOL, downshifting, HP Auto-MDIX detection and blinking pattern for its 4 LEDs. The VSC8574 has also an internal Intel 8051 microcontroller whose firmware needs to be patched when the PHY is reset. If the 8051's firmware has the expected CRC, its patching can be skipped. The microcontroller can be accessed from any port of the PHY, though the CRC function can only be done through the PHY that is the base PHY of the package (internal address 0) due to a limitation of the firmware. The GPIO register bank is a set of registers that are common to all PHYs in the package. So any modification in any register of this bank affects all PHYs of the package. If the PHYs haven't been reset before booting the Linux kernel and were configured to use interrupts for e.g. link status updates, it is required to clear the interrupts mask register of all PHYs before being able to use interrupts with any PHY. The first PHY of the package that will be init will take care of clearing all PHYs interrupts mask registers. Thus, we need to keep track of the init sequence in the package, if it's already been done or if it's to be done. Most of the init sequence of a PHY of the package is common to all PHYs in the package, thus we use the SMI broadcast feature which enables us to propagate a write in one register of one PHY to all PHYs in the same package. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: add support for VSC8584 PHYQuentin Schulz
The VSC8584 PHY is a 4-ports PHY that is 10/100/1000BASE-T, 100BASE-FX, 1000BASE-X and triple-speed copper SFP capable, can communicate with the MAC via SGMII, QSGMII or 1000BASE-X, supports downshifting and can set the blinking pattern of each of its 4 LEDs, supports hardware offloading of MACsec and supports SyncE as well as HP Auto-MDIX detection. This adds support for 10/100/1000BASE-T, SGMII/QSGMII link with the MAC, downshifting, HP Auto-MDIX detection and blinking pattern for its 4 LEDs. The VSC8584 has also an internal Intel 8051 microcontroller whose firmware needs to be patched when the PHY is reset. If the 8051's firmware has the expected CRC, its patching can be skipped. The microcontroller can be accessed from any port of the PHY, though the CRC function can only be done through the PHY that is the base PHY of the package (internal address 0) due to a limitation of the firmware. The GPIO register bank is a set of registers that are common to all PHYs in the package. So any modification in any register of this bank affects all PHYs of the package. If the PHYs haven't been reset before booting the Linux kernel and were configured to use interrupts for e.g. link status updates, it is required to clear the interrupts mask register of all PHYs before being able to use interrupts with any PHY. The first PHY of the package that will be init will take care of clearing all PHYs interrupts mask registers. Thus, we need to keep track of the init sequence in the package, if it's already been done or if it's to be done. Most of the init sequence of a PHY of the package is common to all PHYs in the package, thus we use the SMI broadcast feature which enables us to propagate a write in one register of one PHY to all PHYs in the same package. The revA of the VSC8584 PHY (which is not and will not be publicly released) should NOT patch the firmware of the microcontroller or it'll make things worse, the easiest way is just to not support it. Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08dt-bindings: net: vsc8531: add two additional LED modes for VSC8584Quentin Schulz
The VSC8584 (and most likely other PHYs in the same generation) has two additional LED modes that can be picked, so let's add them. Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08Merge branch 'net-phy-mscc-various-improvements-to-Microsemi-PHY-driver'David S. Miller
Quentin Schulz says: ==================== net: phy: mscc: various improvements to Microsemi PHY driver The Microsemi PHYs have multiple banks of registers (called pages). Registers can only be accessed from one page, if we need a register from another page, we need to switch the page and the registers of all other pages are not accessible anymore. Basically, to read register 5 from page 0, 1, 2, etc., you do the same phy_read(phydev, 5); but you need to set the desired page beforehand. In order to guarantee that two concurrent functions do not change the page, we need to do some locking per page. This can be achieved with the use of phy_select_page and phy_restore_page functions but phy_write/read calls in-between those two functions shall be replaced by their lock-free alternative __phy_write/read. The Microsemi PHYs have several counters so let's make them available as PHY statistics. The VSC 8530/31/40/41 also need to update their EEE init sequence in order to avoid packet losses and improve performance. This patch series also makes some minor cosmetic changes to the driver. v3: - add reviewed-by, - use phy_read/write/modify_paged whenever possible instead of the combo phy_select_page, __phy_read/write/modify, phy_restore_page when only one __phy_read/write/modify was executed, v2: - add patch to migrate MSCC driver to use phy_restore/select_page, - migrate all patches from v1 to use those two functions, - put the multiple lines of constants writes in an array and iterate over it to write the values, - add reviewed-bys, ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: remove unneeded temporary variableQuentin Schulz
Here, the rc variable is either used only for the condition right after the assignment or right before being used as the return value of the function it's being used in. So let's remove this unneeded temporary variable whenever possible. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: shorten `x != 0` condition to `x`Quentin Schulz
`if (x != 0)` is basically a more verbose version of `if (x)` so let's use the latter so it's consistent throughout the whole driver. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: remove unneeded parenthesisQuentin Schulz
The == operator precedes the || operator, so we can remove the parenthesis around (a == b) || (c == d). The condition is rather explicit and short so removing the parenthesis definitely does not make it harder to read. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: Add EEE init sequenceRaju Lakkaraju
Microsemi PHYs (VSC 8530/31/40/41) need to update the Energy Efficient Ethernet initialization sequence. In order to avoid certain link state errors that could result in link drops and packet loss, the physical coding sublayer (PCS) must be updated with settings related to EEE in order to improve performance. Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: add ethtool statistics countersRaju Lakkaraju
There are a few counters available in the PHY: receive errors, false carriers, link disconnects, media CRC errors and valids counters. So let's expose those in the PHY driver. Use the priv structure as the next PHY to be supported has a few additional counters. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microsemi.com> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: phy: mscc: migrate to phy_select/restore_page functionsQuentin Schulz
The Microsemi PHYs have multiple banks of registers (called pages). Registers can only be accessed from one page, if we need a register from another page, we need to switch the page and the registers of all other pages are not accessible anymore. Basically, to read register 5 from page 0, 1, 2, etc., you do the same phy_read(phydev, 5); but you need to set the desired page beforehand. In order to guarantee that two concurrent functions do not change the page, we need to do some locking per page. This can be achieved with the use of phy_select_page and phy_restore_page functions but phy_write/read calls in-between those two functions shall be replaced by their lock-free alternative __phy_write/read. Let's migrate this driver to those functions. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: dpaa2: fix and improve dpaa2-ptp driverYangbo Lu
This patch is to fix and improve dpaa2-ptp driver in some places. - Fixed the return for some functions. - Replaced kzalloc with devm_kzalloc. - Removed dev_set_drvdata(dev, NULL). - Made ptp_dpaa2_caps const. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: dpaa2: remove unused code for dprtcYangbo Lu
This patch is to removed unused code for dprtc. This code will be re-added along with more features of dpaa2-ptp added. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: dpaa2: rename rtc as ptp in dpaa2-ptp driverYangbo Lu
In dpaa2-ptp driver, it's odd to use rtc in names of some functions and structures except these dprtc APIs. This patch is to use ptp instead of rtc in names. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: dpaa2: fix dependency of config FSL_DPAA2_ETHYangbo Lu
The NETDEVICES dependency and ETHERNET dependency hadn't been required since dpaa2-eth was moved out of staging. Also allowed COMPILE_TEST for dpaa2-eth. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08MAINTAINERS: update files maintained under DPAA2 PTP/ETHERNETYangbo Lu
The files maintained under DPAA2 PTP/ETHERNET needs to be updated since dpaa2 ptp driver had been moved into drivers/net/ethernet/freescale/dpaa2/. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08net: dpaa2: move DPAA2 PTP driver out of staging/Yangbo Lu
This patch is to move DPAA2 PTP driver out of staging/ since the dpaa2-eth had been moved out. Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-08Merge branch 'bpf-to-bpf-calls-nfp'Daniel Borkmann
Quentin Monnet says: ==================== This patch series adds support for hardware offload of programs containing BPF-to-BPF function calls. First, a new callback is added to the kernel verifier, to collect information after the main part of the verification has been performed. Then support for BPF-to-BPF calls is incrementally added to the nfp driver, before offloading programs containing such calls is eventually allowed by lifting the restriction in the kernel verifier, in the last patch. Please refer to individual patches for details. Many thanks to Jiong and Jakub for their precious help and contribution on the main patches for the JIT-compiler, and everything related to stack accesses. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08bpf: allow offload of programs with BPF-to-BPF function callsQuentin Monnet
Now that there is at least one driver supporting BPF-to-BPF function calls, lift the restriction, in the verifier, on hardware offload of eBPF programs containing such calls. But prevent jit_subprogs(), still in the verifier, from being run for offloaded programs. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: support pointers to other stack frames for BPF-to-BPF callsQuentin Monnet
Mark instructions that use pointers to areas in the stack outside of the current stack frame, and process them accordingly in mem_op_stack(). This way, we also support BPF-to-BPF calls where the caller passes a pointer to data in its own stack frame to the callee (typically, when the caller passes an address to one of its local variables located in the stack, as an argument). Thanks to Jakub and Jiong for figuring out how to deal with this case, I just had to turn their email discussion into this patch. Suggested-by: Jiong Wang <jiong.wang@netronome.com> Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: optimise save/restore for R6~R9 based on register usageQuentin Monnet
When pre-processing the instructions, it is trivial to detect what subprograms are using R6, R7, R8 or R9 as destination registers. If a subprogram uses none of those, then we do not need to jump to the subroutines dedicated to saving and restoring callee-saved registers in its prologue and epilogue. This patch introduces detection of callee-saved registers in subprograms and prevents the JIT from adding calls to those subroutines whenever we can: we save some instructions in the translated program, and some time at runtime on BPF-to-BPF calls and returns. If no subprogram needs to save those registers, we can avoid appending the subroutines at the end of the program. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: fix return address from register-saving subroutine to calleeQuentin Monnet
On performing a BPF-to-BPF call, we first jump to a subroutine that pushes callee-saved registers (R6~R9) to the stack, and from there we goes to the start of the callee next. In order to do so, the caller must pass to the subroutine the address of the NFP instruction to jump to at the end of that subroutine. This cannot be reliably implemented when translated the caller, as we do not always know the start offset of the callee yet. This patch implement the required fixup step for passing the start offset in the callee via the register used by the subroutine to hold its return address. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: update fixup function for BPF-to-BPF calls supportQuentin Monnet
Relocation for targets of BPF-to-BPF calls are required at the end of translation. Update the nfp_fixup_branches() function in that regard. When checking that the last instruction of each bloc is a branch, we must account for the length of the instructions required to pop the return address from the stack. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: account for additional stack usage when checking stack limitQuentin Monnet
Offloaded programs using BPF-to-BPF calls use the stack to store the return address when calling into a subprogram. Callees also need some space to save eBPF registers R6 to R9. And contrarily to kernel verifier, we align stack frames on 64 bytes (and not 32). Account for all this when checking the stack size limit before JIT-ing the program. This means we have to recompute maximum stack usage for the program, we cannot get the value from the kernel. In addition to adapting the checks on stack usage, move them to the finalize() callback, now that we have it and because such checks are part of the verification step rather than translation. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: add main logics for BPF-to-BPF calls support in nfp driverQuentin Monnet
This is the main patch for the logics of BPF-to-BPF calls in the nfp driver. The functions called on BPF_JUMP | BPF_CALL and BPF_JUMP | BPF_EXIT were used to call helpers and exit from the program, respectively; make them usable for calling into, or returning from, a BPF subprogram as well. For all calls, push the return address as well as the callee-saved registers (R6 to R9) to the stack, and pop them upon returning from the calls. In order to limit the overhead in terms of instruction number, this is done through dedicated subroutines. Jumping to the callee actually consists in jumping to the subroutine, that "returns" to the callee: this will require some fixup for passing the address in a later patch. Similarly, returning consists in jumping to the subroutine, which pops registers and then return directly to the caller (but no fixup is needed here). Return to the caller is performed with the RTN instruction newly added to the JIT. For the few steps where we need to know what subprogram an instruction belongs to, the struct nfp_insn_meta is extended with a new subprog_idx field. Note that checks on the available stack size, to take into account the additional requirements associated to BPF-to-BPF calls (storing R6-R9 and return addresses), are added in a later patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: account for BPF-to-BPF calls when preparing nfp JITQuentin Monnet
Similarly to "exit" or "helper call" instructions, BPF-to-BPF calls will require additional processing before translation starts, in order to record and mark jump destinations. We also mark the instructions where each subprogram begins. This will be used in a following commit to determine where to add prologues for subprograms. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: ignore helper-related checks for BPF calls in nfp verifierQuentin Monnet
The checks related to eBPF helper calls are performed each time the nfp driver meets a BPF_JUMP | BPF_CALL instruction. However, these checks are not relevant for BPF-to-BPF call (same instruction code, different value in source register), so just skip the checks for such calls. While at it, rename the function that runs those checks to make it clear they apply to _helper_ calls only. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: copy eBPF subprograms information from kernel verifierQuentin Monnet
In order to support BPF-to-BPF calls in offloaded programs, the nfp driver must collect information about the distinct subprograms: namely, the number of subprograms composing the complete program and the stack depth of those subprograms. The latter in particular is non-trivial to collect, so we copy those elements from the kernel verifier via the newly added post-verification hook. The struct nfp_prog is extended to store this information. Stack depths are stored in an array of dedicated structs. Subprogram start indexes are not collected. Instead, meta instructions associated to the start of a subprogram will be marked with a flag in a later patch. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08nfp: bpf: rename nfp_prog->stack_depth as nfp_prog->stack_frame_depthQuentin Monnet
In preparation for support for BPF to BPF calls in offloaded programs, rename the "stack_depth" field of the struct nfp_prog as "stack_frame_depth". This is to make it clear that the field refers to the maximum size of the current stack frame (as opposed to the maximum size of the whole stack memory). Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08bpf: add verifier callback to get stack usage info for offloaded progsQuentin Monnet
In preparation for BPF-to-BPF calls in offloaded programs, add a new function attribute to the struct bpf_prog_offload_ops so that drivers supporting eBPF offload can hook at the end of program verification, and potentially extract information collected by the verifier. Implement a minimal callback (returning 0) in the drivers providing the structs, namely netdevsim and nfp. This will be useful in the nfp driver, in later commits, to extract the number of subprograms as well as the stack depth for those subprograms. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08bpf, doc: Document Jump X addressing modeArthur Fabre
bpf_asm and the other classic BPF tools support jump conditions comparing register A to register X, in addition to comparing register A with constant K. Only the latter was documented in filter.txt, add two new addressing modes that describe the former. Signed-off-by: Arthur Fabre <arthur@arthurfabre.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08libbpf: relicense libbpf as LGPL-2.1 OR BSD-2-ClauseAlexei Starovoitov
libbpf is maturing as a library and gaining features that no other bpf libraries support (BPF Type Format, bpf to bpf calls, etc) Many Apache2 licensed projects (like bcc, bpftrace, gobpf, cilium, etc) would like to use libbpf, but cannot do this yet, since Apache Foundation explicitly states that LGPL is incompatible with Apache2. Hence let's relicense libbpf as dual license LGPL-2.1 or BSD-2-Clause, since BSD-2 is compatible with Apache2. Dual LGPL or Apache2 is invalid combination. Fix license mistake in Makefile as well. Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrey Ignatov <rdna@fb.com> Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org> Acked-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Beckett <david.beckett@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Joe Stringer <joe@ovn.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <kafai@fb.com> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Roman Gushchin <guro@fb.com> Acked-by: Wang Nan <wangnan0@huawei.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-08xsk: proper AF_XDP socket teardown orderingBjörn Töpel
The AF_XDP socket struct can exist in three different, implicit states: setup, bound and released. Setup is prior the socket has been bound to a device. Bound is when the socket is active for receive and send. Released is when the process/userspace side of the socket is released, but the sock object is still lingering, e.g. when there is a reference to the socket in an XSKMAP after process termination. The Rx fast-path code uses the "dev" member of struct xdp_sock to check whether a socket is bound or relased, and the Tx code uses the struct xdp_umem "xsk_list" member in conjunction with "dev" to determine the state of a socket. However, the transition from bound to released did not tear the socket down in correct order. On the Rx side "dev" was cleared after synchronize_net() making the synchronization useless. On the Tx side, the internal queues were destroyed prior removing them from the "xsk_list". This commit corrects the cleanup order, and by doing so xdp_del_sk_umem() can be simplified and one synchronize_net() can be removed. Fixes: 965a99098443 ("xsk: add support for bind for Rx") Fixes: ac98d8aab61b ("xsk: wire upp Tx zero-copy functions") Reported-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-10-07net: vhost: remove bad code lineTonghao Zhang
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07net: sched: pie: fix coding style issuesLeslie Monis
Fix 5 warnings and 14 checks issued by checkpatch.pl: CHECK: Logical continuations should be on the previous line + if ((q->vars.qdelay < q->params.target / 2) + && (q->vars.prob < MAX_PROB / 5)) WARNING: line over 80 characters + q->params.tupdate = usecs_to_jiffies(nla_get_u32(tb[TCA_PIE_TUPDATE])); CHECK: Blank lines aren't necessary after an open brace '{' +{ + CHECK: braces {} should be used on all arms of this statement + if (qlen < QUEUE_THRESHOLD) [...] + else { [...] CHECK: Unbalanced braces around else statement + else { CHECK: No space is necessary after a cast + if (delta > (s32) (MAX_PROB / (100 / 2)) && CHECK: Unnecessary parentheses around 'qdelay == 0' + if ((qdelay == 0) && (qdelay_old == 0) && update_prob) CHECK: Unnecessary parentheses around 'qdelay_old == 0' + if ((qdelay == 0) && (qdelay_old == 0) && update_prob) CHECK: Unnecessary parentheses around 'q->vars.prob == 0' + if ((q->vars.qdelay < q->params.target / 2) && + (q->vars.qdelay_old < q->params.target / 2) && + (q->vars.prob == 0) && + (q->vars.avg_dq_rate > 0)) CHECK: Unnecessary parentheses around 'q->vars.avg_dq_rate > 0' + if ((q->vars.qdelay < q->params.target / 2) && + (q->vars.qdelay_old < q->params.target / 2) && + (q->vars.prob == 0) && + (q->vars.avg_dq_rate > 0)) CHECK: Blank lines aren't necessary before a close brace '}' + +} CHECK: Comparison to NULL could be written "!opts" + if (opts == NULL) CHECK: No space is necessary after a cast + ((u32) PSCHED_TICKS2NS(q->params.target)) / WARNING: line over 80 characters + nla_put_u32(skb, TCA_PIE_TUPDATE, jiffies_to_usecs(q->params.tupdate)) || CHECK: Blank lines aren't necessary before a close brace '}' + +} CHECK: No space is necessary after a cast + .delay = ((u32) PSCHED_TICKS2NS(q->vars.qdelay)) / WARNING: Missing a blank line after declarations + struct sk_buff *skb; + skb = qdisc_dequeue_head(sch); WARNING: Missing a blank line after declarations + struct pie_sched_data *q = qdisc_priv(sch); + qdisc_reset_queue(sch); WARNING: Missing a blank line after declarations + struct pie_sched_data *q = qdisc_priv(sch); + q->params.tupdate = 0; Signed-off-by: Leslie Monis <lesliemonis@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07bnxt_en: Remove unnecessary unsigned integer comparison and initialize variableGustavo A. R. Silva
There is no need to compare *val.vu32* with < 0 because such variable is of type u32 (32 bits, unsigned), making it impossible to hold a negative value. Fix this by removing such comparison. Also, initialize variable *max_val* to -1, just in case it is not initialized to either BNXT_MSIX_VEC_MAX or BNXT_MSIX_VEC_MIN_MAX before using it in a comparison with val.vu32 at line 159: if (val.vu32 > max_val) Addresses-Coverity-ID: 1473915 ("Unsigned compared against 0") Addresses-Coverity-ID: 1473920 ("Uninitialized scalar variable") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-07Merge tag 'wireless-drivers-next-for-davem-2018-10-07' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.20 Second set of patches for 4.20. Heavy refactoring on mt76 continues and the usual drivers in active development (iwlwifi, qtnfmac, ath10k) getting new features. And as always, fixes and cleanup all over. Major changes: mt76 * more major refactoring to make it easier add new hardware support * more work on mt76x0e support * support for getting firmware version via ethtool * add mt7650 PCI ID iwlwifi * HE radiotap cleanup and improvements * reorder channel optimization for scans * bump the FW API version qtnfmac * fixes for 'iw' output: rates for enabled SGI, 'dump station' * expose more scan features to host: scan flush and dwell time * inform cfg80211 when OBSS is not supported by firmware wlcore * add support for optional wakeirq ath10k * retrieve MAC address from system firmware if provided * support extended board data download for dual-band QCA9984 * extended per sta tx statistics support via debugfs * average ack rssi support for data frames * speed up QCA6174 and QCA9377 firmware download using diag Copy Engine * HTT High Latency mode support needed by SDIO and USB support * get STA power save state via debugfs ath9k * add reset functionality for airtime station debugfs file ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-10-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2018-10-06Merge tag 'mt76-for-kvalo-2018-10-05' of https://github.com/nbd168/wirelessKalle Valo
mt76 patches for 4.20 * unify code between mt76x0, mt76x2 * mt76x0 fixes * another fix for rx buffer allocation regression on usb * move mt76x2 source files to mt76x2 folder * more work on mt76x0e support
2018-10-06Merge tag 'iwlwifi-next-for-kalle-2018-10-06' of ↵Kalle Valo
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next Third set of iwlwifi patches for 4.20 * Fix for a race condition that caused the FW to crash; * HE radiotap cleanup and improvements; * Reorder channel optimization for scans; * Bumped the FW API version supported after the last API change for this release; * Debugging improvements; * A few bug fixes; * Some cleanups in preparation for a new implementation; * Other small improvements, cleanups and fixes.