path: root/drivers/net/ethernet/intel
Age  Commit message  Author
2024-03-05  ice: fix uninitialized dplls mutex usage  Michal Schmidt
The pf->dplls.lock mutex is initialized too late, after its first use. Move it to the top of ice_dpll_init. Note that the "err_exit" error path destroys the mutex. And the mutex is the last thing destroyed in ice_dpll_deinit.

This fixes the following warning with CONFIG_DEBUG_MUTEXES:

 ice 0000:10:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.36.0
 ice 0000:10:00.0: 252.048 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x16 link)
 ice 0000:10:00.0: PTP init successful
 ------------[ cut here ]------------
 DEBUG_LOCKS_WARN_ON(lock->magic != lock)
 WARNING: CPU: 0 PID: 410 at kernel/locking/mutex.c:587 __mutex_lock+0x773/0xd40
 Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ice(+) nvme nvme_c>
 CPU: 0 PID: 410 Comm: kworker/0:4 Not tainted 6.8.0-rc5+ #3
 Hardware name: HPE ProLiant DL110 Gen10 Plus/ProLiant DL110 Gen10 Plus, BIOS U56 10/19/2023
 Workqueue: events work_for_cpu_fn
 RIP: 0010:__mutex_lock+0x773/0xd40
 Code: c0 0f 84 1d f9 ff ff 44 8b 35 0d 9c 69 01 45 85 f6 0f 85 0d f9 ff ff 48 c7 c6 12 a2 a9 85 48 c7 c7 12 f1 a>
 RSP: 0018:ff7eb1a3417a7ae0 EFLAGS: 00010286
 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000
 RDX: 0000000000000002 RSI: ffffffff85ac2bff RDI: 00000000ffffffff
 RBP: ff7eb1a3417a7b80 R08: 0000000000000000 R09: 00000000ffffbfff
 R10: ff7eb1a3417a7978 R11: ff32b80f7fd2e568 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: ff32b7f02c50e0d8
 FS: 0000000000000000(0000) GS:ff32b80efe800000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 000055b5852cc000 CR3: 000000003c43a004 CR4: 0000000000771ef0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 PKRU: 55555554
 Call Trace:
  <TASK>
  ? __warn+0x84/0x170
  ? __mutex_lock+0x773/0xd40
  ? report_bug+0x1c7/0x1d0
  ? prb_read_valid+0x1b/0x30
  ? handle_bug+0x42/0x70
  ? exc_invalid_op+0x18/0x70
  ? asm_exc_invalid_op+0x1a/0x20
  ? __mutex_lock+0x773/0xd40
  ? rcu_is_watching+0x11/0x50
  ? __kmalloc_node_track_caller+0x346/0x490
  ? ice_dpll_lock_status_get+0x28/0x50 [ice]
  ? __pfx_ice_dpll_lock_status_get+0x10/0x10 [ice]
  ? ice_dpll_lock_status_get+0x28/0x50 [ice]
  ice_dpll_lock_status_get+0x28/0x50 [ice]
  dpll_device_get_one+0x14f/0x2e0
  dpll_device_event_send+0x7d/0x150
  dpll_device_register+0x124/0x180
  ice_dpll_init_dpll+0x7b/0xd0 [ice]
  ice_dpll_init+0x224/0xa40 [ice]
  ? _dev_info+0x70/0x90
  ice_load+0x468/0x690 [ice]
  ice_probe+0x75b/0xa10 [ice]
  ? _raw_spin_unlock_irqrestore+0x4f/0x80
  ? process_one_work+0x1a3/0x500
  local_pci_probe+0x47/0xa0
  work_for_cpu_fn+0x17/0x30
  process_one_work+0x20d/0x500
  worker_thread+0x1df/0x3e0
  ? __pfx_worker_thread+0x10/0x10
  kthread+0x103/0x140
  ? __pfx_kthread+0x10/0x10
  ret_from_fork+0x31/0x50
  ? __pfx_kthread+0x10/0x10
  ret_from_fork_asm+0x1b/0x30
  </TASK>
 irq event stamp: 125197
 hardirqs last enabled at (125197): [<ffffffff8416409d>] finish_task_switch.isra.0+0x12d/0x3d0
 hardirqs last disabled at (125196): [<ffffffff85134044>] __schedule+0xea4/0x19f0
 softirqs last enabled at (105334): [<ffffffff84e1e65a>] napi_get_frags_check+0x1a/0x60
 softirqs last disabled at (105332): [<ffffffff84e1e65a>] napi_get_frags_check+0x1a/0x60
 ---[ end trace 0000000000000000 ]---

Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
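The ordering problem in the dplls mutex fix can be illustrated with a small userspace model. This is only a sketch, not the driver code: `dbg_mutex`, `dev_register` and the two init functions are hypothetical stand-ins, and the `magic` field merely imitates the `DEBUG_LOCKS_WARN_ON(lock->magic != lock)` check quoted in the warning.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical debug mutex: "magic" points at the lock once it is initialized. */
struct dbg_mutex {
	struct dbg_mutex *magic;
	bool held;
};

struct dplls {
	struct dbg_mutex lock;
	int lock_status;
};

static void dbg_mutex_init(struct dbg_mutex *m)
{
	m->magic = m;
	m->held = false;
}

/* Returns false where a debug kernel would print the warning. */
static bool dbg_mutex_lock(struct dbg_mutex *m)
{
	if (m->magic != m)
		return false;
	m->held = true;
	return true;
}

static void dbg_mutex_unlock(struct dbg_mutex *m)
{
	m->held = false;
}

/* Callback that takes the lock, like a lock-status getter. */
static int lock_status_get(struct dplls *d)
{
	int status;

	if (!dbg_mutex_lock(&d->lock))
		return -1;
	status = d->lock_status;
	dbg_mutex_unlock(&d->lock);
	return status;
}

/* Registration calls back into the driver right away. */
static int dev_register(struct dplls *d)
{
	return lock_status_get(d);
}

/* Buggy order: register first, initialize the mutex afterwards. */
int dpll_init_late(struct dplls *d)
{
	int err = dev_register(d);

	dbg_mutex_init(&d->lock);
	return err;
}

/* Fixed order: the mutex is the first thing initialized. */
int dpll_init_early(struct dplls *d)
{
	dbg_mutex_init(&d->lock);
	return dev_register(d);
}
```

In this model `dpll_init_late()` fails because the registration callback runs against an uninitialized lock, while `dpll_init_early()` returns the stored status.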
2024-03-05  net: ice: Fix potential NULL pointer dereference in ice_bridge_setlink()  Rand Deeb
The function ice_bridge_setlink() may encounter a NULL pointer dereference if nlmsg_find_attr() returns NULL and br_spec is dereferenced subsequently in nla_for_each_nested(). To address this issue, add a check to ensure that br_spec is not NULL before proceeding with the nested attribute iteration. Fixes: b1edc14a3fbf ("ice: Implement ice_bridge_getlink and ice_bridge_setlink") Signed-off-by: Rand Deeb <rand.sec96@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
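The shape of the guard can be sketched in plain C. The attribute type, the `find_attr` helper and the `ATTR_AF_SPEC` value below are simplified placeholders invented for this sketch; real netlink attributes are TLVs and the real lookup is `nlmsg_find_attr()`.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified attribute (not the netlink wire format). */
struct attr {
	int type;
	int value;
};

#define ATTR_AF_SPEC 26	/* placeholder type ID chosen for this sketch */

/* Return the first attribute of the given type, or NULL if absent. */
static const struct attr *find_attr(const struct attr *attrs, size_t n, int type)
{
	for (size_t i = 0; i < n; i++)
		if (attrs[i].type == type)
			return &attrs[i];
	return NULL;
}

int bridge_setlink(const struct attr *attrs, size_t n, int *mode_out)
{
	const struct attr *br_spec = find_attr(attrs, n, ATTR_AF_SPEC);

	/* The added guard: bail out before touching a missing attribute. */
	if (!br_spec)
		return -EINVAL;

	*mode_out = br_spec->value;
	return 0;
}
```

Returning `-EINVAL` here is an assumption of the sketch; the commit message only says that a NULL check is added before the nested iteration.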
2024-03-05  ice: virtchnl: stop pretending to support RSS over AQ or registers  Jacob Keller
The E800 series hardware uses the same iAVF driver as older devices, including the virtchnl negotiation scheme. This negotiation scheme includes a mechanism to determine what type of RSS should be supported, including RSS over PF virtchnl messages, RSS over firmware AdminQ messages, and RSS via direct register access. The PF driver will always prefer VIRTCHNL_VF_OFFLOAD_RSS_PF if it's supported by the VF driver. However, if an older VF driver is loaded, it may request only VIRTCHNL_VF_OFFLOAD_RSS_REG or VIRTCHNL_VF_OFFLOAD_RSS_AQ. The ice driver happily agrees to support these methods. Unfortunately, the underlying hardware does not support these mechanisms. The E800 series VFs don't have the appropriate registers for RSS_REG. The mailbox queue used by VFs for VF to PF communication blocks messages which do not have the VF-to-PF opcode. Stop lying to the VF that it could support RSS over AdminQ or registers, as these interfaces do not work when the hardware is operating on an E800 series device. In practice this is unlikely to be hit by any normal user. The iAVF driver has supported RSS over PF virtchnl commands since 2016, and always defaults to using RSS_PF if possible. In principle, nothing actually stops the existing VF from attempting to access the registers or send an AQ command. However, a properly coded VF will check the capability flags and will report a more useful error if it detects a case where the driver does not support the RSS offloads that it does. Fixes: 1071a8358a28 ("ice: Implement virtchnl commands for AVF support") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Alan Brady <alan.brady@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
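The negotiation outcome described above amounts to masking the VF's request down to the one RSS method the hardware can honour. A minimal sketch, assuming placeholder capability bits (`CAP_RSS_*` are invented here and are not the virtchnl flag values):

```c
#include <stdint.h>

/* Placeholder capability bits for this sketch, not the virtchnl values. */
#define CAP_RSS_AQ	(1u << 0)
#define CAP_RSS_REG	(1u << 1)
#define CAP_RSS_PF	(1u << 2)

/*
 * RSS capabilities the PF grants in reply to a VF request.
 * Only RSS over PF virtchnl messages is ever advertised; requests for
 * RSS over AdminQ or registers are silently dropped from the reply.
 */
uint32_t pf_grant_rss(uint32_t vf_requested)
{
	return vf_requested & CAP_RSS_PF;
}
```

A VF that asked only for the AQ or register methods then sees no RSS capability in the reply and can report that clearly instead of poking at interfaces that do not work.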
2024-03-05  idpf: disable local BH when scheduling napi for marker packets  Emil Tantilov
Fix softirqs not being handled during the napi_schedule() call when receiving marker packets for queue disable, by disabling local bottom halves.

The issue can be seen on ifdown:
 NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!!

Using ftrace to catch the failing scenario:
 ifconfig [003] d.... 22739.830624: softirq_raise: vec=3 [action=NET_RX]
 <idle>-0 [003] ..s.. 22739.831357: softirq_entry: vec=3 [action=NET_RX]

No interrupt and CPU is idle.

After the patch, when disabling local BH before calling napi_schedule:
 ifconfig [003] d.... 22993.928336: softirq_raise: vec=3 [action=NET_RX]
 ifconfig [003] ..s1. 22993.928337: softirq_entry: vec=3 [action=NET_RX]

Fixes: c2d548cad150 ("idpf: add TX splitq napi poll support")
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Signed-off-by: Alan Brady <alan.brady@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Krishneil Singh <krishneil.k.singh@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  e1000e: Minor flow correction in e1000_shutdown function  Vitaly Lifshits
Add curly braces to avoid entering an if statement where it is not always required in the e1000_shutdown function. This improves code readability and might prevent non-deterministic behaviour in the future. Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04  igc: fix LEDS_CLASS dependency  Arnd Bergmann
When IGC is built-in but LEDS_CLASS is a loadable module, there is a link failure: x86_64-linux-ld: drivers/net/ethernet/intel/igc/igc_leds.o: in function `igc_led_setup': igc_leds.c:(.text+0x75c): undefined reference to `devm_led_classdev_register_ext' Add another dependency that prevents this combination. Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04  ixgbe: Add 1000BASE-BX support  Ernesto Castellotti
Added support for 1000BASE-BX, i.e. Gigabit Ethernet over a single strand of single-mode fiber. The initialization of a 1000BASE-BX SFP is the same as 1000BASE-SX/LX, with the only difference that the Bit Rate Nominal Value must be checked to make sure it is a Gigabit Ethernet transceiver, as described by the SFF-8472 specification.

This was tested with the FS.com SFP-GE-BX 1310/1490nm 10km transceiver:

$ ethtool -m eth4
 Identifier : 0x03 (SFP)
 Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
 Connector : 0x07 (LC)
 Transceiver codes : 0x00 0x00 0x00 0x40 0x00 0x00 0x00 0x00 0x00
 Transceiver type : Ethernet: BASE-BX10
 Encoding : 0x01 (8B/10B)
 BR, Nominal : 1300MBd
 Rate identifier : 0x00 (unspecified)
 Length (SMF,km) : 10km
 Length (SMF) : 10000m
 Length (50um) : 0m
 Length (62.5um) : 0m
 Length (Copper) : 0m
 Length (OM3) : 0m
 Laser wavelength : 1310nm
 Vendor name : FS
 Vendor OUI : 64:9d:99
 Vendor PN : SFP-GE-BX
 Vendor rev :
 Option values : 0x20 0x0a
 Option : RX_LOS implemented
 Option : TX_FAULT implemented
 Option : Power level 3 requirement
 BR margin, max : 0%
 BR margin, min : 0%
 Vendor SN : S2202359108
 Date code : 220307
 Optical diagnostics support : Yes
 Laser bias current : 17.650 mA
 Laser output power : 0.2132 mW / -6.71 dBm
 Receiver signal average optical power : 0.2740 mW / -5.62 dBm
 Module temperature : 47.30 degrees C / 117.13 degrees F
 Module voltage : 3.2576 V
 Alarm/warning flags implemented : Yes
 Laser bias current high alarm : Off
 Laser bias current low alarm : Off
 Laser bias current high warning : Off
 Laser bias current low warning : Off
 Laser output power high alarm : Off
 Laser output power low alarm : Off
 Laser output power high warning : Off
 Laser output power low warning : Off
 Module temperature high alarm : Off
 Module temperature low alarm : Off
 Module temperature high warning : Off
 Module temperature low warning : Off
 Module voltage high alarm : Off
 Module voltage low alarm : Off
 Module voltage high warning : Off
 Module voltage low warning : Off
 Laser rx power high alarm : Off
 Laser rx power low alarm : Off
 Laser rx power high warning : Off
 Laser rx power low warning : Off
 Laser bias current high alarm threshold : 110.000 mA
 Laser bias current low alarm threshold : 1.000 mA
 Laser bias current high warning threshold : 100.000 mA
 Laser bias current low warning threshold : 1.000 mA
 Laser output power high alarm threshold : 0.7079 mW / -1.50 dBm
 Laser output power low alarm threshold : 0.0891 mW / -10.50 dBm
 Laser output power high warning threshold : 0.6310 mW / -2.00 dBm
 Laser output power low warning threshold : 0.1000 mW / -10.00 dBm
 Module temperature high alarm threshold : 90.00 degrees C / 194.00 degrees F
 Module temperature low alarm threshold : -45.00 degrees C / -49.00 degrees F
 Module temperature high warning threshold : 85.00 degrees C / 185.00 degrees F
 Module temperature low warning threshold : -40.00 degrees C / -40.00 degrees F
 Module voltage high alarm threshold : 3.7950 V
 Module voltage low alarm threshold : 2.8050 V
 Module voltage high warning threshold : 3.4650 V
 Module voltage low warning threshold : 3.1350 V
 Laser rx power high alarm threshold : 0.7079 mW / -1.50 dBm
 Laser rx power low alarm threshold : 0.0028 mW / -25.53 dBm
 Laser rx power high warning threshold : 0.6310 mW / -2.00 dBm
 Laser rx power low warning threshold : 0.0032 mW / -24.95 dBm

Signed-off-by: Ernesto Castellotti <ernesto@castellotti.net>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20240301184806.2634508-3-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
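The bit-rate check mentioned in the 1000BASE-BX commit can be sketched against the SFF-8472 ID EEPROM layout, where byte 6 carries the Ethernet compliance codes (bit 6 is BASE-BX10) and byte 12 holds the nominal signalling rate in units of 100 MBd. The helper name and the accepted 1200 to 1300 MBd window are assumptions of this sketch, not values taken from the ixgbe patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SFF_1GBE_COMP_BYTE	6	/* Ethernet compliance codes */
#define SFF_BR_NOMINAL_BYTE	12	/* nominal rate, units of 100 MBd */
#define SFF_BASEBX10_BIT	0x40	/* byte 6, bit 6: BASE-BX10 */

/* Assumed acceptance window for a Gigabit Ethernet module. */
#define GBE_BR_MIN_MBD	1200
#define GBE_BR_MAX_MBD	1300

/* Hypothetical helper: does this ID EEPROM describe a 1000BASE-BX SFP? */
bool sfp_is_1000base_bx(const uint8_t *eeprom)
{
	unsigned int br_mbd = eeprom[SFF_BR_NOMINAL_BYTE] * 100u;

	/* BASE-BX10 alone does not say whether the module is 100M or 1G. */
	if (!(eeprom[SFF_1GBE_COMP_BYTE] & SFF_BASEBX10_BIT))
		return false;

	return br_mbd >= GBE_BR_MIN_MBD && br_mbd <= GBE_BR_MAX_MBD;
}
```

The module in the ethtool dump reports compliance byte 0x40 and a nominal rate of 1300 MBd, so it would pass this check.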
2024-03-04  intel: make module parameters readable in sys filesystem  Jon Maxwell
Linux users sometimes need an easy way to check current values of module parameters. For example the module may be manually reloaded with different parameters. Make these visible and readable in the /sys filesystem to allow that. But don't make the "debug" module parameter visible as debugging is enabled via ethtool msglvl. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04  ice: avoid unnecessary devm_ usage  Maciej Fijalkowski
1. pcaps are freed right after the AQ routines are done, so there is no need for devm_ allocations.
2. A test frame for the loopback test in ethtool -t is destroyed at the end of the test, so we don't need devm_ here either.

Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  ice: do not disable Tx queues twice in ice_down()  Maciej Fijalkowski
ice_down() clears QINT_TQCTL_CAUSE_ENA_M bit twice, which is not necessary. First clearing happens in ice_vsi_dis_irq() and second in ice_vsi_stop_tx_ring() - remove the first one. While at it, make ice_vsi_dis_irq() static as ice_down() is the only current caller of it. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  ice: cleanup line splitting for context set functions  Jacob Keller
The indentation for ice_set_ctx and ice_write_rxq_ctx breaks the function name after the return type. This style of breaking is used a lot throughout the ice driver, even in cases where it's not actually helpful for readability. We no longer prefer this style of line splitting in the driver, and new code is avoiding it. Normally, I would leave this alone unless the actual function contents or description needed updating. However, a future change is going to add inverse functions for converting packed context to unpacked context structures. To keep this code uniform with the existing set functions, fix up the style to the modern format of keeping the type on the same line. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  ice: use GENMASK instead of BIT(n) - 1 in pack functions  Jacob Keller
The functions used to pack the Tx and Rx context into the hardware format rely on using BIT() and then subtracting 1 to get a bitmask. These functions even have a comment about how x86 machines can't use this method for certain widths because the SHL instructions will not work properly. The Linux kernel already provides the GENMASK macro for generating a suitable bitmask. Further, GENMASK is capable of generating the mask including the shift_width. Since width is the total field width, take care to subtract one to get the final bit position. Since we now include the shifted bits as part of the mask, shift the source value first before applying the mask. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
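The masking scheme described in the GENMASK commit can be sketched in userspace C. `GENMASK_U64` is a local equivalent of the kernel's `GENMASK_ULL(h, l)`, and `pack_field` is a hypothetical helper written for this sketch rather than one of the driver's pack functions.

```c
#include <stdint.h>

/* Userspace equivalent of the kernel's GENMASK_ULL(h, l): bits l..h set. */
#define GENMASK_U64(h, l) \
	((~UINT64_C(0) << (l)) & (~UINT64_C(0) >> (63 - (h))))

/*
 * Pack the low `width` bits of `src` into `dest`, starting at bit `shift`.
 * Assumes width >= 1 and shift + width <= 64.
 */
uint64_t pack_field(uint64_t dest, uint64_t src, unsigned int width,
		    unsigned int shift)
{
	/* width is the total field width, so the top bit is width - 1 + shift. */
	uint64_t mask = GENMASK_U64(width - 1 + shift, shift);

	/* The mask already includes the shift, so shift the source first. */
	src <<= shift;
	src &= mask;

	return (dest & ~mask) | src;
}
```

Unlike `(1ULL << width) - 1`, this form is well defined for a full 64-bit field, where shifting by the type's width would be undefined behaviour in C.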
2024-03-04  ice: rename ice_write_* functions to ice_pack_ctx_*  Jacob Keller
In ice_common.c there are 4 functions used for converting the unpacked software Tx and Rx context structure data into the packed format used by hardware. These functions have extremely generic names:

 * ice_write_byte
 * ice_write_word
 * ice_write_dword
 * ice_write_qword

When I saw these function names my first thought was "write what? to where?". Understanding what these functions do requires looking at the implementation details. The functions take bits from an unpacked structure and copy them into the packed layout used by hardware.

As part of live migration, we will want functions which perform the inverse operation of reading bits from the packed layout and copying them into the unpacked format. Naming these as "ice_read_byte", etc would be very confusing since they appear to write data.

In preparation for adding this new inverse operation, rename the existing functions to use the prefix "ice_pack_ctx_". This makes it clear that they perform the bit packing while copying from the unpacked software context structure to the packed hardware context. The inverse operations can then neatly be named ice_unpack_ctx_*, clearly indicating they perform the bit unpacking while copying from the packed hardware context to the unpacked software context structure.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  ice: remove vf->lan_vsi_num field  Jacob Keller
The lan_vsi_num field of the VF structure is no longer used for any purpose. Remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  ice: use relative VSI index for VFs instead of PF VSI number  Jacob Keller
When initializing over virtchnl, the PF is required to pass a VSI ID to the VF as part of its capabilities exchange. The VF driver reports this value back to the PF in a variety of commands. The PF driver validates that this value matches the value it sent to the VF. Some hardware families such as the E700 series could use this value when reading RSS registers or communicating directly with firmware over the Admin Queue. However, E800 series hardware does not support any of these interfaces and the VF's only use for this value is to report it back to the PF. Thus, there is no requirement that this value be an actual VSI ID value of any kind. The PF driver already does not trust that the VF sends it a real VSI ID. The VSI structure is always looked up from the VF structure. The PF does validate that the VSI ID provided matches a VSI associated with the VF, but otherwise does not use the VSI ID for any purpose. Instead of reporting the VSI number relative to the PF space, report a fixed value of 1. When communicating with the VF over virtchnl, validate that the VSI number is returned appropriately. This avoids leaking information about the firmware of the PF state. Currently the ice driver only supplies a VF with a single VSI. However, it appears that virtchnl has some support for allowing multiple VSIs. I did not attempt to implement this. However, space is left open to allow further relative indexes if additional VSIs are provided in future feature development. For this reason, keep the ice_vc_isvalid_vsi_id function in place to allow extending it for multiple VSIs in the future. This change will also simplify handling of live migration in a future series. Since we no longer will provide a real VSI number to the VF, there will be no need to keep track of this number when migrating to a new host. 
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
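The relative-index scheme can be sketched as follows. The names `VF_VSI_ID`, `vf_reported_vsi_id` and `vc_isvalid_vsi_id` are hypothetical and chosen for this sketch; only the fixed value of 1 comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical name for the fixed, relative VSI ID handed to every VF. */
#define VF_VSI_ID 1

struct vf {
	uint16_t pf_vsi_num;	/* absolute number in PF space, never sent to the VF */
};

/* What the PF reports during the capabilities exchange. */
uint16_t vf_reported_vsi_id(const struct vf *vf)
{
	(void)vf;		/* the PF-space number is deliberately ignored */
	return VF_VSI_ID;
}

/* Validate the VSI ID a VF echoes back in later commands. */
bool vc_isvalid_vsi_id(const struct vf *vf, uint16_t vsi_id)
{
	(void)vf;
	return vsi_id == VF_VSI_ID;
}
```

Keeping a separate validation function, even though it is trivial here, leaves room for further relative indexes if a VF is ever given more than one VSI.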
2024-03-04  ice: remove unnecessary duplicate checks for VF VSI ID  Jacob Keller
The ice_vc_fdir_param_check() function validates that the VSI ID of the virtchnl flow director command matches the VSI number of the VF. This is already checked by the call to ice_vc_isvalid_vsi_id() immediately following this. This check is unnecessary since ice_vc_isvalid_vsi_id() already confirms this by checking that the VSI ID can locate the VSI associated with the VF structure. Furthermore, a following change is going to refactor the ice driver to report VSI IDs using a relative index for each VF instead of reporting the PF VSI number. This additional check would break that logic since it enforces that the VSI ID matches the VSI number. Since this check duplicates the logic in ice_vc_isvalid_vsi_id() and gets in the way of refactoring that logic, remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  ice: pass VSI pointer into ice_vc_isvalid_q_id  Jacob Keller
The ice_vc_isvalid_q_id() function takes a VSI index and a queue ID. It looks up the VSI from its index, and then validates that the queue number is valid for that VSI. The VSI ID passed is typically a VSI index from the VF. This VSI number is validated by the PF to ensure that it matches the VSI associated with the VF already. In every flow where ice_vc_isvalid_q_id() is called, the PF driver already has a pointer to the VSI associated with the VF. This pointer is obtained using ice_get_vf_vsi(), rather than looking up the VSI using the index sent by the VF. Since we already know which VSI to operate on, we can modify ice_vc_isvalid_q_id() to take a VSI pointer instead of a VSI index. Pass the VSI we found from ice_get_vf_vsi() instead of re-doing the lookup. This removes some unnecessary computation and scanning of the VSI list. It also removes the last place where the driver directly used the VSI number from the VF. This will pave the way for refactoring to communicate relative VSI numbers to the VF instead of absolute numbers from the PF space. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: remove dealloc vector msg err in idpf_intr_rel  Alan Brady
This error message is at best not really helpful and at worst misleading. If we're here in idpf_intr_rel we're likely trying to do remove or reset. If we're in reset, this message will fail because we lose the virtchnl on reset and HW is going to clean up those resources regardless in that case. If we're in remove and we get an error here, we're going to reset the device at the end of remove anyway, so it is not a big deal. Just remove this message; it's not useful. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: fix minor controlq issues  Alan Brady
While we're here improving virtchnl we can include two minor fixes for the lower level ctrlq flow. This adds a memory barrier to idpf_post_rx_buffs before we update tail on the controlq. We should make sure our writes have had a chance to finish before we tell HW it can touch them. This also removes some defensive programming in idpf_ctrlq_recv. The caller should not be using a num_q_msg value of zero or more than the ring size and it's their responsibility to call functions sanely. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
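The write-then-publish ordering can be modelled in userspace with C11 atomics. This is an analogy only: a release fence stands in for the kernel write barrier, an atomic variable stands in for the tail register, and `struct ctlq` and `post_rx_buffs` are invented for the sketch.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8

struct ctlq {
	uint64_t desc[RING_SIZE];	/* descriptors the "hardware" will read */
	_Atomic uint32_t tail;		/* stands in for the tail register */
	uint32_t next_to_use;
};

/* Post `count` buffers, then publish the new tail. */
uint32_t post_rx_buffs(struct ctlq *cq, const uint64_t *bufs, uint32_t count)
{
	for (uint32_t i = 0; i < count; i++) {
		cq->desc[cq->next_to_use] = bufs[i];
		cq->next_to_use = (cq->next_to_use + 1) % RING_SIZE;
	}

	/* Order the descriptor writes before the tail update. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&cq->tail, cq->next_to_use, memory_order_relaxed);

	return cq->next_to_use;
}
```

Without the fence, the consumer could observe the new tail before the descriptor contents it covers, which is the hazard the commit message describes.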
2024-03-04  idpf: prevent deinit uninitialized virtchnl core  Alan Brady
In idpf_remove we need to tear down the virtchnl core with idpf_vc_core_deinit so we can free up resources and leave things in a good state. However, in the case where we failed to establish VC communications we may not have ever actually successfully initialized the virtchnl core. This fixes it by setting a bit once we successfully init the virtchnl core. Then, in deinit, we'll check for it before going on further, otherwise we just return. Also clear the bit at the end of deinit so we know it's gone now. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
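The set-on-init, check-and-clear-on-deinit pattern might look like the sketch below. The flag name, the `adapter` fields and both functions are hypothetical; the real driver uses its own flag bit and helpers.

```c
#include <stdbool.h>

#define VC_CORE_INIT (1ul << 0)	/* hypothetical flag bit */

struct adapter {
	unsigned long flags;
	int teardowns;		/* counts how often resources were really freed */
};

int vc_core_init(struct adapter *a, bool comms_ok)
{
	if (!comms_ok)
		return -1;	/* init failed: the flag stays clear */

	a->flags |= VC_CORE_INIT;
	return 0;
}

void vc_core_deinit(struct adapter *a)
{
	/* Never initialized (or already torn down): nothing to do. */
	if (!(a->flags & VC_CORE_INIT))
		return;

	a->teardowns++;
	a->flags &= ~VC_CORE_INIT;	/* a repeated deinit becomes a no-op */
}
```

Calling `vc_core_deinit()` after a failed init, or twice in a row, then tears down nothing that was not set up.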
2024-03-04  idpf: cleanup virtchnl cruft  Alan Brady
We can now remove a bunch of gross code we don't need anymore like the vc state bits and vc_buf_lock since everything is using transaction API now. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: refactor idpf_recv_mb_msg  Alan Brady
Now that all the messages are using the transaction API, we can rework idpf_recv_mb_msg quite a lot to simplify it. Due to this, we remove idpf_find_vport as no longer used and alter idpf_recv_event_msg slightly. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: add async_handler for MAC filter messages  Alan Brady
There are situations where the driver needs to add a MAC filter but we're explicitly not allowed to sleep while waiting for a virtchnl message to complete. This adds an async_handler for asynchronously sent messages for MAC filters so that we can better handle an error of some kind. On success we don't need to do anything else, but if we failed to program the new filter we really should remove it from our list of MAC filters. If we don't remove bad filters, what I expect to happen is that after a reset of some kind we try to program the MAC filter again and it fails again. This is clearly wrong and I would expect it to be confusing for the user. It could also be that the failure is for a delete MAC filter message, but those filters get deleted regardless. Not much we can do about a delete failure. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: refactor remaining virtchnl messages  Alan Brady
This takes care of RSS/SRIOV/MAC and other misc virtchnl messages. This again is mostly mechanical. In absence of an async_handler for MAC filters, this will simply generically report any errors from idpf_vc_xn_forward_async. This maintains the existing behavior. Follow up patch will add an async handler for MAC filters to remove bad filters from our list. While we're here we can also make the code much nicer by converting some variables to auto-variables where appropriate. This makes it cleaner and less prone to memory leaking. There's still a bit more cleanup we can do here to remove stuff that's not being used anymore now; follow-up patches will take care of loose ends. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: refactor queue related virtchnl messages  Alan Brady
This reworks queue specific virtchnl messages to use the added transaction API. It is fairly mechanical and generally makes the functions using it simpler. Functions using the transaction API no longer need to take the vc_buf_lock since it's not using it anymore. After filling out an idpf_vc_xn_params struct, idpf_vc_xn_exec takes care of the send and recv handling. This also converts those functions where appropriate to use auto-variables instead of manually calling kfree. This greatly simplifies the memory alloc paths and makes them less prone to memory leaks. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: refactor vport virtchnl messages  Alan Brady
This reworks the way vport related virtchnl messages work to take advantage of the added transaction API. It is fairly mechanical as, to use the transaction API, the function just needs to fill out an appropriate idpf_vc_xn_params struct to pass to idpf_vc_xn_exec which will take care of the actual send and recv. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04  idpf: implement virtchnl transaction manager  Alan Brady
This starts refactoring how virtchnl messages are handled by adding a transaction manager (idpf_vc_xn_manager). There are two primary motivations here: to enable handling of multiple messages at once and to make it more robust in general. As it is right now, the driver may only have one pending message at a time and there's no guarantee that the response we receive was actually intended for the message we sent prior. This works by utilizing a "cookie" field of the message descriptor. It is arbitrary what data we put in the cookie and the response is required to have the same cookie the original message was sent with. A "transaction" abstraction then uses the completion API to pair each response with the message it belongs to. The cookie works such that the first half is the index to the transaction in our array, and the second half is a "salt" that gets incremented every message. This enables quick lookups into the array and also ensures we have the correct message. The salt is necessary because after, for example, a message times out and we deem the response was lost for some reason, we could theoretically reuse the same index, but using a different salt ensures that when we do actually get a response it's not the old message that timed out previously finally coming in. Since the number of transactions allocated is U8_MAX and the salt is 8 bits, we can never have a conflict because we can't roll over the salt without using more transactions than we have available. This starts by only converting the VIRTCHNL2_OP_VERSION message to use this new transaction API. Follow up patches will convert all virtchnl messages to use the API.
Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
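The index-plus-salt cookie can be sketched as below. The byte layout (index in the low byte, salt in the high byte), the struct names and both functions are assumptions made for this sketch; only the 8-bit salt and the U8_MAX transaction count come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define XN_RING_LEN 255		/* U8_MAX transactions, as in the commit text */

struct xn {
	uint8_t idx;
	uint8_t salt;
	bool waiting;
};

struct xn_manager {
	struct xn ring[XN_RING_LEN];
	uint8_t salt;		/* incremented for every message sent */
};

/* Assumed layout: index in the low byte, salt in the high byte. */
static uint16_t make_cookie(uint8_t idx, uint8_t salt)
{
	return (uint16_t)(((uint16_t)salt << 8) | idx);
}

/* Start a transaction in slot `idx` and return the cookie to send. */
uint16_t xn_send(struct xn_manager *m, uint8_t idx)
{
	struct xn *xn = &m->ring[idx];

	xn->idx = idx;
	xn->salt = m->salt++;
	xn->waiting = true;
	return make_cookie(xn->idx, xn->salt);
}

/* Match a reply to its transaction; NULL for stale or unknown cookies. */
struct xn *xn_match_reply(struct xn_manager *m, uint16_t cookie)
{
	uint8_t idx = (uint8_t)(cookie & 0xff);
	uint8_t salt = (uint8_t)(cookie >> 8);
	struct xn *xn;

	if (idx >= XN_RING_LEN)
		return NULL;

	xn = &m->ring[idx];
	if (!xn->waiting || xn->salt != salt)
		return NULL;

	xn->waiting = false;
	return xn;
}
```

If a slot is reused after a timeout, a late reply to the first message carries the old salt and is rejected, while the reply to the newer message still matches.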
2024-03-04idpf: add idpf_virtchnl.hAlan Brady
idpf.h is quite heavy. We can reduce the burden a fair bit by introducing an idpf_virtchnl.h file. This mostly just moves function declarations but there are many of them. This also makes an attempt to group those declarations in a way that makes some sense instead of leaving them as a mishmash. Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04eth: igc: remove unused embedded struct net_deviceJakub Kicinski
struct net_device poll_dev in struct igc_q_vector was added in one of the initial commits, but never used. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04net: adopt skb_network_header_len() more broadlyEric Dumazet
(skb_transport_header(skb) - skb_network_header(skb)) can be replaced by skb_network_header_len(skb). Add a DEBUG_NET_WARN_ON_ONCE() in skb_network_header_len() to catch cases where the transport_header was not set. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04net: adopt skb_network_offset() and similar helpersEric Dumazet
This is a cleanup patch, making code a bit more concise. 1) Use skb_network_offset(skb) in place of (skb_network_header(skb) - skb->data) 2) Use -skb_network_offset(skb) in place of (skb->data - skb_network_header(skb)) 3) Use skb_transport_offset(skb) in place of (skb_transport_header(skb) - skb->data) 4) Use skb_inner_transport_offset(skb) in place of (skb_inner_transport_header(skb) - skb->data) Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> # for sfc Signed-off-by: David S. Miller <davem@davemloft.net>
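The substitutions listed in the cleanup above are plain pointer arithmetic. The sketch below mimics them on a toy structure so the equivalences can be checked outside the kernel; struct toy_skb and the toy_* helpers are stand-ins invented for this example and only model the fields needed here.

```c
#include <stddef.h>

/* Toy stand-in for struct sk_buff; header positions are offsets from head. */
struct toy_skb {
	unsigned char *head;			/* start of the buffer */
	unsigned char *data;			/* start of the current payload */
	unsigned short network_header;		/* offset from head */
	unsigned short transport_header;	/* offset from head */
};

static unsigned char *toy_network_header(const struct toy_skb *skb)
{
	return skb->head + skb->network_header;
}

static unsigned char *toy_transport_header(const struct toy_skb *skb)
{
	return skb->head + skb->transport_header;
}

/* Same value as (network header - data); negative once data is past it. */
static int toy_network_offset(const struct toy_skb *skb)
{
	return (int)(toy_network_header(skb) - skb->data);
}

/* Same value as (transport header - data). */
static int toy_transport_offset(const struct toy_skb *skb)
{
	return (int)(toy_transport_header(skb) - skb->data);
}
```

Case 2) in the list follows directly: (data - network header) is the same quantity with the sign flipped, hence -skb_network_offset(skb).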
2024-03-01ice: reconfig host after changing MSI-X on VFMichal Swiatkowski
During VSI reconfiguration, the filters and VSI config set in ice_vf_init_host_cfg() are lost. Call the host configuration function again to restore them. Without this config, a VF whose MSI-X count was changed might have connection problems. Fixes: 4d38cb44bd32 ("ice: manage VFs MSI-X using resource tracking") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-01ice: reorder disabling IRQ and NAPI in ice_qp_disMaciej Fijalkowski
ice_qp_dis() currently does things in a very mixed order. Tx is stopped before disabling IRQ on related queue vector, then it takes care of disabling Rx and finally NAPI is disabled. Let us start with disabling IRQs first, followed by turning off NAPI. Then it is safe to handle queues. One subtle change on top of that is that even though ice_qp_ena() looks more sane, clear ICE_CFG_BUSY as the last thing there. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-01i40e: disable NAPI right after disabling irqs when handling xsk_poolMaciej Fijalkowski
Disable NAPI before shutting down queues that this particular NAPI contains so that the order of actions in i40e_queue_pair_disable() mirrors what we do in i40e_queue_pair_enable(). Fixes: 123cecd427b6 ("i40e: added queue pair disable/enable functions") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-01ixgbe: {dis, en}able irqs in ixgbe_txrx_ring_{dis, en}ableMaciej Fijalkowski
Currently, routines that are supposed to toggle the state of a ring pair do not take care of the interrupt associated with the queue vector that these rings belong to. This causes funky issues such as a dead interface due to irq misconfiguration, as per Pavel's report from the Closes: tag. Add a function responsible for disabling a single IRQ in the EIMC register and call it as the very first thing when disabling a ring pair during xsk_pool setup. For enable let's reuse ixgbe_irq_enable_queues(). Besides this, disable/enable NAPI as first/last thing when dealing with closing or opening the ring pair that xsk_pool is being configured on. Reported-by: Pavel Vazharov <pavel@x3me.net> Closes: https://lore.kernel.org/netdev/CAJEV1ijxNyPTwASJER1bcZzS9nMoZJqfR86nu_3jFFVXzZQ4NA@mail.gmail.com/ Fixes: 024aa5800f32 ("ixgbe: added Rx/Tx ring disable/enable functions") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Chandan Kumar Rout <chandanx.rout@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: net/mptcp/protocol.c adf1bb78dab5 ("mptcp: fix snd_wnd initialization for passive socket") 9426ce476a70 ("mptcp: annotate lockless access for RX path fields") https://lore.kernel.org/all/20240228103048.19255709@canb.auug.org.au/ Adjacent changes: drivers/dpll/dpll_core.c 0d60d8df6f49 ("dpll: rely on rcu for netdev_dpll_pin()") e7f8df0e81bf ("dpll: move xa_erase() call in to match dpll_pin_alloc() error path order") drivers/net/veth.c 1ce7d306ea63 ("veth: try harder when allocating queue memory") 0bef512012b1 ("net: add netdev_lockdep_set_classes() to virtual drivers") drivers/net/wireless/intel/iwlwifi/mvm/d3.c 8c9bef26e98b ("wifi: iwlwifi: mvm: d3: implement suspend with MLO") 78f65fbf421a ("wifi: iwlwifi: mvm: ensure offloading TID queue exists") net/wireless/nl80211.c f78c1375339a ("wifi: nl80211: reject iftype change with mesh ID change") 414532d8aa89 ("wifi: cfg80211: use IEEE80211_MAX_MESH_ID_LEN appropriately") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-28igb: extend PTP timestamp adjustments to i211Oleksij Rempel
The i211 requires the same PTP timestamp adjustments as the i210, according to its datasheet. To ensure consistent timestamping across different platforms, this change extends the existing adjustments to include the i211. The adjustment results were tested and are comparable for i210 and i211 based systems. Fixes: 3f544d2a4d5c ("igb: adjust PTP timestamps for Tx/Rx latency") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240227184942.362710-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-02-28net: intel: igc: Use linkmode helpers for EEEAndrew Lunn
Make use of the existing linkmode helpers for converting PHY EEE register values into link modes, now that ethtool_keee uses link modes, rather than u32 values. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-28net: intel: igb: Use linkmode helpers for EEEAndrew Lunn
Make use of the existing linkmode helpers for converting PHY EEE register values into link modes, now that ethtool_keee uses link modes, rather than u32 values. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-28net: intel: e1000e: Use linkmode helpers for EEEAndrew Lunn
Make use of the existing linkmode helpers for converting PHY EEE register values into link modes, now that ethtool_keee uses link modes, rather than u32 values. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-28net: intel: i40e/igc: Remove setting Autoneg in EEE capabilitiesAndrew Lunn
Energy Efficient Ethernet should always be negotiated with the link peer. Don't include SUPPORTED_Autoneg in the results of get_eee() for supported, advertised or lp_advertised, since it is assumed. Additionally, ethtool(1) ignores the set bit, and no other driver sets this. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-28net: ethernet: ixgbe: Convert EEE to use linkmodesAndrew Lunn
Convert the tables to make use of ETHTOOL link mode bits, rather than the old u32 SUPPORTED speeds. Make use of the linkmode helpers to set bits and compare linkmodes. As a result, the _u32 members of keee are no longer used, a step towards removing them. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-02-20ice: Fix ASSERT_RTNL() warning during certain scenariosAmritha Nambiar
Commit 91fdbce7e8d6 ("ice: Add support in the driver for associating queue with napi") invoked the netif_queue_set_napi() call. This kernel function must be called with rtnl_lock held; otherwise an ASSERT_RTNL() warning is triggered. ice_vsi_rebuild() initiating this call is under rtnl_lock when the rebuild is in response to configuration changes from external interfaces (such as tc, ethtool etc. which hold the lock). But the VSI rebuild generated from service tasks and resets (PFR/CORER/GLOBR) is not under rtnl lock protection. Handle these cases as well by taking the lock before the kernel call (by setting the 'locked' boolean to false). netif_queue_set_napi() is also used to clear previously set napi in the q_vector unroll flow. Handle this for locked/lockless execution paths. Fixes: 91fdbce7e8d6 ("ice: Add support in the driver for associating queue with napi") Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
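The 'locked' boolean pattern from the ASSERT_RTNL() fix above can be illustrated with a toy lock. This is only a sketch of the calling convention, assuming a simple depth counter in place of rtnl_lock; toy_lock, toy_set_napi and toy_queue_set_napi are hypothetical names, not driver or kernel functions.

```c
#include <stdbool.h>

/* Toy stand-in for rtnl_lock: depth > 0 means the lock is held. */
struct toy_lock {
	int depth;
};

static void toy_lock_take(struct toy_lock *l)
{
	l->depth++;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->depth--;
}

/* Stand-in for a call that asserts the lock is held: 0 if held, -1 if not. */
static int toy_set_napi(const struct toy_lock *l)
{
	return l->depth > 0 ? 0 : -1;
}

/*
 * 'locked' tells the helper whether the caller already holds the lock.
 * Callers coming from the reset or service-task paths pass false, so
 * the helper takes and drops the lock around the asserted call.
 */
static int toy_queue_set_napi(struct toy_lock *l, bool locked)
{
	int ret;

	if (!locked)
		toy_lock_take(l);
	ret = toy_set_napi(l);
	if (!locked)
		toy_lock_release(l);

	return ret;
}
```

Both kinds of caller end up making the asserted call with the lock held, and the helper leaves the lock state exactly as it found it.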
2024-02-20ice: fix pin phase adjust updates on PF resetArkadiusz Kubalewski
Do not allow setting the phase adjust value for a pin while a PF reset is in progress; doing so would cause confusing netlink extack errors, as the firmware cannot process the request properly during the reset time. Return (-EBUSY) and report an extack error to the user who tries to configure pin phase adjust during the reset time. Test by looping execution of below steps until netlink error appears: - perform PF reset $ echo 1 > /sys/class/net/<ice PF>/device/reset - change pin phase adjust value: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --do pin-set --json '{"id":0, "phase-adjust":1000}' Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks") Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-20ice: fix dpll periodic work data updates on PF resetArkadiusz Kubalewski
Do not allow dpll periodic work function to acquire data from firmware if PF reset is in progress. Acquiring data will cause dmesg errors as the firmware cannot respond or process the request properly during the reset time. Test by looping execution of below step until dmesg error appears: - perform PF reset $ echo 1 > /sys/class/net/<ice PF>/device/reset Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-20ice: fix dpll and dpll_pin data access on PF resetArkadiusz Kubalewski
Do not allow acquiring data or altering the configuration of dpll and pins through firmware while a PF reset is in progress; doing so would cause confusing netlink extack errors, as the firmware cannot respond to or process the request properly during the reset time. Return (-EBUSY) and an extack error to the user who tries to access/modify the config of dpll/pin through firmware during the reset time. The PF reset and kernel access to dpll data are both asynchronous. It is not possible to guard all the possible reset paths with any deterministic approach. I.e., it is possible that the reset starts after the reset check is performed (even if the reset were checked after the mutex is locked), but at the same time it is not possible to wait for dpll mutex unlock in the reset flow. This is a best-effort solution to at least give the user a clue about what is happening in most cases, knowing that there are possible race conditions where the user could see a different error received from firmware due to a reset unexpectedly starting. Test by looping execution of below steps until netlink error appears: - perform PF reset $ echo 1 > /sys/class/net/<ice PF>/device/reset - e.g. try to alter/read dpll/pin config: $ ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/dpll.yaml \ --dump pin-get Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
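The best-effort guard described in the dpll reset fixes above amounts to an early check before touching firmware. The following is a simplified sketch under stated assumptions: toy_pf, toy_pin_get and the extack message text are invented for the example, and the firmware query is replaced by a plain field read.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical PF state; fw_value stands in for data read from firmware. */
struct toy_pf {
	bool reset_in_progress;
	int fw_value;
};

/*
 * Refuse the request with -EBUSY and a readable message while a reset
 * is in progress. The check is best effort: a reset may still start
 * right after it, in which case a different firmware error surfaces.
 */
static int toy_pin_get(const struct toy_pf *pf, int *out, const char **extack)
{
	if (pf->reset_in_progress) {
		*extack = "PF reset in progress";	/* illustrative text */
		return -EBUSY;
	}

	*out = pf->fw_value;
	return 0;
}
```

The point of the early return is a clear error for the common case, not a guarantee; the race window acknowledged in the commit message remains.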
2024-02-20ice: fix dpll input pin phase_adjust value updatesArkadiusz Kubalewski
The value of phase_adjust for an input pin shall be updated in ice_dpll_pin_state_update(..). Fix by adding the proper argument to the firmware query function call - a pointer to the pin's struct field where the phase_adjust value is stored during driver runtime. Previously the phase_adjust used to misinform the user about the actual phase_adjust value. E.g., if phase_adjust was set to a non-zero value and the driver was reloaded, the user would see a value of 0, which is not correct - the actual value is equal to the value set before driver reload. Fixes: 90e1c90750d7 ("ice: dpll: implement phase related callbacks") Reviewed-by: Alan Brady <alan.brady@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-02-20ice: fix connection state of DPLL and out pinYochai Hagvi
Fix the connection state between source DPLL and output pin, updating the attribute 'state' of 'parent_device'. Previously, the connection state was broken, and didn't reflect the correct state. When 'state_on_dpll_set' is called with the value 'DPLL_PIN_STATE_CONNECTED' (1), the output pin will switch to the given DPLL, and the state of the given DPLL will be set to connected. E.g.: --do pin-set --json '{"id":2, "parent-device":{"parent-id":1, "state": 1 }}' This command will connect DPLL device with id 1 to output pin with id 2. When 'state_on_dpll_set' is called with the value 'DPLL_PIN_STATE_DISCONNECTED' (2) and the given DPLL is currently connected, then the output pin will be disabled. E.g.: --do pin-set --json '{"id":2, "parent-device":{"parent-id":1, "state": 2 }}' This command will disable output pin with id 2 if DPLL device with ID 1 is connected to it; otherwise, the command is ignored. Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Yochai Hagvi <yochai.hagvi@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
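The connect/disconnect rules described for 'state_on_dpll_set' above can be written down as a small state update. This sketch only models the behaviour stated in the commit message; the TOY_* constants mirror the values 1 and 2 quoted there, while toy_out_pin, TOY_NO_DPLL and toy_state_on_dpll_set are hypothetical names.

```c
/* Values as quoted in the commit message: connected = 1, disconnected = 2. */
enum {
	TOY_PIN_STATE_CONNECTED = 1,
	TOY_PIN_STATE_DISCONNECTED = 2,
};

#define TOY_NO_DPLL (-1)	/* marker for a disabled output pin */

/* Hypothetical output pin: id of its source DPLL, or TOY_NO_DPLL. */
struct toy_out_pin {
	int src_dpll;
};

static void toy_state_on_dpll_set(struct toy_out_pin *pin, int dpll_id,
				  int state)
{
	if (state == TOY_PIN_STATE_CONNECTED) {
		/* Switch the output pin to the given DPLL. */
		pin->src_dpll = dpll_id;
	} else if (state == TOY_PIN_STATE_DISCONNECTED &&
		   pin->src_dpll == dpll_id) {
		/* Disable the pin only if that DPLL is the one connected. */
		pin->src_dpll = TOY_NO_DPLL;
	}
	/* Otherwise the request is ignored. */
}
```

A disconnect request naming a DPLL that is not the pin's current source leaves the pin untouched, matching the "otherwise, the command is ignored" case.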
2024-02-16i40e: Remove VEB recursionIvan Vecera
The VEB (virtual embedded switch) as a switch element can be connected, according to the datasheet, through its uplink to: - Physical port - Port Virtualizer (not used directly by i40e driver but can be present in MFP mode where the physical port is shared between PFs) - No uplink (aka floating VEB) But VEB uplink cannot be connected to another VEB and any attempt to do so results in: "i40e 0000:02:00.0: couldn't add VEB, err -EIO aq_err I40E_AQ_RC_ENOENT" that indicates "the uplink SEID does not point to valid element". Remove this logic from the driver code this way: 1) For debugfs only allow building a floating VEB (uplink_seid == 0) or the main VEB (uplink_seid == mac_seid) 2) Do not recurse in i40e_veb_link_event() as no VEB can have sub-VEBs 3) Ditto for i40e_veb_rebuild() + simplify the function as we know that the VEB for rebuild can be only the main LAN VEB or some of the floating VEBs 4) In i40e_rebuild() there is no need to check veb->uplink_seid as the possible ones are 0 and MAC SEID 5) In i40e_vsi_release() do not take into account VEBs whose uplink is another VEB as this is not possible 6) Remove veb_idx field from i40e_veb as a VEB cannot have sub-VEBs Tested using i40e debugfs interface: 1) Initial state [root@cnb-03 net-next]# CMD="/sys/kernel/debug/i40e/0000:02:00.0/command" [root@cnb-03 net-next]# echo dump switch > $CMD [root@cnb-03 net-next]# dmesg -c [ 98.440641] i40e 0000:02:00.0: header: 3 reported 3 total [ 98.446053] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 98.452593] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 98.458856] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 2) Add floating VEB [root@cnb-03 net-next]# echo add relay > $CMD [root@cnb-03 net-next]# dmesg -c [ 122.745630] i40e 0000:02:00.0: added relay 162 [root@cnb-03 net-next]# echo dump switch > $CMD [root@cnb-03 net-next]# dmesg -c [ 136.650049] i40e 0000:02:00.0: header: 4 reported 4 total [ 136.655466] i40e 0000:02:00.0: type=19 seid=392 
uplink=160 downlink=16 [ 136.661994] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 136.668264] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 [ 136.674787] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0 3) Add VMDQ2 VSI to this new VEB [root@cnb-03 net-next]# dmesg -c [ 168.351763] i40e 0000:02:00.0: added VSI 394 to relay 162 [ 168.374652] enp2s0f0np0v0: NIC Link is Up, 40 Gbps Full Duplex, Flow Control: None [root@cnb-03 net-next]# echo dump switch > $CMD [root@cnb-03 net-next]# dmesg -c [ 195.683204] i40e 0000:02:00.0: header: 5 reported 5 total [ 195.688611] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16 [ 195.695143] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0 [ 195.701410] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 195.707935] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 195.714201] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 4) Try to delete the VEB [root@cnb-03 net-next]# echo del relay 162 > $CMD [root@cnb-03 net-next]# dmesg -c [ 239.260901] i40e 0000:02:00.0: deleting relay 162 [ 239.265621] i40e 0000:02:00.0: can't remove VEB 162 with 1 VSIs left 5) Do PF reset and check switch status after rebuild [root@cnb-03 net-next]# echo pfr > $CMD [root@cnb-03 net-next]# echo dump switch > $CMD [root@cnb-03 net-next]# dmesg -c ... 
[ 272.333655] i40e 0000:02:00.0: header: 5 reported 5 total [ 272.339066] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16 [ 272.345599] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0 [ 272.351862] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 272.358387] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 272.364654] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 6) Delete VSI and delete VEB [ 297.199116] i40e 0000:02:00.0: deleting VSI 394 [ 299.807580] i40e 0000:02:00.0: deleting relay 162 [ 309.767905] i40e 0000:02:00.0: header: 3 reported 3 total [ 309.773318] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 309.779845] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 309.786111] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
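Point 1) of the VEB recursion commit above restricts which uplinks a VEB may be built on. A minimal sketch of that rule follows; toy_veb_uplink_ok is a hypothetical helper written for illustration, not a function from the i40e driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * A VEB is either floating (uplink_seid == 0) or the main VEB hanging
 * off the MAC (uplink_seid == mac_seid). Any other uplink, including
 * another VEB's SEID, is rejected.
 */
static bool toy_veb_uplink_ok(uint16_t uplink_seid, uint16_t mac_seid)
{
	return uplink_seid == 0 || uplink_seid == mac_seid;
}
```

In the debugfs dumps above, the main VEB (seid 160) reports uplink=2 and the floating VEB (seid 162) reports uplink=0; with mac_seid taken as 2, both pass this check, while an uplink of 160 would not.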
2024-02-16i40e: Fix broken support for floating VEBsIvan Vecera
Although the i40e supports so-called floating VEBs (VEBs without an uplink connection to the external network), this support is broken. This functionality is currently unused (except debugfs) but it will be used by a subsequent series for switchdev mode slow-path. Fix this as follows: 1) Correctly handle floating VEBs (VEBs with uplink_seid == 0) in i40e_reconstitute_veb(): look for the owner VSI and create it only for non-floating VEBs, and set bridge mode only for such VEBs, as the floating ones always use VEB mode. 2) Correctly handle floating VEBs in i40e_veb_release() and disallow their release when there are some VSIs. This is different from regular VEBs, which have an owner VSI that is connected to the VEB's uplink after VEB deletion by FW. 3) Fix i40e_add_veb() to handle 'vsi' that is NULL for floating VEBs. For a floating VEB use 0 for downlink SEID and 'true' for 'default_port' parameters as per datasheet. 4) Fix the 'add relay' command in i40e_dbg_command_write() to allow creating a floating VEB with 'add relay 0 0' or 'add relay' Tested using debugfs: 1) Initial state [root@host net-next]# echo dump switch > $CMD [root@host net-next]# dmesg -c [ 173.701286] i40e 0000:02:00.0: header: 3 reported 3 total [ 173.706701] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 173.713241] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 173.719507] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 2) Add floating VEB [root@host net-next]# CMD="/sys/kernel/debug/i40e/0000:02:00.0/command" [root@host net-next]# echo add relay > $CMD [root@host net-next]# dmesg -c [ 245.551720] i40e 0000:02:00.0: added relay 162 [root@host net-next]# echo dump switch > $CMD [root@host net-next]# dmesg -c [ 276.984371] i40e 0000:02:00.0: header: 4 reported 4 total [ 276.989779] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 276.996302] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 277.002569] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 [ 
277.009091] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0 3) Add VMDQ2 VSI to this new VEB [root@host net-next]# echo add vsi 162 > $CMD [root@host net-next]# dmesg -c [ 332.314030] i40e 0000:02:00.0: added VSI 394 to relay 162 [ 332.337486] enp2s0f0np0v0: NIC Link is Up, 40 Gbps Full Duplex, Flow Control: None [root@host net-next]# echo dump switch > $CMD [root@host net-next]# dmesg -c [ 387.284490] i40e 0000:02:00.0: header: 5 reported 5 total [ 387.289904] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16 [ 387.296446] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0 [ 387.302708] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 387.309234] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 387.315500] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 4) Try to delete the VEB [root@host net-next]# echo del relay 162 > $CMD [root@host net-next]# dmesg -c [ 428.749297] i40e 0000:02:00.0: deleting relay 162 [ 428.754011] i40e 0000:02:00.0: can't remove VEB 162 with 1 VSIs left 5) Do PF reset and check switch status after rebuild [root@host net-next]# echo pfr > $CMD [root@host net-next]# echo dump switch > $CMD [root@host net-next]# dmesg -c [ 738.056172] i40e 0000:02:00.0: header: 5 reported 5 total [ 738.061577] i40e 0000:02:00.0: type=19 seid=394 uplink=162 downlink=16 [ 738.068104] i40e 0000:02:00.0: type=17 seid=162 uplink=0 downlink=0 [ 738.074367] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 738.080892] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 738.087160] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 6) Delete VSI and delete VEB [root@host net-next]# echo del vsi 394 > $CMD [root@host net-next]# echo del relay 162 > $CMD [root@host net-next]# echo dump switch > $CMD [root@host net-next]# dmesg -c [ 1233.081126] i40e 0000:02:00.0: deleting VSI 394 [ 1239.345139] i40e 0000:02:00.0: deleting relay 162 [ 1244.886920] i40e 0000:02:00.0: header: 3 reported 3 total [ 
1244.892328] i40e 0000:02:00.0: type=19 seid=392 uplink=160 downlink=16 [ 1244.898853] i40e 0000:02:00.0: type=17 seid=160 uplink=2 downlink=0 [ 1244.905119] i40e 0000:02:00.0: type=19 seid=390 uplink=160 downlink=16 Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>