summaryrefslogtreecommitdiff
path: root/drivers/net/mlx5/mlx5.h
AgeCommit message (Collapse)Author
2019-08-06net/mlx5: fix limit of direct rules tables numberDekel Peled
MLX5 PMD limits the number of SW steering tables to 32. This patch updates the limit to 65535, to allow wide range of values. Fixes: e2b4925ef7c1 ("net/mlx5: support Direct Rules E-Switch") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-08-06net/mlx5: add workaround for VLAN in virtual machineViacheslav Ovsiienko
On some virtual setups (particularly on ESXi) when we have SR-IOV and E-Switch enabled there is the problem to receive VLAN traffic on VF interfaces. The NIC driver in ESXi hypervisor does not setup E-Switch vport setting correctly and VLAN traffic targeted to VF is dropped. The patch provides the temporary workaround - if the rule containing the VLAN pattern is being installed for VF the VLAN network interface over VF is created, like the command does: ip link add link vf.if name mlx5.wa.1.100 type vlan id 100 The PMD in DPDK maintains the database of created VLAN interfaces for each existing VF and requested VLAN tags. When all of the RTE Flows using the given VLAN tag are removed the created VLAN interface with this VLAN tag is deleted. The name of created VLAN interface follows the format: evmlx.d1.d2, where d1 is VF interface ifindex, d2 - VLAN ifindex Implementation limitations: - mask in rules is ignored, rule must specify VLAN tags exactly, no wildcards (which are implemented by the masks) are allowed - virtual environment is detected via rte_hypervisor() call, and the type of hypervisor is checked. Currently we engage the workaround for ESXi and unrecognized hypervisors (which always happen on platforms other than x86 - it means workaround applied for the Flow over PCI VF). There are no confirmed data the other hypervisors (HyperV, Qemu) need this workaround, we are trying to reduce the list of configurations on those workaround should be applied. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-29net/mlx5: allow LRO per Rx queueMatan Azrad
Enabling LRO offload per queue makes sense because the user will probably want to allocate different mempool for LRO queues - the LRO mempool mbuf size may be bigger than non LRO mempool. Change the LRO offload to be per queue instead of per port. If one of the queues is with LRO enabled, all the queues will be configured via DevX. If RSS flows direct TCP packets to queues with different LRO enabling, these flows will not be offloaded with LRO. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-29net/mlx5: allow implicit LRO flowMatan Azrad
When a user configures LRO in the port offloads, he probably wants each TCP packet will have a chance to open an LRO session. The PMD wasn't configure LRO in the flow TIR if the flow is not explicitly configured TCP item despite the flow included TCP traffic. For example, the next flows were not LRO offloaded: pattern eth / end, pattern eth / ip / end, pattern eth / ipv6 / end. Enable LRO configuration for all the TIRs if LRO is configured in the port. No performance impact for non-LRO traffic in these TIRs. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: adjust maximum LRO message sizeMatan Azrad
LRO message is contained in the MPRQ strides. While the LRO message size cannot be bigger than 65280 according to the PRM, the strides which contain it may be bigger than the maximum buffer size allowed in dpdk mbuf - 0xFFFF. Adjust the maximum LRO message size to avoid buffer length overflow. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: store protection domain number on createDekel Peled
Function mlx5_alloc_shared_ibctx() allocates Protection Domain using verbs API, as part of shared IB device context. This patch adds reading and storing of pdn value from the created PD object, using DV API. The pdn value is required when creating WQ using DevX API. This patch also updates function flow_dv_create_counter_stat_mem_mng() which uses the pdn value as well. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: rename verbs indirection table to objDekel Peled
Prepare for introducing of DevX RQT object. Rx indirection table object is currently created using verbs only. The next patches will add the option to create an RQT object using DevX. This patch renames ind_table_ibv to ind_table_obj wherever relevant, and adds the DevX items to relevant structs. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: rename RxQ verbs to general RxQ objectDekel Peled
Prepare for introducing of DevX RxQ object. RxQ object is currently created using verbs only. The next patches will add the option to create RxQ object using DevX. This patch renames rxq_ibv to rxq_obj wherever relevant, and adds the DevX items to relevant structs. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: allocate door-bells via DevXDekel Peled
When using DevX API, memory for door-bell records should be allocated by PMD and registered using DevX API. This patch implements the utility functions to support it: - Add struct mlx5_devx_dbr_page, containing door-bells page data. - Add list of struct mlx5_devx_dbr_page door-bell pages to device private data. - Implement function mlx5_alloc_dbr_page() to allocate page for door-bell records, and register it using DevX API. - Implement function mlx5_get_dbr(). to acquire a door-bell record from the door-bells page, allocating a new page if needed. - Implement function mlx5_release_dbr() to release a door-bell record that is no longer needed, freeing the containing page if it becomes empty. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: create advanced RxQ table via DevXDekel Peled
Implement function mlx5_devx_cmd_create_rqt() to create RQT object using DevX API. Add related structs in mlx5.h and mlx5_prm.h. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: create advanced Rx object via DevXDekel Peled
Implement function mlx5_devx_cmd_create_tir() to create TIR object using DevX API.. Add related structs in mlx5.h and mlx5_prm.h. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: modify advanced RxQ object via DevXDekel Peled
Implement function mlx5_devx_cmd_modify_rq() to modify RQ. Add related structs in mlx5.h and mlx5_prm.h. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: create advanced RxQ object via DevXDekel Peled
Implement function mlx5_devx_cmd_create_rq() to create RQ object using DevX API. Add related structs in mlx5.h and mlx5_prm.h. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: update Tx queue create for LRODekel Peled
Update function mlx5_txq_ibv_new(), query and store the TIS transport domain value. It is required later on Rx side when creating matching TIR. Add field in mlx5 data structure to store Transport Domain ID. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: support Tx interface query via DevXDekel Peled
Implement function mlx5_devx_cmd_qp_tis_td_query(), to query QP TIS Transport Domain value. Add related structs in mlx5_prm.h. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: add glue for create action via DevXDekel Peled
Add compile option HAVE_MLX5DV_DR_ACTION_DEST_DEVX_TIR, and matching dest_tir flag in device configuration structure. Add glue function pointer dv_create_flow_action_dest_devx_tir, and function mlx5_glue_dv_create_flow_action_dest_devx_tir(), to invoke API mlx5dv_dr_action_create_dest_devx_tir(); Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: query LRO capabilities via DevXDekel Peled
Update function mlx5_devx_cmd_query_hca_attr() to query HCA capabilities related to LRO. Add relevant structs in drivers/net/mlx5/mlx5_prm.h. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: introduce LRODekel Peled
Add command-line argument to set LRO session timeout. Add LRO settings struct in PMD configuration struct. Add support of LRO offload in port configuration. Add macros and function to check if LRO is supported and enabled. Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: revert Netlink socket sharingViacheslav Ovsiienko
This reverts commit e28111ac9864af09e826241a915dfff87a9c00ad. The netlink requests are replaced by ifindex caching and not needed anymore. Fixes: e28111ac9864 ("net/mlx5: fix master device Netlink socket sharing") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23net/mlx5: cache associated network device indexViacheslav Ovsiienko
The associated device index is retrieved via Netlink request to underlying Infiniband device driver. This network device index is permanent throughout the lifetime of device. We do not spawn the rte_eth_dev ports without associated network device, and if network device is being unbound we get the remove notification message and rte_eth_dev port is also detached. So, we may store the ifindex in mlx5_device_spawn() routine at rte_eth_dev port creation and initialization time and use the cached value further instead of doing actual Netlink request. Reported-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23net/mlx5: add Tx configuration and setupViacheslav Ovsiienko
This patch updates the Tx datapath control and configuration structures and code for managing Tx datapath settings. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23net/mlx5: extend NIC attributes query via DevXViacheslav Ovsiienko
This patch extends the NIC attributes query via DevX. The appropriate interface structures are borrowed from kernel driver headers and DevX calls are added to mlx5_devx_cmd_query_hca_attr() routine. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23net/mlx5: add Tx devargsViacheslav Ovsiienko
This patch introduces new mlx5 PMD devarg options: - txq_inline_min - specifies minimal amount of data to be inlined into WQE during Tx operations. NICs may require this minimal data amount to operate correctly. The exact value may depend on NIC operation mode, requested offloads, etc. - txq_inline_max - specifies the maximal packet length to be completely inlined into WQE Ethernet Segment for ordinary SEND method. If packet is larger the specified value, the packet data won't be copied by the driver at all, data buffer is addressed with a pointer. If packet length is less or equal all packet data will be copied into WQE. - txq_inline_mpw - specifies the maximal packet length to be completely inlined into WQE for Enhanced MPW method. Driver documentation is also updated. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23net/mlx5: remove Tx implementationViacheslav Ovsiienko
This patch removes the existing Tx datapath code as preparation step before introducing the new implementation. The following entities are being removed: - deprecated devargs support - tx_burst() routines - related PRM definitions - SQ configuration code - Tx routine selection code - incompatible Tx completion code The following devargs are deprecated and ignored: - "txq_inline" is going to be converted to "txq_inline_max" for compatibility issue - "tx_vec_en" - "txqs_max_vec" - "txq_mpw_hdr_dseg_en" - "txq_max_inline_len" is going to be converted to "txq_inline_mpw" for compatibility issue The deprecated devarg keys are recognized by PMD and ignored/converted to the new ones in order not to block device probing. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-07-23net/mlx5: fix typos in commentsDekel Peled
Some spelling mistakes were found in comments. This patch fixes them. Fixes: d10b09db0a45 ("net/mlx5: fix allocation when no memory on device NUMA node") Fixes: fc2c498ccb94 ("net/mlx5: add Direct Verbs translate items") Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support") Fixes: f6d9ab4e769f ("net/mlx5: check Tx queue size overflow") Cc: stable@dpdk.org Signed-off-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-23net/mlx5: allow basic counter management fallbackMatan Azrad
In case the asynchronous devx commands are not supported in RDMA core fallback to use a basic counter management. Here, the PMD counters cashe is redundant and the host thread doesn't update it. hence, each counter operation will go to the FW and the acceleration reduces. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23net/mlx5: accelerate DV flow counter queryMatan Azrad
All the DV counters are cashed in the PMD memory and are contained in pools which are contained in containers according to the counters allocation type - batch or single. Currently, the flow counter query is done synchronously in pool resolution means that on the user request a FW command is triggered to read all the counters in the pool. A new feature of devX to asynchronously read batch of flow counters allows to accelerate the user query operation. Using the DPDK host thread, the PMD periodically triggers asynchronous query in pool resolution for all the counter pools and an interrupt is triggered by the FW when the values are updated. In the interrupt handler the pool counter values raw data is replaced using a double buffer algorithm (very fast). In the user query, the PMD just returns the last query values from the PMD cache - no system-calls and FW commands are triggered from the user control thread on query operation! More synchronization is added with the host thread: Container resize uses double buffer algorithm. Pools growing in container uses atomic operation. Pool query buffer replace uses a spinlock. Pool minimum devX counter ID uses atomic operation. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-23net/mlx5: accelerate DV flow counter transactionsMatan Azrad
The DevX interface exposes a new feature to the PMD that can allocate a batch of counters by one FW command. It can improve the flow transaction rate (with count action). Add a new counter pools mechanism to manage HW counters in the PMD. So, for each flow with counter creation the PMD will try to find a free counter in the PMD pools container and only if there is no a free counter, it will allocate a new DevX batch counters. Currently we cannot support batch counter for a group 0 flow, so create a 2 container types, one which allocates counters one by one and one which allocates X counters by the batch feature. The allocated counters objects are never released back to the HW assuming the flows maximum number will be close to the actual value of the flows number. Later, it can be updated, and dynamic release mechanism can be added. The counters are contained in pools, each pool with 512 counters. The pools are contained in counter containers according to the allocation resolution type - single or batch. The cache memory of the counters statistics is saved as raw data per pool. All the raw data memory is allocated for all the container in one memory allocation and is managed by counter_stats_mem_mng structure which registers all the raw memory to the HW. Each pool points to one raw data structure. The query operation is in pool resolution which updates all the pool counter raw data by one operation. Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-07-05net/mlx5: remove TCF supportMoti Haimovsky
This commit removes the support of configuring the device E-switch using TCF since it is now possible to configure it via DR (direct verbs rules), and by that to also remove the PMD dependency in libmnl. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
2019-07-06eal: hide shared memory configAnatoly Burakov
Now that everything that has ever accessed the shared memory config is doing so through the public API's, we can make it internal. Since we're removing quite a few headers from rte_eal_memconfig.h, we need to add them back in places where this header is used. This bumps the ABI, so also change all build files and make update documentation. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>
2019-06-14net/mlx5: fix master device Netlink socket sharingViacheslav Ovsiienko
There is the patch [1] that uses master device Netlink socket to retrieve master device link settings. This is not thread safe because this resource may be in use by other call to the master device itself. Using the same Netlink socket concurrently from the multiple threads causes Netlink requests malfunction and must be eliminated. The patch replaces master Netlink socket with the socket from representor device. [1] http://patches.dpdk.org/patch/53120/ Fixes: 0333b2f584d9 ("net/mlx5: inherit master link settings for representors") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14net/mlx5: recover secondary process Rx errorsMatan Azrad
The RQ errors recovery mechanism in the PMD invokes a Verbs functions to modify the RQ states in order to reset the RQ and to reactivate it. These Verbs functions are not allowed to be invoked from a secondary process, hence the PMD skips the recovery when the error is captured by secondary processes queues. Using the DPDK IPC mechanism the secondary process can request Verbs queues state modifications to be done synchronically by the primary process. Add support for secondary process Rx errors recovery. Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-14net/mlx5: add log file procedure for debug dataMatan Azrad
Add a global function in the PMD which dumps debug information to specific file. The data can be printed in hexadecimal format or as regular string. The number of debug files per PMD entity should be limited by a new PMD probe parameter called max_dump_files_num. The files will be created in the /var/log directory or in the current directory. Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-06net/mlx5: fix event handler uninstallViacheslav Ovsiienko
When device is being closed and tries to unregister interrupt callback, there is a chance the handler is still active (called in context of eal_intr_thread_main thread). If so the rte_intr_callback_unregister returns -EAGAIN and keeps the handler registered, causing crash when underlaying resourse is gone away. This race condition may happen if event handling in application takes a long time. We should check the return code of unregistering routine and try again to unregister the handler. The diagnostic messages are shown once a second, while trying to unregister. Fixes: 028b2a28c3cb ("net/mlx5: update event handler for multiport IB devices") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-06-06net/mlx5: support reading clockTom Barbette
Implements support for read_clock for the mlx5 driver. mlx5 supports hardware timestamp offload, setting packets timestamp field to the device clock. rte_eth_read_clock allows to read the device's current clock value and therefore compare values on similar time base. See rxtx_callbacks for an example. Signed-off-by: Tom Barbette <barbette@kth.se> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-06-05ipc: handle unsupported IPC in action registerAnatoly Burakov
Currently, IPC API will silently ignore unsupported IPC. Fix the API call and its callers to explicitly handle unsupported IPC cases. For primary processes, it is OK to not have IPC because there may not be any secondary processes in the first place, and there are valid use cases that disable IPC support, so all primary process usages are fixed up to ignore IPC failures. For secondary processes, IPC will be crucial, so leave all of the error handling as is. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-05-24net: add rte prefix to ether definesOlivier Matz
Add 'RTE_' prefix to defines: - rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN. - rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN. - rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN. - rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN. - rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN. - rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN. - rename ETHER_MTU as RTE_ETHER_MTU. - rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN. - rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID. - rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN. - rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU. - rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR. - rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR. - rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4. - rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6. - rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP. - rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN. - rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP. - rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ. - rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG. - rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588. - rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW. - rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB. - rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP. - rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS. - rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM. - rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN. - rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE. - rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4. - rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6. - rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH. - rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH. - rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS. - rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP. - rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG. - rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN. Do not update the command line library to avoid adding a dependency to librte_net. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24net: add rte prefix to ether structuresOlivier Matz
Add 'rte_' prefix to structures: - rename struct ether_addr as struct rte_ether_addr. - rename struct ether_hdr as struct rte_ether_hdr. - rename struct vlan_hdr as struct rte_vlan_hdr. - rename struct vxlan_hdr as struct rte_vxlan_hdr. - rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr. Do not update the command line library to avoid adding a dependency to librte_net. Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-03net/mlx5: fix Direct Rules APIOri Kam
The RDMA-CORE Direct Rules API was changed in latest upstream code This commit update the API accordingly. Fixes: 4f84a19779ca ("net/mlx5: add Direct Rules API") Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03net/mlx5: update memory event callback for shared contextViacheslav Ovsiienko
Mellanox mlx5 PMD implements the list of devices to process the memory free events to reflect the actual memory state to Memory Regions. Because this list contains the devices and devices may share the same context the callback routine may be called multiple times with the same parameter, that is not optimal. This patch modifies the list to contain the device contexts instead of device objects and shared context is included in the list only once. Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-03net/mlx5: share Memory Regions for multiport deviceViacheslav Ovsiienko
The multiport Infiniband device support was introduced [1]. All active ports, belonging to the same Infiniband device use the single shared Infiniband context of that device and share the resources: - QPs are created within shared context - Verbs flows are also created with specifying port index - DV/DR resources - Protection Domain - Event Handlers This patchset adds support for Memory Regions sharing between ports, created on the base of multiport Infiniband device. The datapath of mlx5 uses the layered cache subsystem for allocating/releasing Memory Regions, only the lowest layer L3 is subject to share due to performance issues. [1] http://patches.dpdk.org/cover/51800/ Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19net/mlx5: add drop action to Direct Verbs E-SwitchOri Kam
This commit adds support for drop action when creating E-Switch flow using DV. Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19net/mlx5: add E-Switch port id action to Direct VerbsOri Kam
This commits adds matching on source port, using DV API. Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19net/mlx5: validate Direct Rule E-SwitchOri Kam
Add validation logic for E-Switch using Direct Rules. Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19net/mlx5: support Direct Rules E-SwitchOri Kam
This commit checks the for DR E-Switch support. The support is based on both Device and Kernel. This commit also enables the user to manually disable this this feature. Signed-off-by: Ori Kam <orika@mellanox.com> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-19net/mlx5: support PF representorViacheslav Ovsiienko
On BlueField platform we have the new entity - PF representor. This one represents the PCI PF attached to external host on the side of ARM. The traffic sent by the external host to the NIC via PF will be seem by ARM on this PF representor. This patch refactors port recognizing capability on the base of physical port name. We have two groups of name formats. Legacy name formats are supported by kernels before ver 5.0 (being more precise - before the patch [1]) or before Mellanox OFED 4.6, and new naming formats added by the patch [1]. Legacy naming formats are supported: - missing physical port name (no sysfs/netlink key) at all, master is assumed - decimal digits (for example "12"), representor is assumed, the value is the index of attached VF New naming formats are supported: - "p" followed by decimal digits, for example "p2", master is assumed - "pf" followed by PF index concatenated with "vf" followed by VF index, for example "pf0vf1", representor is assumed. If index of VF is "-1" it is a special case of host PF representor, this representor must be indexed in devargs as 65535, for example representor=[0-3,65535] will allow representors for VF0, VF1, VF2, VF3 and for host PF. Note: do not specify representor=[0-65535], it causes devargs processing error, because number of ports (rte_eth_dev) is limited. Applications should distinguish representors and master devices exclusively by device flag RTE_ETH_DEV_REPRESENTOR and do not rely on switch port_id (mlx5 PMD deduces ones from representor_id) values returned by dev_infos_get() API. [1] https://www.spinics.net/lists/netdev/msg547007.html Linux-tree: c12ecc23 (Or Gerlitz 2018-04-25 17:32 +0300) "net/mlx5e: Move to use common phys port names for vport representors" Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12net/mlx5: remove device register remapYongseok Koh
UAR (User Access Region) register does not need to be remapped for primary process but it should be remapped only for secondary process. UAR register table is in the process private structure in rte_eth_devices[], (struct mlx5_proc_priv *)rte_eth_devices[port_id].process_private The actual UAR table follows the data structure and the table is used for both Tx and Rx. For Tx, BlueFlame in UAR is used to ring the doorbell. MLX5_TX_BFREG(txq) is defined to get a register for the txq. Processes access its own private data to acquire the register from the UAR table. For Rx, the doorbell in UAR is required in arming CQ event. However, it is a known issue that the register isn't remapped for secondary process. Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2019-04-12net/mlx5: fix recursive inclusion of header fileYongseok Koh
mlx5.h includes mlx5_rxtx.h and mlx5_rxtx.h includes mlx5.h recursively. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12net/mlx5: fix device probing for old kernel driversViacheslav Ovsiienko
Retrieving network interface index via Netlink fails in case of old ib_core kernel driver installed - mlx5_nl_ifindex() routine fails due to RDMA_NLDEV_ATTR_NDEV_INDEX attribute is not supported by the old driver. The patch allowing to retrieve the network interface index and name via Netlink [1]. So, the problem depends on ib_core module version - 4.16 supports getting ifindex via Netlink, 4.15 does not. This error was ignored in previous versions of MLX5 PMD probing routine. For single device ifindex was retrieved via sysfs and link control was not lost, so problem just was not noticed. In order to support MLX5 PMD functioning over old kernel driver this patch adds ifindex retrieving via sysfs into probing routine. It is worth to note this method works for master/standalone device only. [1] https://www.spinics.net/lists/linux-rdma/msg62948.html Linux tree: 5b2cc79d (Leon Romanovsky 2018-03-27 20:40:49 +0300 270) Fixes: ad74bc619504 ("net/mlx5: support multiport IB device during probing") Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05net/mlx5: share Direct Rules/Verbs flow related structuresViacheslav Ovsiienko
Direct Rules/Verbs related structures are moved to the shared context: - rx/tx namespaces, shared by master and representors - rx/tx flow tables - matchers - encap/decap action resources - flow tags (MARK actions) - modify action resources - jump tables Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>