path: root/drivers/net/mlx4/mlx4.c
Age | Commit message | Author
8 days | drivers/net: update Rx RSS hash offload capabilities | Pavan Nikhilesh
Add DEV_RX_OFFLOAD_RSS_HASH flag for all PMDs that support RSS hash delivery. Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-08-06 | net/mlx4: fix crash on info query in secondary process | Stephen Hemminger
mlx4_dev_info_get calls mlx4_get_ifname, but mlx4_get_ifname uses priv->ctx which is not a valid pointer in a secondary process. The fix is to cache the value in primary. In the primary process, get and store the interface index of the device so that secondary process can see it. Bugzilla ID: 320 Fixes: 61cbdd419478 ("net/mlx4: separate device control functions") Cc: stable@dpdk.org Reported-by: Suyang Ju <sju@paloaltonetworks.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Matan Azrad <matan@mellanox.com>
2019-07-22 | eal: fix IOVA mode selection as VA for PCI drivers | David Marchand
The incriminated commit broke the use of RTE_PCI_DRV_IOVA_AS_VA, which was intended to mean "driver only supports VA" but had been understood as "driver supports both PA and VA" by most net drivers and used to let DPDK processes run as non-root (which do not have access to physical addresses on recent kernels). The check on physical addresses actually closed the gap for those drivers. We don't need to mark them with RTE_PCI_DRV_IOVA_AS_VA and this flag can retain its intended meaning. Document its meaning explicitly. We can check that a driver requirement with respect to IOVA mode is fulfilled before trying to probe a device. Finally, document the heuristic used to select the IOVA mode and hope that we won't break it again. Fixes: 703458e19c16 ("bus/pci: consider only usable devices for IOVA mode") Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Tested-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
2019-07-06 | eal: hide shared memory config | Anatoly Burakov
Now that everything that has ever accessed the shared memory config is doing so through the public APIs, we can make it internal. Since we're removing quite a few headers from rte_eal_memconfig.h, we need to add them back in places where this header is used. This bumps the ABI, so also change all build files and update documentation. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>
2019-06-05 | ipc: handle unsupported IPC in action register | Anatoly Burakov
Currently, IPC API will silently ignore unsupported IPC. Fix the API call and its callers to explicitly handle unsupported IPC cases. For primary processes, it is OK to not have IPC because there may not be any secondary processes in the first place, and there are valid use cases that disable IPC support, so all primary process usages are fixed up to ignore IPC failures. For secondary processes, IPC will be crucial, so leave all of the error handling as is. Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
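The primary/secondary policy described above can be sketched as follows. This is a minimal illustration only: `ipc_register_stub()` is a hypothetical stand-in for the real registration call, and plain `errno` is used in place of DPDK's per-lcore error variable.

```c
#include <assert.h>
#include <errno.h>

enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/*
 * Hypothetical stand-in for an IPC action registration call: it returns -1
 * and sets errno to ENOTSUP when IPC support is disabled.
 */
static int ipc_register_stub(int ipc_enabled)
{
	if (!ipc_enabled) {
		errno = ENOTSUP;
		return -1;
	}
	return 0;
}

/* Returns 0 when the process may continue, -1 otherwise. */
static int mp_init_sketch(enum proc_type type, int ipc_enabled)
{
	int ret;

	assert(type == PROC_PRIMARY || type == PROC_SECONDARY);
	ret = ipc_register_stub(ipc_enabled);
	if (ret == 0)
		return 0;
	/* A primary process can run without IPC: no secondary may ever attach. */
	if (type == PROC_PRIMARY && errno == ENOTSUP)
		return 0;
	/* A secondary process depends on IPC, so every failure is fatal. */
	return -1;
}
```

The asymmetry is the point of the sketch: the same ENOTSUP result is tolerated in one role and treated as an error in the other.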
2019-06-04 | net/mlx: support IOVA VA mode | Yongseok Koh
Set RTE_PCI_DRV_IOVA_AS_VA to driver's drv_flags as device's IOMMU takes virtual address. Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-05-27 | net/mlx4: use dynamic log type | Stephen Hemminger
This driver should use dynamic log level not RTE_LOGTYPE_PMD. Other drivers were converted back in 18.02. This is really a bug, all other drivers use dynamic log levels by now. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 | net: add rte prefix to ether defines | Olivier Matz
Add 'RTE_' prefix to defines:
- rename ETHER_ADDR_LEN as RTE_ETHER_ADDR_LEN.
- rename ETHER_TYPE_LEN as RTE_ETHER_TYPE_LEN.
- rename ETHER_CRC_LEN as RTE_ETHER_CRC_LEN.
- rename ETHER_HDR_LEN as RTE_ETHER_HDR_LEN.
- rename ETHER_MIN_LEN as RTE_ETHER_MIN_LEN.
- rename ETHER_MAX_LEN as RTE_ETHER_MAX_LEN.
- rename ETHER_MTU as RTE_ETHER_MTU.
- rename ETHER_MAX_VLAN_FRAME_LEN as RTE_ETHER_MAX_VLAN_FRAME_LEN.
- rename ETHER_MAX_VLAN_ID as RTE_ETHER_MAX_VLAN_ID.
- rename ETHER_MAX_JUMBO_FRAME_LEN as RTE_ETHER_MAX_JUMBO_FRAME_LEN.
- rename ETHER_MIN_MTU as RTE_ETHER_MIN_MTU.
- rename ETHER_LOCAL_ADMIN_ADDR as RTE_ETHER_LOCAL_ADMIN_ADDR.
- rename ETHER_GROUP_ADDR as RTE_ETHER_GROUP_ADDR.
- rename ETHER_TYPE_IPv4 as RTE_ETHER_TYPE_IPv4.
- rename ETHER_TYPE_IPv6 as RTE_ETHER_TYPE_IPv6.
- rename ETHER_TYPE_ARP as RTE_ETHER_TYPE_ARP.
- rename ETHER_TYPE_VLAN as RTE_ETHER_TYPE_VLAN.
- rename ETHER_TYPE_RARP as RTE_ETHER_TYPE_RARP.
- rename ETHER_TYPE_QINQ as RTE_ETHER_TYPE_QINQ.
- rename ETHER_TYPE_ETAG as RTE_ETHER_TYPE_ETAG.
- rename ETHER_TYPE_1588 as RTE_ETHER_TYPE_1588.
- rename ETHER_TYPE_SLOW as RTE_ETHER_TYPE_SLOW.
- rename ETHER_TYPE_TEB as RTE_ETHER_TYPE_TEB.
- rename ETHER_TYPE_LLDP as RTE_ETHER_TYPE_LLDP.
- rename ETHER_TYPE_MPLS as RTE_ETHER_TYPE_MPLS.
- rename ETHER_TYPE_MPLSM as RTE_ETHER_TYPE_MPLSM.
- rename ETHER_VXLAN_HLEN as RTE_ETHER_VXLAN_HLEN.
- rename ETHER_ADDR_FMT_SIZE as RTE_ETHER_ADDR_FMT_SIZE.
- rename VXLAN_GPE_TYPE_IPV4 as RTE_VXLAN_GPE_TYPE_IPV4.
- rename VXLAN_GPE_TYPE_IPV6 as RTE_VXLAN_GPE_TYPE_IPV6.
- rename VXLAN_GPE_TYPE_ETH as RTE_VXLAN_GPE_TYPE_ETH.
- rename VXLAN_GPE_TYPE_NSH as RTE_VXLAN_GPE_TYPE_NSH.
- rename VXLAN_GPE_TYPE_MPLS as RTE_VXLAN_GPE_TYPE_MPLS.
- rename VXLAN_GPE_TYPE_GBP as RTE_VXLAN_GPE_TYPE_GBP.
- rename VXLAN_GPE_TYPE_VBNG as RTE_VXLAN_GPE_TYPE_VBNG.
- rename ETHER_VXLAN_GPE_HLEN as RTE_ETHER_VXLAN_GPE_HLEN.
Do not update the command line library to avoid adding a dependency to librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-24 | net: add rte prefix to ether structures | Olivier Matz
Add 'rte_' prefix to structures:
- rename struct ether_addr as struct rte_ether_addr.
- rename struct ether_hdr as struct rte_ether_hdr.
- rename struct vlan_hdr as struct rte_vlan_hdr.
- rename struct vxlan_hdr as struct rte_vxlan_hdr.
- rename struct vxlan_gpe_hdr as struct rte_vxlan_gpe_hdr.
Do not update the command line library to avoid adding a dependency to librte_net.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2019-05-03 | net/mlx4: support multicast address list interface | Adrien Mazarguil
Since this driver does not distinguish unicast/multicast addresses, applications could always rely on the standard MAC add/remove/set interface to configure both types. As a result, the multicast address list interface never got implemented (rte_eth_dev_set_mc_addr_list()) however PMD-agnostic applications still rely on it for compatibility reasons; a wrapper is therefore required. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 | net/mlx4: remove device register remap | Yongseok Koh
The UAR (User Access Region) register does not need to be remapped for the primary process; it only needs to be remapped for secondary processes. The UAR register table is in the process private structure in rte_eth_devices[], (struct mlx4_proc_priv *)rte_eth_devices[port_id].process_private The actual UAR table follows the data structure and the table is used for both Tx and Rx. For Tx, BlueFlame in UAR is used to ring the doorbell. MLX4_TX_BFREG(txq) is defined to get a register for the txq. Each process accesses its own private data to acquire the register from the UAR table. For Rx, the doorbell in UAR is required in arming CQ event. However, it is a known issue that the register isn't remapped for secondary process. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-12 | net/mlx4: fix Tx doorbell register unmap | Yongseok Koh
If rdma-core library doesn't support remapping UAR registers, the register shouldn't be unmapped on device stop. Fixes: 0203d33a1059 ("net/mlx4: support secondary process") Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 | net/mlx4: add control of excessive memory pinning by kernel | Yongseok Koh
A new PMD parameter (mr_ext_memseg_en) is added to control extension of memseg when creating a MR. It is enabled by default. If enabled, mlx4_mr_create() tries to maximize the range of MR registration so that the LKey lookup tables on datapath become smaller and get the best performance. However, it may worsen memory utilization because registered memory is pinned by kernel driver. Even if a page in the extended chunk is freed, that doesn't become reusable until the entire memory is freed and the MR is destroyed. To make freed pages available immediately, this parameter has to be turned off but it could drop performance. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 | net/mlx4: support secondary process | Yongseok Koh
In order to support secondary process, a few features are required. a) rdma-core library should allocate device resources using DPDK's memory allocator. b) UAR should be remapped for secondary processes. Currently, in order not to use different data structure for secondary processes, PMD tries to reserve identical virtual address space for both primary and secondary processes. c) IPC channel is necessary, which can be easily set with rte_mp APIs. Through the channel, Verbs command FD is delivered to the secondary process and the device stop/start event is also broadcast from primary process. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 | net/mlx4: add external allocator for Verbs object | Yongseok Koh
To support secondary process, the memory allocated by library such as completion rings (CQ) and buffer rings (WQ) must be manageable by EAL, in order to share it with secondary processes. With new changes in rdma-core and kernel driver, it is possible to provide an external allocator to the library layer for this purpose. All such resources will now be allocated within DPDK framework. Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-04-05 | net/mlx4: change device reference for secondary process | Yongseok Koh
rte_eth_devices[] is not shared between primary and secondary process, but a static array to each process. The reverse pointer of device (priv->dev) becomes invalid if mlx4 supports secondary process. Instead, priv has the pointer to shared data of the device, struct rte_eth_dev_data *dev_data; Two macros are added, #define PORT_ID(priv) ((priv)->dev_data->port_id) #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)]) Cc: stable@dpdk.org Suggested-by: Raslan Darawsheh <rasland@mellanox.com> Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
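To illustrate why the two macros resolve correctly in every process, here is a minimal sketch. The struct definitions are simplified stand-ins for the DPDK ones, and `MAX_PORTS` and `local_eth_dev()` are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4 /* hypothetical bound for this sketch */

/* Simplified stand-ins for the structures named in the commit message. */
struct rte_eth_dev_data {
	uint16_t port_id; /* lives in memory shared by all processes */
};

struct rte_eth_dev {
	struct rte_eth_dev_data *data;
};

struct mlx4_priv {
	struct rte_eth_dev_data *dev_data; /* pointer to shared device data */
};

/* Per-process array: every process has its own copy. */
static struct rte_eth_dev rte_eth_devices[MAX_PORTS];

/* The two macros quoted in the commit message. */
#define PORT_ID(priv) ((priv)->dev_data->port_id)
#define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])

/* Resolve the calling process's own device entry from shared data. */
static struct rte_eth_dev *local_eth_dev(const struct mlx4_priv *priv)
{
	assert(priv != NULL && priv->dev_data != NULL);
	assert(PORT_ID(priv) < MAX_PORTS);
	return ETH_DEV(priv);
}
```

Because only the port ID is read from shared memory and the array lookup happens locally, no process ever dereferences another process's `rte_eth_devices[]` address.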
2019-03-01 | net/mlx: prefix private structure | Thomas Monjalon
The private structure stored in rte_eth_dev->data->dev_private was named "struct priv". In order to ease code browsing, the structure is renamed "struct mlx[45]_priv". Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Yongseok Koh <yskoh@mellanox.com>
2019-02-13 | net/mlx: support firmware version query | Thomas Monjalon
The API function rte_eth_dev_fw_version_get() is querying drivers via the operation callback fw_version_get(). The implementation of this operation is added for mlx4 and mlx5. Both functions are copying the same ibverbs field fw_ver which is retrieved when calling ibv_query_device[_ex]() during the port probing. It is tested with command "drvinfo" of examples/ethtool/. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2019-01-14 | config: gather options for dlopen mlx dependency | Thomas Monjalon
Rename options CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS and CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS to a single option CONFIG_RTE_IBVERBS_LINK_DLOPEN. Rename meson option enable_driver_mlx_glue to ibverbs_link. There was no good reason for setting a different link option for mlx4 and mlx5. Having a single common option makes it easier to understand and unify make and meson systems. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-10-26 | ethdev: free all common data when releasing port | Thomas Monjalon
This is a clean-up of common ethdev data freeing. All data freeing are moved to rte_eth_dev_release_port() and done only in case of primary process. It is probably fixing some memory leaks for PMDs which were not freeing all data. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-10-17 | drivers/bus: move driver assignment to end of probing | Thomas Monjalon
The PCI mapping requires to know the PCI driver to use, even before the probing is done. That's why the PCI driver is referenced early inside the PCI device structure. See commit 1d20a073fa5e ("bus/pci: reference driver structure before mapping") However the rte_driver does not need to be referenced in rte_device before the device probing is done. By moving back this assignment at the end of the device probing, it becomes possible to make clear the status of a rte_device. Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Tested-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2018-08-02 | net/mlx4: check RSS queues number limitation | Moti Haimovsky
This patch verifies that the number of Rx queues configured for RSS is supported by the device hardware. RSS support in mlx4 requires contiguous chunk of QPs to be reserved, there is a hardware limitation on the amount of contiguous QPs which is reported by the hardware. Ignoring this value will cause Rx queues creation to fail. Cc: stable@dpdk.org Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
2018-07-10 | net/mlx4: support hardware TSO | Moti Haimovsky
Implement support for hardware TSO. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
2018-07-12 | remove useless constructor headers | Thomas Monjalon
A constructor is usually declared with RTE_INIT* macros. As it is a static function, no need to declare before its definition. The macro is used directly in the function definition. Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2018-06-17 | net/mlx4: fix minor resource leak during init | Adrien Mazarguil
Temporary IB device context and list are not freed in case of a successful initialization of the device. This issue is caused by the two following commits, the first of which causes initialization to return early, while the second one goes a bit overboard while switching to negative errno values; an internal variable (err) is needed to tell success from failure at the end of the function since rte_errno is not reliable enough. Fixes: f2318196c71a ("net/mlx4: remove limitation on number of instances") Fixes: 9d14b27308a0 ("net/mlx4: standardize on negative errno values") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
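The pattern this fix relies on, a local `err` variable plus a single exit path that frees temporaries on success and failure alike, can be sketched as follows. The allocation helpers, the `fail_step` knob and the `live_allocs` counter are hypothetical and exist only to make the example self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

static int live_allocs; /* temporaries that are currently allocated */

static void *tmp_alloc(void)
{
	void *p = malloc(16);

	if (p != NULL)
		live_allocs++;
	return p;
}

static void tmp_free(void *p)
{
	if (p != NULL) {
		free(p);
		live_allocs--;
	}
}

/* fail_step: 0 = succeed, nonzero = fail after acquiring the temporaries. */
static int probe_sketch(int fail_step)
{
	void *list = NULL; /* stands in for the temporary device list */
	void *ctx = NULL;  /* stands in for the temporary device context */
	int err = 0;       /* local status, positive errno value */

	list = tmp_alloc();
	if (list == NULL) {
		err = ENOMEM;
		goto out;
	}
	ctx = tmp_alloc();
	if (ctx == NULL) {
		err = ENOMEM;
		goto out;
	}
	if (fail_step) {
		err = ENODEV;
		goto out;
	}
	/* ... device set-up would go here ... */
out:
	/* Temporaries are released whether or not initialization succeeded. */
	tmp_free(ctx);
	tmp_free(list);
	assert(live_allocs == 0);
	return -err;
}
```

Keeping the status in `err` rather than in a global error variable means the cleanup calls cannot overwrite it before the function returns.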
2018-05-28 | net/mlx4: fix crash when configure is not called | Yongseok Koh
Although uncommon, applications may destroy a device immediately after probing it without going through dev_configure() first. This patch addresses a crash which occurs when mlx4_dev_close() calls mlx4_mr_release() due to an uninitialized entry in the private structure. In addition, MR cache init takes place on the device configuration. When the device is re-configured multiple times, for example when changing the number of queues on the fly, deadlock can happen. This patch moved MR cache init from device configuration function to probe function to make sure init happens only once. Fixes: 9797bfcce1c9 ("net/mlx4: add new memory region support") Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Signed-off-by: Xueming Li <xuemingl@mellanox.com> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
2018-05-14 | ethdev: add probing finish function | Thomas Monjalon
A new hook function is added and called inside the PMDs at the end of the device probing: - in primary process, after allocating, init and config - in secondary process, after attaching and local init This new function is almost empty for now. It will be used later to add some post-initialization processing. For the PMDs calling the helpers rte_eth_dev_create() or rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish() is called from here, and not in the PMD itself. Note that the helper rte_eth_dev_create() could be used more, especially for vdevs, avoiding some code duplication in PMDs. Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
2018-05-14 | net/mlx4: add new memory region support | Yongseok Koh
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed to by mr_ctrl->mru (Most Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized array by linear search. L0/L1 is in an inline function - mlx4_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh() and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in order to access global device cache (priv->mr.cache) which is also a B-tree and caches the original MR list (priv->mr.mr_list) of the device. Unless the global cache is overflowed, it is all-inclusive of the MR list. This is L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and can't be expanded on the fly due to deadlock. Refer to the comments in the code for the details - mr_lookup_dev(). If L3 is overflowed, the list will have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created - mlx4_mr_create(). When it creates a new MR, it tries to register adjacent memsegs as much as possible which are virtually contiguous around the address. This must take two locks - memory_hotplug_lock and priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched from the MR list and corresponding bits are cleared from the bitmap of MRs. This can fragment a MR and the MR will have multiple search entries in the caches. Once there's a change by the event, the global cache must be rebuilt and all the per-queue caches will be flushed as well.

If memory is frequently freed in run-time, that may cause jitter on dataplane processing in the worst case by incurring MR cache flush and rebuild. But, it would be the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use an EAL option - '--socket-mem'. Then, the reserved memory will be pinned and won't be freed dynamically. And it is also recommended to configure per-lcore cache of Mempool. Even though there're many MRs for a device or MRs are highly fragmented, the cache of Mempool will be very helpful to reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
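The first two lookup layers described in this commit (the last-hit check, then a linear scan of a small fixed array) can be sketched as follows. This is an illustration under assumptions, not the driver's code: the struct layouts, `MR_CACHE_N`, `LKEY_INVALID` and the function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MR_CACHE_N 8            /* hypothetical size of the fixed L1 array */
#define LKEY_INVALID UINT32_MAX /* hypothetical marker for a miss */

struct mr_cache_entry {
	uintptr_t start; /* first address covered (inclusive) */
	uintptr_t end;   /* end of the covered range (exclusive) */
	uint32_t lkey;   /* key to hand to the hardware on a hit */
};

struct mr_ctrl {
	uint16_t mru; /* index of the last-hit entry, checked first (L0) */
	uint16_t len; /* number of valid entries in cache[] */
	struct mr_cache_entry cache[MR_CACHE_N];
};

static uint32_t mr_lookup_cache_sketch(struct mr_ctrl *ctrl, uintptr_t addr)
{
	const struct mr_cache_entry *e;
	uint16_t i;

	assert(ctrl != NULL && ctrl->len <= MR_CACHE_N);
	/* L0: try the most recently used entry. */
	if (ctrl->mru < ctrl->len) {
		e = &ctrl->cache[ctrl->mru];
		if (addr >= e->start && addr < e->end)
			return e->lkey;
	}
	/* L1: linear search of the fixed-size array. */
	for (i = 0; i < ctrl->len; i++) {
		e = &ctrl->cache[i];
		if (addr >= e->start && addr < e->end) {
			ctrl->mru = i; /* remember the hit for the next lookup */
			return e->lkey;
		}
	}
	/* Miss: a real datapath would fall through to the next layer here. */
	return LKEY_INVALID;
}
```

Consecutive packets tend to come from the same memory region, which is why checking one remembered index before scanning is worthwhile on the datapath.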
2018-05-02 | net/mlx4: fix inner RSS support for broken kernels | Adrien Mazarguil
Linux 4.15 and 4.16 may report inner RSS as a supported capability of the device, however it can't be used due to missing code in the kernel. This triggers an error when creating the default hash QP and prevents this PMD from starting up without a prior call to rte_flow_isolate(). Fixes: 55e8991e3199 ("net/mlx4: restore inner VXLAN RSS support") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-05-02 | net/mlx4: fix default RSS hash fields | Adrien Mazarguil
Using special types value -1 with mlx4_conv_rss_types() is supposed to return a supported set of Verbs RSS hash fields, that is, priv->hw_rss_sup unmodified. Due to the way this function is written and because it is also used to initially populate priv->hw_rss_sup however, this special value works properly only once and fails with ENOTSUP errors afterward. This problem can be seen when re-creating default flows (e.g. by entering and leaving isolated mode). Fixes: 024e87bef40b ("net/mlx4: restore UDP RSS by probing capabilities") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-05-02 | net/mlx4: fix Rx resource leak in case of error | Adrien Mazarguil
When creation of a flow rule fails during dev_start(), the usage count of the common RSS context is not decremented, which triggers an assertion failure in debug mode during dev_close(). This is addressed by tracking the initialization status of the common RSS context in order to add missing cleanup code. A similar issue exists in mlx4_rxq_attach(), where usage count is incremented on a Rx queue but not released in case of error. This may lead to the above issue since RSS contexts created by flow rules attach themselves to Rx queues, incrementing their usage count. Fixes: 5697a4142107 ("net/mlx4: relax Rx queue configuration order") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
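The second half of this fix, releasing a usage count that was taken before a failing step, can be sketched as follows. The `struct rxq` layout and both function names are hypothetical; only the increment-then-undo shape is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct rxq {
	unsigned int usecnt; /* number of users attached to this queue */
};

/* Hypothetical stand-in for a hardware set-up step that can fail. */
static int rxq_setup_hw(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

static int rxq_attach_sketch(struct rxq *rxq, int should_fail)
{
	int ret;

	assert(rxq != NULL);
	rxq->usecnt++;
	ret = rxq_setup_hw(should_fail);
	if (ret < 0) {
		/* Undo the increment so a later close sees a balanced count. */
		rxq->usecnt--;
		return ret;
	}
	return 0;
}
```

Without the decrement on the error path, the count stays above zero and a debug-mode assertion at close time would fire, which matches the symptom described above.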
2018-04-27 | ethdev: flatten RSS configuration in flow API | Adrien Mazarguil
Since its inception, the rte_flow RSS action has been relying in part on external struct rte_eth_rss_conf for compatibility with the legacy RSS API. This structure lacks parameters such as the hash algorithm to use, and more recently, a method to tell which layer RSS should be performed on [1]. Given struct rte_eth_rss_conf will never be flexible enough to represent a complete RSS configuration (e.g. RETA table), this patch supersedes it by extending the rte_flow RSS action directly. A subsequent patch will add a field to use a non-default RSS hash algorithm. To that end, a field named "types" replaces the field formerly known as "rss_hf" and standing for "RSS hash functions" as it was confusing. Actual RSS hash function types are defined by enum rte_eth_hash_function. This patch updates all PMDs and example applications accordingly. It breaks ABI compatibility for the following public functions: - rte_flow_copy() - rte_flow_create() - rte_flow_query() - rte_flow_validate() [1] commit 676b605182a5 ("doc: announce ethdev API change for RSS configuration") Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
2018-04-14 | net/mlx4: support CRC strip toggling | Ophir Munk
Previous to this commit mlx4 CRC stripping was executed by default and there was no verbs API to disable it. Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-04-11 | align SPDX Mellanox copyrights | Shahaf Shuler
Aligning Mellanox SPDX copyrights to a single format. In addition replace to SPDX licence files which were missed. Signed-off-by: Shahaf Shuler <shahafs@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-03-30 | net/mlx: fix rdma-core glue path with EAL plugins | Adrien Mazarguil
Glue object files are looked up in RTE_EAL_PMD_PATH by default when set and should be installed in this directory. During startup, EAL attempts to load them automatically like other plug-ins found there. While normally harmless, dlopen() fails when rdma-core is not installed, EAL interprets this as a fatal error and terminates the application. This patch requests glue objects to be installed in a different directory to prevent their automatic loading by EAL since they are PMD helpers, not actual DPDK plug-ins. Fixes: f6242d0655cd ("net/mlx: make rdma-core glue path configurable") Cc: stable@dpdk.org Reported-by: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Tested-by: Timothy Redaelli <tredaelli@redhat.com>
2018-02-06 | net/mlx: make rdma-core glue path configurable | Adrien Mazarguil
Since rdma-core glue libraries are intrinsically tied to their respective PMDs and used as internal plug-ins, their presence in the default search path among other system libraries for the dynamic linker is not necessarily desired. This commit enables their installation and subsequent look-up at run time in RTE_EAL_PMD_PATH if configured to a nonempty string. This path can also be overridden by environment variables MLX[45]_GLUE_PATH. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-06 | net/mlx: version rdma-core glue libraries | Adrien Mazarguil
When built as separate objects, these libraries do not have unique names. Since they do not maintain a stable ABI, loading an incompatible library may result in a crash (e.g. in case multiple versions are installed). This patch addresses the above by versioning glue libraries, both on the file system (version suffix) and by comparing a dedicated version field member in glue structures. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
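The run-time half of this scheme, comparing a version field in the glue structure against what the PMD was built for, might look like the sketch below. The struct layout, the version string and the function name are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical version string compiled into the PMD. */
#define GLUE_VERSION_SKETCH "18.02.0"

struct glue_sketch {
	const char *version; /* filled in by the glue library */
	/* function pointers wrapping rdma-core calls would follow */
};

/* Accept the loaded glue library only if its version matches exactly. */
static bool glue_version_matches(const struct glue_sketch *g,
				 const char *expected)
{
	assert(expected != NULL);
	if (g == NULL || g->version == NULL)
		return false;
	return strcmp(g->version, expected) == 0;
}
```

An exact match is used here because the commit states that the glue libraries keep no stable ABI, so "newer" cannot be assumed to mean "compatible".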
2018-02-06 | net/mlx: add debug checks to glue structure | Adrien Mazarguil
This code should catch mistakes early if a glue structure member is added without a corresponding implementation in the library. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-02-01 | net/mlx4: use SPDX tags in 6WIND copyrighted files | Olivier Matz
Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-31 | net/mlx4: spawn rdma-core dependency plug-in | Adrien Mazarguil
When mlx4 is not compiled directly as an independent shared object (e.g. CONFIG_RTE_BUILD_SHARED_LIB not enabled for performance reasons), DPDK applications inherit its dependencies on libibverbs and libmlx4 through rte.app.mk. This is an issue both when DPDK is delivered as a binary package (Linux distributions) and for end users because rdma-core then propagates as a mandatory dependency for everything. Application writers relying on binary DPDK packages are not necessarily aware of this fact and may end up delivering packages with broken dependencies. This patch therefore introduces an intermediate internal plug-in hard-linked with rdma-core (to preserve symbol versioning) loaded by the PMD through dlopen(), so that a missing rdma-core does not cause unresolved symbols, allowing applications to start normally. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-31 | net/mlx4: move rdma-core calls to separate file | Adrien Mazarguil
This lays the groundwork for externalizing rdma-core as an optional run-time dependency instead of a mandatory one. No functional change. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
2018-01-30 | net/mlx4: fix removal detection of stopped port | Moti Haimovsky
In failsafe device start can be called for ports/devices that had been plugged out. The mlx4 PMD detects device removal by listening to the device RMV events, when the mlx4 port is being stopped, the PMD no longer listens to these events causing the PMD to stop detecting device removals. This patch fixes this issue by moving installation of the interrupt handler to device configuration, and toggle only the Rx-queue interrupts on start/stop. Fixes: a6e8b01c3c26 ("net/mlx4: compact interrupt functions") Cc: stable@dpdk.org Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
2018-01-29 | net/mlx4: fix single port configuration | Ophir Munk
The number of mlx4 present ports is calculated as follows: conf.ports.present |= (UINT64_C(1) << device_attr.phys_port_cnt) - 1; That is, an all-ones sequence (due to the -1 subtraction). When retrieving the number of ports, 1 must be added so that the value becomes a power of 2 whose base-2 logarithm is the correct number of ports, as follows: uint32_t ports = rte_log2_u32(conf->ports.present + 1); If 1 was not added, in the case of one port, the number of ports would be falsely calculated as 0. Fixes: 8264279967dc ("net/mlx4: check max number of ports dynamically") Cc: stable@dpdk.org Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
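The arithmetic in this commit message can be checked with a small stand-alone sketch. `ceil_log2_u32()` is a portable stand-in written for this example and is assumed to behave like rte_log2_u32() (smallest n with 2^n >= v); the other function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for rte_log2_u32(): smallest n such that 2^n >= v (0 for v <= 1). */
static uint32_t ceil_log2_u32(uint32_t v)
{
	uint32_t n = 0;

	while (n < 32 && (UINT32_C(1) << n) < v)
		n++;
	return n;
}

/* All-ones mask with one bit per physical port, as in the commit message. */
static uint64_t ports_present_mask(unsigned int phys_port_cnt)
{
	assert(phys_port_cnt < 32);
	return (UINT64_C(1) << phys_port_cnt) - 1;
}

/* Fixed computation: the + 1 turns the all-ones mask into a power of two. */
static uint32_t ports_count(uint64_t present)
{
	return ceil_log2_u32((uint32_t)present + 1);
}

/* Pre-fix computation, kept only to show the single-port failure. */
static uint32_t ports_count_without_fix(uint64_t present)
{
	return ceil_log2_u32((uint32_t)present);
}
```

With one port the mask is 1, so the fixed version computes log2(2) = 1 while the unfixed version computes log2(1) = 0. With two ports the mask is 3 and both versions return 2, because the logarithm rounds up, which is why the bug only showed on single-port devices.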
2018-01-22 | ethdev: separate driver APIs | Ferruh Yigit
Create a rte_ethdev_driver.h file and move PMD specific APIs here. Drivers updated to include this new header file. There is no update in header content and since ethdev.h included by ethdev_driver.h, nothing changed from driver point of view, only logically grouping of APIs. From applications point of view they can't access to driver specific APIs anymore and they shouldn't. More PMD specific data structures still remain in ethdev.h because of inline functions in header use them. Those will be handled separately. Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
2018-01-21 | net/mlx4: support a device removal check operation | Matan Azrad
Add support to get removal status of mlx4 device. Signed-off-by: Matan Azrad <matan@mellanox.com>
2018-01-16 | net/mlx4: verify Tx max sges | Moti Haimovsky
Max number of Tx scatter-gather entries is a property of the device and is queried at init. This value has not changed in a while and most probably will not change in the future. Therefore, and in order to enhance Tx performance, the Tx max-sge value is hardcoded in mlx4 PRM code. This patch adds a verification that the above assumption still holds and that the hardcoded value is still supported by the mlx4 hardware. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 | net/mlx4: revert workaround for broken Verbs | Matan Azrad
This workaround was needed to properly handle device removal with old Mellanox OFED releases that are not supported by this PMD anymore. Starting from rdma-core v16 this removal issue shouldn't happen when setting MLX4_DEVICE_FATAL_CLEANUP environment variable to 1. Set the aforementioned variable to 1. Reverts: 5f4677c6ad5e ("net/mlx4: workaround verbs error after plug-out") Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
2018-01-16 | net/mlx4: restore inner VXLAN RSS support | Adrien Mazarguil
Inner VXLAN RSS was supported and performed by default prior to the entire mlx4 refactoring that occurred in DPDK 17.11, however so far the new Verbs RSS API did not provide means to enable it. This will be addressed in Linux 4.15 and in RDMA core. Thanks to RSS capabilities, the PMD can now probe for its support and enable it again by default. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>
2018-01-16 | net/mlx4: restore UDP RSS by probing capabilities | Adrien Mazarguil
Until now, UDP RSS support could not be relied on due to a problem in the Linux kernel implementation and mlx4 RSS capabilities were not reported at all, hence the PMD had to make assumptions. Since both issues will be addressed simultaneously in Linux 4.15 (related patches already upstream) and likely backported afterward, UDP RSS support can be enabled by probing RSS capabilities. Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Neil Horman <nhorman@tuxdriver.com>
2017-11-07 | net/mlx4: enhance Rx packet type offloads | Moti Haimovsky
This patch enhances the Rx packet type offload to also report the L4 protocol information in the hw ptype filled by the PMD for each received packet. Signed-off-by: Moti Haimovsky <motih@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>