path: root/drivers/net/mlx5/mlx5_defs.h
Age | Commit message | Author
2018-11-05 | net/mlx5: make vectorized Tx threshold configurable | Yongseok Koh
Add the txqs_max_vec parameter to configure the maximum number of Tx queues for which vectorized Tx is enabled. Its default value is set according to the architecture and device type. Signed-off-by: Yongseok Koh <> Acked-by: Shahaf Shuler <>
2018-08-05 | net/mlx5: fix minimum number of Multi-Packet RQ buffers | Yongseok Koh
If MPRQ is enabled, a PMD-private mempool is allocated. For ConnectX-4 Lx, the minimum number of strides is 512, while ConnectX-5 supports 8. This results in a quite small number of elements for the MPRQ mempool. For example, if the size of the Rx ring is configured as 512, only one MPRQ buffer can cover the whole ring. If only one Rx queue is configured, then in the following code in mlx5_mprq_alloc_mp(), desc is 1 and obj_num will be 36 as a result.

    desc *= 4;
    obj_num = desc + MLX5_MPRQ_MP_CACHE_SZ * priv->rxqs_n;

However, rte_mempool_create_empty() has a sanity check that refuses a per-lcore cache size that is large compared to the number of elements. The cache flush threshold should not exceed the number of elements of a mempool. For the above example, the threshold is 32 * 1.5 = 48, which is larger than 36, so creating the mempool fails. Fixes: 7d6bf6b866b8 ("net/mlx5: add Multi-Packet Rx support") Cc: Signed-off-by: Yongseok Koh <>
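The arithmetic in this commit message can be checked with a small standalone sketch. It does not call DPDK; the function names are hypothetical, and the cache size of 32 and the 1.5x flush-threshold factor are taken from the example in the message rather than from the library headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed from the commit message: per-lcore cache size of 32. */
#define SKETCH_MPRQ_MP_CACHE_SZ 32u

/* Flush threshold = cache_size * 1.5, computed in integer arithmetic. */
static unsigned int cache_flush_thresh(unsigned int cache_size)
{
	return cache_size + cache_size / 2;
}

/* Element count as computed by the snippet quoted in the message. */
static unsigned int mprq_obj_num(unsigned int desc, unsigned int rxqs_n)
{
	assert(rxqs_n > 0);
	desc *= 4;
	return desc + SKETCH_MPRQ_MP_CACHE_SZ * rxqs_n;
}

/* True when the flush threshold does not exceed the element count,
 * i.e. when the sanity check described above would pass. */
static bool mempool_cache_fits(unsigned int obj_num, unsigned int cache_size)
{
	return cache_flush_thresh(cache_size) <= obj_num;
}
```

With desc = 1 and one Rx queue, mprq_obj_num() yields 36 while the threshold is 48, so mempool_cache_fits(36, 32) is false, which matches the failure described.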
2018-07-26 | net/mlx5: fix build with old kernels | Moti Haimovsky
This commit fixes compilation errors due to missing definitions found when compiling mlx5 PMD from DPDK 17.11-LTS on Ubuntu 12.4 with kernel 3.15. Fixes: 75ef62a94301 ("net/mlx5: fix link speed capability information") Fixes: 5bfc9fc112dd ("net/mlx5: use static assert for compile-time sanity checks") Cc: Signed-off-by: Moti Haimovsky <> Acked-by: Shahaf Shuler <>
2018-07-12 | net/mlx5: support 32-bit systems | Moti Haimovsky
This patch adds support for building and running mlx5 PMD on 32bit systems such as i686. The main issue to tackle was handling the 32bit access to the UAR as quoted from the mlx5 PRM:

QP and CQ DoorBells require 64-bit writes. For best performance, it is recommended to execute the QP/CQ DoorBell as a single 64-bit write operation. For platforms that do not support 64 bit writes, it is possible to issue the 64 bits DoorBells through two consecutive writes, each write 32 bits, as described below:
* The order of writing each of the Dwords is from lower to upper addresses.
* No other DoorBell can be rung (or even start ringing) in the midst of an on-going write of a DoorBell over a given UAR page.
The last rule implies that in a multi-threaded environment, the access to a UAR page (which can be accessible by all threads in the process) must be synchronized (for example, using a semaphore) unless an atomic write of 64 bits in a single bus operation is guaranteed. Such a synchronization is not required for when ringing DoorBells on different UAR pages.

Signed-off-by: Moti Haimovsky <> Acked-by: Yongseok Koh <>
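The two PRM rules quoted above can be illustrated with a minimal sketch: split the 64-bit doorbell value into two 32-bit words, write them from the lower to the upper address, and hold a per-UAR-page lock for the duration. All names here are hypothetical, the spinlock stands in for whatever synchronization a driver actually uses, and the C11 fence is only a placeholder for a real I/O write barrier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical lock, one per UAR page. Doorbells on different pages
 * need no mutual synchronization, so they would use different locks. */
struct uar_page_lock {
	atomic_flag busy;
};

static void uar_lock(struct uar_page_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->busy,
						 memory_order_acquire))
		; /* spin until the page is free */
}

static void uar_unlock(struct uar_page_lock *l)
{
	atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Write a 64-bit doorbell as two 32-bit stores, lower address first. */
static void doorbell_write64_as_2x32(struct uar_page_lock *l,
				     volatile uint32_t *dst, uint64_t val)
{
	uint32_t dw[2];

	/* Split by memory layout so the result is endianness-neutral. */
	memcpy(dw, &val, sizeof(dw));
	uar_lock(l);
	dst[0] = dw[0];
	/* Placeholder: real hardware access needs an I/O write barrier. */
	atomic_thread_fence(memory_order_release);
	dst[1] = dw[1];
	uar_unlock(l);
}
```

Keeping one lock per UAR page, rather than one global lock, follows the last sentence of the quote: queues that ring doorbells on different pages never contend.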
2018-07-03 | net/mlx5: increase number of strides | Yongseok Koh
If WQE ID is used in CQE for Multi-Packet RQ, the ratio of CQE compression drops a little bit. To reach 100Gbps with 64B traffic, PCIe bandwidth must be saved further by increasing the number of strides in a WQE. It is now 64 by default but adjustable by a PMD parameter - mprq_log_stride_num. Signed-off-by: Yongseok Koh <> Acked-by: Shahaf Shuler <>
2018-07-03 | net/mlx5: fix Rx buffer replenishment threshold | Yongseok Koh
The threshold of buffer replenishment for vectorized Rx burst is a constant value (64). If the size of the Rx queue is comparatively small, the device could run out of buffers. For example, if the size of the Rx queue is 128, buffers are replenished only twice per wraparound. This can cause jitter in receiving packets, and the jitter can cause unnecessary retransmission for TCP connections. Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86") Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM") Cc: Signed-off-by: Yongseok Koh <> Acked-by: Shahaf Shuler <>
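The effect of a constant threshold on a small ring can be worked through in a few lines. The message does not say how the fix scales the threshold, so scaled_thresh() below is purely an assumption (a quarter of the ring, capped at the old constant) used to show the idea; the names are hypothetical.

```c
#include <assert.h>

/* The constant threshold named in the commit message. */
#define SKETCH_RX_MAX_BURST 64u

/* Number of replenishments during one pass over the ring. */
static unsigned int replenish_per_wrap(unsigned int ring_size,
				       unsigned int thresh)
{
	assert(thresh > 0);
	return ring_size / thresh;
}

/* Assumed size-relative threshold: a quarter of the ring, never more
 * than the old constant and never zero. */
static unsigned int scaled_thresh(unsigned int ring_size)
{
	unsigned int quarter = ring_size / 4;

	if (quarter == 0)
		quarter = 1;
	return quarter < SKETCH_RX_MAX_BURST ? quarter : SKETCH_RX_MAX_BURST;
}
```

For a 128-entry ring the constant threshold gives two replenishments per wraparound, as stated above; under the assumed scaling the threshold becomes 32 and the ring is refilled four times per wraparound.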
2018-05-14 | net/mlx5: add Multi-Packet Rx support | Yongseok Koh
Multi-Packet Rx Queue (MPRQ, a.k.a. Striding RQ) can further save PCIe bandwidth by posting a single large buffer for multiple packets. Instead of posting a buffer per packet, one large buffer is posted in order to receive multiple packets on the buffer. An MPRQ buffer consists of multiple fixed-size strides and each stride receives one packet. The Rx packet is mem-copied to a user-provided mbuf if the size of the Rx packet is comparatively small, or the PMD attaches the Rx packet to the mbuf by external buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external buffers will be allocated and managed by the PMD. Signed-off-by: Yongseok Koh <> Acked-by: Shahaf Shuler <>
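The copy-or-attach decision and the stride layout described above can be sketched as follows. This is an illustration only: the cutoff parameter max_memcpy_len and both function names are hypothetical, and the sketch does not touch mbufs or call rte_pktmbuf_attach_extbuf().

```c
#include <assert.h>
#include <stddef.h>

enum rx_delivery {
	RX_MEMCPY,        /* copy the packet into the user-provided mbuf */
	RX_ATTACH_EXTBUF  /* attach the stride as an external buffer */
};

/* Small packets are copied, larger ones are attached. The cutoff is a
 * hypothetical parameter, not a value taken from the driver. */
static enum rx_delivery mprq_choose_delivery(size_t pkt_len,
					     size_t max_memcpy_len)
{
	return pkt_len <= max_memcpy_len ? RX_MEMCPY : RX_ATTACH_EXTBUF;
}

/* Byte offset of stride idx inside a buffer of fixed-size strides,
 * with the stride size given as a power-of-two exponent. */
static size_t mprq_stride_offset(unsigned int idx,
				 unsigned int log_stride_size)
{
	assert(log_stride_size < 32);
	return (size_t)idx << log_stride_size;
}
```

Because every stride has the same size, locating a packet is a single shift, which is part of why one large posted buffer is cheaper than one buffer per packet.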
2018-05-14 | net/mlx5: add new memory region support | Yongseok Koh
This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search:
- L0 looks up the last-hit entry, which is pointed to by mr_ctrl->mru (Most Recently Used).
- L1: if L0 misses, the address is looked up in a fixed-size array by linear search. L0/L1 is in an inline function - mlx5_mr_lookup_cache().
- L2: if L1 misses, the bottom-half function is called to look up the address from the bigger local cache of the queue. This is mlx5_mr_addr2mr_bh() and it is not an inline function. The data structure for L2 is the Binary Tree.
- L3: if L2 misses, the search falls into the slowest path, which takes locks in order to access the global device cache (priv->mr.cache), which is also a B-tree and caches the original MR list (priv->mr.mr_list) of the device. Unless the global cache is overflowed, it is all-inclusive of the MR list. This is mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and can't be expanded on the fly due to deadlock. Refer to the comments in the code for the details - mr_lookup_dev(). If L3 is overflowed, the list will have to be searched directly, bypassing the cache, although it is slower.

If L3 misses, a new MR for the address should be created - mlx5_mr_create(). When it creates a new MR, it tries to register as many adjacent memsegs as possible which are virtually contiguous around the address. This must take two locks - memory_hotplug_lock and priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched from the MR list and corresponding bits are cleared from the bitmap of MRs. This can fragment an MR and the MR will have multiple search entries in the caches. Once there's a change by the event, the global cache must be rebuilt and all the per-queue caches will be flushed as well.

If memory is frequently freed at run-time, that may cause jitter on dataplane processing in the worst case by incurring MR cache flush and rebuild. But it would be the least probable scenario.

To guarantee the best performance, it is highly recommended to use an EAL option - '--socket-mem'. Then, the reserved memory will be pinned and won't be freed dynamically. It is also recommended to configure the per-lcore cache of Mempool. Even though there are many MRs for a device or MRs are highly fragmented, the cache of Mempool will be very helpful in reducing misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <>
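The first two lookup layers described in this commit (last-hit entry, then a linear scan of a small fixed array) can be sketched in isolation. The structures, the array size and the miss value below are hypothetical and much simpler than the driver's; on a miss the caller would continue to the slower L2/L3 paths, which are not modeled here.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MR_CACHE_N 8          /* hypothetical size of the L1 array */
#define SKETCH_LKEY_MISS UINT32_MAX  /* hypothetical "not found" value */

struct mr_entry {
	uintptr_t start; /* inclusive */
	uintptr_t end;   /* exclusive */
	uint32_t lkey;
};

struct mr_ctrl {
	unsigned int mru; /* index of the last-hit entry (L0) */
	unsigned int len; /* number of valid entries in cache[] */
	struct mr_entry cache[SKETCH_MR_CACHE_N];
};

static int mr_hit(const struct mr_entry *e, uintptr_t addr)
{
	return addr >= e->start && addr < e->end;
}

/* L0: check the most recently used entry. L1: linear search. */
static uint32_t mr_lookup_cache(struct mr_ctrl *c, uintptr_t addr)
{
	unsigned int i;

	assert(c->len <= SKETCH_MR_CACHE_N);
	if (c->len > 0 && mr_hit(&c->cache[c->mru], addr))
		return c->cache[c->mru].lkey;
	for (i = 0; i < c->len; i++) {
		if (mr_hit(&c->cache[i], addr)) {
			c->mru = i; /* remember the hit for the next L0 check */
			return c->cache[i].lkey;
		}
	}
	return SKETCH_LKEY_MISS; /* caller falls through to L2/L3 */
}
```

Consecutive packets usually come from the same mempool chunk, so the single comparison at L0 resolves most lookups and the linear scan stays off the common path.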
2018-05-14 | net/mlx5: remove memory region support | Yongseok Koh
This patch removes current support of Memory Region (MR) in order to accommodate the dynamic memory hotplug patch. This patch can be compiled but traffic can't flow and HW will raise faults. Subsequent patches will add new MR support. Signed-off-by: Yongseok Koh <>
2018-04-27 | net/mlx5: allow max 192B TSO inline header length | Xueming Li
Change the max inline header length to 192B to allow IPv6 VXLAN TSO headers and headers with options that exceed 128B. Signed-off-by: Xueming Li <> Acked-by: Yongseok Koh <>
2018-04-27 | net/mlx5: implement multicast add list devop | Nélio Laranjeiro
Signed-off-by: Nelio Laranjeiro <>
2018-04-27 | net/mlx5: split MAC address add/remove code | Nélio Laranjeiro
Move some code in the DPDK callbacks that add/remove MAC addresses to internal functions. This modification will be necessary to implement the set_mc_addr_list devop. Signed-off-by: Nelio Laranjeiro <>
2018-04-11 | align SPDX Mellanox copyrights | Shahaf Shuler
Align Mellanox SPDX copyrights to a single format. In addition, switch the license files which were missed to SPDX. Signed-off-by: Shahaf Shuler <> Acked-by: Adrien Mazarguil <>
2018-03-30 | net/mlx5: fix link status to use wait to complete | Nélio Laranjeiro
Wait to complete is present to let the application get a correct status when it requires it; it should not be ignored. Fixes: e313ef4c2fe8 ("net/mlx5: fix link state on device start") Fixes: cb8faed7dde8 ("mlx5: support link status update") Cc: Signed-off-by: Nelio Laranjeiro <> Acked-by: Adrien Mazarguil <>
2018-02-01 | net/mlx5: use SPDX tags in 6WIND copyrighted files | Olivier Matz
Signed-off-by: Olivier Matz <> Acked-by: Nelio Laranjeiro <> Acked-by: Bruce Richardson <> Acked-by: Thomas Monjalon <>
2018-01-29 | net/mlx5: map UAR address around huge pages | Xueming Li
Reserving the memory space for the UAR near huge pages helps to **reduce** the cases where the secondary process cannot start. Those pages being physical pages they must be mapped at the same virtual address as in the primary process to have a working secondary process. As this remap is almost the latest being done by the processes (libraries, heaps, stacks are already loaded), similar to huge pages, there is **no guarantee** this mechanism will always work. Signed-off-by: Xueming Li <> Acked-by: Nelio Laranjeiro <>
2018-01-29 | net/mlx5: fix link state on device start | Shahaf Shuler
Following commit c7bf62255edf ("net/mlx5: fix handling link status event"), the link state must be up in order for the burst function to be set on the device ops. As the link may take time to move between the down and up states, it is possible that the rte_eth_dev_start call will return with the wrong burst function (either null or the empty burst function). Fix it by forcing the link to be up before returning from device start. If the link is still not up after 5 seconds, fail the function. In addition, initialize the burst function on device probe to prevent crashes before the link is up. Fixes: c7bf62255edf ("net/mlx5: fix handling link status event") Cc: Signed-off-by: Shahaf Shuler <> Acked-by: Nelio Laranjeiro <>
2018-01-22 | ethdev: separate driver APIs | Ferruh Yigit
Create an rte_ethdev_driver.h file and move PMD-specific APIs there. Drivers are updated to include this new header file. There is no update in header content, and since ethdev.h is included by ethdev_driver.h, nothing changes from the driver point of view; this is only a logical grouping of APIs. From the application point of view, driver-specific APIs can no longer be accessed, and they shouldn't be. More PMD-specific data structures still remain in ethdev.h because inline functions in the header use them. Those will be handled separately. Signed-off-by: Ferruh Yigit <> Acked-by: Shreyansh Jain <> Acked-by: Andrew Rybchenko <> Acked-by: Thomas Monjalon <>
2018-01-16 | net/mlx5: fix un-supported RSS hash fields use | Nélio Laranjeiro
The MLX5 NIC does not support all hash fields; this patch refuses unsupported RSS combinations to avoid errors. Fixes: 2f97422e7759 ("mlx5: support RSS hash update and get") Cc: Signed-off-by: Nelio Laranjeiro <> Acked-by: Yongseok Koh <>
2017-10-12 | net/mlx5: use flow to enable unicast traffic | Nélio Laranjeiro
RSS hash configuration is currently ignored by the PMD; this commit removes the RSS feature. This functionality will be added in a later commit. Signed-off-by: Nelio Laranjeiro <> Acked-by: Yongseok Koh <>
2017-10-06 | net/mlx5: enforce Tx num of segments limitation | Shahaf Shuler
Mellanox NICs have a limitation on the number of mbuf segments a multi-segment mbuf can have. The max number depends on the Tx offloads requested. The current code does not enforce this limitation, which might cause malformed work requests to be written to the device. This commit adds verification of the number of mbuf segments posted to the device. In case of overflow, the packet will not be sent. In addition, update the NIC documentation with the limitation. Considering that the device limitation is 63 data segments in a work request, the maximum number of segments in an mbuf was calculated taking TSO as the worst case:

    max_nb_segs = 63 - (control_segment + ethernet segment + TSO headers inline + inline segment + extra inline to align to cacheline)

Cc: Signed-off-by: Shahaf Shuler <> Acked-by: Yongseok Koh <> Acked-by: Nelio Laranjeiro <>
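The formula above can be expressed as a small helper. Only the limit of 63 data segments comes from the commit message; the structure, the field names and any per-term values a caller passes in are hypothetical, since the message does not give the size of each term.

```c
#include <assert.h>

/* Device limit stated in the commit message. */
#define SKETCH_MAX_DATA_SEGS 63u

/* One field per term of the formula, each counted in data segments.
 * The field names are illustrative, not the driver's. */
struct wqe_overhead {
	unsigned int ctrl;           /* control segment */
	unsigned int eth;            /* ethernet segment */
	unsigned int tso_hdr_inline; /* TSO headers inline */
	unsigned int inline_seg;     /* inline segment */
	unsigned int align_pad;      /* extra inline for cacheline alignment */
};

/* max_nb_segs = 63 - (sum of the overhead terms). */
static unsigned int max_mbuf_segs(const struct wqe_overhead *o)
{
	unsigned int used = o->ctrl + o->eth + o->tso_hdr_inline +
			    o->inline_seg + o->align_pad;

	assert(used <= SKETCH_MAX_DATA_SEGS);
	return SKETCH_MAX_DATA_SEGS - used;
}
```

A Tx burst routine would compare the segment count of each mbuf chain against this value and skip the packet on overflow, as the message describes.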
2017-07-07 | net/mlx5: add vectorized Rx/Tx burst for x86 | Yongseok Koh
Enabling the vectorized burst routines requires running on the x86_64 architecture. If all the conditions are met, the vectorized burst functions are enabled automatically. The decision is made individually for Rx and Tx. There's no PMD option to make a selection. Signed-off-by: Yongseok Koh <> Acked-by: Nelio Laranjeiro <>
2017-04-04 | net/mlx5: add enhanced multi-packet send for ConnectX-5 | Yongseok Koh
ConnectX-5 supports an enhanced version of multi-packet send (MPS). An MPS Tx descriptor can carry multiple packets either by including pointers to packets or by inlining packets. Inlining packet data can be helpful to better utilize PCIe bandwidth. In addition, Enhanced MPS supports hybrid mode - mixing inlined packets and pointers in a descriptor. This feature is enabled by default if supported by HW. Signed-off-by: Yongseok Koh <>
2017-04-04 | net/mlx5: support hardware TSO | Shahaf Shuler
Implement support for hardware TSO. Signed-off-by: Shahaf Shuler <> Acked-by: Nelio Laranjeiro <>
2017-01-30 | net/mlx5: increase RSS indirection table size limit | Yongseok Koh
The size of the Rx RSS indirection table was limited to 256, but this is no longer required for all Mellanox NICs. However, librte_ether still limits the size to 512. Signed-off-by: Yongseok Koh <> Acked-by: Adrien Mazarguil <>
2017-01-17 | net/mlx5: support extended statistics | Shahaf Shuler
Implement extended statistics callbacks. Suggested-by: Hanoch Haim <> Signed-off-by: Shahaf Shuler <> Signed-off-by: Elad Persiko <> Acked-by: Adrien Mazarguil <>
2016-10-13 | net/mlx: align drivers to latest naming convention | David Marchand
Fixes: 2f45703c17ac ("drivers: make driver names consistent") Signed-off-by: David Marchand <> Acked-by: Adrien Mazarguil <>
2016-06-27 | net/mlx5: replace countdown with threshold for Tx completions | Adrien Mazarguil
Replacing the variable countdown (which depends on the number of descriptors) with a fixed relative threshold known at compile time improves performance by reducing the TX queue structure footprint and the amount of code to manage completions during a burst. Completions are now requested at most once per burst after threshold is reached. Signed-off-by: Adrien Mazarguil <> Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Vasily Philipov <>
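The threshold scheme described above can be sketched as a counter that accumulates across bursts and triggers at most one completion request per burst. The threshold value of 32, the structure and the function name are hypothetical; the message only says the threshold is fixed and known at compile time.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical compile-time threshold. */
#define SKETCH_TX_COMP_THRESH 32u

struct txq {
	/* Packets sent since the last completion request. */
	unsigned int elts_comp;
};

/* Account for a burst of n packets. Returns true when this burst
 * should request a completion; the check runs once per burst, so at
 * most one completion is requested per burst. */
static bool tx_burst_needs_completion(struct txq *q, unsigned int n)
{
	q->elts_comp += n;
	if (q->elts_comp >= SKETCH_TX_COMP_THRESH) {
		q->elts_comp = 0;
		return true;
	}
	return false;
}
```

Compared with a countdown that depends on the number of descriptors, the queue only needs a single counter and one comparison against a constant.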
2016-06-27 | net/mlx5: update prerequisites for upcoming enhancements | Nélio Laranjeiro
The latest version of Mellanox OFED exposes hardware definitions necessary to implement data path operation bypassing Verbs. Update the minimum version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks for previous releases. Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Adrien Mazarguil <>
2016-06-27 | net/mlx5: remove inline Tx support | Nélio Laranjeiro
Inline TX will be fully managed by the PMD after Verbs is bypassed in the data path. Remove the current code until then. Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Adrien Mazarguil <>
2016-06-27 | net/mlx5: remove configuration variable | Nélio Laranjeiro
There is no scatter/gather support anymore, CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N has no purpose and can be removed. Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Adrien Mazarguil <>
2016-03-31 | mlx5: fix RETA table size | Yaacov Hazan
When the number of RX queues is not a power of two, the RETA table is configured to its maximum size for better balancing. Testing showed that limiting its size to 256 improves performance noticeably with little to no impact on balancing results. Fixes: ebb30ec64a68 ("mlx5: increase RETA table size") Signed-off-by: Yaacov Hazan <> Acked-by: Adrien Mazarguil <>
2016-03-16 | mlx5: support flow director | Yaacov Hazan
Add support for flow director filters (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN modes). This feature requires MLNX_OFED >= 3.2. Signed-off-by: Yaacov Hazan <> Signed-off-by: Adrien Mazarguil <> Signed-off-by: Raslan Darawsheh <>
2016-03-16 | mlx5: add special flows for broadcast and IPv6 multicast | Yaacov Hazan
Until now, broadcast frames were handled like unicast. Moving the related flow to the special flows table frees up the related unicast MAC entry. The same method is used to handle IPv6 multicast frames. Signed-off-by: Yaacov Hazan <> Signed-off-by: Adrien Mazarguil <>
2016-03-16 | mlx5: refactor special flows handling | Yaacov Hazan
Merge redundant code by adding a static initialization table to manage promiscuous and allmulticast (special) flows. New function priv_rehash_flows() implements the logic to enable/disable relevant flows in one place from any context. Signed-off-by: Yaacov Hazan <> Signed-off-by: Adrien Mazarguil <>
2016-03-03 | mlx5: increase RETA table size | Nelio Laranjeiro
ConnectX-4 NICs can handle at most 512 entries in the RETA table. Signed-off-by: Nelio Laranjeiro <> Acked-by: Adrien Mazarguil <>
2015-11-01 | mlx5: handle link status interrupts | Nelio Laranjeiro
Add an interrupt handler for port status notification. Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Adrien Mazarguil <>
2015-10-31 | mlx5: adapt indirection table size depending on Rx queues number | Nelio Laranjeiro
Use the maximum size of the indirection table when the number of requested RX queues is not a power of two; this helps improve RSS balancing. A message informs users that balancing is not optimal in such cases. Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Adrien Mazarguil <>
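The sizing rule in this message can be written as a short helper. The function names are hypothetical and the maximum table size is passed in as a parameter, because the log shows that limit changing across commits.

```c
#include <assert.h>
#include <stdbool.h>

static bool is_pow2(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* A power-of-two queue count divides a power-of-two table evenly, so
 * a table of rxqs_n entries is enough. Otherwise use the largest table
 * so the leftover entries are few relative to the total. */
static unsigned int ind_table_size(unsigned int rxqs_n,
				   unsigned int max_size)
{
	assert(rxqs_n > 0 && rxqs_n <= max_size);
	return is_pow2(rxqs_n) ? rxqs_n : max_size;
}
```

For example, with 6 queues and a 512-entry table, two queues receive one entry more than the others, which is why the message describes balancing in such cases as not optimal rather than broken.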
2015-10-30 | mlx5: support VLAN filtering | Adrien Mazarguil
All MAC RX flows must be updated with VLAN information when configuring a VLAN filter. Signed-off-by: Adrien Mazarguil <> Signed-off-by: Nelio Laranjeiro <>
2015-10-30 | mlx5: add software counters | Adrien Mazarguil
Hardware counters are not supported yet. Signed-off-by: Adrien Mazarguil <> Signed-off-by: Nelio Laranjeiro <>
2015-10-30 | mlx5: support non-scattered Tx and Rx | Adrien Mazarguil
RSS implementation with parent/child QPs comes from mlx4 and is temporary. Signed-off-by: Adrien Mazarguil <> Signed-off-by: Nelio Laranjeiro <>
2015-10-30 | mlx5: introduce new driver for Mellanox ConnectX-4 adapters | Adrien Mazarguil
In its current state, this driver implements the bare minimum to initialize itself and Mellanox ConnectX-4 adapters without doing anything else (no RX/TX for instance). It is disabled by default since it is based on the mlx4 driver and also depends on libibverbs. Signed-off-by: Adrien Mazarguil <> Signed-off-by: Nelio Laranjeiro <> Signed-off-by: Or Ami <>