|author||Viacheslav Ovsiienko <email@example.com>||2019-11-08 15:07:50 +0000|
|committer||Ferruh Yigit <firstname.lastname@example.org>||2019-11-11 14:23:02 +0100|
net/mlx5: control transmit doorbell register mapping
The rdma core library can map doorbell register in two ways, depending on the environment variable "MLX5_SHUT_UP_BF":

- as regular cached memory, the variable is either missing or set to zero. This type of mapping may cause the significant doorbell register writing latency and requires explicit memory write barrier to mitigate this issue and prevent write combining.

- as non-cached memory, the variable is present and set to not "0" value. This type of mapping may cause performance impact under heavy loading conditions but the explicit write memory barrier is not required and it may improve core performance.

The new devarg is introduced "tx_db_nc", if this parameter is set to zero, the doorbell register is forced to be mapped to cached memory and requires explicit memory barrier after writing to. If "tx_db_nc" is set to non-zero value the doorbell will be mapped as non-cached memory, not requiring the memory barrier. If "tx_db_nc" is missing the behaviour will be defined by presence of "MLX5_SHUT_UP_BF" in environment. If variable is missed the default value zero will be set for ARM64 hosts and one for others.

In run time the code checks the mapping type and provides the memory barrier after writing to tx doorbell register if it is needed. The mapping type is extracted directly from the uar_mmap_offset field in the queue properties.

Fixes: 18a1c20044c0 ("net/mlx5: implement Tx burst template")
Cc: email@example.com

Signed-off-by: Viacheslav Ovsiienko <firstname.lastname@example.org>
Acked-by: Matan Azrad <email@example.com>
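
The precedence described in the message above (devarg first, then the environment variable, then the per-architecture default) could be sketched roughly as follows. This is only an illustration under stated assumptions: `resolve_db_map`, `db_needs_wmb`, `enum db_map` and `TX_DB_NC_UNSET` are hypothetical names, not identifiers taken from the mlx5 PMD, and the real driver derives the mapping type from the uar_mmap_offset field rather than from a helper like this.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not the mlx5 PMD's identifiers. */
enum db_map { DB_MAP_CACHED = 0, DB_MAP_NON_CACHED = 1 };

#define TX_DB_NC_UNSET (-1) /* assumed marker: devarg not given */

/*
 * Resolve the doorbell mapping from the "tx_db_nc" devarg, the value of
 * the "MLX5_SHUT_UP_BF" environment variable (NULL when it is not set)
 * and the host architecture.
 */
static enum db_map
resolve_db_map(int tx_db_nc, const char *shut_up_bf, bool is_arm64)
{
	/* An explicit devarg overrides everything else. */
	if (tx_db_nc != TX_DB_NC_UNSET)
		return tx_db_nc ? DB_MAP_NON_CACHED : DB_MAP_CACHED;
	/* Without the devarg, a present environment variable decides. */
	if (shut_up_bf != NULL)
		return strcmp(shut_up_bf, "0") != 0 ?
		       DB_MAP_NON_CACHED : DB_MAP_CACHED;
	/* Neither is present: zero (cached) on ARM64, one elsewhere. */
	return is_arm64 ? DB_MAP_CACHED : DB_MAP_NON_CACHED;
}

/* Only the cached mapping needs a write barrier after the doorbell write. */
static bool
db_needs_wmb(enum db_map map)
{
	return map == DB_MAP_CACHED;
}
```

A caller might pass `getenv("MLX5_SHUT_UP_BF")` as the second argument and consult `db_needs_wmb()` once per queue, so that the per-burst path only tests a precomputed flag.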
Diffstat (limited to 'doc/guides/nics')
1 files changed, 23 insertions, 0 deletions
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 3651e82..5fd313c 100644
@@ -552,6 +552,29 @@ Run-time configuration
Also, if minimal data inlining is requested by non-zero ``txq_inline_min``
option or reported by the NIC, the eMPW feature is disengaged.
+- ``tx_db_nc`` parameter [int]
+ The rdma core library can map the doorbell register in two ways, depending on
+ the environment variable "MLX5_SHUT_UP_BF":
+ - As regular cached memory, if the variable is either missing or set to zero.
+ - As non-cached memory, if the variable is present and set to a non-zero value.
+ The type of mapping may slightly affect Tx performance. The optimal choice
+ depends strongly on the host architecture and should be determined empirically.
+ If ``tx_db_nc`` is set to zero, the doorbell is forced to be mapped to
+ regular memory and the PMD performs an extra write memory barrier after
+ writing to the doorbell. This might increase the CPU clocks needed per packet
+ sent, but latency might be improved.
+ If ``tx_db_nc`` is set to a non-zero value, the doorbell is forced to be mapped
+ to non-cached memory and the PMD does not perform the extra write memory
+ barrier after writing to the doorbell. On some architectures this might
+ improve performance.
+ If ``tx_db_nc`` is omitted, the mapping follows "MLX5_SHUT_UP_BF" when that
+ variable is present in the environment. Otherwise the default ``tx_db_nc``
+ value is zero for ARM64 hosts and one for others.
- ``tx_vec_en`` parameter [int]
A nonzero value enables Tx vector on ConnectX-5, ConnectX-6, ConnectX-6 DX