author: Yongseok Koh <yskoh@mellanox.com> 2017-10-24 17:27:25 -0700
committer: Ferruh Yigit <ferruh.yigit@intel.com> 2017-10-26 02:33:01 +0200
commit: fb870be5a879c6617fecabf47873ae2b576e6e69 (patch)
tree: abd79339ea355b7eeed46287da4090dff6baffa5 /doc/guides/nics/mlx5.rst
parent: 2262eed7523b305e59e8172a141b75e385b43cf0 (diff)
net/mlx5: fix Tx doorbell memory barrier
Configuring UAR as IO-mapped makes maximum throughput decline by noticeable
amount. If UAR is configured as write-combining register, a write memory
barrier is needed on ringing a doorbell. rte_wmb() is mostly effective when
the size of a burst is comparatively small. Revert the register back to
write-combining and enforce a write memory barrier instead, except for
vectorized Tx burst routines. Application can change it by setting
MLX5_SHUT_UP_BF under its own necessity.

Fixes: 9f9bebae5530 ("net/mlx5: don't map doorbell register to write combining")

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
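
As an illustration of the last point in the commit message, an application
could set the variable from its own code instead of from the shell. The
following is only a minimal sketch: the helper name
app_request_io_mapped_doorbell() is hypothetical and not part of DPDK, the
value "1" is an assumption (neither the commit message nor the patch states
which values are accepted), and it is also assumed that the variable has to be
in the environment before the mlx5 device is probed.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/*
 * Hypothetical application-side helper, not a DPDK API.
 * enable != 0: export MLX5_SHUT_UP_BF=1 (value "1" is an assumption) to ask
 *              for an IO-mapped Tx doorbell register.
 * enable == 0: remove the variable so the default write-combining mapping
 *              described in the commit message applies.
 * Returns 0 on success and -1 on failure, as setenv()/unsetenv() do.
 */
int
app_request_io_mapped_doorbell(int enable)
{
	if (enable)
		return setenv("MLX5_SHUT_UP_BF", "1", 1);
	return unsetenv("MLX5_SHUT_UP_BF");
}
```

Under the assumption above, such a helper would presumably be called early in
main(), before rte_eal_init(), so that the setting is already in place when
the device is initialized.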
Diffstat (limited to 'doc/guides/nics/mlx5.rst')
-rw-r--r-- doc/guides/nics/mlx5.rst 17
1 files changed, 17 insertions, 0 deletions
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bcc39f6..cdb880a 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -173,6 +173,23 @@ Environment variables
This is disabled by default since this can also decrease performance for
unaligned packet sizes.
+- ``MLX5_SHUT_UP_BF``
+
+ Configures HW Tx doorbell register as IO-mapped.
+
+ By default, the HW Tx doorbell is configured as a write-combining register.
+ The register would be flushed to HW usually when the write-combining buffer
+ becomes full, but it depends on CPU design.
+
+ Except for vectorized Tx burst routines, a write memory barrier is enforced
+ after updating the register so that the update can be immediately visible to
+ HW.
+
+ When vectorized Tx burst is called, the barrier is set only if the burst size
+ is not aligned to MLX5_VPMD_TX_MAX_BURST. However, setting this environmental
+ variable will bring better latency even though the maximum throughput can
+ slightly decline.
+
Run-time configuration
~~~~~~~~~~~~~~~~~~~~~~