author Yongseok Koh <> 2017-10-24 17:27:25 -0700
committer Ferruh Yigit <> 2017-10-26 02:33:01 +0200
commit fb870be5a879c6617fecabf47873ae2b576e6e69 (patch)
tree abd79339ea355b7eeed46287da4090dff6baffa5 /doc/guides/nics/mlx5.rst
parent 2262eed7523b305e59e8172a141b75e385b43cf0 (diff)
net/mlx5: fix Tx doorbell memory barrier
Configuring UAR as IO-mapped makes maximum throughput decline by a
noticeable amount. If UAR is configured as a write-combining register, a
write memory barrier is needed when ringing a doorbell. rte_wmb() is mostly
effective when the size of a burst is comparatively small. Revert the
register back to write-combining and enforce a write memory barrier
instead, except for vectorized Tx burst routines. An application can change
this by setting MLX5_SHUT_UP_BF as needed.

Fixes: 9f9bebae5530 ("net/mlx5: don't map doorbell register to write combining")

Signed-off-by: Yongseok Koh <>
Acked-by: Shahaf Shuler <>
Acked-by: Nelio Laranjeiro <>
Diffstat (limited to 'doc/guides/nics/mlx5.rst')
1 files changed, 17 insertions, 0 deletions
diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index bcc39f6..cdb880a 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -173,6 +173,23 @@ Environment variables
   This is disabled by default since this can also decrease performance for
   unaligned packet sizes.
 
+- ``MLX5_SHUT_UP_BF``
+
+  Configures HW Tx doorbell register as IO-mapped.
+
+  By default, the HW Tx doorbell is configured as a write-combining register.
+  The register would be flushed to HW usually when the write-combining buffer
+  becomes full, but it depends on CPU design.
+
+  Except for vectorized Tx burst routines, a write memory barrier is enforced
+  after updating the register so that the update can be immediately visible to
+  HW.
+
+  When vectorized Tx burst is called, the barrier is set only if the burst size
+  is not aligned to MLX5_VPMD_TX_MAX_BURST. However, setting this environmental
+  variable will bring better latency even though the maximum throughput can
+  slightly decline.
+
 Run-time configuration
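
The barrier rule described in the added MLX5_SHUT_UP_BF text (default write-combining configuration) can be sketched in C as below. This is only an illustration, not the driver's code: the helper name sketch_tx_needs_wmb and the constant SKETCH_VPMD_TX_MAX_BURST are hypothetical, the value 32 is a placeholder, and "not aligned to MLX5_VPMD_TX_MAX_BURST" is interpreted here as "not a multiple of" it, which is an assumption about the wording. The sketch does not model the IO-mapped case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Placeholder value for illustration only; the real
 * MLX5_VPMD_TX_MAX_BURST is defined by the driver.
 */
#define SKETCH_VPMD_TX_MAX_BURST 32u

/*
 * Hypothetical helper: decide whether a write memory barrier would
 * follow the doorbell update when the register is write-combining.
 *
 * - Non-vectorized Tx burst: the barrier is always enforced.
 * - Vectorized Tx burst: the barrier is set only if the burst size is
 *   not aligned to the maximum burst (assumed: not a multiple of it).
 */
static bool
sketch_tx_needs_wmb(bool vectorized, uint16_t burst_size, uint16_t max_burst)
{
	assert(max_burst != 0); /* guard the modulo below */
	if (!vectorized)
		return true;
	return (burst_size % max_burst) != 0;
}
```

In a real datapath the result would gate a call such as rte_wmb() right after the doorbell write; here the decision is kept separate so it can be checked without hardware.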