diff options
| author | Peter-Jan Gootzen <pgootzen@nvidia.com> | 2024-05-01 17:38:17 +0200 | 
|---|---|---|
| committer | Miklos Szeredi <mszeredi@redhat.com> | 2024-05-10 13:38:14 +0200 | 
| commit | 529395d2ae6456c556405016ea0c43081fe607f3 (patch) | |
| tree | b7f1b4967f42119ae7ad64e7bb596a946b5185f7 /scripts/generate_rust_analyzer.py | |
| parent | 103c2de111bf32f7c36a0ce8f638b114a37e0b76 (diff) | |
virtio-fs: add multi-queue support
This commit creates a multi-queue mapping at device bring-up.
The driver first attempts to use the existing MSI-X interrupt
affinities (previously disabled), and if not present, will distribute
the request queues evenly over the CPUs.
If the latter fails as well, all CPUs are mapped to request queue zero.
When a request is handed from FUSE to the virtio-fs device driver, the
driver will use the current CPU to index into the multi-queue mapping
and determine the optimal request queue to use.
We measured the performance of this patch with the fio benchmarking
tool, increasing the number of queues results in a significant speedup
for both read and write operations, demonstrating the effectiveness
of multi-queue support.
Host:
  - Dell PowerEdge R760
  - CPU: Intel(R) Xeon(R) Gold 6438M, 128 cores
  - VM: KVM with 32 cores
Virtio-fs device:
  - BlueField-3 DPU
  - CPU: ARM Cortex-A78AE, 16 cores
  - One thread per queue, each busy polling on one request queue
  - Each queue is 1024 descriptors deep
Workload:
  - fio, sequential read or write, ioengine=libaio, numjobs=32,
    4GiB file per job, iodepth=8, bs=256KiB, runtime=30s
Performance Results:
+===========================+==========+===========+
|     Number of queues      | Fio read | Fio write |
+===========================+==========+===========+
| 1 request queue (GiB/s)   | 6.1     | 4.6        |
+---------------------------+----------+-----------+
| 8 request queues (GiB/s)  | 25.8    | 10.3       |
+---------------------------+----------+-----------+
| 16 request queues (GiB/s) | 30.9    | 19.5       |
+---------------------------+----------+-----------+
| 32 request queue (GiB/s)  | 33.2    | 22.6       |
+---------------------------+----------+-----------+
| Speedup                   | 5.5x    | 5x         |
+---------------=-----------+----------+-----------+
Signed-off-by: Peter-Jan Gootzen <pgootzen@nvidia.com>
Signed-off-by: Yoray Zack <yorayz@nvidia.com>
Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Diffstat (limited to 'scripts/generate_rust_analyzer.py')
0 files changed, 0 insertions, 0 deletions
