Rasagar/Library/PackageCache/com.unity.collections/Documentation~/performance-comparison-allocators.md
2024-08-26 23:07:20 +03:00


Performance Comparison: Allocators

This file is auto-generated

All measurements were taken on a 12th Gen Intel(R) Core(TM) i9-12900K with 24 logical cores.
Unity Editor version: 2022.2.8f1
To regenerate this file locally use: DOTS -> Unity.Collections -> Generate *** menu.

Table of Contents

Benchmark Results

The following benchmarks make 150 consecutive allocations per sample set.
Multithreaded benchmarks make the full 150 consecutive allocations per worker thread per sample set.
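The minimum of the 50 sample sets is compared against the baseline in the far-right column of the table. The value in parentheses is the baseline time divided by that result, so values above 1.0x are faster than the baseline.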
5 extra sample sets are run as warmup.
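
The sampling scheme described above can be sketched roughly as follows. This is an illustrative Python sketch, not the Performance Framework code that produced these results: `measure_min_us`, `speedup`, and `allocate_fixed` are hypothetical names, and the `bytearray` workload is only a stand-in for a real allocator call.

```python
import time

ALLOCATIONS_PER_SAMPLE = 150  # consecutive allocations per sample set
SAMPLE_SETS = 50              # measured sample sets
WARMUP_SETS = 5               # extra sample sets run first and discarded


def measure_min_us(workload, samples=SAMPLE_SETS, warmup=WARMUP_SETS):
    """Run the workload `warmup` times untimed, then return the fastest
    of `samples` timed runs in microseconds (hypothetical helper)."""
    for _ in range(warmup):
        workload()
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        workload()
        timings.append((time.perf_counter() - start) * 1e6)
    return min(timings)


def speedup(baseline_us, candidate_us):
    """Ratio shown in parentheses: baseline time / candidate time."""
    return round(baseline_us / candidate_us, 1)


def allocate_fixed(alloc_size=1024, count=ALLOCATIONS_PER_SAMPLE):
    """Stand-in workload (assumption): `count` allocations of `alloc_size` bytes."""
    return [bytearray(alloc_size) for _ in range(count)]


if __name__ == "__main__":
    print(f"min sample: {measure_min_us(allocate_fixed):.1f}µs")
    # Reproduces the ratio for FixedSize(1, 1024), RewindableAllocator (S)
    # against Persistent (E): 25.1µs / 11.4µs
    print(f"{speedup(25.1, 11.4)}x")  # 2.2x
```

Taking the minimum rather than the mean keeps one-off scheduling noise out of the comparison, which is presumably why the report uses it.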

Legend

(S) = Safety Enabled
(B) = Burst Compiled with Safety Disabled
(S+B) = Burst Compiled with Safety Enabled
(E) = Engine Provided

italic results are for benchmarking comparison only; these are not included in standard Performance Framework tests
Judging from the values in each row, 🟢 marks the fastest result and 🟠 marks the slowest.


RewindableAllocator

| Functionality | RewindableAllocator (S) | RewindableAllocator (S+B) | RewindableAllocator (B) | TempJob (E) | Temp (E) | Persistent (E) |
|---|---|---|---|---|---|---|
| FixedSize(1, 1024)³ | 11.4µs (2.2x) | 4.0µs (6.3x) | 3.8µs (6.6x) 🟢 | 17.0µs (1.5x) | 10.1µs (2.5x) | 25.1µs (1.0x) 🟠 |
| FixedSize(2, 1024)²˒³ | 23.0µs (2.1x) | 23.1µs (2.0x) | 9.0µs (5.2x) 🟢 | 20.2µs (2.3x) | 11.2µs (4.2x) | 47.2µs (1.0x) 🟠 |
| FixedSize(4, 1024)²˒³ | 66.4µs (1.9x) | 71.9µs (1.7x) | 80.8µs (1.5x) | 23.5µs (5.3x) | 11.5µs (10.7x) 🟢 | 123.5µs (1.0x) 🟠 |
| FixedSize(8, 1024)²˒³ | 167.1µs (2.2x) | 169.2µs (2.2x) | 167.3µs (2.2x) | 45.6µs (8.0x) | 12.8µs (28.6x) 🟢 | 366.4µs (1.0x) 🟠 |
| FixedSize(1, 1048576)³ | 11.9µs (16.3x) | 4.7µs (41.3x) | 4.4µs (44.1x) 🟢 | 17.1µs (11.4x) | 10.9µs (17.8x) | 194.1µs (1.0x) 🟠 |
| FixedSize(2, 1048576)²˒³ | 26.0µs (10.0x) | 17.0µs (15.2x) | 14.1µs (18.4x) | 32.0µs (8.1x) | 11.7µs (22.1x) 🟢 | 258.9µs (1.0x) 🟠 |
| FixedSize(4, 1048576)²˒³ | 70.1µs (11.6x) | 71.3µs (11.4x) | 75.3µs (10.8x) | 208.5µs (3.9x) | 12.5µs (65.0x) 🟢 | 812.2µs (1.0x) 🟠 |
| FixedSize(8, 1048576)²˒³ | 139.7µs (14.6x) | 161.0µs (12.7x) | 179.8µs (11.3x) | 1317.1µs (1.5x) | 19.5µs (104.6x) 🟢 | 2039.9µs (1.0x) 🟠 |
| IncSize(1, 4096) | 11.9µs (4.0x) | 4.6µs (10.3x) | 4.3µs (11.0x) 🟢 | 17.9µs (2.6x) | 10.3µs (4.6x) | 47.2µs (1.0x) 🟠 |
| IncSize(2, 4096)²˒⁴ | 26.8µs (4.8x) | 10.9µs (11.7x) | 10.5µs (12.2x) 🟢 | 31.7µs (4.0x) | 10.9µs (11.7x) | 127.6µs (1.0x) 🟠 |
| IncSize(4, 4096)²˒⁴ | 58.9µs (7.6x) | 67.6µs (6.6x) | 64.5µs (6.9x) | 71.9µs (6.2x) | 11.2µs (39.7x) 🟢 | 444.7µs (1.0x) 🟠 |
| IncSize(8, 4096)²˒⁴ | 169.3µs (7.8x) | 159.0µs (8.3x) | 185.7µs (7.1x) | 350.8µs (3.8x) | 11.5µs (114.7x) 🟢 | 1319.0µs (1.0x) 🟠 |
| IncSize(1, 65536) | 12.7µs (49.2x) | 5.0µs (125.1x) | 4.7µs (133.1x) 🟢 | 19.0µs (32.9x) | 11.0µs (56.9x) | 625.4µs (1.0x) 🟠 |
| IncSize(2, 65536)²˒⁴ | 25.0µs (46.3x) | 15.8µs (73.3x) | 13.0µs (89.1x) | 578.1µs (2.0x) | 11.3µs (102.5x) 🟢 | 1157.7µs (1.0x) 🟠 |
| IncSize(4, 65536)²˒⁴ | 73.3µs (34.7x) | 73.0µs (34.8x) | 70.5µs (36.1x) | 2098.0µs (1.2x) | 11.9µs (213.6x) 🟢 | 2542.2µs (1.0x) 🟠 |
| IncSize(8, 65536)²˒⁴ | 141.3µs (40.5x) | 168.1µs (34.1x) | 162.6µs (35.2x) | 6036.0µs (0.9x) 🟠 | 12.7µs (450.8x) 🟢 | 5724.9µs (1.0x) |
| DecSize(1, 4096) | 12.2µs (6.1x) | 4.6µs (16.1x) | 4.3µs (17.2x) 🟢 | 16.9µs (4.4x) | 9.8µs (7.5x) | 73.9µs (1.0x) 🟠 |
| DecSize(2, 4096)²˒⁵ | 27.6µs (3.4x) | 12.5µs (7.6x) | 11.9µs (8.0x) | 37.3µs (2.5x) | 11.4µs (8.3x) 🟢 | 94.9µs (1.0x) 🟠 |
| DecSize(4, 4096)²˒⁵ | 68.5µs (7.5x) | 74.7µs (6.8x) | 69.4µs (7.4x) | 79.3µs (6.5x) | 11.0µs (46.5x) 🟢 | 511.6µs (1.0x) 🟠 |
| DecSize(8, 4096)²˒⁵ | 173.5µs (7.4x) | 173.4µs (7.4x) | 168.3µs (7.6x) | 313.4µs (4.1x) | 17.1µs (75.1x) 🟢 | 1284.6µs (1.0x) 🟠 |
| DecSize(1, 65536) | 12.1µs (47.9x) | 4.6µs (126.0x) | 4.3µs (134.8x) 🟢 | 20.8µs (27.9x) | 11.7µs (49.6x) | 579.8µs (1.0x) 🟠 |
| DecSize(2, 65536)²˒⁵ | 28.6µs (37.1x) | 17.7µs (60.0x) | 11.5µs (92.3x) 🟢 | 658.8µs (1.6x) | 12.5µs (84.9x) | 1061.4µs (1.0x) 🟠 |
| DecSize(4, 65536)²˒⁵ | 67.3µs (38.8x) | 69.4µs (37.6x) | 73.1µs (35.7x) | 2386.4µs (1.1x) | 14.2µs (183.8x) 🟢 | 2609.3µs (1.0x) 🟠 |
| DecSize(8, 65536)²˒⁵ | 154.4µs (37.8x) | 166.6µs (35.0x) | 155.9µs (37.4x) | 5938.8µs (1.0x) 🟠 | 28.6µs (203.9x) 🟢 | 5830.8µs (1.0x) |

² Benchmark run on parallel job workers - results may vary
³ FixedSize(workerThreads, allocSize)
⁴ IncSize(workerThreads, allocSize) -- Makes linearly increasing allocations [1⋅allocSize, 2⋅allocSize ... N⋅allocSize]
⁵ DecSize(workerThreads, allocSize) -- Makes linearly decreasing allocations [N⋅allocSize ... 2⋅allocSize, 1⋅allocSize]
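
To make the three allocation patterns concrete, the size sequences they describe can be sketched as below. This is a hedged illustration in Python, not the benchmark source: the function names are hypothetical, and N is assumed to equal the 150 consecutive allocations per sample set stated earlier.

```python
ALLOCATIONS_PER_SAMPLE = 150  # assumption: N is the per-sample allocation count


def fixed_sizes(alloc_size, count=ALLOCATIONS_PER_SAMPLE):
    """FixedSize: every allocation requests the same number of bytes."""
    return [alloc_size] * count


def inc_sizes(alloc_size, count=ALLOCATIONS_PER_SAMPLE):
    """IncSize: 1*allocSize, 2*allocSize, ..., N*allocSize."""
    return [i * alloc_size for i in range(1, count + 1)]


def dec_sizes(alloc_size, count=ALLOCATIONS_PER_SAMPLE):
    """DecSize: N*allocSize, ..., 2*allocSize, 1*allocSize."""
    return [i * alloc_size for i in range(count, 0, -1)]


if __name__ == "__main__":
    sizes = inc_sizes(4096)
    print(sizes[:3], "...", sizes[-1])
    print("total bytes per sample set:", sum(sizes))
```

Under that assumption, IncSize and DecSize request the same total number of bytes per sample set and differ only in the order of the requests.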