Collections Benchmarking and Performance Tests

Overview

com.unity.collections provides predefined intermediate 'glue' layers on top of the Benchmark Framework. These layers make it relatively simple to create performance tests and benchmarks for the wide variety of code paths that may be taken when using the collections package.

Containers

Examples of provided benchmarking and performance testing include:

  • NativeContainer code
  • Burst compiled NativeContainer code with safety enabled
  • Burst compiled NativeContainer code with safety disabled
  • UnsafeContainer code
  • Burst compiled UnsafeContainer code with safety enabled
  • Burst compiled UnsafeContainer code with safety disabled

Combine those with:

  • Container.ParallelWriter code going wide in any of the above mentioned situations
  • Container.ReadOnly code going wide

and it is easy to see the vast number of combinations that we want to monitor and for which we want to generate concrete performance data and comparisons.

Regarding comparisons, we also want to ensure that these Burst-compatible containers are competitive with, or better than, similar containers in the .NET/IL2CPP/Mono base class library (BCL), and to have a way to validate and track improvements there as well. Such BCL containers include those found in:

  • System.Collections.Generic
  • System.Collections.Concurrent

Allocators

Naturally, there is a similar story with the custom allocator types provided by the collections package. In this case we want to be able to compare:

  • A provided IAllocator implementation in a managed code path
  • The same in a Burst compiled code path with safety enabled
  • Again the same in a Burst compiled code path with safety disabled

against:

  • The UnityEngine built-in Allocator.Temp
  • The UnityEngine built-in Allocator.TempJob
  • The UnityEngine built-in Allocator.Persistent

Container Benchmarking and Performance Tests

Container performance testing and benchmarks are built around a small handful of types.

|Type|Description|
|---|---|
|BenchmarkContainerType|This enum defines variations for Native and Unsafe containers with and without burst compilation - with and without safety enabled. See the inline documentation for full details.|
|IBenchmarkContainer|Tests are written as implementations of this interface. It provides means for generic int parameters, allocation and disposal of Native, Unsafe, and C# Base Class Library containers, and measurement of the same.|
|BenchmarkContainerRunner|Easy-to-use API for running measurements in a single call. See inline documentation for full details, and see below for example usage.|
|IBenchmarkContainerParallel|Similar to IBenchmarkContainer, but designed to support tightly designed measurement code with Unity Job system workers in mind|
|BenchmarkContainerRunnerParallel|Similar to BenchmarkContainerRunner, but designed to parameterize worker thread counts for performance testing and benchmarking parallel container implementations|


Example Code - List.Add

Here is a basic real-world example of implementing a performance test and benchmark comparison for lists. It measures the cost of simply adding elements to a list with the expected capacity pre-allocated.

    struct ListAdd : IBenchmarkContainer
    {
        int capacity;
        NativeList<int> nativeContainer;
        UnsafeList<int> unsafeContainer;

        void IBenchmarkContainer.SetParams(int capacity, params int[] args) => this.capacity = capacity;

        public void AllocNativeContainer(int capacity) => ListUtil.AllocInt(ref nativeContainer, capacity, false);
        public void AllocUnsafeContainer(int capacity) => ListUtil.AllocInt(ref unsafeContainer, capacity, false);
        public object AllocBclContainer(int capacity) => ListUtil.AllocBclContainer(capacity, false);

        public void MeasureNativeContainer()
        {
            for (int i = 0; i < capacity; i++)
                nativeContainer.Add(i);
        }
        public void MeasureUnsafeContainer()
        {
            for (int i = 0; i < capacity; i++)
                unsafeContainer.Add(i);
        }
        public void MeasureBclContainer(object container)
        {
            var bclContainer = (System.Collections.Generic.List<int>)container;
            for (int i = 0; i < capacity; i++)
                bclContainer.Add(i);
        }
    }
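
To make concrete what the BCL variation above measures, here is a minimal stand-alone sketch that can be run outside of Unity. It is not part of the benchmark framework and does not use its timing machinery: it assumes a plain `System.Diagnostics.Stopwatch`, and the names `BclListAddSketch` and `Measure` are hypothetical. Like the split between `AllocBclContainer` and `MeasureBclContainer`, it keeps the allocation outside the timed region.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Stand-alone approximation of the BCL path of ListAdd.
// Hypothetical names; not the framework's own measurement code.
static class BclListAddSketch
{
    // Pre-allocates the expected capacity, then times only the Add loop.
    public static (List<int> list, TimeSpan elapsed) Measure(int capacity)
    {
        var list = new List<int>(capacity); // allocation is not timed

        var watch = Stopwatch.StartNew();
        for (int i = 0; i < capacity; i++)
            list.Add(i);
        watch.Stop();

        return (list, watch.Elapsed);
    }

    static void Main()
    {
        // Same sizes as the [Values] used by the List.Add test in this README.
        foreach (int capacity in new[] { 10000, 100000, 1000000 })
        {
            var (list, elapsed) = Measure(capacity);
            Console.WriteLine(
                $"Add({capacity}): {elapsed.TotalMilliseconds:F3}ms, count = {list.Count}");
        }
    }
}
```

A single Stopwatch run like this is noisy; it only illustrates the shape of the measurement, not how the framework samples it.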

The calling code that runs the ListAdd measurements is quite simple. It generates a multitude of Performance Test Framework tests, which can be run from the Unity Test Runner as well as through CI regression checks. It also supports the code paths needed for benchmarking, so that performance can be compared across all the variations, including the BCL variation. Note that the BCL variation, System.Collections.Generic.List, will not appear as a Performance Test Framework test; it is used for benchmarking only.

    [Benchmark(typeof(BenchmarkContainerType))]
    class List
    {
        ... 
        [Test, Performance]
        [Category("Performance")]
        public unsafe void Add(
            [Values(10000, 100000, 1000000)] int insertions,
            [Values] BenchmarkContainerType type)
        {
            BenchmarkContainerRunner<ListAdd>.Run(insertions, type);
        }
        ...
    }
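
As a rough illustration of why a single test method yields a "multitude" of tests: NUnit combines `[Values]` parameters combinatorially by default, and a bare `[Values]` on an enum parameter expands to every member of that enum. The sketch below counts the resulting cases for `Add`. It assumes `BenchmarkContainerType` has six members, one per non-BCL column of the generated report; the stand-in enum and its member names are hypothetical.

```csharp
using System;
using System.Linq;

static class TestCaseCountSketch
{
    // Hypothetical stand-in for BenchmarkContainerType. Six members are
    // assumed, matching the six non-BCL report columns; real names may differ.
    enum ContainerVariation
    {
        NativeSafety, NativeSafetyBurst, NativeBurst,
        UnsafeSafety, UnsafeSafetyBurst, UnsafeBurst
    }

    // Counts every (insertions, variation) pair, as a combinatorial
    // expansion of the two [Values] parameters would.
    public static int Count()
    {
        int[] insertions = { 10000, 100000, 1000000 };
        var variations = (ContainerVariation[])Enum.GetValues(typeof(ContainerVariation));

        var cases = from n in insertions
                    from v in variations
                    select (n, v);
        return cases.Count();
    }

    static void Main() => Console.WriteLine(Count()); // prints 18 (3 sizes x 6 variations)
}
```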

Results - List.Add

The two code snippets above generate something like the following (notice that the BCL tests aren't generated):

Performance Test Framework example

Running the DOTS/Unity.Collections/Generate Container Benchmarks menu item will generate a markdown report, again running the same single code path per type. Here is a snippet of the full results showing only the output for List.Add:

List

| Functionality | NativeList (S) | NativeList (S+B) | NativeList (B) | UnsafeList (S) | UnsafeList (S+B) | UnsafeList (B) | List (BCL) |
|---|---|---|---|---|---|---|---|
| Add(10000) | 0.178ms (0.1x) 🟠 | 0.057ms (0.3x) | 0.018ms (0.8x) | 0.041ms (0.4x) | 0.006ms (2.3x) 🟢 | 0.014ms (1.1x) | 0.015ms (1.0x) |
| Add(100000) | 1.827ms (0.1x) 🟠 | 0.622ms (0.2x) | 0.180ms (0.8x) | 0.432ms (0.3x) | 0.061ms (2.4x) 🟢 | 0.139ms (1.1x) | 0.146ms (1.0x) |
| Add(1000000) | 18.910ms (0.1x) 🟠 | 6.443ms (0.2x) | 1.814ms (0.8x) | 4.136ms (0.4x) | 0.586ms (2.5x) 🟢 | 1.482ms (1.0x) | 1.468ms (1.0x) |
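
The multiplier in parentheses appears to be the baseline time divided by the variation's time, where the baseline is the last column (List (BCL) here). Values above 1.0x are therefore faster than the baseline. This convention is inferred from the numbers rather than stated by the report, so treat the sketch below as an assumption. It recomputes the Add(1000000) row; the type and method names are hypothetical.

```csharp
using System;
using System.Globalization;

static class RelativeSpeedSketch
{
    // Assumed convention: multiplier = baseline time / variation time.
    public static double Multiplier(double baselineMs, double variantMs) =>
        baselineMs / variantMs;

    // One decimal place plus an "x" suffix, as in the report.
    public static string Format(double multiplier) =>
        multiplier.ToString("F1", CultureInfo.InvariantCulture) + "x";

    static void Main()
    {
        const double bclMs = 1.468; // List (BCL), Add(1000000)

        var row = new (string name, double ms)[]
        {
            ("NativeList (S)",   18.910),
            ("NativeList (S+B)",  6.443),
            ("NativeList (B)",    1.814),
            ("UnsafeList (S)",    4.136),
            ("UnsafeList (S+B)",  0.586),
            ("UnsafeList (B)",    1.482),
        };

        // Prints 0.1x, 0.2x, 0.8x, 0.4x, 2.5x, 1.0x, matching the row above.
        foreach (var (name, ms) in row)
            Console.WriteLine($"{name}: {Format(Multiplier(bclMs, ms))}");
    }
}
```

Recomputing from the rounded times shown in the report can differ slightly from the printed multiplier when the times are very small. For example, the Add(10000) row gives 0.015 / 0.006 = 2.5 against a reported 2.3x, presumably because the report divides unrounded timings.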

Allocator Benchmarking and Performance Tests

Allocator performance testing and benchmarks are built around a small handful of types.

|Type|Description|
|---|---|
|BenchmarkAllocatorType|This enum defines variations for allocators with and without burst compilation - with and without safety enabled. See the inline documentation for full details.|
|IBenchmarkAllocator|Tests are written as implementations of this interface. It provides means for generic int parameters, creation and destruction of allocators, allocation and freeing of memory using these allocators as well as using Unity Engine's built-in allocators Temp, TempJob, and Persistent, and measurement of the same.|
|BenchmarkAllocatorRunner|Easy-to-use API for running measurements in a single call. See inline documentation for full details, and see below for example usage.|
|BenchmarkAllocatorUtil|Generalized API for simplifying common Setup and Teardown implementations of IBenchmarkAllocator derived test types|


Example Code - RewindableAllocator.FixedSize

The following example omits the definition of another utility type designed for RewindableAllocator. That type simplifies the setup, teardown, and Rewind functionality needed on a per-test-run basis. See RewindableAllocatorPerformanceTests.cs for reference.

    struct Rewindable_FixedSize : IBenchmarkAllocator
    {
        RewindableAllocationInfo allocInfo;

        public void CreateAllocator(Allocator builtinOverride) => allocInfo.CreateAllocator(builtinOverride);
        public void DestroyAllocator() => allocInfo.DestroyAllocator();
        public void Setup(int workers, int size, int allocations) =>
            allocInfo.Setup(workers, size, 0, allocations);
        public void Teardown() => allocInfo.Teardown();
        public void Measure(int workerI) => allocInfo.Allocate(workerI);
    }

The calling code that runs these measurements is quite simple. It generates a multitude of Performance Test Framework tests, which can be run from the Unity Test Runner as well as through CI regression checks. It also supports the code paths needed for benchmarking, so that performance can be compared across all the variations, including the Temp, TempJob, and Persistent variations. Note that these Unity Engine built-in allocator variations will not appear as Performance Test Framework tests; they are used for benchmarking only.

    [Benchmark(typeof(BenchmarkAllocatorType))]
    [BenchmarkNameOverride("RewindableAllocator")]
    class RewindableAllocatorBenchmark
    {
        ...
        [Test, Performance]
        [Category("Performance")]
        [BenchmarkTestFootnote]
        public void FixedSize(
            [Values(1, 2, 4, 8)] int workerThreads,
            [Values(1024, 1024 * 1024)] int allocSize,
            [Values] BenchmarkAllocatorType type)
        {
            BenchmarkAllocatorRunner<Rewindable_FixedSize>.Run(type, allocSize, workerThreads);
        }
        ...
    }

Results - RewindableAllocator.FixedSize

The two code snippets above generate something like the following (notice that tests for the Unity Engine built-in allocators aren't generated):

Performance Test Framework example

Running the DOTS/Unity.Collections/Generate Allocator Benchmarks menu item will generate a markdown report, again running the same single code path per type. Here is a snippet of the full results showing only the output for RewindableAllocator.FixedSize:

RewindableAllocator

| Functionality | RewindableAllocator (S) | RewindableAllocator (S+B) | RewindableAllocator (B) | TempJob (E) | Temp (E) | Persistent (E) |
|---|---|---|---|---|---|---|
| FixedSize(1, 1024)³ | 11.4µs (2.5x) | 3.9µs (7.3x) | 3.6µs (7.9x) 🟢 | 13.6µs (2.1x) | 10.2µs (2.8x) | 28.6µs (1.0x) 🟠 |
| FixedSize(2, 1024)²˒³ | 27.8µs (2.5x) | 17.7µs (3.9x) | 8.8µs (7.9x) 🟢 | 19.3µs (3.6x) | 10.6µs (6.5x) | 69.1µs (1.0x) 🟠 |
| FixedSize(4, 1024)²˒³ | 65.3µs (1.9x) | 73.1µs (1.7x) | 66.8µs (1.8x) | 28.2µs (4.3x) | 11.8µs (10.3x) 🟢 | 121.8µs (1.0x) 🟠 |
| FixedSize(8, 1024)²˒³ | 141.5µs (2.1x) | 133.3µs (2.3x) | 158.5µs (1.9x) | 46.0µs (6.6x) | 11.6µs (26.2x) 🟢 | 303.9µs (1.0x) 🟠 |
| FixedSize(1, 1048576)³ | 12.3µs (16.5x) | 4.6µs (44.2x) | 4.2µs (48.4x) 🟢 | 17.3µs (11.8x) | 10.5µs (19.4x) | 203.3µs (1.0x) 🟠 |
| FixedSize(2, 1048576)²˒³ | 24.7µs (12.1x) | 14.9µs (20.0x) | 10.4µs (28.7x) 🟢 | 27.7µs (10.8x) | 11.3µs (26.4x) | 298.4µs (1.0x) 🟠 |
| FixedSize(4, 1048576)²˒³ | 70.8µs (12.4x) | 77.5µs (11.3x) | 72.5µs (12.1x) | 199.5µs (4.4x) | 12.5µs (70.2x) 🟢 | 877.7µs (1.0x) 🟠 |
| FixedSize(8, 1048576)²˒³ | 152.0µs (14.5x) | 155.2µs (14.2x) | 160.9µs (13.7x) | 1010.8µs (2.2x) | 12.4µs (177.2x) 🟢 | 2197.7µs (1.0x) 🟠 |

² Benchmark run on parallel job workers - results may vary
³ FixedSize(workerThreads, allocSize)