8 Techniques to Avoid GC Pressure and Improve Performance in C#

In a .NET application, memory and performance are very much linked. Poor memory management can hurt performance in many ways. One such effect is called GC Pressure or Memory Pressure.

GC Pressure (garbage collector pressure) is when the GC doesn’t keep up with memory deallocations. When the GC is pressured, it will spend more time garbage collecting, and these collections will come more frequently. When your app spends more time garbage collecting, it spends less time executing code, thus directly hurting performance.

If you’re not familiar with garbage collector fundamentals, I suggest reading this article first.

This article will show 8 techniques to minimize GC pressure, and by doing so, improve performance.

1. Set initial capacity for dynamic collections

.NET provides a lot of great collection types like List<T>, Dictionary<TKey, TValue>, and HashSet<T>. All of these collections have dynamic capacity, which means they automatically expand as you add more items.

While this functionality is very convenient, it’s not great for memory management. Whenever the collection reaches its size limit, it will allocate a new larger memory buffer (usually an array double in size). That means an additional allocation and deallocation.

Check out this benchmark:

[Benchmark]
public void ListDynamicCapacity()
{
    List<int> list = new List<int>();
    for (int i = 0; i < Size; i++)
    {
        list.Add(i);
    }
}

[Benchmark]
public void ListPlannedCapacity()
{
    List<int> list = new List<int>(Size);
    for (int i = 0; i < Size; i++)
    {
        list.Add(i);
    }
}

I’m using BenchmarkDotNet here with [Host]: .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT

In the first method, the List collection started with the default capacity and expanded as items were added. In the second benchmark, I set the initial capacity to the number of items it’s going to have.

For 1000 items, the results were:

Method               Mean      Error      StdDev
ListDynamicCapacity  3.415 us  0.0687 us  0.1240 us
ListPlannedCapacity  2.422 us  0.0219 us  0.0183 us

By setting the capacity, we cut the run time by about 30%. In practice, the improvement in performance is probably even greater because BenchmarkDotNet performs GC collections before and after each benchmark run.

I performed another benchmark for Dictionary and HashSet, with similar results:

Method                     Mean       Error      StdDev
DictionaryDynamicCapacity  36.693 us  0.7505 us  1.4637 us
DictionaryPlannedCapacity  17.500 us  0.3325 us  0.3696 us
HashSetDynamicCapacity     28.080 us  0.4264 us  0.3780 us
HashSetPlannedCapacity     16.533 us  0.3285 us  0.3374 us
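The Dictionary and HashSet benchmark code isn’t shown above, so here is a minimal sketch of what setting the capacity looks like for all three collection types. It also counts how often a List replaces its internal buffer, using the Capacity property. The class name, method name, and item count are my own assumptions for illustration. The HashSet<T>(int) constructor assumes .NET Core 2.0 or later (or .NET Framework 4.7.2 or later).

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Counts how many times the list swapped its internal buffer while adding `count` items.
    public static int CountListResizes(List<int> list, int count)
    {
        int resizes = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;                    // a new, larger buffer was allocated
                lastCapacity = list.Capacity;
            }
        }
        return resizes;
    }

    public static void Main()
    {
        const int size = 1000; // hypothetical item count, same as the benchmark

        int dynamicResizes = CountListResizes(new List<int>(), size);
        int plannedResizes = CountListResizes(new List<int>(size), size);
        Console.WriteLine($"Dynamic: {dynamicResizes} resizes, planned: {plannedResizes} resizes");

        // Dictionary and HashSet accept an initial capacity in the same way.
        var dictionary = new Dictionary<int, int>(size);
        var set = new HashSet<int>(size);
        for (int i = 0; i < size; i++)
        {
            dictionary[i] = i;
            set.Add(i);
        }
    }
}
```

Every resize in the dynamic case is one extra buffer that the GC has to clean up later, which is exactly the work the planned capacity avoids.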

2. Use ArrayPool for short-lived large arrays

Allocating arrays, and the inevitable deallocation, can be quite costly. Performing these allocations at high frequency will cause GC pressure and hurt performance. An elegant solution is the System.Buffers.ArrayPool class found in the System.Buffers NuGet package.

The idea is pretty similar to the ThreadPool. A shared buffer for arrays is allocated, which you can reuse without actually allocating and de-allocating memory. The basic usage is by calling ArrayPool<T>.Shared.Rent(size). This returns a regular array, which you can use any way you please. Note that the returned array may be longer than the size you asked for. When finished, call ArrayPool<T>.Shared.Return(array) to return the buffer to the shared pool.

Here’s a benchmark showing this:

[Benchmark]
public void RegularArray()
{
    int[] array = new int[ArraySize];
}

[Benchmark]
public void SharedArrayPool()
{
    var pool = ArrayPool<int>.Shared;
    int[] array = pool.Rent(ArraySize);
    pool.Return(array);
}

For 100 integers the results are:

Method           Mean      Error      StdDev
RegularArray     41.23 ns  0.8544 ns  2.236 ns
SharedArrayPool  47.42 ns  0.9781 ns  1.087 ns

Pretty similar, but when running for 1,000 integers:

Method           Mean       Error     StdDev
RegularArray     404.53 ns  8.074 ns  18.872 ns
SharedArrayPool  51.71 ns   1.354 ns  1.505 ns

As you can imagine, the ArrayPool allocation time stays the same, whereas regular allocation time increases as the size grows.

Much like the ThreadPool with threads, the ArrayPool should be used for short-lived large arrays. For more info on the ArrayPool, read Adam Sitnik’s excellent blog post.
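In real code, the rented array does some work between Rent and Return. Here is a minimal sketch of that shape. The class and method names are hypothetical, and the try/finally is my own suggestion for making sure the buffer goes back to the pool even if the work throws.

```csharp
using System;
using System.Buffers;

public static class ArrayPoolDemo
{
    // Sums the squares of 0..count-1 using a rented scratch buffer.
    public static long SumOfSquares(int count)
    {
        int[] buffer = ArrayPool<int>.Shared.Rent(count);
        try
        {
            // The rented array may be longer than requested,
            // so only the first `count` slots are used.
            for (int i = 0; i < count; i++)
            {
                buffer[i] = i * i;
            }

            long sum = 0;
            for (int i = 0; i < count; i++)
            {
                sum += buffer[i];
            }
            return sum;
        }
        finally
        {
            // Hand the buffer back so the next caller can reuse it.
            ArrayPool<int>.Shared.Return(buffer);
        }
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(1000));
    }
}
```

Keep in mind that a rented array is not cleared by default, so it can still contain values left by a previous user. Don’t keep using the array after returning it.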

3. Use Structs instead of Classes (sometimes)

Structs have several benefits when it comes to deallocation:

  • When a struct is a local variable or parameter rather than part of a class, it is typically allocated on the Stack and doesn’t require garbage collection at all. Its memory is reclaimed by stack unwinding.
  • Structs are stored on the heap when they are part of a class (or any reference-type). In that case, they are stored inline and are deallocated when the containing type is deallocated. Inline means the struct’s data is stored as-is, as opposed to a reference type, where a pointer is stored to another location on the heap with the actual data. This is especially meaningful in collections, where a collection of structs is much cheaper to de-allocate because it’s just one buffer of memory.
  • Structs take less memory than a reference type because they don’t have an object header and a MethodTable pointer.

In most cases, you will want to use classes. Use structs when all of the following are true (see the full guidelines from Microsoft):

  • The struct size is less than or equal to 16 bytes (e.g., 4 integers). Above that size, classes are more effective than structs.
  • The struct is short-lived.
  • The struct is immutable.
  • The struct will not have to be boxed frequently.

In addition, structs are passed by value. So when you pass a struct as a method parameter, it is copied entirely. Copying a large struct is expensive and can hurt performance instead of improving it.
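To make the pass-by-value behavior concrete, here is a small sketch. All the type and method names are hypothetical, and public fields are used only to keep the example short. The struct version changes a copy, and the class version changes the caller’s object.

```csharp
using System;

public struct PointStruct
{
    public int X;
    public int Y;
}

public class PointClass
{
    public int X;
    public int Y;
}

public static class CopyDemo
{
    // Receives a full copy of the struct; the caller's value is untouched.
    public static void MoveStruct(PointStruct point)
    {
        point.X += 10;
    }

    // Receives a copy of the reference; both references point to the same object.
    public static void MoveClass(PointClass point)
    {
        point.X += 10;
    }

    public static void Main()
    {
        var s = new PointStruct { X = 1, Y = 2 };
        var c = new PointClass { X = 1, Y = 2 };

        MoveStruct(s);
        MoveClass(c);

        Console.WriteLine($"struct X = {s.X}, class X = {c.X}");
    }
}
```

For an 8-byte struct like this one the copy is cheap. The cost only becomes a concern as the struct grows, which is one reason behind the 16-byte guideline.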

Here’s a benchmark that shows how efficient allocating structs can be:

class VectorClass
{
    public int X { get; set; }
    public int Y { get; set; }
}

struct VectorStruct
{
    public int X { get; set; }
    public int Y { get; set; }
}

private const int ITEMS = 10000;

[Benchmark]
public void WithClass()
{
    VectorClass[] vectors = new VectorClass[ITEMS];
    for (int i = 0; i < ITEMS; i++)
    {
        vectors[i] = new VectorClass();
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

[Benchmark]
public void WithStruct()
{
    VectorStruct[] vectors = new VectorStruct[ITEMS];
    // At this point all the vector instances are already allocated with default values
    for (int i = 0; i < ITEMS; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

Result:

Method      Mean      Error      StdDev
WithClass   77.97 us  1.5528 us  2.6785 us
WithStruct  12.97 us  0.2564 us  0.6094 us

As you can see, the struct allocation is about 6 times faster than the class allocation.

4. Avoid Finalizers

Finalizers in C# are very expensive for several reasons:

  • Any object with a finalizer is automatically promoted a generation by the garbage collector. This means it can’t be reclaimed in Gen 0, which is the fastest generation.
  • The finalizer is placed in a Finalizer Queue, handled by a single dedicated thread. This can cause problems if some finalizer runs for a long time or throws an exception.

To prove how terrible finalizers can be for performance, consider the following benchmark:

class Simple
{
    public int X { get; set; }
}

class SimpleWithFinalizer
{
    ~SimpleWithFinalizer() { }
    public int X { get; set; }
}

private int ITEMS = 100000;
private static Simple _instance1;
private static SimpleWithFinalizer _instance2;

[Benchmark]
public void AllocateSimple()
{
    for (int i = 0; i < ITEMS; i++)
    {
        _instance1 = new Simple();
    }
}

[Benchmark]
public void AllocateSimpleWithFinalizer()
{
    for (int i = 0; i < ITEMS; i++)
    {
        _instance2 = new SimpleWithFinalizer();
    }
}

The result for 100,000 items is:

Method                       Mean          Error         StdDev
AllocateSimple               409.9 us      9.063 us      17.24 us
AllocateSimpleWithFinalizer  128,796.8 us  2,520.871 us  2,588.75 us

The measuring unit ‘us’ stands for microseconds. 1000 us = 1 millisecond.

As you can see, there’s a ratio of roughly 1:314 in favor of classes without finalizers.

Sometimes, finalizers are unavoidable. For example, they are often used in the Dispose Pattern. In such cases, make sure to suppress the finalizer when it’s no longer required, like this:

public void Dispose()
{
    Dispose(true); // the actual dispose functionality
    GC.SuppressFinalize(this); // now, the finalizer won't be called
}
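The Dispose(true) call above is one piece of the wider Dispose Pattern. For context, here is a minimal sketch of a whole class built that way. The NativeBuffer class and its members are hypothetical. It uses Marshal.AllocHGlobal as a stand-in for any unmanaged resource that would justify having a finalizer in the first place.

```csharp
using System;
using System.Runtime.InteropServices;

public class NativeBuffer : IDisposable
{
    private IntPtr _handle;   // unmanaged memory owned by this instance
    private bool _disposed;

    public NativeBuffer(int bytes)
    {
        _handle = Marshal.AllocHGlobal(bytes);
    }

    public bool IsDisposed => _disposed;

    public void Dispose()
    {
        Dispose(true);
        // The resource is already released, so the finalizer has nothing left to do.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (_disposed)
        {
            return; // safe to call more than once
        }

        // Unmanaged memory is released on both the Dispose and the finalizer path.
        Marshal.FreeHGlobal(_handle);
        _handle = IntPtr.Zero;
        _disposed = true;
    }

    // Safety net: only runs if the caller forgot to call Dispose.
    ~NativeBuffer()
    {
        Dispose(false);
    }
}

public static class DisposeDemo
{
    public static void Main()
    {
        using (var buffer = new NativeBuffer(64))
        {
            // work with the buffer here
        }
    }
}
```

With a using block, the instance is disposed deterministically and never pays the finalization cost measured in the benchmark.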

5. Use StackAlloc for short-lived array allocations

The stackalloc keyword in C# allows for very fast allocation and deallocation of a block of memory on the stack. It works only with unmanaged types. That is, classes won’t work, but blocks of primitives and of structs that contain no references are supported. Here’s an example benchmark:

struct VectorStruct
{
    public int X { get; set; }
    public int Y { get; set; }
}

[Benchmark]
public void WithNew()
{
    VectorStruct[] vectors = new VectorStruct[5];
    for (int i = 0; i < 5; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

[Benchmark]
public unsafe void WithStackAlloc() // Note that unsafe context is required
{
    VectorStruct* vectors = stackalloc VectorStruct[5];
    for (int i = 0; i < 5; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

[Benchmark]
public void WithStackAllocSpan() // When using Span, no need for unsafe context
{
    Span<VectorStruct> vectors = stackalloc VectorStruct[5];
    for (int i = 0; i < 5; i++)
    {
        vectors[i].X = 5;
        vectors[i].Y = 10;
    }
}

The results are:

Method              Mean       Error      StdDev
WithNew             10.372 ns  0.1531 ns  0.1432 ns
WithStackAlloc      5.704 ns   0.0938 ns  0.0831 ns
WithStackAllocSpan  5.742 ns   0.0965 ns  0.1021 ns

stackalloc is about twice as fast as regular instantiation. When increasing the number of items from 5 to 100, the difference is even greater – 82ns : 36ns.

Use Span<T> rather than a pointer, since no unsafe context is needed.

Learn more about stackalloc here.
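Stack space is limited, so stackalloc only suits small buffers. One common way to handle that, sketched below, is to allocate on the stack under a size threshold and fall back to a regular array above it. The threshold value and all names are my own assumptions. The example needs C# 7.2 or later and a runtime with Span<T>.

```csharp
using System;

public static class StackAllocDemo
{
    // Hypothetical threshold: keep stack buffers small to stay clear of a stack overflow.
    private const int MaxStackItems = 128;

    // Fills a scratch buffer with the first `count` even numbers and sums them.
    public static long SumOfEvens(int count)
    {
        // Small buffers live on the stack; larger ones fall back to the heap.
        Span<int> buffer = count <= MaxStackItems
            ? stackalloc int[count]
            : new int[count];

        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i * 2;
        }

        long sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum;
    }

    public static void Main()
    {
        Console.WriteLine(SumOfEvens(5));    // served from the stack
        Console.WriteLine(SumOfEvens(1000)); // served from the heap
    }
}
```

The rest of the method doesn’t care where the memory came from, because both branches produce a Span<int>.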

6. Use StringBuilder, but not always

Strings are immutable. As such, they cannot change. Any concatenation like str1 = str1 + str2 will allocate a new object. To prevent these new allocations and improve performance, the StringBuilder class was created.

I recently wrote a blog post on StringBuilder performance and found out that things were not as simple as they might seem. Here’s the summary of my research:

  • Regular concatenations are more efficient than StringBuilder for a small number of concatenations. Depending on string sizes, using StringBuilder becomes more efficient with over 10-15 concatenations.
  • StringBuilder can be optimized by setting its initial capacity.
  • StringBuilder can be optimized by reusing the same instance. This can make a difference for very frequent usages like logging.

For more information, read the full article: Challenging the C# StringBuilder Performance
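The last two points in the summary, initial capacity and instance reuse, can be combined. Here is a minimal sketch for a logging-style scenario. The LineFormatter class and the capacity of 256 are hypothetical, and the class is not thread-safe as written.

```csharp
using System;
using System.Text;

public class LineFormatter
{
    // One builder, allocated once with room for a typical line (256 is an assumed size).
    private readonly StringBuilder _builder = new StringBuilder(256);

    public string Format(string level, string message)
    {
        _builder.Clear(); // resets the length but keeps the allocated buffer

        _builder.Append('[')
                .Append(level)
                .Append("] ")
                .Append(message);

        return _builder.ToString();
    }
}

public static class StringBuilderDemo
{
    public static void Main()
    {
        var formatter = new LineFormatter();
        Console.WriteLine(formatter.Format("INFO", "Application started"));
        Console.WriteLine(formatter.Format("WARN", "Disk space is low"));
    }
}
```

Each call still allocates the final string, but the intermediate buffer is allocated only once for the lifetime of the formatter.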

7. Use String Interning in very specific cases

About 60 percent of the human body is water. Similarly, about 70% of a .NET application is strings. This makes optimizing strings one of the most important aspects of memory management.

The .NET runtime has a hidden optimization. For literal strings with the same value, it uses the same reference. For example, consider the following code:

string a = "Table";
string b = "Table";

It seems like a and b will be allocated to 2 different objects. But the CLR will allocate just 1 object, which both a and b will reference. This optimization is called String Interning. There are 2 positive side effects to this:

  1. You save memory by using just 1 object.
  2. It’s cheaper to compare between the strings. A comparison first checks for reference equality. Since both a and b reference the same object, the comparison will return true without actually checking the string contents.

This optimization is done just for string literals, for example when you write something like string myString = "Something". It’s not done for strings that are calculated at runtime. The reason is that string interning is expensive: when interning a new string, the runtime has to look for an identical string in memory to find a match. This is too costly to do for every string, so it’s just not done.

As it happens, you can perform string interning manually. This is done with the string.Intern(string) method. And you can check if a string is already interned with string.IsInterned(string). In very specific cases, you can use this for optimization. Here’s one example:
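The difference between literals and runtime strings is easy to observe with object.ReferenceEquals. Here is a small sketch. The class and method names are hypothetical, and the prefix is passed as a parameter so that the concatenation really happens at run time.

```csharp
using System;

public static class LiteralInterningDemo
{
    // Two identical literals end up as one shared object.
    public static bool LiteralsShareReference()
    {
        string a = "Table";
        string b = "Table";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is a separate object, even with equal contents.
    public static bool RuntimeStringSharesReference(string prefix)
    {
        string a = "Table";
        string c = prefix + "ble"; // concatenated at run time, not a literal
        return object.ReferenceEquals(a, c);
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareReference());
        Console.WriteLine(RuntimeStringSharesReference("Ta"));
    }
}
```

The second check compares two strings that are equal by value, so a == c would still be true. Only the reference comparison tells them apart.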

private string s1 = "Hello";
private string s2 = " World";

[Benchmark]
public void WithoutInterning()
{
    string s1 = GetNonLiteral();
    string s2 = GetNonLiteral();
    for (int i = 0; i < Size; i++)
    {
        bool x = s1.Equals(s2);
    }
}

[Benchmark]
public void WithInterning()
{
    string s1 = string.Intern(GetNonLiteral());
    string s2 = string.Intern(GetNonLiteral());
    for (int i = 0; i < Size; i++)
    {
        bool x = s1.Equals(s2);
    }
}

private string GetNonLiteral()
{
    return s1 + s2;
}

For 100 items this benchmark will return:

Method            Mean      Error     StdDev     Median
WithoutInterning  198.3 ns  3.986 ns  10.776 ns  201.5 ns
WithInterning     424.4 ns  8.426 ns  8.653 ns   421.0 ns

And for 10,000 items:

Method            Mean      Error      StdDev
WithoutInterning  68.06 us  0.6225 us  0.5198 us
WithInterning     16.11 us  0.3288 us  0.3075 us

As you can see, this can be very effective when the number of comparisons is much larger than the number of intern operations. These cases are very rare. If you do consider interning, do some benchmarking to make sure you are actually optimizing anything.

Note that an interned string will never be garbage collected. It might make more sense to create a local string-pool of your own. You can see Jon Skeet’s answer on StackOverflow where he explains this point further and even shows an implementation example.
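To give a rough idea of what a local string pool can look like, here is a minimal sketch of my own. It is not the implementation from the StackOverflow answer. The StringPool class is hypothetical and not thread-safe. Unlike string.Intern, the pooled strings become collectible as soon as the pool itself is no longer referenced.

```csharp
using System;
using System.Collections.Generic;

public class StringPool
{
    // Maps each distinct string value to the single instance kept by the pool.
    private readonly Dictionary<string, string> _pool = new Dictionary<string, string>();

    public int Count => _pool.Count;

    // Returns the pooled instance with the same contents, adding `value` if it is new.
    public string GetOrAdd(string value)
    {
        if (_pool.TryGetValue(value, out string pooled))
        {
            return pooled;
        }

        _pool[value] = value;
        return value;
    }
}

public static class StringPoolDemo
{
    public static void Main()
    {
        var pool = new StringPool();

        // Two separate objects with equal contents.
        string first = new string('x', 3);
        string second = new string('x', 3);

        string pooledFirst = pool.GetOrAdd(first);
        string pooledSecond = pool.GetOrAdd(second);

        Console.WriteLine(object.ReferenceEquals(pooledFirst, pooledSecond));
    }
}
```

The lookup still hashes and compares the string, so the same rule applies as with string.Intern: it only pays off when each pooled string is compared many times afterwards.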

8. Avoid memory leaks

Memory leaks are a constant troublemaker in any big application. Besides the obvious danger of an eventual out-of-memory exception, memory leaks also cause GC Pressure and performance issues. Here’s how:

  • With a memory leak, objects remain referenced, even when they are effectively unused. While referenced, the garbage collector will keep promoting them to higher generations instead of collecting them. These promotions are expensive and add work for the GC.
  • Memory leaks cause more memory to be in use. This means you will run out of free space quicker, causing the GC to do more frequent collections.
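As one illustration of how an unused object can stay referenced, consider event subscriptions, a common source of leaks in .NET. The sketch below is my own example with hypothetical class names. A long-lived publisher keeps every subscriber alive through its event until the subscriber unsubscribes.

```csharp
using System;

public class Publisher
{
    public event EventHandler SomethingHappened;

    // Number of handlers the publisher currently keeps alive through the event.
    public int SubscriberCount => SomethingHappened?.GetInvocationList().Length ?? 0;
}

public class Subscriber : IDisposable
{
    private readonly Publisher _publisher;

    public Subscriber(Publisher publisher)
    {
        _publisher = publisher;
        // From here on, the publisher holds a reference to this instance.
        _publisher.SomethingHappened += OnSomethingHappened;
    }

    private void OnSomethingHappened(object sender, EventArgs e)
    {
    }

    public void Dispose()
    {
        // Without this line, the subscriber lives as long as the publisher does.
        _publisher.SomethingHappened -= OnSomethingHappened;
    }
}

public static class LeakDemo
{
    public static void Main()
    {
        var publisher = new Publisher();

        var subscriber = new Subscriber(publisher);
        Console.WriteLine(publisher.SubscriberCount);

        subscriber.Dispose();
        Console.WriteLine(publisher.SubscriberCount);
    }
}
```

If the subscriber is never disposed, it survives collection after collection and gets promoted, which is the extra GC work described in the first point above.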

Memory leaks are a huge subject. Here are 2 resources you can take advantage of to learn more:

  • 8 Ways You can Cause Memory Leaks in .NET
  • Find, Fix, and Avoid Memory Leaks in C# .NET: 8 Best Practices

Summary

I hope you got value from the mentioned tips and tricks. You probably noticed that all of the above optimizations make use of one or more of these core concepts:

  • Allocations should be avoided if possible.
  • Reusing memory is better than allocating new memory.
  • Allocating on the Stack is faster than allocating on the Heap.

These are not the only concepts in performance optimizations, but probably the most important ones when it comes to GC pressure.

Happy coding.
