The Art of Low-Level Memory: Mastering Span, Memory, and ref struct


This article introduces a powerful, modern C# toolkit for clearing memory traffic jams by writing allocation-free code. We will explore Span<T>, a type-safe “window” into existing memory that lets you parse and process data without creating copies. We’ll then cover its essential, heap-friendly counterpart, Memory<T>, which is crucial for asynchronous programming. Finally, we’ll dive into creating your own ref struct types to build custom, high-speed, stack-only utilities. Throughout this guide, we will use the practical context of our car rental application to demonstrate how these features can be used to optimize critical code paths, delivering a faster, more reliable experience for your users.


The Hidden Traffic Jam in Your Application

Imagine your car rental service during a peak holiday weekend. The website, which was once snappy, begins to slow down. Customers report that searching for available cars is sluggish, completing a booking takes forever, and sometimes the request times out entirely. Your first instinct might be to blame the database or a slow network connection. But often, the real culprit is more subtle: a hidden, internal traffic jam caused by the way your application manages memory.

To understand this jam, we need to look at how .NET handles memory. When you create a new object in your code—whether it’s a string, a List<T>, or a custom Car class—the runtime allocates a chunk of memory for it on a large memory area called the heap. The heap is incredibly flexible, but it has a finite amount of space. This is where the Garbage Collector (GC) comes in. The GC is .NET’s essential cleanup crew, periodically scanning the heap for objects that are no longer in use and reclaiming their memory.

Herein lies the problem. Every allocation, no matter how small, contributes to the “litter” on the heap. Consider a seemingly harmless operation, like generating a confirmation message for a rental booking:

// Inefficient way to build a string
public string GetBookingConfirmation(string customerName, string carModel, int days)
{
    // Each '+' operation can create a new string object on the heap
    string message = "Confirmation for " + customerName;
    message += ". You have rented a " + carModel;
    message += " for " + days + " days.";
    return message;
}

While this code works, each + operation can result in a new string being created on the heap. If this method is called hundreds of times per second, you are effectively littering the heap with thousands of temporary string objects. The more litter there is, the more frequently and aggressively the GC has to work. When the GC performs a collection, it can pause your application’s execution threads for a brief moment. These tiny pauses, or “janks,” accumulate, leading to the sluggishness your customers experience. This is the hidden traffic jam: not a single, massive roadblock, but a death-by-a-thousand-cuts from constant, small memory allocations.
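As a rough illustration of a lower-allocation alternative, the sketch below builds the same message in a single expression. The class name BookingMessages is hypothetical, and the exact lowering depends on the compiler and runtime version, but a single interpolated string typically produces just the final string rather than a chain of discarded intermediates.

```csharp
using System;

public static class BookingMessages
{
    // Hypothetical helper; same inputs as GetBookingConfirmation above.
    public static string GetBookingConfirmation(string customerName, string carModel, int days)
    {
        // One expression, one resulting string: no intermediate 'message'
        // values are created and then thrown away between statements.
        return $"Confirmation for {customerName}. You have rented a {carModel} for {days} days.";
    }
}
```

This does not remove the allocation of the result itself; it only avoids the temporary strings along the way.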

This is precisely the problem that modern, low-level C# features are designed to solve. This article introduces you to a powerful toolkit for writing high-performance, allocation-free code. We will explore Span<T>, a “window” into existing memory that lets you perform operations without creating copies. We’ll examine its heap-friendly counterpart, Memory<T>, which is essential for asynchronous programming. Finally, we’ll dive into creating your own ref struct types to build custom, high-speed utilities.

Throughout this guide, we will use the practical context of our car rental application to demonstrate how these tools can be used to parse complex data like a Vehicle Identification Number (VIN), efficiently process binary network data, and build web requests without creating a single piece of “garbage.” By the end, you’ll have the knowledge to identify and clear the memory traffic jams in your own applications, delivering a faster, more reliable experience for your users.

Span<T>: The High-Speed Lens for Your Data

Now that we understand the cost of heap allocations, we can introduce our first tool for fighting them: Span<T>. At its core, Span<T> is a memory-safe type that represents a contiguous sequence of arbitrary memory. The key concept to grasp is that a Span<T> is a view, not a copy. It acts as a lightweight “window” or “lens” that lets you look at a section of memory that already exists somewhere else—be it on the heap, the stack, or even in unmanaged memory. Think of it like using a magnifying glass to examine a portion of a large paper map. You are inspecting the details of a specific area without needing to cut that piece out and make a photocopy. This ability to operate on existing memory in-place is what gives Span<T> its power.

This power comes with a critical rule known as the “golden rule” of Span<T>. The type is defined as a ref struct, which imposes a strict limitation: it must only ever live on the execution stack. This means you cannot store a Span<T> as a field in a class or in an ordinary struct, since instances of those can end up on the heap; only another ref struct may hold one. It also means you cannot keep a Span<T> alive across an await boundary in an asynchronous method, capture it in a lambda, box it, or assign it to a variable of type object. The reason for this strictness is safety. If a Span<T> could live on the heap, it might outlive the actual memory it points to. This would create a “dangling pointer,” and trying to access it would lead to memory corruption and application crashes. By forcing Span<T> to be stack-only, the C# compiler can guarantee, for safe code, that a span never outlives the data it’s viewing.
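To make the rule concrete, here is a small sketch of what the compiler accepts and what it rejects. The type and member names (SpanRules, CountDigits, Holder) are made up for illustration; the commented-out lines are examples of code that fails to compile.

```csharp
using System;

public static class SpanRules
{
    // Allowed: a span as a parameter and inside a synchronous method.
    public static int CountDigits(ReadOnlySpan<char> text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (char.IsDigit(c)) count++;
        }
        return count;
    }

    // Not allowed (each line is rejected by the compiler if uncommented):
    //
    // class Holder { private Span<char> _view; }   // span as a field of a class
    // object boxed = "abc".AsSpan();                // boxing a span
    // Func<int> f = () => text.Length;              // capturing a span parameter in a lambda
}
```

Calling SpanRules.CountDigits("A1B22") works because a string converts implicitly to ReadOnlySpan<char>.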

To see this in action, let’s return to our car rental application. A common task is to parse a Vehicle Identification Number (VIN), a 17-character code. We need to extract specific parts: the World Manufacturer Identifier (first 3 characters), the Model Year (10th character), and the Plant Code (11th character).

Here is the traditional, inefficient way to do this using string.Substring():

public class VinParts
{
    public string WorldManufacturerId { get; init; }
    public string ModelYearCode { get; init; }
    public string PlantCode { get; init; }
}

public class InefficientVinParser
{
    // This method creates 3 new strings on the heap for every VIN processed.
    public VinParts Parse(string vin)
    {
        if (string.IsNullOrEmpty(vin) || vin.Length != 17)
        {
            throw new ArgumentException("Invalid VIN", nameof(vin));
        }

        // Each call to Substring allocates a new string object.
        var wmi = vin.Substring(0, 3);
        var year = vin.Substring(9, 1);
        var plant = vin.Substring(10, 1);

        return new VinParts { WorldManufacturerId = wmi, ModelYearCode = year, PlantCode = plant };
    }
}

In the code above, each call to Substring allocates a brand-new string on the heap. If your application processes thousands of VINs from a data feed, you are creating thousands of tiny, short-lived objects that the Garbage Collector must clean up, causing performance degradation.

Now, let’s refactor this using ReadOnlySpan<char> to achieve a zero-allocation parsing routine.

public class EfficientVinParser
{
    // The slicing below performs zero heap allocations.
    // Only building the final VinParts result allocates.
    public VinParts Parse(string vin)
    {
        if (string.IsNullOrEmpty(vin) || vin.Length != 17)
        {
            throw new ArgumentException("Invalid VIN", nameof(vin));
        }

        // A ReadOnlySpan<char> is a view over the existing string's memory. No copy is made.
        ReadOnlySpan<char> vinSpan = vin.AsSpan();

        // The Slice() method creates a new "view" without allocating any memory.
        // It simply adjusts the internal pointer and length.
        var wmiSlice = vinSpan.Slice(0, 3);
        var yearSlice = vinSpan.Slice(9, 1);
        var plantSlice = vinSpan.Slice(10, 1);

        // Allocations happen only here: the VinParts object and its three strings.
        return new VinParts
        {
            WorldManufacturerId = new string(wmiSlice),
            ModelYearCode = new string(yearSlice),
            PlantCode = new string(plantSlice)
        };
    }
}

In this version, vin.AsSpan() creates a ReadOnlySpan<char> that points directly to the memory of the original vin string. The crucial part is the Slice() method. Unlike Substring(), Slice() does not create a new object on the heap. It simply returns a new span with a different starting point and length, providing a new “view” into the same underlying memory. The slicing itself is performed entirely without allocations. Be honest about the totals, though: because VinParts exposes string properties, this method still allocates three strings plus the VinParts object at the end, the same count as the Substring version. The real savings appear when the caller can consume the slices directly, for example by comparing, validating, or parsing them in place, so that no strings need to be materialized at all. For a high-throughput data processing pipeline, that is where this approach dramatically reduces GC pressure.
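One way to get the full benefit is to never materialize the strings. The sketch below is an assumption-laden illustration rather than part of the parser above: VinView and IsMadeBy are hypothetical names, and the type exposes the same three VIN parts as views and chars so callers can inspect them without any heap allocation.

```csharp
using System;

// Hypothetical stack-only view over a VIN; nothing here is copied or allocated.
public readonly ref struct VinView
{
    private readonly ReadOnlySpan<char> _vin;

    public VinView(string vin)
    {
        if (string.IsNullOrEmpty(vin) || vin.Length != 17)
        {
            throw new ArgumentException("Invalid VIN", nameof(vin));
        }
        _vin = vin.AsSpan();
    }

    // A 3-character view into the original string.
    public ReadOnlySpan<char> WorldManufacturerId => _vin.Slice(0, 3);

    // Single characters are plain values, so no string is needed for them.
    public char ModelYearCode => _vin[9];
    public char PlantCode => _vin[10];

    // Compares the manufacturer prefix in place.
    public bool IsMadeBy(ReadOnlySpan<char> wmi) => WorldManufacturerId.SequenceEqual(wmi);
}
```

Because VinView is itself a ref struct, it inherits the stack-only rule: it suits filtering and validation inside a loop, while a result that must be stored still needs a regular type such as VinParts.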

Memory<T>: Your Heap-Friendly Travel Companion

While Span<T> is a phenomenal tool for synchronous, high-performance operations, its stack-only nature presents a significant challenge in modern C# development, which is dominated by asynchronous programming. What happens when you need to hold onto a slice of memory across an await call, or store it in a class field for later use? Since Span<T> cannot be placed on the heap, it simply cannot be used in these common scenarios.

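This is the exact problem that Memory<T> (and its read-only sibling, ReadOnlyMemory<T>) is designed to solve. Unlike Span<T>, Memory<T> is a standard struct, not a ref struct. This means it can be stored on the heap, making it the perfect “carrier” for a slice of memory that needs to survive longer than a single method’s execution frame. Note that Memory<T> is a handle, not an owner: it does not keep pooled or native memory alive by itself, so whoever allocated the underlying buffer remains responsible for its lifetime.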

The standard workflow is to use Memory<T> for storage and transport, and then acquire a short-lived Span<T> from it when you are ready to perform the actual high-performance processing. Memory<T> acts as the durable container, while Span<T> remains the high-speed processing tool.
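A minimal sketch of that division of labor, using a hypothetical PayloadBuffer class: the field is a Memory<byte>, which is allowed on the heap, and a span is borrowed only inside a synchronous call.

```csharp
using System;

// Hypothetical holder type for illustration.
public sealed class PayloadBuffer
{
    // Memory<byte> is an ordinary struct, so it can be a class field.
    private readonly Memory<byte> _storage;

    public PayloadBuffer(int size) => _storage = new byte[size];

    // Hand out a heap-friendly slice for storage or for passing to async code.
    public Memory<byte> GetChunk(int offset, int length) => _storage.Slice(offset, length);

    // Borrow a span only for the duration of this synchronous call.
    public void Fill(byte value) => _storage.Span.Fill(value);
}
```

The chunk returned by GetChunk views the same bytes as the internal storage, so changes made through one are visible through the other.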

Let’s illustrate this with a common scenario in our car rental application: a background service that receives a large binary payload containing thousands of booking records. The service needs to read each record, perform an asynchronous database lookup to validate the customer, and then parse the final details.

using System;
using System.Buffers.Binary;
using System.Threading.Tasks;

// Represents the data parsed from a single record
public record BookingRecord(int CustomerId, Guid CarId, DateTime StartDate);

// A mock database service
public class CustomerValidationService
{
    public async Task<bool> IsCustomerValidAsync(int customerId)
    {
        // Simulate a database call
        await Task.Delay(5);
        return true;
    }
}

public class BookingProcessor
{
    private readonly ReadOnlyMemory<byte> _batchData;
    private readonly CustomerValidationService _validator = new();

    public BookingProcessor(ReadOnlyMemory<byte> batchData)
    {
        _batchData = batchData;
    }

    public async Task ProcessBookingsAsync()
    {
        const int recordSize = 28; // 4 bytes for CustomerId, 16 for CarId, 8 for StartDate
        int offset = 0;

        while (offset + recordSize <= _batchData.Length)
        {
            // 1. Slice the MEMORY for one record. This is safe to use across await.
            ReadOnlyMemory<byte> recordMemory = _batchData.Slice(offset, recordSize);

            // Temporarily get a span to read the Customer ID for validation
            int customerId = BinaryPrimitives.ReadInt32LittleEndian(recordMemory.Span.Slice(0, 4));

            // 2. Perform an async operation. We are holding onto 'recordMemory', not a span.
            bool isValid = await _validator.IsCustomerValidAsync(customerId);

            if (isValid)
            {
                // 3. After the await, get a SPAN from the memory to do the final, fast parsing.
                // The span work lives in a synchronous helper because, before C# 13,
                // span locals cannot be declared inside an async method.
                BookingRecord booking = ParseRecord(recordMemory.Span, customerId);

                Console.WriteLine($"Processed booking for Customer {booking.CustomerId}");
            }

            offset += recordSize;
        }
    }

    // Synchronous, span-based parsing of the remaining fields of one record.
    private static BookingRecord ParseRecord(ReadOnlySpan<byte> recordSpan, int customerId)
    {
        Guid carId = new Guid(recordSpan.Slice(4, 16));
        long startDateTicks = BinaryPrimitives.ReadInt64LittleEndian(recordSpan.Slice(20, 8));
        return new BookingRecord(customerId, carId, new DateTime(startDateTicks));
    }
}

In this example, the BookingProcessor class safely stores the entire batch of data as a ReadOnlyMemory<byte> field. Inside the ProcessBookingsAsync method, we first slice the _batchData to get a ReadOnlyMemory<byte> representing a single record. We can then safely await the _validator.IsCustomerValidAsync call because recordMemory is heap-friendly. After the asynchronous operation completes, we obtain a ReadOnlySpan<byte> from recordMemory.Span to perform the final, fast, allocation-free parsing of the CarId and StartDate. This powerful combination allows us to maintain the performance benefits of Span<T> within the practical constraints of asynchronous code.

Slicing and Dicing: The Power of In-Place Processing

The true workhorse behind both Span<T> and Memory<T> is the .Slice() method. Understanding how it enables in-place processing is fundamental to mastering these types. As we’ve seen, slicing does not create a copy of the underlying data. Instead, it performs a simple and incredibly fast operation: it creates a new Span or Memory instance that points to the same underlying memory but with a different start offset and length. This is the essence of zero-allocation manipulation. You can dice up a large piece of data into countless smaller views without ever telling the Garbage Collector to clean up after you.
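The following sketch makes the "view, not copy" behavior observable. SliceDemo is a hypothetical name; the point is that a write through a slice lands in the original array, and that slicing a slice is still only offset and length arithmetic.

```csharp
using System;

public static class SliceDemo
{
    public static int[] Run()
    {
        int[] mileage = { 10, 20, 30, 40, 50 };

        // A view over elements 1..3 (values 20, 30, 40); no copy is made.
        Span<int> middle = mileage.AsSpan().Slice(1, 3);

        // Writing through the view changes the original array.
        middle[0] = 99;

        // Slicing a slice narrows the same view further.
        Span<int> last = middle.Slice(2);
        last[0] = -1;

        return mileage; // now 10, 99, 30, -1, 50
    }
}
```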

Let’s apply this to another common task in our car rental application: parsing a car’s features from a single, comma-separated string. On our website, we might want to check if a car has a specific feature, like “Sunroof,” to display a special icon next to its listing.

The conventional approach would be to use string.Split(','), which is convenient but highly inefficient for performance-critical code.

public class InefficientFeatureParser
{
    // This method allocates a new string array and a string for each feature.
    public bool HasFeature(string featuresCsv, string featureToFind)
    {
        // ALLOCATION: string.Split creates a new array and new strings for each item.
        string[] features = featuresCsv.Split(',');
        foreach (var feature in features)
        {
            if (feature == featureToFind)
            {
                return true;
            }
        }
        return false;
    }
}

This single line, featuresCsv.Split(','), allocates an entire array on the heap to hold the results, as well as a new string object for every single feature in the list. If you call this method for hundreds of cars on a search results page, the GC impact becomes significant.

We can eliminate all of these allocations by “consuming” the string with a ReadOnlySpan<char> and the Slice() method.

public class EfficientFeatureParser
{
    // This method performs ZERO allocations.
    public bool HasFeature(string featuresCsv, ReadOnlySpan<char> featureToFind)
    {
        ReadOnlySpan<char> remainingSpan = featuresCsv.AsSpan();

        while (remainingSpan.Length > 0)
        {
            int delimiterIndex = remainingSpan.IndexOf(',');

            // If no more commas, the slice is the rest of the span.
            // Otherwise, it's the part before the comma.
            ReadOnlySpan<char> currentFeatureSlice = (delimiterIndex == -1)
                ? remainingSpan
                : remainingSpan.Slice(0, delimiterIndex);

            // SequenceEqual performs an efficient, allocation-free comparison.
            if (currentFeatureSlice.SequenceEqual(featureToFind))
            {
                return true;
            }

            // If we're at the end, break.
            if (delimiterIndex == -1)
            {
                break;
            }

            // "Consume" the part we just processed by slicing the remainder.
            remainingSpan = remainingSpan.Slice(delimiterIndex + 1);
        }

        return false;
    }
}

This efficient implementation works like an advancing cursor. It starts with a span covering the entire string. In each iteration, it finds the next comma, slices the span to get a view of the current feature ("GPS", then "Leather Seats", etc.), and performs an allocation-free comparison with SequenceEqual. Crucially, it then updates the remainingSpan by slicing past the feature and the comma it just processed. This loop effectively walks through the original string’s memory, examining each part without ever creating new string objects or arrays on the heap. This is the power of in-place processing made possible by Slice().
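Real feature lists often contain spaces after the commas and inconsistent casing. As a hedged variation on the same cursor loop, the sketch below (FeatureSearch is a hypothetical name) trims each slice and compares case-insensitively, both of which work on spans without allocating.

```csharp
using System;

public static class FeatureSearch
{
    public static bool HasFeature(ReadOnlySpan<char> featuresCsv, ReadOnlySpan<char> featureToFind)
    {
        ReadOnlySpan<char> remaining = featuresCsv;

        while (!remaining.IsEmpty)
        {
            int comma = remaining.IndexOf(',');
            ReadOnlySpan<char> current = comma == -1 ? remaining : remaining.Slice(0, comma);

            // Trim() returns a narrower view; Equals compares without allocating.
            if (current.Trim().Equals(featureToFind, StringComparison.OrdinalIgnoreCase))
            {
                return true;
            }

            if (comma == -1) break;
            remaining = remaining.Slice(comma + 1);
        }

        return false;
    }
}
```

With the input "GPS, Leather Seats, Sunroof", a search for "sunroof" succeeds even though the stored value has a leading space and different casing.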

Interoperability: A Universal Language for Memory

One of the most profound benefits of Span<T> is its role as a great unifier. It provides a single, consistent API for working with various types of contiguous memory, breaking down the barriers that traditionally existed between them. Whether your data originates from a managed array, a simple string, or even a raw pointer from native code, Span<T> allows you to write one set of processing logic that handles them all. You can create a Span<T> from:

  • Arrays (T[]): The most common source.
  • Strings (string): Creates a ReadOnlySpan<char>.
  • Stack-allocated memory (stackalloc): For small, temporary buffers.
  • Unmanaged memory pointers (void*): The bridge to the native world.
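The safe sources in this list can be sketched in a few lines. SpanSources and its methods are hypothetical names; the unmanaged-pointer case is left out here because it needs an unsafe context.

```csharp
using System;

public static class SpanSources
{
    // One routine, written once against ReadOnlySpan<byte>.
    public static int Sum(ReadOnlySpan<byte> data)
    {
        int total = 0;
        foreach (byte b in data) total += b;
        return total;
    }

    public static int FromArray()
    {
        byte[] payload = { 1, 2, 3 };
        return Sum(payload);                // implicit conversion from byte[]
    }

    public static int FromStack()
    {
        Span<byte> scratch = stackalloc byte[3];
        scratch.Fill(2);
        return Sum(scratch);                // implicit Span<byte> to ReadOnlySpan<byte>
    }

    public static int FromString()
    {
        ReadOnlySpan<char> text = "SUV".AsSpan();   // strings yield ReadOnlySpan<char>
        return text.Length;
    }
}
```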

This unification drastically simplifies code that needs to be flexible about its data sources. In our car rental application, let’s consider a system that processes telematics data (like GPS location and speed). A modern vehicle in our fleet might send this data over the network as a standard, managed byte[]. However, an older vehicle might be equipped with a legacy C++ device that communicates via a P/Invoke call, providing its data as an unmanaged memory pointer (IntPtr).

Without Span<T>, you would need to write two separate processing paths, likely involving a Marshal.Copy to move the unmanaged data into a freshly allocated managed byte[] just so your C# code could work with it, which costs both an allocation and a copy. With Span<T>, this complexity vanishes.

using System;
using System.Runtime.InteropServices;
using System.Buffers.Binary;

public record TelematicsData(double Latitude, double Longitude, float SpeedKph);

public class TelematicsParser
{
    // This ONE method can parse data from any contiguous memory source.
    public TelematicsData Parse(ReadOnlySpan<byte> data)
    {
        if (data.Length < 20) // 8 bytes for lat, 8 for lon, 4 for speed
        {
            throw new ArgumentException("Data payload is too small.");
        }

        var latitude = BinaryPrimitives.ReadDoubleLittleEndian(data.Slice(0, 8));
        var longitude = BinaryPrimitives.ReadDoubleLittleEndian(data.Slice(8, 8));
        var speed = BinaryPrimitives.ReadSingleLittleEndian(data.Slice(16, 4));

        return new TelematicsData(latitude, longitude, speed);
    }
}

public class TelematicsIngestionService
{
    private readonly TelematicsParser _parser = new();

    // Scenario 1: Processing data from a modern .NET service
    public void ProcessManagedData(byte[] modernPayload)
    {
        Console.WriteLine("Processing data from managed array...");
        // Simply create a span from the array. No copies, no fuss.
        TelematicsData data = _parser.Parse(modernPayload);
        Console.WriteLine($"Received: Lat={data.Latitude}, Lon={data.Longitude}, Speed={data.SpeedKph} kph");
    }

    // Scenario 2: Processing data from a legacy C++ device via P/Invoke
    public void ProcessUnmanagedData(IntPtr legacyPayloadPtr, int payloadSize)
    {
        Console.WriteLine("Processing data from unmanaged C++ pointer...");

        // This requires an 'unsafe' context but is highly efficient.
        unsafe
        {
            // Create a span directly from the native pointer. No Marshal.Copy needed!
            var unmanagedSpan = new ReadOnlySpan<byte>(legacyPayloadPtr.ToPointer(), payloadSize);
            TelematicsData data = _parser.Parse(unmanagedSpan);
            Console.WriteLine($"Received: Lat={data.Latitude}, Lon={data.Longitude}, Speed={data.SpeedKph} kph");
        }
    }
}

In the TelematicsIngestionService, the Parse method is completely agnostic about where its data comes from. The ProcessManagedData method calls it by creating a span directly from a byte[]. The ProcessUnmanagedData method, operating within an unsafe context, creates a span directly from the IntPtr and the data size. The core parsing logic remains identical and efficient in both cases, and it stays memory-safe as long as the caller passes a valid pointer and length and keeps the native buffer alive while the span is in use. This demonstrates the power of Span<T> as a universal language for memory, enabling you to write cleaner, more reusable, and higher-performance code, especially when interoperating with the world outside the .NET runtime.

Advanced ref struct: Building Your Own High-Performance Tools

The true power of the low-level memory features in C# is realized when you move beyond just using Span<T> and start composing with its underlying technology: ref struct. You can create your own specialized, stack-only types to build complex, high-performance, and allocation-free helper utilities. This is how you encapsulate sophisticated, low-level logic into a safe and reusable API.

Let’s tackle a very common performance hotspot: building a URL with a dynamic query string. In our car rental app, the vehicle search page might have several optional filters. A typical approach using StringBuilder or string concatenation is convenient but results in intermediate allocations.

// Inefficient builder using StringBuilder
var sb = new StringBuilder("api/cars/search");
sb.Append("?type=SUV");
sb.Append("&color=red");
string url = sb.ToString(); // Multiple appends can cause re-allocations inside StringBuilder

We can do better by creating a zero-allocation query builder. Our builder will be a ref struct that writes directly into a character buffer allocated on the stack via stackalloc. Because the builder itself is a ref struct, it can never escape to the heap, and the C# compiler will enforce its safe usage.

using System;
using System.Diagnostics.CodeAnalysis;
using System.Globalization;

public ref struct QueryBuilder
{
    private Span<char> _buffer;
    private int _position;
    private bool _hasParams;

    public QueryBuilder(Span<char> initialBuffer)
    {
        _buffer = initialBuffer;
        _position = 0;
        _hasParams = false;
    }

    // Number of characters written into the buffer so far.
    public int Length => _position;

    // Returning 'ref this' allows for fluent method chaining. A struct method may only
    // return 'this' by reference when marked [UnscopedRef] (C# 11 / .NET 7 or later).
    [UnscopedRef]
    public ref QueryBuilder Append(ReadOnlySpan<char> name, ReadOnlySpan<char> value)
    {
        AppendName(name);
        value.CopyTo(_buffer.Slice(_position));
        _position += value.Length;

        return ref this;
    }

    // Overload for integer values to avoid an intermediate string
    [UnscopedRef]
    public ref QueryBuilder Append(ReadOnlySpan<char> name, int value)
    {
        AppendName(name);

        // TryFormat writes the integer directly into the span, allocation-free.
        // It must run AFTER "name=" is written so the digits are not overwritten.
        if (!value.TryFormat(_buffer.Slice(_position), out int charsWritten, default, CultureInfo.InvariantCulture))
        {
            throw new InvalidOperationException("The buffer is too small for the value.");
        }
        _position += charsWritten;

        return ref this;
    }

    // Writes '?' or '&' followed by "name=". Throws if the buffer is too small.
    private void AppendName(ReadOnlySpan<char> name)
    {
        _buffer[_position++] = _hasParams ? '&' : '?';
        _hasParams = true;

        name.CopyTo(_buffer.Slice(_position));
        _position += name.Length;
        _buffer[_position++] = '=';
    }

    // Materializes only the query-string portion as a new string.
    public override string ToString()
    {
        return new string(_buffer.Slice(0, _position));
    }
}

public class UrlGenerator
{
    private const string BasePath = "api/cars/search";

    public string BuildSearchUrl(string carType, string color, int? minSeats)
    {
        // Allocate a buffer on the stack. 256 chars is assumed to be enough;
        // longer input makes the builder throw instead of overflowing.
        Span<char> buffer = stackalloc char[256];

        // Copy the base path into our stack-allocated buffer.
        BasePath.AsSpan().CopyTo(buffer);

        // Create the builder, passing it the remaining part of the buffer.
        var qb = new QueryBuilder(buffer.Slice(BasePath.Length));

        if (!string.IsNullOrEmpty(carType))
        {
            qb.Append("type", carType);
        }
        if (!string.IsNullOrEmpty(color))
        {
            qb.Append("color", color);
        }
        if (minSeats.HasValue)
        {
            qb.Append("min-seats", minSeats.Value);
        }

        // The base path and the query string sit back to back in the same buffer,
        // so one string allocation produces the final URL.
        return new string(buffer.Slice(0, BasePath.Length + qb.Length));
    }
}

This QueryBuilder shows how far allocation-free design can go. We start by allocating a raw character buffer on the stack, which is very cheap. The QueryBuilder then works directly on this buffer: its Append methods write character data straight into the Span<char>, advancing a position counter. Notice the overload for int; by using TryFormat, we convert the integer to its character representation without allocating a temporary string. The ref return type on the Append methods is what enables the fluent, chainable syntax (qb.Append(...).Append(...)); note that a struct method can only return this by reference when it is marked [UnscopedRef], which requires C# 11 or later. Building the query string itself involves no heap allocation; the heap is touched only at the very end, when the finished contents of the buffer are turned into the final, immutable string. Two caveats apply: the buffer has a fixed size, so input that does not fit causes an exception rather than silent truncation, and the builder does no URL encoding, so values must already be safe to place in a query string. With those limits in mind, this pattern is valuable for performance-critical code that builds or formats text.
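The TryFormat step is worth isolating, because the order of writes matters: the label and the '=' go in first, and the number is formatted into whatever space remains. The helper below is a standalone sketch with hypothetical names (NumberFormatting, WritePair) that reports failure instead of throwing when the destination is too small.

```csharp
using System;
using System.Globalization;

public static class NumberFormatting
{
    // Writes "<label>=<value>" into destination.
    // Returns the number of chars written, or -1 if it does not fit.
    public static int WritePair(Span<char> destination, ReadOnlySpan<char> label, int value)
    {
        if (label.Length + 1 > destination.Length) return -1;

        label.CopyTo(destination);
        destination[label.Length] = '=';

        // Format the integer directly into the space after '='.
        Span<char> rest = destination.Slice(label.Length + 1);
        if (!value.TryFormat(rest, out int written, default, CultureInfo.InvariantCulture))
        {
            return -1;
        }

        return label.Length + 1 + written;
    }
}
```

A caller would typically pass a stackalloc buffer and turn the first N characters into a string only when the text is finally needed.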

When and How to Use These Tools

We have journeyed deep into the world of low-level memory management in C#, moving from the “why” of performance to the “how” of practical implementation. By now, the roles of the key players in this space should be clear.

  • Span<T> is your primary tool for high-speed, synchronous processing. It is the ultimate parser, the king of in-place modification, and your go-to choice for any performance-critical code that can do its work within a single synchronous call chain, whether the underlying data lives on the heap, on the stack, or in native memory.
  • Memory<T> is the essential, heap-friendly partner to Span<T>. It acts as the carrier, allowing you to safely store and transport slices of memory across asynchronous boundaries and in class fields, ready to be converted into a Span<T> when it’s time for processing.
  • ref struct is the enabling technology that makes it all possible. It’s the blueprint not only for Span<T> but for your own custom, allocation-free utilities, allowing you to build sophisticated and safe high-performance APIs.

However, with great power comes great responsibility. These tools are specialized instruments, not everyday hammers. It is crucial to resist the urge of premature optimization. Before you refactor your entire application to be allocation-free, you must profile first. Use a memory profiler, like the one built into Visual Studio or a third-party tool like dotMemory, to identify the true allocation “hotspots” in your application—the 1% of the code that is causing 99% of the GC pressure. Focus your efforts there. Applying these techniques to code that is not on a critical performance path can add complexity for little to no real-world benefit.
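For a quick first check before reaching for a full profiler, the runtime can report how many bytes the current thread has allocated. The helper below is a rough sketch with hypothetical names (AllocationProbe, Measure); it is a sanity check on a single code path, not a substitute for profiling the whole application.

```csharp
using System;

public static class AllocationProbe
{
    // Returns the number of bytes allocated on the current thread while 'action' runs.
    public static long Measure(Action action)
    {
        // Warm-up call so one-time costs (JIT, static initialization) are excluded.
        action();

        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Comparing the result for a Substring-based routine against its span-based rewrite gives a fast signal on whether a refactoring actually removed allocations. For rigorous numbers, a benchmarking library with memory diagnostics is the more reliable choice.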

Now it’s your turn. Find a small, tight loop in one of your projects. Look for a method that parses strings, processes byte arrays, or builds up complex text. Profile it, measure its allocations, and then refactor it using the techniques you’ve learned here. The first time you see the allocation count drop to zero and measure the tangible performance improvement, you’ll have mastered the art of clearing the hidden traffic jams in your code.
