Vinicius Quinafelex Alves

[C#] Prefetching async methods for performance

Prefetching is a technique that starts loading data before it is needed, reducing total runtime at the risk of loading unnecessary data.

In application code, the developer usually knows which data a method will need, so it is easier to decide when that data should be loaded.

In C#, async methods and tasks allow data to be fetched without interrupting the flow of the code. By prefetching, the algorithm can hide the latency of I/O operations by doing other work while waiting for the I/O result.

Below are some examples of prefetching with asynchrony:

Prefetching one result

Calling an async method starts its execution immediately: it runs synchronously until its first await on an incomplete operation, then returns a task to the caller without blocking the code flow. By holding a reference to that task, it is possible to await the data only when it is actually needed.

public static async Task<string> GetHtmlAsync(string uri)
{
    using (var client = new HttpClient())
        return await client.GetStringAsync(uri);
}

// Start prefetching
var taskHtml = GetHtmlAsync("https://domain.com");
CodeWithoutHtml();

// Await and consume the result
var html = await taskHtml;
CodeWithHtml(html);
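To make the effect visible without a network call, here is a minimal sketch that simulates both the fetch and the unrelated work with Task.Delay. PrefetchDemo, FetchAsync, RunSequentialAsync and RunPrefetchedAsync are hypothetical names used only for this illustration, and the delays are arbitrary.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

var sequential = await PrefetchDemo.RunSequentialAsync(fetchMs: 200, workMs: 200);
var prefetched = await PrefetchDemo.RunPrefetchedAsync(fetchMs: 200, workMs: 200);
Console.WriteLine($"sequential: {sequential.TotalMilliseconds:F0} ms");
Console.WriteLine($"prefetched: {prefetched.TotalMilliseconds:F0} ms");

public static class PrefetchDemo
{
    // Hypothetical stand-in for an I/O call such as GetHtmlAsync.
    public static async Task<string> FetchAsync(int delayMs)
    {
        await Task.Delay(delayMs);
        return "payload";
    }

    // Awaits the fetch first, then does the work that does not need the data.
    public static async Task<TimeSpan> RunSequentialAsync(int fetchMs, int workMs)
    {
        var watch = Stopwatch.StartNew();
        _ = await FetchAsync(fetchMs);
        await Task.Delay(workMs); // stands in for CodeWithoutHtml()
        return watch.Elapsed;
    }

    // Starts the fetch, does the unrelated work, and only then awaits the result.
    public static async Task<TimeSpan> RunPrefetchedAsync(int fetchMs, int workMs)
    {
        var watch = Stopwatch.StartNew();
        var fetchTask = FetchAsync(fetchMs); // fetch is now in flight
        await Task.Delay(workMs);            // stands in for CodeWithoutHtml()
        _ = await fetchTask;                 // usually already completed here
        return watch.Elapsed;
    }
}
```

With equal delays, the prefetched run should take roughly half as long as the sequential one, although exact timings depend on the machine and the timer resolution.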

Prefetching multiple results

Calling async methods in sequence without awaiting them is enough to fetch their data concurrently; the code flow is not interrupted until an await is executed.

// Start pre-fetching
var taskHtml1 = GetHtmlAsync("https://domain1.com");
var taskHtml2 = GetHtmlAsync("https://domain2.com");
var taskHtml3 = GetHtmlAsync("https://domain3.com");

// Await results
var html1 = await taskHtml1;
var html2 = await taskHtml2;
var html3 = await taskHtml3;
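When the number of requests is not fixed, the same idea can be expressed with Task.WhenAll, which returns the results in the same order as the tasks it was given. The sketch below is one possible shape, not the only one; MultiFetch, FetchAsync and FetchAllAsync are hypothetical names, and the fetch is simulated with Task.Delay.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

var pages = await MultiFetch.FetchAllAsync("domain1", "domain2", "domain3");
Console.WriteLine(string.Join(", ", pages));

public static class MultiFetch
{
    // Hypothetical stand-in for GetHtmlAsync; simulates I/O latency.
    public static async Task<string> FetchAsync(string name)
    {
        await Task.Delay(50);
        return $"<html>{name}</html>";
    }

    public static Task<string[]> FetchAllAsync(params string[] names)
    {
        // ToArray forces the lazy Select to run, so every fetch is started here,
        // before anything is awaited.
        var tasks = names.Select(name => FetchAsync(name)).ToArray();

        // Completes when all tasks have completed; results keep the input order.
        return Task.WhenAll(tasks);
    }
}
```

Compared with awaiting each task in turn, Task.WhenAll waits for every task to finish before surfacing a failure, so no task is left unobserved if an earlier one throws.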

Chaining prefetching

There are situations where an async method requires the result of another async method, creating a chain of fetches.

Tasks can be chained with ContinueWith(), which registers a delegate that starts executing as soon as the task completes, and Unwrap(), which exposes the inner task returned by the async delegate passed to ContinueWith. Since the continuation only runs after the task has completed, reading the .Result property inside it does not block the thread. Note that, by default, ContinueWith also runs when the task has faulted or been canceled, in which case .Result throws an AggregateException.

// Start pre-fetching
var taskUrl = RetrieveUrlAsync();

var taskStatusCode = taskUrl.ContinueWith(async (task) => 
{
    return await GetStatusCodeAsync(task.Result);
}).Unwrap();

var taskFavicon = taskUrl.ContinueWith(async (task) => 
{
    return await HasFaviconAsync(task.Result);
}).Unwrap();

// Await results
var statusCode = await taskStatusCode;
var hasFavicon = await taskFavicon;
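The same chain can also be written without ContinueWith and Unwrap, by wrapping each step in a small async method that awaits the shared task. This is an alternative sketch, not the approach used above; Chain and ThenAsync are hypothetical helpers, and the three fetch methods are simulated stand-ins for RetrieveUrlAsync, GetStatusCodeAsync and HasFaviconAsync.

```csharp
using System;
using System.Threading.Tasks;

// Start pre-fetching
var taskUrl = Chain.RetrieveUrlAsync();

// Both chains start right away and share the same URL task.
var taskStatusCode = Chain.ThenAsync(taskUrl, url => Chain.GetStatusCodeAsync(url));
var taskFavicon = Chain.ThenAsync(taskUrl, url => Chain.HasFaviconAsync(url));

// Await results
Console.WriteLine(await taskStatusCode);
Console.WriteLine(await taskFavicon);

public static class Chain
{
    // Awaits the source task, then starts the next fetch with its result.
    // Awaiting rethrows the original exception instead of an AggregateException.
    public static async Task<TOut> ThenAsync<TIn, TOut>(Task<TIn> source, Func<TIn, Task<TOut>> next)
    {
        var value = await source;
        return await next(value);
    }

    // Hypothetical stand-ins that simulate I/O latency.
    public static async Task<string> RetrieveUrlAsync()
    {
        await Task.Delay(50);
        return "https://domain.com";
    }

    public static async Task<int> GetStatusCodeAsync(string url)
    {
        await Task.Delay(50);
        return 200;
    }

    public static async Task<bool> HasFaviconAsync(string url)
    {
        await Task.Delay(50);
        return true;
    }
}
```

A task can be awaited more than once, so both chains can safely depend on the same taskUrl.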

Prefetching with IAsyncEnumerable

When using IAsyncEnumerable, it is also possible to prefetch the next result by driving the IAsyncEnumerator manually, as demonstrated by the following extension method. The benchmarks below were generated with BenchmarkDotNet.

public static async IAsyncEnumerable<T> WithPrefetch<T>(this IAsyncEnumerable<T> enumerable)
{
    await using(var enumerator = enumerable.GetAsyncEnumerator())
    {
        ValueTask<bool> hasNextTask = enumerator.MoveNextAsync();

        while(await hasNextTask)
        {
            T data = enumerator.Current;
            hasNextTask = enumerator.MoveNextAsync();
            yield return data;
        }
    }
}

// Prefetching 1 item
await foreach(var item in EnumerateAsync().WithPrefetch())
    Process(item);
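To try the extension method without a real data source, the sketch below feeds it a simulated sequence and compares elapsed times with and without prefetch. Demo, EnumerateAsync and MeasureAsync are hypothetical names for this illustration; WithPrefetch is the same method shown above, repeated only so the sketch compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

var plain = await Demo.MeasureAsync(Demo.EnumerateAsync(5, 50), processMs: 50);
var prefetched = await Demo.MeasureAsync(Demo.EnumerateAsync(5, 50).WithPrefetch(), processMs: 50);
Console.WriteLine($"no prefetch: {plain.TotalMilliseconds:F0} ms");
Console.WriteLine($"prefetch:    {prefetched.TotalMilliseconds:F0} ms");

public static class Demo
{
    // Hypothetical source: yields 0..count-1, waiting fetchMs before each item.
    public static async IAsyncEnumerable<int> EnumerateAsync(int count, int fetchMs)
    {
        for (var i = 0; i < count; i++)
        {
            await Task.Delay(fetchMs); // simulated fetch
            yield return i;
        }
    }

    // Consumes the sequence, spending processMs on each item; returns the elapsed time.
    public static async Task<TimeSpan> MeasureAsync(IAsyncEnumerable<int> source, int processMs)
    {
        var watch = Stopwatch.StartNew();
        await foreach (var _ in source)
            await Task.Delay(processMs); // simulated processing
        return watch.Elapsed;
    }

    // Same extension method as in the article.
    public static async IAsyncEnumerable<T> WithPrefetch<T>(this IAsyncEnumerable<T> enumerable)
    {
        await using (var enumerator = enumerable.GetAsyncEnumerator())
        {
            ValueTask<bool> hasNextTask = enumerator.MoveNextAsync();

            while (await hasNextTask)
            {
                T data = enumerator.Current;
                hasNextTask = enumerator.MoveNextAsync(); // next fetch starts before the item is processed
                yield return data;
            }
        }
    }
}
```

The item count and delays are arbitrary; the point is only that the prefetched run overlaps each fetch with the processing of the previous item.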

Prefetching yields a significant performance improvement when fetch time and processing time are close. Prefetching data that is already in memory only adds overhead, with no gain.

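| Fetch | Process | Execution time (no-prefetch) | Execution time (prefetch) | Improvement |
|---|---|---|---|---|
| 0 ms | 0 ms | 0.0006 ms | 0.0013 ms | -116% |
| 20 ms | 20 ms | 964 ms | 497 ms | 93% |
| 100 ms | 20 ms | 2117 ms | 1675 ms | 26% |
| 20 ms | 100 ms | 2118 ms | 1674 ms | 26% |
| 200 ms | 20 ms | 3540 ms | 3099 ms | 14% |
| 20 ms | 200 ms | 3572 ms | 3093 ms | 15% |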
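The shape of these numbers can be explained with a rough timing model. It is an idealized sketch, not the formula used by the benchmark: it assumes N items, a constant fetch time F and a constant processing time P per item, a source that fetches one item at a time, and no scheduling overhead. The symbols N, F and P are introduced here only for illustration.

```latex
\begin{align}
T_{\text{no-prefetch}} &\approx N\,(F + P) \\
T_{\text{prefetch}} &\approx F + (N - 1)\,\max(F, P) + P \\
\frac{T_{\text{no-prefetch}}}{T_{\text{prefetch}}} &\xrightarrow{\;N \to \infty\;} \frac{F + P}{\max(F, P)} \le 2
\end{align}
```

Under these assumptions the ratio is largest, close to 2, when F and P are equal, and it moves toward 1 as one of the two dominates. When either F or P is zero, both expressions are equal and prefetching brings no gain. Measured values also include timer and scheduling overhead, so they will not match the model exactly.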

There is no benefit in prefetching more items when only a single item is processed at a time. Below is code to prefetch more than a single item:

public static IAsyncEnumerable<T> WithPrefetch<T>(this IAsyncEnumerable<T> enumerable, int prefetchDepth)
{
    while(prefetchDepth > 0)
    { 
        enumerable = enumerable.WithPrefetch();
        prefetchDepth--;
    }

    return enumerable;
}

// Prefetching 10 items
await foreach(var item in EnumerateAsync().WithPrefetch(10))
    Process(item);

Benchmarks demonstrate that prefetching more than one item causes a roughly linear increase in processing time as the prefetch depth grows.

The reference method sums 100 numbers from an enumerator, with the data loaded synchronously, asynchronously, or asynchronously with prefetch. 1 ms = 1,000,000 ns.

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| Sync | 132.4 ns | 0.53 ns | 0.44 ns | - | - |
| AsyncWithoutPrefetch | 5,062.2 ns | 49.27 ns | 46.09 ns | 0.0381 | 168 B |
| AsyncWithPrefetch_01_Record | 8,872.0 ns | 175.48 ns | 195.05 ns | 0.0763 | 344 B |
| AsyncWithPrefetch_02_Records | 15,175.8 ns | 260.77 ns | 243.93 ns | 0.1221 | 520 B |
| AsyncWithPrefetch_04_Records | 21,440.6 ns | 143.50 ns | 127.21 ns | 0.1831 | 872 B |
| AsyncWithPrefetch_08_Records | 37,753.9 ns | 251.59 ns | 196.43 ns | 0.3662 | 1576 B |
| AsyncWithPrefetch_16_Records | 78,780.6 ns | 944.58 ns | 837.35 ns | 0.6104 | 2984 B |
| AsyncWithPrefetch_32_Records | 159,310.1 ns | 2,197.14 ns | 2,055.21 ns | 1.2207 | 5800 B |

Changing only the processing time or only the fetching time, while the other stays at zero, does not affect the execution time of the function, regardless of the configured prefetch depth.

| Method | Mean | Error | StdDev | Gen0 | Allocated |
|---|---|---|---|---|---|
| Fetch0ms_Process20ms_Prefetch0 | 468.0 ms | 9.0 ms | 7.5 ms | - | 16608 B |
| Fetch0ms_Process20ms_Prefetch1 | 466.0 ms | 2.2 ms | 2.0 ms | - | 16784 B |
| Fetch0ms_Process20ms_Prefetch2 | 470.8 ms | 7.9 ms | 7.0 ms | - | 10992 B |
| Fetch0ms_Process20ms_Prefetch10 | 465.8 ms | 4.2 ms | 3.3 ms | - | 18376 B |
| Fetch20ms_Process0ms_Prefetch0 | 469.9 ms | 5.8 ms | 5.4 ms | - | 17072 B |
| Fetch20ms_Process0ms_Prefetch1 | 466.4 ms | 3.2 ms | 2.8 ms | - | 17664 B |
| Fetch20ms_Process0ms_Prefetch2 | 466.3 ms | 3.0 ms | 2.5 ms | - | 17976 B |
| Fetch20ms_Process0ms_Prefetch10 | 466.0 ms | 2.6 ms | 2.3 ms | - | 20840 B |

Increasing the prefetch depth also does not improve performance, even when processing time and fetching time favor parallelism.

| Method | Mean | Improvement to non-prefetch |
|---|---|---|
| Fetch100ms_Process20ms_Prefetch0 | 2.117 s | N/A |
| Fetch100ms_Process20ms_Prefetch1 | 1.675 s | 26.39% |
| Fetch100ms_Process20ms_Prefetch2 | 1.681 s | 25.94% |
| Fetch100ms_Process20ms_Prefetch10 | 1.675 s | 26.39% |

| Method | Mean | Improvement to non-prefetch |
|---|---|---|
| Fetch200ms_Process20ms_Prefetch0 | 3.540 s | N/A |
| Fetch200ms_Process20ms_Prefetch1 | 3.099 s | 14.23% |
| Fetch200ms_Process20ms_Prefetch2 | 3.096 s | 15.35% |
| Fetch200ms_Process20ms_Prefetch10 | 3.099 s | 14.23% |

| Method | Mean | Improvement to non-prefetch |
|---|---|---|
| Fetch20ms_Process100ms_Prefetch0 | 2.118 s | N/A |
| Fetch20ms_Process100ms_Prefetch1 | 1.674 s | 26.52% |
| Fetch20ms_Process100ms_Prefetch2 | 1.680 s | 26.07% |
| Fetch20ms_Process100ms_Prefetch10 | 1.675 s | 26.45% |

| Method | Mean | Improvement to non-prefetch |
|---|---|---|
| Fetch20ms_Process200ms_Prefetch0 | 3.527 s | N/A |
| Fetch20ms_Process200ms_Prefetch1 | 3.093 s | 14.03% |
| Fetch20ms_Process200ms_Prefetch2 | 3.096 s | 13.92% |
| Fetch20ms_Process200ms_Prefetch10 | 3.091 s | 14.11% |