1. Performance Is an Architecture Decision
I stopped thinking about API performance as a "tuning phase" years ago. In production systems, performance is a structural property. If you bolt it on later, you're usually compensating for architectural mistakes.
In 2026, with .NET 9 and modern containerized deployments, high-performance APIs built on ASP.NET Core are less about micro-optimizations and more about:
- Controlling allocations
- Designing async boundaries correctly
- Eliminating redundant I/O
- Being deliberate about middleware
A typical high-throughput API entry point today looks like this:
var builder = WebApplication.CreateSlimBuilder(args);

builder.Services
    .AddControllers()
    .AddJsonOptions(o =>
    {
        o.JsonSerializerOptions.PropertyNamingPolicy = null;
        o.JsonSerializerOptions.DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull;
    });

builder.Services.AddMemoryCache();

builder.Services.AddStackExchangeRedisCache(options =>
{
    options.Configuration = builder.Configuration.GetConnectionString("Redis");
});

builder.Services.AddDbContextPool<AppDbContext>(options =>
{
    options.UseNpgsql(builder.Configuration.GetConnectionString("Postgres"),
        o => o.EnableRetryOnFailure());
});

var app = builder.Build();
app.MapControllers();
app.Run();
Slim builder. Pooled DbContext. Explicit serializer configuration. No unnecessary services. This is not stylistic — it directly impacts startup time and runtime allocations.
2. Aggressive In-Memory Caching Where It Actually Matters
The first performance win in most APIs is not async, not concurrency — it's caching read-heavy endpoints properly.
I use IMemoryCache for hot, frequently accessed reference data with tight latency requirements:
[ApiController]
[Route("api/products")]
public class ProductsController : ControllerBase
{
    private readonly IMemoryCache _cache;
    private readonly AppDbContext _db;

    public ProductsController(IMemoryCache cache, AppDbContext db)
    {
        _cache = cache;
        _db = db;
    }

    [HttpGet("{id:guid}")]
    public async Task<IActionResult> GetProduct(Guid id)
    {
        var cacheKey = $"product:{id}";

        if (_cache.TryGetValue(cacheKey, out ProductDto cached))
            return Ok(cached);

        var product = await _db.Products
            .AsNoTracking()
            .Where(p => p.Id == id)
            .Select(p => new ProductDto
            {
                Id = p.Id,
                Name = p.Name,
                Price = p.Price
            })
            .FirstOrDefaultAsync();

        if (product == null)
            return NotFound();

        _cache.Set(cacheKey, product, new MemoryCacheEntryOptions
        {
            AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5),
            Size = 1
        });

        return Ok(product);
    }
}
Key decisions:
- AsNoTracking() reduces change tracker overhead.
- Projection avoids materializing full entities.
- DTO cached, not entity.
- Expiration is explicit, not implicit.
If you don't set size limits and expiration policies intentionally, IMemoryCache becomes a silent memory leak under load.
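To make that concrete, IMemoryCache can be registered with an explicit size budget; a minimal sketch (the 10_000 budget and 20% compaction are illustrative values, tune per workload):

```csharp
// Registering IMemoryCache with a size budget. SizeLimit is unitless:
// it counts the Size you assign per entry, so with Size = 1 per entry
// (as in the controller above) the limit is effectively an entry count.
builder.Services.AddMemoryCache(options =>
{
    options.SizeLimit = 10_000;         // illustrative budget
    options.CompactionPercentage = 0.2; // evict 20% of entries when over budget
});
```

Once SizeLimit is set, adding an entry without a Size throws, which is the loud failure you want instead of silent memory growth.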
3. Distributed Caching Without Serialization Waste
For horizontally scaled APIs, a local memory cache is insufficient. I prefer Redis, storing values as raw UTF-8 JSON bytes to keep payloads small.
public class RedisProductCache
{
    private readonly IDistributedCache _cache;
    private static readonly JsonSerializerOptions _options =
        new(JsonSerializerDefaults.Web);

    public RedisProductCache(IDistributedCache cache)
    {
        _cache = cache;
    }

    public async Task<ProductDto?> GetAsync(Guid id)
    {
        var bytes = await _cache.GetAsync($"product:{id}");
        if (bytes == null) return null;
        return JsonSerializer.Deserialize<ProductDto>(bytes, _options);
    }

    public async Task SetAsync(Guid id, ProductDto dto)
    {
        var bytes = JsonSerializer.SerializeToUtf8Bytes(dto, _options);
        await _cache.SetAsync($"product:{id}", bytes,
            new DistributedCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10)
            });
    }
}
Why binary bytes instead of string?
- Avoids double encoding.
- Reduces allocations.
- Less pressure on GC under high RPS.
In production, I also wrap this with a two-layer cache (memory first, then Redis) to reduce network round-trips.
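A minimal sketch of that two-layer lookup, reusing the RedisProductCache above (the HybridProductCache name and the short L1 TTL are my choices for illustration, not a library API):

```csharp
// Two-layer read-through cache: process-local memory first (no network hop),
// then the shared Redis cache. Database fallback is left to the caller.
public class HybridProductCache
{
    private readonly IMemoryCache _memory;
    private readonly RedisProductCache _redis;

    public HybridProductCache(IMemoryCache memory, RedisProductCache redis)
    {
        _memory = memory;
        _redis = redis;
    }

    public async Task<ProductDto?> GetAsync(Guid id)
    {
        // L1: in-process, nanoseconds, per-node
        if (_memory.TryGetValue(id, out ProductDto? hit))
            return hit;

        // L2: Redis, shared across nodes
        var dto = await _redis.GetAsync(id);
        if (dto is not null)
        {
            // Short L1 TTL so individual nodes don't serve stale data for long
            _memory.Set(id, dto, new MemoryCacheEntryOptions
            {
                AbsoluteExpirationRelativeToNow = TimeSpan.FromSeconds(30),
                Size = 1
            });
        }
        return dto;
    }
}
```

Only L1 misses pay the Redis round-trip, and only L2 misses reach the database, which is what keeps tail latency flat as you add nodes.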
4. Middleware Pipeline: Less Is Faster
Most APIs are slow because the middleware pipeline is bloated.
In ASP.NET Core, every middleware adds a delegate hop. It's small — until it's not.
A lean pipeline looks like this:
app.UseRouting();

app.Use(async (context, next) =>
{
    context.Response.Headers.Append("X-Trace-Id",
        Activity.Current?.Id ?? Guid.NewGuid().ToString());
    await next();
});

app.UseAuthentication();
app.UseAuthorization();

app.MapControllers();
What I avoid:
- Redundant exception middleware when UseExceptionHandler is enough
- Logging middleware that logs entire request bodies
- Synchronous I/O inside custom middleware
If you must write middleware, avoid capturing large closures:
public class CorrelationMiddleware
{
    private readonly RequestDelegate _next;

    public CorrelationMiddleware(RequestDelegate next)
    {
        _next = next;
    }

    public async Task Invoke(HttpContext context)
    {
        context.TraceIdentifier = Guid.NewGuid().ToString();
        await _next(context);
    }
}
Stateless. No unnecessary service resolution per request.
5. Async Correctness Under Load
Async isn't about speed. It's about freeing threads.
A common mistake is wrapping synchronous CPU work in Task.Run() inside controllers. That just shifts load to the thread pool.
Correct async boundary:
[HttpPost]
public async Task<IActionResult> CreateOrder([FromBody] CreateOrderRequest request)
{
    var order = new Order
    {
        Id = Guid.NewGuid(),
        CustomerId = request.CustomerId,
        CreatedAt = DateTime.UtcNow
    };

    _db.Orders.Add(order);
    await _db.SaveChangesAsync();

    await _messageBus.PublishAsync(new OrderCreatedEvent(order.Id));

    return Accepted(new { order.Id });
}
Everything I/O-bound is awaited. No blocking calls. No .Result.
Under high concurrency, this prevents thread pool starvation.
6. Streaming Responses Instead of Buffering
For large datasets, buffering everything in memory is a mistake.
In modern .NET, I stream results:
[HttpGet("stream")]
public async IAsyncEnumerable<ProductDto> StreamProducts(
    [EnumeratorCancellation] CancellationToken cancellationToken)
{
    await foreach (var product in _db.Products
        .AsNoTracking()
        .AsAsyncEnumerable()
        .WithCancellation(cancellationToken))
    {
        yield return new ProductDto
        {
            Id = product.Id,
            Name = product.Name,
            Price = product.Price
        };
    }
}
This:
- Reduces peak memory usage
- Improves perceived latency
- Handles backpressure correctly
Under load testing, this cut memory consumption by more than 40% compared to materializing a list.
7. Minimizing Allocation Hotspots
Profiling with dotnet-counters and PerfView consistently shows that allocation pressure, not CPU, is often the bottleneck.
Common improvements:
Avoid string concatenation in hot paths:
var cacheKey = string.Create(44, productId, (span, id) =>
{
    "product:".AsSpan().CopyTo(span);
    id.TryFormat(span[8..], out _); // Guid "D" format: 36 chars, no intermediate string
});
Note the length: 8 chars of prefix plus 36 for the Guid, and TryFormat writes directly into the span instead of allocating with ToString().
Reuse JsonSerializerOptions instead of recreating per request.
Use ValueTask for frequently synchronous paths:
public ValueTask<ProductDto?> TryGetFromCacheAsync(Guid id)
{
    if (_memory.TryGetValue(id, out ProductDto dto))
        return ValueTask.FromResult<ProductDto?>(dto);

    return new ValueTask<ProductDto?>(LoadFromRedisAsync(id));
}
These changes look small, but at 20k+ RPS, they are not small.
8. Database Access Patterns That Don't Collapse Under Load
ORM misuse kills performance.
With Entity Framework Core, I follow strict rules:
- Always project
- Never load navigation properties blindly
- Use indexes intentionally
- Avoid chatty queries
Example:
var orders = await _db.Orders
    .Where(o => o.CustomerId == customerId)
    .OrderByDescending(o => o.CreatedAt)
    .Take(50)
    .Select(o => new OrderSummaryDto
    {
        Id = o.Id,
        Total = o.TotalAmount,
        Status = o.Status
    })
    .ToListAsync();
No Include(). No entity graph loading. Only what the API returns.
For write-heavy systems, I also batch operations:
await _db.Database.ExecuteSqlRawAsync("""
    UPDATE orders
    SET status = 'Expired'
    WHERE status = 'Pending'
      AND created_at < NOW() - INTERVAL '30 minutes'
    """);
Sometimes raw SQL is the right decision.
9. Observability Without Performance Penalty
Logging everything is not observability. It's noise.
Instead, I rely on structured logging and sampling:
builder.Logging.AddJsonConsole();

app.Use(async (context, next) =>
{
    var sw = Stopwatch.StartNew();
    await next();
    sw.Stop();

    if (sw.ElapsedMilliseconds > 500)
    {
        context.RequestServices
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("SlowRequests")
            .LogWarning("Slow request {Path} took {Elapsed}ms",
                context.Request.Path,
                sw.ElapsedMilliseconds);
    }
});
Only slow requests are flagged.
For tracing, I integrate OpenTelemetry carefully to avoid over-instrumentation. Exporting every span in high-throughput APIs is expensive.
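A sketch of what "carefully" means in practice: head-based ratio sampling so only a fraction of traces is exported. The 10% ratio and the OTLP exporter are assumptions for illustration; this relies on the OpenTelemetry.Extensions.Hosting and OpenTelemetry.Instrumentation.AspNetCore packages.

```csharp
// Record ~10% of root traces; ParentBasedSampler makes child spans
// follow the root's decision so sampled traces stay complete.
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing => tracing
        .AddAspNetCoreInstrumentation()
        .SetSampler(new ParentBasedSampler(new TraceIdRatioBasedSampler(0.10)))
        .AddOtlpExporter());
```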
Final Thoughts
High-performance APIs in 2026 are not about clever tricks. They're about discipline:
- Lean middleware
- Intentional caching
- Correct async boundaries
- Allocation awareness
- Projection over entity loading
- Streaming instead of buffering
The real difference shows under load, not in local testing.
Every optimization I described came from production failures — thread starvation, memory pressure, cache storms, slow queries.
Performance is not an afterthought in ASP.NET Core. It's a design constraint. And if you treat it that way from day one, your APIs will scale without drama.