Over the last few weeks, while running sustained load tests for Virto Commerce on .NET 10, we made an important performance observation that is easy to misinterpret if you rely only on Azure Application Insights (AI) metrics.
When CPU utilization goes above ~85–90%, Application Insights Request and Dependency durations become misleading.
This post explains why this happens, how we reproduced it, and how to correctly analyze performance under CPU pressure.
The symptom we observed
During load testing:
- At normal CPU levels (≤70%):
  - Request duration metrics are stable
  - Dependency durations (SQL, HTTP, etc.) reflect reality
- At high CPU load (≥85–90%):
  - “Holes” (gaps) appear between requests
  - The same requests suddenly show longer durations
  - Application Insights reports inflated Dependency durations
Yet:
- No DB slowdown
- No network issues
- No change in request logic
[Screenshot: Count of Requests + Duration in AI. Highlight the moments CPU crosses ~85% and ~95%, and the green zone below 85%.]
[Screenshot: Count of DB Dependencies + Duration in AI. Highlight the moments CPU crosses ~85% and ~95%, and the green zone below 85%.]
[Screenshot: CPU usage over time + Request Duration. Highlight the moment CPU crosses ~85% and latency spikes (red and orange lines).]
Why this happens (the important part)
This is not a code or dependency (database) problem.
It is a scheduler and thread-starvation problem.
When CPU is saturated:
- The ASP.NET thread pool cannot schedule work immediately
- Async continuations are delayed
- Dependency calls wait before they even start
- Application Insights measures:
  - Wall-clock time
  - Not actual execution time
So AI reports:
“Dependency took 120 ms”
But in reality:
- 110 ms was waiting for CPU
- 10 ms was real dependency execution
AI cannot distinguish between these two.
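The measurement gap is easy to reproduce with a toy example. The sketch below uses Python for brevity (the mechanics are the same for the .NET thread pool): a fast "dependency call" is submitted to a pool whose single worker is already busy, and the wall-clock timer, like AI, charges the queue wait to the dependency.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def dependency_call():
    """Stands in for a fast SQL/HTTP call: ~10 ms of real work."""
    start = time.perf_counter()
    time.sleep(0.010)
    return time.perf_counter() - start  # actual execution time

# A single-worker pool whose worker is busy models a starved thread pool.
with ThreadPoolExecutor(max_workers=1) as pool:
    pool.submit(time.sleep, 0.110)          # ~110 ms of "CPU pressure"
    t0 = time.perf_counter()
    future = pool.submit(dependency_call)   # queued behind the busy worker
    actual = future.result()
    wall = time.perf_counter() - t0         # what AI-style telemetry records

# On an idle machine this prints roughly 120 ms wall-clock vs 10 ms actual:
# the telemetry charges the ~110 ms queue wait to the dependency.
print(f"wall-clock: {wall*1000:.0f} ms, actual execution: {actual*1000:.0f} ms")
```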
Key misconception to avoid
- “The database became slow”
- “External service latency increased”
In our case:
- DB execution time stayed stable
- External services responded normally
- Only CPU pressure changed
Why the same request behaves differently
We validated this by running the same load profile with more CPU (Virto Commerce is a scalable solution):
- Increased instance count
Result:
- Latency disappeared
- “Holes” in request execution vanished
- AI dependency durations returned to normal
Same code. Same data. Same queries. Different CPU headroom.
What Application Insights is actually telling you
Application Insights is not lying — it’s just incomplete.
AI measures:
- When an operation starts
- When it finishes
It does not know:
- How long the thread waited for CPU
- How long async continuations were queued
- How much time was lost to scheduler contention
At high CPU:
AI durations = execution time + scheduling delay
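You can make the hidden scheduling-delay term visible by probing it directly. In .NET, `ThreadPool.PendingWorkItemCount` exposes the queue; the Python sketch below illustrates the same idea with a hypothetical probe that times how long a trivial work item sits in the queue before a worker picks it up.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def scheduling_delay(pool: ThreadPoolExecutor) -> float:
    """How long does a trivial work item wait before a worker runs it?
    Under saturation this grows, and that growth is exactly the term
    that inflates wall-clock telemetry."""
    submitted = time.perf_counter()
    started = pool.submit(time.perf_counter).result()
    return max(0.0, started - submitted)

pool = ThreadPoolExecutor(max_workers=2)
idle_delay = scheduling_delay(pool)    # workers free: near zero

# Occupy both workers to simulate saturation, then probe again.
for _ in range(2):
    pool.submit(time.sleep, 0.05)
busy_delay = scheduling_delay(pool)    # queued ~50 ms behind busy workers
pool.shutdown()

print(f"idle: {idle_delay*1000:.1f} ms, saturated: {busy_delay*1000:.1f} ms")
```

The saturated probe reports tens of milliseconds of pure queueing even though the work item itself does nothing.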
Correct way to analyze performance under load
Always correlate metrics
Never analyze in isolation:
- CPU utilization
- Request duration
- Dependency duration
- Thread pool metrics
- GC metrics
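In practice, correlating means joining the metric streams on time. A minimal sketch (the sample values and the `correlate` helper are illustrative, not from a real trace) that reads every latency spike in its CPU context:

```python
# Hypothetical per-minute samples from three separate metric streams.
cpu = {0: 55, 1: 70, 2: 88, 3: 96, 4: 60}                 # % utilization
request_p95 = {0: 120, 1: 130, 2: 310, 3: 900, 4: 125}    # ms
dependency_p95 = {0: 8, 1: 9, 2: 95, 3: 400, 4: 8}        # ms

def correlate(cpu, request_p95, dependency_p95, danger=85):
    """Join the streams on timestamp so a latency spike is never
    interpreted without its CPU context."""
    rows = []
    for t in sorted(cpu):
        rows.append({
            "t": t,
            "cpu": cpu[t],
            "request_p95": request_p95[t],
            "dependency_p95": dependency_p95[t],
            # Above the danger threshold, durations include scheduling
            # delay and must not be read as dependency slowness.
            "durations_trustworthy": cpu[t] < danger,
        })
    return rows

for row in correlate(cpu, request_p95, dependency_p95):
    print(row)
```

In this sample the dependency p95 "jumps" from 9 ms to 400 ms exactly when CPU crosses 85%, which is the pattern from our load tests: the spike is scheduling delay, not a slower database.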
Treat ≥85% CPU as a danger zone and configure autoscaling
In .NET services:
- 85–90% CPU = unstable latency
- 90%+ CPU = unreliable telemetry
Even if:
- No errors
- No exceptions
- No DB alerts
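These thresholds can be encoded directly into alerting or autoscale logic. A minimal sketch, using the thresholds from this article (the zone names are made up for illustration):

```python
def cpu_zone(cpu_percent: float) -> str:
    """Map CPU utilization to the trust zones described above.
    Thresholds follow the article; tune them for your workload."""
    if cpu_percent < 85:
        return "green"             # latency and telemetry are stable
    if cpu_percent < 90:
        return "unstable-latency"  # durations start to wobble
    return "unreliable-telemetry"  # durations include heavy scheduling delay

print(cpu_zone(70), cpu_zone(87), cpu_zone(95))
```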
Scale first, optimize second
I recommend stabilizing the system first: add instances or allocate more CPU.
Yes, infrastructure cost increases, but latency normalizes and the metrics become trustworthy again.
Only after CPU is stabilized do profiling and optimization make sense.
Final thought
If your CPU is overloaded, your metrics are overloaded too.
- Always include CPU context in performance analysis
- Never optimize based on AI durations alone
- Treat CPU saturation as a first-class failure mode
- Use load tests + profiling to identify real bottlenecks
Measure wisely.


