The Smoothness Delta: A Forensic Engineer's Treatise on Frametime Statistics and Perceptual Fidelity

February 26, 2026 | By Assurd Engineering Lab

The Smoothness Delta: Why Average FPS Is a Statistically Dishonest Metric

"An arithmetic mean, applied to a non-normal distribution, tells you almost nothing about the tails — and in GPU performance analysis, the tails are where hardware dies." — Assurd Engineering Lab, Internal SOP v4.2

When a prospective buyer evaluates a pre-owned GPU, the first number they reach for is Average FPS. It is intuitive, it is widely reported, and it is — from a forensic standpoint — profoundly misleading. Average FPS is an arithmetic mean applied to a distribution of frame delivery times that is rarely, if ever, Gaussian. It conflates the performance ceiling with the perceptual floor, and in doing so, it obscures virtually every failure mode that matters to a reliability engineer.

At Assurd Techlabs, our forensic certification pipeline does not treat Average FPS as a primary diagnostic variable. It is a baseline reference point — the "ceiling" that anchors our analysis. The actual health of a GPU's silicon, power delivery network, and memory subsystem is encoded in the distribution of frametimes, specifically in the statistical lower percentiles we refer to collectively as the Smoothness Delta.

This document is a complete technical explanation of that methodology: the physics of why frametimes matter, the mathematics of our percentile-based diagnostics, and the specific hardware failure signatures that manifest in 1% and 0.1% low data.


Part I: The Physics of Frame Delivery — Why Time, Not Rate, Is the Fundamental Unit

Before we can discuss percentile statistics, we must establish the correct mental model. The human visual system does not perceive "frames per second" as a raw count. It perceives the temporal spacing between successive frames — the frametime, measured in milliseconds.

A display rendering at a constant 100 FPS delivers one frame every 10.0 ms. This is perceptually smooth because the inter-frame interval is uniform. The visual cortex adapts to the rhythm.

Now consider a GPU that delivers frames with the following frametime sequence:

8ms, 8ms, 8ms, 80ms, 8ms, 8ms, 8ms, 8ms, 8ms, 8ms, 8ms, 8ms, 8ms

This sequence totals 176ms across 13 frames, yielding an effective rate of roughly 74 FPS over the measured window — a respectable number. But embedded in that sequence is an 80ms stutter: a single frame that took ten times longer than its neighbors to render. The human perceptual system detects inter-frame delta deviations as low as 8–10ms as distinct stutter events. An 80ms hitch is not a statistic — it is a catastrophic perceptual failure.

This is the foundational problem with Average FPS: it is latency-blind. It cannot distinguish between a perfectly isochronous 74 FPS delivery and a delivery that includes severe microstutter events.

The solution is to represent GPU output not as a scalar but as a probability distribution of frametimes, and then to interrogate the tails of that distribution.
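The latency-blindness argument is easy to demonstrate numerically. The short Python sketch below (illustrative values only) computes the arithmetic-mean FPS of the stutter sequence alongside its worst single frametime, showing how the mean absorbs the spike:

```python
# Illustrative frametime sequence (ms): twelve 8 ms frames plus one 80 ms stutter.
frametimes_ms = [8, 8, 8, 80] + [8] * 9

total_ms = sum(frametimes_ms)                       # 176 ms for 13 frames
avg_fps = len(frametimes_ms) / (total_ms / 1000.0)  # ~73.9 FPS over the window
worst_ms = max(frametimes_ms)                       # the 80 ms hitch

print(f"Average FPS: {avg_fps:.1f}")  # the mean looks healthy...
print(f"Worst frame: {worst_ms} ms")  # ...but hides a 10x frametime spike
```

The mean reports a playable frame rate while the worst frame is an order of magnitude slower than its neighbors, which is exactly the failure mode a distribution-based view exposes.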


Part II: Defining the Metrics — From Mean to Percentile

Average FPS — The Reference Ceiling

\text{FPS}_{\text{avg}} = \frac{N_{\text{frames}}}{T_{\text{total}}}

Where N_{\text{frames}} is the total count of rendered frames and T_{\text{total}} is the elapsed benchmark duration in seconds.

Average FPS answers one question: does this silicon have sufficient raw throughput to handle this workload at this resolution? It is the appropriate metric for resolution-tier decisions — whether a card can sustain playable output at 1080p, 1440p, or 4K. It says nothing about the quality of that output.

Forensic Significance: Assurd uses Average FPS to anchor the analysis. A card performing >15% below its reference average for its SKU and driver version triggers a secondary investigation into clock states, power limits, and VBIOS configuration.


The 1% Low — The Perceptual Floor

The 1% Low is formally defined over the slowest 1% of all frames captured during a benchmark run: their frametimes are averaged, and the result is expressed as an equivalent frame rate. If a 60-second run at 60 FPS produces 3,600 frames, the 1% Low is derived from the 36 frames with the highest individual frametimes (i.e., the slowest deliveries).

It is important to distinguish this from a single-point percentile read-out — the 99th-percentile frametime, below which 99% of frames fall. Our methodology uses the average of all frames in the slowest percentile band, which produces a more stable and reproducible diagnostic signal — less susceptible to single-frame outliers from OS scheduler interrupts.
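The band-average definition can be written in a few lines. This is a minimal Python sketch of the construction described above (the function name `percentile_low` is ours, not part of any Assurd tooling):

```python
def percentile_low(frametimes_ms, band=0.01):
    """Average the slowest `band` fraction of frames and return the
    equivalent FPS -- the band-average '1% Low' construction."""
    # e.g. 36 frames out of 3,600 for the default 1% band
    n = max(1, int(len(frametimes_ms) * band))
    slowest = sorted(frametimes_ms, reverse=True)[:n]
    avg_ms = sum(slowest) / n
    return 1000.0 / avg_ms
```

Averaging the whole band, rather than reading a single percentile point, is what damps the influence of any one OS-scheduler outlier frame.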

The 1% Low is the metric that captures sustained heavy-load degradation. The events that drive frames into the 1% band include:

  • VRAM capacity saturation: When the active working set overflows available VRAM capacity, the GPU must stall execution while it evicts and re-fetches textures from system RAM over the PCIe bus. This path is both bandwidth- and latency-constrained: a PCIe Gen4 x16 link offers roughly 32 GB/s, more than an order of magnitude below on-board GDDR6X bandwidth, and each round trip over the bus adds latency far beyond a local VRAM access. The resulting frametime spike is characteristic and reproducible.

  • Driver overhead and API call batching: DirectX 12 and Vulkan reduce driver overhead substantially compared to DX11, but poorly optimized titles can still introduce CPU-GPU synchronization bubbles that manifest in the 1% low band.

  • Thermal throttling onset: As a GPU approaches its thermal junction limit (T_j), the firmware will reduce the operating V/F (voltage-frequency) point to stay within the thermal envelope. The transition between boost states is not instantaneous — the resulting frequency dip produces a characteristic elongated frametime.

Forensic Significance: A gap between Average FPS and 1% Low that exceeds 35% (i.e., \frac{\text{FPS}_{\text{avg}} - \text{FPS}_{1\%}}{\text{FPS}_{\text{avg}}} > 0.35) is a primary diagnostic flag for VRAM capacity pressure, thermal throttling, or driver-level pathology. In a certified pre-owned context, it frequently indicates that a card has been operated with degraded thermal interface material or configured with a reduced power limit by a previous owner.


The 0.1% Low — The Stutter Floor and the Primary Hardware Instability Sensor

The 0.1% Low applies the same band-average construction to the slowest 0.1% of frames. In a 3,600-frame run, this band comprises approximately 3–4 individual frame deliveries. This is where the distribution becomes forensically critical.

At the 1% band, the causes of slow frames are frequently software-mediated or thermally-driven. At the 0.1% band, the causes are predominantly hardware-layer pathologies:

1. Capacitor Equivalent Series Resistance (ESR) Degradation

Every VRM stage on a GPU PCB uses output capacitors to filter the switched-mode power supply's voltage ripple. A healthy aluminum polymer or MLCC capacitor presents very low ESR — typically <5 mΩ for a quality bulk capacitor array.

As capacitors age — accelerated by thermal cycling, high-ripple current, and electrolyte degradation — their ESR rises. Increased ESR means the capacitor cannot instantaneously supply the transient current demanded when the GPU's shader array transitions from a light workload state to a full-load burst. The result is a momentary voltage droop event on the VCore rail.

The GPU's internal power management unit (PMU) detects this droop and responds by lowering the operating frequency to a state the degraded power delivery can support. This transition takes microseconds to milliseconds — during which the GPU is effectively stalled. The result is a discrete, severe frametime spike that appears in the 0.1% low band.

2. High-Side MOSFET Transition Degradation

In a synchronous buck converter (the standard VRM topology used in modern GPUs), the high-side MOSFET switches the input rail to the inductor, and the low-side MOSFET provides the freewheeling path. The switching transition — typically occurring at 200–600 kHz per phase — must be clean and fast to maintain efficiency and regulation.

A degraded high-side MOSFET (due to gate oxide stress, electromigration in the die metallization, or thermal damage) exhibits increased R_{DS(on)} (drain-source on-resistance) and slower t_r/t_f (rise and fall times). The practical consequence is increased switching losses, reduced regulation bandwidth, and impaired transient response.

During a GPU load transient — for example, a compute shader dispatching a particle system involving hundreds of thousands of concurrent calculations — the VRM must respond to a \Delta I demand of tens of amps within microseconds. A degraded phase cannot do this cleanly. The resulting under-voltage condition is captured as a 0.1% low spike.

3. PCIe Interface Signal Integrity Degradation

In severely degraded or riser-damaged units, link-layer retransmissions on the PCIe bus can introduce latency spikes. While most desktop configurations running Gen4 x16 have enormous headroom, mining riser cables and slot-to-slot adapters can introduce differential signal impedance discontinuities that elevate LTSSM recovery latency — injecting multi-millisecond stalls directly into the render pipeline.

The RMA Signal: When \frac{\text{FPS}_{1\%} - \text{FPS}_{0.1\%}}{\text{FPS}_{1\%}} > 0.50 — meaning the 0.1% low is more than 50% below the 1% low — we treat this as a primary hardware instability flag. In our certification database, this pattern correlates with a >74% probability of a hardware-layer pathology — not a software or thermal issue. Units exhibiting this signature proceed to VRM resistance measurement and capacitor ESR probing.
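The two gap thresholds used in Part II can be expressed as simple predicates. This sketch transcribes only the arithmetic stated in the text; the function names are ours, and the 0.35 and 0.50 constants are the published thresholds:

```python
def vram_thermal_flag(avg_fps, low_1pct, threshold=0.35):
    """Flag an Average-to-1% gap exceeding 35% (VRAM pressure,
    thermal throttling, or driver pathology)."""
    return (avg_fps - low_1pct) / avg_fps > threshold

def hardware_instability_flag(low_1pct, low_01pct, threshold=0.50):
    """Flag a 0.1% low more than 50% below the 1% low (the RMA signal)."""
    return (low_1pct - low_01pct) / low_1pct > threshold
```

On the degraded-card figures used later in this document (1% Low 105, 0.1% Low 41), the second predicate fires: the collapse ratio is 64/105, about 0.61.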


Part III: The Assurd Fluidity Formula — Mathematical Derivation

The Fluidity Score is a single normalized scalar derived from the three core metrics. It is designed to be both diagnostically sensitive (catching instability) and interpretively accessible to non-engineers.

The formula weights the Average-to-1% gap more heavily than the 1%-to-0.1% gap, because the former affects moment-to-moment gameplay feel across a broader range of users, while the latter is a forensic flag for hardware pathology rather than purely a perceptual quality metric:

S_F = \left(1 - \left(\frac{\text{FPS}_{\text{avg}} - \text{FPS}_{1\%}}{\text{FPS}_{\text{avg}}} \cdot 0.7 + \frac{\text{FPS}_{1\%} - \text{FPS}_{0.1\%}}{\text{FPS}_{1\%}} \cdot 0.3\right)\right) \times 100

Where:

  • S_F = Fluidity Score (0–100)
  • \text{FPS}_{\text{avg}} = Arithmetic mean FPS over the full benchmark duration
  • \text{FPS}_{1\%} = Average FPS of the lowest 1% of frames
  • \text{FPS}_{0.1\%} = Average FPS of the lowest 0.1% of frames
  • 0.7 = Weighting coefficient for sustained floor degradation (perceptual impact)
  • 0.3 = Weighting coefficient for stutter floor instability (hardware pathology signal)
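In code, the formula reduces to a few lines. This is a direct Python transcription of the definition above, not Assurd's production implementation:

```python
def fluidity_score(avg_fps, low_1pct, low_01pct, w_floor=0.7, w_stutter=0.3):
    """Assurd Fluidity Score S_F on a 0-100 scale, as defined above."""
    floor_gap = (avg_fps - low_1pct) / avg_fps        # sustained floor degradation term
    stutter_gap = (low_1pct - low_01pct) / low_1pct   # stutter floor instability term
    return (1 - (floor_gap * w_floor + stutter_gap * w_stutter)) * 100
```

Fed the inputs from the worked examples that follow, this transcription reproduces the hand calculations to one decimal place.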

Worked Example — Healthy Card

Consider a GPU producing:

  • \text{FPS}_{\text{avg}} = 145
  • \text{FPS}_{1\%} = 118
  • \text{FPS}_{0.1\%} = 109

S_F = \left(1 - \left(\frac{145 - 118}{145} \cdot 0.7 + \frac{118 - 109}{118} \cdot 0.3\right)\right) \times 100

S_F = \left(1 - \left(0.1862 \cdot 0.7 + 0.0763 \cdot 0.3\right)\right) \times 100

S_F = \left(1 - \left(0.1303 + 0.0229\right)\right) \times 100 = \left(1 - 0.1532\right) \times 100 \approx \mathbf{84.7}

This card achieves Certified status. The relatively tight 1%/0.1% relationship (9 FPS gap) confirms power delivery stability.

Worked Example — Hardware-Degraded Card

Consider a card with:

  • \text{FPS}_{\text{avg}} = 142
  • \text{FPS}_{1\%} = 105
  • \text{FPS}_{0.1\%} = 41

S_F = \left(1 - \left(\frac{142 - 105}{142} \cdot 0.7 + \frac{105 - 41}{105} \cdot 0.3\right)\right) \times 100

S_F = \left(1 - \left(0.2606 \cdot 0.7 + 0.6095 \cdot 0.3\right)\right) \times 100

S_F = \left(1 - \left(0.1824 + 0.1829\right)\right) \times 100 = \left(1 - 0.3653\right) \times 100 \approx \mathbf{63.5}

This card fails certification. The catastrophic 1%-to-0.1% drop (61% collapse) is a definitive hardware instability signature. Average FPS alone — 142, well within normal range — would have completely hidden this.


Part IV: Benchmark Methodology and Environmental Controls

The Fluidity Score is only reproducible and meaningful if the test conditions are rigorously controlled. Frametime data is highly sensitive to environmental variables that introduce noise:

Test Bench Specification

All Assurd frametime measurements are captured on a controlled reference platform with the following specifications:

  • CPU: Fixed-frequency operation (no boost) to eliminate CPU-side render bottlenecks from contaminating GPU frametime data
  • RAM: XMP/EXPO enabled; sub-timings verified for consistency across test runs
  • Storage: NVMe SSD for asset streaming; HDD or degraded NVMe introduces artificial 1% lows unrelated to GPU health
  • OS: Clean Windows install, background process suppression, Windows Game Mode disabled (introduces scheduler unpredictability)
  • Driver: Fixed reference driver version, consistent across all units tested in a batch

Workload Selection

We use a four-title suite spanning different rendering paradigms:

  • Cyberpunk 2077 (RT Overdrive) — Primary GPU stress: RT core + shader ALU + VRAM bandwidth. Frametime pathology exposed: thermal throttle, VRAM capacity
  • Metro Exodus Enhanced — Primary GPU stress: lighting compute + memory pressure. Frametime pathology exposed: VRM transient response
  • F1 24 — Primary GPU stress: CPU-GPU pipelining, low overhead. Frametime pathology exposed: PCIe link latency, clock stretching
  • Blender Benchmark (GPU Compute) — Primary GPU stress: sustained 100% FP32 utilization. Frametime pathology exposed: steady-state VRM stability

Each title runs for a minimum of 600 seconds (10 minutes) after a 120-second thermal pre-conditioning phase. Frametime data is captured per present event using a direct OS-level PresentMon telemetry pipeline.

Statistical Filtering

Raw frametime data is post-processed to remove OS-level interrupt artifacts — hard stalls caused by driver interrupts, background service activity, or display composition pipeline events unrelated to GPU render performance. We apply a 3-sigma outlier filter to the top 0.01% of frametimes (the absolute spikes) to isolate OS noise from hardware-generated anomalies.

This is a critical methodological distinction. An unfiltered 0.1% low in a 600-second capture at 60 FPS draws on roughly 36 of 36,000 frames — approximately 360ms of total frame budget. OS-induced interrupts, while real, are not GPU health indicators. Filtering them correctly separates the hardware signal from the software noise floor.
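A minimal sketch of such a filter follows, under the assumption that only frames in the top 0.01% of frametimes are tested against a 3-sigma bound computed over the full capture (the lab's exact windowing is not specified here, so this is illustrative):

```python
import statistics

def filter_os_spikes(frametimes_ms):
    """Drop frames that are both in the top 0.01% of frametimes and beyond
    mean + 3 sigma -- treating them as OS interrupt artifacts, not GPU signal."""
    mu = statistics.fmean(frametimes_ms)
    sigma = statistics.pstdev(frametimes_ms)
    # Frametime at the 99.99th-percentile rank: frames above it are spike candidates.
    cutoff = sorted(frametimes_ms)[max(0, int(len(frametimes_ms) * 0.9999) - 1)]
    return [t for t in frametimes_ms if t <= cutoff or t <= mu + 3 * sigma]
```

Frames below the 99.99th-percentile rank are always kept, so the filter cannot eat into the 0.1% low band itself; only the absolute extreme spikes are candidates for removal.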


Part V: Certification Thresholds and Grading Rubric

  • 95–100 — Imperceptibly smooth; frame pacing is isochronous within ~1ms tolerance. 1% Low: >85% of Avg; 0.1% Low: >92% of 1% Low. Verdict: Assurd Gold
  • 88–94 — Highly playable; occasional minor frametime variance, imperceptible to most users. 1% Low: >75% of Avg; 0.1% Low: >80% of 1% Low. Verdict: Assurd Certified
  • 80–87 — Playable with occasional perceptible stutter under heavy asset loads. 1% Low: >65% of Avg; 0.1% Low: >70% of 1% Low. Verdict: Certified — Noted
  • <80 — Noticeable jitter; competitive use impaired. 1% Low: any; 0.1% Low: <50% of 1% Low. Verdict: Audit Required
  • Any score — 0.1% Low collapsed more than 60% below the 1% Low (i.e., <40% of 1% Low), regardless of Fluidity Score. Verdict: Hardware Investigation Mandatory

Note on the Final Row: A card can technically achieve a Fluidity Score above 80 while still exhibiting a severe 0.1% low collapse if the Average and 1% Low are tight. The absolute collapse threshold overrides the scalar score because the 0.1% low pattern is a hardware failure signature, not a perceptual quality issue.
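The override logic in that note can be made explicit in code. This sketch assumes the score bands above; the verdict strings and function name are ours, and it is not Assurd's grading engine:

```python
def lab_verdict(score, low_1pct, low_01pct):
    """Map a Fluidity Score plus the 0.1%/1% relationship to a lab verdict."""
    # The absolute 0.1% collapse threshold overrides the scalar score
    # (final rubric row): <40% of the 1% Low is a hardware failure signature.
    if low_01pct < 0.40 * low_1pct:
        return "Hardware Investigation Mandatory"
    if score >= 95:
        return "Assurd Gold"
    if score >= 88:
        return "Assurd Certified"
    if score >= 80:
        return "Certified - Noted"
    return "Audit Required"
```

Note how a card with a respectable scalar score still falls through to the hardware branch whenever the 0.1% Low sits below 40% of the 1% Low, which is exactly the override the rubric describes.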


Part VI: Clinical Patterns — What the Data Looks Like

Pattern A: Healthy Card

Avg: 144 | 1% Low: 121 | 0.1% Low: 112
Ratio: 84% / 93% | Fluidity: 86.6 | VERDICT: Certified

Tight, consistent distribution. The 0.1%/1% ratio of 93% indicates power delivery is clean under transient load. This card has been operated within thermal and electrical margins.


Pattern B: VRAM Pressure Signature

Avg: 139 | 1% Low: 74 | 0.1% Low: 66
Ratio: 53% / 89% | Fluidity: 64.0 | VERDICT: Audit Required

The catastrophic Average-to-1% collapse (53% ratio) with a relatively intact 1%-to-0.1% relationship (89% ratio) is the characteristic VRAM saturation signature. When VRAM fills, the card stalls on PCIe memory transactions — producing frequent, sustained slow frames. But within those slow frames, the power delivery remains stable (the 0.1% doesn't further collapse from the 1%). This card is likely operating a VRAM-intensive workload beyond its physical memory capacity, or has partially failed VRAM modules reducing effective addressable memory.


Pattern C: VRM/Capacitor Degradation Signature

Avg: 148 | 1% Low: 124 | 0.1% Low: 38
Ratio: 84% / 31% | Fluidity: 67.8 | VERDICT: Hardware Investigation

The tight Average-to-1% relationship (84%) — indicating no thermal or VRAM issues — combined with a catastrophic 0.1% collapse (31% of 1% Low) is the canonical VRM instability signature. The card runs smoothly almost all the time. But periodically, under peak transient demand, the power delivery subsystem cannot maintain regulation, producing brief but severe voltage droops. These appear as isolated, extreme frametime spikes — exactly the events that end competitive matches and cause game crashes. This card proceeded to ESR probing and VRM phase resistance measurement. Results documented in Case Study CS-004.


Conclusion

Average FPS is a marketing metric. It answers the question "is this card fast?" — and stops there. The Smoothness Delta and the Assurd Fluidity Score answer the questions that actually determine whether hardware is fit for use: Is the power delivery clean? Is the VRAM functional under pressure? Is the silicon stable under transient load?

Every GPU that leaves our laboratory with an Assurd Certified badge has passed a full frametime distribution audit. The mathematics are not proprietary — they are derived from first-principles statistics and electrical engineering. What is proprietary is our failure pattern library — thousands of test runs that allow our analysts to recognize hardware pathology signatures the moment they appear in the data.

When you see our certification badge, you are not paying for a 10-second benchmark. You are paying for statistical certainty.


Technical questions regarding our benchmark methodology, filtering algorithms, or certification thresholds can be directed to the Assurd Engineering Lab via our technical disclosure portal.