NVIDIA Low Latency Mode: On vs Ultra vs Off


What Is NVIDIA Low Latency Mode?

NVIDIA Low Latency Mode is a graphics driver setting that makes games feel more responsive. It reduces the delay (latency) between pressing a key or moving the mouse and seeing the result on screen. This helps most in fast games like shooters.

You can set it in the NVIDIA Control Panel with three options: Off, On, and Ultra.

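It only affects games that use DirectX 9 or DirectX 11. In DirectX 12 and Vulkan games, the game itself decides when to queue frames, so this setting does nothing there. For those games, use NVIDIA Reflex if the game supports it.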

How Low Latency Mode Works

Each setting changes how far ahead your computer is allowed to prepare game frames:

Off: The game decides how many frames to queue ahead. This may add a small delay, but it is fine for slow games like puzzle games or story games, where fast reactions do not matter.
On: The driver keeps the queue short to lower the delay. Most people use this. It keeps play smooth without putting extra strain on your computer.
Ultra: Frames are submitted just in time for the GPU, so almost nothing waits in the queue. This lowers the delay the most, but it leaves the CPU no room to work ahead, so it suits strong PCs. On weaker PCs, it may cause stutters when your fps drops. It is good for competitive games like CS2.

Ultra is the best choice when the lowest possible latency matters most, as in competitive games. Not every computer benefits from it, though. If you see stutters on an older or low-end PC, switch back to On.

How Low Latency Mode Changes Frame Queuing (Pre-Rendered Frames)

Low Latency Mode changes how many frames the CPU may prepare ahead of the GPU before they are rendered and displayed. The number of queued frames depends on the setting:

Off: The game decides, typically 1 to 3 frames are queued.
On: At most 1 frame is queued.
Ultra: No frame waits in the queue. Each frame is submitted just in time for the GPU to pick it up.
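
To get a feel for what these numbers mean, here is a rough back-of-the-envelope sketch in Python. It assumes each queued frame takes one full frame time to clear, which is a simplification, and the function name `queue_delay_ms` is made up for this example.

```python
def queue_delay_ms(queued_frames: int, fps: float) -> float:
    """Estimate the extra wait, in milliseconds, caused by frames already queued.

    Simplifying assumption: every queued frame takes exactly one frame time
    (1000 / fps milliseconds) to clear.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    frame_time_ms = 1000.0 / fps
    return queued_frames * frame_time_ms


# Compare a few queue depths at 60 fps.
for queued in (0, 1, 3):
    print(f"{queued} queued at 60 fps: {queue_delay_ms(queued, 60):.1f} ms")
```

At 60 fps, one frame lasts about 16.7 ms, so three queued frames add roughly 50 ms before your input shows up. At higher frame rates each frame is shorter, so the same queue costs less time.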

Why Does Pre-Rendering Frames Cause Latency?

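Every frame is built from the input that was available when its preparation started. When several frames are prepared ahead of time, they form a queue, and each new frame has to wait behind the older ones before it reaches the screen. Think of it like a simple line of steps:

The CPU prepares a frame using your latest input.
The frame gets stored in a buffer (a waiting area).
The GPU takes the oldest frame from the buffer and renders it.
The rendered frame is displayed, and the next frame in the buffer takes its turn.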

If the buffer already holds several frames, a newly prepared frame has to wait for all of them to be shown first, so your latest input reaches the screen late. A smaller buffer means less waiting, which makes the game feel more responsive.
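
The effect of the buffer can be illustrated with a toy simulation. This is only a sketch under simple assumptions: the GPU is the bottleneck and needs a fixed 10 ms per frame, the CPU refills the buffer instantly, and a frame counts as displayed the moment the GPU finishes it. The function name and parameters are hypothetical and are not part of any NVIDIA API.

```python
from collections import deque


def steady_state_latency_ms(max_queued: int, gpu_ms: float = 10.0,
                            frames: int = 100) -> float:
    """Input-to-display latency of the last simulated frame, in milliseconds."""
    buffer = deque()  # input timestamps of frames waiting for the GPU
    now = 0.0
    latency = 0.0
    for _ in range(frames):
        # The CPU (assumed instant) tops up the buffer: `max_queued` frames
        # will wait, plus the one frame the GPU is about to take.
        while len(buffer) < max_queued + 1:
            buffer.append(now)  # this frame uses the input sampled at `now`
        sampled_at = buffer.popleft()  # the GPU takes the oldest frame
        now += gpu_ms                  # the GPU renders it
        latency = now - sampled_at     # displayed as soon as rendering ends
    return latency


for depth in (0, 1, 3):
    print(depth, steady_state_latency_ms(depth))
# prints:
# 0 10.0
# 1 20.0
# 3 40.0
```

In this model, every frame waiting in the buffer adds one more GPU frame time between your input and the picture. Real games are messier, but the trend is the same.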

Here's an analogy

Imagine a restaurant where the kitchen prepares food and places it in a holding area before serving it to customers. If the holding area is large and full, your meal waits behind every plate ahead of it and takes longer to arrive. A smaller holding area means food gets served sooner after it is cooked.

In the same way, reducing the buffer size means each frame reaches the screen sooner after the input that produced it. The game does not run at a higher frame rate, but it feels more responsive.