godot/servers/rendering/renderer_rd
clayjohn 2e59cb41f4 Optimize glow and tonemap gather step in the mobile renderer
Mobile devices are typically bandwidth-bound, which means we need to take as few texture samples as possible.

They typically use TBDR GPUs, which means that all rendering takes place in specially optimized on-chip tiles. As a side effect, moving data between tile memory and VRAM is really slow, especially on Mali devices.

This commit uses a technique where you do a small blur while downsampling, and then another small blur while upsampling, to get really high quality glow. While this doesn't reduce the render pass count very much, it does reduce the texture read bandwidth by almost 10 times. Overall, glow was more texture-read bound than memory-write bound, so this was a huge win.

A side effect of this new technique is that we can gather the glow as we upsample instead of gathering the glow in the final tonemap pass. Doing so allows us to significantly reduce the cost of the tonemap pass as well.
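
As a rough illustration of the idea described above (not the actual shader code from this commit), here is a minimal CPU-side sketch in Python on a 1D "image". Everything in it is an assumption made for illustration: the function names (downsample, upsample, gather_glow), the [1, 2, 1] / 4 kernel, the linear upsample filter, and the per-level weights are hypothetical and do not come from the renderer. It only shows the shape of the technique: blur a little on the way down, blur a little on the way up, and add each level's contribution during the upsample so the final pass reads a single glow buffer.

```python
def downsample(src):
    """Halve the resolution while applying a small [1, 2, 1] / 4 blur."""
    n = len(src)
    out = []
    for i in range(0, n, 2):
        left = src[max(i - 1, 0)]        # clamp at the edges
        right = src[min(i + 1, n - 1)]
        out.append((left + 2.0 * src[i] + right) / 4.0)
    return out


def upsample(src, size):
    """Resample to `size` texels with linear filtering (a small tent blur)."""
    last = len(src) - 1
    out = []
    for i in range(size):
        # Position of destination texel i in source texel space, clamped.
        pos = min(max((i + 0.5) / 2.0 - 0.5, 0.0), float(last))
        lo = int(pos)
        hi = min(lo + 1, last)
        t = pos - lo
        out.append(src[lo] * (1.0 - t) + src[hi] * t)
    return out


def gather_glow(image, level_weights):
    """Build the mip chain, then gather glow while walking back up.

    level_weights[k] scales mip level k + 1 (hypothetical parameter).
    Returns the accumulated glow at half the input resolution.
    """
    levels = len(level_weights)
    mips = [list(image)]
    for _ in range(levels):
        mips.append(downsample(mips[-1]))

    # Start from the smallest level and add one level per upsample step.
    acc = [v * level_weights[-1] for v in mips[-1]]
    for level in range(levels - 1, 0, -1):
        up = upsample(acc, len(mips[level]))
        acc = [u + level_weights[level - 1] * m
               for u, m in zip(up, mips[level])]
    return acc


def composite(image, glow):
    """Final pass: one glow read per texel instead of one per level."""
    up = upsample(glow, len(image))
    return [c + g for c, g in zip(image, up)]
```

In this sketch the final composite reads one accumulated buffer rather than every glow level, which is the part that corresponds to the cheaper tonemap pass mentioned above.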
2025-10-30 21:56:26 -07:00
..
effects Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
environment Push pipeline compilation of various effects to the worker thread pool. 2025-10-13 12:00:23 -03:00
forward_clustered Use half float precision buffer for 3D when HDR2D is enabled 2025-10-21 13:44:46 -07:00
forward_mobile Use half float precision buffer for 3D when HDR2D is enabled 2025-10-21 13:44:46 -07:00
shaders Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
spirv-reflect SCons: Add unobtrusive type hints in SCons files 2024-09-25 09:34:35 -05:00
storage_rd Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
cluster_builder_rd.cpp Rename server "free" functions to "free_rid" to match exposed API 2025-09-30 16:52:25 -07:00
cluster_builder_rd.h [macOS] Selectively bake "no image atomics" shader variants. 2025-07-12 21:05:48 +03:00
framebuffer_cache_rd.cpp Implement hooks into renderer 2024-02-18 21:54:21 +11:00
framebuffer_cache_rd.h Use idiomatic templating vargs in a few places to reduce code. 2025-06-08 12:24:07 +02:00
pipeline_cache_rd.cpp Rename server "free" functions to "free_rid" to match exposed API 2025-09-30 16:52:25 -07:00
pipeline_cache_rd.h Style: Replace header guards with #pragma once 2025-03-07 17:33:47 -06:00
pipeline_deferred_rd.h Push pipeline compilation of various effects to the worker thread pool. 2025-10-13 12:00:23 -03:00
pipeline_hash_map_rd.h Move server files into their subfolders 2025-09-30 19:39:39 -07:00
renderer_canvas_render_rd.cpp Rename server "free" functions to "free_rid" to match exposed API 2025-09-30 16:52:25 -07:00
renderer_canvas_render_rd.h Rewrite HashMapHasherDefault based on type traits - it is now possible to declare a default hashing function for any type. 2025-10-05 01:49:11 +02:00
renderer_compositor_rd.cpp Add Stretch Modes for Splash Screen 2025-10-21 18:20:44 -04:00
renderer_compositor_rd.h Add Stretch Modes for Splash Screen 2025-10-21 18:20:44 -04:00
renderer_scene_render_rd.cpp Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
renderer_scene_render_rd.h Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
SCsub SCons: Add unobtrusive type hints in SCons files 2024-09-25 09:34:35 -05:00
shader_rd.cpp Rename server "free" functions to "free_rid" to match exposed API 2025-09-30 16:52:25 -07:00
shader_rd.h Move server files into their subfolders 2025-09-30 19:39:39 -07:00
uniform_set_cache_rd.cpp Optimize RenderForwardClustered::_setup_render_pass_uniform_set by reducing Vector allocations during push_back operations 2024-12-02 15:03:50 +01:00
uniform_set_cache_rd.h Use idiomatic templating vargs in a few places to reduce code. 2025-06-08 12:24:07 +02:00