godot/servers/rendering/renderer_rd/storage_rd
clayjohn 2e59cb41f4 Optimize glow and tonemap gather step in the mobile renderer
Mobile devices are typically bandwidth bound, which means we need to do as few texture samples as possible.

They typically use tile-based deferred rendering (TBDR) GPUs, which means that all rendering takes place on special optimized tiles. As a side effect, reading back memory from tile to VRAM is really slow, especially on Mali devices.

This commit uses a technique where you do a small blur while downsampling, and then another small blur while upsampling to get really high quality glow. While this doesn't reduce the renderpass count very much, it does reduce the texture read bandwidth by almost 10 times. Overall, glow was more texture-read bound than memory-write bound, so this was a huge win.

A side effect of this new technique is that we can gather the glow as we upsample instead of gathering the glow in the final tonemap pass. Doing so allows us to significantly reduce the cost of the tonemap pass as well.
2025-10-30 21:56:26 -07:00
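
The downsample-then-upsample scheme described in the commit message above can be illustrated with a small CPU-side sketch. This is not the code from the commit: the function names (`downsample`, `blur3x3`, `upsample_and_gather`, `glow`), the 2x2 box filter, the 3x3 tent filter, and the per-level `weights` are all assumptions chosen for illustration, and the image is a plain list of lists of floats rather than a GPU texture. It only shows the shape of the idea: blur a little on the way down, blur a little on the way up, and add each level's contribution during the upsample so the final pass reads a single accumulated result.

```python
def downsample(img):
    """Halve the resolution; each output texel averages a 2x2 block (a small blur)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
             for x in range(w)] for y in range(h)]


def blur3x3(img, y, x):
    """3x3 tent filter (weights 1-2-1 per axis) with clamp-to-edge addressing."""
    taps = (1.0, 2.0, 1.0)
    total = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            sy = min(max(y + dy, 0), len(img) - 1)
            sx = min(max(x + dx, 0), len(img[0]) - 1)
            total += taps[dy + 1] * taps[dx + 1] * img[sy][sx]
    return total / 16.0


def upsample_and_gather(low, high, weight):
    """Upsample `low` to the size of `high` with a small blur and, in the same
    pass, add the weighted `high` level (the "gather while upsampling" step)."""
    return [[blur3x3(low, y // 2, x // 2) + weight * high[y][x]
             for x in range(len(high[0]))] for y in range(len(high))]


def glow(img, weights):
    """Build one downsampled level per weight, then walk back up accumulating.

    Assumes the image width and height are divisible by 2 ** len(weights).
    Returns the accumulated glow at half the input resolution.
    """
    mips = []
    cur = img
    for _ in weights:
        cur = downsample(cur)
        mips.append(cur)
    # Start from the smallest level and accumulate towards the largest.
    acc = [[weights[-1] * v for v in row] for row in mips[-1]]
    for level in range(len(weights) - 2, -1, -1):
        acc = upsample_and_gather(acc, mips[level], weights[level])
    return acc
```

With this structure, a hypothetical final tonemap step would only need to sample the one accumulated buffer returned by `glow` instead of sampling every glow level separately, which is the kind of saving the commit message attributes to gathering during the upsample.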
forward_id_storage.cpp One Copyright Update to rule them all 2023-01-05 13:25:55 +01:00
forward_id_storage.h Core: Modernize C headers with C++ equivalents 2025-05-02 08:23:01 -05:00
light_storage.cpp Use half float precision buffer for 3D when HDR2D is enabled 2025-10-21 13:44:46 -07:00
light_storage.h Prompt editor restart when reflection probe size is updated 2025-09-29 18:35:34 -07:00
material_storage.cpp Rename server "free" functions to "free_rid" to match exposed API 2025-09-30 16:52:25 -07:00
material_storage.h Add shader baker to project exporter. 2025-05-27 12:45:27 -03:00
mesh_storage.cpp Rename server "free" functions to "free_rid" to match exposed API 2025-09-30 16:52:25 -07:00
mesh_storage.h Add some multimesh null checks to avoid crash 2025-07-13 10:09:36 +08:00
particles_storage.cpp Push pipeline compilation of various effects to the worker thread pool. 2025-10-13 12:00:23 -03:00
particles_storage.h Push pipeline compilation of various effects to the worker thread pool. 2025-10-13 12:00:23 -03:00
render_buffer_custom_data_rd.h Style: Replace header guards with #pragma once 2025-03-07 17:33:47 -06:00
render_data_rd.cpp Remove empty bind_methods() 2024-08-15 08:24:32 +02:00
render_data_rd.h Style: Replace header guards with #pragma once 2025-03-07 17:33:47 -06:00
render_scene_buffers_rd.compat.inc Resolve load and store ops automatically for render passes for discardable textures. 2024-11-25 11:27:48 -03:00
render_scene_buffers_rd.cpp Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
render_scene_buffers_rd.h Optimize glow and tonemap gather step in the mobile renderer 2025-10-30 21:56:26 -07:00
render_scene_data_rd.cpp Fix wrong indexes for double precision 2025-10-09 22:01:41 +02:00
render_scene_data_rd.h Optimize vertex shader using mat3x4 to reduce bandwidth, load/store operations and ALUs 2025-09-26 23:20:08 -07:00
SCsub SCons: Add unobtrusive type hints in SCons files 2024-09-25 09:34:35 -05:00
texture_storage.cpp Use half float precision buffer for 3D when HDR2D is enabled 2025-10-21 13:44:46 -07:00
texture_storage.h Add and enable default textures for other samplers 2025-07-31 00:08:43 +01:00
utilities.cpp Introduce 'drivers/apple_embedded' abstract platform for code reuse 2025-05-19 15:37:13 -07:00
utilities.h Style: Replace header guards with #pragma once 2025-03-07 17:33:47 -06:00