The Wayback Machine - https://web.archive.org/web/20201108192847/https://github.com/bevyengine/bevy/issues/179

PBR / Clustered Forward Rendering #179

Open
aclysma opened this issue Aug 14, 2020 · 17 comments

@aclysma aclysma commented Aug 14, 2020

This is a Focus Area tracking issue

PBR is a standard-ish way of rendering realistically in 3D. There is both a lot of interest and a lot of brain-power in this area, so it makes sense to build PBR now. This focus area has the following (minimum) requirements:

  • PBR Shaders (which implies hdr)
  • Bloom (to convey hdr)
  • Shadowing (forces us to build out a real "pipeline")
  • Battle-test the current mid-level rendering abstractions and rework them where necessary

Active Crates / Repos

  • @StarArawn's draft PR with an implementation based on Filament: #261

Sub Issues

No active issues discussing subtopics for this focus area. If you would like to discuss a particular topic, look for a pre-existing issue in this repo. If you can't find one, feel free to make one! Link to it in this issue and I'll add it to the index.


Original Post (sorry @aclysma for stomping on this)

There was a discord conversation that I think is worth capturing. I'll do my best but I may miss some people or get some sentiments wrong. I also don't know everyone's github names.

StarToaster, fusha, and aclysma (me) all commented that clustered forward rendering was a good overall model to pursue. (Possibly matthewfcarlson as well; not sure if he was agreeing or just linking a helpful doc :D)

StarToaster: I kinda hope we don't add a deferred renderer. We'll still need some sort of gbuffer for ssao, but clustered shading is more accurate and faster when compared with deferred shading.
matthewfcarlson: the Filament doc linked from the PBR milestone has a good explanation of CFR along the frustum
Fusha: also, throwing my hat into the mix in support of clustered forward rendering
aclysma: forward is intuitive to work with and lots of techniques "just work", and clustering mitigates the issue with forward having practical limits to light sources

@cart suggested later:

cart: We'll need to start building a plan, but in the short term if you're interested, start familiarizing yourself with the current state of the bevy renderer. And maybe check out the Google Filament document. It's a nice overview of pbr implementation.

To summarize, the main advantages of this approach are:

  • Most techniques work simply and intuitively. (For example you can use plain MSAA)
  • Good choice for VR and mobile
  • Can start by simply implementing/extending with forward rendering, and transition to clustered later without too much waste
  • A good balance of simple and good performance with many light sources
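For readers new to the technique, the data flow under discussion can be sketched as follows. This is an illustrative model only, not Bevy code; all names and grid dimensions are assumptions: the view frustum is split into a 3D grid of clusters, lights are assigned to the clusters they overlap, and each fragment shades using only its cluster's light list.

```rust
// Illustrative model of clustered forward rendering (not Bevy code): the view
// frustum is divided into a 3D grid of "clusters", lights are assigned to the
// clusters they overlap, and a fragment shades with only its cluster's list.

const GRID_X: u32 = 16; // screen-space tiles, horizontal
const GRID_Y: u32 = 9; // screen-space tiles, vertical
const GRID_Z: u32 = 24; // depth slices along the view frustum

/// Flat index of the cluster containing a fragment, from its screen position
/// (0..1) and view-space depth. Depth slices are spaced logarithmically, as
/// is common in clustered shading implementations.
fn cluster_index(screen_uv: (f32, f32), view_z: f32, z_near: f32, z_far: f32) -> u32 {
    let x = ((screen_uv.0 * GRID_X as f32) as u32).min(GRID_X - 1);
    let y = ((screen_uv.1 * GRID_Y as f32) as u32).min(GRID_Y - 1);
    let slice = ((view_z / z_near).ln() / (z_far / z_near).ln() * GRID_Z as f32) as u32;
    let z = slice.min(GRID_Z - 1);
    (z * GRID_Y + y) * GRID_X + x
}

/// The light list a fragment iterates over: only the lights assigned to its
/// own cluster, which is what lifts the practical light-count limit of plain
/// forward rendering. `light_lists[i]` holds light indices for cluster `i`.
fn lights_for_fragment<'a>(
    light_lists: &'a [Vec<u32>],
    screen_uv: (f32, f32),
    view_z: f32,
) -> &'a [u32] {
    &light_lists[cluster_index(screen_uv, view_z, 0.1, 1000.0) as usize]
}
```

Because the per-fragment work is still ordinary forward shading, techniques like MSAA keep working unchanged; only the light lookup changes.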

@GabLotus GabLotus commented Aug 14, 2020

Hey, that's a great discussion to have. I suggest changing the title to something more actionable so that we can instinctively have an idea of the "lifetime" of this issue. An example would be "Make a plan regarding X" or "Investigate the possibility of Y". It's just a small guideline that helps keep track of where a discussion should go and end :)


@cart cart commented Aug 14, 2020

Or maybe even just Clustered Rendering. As @GabLotus said, in general I'd like to avoid issues without a clear done definition. I'm fine with issues being big, but they should have clear outcomes.

(i also fully admit that i've set a bad example with some of my past issues)

@aclysma aclysma changed the title Render Pipeline Discussion Clustered Forward Rendering Aug 14, 2020

@aclysma aclysma commented Aug 14, 2020

What I'd like to see happen with this task is to gather opinions and possibly find consensus on a high level default render pipeline structure for bevy to target in the very near term and a little longer term. (i.e. very near term = ~3 months, little longer term = ~12 months). Ideally we could come up with a very brief roadmap that provides value now and also provides a path to improve on it later.

I would recommend keeping focus on forward rendering today, and favor choices that can fold well into doing clustered forward rendering later.


@termhn termhn commented Aug 14, 2020

I'm Fusha on Discord, basically just coming to throw my 👍 into the hat on clustered forward but also to comment and mention a couple other related things:

  1. One of the biggest downsides of (clustered) forward in my view is that it's basically going to rely on a big "uber-shader" to do, say, 90% of the heavy lifting of shading. The main issue with this (at least the one that I think of first) is that this creates the potential for a lot of code duplication, especially if you want to be able to dynamically change "graphics settings" (which is something I would say we definitely want to eventually). It's probably a good idea to have some sort of plan on how to manage that.

  2. Also, somewhat related, there's a few things mentioned but not explicitly talked about too much in Filament that are quite useful and relate to this... a couple off the top of my head being NPR tonemapping (such as false-color ramp based on luminance) for debugging, and then doing exposure adjustment on the original light radiance values, before all the lighting calculations happen, rather than doing exposure correction after lighting calculations: this allows the use of half-precision floats for all the shading calculations which is particularly valuable for low-end and mobile graphics processors.
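The pre-exposure idea in point 2 can be illustrated numerically. The following is a sketch, not Bevy code; the formulas follow the saturation-based exposure model described in the Filament docs, and all names and numbers are illustrative:

```rust
// Sketch of pre-exposure: scale light intensity by the camera exposure
// *before* shading, so intermediate radiance values stay within f16 range.
// Not Bevy code; formulas follow Filament's saturation-based exposure model.

/// Photometric exposure from aperture (f-number), shutter time (seconds),
/// and ISO sensitivity: exposure = 1 / (1.2 * 2^EV100).
fn exposure(aperture: f32, shutter: f32, iso: f32) -> f32 {
    let ev100 = (aperture * aperture / shutter * 100.0 / iso).log2();
    1.0 / (1.2 * 2f32.powf(ev100))
}

/// Pre-expose a light's intensity before any lighting calculations run.
fn pre_exposed_intensity(intensity: f32, exposure: f32) -> f32 {
    intensity * exposure
}

const F16_MAX: f32 = 65_504.0; // largest finite half-precision value
```

With "sunny 16" settings (f/16, 1/125 s, ISO 100), a roughly 100,000 lux sun pre-exposes to a value of about 2.6, comfortably inside half-float range, whereas the raw intensity overflows it. That is what makes f16 shading math viable on low-end and mobile GPUs.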


@StarArawn StarArawn commented Aug 14, 2020

One of the biggest downsides of (clustered) forward in my view is that it's basically going to rely on a big "uber-shader" to do, say, 90% of the heavy lifting of shading. The main issue with this (at least the one that I think of first) is that this creates the potential for a lot of code duplication, especially if you want to be able to dynamically change "graphics settings" (which is something I would say we definitely want to eventually). It's probably a good idea to have some sort of plan on how to manage that.

Hmm, I don't think you necessarily need to rely on one big uber shader. Godot does this and quite frankly it's a mess. Plus WSL won't support shader defines. Instead, if we really want a more modular, adaptive shader, we likely should build shaders from smaller pieces programmatically. The other option is to use shader includes as much as possible to dedup code and have different shaders as different files. Currently, though, the shader system in place does not support shader includes. I created an issue here about it: #185
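Building shaders from smaller pieces programmatically could look roughly like this. A minimal sketch; `assemble_shader` and the snippet names are hypothetical, not an existing Bevy API:

```rust
use std::collections::HashMap;

/// Hypothetical sketch of assembling a shader source string from named
/// snippets, as an alternative to `#include` support (which the shader
/// system did not have at the time). Names here are illustrative only.
fn assemble_shader(snippets: &HashMap<&str, &str>, order: &[&str]) -> Result<String, String> {
    let mut out = String::new();
    for name in order {
        // Fail loudly on a missing piece instead of emitting a broken shader.
        let src = snippets
            .get(name)
            .ok_or_else(|| format!("missing shader snippet: {}", name))?;
        out.push_str(src);
        out.push('\n');
    }
    Ok(out)
}
```

A shared "pbr_lighting" snippet could then be concatenated into both a forward and a clustered entry point, which is the deduplication goal discussed here.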

Also, somewhat related, there's a few things mentioned but not explicitly talked about too much in Filament that are quite useful and relate to this... a couple off the top of my head being NPR tonemapping (such as false-color ramp based on luminance) for debugging, and then doing exposure adjustment on the original light radiance values, before all the lighting calculations happen, rather than doing exposure correction after lighting calculations: this allows the use of half-precision floats for all the shading calculations which is particularly valuable for low-end and mobile graphics processors.

I'm not familiar with using NPR tonemapping in this way; do you have any articles or papers on the subject? On mobile, reducing render passes becomes much, much more important. I'm not convinced we should limit the desktop by mobile (low-end) constraints, though. Perhaps going down the path of splitting the renderer into two different graphs would make more sense. Internally we could build the graph based off of user settings and hardware limits.

Also this is an interesting read:
http://efficientshading.com/wp-content/uploads/s2015_mobile.pptx


@StarArawn StarArawn commented Aug 14, 2020

(i.e. very near term = ~3 months, little longer term = ~12 months)

And here I'm trying to add it in after the compute stuff is done. 😄 I don't mind waiting but adding clustered shading doesn't change a lot of stuff.


@aclysma aclysma commented Aug 14, 2020

What I had in mind for the 3-ish months was to spend some time improving and polishing a simple forward renderer. PBR, bloom, HDR... In particular, get shadows up and running because it touches a lot of rendering systems (multiple views, multiple passes in a material, can be done in parallel with other rendering stages).

Clustered forward rendering sounds to me like the direction we want to be headed and it seems like there is consensus on that so far. Could prototyping/R&D for clustered forward happen in parallel with fleshing out the current pipeline? I think this would help limit risk of things stalling out.


@StarArawn StarArawn commented Aug 14, 2020

Could prototyping/R&D for clustered forward happen in parallel with fleshing out the current pipeline? I think this would help limit risk of things stalling out.

My thought was that we could likely create a plugin for it (similar to bevy_pbr). bevy_pbr_clustered?


@termhn termhn commented Aug 14, 2020

@StarArawn

I'm not familiar with using NPR tonemapping in this way; do you have any articles or papers on the subject? On mobile, reducing render passes becomes much, much more important. I'm not convinced we should limit the desktop by mobile (low-end) constraints, though. Perhaps going down the path of splitting the renderer into two different graphs would make more sense. Internally we could build the graph based off of user settings and hardware limits.

Check out this part of the Filament docs https://google.github.io/filament/Filament.html#imagingpipeline/validation/scenereferredvisualization

Also, tonemapping (at least, global tonemapping, i.e. each fragment only has information about itself, which is the case for most tonemapping operators used in games) can simply be done as the last step of the main uber-shader; it doesn't necessarily need to be a separate render pass.
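As a concrete illustration of global tonemapping folded into the main shader, here is a CPU-side Rust sketch using the simple Reinhard operator (the operator is chosen purely for illustration; nothing in this thread fixes the actual choice):

```rust
// Sketch of global tonemapping as the final step of the main shader, written
// as CPU-side Rust for illustration. Reinhard is an example operator only.

/// Reinhard tonemapping: maps [0, inf) HDR radiance to [0, 1) per channel.
fn reinhard(hdr: [f32; 3]) -> [f32; 3] {
    [
        hdr[0] / (1.0 + hdr[0]),
        hdr[1] / (1.0 + hdr[1]),
        hdr[2] / (1.0 + hdr[2]),
    ]
}

/// Because a global operator only needs the fragment's own radiance, it can
/// run at the end of the lighting shader instead of in a separate render pass.
fn shade_fragment(lit_radiance: [f32; 3]) -> [f32; 3] {
    // ... lighting calculations would happen above this line ...
    reinhard(lit_radiance)
}
```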


@cart cart commented Aug 14, 2020

Lots of great conversation happening here. I really appreciate the thoughtfulness and expertise you all are bringing to the table.

I will defer to you all here when it comes to clustered. It seems like a good place to start. Ideally we experiment with multiple paradigms and build things in a way that makes them reusable across paradigms. The Armory project is a pretty good example of supporting forward and deferred with the same pieces. However that isn't a hard requirement. We can always try to modularize later if we need to.

Uber shaders don't scare me as an output, but they do definitely scare me from an organizational standpoint. Using imports to create scoped (and ideally reusable) pieces of shader logic seems like the right call, with or without uber shaders.

Uber shaders can perform quite well in some contexts (ex: the dolphin emulator project had great success with their "uber shader" effort).

The Filament docs really are great. We should probably start a collection of rendering resources that we can all learn from. When you start implementing, please record in the repo what sources you used (both for giving appropriate credit and to help new contributors).

I'm starting to consolidate my thoughts on what the Bevy development process should look like. I think the main bevy crates (and bevy repo) should be for building our current best ideas for "final" implementations of core functionality. Ex: eventually bevy_pbr will contain the default pbr plugin that everyone uses. Pushing code there will be a signal that we have made a decision to take that crate in a given direction.

But spending time discussing and worrying about building the 100% correct solution will stall us and force us to get caught up in theoreticals. Almost without exception I think we should be building "fast and loose" prototype code outside of the main bevy repo, probably with some naming convention like bevy_exp_pbr_clustered, bevy_pbr_clustered_prototype, etc. Descriptiveness (when there can and should be multiple competing implementations) is ideal.

In the short term, I encourage you all to create and distribute your own crates for experimentation (while being respectful of the core bevy_XXX namespace). As specific implementations gain momentum and stability, we can then start discussing centralization of efforts.

I'll try to give appropriate visibility into the various distributed projects to help direct people's attention and avoid duplicate work.

I'll also be setting up working groups for specific focus areas (and PBR will be one of them).


@StarArawn StarArawn commented Aug 14, 2020

Almost without exception I think we should be building "fast and loose" prototype code outside of the main bevy repo, probably with some naming convention like bevy_exp_pbr_clustered, bevy_pbr_clustered_prototype, etc.

This I guess raises the next question: how do we separate out the shaders into different crates? Ideally we should have a way of sharing the PBR implementation between a bevy_clustered and a bevy_forward plugin. Calculating the lighting works exactly the same between forward and forward clustered. I'm not sure I want to replicate PBR work inside a bevy_pbr_clustered_prototype plugin. If possible, the PBR plugin should likely create an include that can be shared with other plugins. However, we currently don't have a good mechanism for including shaders.


@cart cart commented Aug 14, 2020

Yeah, I agree that long term, breaking them up into separate crates could be beneficial (or alternatively, just separate modules in the same crate). Short term, I expect making crate divisions will hamper productivity. But if that workflow works for the implementors, I'm cool with it.

As you saw, right now shader includes don't work. I don't see a huge problem with building "big shaders" first and then breaking them up later. But it's very possible we can make includes work with a small amount of effort. I just don't want to waste too much effort on that when naga is so close.


@termhn termhn commented Aug 15, 2020

Uber shaders don't scare me as an output, but they do definitely scare me from an organizational standpoint. Using imports to create scoped (and ideally reusable) pieces of shader logic seems like the right call, with or without uber shaders.

Uber shaders can perform quite well in some contexts (ex: the dolphin emulator project had great success with their "uber shader" effort).

Yeah, these are my thoughts as well, and what I wanted to convey originally.


@aclysma aclysma commented Aug 15, 2020

We had a short discussion in discord tonight RE: next short-term steps. I'll summarize here for further discussion:

Now:

  • PBR (which implies HDR)
  • Bloom (to convey HDR)
  • Shadowing (forces us to build out a real "pipeline")
  • Papercuts with current implementation

Later:

(I don't think this needs to block someone doing R&D for forward clustered rendering as a longer-term project on the side.)

@cart cart added the focus-area label Aug 20, 2020
@cart cart changed the title Clustered Forward Rendering PBR / Clustered Forward Rendering Aug 20, 2020

@StarArawn StarArawn commented Aug 20, 2020

I opened a draft PR #261 which is a somewhat working implementation of Google's Filament. I think it's a good starting point for getting a feel for how we want PBR to work in bevy.


@StarArawn StarArawn commented Aug 20, 2020

Thinking about bind groups and how limited we are on them, I came up with the following (WIP) non-exhaustive list:

Textures:

// Standard PBR textures
albedo: Texture - set 3 binding 1
albedo_sampler: Sampler - set 3 binding 2
normal_map: Texture - set 3 binding 3
normal_map_sampler: Sampler - set 3 binding 4
combined_roughness_metallic: Texture - set 3 binding 5
combined_roughness_metallic_sampler: Sampler - set 3 binding 6
ambient_occlusion: Texture - set 3 binding 7
ambient_occlusion_sampler: Sampler - set 3 binding 8
emissive: Texture - set 3 binding 9
emissive_sampler: Sampler - set 3 binding 10

Lighting:

// Lights
light_buffer: Buffer - set 1 binding 0

// Cluster Info
light_cull_list: Buffer - set 1 binding 1
frustums: Buffer - set 1 binding 2 // Camera frustum separated out into smaller frustums

// Shadows
// Represent directional/point/spot shadow maps. Point light shadow maps would be as a cube map in 2D space.
// Optionally we can drop this down to a single shadow map atlas texture, however I'm not sure if that's any better.
// From my testing you can more smartly allocate/de-allocate memory using arrays, however it does eat up an 
// additional 3 slots for textures.
// The resolution and the number of allocations per resolution could be a user setting or we can hide it behind
//  a single setting called ShadowQuality. 
shadow_map_1: Texture2DArray - set 1 binding 3  // highest resolution shadow maps ex: 1024x1024 x 8
shadow_map_2: Texture2DArray - set 1 binding 4 // medium resolution shadow maps ex: 512x512 x 16
shadow_map_3: Texture2DArray - set 1 binding 5 // low resolution shadow maps ex: 256x256 x 32
shadow_map_4: Texture2DArray - set 1 binding 6 // lowest resolution shadow maps ex: 128x128 x 64
shadow_sampler: Sampler - set 1 binding 7 // Note: We might want more samplers depending on the light type.

// IBL
// Similar to the storage/allocation strategy for shadow maps.
// Internally to the texture cube array we would have groups of two probes(diffuse irradiance, specular).
probe_map_1: TextureCubeArray - set 1 binding 8 // highest resolution probe maps
probe_map_2: TextureCubeArray - set 1 binding 9 // medium resolution probe maps
probe_map_3: TextureCubeArray - set 1 binding 10 // low resolution probe maps
probe_map_4: TextureCubeArray - set 1 binding 11 // lowest resolution probe maps
probe_sampler: Sampler - set 1 binding 12

This brings us to 13 sampled textures, which gives us a little wiggle room under maxSampledTexturesPerShaderStage: 16.

We can request more than 16, but remember that it will then no longer align to the WebGPU min spec, won't run on all hardware, and may not run on the web at all.
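The tiered shadow-map allocation and the texture-limit budget above can be sketched as follows. The `High` tier numbers come from the list above; the `Low` tier values, and every name here, are made-up illustrations rather than Bevy API:

```rust
// Illustrative sketch of the tiered shadow-map allocation and WebGPU texture
// budget discussed above. Not Bevy API; `Low` tier values are invented.

#[derive(Clone, Copy)]
enum ShadowQuality {
    High,
    Low,
}

/// WebGPU minimum-spec limit on sampled textures per shader stage.
const MAX_SAMPLED_TEXTURES_PER_STAGE: u32 = 16;

/// (resolution, layer count) for each of the four shadow map array tiers.
/// `High` mirrors the example above (1024x1024 x 8 down to 128x128 x 64);
/// `Low` is a hypothetical setting one step down.
fn shadow_tiers(quality: ShadowQuality) -> [(u32, u32); 4] {
    match quality {
        ShadowQuality::High => [(1024, 8), (512, 16), (256, 32), (128, 64)],
        ShadowQuality::Low => [(512, 8), (256, 16), (128, 32), (64, 64)],
    }
}

/// Total sampled textures in the proposed layout:
/// 5 material maps + 4 shadow arrays + 4 probe arrays = 13.
fn sampled_texture_count() -> u32 {
    let material_maps = 5;
    let shadow_arrays = 4;
    let probe_arrays = 4;
    material_maps + shadow_arrays + probe_arrays
}
```

Keeping the count at 13 leaves three slots under the 16-texture minimum, which is the "wiggle room" mentioned above.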

@cart cart pinned this issue Aug 23, 2020
@aclysma aclysma mentioned this issue Aug 24, 2020
@cart cart added this to In Progress in Roadmap Sep 10, 2020

@zicklag zicklag commented Oct 24, 2020

It might not be ready for use for a while, but it would be awesome if we could leverage Embark's recently announced Rust GPU project:

https://github.com/EmbarkStudios/rust-gpu
