March 19th, 2007

Hardware Occlusion And Portal Culling

One of the most important requirements for a game engine is some form of aggressive culling. If you take a look at any major 3D FPS written within the past 10 years, all use some sort of culling algorithm. Traditionally, techniques such as BSPs were employed (made famous by the Quake series); however, the problem with BSPs is that they must be computed offline, and they impose some limits on your scene geometry. In other words, they are static.

This doesn’t work well in today’s games, where levels are often highly dynamic, possibly even destructible via physics.

One more recent approach that has surfaced to deal with dynamic environments is Portal Culling. The idea is fairly simple: divide the world into Zones or Sectors (call them what you will) - these are essentially “rooms”. For each opening into the sector (i.e. window, doorway, etc.) add a portal. Portals are used to connect different sectors together. Then, at runtime, find all the portals in the room your camera is in that are in the view frustum. For each of these, create a new frustum extending from the portal, and find all the portals visible to that. You do this recursively until you cannot find any more portals, and then you check to see which of these portals are also visible to your original view frustum. If they are, then the sectors the portals belong to must be visible, and therefore you can render those sectors.
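To make the recursion concrete, here is a rough Python sketch. It is a simplification and not engine code: it works in 2D (top-down), treats the frustum as an angle interval around a camera at the origin looking down +x, and narrows that interval through each portal (the common variant) instead of building a fresh frustum at the portal and re-checking against the original. The `Portal` and `Sector` types and the sector names are made up for illustration, and portals that wrap around behind the camera are not handled.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Portal:
    a: tuple      # endpoints (x, y) in camera space; camera at origin, looking down +x
    b: tuple
    target: str   # name of the sector on the other side


@dataclass
class Sector:
    name: str
    portals: list = field(default_factory=list)


def angular_extent(portal):
    """Angle interval (radians) the portal covers as seen from the camera."""
    angles = [math.atan2(y, x) for x, y in (portal.a, portal.b)]
    return min(angles), max(angles)


def visible_sectors(sectors, start, fov_lo, fov_hi):
    """Collect every sector reachable through portals that overlap the view."""
    visible = set()

    def visit(name, lo, hi, path):
        visible.add(name)
        for portal in sectors[name].portals:
            if portal.target in path:
                continue  # don't walk back into a sector already on this path
            if portal.a[0] <= 0 and portal.b[0] <= 0:
                continue  # entirely behind the camera
            p_lo, p_hi = angular_extent(portal)
            n_lo, n_hi = max(lo, p_lo), min(hi, p_hi)
            if n_lo < n_hi:  # portal overlaps the current (narrowed) frustum
                visit(portal.target, n_lo, n_hi, path | {portal.target})

    visit(start, fov_lo, fov_hi, {start})
    return visible


# Hypothetical level: the vault doorway sits off to the side of the hall doorway.
world = {
    "hall": Sector("hall", [Portal((5, -1), (5, 1), "lab")]),
    "lab": Sector("lab", [Portal((10, 5), (10, 7), "vault"),
                          Portal((10, -0.5), (10, 0.5), "yard")]),
    "vault": Sector("vault"),
    "yard": Sector("yard"),
}
print(sorted(visible_sectors(world, "hall", -math.pi / 4, math.pi / 4)))
```

With a 90 degree field of view the vault is culled, because its doorway cannot be seen through the hall doorway, while the yard stays visible.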

Because portals are arbitrary, their position and the sectors they are linked to can be moved around dynamically. Also, because there is no pre-processing, portal culling works well in dynamic scenes where portals can be added/moved/resized/removed/etc. However, there are some weaknesses:

Overdraw: Because portal culling uses a bounding box intersection with the view frustum, there is a possibility of a portal being in the view frustum but not actually visible. A good example of this is to imagine a room with two doorways, opposite each other. In the middle of the room is a large block - large enough to obscure the opposing door when standing behind it. Portal culling, because it works only with portal and sector intersections, would have no knowledge of the block, and therefore would consider the opposing yet occluded doorway to be visible, potentially causing a very large overdraw (if there are several visible rooms beyond that one). Now, it’s true that you could pre-process the scene and therefore take into account the large block - however, this would defeat the purpose of dynamic portal culling.

Large Portals: In order for us to create a frustum at each portal, we must be given a location to set our camera. However, the portal itself is not a singular point - it might be a massive 24km wide window. Obviously, in such a case using a 45 degree viewing angle (or whatever view angle it is you are using) is not sufficient. While there are ways around this (upping the angle to 180 degrees), most require artist intervention (which prevents dynamic creation of portals) and can lead to over/underdraw.

 It is obvious then, that the weakness in portal culling is the fact that we are using simple frustum checks, which do not accurately represent the geometry in our world. What can we do about that? Well, starting with DX9, we have access to hardware occlusion.

Hardware occlusion, in theory, is pretty simple. Render all occluders to a render surface, and save the z-buffer info. Next, test the bounding box of the object you want to check against that z-buffer, and determine how many of its pixels would actually be drawn (depth < occluder’s depth). DX9 implemented a way to do this directly on the hardware, bypassing slow drivers, which makes this test fairly quick if used properly. For our purposes, we can replace all of our frustum checks with simple hardware occlusion queries. What are the advantages? Well:
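The idea behind the query can be sketched on the CPU. The Python below is only a toy simulation of what the hardware does, not DX9 or TrueVision calls: occluders and test boxes are flat screen-space rectangles at a constant depth, and all function names are made up for the example.

```python
def make_depth_buffer(width, height):
    """Depth buffer cleared to 'infinitely far'."""
    return [[float("inf")] * width for _ in range(height)]


def _clipped(depth, x0, y0, x1, y1):
    """Yield the (x, y) pixels of a half-open rectangle that fall inside the buffer."""
    for y in range(max(y0, 0), min(y1, len(depth))):
        for x in range(max(x0, 0), min(x1, len(depth[0]))):
            yield x, y


def draw_occluder(depth, x0, y0, x1, y1, z):
    """Render an occluder: keep the nearest depth at every covered pixel."""
    for x, y in _clipped(depth, x0, y0, x1, y1):
        depth[y][x] = min(depth[y][x], z)


def occlusion_query(depth, x0, y0, x1, y1, z):
    """Count pixels of the test rectangle that pass the depth test (nothing is written)."""
    return sum(1 for x, y in _clipped(depth, x0, y0, x1, y1) if z < depth[y][x])


depth = make_depth_buffer(16, 16)
draw_occluder(depth, 4, 4, 12, 12, z=5.0)             # the large block
print(occlusion_query(depth, 6, 6, 10, 10, z=10.0))   # doorway fully behind it: 0
print(occlusion_query(depth, 2, 6, 6, 10, z=10.0))    # doorway half hidden: 8
```

A result of zero means the object can be culled; any non-zero count means at least part of its bounding box would reach the screen. Note how small the buffer is: the occlusion surface does not need to match the screen resolution.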

Occluder Awareness: Because hardware occlusion automatically handles any occluding geometry, the scenario presented earlier with the large box in the room would be handled correctly - the opposite doorway would be considered not in view and culled away. This is done automatically - because the box is in a visible room, we can safely assume that the box is an occluder and quickly render it to the occluder surface. If the box moves, the occlusion surface is updated accordingly and everything works without problems. This is especially useful for things like doors that can open/shut arbitrarily.

No More View Frustums: Because we are doing away with view frustum checks, we are also removing their inherent weakness (requiring a position). Now, portals of any size can be used without presenting a problem. Also, because we are including occluding geometry in the check, there’s no reason to create multiple view frustums and check them against the main view.

Essentially, hardware occlusion-based portal culling gives us virtually NO overdraw, which can make a big difference in complicated scenes.

So, by now you’re wondering, “what’s the catch? Why isn’t everyone using this then?” Well, the answer, put simply, is that hardware occlusion queries are not free. In fact, they are fairly expensive. First, you burn some fillrate rendering to your occlusion surface (though that surface can be very tiny). Second, you must access the z-buffer to check whether an object is occluded or not. Third, because this is all done on the GPU, the CPU and GPU aren’t going to be synced up, and the CPU generally has to wait for the GPU to catch up.

However, these issues can be overcome. First, only do occlusion checks when they are necessary. Second, batch all GPU calls together in order to reduce lag.
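Why batching helps can be shown with a toy model. The `FakeGpu` class below is entirely made up (it is not a real API, and the pixel counts are pre-baked answers): it only models the fact that asking for a result which isn't ready forces the CPU to wait until the GPU has worked through everything submitted so far.

```python
class FakeGpu:
    """Toy stand-in for a GPU command queue; not a real graphics API."""

    def __init__(self):
        self.pending = []   # queries submitted but not yet executed
        self.results = {}
        self.stalls = 0     # how many times the CPU had to wait for the GPU

    def issue(self, query_id, visible_pixels):
        self.pending.append((query_id, visible_pixels))

    def result(self, query_id):
        if query_id not in self.results:
            self.stalls += 1                      # CPU blocks until the GPU catches up
            for qid, pixels in self.pending:
                self.results[qid] = pixels
            self.pending.clear()
        return self.results[query_id]


def check_one_at_a_time(gpu, portals):
    visible = {}
    for name, pixels in portals.items():
        gpu.issue(name, pixels)
        visible[name] = gpu.result(name) > 0      # waits after every single query
    return visible


def check_batched(gpu, portals):
    for name, pixels in portals.items():
        gpu.issue(name, pixels)                   # submit everything first
    return {name: gpu.result(name) > 0 for name in portals}


portals = {"door_a": 120, "door_b": 0, "window": 45}
slow, fast = FakeGpu(), FakeGpu()
check_one_at_a_time(slow, portals)
check_batched(fast, portals)
print(slow.stalls, fast.stalls)
```

Both approaches give the same visibility answers, but in this model the interleaved version waits once per portal while the batched version waits once in total.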

In an all-out speed battle, frustum-based portal culling will almost always outperform hardware occlusion-based portal culling (checking a bounding box against a frustum is pretty damned fast). However, something I haven’t mentioned before is implementation time. In order to be robust, it is a fair bit more difficult to do frustum portal culling. Occlusion portal culling took me roughly a day and a half to implement and tweak. That’s pretty quick for something so vital!

 

As a note, I don’t actually use DX directly. Sylvain added hardware occlusion to yesterday’s TrueVision 6.5 beta release, and it’s just a matter of a few calls to the engine. However, I believe the calls mirror DX’s fairly closely.


February 18th, 2007

Preliminary Assembler Completed!

The first version of PixelCraft’s shader assembler (the part that compiles the nodes down into a useable shader) works!

I’ve spent the last couple of days coding pretty much every minute of my spare time to get this done (exams are coming up), and it’s just an amazing feeling to have it working. The last 3 months of planning and development have been geared towards this point - PixelCraft now actually does something useful!

It’s 1am and I have to go volunteer at 8am, and I’m bloody tired, so I’ll post more information tomorrow when I get the time to do a proper job.

 Party time!

February 12th, 2007

ShaderCraft - PixelCraft

If you look closely at the screens, you’ll notice that my app is actually named PixelCraft and not ShaderCraft. Now, THAT is an embarrassing mistake - I forgot I changed the name =(

Aside from that, things are progressing well. Almost all the small annoying UI bugs (that I know about) have been resolved. I’ve added the project and node explorers, along with XML saving/loading of nodes and the ubershader. Grouping has been set up, so you can choose to compile only group A shaders, for example. This lets you automatically create several versions of a shader: say, one for 1 dir light and 2 point lights, one for 1 point light and 2 dir lights, etc. I’ve also just started writing the compiler today.

So far the compiler traverses the node tree, separates all the techniques and passes, and stores proper depth info (so I know which nodes are dependent on each other, and can compile them in order). I have a lot of school-related stuff coming up this week so progress will be slow, but I’m hoping to have a concept demo out for a few people to test within the next 3 weeks.
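The depth bookkeeping can be sketched roughly like this in Python. This is an illustration of the general idea rather than PixelCraft's actual code: the node names and the `inputs` dictionary layout are invented, and a node's depth is assumed to be one more than the deepest node it reads from.

```python
def compile_order(inputs):
    """inputs maps a node name to the list of nodes it reads from.

    Returns all nodes sorted so that every node comes after its dependencies.
    """
    depth = {}

    def depth_of(node, stack=()):
        if node in stack:
            raise ValueError(f"cycle through {node}")
        if node not in depth:
            deps = inputs.get(node, [])
            # Nodes with no inputs get depth 0; others sit one level below their deepest input.
            depth[node] = 1 + max((depth_of(d, stack + (node,)) for d in deps), default=-1)
        return depth[node]

    for node in inputs:
        depth_of(node)
    return sorted(depth, key=lambda n: (depth[n], n))


# Hypothetical node graph for a simple lit material.
graph = {
    "output": ["lighting"],
    "lighting": ["diffuse_tex", "normal_tex"],
    "diffuse_tex": ["texcoord"],
    "normal_tex": ["texcoord"],
    "texcoord": [],
}
print(compile_order(graph))
```

Sorting by depth guarantees that by the time a node is compiled, everything it reads from has already been emitted.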

February 5th, 2007

New Look

After being inspired by Office 2007’s awesome new ribbon bar, I’ve decided to completely revamp ShaderCraft’s UI. I haven’t had a lot of time lately, so the change has taken some time (not to mention the plethora of bug fixes I’ve done), but things are coming along!

 


January 26th, 2007

ShaderCraft Updates (among other things)

First off, I’d like to apologize for neglecting my blog so much. A lot has been going on in my life, development wise and personal wise. But, work has progressed despite the lack of updates!

First, ShaderCraft finally now has both a name, and something to show for itself. Graph construction is pretty functional, and fairly error proof, even sorting out what inputs/outputs can and cannot connect (pre-lim error checking). The compiler is about 20% complete, and I hope to have that functioning within the next couple weeks.

Screenshot

Thumbnail of ShaderCraft

 

Aside from that, I’ve also done a bit of other shaderwork. Slight progress has been made with my shadowing demo, and I managed to complete an entire fur demo in just a couple of days =)

Follow this link to view screenshots, download a video, and if you’re a lucky owner of a TrueVision3D license, you can download the source (VB.Net 2k5).

 

Finally, Arius and Potato (both from TrueVision3D) and I have decided to reform PAB, and continue development on our FPS demo. My job, of course, is shader development (though I might have to move on to networking as well, if no one else decides they want to tackle the beast. I don’t blame them). I hope to have ShaderCraft at a functional level fairly soon, so that I can begin prototyping. I expect to have the entire HDR post-processing pipeline complete within the next month.

December 2nd, 2006

Alpha-Tested Shadows

One of the greatest drawbacks to stencil shadows is the inability to properly shadow alpha-tested objects; that is to say, an object with a transparent texture. Take, for example, a chain link fence. Many FPS games involve fences of some kind in one area or another, and in most cases these fences are nothing more than a simple quad with an alpha texture applied. It looks great, and it’s cheap; however, it falls apart as soon as you attempt to shadow it. Stencils work by extruding the silhouette of a mesh, and since the transparency is a texture effect and not part of the mesh geometry, it isn’t possible to capture that in the stencil buffer.

However, this problem is resolved with shadowmapping. Because an object is drawn into the depth buffer pixel by pixel, it is possible to do a lookup on the object’s diffuse texture, and determine whether or not the alpha component is solid. If it is, then continue to write the depth. If not, kill the pixel (texkill is wonderful) and do not write out a depth (no shadow). It works great!

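// Globals the excerpt relies on; assumed to be declared elsewhere in the effect file.
float4x4 world;
float4x4 viewProj;
float4 lightPos;
sampler2D DiffuseMapSampler;

// Vertex shader output; layout inferred from the pixel shader's input semantics.
struct vsOut {
    float4 pos      : POSITION;
    float  depth    : TEXCOORD0;
    float2 texCoord : TEXCOORD1;
};

void VS(in float3 pos : POSITION, in float2 texCoord : TEXCOORD0, out vsOut output) {
    // World-space position, used to measure the distance to the light.
    output.pos = mul(float4(pos, 1.0), world);
    output.depth = length(output.pos.xyz - lightPos.xyz);
    output.pos = mul(output.pos, viewProj);
    output.texCoord = texCoord;
}

float4 PS(in float depth : TEXCOORD0, in float2 texCoord : TEXCOORD1) : COLOR {
    // Discard the pixel when alpha is below 0.5, so no depth is written (no shadow).
    clip(tex2D(DiffuseMapSampler, texCoord).a - 0.5f);
    return float4(depth, 0.0, 0.0, 1.0);
}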

As you can see here, it’s simple enough to implement. There is a rather major drawback though - texkill (clip in HLSL) interferes with a 3D card’s parallelism, and combined with an extra texture read, really chews up performance. However, I’m not too worried about this - very few objects in the world will be required to use this special shader vs. the standard depth map shader.

December 1st, 2006

Not Forgotten

 VIDEO LINK: Foamy’s Rancor model, casting soft shadows (35 megs). 

Well now, it’s been quite a while since I last posted any updates. I’m very sorry for the lack of news - it’s due in part to the fact that with school being so rush rush, I don’t have as much time for coding as I used to, and also due in part to the fact that I’ve spent all the spare time I’ve had coding instead of blogging =)

A lot of work has transpired since the last post though. I’ll give a list here, though I’m sure I’m missing some things.

  • Demo: First off, actual work on my public demo is underway! Right now it’s just a basic little room I made in 3ds Max with some physics-enabled objects, and the camera floats around shooting out boxes or spheres, etc. I’ll speak a little bit more about this in a bit.
  • Lighting error fixes: I’ve spent a solid 4 days pulling my hair out about this. Apparently, every single screenshot I’ve taken up until yesterday has been suffering from incorrect lighting. I hadn’t realized this, because all of my scenes were static, but the lighting shader would break as soon as some kind of translation/rotation was applied to an object. I realized this when I added shooting cubes to the demo. It turns out I wasn’t properly converting everything into tangent space (hence my problem with translation).. a simple error, but it required me to rewrite the entire shader (that’s what you get for just plunging into lighting without much research). However, now lighting is correct for all objects, and simpler, and cleaner. Also, the parallax I had implemented before I had actually taken out of the demo because it looked so ugly - turns out I was normalizing the view vector after grabbing the new offset tex coords, and not before. Switching that around made everything look fantastic!
  • Optimizations: Because I was forced to rewrite my shader, and with a little help from Zaknafein, the new lighting shader is far better optimized. I have a few free instructions, which has allowed me to bring back distance-based PCF and attenuation. Some of the optimizations included creating a TBN matrix, removing the light dir vec (redundant with a properly transformed light vec), using the dot product of a falloff range with itself for attenuation, etc. Also, I’ve been playing around with different cube map sizes for the lights.. and you can get acceptable quality even from a 128-pixel cube map (128 x 128 on each of the 6 faces), assuming the light isn’t very large. This has given me some ideas..
  • Actors: Thanks to a request from Foamy (a member at TrueVision3D - very talented artist) I spent part of today adding support for actors and shadowmapping. Surprisingly, it’s gone over very well, with few issues. This really shows off the power of shadowmapping - seamless self-shadowing, and soft-shadowing on all objects. Try doing that with stencils, and you get some horrible artifacts. Unfortunately, right now the actors need to be transformed on the CPU, since I haven’t added skinning support (would require a separate shader dedicated to actors). I’ll get to this in the future.
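The attenuation trick from the optimizations item deserves a quick illustration. What follows is a common form of it, written in Python rather than HLSL; whether it matches the shader line for line is an assumption. The light vector is divided by the falloff range, and its dot product with itself gives a squared, normalised distance without any square root.

```python
def attenuation(light_pos, surface_pos, falloff_range):
    """Light intensity factor: 1.0 at the light, fading to 0.0 at falloff_range."""
    # Scale the light vector by 1/range so its squared length reaches 1.0 at the range.
    v = [(l - s) / falloff_range for l, s in zip(light_pos, surface_pos)]
    d2 = sum(c * c for c in v)    # dot(v, v): squared distance, no sqrt needed
    return max(0.0, 1.0 - d2)     # clamp so points beyond the range get no light


print(attenuation((0, 0, 0), (5, 0, 0), 10.0))   # halfway out: 0.75
```

The falloff is quadratic rather than linear, which is part of the appeal: it costs a multiply, a dot product and a clamp per pixel.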

As you can see, I’ve been hard at work =) There have been a few hangups though. The most notable right now is that I have a feeling that actors do something funny with their transformation matrix. While I can’t pinpoint if my lighting is bad on actors (it looks right, but I can’t use an untextured actor due to a TrueVision bug), I know for a fact attenuation doesn’t work. I need an insane falloff range for a light to affect an actor.. and yet it affects the surrounding scene as expected. This would mean that I’m not able to grab the vertex distance properly, but that doesn’t make sense, since shadows are cast correctly. This is going to require a chat with Sylvain.

Additionally, the Max7 exporter hasn’t been updated to work with the new engine release, so I’m stuck using Panda exporter and .x files. While things have worked out ok for now, I would much rather export to .TVM directly.. I wish they would hurry and release it.

I do have a few goals though, before I release the demo, aside from these bug fixes.

  • Multiple lights: This, imho, is essential to having useful point light shadows. Unlike directional lights, of which there are generally only one in a scene, point lights and spot lights often exist in abundance. Unfortunately, this means I’m going to have to write a very tight shadow manager.
  • Shadow manager: An absolute must. The idea is that it’s a complete waste to do two things: render lights that don’t need to be updated, and give lights too much/too little resolution. This means that my manager needs to figure out which faces of which light need to be updated each pass (otherwise, 2 lights * 6 faces = 12 renders JUST for shadows), and then perform the updates over several frames to keep things smooth, based on how close the shadow is to the user, how large it is, how many objects it affects, and its resolution. Also, render surfaces and cubemaps take up large amounts of memory, and rendering to a 1024 map is much slower than a 256 map. Small light sources require smaller maps, and maps should be dynamically resized based on the distance of receiver to camera, importance of the light vs. others, etc. Wow, that’s a lot of work to do!
  • Multiple lighting models: In order to demonstrate both the modularity and flexibility of shadowmapping, I want to support more than basic parallax mapping. Planned types include anisotropic (metal), fur, and perhaps some kind of chromatic dispersion.
  • Alpha test objects: One of the greatest knocks against stencil shadows is the inflexibility towards alpha-tested objects (i.e. chain link fence). Because it uses silhouette extrusion, and can’t read in texture information, you end up with a big blocky shadow. I’d like to be able to support transparent textures.. possibly using tex kill in the depth shader.
  • Projection lights: I’m not super sure about this one, but I might want to support texture-projecting lights. Maybe. It doesn’t seem like a major feature people want.
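The shadow manager goal could start out as something like the sketch below. Everything in it is a placeholder: the `FaceUpdate` fields, the distance-only priority and the "drop one size every 10 units" rule are made-up heuristics to show the shape of the problem, not a worked-out design.

```python
from dataclasses import dataclass


@dataclass
class FaceUpdate:
    light: str
    face: int          # cube map face index, 0..5
    distance: float    # distance from the camera to the shadowed area
    dirty: bool        # something moved inside this face since its last render


def pick_resolution(distance, sizes=(1024, 512, 256, 128), step=10.0):
    """Drop to the next smaller map for every `step` world units (made-up heuristic)."""
    index = min(max(int(distance // step), 0), len(sizes) - 1)
    return sizes[index]


def schedule(faces, budget):
    """Return up to `budget` dirty faces to re-render this frame, nearest first."""
    dirty = sorted((f for f in faces if f.dirty), key=lambda f: f.distance)
    return dirty[:budget]


faces = [
    FaceUpdate("lamp", 0, 30.0, True),
    FaceUpdate("lamp", 1, 5.0, True),
    FaceUpdate("torch", 2, 12.0, False),   # nothing moved here, so it is never re-rendered
    FaceUpdate("torch", 3, 8.0, True),
]
for f in schedule(faces, budget=2):
    print(f.light, f.face, pick_resolution(f.distance))
```

Faces that miss the budget stay dirty and get picked up on a later frame, which spreads the cost out instead of paying for all 12 renders at once. A real priority would also weigh shadow size and the number of objects affected.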

My, that’s a long post! I’ll have to post more frequently to avoid doing this again.

October 15th, 2006

Bash

 (Ike_Aran) Our health teacher told us that “1 out of 3 people who start smoking will eventually die.” The other two apparently became immortal.

lol =)

October 12th, 2006

Bloom!

So I’ve kinda quickly thrown together a bloom shader to simulate light hot-spots and lens imperfection. It’s the new lens-flare, but hey, it does look slick if used right. It’s a little strong here because the test scene wasn’t setup for it, but I think the results are fairly decent. I will be migrating to full HDR as soon as I find the time to do it right.

As you can see, the fillrate cost of the Gaussian blurs cuts my fps almost in half. This is definitely not acceptable, and I’ll be working on correcting that. Here’s a good link I’ll be referring to. Once I finish sorting out the bloom, I’ll be able to really go to town with the public shadow demo =D.

October 2nd, 2006

Cubic Shadowmapping Shader Almost Complete!

I’ve been on a sort of a roll today and I’ve almost finished my shader-based work with cubic shadowmaps! Now the shader supports parallax mapping through a dedicated height map, and you can control the specular value both via a global multiplier and per-pixel through the alpha channel of the height map. Additionally, the shader source itself has been cleaned up and organized, I’m about to comment it, and I’ve added support for TV mesh semantics (texture stages). Now to throw together a little test room and share it with everyone =D
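For anyone unfamiliar with parallax mapping, the core of it is a small texture coordinate offset. Here is the standard offset formula sketched in Python instead of HLSL; the `scale` and `bias` defaults are illustrative values, not the ones in my shader.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def parallax_uv(uv, height, view_ts, scale=0.04, bias=-0.02):
    """Shift texture coordinates along the tangent-space view direction.

    height is the height map sample in 0..1; view_ts points from surface to eye.
    """
    view = normalize(view_ts)        # normalise BEFORE computing the offset
    h = height * scale + bias        # remap the height sample to a small signed offset
    return (uv[0] + h * view[0], uv[1] + h * view[1])


print(parallax_uv((0.5, 0.5), 1.0, (3.0, 0.0, 4.0)))
```

Normalising the view vector first matters: without it, the size of the shift depends on how far the eye is from the surface instead of only on the viewing angle.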

I’ve also begun preliminary work on my shader tech demo for the TrueVision3D tech demo contest (tba). I’ve got some pretty cool ideas, including a completely custom shader-driven lighting pipeline with soft shadowmapping, fur, full HDR with tonemapping, anisotropic lighting, reflection/refraction and more! It’ll also be fully physics enabled. Stay tuned!

Note: Be sure to check out Zaknafein’s work on directional shadowmapping - it’s good stuff!
