
OpenGL Minecraft Style Volume Rendering

To help reinforce my knowledge of OpenGL and C++ I’ve been working on a project to build a rendering engine which can then be used as the basis for a game or other type of 3D application. I opted to implement a rendering engine that renders volumes in a ‘blocky’ style similar to that of the game Minecraft.

Minecraft is a hugely popular indie game set in a world composed of destructible blocks. Players can add or remove blocks at will or explore a vast, randomly generated world of rolling hills and subterranean networks. I decided that writing an engine that performed rendering like Minecraft would be a good exercise as there is plenty of scope for learning and it is simple enough that I’d be able to get something that looked semi-decent running quickly.

If you’re looking for a complete implementation of this kind of engine then you should check out PolyVox.

I frequently refer to parts of my code which are a bit too large to paste here. The code can be found in my GitHub repository at https://github.com/pabennett/glblox.

Tools

I’m using a mix of Python and C++ on this project, but I might move everything into C++ at a later stage.

Pyglet and gletools are used in conjunction with Python for rapid prototyping of new ideas and for providing a window manager and mouse/keyboard input. Much of the functionality is written in C++, which uses the OpenGL Mathematics library (GLM) and GLEW as dependencies.

Volume Storage

In Minecraft the world environment is stored as a voxel volume, which is rendered as polygons by extracting a polygonal mesh from the volume data. Blocks can be added to or removed from the environment by modifying the volume data and then regenerating the polygonal mesh; the engine has to be able to do this at interactive speeds without any visible pauses, so making this process efficient is important. Minecraft subdivides a large volume (such as a game world) into smaller volumes referred to as ‘chunks’ in order to speed up regenerating the polygon mesh after terrain modifications or when loading new terrain sections. I will use a similar approach.

Volume arrangement

A single block represents a value on a regular grid in three dimensional space, so an intuitive way of storing this data is a three dimensional array, where the array index represents the position in space and the value of the element represents some property of the block at that position (for example: material type). Initially I used a flattened 3D array of bytes to store chunk data; the value of a block at a given point in the chunk can be found by:

value = volume_array[x + (y * size_x) + (z * size_x * size_y)]
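In code, the flat indexing can be sketched like this (a minimal illustration; the function name is mine, not from the engine):

```cpp
#include <cstddef>

// Flat 3D array indexing: x varies fastest, then y, then z.
// size_x and size_y are the chunk dimensions along those axes.
inline std::size_t flatIndex(std::size_t x, std::size_t y, std::size_t z,
                             std::size_t size_x, std::size_t size_y)
{
   return x + (y * size_x) + (z * size_x * size_y);
}
```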

Unfortunately the memory used by this approach quickly turned out to be unmanageable for a volume of any significant size, so I turned to a sparse data structure: a boost::unordered_map. With this approach the block x,y,z co-ordinates are packed into a boost::tuple and used as the key to the map, and the block properties are stored as a byte. Only ‘solid’ blocks are stored in the map, so if a particular key cannot be found the block at that co-ordinate is assumed to be empty, or ‘air’, and will not be drawn. Using a map allows quick access and modification of an arbitrary element, which is one of the main requirements of an engine like this, and less memory is used as ‘empty’ blocks are no longer stored. Fast block access is not only important for changing the volume data, but for rendering those changes too.
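As a rough sketch of the idea (hedged: the engine itself uses boost::unordered_map keyed on a boost::tuple; this illustration uses std::unordered_map with the co-ordinates packed into a single 64-bit key, which avoids writing a custom hash functor):

```cpp
#include <cstdint>
#include <unordered_map>

using Byte = std::uint8_t;

// Pack chunk-local (x, y, z) into one 64-bit key; 21 bits per axis
// is far more than any realistic chunk size needs.
inline std::uint64_t packKey(int x, int y, int z)
{
   return (std::uint64_t(x) & 0x1FFFFF)
        | ((std::uint64_t(y) & 0x1FFFFF) << 21)
        | ((std::uint64_t(z) & 0x1FFFFF) << 42);
}

class SparseChunk
{
public:
   // Only solid blocks get an entry; a missing key means 'air'.
   void set(int x, int y, int z, Byte material)
   {
      blocks[packKey(x, y, z)] = material;
   }
   void erase(int x, int y, int z)
   {
      blocks.erase(packKey(x, y, z));
   }
   bool is_solid(int x, int y, int z) const
   {
      return blocks.find(packKey(x, y, z)) != blocks.end();
   }
private:
   std::unordered_map<std::uint64_t, Byte> blocks;
};
```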

Generating a Polygon Mesh

A single cube can be drawn by creating a list of vertices representing each of its six sides and then passing these to the graphics card. Each side of a cube requires two triangles (using 6 vertices, two of which are duplicates) or a single quad (using 4 vertices).

Cube vertices

I initially used quads for rendering as it sounded the most efficient (fewer vertices must be better, right?), but I’ve since found out that not all hardware supports quads and, curiously, quads are less efficient than triangles even on the hardware that does support them, so I switched to using triangles.
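As an illustration of the triangle layout (a hypothetical helper, not code from the engine), the front face of a unit cube at block position (x, y, z) can be emitted as two triangles that share two corner positions:

```cpp
#include <array>

struct Vec3 { float x, y, z; };

// Six vertices for one cube face as two triangles. Vertices 0 and 3,
// and vertices 2 and 4, are the duplicated corners mentioned above.
inline std::array<Vec3, 6> frontFace(float x, float y, float z)
{
   return {{
      {x,     y,     z}, {x + 1, y,     z}, {x + 1, y + 1, z},  // triangle 1
      {x,     y,     z}, {x + 1, y + 1, z}, {x,     y + 1, z}   // triangle 2
   }};
}
```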

To build a polygonal mesh representing an entire volume we iterate across the unordered map containing the volume data and, for every entry, store the list of vertices required to draw a cube at that location in space. The number of vertices generated by this process can be reduced by checking the neighbours of each volume element to see if there are any solid neighbouring blocks that would obscure the one we are currently generating vertices for. Where two solid blocks touch there is no need to draw the faces between them, as we will never see those faces anyway. The following code snippet from chunk.cpp illustrates this process:

// Loop through every voxel in the volume and create the vertices required
// to render exposed faces. Faces hidden by surrounding cubes are not drawn.
for(boost::unordered_map<Position, byte>::iterator ii=chunk.data.begin(); 
    ii!=chunk.data.end();
    ++ii)
{        
   // Get the voxel chunk coords.
   x = (*ii).first.tuple.get<0>();
   y = (*ii).first.tuple.get<1>();
   z = (*ii).first.tuple.get<2>();
   // Find out if there are surrounding empty neighbours for each axis.
   if (x > 0)
   {
      if(not data.is_solid(x-1, y, z))
      {
         // The neighbour to our left is empty, generate face.
         // setPos simply loads the appropriate vertices into a vertex array.
         setPos(x, y, z, LEFT); 
      }
   }
   else
   {
      setPos(x, y, z, LEFT);
   }
   if (x < (size - 1))
   {
      if(not data.is_solid(x+1, y, z))
      {
         // The neighbour to our right is empty, generate face.
         setPos(x, y, z, RIGHT);
      }
   }
   else
   {
      setPos(x, y, z, RIGHT);
   }  
   if (y > 0)
   {
      if(not data.is_solid(x, y-1, z))
      {
         // The neighbour below us is empty, generate face.
         setPos(x, y, z, BELOW);
      }
   }
   // ... and so on for the remaining directions ...

To store the vertex positions generated by the above process, I initially used a single array of vertices which was then uploaded to the graphics card as a vertex buffer object (VBO). This worked fine up until the point where I tried to add optimisations to my rendering like back face culling.

Back face culling can be used to further reduce the number of redundant faces sent to the graphics card. It involves determining which faces in each chunk could be visible from the current viewpoint and sending only those to the GPU. To do this there has to be some method of filtering the vertex positions based on the face direction they belong to. One way is to group the vertex positions into six arrays, one for each possible face direction. Whenever the viewpoint moves, the position of each chunk relative to the viewpoint is checked to determine which face directions are visible. For example: if the viewpoint is immediately to the left of a particular chunk then the right-pointing faces in that chunk can never be seen, so the array holding right-facing vertices does not need to be passed to the GPU; this process is repeated for the other two axes as well. Only the arrays holding faces that we know could be visible are sent to the graphics card, and across many chunks this can be a considerable saving.

Back face culling (x-axis example)
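The visibility test for one axis can be sketched as follows (an illustrative analogue of the DRAW_* face-group values used in chunk.cpp; the names and signature here are assumptions, not the engine's exact code):

```cpp
// Decide which x-axis face groups of a chunk can be visible from the
// camera. chunkMinX/chunkMaxX bound the chunk on the x-axis in world space.
enum FaceGroupX { DRAW_LEFT, DRAW_RIGHT, DRAW_BOTH_X };

inline FaceGroupX visibleFacesX(float cameraX, float chunkMinX, float chunkMaxX)
{
   if (cameraX < chunkMinX)
      return DRAW_LEFT;   // camera left of chunk: right-facing faces are hidden
   if (cameraX > chunkMaxX)
      return DRAW_RIGHT;  // camera right of chunk: left-facing faces are hidden
   return DRAW_BOTH_X;    // camera within the chunk's x extent: draw both
}
```

The same test repeated for y and z selects which of the six vertex arrays get drawn each frame.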

Face Normals

Face normals are useful as they will allow us to apply lighting effects to the blocks later. For now a very simple lighting model is used which attenuates the colour of each block face depending on the direction it is pointing. Because we are using arrays to group vertex positions based on which side of a cube they represent, we can avoid creating an array of vertex normals to accompany the array of vertex positions, and instead tell the graphics card the single normal shared by the batch of vertex positions it is currently receiving. It is possible to compute the face normals directly in the fragment shader and avoid sending them to the GPU at all, but for now I will stick to calculating them on the CPU. Here is a section of the code from chunk.cpp that sets the vertex normals:

/* VisibleFaceGroups is part of the back-face culling code described above */
if(visibleFaceGroups.z == DRAW_FRONT or visibleFaceGroups.z == DRAW_BOTH_Z)
{
   /* Set the GLSL Normal uniform */
   glUniform3fv(normAttrib, 1, glm::value_ptr(glm::vec3(0, 0, -1)));
   /* Bind and draw the front-facing vertices */
   glBindBuffer(GL_ARRAY_BUFFER, verticesFrontVBO);
   glVertexAttribPointer(posAttrib, 4, GL_UNSIGNED_BYTE, GL_FALSE, 0, 0);
   glDrawArrays(GL_TRIANGLES, 0, verticesFront.size());
}
if(visibleFaceGroups.z == DRAW_BACK or visibleFaceGroups.z == DRAW_BOTH_Z)
{
   /* Set the GLSL Normal uniform */
   glUniform3fv(normAttrib, 1, glm::value_ptr(glm::vec3(0, 0, 1)));
   /* Bind and draw the back-facing vertices */
   glBindBuffer(GL_ARRAY_BUFFER, verticesBackVBO);
   glVertexAttribPointer(posAttrib, 4, GL_UNSIGNED_BYTE, GL_FALSE, 0, 0);
   glDrawArrays(GL_TRIANGLES, 0, verticesBack.size());
}
 
/* ...continue for each axis... */

The image below illustrates a basic lighting model on a scene:

No normals (left); normals for basic lighting (right).
Face Normals

Heightmap to Block Terrain

To test the engine I initially used a set of procedures that loaded each chunk with a primitive shape like a sphere or pyramid. I got bored of this pretty fast, so to make things more interesting I created a procedure to load a heightmap image into the world. If I set the size of the world chunks to 64x64x64 blocks and use a 256x256 heightmap whose pixel intensities range from 0 to 255, I need 4x4x4 chunks, or 256^3 blocks, to represent the heightmap in three dimensional space. In addition I wrote a fragment shader in GLSL which colours each block in the volume based on its vertical position, using the common jet colour mapping:

//VALUE in this case is the y position of the block and ranges from 0 to 255
float k = 4*(VALUE/float(255));
float red = clamp(min(k - 1.5, -k + 4.5),0.0,1.0);
float green = clamp(min(k - 0.5, -k + 3.5),0.0,1.0);
float blue  = clamp(min(k + 0.5, -k + 2.5),0.0,1.0);
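The heightmap loading step itself can be sketched in C++ as follows (illustrative only: heightmapToBlocks and setBlock are hypothetical names; setBlock stands in for whatever the engine uses to mark a voxel solid, such as inserting into the chunk's map):

```cpp
#include <functional>
#include <vector>

// For each (x, z) pixel, fill every block from y = 0 up to the pixel's
// intensity, so a pixel value of 255 produces a 256-block-tall column.
void heightmapToBlocks(const std::vector<std::vector<int>>& heightmap,
                       const std::function<void(int, int, int)>& setBlock)
{
   for (std::size_t z = 0; z < heightmap.size(); ++z)
   {
      for (std::size_t x = 0; x < heightmap[z].size(); ++x)
      {
         for (int y = 0; y <= heightmap[z][x]; ++y)
            setBlock(int(x), y, int(z));
      }
   }
}
```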

Here are a few images of where the engine is now:

Pyramids (Each one is stored in a chunk)

Pyramids

Mountains

Mountains

Volcano

Volcano

Hills

Hills

Posted in General, OpenGL, Projects.

21 Comments

    • Thanks Layl,

      I’m running this on a fairly modern graphics card, which might account for the good framerate! I’ll be adding some optimisations soon which should make it run even faster.

  1. Damn! Seriously, I already said it, but it looks awesome. A few weeks ago I also started to code my own voxel engine, and I found your site while searching for tips to speed up OpenGL rendering. After reading some articles I tried to perform my rendering with display lists, and it increased the FPS, but now it takes too much space in memory (I also use a three-dimensional array, because it’s simple and I’m relatively new to programming, and a total noob at OpenGL :D), and as you said: “it quickly turned out to be unmanageable”. Now I have to find another way to store voxels, because in the three-dimensional array about 75% of the data is unused (“air blocks”). If you are OK with it, I think I will have a look at your code to see how you do that (but I don’t know what tuples or unordered maps are, so I have to document myself).
    (I’m not English so there are probably a few mistakes in my sentences, sorry.)
    I await your other articles impatiently.

    • I am also a total noob at OpenGL, so we’re both learning together :) I found this guide really helpful for learning, maybe you should have a look: http://www.arcsynthesis.org/gltut/

      About storing your voxel data…

      An unordered map (or just a regular map) is a container for storing values with a unique ‘key’, in simple terms you can think of the key like an index to an array. The voxel data we’re trying to store in this case is a range of unique co-ordinates in three dimensional space and an associated ‘block property’ or value (I use a byte); this lends itself well to a map data structure.

      One of the big benefits of using a map is that instead of storing a value for every single block in the volume like we do with the array approach, we can instead store only blocks that we know are ‘solid’. If we want to know if a particular block is solid, all we have to do is check to see if there is an entry in the map for the co-ordinate we are interested in, if there isn’t then we assume the block at that co-ordinate must be ‘air’ and not visible.

      The function “is_solid” in chunk.cpp performs the check I described above.

      Please feel free to use my code if it helps you, that’s why I put it here! I’m learning too though so there may be mistakes ;)

      I’ll eventually add more info about what I have learnt working on this engine, I’ve been busy with other things lately.

      • You have to keep in mind that switching arrays for your boost::unordered_map idea is a trade-off in terms of sparseness and memory usage. Oh wait, sorry – I meant a big trade-off.

        With a standard vertical pillar Minecraft-style page or “chunk”, we can safely assume that the chunk is about 50% substance and 50% air. With an array of integers denoting voxel type or color, this would always hold a constant 262,144 bytes (4-byte int × 16 × 256 × 16), or 0.25 MB. Of course this may seem potentially bad, as nearly half of that data is just NULL, so naively you would want to find a way to reduce that size.

        This is where your unordered_map idea has problems. If you’re storing the integer XYZ positions of the voxels, along with a byte for the type, you now store 13 bytes (X = 4, Y = 4, Z = 4, type = 1) of memory for only 1 voxel. The idea may cut the amount of stored voxels in half based on the above percentages, but now the voxels take 3 times as much data. So, 262,144 potential voxels * ~1/2 air voxels cut out = ~131,072 solid voxels * 13 bytes = ~1,703,936 bytes of total data. That’s ~1.65 megabytes – 6 and a half times as much data as the flat array. Ouch.
        With cubical chunks, half of the chunks are going to be virtually filled with voxels anyways, so the unordered_map doesn’t really help there.
        Of course, if boost somehow compresses its tuples, then the above could be entirely wrong. Feel free to correct me.

        The only place I see this helping is iteration, where half of the calls are discarded. As the bottleneck of most voxel engines is iteration, this actually would increase performance by half. If you were smart with iteration you could reduce the number of iterations to an optimal amount via the use of vertex buffer objects, though…

        • Sorry, I noticed that I considered the amount of bytes as the amount of voxels in the chunk. The error would be corrected to be 65,536 voxels per 16x256x16 chunk.

          I noticed that my calculations were somewhat wrong… anyways, if we correct my previous comment, about ~32,768 voxels would be solid, but multiplying this by the required 13 bytes will be ~425,984 total bytes – this is still 2 times as much as the amount of bytes needed for the flat array. Sorry for my mistake.

          • Hi Dejarrod,

            I don’t think the unordered map would store the tuple of integer (x,y,z) co-ordinates of each voxel with its state, instead the tuple would be hashed and the state byte of the voxel would be associated with the hash value. The associative container uses more memory than a straightforward array would per populated voxel, but I found that not storing empty voxels in memory at all allowed me to have much larger scenes than I would be able to have with standard arrays.

  2. Hey,
    Am doing my thesis on Animating Sparse Voxel Octrees, by any chance are you using an octree for storage of your voxels?

    I haven’t gone through your code yet. Thought it might be easier just to ask you.

    • Hi Andre, sounds interesting!

      I’m not using sparse voxel octrees although I did previously look into using one; at the moment I’m just storing the voxels in a sparse array.
      If you’re interested in sparse voxel octrees you should check out GigaVoxels if you haven’t already. There’s an open source c++ octree class available here which you might find useful.

      My code is hosted on GitHub now (https://github.com/pabennett/glblox), which makes it easier to navigate, I’ll still need to update the links on this site.

  3. Your engine is very inspiring to me! I am a 15 year old C++ programmer who has been programming for about 2 years alongside being a student. I really got interested in game development and of course I thought of a game that has some sort of destructible world which involves voxels. Now I am not totally new to OpenGL, as I have created a colored cube and triangle, but I am still struggling. This is really my first big project and I am wondering: how do you come up with all these ideas on how to render and how to design this whole engine so easily? Can you come up with a design quickly because you are an experienced programmer? Is there anything you can recommend I do that will help me think through the design of an engine like this and possibly other big projects? Perhaps program more programs without OpenGL? Or maybe many smaller programs with OpenGL?

    Thanks,
    Brent

    • Hey Brent, thanks for dropping by. I’m glad I inspired someone!

      It’s great that you’re learning programming; it can be really rewarding as a hobby and/or a career.

      I’m quite new to OpenGL like yourself, some aspects of 3D programming can get fairly involved but you will be surprised what you can accomplish with a little perseverance. I’ve been programming for several years now in various languages and I’ve picked up a few tricks along the way that helped me build this engine.

      Voxel engines are all the rage at the moment and building one for yourself is a great way of picking up some skills that will be useful to you in other projects and your studies. It really surprised me how much I learnt from working on this project and how much of it can be applied in unrelated fields.

      If you’re struggling with ideas on how to get started with the rendering you should seek out similar projects on Google to see how other people have gone about it. A guy called AlwaysGeeky has put a nice guide up over at https://sites.google.com/site/letsmakeavoxelengine/home/basic-block-rendering and there’s a guide on Wikibooks that I found useful when I was starting out that you can get here: http://en.wikibooks.org/wiki/OpenGL_Programming (see the section on Glescraft). You could always take a look at my code too, but I make no guarantees about its readability or correctness ;)

      If you would like something in addition to your studies to hone your programming and maths skills (both of which are necessary for 3D programming) you could check out Udacity: http://www.udacity.com/. It has a series of self paced online courses run by the chaps from Google; their language of choice is Python which you should find a breeze given your C++ background. These courses (particularly CS212) will help you learn how to design good software.

      If you want any further help feel free to get in touch.

  4. Hi there, I’m getting some great pointers from this, thank you!

    Like you, I was originally looking at GleCraft and started building based on that, but I think using the unordered map is actually an awesome idea. It’s so much easier, I think, to do checks. From what I saw in GleCraft they used extra variables (memory) for things like isvisible, etc… whereas with this you just see if there’s an entry in the chunk.

    separating each chunk into 6 VBOs for the sides is actually an awesome idea, too I think.

    I did have one question, I noticed in your checks you have if x < size -1… where are you getting size from? I'm guessing it's from a constant for chunk size?

    The way I have it now is I have the key as bytes for the coordinates, and the value as an int (I want to be able to store more than 255 textures). Then I would send a block structure to the VBO and pass VertexAttribPointer for the vertex coords and one for the block type, where I’d figure out the texture position in the shader… Not sure if it makes sense, but that’s my thought process at the moment.

    • Hi, your guess about size being the chunk size in voxels was correct; it’s passed into the constructor of the chunk. I’ve found a value of 32 or 64 offers the best trade-off between the time taken for chunk updates and the number of chunks required to represent the world. Your texturing idea sounds a bit like a texture atlas, which seems to be a popular approach for engines like this, so it sounds like you’re on the right track!

  5. I’m curious, how long does it take you to generate the lists of polygons for each face?

    I used your code for something similar and decided to use lists (C#) to populate the arrays of vertices for each face. It’s taking me about 5 seconds to ‘update’ the arrays, which is clearly not acceptable.

    Pushing the created arrays to OpenGL only takes about 2ms which is nothing, but I’m definitely having problems with making the list of polys

    • I made a simple change and decided not to go with a dictionary for my map data. Instead I’m using a multidimensional array. I’m still using the idea of different lists for each face though. It’s actually loading very fast now with that change. I did a stress test of up to 255 chunks and it handled it all like a champ.

      • What dimensions are you using for your chunks? I currently have it set up on my machine so that the world is 256x256x256 voxels subdivided into chunks of 32x32x32 voxels each. Performing a full update of all the chunks in the world when the program is first launched takes around 5 seconds, but modifying individual chunks afterwards takes very little time at all. I’m running an intel i5-2500k which takes about 10ms to update a single 32x32x32 chunk.

        • Because I’m using multidimensional arrays I have more overhead (I’m generating a 32x32x32 chunk in about 20ms), but I couldn’t use the unordered map because .ContainsKey() and accessing the actual data to see if it was ‘seethrough’ was causing a single 16x16x16 chunk to take 5 seconds to generate (remember I’m using C#). I’m still using lists (the equivalent of vectors in C++) for each face.

          I’m still trying to figure out how I might remove the overhead caused by the multidimensional arrays, but for C# I don’t think unordered maps (Dictionary) are the answer.

          The good thing is although it takes a little longer to generate the buffers, it shouldn’t make a huge difference once they are generated

          • Sorry if this is no longer relevant, but if you don’t want a multidimensional array and you still know your maximum sizes, you can use moduli. E.g. if the x and y max is 20 and (assuming x then y) the current index is 21, then you know x is 1 and y is 1, since 21 mod 20 is 1 and the quotient is 1. To go the other way, it’s just multiples of your max values.

  6. Hey Man,

    I pulled your repo because I was looking at implementing some of the optimisations you made in an SDL-based voxel renderer I’m working on for a school project. I’ve had no luck getting it to build though.

    • Hi Matthew,

      Let me know what error you’re getting and I’ll see if I can help. Feel free to drop me an email if there’s too much info to post here, my address is in the contact links above.

      Pete
