OpenGL Minecraft Style Volume Rendering
To help reinforce my knowledge of OpenGL and C++ I’ve been working on a project to build a rendering engine which can then be used as the basis for a game or other type of 3D application. I opted to implement a rendering engine that renders volumes in a ‘blocky’ style similar to that of the game Minecraft (pictured below).

Minecraft is a hugely popular indie game set in a world composed of destructible blocks. Players can add or remove blocks at will or explore a vast, randomly generated world of rolling hills and subterranean networks. I decided that writing an engine that performed rendering like Minecraft would be a good exercise as there is plenty of scope for learning and it is simple enough that I’d be able to get something that looked semi-decent running quickly. This post is the first of a series covering my progress, hopefully you will find it useful if you are trying to build an engine like this too!
If you’re looking for a complete implementation of this kind of engine then you should check out PolyVox.
I frequently refer to parts of my code which are a bit too large to paste here. The code can be found in my GitHub repository here.
Tools
I’m using a mix of Python and C++ on this project, but I might move everything into C++ at a later stage.
Pyglet and gletools are used in conjunction with Python for rapid prototyping of new ideas and for providing a window manager and mouse/keyboard input. Much of the functionality is written in C++, which uses the OpenGL mathematics library and GLEW as a dependency.
Volume Storage
In Minecraft the world environment is stored as a Voxel volume which is then rendered as polygons by using some method of extracting a polygonal mesh from the volume data. Blocks can be added or removed from the environment by making modifications to the volume data and then re-generating the polygonal mesh; it has be to be able to do this at interactive speeds without any visible pauses, so making this process efficient is important. Minecraft subdivides a large volume (such as a game world) into smaller volumes referred to ‘chunks’ in order to speed up the process of regenerating the polygon mesh after terrain modifications or loading new terrain sections, I will use a similar approach to this.

A singe block represents a value on a regular grid in three dimensional space, so an intuitive way of storing this data would be to use a three dimensional array where the array index represents the position in space and the value of the element represents some property of the block at that position (for example: material type). Initially I used a flattened 3D array of bytes to store chunk data, the value of a block at a given point in the chunk could by found by:
value = volume_array[size_x + (y * size_x) + (z * size_x * size_y)]
Unfortunately the memory used by this approach quickly turned out to be unmanageable for a volume of any significant size, so I turned to using a sparse data structure like a boost::unordered_map. With this approach, the block x,y,z co-ordinates are packed into a boost::tuple and used as the key to the map and the block properties are stored as a byte. Only ‘solid’ blocks are stored in the map, so if a particular key cannot be found in the map the block at that co-ordinate is assumed to be empty, or ‘air’, and will not be drawn. Using a map allows quick access and modification of an arbitrary element, which is one of the main requirements of an engine like this, and less memory is used as ‘empty’ blocks are no longer stored. Fast block access is not only important for changing the volume data, but for rendering those changes too.
Generating a Polygon Mesh
A single cube can be drawn by creating a list of vertices representing each of its six sides and then passing these to the graphics card. Each side of a cube requires two triangles (using 6 vertices, two of which are duplicates) or a single quad (using 4 vertices).

I initially used quads for rendering as it sounded the most efficient (less vertices must be better, right?), but I’ve since found out that not all hardware supports quads and, curiously, quads are less efficient than triangles even on the hardware that does support them, so I switched to using triangles.
To build a polygonal mesh representing an entire volume we have to iterate across the std::map containing the volume data and for every entry store the list of vertices required to draw a cube at that location in space. The number of vertices generated by this process can be reduced by checking the neighbours of each volume element to see if there are any solid neighbouring blocks that would obscure the one we are currently generating vertices for. For any block where there is a solid neighbour there is no need to draw the faces between both blocks as we will never see them anyway. The following code snippet from chunk.cpp illustrates this process:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 |
// Loop through every voxel in the volume and create the vertices required // to render exposed faces. Faces hidden by surrounding cubes are not drawn. for(boost::unordered_map<Position, byte>::iterator ii=chunk.data.begin(); ii!=chunk.data.end(); ++ii) { // Get the voxel chunk coords. x = (*ii).first.tuple.get<0>(); y = (*ii).first.tuple.get<1>(); z = (*ii).first.tuple.get<2>(); // Find out if there are surrounding empty neighbours for each axis. if (x > 0) { if(not data.is_solid(x-1, y, z)) { // The neighbour to our left is empty, generate face. // Setpos simply loads the appropriate vertices into a vertex array. setPos(x, y, z, LEFT); } } else { setPos(x, y, z, LEFT); } if (x < (size - 1)) { if(not data.is_solid(x+1, y, z)) { // The neighbour to our right is empty, generate face. setPos(x, y, z, RIGHT); } } else { setPos(x, y, z, RIGHT); } if (y > 0) { if(not data.is_solid(x, y-1, z)) { // The neighbour below us is empty, generate face. setPos(x, y, z, BELOW); } } ... and so on for the remaining directions ... |
To store the vertex positions generated by the above process, I initially used a single array of vertices which was then uploaded to the graphics card as a vertex buffer object (VBO). This worked fine up until the point where I tried to add optimisations to my rendering like back face culling.
Back face culling can be used to further reduce the number of redundant faces sent to the graphics card, it involves determining which faces in each chunk are visible from the current viewpoint and sending only those to the GPU. To be able to do this there has to be some method of filtering the vertex positions based on which face direction they belong to. One way of doing this is to group vertex positions into six arrays, one for each possible face direction. Any time the viewpoint moves, the position of each chunk relative to the viewpoint position should be checked to determine which face directions are visible. For example: if the viewpoint is immediately to the left of a particular chunk then the faces pointing to the right in that chunk do not need to be rendered and so the array holding right-facing vertices does not need to be passed to the GPU; this process can be repeated for the other two axis as well. Only the arrays holding faces that we know are visible are sent to the graphics card, across many chunks this can be a considerable saving.

Face Normals
Face normals are useful as they will allow us to apply lighting effects to the blocks later, for now a very simple lighting model will be used which attenuates the colour of each block face depending on which direction it is pointing. Because we are using arrays to group vertex positions based on which side of a cube they represent, we can avoid having to create an array of vertex normals to accompany the array of vertex positions, and instead tell the graphics card the normal of the vertex positions it is currently receiving. It is possible to compute the face normals directly in the pixel shader and avoid having to send them to the GPU at all, but for now I will stick to simply calculating them on the CPU. Here is a section of the code from chunk.cpp that sets the vertex normals:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
/* VisibleFaceGroups is part of the back-face culling code described above */ if(visibleFaceGroups.z == DRAW_FRONT or visibleFaceGroups.z == DRAW_BOTH_Z) { /* Set the GLSL Normal uniform */ glUniform3fv(normAttrib, 1, glm::value_ptr(glm::vec3(0, 0, -1))); /* Upload the forward-pointing faces to the GPU */ glBindBuffer(GL_ARRAY_BUFFER, verticesFrontVBO); glVertexAttribPointer(posAttrib, 4, GL_UNSIGNED_BYTE, GL_FALSE, 0, 0); glDrawArrays(GL_TRIANGLES, 0, verticesFront.size()); } if(visibleFaceGroups.z == DRAW_BACK or visibleFaceGroups.z == DRAW_BOTH_Z) { /* Set the GLSL Normal uniform */ glUniform3fv(normAttrib, 1, glm::value_ptr(glm::vec3(0, 0, 1))); /* Upload the back-pointing faces to the GPU */ glBindBuffer(GL_ARRAY_BUFFER, verticesBackVBO); glVertexAttribPointer(posAttrib, 4, GL_UNSIGNED_BYTE, GL_FALSE, 0, 0); glDrawArrays(GL_TRIANGLES, 0, verticesBack.size()); } /* ...continue for each axis... */ |
The image below illustrates a basic lighting model on a scene:
No normals (left) Normals for basic lighting (right).

Heightmap to Block Terrain
To test out my engine I initially used a set of procedures which loaded each chunk with a primitive shape like a sphere or pyramid. I got bored of this pretty fast, so to make things more interesting I created a procedure to load a heightmap image into the world. If I set the size of the world chunks to 64x64x64 blocks and use a 256×256 heightmap and the intensity of the pixels in the heightmap ranges from 0 to 255, I need 4x4x4 chunks, or 256^3 blocks to represent the heightmap in three dimensional space. In addition to this, I also wrote a fragment shader in GLSL which colours each block in the volume based on its vertical position using the common jet colour mapping:
1 2 3 4 5 |
//VALUE in this case is the y position of the block and ranges from 0 to 255 float k = 4*(VALUE/float(255)); float red = clamp(min(k - 1.5, -k + 4.5),0.0,1.0); float green = clamp(min(k - 0.5, -k + 3.5),0.0,1.0); float blue = clamp(min(k + 0.5, -k + 2.5),0.0,1.0); |
Here’s a few images of where the engine is now:
Pyramids (Each one is stored in a chunk)

Mountains

Volcano

Hills

There will be follow-up posts discussing other parts of the engine in the near future.
Using Python with C/C++
I’ve found it useful in the past when working on a Python script to be able to move some parts of it into C/C++ to take advantage of the extra speed. For most cases it isn’t necessary as Python is generally quick enough, however for things like 3D rendering, which I’ve been working on recently, moving parts of the code into C/C++ has resulted in a large frame rate increase. I wrote this short guide as a personal reference for the future, maybe you will find it useful too.
Luckily its reasonably easy to get Python to call functions contained in a compiled C/C++ library by using the ctypes module which comes with Python as standard.
First off, here’s a simple C++ class that provides two member functions: hello and get_number. The hello function simply prints ‘Hello!’ to the standard output, and get_number returns an integer.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
class MyClass { public: MyClass(); ~MyClass(); void hello(); int get_number(); }; MyClass::MyClass() {}; MyClass::~MyClass() {}; void MyClass::hello() { std::cout << "Hello!" << std::endl; } int MyClass::get_number() { return 1; } |
We need to provide a C interface to this class to enable Python’s ctypes to be able to use it. It’s a little tedious, but not difficult:
1 2 3 4 5 6 |
extern "C" { MyClass* newClass(){return new MyClass();} void hello(MyClass* instance){instance->hello();} int get_number(MyClass* instance){return instance->get_number();} } |
I compiled this code using the GNU compiler collection with the following commands (this works for me on Windows, for another OS you may need to change some settings):
1 2 |
g++ -c demo.cpp -o demo.o g++ -shared -Wl,-soname,demo.so -o demo.so demo.o |
With the code compiled and a *.so available we now need a Python wrapper class to access the functions provided in the ‘extern C’ part of our C/C++ code. A good approach is to store your .so file in a directory with a __init__.py file, so that you can import your wrapper class and use it in other parts of your project. Here’s the __init__.py code wrapping the C/C++ library using ctypes:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
from os.path import abspath, dirname, join import ctypes from ctypes import cdll here = dirname(abspath(__file__)) path = join(here, 'demo.so') demo_lib = cdll.LoadLibrary(path) class MyClass: def __init__(self): self.obj = demo_lib.newClass() def __enter__(self): return self def __exit__(self,*args): demo_lib.destroy(self.obj) def hello(self): demo_lib.hello(self.obj) def get_number(self): return demo_lib.get_number(self.obj) |
In the code above, the shared library demo.so is loaded using the ctypes function cdll.LoadLibrary(path) and stored in the demo_lib variable. When a new instance of the Python class MyClass is generated, an object pointer to the C++ class is stored in a class variable called obj. If a Python class instance of MyClass is destroyed (maybe by Python’s garbage collector), the C++ class instance is also destroyed and the memory freed.
This small Python script demonstrates how our class behaves:
1 2 3 4 5 6 7 8 9 10 |
>>> with MyClass() as A: ... A ... A.hello() ... print A.get_number() ... ... <__main__.MyClass instance at 0x02A2D6C0> Hello! 1 >>> |
This approach has generally worked well for my personal projects, which tend to be fairly small. For larger projects I expect maintaining the wrapper code will quickly become cumbersome, so you might want to look into something like SWIG
Learning OpenGL
I recently decided to have a go at learning OpenGL, after searching around for a while I stumbled across an excellent tutorial Learning Modern 3D Graphics Programming that assumes no prior knowledge of OpenGL and does a really good job of helping you set up a development environment to start building 3D applications in C/C++. Each chapter comes with its own demo code and ‘review’ questions that help reinforce what you learned and break up the reading a bit. The demo code also comes bundled with the unofficial OpenGL SDK which has everything you need to get started. Its worth noting that the tutorial appears to be under active development and bits may not be complete, however there is already enough content there to get you up to speed so its definitely worth a look!
There’s a another great looking guide over at Durian Software, although I haven’t worked through this one yet.
RS232 UART (VHDL)
Description
I designed the UART core to allow me send and receive data from a Spartan 6 LX9 Microboard which has an on board Silicon Labs Cp2102 USB-UART Bridge.
The UART Core is a simple RS232 communications controller which can be used to provide a quick and easy means of communicating with your FPGA board. The baud rate used by the controller is easily set using the generic map, although I have only tested it so far with a rate of 115200 baud using a 100MHz system clock. The communications scheme used is 8 data bits, no parity and 1 stop bit (8N1), so make sure your terminal software is using these settings if you have any problems.
Internally, the UART provides a simple interface incorporating a data strobe and acknowledge for sending and receiving data. To send data from the FPGA, the data must be presented to the DATA_STREAM_IN port and the DATA_STREAM_IN_STB (Strobe) asserted; when the DATA_STREAM_IN_ACK (Acknowledge) output is asserted by the UART the data has been sent. The strobe should be deasserted 1 clock cycle after the acknowledge is seen to avoid an additional byte of data getting sent.

Specifications
- RS232 8 data bits, 1 stop bit, no parity data stream format.
- Customisable baud rate.
- Simple internal interface and handshaking scheme.
Download
You can grab the source code for the UART through my GitHub repository here.

