Vectorizing smallpt with ispc

May 15, 2018

Today’s processors can do a lot of work in parallel on a single core. Programs need to be specifically designed for that, as this functionality is exposed via specialized instructions such as SSE or AVX instructions on x86 processors. By operating on wide registers storing vectors instead of scalars, a single SSE or AVX vector instruction operates on 4 or 8 single-precision numbers simultaneously.

Most programming languages do not expose vector types and functions to the programmer. Instead, compilers try to convert a scalar program into a vectorized program automatically in a process called automatic vectorization. Unfortunately, compilers often fail to do so, and it is often unclear when and why this happens.

The Intel SPMD program compiler (ispc) solves this issue via some small extensions to the C programming language. Motivated by a recent series of blog posts about ispc, I ported smallpt to this compiler to get a first impression of how much path tracing can profit from vectorization.

Result image of smallpt-ispc

The following table shows the time for rendering one sample at a resolution of 640×480 pixels on an Intel Core i7-4770K using AVX2 (1 core):

                     C        ISPC     Speed-up
1 bounce             106 ms   16 ms    6.6x
2 bounces            154 ms   28 ms    5.5x
6 bounces            302 ms   80 ms    3.8x
6 bounces with RR    404 ms   338 ms   1.2x

In theory, the maximum achievable speed-up with AVX2 is 8x, since each 256-bit register holds eight single-precision values. Unsurprisingly, the biggest improvements can be observed for camera rays, where the workload is mostly coherent.
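Why coherence matters can be illustrated with a toy model of masked SIMD execution. This is only a sketch under stated assumptions, not a measurement of smallpt-ispc: it assumes that a gang of 8 lanes keeps iterating until its longest path has finished, and the function name `lane_utilization` and the path lengths are made up for illustration.

```python
def lane_utilization(path_lengths, gang_size=8):
    """Fraction of SIMD lane slots doing useful work in a toy model.

    Assumption: rays are processed in gangs of `gang_size`. A gang runs for
    as many bounce iterations as its longest path, and lanes whose path has
    already ended are masked out (idle) for the remaining iterations.
    """
    useful = 0
    issued = 0
    for start in range(0, len(path_lengths), gang_size):
        gang = path_lengths[start:start + gang_size]
        useful += sum(gang)
        issued += max(gang) * gang_size
    return useful / issued


# Coherent workload: every ray does exactly one bounce, like camera rays.
coherent = [1] * 8
# Divergent workload: hypothetical path lengths that differ per lane.
divergent = [1, 1, 2, 2, 3, 4, 6, 8]

print(lane_utilization(coherent))   # 1.0
print(lane_utilization(divergent))  # 27 / 64 = 0.421875
```

In this model the divergent gang spends more than half of its lane slots idle, which is one plausible reason why the measured speed-up shrinks as paths get longer and less uniform.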

Have a look at the code for smallpt-ispc on GitHub to find out more.

Dominant Colors in Movies

January 24, 2016

Inspired by Nordlicht, I made a small program that finds the most dominant colors in movies. Using k-means clustering, it builds a color palette with 32 colors and counts, for each cluster center, the pixels that are closer to it than to any other center.

Clustering the complete set of full-sized frames of a movie would be far too computationally expensive, so I extracted only one frame per second and scaled each frame down to 16×16 pixels. Moreover, I cropped black borders from each frame, and removed all logos at the beginning as well as the credits at the end. This avoids having meaningless black pixels in the clustering process.
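The clustering step can be sketched as follows. This is a minimal pure-Python version of standard (Lloyd's) k-means, assuming the downscaled frames have already been flattened into a list of RGB tuples. The name `kmeans_palette`, its parameters and the sample data are hypothetical and not taken from the actual program.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(pixel, centers):
    """Index of the cluster center closest to the pixel."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(pixel, centers[i]))


def kmeans_palette(pixels, k, iterations=10, seed=0):
    """Return (centers, counts): k palette colors and pixels per color.

    Assumption: k is not larger than the number of distinct colors, because
    the initial centers are drawn from the distinct colors in `pixels`.
    """
    rng = random.Random(seed)
    centers = rng.sample(sorted(set(pixels)), k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in pixels:
            clusters[nearest(p, centers)].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old center if a cluster ran empty
                centers[i] = tuple(sum(c) / len(members) for c in zip(*members))
    counts = [0] * k
    for p in pixels:
        counts[nearest(p, centers)] += 1
    return centers, counts


# Made-up data: 30 reddish and 10 bluish pixels.
pixels = [(250, 0, 0)] * 30 + [(0, 0, 250)] * 10
centers, counts = kmeans_palette(pixels, k=2)
print(sorted(counts))  # [10, 30]
```

The counts are what a pie chart of color frequencies would be drawn from. A real run would use k=32 and far more pixels, where a NumPy-based implementation would be much faster.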

Some results are shown in the following images. The pie charts show the colors and their frequency of occurrence. The bar below shows just the colors.

Big Buck Bunny

District 9


Shared Photos – a UPnP/DLNA image viewer for Android

January 10, 2015

‘Shared Photos’ is a small image viewer for Android that loads and displays images stored on a UPnP/DLNA server in the local network. I wrote it over the last few weeks to become familiar with Android development. It has some unique features which I could not find in similar apps on Google Play:

  • Support for Low Profile and Immersive Full-Screen Mode.
  • Show image description stored in IPTC metadata as subtitle.
  • Automatic screen rotation: the image is rotated so that it covers as many pixels of the screen as possible.
  • Zooming, panning etc. as in the Android Gallery app.
  • Easy to use. Just start it, choose a server and browse your images.

Albums overview

All images in an album

Show image in Low-Profile Fullscreen Mode (Android 4.3).

I use some really great libraries in this project, namely PhotoView to display images, metadata-extractor to extract metadata/thumbnails and Cling to access DLNA servers. Thanks for that!

However, although the app uses UPnP/DLNA to communicate with the server, I can’t promise that it works with many of the servers out there, as I tested it almost exclusively with MiniDLNA 1.0.21. For this reason, I decided not to publish it on Google Play for now; I may do so at some point in the future. Moreover, the app probably still contains a lot of bugs, as I have not used and tested it extensively. It basically works for me on a Sony Xperia (Android 7.0), Nexus 10 (Android 5.0), Nexus 7 2013 (Android 4.4.4) and Galaxy Nexus (Android 4.3). I am publishing it here to get some feedback and because the source code may be interesting to someone.

If you want to try it out, you can download the APK below. As usual, the source is on GitHub. Feel free to fork, play with the code and contribute, if you like.

Update 2015-03-27 Added option to disable automatic screen rotation and subtitles.
Update 2016-04-30 Added option to change size of thumbnails in browser.
Update 2017-06-10 Save scroll position when navigating between folders (thanks to jhavens1566). Support for Android N.


Calling functions in DLLs from JavaScript

October 22, 2013

During the past few weeks, I worked a lot with Google’s V8, the JavaScript engine used in Google Chrome. The engine can easily be embedded into applications, making it possible to write some of the application’s functionality in JavaScript instead of C++. However, from within JavaScript code, there is no way out of the sandbox in which the engine executes the script, except for a few functions explicitly added by the embedder (e.g., a browser) to communicate with the outside world. For security reasons, this makes a lot of sense in a browser environment. Nevertheless, if the application only executes “trusted” scripts, it could be interesting to give the script full access to the operating system, just like an ordinary application developed in C. That means it should be possible to call functions in DLLs from pure JavaScript, similar to what python-ctypes offers for Python.

As a proof-of-concept, I created jswin, a small runtime environment for JavaScript based on V8 that allows script code to load a DLL and call its functions. Inside the environment it is possible to communicate directly with the native interface of the operating system. In this post I want to focus on the two most interesting points of the implementation: how to call a function in a DLL and how to handle callback functions. I recommend reading the API documentation of jswin before continuing.
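Both mechanisms, calling a library function and passing a callback, also exist in python-ctypes, which makes it a convenient way to illustrate the idea. The sketch below is not jswin's API. It assumes a POSIX system where `ctypes.util.find_library("c")` locates the C runtime; on Windows one would load a DLL by name instead (for example `ctypes.CDLL("msvcrt")`).

```python
import ctypes
import ctypes.util

# Load the C runtime library (assumption: POSIX system, e.g. libc.so.6).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# 1) Plain function call: declare the signature, then call the function.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t
length = libc.strlen(b"Hello, World!")
print(length)  # 13

# 2) Callback: qsort() calls back into Python to compare two elements.
COMPARE = ctypes.CFUNCTYPE(ctypes.c_int,
                           ctypes.POINTER(ctypes.c_int),
                           ctypes.POINTER(ctypes.c_int))


def compare_ints(a, b):
    """Three-way comparison of the two pointed-to integers."""
    return (a[0] > b[0]) - (a[0] < b[0])


# Keep a reference to the wrapped callback for as long as C may call it.
callback = COMPARE(compare_ints)

values = (ctypes.c_int * 5)(5, 1, 7, 33, 99)
libc.qsort.argtypes = [ctypes.c_void_p, ctypes.c_size_t,
                       ctypes.c_size_t, COMPARE]
libc.qsort.restype = None
libc.qsort(values, len(values), ctypes.sizeof(ctypes.c_int), callback)
print(list(values))  # [1, 5, 7, 33, 99]
```

In both cases the caller has to describe the native signature by hand, because a DLL does not carry type information for its exported functions.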


HyperlapseMB: Hyperlapse.js with Motion Blur

May 1, 2013

With the help of GSVPanoDepth.js (see previous post), I have created HyperlapseMB, a fork of Hyperlapse.js that uses the depth information to create a nice motion blur effect. The images are filtered as a pre-processing step before the hyperlapse is played. The algorithm is similar to the technique used as a post-processing effect in video games as described in Rosado (2007). Although the filtering is done before the hyperlapse is played, it is implemented in WebGL. In pure JavaScript, the filtering takes multiple seconds, nearly the same time the GPU needs for the whole sequence.

You can watch a demo (Firefox/Chrome) or get the source code. Enjoy!


Rosado (2007): Motion Blur as a Post-Processing Effect. In GPU Gems 3: Programming Techniques for High-Performance Graphics and General-Purpose Computation, edited by Hubert Nguyen.

Extract depth maps from Google Street View

May 1, 2013

Besides high-resolution images, Google Street View also provides a depth map for each panorama, containing the distance from the camera to the nearest surface at each pixel. As far as I know, there is no official documentation of the format, but there are some open source projects which contain code showing how to interpret the data. The depth information is stored in a slightly complicated way, probably to save bandwidth. Each pixel in a grid of 512×256 pixels references one of several planes. A plane is given by its normal vector and its distance to the camera. Therefore, in order to calculate the depth at a pixel, one has to determine the intersection point of a ray starting at the center of the camera and the plane corresponding to the pixel.
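The ray-plane intersection can be sketched as follows. Because the format is undocumented, the mapping from pixel coordinates to viewing directions and the sign convention of the plane equation are assumptions, and the function names are hypothetical.

```python
import math


def pixel_direction(x, y, width=512, height=256):
    """Unit viewing direction for pixel (x, y) of an equirectangular grid.

    Assumption: x spans 360 degrees of azimuth and y spans 180 degrees of
    inclination. The real data may use a different orientation.
    """
    phi = (x + 0.5) / width * 2.0 * math.pi
    theta = (y + 0.5) / height * math.pi
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def depth_from_plane(direction, normal, distance):
    """Distance along the ray t * direction (camera at the origin) to a plane.

    The plane is assumed to be the set of points p with
    dot(p, normal) = distance, up to the sign convention (hence the abs).
    Returns infinity if the ray is parallel to the plane.
    """
    denom = sum(d * n for d, n in zip(direction, normal))
    if abs(denom) < 1e-9:
        return math.inf
    return abs(distance / denom)


# A wall 10 units away, facing the camera along the x axis.
wall_normal, wall_distance = (1.0, 0.0, 0.0), 10.0
head_on = depth_from_plane((1.0, 0.0, 0.0), wall_normal, wall_distance)
oblique = depth_from_plane((0.5, math.sqrt(3) / 2, 0.0), wall_normal, wall_distance)
print(head_on)  # 10.0
print(oblique)  # approximately 20, the ray hits the wall at 60 degrees
```

A full depth map would evaluate `depth_from_plane` once per pixel, using the plane referenced by that pixel's index.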

The geometry of the planes and the map containing the index of the plane at each pixel can be retrieved as Base64-encoded and zlib-compressed data by requesting the following URL:

I have written a small JavaScript library that fetches the data, decompresses it and computes a depth map from the planes. Here is an example:


As usual, you can find the code on my GitHub page.

Ray Marching Signed Distance Fields

January 19, 2013

This is my first attempt at real-time rendering of a procedurally generated terrain:


The terrain is generated from 3D Perlin Noise. It is converted to a signed distance field using Danielsson’s distance transform and stored in a 3D texture of size 256×256×256 on the GPU. The scene is rendered directly from the distance field, applying third-order texture filtering to improve the quality.
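Rendering directly from a distance field is commonly done by sphere tracing: the ray advances by the distance the field reports, which is guaranteed not to overshoot the nearest surface. The sketch below illustrates this on the CPU with an analytic sphere instead of a sampled 3D texture; the function names, step limit and thresholds are assumptions and not taken from the demo.

```python
import math


def sphere_sdf(p, center=(0.0, 0.0, 5.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius


def ray_march(origin, direction, sdf, max_steps=64, epsilon=1e-4,
              max_distance=100.0):
    """Sphere tracing: return the hit distance along the ray, or None.

    `direction` must be a unit vector. Each step advances by the distance
    to the nearest surface, so the ray cannot step through geometry.
    """
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        distance = sdf(p)
        if distance < epsilon:
            return t
        t += distance
        if t > max_distance:
            break
    return None


hit = ray_march((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = ray_march((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
print(hit)   # 4.0, the front of the sphere
print(miss)  # None
```

With a sampled field, `sdf` would be a filtered texture lookup, which is where the quality of the texture filtering becomes visible in the rendered surface.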

You can download the demo (Windows, x64) here. The source code is in my GitHub repository.

Third-Order Texture Filtering using Linear Interpolation

January 12, 2013

A straightforward implementation of bi- or tricubic texture filtering requires lots of texture lookups. As texture lookups are slow on current graphics hardware, reducing their number greatly improves the performance of high-order filtering kernels. In contrast, linear interpolation is relatively cheap as it is supported by the hardware. This fact has been exploited by Sigg and Hadwiger (2005) to make cubic interpolation much faster by performing a series of linear texture lookups. In this article, an in-depth explanation of their technique is given, starting with some general information on linear interpolation and approximation using cubic B-splines.
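The core idea can be sketched in one dimension: the four cubic B-spline weights are grouped into two pairs, and each pair is evaluated with a single linear lookup at a suitably shifted position. The code below is a CPU illustration in which `linear_lookup` stands in for the hardware's linear filtering; the function names and the clamp-to-edge addressing are assumptions.

```python
import math


def fetch(samples, i):
    """Nearest-sample fetch with clamp-to-edge addressing."""
    return samples[min(max(i, 0), len(samples) - 1)]


def linear_lookup(samples, x):
    """Linearly filtered lookup, standing in for hardware filtering."""
    i = math.floor(x)
    f = x - i
    return (1.0 - f) * fetch(samples, i) + f * fetch(samples, i + 1)


def bspline_weights(f):
    """Uniform cubic B-spline weights for a fractional position f in [0, 1)."""
    w0 = (1.0 - f) ** 3 / 6.0
    w1 = (3.0 * f ** 3 - 6.0 * f ** 2 + 4.0) / 6.0
    w2 = (-3.0 * f ** 3 + 3.0 * f ** 2 + 3.0 * f + 1.0) / 6.0
    w3 = f ** 3 / 6.0
    return w0, w1, w2, w3


def cubic_direct(samples, x):
    """Reference filter: four nearest-sample fetches."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    return (w0 * fetch(samples, i - 1) + w1 * fetch(samples, i)
            + w2 * fetch(samples, i + 1) + w3 * fetch(samples, i + 2))


def cubic_two_lookups(samples, x):
    """Same filter with two linear lookups instead of four fetches."""
    i = math.floor(x)
    w0, w1, w2, w3 = bspline_weights(x - i)
    g0, g1 = w0 + w1, w2 + w3
    # A linear lookup at i - 1 + w1 / g0 blends samples i - 1 and i
    # with the ratio w0 : w1; the second lookup does the same for w2 : w3.
    left = linear_lookup(samples, i - 1 + w1 / g0)
    right = linear_lookup(samples, i + 1 + w3 / g1)
    return g0 * left + g1 * right


data = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]
for x in (1.0, 1.25, 2.5, 3.75):
    print(x, cubic_direct(data, x), cubic_two_lookups(data, x))
```

Both functions agree up to rounding error. In two and three dimensions the same grouping reduces 16 and 64 fetches to 4 and 8 linear lookups, respectively.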


Edge-Avoiding À-Trous Wavelets in WebCL

December 3, 2012

WebCL is a new standard of the Khronos Group that allows web applications to benefit from the massive power of today’s GPUs. It is based on OpenCL, just as WebGL is based on OpenGL ES. Recently, I wrote a small demo that shows how WebCL can be used in image processing. I decided to implement the Edge-Avoiding À-Trous Wavelet transformation, as the algorithm is simple and gives very nice results. You need the Nokia WebCL plugin for Firefox to run the demo.

Demo: Edge-Avoiding À-Trous Wavelets in WebCL

Emulate Linux System Calls on Windows

November 17, 2012

A few minutes ago I finished a small example that shows how to load and execute unmodified Linux ELF binaries on Windows. Basically, this is the same as Wine, just the other way around. You can download the source code from my GitHub page. The code is very simple and only supports the system calls SYS_write and SYS_exit, just enough to write “Hello, World!” onto the screen. Nevertheless, it shows the basic principle.
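The dispatching part of such an emulator can be sketched as follows. This is only an illustration of the principle, not the code from the repository: it assumes the 32-bit x86 Linux syscall numbers (SYS_exit = 1, SYS_write = 4), models guest memory as a `bytearray`, and uses hypothetical names such as `emulate_syscall` and `GuestExit`. Loading the ELF file and trapping the system call instruction are left out.

```python
import io

# Linux syscall numbers (assumption: 32-bit x86 ABI).
SYS_EXIT = 1
SYS_WRITE = 4
LINUX_ENOSYS = 38  # Linux errno value for "function not implemented"


class GuestExit(Exception):
    """Raised when the guest program calls SYS_exit."""

    def __init__(self, status):
        super().__init__(status)
        self.status = status


def emulate_syscall(number, args, memory, stdout):
    """Handle one trapped system call and return its result value."""
    if number == SYS_WRITE:
        fd, address, length = args
        data = bytes(memory[address:address + length])
        if fd == 1:  # only standard output is supported in this sketch
            stdout.write(data.decode("utf-8", errors="replace"))
        return length
    if number == SYS_EXIT:
        raise GuestExit(args[0])
    return -LINUX_ENOSYS  # every other system call is unsupported


# Simulate what a "Hello, World!" binary would request.
memory = bytearray(b"Hello, World!\n")
out = io.StringIO()
written = emulate_syscall(SYS_WRITE, (1, 0, len(memory)), memory, out)
try:
    emulate_syscall(SYS_EXIT, (0,), memory, out)
    exit_status = None
except GuestExit as e:
    exit_status = e.status
print(repr(out.getvalue()), written, exit_status)
```

A real emulator would read the syscall number and arguments from the CPU registers at the point where the guest traps, and write the return value back before resuming execution.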