UMBC game development team HueBots is competing today in the US Finals of the Microsoft Imagine Cup! Better yet, you can stream the competition online, and vote for your favorite (be honest, it’s HueBots, isn’t it?).
We now have Unreal Engine licenses for academic use by any UMBC students, faculty or staff. If you are affiliated with UMBC, you can request a license code online (you must be logged into your myUMBC account).
This posting was inspired by a conversation I had with one of my students a few weeks ago about the Wikipedia page for SSIM, the Structural Similarity Image Metric, and discussion on the topic today in my lab.
First some background
Image Quality Assessment (IQA) is about trying to estimate how good an image would seem to a human. This is especially important for video, image, and texture compression. In IQA terms, if you are doing a full-reference comparison, you have the reference (perfect) image and a distorted image (e.g. after lossy compression), and want to know how good or bad the distorted image is.
Many texture compression papers have used Mean Square Error (MSE), or one of the other measures derived from it: Root Mean Square Error (RMS) or Peak Signal to Noise Ratio (PSNR). All of these are easy to compute, but are sensitive to a variety of things that people don’t notice: global shifts in intensity, slight pixel shifts, etc. IQA algorithms try to come up with a measurement that’s more connected with the differences humans notice.
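As a minimal sketch of these three measures, assuming 8-bit grayscale images stored as flat lists of pixel values (the function names and the tiny test image are made up for illustration), the following shows how a global intensity shift that a viewer would barely notice still produces a large error:

```python
import math

def mse(reference, distorted):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted):
        raise ValueError("images must have the same number of pixels")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)

def rmse(reference, distorted):
    """Root mean squared error: the square root of MSE, in pixel-value units."""
    return math.sqrt(mse(reference, distorted))

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in decibels; peak=255 assumes 8-bit data."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / err)

# Hypothetical 4-pixel image and a copy brightened uniformly by 10 levels.
reference = [52, 55, 61, 59]
shifted = [p + 10 for p in reference]

print(mse(reference, shifted))   # every pixel is off by 10, so MSE is 10 squared
print(rmse(reference, shifted))
print(psnr(reference, shifted))
```

All three numbers are different views of the same per-pixel squared difference, which is why they share the same blind spots.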
How does an IQA researcher know if their algorithm is doing well? Well, there are several databases of original and distorted images together with the results of human experiments comparing or rating them (e.g. Ponomarenko et al.). You want the results of your algorithm to correlate well with the human data. Most of the popular algorithms have been shown to match the human experiments better than MSE alone, and this difference is statistically significant (p=0.05).
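To make "correlate well with the human data" concrete, here is a small sketch that compares metric scores with human ratings using Pearson (linear) and Spearman (rank) correlation. Both are standard statistics, but which one a given study reports is not stated here, and the scores below are invented purely for illustration:

```python
import math

def pearson(xs, ys):
    """Pearson linear correlation coefficient, in the range [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data: mean human ratings and a metric's scores for five images.
human_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_scores = [0.10, 0.40, 0.45, 0.80, 0.95]

print(pearson(human_scores, metric_scores))
print(spearman(human_scores, metric_scores))
```

The rank version only asks whether the metric orders the images the way people do, so it does not penalize a metric whose scale is nonlinear relative to the human ratings.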
What is wrong with IQA?
There are a couple of problems I see that crop up from this method of evaluating image quality assessment algorithms.
First, the data sets are primarily photographs with certain types of distortions (different amounts of JPEG compression, blurring, added noise, etc.). If your images are not photographs, or if your distortions are not well matched by the ones used in the human studies, there’s no guarantee that any of the IQA results will actually match what a human would say. In fact, Cadik et al. didn’t find any statistically significant differences between metrics when applied to their database of rendered image artifacts, even though there were statistically significant differences between these same algorithms on the photographic data.
Second, even just considering those photographic datasets, there is a statistically significant difference between the user study data and every existing IQA algorithm. Ideally, there would be no significant difference between image comparisons from an IQA algorithm and how a human would judge those same images, but we’re not there yet. We can confidently say that one algorithm is better than another, but none are as good as people.
Is SSIM any good?
SSIM (Wang et al.) is one of the most popular image quality assessment algorithms. Some IQA algorithms try to mimic what we know about the human visual system, but SSIM just combines computational measures of luminance, contrast, and structure to come up with a rating. It is easy to compute and correlates pretty well with the human experiments.
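As a rough sketch of how those measures combine, the function below evaluates the usual closed-form SSIM expression over a single window of pixels. This is a simplification: the published algorithm computes the index over many small local windows and averages the results, while this version treats the whole input as one window. The function name, the constants' default values (k1=0.01, k2=0.03, 8-bit peak), and the test pixels are illustrative assumptions:

```python
def ssim_single_window(x, y, peak=255.0, k1=0.01, k2=0.03):
    """SSIM over one window, given two equal-length lists of pixel values."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Sample variances and covariance (n - 1 in the denominator).
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    # Small constants that keep the ratios stable when the means or variances are near zero.
    c1 = (k1 * peak) ** 2
    c2 = (k2 * peak) ** 2
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast_structure = (2 * cov_xy + c2) / (var_x + var_y + c2)
    return luminance * contrast_structure

# Hypothetical 4-pixel window and a copy brightened uniformly by 10 levels.
reference = [52, 55, 61, 59]
shifted = [p + 10 for p in reference]

print(ssim_single_window(reference, reference))  # identical windows score 1
print(ssim_single_window(reference, shifted))    # a uniform shift only lowers the luminance term
```

For the uniformly brightened copy the contrast and structure term stays at 1, so the score drops only slightly, unlike MSE, which charges the full squared shift at every pixel.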
The Wikipedia article (at least as of this writing) mentions a paper by Dosselmann and Yang that questions SSIM’s effectiveness. In fact, the Wikipedia one-sentence summary of the Dosselmann and Yang paper is egregiously wrong (“show […] that SSIM provides quality scores which are no more correlated to human judgment than MSE (Mean Square Error) values.”). That’s not at all what that paper claims. The SSIM correlation to the human experiments is 0.9393, while MSE is 0.8709 (where correlations go from -1 for complete inverse correlation, to 0 for completely uncorrelated, to 1 for completely correlated). Further, the difference is statistically significant (p=0.05). The paper does “question whether the structural similarity index is ready for widespread adoption”, but definitely doesn’t claim that it is equivalent to MSE. They do point out that SSIM and MSE are algebraically related (that the structure measure is just a form of MSE on image patches), but that’s not the same as equivalent. That MSE in image patches does better than MSE over the whole image is the whole point!
Overall, when it comes to evaluating image quality, I’m probably going to stick with SSIM, at least for now. There are some better-performing metrics, but SSIM is far easier to compute than anything else comparable that I’ve found (yet). It definitely does better than the simpler MSE or PSNR on at least some types of images, and is statistically similar on others. In other words, if the question is “will people notice this error”, SSIM isn’t perfect, but it’s also not bad.
Extending to other areas?
We had some interesting discussion about whether this kind of approach could apply in other places. For example, you could set up a user study to compare a bunch of cloth simulations, maybe changing grid, step size, etc. From that data alone, you directly have a ranking of those simulations. However, if you use that dataset to evaluate some model that could measure the simulation and estimate the quality, you might then be able to use that assessment model to say whether a new simulation was good enough or not. Like the image datasets, the results would likely be limited to the types of cloth in the initial study. If you only tested cotton but not silk, any quality assessment measure you built wouldn’t be able to tell you much that is useful about how well your simulation matches on silk. I’m not likely to try doing these tests, but it’d be pretty interesting if someone did!
Ponomarenko, Lukin, Egiazarian, Astola, Carli, and Battisti, “Color Image Database for Evaluation of Image Quality Metrics”, MMSP 2008.
Sheikh, Sabir, and Bovik, “A Statistical Evaluation of Recent Full Reference Image Quality Assessment Algorithms”, IEEE Transactions on Image Processing, v15n11, 2006.
Cadik, Herzog, Mantiuk, Myszkowski, and Seidel, “New Measurements Reveal Weaknesses of Image Quality Metrics in Evaluating Graphics Artifacts”, ACM SIGGRAPH Asia 2012.
Wang, Bovik, Sheikh, and Simoncelli, “Image Quality Assessment: From Error Visibility to Structural Similarity”, IEEE Transactions on Image Processing, v13n4, 2004.
Dosselmann and Yang, “A Comprehensive Assessment of the Structural Similarity Image Metric”, Signal, Image and Video Processing, v5n1, 2011.
I know you can’t take everything in a PR posting at face value, but the phrase “our invention of programmable shading” in NVIDIA’s announcement of their patent suits against Samsung and Qualcomm definitely rubbed me the wrong way. Maybe it’s something about personally having more of a claim to having invented programmable shading, at least on graphics hardware, than NVIDIA. Since many of the accounts (I’m looking at you Wikipedia) of the background of programmable shading seem to have been written by people who don’t even remember a time before GPUs, this seems like a good excuse for some historical recollections.
In the beginning…
The seeds of programmable shading were planted in 1981 by Turner Whitted and David Weimer. They didn’t have a shading language, but did create the first deferred shading renderer by splitting a scan line renderer into two parts. The first part rasterized all the parameters you’d need for shading (we’d call it a G-buffer now), and the second part could compute the shading from that G-buffer. The revolutionary idea was that you could make changes and re-shade the image without needing to redo the (then expensive) rasterization. Admittedly, no shading language, so you’d better be comfortable writing your shading code in C.
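The two-part split can be sketched in a few lines. This is a toy illustration of the idea only, not Whitted and Weimer's system: the flat-plane scene, the dictionary G-buffer layout, the function names, and the Lambertian diffuse model are all assumptions made for the example.

```python
def rasterize(width, height):
    """Pass 1 (done once): store the per-pixel shading inputs, i.e. the G-buffer."""
    gbuffer = []
    for y in range(height):
        row = []
        for x in range(width):
            # Toy scene: a plane facing the viewer, with albedo increasing left to right.
            row.append({"normal": (0.0, 0.0, 1.0), "albedo": (x + 1) / width})
        gbuffer.append(row)
    return gbuffer

def shade(gbuffer, light_dir):
    """Pass 2 (repeatable): compute diffuse shading from the stored parameters only."""
    image = []
    for row in gbuffer:
        out = []
        for pixel in row:
            n_dot_l = sum(n * l for n, l in zip(pixel["normal"], light_dir))
            out.append(pixel["albedo"] * max(0.0, n_dot_l))
        image.append(out)
    return image

gbuffer = rasterize(4, 2)                  # the expensive step, run a single time
head_on = shade(gbuffer, (0.0, 0.0, 1.0))  # first lighting setup
grazing = shade(gbuffer, (0.0, 0.8, 0.6))  # re-shade with a new light, no re-rasterization
print(head_on[0])
print(grazing[0])
```

Changing the light or the shading function only reruns `shade`; the G-buffer from `rasterize` is reused untouched.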
The real invention of the shading language
In 1984 (yes, 30 years ago), Rob Cook published a system called “Shade Trees”, that let you write shading expressions that it parsed. I’ve seen some mis-interpretation (maybe because of the name?) that this was a graphical node/network interface for creating shaders. It wasn’t. That was Abram and Whitted’s Building Block Shaders in 1990. Shade Trees was more like writing a single expression in a C-like language, without loops or branches. It also introduced the shader types of surface, light, atmosphere, etc. still present in RenderMan today.
In 1985, Ken Perlin’s Image Synthesizer expanded this into a full language, with functions, loops and branching. This is the same paper that introduced his noise function — talk about packing a lot into one paper!
Over the next few years, Pixar built a shading language based on these ideas into RenderMan. This was published in The RenderMan Companion by Steve Upstill in 1989, with more technical detail in Hanrahan and Lawson’s 1990 SIGGRAPH paper.
Shading comes to graphics hardware
In 1990, I started as a new grad student at the University of North Carolina. I was working on the Pixel-Planes 5 project, which, among other things, featured a 512×512 processor-per-pixel SIMD array. It only had 208 bits per pixel, but had a 1-bit ALU, so you could make your data any size you wanted, not just multiples of bytes (13 bit normals? No problem!). This was, of course, important to give you any chance of having everything (data and computation) fit into just 26 bytes. I was writing shading code for it inside the driver/graphics library in something that basically looked like assembly language.
By 1992, a group of others at UNC created an assembly language interface that could be directly programmed without having to change the guts of the graphics library. This is really the first example of end-user programmable shading in graphics hardware. Unfortunately, the limitations of the underlying system made it really, really hard to do anything complex, so it didn’t end up being used outside the Pixel-Planes team.
Meanwhile, we were making plans for the next machine. This ended up being PixelFlow, largely an implementation of ideas Steve Molnar had just finished in his dissertation, with shading accommodations for what I was planning for my dissertation. I had this crazy idea that you ought to be able to compile high-level shading code for graphics hardware, and if you abstracted away enough of the hardware details and relied on the compiler to manage the mapping between shading code (what I want to do) and implementation (how to make it happen), that you’d get something an actual non-hardware person would be able to use.
It took a while, and a bunch of people to make it work, but the result of actual high-level programmable shading on actual graphics hardware was published in 1998. I followed the RenderMan model of surface/light shaders, rather than the vertex/pixel division that became popular in later hardware. I still think the “what I want to do” choice is better than the “how I think you should do it”, though the latter does have advantages when you are working near the limits of what the hardware can do.
The SIGGRAPH paper described just the language and the surface and light shaders, along with a few of the implementation/translation problems I had to solve to make it work. There’s more in my dissertation itself, including shading stages for transformation and for primitives. The latter was a combination of what you can do in geometry shaders with a shading interface for rasterization (much like pixel shader/pixel discard approaches for billboards or some non-linear distortion correction rendering).
Note that both Pixel-Planes 5 and PixelFlow were SIMD engines, processing a bunch of pixels at once, so they actually had quite a bit in common with that aspect of current GPUs. Much more so than some of the intermediate steps that the actual GPUs went through before they got to the same place.
OK, but how about on commercial hardware?
PixelFlow was developed in partnership with HP, and they did demo it as a product at SIGGRAPH, but it was cancelled before any shipped. After I graduated in 1998, I went to SGI to work on adding shading to a commercial product. At first, we were working on a true RenderMan implementation on a new hardware product (that never shipped). That hardware would have done just one or two operations per pass, and relied on streaming and prefetching of data per pixel to hide the framebuffer accesses.
After it was cancelled, we switched to something that would use a similar approach on existing hardware. The language ended up being very assembly-like, with one operation per statement, but we did actually ship that and had at least a few external customers who used it. Both shading systems were described in a 2000 SIGGRAPH paper.
In 2000, I co-taught a class at Stanford with Bill Mark. That helped spur their work in hardware shading compilers. Someone who was there would have to say whether they were doing anything before that class, though I would not be surprised if they were already looking at it, given Pat Hanrahan’s involvement in the original RenderMan. In any case in 2001 they published a paper about their RTSL language and compiler, which could compile a high level shading language to the assembly-language vertex shaders NVIDIA had introduced, to the NVIDIA register combiner insanity, and to multiple rendering passes in the way the SGI stuff worked, if necessary.
RTSL was also the origin of the Vertex/Pixel division that still exists today.
And on GPUs
And now we leave the personal reminiscing part of this post. I did organize a series of SIGGRAPH courses on programmable shading in graphics hardware from 2000 through 2006, but since I wasn’t actually at NVIDIA, ATI, 3DLabs or Microsoft, I don’t know the details of when some of these efforts started or what else might have been going on behind the scenes.
Around 1999, NVIDIA’s register combiners were probably the first step from fixed-function to true programmability. Each of the two (later eight) combiner stages could do two vector multiplies and add or dot product the result. You just had to make separate API calls to set each of the four inputs, each of the three outputs, and the function. In the SGI stuff, we were using blend, color transform and multi-texture operations as ALU ops for the compiler to target. The Stanford RTSL compiler could do the same type of compilation with the register combiners as well. Without something like RTSL, it was pretty ugly, and definitely wins the award for most lines of code per ALU operation.
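To give a feel for how little arithmetic one combiner stage performed, here is a toy model of a single stage with four inputs and three outputs. It is only a sketch of the arithmetic described above: the function names are hypothetical, it is not the real OpenGL extension API, and it ignores the input mappings, scaling, and other restrictions of the actual hardware. The dot-product variant here returns just the two dot products.

```python
def mul(a, b):
    """Component-wise product of two RGB vectors."""
    return tuple(x * y for x, y in zip(a, b))

def add(a, b):
    """Component-wise sum of two RGB vectors."""
    return tuple(x + y for x, y in zip(a, b))

def dot(a, b):
    """Dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def combiner_stage(a, b, c, d):
    """Multiply-add form: four inputs in, three outputs (A*B, C*D, A*B + C*D) out."""
    ab = mul(a, b)
    cd = mul(c, d)
    return ab, cd, add(ab, cd)

def combiner_stage_dot(a, b, c, d):
    """Dot-product form: the two products are replaced by dot products."""
    return dot(a, b), dot(c, d)

# Hypothetical use: texture color modulated by diffuse lighting, plus a specular term.
texture = (0.5, 0.5, 0.5)
diffuse = (1.0, 0.5, 0.0)
specular = (0.25, 0.25, 0.25)
white = (1.0, 1.0, 1.0)

ab, cd, total = combiner_stage(texture, diffuse, specular, white)
print(total)
```

The whole stage is one line of math here, while on the hardware each input, each output, and the function had to be configured separately, which is where the line count came from.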
Better were the assembly-language vertex programs in the GeForce3, around 2000. They didn’t allow branching or looping, but executed in a deep instruction-per-stage pipeline. Among other things, that meant that as long as your program fit, doing one instruction was exactly the same cost as doing the maximum your hardware supported.
Assembly-level fragment programs came in around 2001-2002, and originally shared that same characteristic: anywhere from 1 to 1024 instructions with no branching, all at the same performance.
Around 2002, there was an explosion of high-level shading languages targeting GPUs, including NVIDIA’s Cg, the closely related DirectX HLSL, and the OpenGL Shading Language.
This list of dates is pretty NVIDIA-centric, and they were definitely pushing the feature envelope on many of these elements. On the other hand, most of these also were connected to DirectX versions requiring similar features, so soon everyone had some kind of programmability. NVIDIA’s Cg tutorial puts the first generation of programmable GPUs as appearing around 2001. ATI and 3DLabs also started to introduce programmable shading in a similar time period (some of which was represented in my 2002 SIGGRAPH course).
As a particular example of multiple companies all working toward similar goals, NVIDIA’s work, especially on Cg, had a huge influence on DirectX. Meanwhile, 3DLabs was introducing their own programmable hardware that I believe was a bit more flexible, and they had a big influence on the OpenGL Shading Language. As a result, though the two were very similar in many ways, especially in the early versions there was a significant difference in philosophy between exposing hardware limitations in Direct3D vs. generality (even when slow on a particular GPU) in OpenGL. In hindsight, though generality makes sense now, on that original generation of GPUs it led too often to unexpected performance cliffs, which certainly hurt OpenGL’s reputation among game developers.
Gregory D. Abram and Turner Whitted. 1990. Building block shaders. In Proceedings of the 17th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’90). ACM, New York, NY, USA, 283-288. DOI=10.1145/97879.97910
Robert L. Cook. 1984. Shade trees. In Proceedings of the 11th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’84), ACM, New York, NY, USA, 223-231. DOI=10.1145/800031.808602
Pat Hanrahan and Jim Lawson. 1990. A language for shading and lighting calculations. In Proceedings of the 17th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’90). ACM, New York, NY, USA, 289-298. DOI=10.1145/97879.97911
Steven Molnar, John Eyles, and John Poulton. 1992. PixelFlow: high-speed rendering using image composition. In Proceedings of the 19th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’92), James J. Thomas (Ed.). ACM, New York, NY, USA, 231-240. DOI=10.1145/133994.134067
Marc Olano and Anselmo Lastra. 1998. A shading language on graphics hardware: the PixelFlow shading system. In Proceedings of the 25th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’98). ACM, New York, NY, USA, 159-168. DOI=10.1145/280814.280857
Mark S. Peercy, Marc Olano, John Airey, and P. Jeffrey Ungar. 2000. Interactive multi-pass programmable shading. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’00). ACM Press/Addison-Wesley Publishing Co., New York, NY, USA, 425-432. DOI=10.1145/344779.344976
Ken Perlin. 1985. An image synthesizer. In Proceedings of the 12th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’85). ACM, New York, NY, USA, 287-296. DOI=10.1145/325334.325247
Kekoa Proudfoot, William R. Mark, Svetoslav Tzvetkov, and Pat Hanrahan. 2001. A real-time procedural shading system for programmable graphics hardware. In Proceedings of the 28th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’01). ACM, New York, NY, USA, 159-170. DOI=10.1145/383259.383275
John Rhoades, Greg Turk, Andrew Bell, Andrei State, Ulrich Neumann, and Amitabh Varshney. 1992. Real-time procedural textures. In Proceedings of the 1992 symposium on Interactive 3D graphics (I3D ’92). ACM, New York, NY, USA, 95-100. DOI=10.1145/147156.147171
Steve Upstill. 1989. The RenderMan Companion: A Programmer’s Guide to Realistic Computer Graphics. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA.
Turner Whitted and David M. Weimer. 1981. A software test-bed for the development of 3-D raster graphics systems. In Proceedings of the 8th annual conference on Computer graphics and interactive techniques (SIGGRAPH ’81). ACM, New York, NY, USA, 271-277. DOI=10.1145/800224.806815
We now have live streaming of the UMBC Global Game Jam 2014 site at youtube.com/umbcgaim! For now, it’s a bunch of people in a computer lab, but check back for the demos tomorrow (Sunday January 26th) around 3:00. Should be fun to watch.
The Global Game Jam has announced their diversifiers for 2014. These are not the theme, which will be announced Friday at the jam. These are optional additional things you can add to your game to help it stand out. Unlike the theme, we’re allowed to share these now. Have fun thinking about how you might explore them in a game.
- Back to the 1885. The game could have been built and played in the 19th century.
- Can You Come And Play? The game has a local multi-player mode.
- Design, Create, Play. All the content in the game is procedurally created, including graphics and sound.
- Hackontroller. The game must use a custom controller invented by the team, or use an existing controller in unconventional manner.
- Homo Sapiens are Boring. The game is meant to be played by cats.
- Honor Aaron Swartz. The game only uses materials found in the public domain.
- I am who I want to be. The game has characters, but nothing in their design suggests a gender.
- Inclusive. The game is specifically designed to be accessible to one or more groups of gamers with disabilities – vision, motor, hearing or cognitive impairments.
- Rebels Learns it Better. In this educational game a hidden learning path is provided for those who oppose the given rules.
- Round and Round. Rotation is one of the primary mechanics in the game.
- The Ultimate Bechdel Test Survivor. The game survives all three conditions of the Bechdel test.
- You Only Live Thrice. The player only has 3 lives and each level starts over when you die.
- You Say it! The game utilises audio produced by the player either recording or instructing player to make sounds.
UMBC is once again hosting the Global Game Jam this January. It will run from 5pm Friday, January 24th to 5pm Sunday, January 26th, just before classes start. Once again, thanks to a generous donation by NextCentury, registration is free. Space is limited, so sign up now!
For anyone who hasn’t participated, the Global Game Jam is a 48 hour game development event with hundreds of host sites around the world. At 5pm local time, each site introduces the jam and announces this year’s theme. Previous years’ themes have ranged from a phrase (“as long as we’re together there will always be problems”) to a word (“extinction”) to an image (ouroboros: a snake eating its tail), to a sound (the recording of a heartbeat). Participants brainstorm game ideas around the theme, form into teams, and spend the weekend building games that are designed to be both fun and express the theme.
The UMBC site is not restricted to just students. In previous years, we have had a mix of UMBC students, alumni, students from other schools, game development professionals, and just people with an interest in game development. More details at gaim.umbc.edu/global-game-jam. However, we are limited to just 40 participants, so sign up early if you want to come. If UMBC fills up, other local(ish) sites include the University of Baltimore, American University, and George Mason University. If you are not near UMBC, check the Global Game Jam for a host site near you.
Make plans to come to the 2013 UMBC Digital Entertainment Conference (DEC) on Saturday, April 27th, starting at 10am in the Engineering Building lecture hall on the UMBC campus. This day long event is organized by the UMBC Game Developers Club, and sponsored this year by Mindgrub.
The DEC is open to anyone, and features speakers from Firaxis Games, Zenimax, Pure Bang Games, Bioware Mythic, and Mindgrub. Whether you are a High School student, go to UMBC or another University, or are already working in a different industry, you are sure to find interesting information about how the games industry works, how some current developers got started, and what they do. If you are a game developer, you are sure to find High School students, UMBC students and students from other Universities who are interested in jobs in the games industry.
|Time|Speaker|
|---|---|
|10:00|Jeremy Shopf – Lead Graphics Engineer, Firaxis|
|11:00|Ching Lau – Artist, Zenimax|
|1:00|Ben Walsh – CEO, Pure Bang Games|
|2:00|Carrie Gouskos – Lead Producer, Bioware Mythic|
|3:00|Michelle Menard – Designer|
|4:00|Alex Hachey – Game Design Lead, Mindgrub|
We have 30 participants in the Global Game Jam at UMBC this weekend, working on eight different games. Fitting for a world-wide event, the theme this year is non-verbal: the sound of a human heartbeat. Participants started at 5PM Friday to build games around this theme.
Starting around 3:30PM, each team will be demoing their game, with the demos live-streamed on the web at twitch.tv/olanom. Watch and be amazed!