It’s a golden era of video game remakes. Earlier this year, Final Fantasy VII was updated for 2020 with a complete overhaul of its graphics and core gameplay systems, and fans of 900-degree skateboard turns will soon ollie into complete remasters of the first two Tony Hawk Pro Skater games.
These overhauls typically take years for professional game studios, which rebuild the game from the ground up. But fans have also been hard at work remaking classic games themselves, using artificial intelligence algorithms to upscale the pixelated, blotchy renderings of older games with crisp, modern graphics.
This community of thousands of game upscalers has sprung up thanks to the ability to instantly access and use A.I. research, which is often posted for free online by researchers everywhere from academia to big tech companies. Hobbyist game upscalers typically use an algorithm called ESRGAN, which won top prize at an international image upscaling competition in 2018.
Using A.I. to remaster old games involves breaking down the components of a video game into two categories, “structures” and “textures.” Structures refers to 3D objects in the game, and textures refers to the graphics stretched over those objects to give them color, detail, and the illusion of depth.
By replacing the textures in games, modders can completely change how a game looks or feels. And in the case of game upscaling, they can also update those textures with newer, higher-resolution versions.
To understand how the algorithms recreating these textures actually function, you have to go back to 2014, when the foundational concept of a GAN (or generative adversarial network) was introduced.
The development of GANs was a “eureka” moment, like the fabled image of Archimedes running from his bath after realizing how water displacement works. In 2014, Ian Goodfellow was drinking at a bar with some fellow A.I. researchers, a send-off for a colleague who had just finished a PhD, he told Wired in 2017. One researcher was describing a method for generating images using deep learning algorithms, but Goodfellow disagreed with the methodology. He thought that images were too complex for one algorithm to learn to generate realistically on its own. But he had an idea: What if one algorithm could teach another how to generate images? Two algorithms, one that tries to generate an image and another that determines whether the result is realistic, might automate the usual trial and error researchers employ to tune algorithms.
“I went home still a little bit drunk. And my girlfriend had already gone to sleep. And I was sitting there thinking: ‘My friends at the bar are wrong!’” he told Wired. “I stayed up and coded GANs on my laptop.”
It worked on the first try.
Playing two algorithms off each other was foundational to an entirely new subsection of artificial intelligence. GANs suddenly became the hot new trend in A.I. research. In 2016, Goodfellow presented a tutorial on how the process worked at the industry’s largest conference, nodding to the history of algorithms playing games against themselves to improve, which dated back to A.I. pioneer Arthur Samuel’s checkers program in the 1950s.
ArXiv, the online repository of research papers favored by A.I. scientists, became flooded with papers about GANs. Goodfellow’s original GAN paper has been cited by nearly 20,000 other pieces of research since 2014.
One of those papers was from a group of researchers at Twitter, who proposed an adaptation of Goodfellow’s work not to generate new images, but to enhance existing ones. This is an area of research called super-resolution.
The Twitter researchers trained the GAN’s image-generating algorithm on two versions of images, one group blurred to a low quality and one group of higher quality. The idea was that the algorithm would learn the difference between low-quality images and high-quality images and be able to reconstruct detail in new images as well. This approach was dubbed Super-Resolution Generative Adversarial Network, or SRGAN.
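To make the idea of paired training images concrete, here is a minimal Python sketch of how one such pair could be built. It is an illustration under stated assumptions, not the researchers’ pipeline: the function names are hypothetical, the image is a tiny grayscale grid, and simple block averaging stands in for whatever blurring the researchers actually used.

```python
def downscale(image, factor):
    """Shrink a grayscale image (a list of rows) by averaging each
    factor x factor block of pixels. Block averaging is an assumption
    made for this sketch, standing in for the blurring step."""
    height, width = len(image), len(image[0])
    low_res = []
    for top in range(0, height - factor + 1, factor):
        row = []
        for left in range(0, width - factor + 1, factor):
            block = [image[top + dy][left + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        low_res.append(row)
    return low_res


def make_training_pair(high_res, factor=4):
    """Return (low-quality input, high-quality target) for one image."""
    return downscale(high_res, factor), high_res


# A hypothetical 4x4 "texture": dark left half, bright right half.
texture = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 0, 255, 255],
]

low, high = make_training_pair(texture, factor=2)
print(low)                      # the 2x2 input the generator would see
print(len(high), len(high[0]))  # the 4x4 target it would be scored against
```

During training, the generator would be handed the small image and judged on how closely its enlarged output resembles the original.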
But the iteration that would bring the technology into the hands of game upscaling hobbyists would ultimately be called an Enhanced Super-Resolution Generative Adversarial Network, or ESRGAN. The paper came from a group of researchers in academic labs sponsored by SenseTime, one of China’s A.I. giants.
SRGAN, like GANs in general, is made of two algorithms: one that generates an image, and one that determines whether the image is real or fake. ESRGAN instead compares the generated image to a real image, and tries to determine which is more real. Researchers say that this more nuanced approach forces the algorithm to eventually generate details with sharper edges and more realistic textures.
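The difference between judging an image on its own and judging it against real images can be sketched in a few lines of Python. This is a simplified illustration, not the researchers’ code: the function names and scores are made up, and it assumes the “relativistic average” form of the comparison, in which each image’s raw score is measured against the average score of the other group.

```python
import math


def sigmoid(x):
    """Squash a raw score into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def standard_scores(real_raw, fake_raw):
    """Standard discriminator: each image is judged in isolation."""
    return [sigmoid(s) for s in real_raw], [sigmoid(s) for s in fake_raw]


def relativistic_scores(real_raw, fake_raw):
    """Relativistic average discriminator (assumed form): each image is
    judged relative to the average raw score of the opposite group."""
    mean_real = sum(real_raw) / len(real_raw)
    mean_fake = sum(fake_raw) / len(fake_raw)
    real = [sigmoid(s - mean_fake) for s in real_raw]
    fake = [sigmoid(s - mean_real) for s in fake_raw]
    return real, fake


# Hypothetical raw scores: generated images that look exactly as
# convincing as the real ones.
real_raw = [2.0, 3.0]
fake_raw = [2.0, 3.0]

print(standard_scores(real_raw, fake_raw))      # every score is well above 0.5
print(relativistic_scores(real_raw, fake_raw))  # scores straddle 0.5
```

With the standard scores, every image simply looks “real.” With the relative scores, an image only rates highly if it looks more real than the other group does on average, which is the kind of comparison described above.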
Just two weeks after ESRGAN was introduced, a video game modder who goes by the name kingdomakrillic posted a tutorial on Tumblr about how to use the A.I. research paper for upscaling video games.
One video game upscaler, Aminuddin Abdollah, told OneZero that kingdomakrillic’s results using ESRGAN spurred his fascination with machine learning.
“After ESRGAN came about, kingdomakrillic’s posts revealed a very important thing overlooked perhaps by the ESRGAN authors themselves: the dataset plays a very major role in the results,” Abdollah told OneZero over Discord chat.
When using ESRGAN for upscaling games, kingdomakrillic had trained the algorithm not on the photographs that the academics had used, but on a dataset of comics called Manga109. The images of Manga109 were more closely related to video game graphics than real pictures, which allowed the algorithm to more accurately understand what kind of distinct lines and shapes video games used.
ESRGAN works by looking for color variation between an image’s pixels, and then inserting the detail that it thinks should exist based on the images it’s seen before. When trained on photographs, ESRGAN inserts tons of details into the large pixels typically seen in older video game art. Here’s an example given by Abdollah, from a game he upscaled this year: On the left is the original, and on the right is the ESRGAN upscale when trained on photographs.
The algorithm actually adds too much detail into the image, oversharpening it and making it look unrealistic. But training it on drawings removes the algorithm’s knowledge of all that extra detail found in photographs. In effect, it blunts the algorithm’s tendency to oversharpen and actually softens the pixels. Calling the process game downscaling would almost be more accurate.
Here is what the same image looks like when ESRGAN is trained on a dataset better suited to video games:
You can see that the transition between colors is more gradual, and the image doesn’t lose its pixel-art quality. It just looks like higher-resolution pixel art.
And game upscalers are sharing their newly trained versions of the algorithms as well. Those who have altered algorithms specifically for certain art styles, like pixel art, have posted them publicly for others to use.
The technique that game upscalers use is wholly different from the approach game studios use. Upscalers keep the same base game, where the detail of the 3D structures is still limited by the processing power of the game systems of the time, and just replace the textures. Game studios remastering a game have the luxury of actually rewriting how the game functions, sometimes in a new engine, the program that combines the 3D models, textures, and physics of the game.
Vicarious Visions, the company remastering the new Tony Hawk games, told Polygon that it was using a newer engine to build the remasters while still retaining some old code from the original games.
“We dug into Neversoft’s codebase and were able to pull the handling code out of there, bring it into the engine that we’re in now and update it to make sure that we are making that feel exactly the way you remember it but updated with modern animation,” Jen Oneal, Vicarious Visions studio head, told Polygon.
But the point of DIY game upscaling isn’t truly to build a game again from the ground up. Rather, it’s to align the way a game looks with how we remember it looking when we first played it in 1985 or 1995. The resolution of the Nintendo 64 was 640×480, less than one-sixth the pixel count of a standard 1080p game today. But viewed on an old CRT television, those games still looked incredible to our eyes.
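The resolution comparison is simple arithmetic, shown here as a short Python check. It assumes that “1080p” means a 1920×1080 frame.

```python
# Pixel counts for the two resolutions being compared.
n64_pixels = 640 * 480     # the Nintendo 64 figure cited above
hd_pixels = 1920 * 1080    # assumption: 1080p is a 1920x1080 frame

ratio = hd_pixels / n64_pixels
print(n64_pixels, hd_pixels, ratio)  # 307200 2073600 6.75
```

A 1080p frame holds 6.75 times as many pixels, so an upscaler has to fill in several new pixels for every one in the original image.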
Upscaling is also far from the first community of people modifying video games. Cult classics like Doom have hundreds of iterations, where modders replace files in the original game to create alternate enemies or settings.
But now some games are looking incredible again, thanks to the modding community, which is still looking for the next technology to bring older video games into the 21st century.
“We’re already seeing the limitations of ESRGAN at recognizing certain things,” Abdollah said. “Waiting for the next big thing to arrive. But hopefully one that hobbyists will be able to use.”