GPT-3 Is an Amazing Research Tool. But OpenAI Isn’t Sharing the Code.
Some A.I. experts warn against a lack of transparency in the buzzy new program
For years, A.I. research lab OpenAI has been chasing the dream of an algorithm that can write like a human.
Its latest iteration on that concept, a language-generation algorithm called GPT-3, has now been used to generate fake writing so convincing that a blog written by it fooled posters on Hacker News and became popular enough to top the site. (A telling excerpt from the post: “In order to get something done, maybe we need to think less. Seems counter-intuitive, but I believe sometimes our thoughts can get in the way of the creative process.”)
OpenAI has been able to build such a powerful algorithm because of its access to massive amounts of computing power and data. And the algorithm itself is bigger than any that’s come before it: The largest version of GPT-3 has 175 billion parameters, the internal values the model tunes during training to make more precise predictions. GPT-2 had 1.5 billion.
While OpenAI has released its algorithms to the public in the past, it has opted to keep GPT-3 locked away. The research firm says it’s simply too large for most people to run, and putting it behind a paywall allows OpenAI to monetize its research. In the past year, OpenAI has changed its corporate structure to make itself more appealing to investors. It dropped a nonprofit standing in favor of a “capped-profit” model that would allow investors to get returns on their investment if OpenAI becomes profitable. It also entered into a $1 billion deal with Microsoft, opening collaboration between the firms and giving OpenAI priority access to Microsoft’s cloud computing platform.
Researchers who spoke to OneZero questioned OpenAI’s decision not to release the algorithm, saying that it goes against basic scientific principles and makes the company’s claims harder to verify. (A representative for OpenAI declined to comment when reached for this article.)
“I remain unconvinced by any of the arguments provided so far for not sharing the code for AlphaGo, GPT-2/GPT-3,” Joelle Pineau, co-managing director of Facebook AI Research (FAIR) and head of the FAIR lab in Montreal, told OneZero in an email. “And there are many more cases in A.I.”
At its heart, GPT-3 is an incredibly powerful tool for writing in the English language. The most important thing about GPT-3 is its size. GPT-3 learned to produce writing by analyzing 45 terabytes of data, and that training process reportedly cost millions of dollars in cloud computing. It has seen human writing in billions of combinations.
This is a key part of OpenAI’s long-term strategy. The firm has been saying for years that when it comes to deep learning algorithms, the bigger the better. More data and more computing power make a more capable algorithm. For instance, when OpenAI crushed professional esports players at Dota 2, it was due to its ability to train algorithms on hundreds of GPUs at the same time.
It’s something OpenAI leaders have told me previously: Jack Clark, policy director for OpenAI, said that the bigger the algorithm, the “more coherent, more creative, and more reliable” it is. When talking about the amount of training the Dota 2 bots needed, CTO Greg Brockman said, “We just kept waiting for the magic to run out. We kept waiting to hit a wall, and we never seemed to hit a wall.”
A similar approach was taken for GPT-3. OpenAI argues that bigger algorithms, meaning more parameters, allow more general behavior. For instance, GPT-3’s most basic function is to act like an autocomplete. Give it one word or a sentence, and it will generate what it thinks comes next, word by word. But it can also answer questions, or even do translations, without needing any changes to the algorithm. This is different from more specialized, fine-tuned algorithms that can tackle only one task.
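The “autocomplete” loop described above — predict the next word, append it, and repeat — can be sketched with a toy model. This is a minimal bigram sketch for illustration only, not GPT-3’s architecture; the training text and all names are invented, and GPT-3 performs the same loop over 175 billion learned parameters rather than a lookup table.

```python
import random
from collections import defaultdict

# Tiny illustrative "corpus"; GPT-3 trained on 45 terabytes of text.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Record which word was seen following which.
following = defaultdict(list)
for current, nxt in zip(corpus, corpus[1:]):
    following[current].append(nxt)

def generate(prompt: str, length: int = 5, seed: int = 0) -> str:
    """Extend the prompt word by word, sampling an observed continuation."""
    rng = random.Random(seed)
    words = prompt.split()
    for _ in range(length):
        candidates = following.get(words[-1])
        if not candidates:  # dead end: this word was never followed by anything
            break
        words.append(rng.choice(candidates))
    return " ".join(words)

print(generate("the cat"))
```

The gap between this sketch and GPT-3 is scale, not the shape of the loop — which is OpenAI’s “bigger is better” argument in miniature.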
Some argue that this is a step toward general intelligence, the holy grail of A.I. that would mean an algorithm could learn and adapt much like a human, while others say the algorithm still doesn’t actually understand the words it’s spitting out.
OpenAI has released a detailed research paper that explains the architecture of the algorithm and the results it achieved, but when it comes to studying how GPT-3 functions, other A.I. researchers have to take OpenAI at its word. The research firm, which has recently pivoted away from its nonprofit roots to raise money and develop commercial products, hasn’t publicly released this algorithm, as it has with earlier ones.
OpenAI infamously claimed in February 2019 that the largest version of its previous GPT-2 algorithm was too dangerous to be released, due to its ability to potentially generate misinformation or fake news. The firm initially released smaller, truncated versions of GPT-2, and seeing no evidence of misuse, ended up releasing the largest version of the algorithm. Now, instead of being too dangerous, GPT-3 seems to be too lucrative to release.
GPT-3 can be accessed only through an API run by OpenAI, similar to how companies like Amazon, Google, and Microsoft have monetized their own algorithms. Coders are able to write programs that send specific commands to GPT-3, which generates a response in OpenAI’s cloud and sends back the result. While the API is free during its private beta testing period, OpenAI is figuring out long-term pricing.
That means researchers can send only specific commands to the algorithm, and OpenAI can revoke access at any time.
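The mechanics of that arrangement can be sketched without touching the real service. The code below only assembles the JSON body a client might send; the endpoint URL and field names are illustrative assumptions, not OpenAI’s documented schema. The point is structural: the client ships a prompt and a few settings, the weights stay on the provider’s servers, and every request passes through infrastructure the provider can monitor or shut off.

```python
import json

# Placeholder endpoint -- an assumption for illustration, not a real URL.
API_URL = "https://api.example.com/v1/completions"

def build_request(prompt: str, max_tokens: int = 50) -> dict:
    """Assemble the JSON body a client would POST to a hosted-model API."""
    return {
        "prompt": prompt,        # the text the model should continue
        "max_tokens": max_tokens,  # cap on how much it generates
        "temperature": 0.7,      # sampling randomness
    }

body = json.dumps(build_request("In order to get something done,"))
print(body)
```

Nothing in this exchange exposes the model itself, which is what makes API access both a safety lever and a business model.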
OpenAI’s reasoning for this comes down to safety and scale. If the firm catches someone misusing the API to do something like prop up a fake news website, then the company could shut down that developer’s access.
As for scale, the company says that the algorithms are large and expensive to run — let alone train to begin with. “This makes it hard for anyone except larger companies to benefit from the underlying technology,” OpenAI’s website says. “We’re hopeful that the API will make powerful A.I. systems more accessible to smaller businesses and organizations.”
The exact cost for OpenAI to train and operate the algorithm is difficult to pin down because of how the price of cloud computing is calculated. The cost of renting a GPU differs wildly, depending on factors like geographic proximity to certain server regions and negotiated rates based on the scale of projects. OpenAI also likely benefited from its billion-dollar partnership with Microsoft, as some of that funding was allocated to build OpenAI its own supercomputer for these kinds of tasks.
But these limitations—the size and lack of transparency—make it hard for other scientists to replicate and validate the algorithm’s efficacy.
A.I. research, for all the venture capital and corporate interest, is still an avenue of computer science, and the scientific method still applies. The best-conducted scientific experiments, in this case the building of an algorithm that succeeds at a task and proves a hypothesis, can be repeated by others.
Pineau, an ardent supporter of replicable computer science, says that she thinks of unreleased algorithms like GPT-3 and AlphaGo as “scientific artifacts.”
“A bit like a dinosaur bone you might dig out, which gives you some evidence to support some theories but is not the same as running an actual experiment,” she told OneZero in an email.
Pineau says that these artifacts can be very useful in shaping future hypotheses, but they’re still not a replacement for conclusive knowledge.
Others worry that by limiting access to the code and trained algorithm, OpenAI threatens the “democratization” of artificial intelligence, an idea that access to A.I. should be attainable by anyone.
The phrase “access to A.I.” is multifaceted, meaning access to computing power, datasets, and the algorithms themselves. Open source frameworks like Google’s TensorFlow and Facebook’s PyTorch make algorithms easy to build and share, and many open source datasets exist.
But computing power comes from hardware, a constrained physical resource that’s most accessible to large companies and well-funded research organizations like OpenAI.
If OpenAI’s experiments turn out to be the way forward for artificial intelligence, and bigger algorithms translate to increased performance, then cutting-edge A.I. becomes inaccessible to those who can’t afford it. It also lets big companies with those resources set the rules for who has access to certain A.I. algorithms — for example, by putting them behind an API and charging for access.
“If we believe that the road to better A.I. in fact is a function of larger models, then OpenAI becomes a gatekeeper of who can have good A.I. and who cannot,” says Mark Riedl, an A.I. professor at Georgia Institute of Technology who studies natural language processing.
Riedl also questions whether OpenAI, which has gone to great lengths in the past to think about how its algorithms could be misused, would monitor all the uses of its new API to see if it’s being used for malicious ends.
“Will OpenAI look at the outputs and try to make judgment calls about whether their technology is being used appropriately? This seems to be a critical question given OpenAI’s mission statement and how it is at odds with their new for-profit model. Can they even monitor at scale?” he asked.
And not everyone is sold that OpenAI’s “bigger is better” method is the way forward.
For example, natural language processing researcher Melanie Mitchell put GPT-3 through its paces on a “copycat” test, where the algorithm was asked to identify patterns in how certain series of letters were changed. If “abc” is changed to “abd,” then what does “efg” change to?
These kinds of tests, which Mitchell developed an algorithm to solve in the 1980s, are a tiny simulation of the kinds of analogies humans make all the time. To make an analogy correctly, you have to understand how all the components relate to one another. In the alphabet example, the algorithm has to understand that the alphabet is ordered and know each letter’s position in it.
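The letter-string puzzle above can be sketched in a few lines — with the caveat that this toy hard-codes one plausible rule (“increment the final letter”) rather than discovering it, and rule discovery is exactly the part Mitchell’s copycat-style programs, and the humans they model, actually do. Function names here are illustrative.

```python
def apply_rule(source: str, target: str, probe: str) -> str:
    """If target differs from source only by incrementing the last letter,
    apply that same change to the probe string."""
    if (source[:-1] == target[:-1]
            and ord(target[-1]) == ord(source[-1]) + 1):
        return probe[:-1] + chr(ord(probe[-1]) + 1)
    raise ValueError("rule not recognized by this toy sketch")

print(apply_rule("abc", "abd", "efg"))  # → efh
```

A system with genuine understanding would handle variations this sketch can’t — longer strings, doubled letters, reversed sequences — which is what the copycat test probes for.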
While the algorithm performed well in many of the tests, Mitchell found that it was also unable to grasp some simple concepts that other algorithms had mastered decades ago.
“On the research side, I personally think that throwing more compute and parameters at a problem may be a dead-end strategy in A.I.,” Mitchell told OneZero. “I don’t think that’s the way that real progress will be made if our goal is to build machines with robust, general intelligence.”
She does concede that access to large amounts of computing power gives tech giants an edge when it comes to building A.I. products that require deep learning. But not every modern problem requires a power-hungry deep learning algorithm. In other words: Not every problem needs a GPT-3-sized solution.
“All in all, GPT-3’s performance is often impressive and surprising, but it is also similar to a lot of what we see in today’s state-of-the-art A.I. systems: impressive, intelligent-seeming performance interspersed with unhumanlike errors, plus no transparency as to why it performs well or makes certain errors,” Mitchell wrote when testing the algorithm.