How OpenAI Sold its Soul for $1 Billion

The company behind GPT-3 and Codex isn’t as open as it claims.

The best intentions can be corrupted when money gets in the way.

OpenAI was founded in 2015 as a non-profit whose primary concern was to ensure that artificial general intelligence (AGI) would be created safely and that its benefits would be distributed evenly across humanity.

“As a non-profit, our aim is to build value for everyone rather than shareholders.” Is it, though?

In 2019, OpenAI became a for-profit company called OpenAI LP, controlled by its original non-profit parent, OpenAI Inc. The result was a “capped-profit” structure that limits returns on investment to 100 times the original sum. If you invested $10 million, you could still get up to $1 billion back. Not exactly what I’d call capped.

A few months after the change, Microsoft injected $1 billion. OpenAI’s partnership with Microsoft was sealed on the grounds of allowing the latter to commercialize part of the technology, as we’ve since seen with GPT-3 and Codex.

OpenAI, one of the most powerful forces (supposedly) leading humanity towards a better future, is now beholden to the money it needs to continue its quest. Can we trust them to keep their promise and stay focused on building AI for the betterment of humanity?

Money always has the upper hand

OpenAI started out as an AI research laboratory, but its ambitions were simply out of reach for the resources it had access to. Training GPT-3 cost an estimated $12 million. Just the training. Where in the world would they get that kind of money, if not from someone bigger who would eventually ask for something in return? When they realized they’d need investment, Microsoft was there, ready to provide cloud computing services in exchange for a license to commercialize OpenAI’s systems in ways that weren’t disclosed at the time.

Karen Hao, the AI reporter at MIT Technology Review, conducted an investigation to answer some questions about OpenAI. In a brilliant article, she exposed the inconsistencies in the company’s discourse. Why would a company founded to ensure a better future for all decide that, “in order to stay relevant,” it suddenly needs huge amounts of private money? The move from non-profit to for-profit drew fierce criticism from the public and even from within the company.

Oren Etzioni, director of the Allen Institute for AI, also received the news with skepticism. “I disagree with the notion that a nonprofit can’t compete. […] If bigger and better funded was always better, then IBM would still be number one.” Caroline Haskins, who was then writing for Vice News, doubted OpenAI’s promise to remain loyal to its mission: “[W]e’ve never been able to rely on venture capitalists to better humanity.”

OpenAI had decided to bet on ever-larger neural networks fueled by bigger computers and tons of data. They needed money, a lot of money, to continue down that path. But, as Etzioni notes, that’s not the only way to achieve state-of-the-art results in AI. Sometimes you need to think creatively about novel ideas instead of “putting more iron behind old ones.”

How OpenAI became the villain of this story

GPT-2 and GPT-3, the “dangerous” language generators

In early 2019, around the time of its for-profit transition, OpenAI announced GPT-2, a powerful language model capable of generating human-sounding text. The researchers described GPT-2, a huge leap forward at the time, as too dangerous to release. They feared it could be used to “spread fake news, spam, and disinformation.” Not long after, however, they released the full model anyway, having found “no strong evidence of misuse.”

Britt Paris, a professor at Rutgers University, said that “it seemed like OpenAI was trying to capitalize off of panic around AI.” Many viewed the thrill around GPT-2 as a publicity strategy: they thought the system wasn’t as powerful as OpenAI claimed, but that, from a marketing perspective, the company could attract media attention and give GPT-2 the hype it wanted in the first place. OpenAI denied these accusations, but the critics weren’t satisfied.

If GPT-2 wasn’t as powerful as they proclaimed, why try to make it seem more dangerous than it was? And if it was indeed that powerful, why release it in full just because they had found “no strong evidence of misuse”? Either way, they didn’t seem to be adhering to their own ethical standards.

In June 2020, GPT-3 — GPT-2’s successor — was released through an API. OpenAI seemed to consider the new system — 100x larger than GPT-2, more powerful, and therefore intrinsically more dangerous — safe enough to share with the world. They set up a waitlist to review every access request one by one, but they still had no way to control what the system would eventually be used for.

They even acknowledged several issues that could arise if the product landed in the wrong hands, ranging from potential misuse — including “misinformation, spam, phishing, abuse of legal and governmental processes, fraudulent academic essay writing and social engineering pretexting” — to biases around gender, race, and religion.

They recognized these problems existed and yet decided to let users experiment with the system. Why release it through an API instead of open-sourcing the model? Well, OpenAI said one of the reasons was to “pay for [their] ongoing AI research, safety, and policy efforts.”
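To make the trade-off concrete: API access means sending a prompt to OpenAI’s servers and getting text back, while the model weights never leave their infrastructure, and every request can be reviewed, billed, or revoked. Below is a minimal sketch of what querying GPT-3 looked like at the time, assuming the openai Python package and an API key granted after the waitlist review; the engine name, prompt, and parameters are illustrative, not OpenAI’s exact setup.

```python
# Minimal sketch of GPT-3 access through the API (assumes the `openai`
# Python package and a key granted after the waitlist review).
import openai

openai.api_key = "YOUR_API_KEY"  # issued by OpenAI and revocable at any time

# Send a prompt to OpenAI's servers; the model itself never leaves their side.
response = openai.Completion.create(
    engine="davinci",  # illustrative engine name for the largest GPT-3 model
    prompt="Write a one-sentence headline about renewable energy:",
    max_tokens=60,
    temperature=0.7,
)

print(response.choices[0].text.strip())
```

That is the whole point of the design choice: with open-sourced weights, anyone could run or fine-tune the model with no oversight; behind an API, OpenAI keeps control over usage and collects revenue from it.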

To summarize: the company “in charge” of protecting us from harmful AI decided to let people use a system capable of producing disinformation and reflecting dangerous biases so it could pay for its costly upkeep. It doesn’t sound very “value for everyone” to me.

Predictably, heated discussions soon emerged on social media about the potential damage GPT-3 could cause. Jerome Pesenti, head of AI at Facebook, raised the issue in a tweet:

“I wish OpenAI had been more open and less sensationalistic, by just open sourcing both [GPT-2 and GPT-3] for research, especially on #responsibleAI aspects, while acknowledging that neither was ready for production,” he said.

In an innocent attempt at leveraging the uniqueness of GPT-3, Liam Porr had the system write a productivity article, which he shared with his blog’s subscribers without revealing the trick. The article reached the number one spot on Hacker News. If Porr, a single student at UC Berkeley, managed to fool everyone with an AI-written piece, what could groups of people with malicious intent do?

It’s one thing to spread fake news; it’s another, very different thing to spread fake news that can’t be reliably distinguished from human-written articles. And that’s exactly what GPT-3 is capable of, as OpenAI itself recognized — and even highlighted:

“[M]ean human accuracy at detecting the longer articles that were produced by GPT-3 175B was barely above chance at 52%. This indicates that, for news articles that are around 500 words long, GPT-3 continues to produce articles that humans find difficult to distinguish from human written news articles.”

Codex and Copilot, breaking the law

This year they’ve done something similar.

A few weeks ago, GitHub, Microsoft, and OpenAI released Copilot, an AI system built on top of Codex and designed to act as an AI pair programmer. Leaving aside the potential threat to the workforce, it was strongly criticized because the system had been blindly trained on open-source code from public GitHub repositories.

GitHub CEO Nat Friedman shared the news on Hacker News, generating concerns about the legal implications of using Copilot. One user pointed out some unanswered questions:

“Lots of questions:

- The generated code by AI belongs to me or GitHub?

- Under what license the generated code falls under?

- If generated code becomes the reason for infringement, who gets the blame or legal action?”

Armin Ronacher, a prominent open-source developer, shared on Twitter an example of Copilot plagiarizing a chunk of copyrighted code.

To which another user responded: “Here we have direct evidence of GitHub [Copilot] directly reproducing a GPL’d chunk of code, proving that this [is] a really dangerous tool to use in commercial environments.”

Going deeper, even if Copilot didn’t copy the code verbatim, an ethical question arises: is it okay for companies like GitHub or OpenAI to train these systems on open-source code written by thousands of developers and then sell the use of those systems back to the same developers?

To this, Evelyn Woods, a programmer and game designer, said: “It feels like it’s laughing in the face of open source.”

Should we pin our hopes on OpenAI?

What are OpenAI’s real intentions now? Are they so tied to Microsoft’s interests that they’ve forgotten their original purpose of working “for the betterment of humanity”? Or did they truly think they were the ones with the best tools and minds to pave this path, even if it meant selling their soul to a giant tech corporation? Are we willing to let OpenAI build the future as it sees fit, or should we diversify our efforts and, more importantly, separate them from financial profit?

OpenAI is a leading force towards more sophisticated forms of artificial intelligence, but there are other very capable institutions, free of monetary ties, that are also doing relevant work. They may not enjoy a cushion of money to rest on, and that may be the exact reason why we should be paying even more attention to their work.

In the end, the top priority of big tech companies isn’t the scientific curiosity of building artificial general intelligence, nor is it to build the safest, most responsible, most ethical kind of AI. Their top priority — which isn’t illicit in itself — is to earn money. What may be of dubious morality is that they’d do whatever it takes to earn it, even if that means going down obscure paths most of us would avoid.

Even Elon Musk, who co-founded OpenAI, agrees with the critiques.

Final thought

That said, I still believe OpenAI employees keep the original mission as their main motivation. But they shouldn’t forget that the end doesn’t always justify the means; the very means can end up undermining the higher ends they serve.

Do we want AGI? Scientifically speaking, the answer can’t be “no.” Scientific curiosity has no limits. However, we should always assess the potential dangers. Nuclear fusion is marvelous, but nuclear fusion bombs aren’t.

Do we want AGI at any cost? Morally, the answer can’t be “yes.” And that’s where we should put our focus, because the pressing issues raised by the fast pace at which these technologies are advancing will affect us all, sooner or later. And those who only look after their own interests, be it OpenAI or anyone else, will bear a significant share of responsibility for the consequences.
