To understand the excitement around GPT-3, let’s take a look at the context in which it exists: the field of Natural Language Processing (NLP). NLP is a subfield of artificial intelligence concerned with processing and analyzing human language. You might not realize it, but you likely interact with some form of NLP every day: it powers spam detection and autocomplete, and it enables you to talk to Siri and Alexa, amongst many other applications.
Challenges within this subfield of AI include speech recognition, natural language understanding, and natural language generation. GPT-3 falls into the last category: it is a natural language generation (NLG) model, which means that it can generate text based on the data it has been given.
The excitement around GPT-3 (Generative Pre-trained Transformer 3) is largely due to its size and the amount of text on which it has been pre-trained. The model has been fed huge amounts of text from digitized books and the web: from every Wikipedia page ever written to conversations on Reddit.
GPT-3 comes in eight different sizes, ranging from 125 million to 175 billion parameters. Putting that in perspective (see visualization) and comparing it to its predecessor GPT-2, which topped out at 1.5 billion parameters, the increase in size is incredible.
Furthermore, in contrast to many other models (such as BERT), GPT-3 needs only a few examples to be taught something new. This is known as few-shot learning: instead of the large labeled datasets that machine learning models normally require, a handful of examples is enough. This is what gets people so excited about GPT-3: custom language tasks without much training data.
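In practice, few-shot learning with GPT-3 boils down to showing the model a handful of input–output pairs in the prompt and letting it continue the pattern. The sketch below builds such a prompt in Python; the sentiment examples are invented for illustration, and the actual call to the GPT-3 API is omitted.

```python
# A handful of labeled examples; these are invented for illustration.
examples = [
    ("I loved this film, what a ride!", "positive"),
    ("Two hours of my life I won't get back.", "negative"),
    ("The soundtrack alone is worth the ticket.", "positive"),
]

def build_few_shot_prompt(examples, new_input):
    """Format examples as Review/Sentiment pairs, ending with the new review."""
    lines = []
    for text, label in examples:
        lines.append(f"Review: {text}\nSentiment: {label}\n")
    # The model is expected to complete the final, unlabeled pair.
    lines.append(f"Review: {new_input}\nSentiment:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(examples, "An instant classic.")
print(prompt)
```

Sending this prompt to the model (for example via OpenAI's completion endpoint) would yield a label for the new review, without any fine-tuning step.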
GPT-3 was developed by OpenAI: an artificial intelligence research institute with a mission to ensure that artificial general intelligence benefits all of humanity. They provided a select group within the AI community with beta access to their API, to further their understanding of its possibilities, business opportunities, and potential risks. Thus far, the model has been used for writing poems, articles, music, and creative fantasies, and has made it possible to talk to historical figures. Many of the use cases so far have been creative and playful. But how can you make use of GPT-3 to create value within your organization, beyond creative applications? We’ve rounded up four high-potential application areas:
Making complex data more accessible
GPT-3 could help make complex data more accessible by converting it into a format that is easier to comprehend. For instance, the model could help people understand the “legal language” often used in contracts by translating it into simpler words, making the information more accessible and transparent for its readers. Furthermore, GPT-3 could be used as a “translator” for business intelligence applications, enabling users to interpret graphs, charts, and tables more easily by transforming this kind of information into plain text. Given GPT-3’s potential for converting complex data into uncomplicated text, the model could help businesses share complicated material with different stakeholders in a way that everybody can understand and profit from.
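A minimal sketch of the business-intelligence “translator” idea: embed the underlying figures of a chart in a prompt and ask the model for a plain-language summary. The metric and the quarterly figures below are invented for illustration; the model call itself is left out.

```python
# Hypothetical chart data: (period, value) pairs.
rows = [("Q1", 120), ("Q2", 135), ("Q3", 160), ("Q4", 150)]

def summarize_table_prompt(metric, rows):
    """Embed tabular data in a prompt asking for a plain-language summary."""
    table = "\n".join(f"{period}: {value}" for period, value in rows)
    return (
        f"Explain the following {metric} figures in plain language:\n"
        f"{table}\n\nSummary:"
    )

print(summarize_table_prompt("quarterly revenue (in thousands of euros)", rows))
```

The model's completion after “Summary:” would be the plain-text interpretation shown to the user.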
Making programming more accessible
GPT-3 could add value in the field of software development by translating code into plain text or converting text into snippets of code. Translating code into plain text would be helpful for non-technical stakeholders or junior developers who want to understand how the code works. Wouldn’t it be nice if you could select a piece of code and, with the help of GPT-3, instantly understand what it does? It could definitely save time in development processes. The other way around, converting text into code, could be a nice starting point for people who want to execute a simple snippet of code, like writing an SQL query or setting up a simple layout, by describing it in plain text without having to ask a developer. Leveraging GPT-3 for these kinds of tasks could greatly increase the efficiency of software development processes.
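As a sketch of the text-to-code direction, one could prompt GPT-3 with a few English-to-SQL pairs and let it complete a new request. The table and column names below are hypothetical, and the model call itself is omitted.

```python
# Few-shot pairs translating plain English into SQL.
# Table and column names are hypothetical.
pairs = [
    ("Show all customers from Amsterdam.",
     "SELECT * FROM customers WHERE city = 'Amsterdam';"),
    ("Count the orders placed in 2020.",
     "SELECT COUNT(*) FROM orders WHERE YEAR(order_date) = 2020;"),
]

def text_to_sql_prompt(pairs, request):
    """Format English/SQL pairs, ending with the new unanswered request."""
    parts = [f"English: {q}\nSQL: {sql}\n" for q, sql in pairs]
    parts.append(f"English: {request}\nSQL:")
    return "\n".join(parts)

print(text_to_sql_prompt(pairs, "List the ten most recent orders."))
```

The model would complete the final line with a query following the pattern of the examples; the result should of course be reviewed before it is run against a real database.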
Enabling simple content creation
GPT-3 could also be used for marketing purposes by generating textual content. Rather than having marketers write product descriptions, this could be automated with GPT-3. By giving GPT-3 a few bullet points with the information that has to be included, the model could generate a text that describes the product. The same applies to work emails, summaries of news articles, or even LinkedIn posts. These kinds of applications could save time and increase work efficiency.
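The bullets-to-copy workflow can be sketched as a simple prompt template; the product details below are invented for illustration, and the completion call is left out.

```python
# Hypothetical product facts that must appear in the description.
bullets = [
    "Stainless steel water bottle",
    "Keeps drinks cold for 24 hours",
    "750 ml, dishwasher safe",
]

def description_prompt(bullets):
    """Turn bullet points into a prompt asking for a product description."""
    listed = "\n".join(f"- {b}" for b in bullets)
    return (
        "Write a short product description based on these bullet points:\n"
        f"{listed}\n\nDescription:"
    )

print(description_prompt(bullets))
```

The text the model generates after “Description:” would be the draft copy, which a marketer could then review and polish.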
Enhancing the customer experience with chatbots
Finally, GPT-3 could be used to enhance the customer experience of organizations by amplifying the language abilities of chatbots. We’ve seen applications where GPT-3 has been used to create new customer experiences for museums: allowing visitors to have a fictional chat with Van Gogh or Einstein to learn more about their creations and inventions. Because a model such as GPT-3 has general knowledge of human language, the chatbot will have answers to most questions. The question remains: will it have the right answers?
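Such a chatbot is typically prompted with a persona description followed by the conversation so far, so each completion stays in character. A minimal sketch, with an invented persona text:

```python
# Persona text is invented for illustration; a real deployment would tune it.
PERSONA = (
    "The following is a conversation with Vincent van Gogh. "
    "He answers questions about his life and paintings."
)

def chat_prompt(history, user_message):
    """history: list of (speaker, text) tuples; returns the full prompt."""
    lines = [PERSONA, ""]
    for speaker, text in history:
        lines.append(f"{speaker}: {text}")
    lines.append(f"Visitor: {user_message}")
    lines.append("Van Gogh:")  # the model completes this turn
    return "\n".join(lines)

history = [("Visitor", "Why did you paint sunflowers?"),
           ("Van Gogh", "I wanted to capture gratitude in yellow.")]
print(chat_prompt(history, "Which painting are you most proud of?"))
```

Each model reply would be appended to `history`, so the growing prompt carries the conversational context; note that nothing in this setup guarantees the answers are factually accurate.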
On that note, GPT-3 is still grappling with its flaws. It has been trained on the massive Common Crawl dataset, an expansive scrape of the web that includes fan fiction, poems, conspiracy theories, and news articles. This data has not been pre-filtered and therefore likely contains biases and prejudices. Beyond bias and prejudice, the same cautiousness exists for GPT-3 as for its predecessors: if it ends up in the wrong hands, what are the dangers of abuse? With the ability to generate ‘human-like’ language without the labor, GPT-3 could potentially be used to accelerate the production of terrorist propaganda and political and financial market misinformation.
OpenAI has confirmed these concerns by openly acknowledging them and publishing its findings. They also seem to be one of the reasons OpenAI only provides access to GPT-3 through its own API: so it can monitor responsible usage of the model and prevent abuse.
And what about the environmental impact?
The environmental impact of large AI models such as GPT-3 has been heavily scrutinized over the past years. In a 2019 study, researchers at the University of Massachusetts Amherst estimated that training a large deep-learning model produces 626,000 pounds of planet-warming carbon dioxide, equal to the lifetime emissions of five cars. GPT-3 being the biggest natural language model yet also means it required a lot of computing power: the cost of training the model has been estimated at around 12 million dollars. Training models such as GPT-3 could thus be seen as an unsustainable practice; however, the exact size of the environmental footprint is still unknown.
OpenAI has shown us that with more computing power and more parameters, models can potentially get better. But is it possible to get close to perfect with a model that has learned from the internet? Personally, I expect that NLG models will soon be generating short texts in real applications, provided they can be trained and fine-tuned well enough for a specific task in a specific context. Admittedly, it is a scary leap forward, but it is also a promising one. GPT-3 is another step closer to artificial general intelligence (AGI): a machine with the capacity to understand or learn any intellectual task that a human being can.
At DEUS we’re currently working on several Natural Language Processing projects and exploring different application areas of natural language generation and models such as GPT-3. We’re always happy to share thoughts on this subject, so if you’re interested in learning more, or share your perspective, we’d love to hear from you!