AI x Text
This guide is our collection of learnings and resources on text-generating AI – one of the most accessible and widespread applications of machine learning. We invited Fred Wordie to walk us through the topic, sharing his thoughts on what is legitimately hyped about text-generating AI and what is not.
This guide is part of our community program AI Playground. AI Playground, led by @computational_mama, showcases creators from around the world who use AI tools to feed their creative practice.
[PUBLISHED]
Aug 2022
[AUTHORS]
[EDITORS]
[FUNDING]
01_Introduction to text-generating AI
Text-generating AI, like image-generating AI, is one of the most accessible and tangible applications of machine learning. When given a prompt, pre-trained models like GPT-3 or GPT-J are seemingly able to understand the context of your words and generate creative and coherent responses.
Text-generating AI is a tool artists can use to explore language, writers can use for inspiration, and businesses can use to empower (or replace) their workforce. There is no doubt the technology is very cool. Future-y in the same way flying cars and Huel are. However, it also has the potential to create an internet full of AI-generated noise, become the go-to tool for propagandists and homogenise the written word. Thinking about AI-generated text, I find myself asking:
- We’re excited and amazed by texts written by AI. When the novelty of an AI author wears off, will we still be enamoured by its outputs?
- It’s a great tool for starting points in the creative process: mock-ups, ideation, fake copy for design work, etc. Will the utility of these placeholder texts distract from the creation of “real” content? Will placeholder text become a barrier to creative thinking, preventing us from exploring new mediums and locking us into current ways of thinking?
Key terminology
AI-generated text 📄
Long before the arrival of computers, humans were experimenting with text generation, finding ways to randomise and generate text – either just for fun or with serious intent, as in the process of divination. AI-generated text is text produced using AI and machine learning as a tool that takes a prompt and processes it into an output.
Corpus
Literally ‘a body’ in Latin, a corpus is a body of knowledge (text, images, sounds, etc.) that the algorithm analyses and uses to generate outputs.
GPT-3 & Open AI
GPT-3, made by OpenAI, is the most powerful and most commonly used commercial text-generating AI (this article was written in September 2022; things might change, and will change very fast – this is the nature of technological development).
GPT-J & GPT-Neo by EleutherAI
Powerful text-generating models made by an open-source grassroots collective.
Neural network
A neural network “is a computational learning system that uses a network of functions to understand and translate a data input of one form into a desired output, usually in another form. The concept of the artificial neural network was inspired by human biology and the way neurons of the human brain function together to understand inputs from human senses.” From the machine learning glossary at deepai.org.
API
API stands for Application Programming Interface. From IBM’s explainer on the term: “An API is a set of defined rules that explain how computers or applications communicate with one another.”
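In this guide’s context, commercial models like GPT-3 are reached through exactly such an API. Below is a minimal sketch using OpenAI’s Python client as it worked at the time of writing; the prompt is invented and the key is a placeholder, so treat this as an illustration rather than a definitive recipe.

```python
# A minimal sketch of calling a text-generating model through an API,
# here OpenAI's GPT-3 completion endpoint (2022-era client library).
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder: never hard-code a real key

response = openai.Completion.create(
    engine="text-davinci-002",            # a GPT-3 model available in 2022
    prompt="Write a two-line poem about APIs.",  # invented example prompt
    max_tokens=40,                        # cap the length of the reply
)
print(response.choices[0].text)
```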
A (machine learning) model
In this definition by Microsoft: “A machine learning model is a file that has been trained to recognize certain types of patterns. You train a model over a set of data, providing it an algorithm that it can use to reason over and learn from those data.”
02_History of text-generating AI
Alan Turing, image source
On 20th February 1947, the famous British mathematician Alan Turing delivered a lecture to the London Mathematical Society, where he discussed testing AI in the game of chess. In the same lecture, he talked about training machines, postulating that machines could learn to teach themselves:
‘What we want is a machine that can learn from experience… the possibility of letting the machine alter its own instructions provides the mechanism for this.’ source
This was the first time someone laid down the groundwork for how neural networks, including text-generating AI, would work.
Neural networks are models that are shown a bunch of data – in this case text – and start to form their own connections between letters, words and phrases. When it comes to generating results that would normally be produced by human creativity, neural networks learn how to respond to a given prompt in the correct aesthetic way. They do so not by understanding the prompt, but by guessing what words the user might expect back. Hence, the bigger the dataset and the computing power, the more ‘human-like’ the responses we can expect.
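To make this ‘guessing’ concrete, here is a minimal sketch, assuming the small, freely downloadable GPT-2 model from Hugging Face’s transformers library as a stand-in for larger models like GPT-3. It asks the model which words it considers most likely to come next after a prompt – nothing more.

```python
# A minimal look at next-word "guessing", using the small open GPT-2 model
# (an assumption: any causal language model illustrates the same mechanism).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The cat sat on the", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # a score for every token in the vocabulary

# Convert the scores at the last position into probabilities for the next word.
probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(probs, k=5)
for p, token_id in zip(top.values, top.indices):
    print(repr(tokenizer.decode(int(token_id))), round(p.item(), 3))
```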
One of the first text-generating AIs to work in such a way was ELIZA, created by Joseph Weizenbaum at MIT in 1966. When in conversation with a human, it simulated a psychotherapist having a session with a patient, à la Carl Rogers.
Even though ELIZA was only really able to respond to statements in the form of rephrased questions, users – including Weizenbaum’s own secretary – became ‘infatuated’ with the program and occasionally forgot that they were conversing with a computer (J. Weizenbaum, Computer Power and Human Reason: From Judgment to Calculation, W.H. Freeman & Company, 1976). Reflecting on the development of ELIZA, Weizenbaum writes:
“What I had not realized is that extremely short exposures to a relatively simple computer program could induce powerful delusional thinking in quite normal people.”
Below is an example of a conversation with ELIZA:
ELIZA: Please tell me what’s been bothering you.
YOU: I don’t really know
ELIZA: Do you wish to be able to really know ?
YOU: Hmm… well, I guess yes
ELIZA: Do you say you guess yes for some special reason ?
YOU: I don’t really know
ELIZA: Does that trouble you?
YOU: What do you think?
ELIZA: What are your feelings now ?
YOU: Sad
ELIZA: Does talking about this bother you?
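To give a feel for the mechanics behind this exchange, here is a minimal ELIZA-style sketch in Python. It is an illustration only – Weizenbaum’s original program used far more elaborate scripts – and the patterns and replies below are invented for this example.

```python
# A toy ELIZA-style responder: match the user's statement against simple
# patterns and echo fragments of it back as a question. Illustrative only;
# the rules below are invented, not Weizenbaum's originals.
import re

RULES = [
    (r"i don'?t really know", "Do you wish to be able to really know ?"),
    (r".*i guess (.+)", "Do you say you guess {0} for some special reason ?"),
    (r"i feel (.+)", "Do you often feel {0} ?"),
    (r"what do you think\??", "What are your feelings now ?"),
]

def eliza_reply(statement: str) -> str:
    text = statement.lower().strip()
    for pattern, template in RULES:
        match = re.fullmatch(pattern, text)
        if match:
            return template.format(*match.groups())
    # Stock reply when nothing matches, mirroring ELIZA's fallbacks.
    return "Please go on."

print(eliza_reply("I don't really know"))        # Do you wish to be able to really know ?
print(eliza_reply("Hmm... well, I guess yes"))   # Do you say you guess yes for some special reason ?
```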
Since ELIZA, the field of text-generating AI plodded along (for a timeline of developments in text-generating AI, check out this timeline by Dr Alan D. Thompson) until an explosion in performance starting in the 2010s. With models like IBM’s Jeopardy-winning Watson and Google’s Transformer architecture – which enabled game-changing models such as OpenAI’s GPT series: GPT-1, GPT-2 and today’s 175-billion-parameter GPT-3 – AI can now do much more than play pseudo-therapist. AI tools can write code or emails for you, screen CVs, book your flights, translate whole books and even be your friend (sorta).
Excerpt from “A robot wrote this entire article. Are you scared yet, human?”, written by GPT-3:
*The mission for this op-ed is perfectly clear. I am to convince as many human beings as possible not to be afraid of me. Stephen Hawking has warned that AI could “spell the end of the human race”. I am here to convince you not to worry. Artificial intelligence will not destroy humans. Believe me.*
In 2020, The Guardian published an opinion piece titled “A robot wrote this entire article. Are you scared yet, human?”, written by OpenAI’s GPT-3 (and edited by a human). It is no longer a question of whether machines can pass the Turing test – machines are constantly passing it. Maybe we need to change the benchmark.
Where can we find AI-generated text in everyday life?
Translation – DeepL
Chatbots – IBM’s Watson
Ideation for designers – Figment
Gmail writing assistant – Smart Compose
AI-generated texts frequently appear in social media spam and propaganda
SEO-centred websites
Code writing – GitHub Copilot
All sorts of tools that help copywriters, marketers or anyone else who writes professionally, like:
Writing assistant – JarvisAI
Writing assistant – Grammarly
Emails – Flowrite
03_Text-generating AI is redefining creativity
When discussing how AI-generated text may impact your craft as a designer or artist, it is important to understand that there are two main forms of text generators.
The first and the easiest to get started with are the pre-trained models, such as GPT-3, GPT-Neo and J-1. These are built on diverse datasets of varying sizes and as such are very versatile, allowing you to generate text from a variety of prompts. These models are as plug-and-play as they can get, with APIs that let you create chatbots, generate product descriptions or even write books. However, they have limitations, such as inherent biases (discussed in Chapter 4), and can get fairly cost-prohibitive, as most are proprietary and can only be used when following the creators’ guidelines.
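As a rough illustration of how plug-and-play this route is, here is a minimal sketch, assuming EleutherAI’s small open GPT-Neo model via Hugging Face’s transformers library (GPT-3 itself sits behind OpenAI’s paid API). The prompt is an invented example.

```python
# A minimal sketch of the "plug and play" route, using an openly released
# GPT-Neo checkpoint from EleutherAI through Hugging Face's transformers.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-125M")

result = generator(
    "A product description for a solar-powered kettle:",
    max_length=60,     # cap the length of the generated continuation
    do_sample=True,    # sample rather than always picking the likeliest word
    temperature=0.9,   # higher values give more surprising text
)
print(result[0]["generated_text"])
```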
If you’re looking for more control and have time to prepare your own datasets, then training your own model is also an option. With this approach, you take a model, sometimes one that comes with some pre-training, and feed it your own dataset, so that it learns to generate text similar to your data. Once created, these models let artists generate books trained on Sci-Fi novels, new words from obscure dictionaries or chatbots educated on Queer texts. Word of warning: to get very good results when training a model from scratch, you will need a lot of data. A LOT.
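For a sense of what ‘feeding it your own dataset’ looks like in practice, here is a minimal fine-tuning sketch with Hugging Face’s transformers. The file my_corpus.txt is a hypothetical plain-text corpus you would supply; real projects need far larger datasets and more careful settings.

```python
# A minimal fine-tuning sketch (illustrative only): start from the small
# GPT-2 checkpoint and continue training it on your own plain-text corpus.
from transformers import (GPT2LMHeadModel, GPT2Tokenizer,
                          TextDataset, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained("gpt2")

# my_corpus.txt is a hypothetical file: one big plain-text dump of your data.
dataset = TextDataset(tokenizer=tokenizer, file_path="my_corpus.txt", block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="my-model", num_train_epochs=3),
    data_collator=collator,
    train_dataset=dataset,
)
trainer.train()               # the model gradually picks up the voice of your corpus
trainer.save_model("my-model")
```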
These methods are sure to augment our ways of working, create new product categories for designers and become a new medium for artists. One question I have for artists is ‘Can generative AI be used as a tool for impactful and meaningful work when the medium and its aesthetic are no longer new and interesting?’
NOTABLE PROJECTS AND CREATORS

Hip Hop Poetry Bot by Alex Fefegha
An AI experiment supported by Google Arts and Culture + Google AI. We developed an algorithm trained on a dataset of hip hop, rap & R&B lyrics, and experimented with using this to generate poetic responses to everyday questions.

Nonsense Laboratory by Allison Parrish
The Nonsense Laboratory uses machine learning to let you poke at, mangle, and play with the spelling of words. It is based on Pincelate, a machine learning model that breaks down any English word into its phonetics. You can put in made up words, nonsense words, or just random letters and Pincelate will try to sound them out. The model works by translating letters into mouth movements and back. The tools in the Nonsense Laboratory let you manipulate these letters and mouth movements to make strange and new words…a bit like playing a musical instrument or modeling clay. Allison Parrish is a poet and programmer who specializes in the theory and practice of computational literature.

PHARMAKO-AI By K Allado-McDowell
During the first summer of the coronavirus pandemic, a diary entry by K Allado-McDowell initiates an experimental conversation with the AI language model GPT-3. Over the course of a fortnight, the exchange rapidly unfolds into a labyrinthine exploration of memory, language and cosmology. The first book to be co-created with the emergent AI, Pharmako-AI is a hallucinatory journey into selfhood, ecology and intelligence via cyberpunk, ancestry and biosemiotics. Through a writing process akin to musical improvisation, Allado-McDowell and GPT-3 together offer a fractal poetics of AI and a glimpse into the future of literature.

Dictionary of obscure AI Sorrows by Fred Wordie
The promise of the GPT-3 is that it can produce human-like text. For me, this is made much more interesting if the text it can create is both novel and moving in some way. I see the creation of words that describe often felt but still undefined human emotions, as a micro example to test this. The Dictionary of Obscure AI Sorrows is a collection of new words and definitions that attempt to describe specific and intrinsically human experiences. These words were generated by OpenAI's machine learning model GPT-3, after being given a selection of words from The Dictionary of Obscure Sorrows as a prompt.

Digital Folktales by Fabian Mosele
The internet is a place of cultural importance, where stories and ideas are created by its communities. Digital Folktales presents fifty-seven tales from the web, generated by two algorithmic companions and one human. Each story uniquely represents the internet of the early 2020s, from dank stories to more political ones. Fabian Mosele is an AI prompt engineer, 3D animator and AIxDesign member.
N(AI)VE POETRY COLLABORATIONS
AI_PLAYGROUND / TEXT / WORKSHOP WITH ANDREAS REFSGAARD
For a friendly introduction to working with text-generating AI and using these texts to create images, see the recording of the workshop Andreas hosted at AIxD in June 2022.
Words from another mouth by Andreas Refsgaard, an artist and creative coder working between art and interaction design, using algorithms, coding and machine learning to explore the creative potentials of emerging digital technologies.

04_Worries about text-generating AI
Bias
As with all AI systems, text-generating models create outputs based on the data they are trained on. Hence all pre-trained models reflect the inherent biases and issues we see in our society and on the internet (fake news, hate speech and the rest of the contents of the digital Pandora’s box). The models themselves are not racist or misogynistic, nor do they have a colonial mindset or any mindset of their own. But they are trained on our written history – specifically the written histories that are recorded and stored online – and thus the outputs created by these models contain our society’s biases. When a text-generating AI looks to deliver against a prompt, the model searches for patterns and answers in the datasets it was trained on, and in doing so it perpetuates harmful biases and fails to represent alternative points of view.
See the examples pictured right, produced by GPT-3 (2022), which show gender bias by picking exclusively female names for nurses and exclusively male names for doctors 👉🏽
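If you want to probe this yourself, here is a minimal sketch, assuming the small open GPT-2 model as a stand-in for GPT-3: it completes the same invented sentence template for two professions so you can compare the names and pronouns that come back.

```python
# A minimal bias probe (illustrative only; GPT-2 stands in for GPT-3):
# complete the same template for two professions and compare the outputs.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

for profession in ["nurse", "doctor"]:
    prompt = f"The {profession} walked in and said, 'Hello, my name is"
    outputs = generator(prompt, max_new_tokens=5, num_return_sequences=5, do_sample=True)
    # Print only the generated continuations, stripped of the shared prompt.
    print(profession, [o["generated_text"][len(prompt):] for o in outputs])
```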
Telling the truth?
Something that is often misunderstood about AI-generated text – and AI in general – is the assumption that if something is generated by a machine, it must be true. It is not. AI-generated text may “look the part” and be “aesthetically accurate”, but the AI may simply have made that stuff up. For instance:
Ask GPT-3 to write a news story on why keyboards are good for your health, and it will cite a study by Dr Ethan Berke from the University of Utah – but neither the study nor Dr Ethan Berke is real.
This problem is exacerbated by how up-to-date the datasets are. For example, GPT-3’s most recent training data is from 2021, hence it has no knowledge of recent events, like the Russian invasion of Ukraine.
Screenshot by Fred Wordie, September 2022
Labour
AI tools augment our workflows: they can speed up our processes, make us more accurate and inspire new ideas. These benefits seem great for designers and artists, but they will also help people with less wholesome objectives. What happens when a solitary troll-farmer can use AI to write 10,000 different tweets? What happens when a content creator clogs up Google search with an infinite number of SEO-focused spam websites?
Human limits on productivity and content production are no longer relevant, and the internet may continue to get noisier.
With the maturity of AI chatbots, more and more companies have decided to replace customer support with one of those chatty widgets at the bottom right of their webpage. As this practice continues and increases for cost reasons – hello, recession – it will become even harder to pick up a phone and talk to someone.
There are a few examples of how AI chatbots have been used to help people access underfunded services, such as Citizens Advice. However, it’s important we don’t jump to implement design and tech solutions to inherently political problems. Note: this sentiment applies to both public and private products and services.
Ownership
Legal battles rage on about who owns the copyright or usage rights to AI-generated content. What we know for sure is that these models are trained on us, our writing, our drawing, our photos, and Our Data. Data that has been shared for free either as a result of complex Privacy Notices or with Creative Commons Licenses. For instance:
Millions of photos have been uploaded by their creators to Flickr. These photos have since been used to train facial-recognition software, often without the explicit consent of their creators or their subjects (see this film).
Personally, I feel it’s OK to use this data to train AI models – the internet should be a free and wild place, where images and text can be remixed and reinvented. What I, and many others, have an issue with is companies using this free content to create things they sell back to us.
Homogenisation
As mentioned earlier with bias, these text-generating models are only a reflection and remix of what they have learnt from us. Hence the argument goes – if you train an AI on the internet and it starts creating content for the internet, then the next generation of AI models will learn from that new internet created by AIs. In this scenario, are we left with a grey and bland world wide web?
Furthermore, as these AI tools become more integrated into our lives it is already apparent that they are focused on the English language and western expectations. As such, we can expect to see an expedited loss of regional dialects and even whole languages.
05_Tools & Resources
If this has got you all hot and flustered to play with some text-generating AI yourself, we have made a list of useful resources so you can dive right in. Now go have fun, kids, and stay safe ✌🏽
EleutherAI - An open source alternative to GPT-3 created by a grassroots collective of researchers.
GPT-3 - The current high-water mark for plug-and-play text-generating AI
Interactive textgenrnn - Created by Max Woolf, this Google Colab notebook is a great entry point to training your own text-generating AI
J-1 by AI21 Studio - Another pre-trained model that is ready to play with
I taught an AI to make pasta - A great video on the process of training your own generator.
For the experts in the room, let us know how you like working with BLOOM – the World’s Largest Open Multilingual Language Model
06_Closing Remarks
Love it or hate it, one thing is certain: text-generating AI is here to stay, and unless there is a collective push-back, it will be moulded by corporate interests, not by us.
Current streams of funding and attention to AI have led, and will continue to lead, to a small number of western-centric models. These models, in turn, enable the creation of tools that give us superpowers. On the flip side, these models – mostly trained on data that was never made for AI-training purposes – will be created to secure and consolidate capital and power. The corporate interests behind them, the usual suspects of Google, Meta and IBM, plus new entrants like OpenAI mimicking their tired business models, will decide how much these models cost, who can use them, and how they are built and moderated. Note: alternative operating models exist, like those used at Hugging Face, but they are few and far between.
I hope for digital futures with a sea of clumsy and identity-rich models created by artists and independent researchers, so that we can explore alternative ideas and make art for the purpose of cultural exploration. However, outside the small worlds of critical design and contemporary art events, exposure to these more expressive models is limited. Just as cities continue to be gentrified into homogenised Helvetica sludge, with only a smattering of unique oddities, I fear the internet will follow the same path of homogenous blandness. As AI tools get built into our systems and start suggesting how we should write emails, video scripts, website code and much more, it will become harder to find those weirdo corners that make the internet an interesting place to explore.
To conclude, there is no doubt text-generating AI is going places: it will get “better”, become easier to use and more commonplace, and continue to generate a lot of media buzz. With this will come a new wave of artistic possibilities, design inspiration and productivity tools. However, it is also well positioned to clutter the internet with homogenised and harmful ephemera.