Figure 1: AI-generated
The development of AI has led to very smart people theorizing about it taking all the jobs and dominating the human race. But those fears have lacked evidence, so far. In this story by the Wall Street Journal, researchers pitted ChatGPT against humans. The results weren’t that close:
How good is AI in generating new ideas?
The conventional wisdom has been not very good. Identifying opportunities for new ventures, generating a solution for an unmet need, or naming a new company are unstructured tasks that seem ill-suited for algorithms. Yet recent advances in AI, and specifically the advent of large language models like ChatGPT, are challenging these assumptions.
We have taught innovation, entrepreneurship and product design for many years. For the first assignment in our innovation courses at the Wharton School, we ask students to generate a dozen or so ideas for a new product or service. As a result, we have heard several thousand new venture ideas pitched by undergraduate students, M.B.A. students and seasoned executives. Some of these ideas are awesome, some are awful, and, as you would expect, most are somewhere in the middle.
The library of ideas, though, allowed us to set up a simple competition to judge who is better at generating innovative ideas: the human or the machine.
In this competition, which we ran together with our colleagues Lennart Meincke and Karan Girotra, humanity was represented by a pool of 200 randomly selected ideas from our Wharton students. The machines were represented by ChatGPT4, which we instructed to generate 100 ideas with otherwise identical instructions as given to the students: “generate an idea for a new product or service appealing to college students that could be made available for $50 or less.”
In addition to this vanilla prompt, we also asked ChatGPT for another 100 ideas after providing a handful of examples of successful ideas from past courses (in other words, a trained GPT group), providing us with a total sample of 400 ideas.
The academic literature on ideation postulates three dimensions of creative performance: the quantity of ideas, the average quality of ideas, and the number of truly exceptional ideas.
First, on the number of ideas per unit of time: Not surprisingly, ChatGPT easily outperforms us humans on that dimension. Generating 200 ideas the old-fashioned way requires days of human work, while ChatGPT can spit out 200 ideas with about an hour of supervision.
Next, to assess the quality of the ideas, we market tested them. Specifically, we took each of the 400 ideas and put them in front of a survey panel of customers in the target market via an online purchase-intent survey. The question we asked was: “How likely would you be to purchase based on this concept if it were available to you?” The possible responses ranged from definitely wouldn’t purchase to definitely would purchase.
The responses can be translated into a purchase probability using simple market-research techniques. The average purchase probability of a human-generated idea was 40%, that of vanilla GPT-4 was 47%, and that of GPT-4 seeded with good ideas was 49%. In short, ChatGPT isn’t only faster but also on average better at idea generation.
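The article doesn’t say which weights the researchers used for that translation. As a rough sketch of how such a conversion can work, the Python below maps a five-point purchase-intent scale to evenly spaced probabilities and averages them for one idea. The three middle scale labels, the weights, and the sample responses are all assumptions for illustration, not the study’s data or method.

```python
# Hypothetical weights: evenly spaced from 0 to 1 across a five-point scale.
# These are an illustrative assumption, not the weights used in the study.
INTENT_WEIGHTS = {
    "definitely wouldn't purchase": 0.0,
    "probably wouldn't purchase": 0.25,
    "might or might not purchase": 0.5,
    "probably would purchase": 0.75,
    "definitely would purchase": 1.0,
}


def purchase_probability(responses):
    """Average the weight of each survey response for a single idea."""
    if not responses:
        raise ValueError("need at least one response")
    return sum(INTENT_WEIGHTS[r] for r in responses) / len(responses)


# Made-up responses for one idea, for illustration only.
sample = [
    "definitely would purchase",
    "probably would purchase",
    "might or might not purchase",
    "probably wouldn't purchase",
]
print(f"Estimated purchase probability: {purchase_probability(sample):.3f}")
```

Averaging these per-idea scores across each group would give figures comparable in form to the 40%, 47% and 49% quoted above.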
Still, when you’re looking for great ideas, averages can be misleading. In innovation, it’s the exceptional ideas that matter: Most managers would prefer one idea that is brilliant and nine ideas that are flops over 10 decent ideas, even if the average quality of the latter option might be higher. To capture this perspective, we investigated only the subset of the best ideas in our pool—specifically the top 10%. Of these 40 ideas, five were generated by students and 35 were created by ChatGPT (15 from the vanilla ChatGPT set and 20 from the pretrained ChatGPT set). Once again, ChatGPT came out on top.
You may be thinking that these humans are MBA students and that other humans would do better. That’s a valid point, but the result is worrying nonetheless.
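To put the top-10% counts quoted above in proportion, here is a quick back-of-the-envelope check in Python. It uses only the figures reported in the article; the group labels and variable names are my own.

```python
# Figures reported in the article.
pool = {"students": 200, "vanilla ChatGPT": 100, "seeded ChatGPT": 100}
top_decile = {"students": 5, "vanilla ChatGPT": 15, "seeded ChatGPT": 20}

total_ideas = sum(pool.values())      # size of the whole idea pool
top_count = sum(top_decile.values())  # size of the top 10%


def hit_rate(source):
    """Fraction of a source's ideas that landed in the top 10%."""
    return top_decile[source] / pool[source]


for source in pool:
    share_of_top = top_decile[source] / top_count
    print(f"{source}: {hit_rate(source):.1%} of its ideas made the top 10% "
          f"({share_of_top:.1%} of the top ideas)")
```

On these numbers, students supplied half of the pool but only one-eighth of the top ideas, and a ChatGPT idea was six to eight times as likely as a student idea to reach the top decile.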
Science corner
Media recommendation
Amazon’s Coach Prime is a four-episode docuseries featuring NFL star-turned-coach Deion Sanders in his final season (2022) coaching Jackson State, an HBCU in Jackson, Mississippi. At the end of the season, he gets the head coaching job at the University of Colorado in Boulder. Fast-forward to 2023 in real time, and the CU Buffaloes are 2-0 after upsetting 17th-ranked TCU and beating rival Nebraska. It’s the most exciting storyline in 2023 sports. Coach Prime is a show about leadership, building great organizations, and excellence.
There you have it: the ninth edition of Sunday Digest, featuring AI dominating humans, the (possibly related) human inability to sit patiently without self-administering electric shocks, and the Legend of Deion Sanders. It’s the portrait of a world spinning faster and faster. The good news is you have Netflix, Uber Eats, and running water. Until next time, be a good citizen, quit doomscrolling, and go outside.
Ad Astra Per Aspera!