Saturday, April 20, 2024

AI Humor

https://www.theatlantic.com/sponsored/google-2023/a-bot-walks-into-a-bar/3874/

When Ken Jennings lost to IBM’s Watson on Jeopardy! in 2011, below his Final Jeopardy answer, he scrawled, “I, for one, welcome our new computer overlords.” By doing that, he easily beat Watson in the humor category, just by using a well-known meme formula signaling submission to a perceived all-powerful force to which resistance is futile.

Formulaic humor would seem precisely what an AI like Watson might have used to banter with Jennings. Yet Watson’s AI of 2011 could no more come up with a snappy comeback than its humanity-humbling predecessor Deep Blue when it took the crown from chess king Garry Kasparov in 1997.

I noted Deep Blue’s supposedly watershed victory in this New Yorker cartoon of mine, which foresaw a time when it might be my turn to say, “I, for one, welcome our new cartoon overlords.”

If that comes to pass, I might have had a hand or two in my own humbling. I’m the president of CartoonStock, the world’s largest database of single-panel cartoons like the one above. While I don’t know if any AI system has used it yet, by making thousands upon thousands of cartoons publicly available and easily scraped, I essentially created a gold mine for them.

The New Yorker Caption Contest is also my brainchild. Since 2005, The New Yorker has published a cartoon without a caption every week and asked readers to compete to write the winning caption. In 2016, the magazine began relying on an algorithm to sort the 5,000 to 10,000 caption entries per cartoon by funniness, aggregating voters’ opinions to present ranked lists. The combination of the prestige of the New Yorker cartoons and the unique quality of this dataset presents the opportunity to give computers what they have always lacked: some humor.

The root of humans’ sense of humor has nothing to do with being able to “get” a joke or make one. Rather, it’s the answer to life’s existential problems that have no solutions, the blessing we receive in exchange for the curse of mortality. As Mark Twain said, “The secret source of humor is not joy but sorrow. There is no humor in Heaven.” AI has no sorrow and thus no need for humor, and creating that need would be cruel. We would have to give a machine sentience enough to suffer and vulnerability enough to die.

And yet, step by step, AI is getting closer to understanding what makes something funny. Perhaps unsurprisingly, considering the rapid advancement of large language models, it’s starting with captions. In 2016, I got an email from Vincent Vanhoucke, then Google DeepMind’s principal data scientist, now the senior director of robotics. “I believe that the success of artificial intelligence will ultimately not be measured in how well it can do at games like Go or whether it can respond to your emails,” he wrote, “but rather whether an artificial mind can one day get its cartoon caption published in The New Yorker.”

I wasn’t surprised that someone as high up the AI food chain as Vanhoucke was interested in the contest. It had been on the radar of the AI community for years, cited in a wide range of academic papers. But I was taken aback by the ultimate ambition, which appeared to be nothing less than creating—gasp—a Bot Mankoff. Winning the caption contest seemed to be merely one small step for machine kind.

At the time, even DeepMind’s whole bag of AI tricks couldn’t make a bot funny: Its combination of computer vision, convolutional neural networks, semantic vectors, and the oxymoronic Long Short-Term Memory wasn’t up to the task of even entering the contest. To be honest, I doubted that AI ever would be. I knew the technology was advancing at a mind-blowingly fast rate, but even if the machine-learning magic were up to snuff, I couldn’t see how all the data and equations in the world could replace the fundamentally human heart of humor that Twain described.

New Yorker cartoons can be zany or intellectual, or philosophical or spot-on observational, but they all share an understanding of what it means to be a certain type of person. In my own work, my objective is not to punch up or down, but to elbow to the side: to make fun of myself and others like me. The best cartoons cause us to laugh with them as well as at them. Even the best AI doesn’t have human peers with whom it shares interests and pet peeves. How could it beat us at making fun of ourselves?

Vanhoucke wasn’t a fan of this line of thinking. He was convinced it was only a matter of time before AI began competing seriously in the caption contest. And I’ve started to conclude he’s probably right. Last year, six years after our initial conversation, he sent me another email. “While we may still be a long way away from AI producing humor,” he wrote, “our latest models have become surprisingly adept at understanding why something is funny.” AI wasn’t writing jokes yet, but it seemed to be grasping the concept of what made a particular joke work.

The model Vanhoucke was describing was PaLM, a version of which now powers Google’s Bard. Here’s how it describes my 1997 cartoon about the chess-playing microwave:

The joke is that the microwave is so advanced that it can play chess. The man is frustrated that the microwave is so advanced that it can play chess, but it can’t reheat lasagna. The joke is funny because it is absurd. The joke is also funny because it is a commentary on the fact that technology is advancing so quickly that it is outpacing our ability to use it.

I’d rate this a solid B- explanation. Soon, PaLM will probably be able to earn an A. On one hand, so what? There will never be a coffee-table anthology of New Yorker cartoon explanations.

But the ability to understand humor is a key stepping stone toward the ability to create it. In a paper last year, a group of researchers led by Jack Hessel of the Allen Institute for AI, with some curation assistance from me, challenged AI models with three tasks: 1) matching a caption to a cartoon, 2) identifying a winning caption, and 3) explaining why a winning caption is funny. In all three categories, humans remained superior to even the most advanced models. The AI’s best performance came when describing the humor behind each cartoon, just as Vanhoucke found. For the Hessel paper, AI wrote 653 explanations for caption contest winners, creating its own database of what makes captions funny. Now someone could simply plop a bunch of the descriptions from the paper into Bard, ask for more, and rinse and repeat until the model has mastered every possible joke formulation.

Next, Hessel attempted a more sophisticated spin on things. The AI model developed 50 ideas for cartoons, each with five caption possibilities. From those 250 combinations, I picked the four I liked best, and cartoonist Shannon Wheeler drew them. Here are the results:

Shannon wasn’t impressed with the output. “Weird cartoon ideas. They lack the implied narrative that’s a solid New Yorker cartoon,” he summarized.

I agree with him, but that doesn’t diminish the scale of the accomplishment here. First of all, the training set of caption-contest cartoons that we fed into the AI model was intentionally bizarre. “A giant fish is seated at the bar with six empty shot glasses in front of it, gesturing to a bartender to bring another round,” one description read. “Museum workers looking at two dinosaur skeletons in a dancing pose like old-time vaudevillians with top hats and canes at a museum exhibit,” said another. Weirdness in, weirdness out.

And even without the narrative power of a New Yorker cartoon, the immersive museum gag and Brunhild on the subway evoked a smile from me. That means AI created at least serviceable cartoons out of nothing: Neither the captions nor images were in its training set, and to my knowledge, they did not previously exist anywhere else either. And for the sake of a clean experiment, we played it completely straight, not altering either the caption or the image description at all. We could have achieved much better results if AI-human collaboration had been permitted.

This points the way toward the most likely role for AI in cartooning: not a replacement but a brainstorming tool, helping creatively blocked cartoonists come up with ideas that the human can then improve upon. My captions will always feel more human than a machine’s because they arise from life in the real world—my own emotions, annoyances, and grievances. But just as some digital native cartoonists prefer iPads and e-ink to pen and paper, some also may like to use AI to reach greater creative heights.

I have no wish to welcome our cartoon overlords. But there’s also no need for me to shun AI models as potential collaborators, creative assistants, or inspirers. Cartoonists have shown that they are alchemists extraordinaire. I’m convinced they will be able to use this tool to augment their alchemy. I’m just as sure there will never be a day when robot cartoonists are creating robot cartoons for robot readers of The New Yorker to laugh at.