One of the reasons I started podcasting a decade ago was to give myself some opportunities to play with new skills, tools, and ways of working that I might not otherwise get to try in the day job. So as I prepared to start covering the classic Chinese novel “Journey to the West” on my podcast, I experimented with incorporating generative AI into my work.
One pain point that has often come up in my podcasting work is the need for images to accompany and promote individual episodes. While there are public-domain or Creative Commons-licensed images for some of the works I’ve covered on my podcast, finding fresh pictures every week becomes difficult when one novel spans around 100 episodes.
When I did the Water Margin Podcast a few years ago, I addressed this by creating Lego scenes for every episode, which was fun but also incredibly time-consuming. I needed something less labor-intensive. So I recently subscribed to ChatGPT and started playing with it to try generating images for my soon-to-be-launched “Journey to the West” podcast series. Here are some of the results.
Monkey Business
First, I told ChatGPT to create an image based on the content of the first chapter of “Journey to the West.” Below are my prompts and 10 subsequent results as I tried to refine the output, along with some thoughts on each image.
Attempt 1
User: generate an image using content from chapter 1 of Journey to the West
Assistant: Here is an image inspired by Chapter 1 of Journey to the West. It captures the mythical essence and grandeur of the Monkey King and his magical surroundings.
I didn’t really care for the style of the image. Also, I’m not sure why the Monkey King was holding some kind of fire wand, which he never used in the novel.
Attempt 2
User: Make the image in the style of a traditional chinese painting and less cartoony
Assistant: Here is an image inspired by Chapter 1 of Journey to the West, rendered in the style of a traditional Chinese painting. It captures the elegance and serene beauty of classical Chinese art while showcasing the Monkey King in his mythical setting.
I liked the style of this image much more. But the Monkey King was now holding a Buddhist staff, which was also not an object associated with him. I also didn’t care for the colorful, flowing monk clothes, which he didn’t wear in the novel. The thing that bugged me most, though, was the giant beard, which seemed more King Lear than Monkey King.
Attempt 3
User: remove the beard from the monkey king
Assistant: Here is the updated image of the Monkey King from Journey to the West, depicted without a beard in the style of a traditional Chinese painting. It maintains the elegant and serene qualities of classical Chinese art while reflecting this detail change.
Umm … pretty sure I asked for beard removal. Sigh. Let’s try something a little different and maybe ChatGPT will just catch on at some point.
Attempt 4
User: put the monkey king in less extravagant clothing and remove his cane
Assistant: Here is the updated depiction of the Monkey King from Journey to the West. He is now dressed in simpler, less extravagant clothing, and his cane has been removed, maintaining the serene and traditional Chinese painting style.
Still far afield clothing-wise, and … is that the cane I specifically asked you to remove? Fine, maybe ChatGPT has poor sartorial taste. Well, early in chapter 1, the Monkey King was still in the process of becoming “civilized,” so let’s try him in the nude.
Attempt 5
User: The monkey king shouldn’t be wearing a Daoist robe. He shouldn’t have any clothes.
Assistant: Here is the updated depiction of the Monkey King from Journey to the West. He is shown in his natural form, with no clothing, blending harmoniously into the traditional Chinese painting style and the serene environment.
Hmmm …
Attempt 6
User: Make him look younger and more energetic. He should be more like an action hero
Assistant: Here is the depiction of a younger, more energetic Monkey King, styled like an action hero. His vibrant pose and dynamic energy are highlighted within the traditional Chinese painting style, emphasizing his adventurous and heroic nature.
This wasn’t gonna fly. The novel makes numerous references to the Monkey King being skinny. This was not skinny.
Attempt 7
User: make him less muscular. He’s skinny
Assistant: Here is the updated depiction of the Monkey King, now portrayed as youthful, skinny, and energetic. His lean build and lively pose emphasize his adventurous spirit, presented in the style of a traditional Chinese painting.
Ok, this was getting closer. He’s wearing the simple yellow tunic that he’s often depicted in. But that vibrant orange fur didn’t fit with the desaturated background.
Attempt 8
User: dial down his color a bit so he blends more with the background
Assistant: Here is the refined depiction of the Monkey King with softer, muted tones that harmonize with the misty and serene background. The adjustment emphasizes his integration into the natural setting while maintaining his dynamic energy and presence.
Yeeesh. What happened here? All the sudden his face was eerily human-like. And he’s staring straight off into the short side of the image, about to leap face-first into the left edge. He’s also leaping away from the waterfall when in the story, he leaped INTO the waterfall.
Attempt 9
User: make him face the waterfall
Assistant: Here is the updated depiction of the Monkey King facing the waterfall, with his back to the viewer. This perspective emphasizes his connection to the natural environment and adds a contemplative, harmonious tone to the traditional Chinese painting-style artwork.
Leaving little to the imagination here.
Attempt 10
User: put some leaves around his waist to hide his shame
Assistant: Here is the updated depiction of the Monkey King, now adorned with a modest arrangement of green leaves around his waist. This addition blends naturally with the scene while preserving the elegant style of the traditional Chinese painting.
I actually kind of like the way the Monkey King was depicted in this iteration, though I don’t know why there was a flowing green scarf growing out of the base of his neck. Also, contrary to what ChatGPT claimed, there was no arrangement of leaves around his waist.
Monk Business
At this point, I decided to stop and try something else. I dumped the script for the intro episode of my podcast into ChatGPT and asked it to create an image for that episode. Here’s what it spat out:
Assistant: Here is an illustration inspired by the podcast script’s introduction to Journey to the West, rendered in the style of a traditional Chinese painting. It captures the spiritual and adventurous essence of Xuanzang’s journey with serene and mystical elements.
Nice, except for the teensy little detail that Xuanzang was a monk and always portrayed as bald. Whoever this guy in the picture was supposed to be, he was NOT a monk.
Instead of trying to give that guy a shave, I told ChatGPT to replace him with the Monkey King instead.
Assistant: Here is the updated illustration with the Monkey King, Sun Wukong, replacing Xuanzang. The artwork reflects his vibrant and energetic nature while maintaining the serene and mystical elements of the traditional Chinese painting style.
Why does he look like a hairier Legolas? And why does he have TWO tails?!! And what’s up with the grumpy cloud monkeys?
Cover Image?
As a final experiment, I described my podcast to ChatGPT and asked it to generate a cover image for the show. I described the show as something along these lines:
“The Chinese Lore Podcast is an English retelling of classic Chinese tales. The podcast has covered novels such as the Romance of the Three Kingdoms, the Water Margin, and Journey to the West.”
After a couple more refinement prompts, here’s the end result (I added the URL later):
Honestly, I could almost see myself using this.
Early Conclusions
- The illustrations are certainly pretty and stylish, and far beyond my abilities.
- Some of the monkey images were hilariously off-base. The podcast cover image, on the other hand, was actually a pretty solid, if generic, extrapolation of the stories covered by my podcast. It could easily be a scene out of some of those tales, though you won’t recognize any of the characters here. They’re just all “generic general from Chinese period drama.”
- If you don’t look too closely or know too much about the subject matter, a lot of these images are probably just fine. And that’s one concern with AI-generated content — unless you know what to look for, completely off-base stuff can easily slip past you.
- Would I use these images? I think the key question is: How would I use them? Some of these are fairly decent as throwaway fodder, and our social media-driven infosphere is awash in images that people don’t have time to look closely at. So if I just need something to blast through that firehose, some of these would work fine. But they’re mostly just eye candy — heavy on style, low on substance, and occasionally completely wrong. So if it’s something people might look closely at and give a moment’s thought to, I probably would not use them at this point.