Artificial intelligence (AI) can generate eerily realistic faces, but what about tribal artwork? That’s the question Victor Dibia, a human-computer interaction researcher and Carnegie Mellon graduate, sought to answer with an AI system trained on a dataset of African masks.
As Dibia explains in a blog post, the work was inspired by a trip to the 2018 Deep Learning Indaba, an annual machine learning conference held at Stellenbosch University, South Africa, in September. Attendees were provided access to second-generation Tensor Processing Units (TPUs) — Google-designed chips purpose-built for fast training or inference of AI models — which Dibia used for training.
He used Google’s TensorFlow machine learning framework to get a generative adversarial network (GAN) up and running on the TPUs. A GAN is a two-part neural network consisting of a generator that produces samples and a discriminator that attempts to distinguish the generated samples from real-world samples. Specifically, he selected a deep convolutional GAN, or DCGAN.
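To make the two-part setup concrete, here is a minimal sketch of the losses commonly used to train a GAN, written in plain Python. It is not taken from Dibia's code: the function names are hypothetical, and the commonly used "non-saturating" generator loss is assumed. The inputs are the discriminator's probability estimates that each sample is real.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # clamp to avoid log(0)
    p = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants real samples scored 1 and generated samples scored 0."""
    real_term = sum(bce(p, 1) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """The generator wants its samples scored as real (assumed non-saturating form)."""
    return sum(bce(p, 1) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs for a small batch.
d_on_real = [0.9, 0.8, 0.95]
d_on_fake = [0.2, 0.1, 0.3]
print(discriminator_loss(d_on_real, d_on_fake))
print(generator_loss(d_on_fake))
```

In training, the two losses are minimized in alternation, so the generator improves only by making samples the discriminator finds harder to tell apart from real ones.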
GANs have a knack for image generation and processing. Data scientists at Alphabet subsidiary DeepMind, for example, recently tasked a GAN with generating convincing photos of burgers, dogs, and butterflies, and researchers at the University of Edinburgh’s Institute for Perception and Institute for Astronomy used GANs to create images of artificial galaxies complete with star clusters, nebulae, and other interstellar features.
And it’s not the first time they and other AI architectures have been used to create artwork. They’ve composed memorable (if not entirely coherent) holiday tunes and written lyrics; produced humanlike paintings; and come up with names for fireworks, to name a few examples.
To “teach” the DCGAN how to create new mask designs, Dibia sourced a manually curated set of 9,300 images depicting African masks. Prior to training, he resized and cropped each image, then converted the images to TFRecords, a format for storing sequences of binary records.
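The article does not say how the cropping and resizing were done. As one illustration of that kind of preprocessing, the sketch below center-crops an image to a square and shrinks it with nearest-neighbor sampling. It treats an image as a nested list of pixel values, all function names are hypothetical, and the TFRecords conversion step is left out.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side

def crop(pixels, box):
    """Cut the box out of an image stored as a list of rows."""
    left, top, right, bottom = box
    return [row[left:right] for row in pixels[top:bottom]]

def resize_nearest(pixels, size):
    """Resize to size x size by picking the nearest source pixel."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // size][x * src_w // size] for x in range(size)]
        for y in range(size)
    ]

def preprocess(pixels, size=64):
    """Center-crop to a square, then resize to the training resolution."""
    height, width = len(pixels), len(pixels[0])
    return resize_nearest(crop(pixels, center_crop_box(width, height)), size)

# A tiny 4-wide, 2-high "image" stands in for a real photo.
image = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
print(preprocess(image, size=2))
```

A real pipeline would more likely rely on an image library with better resampling filters; the nearest-neighbor version is used here only to keep the example dependency-free.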
Two versions of the DCGAN model were trained to produce 64-by-64-pixel and 128-by-128-pixel images, respectively.
In subsequent experiments, the model that generated 64-by-64-pixel images provided “better diversity” than the second system, while the 128-by-128-pixel images were of superior quality. But the latter model suffered mode collapse, a failure case in which the GAN’s generator begins to produce samples with extremely low variety. (Dibia chalked it up to the dataset “being insufficient to train such a large model.”)
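Low variety of this kind can be roughly quantified. The sketch below uses the average pairwise distance between generated samples as a crude diversity score, where a value near zero suggests the generator is emitting near-identical outputs. This is an illustrative heuristic, not a measure the article attributes to Dibia, and the function name and sample vectors are made up.

```python
import math
from itertools import combinations

def mean_pairwise_distance(samples):
    """Average Euclidean distance over all pairs of flattened samples."""
    pairs = list(combinations(samples, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)

# Hypothetical flattened outputs from two generators.
varied = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
collapsed = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

print(mean_pairwise_distance(varied))     # clearly above zero
print(mean_pairwise_distance(collapsed))  # 0.0, every sample is identical
```

Raw pixel distance is a blunt instrument, so in practice such a score would be at most a quick sanity check alongside visual inspection of the generated images.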
The more successful of the two AI models managed to come up with novel masks with sideways orientations, hair or “hair-like projections,” or oblong features.
“The goal is not to generate a perfectly realistic mask … but more towards observing any creative or artistic elements encoded in the resulting GAN,” Dibia wrote.
“In this case, while some of the generated images are not complete masks, they excel at capturing the texture or feel of African art,” Dibia wrote. “For example, I showed a colleague and they mentioned the generated images had a ‘tribal feel to it.’”
He leaves to future work extending the Africa Masks dataset, experimenting with conditional GANs, and exploring other machine learning architectures for realistic, higher-resolution image generation.