Photo/Illustration: Images of people wearing masks are marked with squares to make it easier for artificial intelligence to recognize their faces. (Provided by Pixta Inc.)

A dataset of 1,000 facial photos of Japanese people wearing masks is suddenly in high demand amid a growing need for massive amounts of images to train facial recognition programs.

Operators of stock photo websites are bundling their content to capitalize on surging demand from businesses that use artificial intelligence-based programs to scan people’s faces or items and check whether they match entries in a database.

Stock photos are often used in corporate advertising and media reports. Operators pay the providers of images posted on their websites when customers sign a contract to use that content under certain conditions.

Stock photos are increasingly being used to train AI to recognize human faces and items because they spare developers the trouble of shooting large numbers of sample images for that purpose.

FACIAL PHOTOS OF JAPANESE IN HIGH DEMAND

Pixta Inc., the operator of the major Japanese stock photo platform Pixta, began offering AI training data in 2018. Facial photos are the most sought-after items among its clients, mainly manufacturers of electrical machinery and cameras as well as universities and other research institutions.

Pixta, based in Tokyo’s Shibuya Ward, provides a variety of data for training AI-based programs to recognize the faces of Japanese.

Caucasians account for a large portion of the images of humans used for machine learning applications. That imbalance is blamed for a growing problem: systems that identify people in photos are less accurate for people with darker skin.

Pixta began receiving more inquiries about its stockpile of facial images of Japanese from clients who hope the data will improve the accuracy of facial recognition systems intended for use in places with many Japanese.

In June 2021, the company started offering datasets each containing 1,000 facial photos of Japanese wearing masks. The same year, inquiries for its AI training data tripled from a year earlier.

The spread of mask wearing amid the pandemic has forced manufacturers to update their facial recognition programs to enable them to recognize human faces with masks and identify individuals without obtaining information on the mouth and surrounding areas.

Makers find it convenient to train AI using images of human faces covered with masks or other items.

“We began receiving more inquiries for photos of people wearing masks, so we started offering them as a dataset,” said Sayaka Fukumoto, a Pixta official in charge of customer services. “We hope the service will help clients save time and energy collecting those photos for their research.”

To create the dataset, Pixta officials search for photos by the tags, such as "mask" and "Japanese," that are attached to them when they are posted on the platform.
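The tag-based selection described above could be sketched roughly as follows. This is only an illustration: the `Photo` record, the `select_by_tags` function and the case-insensitive matching are assumptions made for the example, not a description of how Pixta's platform actually works.

```python
from dataclasses import dataclass, field


@dataclass
class Photo:
    # Hypothetical record: an ID plus the tags attached at upload time.
    photo_id: str
    tags: set[str] = field(default_factory=set)


def select_by_tags(photos, required_tags, limit=1000):
    """Return up to `limit` photos that carry every required tag."""
    required = {tag.lower() for tag in required_tags}
    selected = []
    for photo in photos:
        if len(selected) >= limit:
            break
        # Assumed rule: a photo qualifies only if it has all required tags.
        if required <= {tag.lower() for tag in photo.tags}:
            selected.append(photo)
    return selected


catalog = [
    Photo("a", {"mask", "Japanese"}),
    Photo("b", {"mask"}),
    Photo("c", {"Mask", "japanese", "office"}),
]
masked_japanese = select_by_tags(catalog, ["mask", "Japanese"])
```

In this sketch, the `limit` of 1,000 mirrors the dataset size mentioned in the article; any further review or editing of the selected images is outside its scope.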

Each dataset of images edited for machine learning applications costs 165,000 yen ($1,300), while a package of unedited images is offered for 99,000 yen. Some clients buy more than 10 datasets at once, according to a Pixta official.

NO GLOBAL RULES ON USE OF AI TRAINING DATASETS

Getty Images Inc., a Seattle-based company that operates the world’s largest stock photo platform, also began offering datasets for machine learning applications in Japan and elsewhere around four years ago.

It has sold the rights to use millions of images to manufacturers, retailers and public entities. In the agriculture sector, for example, photos of ripe fruit are in high demand for training systems that quickly assess the ripeness of fruit at harvest or shipping.

The company also used to offer datasets to those who need AI training data to develop facial recognition systems, but it stopped doing so.

Unlike Pixta, Getty Images is taking a cautious approach because international rules have yet to be established on the use of such data for machine learning purposes.

Those uploading images of people to stock photo websites need to obtain a model release, or consent from the subjects of the photos to publish the images, beforehand.

Getty Images announced in March that it was introducing a new model release form to check whether models consent to the use of their biometric data captured in photos for training machine learning algorithms to develop biometric identification technologies.

“Global privacy law is rapidly evolving and in many jurisdictions explicit consent is required to use biometric data for AI/ML (machine learning) purposes,” said a Getty Images official. “With our new model release we will be building a biometrically released data set that can give customers legal confidence that consent has been obtained.”

Yoichiro Itakura, a lawyer well-versed in personal information protection, said, “Regulators overseas are moving to create a legal framework to protect personal data, such as the EU’s General Data Protection Regulation. It (Getty Images) probably decided that it would be safer for its business to obtain explicit consent from models.”

Pixta mainly targets customers in Japan.

According to the government’s Personal Information Protection Commission, Japan’s Personal Information Protection Law stipulates that businesses should inform subjects in their photos when providing the images, which contain the models' personal data, to third parties.

But the law does not require companies to specify the purpose of doing so.

Tatsuhiro Ueno, a professor of intellectual property law at Waseda University, said people can, in principle, freely use content for information analysis, such as machine learning, even if they do not hold the copyright to the original works, as set forth in Article 30-4 of the Copyright Law.

“We regularly review our model release forms by following changing trends and other companies’ moves,” said a Pixta public relations official. “We will continue to closely watch moves to revise related laws and take an appropriate response.”