XJTLU PhD student's research advances AI image recognition

18 Dec 2019

A Xi’an Jiaotong-Liverpool University student’s research could improve the safety of self-driving cars and enhance other technologies that rely on the use of artificial intelligence to identify images.

The research of Bingfeng Zhang, a PhD student in XJTLU’s School of Advanced Technology, focusses on a form of semantic segmentation – a process that can enable a computer to recognise the subject of an image and a technology that is critical to self-driving cars’ abilities.

Through his research, Zhang has found a new, highly effective way to use weakly supervised semantic segmentation to identify images.

He explains that semantic segmentation involves “training” the computer to intelligently label pixels in an image so that it can identify the subject.

“In semantic segmentation, we are trying to make the computer smarter than simply being able to recognise that an image is of a dog,” he says.

“We want the computer to learn which pixels in the image belong to the dog, so that in the future it can recognise other images more intelligently.

“To train the computer, we feed it a vast number of images and label the pixels to tell the computer which belong to a dog and which belong to something else, like a cat.

“After the computer has been trained in this manner, when we input a new image, the computer can identify the pixels and tell us what the image is.”
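As a rough illustration of what labelling pixels means, the Python sketch below assigns each pixel of a tiny image the class with the highest score and then reports which subjects appear. The class names, the 2x3 grid, and the scores are invented for illustration; they are not taken from Zhang's work, and in a real system the scores would come from a trained model.

```python
# Hypothetical class list; the index is the class ID used in the label grid.
CLASSES = ["background", "dog", "cat"]

# Invented per-pixel class scores for a 2x3 image: scores[row][col][class_id].
scores = [
    [[0.9, 0.05, 0.05], [0.2, 0.7, 0.1], [0.1, 0.8, 0.1]],
    [[0.8, 0.1, 0.1], [0.3, 0.6, 0.1], [0.2, 0.1, 0.7]],
]


def predict_mask(scores):
    """Return a grid giving the highest-scoring class ID for each pixel."""
    return [
        [max(range(len(pixel)), key=lambda c: pixel[c]) for pixel in row]
        for row in scores
    ]


def classes_present(mask, class_names):
    """List the names of the non-background classes that appear in the mask."""
    ids = sorted({cid for row in mask for cid in row if cid != 0})
    return [class_names[cid] for cid in ids]


mask = predict_mask(scores)
print(mask)                            # one class ID per pixel
print(classes_present(mask, CLASSES))  # subjects found in the image
```

The grid of class IDs is the segmentation itself: it records not only that a dog is in the image but which pixels belong to it.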

Semantic segmentation can be used to enable an autonomous car to “see” its surroundings and any hazards.

“If the car’s computer can more accurately and quickly identify the images that enter its sensors, its driving safety and accuracy can be improved,” he notes.

While a computer's ability to recognise an image has many useful applications, the process of training it by labelling each pixel is time-consuming, Zhang explains. Researchers are therefore trying to find ways to train a computer to recognise images with less human effort.

The method Zhang has used – weakly supervised semantic segmentation – requires less human input than other forms of semantic segmentation but presents challenges in achieving image identification accuracy.
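To give a sense of the difference in human input, the Python sketch below compares the number of labels needed to annotate one small image pixel by pixel with the number needed if only the classes present are tagged. This assumes image-level tags as the weak labels, which is one common form of weak supervision; the article does not say which kind of weak label Zhang's method uses, and the 4x4 image is invented.

```python
def image_level_tags(mask):
    """Collapse a per-pixel label grid into the set of class IDs it contains."""
    return {cid for row in mask for cid in row}


# Hypothetical fully labelled 4x4 image: 0 = background, 1 = dog.
full_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

# Full supervision: a human supplies one label per pixel.
pixel_labels_needed = sum(len(row) for row in full_mask)

# Weak supervision (assumed form): a human supplies one tag per class present.
tags = image_level_tags(full_mask)

print(pixel_labels_needed, len(tags))
```

The gap widens quickly with image size, since the number of pixel labels grows with the image's area while the number of tags does not.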

However, his approach was demonstrably successful: when tested on a public dataset, his method achieved the highest performance among those compared.

Zhang’s research paper was recently selected for publication at the 34th AAAI Conference on Artificial Intelligence, which will be held in New York City in February 2020.

Zhang said he found inspiration for his work after reading another paper on the topic and wanting to improve upon the methods described.

“When I first read a paper about this task, I was very confused; it was very complicated,” he said.

“Like many researchers who have devised ways to perform this task, first, the authors' method identified what the image is – a dog, for example. Then they used a second model to find information like the height of the dog.

“Using two individual models is very complicated and takes a lot of time. I wanted to be able to use just one model to perform the same task.”

Professor Eng Gee Lim, dean of the School of Advanced Technology and director of the AI Research Centre at XJTLU, commended Zhang’s achievements under Dr Jimin Xiao.

“Bingfeng has thrived in the research-led education environment at XJTLU,” he said.

“Publishing in a top AI conference as a second-year PhD student is exceptional and demonstrates the high quality of research conducted in the University.”

The upcoming conference is sponsored by the Association for the Advancement of Artificial Intelligence and holds the highest ranking conferred by the China Computer Federation.

By Joseph Jones, edited by Tamara Kaup
