Researchers at Xi’an Jiaotong-Liverpool University have proposed a new method to teach a computer to recognise irregular text common in everyday scenarios, such as on road signs, licence plates or street shop logos.
For example, this type of technology could let a driverless car work out where it is from nearby signs and other text, helping it stay safe and on the right path.
Jing Li, a PhD student at XJTLU’s School of Advanced Technology and the first author of a paper on the method that was recognised at an international conference, explains that this text is not always even or on a horizontal line.
Instead, it may be in irregular or distorted patterns, with varied shapes, font styles, sizes and colours. This presents a challenge in teaching the computer to read it.
Examples of text layouts on signs
To solve the problem, the XJTLU researchers’ method first adjusts the irregular text so that it is less distorted. This adjustment is called rectification.
“This research focuses on the rectification of scene texts, a process before recognition, which adjusts the irregular text and turns the distorted original text into a regular shape, thus reducing the difficulty of recognition and improving accuracy,” Li said.
Transformation of text: left, unrectified; right, rectified
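To make the idea of rectification concrete, here is a minimal sketch in Python. It is not the XJTLU researchers' method, which the article does not detail. It assumes the simplest possible case: the text sits inside a skewed four-sided region whose corners are already known, and a single perspective transform (a homography) maps that region onto an upright rectangle. The function names, corner coordinates and output size are all hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_homography(src, dst):
    """Estimate the 3x3 perspective transform that maps each src point to its dst point."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair contributes two linear equations in the nine entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]

def apply_homography(h, points):
    """Map an array of (x, y) points through the transform h."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ h.T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical corners of a tilted sign in an image, clockwise from top-left.
skewed_corners = [(10, 40), (110, 20), (120, 70), (20, 90)]
# Target: an upright 100 x 32 pixel rectangle, same corner order.
upright_corners = [(0, 0), (100, 0), (100, 32), (0, 32)]

H = estimate_homography(skewed_corners, upright_corners)
rectified = apply_homography(H, skewed_corners)
```

In this sketch, `rectified` holds the corners after the warp, which should coincide with the upright rectangle. A single transform of this kind can only straighten flat, tilted text; curved or otherwise warped text would need a more flexible mapping.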
Dr Qiufeng Wang, Li’s advisor at XJTLU’s Laboratory of Cognitive Computation and Applied Technology (LCCAT), said Li’s proposed method considers outcomes beyond rectification.
“Li considered the effectiveness of not only the rectification but also the recognition. I believe that this method can help improve scene text recognition.”
The paper was awarded Best Paper Runner-Up at the 27th International Conference on Neural Information Processing (ICONIP), making it one of only four papers to receive an award at the conference.
Professor Kaizhu Huang, Associate Dean of Research at the School, said: “This is one of five paper awards that LCCAT has received in recent years. It represents the recognition from counterparts in China and abroad of LCCAT’s achievement in artificial intelligence and pattern recognition.”
Based in the School of Advanced Technology at XJTLU, LCCAT focuses on pattern recognition, cognitive computation, machine learning and their applications in text, images, sounds and videos.
ICONIP is one of the leading conferences in the Asia-Pacific region on neural networks and related fields.
By Huatian Jin
Translated by Xiangyin Han
Edited by Chloe Byrne and Tamara Kaup