Recently, three undergraduate students from Xi’an Jiaotong-Liverpool University Entrepreneur College (Taicang) won silver medals in competitions on Kaggle, a prestigious international data science competition platform. The three participants, Yunze Wang, Yaqi Yu and Jiayuan Zhu, are all Year Three students in the BEng Data Science and Big Data Technology programme.
Each of their teams finished in the top 5% of its competition, the threshold for a silver medal.
Kaggle, a subsidiary of Google, is a data science community and competition platform with more than 850,000 participating data scientists worldwide. Enterprises and researchers with a data science need, such as identifying animals from photographs or providing better e-commerce recommendations, can host competitions offering prize money for the best solution.
Most participants are master's and doctoral students from top universities, or professionals in relevant fields such as data analysis and machine learning. In competitions with more than 1,000 participating teams, the top 10 teams plus a further 0.2% of the field are awarded a gold medal, the top 5% a silver medal, and the top 10% a bronze medal.
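As a rough illustration of how these thresholds translate into ranks, the sketch below computes medal cutoffs for a competition with at least 1,000 teams. It assumes the gold rule means ten medals plus one more for every 500 teams (0.2% of the field) and that percentage cutoffs round down; the function names are hypothetical, and Kaggle's exact rounding may differ.

```python
def medal_cutoffs(num_teams: int) -> dict:
    """Return the lowest rank that still earns each medal.

    Covers only competitions with 1,000 or more teams, as described above.
    Rounding down is an assumption made for this sketch.
    """
    if num_teams < 1000:
        raise ValueError("thresholds here apply only to 1,000+ teams")
    return {
        "gold": 10 + num_teams // 500,    # top 10 plus 0.2% of the field
        "silver": num_teams * 5 // 100,   # top 5%
        "bronze": num_teams * 10 // 100,  # top 10%
    }


def medal_for_rank(rank: int, num_teams: int):
    """Return 'gold', 'silver', 'bronze', or None for a final rank."""
    if rank < 1 or rank > num_teams:
        raise ValueError("rank must be between 1 and num_teams")
    cutoffs = medal_cutoffs(num_teams)
    for medal in ("gold", "silver", "bronze"):
        if rank <= cutoffs[medal]:
            return medal
    return None
```

Under these assumptions, a field of 3,852 teams gives gold to the top 17, silver down to rank 192 and bronze down to rank 385, so a finish around 30th place (roughly the top 0.8%) earns silver rather than gold.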
Finance, literacy and brains
Yunze Wang and his teammates ranked in the top 0.8% in the Optiver Realized Volatility Prediction competition, in which 3,852 teams participated. Optiver, a leading global market maker, tasked participants with using stock trading data to predict stock volatility.
To solve the problem, Wang’s team tried various models, including CatBoost, an RNN with attention, and a Transformer, eventually combining their predictions. He says the most important and difficult part of the process was feature engineering, which comes before model selection. He explains: “Feature engineering refers to selecting better data features from the original data to improve the performance of the model. As I didn’t have enough knowledge in the financial field, I had to learn a lot of things about stocks, random processes, time series and statistics, so that I had a better understanding of the datasets.”
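To give a sense of what a hand-built feature can look like in this setting, here is a minimal sketch of realized volatility, a standard quantity computed as the square root of the sum of squared log returns over a window. The article does not say which features Wang's team used, so this is an illustration only, and the function names are hypothetical.

```python
import math


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def realized_volatility(prices):
    """Square root of the sum of squared log returns over the window."""
    return math.sqrt(sum(r * r for r in log_returns(prices)))
```

A flat price series yields a realized volatility of zero, while larger swings between consecutive prices push the value up; a feature like this could then be fed to a model alongside others.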
Yaqi Yu and his teammates participated in the CommonLit Readability Prize competition, in which they rated the complexity of reading passages for classroom use by students in grades three to 12.
According to Yu, the team chose to combine five models, including CLPR, RoBERTa, and mean pooling. They made some improvements to the official pre-trained RoBERTa model and trained both RoBERTa-base and RoBERTa-large models. They also deliberately preserved the differences between these models so that the combined results would be more accurate.
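One simple way to combine several models is to average their predictions, optionally with weights. The sketch below shows that idea under stated assumptions: the article does not describe how Yu's team actually merged its five models, and the `blend` function and its parameters are hypothetical.

```python
def blend(predictions, weights=None):
    """Weighted average of per-model predictions.

    predictions: list of lists, one inner list of scores per model,
                 all of the same length.
    weights:     optional list with one non-negative weight per model;
                 defaults to equal weighting.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    n = len(predictions[0])
    if any(len(p) != n for p in predictions):
        raise ValueError("all models must predict the same number of items")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n)
    ]
```

Averaging tends to help most when the models make different kinds of errors, which is consistent with the team's decision to keep its models distinct from one another.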
Jiayuan Zhu participated in the RSNA-MICCAI Brain Tumor Radiogenomic Classification competition, which involved training a model on MRI data to improve brain tumour diagnosis. “This competition is in line with the practical application of what we’ve learned, because it calls for data science knowledge to solve specific problems,” Zhu says. “Therefore, in addition to the knowledge of big data technology, self-learning is also important – the more knowledge you have acquired in the related field, the easier it would be to find the proper approach to pre-process the data for a more accurate calculation result.”
Yu adds that because models in data science are updated so rapidly, it is not enough to rely on knowledge gained in the classroom: people should constantly explore and learn new things.
In the two years since he first took part in Kaggle, Wang has won three silver medals and one bronze. Yu, who also first participated two years ago, has won two silver medals and one bronze. This was Zhu’s first Kaggle competition, and it earned her a silver medal.
Professor Angelos Stefanidis, Dean of the School of AI and Advanced Computing of XJTLU Entrepreneur College (Taicang), says: “I know how hard these students have worked, and their accomplishments motivate all of us to do better.
“Their work captures the essence of what we try to do in our School, which is to focus on close collaboration between academic staff, students, industry, government and the wider community, with the aim of fostering a modern, entrepreneurial and sustainable society.”
By Qiuchen Hu
Translated by Ziying Shu
Edited by Patricia Pieterse