Research shows how recommendations can be made without user history

20 Apr 2020

Recommender systems (RSs) are widely applied in everyday life. On online shopping and video platforms, for example, RSs predict users’ preferences and make personalised recommendations by analysing behaviour such as purchase records and viewing history.

But how is it possible to make recommendations to new users without any record or history? A research team from XJTLU’s Department of Mathematical Sciences proposed a solution to the challenge in their paper.

The paper, titled “Knowledge Discovery and Recommendation With Linear Mixed Model”, was recently published in IEEE Access, a first-class journal that presents research results across computing and technology. The lead author, Zhiyi Chen, graduated from the Department of Mathematical Sciences in 2019 and is studying for a master’s degree in statistics at Columbia University.

Chen explained that traditional RSs analyse what users have bought or viewed and make recommendations based on the features of those products. The team’s solution is instead based on the features of users: by analysing users who have bought the same product and summarising their characteristics, the system can make recommendations to people with similar characteristics.

“Using the linear mixed-effects model (LMM), I analysed around a million user ratings and found some patterns in users’ features. That enabled recommendations to new users in line with these patterns,” Chen says.

MovieLens is a movie recommendation website run for research purposes by GroupLens, a research lab at the University of Minnesota. The website categorises users by gender, age, occupation and other attributes, both to recommend the most suitable movies and to make data collection convenient for researchers.

Chen started with age. He analysed ratings of the movie Life Is Beautiful using LMM and obtained the average ratings given by different age groups. To find a more general pattern, he went on to analyse the whole comedy genre, which has the largest quantity of rating data, and then other genres.

“From the data analysis, I found that even for movies with extremely high or low overall ratings, reviews from different age groups can vary a lot. Taking comedies as an example, users aged 50-55 generally give the highest ratings, while those aged 18-24 give lower ones. Therefore, movie websites should recommend more comedies to older viewers than to younger ones.”
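
The idea of recommending to a new user from group-level patterns can be sketched in a few lines of Python. This is a simplified illustration, not the method from the paper: it uses plain per-group averages instead of LMM estimates, and the ratings, the genre labels and the function names `mean_rating_by_group` and `recommend_genres` are all invented for the example.

```python
from collections import defaultdict

# Toy ratings invented for illustration: (age_group, genre, rating on a 1-5 scale).
RATINGS = [
    ("18-24", "comedy", 3.0),
    ("18-24", "comedy", 3.5),
    ("18-24", "drama", 4.0),
    ("50-55", "comedy", 4.5),
    ("50-55", "comedy", 4.0),
    ("50-55", "drama", 3.5),
]


def mean_rating_by_group(ratings):
    """Return {(age_group, genre): mean rating} from a list of rating tuples."""
    totals = defaultdict(lambda: [0.0, 0])  # running [sum, count] per key
    for age_group, genre, rating in ratings:
        entry = totals[(age_group, genre)]
        entry[0] += rating
        entry[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def recommend_genres(age_group, ratings, top_n=1):
    """Rank genres for a new user, using only the age group they belong to."""
    means = mean_rating_by_group(ratings)
    scored = [
        (genre, mean)
        for (group, genre), mean in means.items()
        if group == age_group
    ]
    # Highest average rating first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return [genre for genre, _ in scored[:top_n]]


if __name__ == "__main__":
    print(recommend_genres("50-55", RATINGS))
    print(recommend_genres("18-24", RATINGS))
```

A new user needs no rating history here, only an age group. An age group that never appears in the data yields an empty list, which is one reason a real system would combine several user features rather than rely on one.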

He also found that young people give lower ratings than older people regardless of genre, which means that when it comes to movies, the former are “pickier” than the latter. However, this finding requires a more rigorous process of hypothesis and deduction, as well as more evidence.

According to Chen, although traditional linear models are more commonly used, LMM is more accurate: “A linear model involves only one influential factor, but LMM enabled me to take occupation, age and other factors into consideration, while age remained the most important one.”
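
For readers unfamiliar with the term, a linear mixed model for rating data is often written in a form like the one below. This is a generic textbook form offered as an illustration; the article does not give the exact specification used in the paper, so the choice of fixed and random effects here is an assumption.

```latex
y_{ij} = \mu + \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_i + v_j + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
v_j \sim \mathcal{N}(0, \sigma_v^2),\quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the rating user $i$ gives movie $j$, $\mu$ is the overall mean, $\mathbf{x}_i$ encodes user features such as age group and occupation, and $\boldsymbol{\beta}$ holds their fixed effects. The random effects $u_i$ and $v_j$ (assumed here for the user and the movie) absorb variation that the listed features do not explain.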

“The solution still has some shortcomings, because the recommendations are not targeted enough. For example, the system can only recommend comedies to older viewers, or certain types of movies to a group of people with similar jobs.

“Machine learning and deep learning algorithms in the field of artificial intelligence are necessary to generate more precise recommendations.”

The paper was based on Zhiyi Chen’s undergraduate final project titled “Censorious Young: Knowledge Discovery from High-throughput Movie Rating Data with LME4”, which was accepted by the 2019 International Conference on Big Data Analytics (ICBDA).

The final version published in IEEE Access was improved with assistance from Chen’s supervisor, Dr Shengxin Zhu from the Department of Mathematical Sciences, as well as Dr Qiang Niu and Tianyu Zuo.

By Xinyuan Yuan and Qiuchen Hu, translated by Xiangyin Han, edited by Will Venn
