Design of Postgraduate Entrance Examination Recommendation Platform Based on Big Data and Ensemble Learning
DOI: https://doi.org/10.62517/jike.202604128
Author(s)
Liuyang Zhao1, Zanpu Wang1, Qingfeng Zhou2,*
Affiliation(s)
1College of Artificial Intelligence and Big Data, Henan University of Technology, Zhengzhou, Henan, China
2iFLYTEK Co., Ltd., Hefei, Anhui, China
*Corresponding Author
Abstract
To address information asymmetry and intense competition in the postgraduate entrance examination, this paper designs and implements an intelligent analysis and recommendation platform based on big data and machine learning. The system uses the Scrapy framework to collect multi-source heterogeneous data from graduate admission websites and university portals, and builds a four-layer data warehouse on Hadoop and Hive for data cleaning and offline computing. An XGBoost model is trained to predict re-test scores, and a large language model (LLM) is integrated to build an intelligent preparation agent that provides personalized study plans. Experimental results show that the platform delivers multi-dimensional visualization via ECharts and that the prediction model reaches an R-squared of 0.7808, helping candidates make data-informed school choices and prepare efficiently.
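For readers unfamiliar with the metric, the R-squared reported in the abstract is the coefficient of determination, defined as 1 minus the ratio of the residual sum of squares to the total sum of squares. The following is a minimal sketch of that definition only; the function name `r_squared` and the sample scores are hypothetical and invented for illustration, and do not reproduce the paper's data or evaluation code.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all true values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical scores, invented for illustration only (not from the paper).
actual = [350.0, 362.0, 371.0, 385.0, 390.0]
predicted = [355.0, 360.0, 368.0, 380.0, 395.0]
score = r_squared(actual, predicted)
print(f"R-squared on the toy data: {score:.4f}")
```

Under this definition, a value of 1 means the predictions match the true scores exactly, and a value of 0 means the model does no better than always predicting the mean score.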
Keywords
XGBoost; Python; Data Visualization; Recommendation System; Big Data Analytics