A team of students from the Department of Mathematics, Faculty of Science and Technology, Gadjah Mada University, consisting of Dimaz Andhika Putra (Statistics, 2022) and Rahma Nur Annisa (Statistics, 2022), secured second place in the Data Mining Competition, part of the Academic Competition of Data Science series with the theme “Data Science for Environment and Sustainability.” The competition encourages participants to innovate with data science for environmental sustainability. Dimaz and Rahma presented a machine learning-based analysis of air quality aimed at air pollution mitigation, a timely topic given the serious threat air pollution poses to global health.
Their journey began in the preliminary round, which involved a complete data mining process: preprocessing, exploratory data analysis, modeling, and evaluation. They chose to focus on air quality data, a topic that is both relevant and urgent. Air pollution remains a serious challenge in many parts of the world; according to WHO data, around 7 million people die each year from exposure to it. By choosing this topic, Dimaz and Rahma hope to make a real contribution to air pollution mitigation while raising awareness of the importance of air quality monitoring.
Dimaz said that the biggest challenge in the early stages was formulating ideas and designing the analysis with limited data available. They revised their ideas several times before deciding to address air pollution and develop a system called Aerosafe. The system uses daily Air Quality Index (AQI) data to generate predictions and early warnings of rising air pollution. With their forecasting and classification methods, they captured the judges’ attention and advanced to the final round.
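To illustrate how forecasting and classification on daily AQI data can produce an early warning, here is a minimal Python sketch. It is not the Aerosafe implementation, whose models are not described here. Simple exponential smoothing stands in for the forecasting step, the category breakpoints follow the US EPA AQI scale, and the function names, the smoothing factor `alpha`, and the warning threshold of 100 are all assumptions chosen for illustration.

```python
from typing import Dict, List, Union

# US EPA-style AQI category upper bounds. This is an assumption: the article
# does not say which AQI scale or breakpoints Aerosafe uses.
AQI_CATEGORIES = [
    (50, "Good"),
    (100, "Moderate"),
    (150, "Unhealthy for Sensitive Groups"),
    (200, "Unhealthy"),
    (300, "Very Unhealthy"),
]


def classify_aqi(aqi: float) -> str:
    """Map an AQI value to a category label (hypothetical helper)."""
    for upper_bound, label in AQI_CATEGORIES:
        if aqi <= upper_bound:
            return label
    return "Hazardous"


def forecast_next_day(daily_aqi: List[float], alpha: float = 0.5) -> float:
    """One-step-ahead forecast using simple exponential smoothing.

    A stand-in for whatever forecasting model the team actually used.
    """
    if not daily_aqi:
        raise ValueError("daily_aqi must contain at least one value")
    level = daily_aqi[0]
    for value in daily_aqi[1:]:
        # Blend the newest observation with the running smoothed level.
        level = alpha * value + (1 - alpha) * level
    return level


def early_warning(
    daily_aqi: List[float], threshold: float = 100.0, alpha: float = 0.5
) -> Dict[str, Union[float, str, bool]]:
    """Forecast tomorrow's AQI, classify it, and flag it if above threshold."""
    forecast = forecast_next_day(daily_aqi, alpha)
    return {
        "forecast": forecast,
        "category": classify_aqi(forecast),
        "warning": forecast > threshold,
    }
```

Calling `early_warning` on a list of recent daily AQI readings returns the forecast value, its category, and a boolean warning flag. A real system would likely use a richer model and locally appropriate breakpoints.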
The final round, held in person in Surabaya, added a new challenge. Dimaz and Rahma had to go through two stages: working on a case study for 9 hours, followed by a presentation session in front of the judges. The case study revisited the issue of air pollution with different data. The work was carried out on the Kaggle platform, where the finalists competed to improve their models’ accuracy and reach the highest ranking on the leaderboard. Rahma expressed her surprise and happiness at encountering the same issue again, as it allowed her to explore the problem further. Dimaz and Rahma made repeated attempts to achieve optimal model accuracy. The shifting leaderboard positions were a challenge in themselves, and they had to work hard to maintain a strong position amid tight competition with the other finalists.
The final round continued with a presentation session in which they presented their case study to a panel of judges with backgrounds in statistics and data science. At this stage, they outlined the steps they took to achieve optimal results. With a systematic presentation and the ability to answer the judges’ questions well, Dimaz and Rahma secured second place in the competition.
The Aerosafe system contributes to the achievement of SDG 3, “Good Health and Well-being,” by providing air quality prediction tools, allowing the community to take preventive actions against the health impacts of air pollution. Additionally, Aerosafe supports SDG 13, “Climate Action,” by enhancing mitigation capabilities against air pollution, which contributes to the reduction of pollutant emissions and environmental quality monitoring.
This success underscores the importance of data analysis in addressing environmental issues. Through this competition, Dimaz and Rahma realized that a sound data analysis approach is key to finding valuable insights. They hope that a system like Aerosafe can serve as a concrete example of a contribution that can be developed further, paving the way for innovations that support environmental sustainability in the future.
Keywords: Data Analysis, Forecasting, Student Achievement
Author: Dimaz Andhika Putra
Editor: Endang Sulastri
Photo: Dimaz Andhika Putra