News Release

Fresher AIr: AI and mobility data may improve air pollution exposure models

Peer-Reviewed Publication

Penn State

Beaver Stadium June 2023

image: Beaver Stadium in June 2023 when smoke from Canadian wildfires blanketed skies across the northeastern United States. A Penn State-led research team used data from low-cost sensors, artificial intelligence and mobility data to improve models that assess human exposure to fine particulate matter (PM 2.5), tiny particles in smoke and other forms of air pollution that can pose health dangers. Public health officials can use the models to develop strategies to reduce exposure to unhealthy air quality, according to the researchers.

Credit: Patrick Mansell / Penn State

UNIVERSITY PARK, Pa. — Americans in the Northeast paid greater attention to air quality alerts this summer as wildfire smoke filled the skies with an orange-tinted haze. Smoke and other sources of air pollution contain tiny particles called fine particulate matter (PM 2.5). Smaller than the width of a human hair, these particles pose health dangers when inhaled, especially to people with pre-existing heart and lung conditions. To assess exposure to PM 2.5 and help public health officials develop strategies to reduce it, a Penn State-led research team designed improved models using artificial intelligence and mobility data.

“Our research shows that incorporating artificial intelligence and mobility data into air quality models can improve the models and help decision makers and public health officials prioritize areas that need extra monitoring or safety alerts because of unhealthy air quality or a combination of unhealthy air quality and high pedestrian traffic,” said Manzhu Yu, assistant professor of geography at Penn State and first author of the study.

In the study, reported in the journal Frontiers in Environmental Science, the researchers examined PM 2.5 measurements across eight large metropolitan areas in the continental United States. Air quality data came from Environmental Protection Agency (EPA) monitoring stations and from low-cost sensors, which are usually purchased and distributed by local community organizations. The team used the combined data to calculate hourly PM 2.5 averages in each region.
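For readers who want to see what an hourly average per region involves, the following is a minimal sketch, not the researchers' code. The region names, timestamps and readings are invented for illustration, and the function name hourly_averages is hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical readings: (region, ISO timestamp, PM 2.5 in micrograms per cubic meter).
# In practice these rows would come from both EPA monitors and low-cost sensors.
readings = [
    ("Region A", "2023-06-07T14:05:00", 40.0),
    ("Region A", "2023-06-07T14:35:00", 60.0),
    ("Region A", "2023-06-07T15:10:00", 30.0),
    ("Region B", "2023-06-07T14:20:00", 12.0),
]


def hourly_averages(rows):
    """Group readings by (region, hour) and return the mean PM 2.5 of each group."""
    buckets = defaultdict(list)
    for region, timestamp, pm25 in rows:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(minute=0, second=0, microsecond=0)
        buckets[(region, hour)].append(pm25)
    return {key: mean(values) for key, values in buckets.items()}


averages = hourly_averages(readings)
```

Each key in averages pairs a region with the start of an hour, so the two Region A readings taken between 2 p.m. and 3 p.m. collapse into a single value.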

The scientists input the air quality data into a land use regression model. The model uses local geographical factors like satellite-measured aerosol levels, also called aerosol optical depth; distance to nearest road or stream; elevation; vegetation; and meteorological conditions such as humidity and wind speed to examine how the factors affect air quality. Past models have taken a linear approach to assessing air pollution, meaning that they assigned a fixed importance to each geographic factor and its impact on air quality, Yu explained. Certain factors like vegetation and meteorological conditions, however, cannot be represented this way because they change hourly or seasonally and may have complex interactions with other factors that affect air quality.

Yu and her colleagues took a nonlinear approach to better account for these changing or complex factors by incorporating automated machine learning — a type of artificial intelligence that automatically performs time-consuming tasks such as data preparation, parameter selection, and model selection and deployment — into the land use regression model. The automated machine learning approach used an ensemble method, which allows the machine to run and combine multiple models, to identify the best-performing model for each region. The researchers also examined anonymized cell phone mobility data to pinpoint areas with unhealthy air quality and high visitor numbers.
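The idea of running several candidate models and keeping the best performer in each region can be illustrated with a toy sketch. This is not the automated machine learning system used in the study: the three candidate models, their coefficients and the held-out samples below are all invented, and the selection rule (lowest root mean square error on held-out data) is an assumption made for the example.

```python
from math import sqrt


def linear_model(aod, humidity):
    """Fixed-weight linear model: each factor has a constant importance."""
    return 5.0 + 40.0 * aod + 0.05 * humidity


def interaction_model(aod, humidity):
    """Nonlinear model: the effect of aerosol optical depth grows with humidity."""
    return 5.0 + 40.0 * aod * (1.0 + humidity / 100.0)


def averaged_ensemble(aod, humidity):
    """Simple ensemble that averages the two models above."""
    return (linear_model(aod, humidity) + interaction_model(aod, humidity)) / 2.0


CANDIDATES = {
    "linear": linear_model,
    "interaction": interaction_model,
    "ensemble": averaged_ensemble,
}


def rmse(model, samples):
    """Root mean square error of a model on (aod, humidity, observed PM 2.5) samples."""
    squared_errors = [(model(aod, hum) - observed) ** 2 for aod, hum, observed in samples]
    return sqrt(sum(squared_errors) / len(squared_errors))


def best_model_per_region(holdout):
    """Return the name of the lowest-error candidate for each region."""
    return {
        region: min(CANDIDATES, key=lambda name: rmse(CANDIDATES[name], samples))
        for region, samples in holdout.items()
    }


# Hypothetical held-out data: (aerosol optical depth, humidity in percent, observed PM 2.5).
holdout = {
    "Region A": [(0.2, 40.0, 15.0), (0.5, 80.0, 29.0)],
    "Region B": [(0.3, 50.0, 23.0), (0.6, 90.0, 50.6)],
}

best = best_model_per_region(holdout)
```

With this made-up data, the linear model fits Region A best while the interaction model fits Region B best, which mirrors the point that no single fixed-weight model suits every region.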

The researchers found that their automated machine learning method with integrated data from low-cost sensors and EPA monitoring stations improved the accuracy of air pollution exposure models by an average of 17.5%, offering greater spatial variation than using regulatory monitors alone. Yu credited the improved accuracy to the method’s ability to better account for the dynamic variables of aerosol optical depth and meteorological factors, which consistently proved to be the most important across all study regions. The mobility data component allowed the team to map potential hotspots within regions and times during the day and year when large numbers of people may be exposed to high PM 2.5 levels in these areas.
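A rough sketch of how air quality and visitor counts might be combined to flag hotspots follows. It is an illustration under stated assumptions, not the team's method: the location names, values and both thresholds are invented, and ranking by PM 2.5 multiplied by visitor count is one simple choice among many.

```python
# Assumed cutoffs for the example only; the study's own criteria are not given here.
UNHEALTHY_PM25 = 35.0  # micrograms per cubic meter
HIGH_VISITS = 500      # visitors per hour

# Hypothetical records: one row per location and hour of day.
records = [
    {"place": "factory district", "hour": 8, "pm25": 60.0, "visitors": 40},
    {"place": "transit hub", "hour": 8, "pm25": 55.0, "visitors": 1200},
    {"place": "city park", "hour": 8, "pm25": 12.0, "visitors": 900},
    {"place": "transit hub", "hour": 14, "pm25": 30.0, "visitors": 1500},
]


def find_hotspots(rows, pm_threshold, visitor_threshold):
    """Keep rows with both unhealthy air and high visitor counts, highest exposure first."""
    flagged = [
        row for row in rows
        if row["pm25"] >= pm_threshold and row["visitors"] >= visitor_threshold
    ]
    # Rank by a simple exposure score: concentration times number of people present.
    return sorted(flagged, key=lambda row: row["pm25"] * row["visitors"], reverse=True)


hotspots = find_hotspots(records, UNHEALTHY_PM25, HIGH_VISITS)
```

In this example only the transit hub at 8 a.m. is flagged. The factory district has polluted air but few visitors, and the park has many visitors but clean air, so neither would be prioritized.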

“Many areas may have consistently high air pollution levels, like those near factories and major transportation hubs, but that is not enough information to make a prioritized list of places needing extra monitoring or health alerts,” she said. “Our mobility-based exposure maps show public health officials and decision makers hotspots that have unhealthy air quality levels plus high visitor traffic. They can use this information to send alerts to people’s mobile phones when they enter an area with really high PM 2.5 levels to reduce their exposure to unhealthy air quality.”

Additional contributors to the research were Shiyan Zhang, doctoral candidate in geography, Penn State; Junjun Yin, assistant research professor in the Social Science Research Institute, Penn State; Jiheng Miao, who recently graduated with a bachelor’s degree in geography from Penn State; Kai Zhang, Empire Innovation Associate Professor in the School of Public Health, State University of New York, Albany; and Matthew Varela, an incoming Penn State graduate student who recently graduated with a bachelor’s degree in meteorology from the University of Oklahoma and participated in the study during Penn State’s summer 2022 Research Experiences for Undergraduates in Climate Science program.

Penn State, through the Miller Faculty Fellow Award from the College of Earth and Mineral Sciences, supported this research.

