Listy Biometryczne - Biometrical Letters Vol. 35(1998), No. 1, 37-66
This paper presents two statistical strategies for elucidating the dependence of ozone on primary pollutants (nitrogen oxides) and on meteorology. The aim is to forecast, at 5 am on the current day, the maximum ozone value occurring that afternoon, using 6 years (1990-95) of pollutant and meteorological data for Paris. The first method combines classical techniques: cluster analysis, analysis of variance, discriminant analysis and stepwise regression. We identify three distinct and homogeneous groups in the Paris area. Within these groups, daily curves of ozone pollution form clusters of decreasing levels; these clusters are well discriminated by previous ozone, primary pollutant and meteorological data. The second method is nonparametric and uses a kernel estimator of an autoregressive function with exogenous variables. It works by analogy with past climatic and pollution conditions: the forecast is a weighted sum of maxima observed in the past. We compare the two methods on 1996 data and propose some improvements to avoid forecast errors in particular cases.
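To make the idea of a forecast formed as a weighted sum of past maxima concrete, the following is a minimal sketch of a Nadaraya-Watson type kernel estimator. It is an illustration under stated assumptions and not the estimator used in the paper: the Gaussian kernel, the function name `kernel_forecast`, the choice of conditioning variables and the bandwidth values are all hypothetical, and the numbers in the usage example are made up.

```python
import numpy as np

def kernel_forecast(past_conditions, past_maxima, today, bandwidth):
    """Forecast today's ozone maximum as a kernel-weighted mean of past maxima.

    past_conditions: (n_days, n_features) array of morning conditions
                     (hypothetical features, e.g. previous ozone, NOx, temperature)
    past_maxima:     (n_days,) array of the afternoon ozone maxima of those days
    today:           (n_features,) array of this morning's conditions
    bandwidth:       scalar or (n_features,) array of smoothing parameters
    """
    X = np.asarray(past_conditions, dtype=float)
    y = np.asarray(past_maxima, dtype=float)
    x0 = np.asarray(today, dtype=float)

    # Squared scaled distance between today and every past day.
    d2 = np.sum(((X - x0) / bandwidth) ** 2, axis=1)

    # Gaussian kernel weights (assumed kernel). Subtracting the minimum
    # distance avoids underflow and does not change the normalized weights.
    w = np.exp(-0.5 * (d2 - d2.min()))
    w /= w.sum()

    # The forecast is the weighted sum of past maxima.
    return float(w @ y), w

# Made-up illustrative data: columns are previous-day ozone maximum,
# morning NOx level and forecast temperature.
past_conditions = np.array([
    [120.0, 40.0, 28.0],
    [ 60.0, 80.0, 17.0],
    [ 95.0, 55.0, 24.0],
    [140.0, 35.0, 31.0],
])
past_maxima = np.array([150.0, 70.0, 110.0, 180.0])
today = np.array([110.0, 45.0, 27.0])

forecast, weights = kernel_forecast(
    past_conditions, past_maxima, today, bandwidth=np.array([20.0, 15.0, 4.0])
)
print(round(forecast, 1), np.round(weights, 3))
```

Days whose morning conditions resemble today's receive the largest weights, which is the sense in which such a method works by analogy. In practice the bandwidth would have to be chosen from the data, for example by cross-validation; that step is not shown here.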
air pollution, ozone concentration, prediction, linear model, kernel nonparametric forecasting