“Detecting pollution clusters with a linear model differs slightly from the more standard models and therefore does not enjoy the same convergence properties.
This is why we studied a Monte Carlo method specific to spatial scans in order to evaluate the significance of the associated likelihood ratio test, allowing us to estimate the validity of the identified cluster.
The application focuses on pollution data: using a national-level PM10 survey, we identified the region with the highest concentration of this pollutant.
From epidemiology services seeking to explain geographical clusters of breast cancer, to quality control investigating clusters of defective products, by way of the detection of polluted areas, statistical techniques for spatial scanning are the subject of much research. At their core, these methods identify the geographical areas with the highest concentration of the target variable.
In our application, we sought to detect pollution clusters in France. To do this, we modelled our data with a generalized linear model. We then defined the significance of a cluster and, finally, calculated the associated p-value in order to identify the most significant cluster.
First, we introduced the model and highlighted the convergence issues of the likelihood ratio. We then presented solutions to this problem and proposed an application of spatial scans to pollution data.
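The scan-then-test procedure summarised above can be illustrated with a minimal sketch. It is not the implementation used in this work. It rests on several simplifying assumptions: a Gaussian model with a single mean shift inside the candidate zone and a common variance, circular zones built from nearest neighbours, and a null distribution obtained by permuting the observations across sites. The function names `candidate_zones`, `scan_statistic` and `monte_carlo_p_value` are hypothetical.

```python
import numpy as np


def candidate_zones(coords, max_size):
    """Circular zones: each site together with its k - 1 nearest neighbours."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)  # each row starts with the site itself
    return [order[i, :k] for i in range(len(coords)) for k in range(1, max_size + 1)]


def scan_statistic(y, zones):
    """Largest Gaussian log-likelihood ratio over zones (assumed mean-shift model)."""
    n = len(y)
    rss_null = np.sum((y - y.mean()) ** 2)  # one common mean everywhere
    best_llr, best_zone = 0.0, None
    for zone in zones:
        inside = np.zeros(n, dtype=bool)
        inside[zone] = True
        y_in, y_out = y[inside], y[~inside]
        if y_in.mean() <= y_out.mean():
            continue  # only clusters of high values are of interest
        # Separate means inside and outside the zone, common variance.
        rss_alt = np.sum((y_in - y_in.mean()) ** 2) + np.sum((y_out - y_out.mean()) ** 2)
        llr = 0.5 * n * np.log(rss_null / rss_alt)
        if llr > best_llr:
            best_llr, best_zone = llr, zone
    return best_llr, best_zone


def monte_carlo_p_value(y, zones, n_sim=999, seed=0):
    """Monte Carlo p-value of the scan statistic under random relabelling of sites."""
    rng = np.random.default_rng(seed)
    observed, zone = scan_statistic(y, zones)
    # Assumption: permuting values across sites stands in for simulating the null.
    simulated = np.array(
        [scan_statistic(rng.permutation(y), zones)[0] for _ in range(n_sim)]
    )
    p_value = (1 + np.sum(simulated >= observed)) / (n_sim + 1)
    return observed, zone, p_value
```

Because the maximum is taken over many overlapping zones, the usual asymptotic distribution of a likelihood ratio does not apply to the scan statistic, so the sketch compares the observed maximum with maxima recomputed on simulated data sets. `max_size` must stay below the number of sites so that every zone has a non-empty complement.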
Keywords: Spatial Scans, GLM, Maximum Likelihood Estimator, Likelihood Ratio Test, Monte-Carlo”