Shai Moshenberg, M.Sc
Atmospheric data collection is a fundamental element in all atmospheric studies. Data are collected mainly through field sensor networks that monitor pollution and other atmospheric variables. A common problem that hampers the use of these data is time series with missing samples, caused by budget limits or technical issues. The absence of information can critically affect the results and our understanding of the environment. Since coping with this problem through additional hardware and redundant measurements is neither economically nor practically feasible, some data will always be missing, and algorithm-based methods are therefore needed.
Generally, there are two main algorithmic approaches to this problem: single-variable and multi-variable methods. Single-variable approaches, i.e. imputation using a single variable such as NO2, O3 or CO, are simpler and more widely implemented, but their quality drops quickly as the proportion of corrupted samples rises. Multi-variable methods, in which the imputation is calculated from data of more than one variable, cope better with higher corruption rates, to some degree, but are much harder to implement and are computationally intensive.
This research aims to devise a spectral-based method for filling missing data in air pollution time series, i.e. one that uses the many repeating cycles and patterns in the collected data to estimate the values of the missing data points.
This method is expected to work well on air quality sequences because temporal patterns at different scales are evidently present in them, as a result of their physical nature. It is therefore expected to yield good results even at relatively high corruption rates, especially for long-term chronic exposure, where the long-term average exposure level is critical.
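The idea of estimating missing samples from repeating cycles can be illustrated with a minimal sketch. It is not the method developed in this research, whose details are not given here. It substitutes a simple harmonic regression: sinusoids at assumed daily (24 h) and weekly (168 h) periods are fitted by least squares to the observed samples only, and the fitted model is evaluated at the gaps. The function name `fill_gaps_harmonic`, the chosen periods, the number of harmonics and the synthetic test series are all assumptions made for illustration.

```python
import numpy as np

def fill_gaps_harmonic(t_hours, values, periods_hours=(24.0, 168.0), n_harmonics=2):
    """Fill NaN samples by least-squares fitting sinusoids at assumed periods.

    Hypothetical helper for illustration; periods and harmonics are assumptions.
    """
    t = np.asarray(t_hours, dtype=float)
    y = np.asarray(values, dtype=float)
    known = ~np.isnan(y)

    # Design matrix: a constant term plus a cosine/sine pair per harmonic.
    columns = [np.ones_like(t)]
    for period in periods_hours:
        for k in range(1, n_harmonics + 1):
            omega = 2.0 * np.pi * k / period
            columns.append(np.cos(omega * t))
            columns.append(np.sin(omega * t))
    design = np.column_stack(columns)

    # Fit on observed samples only, then evaluate the model at the gaps.
    coeffs, *_ = np.linalg.lstsq(design[known], y[known], rcond=None)
    filled = y.copy()
    filled[~known] = design[~known] @ coeffs
    return filled

# Synthetic, noise-free hourly series (four weeks) with daily and weekly cycles.
rng = np.random.default_rng(0)
t = np.arange(24 * 28)
truth = 40 + 10 * np.sin(2 * np.pi * t / 24) + 5 * np.cos(2 * np.pi * t / 168)

# Remove roughly 40% of the samples at random to mimic a corrupted record.
missing = rng.random(t.size) < 0.4
observed = truth.copy()
observed[missing] = np.nan

filled = fill_gaps_harmonic(t, observed)
max_error = np.max(np.abs(filled - truth))
```

On this idealized series the cycles are exactly periodic and noise-free, so the gaps are recovered almost exactly. Real air quality records contain noise, trends and irregular events, so the error there would be larger and would depend on how well the assumed periods match the data.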