Models that predict the impact of environmental factors on the incidence of disease are important to develop. In this paper, Gabda et al. formulate and test a number of linear-regression models that relate atmospheric metrics to the number of hospitalizations for asthma in a region of Malaysia. The independent variables are measured concentrations of carbon monoxide, ozone, sulfur dioxide, nitrogen dioxide, and particulates, as well as temperature, humidity, and rainfall.
Their approach is to construct a family of linear-regression models, with the dependent variable transformed using a logistic expression; therefore, despite its title, this work develops a logistic-regression model for asthma. Using eight different expressions for least-squares error, with penalties for large model sizes, the optimal model is selected as one having up to fifth-order cross products of the input terms.
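The selection procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names are hypothetical, the response is assumed to be a rate strictly between 0 and 1 so that a logit transform applies, and AIC and BIC stand in as two familiar penalized least-squares criteria (the paper's eight expressions are not reproduced here). Cross products are taken over distinct inputs only.

```python
import itertools
import numpy as np


def cross_product_design(X, max_order):
    """Intercept, each input, and products of distinct inputs up to max_order."""
    n, p = X.shape
    cols = [np.ones(n)]
    for order in range(1, max_order + 1):
        for idx in itertools.combinations(range(p), order):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)


def logit(y, eps=1e-6):
    """Logit transform; assumes y is a rate in (0, 1) (an assumption of this sketch)."""
    y = np.clip(y, eps, 1 - eps)
    return np.log(y / (1 - y))


def penalized_criteria(D, z):
    """Least-squares fit of z on design D, with AIC and BIC size penalties."""
    n, k = D.shape
    beta, *_ = np.linalg.lstsq(D, z, rcond=None)
    sse = float(np.sum((z - D @ beta) ** 2))
    return {
        "k": k,
        "sse": sse,
        "aic": n * np.log(sse / n) + 2 * k,
        "bic": n * np.log(sse / n) + k * np.log(n),
    }


def select_order(X, y, max_order):
    """Score models of increasing interaction order; pick the lowest BIC."""
    z = logit(y)
    scores = {
        m: penalized_criteria(cross_product_design(X, m), z)
        for m in range(1, max_order + 1)
    }
    best = min(scores, key=lambda m: scores[m]["bic"])
    return best, scores
```

Because the candidate models are nested, the raw squared error can only fall as the order rises; the penalty term is what keeps the largest model from winning by default. With eight inputs and fifth-order cross products, the number of candidate terms grows quickly, which is one reason the size of the data set matters for judging the selected model.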
The paper would benefit from more complete information on the data used (such as the study period and the sizes of the data sets) and on the results (such as r² values and tests on other data sets). Some of the reported findings are nonintuitive and raise questions about the overall results, notably the negative correlation between the concentration of particulates and the incidence of asthma.
In conclusion, this work should be regarded more as a demonstration of simple regression modeling applied to an interesting data set than as a result with significance from a health-care perspective.