Statistically grounded research into light pollution and its effects on cancer rates. Objective: establish whether existing data resources demonstrate the link on a national scale.
resources
light pollution
This provides estimates of light pollution as it affects our view of the night sky:
- Light Pollution satellite studies -- high-resolution world atlas of light pollution! Problematically, the data is quantized into a relatively small number of levels, whereas the descriptions of acquisition suggest quite the opposite. Either providing non-stepped data was difficult, or releasing it would be bad for the group's research.
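To put a number on the quantization noted above, the distinct pixel values in a tile of the atlas can be counted. The following is a minimal sketch that assumes the image has already been decoded into a list of rows of integers; the function name `quantization_summary` and the sample tile are invented for illustration and do not come from the atlas.

```python
from collections import Counter

def quantization_summary(pixels):
    """Summarize the distinct intensity levels in a 2-D grid of pixel values."""
    counts = Counter(value for row in pixels for value in row)
    levels = sorted(counts)
    # Distinct gaps between consecutive levels; a single gap size is
    # consistent with the data having been binned into uniform steps.
    steps = sorted({b - a for a, b in zip(levels, levels[1:])})
    return {
        "n_levels": len(levels),
        "levels": levels,
        "steps": steps,
        "counts": dict(counts),
    }

# Hypothetical 3x4 tile; the values are made up, not real atlas data.
tile = [
    [0, 0, 16, 16],
    [16, 32, 32, 48],
    [48, 48, 48, 0],
]

summary = quantization_summary(tile)
print(summary["n_levels"], summary["steps"])
```

A small number of levels with a single step size across a large tile would support the suspicion that the published images are binned versions of finer measurements.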
This provides satellite data of upward-directed light pollution, that is, light as it is emitted at the ground. The data is much sharper than the above:
- Earth Observation Group -- provides downloadable global composite images. I am examining these for saturation features to see how much data can be extracted and at what resolution. This is the dataset used in the famous Israeli study, and should be sufficient for the purposes of a US-based or global study. This data appears to represent "upward release" rather than the diffusion model provided by the world atlas of light pollution.
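One way to examine saturation is to measure how many pixels sit at the sensor's maximum value, since those pixels carry no information about relative brightness. This is a rough sketch only: the saturation value of 63 is an assumption (it would correspond to a 6-bit product) and should be checked against the dataset's documentation, and the function name and sample tile are hypothetical.

```python
def saturation_report(pixels, saturation_value):
    """Report the share of pixels at or above the saturation value."""
    total = 0
    lit = 0
    saturated = 0
    for row in pixels:
        for value in row:
            total += 1
            if value > 0:
                lit += 1
            if value >= saturation_value:
                saturated += 1
    return {
        "saturated_fraction": saturated / total if total else 0.0,
        # Share among lit pixels only, since dark areas dominate most tiles.
        "saturated_share_of_lit": saturated / lit if lit else 0.0,
    }

# ASSUMPTION: 63 is the maximum digital number; verify for the real data.
ASSUMED_SATURATION = 63

# Hypothetical tile around an urban core; values are invented.
urban_tile = [
    [0, 5, 63, 63],
    [0, 12, 63, 40],
    [0, 0, 7, 21],
]

report = saturation_report(urban_tile, ASSUMED_SATURATION)
print(report)
```

If a large share of the lit pixels in urban counties is saturated, county averages will understate the brightest areas, which matters for any dose-response analysis.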
Backing:
cancer rates and registries
known causes of cancer (control variables)
- ESA Earth Observation PI Portal -- Medium- to low-resolution satellite data of many measurable features of the earth's geology, climate, atmosphere, oceans, and risks. A potential source of global data on atmospheric pollution. I am currently preparing a registration summary of this work to gain access to the free versions of this data. An example of what is possible can be found at http://www.esa.int/esaEO/SEM340NKPZD_index_0.html. This could be used for global evaluation of air pollutants: a pollutant that is produced alongside many others by the same pollution-generating processes is a good proxy for overall pollution levels.
- Occupational Respiratory Disease Surveillance -- Potentially important records of occupational risk due to work exposure to harmful chemicals, not limited to cancer.
- Scorecard -- database of pollution risk indexes by county in the US. Sources are described at http://www.scorecard.org/about/txt/data.html but the methodology is extremely complex. It is based on the TEP Caltox model http://www.scorecard.org/env-releases/def/tep_caltox.html. To be usable as a control variable, the county-level cancer risk estimates provided by this site must not have been derived from observed rates of cancer. They must depend only on estimates of toxic release, systematic air and water pollution rates, and established models of toxicity that are themselves uncorrelated with the specific localities under analysis. If this can be shown, then perhaps the data can be scraped and used as a pollution index in this study. At the very least it can serve as a set of data for early testing; when more control is required, the data can be drawn from the acknowledged sources and reconstructed so that it rests solely on toxicology estimates derived from biochemistry and specific study of the pollutants in question.
- EPA air quality -- air quality data.
- smoking statistics -- information about smoking rates around the world.
normalizers
geospatial data
papers
Risk of breast cancer in female flight attendants: a population-based study (Iceland)
Spatial analysis of lung, colorectal, and breast cancer on Cape Cod: An application of generalized additive models to case-control data
Using kernel density function as an urban analysis tool: Investigating the association between nightlight exposure and the incidence of breast cancer in Haifa, Israel
I have found a similar paper, Light at Night Co-Distributes with Incident Breast but not Lung Cancer in the Female Population of Israel, by the same authors, which may be the paper that many papers on nighttime light exposure actually cite. The other paper seems to be a methods paper.
A Comparison of Nighttime Satellite Imagery and Population Density
books
The Causes and Prevention of Cancer: The Role of Environment -- I can't get access to this; perhaps at the MIT library? I have found a decent summary of the book, but it includes no citations: http://understandingscience.ucc.ie/naturalworld/Cancer_and_chemicals_in_environment.pdf.
data flow
Pick a region (California, for instance).
- Integrate light intensity across counties or political subunits in that region. This requires registering the TIFF images of the world at night with public GIS data representing the state, then averaging the observations falling in each county. A better approach might be to take even higher-resolution population distribution data and then generate a factor representing the relationship between population distribution and light intensity in the county.
- Integrate cancer incidence per population across counties in that region.
- Obtain environmental cancer risk for the region.
- Incorporate other potentially useful control factors, such as potential proxies for hidden variables (indicators for health and welfare, education, income).
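The steps above can be sketched end to end on toy data. This is an illustration under stated assumptions, not the study's method: it assumes the light raster has already been registered against a county-label raster of the same shape (the registration itself is not shown), all function names and numbers are invented, and a partial correlation with a single control variable stands in for whatever statistical model is eventually chosen.

```python
from collections import defaultdict
from math import sqrt

def zonal_mean(light, county_ids):
    """Average light intensity over the pixels labelled with each county."""
    sums, counts = defaultdict(float), defaultdict(int)
    for light_row, id_row in zip(light, county_ids):
        for value, county in zip(light_row, id_row):
            if county is None:  # pixel falls outside the region
                continue
            sums[county] += value
            counts[county] += 1
    return {county: sums[county] / counts[county] for county in sums}

def incidence_per_100k(cases, population):
    """Cancer cases per 100,000 residents for each county."""
    return {c: 1e5 * cases[c] / population[c] for c in cases}

def residuals(y, x):
    """Residuals of the least-squares line y ~ a + b*x."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return [yi - mean_y - slope * (xi - mean_x) for xi, yi in zip(x, y)]

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    var_a = sum((ai - mean_a) ** 2 for ai in a)
    var_b = sum((bi - mean_b) ** 2 for bi in b)
    return cov / sqrt(var_a * var_b)

def partial_correlation(x, y, control):
    """Correlation of x and y after removing the linear effect of control."""
    return pearson(residuals(x, control), residuals(y, control))

# Hypothetical registered rasters: light values and county labels per pixel.
light = [[10, 20, 0, 0],
         [30, 40, 5, 5]]
county_ids = [["A", "A", "B", "B"],
              ["A", "A", "B", None]]
mean_light = zonal_mean(light, county_ids)

# Hypothetical registry counts and populations.
rates = incidence_per_100k({"A": 50, "B": 12}, {"A": 200000, "B": 60000})

# Synthetic five-county table in which the rate is built as
# 2 * light + 3 * pollution, so the partial correlation is 1 by construction.
light_by_county = [1, 2, 3, 4, 5]
pollution_index = [2, 1, 4, 3, 5]
cancer_rate = [2 * l + 3 * p for l, p in zip(light_by_county, pollution_index)]
pc = partial_correlation(light_by_county, cancer_rate, pollution_index)
print(mean_light, rates, round(pc, 6))
```

With real data, more than one control (environmental risk, income, education, smoking) would be needed, which calls for a multiple regression rather than the single-control residual approach shown here.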