Date of Award




Document Type


Degree Name

Doctor of Philosophy (PhD)


Department

Department of Environmental Health Sciences

Content Description

1 online resource (viii, 120 pages) : illustrations (some color)

Dissertation/Thesis Chair

David Carpenter

Committee Members

Erin Bell, JoEllen Welsh, Beth Feingold, Lawrence Lessner


Keywords

Cancer research, COVID-19, Epidemiology, Lymphoma, Oncology, Pancreatic cancer, Cancer, COVID-19 (Disease), Airborne infection

Subject Categories

Epidemiology | Oncology


Abstract

When trying to understand risk factors for a disease, even before the causal agents are known, it is necessary to create a surveillance data set that answers the questions of who, when, and where, and that includes any potential covariates that may either promote or prevent the disease. Surveillance data come in a wide variety of forms. For example, hospital discharge data, such as the New York Statewide Planning and Research Cooperative System (SPARCS) data, encompass all hospitalized cases in the state, while clinical datasets cover a specially constructed population for research on a particular disease type. Analysis of surveillance data must begin with the selection of an appropriate study design and analytical approach. An ideal data source would include abundant information collected directly from individuals. The goal of this dissertation is to demonstrate the strengths and limitations of ecological, cross-sectional, and systematic analyses, compared quantitatively with ideal methodological approaches, given practical constraints. To that end, three independent studies were conducted, each requiring a different analytical approach chosen in light of its research question and type of data source. The health outcome data sources used in the three projects were intentionally very different from one another in data granularity and exposure type. The granularity ranges from administrative units combined by similarity of exposure (groups of ZIP codes), to geographic data aggregated by small administrative unit, to individual patient-level exposure data in the context of medical treatment. The scientific evidence obtained through these independent studies is important for helping to direct needed medical resources to populations with unmet needs for disease prevention and treatment.