Poster Presentation 1
11:00 AM to 1:00 PM
- Presenter: Townson Cocke, Sophomore, Pre-Sciences
- Mentor: David Marcano, Statistics
- Session: Poster Session 1, MGH 241, Easel #74, 11:00 AM to 1:00 PM
We analyze the International Study of Asthma and Allergies in Childhood (ISAAC) data for Seattle and apply statistical clustering methods to identify asthma phenotypes. This study focuses on Phase One of ISAAC, conducted between 1994 and 1995, in which approximately 3,000 adolescent asthma patients and their parents filled out detailed questionnaires about asthma, rhinitis, and eczema symptoms. Asthma is a heterogeneous disease, meaning its clinical presentation is highly variable. To identify distinct phenotypes, we apply a hierarchical bottom-up clustering method to categorical variables from the questionnaire and verify cluster stability using several linkage methods. The clusters we obtained were distinguished primarily by differences in the severity of respiratory symptoms and the presence of eczema symptoms. After assessing the accuracy of several alternative clustering methods, we conclude by comparing the clusters identified by this analysis to clinically recognized asthma phenotypes. Accurate characterization of asthma phenotypes is important for informing management and treatment strategies for urban adolescents with asthma. Techniques that identify severe phenotypes in population data sets can help target treatment to those who may benefit from high-intensity treatment regimens, careful attention to potential exposures to environmental allergens, and specialist-level care.
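The clustering step described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the toy answers, the choice of Hamming distance for categorical variables, and the names `hamming`, `agglomerative`, and `LINKAGES` are all assumptions made for illustration. It shows bottom-up merging under three linkage rules and a simple stability check that compares the resulting partitions.

```python
from itertools import combinations

def hamming(a, b):
    """Fraction of categorical answers on which two respondents differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

# Linkage rule: how to summarize pairwise distances between two clusters.
LINKAGES = {
    "single": min,
    "complete": max,
    "average": lambda ds: sum(ds) / len(ds),
}

def agglomerative(rows, n_clusters, linkage="average"):
    """Bottom-up clustering: start from singletons, merge the closest pair."""
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = combine([hamming(rows[i], rows[j])
                         for i in clusters[a] for j in clusters[b]])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)

# Hypothetical toy answers: (wheeze, night cough, eczema). Not ISAAC data.
rows = [
    ("yes", "yes", "no"),
    ("yes", "yes", "no"),
    ("yes", "no", "no"),
    ("no", "no", "yes"),
    ("no", "no", "yes"),
    ("no", "yes", "yes"),
]

# Stability check: does every linkage rule recover the same partition?
results = {name: agglomerative(rows, 2, name) for name in LINKAGES}
stable = len({str(part) for part in results.values()}) == 1
print(results, stable)
```

On real questionnaire data, a library implementation would be preferable to this quadratic-per-merge loop, and partitions from different linkages would typically be compared with an agreement index rather than exact equality.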
Oral Presentation 1
1:30 PM to 3:00 PM
- Presenter: Joia W (Joia) Zhang, Senior, Statistics: Data Science
- Mentor: Sat Gupta, Mathematics, Statistics, UNC Greensboro
- Session: Session O-1G: Modeling Diverse Datasets at Every Scale, MGH 251, 1:30 PM to 3:00 PM
In surveys containing sensitive questions, Social Desirability Bias (SDB), people’s tendency to provide socially acceptable answers rather than truthful ones, often leads to low response rates or, worse, untruthful responses. Randomized Response Techniques (RRT) combat SDB by allowing respondents to provide scrambled responses. However, if a respondent does not trust the RRT model, data accuracy will be compromised. Lack of trust in binary RRT models has been shown to lead to untruthfulness, and thus to unreliable data and unreliable estimates of the sensitive trait. Yet no quantitative RRT model currently accounts for respondents’ lack of trust. We propose an Optional Enhanced Trust (OET) RRT model that extends Warner’s Additive Model, a quantitative RRT model with additive noise, by allowing additional multiplicative noise if the respondent does not trust the Warner Additive Model. Using the Unified Measure, a combined metric of respondent privacy and model efficiency, we demonstrate both theoretically and empirically that the proposed OET model is superior to existing models in terms of respondent privacy and estimator accuracy.
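The idea of scrambling a quantitative response can be illustrated with a small simulation. This is a sketch under stated assumptions, not the OET model itself, whose exact form the abstract does not give. The additive step reports Z = Y + S, where S is noise with a known mean. The optional multiplicative step here is a hypothetical form in which a distrustful respondent reports Z = T·Y + S, with T independent of Y and E[T] = 1, so that subtracting E[S] still gives an unbiased estimate of the mean. All function names, parameter values, and the 70% trust rate are illustrative, and the Unified Measure is not computed.

```python
import random
import statistics

NOISE_MEAN = 10.0  # known mean of the additive noise S (assumed value)

def additive_response(y, rng, noise_sd=2.0):
    """Additive scrambling: report the true value plus noise S."""
    return y + rng.gauss(NOISE_MEAN, noise_sd)

def optional_response(y, trusts, rng, noise_sd=2.0, mult_sd=0.2):
    """Hypothetical optional step: a distrustful respondent also multiplies
    the true value by noise T with mean 1 before adding S."""
    t = 1.0 if trusts else rng.gauss(1.0, mult_sd)
    return t * y + rng.gauss(NOISE_MEAN, noise_sd)

def estimate_mean(responses):
    """E[Z] = E[Y] + E[S] when E[T] = 1, so subtract the known E[S]."""
    return statistics.fmean(responses) - NOISE_MEAN

rng = random.Random(0)
# Simulated sensitive values; in a real survey these are never observed.
true_values = [rng.gauss(50.0, 5.0) for _ in range(20000)]
# Assume 70% of respondents trust the additive model (illustrative only).
trust_flags = [rng.random() < 0.7 for _ in true_values]

additive_only = [additive_response(y, rng) for y in true_values]
with_option = [optional_response(y, t, rng)
               for y, t in zip(true_values, trust_flags)]

true_mean = statistics.fmean(true_values)
print(true_mean, estimate_mean(additive_only), estimate_mean(with_option))
```

Under these assumptions both estimators target the same mean; the extra multiplicative noise increases the variance of individual responses, which is the kind of privacy-versus-efficiency trade-off a combined metric would weigh.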