Session 1L

Mathematical Modeling in the Sciences

12:30 PM to 2:15 PM | Moderated by Elizabeth Thompson


Finding Clusters in Data Through Recursive Graph Thresholding
Presenter
  • Richard Zhu, Sophomore, Statistics
Mentor
  • Werner Stuetzle, Statistics
Session
  • 12:30 PM to 2:15 PM

The goal of clustering is to partition a dataset into subsets (“clusters”) such that observations in the same cluster are similar to one another and dissimilar from observations in other clusters. We cast the clustering problem as a graph problem: the vertices of the graph are the observations, every two vertices are connected by an edge, and the weight of an edge is the dissimilarity between the two observations. We can form clusters by progressive graph thresholding: we repeatedly break the longest (highest-weight) edge until the graph splits into two connected components, which form clusters in the sense described above. We then apply the same procedure recursively until we have obtained the desired number of clusters. We propose a method to efficiently find the connected components and their memberships.
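The splitting step described above can be sketched in a few lines of Python (an illustrative toy, not the authors' implementation; the one-dimensional example dataset and function names are ours):

```python
import itertools

def threshold_split(points, dissim):
    """Split points into two clusters by removing the longest edges
    of the complete graph until it falls into two connected components."""
    # All edges of the complete graph, sorted longest-first.
    edges = sorted(itertools.combinations(range(len(points)), 2),
                   key=lambda e: dissim(points[e[0]], points[e[1]]),
                   reverse=True)
    kept = set(edges)
    for e in edges:
        kept.discard(e)               # break the longest remaining edge
        comps = components(len(points), kept)
        if len(comps) == 2:           # first disconnection: exactly two parts
            return comps

def components(n, edges):
    """Connected components of an undirected graph via breadth-first search."""
    adj = {i: set() for i in range(n)}
    for a, b in edges:
        adj[a].add(b); adj[b].add(a)
    seen, comps = set(), []
    for s in range(n):
        if s in seen:
            continue
        comp, frontier = {s}, [s]
        while frontier:
            v = frontier.pop()
            for w in adj[v] - comp:
                comp.add(w); frontier.append(w)
        seen |= comp
        comps.append(sorted(comp))
    return comps

pts = [0.0, 0.1, 0.2, 5.0, 5.1]
print(threshold_split(pts, lambda a, b: abs(a - b)))  # [[0, 1, 2], [3, 4]]
```

Applying `threshold_split` recursively to each returned component yields the desired number of clusters; since removing one edge can increase the component count by at most one, the first disconnection always produces exactly two parts.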


Mathematical Studies of Data Storage in CD-ROM and DNA
Presenters
  • Iuliia Dmitrieva, Sophomore, Engineering Physics, Lake Wash Tech Coll
  • Dylan Dean, Sophomore, Computer Engineering, Lake Wash Tech Coll
  • Taylour Mills, Sophomore, Aeronautical Engineering, Lake Wash Tech Coll
Mentor
  • Narayani Choudhury, Computer Science & Engineering, Mathematics, Physics, Lake Washington Institute of Technology, Kirkland
Session
  • 12:30 PM to 2:15 PM

Current data storage media have reached their capacity thresholds as data volumes grow and size requirements shrink. Digital storage in DNA has attracted considerable interest as a next-generation miniaturized, high-capacity storage medium. Deoxyribonucleic acid (DNA) forms the genetic blueprint of life and is the primary carrier of genetic information in living cells and organisms. Data storage in DNA involves encoding digital binary data into synthesized DNA strands. Here, we provide a comparative, calculus-based study of the data storage capacities of a conventional CD-ROM and DNA. We use parametric equations to model the spiral track of a CD-ROM and the double helix of DNA, and study the arc length, curvature, and topological properties of DNA. The data storage densities for binary, base-3, and base-4 encodings in DNA are estimated, and the calculated densities are found to be in good agreement with reported estimates. Recent studies demonstrate that magnetic nano-knots can be used for data storage. The topological properties of DNA, including twists, links, and knots, thus provide additional attributes which may in future be used for data storage.
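The arc-length computation for the helix can be illustrated with a short Python sketch (our own toy, not the authors' code; the radius and pitch below are standard textbook values for B-DNA, used only as placeholders):

```python
import math

def arc_length(curve, t0, t1, n=10_000):
    """Numerical arc length of a parametric curve by summing chord lengths."""
    total, prev = 0.0, curve(t0)
    for i in range(1, n + 1):
        pt = curve(t0 + (t1 - t0) * i / n)
        total += math.dist(prev, pt)
        prev = pt
    return total

# Double-helix backbone: radius ~1 nm, pitch ~3.4 nm per turn (textbook values).
R, pitch = 1.0, 3.4
helix = lambda t: (R * math.cos(t), R * math.sin(t), pitch * t / (2 * math.pi))

L = arc_length(helix, 0.0, 2 * math.pi)                 # one full turn
exact = math.sqrt((2 * math.pi * R) ** 2 + pitch ** 2)  # closed-form check
print(round(L, 4), round(exact, 4))
```

For a helix the closed form sqrt((2*pi*R)^2 + pitch^2) is available, so the chord-sum approximation can be checked directly; for the Archimedean spiral of a CD-ROM track the same `arc_length` routine applies with a planar curve.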


Network Topology of Knots and Borromean Rings
Presenters
  • Taylour Mills, Sophomore, Aeronautical Engineering, Lake Wash Tech Coll
  • Johnathan Hannon
  • Abdulrahman Ghalib
Mentor
  • Narayani Choudhury, Applied & Computational Math Sciences, Engineering & Mathematics, Physics, Lake Washington Institute of Technology, Kirkland
Session
  • 12:30 PM to 2:15 PM

The use of magnetic nano-knots and Brunnian links for data storage and communications makes understanding the geometric and network topology of knots and links very important. Recent reports suggest that DNA and other halogen networks self-assemble into exotic Borromean-ring molecular topologies. Borromean rings form a Brunnian link of three rings linked in such a way that no two alone are connected: only when all three rings come together does the linkage occur. Borromean links form the current logo of the International Mathematical Union, where they display strength in unity. Understanding knots, links, and their networking is central to our understanding of DNA, protein folding, polymers, and other soft materials. We have designed and 3D-printed a Borromean math puzzle; the puzzle falls apart when a link is pulled out and is an excellent learning tool for studying Borromean link topologies. We use parametric equations to study Borromean rings and trefoil knots, and we wrote computer visualization code in SAGE to display trefoil knots and complex Borromean links for distorted circular, elliptical, and other geometries. The Seifert surfaces of Borromean links are sketched using SeifertView and provide an aesthetic 3D view of the rings, which can be oriented on a plane. The Seifert surface of a knot yields a knot invariant; it is a surface having the knot as its boundary. The adjacency matrix and topological connectivity of the links are studied using directed-graph models. A computer program is written to unravel the complex linking and intriguing connectivity properties of trefoil knots and Borromean networks.
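The parametric curves mentioned above can be sketched in plain Python (illustrative only; the abstract's own visualization code uses SAGE, and the ellipse semi-axes here are arbitrary choices):

```python
import math

def trefoil(t):
    """A standard parametrization of the trefoil knot."""
    return (math.sin(t) + 2 * math.sin(2 * t),
            math.cos(t) - 2 * math.cos(2 * t),
            -math.sin(3 * t))

def borromean_ellipses(a=2.0, b=1.0):
    """Three mutually perpendicular ellipses; for suitable semi-axes a > b
    they form a Borromean link (no two linked, all three inseparable)."""
    return [
        lambda t: (a * math.cos(t), b * math.sin(t), 0.0),
        lambda t: (0.0, a * math.cos(t), b * math.sin(t)),
        lambda t: (b * math.sin(t), 0.0, a * math.cos(t)),
    ]

# Each curve closes up: the point at t = 0 matches the point at t = 2*pi.
closed = all(abs(u - v) < 1e-9
             for u, v in zip(trefoil(0.0), trefoil(2 * math.pi)))
print(closed)  # True
```

Sampling these curves over t in [0, 2*pi] gives point lists that can be handed to any 3D plotting routine; distorting the semi-axes reproduces the elliptical and other geometries the abstract describes.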


Interactive Construction and Exploration of Hexagonal Mosaic Knots
Presenter
  • Declan Mills, Fifth Year, Applied Computing, UW Bothell
Mentor
  • Jennifer McLoud-Mann, Mathematics, UW Bothell
Session
  • 12:30 PM to 2:15 PM

Mathematical knots are non-intersecting closed loops which may be tangled; links are collections of such loops which may be intertwined with one another. These 3-dimensional paths often resist description, so mathematicians choose convenient ways to describe them. One such way is to project them onto a plane; it is also interesting to build knots in discrete ways, such as placing them on tiles in a plane. In this poster we consider hexagonal mosaic knots: knots projected onto a plane tiled by the honeycomb hexagonal tessellation. In this way, knots can be built from a small collection of hexagonal tiles with loops. We create an interactive tool which presents hexagonal tile types, a grid on which to lay them, and options for analysis. The researcher uses a point-and-click tool to lay down a mosaic grid and, in so doing, creates an underlying data structure representing the segments contained in the mosaic. When requested, the software traverses this data structure like a linked list. In this manner, one may determine whether the data structure represents a suitably connected hexagonal mosaic knot or whether it contains dead ends or stray segments; that is, determine whether the data structure represents a knot/link or not. This process helpfully assigns segments to their respective knots, distinguishing not only ‘over’ and ‘under’ but also ‘self’ and ‘other’. We hope to continue exploring automatic generation of information about knots from their tiled representations. Once the tool is more developed, we hope to be able to answer more questions about the knot or link represented by the data structure. We also hope to continue exploring the use of rapid, flexible feedback from prototypes in aiding exploratory research.
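The linked-list-style traversal described above might look like the following Python sketch (our own simplification, not the poster's tool: segments join named endpoints, and crossing information such as 'over'/'under' is not modeled):

```python
def closed_loops(segments):
    """Walk from endpoint to endpoint, linked-list style, and return the
    loops found, or None if any strand dead-ends or branches."""
    # Adjacency: endpoint -> set of neighbouring endpoints.
    adj = {}
    for a, b in segments:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    if any(len(nbrs) != 2 for nbrs in adj.values()):
        return None                    # dead end or branching: not a knot/link
    loops, visited = [], set()
    for start in adj:
        if start in visited:
            continue
        loop, prev, cur = [start], None, start
        while True:
            nxt = next(n for n in adj[cur] if n != prev)  # keep going forward
            if nxt == start:
                break                  # strand closed up into a loop
            loop.append(nxt)
            prev, cur = cur, nxt
        visited |= set(loop)
        loops.append(loop)
    return loops

# A single closed 4-segment loop passes; a dangling strand fails.
ok = closed_loops([("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
bad = closed_loops([("a", "b"), ("b", "c")])
print(ok, bad)
```

Each returned loop corresponds to one knot component, which is how a traversal of this kind can assign segments to 'self' versus 'other'.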


How do Mean and Variance Affect Gene Survival and Gene Frequencies?
Presenter
  • Jueyi Liu, Senior, Economics, Applied & Computational Mathematical Sciences (Scientific Computing & Numerical Algorithms), UW Honors Program
Mentor
  • Elizabeth Thompson, Statistics
Session
  • 12:30 PM to 2:15 PM

Mutation produces new variation in populations, and in each generation these variants are copied from parents to offspring. While almost all variants of genes are eventually lost, they may remain in the population for many generations. We use branching process models to analyze counts of gene copies. In a population of constant size, a gene copy produces on average one offspring copy at the next generation. An advantageous mutant will have a mean greater than 1, and a deleterious one a mean less than 1. It is thought that most mutations are slightly deleterious, and with high probability those variants become extinct rapidly. Nonetheless, the few deleterious mutants that are not yet extinct may achieve high numbers. Thus, we have a particular interest in those with a mean slightly less than 1. We use different probability models for the offspring distribution and consider three aspects of the mutant’s survival: the extinction probability over k generations, the expected copy count conditional on survival, and the probability of surviving an additional k generations conditional on having survived k already. We find that these statistics depend closely on the variances as well as the means of the offspring distributions. By adjusting the parameters of the distributions, we make the mean and variance approximately the same across distributions. Based on our simulations, when k is large and the mean and variance of the offspring distributions are matched, the mutant’s conditional survival behavior is essentially the same across distributions. In other words, the statistics above can be estimated from the mean and variance alone, and the specific distribution has little effect on the conditional population dynamics. However, in the first few generations, these statistics differ across distributions. Thus, if we know the mean and variance of a mutant’s offspring distribution, we can predict the long-term population behavior conditional on survival without knowing the true distribution.
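A minimal Monte Carlo sketch of the branching-process statistics described above (our own illustration; a Poisson offspring distribution with mean 0.95 is one plausible choice of model, and the sample sizes are arbitrary):

```python
import math
import random

def simulate(offspring, k, trials=20_000, seed=1):
    """Monte Carlo for a branching process started from a single copy:
    returns (P(extinct within k generations),
             E[copy count at generation k | survival to generation k])."""
    rng = random.Random(seed)
    extinct, surviving = 0, []
    for _ in range(trials):
        n = 1
        for _ in range(k):
            n = sum(offspring(rng) for _ in range(n))
            if n == 0:
                extinct += 1
                break
        if n > 0:
            surviving.append(n)
    return extinct / trials, sum(surviving) / len(surviving)

def poisson_draw(rng, lam):
    """Knuth's algorithm for a Poisson(lam) variate."""
    threshold, p, n = math.exp(-lam), 1.0, 0
    while True:
        p *= rng.random()
        if p < threshold:
            return n
        n += 1

# A slightly deleterious mutant: Poisson offspring with mean 0.95.
p_ext, cond_mean = simulate(lambda rng: poisson_draw(rng, 0.95), k=20)
print(round(p_ext, 3), round(cond_mean, 1))
```

Swapping in another offspring distribution with the same mean and variance (e.g. a suitably parametrized geometric) allows the cross-distribution comparisons the abstract describes.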


Objectivity and Likelihoodism: Relaxing Rationality Constraints via Qualitative Probability
Presenters
  • Aditya Saraf, Senior, Computer Engineering
  • Soham Pardeshi, Junior, Pre Engineering
Mentor
  • Conor Mayo-Wilson, Philosophy, University of Washington, Seattle
Session
  • 12:30 PM to 2:15 PM

Likelihoodism is the conjunction of the following two statistical principles: (1) the Law of Likelihood (LL), which characterizes when a piece of evidence “favors” one hypothesis over another; and (2) the Likelihood Principle (LP), which characterizes when an agent ought to draw the same inference about how likely two competing hypotheses are, given a piece of evidence. Elliott Sober has argued that some of the greatest scientific achievements of the last two centuries – from Darwin’s arguments for common ancestry to Eddington’s argument that the bending of light during an eclipse favors Einstein’s theory of relativity – are applications of likelihoodism. Many proponents of likelihoodism maintain that it provides an “objective” standard for evidence, and we first provide a precise account of the sense in which LL and LP are objective. We consider a statistical principle more objective if it is acceptable to agents with more diverse beliefs and/or values. We argue that LL and LP meet this type of objectivity by showing that all Bayesian agents endorse LL/LP. However, a statistical principle is clearly not objective if it is acceptable only to agents meeting stringent rationality constraints. The typical formulations of LL and LP rely on the probability axioms, but one can show that requiring agents to conform to these axioms requires them to be logically omniscient. This unreasonable constraint makes LL/LP unacceptable for any human agent. Our paper resolves these issues by restating likelihoodism in a formal representation of qualitative probability and proving the corresponding theorems in this new context. This work elucidates how likelihoodism is objective and how it fits into a Bayesian framework.
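The Law of Likelihood can be illustrated with a toy Python example (ours, not the authors'; the coin-flip hypotheses are placeholders):

```python
from math import comb

def likelihood_ratio(heads, n, p1, p2):
    """Law of Likelihood: evidence E favors H1 over H2 exactly when
    P(E | H1) / P(E | H2) > 1.  Here E is `heads` heads in n coin flips,
    and H1, H2 fix the heads probability at p1, p2 respectively."""
    binom = lambda p: comb(n, heads) * p**heads * (1 - p)**(n - heads)
    return binom(p1) / binom(p2)

# Seven heads in ten flips favors a fair coin (p = 0.5)
# over a tails-biased coin (p = 0.3).
lr = likelihood_ratio(7, 10, 0.5, 0.3)
print(lr > 1)  # True
```

Note that the ratio compares only the two likelihoods, with no prior over the hypotheses, which is the sense in which LL is acceptable to Bayesian agents with diverse prior beliefs.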


Highly Restricted Latent Class Model Structure Recovery: An Initial Simulation Study
Presenter
  • Richie Wang, Junior, Psychology
Mentor
  • Brian Flaherty, Psychology
Session
  • 12:30 PM to 2:15 PM

Latent class models are frequently used to classify observations into sub-groups that are internally homogeneous yet qualitatively different from one another. Typically, these models are unrestricted, meaning there is no a priori structure imposed on the model estimates. This approach is similar to many cluster analysis approaches. This study uses simulation to examine the modeling conclusions reached when an unrestricted model is applied to data generated from a highly restrictive population model. We expect that a researcher will conclude there are fewer classes than the true number specified in the population. The simulated data for this work are based on a substance use example identified in other research. The population was generated from six classes, with the within-class error rate fixed at 5%. Preliminary analyses indicate that commonly employed model selection criteria prefer models with fewer than six classes. This is important because most analysts using latent class models use unrestricted models with little apparent consideration of what realistic error rates may be. In research domains where small error rates are plausible, restrictive models of the sort used to generate the data for this study may be worth considering.
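A sketch of how data might be generated from such a restricted model, in Python (our own illustration; the class-defining patterns below are hypothetical, not those of the study):

```python
import random

def simulate_lcm(patterns, n_per_class, error=0.05, seed=0):
    """Draw binary item responses from a restricted latent class model:
    each class has a defining response pattern, and each item deviates
    from it independently at the fixed within-class error rate."""
    rng = random.Random(seed)
    data = []
    for pattern in patterns:
        for _ in range(n_per_class):
            # XOR flips each bit with probability `error`.
            data.append([bit ^ (rng.random() < error) for bit in pattern])
    return data

# Six hypothetical class patterns over five binary items (illustrative only;
# a Guttman-style stair pattern, not the patterns used in the study).
patterns = [
    [0, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0], [1, 1, 1, 1, 0], [1, 1, 1, 1, 1],
]
data = simulate_lcm(patterns, n_per_class=100)
print(len(data), len(data[0]))  # 600 5
```

Fitting unrestricted latent class models of varying sizes to data like these and comparing information criteria is the kind of simulation design the abstract describes.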


The University of Washington is committed to providing access and accommodation in its services, programs, and activities. To make a request connected to a disability or health condition contact the Office of Undergraduate Research at undergradresearch@uw.edu or the Disability Services Office at least ten days in advance.