Session 2B

Machine Learning

3:30 PM to 5:15 PM | Moderated by Kurtis Heimerl


Adversarial Language Generation with MCTS
Presenter
  • Jize (Tony) Cao, Junior, Computer Science (Data Science), Statistics
Mentors
  • William Agnew, Computer Science & Engineering
  • Pedro Domingos, Computer Science & Engineering
Session
  • 3:30 PM to 5:15 PM

Adversarial Language Generation with MCTS

Natural language generation (NLG) aims to generate meaningful and coherent natural language from a machine-representation system. Many approaches cast this task as a reinforcement learning (RL) problem. The NLG RL paradigm usually pairs a generative model, which produces a response for a given query, with a discriminator model that distinguishes human-generated dialogues from machine-generated ones, analogous to the human evaluator in the Turing test. Previous research shows that such adversarially trained generators can generate higher-quality sentences than baseline supervised generators. However, the current state of the art focuses on fine-tuning the generator using the discriminator and rarely incorporates the discriminator into the final language generation. Formally, we define NLG as a planning problem: the agent (model) generates responses given a prior query, each action in the plan represents an intermediate state of the response, and at each state the agent decides which word to generate next. The main issue with this paradigm is the enormous number of possible actions at each step, equal to the total number of words in the vocabulary. To solve this issue, we incorporate an idea from AlphaGo, another great success in RL: AlphaGo uses the Monte Carlo Tree Search (MCTS) algorithm to reduce the search space and avoid the intractable computations that the enormous search space of Go would otherwise require. Inspired by the current NLG RL paradigm and AlphaGo’s success, we propose to incorporate MCTS into NLG RL. The agent estimates initial word values using the generator’s probability distribution, and picks the next words through Upper Confidence Bounds for Trees (UCT), a widely used MCTS planning algorithm. Our results approach the current state of the art. My main responsibility was implementing the paradigm and seeking ways to improve it. This work’s main contribution is presenting an effective way to incorporate the discriminator into NLG.
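The word-selection step described above can be sketched as a UCT-style rule over candidate next words. This is an illustrative sketch, not the authors' implementation: the child fields (`prior` from the generator, `value` from discriminator-based rollouts, `visits`) and the exploration constant `c_puct` are assumptions for the example.

```python
import math

def uct_select(children, c_puct=1.0):
    """Pick the child (candidate next word) maximizing a UCT-style score.

    Each child is a dict with:
      prior  - generator's probability for this word (initial value estimate)
      value  - total discriminator-based reward accumulated by rollouts
      visits - number of times this word has been explored
    """
    total_visits = sum(ch["visits"] for ch in children) + 1
    best, best_score = None, float("-inf")
    for ch in children:
        # Exploitation: average rollout value, falling back to the
        # generator prior for unvisited words.
        exploit = ch["value"] / ch["visits"] if ch["visits"] else ch["prior"]
        # Exploration: prior-weighted bonus that shrinks with visits.
        explore = c_puct * ch["prior"] * math.sqrt(
            math.log(total_visits) / (1 + ch["visits"]))
        score = exploit + explore
        if score > best_score:
            best, best_score = ch, score
    return best
```

In a full system the discriminator would score completed rollouts and those scores would be backed up into each child's `value`, so that frequently rewarded words come to dominate the selection.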


Comparison of Reinforcement Learning Methods in a Real-World, Real-Time Environment
Presenters
  • Min Jing (Wendy) Jiang, Sophomore, Computer Science, Bellevue College
  • Megan Bui, Sophomore, Electrical Engineering, Bellevue College
  • Abduselam Mohammed (Abdul) Shaltu, Senior
  • Samuel Vanderlinda, Sophomore, Computer Science, Bellevue College
  • Tejas Rao, Non-Matriculated
Mentor
  • Christina Sciabarra, Political Science
Session
  • 3:30 PM to 5:15 PM

Comparison of Reinforcement Learning Methods in a Real-World, Real-Time Environment

Reinforcement Learning (RL) is a subcategory of machine learning in which an agent (the decision maker) observes its environment and executes the course of actions that maximizes rewards. This is similar to teaching a pet to perform tricks using treats as positive reinforcement. Our research compares different RL methods on low-performance devices like a Raspberry Pi in real-time, real-world environments. RL has gained popularity recently with breakthroughs such as DeepMind’s paper Playing Atari with Deep Reinforcement Learning, in which an agent learns to play Atari games from raw pixels, and DeepMind’s AlphaGo program (DeepMind, https://deepmind.com/research/alphago), the first computer program to beat a world champion Go player. RL projects like AlphaGo have utilized big data, powerful computing resources, and simulated environments that do not require real-time interaction to train the machine learning models. Our group compares the effectiveness of different RL methods at an accessible level of computing power, on offline devices that an average consumer could acquire. The team constructs a physical environment for the robot to navigate and creates an OpenAI Gym environment that our agents use to control the robot and receive feedback from the environment. We train our agents using different RL methods to navigate the environment optimally while avoiding collisions, and we then compare the performance of the different methods in our physical real-time environment. Reinforcement learning on small, offline devices could pave the way for a variety of devices that learn over time without being connected to a network. Imagine a small Mars rover that learns to navigate its environment efficiently over time.
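The kind of tabular method a low-power device can run is illustrated below with Q-learning on a toy navigation task. This is a minimal sketch, not the team's environment: the 1-D corridor, reward values, and hyperparameters are assumptions chosen for illustration (a real setup would use their OpenAI Gym environment and the robot's sensors).

```python
import random

def train_q_learning(length=5, episodes=500, alpha=0.5, gamma=0.9, eps=0.1):
    """Tabular Q-learning on a toy 1-D corridor.

    States are cells 0..length-1; actions are 0 (left) and 1 (right).
    Reaching the last cell yields +1; bumping into a wall yields -1,
    a stand-in for the collision penalty in a navigation task.
    """
    random.seed(0)  # deterministic for the example
    Q = [[0.0, 0.0] for _ in range(length)]
    for _ in range(episodes):
        s = 0
        while s != length - 1:
            # epsilon-greedy action selection
            if random.random() < eps:
                a = random.randrange(2)
            else:
                a = max((0, 1), key=lambda x: Q[s][x])
            s2 = max(0, min(length - 1, s + (1 if a == 1 else -1)))
            if s2 == length - 1:
                r = 1.0            # reached the goal
            elif s2 == s:
                r = -1.0           # hit a wall ("collision")
            else:
                r = 0.0
            # one-step temporal-difference update
            Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
            s = s2
    return Q
```

After training, the greedy policy (pick the action with the larger Q-value in each state) moves right toward the goal; the same update rule scales to the robot's real state and action spaces, which is what makes tabular and small-network methods attractive on a Raspberry Pi.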


Understanding and Influencing the Dynamical Regime of Recurrent Neural Networks
Presenter
  • Timothy John Moore, Senior, Mathematics
Mentor
  • Eric Shea-Brown, Applied Mathematics
Session
  • 3:30 PM to 5:15 PM

Understanding and Influencing the Dynamical Regime of Recurrent Neural Networks

Neural networks are machine learning instruments capable of solving a vast array of problems, ranging from image classification to decision making. Developers of neural network algorithms must make many consequential decisions in order to adapt a model to a particular task. Recurrent neural networks (RNNs) are an important instance among these algorithms: they are popular and powerful due to their ability to integrate information over time. A structured understanding of how RNNs are affected by the decisions made during model adaptation will improve RNN performance. Unlike conventional feed-forward neural networks, RNNs are allowed to evolve over time, and the study of time-dependent systems is known mathematically as dynamical systems analysis. Hence, we can build upon the existing machine learning literature by incorporating a dynamical systems perspective to describe an RNN's regime. Through this lens we investigate in which dynamical regime (chaotic or stable) an RNN displays optimal temporal integration abilities. We establish a relationship between a mathematical metric for the dynamical regime and the qualitative solution of the network. Our preliminary results further suggest improved performance when the RNN is encouraged to operate in a chaotic dynamical regime.
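One standard metric of this kind, which may or may not be the one used in the study, is the spectral radius of a random recurrent weight matrix: for i.i.d. entries of variance g²/n, the radius concentrates near the gain g, with g < 1 conventionally associated with stable dynamics and g > 1 with chaotic dynamics. The sketch below assumes this standard random-RNN setup; the matrix size and gains are illustrative.

```python
import numpy as np

def spectral_radius(n=200, g=1.5, seed=0):
    """Largest eigenvalue magnitude of a random recurrent weight matrix.

    Entries are drawn i.i.d. with variance g^2 / n, so by the circular
    law the spectral radius is approximately g for large n: g < 1
    suggests a stable regime, g > 1 a chaotic one (a common heuristic
    for the dynamical regime of a random RNN).
    """
    rng = np.random.default_rng(seed)
    W = g * rng.standard_normal((n, n)) / np.sqrt(n)
    return np.abs(np.linalg.eigvals(W)).max()
```

Sweeping g (or an equivalent regularizer during training) is one way a network can be "encouraged" toward a given regime, which is the kind of intervention the abstract alludes to.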


A More Biologically Accurate Artificial Neural Network to Learn Environment Models for Reinforcement Learning
Presenter
  • Vinny Murugappan Palaniappan, Senior, Neurobiology, Computer Science UW Honors Program
Mentor
  • Rajesh Rao, Computer Science & Engineering
Session
  • 3:30 PM to 5:15 PM

A More Biologically Accurate Artificial Neural Network to Learn Environment Models for Reinforcement Learning

Current artificial neural networks (ANNs) use an archaic view of neurons based on an oversimplification of their biological computations. This simplification has allowed optimized computation through GPUs, leading to the widespread adoption of ANNs in deep learning, but at the cost of important biological features. In this research we create a new model for artificial neural networks that incorporates more realistic aspects of biological neural networks, such as stochastic vesicle release in neuronal synapses and dendritic computation. Animals have been shown to learn models of the environment when introduced to a new situation, but this type of learning is often not incorporated into reinforcement learning models in AI. The goal is to have the new, biologically realistic ANN learn models of environments in simulation frameworks like OpenAI Gym and AI2Thor, so that, given past frames/images and an action taken by the agent/player, the network can predict how the environment will react over time. We compare the performance of this network with that of traditional ANNs (e.g. recurrent neural networks with long short-term memory) to demonstrate the capabilities of the new network. Our results have implications for recent efforts to move toward biologically inspired models of learning in artificial intelligence and in computer vision for robotics. We expect the model-learning algorithms we present to learn an environment more efficiently and to select actions that achieve arbitrary goals within that environment. This differs from traditional reinforcement learning models, which aim to complete a single goal and can take a long time to train. The novelty of this work lies in its increased biological realism without the computational complexity of simulating real neurons, its temporal aspect of neural processing in addition to the spatial aspect, and its prediction based on actions instead of pure video prediction.
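The stochastic-vesicle-release idea can be sketched as a layer in which each synapse transmits on a given forward pass only with some probability. This is an illustrative sketch, not the authors' model: the release probability, the Bernoulli mask, and the 1/p rescaling (so the layer matches its deterministic counterpart in expectation) are assumptions for the example.

```python
import numpy as np

def stochastic_release_forward(x, W, p_release=0.5, rng=None):
    """Linear forward pass with stochastic synaptic transmission.

    Each synapse transmits with probability p_release (an independent
    Bernoulli mask over the weight matrix), loosely mimicking
    probabilistic vesicle release. The surviving weights are scaled by
    1 / p_release so the expected output equals the deterministic
    layer's output x @ W.
    """
    rng = rng or np.random.default_rng(0)
    mask = rng.random(W.shape) < p_release  # which synapses fire this pass
    return x @ (W * mask) / p_release
```

Averaged over many passes the output converges to the deterministic `x @ W`, while individual passes carry biologically motivated noise; the interesting question the abstract raises is whether that noise helps when the network is trained to predict environment dynamics.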


Machine Learning to Label Pilot Blinks During Takeoff and Landing in a Flight Simulator
Presenter
  • Paul Michael Curry, Senior, Computer Science UW Honors Program
Mentor
  • Linda Boyle, Industrial Engineering
Session
  • 3:30 PM to 5:15 PM

Machine Learning to Label Pilot Blinks During Takeoff and Landing in a Flight Simulator

Minimizing cognitive load is an integral part of human-centered design, where a more intuitive, easy-to-learn, and adaptive interface is desired. In this study, I implemented a program to measure cognitive load using blink rates collected from pilots during takeoff and landing in a flight simulator. A GoPro camera was used to capture the pilots’ blinks. The program I wrote analyzed the GoPro video by first extracting the face of the pilot and then points on the face called facial landmarks. These facial landmarks are rotated and scaled to standardize them for analysis, then fed into a Support Vector Machine (SVM), a type of machine learning model. To train the model, I labeled around 60,000 frames sampled from the GoPro videos as having the pilot’s eyes “open” or “closed”. To predict blinks, the model classifies each frame in a video as either eyes open or eyes closed. This output is then smoothed with a 20-frame heuristic: if any two frames within a window of 20 frames are marked as closed, the entire window is marked as closed. The middle of each resulting series of closed-eye frames is then marked as the frame where the blink occurred. The output of the process is the video with each frame marked as either the apex of a blink or not. I evaluated several machine learning models (random forests, gradient-boosted decision trees, a multi-layer perceptron, and a support vector machine), and the support vector machine had the greatest precision and recall. This work is important because identifying peak-workload tasks in aircraft cockpits contributes to the identification of areas that require better user interfaces and automation improvements to increase the safety and efficiency of aircraft operations.
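The smoothing-and-apex step described above can be sketched directly from the text. This is a minimal reading of the heuristic, assuming per-frame 0/1 labels from the classifier and a sliding 20-frame window; the exact window mechanics in the author's program may differ.

```python
def mark_blink_apexes(closed, window=20):
    """Smooth per-frame classifier labels and mark blink apexes.

    closed: list of 0/1 flags (1 = frame classified as eyes closed).
    Any window of `window` consecutive frames containing at least two
    closed frames is treated as entirely closed; the middle frame of
    each resulting closed run is marked as the blink apex.
    """
    n = len(closed)
    smoothed = list(closed)
    for i in range(n - window + 1):
        if sum(closed[i:i + window]) >= 2:
            for j in range(i, i + window):
                smoothed[j] = 1
    apex = [0] * n
    i = 0
    while i < n:
        if smoothed[i]:
            j = i
            while j < n and smoothed[j]:
                j += 1            # find the end of this closed run
            apex[(i + j - 1) // 2] = 1  # mark its middle frame
            i = j
        else:
            i += 1
    return apex
```

Two nearby closed detections thus merge into one blink event, which is what makes the heuristic robust to the classifier flickering between open and closed within a single blink.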


Using Artificial Intelligence to Predict Possibilities of Human-Nature Interaction in Natural Landscapes: A Proof of Concept
Presenter
  • Audryana Nay, Senior, Environmental Science & Resource Management (Landscape Ecology & Conservation)
Mentor
  • Peter Kahn, Department of Psychology and School of Environmental and Forest Sciences, University of Washington
Session
  • 3:30 PM to 5:15 PM

Using Artificial Intelligence to Predict Possibilities of Human-Nature Interaction in Natural Landscapes: A Proof of Concept

Think about a meaningful interaction in nature that you have had. Now characterize it in such a way that you could imagine many examples of it happening, and even though each example would be at least a little different from the others, you would have no problem recognizing each one as essentially the same form of interaction. One example is Walking along the Edges of Water and Land (e.g., around Green Lake or at the beach in Golden Gardens). We call these characterizations interaction patterns. By assembling a verb, preposition, and nature noun, the profound internal experiences we feel in nature are given vernacular expression. Over the last five years, my research lab has empirically generated over 150 interaction patterns in diverse landscapes. Currently, interaction patterns have to be identified by an expert. This is where my novel research project comes in. I am using an Application Programming Interface (API) called Clarifai to develop an Artificial Intelligence (AI) program that can predict possible interaction patterns in a landscape from photo data. I anticipate having worked with approximately 10,000 photos to train the system on around two dozen interaction patterns by the end of spring quarter 2019. My goal is to develop a proof of concept for our novel approach, which could then be scaled upward, with potentially large implications for conservation and urban design. For example, a future AI system like this one could predict the range and depth of interaction patterns experienced in a landscape under threat of development, as an argument that the landscape is worth preserving. Our future AI system could also be integrated into the industry-standard urban design software AutoCAD to optimize the integration of interaction patterns into urban design. In short: a proof of concept now, with global reach as a hopeful endpoint.


The University of Washington is committed to providing access and accommodation in its services, programs, and activities. To make a request connected to a disability or health condition contact the Office of Undergraduate Research at undergradresearch@uw.edu or the Disability Services Office at least ten days in advance.