Research Training Group (RTG)

RTG: Statistics in the 21st Century

Objects, Geometry and Computing

Images, matrices, functions, trajectories, trees, and graphs are examples of objects arising in modern data analysis. The importance of such novel data types in statistics, and of geometry and computing in their analysis, cannot be overstated. This RTG addresses these challenges through training activities for:

  • undergraduate students,
  • graduate students,
  • postdoctoral fellows

The offered activities are in general open to all interested and qualified students, even if they are not formally members of the RTG. However, each individual training activity has a natural limit to the number of participants. Interested students should contact Professor Wolfgang Polonik (wpolonik@ucdavis.edu) to discuss further details.

General Description

Statistical analysis of object data requires skills and knowledge that are not (yet) part of the standard training of a statistician. These include data handling skills, such as accessing web services or manipulating data formats; skills needed for working in a multi-disciplinary scientific environment, including the ability to understand the scientific context the data arose from; and more mathematical skills, including notions of geometry, shape or topology, and an understanding of their role in the development and analysis of the corresponding statistical methodology. The training offered by the RTG addresses all these issues. It prepares participants for more advanced studies and research activities in Statistics and the mathematical sciences in general, and it also enhances the professional development of trainees.

The training itself consists of a blend of data analysis applied to real scientific questions, exposure to research in new methodologies, relevant mathematical theories, and computing.

Through a natural feedback mechanism this training program is expected to also lead to new approaches and ideas for strengthening the training of statisticians at various levels. Resources and tools for a modern training in statistics will be developed and will be broadly disseminated to the community.

The general goals of the RTG can be summarized as follows:

  • Train and prepare undergraduate students, graduate students and postdoctoral researchers at UC Davis to conduct research in the mathematical sciences, in particular in areas of Statistics involving objects, geometry and shape.
  • Provide a model for a modern education in Statistics via a comprehensive training from data handling, data analysis and methodology to theory by utilizing real data and real modern scientific challenges at all levels.
  • Increase the number of undergraduate majors in Statistics at UC Davis, enhance their professional development, and increase the percentage of those who enter graduate school.
  • Increase the overall number of underrepresented minorities and female domestic students in Statistics at UC Davis.
  • Increase the awareness and appreciation of the overall importance of statistical sciences among undergraduate and graduate students.

The specific statistical research topics of the training sessions are naturally guided by the interests of the RTG members. The following are instances of relevant topic areas:

  • Visualization, dynamics and manifolds for complex data with applications in medical and biological image analysis, and in finance and economics;
  • Trees, graphs, and shape statistics such as filaments, level sets, and observed shapes such as animal tracks.

Other activities of the RTG include:

  • a regular RTG seminar
  • training in mentoring and teaching
  • lecture series to be held in Winter 2014
  • periodic sessions on
    • scientific writing
    • oral presentation
    • grant writing and job applications (for graduate students and postdocs)
    • applying to graduate school (for undergraduate students)

RTG members

Wolfgang Polonik Statistics Director wpolonik@ucdavis.edu
Prabir Burman Statistics pburman@ucdavis.edu
Hans-Georg Müller Statistics hgmueller@ucdavis.edu
Thomas Lee Statistics tcmlee@ucdavis.edu
Jie Peng Statistics jiepeng@ucdavis.edu
Ethan Anderes Statistics ebanderes@ucdavis.edu
Alexander Aue Statistics aaue@ucdavis.edu
Debashis Paul Statistics debpaul@ucdavis.edu
Duncan Temple Lang Statistics Duncan@wald.ucdavis.edu
James Carey Entomology jrcarey@ucdavis.edu
Vladimir Filkov Computer Science filkov@cs.ucdavis.edu
Lloyd Knox Physics & Cosmology lknox@ucdavis.edu
Naoki Saito Mathematics saito@math.ucdavis.edu

Current Students

Clark Fitzgerald

PhD in Statistics

rcfitzgerald@ucdavis.edu

Dmitriy Izyumin

PhD in Statistics

dizyumin@ucdavis.edu

Irene Kim

PhD in Statistics

imkkim@ucdavis.edu

Jamshid Namdari

PhD in Statistics

jamnamdari@ucdavis.edu

Ken Wang

PhD in Statistics

kenwang@ucdavis.edu

Nicholas Ulle

PhD in Statistics

naulle@ucdavis.edu

Pamela Patterson

PhD in Statistics

ppatterson@ucdavis.edu

Eric Kalosa-Kenyon

PhD in Statistics

ekal@ucdavis.edu

Dayanara Lebron-Aldea

MS in Statistics

dlebron@ucdavis.edu

Andrew Blandino

PhD in Statistics

ablandino@ucdavis.edu

Cody Carroll

PhD in Statistics

cjcarroll@ucdavis.edu

Benjamin Roycraft

PhD in Statistics

btroycraft@ucdavis.edu

Olga Zamoroueva

PhD in Statistics

ozamoroueva@ucdavis.edu

Training Program

Undergraduate Training

A vital component of this training is for undergraduates to get exposed to research in statistics by learning techniques for approaching scientific problems from a statistical point of view, and to get an exposure to more non-traditional statistical topics that often are not addressed in standard undergraduate classroom settings. The RTG will approach the training in two ways:

a) two quarters of regular group sessions with a mix of discussion style and lecture style, and
b) summer research projects (either individual or in small groups for up to 2 months) mentored by RTG members.

Funding is available. Students with a solid background in statistics interested in participating in the RTG activities should contact Professor Wolfgang Polonik (wpolonik@ucdavis.edu) for more details.

To apply for the RTG program:

Applications for the 2016-2017 academic year are now closed. Applications for the 2017-2018 academic year will open Fall 2017. Once we receive your application, the mentors on the projects will choose whom to interview. If selected for an interview, you will be notified via email. Letters of recommendation are not required until after interviews.

Please see the RTG undergrad brochure for more information.

PROJECTS (2016-17)

Project 1: Applied Functional Data Analysis

Faculty Mentor(s): Professor Hans-Georg Müller (Statistics)

Abstract: In this project the participating student will obtain data that contain functions (time courses) or can be viewed as being generated by underlying functions and will study, apply and compare various linear and nonlinear dimension reduction methods, including functional principal component analysis. Applications may include economic or biological data that involve samples of time courses. 
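As a rough illustration of functional principal component analysis, the sketch below treats each time course as a vector of values on a common grid, estimates the sample covariance on that grid, and finds its leading eigenvector by power iteration. Everything here is an assumption made for illustration: the simulated sine/cosine curves, the function names `simulate_curves` and `first_fpc`, and the simplification of ignoring grid spacing when normalizing. Project work would more likely rely on an established R package.

```python
import math
import random


def simulate_curves(n=50, m=20, seed=1):
    """Hypothetical data: n curves a*sin(2*pi*t) + b*cos(2*pi*t) on an m-point grid."""
    rng = random.Random(seed)
    grid = [j / m for j in range(m)]
    curves = []
    for _ in range(n):
        a = rng.gauss(0.0, 2.0)   # dominant mode of variation
        b = rng.gauss(0.0, 0.5)   # weaker mode of variation
        curves.append([a * math.sin(2 * math.pi * t) + b * math.cos(2 * math.pi * t)
                       for t in grid])
    return grid, curves


def first_fpc(curves, n_iter=200, seed=0):
    """Leading principal component of curves sampled on a common grid.

    Returns (component, share_of_variance, scores). The component has unit
    Euclidean norm; grid spacing is ignored for simplicity.
    """
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    centered = [[c[j] - mean[j] for j in range(m)] for c in curves]
    # Sample covariance surface evaluated on the grid (an m x m matrix).
    cov = [[sum(x[j] * x[k] for x in centered) / (n - 1) for k in range(m)]
           for j in range(m)]
    # Power iteration from a random start converges to the leading eigenvector.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(m)]
    for _ in range(n_iter):
        w = [sum(cov[j][k] * v[k] for k in range(m)) for j in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[j] * sum(cov[j][k] * v[k] for k in range(m)) for j in range(m))
    total_variance = sum(cov[j][j] for j in range(m))
    # Scores: projection of each centered curve onto the component.
    scores = [sum(x[j] * v[j] for j in range(m)) for x in centered]
    return v, eigenvalue / total_variance, scores


grid, curves = simulate_curves()
component, share, scores = first_fpc(curves)
print(f"share of variance explained by the first component: {share:.2f}")
```

In this simulated setting the first component should closely follow the sine shape, since that mode was given the larger variance, and each curve is then summarized by a single score.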

Prerequisites: Fluency with R, STA 135 or STA 141A.

Project 2: Quantifying Patterns of Survival and Reproduction for Cohorts of Flies

Faculty Mentor(s):  Professor Hans-Georg Müller (Statistics), Professor James Carey (Entomology)

Abstract: This project touches upon biodemography, ecology and evolution, functional data analysis, and survival analysis.  There are several possible projects, including the study of large samples of cohort lifetables. 

Prerequisites: Matlab, R, STA 131B, 135 or STA 141A.

Project 3: Network Data Visualization with Linear Algebra

Faculty Mentor(s): Professor James Sharpnack (Statistics)

Abstract: Network data has become increasingly prevalent with the rise of online social networks, modern biomedical technologies, smart city initiatives, and massive sensor networks.  Due to their abstract mathematical nature, networks do not immediately admit a natural visualization technique.  We will explore how eigenvectors of various graph based matrices can be used to visualize vertices of the network, and how this can be used to learn underlying manifold structure in high-dimensional data.  This project will emphasize exploratory tools for understanding networks, and we will look at specific examples of network data.
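One common way to turn eigenvectors into vertex coordinates is a spectral embedding based on the graph Laplacian L = D - A. The sketch below is only an illustration under stated assumptions: the graph is connected and undirected, the helper names (`laplacian`, `spectral_embedding`) are hypothetical, and the eigenvectors are found with a simple shifted power iteration rather than a numerical linear algebra library. The project itself may use other graph matrices.

```python
import math
import random


def laplacian(n, edges):
    """Unnormalized Laplacian L = D - A of an undirected graph on vertices 0..n-1."""
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
    return lap


def _normalize(vec):
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _project_out(vec, basis):
    """Remove from vec its components along each (orthonormal) basis vector."""
    for b in basis:
        dot = sum(x * y for x, y in zip(vec, b))
        vec = [x - dot * y for x, y in zip(vec, b)]
    return vec


def spectral_embedding(n, edges, dim=2, n_iter=1000, seed=0):
    """Coordinates from the eigenvectors of L with the smallest nonzero eigenvalues.

    Assumes a connected graph with at least one edge. Power iteration is run on
    shift*I - L, whose largest eigenvectors are the smallest ones of L.
    """
    lap = laplacian(n, edges)
    shift = 2.0 * max(lap[i][i] for i in range(n))  # upper bound on eigenvalues of L
    shifted = [[(shift if i == j else 0.0) - lap[i][j] for j in range(n)]
               for i in range(n)]
    rng = random.Random(seed)
    found = [[1.0 / math.sqrt(n)] * n]  # constant vector: eigenvalue 0 of L
    for _dim in range(dim):
        v = _normalize(_project_out([rng.gauss(0.0, 1.0) for _ in range(n)], found))
        for _step in range(n_iter):
            w = [sum(shifted[i][j] * v[j] for j in range(n)) for i in range(n)]
            v = _normalize(_project_out(w, found))
        found.append(v)
    # One coordinate vector of length dim per vertex.
    return [[found[k + 1][i] for k in range(dim)] for i in range(n)]


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
for vertex, (x, y) in enumerate(spectral_embedding(6, edges)):
    print(vertex, round(x, 3), round(y, 3))
```

For this small example the first coordinate has one sign on vertices 0 to 2 and the opposite sign on vertices 3 to 5, so the two triangles land on opposite sides of a plot.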

Prerequisites: MAT 22A

Data link: https://snap.stanford.edu/data/ 

Sample publications: 

Project 4: Processing and analyzing data from the Human Connectome Project

Faculty Mentor(s): Professor Jie Peng (Statistics), Professor Debashis Paul (Statistics)

Abstract: In this project, we will look into various aspects of magnetic resonance imaging of the human brain. Specifically, we aim to make use of data from the Human Connectome Project (HCP). The project involves various data handling steps, including accessing, processing, visualizing, and analyzing the data. We will focus on learning default mode functional connectivity networks using resting state functional MRI data.

Prerequisites: Students are expected to be familiar with at least one programming language (R or Matlab), be capable of learning how to use specific neuroimaging software, and be familiar with matrix algebra and regression analysis. Knowledge of time series analysis (STA137) and multivariate statistics (STA135) is preferred though not required.

Project 5: Where do we get data from and what can we do with it?

Faculty Mentor(s): Professor Christiana Drake (Statistics)

Abstract: Statisticians and data analysts get data from many sources. In agriculture, investigators often conduct experiments. Studies involving humans are more complicated. Clinical trials are used to assess the efficacy of new drugs. These studies are mostly randomized trials. Epidemiologists and public health workers often study potentially harmful substances, and their studies are observational in nature. Biologists, too, often have to rely on observational studies to investigate biological processes in the progression of diseases. Nutritionists use randomized or observational studies to assess interventions in nutritional behaviors, and sample surveys to find out what people are eating. All these studies have one thing in common: the data always have problems. Missing data are a big concern. The format of the data is often not suitable for statistical analysis, and reformatting and checking for accuracy can be very time consuming. We will study these issues using data sets collected for various reasons and prepare them for statistical analysis. We will use Excel, SAS and R as needed.

Project 6:  Exploration of classification methods: SVM, kNN, and KDE

Faculty Mentor(s): Professor Xiaodong Li (Statistics)

Abstract: Classification is a very important methodology in statistics, machine learning and data science. In this project we will study three popular classification methods: Support Vector Machines (SVM), k-nearest neighbor classifiers (kNN), and classification via kernel density estimation (KDE). The goal of this project is to understand these methods on a conceptual level, and to explore their practical behavior by applying them to various data sets (using R). The outcome of this project will be a Wiki page (see http://stats.libretexts.org/) describing the three classification methods. 
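To make two of these methods concrete, here is a minimal sketch of a k-nearest neighbor classifier and a kernel density classifier in plain Python. It is an illustration under assumptions, not the project's code: the function names, the Gaussian kernel, the single shared bandwidth, and the toy data are all chosen for this example, and SVM is left out because it needs an optimization routine. The project itself uses R.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to query (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def kde_predict(train_x, train_y, query, bandwidth=1.0):
    """Pick the class maximizing (class prior) x (Gaussian kernel density at query).

    With the same bandwidth for every class, this product is proportional to the
    sum of kernel weights over the training points of that class, so the
    normalizing constants can be skipped.
    """
    scores = {}
    for x, y in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        scores[y] = scores.get(y, 0.0) + math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return max(scores, key=scores.get)


# Toy training data: class "a" near the origin, class "b" near (5, 5).
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["a", "a", "a", "b", "b", "b"]

for query in [(0.5, 0.5), (5.5, 5.5)]:
    print(query, knn_predict(train_x, train_y, query), kde_predict(train_x, train_y, query))
```

The choice of k and of the bandwidth plays a similar role in both methods: small values follow the training data closely, while large values smooth over local structure.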

Project 7: Analysis and visualization of data from a social website for sharing music and memories

Faculty Mentor(s): Petr Janata (Center for Mind and Brain; Department of Psychology)

Abstract: The aim of this project is to analyze and display data that are being collected on a social website called MEAMCentral, a place where people can associate memories with music. Depending on the interests of the project team, possible analyses include user interactions with the website, natural language processing of memory content, and music similarity based on associated memories or meta-information about the music. Music and memory data are stored in a graph database and can be retrieved using SPARQL or PROLOG queries. Analyses may be written in R and/or Python, and results should be visualized in a web browser using the D3 JavaScript library.

Prerequisites:  Skills include facility with GitHub and R. Some prior experience with database querying, e.g. using SQL or SPARQL, Python, and D3 or JavaScript would be helpful.

Click here to see the ARCHIVE OF PREVIOUS RTG PROJECTS

Python Course

In addition, two of the graduate students in the RTG, Nick Ulle and Clark Fitzgerald, taught a successful course in "Python for Data Mining", which had almost 100 participants.  This course was taught under the supervision of Professor Duncan Temple Lang and was co-sponsored by the UC Davis Data Science Initiative.

Course material: https://github.com/nick-ulle/2015-python

Link to lectures: http://dsi.ucdavis.edu/PythonMiniCourse/

Graduate Training

The activities will address:

  • tools for statistical research involving objects and geometry: a two-quarter training activity combining discussion-style instruction, group projects and presentations;
  • enhancement of other professional skills, including scientific writing, presentation style, teaching and mentoring skills, and advanced literature search.

For more information, please contact Professor Wolfgang Polonik (wpolonik@ucdavis.edu). More details to be added.

Postdoctoral Training

Much of the postdocs’ activities will consist of conducting research, either independently or in collaboration with other researchers in the group, on topics related to the theme of the training grant; building up a scientific network; and developing other professional skills that are necessary for a successful career in the Mathematical Sciences. The group members will assist, guide and mentor the postdocs in this process through appropriate interactions that will be adapted to the individual needs of the postdocs and to the different phases of their development. Particular emphasis will be given to the choice of a cutting edge research topic. The ultimate goal is to train postdocs to have strong professional skills and expertise in modern Statistics involving objects, geometry and computing, and to have these strengths demonstrated by a strong record.

For more information, please contact Professor Wolfgang Polonik (wpolonik@ucdavis.edu), or download the RTG Flier (grad program).