Climate and environmental change, and in particular the impact of human activity, is widely discussed today. To understand variations in past climate over longer time periods, researchers use historical documents, tree rings, corals, ice cores from glaciers, and lake and sea sediments.
In this interdisciplinary project, we use a functional data analysis (FDA) approach to analyze varved lake sediments, aiming to reconstruct past environmental and climate changes. In particular, we analyze a varved sediment core taken from the bottom of Lake Kassjön, which covers more than 6000 years. Kassjön is situated outside Umeå in northern Sweden.
We consider Bayesian analysis of continuous curves in 1D, 2D and 3D space. A fundamental aspect of the analysis is that it is invariant under a simultaneous warping of all the curves, as well as under translation, rotation and scaling of each individual curve. We introduce Bayesian models based on the Square Root Velocity Function (SRVF) curve representation introduced by Srivastava et al. A Gaussian process model for the SRVF of curves is proposed, and suitable prior models, such as the Dirichlet process, are employed for modeling the warping function as a cumulative distribution function (CDF). Simulation from the posterior distribution is carried out via Markov chain Monte Carlo methods, and credibility regions for mean curves, warping functions and nuisance parameters are obtained. Special treatment is needed when the target curves are closed or when open curves have uncertain starting points. We illustrate the methodology with applications to 1D proteomics data, 2D mouse vertebra outlines and 3D vascular data.
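As a small illustration of the curve representation named in this abstract, the following sketch computes the SRVF q(t) = β'(t) / sqrt(|β'(t)|) of a sampled curve with finite differences. The function names `srvf` and `integrate`, the half-circle test curve and the grid size are assumptions made for this example only; the Bayesian model, the warping prior and the MCMC sampler are not shown.

```python
import numpy as np

def srvf(beta, t):
    """SRVF q(t) = beta'(t) / sqrt(|beta'(t)|) of a sampled curve.

    beta: array of shape (n_points, dim); t: array of shape (n_points,).
    """
    velocity = np.gradient(beta, t, axis=0)      # finite-difference derivative
    speed = np.linalg.norm(velocity, axis=1)
    # Guard against division by zero where the curve is stationary.
    return velocity / np.sqrt(np.maximum(speed, 1e-12))[:, None]

def integrate(y, t):
    """Trapezoidal integral of samples y over the grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

# Hypothetical test curve: a half circle of radius 1 in the plane.
t = np.linspace(0.0, 1.0, 201)
curve = np.column_stack([np.cos(np.pi * t), np.sin(np.pi * t)])

q = srvf(curve, t)
# The SRVF depends only on the derivative, so translating the curve leaves it unchanged.
q_translated = srvf(curve + np.array([3.0, -2.0]), t)
# The squared L2 norm of the SRVF equals the length of the curve.
length = integrate(np.sum(q ** 2, axis=1), t)
```

Because the squared norm of the SRVF is the curve length, rescaling a curve to unit length amounts to normalizing its SRVF, which is one way the scale invariance mentioned above can be handled.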
We adopt a Functional Data Analysis approach, and propose a penalized regression model for data spatially distributed over non-planar two-dimensional Riemannian manifolds. The model is a generalized additive model with a roughness penalty term involving a suitable differential operator computed over the non-planar domain. We show that the estimation problem can be solved by first conformally mapping the non-planar domain to a planar domain and then applying existing models for penalized spatial regression over planar domains, appropriately modified to account for the domain deformation implied by the flattening. The flattening map and the solution of the estimation problem are computed using finite element methods. The estimators are linear in the observed data values, and classical inferential tools are derived. The application driving this research is the study of hemodynamic forces on the wall of an internal carotid artery affected by an aneurysm.
In languages such as Mandarin, the tone of a spoken utterance determines the meaning of the word. This tone is conveyed primarily through the fundamental frequency (F0). This work analyzes F0 patterns not only in terms of the amplitude variation (Hz) but also in terms of the time displacement (ms). Viewing the sampled F0 curves as realizations of a stochastic process in both amplitude and phase, we formulate a framework in which we register our data on a common time scale and subsequently apply functional principal component analysis (FPCA) to both the amplitude and the phase domains of each syllable. The projections obtained by FPCA are then used within a multivariate mixed effects regression framework that allows us to detect relevant interactions and draw statistically significant conclusions about the phonetic phenomena in our corpus, while also making it possible to produce predictions and estimates of the complete curves on the original domain.
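The projection step described in this abstract can be sketched as a discretized FPCA computed with a singular value decomposition. Everything concrete below is an assumption for illustration: the function name `fpca`, the synthetic contours standing in for registered F0 curves, and the choice of two components. The registration step and the mixed effects regression are not shown.

```python
import numpy as np

def fpca(curves, n_components):
    """Discretized FPCA of curves sampled on a common, equally spaced grid.

    curves: array of shape (n_curves, n_points).
    Returns the mean curve, the leading components (as rows), the scores,
    and the fraction of variance explained by each component.
    """
    mean = curves.mean(axis=0)
    centered = curves - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T             # projections of each curve
    explained = s[:n_components] ** 2 / np.sum(s ** 2)
    return mean, components, scores, explained

# Synthetic stand-in for registered contours: a level plus two random modes and noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 50)
n_curves = 40
a = rng.normal(0.0, 20.0, size=n_curves)
b = rng.normal(0.0, 5.0, size=n_curves)
curves = (200.0 + np.outer(a, np.sin(np.pi * t)) + np.outer(b, t - 0.5)
          + rng.normal(0.0, 0.1, size=(n_curves, t.size)))

mean, components, scores, explained = fpca(curves, n_components=2)
# Curves on the original grid can be rebuilt from the scores.
reconstruction = mean + scores @ components
```

The `scores` array is what would enter a regression model as a multivariate response, and `reconstruction` shows how fitted scores map back to complete curves. Applying the same function to a suitable representation of the warping functions would give the phase-domain scores.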
We reconsider the registration procedure described in Kneip and Ramsay [2008], in which the warping functions are iteratively constructed by optimizing the fit between a principal component decomposition and the aligned curves by solving a penalized minimization problem.
The original procedure is particularly well suited to "local" alignments, which are sometimes not enough to register the whole curves. We overcome this problem by introducing "global" parameters into the warping function so that global shifts of the individual curves are allowed. Additionally, we do not assume a fixed low-dimensional representation of the target function during the algorithm. As a result, we are able to align curves that the original algorithm could not. This improvement is demonstrated in simulations.
Alois Kneip and James O Ramsay. Combining registration and fitting for functional models. Journal of the American Statistical Association, 103(483):1155-1165, 2008.
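To make the idea of a global shift parameter concrete, the sketch below estimates a single shift δ so that a curve evaluated at t + δ matches a template. This is plain shift registration by grid search, swapped in for illustration; it is not the penalized, principal-component-based procedure of the abstract. The function name `estimate_shift`, the Gaussian bumps and the candidate grid are assumptions.

```python
import numpy as np

def estimate_shift(curve, template, t, candidates):
    """Return the shift delta minimizing the mean squared difference
    between curve(t + delta) and template(t)."""
    best_delta, best_cost = None, np.inf
    for delta in candidates:
        # Evaluate the curve at t + delta; np.interp holds the end values
        # constant outside the observed range.
        shifted = np.interp(t + delta, t, curve)
        cost = np.mean((shifted - template) ** 2)
        if cost < best_cost:
            best_delta, best_cost = delta, cost
    return best_delta

t = np.linspace(0.0, 1.0, 401)
width = 0.05
true_shift = 0.12
template = np.exp(-((t - 0.5) ** 2) / (2 * width ** 2))            # bump at 0.50
curve = np.exp(-((t - 0.5 - true_shift) ** 2) / (2 * width ** 2))  # bump at 0.62

candidates = np.linspace(-0.3, 0.3, 121)   # grid with step 0.005
delta_hat = estimate_shift(curve, template, t, candidates)
```

In a combined scheme, a global shift of this kind would be estimated jointly with the local warping component, rather than in a separate step as done here.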
We present a novel methodology for a comprehensive statistical analysis of approximately periodic biosignal data. There are two main challenges in such an analysis: (1) the automatic extraction (segmentation) of cycles from long, cyclostationary biosignals and (2) the subsequent statistical analysis, which in many cases involves the separation of temporal and amplitude variability. The proposed framework provides a principled approach to the statistical analysis of such signals, which in turn allows for an efficient cycle segmentation algorithm. This is achieved using a convenient representation of functions called the square-root slope function (SRSF). The segmented cycles, represented by their SRSFs, are temporally aligned using the notion of the Karcher mean, which in turn allows for more efficient statistical summaries of the signals. We show the strengths of this method through various disease classification studies.
Work done in collaboration with Dr. Wei Wu, Dr. Gary E. Christensen, and Dr. Anuj Srivastava.
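The SRSF named in the abstract above is q = sign(f') sqrt(|f'|). A property that makes it convenient for temporal alignment is that the L2 distance between two SRSFs is unchanged when both functions are warped by the same warping function. The sketch below checks this numerically. The names `srsf` and `l2_distance`, the two test signals and the warping function are assumptions for this example; cycle segmentation and the Karcher mean are not shown.

```python
import numpy as np

def srsf(f, t):
    """Square-root slope function q = sign(f') * sqrt(|f'|) of a sampled function."""
    df = np.gradient(f, t)
    return np.sign(df) * np.sqrt(np.abs(df))

def l2_distance(a, b, t):
    """L2 distance between two sampled functions (trapezoidal rule)."""
    sq = (a - b) ** 2
    return float(np.sqrt(np.sum(0.5 * (sq[1:] + sq[:-1]) * np.diff(t))))

def f1(s):
    return np.sin(2.0 * np.pi * s)

def f2(s):
    return np.cos(2.0 * np.pi * s)

t = np.linspace(0.0, 1.0, 2001)
# A smooth warping of [0, 1]: gamma(0) = 0, gamma(1) = 1, gamma' > 0.
gamma = t + 0.3 * t * (1.0 - t)

# Distance between the SRSFs of the original functions ...
d_original = l2_distance(srsf(f1(t), t), srsf(f2(t), t), t)
# ... and between the SRSFs after warping both functions by the same gamma.
d_warped = l2_distance(srsf(f1(gamma), t), srsf(f2(gamma), t), t)
```

The two distances agree up to discretization error, which is what allows a mean and an alignment to be defined consistently in this representation.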
In clinical lameness examination of horses, a lameness score is assigned to the horse based on visual inspection of the locomotion pattern. This is quite subjective, and objective measurements of lameness would be helpful as a supplement to the visual inspection. The poster studies the relation between acceleration signals and lameness, with special emphasis on alignment. Each data signal consists of eight gait cycles, and these eight subsignals are aligned. Moreover, each subsignal consists of two parts, and we examine the relation between lameness and the phase displacement between those two parts.
We consider the problem of clustering multiple response curves of a similar type based on their delay patterns in response. The curves are similar not only because they are measured on the same subject, but also because they represent certain underlying characteristics in common. Most work has focused on extracting the common characteristics subject to random variability, often separately for each variable. However, when these curves are viewed as response variables, there is additional variability associated with delay patterns in response, which is often corrected in pre-processing. Here we consider the situation where the curves are similar up to some delays, and understanding different patterns of delayed response is the main aim of the analysis. As delays can be measured only relatively, we define subject-specific relative delays from the time warping functions for the response variables and explore clustering approaches to find sub-populations defined by patterns of delays. We adopt the k-means clustering algorithm with variations of L2 distance measures for multiple curves of relative delays. We illustrate our approach with growth curves of different body parts and find that the clusters are stable with respect to the choice of distance measure.
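The clustering step in this abstract can be sketched with Lloyd's k-means algorithm applied to sampled relative-delay curves, using the sum over response variables of squared differences on the grid as a discretized L2 distance. The function name `kmeans_delays`, the farthest-point initialization, and the synthetic delays are assumptions for illustration; the abstract does not specify them, and the estimation of the warping functions is not shown.

```python
import numpy as np

def kmeans_delays(delays, k, n_iter=50):
    """Lloyd's k-means on relative-delay curves.

    delays: array of shape (n_subjects, n_variables, n_time).
    Returns cluster labels and cluster-mean delay curves.
    """
    n = delays.shape[0]
    x = delays.reshape(n, -1)    # squared Euclidean distance = discretized L2
    # Farthest-point initialization (an assumption made here for determinism).
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([((x - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers.reshape((k,) + delays.shape[1:])

# Synthetic delays for 20 subjects, 2 response variables, 30 time points.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 30)
bump = np.sin(np.pi * t)
group_a = np.stack([0.10 * bump, 0.00 * bump])    # variable 1 delayed
group_b = np.stack([-0.10 * bump, 0.05 * bump])   # variable 1 early, variable 2 delayed
delays = np.concatenate([
    group_a + rng.normal(0.0, 0.01, size=(10, 2, t.size)),
    group_b + rng.normal(0.0, 0.01, size=(10, 2, t.size)),
])

labels, centers = kmeans_delays(delays, k=2)
```

Variations of the distance, for example weighting the response variables differently, can be obtained by rescaling the corresponding rows of `delays` before clustering.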
Constructing generative models for functional observations is an important task in statistical functional analysis. In general, functional data contain both phase (or x or horizontal) and amplitude (or y or vertical) variability. Traditional methods often ignore the phase variability and focus solely on the amplitude variation, using cross-sectional techniques such as fPCA for dimension reduction and data modeling. Ignoring phase variability leads to a loss of structure in the data and inefficiency in data models. We present a novel approach that relies on separating the phase (x-axis) and amplitude (y-axis) components, then modeling these components using joint distributions. This separation, in turn, is performed using a technique called elastic shape analysis of curves that involves a novel mathematical representation of functional data. Then, using individual fPCAs, one each for the phase and amplitude components, while respecting the nonlinear geometry of the phase representation space, we impose joint probability models on the principal coefficients of these components. We demonstrate these ideas using random sampling, for models estimated from simulated and real datasets, and show their superiority over models that ignore phase-amplitude separation. Furthermore, we apply these generative models to classification of functional data and achieve high performance in applications involving SONAR signals of underwater objects, handwritten signatures, and periodic body movements recorded by smart phones.
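The following sketch illustrates random sampling of the phase component while respecting a nonlinear geometry. It assumes one common representation from the elastic framework, which the abstract does not spell out: a warping function γ is represented by ψ = sqrt(γ'), which has unit L2 norm and therefore lies on a sphere. Random coefficients define a tangent vector at the identity (ψ = 1), the exponential map sends it to the sphere, and integrating ψ² returns a valid warping function. The function names, the cosine basis and the coefficient variance are all assumptions for illustration.

```python
import numpy as np

def cumulative_trapezoid(y, t):
    """Cumulative trapezoidal integral of y over t, starting at 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate([[0.0], np.cumsum(steps)])

def warp_from_tangent(v, t):
    """Map a tangent vector v at psi = 1 to a warping function gamma on [0, 1]."""
    norm_v = np.sqrt(cumulative_trapezoid(v ** 2, t)[-1])
    if norm_v < 1e-12:
        return (t - t[0]) / (t[-1] - t[0])        # identity warping
    # Exponential map on the unit sphere, based at the constant function 1.
    psi = np.cos(norm_v) + np.sin(norm_v) * v / norm_v
    gamma = cumulative_trapezoid(psi ** 2, t)     # gamma' = psi^2 >= 0
    return gamma / gamma[-1]                      # remove discretization error at t = 1

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 201)
# Basis functions that integrate to zero on [0, 1], so they are tangent at psi = 1.
basis = np.sqrt(2.0) * np.array([np.cos(np.pi * t), np.cos(2.0 * np.pi * t)])
# Hypothetical principal coefficients drawn from a Gaussian model.
coefficients = rng.normal(0.0, 0.3, size=(5, 2))
warps = np.array([warp_from_tangent(c @ basis, t) for c in coefficients])
```

Each row of `warps` starts at 0, ends at 1 and is non-decreasing, so sampling in the tangent space never produces an invalid warping function. In a fitted model, the basis would come from the phase fPCA and the coefficients from the joint probability model.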
We introduce a modeling and mathematical framework in which the problem of registering a functional data set can be consistently posed. In detail, we show that introducing, in a functional data analysis, a metric or semi-metric together with a group of warping functions with respect to which the metric or semi-metric is invariant enables a sound and unambiguous definition of phase and amplitude variability. Indeed, in this framework, we prove that the analysis of a registered functional data set can be re-interpreted as the analysis of a set of suitable equivalence classes associated with the original functions and induced by the group of warping functions.
Moreover, an amplitude-to-total variability index is proposed. This index turns out to be useful in practical situations for measuring to what extent phase variability affects the data and for comparing the effectiveness of different registration methods.
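The invariance requirement and the equivalence classes mentioned in this abstract can be written out as follows. The symbols W (the group of warping functions), d (the metric or semi-metric), h and d_W are notation chosen here for illustration and are not taken from the abstract; the last line is one natural way to induce a distance between classes under the stated invariance.

```latex
% W: group of warping functions; d: metric or semi-metric on the function space
d(f_1 \circ h,\, f_2 \circ h) = d(f_1, f_2)
  \quad \text{for all } h \in W
  \qquad \text{(invariance of } d \text{ under } W\text{)}

[f] = \{\, f \circ h : h \in W \,\}
  \qquad \text{(equivalence class of } f\text{)}

d_W([f_1], [f_2]) = \inf_{h \in W} d(f_1 \circ h,\, f_2)
  \qquad \text{(induced distance between classes)}
```

Read this way, variability within a class [f] corresponds to phase variability, and variability between classes corresponds to amplitude variability.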
Most solutions for registering pairs or groups of images are variational, using energy functions that fail to satisfy the two most basic and desirable properties in registration: (1) invariance under identical warping and (2) inverse consistency. We present a novel registration approach that uses the L2 norm between certain functions derived from the images, called SRFs, as the objective function for registration. This framework satisfies both properties. Additionally, our framework induces a metric on the space of equivalence classes of images, which enables us to perform joint registration of multiple images using the concept of a mean image.