Archive for the ‘Statistics’ Category

TV: KOB4, Police cadet test scores under investigation

February 5th, 2013

Tonight KOB-TV4 aired the NM Law Enforcement Academy story “Police cadet test scores under investigation” on Eyewitness News 4 at 10 P.M. I gave a short interview to Gadi Schwartz for the story, using a plot I created from the test score data.

I gave the information and interview out of a personal desire to be helpful and was not acting on the University’s behalf. I did not speculate on the cause for the outlying class’s scores. I value the men and women who risk their lives daily serving our communities.

Statistics

Plot improved: NM Registered voters 2008

February 5th, 2013

A plot of party affiliation by age group can be made more informative by using census data to represent voter power. Note that in the “before” plot, ages 60+ appear to take up almost half the plot width, while in the “after” plot we see that ages 60+ represent only 25% of the voting pool.
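A minimal sketch of the idea, assuming a plot whose x-axis is age: in a "before" layout each age group's width is proportional to the number of years it spans, while in an "after" layout it is proportional to the group's share of the voting pool. The age breaks, the top age of 100, and every share except the 25% for 60+ are made-up placeholders, not the census data behind the plot, and the sketch is in Python rather than the R used for the post.

```python
# Hypothetical age groups: (label, first age, last age, share of voting pool).
# Only the 25% share for 60+ comes from the post; the rest are placeholders.
groups = [
    ("18-29", 18, 29, 0.22),
    ("30-44", 30, 44, 0.26),
    ("45-59", 45, 59, 0.27),
    ("60+", 60, 100, 0.25),
]

# Totals used to turn spans and shares into fractions of the plot width
total_years = sum(last - first + 1 for _, first, last, _ in groups)
total_share = sum(share for *_, share in groups)

# "Before": width proportional to the number of years in the group
widths_before = {
    label: (last - first + 1) / total_years for label, first, last, _ in groups
}

# "After": width proportional to the group's share of the voting pool
widths_after = {label: share / total_share for label, _, _, share in groups}

for label in widths_before:
    print(f"{label:>5}: before {widths_before[label]:.0%}, after {widths_after[label]:.0%}")
```

With these placeholder numbers the 60+ group takes up nearly half the width when scaled by years but only a quarter when scaled by share, which is the distortion the improved plot removes.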

Before

After

R code to create the “after” plot is in the full post.

Statistics

Invited talks: Neuroimaging and Statistics at Wright State University, Dayton, OH

November 4th, 2012

I just returned from a fun, event-filled couple of days at Wright State University in Dayton, Ohio, visiting statistician Harry Khamis. Harry invited me to give two talks on Friday, November 2nd, 2012: one in Statistics and a second in Neuroscience, arranged by Thomas N. Hangartner. Harry was the model host; I always felt taken care of and my needs were met.

I was excited to meet two people at my talks who could use the methods I presented. Prof. Nasser H. Kashou develops models of the hemodynamic response function (HRF), for which SimTB might be helpful. Prof. Yvonne Vadeboncoeur uses stable isotopes to study freshwater ecosystems, and we had an exciting discussion about collaborative opportunities.

The links to the papers the talks draw on are at the bottom.

My morning neuroimaging talk (10:15) in the Department of Biomedical, Industrial & Human Factors Engineering (BIE) included two-and-one-half topics: SimTB, subject variability with GICA, and a little data visualization.

Title
Capturing inter-subject variability with group independent component analysis of fMRI data: a simulation study

Abstract
A key challenge in functional neuroimaging is the meaningful combination of results across subjects. Even in a sample of healthy participants, brain morphology and functional organization exhibit considerable variability, such that no two individuals have the same neural activation at the same location in response to the same stimulus. This inter-subject variability limits inferences at the group-level as average activation patterns may fail to represent the patterns seen in individuals. A promising approach to multi-subject analysis is group independent component analysis (GICA), which identifies group components and reconstructs activations at the individual level. GICA has gained considerable popularity, particularly in studies where temporal response models cannot be specified. However, a comprehensive understanding of the performance of GICA under realistic conditions of inter-subject variability is lacking. In this study we use simulated functional magnetic resonance imaging (fMRI) data to determine the capabilities and limitations of GICA under conditions of spatial, temporal, and amplitude variability. Simulations, generated with the SimTB toolbox, address questions that commonly arise in GICA studies, such as: (1) How well can individual subject activations be estimated and when will spatial variability preclude estimation? (2) Why does component splitting occur and how is it affected by model order? (3) How should we analyze component features to maximize sensitivity to intersubject differences? Overall, our results indicate an excellent capability of GICA to capture between-subject differences and we make a number of recommendations regarding analytic choices for application to functional imaging data. mialab.mrn.org/software/simtb

My afternoon statistics talk (3:00) in the Department of Mathematics and Statistics to a packed room (they had to bring in additional chairs!) included work that extends my published stable isotope sourcing work.

Title
An extended Bayesian stable isotope mixing model for trophic level inference

Abstract
You are what and where you eat on the food web. We developed an extended Bayesian mixing model to jointly infer organic matter utilization and isotopic enrichment of organic matter sources in order to infer the trophic levels of several numerically abundant fish species (consumers) present in Apalachicola Bay, FL, USA. Bayesian methods apply for arbitrary numbers of isotopes and diet sources but existing models are somewhat limited as they assume that trophic fractionation is estimated without error or that isotope ratios are uncorrelated. The model uses stable isotope ratios of carbon, nitrogen, and sulfur, isotopic fractionations, elemental concentrations, elemental assimilation efficiencies, as well as prior information (expert opinion) to inform the diet and trophic level parameters. The model appropriately accounts for uncertainty and prior information at all levels of the analysis.

Neuroscience talk
Summary of both SimTB papers.

SimTB, a simulation toolbox for fMRI data under a model of spatiotemporal separability
Erik B. Erhardt, Elena A. Allen, Yonghua Wei, Tom Eichele, Vince D. Calhoun
NeuroImage 59 (2012), pp. 4160-4167
http://www.sciencedirect.com/science/article/pii/S105381191101370X

Capturing inter-subject variability with group independent component analysis of fMRI data: A simulation study
Elena A. Allen, Erik B. Erhardt, Yonghua Wei, Tom Eichele, Vince D. Calhoun
NeuroImage
http://www.sciencedirect.com/science/article/pii/S1053811911011712

Data visualization in the neurosciences: overcoming the curse of dimensionality
Elena A. Allen, Erik B. Erhardt, Vince D. Calhoun
Neuron
www.cell.com/neuron/retrieve/pii/S089662731200428X

Statistics talk
A Bayesian framework for stable isotope mixing models
Erik B. Erhardt and Edward J. Bedrick
Environmental and Ecological Statistics
http://www.springerlink.com/content/vg4v62j8717671p3/

Bio
Erik Barry Erhardt, PhD, is an Assistant Professor of Statistics at the University of New Mexico Department of Mathematics and Statistics, where he serves as Director of the statistics consulting clinic. His research interests include Bayesian and frequentist statistical methods for stable isotope sourcing and brain imaging. Erik is a Howard Hughes Medical Institute Interfaces Scholar collaborating in interdisciplinary research and consulting. StatAcumen.com

MIND, Research, stable isotopes, Statistics

Paper published: Bayesian Simultaneous Intervals for Small Areas: An Application to Variation in Maps

October 30th, 2012

Bayesian Simultaneous Intervals for Small Areas: An Application to Variation in Maps
Erik Barry Erhardt, Balgobin Nandram, Jai Won Choi
International Journal of Statistics and Probability
Vol 1, No 2, pp. 229–243
Received: September 19, 2012
Accepted: October 24, 2012
Online: October 29, 2012
http://www.ccsenet.org/journal/index.php/ijsp/article/view/20714
doi:10.5539/ijsp.v1n2p229

Abstract
Bayesian inference about small areas is of considerable current interest, and simultaneous intervals for the parameters for the areas are needed because these parameters are correlated. This is not usually pursued because with many areas the problem becomes difficult. We describe a method for finding simultaneous credible intervals for a relatively large number of parameters, each corresponding to a single area. Our method is model based, it uses a hierarchical Bayesian model, and it starts with either the 100(1-alpha)% (e.g., alpha=0.05 for 95%) credible interval or highest posterior density (HPD) interval for each area. As in the construction of the HPD interval, our method is the result of the solution of two simultaneous equations, an equation that accounts for the probability content, 100(1-alpha)% of all the intervals combined, and an equation that contains an optimality condition like the “equal ordinates” condition in the HPD interval. We compare our method with one based on a nonparametric method, which as expected under a parametric model, does not perform as well as ours, but is a good competitor. We illustrate our method and compare it with the nonparametric method using an example on disease mapping which utilizes a standard Poisson regression model.

Research, Statistics

Paper published: A Bayesian framework for stable isotope mixing models

October 25th, 2012

A Bayesian framework for stable isotope mixing models
Erik B. Erhardt and Edward J. Bedrick
Environmental and Ecological Statistics
Submitted 19 February 2011
Accepted 28 September 2012
Online 23 October 2012
http://www.springerlink.com/content/vg4v62j8717671p3/
DOI 10.1007/s10651-012-0224-1

Abstract:

Stable isotope sourcing is used to estimate proportional contributions of sources to a mixture, such as in the analysis of animal diets and plant nutrient use. Statistical methods for inference on the diet proportions using stable isotopes have focused on the linear mixing model. Existing frequentist methods provide inferences when the diet proportion vector can be uniquely solved for in terms of the isotope ratios. Bayesian methods apply for arbitrary numbers of isotopes and diet sources but existing models are somewhat limited as they assume that trophic fractionation or discrimination is estimated without error or that isotope ratios are uncorrelated. We present a Bayesian model for the estimation of mean diet that accounts for uncertainty in source means and discrimination and allows correlated isotope ratios. This model is easily extended to allow the diet proportion vector to depend on covariates, such as time. Two data sets are used to illustrate the methodology. Code is available for selected analyses.
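To illustrate the case mentioned in the abstract where the diet proportion vector can be uniquely solved for, here is a minimal sketch of the linear mixing model with two isotopes and three sources: the mixture's isotope values are proportion-weighted averages of the source values, and the proportions sum to one. The isotope values and function names are hypothetical, fractionation is ignored, and this is the simple deterministic solve, not the Bayesian model from the paper.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_proportions(sources, mixture):
    """Solve for three diet proportions from two isotope values.

    sources: three (isotope 1, isotope 2) pairs, one per source
    mixture: one (isotope 1, isotope 2) pair for the consumer
    """
    # Two mixing equations plus the constraint that proportions sum to 1
    A = [
        [s[0] for s in sources],
        [s[1] for s in sources],
        [1.0, 1.0, 1.0],
    ]
    b = [mixture[0], mixture[1], 1.0]
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("sources are collinear in isotope space; no unique solution")
    # Cramer's rule: replace one column at a time with the right-hand side
    props = []
    for col in range(3):
        Ak = [row[:] for row in A]
        for r in range(3):
            Ak[r][col] = b[r]
        props.append(det3(Ak) / d)
    return props


# Made-up source values (for example carbon and nitrogen isotope values)
sources = [(-28.0, 4.0), (-20.0, 8.0), (-12.0, 2.0)]
# Mixture built from proportions 0.5, 0.3, 0.2 of the sources above
mixture = (-22.4, 4.8)

proportions = solve_proportions(sources, mixture)
print([round(p, 3) for p in proportions])
```

With more sources than isotopes plus one, the system has no unique solution, which is the situation where the Bayesian approach described in the abstract applies.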

Research, stable isotopes, Statistics

Paper published: Data visualization in the neurosciences: overcoming the curse of dimensionality

May 24th, 2012

Data visualization in the neurosciences: overcoming the curse of dimensionality
Elena A. Allen, Erik B. Erhardt, Vince D. Calhoun
Neuron
Accepted 7 May 2012
Online 24 May 2012
doi:10.1016/j.neuron.2012.05.001
www.cell.com/neuron/retrieve/pii/S089662731200428X

Abstract:
In publications, presentations, and popular media, scientific results are predominantly communicated through graphs. But are these figures clear and honest, or misleading? We examine current practices in data visualization and discuss improvements, advocating design choices which reveal data rather than hide it.

MIND, Research, Statistics

Awarded: 2011-12 UNM Math & Stat Outstanding Undergraduate Instructor

May 11th, 2012

I’m grateful to my students who voted for me as UNM Math & Stat Outstanding Undergraduate Instructor for 2011-12 (certificate).  I was tied for UNM Math & Stat Outstanding Graduate Instructor, as well.  I work hard to give my students the best experience, to give them time before and after class to ask questions, to respond promptly to their email, and to reach them where they are and pull them up or show them how to keep climbing.

I keep adding to my Teaching Dossier, reflecting on my experience and accomplishments.

Statistics

Mega Millions $540M jackpot, or How to NOT LOSE at the lottery

March 30th, 2012

I quickly prepared the fun slides below for a short interview with KOAT Channel 7 on the Mega Millions $540M jackpot (estimated at $640M at 3:30pm on the day of the drawing), which greatly surpasses the previous record of $390 million.

A ticket’s probability of winning the jackpot is roughly the ratio of the length of one of your fingers to the diameter of the earth, so unchangeably near 0 (0.00000000569). What makes this drawing interesting is that the jackpot is large enough for the expected value to be a few times larger than the cost of a ticket, which makes it a sensible time to buy from an expected-value point of view. In fact, now’s a good time to purchase EVERY ticket combination. Hurry, and hope you don’t have to split it with another winner!
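The arithmetic behind these claims can be checked with a short sketch. It uses the win probability quoted above and the $640M estimate; the $1 ticket price and the Earth's diameter of about 12,742 km are assumptions not stated in the post, and the calculation ignores taxes, the lump-sum discount, smaller prizes, and split jackpots.

```python
# Figures quoted in the post
p_jackpot = 0.00000000569      # probability that one ticket wins the jackpot
jackpot = 640_000_000          # estimated jackpot in dollars on the day of the drawing

# Assumptions (not from the post)
ticket_price = 1.00            # assumed ticket price in dollars
earth_diameter_m = 12_742_000  # approximate mean diameter of the Earth in meters

# Expected jackpot winnings per ticket, ignoring taxes, splits, and smaller prizes
expected_value = p_jackpot * jackpot
ratio_to_price = expected_value / ticket_price

# Length that has the same ratio to the Earth's diameter as p_jackpot has to 1
finger_length_cm = p_jackpot * earth_diameter_m * 100

print(f"Expected value per ticket: ${expected_value:.2f}")
print(f"Expected value relative to ticket price: {ratio_to_price:.1f}x")
print(f"Equivalent length on the Earth's diameter: {finger_length_cm:.1f} cm")
```

Under these assumptions the expected value comes out at a few dollars per ticket and the equivalent length at a few centimeters, consistent with the finger comparison.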

Statistics

Funded: UNM Math&Stat Travel award to WNAR 2012

March 7th, 2012

The UNM Department of Mathematics and Statistics funded my travel request to attend the 2012 WNAR – Graybill meeting, June 17-20, 2012, at Colorado State University in Fort Collins, Colorado. I intend to give a talk on my trophic-level modeling in the “Biostatistics and systems biology” session, meet and discuss research with those working in similar areas, and look for opportunities for cross-collaboration for students at these other western universities.

Research, Statistics

Paper published: The 5.1 ka Aridization Event, Expansion of Piñon-Juniper Woodlands

February 13th, 2012

The 5.1 ka Aridization Event, Expansion of Piñon-Juniper Woodlands, and the Introduction of Maize (Zea mays) in the American Southwest
Brandon L. Drake, W. H. Wills, Erik B. Erhardt
The Holocene
Published online before print July 9, 2012, doi: 10.1177/0959683612449758
Accepted 2/13/2012

Lee Drake (UNM Anthropology) exemplifies excellence and I will take every opportunity to work with him again.

Abstract
Pollen analysis is frequently used to build climate and environmental histories. A distinct Holocene pollen series exists for Chaco Canyon, New Mexico. This study reports linear modeling and hypothesis testing of long-distance dispersal pollen from radiocarbon-dated packrat middens which reveal strong relationships between piñon pine (Pinus edulis) and ponderosa pine (Pinus ponderosa). Ponderosa pollen dominates midden pollen assemblages during the early Holocene, while a rapid shift to a much higher proportion of piñon to ponderosa pine pollen between 5,440 and 5,100 BP points to an aridization episode. This shift is associated with higher δ18O values in Southwest speleothem records relative to the preceding millennium. The period of aridization is followed by a sharp increase in El Niño/Southern Oscillation events that would have caused highly variable precipitation and lasted until 4,200 BP. Bayesian changepoint analysis suggests that this aridization episode led to stable ecotonal boundaries for at least 3,000 years. The piñon/ponderosa transition may have been caused by punctuated multi-year droughts, analogous to those in the 20th century. The earliest documented instance of Zea mays cultivation on the Colorado Plateau is ca. 4,290 BP. The introduction of this labor-intensive cultigen from Mesoamerica may have been facilitated by changes in the regional ecosystems, specifically by an increase in piñon trees, that promoted increasing human territoriality. Linear modeling and hypothesis testing can complement traditional palynological techniques by adding greater resolution in vegetation patterning to climate/environmental histories.

Research, stable isotopes, Statistics