Archive for the ‘Statistics’ Category

Paper published: A morphometric analysis of Actaea racemosa L. (Ranunculaceae)

January 4th, 2012

A morphometric analysis of Actaea racemosa L. (Ranunculaceae)
Z. Gardner, L. Lueck, E.B. Erhardt, L.E. Craker
Journal of Medicinally Active Plants
http://scholarworks.umass.edu/cgi/preview.cgi?article=1008&context=jmap

Abstract
Actaea racemosa L. (syn. Cimicifuga racemosa [L.] Nutt.), Ranunculaceae, commonly known as black cohosh, is an herbaceous, perennial, medicinal plant native to the deciduous woodlands of eastern North America. Historical texts and current sales data indicate the continued popularity of this plant as an herbal remedy for over 175 years. Much of the present supply of A. racemosa is harvested from the wild. Diversity within and between populations of the species has not been well characterized. The purpose of this study was to assess the morphological variation of A. racemosa and identify patterns of variation at the population and species levels. A total of twenty-six populations representative of a significant portion of the natural range of the species were surveyed and plant material was collected for the morphological analysis of 37 leaflet, flower, and whole plant characteristics. In total, 511 leaflet samples and 83 flower samples were examined. Several of the populations surveyed had sets of relatively unique characteristics (large leaflet measurements, tall leaves and flowers, and a large number of stamens) and Tukey-Kramer multiple comparisons revealed significant differences between specific populations for 20 different characteristics. However, no unique phenotype was found. Considerable morphological plasticity was noted in the apices of the staminodia. Cluster analyses showed that the morphological variation within populations is not smaller than that between populations and that this variation is not influenced by their geographic distribution.

Research, Statistics

Funded: UNM RAC grant Erhardt/Hanson, Modeling (photo)respiration

December 27th, 2011

We got one! Research Allocation Committee (RAC) Grants are for supporting new research or creative works. The RAC is particularly supportive of projects that may lead to outside funding and/or larger related projects.

PIs: Erik Erhardt and David Hanson
Title: “Frequentist (bootstrap) and Bayesian modeling of (photo)respiration in plants”
Amount: $3982.63, RAC 12-04
Use: To hire a statistics graduate student, Mohammad Hattab, to implement and further develop the modeling that I did last summer in Switzerland.

Purpose:
We are requesting $3982.63 to develop statistical models to estimate (photo)respiration in plants, accounting for sources of uncertainty and prior information. Because current models provide estimates without meaningful assessments of uncertainty, our model will have broad application in understanding photosynthetic pathways and carbon usage in plants, clarifying the precision of our knowledge, conditional on what is already believed. This modeling is an important step towards developing more comprehensive models of photosynthetic parameters. Support from the Research Allocation Committee will allow us to: (1) develop frequentist (bootstrap) and Bayesian models to analyze existing experimental data, providing inferences on the set of parameters related in the model; (2) design experiments and acquire additional data to distinguish and estimate respiration and photorespiration under a set of scientifically relevant conditions; (3) conduct validations using pre-existing data and estimates; (4) publish our model with results; and (5) develop grant proposals to apply this model more broadly.

Research, stable isotopes, Statistics

Another look at New Mexico suicide statistics: conditional probability and data visualization

November 4th, 2011

This article was printed in the Daily Lobo on 11/10/2011.

Presenting information in a way that clearly answers interesting questions is challenging. Every plot has an implicit question (hypothesis) that it helps you answer. Therefore, it is important to align a visual display of information with the intended interesting question(s). Collaboration or consultation with a statistician can clarify interesting questions and lead to answers through appropriate data analysis (visit UNM’s free statistics consulting clinic, www.stat.unm.edu/~clinic).

Suicide was the topic of the front cover story in the Daily Lobo on Thurs, Nov 3rd. With the story, two pie charts displayed average annual proportions of “successful” and “unsuccessful” suicides by method in NM. The “successful” pie chart answers this statement of conditional probability (their implied question): “given a successful suicide, what percentage used certain methods?” A question I consider more interesting reverses the conditioning (my question): “given an attempted suicide with a certain method, what percentage were successful?” Furthermore, I want to know the overall frequency and percentage of each method attempted. How can we present the information in a way that simultaneously answers these questions?

The Suicide Prevention Resource Center (SPRC.org) maintains national and state suicide fact sheets, last updated September 2008, describing “deaths by suicide, estimated hospitalized attempts, and data on medical costs, work loss costs, gender, race/ethnicity, age, and method of suicide.” The pie charts in Thursday’s Daily Lobo were reproductions of those found on the NM fact sheet. From their NM summaries, below is the SPRC table for estimated mean frequencies by method for “successful” and “unsuccessful” suicides.

Method              Successful  Unsuccessful  Total
Cut/Pierce                   4           229    233
Firearms                   191            16    207
Poisoning                   60          1097   1157
Suffocation                 73            23     96
Other/Unspecified           13            91    104
Total                      341          1456   1797

Their question and pie charts (below) consider percentages down columns. When the data are reduced to column percentages for “successful” and “unsuccessful” attempts separately, you lose the relative frequency of attempts. The percentage of firearms “successes” (56%), for example, depends on all the other “successful” attempts. Because the proportions for “successful” and “unsuccessful” attempts are computed separately, you can’t learn how successful firearm attempts are.
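The column percentages behind the pie charts can be recomputed from the SPRC table above. What follows is a minimal Python sketch, not the code used for the article (which is in R); the names `counts` and `column_percentages` are illustrative assumptions.

```python
# Estimated mean annual counts from the SPRC table: (successful, unsuccessful).
counts = {
    "Cut/Pierce": (4, 229),
    "Firearms": (191, 16),
    "Poisoning": (60, 1097),
    "Suffocation": (73, 23),
    "Other/Unspecified": (13, 91),
}


def column_percentages(table, column):
    """Percentage of each method within one outcome column.

    column 0 is "successful", column 1 is "unsuccessful".
    """
    column_total = sum(row[column] for row in table.values())
    return {method: 100.0 * row[column] / column_total
            for method, row in table.items()}


# These are the quantities the two pie charts display:
# P(method | successful) and P(method | unsuccessful).
successful_pct = column_percentages(counts, 0)
unsuccessful_pct = column_percentages(counts, 1)

for method in counts:
    print(f"{method:18s} {successful_pct[method]:5.1f}% "
          f"{unsuccessful_pct[method]:5.1f}%")
```

Each column sums to 100% on its own, so nothing in this output says how often a given method ends in death.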

Original pie chart

Original pie charts of the proportions of each method conditional on attempt "success", which do not ask or answer the interesting, relevant question.

It is critical to consider the temporal process: a person first chooses a method, then makes an attempt, and is either “successful” or not. The data display and questions should follow these temporal steps. The pie chart displays ignore this process.

My question and plot (below) consider the temporal process of attempting suicide, using percentages across rows and including row total information. First, the relative use of various methods is clear; almost two-thirds of attempts are by poisoning, and firearm and cut/pierce are each just above one in ten. However, though attempts by firearms (12%) and cut/pierce (13%) are relatively rare, the “success” rates are extremely different (92% versus 2%)! The plot has been sorted by the numbers of “successes” to emphasize the relative risk of the methods in terms of lives, information which is lost in the pie charts. Also, the area of each box is proportional to the frequency in that box. The Agora Crisis Center (505-277-3013, 9am-midnight, every day) plays a critical role in our community, and our education as individuals around these issues can save someone. Using statistics and visualization to tell and understand the important story in the data can lead to improvements in strategies and resource allocation for treatment and prevention.

Improved visualization

The improved visualization shows the relative use of methods along the horizontal axis and the proportion of successes along the vertical axis. Area is proportional to the number of people.
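The geometry of the improved plot can be recomputed from the same SPRC table. This is a hedged Python sketch rather than the author's R code; `mosaic_cells` and its field names are assumptions made for illustration. Each method's box width is its share of all attempts, the height of the "successful" part is the row percentage, and width times height gives the share of all people in that box.

```python
# Estimated mean annual counts from the SPRC table: (successful, unsuccessful).
counts = {
    "Cut/Pierce": (4, 229),
    "Firearms": (191, 16),
    "Poisoning": (60, 1097),
    "Suffocation": (73, 23),
    "Other/Unspecified": (13, 91),
}


def mosaic_cells(table):
    """Return one record per method, sorted by number of successes."""
    grand_total = sum(s + u for s, u in table.values())
    cells = []
    for method, (successes, failures) in table.items():
        attempts = successes + failures
        cells.append({
            "method": method,
            "successes": successes,
            # Horizontal extent: P(method) among all attempts.
            "share_of_attempts": attempts / grand_total,
            # Vertical extent: P(successful | method), the row percentage.
            "success_rate": successes / attempts,
        })
    # Sort so the methods costing the most lives come first.
    cells.sort(key=lambda cell: cell["successes"], reverse=True)
    return cells


cells = mosaic_cells(counts)
for cell in cells:
    print(f"{cell['method']:18s} "
          f"attempts {100 * cell['share_of_attempts']:5.1f}%  "
          f"successful {100 * cell['success_rate']:5.1f}%")
```

Reading across a row answers the reversed question directly: firearms and cut/pierce have similar widths but very different heights.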

R code follows to produce plot above (with modest post-production necessary).
Read more…

Research, Statistics

Statistics job resources

June 19th, 2011

Applying for an academic job is serious work.  I ended up lucky (though, luck favors the prepared (Louis Pasteur)).  I received two job offers this season and took my first-choice job.  But I worked hard to get those offers.  I kept a detailed CV my entire student career (starting as a BA student, not waiting until job season to start), wrote an extensive teaching dossier for the 20 courses I’ve taught and ugrad tutoring experience, and developed a research statement as that vision became clearer to me.  Clearly, self-investment and personal excellence are the most important ingredients.  Next is to find people who want to hire you.

Two sites and one magazine basically cover the bases for statistics.

1.  If you’re a statistics student, you’re already a member of the ASA, right?  If so, the back of the AmStat News magazine has many jobs listed.

http://magazine.amstat.org/

2. Many jobs are posted at the American Statistical Association (ASA) jobs website.   Subscribe to their feed in your RSS reader:

http://jobs.amstat.org/search/results/index.cfm?SN=25&ss=1&display=rss

While I have had my CV posted on the site for years, I’ve never received any contact because of it.  I think the more direct approach of networking or replying to specific jobs is more effective.

3. The University of Florida statistics website lists many jobs, too.  My impression is that this site is even more comprehensive than jobs.amstat sometimes.

http://www.stat.ufl.edu/vlib/Index.html

I recommend subscribing to the jobs.amstat.org feed in your RSS reader, because then most of the jobs will come to you. You can follow up at the UFlorida website to make sure you’re not missing anything. Start looking in Sept/Oct and work on cover letters through Nov/Dec for the Dec/Jan/Feb deadlines. Ask for letters of recommendation early (maybe even late summer while your professors are not busy with the semester). Ask your advisor to look over your CV, cover letter, and other submission materials (scan a pdf of your unofficial transcript). They’ve reviewed many applications when hiring in their department and will have good advice. Send your application materials (all in pdf format, not doc!) as soon as you are ready, to help yours land near the top of their review pile. And while your application is in the hands of many hiring committees, try not to sweat; you’ve done all you can and it’s largely out of your control until they ask for an interview (or send you a form rejection letter, or never respond to you at all). Feel free to send a follow-up email to request status if it’s a week or so after their self-predicted decision deadline, if it will help calm your nerves, but try not to hassle them. It’s a very challenging market and positions regularly get 80-300 applications, so everything you can do to rise to the top of that deep stack can make the difference between getting a toe in the door and the alternative.

Interviewing is the next step. Here are some pages with questions to prepare for. Write your answers down just as you’d say them and practice saying them aloud, maybe to a friend who will listen. You want to clarify your answers to yourself and get them to flow smoothly out of your mouth.
10 tough interview questions
General advice

The job talk is the last step.
You’re a grown up, use Mac’s iWork Keynote — it’s the best presentation software available.
BBP was a great resource, provided you can ignore all the MSPP BS. First five slides. Template. Video.
Matt Might’s presentation tips and job hunt advice.
CS Berkeley

Negotiating for your salary, start-up, teaching reduction, and more — ask your advisor for advice.  If you have a second offer, all of this becomes much, much easier!

Research, Statistics

tdllicor: estimates discrimination and other parameters associated with leaf photosynthesis

June 8th, 2011

Together with David Hanson, I developed the R package tdllicor, which reads TDL and Licor files, aligns them, and calculates quantities of interest with bootstrap intervals. It is currently private as it is specialized and not of general interest. It has already been important for a number of conference publications and is used for active research:

Conference Publications

DT Pater, EB Erhardt, and DT Hanson. Photorespiratory and respiratory carbon isotope fractionation in leaves. In Proceedings of the Biophysical Society 55th Annual Meeting, Baltimore, MD, Mar 2010. Biophysical Society.

DT Pater, EB Erhardt, and DT Hanson. Isotopic signature of photorespiration. In Joint Annual Meetings of the American Society of Plant Biologists and the Canadian Society of Plant Physiologists, Montreal, CA, August 2010.

Research, Statistics

mortest: estimates the total number of carcasses at a windfarm

June 8th, 2011

Working with Aaftab Jain, we developed an estimator for the total number of bird and bat carcasses at a windfarm, called “mortest”, and implemented it as an R package. We are interested in estimating c, the total number of carcasses (mortalities) in a period (year). The total number of carcasses is the sum of carcasses over size classes, $c = \sum_{s=1}^{S} c_s$. If carcasses are retained (that is, not scavenged) and searcher efficiency is perfect (every carcass is found) and every tower is searched, then each c_s would be counted perfectly. Yet carcass scavenging by predators and searchers overlooking carcasses are a reality, making observed counts an underestimate. Furthermore, tower sampling rather than censusing is a cost-saving convenience. Our estimator of total mortality, c, weights the estimates from different search intervals and adjusts the observed counts for scavenging, search efficiency, searchable area of each tower, and proportion of towers searched, accounting for uncertainty in these estimates using a bootstrap.
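To make the idea of adjusting observed counts concrete, here is a simplified Python sketch. It is not the mortest estimator: it assumes a plain inverse-probability correction, treats the retention and detection rates as known constants, ignores the weighting across search intervals, and bootstraps only over searched towers. All input values and names (`size_classes`, `AREA_FRACTION`, `TOWERS_TOTAL`) are hypothetical.

```python
import random

# Hypothetical inputs for illustration only; none of these values come from the post.
# tower_counts: carcasses found at each of 5 searched towers, per size class.
size_classes = {
    "small": {"tower_counts": [0, 1, 0, 2, 1], "p_retained": 0.5, "p_found": 0.4},
    "large": {"tower_counts": [1, 0, 0, 1, 0], "p_retained": 0.8, "p_found": 0.8},
}
AREA_FRACTION = 0.8  # assumed searchable fraction of the area under each tower
TOWERS_TOTAL = 20    # assumed number of towers at the windfarm


def class_total(tower_counts, p_retained, p_found):
    """Estimate c_s: scale observed counts up for each source of undercounting."""
    observed = sum(tower_counts)
    at_searched_towers = observed / (p_retained * p_found * AREA_FRACTION)
    # Scale from the searched towers to every tower at the windfarm.
    return at_searched_towers * TOWERS_TOTAL / len(tower_counts)


def total_mortality(classes):
    """Estimate c as the sum of c_s over size classes."""
    return sum(class_total(**c) for c in classes.values())


def bootstrap_interval(classes, n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval, resampling searched towers with replacement."""
    rng = random.Random(seed)
    n_towers = len(next(iter(classes.values()))["tower_counts"])
    estimates = []
    for _ in range(n_boot):
        # Resample the same towers for every size class.
        idx = [rng.randrange(n_towers) for _ in range(n_towers)]
        resampled = {
            name: {**c, "tower_counts": [c["tower_counts"][i] for i in idx]}
            for name, c in classes.items()
        }
        estimates.append(total_mortality(resampled))
    estimates.sort()
    lower = estimates[int(n_boot * (1 - level) / 2)]
    upper = estimates[int(n_boot * (1 + level) / 2) - 1]
    return lower, upper


print(total_mortality(size_classes))
print(bootstrap_interval(size_classes))
```

A fuller treatment along the lines described above would also resample the scavenging and searcher-efficiency trial data, so that uncertainty in those rates carries through to the interval.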

The software was written by Erik Erhardt and is currently private. Contact Aaftab Jain <aaftabj+gmail.com> for more information about using the software.

Research, Statistics

Talk: ACASA Annual meeting 2011

April 17th, 2011

I’ll be giving a shortened version of my Bayesian stable isotope mixing model talk (title and abstract below) at the Albuquerque Chapter of the American Statistical Association (ACASA) annual meeting on Friday, April 29, 2011. I gave two distinct longer versions of this talk recently as part of job interview talks at St. Louis University and the University of New Mexico. I’m looking forward to the meeting as a chance to visit with people I’ve worked with over the last several years, organizing judging at science fairs and other events.

Read more…

Research, Statistics

Paper published: δ13C of soluble sugars in Tillandsia epiphytes

June 17th, 2010

In a previous post I discussed this paper and how fun it was to write with Laurel.  Here I’m happy to report it’s available electronically (SpringerLink, pdf) and soon in paper.

Laurel K. Goode, Erik B. Erhardt, Louis S. Santiago, Michael F. Allen.  Carbon stable isotopic composition of soluble sugars in Tillandsia epiphytes varies in response to shifts in habitat. Oecologia (2010) 163:583–590.
DOI 10.1007/s00442-010-1577-5
Received: 11 March 2009 / Accepted: 25 January 2010 / Published online: 13 February 2010

Research, stable isotopes, Statistics

Visions

March 15th, 2010

A few important areas of focus, reflecting what I’m doing and where I’m going. Read more…

MIND, Research, stable isotopes, Statistics

Paper accepted: δ13C of soluble sugars in Tillandsia epiphytes vary in response to shifts in habitat

January 26th, 2010

Laurel Goode, Erik Erhardt, Louis Santiago, and Michael Allen.
δ13C of soluble sugars in Tillandsia epiphytes vary in response to shifts in habitat.
Oecologia, Physiological ecology section, 2010.

I met Laurel at SIRFER 2008 where we enjoyed a wide range of stable isotope lectures and lab experience. She first used my software, SISUS, to estimate the proportion of C3 vs CAM photosynthesis of epiphytes. Our work and friendship led to the collaboration where we thought about and developed a model for the environmental factors affecting the photosynthetic pathways of the species studied. Read more…

stable isotopes, Statistics