Archive for the ‘Statistics’ Category

Awarded: 2011-12 UNM Math & Stat Outstanding Undergraduate Instructor

May 11th, 2012

I’m grateful to my students who voted for me as UNM Math & Stat Outstanding Undergraduate Instructor for 2011-12.  I was tied for UNM Math & Stat Outstanding Graduate Instructor, as well.  I work hard to give my students the best experience, to give them time before and after class to ask questions, to respond promptly to their email, and to reach them where they are and pull them up or show them how to keep climbing.

I keep adding to my Teaching Dossier, reflecting on my experience and accomplishments.

Statistics

Mega Millions $540M jackpot, or How to NOT LOSE at the lottery

March 30th, 2012

I quickly prepared the fun slides below for a short interview with KOAT Channel 7 on the Mega Millions $540M jackpot (estimated at $640M at 3:30pm on the day of the drawing), which greatly surpasses the previous record of $390 million.

A ticket’s probability of winning the jackpot is roughly the ratio of the length of one of your fingers to the diameter of the earth, so it is effectively 0 (0.00000000569) no matter what you do. What is interesting is that the jackpot is now large enough that the expected value of a ticket is a few times its cost, which makes this a sensible time to buy from an expected-value point of view. In fact, now’s a good time to purchase EVERY ticket combination: hurry, and hope you don’t have to split it with another winner!
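The figures above can be checked with a few lines of arithmetic. This is a minimal Python sketch, not the calculation from the slides: it assumes the 2012 Mega Millions format (5 of 56 white balls plus 1 of 46 Mega Balls, $1 per ticket), which the post does not state, and it looks only at the advertised jackpot, ignoring taxes, the lump-sum discount, smaller prizes, and split jackpots.

```python
from math import comb

# Assumed 2012 Mega Millions format (not stated in the post):
# pick 5 of 56 white balls plus 1 of 46 Mega Balls, $1 per ticket.
WHITE_BALLS, WHITE_PICKS, MEGA_BALLS = 56, 5, 46
TICKET_PRICE = 1.00

combinations = comb(WHITE_BALLS, WHITE_PICKS) * MEGA_BALLS
p_jackpot = 1 / combinations

# Finger-to-earth comparison: the earth's mean diameter is about 12,742 km.
EARTH_DIAMETER_M = 12_742_000
implied_length_cm = p_jackpot * EARTH_DIAMETER_M * 100

# Jackpot-only expected value of one ticket at the $640M estimate.
JACKPOT = 640e6
ev_per_ticket = JACKPOT * p_jackpot

print(f"ticket combinations: {combinations:,}")
print(f"P(jackpot): {p_jackpot:.3g}")
print(f"length implied by the earth comparison: {implied_length_cm:.1f} cm")
print(f"jackpot-only expected value per ticket: ${ev_per_ticket:.2f}")
print(f"cost of buying every combination: ${combinations * TICKET_PRICE:,.0f}")
```

Under these assumptions the jackpot-only expected value exceeds the ticket price, which is the sense in which buying every combination looks attractive before accounting for a split.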
Read more…

Statistics

Funded: UNM Math & Stat Travel award to WNAR 2012

March 7th, 2012

The UNM Department of Mathematics and Statistics funded my travel request to attend 2012 WNAR – Graybill, June 17-20, 2012, at Colorado State University in Fort Collins, Colorado. I intend to give a talk on my trophic-level modeling in the “Biostatistics and systems biology” session, meet and discuss research with those working in similar areas, and look for opportunities for cross-collaboration for students at these other western universities.

Research, Statistics

Paper published: The 5.1 ka Aridization Event, Expansion of Piñon-Juniper Woodlands

February 13th, 2012

The 5.1 ka Aridization Event, Expansion of Piñon-Juniper Woodlands, and the Introduction of Maize (Zea mays) in the American Southwest
Brandon L. Drake, W. H. Wills, Erik B. Erhardt
The Holocene
Publication details will update here, accepted 2/13/2012

Lee Drake (UNM Anthropology) exemplifies excellence and I will take every opportunity to work with him again.

Abstract
Pollen analysis is frequently used to build climate and environmental histories. A distinct Holocene pollen series exists for Chaco Canyon, New Mexico. This study reports linear modeling and hypothesis testing of long distance dispersal pollen from radiocarbon-dated packrat middens which reveal strong relationships between piñon pine (Pinus edulis) and ponderosa pine (Pinus ponderosa). Ponderosa pollen dominates midden pollen assemblages during the early Holocene, while a rapid shift to a much higher proportion of piñon to ponderosa pine pollen between 5,440 and 5,100 BP points to an aridization episode. This shift is associated with higher δ18O values in Southwest speleothem records relative to the preceding millennium. The period of aridization is followed by a sharp increase in El Niño/Southern Oscillation events that would have caused highly variable precipitation and lasted until 4,200 BP. Bayesian changepoint analysis suggests that this aridization episode led to stable ecotonal boundaries for at least 3,000 years. The piñon/ponderosa transition may have been caused by punctuated multi-year droughts, analogous to those in the 20th century. The earliest documented instance of Zea mays cultivation on the Colorado Plateau is ca. 4,290 BP. The introduction of this labor-intensive cultigen from Mesoamerica may have been facilitated by changes in the regional ecosystems, specifically by an increase in piñon trees, that promoted increasing human territoriality. Linear modeling and hypothesis testing can complement traditional palynological techniques by adding greater resolution in vegetation patterning to climate/environmental histories.

Research, stable isotopes, Statistics

Paper published: A morphometric analysis of Actaea racemosa L. (Ranunculaceae)

January 4th, 2012

A morphometric analysis of Actaea racemosa L. (Ranunculaceae)
Z. Gardner, L. Lueck, E.B. Erhardt, L.E. Craker
Journal of Medicinally Active Plants
http://scholarworks.umass.edu/cgi/preview.cgi?article=1008&context=jmap

Abstract
Actaea racemosa L. (syn. Cimicifuga racemosa [L.] Nutt.), Ranunculaceae, commonly known as black cohosh, is an herbaceous, perennial, medicinal plant native to the deciduous woodlands of eastern North America. Historical texts and current sales data indicate the continued popularity of this plant as an herbal remedy for over 175 years. Much of the present supply of A. racemosa is harvested from the wild. Diversity within and between populations of the species has not been well characterized. The purpose of this study was to assess the morphological variation of A. racemosa and identify patterns of variation at the population and species levels. A total of twenty-six populations representative of a significant portion of the natural range of the species were surveyed and plant material was collected for the morphological analysis of 37 leaflet, flower, and whole plant characteristics. In total, 511 leaflet samples and 83 flower samples were examined. Several of the populations surveyed had sets of relatively unique characteristics (large leaflet measurements, tall leaves and flowers, and a large number of stamens) and Tukey-Kramer multiple comparisons revealed significant differences between specific populations for 20 different characteristics. However, no unique phenotype was found. Considerable morphological plasticity was noted in the apices of the staminodia. Cluster analyses showed that the morphological variation within populations is not smaller than between populations and that this variation is not influenced by their geographic distribution.

Research, Statistics

Funded: UNM RAC grant Erhardt/Hanson, Modeling (photo)respiration

December 27th, 2011

We got one! Research Allocation Committee (RAC) Grants are for supporting new research or creative works. The RAC is particularly supportive of projects that may lead to outside funding and/or larger related projects.

PIs: Erik Erhardt and David Hanson
Title: “Frequentist (bootstrap) and Bayesian modeling of (photo)respiration in plants”
Amount: $3982.63, RAC 12-04
Use: To hire a statistics graduate student, Mohammad Hattab, to implement and develop the modeling that I did last summer in Switzerland.

Purpose:
We are requesting $3982.63 to develop statistical models to estimate (photo)respiration in plants, accounting for sources of uncertainty and prior information. Because current models provide estimates without meaningful assessments of uncertainty, our model will have broad application in understanding photosynthetic pathways and carbon usage in plants, clarifying the precision of our knowledge, conditional on what is already believed. This modeling is an important step towards developing more comprehensive models of photosynthetic parameters. Support from the Research Allocation Committee will allow us to: (1) develop frequentist (bootstrap) and Bayesian models to analyze existing experimental data, providing inferences on the set of parameters related in the model; (2) design experiments and acquire additional data to distinguish and estimate respiration and photorespiration under a set of scientifically relevant conditions; (3) conduct validations using pre-existing data and estimates; (4) publish our model with results; and (5) develop grant proposals to apply this model more broadly.

Research, stable isotopes, Statistics

Another look at New Mexico suicide statistics: conditional probability and data visualization

November 4th, 2011

This article was printed in the Daily Lobo on 11/10/2011.

Presenting information in a way that clearly answers interesting questions is challenging. Every plot has an implicit question (hypothesis) that it helps you answer. Therefore, it is important to align a visual display of information with the intended interesting question(s). Collaboration or consultation with a statistician can clarify interesting questions and lead to answers through appropriate data analysis (visit UNM’s free statistics consulting clinic, www.stat.unm.edu/~clinic).

Suicide was the topic of the front cover story in the Daily Lobo on Thurs, Nov 3rd. With the story, two pie charts displayed average annual proportions of “successful” and “unsuccessful” suicides by method in NM. The “successful” pie chart answers this statement of conditional probability (their implied question): “given a successful suicide, what percentage used certain methods?” A question I consider more interesting reverses the conditioning (my question): “given an attempted suicide with a certain method, what percentage were successful?” Furthermore, I want to know the overall frequency and percentage of each method attempted. How can we present the information in a way that simultaneously answers these questions?

The Suicide Prevention Resource Center (SPRC.org) maintains national and state suicide fact sheets, last updated September 2008, describing “deaths by suicide, estimated hospitalized attempts, and data on medical costs, work loss costs, gender, race/ethnicity, age, and method of suicide.” The pie charts in Thursday’s Daily Lobo were reproductions of those found on the NM fact sheet. From their NM summaries, below is the SPRC table for estimated mean frequencies by method for “successful” and “unsuccessful” suicides.

Method              Successful  Unsuccessful  Total
Cut/Pierce                   4           229    233
Firearms                   191            16    207
Poisoning                   60          1097   1157
Suffocation                 73            23     96
Other/Unspecified           13            91    104
Total                      341          1456   1797

Their question and pie charts (below) consider percentages down columns. When the data are reduced to column percentages for “successful” and “unsuccessful” attempts separately, you lose the relative frequency of attempts. The percentage of firearms “successes” (56%), for example, depends on all the other “successful” attempts. Because proportions for “successful” and “unsuccessful” attempts are computed separately, you can’t learn how successful firearm attempts are.

Original pie chart

Original pie charts of proportions of method conditional on attempt "success", which don't ask or answer the interesting, relevant question.

It is critical to consider the temporal process: a person first chooses a method, then makes an attempt, and is either “successful” or not. The data display and questions should follow these temporal steps. The pie chart displays ignore this process.

My question and plot (below) consider the temporal process of attempting suicide by taking percentages across rows and including the row totals. First, the relative use of various methods is clear; almost two-thirds of attempts are by poisoning, and firearm and cut/pierce are each just above one in ten. However, though attempts by firearms (12%) and cut/pierce (13%) are relatively rare, their “success” rates are extremely different (92% versus 2%)! The plot has been sorted by the numbers of “successes” to emphasize the relative risk of the methods in terms of lives, information which is lost in the pie charts. Also, the area of each box is proportional to the frequency in that box. The Agora Crisis Center (505-277-3013, 9am-midnight, every day) plays a critical role in our community, and our education as individuals around these issues can save someone. Using statistics and visualization to tell and understand the important story in the data can lead to improvements in strategies and resource allocation for treatment and prevention.
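The percentages quoted in this article can be reproduced directly from the SPRC table. The following is a minimal Python sketch (the variable and key names are illustrative, and this is not the R code used for the plot) that computes both directions of conditioning side by side: the share of attempts by method, the “success” rate given a method, and the share of “successes” given that an attempt was “successful”.

```python
# Mean annual counts from the SPRC table: (successful, unsuccessful).
counts = {
    "Cut/Pierce": (4, 229),
    "Firearms": (191, 16),
    "Poisoning": (60, 1097),
    "Suffocation": (73, 23),
    "Other/Unspecified": (13, 91),
}

total_attempts = sum(s + u for s, u in counts.values())
total_successes = sum(s for s, _ in counts.values())


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Sort methods by number of "successes", as in the improved plot.
rows = sorted(counts.items(), key=lambda kv: kv[1][0], reverse=True)

summary = {}
for method, (succ, unsucc) in rows:
    attempts = succ + unsucc
    summary[method] = {
        "share_of_attempts": pct(attempts, total_attempts),    # P(method)
        "success_rate": pct(succ, attempts),                    # P(success | method), row %
        "share_of_successes": pct(succ, total_successes),       # P(method | success), column %
    }

for method, row in summary.items():
    print(f"{method:18s} attempts {row['share_of_attempts']:3d}%  "
          f"success rate {row['success_rate']:3d}%  "
          f"share of successes {row['share_of_successes']:3d}%")
```

The row percentage (success rate) answers the reversed question, while the column percentage (share of successes) is what the original pie chart displays.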

Improved visualization

Improved visualization has relative use of methods across the horizontal and proportion of successes along the vertical. Area is proportional to people.

R code to produce the plot above follows (with modest post-production necessary).
Read more…

Research, Statistics

Statistics job resources

June 19th, 2011

Applying for an academic job is serious work. I ended up lucky (though, as Louis Pasteur put it, luck favors the prepared). I received two job offers this season and took my first-choice job. But I worked hard to get those offers. I kept a detailed CV my entire student career (starting as a BA student, not waiting until job season to start), wrote an extensive teaching dossier for the 20 courses I’ve taught and my undergraduate tutoring experience, and developed a research statement as that vision became clearer to me. Clearly, self-investment and personal excellence are the most important ingredients. Next is to find people who want to hire you.

Two sites and one magazine basically cover the bases for statistics.

1.  If you’re a statistics student, you’re already a member of the ASA, right?  If so, the back of the AmStat News magazine has many jobs listed.

http://magazine.amstat.org/

2. Many jobs are posted at the American Statistical Association (ASA) jobs website.   Subscribe to their feed in your RSS reader:

http://jobs.amstat.org/search/results/index.cfm?SN=25&ss=1&display=rss

While I have had my CV posted on the site for years, I’ve never received any contact because of it.  I think the more direct approach of networking or replying to specific jobs is more effective.

3. The University of Florida statistics website lists many jobs, too.  My impression is that this site is even more comprehensive than jobs.amstat sometimes.

http://www.stat.ufl.edu/vlib/Index.html

I recommend subscribing to the jobs.amstat.org feed in your RSS reader, because then most of the jobs will come to you. You can follow up at the UFlorida website to make sure you’re not missing anything. Start looking in Sept/Oct and work on cover letters through Nov/Dec for the Dec/Jan/Feb deadlines. Ask for letters of recommendation early (maybe even late summer while your professors are not busy with the semester). Ask your advisor to look over your CV, cover letter, and other submission materials (scan a pdf of your unofficial transcript). They’ve reviewed many applications while hiring in their department and will have good advice. Send your application materials (all in pdf format, not doc!) as soon as you are ready, to help yours land near the top of their review pile. And while your application is in the hands of many hiring committees, try not to sweat: you’ve done all you can and it’s largely out of your control until they ask for an interview (or send you a form rejection letter, or never respond to you at all). Feel free to send a follow-up email to request status if it’s a week or so after their self-predicted decision deadline, if it will help calm your nerves, but try not to hassle them. It’s a very challenging market and positions regularly get 80-300 applications, so everything you can do to rise to the top of that deep stack can make the difference between getting a toe in the door and the alternative.

Interviewing is the next step. Here are some pages with questions to prepare for. Write your answers down just as you’d say them and practice saying them aloud, maybe to a friend who will listen. You want to clarify your answers to yourself and get them to flow smoothly out of your mouth.
10 tough interview questions
General advice

The job talk is the last step.
You’re a grown up, use Mac’s iWork Keynote — it’s the best presentation software available.
BBP was a great resource, provided you can ignore all the MSPP BS. First five slides. Template. Video.
Matt Might’s presentation tips and job hunt advice.
CS Berkeley

Negotiating for your salary, start-up, teaching reduction, and more — ask your advisor for advice.  If you have a second offer, all of this becomes much, much easier!

Research, Statistics

tdllicor: estimates discrimination and other parameters associated with leaf photosynthesis

June 8th, 2011

Together with David Hanson, I developed the R package tdllicor, which reads TDL and Licor files, aligns them, and calculates quantities of interest with bootstrap intervals. It is currently private, as it is specialized and not of general interest. It has already been important for a number of conference publications and is used for active research:

Conference Publications

DT Pater, EB Erhardt, and DT Hanson. Photorespiratory and respiratory carbon
isotope fractionation in leaves. In Proceedings of the Biophysical Society 55th
Annual Meeting, Baltimore, MD, Mar 2010. Biophysical Society.

DT Pater, EB Erhardt, and DT Hanson. Isotopic signature of photorespiration.
In Joint Annual Meetings of the American Society of Plant Biologists and the
Canadian Society of Plant Physiologists, Montreal, CA, August 2010.

Research, Statistics

mortest: estimates the total number of carcasses at a windfarm

June 8th, 2011

Working with Aaftab Jain, we developed an estimator for the total number of bird and bat carcasses at a windfarm called “mortest” and implemented it as an R package. We are interested in estimating $c$, the total number of carcasses (mortalities) in a period (year). The total number of carcasses is the sum of carcasses over size classes, $c = \sum_{s=1}^{S} c_s$. If carcasses are retained (that is, not scavenged) and searcher efficiency is perfect (every carcass is found) and every tower is searched, then each $c_s$ would be counted perfectly. Yet carcass scavenging by predators and searchers overlooking carcasses are a reality, making observed counts an underestimate. Furthermore, tower sampling rather than censusing is a cost-saving convenience. Our estimator of total mortality, $c$, weights the estimates from different search intervals and adjusts the observed counts for scavenging, search efficiency, searchable area of each tower, and proportion of towers searched, accounting for uncertainty in these estimates using a bootstrap.
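To make the idea concrete, here is a minimal Python sketch of this kind of adjustment. It is not the mortest estimator, which is private and not described in detail here: the inverse-probability formula, the trial data, and every number below are hypothetical, and the weighting across search intervals is omitted. The sketch divides each size class's observed count by the estimated chance that a carcass is retained, found, in searchable area, and under a searched tower, then bootstraps the retention and searcher-efficiency trials to get a percentile interval.

```python
import random

# All numbers below are made up for illustration.
observed = {"small": 12, "medium": 7, "large": 3}   # carcasses found, by size class

# Hypothetical trial outcomes per size class: 1 = success, 0 = failure.
retention_trials = {    # did a placed carcass remain until the next search?
    "small": [1] * 12 + [0] * 8,
    "medium": [1] * 15 + [0] * 5,
    "large": [1] * 18 + [0] * 2,
}
efficiency_trials = {   # did a searcher find a placed carcass?
    "small": [1] * 9 + [0] * 11,
    "medium": [1] * 13 + [0] * 7,
    "large": [1] * 17 + [0] * 3,
}
SEARCHABLE_AREA = 0.8   # assumed fraction of each tower's area that is searchable
TOWERS_SEARCHED = 0.5   # assumed fraction of towers searched


def mean(values):
    return sum(values) / len(values)


def estimate_total(retention, efficiency):
    """Sum over size classes of count / P(carcass is counted)."""
    total = 0.0
    for size, count in observed.items():
        p_counted = (mean(retention[size]) * mean(efficiency[size])
                     * SEARCHABLE_AREA * TOWERS_SEARCHED)
        total += count / p_counted
    return total


def resample(trials, rng):
    """Resample each size class's trial outcomes with replacement."""
    return {size: rng.choices(outcomes, k=len(outcomes))
            for size, outcomes in trials.items()}


def bootstrap_interval(n_boot=2000, level=0.95, seed=1):
    """Percentile bootstrap interval for the adjusted total."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        ret = resample(retention_trials, rng)
        eff = resample(efficiency_trials, rng)
        # Skip degenerate resamples where an estimated rate is zero.
        if any(mean(ret[s]) == 0 or mean(eff[s]) == 0 for s in observed):
            continue
        draws.append(estimate_total(ret, eff))
    draws.sort()
    lower = draws[int((1 - level) / 2 * len(draws))]
    upper = draws[int((1 + level) / 2 * len(draws)) - 1]
    return lower, upper


point = estimate_total(retention_trials, efficiency_trials)
lower, upper = bootstrap_interval()
print(f"observed carcasses: {sum(observed.values())}")
print(f"adjusted total: {point:.1f} (95% bootstrap interval {lower:.1f} to {upper:.1f})")
```

Even in this toy version, the adjusted total is several times the raw count, which illustrates why observed counts alone underestimate mortality.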

The software was written by Erik Erhardt and is currently private. Contact Aaftab Jain <aaftabj+gmail.com> for more information about using the software.

Research, Statistics