Rubric

PDS Ch 3: Personal Codebook (slightly modified)

  1. (1 p) Is there a topic of interest?

  2. (1 p) Are the variables relevant to the question (or related questions)?

  3. (1 p) Is a unique identifier variable included?

  4. (4 p) Are there at least two numeric and at least two categorical variables (at least 4 total variables)?

  5. (3 p) For each variable is there a variable description, a data type, and coded value descriptions?

  6. Compile this Rmd file to HTML and upload it to CrowdGrader (do not include your name; keep it anonymous).
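
Rubric items 3 to 5 can be checked mechanically before you submit. The following is a minimal Python sketch, not part of the assignment: the `Variable` class and `check_codebook` function are hypothetical names, and it assumes that "numeric" means the Continuous and Discrete data types and "categorical" means Nominal and Ordinal, with the identifier variable left out of both counts. The entries are taken from the example codebook later in this document.

```python
from dataclasses import dataclass, field

# Assumption: "numeric" and "categorical" map onto the Key's four data types.
NUMERIC_TYPES = {"Continuous", "Discrete"}
CATEGORICAL_TYPES = {"Nominal", "Ordinal"}


@dataclass
class Variable:  # hypothetical container for one codebook entry
    name: str
    description: str
    data_type: str
    coded_values: list = field(default_factory=list)


def check_codebook(variables, id_name):
    """Return rubric checks 3-5 as a dict of check name -> True/False."""
    # The identifier is not counted toward the numeric/categorical totals.
    others = [v for v in variables if v.name != id_name]
    n_numeric = sum(v.data_type in NUMERIC_TYPES for v in others)
    n_categorical = sum(v.data_type in CATEGORICAL_TYPES for v in others)
    return {
        "has_unique_id": any(v.name == id_name for v in variables),
        "two_numeric_two_categorical": n_numeric >= 2 and n_categorical >= 2,
        "all_fields_present": all(
            bool(v.description and v.data_type and v.coded_values)
            for v in variables
        ),
    }


# A subset of the example codebook entries from this document.
example = [
    Variable("IDNUM", "UNIQUE ID NUMBER WITH NO ALPHABETICS", "Nominal",
             ["1-43093. Unique Identification number"]),
    Variable("SEX", "SEX", "Nominal", ["1. Male", "2. Female"]),
    Variable("AGE", "AGE", "Continuous",
             ["18-97. Age in years", "98. 98 years or older"]),
    Variable("TAB12MDX", "NICOTINE DEPENDENCE IN THE LAST 12 MONTHS", "Nominal",
             ["0. No nicotine dependence", "1. Nicotine dependence"]),
    Variable("S3AQ3C1", "USUAL QUANTITY WHEN SMOKED CIGARETTES", "Discrete",
             ["1-98. Cigarette(s)", "99. Unknown",
              "BL. NA, never or unknown if ever smoked 100+ cigarettes"]),
]

result = check_codebook(example, id_name="IDNUM")
for check, passed in result.items():
    print(f"{check}: {'yes' if passed else 'no'}")
```

With these five entries, all three checks pass. Removing AGE or S3AQ3C1 would make the second check fail, because only one numeric variable would remain.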


Below I give an example of a personal codebook to help get you started on this assignment. This is the beginning of your investigation toward answering a few questions that you’ll pursue throughout the semester.

The purpose of this assignment is to

  1. select a dataset (PDS Ch 2),

  2. identify a specific topic of interest (PDS Ch 3), and

  3. prepare a codebook of your own (like the example below) from the larger codebook, including the questions/items/variables that measure your selected topics (PDS Ch 3).

Your codebook may continue to develop during your literature review next week.

Question of interest

Dataset: National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), with codebook wv1codebook.pdf.

Initial thinking: While nicotine dependence is a good starting point, I need to determine what it is about nicotine dependence that I am interested in. It strikes me that the friends and acquaintances I have known over the years who became hooked on cigarettes did so across very different periods of time. Some seemed to be dependent soon after their first few experiences with smoking, and others only after many years of generally irregular smoking behavior. I decide that I am most interested in exploring the association between level of smoking and nicotine dependence. I add to my codebook variables reflecting smoking levels (e.g., smoking quantity and frequency).

Topic of interest: I have decided to investigate the relationship between nicotine dependence and the frequency and quantity of smoking among people up to 25 years old. The association may differ by ethnicity, age, gender, and other factors.

How I did it: I looked through the codebook wv1codebook.pdf and used “Ctrl-F” (find) to search the text for variables of interest. For each variable, I copied and pasted its description here, then formatted it so that it is organized. You can choose to use a table or an outline format; I found this plain-text format very easy to work with. I retained the “frequency” of each response because it’s interesting to know and because it was already in the codebook; this value is not required for your codebook.

Codebook

Dataset: NESARC
Primary association: nicotine dependence vs frequency and quantity of smoking

Key:
VarName
  Variable description
  Data type (Continuous, Discrete, Nominal, Ordinal)
  Frequency ItemValue Description

IDNUM
  UNIQUE ID NUMBER WITH NO ALPHABETICS
  Nominal
  43093 1-43093. Unique Identification number

SEX
  SEX
  Nominal
  18518 1. Male
  24575 2. Female

AGE
  AGE
  Continuous
  43079 18-97. Age in years
     14 98. 98 years or older

CHECK321
  CIGARETTE SMOKING STATUS
  Nominal
   9913 1. Smoked cigarettes in the past 12 months
   8078 2. Smoked cigarettes prior to the last 12 months
     22 9. Unknown
  25080 BL. NA, never or unknown if ever smoked 100+ cigarettes

TAB12MDX
  NICOTINE DEPENDENCE IN THE LAST 12 MONTHS
  Nominal
  38131 0. No nicotine dependence
   4962 1. Nicotine dependence

S3AQ3B1
  USUAL FREQUENCY WHEN SMOKED CIGARETTES
  Ordinal
  14836 1. Every day
    460 2. 5 to 6 Day(s) a week
    687 3. 3 to 4 Day(s) a week
    747 4. 1 to 2 Day(s) a week
    409 5. 2 to 3 Day(s) a month
    772 6. Once a month or less
    102 9. Unknown
  25080 BL. NA, never or unknown if ever smoked 100+ cigarettes

ETHRACE2A
  IMPUTED RACE/ETHNICITY
  Nominal
  24507 1. White, Not Hispanic or Latino
   8245 2. Black, Not Hispanic or Latino
    701 3. American Indian/Alaska Native, Not Hispanic or Latino
   1332 4. Asian/Native Hawaiian/Pacific Islander, Not Hispanic or Latino
   8308 5. Hispanic or Latino

S3AQ3C1
  USUAL QUANTITY WHEN SMOKED CIGARETTES
  Discrete
  17751 1-98. Cigarette(s)
    262 99. Unknown
  25080 BL. NA, never or unknown if ever smoked 100+ cigarettes
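
If you keep the response frequencies, a quick consistency check can catch copy/paste mistakes. The Python sketch below is illustrative only; it assumes that the response categories of each variable, blanks (BL) included, account for every respondent, so each variable's frequencies should sum to the 43093 IDs listed under IDNUM. The counts are copied from the entries above.

```python
# Total respondents, from the IDNUM entry (IDs 1-43093).
TOTAL = 43093

# Response frequencies copied from the codebook entries above.
frequencies = {
    "SEX": [18518, 24575],
    "AGE": [43079, 14],
    "CHECK321": [9913, 8078, 22, 25080],
    "TAB12MDX": [38131, 4962],
    "S3AQ3B1": [14836, 460, 687, 747, 409, 772, 102, 25080],
    "ETHRACE2A": [24507, 8245, 701, 1332, 8308],
    "S3AQ3C1": [17751, 262, 25080],
}

# Collect any variable whose counts do not add up to the total.
mismatches = {
    name: sum(counts)
    for name, counts in frequencies.items()
    if sum(counts) != TOTAL
}

if mismatches:
    for name, total in mismatches.items():
        print(f"{name}: frequencies sum to {total}, expected {TOTAL}")
else:
    print("All variables account for every respondent.")
```

For the entries in this codebook, every variable sums to 43093, so the script prints the final message.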