PDS Ch 3: Personal Codebook (slightly modified)
(1 p) Is there a topic of interest?
(1 p) Are the variables relevant to the question (or related questions)?
(1 p) Is a unique identifier variable included?
(4 p) Are there at least two numeric and at least two categorical variables (at least 4 variables in total; more are OK)?
(3 p) For each variable, is there a variable description, a data type, and, for categorical variables, a description of each coded value?
Compile this Rmd file to HTML, print the HTML as a PDF file, and upload the PDF to UNM Learn.
Below I give an example of a personal codebook to help get you started on this assignment. This is the beginning of your investigation toward answering a few questions that you’ll pursue throughout the semester.
The purpose of this assignment is to
select a dataset (PDS Ch 2),
identify a specific topic of interest (PDS Ch 3), and
prepare a codebook of your own (like the example below) from the larger codebook, including the questions/items/variables that measure your selected topic (PDS Ch 3).
Your codebook may continue to develop during your literature review.
Dataset: (You need this part.) National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), with codebook NESARC_W1_CodeBook.pdf.
Initial thinking: (My helpful narrative description to help you get going.) While nicotine dependence is a good starting point, I need to determine what it is about nicotine dependence that interests me. It strikes me that the friends and acquaintances I have known through the years who became hooked on cigarettes did so over very different periods of time. Some seemed to be dependent soon after their first few experiences with smoking, and others only after many years of generally irregular smoking behavior. I decide that I am most interested in exploring the association between level of smoking and nicotine dependence, so I add to my codebook variables reflecting smoking level (e.g., smoking quantity and frequency).
Topic of interest: (You need this part.) I have decided to investigate the relationship between nicotine dependence and the frequency and quantity of smoking among people up to 25 years old. The association may differ by ethnicity, age, gender, and other factors.
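To make the age restriction in the topic concrete, here is a minimal sketch in Python (standard library only). It assumes that "up to 25 years old" means AGE <= 25 and that each respondent is available as a dictionary keyed by the codebook variable names. The rows are invented for illustration and are not real NESARC records. The helper name `young_adults` is hypothetical.

```python
# Hypothetical rows for illustration only; these are NOT real NESARC records.
respondents = [
    {"IDNUM": 1, "AGE": 19, "TAB12MDX": 1},
    {"IDNUM": 2, "AGE": 25, "TAB12MDX": 0},
    {"IDNUM": 3, "AGE": 26, "TAB12MDX": 0},
    {"IDNUM": 4, "AGE": 98, "TAB12MDX": 1},  # 98 is the top code: "98 years or older"
]

MAX_AGE = 25  # assumption: "up to 25 years old" is read as AGE <= 25


def young_adults(rows, max_age=MAX_AGE):
    """Keep only respondents whose AGE is at most max_age."""
    return [row for row in rows if row["AGE"] <= max_age]


subset = young_adults(respondents)
print([row["IDNUM"] for row in subset])
```

Writing the cutoff down as a named constant records the decision about who is in the study population, which is worth stating explicitly in your own codebook.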
How I did it: (My helpful narrative description to help you get going.) I looked through the codebook and found some variables of interest, searching the text with “Ctrl-F” (find) to locate them. For each variable, I copied and pasted the description here, then formatted it so that it is organized. You can choose to use a table or an outline format; I found this verbatim text format very easy to work with. I retained the “frequency” of each response because it’s interesting to know and because it was already in the codebook, but this value is not required for your codebook.
Dataset: NESARC
Primary association: nicotine dependence vs frequency and quantity of smoking
Key:
VarName
Variable description
Data type (Continuous, Discrete, Nominal, Ordinal)
Frequency ItemValue Description
IDNUM
UNIQUE ID NUMBER WITH NO ALPHABETICS
Nominal
43093 1-43093. Unique Identification number
SEX
SEX
Nominal
18518 1. Male
24575 2. Female
AGE
AGE
Continuous
43079 18-97. Age in years
14 98. 98 years or older
CHECK321
CIGARETTE SMOKING STATUS
Nominal
9913 1. Smoked cigarettes in the past 12 months
8078 2. Smoked cigarettes prior to the last 12 months
22 9. Unknown
25080 BL. NA, never or unknown if ever smoked 100+ cigarettes
TAB12MDX
NICOTINE DEPENDENCE IN THE LAST 12 MONTHS
Nominal
38131 0. No nicotine dependence
4962 1. Nicotine dependence
S3AQ3B1
USUAL FREQUENCY WHEN SMOKED CIGARETTES
Ordinal
14836 1. Every day
460 2. 5 to 6 Day(s) a week
687 3. 3 to 4 Day(s) a week
747 4. 1 to 2 Day(s) a week
409 5. 2 to 3 Day(s) a month
772 6. Once a month or less
102 9. Unknown
25080 BL. NA, never or unknown if ever smoked 100+ cigarettes
ETHRACE2A
IMPUTED RACE/ETHNICITY
Nominal
24507 1. White, Not Hispanic or Latino
8245 2. Black, Not Hispanic or Latino
701 3. American Indian/Alaska Native, Not Hispanic or Latino
1332 4. Asian/Native Hawaiian/Pacific Islander, Not Hispanic or Latino
8308 5. Hispanic or Latino
S3AQ3C1
USUAL QUANTITY WHEN SMOKED CIGARETTES
Discrete
17751 1-98. Cigarette(s)
262 99. Unknown
25080 BL. NA, never or unknown if ever smoked 100+ cigarettes
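The coded values above matter once analysis begins: codes such as 9 and 99 mean "Unknown" and BL means blank, so none of them should be treated as real measurements. The following is a minimal Python sketch (standard library only) of one way to turn those codes into missing values. It assumes raw values arrive as strings and that BL entries appear as empty or whitespace-only strings; check how your copy of the data stores blanks. The function name `clean_value` and the example values are hypothetical, not real NESARC records.

```python
# Codes that the codebook above labels "Unknown", per variable.
UNKNOWN_CODES = {
    "CHECK321": {"9"},
    "S3AQ3B1": {"9"},
    "S3AQ3C1": {"99"},
}


def clean_value(var_name, raw):
    """Return an int for a valid code, or None for a blank or unknown value.

    Assumption: blanks ("BL" in the codebook) are stored as empty or
    whitespace-only strings in the raw data.
    """
    text = raw.strip()
    if text == "" or text in UNKNOWN_CODES.get(var_name, set()):
        return None
    return int(text)


# Hypothetical raw values for illustration only; not real NESARC records.
examples = [("S3AQ3C1", "20"), ("S3AQ3C1", "99"), ("S3AQ3B1", " "), ("S3AQ3B1", "1")]
cleaned = [clean_value(var, raw) for var, raw in examples]
print(cleaned)
```

Keeping the unknown codes in a per-variable table matters because the same number can be valid for one variable and missing for another: 9 cigarettes is a legitimate value of S3AQ3C1, while 9 means "Unknown" for S3AQ3B1.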