Merge pull request #56 from kaijagahm/researcher-narrative

Created researcher narrative in work session
carpentries-incubator · Sep 5, 2024 · e441325
2 parents cbc09c1 + f2c1a9e
Showing 3 changed files with 206 additions and 0 deletions.
diff --git a/github_labels.csv b/github_labels.csv
@@ -0,0 +1,23 @@
+print_order,type,label,color,use_prefix,description,long_description
+1,status,help wanted,#DCECC7,FALSE,Looking for Contributors,"Issue reviewed by Maintainers, and ready to be addressed. Maintainers are looking for Contributors to address this issue, anyone is welcome to work on addressing the issue.  Issues with this label will be listed on <a href=""https://carpentries.org/help-wanted-issues/#for-maintainers"">the Help Wanted page of The Carpentries website</a>."
+2,status,in progress,#9BCC65,TRUE,Contributor working on issue,"A Contributor is actively working on addressing the issue. This label should be used once someone has been assigned the issue. Because we can only assign people using GitHub's interface when they are part of the organization, the assignment is done by tagging them in a comment on the issue. The Maintainer should set an initial deadline for a PR to be submitted. We suggest 7 days, but it can be adapted at the discretion of the Maintainer depending on the complexity of the task."
+3,status,waiting for response,#679F38,TRUE,Waiting for Contributor to respond to maintainers' comments or update PR,Maintainers responded to the Contributor's inquiry and are either waiting on the Contributor to reply back or implement proposed changes.
+4,status,wait,#FFF2DF,TRUE,Progress dependent on another issue or conversation,Progress on addressing issue or merging PR is dependent on another issue or ongoing conversation and cannot be addressed at this time. Ideally this other conversation should be referenced in the comments.
+5,status,refer to cac,#FFDFB2,TRUE,Curriculum Advisory Committee input needed,Maintainers need advice from the Curriculum Advisory Committee to make a decision on how to proceed with the issue or pull request.
+6,status,need more info,#EE6C00,TRUE,More information needed,Not enough information is provided to proceed with the issue or pull request.
+7,status,blocked,#E55100,TRUE,Progress on addressing issue blocked,A technical problem is hindering progress. A Maintainer or someone else in the community should be notified to ensure that progress is being made.
+8,status,out of scope,#EEEEEE,TRUE,Proposed changes are out of scope,Changes proposed in the issue or in the pull request don't fall within the scope of the lesson.
+9,status,duplicate,#BDBDBD,TRUE,Issue or PR already exists,The concern raised in the issue or pull request has already been mentioned. The previous issue/PR should be mentioned in a comment before this label is used.
+10,type,typo text,#F8BAD0,TRUE,Typo in text for the lesson,Typo in the text/code of the lesson
+11,type,bug,#EB3F79,TRUE,Code included in the lesson needs to be fixed,"Issue about the code, including challenges, answers."
+12,type,formatting,#AC1357,TRUE,Formatting needs to be fixed,Issue about something being wrong in the formatting of the lesson
+13,type,template and tools,#7985CB,TRUE,Issue about template and tools,"Issue or feature request about a technical aspect of the lesson (e.g., in the scripts used to render the lesson), including the documentation of these tools. Pull requests should probably be directed to https://github.com/carpentries/styles "
+14,type,instructor guide,#00887A,TRUE,Issue with the instructor guide,Issue related to the content of the instructor guide. Best suited to be addressed by someone familiar with the content of the lesson
+15,type,discussion,#B2E5FC,TRUE,Discussion or feedback about the lesson,"Issue used to ask a question about how the lesson is taught, ask for clarification. Such issues might indicate that the instructor guide or the documentation may need to be updated."
+16,type,enhancement,#7FDEEA,TRUE,Propose enhancement to the lesson,"Proposal to add new content to the lesson (e.g., introducing additional function, library, command, flag), or adding more technical detail on a topic already covered in the lesson. Such issues may need to be considered by the infrastructure sub-committee, the curriculum advisory committee, or other relevant group."
+17,type,clarification,#00ACC0,TRUE,Suggest change to make lesson clearer,"Part of a lesson which, while not incorrect (i.e., not a bug), is presented in a way that is potentially confusing or misleading. Existing content could benefit from rephrasing or rearranging."
+18,type,teaching example,#CED8DC,TRUE,PR showing how lesson was modified in a workshop,"PR that illustrates how someone modified the lesson in their workshop. Not intended to be merged, but as a way to document how other instructors have used the lesson. Can be closed once the label has been applied."
+19,type,accessibility,#2F1D46,TRUE,improve content compatibility with assistive technology as well as unassisted access,"PR with suggested accessibility improvements or issue used to ask a question or draw attention to an accessibility issue that needs to be addressed in the lesson content or across other Carpentries resources."
+20,type,invalid,#CCCCCC,FALSE,PR considered as spam.,"Mostly around Hacktoberfest repositories receive spammy pull requests (insignificant, useless, or unnecessary changes). Tagging PRs with this label ensures that these PRs are not counted toward Hacktoberfest."
+21,difficulty,good first issue,#FFEB3A,FALSE,Good issue for first-time contributors,Good issue for a new Contributor to our lesson.
+22,priority,high priority,#D22E2E,FALSE,Needs to be addressed ASAP,For issues and pull requests that need to be addressed as soon as possible because the lesson uses code that doesn’t work anymore or includes information that is out of date.
diff --git a/researcher_narrative.qmd b/researcher_narrative.qmd
@@ -0,0 +1,178 @@
+---
+title: "Researcher Narrative"
+format: html
+editor: visual
+---
+# Intro
+## Introduce researcher
+
+You are a PhD student studying kangaroo rats.
+In preparation for some fieldwork this year, you are analyzing an old dataset that your lab collected between 1977 and 1989. Your new fieldwork will attempt to replicate some of that work, and you want to make sure you have a good handle on your focal organisms, several species of kangaroo rat.
+
+You're somewhat familiar with this dataset, but this is your first time doing a deep dive on visualizing and analyzing it. You already know that the dataset contains information from rodent captures in various types of study plots. Body measurements were taken and many of the rodents were sexed.
+
+## Identify research questions
+
+Some things you want to find out:
+1. How many kangaroo rats of each species were found at the study site in past years (so you know what to expect for a sample size this year)?
+2. Do the k-rat exclusion plots work? (i.e. Does the abundance of each species differ by plot?)
+
+# (Episode 3: Identify the problem)
+## Load in the data
+
+First, let's load some packages we're going to need for our analysis:
+```{r}
+library(readr)
+library(dplyr)
+library(ggplot2)
+library(stringr)
+```
+
+We can start by reading in our dataset from a CSV file:
+```{r}
+rodents <- read_csv("scripts/data/surveys_complete_77_89.csv")
+```
+
+## Explore/understand data structure
+
+When we were out in the field, we recorded genus, species, the day, and each individual's weight and hindfoot length. Let's take a look at these to remind ourselves of what our dataset looks like:
+```{r}
+glimpse(rodents) # or click on the environment
+str(rodents) # an alternative that does the same thing
+head(rodents) # or open fully with View() or click in environment
+```
+
+Whoops, when I look at the data, I remember that we actually recorded data about more than just rodents. I can tell because the `taxa` column includes "Rodent" as only one of several values.
+```{r}
+table(rodents$taxa)
+```
+
+As an alternative, we can make a simple plot to visualize how many records belong to each taxon:
+```{r}
+rodents %>%
+  ggplot(aes(x=taxa))+
+  geom_bar()
+```
+
+Let's examine the NA values in the `taxa` column of the dataset:
+```{r}
+## How do we find NAs anyway? ----
+head(is.na(rodents$taxa)) # logical--tells us when an observation is an NA (T or F)
+
+# Not very helpful. BUT
+sum(is.na(rodents$taxa)) # sum considers T = 1 and F = 0
+```
+
+
+## Data wrangling part 1
+
+Let's simplify it down to just the rodents!
+```{r}
+rodents <- rodents %>%
+  filter(taxa == "Rodent")
+glimpse(rodents)
+```
+
+We're interested in studying kangaroo rats. Let's see if we can filter the data down to just kangaroo rats. Kangaroo rats belong to the genus *Dipodomys*, so let's filter to that.
+```{r}
+krats <- rodents %>%
+  filter(genus == "Dipodomys")
+dim(krats) # okay, so that's a lot smaller, great.
+glimpse(krats)
+```
+
+We know we're going to want to look at some trends over time, and currently the date information is divided into three different columns. Let's create a single date column.
+```{r}
+krats <- krats %>%
+  mutate(date = lubridate::ymd(paste(year, month, day, sep = "-")))
+```
+
+We know from our colleagues' previous records that there were some changes made to the experimental setup in January 1988.
+"Rodent treatments were changed on subsets of the short-term plots at three points in time. In
+January 1988, treatments were changed on 8 of the short-term plots: 2 control plots became
+Banner-tailed exclosures, 2 Banner-tailed exclosures became rodent exclosures, and 4 controls
+became kangaroo rat exclosures." [REF](https://www.biorxiv.org/content/10.1101/332783v3.full.pdf+html)
+
+Therefore, we'd like to make a new column to indicate before/after the change.
+
+```{r}
+krats <- krats %>%
+  mutate(time_period = ifelse(year < 1988, "early", "late"))
+
+# Check that this went through; check for NAs
+table(krats$time_period, exclude = NULL) # learned how to do this earlier
+```
+
+# (Episode 4: Reproducible data)
+## Make exploratory visualizations
+
+To get a sense of the data and begin addressing your research questions, you decide to make some exploratory visualizations. 
+
+Starting with the first research question: 
+1. How many kangaroo rats of each species were found at the study site in past years (so you know what to expect for a sample size this year)?
+
+You additionally decide to look at this by plot type because you know the numbers might differ.
+```{r}
+krats %>%
+  ggplot(aes(x = date, fill = plot_type)) +
+  geom_histogram()+
+  facet_wrap(~species)+ 
+  theme_bw()+
+  scale_fill_viridis_d(option = "plasma")+
+  geom_vline(aes(xintercept = lubridate::ymd("1988-01-01")), col = "dodgerblue")
+```
+
+You realize you should remove the unidentified k-rats.
+```{r}
+krats <- krats %>%
+  filter(species != "sp.")
+```
+
+Re-do the plot with these removed:
+```{r}
+krats %>%
+  ggplot(aes(x = date, fill = plot_type)) +
+  geom_histogram()+
+  facet_wrap(~species)+ 
+  theme_bw()+
+  scale_fill_viridis_d(option = "plasma")+
+  geom_vline(aes(xintercept = lubridate::ymd("1988-01-01")), col = "dodgerblue")
+```
+
+Yay, beautiful!
+
+You're curious how many individuals of each species you might expect to catch per day.
+```{r}
+krats_per_day <- krats %>%
+  group_by(date, year, species) %>%
+  summarize(n = n()) %>%
+  group_by(species)
+
+krats_per_day %>%
+  ggplot(aes(x = species, y = n))+
+  geom_boxplot(outlier.shape = NA)+
+  geom_jitter(width = 0.2, alpha = 0.2, aes(col = year))+
+  theme_classic()+
+  ylab("Number per day")+
+  xlab("Species")
+```
+
+# (Episode 5: Reproducible code)
+Okay, so we know we are going to be catching different numbers of individuals, but now let's see why: did the exclusion plots work at keeping the target species out of certain areas?
+
+2. Do the k-rat exclusion plots work? (i.e. Does the abundance of each species differ by plot?)
+
+```{r}
+krats %>%
+  ggplot(aes(x = plot_type, fill = species))+
+    geom_bar() # XXX this needs to be edited
+```
+## Modeling part 1
+
+It's a little hard to draw our conclusion directly from the plot, so let's build a model!
+
+```{r}
+# Insert here: a model to test statistically what we can see in the above plot.
+```
+
+If needed, we can add more wrangling, tweaks, additional plots for this episode, but it's not necessary and might be too much content.
diff --git a/scripts/example_narrative.R b/scripts/example_narrative.R
@@ -289,3 +289,8 @@ krats <- krats %>%
 
 # We now need something more complicate errors for the reprex code
 
+
+# Making linear models
+## It's not clear that weight or hindfoot length can predict sex, but maybe the combination of them can? Let's make a linear model to find out!
+
+mod1 <- lm(sex ~ hindfoot_length + weight + species, data = krats)
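
A note on the `lm()` call above: `lm()` fits a numeric response, whereas `sex` is presumably a categorical column in this dataset. If the aim is to ask whether body measurements predict a two-level category, one common alternative is logistic regression via `glm()` with `family = binomial`. The sketch below is only an illustration of that approach, not part of this commit: `toy_krats`, its values, the species names, and `mod_logit` are all made up so the example runs without the real data file.

```r
# Hypothetical stand-in for `krats`; every value below is simulated for illustration.
set.seed(42)
n <- 200
toy_krats <- data.frame(
  sex     = sample(c("F", "M"), n, replace = TRUE),
  species = sample(c("merriami", "ordii"), n, replace = TRUE)
)
# Simulated measurements (assumed scales, not taken from the real dataset)
toy_krats$weight <- rnorm(n, mean = ifelse(toy_krats$sex == "M", 48, 44), sd = 5)
toy_krats$hindfoot_length <- rnorm(n, mean = 36, sd = 1.5)

# A binomial glm() needs a factor (or 0/1) response; "F" is the reference level here
toy_krats$sex <- factor(toy_krats$sex, levels = c("F", "M"))

# Logistic regression: models the log-odds that sex == "M"
mod_logit <- glm(sex ~ hindfoot_length + weight + species,
                 data = toy_krats, family = binomial)
summary(mod_logit)
```

With the real data, rows with a missing `sex` would be dropped by `glm()` under R's default `na.action`, so it may be worth checking how many records remain before interpreting the coefficients.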