wilderness – A.Z. Andis Arietta https://www.azandisresearch.com Ecology, Evolution & Conservation

Wild Idea Podcast https://www.azandisresearch.com/2025/07/21/wild-idea-podcast/ Mon, 21 Jul 2025 17:01:46 +0000

I recently joined my dear friend Bill Hodge on The Wild Idea Podcast for a conversation about ecological resilience, climate adaptation, and how we think about wilderness in a changing world. We covered topics such as road ecology, species adaptation, and the sometimes counterintuitive lessons that emerge when humans step back from the landscape. From wood frogs that freeze solid in winter to the 22-mile rule showing how few truly remote places remain, we explored how human systems, even unintended ones, shape the trajectories of natural systems.

Drawing on my work in evolutionary ecology, wilderness ethics, and machine learning, I reflected on the tension between our desire to intervene and our limited ability to forecast long-term ecological outcomes. Using examples like the Chernobyl exclusion zone—where many species are thriving in the absence of people despite nuclear contamination—I argued that ecological recovery is often less about precision intervention and more about restraint. We discussed how machine learning can help us simulate alternative futures and understand potential tradeoffs, but that ultimately, the most powerful conservation tool may be humility. More wilderness, not more control, might be the best way to meet the uncertainties ahead.

Listen to the episode here or wherever you get your podcasts.

Text analysis using AI in R https://www.azandisresearch.com/2023/10/05/text-analysis-using-ai-in-r/ Fri, 06 Oct 2023 00:02:01 +0000

Introduction

Analyzing qualitative data is challenging. Such analyses are even more difficult when the topic is controversial and the results will drive important policy decisions. This post explores AI methods for qualitative research, using chatGPT for categorization, embeddings to find hidden topics, and long-context summarization with Claude2 on a case study analyzing free-text public comments to a controversial Environmental Impact decision.

Background

Quite a while ago, I detailed why replacing wolves on Isle Royale National Park was a bad policy decision backed by even worse science. Since then, the National Park Service (NPS) decided to commit to wolf replacement anyway, dropping 19 new wolves on the island in 2018 and 2019. The results were expected. The new wolves killed the last original male wolf in 2019, almost certainly ensuring that the new wolf population will be genetically disconnected from the prior population. Of the 20 wolves that NPS attempted to relocate, one died before making it to the island, one voluntarily crossed the ice back to the mainland*, and four others died by the end of 2019. The surviving 14 wolves successfully bred and the population now stands at 31. So, in the end, we have a new, synthetic wolf population that is entirely disjunct from a genetic and ecological perspective. As I predicted in my original post: “in reality, this is not a genetic rescue project, it is a genetic replacement project,” which violates both the scientific and management purpose of the Park.

* This contradicts one of the primary justifications for replacing the wolves. Proponents argued that the lack of ice due to climate change would make natural repopulation impossible.

But neither science nor policy drove NPS’s decision. Management of charismatic mammals, especially in a well-known National Park, is largely a matter of public sentiment. In fact, it is a codified part of the decision process. Federal managers are required to seek public comments as part of the NEPA process.

In general, I am a huge supporter of public voices in important conservation decisions (I’ve even written papers advocating for it). But, sometimes I worry about how advocacy groups can skew the perception of organic public sentiment. That’s what I’d like to analyze in this post.

All of the public comments submitted to NPS on the Isle Royale wolf-moose management plan are public record. You can download and read all 1117 pages of comments.

But 1117 pages is a lot of text to read and digest. In this post, I want to show how you can easily process lots of text using AI (both generative large-language models (LLM), like chatGPT, and LLM embeddings) to make quantitative (or semi-quantitative) analyses.

Basic analyses

Visit my GitHub repo for this project for a fully reproducible analysis.

First, we’ll set up the environment and load in necessary packages.

# Load libraries
library(pdftools) # We will use 'pdftools' to convert the pdf to plain text
library(tidyverse)
library(stringr)
library(RColorBrewer)

# Set up the directory structure:
make_new_dir <- 
     function(DIR_TO_MAKE){
          if(dir.exists(DIR_TO_MAKE) == FALSE){
               dir.create(DIR_TO_MAKE)
          }else{
               print("Directory exists")
          }
     }

make_new_dir("./data/")
make_new_dir("./figs/")

We can download the comments from the NPS website.

download.file(
     url = "https://parkplanning.nps.gov/showFile.cfm?projectID=59316&MIMEType=application%252Fpdf&filename=ISRO%5FMWVPlan%5FAllCorrespondence%5FPEPC%2Epdf&sfid=232552",
     destfile = "./data/ISRO_MWVPlan_AllCorrespondence_PEPC.pdf",
     mode = "wb"
)

The first step to analyze the public comments is to parse the pdf into text. This is a tedious process. I won’t show it here, but you can follow all of the steps on my GitHub repo for this project.

Example comment from the formatted PDF document.

You can download my pre-processed dataset to short-cut the PDF parsing steps.

download.file(
     url = "https://www.azandisresearch.com/wp-content/uploads/2023/09/EIS_comments.csv",
     destfile = "./data/EIS_comments.csv"
)

EIS_comments <- read.csv("./data/EIS_comments.csv")

The formatting follows the same structure for every comment. I’ve extracted the ‘Comment ID’, ‘Received’ date time, ‘Correspondence Type’, and ‘Correspondence’ text into a dataframe. I’ve also truncated the longest comments (…comment 68 looks like someone copied and pasted their term paper) to 12,000 characters. This will be important later because the context window for chatGPT is 4000 tokens.

EIS_comments %>% glimpse()
Rows: 2,776
Columns: 4
$ ID             <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,…
$ Received       <dttm> 2015-07-12 20:45:30, 2015-07-14 23:18:34, 2015-07-15 12:03:55, 2015-07-15 13:14:52, 2015-07-15 13:35:47, …
$ Correspondence <chr> "Web Form Correspondence", "Web Form Correspondence", "Web Form Correspondence", "Web Form Correspondence"…
$ Content        <chr> "The alternatives are complete enough as a starting point. The issues will be related to the details. The …
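To see why the truncation length matters, here is a rough sketch of the arithmetic. It assumes the 12,000 limit is counted in characters and uses the common rule of thumb of about four characters per token for English text. Both are assumptions, and `truncate_comment()` and `estimate_tokens()` are hypothetical helper names, not functions from the repo. For exact counts, use a real tokenizer.

```r
# Assumed limits (not confirmed by the original analysis)
max_chars <- 12000        # truncation length, assumed to be in characters
chars_per_token <- 4      # rough heuristic for English text, not an exact tokenizer

# Keep only the first `limit` characters of a comment
truncate_comment <- function(text, limit = max_chars) {
  substr(text, 1, limit)
}

# Very rough token estimate from character count
estimate_tokens <- function(text) {
  ceiling(nchar(text) / chars_per_token)
}

# Toy example: a 25,000-character comment
long_comment <- strrep("word ", 5000)
short_comment <- truncate_comment(long_comment)

nchar(short_comment)            # 12000
estimate_tokens(short_comment)  # 3000
```

Under these assumptions, a truncated comment plus a few hundred tokens of instructions still fits inside a 4000-token context window.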

We can do some basic summary analysis on these initial variables. Most comments were submitted in the week before the comment deadline on Sept 1. The vast majority of comments were received through the web form. Fewer than 10% of comments were physical letters, and 51 of the 2777 comments were form cards given to Park visitors.

Area plot of the cumulative total comments over time for each correspondence type.

Often, large influxes of web and email comments are the product of advocacy groups encouraging their members to submit pre-written comments. I’ve used this tactic myself in conservation campaigns, so I won’t cast aspersions. But, I’ll also be the first to admit that a copy-and-pasted form letter is far less sincere than a uniquely crafted opinion.

After checking for matches among the comments, it is clear that there were two archetypical pre-written texts. These include 733 nearly identical comments in favor of wolf replacement (i.e. Alternative B), likely from National Parks Conservation Association:

EIS_comments %>%
     filter(grepl("I care about the wildlife at our national parks, including the wolves and moose at Isle Royale. Right now there are only three", Content)) %>%
     group_by(Content) %>%
     tally() %>%
     arrange(desc(n)) %>%
     ungroup() %>%
     filter(row_number() == 1) %>%
     .$Content %>%
     cat()
Dear Superintendent Green, I care about the wildlife at our national parks, including the wolves and moose at Isle Royale. Right now there are only three wolves left at the park- -the lowest number of wolves in more than 50 years- -threatening the overall ecosystem health of this iconic national park. I support management Alternative B to bring new wolves to the island, but urge the Park Service to do this as needed, rather than one time only. Without wolves, the moose population on the island will continue to increase, eating until the food sources are gone. If we bring new wolves to the island, they will help keep the moose population from rapidly expanding and minimize impacts to the native vegetation. This option is much less intrusive in this wilderness park than culling moose, removing moose from the island, or having to replant native vegetation once the moose consume it. As stewards of this park, the National Park Service should take the least intrusive action that results in the biggest benefit to the island's wildlife and ecosystem. I support the Park Service taking action to bring new wolves to the park immediately, before the population vanishes altogether. Thank you for considering my concerns. Sincerely,

And 55 nearly identical comments in favor of Wilderness (i.e. Alternative A), likely from Wilderness Watch:

EIS_comments %>%
     filter(grepl("Isle Royale's wilderness designation requires that we protect the area's unmanipulated, untrammeled wilderness character. Wild", Content)) %>%
     group_by(Content) %>%
     tally() %>%
     arrange(desc(n)) %>%
     ungroup() %>%
     filter(row_number() == 1) %>%
     .$Content %>%
     cat()
Isle Royale's wilderness designation requires that we protect the area's unmanipulated, untrammeled wilderness character. Wilderness designation means we let Nature call the shots. Transplanting wolves from the mainland to Isle Royale is a major manipulation of the Isle Royale Wilderness and must not be done. Alternative Concept A, the No Action Alternative, is the best alternative to protect Isle Royale's unmanipulated, untrammeled wilderness character.

It is important to flag these duplicated comments because the methods that we will use later on will not behave correctly with nearly identical strings.

EIS_comments_deduplicated <- 
     EIS_comments %>%
     # Remove comments with no content
     filter(!is.na(Content)) %>%
     # Flag the web form duplicates
     mutate(form_duplicate = ifelse(grepl("I care about the wildlife at our national parks, including the wolves and moose at Isle Royale. Right now there are only three", Content), "for Alt B", NA)) %>%
     mutate(form_duplicate = ifelse(grepl("Isle Royale's wilderness designation requires that we protect the area's unmanipulated, untrammeled wilderness character. Wild", Content), "for Alt A", form_duplicate)) %>%
     # Form duplicates are not exact matches
     mutate(Content_dup = ifelse(is.na(form_duplicate), Content, form_duplicate)) %>%
     group_by(Content_dup) %>%
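     # Retain one randomly chosen comment from each set of duplicates
     slice_sample(n = 1) %>%
     # Drop the grouping so later steps operate on the whole table
     ungroup()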

After removing the duplicates and cleaning the data, we are left with 1970 unique comments.
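One way to find candidate form letters in the first place is to count how often the same opening passage recurs. The sketch below is only an illustration on invented toy text, not the method used in the repo. `find_repeated_openings()` is a hypothetical helper, and it assumes form letters share their first `n_chars` characters, which will miss copies that vary the greeting.

```r
# Count comments that share the same opening passage
find_repeated_openings <- function(text, n_chars = 100, min_count = 2) {
  # Compare only the first n_chars characters, ignoring surrounding whitespace
  openings <- substr(trimws(text), 1, n_chars)
  counts <- sort(table(openings), decreasing = TRUE)
  # Keep openings that occur at least min_count times
  counts[counts >= min_count]
}

# Toy comments (invented for illustration)
toy_comments <- c(
  "I care about the wildlife at our national parks. Sincerely, A",
  "I care about the wildlife at our national parks. Sincerely, B",
  "Please leave the island alone."
)

repeated <- find_repeated_openings(toy_comments, n_chars = 40)
repeated
```

Any opening that turns up more than a handful of times is worth reading by eye before flagging it with `grepl()`.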

Text analysis with chatGPT

Now, we can start analyzing the content. There are many ways that we could do this, depending on the question we want to answer. For instance, maybe we want to see which comments naturally group together to see if we can find common themes. Traditionally, a common way to do this type of natural language processing would be to use an approach like latent Dirichlet allocation (LDA) topic analysis, which groups comments by tf-idf values of the stems of words contained in each comment. (I cover tf-idf in a previous post). But one problem with this approach is that the context of words is lost.

If we want to capture the context of the text, we might try using word embeddings from a LLM like GPT. We’ll try this approach later.

In our case, maybe we just want to know how many comments support a given policy. It would be hard to answer that from the embeddings ourselves, but we could treat GPT as an agent who could read and categorize comments by preferred policy alternative.

We’ll use two packages. httr helps us interact with the chatGPT API. The API speaks in json format. jsonlite helps us parse formatted prompts and responses.

library(httr)
library(jsonlite)

Working with chatGPT is a lot like working with a new intern. Like a new intern, it has no prior contextual understanding of our specific task, so we have to be very explicit with our directions. On the bright side, our chatGPT intern has endless patience and never sleeps!

We will be interacting with chatGPT through the API. This differs from the conversational way that most people interact with chatGPT. We need to engineer our prompt to get a robust response in exactly the same format, every time. We can do that by passing in quite a bit of context in our prompt and giving specific directions for the output, with examples. Here is the prompt we’ll use:

You are a federal employee tasked with reading the following comment submitted by a member of the public in response to the The Isle Royale National Park Moose-Wolf-Vegetation Management Plan/EIS. The Plan/EIS is a document that evaluates management alternatives for the moose and wolf populations on the island National Park land.
Management alternatives include:

- Alternative A: No Action. Continue the current management of letting nature take its course, without any intervention or manipulation of the moose or wolf populations or their habitats.
- Alternative B: Immediate Wolf Introduction. Introduce 20-30 wolves over a three-year period, starting as soon as possible to reduce the moose population and its impacts on vegetation.
- Alternative C: Wolf Introduction after Thresholds are Met. Introduce wolves if certain thresholds are met, such as the extirpation of wolves, the overabundance of moose, or the degradation of vegetation. The number and timing of wolf introductions would depend on the conditions at the time.
- Alternative D: Moose Reduction and Wolf Assessment. Reduce the moose population by lethal and non-lethal means, such as hunting, contraception, or relocation. The goal would be to lower the moose density to a level that would allow vegetation recovery and assessing introducing wolves to the island in the future.

Here is the text of the public comment: '[INSERT COMMENT TEXT]'.

State which alternative the commenter is most likely to favor (A, B, C, D).
State if the comment is 'For', 'Against', or 'Neutral' on wolf introductions.
State if the strength of the commenter's opinion on a scale from 'Extremely strong', 'Very strong', 'Strong', 'Somewhat strong', or 'Mild'.

Produce the output in json format like this:
{
"favored_alternative": "",
"wolf_opinion": "",
"opinion_strength": ""
}

ChatGPT 3.5 costs $0.002 per 1000 tokens. We can use the OpenAI tokenizer to estimate the number of tokens constituting our input prompt.

Example output from OpenAI's tokenizer for our prompt.

Our input is 420 tokens. The output should be less than 50 tokens, so we can round up and assume 500 tokens per query. At that rate, it will cost us about $1 to process 1000 comments. Much cheaper than paying a human!
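The cost arithmetic can be written out as a small function. This is just a sketch of the estimate above; `estimate_cost()` is a hypothetical helper, and the defaults are the 500 tokens per query and $0.002 per 1000 tokens quoted in the text. Long comments add tokens beyond the prompt itself, so treat the result as a lower bound.

```r
# Estimated cost in USD = queries * tokens per query / 1000 * price per 1000 tokens
estimate_cost <- function(n_queries,
                          tokens_per_query = 500,
                          usd_per_1k_tokens = 0.002) {
  n_queries * tokens_per_query / 1000 * usd_per_1k_tokens
}

cost_1000 <- estimate_cost(1000)                           # about $1
cost_long <- estimate_cost(1000, tokens_per_query = 3500)  # about $7 if every query were near the context limit
cost_1000
cost_long
```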

In the old days, you could pass a list of inputs to the chatGPT ‘completions’ model all at once. This is no longer possible. Now, using the ‘chat/completions’ API requires looping through the inputs and making individual requests. Unfortunately, the API often fails or hits the request rate limit, so we need to be smart about staging and error handling in this larger loop. Each iteration defines the prompt, waits 18 seconds to avoid the rate limit, and makes the API call inside a tryCatch block. If the call fails, the loop logs the record on which the error occurred and skips to the next one; otherwise, it parses the response and writes the output to a file.

After getting initial responses, I also want to rerun 500 randomly selected comments in order to check chatGPT’s consistency. This is a critical part of using a generative model in quantitative analysis. I’ll talk more about this later.

Here’s the loop. It will take quite a while depending on your rate limit. I’d suggest either running it overnight or putting it on a remote server. Because we write each response out to file, there’s no problem if it fails. Just note the number of the last successful iteration (which will be printed to the screen) and start back up there.

set.seed(7097)

# Randomly select 500 records to resample
IDs_to_resample <- sample(unique(EIS_comments_deduplicated$ID), 500, replace = FALSE)
ID_list <- c(unique(EIS_comments_deduplicated$ID), IDs_to_resample)

# Create a vector to store failed IDs
failed_ids <- c()

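# When restarting after an interruption, overwrite ID_list with the IDs that still need a response.
# (Still_need_IDs is not defined in this post, so the line is left commented out.)
# ID_list <- Still_need_IDs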

for (i in seq_along(ID_list)) {
  ID_number <- ID_list[i]
  # Define the prompt
  prompt_content <- paste0( "Here is the text of the public comment: '", EIS_comments_deduplicated %>%
        filter(ID == ID_number) %>%
        .$Content,
      "'.
    State which alternative the commenter is most likely to favor (A, B, C, D).
State if the comment is 'For', 'Against', or 'Neutral' on wolf introductions.
State if the strength of the commenter's opinon on a scale from 'Extremely strong', 'Very strong', 'Strong', 'Somewhat strong', or 'Mild'.
Produce the output in json format like this:\n{\n\"favored_alternative\": \"\",\n\"wolf_opinion\": \"\",\n\"opinion_strength\": \"\"\n}"
    )
  
  # Initialize gpt_response
  gpt_response <- NULL
  
  # With my account, I can make 3 requests per minute. To avoid denied API calls, I add an 18-second pause in each loop.
  Sys.sleep(18)
  
  tryCatch({
    # Call GPT for a response
    gpt_response <- 
      POST(
        url = "https://api.openai.com/v1/chat/completions", 
        add_headers(Authorization = paste0("Bearer ", read_lines("../credentials/openai.key"))),
        content_type_json(),
        encode = "json",
        body = list(
          model = "gpt-3.5-turbo",
          messages = list(
            list(
              "role" = "system",
              "content" = "You are a federal employee tasked with reading the following comment submitted by a member of the public in response to the The Isle Royale National Park Moose-Wolf-Vegetation Management Plan/EIS. The Plan/EIS is a document that evaluates management alternatives for the moose and wolf populations on the island National Park land.
Management alternatives include:
- Alternative A: No Action. Continue the current management of letting nature take its course, without any intervention or manipulation of the moose or wolf populations or their habitats.
- Alternative B: Immediate Wolf Introduction. Introduce 20-30 wolves over a three-year period, starting as soon as possible to reduce the moose population and its impacts on vegetation.
- Alternative C: Wolf Introduction after Thresholds are Met. Introduce wolves if certain thresholds are met, such as the extirpation of wolves, the overabundance of moose, or the degradation of vegetation. The number and timing of wolf introductions would depend on the conditions at the time.
- Alternative D: Moose Reduction and Wolf Assessment. Reduce the moose population by lethal and non-lethal means, such as hunting, contraception, or relocation. The goal would be to lower the moose density to a level that would allow vegetation recovery and assessing introducing wolves to the island in the future."
            ),
            list(
              "role" = "user",
              "content" = prompt_content
            )
          )
        )
      )
    print(paste0("API call successful for ID: ", ID_number, ", index: ", i))
  }, error = function(e) {
    # Handle API call errors
    cat("API call failed for ID: ", ID_number, ", index: ", i, "\n")
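    # Use <<- so the update persists outside the handler function, and log the comment ID (not the loop index)
    failed_ids <<- c(failed_ids, ID_number)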
  })
  
  # If the API call was successful, proceed with data wrangling and output
  if (!is.null(gpt_response)) {
    # parse the response object as JSON
    content <- content(gpt_response, as = "parsed")
    
    # Assign the ID to the GPT response
    gpt_response_df <- data.frame(
      response_id = ID_number,
      gpt_response = content$choices[[1]]$message$content
    )
    
    # Convert the JSON to a dataframe and join to the record data
    output <- bind_cols( EIS_comments_deduplicated %>%
        filter(ID == ID_number),
      fromJSON(gpt_response_df$gpt_response) %>% 
        as.data.frame()
    ) %>%
      mutate(response_created_time = Sys.time())
    
    # Append the data to the extant records and write the output to a file. (This is a bit less memory efficient to do this within the loop, but I )
    if (!file.exists("./EIS_GPT_responses.csv")) {
      write.csv(output, "./EIS_GPT_responses.csv", row.names = FALSE)
    } else {
      read.csv("./EIS_GPT_responses.csv") %>%
        mutate(across(everything(), as.character)) %>%
        bind_rows(output %>%
                    mutate(across(everything(), as.character))
        ) %>%
        write.csv("./EIS_GPT_responses.csv", row.names = FALSE)
    }
    
    print(paste0("Completed response ", i))
  }
}

# Log the failed IDs to a file
if (length(failed_ids) > 0) {
  write.csv(data.frame(ID = failed_ids), "./failed_ids.csv", row.names = FALSE)
  cat("Failed IDs logged to 'failed_ids.csv'\n")
}
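Restarting after a failure means working out which IDs still need a response. Here is a minimal sketch, assuming the output file keeps the `ID` column written by the loop above. `remaining_ids()` is a hypothetical helper, not part of the original analysis. It counts occurrences rather than using `setdiff()` because the 500 resampled IDs appear in `ID_list` twice.

```r
# Return the IDs that still need a response, respecting repeated IDs
remaining_ids <- function(id_list, done_ids) {
  # How many responses each ID needs in total
  needed <- table(id_list)
  # How many responses each ID already has (zero if absent from the output)
  done <- table(factor(done_ids, levels = names(needed)))
  # Outstanding responses per ID, never negative
  still_needed <- pmax(as.integer(needed) - as.integer(done), 0L)
  rep(as.numeric(names(needed)), times = still_needed)
}

# Toy example: ID 2 is scheduled twice (a resample) and has one response so far
todo <- remaining_ids(id_list = c(1, 2, 3, 2), done_ids = c(2))
todo  # 1 2 3
```

In practice, `done_ids` would come from something like `read.csv("./EIS_GPT_responses.csv")$ID`, and the result could replace `ID_list` before rerunning the loop.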

ChatGPT is nondeterministic, so your responses will differ. You can download the responses I got to follow along.

download.file(
     url = "https://www.azandisresearch.com/wp-content/uploads/2023/09/Final_GPT_Responses.csv",
     destfile = "./data/GPT_output.csv"
)

GPT_output <- read.csv("./data/GPT_output.csv")
GPT_output %>% glimpse()
Rows: 2,470
Columns: 13
$ ID                    <int> 93, 440, 2164, 636, 839, 2335, 36, 487, 1268, 2303, 1781, 60, 1033, 1948, 1826, 1538, 1685, 308, 22…
$ Received              <chr> "7/29/2015 9:09", "8/9/2015 5:14", "8/27/2015 14:36", "8/18/2015", "8/25/2015", "8/28/2015 12:30", …
$ Correspondence        <chr> "Web Form Correspondence", "Web Form Correspondence", "Web Form Correspondence", "Web Form Correspo…
$ Content               <chr> "\"100% o wolves examined since 1994...have spinal anomalies.\"- -Of the six alternatives put forth…
$ form_duplicate        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ Content_dup           <chr> "\"100% o wolves examined since 1994...have spinal anomalies.\"- -Of the six alternatives put forth…
$ favored_alternative   <chr> "C", "C", "Alternative D", "C", "C", "B", "C", "C", "D", "C", "Unknown", "C", "B", "A", "B", "A", "…
$ wolf_opinion          <chr> "For", "Against", "Neutral", "For", "Neutral", "For", "For", "For", "Against", "For", "Neutral", "F…
$ opinion_strength      <chr> "Very strong", "Very strong", "Strong", "Strong", "Somewhat strong", "Very strong", "Strong", "Stro…
$ response_created_time <chr> "32:19.2", "33:11.7", "33:16.9", "33:19.5", "34:35.2", "34:54.2", "34:55.4", "36:15.1", "36:16.3", …
$ Favored_alternative   <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ Wolf_opinion          <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ Opinion_strength      <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…

A couple of interesting things to note here. First, I apparently was not specific enough in my instructions for classifying the favored alternative because chatGPT sometimes returns “Alternative B” instead of just “B”. This is one of the struggles with using chatGPT, a generative model, in this way. It strays from instructions just like human survey respondents when inputting free-text results. For example, common responses to the survey question, “How are you feeling on a scale from 1 (bad) to 10 (good)?” might be “I’m good” or “Okay” or “nine” or “0”. None of those answers fit the instructions, so we have to clean them up.

In the case of chatGPT, we might be able to reduce these errors with more specific prompt engineering. For now, we’ll just clean up the responses on the backend.

# Fix erroneous column names
GPT_output <-
     GPT_output %>%
     mutate(
          favored_alternative = ifelse(is.na(favored_alternative), Favored_alternative, favored_alternative),
          wolf_opinion = ifelse(is.na(wolf_opinion), Wolf_opinion, wolf_opinion),
          opinion_strength = ifelse(is.na(opinion_strength), Opinion_strength, opinion_strength)
          ) %>%
     select(
          -Wolf_opinion,
          -Favored_alternative,
          -Opinion_strength
     )

# There are probably more elegant ways to write generalized rules to classify these responses, but this does the trick
GPT_output <-
     GPT_output %>%
     # Fix 'favored alternative' responses
     mutate(
          favored_alternative_edit = case_when(
               (grepl(" and ", favored_alternative) | grepl(" or ", favored_alternative) | grepl("/", favored_alternative) | grepl("&", favored_alternative) | favored_alternative == "B, D") & !grepl(" and Wolf ", favored_alternative) & !grepl("N/A", favored_alternative) ~ "Multiple",
               grepl("\\bAlternative A\\b", favored_alternative) | favored_alternative %in% c("A", "No Action (A)") ~ "A",
               grepl("\\bAlternative B\\b", favored_alternative) | favored_alternative == "B" ~ "B",
               grepl("\\bAlternative C\\b", favored_alternative) | favored_alternative %in% c("C", "Concept C") ~ "C",
               grepl("\\bAlternative D\\b", favored_alternative) | favored_alternative == "D" ~ "D",
               TRUE ~ "Other"
          )
     ) %>%
     # Fix 'opinion strength' responses
     mutate(opinion_strength = tolower(opinion_strength)) %>%
     mutate(
          opinion_strength_edit = case_when(
               opinion_strength %in% c("strong", "very strong", "mild", "somewhat strong", "extremely strong") ~ opinion_strength,
               TRUE ~ "other"
          )
     ) %>%
     # Fix 'wolf opinion' responses
     mutate(wolf_opinion = tolower(wolf_opinion)) %>%
     mutate(
          wolf_opinion_edit = case_when(
          wolf_opinion %in% c("for", "against", "neutral") ~ wolf_opinion,
          TRUE ~ "other"
          )
     )

Let’s take a look at the results.

Bar chart of the favored alternative expressed in comments as assessed by chatGPT.

We can see that the majority of comments favor Alternative B: immediate wolf introduction. However, if we exclude the duplicated comments, our conclusion shifts to a majority in favor of the more moderate Alternative C: introduce wolves only after certain thresholds are met. Almost no one supports Alternative D: moose reduction and wolf assessment.

Bar chart of opinion strength by favored alternative.

Comments that favored Alternative A expressed proportionally stronger opinions. Alternative B supporters had mostly strong opinions but very few extremely strong or mild opinions. Supporters of Alternatives C and D were the least opinionated.

Validating chatGPT responses

It is worth asking ourselves how reliable chatGPT is at classifying these responses. One way to test this is to rerun a subset of comments, as we did above, and check for agreement. This is called inter-rater reliability* (IRR).

* Although, maybe it should be called intra-rater reliability in this case. I guess it depends on our definition of ‘individual’ with LLM queries, but that’s a very philosophical can of worms!

First, we need to subset our dataset to the responses that we scored twice.

IRR_comparisons <- 
     GPT_output %>%
     group_by(ID) %>%
     arrange(response_created_time) %>%
     mutate(ID_row_count = row_number()) %>%
     filter(ID_row_count <= 2) %>%
     mutate(n = n()) %>%
     filter(n > 1) %>%
     ungroup()

Then we can see how reliably the favored alternative was scored.

IRR_comparisons %>%
     select(ID, favored_alternative_edit, ID_row_count) %>%
     pivot_wider(
          id_cols = "ID",
          names_from = "ID_row_count",
          values_from = "favored_alternative_edit",
          names_prefix = "val"
     ) %>%
     group_by(val1 == val2) %>%
     tally() %>%
     mutate(
          total = sum(n),
          prop = n/total
     )
# A tibble: 2 × 4
  `val1 == val2`     n total  prop
  <lgl>          <int> <int> <dbl>
1 FALSE              2   500 0.004
2 TRUE             498   500 0.996

ChatGPT gave consistent responses in 498 out of 500 cases. That’s pretty good! Let’s look at the comments where it disagreed with itself.

IRR_comparisons %>%
     select(ID, favored_alternative_edit, ID_row_count) %>%
     pivot_wider(id_cols = "ID", names_from = "ID_row_count", values_from = "favored_alternative_edit", names_prefix = "val") %>%
     filter(val1 != val2)
# A tibble: 2 × 3
     ID val1  val2 
1   288 C     B    
2  1160 B     C    
 
EIS_comments_deduplicated %>%
     filter(ID == 288) %>%
     .$Content %>%
     cat()
There should be a balance between the wolf population and moose. When it is not balanced there is more harm than good done to the environment. Please introduce more wolves on this island instead of decreasing their population and this will keep the moose in check. Please add more wolves to contain the moose population. So many wolves are under attack in other states and decreasing their population is NOT the answer. It only creates more problems to the environment. There should be intense management of the wolf population to help it thrive and return the land back to it's natural state where there are enough moose and wolves. I think the public should be consulted as far as future plans for any culling. There should be intense management to monitor the effects of climate change as this will affect all aspects of wildlife and plant life on the island. I do not like the idea of a moose cull. I like the idea of introducing more wolves to the island so long as there is harmony with the existing wolves on the island. Maybe possibly try to introduce another type of animal that would be a good balance with the wolves and moose but only if it does not disrupt the balance and create new problems. Other states have adopted disastrous wolf culling plans that are only in the interests of farmers and ranchers. As the wolf population is dwindling, other problems will begin to develop as there is not a proper balance. Please keep wolves in mind and do your best to increase their population before it is too late and more animals will be needlessly killed without the proper balance of mother nature.
 
EIS_comments_deduplicated %>%
     filter(ID == 1160) %>%
     .$Content %>%
     cat()
I have heard both sides of this situation and I believe that new wolves should be introduced on Isle Royale. Climate change has made a large impact on the amount of ice that freezes in the Isle Royale region. Previously wolves from the mainland could cross the ice that formed and take up residence on the Isle. The ice hasn't been stable enough for these crossings in the last few years and the wolves are becoming inbred and dying off. If you will check a video that I have watched about the wolves being reintroduced to Yellowstone, you will see that the ecology of the region is benefited by the wolves being there. If enough wolves are transported to Isle Royale, the wolves will keep the moose in check and the ecology will improve. Allowing the pack to die off is really not a positive move. Introducing a new bloodline to the pack will help. I believe the wilderness designation of Isle Royale is a positive thing and that the wolves help to keep the ecosystem there in good order. Thank you for taking comments from the public.

In both cases, chatGPT vacillated between classifying the comment as favoring alternative B or C. The difference between those alternatives is admittedly nuanced. Both alternatives propose introducing wolves; the only difference is the timing. In Alternative B, wolves would be introduced immediately, and in Alternative C wolves would be introduced “if certain thresholds are met, such as the extirpation of wolves, the overabundance of moose, or the degradation of vegetation. The number and timing of wolf introductions would depend on the conditions at the time.”

Both of the comments that made chatGPT disagree with itself focus on the environmental conditions that wolf introductions might remedy. However, these comments seem to presuppose that those conditions have been met and to suggest that immediate introduction is necessary. So, I can see where chatGPT might have a hard time solidly classifying these comments.

Let’s also check the IRR for chatGPT’s classification of ‘opinion strength.’ Unlike the favored alternative, where most folks explicitly stated their preference, classifying the strength of an opinion is a far more subjective task.

IRR_comparisons %>%
     select(ID, opinion_strength_edit, ID_row_count) %>%
     pivot_wider(
          id_cols = "ID",
          names_from = "ID_row_count",
          values_from = "opinion_strength_edit",
          names_prefix = "val") %>%
     group_by(val1 == val2) %>%
     tally() %>%
     mutate(
          total = sum(n),
          prop = n/total
     )
# A tibble: 2 × 4
  `val1 == val2`     n total  prop
  <lgl>          <int> <int> <dbl>
1 FALSE              5   500  0.01
2 TRUE             495   500  0.99

ChatGPT disagreed with itself in 5 cases, but gave reliable classifications 99% of the time. That’s pretty good! However, just assessing binary disagreement or agreement isn’t a strong metric for this variable. A switch from “extremely strong” to “very strong” is less of an issue than vacillating from “extremely strong” to “mild”.
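
To make that distinction concrete, here is a minimal base R sketch. The two runs of ratings are made up for illustration (they are not from the EIS data); the level order matches the opinion strength categories used in this post.

```r
# Hypothetical ratings from two chatGPT runs on the same five comments
levels_ordered <- c("other", "mild", "somewhat strong", "strong",
                    "very strong", "extremely strong")
run1 <- c("extremely strong", "strong", "mild", "very strong", "extremely strong")
run2 <- c("very strong", "strong", "mild", "very strong", "mild")

# Convert labels to ordered ranks (1 = "other", 6 = "extremely strong")
rank1 <- as.integer(factor(run1, levels = levels_ordered))
rank2 <- as.integer(factor(run2, levels = levels_ordered))

# Binary agreement treats both disagreements the same
agreement_rate <- mean(rank1 == rank2)

# The rank gap shows that one disagreement is much larger than the other
rank_gap <- abs(rank1 - rank2)
print(agreement_rate)
print(rank_gap)
```

Both mismatches count equally against the agreement rate, but the rank gap separates a one-step wobble from a swing across most of the scale.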

Instead, we can use Krippendorff’s Alpha. This metric provides a formal way to assess the amount of inter-rater disagreement. There are multiple metrics that we could use, but Krippendorff’s Alpha is nice because it can generalize to any number of reviewers and can handle many types of disagreement (i.e. binary, ordinal, interval, categorical, etc.). Here’s a great post for understanding Krippendorff’s Alpha. We’ll use the irr package to estimate it.
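
For intuition about what the metric measures, here is a stripped-down sketch of the calculation in base R. It assumes exactly two raters, no missing values, and the interval (squared difference) distance on numeric codes, so it is not the ordinal version used in this post and will not reproduce the irr output. The function name and the ratings are hypothetical.

```r
# Minimal Krippendorff's alpha for two raters, no missing values,
# using the interval (squared-difference) metric on numeric codes.
kripp_alpha_interval <- function(rater1, rater2) {
  stopifnot(length(rater1) == length(rater2))
  # Observed disagreement: mean squared difference within each record
  d_observed <- mean((rater1 - rater2)^2)
  # Expected disagreement: mean squared difference over all ordered pairs
  # of distinct values pooled across both raters
  pooled <- c(rater1, rater2)
  n_values <- length(pooled)
  d_expected <- sum(outer(pooled, pooled, "-")^2) / (n_values * (n_values - 1))
  1 - d_observed / d_expected
}

# Hypothetical rank codes (1 = "other" ... 6 = "extremely strong")
run1 <- c(6, 4, 2, 5, 6, 3)
run2 <- c(5, 4, 2, 5, 6, 3)
alpha_close <- kripp_alpha_interval(run1, run2)
alpha_perfect <- kripp_alpha_interval(run1, run1)
print(c(close = alpha_close, perfect = alpha_perfect))
```

Alpha is one minus the ratio of observed to expected disagreement, so identical ratings give exactly 1 and a single one-step disagreement pulls the value just below it.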

library(irr)

The irr package needs the dataset in a wide-format matrix with one row per reviewer and one column per record (the package calls records ‘subjects’ because this metric is traditionally used in social science research). For this analysis, we’ll consider the first and second responses from chatGPT as individual reviewers. We also need to enforce the order of our opinion strength levels; otherwise, R will naturally order them alphabetically.

IRR_comparisons %>%
     mutate(opinion_strength_edit = fct_relevel(
          opinion_strength_edit,
          c(
               "other",
               "mild",
               "somewhat strong",
               "strong",
               "very strong",
               "extremely strong"
           )
     )) %>%
     select(
          ID,
          opinion_strength_edit,
          ID_row_count
     ) %>%
     pivot_wider(
          id_cols = "ID_row_count",
          names_from = "ID",
          values_from = "opinion_strength_edit",
          names_prefix = "ID_"
     ) %>%
     select(-ID_row_count) %>%
     as.matrix() %>%
     kripp.alpha(method = "ordinal")
  
Krippendorff's alpha

 Subjects = 500 
   Raters = 2 
    alpha = 0.996 

Krippendorff’s Alpha ranges from -1 to 1, where 1 means perfect concordance, 0 means agreement no better than chance, and negative values mean systematic disagreement. At 0.996, we are pretty near perfect reliability.
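
One detail worth double-checking in the reshaping step: in base R, as.matrix() on a data frame of factor columns returns a character matrix, so the custom level order set with fct_relevel may not survive the conversion. I have not confirmed how kripp.alpha orders character values, so treat the following as a precaution rather than a correction. The sketch uses made-up ratings and shows one way to pass integer codes instead.

```r
# Hypothetical two-run ratings stored as factors with a custom level order
strength_levels <- c("other", "mild", "strong", "extremely strong")
ratings <- data.frame(
  ID_1 = factor(c("mild", "mild"), levels = strength_levels),
  ID_2 = factor(c("extremely strong", "strong"), levels = strength_levels)
)

# as.matrix() on factor columns yields a character matrix: level order is gone
ratings_chr <- as.matrix(ratings)
print(is.character(ratings_chr))

# Converting each column to its integer code keeps the intended order
ratings_int <- sapply(ratings, as.integer)
print(ratings_int)
```

In the pipeline above, the equivalent change would be wrapping the releveled factor in as.integer() before pivoting, which may shift the estimate slightly if the ordering was previously alphabetical.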

For many datasets, there will be a lower degree of IRR. But, it is important to remember to interpret the alpha value in context. Perfect concordance may not be realistic, especially in highly subjective classifications. In most cases our goal is not perfect concordance, but simply greater reliability than we’d get if we hired a bunch of humans to do the annotating. Preliminary evidence seems to indicate that even version 3.5 of chatGPT is more reliable than humans (even domain experts!) in subjective classification tasks.

In most cases, you won’t have the resources to get human annotations for an entire dataset for comparison. Instead, you could 1.) get human annotations for a small subset, 2.) use a similar benchmark dataset, or 3.) spot-check responses yourself. If you choose to spot check, I’d suggest rerunning chatGPT multiple times (> 3) in order to estimate the variance in responses. High variance responses indicate especially difficult classifications that you should target for spot-checks. Another tip is to ask chatGPT to return its justification with each response. Ultimately, this process will help you diagnose problematic types of responses and enable you to engineer better prompts to deal with those edge cases.
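
As a rough sketch of how that triage could work, the base R snippet below takes repeated classifications per comment and ranks comments by how often the runs agree with the most common label. The data frame, IDs, and labels are hypothetical.

```r
# Hypothetical output: four chatGPT runs for each of three comments
reruns <- data.frame(
  ID = rep(c("101", "102", "103"), each = 4),
  favored_alternative = c("B", "B", "B", "B",
                          "B", "C", "B", "C",
                          "A", "A", "C", "A")
)

# Share of runs that agree with the most common label for each comment
modal_share <- sapply(
  split(reruns$favored_alternative, reruns$ID),
  function(labels) max(table(labels)) / length(labels)
)

# Comments whose runs disagree most are the first candidates for spot-checks
ranked <- sort(modal_share)
spot_check_ids <- names(ranked)[ranked < 1]
print(ranked)
print(spot_check_ids)
```

Sorting by the modal share puts the least stable classifications first, which is a cheap way to decide where to spend manual review time.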

The bottom line is that working with chatGPT is less like working with a model and more like working with human raters–and all of the validation tasks that entails.

Analysis with token embeddings

Up to this point, we’ve presupposed the classifications we wanted ChatGPT to identify in our data. But, what if we wanted to uncover hidden categories in the responses? Folks could advocate for the same Alternative but for different reasons. For example, among those who favor Alternative C, some might argue from the perspective of climate change and some from the perspective of moose populations.

We can use token embeddings to uncover hidden clusters of topics in our responses. Embeddings are the way that LLMs encode free text into numeric form. Each token or ‘unit of language’ is numerically described as a position in multidimensional language space. This is a huge advantage over more traditional language clustering methods that simply count the occurrence of certain words. Embeddings retain the context of each token as it exists in the document.

Toy example of four sentences containing the word 'train' embedded in two dimensions.
Embeddings allow us to retain the context of text by expressing tokens in multidimensional language space.

As a toy example, the word “train” in these sentences: “I train a model”, “I train for a marathon”, “I rode the train”, “I’m on the Soul Train” could be described in two dimensions of more or less metaphorical and noun/verb. If we do this for all of the words in a document or chunk of text, we can then think of all the embeddings as a point cloud. Documents with highly overlapping point clouds are more similar than those that don’t overlap at all.
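
To show how closeness in that space can be measured, here is a small base R sketch using cosine similarity, a common choice for comparing embeddings. The two-dimensional coordinates are invented for the toy example; real embeddings have far more dimensions.

```r
# Hypothetical 2-D coordinates (metaphorical, noun-vs-verb) for "train"
train_embeddings <- rbind(
  model    = c(0.6, 0.9),   # "I train a model"
  marathon = c(0.2, 0.9),   # "I train for a marathon"
  rode     = c(0.1, -0.8),  # "I rode the train"
  soul     = c(0.9, -0.7)   # "I'm on the Soul Train"
)

# Cosine similarity: dot product of the two vectors after scaling to unit length
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

sim_verbs <- cosine_similarity(train_embeddings["model", ], train_embeddings["marathon", ])
sim_mixed <- cosine_similarity(train_embeddings["model", ], train_embeddings["rode", ])
print(c(verbs = sim_verbs, mixed = sim_mixed))
```

The two verb uses of “train” score much closer to each other than the verb and noun uses do, which is the property the clustering below relies on.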

We call a different OpenAI model, text-embedding-ada-002, to return the embeddings; it returns one embedding vector per comment. Unlike the chat model, we can pass all of the responses as a list in a single call, instead of looping through each response. This makes embeddings much faster and cheaper than using the chatGPT API.

Prior to embedding, I like to remove non-alphanumeric characters from the text.

# Clean up the text to remove non-alphanumeric characters
input_to_embed <- 
     EIS_comments_deduplicated %>%
     mutate(Content_cleaned = str_replace_all(Content, "[^[:alnum:]]", " "))

# Call OpenAI for the embeddings
embeddings_return <- 
     POST(
          "https://api.openai.com/v1/embeddings",
          add_headers(Authorization = paste0(
               "Bearer ", read_lines("../credentials/openai.key"))
          ),
          body = list(
               model = "text-embedding-ada-002",
               input = input_to_embed$Content_cleaned
               ),
          encode = "json"
     )

The returned object is a bit convoluted. We can use a bit of purrr and jsonlite to extract the embeddings.

# Extract the embeddings from the API return
embeddings_list <-
     embeddings_return %>%
     content(as = "text", encoding = "UTF-8") %>%
     fromJSON(flatten = TRUE) %>%
     pluck("data", "embedding")

Then add the embeddings back into the dataframe.

# Combine the embeddings with the original data
EIS_GPT_embeddings <- 
     EIS_comments_deduplicated %>%
     as_tibble() %>%
     mutate(
          embeddings = embeddings_list,
          ID = as.character(ID)
     ) %>%
     left_join(
          # We need only the first instance of the GPT response data (which also includes the repeated reliability test responses) to know which alternative the comment favors
          GPT_output %>%
               group_by(ID) %>%
               arrange(response_created_time) %>%
               mutate(ID_row_count = row_number()) %>%
               filter(ID_row_count == 1) %>%
               ungroup() %>%
               select(
                    ID,
                    favored_alternative_edit,
                    opinion_strength_edit
               )
     )

Topical clustering from text embeddings

The problem is that those point clouds exist in extremely high dimensions. OpenAI’s text-embedding-ada-002 model returns 1536 dimensions. We need a method to reduce that complexity into something useful.

As mentioned, the embeddings allow us to see how comments relate in high-dimensional language space. We want to figure out where there are denser clusters of point clouds in that space which indicate common themes in the comments.

A couple of common ways to do this are to use a clustering algorithm (e.g. K-means) or dimension reduction (e.g. PCA). For this tutorial I want to use a bit of a hybrid approach called t-SNE (t-distributed Stochastic Neighbor Embedding) that will allow us to easily visualize the clusters of common comments which we can then explore.

We’ll use the Rtsne package, which requires that the data be in matrix form.

library(Rtsne)

# Rtsne requires the embeddings to be in matrix form, so we extract the lists of embeddings from the dataframe and convert them to matrix form.
openai_embeddings_mat <-
     matrix(
          unlist(
               EIS_GPT_embeddings %>%
               .$embeddings
               ),
          ncol = 1536,
          byrow = TRUE
     )

# Estimate tSNE coordinates
set.seed(7267158)
tsne_embeddings <-
     Rtsne(
          openai_embeddings_mat,
          pca = TRUE,
          theta = 0.5,
          perplexity = 50,
          dims = 2,
          max_iter = 10000
     )

Choosing the perplexity (roughly, the number of effective neighbors each point is compared against) is more of an art than a science, and theta (the speed/accuracy trade-off of the Barnes-Hut approximation, where 0 gives exact tSNE) also affects the result. This post does a great job of exploring choices for these parameters. By setting pca = TRUE in this case, we are first reducing the dimensionality to 50 principal components (the default for initial_dims) and then using tSNE to do the final reduction to two visual dimensions.
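
The PCA pre-reduction step can be sketched on its own with base R’s prcomp. The random matrix below is only a stand-in for the real embeddings, and the centering and scaling settings are my assumptions rather than a claim about exactly what Rtsne does internally.

```r
set.seed(1)
# Stand-in for the embedding list: 200 hypothetical comments x 1536 dimensions
fake_embeddings <- replicate(200, rnorm(1536), simplify = FALSE)

# Stack the list into a matrix with one row per comment
embedding_mat <- do.call(rbind, fake_embeddings)

# Center the columns and keep the scores on the first 50 principal components
pca_fit <- prcomp(embedding_mat, center = TRUE, scale. = FALSE, rank. = 50)
pca_scores <- pca_fit$x
print(dim(embedding_mat))
print(dim(pca_scores))
```

Having the 50 component scores in hand is also useful if you later want to cluster on them directly instead of on the tSNE coordinates.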

# Extract the tSNE coordinates and add them to the main dataset
EIS_GPT_embeddings <- 
     EIS_GPT_embeddings %>%
     mutate(
          tsne_dim1 = tsne_embeddings$Y[,1],
          tsne_dim2 = tsne_embeddings$Y[,2]
     )

# Visualize the tSNE plot
EIS_GPT_embeddings %>%
     ggplot(aes(x = tsne_dim1, y = tsne_dim2)) +
     geom_point(alpha = 0.5, pch = 16)
tSNE plot of the openai embeddings
The tSNE plot uncovers some weak groupings, but there is no extremely clear delineation between most comments. This is likely a symptom of low diversity in comments and the fact that most of our comments are very short, so there is less signal in the content.

The first thing to note is that we are not seeing much discrete grouping of the points. This tells us that the comments share a lot more in common across all comments than across local groups of comments. The second thing to notice is that despite the spread, we do see a handful of groups budding off along the periphery. In fact, one group in the bottom right is very distinct. It is important to remember that, unlike PCA, the axis dimensions in tSNE are meaningless. In fact, I’ll remove them from the plot for the rest of the post. Position doesn’t matter in tSNE–only relative closeness.

At this point, we might want to manually delimit groups that we want to analyze further, like pulling out all of the comments from that cluster in the top left. To make this a bit easier, I’ve opted to cluster the two dimensional tSNE with hierarchical clustering. It is important to realize that this is purely a convenience for visualization. If we really wanted to use clustering to directly define groups (like hierarchical, K-means, etc.), it would make much more sense to cluster directly on the first 50 principal components.

tsne_embedding_clusters <- 
     hclust(
          dist(tsne_embeddings$Y), 
          method = "average"
     )

EIS_embeddings_clustered <-
     EIS_GPT_embeddings %>%
     mutate(
          cluster = cutree(tsne_embedding_clusters, 7)
     )

Since we are clustering on tSNE dimensions where distance doesn’t really matter, deciding where to set our breakpoint is a personal choice. I’ve decided to use 7 clusters because it seemed a natural breakpoint and recovered the obvious clusters.
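
One way to explore that choice is to cut the same tree at several values of k and compare the resulting cluster sizes. This is a minimal sketch on simulated coordinates (three well-separated blobs), not the EIS data.

```r
set.seed(42)
# Hypothetical 2-D coordinates: three loose groups of 30 points each
coords <- rbind(
  matrix(rnorm(60, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 5, sd = 0.5), ncol = 2),
  matrix(rnorm(60, mean = 10, sd = 0.5), ncol = 2)
)

tree <- hclust(dist(coords), method = "average")

# cutree() accepts a vector of k values and returns one column per cut
cuts <- cutree(tree, k = 2:5)

# Cluster sizes at k = 3, where the simulated groups should be recovered
sizes_k3 <- table(cuts[, "3"])
print(sizes_k3)
```

Scanning the sizes across k shows where a cut starts splitting off tiny fragments, which is a reasonable cue for stopping.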

tsne plot and hierarchical tree diagram displaying the data split into 7 clusters
Using hierarchical clustering, we can cluster on the tSNE coordinates. Since tSNE coordinates are mostly meaningless, deciding how many clusters to split the data into is a bit arbitrary.

Text analysis of topical clusters

Now that we have putative clusters of topics, we can perform some classic natural language processing (NLP) to illuminate the themes of those topics. We’ll use tidytext for this task.

library(tidytext)

First, we need to get the data into a long, tidy format where each word in every comment is its own row. We’ll also remove common stop words that are predefined in the tidytext library. Then, we can calculate the term frequency-inverse document frequency (TF-IDF) for the clusters. TF-IDF is basically a measure of how common a word is within a cluster, after accounting for how common a given word is overall.
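
To make the calculation explicit, here is a base R sketch of TF-IDF on a few made-up comments, treating each cluster as one ‘document’. It is an illustration of the arithmetic, not the code used for the figures; the comments and the tiny stop word list are hypothetical. In tidytext, unnest_tokens() and bind_tf_idf() cover the same steps.

```r
# Hypothetical comments already assigned to clusters
comments <- data.frame(
  cluster = c(1, 1, 2, 2),
  text = c(
    "wolves keep the moose in check",
    "bring more wolves to the island",
    "wilderness should stay untrammeled",
    "wolves belong in untrammeled wilderness"
  )
)
stop_words_small <- c("the", "in", "to", "should", "more")  # tiny stand-in list

# One row per word per cluster, with stop words removed
words <- strsplit(tolower(comments$text), "[^a-z]+")
long <- data.frame(
  cluster = rep(comments$cluster, lengths(words)),
  word = unlist(words)
)
long <- long[!long$word %in% stop_words_small & long$word != "", ]

# Term frequency: share of a cluster's words accounted for by each word
counts <- as.data.frame(table(cluster = long$cluster, word = long$word),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$tf <- counts$Freq / ave(counts$Freq, counts$cluster, FUN = sum)

# Inverse document frequency: log(number of clusters / clusters containing the word)
n_clusters <- length(unique(counts$cluster))
clusters_with_word <- ave(rep(1, nrow(counts)), counts$word, FUN = sum)
counts$idf <- log(n_clusters / clusters_with_word)
counts$tf_idf <- counts$tf * counts$idf
print(counts[order(-counts$tf_idf), c("cluster", "word", "tf_idf")])
```

Because “wolves” appears in both clusters its TF-IDF is zero, while “untrammeled” and “wilderness” rise to the top of cluster 2.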

For example, if we take a look at the most common words in each cluster, it is unsurprising that “wolves”, “moose”, “isle” and “royale” dominate. (Although it is interesting that the top words for clusters 4 and 7 are “wilderness” and “management”… more on that later).

word frequency bar plots for each cluster
Unsurprisingly, when considering the most common words, “wolves”, “moose”, and “isle” dominate.

However, TF-IDF tells us about the relatively unique words that define a cluster of comments. Some clusters, like 1 and 2, have very even TF-IDF distributions, and the important words are mostly filler or nonsense words. This happens when clusters are saturated with common words and there is no strong theme producing uniquely important words. We could have guessed from the tSNE plot of the embeddings that the bulk of comments in the center of the plot would fall in this lexical no-man’s-land. But! Clusters 3, 4, 5, and 7 show promisingly skewed distributions.

term frequency inverse document frequency bar plots for clusters
TF-IDF is a measure of uniquely important words in a ‘document’ (or cluster, in this case) relative to common words across all documents.

Cluster 3 seems to orient towards a topic of animal welfare, with words like “contraception”, “sterilization”, “lethal”, and “culls”. I suspect that these comments come from folks concerned less about the wolf population or wilderness management, and more about the ethics of any proposed action involving animals. In a similar way, it looks like Cluster 7 is more concerned with the science and measurement behind the management decision and less about the decision itself, with words like “evaluating”, “approximately”, and “tools” showing high uniqueness and “management” as the most common word overall. These topics would have been completely lost if we had stopped at categorizing favored alternatives.

Meanwhile cluster 4 appears to be squarely concerned with Wilderness issues. “Wilderness” and “nature” are the most common words in this cluster and “untrammeled” and “unmanipulated” are the most uniquely important words. We might expect that most of the comments that chatGPT categorizes as favoring alternative A will fall into cluster 4.

We can also take a look at how the clusters map onto the chatGPT categorizations.

chatGPT categorized 'favored alternative' mapped to tSNE coordinates with bar plot showing favored alternative counts per cluster
Mapping the chatGPT categorized ‘favored alternative’ onto the tSNE coordinates, we can see that topical clusters mostly conform to the favored alternatives.

Mapping the chatGPT categorized ‘favored alternative’ onto the tSNE coordinates, we can see that comments roughly sort by favored alternative. Cluster 6 is almost entirely defined by support for Alternative B – immediate wolf introduction. Cluster 4, which seemed to orient towards Wilderness values, is mostly comprised of comments in support of Alternative A – no action.

Cluster 7 and Cluster 3 are mostly skewed to Alternative C – more monitoring, and exhibit very similar distributions. This might be a great example where even folks who tend to agree on the same Alternative do so for different reasons–a pattern we would have totally missed without text analysis.

The remaining clusters, which compose the bulk of the midland in the tSNE plot, favor a mix of Alternatives.
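
The comparison behind that figure boils down to a cross-tabulation of cluster against favored alternative. A minimal base R sketch with made-up assignments (the counts are not from the EIS data):

```r
# Hypothetical cluster assignments and chatGPT-classified favored alternatives
clustered <- data.frame(
  cluster = c(3, 3, 3, 4, 4, 4, 6, 6, 6, 6),
  favored_alternative = c("C", "C", "B", "A", "A", "C", "B", "B", "B", "B")
)

# Counts of each favored alternative within each cluster
alt_counts <- table(clustered$cluster, clustered$favored_alternative)

# Row-wise proportions: the mix of alternatives inside each cluster
alt_props <- prop.table(alt_counts, margin = 1)
print(round(alt_props, 2))
```

Row proportions make clusters of different sizes comparable, so a cluster dominated by one alternative stands out regardless of how many comments it holds.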

Chain-of-density summarization

We can learn a lot from looking at common and important words and using our human judgement to piece together the topical theme of each cluster. Ideally, we would read all of the comments in a cluster to develop a topical summary. But that would take a long time. As an alternative, we can pass all of the comments in a given cluster to an LLM and have it summarize the theme.

Currently, only a handful of models support context windows large enough to digest the entirety of the comments in our clusters. Anthropic’s Claude2 has a context window of up to 100k tokens (roughly 75,000 words), although it isn’t quite as good as chatGPT 4. To get the most out of Claude2, we can use a special type of prompting developed for summarization called “chain-of-density”. Chain-of-density prompting forces the model to iteratively revise its own output to maximize the density and quality of its summarization. Research shows that people tend to like the chain-of-density summaries even better than human-written summaries of news articles.

For demonstration, we’ll use chain-of-density prompting to summarize the theme of cluster 3. Here is the prompt that we will pass to Claude2:

"You will generate increasingly concise entity-dense summaries of the semicolon separated comments included below.

The comments were submitted by a member of the public in response to the The Isle Royale National Park Moose-Wolf-Vegetation Management Plan/EIS. The Plan/EIS is a document that evaluates management alternatives for the moose and wolf populations on the island National Park land.

Now that you know the context, here are the semicolon separated survey response:

[INSERT SEMICOLON SEPARATED COMMENTS]

Instructions: You will generate increasingly concise entity-dense summaries of the above semicolon separated comments. Repeat the following 2 steps 5 times.

Step 1: Identify 1-3 informative entities (delimited) from the comments which are missing from the previously generated summary.
Step 2: Write a new denser summary of identical length which covers every entity and detail from the previous summary plus the missing entities.

A missing entity is
- Relevant: to the main themes of the comments.
- Specific: descriptive yet concise (5 words or fewer).
- Novel: not in the previous summary.
- Faithful: present in the comments.
- Anywhere: located in the comments.

Guidelines:
- The first summary should be long (7 - 5 sentences, ~100 words), yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and fillers (e.g., 'these comments discuss') to reach ~100 words.
- Make every word count. Rewrite the previous summary to improve flow and make space for additional entities.
- Make space with fusion, compression, and removal of uninformative phrases like 'these comments discuss'.
- The summaries should become highly dense and concise, yet self-contained, e.g., easily understood without having read the comments.
- Missing entities can appear anywhere in the new summary.
- Never drop entities from the previous summary. If space cannot be made, add fewer new entities.

Remember: Use the exact same number of words for each summary.

Answer in JSON. The JSON should be a list (length 5) of dictionaries whose keys are 'missing_entities' and 'denser_summary'."

I don’t have API access to Claude2, but unlike chatGPT, Claude2 accepts files as input. So, it is easy to output our prompt with the contained comments as a .txt file and submit it to Claude2 manually.
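
Here is a minimal base R sketch of that export step. The comments, the shortened prompt pieces, and the temp file path are all placeholders; swapping semicolons inside comments for commas is my own precaution so that comment boundaries stay unambiguous.

```r
# Hypothetical comments from one cluster
cluster_comments <- c(
  "Please use contraception rather than a cull; it is more humane.",
  "Lethal removal of moose is not acceptable."
)

# Placeholders standing in for the prompt text before and after the comments
prompt_head <- "Now that you know the context, here are the semicolon separated survey response:"
prompt_tail <- "Instructions: (remaining chain-of-density instructions go here)"

# Replace semicolons inside comments, then join the comments with semicolons
cleaned_comments <- gsub(";", ",", cluster_comments, fixed = TRUE)
comment_block <- paste(cleaned_comments, collapse = "; ")
full_prompt <- paste(prompt_head, comment_block, prompt_tail, sep = "\n\n")

# Write to a .txt file for manual upload (a temp file here; use any path you like)
prompt_path <- tempfile(fileext = ".txt")
writeLines(full_prompt, prompt_path)
print(prompt_path)
```

The resulting file holds the prompt, the comment block, and the instructions as separate paragraphs, ready to attach in the Claude2 interface.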

Claude2 gives us five increasingly dense summaries:

Summary 1: These comments provide a range of perspectives on how the National Park Service should manage the declining wolf population and increasing moose population on Isle Royale. There is discussion of various preliminary alternative concepts, including taking no action, reintroducing wolves once, maintaining both moose and wolves, reducing the moose population, and intensively managing the moose population. Many commenters express support for alternative C, which involves reintroducing and augmenting the wolf population on Isle Royale more than once if necessary. There is also frequent mention of the possibility of relocating problem wolves from elsewhere to the island. Overall, most commenters emphasize the importance of restoring the natural wolf/moose balance on Isle Royale through wolf reintroduction rather than intensively managing the moose population.
Summary 2: These comments provide perspectives on managing the declining Isle Royale wolf and increasing moose populations. Many support alternative C to reintroduce and augment wolves more than once if needed. Wolf relocation from the mainland is also mentioned. Most emphasize restoring the wolf/moose balance through wolf reintroduction rather than intensively managing moose. Climate change reducing ice bridges for wolf migration and the wilderness designation of Isle Royale are frequently discussed. Many oppose lethal moose culling, instead supporting moose contraception/sterilization. Overall there is a focus on restoring the natural predator-prey dynamic through wolf reintroduction while minimizing intensive human management due to the wilderness designation.
Summary 3: These comments discuss managing the declining Isle Royale wolf and increasing moose populations. Many support alternative C to reintroduce and augment wolves more than once if needed to restore the wolf/moose balance. Wolf relocation and genetics are often mentioned given reduced migration from climate change. The wilderness designation and opposition to lethal moose culling are frequently discussed, with support for contraception/sterilization. There is concern about moose overbrowsing without sufficient wolves. Some mention wolf tourism benefits. Overall there is emphasis on restoring the natural predator-prey dynamic and wolf genetics through reintroduction while minimizing intensive human management due to the wilderness designation.
Summary 4: These comments discuss managing the declining Isle Royale wolf and increasing moose populations, often supporting alternative C to reintroduce and augment wolves more than once to restore balance. Wolf relocation and genetics are frequently mentioned to compensate for reduced migration from climate change. Many oppose lethal moose culling, instead supporting contraception/sterilization, expressing concerns about moose overbrowsing and ticks without wolves. Some mention wolf tourism benefits and preserving the wilderness designation. There is emphasis on restoring balance through wolf reintroduction to limit moose impacts on vegetation while minimizing intensive human management.
Summary 5: These comments frequently support alternative C to reintroduce and augment Isle Royale wolves to restore wolf/moose balance and predator-prey dynamics, concerned about moose overbrowsing, ticks, and vegetation without sufficient wolves. Wolf relocation and genetics are often proposed to compensate for reduced migration from climate change. Many oppose lethal moose culling, noting wilderness values. Some mention continuing the iconic predator-prey research, wolf tourism benefits, wolf-livestock issues, and the value of public engagement. Overall there is emphasis on reestablishing a healthy Isle Royale wolf population through reintroduction to minimize intensive human management and moose impacts.

The final two summaries start to get at the nuance particular to this cluster. Summary 4 mentions, “Many oppose lethal moose culling, instead supporting contraception/sterilization”. Summary 5 mentions: “Many oppose lethal moose culling, noting wilderness values.” As expected, much of the summary is likely common across all comments. So, a human interpretation of the summaries is still necessary.

Conclusion

In this post, we explored using chatGPT, text embeddings, and Claude2 to analyze public comments on a complex environmental management decision. We learned methods to responsibly validate chatGPT output. While not perfect, chatGPT showed promising reliability at categorizing free-form opinions. The text embeddings allowed us to uncover hidden topical clusters among comments that traditional methods would have missed. Claude2’s long context window allowed us to further interpret the topical clusters. Together, these tools enabled a nuanced quantitative analysis of subjective text data that would be infeasible for a single human analyst to perform manually.

 

Arctic Genes in Alaska Magazine https://www.azandisresearch.com/2022/12/10/arctic-genes-in-alaska-magazine/ Sat, 10 Dec 2022 14:14:55 +0000 https://www.azandisresearch.com/?p=2217 An article I wrote about an expedition to collect wood frogs in the Alaska Arctic is now online at Alaska Magazine. I’ve included the teaser below, but check out the whole article here.

Screenshot of the Alaska Magazine website for the article featuring a picture of Andis and Yara doing DNA extractions in a tent. Image by Kaylyn Messer.

I am deep in the Alaskan Arctic, 300 miles from the nearest road system, attempting to conduct the kind of science that usually requires a specialized laboratory. We rowed 30 miles of meandering flatwater today, bringing our total to 200 river miles in 12 days since we landed at a lonely gravel bar on the headwaters of Ambler River in Gates of the Arctic National Park.

Mosquitoes spangle the tent canopy arching over me. Backlit by summer solstice sun, the silhouettes of the insects make an inverted night sky of shifting constellations. The sun never sets on the banks of the Kobuk River this time of year. It hangs high above the horizon even now at 11 p.m., transforming my tent into a solar oven as I, ironically, work to uncover the secrets of a frog that can turn into ice.

Read the rest of the article here.

Web scraping and text analysis in R and GGplot2 https://www.azandisresearch.com/2022/08/30/web-scraping-and-text-analysis-in-r/ Tue, 30 Aug 2022 12:04:44 +0000 https://www.azandisresearch.com/?p=2123 I recently needed to learn text mining for a project at work. I generally learn more quickly with a real-world project. So, I turned to a topic I love, Wilderness, to see how I could apply the skills of web scraping and natural language processing. You can clone my Git repo for the project or follow along in the post below. The first portion of this post will cover web scraping, then text mining, and finally analysis and visualization.

Introduction

I thought it would be interesting to see how different Wilderness Areas are described. Having spent a lot of time in Wilderness Areas across the country, I know that they are extremely diverse. We can use text mining and subsequent analysis to examine just how that regional diversity is perceived and portrayed.

Infographic depicting the most common and most unique words used to describe US Wilderness Areas
This is the final product visualizing descriptions of Wilderness Areas in the US.

Wilderness.net provides a brief description, which I assume is written by the agency’s Wilderness managers, for each of the 803 designated Wilderness Areas in the United States [data disclaimer]. For example, here is the page for one of my favorite Wilderness Areas: South Baranof.

As always, we first load packages and set up the environment.

library(rvest) # For web scraping
library(qdap) # For tagging parts of speech
library(tidytext) # Text mining
library(hunspell) # For word stemming
library(quanteda) # Text analysis
library(cowplot) # Composite figures
library(ggrepel) # For text labels in visualizations
library(wordcloud2) # Word cloud visualizations
library(tidyverse) # For everything.

# I find it easier to install wordcloud2 from the repo:
# library(devtools)
# devtools::install_github("lchiffon/wordcloud2")

Web scraping

I found the video tutorials here very helpful. In those tutorials, the author employs a Chrome extension, SelectorGadget, to isolate the page elements that you want to scrape. Unfortunately, Wilderness.net uses an extremely convoluted page structure without CSS. Thus, I could not find a way to isolate the Wilderness descriptions using CSS selectors. Instead, I scrape all of the body text and then use strsplit to cleave out the portions of the text I am interested in keeping.

Conveniently, each Wilderness Area gets its own page on Wilderness.net and those pages are numbered consecutively in the page address. For example, South Baranof Wilderness is number 561: “wilderness.net/visit-wilderness/?ID=561”. This means that we can simply loop through each page, scrape the text, and store it for further analysis.

First, we need to set up an empty matrix that we will populate with the data we scrape from Wilderness.net. From the text, we want to isolate and store the Wilderness name, the state that contains the most acreage, the year of designation, the federal agency (or agencies) charged with its management, and the description text.

Colnames <- c("Wild", "State", "Year", "Agency", "Descrip")
WildText <- matrix(nrow = 804, ncol = length(Colnames))
colnames(WildText) <- Colnames

Now we loop through all of the pages to populate the empty matrix with our data.

for(i in 1:nrow(WildText)){
  link = paste0("https://wilderness.net/visit-wilderness/?ID=", i)
  page = read_html(link)
  content <- page %>%
    html_nodes("#maincontent , h1") %>%
    html_text()

  WildText[i, 1] <- content %>%
    strsplit("wilderness = '") %>%
    unlist() %>%
    `[`(2) %>%
    strsplit("';\nvar") %>%
    unlist() %>%
    `[`(1)
  
  WildText[i, 2] <- content %>%
    strsplit("stateAbbr = '") %>%
    unlist() %>%
    `[`(2) %>%
    strsplit("';\nvar") %>%
    unlist() %>%
    `[`(1)
  
  WildText[i, 3] <- content %>%
    strsplit("yearDesignated = '") %>%
    unlist() %>%
    `[`(2) %>%
    strsplit("';\nvar") %>%
    unlist() %>%
    `[`(1)
  
  WildText[i, 4] <- content %>%
    strsplit("agency = '") %>%
    unlist() %>%
    `[`(2) %>%
    strsplit(";\nvar") %>%
    unlist() %>%
    `[`(1) %>%
    gsub("[^a-zA-Z ]", "", .)
  
  WildText[i, 5] <- content %>%
    strsplit("<h2>Introduction</h2></div>") %>%
    unlist() %>%
    `[`(2) %>%
    strsplit("Leave No Trace\n\t\t\t") %>%
    unlist() %>%
    `[`(1) %>%
    strsplit(";\n\n\n") %>%
    unlist() %>%
    `[`(2)
}
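
The loop repeats the same split-and-index pattern for every field. A small helper can make that pattern easier to read and test. The sketch below is base R; the function name and the sample page text are hypothetical, it assumes the page text is a single string, and it uses fixed (literal) matching rather than the regular-expression matching in the loop above.

```r
# Return the text between the first occurrence of `start` and the next `end`
extract_between <- function(text, start, end) {
  after_start <- strsplit(text, start, fixed = TRUE)[[1]][2]
  strsplit(after_start, end, fixed = TRUE)[[1]][1]
}

# Hypothetical page text mimicking the embedded variables targeted above
page_text <- "var wilderness = 'South Baranof Wilderness';\nvar stateAbbr = 'AK';\nvar yearDesignated = '1980';\nvar x"

wild_name <- extract_between(page_text, "wilderness = '", "';\nvar")
state <- extract_between(page_text, "stateAbbr = '", "';\nvar")
year <- extract_between(page_text, "yearDesignated = '", "';\nvar")
print(c(wild_name, state, year))
```

When the start marker is absent the helper returns NA, which matches how missing descriptions surface in the check that follows.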

Now we convert the matrix to a tibble and check to make sure our scraping rules captured all of the descriptions.

WildText <- as_tibble(WildText) # Convert the matrix to a tibble.

MissingDescrip <- WildText %>%
  mutate(WID = row_number()) %>%
  filter(is.na(Descrip)) %>%
  .$WID

In MissingDescrip we find that 44 Wilderness Areas are missing descriptions. So, we need to alter the rules a bit and re-scrape those pages.

for(i in MissingDescrip){
  link = paste0("https://wilderness.net/visit-wilderness/?ID=", i)
  page = read_html(link)
  
  WildText[i, 5] <- page %>%
    html_nodes("#maincontent") %>%
    html_text() %>%
    strsplit(paste0(";\nvar WID = '", i, "';\n\n\n")) %>%
    unlist() %>%
    `[`(2) %>%
    strsplit("Leave No Trace\n\t\t\t") %>%
    unlist() %>%
    `[`(1)
}
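Both scraping loops above stop at the first page that fails to load, which is frustrating several hundred pages into a run. One possible safeguard, sketched here under my own assumptions, is to wrap the page read in tryCatch() and pause briefly between requests. The wrapper `safe_read` and the stand-in `fake_reader` are hypothetical names; in the real loop the reader would be `read_html`.

```r
# Hypothetical wrapper: call `reader(url)`, return NULL instead of stopping
# on an error, and pause between requests to go easy on the server.
safe_read <- function(url, reader, pause = 0.5) {
  Sys.sleep(pause)
  tryCatch(reader(url), error = function(e) {
    message("Skipping ", url, ": ", conditionMessage(e))
    NULL
  })
}

# Stand-in reader for demonstration only; it fails on page 2
fake_reader <- function(url) {
  if (grepl("ID=2$", url)) stop("page not found")
  paste("contents of", url)
}

urls <- paste0("https://wilderness.net/visit-wilderness/?ID=", 1:3)
pages <- lapply(urls, safe_read, reader = fake_reader, pause = 0)
vapply(pages, is.null, logical(1))  # TRUE marks the pages that failed
```

Inside the loop, a NULL result could be skipped with `next`, leaving that row as NA so it is picked up by the same missing-value check used for the descriptions.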

There are still a couple of Wildernesses with missing information: Wisconsin Islands Wilderness #654 and Okefenokee Wilderness #426. Each of these areas has idiosyncratic text elements, so we can write specific rules to pull the descriptions for each.

# Wisconsin Islands Wilderness #654
link = paste0("https://wilderness.net/visit-wilderness/?ID=", 654)
page = read_html(link)

WildText[654, 5] <- page %>%
  html_nodes("#maincontent") %>%
  html_text() %>%
  strsplit(";\nvar WID = '654';\n\n\n") %>%
  unlist() %>%
  `[`(2) %>%
  strsplit("Closed Wilderness Area") %>%
  unlist() %>%
  `[`(1)

# Okefenokee Wilderness #426
link = paste0("https://wilderness.net/visit-wilderness/?ID=", 426)
page = read_html(link)

WildText[426, 5] <- page %>%
  html_nodes("#maincontent") %>%
  html_text() %>%
  strsplit("WID = '426';\n\n\n") %>%
  unlist() %>%
  `[`(2) %>%
  strsplit("Leave No TracePlan Ahead and Prepare:") %>%
  unlist() %>%
  `[`(1)

Many Wilderness Areas are managed jointly by two agencies. Because we stripped the punctuation from the agency field while scraping, the two names run together, so we need to separate them.

WildText <- WildText %>%
  mutate(
    Agency = case_when(
      Agency == "Bureau of Land ManagementForest Service" ~ "Bureau of Land Management; Forest Service",
      Agency == "Bureau of Land ManagementNational Park Service" ~ "Bureau of Land Management; National Park Service",
      Agency == "Forest ServiceNational Park Service" ~ "Forest Service; National Park Service",
      Agency == "Fish and Wildlife ServiceForest Service" ~ "Fish and Wildlife Service; Forest Service",
      TRUE ~ Agency
    ),
    WID = row_number()
  ) %>%
  filter(!is.na(Wild))
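The case_when() above lists each run-together pair by hand, which works well for the four combinations that occur. As an alternative sketch, a single regular expression can insert the separator wherever one known agency name runs directly into another. The helper name `split_agencies` is hypothetical, and the approach assumes the field contains only the four agency names that appear in the code above.

```r
# The four agency names that appear in the case_when() above
agencies <- c("Bureau of Land Management", "Fish and Wildlife Service",
              "Forest Service", "National Park Service")

# Hypothetical helper: insert "; " wherever a lowercase letter is immediately
# followed by the start of a known agency name
split_agencies <- function(x) {
  pattern <- paste0("(?<=[a-z])(?=(?:", paste(agencies, collapse = "|"), "))")
  gsub(pattern, "; ", x, perl = TRUE)
}

split_agencies(c("Bureau of Land ManagementForest Service",
                 "Fish and Wildlife ServiceForest Service",
                 "National Park Service"))
```

The lookbehind and lookahead match a position rather than any characters, so nothing is removed from the names and single-agency values pass through unchanged.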

It would be difficult to analyze 804-way comparisons. Instead, I want to group the Wilderness Areas by broad regions. I’m defining the Eastern Region as the states that border the Mississippi and those states to the east. The Western Region is everything to the west. Alaska, which contains almost half of the nation’s Wilderness acreage [link to post], is its own Region. I’ve grouped Hawaii and Puerto Rico into an Island Region, but because there are only 3 Areas in those places, we won’t have enough data to analyze.

WildText <- WildText %>%
  mutate(Region = case_when(State %in% c("MT", "CA", "NM", "WA", "NV", "AZ", "UT", "OR", "SD", "TX", "CO", "WY", "ND", "ID", "NE", "OK") ~ "West",
                            State %in% c("MN", "FL", "PA", "IL", "TN", "VA", "KY", "MO", "VT", "GA", "MI", "AR", "NJ", "MS", "WI", "LA", "NC", "SC", "ME", "IN", "AL", "WV", "NY", "NH", "MA", "OH") ~ "East",
                            State == "AK" ~ "Alaska",
                            State == "HI" | State == "PR" ~ "Island"))
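One thing to watch with this kind of grouping is that case_when() returns NA for any state abbreviation that is not listed, so a typo or an unexpected state silently drops out of the regional summaries. A quick check is easy to sketch with a named lookup vector. The short lookup and the sample abbreviations below are made up for illustration and cover only a handful of states.

```r
# Abbreviated, illustrative lookup (the full version would list every state)
region_lookup <- c(AK = "Alaska", HI = "Island", PR = "Island",
                   CA = "West", GA = "East")

# Made-up sample of scraped state abbreviations; "ZZ" stands in for an
# unexpected value
states <- c("AK", "CA", "GA", "ZZ")

region <- unname(region_lookup[states])
region

# Any abbreviation that did not match a region
states[is.na(region)]
```

Running the equivalent check on the real data, for example by filtering for rows where Region is NA, confirms that every Area landed in a group.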

At this point, I like to save these data so I don’t need to re-scrape every time I run the analysis. Next time, I can simply pick back up by redefining the WildText object.

saveRDS(WildText, "./WildernessDescriptions.rds")
WildText <- readRDS("./WildernessDescriptions.rds")

Text Mining

Now that we’ve successfully scraped our data, we can begin text mining. I found Silge & Robinson’s book, “Text Mining with R”, invaluable.

There are a number of approaches to dealing with text that fall on a spectrum from relatively simple text mining to more complex natural language processing. For this exercise we will conduct a simple analysis that compares the words and their relative frequencies. To do so, we need to decompose the descriptions from word strings to individual words.

First, we remove non-text characters.

WT <- WildText %>%
  mutate(Descrip = gsub("\\.", " ", Descrip),
         Descrip = gsub("  ", " ", Descrip),
         Descrip = gsub("[^\x01-\x7F]", "", Descrip))
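One detail in the cleaning step is worth a second look: replacing two spaces with one handles each pair only once, so longer runs of spaces are shortened but not fully collapsed. The following small sketch, using a made-up sentence, shows the difference between that pattern and one that matches a run of any length.

```r
# Made-up example with a run of four spaces in the middle
x <- paste0("Steep peaks", strrep(" ", 4), "and deep fjords")

gsub("  ", " ", x)  # one pass over pairs: four spaces become two
gsub(" +", " ", x)  # any run of spaces becomes a single space
```

Swapping `" +"` into the pipeline above would be a small change. Whether it matters depends on how the downstream tagger treats repeated spaces, which I have not tested here.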

Next, we tag the part of speech for each word in each description.

WT$pos <- with(WT, qdap::pos_by(Descrip))$POStagged$POStagged

POS_list <- strsplit(WT$pos, " ") # Break string down into list of tagged words

Next, we create a tidy dataframe of the parts of speech and retain only the informational words (nouns, verbs, adjectives, and adverbs). We also remove “stop words” or common words that tend not to provide a strong signal (or the wrong signal).

WT2 <- data.frame(Wild = rep(WT$Wild, sapply(POS_list, length)), words = unlist(POS_list)) %>% # Convert lists of tagged words into tidy format, one word per row
  separate(words, into = c("word", "POS"), sep = "/") %>% # Create matching columns for the word and the tag
  filter(POS %in% c("JJ", "JJR", "JJS", "NN", "NNS", "RB", "RBR", "RBS", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ")) %>% # Keep only the nouns, verbs, adjectives, and adverbs
  anti_join(stop_words) # Remove stop words
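To make the reshaping step more concrete, here is a minimal base R sketch of the same idea on one made-up tagged string. It assumes the word/TAG format that the separate() call above relies on; the sentence itself is invented for illustration.

```r
# Made-up tagged string in the word/TAG shape that separate() assumes
tagged <- "rugged/JJ peaks/NNS rise/VBP above/IN the/DT valley/NN"

tokens <- strsplit(tagged, " ")[[1]]             # one "word/TAG" token per element
parts  <- do.call(rbind, strsplit(tokens, "/"))  # two columns: word, tag
tidy   <- data.frame(word = parts[, 1], POS = parts[, 2],
                     stringsAsFactors = FALSE)

# Same tag set as the filter above: adjectives, nouns, adverbs, and verbs
keep <- c("JJ", "JJR", "JJS", "NN", "NNS", "RB", "RBR", "RBS",
          "VB", "VBD", "VBG", "VBN", "VBP", "VBZ")
tidy[tidy$POS %in% keep, ]
```

The preposition and the determiner drop out, leaving only the words that carry descriptive content.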

Next we lemmatize the words (i.e. extract the stem word, for instance, peaks = peak and starkly = stark).

WT2$stem <- NA
for(i in 1:nrow(WT2)){
  WT2$stem[i] <- hunspell_stem(WT2$word[i]) %>% unlist() %>% .[1]
  print(paste0(sprintf("%.2f", 100 * i/nrow(WT2)), "% completed")) # This just outputs a progress indicator (multiplied by 100 to show a percentage).
}
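The loop above calls the stemmer once per word, which can be slow for a large table. Since hunspell_stem() accepts a vector of words and returns a list with one character vector per word, the same "take the first stem, or NA if there is none" rule could be applied in a single pass. The sketch below uses a hand-made list as a stand-in for the stemmer output, so the stems shown are illustrative and not real hunspell results; `first_stem` is a hypothetical helper name.

```r
# Hypothetical helper: keep the first candidate stem for each word,
# or NA when the stemmer returned nothing
first_stem <- function(stem_list) {
  vapply(stem_list,
         function(s) if (length(s) > 0) s[1] else NA_character_,
         character(1))
}

# Hand-made stand-in for the list returned by the stemmer:
# one character vector (possibly empty) per input word
demo <- list("peak", character(0), c("stark", "starkly"))
first_stem(demo)

# In the real pipeline (assuming the hunspell package is loaded):
# WT2$stem <- first_stem(hunspell_stem(WT2$word))
```

This keeps the behavior of the loop, including the NA for words with no stem, while dropping the per-row progress printing.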

Finally, we can add the regions from the original dataset onto the tidy dataset and save the data object for future use.

WT2 <- WT2 %>%
  left_join(WildText %>% select(Wild, Region))

At this point, I like to save these data so we don’t need to reprocess every time we run the analysis.

# saveRDS(WT2, "./WildernessByWord.rds")
WT2 <- readRDS("./WildernessByWord.rds")

Analysis and Visualization

After all that effort of scraping text from the web and text mining, we can finally begin to take a look at our data. One of the first questions one might ask is: do managers tend to write longer descriptions for Areas depending on the region?

WT2 %>%
  group_by(Region, Wild) %>%
  summarise(WordCount = length(word)) %>%
  ggplot(aes(x = fct_reorder(Region, WordCount, median), col = Region, y = WordCount)) +
  geom_jitter(size = 3, alpha = 0.4, pch = 16, height = 0) +
  geom_boxplot(width = 0.3, alpha = 0.7) +
  scale_y_log10() +
  scale_color_manual(values = c("#008dc0", "#8ba506", "grey", "#bb5a00"))

Boxplot of words per Area showing that Areas in Alaska have more words on average than Areas in the West or East regions.

Descriptions of Alaskan Wilderness Areas tend to be longer than those for other regions. While descriptions for Areas in Hawaii and Puerto Rico also have a high median, there are too few to make strong comparisons.

Another common question is: which words are used most frequently? We can answer that most simply by looking for the words with the highest use counts across all descriptions.

WT2 %>%
  group_by(word) %>%
  tally() %>%
  arrange(desc(n)) %>%
  top_n(20)

Unsurprisingly, “wilderness” is the most frequently used word, by far. Factual descriptors like “feet”, “miles”, “elevation”, and “boundary” are common. Features like “mountain”, “trail”, “river”, “wildlife”, “desert”, and “forest” are also common.

Interestingly, no emotive adjectives make the list. John Muir would be disappointed! We can also pull out only the adjectives and count how often each one is the most-used adjective in an Area’s description.

WT2 %>% 
    filter(POS %in% c("JJ", "JJS", "JJR")) %>%
    group_by(word, Wild) %>%
    tally() %>%
    group_by(Wild) %>%
    filter(n == max(n)) %>%
    group_by(word) %>%
    tally() %>%
    arrange(desc(n)) %>%
    top_n(20)

Other than the prosaic “national” and direction words, we can see some descriptors of the sublime, like “steep”, “rugged”, “rocky”, and “wild”, that would make the transcendentalist progenitors of the wilderness concept proud.

To make the main figure for this project, I really wanted to use word clouds. Word clouds aren’t a great visualization for data interpretation, but they are a fun gimmick and make for interesting visuals!

Unfortunately, R doesn’t have the best support for generating fancy wordclouds. The package wordcloud is solidly reliable, but can only make very basic images. The package wordcloud2 allows for far more customization, but it is extremely buggy and requires some work-arounds on most systems.
I want the regional word clouds to conform to the Region’s shape. I made some simple shape images (here: East, West, Alaska) that we can pass to wordcloud2.

Depending on your system, the image may not populate in RStudio’s plotting window. If you don’t see the plot, try clicking the button to “show in new window” which will open a browser tab. Try refreshing the browser tab a few times until the wordcloud generates. Unfortunately, there is no way to constrain the aspect ratio of the cloud, so you will need to resize your browser window to a square. Then you can right-click and save the image. …like I said, wordcloud2 requires far too many work-arounds, but it is the only option for word clouds with custom shapes in R.

Below I’m showing the code for generating the Western region. It should be easy to alter for the other regions (all of the code is on github).

WT2 %>%
  filter(Region == "West") %>%
  filter(word != "wilderness") %>%
  count(word) %>%
  filter(n > 10) %>%
  wordcloud2(figPath = "./West.png",
             maxRotation = 0,
             minRotation = 0,
             color = "#bb5a00",
             fontFamily = 'serif')

Wordclouds for Alaska, West, and East regions.
I pulled the wordcloud images into Illustrator to make the final adjustments. (You could certainly do all of this in R, but I’m much faster at Illustrator and prefer to use a visual program for graphic design decisions.)

Because wordclouds are not useful for quantitative interpretation, I also want to make some bar charts of the most common words associated with each region. Again, I’m only showing code for the Western region below; all of the code is on GitHub.

WT2 %>%
  filter(Region == "West") %>%
  filter(stem != "wilderness") %>%
  count(stem) %>%
  arrange(desc(n)) %>%
  top_n(40) %>%
  ggplot(aes(x = fct_reorder(stem, n), y = n)) +
  geom_bar(stat = "identity", fill = "#bb5a00", width = 0.7) +
  coord_flip() +
  labs(y = "", x = "") +
  theme(axis.text.y = element_text(color = "#bb5a00"),
        axis.line.y = element_blank(),
        panel.grid.major = element_blank(),
        axis.text = element_text(family = "serif"))

Bar charts of the most common words used to describe Wilderness areas in the Alaska, West, and East regions.

Another question we could ask is: which words most distinguish regions from other regions? For example, “mountain” and “trail” are high frequency words in the Western region, but they also occur at high frequency in the Eastern region, as well. So, these terms don’t help us distinguish between regions. Instead we can estimate log ratios of word occurrence between regions. Log ratios conveniently scale symmetrically around zero. Greater absolute values indicate words particularly relevant to one region and smaller values indicate words that are equally relevant to both regions.

wordratios <- WT2 %>%
  filter(word != "wilderness") %>%
  filter(Region == "East" | Region == "West") %>%
  count(stem, Region) %>%
  group_by(stem) %>% # Group by stem so the filter below uses each stem's total count
  filter(sum(n) > 10) %>%
  ungroup() %>%
  spread(Region, n, fill = 0) %>%
  mutate_if(is.numeric, ~ (. + 1)/sum(. + 1)) %>% # Add 1 to every count to avoid dividing by zero (funs() is deprecated)
  mutate(logratio = log(East/West)) %>%
  arrange(desc(logratio))

wordratios %>%
  arrange(abs(logratio)) # Small log ratios indicate terms that are equally likely to be from East or West
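To see why log ratios scale symmetrically around zero, it helps to work through a tiny example by hand. The counts below are made up and chosen so that both regions have the same total number of words; the smoothing follows the add-one form used in the plotting code in this post.

```r
# Made-up counts, chosen so both regions have the same total
counts <- data.frame(
  stem = c("swamp", "desert", "trail"),
  East = c(50, 0, 50),
  West = c(0, 50, 50),
  stringsAsFactors = FALSE
)

# Add-one smoothing keeps a zero count from producing log(0)
share <- function(n) (n + 1) / (sum(n) + 1)

counts$logratio <- log(share(counts$East) / share(counts$West))
counts[order(counts$logratio), ]
```

In this toy case “swamp” and “desert” get log ratios of equal size and opposite sign, while “trail”, used equally in both regions, sits at exactly zero. With real data the regional totals differ, so a shared word lands near zero only when its share of each region’s words is similar.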

WT2 %>%
  count(stem, Region) %>%
  group_by(stem) %>%
  filter(sum(n) >= 10) %>%
  ungroup() %>%
  pivot_wider(names_from = Region, values_from = n, values_fill = 0) %>%
  mutate_if(is.numeric, list(~(. + 1) / (sum(.) + 1))) %>%
  mutate(logratio.EW = log(East / West)) %>%
  arrange(abs(logratio.EW)) %>%
  slice(1:20) %>%
  mutate(class = "mid") %>%
  bind_rows(
    WT2 %>%
      count(stem, Region) %>%
      group_by(stem) %>%
      filter(sum(n) >= 10) %>%
      ungroup() %>%
      pivot_wider(names_from = Region, values_from = n, values_fill = 0) %>%
      mutate_if(is.numeric, list(~(. + 1) / (sum(.) + 1))) %>%
      mutate(logratio.EW = log(East / West)) %>%
      arrange((logratio.EW)) %>%
      slice(1:20) %>%
      mutate(class = "west")
  ) %>%
  bind_rows(
    WT2 %>%
      count(stem, Region) %>%
      group_by(stem) %>%
      filter(stem != "roger") %>%
      filter(sum(n) >= 10) %>%
      ungroup() %>%
      pivot_wider(names_from = Region, values_from = n, values_fill = 0) %>%
      mutate_if(is.numeric, list(~(. + 1) / (sum(.) + 1))) %>%
      mutate(logratio.EW = log(East / West)) %>%
      arrange(desc(logratio.EW)) %>%
      slice(1:20) %>%
      mutate(class = "east")
  ) %>%
  ggplot(aes(x = fct_reorder(stem, logratio.EW), 
             y = logratio.EW, 
             col = class)) +
  geom_segment(aes(xend = fct_reorder(stem, logratio.EW), 
                   y = case_when(class == "west" ~ -0.1,
                                 class == "mid" ~ 0,
                                 class == "east" ~ 0.1),
                   yend = logratio.EW)) +
  geom_point(data = . %>% filter(class == "west"), aes(size = exp(abs(logratio.EW))), pch = 16) +
  geom_point(data = . %>% filter(class == "east"), aes(size = exp(abs(logratio.EW))), pch = 16) +
  geom_text(data = . %>% filter(class == "west"), aes(label = stem, y = 0), hjust = 0) +
  geom_text(data = . %>% filter(class == "east"), aes(label = stem, y = 0), hjust = 1) +
  geom_text(data = . %>% filter(class == "mid"), aes(label = stem, y = 0)) +
  coord_flip() +
  scale_color_manual(values = c("#8ba506", "grey70", "#bb5a00")) +
  theme(axis.text.y = element_blank(),
        axis.line.y = element_blank(),
        panel.grid.major = element_blank()) +
  labs(x = "",
       y = "Log ratio ('Uniqueness' of words for a region)",
       title = "Which words are most unique to a Wilderness?")


In the image above we can see the 20 most distinctive words that distinguish the Eastern from the Western regions. “Laurel”, “swamp”, “key”, and “bay” are characteristic of the East while “desert”, “lion”, “wash”, and “alpine” are almost exclusively used to describe Western areas. Words like “safety”, “coastal”, and “glimpse” are commonly used in both regions. Interestingly, the word “west” is used commonly in Eastern descriptions. I was also surprised to see that “lynx” and “glacial” are equally common in both regions.

Log ratios aren’t as useful when we have more than two groups to compare. In our case, where we want to compare multiple groups (i.e. West, East, and Alaska), we can find the words that most distinguish one region from the others by computing the term frequency-inverse document frequency (tf-idf). This metric is computed by multiplying a word’s frequency within a given “document” (its count divided by the total number of words in that document) by the log of the inverse of the share of “documents” that contain the word. In this case, we treat all descriptions from a region as a single “document”. Similar to log ratios, tf-idf lets us know how relevant a term is to a given document, but allows us to compare across many documents.

WT2 %>%
  filter(Region != "Island") %>%
  filter(!is.na(stem)) %>%
  filter(stem != "roger") %>%
  count(Region, stem, sort = TRUE) %>%
  bind_tf_idf(stem, Region, n) %>%
  group_by(Region) %>%
  top_n(10, tf_idf) %>%
  arrange(desc(Region), desc(tf_idf)) %>%
  print(n = 40)
# A tibble: 32 x 6
# Groups:   Region [3]
   Region stem           n       tf   idf   tf_idf
   <chr>  <chr>      <int>    <dbl> <dbl>    <dbl>
 1 West   desert       533 0.00852  0.405 0.00346 
 2 West   wash         125 0.00200  1.10  0.00220 
 3 West   bighorn       95 0.00152  1.10  0.00167 
 4 West   mesa          90 0.00144  1.10  0.00158 
 5 West   golden        79 0.00126  1.10  0.00139 
 6 West   creosote      78 0.00125  1.10  0.00137 
 7 West   fir          203 0.00325  0.405 0.00132 
 8 West   sagebrush     73 0.00117  1.10  0.00128 
 9 West   tortoise      71 0.00114  1.10  0.00125 
10 West   badlands      63 0.00101  1.10  0.00111 
11 East   laurel        38 0.00180  1.10  0.00198 
12 East   hardwood      90 0.00427  0.405 0.00173 
13 East   key           68 0.00323  0.405 0.00131 
14 East   oak           66 0.00313  0.405 0.00127 
15 East   illustrate    20 0.000949 1.10  0.00104 
16 East   swamp         53 0.00251  0.405 0.00102 
17 East   logging       37 0.00175  0.405 0.000712
18 East   maple         34 0.00161  0.405 0.000654
19 East   turkey        34 0.00161  0.405 0.000654
20 East   branch        33 0.00157  0.405 0.000635
21 Alaska fjord         17 0.00243  1.10  0.00267 
22 Alaska anchorage     11 0.00157  1.10  0.00173 
23 Alaska tundra        28 0.00400  0.405 0.00162 
24 Alaska prince         9 0.00128  1.10  0.00141 
25 Alaska wale           9 0.00128  1.10  0.00141 
26 Alaska chuck          8 0.00114  1.10  0.00125 
27 Alaska whale         20 0.00286  0.405 0.00116 
28 Alaska lion          17 0.00243  0.405 0.000984
29 Alaska alpine        16 0.00228  0.405 0.000926
30 Alaska cook           5 0.000714 1.10  0.000784
31 Alaska frigid         5 0.000714 1.10  0.000784
32 Alaska warren         5 0.000714 1.10  0.000784

Words like “desert”, “wash”, “bighorn”, and “mesa” are highly indicative of the West. The East is described most distinctly by its plant species (“laurel”, “oak”, “maple”) and by terms like “key”, which refers to the islands in the southeast. Alaska is dominated by intuitive terms like “fjord”, “tundra”, and “alpine” and sea animals like “whale” and sea “lion”. Place names also rise to high relevance for Alaska with terms like “Anchorage”, “Prince of Wales Island”, and “Cook Inlet”.
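The idf column in the table takes only two values, and a short calculation shows why. With three regional “documents”, a word that appears in just one region gets an idf of log(3/1), a word that appears in two regions gets log(3/2), and a word used in all three gets log(3/3), which is zero, so it can never appear among the top terms. The sketch below reproduces those values with the natural log and checks the “desert” row using the rounded tf printed in the table, so the product matches only to rounding.

```r
n_docs <- 3  # West, East, and Alaska each count as one "document"

# idf = log(number of documents / number of documents containing the word)
idf <- log(n_docs / c(one_region = 1, two_regions = 2, all_regions = 3))
round(idf, 3)

# tf-idf for "desert" in the West: the printed tf times the idf for a word
# found in two of the three regions
0.00852 * log(3 / 2)
```

This also explains why words unique to one region, like “wash” or “fjord”, can outrank far more frequent words: their idf is roughly 2.7 times larger.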

When plotting differences between East and West in log-ratios, above, it made sense to use a diverging bar graph (or lollipop graph, specifically). But with more than a two-way comparison, visualization gets more complicated.

After a few iterations, I settled on visualizing the most distinctive terms (i.e. terms with highest tf-idf) as growing larger from a common point. I accomplished this by wrapping the plot around a circular coordinate system. Terms that are larger and further from the axes of other regions are more distinctive to the focal region.

Before plotting, I also remove the words associated with proper noun place names.

WT2 %>%
  filter(Region != "Island") %>%
  filter(!is.na(stem)) %>%
  filter(!stem %in% c("roger", "anchorage", "prince", "wale", "cook", "warren", "admiralty", "coronation")) %>%
  count(Region, stem, sort = TRUE) %>%
  bind_tf_idf(stem, Region, n) %>%
  group_by(Region) %>%
  top_n(30, tf_idf) %>%
  ungroup() %>%
  mutate(ordering = as.numeric(as.factor(Region)) + (tf_idf*100),
         stem = fct_reorder(stem, ordering, .desc = FALSE)) %>%
  mutate(tf_idf = tf_idf * 100) %>%
  ggplot(aes(x = fct_relevel(Region, c("East", "West", "Alaska")), label = stem, y = tf_idf, col = Region)) +
  geom_point() +
  scale_color_manual(values = c("#008dc0", "#8ba506", "#bb5a00")) +
  scale_y_log10(limits = c(.025, 0.35)) +
  coord_polar() + # Wrap the plot around a circular coordinate system
  geom_text_repel(aes(cex = tf_idf), max.overlaps = 100, family = "serif", segment.linetype = 0) +
  theme(panel.grid.major.x = element_blank(),
        axis.line = element_blank(),
        axis.text = element_blank()) +
  labs(x = "", y = "")

Radial plot with more unique words on the outside.

With all of the components created, I pulled everything into Illustrator to design the final infographic.

Infographic depicting the most common and most unique words used to describe US Wilderness Areas
This is the final product visualizing descriptions of Wilderness Areas in the US.

There is certainly a lot more that one could learn from these data. For instance, do descriptions differ by management agency? Would we find stronger divergence in language used to describe North versus South regions in the lower 48 rather than East-West? Nevertheless, this was a useful exercise for learning a bit more about web scraping, text mining, and word analysis.

Stories of Subsistence in Wilderness – Inian Islands https://www.azandisresearch.com/2022/02/19/stories-of-subsistence-in-wilderness-inian-islands/ Sat, 19 Feb 2022 17:01:49 +0000 https://www.azandisresearch.com/?p=2036

Scroll to the end for the photo gallery.
— I originally wrote this post for the Sitka Conservation Society’s website in 2014. This trip was part of the Sitka Community Wilderness Stewardship Project.

The day we headed out from Hoonah was like most days in Southeast Alaska. Grey clouds diffused the light and an almost imperceptible rain left everything damp. We were headed to the Inian Islands, a cluster of knobby isles on the western end of Icy Strait, just inside the entrance to Cross Sound where the Inside Passage meets the angry Pacific. Our trip held a dual mission: to conduct volunteer wilderness monitoring for the Forest Service and to gather traditional subsistence foods for the Hoonah locals on the trip: Owen James and Gordon Greenwald, our boat captains and wizened culture-bearers, two young men named Randy and Sam, and another adult volunteer, Kathy McCrobie.

The Inians along with two other large islands make up the Pleasant/Lemesurier/Inian Island Wilderness. The PLI Wilderness is one of 19 areas within the Tongass National Forest designated as Wilderness, the highest form of protection public lands can receive. The islands are also historic gathering and hunting grounds of the Huna Tlingit, the native tribe who call this section of northern Southeast Alaska home. Because the Inians are close to the open sea, they are rich with unique flora and fauna. A trip to these distant islands is an opportunity to collect delicacies not common in interior waters near the town of Hoonah. For instance, one of our subsistence targets was black seaweed, a species that thrives in the cold, wave-washed intertidal zone of the outer coast, but is rarely found more than a few miles into the Southeast archipelago.

The outside waters can be a harsh place in the summer and downright inhospitable in the winter. Although the Huna Tlingit are seasoned open ocean travelers and motorized skiffs make the 40-mile journey from the village of Hoonah to the islands much more manageable than a Tlingit canoe, it is still a sizable trip for locals. The same factors—difficult access and a short season—also make it difficult for the Forest Service Wilderness Rangers who are headquartered in Hoonah to access these areas that they are tasked with managing and protecting.

On the first day of our trip we arrived at the Inian Islands after a few hours of skiffing over unusually calm waters. Our first stop was at low tide on a rocky beach, the perfect habitat for Black Katy chitons, a traditional food commonly called gumboots, which we hoped to bring home.

The beach also looked like it could be a prime camping area, so while the rest of the crew flipped rocks and pried unsuspecting chitons from their hiding spots, I headed up the beach to look for recreational impacts. Monitoring impacts from visitors is one of the tasks the Forest Service has asked us to assist with. Wilderness areas are intended to preserve nature in its wildest state, but trash, campfire rings, and other signs of previous visitors detract from the wild character of these places. Also, once a site has been impacted, the trend is a downward slope to a trashed site. To prevent cumulative impact, we check known campsites and clean up and naturalize any human traces we find. Fortunately, this site was in the same condition it’s probably been in since it was uncovered by the glacier, so I spent some time flipping rocks and adding to the gumboots collection.

As the tide neared its apex, Gordon pointed out a small rock island set apart from the larger Inian Islands. For generations, this rock had been the prize destination for Huna families. Set far from land and too small to support trees, the rock is the perfect nesting grounds for seabirds like gulls and cormorants. We had timed our trip perfectly to harvest the new eggs.

As we approached the rock in skiffs, Gordon and Owen explained the protocol: as the swell surges, the captain runs the skiff up to the rock, one person jumps off, and the captain pulls the bow away before the swell drops the boat onto the shore. Then he resets and we try again for the next person to leap from the bow onto the island. Before they maneuvered the skiffs toward the rock, they carefully taught the boys the traditional method to appropriately harvest the eggs. If done in an ecologically responsible way, these practices will be able to continue forever. (Learn more about the regulations regarding egg collection by Alaskan Natives and locals for subsistence).

Sam was the first to make the jump. The birds immediately erupted in a cacophony of squawks and feathers. Randy and I traded apprehensive glances. I made an excuse that I needed to pack my camera gear in drybags before I could jump…really I just wanted one more chance to see how it was done. Randy landed an impressive leap, despite receiving a bootful of water. I followed him up the rock.

Blankets of birds flapped above us. The few green tufts of grass made a stark contrast to the guano-bleached stone and the blue-grey sky and water. It took no time for Sam and Randy to collect plenty of eggs to share with family and elders back in Hoonah. With concentration, steady boat handling, and good timing, we all made it safely back aboard the skiffs.

As the day went on, I was impressed with the way Owen and Gordon pointed out new landmarks to the two young men. Every remark about a headland or bay included not only geographical references, but also historical, cultural, and subsistence context. That night, while we ate chowder made with local salmon, smoked octopus, and cockles, I reflected on the education Randy and Sam had inherited on this trip. I have no doubt that they were more interested in learning about hunting spots, edible shellfish, and traditional stories than they were about the Wilderness land designation of their home. But, I would like to think that by relating the cultural values and subsistence practices of the Inian Islands along with the Wilderness values that will continue to protect this place for those practices, they have a better chance of retaining a favorable perspective of public lands, too. In the end, the idea and values of Wilderness are stories, stories that must be repeated and retold to maintain their relevance. Gordon and Owen have endeavored to pass those stories to Hoonah youth. My esteem and thanks go out to them for including the value of respect for public lands in the stories they tell.

 

Chasing Arctic Frogs https://www.azandisresearch.com/2021/08/17/chasing-arctic-frogs/ Tue, 17 Aug 2021 19:13:54 +0000 http://www.azandisresearch.com/?p=1905

A short recipe for adventurous field science


Step 1: Come up with a harebrained scheme.

My labmate Yara and I had been dreaming up the idea of studying wood frog genomes from across the species’ range since she started her PhD. Wood frogs have the largest range of any North American amphibian. They also happen to be the only North American amphibian that can survive north of the Arctic Circle.

Our 200-mile route (in orange) from the headwaters of the Ambler River in Gates of the Arctic National Park, down the Kobuk River through Kobuk Valley National Park Wilderness, and out to the village of Noorvik where the Kobuk meets the Arctic Ocean.

Dr. Julie Lee-Yaw had done a similar study back in 2008. She embarked on a road trip from Quebec all the way up to Alaska to collect wood frog tissue. So, our first step was to ask Dr. Lee-Yaw if she would collaborate and share her samples.

Those samples gave us a solid backbone across the wood frog range, but we were missing populations in expansive regions north and west of the road systems. We worked with the Peabody Museum to search for tissue samples that were already housed in natural history collections around the world. We filled a few gaps, but huge portions of the range were still missing.

 

We knew that there must be samples out there sitting in freezers and labrooms that were not catalogued in museum databases. So, our next step was to begin sleuthing. We looked up author lists from papers and cold-called leads. I even reached out to friends on Facebook (…which actually turned out to be a big success. The aunt of a friend from undergrad happens to do herpetology research in Galena, Alaska and was able to collect fresh samples for us this year!). This effort greatly expanded our sample coverage with new connections (and friends) from Inuvik and Norman Wells in the Northwest Territories, Churchill on the Hudson Bay, and the Stikine River Delta in Southeast Alaska.

But as the points accumulated on the map, we noticed some glaring holes in our coverage. Most importantly, we had no samples from Northwestern Alaska. Populations in this region are the most distant from the ancestral origin of all wood frogs in the southern Great Lakes. If we wanted a truly “range-wide” representation of wood frog samples, we needed tissue from that blank spot on the map!

Step 2: Convince your advisor and funders it’s a good idea.

This might be the hardest step. In our case, Yara and I were lucky that our advisor, Dave, was immediately supportive of the project. After we made the case for the importance of these samples, funders came around to the idea as well.

Step 3: Make a plan …then remake it …then make a new plan yet again.

Once we knew where we required samples from, we needed to figure out how to get there. Alaska in general is remote, but northwestern Alaska is REALLY remote. The road system doesn’t stretch farther than the middle of the state. All of the communities–mainly small villages–are only accessible by plane, and most of them only have runways for tiny prop planes. Traveling out from the villages into the bush is another layer of difficulty. Most people here either travel by boat on the river or by snowmachine during the winter. Traveling on land, over the soggy and brush-choked permafrost, is brutal and most locals only do it when necessary, if at all.

Prior to academia, I made a career of organizing expeditions to the most remote places in the rugged southeastern archipelago of Alaska. Despite my background, the logistics in the Arctic were inscrutable even to me. Fortunately, I had a couple of friends, Nick Jans and Seth Kantner, who know the area well. In fact, Seth grew up in a cabin out on the Kobuk. (Seth and Nick are both talented authors. I suggest checking out Ordinary Wolves by Seth and The Last Light Breaking by Nick). With their help, I was able to piece together the skeleton of a trip.

After many logistical iterations, Yara and I decided to follow in the footsteps of local hunters who, for generations, have used the rivers as conduits into the heart of the wilderness. Our plan was to travel down one of the major arterial rivers and hike inland to search for frogs as we went.

Our original itinerary was to raft the 100-mile section of the Kobuk River from just north of Ambler village to the village of Kiana. But at the last minute (literally), our plans changed. As we were loading up the plane, the pilot told us that he couldn’t fly into our planned starting point. Instead, he suggested that we fly into a gravel bar 30 miles upriver in Gates of the Arctic. Those “30 miles” turned out to be AIR MILES. Following the river, the change ended up adding over 60 miles to our trip.

 

We packed two inflatable oar rafts, almost 150 pounds of food, and another 300 pounds of camping, rescue, and science gear, into the balloon-wheeled plane. For the next two weeks, we rowed down the swift Ambler River from the headwaters to the confluence of the Kobuk. Then, we rowed down the massively wide and meandering Kobuk River, eventually extending our trip by an additional 30 miles, by-passing Kiana, and continuing to Noorvik, the last village on the river.

Step 4: Recruit a crew.

Although Yara, the world’s first and only Saudi Arabian Arctic ecologist, had limited camping experience, I knew she would be a stellar field partner. But I never like traveling in brown bear country with fewer than four people. Plus, expedition research involves too many daily chores for the two of us to manage alone. So, we recruited a team.

Sam Jordan is a dryland ecologist, but he had been willing to help me with my dissertation fieldwork in wetlands before, so I knew he would be willing to defect for a good adventure. Sam is also an exceptional whitewater paddler and all-around outdoor guru. Plus, he’s just a great guy (when he leaves his banjo at home). He and I spent two weeks floating the Grand Canyon in the dead of winter, and there are few people I would rather have along on a remote river trip.

Kaylyn Messer and I guided sea kayak expeditions in Southeast Alaska back in our youth. I am a bit particular about how I manage my camp system (read: “extremely picky and fastidious to a fault”) on big trips. Kaylyn is one of the few people as scrupulous as me, but she’s also a super amenable Midwesterner at heart. I knew she’d be a huge help out in the field.

We fell into an effective rhythm on the trip. Each morning we woke, made breakfast, broke camp, packed the boats, and launched early in the day. While one person on each boat rowed, the other person checked the maps for frog surveying spots, fished, or photographed. We stopped along the way to bushwhack back into wetlands we’d identified from satellite images. We typically arrived at camp late. Yara and I would set up one tent to process the specimens from the day while Sam and Kay made camp and cooked dinner. One of the hidden disadvantages of 24-hour Arctic sunlight is that it is easy to overwork. Most nights we only managed to get samples finished, dinner cleaned up, and camp bearproofed with enough time to crawl into tents with just eight hours till beginning again the next day.

Step 5: Do the science.

Doing science in the field is difficult. Tedious dissections seem impossible while baking in the omnipresent sun and being alternately hounded by hundreds of mosquitoes or blasted by windblown sand. Trading lab coats for rain jackets and benchtops for sleeping pads covered in trashbags compounds the trouble. Not to mention, keeping tissues safe and cool. Organization and adaptability go a long way.

On remote, self-supported trips, it is inevitable that equipment fails or is lost. On one of the first days, we discovered that our formalin jar was leaking—and formalin is not something you want sloshing around! We cleaned the boats and found a creative solution to replace the offending container: a 750ml Jack Daniel’s bottle!

Planning ahead and engineering backup plans also help. One of our main struggles was figuring out how to preserve specimens and get them home. It is illegal to ship alcohol by mail and you can’t fly with the high-proof alcohol needed for genetic samples. You can ship formalin, but it is difficult to fly with. To make matters worse, we were flying in and out of “dry” or “damp” villages where alcohol is strictly regulated or forbidden. Also, we happened to be flying out on a Sunday, making it impossible to mail samples home. The solution we arrived at was to ship RNAlater and formaldehyde to our hotel room ahead of time. Tissue would remain stable in RNAlater for a couple of weeks and we could make formalin to fix the specimens. After fixing, we cycled the specimens through water to leach out the formalin. This made it possible for me to fly with all of the tissue tubes and damp specimens in my carry-on. Other than a few concerned looks from the TSA folks, all of the samples made it back without issue!

Step 6: Enjoy the adventure.

Despite the hard work, there was a lot to appreciate about the Arctic. We witnessed major changes in ecology as we travelled from the steep headwater streams in the mountains to the gigantic Kobuk. Every day was an entirely new scene.

 

Step 7: Forget the hardships.

Looking back, it is really easy to forget the sweltering heat, swarms of mosquitoes, inescapable sun, and freak lightning storms. And, it’s probably better to forget those anyway!

 

]]>
1905
Expedition: South Prince of Wales Wilderness https://www.azandisresearch.com/2018/10/13/expedition-south-prince-of-wales-wilderness/ Sat, 13 Oct 2018 22:29:06 +0000 http://www.azandisresearch.com/?p=759
Wayne looking south over the Barrier Islands into the Dixon Entrance.

From my field notes, 17 August 2018:

“Last night I slept on the flat bench of duff-covered ground at the base of the ‘camp tree,’ a white and skeletal cedar with the characteristic axe marks from decades-past Tlingit or Haida woodsmen who had notched the bark from the underhanging side of the leaning tree trunk. As Wayne, our on-hand anthropologist, explained, the notched wood of the tree would have died and dried in the rainshadow of the trunk, making for ready tinder to be harvested the next season when the camp was reoccupied. The overhanging tree would have also provided the Native campers with shelter for fire or lean-to bivouac, one of the same purposes for which I utilize it today.

As I fell asleep under the history-laden tree, I dreamed of a boat at anchor in high winds. As the gales tugged at the boat, the anchor dug into the muddy bottom and the rode strained taut to the windlass bolted to the foredeck. But the wind was strong, and the windlass was pulling from the deck boards and the rode was splitting. With each gust, the planks screamed “Eeeeerrrrck!”

A few more heavy gusts would rend the boat from anchor, setting it adrift in the turbulence. Another gust and a louder “Eeeerrrrck!”

In my dreamy torpor I was frozen, I could do nothing but watch.

“EEEEEERRRRCK!” and I watched the windlass pull loose, hanging by a single bolt.

Just before the final fastener snapped, I woke up to the sun shining sidelong under my tarp and straight into my eyes. I could still hear the creaking Eeeeerrrck sound out in the cove and realized it was the agitated call of a belted kingfisher. The sound of the gusts in my dream had been the sound of light swell washing the beach.”

Early morning light in Klinkwan Cove.

One of my favorite quotes about Wilderness is from former Senator Clinton Anderson who said, “Wilderness is an anchor to windward. Knowing it is there, we can also know that we are still a rich nation, tending our resources as we should—not a people in despair searching every last nook and cranny of our land for a board of lumber, a barrel of oil, a blade of grass, or a tank of water.” No doubt, I must have had that quote on my mind as I fell asleep on the third night of our three week expedition in South Prince of Wales Wilderness.

South POW is one of the many forgotten Wilderness areas of Southeast Alaska, overshadowed by the iconic ice-filled fjords like Tracy Arm or Glacier Bay, and tourist hotspots like Misty Fjords. Nevertheless, it can easily compete in scale and dynamic topography, and what it lacks in defining photo-geniality, it more than makes up for in its isolation and profound solitude.

Unloading at our southernmost camp in the Barrier Islands.
Sunset in Klakas Inlet.
Kim investigates two giant Lion’s Mane jellyfish.

At just under 90,000 acres, South POW is the 10th largest and the southernmost (narrowly outcompeting the southern point of Misty Fjords) of the twenty-four Wilderness Areas in Southeast Alaska. As its name implies, the designation was carved from the southern portion of Prince of Wales Island, including the watersheds draining into Klakas Inlet, all watersheds to the south that drain to Cordova Bay, and the Barrier Islands. Encompassed in the boundary is a labyrinth of convoluted (some might say fractal-like) shoreline. This is dynamic and magic country. It is intricate and you need to see it intimately, by kayak, or even better, by foot. Inlets and passages otherwise hidden from sight appear as if by incantation, the trees parting to a kayak-width channel only when you paddle immediately beside.

From my field notes, 18 August 2018:

“From our southernmost camp, we paddled out to the most seaward rocks and islands. Out here on the outside waters, among the breakers, expansive kelp beds, and scattered, battered islands with rocky headlands. White breaking water sparkled in the sun with each swell, but the water was otherwise calm. All of the branches of the few trees on these islands point leeward, like frozen weathervanes recording past winter storms originating from the Dixon Entrance to the south.”

Exploring the wave-washed outer islands and seastacks.
Early morning at our camp in the Barrier Islands.
Looking out into the Dixon Entrance and the open Pacific.

From my field notes, 19 August 2018:

“After exploring the Barrier Islands, we spent time tracing the coastline of the main island, which is punctuated by deep, multi-chambered inlets, bays, and saltchucks. For instance, a side channel in Hunter Bay opens into Biscuit Lagoon. At the back of Biscuit Lagoon, the Saltchuck opens at high water above a tidal waterfall. Following the shoreline is like walking through a labyrinth with new passages appearing and opening into whole new habitats.

We paddled up to the base of the Saltchuck where salmon were preparing to run at high water. Wayne, whose eyes always seem to catch any inorganic shape on the landscape, noticed that beneath our tethered kayaks, below the waterline, rocks had been arranged in a line. He reasoned that at lower water the stones would cordon off a pool. The Native fishermen would have either trapped salmon in the pool or used it as a holding pen for their catch pulled from the falls.”

It takes an early morning to see sunrise from the kayak in Alaska.
Great blue heron
Aaron admires a massive red cedar.

The Wilderness is a mélange of habitats from wave-washed coastal shores, to glass-still secluded bays, upland muskegs, lowland salal thickets, and rich estuaries throbbing with activity. The varied habitats granted us countless wildlife sightings, including a few once-in-a-lifetime encounters.

From my field notes, 19 August 2018:

“As we ate lunch next to the falls and pondered the rock wall, Kim noticed two wolves trotting up the shoregrass upwind from us. When the wolves saw us, the first bolted. The curiosity of the second got the best of him. With the wind preventing any scent information, he boldly came up to us for a close look. Satisfied, he trotted back away only to decide he still needed a closer look. This process repeated, with us standing like stones, the click of my camera shutter the only sound, until he dematerialized into the forest.”

The endangered Alexander Archipelago wolf.

The Alexander Archipelago wolf is a controversial subspecies (Canis lupus ligoni) of North American wolf that occurs only in the islands and mainland of Southeast Alaska. The wolves are absent from the ABC islands, which are dominated by brown bears, but their range is largely congruent with that of black bears in Southeast. No fossil evidence of the wolves exists, suggesting that they colonized the region after the last glacial maximum.

Goldman (1944) was the first to describe the subspecies and gave it the name Alexander Archipelago wolf. A more recent molecular study (Weckworth et al. 2005) corroborated the original distinction, finding that the Southeast Alaska population was genetically distinct from the continental population and was itself highly structured and diverse. That same study found evidence that the Prince of Wales population formed its own unique and isolated subgroup within Southeast Alaska. Having spent a fair amount of time with wolves in Glacier Bay and Gustavus, our crew had guessed this was the case. In fact, it took us a split second to even realize we were staring down a wolf when it appeared–it was so dissimilar from the canids we had encountered up north. These island wolves appeared smaller, more dog-like, and with an unusual coat pattern.

Nate investigates the main entrance of a wolf den.
The interior anteroom of the den. Three separate tunnels extend further back.
The ground has been worn down from wolf traffic. Animal remains and scat litter the ground.

The biogeography of the Prince of Wales wolves puts them in a particularly perilous position. It is a general rule in ecology that the smaller and more isolated a population, the more sensitive it will be to environmental and demographic fluctuations (for example, the case of Isle Royale wolves). Although Prince of Wales is an exceptionally large island (the 4th largest in the U.S., just after Puerto Rico), the landscape itself has been highly fragmented and reduced by massive clearcuts and the most extensive road system in Southeast Alaska (about 2,500 miles; more roads than the rest of Southeast combined, many times over).

Roads are a major problem for POW wolves (even more so than for most wildlife). Studies across all U.S. populations of wolves have shown a negative correlation between road density and wolf abundance. Wolves were absent where road densities exceeded 0.9 mi/mi² (Jensen et al. 1986, Mech et al. 1988, Fuller 1989, as cited in Schoen and Person, n.d.). Most of the roads on POW were created by logging companies (on taxpayer dollars), which creates a strong correlation between roads and clearcuts, both of which are avoided by wolves (Person 2001).

Clearcuts and active logging on Prince of Wales Island from the plane as we flew in to Klawock.
An active logging operation just outside of the Wilderness boundary. Despite a net loss for taxpayers, the Forest Service insists on developing large timber sales for clearcutting.
This clearcut, visible from within the Wilderness, was cut within the last decade. Poor planning resulted in large landslides. It will take a century or more for soils to build up on these slides and allow for forest regeneration.

Black-tailed deer, the wolves’ primary prey, require old-growth forest for summer browse and winter habitat to shelter from heavy snow. Clearcutting areas leads to a “succession debt” for wolves as logging removes habitat which removes deer. And logging continues on the island, driving the succession debt further into the red. In fact, habitat loss from logging alone is expected to reduce the wolf population by a further 25% before 2045 (Person 2001).

But habitat loss is not the only threat to POW wolves. Despite Canis lupus being listed as an endangered species in the lower 48, wolves are considered both big game and furbearers in Alaska, subjecting them to both hunting and trapping. Between the 1980s and 2010, around 175 wolves were killed in Southeast Alaska; 75 of those were from POW (Schoen and Person, n.d.).

As logging and associated road infrastructure continue to proliferate, “wolf populations on Prince of Wales and adjacent islands will be caught between two significant pressures: declining prey abundance and increasing hunting and trapping mortality” (Schoen and Person, n.d.).

The South POW Wilderness, at least, is a partial refuge for the wolves. While Wilderness designation prevents future roads and logging, it does not exclude hunting or trapping; so, there really is no safe haven for these unique canids. My hope is that the isolation and extreme difficulty of hunting here compared to the road system will keep it to a minimum.

Considering the rarity of kayaks in this area, it is easy to imagine the confusion of this bear upon seeing us.
Migrating sandhill cranes provided early-morning alarm clocks with their rusty, grating calls almost every day at dawn.
A family of river otters scurries out of the intertidal zone.
Tiny Sitka Black-tail deer are the only large herbivores on Prince of Wales Island.
Tiny nudibranchs are one of the most colorful creatures in the kelp, but there are countless invertebrates and small fish if you sit and observe for long enough.
Belted Kingfishers seemed to swarm the coastline. I’ve never seen so many.
Nate paddles alongside a solo Humpback whale.

In addition to the wolves, we came nose-to-nose with loads of wildlife, including sandhill cranes, black bears, whales (alive and bones), and so many kingfishers and river otters.

From my field notes:

“More kingfishers than I’ve ever seen! The shoreline seems like it is just dripping with kingfishers from every low-hanging branch. And below, the tidal rocks are writhing with families of river otter. It must be a productive place for small carnivores and piscivores!”

We spent a large chunk of our time in Klakas Inlet, the long fjord that dominates the northern reach of the Wilderness. Although not too steep, the shoreline harbors very few beaches along the flanks. One exception is a grassy cove sheltering a pink and sockeye waterfall. We spent an entire day watching black bears filter through, try their luck with the flying salmon, and continue their up-fjord journey.

The density of black bears in Klakas was astounding. We paddled to the head of the inlet, where the fjord splinters into three massive estuaries, each full of late-running salmon. At one point, I could see 7 bears around the shore. They circled the shoreline of the fjord almost like a slow and dispersed school of fish circling a pond.

From my field notes, 27 August 2018:

“In the morning I watched multicolored bat stars pass beneath my kayak. Their five to six bright arms were like a dappled rainbow fluorescing against the dark water. Watching black bears pass by on shore is like the inversion of the bat stars. The dark fur seems to absorb light and looking for bears is more about looking for the absence of a bear or a bear shaped black hole in the otherwise colorful shoreline.”

Black turnstones, a type of sandpiper, wait for the tide to overtake their perch. Interestingly, the Audubon field guide lists five collective nouns for a group of sandpipers: a bind, a contradiction, a fling, a hill, or a time-step.
The head of Klakas Inlet was full of jumping salmon ready to make a late-season run up the streams thanks to heavy rains following an unusually dry summer.
We watched this bear for a couple of hours and never saw him catch a single fish.
A sow and cub black bear waiting for a male to vacate his position on the waterfall.
The changing of the guard.
It almost seemed as if the salmon were taunting the bears with their acrobatics.
Like a kid salivating over candy through a store window, this black bear longingly watched salmon swim just out of reach.
From the head of Klakas Inlet looking south toward the mouth.

From my field notes, 21 August 2018:

“On every trip, we try to make a point of hiking uphill, above the treeline, in order to look back down on the landscape. Experiencing a Wilderness is a lot like experiencing a painting. It takes time. You have to shift your focus and view it from different angles to take it all in. Look close, then step back. Put your nose right up to the canvas, then from across the room. In wild places, you have to paddle close to shore, get out and crawl through the brush of the forest, then get up high to see the whole thing. The detail and the composition are the functional integral of artistic mastery; same goes for the majesty of nature.”

So, we set our sights on a bald knob at the mouth of Klakas Inlet and landed below a steep, even-aged stand of spruce and hemlock. We had envisioned a strenuous, patience-trying hike over old-growth deadfall and bashing through endless understory brush. It turned out that the stand had been beach logged many decades ago. The regenerating forest was the perfect age to shade out the understory brush, but too young to accumulate much deadfall. It turned out to be one of the easiest routes we’d ever hiked. Surprisingly, there was not even brush at the transition from the forest to the subalpine bald, so we walked out into the sun and onto the rocks. From the top, we were treated to views of the entire southern half of Cordova Bay, up into Klakas Inlet and the ridges of the surrounding watersheds. In short, we could see almost the entire Wilderness area and the saltwater well out into the Pacific.

The pools on the bald were ringed by bright red sundews, a carnivorous plant that produces a sticky, sweet digestive enzyme from the tips of hairs on its leaves. Bugs are attracted to the scent, land on the leaf, then wrap themselves up in the leaf as they struggle and adhere to more leaf hairs.

I see the round-leaved sundew (Drosera rotundifolia) often, but I was surprised to see that the English sundew (Drosera anglica) was more common and growing right alongside the D. rotundifolia.

Although it was steep, the hike up to the alpine was one of the most pleasant bushwhacks in Southeast Alaska.
Kim looks out over the coastline we’d spent the past couple of weeks exploring.
Looking south over southern Cordova Bay with the Dixon Entrance and the open Pacific in the distance.
A panoramic view of our lunch spot.
Drosera anglica, the English sundew, is a carnivorous plant that loves the wet, sunny habitat around alpine pools.
The crew, transfixed by sundews and dragonflies.
Nap time in the alpine.
Rain-fed pools in the subalpine can hold water for long periods between rain events.
Stumps from hand-logged trees. In the days before industrial logging, foresters would cut notches in either side of the tree and insert a plank platform called a springboard (you can see the notch in the center stump). They would then climb up onto the springboards on either side of the tree and swing axes or pull a crosscut saw to fell the tree.
Understory in a second-growth forest.

From my field notes, 22 August 2018:

“The Native Haida and Tlingit folks who frequented this region made ample use of the fractal coastline. In almost every landable spot, Wayne noticed some trace of human occupation. Some hints were subtle, like notches in trees or new forest growth indicating an area that had been cleared. Occasionally, the signs were more obvious, like the square foundations of old plank houses subsumed into the ground, and even pilings supporting old floor joists, now fixed in place by the roots of saplings growing atop them. The debris of stoves and other iron objects was sunk into the moss of a few such sites.

Wayne pointed out an iron ax head—the same type of ax head used to make the tree notches we’d been seeing all over the coast.”

A cleared area from an old village site.
A cedar ‘camp tree’ with tell-tale ax marks and fire scars.
A Haida home site with floor joists and pilings still in place.
Wayne points out an ax head, the same type used to notch the numerous ‘camp trees’ throughout the Wilderness area.
A casual evening in camp.
Thanks to Nate’s tireless tending to the solar panel and battery, all our GPSs, phones, and communications equipment lasted the entire trip.

After 16 days, we left the Wilderness toward Hydaburg. After days of calm waters and avoiding the oppressive sun, we found ourselves preparing for some exposed sections and large crossings in big water. The water shut us down and sent us back to camp for an extra night. We made progress despite lots of waiting and watching as the conditions fluctuated. We made our longest crossing of Cordova Bay in building, following seas. About midway across, we spotted black fins slicing the water miles away. As we thrashed onward on our heading, we watched the pod of orca surface and dive, rapidly closing the distance between us. The bull crossed comfortably forward of our bows, but the four females intersected our course. They surfaced with explosive spouts close enough to startle us. We exchanged some wide-eyed glances of concern while the females cavorted amidst our kayaks until they melted back into the water and joined the male far off to starboard.

From my field notes, 30 August 2018:

“South POW is a palimpsest. The page has been written by glaciers, over-written by many chapters of ecological succession, punctuated by interludes of logging, and a final chapter back to succession, all with the footnotes and interjections of Haida and Tlingit history throughout.
Now, having departed the Wilderness Area, our time there seems even quieter in contrast. Clear cuts appear over every ridge and we hear the whining buzz of outboards every few tens of minutes as boats zip in and out of Hydaburg.”

We paddled into Hydaburg on a quiet mid-morning diffused with grey. The brightly painted poles and homes seemed like holes pricked into the grey blanket of the day.

We found the folks in Hydaburg incredibly welcoming. We were invited into smokehouses and carving sheds, folks told us stories, and we even made canine friends after saving a dog from drowning at the dock.

Overall, our South Prince of Wales expedition will remain one of my favorites. The immense quiet and solitude, the shoreline full of intrigue, and the enigmatic wildlife will define my memory of this place.

Grey day paddling in to Hydaburg.
Hydaburg is full of wonderful carvings.
I especially appreciated the frog emblem, which is probably a theme imported from the more southerly Haida country.
We were invited in to see the carving shed. The panel on the right will be one of the main screens in the new tribal house. Notice the rough shape of a frog on the pole in the foreground.

Route and Logistics:

Final packing before heading off to the Wilderness.

Our route and campsites.
South Prince of Wales can be a difficult area to access, especially if you are limited on travel time. If you have time, the easiest option would be to paddle in and out of Hydaburg. The closest airport to Hydaburg is in Klawock, near Craig. Otherwise, you can fly to Ketchikan and take the Interisland Ferry to Hollis. In either case you would need to arrange road transport to Hydaburg. Unfortunately, there is no outfitter in Hydaburg, so if you are traveling without your own craft, I suggest contacting Katy Rooks at POW Excursion Outfitter. If you are interested in visiting South POW Wilderness, or really anywhere on or around Prince of Wales Island, you need to talk to Katy. Between her full gear shed, drop-box skiff, kayaks, canoes, ATV, etc., you can be assured of getting the most out of a trip.

Our trip could not have happened without Katy. We rented all of our kayaks through her, and in order to maximize our time in the Wilderness, Katy transported us to Klakas by skiff. She also picked us up and even stored our gear for us.

If you decide to paddle in South POW (and you should), keep in mind that camping can be sparse, weather can pick up quickly, parts of the coastline are committing, and communications are almost non-existent. Talk to locals, know your own skill level, and pore over your charts. Of course, I’m always happy to provide beta. Just drop me a line.

Skiffing down to the Wilderness Area saved us a couple of days of paddling and lots of retracing our own steps.

 

Check out my posts from other expeditions.

Check out other posts about designated Wilderness. 


References:

Fuller (1989) Population dynamics of wolves in northcentral Minnesota. Wildlife Monographs 105.

Goldman (1944) Classification of wolves. In S. Young and E. Goldman, eds., The wolves of North America, Part 2. Dover Publications, New York.

Jensen, Fuller, and Robinson (1986) Wolf (Canis lupus) distribution on the Ontario to Michigan border near Sault Ste. Marie. Canadian Field-Naturalist 100: 363-366.

Mech, Fritts, Radde, and Paul (1988) Wolf distribution and road density in Minnesota. Wildlife Society Bulletin 16:85-87.

Person (2001) Alexander Archipelago wolves: ecology and population viability in a disturbed, insular landscape (Doctoral dissertation). University of Alaska, Fairbanks, AK.

Schoen and Person (n.d.) Chapter 6.4 “Alexander Archipelago Wolf.” In A Conservation Assessment and Resource Synthesis for The Coastal Forests and Mountains Ecoregion in the Tongass National Forest and Southeast Alaska. The Nature Conservancy.

Weckworth, Talbot, Sage, Person, and Cook (2005) A signal for independent coastal and continental histories among North American wolves. Mol. Ecol. 14: 917-931.

]]>
759
A Wake in Space-time https://www.azandisresearch.com/2018/07/26/a-wake-in-space-time/ Thu, 26 Jul 2018 18:49:11 +0000 http://www.azandisresearch.com/?p=660

I’m currently on my way up to Alaska for another supremely short season of guiding (just two trips this year). I was going through some old photos and came across this image of the Milky Way from a trip back in 2014. It evoked a memory of the last time I paddled with Ken Leghorn in Windfall Harbor.

Ken Leghorn was a hero of the Alaskan conservation movement and a friend and mentor of mine who passed away a little over a year ago. I wrote down this recollection on an airline napkin:

The night was dead still under the stars as we scraped the final bites from our dishes and made the slippery pilgrimage over the popweed to wash plates at the waterline. As we cast our rinse water out, the splash excited thousands of tiny green sparks in the wake. Bioluminescent algae had flooded into Windfall Harbor with the rising tide and now the bay was dense with the tiny flashing organisms. Ken and I decided it was definitely worth the effort of pulling a tandem down from the woods. We slid off into the black indefinite water. Every paddle stroke lit up like an aquatic Christmas tree. We stopped paddling not far from shore and floated. As the hull lost momentum it ceased to perturb the algae. Now the water was a black mirror of the star-full sky. Between the silence, Ken and I traded similes: Our kayak was like a space ship floating in space. Our wake was like a ripple in space-time. The Alexander Archipelago was like a solar system hurtling through the universe and we were a satellite in orbit around a tiny island planet.

We paddled back and pulled the kayak back up into the treeline. Knotting the bowline, we agreed it was the best bioluminescence we’d ever seen in Southeast.

Since I shuttered my photoblog a few months ago, I realized that my original post from that trip to Windfall Harbor had been lost to the ether. So, I resurrected the photos and lightly edited that post below.

From August 2014:

There are only a few Wilderness areas in Southeast Alaska that I have not been to. Surprisingly, Admiralty Island/Kootznoowoo Wilderness, one of the larger Wildernesses in the Tongass, is one that I had never visited. Along with Baranof Island and Chichagof Island, Admiralty Island has one of the highest concentrations of brown bears in the world. The average is one bear per square mile. In total, that means that the bears outnumber the people on these large islands. In fact, Admiralty itself has more bears on it than all of the lower 48 states combined.

Pack Creek is a special place for bears. It is a wildlife sanctuary in addition to its Wilderness designation. That means that there is no hunting of bears at Pack and the viewing at the Creek is strictly regulated. This is a great set up for bear viewing, as bears get much closer than would be normally comfortable. We arrived late in the season, well after the tourists, so we basically had the place to ourselves.

Many thanks to my friend Ken Leghorn and Pack Creek Bear Tours for loaning us a kayak, sharing a salmon dinner, and providing super helpful, detailed info about Pack. If you ever want to make the trip yourself, Pack Creek Bear Tours are the folks to call.

The inspiration for this trip was a visit from one of my best friends from middle school, Jordan, who came up to visit Alaska for the first time. After years of hearing about the incredible bear viewing at Pack Creek, this seemed like the best excuse to spend a few days there. We boarded the float plane in Juneau and made the short flight to Windfall Harbor where the Forest Service maintains a small seasonal camp for their rangers on an island just a stone’s throw from the Creek. This is also where Pack Creek Outfitters store their kayaks. It was the end of the season, so Ken offered to let us use a kayak for a few days if we would help him move his fleet to the winter storage area.

The operation at Pack Creek is nothing like any other bear viewing site. There is no platform, no fences, no barriers. The viewing area is a 5 by 10 meter area of mown grass with a driftlog to sit on. The Forest Service and Fish and Game rangers are on-site at all times that people are present. They are trained to let the bears move about freely up to the edge of the mown grass line.

The unique situation at Pack Creek is a stamp of its history. In the 1930s a major conservation campaign sprouted with the intent of designating all of Admiralty Island a bear refuge, but succeeded only in protecting the Pack Creek drainage from hunting. In 1935, the Forest Service designated it an official bear viewing area. Despite the restrictions, poaching was regular in the remote watershed. In 1956, a local miner and logger, Stan Price, rowed his floating cabin on shore at the mouth of Pack Creek and established a homestead with his wife, Edna. Rather than fearing the local bruins, they treated them as neighbors. Their presence helped to curtail poaching and also attracted new visitors. For almost 4 decades, the Prices lived with the bears. Over that time, new generations of cubs were born and reared with the Prices as a normal fixture of life. By the time Price died in 1989, just about every local bear was habituated to constant human presence.

In 1984, the tiny sliver of bear sanctuary was expanded to a no-hunting zone encompassing Pack Creek and the adjacent watersheds, as well as the islands in Windfall Harbor. As the 80s progressed, visitation increased to the point that the agencies decided to actively manage the area. Viewing times were limited, rangers were installed on-site, and visitation was limited to just 24 people per day.

As a result, generations of bears have come to regard Pack Creek as a safe haven from hunting and to ignore the small groups of human onlookers.

The Swan River estuary looking south across Windfall Harbor.
The dark silhouettes of salmon in the clear waters of an Admiralty Island stream.

Bear trail through the grass, making a straight line from one salmon stream to the next salmon stream.

Sitting on a log, surrounded by Alaskan brown bears playing, snoozing, bathing, and snapping at salmon is a mesmerizing experience. We spent most of our time sitting on the log at the Creek mouth or walking up the trail to the viewing platform. But we managed a couple paddles around the Harbor, including a visit to the most impressive Sitka Spruce tree I’ve ever met.

Both photos are the same tree from different aspects. Daven is easy to spot in his bright blue jacket (left), but you have to look a little more closely to see me lounging on the branch in the right image.

On our second night, we sat under the clear night sky and discovered bioluminescent algae in the water. It is rare to see stars in Southeast Alaska. And it is a pleasure to see them reflected in the still waters. It is utterly, chest-cavingly breathtaking to paddle the myth-like firmament of water sandwiched between a sky of stars above and swirls of bioluminescence below. Ken and I paddled out in a tandem just to sit and float. I can’t describe it. It was one of those utterly unique experiences that will forever bound my conception of hyperbole.

On the final evening of our visit, Jordan and I sat on the log with my friend Daven who happened to be the Forest Service Ranger on staff for the day.

The three of us sat in silence for most of the evening, occasionally swatting mosquitos, surveying the moldering ruins of Stan Price’s cabin, and spotting bears across the river. With the sun dropping behind the mountains, we were contemplating packing up for the evening when a medium-sized rich-chocolate colored bear sauntered out of the trees. Daven recognized her immediately as Chino (her mother, a creamy brown bear, was named Mocha… get it?). Chino ambled across the streamlets and, paying no attention to us, came to rest in the tall grass at the edge of the viewing area. We were stunned into silence. I frantically switched lenses since she was closer than the focusing distance of my long lens and filled cards with her portrait.

As Chino ambled toward us, casually munching sedge, we sat quiet. You can see Daven, in his official USFS hat, crouching in front of me.

 

After grazing on the grass before us, Chino walked a couple meters past, sat down with her back to us, ears unalert and pointed away from us, in a posture of complete indifference to our presence.

I’ve seen many, many bears at very close range. But the general protocol for bear encounters is to make your presence known with the goal of making it clear to the bear that you want your space. At Pack Creek, the tone is completely different: the intent is to discharge any discomfort, to let Chino forget we were even there. I learned that nonchalance is a powerful emotion when seen in the eyes of a bear.

We flew out on a clear day with Ken. Upon take off, we circled over the Swan River estuary which was expansive at low tide. The afternoon sun fluoresced the rivulets like veins under an X-ray. Out on the flats, we passed over a sow and two cubs. It takes a big landscape to make a 900-lb animal look like a speck, and it takes an even larger Wilderness area to ensure that such a landscape remains truly wild.

]]>
660
Replacing wolves on Isle Royale is bad policy and worse science https://www.azandisresearch.com/2018/03/17/replacing-wolves-on-isle-royale-is-bad-policy-and-worse-science/ Sat, 17 Mar 2018 16:56:09 +0000 https://www.azandisresearch.com/?p=415
Map from National Geographic.

Overview:

Isle Royale is an island in Lake Superior that is designated as a Wilderness Area and managed by the National Park Service. In the 20th century, wolves and moose migrated to the island and their dynamic spurred one of the longest predator-prey studies in history. Now, the wolf population has dropped to 2 and the Park Service is planning a major intervention that will install an entirely new, synthetic population of wolves on the island. This plan is the result of a myopic research perspective and a disregard for eco-evolutionary dynamics. It is bad policy and even worse science. Here’s why:

Background:

Isle Royale is a smallish island (just over 200 square miles) that sits about 8 miles from the north shore of Lake Superior. (Although it is small, it is large enough to host its own internal lake with an island, making it, as upper-midwesterners are fond of pointing out, the largest island on the largest lake on the largest island on the largest lake in the world.) The Isle and its many tiny satellite islands became a National Park in 1940 and were designated as a federal Wilderness Area in 1976.

Because the island is small and a long swim from the mainland, large fauna populations have been inconsistent denizens, historically. Moose first arrived on the island in the early 1900s. Wolves followed the moose in the 1940s, adding two major trophic levels to the island ecosystem. The complex predator-prey interactions became one of the classic test cases of ecological theory (see Peterson et al. 1984; McLaren & Peterson 1994).

Over the decades, the wolf and moose populations have demonstrated a standard predator-prey oscillation, with the wolves generally hovering around 20 individuals, but reaching a population maximum of 50 individuals in 1980.

Figure from Duluth News Tribune.

However, Isle Royale is a small place. Small islands are more susceptible to tipping points on the roller-coaster of demographic stochasticity. It’s kind of like drunkenly walking along the centerline of a bridge versus drunkenly walking a tightrope. If you stumble off course too far on the bridge, you have the latitude to recover and get back on course. Too big a waver on a tightrope and you’re done for. The small size and isolation of Isle Royale make it a tightrope for large predators. Like all oscillatory ecological patterns, what goes up eventually comes down, and in the last decade or so, the wolf population has declined in a mirror-like inversion of the population boom in the 1980s. As of this year, there are only two wolves left. As per the dynamics of island biogeography, the natural course looks like the reign of the wolf on the island will come to an end, probably followed by a boom and eventual extirpation of moose, and the island will continue along as it did for the many decades prior to the most recent immigration events. That is, until the next colonists arrive, as has happened multiple times in the past. Coyotes immigrated and blinked out within 50 years in the early to mid-1900s. At times, lynx and caribou both made the pilgrimage to the island and subsequently slipped off the tightrope.
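The bridge-versus-tightrope analogy can be made concrete with a toy simulation. The sketch below is illustrative only: the survival and recruitment probabilities, starting sizes, and population caps are assumptions chosen so that the expected population size stays constant, not estimates from Isle Royale data. It compares how often a small, capped population and a larger one hit zero through chance alone.

```python
import random


def extinction_fraction(start, cap, years=100, trials=200, seed=1):
    """Fraction of simulated populations that reach zero within `years`.

    Each year every individual survives with probability 0.8 and,
    independently, adds one recruit with probability 0.2, so the expected
    population size does not change. All values are hypothetical.
    """
    rng = random.Random(seed)
    extinct = 0
    for _ in range(trials):
        n = start
        for _ in range(years):
            survivors = sum(rng.random() < 0.8 for _ in range(n))
            recruits = sum(rng.random() < 0.2 for _ in range(n))
            n = min(cap, survivors + recruits)  # the habitat can only hold so many
            if n == 0:
                extinct += 1
                break
    return extinct / trials


# Hypothetical "tightrope" (small island) versus "bridge" (large area)
small_island = extinction_fraction(start=5, cap=10)
large_area = extinction_fraction(start=50, cap=100)
print(f"small population went extinct in {small_island:.0%} of runs")
print(f"large population went extinct in {large_area:.0%} of runs")
```

Neither simulated population has any built-in tendency to decline, yet the small one should blink out far more often. That difference is all that demographic stochasticity means here.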

The issue:

Now, the National Park Service has released an Environmental Impact Statement (EIS) for a plan to install a new population of wolves on the island (available here). If you are unfamiliar with the NEPA process, here’s how it works: When a land management agency like the National Park Service wants to embark on a project that might run counter to its mandate and/or result in large impacts, it is required to vet all potential options, usually as an EIS, and ask for the public’s comments on the plan. After the revision process, it makes a final decision to enact one of those potential options, the “preferred alternative.”

Since 99% of Isle Royale Park is a designated Wilderness, “where the earth and its community of life are untrammeled by man” and “generally appears to have been affected primarily by the forces of nature, with the imprint of man’s work substantially unnoticeable” (Wilderness Act, 1964), shipping in a boatload of wolves to manipulate the ecosystem required an EIS.

As a scientist and especially as an ecologist, I tend to view Wilderness Areas as our most critical ‘controls’ or ‘baselines’ for science to contrast other areas where human impact alters systems. Though every system is touched by human impact to some extent, there is huge value in preserving the least impacted places in an unmanipulated state. As an analogy, a blemished diamond might be worth a little less than a perfect diamond, but that doesn’t reduce it to equal value with a lump of coal.

But, not all scientists think that way.

To introduce or not to introduce:

The push to introduce wolves to Isle Royale has been championed primarily by two researchers at Michigan Tech, John Vucetich and Rolf Peterson, whose careers are rooted in the Isle’s wolf-moose study.

I first heard about this proposal when Vucetich gave a presentation at the Sigurd Olson Environmental Institute. At the time, it wasn’t the science that bothered me about the presentation–it was the patent misrepresentation and obvious straw-man Vucetich employed to characterize the intention of the Wilderness Act. Since then, these researchers have made major pushes in film (and this one) and popular press to portray wolves as an ever-present and integral part of the Isle Royale ecosystem, and pit the “health” of the ecosystem against what they believe is an outdated philosophy of conservation.

Essentially, their argument is that since climate change impacts the whole globe, no wilderness is really free of human manipulation, so we should be free to further manipulate it to our own design. They explicitly argue for “new visions for the meaning of wilderness,” with their preferred vision being “a place where concern for ecosystem health is paramount, even if human action is required to maintain it” (from here).

Intentionally or not, their use of relativistic “ecosystem health” rhetoric and attempts to stretch the ‘wilderness myth’ concept into their own application have thoroughly muddied the debate.

And it’s resulted in a lot of public confusion on the topic. For instance, here’s one comment I pulled from the public response in the EIS:

I have visited Isle Royale twice and it remains one of my favorite places in the world. The wolves and the moose have become a part of the island and that is a good thing. Wolves and moose aren’t faring well on the mainland due to politics ignorance and climate change. Isle Royale remains a unique microcosm where we can still observe and study this ancient predator-prey relationship. In a world where species are becoming extinct on a daily basis, this rare relationship has endured and that should be given a lot of weight when making the decision of what to do about the wolf-moose problem on Isle Royale. Please use common sense and act sooner rather than when it is too late.

First off, it’s not an ancient relationship (it’s only been going on for 60 years on the island), and it’s not a rare relationship (wolves eat moose all over the continent all the time). What makes it “rare” is the fact that it happens without human intervention (at least until NPS takes control of the population) on an isolated island with researchers tracking every move.

This person’s comment shows that opinions on the wolf issue are completely colored by human perception: i.e. anything that happened before your lifetime is “ancient,” anything that looks the way it is when you first saw it is “natural,” if you’ve only heard about something in one place, it must be “rare,” etc. The most pernicious perception is that the only species that are worth concerning ourselves with are the big ones with faces that you can relate to (after all, amphibian populations fluctuate on and off in ponds all over the upper midwest following the exact biogeographic pattern as the wolves of Isle Royale, but I’ve yet to see an outrage).

Even the main proponents of wolf introduction, Vucetich et al. and the National Parks Conservation Association, invoke the myth that a “sustainable” wolf population is critical to the island’s “health.” Considering that wolves only appeared on the island within a human lifetime and probably blinked on and off the island historically, wolves are only an ephemeral component of this dynamic ecosystem. They never have been “sustainable,” and if ecosystem “health” hinges on the presence of wolves, the island has always been naturally unhealthy.

The rhetoric of “healthy” ecosystems is useless in science, because its meaning is entirely relative. Rolf Peterson, the researcher who initiated the moose-wolf study in the 70s, states that, “There’s a mythical belief that Isle Royale has been working well because we kept our hands off it; my opinion is, it worked well because there were wolves there” (from here). You can only consider a wolf-inhabited Isle Royale as “healthy” if you define a “healthy” ecosystem as one that looks the same way it did when you started your research plan. The real myth is conflating wolf presence with Isle Royale’s natural state, and in this case, it seems a personal mythology crafted to shore up the legacy of Peterson’s research project.

The Park’s plan:

The preferred action of NPS is to install 20-30 wolves on the island over the next 3 years, and if those don’t take, to continue introducing for 2 more years.

Originally, the proponents of airdropping new wolves onto the Isle proposed it as genetic “rescue.” But with only two inbred, post-breeding-age wolves left, there is little chance that new wolves will breed with the two relics. Thus, in reality, this is not a genetic rescue project, it is a genetic replacement project.

So, where do the replacement wolves come from? The EIS suggests that wolves should be sourced from the mainland near the Park, but that many different populations around Lake Superior should be mixed on the island. They also suggest sourcing wolves with experience hunting moose (which are rare in mid-western populations).

Will a new population fare better? The reason wolves lost the plot in the first place was the ubiquitous force of natural selection. Under strong selection pressure, organisms are faced with three choices: move (not possible on an island), adapt, or disappear. The current wolves were not able to adapt to the ecological scenario they found on the island, so they are disappearing. The NPS knows that new wolves will be even more likely to succumb to selection pressure because they will not be locally adapted. This is why they are planning recurring introductions for a total of up to 5 years. The new population, with lots of diverse genetic material to work with, might be more prone to local adaptation, or it might be more prone to crash because the animals are too locally adapted to their native system to cope in the new setting.

It might be tempting to think that evolution won’t be a factor considering the short tenure of wolves on the island, but wolf generation times are under 5 years (Mech et al. 2016) which means that they’ve had about 20 generations on the island. We know from the deluge of rapid evolution studies in the past few years that 20 generations is well within the timespan for marked evolution. Similarly, one can expect that moose have been evolving in that time too (Hoy et al. 2018), as have the plants that are browsed by moose, and the small mammals, and the microorganisms that exist in concert… In other words, the entire trophic system has been subject to dynamic eco-evolutionary change that has refined its assemblage and genetic composition. Replacing local wolves with wolves from elsewhere will short-circuit that dynamic process and set a new eco-evolutionary trajectory. Any study that occurs post-introduction will be studying a different eco-evolutionary system, altogether.

Proponents have made the case that occasional genetic influx from the mainland population (when a wolf might cross the ice to the island in cold winters) is part of the natural dynamic, but that climate change has disrupted this process. Leaving aside the fact that much of Isle Royale’s history was wolf-less long before climate change, reintroducing wolves does not simulate this natural process. In natural migration events, wolves are not randomly selected from a larger pool. The process of migration is a selective sieve that winnows out some potential migrants and selects for others. By high-grading the genetic stock from the mainland based on their own criteria, the Park Service will not be replicating nature, they will be conducting a large-scale, manipulative selection experiment.
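The generation count above is simple division, and a back-of-envelope check shows how sensitive it is to the inputs. In the sketch below, the arrival years (1940 and 1949, bracketing "the 1940s"), the 4-year lower bound on generation time, and the 2018 end year are assumptions for illustration. Only the "under 5 years" figure and the 1940s arrival come from the text.

```python
def generations_elapsed(arrival_year, current_year, generation_time_years):
    """Number of generations that fit between two years."""
    return (current_year - arrival_year) / generation_time_years


# Assumed bounds: arrival sometime in the 1940s, generation time of 4 to 5 years.
fewest = generations_elapsed(1949, 2018, 5.0)  # late arrival, slow turnover
most = generations_elapsed(1940, 2018, 4.0)    # early arrival, fast turnover
print(f"roughly {fewest:.1f} to {most:.1f} generations")
```

Under these assumptions, the upper end of the range lines up with the "about 20 generations" figure.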

The value of non-intervention:

As I mentioned, one of the most critical values of Wildernesses is their role as baselines. This is a point repeatedly highlighted in the “Strategic Plan for Scientific Research in Isle Royale National Park” (Schlesinger et al. 2009). The Plan lists as “Unique Attributes of Isle Royale National Park” that it is “An Isolated Location for Baseline Studies”, and “an Ideal Place to Study Fundamental Ecological Concepts” like island-biogeography and predator-prey dynamics.

Trophic web and resource cycle for Isle Royale (Figure 3 from Schlesinger et al. 2009).

Isle Royale attracts biogeographers, whose focus is the distribution of life forms as determined by the balance of regional dispersal and local extinction processes (MacArthur and Wilson 1967). Determination of the relative importance of both dispersal and extinction is of central interest to ecologists wishing to explain variability in the species diversity of a given environment and the potential changes brought about by environmental change. As the Strategic Plan states, “the pristine nature of Isle Royale offers an opportunity to examine the potential influence of regime shifts due to natural causes or indirect anthropogenic causes such as climate change.” If we choose a policy of artificially imposing stasis on a naturally dynamic ecosystem we lose that value almost entirely.

On the other hand, if we practice humility and allow natural systems to be dynamic, we can ask a list of interesting questions: What happens if we remove those top trophic levels of moose and wolf? How will that impact the nutrient cycle? How will it impact community dynamics? In what ways will the change in selection pressures drive evolution? Will the eco-evo dynamic play out in predictable ways based on theory and inference from other archipelagos? What will the post-wolf community composition look like and will it be the same as the pre-wolf community? Etc. etc…

There are endless scientific questions that a wolf-less Isle Royale can answer. On the other hand, a replacement wolf population cannot even answer the original question that it is intended to address because such a manipulation cannot be considered a continuation of that community; at best, we can only consider this a manipulative experiment at the price of sacrificing an entire natural ecosystem and ruining an exemplary opportunity to study eco-evo dynamics.

 


References:

Hoy, S.R., Peterson, R.O., and Vucetich, J.A. 2018. Climate warming is associated with smaller body size and shorter lifespans in moose near their southern range limit. Global Change Biology. DOI: 10.1111/gcb.14015

MacArthur, R.H. and Wilson, E.O. 1967. The Theory of Island Biogeography. Princeton University Press.

McLaren, B.E. and Peterson, R.O. 1994. Wolves, Moose, and Tree Rings on Isle Royale. Science 266(5190): 1555-1558. DOI: 10.1126/science.266.5190.1555

Mech, L.D., Barber-Meyer, S.M., and Erb, J. 2016. Wolf (Canis lupus) Generation Time and Proportion of Current Breeding Females by Age. PLOS ONE. DOI: 10.1371/journal.pone.0156682

Peterson, R.O., Page, R.E., and Dodge, K.M. 1984. Wolves, moose, and the allometry of population cycles. Science 224: 1350-1352.

Schlesinger, W.H, Aneja, V.P., Chapin III, F.S, Comerford, N., Gibbs, J.P., Hrabik, T., Turner, M.G., and Whitaker, J. 2009. “Strategic Plan for Scientific Research in Isle Royale National Park.”

]]>
415
Stikine-LeConte Wilderness expedition https://www.azandisresearch.com/2017/12/28/stikine-leconte-wilderness-expedition/ Thu, 28 Dec 2017 16:51:09 +0000 https://www.azandisresearch.com/?p=211 Over the summer of 2016 (between wrapping up my Master’s and starting my PhD) I spent 7 weeks back in Alaska on contract with Sitka Conservation Society and the US Forest Service to draft Wilderness Character Narratives for three Wilderness Areas, including the Stikine-LeConte. These narratives are essentially an “anchor to windward,” historical benchmarks that describe the Wilderness areas as they are today (warts and all) and characterize the trends that led up to this point. In an agency with high turn-over, these narratives will allow new wilderness managers to quickly understand the complexities that underlie the management of each landscape.

The intention with these narratives is to tie together all dimensions of a Wilderness’s management. In addition to physical, biological, and historical information about the area, I also needed to incorporate the human perspective of the river. So, for each narrative, I organized a public meeting to garner input from locals in the community.

Check out this local radio interview about the project.

In addition to my research, I also wanted to spend some time on the ground in each of the Wilderness Areas. It seemed artificial to write about these Areas’ “character” without experiencing them first hand. While I had been on the Stikine before, it was only for a few days well up the river. So, I recruited a friend and planned a long expedition.

 

]]>
211