methods – A.Z. Andis Arietta | https://www.azandisresearch.com | Ecology, Evolution & Conservation

UPDATE: Smartphone Hemispherical Image Analysis
https://www.azandisresearch.com/2023/05/20/update-smartphone-hemispherical-image-analysis/
Sat, 20 May 2023

When I first developed the method to estimate canopy metrics from smartphone spherical panoramas, I used a somewhat convoluted workflow involving command-line image manipulation, an ImageJ plugin for binarization, and a script I wrote in AutoHotKey to automate mouse clicks on a GUI for canopy measures. Admittedly, that pipeline is difficult for others to replicate.

In an effort to make life easier, I spent some time building out the pipeline entirely in R, including a sourceable function for converting spherical panos to hemispherical images (all available in this repo).

The easiest way to convert all of your spherical panos to hemispherical projections is to source the function from my github:

source("https://raw.githubusercontent.com/andisa01/Spherical-Pano-UPDATE/main/Spheres_to_Hemis.R")

When you source the script, it will install and load all necessary packages. It also downloads the masking file that we will use to black out the periphery of the images.

The script contains the function convert_spheres_to_hemis, which does exactly what it says. You’ll need to put all of your raw spherical panos into a subdirectory within your working directory. We can then pass the path to that directory as an argument to the function.

convert_spheres_to_hemis(focal_path = "./raw_panos/")

This function will loop through all of your raw panos, convert them to masked, north-oriented, upward-facing hemispherical images, and put them all in a folder called “masked_hemispheres” in your working directory. It will also output a csv file called “canopy_output.csv” that contains information about each image.

Below, I will walk through the steps of the workflow that happen inside the convert_spheres_to_hemis function. If you just want to use the function, you can skip to the analysis. I’ve also written a script in the repo, “SphericalCanopyPanoProcessing.R”, that does all of the conversion AND analysis in batch.

library(tidyverse) # For data manipulation
library(exifr) # For extracting metadata

R is not really the best tool for working with image data. So, we’ll use the magick package to call the ImageMagick program from within R.

library(magick) # For image manipulation
# Check to ensure that ImageMagick was installed.
magick_config()$version

You should see a numeric version code like ‘6.9.12.3’ if ImageMagick was properly installed.

# imager also requires ImageMagick
library(imager) # For image display

For binarizing and calculating some canopy metrics, we will use Chianucci’s hemispheR package, which we need to install from the development version on GitLab.

library(devtools)
devtools::install_git("https://gitlab.com/fchianucci/hemispheR")
library(hemispheR) # For binarization and estimating canopy measures

To get started, you’ll need to have all of your raw equirectangular panoramas in a folder. I like to keep my raw images in a subdirectory called ‘raw_panos’ within my working directory. Regardless of where you store your files, set the directory path as the focal_path variable. For this tutorial we’ll process a single image, but I’ve included scripts for batch processing in the repo. Set the name of the image to process as the focal_image variable.

focal_path <- "./raw_panos/"
focal_image <- "PXL_20230519_164804198.PHOTOSPHERE_small.jpg"

focal_image_path <- paste0(focal_path, focal_image)
focal_image_name <- sub("\\.[^.]+$", "", basename(focal_image_path))

Let’s take a look at the equirectangular image.

pano <- image_read(focal_image_path)
pano # Visualize the pano
Raw equirectangular projection of a spherical panorama within a forest.
Raw equirectangular projection of a spherical panorama.

Note: One advantage of spherical panos is that they are large and, therefore, high resolution. The images from my Google Pixel 4a are 38 megapixels. For this tutorial, I downsized the example pano to 10% resolution to make processing and visualizing easier. For your own analysis, I’d recommend using full-resolution images.
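If you want to downsize your own panos the same way, here is a minimal magick sketch (my own, not part of the pipeline; the 10% target just mirrors this tutorial’s example, so adjust as needed):

```r
# Downsize a pano to 10% of its original dimensions with magick
library(magick)

small_pano <- image_read(focal_image_path) %>%  # focal_image_path as defined above
  image_resize("10%")                           # percent geometry keeps the aspect ratio
image_write(small_pano, sub("\\.jpg$", "_small.jpg", focal_image_path))
```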

Spherical panoramas contain far more metadata than an average image. We can take a look at all of this additional information with the read_exif function.

read_exif(focal_image_path) %>%
  glimpse()

We’ll extract some of this information about the image (creation date, georeference, altitude, etc.) to output alongside our canopy metrics. You can decide which elements are most important or useful for you.

xmp_data <- read_exif(focal_image_path) %>%
  select(
    SourceFile,
    Make,
    Model,
    FullPanoWidthPixels,
    FullPanoHeightPixels,
    SourcePhotosCount,
    Megapixels,
    LastPhotoDate,
    GPSLatitude,
    GPSLongitude,
    GPSAltitude,
    PoseHeadingDegrees
  )

The first image processing step is to convert the equirectangular panorama into a hemispherical image. We’ll need to store the image width dimension and the heading for processing. The pose heading is a particularly important feature that is a unique advantage of spherical panoramas. Since the camera automatically stores the compass heading of the first image of the panorama, we can use that information to automatically orient all of our hemispherical images such that true north is the top of the image. This is critical for analyses of understory light which requires plotting the sunpath onto the hemisphere.

# Store the pano width to use in scaling and cropping the image
pano_width <- image_info(pano)$width
image_heading <- read_exif(focal_image_path)$PoseHeadingDegrees

The steps to reproject the spherical panorama into an upward-looking hemispherical image go like this:

Crop the upper hemisphere (this is easy with smartphone spheres because the phone’s gyro ensures that the horizon line is always the midpoint of the y-axis).

Cropped upper hemisphere (top half of the image) from an equirectangular projection of a spherical panorama within a forest.
Cropped upper hemisphere (top half of the image) from an equirectangular projection of a spherical panorama.

Rescale the cropped image into a square to retain the correct scaling when reprojected into polar coordinate space.

Rescaled upper hemisphere from an equirectangular projection of a spherical panorama within a forest.
Rescaled upper hemisphere from an equirectangular projection of a spherical panorama.

Project into polar coordinate space and flip the perspective so that it is upward-looking.

Polar projection of the upper half of a spherical panorama of a forest.
Polar projection of the upper half of a spherical panorama.

Rotate the image so that the top of the image points true north and crop the image so that the diameter of the circle fills the frame.

Polar projection of the upper half of a spherical panorama rotated to orient north and cropped to size of a forest.
Polar projection of the upper half of a spherical panorama rotated to orient to true north.

We can accomplish all of those steps with the code below.

pano_hemisphere <- pano %>%
  # Crop to retain the upper hemisphere
  image_crop(geometry_size_percent(100, 50)) %>%
  # Rescale into a square to keep the correct scale when projecting into polar coordinate space
  image_resize(geometry_size_percent(100, 400)) %>%
  # Remap the pixels into polar projection
  image_distort("Polar",
                c(0),
                bestfit = TRUE) %>%
  image_flip() %>%
  # Rotate the image to orient true north to the top of the image
  image_rotate(image_heading) %>%
  # Rotating expands the canvas, so we crop back to the dimensions of the hemisphere's diameter
  image_crop(paste0(pano_width, "x", pano_width, "-", pano_width/2, "-", pano_width/2))

The resulting image looks funny because the outer pixels are extended by interpolation and we’ve rotated the image which leaves white space at the corners. Most analyses define a bounding perimeter to exclude any pixels outside of the circular hemisphere; so, the weird border shouldn’t matter. But, we can add a black mask to make the images look better.

I’ve included a vector file for a black mask to lay over the image in the repo.

# Get the image mask vector file
image_mask <- image_read("./HemiPhotoMask.svg") %>%
  image_transparent("white") %>%
  image_resize(geometry_size_pixels(width = pano_width, height = pano_width)) %>%
  image_convert("png")

masked_hemisphere <- image_mosaic(c(pano_hemisphere, image_mask))

masked_hemisphere
Polar projection of the upper half of a spherical panorama rotated to orient north and cropped to size of a forest with a black mask around the outside of the hemisphere.
Masked hemispherical canopy image.

We’ll store the masked hemispheres in their own subdirectory. This script makes that directory if it doesn’t already exist and writes our file into it.

if(dir.exists("./masked_hemispheres/") == FALSE){
  dir.create("./masked_hemispheres/")
} # If the subdirectory doesn't exist, we create it.

masked_hemisphere_path <- paste0("./masked_hemispheres/", focal_image_name, "hemi_masked.jpg") # Set the filepath for the new image

image_write(masked_hemisphere, masked_hemisphere_path) # Save the masked hemispherical image

At this point, you can use the hemispherical image in any program you’d like, either in R or other software. For this example, I’m going to use Chianucci’s hemispheR package in order to keep the entire pipeline in R.

The next step is to import the image. hemispheR allows for lots of fine-tuning. Check out the docs to learn what all of the options are. These settings most closely replicate the processing I used in my 2021 paper.

fisheye <- import_fisheye(masked_hemisphere_path,
                          channel = '2BG',
                          circ.mask = list(xc = pano_width/2, yc = pano_width/2, rc = pano_width/2),
                          gamma = 2.2,
                          stretch = FALSE,
                          display = TRUE,
                          message = TRUE)
Circular hemispheric plot output by hemispheR's 'import_fisheye' function.
Circular hemispheric plot output by hemispheR’s ‘import_fisheye’ function.

Now, we need to binarize the image, converting all sky pixels to white and everything else to black (or at least as close as possible). Again, there are lots of options available in hemispheR. You can decide which settings are right for you. However, I would suggest keeping the zonal argument set to FALSE. The documentation describes this argument as:

zonal: if set to TRUE, it divides the image in four sectors
(NE,SE,SW,NW directions) and applies an automated classification
separatedly to each region; useful in case of uneven light conditions in
the image

Because spherical panoramas expose each of the component images (36, in this case) separately, there is no need for this correction.

I also suggest keeping the export argument set to TRUE so that the binarized images will be automatically saved into a subdirectory named ‘results’.

binimage <- binarize_fisheye(fisheye,
                 method = 'Otsu',
                 # We do NOT want to use zonal threshold estimation since this is done by the camera
                 zonal = FALSE,
                 manual = NULL,
                 display = TRUE,
                 export = TRUE)
Binarized circular hemispheric plot output by hemispheR's 'binarize_fisheye' function.
Binarized circular hemispheric plot output by hemispheR’s ‘binarize_fisheye’ function.

Unfortunately, hemispheR does not allow for estimation of understory light metrics like through-canopy radiation or Global Site Factors. If you need light estimates, you’ll have to take the binarized images and follow my instructions and code for implementing Gap Light Analyzer.

Assuming all you need is canopy metrics, we can continue with hemispheR and finalize the whole pipeline in R. We estimate canopy metrics with the gapfrac_fisheye() function.

gapfrac <- gapfrac_fisheye(
  binimage,
  maxVZA = 90,
  # Spherical panoramas are equidistant perforce
  lens = "equidistant",
  startVZA = 0,
  endVZA = 90,
  nrings = 5,
  nseg = 8,
  display = TRUE,
  message = TRUE
)
Binarized circular hemispheric plot with azimuth rings and segments output by hemispheR's 'gapfrac_fisheye' function.
Binarized circular hemispheric plot with azimuth rings and segments output by hemispheR’s ‘gapfrac_fisheye’ function.

Finally, we can estimate the canopy metrics with the canopy_fisheye() function, join those to the metadata from our image, and output our report.

canopy_report <- canopy_fisheye(
  gapfrac
)

output_report <- xmp_data %>%
  bind_cols(
    canopy_report
  ) %>%
  rename(
    GF = x,
    HemiFile = id
  )

glimpse(output_report)
Rows: 1
Columns: 32
$ SourceFile            "./raw_panos/PXL_20230519_164804198.PHOTOSPHERE_small.jpg"
$ Make                  "Google"
$ Model                 "Pixel 4a"
$ FullPanoWidthPixels   8704
$ FullPanoHeightPixels  4352
$ SourcePhotosCount     36
$ Megapixels            0.37845
$ LastPhotoDate         "2023:05:19 16:49:57.671Z"
$ GPSLatitude           41.33512
$ GPSLongitude          -72.91103
$ GPSAltitude           -23.1
$ PoseHeadingDegrees    86
$ HemiFile              "PXL_20230519_164804198.PHOTOSPHERE_smallhemi_masked.jpg"
$ Le                    2.44
$ L                     3.37
$ LX                    0.72
$ LXG1                  0.67
$ LXG2                  0.55
$ DIFN                  9.981
$ MTA.ell               19
$ GF                    4.15
$ VZA                   "9_27_45_63_81"
$ rings                 5
$ azimuths              8
$ mask                  "435_435_434"
$ lens                  "equidistant"
$ channel               "2BG"
$ stretch               FALSE
$ gamma                 2.2
$ zonal                 FALSE
$ method                "Otsu"
$ thd                   116
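For batch work, the repo’s “SphericalCanopyPanoProcessing.R” script does all of this in one pass. As a rough sketch of what such a loop might look like (my own code, not the repo script; the hemispheR arguments simply mirror the single-image calls above):

```r
# Convert every pano, then analyze each masked hemisphere and stack the reports
library(tidyverse)
library(hemispheR)

source("https://raw.githubusercontent.com/andisa01/Spherical-Pano-UPDATE/main/Spheres_to_Hemis.R")
convert_spheres_to_hemis(focal_path = "./raw_panos/")

canopy_reports <- list.files("./masked_hemispheres/",
                             pattern = "\\.jpg$", full.names = TRUE) %>%
  map(\(f) {
    # Each image supplies its own width for the circular mask
    w <- magick::image_info(magick::image_read(f))$width
    import_fisheye(f,
                   channel = '2BG',
                   circ.mask = list(xc = w/2, yc = w/2, rc = w/2),
                   gamma = 2.2, stretch = FALSE,
                   display = FALSE, message = FALSE) %>%
      binarize_fisheye(method = 'Otsu', zonal = FALSE, export = TRUE) %>%
      gapfrac_fisheye(maxVZA = 90, lens = "equidistant",
                      startVZA = 0, endVZA = 90,
                      nrings = 5, nseg = 8) %>%
      canopy_fisheye()
  }) %>%
  list_rbind()
```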

Be sure to check out my prior posts on working with hemispherical images.

Visualize mixed effect regressions in R with GGplot2
https://www.azandisresearch.com/2022/12/31/visualize-mixed-effect-regressions-in-r-with-ggplot2/
Sat, 31 Dec 2022
Example of the final viz.

In this post, I will show some methods of displaying mixed effect regression models and associated uncertainty using non-parametric bootstrapping. This is kind of a follow-up to my previous post on visualizing custom main effect models.

Introduction

Mixed models have quickly become the model du jour in many observation-oriented fields because these models obviate many of the issues of pseudoreplication inherent in blocked or repeated measures experiments and structured data.

They do this by treating the levels of categorical variables not as unique instances to be parameterized individually, but as random samples from an infinite distribution of levels. Instead of wasting degrees of freedom estimating parameter values for each level, we only need to estimate that global distribution (which requires only a handful of parameters) and instead focus our statistical power on the variables of interest.

Thanks to packages like nlme and lme4, mixed models are simple to implement. For all of their virtues, mixed models can also be a pain to visualize and interpret. Although linear mixed models are conceptually similar to the plain old ordinary least-squares regression we know and love, they harbor a lot more math under the hood, which can be intimidating.

One of the reasons mixed models are difficult to intuitively visualize is because they allow us to manage many levels of uncertainty. Depending on the focus of our analyses, we usually want to focus on certain aspects of the trends and associated uncertainty. For instance, an ecologist might be interested in the effect of nutrient input across many plots, but not interested in the difference between plots (i.e. traditional random effect). Or, an educator might be interested in the effect of different curricula, but not the difference between specific classes within specific schools (i.e. nested random effects). Or, a physician might be interested in the effect of a long-term treatment on a patient after accounting for baseline difference between patients (i.e. repeated measures).

In this tutorial, I’m going to focus on how to visualize the results of mixed effect models from lme4 using ggplot2. You can also clone the annotated code from my Github.

First, load in the necessary libraries.

library(tidyverse)
library(lme4)
library(ggsci)
library(see)
library(cowplot)
theme_set(theme_classic()) # This sets the default ggplot theme

To begin, I am going to simulate an experiment with 10 experimental units, each containing 100 observations. These could be 10 plots with 100 random samples, or 10 schools with 100 student test scores, or the records of 10 patients, each with 100 follow-up visits. Each of the experimental units will ultimately get its own intercept and slope effect coefficient. The rand_eff data frame is essentially a Z matrix in classic mixed model notation. For this example, I’ll assume that the intercepts come from a distribution with a standard deviation of 20 and the slopes from a distribution with a standard deviation of 0.5. The random effects define the variation of the experimental units around the main effect, so the mean of these distributions is necessarily 0.

set.seed(666)
rand_eff <- data.frame(unit = as.factor(seq(1:10)),
            b0 = rnorm(10, 0, 20),
            b1 = rnorm(10, 0, 0.5))

We can now join our random effect matrix to the full dataset and define our y values as y_ij = B0 + b0_j + (B1 + b1_j) * x_ij + e_ij, where j indexes the experimental unit.

X <- expand.grid(unit = as.factor(seq(1:10)), obs = as.factor(seq(1:100))) %>%
  left_join(rand_eff,
            by = "unit") %>%
  mutate(x = runif(n = nrow(.), 0, 10),
         B0 = 20,
         B1 = 2,
         E = rnorm(n = nrow(.), 0, 10)) %>%
  mutate(y = B0 + b0 + x * (B1 + b1) + E)

Here’s a look at the data.

X %>%
  ggplot(aes(x = x, y = y, col = unit)) +
  geom_point() +
  facet_wrap(vars(unit))
Scatter plots of the simulated data. Each of 10 experimental units contains 100 observations.

Random intercept model

For demonstration, let’s first assume that we are primarily interested in the overall slope of the relationship. For instance, if these are 10 field plots, we might want to know the effect of adding 1 unit of nutrient fertilizer, regardless of the baseline level of nutrients in a given plot.

We can do this by fitting a random intercept model and then looking at the summary of the resulting model.

lmer1 <- lmer(y ~ x + (1|unit), data = X)

summary(lmer1)
> summary(lmer1)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (1 | unit)
   Data: X

REML criterion at convergence: 7449.7

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.88156 -0.68745  0.01641  0.71022  2.84532 

Random effects:
 Groups   Name        Variance Std.Dev.
 unit     (Intercept) 811.59   28.488  
 Residual              94.63    9.728  
Number of obs: 1000, groups:  unit, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)  18.2652     9.0301   2.023
x             2.0091     0.1077  18.651

Correlation of Fixed Effects:
  (Intr)
x -0.060

We can see that the fitted model does a good job estimating the fixed effect slope (B1): we simulated it as 2 and the model estimates 2.0091. However, the model underestimates the fixed effect intercept (B0) as 18.3 and overestimates the standard deviation of the random intercepts (b0) as 28.5, when we simulated those values as 20 and 20, respectively.

If we think we could live with that fit, how would we go about visualizing our model?

Here is a look at our data with a linear regression fit to each experimental unit. It is clear that there is a wide spread in the intercepts, but the slopes are similar.

X %>%
ggplot(aes(x = x, y = y, col = unit)) +
geom_point() +
geom_smooth(method = 'lm', se = F)
Scatter plot of the simulated data with an independent linear regression fit to each experimental unit.

Marginal (Fixed effect) versus Conditional (Fixed + Random effect)

We might be tempted to use this built-in regression by group from ggplot as a visualization of the mixed model. However, this would be WRONG!!! GGplot is fitting an ordinary least squares regression to each group without accounting for the random effect. That means that the estimates and the confidence intervals do not reflect our model. In this case, the estimates might be pretty close since our sample sizes across units are pretty even, but they could be wildly off, or even opposite, of the mixed model slope estimate.

In a prior post, I showed how we can use the predict function to display our custom models in ggplot. In the case of mixed effect models, you can predict both the marginal and conditional values. The marginal value is the fixed effect. The conditional value is the mixed effect of the fixed and random effects.

In other words, the marginal effect is asking “What would I expect y to be for a given x without knowing which experimental unit it came from?” whereas the conditional effect is asking “What would I expect y to be for a given x from a given experimental unit?”

We can specify which prediction we want with the random effect formula argument re.form:

X <- X %>% 
  mutate(fit.m = predict(lmer1, re.form = NA),
         fit.c = predict(lmer1, re.form = NULL))

The simplest visualization would be to display the marginal fit on the raw values.

X %>%
  ggplot(aes(x = x, y = y)) +
    geom_point(pch = 16, col = "grey") +
    geom_line(aes(y = fit.m), col = 1, size = 2) +
    coord_cartesian(ylim = c(-40, 100))
Linear fit of the marginal (fixed) effects (black line) shown with a scatterplot of the raw data (grey dots).

However, this is a bit misleading because it underrepresents our confidence in the slope by making it look like the residuals are huge.

But the residuals, and our confidence in the fit, are based on the conditional residual variance, which is much tighter. We can see that easily when we look at the conditional fits. This is one option for visualization, but it highlights the wrong element if our primary interest is the overall slope trend.

X %>%
  ggplot(aes(x = x, y = y, col = unit)) +
  geom_point(pch = 16) +
  geom_line(aes(y = fit.c, col = unit), size = 2) +
  facet_wrap(vars(unit)) +
  coord_cartesian(ylim = c(-40, 100))
Conditional fits from the random effect model with random intercepts on the raw data points, facetted by experimental unit.

Displaying the conditional fits on the same facet helps. Now we can see the variation in the conditional intercepts, but as a tradeoff it makes it difficult to get a sense of the residual variance because there are too many points.

X %>%
  ggplot(aes(x = x, y = y, col = unit)) +
  geom_point(pch = 16) +
  geom_line(aes(y = fit.c, col = unit), size = 2)  +
  coord_cartesian(ylim = c(-40, 100))
Conditional fits from the random effect model with random intercepts on the raw data points. All experimental units are displayed on the same facet, differentiated by color.

Instead, I think it makes more sense to display the conditional residuals around the marginal effect. You can kind of think of this as collapsing all of the conditional fits from the previous plot into the single marginal fit. We can do this by extracting the residuals (which are the conditional residuals) and then displaying the points as the marginal fit plus the residuals.

X <- X %>%
  mutate(resid = resid(lmer1))

X %>%
  ggplot(aes(x = x, y = fit.m + resid, col = unit)) +
  geom_point(pch = 16) +
  geom_line(aes(y = fit.m), col = 1, size = 2)  +
  coord_cartesian(ylim = c(-40, 100))
Marginal fit from the random effect model with random intercepts on the conditional residuals of the experimental units, differentiated by color.

In some cases, we might also want to give the reader a sense of the variation in the conditional intercepts. For instance, the fact that the slope is so consistent across a wide range of baselines might actually increase our confidence in the relationship even further.

There are a couple of ways to simultaneously display both our confidence in the fit of the marginal trend and the variance in the conditional fits.

Depending on the number of conditional units, one option is to display the conditional fits below the scatter plot of the conditional residuals.

X %>%
  ggplot(aes(x = x, y = fit.m + resid)) +
  geom_line(aes(y = fit.c, col = unit), size = 1) +
  geom_point(pch = 16, col = "grey") +
  geom_line(aes(y = fit.m), col = 1, size = 2) +
  coord_cartesian(ylim = c(-40, 100))
Marginal fit (heavy black line) from the random effect model with random intercepts with the conditional residuals (grey dots) and conditional fits (thin lines) for each experimental unit, differentiated by color.

Another option is to display a density plot or histogram of the estimated conditional intercepts (also known as the Best Linear Unbiased Predictors or BLUPs). In a random effect framework, we are assuming that the conditional intercepts are samples of some infinite distribution of intercepts, so this density plot from the BLUPs of our model is essentially an empirical representation of that idealized distribution. (Alternatively, we could also simply plot the idealized distribution as a normal distribution from the estimated variance of the random effect, but I like the empirical density plot because it also gives a sense of when our conditional intercepts do NOT conform to the assumption of being samples from a normal distribution.)
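For completeness, here is a sketch of that idealized-distribution alternative (my own code, assuming the lmer1 fit from earlier): draw the normal curve implied by the estimated fixed intercept and random-intercept standard deviation.

```r
# Idealized N(B0_hat, sd_b0) distribution of conditional intercepts from lmer1
library(lme4)
library(ggplot2)

B0_hat <- fixef(lmer1)["(Intercept)"]
sd_b0  <- attr(VarCorr(lmer1)$unit, "stddev")["(Intercept)"]

ggplot() +
  stat_function(fun = dnorm,
                args = list(mean = B0_hat, sd = sd_b0),
                xlim = c(B0_hat - 3 * sd_b0, B0_hat + 3 * sd_b0)) +
  coord_flip() +  # flip so the intercept axis matches the y-axis of the fit plot
  labs(x = "Conditional intercept", y = "Density")
```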

We can extract the BLUPs from the model object (b0_hat) and add those to the model estimate of the marginal intercept (B0_hat) to get the estimated conditional intercepts. This is our data frame of conditional estimates.

Cond_DF <- as.data.frame(ranef(lmer1)) %>%
  transmute(unit = grp,
            b0_hat = condval) %>%
  mutate(Intercept_cond = b0_hat + summary(lmer1)$coef[1, 1])

X %>%
  ggplot(aes(x = x, y = fit.m + resid)) +
  geom_point(pch = 16, col = "grey") +
  geom_violinhalf(data = Cond_DF, aes(x = 0, y = Intercept_cond), trim = FALSE, width = 3, fill = NA) +
  geom_line(aes(y = fit.m), col = 1, size = 2) +
  coord_cartesian(ylim = c(-40, 100))
Marginal fit (heavy black line) from the random effect model with random intercepts with the conditional residuals (grey dots) and a half-violin density plot of the distribution of conditional intercepts.

Random slope and intercept model

Now let’s imagine that we are not satisfied with the random intercept model and also want to fit a random slope parameter. In this case, we want to estimate the distribution of slopes for all experimental units across the values of x.

lmer2 <- lmer(y ~ x + (x|unit), data = X)

We can see that the values from the model are getting much closer to the known values that we simulated.

summary(lmer2)

> summary(lmer2)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + (x | unit)
   Data: X

REML criterion at convergence: 7442.8

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.2071 -0.6957  0.0268  0.7067  2.8482 

Random effects:
 Groups   Name        Variance Std.Dev. Corr 
 unit     (Intercept) 846.3021 29.0913       
          x             0.2151  0.4638  -0.27
 Residual              93.0853  9.6481       
Number of obs: 1000, groups:  unit, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)  18.3474     9.2202    1.99
x             2.0075     0.1816   11.05

Correlation of Fixed Effects:
  (Intr)
x -0.250

Correlation of random effects

One important addition relative to the random-intercept-only model is the estimate of the correlation between the distributions of the random slopes and random intercepts (which the model estimates as -0.268; see output below). Because we simulated these data, we know that there is no true correlation between the unit slopes and intercepts. But, because we have a small number of units, we just happened to get an emergent correlation.

summary(lmer2)$varcor

cor.test(rand_eff$b0, rand_eff$b1)
> summary(lmer2)$varcor
 Groups   Name        Std.Dev. Corr  
 unit     (Intercept) 29.09127       
          x            0.46378 -0.268
 Residual              9.64807       
> 
> cor.test(rand_eff$b0, rand_eff$b1)

	Pearson's product-moment correlation

data:  rand_eff$b0 and rand_eff$b1
t = -1.2158, df = 8, p-value = 0.2587
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.8205150  0.3123993
sample estimates:
       cor 
-0.3949022 

If you had a reason to assume NO correlation between your random effects, you could specify that as (x||unit) in this way:

lmer3 <- lmer(y ~ x + (x||unit), data = X)

summary(lmer3)

summary(lmer3)$varcor
> summary(lmer3)
Linear mixed model fit by REML ['lmerMod']
Formula: y ~ x + ((1 | unit) + (0 + x | unit))
   Data: X

REML criterion at convergence: 7443.3

Scaled residuals: 
    Min      1Q  Median      3Q     Max 
-3.1812 -0.6951  0.0223  0.7074  2.8523 

Random effects:
 Groups   Name        Variance Std.Dev.
 unit     (Intercept) 833.2681 28.8664 
 unit.1   x             0.2092  0.4574 
 Residual              93.1011  9.6489 
Number of obs: 1000, groups:  unit, 10

Fixed effects:
            Estimate Std. Error t value
(Intercept)   18.325      9.149   2.003
x              2.010      0.180  11.165

Correlation of Fixed Effects:
  (Intr)
x -0.035
> 
> summary(lmer3)$varcor
 Groups   Name        Std.Dev.
 unit     (Intercept) 28.86638
 unit.1   x            0.45738
 Residual              9.64889

As with the random intercept model, we can use the predict function to get expected values of y based on the marginal or conditional estimates. Note that re.form = NULL is the same as re.form = ~ (x|unit).

X <- X %>% 
  mutate(fit2.m = predict(lmer2, re.form = NA),
         fit2.c = predict(lmer2, re.form = NULL),
         resid2 = resid(lmer2))
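As a quick sanity check of that equivalence (a sketch assuming the lmer2 fit above):

```r
# re.form = NULL includes all random effects, which for this model is the
# same as spelling out the full random-effect formula
all.equal(
  predict(lmer2, re.form = NULL),
  predict(lmer2, re.form = ~ (x | unit))
)
```

If the two calls agree, all.equal() returns TRUE.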

As with the random intercept model, one way to visualize the model is to show the conditional intercept/slopes as fitted lines and the conditional residuals as points.

pmain_lmer2 <- X %>%
  ggplot(aes(x = x, y = fit2.m + resid2)) +
  geom_line(aes(y = fit2.c, col = unit), size = 1) +
  geom_point(pch = 16, col = "grey") +
  geom_line(aes(y = fit2.m), col = 1, size = 2) +
  coord_cartesian(ylim = c(-40, 100))
pmain_lmer2
Marginal fit (heavy black line) from the random effect model with random intercepts and slopes with the conditional residuals (grey dots) and conditional fits (thin lines) for each experimental unit, differentiated by color.

Visualizing the random effect variance gets a bit more difficult with two random parameters. One strategy I like is to include an additional plot of the correlation and distribution of the random effects.

Basically, we make a scatter plot of the BLUPs of slopes and intercepts. Then we make density plots for each and add those to the margins of the plot. (I’m relying heavily on Claus Wilke’s post to code the marginal distributions.) Finally, we patch it all together with cowplot.

Cond_DF2 <- as.data.frame(ranef(lmer2)) %>% 
  transmute(unit = grp,
            term = case_when(term == "(Intercept)" ~ "b0_hat",
                             term == "x" ~ "b1_hat"),
            value = condval) %>%
  pivot_wider(id_cols = "unit", names_from = "term", values_from = "value") %>%
  mutate(Intercept_cond = b0_hat + summary(lmer2)$coef[1,1],
         Slope_cond = b1_hat + summary(lmer2)$coef[2,1])

pmain <- Cond_DF2 %>%
  ggplot(aes(x = Intercept_cond, y = Slope_cond)) +
    geom_point(aes(col = unit), size = 3) +
    geom_density2d(bins = 4, col = "grey", adjust = 3)

xdens <- axis_canvas(pmain, axis = "x") +
  geom_density(data = Cond_DF2, aes(x = Intercept_cond), fill = "grey", col = NA, trim = FALSE, adjust = 2)

ydens <- axis_canvas(pmain, axis = "y", coord_flip = TRUE) +
  geom_density(data = Cond_DF2, aes(x = Slope_cond), fill = "grey", col = NA, trim = FALSE, adjust = 2) +
  coord_flip()

p1 <- insert_xaxis_grob(pmain, xdens, grid::unit(.2, "null"), position = "top")
p2 <- insert_yaxis_grob(p1, ydens, grid::unit(.2, "null"), position = "right")

pinsert_lmer2 <- ggdraw(p2)

plot_grid(
  pmain_lmer2,
  pinsert_lmer2,
  nrow = 1
)
Left: Marginal fit (heavy black line) from the random effect model with random intercepts and slopes with the conditional residuals (grey dots) and conditional fits (thin lines) for each experimental unit, differentiated by color. Right: Correlation of BLUPs (Best Linear Unbiased Predictors) of random intercept and slope parameters of experimental units, differentiated by color with marginal density distributions.

Bootstrapping uncertainty

For many researchers, one of the most frustrating aspects of mixed models is that estimating confidence intervals and testing the significance of parameters is not straightforward. I highly encourage folks to take a look at Ben Bolker's thorough considerations on the topic. Dr. Bolker outlines many different problems and solutions depending on the structure of your model.

I think that the most generalized solution is to use non-parametric bootstrapping. This method essentially asks the question, “How would our model fit change if we could go back in time, select different samples, and then rerun our analysis?”

We can’t go back in time, but maybe we CAN assume that our original samples were representative of the population. If so, instead of resampling the actual population, we could resample our original observations WITH REPLACEMENT to approximate resamples of the population. If we do this many times, we can then make intuitive statements like, “95% of the time, if we reran this experiment, we’d expect the main effect to be between X and Y.”

It is important to stop and consider what re-doing your data generation process would look like. For instance, imagine our mock data had come from 1000 independent random observations that we then categorized into “units” to control for autocorrelation after the fact. If we re-ran the process to generate a new dataset, we may not always get the same number of observations in each “unit”.

However, if our mock data came from an experiment where we planted 100 trees in each of 10 “units”, then when we re-ran the experiment, we could control the number of individuals per unit. We would also need to consider if we would always choose to plant in the same 10 units, or if we would also choose units at random.

The structure of the data generation process can guide our bootstrap resampling strategy. In the first example, we could simply bootstrap all individual observations (although we may need to worry about non-convergence and small sample sizes). In the second example, where unit choice is constrained, we might decide to bootstrap within units. If, in the second example, we could also randomize units, we should probably take a hierarchical approach, first bootstrapping the units and then bootstrapping the observations within each unit.
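To make the three strategies concrete, here is a minimal sketch (assuming the mock dataset X with a unit grouping column used throughout this post) of what each resampling scheme might look like:

```r
library(dplyr)

# Strategy 1: bootstrap individual observations, ignoring unit structure.
boot_obs <- slice_sample(X, prop = 1, replace = TRUE)

# Strategy 2: units are fixed; resample observations within each unit.
boot_within <- X %>%
  group_by(unit) %>%
  slice_sample(prop = 1, replace = TRUE) %>%
  ungroup()

# Strategy 3: hierarchical; first resample units, then observations within them.
boot_units <- sample(unique(X$unit), replace = TRUE)
boot_hier <- lapply(seq_along(boot_units), function(i){
  X %>%
    filter(unit == boot_units[i]) %>%
    slice_sample(prop = 1, replace = TRUE) %>%
    mutate(unit = paste0("resample_", i)) # relabel so repeated units stay distinct
}) %>%
  bind_rows()
```

Note the relabeling step in the hierarchical version: if the same unit is drawn twice, it should enter the model as two distinct groups, not one doubled group.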

NOTE: The problem with non-parametric bootstrapping of this kind is that it can be computationally expensive. One trick is to parallelize the bootstraps across all of your computer’s processors. By default, R uses one processor, so it fits one bootstrap iteration at a time, sequentially. But the bootstraps are independent and the order doesn’t matter. If your computer has 8 cores, there is no reason not to fit 8 models simultaneously on 8 processors in 1/8 of the time. Unfortunately, setting up parallel processing can be an adventure of its own. I won’t detail it here, but will try to dedicate a post to it in the future. If you have a very large dataset and can’t run bootstraps in parallel, you might consider some of the other methods suggested by Dr. Bolker.
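In the meantime, here is a rough sketch of one way to parallelize with the base parallel package (assuming a Unix-alike system, since mclapply forks processes; on Windows you would build a cluster with makeCluster() and use parLapply() instead):

```r
library(parallel)
library(lme4)
library(dplyr)

n_cores <- detectCores() - 1 # leave one core free for the OS

# Each bootstrap iteration is independent, so we can farm them out to cores.
boot_fixefs <- mclapply(1:1000, function(i){
  BS_X <- slice_sample(X, prop = 1, replace = TRUE)
  fixef(lmer(formula = lmer2@call$formula, data = BS_X))
}, mc.cores = n_cores)

# Collect the fixed effect estimates into one dataframe, one row per iteration.
BS_params_par <- bind_rows(lapply(boot_fixefs, as.list))
```

This sketch collects only the fixed effects; you could return a list carrying the random effect variances and predictions too, mirroring the sequential loop below.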

Since our dataset is fairly small and simple, I’ll demonstrate how we can use bootstrapping to simultaneously estimate confidence intervals of our model parameters and visualize error bands.

Just keep in mind that if you are fitting models with complex random effect designs, you’ll have to think critically about which elements and levels of variance are most important for your data story. Hopefully, these examples will at least get you started and inspired!

The bootstrapping process begins by initiating empty dataframes to accept the parameter estimates for the fixed effect coefficients and random effect variances. The number of rows in each dataframe will be the same as the number of bootstrap iterations, so we set that first. The number of iterations is the number of times we want to simulate re-doing our data generation. The convention is 1000, but the more the merrier!

Number_of_boots <- 1000

The number of columns for the dataframes will equal the number of fixed effect coefficients and random effect variances. We can extract these from the initial model. First we extract the coefficients, then transpose the table into wide format.

# Extract the fixed effect coefficients.
FE_df <- fixef(lmer2) %>% 
  t() %>%
  as.data.frame()

# Extract the random effects variance and residual variance
RE_df <- VarCorr(lmer2) %>%
  as.data.frame() %>%
  unite("Level", -c(vcov, sdcor)) %>%
  select(-vcov) %>%
  t() %>%
  as.data.frame()

Next, we create empty dataframes to take our bootstraps.

BS_params <- data.frame(matrix(nrow = Number_of_boots, ncol = ncol(FE_df)))
colnames(BS_params) <- colnames(FE_df)

BS_var <- data.frame(matrix(nrow = Number_of_boots, ncol = ncol(RE_df)))
colnames(BS_var) <- RE_df["Level",]

In addition, we will be predicting marginal values from each model. So, we need to create a prediction dataframe with an empty column to store the predicted values. For this example, we only need to predict values for a handful of x values that represent the range of xs. I chose to use 10-quantiles because I want to be able to fit a non-linear confidence band later. If this was a non-linear fit, we might want even more prediction values.

BS_pred <- expand.grid(x = quantile(X$x, probs = seq(0, 1, length.out = 10)),
                       iterration = 1:Number_of_boots,
                       pred = NA)

Finally, we can write a loop that creates a resampled dataset (with replacement) and fits the original model formula to the new dataset. From the new model, we can then extract the fixed and random effects and predict marginal values for the subset of x values. All of these get stored in their respective dataframes, indexed by the iteration number.

for(i in 1:Number_of_boots){
  BS_X <- slice_sample(X, prop = 1, replace = TRUE)
  BS_lmer <- lmer(formula = lmer2@call$formula,
                  data = BS_X)

  BS_params[i,] <- BS_lmer %>%
    fixef() %>%
    t() %>%
    as.data.frame()

  BS_var[i,] <- BS_lmer %>%
    VarCorr() %>%
    as.data.frame() %>%
    .$sdcor

  BS_pred[which(BS_pred$iterration == i),]$pred <- predict(BS_lmer,
                                                           newdata = BS_pred[which(BS_pred$iterration == i),],
                                                           re.form = ~0)
}

Now we have a dataframe of the marginal (i.e. fixed effect) intercept and slope parameter estimates from 1000 models fit to bootstraps of our original data.

head(BS_params)
> head(BS_params)
  (Intercept)        x
1    18.06942 2.096240
2    18.18005 2.070043
3    18.13506 2.093110
4    18.77862 1.928048
5    18.47963 2.013831
6    18.28875 2.005947

One way to get a sense of our confidence in these parameter estimates is to take a look at their distributions.

BS_hist_x <- BS_params %>%
  ggplot(aes(x = x)) +
  geom_histogram()

BS_hist_intercept <- BS_params %>%
  ggplot(aes(x = `(Intercept)`)) +
  geom_histogram()

BS_hists <- plot_grid(
  BS_hist_intercept,
  BS_hist_x,
  nrow = 1)

BS_hists
Histograms of marginal (i.e. fixed effect) intercept (left) and slope (right) parameters.

These histograms tell us so much more than a typical confidence interval because we can see the full distribution. We can see that the baseline effect on y given x = 0 is around 18.5 and we are very confident it is between 17 and 20. We can also see that the effect of 1 unit change in x is expected to yield about a 2.0 unit change in y and we are extremely confident that the slope is positive and greater than 1.7 but probably less than 2.3.
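If you do need to report conventional intervals alongside the histograms, the bootstrap distributions collapse to percentile intervals in one line (a sketch using the BS_params dataframe from above):

```r
# Median and 95% percentile interval for each bootstrapped fixed effect
apply(BS_params, 2, quantile, probs = c(0.025, 0.5, 0.975))
```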

Although we won’t directly visualize the random effect variance, we can see the estimates in the BS_var dataframe.

BS_var %>% head()

> BS_var %>% head()
  unit_(Intercept)_NA unit_x_NA unit_(Intercept)_x Residual_NA_NA
1            28.30440 0.4427450         0.01009018       9.786112
2            30.31353 0.6265168        -0.42315981       9.365630
3            30.33369 0.5261389        -0.78128072       9.320844
4            28.51896 0.6079722        -0.08575830       9.739881
5            30.50690 0.5835157        -0.64068868       9.718633
6            29.17012 0.5182395        -0.28244154       9.930305

Here, the first column is the estimated variance attributed to the random intercepts, the second column is the variance estimate of the random slopes, and the third column is the correlation between the random effects. The fourth and final column is the residual variation after accounting for the random effects (i.e. the conditional residual variation). This is information you’d want to include in a table of the model output.
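As a sketch of how you might build that table, summarize each column of the BS_var dataframe with its median and a 90% percentile interval:

```r
library(dplyr)
library(tidyr)

BS_var %>%
  pivot_longer(everything(), names_to = "component", values_to = "sdcor") %>%
  group_by(component) %>%
  summarise(median = median(sdcor),
            lo90 = quantile(sdcor, 0.05),
            hi90 = quantile(sdcor, 0.95))
```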

We also have a dataframe of 10 predicted values for each iteration.

head(BS_pred, n = 20)
> head(BS_pred, n = 20)
             x iterration     pred
1  0.008197445          1 18.08660
2  1.098954093          1 20.37309
3  2.231045775          1 22.74622
4  3.266109165          1 24.91596
5  4.323574828          1 27.13267
6  5.667717615          1 29.95031
7  6.672807375          1 32.05722
8  7.748743587          1 34.31264
9  8.793447507          1 36.50259
10 9.968969584          1 38.96677
11 0.008197445          2 18.19702
12 1.098954093          2 20.45493
13 2.231045775          2 22.79841
14 3.266109165          2 24.94104
15 4.323574828          2 27.13003
16 5.667717615          2 29.91247
17 6.672807375          2 31.99305
18 7.748743587          2 34.22028
19 8.793447507          2 36.38286
20 9.968969584          2 38.81624

Rather than using a traditional confidence band (which basically reduces the distribution of our bootstraps down to two points: high and low), I prefer to actually show all of the iterations and let the density of the lines make a kind of optical illusion of a confidence band.

Since our estimates are pretty tightly distributed, I’m using cowplot to show what this looks like while also zooming in to a section.

plot_grid(
  BS_pred %>%
    ggplot(aes(x = x, y = pred)) +
    geom_line(aes(group = iterration), alpha = 0.1, col = "grey50") +
    geom_line(data = X,
              aes(x = x, y = fit2.m)) +
    geom_rect(aes(ymin = 35, ymax = 40,
                  xmin = 8, xmax = 10),
              col = "firebrick",
              fill = NA,
              size = 2),
  BS_pred %>%
    ggplot(aes(x = x, y = pred)) +
    geom_line(aes(group = iterration), alpha = 0.1, col = "black") +
    geom_line(data = X,
              aes(x = x, y = fit2.m),
              col = "grey", 
              size = 2) +
    coord_cartesian(xlim = c(8, 10),
                    ylim = c(35, 40)) +
    geom_rect(aes(ymin = 35, ymax = 40,
                  xmin = 8, xmax = 10),
              col = "firebrick",
              fill = NA,
              size = 2) +
    theme(axis.line = element_blank(),
          axis.text = element_blank(),
          axis.title = element_blank()) +
    labs(x = "", y = ""),
  nrow = 1
)
Marginal effects of linear mixed effect regressions fit to 1000 bootstrap resamples of the original data (thin grey lines in left and thin black lines in right panel). The model estimates fit to the original data are shown as a thick black (left) or grey (right) line. The right panel is a detailed view of the region indicated with a red rectangle in the left panel.

Of course, we can also create a more traditional 90% confidence band by summarizing the 5th and 95th percentiles across each x value.

BS_pred %>%
  group_by(x) %>%
  summarise(hi = quantile(pred, 0.95),
            lo = quantile(pred, 0.05)) %>%
  ggplot(aes(x = x)) +
    geom_ribbon(aes(ymin = lo, ymax = hi),
                fill = "grey50",
                alpha = 0.3) +
    geom_line(data = X,
              aes(x = x, y = fit2.m))
Model fit of a linear mixed effect regression (black line) and 90% confidence band (grey band) estimated from 1000 bootstraps.

Putting it all together

Putting it all together, here is my preferred visualization of a mixed effect model with random intercepts and slopes, using bootstrapping to display uncertainty.

BS_ci_lines <- BS_pred %>%
  ggplot(aes(x = x, y = pred)) +
    geom_line(aes(group = iterration), alpha = 0.1, col = "grey") +
    geom_line(data = X,
              aes(x = x, y = fit2.m)) +
    geom_point(data = X,
               aes(x = x, y = fit2.m + resid2, col = unit),
               alpha = 0.3,
               pch = 16)

plot_grid(
  BS_ci_lines,
  plot_grid(
    pinsert_lmer2,
    plot_grid(
      BS_hist_intercept,
      BS_hist_x,
      nrow = 1,
      labels = c("C", "D")),
    nrow = 2,
    rel_heights = c(1, 0.7),
    labels = c("B", NA)
  ),
  nrow = 1,
  rel_widths = c(1, 0.7),
  labels = c("A", NA)
)
Linear fixed effect (A, solid black line) of a mixed effect regression fit to the original data and estimates from 1000 bootstrap resamples (A, thin grey lines) with conditional residuals for experimental units (A, points). Correlation of BLUPs (Best Linear Unbiased Predictors) of random intercept and slope parameters of experimental units with marginal density distributions (B). Histograms of marginal (i.e. fixed effect) intercept (C) and slope (D) parameters estimated from 1000 bootstraps. Colors in A and B differentiate experimental units.

Or, if we want to get really fancy with it, we could inset everything into one plot panel.

BS_ci_lines +
  coord_cartesian(ylim = c(-30, 100)) +
  annotation_custom(ggplotGrob(BS_hists),
                    xmin = 5,
                    xmax = 10,
                    ymin = -30,
                    ymax = 5
  ) +
  annotation_custom(ggplotGrob(pinsert_lmer2),
                    xmin = 0,
                    xmax = 4,
                    ymin = 50,
                    ymax = 110
  )
]]>
2224
Smartphone hemispherical photography https://www.azandisresearch.com/2020/12/16/smartphone-hemispherical-photography/ Wed, 16 Dec 2020 23:41:13 +0000 http://www.azandisresearch.com/?p=1809 Hemispherical photography is one of those tasks often prefaced by the statement, “How hard could it be?” I’m pretty sure I said something like this at the beginning of my PhD when we wanted to ask how the canopy over wood frog ponds influences their larval ecology.

Now, five years, four blog posts, and uncountable hours later, I can say that measuring canopy structure with hemispherical photos is surprisingly difficult.

One of the biggest hurdles is understanding the equipment and deploying it properly in the field. For me, nothing is more tedious than standing waist deep in a mucky, mosquito-infested pond while I fiddle around with camera exposure settings and fine-tune my leveling device. Add to that the constant fear of dropping a couple thousand dollars of camera and specialized lens into the water, and you get a good sense of my summers.

So, it is with great pleasure that I offer an alternative method for capturing canopy photos that requires nothing but a cell phone, yet produces higher quality images than a traditional DSLR setup. This new method exploits the spherical panorama function available on most cameras (or in the free Google Street View app). Hemispherical photos can then be easily extracted and remapped from the spheres. You can check out the manuscript at Forestry here (a PDF is available on my Publications page) or continue reading while I walk through the paper below.

From figure 2 of the paper: Comparisons of smartphone spherical panorama hemispherical photographs (SSP HP) (right B and C) to traditional DSLR hemispherical photographs (DSLR HP) (left B and C) captured at the same site. Details of the same subsection of the canopy, indicated by orange boxes, are expanded in C. Binarized images are shown below color images in B and C.

The old way

The most common way to measure canopy structure these days is with the use of hemispherical photographs. These images capture the entire canopy and sky from horizon to horizon. Assuming proper exposure, we can categorize individual pixels as either sky or canopy and run simple statistics to count the number of sky pixels versus canopy pixels or the number and size of the gaps between canopy pixels. We can also plot a sun path onto the image and estimate the amount of direct and indirect light that penetrates through the canopy. (You can follow my analysis pipeline in this post).
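The pixel counting itself is trivial once the image is binarized. As a toy sketch (with a simulated 0/1 matrix standing in for a real binarized photo, where 1 = sky and 0 = canopy):

```r
set.seed(1)
# Simulated binarized image: 1 = sky pixel, 0 = canopy pixel
binary_img <- matrix(rbinom(100 * 100, size = 1, prob = 0.35), nrow = 100)

canopy_openness <- mean(binary_img)  # proportion of sky pixels
canopy_closure  <- 1 - canopy_openness
```

Real pipelines additionally restrict the calculation to the circular image area and weight by zenith angle, which is what programs like GLA handle for you.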

All of this analysis relies on good hemispherical images. But the problem is that there are many things that can go wrong when taking canopy photos, including poor lighting conditions, bad exposure settings, improperly oriented camera, etc. Another problem is that capturing images of high-enough quality requires a camera with a large sensor, typically a DSLR, a specialized lens, and a leveling device, which can cost a lot of money. Most labs only have one hemispherical photography setup (if any), which means that we sacrifice the number of photos we can take in favor of high-quality images.

The new way

In the past few years, researchers have tried to figure out ways to get around this equipment barrier. Folks have tried eschewing the leveling device, using clip-on hemispherical lenses for smartphones, or using non-hemispherical smartphone images. I even tested using a hemispherical lens attachment on a GoPro.

But, none of these methods really produce images that are comparable to the images from DSLRs, for three reasons:

  1. Smartphone sensors are tiny compared to DSLR sensors, so there is a huge reduction in quality.
  2. Clip-on smartphone lenses are tiny compared to DSLR lenses, so again, there is a huge reduction in optical quality.
  3. Canopy estimates are sensitive to exposure settings and DSLRs allow for more control over exposure.

The method I developed gets around all of these issues by using multiple, individual cell phone images to stitch together a single hemispherical image. Thus, instead of relying on one tiny cell phone sensor, we are effectively using many tiny cell phone sensors to make up the difference.

Another advantage of creating a hemispherical image out of many images is that each individual image only has to be exposed for a portion of the sky. This avoids the problems of glare and variable sky conditions that plague traditional systems. An added benefit is that smartphone cameras operate in a completely different way than DSLRs, so they are much less sensitive to exposure issues in general.

Smartphones are less sensitive to exposure issues because, unlike DSLRs that capture a single instance on the sensor when you hit the shutter button, smartphone cameras use computational photography techniques that blend the best parts of many images taken in short succession. You may not realize it, but your smartphone is constantly taking photos as soon as you turn it on (which makes sense since you can see the scene from the camera on your screen). The phone stores about 15 images at a time, constantly dumping the older versions out of temporary memory as updated images pour in. When you hit the button to take a picture, your phone then automatically blends the last few images with the next few images. The phone’s software selects the sharpest pixels with the most even contrast and color from each image and then composites those into the picture presented to you. With every new software update, the algorithms for processing images get better and better. That’s why modern cell phones are able to take photos that can compete with mid-range DSLRs despite the limitations of their tiny sensors.

So, if each phone photo is essentially a composite of 15 images, and we then take 18 of those composite images and stitch them into a hemispherical image, we are effectively comparing a sensor the size of 270 individual phone camera sensors to the DSLR sensor.

The best part is that there is already software that can do this for us via the spherical panorama feature included with most contemporary smartphone cameras. This feature was introduced in the Google Camera app back in 2012 and iOS users can access the feature via the Google StreetView app. It is incredibly simple to use.

Update: Check out my post on tips for taking spherical panoramas

Once you’ve taken a spherical panorama, it is stored in your phone as a 2D JPEG in equirectangular format. Better still, the photo sphere software utilizes your phone’s gyroscope and spatial mapping abilities to automatically level the horizon. This is advantageous for two reasons. First, it means we can ditch the tedious leveling devices. Second, it means that the equirectangular image can be perfectly split between the upper and lower hemisphere. We simply have to crop the top half of the rectangular image and remap it to polar coordinates to get a circular hemispherical image.

Figure 1 from Arietta (2020)
Figure 1 from the paper: Spherical panoramas (A) are stored and output from smartphones as 2D images with equirectangular projection (B). Because spherical panoramas are automatically leveled using the phone gyroscope, the top half of the equirectangular image corresponds to the upper hemisphere of the spherical panorama. The top portion of the equirectangular image (B) can then be remapped onto the polar coordinate plane to create a circular hemispherical photo (C). In all images, zenith and azimuth are indicated by Θ and Φ, respectively.

How to extract hemispherical images from spherical panoramas

UPDATE: Please see my latest post to process spherical images with R.

Command line instructions

If you are proficient with the command line, the easiest way to extract hemispherical images from photo spheres is to use ImageMagick. After you download and install the program you can run the script below to convert all of your images with just a couple lines of code.

cd "YOUR_IMAGE_DIR"

magick mogrify -level 2%,98% -crop 8704x2176+0+0 -resize "8704x8704!" -virtual-pixel horizontal-tile -background black +distort Polar 0 -flop -flip *jpg

You may need to make a few modifications to the script for your own images. The -crop 8704x2176+0+0 flag crops the top half of the image (i.e. the upper hemisphere). Be sure to adjust this to the full width of your panorama and one quarter of that width for the height. The -resize "8704x8704!" flag resizes the image into a square in order to apply a polar transformation. Be sure to adjust both values to the full width of your panorama.

Note that the code above will convert and overwrite all of the .jpg files in your folder to hemispherical images. I suggest that you practice on a folder of test images or a folder of duplicates to avoid any mishaps.
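If you would rather stay in R, the same steps can be sketched with the magick package. This is a rough, assumed equivalent of the command above (I'm assuming image_distort() with the "Polar" method behaves like +distort Polar 0, the input filename pano.jpg is hypothetical, and the -virtual-pixel horizontal-tile setting has no direct analog here, so edge pixels may differ slightly):

```r
library(magick)

img <- image_read("pano.jpg")  # hypothetical input file
w   <- image_info(img)$width

hemi <- img %>%
  image_crop(geometry_area(w, w / 4, 0, 0)) %>%  # keep the upper hemisphere
  image_resize(paste0(w, "x", w, "!")) %>%       # stretch into a square
  image_distort("Polar", c(0)) %>%               # remap to polar coordinates
  image_flop() %>%                               # mirror left-right
  image_flip()                                   # mirror top-bottom

image_write(hemi, "hemi.jpg")
```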

GUI instructions

If you are intimidated by the command line, extracting hemispherical images from photo spheres is also easy with GIMP (I used GIMP because it is free, but you can follow the same steps in Photoshop).

Update: You can also try out this cool web app developed by researchers in Helsinki which allows you to upload spherical panoramas from your computer or phone and automatically converts them to hemispherical images that you can download. However, I would not suggest using this tool for research purposes because the app fixes the output resolution at 1000p, so you lose all of the benefits of high-resolution spherical images.

Spherical panoramas are stored as 2D equirectangular projections from which hemispherical images can be extracted in GIMP.

First, crop the top half of the rectangular photo sphere.

Crop the top half of the panorama.

Second, scale the image into a square. I do this by stretching the image so that the height is the same size as the width. I go into why I do this below.

Scale the image into a square.

Third, remap the image to a polar projection. Go to Filters > Distorts > Polar Coordinates.

Settings for remapping the panorama into a polar projection.
Once mapped onto polar coordinates, the image is now a circular hemispherical image.

Fourth, I found that increasing the contrast slightly helps the binarization algorithms find the correct threshold.

All of these steps can be automated in batch with the BIMP plugin (a BIMP recipe is available in the supplemental files of the paper). This can also be automated from the command line with ImageMagick (see the scripts above and in the supplemental materials of the paper).

The result is a large image with a diameter equal to the width of the equirectangular sphere. Because we are essentially taking columns of pixels from the rectangular image and mapping them into “wedges” of the circular image, we will always need to downsample pixels toward the center of the circular image. Remember that each step out from the center of the image is the same as each step down the rows of the rectangular image. So, the circumference of every ring of the circular image is generated from a row of pixels that is the width of the rectangular image.

With a bit of geometry, we can see that the circumference matches the width of our rectangular image (i.e. 1:1 resolution) at zenith 57.3 degrees. Zenith rings below 57.3 will be downsampled and those above will be scaled up and new pixels will be interpolated into the gaps. Conveniently, 57.3 degrees is 1 radian. The area within 1 rad, from zenith 0° to 57°, is important for canopy estimates as gap fraction measurements in this portion of the hemisphere are insensitive to leaf inclination angle, allowing for estimates of leaf area index without accounting for leaf orientation.

Thus, we retain most of our original pixel information within this critical portion of the image, but it does mean that we are expanding the pixels (increasing the resolution) closer to the horizon. I tested the impact of resolution directly in my paper and found almost no difference in canopy estimates; so, it is probably okay to downscale images for ease of processing if high resolution is not needed.

The hemispherical image produced can now be analyzed in any pipeline used to analyze DSLR hemispherical images. You can see the pipeline I use in this post.

How do images from smartphone panoramas compare to DSLR?

In my paper, I compared hemispherical photos taken with a DSLR against those extracted from a spherical panorama. I took consecutive photos at 72 sites. Overall, I found close concordance between measures of canopy structure (canopy openness) and light transmittance (global site factor) between the methods (R2 > 0.9). However, the smartphone images were of much greater clarity and therefore retained more detailed canopy structure that was lost in the DSLR images.

Figure 4 from the paper: Difference in canopy structure and light environment estimates between reference (standard DSLR HP) and full resolution SSP HP (dark orange), low resolution SSP HP downsampled to match the standard DSLR resolution (light orange), fisheye HP (blue), and DSLR HP with exposure adjusted from +5 to -5 (light to dark). SSP HP images were generated from spherical panoramas taken with Google Pixel 4a and Google Camera. Fisheye HP images were simulated from smartphone HP for two intersecting 150° FOV images from a Pixel 4a. DSLR HP were captured with Canon 60D and Sigma 4.5mm f2.8 hemispherical lens.

Although the stitching process occasionally produces artifacts in the image, the benefits of this method far outweigh the minor problems. Care when taking the panorama images, as well as ever-improving software, will help to minimize imperfect stitching.

Figure 2 from the paper: Comparisons of smartphone spherical panorama hemispherical photographs (SSP HP) (right B and C) to traditional DSLR hemispherical photographs (DSLR HP) (left B and C) captured at the same site. Details of the same subsection of the canopy, indicated by orange boxes, are expanded in C. Binarized images are shown below color images in B and C. Image histograms differ in the distribution of luminance values in the blue color plane (A). In panel E, a section of the canopy from full resolution SSP HP (left), downsampled SSP HP (middle), and DSLR HP (right) is further expanded to demonstrate the effect of image clarity on pixel classification. An example of an incongruous artifact resulting from misalignment in the spherical panorama is outlined in blue in A and expanded in D.

Overall, this method is not only a good alternative, it is probably even more accurate than traditional methods because of the greater clarity and robustness to variable exposure. My hope is that this paper will help drive more studies in the use of smartphone spheres for forest research. For instance, 360° horizontal panoramas could be extracted for basal area measurement or entire spheres could be used to spatially map tree stands. The lower hemisphere could also be extracted and used to assess understory plant communities or leaf litter composition. Researchers could even enter the sphere with a virtual reality headset in order to identify tree species at their field sites from the comfort of their home.

Mostly, I’m hopeful that the ease of this method will allow more citizen scientists and non-experts to collect data for large-scale projects. After all, this method requires no exposure settings, no additional lenses, and is automatically levelled. The geolocation and compass heading can even be extracted from the image metadata to automatically orient the hemispherical image and set the location parameters in analysis software. Really, anyone with a cell phone can capture research-grade spherical images!


Be sure to check out my other posts about canopy research that cover the theory, hardware, field sampling, and analysis pipelines for hemispherical photos, and my tips for taking spherical panoramas.

]]>
1809
Analyzing Hemispherical Photos https://www.azandisresearch.com/2019/02/03/analyzing-hemispherical-photos/ Mon, 04 Feb 2019 02:06:15 +0000 http://www.azandisresearch.com/?p=1205 Now that you’ve got the theory, equipment, and your own hemispherical photos, the final steps are processing the photos and generating estimates of canopy and light values.

We’ll be correcting photos and then converting to a binary image for analysis just like those in the slider above.

There are many paths to get light estimates from photos. Below, I’ve detailed my own pipeline which consists of these main steps:

  • Standardizing exposure and correcting photos
  • Binarization
  • File conversion
  • Model configuration
  • Analyzing images

For this pipeline, I use Adobe Lightroom 5 for standardizing the photos and masking, ImageJ with Hemispherical 2.0 plugin for binarization and image conversion, Gap Light Analyzer (GLA) for estimating light values, and AutoHotKey for automating the GLA GUI. All of these programs, with the exception of Lightroom, are freely available. You could easily replicate all of the steps I show for Lightroom in GIMP, an open-source graphics program.

There are probably stand-alone proprietary programs out there that can analyze hemispherical photos in a more streamlined manner within a single program, but personally, I’m willing to sacrifice some efficiency for free and open-source methods.

1. Standardizing photos:

With the photos on your computer, the first step is to standardize the exposure. In theory, exposure should be standardized in the field by calibrating the reference exposure to the open sky (as outlined in my previous post). However, when taking many photos throughout the day, or on days with variable cloud cover, there will be variation among the photos. Similarly, snowy days or highly reflective vegetation can introduce anomalies. Fortunately, you can reduce much of this variability with post-processing adjustments. We will apply some adjustments across the entire image, while others will target specific elements within it.

Across the entire image, we can right-shift the exposure histogram so that the brightest pixels in the sky are always fully white. This standardizes the sky to the upper end of the brightness distribution across the image. So, even if we capture one photo with relatively dark clouds and another with relatively bright clouds over the course of a day, after standardizing, the difference between the white sky and dark canopy will be consistent.
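The right-shift can be sketched as a simple linear rescale (a toy illustration with made-up pixel values; in practice you would use the exposure tools in Lightroom or GIMP rather than code):

```python
# Toy sketch of right-shifting the exposure histogram: linearly rescale
# brightness so the brightest pixel becomes fully white (255). The pixel
# values below are hypothetical examples, not from a real photo.
def stretch_to_white(values):
    peak = max(values)  # brightest pixel in the frame (assumed to be sky)
    return [round(v * 255 / peak) for v in values]

# A frame shot under dark clouds: sky pixels top out at 200, not 255.
frame = [10, 100, 200]
standardized = stretch_to_white(frame)  # sky is now fully white
```

After this rescale, frames shot under dark and bright clouds both end with their sky pixels at 255, so the sky-to-canopy contrast is comparable across photos.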

Example using a local adjustment brush to recover the effect of sun flare.

If elements of the image are highly reflective, an additional step to mask these areas is required to avoid underestimating canopy closure. For instance, extremely bright tree bark or snow on the ground should be masked. I do this by painting a -2EV exposure brush over the area to ensure it will be converted to black pixels in the binarization step. Unfortunately, this must be done by hand on each photo.

Example using a local adjustment brush to mask reflective areas in the image.

Sun flare can also be partially corrected, but again, it requires hand-editing each photo. I find that a local exposure brush that depresses the highlights and increases the contrast will recover pixels that have been overexposed by sun flare, but only to an extent. It is best to avoid flare in the first place (see my last post for tips on avoiding flare).

2. Binarization

Binarization is the process of converting all color pixels in an image to black (canopy) or white (sky). The most basic method is to set a defined tonal threshold above which a pixel is converted to white and below which it is converted to black. This type of simple thresholding runs into problems when pixels are ambiguous, especially at the edges of solid canopy objects. To avoid these problems, a number of algorithms have been developed that take a more complex approach. I won’t go into details here, but I suggest reading Glatthorn and Beckschäfer (2014).
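The basic fixed-threshold approach can be sketched in a few lines (this illustrates simple thresholding on the blue channel only, NOT the moving-window “Minimum” algorithm that Hemispherical 2.0 actually uses; the pixel values are made up for illustration):

```python
# Minimal sketch of simple fixed-threshold binarization on the blue channel.
# Pixels are (R, G, B) tuples with values 0-255; white (255) = sky,
# black (0) = canopy.
def binarize_blue(pixels, threshold=128):
    return [[255 if b >= threshold else 0 for (_, _, b) in row]
            for row in pixels]

# A hypothetical 2x2 image: dark canopy pixels and bright sky pixels.
img = [[(40, 60, 30), (200, 220, 250)],
       [(10, 20, 15), (180, 190, 240)]]
binary = binarize_blue(img)

# Canopy closure estimate = share of black pixels in the binary image.
flat = [p for row in binary for p in row]
canopy_fraction = flat.count(0) / len(flat)
```

The ambiguous-pixel problem is easy to see here: a mixed canopy-edge pixel with blue value near 128 flips between “sky” and “canopy” with tiny changes in exposure, which is exactly why the more complex algorithms exist.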

Examples of pixels that pose significant problems for thresholding and binarizing images. From Glatthorn & Beckschäfer 2014.

I explored a number of binarization programs while developing my project, including SideLook 1.1 (Nobis, 2005) and the in-program thresholding capability in Gap Light Analyzer (Frazer et al. 2000). However, both of these programs fell short of my needs. In the end, I found the Hemispherical 2.0 plugin (Beckschäfer 2015) for ImageJ most useful because it handles batch processing easily.

You can download the Hemispherical 2.0 plugin and manual here (scroll to the bottom of the page under “Forest Software”). This plugin uses a “Minimum” thresholding algorithm on the blue color plane of the image (skies are blue, so the blue plane yields the highest contrast). The threshold is based on a histogram populated from a moving window of neighboring pixel values to ensure continuity of canopy edges (see Glatthorn & Beckschäfer 2014 for details).

Once the plugin is installed, you can run the binarization process in batch on a folder containing all of your adjusted hemispherical photo files. I’ve found the program lags if I try to batch process more than 100 photos at a time; so, I suggest splitting datasets into multiple folders with fewer than 100 files if you are processing lots of photos.

Example of a standard registration image. Naming this file starting with “000” or “AAA” will ensure it is the first image processed in the folder.

Before you begin the binarization and analysis process, I suggest making a registration image. The first step in both the binarization and analysis programs is defining the area of interest in the image, excluding the black perimeter. This can be hard to do depending on which photo is first in the stack (for instance, if the ground is very dark and blends into the black perimeter). A registration image allows you to easily and consistently define your area of interest. I like to make a standardized registration image for each of my camera/lens combinations.

The easiest way to make a registration image is to simply take a photo of a white wall and increase the contrast in an editing program until you have hard edges. Or, you can overlay a solid color circle that perfectly matches the bounds of the hemisphere (as I’ve done in the photo above). I also like to include a demarcated point that indicates a Northward orientation (this is at the top of the circle, in my case). I use this mark as a starting point for manually dragging the area-of-interest selection tool in both GLA and Hemispherical 2.0.

3. File Conversion

Hemispherical 2.0 outputs binarized images as TIFF files, but GLA can only accept bitmap images. So, we need to convert the binary images to bitmap files. This can easily be accomplished in ImageJ (from the Menu go to Process > Batch > Convert).

4. Model configuration

Now that our files are processed, we can finally move to GLA and compute our estimates of interest.

The first step is to establish the site specific parameters in the “Configure” menu. I’ll walk through these parameters tab-by-tab where they pertain to this pipeline. You can read more about each setting in the GLA User Manual.

“Image” tab:

In the “Registration” box, define the orientation of the photo. If you used a registration image with a defined North point, select “North” for the “Initial Cursor Point.” Select if your North point in the photo is geographic or magnetic. If the latter, input the declination.

In the “Projection distortion” box, select a lens distortion or input custom values (see my previous post for instructions on computing these for your lens).

“Site” tab:

Input the coordinates of the image in the “Location” box. Note that we will use a common coordinate for all photos in the batch. This only works if there is not much distance between photos (I use 10km as a rule of thumb, but realistically, even 100km probably makes very little difference). If your sites are far apart, remember to set new coordinates for each cluster of neighboring sites.

If you ensure that your images point directly up as I suggest in my previous post, you should select “Horizontal” in the “Orientation” box. Note that in this case, “Horizontal” means “horizontal with respect to gravity” not “horizontal to the ground” which might not be the same thing if you are standing on a slope.

The “Topographic Shading” setting is not important if our only interest is estimating light. If you are interested in the canopy itself, this tool can be used to exclude cases where, for instance, a tall mountain behind the canopy blocks light and would otherwise be counted as a canopy element by the program.

“Resolution” tab:

In the “Suntrack” box, think carefully about your season start and end dates before setting them. If you decide to consider other seasonal timespans later, you’ll need to rerun all of these analyses. Remember that if you are estimating light during spring leaf-out or fall senescence, you’ll need to run separate analyses for each season with corresponding images.

“Radiation” tab:

Assuming that you did not directly measure radiation values alongside your images, we need to model these parameters. These are probably the most difficult parameters to set because they will require some prior analysis from other data on radiation for your site.

The NSRDB has many observation locations across the U.S. Most of these have long-term data with low uncertainty in daily estimates.

You can download radiation values from local observation points from the National Solar Radiation Database website. From these data, you can calculate parameter estimates for the Cloudiness index, Beam fraction, and Spectral fraction. Remember to use only data for the seasonal window that matches the seasonal window of interest in your photos.

Cloudiness index (Kt):

This is essentially a measure of the proportion of the total radiation entering the atmosphere (Ho) that makes it to the ground surface (H) (i.e. is not reflected by clouds):

Kt = (H + 0.01) / (Ho + 0.01)

where H is the Modeled Global Horizontal radiation (Glo Mod (W/m^2)) and Ho is the Hourly Extraterrestrial radiation (ETRN (W/m^2)), with 0.01 added to each to avoid dividing by zero.
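In code, the calculation over a column of NSRDB hourly values is a one-liner (a minimal sketch; the radiation values below are made-up daytime examples, not real NSRDB data):

```python
# Sketch: Cloudiness index (Kt) from NSRDB hourly values.
# glo_mod = modeled global horizontal radiation, "Glo Mod (W/m^2)" column
# etrn    = hourly extraterrestrial radiation,   "ETRN (W/m^2)" column
def cloudiness_index(glo_mod, etrn):
    # 0.01 added to numerator and denominator to avoid dividing by zero
    return [(h + 0.01) / (ho + 0.01) for h, ho in zip(glo_mod, etrn)]

glo_mod = [120.0, 480.0, 610.0]   # W/m^2 at three example daytime hours
etrn    = [350.0, 980.0, 1100.0]  # W/m^2
kt_hourly = cloudiness_index(glo_mod, etrn)
kt_mean = sum(kt_hourly) / len(kt_hourly)  # seasonal-window average for GLA
```

Average Kt over all hours in your seasonal window to get the single value you enter in GLA's “Radiation” tab.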

Beam fraction (Hb/H):

Beam fraction is the ratio of direct-beam radiation (Hb) to the total radiation that reaches the ground (H). As you can imagine, as cloudiness increases, direct-beam radiation decreases because it is diffused by the clouds; thus, we can estimate beam fraction as a function of cloudiness (Kt).

The constants -3.044 and 2.436 in Frazer et al.’s (1999) equation are empirically derived (see their paper for details and extensions).

Spectral fraction (Rp/Rs):

Spectral fraction is the ratio of photosynthetically active radiation wavelengths (Rp) to all broadband shortwave radiation (Rs). Frazer et al. (1999) provide an equation for calculating Spectral fraction as a function of the Cloudiness index (Kt).

Frazer et al. (1999) provide guidance on computing this as a monthly mean, but for most analyses, I think this overall average is probably fine.

The Solar constant is not region-specific, so the default value should be used.

The Transmission coefficient basically describes how much solar transmittance is expected on a perfectly clear day and is mainly a function of elevation and particulate matter in the air (i.e. dusty places have lower clear sky transmission). Frazer et al. (1999) suggest that this coefficient ranges from 0.4 to 0.8 globally and between 0.6 and 0.7 for North America. Therefore, they use a default of 0.65. Unless you have good reason to estimate your own transmission values, this default should be fine.

It can be very intimidating to pick values for these parameters. To be honest, I suspect that most practitioners never even open up the configuration menu, and instead, simply use the default parameters of the program. In the end, no matter what values you use to initialize your model, the most critical step is recording the values you chose and justifying your choice. As long as you are transparent with your analysis, others can repeat it and adjust these parameters in order to make direct comparisons with their own study system. Note: you can easily save this configuration file within GLA, too.

5. Analyzing images

Once GLA is configured, you will need to register the first image. If you’ve named your registration image something like “AA000” as I have, it will be the first in your stack. Place the cursor on the known north point and drag to fit the circumference. Be sure to select the “Fix Registration for Next Image” button and also record the coordinate values for the circle.

Now that the first image is registered, we can automate the analysis process for the rest of the images. Unfortunately, GLA does not have a command line version and the GUI does not offer a method of batch processing. So, I wrote a script for the AutoHotKey program that replicates keyboard and mouse inputs. You can download AutoHotKey here. The free version is fine. For best operation, I recommend the 32-bit version.

I’ve included my AutoHotKey script below. You will need to use a text editor to edit the `Loop, Files` line of the macro, replacing ENTER_YOUR_FILEPATH_HERE with the file path to your image folder. Save the file as an AutoHotKey file (.ahk) and run it as administrator. Now you can pull up the GLA window and use the F6 shortcut to start the macro.

You should see the program move your cursor to open windows and perform operations, like in the video above. Since AutoHotKey replicates mouse and keyboard movements, any use of your computer while it runs will interrupt the macro. I suggest not touching your computer until the program is done (go get a cup of coffee while your new robot does all of your analysis for you!). If you must use your computer concurrently, you will need to run GLA and AutoHotKey in a virtual machine.


; AutoHotKey v1 settings for consistent, low-latency GUI automation
#NoEnv
SetWorkingDir %A_ScriptDir%
CoordMode, Mouse, Window
SendMode Input
#SingleInstance Force
SetTitleMatchMode 2
#WinActivateForce
SetControlDelay 1
SetWinDelay 0
SetKeyDelay -1
SetMouseDelay -1
SetBatchLines -1

F6::  ; press F6 (with GLA open) to start the batch loop
Macro1:
Loop, Files, ENTER_YOUR_FILEPATH_HERE*.*, F  ; replace with the path to your image folder
{
; Open the next image (File > Open Image...)
WinMenuSelectItem, Gap Light Analyzer, , File, Open Image..
WinWaitActive, Gap Light Analyzer ahk_class #32770
Sleep, 333
ControlClick, Button1, Gap Light Analyzer ahk_class #32770,, Left, 1,  NA
Sleep, 10
WinWaitActive, Open file... ahk_class #32770
Sleep, 333
ControlSetText, Edit1, %A_LoopFileName%, Open file... ahk_class #32770
ControlClick, Button2, Open file... ahk_class #32770,, Left, 1,  NA
Sleep, 1000
Sleep, 1000
WinWaitActive, Gap Light Analyzer  ; wait for the image to finish loading
Sleep, 333
; Apply the registration fixed from the first (registration) image
WinMenuSelectItem, Gap Light Analyzer, , Configure, Register Image
WinWaitActive, Gap Light Analyzer  ahk_class ThunderRT5MDIForm
Sleep, 333
ControlClick, ThunderRT5CommandButton1, Gap Light Analyzer  ahk_class ThunderRT5MDIForm,, Left, 1,  NA
Sleep, 10
; Overlay the sky-region grid and mask, then threshold the image
WinMenuSelectItem, Gap Light Analyzer, , View, OverLay Sky-Region Grid
WinMenuSelectItem, Gap Light Analyzer, , View, OverLay Mask
WinMenuSelectItem, Gap Light Analyzer, , Image, Threshold..
WinWaitActive, Threshold ahk_class ThunderRT5Form
Sleep, 333
ControlClick, ThunderRT5CommandButton1, Threshold ahk_class ThunderRT5Form,, Left, 1,  x34 y13 NA
Sleep, 10
; Run the calculations and save the results under the image's file name
WinMenuSelectItem, Gap Light Analyzer, , Calculate, Run Calculations..
WinWaitActive, Calculations ahk_class ThunderRT5Form
Sleep, 333
ControlClick, ThunderRT5CommandButton1, Calculations ahk_class ThunderRT5Form,, Left, 1,  NA
Sleep, 10
WinWaitActive, Calculation Summary Results ahk_class ThunderRT5Form
Sleep, 333
ControlSetText, ThunderRT5TextBox4, %A_LoopFileName%, Calculation Summary Results ahk_class ThunderRT5Form
ControlClick, ThunderRT5CommandButton2, Calculation Summary Results ahk_class ThunderRT5Form,, Left, 1,  NA
Sleep, 10
Sleep, 5000  ; allow GLA to finish saving before the next image
}
MsgBox, 0, , Done!
Return

F8::ExitApp  ; press F8 to abort the macro

F12::Pause  ; press F12 to pause/resume

When the program has looped through all of the photos, a dialogue box will pop up with “Done!”

The data output appears in a separate sub-window within GLA, usually minimized and hidden behind the registration image window. You can save the data, but GLA defaults to a semicolon-delimited text file, which is easily converted to a CSV or Excel worksheet.
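If you prefer a scripted conversion, the delimiter swap is a few lines with Python's standard library (a sketch; the column names and values in the sample text are hypothetical stand-ins for a real GLA results file):

```python
import csv
import io

# Sketch: convert GLA's semicolon-delimited output to CSV. The sample text
# below stands in for the contents of a saved GLA results file; the column
# names and values are hypothetical.
gla_text = "Image;% Canopy Openness;Trans Total\nAA001.bmp;12.4;5.81\n"

reader = csv.reader(io.StringIO(gla_text), delimiter=";")
out = io.StringIO()
csv.writer(out).writerows(reader)
csv_text = out.getvalue()  # comma-delimited, ready for Excel or R
```

For a real file, replace the `io.StringIO` objects with `open()` calls on the input and output paths.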

And you’re done!

Be sure to check out my other posts about canopy research that cover the theory, hardware, field sampling, and analysis pipelines for hemispherical photos.

Also, check out my new method of canopy photography with smartphones and my tips for taking spherical panoramas.


References

Beckschäfer, P. (2015). Hemispherical_2.0: Batch processing hemispherical and canopy photographs with ImageJ – User manual. doi:10.13140/RG.2.1.3059.4088.

Frazer, G. W., Canham, C. D., and Lertzman, K. P. (2000). Technology Tools: Gap Light Analyzer, version 2.0. The Bulletin of the Ecological Society of America 81, 191–197.

Frazer, G. W., Canham, C. D., and Lertzman, K. P. (1999). Gap Light Analyzer (GLA): Imaging software to extract canopy structure and gap light transmission indices from true-color fisheye photographs, users manual and documentation. Burnaby, British Columbia: Simon Fraser University Available at: http://rem-main.rem.sfu.ca/downloads/Forestry/GLAV2UsersManual.pdf.

Glatthorn, J., and Beckschäfer, P. (2014). Standardizing the protocol for hemispherical photographs: accuracy assessment of binarization algorithms. PLoS One 9, e111924.

Nobis, M. (2005). Program documentation for SideLook 1.1 – Imaging software for the analysis of vegetation structure with true-colour photographs. Available at: http://www.appleco.ch/sidelook.pdf.

Wilcox, S. (2012). National Solar Radiation Database 1991–2010 Update: User’s Manual. National Renewable Energy Laboratory Available at: https://www.nrel.gov/docs/fy12osti/54824.pdf.

]]>
1205
Taking hemispherical canopy photos https://www.azandisresearch.com/2018/07/24/taking-hemispherical-canopy-photos/ Tue, 24 Jul 2018 19:30:29 +0000 http://www.azandisresearch.com/?p=640 Be sure to check out the previous two posts:

1. Hemispherical Light Estimates

2. Hardware for Hemispherical Photos

Once the theory and equipment are taken care of, you are ready to go out into the field and collect data. This post will cover the when, where, and how of shooting hemispherical photos. The next and final post deals with analyzing the photos you’ve captured.

When:

Hemispherical photos are very sensitive to lighting conditions. Because you are measuring an entire half hemisphere of the sky at once with each photo, it is important that the background (i.e. the sky) is as standardized as possible so that the canopy can be accurately distinguished without bias.

To appreciate the wild exposure gradient across a hemispherical lens, wait for a clear sunset and walk out into an open field. Glance toward the setting sun for a few seconds, then turn 180 degrees and try to make out the horizon. Even though our eyes are extremely well equipped to adjust quickly to different lighting, you’ll probably have a hard time making out the dark horizon after looking at the bright sunset. Unlike your eye, which continuously adjusts to light conditions even at these extremes, a camera is stuck in one setting. Even the best camera sensors have limited tolerances, so we need to take photos at times when lighting across the entire hemisphere fits within a narrow range.

The best time to shoot is on uniformly overcast days. Overcast cloud cover yields a nice homogeneous white background that provides strong contrast with the edges of leaves and branches.

Overcast days provide a nice standard background on which the canopy structure can be easily distinguished.

If cloudy days are not in the forecast, another option is to shoot at dawn or dusk, while the sun is just below the horizon and the sky is evenly lit. The problem is that this allows only a very short window for shooting.

Sometimes, we can’t help but take photos on sunny days. Before I talk about strategies for dealing with difficult exposures, I want to explain some of the problems direct sunlight can precipitate in light estimation methods: blown-out highlights, flare, and reflection.

Overexposed highlights

Overexposed or “blown-out” highlights are an issue of exposure settings and sensor limitations. When shooting into light, individual sensor cells can only record light up to a threshold. In my last post, I compared light collecting on camera sensor cells to an array of buckets in a field collecting raindrops. Using that analogy, think of the sensor threshold as the rim of a bucket. At some point, enough rain collects that a bucket overflows; beyond that point, it records no more information about relative rainfall between buckets. In the camera, eventually too much light falls on a portion of the sensor, and those pixels max out and are recorded as fully white (i.e. no information is coded in those cells). This leads to an underestimate of canopy closure because some pixels occupied by canopy will look like white sky after binarization. We can try to correct our exposure to ensure we do not truncate those bright pixels, but on sunny days blue sky directly overhead is relatively dark compared to direct sun, so this will almost certainly lose information on the dark end of the spectrum, and we will then overestimate canopy closure.

Overexposure will blow out the brightest parts of the image and the result will be that solid canopy elements will be considered open sky in downstream analysis. This photo is uniformly overexposed, but this can also occur if the range of exposure across the hemisphere is greater than the dynamic range of the sensor.

Unfortunately, there are no easy ways to deal with wide exposure ranges (other than avoiding sunny days). One solution is to shoot in RAW format which retains more information per pixel. With RAW photos one can attempt to recover highlights and boost shadows to some extent, but because the next processing step is binarization, this will only be effective if you can sufficiently expand the pixel value range at the binarization threshold. This also entails hand-calibrating each image, which reduces standardization and replicability, and may take a lot of time if you need to process many photos.

Flare

Flare is another problem and it emerges for a couple of reasons.

Hazy-looking “veiling flare” occurs when light scatters inside the glass optics of your lens. In normal photography, it is most prevalent when the sun is oblique to the front lens surface and light gets “trapped” bouncing around inside of the glass. It can also look like a halo of light bleeding into a solid object or streaks of light radiating from a point of light depending on your aperture settings (this is called Fraunhofer diffraction and can make for very cool effects…just not in hemispherical photos!). When these streaks appear to overlay solid canopy objects in our photos it will lead to underestimation of closure.

Flare from a light source can lead to mischaracterization of solid canopy pixels as open sky by binarization algorithms, as seen in the two images on the right.

Ghosting flare looks like random orbs of light in photos. It occurs when light entering the lens reflects off the internal optics close enough to the focal plane to register on the sensor. Hemispherical lenses are incredibly prone to this type of flare because of the short focal length of wide-angle lenses.

Ghost flare, when sufficiently bright, will be considered a canopy gap in downstream estimation.

If photos must be taken in sunlight, one alternative is to at least block the direct beams of sunlight by positioning the sun behind a natural sunblock or fashioning a shield. I’ve never tried the shield option myself, but I’ve seen photos of folks who have affixed a small disk on a semi-rigid wire to their tripod. The disk can be positioned so that it blocks only the sun. There are problems with this option: the sunshield will be counted as canopy in the light calculations, which will bias the estimates, and masking the blocked region out of the area of interest simply removes that portion of the sky from the analysis. Another option is to spot-correct flare in each photo. This is most effective with RAW photos and can be accomplished in photo-editing applications by using a weak adjustment brush to reduce highlights, boost shadows, and increase contrast directly over the flare. Again, I don’t recommend editing individual photos, but sometimes it is the only option.

Reflection

Finally, direct sun reflecting off solid surfaces can lead to mischaracterization of pixels and an overestimate of openness. In this case, objects opposite the sun can be lit so well that they are brighter than the background. This is very common in forests dominated by light-barked trees like aspen and birch. It also occurs just about any time there is snow in the frame, regardless of the lighting conditions. The only solution for this problem is darkening the solid objects in a photo-editing program. It is a painstaking task and increases the error in the final estimation, but it is sometimes necessary.

Reflection can make solid objects in the canopy appear lighter than the sky. Although this image is properly exposed, the light trunk of the tree reflects enough light to be characterized as sky by the binarization algorithm.

Where:

Next you must decide where you want to take photos. The answer to this will depend on your research question. We can break it down into spatial/temporal sampling scale and relative position.

Hemispherical photos can fit into any standard spatial sampling scheme (e.g. transects, grid, opportunistic, clustered, etc.) depending on your downstream statistical analysis. However, because hemispherical photos capture such a wide angle of view, you must be careful of any analysis that assumes independence of observations if the viewshed overlaps between photos.

Generally, when we think about spatial sampling, we think in two dimensions (like in the figure below). However, it is also important to consider that canopy closure estimates integrate sun angle, so it is critical that an even sampling scheme considers the third spatial dimension and includes a representative sample of topographic aspect.

Example spatial sampling schemes from Collender et al 2015.

There is no perfect sampling strategy for any given project. To illustrate some considerations, I will outline the sampling for my work. For my project, I need to characterize the canopy closure above forested ponds that range in size from a few meters to a hundred meters across. The most obvious strategy was to sample the intersections of a tightly spaced Cartesian grid over the entire surface of the pond. A previous research student tried this method on a handful of ponds. Using that information, we were then able to subsample those data to determine the grid spacing that would yield the most accurate estimates with the fewest photos. In this case, it turned out that every pond, regardless of size, could be characterized with 5 photos: 4 photos along the most distal shoreline in each cardinal direction and one photo in the center of the pond where the east-west and north-south lines intersect. An ancillary benefit of taking the same number of photos at each pond is that I can also calculate the variance within each pond, which gives me a sense of the homogeneity of the habitat. Keep in mind that higher moments of the distribution of light values across habitats (like variance or kurtosis) may be extremely ecologically relevant and can be incorporated for more meaningful statistical analysis.

One final consideration in spatial sampling is the height at which photos are captured. By virtue of practicality, the most common capture height is about 1m from the ground since this is the height of most tripods. However, the study question might dictate taking photos above understory plants or at ground level. Regardless, the height of photos should be consistent, or recorded for each photo, and explicitly stated in published methods.

You will also need to consider the frequency of your sampling to ensure that you capture any relevant variation in the study system over time. In temperate forests, this usually means, at the very least, taking the same photos with deciduous leaves on and again after the leaves have fallen. On the other hand, phenological studies might need photos from many timepoints over shorter durations.

Example of a sunpath projected onto a hemispherical photo from Gap Light Analyzer.

It is important to remember that canopy closure estimation integrates the sun’s path over a specific yearly window. We will define that window explicitly in the model, so it is important to ensure that the canopy structure in the photos accurately represents the sunpath window you define.

How:

In this section I will get into the nuts and bolts of taking photos in the field. I’ll cover camera settings and then camera operation.

Camera settings.

Most modern cameras are designed for ease of use and offer a variant of “automatic” settings. Automatic settings are great for snapping selfies and family photos, but awful for data collection. Manually adjusting the camera increases replicability and increases the accuracy of light estimates. Fortunately, there are only 4 parameters that we need to adjust for hemispherical photography: ISO, aperture, shutter speed, and focal distance.

ISO measures the sensitivity of a camera’s sensor to light. Higher ISO settings ( = greater sensitivity) allow for faster capture in lower light. However, high ISO leads to lots of noise and mischaracterization of pixels. In general, you should aim for the lowest ISO setting possible to produce better quality photos. More expensive cameras have better sensors and interpolation algorithms, so you can get away with higher ISO settings.

The aperture is the iris of a lens and controls the amount of light entering the camera. Aperture settings are given in f-numbers (the ratio of the lens focal length to the physical aperture diameter). Counterintuitively, a larger f-number (e.g. f/22) means a smaller physical pupil, and therefore less light, than a smaller f-number (e.g. f/2.8). Your aperture settings will be a balance between letting in enough light and getting crisp focus across the focal range (see focal distance below).
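The inverse relationship between f-number and pupil size follows directly from the ratio (a toy calculation; the lens and pupil dimensions below are hypothetical examples):

```python
# f-number = focal length / physical aperture (pupil) diameter, so for a
# fixed focal length, a smaller pupil gives a LARGER f-number and less
# light. Lens dimensions below are hypothetical examples.
def f_number(focal_length_mm, pupil_diameter_mm):
    return focal_length_mm / pupil_diameter_mm

wide_open    = f_number(8, 2.0)  # 8 mm fisheye with a 2 mm pupil -> f/4
stopped_down = f_number(8, 0.5)  # same lens, 0.5 mm pupil -> f/16
```

Stopping down from f/4 to f/16 shrinks the pupil, which cuts the light but increases depth of field, which is exactly the trade-off described above.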

The shutter speed determines the duration of time that the sensor is exposed to light. Longer shutter speeds mean more light and brighter photos. However, the longer the shutter speed, the more any movement of the camera or the canopy will blur the image. If you are using a handheld system, I suggest a shutter of at least 1/60 sec. With a tripod, shutter speeds can be longer, but only if the canopy is completely still. If there is ANY wind, I suggest at least 1/500 sec.

Focal distance is the simplest—just adjust the lens focus so that canopy edges are in sharp focus. This is easy when the canopy is a consistent distance from the lens, but can be difficult when capturing multi-layer structure. Lenses resolve greater depth of field (range of focal distances simultaneously in focus) when the aperture is smallest.

Since all four settings interact, dialing in the camera will be a balancing act. The end goal is the best balance between overly white and overly black pixels, which you can check with your camera’s internal light meter. The big catch here is that we are actually not that interested in the exposure of the sky; in fact, we would like the sky to be entirely white.

The most common exposure standardization technique is to first determine the exposure settings for open sky, then overexpose the image by 2-3 exposure values (EV) (Beckschäfer et al. 2013, Brown et al. 2000, Chianucci and Cutini 2012). In theory, this will ensure that a uniformly overcast sky is entirely white without blowing out the canopy. The primary benefit is that this method uses the sky as a relative reference standard, which is replicable.

It is easy to employ this method using your camera’s internal light meter. First, set your camera to meter from a single central point (you will probably need to look in your manual to figure out how to do this). If there is a large enough gap in the canopy overhead, you can point the meter spot there, take a reading, and adjust your settings to get a 0 EV. (If the canopy has no gaps, you can set this in the open before going into the forest; just remember to take another reading if the conditions change.) Now, slow the shutter speed by 2 full stops (i.e. if 1/500 is 0 EV for the sky, set your shutter speed to 1/125; if 0 EV is 1/1600, set your speed to 1/400).
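The stop arithmetic is simple to sketch: each full stop doubles the exposure time, so 2 stops slower means a quarter of the metered shutter-speed denominator:

```python
# Sketch of full-stop shutter arithmetic. Shutter speeds are written as
# 1/denominator seconds; each full stop slower doubles the exposure time,
# so the denominator halves per stop.
def slow_by_stops(shutter_denominator, stops):
    return shutter_denominator / (2 ** stops)

metered = 500                 # 0 EV for open sky metered at 1/500 s
shoot = slow_by_stops(metered, 2)   # 2 stops slower -> shoot at 1/125 s
shoot2 = slow_by_stops(1600, 2)     # 1/1600 s metered -> shoot at 1/400 s
```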

[Note: other authors (e.g. Beckschäfer et al. 2013) suggest adjusting for 2 EV of overexposure. I don’t like using EV for anything other than the spot reference because cameras use different methods of evaluating exposure. In contrast, shutter speed is invariant across all camera platforms.]

This may all seem like a confusing juggling act, but it is not that difficult in practice. Here is my general strategy:

  1. I set my ISO to 200 and my aperture to around f/11.
  2. With the camera set to meter from the central point of the image, I take a light meter reading of open sky.
  3. I adjust my shutter speed to an exposure value of 0 for open sky.
  4. Now, I re-adjust my shutter speed slower by 2 full stops.
  5. If my shutter speed is now too slow, I increase my ISO one stop or open my aperture (lower f-number) by one stop and go back to step #2. I repeat until I find the balance.
  6. Once calibrated, I take all of my images using these same settings; however, I must re-calibrate if the sky conditions change.
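For what it’s worth, the juggling act above can be sketched in a few lines of R. This is a simplification (in the field you re-meter after changing ISO or aperture rather than assuming a clean one-stop trade), and the function name and threshold are my own inventions for illustration:

```r
# Illustrative sketch of the calibration loop above.
# metered_denom: shutter denominator that meters 0 EV on open sky
#                at the starting ISO.
# min_denom:     slowest acceptable shutter denominator
#                (e.g. 60 means no slower than 1/60 s).
calibrate_exposure <- function(metered_denom, iso = 200, min_denom = 60) {
  target <- metered_denom / 4       # slow by 2 full stops
  while (target < min_denom) {
    iso <- iso * 2                  # raise ISO one stop...
    target <- target * 2            # ...buying back one stop of shutter speed
  }
  list(iso = iso, shutter = paste0("1/", target))
}

calibrate_exposure(500)   # ISO 200, shutter 1/125
calibrate_exposure(100)   # ISO raised to 800, shutter 1/100
```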

Taking photos.

At this point, shooting the photos is the easy part! A couple of helpful tips will make life easier.

Before shooting, you will need to orient the camera so that the lens points directly at the zenith (i.e. straight up, opposite gravity, with the sensor level). In my previous post covering hardware I mentioned that there are pre-fabricated leveling systems available or you can make a DIY version. With a tripod, you can manually level the camera.

For later analysis, you will need to know the compass orientation of the photos. Some pre-fab systems have LEDs around the perimeter, driven by an internal compass, so that the north-facing LED always lights up. Otherwise, you can place a compass on your stabilizer or tripod and point the top of the image frame in a consistent direction (magnetic or true north is fine, just make sure you are consistent and write down which one you use).

It can be hard to take a hemispherical photo without unintentionally making self-portraits. With a tripod, you can use a remote to release the shutter from a distance or from behind a tree. Camera manufacturers make dedicated remotes, or, if your camera has Wi-Fi capabilities, you can use an app on your phone. Most cameras also have a timer setting which can give you enough time to duck for cover.


Be sure to check out my other posts about canopy research that cover the theory, hardware, field sampling, and analysis pipelines for hemispherical photos.

Also, check out my new method of canopy photography with smartphones and my tips for taking spherical panoramas.


References

Beckschäfer, P., Seidel, D., Kleinn, C., and Xu, J. (2013). On the exposure of hemispherical photographs in forests. iForest 6, 228–237.

Brown, P. L., Doley, D., and Keenan, R. J. (2000). Estimating tree crown dimensions using digital analysis of vertical photographs. Agric. For. Meteorol. 100, 199–212.

Chianucci, F., and Cutini, A. (2012). Digital hemispherical photography for estimating forest canopy properties: Current controversies and opportunities. iForest – Biogeosciences and Forestry 5.

Collender, P. A., Kirby, A. E., Addiss, D. G., Freeman, M. C., and Remais, J. V. (2015). Methods for Quantification of Soil-Transmitted Helminths in Environmental Media: Current Techniques and Recent Advances. Trends Parasitol. 31, 625–639.

Poster: Frogz that Glowz https://www.azandisresearch.com/2018/07/14/poster-frogz-that-glowz/ Sat, 14 Jul 2018 13:41:11 +0000 https://www.azandisresearch.com/?p=548
Poster presented at JMIH on Friday, July 13th, 2018

I had a ton of fun presenting a poster detailing my soon-to-be published study exploring the use of calcein fluorescence marking in amphibians at this year’s Joint Meeting of Ichthyologists and Herpetologists in Rochester, New York. This method solves many of the problems we run into when trying to mark and re-identify amphibians across time, and it is one of the only methods for reliably marking larval anurans across developmental stages for identification as adults.

Keep an eye out for the paper in the next couple of weeks in Herpetological Conservation and Biology. Check out my blog post and download the paper!

Here is a link to a PDF version of the poster if you’d like a closer look.

If you are interested in this method, please feel free to contact me: a.andis@yale.edu.

Here is the product page for SE-MARK brand calcein.

Much thanks to everyone who came to talk with me about the poster!

UPDATE: I was thrilled to win the Victor Hutchinson Poster Award from SSAR! This is quite an honor for my first real conference presentation and goes to show that there are at least a few benefits to taking a long time off and gaining some ancillary skills in graphic design before heading back to graduate school. I also have to make a huge shout-out to the Better Posters blog, where I found tons of tips, tricks, and good advice on turning academic content into a visual story.

The key to any great presentation is pure professionalism at all times… unless you’re at a herp meeting.