Using find_dives

Welcome to this vignette! On behalf of the team behind tagtools, thanks for taking some time to get to know this package. We hope it is useful to you.

In this vignette you will learn to use find_dives to gain insight about a depth profile, calculate the mean dive duration and mean dive depth, and mark the beginnings and endings of dives.

Estimated time for this vignette: 20 minutes.

These practicals all assume that you have R/RStudio, some basic experience working with them, and can execute provided code, making some user-specific changes along the way (e.g. to help R find a file you downloaded). We will provide you with quite a few lines of code. To boost your own learning, try writing each step yourself before looking at what we give, and use our code just to check your work.

Additionally, be careful when copy-pasting special characters such as _underscores_ and ‘quotes’. If you get an error, one thing to check is that you have just a single, simple underscore, and 'straight quotes', whether 'single' or "double" (rather than “smart quotes”).

Load the test data set

Load the test dataset mn12_186a_raw.nc. This dataset has already been converted from the source file that was offloaded from the tag into a NetCDF file. In doing so, some metadata was gleaned from the file and added to the data. Other metadata was added by hand. If you want to run this example, download the “mn12_186a_raw.nc” file from https://github.com/animaltags/tagtools_data and change the file path to match where you’ve saved the file.
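Before loading, it can help to confirm that R can actually see the file at the path you intend to use. The following is a minimal base-R sketch; the variable names nc_path and found are just illustrative, and the folder name nc_files is the one assumed in the loading code of this vignette.

```r
# Path to the downloaded file; adjust to wherever you saved it
nc_path <- file.path("nc_files", "mn12_186a_raw.nc")

# TRUE if the file is visible from the current working directory
found <- file.exists(nc_path)

if (found) {
  message("Found: ", normalizePath(nc_path))
} else {
  # Relative paths are resolved against the working directory
  message("Not found. R is currently looking in: ", getwd())
}
```

If the file is not found, either move it, change the path, or change the working directory with setwd().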

Use load_nc to load a NetCDF file:

library(tagtools)
MN_file_path <- "nc_files/mn12_186a_raw.nc"
MN <- load_nc(MN_file_path)

This creates an animaltag list object MN in your workspace. You can view it in the Environment tab if working in RStudio, or in the command line type:

names(MN)
str(MN$A)
# not run because output is very long! see the whole STRucture of MN:
# str(MN)
# shorter outline version:
str(MN, max.level = 1)

#> [1] "Results for `names(MN)`:"
#> [1] "------------------------"
#> [1] "P"    "A"    "M"    "T"    "S"    "info"
#> [1] "Results for `str(MN$A)`:"
#> [1] "------------------------"
#> List of 19
#>  $ data              : num [1:1172115, 1:3] 2355 2340 2324 2315 2296 ...
#>  $ sampling          : chr "regular"
#>  $ sampling_rate     : num 32
#>  $ sampling_rate_unit: chr "Hz"
#>  $ depid             : chr "mn12_186a"
#>  $ creation_date     : chr "07-Aug-2017 14:51:46"
#>  $ history           : chr "read_ll3m"
#>  $ type              : chr "acc"
#>  $ full_name         : chr "Acceleration"
#>  $ description       : chr "triaxial acceleration"
#>  $ unit              : chr "1"
#>  $ unit_name         : chr "counts"
#>  $ unit_label        : chr "counts"
#>  $ start_offset      : num 0
#>  $ start_offset_units: chr "second"
#>  $ column_name       : chr "x,y,z"
#>  $ frame             : chr "tag"
#>  $ axes              : chr "FRU"
#>  $ files             : chr "20120704-57537-Mn-XXXX-U-NoName-Acceleration-X.TXT20120704-57537-Mn-XXXX-U-NoName-Acceleration-Y.TXT20120704-57"| __truncated__
#> [1] "Results for `str(MN, max.level = 1)`:"
#> [1] "------------------------"
#> List of 6
#>  $ P   :List of 17
#>  $ A   :List of 19
#>  $ M   :List of 19
#>  $ T   :List of 16
#>  $ S   :List of 17
#>  $ info:List of 25
#>  - attr(*, "class")= chr [1:2] "animaltag" "list"

You should see that lists A, M, P, S, T and info are contained within the list MN.
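The metadata stored with each sensor can be combined with the data dimensions to answer simple questions. As a small worked example, the sketch below estimates how long the accelerometer recorded, using the number of rows and the sampling rate copied by hand from the str(MN$A) output above (the variable names are illustrative).

```r
# Values copied from the str(MN$A) output above
n_samples <- 1172115   # rows in MN$A$data
fs_hz     <- 32        # MN$A$sampling_rate, in Hz

# Duration = number of samples / samples per second
duration_s <- n_samples / fs_hz
duration_h <- duration_s / 3600

duration_s
duration_h
```

This comes to roughly 10.2 hours. With the data loaded, the same calculation would be nrow(MN$A$data) / MN$A$sampling_rate.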

Exercise: Calculate the mean duration of dives deeper than 5m

Our goal with these data is to calculate the mean duration of dives deeper than 5 m. If you can think of a way to do this already, please go ahead and try! You can then compare your answer to the step-by-step procedure below.

As with all raw depth data, there are some problems with this dive profile. Try and plot the depth profile and the temperature. Then, see if you can find evidence for each of these in the plot:

Incorrect calibration of the sensor
Occasional outliers
Coarse depth resolution
Temperature sensitivity

1. Pay attention to the depths reported. While it mostly looks like a dive profile, what looks like the “surface” is actually at around -10 m (that is, ten meters high in the air)! We know that humpback whales do not, in fact, routinely go ten meters into the sky. This suggests the sensor has not been well calibrated.
2. Occasionally, there are peaks that momentarily go even higher than these ten meters in the sky (more like 15 meters). These are outliers.
3. Zoom in to the shallow part of a dive to notice this. The individual measurements are only precise to the meter; thus the depth will appear to jump back and forth more suddenly than we would expect for a relaxed whale at the surface.
4. The temperature gradually rises as the recording nears the end, then spikes down and up. This is accompanied by a gradually decreasing depth, from ten meters in the sky to the surface. Together, the two trends strongly suggest that the depth data has been affected by the changing temperature as the tag sits near the surface.

Hints & Tips

Look in info to find what species the data come from. Are the depth values reasonable for this species?
Zoom in and see what size depth steps there are in the data.
Use plott to plot both the depth and temperature:

plott(X=list(Depth=MN$P, Temperature=MN$T), r=c(TRUE,FALSE))
# r=c(TRUE,FALSE) tells plott to reverse the y-axis for the Depth data
# (so that it looks like a dive profile), but not for the Temperature data.
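Zooming in is one way to judge the depth resolution; another is to look at the distinct values the sensor reports. The sketch below does this in base R on a short, made-up depth vector (the numbers are not from the tag): the smallest gap between distinct recorded values gives an estimate of the step size.

```r
# Synthetic depth samples in meters (made-up values for illustration only)
depth <- c(-10.0, -9.5, -9.5, -10.0, -8.5, -7.0, -7.5, -9.0, -9.5)

# Distinct values the "sensor" reported, in increasing order
levels_seen <- sort(unique(depth))

# Gaps between neighbouring distinct values; the smallest one
# is an estimate of the sensor's step size
step_sizes <- diff(levels_seen)
resolution <- min(step_sizes)
resolution
```

To try this on the real record, you could replace the synthetic vector with MN$P$data. Outliers and any earlier filtering can affect the estimate, so treat the result as a rough guide.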

What to do about periods of data when the tag is not on the animal

Not all tags have a way to start logging as soon as the tag has been deployed on the animal. Often data logging is started by a time trigger or alarm, and the researcher has to make a guess as to when the tag will be deployed to set its start time appropriately. Often this means that a tag is logging data before it is put on an animal.

Equally, tags have no means of detecting when they release from the tagged animal. As a consequence, they may continue to log data after they release. In most cases, the logged data from before and after deployment has no use. To reduce the data to just the periods when the tag is on the animal, use the tool crop:

Pc = crop(MN$P)

#> [1] "Position your cursor and then click once followed by clicking FINISH to change the start, or click twice in the same spot followed by clicking FINISH to change the end. If you wish to change both the start and end click once at the start time desired and twice at the end time desired."

This displays an interactive depth plot. Follow the instructions to select the obvious diving section of the data and then click finish. The function returns a new data structure which contains just the selected part of the dive profile. The resulting sensor data list also contains fields that document what you just did. To see them:

Pc$history
str(Pc, max.level = 1)

#> [1] "Results for `Pc$history`:"
#> [1] "-------------------------------------"
#> [1] "read_ll3m,crop_to"
#> [1] "Results for `str(Pc, max.level = 1)`:"
#> [1] "-------------------------------------"
#> List of 21
#>  $ data              : num [1:25306] -6.5 -6.5 -6.5 -6.5 -6.5 -6.5 -6.5 -6.5 -6.5 -7 ...
#>  $ sampling          : chr "regular"
#>  $ sampling_rate     : num 1
#>  $ sampling_rate_unit: chr "Hz"
#>  $ depid             : chr "mn12_186a"
#>  $ creation_date     : chr "07-Aug-2017 14:51:49"
#>  $ history           : chr "read_ll3m,crop_to"
#>  $ type              : chr "press"
#>  $ full_name         : chr "Pressure"
#>  $ description       : chr "dive depth"
#>  $ unit              : chr "m H2O"
#>  $ unit_name         : chr "meters H2O (salt)"
#>  $ unit_label        : chr "meters"
#>  $ start_offset      : num 0
#>  $ start_offset_units: chr "second"
#>  $ axes              : chr "D"
#>  $ files             : chr "20120704-57537-Mn-XXXX-U-NoName-Depth.TXT"
#>  $ crop              : num [1:2] 0 25306
#>  $ crop_units        : chr "seconds"
#>  $ start_time        : num 0
#>  $ start_time_units  : chr "seconds"

The $history field keeps track of the operations that you perform on a data structure. This helps with traceability if you make the processed data available in an archive. The $crop and $start_time fields show how the original data was changed: the start_time is given relative to the field dephist_device_datetime_start in the $info structure, which says when the tag recording started. Use plott to plot Pc to make sure you cropped it correctly.

plott(X = list(DepthCropped=Pc), r = TRUE)

Note that the results from here on out will look a little different depending on where exactly you’ve cropped your data, but it should come to essentially the same thing.
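To make the $crop field more concrete, here is a minimal base-R sketch of cropping a regularly sampled vector by time cues given in seconds. The helper crop_by_time is hypothetical and written only for this illustration; it is not the implementation of crop or crop_to, whose handling of the interval boundaries may differ.

```r
# Hypothetical helper: keep samples whose time (in seconds since the
# start of the record) falls in the half-open interval [tcues[1], tcues[2])
crop_by_time <- function(x, fs, tcues) {
  t <- (seq_along(x) - 1) / fs          # time of each sample in seconds
  keep <- t >= tcues[1] & t < tcues[2]
  x[keep]
}

# Ten samples at 2 Hz cover 0 to 4.5 s; keep the samples from 1 s up to 3 s
x <- 1:10
cropped <- crop_by_time(x, fs = 2, tcues = c(1, 3))
cropped
```

The point is that, for regularly sampled data, two time cues plus the sampling rate are enough to recover which samples were kept, which is why the tools record them in the cropped structure.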

Removing outliers

Outliers or spikes in the data may result from errors in the tag or poor sensor performance under rapidly changing environmental conditions. For example in this data set, rapid changes in temperature and pressure as the animal surfaces cause small glitches in the data. These are not representative of the animal’s behaviour so we need to remove them. A good way to do this is with a median filter. Type:

?median_filter

to find out how this function works. You call it using:

Pcm = median_filter(Pc,n=3)

Your variable Pcm now contains the median-filtered, cropped depth data. Check its history to verify that the median filtering has been added. Compare it against the unfiltered data using:

plott(X=list(Pc=Pc, Pcm=Pcm), r=c(TRUE,TRUE))

This plots Pc in the upper panel and Pcm in the lower one, and reverses the y-axis labeling on both plots (by setting r to TRUE both times).
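To see why a median filter removes isolated spikes while leaving the rest of the profile nearly untouched, here is a small sketch using base R's stats::runmed on made-up depth values. This illustrates the general technique only; it is not the code inside median_filter, which may treat the ends of the record differently.

```r
# Synthetic depth samples in meters with one obvious spike (30.0)
depth <- c(5.0, 5.5, 6.0, 30.0, 6.5, 7.0, 7.5)

# Running median over windows of 3 samples; endrule = "keep" leaves
# the first and last samples unchanged
filtered <- stats::runmed(depth, k = 3, endrule = "keep")

# Side-by-side comparison of raw and filtered values
data.frame(raw = depth, filtered = as.numeric(filtered))
```

Each output sample is the middle value of its three-sample window, so a single-sample glitch can never be selected. A spike that lasts two or more samples would survive a window of 3, which is why some outliers can remain after filtering.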

Correcting pressure offsets & temperature effects

The next step is to correct the pressure offset of the depth sensor (so that the animal is not 10 m out of the water when it is really at the surface). We can also compensate for temperature at the same time. To do this we have to first crop the temperature data to match the pressure data. You can do this using:

Tc <- crop_to(MN$T,tcues=Pc$crop)

This uses the crop information stored in Pc to do the same operation on MN$T. Plot them together to confirm that the time intervals (horizontal ticks) are the same:

plott(X=list(Pcm=Pcm, Tc=Tc), r=c(TRUE,FALSE))

The tag toolbox has a function to correct pressure data called fix_pressure(). Type

?fix_pressure

to find out what it does and what assumptions it makes about the data. Use this function by calling:

Pcmf <- fix_pressure(Pcm, Tc)

Compare the compensated dive profile to the uncompensated cropped ones using plott. (The code below will produce three plots, which might make them a bit small. Hit “Zoom” to view them in a larger window.)

plott(X=list(Pc=Pc, Pcm=Pcm, Pcmf=Pcmf$p), r=c(TRUE,TRUE,TRUE))
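For intuition about what a temperature-dependent offset correction involves, the sketch below fits a straight line to made-up "apparent surface depth" readings as a function of temperature and subtracts the fitted value. This is a generic approach chosen for illustration and an assumption on our part; it is not a description of the algorithm inside fix_pressure, for which ?fix_pressure is the authority. All variable names and numbers are invented.

```r
# Synthetic readings taken while the animal is (assumed to be) at the surface
temp_c <- seq(10, 20, by = 1)                               # temperature, deg C
wiggle <- rep(c(0.1, -0.1), length.out = length(temp_c))    # small measurement noise
surface_depth <- -10 + 0.3 * (temp_c - 10) + wiggle         # apparent depth, m

# Model the apparent surface depth as offset + slope * temperature
fit <- lm(surface_depth ~ temp_c)

# Subtract the fitted offset so that surface readings sit near 0 m
corrected <- surface_depth - predict(fit)

round(range(surface_depth), 2)   # before: several meters "in the air"
round(range(corrected), 2)       # after: close to zero
```

In a real record the same fitted relationship would then be applied to every sample, not just those at the surface.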

Which of the problems that we listed above have been taken care of? Any ideas what you could do about the remaining one(s)?

Incorrect calibration of the sensor and temperature sensitivity have both been addressed: the whale now rests and breathes at the surface, rather than ten meters in the air. Additionally, some of the occasional outliers have been removed by the median filter.

However, none of this has increased the resolution of the depth data. In order to resolve this last issue, ultimately, new data would have to be taken at higher resolution.

Finding dives & the mean dive duration

To find the mean dive duration for dives over 5 m in depth, you could measure each dive by hand on the depth plot (ginput is a useful function in Matlab and Octave for measuring data on a plot – there isn’t a great equivalent in R, where interactive plots are not really commonly used). But there is a toolbox function for this called find_dives. See the help on this function to find out what it does and what options it has.

?find_dives
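As a rough picture of what threshold-based dive detection means, the base-R sketch below marks every run of consecutive samples deeper than a threshold as one "dive" on a short, made-up depth record. It is a deliberately simplified illustration and not the algorithm used by find_dives, which has additional options described on its help page.

```r
# Synthetic 1 Hz depth record in meters (made-up values)
depth <- c(0, 1, 3, 6, 9, 7, 4, 1, 0, 0, 2, 3, 2, 0, 1, 6, 12, 8, 2, 0)
fs <- 1          # sampling rate in Hz
mindepth <- 5    # depth threshold in meters

# Run-length encode the "deeper than threshold" indicator
runs <- rle(depth > mindepth)
run_end   <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1

# Keep only the runs where the animal is below the threshold
first <- run_start[runs$values]
last  <- run_end[runs$values]

dives <- data.frame(
  start = (first - 1) / fs,   # seconds since start of record
  end   = (last - 1) / fs,
  max   = mapply(function(s, e) max(depth[s:e]), first, last)
)
dives
```

Here the two excursions past 5 m are detected, while the shallow excursion to 3 m is ignored.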

To find dives deeper than 5 m in your compensated dive data, type:

d <- find_dives(Pcmf$p,mindepth=5)
str(d, max.level = 1)

#> 'data.frame':    50 obs. of  4 variables:
#>  $ start: num  2392 2604 3165 3626 4023 ...
#>  $ end  : num  2481 2778 3361 3860 4213 ...
#>  $ max  : num  5.75 10.74 10.24 48.7 22.23 ...
#>  $ tmax : num  2432 2640 3196 3739 4073 ...

d should be a data frame with the start time, end time, maximum depth, and time of maximum depth of about 50 dives (depending on where you cropped the data). How can you get the mean dive duration & mean (maximum) dive depth from this structure? Code below provides one possible answer.

# duration of each dive; find_dives reports start and end times in seconds,
# so the durations (and their mean) are in seconds too
dive_durations <- d$end - d$start
mean_dive_duration <- mean(dive_durations)
mean_dive_duration
# mean of the maximum depths reached in each dive
mean_dive_depth <- mean(d$max)
mean_dive_depth

When you have got the mean dive depth, try plotting the start and end of the dives on the depth plot:

plot_panels <- plott(X = list(Pcmf = Pcmf$p),
                   r = TRUE,
                   draw = FALSE) 
# our "list" of plot panels from plott has only one element, the dive profile
plot_panels[[1]] + 
  ggplot2::geom_point(inherit.aes = FALSE,
                      data = d,
                      x = d$start/3600 / 24,
                      y = 0 * d$start,
                      color = 'darkgreen') +
  ggplot2::geom_point(inherit.aes = FALSE,
                      data = d,
                      x = d$end / 3600 / 24,
                      y = 0 * d$end,
                      colour = "darkred")

Note: Depending on the duration of data you’re plotting, plott() may choose days or hours as the time units, and you will have to adjust the multipliers in the geom_point code accordingly. The start and end times returned by find_dives() are in seconds.
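One way to keep the multipliers straight is to wrap the conversion in a small helper. The function seconds_to below is hypothetical (it is not part of tagtools) and is only a sketch of the arithmetic; the example start times are the first three shown in the find_dives output above.

```r
# Hypothetical helper: convert times in seconds to another time unit
seconds_to <- function(sec, units = c("seconds", "minutes", "hours", "days")) {
  units <- match.arg(units)
  divisor <- c(seconds = 1, minutes = 60, hours = 3600, days = 86400)
  sec / divisor[[units]]
}

# First three dive start times from the output above, in seconds
dive_starts <- c(2392, 2604, 3165)

seconds_to(dive_starts, "hours")   # use if the plot's time axis is in hours
seconds_to(dive_starts, "days")    # use if the plot's time axis is in days
```

You would then pass, for example, seconds_to(d$start, "hours") as the x values of the markers when the plot's time axis is labelled in hours.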

Again, this plot might be rather small. As a result it might be tricky to make sense of the markers that designate the starts and ends of dives. An interactive version might help, so you can zoom in and out…

plot_panels <- plott(X = list(Pcmf = Pcmf$p),
                   r = TRUE,
                   interactive = TRUE,
                   draw = FALSE) 
# our "list" of plot panels from plott has only one element, the dive profile
plot_panels[[1]] |>
  plotly::add_trace(x = d$start/3600 / 24,
                    y = 0 * d$start,
                    marker = list(color = 'darkgreen'), 
                    name = "Dive Starts",
                    mode = "markers") |>
  plotly::add_trace(x = d$end/3600 / 24,
                    y = 0 * d$end,
                    marker = list(color = 'darkred'),
                    name = "Dive Ends",
                    mode = "markers")

#> No trace type specified:
#>   Based on info supplied, a 'scatter' trace seems appropriate.
#>   Read more about this trace type -> https://plotly.com/r/reference/#scatter
#> No trace type specified:
#>   Based on info supplied, a 'scatter' trace seems appropriate.
#>   Read more about this trace type -> https://plotly.com/r/reference/#scatter

Review

Great work! You’ve learned how to use find_dives.

If you’d like to continue working through these practicals, dive-stats might be a good option if you’re especially interested in analyzing individual dives. Otherwise, consider fine-scale-tracking or mahalanobis-distance.

vignette('dive-stats')
vignette('fine-scale-tracking')
vignette('mahalanobis-distance')

Animaltags tag tools online: http://animaltags.org/, https://github.com/animaltags/tagtools_r (for latest beta source code), https://animaltags.github.io/tagtools_r/index.html (vignettes overview)

tagtools project team

2024-06-27