Calculate the Mahalanobis distance for a multivariate time series.

Usage

m_dist(
  data,
  sampling_rate,
  smoothDur,
  overlap,
  consec,
  cumSum,
  expStart,
  expEnd,
  baselineStart,
  baselineEnd,
  BL_COV
)

Arguments

data

A data frame or matrix with one row for each time point. The Mahalanobis distance calculation should be carried out on continuous data only, so if your data contain logical, factor, or character columns, expect errors or, at best, meaningless results.
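One way to guard against non-continuous columns before calling m_dist is to keep only the numeric ones. The following is a minimal base R sketch; the data frame df and its column names are hypothetical and not part of the package.

```r
# Hypothetical input with one non-numeric column
df <- data.frame(
  depth = c(1.2, 3.4, 5.6),
  pitch = c(0.1, 0.2, 0.3),
  phase = c("a", "b", "c")
)

# Flag the numeric columns and keep only those
is_num <- vapply(df, is.numeric, logical(1))
x <- as.matrix(df[, is_num, drop = FALSE])

colnames(x)  # the character column "phase" has been dropped
```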

sampling_rate

The sampling rate in Hz (data should be regularly sampled). If not specified it will be assumed to be 1 Hz.

smoothDur

The length, in minutes, of the window to use for calculation of "comparison" values. If not specified or zero, there will be no smoothing (a distance will be calculated for each data observation).

overlap

The amount of overlap, in minutes, between consecutive "comparison" windows. smoothDur - overlap gives the time resolution of the resulting distance time series. If not specified or zero, there will be no overlap. overlap is also set to zero if smoothDur is unspecified or zero.
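To illustrate how smoothDur and overlap combine into a time step, the sketch below computes window start times for a made-up record. All values are hypothetical, and the handling of a partial window at the end of the record is an assumption here, not a statement about what m_dist does internally.

```r
# Hypothetical values chosen for illustration only
sampling_rate <- 1   # Hz
smoothDur <- 2       # window length, minutes
overlap <- 0.5       # overlap between consecutive windows, minutes
n_samples <- 600     # 10 minutes of data at 1 Hz

win_len <- smoothDur * 60 * sampling_rate           # samples per window
step <- (smoothDur - overlap) * 60 * sampling_rate  # samples between window starts

# Start index (1-based) of every window that fits entirely inside the record
# (assumption: incomplete trailing windows are dropped)
win_starts <- seq(from = 1, to = n_samples - win_len + 1, by = step)

# Window start times in seconds since the start of the data set
start_seconds <- (win_starts - 1) / sampling_rate
```

With these numbers the windows are 120 s long and start every 90 s, which is the (smoothDur - overlap) resolution described above.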

consec

Logical. If consec = TRUE, then the calculated distances are between consecutive windows of duration smoothDur, sliding forward over the data set by a time step of (smoothDur - overlap) minutes. If consec = TRUE, the baselineStart and baselineEnd inputs will be used to define the period used to calculate the data covariance matrix. Default is consec = FALSE.

cumSum

Logical. If cumSum = TRUE, then the output will be the cumulative sum of the calculated distances, rather than the distances themselves. Default is cumSum = FALSE.
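As a small illustration of the difference between distances and their cumulative sum, here is a base R sketch on made-up values (the vector d is hypothetical and not output from m_dist):

```r
# Hypothetical distance values, one per time step
d <- c(0.5, 1.2, 0.3, 2.0)

# Running total of the distances
cum_d <- cumsum(d)
```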

expStart

Start times (in seconds since start of the data set) of the experimental exposure period(s).

expEnd

End times (in seconds since start of the data set) of the experimental exposure period(s). If either or both of expStart and expEnd are missing, the distance will be calculated over the whole data set, and the full data set will be assumed to be baseline.

baselineStart

Start time (in seconds since start of the data set) of the baseline period. The mean data values for this period will be used as the 'control' to which all "comparison" data points (or windows) will be compared. If not specified, it will be assumed to be 0 (start of record).

baselineEnd

End time (in seconds since start of the data set) of the baseline period. If not specified, the entire data set will be used (baselineEnd will be the last sampled time point in the data set).
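Because the baseline limits are given in seconds while the data are indexed by row, it can help to see how the two relate for a regularly sampled record. The sketch below uses hypothetical values and assumes both endpoints are included; it is not taken from the m_dist source.

```r
# Hypothetical values chosen for illustration only
sampling_rate <- 5     # Hz
n_samples <- 1000      # rows in the data
baselineStart <- 0     # seconds
baselineEnd <- 60      # seconds

# Time of each row in seconds since the start of the data set
t_sec <- (seq_len(n_samples) - 1) / sampling_rate

# Rows that fall inside the baseline period (assumption: endpoints inclusive)
bl_rows <- which(t_sec >= baselineStart & t_sec <= baselineEnd)
```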

BL_COV

Logical. If BL_COV = TRUE, then a covariance matrix computed from all data in the baseline period will be used for calculating the Mahalanobis distance. Default is BL_COV = FALSE.
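The idea of a distance from a baseline mean, scaled by a baseline covariance matrix, can be sketched with the mahalanobis() function from base R's stats package. This is a minimal illustration on synthetic data, not the m_dist implementation; the matrix x, the choice of baseline rows, and the size of the shift are all hypothetical.

```r
set.seed(1)

# Synthetic three-channel series with 200 samples (hypothetical data)
x <- matrix(rnorm(600), ncol = 3)
x[151:200, ] <- x[151:200, ] + 3  # shift the last 50 samples away from baseline

# Treat the first 100 rows as the baseline period
baseline <- x[1:100, ]
bl_mean <- colMeans(baseline)  # 'control' values
bl_cov <- cov(baseline)        # covariance from baseline data only

# stats::mahalanobis() returns squared distances, so take the square root
d <- sqrt(mahalanobis(x, center = bl_mean, cov = bl_cov))

# Distances should be larger on average in the shifted segment
mean(d[1:100])
mean(d[151:200])
```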

Value

A data frame containing the results. The variable seconds gives the times, in seconds since the start of the data set, at which Mahalanobis distances are reported. If smoothDur was applied, the reported times are the start times of each "comparison" window. The variable dist gives the Mahalanobis distances between the specified baseline period and the specified "comparison" periods.

Examples

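# Use the beaked_whale example data set
BW <- beaked_whale
# Default settings: no smoothing, whole record treated as baseline
m_dist_result <- m_dist(BW$A$data, BW$A$sampling_rate)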