Difference between revisions of "StereoBatRecorder"

From RevSpace
Jump to navigation Jump to search
m (More updates for sample rate)
(Add octave stuff)
Line 69: Line 69:
 
* a common source of noise is the sound of car brakes, often between 10 kHz - 20 kHz, partly in the range of bat calls. Can recordings containing these be automatically discarded?
 
* a common source of noise is the sound of car brakes, often between 10 kHz - 20 kHz, partly in the range of bat calls. Can recordings containing these be automatically discarded?
 
* is it possible to detect that a bat is purely "passing by" (not feeding) by looking at the regularity (inter-pulse-interval) of the calls?
 
* is it possible to detect that a bat is purely "passing by" (not feeding) by looking at the regularity (inter-pulse-interval) of the calls?
 +
 +
== Example call ==
 +
Below is an example of a bat call from a pipistrellus bat.
 +
[plaatje]
 +
and its spectrogram
 +
[plaatje]
 +
It's fairly 'flat' call in the sense that most of the energy is between 35 and 45 kHz.
 +
You can see a bit of noise in the bottom of the spectrogram, we will filter that out in the program octave.
 +
 +
To load the wav-file into octave and remove the low-frequency noise
 +
<pre>
 +
# load wav and get sample rate, bits per sample
 +
[y,fs,bps] = wavread("call1.wav");
 +
# create and apply a butterworth filter (maximally flat in the pass-band)
 +
[b,a] = butter(4,0.2,"high");
 +
y2 = filter(b,a,y);
 +
</pre>
 +
 +
Next, we can use the 'xcov' (cross-covariance) to calculate a set of correlations vs time offset.
 +
The cross-covariance function is very similar to the cross-correlation function, with the added advantage that the mean is automatically subtracted.
 +
<pre>
 +
# calculate cross-covariance on y2(:,1) (left) and y2(:,2) (right) data, with a max delay of +/- 40
 +
[r,lag] = xcov(y2(:,1), y2(:,2), 40);
 +
plot(lag,r);
 +
</pre>
 +
 +
[plaatje]
 +
 +
From the correlation vs time plot we can see which delay has the highest correlation.
 +
However, the signal is oscillatory, with the peaks about 6 samples apart, corresponding to the mean frequency of the bat call (about 40 kHz).
 +
To smooth the peaks, we can apply the hilbert transform
 +
<pre>
 +
# take absolute value of the hilbert transform to get the envelope of the correlation
 +
r2 = abs(hilbert(r));
 +
plot(lag,r); hold on;
 +
plot(lag,r2);
 +
</pre>
 +
 +
[plaatje]
 +
 +
The correlation plot is now smooth and we simply take the maximum value now to find the index of maximum correlation
 +
<pre>
 +
[m,ind] = max(r2);
 +
lag(ind)
 +
</pre>

Revision as of 11:07, 15 March 2014

Project StereoBatRecorder
Status In progress
Contact bertrik
Last Update 2014-03-15

This project is about signal processing of bat echolocation audio recorded through a pair of ultrasonic microphones. The goal is to automatically determine the flight direction of the bat.

Knowing the flight direction gives a lot of additional information, like if the bat is just passing or whether it is feeding, getting a better sense where the bat colony is and where the feeding ground is, etc. Having this determined automatically saves a lot of time compared to physically observing the bats in the field at odd hours.

The main principle of determining the flight direction is to measure the time-of-arrival delay between the signals arriving at the two microphones. For example, when a bat is flying from left to right, the first recorded pulse will arrive slightly earlier at the left microphone, with the left-right delay decreasing to 0 as the bat flies directly in front of the two microphones, then the delay will reverse with signals arriving earlier at the right microphone as it flies further to the right.

TODO

Create a reference set of recordings, e.g.

  • unambiguous passes, left-to-right and right-to-left
  • unambiguous passes but with relative "flat" pulses
  • feeding bat, circling in front of the recorder
  • recording containing a feeding buzz
  • recording containing social calls

Hardware

The hardware used is a stereo bat detector that has already been developed, it has the following specifications:

  • sampling rate is 240 kS/s for each channel (left and right);
  • sampling resolution is 8 bits;
  • microphone used is the Knowles FG-3329, with useful response over 100 kHz;
  • microphone separation is about 50 mm;
  • audio is recorded on an sd-card as a stereo "wav" file, so it can be directly opened in audio editing/viewing software (e.g. cooledit, audacity, raven, etc.)


To get some feeling for the parameters involved (at 240 kS/s):

  • one cycle of a 40 kHz bat signal consists of 6 samples
  • a typical bat pulse of (say) 5 ms consists of about 1200 samples
  • the delay between the left and right microphone is about 0.15 ms, or 36 samples

Approach

The plan is to prototype algorithms in a numerical math package (like matlab, octave or scilab), then port the algorithm over to a more high-level language (like C or Java)

Signal processing steps

I expect the following steps:

  • pre-processing: high-pass filter (e.g. 10 kHz) the signal to remove low-frequency noise and make the signal DC-free
  • detecting and isolating the individual calls
    • come up with some metric (e.g. energy in 0.1 ms block) to separate bat sounds from background noise
    • stitching "holes" in a call (e.g. ignore gaps of < 0.5 ms)
    • discarding non-bat clicks (e.g. ignore signals < 0.5 ms in duration)
  • determining the delay between the left and right channel
    • possible methods:
      • direct correlation, can be accelerated by FFT
      • correlation of the amplitude envelope
      • correlation of the frequency slope
      • compare center of gravity of the envelope
    • perhaps switch algorithms depending on the type of call (flat/CF or steep/FM)
  • determine flight direction based on set of delays measured during a pass
    • verify that delay changes monotonically during a pass, e.g. start with simple assumption that delay vs. time is linear and calculate a best fit.
    • discard outliers in a smart way
    • consider using the call energy as a weight factor, so clear calls have more significance than quiet ones
  • cool ideas for further classification (optional):
    • calculate doppler shift to get a flight speed estimate
  • output the result
    • make a nice graph to visually verify the conclusion made by the algorithms
    • export result to .csv for example

Hilbert transform

An interesting mathematical operation is the Hilbert transform. It allows conversion of a simple real signal into a complex "analytical signal". The analytical signal can be converted into an instantaneous amplitude estimate and an instantaneous frequency estimate. This probably works well for bat signals, since many bat signals are relatively "pure" (don't have many harmonics). So, using the Hilbert transform, a bat call can be relatively simply separated into an envelope signal and a frequency signal.

Random thoughts yet to be organized

  • I've observed some common pipistrelle bats en-route to use a very flat type of pulse, direct correlation works badly on these pulses. So perhaps use some other type of correlation (e.g. correlate on envelope)
  • a common source of noise is the sound of car brakes, often between 10 kHz - 20 kHz, partly in the range of bat calls. Can recordings containing these be automatically discarded?
  • is it possible to detect that a bat is purely "passing by" (not feeding) by looking at the regularity (inter-pulse-interval) of the calls?

Example call

Below is an example of a bat call from a pipistrellus bat. [plaatje] and its spectrogram [plaatje] It's fairly 'flat' call in the sense that most of the energy is between 35 and 45 kHz. You can see a bit of noise in the bottom of the spectrogram, we will filter that out in the program octave.

To load the wav-file into octave and remove the low-frequency noise

# load wav and get sample rate, bits per sample
[y,fs,bps] = wavread("call1.wav");
# create and apply a butterworth filter (maximally flat in the pass-band)
[b,a] = butter(4,0.2,"high");
y2 = filter(b,a,y);

Next, we can use the 'xcov' (cross-covariance) to calculate a set of correlations vs time offset. The cross-covariance function is very similar to the cross-correlation function, with the added advantage that the mean is automatically subtracted.

# calculate cross-covariance on y2(:,1) (left) and y2(:,2) (right) data, with a max delay of +/- 40
[r,lag] = xcov(y2(:,1), y2(:,2), 40);
plot(lag,r);

[plaatje]

From the correlation vs time plot we can see which delay has the highest correlation. However, the signal is oscillatory, with the peaks about 6 samples apart, corresponding to the mean frequency of the bat call (about 40 kHz). To smooth the peaks, we can apply the hilbert transform

# take absolute value of the hilbert transform to get the envelope of the correlation
r2 = abs(hilbert(r));
plot(lag,r); hold on;
plot(lag,r2);

[plaatje]

The correlation plot is now smooth and we simply take the maximum value now to find the index of maximum correlation

[m,ind] = max(r2);
lag(ind)