NCEI Water-Column Sonar Data Archive on AWS
Background
Water column sonar data, the acoustic back-scatter from the near-surface to the seafloor, are used to assess physical and biological characteristics of the ocean including the spatial distribution of plankton, fish, methane seeps, and underwater oil plumes.
In collaboration with NOAA's National Marine Fisheries Service (NMFS) and the University of Colorado Boulder, NOAA’s National Centers for Environmental Information (NCEI) established a national archive for water column sonar data. The project ensures the long-term stewardship of well-documented water column sonar data and enables discovery and access for researchers and the public around the world.
Data providers include NOAA National Marine Fisheries Service (NMFS), NOAA Office of Ocean Exploration and Research (OER), NOAA National Ocean Service (NOS), Rolling Deck to Repository (R2R), U.S. academic and private institutions, and international groups.
This data set comprises the water-column sonar data archived at NCEI, offered in a more readily accessible medium. Data are provided to NCEI in their raw format. Processing routines are being applied to a subset of the archive, focusing on Simrad EK60 single- and multiple-frequency data sets. Ping alignment, noise removal algorithms (De Robertis & Higginbottom, 2007; Ryan et al., 2015), and bottom detection algorithms are applied in Echoview (Myriax, v.10) to the raw data, which are binned into one-hour intervals. The processed data are exported as one CSV file for each interval and each frequency.
Additional Resources
Data
Raw archived data were collected using a variety of vessel-mounted sonars. The most common are Kongsberg's EM 122 (12 kHz) and EM 302 (30 kHz) and Simrad's EK60 (18-710 kHz, split beam), ME70 (70-120 kHz, can be split beam), and EK80 (18-710 kHz, split beam and broadband). The configuration of each cruise's sonar system (e.g., beam type and angle) can be found in the file metadata.
File names contain the start time for that file, and often include a preceding tag for that cruise. The timestamp is in UTC and follows the convention DYYYYMMDD-Thhmmss: the letter “D” followed by the date, then “-T” followed by the time. For example, “SaKe_2013-D20130522-T134850” indicates a file from a 2013 SaKe cruise whose start time is May 22, 2013 at 13:48:50 (UTC).
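The naming convention above can be parsed mechanically. The following is a minimal sketch that uses only the Python standard library; the function name `parse_file_name` and the regular expression are illustrative assumptions, not part of any archive tooling.

```python
import re
from datetime import datetime, timezone

# Optional cruise tag, then "D" + YYYYMMDD, then "-T" + hhmmss.
# The tag is matched non-greedily so that the FIRST timestamp in the name wins.
_NAME_PATTERN = re.compile(r"^(?:(?P<tag>.+?)-)?D(?P<date>\d{8})-T(?P<time>\d{6})")


def parse_file_name(name):
    """Return (cruise_tag, start_time_utc) for a file name.

    cruise_tag is None when the name carries no preceding tag.
    """
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"no DYYYYMMDD-Thhmmss timestamp in {name!r}")
    start = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    # File-name timestamps are in UTC, so attach that time zone explicitly.
    return match["tag"], start.replace(tzinfo=timezone.utc)
```

For the example above, `parse_file_name("SaKe_2013-D20130522-T134850")` yields the tag `"SaKe_2013"` and a UTC datetime for 2013-05-22 13:48:50.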
Type
Data are categorized as raw or processed.
Raw
Binary files are generated during individual cruises. Users would typically use a tool such as pyEcholab to open the files and process the data into a more conventional format.
Processed
Processed EK60 data are the output of a Matlab-Echoview (v.10)-Matlab workflow*, collated by frequency, e.g., 18, 38, 70, 120, and 200 kHz. Within each cruise's individual folders, the CSV files are formatted with headers that describe the structure of the underlying data.
*Any use of trade names does not imply endorsement by NOAA
Data Details
The raw EK60 data are processed with the routine below. This routine will be available in pyEcholab in 2020. Processed data are not available for all raw data; more will be added over time as they are created.
Noise removal including impulse, attenuation, transient, and background noise
Removal of top 10 m of data due to bubble interference
If EK60 data contain multiple frequencies, preprocess with a 3x3 median convolution filter and apply the multi-frequency single-beam imaging index outlined in Wall et al. (2016) using a threshold of -66 dB
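To make two of the steps above concrete, here is a minimal pure-Python sketch of removing the top 10 m and applying a 3x3 median filter. It is not the archive's Echoview workflow. The grid layout (one row per depth bin, one column per ping), the function names, and the handling of grid edges (the window is simply clipped) are all assumptions made for illustration. The imaging index and its -66 dB threshold from Wall et al. (2016) are not reproduced here.

```python
from statistics import median


def remove_surface_layer(depths_m, grid, cutoff_m=10.0):
    """Drop every depth bin shallower than cutoff_m.

    depths_m[i] is the depth of row grid[i]; each row holds one value per ping.
    """
    kept = [(d, row) for d, row in zip(depths_m, grid) if d >= cutoff_m]
    return [d for d, _ in kept], [row for _, row in kept]


def median_filter_3x3(grid):
    """Replace each cell with the median of its 3x3 neighbourhood.

    At the edges the window is clipped to the cells that exist (an assumption).
    """
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    filtered = []
    for i in range(n_rows):
        out_row = []
        for j in range(n_cols):
            window = [
                grid[a][b]
                for a in range(max(i - 1, 0), min(i + 2, n_rows))
                for b in range(max(j - 1, 0), min(j + 2, n_cols))
            ]
            out_row.append(median(window))
        filtered.append(out_row)
    return filtered
```

A single strong sample surrounded by weaker ones is replaced by the neighbourhood median, which is why a filter of this kind suppresses isolated spikes before an index is computed.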
Structure
Data are archived in an Amazon S3 bucket that is open to the general public. The folder structure is outlined as follows:
For processed data: cruise → transducer frequency/bottom/multi-frequency single-beam imaging index → file
For raw data: ship → cruise → instrument → file
To download an 18 kHz file from the SH1305 cruise, such as "SaKe2013-D20130523-T080854_to_SaKe2013-D20130523-T085643.csv", you can read it directly from its URL.
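A minimal sketch of how the folder structure maps onto object keys and public URLs is given below. The bucket name "noaa-wcsd-pds" comes from the Access section. The top-level prefixes `data/processed` and `data/raw`, the frequency folder name `18kHz`, and the virtual-hosted URL form are assumptions made for illustration; check them against a bucket listing before relying on them.

```python
from urllib.parse import quote

BUCKET_NAME = "noaa-wcsd-pds"          # bucket named in the Access section
PROCESSED_PREFIX = "data/processed"    # assumption: top-level prefix for processed data
RAW_PREFIX = "data/raw"                # assumption: top-level prefix for raw data


def processed_key(cruise, group, file_name):
    """Key for processed data: cruise -> frequency/bottom/index folder -> file."""
    return "/".join([PROCESSED_PREFIX, cruise, group, file_name])


def raw_key(ship, cruise, instrument, file_name):
    """Key for raw data: ship -> cruise -> instrument -> file."""
    return "/".join([RAW_PREFIX, ship, cruise, instrument, file_name])


def object_url(key, bucket=BUCKET_NAME):
    """Public HTTPS URL for an object, percent-encoding unsafe characters."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


url = object_url(
    processed_key(
        "SH1305",
        "18kHz",  # assumption: name of the 18 kHz folder
        "SaKe2013-D20130523-T080854_to_SaKe2013-D20130523-T085643.csv",
    )
)
```

With pandas installed, `pandas.read_csv(url)` can then read the CSV straight from that URL, since `read_csv` accepts HTTP(S) locations.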
Citation
If the archived data are used in a future publication, please cite every data set used, both to document the work and to credit the data creators. Cruises have unique citations. See individual cruises for details. Citation information can be found at the NCEI water-column sonar data archive.
Access
Raw and processed data are stored in the cloud in an Amazon Web Services S3 bucket and are accessible for download using a variety of tools.
The boto3 library provides an object-oriented and well-documented interface to the data set. We can configure the boto3 resource to access our bucket, "noaa-wcsd-pds", as an anonymous user using low-level functions from botocore.
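A minimal sketch of that configuration follows, assuming boto3 and botocore are installed. The helper names `open_anonymous_bucket` and `list_keys` are illustrative. `UNSIGNED` and `Config` are the botocore pieces that switch off request signing, so no AWS credentials are required.

```python
BUCKET_NAME = "noaa-wcsd-pds"


def open_anonymous_bucket(name=BUCKET_NAME):
    """Return a boto3 Bucket resource that sends unsigned (anonymous) requests."""
    # Imported here so this module can be loaded where boto3 is not installed.
    import boto3
    from botocore import UNSIGNED
    from botocore.client import Config

    s3 = boto3.resource("s3", config=Config(signature_version=UNSIGNED))
    return s3.Bucket(name)


def list_keys(bucket, prefix, limit=10):
    """Return up to `limit` object keys that start with `prefix`."""
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix).limit(limit)]
```

For example, `list_keys(open_anonymous_bucket(), "data/")` would list a few keys under the `data/` prefix, where that prefix is itself an assumption about the bucket layout.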
To download and cache a file while checking for exceptions:
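A minimal sketch of that pattern is shown here. It takes any object with a boto3-style `download_file(key, filename)` method, such as a boto3 `Bucket` resource. The function name `download_cached` and the cache directory `wcsd_cache` are illustrative assumptions. Rather than importing botocore, the sketch inspects the `response` attribute that botocore's `ClientError` carries, which is a simplification made so the example stays self-contained.

```python
import os


def download_cached(bucket, key, cache_dir="wcsd_cache"):
    """Download `key` into `cache_dir` unless a local copy already exists.

    Returns the local file path. Raises FileNotFoundError if the object
    is missing from the bucket; any other error is re-raised unchanged.
    """
    local_path = os.path.join(cache_dir, *key.split("/"))
    if os.path.isfile(local_path):
        return local_path  # cache hit: skip the download

    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    try:
        bucket.download_file(key, local_path)
    except Exception as exc:
        # botocore's ClientError exposes the service error code here.
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code")
        if code in ("404", "NoSuchKey"):
            raise FileNotFoundError(f"{key} does not exist in the bucket") from exc
        raise
    return local_path
```

Calling the function twice with the same key contacts S3 only once; the second call returns the cached path.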
Tutorials
There are several tutorials that will help you download the data and begin analysis. They utilize both raw and processed data.
Plotting Raw EK60 Data [EK60 Jupyter Notebook]
Frequency Differencing with Raw Data [Frequency Jupyter Notebook]
Reading and Plotting Processed CSV Data [CSV Jupyter Notebook]
Reading and Plotting Raw Bottom Data [Bottom Jupyter Notebook]
Updated tutorials utilizing cloud-native Zarr data from NOAA's Open Data Dissemination can be found here:
Echofish — Echopype EK60 Cloud Processing [Colab Notebook]
Echofish — Frequency Differencing with L2 EK60 Data [Colab Notebook]
Echofish — Geospatial Indexing [Colab Notebook]
References
De Robertis, A., & Higginbottom, I. (2007). A post-processing technique to estimate the signal-to-noise ratio and remove echosounder background noise. ICES Journal of Marine Science, 64(6): 1282-1291.
Ryan, T.E., Downie, R.A., Kloser, R.J., and Keith, G. (2015). Reducing bias due to noise and attenuation in open-ocean echo integration data. ICES Journal of Marine Science, 72(8): 2482-2493.
Simmonds, E.J. and MacLennan, D.N. 2005. Fisheries Acoustics: Theory and practice. Blackwell Science, Oxford. 456pp.
Wall, C.C. (2016), Building an accessible archive for water column sonar data, Eos, 97, https://doi.org/10.1029/2016EO057595. Published on 15 August 2016.
Wall, C.C., Jech, J.M., and McLean, S.J. (2016). Increasing the accessibility of acoustic data through global access and imagery. ICES Journal of Marine Science, 73(8): 2093-2103. https://doi.org/10.1093/icesjms/fsw014.
Contact wcd.info@noaa.gov for support with the data set