Summary:
Marine geophysical exchange files for R/V Kilo Moana, 2002 to 2018: 328 geophysical archive files spanning km0201, the vessel's first expedition, through km1812, the last survey included in this data synthesis.
Data formats (you will likely require only one of these):
MGD77T (M77T): ASCII - the current standard format for marine geophysical data exchange, tab delimited, low human readability
MGD77: ASCII - legacy format for marine geophysical data exchange (no longer recommended due to truncated data precision and low human readability)
GMT DAT: ASCII - the Generic Mapping Tools format in which these archive files were built, best human readability but largest file size
MGD77+: highly flexible and disk-space-saving binary NetCDF-based format, enables adding additional columns and application of errata-based data correction methods (e.g., Chandler et al., 2012), not human readable
The process by which formats were converted is explained below.
Data Reduction and Explanation:
R/V Kilo Moana routinely acquired bathymetry data using two concurrently operated sonar systems; hence, for this analysis, a best effort was made to extract center beam depth values from the appropriate sonar system. No resampling or decimation of center beam depth data has been performed, with the exception that all depth measurements were required to be temporally separated by at least 1 second. The initial sonar systems were the Kongsberg EM120 for deep water and the EM1002 for shallow water mapping. The vessel's deep sonar system was upgraded to the Kongsberg EM122 in January 2010 and the shallow system to the EM710 in March 2012.
The vessel deployed a LaCoste and Romberg spring-type gravity meter (S-33) from 2002 until March 2012, when it was replaced with a Bell Labs BGM-3 forced feedback-type gravity meter. Importantly, gravity tie-in logs were by and large inadequate for the rigorous removal of gravity drift and tares. Hence a best effort has been made to remove gravity meter drift via robust regression against satellite-derived gravity data. The regression slope and intercept are analogous to instrument drift and DC shift; hence their removal markedly improves the agreement between shipboard and satellite gravity anomalies for most surveys. These drift corrections were applied to both the observed gravity and free air anomaly fields. If the corrections are undesired by users, the correction coefficients have been supplied within the metadata headers for all gravity surveys, thereby allowing users to undo these drift corrections.
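For illustration only, the Python sketch below shows a drift correction of the kind described above: the shipboard-minus-satellite gravity misfit is fit against elapsed time with a robust estimator and the fitted trend is removed. The Theil-Sen estimator, the variable names, and the function itself are assumptions made here for illustration, not the code used to build these archives:

import numpy as np
from scipy.stats import theilslopes

def remove_gravity_drift(time_s, ship_grav, sat_grav):
    # Robust fit of the shipboard-minus-satellite misfit against elapsed time.
    t = np.asarray(time_s, dtype=float)
    misfit = np.asarray(ship_grav, dtype=float) - np.asarray(sat_grav, dtype=float)
    slope, intercept, _, _ = theilslopes(misfit, t)
    # Removing the trend removes drift (slope) and DC shift (intercept); keeping
    # the two coefficients allows the correction to be undone later.
    corrected = np.asarray(ship_grav, dtype=float) - (slope * t + intercept)
    return corrected, slope, intercept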
The L&R gravity meter had a 180 second hardware filter, so for this analysis the data were Gaussian filtered over an additional 180 seconds and resampled at 10 second intervals. BGM-3 data are not hardware filtered; hence a 360 second Gaussian filter was applied for this analysis, and BGM-3 gravity anomalies were resampled at 15 second intervals. For both meter types, data gaps exceeding the filter length were not interpolated through. Eötvös corrections were computed via the standard formula (e.g., Dehlinger, 1978) and were subjected to the same filtering as the respective gravity meter data.
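As a rough illustration of the filtering and resampling described above (not the production code), the sketch below applies a Gaussian filter to a regularly sampled 1 Hz series and decimates it, propagating gaps rather than interpolating through them. Treating the quoted filter length as the full 6-sigma width of the Gaussian is an assumption made here for illustration:

import numpy as np
from scipy.ndimage import gaussian_filter1d

def gaussian_smooth_resample(values, filter_len_s=360.0, step_s=15.0, dt_s=1.0):
    v = np.asarray(values, dtype=float)
    sigma = (filter_len_s / dt_s) / 6.0              # assumed: filter length = full 6-sigma width
    good = np.isfinite(v)
    smoothed = gaussian_filter1d(np.where(good, v, 0.0), sigma)
    weight = gaussian_filter1d(good.astype(float), sigma)
    out = np.full_like(v, np.nan)
    ok = weight > 0.5                                # leave gap-dominated samples empty
    out[ok] = smoothed[ok] / weight[ok]
    return out[:: int(step_s / dt_s)]                # decimate to the resampling interval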
The vessel also deployed a Geometrics G-882 cesium vapor magnetometer on several expeditions. A Gaussian filter length of 135 seconds has been applied and resampling was performed at 15 second intervals with the same exception that no interpolation was performed through data gaps exceeding the filter length.
Archive file production:
At all depth, gravity and magnetic measurement times, vessel GPS navigation was resampled using linear interpolation, as most geophysical measurement times did not exactly coincide with GPS position times. The geophysical fields were then merged with resampled vessel navigation and listed sequentially in the GMT DAT format to produce data records.
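A minimal sketch of that navigation merge, assuming simple arrays of times and positions (variable names are illustrative; longitude unwrapping near the dateline is ignored here):

import numpy as np

def nav_at_measurement_times(meas_time, nav_time, nav_lon, nav_lat):
    # Linearly interpolate vessel positions to each geophysical measurement time.
    lon = np.interp(meas_time, nav_time, nav_lon)
    lat = np.interp(meas_time, nav_time, nav_lat)
    return lon, lat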
Archive file header fields were populated with relevant information such as port names, PI names, instrument and data processing details, and others whereas survey geographic and temporal boundary fields were automatically computed from the data records.
Archive file conversion:
Once completed, each marine geophysical data exchange file was converted to the other formats using the Generic Mapping Tools program known as mgd77convert. For example, conversions to the other formats were carried out as follows:
mgd77convert km0201.dat -Ft -Tm # gives mgd77t (m77t file extension)
mgd77convert km0201.dat -Ft -Ta # gives mgd77
mgd77convert km0201.dat -Ft -Tc # gives mgd77+ (nc file extension)
Disclaimers:
These data have not been edited in detail using a visual data editor and data outliers are known to exist. Several hardware malfunctions are known to have occurred during the 2002 to 2018 time frame and these malfunctions are apparent in some of the data sets. No guarantee is made that the data are accurate and they are not meant to be used for vessel navigation. Close scrutiny and further removal of outliers and other artifacts is recommended before making scientific determinations from these data.
The archive file production method employed for this analysis is explained in detail by Hamilton et al. (2019).
Attribution 2.5 (CC BY 2.5)https://creativecommons.org/licenses/by/2.5/
License information was derived automatically
Search API for looking up addresses and roads within the catchment. The API can search for both addresses and roads, or either individually. This dataset is updated weekly from VicMap Roads and Addresses, sourced via www.data.vic.gov.au.

The Search API uses a data.gov.au datastore and allows a user to take full advantage of full text search functionality.

An SQL attribute is passed to the URL to define the query against the API. Please note that the attribute must be URL encoded. The SQL statement takes the form below:
SELECT distinct display, x, y
FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a"
WHERE _full_text @@ to_tsquery(replace('[term]', ' ', ' %26 '))
LIMIT 10

The above will select the top 10 results from the API matching the input 'term', and return the display name as well as an x and y coordinate.
The full URL for the above query would be:

https://data.gov.au/api/3/action/datastore_search_sql?sql=SELECT display, x, y FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a" WHERE _full_text @@ to_tsquery(replace('[term]', ' ', ' %26 ')) LIMIT 10
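The same query can also be issued from a script. The minimal Python sketch below (using the third-party requests library) passes the raw SQL through the params argument so that requests performs the URL encoding, including the '&' operator (%26 in the encoded URL above). The function name and the use of requests are illustrative assumptions, not part of the API itself:

import requests

RESOURCE_ID = "4bf30358-6dc6-412c-91ee-a6f15aaee62a"
API_URL = "https://data.gov.au/api/3/action/datastore_search_sql"

def search(term, limit=10):
    sql = (
        f'SELECT distinct display, x, y FROM "{RESOURCE_ID}" '
        f"WHERE _full_text @@ to_tsquery(replace('{term}', ' ', ' & ')) "
        f"LIMIT {limit}"
    )
    response = requests.get(API_URL, params={"sql": sql})
    response.raise_for_status()
    # CKAN-style response: matching rows sit under result -> records.
    return response.json()["result"]["records"]

# Example: search("darlot")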
Any field in the source dataset can be returned via the API. Display, x and y are used in the example above, but any other field can be returned by altering the select component of the sql statement. See examples below.
Source datasets and LGAs can also be used to filter results. When no filter is used, the API defaults to searching all records. See examples below.
A filter can be applied to select for a particular source dataset using the 'src' field. The currently available datasets are as follows:
- 1 for Roads
- 2 for Address
- 3 for Localities
- 4 for Parcels (CREF and SPI)
- 5 for Localities (Propnum)
Filters can be applied to select for a specific local government area using the 'lga_code' field. LGA codes are derived from Vicmap LGA datasets. Wimmera LGAs include:
- 332 Horsham Rural City Council
- 330 Hindmarsh Shire Council
- 357 Northern Grampians Shire Council
- 371 West Wimmera Shire Council
- 378 Yarriambiack Shire Council
Search for the top 10 addresses and roads with the word 'darlot' in their names:

SELECT distinct display, x, y FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a" WHERE _full_text @@ to_tsquery(replace('darlot', ' ', ' %26 ')) LIMIT 10
Search for all roads with the word 'perkins' in their names:

SELECT distinct display, x, y FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a" WHERE _full_text @@ to_tsquery(replace('perkins', ' ', ' %26 ')) AND src=1
Search for all addresses with the word 'kalimna' in their names, within Horsham Rural City Council:

SELECT distinct display, x, y FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a" WHERE _full_text @@ to_tsquery(replace('kalimna', ' ', ' %26 ')) AND src=2 AND lga_code=332
Search for the top 10 addresses and roads with the word 'green' in their names, returning just their display name, locality, x and y:

SELECT distinct display, locality, x, y FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a" WHERE _full_text @@ to_tsquery(replace('green', ' ', ' %26 ')) LIMIT 10
Search all addresses in Hindmarsh Shire:

SELECT distinct display, locality, x, y FROM "4bf30358-6dc6-412c-91ee-a6f15aaee62a" WHERE lga_code=330
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Retrospective Analysis of Antarctic Tracking Data (RAATD) is a Scientific Committee on Antarctic Research (SCAR) project led jointly by the Expert Groups on Birds and Marine Mammals and Antarctic Biodiversity Informatics, and endorsed by the Commission for the Conservation of Antarctic Marine Living Resources. The RAATD project team consolidated tracking data for multiple species of Antarctic meso- and top-predators to identify Areas of Ecological Significance. These datasets constitute the compiled tracking data from a large number of research groups that have worked in the Antarctic since the 1990s.
This metadata record pertains to the "filtered" version of the data files. These files contain position estimates that have been processed using a state-space model in order to estimate locations at regular time intervals. For technical details of the filtering process, consult the data paper. The filtering code can be found in the https://github.com/SCAR/RAATD repository.
This data set comprises one metadata csv file that describes all deployments, along with data files (3 files for each of 17 species). For each species there is:
- an RDS file that contains the fitted TMB filter model object and model predictions (this file is in RDS format, which can be read by the R statistical software package)
- a PDF file that shows the quality control results for each individual model
- a CSV file containing the interpolated position estimates
For details of the file contents and formats, consult the data paper.
The original copy of these data is available through the Australian Antarctic Data Centre (https://data.aad.gov.au/metadata/records/SCAR_EGBAMM_RAATD_2018_Filtered)
The data are also available in a standardized version (see https://data.aad.gov.au/metadata/records/SCAR_EGBAMM_RAATD_2018_Standardised) that contains position estimates as provided by the original data collectors (generally, raw Argos or GPS locations, or estimated GLS locations) without state-space filtering.
This archive contains the summarization corpus generated as a result of the filtering stages (trials-final.csv), the ROUGE scores for the generated summaries (rouge-results-parsed.csv), the data and results of the human evaluation (evaluation/ subfolder), and the code used to generate the corpus (extract.r, filter.r, and determine_similarity_threshold.r). The summaries were generated using the summarize_all.py script.
Attribution 2.5 (CC BY 2.5)https://creativecommons.org/licenses/by/2.5/
License information was derived automatically
The Australian Charities and Not-for-profits Commission (ACNC) is Australia’s national regulator of charities.

Since 3 December 2012, charities wanting to access Commonwealth charity tax concessions (and other benefits) need to register with the ACNC. Although many charities choose to register, registration with the ACNC is voluntary.

Each year, registered charities are required to lodge an Annual Information Statement (AIS) with the ACNC. Charities are required to submit their AIS within six months of the end of their reporting period.

Registered charities can apply to the ACNC to have some or all of the information they provide withheld from the ACNC Register. However, there are only limited circumstances when the ACNC can agree to withhold information. If a charity has applied to have their data withheld, the AIS data relating to that charity has been excluded from this dataset.

This dataset can be used to find the AIS information lodged by multiple charities. It can also be used to filter and sort by different variables across all AIS information. AIS information for individual charities can be viewed via the ACNC Charity Register.

The AIS collects information about charity finances, and financial information provides a basis for understanding the charity and its activities in greater detail. We have published explanatory notes to help you understand this dataset.

When comparing charities’ financial information it is important to consider each charity's unique situation. This is particularly true for small charities, which are not compelled to provide financial reports – reports that often contain more details about their financial position and activities – as part of their AIS.

For more information on interpreting financial information, please refer to the ACNC website.

The ACNC also publishes other datasets on data.gov.au as part of our commitment to open data and transparent regulation.

NOTE: It is possible that some information in this dataset might be subject to a future request from a charity to have their information withheld. If this occurs, this information will still appear in the dataset until the next update. Please consider this risk when using this dataset.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Particulates in the water were concentrated onto 25mm glass fibre filters.
Light transmission and reflection through the filters were measured using a spectrophotometer to yield spectral absorption coefficients.
Data Acquisition:
Water samples were taken from Niskin bottles mounted on the CTD rosette. Two or three depths were selected at each station, using the CTD fluorometer profile to identify the depth of maximum fluorescence and a depth below the fluorescence maximum. One sample was always taken at 10 m, provided water was available, as a reference depth for comparisons with satellite data (remote sensing international standard). Water sampling was carried out after the other groups had sampled, leading to a considerable time delay of between half an hour and 3 hours, during which particulates are likely to have sedimented within the Niskin bottle and algae to have photoadapted to the dark. In order to minimise problems of sedimentation, as large a sample as practical was taken. Often so little water remained in the Niskin bottle that the entire remnant was taken. Where less than one litre remained, leftover sample water was taken from the HPLC group. Water samples were filtered through 25 mm diameter GF/F filters under a low vacuum (less than 5 mmHg), in the dark. Filters were stored in tissue capsules in liquid nitrogen and transported to the lab for analysis after the cruise. Three water samples were filtered through GF/F filters under gravity, with two 30 ml pre-rinses to remove organic substances from the filter, and brought to the laboratory for further filtration through 0.2 micron membrane filters.
Filters were analysed in batches of 3 to 7, with all depths at each station being analysed within the same batch to ensure comparability. Filters were removed one batch at a time and placed on ice in the dark. Once defrosted, the filters were placed upon a drop of filtered seawater in a clean petri dish and returned to cold, dark conditions. One by one, the filters were placed on a clean glass plate and scanned from 200 to 900 nm in a spectrophotometer equipped with an integrating sphere. A fresh baseline was taken with each new batch using 2 blank filters from the same batch as the sample filters, soaked in filtered seawater. After scanning, the filters were placed on a filtration manifold, soaked in methanol for between 1 and 2 hours to extract pigments, and rinsed with filtered seawater. They were then scanned again against blanks soaked in methanol and rinsed in filtered seawater.
Data Processing:
The initial scan of total particulate matter, ap, and the second scan of non-pigmented particles, anp, were corrected for baseline wandering by setting the near-infrared absorption to zero.
This technique requires correction for enhanced scattering within the filter, which has been reported to vary with species. One dilution series was carried out at station 118 to allow calculation of the correction (beta-factor). Since it is debatable whether this factor will be applicable to all samples, no correction has been applied to the dataset. Potential users should contact JSchwarz for advice on this matter when using the data quantitatively.
Not yet complete:
Comparison of the beta-factor calculated for station 118 with the literature values.
Comparison of phytoplankton populations from station 118 with those found at other stations to evaluate the applicability of the beta-factor.
Dataset Format:
Two files: phyto_absorp_brokew.txt and phyto_absorp_brokew_2.txt: covering stations 4 to 90 and 91 to 118, respectively. Note that not every station was sampled.
File format: Matlab-readable ASCII text with 3 'header' lines:
Row 1: col. 1 = -999, col. 2 to end = CTD number
Row 2: col. 1 = -999, col. 2 to end = sample depth in metres
Row 3: col. 1 = -999, col. 2 to end = 1 for total absorption by particulates, 2 for absorption by non-pigmented particles
Row 4 to end: col. 1 = wavelength in nanometres, col. 2 to end = absorption coefficient corresponding to the station, depth and type given in rows 1 to 3 of the same column.
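A short Python sketch for reading this layout (assuming numpy; the row and column indexing follows the description above):

import numpy as np

data = np.loadtxt("phyto_absorp_brokew.txt")
ctd_number    = data[0, 1:].astype(int)   # row 1: CTD number for each column
depth_m       = data[1, 1:]               # row 2: sample depth (m)
spectrum_type = data[2, 1:].astype(int)   # row 3: 1 = total particulate, 2 = non-pigmented
wavelength_nm = data[3:, 0]               # rows 4+: first column is wavelength (nm)
absorption    = data[3:, 1:]              # rows 4+: one absorption spectrum per column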
This work was completed as part of ASAC projects 2655 and 2679 (ASAC_2655, ASAC_2679).
http://spdx.org/licenses/CC0-1.0
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Summary of the data
Conductivity-Temperature-Depth (CTD) profiles with auxiliary sensor data from cruises to Kongsfjorden (KF) in the summers of 2011-2020 and to Rijpfjorden (RF) in 2011-2014 and 2016-2017. The cruises were funded by MOSJ (Environmental monitoring of Svalbard and Jan Mayen) and others.
This dataset contains processed profiles of sensor temperature and salinity collected using an SBE911+ unit from R/V Lance (2011-2018), R/V Helmer Hanssen (2019) and R/V Kronprins Haakon (2020), as well as calibrated fluorescence profiles and other auxiliary sensor data. The parameters included are (not all parameters are available for every year): temperature, conductivity, calculated practical salinity (EOS-80), chlorophyll fluorescence, uncalibrated dissolved oxygen (both as dissolved oxygen and raw sensor voltage), colored dissolved organic matter (CDOM) fluorescence (voltage only), beam attenuation, PAR and SPAR, and uncalibrated turbidity. Profile data are from down casts only and are available at 1-decibar vertical resolution (i.e. averaged into 1-decibar bins).
The processing for 2014-2020 involved the following steps, using SeaBird Electronics software:
Data conversion
Filtering the data: no filter applied to temperature, conductivity, or descent rate; other variables filtered with time constant A = 0.03 s (voltages and oxygen) or B = 0.15 s (all other variables).
Cell thermal mass computations
Loop edit: remove the surface soak and scans with velocities lower than 0.15 m/s (2015-2017, 2019) or 0.2 m/s (2018, 2020)
Wild edit with pass 1 = 2 standard deviations, pass 2 = 20 standard deviations, and 100 points per block
Derive salinities
Bin average to 1 dbar bins from a minimum of 3 data points
Bottle summary
Extra filtering for CDOM
For 2013 and 2014, only data conversion and derive were run because the raw data had already been averaged, presumably by number of measurements (20?). The ‘binned’ data contain the value from the nearest averaged raw-data bin within 2 dbar.
For 2011 and 2012 no raw data were available. For 2012, data conversion and bin averaging had been run earlier. For 2011, no information on the processing details was found.
Temperature and salinity have been quality-controlled and salinity spikes have been removed. No salinity samples were taken, so the salinity data could not be verified against bottle samples. The primary conductivity sensors were of better quality than the secondary sensors, and data from the primary sensors were selected for inclusion. The fluorescence data have been calibrated against Chl-a samples from bottles (Table 1, Figure 1 in the summary pdf). Bottle sample data will be published separately by Anette Wold et al. Please note that the rest of the data have not been calibrated nor quality-controlled and are provided as-is. The quality of fluorescence data from the CTD in 2013 seems worse than in other years. The sensors are listed in Table 2 in the summary pdf.
CTD chlorophyll data were calibrated based on a linear fit of CTD fluorescence against bottle sample chlorophyll (Chl-a) measurements:
The median value was determined at depths from 250 to 500 m, where the fluorescence value was expected to equal zero or be close to zero.
A visual inspection confirmed that there was no drift between profiles.
The offset was removed from the fluorescence values
The calibration of fluorescence values against Chl-a values: When .btl files were available from the CTD, we only included the values when the standard deviation of the bottle value divided by the fluorescence measurement was < 0.15, to avoid highly noisy parts of the fluorescence profile. As the standard deviation was not available for all the years and the data are quite sparse, CTD fluorescence values larger than 2.5 mg/m3 were also excluded. We then fit a linear regression through the origin between the bottle Chl-a measurement and the offset-corrected CTD fluorescence. The coefficients were used to calibrate the offset-corrected CTD (and SAIV) fluorescence profiles.
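The calibration steps above can be summarised by the following schematic Python sketch (a simplified restatement of the procedure, not the original processing code; variable names are illustrative):

import numpy as np

def fluorescence_offset(pressure_dbar, fluorescence):
    # Median sensor value between 250 and 500 dbar, where chlorophyll is assumed near zero.
    deep = (pressure_dbar >= 250) & (pressure_dbar <= 500)
    return np.nanmedian(fluorescence[deep])

def calibration_slope(bottle_chla, fluor_offset_corrected, bottle_std=None):
    # Linear regression through the origin: Chl a = slope * offset-corrected fluorescence.
    keep = fluor_offset_corrected <= 2.5                          # exclude high, noisy values
    if bottle_std is not None:
        keep &= (bottle_std / fluor_offset_corrected) < 0.15      # exclude noisy bottle matches
    x, y = fluor_offset_corrected[keep], bottle_chla[keep]
    return np.sum(x * y) / np.sum(x * x)

# Calibrated chlorophyll = slope * (fluorescence - offset)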
The data are first published as two .csv files containing all the years. File ‘CTD_KF_2011_2020_version1.csv’ contains 2011-2020 Kongsfjorden and nearby areas, and file ‘CTD_RF_2011_2017.csv’ contains Rijpfjorden and other areas.
The data will be published in annual, self-documenting netCDF files. Profile data are organized in arrays with one column per cast and one row per pressure bin (BIN_*). 1-dimensional metadata such as time and position are organized in a row-vector with one value per cast. All variables within an annual file have the same number of columns, equal to the total number of CTD casts. Not all years contain the same auxiliary sensor data. For more information on the sampling routines, please refer to some of the cruise reports at https://brage.npolar.no/npolar-xmlui/handle/11250/172693
Temperature and salinity from the earlier years have previously been included in datasets published at data.npolar.no (https://doi.org/10.21334/unis-hydrography, UNIS HD), and other parameters are included in the KF 2016-2017 dataset (https://doi.org/10.21334/npolar.2023.62247dad).
Table 1. Chlorophyll / fluorescence linear regression coefficients through origin and the offset of sensor data at depth 250-500 dbar
Year Coefficient Offset Nr stations with samples/Nr stations
2020 0.7615 0.0250 25/26
2019 0.4188 0.0366 13/40
2018 0.3437 -0.1016 10/23
2017 1.0005 -0.1267 20/47
2016 0.6856 -0.0973 21/49
2015 0.3698 -0.0313 9/15
2014 0.7190 -0.1253 22/25
2013 2.0873 -0.1558 21/49
2012 0.9675 -0.1558 12/48
2011 2.1199 -0.1673 13/36
Wombats_VOI.qmd: Primary analysis script containing data cleaning, modelling, and visualisation

## Data Files

### GBIF Wombat Occurrence Data
- 0014630-250127130748423.csv - GBIF wombat occurrence export (download available from: https://www.gbif.org/occurrence/download/0014630-250127130748423)

### Environmental and Administrative Data
- HCAS31_AHC_2020_2022_NSW_50km_3577.tif - Habitat condition raster for NSW (from CSIRO Habitat Condition Assessment System; subset of full dataset available from https://data.csiro.au/collection/csiro%3A63571v7)
- RA_2021_AUST_GDA2020 - ABS Remoteness Areas 2021 shapefile (folder containing multiple files; available from https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files. Specific shapefile URL: https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition-3/jul2021-jun2026/access-and-downloads/digital-boundary-files/RA_2021_AUST_GDA2020.zip)

## Requirements

- R (version 4.3.2 or later)
- Required R packages:
  - tidyverse (v2.0.0) - for data manipulation and visualization
  - sf (v1.0-16) - for spatial data handling
  - lubridate (v1.9.3) - for date handling
  - raster (v3.6-26) - for raster data handling
  - ozmaps (v0.4.5) - for Australian state boundaries
  - terra (v1.7-71) - for efficient raster processing
  - exactextractr (v0.10.0) - for exact extraction from raster to polygons
  - patchwork (v1.2.0) - for combining plots
  - corrplot (v0.92) - for correlation matrix visualization
  - RColorBrewer (v1.1-3) - for color palettes
  - gridExtra (v2.3) - for arranging multiple plots
  - grid - for low-level graphics (base R)

Install these packages before running the script.

## Analysis Process

The analysis follows these key steps:
1. Data Cleaning
   - Filter to NSW and ACT wombat records
   - Remove records with high uncertainty (>10km)
   - Create a 50km grid system across NSW and ACT
2. Temporal Analysis
   - Calculate presence/absence of wombats in each grid cell by year
   - Determine empirical observation probability for each grid cell
   - Analyse temporal stability using 5-year sliding windows
3. Value of Information Calculation
   - Calculate expected information gain through KL divergence
   - Simulate adding new observations using the binomial model
   - Map VOI across NSW and ACT
4. Need for Information Integration
   - Import and process habitat condition data
   - Convert to habitat loss metric (NFI)
   - Integrate with VOI analysis
5. Cost of Information
   - Process remoteness area data as a proxy for sampling cost
   - Map remoteness areas across the study region
6. Combined Analysis
   - Create quadrant analysis combining VOI and NFI
   - Generate comprehensive visualisations of all three metrics

## How to Use

1. Download this repository to your local machine.
2. Set your working directory to the location of the script.
3. Ensure all required R packages are installed.
4. Run the script in RStudio or your preferred R environment.

## Outputs

The primary output is combined_quadrant_analysis_100425.png, which displays four panels:
1. Value of Information (VOI) - Expected information gain across the study area
2. Need for Information (NFI) - Habitat loss percentiles
3. Cost of Information (COI) - Remoteness areas as a proxy for sampling cost
4. VOI vs NFI Quadrant Analysis - Relationship between information value and need

## Citation

If you use this code or methodology, please cite:
Forbes, O., Thrall, P.H., Young, A.G., Ong, C.S. (2025). Natural History Collections at the Crossroads: Shifting Priorities and Data-Driven Opportunities. [Journal information pending]
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset should be read alongside other energy consumption datasets on the City of Melbourne open data platform as well as the following report:
The dataset outlines modelled energy consumption across the City of Melbourne municipality. It is not energy consumption data captured by a meter, but modelled data based on building attributes such as building age, floor area, etc. This data was provided by the CSIRO as a result of a study commissioned by the IMAP Councils. The study was governed by a Grant Agreement between the Councils and the CSIRO, which stated an intent for the data to be published. This specific dataset is presented at a block-level scale. It includes both commercial and residential buildings and is a 2021 projection, relative to a 2011 baseline, based on a scenario of buildings being retrofitted. It does not include the industrial sector.
Soluble iron (Fe), the Fe passing through a 0.02 µm Anodisc membrane filter, is reported in nmol Fe per kg of seawater. Samples were collected on the U.S. GEOTRACES North Atlantic Zonal Transect, Leg 2, in 2011.
In comparing this data to other published profiles of soluble Fe, it is valuable to know that soluble Fe is a highly operationally-defined parameter. The two most common methods of collecting soluble Fe samples are via 0.02 µm Anopore membrane filtration (this study) and by cross-flow filtration. An intercalibration between the two methods used to collect soluble Fe samples on the U.S. Atlantic GEOTRACES cruises is described in this excerpt (PDF) from a Fitzsimmons manuscript (in preparation). The intercalibration determined that "soluble Fe produced by cross-flow filtration (10 kDa membrane) is only ~65-70% of the soluble Fe produced by Anopore filtration."
Please note that some US GEOTRACES data may not be final, pending intercalibration results and further analysis. If you are interested in following changes to US GEOTRACES NAT data, there is an RSS feed available via the BCO-DMO US GEOTRACES project page (scroll down and expand the "Datasets" section).
The dataset consists of 100 CTD casts in the region north of Flemish Cap. Some casts cover the full water column, while others only cover the upper 1000 dbar. The CTD casts were obtained with a SeaBird SBE911+ system, measuring temperature (2 sensors), conductivity (2 sensors), pressure, beam transmission, oxygen (plumbed in series with the primary T/C sensor pair), chlorophyll fluorescence, and turbidity. All sensors were sampled at 24 Hz.

The data were processed using the SeaBird data processing software suite, SBEDataProcessing-Win32. A low pass filter, with a time constant of 0.15 s, was applied to the pressure record. To account for the transit time between the temperature and conductivity sensors, the conductivity measurements were aligned with the temperature measurements using empirically determined time delays. The primary conductivity was delayed by 0.011 s relative to pressure (this is in addition to the advance of 0.073 s which is performed by the SeaBird deckbox during data acquisition, thus resulting in a net advance of 0.062 s). The secondary conductivity was advanced by 0.050 s. The oxygen voltage was advanced by 4 s relative to pressure. A correction for conductivity cell thermal mass effects was applied to both conductivity channels using the parameters recommended by SeaBird (alpha=0.03, 1/beta=7.0). The temperatures, conductivities, and oxygen voltage were then median filtered using a 7-scan window. A loop edit step was then applied, whereby portions of the cast in which the pressure was not changing sufficiently fast (0.2 dbar/s) were removed. This was followed by computation of salinity, potential temperature, potential density, sound velocity, geopotential anomaly, and oxygen concentration. Finally, the data from the downcast were averaged into 1 dbar bins. Further details of the CTD data processing can be found in the header portion of the individual cast files.

The final data files contain raw sensor values (1 dbar bin averages) plus a number of derived variables (e.g., potential temperature, salinity, sigma-theta, oxygen). A full list of the output variables is contained in the header portion of the cast files. The casts were visually examined to determine the quality of the data from the 2 separate sensor suites (primary and secondary). A header line was placed in each file indicating the preferred sensor pair (PRIMARY or SECONDARY) if one was bad, or whether both were of equal quality (BOTH GOOD).
The dataset consists of 173 CTD casts in Rhode Island and Block Island Sounds obtained during 4 surveys. The surveys were performed during 22-24 September 2009, 7-8 December 2009, 9-11 March 2010, and 16-18 June 2010. The casts cover nearly the entire water column from the surface to approximately 2 m above the bottom. The data were obtained with a SeaBird SBE19plus, which measures temperature, conductivity, pressure, optical backscatter, chlorophyll fluorescence, oxygen, and photosynthetically active radiation (PAR). All sensors were sampled at 4 Hz.

The data were processed using the SeaBird data processing software suite, SBEDataProcessing-Win32. A low pass filter, with a time constant of 1 s, was applied to the pressure record. Temperature and conductivity were low pass filtered with a 0.5 s filter time constant. To account for the relatively slower response of the temperature sensor, the temperature was advanced in time by 0.5 seconds relative to pressure. The oxygen voltage was advanced relative to pressure by 2 seconds for the September, December, and June survey casts and by 5 seconds for the March survey casts. A correction for conductivity cell thermal mass effects was applied to the conductivity signal using the parameters recommended by SeaBird (alpha=0.04, 1/beta=8.0). A loop edit step was then applied, whereby portions of the cast in which the pressure was not changing sufficiently fast (0.1 dbar/s) were removed. This was followed by computation of salinity, sigma-t, and oxygen concentration. Finally, the data from the downcast were averaged into 1 dbar bins. Further details of the CTD data processing can be found in the header portion of the individual cast files.

The final data files contain raw sensor values (1 dbar bin averages) plus a number of derived variables (e.g., salinity, sigma-t, oxygen). A full list of the output variables is contained in the header portion of the cast files.
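As a simplified illustration of two steps common to these CTD descriptions (the actual processing used SeaBird's SBEDataProcessing-Win32 software), the Python sketch below removes scans whose descent rate falls below a threshold and then averages a downcast variable into 1 dbar pressure bins; thresholds and names are illustrative assumptions:

import numpy as np

def loop_edit_mask(pressure_dbar, scan_rate_hz, min_descent_dbar_per_s=0.1):
    # Keep only scans where the package is descending at least this fast.
    descent_rate = np.gradient(pressure_dbar) * scan_rate_hz
    return descent_rate >= min_descent_dbar_per_s

def bin_average(pressure_dbar, variable, bin_width_dbar=1.0):
    # Average a variable into pressure bins of the given width.
    bins = np.floor(pressure_dbar / bin_width_dbar).astype(int)
    labels = np.unique(bins)
    means = np.array([variable[bins == b].mean() for b in labels])
    centers = (labels + 0.5) * bin_width_dbar
    return centers, means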
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
all_hemoglobin_data.csv: This dataset captures hemoglobin activity levels during bimanual coordination tasks.
All_Hand_Position_Data.csv: This dataset records kinematic positions of a rigid body representing the oscillating manipulandum's center for both hands.

Description of the data and file structure

all_hemoglobin_data.csv:
- ID: subject id
- Phase: Initial starting coordination phase of trial
- Noise: Metronome condition (inter-beat-intervals are structured to mimic different statistical noise types)
- Epoch: ~8 second epochs. 0 is the baseline, epochs 1-8 are when subjects are performing the coordination task (64 seconds), and epoch 9 is the rest period after movement
- Time (s): Time in seconds of trial
- Pre SMA (hbo): Oxygenated hemoglobin data in the anterior supplementary motor area (channels in ROI averaged together)
- Pre SMA (hbr): De-oxygenated hemoglobin data in the anterior supplementary motor area (channels in ROI averaged together)
- SMA Proper (hbo): Oxygenated hemoglobin data in the posterior supplementary motor area (channels in ROI averaged together)
- SMA Proper (hbr): De-oxygenated hemoglobin data in the posterior supplementary motor area (channels in ROI averaged together)

All_Hand_Position_Data.csv:
- id: subject id
- phase: Initial starting coordination phase of trial
- noise: Metronome condition (inter-beat-intervals of the metronome are structured to mimic different statistical noise types)
- epoch: ~8 second epochs. Epochs 1-8 are when subjects are performing the coordination task (64 seconds)
- time: Time in seconds of trial
- signal1: Rigid body roll values of the left hand (i.e., pitch, roll, yaw) during the bimanual coordination task. Data are filtered using a 4th order Butterworth filter with a 9.6 Hz cutoff frequency.
- signal2: Rigid body roll values of the right hand (i.e., pitch, roll, yaw) during the bimanual coordination task. Data are filtered using a 4th order Butterworth filter with a 9.6 Hz cutoff frequency.
- crp: Continuous relative phase of both signals

Code/Software
- hemoglobin_graphs_and_analyses.R: An R script designed to generate graphs and conduct analyses based on the hemoglobin values.
- kinematic_data_graphs_and_analyses.R: An R script designed to generate graphs and conduct analyses on continuous relative phase values derived from hand positions. It also generates the graph that contains both experimental and simulated data.
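The 4th order Butterworth low-pass filter mentioned for signal1 and signal2 can be reproduced approximately with the Python sketch below; the sampling rate is not stated in this description, so fs is a placeholder, and the zero-phase filtfilt application is an assumption:

import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_roll(signal, fs, cutoff_hz=9.6, order=4):
    # Normalized cutoff relative to the Nyquist frequency (fs / 2).
    b, a = butter(order, cutoff_hz / (fs / 2.0))
    return filtfilt(b, a, signal)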
Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Meta Kaggle Code is an extension to our popular Meta Kaggle dataset. This extension contains all the raw source code from hundreds of thousands of public, Apache 2.0 licensed Python and R notebook versions on Kaggle used to analyze Datasets, make submissions to Competitions, and more. This represents nearly a decade of data spanning a period of tremendous evolution in the ways ML work is done.
By collecting all of this code created by Kaggle’s community in one dataset, we hope to make it easier for the world to research and share insights about trends in our industry. With the growing significance of AI-assisted development, we expect this data can also be used to fine-tune models for ML-specific code generation tasks.
Meta Kaggle for Code is also a continuation of our commitment to open data and research. This new dataset is a companion to Meta Kaggle which we originally released in 2016. On top of Meta Kaggle, our community has shared nearly 1,000 public code examples. Research papers written using Meta Kaggle have examined how data scientists collaboratively solve problems, analyzed overfitting in machine learning competitions, compared discussions between Kaggle and Stack Overflow communities, and more.
The best part is Meta Kaggle enriches Meta Kaggle for Code. By joining the datasets together, you can easily understand which competitions code was run against, the progression tier of the code’s author, how many votes a notebook had, what kinds of comments it received, and much, much more. We hope the new potential for uncovering deep insights into how ML code is written feels just as limitless to you as it does to us!
While we have made an attempt to filter out notebooks containing potentially sensitive information published by Kaggle users, the dataset may still contain such information. Research, publications, applications, etc. relying on this data should only use or report on publicly available, non-sensitive information.
The files contained here are a subset of the KernelVersions in Meta Kaggle. The file names match the ids in the KernelVersions csv file. Whereas Meta Kaggle contains data for all interactive and commit sessions, Meta Kaggle Code contains only data for commit sessions.
The files are organized into a two-level directory structure. Each top level folder contains up to 1 million files, e.g. - folder 123 contains all versions from 123,000,000 to 123,999,999. Each sub folder contains up to 1 thousand files, e.g. - 123/456 contains all versions from 123,456,000 to 123,456,999. In practice, each folder will have many fewer than 1 thousand files due to private and interactive sessions.
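A small helper illustrating that layout (the exact padding of folder names and the file extension for a given notebook are not specified here, so both are left to the caller as assumptions):

def kernel_version_path(version_id: int, extension: str) -> str:
    top = version_id // 1_000_000        # e.g. 123 for ids 123,000,000 to 123,999,999
    sub = (version_id // 1_000) % 1_000  # e.g. 456 for ids 123,456,000 to 123,456,999
    return f"{top}/{sub}/{version_id}.{extension}"

# kernel_version_path(123456789, "ipynb") -> "123/456/123456789.ipynb"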
The ipynb files in this dataset hosted on Kaggle do not contain the output cells. If the outputs are required, the full set of ipynbs with the outputs embedded can be obtained from this public GCS bucket: kaggle-meta-kaggle-code-downloads. Note that this is a "requester pays" bucket. This means you will need a GCP account with billing enabled to download. Learn more here: https://cloud.google.com/storage/docs/requester-pays
We love feedback! Let us know in the Discussion tab.
Happy Kaggling!
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
"Supplementary data" includes all of the raw data utilized for the manuscript A burning issue: the effect of organic ultraviolet filter exposure on the behaviour and physiology of Daphnia magna. "A Burning Issue R Script" contains the R script used to perform all statistical analyses and generate the figures included within the manuscript.
Overview: ERA5-Land is a reanalysis dataset providing a consistent view of the evolution of land variables over several decades at an enhanced resolution compared to ERA5. ERA5-Land has been produced by replaying the land component of the ECMWF ERA5 climate reanalysis. Reanalysis combines model data with observations from across the world into a globally complete and consistent dataset using the laws of physics. Reanalysis produces data that goes several decades back in time, providing an accurate description of the climate of the past.

Surface temperature: Temperature of the surface of the Earth. The skin temperature is the theoretical temperature that is required to satisfy the surface energy balance. It represents the temperature of the uppermost surface layer, which has no heat capacity and so can respond instantaneously to changes in surface fluxes.

The original ERA5-Land dataset (period: 2000 - 2020) has been reprocessed to aggregate ERA5-Land hourly data to daily data (minimum, mean, maximum), while increasing the spatial resolution from the native ERA5-Land resolution of 0.1 degree (~ 9 km) to 30 arc-sec (~ 1 km) by image fusion with CHELSA data (V1.2) (https://chelsa-climate.org/). For each day we used the corresponding monthly long-term average of CHELSA. The aim was to use the fine spatial detail of CHELSA and at the same time preserve the general regional pattern and fine temporal detail of ERA5-Land.

The steps included aggregation and enhancement, specifically (a schematic sketch of these steps follows this description):
1. spatially aggregate CHELSA to the resolution of ERA5-Land
2. calculate difference of ERA5-Land - aggregated CHELSA
3. interpolate differences with a Gaussian filter to 30 arc seconds
4. add the interpolated differences to CHELSA

Data available is the daily average, minimum and maximum of surface temperature.

Software used: GDAL 3.2.2 and GRASS GIS 8.0.0 (r.resamp.stats -w; r.relief)

Original ERA5-Land dataset license: https://cds.climate.copernicus.eu/api/v2/terms/static/licence-to-use-copernicus-products.pdf

CHELSA climatologies (V1.2): Data used: Karger D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, H.P., Kessler, M. (2018): Data from: Climatologies at high resolution for the earth's land surface areas. Dryad digital repository. http://dx.doi.org/doi:10.5061/dryad.kd1d4

Original peer-reviewed publication: Karger, D.N., Conrad, O., Böhner, J., Kawohl, T., Kreft, H., Soria-Auza, R.W., Zimmermann, N.E., Linder, P., Kessler, M. (2017): Climatologies at high resolution for the Earth land surface areas. Scientific Data. 4 170122. https://doi.org/10.1038/sdata.2017.122
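A schematic Python sketch of the four enhancement steps listed above (the actual processing used GDAL and GRASS GIS; the aggregation factor, the Gaussian width, and the assumption that the two grids align exactly are placeholders for illustration only):

import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def delta_downscale(era5_land_daily, chelsa_monthly, factor=12, sigma=2.0):
    # 1. spatially aggregate CHELSA (30 arc-sec) to the ERA5-Land grid by block averaging
    ny, nx = chelsa_monthly.shape
    chelsa_coarse = chelsa_monthly.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))
    # 2. difference of ERA5-Land minus aggregated CHELSA on the coarse grid
    diff_coarse = era5_land_daily - chelsa_coarse
    # 3. interpolate the difference back to 30 arc-sec and smooth it with a Gaussian filter
    diff_fine = gaussian_filter(zoom(diff_coarse, factor, order=1), sigma)
    # 4. add the interpolated difference to CHELSA
    return chelsa_monthly + diff_fine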
https://cds.unistra.fr/aladin-org/licences_aladin.html
Hyper Suprime-Cam Legacy Archive (HSCLA) is a public archive of processed, science-ready data from HSC taken as part of PI-based programs. The 2016 release includes data taken through 2016 and the total data volume increased significantly. Now, the total area of 3400 square degrees is covered in at least one filter and as many as 770 million sources are detected and measured. The data processed here are from 800 hours of observations executed under good conditions. Also, the overall data quality is improved thanks to updates in the processing pipeline.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Australian Charities and Not-for-profits Commission (ACNC) is the national regulator of charities in Australia.

Since 3 December 2012, charities wanting to access Commonwealth charity tax concessions (and other benefits) need to register with the ACNC. Although many charities choose to register, registration with the ACNC is voluntary. Each financial year, registered charities are required to lodge an Annual Information Statement (AIS) with the ACNC. Generally, charities are required to submit their AIS within six months of the end of their reporting period – for example, by 31 December for a charity with a 30 June financial year end. This dataset provides a record of the 2015 AISs submitted by charities, that is, the statements submitted for a charity’s 2015 reporting year. For most charities that will be the financial year 1 July 2014 – 30 June 2015; for others it will be the 2015 calendar year. There are also a small number of charities that have alternative particular reporting periods.

Registered charities can apply to the ACNC to have some or all of the information they provide withheld from the ACNC Register. If a charity has applied to have their data withheld, the AIS data relating to that charity has been excluded from this dataset. There are only limited circumstances when the ACNC can agree to withhold information, including because the information:

• is commercially sensitive and it could cause harm
• is false, confusing or misleading
• is offensive
• could endanger public safety
• falls within the circumstances allowed by the regulations (such as if the information identifies a private donor in relation to a private ancillary fund).

The AIS information for individual charities can be viewed at www.acnc.gov.au/findacharity.

The 2015 AIS dataset on data.gov.au can be used to find the AIS information lodged by more than one charity, for example for research purposes. It can also be used to filter and sort by different variables across all the AIS information that the ACNC has received for the 2015 reporting period.

Some charities report to the ACNC as part of a reporting group. The 2015 AIS information for these charities is attached as a separate dataset on this page. Any analysis of AIS data should include those charities as well.

The 2015 AIS collects information about charity finances. Financial information provides a basis for understanding the charity and its activities in greater detail. However, it is easy to misunderstand a charity's financial position or performance by judging it solely on its financial information. When comparing financial information to other charities, it is important to consider each charity's unique situation. This is especially the case for small charities, which do not provide financial reports (reports that often contain more details about a charity's financial position and activities). For more information on interpreting financial information, please see the ACNC website.

The ACNC maintains a network of researchers who are interested in the sector, and holds regular teleconferences. If you are interested in becoming involved in the Research network, please email us. The ACNC also publishes other datasets on data.gov.au, as part of our commitment to open data and transparent regulation.

NOTE: It is possible that some information in this dataset might be subject to a future request by the charity to have their information withheld, but will still appear in the dataset until the next update. Please consider this risk when using this dataset.

Please use the explanatory notes attached to help with analysis of this dataset.
Gross Primary Production

Six depths were sampled per CTD station, ranging from near-surface to 125 m. Sample depths were based on downward fluorescence profiles, and two of the six samples always included both near-surface (approximately 5-10 m) and the depth of the chlorophyll maximum where applicable.

Photosynthetic rates were determined using radioactive NaH14CO3. Incubations were conducted according to the method of Westwood et al. (2011). Cells were incubated for 1 hour at 21 light intensities ranging from 0 to 1200 µmol m-2 s-1 (CT Blue filter centred on 435 nm). Carbon uptake rates were corrected for in situ chlorophyll a (chl a) concentrations (µg L-1) measured using high performance liquid chromatography (HPLC, Wright et al. 2010), and for total dissolved inorganic carbon availability, analysed according to Dickson et al. (2007). Photosynthesis-irradiance (P-I) relationships were then plotted in R and the equation of Platt et al. (1980) used to fit curves to data using robust least squares non-linear regression. Photosynthetic parameters determined included the light-saturated photosynthetic rate [Pmax, mg C (mg chl a)-1 h-1], the initial slope of the light-limited section of the P-I curve [α, mg C (mg chl a)-1 h-1 (µmol m-2 s-1)-1], the light intensity at which carbon uptake became maximal (calculated as Pmax/α = Ek, µmol m-2 s-1), the intercept of the P-I curve with the carbon uptake axis [c, mg C (mg chl a)-1 h-1], and the rate of photoinhibition where applicable [β, mg C (mg chl a)-1 h-1 (µmol m-2 s-1)-1].

Gross primary production rates were modelled using R. Depth interval profiles (1 m) of chl a from the surface to 200 m were constructed through the conversion of up-cast fluorometry data measured at each CTD station. For conversions, pooled fluorometry burst data from all sites and depths were linearly regressed against in situ chl a determined using HPLC. Gross daily depth-integrated water-column production was then calculated using chl a depth profiles, photosynthetic parameters (Pmax, α, β, see above), incoming climatological PAR, vertical light attenuation (Kd), and mixed layer depth. Climatological PAR was based on spatially averaged (49 pixels, approx. 2 degrees) 8 day composite Aqua MODIS data (level 3, 2004-2017) obtained for Julian day 34. Summed incoming light intensities throughout the day equated to mean total PAR provided by Aqua MODIS. Kd for each station was calculated through robust linear regression of natural logarithm-transformed PAR data with depth. In cases where CTD stations were conducted at night, Kd was calculated from a linear relationship established between pooled chlorophyll a concentrations and Kd's determined at CTD stations conducted during the day (Kd = -0.0421 × chl a - 0.0476). Mixed layer depths were calculated as the depth where density (sigma) changed by 0.05 from a 10 m reference point. Gross primary production was calculated at 0.1 h time steps throughout the day (10 points per hour) and summed.
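For readers wanting to reproduce the curve fitting, the Python sketch below fits a P-I curve of the Platt et al. (1980) form with an added intercept c, using SciPy's curve_fit as a stand-in for the robust non-linear least squares used in the study; the starting values are rough guesses and the function names are illustrative:

import numpy as np
from scipy.optimize import curve_fit

def platt_1980(I, Ps, alpha, beta, c):
    # Carbon uptake versus irradiance I, with the photoinhibition term exp(-beta * I / Ps).
    return Ps * (1.0 - np.exp(-alpha * I / Ps)) * np.exp(-beta * I / Ps) + c

def fit_pi_curve(irradiance, carbon_uptake):
    p0 = [max(carbon_uptake), 0.05, 0.001, 0.0]   # rough starting values
    params, _ = curve_fit(platt_1980, irradiance, carbon_uptake, p0=p0, maxfev=10000)
    return dict(zip(["Ps", "alpha", "beta", "c"], params))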
This dataset consists of sensor data collected aboard buoys in the Beaufort Sea. The sensor data are prefiltered, combined, adjusted for buoy drift, and screened with a Gaussian first difference filter. This dataset exists as 11 DOS-formatted zip files.