Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Major differences from v1, for level 2 catch:
- Catches and numbers raised to nominal are only raised to exactly matching strata or, where none exist, to strata corresponding to UNK/NEI or 99.9. (new feature in v4)
- When nominal strata lack specific dimensions (e.g., fishing_mode always UNK) but georeferenced strata include them, the nominal data are "upgraded" to match, preventing loss of detail. Currently this adjustment aligns nominal values to georeferenced totals; future versions may apply proportional scaling. This does not create a direct raising but rather allows more precise reallocation. (new feature in v4)
- IATTC purse seine catch-and-effort data are available in 3 separate files according to species group: tunas, billfishes, and sharks. This is because PS data are collected from 2 sources: observers and fishing vessel logbooks. Observer records are used when available, and logbooks are used for unobserved trips. Both sources collect tuna data, but only observers collect shark and billfish data. As an example, a stratum may have observer effort, whose sets are counted for tuna, shark, and billfish; the same stratum may also have logbook data for unobserved sets, which add to the tuna catch and number of sets. The total number of sets is therefore higher for tuna than for sharks or billfishes. Effort in the billfish and shark datasets may hence represent only a proportion of the total effort allocated in some strata, since it is the observed effort, i.e. effort for which there was an observer onboard; as a result, catch in the billfish and shark datasets may represent only a proportion of the total catch allocated in some strata. Shark and billfish catches were therefore raised to the fishing effort reported in the tuna dataset. (new feature in v4; previously done in FIRMS Level 0)
- Data with a resolution of 10°x10° are removed; disaggregating them is being considered for future versions.
- Catches in tons, raised to match nominal values, now consider the geographic area of the nominal data for improved accuracy. (as v3)
- Catches in "number of fish" are converted to weight based on nominal data. The conversion factors used in the previous version are no longer used, as they did not adequately represent the diversity of captures. (as v3)
- Numbers of fish without corresponding nominal data are no longer removed, as they were before, which creates a large difference for this measurement_unit between the two datasets. (as v3)
- Strata for which catches in tons are raised to match nominal data have had their numbers removed. (as v3)
- Raising only applies to complete years, to avoid overrepresenting specific months, particularly in the early years of georeferenced reporting. (as v3)
- Strata where georeferenced data exceed nominal data have not been adjusted downward, as it is unclear whether these discrepancies arise from missing nominal data or from different aggregation methods in the two datasets. (as v3)
- The data are not aggregated to 5-degree squares and thus remain spatially unharmonized. Aggregation can be performed using CWP codes for geographic identifiers; for example, an R function is available (a Python sketch of the same conversion follows below): source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/sardara_functions/transform_cwp_code_from_1deg_to_5deg.R") (as v3)
This results in a raising of the data compared to v3 for IOTC, ICCAT, IATTC and WCPFC. However, as the raising is more specific for CCSBT, it is 22% lower than in the previous version.
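For illustration, here is a minimal Python sketch of the 1-degree to 5-degree conversion mentioned above. It assumes the standard CWP grid code layout (one digit for square size, one for quadrant, two digits of latitude and three of longitude for the corner nearest 0°/0°, with size code 5 denoting 1° squares and 6 denoting 5° squares); the linked R function remains the reference implementation.

```python
def cwp_1deg_to_5deg(code: str) -> str:
    """Return the 5-degree CWP square containing a 1-degree CWP square.

    Assumes the standard CWP layout: size digit (5 = 1 deg, 6 = 5 deg),
    quadrant digit, 2-digit latitude, 3-digit longitude of the corner
    closest to the intersection of the equator and prime meridian.
    """
    code = str(code)
    size, quadrant = code[0], code[1]
    if size != "5":
        raise ValueError(f"expected a 1-degree code (size 5), got {code}")
    lat, lon = int(code[2:4]), int(code[4:7])
    # Floor the corner coordinates to the enclosing 5-degree square.
    lat5, lon5 = lat - lat % 5, lon - lon % 5
    return f"6{quadrant}{lat5:02d}{lon5:03d}"

# Example: cwp_1deg_to_5deg("5230012") -> "6230010"
```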
The Level 0 dataset has been modified, creating differences in this new version, notably:
- The species retained are different; only 32 major species are kept.
- Mappings have been somewhat modified, based on new standards implemented by FIRMS.
- New rules have been applied for overlapping areas.
- Data are only displayed in 1-degree and 5-degree square areas.
- The data are enriched with "Species group" and "Gear labels" using the fdiwg standards.
These main differences are recapped in Differences_v2018_v2024.zip.
Recommendations: To avoid converting data from numbers using nominal strata, we recommend the use of conversion factors, which could be provided by the tRFMOs. In some strata, nominal data appear higher than georeferenced data, as observed during level 2 processing. These discrepancies may result from errors or from differences in aggregation methods. Further analysis will examine these differences in detail to refine treatments accordingly. A summary of differences by tRFMO, based on the number of strata, is included in the appendix.
For level 0 effort: In some datasets, namely those from ICCAT and the purse seine (PS) data from WCPFC, the same effort data has been reported multiple times using different units; these records have been kept as is, since no official mapping allows conversion between the units. As a result, users should be reminded that some ICCAT and WCPFC effort data are deliberately duplicated: in the case of ICCAT data, lines with identical strata but different effort units are duplicates reporting the same fishing activity with different measurement units.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Major differences from previous work, for level 2 catch:
- Catches in tons, raised to match nominal values, now consider the geographic area of the nominal data for improved accuracy.
- Catches in "number of fish" are converted to weight based on nominal data. The conversion factors used in the previous version are no longer used, as they did not adequately represent the diversity of captures.
- Numbers of fish without corresponding nominal data are no longer removed, as they were before, which creates a large difference for this measurement_unit between the two datasets.
- Nominal data from WCPFC include fishing fleet information, and georeferenced data have been raised based on this instead of solely on the triplet year/gear/species, to avoid random reallocations.
- Strata for which catches in tons are raised to match nominal data have had their numbers removed.
- Raising only applies to complete years, to avoid overrepresenting specific months, particularly in the early years of georeferenced reporting.
- Strata where georeferenced data exceed nominal data have not been adjusted downward, as it is unclear whether these discrepancies arise from missing nominal data or from different aggregation methods in the two datasets.
- The data are not aggregated to 5-degree squares and thus remain spatially unharmonized. Aggregation can be performed using CWP codes for geographic identifiers; for example, an R function is available: source("https://raw.githubusercontent.com/firms-gta/geoflow-tunaatlas/master/sardara_functions/transform_cwp_code_from_1deg_to_5deg.R")
The Level 0 dataset has been modified, creating differences in this new version, notably:
- The species retained are different; only 32 major species are kept.
- Mappings have been somewhat modified, based on new standards implemented by FIRMS.
- New rules have been applied for overlapping areas.
- Data are only displayed in 1-degree and 5-degree square areas.
- The data are enriched with "Species group" and "Gear labels" using the fdiwg standards.
These main differences are recapped in Differences_v2018_v2024.zip.
Recommendations: To avoid converting data from numbers using nominal strata, we recommend the use of conversion factors, which could be provided by the tRFMOs. In some strata, nominal data appear higher than georeferenced data, as observed during level 2 processing. These discrepancies may result from errors or from differences in aggregation methods. Further analysis will examine these differences in detail to refine treatments accordingly. A summary of differences by tRFMO, based on the number of strata, is included in the appendix.
Some nominal data have no equivalent in the georeferenced data and therefore cannot be disaggregated. What could be done is to check, for each nominal datum without an equivalent, whether georeferenced data exist within different buffers, and to average the distribution of this footprint; the nominal data would then be disaggregated based on the georeferenced data. This would lead to the creation of data (approximately 3%) and would necessitate reducing or removing all georeferenced data without a nominal equivalent or with a lesser equivalent. Tests are currently being conducted with and without this step. It would help improve the footprint of captured biomass but could lead to unexpected discrepancies with current datasets.
For level 0 effort: In some datasets, namely those from ICCAT and the purse seine (PS) data from WCPFC, the same effort data has been reported multiple times using different units; these records have been kept as is, since no official mapping allows conversion between the units. As a result, users should be reminded that some ICCAT and WCPFC effort data are deliberately duplicated:
- In the case of ICCAT data, lines with identical strata but different effort units are duplicates reporting the same fishing activity with different measurement units. It is indeed not possible to infer strict equivalence between units, as some contain information about others (e.g., Hours.FAD and Hours.FSC may inform Hours.STD).
- In the case of WCPFC data, effort records were also kept in all originally reported units. Here, duplicates do not necessarily share the same "fishing_mode", as SETS for purse seiners are reported with an explicit association to fishing_mode, while DAYS are not. This distinction allows SETS records to be separated by fishing mode, whereas DAYS records remain aggregated.
Some limited harmonization, particularly between units such as NET-days and Nets, has not been implemented in the current version of the dataset, but may be considered in future releases if a consistent relationship can be established.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Key information about Philippines Nominal GDP
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
### On-/Off-Axis Data Release
#### (Version 1.0.1, dated 2024/08/12)
This tar archive contains the data release for ‘First measurement of muon neutrino charged-current interactions on hydrocarbon without pions in the final state using multiple detectors with correlated energy spectra at T2K’. It contains the cross-section data points and supporting information in ROOT and text format, which are detailed below:
+ `onoffaxis_xsec_data.root`
This ROOT file contains the extracted cross section and the nominal MC prediction as TH1D histograms for both the flattened 1D array of bins and in the angle binning for the analysis. The ROOT file also contains both the covariance and inverted covariance matrix for the result stored as TH2D histograms. The angle bin numbering and the corresponding bin edges are detailed at the end of the README.
+ `flux_analysis.root`
This ROOT file contains the nominal and post-fit flux histograms for ND280 and INGRID. Two different binnings are included: a fine binned histogram (220 bins) and a coarse binned histogram (20 bins). The coarse binned histogram corresponds to the flux parameters detailed in the paper (and bin edges listed in the appendix).
+ `xsec_data_mc.csv`
The extracted cross-section data points and the nominal MC prediction for each bin is stored as a comma-separated value (CSV) file with header row.
+ `cov_matrix.csv` and `inv_matrix.csv`
The covariance matrix and the inverted covariance matrix are both stored as CSV files with each row stored as a single line and columns separated by commas (there is no header row). Matrix element (0,0) corresponds to the first number in the file.
+ `nd280_analysis_binning.csv` and `ingrid_analysis_binning.csv`
The analysis bin edges are included as CSV files. The columns are labeled with a header row and denote the linear bin index and the lower and upper bin edge for the angle and momentum bins. The units are in cos(angle) for the angle bins and in MeV/c for the momentum bins.
+ `calc_chisq.cxx`
This is an example ROOT script to calculate the chi-square between the data and the nominal MC prediction using the ROOT file in the data release. To run, open ROOT and load the script (`.L calc_chisq.cxx`) and execute the function `calc_chisq("/path/to/file.root")`.
+ `calc_chisq.py`
This is an example Python script to calculate the chi-square between the data and the nominal MC prediction using the text/CSV files in the data release. The code requires NumPy as an external dependency, but otherwise uses built-in modules. To run, execute using a Python3 interpreter and give the file paths to the data/MC text file and the inverse covariance text file as the first and second arguments respectively -- e.g. `python3 calc_chisq.py /path/to/xsec_data_mc.csv /path/to/inv_matrix.csv`. A NumPy sketch of the core calculation appears after the bin listings below.
+ ND280 angle bin numbering
- 0: `-1.0 < cos(#theta) < 0.20`
- 1: `0.20 < cos(#theta) < 0.60`
- 2: `0.60 < cos(#theta) < 0.70`
- 3: `0.70 < cos(#theta) < 0.80`
- 4: `0.80 < cos(#theta) < 0.85`
- 5: `0.85 < cos(#theta) < 0.90`
- 6: `0.90 < cos(#theta) < 0.94`
- 7: `0.94 < cos(#theta) < 0.98`
- 8: `0.98 < cos(#theta) < 1.00`
+ INGRID angle bin numbering
- 0: `0.50 < cos(#theta) < 0.82`
- 1: `0.82 < cos(#theta) < 0.94`
- 2: `0.94 < cos(#theta) < 1.00`
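For reference, the core of the chi-square calculation is small. The following is a minimal NumPy sketch, not the released `calc_chisq.py`; the column names `data` and `mc` are assumptions, so check the header row of `xsec_data_mc.csv`:

```python
import sys
import numpy as np

# chi2 = (d - m)^T V^{-1} (d - m), with d/m from the data/MC CSV and
# V^{-1} from the inverse covariance CSV described above.
data_mc = np.genfromtxt(sys.argv[1], delimiter=",", names=True)
inv_cov = np.loadtxt(sys.argv[2], delimiter=",")

diff = data_mc["data"] - data_mc["mc"]  # column names are an assumption
chi2 = diff @ inv_cov @ diff
print(f"chi2 = {chi2:.3f}")
```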
### Changelog
#### v1.0.1
Fix transcription error in INGRID momentum binning. The lowest momentum bin edge is at 350 MeV/c, not 300 MeV/c.
MC simulation QCD jet nominal samples from the ATLAS experiment
Market basket analysis with Apriori algorithm
The retailer wants to target customers with suggestions for itemsets they are most likely to purchase. The given dataset contains a retailer's transaction data, covering all transactions that happened over a period of time. The retailer will use the results to grow its business: by suggesting itemsets to customers, it can increase customer engagement, improve the customer experience, and identify customer behavior. I will solve this problem with Association Rules, an unsupervised learning technique that checks for the dependency of one data item on another.
Association rule mining is most useful when you want to discover associations between different objects in a set, i.e., to find frequent patterns in a transaction database. It can tell you which items customers frequently buy together, allowing the retailer to identify relationships between items.
Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both. For the rule "bought computer mouse => bought mouse mat":
- support = P(mouse & mat) = 8/100 = 0.08
- confidence = support / P(mouse) = 0.08/0.10 = 0.80
- lift = confidence / P(mat) = 0.80/0.09 ≈ 8.9
This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
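The same arithmetic in a few lines of Python, for checking:

```python
n = 100
n_mouse, n_mat, n_both = 10, 9, 8      # customers buying each item / both

support = n_both / n                   # P(mouse & mat)      = 0.08
confidence = n_both / n_mouse          # support / P(mouse)  = 0.80
lift = confidence / (n_mat / n)        # confidence / P(mat) ~ 8.9

print(support, confidence, round(lift, 1))
```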
Number of Attributes: 7
First, we need to load the required libraries; each is described briefly below.
Next, we load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.
Next, we clean the data frame by removing missing values.
To apply association rule mining, we need to convert the data frame into transaction data, so that all items bought together in one invoice are in ...
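The walkthrough breaks off here; the original analysis continues in R with the arules package. As a rough Python equivalent of the same pipeline, here is a sketch using pandas and mlxtend; the package choice and the column names `BillNo` and `Itemname` are assumptions about the spreadsheet, not the author's code:

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# One row per line item; column names are assumed, not verified.
df = pd.read_excel("Assignment-1_Data.xlsx").dropna(subset=["BillNo", "Itemname"])

# Transaction form: one row per invoice, one boolean column per item.
baskets = df.groupby(["BillNo", "Itemname"]).size().unstack(fill_value=0) > 0

frequent = apriori(baskets, min_support=0.01, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.5)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]].head())
```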
The Fundamental Data Record (FDR) for Atmospheric Composition UVN Level 1b v.1.0 dataset is a cross-instrument Level-1 product [ATMOS_L1B] generated in 2023 and resulting from the ESA FDR4ATMOS project (https://atmos.eoc.dlr.de/FDR4ATMOS/). The FDR contains selected Earth Observation Level 1b parameters (irradiance/reflectance) from the nadir-looking measurements of the ERS-2 GOME and Envisat SCIAMACHY missions for the period ranging from 1995 to 2012. The data record offers harmonised cross-calibrated spectra, essential for subsequent trace gas retrieval. The focus lies on spectral windows in the Ultraviolet-Visible-Near-Infrared regions for the retrieval of critical atmospheric constituents, such as ozone (O3), sulphur dioxide (SO2) and nitrogen dioxide (NO2) column densities, alongside cloud parameters in the NIR spectrum. For many aspects, the FDR product has improved compared to the existing individual mission datasets:
• GOME solar irradiances are harmonised using a validated SCIAMACHY solar reference spectrum, solving the problem of the fast-changing etalon present in the original GOME Level 1b data;
• Reflectances for both GOME and SCIAMACHY are provided in the FDR product. GOME reflectances are harmonised to degradation-corrected SCIAMACHY values, using collocated data from the CEOS PIC sites;
• SCIAMACHY data are scaled to the lowest integration time within the spectral band using high-frequency PMD measurements from the same wavelength range. This simplifies the use of the SCIAMACHY spectra, which were split into a complex cluster structure (each cluster with its own integration time) in the original Level 1b data;
• The harmonisation process applied mitigates the viewing angle dependency observed in the UV spectral region for GOME data;
• Uncertainties are provided.
Each FDR product covers three FDRs (irradiance/reflectance for UV-VIS-NIR) for a single day within the same product including information from the individual ERS-2 GOME and Envisat SCIAMACHY orbits therein.
FDR has been generated in two formats: Level 1A and Level 1B, targeting expert users and nominal applications respectively. The Level 1A [ATMOS_L1A] data include additional parameters such as harmonisation factors, PMD, and polarisation data extracted from the original mission Level 1 products. The ATMOS_L1A dataset is not part of the nominal dissemination to users. In case of specific requirements, please contact EOHelp (http://esatellus.service-now.com/csp?id=esa_simple_request&sys_id=f27b38f9dbdffe40e3cedb11ce961958).
The FDR4ATMOS products should be regarded as experimental due to the innovative approach and the current use of a limited-sized test dataset to investigate the impact of harmonization on the Level 2 target species, specifically SO2, O3 and NO2. Presently, this analysis is being carried out within follow-on activities.
One of the main aspects of the project was the characterization of Level 1 uncertainties for both instruments, based on metrological best practices. The following documents are provided:
The FDR V1 is currently being extended to include the MetOp GOME-2 series.
All the new products are conveniently formatted in NetCDF. Free standard tools, such as Panoply (https://www.giss.nasa.gov/tools/panoply/), can be used to read NetCDF data.
Panoply is sourced and updated by external entities. For further details, please consult our Terms and Conditions page (https://earth.esa.int/eogateway/terms-and-conditions).
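Besides Panoply, the products can be inspected programmatically. A minimal Python sketch with xarray follows; the file and variable names are illustrative, not taken from the product specification:

```python
import xarray as xr

# Any FDR4ATMOS NetCDF product should open the same way.
ds = xr.open_dataset("FDR4ATMOS_L1B_example.nc")  # illustrative file name
print(ds)                      # lists dimensions, coordinates and variables
spectrum = ds["irradiance"]    # variable name is an assumption
```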
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
General information: This dataset is meant to serve as a benchmark problem for fault detection and isolation in dynamic systems. It contains preprocessed sensor data from the adaptive high-rise demonstrator building D1244, built in the scope of the CRC1244. Parts of the measurements have been artificially corrupted and labeled accordingly. Please note that although the measurements are stored in Matlab's .mat format (version 7.0), they can easily be processed using free software such as the SciPy library in Python.
Structure of the dataset:
- train contains training data (only nominal)
- validation contains validation data (nominal and faulty). Faulty samples were obtained by manipulating a single signal in a random nominal sample from the validation data.
- test contains test data (nominal and faulty). Faulty samples were obtained by manipulating a single signal in a random nominal sample from the test data.
- meta contains textual labels for all signals as well as additional information on the considered fault classes
File contents: each file contains the following data from 1200 timesteps (60 seconds sampled at 20 Hz):
- t: time in seconds
- u: actuator forces (obtained from pressure measurements) in newtons
- y: relative elongations and bending curvatures of structural elements obtained from strain gauge measurements, and actuator displacements measured by position encoders
- label: categorical label of the present fault class, where 0 denotes the nominal class and faults in the different signals are encoded according to their index in the list of fault types in meta/labels.mat
Faulty samples additionally include the corresponding nominal values for reference:
- u_true: actuator forces without faults
- y_true: measured outputs without faults
Textual labels for all in- and output signals as well as all faults are given in the struct labels. Each sample's textual fault label is additionally contained in its filename (between the first and second underscore).
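As the description notes, the .mat files can be read directly with SciPy. A minimal sketch follows; the file path is illustrative, while the field names are those listed above:

```python
from scipy.io import loadmat

# Path and file name are illustrative; the fault label sits between
# the first and second underscore of each file name.
sample = loadmat("validation/sample_nominal_0001.mat")

t = sample["t"].squeeze()               # time in seconds (1200 steps at 20 Hz)
u = sample["u"]                         # actuator forces in newtons
y = sample["y"]                         # strains, curvatures, displacements
label = int(sample["label"].squeeze())  # 0 = nominal, >0 = fault class index
```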
https://fred.stlouisfed.org/legal/#copyright-public-domain
View economic output, reported as the nominal value of all new goods and services produced by labor and property located in the U.S.
Full title: Using Decision Trees to Detect and Isolate Simulated Leaks in the J-2X Rocket Engine
Authors: Mark Schwabacher, NASA Ames Research Center; Robert Aguilar, Pratt & Whitney Rocketdyne; Fernando Figueroa, NASA Stennis Space Center
Abstract: The goal of this work was to use data-driven methods to automatically detect and isolate faults in the J-2X rocket engine. It was decided to use decision trees, since they tend to be easier to interpret than other data-driven methods. The decision tree algorithm automatically "learns" a decision tree by performing a search through the space of possible decision trees to find one that fits the training data. The particular decision tree algorithm used is known as C4.5. Simulated J-2X data from a high-fidelity simulator developed at Pratt & Whitney Rocketdyne and known as the Detailed Real-Time Model (DRTM) was used to "train" and test the decision tree. Fifty-six DRTM simulations were performed for this purpose, with different leak sizes, different leak locations, and different times of leak onset. To make the simulations as realistic as possible, they included simulated sensor noise and a gradual degradation in both fuel and oxidizer turbine efficiency. A decision tree was trained using 11 of these simulations and tested using the remaining 45 simulations. In the training phase, the C4.5 algorithm was provided with labeled examples of data from nominal operation and data including leaks in each leak location. From the data, it "learned" a decision tree that can classify unseen data as having no leak or having a leak in one of the five leak locations. In the test phase, the decision tree produced very low false alarm rates and low missed detection rates on the unseen data. It had very good fault isolation rates for three of the five simulated leak locations, but it tended to confuse the remaining two locations, perhaps because a large leak at one of these two locations can look very similar to a small leak at the other location.
Introduction: The J-2X rocket engine will be tested on Test Stand A-1 at NASA Stennis Space Center (SSC) in Mississippi. A team including people from SSC, NASA Ames Research Center (ARC), and Pratt & Whitney Rocketdyne (PWR) is developing a prototype end-to-end integrated systems health management (ISHM) system that will be used to monitor the test stand and the engine while the engine is on the test stand [1]. The prototype will use several different methods for detecting and diagnosing faults in the test stand and the engine, including rule-based, model-based, and data-driven approaches. SSC is currently using the G2 tool (http://www.gensym.com) to develop rule-based and model-based fault detection and diagnosis capabilities for the A-1 test stand. This paper describes preliminary results in applying the data-driven approach to detecting and diagnosing faults in the J-2X engine. The conventional approach to detecting and diagnosing faults in complex engineered systems such as rocket engines and test stands is to use large numbers of human experts. Test controllers watch the data in near-real time during each engine test. Engineers study the data after each test. These experts are aided by limit checks that signal when a particular variable goes outside of a predetermined range. The conventional approach is very labor intensive. Also, humans may not be able to recognize faults that involve the relationships among large numbers of variables.
Further, some potential faults could happen too quickly for humans to detect them and react before they become catastrophic. Automated fault detection and diagnosis is therefore needed. One approach to automation is to encode human knowledge into rules or models. Another approach is to use data-driven methods to automatically learn models from historical data or simulated data. Our prototype will combine the data-driven approach with the model-based and rule-based approaches.
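To make the data-driven approach concrete, here is a toy sketch of training and inspecting such a classifier. It uses scikit-learn's CART implementation as a stand-in for C4.5, and synthetic features in place of the DRTM simulation data:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in data: 4 sensor-like features, binary leak label.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4))          # e.g. pressures, temperatures
y_train = (X_train[:, 0] > 1.0).astype(int)  # 0 = nominal, 1 = leak

# Fit a shallow tree so the learned rules stay human-readable.
clf = DecisionTreeClassifier(max_depth=3).fit(X_train, y_train)
print(export_text(clf, feature_names=["p1", "p2", "t1", "t2"]))
```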
A set of MATLAB functions (HSI_PSFS, SC_RS_Analysis_NAD.m, SC_RS_Analysis_sim.m) were developed to assess the spatial coverage of pushbroom hyperspectral imaging (HSI) data. HSI_PSFs derives the net point spread function of HSI data based on nominal data acquisition and sensor parameters (sensor speed, sensor heading, sensor altitude, number of cross track pixels, sensor field of view, integration time, frame time and pixel summing level). SC_RS_Analysis_sim calculates a theoretical spatial coverage map for HSI data based on nominal data acquisition and sensor parameters. The spatial coverage map is the sum of the point spread functions of all the pixels collected within an HSI dataset. Practically, the spatial coverage map quantifies how HSI data spatially samples spectral information across an imaged scene. A secondary theoretical spatial coverage map is also calculated for spatially resampled (nearest neighbour approach) HSI data. The function also calculates theoretical resampling errors such as pixel duplication (%), pixel loss (%) and pixel shifting (m). SC_RS_Analysis_NAD calculates an empirical spatial coverage map for collected HSI data (before and after spatial resampling) based on its nominal data acquisition and sensor parameters. The function also calculates empirical resampling errors. The current implementation of SC_RS_Analysis_NAD only works for ITRES (Calgary, Alberta, Canada) data products as it uses auxiliary information generated during the ITRES data processing workflow. This auxiliary information includes a ground look-up table that specifies the location (easting and northing) of each pixel of the HSI data in its raw sensor geometry. This auxiliary information also includes the pixel-to-pixel mapping between the HSI data in its raw sensor geometry and the spatially resampled HSI data. SC_RS_Analysis_NAD can readily be modified to work with HSI data collected by sensors from other manufacturers so long as the required auxiliary information can be extracted during data processing.
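As a rough illustration of the spatial coverage idea (not a port of the MATLAB functions): the coverage map is the sum of every pixel's PSF over a ground grid. A toy NumPy version with an isotropic Gaussian PSF follows; the real net PSF is anisotropic and derived from the acquisition and sensor parameters listed above:

```python
import numpy as np

def coverage_map(centers, sigma, extent, cell=0.1):
    """Sum an isotropic Gaussian PSF for every pixel centre onto a grid.

    Toy stand-in for SC_RS_Analysis_sim: every pixel gets the same
    circular Gaussian of width `sigma` (ground units).
    """
    xmin, xmax, ymin, ymax = extent
    gx, gy = np.meshgrid(np.arange(xmin, xmax, cell),
                         np.arange(ymin, ymax, cell))
    cov = np.zeros_like(gx)
    for cx, cy in centers:
        cov += np.exp(-((gx - cx) ** 2 + (gy - cy) ** 2) / (2 * sigma ** 2))
    return cov

# Example: two overlapping pixel PSFs on a small ground patch.
cov = coverage_map([(1.0, 1.0), (1.5, 1.0)], sigma=0.5, extent=(0, 3, 0, 2))
```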
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains summary statistics for eQTL (Expression Quantitative Trait Loci) analyses of 120 human fetal brains from the second trimester of gestation (12 to 19 post-conception weeks). Expression matrices, covariates, and summary statistics are provided for all tested eQTLs and for the top eQTL for each gene. The data are contained within a single .zip archive file. Individual data files are in openly accessible .txt text file format, containing p- or q-values by SNP, and .bed Browser Extensible Data format files, containing annotation track data such as chromosomal coordinates. Data files of multiple GB in size are stored in individual .gz gzip-compressed files. The related study investigates genetic influences on gene expression in the human fetal brain and their relationship with a variety of postnatal brain-related traits, including susceptibility to neuropsychiatric disorders. This dataset represents the first eQTL dataset derived exclusively from the human fetal brain, and is based on initial deep RNA sequencing and genotyping. The detailed breakdown of the files in this dataset is provided below and in README.md.
Gene Level Analyses:
- expression_gene.bed.gz
· normalised, variance-stabilising transformed count data (29,875 genes)
· columns: chr, gene_start, gene_end, gene_id, samples...
- all_eqtls_gene.txt.gz
· nominal p-values for all SNPs within 1 MB of each gene
· columns: gene_id, variant_id, tss_distance, ma_samples, ma_count, maf, pval_nominal, slope, slope_se
- top_eqtls_gene.txt.gz
· q-values for the most significant eQTL for each gene (includes nominal p-value thresholds that can be used to filter significant SNPs)
· columns: chr, snp_start, snp_end, gene_id, num_var, beta_shape1, beta_shape2, true_df, pval_true_df, variant_id, tss_distance, minor_allele_samples, minor_allele_count, maf, ref_factor, pval_nominal, slope, slope_se, pval_perm, pval_beta, qval, pval_nominal_threshold
Transcript Level Analyses:
- expression_transcript.bed.gz
· normalised, variance-stabilising transformed count data (144,448 transcripts)
· columns: chr, transcript_start, transcript_end, transcript_id, samples...
- all_eqtls_transcript.txt.gz
· nominal p-values for all SNPs within 1 MB of each transcript
· columns: transcript_id, variant_id, tss_distance, ma_samples, ma_count, maf, pval_nominal, slope, slope_se
- top_eqtls_transcript.txt.gz
· q-values for the most significant eQTL for each transcript (includes nominal p-value thresholds that can be used to filter significant SNPs)
· columns: chr, snp_start, snp_end, transcript_id, num_var, beta_shape1, beta_shape2, true_df, pval_true_df, variant_id, tss_distance, minor_allele_samples, minor_allele_count, maf, ref_factor, pval_nominal, slope, slope_se, pval_perm, pval_beta, qval, pval_nominal_threshold
Covariates (used for both gene-level and transcript-level analyses):
- covariates.txt
· columns: Sample, Sex, PCW, RIN, ReadLength, PC1, PC2, PC3, PEER1, PEER2, PEER3, PEER4, PEER5, PEER6, PEER7, PEER8, PEER9, PEER10
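As a usage sketch, the per-gene thresholds in top_eqtls_gene.txt.gz can be combined with the full results to extract significant SNPs. This assumes tab-separated files with header rows matching the column lists above:

```python
import pandas as pd

# pandas reads .gz files transparently; separator is an assumption.
top = pd.read_csv("top_eqtls_gene.txt.gz", sep="\t")
all_eqtls = pd.read_csv("all_eqtls_gene.txt.gz", sep="\t")

# Keep, for each gene, the SNPs whose nominal p-value passes that
# gene's permutation-derived threshold.
merged = all_eqtls.merge(top[["gene_id", "pval_nominal_threshold"]], on="gene_id")
significant = merged[merged["pval_nominal"] < merged["pval_nominal_threshold"]]
print(len(significant))
```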
General information: This dataset is meant to serve as a benchmark problem for fault detection and isolation in dynamical systems. It contains pre-processed sensor data from the adaptive high-rise demonstrator building D1244, built in the scope of the CRC1244. Parts of the measurements have been artificially corrupted and labeled accordingly. Please note that although the measurements are stored in Matlab's .mat format (version 7.0), they can easily be processed using free software such as the SciPy library in Python.
Structure of the dataset:
- train contains the training data (only nominal)
- test_easy contains test data (nominal and faulty with high fault amplitude). Faulty samples were obtained by manipulating a single signal in a random nominal sample from the test data.
- test_hard contains test data (nominal and faulty with low fault amplitude)
- meta contains textual labels for all signals and fault types
File contents: each file contains the following data from 16384 timesteps:
- t: time in seconds
- u: demanded actuator forces in newtons
- y: measured outputs (relative elongations measured by strain gauges, and actuator displacements in meters measured by position encoders)
- label: categorical label of the present fault class, where 0 denotes the nominal class and faults in the different signals are encoded according to their index in the list of fault types in meta/labels.txt
Faulty samples additionally include the corresponding nominal values for reference:
- u_true: delivered actuator forces
- y_true: measured outputs without faults
A sample's textual fault label is also contained in its filename (between the first and second underscore).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and probability for an incomplete 2×2 table.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: SurveyUSA weights are based on data from Simons & Chabris (2011), re-normed to 2010 Census data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Extreme events are defined as events that largely deviate from the nominal state of the system as observed in a time series. Due to the rarity and uncertainty of their occurrence, predicting extreme events has been challenging. In real life, some variables (passive variables) often encode significant information about the occurrence of extreme events manifested in another variable (active variable). For example, observables such as temperature, pressure, etc., act as passive variables in case of extreme precipitation events. These passive variables do not show any large excursion from the nominal condition yet carry the fingerprint of the extreme events. In this study, we propose a reservoir computation-based framework that can predict the preceding structure or pattern in the time evolution of the active variable that leads to an extreme event using information from the passive variable. An appropriate threshold height of events is a prerequisite for detecting extreme events and improving the skill of their prediction. We demonstrate that the magnitude of extreme events and the appearance of a coherent pattern before the arrival of the extreme event in a time series affect the prediction skill. Quantitatively, we confirm this using a metric describing the mean phase difference between the input time signals, which decreases when the magnitude of the extreme event is relatively higher, thereby increasing the predictability skill.
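For orientation only, here is a generic echo state network sketch in NumPy, the usual starting point for reservoir computing; it is not the authors' framework, and the passive/active series below are synthetic stand-ins:

```python
import numpy as np

# Generic echo state network (ESN): drive a fixed random reservoir with a
# "passive" input series u and train a ridge readout to predict the
# "active" target series y.
rng = np.random.default_rng(1)
n_res, T = 300, 2000

u = rng.normal(size=T)   # passive variable (input), synthetic
y = np.roll(u, -5)       # active variable (toy target: time-shifted input)

W_in = rng.uniform(-0.5, 0.5, n_res)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1

X = np.zeros((T, n_res))
x = np.zeros(n_res)
for t in range(T):
    x = np.tanh(W_in * u[t] + W @ x)  # reservoir state update
    X[t] = x

ridge = 1e-6
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ y)
y_pred = X @ W_out  # readout prediction of the active variable
```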
The dataset contains both the robot's high-level tool center position (TCP) health data and controller-level component information (i.e., joint positions, velocities, currents, and temperatures). The datasets can be used by users (e.g., software developers, data scientists) who work on robot health management (including accuracy) but have limited or no access to robots that can capture real data. The datasets can support:
- the development of robot health monitoring algorithms and tools
- research on technologies and tools to support robot monitoring, diagnostics, prognostics, and health management (collectively called PHM)
- validation and verification of industrial PHM implementations, for example, verifying a robot's TCP accuracy after the work cell has been reconfigured, or whenever a manufacturer wants to determine if the robot arm has experienced a degradation.
For data collection, a trajectory is programmed for the Universal Robot (UR5), approaching and stopping at randomly selected locations in its workspace. The robot moves along this preprogrammed trajectory under different conditions of temperature, payload, and speed. The TCP positions (x, y, z) of the robot are measured by a 7-D measurement system developed at NIST. Differences are calculated between the positions measured by the 7-D measurement system and the nominal positions computed from the nominal robot kinematic parameters, and the results are recorded within the dataset. Controller-level sensing data are also collected from each joint (direct output from the controller of the UR5) to understand the influence of temperature, payload, and speed on position degradation. Controller-level data can be used for root cause analysis of robot performance degradation, by providing joint positions, velocities, currents, accelerations, torques, and temperatures. For example, the cold-start temperatures of the six joints were approximately 25 degrees Celsius; after two hours of operation, the joint temperatures increased to approximately 35 degrees Celsius. Control variables are listed in the header file in the dataset (UR5TestResult_header.xlsx). If you'd like to comment on this data and/or offer recommendations on future datasets, please email guixiu.qiao@nist.gov.
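The reported difference calculation reduces to a per-pose position error; a tiny NumPy sketch with made-up numbers:

```python
import numpy as np

# Hypothetical rows of measured vs nominal TCP positions (x, y, z),
# illustrating the 7-D measurement comparison described above.
measured = np.array([[400.12, -210.05, 350.30],
                     [400.08, -210.00, 350.27]])
nominal = np.array([[400.00, -210.00, 350.25],
                    [400.00, -210.00, 350.25]])

error_vec = measured - nominal                  # per-axis deviation
error_norm = np.linalg.norm(error_vec, axis=1)  # Euclidean TCP error per pose
print(error_norm)
```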
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This deposit contains various datasets describing tuna fisheries activities (currently catches and efforts) at different levels of processing, on 1° or 5° spatial grids with a monthly temporal resolution. Lower levels of processing have been officially endorsed by FIRMS and are also published on Zenodo: see the FIRMS Global Tuna Atlas datasets. Currently, FIRMS datasets only deal with catches and Level 0 data (a global dataset that remains as close as possible to the datasets published on the tuna RFMOs' websites), including a lower spatio-temporal resolution dataset which gives the best estimates of total catches (nominal catches, per year and per ocean).
Data structure
All Global Tuna Atlas datasets comply with a common data format in line with the CWP Reference Harmonization standard (https://www.fao.org/3/cc6734en/cc6734en.pdf), which is described in a json file (https://github.com/fdiwg/fdi-formats/blob/main/cwp_rh_generic_gta_taskI.json).
Global Catch dataset (IRD level 2)
IRD Level 2 denotes the series of processing steps applied by the French National Research Institute for Sustainable Development (IRD) to generate this dataset from the primary RFMO catch-and-effort data. Although some steps mirror those used in the FIRMS Level 0 product (DOI: https://doi.org/10.5281/zenodo.5745958), the entire workflow was rerun to integrate early adjustments to IATTC shark and billfish data prior to final aggregation.
This dataset compiles monthly global catch data for tuna, tuna-like species and sharks from 1950 through 2023. Catches are stratified according to the latest CWP standards update:
- month
- species
- gear_type (reporting fishing_gear)
- fishing_fleet (reporting country)
- fishing_mode (type of school used)
- geographic_identifier (1° or 5° grid cell)
- measurement_unit i.e. unit of catch (weight or number)
- measurement (catch)
- measurement_type (landings or retained catches)
- measurement_processing_level (original samples or processed data)
- a `label` column has been added for each field (e.g. `fishing_mode`, `species`, `gear_type`, etc.) to provide clear descriptive metadata
Warning: This dataset is designed to enhance the understanding of fish counts at level 0, and the amount of georeferenced data. It is not suitable for accurately georeferencing data by country or fishing fleet and should not be used for studies on fishing zone legality or quota management. While it offers a georeferenced footprint of captures to reflect reported biomass more closely, significant uncertainty remains regarding the precise locations of the catches.
Global level 2 processing includes the conversion and raising of georeferenced catch data to match nominal dataset values.
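As a schematic illustration of such a raising step (not the production workflow), the sketch below computes per-stratum scale factors from nominal totals and applies them to the georeferenced rows. The `measurement` column follows the data structure above, while the grouping keys, a derived `year` column, and the omission of the UNK/NEI fallback and complete-year rules are simplifying assumptions:

```python
import pandas as pd

def raise_to_nominal(geo: pd.DataFrame, nominal: pd.DataFrame,
                     keys=("year", "fishing_fleet", "gear_type", "species")):
    """Scale georeferenced catches so stratum totals match nominal totals.

    Simplified sketch: assumes both frames carry a `measurement` column
    and that `year` has been derived from the monthly georeferenced data.
    """
    keys = list(keys)
    geo_tot = geo.groupby(keys, as_index=False)["measurement"].sum()
    merged = geo_tot.merge(nominal, on=keys, suffixes=("_geo", "_nom"))
    # Strata where georeferenced totals already exceed nominal are left
    # untouched (no downward adjustment), as described in the notes above.
    merged["factor"] = (merged["measurement_nom"]
                        / merged["measurement_geo"]).clip(lower=1.0)
    out = geo.merge(merged[keys + ["factor"]], on=keys, how="left")
    out["measurement"] = out["measurement"] * out["factor"].fillna(1.0)
    return out.drop(columns="factor")
```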
To reproduce the data and the workflow, we provide a .zip with all the initial data used, as well as the labeling and the mapping to nominal geometries (see all_rawdata.zip).
Global Effort dataset (IRD Level 0)
We compiled a comprehensive dataset of geo-referenced fishing effort observations from global tuna fisheries, covering the period from 1950 to 2023. These data are collected from the public-domain datasets released by the five tuna Regional Fisheries Management Organizations (t-RFMOs): CCSBT, IATTC, ICCAT, IOTC, and WCPFC. As with the catch dataset, the effort data were processed using the same data generation workflow as FIRMS-GTA, with a different parametrization, complying with the standardized data structure promoted by the Coordinating Working Party (CWP) standards for (tuna) fisheries statistics.
Unlike catches, effort values are reported using a significant number of measurement units (23). Only a few mappings between similar tRFMO units have been managed, based on fdiwg codelists (see the GitHub repository: https://github.com/fdiwg/fdi-mappings). Each remaining unit reflects different operational aspects depending on the fishing gear, fleet behavior, and the reporting RFMO. The Level 0 global dataset includes all reported units without conversion or aggregation, to preserve the original semantic richness and reflect the heterogeneity in reporting practices.
This IRD Level 0 global effort dataset thus preserves all original effort records from the t-RFMOs and complies with a unified data structure while maintaining the granularity and diversity of reporting. It is not a standardized or simplified effort dataset, and no higher level of processing is currently made available by IRD. Any further aggregation or transformation of effort data should be conducted by the end user, based on specific scientific goals and with careful consideration of the semantics behind each unit.
Both datasets are enriched with "gear_type_label" and "fishing_fleet_label"; following the FDIWG standards, the catch dataset additionally includes "species_group" and the effort dataset includes "measurement_unit_label".
Appendix work:
If you are interested in creating a customized version of this Global Tuna Atlas with specific filters or adjustments based on particular issues, please feel free to reach out to us.