30 datasets found
  1. FSDKaggle2019

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1more
    Updated Jan 20, 2020
    Cite
    Eduardo Fonseca; Manoj Plakal; Frederic Font; Daniel P. W. Ellis; Xavier Serra (2020). FSDKaggle2019 [Dataset]. http://doi.org/10.5281/zenodo.3612636
    Dataset updated
    Jan 20, 2020
    Authors
    Eduardo Fonseca; Manoj Plakal; Frederic Font; Daniel P. W. Ellis; Xavier Serra
    Description

    FSDKaggle2019 is an audio dataset containing 29,266 audio files annotated with 80 labels of the AudioSet Ontology. FSDKaggle2019 has been used for the DCASE Challenge 2019 Task 2, which was run as a Kaggle competition titled Freesound Audio Tagging 2019.

    Citation

    If you use the FSDKaggle2019 dataset or part of it, please cite our DCASE 2019 paper:

    Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra. "Audio tagging with noisy labels and minimal supervision". Proceedings of the DCASE 2019 Workshop, NYC, US (2019)

    You can also consider citing our ISMIR 2017 paper, which describes how we gathered the manual annotations included in FSDKaggle2019:

    Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andres Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra, "Freesound Datasets: A Platform for the Creation of Open Audio Datasets", In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017

    Data curators

    Eduardo Fonseca, Manoj Plakal, Xavier Favory, Jordi Pons

    Contact

    You are welcome to contact Eduardo Fonseca at eduardo.fonseca@upf.edu should you have any questions.

    ABOUT FSDKaggle2019

    Freesound Dataset Kaggle 2019 (or FSDKaggle2019 for short) is an audio dataset containing 29,266 audio files annotated with 80 labels of the AudioSet Ontology [1]. FSDKaggle2019 has been used for Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2019. Please visit the DCASE2019 Challenge Task 2 website for more information. This Task was hosted on the Kaggle platform as a competition titled Freesound Audio Tagging 2019. It was organized by researchers from the Music Technology Group (MTG) of Universitat Pompeu Fabra (UPF), and from the Sound Understanding team at Google AI Perception.

    The competition intended to provide insight towards the development of broadly-applicable sound event classifiers able to cope with label noise and minimal supervision conditions.

    FSDKaggle2019 employs audio clips from the following sources:

    • Freesound Dataset (FSD): a dataset being collected at the MTG-UPF based on Freesound content organized with the AudioSet Ontology
    • The soundtracks of a pool of Flickr videos taken from the Yahoo Flickr Creative Commons 100M dataset (YFCC)

    The audio data is labeled using a vocabulary of 80 labels from Google’s AudioSet Ontology [1], covering diverse topics: Guitar and other Musical Instruments, Percussion, Water, Digestive, Respiratory sounds, Human voice, Human locomotion, Hands, Human group actions, Insect, Domestic animals, Glass, Liquid, Motor vehicle (road), Mechanisms, Doors, and a variety of Domestic sounds. The full list of categories can be inspected in vocabulary.csv (see Files & Download below).

    The goal of the task was to build a multi-label audio tagging system that can predict appropriate label(s) for each audio clip in a test set.

    What follows is a summary of some of the most relevant characteristics of FSDKaggle2019. Nevertheless, it is highly recommended to read our DCASE 2019 paper for a more in-depth description of the dataset and how it was built.

    Ground Truth Labels

    The ground truth labels are provided at the clip level and express the presence of a sound category in the audio clip, so they can be considered weak labels or tags. Audio clips have variable lengths (roughly from 0.3 to 30s).

    The audio content from FSD has been manually labeled by humans following a data labeling process using the Freesound Annotator platform. Most labels have inter-annotator agreement but not all of them. More details about the data labeling process and the Freesound Annotator can be found in [2].

    The YFCC soundtracks were labeled using automated heuristics applied to the audio content and metadata of the original Flickr clips. Hence, a substantial amount of label noise can be expected. The label noise can vary widely in amount and type depending on the category, including in- and out-of-vocabulary noises. More information about some of the types of label noise that can be encountered is available in [3].

    Specifically, FSDKaggle2019 features three types of label quality, one for each set in the dataset:

    • curated train set: correct (but potentially incomplete) labels
    • noisy train set: noisy labels
    • test set: correct and complete labels

    Further details can be found below in the sections for each set.

    Format

    All audio clips are provided as uncompressed PCM 16 bit, 44.1 kHz, mono audio files.

    DATA SPLIT

    FSDKaggle2019 consists of two train sets and one test set. The idea is to limit the supervision provided for training (i.e., the manually-labeled, hence reliable, data), thus promoting approaches to deal with label noise.

    Curated train set

    The curated train set consists of manually-labeled data from FSD.

    • Number of clips/class: 75 except in a few cases (where there are fewer)
    • Total number of clips: 4970
    • Avg number of labels/clip: 1.2
    • Total duration: 10.5 hours

    The duratio...
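
Because the task is multi-label tagging over a fixed vocabulary, clip-level tags are commonly turned into multi-hot target vectors before training. The following is a minimal sketch of that encoding, not the organizers' code: the three-label vocabulary is an illustrative stand-in for the 80 entries of vocabulary.csv, and the comma-separated label string is an assumed input format.

```python
from typing import Dict, List


def build_index(vocabulary: List[str]) -> Dict[str, int]:
    """Map each label name to a fixed position in the target vector."""
    return {label: position for position, label in enumerate(vocabulary)}


def multi_hot(label_string: str, index: Dict[str, int]) -> List[int]:
    """Encode a comma-separated list of clip-level tags as a 0/1 vector."""
    vector = [0] * len(index)
    for label in label_string.split(","):
        label = label.strip()
        if label:  # skip empty fragments such as a trailing comma
            vector[index[label]] = 1
    return vector


# Hypothetical three-label vocabulary; the real one has 80 entries.
vocabulary = ["Bark", "Meow", "Purr"]
index = build_index(vocabulary)
target = multi_hot("Bark,Purr", index)
print(target)
```

A clip with several tags simply sets several positions to 1, which is why the average of 1.2 labels per clip in the curated set yields mostly, but not only, one-hot rows.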

  2. School Nutrition Programs - Contact Information and Site-Level Program...

    • catalog.data.gov
    • data.texas.gov
    Updated Jun 25, 2025
    Cite
    data.austintexas.gov (2025). School Nutrition Programs - Contact Information and Site-Level Program Participation - Program Year 2019-2020 [Dataset]. https://catalog.data.gov/dataset/school-nutrition-programs-contact-information-and-site-level-program-participation-pr-2019
    Dataset updated
    Jun 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    Help us provide the most useful data by completing our ODP User Feedback Survey for School Nutrition Data

    About the Dataset

    This dataset contains contact information and program participation information for all School Nutrition Program (SNP) meal sites approved by TDA to operate during the 2019-2020 program year. The school nutrition program year begins July 1 and ends June 30.

    In March 2020, USDA began allowing flexibility in nutrition assistance program policies in order to support continued meal access should the coronavirus pandemic (COVID-19) impact meal service operation. Sites participating in these flexibilities are indicated in the newly added COVID Meal Site column of this dataset. For more information on the waivers implemented for this purpose, please visit our website at SquareMeals.org.

    An overview of all SNP data available on the Texas Open Data Portal can be found at our TDA Data Overview - School Nutrition Programs page. An overview of all TDA Food and Nutrition data available on the Texas Open Data Portal can be found at our TDA Data Overview - Food and Nutrition Open Data page. More information about accessing and working with TDA data on the Texas Open Data Portal can be found on the SquareMeals.org website on the TDA Food and Nutrition Open Data page.

    About Dataset Updates

    TDA aims to post new program year data by September 1 of the active program year. Updates will occur quarterly and end 90 days after the close of the program year. Any data posted during the active program year is subject to change. After 90 days from the close of the program year, this dataset will remain published but will no longer be updated.

    About the Agency

    The Texas Department of Agriculture administers 12 U.S. Department of Agriculture nutrition programs in Texas including the National School Lunch and School Breakfast Programs, the Child and Adult Care Food Program (CACFP), and summer meal programs. TDA’s Food and Nutrition division provides technical assistance and training resources to partners operating the programs and oversees the USDA reimbursements they receive to cover part of the cost associated with serving food in their facilities. By working to ensure these partners serve nutritious meals and snacks, the division adheres to its mission: Feeding the Hungry and Promoting Healthy Lifestyles. For more information on these programs, please visit us at SquareMeals.org.
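
Since the program year runs July 1 through June 30 rather than following the calendar year, records often need to be assigned to a program-year label. The helper below is a small sketch of that arithmetic; the function name and the `start_month` parameter are hypothetical and not part of any TDA tooling.

```python
from datetime import date


def program_year(day: date, start_month: int = 7) -> str:
    """Return the program-year label (e.g. "2019-2020") that contains `day`.

    The default start month of 7 follows the July 1 to June 30 school
    nutrition program year described above; other programs may start in a
    different month, which is why the month is a parameter.
    """
    start_year = day.year if day.month >= start_month else day.year - 1
    return f"{start_year}-{start_year + 1}"


print(program_year(date(2020, 3, 15)))
```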

  3. NFL Play Statistics dataset (secondary)

    • kaggle.com
    Updated Apr 27, 2020
    Cite
    Todd Steussie (2020). NFL Play Statistics dataset (secondary) [Dataset]. https://www.kaggle.com/toddsteussie/nfl-play-statistics-secondary-datasets/code
    Dataset updated
    Apr 27, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Todd Steussie
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    NFL is one of the most popular sports in the world. Many of us are stat geeks who want to understand not just what happened but also who and why. This NFL dataset provides a comprehensive view of NFL games, statistics, participation, and much more. The dataset includes NFL play data from 2004 to the present.

    This NFL dataset provides play-by-play data from the 2004 to 2019 seasons. The dataset also includes play and participation information for players, coaches, and game officials. Additional data tables included in this file include NFL Draft from 1989 to present, NFL Combine 1999 to present, NFL rosters from 1998 to present, NFL schedules, stadium information and much more. The granularity of NFL statistics varies by NFL season. The current version of NFL statistics has been collected since 2012. All information sources used to create this dataset are from publicly accessible websites and the NFL GSIS dataset.

    All information sources used to create this dataset are from publicly accessible websites and NFL documentation. Although my current life is focused on data science, this project has a special place in my heart, since it links my previous profession in the NFL with my current passion for data analysis.

  4. Hampton Roads Tourism and Cultural Sites

    • hrgeo.org
    • data.virginia.gov
    • +1more
    Updated Aug 20, 2019
    Cite
    HRPDC & HRTPO (2019). Hampton Roads Tourism and Cultural Sites [Dataset]. https://www.hrgeo.org/datasets/hampton-roads-tourism-and-cultural-sites
    Dataset updated
    Aug 20, 2019
    Dataset authored and provided by
    HRPDC & HRTPO
    Description

    This layer contains major tourism and cultural sites in Hampton Roads, Virginia. Included are museums, gardens, parks/open spaces, historic sites, and entertainment/sports venues. It is not an exhaustive collection but primarily includes the most popular or well-known locations that attract visitors. Last update: August 2019.

  5. Child and Adult Care Food Programs (CACFP) – Centers – Site-Level Contact...

    • catalog.data.gov
    • data.texas.gov
    Updated May 25, 2025
    Cite
    data.austintexas.gov (2025). Child and Adult Care Food Programs (CACFP) – Centers – Site-Level Contact and Program Participation – Program Year 2019-2020 [Dataset]. https://catalog.data.gov/dataset/child-and-adult-care-food-programs-cacfp-centers-site-level-contact-and-program-parti-2019
    Dataset updated
    May 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    Help us provide the most useful data by completing our ODP User Feedback Survey for Child and Adult Care Food Program (CACFP) Data

    About the Dataset

    This dataset contains contact and program participation information for all Texas child and adult care centers approved by the Texas Department of Agriculture to operate as a meal site under the Child and Adult Care Food Program (CACFP) during the 2019-2020 program year. CACFP centers include Adult Day Care Centers (ADC), Child Care Centers (CCC), At-Risk Afterschool Centers (At-Risk), Head Start Centers, emergency shelters, and centers providing care for students outside school hours. Sites can participate in one or more CACFP sub-programs. The CACFP program year begins October 1 and ends September 30.

    In March 2020, USDA began allowing flexibility in nutrition assistance program policy to support continued meal access should the coronavirus pandemic (COVID-19) impact meal service operation. Sites participating in these flexibilities are indicated in the newly added COVID Meal Site column of this dataset. For more information on the waivers implemented for this purpose, please visit our website at SquareMeals.org.

    This dataset only includes information for CACFP centers. For data on Texas Day Care Homes (DCH) participating in CACFP, please refer to the “Child and Adult Care Food Programs (CACFP) – Day Care Homes – Contact and Program Participation” dataset, also on the State of Texas Open Data Portal.

    An overview of all CACFP data available on the Texas Open Data Portal can be found at our TDA Data Overview - Child and Adult Care Food Programs page. Dataset content and column order have been updated starting with program year 2018-2019 forward. Older program year datasets will retain original content and organization. More information about accessing and working with TDA data on the Texas Open Data Portal can be found on the SquareMeals.org website on the TDA Food and Nutrition Open Data page.

    About Dataset Updates

    TDA aims to post new program year data by December 15 of the active program year. Updates will occur quarterly and end 90 days after the close of the program year. Any data posted during the active update period is subject to change. After 90 days from the close of the program year, the dataset will remain published but will no longer be updated.

    About the Agency

    The Texas Department of Agriculture administers 12 U.S. Department of Agriculture nutrition programs in Texas including the National School Lunch and School Breakfast Programs, the Child and Adult Care Food Program (CACFP), and summer meal programs. TDA’s Food and Nutrition division provides technical assistance and training resources to partners operating the programs and oversees the USDA reimbursements they receive to cover part of the cost associated with serving food in their facilities. By working to ensure these partners serve nutritious meals and snacks, the division adheres to its mission: Feeding the Hungry and Promoting Healthy Lifestyles. For more information on these programs, please visit our website.

  6. Synthetic Data Generation of Health and Demographic Surveillance Systems...

    • icpsr.umich.edu
    Updated Oct 1, 2024
    Cite
    Waljee, Akbar K. (2024). Synthetic Data Generation of Health and Demographic Surveillance Systems Dataset, Kenya, 2019-2020 [Dataset]. http://doi.org/10.3886/ICPSR39209.v1
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Waljee, Akbar K.
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/39209/terms

    Time period covered
    2019 - 2020
    Area covered
    Kenya
    Description

    Surveillance data play a vital role in estimating the burden of diseases, pathogens, exposures, behaviors, and susceptibility in populations, providing insights that can inform the design of policies and targeted public health interventions. The use of the Health and Demographic Surveillance System (HDSS) in the Kilifi region of Kenya has led to the collection of massive amounts of data on the demographics and health events of different populations. This has necessitated the adoption of tools and techniques to enhance data analysis to derive insights that will improve the accuracy and efficiency of decision-making. Machine Learning (ML) and artificial intelligence (AI) based techniques are promising for extracting insights from HDSS data, given their ability to capture complex relationships and interactions in data. However, broad utilization of HDSS datasets using AI/ML is currently challenging as most of these datasets are not AI-ready due to factors that include, but are not limited to, regulatory concerns around privacy and confidentiality, heterogeneity in data laws across countries limiting the accessibility of data, and a lack of sufficient datasets for training AI/ML models. Synthetic data generation offers a potential strategy to enhance accessibility of datasets by creating synthetic datasets that uphold privacy and confidentiality, are suitable for training AI/ML models, and can also augment existing AI datasets used to train the AI/ML models. These synthetic datasets, generated from two rounds of separate data collection periods, represent a version of the real data while retaining the relationships inherent in the data. For more information please visit The Aga Khan University Website.

  7. Virginia Springs/Groundwater Layers - 2023

    • data.virginia.gov
    • opendata.winchesterva.gov
    • +4more
    Updated Oct 23, 2024
    Cite
    Virginia Department of Environmental Quality (2024). Virginia Springs/Groundwater Layers - 2023 [Dataset]. https://data.virginia.gov/dataset/virginia-springs-groundwater-layers-2023
    Available download formats: html, arcgis geoservices rest api
    Dataset updated
    Oct 23, 2024
    Dataset authored and provided by
    Virginia Department of Environmental Quality (https://deq.virginia.gov/)
    Area covered
    Hot Springs
    Description
    The VDEQ Spring SITES database contains data describing the geographic locations and site attributes of natural springs throughout the commonwealth. This data coverage continues to evolve and contains only spring locations known to exist with a reasonable degree of certainty on the date of publication. The dataset does not replace site specific inventorying or receptor surveys but can be used as a starting point. VDEQ's initial geospatial dataset of approximately 325 springs was formed in 2008 by digitizing historical spring information sheets created by State Water Control Board geologists in the 1970s through early 1990s. Additional data has been consolidated from the EPA STORET database, the U.S. Geological Survey's Ground Water Site Inventory (GWSI) and Geographic Names Inventory System (GNIS), the Virginia Department of Health SDWIS database, the Virginia DEQ Virginia Water Use Data Set (VWUDS), the Commonwealth of Virginia Division of Water Resources and Power Bulletin No. 1: "Springs of Virginia" by Collins et al., 1930 as well as several VDWR&P Surface Water Supply bulletins from the 1940's - 1950's. A 1992 Virginia Department of Game and Inland Fisheries / Virginia Tech sponsored study by Helfrich et al. titled "Evaluation of the Natural Springs of Virginia: Fisheries Management Implications", a 2004 Rockbridge County groundwater resources report written by Frits van der Leeden, and several smaller datasets from consultants and citizens were evaluated and added to the database when confidence in locational accuracy was high or could be verified with aerial or LIDAR imagery. 
Significant contributions have been made throughout the years by VDEQ Groundwater Characterization staff site visits as well as other geologists working in the region including: Matt Heller at Virginia Division of Geology and Mineral Resources (VDMME), Wil Orndorff at the Virginia Department of Conservation and Recreation Karst Program (VDCR), and David Nelms and Dan Doctor of the U.S. Geological Survey (USGS). Substantial effort has been made to improve locational accuracy and remove duplication present between data sources. Hundreds of spring locations that were originally obtained using topographic maps or unknown methods were updated to sub-meter locational accuracy using post-processed differential GPS (PPGPS) and through the use of several generations of aerial imagery (2002-2017) obtained from Virginia's Geographic Information Network (VGIN) and 1-meter LIDAR, where available. Scores of new spring locations were also obtained by systematic quadrangle-by-quadrangle analysis in areas of the Shenandoah Valley where 1-meter LIDAR datasets were obtained from the U.S. Geological Survey. Future improvements to the dataset will result when statewide 1-meter LIDAR datasets become available and through continued field work by DEQ staff and other contributors working in the region. Please do not hesitate to contact the author to correct mistakes or to contribute to the database.

    The VDEQ Spring FIELD MEASUREMENTS database contains data describing field derived physio-chemical properties of spring discharges measured throughout the Commonwealth of Virginia. Field visits compiled in this dataset were performed from 1928 to 2019 by geologists with the State Water Control Board, the Virginia Division of Water and Power, the Virginia Department of Environmental Quality, and the U.S. Geological Survey with contributions from other sources as noted. Values of -9999 indicate that measurements were not performed for the referenced parameter. Please do not hesitate to contact the author to add data to the database or correct errors.
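
Because -9999 stands for "not measured" rather than a real reading, it should be masked before any statistics are computed. The snippet below is a minimal sketch of that step; the field names in the sample row are hypothetical and do not come from the database schema.

```python
from typing import Dict, Optional

NO_DATA = -9999  # sentinel described above: measurement not performed


def clean_measurement(value: float) -> Optional[float]:
    """Return None for the no-data sentinel, otherwise the value unchanged."""
    return None if value == NO_DATA else value


# Hypothetical field-visit row; the column names are illustrative only.
row: Dict[str, float] = {"temperature_c": 12.4, "ph": -9999}
cleaned = {name: clean_measurement(value) for name, value in row.items()}
print(cleaned)
```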


    The VDEQ_Spring_WQ database is a geodatabase containing groundwater sample information collected from springs throughout Virginia. Sample-specific information includes: location and site information, measured field parameters, and lab verified quantifications of major ionic concentrations, trace element concentrations, nutrient concentrations, and radiological data. The VDEQ_Spring_WQ database is a subset of the VDEQ GWCHEM database which is a flat-file geodatabase containing groundwater sample information from groundwater wells and springs throughout Virginia. Sample information has been correlated via DEQ Well # and projected using coordinates in VDEQ_Spring_SITES database. The GWCHEM database is comprised of historic groundwater sample data originally archived in the United States Geological Survey (USGS) National Water Information System (NWIS) and the Environmental Protection Agency (EPA) Storage and Retrieval (STORET) data warehouse. Archived STORET data originated as groundwater sample data collected and uploaded by Virginia State Water Control Board Personnel. While groundwater sample data in the STORET data warehouse are static, new groundwater sample data are periodically uploaded to NWIS and spring laboratory WQ data reflect NWIS downloaded on 9/30/2019. Recent groundwater sample data collected by Virginia Department of Environmental Quality (DEQ) personnel as part of the Ambient Groundwater Sampling Program are entered into the database as lab results are made available by the Division of Consolidated Laboratory Services (DCLS).

    When possible, charge balances were calculated for samples with reported values for major ions including (at a minimum) calcium, magnesium, potassium, sodium, bicarbonate, chloride, and sulfate. Reported values for Nitrate as N, carbonate, and fluoride were included in the charge balance calculation when available. Field determined values for bicarbonate and carbonate were used in the charge balance calculation when available. For much of the legacy DEQ groundwater sample data, bicarbonate values were derived from lab reported values of alkalinity (as mg/CaCO3) under the assumption that there was no contribution by carbonate to the reported alkalinity value. Charge balance values are reported in the "Charge Balance" column of the GWCHEM geodatabase. The closer the charge balance value is to unity (1), the lower the assumed charge balance error.

    In order to preserve the numerical capabilities of the database, non-numeric lab qualifiers were given the following numeric identifiers:

    - (minus sign) = less than the concentration specified to the right of the sign
    -11110 = estimated
    -22220 = presence verified but not quantified
    -33330 = radchem non-detect, below sslc
    -4440 = analyzed for but not detected
    -55550 = greater than the concentration to the right of the zero
    -66660 = sample held beyond normal holding time
    -77770 = quality control failure. Data not valid.
    -88880 = sample held beyond normal holding time. Sample analyzed for but not detected. Value stored is limit of detection for process in use.
    -11120 = Value reported is less than the criteria of detection.
    -9999 = no data (parameter not quantified)
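
The charge balance described above can be illustrated with a short calculation. This is only a sketch under stated assumptions: the source does not give the exact formula, so the ratio of summed cation to summed anion milliequivalents is assumed here because the text says values near unity indicate low error. The equivalent weights are standard rounded chemistry values, only the minimum set of major ions is included, and the sample concentrations are made up for illustration.

```python
import math
from typing import Dict

# Equivalent weights in mg per milliequivalent (molar mass / ionic charge),
# rounded standard values; not taken from the GWCHEM documentation.
CATION_EQ_WEIGHT = {"calcium": 20.039, "magnesium": 12.1525,
                    "sodium": 22.990, "potassium": 39.098}
ANION_EQ_WEIGHT = {"bicarbonate": 61.017, "chloride": 35.453, "sulfate": 48.03}
CACO3_EQ_WEIGHT = 50.04  # mg CaCO3 per milliequivalent of alkalinity


def bicarbonate_from_alkalinity(alkalinity_mg_caco3: float) -> float:
    """Convert alkalinity (mg/L as CaCO3) to bicarbonate (mg/L), assuming
    carbonate contributes nothing to the reported alkalinity."""
    return alkalinity_mg_caco3 * ANION_EQ_WEIGHT["bicarbonate"] / CACO3_EQ_WEIGHT


def total_meq(sample: Dict[str, float], weights: Dict[str, float]) -> float:
    """Sum concentrations (mg/L) converted to milliequivalents per litre."""
    return sum(sample[ion] / weight for ion, weight in weights.items())


def charge_balance_ratio(sample: Dict[str, float]) -> float:
    """Assumed definition: cation meq/L divided by anion meq/L."""
    return total_meq(sample, CATION_EQ_WEIGHT) / total_meq(sample, ANION_EQ_WEIGHT)


# Made-up sample in mg/L, chosen so cations and anions both total 5 meq/L.
sample = {"calcium": 40.078, "magnesium": 12.1525, "sodium": 22.990,
          "potassium": 39.098, "bicarbonate": 183.051,
          "chloride": 35.453, "sulfate": 48.03}
ratio = charge_balance_ratio(sample)
print(round(ratio, 3))
```

Sentinel codes such as -9999 would need to be masked before a calculation like this, since a single coded value would otherwise dominate either sum.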

    A more in-depth description and hydrogeologic analysis of the database can be found here
    An in-depth data fact sheet can be found here

  8. Data from: SOTorrent Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 11, 2021
    Cite
    Baltes, Sebastian (2021). SOTorrent Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1135262
    Dataset updated
    Jan 11, 2021
    Dataset authored and provided by
    Baltes, Sebastian
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of code snippets and free-form text on a wide variety of topics. Like other software artifacts, questions and answers on SO evolve over time, for example when bugs in code snippets are fixed, code is updated to work with a more recent library version, or text surrounding a code snippet is edited for clarity. To be able to analyze how content on SO evolves, we built SOTorrent, an open dataset based on the official SO data dump. SOTorrent provides access to the version history of SO content at the level of whole posts and individual text or code blocks. It connects SO posts to other platforms by aggregating URLs from text blocks and comments, and by collecting references from GitHub files to SO posts. Our vision is that researchers will use SOTorrent to investigate and understand the evolution of SO posts and their relation to other platforms such as GitHub.

    If you use this dataset in your work, please cite our MSR 2018 paper (BibTex) or our MSR 2019 mining challenge proposal.

  9. Google Landmarks Dataset v2

    • github.com
    • paperswithcode.com
    • +1more
    Updated Sep 27, 2019
    Cite
    Google (2019). Google Landmarks Dataset v2 [Dataset]. https://github.com/cvdfoundation/google-landmark
    Dataset updated
    Sep 27, 2019
    Dataset provided by
    Google (http://google.com/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the second version of the Google Landmarks dataset (GLDv2), which contains images annotated with labels representing human-made and natural landmarks. The dataset can be used for landmark recognition and retrieval experiments. This version of the dataset contains approximately 5 million images, split into 3 sets of images: train, index and test. The dataset was presented in our CVPR'20 paper. In this repository, we present download links for all dataset files and relevant code for metric computation. This dataset was associated with two Kaggle challenges, on landmark recognition and landmark retrieval. Results were discussed as part of a CVPR'19 workshop. In this repository, we also provide scores for the top 10 teams in the challenges, based on the latest ground-truth version. Please visit the challenge and workshop webpages for more details on the data, tasks and technical solutions from top teams.

  10. NT DCIS - Remote Sites with Mobile Coverage (Point) 2019 - Dataset - AURIN

    • data.aurin.org.au
    Updated Mar 6, 2025
    Cite
    (2025). NT DCIS - Remote Sites with Mobile Coverage (Point) 2019 - Dataset - AURIN [Dataset]. https://data.aurin.org.au/dataset/nt-govt-dcis-nt-cis-remote-sites-w-mobile-coverage-2019-na
    Dataset updated
    Mar 6, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset provides the GPS Co-ordinates of small cell mobile phone services in remote Northern Territory locations. For more information please visit the Northern Territory Government Open Data Portal.

  11. Vatican Data, Year of Statistical Data

    • hub.arcgis.com
    • catholic-geo-hub-cgisc.hub.arcgis.com
    Updated Oct 22, 2019
    Cite
    burhansm2 (2019). Vatican Data, Year of Statistical Data [Dataset]. https://hub.arcgis.com/maps/36fcd8c2e2b04b48bcbc19602dcda867
    Dataset updated
    Oct 22, 2019
    Dataset authored and provided by
    burhansm2
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    Vatican Data Series {title at top of page}

    Data Developers: Burhans, Molly A., Cheney, David M., Emege, Thomas, Gerlt, R. “Vatican Data Series {title at top of page}”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Catholic Hierarchy, Environmental Systems Research Institute, Inc., 2019.

    Web map developer: Molly Burhans, October 2019
    Web app developer: Molly Burhans, October 2019

    GoodLands’ polygon data layers, version 2.0 for global ecclesiastical boundaries of the Roman Catholic Church:

    Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, due to this being the first developed dataset of global ecclesiastical boundaries curated from many sources it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross checked. These sources can be viewed here: https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0

    To learn more or contact us please visit: https://good-lands.org/

    The Catholic Leadership global maps information is derived from the Annuario Pontificio, which is curated and published by the Vatican Statistics Office annually, and digitized by David Cheney at Catholic-Hierarchy.org; updates are supplemented with diocesan and news announcements. GoodLands maps this into global ecclesiastical boundaries.

    Admin 3 Ecclesiastical Territories:

    Burhans, Molly A., Cheney, David M., Gerlt, R. “Admin 3 Ecclesiastical Territories For Web”. Scale not given. Version 1.2. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019.

    Derived from:

    Global Diocesan Boundaries:

    Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T. Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.

    Using: ArcGIS. 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.

    Boundary Provenance

    Statistics and Leadership Data

    Cheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.

    Catholic Hierarchy

    Annuario Pontificio per l’Anno .. Città del Vaticano: Tipografia Poliglotta Vaticana, Multiple Years.

    The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican.

    Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.

  12. HUN SW footprint shapefiles v01

    • gimi9.com
    • cloud.csiss.gmu.edu
    • +3 more
    Cite
    HUN SW footprint shapefiles v01 [Dataset]. https://gimi9.com/dataset/au_2a9520c8-1569-4e0e-8bd8-26e2c7b9e9e0/
    Explore at:
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    This is a source dataset created by the Bioregional Assessment Programme without the use of source data. This dataset contains all of the surface water footprint polygons that were created from mining reports and used in the surface water modelling. There is also a document with the source references for all of the footprints included in the dataset.

    Dataset History

    Environmental impact statements and similar documents were downloaded from the New South Wales Department of Planning and Environment Major Projects website, and from mining companies' websites. To obtain mine footprints for surface water modelling, the mining reports were searched for past and future projected mine layouts and surface water contributing areas. Each figure was digitised and georeferenced using one of four methods:

    1. The preferred method was to use maps or plans with coordinates already on them.
    2. If there were no coordinates, then three point locations were matched with points on Google Earth and the latitude and longitude from Google Earth were used to georeference the image.
    3. If there were not three clearly identifiable point locations in the image, then supplementary points were found by matching contour information to the Shuttle Radar Topography Mission Smoothed Digital Elevation Model (SRTM DEM-S) grid (Dataset GUID: 12e0731d-96dd-49cc-aa21-ebfd65a3f67a).
    4. Other site-specific approaches:
       a. Mangoola Coal Mine did not have adequate georeferencing points in the Year 10 and Final Landform images, so these images were georeferenced to the matching project boundary in the other Mangoola Coal Mine images.
       b. The West Wallsend Colliery existing pit top surface facilities image, containing a satellite photo background, was georeferenced using Google Earth. The West Wallsend Colliery pit top facility outline was used to georeference the water management system image, as they both contained the same outline.

    These areas were exported as polygon files (*.poly) using Geosoft Oasis Montaj software. A list of the documents used to create these polygon files is also included in the dataset.

    Dataset Citation

    Bioregional Assessment Programme (2016) HUN SW footprint shapefiles v01. Bioregional Assessment Source Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/2a9520c8-1569-4e0e-8bd8-26e2c7b9e9e0.
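The three-point georeferencing step described in the dataset history can be illustrated with a small sketch. The source does not say which transform was fitted; three non-collinear control points are exactly enough to determine an affine transform, so that is assumed here. The function names and all coordinates below are hypothetical and are not taken from the dataset.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]


def _det3(m: List[List[float]]) -> float:
    """Determinant of a 3x3 matrix."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def _solve3(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a 3x3 linear system with Cramer's rule."""
    d = _det3(a)
    if abs(d) < 1e-12:
        raise ValueError("control points are collinear; pick three that form a triangle")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in a]
        for r in range(3):
            replaced[r][col] = b[r]
        solution.append(_det3(replaced) / d)
    return solution


def fit_affine(pixels: List[Point], lonlat: List[Point]) -> Callable[[float, float], Point]:
    """Return a function mapping (col, row) pixel coordinates to (lon, lat)."""
    a = [[px, py, 1.0] for px, py in pixels]
    lon_coef = _solve3(a, [p[0] for p in lonlat])
    lat_coef = _solve3(a, [p[1] for p in lonlat])

    def to_lonlat(px: float, py: float) -> Point:
        lon = lon_coef[0] * px + lon_coef[1] * py + lon_coef[2]
        lat = lat_coef[0] * px + lat_coef[1] * py + lat_coef[2]
        return lon, lat

    return to_lonlat


# Hypothetical control points: image pixels matched to made-up lon/lat values.
pixels = [(0.0, 0.0), (1000.0, 0.0), (0.0, 800.0)]
lonlat = [(150.0, -32.0), (150.5, -32.0), (150.0, -32.4)]
to_lonlat = fit_affine(pixels, lonlat)
centre = to_lonlat(500.0, 400.0)
```

With more than three control points a least-squares fit would normally be preferred, since it averages out digitising error; the exact three-point solve above has no redundancy to detect a misplaced point.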

  13. Amendment to Master Plan 2019 Monument Site layer

    • data.gov.sg
    Updated Jun 6, 2024
    Cite
    Urban Redevelopment Authority (2024). Amendment to Master Plan 2019 Monument Site layer [Dataset]. https://data.gov.sg/datasets/d_2a9bde71d16bf9e98530a86130bab404/view
    Explore at:
    Dataset updated
    Jun 6, 2024
    Dataset authored and provided by
    Urban Redevelopment Authority http://ura.gov.sg/
    License

    https://data.gov.sg/open-data-licence

    Description

    Dataset from Urban Redevelopment Authority. For more information, visit https://data.gov.sg/datasets/d_2a9bde71d16bf9e98530a86130bab404/view

  14. Child and Adult Care Food Programs (CACFP) – Day Care Homes – Site-Level...

    • catalog.data.gov
    • data.texas.gov
    Updated Apr 25, 2025
    Cite
    data.austintexas.gov (2025). Child and Adult Care Food Programs (CACFP) – Day Care Homes – Site-Level Contact and Program Participation – Program Year 2019-2020 [Dataset]. https://catalog.data.gov/dataset/child-and-adult-care-food-programs-cacfp-day-care-homes-site-level-contact-and-progra-2019
    Explore at:
    Dataset updated
    Apr 25, 2025
    Dataset provided by
    data.austintexas.gov
    Description

    Help us provide the most useful data by completing our ODP User Feedback Survey for Child and Adult Care Food Program (CACFP) Data.

    About the Dataset

    This dataset contains contact and program participation information for Day Care Home (DCH) providers approved to operate under the Child and Adult Care Food Program (CACFP) during program year 2019-2020. Contracting Entity (CE) sponsors can participate in more than one CACFP program. The CACFP program year begins October 1 and ends September 30.

    In March 2020, USDA began allowing flexibility in nutrition assistance program policy to support continued meal access should the coronavirus pandemic (COVID-19) impact meal service operation. Sites participating in these flexibilities are indicated in the newly added COVID Meal Site column of this dataset. For more information on the waivers implemented for this purpose, please visit our website at SquareMeals.org.

    This dataset only includes information for Texas Day Care Home providers participating in CACFP. For data on CEs and sites participating as Adult Day Care Centers (ADC), Child Care Centers (CCC), At-Risk Afterschool Centers (At-Risk), Head Start Centers, emergency shelters, and centers providing care for students outside school hours, please refer to the “Child and Adult Care Food Programs (CACFP) – Centers – Contact and Program Participation” dataset, also on the State of Texas Open Data Portal.

    Dataset content and column order have been updated starting with program year 2018-2019 forward. Older program year datasets will retain original content and organization.

    An overview of all CACFP data available on the Texas Open Data Portal can be found at our TDA Data Overview - Child and Adult Care Food Programs page. An overview of all TDA Food and Nutrition data available on the Texas Open Data Portal can be found at our TDA Data Overview - Food and Nutrition Open Data page. More information about accessing and working with TDA data on the Texas Open Data Portal can be found on the SquareMeals.org website on the TDA Food and Nutrition Open Data page.

    About Dataset Updates

    TDA aims to post new program year data by December 15 of the active program year. Updates will occur quarterly and end 90 days after the close of the program year. Any data posted during the active update period is subject to change. After 90 days from the close of the program year, the dataset will remain published but will no longer be updated.

    About the Agency

    The Texas Department of Agriculture administers 12 U.S. Department of Agriculture nutrition programs in Texas, including the National School Lunch and School Breakfast Programs, the Child and Adult Care Food Programs (CACFP), and summer meal programs. TDA’s Food and Nutrition division provides technical assistance and training resources to partners operating the programs and oversees the USDA reimbursements they receive to cover part of the cost associated with serving food in their facilities. By working to ensure these partners serve nutritious meals and snacks, the division adheres to its mission: Feeding the Hungry and Promoting Healthy Lifestyles. For more information on these programs, please visit our SquareMeals.org website.

  15. Weekly United States COVID-19 Cases and Deaths by State - ARCHIVED

    • data.cdc.gov
    • data.virginia.gov
    • +1 more
    application/rdfxml +5
    Updated Jun 1, 2023
    Cite
    CDC COVID-19 Response (2023). Weekly United States COVID-19 Cases and Deaths by State - ARCHIVED [Dataset]. https://data.cdc.gov/Case-Surveillance/Weekly-United-States-COVID-19-Cases-and-Deaths-by-/pwn4-m3yp
    Explore at:
    Available download formats: csv, application/rdfxml, xml, tsv, json, application/rssxml
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Centers for Disease Control and Prevention http://www.cdc.gov/
    Authors
    CDC COVID-19 Response
    License

    https://www.usa.gov/government-works

    Area covered
    United States
    Description

    Reporting of new Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. This dataset will receive a final update on June 1, 2023, to reconcile historical data through May 10, 2023, and will remain publicly available.

    Aggregate Data Collection Process

    Since the start of the COVID-19 pandemic, data have been gathered through a robust process with the following steps:

    • A CDC data team reviews and validates the information obtained from jurisdictions’ state and local websites via an overnight data review process.
    • If more than one official county data source exists, CDC uses a comprehensive data selection process comparing each official county data source, and takes the highest case and death counts respectively, unless otherwise specified by the state.
    • CDC compiles these data and posts the finalized information on COVID Data Tracker.
    • County level data is aggregated to obtain state and territory specific totals.
    This process is collaborative, with CDC and jurisdictions working together to ensure the accuracy of COVID-19 case and death numbers. County counts provide the most up-to-date numbers on cases and deaths by report date. CDC may retrospectively update counts to correct data quality issues.
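The selection and aggregation steps above can be sketched roughly as follows. This is only an illustration of "take the highest case and death counts per county, then sum counties into state totals"; it is not CDC's actual pipeline, it ignores the "unless otherwise specified by the state" exception, and the record layout, state codes, and numbers are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (state, county, source, cumulative_cases, cumulative_deaths)
reports = [
    ("XX", "Alpha", "state_site", 120, 3),
    ("XX", "Alpha", "county_site", 125, 2),
    ("XX", "Beta", "state_site", 40, 1),
    ("YY", "Gamma", "county_site", 10, 0),
]


def aggregate_to_states(rows):
    # Step 1: per county, keep the highest case count and the highest death
    # count independently ("respectively") across all official sources.
    county_max = {}
    for state, county, _source, cases, deaths in rows:
        best_cases, best_deaths = county_max.get((state, county), (0, 0))
        county_max[(state, county)] = (max(best_cases, cases), max(best_deaths, deaths))

    # Step 2: sum the county-level values into state totals.
    totals = defaultdict(lambda: [0, 0])
    for (state, _county), (cases, deaths) in county_max.items():
        totals[state][0] += cases
        totals[state][1] += deaths
    return {state: tuple(values) for state, values in totals.items()}


state_totals = aggregate_to_states(reports)
```

Note that in this sketch the case maximum and the death maximum for a county may come from different sources, which is one reading of "respectively" in the description.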

    Methodology Changes

    Several differences exist between the current, weekly-updated dataset and the archived version:

    • Source: The current Weekly-Updated Version is based on county-level aggregate count data, while the Archived Version is based on State-level aggregate count data.
    • Confirmed/Probable Cases/Death breakdown:  While the probable cases and deaths are included in the total case and total death counts in both versions (if applicable), they were reported separately from the confirmed cases and deaths by jurisdiction in the Archived Version.  In the current Weekly-Updated Version, the counts by jurisdiction are not reported by confirmed or probable status (See Confirmed and Probable Counts section for more detail).
    • Time Series Frequency: The current Weekly-Updated Version contains weekly time series data (i.e., one record per week per jurisdiction), while the Archived Version contains daily time series data (i.e., one record per day per jurisdiction).
    • Update Frequency: The current Weekly-Updated Version is updated weekly, while the Archived Version was updated twice daily up to October 20, 2022.
    Important note: The counts reflected during a given time period in this dataset may not match the counts reflected for the same time period in the archived dataset noted above. Discrepancies may exist due to differences between county and state COVID-19 case surveillance and reconciliation efforts.

    Confirmed and Probable Counts

    In this dataset, counts by jurisdiction are not displayed by confirmed or probable status. Instead, confirmed and probable cases and deaths are included in the Total Cases and Total Deaths columns, when available. Not all jurisdictions report probable cases and deaths to CDC.* Confirmed and probable case definition criteria are described here:

    Council of State and Territorial Epidemiologists (ymaws.com).

    Deaths

    CDC reports death data in other sections of the website: CDC COVID Data Tracker: Home, CDC COVID Data Tracker: Cases, Deaths, and Testing, and NCHS Provisional Death Counts. Information presented on the COVID Data Tracker pages is based on the same source (total case counts) as the present dataset; however, NCHS Death Counts are based on death certificates that use information reported by physicians, medical examiners, or coroners in the cause-of-death section of each certificate. Data from each of these pages are considered provisional (not complete and pending verification) and are therefore subject to change. Counts from previous weeks are continually revised as more records are received and processed.

    Number of Jurisdictions Reporting

    There are currently 60 public health jurisdictions reporting cases of COVID-19. This includes the 50 states, the District of Columbia, New York City, the U.S. territories of American Samoa, Guam, the Commonwealth of the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands, as well as three independent countries in compacts of free association with the United States: the Federated States of Micronesia, the Republic of the Marshall Islands, and the Republic of Palau. New York State’s reported case and death counts do not include New York City’s counts, as they separately report nationally notifiable conditions to CDC.

    CDC COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths, available by state and by county. These and other data on COVID-19 are available from multiple public locations, such as:

    https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html

    https://www.cdc.gov/covid-data-tracker/index.html

    https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html

    https://www.cdc.gov/coronavirus/2019-ncov/php/open-america/surveillance-data-analytics.html

    Additional COVID-19 public use datasets, including line-level (patient-level) data, are available at: https://data.cdc.gov/browse?tags=covid-19.

    Archived Data Notes:

    November 3, 2022: Due to a reporting cadence issue, case rates for Missouri counties are calculated based on 11 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 3, 2022, instead of the customary 7 days’ worth of data.

    November 10, 2022: Due to a reporting cadence change, case rates for Alabama counties are calculated based on 13 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 10, 2022, instead of the customary 7 days’ worth of data.

    November 10, 2022: Per the request of the jurisdiction, cases and deaths among non-residents have been removed from all Hawaii county totals throughout the entire time series. Cumulative case and death counts reported by CDC will no longer match Hawaii’s COVID-19 Dashboard, which still includes non-resident cases and deaths. 

    November 17, 2022: Two new columns, weekly historic cases and weekly historic deaths, were added to this dataset on November 17, 2022. These columns reflect case and death counts that were reported that week but were historical in nature and not reflective of the current burden within the jurisdiction. These historical cases and deaths are not included in the new weekly case and new weekly death columns; however, they are reflected in the cumulative totals provided for each jurisdiction. These data are used to account for artificial increases in case and death totals due to batched reporting of historical data.

    December 1, 2022: Due to cadence changes over the Thanksgiving holiday, case rates for all Ohio counties are reported as 0 in the data released on December 1, 2022.

    January 5, 2023: Due to North Carolina’s holiday reporting cadence, aggregate case and death data will contain 14 days’ worth of data instead of the customary 7 days. As a result, case and death metrics will appear higher than expected in the January 5, 2023, weekly release.

    January 12, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0. As a result, case and death metrics will appear lower than expected in the January 12, 2023, weekly release.

    January 19, 2023: Due to a reporting cadence issue, Mississippi’s aggregate case and death data will be calculated based on 14 days’ worth of data instead of the customary 7 days in the January 19, 2023, weekly release.

    January 26, 2023: Due to a reporting backlog of historic COVID-19 cases, case rates for two Michigan counties (Livingston and Washtenaw) were higher than expected in the January 19, 2023 weekly release.

    January 26, 2023: Due to a backlog of historic COVID-19 cases being reported this week, aggregate case and death counts in Charlotte County and Sarasota County, Florida, will appear higher than expected in the January 26, 2023 weekly release.

    January 26, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0 in the weekly release posted on January 26, 2023.

    February 2, 2023: As of the data collection deadline, CDC observed an abnormally large increase in aggregate COVID-19 cases and deaths reported for Washington State. In response, totals for new cases and new deaths released on February 2, 2023, have been displayed as zero at the state level until the issue is addressed with state officials. CDC is working with state officials to address the issue.

    February 2, 2023: Due to a decrease reported in cumulative case counts by Wyoming, case rates will be reported as 0 in the February 2, 2023, weekly release. CDC is working with state officials to verify the data submitted.

    February 16, 2023: Due to data processing delays, Utah’s aggregate case and death data will be reported as 0 in the weekly release posted on February 16, 2023. As a result, case and death metrics will appear lower than expected and should be interpreted with caution.

    February 16, 2023: Due to a reporting cadence change, Maine’s

  16. Skill2vec Dataset

    • paperswithcode.com
    Updated Jan 9, 2024
    Cite
    Le Van-Duyet; Vo Minh Quan; Dang Quang An (2024). Skill2vec Dataset [Dataset]. https://paperswithcode.com/dataset/skill2vec
    Explore at:
    Dataset updated
    Jan 9, 2024
    Authors
    Le Van-Duyet; Vo Minh Quan; Dang Quang An
    Description

    The dataset collects a large number of job descriptions from Dice.com, one of the most popular career websites for tech jobs in the USA. Skills are extracted from each job description using a skills dictionary, so the dataset is presented as a list of skill collections, one per job description. The crawl produced a total of 5 GB of data with more than 1,400,000 job descriptions. Skills extracted from the same job description are treated as appearing in the same context.
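As a rough illustration of dictionary-based skill extraction, the sketch below matches a small skills dictionary against job descriptions and returns one skill list per description. The dictionary contents, the whole-word matching rule, and the function name `extract_skills` are assumptions made for this example; the source does not describe the authors' actual dictionary or matching method.

```python
import re
from typing import List, Set

# Hypothetical skills dictionary; the real one is not described in the source.
SKILLS: Set[str] = {"python", "sql", "machine learning", "docker"}


def extract_skills(description: str, skills: Set[str] = SKILLS) -> List[str]:
    """Return the dictionary skills found in one job description, sorted by name."""
    text = description.lower()
    found = []
    for skill in sorted(skills):
        # Match whole words or phrases only, so "sql" does not match "mysqldump".
        if re.search(r"(?<!\w)" + re.escape(skill) + r"(?!\w)", text):
            found.append(skill)
    return found


# Made-up job descriptions; each one yields a list of skills in the same context.
jobs = [
    "Looking for a Python developer with SQL and Docker experience.",
    "Machine learning engineer; Python required.",
]
skill_lists = [extract_skills(job) for job in jobs]
```

Each inner list plays the role of a "sentence" of co-occurring skills, which is the shape of input that word-embedding style models typically expect.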

  17. The Spotify Audio Features Hit Predictor Dataset (1960-2019)

    • data.4tu.nl
    zip
    Updated Feb 4, 2020
    Cite
    Farooq Ansari (2020). The Spotify Audio Features Hit Predictor Dataset (1960-2019) [Dataset]. http://doi.org/10.4121/uuid:d77e74b0-66bc-47ac-8b25-5796d3084478
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 4, 2020
    Dataset provided by
    4TU.Centre for Research Data
    Authors
    Farooq Ansari
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Time period covered
    1960 - 2019
    Description

    This is a dataset consisting of features for tracks fetched using Spotify's Web API. The tracks are labeled '1' or '0' ('Hit' or 'Flop') depending on some criteria of the author. This dataset can be used to make a classification model that predicts whether a track would be a 'Hit' or not. (Note: The author does not objectively consider a track inferior, bad or a failure if it's labeled 'Flop'. 'Flop' here merely implies that it is a track that probably could not be considered popular in the mainstream.) Here's an implementation of this idea in the form of a website that I made: http://www.hitpredictor.in/

  18. Marine Species Biodiversity - AquaMaps

    • fsm-data.sprep.org
    • pacificdata.org
    • +13 more
    csv
    Updated Feb 20, 2025
    Cite
    Secretariat of the Pacific Regional Environment Programme (2025). Marine Species Biodiversity - AquaMaps [Dataset]. https://fsm-data.sprep.org/dataset/marine-species-biodiversity-aquamaps
    Explore at:
    Available download formats: csv (5080821)
    Dataset updated
    Feb 20, 2025
    Dataset provided by
    Pacific Regional Environment Programme https://www.sprep.org/
    License

    Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
    License information was derived automatically

    Area covered
    Worldwide
    Description

    AquaMaps are computer-generated predictions of natural occurrence of marine species, based on the environmental tolerance of a given species with respect to depth, salinity, temperature, primary productivity, and its association with sea ice or coastal areas. These 'environmental envelopes' are matched against an authority file which contains respective information for the Oceans of the World. Independent knowledge such as distribution by FAO areas or bounding boxes are used to avoid mapping species in areas that contain suitable habitat, but are not occupied by the species. Maps show the color-coded likelihood of a species to occur in a half-degree cell, with about 50 km side length near the equator. Experts are able to review, modify and approve maps.
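The size of a half-degree cell can be estimated with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6371 km, which is a simplification and not something stated in the source; under that assumption a half-degree cell is roughly 55.6 km on a side at the equator (the description's "about 50 km" is a round figure), and its east-west width shrinks with the cosine of latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def half_degree_cell_km(latitude_deg: float, cell_deg: float = 0.5):
    """Approximate (east-west, north-south) side lengths of a grid cell in km."""
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0  # about 111.19 km
    north_south = cell_deg * km_per_degree
    # Meridians converge toward the poles, so east-west width scales with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return east_west, north_south


equator = half_degree_cell_km(0.0)
at_60 = half_degree_cell_km(60.0)
```

At 60 degrees latitude the east-west side is about half the equatorial value, so cells of equal angular size cover much less area toward the poles.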

    Environmental envelopes are created in part (FAO areas, bounding boxes, depth ranges) from respective information in species databases such as FishBase and in part from occurrence records available from OBIS or GBIF. AquaMaps predictions have been validated successfully for a number of species using independent data sets and the model was shown to perform equally well or better than other standard species distribution models, when faced with the currently existing suboptimal input data sets (Ready et al. 2010).

    The creation of AquaMaps is supported by the following projects: MARA, Pew Fellows Program in Marine Conservation, INCOFISH, Sea Around Us, and Biogeoinformatics of Hexacorals.

    Kaschner, K., D.P. Tittensor, J. Ready, T. Gerrodette and B. Worm (2011). Current and Future Patterns of Global Marine Mammal Biodiversity. PLoS ONE 6(5): e19653.

    Ready, J., K. Kaschner, A.B. South, P.D. Eastwood, T. Rees, J. Rius, E. Agbayani, S. Kullander and R. Froese (2010). Predicting the distributions of marine organisms at the global scale. Ecological Modelling 221(3): 467-478.

    Copyright

    This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License (CC-BY-NC). You are welcome to include maps from www.aquamaps.org in your own web sites for non-commercial use, given that such inserts are clearly identified as coming from AquaMaps, with a backward link to the respective source page.

    Contacts

    • Rainer Froese, GEOMAR, Coordinator: rfroese@geomar.de
    • Kristin Kaschner, Uni Freiburg, model development: Kristin.Kaschner@biologie.uni-freiburg.de
    • Ma. Lourdes D. Palomares, UBC, extension to non-fish marine organisms: m.palomares@fisheries.ubc.ca
    • Sven Kullander, NRM, extension to freshwater: ve-sven@nrm.se
    • Jonathan Ready, NRM, implementation: jonathan.ready@gmail.com
    • Tony Rees, formerly with CSIRO, mapping tools: Tony.Rees@marinespecies.org
    • Paul Eastwood, SOPAC, validation: Paul.Eastwood@sopac.org
    • Andy South, CEFAS, validation: andy.south@cefas.co.uk
    • Josephine Rius-Barile, Q-quatics, database programming / data collection: j.barile@q-quatics.org
    • Cristina Garilao, GEOMAR, web programming: cgarilao@geomar.de
    • Kathleen Kesner-Reyes, Q-quatics, map validation: k.reyes@q-quatics.org
    • Elizabeth Bato, Q-quatics, map validation (non-fish): e.david@q-quatics.org

    Citing AquaMaps

    General citation Kaschner, K., K. Kesner-Reyes, C. Garilao, J. Rius-Barile, T. Rees, and R. Froese. 2019. AquaMaps: Predicted range maps for aquatic species. World wide web electronic publication, www.aquamaps.org, version 10/2019.

    Cite individual maps as, e.g., Computer Generated Map for Gadus morhua (Atlantic cod). www.aquamaps.org, version 10/2019 (accessed 01 Oct 2019).

    Reviewed Native Distribution Map for Gadus morhua (Atlantic cod). www.aquamaps.org, version 10/2019 (accessed 01 Oct 2019).

    Cite biodiversity maps as, e.g., Shark and Ray Biodiversity Map. www.aquamaps.org, version 10/2019 (accessed 01 Oct 2019).

    Cite the environmental dataset as, e.g., Kesner-Reyes, K., Segschneider, J., Garilao, C., Schneider, B., Rius-Barile, J., Kaschner, K., and Froese, R.(editors). AquaMaps Environmental Dataset: Half-Degree Cells Authority File (HCAF). World Wide Web electronic publication, www.aquamaps.org/main/envt_main.php, ver. 7, 10/2019.

    Using Full or Large Sets of AquaMaps Data

    We encourage partnering with the AquaMaps team for larger research projects or publications that would make intensive use of AquaMaps to ensure that you have access to the latest version and/or reviewed maps, the limitations of the data set are clearly understood and addressed, and that critical maps and/or unlikely results are recognized as such and double-checked for correctness prior to drawing conclusions and/or subsequent publication.

    The AquaMaps team can be contacted through Rainer Froese (rfroese@geomar.de) or Kristin Kaschner (Kristin.Kaschner@biologie.uni-freiburg.de).

    Privacy Policy

    AquaMaps uses log data to generate usage statistics. Like most websites, AquaMaps gathers information about internet protocol (IP) addresses, browser, referring pages, operating system, date/time, clicks, and visited pages, and stores it in log files. This information is used to find errors in our website, analyze trends, and determine the country of origin of our users. The log files are stored indefinitely. Only the administrators of the AquaMaps server have direct access to the log files. The information is used to inform further development of AquaMaps. Usage statistics may be shared with third parties for non-commercial purposes.

    Disclaimer

    AquaMaps generates standardized computer-generated and fairly reliable large scale predictions of marine and freshwater species. Although the AquaMaps team and their collaborators have obtained data from sources believed to be reliable and have made every reasonable effort to ensure its accuracy, many maps have not yet been verified by experts and we strongly suggest you verify species occurrences with independent sources before usage. We will not be held responsible for any consequence from the use or misuse of these data and/or maps by any organization or individual.

    Copyright This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License (CC-BY-NC). You are welcome to include text, numbers and maps from AquaMaps in your own web sites for non-commercial use, given that such inserts are clearly identified as coming from AquaMaps, with a backward link to the respective source page. Note that although species photos and drawings draw mainly from FishBase and SeaLifeBase, they belong to the indicated persons or organizations and have their own copyright statements.

  19. Higher Education Research: A Compilation of Journals and Abstracts 2019

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 5, 2020
    Cite
    Hertwig, Alexandra (2020). Higher Education Research: A Compilation of Journals and Abstracts 2019 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4244369
    Explore at:
    Dataset updated
    Nov 5, 2020
    Dataset authored and provided by
    Hertwig, Alexandra
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset to original publication:

    Hertwig, Alexandra (2020): Higher Education Research. A Compilation of Journals and Abstracts 2019. Kassel: INCHER-Kassel. DOI: 10.17170/kobra-202010292027.

    The Research Information Service (RIS) of INCHER-Kassel, Germany, has provided an annual compilation of academic journals since 2013. As a “side effect”, this information tool for researchers also gives an overview of the current topics of higher education research. The datasets allow for further evaluation of single or multiple volumes. For more information on original publications and available datasets please visit INCHER’s RIS websites.

    http://www.uni-kassel.de/einrichtungen/en/incher/risspecial-research-library/ris-documents.html

  20. ABS - Regional Population (SA2) 2001-2019 - Dataset - AURIN

    • data.aurin.org.au
    Updated Mar 5, 2025
    Cite
    (2025). ABS - Regional Population (SA2) 2001-2019 - Dataset - AURIN [Dataset]. https://data.aurin.org.au/dataset/au-govt-abs-abs-regional-population-sa2-2001-2019-sa2-2016
    Explore at:
    Dataset updated
    Mar 5, 2025
    License

    Attribution 2.5 (CC BY 2.5) https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Description

    This dataset contains estimates of the resident population and estimates of the components of population change as at 30 June for the years 2001-2019. The data is aggregated to 2016 Australian Statistical Geography Standard (ASGS) Statistical Area Level 2 (SA2). Estimated resident population (ERP) is the official estimate of the Australian population, which links people to a place of usual residence within Australia. Usual residence within Australia refers to the address at which the person has lived or intends to live for six months or more in a given reference year. For the 30 June reference date, this refers to the calendar year around it. Estimated resident population is based on Census counts by place of usual residence (excluding short-term overseas visitors in Australia), with an allowance for Census net undercount, to which is added the estimated number of Australian residents temporarily overseas at the time of the Census. This data is sourced from the Australian Bureau of Statistics (Catalogue Number: 3218.0). For more information please visit the Explanatory Notes.

Eduardo Fonseca; Manoj Plakal; Frederic Font; Daniel P. W. Ellis; Xavier Serra (2020). FSDKaggle2019 [Dataset]. http://doi.org/10.5281/zenodo.3612636

FSDKaggle2019

25 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Jan 20, 2020
Authors
Eduardo Fonseca; Manoj Plakal; Frederic Font; Daniel P. W. Ellis; Xavier Serra
Description

FSDKaggle2019 is an audio dataset containing 29,266 audio files annotated with 80 labels of the AudioSet Ontology. FSDKaggle2019 has been used for the DCASE Challenge 2019 Task 2, which was run as a Kaggle competition titled Freesound Audio Tagging 2019.

Citation

If you use the FSDKaggle2019 dataset or part of it, please cite our DCASE 2019 paper:

Eduardo Fonseca, Manoj Plakal, Frederic Font, Daniel P. W. Ellis, Xavier Serra. "Audio tagging with noisy labels and minimal supervision". Proceedings of the DCASE 2019 Workshop, NYC, US (2019)

You can also consider citing our ISMIR 2017 paper, which describes how we gathered the manual annotations included in FSDKaggle2019.

Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andres Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra, "Freesound Datasets: A Platform for the Creation of Open Audio Datasets", In Proceedings of the 18th International Society for Music Information Retrieval Conference, Suzhou, China, 2017

Data curators

Eduardo Fonseca, Manoj Plakal, Xavier Favory, Jordi Pons

Contact

You are welcome to contact Eduardo Fonseca should you have any questions at eduardo.fonseca@upf.edu.

ABOUT FSDKaggle2019

Freesound Dataset Kaggle 2019 (or FSDKaggle2019 for short) is an audio dataset containing 29,266 audio files annotated with 80 labels of the AudioSet Ontology [1]. FSDKaggle2019 has been used for the Task 2 of the Detection and Classification of Acoustic Scenes and Events (DCASE) Challenge 2019. Please visit the DCASE2019 Challenge Task 2 website for more information. This Task was hosted on the Kaggle platform as a competition titled Freesound Audio Tagging 2019. It was organized by researchers from the Music Technology Group (MTG) of Universitat Pompeu Fabra (UPF), and from Sound Understanding team at Google AI Perception. The competition intended to provide insight towards the development of broadly-applicable sound event classifiers able to cope with label noise and minimal supervision conditions.

FSDKaggle2019 employs audio clips from the following sources:

• Freesound Dataset (FSD): a dataset being collected at the MTG-UPF based on Freesound content organized with the AudioSet Ontology
• The soundtracks of a pool of Flickr videos taken from the Yahoo Flickr Creative Commons 100M dataset (YFCC)

The audio data is labeled using a vocabulary of 80 labels from Google’s AudioSet Ontology [1], covering diverse topics: Guitar and other Musical Instruments, Percussion, Water, Digestive, Respiratory sounds, Human voice, Human locomotion, Hands, Human group actions, Insect, Domestic animals, Glass, Liquid, Motor vehicle (road), Mechanisms, Doors, and a variety of Domestic sounds. The full list of categories can be inspected in vocabulary.csv (see Files & Download below). The goal of the task was to build a multi-label audio tagging system that can predict appropriate label(s) for each audio clip in a test set.

What follows is a summary of some of the most relevant characteristics of FSDKaggle2019. Nevertheless, it is highly recommended to read our DCASE 2019 paper for a more in-depth description of the dataset and how it was built.

Ground Truth Labels

The ground truth labels are provided at the clip-level, and express the presence of a sound category in the audio clip, hence can be considered weak labels or tags. Audio clips have variable lengths (roughly from 0.3 to 30s).

The audio content from FSD has been manually labeled by humans following a data labeling process using the Freesound Annotator platform. Most labels have inter-annotator agreement but not all of them. More details about the data labeling process and the Freesound Annotator can be found in [2].

The YFCC soundtracks were labeled using automated heuristics applied to the audio content and metadata of the original Flickr clips. Hence, a substantial amount of label noise can be expected. The label noise can vary widely in amount and type depending on the category, including in- and out-of-vocabulary noises. More information about some of the types of label noise that can be encountered is available in [3].

Specifically, FSDKaggle2019 features three types of label quality, one for each set in the dataset:

• curated train set: correct (but potentially incomplete) labels
• noisy train set: noisy labels
• test set: correct and complete labels

Further details can be found below in the sections for each set.

Format

All audio clips are provided as uncompressed PCM 16 bit, 44.1 kHz, mono audio files.

DATA SPLIT

FSDKaggle2019 consists of two train sets and one test set. The idea is to limit the supervision provided for training (i.e., the manually-labeled, hence reliable, data), thus promoting approaches to deal with label noise.

Curated train set

The curated train set consists of manually-labeled data from FSD.

• Number of clips/class: 75 except in a few cases (where there are less)
• Total number of clips: 4970
• Avg number of labels/clip: 1.2
• Total duration: 10.5 hours

The duratio...
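The stated audio format (uncompressed PCM, 16 bit, 44.1 kHz, mono) can be checked on a downloaded clip with Python's standard `wave` module. The following is a minimal sketch that assumes the clips are stored in WAV containers, which the description above does not state explicitly; the function and constant names are illustrative and are not part of the dataset's tooling.

```python
import wave

# Values taken from the "Format" paragraph of the dataset description.
EXPECTED_CHANNELS = 1             # mono
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16 bit
EXPECTED_SAMPLE_RATE_HZ = 44100   # 44.1 kHz


def matches_stated_format(path):
    """Return True if the WAV file at `path` is 16-bit, 44.1 kHz, mono PCM."""
    with wave.open(path, "rb") as wav_file:
        return (
            wav_file.getnchannels() == EXPECTED_CHANNELS
            and wav_file.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav_file.getframerate() == EXPECTED_SAMPLE_RATE_HZ
        )


def clip_duration_seconds(path):
    """Return the clip length in seconds (frame count divided by sample rate)."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()
```

Running `matches_stated_format` over each set before training is a cheap way to catch files that were resampled or converted during download, and `clip_duration_seconds` can be compared against the roughly 0.3 to 30 s range quoted above.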
