100+ datasets found
  1. Data Sources

    • pacific-data.sprep.org
    • tonga-data.sprep.org
    xlsx
    Updated Feb 14, 2025
    Cite
    Department of Environment (2025). Data Sources [Dataset]. https://pacific-data.sprep.org/dataset/data-sources
    Available download formats: xlsx
    Dataset updated
    Feb 14, 2025
    Dataset provided by
    Tonga
    Department of Environment
    License

    https://pacific-data.sprep.org/resource/private-data-license-agreement-0

    Area covered
    Tonga
    Description

    Data sources. Not complete. Will get it done this weekend.

  2. Data from: Inventory of online public databases and repositories holding...

    • catalog.data.gov
    • s.cnmilf.com
    • +2 more
    Updated Apr 21, 2025
    + more versions
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and a baseline for future studies of ag research data.

    Purpose

    As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to:

    - establish where agricultural researchers in the United States (land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies) currently publish their data, including general research data repositories, domain-specific databases, and the top journals

    - compare how much data is in institutional vs. domain-specific vs. federal platforms

    - determine which repositories are recommended by top journals that require or recommend the publication of supporting data

    - ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data

    Approach

    The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered.

    Search methods

    We first compiled a list of known domain-specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” / “ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects.

    We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag-specific university repositories and general university repositories that housed a portion of agricultural data. Ag-specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories.

    Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGAEA, Protein Data Bank, and Zenodo.

    Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals in which USDA published in 2012 and 2016 were compiled, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each of the top 50 journals.

    Evaluation

    We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind.

    Results

    A summary of the major findings from our data review:

    - Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors.

    - There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection.

    - Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation.

    See included README file for descriptions of each individual data file in this dataset.

    Resources in this dataset:

    - Resource Title: Journals. File Name: Journals.csv

    - Resource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csv

    - Resource Title: TDWG presentation. File Name: TDWG_Presentation.pptx

    - Resource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csv

    - Resource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csv

    - Resource Title: General repositories containing ag data. File Name: general_repos_1.csv

    - Resource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt
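
    The evaluation step described for this dataset (the share of a repository matched by each search term, flagged at the 1% and 5% levels and at more than 100 and more than 500 results) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' procedure: the `TermResult` type, the `evaluate_repository` function, and the sample counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TermResult:
    """Hypothetical record: one search term and the hits it returned."""
    term: str
    hits: int


def evaluate_repository(total_records: int, results: list[TermResult]) -> list[dict]:
    """Flag each term against the thresholds named in the description."""
    rows = []
    for r in results:
        # Fraction of the whole collection matched by this term.
        share = r.hits / total_records if total_records else 0.0
        rows.append({
            "term": r.term,
            "hits": r.hits,
            "percent": round(100 * share, 2),
            "at_least_1_percent": share >= 0.01,
            "at_least_5_percent": share >= 0.05,
            "over_100_results": r.hits > 100,
            "over_500_results": r.hits > 500,
        })
    return rows


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    sample = [TermResult("agriculture", 600), TermResult("crop", 90)]
    for row in evaluate_repository(10_000, sample):
        print(row)
```

    Keeping the thresholds as separate boolean columns mirrors how the description lists them, so a repository can be filtered on any one of them independently.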

  3. Data from: Analysis of Commercial and Public Bioactivity Databases

    • acs.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Pekka Tiikkainen; Lutz Franke (2023). Analysis of Commercial and Public Bioactivity Databases [Dataset]. http://doi.org/10.1021/ci2003126.s003
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    ACS Publications
    Authors
    Pekka Tiikkainen; Lutz Franke
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data.
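
    The two comparisons described above (how much two sources overlap, and how often they disagree on data points taken from the same article) could be sketched as below. This is a minimal illustration under assumptions, not the authors' method: the `(article, compound, target)` key, the `compare_sources` function, the `tolerance` parameter, and the sample values are all hypothetical.

```python
def compare_sources(a: dict, b: dict, tolerance: float = 0.0) -> dict:
    """Compare two sources keyed by a hypothetical (article, compound, target) tuple.

    Values are activity measurements; two shared data points count as
    inconsistent when they differ by more than `tolerance`.
    """
    shared = a.keys() & b.keys()
    inconsistent = {k for k in shared if abs(a[k] - b[k]) > tolerance}
    return {
        "only_a": len(a.keys() - b.keys()),
        "only_b": len(b.keys() - a.keys()),
        "shared": len(shared),
        "inconsistent": len(inconsistent),
    }


if __name__ == "__main__":
    # Invented data points, for illustration only.
    public = {("art1", "cmpd1", "tgtA"): 7.2, ("art1", "cmpd2", "tgtA"): 6.0}
    commercial = {("art1", "cmpd1", "tgtA"): 7.2, ("art1", "cmpd2", "tgtA"): 5.0,
                  ("art2", "cmpd3", "tgtB"): 8.1}
    print(compare_sources(public, commercial))
```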

  4. GIS Data Sources

    • hub.arcgis.com
    Updated Apr 2, 2024
    Cite
    King County (2024). GIS Data Sources [Dataset]. https://hub.arcgis.com/documents/kingcounty::gis-data-sources?uiVersion=content-views
    Dataset updated
    Apr 2, 2024
    Dataset authored and provided by
    King County
    Description

    This page is an index of all the data sources that the GIS Center has to offer. If you're looking for anything, you'll find it here!

  5. Summary of major data sources used.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 9, 2014
    Cite
    Higgins, John M.; Delgado, Francisco Feijó; Malka, Roy; Manalis, Scott R. (2014). Summary of major data sources used. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001194004
    Dataset updated
    Oct 9, 2014
    Authors
    Higgins, John M.; Delgado, Francisco Feijó; Malka, Roy; Manalis, Scott R.
    Description

    See Materials and Methods for a description of the techniques. Summary of major data sources used.

  6. Top-1000 HHS Open Data Resources

    • catalog.data.gov
    • data.virginia.gov
    • +1 more
    Updated Jul 30, 2025
    Cite
    Office of Chief Data Officer (2025). Top-1000 HHS Open Data Resources [Dataset]. https://catalog.data.gov/dataset/top-1000-hhs-open-data-resources
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Office of Chief Data Officer
    Description

    HHS responsibly shares “open by default” data with the public to democratize access to information, demystify the Department, and increase transparency through data sharing. HHS Open Data is non-sensitive data, meaning thousands of health and human services datasets are publicly available to fuel new business models, enable emerging technologies like AI, accelerate scientific discoveries, and inspire American innovation. This top-1000 HHS Open Data websites and resources page, dynamically generated from the Digital Analytics Program (DAP) provided by the U.S. General Services Administration (GSA), is driven by near-real-time user demand. GSA’s DAP helps federal agencies and the public see how visitors find, access, and use government websites, data, and services online. The below list filters DAP for only resources from HHS and includes all HHS Divisions. You may filter by individual HHS Divisions and columns.

  7. Department of Community Resources & Services Online Data Sources

    • opendata.howardcountymd.gov
    • data.wu.ac.at
    csv, xlsx, xml
    Updated Oct 28, 2019
    Cite
    Department of Community Resources & Services (2019). Department of Community Resources & Services Online Data Sources [Dataset]. https://opendata.howardcountymd.gov/w/kdeq-r7qc/j72c-n6z5?cur=LdI0ncE4AfX&from=n10jJ2BVdMM
    Available download formats: xml, csv, xlsx
    Dataset updated
    Oct 28, 2019
    Dataset authored and provided by
    Department of Community Resources & Services
    Description

    This dataset lists various data sources used within the Department of Community Resources & Services for internal and external reports. It allows individuals and organizations to identify the type of data they are looking for and the geographical level at which they need it (i.e. National, State, County, etc.). This dataset will be updated every quarter and should be utilized for research purposes.

  8. Open Data Inception

    • public.opendatasoft.com
    • data.smartidf.services
    • +1 more
    csv, excel, geojson +1
    Updated Dec 2, 2025
    Cite
    (2025). Open Data Inception [Dataset]. https://public.opendatasoft.com/explore/dataset/open-data-sources/?flg=en-us
    Available download formats: excel, csv, json, geojson
    Dataset updated
    Dec 2, 2025
    License

    https://en.wikipedia.org/wiki/Public_domain

    Description

    Open Data Inception is a project that compiles a comprehensive list of open data portals worldwide. It provides a geotagged, searchable map and list of these portals, making it easier for users to find clean, usable open data by country or topic. The initiative aims to address the challenge of locating reliable data sources, offering a user-friendly resource with an API for data enthusiasts and researchers. The project also explores standardizing metadata to improve data discoverability. Open Data Inception relies on crowdsourcing and anyone can suggest the addition of a portal via this form.

  9. Data sources’ characteristics*.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 2, 2023
    + more versions
    Cite
    Giuseppe Roberto; Ingrid Leal; Naveed Sattar; A. Katrina Loomis; Paul Avillach; Peter Egger; Rients van Wijngaarden; David Ansell; Sulev Reisberg; Mari-Liis Tammesoo; Helene Alavere; Alessandro Pasqua; Lars Pedersen; James Cunningham; Lara Tramontan; Miguel A. Mayer; Ron Herings; Preciosa Coloma; Francesco Lapi; Miriam Sturkenboom; Johan van der Lei; Martijn J. Schuemie; Peter Rijnbeek; Rosa Gini (2023). Data sources’ characteristics*. [Dataset]. http://doi.org/10.1371/journal.pone.0160648.t001
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Giuseppe Roberto; Ingrid Leal; Naveed Sattar; A. Katrina Loomis; Paul Avillach; Peter Egger; Rients van Wijngaarden; David Ansell; Sulev Reisberg; Mari-Liis Tammesoo; Helene Alavere; Alessandro Pasqua; Lars Pedersen; James Cunningham; Lara Tramontan; Miguel A. Mayer; Ron Herings; Preciosa Coloma; Francesco Lapi; Miriam Sturkenboom; Johan van der Lei; Martijn J. Schuemie; Peter Rijnbeek; Rosa Gini
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Data sources’ characteristics*.

  10. U.S. main sources freelancers used to find work 2020

    • statista.com
    Updated Sep 15, 2020
    Cite
    Statista (2020). U.S. main sources freelancers used to find work 2020 [Dataset]. https://www.statista.com/statistics/530909/sources-for-freelancers-to-find-work-us/
    Dataset updated
    Sep 15, 2020
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jun 15, 2020 - Jul 7, 2020
    Area covered
    United States
    Description

    According to a 2020 survey, about 46 percent of American freelancers who have participated in work throughout the COVID-19 pandemic stated that a previous client is a main source of finding more freelance work. This figure stood at 36 percent for freelancers who paused their freelance work during the COVID-19 pandemic.

  11. Frontiers of Data Visualization Workshop II: Data Wrangling Workshop Summary...

    • catalog.data.gov
    Updated May 14, 2025
    + more versions
    Cite
    NCO NITRD (2025). Frontiers of Data Visualization Workshop II: Data Wrangling Workshop Summary [Dataset]. https://catalog.data.gov/dataset/frontiers-of-data-visualization-workshop-ii-data-wrangling-workshop-summary
    Dataset updated
    May 14, 2025
    Dataset provided by
    NCO NITRD
    Description

    The Data Visualization Workshop II: Data Wrangling was a web-based event held on October 18, 2017. This workshop report summarizes the individual perspectives of a group of visualization experts from the public, private, and academic sectors who met online to discuss how to improve the creation and use of high-quality visualizations. The specific focus of this workshop was on the complexities of "data wrangling". Data wrangling includes finding the appropriate data sources that are both accessible and usable and then shaping and combining that data to facilitate the most accurate and meaningful analysis possible. The workshop was organized as a 3-hour web event and moderated by the members of the Human Computer Interaction and Information Management Task Force of the Networking and Information Technology Research and Development Program's Big Data Interagency Working Group. Report prepared by the Human Computer Interaction And Information Management Task Force, Big Data Interagency Working Group, Networking & Information Technology Research & Development Subcommittee, Committee On Technology Of The National Science & Technology Council...

  12. Data sources used in bibliometric studies 1978-2022

    • demo.researchdata.se
    • researchdata.se
    Updated Jun 2, 2025
    Cite
    Camilla Hertil Lindelöw (2025). Data sources used in bibliometric studies 1978-2022 [Dataset]. http://doi.org/10.5281/zenodo.15037456
    Dataset updated
    Jun 2, 2025
    Dataset provided by
    University of Borås
    Authors
    Camilla Hertil Lindelöw
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset contains data sources used in bibliometric studies. Every row corresponds to one data source, so there will be multiple rows for articles containing multiple sources. The column norm_coarse corresponds to the data source category and norm_fine to the data source (only available for bibliographical metadata sources). See article for details. Related article: Lindelöw, C. H., Hammarfelt, B., & Mazoni, A. (2025). Data sources used in bibliometrics 1978–2022: From proprietary databases to the great wide open. Journal of the Association for Information Science and Technology, 1–14. https://doi.org/10.1002/asi.25018
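
    Because each row is one data source rather than one article, counting articles per category needs de-duplication. The sketch below shows one way to do that in Python. The column names `norm_coarse` and `norm_fine` come from the description; the `article_id` column, the `articles_per_category` function, and every value in the sample rows are assumptions made for illustration.

```python
import csv
import io
from collections import defaultdict


def articles_per_category(csv_text: str, id_column: str = "article_id") -> dict:
    """Count distinct articles per norm_coarse category.

    `id_column` is a hypothetical article identifier; the real file may
    name it differently.
    """
    articles = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A set collapses multiple rows from the same article.
        articles[row["norm_coarse"]].add(row[id_column])
    return {category: len(ids) for category, ids in articles.items()}


# Invented rows, for illustration only.
SAMPLE = """article_id,norm_coarse,norm_fine
a1,citation index,Web of Science
a1,citation index,Scopus
a2,citation index,Scopus
a2,full text,
"""

if __name__ == "__main__":
    print(articles_per_category(SAMPLE))
```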

  13. Data sources used to determine the hazard rates for progression-free...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Oct 13, 2015
    Cite
    Valant, Jason; Scher, Howard I.; Todd, Mary B.; Solo, Kirk; Mehra, Maneesha (2015). Data sources used to determine the hazard rates for progression-free survival and overall survival associated with each clinical state, and the survival estimates derived from these publications for inclusion into the model. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001903207
    Dataset updated
    Oct 13, 2015
    Authors
    Valant, Jason; Scher, Howard I.; Todd, Mary B.; Solo, Kirk; Mehra, Maneesha
    Description

    *The distribution of patients flowing from nmCRPC to mCRPC that has not been treated with or not progressed on chemotherapy was determined based on Oudard et al 2009 [23]. PSA, prostate-specific antigen; nmCRPC, non-metastatic castration-resistant prostate cancer; mCRPC, metastatic castration-resistant prostate cancer; NA, not applicable. Data sources used to determine the hazard rates for progression-free survival and overall survival associated with each clinical state, and the survival estimates derived from these publications for inclusion into the model.

  14. Name Find Source LLC Whois Database | Whois Data Center

    • whoisdatacenter.com
    csv
    Updated Oct 7, 2025
    Cite
    AllHeart Web Inc (2025). Name Find Source LLC Whois Database | Whois Data Center [Dataset]. https://whoisdatacenter.com/registrar/2863/
    Available download formats: csv
    Dataset updated
    Oct 7, 2025
    Dataset authored and provided by
    AllHeart Web Inc
    License

    https://whoisdatacenter.com/terms-of-use/

    Time period covered
    Oct 12, 2025 - Dec 31, 2025
    Description

    Name Find Source LLC Whois Database, discover comprehensive ownership details, registration dates, and more for Name Find Source LLC with Whois Data Center.

  15. Data supporting the Master thesis "Monitoring von Open Data Praktiken -...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Nov 21, 2024
    Cite
    Katharina Zinke; Katharina Zinke (2024). Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" [Dataset]. http://doi.org/10.5281/zenodo.14196539
    Available download formats: zip
    Dataset updated
    Nov 21, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katharina Zinke; Katharina Zinke
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Dresden
    Description

    Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" (Monitoring open data practices - challenges in finding data publications using the example of publications by researchers at TU Dresden) - Katharina Zinke, Institut für Bibliotheks- und Informationswissenschaften, Humboldt-Universität Berlin, 2023

    This ZIP-File contains the data the thesis is based on, interim exports of the results and the R script with all pre-processing, data merging and analyses carried out. The documentation of the additional, explorative analysis is also available. The actual PDFs and text files of the scientific papers used are not included as they are published open access.

    The folder structure is shown below with the file names and a brief description of the contents of each file. For details concerning the analysis approach, please refer to the master's thesis (publication following soon).

    ## Data sources

    Folder 01_SourceData/

    - PLOS-Dataset_v2_Mar23.csv (PLOS-OSI dataset)

    - ScopusSearch_ExportResults.csv (export of Scopus search results from Scopus)

    - ScopusSearch_ExportResults.ris (export of Scopus search results from Scopus)

    - Zotero_Export_ScopusSearch.csv (export of the file names and DOIs of the Scopus search results from Zotero)

    ## Automatic classification

    Folder 02_AutomaticClassification/

    - (NOT INCLUDED) PDFs folder (Folder for PDFs of all publications identified by the Scopus search, named AuthorLastName_Year_PublicationTitle_Title)

    - (NOT INCLUDED) PDFs_to_text folder (Folder for all texts extracted from the PDFs by ODDPub, named AuthorLastName_Year_PublicationTitle_Title)

    - PLOS_ScopusSearch_matched.csv (merge of the Scopus search results with the PLOS_OSI dataset for the files contained in both)

    - oddpub_results_wDOIs.csv (results file of the ODDPub classification)

    - PLOS_ODDPub.csv (merge of the results file of the ODDPub classification with the PLOS-OSI dataset for the publications contained in both)

    ## Manual coding

    Folder 03_ManualCheck/

    - CodeSheet_ManualCheck.txt (Code sheet with descriptions of the variables for manual coding)

    - ManualCheck_2023-06-08.csv (Manual coding results file)

    - PLOS_ODDPub_Manual.csv (Merge of the results file of the ODDPub and PLOS-OSI classification with the results file of the manual coding)

    ## Explorative analysis for the discoverability of open data

    Folder 04_FurtherAnalyses/

    - Proof_of_of_Concept_Open_Data_Monitoring.pdf (Description of the explorative analysis of the discoverability of open data publications using the example of a researcher) - in German

    ## R-Script

    - Analyses_MA_OpenDataMonitoring.R (R-Script for preparing, merging and analyzing the data and for performing the ODDPub algorithm)

  16. Data from: Bibliographic dataset characterizing studies that use online...

    • data-staging.niaid.nih.gov
    • portalcientifico.unav.edu
    • +1 more
    Updated Jan 24, 2020
    Cite
    Ball-Damerow, Joan E.; Brenskelle, Laura; Barve, Narayani; LaFrance, Raphael; Soltis, Pamela S.; Sierwald, Petra; Bieler, Rüdiger; Ariño, Arturo; Guralnick, Robert (2020). Bibliographic dataset characterizing studies that use online biodiversity databases [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_2589438
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Field Museum of Natural History
    Department of Environmental Biology, Universidad de Navarra
    Florida Museum of Natural History, University of Florida, Gainesville
    Authors
    Ball-Damerow, Joan E.; Brenskelle, Laura; Barve, Narayani; LaFrance, Raphael; Soltis, Pamela S.; Sierwald, Petra; Bieler, Rüdiger; Ariño, Arturo; Guralnick, Robert
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset includes bibliographic information for 501 papers that were published from 2010-April 2017 (time of search) and use online biodiversity databases for research purposes. Our overarching goal in this study is to determine how research uses of biodiversity data developed during a time of unprecedented growth of online data resources. We also determine uses with the highest number of citations, how online occurrence data are linked to other data types, and if/how data quality is addressed. Specifically, we address the following questions:

    1.) What primary biodiversity databases have been cited in published research, and which databases have been cited most often?

    2.) Is the biodiversity research community citing databases appropriately, and are the cited databases currently accessible online?

    3.) What are the most common uses, general taxa addressed, and data linkages, and how have they changed over time?

    4.) What uses have the highest impact, as measured through the mean number of citations per year?

    5.) Are certain uses applied more often for plants/invertebrates/vertebrates?

    6.) Are links to specific data types associated more often with particular uses?

    7.) How often are major data quality issues addressed?

    8.) What data quality issues tend to be addressed for the top uses?
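
    Question 4 measures impact as the mean number of citations per year. The description does not say how that figure is computed, so the Python sketch below is only one plausible reading: citations divided by years since publication, then averaged per research use. The function names, the `use`/`citations`/`year` fields, the reference year, and the sample papers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def citations_per_year(citations: int, pub_year: int, ref_year: int) -> float:
    """Citations divided by years since publication (assumed: at least one year)."""
    years = max(ref_year - pub_year, 1)
    return citations / years


def mean_impact_by_use(papers: list[dict], ref_year: int) -> dict:
    """Average the per-paper citation rate within each research use."""
    by_use = defaultdict(list)
    for p in papers:
        rate = citations_per_year(p["citations"], p["year"], ref_year)
        by_use[p["use"]].append(rate)
    return {use: mean(rates) for use, rates in by_use.items()}


if __name__ == "__main__":
    # Invented papers, for illustration only.
    papers = [
        {"use": "distribution modeling", "citations": 40, "year": 2013},
        {"use": "distribution modeling", "citations": 10, "year": 2016},
        {"use": "taxonomy", "citations": 6, "year": 2015},
    ]
    print(mean_impact_by_use(papers, ref_year=2017))
```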

    Relevant papers for this analysis include those that use online and openly accessible primary occurrence records, or those that add data to an online database. Google Scholar (GS) provides full-text indexing, which was important to identify data sources that often appear buried in the methods section of a paper. Our search was therefore restricted to GS. All authors discussed and agreed upon representative search terms, which were relatively broad to capture a variety of databases hosting primary occurrence records. The terms included: “species occurrence” database (8,800 results), “natural history collection” database (634 results), herbarium database (16,500 results), “biodiversity database” (3,350 results), “primary biodiversity data” database (483 results), “museum collection” database (4,480 results), “digital accessible information” database (10 results), and “digital accessible knowledge” database (52 results). Note that quotation marks are used as part of the search terms where specific phrases are needed in whole. We downloaded all records returned by each search (or the first 500 if there were more) into a Zotero reference management database. About one third of the 2500 papers in the final dataset were relevant. Three of the authors with specialized knowledge of the field characterized relevant papers using a standardized tagging protocol based on a series of key topics of interest. We developed a list of potential tags and descriptions for each topic, including: database(s) used, database accessibility, scale of study, region of study, taxa addressed, research use of data, other data types linked to species occurrence data, data quality issues addressed, authors, institutions, and funding sources. Each tagged paper was thoroughly checked by a second tagger.

    The final dataset of tagged papers allows us to quantify general areas of research made possible by the expansion of online species occurrence databases, and trends over time. Analyses of these data will be published in a separate quantitative review.

  17. EuroSpeech-Data-Sources

    • huggingface.co
    Cite
    Samuel, EuroSpeech-Data-Sources [Dataset]. https://huggingface.co/datasets/sam8000/EuroSpeech-Data-Sources
    Authors
    Samuel
    Description

    EuroSpeech Data Sources

    This repository contains the source metadata for the EuroSpeech multilingual speech dataset. EuroSpeech is a large-scale parliamentary speech corpus covering 22 European languages with over 78k hours of aligned speech-text data.

      📋 Repository Contents
    

    This repository provides comprehensive CSV files that document the original sources for all audio and transcript data used in the EuroSpeech dataset. For each of the 22 countries included in… See the full description on the dataset page: https://huggingface.co/datasets/sam8000/EuroSpeech-Data-Sources.

  18. Component algorithms description.

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1more
    xls
    Updated Jun 1, 2023
    Cite
    Giuseppe Roberto; Ingrid Leal; Naveed Sattar; A. Katrina Loomis; Paul Avillach; Peter Egger; Rients van Wijngaarden; David Ansell; Sulev Reisberg; Mari-Liis Tammesoo; Helene Alavere; Alessandro Pasqua; Lars Pedersen; James Cunningham; Lara Tramontan; Miguel A. Mayer; Ron Herings; Preciosa Coloma; Francesco Lapi; Miriam Sturkenboom; Johan van der Lei; Martijn J. Schuemie; Peter Rijnbeek; Rosa Gini (2023). Component algorithms description. [Dataset]. http://doi.org/10.1371/journal.pone.0160648.t002
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Giuseppe Roberto; Ingrid Leal; Naveed Sattar; A. Katrina Loomis; Paul Avillach; Peter Egger; Rients van Wijngaarden; David Ansell; Sulev Reisberg; Mari-Liis Tammesoo; Helene Alavere; Alessandro Pasqua; Lars Pedersen; James Cunningham; Lara Tramontan; Miguel A. Mayer; Ron Herings; Preciosa Coloma; Francesco Lapi; Miriam Sturkenboom; Johan van der Lei; Martijn J. Schuemie; Peter Rijnbeek; Rosa Gini
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Component algorithms description.

  19. DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Jan 18, 2022
    + more versions
    Cite
    Fulone, Izabela; Ferre, Felipe; da Costa Lima, Elisangela; Ito, Marcia; Osorio-de-Castro, Claudia Garcia Serpa; Mota, Daniel Marques; de Souza, Luiz Júpiter Carneiro; Zimmernan, Ivan Ricardo; Elseviers, Monique; Lopes, Luciane Cruz; Leal, Lisiane Freitas; Da Luz Carvalho-Soares, Monica (2022). DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000302077
    Dataset updated
    Jan 18, 2022
    Authors
    Fulone, Izabela; Ferre, Felipe; da Costa Lima, Elisangela; Ito, Marcia; Osorio-de-Castro, Claudia Garcia Serpa; Mota, Daniel Marques; de Souza, Luiz Júpiter Carneiro; Zimmernan, Ivan Ricardo; Elseviers, Monique; Lopes, Luciane Cruz; Leal, Lisiane Freitas; Da Luz Carvalho-Soares, Monica
    Area covered
    Brazil
    Description

    Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR).

    Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source.

    Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online, providing information at the national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source.

    Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

  20. Hourly wind speed in miles per hour and three-digit data-source flag...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 1, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Hourly wind speed in miles per hour and three-digit data-source flag associated with the data, January 1, 1948 - September 30, 2015 [Dataset]. https://catalog.data.gov/dataset/hourly-wind-speed-in-miles-per-hour-and-three-digit-data-source-flag-associated-with-th-30
    Dataset updated
    Oct 1, 2025
    Dataset provided by
    U.S. Geological Survey
    Description

    The text file "Wind speed.txt" contains hourly data and the associated data-source flag from January 1, 1948, to September 30, 2015. The primary source of the data is the Argonne National Laboratory, Illinois. The first four columns give the year, month, day, and hour of the observation. Column 5 is the wind speed in miles per hour. Column 6 is the three-digit data-source flag, which identifies how the wind speed data were processed: whether the data are original or missing, the method used to fill missing periods, and any other transformations of the data. The data-source flag consists of a three-digit sequence in the form "xyz" that describes the origin and transformations of the data values. Users of the data should consult Over and others (2010) for the detailed documentation of this hourly data-source flag series. Reference Cited: Over, T.M., Price, T.H., and Ishii, A.L., 2010, Development and analysis of a meteorological database, Argonne National Laboratory, Illinois: U.S. Geological Survey Open File Report 2010-1220, 67 p., http://pubs.usgs.gov/of/2010/1220/.
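
    The six-column layout described above could be read along the following lines. This is only a sketch under stated assumptions: the description does not say how columns are delimited, so whitespace separation is assumed, and the names WindRecord and parse_line, as well as the sample row, are hypothetical and not taken from the dataset. The meaning of each flag digit is documented in Over and others (2010) and is not interpreted here.

```python
from typing import NamedTuple


class WindRecord(NamedTuple):
    """One hourly observation (hypothetical container, not part of the dataset)."""
    year: int
    month: int
    day: int
    hour: int
    speed_mph: float
    flag: str  # three-digit data-source flag "xyz", kept as text


def parse_line(line: str) -> WindRecord:
    """Parse one row, ASSUMING six whitespace-separated columns."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 columns, got {len(fields)}")
    year, month, day, hour = (int(f) for f in fields[:4])
    flag = fields[5]
    # The description calls the flag a three-digit sequence; reject anything else.
    if len(flag) != 3 or not flag.isdigit():
        raise ValueError(f"flag is not three digits: {flag!r}")
    return WindRecord(year, month, day, hour, float(fields[4]), flag)


# Made-up sample row for illustration only; not real data from the file.
record = parse_line("1948 1 1 1 5.2 100")
x, y, z = record.flag  # the three digits of the "xyz" flag, still uninterpreted
```

    Keeping the flag as a string rather than an integer preserves any leading zero, so the three positions can always be read separately.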
