100+ datasets found
  1. DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Jun 1, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes (2023). DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study.xlsx [Dataset]. http://doi.org/10.3389/fphar.2021.789872.s002
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers Mediahttp://www.frontiersin.org/
    Authors
    Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Brazil
    Description

    Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR).Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source.Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online; providing information at national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source.Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

  2. O

    Department of Community Resources & Services Online Data Sources

    • opendata.howardcountymd.gov
    • data.wu.ac.at
    csv, xlsx, xml
    Updated Oct 28, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Department of Community Resources & Services (2019). Department of Community Resources & Services Online Data Sources [Dataset]. https://opendata.howardcountymd.gov/w/kdeq-r7qc/j72c-n6z5?cur=LdI0ncE4AfX&from=n10jJ2BVdMM
    Explore at:
    xml, csv, xlsxAvailable download formats
    Dataset updated
    Oct 28, 2019
    Dataset authored and provided by
    Department of Community Resources & Services
    Description

    This dataset lists various data sources used within the Department of Community Resources & Services for various internal and external reports. This dataset allows individuals and organizations to identify the type of data they are looking for and to which geographical level they are trying to get the data for (i.e. National, State, County, etc.). This dataset will be updated every quarter and should be utilized for research purposes

  3. d

    Addresses (Open Data)

    • catalog.data.gov
    • data-academy.tempe.gov
    • +11more
    Updated Nov 22, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    City of Tempe (2025). Addresses (Open Data) [Dataset]. https://catalog.data.gov/dataset/addresses-open-data
    Explore at:
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    City of Tempe
    Description

    This dataset is a compilation of address point data for the City of Tempe. The dataset contains a point location, the official address (as defined by The Building Safety Division of Community Development) for all occupiable units and any other official addresses in the City. There are several additional attributes that may be populated for an address, but they may not be populated for every address. Contact: Lynn Flaaen-Hanna, Development Services Specialist Contact E-mail Link: Map that Lets You Explore and Export Address Data Data Source: The initial dataset was created by combining several datasets and then reviewing the information to remove duplicates and identify errors. This published dataset is the system of record for Tempe addresses going forward, with the address information being created and maintained by The Building Safety Division of Community Development.Data Source Type: ESRI ArcGIS Enterprise GeodatabasePreparation Method: N/APublish Frequency: WeeklyPublish Method: AutomaticData Dictionary

  4. m

    Use case data sources

    • data.mendeley.com
    Updated Feb 28, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Werner Kritzinger (2018). Use case data sources [Dataset]. http://doi.org/10.17632/fx9xfmtfcw.1
    Explore at:
    Dataset updated
    Feb 28, 2018
    Authors
    Werner Kritzinger
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sources for 54 use cases, which were used to identify the impacts to the value creation system by Additive Manufacturing.

  5. AHRQ Social Determinants of Health Updated Database

    • datalumos.org
    Updated Feb 25, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    AHRQ (2025). AHRQ Social Determinants of Health Updated Database [Dataset]. http://doi.org/10.3886/E220762V1
    Explore at:
    Dataset updated
    Feb 25, 2025
    Dataset provided by
    Agency for Healthcare Research and Qualityhttp://www.ahrq.gov/
    Authors
    AHRQ
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    AHRQ's database on Social Determinants of Health (SDOH) was created under a project funded by the Patient Centered Outcomes Research (PCOR) Trust Fund. The purpose of this project is to create easy to use, easily linkable SDOH-focused data to use in PCOR research, inform approaches to address emerging health issues, and ultimately contribute to improved health outcomes.The database was developed to make it easier to find a range of well documented, readily linkable SDOH variables across domains without having to access multiple source files, facilitating SDOH research and analysis.Variables in the files correspond to five key SDOH domains: social context (e.g., age, race/ethnicity, veteran status), economic context (e.g., income, unemployment rate), education, physical infrastructure (e.g, housing, crime, transportation), and healthcare context (e.g., health insurance). The files can be linked to other data by geography (county, ZIP Code, and census tract). The database includes data files and codebooks by year at three levels of geography, as well as a documentation file.The data contained in the SDOH database are drawn from multiple sources and variables may have differing availability, patterns of missing, and methodological considerations across sources, geographies, and years. Users should refer to the data source documentation and codebooks, as well as the original data sources, to help identify these patterns

  6. d

    Global Web Data | Web Scraping Data | Job Postings Data | Source: Company...

    • datarade.ai
    .json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 232M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads
    Explore at:
    .jsonAvailable download formats
    Dataset authored and provided by
    PredictLeads
    Area covered
    Bosnia and Herzegovina, Guadeloupe, French Guiana, Comoros, Virgin Islands (British), Northern Mariana Islands, Bonaire, Kuwait, El Salvador, Kosovo
    Description

    PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

    Key Features:

    ✅232M+ Job Postings Tracked – Data sourced from 92 Million company websites worldwide. ✅7,1M+ Active Job Openings – Updated in real-time to reflect hiring demand. ✅Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels. ✅Technology & Skill Tracking – Identify emerging tech trends and industry demands. ✅Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals. ✅Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

    Primary Attributes:

    • id (string, UUID) – Unique identifier for the job posting.
    • type (string, constant: "job_opening") – Object type.
    • title (string) – Job title.
    • description (string) – Full job description, extracted from the job listing.
    • url (string, URL) – Direct link to the job posting.
    • first_seen_at – Timestamp when the job was first detected.
    • last_seen_at – Timestamp when the job was last detected.
    • last_processed_at – Timestamp when the job data was last processed.

    Job Metadata:

    • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
    • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
    • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
    • status (string) – Job status (e.g., "open", "closed").
    • language (string) – Language of the job posting.
    • location (string) – Full location details as listed in the job description.
    • Location Data (location_data) (array of objects)
    • city (string, nullable) – City where the job is located.
    • state (string, nullable) – State or region of the job location.
    • zip_code (string, nullable) – Postal/ZIP code.
    • country (string, nullable) – Country where the job is located.
    • region (string, nullable) – Broader geographical region.
    • continent (string, nullable) – Continent name.
    • fuzzy_match (boolean) – Indicates whether the location was inferred.

    Salary Data (salary_data)

    • salary (string) – Salary range extracted from the job listing.
    • salary_low (float, nullable) – Minimum salary in original currency.
    • salary_high (float, nullable) – Maximum salary in original currency.
    • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
    • salary_low_usd (float, nullable) – Converted minimum salary in USD.
    • salary_high_usd (float, nullable) – Converted maximum salary in USD.
    • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").

    Occupational Data (onet_data) (object, nullable)

    • code (string, nullable) – ONET occupation code.
    • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
    • occupation_name (string, nullable) – Official ONET occupation title.

    Additional Attributes:

    • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").

    📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

    PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset

  7. U

    Database used for the evaluation of data used to identify groundwater...

    • data.usgs.gov
    • gimi9.com
    • +1more
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Eliza Gross, Database used for the evaluation of data used to identify groundwater sources under the direct influence of surface water in Pennsylvania [Dataset]. http://doi.org/10.5066/P9Q0BXH1
    Explore at:
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Authors
    Eliza Gross
    License

    U.S. Government Workshttps://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    1920 - 2016
    Area covered
    Pennsylvania
    Description

    The U.S. Geological Survey (USGS), in cooperation with the Pennsylvania Department of Environmental Protection (PADEP), conducted an evaluation of data used by the PADEP to identify groundwater sources under the direct influence of surface water (GUDI) in Pennsylvania (Gross and others, 2022). The data used in this evaluation and the processes used to compile them from multiple sources are described and provided herein. Data were compiled primarily but not exclusively from PADEP resources, including (1) source-information for public water-supply systems and Microscopic Particulate Analysis (MPA) results for public water-supply system groundwater sources from the agency’s Pennsylvania Drinking Water Information System (PADWIS) database (Pennsylvania Department of Environmental Protection, 2016), and (2) results associated with MPA testing from the PADEP Bureau of Laboratories (BOL) files and water-quality analyses obtained from the PADEP BOL, Sample Information System (Pennsylvania ...

  8. a

    GIS Data Sources

    • hub.arcgis.com
    Updated Apr 2, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    King County (2024). GIS Data Sources [Dataset]. https://hub.arcgis.com/documents/kingcounty::gis-data-sources?uiVersion=content-views
    Explore at:
    Dataset updated
    Apr 2, 2024
    Dataset authored and provided by
    King County
    Area covered
    Description

    This page is an index of all the data sources that the GIS Center has to offer. If you're looking for anything, you'll find it here!

  9. Data from: Inventory of online public databases and repositories holding...

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Apr 21, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Agricultural Research Service (2025). Inventory of online public databases and repositories holding agricultural data in 2017 [Dataset]. https://catalog.data.gov/dataset/inventory-of-online-public-databases-and-repositories-holding-agricultural-data-in-2017-d4c81
    Explore at:
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Servicehttps://www.ars.usda.gov/
    Description

    United States agricultural researchers have many options for making their data available online. This dataset aggregates the primary sources of ag-related data and determines where researchers are likely to deposit their agricultural data. These data serve as both a current landscape analysis and also as a baseline for future studies of ag research data. Purpose As sources of agricultural data become more numerous and disparate, and collaboration and open data become more expected if not required, this research provides a landscape inventory of online sources of open agricultural data. An inventory of current agricultural data sharing options will help assess how the Ag Data Commons, a platform for USDA-funded data cataloging and publication, can best support data-intensive and multi-disciplinary research. It will also help agricultural librarians assist their researchers in data management and publication. The goals of this study were to establish where agricultural researchers in the United States-- land grant and USDA researchers, primarily ARS, NRCS, USFS and other agencies -- currently publish their data, including general research data repositories, domain-specific databases, and the top journals compare how much data is in institutional vs. domain-specific vs. federal platforms determine which repositories are recommended by top journals that require or recommend the publication of supporting data ascertain where researchers not affiliated with funding or initiatives possessing a designated open data repository can publish data Approach The National Agricultural Library team focused on Agricultural Research Service (ARS), Natural Resources Conservation Service (NRCS), and United States Forest Service (USFS) style research data, rather than ag economics, statistics, and social sciences data. To find domain-specific, general, institutional, and federal agency repositories and databases that are open to US research submissions and have some amount of ag data, resources including re3data, libguides, and ARS lists were analysed. Primarily environmental or public health databases were not included, but places where ag grantees would publish data were considered. Search methods We first compiled a list of known domain specific USDA / ARS datasets / databases that are represented in the Ag Data Commons, including ARS Image Gallery, ARS Nutrition Databases (sub-components), SoyBase, PeanutBase, National Fungus Collection, i5K Workspace @ NAL, and GRIN. We then searched using search engines such as Bing and Google for non-USDA / federal ag databases, using Boolean variations of “agricultural data” /“ag data” / “scientific data” + NOT + USDA (to filter out the federal / USDA results). Most of these results were domain specific, though some contained a mix of data subjects. We then used search engines such as Bing and Google to find top agricultural university repositories using variations of “agriculture”, “ag data” and “university” to find schools with agriculture programs. Using that list of universities, we searched each university web site to see if their institution had a repository for their unique, independent research data if not apparent in the initial web browser search. We found both ag specific university repositories and general university repositories that housed a portion of agricultural data. Ag specific university repositories are included in the list of domain-specific repositories. Results included Columbia University – International Research Institute for Climate and Society, UC Davis – Cover Crops Database, etc. If a general university repository existed, we determined whether that repository could filter to include only data results after our chosen ag search terms were applied. General university databases that contain ag data included Colorado State University Digital Collections, University of Michigan ICPSR (Inter-university Consortium for Political and Social Research), and University of Minnesota DRUM (Digital Repository of the University of Minnesota). We then split out NCBI (National Center for Biotechnology Information) repositories. Next we searched the internet for open general data repositories using a variety of search engines, and repositories containing a mix of data, journals, books, and other types of records were tested to determine whether that repository could filter for data results after search terms were applied. General subject data repositories include Figshare, Open Science Framework, PANGEA, Protein Data Bank, and Zenodo. Finally, we compared scholarly journal suggestions for data repositories against our list to fill in any missing repositories that might contain agricultural data. Extensive lists of journals were compiled, in which USDA published in 2012 and 2016, combining search results in ARIS, Scopus, and the Forest Service's TreeSearch, plus the USDA web sites Economic Research Service (ERS), National Agricultural Statistics Service (NASS), Natural Resources and Conservation Service (NRCS), Food and Nutrition Service (FNS), Rural Development (RD), and Agricultural Marketing Service (AMS). The top 50 journals' author instructions were consulted to see if they (a) ask or require submitters to provide supplemental data, or (b) require submitters to submit data to open repositories. Data are provided for Journals based on a 2012 and 2016 study of where USDA employees publish their research studies, ranked by number of articles, including 2015/2016 Impact Factor, Author guidelines, Supplemental Data?, Supplemental Data reviewed?, Open Data (Supplemental or in Repository) Required? and Recommended data repositories, as provided in the online author guidelines for each the top 50 journals. Evaluation We ran a series of searches on all resulting general subject databases with the designated search terms. From the results, we noted the total number of datasets in the repository, type of resource searched (datasets, data, images, components, etc.), percentage of the total database that each term comprised, any dataset with a search term that comprised at least 1% and 5% of the total collection, and any search term that returned greater than 100 and greater than 500 results. We compared domain-specific databases and repositories based on parent organization, type of institution, and whether data submissions were dependent on conditions such as funding or affiliation of some kind. Results A summary of the major findings from our data review: Over half of the top 50 ag-related journals from our profile require or encourage open data for their published authors. There are few general repositories that are both large AND contain a significant portion of ag data in their collection. GBIF (Global Biodiversity Information Facility), ICPSR, and ORNL DAAC were among those that had over 500 datasets returned with at least one ag search term and had that result comprise at least 5% of the total collection. Not even one quarter of the domain-specific repositories and datasets reviewed allow open submission by any researcher regardless of funding or affiliation. See included README file for descriptions of each individual data file in this dataset. Resources in this dataset:Resource Title: Journals. File Name: Journals.csvResource Title: Journals - Recommended repositories. File Name: Repos_from_journals.csvResource Title: TDWG presentation. File Name: TDWG_Presentation.pptxResource Title: Domain Specific ag data sources. File Name: domain_specific_ag_databases.csvResource Title: Data Dictionary for Ag Data Repository Inventory. File Name: Ag_Data_Repo_DD.csvResource Title: General repositories containing ag data. File Name: general_repos_1.csvResource Title: README and file inventory. File Name: README_InventoryPublicDBandREepAgData.txt

  10. f

    Objective assessment of database quality for use in the automotive research...

    • fat-ev.de
    Updated Apr 18, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2023). Objective assessment of database quality for use in the automotive research and development process – Part 2 [Dataset]. https://www.fat-ev.de/de/publikationen/publication/objective-assessment-of-database-quality-for-use-in-the-automotive-research-and-development-process---part-2
    Explore at:
    Dataset updated
    Apr 18, 2023
    Description

    Road traffic accidents remain to be a leading cause of death worldwide with roughly 1.3 million fatalities annually. To develop new safety approaches according to real-world challenges, accurate information is needed. Therefore, road safety experts are constantly looking for real-world data to answer the open challenges and ultimately reach “Vision Zero”.The Global Safety Database (GSD) offers access to an one of its kind up-to-date repository of road traffic accident statistics and databases on a meta-data level for road safety analyses.One main objective is the compilation of international data sources, for which a data management system has been developed. In addition to the inventory of road accident data sources, a questionnaire created by road safety experts is used to check the applicability of data sources for specific questions. Therefore, an automated and dynamic matching process enables comparing variables representing the questions with the existing data source content in the GSD. The results are stored in a result matrix which indicates the proportion of variables that correspond to the variables necessary to answer the research question for each data source investigated. In order to identify similarities and differences in road safety within the countries, a clustering methodology is developed to point out the possibilities and limitations of projecting information from the initial countries to other areas. The assessment of the representativeness of the individual data sources is the basis for the clustering. From a general perspective, the GSD is an essential tool pushing forward the worldwide harmonisation of traffic accident statistics and databases. Knowledge about the real-world accident scenery by bringing important databases together empowers the data-driven development which is eventually a key bringing us one step closer to a road system without casualties, the achievement of the Vision Zero.

  11. f

    Data sources used to determine the hazard rates for progression-free...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Oct 13, 2015
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Valant, Jason; Scher, Howard I.; Todd, Mary B.; Solo, Kirk; Mehra, Maneesha (2015). Data sources used to determine the hazard rates for progression-free survival and overall survival associated with each clinical state, and the survival estimates derived from these publications for inclusion into the model. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001903207
    Explore at:
    Dataset updated
    Oct 13, 2015
    Authors
    Valant, Jason; Scher, Howard I.; Todd, Mary B.; Solo, Kirk; Mehra, Maneesha
    Description

    *The distribution of patients flowing from nmCRPC to mCRPC that has not been treated with or not progressed on chemotherapy was determined based on Oudard et al 2009 [23].PSA, prostate-specific antigen; nmCRPC, non-metastatic castration-resistant prostate cancer; mCRPC, metastatic castration-resistant prostate cancer; NA, not applicable.Data sources used to determine the hazard rates for progression-free survival and overall survival associated with each clinical state, and the survival estimates derived from these publications for inclusion into the model.

  12. Z

    Data from: Bibliographic dataset characterizing studies that use online...

    • data-staging.niaid.nih.gov
    • portalcientifico.unav.edu
    • +1more
    Updated Jan 24, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ball-Damerow, Joan E.; Brenskelle, Laura; Barve, Narayani; LaFrance, Raphael; Soltis, Pamela S.; Sierwald, Petra; Bieler, Rüdiger; Ariño, Arturo; Guralnick, Robert (2020). Bibliographic dataset characterizing studies that use online biodiversity databases [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_2589438
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Field Museum of Natural History
    Department of Environmental Biology, Universidad de Navarra
    Florida Museum of Natural History, University of Florida, Gainesville
    Authors
    Ball-Damerow, Joan E.; Brenskelle, Laura; Barve, Narayani; LaFrance, Raphael; Soltis, Pamela S.; Sierwald, Petra; Bieler, Rüdiger; Ariño, Arturo; Guralnick, Robert
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes bibliographic information for 501 papers that were published from 2010-April 2017 (time of search) and use online biodiversity databases for research purposes. Our overarching goal in this study is to determine how research uses of biodiversity data developed during a time of unprecedented growth of online data resources. We also determine uses with the highest number of citations, how online occurrence data are linked to other data types, and if/how data quality is addressed. Specifically, we address the following questions:

    1.) What primary biodiversity databases have been cited in published research, and which

     databases have been cited most often?
    

    2.) Is the biodiversity research community citing databases appropriately, and are

     the cited databases currently accessible online?
    

    3.) What are the most common uses, general taxa addressed, and data linkages, and how

     have they changed over time?
    

    4.) What uses have the highest impact, as measured through the mean number of citations

     per year?
    

    5.) Are certain uses applied more often for plants/invertebrates/vertebrates?

    6.) Are links to specific data types associated more often with particular uses?

    7.) How often are major data quality issues addressed?

    8.) What data quality issues tend to be addressed for the top uses?

    Relevant papers for this analysis include those that use online and openly accessible primary occurrence records, or those that add data to an online database. Google Scholar (GS) provides full-text indexing, which was important to identify data sources that often appear buried in the methods section of a paper. Our search was therefore restricted to GS. All authors discussed and agreed upon representative search terms, which were relatively broad to capture a variety of databases hosting primary occurrence records. The terms included: “species occurrence” database (8,800 results), “natural history collection” database (634 results), herbarium database (16,500 results), “biodiversity database” (3,350 results), “primary biodiversity data” database (483 results), “museum collection” database (4,480 results), “digital accessible information” database (10 results), and “digital accessible knowledge” database (52 results)--note that quotations are used as part of the search terms where specific phrases are needed in whole. We downloaded all records returned by each search (or the first 500 if there were more) into a Zotero reference management database. About one third of the 2500 papers in the final dataset were relevant. Three of the authors with specialized knowledge of the field characterized relevant papers using a standardized tagging protocol based on a series of key topics of interest. We developed a list of potential tags and descriptions for each topic, including: database(s) used, database accessibility, scale of study, region of study, taxa addressed, research use of data, other data types linked to species occurrence data, data quality issues addressed, authors, institutions, and funding sources. Each tagged paper was thoroughly checked by a second tagger.

    The final dataset of tagged papers allow us to quantify general areas of research made possible by the expansion of online species occurrence databases, and trends over time. Analyses of this data will be published in a separate quantitative review.

  13. D

    Data Quality Coverage Analytics Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dataintelo (2025). Data Quality Coverage Analytics Market Research Report 2033 [Dataset]. https://dataintelo.com/report/data-quality-coverage-analytics-market
    Explore at:
    pdf, pptx, csvAvailable download formats
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policyhttps://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Quality Coverage Analytics Market Outlook



    According to our latest research, the global Data Quality Coverage Analytics market size stood at USD 2.8 billion in 2024, reflecting a robust expansion driven by the accelerating digital transformation across enterprises worldwide. The market is projected to grow at a CAGR of 16.4% during the forecast period, reaching a forecasted size of USD 11.1 billion by 2033. This remarkable growth trajectory is underpinned by the increasing necessity for accurate, reliable, and actionable data to fuel strategic business decisions, regulatory compliance, and operational optimization in an increasingly data-centric business landscape.




    One of the primary growth factors for the Data Quality Coverage Analytics market is the exponential surge in data generation from diverse sources, including IoT devices, enterprise applications, social media platforms, and cloud-based environments. This data explosion has brought to the forefront the critical need for robust data quality management solutions that ensure the integrity, consistency, and reliability of data assets. Organizations across sectors are recognizing that poor data quality can lead to significant operational inefficiencies, flawed analytics outcomes, and increased compliance risks. As a result, there is a heightened demand for advanced analytics tools that can provide comprehensive coverage of data quality metrics, automate data profiling, and offer actionable insights for continuous improvement.




    Another significant driver fueling the market's expansion is the tightening regulatory landscape across industries such as BFSI, healthcare, and government. Regulatory frameworks like GDPR, HIPAA, and SOX mandate stringent data quality standards and audit trails, compelling organizations to invest in sophisticated data quality analytics solutions. These tools not only help organizations maintain compliance but also enhance their ability to detect anomalies, prevent data breaches, and safeguard sensitive information. Furthermore, the integration of artificial intelligence and machine learning into data quality analytics platforms is enabling more proactive and predictive data quality management, which is further accelerating market adoption.




    The growing emphasis on data-driven decision-making within enterprises is also playing a pivotal role in propelling the Data Quality Coverage Analytics market. As organizations strive to leverage business intelligence and advanced analytics for competitive advantage, the importance of high-quality, well-governed data becomes paramount. Data quality analytics platforms empower organizations to identify data inconsistencies, rectify errors, and maintain a single source of truth, thereby unlocking the full potential of their data assets. This trend is particularly pronounced in industries such as retail, manufacturing, and telecommunications, where real-time insights derived from accurate data can drive operational efficiencies, enhance customer experiences, and support innovation.




    From a regional perspective, North America currently dominates the Data Quality Coverage Analytics market due to the high concentration of technology-driven enterprises, early adoption of advanced analytics solutions, and robust regulatory frameworks. However, the Asia Pacific region is witnessing the fastest growth, fueled by rapid digitalization, increasing investments in cloud infrastructure, and the emergence of data-driven business models across key economies such as China, India, and Japan. Europe also represents a significant market, driven by stringent data protection regulations and the widespread adoption of data governance initiatives. Latin America and the Middle East & Africa are gradually catching up, as organizations in these regions recognize the strategic value of data quality in driving business transformation.



    Component Analysis



    The Component segment of the Data Quality Coverage Analytics market is bifurcated into software and services, each playing a crucial role in enabling organizations to achieve comprehensive data quality management. The software segment encompasses a wide range of solutions, including data profiling, cleansing, enrichment, monitoring, and reporting tools. These software solutions are designed to automate and streamline the process of identifying and rectifying data quality issues across diverse data sources and formats. As organizations increasingly adopt cloud-base

  14. d

    Hourly wind speed in miles per hour and three-digit data-source flag...

    • catalog.data.gov
    • data.usgs.gov
    • +2more
    Updated Oct 1, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Geological Survey (2025). Hourly wind speed in miles per hour and three-digit data-source flag associated with the data, January 1, 1948 - September 30, 2015 [Dataset]. https://catalog.data.gov/dataset/hourly-wind-speed-in-miles-per-hour-and-three-digit-data-source-flag-associated-with-th-30
    Explore at:
    Dataset updated
    Oct 1, 2025
    Dataset provided by
    U.S. Geological Survey
    Description

    The text file "Wind speed.txt" contains hourly data and associated data-source flag from January 1, 1948, to September 30, 2015. The primary source of the data is the Argonne National Laboratory, Illinois. The first four columns give year, month, day and hour of the observation. Column 5 is the data in miles per hour. Column 6 is the three-digit data-source flag to identify the wind speed data processing and they indicate if the data are original or missing, the method that was used to fill the missing periods, and any other transformations of the data. The data-source flag consist of a three-digit sequence in the form "xyz" that describe the origin and transformations of the data values. The user of the data should consult Over and others (2010) for the detailed documentation of this hourly data-source flag series. Reference Cited: Over, T.M., Price, T.H., and Ishii, A.L., 2010, Development and analysis of a meteorological database, Argonne National Laboratory, Illinois: U.S. Geological Survey Open File Report 2010-1220, 67 p., http://pubs.usgs.gov/of/2010/1220/.

  15. d

    Doorda UK Vulnerability Data | Location Data | 1.8M Postcodes from 30 Data...

    • datarade.ai
    .csv
    Updated Nov 5, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Doorda (2024). Doorda UK Vulnerability Data | Location Data | 1.8M Postcodes from 30 Data Sources | Location Intelligence and Analytics [Dataset]. https://datarade.ai/data-products/doorda-uk-vulnerability-data-property-data-34m-addresses-doorda
    Explore at:
    .csvAvailable download formats
    Dataset updated
    Nov 5, 2024
    Dataset authored and provided by
    Doorda
    Area covered
    United Kingdom
    Description

    Doorda's UK Vulnerability Data provides a comprehensive database of over 1.8M postcodes sourced from 30 data sources, offering unparalleled insights for location intelligence and analytics purposes.

    Volume and stats: - 1.8M Postcodes - 5 Vulnerability areas covered - 1-100 Vulnerability rating

    Our Residential Real Estate Data offers a multitude of use cases: - Market Analysis - Identify Vulnerable Consumers - Mitigate Lead Generation Risk - Risk Management - Location Planning

    The key benefits of leveraging our Residential Real Estate Data include: - Data Accuracy - Informed Decision-Making - Competitive Advantage - Efficiency - Single Source

    Covering a wide range of industries and sectors, our data empowers organisations to make informed decisions, uncover market trends, and gain a competitive edge in the UK market.

  16. m

    COVID-19 Combined Data-set with Improved Measurement Errors

    • data.mendeley.com
    Updated May 13, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Afshin Ashofteh (2020). COVID-19 Combined Data-set with Improved Measurement Errors [Dataset]. http://doi.org/10.17632/nw5m4hs3jr.3
    Explore at:
    Dataset updated
    May 13, 2020
    Authors
    Afshin Ashofteh
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Public health-related decision-making on policies aimed at controlling the COVID-19 pandemic outbreak depends on complex epidemiological models that are compelled to be robust and use all relevant available data. This data article provides a new combined worldwide COVID-19 dataset obtained from official data sources with improved systematic measurement errors and a dedicated dashboard for online data visualization and summary. The dataset adds new measures and attributes to the normal attributes of official data sources, such as daily mortality, and fatality rates. We used comparative statistical analysis to evaluate the measurement errors of COVID-19 official data collections from the Chinese Center for Disease Control and Prevention (Chinese CDC), World Health Organization (WHO) and European Centre for Disease Prevention and Control (ECDC). The data is collected by using text mining techniques and reviewing pdf reports, metadata, and reference data. The combined dataset includes complete spatial data such as countries area, international number of countries, Alpha-2 code, Alpha-3 code, latitude, longitude, and some additional attributes such as population. The improved dataset benefits from major corrections on the referenced data sets and official reports such as adjustments in the reporting dates, which suffered from a one to two days lag, removing negative values, detecting unreasonable changes in historical data in new reports and corrections on systematic measurement errors, which have been increasing as the pandemic outbreak spreads and more countries contribute data for the official repositories. Additionally, the root mean square error of attributes in the paired comparison of datasets was used to identify the main data problems. The data for China is presented separately and in more detail, and it has been extracted from the attached reports available on the main page of the CCDC website. This dataset is a comprehensive and reliable source of worldwide COVID-19 data that can be used in epidemiological models assessing the magnitude and timeline for confirmed cases, long-term predictions of deaths or hospital utilization, the effects of quarantine, stay-at-home orders and other social distancing measures, the pandemic’s turning point or in economic and social impact analysis, helping to inform national and local authorities on how to implement an adaptive response approach to re-opening the economy, re-open schools, alleviate business and social distancing restrictions, design economic programs or allow sports events to resume.

  17. Global Real World Evidence Solutions Market Size By Data Source (Electronic...

    • verifiedmarketresearch.com
    Updated Oct 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    VERIFIED MARKET RESEARCH (2025). Global Real World Evidence Solutions Market Size By Data Source (Electronic Health Records, Claims Data, Registries, Medical Devices), By Therapeutic Area (Oncology, Cardiovascular Diseases, Neurology, Rare Diseases), By Application (Drug Development, Clinical Decision Support, Epidemiological Studies, Post-Marketing Surveillance), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/real-world-evidence-solutions-market/
    Explore at:
    Dataset updated
    Oct 6, 2025
    Dataset provided by
    Verified Market Researchhttps://www.verifiedmarketresearch.com/
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2026 - 2032
    Area covered
    Global
    Description

    Real World Evidence Solutions Market size was valued at USD 1.30 Billion in 2024 and is projected to reach USD 3.71 Billion by 2032, growing at a CAGR of 13.92% during the forecast period 2026-2032.Global Real World Evidence Solutions Market DriversThe market drivers for the Real World Evidence Solutions Market can be influenced by various factors. These may include:Growing Need for Evidence-Based Healthcare: Real-world evidence (RWE) is becoming more and more important in healthcare decision-making, according to stakeholders such as payers, providers, and regulators. In addition to traditional clinical trial data, RWE solutions offer important insights into the efficacy, safety, and value of healthcare interventions in real-world situations.Growing Use of RWE by Pharmaceutical Companies: RWE solutions are being used by pharmaceutical companies to assist with market entry, post-marketing surveillance, and drug development initiatives. Pharmaceutical businesses can find new indications for their current medications, improve clinical trial designs, and convince payers and providers of the worth of their products with the use of RWE.Increasing Priority for Value-Based Healthcare: The emphasis on proving the cost- and benefit-effectiveness of healthcare interventions in real-world settings is growing as value-based healthcare models gain traction. To assist value-based decision-making, RWE solutions are essential in evaluating the economic effect and real-world consequences of healthcare interventions.Technological and Data Analytics Advancements: RWE solutions are becoming more capable due to advances in machine learning, artificial intelligence, and big data analytics. With the use of these technologies, healthcare stakeholders can obtain actionable insights from the analysis of vast and varied datasets, including patient-generated data, claims data, and electronic health records.Regulatory Support for RWE Integration: RWE is being progressively integrated into regulatory decision-making processes by regulatory organisations including the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA). The FDA's Real-World Evidence Programme and the EMA's Adaptive Pathways and PRIority MEdicines (PRIME) programme are two examples of initiatives that are making it easier to incorporate RWE into regulatory submissions and drug development.Increasing Emphasis on Patient-Centric Healthcare: The value of patient-reported outcomes and real-world experiences in healthcare decision-making is becoming more widely acknowledged. RWE technologies facilitate the collection and examination of patient-centered data, offering valuable insights into treatment efficacy, patient inclinations, and quality of life consequences.Extension of RWE Use Cases: RWE solutions are being used in medication development, post-market surveillance, health economics and outcomes research (HEOR), comparative effectiveness research, and market access, among other healthcare fields. The necessity for a variety of RWE solutions catered to the needs of different stakeholders is being driven by the expansion of RWE use cases.

  18. Summary of model objectives.

    • plos.figshare.com
    xls
    Updated Jun 15, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Kate M. Johnson; Boshen Jiao; M. A. Bender; Scott D. Ramsey; Beth Devine; Anirban Basu (2023). Summary of model objectives. [Dataset]. http://doi.org/10.1371/journal.pone.0267448.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 15, 2023
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Kate M. Johnson; Boshen Jiao; M. A. Bender; Scott D. Ramsey; Beth Devine; Anirban Basu
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of model objectives.

  19. I

    Self-citation analysis data based on PubMed Central subset (2002-2005)

    • databank.illinois.edu
    • aws-databank-alb.library.illinois.edu
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shubhanshu Mishra; Brent D Fegley; Jana Diesner; Vetle I. Torvik, Self-citation analysis data based on PubMed Central subset (2002-2005) [Dataset]. http://doi.org/10.13012/B2IDB-9665377_V1
    Explore at:
    Authors
    Shubhanshu Mishra; Brent D Fegley; Jana Diesner; Vetle I. Torvik
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    U.S. National Institutes of Health (NIH)
    U.S. National Science Foundation (NSF)
    Description

    Self-citation analysis data based on PubMed Central subset (2002-2005) ---------------------------------------------------------------------- Created by Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle Torvik on April 5th, 2018 ## Introduction This is a dataset created as part of the publication titled: Mishra S, Fegley BD, Diesner J, Torvik VI (2018) Self-Citation is the Hallmark of Productive Authors, of Any Gender. PLOS ONE. It contains files for running the self citation analysis on articles published in PubMed Central between 2002 and 2005, collected in 2015. The dataset is distributed in the form of the following tab separated text files: * Training_data_2002_2005_pmc_pair_First.txt (1.2G) - Data for first authors * Training_data_2002_2005_pmc_pair_Last.txt (1.2G) - Data for last authors * Training_data_2002_2005_pmc_pair_Middle_2nd.txt (964M) - Data for middle 2nd authors * Training_data_2002_2005_pmc_pair_txt.header.txt - Header for the data * COLUMNS_DESC.txt file - Descriptions of all columns * model_text_files.tar.gz - Text files containing model coefficients and scores for model selection. * results_all_model.tar.gz - Model coefficient and result files in numpy format used for plotting purposes. v4.reviewer contains models for analysis done after reviewer comments. * README.txt file ## Dataset creation Our experiments relied on data from multiple sources including properitery data from Thompson Rueter's (now Clarivate Analytics) Web of Science collection of MEDLINE citations. Author's interested in reproducing our experiments should personally request from Clarivate Analytics for this data. However, we do make a similar but open dataset based on citations from PubMed Central which can be utilized to get similar results to those reported in our analysis. Furthermore, we have also freely shared our datasets which can be used along with the citation datasets from Clarivate Analytics, to re-create the datased used in our experiments. These datasets are listed below. If you wish to use any of those datasets please make sure you cite both the dataset as well as the paper introducing the dataset. * MEDLINE 2015 baseline: https://www.nlm.nih.gov/bsd/licensee/2015_stats/baseline_doc.html * Citation data from PubMed Central (original paper includes additional citations from Web of Science) * Author-ity 2009 dataset: - Dataset citation: Torvik, Vetle I.; Smalheiser, Neil R. (2018): Author-ity 2009 - PubMed author name disambiguated dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4222651_V1 - Paper citation: Torvik, V. I., & Smalheiser, N. R. (2009). Author name disambiguation in MEDLINE. ACM Transactions on Knowledge Discovery from Data, 3(3), 1–29. https://doi.org/10.1145/1552303.1552304 - Paper citation: Torvik, V. I., Weeber, M., Swanson, D. R., & Smalheiser, N. R. (2004). A probabilistic similarity metric for Medline records: A model for author name disambiguation. Journal of the American Society for Information Science and Technology, 56(2), 140–158. https://doi.org/10.1002/asi.20105 * Genni 2.0 + Ethnea for identifying author gender and ethnicity: - Dataset citation: Torvik, Vetle (2018): Genni + Ethnea for the Author-ity 2009 dataset. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-9087546_V1 - Paper citation: Smith, B. N., Singh, M., & Torvik, V. I. (2013). A search engine approach to estimating temporal changes in gender orientation of first names. In Proceedings of the 13th ACM/IEEE-CS joint conference on Digital libraries - JCDL ’13. ACM Press. https://doi.org/10.1145/2467696.2467720 - Paper citation: Torvik VI, Agarwal S. Ethnea -- an instance-based ethnicity classifier based on geo-coded author names in a large-scale bibliographic database. International Symposium on Science of Science March 22-23, 2016 - Library of Congress, Washington DC, USA. http://hdl.handle.net/2142/88927 * MapAffil for identifying article country of affiliation: - Dataset citation: Torvik, Vetle I. (2018): MapAffil 2016 dataset -- PubMed author affiliations mapped to cities and their geocodes worldwide. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4354331_V1 - Paper citation: Torvik VI. MapAffil: A Bibliographic Tool for Mapping Author Affiliation Strings to Cities and Their Geocodes Worldwide. D-Lib magazine : the magazine of the Digital Library Forum. 2015;21(11-12):10.1045/november2015-torvik * IMPLICIT journal similarity: - Dataset citation: Torvik, Vetle (2018): Author-implicit journal, MeSH, title-word, and affiliation-word pairs based on Author-ity 2009. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-4742014_V1 * Novelty dataset for identify article level novelty: - Dataset citation: Mishra, Shubhanshu; Torvik, Vetle I. (2018): Conceptual novelty scores for PubMed articles. University of Illinois at Urbana-Champaign. https://doi.org/10.13012/B2IDB-5060298_V1 - Paper citation: Mishra S, Torvik VI. Quantifying Conceptual Novelty in the Biomedical Literature. D-Lib magazine : The Magazine of the Digital Library Forum. 2016;22(9-10):10.1045/september2016-mishra - Code: https://github.com/napsternxg/Novelty * Expertise dataset for identifying author expertise on articles: * Source code provided at: https://github.com/napsternxg/PubMed_SelfCitationAnalysis Note: The dataset is based on a snapshot of PubMed (which includes Medline and PubMed-not-Medline records) taken in the first week of October, 2016. Check here for information to get PubMed/MEDLINE, and NLMs data Terms and Conditions Additional data related updates can be found at Torvik Research Group ## Acknowledgments This work was made possible in part with funding to VIT from NIH grant P01AG039347 and NSF grant 1348742. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. ## License Self-citation analysis data based on PubMed Central subset (2002-2005) by Shubhanshu Mishra, Brent D. Fegley, Jana Diesner, and Vetle Torvik is licensed under a Creative Commons Attribution 4.0 International License. Permissions beyond the scope of this license may be available at https://github.com/napsternxg/PubMed_SelfCitationAnalysis.

  20. d

    Hourly solar radiation in Langleys and three-digit data-source flag...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 20, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. Geological Survey (2025). Hourly solar radiation in Langleys and three-digit data-source flag associated with the data, January 1, 1948 - September 30, 2015 [Dataset]. https://catalog.data.gov/dataset/hourly-solar-radiation-in-langleys-and-three-digit-data-source-flag-associated-with-the-30
    Explore at:
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Description

    The text file "Solar radiation.txt" contains hourly data and associated data-source flag from January 1, 1948, to September 30, 2015. The primary source of the data is the Argonne National Laboratory, Illinois. The first four columns give year, month, day and hour of the observation. Column 5 is the data in Langleys. Column 6 is the three-digit data-source flag to identify the solar radiation data processing and they indicate if the data are original or missing, the method that was used to fill the missing periods, and any other transformations of the data. Bera (2014) describes in detail an addition of a new flag based on the regression analysis of the backup data series at St. Charles (STC) for water years (WY) 2008–10. The user of the data should consult Over and others (2010) and Bera (2014) for the detailed documentation of this hourly data-source flag series. Reference Cited: Over, T.M., Price, T.H., and Ishii, A.L., 2010, Development and analysis of a meteorological database, Argonne National Laboratory, Illinois: U.S. Geological Survey Open File Report 2010-1220, 67 p., http://pubs.usgs.gov/of/2010/1220/. Bera, M., 2014, Watershed Data Management (WDM) database for Salt Creek streamflow simulation, DuPage County, Illinois, water years 2005-11: U.S. Geological Survey Data Series 870, 18 p., http://dx.doi.org/10.3133/ds870.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes (2023). DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study.xlsx [Dataset]. http://doi.org/10.3389/fphar.2021.789872.s002
Organization logo

DataSheet2_Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study.xlsx

Related Article
Explore at:
xlsxAvailable download formats
Dataset updated
Jun 1, 2023
Dataset provided by
Frontiers Mediahttp://www.frontiersin.org/
Authors
Lisiane Freitas Leal; Claudia Garcia Serpa Osorio-de-Castro; Luiz Júpiter Carneiro de Souza; Felipe Ferre; Daniel Marques Mota; Marcia Ito; Monique Elseviers; Elisangela da Costa Lima; Ivan Ricardo Zimmernan; Izabela Fulone; Monica Da Luz Carvalho-Soares; Luciane Cruz Lopes
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Brazil
Description

Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR).Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source.Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online; providing information at national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source.Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

Search
Clear search
Close search
Google apps
Main menu