100+ datasets found
  1. A

    Algeria DZ: SPI: Pillar 4 Data Sources Score: Scale 0-100

    • ceicdata.com
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    CEICdata.com, Algeria DZ: SPI: Pillar 4 Data Sources Score: Scale 0-100 [Dataset]. https://www.ceicdata.com/en/algeria/governance-policy-and-institutions/dz-spi-pillar-4-data-sources-score-scale-0100
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2016 - Dec 1, 2022
    Area covered
    Algeria
    Variables measured
    Money Market Rate
    Description

    Algeria DZ: SPI: Pillar 4 Data Sources Score: Scale 0-100 data was reported at 45.958 NA in 2022. This records a decrease from the previous number of 49.075 NA for 2021. Algeria DZ: SPI: Pillar 4 Data Sources Score: Scale 0-100 data is updated yearly, averaging 49.892 NA from Dec 2016 (Median) to 2022, with 7 observations. The data reached an all-time high of 52.417 NA in 2018 and a record low of 45.958 NA in 2022. Algeria DZ: SPI: Pillar 4 Data Sources Score: Scale 0-100 data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Algeria – Table DZ.World Bank.WDI: Governance: Policy and Institutions. The data sources overall score is a composity measure of whether countries have data available from the following sources: Censuses and surveys, administrative data, geospatial data, and private sector/citizen generated data. The data sources (input) pillar is segmented by four types of sources generated by (i) the statistical office (censuses and surveys), and sources accessed from elsewhere such as (ii) administrative data, (iii) geospatial data, and (iv) private sector data and citizen generated data. The appropriate balance between these source types will vary depending on a country’s institutional setting and the maturity of its statistical system. High scores should reflect the extent to which the sources being utilized enable the necessary statistical indicators to be generated. For example, a low score on environment statistics (in the data production pillar) may reflect a lack of use of (and low score for) geospatial data (in the data sources pillar). This type of linkage is inherent in the data cycle approach and can help highlight areas for investment required if country needs are to be met.;Statistical Performance Indicators, The World Bank (https://datacatalog.worldbank.org/dataset/statistical-performance-indicators);Weighted average;

  2. Sources of breached healthcare data in the U.S. 2023

    • statista.com
    Updated Nov 27, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Sources of breached healthcare data in the U.S. 2023 [Dataset]. https://www.statista.com/statistics/1274686/source-of-breached-healthcare-data-us/
    Explore at:
    Dataset updated
    Nov 27, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    2023
    Area covered
    United States
    Description

    A 2023 report on data breaches in the healthcare system in the United States revealed that in most incidents, the leaked data was located in the network server, with almost 70 percent of data breaches indicating this location. The second-most common location of breached data was e-mail, with over 18 percent of the cases, followed by paper or films, with nearly six percent of the cases.

  3. d

    Global Web Data | Web Scraping Data | Job Postings Data | Source: Company...

    • datarade.ai
    .json
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    PredictLeads, Global Web Data | Web Scraping Data | Job Postings Data | Source: Company Website | 232M+ Records [Dataset]. https://datarade.ai/data-products/predictleads-web-data-web-scraping-data-job-postings-dat-predictleads
    Explore at:
    .jsonAvailable download formats
    Dataset authored and provided by
    PredictLeads
    Area covered
    El Salvador, Guadeloupe, French Guiana, Comoros, Kuwait, Bonaire, Bosnia and Herzegovina, Kosovo, Virgin Islands (British), Northern Mariana Islands
    Description

    PredictLeads Job Openings Data provides high-quality hiring insights sourced directly from company websites - not job boards. Using advanced web scraping technology, our dataset offers real-time access to job trends, salaries, and skills demand, making it a valuable resource for B2B sales, recruiting, investment analysis, and competitive intelligence.

    Key Features:

    ✅232M+ Job Postings Tracked – Data sourced from 92 Million company websites worldwide. ✅7,1M+ Active Job Openings – Updated in real-time to reflect hiring demand. ✅Salary & Compensation Insights – Extract salary ranges, contract types, and job seniority levels. ✅Technology & Skill Tracking – Identify emerging tech trends and industry demands. ✅Company Data Enrichment – Link job postings to employer domains, firmographics, and growth signals. ✅Web Scraping Precision – Directly sourced from employer websites for unmatched accuracy.

    Primary Attributes:

    • id (string, UUID) – Unique identifier for the job posting.
    • type (string, constant: "job_opening") – Object type.
    • title (string) – Job title.
    • description (string) – Full job description, extracted from the job listing.
    • url (string, URL) – Direct link to the job posting.
    • first_seen_at – Timestamp when the job was first detected.
    • last_seen_at – Timestamp when the job was last detected.
    • last_processed_at – Timestamp when the job data was last processed.

    Job Metadata:

    • contract_types (array of strings) – Type of employment (e.g., "full time", "part time", "contract").
    • categories (array of strings) – Job categories (e.g., "engineering", "marketing").
    • seniority (string) – Seniority level of the job (e.g., "manager", "non_manager").
    • status (string) – Job status (e.g., "open", "closed").
    • language (string) – Language of the job posting.
    • location (string) – Full location details as listed in the job description.
    • Location Data (location_data) (array of objects)
    • city (string, nullable) – City where the job is located.
    • state (string, nullable) – State or region of the job location.
    • zip_code (string, nullable) – Postal/ZIP code.
    • country (string, nullable) – Country where the job is located.
    • region (string, nullable) – Broader geographical region.
    • continent (string, nullable) – Continent name.
    • fuzzy_match (boolean) – Indicates whether the location was inferred.

    Salary Data (salary_data)

    • salary (string) – Salary range extracted from the job listing.
    • salary_low (float, nullable) – Minimum salary in original currency.
    • salary_high (float, nullable) – Maximum salary in original currency.
    • salary_currency (string, nullable) – Currency of the salary (e.g., "USD", "EUR").
    • salary_low_usd (float, nullable) – Converted minimum salary in USD.
    • salary_high_usd (float, nullable) – Converted maximum salary in USD.
    • salary_time_unit (string, nullable) – Time unit for the salary (e.g., "year", "month", "hour").

    Occupational Data (onet_data) (object, nullable)

    • code (string, nullable) – ONET occupation code.
    • family (string, nullable) – Broad occupational family (e.g., "Computer and Mathematical").
    • occupation_name (string, nullable) – Official ONET occupation title.

    Additional Attributes:

    • tags (array of strings, nullable) – Extracted skills and keywords (e.g., "Python", "JavaScript").

    📌 Trusted by enterprises, recruiters, and investors for high-precision job market insights.

    PredictLeads Dataset: https://docs.predictleads.com/v3/guide/job_openings_dataset

  4. Data from: Comparison of NSDUH Health and Health Care Utilization Estimates...

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Substance Abuse and Mental Health Services Administration (2025). Comparison of NSDUH Health and Health Care Utilization Estimates to Other National Data Sources [Dataset]. https://catalog.data.gov/dataset/comparison-of-nsduh-health-and-health-care-utilization-estimates-to-other-national-data-so
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    Substance Abuse and Mental Health Services Administrationhttps://www.samhsa.gov/
    Description

    This report compares specific health conditions, overall health, and health care utilization prevalence estimates from the 2006 National Survey on Drug Use and Health (NSDUH) and other national data sources. Methodological differences among these data sources that may contribute to differences in estimates are described. In addition to NSDUH, three of the data sources use respondent self-reports to measure health characteristics and service utilization: the National Health Interview Survey (NHIS), the Behavioral Risk Factor Surveillance System (BRFSS), and the Medical Expenditure Panel Survey (MEPS). One survey, the National Health and Nutrition Examination Survey (NHANES), conducts initial interviews in respondents\' homes, collecting further data at nearby locations. Five data sources provide health care utilization data extracted from hospital records; these sources include the National Hospital Discharge Survey (NHDS), the Nationwide Inpatient Sample (NIS), the Nationwide Emergency Department Sample (NEDS), the National Health and Ambulatory Medical Care Survey (NHAMCS), and the Drug Abuse Warning Network (DAWN). Several methodological differences that could cause differences in estimates are discussed, including type and mode of data collection; weighting and representativeness of the sample; question placement, wording, and format; and use of proxy reporting for adolescents.

  5. o

    Net Zero Use Cases and Data Requirements

    • ukpowernetworks.opendatasoft.com
    csv, excel, json
    Updated Oct 7, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Net Zero Use Cases and Data Requirements [Dataset]. https://ukpowernetworks.opendatasoft.com/explore/dataset/top-30-use-cases/
    Explore at:
    excel, json, csvAvailable download formats
    Dataset updated
    Oct 7, 2025
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IntroductionFollowing the identification of Local Area Energy Planning (LAEP) use cases, this dataset lists the data sources and/or information that could help facilitate this research. View our dedicated page to find out how we derived this list: Local Area Energy Plan — UK Power Networks (opendatasoft.com)

    Methodological Approach Data upload: a list of datasets and ancillary details are uploaded into a static Excel file before uploaded onto the Open Data Portal.

    Quality Control Statement

    Quality Control Measures include: Manual review and correct of data inconsistencies Use of additional verification steps to ensure accuracy in the methodology

    Assurance Statement The Open Data Team and Local Net Zero Team worked together to ensure data accuracy and consistency.

    Other Download dataset information: Metadata (JSON)

    Definitions of key terms related to this dataset can be found in the Open Data Portal Glossary: https://ukpowernetworks.opendatasoft.com/pages/glossary/

    Please note that "number of records" in the top left corner is higher than the number of datasets available as many datasets are indexed against multiple use cases leading to them being counted as multiple records.

  6. Data sources’ characteristics*.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 2, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Giuseppe Roberto; Ingrid Leal; Naveed Sattar; A. Katrina Loomis; Paul Avillach; Peter Egger; Rients van Wijngaarden; David Ansell; Sulev Reisberg; Mari-Liis Tammesoo; Helene Alavere; Alessandro Pasqua; Lars Pedersen; James Cunningham; Lara Tramontan; Miguel A. Mayer; Ron Herings; Preciosa Coloma; Francesco Lapi; Miriam Sturkenboom; Johan van der Lei; Martijn J. Schuemie; Peter Rijnbeek; Rosa Gini (2023). Data sources’ characteristics*. [Dataset]. http://doi.org/10.1371/journal.pone.0160648.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOShttp://plos.org/
    Authors
    Giuseppe Roberto; Ingrid Leal; Naveed Sattar; A. Katrina Loomis; Paul Avillach; Peter Egger; Rients van Wijngaarden; David Ansell; Sulev Reisberg; Mari-Liis Tammesoo; Helene Alavere; Alessandro Pasqua; Lars Pedersen; James Cunningham; Lara Tramontan; Miguel A. Mayer; Ron Herings; Preciosa Coloma; Francesco Lapi; Miriam Sturkenboom; Johan van der Lei; Martijn J. Schuemie; Peter Rijnbeek; Rosa Gini
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sources’ characteristics*.

  7. w

    Data Use in Academia Dataset

    • datacatalog.worldbank.org
    csv, utf-8
    Updated Nov 27, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Semantic Scholar Open Research Corpus (S2ORC) (2023). Data Use in Academia Dataset [Dataset]. https://datacatalog.worldbank.org/search/dataset/0065200/data_use_in_academia_dataset
    Explore at:
    utf-8, csvAvailable download formats
    Dataset updated
    Nov 27, 2023
    Dataset provided by
    Semantic Scholar Open Research Corpus (S2ORC)
    Brian William Stacy
    License

    https://datacatalog.worldbank.org/public-licenses?fragment=cchttps://datacatalog.worldbank.org/public-licenses?fragment=cc

    Description

    This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.


    Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.


    We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and parsed PDF or latex file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF and latex file are important for extracting important information like the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the year 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical system: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.


    Due to the intensive computer resources required, a set of 1,037,748 articles were randomly selected from the 10 million articles in our restricted corpus as a convenience sample.


    The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.


    To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled non-standardly.


    The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify objects from text, utilizing the spaCy Python library. The Named Entity Recognition algorithm splits text into named entities, and NER is used in this project to identify countries of study in the academic articles. SpaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.


    The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:


    Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.

    There are two classification tasks in this exercise:

    1. identifying whether an academic article is using data from any country

    2. Identifying from which country that data came.

    For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using a survey data, uses data. Some clues to indicate that a study does use data includes whether a survey or census is described, a statistical model estimated, or a table or means or summary statistics is reported.

    After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]

    For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.

    We expect between 10 and 35 percent of all articles to use data.


    The median amount of time that a worker spent on an article, measured as the time between when the article was accepted to be classified by the worker and when the classification was submitted was 25.4 minutes. If human raters were exclusively used rather than machine learning tools, then the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review at a cost of $3,113,244, which assumes a cost of $3 per article as was paid to MTurk workers.


    A model is next trained on the 3,500 labelled articles. We use a distilled version of the BERT (bidirectional Encoder Representations for transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. (2018)). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT and retains 97% of the language understanding capabilities and is 60% faster (Sanh, Debut, Chaumond, Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. 900 articles were selected because of computational limitations in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.


    The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both humans and the model perform the same kind of errors, then the performance reported here will be overestimated.


    The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of

  8. Global Real World Evidence Solutions Market Size By Data Source (Electronic...

    • verifiedmarketresearch.com
    Updated Oct 6, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    VERIFIED MARKET RESEARCH (2025). Global Real World Evidence Solutions Market Size By Data Source (Electronic Health Records, Claims Data, Registries, Medical Devices), By Therapeutic Area (Oncology, Cardiovascular Diseases, Neurology, Rare Diseases), By Application (Drug Development, Clinical Decision Support, Epidemiological Studies, Post-Marketing Surveillance), By Geographic Scope And Forecast [Dataset]. https://www.verifiedmarketresearch.com/product/real-world-evidence-solutions-market/
    Explore at:
    Dataset updated
    Oct 6, 2025
    Dataset provided by
    Verified Market Researchhttps://www.verifiedmarketresearch.com/
    Authors
    VERIFIED MARKET RESEARCH
    License

    https://www.verifiedmarketresearch.com/privacy-policy/https://www.verifiedmarketresearch.com/privacy-policy/

    Time period covered
    2026 - 2032
    Area covered
    Global
    Description

    Real World Evidence Solutions Market size was valued at USD 1.30 Billion in 2024 and is projected to reach USD 3.71 Billion by 2032, growing at a CAGR of 13.92% during the forecast period 2026-2032.Global Real World Evidence Solutions Market DriversThe market drivers for the Real World Evidence Solutions Market can be influenced by various factors. These may include:Growing Need for Evidence-Based Healthcare: Real-world evidence (RWE) is becoming more and more important in healthcare decision-making, according to stakeholders such as payers, providers, and regulators. In addition to traditional clinical trial data, RWE solutions offer important insights into the efficacy, safety, and value of healthcare interventions in real-world situations.Growing Use of RWE by Pharmaceutical Companies: RWE solutions are being used by pharmaceutical companies to assist with market entry, post-marketing surveillance, and drug development initiatives. Pharmaceutical businesses can find new indications for their current medications, improve clinical trial designs, and convince payers and providers of the worth of their products with the use of RWE.Increasing Priority for Value-Based Healthcare: The emphasis on proving the cost- and benefit-effectiveness of healthcare interventions in real-world settings is growing as value-based healthcare models gain traction. To assist value-based decision-making, RWE solutions are essential in evaluating the economic effect and real-world consequences of healthcare interventions.Technological and Data Analytics Advancements: RWE solutions are becoming more capable due to advances in machine learning, artificial intelligence, and big data analytics. With the use of these technologies, healthcare stakeholders can obtain actionable insights from the analysis of vast and varied datasets, including patient-generated data, claims data, and electronic health records.Regulatory Support for RWE Integration: RWE is being progressively integrated into regulatory decision-making processes by regulatory organisations including the European Medicines Agency (EMA) and the U.S. Food and Drug Administration (FDA). The FDA's Real-World Evidence Programme and the EMA's Adaptive Pathways and PRIority MEdicines (PRIME) programme are two examples of initiatives that are making it easier to incorporate RWE into regulatory submissions and drug development.Increasing Emphasis on Patient-Centric Healthcare: The value of patient-reported outcomes and real-world experiences in healthcare decision-making is becoming more widely acknowledged. RWE technologies facilitate the collection and examination of patient-centered data, offering valuable insights into treatment efficacy, patient inclinations, and quality of life consequences.Extension of RWE Use Cases: RWE solutions are being used in medication development, post-market surveillance, health economics and outcomes research (HEOR), comparative effectiveness research, and market access, among other healthcare fields. The necessity for a variety of RWE solutions catered to the needs of different stakeholders is being driven by the expansion of RWE use cases.

  9. f

    Maximum Analysis Sample Sizes by Analysis Type and Data Source.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Apr 6, 2016
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Murray, Aja Louise; Obsuth, Ingrid; Sutherland, Alex; Eisner, Manuel; Pilbeam, Liv; Cope, Aiden (2016). Maximum Analysis Sample Sizes by Analysis Type and Data Source. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001509657
    Explore at:
    Dataset updated
    Apr 6, 2016
    Authors
    Murray, Aja Louise; Obsuth, Ingrid; Sutherland, Alex; Eisner, Manuel; Pilbeam, Liv; Cope, Aiden
    Description

    Maximum Analysis Sample Sizes by Analysis Type and Data Source.

  10. m

    Use case data sources

    • data.mendeley.com
    Updated Feb 28, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Werner Kritzinger (2018). Use case data sources [Dataset]. http://doi.org/10.17632/fx9xfmtfcw.1
    Explore at:
    Dataset updated
    Feb 28, 2018
    Authors
    Werner Kritzinger
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data sources for 54 use cases, which were used to identify the impacts to the value creation system by Additive Manufacturing.

  11. United States COVID-19 County Level Data Sources - ARCHIVED

    • data.virginia.gov
    • healthdata.gov
    • +2more
    csv, json, rdf, xsl
    Updated Feb 23, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centers for Disease Control and Prevention (2025). United States COVID-19 County Level Data Sources - ARCHIVED [Dataset]. https://data.virginia.gov/dataset/united-states-covid-19-county-level-data-sources-archived
    Explore at:
    json, rdf, csv, xslAvailable download formats
    Dataset updated
    Feb 23, 2025
    Dataset provided by
    Centers for Disease Control and Preventionhttp://www.cdc.gov/
    Area covered
    United States
    Description

    The Public Health Emergency (PHE) declaration for COVID-19 expired on May 11, 2023. As a result, the Aggregate Case and Death Surveillance System will be discontinued. Although these data will continue to be publicly available, this dataset will no longer be updated.

    On October 20, 2022, CDC began retrieving aggregate case and death data from jurisdictional and state partners weekly instead of daily.

    This dataset includes the URLs that were used by the aggregate county data collection process that compiled aggregate case and death counts by county. Within this file, each of the states (plus select jurisdictions and territories) are listed along with the county web sources which were used for pulling these numbers. Some states had a single statewide source for collecting the county data, while other states and local health jurisdictions may have had standalone sources for individual counties. In the cases where both local and state web sources were listed, a composite approach was taken so that the maximum value reported for a location from either source was used. The initial raw data were sourced from these links and ingested into the CDC aggregate county dataset before being published on the COVID Data Tracker.

  12. d

    Doorda UK Vulnerability Data | Location Data | 1.8M Postcodes from 30 Data...

    • datarade.ai
    .csv
    Updated Nov 5, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Doorda (2024). Doorda UK Vulnerability Data | Location Data | 1.8M Postcodes from 30 Data Sources | Location Intelligence and Analytics [Dataset]. https://datarade.ai/data-products/doorda-uk-vulnerability-data-property-data-34m-addresses-doorda
    Explore at:
    .csvAvailable download formats
    Dataset updated
    Nov 5, 2024
    Dataset authored and provided by
    Doorda
    Area covered
    United Kingdom
    Description

    Doorda's UK Vulnerability Data provides a comprehensive database of over 1.8M postcodes sourced from 30 data sources, offering unparalleled insights for location intelligence and analytics purposes.

    Volume and stats: - 1.8M Postcodes - 5 Vulnerability areas covered - 1-100 Vulnerability rating

    Our Residential Real Estate Data offers a multitude of use cases: - Market Analysis - Identify Vulnerable Consumers - Mitigate Lead Generation Risk - Risk Management - Location Planning

    The key benefits of leveraging our Residential Real Estate Data include: - Data Accuracy - Informed Decision-Making - Competitive Advantage - Efficiency - Single Source

    Covering a wide range of industries and sectors, our data empowers organisations to make informed decisions, uncover market trends, and gain a competitive edge in the UK market.

  13. Combined Origin Year Data Source

    • data-wadnr.opendata.arcgis.com
    • hub.arcgis.com
    Updated Apr 6, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Washington State Department of Natural Resources (2023). Combined Origin Year Data Source [Dataset]. https://data-wadnr.opendata.arcgis.com/datasets/combined-origin-year-data-source
    Explore at:
    Dataset updated
    Apr 6, 2023
    Dataset authored and provided by
    Washington State Department of Natural Resourceshttps://dnr.wa.gov/
    Area covered
    Description

    Data source codeDescription10original FRIS sample based on cores for older forest (origin_year ≤ 1900)11original FRIS sample based on cores for areas lacking RS-FRIS12original FRIS sample non-core data (e.g., stratified sample, LULC) for areas lacking RS-FRIS21Leave trees, identified using remotely-sensed data, within LRM completed even-aged harvests. Origin year based on RS-FRIS predicted origin year.22RS-FRIS 2.0 predicted origin year.23RS-FRIS 3.0 predicted origin year.24RS-FRIS 4.0 predicted origin year.25RS-FRIS 5.0 predicted origin year.30LRM stand origin date or activity completion date within completed even-aged harvests.40Overrides. These are areas where new data were collected or other data sources override all existing sources.53, 54High severity (53) and very high severity (54) fires 2012-202161GNN71Establishment year from inherited inventory for Deep River Woods land transaction.This data set is periodically updated. Last update performed 2025-08-31

  14. f

    Description of data sources.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Jan 16, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Weaver, James; DeFalco, Frank; Swerdel, Joel; Andryc, Alan; Reps, Jenna; Hardin, Jill; Conover, Mitchell M.; Makadia, Rupa; Gilbert, James P.; Shoaibi, Azza; Rao, Gowtham A.; Voss, Erica A.; Ryan, Patrick B.; Hughes, Nigel; Schuemie, Martijn J.; Sena, Anthony G.; Knoll, Chris; Blacketer, Clair; Fortin, Stephen; Molinaro, Anthony (2025). Description of data sources. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001339001
    Explore at:
    Dataset updated
    Jan 16, 2025
    Authors
    Weaver, James; DeFalco, Frank; Swerdel, Joel; Andryc, Alan; Reps, Jenna; Hardin, Jill; Conover, Mitchell M.; Makadia, Rupa; Gilbert, James P.; Shoaibi, Azza; Rao, Gowtham A.; Voss, Erica A.; Ryan, Patrick B.; Hughes, Nigel; Schuemie, Martijn J.; Sena, Anthony G.; Knoll, Chris; Blacketer, Clair; Fortin, Stephen; Molinaro, Anthony
    Description

    ObjectiveThis paper introduces a novel framework for evaluating phenotype algorithms (PAs) using the open-source tool, Cohort Diagnostics.Materials and methodsThe method is based on several diagnostic criteria to evaluate a patient cohort returned by a PA. Diagnostics include estimates of incidence rate, index date entry code breakdown, and prevalence of all observed clinical events prior to, on, and after index date. We test our framework by evaluating one PA for systemic lupus erythematosus (SLE) and two PAs for Alzheimer’s disease (AD) across 10 different observational data sources.ResultsBy utilizing CohortDiagnostics, we found that the population-level characteristics of individuals in the cohort of SLE closely matched the disease’s anticipated clinical profile. Specifically, the incidence rate of SLE was consistently higher in occurrence among females. Moreover, expected clinical events like laboratory tests, treatments, and repeated diagnoses were also observed. For AD, although one PA identified considerably fewer patients, absence of notable differences in clinical characteristics between the two cohorts suggested similar specificity.DiscussionWe provide a practical and data-driven approach to evaluate PAs, using two clinical diseases as examples, across a network of OMOP data sources. Cohort Diagnostics can ensure the subjects identified by a specific PA align with those intended for inclusion in a research study.ConclusionDiagnostics based on large-scale population-level characterization can offer insights into the misclassification errors of PAs.

  15. Enriched NYTimes COVID19 U.S. County Dataset

    • kaggle.com
    zip
    Updated Jun 14, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ringhilterra17 (2020). Enriched NYTimes COVID19 U.S. County Dataset [Dataset]. https://www.kaggle.com/ringhilterra17/enrichednytimescovid19
    Explore at:
    zip(11291611 bytes)Available download formats
    Dataset updated
    Jun 14, 2020
    Authors
    ringhilterra17
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0)https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Overview and Inspiration

    I wanted to make some geospatial visualizations to convey the current severity of COVID19 in different parts of the U.S..

    I liked the NYTimes COVID dataset, but it was lacking information on county boundary shape data, population per county, new cases / deaths per day, and per capita calculations, and county demographics.

    After a lot of work tracking down the different data sources I wanted and doing all of the data wrangling and joins in python, I wanted to open-source the final enriched data set in order to give others a head start in their COVID-19 related analytic, modeling, and visualization efforts.

    This dataset is enriched with county shapes, county center point coordinates, 2019 census population estimates, county population densities, cases and deaths per capita, and calculated per day cases / deaths metrics. It contains daily data per county back to January, allowing for analyizng changes over time.

    UPDATE: I have also included demographic information per county, including ages, races, and gender breakdown. This could help determine which counties are most susceptible to an outbreak.

    How this data can be used

    Geospatial analysis and visualization - Which counties are currently getting hit the hardest (per capita and totals)? - What patterns are there in the spread of the virus across counties? (network based spread simulations using county center lat / lons) -county population densities play a role in how quickly the virus spreads? -how does a specific county/state cases and deaths compare to other counties/states? Join with other county level datasets easily (with fips code column)

    Content Details

    See the column descriptions for more details on the dataset

    Visualizations and Analysis Examples

    COVID-19 U.S. Time-lapse: Confirmed Cases per County (per capita)

    https://github.com/ringhilterra/enriched-covid19-data/blob/master/example_viz/covid-cases-final-04-06.gif?raw=true" alt="">-

    Other Data Notes

    • Please review nytimes README for detailed notes on Covid-19 data - https://github.com/nytimes/covid-19-data/
    • The only update I made in regards to 'Geographic Exceptions', is that I took 'New York City' county provided in the Covid-19 data, which has all cases for 'for the five boroughs of New York City (New York, Kings, Queens, Bronx and Richmond counties) and replaced the missing FIPS for those rows with the 'New York County' fips code 36061. That way I could join to a geometry, and then I used the sum of those five boroughs population estimates for the 'New York City' estimate, which allowed me calculate 'per capita' metrics for 'New York City' entries in the Covid-19 dataset

    Acknowledgements

  16. Z

    SCAR Southern Ocean Diet and Energetics Database

    • data.niaid.nih.gov
    • data.aad.gov.au
    • +3more
    Updated Jul 24, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Scientific Committee on Antarctic Research (2023). SCAR Southern Ocean Diet and Energetics Database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5072527
    Explore at:
    Dataset updated
    Jul 24, 2023
    Authors
    Scientific Committee on Antarctic Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Southern Ocean
    Description

    Information related to diet and energy flow is fundamental to a diverse range of Antarctic and Southern Ocean biological and ecosystem studies. This metadata record describes a database of such information being collated by the SCAR Expert Groups on Antarctic Biodiversity Informatics (EG-ABI) and Birds and Marine Mammals (EG-BAMM) to assist the scientific community in this work. It includes data related to diet and energy flow from conventional (e.g. gut content) and modern (e.g. molecular) studies, stable isotopes, fatty acids, and energetic content. It is a product of the SCAR community and open for all to participate in and use.

    Data have been drawn from published literature, existing trophic data collections, and unpublished data. The database comprises five principal tables, relating to (i) direct sampling methods of dietary assessment (e.g. gut, scat, and bolus content analyses, stomach flushing, and observed predation), (ii) stable isotopes, (iii) lipids, (iv) DNA-based diet assessment, and (v) energetics values. The schemas of these tables are described below, and a list of the sources used to populate the tables is provided with the data.

    A range of manual and automated checks were used to ensure that the entered data were as accurate as possible. These included visual checking of transcribed values, checking of row or column sums against known totals, and checking for values outside of allowed ranges. Suspicious entries were re-checked against original source.

    Notes on names: Names have been validated against the World Register of Marine Species (http://www.marinespecies.org/). For uncertain taxa, the most specific taxonomic name has been used (e.g. prey reported in a study as "Pachyptila sp." will appear here as "Pachyptila"; "Cephalopods" will appear as "Cephalopoda"). Uncertain species identifications (e.g. "Notothenia rossii?" or "Gymnoscopelus cf. piabilis") have been assigned the genus name (e.g. "Notothenia", "Gymnoscopelus"). Original names have been retained in a separate column to allow future cross-checking. WoRMS identifiers (APHIA_ID numbers) are given where possible.

    Grouped prey data in the diet sample table need to be handled with a bit of care. Papers commonly report prey statistics aggregated over groups of prey - e.g. one might give the diet composition by individual cephalopod prey species, and then an overall record for all cephalopod prey. The PREY_IS_AGGREGATE column identifies such records. This allows us to differentiate grouped data like this from unidentified prey items from a certain prey group - for example, an unidentifiable cephalopod record would be entered as Cephalopoda (the scientific name), with "N" in the PREY_IS_AGGREGATE column. A record that groups together a number of cephalopod records, possibly including some unidentifiable cephalopods, would also be entered as Cephalopoda, but with "Y" in the PREY_IS_AGGREGATE column. See the notes on PREY_IS_AGGREGATE, below.

    There are two related R packages that provide data access and functionality for working with these data. See the package home pages for more information: https://github.com/SCAR/sohungry and https://github.com/SCAR/solong.

    Data table schemas

    Sources data table

    • SOURCE_ID: The unique identifier of this source

    • DETAILS: The bibliographic details for this source (e.g. "Hindell M (1988) The diet of the royal penguin Eudyptes schlegeli at Macquarie Island. Emu 88:219–226")

    • NOTES: Relevant notes about this source – if it’s a published paper, this is probably the abstract

    • DOI: The DOI of the source (paper or dataset), in the form "10.xxxx/yyyy"

    Diet data table

    • RECORD_ID: The unique identifier of this record

    • SOURCE_ID: The identifier of the source study from which this record was obtained (see corresponding entry in the sources data table)

    • SOURCE_DETAILS, SOURCE_DOI: The details and DOI of the source, copied from the sources data table for convenience

    • ORIGINAL_RECORD_ID: The identifier of this data record in its original source, if it had one

    • LOCATION: The name of the location at which the data was collected

    • WEST: The westernmost longitude of the sampling region, in decimal degrees (negative values for western hemisphere longitudes)

    • EAST: The easternmost longitude of the sampling region, in decimal degrees (negative values for western hemisphere longitudes)

    • SOUTH: The southernmost latitude of the sampling region, in decimal degrees (negative values for southern hemisphere latitudes)

    • NORTH: The northernmost latitude of the sampling region, in decimal degrees (negative values for southern hemisphere latitudes)

    • ALTITUDE_MIN: The minimum altitude of the sampling region, in metres

    • ALTITUDE_MAX: The maximum altitude of the sampling region, in metres

    • DEPTH_MIN: The shallowest depth of the sampling, in metres

    • DEPTH_MAX: The deepest depth of the sampling, in metres

    • OBSERVATION_DATE_START: The start of the sampling period

    • OBSERVATION_DATE_END: The end of the sampling period. If sampling was carried out over multiple seasons (e.g. during January of 2002 and January of 2003), this will be the first and last dates (in this example, from 1-Jan-2002 to 31-Jan-2003)

    • PREDATOR_NAME: The name of the predator. This may differ from predator_name_original if, for example, taxonomy has changed since the original publication, if the original publication had spelling errors or used common (not scientific) names

    • PREDATOR_NAME_ORIGINAL: The name of the predator, as it appeared in the original source

    • PREDATOR_APHIA_ID: The numeric identifier of the predator in the WoRMS taxonomic register

    • PREDATOR_WORMS_RANK, PREDATOR_WORMS_KINGDOM, PREDATOR_WORMS_PHYLUM, PREDATOR_WORMS_CLASS, PREDATOR_WORMS_ORDER, PREDATOR_WORMS_FAMILY, PREDATOR_WORMS_GENUS: The taxonomic details of the predator, from the WoRMS taxonomic register

    • PREDATOR_GROUP_SOKI: A descriptive label of the group to which the predator belongs (currently used in the Southern Ocean Knowledge and Information wiki, http://soki.aq)

    • PREDATOR_LIFE_STAGE: Life stage of the predator, e.g. "adult", "chick", "larva", "juvenile". Note that if a food sample was taken from an adult animal, but that food was destined for a juvenile, then the life stage will be "juvenile" (this is common with seabirds feeding chicks)

    • PREDATOR_BREEDING_STAGE: Stage of the breeding season of the predator, if applicable, e.g. "brooding", "chick rearing", "nonbreeding", "posthatching"

    • PREDATOR_SEX: Sex of the predator: "male", "female", "both", or "unknown"

    • PREDATOR_SAMPLE_COUNT: The number of predators for which data are given. If (say) 50 predators were caught but only 20 analysed, this column will contain 20. For scat content studies, this will be the number of scats analysed

    • PREDATOR_SAMPLE_ID: The identifier of the predator(s). If predators are being reported at the individual level (i.e. PREDATOR_SAMPLE_COUNT = 1) then PREDATOR_SAMPLE_ID is the individual animal ID. Alternatively, if the data values being entered here are from a group of predators, then the PREDATOR_SAMPLE_ID identifies that group of predators. PREDATOR_SAMPLE_ID values are unique within a source (i.e. SOURCE_ID, PREDATOR_SAMPLE_ID pairs are globally unique). Rows with the same SOURCE_ID and PREDATOR_SAMPLE_ID values relate to the same predator individual or group of individuals, and so can be combined (e.g. for prey diversity analyses). Subsamples are indicated by a decimal number S.nnn, where S is the parent PREDATOR_SAMPLE_ID, and nnn (001-999) is the subsample number. Studies will sometimes report detailed prey information for a large sample, but then report prey information for various subsamples of that sample (e.g. broken down by predator sex, or sampling season). In the simplest case, the diet of each predator will be reported only once in the study, and in this scenario the PREDATOR_SAMPLE_ID values will simply be 1 to N (for N predators).

    • PREDATOR_SIZE_MIN, PREDATOR_SIZE_MAX, PREDATOR_SIZE_MEAN, PREDATOR_SIZE_SD: The minimum, maximum, mean, and standard deviation of the size of the predators in the sample

    • PREDATOR_SIZE_UNITS: The units of size (e.g. "mm")

    • PREDATOR_SIZE_NOTES: Notes on the predator size information, including a definition of what the size value represents (e.g. "total length", "standard length")

    • PREDATOR_MASS_MIN, PREDATOR_MASS_MAX, PREDATOR_MASS_MEAN, PREDATOR_MASS_SD: The minimum, maximum, mean, and standard deviation of the mass of the predators in the sample

    • PREDATOR_MASS_UNITS: The units of mass (e.g. "g", "kg")

    • PREDATOR_MASS_NOTES: Notes on the predator mass information, including a definition of what the mass value represents

    • PREY_NAME: The scientific name of the prey item (corrected, if necessary)

    • PREY_NAME_ORIGINAL: The name of the prey item, as it appeared in the original source

    PREY_APHIA_ID: The numeric identifier of the prey in the WoRMS taxonomic register

    • PREY_WORMS_RANK, PREY_WORMS_KINGDOM, PREY_WORMS_PHYLUM, PREY_WORMS_CLASS, PREY_WORMS_ORDER, PREY_WORMS_FAMILY, PREY_WORMS_GENUS: The taxonomic details of the prey, from the WoRMS taxonomic register

    • PREY_GROUP_SOKI: A descriptive label of the group to which the prey belongs (currently used in the Southern Ocean Knowledge and Information wiki, http://soki.aq)

    • PREY_IS_AGGREGATE: "Y" indicates that this row is an aggregation of other rows in this data source. For example, a study might give a number of individual squid species records, and then an overall squid record that encompasses the individual records. Use the PREY_IS_AGGREGATE information to avoid double-counting during analyses

    • PREY_LIFE_STAGE: Life stage of the prey (e.g. "adult", "chick", "larva")

    • PREY_SEX: The sex of the prey ("male", "female", "both", or "unknown"). Note that this is generally "unknown"

    • PREY_SAMPLE_COUNT: The number of prey individuals from which size and mass measurements were made (note: this is NOT the total number of individuals of

  17. Z

    Resources of IncRML: Incremental Knowledge Graph Construction from...

    • data-staging.niaid.nih.gov
    Updated Dec 13, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Van Assche, Dylan; Andres Rojas Melendez, Julian; De Meester, Ben; Colpaert, Pieter (2024). Resources of IncRML: Incremental Knowledge Graph Construction from Heterogeneous Data Sources [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_10171156
    Explore at:
    Dataset updated
    Dec 13, 2024
    Dataset provided by
    IDLab
    Authors
    Van Assche, Dylan; Andres Rojas Melendez, Julian; De Meester, Ben; Colpaert, Pieter
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IncRML resources

    This Zenodo dataset contains all the resources of the paper 'IncRML: Incremental Knowledge Graph Construction from Heterogeneous Data Sources' submitted to the Semantic Web Journal's Special Issue on Knowledge Graph Construction. This resource aims to make the paper experiments fully reproducible through our experiment tool written in Python which was already used before in the Knowledge Graph Construction Challenge by the ESWC 2023 Workshop on Knowledge Graph Construction. The exact Java JAR file of the RMLMapper (rmlmapper.jar) is also provided in this dataset which was used to execute the experiments. This JAR file was executed with Java OpenJDK 11.0.20.1 on Ubuntu 22.04.1 LTS (Linux 5.15.0-53-generic). Each experiment was executed 5 times and the median values are reported together with the standard deviation of the measurements.

    Datasets

    We provide both dataset dumps of the GTFS-Madrid-Benchmark and of real-life use cases from Open Data in Belgium.GTFS-Madrid-Benchmark dumps are used to analyze the impact on execution time and resources, while the real-life use cases aim to verify the approach on different types of datasets since the GTFS-Madrid-Benchmark is a single type of dataset which does not advertise changes at all.

    Benchmarks

    GTFS-Madrid-Benchmark: change types with fixed data size and amount of changes: additions-only, modifications-only, deletions-only (11 versions)

    GTFS-Madrid-Benchmark: amount of changes with fixed data size: 0%, 25%, 50%, 75%, and 100% changes (11 versions)

    GTFS-Madrid-Benchmark: data size with fixed amount of changes: scales 1, 10, 100 (11 versions)

    Real-world datasets

    Traffic control center Vlaams Verkeerscentrum (Belgium): traffic board messages data (1 day, 28760 versions)

    Meteorological institute KMI (Belgium): weather sensor data (1 day, 144 versions)

    Public transport agency NMBS (Belgium): train schedule data (1 week, 7 versions)

    Public transport agency De Lijn (Belgium): busses schedule data (1 week, 7 versions)

    Bike-sharing company BlueBike (Belgium): bike-sharing availability data (1 day, 1440 versions)

    Bike-sharing company JCDecaux (EU): bike-sharing availability data (1 day, 1440 versions)

    OpenStreetMap (World): geographical map data (1 day, 1440 versions)

    Ingestion

    Real-world datasets LDES output was converted into SPARQL UPDATE queries and executed against Virtuoso to have an estimate for non-LDES clients how incremental generation impacted ingestion into triplestores.

    Remarks

    The first version of each dataset is always used as a baseline. All next versions are applied as an update on the existing version. The reported results are only focusing on the updates since these are the actual incremental generation.

    GTFS-Change-50_percent-{ALL, CHANGE}.tar.xz datasets are not uploaded as GTFS-Madrid-Benchmark scale 100 because both share the same parameters (50% changes, scale 100). Please use GTFS-Scale-100-{ALL, CHANGE}.tar.xz for GTFS-Change-50_percent-{ALL, CHANGE}.tar.xz

    All datasets are compressed with XZ and provided as a TAR archive, be aware that you need sufficient space to decompress these archives! 2 TB of free space is advised to decompress all benchmarks and use cases. The expected output is provided as a ZIP file in each TAR archive, decompressing these requires even more space (4 TB).

    Reproducing

    By using our experiment tool, you can easily reproduce the experiments as followed:

    Download one of the TAR.XZ archives and unpack them.

    Clone the GitHub repository of our experiment tool and install the Python dependencies with 'pip install -r requirements.txt'.

    Download the rmlmapper.jar JAR file from this Zenodo dataset and place it inside the experiment tool root folder.

    Execute the tool by running: './exectool --root=/path/to/the/root/of/the/tarxz/archive --runs=5 run'. The argument '--runs=5' is used to perform the experiment 5 times.

    Once executed, you can generate the statistics by running: './exectool --root=/path/to/the/root/of/the/tarxz/archive stats'.

    Testcases

    Testcases to verify the integration of RML and LDES with IncRML, see https://doi.org/10.5281/zenodo.10171394

  18. Data from: Matlab Scripts and Sample Data Associated with Water Resources...

    • osti.gov
    • data.openei.org
    • +2more
    Updated Jul 18, 2015
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Becker, Matthew W (2015). Matlab Scripts and Sample Data Associated with Water Resources Research Article [Dataset]. https://www.osti.gov/dataexplorer/biblio/dataset/1638712
    Explore at:
    Dataset updated
    Jul 18, 2015
    Dataset provided by
    United States Department of Energyhttp://energy.gov/
    Office of Energy Efficiency and Renewable Energyhttp://energy.gov/eere
    Authors
    Becker, Matthew W
    Description

    Scripts and data acquired at the Mirror Lake Research Site, cited by the article submitted to Water Resources Research: Distributed Acoustic Sensing (DAS) as a Distributed Hydraulic Sensor in Fractured Bedrock M. W. Becker(1), T. I. Coleman(2), and C. C. Ciervo(1) 1 California State University, Long Beach, Geology Department, 1250 Bellflower Boulevard, Long Beach, California, 90840, USA. 2 Silixa LLC, 3102 W Broadway St, Suite A, Missoula MT 59808, USA. Corresponding author: Matthew W. Becker (matt.becker@csulb.edu).

  19. J

    Japan JP: SPI: Pillar 4 Data Sources Score: Scale 0-100

    • ceicdata.com
    Updated Jun 15, 2025
    Cite
    CEICdata.com (2025). Japan JP: SPI: Pillar 4 Data Sources Score: Scale 0-100 [Dataset]. https://www.ceicdata.com/en/japan/governance-policy-and-institutions/jp-spi-pillar-4-data-sources-score-scale-0100
    Explore at:
    Dataset updated
    Jun 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2017 - Mar 1, 2024
    Area covered
    Japan
    Variables measured
    Money Market Rate
    Description

    Japan JP: SPI: Pillar 4 Data Sources Score: Scale 0-100 data was reported at 84.050 NA in 2024. This stayed constant from the previous number of 84.050 NA for 2023. Japan JP: SPI: Pillar 4 Data Sources Score: Scale 0-100 data is updated yearly, averaging 78.317 NA from Mar 2017 (Median) to 2024, with 8 observations. The data reached an all-time high of 84.050 NA in 2024 and a record low of 71.542 NA in 2017. Japan JP: SPI: Pillar 4 Data Sources Score: Scale 0-100 data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Japan – Table JP.World Bank.WDI: Governance: Policy and Institutions. The data sources overall score is a composite measure of whether countries have data available from the following sources: Censuses and surveys, administrative data, geospatial data, and private sector/citizen generated data. The data sources (input) pillar is segmented by four types of sources generated by (i) the statistical office (censuses and surveys), and sources accessed from elsewhere such as (ii) administrative data, (iii) geospatial data, and (iv) private sector data and citizen generated data. The appropriate balance between these source types will vary depending on a country’s institutional setting and the maturity of its statistical system. High scores should reflect the extent to which the sources being utilized enable the necessary statistical indicators to be generated. For example, a low score on environment statistics (in the data production pillar) may reflect a lack of use of (and low score for) geospatial data (in the data sources pillar). This type of linkage is inherent in the data cycle approach and can help highlight areas for investment required if country needs are to be met.;Statistical Performance Indicators, The World Bank (https://datacatalog.worldbank.org/dataset/statistical-performance-indicators);Weighted average;

  20. d

    Data from: Compilation of Public-Supply Well Construction Depths in...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 20, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Compilation of Public-Supply Well Construction Depths in California [Dataset]. https://catalog.data.gov/dataset/compilation-of-public-supply-well-construction-depths-in-california
    Explore at:
    Dataset updated
    Nov 20, 2025
    Dataset provided by
    United States Geological Surveyhttp://www.usgs.gov/
    Area covered
    California
    Description

    This data release is a compilation of construction depth information for 12,383 active and inactive public-supply wells (PSWs) in California from various data sources. Construction data from multiple sources were indexed by the California State Water Resources Control Board Division of Drinking Water (DDW) primary station code (PS Code). Five different data sources were compared with the following priority order: 1, Local sources from select municipalities and water purveyors (Local); 2, Local DDW district data (DDW); 3, The United States Geological Survey (USGS) National Water Information System (NWIS); 4, The California State Water Resources Control Board Groundwater Ambient Monitoring and Assessment Groundwater Information System (SWRCB); and 5, USGS attribution of California Department of Water Resources well completion report data (WCR). For all data sources, the uppermost depth to the well's open or perforated interval was attributed as depth to top of perforations (ToP). The composite depth to bottom of well (Composite BOT) field was attributed from available construction data in the following priority order: 1, Depth to bottom of perforations (BoP); 2, Depth of completed well (Well Depth); 3, Borehole depth (Hole Depth). PSW ToPs and Composite BOTs from each of the five data sources were then compared, and summary construction depths for both fields were selected for wells with multiple data sources according to the data-source priority order listed above. Case-by-case modifications to the final selected summary construction depths were made after priority order-based selection to ensure internal logical consistency (for example, ToP must not exceed Composite BOT). This data release contains eight tab-delimited text files. WellConstructionSourceData_Local.txt contains well construction-depth data, Composite BOT data-source attribution, and local agency data-source attribution for the Local data. WellConstructionSourceData_DDW.txt contains well construction-depth data and Composite BOT data-source attribution for the DDW data. WellConstructionSourceData_NWIS.txt contains well construction-depth data, Composite BOT data-source attribution, and USGS site identifiers for the NWIS data. WellConstructionSourceData_SWRCB.txt contains well construction-depth data and Composite BOT data-source attribution for the SWRCB data. WellConstructionSourceData_WCR.txt contains well construction-depth data and Composite BOT data-source attribution for the WCR data. WellConstructionCompilation_ToP.txt contains all ToP data listed by data source. WellConstructionCompilation_BOT.txt contains all Composite BOT data listed by data source. WellConstructionCompilation_Summary.txt contains summary ToP and Composite BOT values for each well with data-source attribution for both construction fields. All construction depths are in units of feet below land surface and are reported to the nearest foot.
