100+ datasets found
  1. Health visitor service delivery metrics experimental statistics: 2019 to...

    • gov.uk
    • s3.amazonaws.com
    Updated Aug 2, 2022
    Cite
    Public Health England (2022). Health visitor service delivery metrics experimental statistics: 2019 to 2020 annual data [Dataset]. https://www.gov.uk/government/statistics/health-visitor-service-delivery-metrics-experimental-statistics-2019-to-2020-annual-data
    Dataset updated
    Aug 2, 2022
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Public Health England
    Description

    This release is for quarters 1 to 4 of 2019 to 2020.

    Local authority commissioners and health professionals can use these resources to track how many pregnant women, children and families in their local area have received health promoting reviews at particular points during pregnancy and childhood.

    The data and commentaries also show variation at a local, regional and national level. This can help with planning, commissioning and improving local services.

    The metrics cover health reviews for pregnant women, children and their families at several stages which are:

    • antenatal contact
    • new birth visit
    • 6 to 8-week review
    • 12-month review
    • 2 to 2-and-a-half-year review

    Public Health England (PHE) collects the data, which is submitted by local authorities on a voluntary basis.

    See health visitor service delivery metrics in the child and maternal health statistics collection to access data for previous years.

    Find guidance on using these statistics and other intelligence resources to help you make decisions about the planning and provision of child and maternal health services.

    See health visitor service metrics and outcomes definitions from Community Services Dataset (CSDS).

    Correction notice

    Since publication in November 2020, Lewisham and Leicestershire councils have identified errors in the "new birth visits within 14 days" data they submitted to Public Health England (PHE) for 2019 to 2020. These errors caused a statistically significant change in the health visiting data for 2019 to 2020, and so the Office for Health Improvement and Disparities (OHID) has updated and reissued the data in OHID’s Fingertips tool.

    A correction notice has been added to the 2019 to 2020 annual statistical release and statistical commentary but the data has not been altered.

    Please consult OHID’s Fingertips tool for corrected data for Lewisham and Leicestershire, the London and East Midlands region, and England.

  2. Metrics by Individual Security and Exchange

    • catalog.data.gov
    Updated May 30, 2025
    Cite
    Public Affairs (2025). Metrics by Individual Security and Exchange [Dataset]. https://catalog.data.gov/dataset/metrics-by-individual-security-and-exchange
    Dataset updated
    May 30, 2025
    Dataset provided by
    Public Affairs
    Description

    These datasets provide metrics for each individual security partitioned by exchange.

  3. Data from: Pre-compiled metrics data sets, links to yearly statistics files...

    • doi.pangaea.de
    html, tsv
    Updated Sep 8, 2017
    Cite
    Martin G Schultz; Sabine Schröder; Olga Lyapina; Owen R Cooper (2017). Pre-compiled metrics data sets, links to yearly statistics files in CSV format [Dataset]. http://doi.org/10.1594/PANGAEA.880505
    Explore at:
    tsv, html
    Dataset updated
    Sep 8, 2017
    Dataset provided by
    PANGAEA
    Authors
    Martin G Schultz; Sabine Schröder; Olga Lyapina; Owen R Cooper
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1970 - Jan 1, 2015
    Variables measured
    DATE/TIME, File name, File size, Uniform resource locator/link to file
    Description

    Errata: On Dec 2nd, 2018, several yearly statistics files were replaced with new versions to correct an inconsistency related to the computation of the "dma8epax" statistics. As written in Schultz et al. (2017) [https://doi.org/10.1525/elementa.244], Supplement 1, Table 6: "When the aggregation period is “seasonal”, “summer”, or “annual”, the 4th highest daily 8-hour maximum of the aggregation period will be computed." The data values for these aggregation periods are correct; however, the header information in the original files stated that the respective data column would contain "average daily maximum 8-hour ozone mixing ratio (nmol mol-1)". Therefore, the header of the seasonal, summer, and annual files has been corrected.

    Furthermore, the "dma8epax" column in the monthly files erroneously contained 4th highest daily maximum 8-hour average values, while it should have listed monthly average values instead. The data of this metric in the monthly files have therefore been replaced. The new column header reads "avgdma8epax".

    The updated files contain a version label "1.1" and a brief description of the error. If you have made use of previous TOAR data files with the "dma8epax" metric, please exchange your data files.
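    The corrected statistic can be sketched in Python. This is a minimal illustration of the "4th highest daily 8-hour maximum" definition quoted above; day-boundary window handling and data-capture thresholds in the actual TOAR processing are simplified away, and the function names are ours, not TOAR's:

```python
def daily_max_8h(hourly):
    """Maximum of the 8-hour running means within one day's hourly values
    (simplified: windows are confined to the day, no cross-day carry-over)."""
    means = [sum(hourly[i:i + 8]) / 8.0 for i in range(len(hourly) - 7)]
    return max(means)

def dma8epax(daily_maxima):
    """4th-highest daily 8-hour maximum over the aggregation period,
    as the seasonal, summer, and annual files define the statistic."""
    return sorted(daily_maxima, reverse=True)[3]
```

    For the monthly files, the corrected "avgdma8epax" column is instead the monthly average of the daily maxima, i.e. `sum(daily_maxima) / len(daily_maxima)`.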

  4. Data from: Metrics for Evaluating Performance of Prognostic Techniques

    • catalog.data.gov
    • datasets.ai
    • +4more
    Updated Apr 10, 2025
    + more versions
    Cite
    Dashlink (2025). Metrics for Evaluating Performance of Prognostic Techniques [Dataset]. https://catalog.data.gov/dataset/metrics-for-evaluating-performance-of-prognostic-techniques
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    Dashlink
    Description

    Prognostics is an emerging concept in condition based maintenance (CBM) of critical systems. Along with developing the fundamentals of being able to confidently predict Remaining Useful Life (RUL), the technology calls for fielded applications as it inches towards maturation. This requires a stringent performance evaluation so that the significance of the concept can be fully exploited. Currently, prognostics concepts lack standard definitions and suffer from ambiguous and inconsistent interpretations. This lack of standards is in part due to the varied end-user requirements for different applications, time scales, available information, and domain dynamics, to name a few issues. Instead, the research community has used a variety of metrics based largely on convenience with respect to their respective requirements. Very little attention has been focused on establishing a common ground to compare different efforts. This paper surveys the metrics that are already used for prognostics in a variety of domains including medicine, nuclear, automotive, aerospace, and electronics. It also considers other domains that involve prediction-related tasks, such as weather and finance. Differences and similarities between these domains and health maintenance have been analyzed to help understand what performance evaluation methods may or may not be borrowed. Further, these metrics have been categorized in several ways that may be useful in deciding upon a suitable subset for a specific application. Some important prognostic concepts have been defined using a notational framework that enables interpretation of different metrics coherently. Last, but not least, a list of metrics has been suggested to assess critical aspects of RUL predictions before they are fielded in real applications.

  5. COVID-19 Time-Series Metrics by County and State (ARCHIVED)

    • catalog.data.gov
    • data.chhs.ca.gov
    • +2more
    Updated Nov 27, 2024
    + more versions
    Cite
    California Department of Public Health (2024). COVID-19 Time-Series Metrics by County and State (ARCHIVED) [Dataset]. https://catalog.data.gov/dataset/covid-19-time-series-metrics-by-county-and-state-archived-aad71
    Dataset updated
    Nov 27, 2024
    Dataset provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Note: This COVID-19 data set is no longer being updated as of December 1, 2023. Access current COVID-19 data on the CDPH respiratory virus dashboard (https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/Respiratory-Viruses/RespiratoryDashboard.aspx) or in open data format (https://data.chhs.ca.gov/dataset/respiratory-virus-dashboard-metrics). As of August 17, 2023, data is being updated each Friday.

    For death data after December 31, 2022, California uses Provisional Deaths from the Centers for Disease Control and Prevention’s National Center for Health Statistics (NCHS) National Vital Statistics System (NVSS). Prior to January 1, 2023, death data was sourced from the COVID-19 registry. The change in data source occurred in July 2023 and was applied retroactively to all 2023 data to provide a consistent source of death data for the year of 2023.

    As of May 11, 2023, data on cases, deaths, and testing is being updated each Thursday. Metrics by report date have been removed, but previous versions of files with report date metrics are archived below.

    All metrics include people in state and federal prisons, US Immigration and Customs Enforcement facilities, US Marshal detention facilities, and Department of State Hospitals facilities. Members of California's tribal communities are also included.

    The "Total Tests" and "Positive Tests" columns show totals based on the collection date. There is a lag between when a specimen is collected and when it is reported in this dataset. As a result, the most recent dates on the table will temporarily show NONE in the "Total Tests" and "Positive Tests" columns. This should not be interpreted as no tests being conducted on these dates. Instead, these values will be updated with the number of tests conducted as data is received.
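    When reading the testing columns, the NONE placeholder should therefore be treated as not-yet-reported rather than zero. A minimal sketch (the function name is illustrative, not part of the dataset):

```python
def parse_tests(value):
    """Map the 'NONE' placeholder in the 'Total Tests' / 'Positive Tests'
    columns to missing data (None) rather than zero, since recent collection
    dates are still accumulating lagged reports."""
    return None if value == "NONE" else int(value)
```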

  6. 2023-2024 Big 5 European Soccer Player Statistics

    • kaggle.com
    Updated Jul 17, 2024
    Cite
    Mamoun Kabbaj (2024). 2023-2024 Big 5 European Soccer Player Statistics [Dataset]. https://www.kaggle.com/datasets/mamounkabbaj/2023-2024-big-5-european-soccer-player-statistics/code
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mamoun Kabbaj
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset contains detailed player performance statistics for the 2023-2024 season from the Big 5 European soccer leagues: Premier League, La Liga, Serie A, Bundesliga, and Ligue 1. The data has been meticulously scraped from FBref.com, a comprehensive source for soccer statistics.

    Columns and Metrics:

    • Rank: The rank of the player based on performance metrics.
    • Player: Name of the player.
    • Nation: Nationality of the player.
    • Position: Playing position of the player.
    • Squad: Club the player belongs to.
    • Competition: League the player is competing in.
    • Age: Age of the player.
    • Year_Born: Year the player was born.
    • Playing Time_MP: Matches played.
    • Playing Time_Starts: Matches started.
    • Playing Time_Min: Minutes played.
    • Playing Time_90s: Equivalent of 90-minute matches played.
    • Performance_Gls: Goals scored.
    • Performance_Ast: Assists.
    • Performance_G+A: Goals plus assists.
    • Performance_G-PK: Goals excluding penalties.
    • Performance_PK: Penalty kicks made.
    • Performance_PKatt: Penalty kicks attempted.
    • Performance_CrdY: Yellow cards.
    • Performance_CrdR: Red cards.
    • Expected_xG: Expected goals.
    • Expected_npxG: Non-penalty expected goals.
    • Expected_xAG: Expected assists.
    • Expected_npxG+xAG: Non-penalty expected goals plus expected assists.
    • Progression_PrgC: Progressive carries.
    • Progression_PrgP: Progressive passes.
    • Progression_PrgR: Progressive passes received.
    • Per 90 Minutes_Gls: Goals per 90 minutes.
    • Per 90 Minutes_Ast: Assists per 90 minutes.
    • Per 90 Minutes_G+A: Goals plus assists per 90 minutes.
    • Per 90 Minutes_G-PK: Goals excluding penalties per 90 minutes.
    • Per 90 Minutes_G+A-PK: Goals plus assists excluding penalties per 90 minutes.
    • Per 90 Minutes_xG: Expected goals per 90 minutes.
    • Per 90 Minutes_xAG: Expected assists per 90 minutes.
    • Per 90 Minutes_xG+xAG: Expected goals plus expected assists per 90 minutes.
    • Per 90 Minutes_npxG: Non-penalty expected goals per 90 minutes.
    • Per 90 Minutes_npxG+xAG: Non-penalty expected goals plus expected assists per 90 minutes.
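    The Per 90 Minutes_* columns follow directly from the season totals and Playing Time_Min; a minimal sketch of the conversion (the helper name is illustrative):

```python
def per_90(total, minutes):
    """Convert a season total (e.g. Performance_Gls) into the corresponding
    Per 90 Minutes_* rate, using total minutes played (Playing Time_Min)."""
    if minutes == 0:
        return 0.0  # avoid division by zero for players with no minutes
    return total / (minutes / 90.0)
```

    For example, a player with 10 goals in 1800 minutes has a Per 90 Minutes_Gls of 0.5.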

    I am passionate about soccer and have created this dataset in the hope that it can be useful for others who share my love for the game. Whether you're conducting analysis, building models, or just exploring player stats, I hope this dataset provides valuable insights and serves as a helpful resource.

  7. NIST Collaborative Research Cycle Data and Metrics Archive

    • catalog.data.gov
    • data.nist.gov
    Updated Apr 11, 2024
    Cite
    National Institute of Standards and Technology (2024). NIST Collaborative Research Cycle Data and Metrics Archive [Dataset]. https://catalog.data.gov/dataset/nist-collaborative-research-cycle-data-and-metrics-archive
    Dataset updated
    Apr 11, 2024
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This repository contains the collected resources submitted to and created by the NIST Collaborative Research Cycle (CRC) Data and Metrics Archive. The NIST Collaborative Research Cycle (CRC) is an ongoing effort to benchmark, compare, and investigate deidentification technologies. The program asks the research community to deidentify a compact and interesting dataset called the NIST Diverse Communities Data Excerpts, demographic data from communities across the U.S. sourced from the American Community Survey. This repository contains all of the submitted deidentified data instances, each accompanied by a detailed abstract describing how the deidentified data were generated. We conduct an extensive standardized evaluation of each deidentified instance using a host of fidelity, utility, and privacy metrics with our tool, SDNist. We've packaged the data, abstracts, and evaluation results into a human- and machine-readable archive.

  8. Data from: Multi-Source Distributed System Data for AI-powered Analytics

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Nov 10, 2022
    Cite
    Jorge Cardoso (2022). Multi-Source Distributed System Data for AI-powered Analytics [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3484800
    Dataset updated
    Nov 10, 2022
    Dataset provided by
    Jorge Cardoso
    Odej Kao
    Ajay Kumar Mandapati
    Sasho Nedelkoski
    Soeren Becker
    Jasmin Bogatinovski
    Description

    Abstract:

    In recent years there has been an increased interest in Artificial Intelligence for IT Operations (AIOps). This field utilizes monitoring data from IT systems, big data platforms, and machine learning to automate various operations and maintenance (O&M) tasks for distributed systems. The major contributions have been materialized in the form of novel algorithms. Typically, researchers took the challenge of exploring one specific type of observability data sources, such as application logs, metrics, and distributed traces, to create new algorithms. Nonetheless, due to the low signal-to-noise ratio of monitoring data, there is a consensus that only the analysis of multi-source monitoring data will enable the development of useful algorithms that have better performance.
    Unfortunately, existing datasets usually contain only a single source of data, often logs or metrics. This limits the possibilities for greater advances in AIOps research. Thus, we generated high-quality multi-source data composed of distributed traces, application logs, and metrics from a complex distributed system. This paper provides detailed descriptions of the experiment, statistics of the data, and identifies how such data can be analyzed to support O&M tasks such as anomaly detection, root cause analysis, and remediation.

    General Information:

    This repository contains simple scripts for data statistics and a link to the multi-source distributed system dataset.

    You may find details of this dataset from the original paper:

    Sasho Nedelkoski, Ajay Kumar Mandapati, Jasmin Bogatinovski, Soeren Becker, Jorge Cardoso, Odej Kao, "Multi-Source Distributed System Data for AI-powered Analytics". [link very soon]

    If you use the data, implementation, or any details of the paper, please cite!

    The multi-source/multimodal dataset is composed of distributed traces, application logs, and metrics produced from running a complex distributed system (Openstack). In addition, we also provide the workload and fault scripts together with the Rally report which can serve as ground truth (all at the Zenodo link below). We provide two datasets, which differ on how the workload is executed. The openstack_multimodal_sequential_actions is generated via executing workload of sequential user requests. The openstack_multimodal_concurrent_actions is generated via executing workload of concurrent user requests.

    The concurrent dataset differs in the following ways:

    • Due to the heavy load on the control node, the metric data for wally113 (control node) is not representative, so we excluded it.

    • Three Rally actions are executed in parallel: boot_and_delete, create_and_delete_networks, create_and_delete_image, whereas the sequential workload executed 5 actions.

    The raw logs in both datasets contain the same files. Users who want the logs filtered by time with respect to the two datasets should refer to the timestamps in the metrics (they provide the time window). In addition, we suggest using the provided aggregated, time-ranged logs for both datasets in CSV format.

    Important: The logs and the metrics are synchronized with respect to time and are both recorded in CEST (Central European Summer Time). The traces are in UTC (Coordinated Universal Time, 2 hours behind). They should be synchronized if the user develops multimodal methods.
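    To put all three sources on one time axis, trace timestamps can be shifted from UTC into CEST before any multimodal analysis. A minimal sketch assuming ISO-formatted timestamp strings (the function name is ours, not part of the dataset tooling):

```python
from datetime import datetime, timedelta, timezone

# Logs and metrics are recorded in CEST (UTC+2), per the dataset description.
CEST = timezone(timedelta(hours=2))

def trace_ts_to_cest(ts_utc):
    """Shift a trace timestamp from UTC into CEST so that traces, logs,
    and metrics share a single time axis."""
    dt = datetime.fromisoformat(ts_utc).replace(tzinfo=timezone.utc)
    return dt.astimezone(CEST).strftime("%Y-%m-%d %H:%M:%S")
```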

    Our GitHub repository can be found at: https://github.com/SashoNedelkoski/multi-source-observability-dataset/

  9. Semiannual Metrics Reported to the Office of Science and Technology Policy...

    • catalog.data.gov
    • data.nist.gov
    • +1more
    Updated Jul 29, 2022
    + more versions
    Cite
    National Institute of Standards and Technology (2022). Semiannual Metrics Reported to the Office of Science and Technology Policy to Demonstrate Implementation of NIST's Public Access Plan [Dataset]. https://catalog.data.gov/dataset/semiannual-metrics-reported-to-the-office-of-science-and-technology-policy-to-demonstrate--27f60
    Dataset updated
    Jul 29, 2022
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    NIST collects and reports numerous metrics semiannually to the White House Office of Science and Technology Policy to demonstrate implementation of the NIST Plan for Public Access. These metrics include: the number of NIST data management plans (internal and external); the number of datasets listed on data.gov and made public on NIST's website; datasets accessed through NIST's data discovery tool; publications available through PubMed Central (PMC) and govinfo; and publications accessed through PMC. NISTIR 8084, which summarizes NIST's plan for providing public access, is accessible via http://dx.doi.org/10.6028/NIST.IR.8084.

  10. COVID-19 Equity Metrics (PAUSED)

    • data.chhs.ca.gov
    csv, xlsx, zip
    Updated Jun 26, 2025
    + more versions
    Cite
    California Department of Public Health (2025). COVID-19 Equity Metrics (PAUSED) [Dataset]. https://data.chhs.ca.gov/dataset/covid-19-equity-metrics
    Explore at:
    csv(923925), csv(324960), csv(332837), xlsx(45453), csv(11194064), csv(198712), zip
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    Note: This dataset is on hiatus.

    CDPH strives to respond equitably to the COVID-19 pandemic and is therefore interested in how different communities are impacted. Collecting and reporting health equity data helps to identify health disparities and improve the state’s response. To that end, CDPH tracks cases, deaths, and testing by race and ethnicity as well as other social determinants of health, such as income, crowded housing, and access to health insurance.

    During the response, CDPH used a health equity metric, defined as the positivity rate in the most disproportionately-impacted communities according to the Healthy Places Index. The purpose of this metric was to ensure California reopened its economy safely by reducing disease transmission in all communities. This metric is tracked and reported in comparison to statewide positivity rate. More information is available at https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/COVID-19/CaliforniaHealthEquityMetric.aspx.

    Data completeness is also critical to addressing inequities. CDPH reports data completeness by race and ethnicity, sexual orientation, and gender identity to better understand missingness in the data.

    Health equity data is updated weekly. Data may be suppressed based on county population or total counts.

    For more information on California’s commitment to health equity, please see https://covid19.ca.gov/equity/

  11. 2019 EMDataResource Model Metrics Challenge Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 19, 2024
    Cite
    Igaev, Maxim (2024). 2019 EMDataResource Model Metrics Challenge Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4148788
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Wang, Liguo
    Kryshtafovych, Andriy
    Kihara, Daisuke
    Dill, Ken
    Terashi, Genki
    Schröder, Gunnar F
    Lawson, Catherine L
    Patwardhan, Ardan
    Fraser, James S
    Schäfer, Luisa U
    Williams, Christopher J
    Chiu, Wah
    Burnley, Tom
    Cheng, Jianlin
    Hou, Jie
    Yu, Xiaodi
    Afonine, Pavel V
    Richardson, Jane S
    Berman, Helen M
    Perez, Alberto
    Olek, Mateusz
    Hung, Li-Wei
    Pintilie, Greg D
    Barad, Benjamin A
    Herzik, Mark A Jr
    Terwilliger, Thomas C
    Sarkar, Daipayan
    Farrell, Daniel P
    Hoh, Soon Wen
    Baker, Matthew L
    Kumar, Dilip
    Igaev, Maxim
    Cowtan, Kevin
    Wu, Tianqi
    Shekhar, Mrinal
    Mittal, Sumit
    Bond, Paul
    Winn, Martyn
    Monastryrskyy, Bohdan
    DiMaio, Frank
    Wankowicz, Stephanie A
    Schmid, Michael F
    Pfab, Jonas
    Adams, Paul D
    Cao, Renzhi
    Palmer, Colin M
    Si, Dong
    Zhang, Kaiming
    Joseph, Agnel P
    Singharoy, Abishek
    Vaiana, Andrea
    Chojnowski, Grzegorz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the full dataset of the 2019 Cryo-EM Map-based Model Metrics Challenge sponsored by EMDataResource (www.emdataresource.org, challenges.emdataresource.org, model-compare.emdataresource.org). The goals of this challenge were (1) to assess the quality of models that can be produced using current modeling software, (2) to check the reproducibility of modeling results from different software developers and users, and (3) to compare the performance of current metrics used for evaluation of models. The focus was on near-atomic resolution maps with an innovative twist: three of four target maps formed a resolution series (1.8 to 3.1 Å) from the same specimen and imaging experiment. Tools developed in previous challenges were expanded for managing, visualizing and analyzing the 63 submitted coordinate models, and several new metrics were introduced.

    File Descriptions:

    2019-EMDataResource-Challenge-web.pdf: Archive of News, Goals, Timeline, Targets, Modelling Instructions, Process, FAQ, Submission Instructions, and Submission Summary Statistics, sourced from the EMDR Challenges website

    correlation-images.tar.gz: Pairwise correlation tables for selected metric scores from the EMDR Model Compare website

    maps.tar.gz: The maps used for Fit-to-Map analyses in the Challenge

    models.tar.gz: The 63 models submitted by the modelling teams

    results.tar.gz: The output logs for all of the analysis methods

    Scores.xlsx: Scores for each model and analysis method, compiled into spreadsheet format

    targets.tar.gz: The reference models used in the analysis

    Post submission correction to the web archive PDF document: The full list of EMDataResource members on the model committee is as follows: Cathy Lawson, Andriy Kryshtafovych, Greg Pintilie, Mike Schmid, Helen Berman, Wah Chiu.

  12. NBA player stats & salaries 2023

    • kaggle.com
    Updated Feb 22, 2025
    Cite
    albarpambagio (2025). NBA player stats & salaries 2023 [Dataset]. https://www.kaggle.com/datasets/albarpambagio/nba-player-stats-and-salaries-2014-2023
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Feb 22, 2025
    Dataset provided by
    Kaggle
    Authors
    albarpambagio
    Description

    NBA Player Statistics Dataset (Regular Season & Playoffs)

    Overview

    This dataset contains detailed NBA player statistics for both the regular season and playoffs, including per-game performance metrics and advanced analytics such as Player Efficiency Rating (PER). The dataset is useful for basketball analytics, machine learning projects, and statistical research on player performance.

    Dataset Features

    • Basic Information

      • Player: Name of the player
      • Age: Player's age in the season
      • Team: Team abbreviation
      • Pos: Position played (e.g., PG, SG, SF, PF, C)
      • Season Type: Indicates whether stats are from Regular Season or Playoffs
    • Per-Game Statistics

      • G: Games played
      • GS: Games started
      • MP: Minutes played per game
      • FG, FGA, FG%: Field goals made, attempted, and percentage
      • 3P, 3PA, 3P%: Three-pointers made, attempted, and percentage
      • 2P, 2PA, 2P%: Two-pointers made, attempted, and percentage
      • FT, FTA, FT%: Free throws made, attempted, and percentage
      • ORB, DRB, TRB: Offensive, defensive, and total rebounds per game
      • AST: Assists per game
      • STL: Steals per game
      • BLK: Blocks per game
      • TOV: Turnovers per game
      • PF: Personal fouls per game
      • PTS: Points per game
    • Advanced Metrics

      • PER: Player Efficiency Rating, a metric that measures per-minute performance while adjusting for pace

    Usage

    This dataset is ideal for:
    Basketball analytics (player comparisons, efficiency analysis)
    Machine learning projects (predicting player performance, clustering player roles)
    Data visualization (trends in player stats, team comparisons)

  13. Software code quality and source code metrics dataset

    • data.mendeley.com
    • narcis.nl
    Updated Feb 17, 2021
    Cite
    Sayed Mohsin Reza (2021). Software code quality and source code metrics dataset [Dataset]. http://doi.org/10.17632/77p6rzb73n.2
    Dataset updated
    Feb 17, 2021
    Authors
    Sayed Mohsin Reza
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains quality and source code metrics information for 60 versions across 10 different repositories. The dataset is extracted at 3 levels: (1) class, (2) method, and (3) package. It was created by analyzing 9,420,246 lines of code and 173,237 classes. The provided dataset contains one quality_attributes folder and three associated files: repositories.csv, versions.csv, and attribute-details.csv. The first file (repositories.csv) contains general information (repository name, repository URL, number of commits, stars, forks, etc.) to help understand size, popularity, and maintainability. The file versions.csv contains general information (version unique ID, number of classes, packages, external classes, external packages, version repository link) to provide an overview of the versions and how the repository grows over time. The file attribute-details.csv contains detailed information (attribute name, attribute short form, category, and description) about the extracted static analysis metrics and code quality attributes. The short form is used in the dataset as a unique identifier for values at the package, class, and method levels.

  14. Dataset statistics after preprocessing.

    • plos.figshare.com
    xls
    Updated Jun 13, 2024
    Cite
    Ghulam Mustafa; Abid Rauf; Muhammad Tanvir Afzal (2024). Dataset statistics after preprocessing. [Dataset]. http://doi.org/10.1371/journal.pone.0303105.t002
    Explore at:
    xls
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Ghulam Mustafa; Abid Rauf; Muhammad Tanvir Afzal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In scientific research, assessing the impact and influence of authors is crucial for evaluating their scholarly contributions. In the literature, numerous parameters have been developed to quantify the productivity and significance of researchers, including publication count, citation count, and the well-known h index with its extensions and variations. With such a plethora of available assessment metrics, however, it is vital to identify and prioritize the most effective ones. To address the complexity of this task, we employ a deep learning technique, the Multi-Layer Perceptron (MLP) classifier, for classification and ranking. Leveraging the MLP’s capacity to discern patterns within datasets, we assign an importance score to each parameter using a proposed modified recursive elimination technique, and rank the parameters by these scores. Furthermore, we present a comprehensive statistical analysis of the top-ranked author assessment parameters, encompassing 64 distinct metrics. This analysis yields valuable insights into the relationships between these parameters, shedding light on potential correlations and dependencies that may affect assessment outcomes. In the statistical analysis, we combined pairs of parameters using seven well-known statistical methods, such as the arithmetic, harmonic, and geometric means. After combining the parameters, we sorted the list for each pair and analyzed the top 10, 50, and 100 records, counting the occurrences of award winners. For experimental purposes, data were collected from the field of mathematics: a dataset of 525 individuals who have not yet received awards and 525 individuals recognized as award winners by well-known and prestigious scientific societies in the field of mathematics during the last three decades.
    The results revealed that, in the ranking of author assessment parameters, the normalized h index achieved the highest importance score compared with the remaining sixty-three parameters. The statistical analysis showed that the Trigonometric Mean (TM) outperformed the other six statistical models. Moreover, the analysis of individual parameters, specifically the M Quotient and FG index, indicates that combining either of them with any other parameter under the various statistical models consistently produces excellent results in terms of the percentage of returned awardees.
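
As a rough illustration of the pair-combination step described above, three of the classical means named in the abstract can be computed with Python's standard library. The function name and the input scores (4 and 16) are ours, not data or code from this study:

```python
# Rough illustration only: combining a pair of author-assessment
# parameter values under three of the classical means named in the
# abstract. The input scores (4 and 16) are made-up, not study data.
from statistics import fmean, geometric_mean, harmonic_mean

def combine(a: float, b: float) -> dict:
    """Return the pair combined under three statistical models."""
    return {
        "arithmetic": fmean([a, b]),
        "geometric": geometric_mean([a, b]),
        "harmonic": harmonic_mean([a, b]),
    }

scores = combine(4.0, 16.0)
# For any positive pair: harmonic <= geometric <= arithmetic
print(scores)
```

For 4 and 16 these come out to roughly 6.4, 8, and 10 respectively, which illustrates why the choice of combining model can shift a pair's position in a sorted list.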

  15. Performance Metrics - Innovation & Technology - 311 Website Availability

    • catalog.data.gov
    • data.cityofchicago.org
    • +2more
    Updated Sep 29, 2023
    data.cityofchicago.org (2023). Performance Metrics - Innovation & Technology - 311 Website Availability [Dataset]. https://catalog.data.gov/dataset/performance-metrics-innovation-technology-311-website-availability
    Explore at:
    Dataset updated
    Sep 29, 2023
    Dataset provided by
    data.cityofchicago.org
    Description

    The 311 website allows residents to submit service requests or check the status of existing requests online. The percentage of 311 website uptime (the amount of time the site was available) and the target uptime for each week are available by mousing over columns. The target availability for this site is 99.5%.
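
As a quick sanity check on what a 99.5% weekly availability target means in practice, the corresponding downtime budget can be computed directly. This is a minimal sketch; the function name is ours, not part of the dataset:

```python
# Back-of-the-envelope sketch: how much downtime a 99.5% weekly
# availability target permits. The function name is illustrative.
def allowed_downtime_minutes(target: float, period_minutes: float) -> float:
    """Downtime budget for a given availability target over a period."""
    return (1.0 - target) * period_minutes

WEEK_MINUTES = 7 * 24 * 60  # 10,080 minutes in a week
budget = allowed_downtime_minutes(0.995, WEEK_MINUTES)
print(f"{budget:.1f} minutes/week")  # about 50.4 minutes
```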

  16. Pre-compiled metrics data sets, links to aggregated statistics files in CSV format

    • doi.pangaea.de
    • search.dataone.org
    html, tsv
    Updated Sep 8, 2017
    + more versions
    Martin G Schultz; Sabine Schröder; Olga Lyapina; Owen R Cooper (2017). Pre-compiled metrics data sets, links to aggregated statistics files in CSV format [Dataset]. http://doi.org/10.1594/PANGAEA.880503
    Explore at:
    Available download formats: tsv, html
    Dataset updated
    Sep 8, 2017
    Dataset provided by
    PANGAEA
    Authors
    Martin G Schultz; Sabine Schröder; Olga Lyapina; Owen R Cooper
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1990 - Jan 1, 2015
    Variables measured
    DATE/TIME, File name, File size, Uniform resource locator/link to file
    Description

    Errata: Due to a coding error, monthly files with "dma8epax" statistics were wrongly aggregated. This concerns all gridded files of this metric as well as the monthly aggregated CSV files. All erroneous files were replaced with corrected versions on January 16, 2018. Each updated file contains a version label "1.1" and a brief description of the error. If you have made use of previous TOAR data files with the "dma8epax" metric, please replace your data files.

  17. Aggregated Statistics Dataset

    • figshare.com
    txt
    Updated Jan 23, 2024
    Alexander Perathoner (2024). Aggregated Statistics Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24639174.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 23, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Alexander Perathoner
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The file generated by the parse-logs script. It is used to calculate the rate of passing and failing flaky tests.
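
The rate calculation the file feeds can be sketched in a couple of lines. This assumes simple per-test pass/fail counts; the function name and numbers are illustrative, not taken from the dataset:

```python
# Minimal sketch (names and numbers are ours): the pass rate of a
# flaky test computed from aggregated run counts.
def pass_rate(passed: int, failed: int) -> float:
    """Fraction of runs that passed, out of all recorded runs."""
    total = passed + failed
    if total == 0:
        raise ValueError("no runs recorded")
    return passed / total

print(pass_rate(38, 12))  # 0.76
```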

  18. Dataset statistics before preprocessing.

    • plos.figshare.com
    xls
    Updated Jun 13, 2024
    Ghulam Mustafa; Abid Rauf; Muhammad Tanvir Afzal (2024). Dataset statistics before preprocessing. [Dataset]. http://doi.org/10.1371/journal.pone.0303105.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Ghulam Mustafa; Abid Rauf; Muhammad Tanvir Afzal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In scientific research, assessing the impact and influence of authors is crucial for evaluating their scholarly contributions. In the literature, numerous parameters have been developed to quantify the productivity and significance of researchers, including publication count, citation count, and the well-known h index with its extensions and variations. With such a plethora of available assessment metrics, however, it is vital to identify and prioritize the most effective ones. To address the complexity of this task, we employ a deep learning technique, the Multi-Layer Perceptron (MLP) classifier, for classification and ranking. Leveraging the MLP’s capacity to discern patterns within datasets, we assign an importance score to each parameter using a proposed modified recursive elimination technique, and rank the parameters by these scores. Furthermore, we present a comprehensive statistical analysis of the top-ranked author assessment parameters, encompassing 64 distinct metrics. This analysis yields valuable insights into the relationships between these parameters, shedding light on potential correlations and dependencies that may affect assessment outcomes. In the statistical analysis, we combined pairs of parameters using seven well-known statistical methods, such as the arithmetic, harmonic, and geometric means. After combining the parameters, we sorted the list for each pair and analyzed the top 10, 50, and 100 records, counting the occurrences of award winners. For experimental purposes, data were collected from the field of mathematics: a dataset of 525 individuals who have not yet received awards and 525 individuals recognized as award winners by well-known and prestigious scientific societies in the field of mathematics during the last three decades.
    The results revealed that, in the ranking of author assessment parameters, the normalized h index achieved the highest importance score compared with the remaining sixty-three parameters. The statistical analysis showed that the Trigonometric Mean (TM) outperformed the other six statistical models. Moreover, the analysis of individual parameters, specifically the M Quotient and FG index, indicates that combining either of them with any other parameter under the various statistical models consistently produces excellent results in terms of the percentage of returned awardees.

  19. Long-term monotonic trends in annual and monthly streamflow metrics at streamgages in the United States

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Oct 5, 2024
    + more versions
    U.S. Geological Survey (2024). Long-term monotonic trends in annual and monthly streamflow metrics at streamgages in the United States (ver. 2.0, October 2024) [Dataset]. https://catalog.data.gov/dataset/long-term-monotonic-trends-in-annual-and-monthly-streamflow-metrics-at-streamgages-in-the-
    Explore at:
    Dataset updated
    Oct 5, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    United States
    Description

    The U.S. Geological Survey (USGS) Water Resources Mission Area (WMA) is working to address a need to understand where the Nation is experiencing water shortages or surpluses relative to the demand for water by delivering routine assessments of water supply and demand and an understanding of the natural and human factors affecting the balance between supply and demand. A key part of these national assessments is identifying long-term trends in water availability, including groundwater and surface water quantity, quality, and use. This data release contains Mann-Kendall monotonic trend analyses for 18 observed annual and monthly streamflow metrics at 6,347 USGS streamgages located in the conterminous United States, Alaska, Hawaii, and Puerto Rico. Streamflow metrics include annual mean flow, maximum 1-day and 7-day flows, minimum 7-day and 30-day flows, and the date of the center of volume (the date on which 50% of the annual flow has passed by a gage), along with the mean flow for each month of the year. Annual streamflow metrics are computed from mean daily discharge records at USGS streamgages that are publicly available from the National Water Information System (NWIS). Trend analyses are computed using annual streamflow metrics computed through climate year 2022 (April 2022 - March 2023) for low-flow metrics and water year 2022 (October 2021 - September 2022) for all other metrics. Trends at each site are available for up to four different periods: (i) the longest possible period that meets completeness criteria at each site, (ii) 1980-2020, (iii) 1990-2020, and (iv) 2000-2020. Annual metric time series analyzed for trends must have 80 percent complete records during fixed periods, and each of these time series must have 80 percent complete records during its first and last decades. All longest-possible-period time series must be at least 10 years long and have annual metric values for at least 80% of the years running from 2013 to 2022.

    This data release provides the following five CSV output files along with a model archive:

    1. streamflow_trend_results.csv - contains test results of all trend analyses, with each row representing one unique combination of (i) NWIS streamgage identifier, (ii) metric (computed using October 1 - September 30 water years, except for low-flow metrics computed using April 1 - March 31 climate years), (iii) trend period of interest (longest possible period through 2022, 1980-2020, 1990-2020, 2000-2020), and (iv) records containing either the full trend period or only the portion of the trend period following substantial increases in cumulative upstream reservoir storage capacity. This is an output from the final process step (#5) of the workflow.
    2. streamflow_trend_trajectories_with_confidence_bands.csv - contains annual trend trajectories estimated using Theil-Sen regression, which estimates the median of the probability distribution of a metric for a given year, along with 90 percent confidence intervals (5th and 95th percentile values). This is an output from the final process step (#5) of the workflow.
    3. streamflow_trend_screening_all_steps.csv - contains the screening results of all 7,873 streamgages initially considered as candidate sites for trend analysis and identifies the screens that prevented some sites from being included in the Mann-Kendall trend analysis.
    4. all_site_year_metrics.csv - contains annual time series values of streamflow metrics computed from mean daily discharge data at 7,873 candidate sites. This is an output of Process Step 1 in the workflow.
    5. all_site_year_filters.csv - contains information about the completeness and quality of daily mean discharge at each streamgage during each year (water year, climate year, and calendar year). This is also an output of Process Step 1 in the workflow and is combined with all_site_year_metrics.csv in Process Step 2.

    In addition, a .zip file contains a model archive for reproducing the trend results using R 4.4.1 statistical software; see the README file contained in the model archive for more information. Caution must be exercised when utilizing monotonic trend analyses conducted over periods of up to several decades (and in some places longer) due to the potential for confounding deterministic gradual trends with multi-decadal climatic fluctuations. Trend results are also available for post-reservoir-construction periods within the four trend periods described above, to avoid including abrupt changes arising from the construction of larger reservoirs in periods for which gradual monotonic trends are computed. Other abrupt changes, such as changes to water withdrawals and wastewater return flows, or episodic disturbances with multi-year recovery periods, such as wildfires, are not evaluated. Sites with pronounced abrupt changes or other non-monotonic trajectories of change may require more sophisticated trend analyses than those presented in this data release.
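
The two techniques named above can be sketched in a few lines of pure Python: the Mann-Kendall S statistic (a sum of signs of pairwise differences) and a Theil-Sen slope (the median of all pairwise slopes). The actual analysis used the R model archive described in this release; the annual series below is made up for illustration:

```python
# Illustrative sketch of the Mann-Kendall S statistic and the
# Theil-Sen slope estimator; the flow series is invented, not USGS data.
from itertools import combinations
from statistics import median

def mann_kendall_s(values):
    """S > 0 suggests an upward trend, S < 0 a downward one."""
    return sum(
        (b > a) - (b < a)  # sign of each pairwise difference
        for a, b in combinations(values, 2)
    )

def theil_sen_slope(years, values):
    """Median of all pairwise slopes; robust to outlying years."""
    slopes = [
        (v2 - v1) / (y2 - y1)
        for (y1, v1), (y2, v2) in combinations(zip(years, values), 2)
        if y2 != y1
    ]
    return median(slopes)

years = [2000, 2001, 2002, 2003, 2004]
flow = [10.0, 12.0, 11.0, 14.0, 15.0]  # made-up annual mean flows
print(mann_kendall_s(flow), theil_sen_slope(years, flow))
```

For production use, significance testing of S (and tie/autocorrelation corrections) would be needed on top of this sketch.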

  20. ‘Austin's data portal activity metrics’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Austin's data portal activity metrics’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-austin-s-data-portal-activity-metrics-1ce3/latest
    Explore at:
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Austin's data portal activity metrics’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/data-portal-activity-metricse on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    About this dataset

    Background

    Austin's open data portal provides lots of public data about the City of Austin. It also provides portal administrators with behind-the-scenes information about how the portal is used... but that data is mysterious, hard to handle in a spreadsheet, and not located all in one place.

    Until now! Authorized city staff used admin credentials to grab this usage data and share it with the public. The City of Austin wants to use this data to inform the development of its open data initiative and manage the open data portal more effectively.

    This project contains related datasets for anyone to explore. These include site-level metrics, dataset-level metrics, and department information for context. A detailed description of how the files were prepared (along with code) can be found on GitHub here.

    Example questions to answer about the data portal

    1. What parts of the open data portal do people seem to value most?
    2. What can we tell about who our users are?
    3. How are our data publishers doing?
    4. How much data is published programmatically vs manually?
    5. How much data is super fresh? Super stale?
    6. Whatever you think we should know...

    About the files

    all_views_20161003.csv

    There is a resource available to portal administrators called "Dataset of datasets". This is the export of that resource, and it was captured on Oct 3, 2016. It contains a summary of the assets available on the data portal. While this file contains over 1400 resources (such as views, charts, and binary files), only 363 are actual tabular datasets.

    table_metrics_ytd.csv

    This file contains information about the 363 tabular datasets on the portal. Activity metrics for an individual dataset can be accessed by calling Socrata's views/metrics API and passing along the dataset's unique ID, a time frame, and admin credentials. The process of obtaining the 363 identifiers, calling the API, and staging the information can be reviewed in the python notebook here.
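
The lookup step described above can be sketched as URL construction. The endpoint path and query parameters are our assumption about Socrata's views/metrics API rather than something documented in this dataset; the dataset ID below is hypothetical, and the admin credentials would be supplied separately (e.g. via HTTP basic auth on the actual request):

```python
# Hedged sketch: building a per-dataset metrics request for a Socrata
# portal. The path and parameters are assumptions about the admin API;
# "abcd-1234" is a hypothetical dataset ID, not one from this project.
def metrics_url(domain: str, dataset_id: str, start_ms: int, end_ms: int) -> str:
    """Build the metrics request URL for one dataset and time frame."""
    return (
        f"https://{domain}/api/views/{dataset_id}/metrics.json"
        f"?start={start_ms}&end={end_ms}"
    )

url = metrics_url("data.austintexas.gov", "abcd-1234", 0, 1000)
print(url)
```

The referenced python notebook is the authoritative record of how the 363 identifiers were actually obtained and staged.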

    site_metrics.csv

    This file is the export of site-level stats that Socrata generates using a given time frame and grouping preference. This file contains records about site usage each month from Nov 2011 through Sept 2016. By the way, it contains 285 columns... and we don't know what many of them mean. But we are determined to find out!! For a preliminary exploration of the columns and which portal-related business processes they might relate to, check out the notes in this python notebook here.

    city_departments_in_current_budget.csv

    This file contains a list of all City of Austin departments according to how they're identified in the most recently approved budget documents. Could be helpful for getting to know more about who the publishers are.

    crosswalk_to_budget_dept.csv

    The City is in the process of standardizing how departments identify themselves on the data portal. In the meantime, here's a crosswalk from the department values observed in all_views_20161003.csv to the department names that appear in the City's budget.

    This dataset was created by Hailey Pate and contains around 100 samples along with technical information and features such as Di Sync Success, Browser Firefox 19, Browser Firefox 33, Di Sync Failed, and more.

    How to use this dataset

    • Analyze Sf Query Error User in relation to Js Page View Admin
    • Study the influence of Browser Firefox 37 on Datasets Created
    • More datasets

    Acknowledgements

    If you use this dataset in your research, please credit Hailey Pate

    Start A New Notebook!

    --- Original source retains full ownership of the source dataset ---

Correction notice

Since publication in November 2020, Lewisham and Leicestershire councils have identified errors in the new birth visits within 14 days data they submitted to Public Health England (PHE) for 2019 to 2020. These errors caused a statistically significant change in the health visiting data for 2019 to 2020, and so the Office for Health Improvement and Disparities (OHID) has updated and reissued the data in OHID’s Fingertips tool.

A correction notice has been added to the 2019 to 2020 annual statistical release and statistical commentary, but the data has not been altered.

Please consult OHID’s Fingertips tool for corrected data for Lewisham and Leicestershire, the London and East Midlands region, and England.