100+ datasets found

Incomplete Published Data Assets Report
datasets.ai
s.cnmilf.com
Updated Aug 26, 2024
Cite
Department of Homeland Security (2024). Incomplete Published Data Assets Report [Dataset]. https://datasets.ai/datasets/incomplete-published-data-assets-report
Dataset updated
Aug 26, 2024
Dataset authored and provided by
Department of Homeland Security
Description
Displays incomplete Published data assets. This report can be used to help improve the Data Asset Completeness score from the Enterprise Data Management (EDM) Scorecard by identifying which missing fields are required for completeness.
Data from: Robust multivariate mixture regression models with incomplete...
tandf.figshare.com
datasetcatalog.nlm.nih.gov
text/x-tex
Updated Jun 2, 2023
Cite
Hwa Kyung Lim; Naveen N. Narisetty; Sooyoung Cheon (2023). Robust multivariate mixture regression models with incomplete data [Dataset]. http://doi.org/10.6084/m9.figshare.3491345.v1
Available download formats: text/x-tex
Unique identifier
https://doi.org/10.6084/m9.figshare.3491345.v1
Dataset updated
Jun 2, 2023
Dataset provided by
Taylor & Francis
Authors
Hwa Kyung Lim; Naveen N. Narisetty; Sooyoung Cheon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Multivariate mixture regression models can be used to investigate the relationships between two or more response variables and a set of predictor variables by taking into consideration unobserved population heterogeneity. It is common to take multivariate normal distributions as mixing components, but this mixing model is sensitive to heavy-tailed errors and outliers. Although normal mixture models can approximate any distribution in principle, the number of components needed to account for heavy-tailed distributions can be very large. Mixture regression models based on the multivariate t distributions can be considered as a robust alternative approach. Missing data are inevitable in many situations and parameter estimates could be biased if the missing values are not handled properly. In this paper, we propose a multivariate t mixture regression model with missing information to model heterogeneity in regression function in the presence of outliers and missing values. Along with the robust parameter estimation, our proposed method can be used for (i) visualization of the partial correlation between response variables across latent classes and heterogeneous regressions, and (ii) outlier detection and robust clustering even under the presence of missing values. We also propose a multivariate t mixture regression model using MM-estimation with missing information that is robust to high-leverage outliers. The proposed methodologies are illustrated through simulation studies and real data analysis.
A dataset from a survey investigating disciplinary differences in data...
zenodo.org
data.niaid.nih.gov
bin, csv, pdf, txt
Updated Jul 12, 2024
Cite
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein (2024). A dataset from a survey investigating disciplinary differences in data citation [Dataset]. http://doi.org/10.5281/zenodo.7555363
Available download formats: csv, txt, pdf, bin
Unique identifier
https://doi.org/10.5281/zenodo.7555363
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anton Boudreau Ninkov; Chantal Ripp; Kathleen Gregory; Isabella Peters; Stefanie Haustein
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
GENERAL INFORMATION

Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation

Date of data collection: January to March 2022

Collection instrument: SurveyMonkey

Funding: Alfred P. Sloan Foundation

SHARING/ACCESS INFORMATION

Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license

Links to publications that cite or use the data:

Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437

Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266

DATA & FILE OVERVIEW

File List

Filename: MDCDatacitationReuse2021Codebook.pdf
Codebook

Filename: MDCDataCitationReuse2021surveydata.csv
Dataset format in csv

Filename: MDCDataCitationReuse2021surveydata.sav
Dataset format in SPSS

Filename: MDCDataCitationReuseSurvey2021QNR.pdf
Questionnaire

Additional related data collected that was not included in the current data package: Open ended questions asked to respondents

METHODOLOGICAL INFORMATION

Description of methods used for collection/generation of data:

The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.

Received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses and an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
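
The completion-rate arithmetic above can be checked with a short sketch. It uses only the counts quoted in the description; the helper name `rate` is hypothetical. Under this calculation the reported 68.6% appears to correspond to the 2,492 final complete responses rather than the 2,509 initially completed ones, although the description does not say which count was used.

```python
def rate(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


received = 3632        # responses received (from the description)
completed = 2509       # responses reported as completed
final_complete = 2492  # complete responses kept in the final dataset

# Compare both candidate numerators against the reported completion rate.
print("based on 2,509:", rate(completed, received))
print("based on 2,492:", rate(final_complete, received))
```

The response rates (1.57% and 1.62%) cannot be reproduced the same way because the number of invitations sent is not stated in the description.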

Methods for processing the data:

Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.

Instrument- or software-specific information needed to interpret the data:

The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded format in CSV. The Codebook is required to interpret the values.

DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata

Number of variables: 94

Number of cases/rows: 2,492

Missing data codes: 999 Not asked

Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
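
As a minimal sketch of how the "999 Not asked" code might be handled when reading the coded CSV, the snippet below maps that code to `None` using only the Python standard library. The column names `resp_id` and `q1` and the in-memory sample are hypothetical; the real variable names are defined in the codebook, which should also be checked for any variable where 999 could be a legitimate value.

```python
import csv
import io

MISSING_CODE = "999"  # "Not asked", per the dataset description


def load_rows(handle):
    """Read coded survey rows, replacing the 999 code with None."""
    reader = csv.DictReader(handle)
    return [
        {key: (None if value == MISSING_CODE else value) for key, value in row.items()}
        for row in reader
    ]


# Hypothetical two-column sample standing in for the real survey file.
sample = io.StringIO("resp_id,q1\n1,3\n2,999\n")
rows = load_rows(sample)
print(rows)
```

To read the published file instead, the same function could be given `open("MDCDataCitationReuse2021surveydata.csv", newline="")`.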
Data from: Archetypal Analysis With Missing Data: See All Samples by Looking...
tandf.figshare.com
application/x-rar
Updated Jun 1, 2023
Cite
Irene Epifanio; M. Victoria Ibáñez; Amelia Simó (2023). Archetypal Analysis With Missing Data: See All Samples by Looking at a Few Based on Extreme Profiles [Dataset]. http://doi.org/10.6084/m9.figshare.7445378.v2
Available download formats: application/x-rar
Unique identifier
https://doi.org/10.6084/m9.figshare.7445378.v2
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis
Authors
Irene Epifanio; M. Victoria Ibáñez; Amelia Simó
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this article, we propose several methodologies for handling missing or incomplete data in archetype analysis (AA) and archetypoid analysis (ADA). AA seeks to find archetypes, which are convex combinations of data points, and to approximate the samples as mixtures of those archetypes. In ADA, the representative archetypal data belong to the sample, that is, they are actual data points. With the proposed procedures, missing data are not discarded or previously filled by imputation and the theoretical properties regarding location of archetypes are guaranteed, unlike the previous approaches. The new procedures adapt the AA algorithm either by considering the missing values in the computation of the solution or by skipping them. In the first case, the solutions of previous approaches are modified to fulfill the theory and a new procedure is proposed, where the missing values are updated by the fitted values. In this second case, the procedure is based on the estimation of dissimilarities between samples and the projection of these dissimilarities in a new space, where AA or ADA is applied, and those results are used to provide a solution in the original space. A comparative analysis is carried out in a simulation study, with favorable results. The methodology is also applied to two real datasets: a well-known climate dataset and a global development dataset. We illustrate how these unsupervised methodologies allow complex data to be understood, even by nonexperts. Supplementary materials for this article are available online.
ARCHIVED: COVID-19 Testing by Race/Ethnicity Over Time
healthdata.gov
data.sfgov.org
application/rdfxml +5
Updated Apr 8, 2025
Cite
data.sfgov.org (2025). ARCHIVED: COVID-19 Testing by Race/Ethnicity Over Time [Dataset]. https://healthdata.gov/dataset/ARCHIVED-COVID-19-Testing-by-Race-Ethnicity-Over-T/ntmc-mxb8
Available download formats: tsv, csv, json, application/rssxml, application/rdfxml, xml
Dataset updated
Apr 8, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY This dataset includes San Francisco COVID-19 tests by race/ethnicity and by date. This dataset represents the daily count of tests collected, and the breakdown of test results (positive, negative, or indeterminate). Tests in this dataset include all those collected from persons who listed San Francisco as their home address at the time of testing. It also includes tests that were collected by San Francisco providers for persons who were missing a locating address. This dataset does not include tests for residents listing a locating address outside of San Francisco, even if they were tested in San Francisco.

The data were de-duplicated by individual and date, so if a person gets tested multiple times on different dates, all tests will be included in this dataset (on the day each test was collected). If a person tested multiple times on the same date, only one test is included from that date. When there are multiple tests on the same date, a positive result, if one exists, will always be selected as the record for the person. If a PCR and antigen test are taken on the same day, the PCR test will supersede. If a person tests multiple times on the same day and the results are all the same (e.g. all negative or all positive) then the first test done is selected as the record for the person.
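
The de-duplication rules above can be sketched as a ranking over the tests a person has on a single date. This is an illustrative reading, not the health department's actual code: the `LabTest` record, its field names, and the function name are hypothetical, and the order in which the rules are applied (positive result first, then PCR over antigen, then earliest collection time) is an assumption, since the description does not say which rule wins when they conflict.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class LabTest:
    person_id: str
    collected_at: datetime
    kind: str    # "pcr" or "antigen"
    result: str  # "positive", "negative", or "indeterminate"


def pick_daily_record(tests):
    """Keep one test per (person, collection date)."""

    def rank(t):
        # Lower tuples win: positive first, then PCR, then earliest time.
        return (t.result != "positive", t.kind != "pcr", t.collected_at)

    best = {}
    for t in tests:
        key = (t.person_id, t.collected_at.date())
        if key not in best or rank(t) < rank(best[key]):
            best[key] = t
    return list(best.values())


tests = [
    LabTest("p1", datetime(2021, 1, 5, 9, 0), "antigen", "negative"),
    LabTest("p1", datetime(2021, 1, 5, 15, 0), "pcr", "positive"),
    LabTest("p1", datetime(2021, 1, 6, 10, 0), "pcr", "negative"),
]
kept = pick_daily_record(tests)
print(len(kept), [t.result for t in kept])
```

Tests on different dates are all retained, which matches the statement that repeat testing on different days is counted on each collection day.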

The total number of positive test results is not equal to the total number of COVID-19 cases in San Francisco.

When a person gets tested for COVID-19, they may be asked to report information about themselves. One piece of information that might be requested is a person's race and ethnicity. These data are often incomplete in the laboratory and provider reports of the test results sent to the health department. The data can be missing or incomplete for several possible reasons:

• The person was not asked about their race and ethnicity.
• The person was asked, but refused to answer.
• The person answered, but the testing provider did not include the person's answers in the reports.
• The testing provider reported the person's answers in a format that could not be used by the health department.

For any of these reasons, a person's race/ethnicity will be recorded in the dataset as “Unknown.”

B. NOTE ON RACE/ETHNICITY The different values for Race/Ethnicity in this dataset are "Asian;" "Black or African American;" "Hispanic or Latino/a, all races;" "American Indian or Alaska Native;" "Native Hawaiian or Other Pacific Islander;" "White;" "Multi-racial;" "Other;" and "Unknown."

The Race/Ethnicity categorization increases data clarity by emulating the methodology used by the U.S. Census in the American Community Survey. Specifically, persons who identify as "Asian," "Black or African American," "American Indian or Alaska Native," "Native Hawaiian or Other Pacific Islander," "White," "Multi-racial," or "Other" do NOT include any person who identified as Hispanic/Latino at any time in their testing reports that either (1) identified them as SF residents or (2) as someone who tested without a locating address by an SF provider. All persons across all races who identify as Hispanic/Latino are recorded as "Hispanic or Latino/a, all races." This categorization increases data accuracy by correcting the way “Other” persons were counted. Previously, when a person reported “Other” for Race/Ethnicity, they would be recorded “Unknown.” Under the new categorization, they are counted as “Other” and are distinct from “Unknown.”

If a person records their race/ethnicity as “Asian,” “Black or African American,” “American Indian or Alaska Native,” “Native Hawaiian or Other Pacific Islander,” “White,” or “Other” for their first COVID-19 test, then this data will not change—even if a different race/ethnicity is reported for this person for any future COVID-19 test. There are two exceptions to this rule. The first exception is if a person’s race/ethnicity value i
AI-Generated Synthetic Tabular Dataset Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Aug 4, 2025
Cite
Growth Market Reports (2025). AI-Generated Synthetic Tabular Dataset Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/ai-generated-synthetic-tabular-dataset-market
Available download formats: pptx, csv, pdf
Dataset updated
Aug 4, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
AI-Generated Synthetic Tabular Dataset Market Outlook

According to our latest research, the AI-Generated Synthetic Tabular Dataset market size reached USD 1.42 billion in 2024 globally, reflecting the rapid adoption of artificial intelligence-driven data generation solutions across numerous industries. The market is expected to expand at a robust CAGR of 34.7% from 2025 to 2033, reaching a forecasted value of USD 19.17 billion by 2033. This exceptional growth is primarily driven by the increasing need for high-quality, privacy-preserving datasets for analytics, model training, and regulatory compliance, particularly in sectors with stringent data privacy requirements.
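
The relationship between the start value, end value, and compound annual growth rate quoted above can be sketched as follows. The function names are hypothetical, and treating 2024 to 2033 as nine compounding periods is an assumption, because the report does not state which base year its CAGR uses. Under that assumption the implied rate and the projected value need not match the reported 34.7% and USD 19.17 billion exactly.

```python
def cagr(start: float, end: float, periods: int) -> float:
    """Compound annual growth rate implied by a start and end value."""
    return (end / start) ** (1 / periods) - 1


def project(start: float, rate: float, periods: int) -> float:
    """Value reached after compounding `rate` for `periods` years."""
    return start * (1 + rate) ** periods


START_2024 = 1.42  # USD billion, from the report
END_2033 = 19.17   # USD billion, from the report
PERIODS = 9        # assumption: 2024 -> 2033 counted as nine periods

print(f"implied CAGR: {cagr(START_2024, END_2033, PERIODS):.1%}")
print(f"projected 2033 value at 34.7%: {project(START_2024, 0.347, PERIODS):.2f}")
```

Changing `PERIODS` shows how sensitive the implied rate is to the choice of base year.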

One of the principal growth factors propelling the AI-Generated Synthetic Tabular Dataset market is the escalating demand for data-driven innovation amidst tightening data privacy regulations. Organizations across healthcare, finance, and government sectors are facing mounting challenges in accessing and sharing real-world data due to GDPR, HIPAA, and other global privacy laws. Synthetic data, generated by advanced AI algorithms, offers a solution by mimicking the statistical properties of real datasets without exposing sensitive information. This enables organizations to accelerate AI and machine learning development, conduct robust analytics, and facilitate collaborative research without risking data breaches or non-compliance. The growing sophistication of generative models, such as GANs and VAEs, has further increased confidence in the utility and realism of synthetic tabular data, fueling adoption across both large enterprises and research institutions.

Another significant driver is the surge in digital transformation initiatives and the proliferation of AI and machine learning applications across industries. As businesses strive to leverage predictive analytics, automation, and intelligent decision-making, the need for large, diverse, and high-quality datasets has become paramount. However, real-world data is often siloed, incomplete, or inaccessible due to privacy concerns. AI-generated synthetic tabular datasets bridge this gap by providing scalable, customizable, and bias-mitigated data for model training and validation. This not only accelerates AI deployment but also enhances model robustness and generalizability. The flexibility of synthetic data generation platforms, which can simulate rare events and edge cases, is particularly valuable in sectors like finance and healthcare, where such scenarios are underrepresented in real datasets but critical for risk assessment and decision support.

The rapid evolution of the AI-Generated Synthetic Tabular Dataset market is also underpinned by technological advancements and growing investments in AI infrastructure. The availability of cloud-based synthetic data generation platforms, coupled with advancements in natural language processing and tabular data modeling, has democratized access to synthetic datasets for organizations of all sizes. Strategic partnerships between technology providers, research institutions, and regulatory bodies are fostering innovation and establishing best practices for synthetic data quality, utility, and governance. Furthermore, the integration of synthetic data solutions with existing data management and analytics ecosystems is streamlining workflows and reducing barriers to adoption, thereby accelerating market growth.

Regionally, North America dominates the AI-Generated Synthetic Tabular Dataset market, accounting for the largest share in 2024 due to the presence of leading AI technology firms, strong regulatory frameworks, and early adoption across industries. Europe follows closely, driven by stringent data protection laws and a vibrant research ecosystem. The Asia Pacific region is emerging as a high-growth market, fueled by rapid digitalization, government initiatives, and increasing investments in AI research and development. Latin America and the Middle East & Africa are also witnessing growing interest, particularly in sectors like finance and government, though market maturity varies across countries. The regional landscape is expected to evolve dynamically as regulatory harmonization, cross-border data collaboration, and technological advancements continue to shape market trajectories globally.

Data from: Incomplete specimens in geometric morphometric analyses
zenodo.org
explore.openaire.eu
Updated Oct 11, 2014
Cite
Arbour, Jessica H.; Brown, Caleb M. (2014). Data from: Incomplete specimens in geometric morphometric analyses [Dataset]. http://doi.org/10.5061/dryad.mp713
Unique identifier
https://doi.org/10.5061/dryad.mp713
Dataset updated
Oct 11, 2014
Dataset provided by
University of Toronto
Authors
Arbour, Jessica H.; Brown, Caleb M.
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
1. The analysis of morphological diversity frequently relies on the use of multivariate methods for characterizing biological shape. However, many of these methods are intolerant of missing data, which can limit the use of rare taxa and hinder the study of broad patterns of ecological diversity and morphological evolution. This study applied a multi-dataset approach to compare variation in missing data estimation and its effect on geometric morphometric analysis across taxonomically-variable groups, landmark position and sample sizes.
2. Missing morphometric landmark data was simulated from five real, complete datasets, including modern fish, primates and extinct theropod dinosaurs. Missing landmarks were then estimated using several standard approaches and a geometric-morphometric-specific method. The accuracy of missing data estimation was determined for each estimation method, landmark position, and morphological dataset. Procrustes superimposition was used to compare the eigenvectors and principal component scores of a geometric morphometric analysis of the original landmark data, to datasets with A) missing values estimated, or B) simulated incomplete specimens excluded, for varying levels of specimen incompleteness and sample sizes.
3. Standard estimation techniques were more reliable estimators and had lower impacts on morphometric analysis compared to a geometric-morphometric-specific estimator. For most datasets and estimation techniques, estimating missing data produced a better fit to the structure of the original data than exclusion of incomplete specimens, and this was maintained even at considerably reduced sample sizes. The impact of missing data on geometric morphometric analysis was disproportionately affected by the most fragmentary specimens.
4. Missing data estimation was influenced by variability of specific anatomical features, and may be improved by a better understanding of shape variation present in a dataset. Our results suggest that the inclusion of incomplete specimens through the use of effective missing data estimators better reflects the patterns of shape variation within a dataset than using only complete specimens; however, the effectiveness of missing data estimation can be maximized by excluding only the most incomplete specimens. It is advised that missing data estimators be evaluated for each dataset and landmark independently, as the effectiveness of estimators can vary strongly and unpredictably between different taxa and structures.
Campaign Finance - Transactions
data.sfgov.org
s.cnmilf.com
csv, xlsx, xml
Updated Oct 8, 2025
Cite
(2025). Campaign Finance - Transactions [Dataset]. https://data.sfgov.org/City-Management-and-Ethics/Campaign-Finance-Transactions/pitq-e56w
Available download formats: csv, xml, xlsx
Dataset updated
Oct 8, 2025
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
A. SUMMARY Transactions from FPPC Forms 460, 461, 496, 497, and 450. This dataset combines all schedules, pages, and includes unitemized totals. Only transactions from the "most recent" version of a filing (original/amendment) appear here.

B. HOW THE DATASET IS CREATED Committees file campaign statements with the Ethics Commission on a periodic basis. Those statements are stored with the Commission's data provider. Data is generally presented as-filed by committees.

If a committee files an amendment, the data from that filing completely replaces the original and any prior amendments in the filing sequence.

C. UPDATE PROCESS Each night, starting at midnight Pacific time, a script checks the Commission's database for new filings and updates this dataset with transactions from those filings. The update process can take a variable amount of time to complete. Viewing or downloading this dataset while the update is running may result in incomplete data; it is therefore highly recommended to view or download this data before midnight or after 8am.

During the update, some fields are copied from the Filings dataset into this dataset for viewing convenience. The copy process may occasionally fail for some transactions due to timing issues but should self-correct the following day. Transactions with a blank 'Filing Id Number' or 'Filing Date' field are such transactions, but can be joined with the appropriate record using the 'Filing Activity Nid' field shared between Filing and Transaction datasets.
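
The join described above can be sketched with plain dictionaries. The three field names come from the description; the function name and the sample rows are hypothetical, and the exact column headers in the downloaded files may be spelled differently.

```python
def fill_from_filings(transactions, filings):
    """Fill blank filing fields on transactions via 'Filing Activity Nid'."""
    by_nid = {f["Filing Activity Nid"]: f for f in filings}
    filled = []
    for tx in transactions:
        row = dict(tx)  # copy so the input rows are left unchanged
        match = by_nid.get(row.get("Filing Activity Nid"))
        if match:
            for field in ("Filing Id Number", "Filing Date"):
                if not row.get(field):
                    row[field] = match.get(field, "")
        filled.append(row)
    return filled


# Hypothetical rows: the second transaction missed the nightly copy step.
filings = [
    {"Filing Activity Nid": "a1", "Filing Id Number": "1001", "Filing Date": "2025-01-15"},
]
transactions = [
    {"Filing Activity Nid": "a1", "Filing Id Number": "1001", "Filing Date": "2025-01-15"},
    {"Filing Activity Nid": "a1", "Filing Id Number": "", "Filing Date": ""},
]
repaired = fill_from_filings(transactions, filings)
print(repaired[1])
```

Only blank fields are filled, so values already copied by the nightly update are never overwritten.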

D. HOW TO USE THIS DATASET
Transactions from rejected filings are not included in this dataset. Transactions from many different FPPC forms and schedules are combined in this dataset; refer to the column "Form Type" to differentiate transaction types. Properties suffixed with "-nid" can be used to join the data between Filers, Filings, and Transaction datasets. Refer to the Ethics Commission's webpage for more information. FPPC Form 460 is organized into Schedules as follows:

A: Monetary Contributions Received

B1: Loans Received

B2: Loan Guarantors

C: Nonmonetary Contributions Received

D: Summary of Expenditures Supporting/Opposing Other Candidates, Measures and Committees

E: Payments Made

F: Accrued Expenses (Unpaid Bills)

G: Payments Made by an Agent or Independent Contractor (on Behalf of This Committee)

H: Loans Made to Others

I: Miscellaneous Increases to Cash

RELATED DATASETS

San Francisco Campaign Filers

Filings Received by SFEC

Summary Totals

Transactions
Global scientific academies Dataset
scidb.cn
Updated Nov 18, 2024
Cite
chen xiaoli (2024). Global scientific academies Dataset [Dataset]. http://doi.org/10.57760/sciencedb.14674
Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.57760/sciencedb.14674
Dataset updated
Nov 18, 2024
Dataset provided by
Science Data Bank
Authors
chen xiaoli
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This dataset was generated as part of a study aimed at profiling global scientific academies, which play a significant role in promoting scholarly communication and scientific progress. Below is a detailed description of the dataset:

Data Generation Procedures and Tools: The dataset was compiled using a combination of web scraping, manual verification, and data integration from multiple sources, including Wikipedia categories, members of unions of scientific organizations, and web searches using specific query phrases (e.g., "country name + (academy OR society) AND site:.country code"). The records were enriched by cross-referencing data from the Wikidata API, the VIAF API, and the Research Organisation Registry (ROR). Additional manual curation ensured accuracy and consistency.

Temporal and Geographical Scopes: The dataset covers scientific academies from a wide temporal scope, ranging from the 15th century to the present. The geographical scope includes academies from all continents, with emphasis on both developed and post-developing countries. The dataset aims to capture the full spectrum of scientific academies across different periods of historical development.

Tabular Data Description: The dataset comprises a total of 301 academy records and 14,008 website navigation sections. Each row in the dataset represents a single scientific academy, while the columns describe attributes such as the academy’s name, founding date, location (city and country), website URL, email, and address.

Missing Data: Although the dataset offers comprehensive coverage, some entries may have missing or incomplete fields. For instance, section was not available for all records.

Data Errors and Error Ranges: The data has been verified through manual curation, reducing the likelihood of errors. However, the use of crowd-sourced data from platforms like Wikipedia introduces potential risks of outdated or incomplete information. Any errors are likely minor and confined to fields such as navigation menu classifications, which may not fully reflect the breadth of an academy's activities.

Data Files, Formats, and Sizes: The dataset is provided in CSV format and JSON format, ensuring compatibility with a wide range of software applications, including Microsoft Excel, Google Sheets, and programming languages such as Python (via libraries like pandas).

This dataset provides a valuable resource for further research into the organizational behaviors, geographic distribution, and historical significance of scientific academies across the globe. It can be used for large-scale analyses, including comparative studies across different regions or time periods.

Any feedback on the data is welcome! Please contact the maintainer of the dataset!

If you use the data, please cite the following paper: Xiaoli Chen and Xuezhao Wang. 2024. Profiling Global Scientific Academies. In The 2024 ACM/IEEE Joint Conference on Digital Libraries (JCDL ’24), December 16–20, 2024, Hong Kong, China. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3677389.3702582
‘Store Transaction data’ analyzed by Analyst-2
analyst-2.ai
Updated Sep 30, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Store Transaction data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-store-transaction-data-ccd5/dd67e58b/?iid=035-749&v=presentation
Dataset updated
Sep 30, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Store Transaction data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/iamprateek/store-transaction-data on 30 September 2021.

--- Dataset description provided by original source is as follows ---

Context

Nielsen receives transaction level scanning data (POS Data) from its partner stores on a regular basis. Stores sharing POS data include bigger format store types such as supermarkets, hypermarkets as well as smaller traditional trade grocery stores (Kirana stores), medical stores etc. using a POS machine.

While in a bigger format store, all items for all transactions are scanned using a POS machine, smaller and more localized shops do not have a 100% compliance rate in terms of scanning and inputting information into the POS machine for all transactions.

A transaction involving a single packet of chips or a single piece of candy may not be scanned and recorded to spare customer the inconvenience or during rush hours when the store is crowded with customers.

Thus, the data received from such stores is often incomplete and lacks complete information of all transactions completed within a day.

Additionally, apart from incomplete transaction data in a day, it is observed that certain stores do not share data for all active days. Stores share data ranging from 2 to 28 days in a month. While it is possible to impute/extrapolate data for 2 days of a month using 28 days of actual historical data, the vice versa is not recommended.

Nielsen encourages you to create a model which can help impute/extrapolate data to fill in the missing data gaps in the store level POS data currently received.
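
As a point of reference for the task described above, a naive baseline could scale the average of the reported days up to the full month. This is only an illustrative sketch, not the method Nielsen or the dataset author uses; the function name and sample values are hypothetical. In line with the caution above, it is only reasonable when most days of the month were reported.

```python
import calendar


def extrapolate_month_total(daily_values, year, month):
    """Scale the mean of the reported days up to the full month length."""
    if not daily_values:
        raise ValueError("need at least one reported day")
    days_in_month = calendar.monthrange(year, month)[1]
    return sum(daily_values) / len(daily_values) * days_in_month


# Hypothetical store that reported 28 of the 30 days in April 2020.
reported = [100.0] * 28
estimate = extrapolate_month_total(reported, 2020, 4)
print(estimate)
```

A real imputation model would likely also use store type, brand, and category features rather than a single per-store mean.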

Content

You are provided with the dataset that contains store level data by brands and categories for select stores-

Hackathon_ Ideal_Data - The file contains brand level data for 10 stores for the last 3 months. This can be referred to as the ideal data.

Hackathon_Working_Data - This contains data for selected stores which are missing and/or incomplete.

Hackathon_Mapping_File - This file is provided to help understand the column names in the data set.

Hackathon_Validation_Data - This file contains the data stores and product groups for which you have to predict the Total_VALUE.

Sample Submission - This file represents what needs to be uploaded as output by candidate in the same format. The sample data is provided in the file to help understand the columns and values required.

Acknowledgements

Nielsen Holdings plc (NYSE: NLSN) is a global measurement and data analytics company that provides the most complete and trusted view available of consumers and markets worldwide. Nielsen is divided into two business units. Nielsen Global Media, the arbiter of truth for media markets, provides media and advertising industries with unbiased and reliable metrics that create a shared understanding of the industry required for markets to function. Nielsen Global Connect provides consumer packaged goods manufacturers and retailers with accurate, actionable information and insights and a complete picture of the complex and changing marketplace that companies need to innovate and grow. Our approach marries proprietary Nielsen data with other data sources to help clients around the world understand what’s happening now, what’s happening next, and how to best act on this knowledge. An S&P 500 company, Nielsen has operations in over 100 countries, covering more than 90% of the world’s population.

Know more: https://www.nielsen.com/us/en/

Inspiration

Build an imputation and/or extrapolation model to fill the missing data gaps for select stores by analyzing the data and determining which factors/variables/features best help predict store sales.

--- Original source retains full ownership of the source dataset ---
Store Transaction data
kaggle.com
Updated Mar 18, 2020
Cite
Prateek Gupta (2020). Store Transaction data [Dataset]. https://www.kaggle.com/iamprateek/store-transaction-data/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 18, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Prateek Gupta
Description
Context

Nielsen receives transaction-level scanning data (POS data) from its partner stores on a regular basis. Stores sharing POS data include larger-format stores such as supermarkets and hypermarkets, as well as smaller traditional trade grocery stores (Kirana stores), medical stores, and other outlets that use a POS machine.

In larger-format stores, every item in every transaction is scanned using a POS machine. Smaller, more localized shops, however, do not reach 100% compliance in scanning and entering every transaction into the POS machine.

A transaction involving a single packet of chips or a single piece of candy may go unscanned and unrecorded, either to spare the customer the inconvenience or because the store is crowded during rush hours.

Thus, the data received from such stores is often incomplete and does not cover all transactions completed within a day.

In addition to incomplete transaction data within a day, certain stores do not share data for all active days. Stores share data for anywhere from 2 to 28 days in a month. While it is possible to impute/extrapolate data for 2 days of a month using 28 days of actual historical data, the reverse is not recommended.

Nielsen encourages you to create a model which can help impute/extrapolate data to fill in the missing data gaps in the store level POS data currently received.
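
As a rough baseline for the extrapolation task described above, a store's partial-month sales can be scaled by the ratio of days in the month to days reported. This is only a sketch under a strong assumption (sales are roughly uniform across days). The function name, the `min_days` threshold, and the sample figures are hypothetical and are not part of the dataset or of any Nielsen method.

```python
import calendar


def extrapolate_monthly_value(daily_values, year, month, min_days=14):
    """Scale observed daily sales up to a full-month estimate.

    daily_values maps day-of-month -> observed sales value for that day.
    Returns None when fewer than min_days days were reported, because
    extrapolating a whole month from only a few days is unreliable.
    min_days is a hypothetical cut-off chosen for illustration.
    """
    days_in_month = calendar.monthrange(year, month)[1]
    days_reported = len(daily_values)
    if days_reported < min_days:
        return None
    mean_daily = sum(daily_values.values()) / days_reported
    return mean_daily * days_in_month


# Hypothetical store that reported 28 of the 30 days in April 2020.
reported = {day: 100.0 for day in range(1, 29)}
estimate = extrapolate_monthly_value(reported, 2020, 4)

# Hypothetical store that reported only 2 days: no estimate is produced.
too_sparse = extrapolate_monthly_value({1: 50.0, 2: 70.0}, 2020, 4)
```

A real model would likely also use brand, category, and store-type features, which this baseline ignores.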

Content

You are provided with a dataset that contains store-level data by brand and category for select stores:

Hackathon_ Ideal_Data - The file contains brand level data for 10 stores for the last 3 months. This can be referred to as the ideal data.

Hackathon_Working_Data - This contains data for selected stores which are missing and/or incomplete.

Hackathon_Mapping_File - This file is provided to help understand the column names in the data set.

Hackathon_Validation_Data - This file contains the stores and product groups for which you have to predict the Total_VALUE.

Sample Submission - This file shows the format in which candidates must upload their output. Sample data is included to help understand the required columns and values.

Acknowledgements

Nielsen Holdings plc (NYSE: NLSN) is a global measurement and data analytics company that provides the most complete and trusted view available of consumers and markets worldwide. Nielsen is divided into two business units. Nielsen Global Media, the arbiter of truth for media markets, provides media and advertising industries with unbiased and reliable metrics that create a shared understanding of the industry required for markets to function. Nielsen Global Connect provides consumer packaged goods manufacturers and retailers with accurate, actionable information and insights and a complete picture of the complex and changing marketplace that companies need to innovate and grow. Our approach marries proprietary Nielsen data with other data sources to help clients around the world understand what’s happening now, what’s happening next, and how to best act on this knowledge. An S&P 500 company, Nielsen has operations in over 100 countries, covering more than 90% of the world’s population.

Know more: https://www.nielsen.com/us/en/

Inspiration

Build an imputation and/or extrapolation model to fill the missing data gaps for select stores by analyzing the data and determining which factors/variables/features best help predict store sales.
GDM register with risk factors for screening selection criteria pilot...
data.mendeley.com
researchdata.edu.au
Updated Apr 24, 2023
Cite
Ezekiel Uba Nwose (2023). GDM register with risk factors for screening selection criteria pilot dataset [Dataset]. http://doi.org/10.17632/r8s3j8hfdb.1
Unique identifier
https://doi.org/10.17632/r8s3j8hfdb.1
Dataset updated
Apr 24, 2023
Authors
Ezekiel Uba Nwose
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: Gestational diabetes mellitus (GDM), if unmanaged, can complicate pregnancy outcomes. Selective screening for GDM is a common policy, hence the need for complete patient medical records. The extent to which, and the pattern in which, incomplete documentation of patients' records can prevent recall of antenatal patients requires elucidation.
Aim: To describe the effectiveness of phone contacts on medical records and GDM risk factors among those reachable by telehealth.
Data: Initial data were collected in 2018, and collection continued in 2019 at Eku Baptist Government Hospital (EBGH). Demographic data were complete for all patients, but incomplete documentation was observed at rates as high as 98%. 301/391 lacked complete data; in about 95% of the cases, this was solely due to missing height measurements. In 2020, 123 case files were reviewed for the effectiveness of phone contacts for telehealth, with simultaneous GDM risk assessment. 98/123 had phone details on their medical records, of which 41/98 cases followed up were reached and hence constituted the pilot dataset.
Analysis performed: Descriptive frequency analysis.
Reuse potential: This dataset is reusable in future systematic reviews and/or meta-analyses. For GDM screening selection criteria, medical records, and telehealth services, this pilot dataset is also useful for identifying health service limitations and areas for improvement.
Evaluating Functional Diversity: Missing Trait Data and the Importance of...
plos.figshare.com
datasetcatalog.nlm.nih.gov
docx
Updated May 30, 2023
Cite
Maria Májeková; Taavi Paal; Nichola S. Plowman; Michala Bryndová; Liis Kasari; Anna Norberg; Matthias Weiss; Tom R. Bishop; Sarah H. Luke; Katerina Sam; Yoann Le Bagousse-Pinguet; Jan Lepš; Lars Götzenberger; Francesco de Bello (2023). Evaluating Functional Diversity: Missing Trait Data and the Importance of Species Abundance Structure and Data Transformation [Dataset]. http://doi.org/10.1371/journal.pone.0149270
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0149270
Dataset updated
May 30, 2023
Dataset provided by
PLOS ONE
Authors
Maria Májeková; Taavi Paal; Nichola S. Plowman; Michala Bryndová; Liis Kasari; Anna Norberg; Matthias Weiss; Tom R. Bishop; Sarah H. Luke; Katerina Sam; Yoann Le Bagousse-Pinguet; Jan Lepš; Lars Götzenberger; Francesco de Bello
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Functional diversity (FD) is an important component of biodiversity that quantifies the difference in functional traits between organisms. However, FD studies are often limited by the availability of trait data and FD indices are sensitive to data gaps. The distribution of species abundance and trait data, and its transformation, may further affect the accuracy of indices when data is incomplete. Using an existing approach, we simulated the effects of missing trait data by gradually removing data from a plant, an ant and a bird community dataset (12, 59, and 8 plots containing 62, 297 and 238 species respectively). We ranked plots by FD values calculated from full datasets and then from our increasingly incomplete datasets and compared the ranking between the original and virtually reduced datasets to assess the accuracy of FD indices when used on datasets with increasingly missing data. Finally, we tested the accuracy of FD indices with and without data transformation, and the effect of missing trait data per plot or per the whole pool of species. FD indices became less accurate as the amount of missing data increased, with the loss of accuracy depending on the index. But, where transformation improved the normality of the trait data, FD values from incomplete datasets were more accurate than before transformation. The distribution of data and its transformation are therefore as important as data completeness and can even mitigate the effect of missing data. Since the effect of missing trait values pool-wise or plot-wise depends on the data distribution, the method should be decided case by case. Data distribution and data transformation should be given more careful consideration when designing, analysing and interpreting FD studies, especially where trait data are missing. To this end, we provide the R package “traitor” to facilitate assessments of missing trait data.
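
The rank comparison described above (plots ranked by FD from the full data, then from reduced data) can be illustrated with a rank correlation. The abstract does not say which statistic the authors used, so Spearman's coefficient is an assumption here, and the FD values below are invented for illustration. This sketch is unrelated to the authors' R package.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical FD values for five plots: full data vs. data with traits removed.
fd_full = [0.42, 0.31, 0.55, 0.18, 0.47]
fd_reduced = [0.40, 0.33, 0.50, 0.20, 0.52]
rho = spearman(fd_full, fd_reduced)
```

A coefficient near 1 would mean the reduced data preserve the plot ranking; the function does not handle the degenerate case where all values in one list are equal.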
NNDSS - TABLE 1V. Malaria to Measles, Indigenous
catalog.data.gov
healthdata.gov
+4 more
Updated Jul 9, 2025
Cite
Centers for Disease Control and Prevention (2025). NNDSS - TABLE 1V. Malaria to Measles, Indigenous [Dataset]. https://catalog.data.gov/dataset/nndss-table-1v-malaria-to-measles-indigenous
Dataset updated
Jul 9, 2025
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Description
NNDSS - TABLE 1V. Malaria to Measles, Indigenous - 2020. In this Table, provisional cases* of notifiable diseases are displayed for United States, U.S. territories, and Non-U.S. residents.

Notice: Data from California published in week 29 for years 2019 and 2020 were incomplete when originally published on July 24, 2020. On August 4, 2020, incomplete case counts were replaced with a "U" indicating case counts are not available for specified time period.

Notice: Measles data for weeks 1-4 (in Table 1v) were updated on 02-28-2020 to correct the classification of imported and indigenous. For all weeks, measles is considered imported if the disease was acquired outside of the United States and is considered indigenous if the disease was acquired anywhere within the United States or it is not known where the disease was acquired.

Note: This table contains provisional cases of national notifiable diseases from the National Notifiable Diseases Surveillance System (NNDSS). NNDSS data from the 50 states, New York City, the District of Columbia and the U.S. territories are collated and published weekly on the NNDSS Data and Statistics web page (https://wwwn.cdc.gov/nndss/data-and-statistics.html). Cases reported by state health departments to CDC for weekly publication are provisional because of the time needed to complete case follow-up. Therefore, numbers presented in later weeks may reflect changes made to these counts as additional information becomes available. The national surveillance case definitions used to define a case are available on the NNDSS web site at https://wwwn.cdc.gov/nndss/. Information about the weekly provisional data and guides to interpreting data are available at: https://wwwn.cdc.gov/nndss/infectious-tables.html.

Footnotes:
U: Unavailable. The reporting jurisdiction was unable to send the data to CDC or CDC was unable to process the data.
-: No reported cases. The reporting jurisdiction did not submit any cases to CDC.
N: Not reportable. The disease or condition was not reportable by law, statute, or regulation in the reporting jurisdiction.
NN: Not nationally notifiable. This condition was not designated as being nationally notifiable.
NP: Nationally notifiable but not published.
NC: Not calculated. There is insufficient data available to support the calculation of this statistic.
Cum: Cumulative year-to-date counts.
Max: Maximum. Maximum case count during the previous 52 weeks.
* Case counts for reporting years 2019 and 2020 are provisional and subject to change. Cases are assigned to the reporting jurisdiction submitting the case to NNDSS, if the case's country of usual residence is the U.S., a U.S. territory, unknown, or null (i.e. country not reported); otherwise, the case is assigned to the 'Non-U.S. Residents' category. Country of usual residence is currently not reported by all jurisdictions or for all conditions. For further information on interpretation of these data, see https://wwwn.cdc.gov/nndss/document/Users_guide_WONDER_tables_cleared_final.pdf.
† Previous 52 week maximum and cumulative YTD are determined from periods of time when the condition was reportable in the jurisdiction (i.e., may be less than 52 weeks of data or incomplete YTD data).
§ Measles is considered imported if the disease was acquired outside of the United States and is considered indigenous if the disease was acquired anywhere within the United States or it is not known where the disease was acquired.
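
When loading this table, the footnote codes above need to be separated from numeric counts. The following is a minimal sketch, assuming cells arrive as strings and that numbers may contain thousands separators; neither assumption is confirmed by the dataset page, and `parse_cell` is a hypothetical helper name. The code meanings are taken from the footnotes.

```python
# Footnote codes and their meanings, as listed in the table footnotes.
NON_NUMERIC_CODES = {
    "U": "Unavailable",
    "-": "No reported cases",
    "N": "Not reportable",
    "NN": "Not nationally notifiable",
    "NP": "Nationally notifiable but not published",
    "NC": "Not calculated",
}


def parse_cell(raw):
    """Return (count, note) for one table cell.

    count is an int when the cell holds a number, otherwise None.
    note is the footnote meaning for a coded cell, otherwise None.
    """
    text = raw.strip()
    if text in NON_NUMERIC_CODES:
        return None, NON_NUMERIC_CODES[text]
    if text == "":
        return None, None
    # Assumption: numeric cells may use commas as thousands separators.
    return int(text.replace(",", "")), None


examples = [parse_cell(cell) for cell in ["U", "-", " 1,234 ", "17"]]
```

The sketch deliberately keeps "-" as a missing count with a note instead of converting it to zero, since the footnote only says that no cases were submitted.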
Percentage of P1 Cohort who Did Not Complete Secondary Education
data.gov.sg
Updated Dec 19, 2024
Cite
Ministry of Education (2024). Percentage of P1 Cohort who Did Not Complete Secondary Education [Dataset]. https://data.gov.sg/dataset/percentage-of-p1-cohort-who-did-not-complete-secondary-education
Dataset updated
Dec 19, 2024
Dataset authored and provided by
Ministry of Education
License
https://data.gov.sg/open-data-licence
Time period covered
Jan 2003 - Jan 2023
Description
Dataset from Ministry of Education. For more information, visit https://data.gov.sg/datasets/d_eb818056f2f6b839256edfd2abb86d7c/view
Replication Data for: Sparse multi-trait genomic prediction under incomplete...
data.moa.gov.et
html
Updated Jan 20, 2025
Cite
CIMMYT Ethiopia (2025). Replication Data for: Sparse multi-trait genomic prediction under incomplete block designs [Dataset]. https://data.moa.gov.et/dataset/hdl-11529-10548787
Available download formats: html
Dataset updated
Jan 20, 2025
Dataset provided by
CIMMYT Ethiopia
Description
The efficiency of genomic selection methodologies can be increased by sparse testing where a subset of materials are evaluated in different environments. Seven different multi-environment plant breeding datasets were used to evaluate four different methods for allocating lines to environments in a multi-trait genomic prediction problem. The results of the analysis are presented in the accompanying article.
Campaign Finance - Summary Totals
s.cnmilf.com
data.sfgov.org
+1 more
Updated Oct 4, 2025
Cite
data.sfgov.org (2025). Campaign Finance - Summary Totals [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/campaign-finance-summary-totals
Dataset updated
Oct 4, 2025
Dataset provided by
data.sfgov.org
Description
A. SUMMARY
This dataset contains current summary information for electronically filed FPPC campaign forms. The columns in this dataset correspond to the figures reported on the summary page of FPPC forms 450, 460, 461, and 465. Refer to the FPPC Forms represented in this dataset.

B. HOW THE DATASET IS CREATED
Committees file campaign statements with the Ethics Commission on a periodic basis. Those statements are stored with the Commission's provider. Data is generally presented as filed by committees. If a committee files an amendment, the data from that filing completely replaces the original and any prior amendments in the filing sequence.

C. UPDATE PROCESS
Each night, starting at midnight Pacific time, a script checks the Commission's database for new filings and updates this dataset with transactions from those filings. The update can take a variable amount of time to complete. Viewing or downloading this dataset while the update is running may result in incomplete data; it is therefore highly recommended to view or download this data before midnight or after 8am.

D. HOW TO USE THIS DATASET
Transactions from rejected and superseded filings are not included in this dataset. Transactions from many different FPPC forms are combined in this dataset; refer to the column "Form Type" to differentiate transaction types. A row with no value in the SyncFlag column indicates that a paper filing amended an electronic filing. The SFEC is working on how to deal with these cases automatically. Properties suffixed with "-nid" can be used to join the data between the Filers, Filings, and Transaction datasets. Refer to the Ethics Commission's webpage for more information.

RELATED DATASETS
San Francisco Campaign Filers; Filings Received by SFEC; Summary Totals; Transactions
Comprehensive battery aging dataset: capacity and impedance fade...
radar.kit.edu
service.tib.eu
+1 more
tar
Updated Mar 7, 2024
Cite
Matthias Luh; Thomas Blank (2024). Comprehensive battery aging dataset: capacity and impedance fade measurements of a lithium-ion NMC/C-SiO cell [dataset] [Dataset]. http://doi.org/10.35097/1947
Available download formats: tar (69375563264 bytes)
Unique identifier
https://doi.org/10.35097/1947
Dataset updated
Mar 7, 2024
Dataset provided by
Karlsruhe Institute of Technology
Authors
Matthias Luh; Thomas Blank
Description
The data is described in detail in the open-access publication "Comprehensive battery aging dataset: capacity and impedance fade measurements of a lithium-ion NMC/C-SiO cell" published in Nature Scientific Data under the DOI: 10.1038/s41597-024-03831-x, also see “Related identifier”. An updated dataset is published under the DOI 10.35097/1969 (result data, e.g., capacity fade and impedance increase) and 10.35097/kww7jv8ajuvchcah (log data), also see “Related identifier”. Python example code to read, process, and visualize the data is provided in the GitHub repository: https://github.com/energystatusdata/bat-age-data-scripts/ Note: The "cell_eisv2.zip" file in this dataset is incomplete and only contains data for cells P001_1 to P044_2. The corrected file "cell_eisv2_fixed.zip" containing data for all 228 cells P001_1 to P076_3 can be found in the dataset “Addendum to "Comprehensive battery aging dataset: capacity and impedance fade measurements of a lithium-ion NMC/C-SiO cell [dataset]"” with the DOI 10.35097/krk531nmj4bsshha (see “Related identifier”).
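
To see which cells are absent from the incomplete "cell_eisv2.zip", the expected cell IDs can be enumerated and compared. This is a sketch only: it assumes IDs follow the pattern `P<3-digit number>_<1..3>` with three cells per parameter set, which is consistent with 228 cells from P001_1 to P076_3 but is not stated explicitly in the description. The helper name is hypothetical and is not taken from the authors' GitHub scripts.

```python
def cell_ids(first_param, last_param, cells_per_param=3):
    """List cell IDs such as 'P001_1' for a range of parameter sets.

    Assumption: every parameter set has the same number of cells,
    numbered from 1.
    """
    return [
        f"P{param:03d}_{cell}"
        for param in range(first_param, last_param + 1)
        for cell in range(1, cells_per_param + 1)
    ]


all_cells = cell_ids(1, 76)

# The incomplete archive is described as covering P001_1 through P044_2.
last_present = all_cells.index("P044_2")
present = all_cells[: last_present + 1]
missing = all_cells[last_present + 1:]
```

Under these assumptions `missing` starts at P044_3 and ends at P076_3, which are the cells to take from the corrected file.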
Data from: Main Effects and Interactions in Mixed and Incomplete Data Frames...
tandf.figshare.com
zip
Updated Jun 1, 2023
Cite
Geneviève Robin; Olga Klopp; Julie Josse; Éric Moulines; Robert Tibshirani (2023). Main Effects and Interactions in Mixed and Incomplete Data Frames [Dataset]. http://doi.org/10.6084/m9.figshare.8191850.v3
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.8191850.v3
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis
Authors
Geneviève Robin; Olga Klopp; Julie Josse; Éric Moulines; Robert Tibshirani
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A mixed data frame (MDF) is a table collecting categorical, numerical, and count observations. The use of MDFs is widespread in statistics, and the applications are numerous, from abundance data in ecology to recommender systems. In many cases, an MDF simultaneously exhibits main effects, such as row, column, or group effects, and interactions, for which a low-rank model has often been suggested. Although the literature on low-rank approximations is very substantial, with few exceptions, existing methods do not allow main effects and interactions to be incorporated while providing statistical guarantees. The present work fills this gap. We propose an estimation method that recovers the main effects and the interactions simultaneously. We show that our method is near optimal under conditions which are met in our targeted applications. We also propose an optimization algorithm which provably converges to an optimal solution. Numerical experiments reveal that our method, mimi, performs well when the main effects are sparse and the interaction matrix is low-rank. We also show that mimi compares favorably to existing methods, in particular when the main effects are significantly large compared to the interactions, and when the proportion of missing entries is large. The method is available as an R package on the Comprehensive R Archive Network. Supplementary materials for this article are available online.
CA State Lands Commission Leases
data.cnra.ca.gov
data.ca.gov
+7 more
Updated Jun 12, 2025
Cite
California State Lands Commission (2025). CA State Lands Commission Leases [Dataset]. https://data.cnra.ca.gov/dataset/ca-state-lands-commission-leases
Available download formats: arcgis geoservices rest api, csv, kml, zip, xlsx, txt, gdb, gpkg, geojson, html
Dataset updated
Jun 12, 2025
Dataset authored and provided by
California State Lands Commission (https://www.slc.ca.gov/)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
California
Description
The California State Lands Commission (CSLC) may issue leases or permits on state lands under its jurisdiction. Additional information regarding CSLC leasing can be found at: https://www.slc.ca.gov/leases-permits/. This is a point feature dataset indicating the approximate locations, often represented by a center point, of the general lease area of lands leased by the CSLC on state sovereign lands and school lands, including coastal marine areas, bays, rivers, and lakes. This dataset is to be considered incomplete as new leases are being entered into the CSLC database, and existing leases are modified or terminated, on an ongoing basis. Many marine areas, bays, rivers, and lakes are not under the leasing jurisdiction of the CSLC because they have been legislatively granted in trust to other government entities. Additionally, some leases are not shown at all for a variety of reasons. Many point features in this dataset provide links to maps and/or land descriptions used in the CSLC lease approval process. These documents are hosted on the CSLC archives website at https://www.slc.ca.gov/archives/. In some cases, these documents provide reliable, current lease boundary information, while in other cases, additional information is necessary to properly define lease boundaries. Additionally, revisions to lease boundaries may have occurred subsequent to CSLC approval, as in the case of as-built locations that differ from originally approved alignments. Further, the boundary of some leases may be the mean high tide line, which in a state of nature is both ambulatory, and in the absence of a survey conducted by a licensed land surveyor, not readily locatable.

Disclaimer of Liability
This dataset is not suitable for any legal purpose. The CSLC makes no warranty of any kind, express or implied, nor assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any of the information contained in the dataset. The CSLC assumes no legal liability or responsibility for anyone's use of the information.
