100+ datasets found
  1. f

    Orange dataset table

    • figshare.com
    xlsx
    Updated Mar 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Mar 4, 2022
    Dataset provided by
    figshare
    Authors
    Rui Simões
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, it does not contain any missing values and data was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

    Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.

  2. practical-statistics-for-data-scientists

    • kaggle.com
    Updated Jan 24, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    almaas izdihar (2024). practical-statistics-for-data-scientists [Dataset]. https://www.kaggle.com/datasets/almaasizdihar/practical-statistics-for-data-scientists/code
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 24, 2024
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    almaas izdihar
    Description

    Dataset

    This dataset was created by almaas izdihar

    Contents

  3. LinkedIn Datasets

    • brightdata.com
    .json, .csv, .xlsx
    Updated Dec 17, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Bright Data (2021). LinkedIn Datasets [Dataset]. https://brightdata.com/products/datasets/linkedin
    Explore at:
    .json, .csv, .xlsxAvailable download formats
    Dataset updated
    Dec 17, 2021
    Dataset authored and provided by
    Bright Datahttps://brightdata.com/
    License

    https://brightdata.com/licensehttps://brightdata.com/license

    Area covered
    Worldwide
    Description

    Unlock the full potential of LinkedIn data with our extensive dataset that combines profiles, company information, and job listings into one powerful resource for business decision-making, strategic hiring, competitive analysis, and market trend insights. This all-encompassing dataset is ideal for professionals, recruiters, analysts, and marketers aiming to enhance their strategies and operations across various business functions. Dataset Features

    Profiles: Dive into detailed public profiles featuring names, titles, positions, experience, education, skills, and more. Utilize this data for talent sourcing, lead generation, and investment signaling, with a refresh rate ensuring up to 30 million records per month. Companies: Access comprehensive company data including ID, country, industry, size, number of followers, website details, subsidiaries, and posts. Tailored subsets by industry or region provide invaluable insights for CRM enrichment, competitive intelligence, and understanding the startup ecosystem, updated monthly with up to 40 million records. Job Listings: Explore current job opportunities detailed with job titles, company names, locations, and employment specifics such as seniority levels and employment functions. This dataset includes direct application links and real-time application numbers, serving as a crucial tool for job seekers and analysts looking to understand industry trends and the job market dynamics.

    Customizable Subsets for Specific Needs Our LinkedIn dataset offers the flexibility to tailor the dataset according to your specific business requirements. Whether you need comprehensive insights across all data points or are focused on specific segments like job listings, company profiles, or individual professional details, we can customize the dataset to match your needs. This modular approach ensures that you get only the data that is most relevant to your objectives, maximizing efficiency and relevance in your strategic applications. Popular Use Cases

    Strategic Hiring and Recruiting: Track talent movement, identify growth opportunities, and enhance your recruiting efforts with targeted data. Market Analysis and Competitive Intelligence: Gain a competitive edge by analyzing company growth, industry trends, and strategic opportunities. Lead Generation and CRM Enrichment: Enrich your database with up-to-date company and professional data for targeted marketing and sales strategies. Job Market Insights and Trends: Leverage detailed job listings for a nuanced understanding of employment trends and opportunities, facilitating effective job matching and market analysis. AI-Driven Predictive Analytics: Utilize AI algorithms to analyze large datasets for predicting industry shifts, optimizing business operations, and enhancing decision-making processes based on actionable data insights.

    Whether you are mapping out competitive landscapes, sourcing new talent, or analyzing job market trends, our LinkedIn dataset provides the tools you need to succeed. Customize your access to fit specific needs, ensuring that you have the most relevant and timely data at your fingertips.

  4. w

    Dataset of books about Commercial statistics

    • workwithdata.com
    Updated Apr 17, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Work With Data (2025). Dataset of books about Commercial statistics [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=j0-book_subject&fop0=%3D&fval0=Commercial+statistics&j=1&j0=book_subjects
    Explore at:
    Dataset updated
    Apr 17, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about books. It has 226 rows and is filtered where the book subjects is Commercial statistics. It features 9 columns including author, publication date, language, and book publisher.

  5. student-performance-data

    • kaggle.com
    Updated Jun 14, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Muhammad Azam (2025). student-performance-data [Dataset]. http://doi.org/10.34740/kaggle/dsv/12160820
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 14, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    Muhammad Azam
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Student Performance Data

    This dataset provides insights into various factors influencing the academic performance of students. It is curated for use in educational research, data analytics projects, and predictive modeling. The data reflects a combination of personal, familial, and academic-related variables gathered through observation or survey.

    The dataset includes a diverse range of students and captures key characteristics such as study habits, family background, school attendance, and overall performance. It is well-suited for exploring correlations, visualizing trends, and training machine learning models related to academic outcomes.

    Highlights:

    Clean, structured format suitable for immediate use Designed for beginner to intermediate-level data analysis Valuable for classification, regression, and data storytelling projects

    File Format:

    Type: CSV (Comma-Separated Values) Encoding: UTF-8 Structure: Each row represents a student record

    Applications

    Student performance prediction Educational policy planning Identification of performance gaps and influencing factors Exploratory data analysis and visualization

  6. D

    UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View...

    • dataverse.no
    • search.dataone.org
    txt, zip
    Updated Apr 10, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Anurag Dalal; Anurag Dalal (2025). UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View Synthesis [Dataset]. http://doi.org/10.18710/WSU7I6
    Explore at:
    txt(7447), zip(960339536)Available download formats
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    DataverseNO
    Authors
    Anurag Dalal; Anurag Dalal
    License

    https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/WSU7I6https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/WSU7I6

    Description

    The dataset comprises three dynamic scenes characterized by both simple and complex lighting conditions. The quantity of cameras ranges from 4 to 512, including 4, 6, 8, 10, 12, 14, 16, 32, 64, 128, 256, and 512. The point clouds are randomly generated.

  7. s

    Statistics Finland Service Interface (WFS) Dataset Collection 2024 -...

    • store.smartdatahub.io
    Updated Nov 11, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). Statistics Finland Service Interface (WFS) Dataset Collection 2024 - Datasets - This service has been deprecated - please visit https://www.smartdatahub.io/ to access data. See the About page for details. // [Dataset]. https://store.smartdatahub.io/dataset/fi_tilastokeskus_tilastointialueet_avi4500k_2024
    Explore at:
    Dataset updated
    Nov 11, 2024
    Area covered
    Finland
    Description

    This collection of datasets originates from the Statistics Center's service interface, known as Tilastokeskus (Statistics Finland), in Finland. The collection is composed of related data tables, with each table presenting a variety of related data in a structured format of columns and rows. The data in this collection is highly detailed and organized, providing a valuable resource for those seeking to understand specific statistical areas. The datasets in this collection are current as of 2024. This dataset is licensed under CC BY 4.0 (Creative Commons Attribution 4.0, https://creativecommons.org/licenses/by/4.0/deed.fi).

  8. N

    New Point, IN Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2025). New Point, IN Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e1f49e2b-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    json, csvAvailable download formats
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    New Point, IN
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of New Point by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for New Point. The dataset can be utilized to understand the population distribution of New Point by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in New Point. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for New Point.

    Key observations

    Largest age group (population): Male # 60-64 years (26) | Female # 40-44 years (16). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the New Point population analysis. Total expected values are 18 and are define above in the age groups section.
    • Population (Male): The male population in the New Point is shown in the following column.
    • Population (Female): The female population in the New Point is shown in the following column.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in New Point for each age group.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for New Point Population by Gender. You can refer the same here

  9. O

    BUTTER - Empirical Deep Learning Dataset

    • data.openei.org
    • datasets.ai
    • +2more
    code, data, website
    Updated May 20, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Charles Tripp; Jordan Perr-Sauer; Lucas Hayne; Monte Lunacek; Charles Tripp; Jordan Perr-Sauer; Lucas Hayne; Monte Lunacek (2022). BUTTER - Empirical Deep Learning Dataset [Dataset]. http://doi.org/10.25984/1872441
    Explore at:
    code, website, dataAvailable download formats
    Dataset updated
    May 20, 2022
    Dataset provided by
    Open Energy Data Initiative (OEDI)
    National Renewable Energy Laboratory
    USDOE Office of Energy Efficiency and Renewable Energy (EERE), Multiple Programs (EE)
    Authors
    Charles Tripp; Jordan Perr-Sauer; Lucas Hayne; Monte Lunacek; Charles Tripp; Jordan Perr-Sauer; Lucas Hayne; Monte Lunacek
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The BUTTER Empirical Deep Learning Dataset represents an empirical study of the deep learning phenomena on dense fully connected networks, scanning across thirteen datasets, eight network shapes, fourteen depths, twenty-three network sizes (number of trainable parameters), four learning rates, six minibatch sizes, four levels of label noise, and fourteen levels of L1 and L2 regularization each. Multiple repetitions (typically 30, sometimes 10) of each combination of hyperparameters were preformed, and statistics including training and test loss (using a 80% / 20% shuffled train-test split) are recorded at the end of each training epoch. In total, this dataset covers 178 thousand distinct hyperparameter settings ("experiments"), 3.55 million individual training runs (an average of 20 repetitions of each experiments), and a total of 13.3 billion training epochs (three thousand epochs were covered by most runs). Accumulating this dataset consumed 5,448.4 CPU core-years, 17.8 GPU-years, and 111.2 node-years.

  10. N

    Vici, OK Population Breakdown by Gender and Age Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2025). Vici, OK Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e207322c-f25d-11ef-8c1b-3860777c1fe6/
    Explore at:
    csv, jsonAvailable download formats
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Oklahoma, Vici
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Vici by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Vici. The dataset can be utilized to understand the population distribution of Vici by gender and age. For example, using this dataset, we can identify the largest age group for both Men and Women in Vici. Additionally, it can be used to see how the gender ratio changes from birth to senior most age group and male to female ratio across each age group for Vici.

    Key observations

    Largest age group (population): Male # 0-4 years (45) | Female # 25-29 years (38). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender :

    Please note that American Community Survey asks a question about the respondents current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are supposed to respond with the answer as either of Male or Female. Our research and this dataset mirrors the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the Vici population analysis. Total expected values are 18 and are define above in the age groups section.
    • Population (Male): The male population in the Vici is shown in the following column.
    • Population (Female): The female population in the Vici is shown in the following column.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in Vici for each age group.

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Vici Population by Gender. You can refer the same here

  11. N

    Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Missouri...

    • neilsberg.com
    Updated Aug 7, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Neilsberg Research (2024). Comprehensive Income by Age Group Dataset: Longitudinal Analysis of Missouri Household Incomes Across 4 Age Groups and 16 Income Brackets. Annual Editions Collection // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/2ee229b6-aeee-11ee-aaca-3860777c1fe6/
    Explore at:
    Dataset updated
    Aug 7, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Missouri
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Missouri household income by age. The dataset can be utilized to understand the age-based income distribution of Missouri income.

    Content

    The dataset will have the following datasets when applicable

    Please note: The 2020 1-Year ACS estimates data was not reported by the Census Bureau due to the impact on survey collection and analysis caused by COVID-19. Consequently, median household income data for 2020 is unavailable for large cities (population 65,000 and above).

    • Missouri annual median income by age groups dataset (in 2022 inflation-adjusted dollars)
    • Age-wise distribution of Missouri household incomes: Comparative analysis across 16 income brackets

    Good to know

    Margin of Error

    Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.

    Custom data

    If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.

    Interested in deeper insights and visual analysis?

    Explore our comprehensive data analysis and visual representations for a deeper understanding of Missouri income distribution by age. You can refer the same here

  12. s

    Repository URL

    • cinergi.sdsc.edu
    • datadiscoverystudio.org
    resource url
    Updated 1997
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (1997). Repository URL [Dataset]. http://cinergi.sdsc.edu/geoportal/rest/metadata/item/f819b09d47f4497ca3f71a0189db0fe1/html
    Explore at:
    resource urlAvailable download formats
    Dataset updated
    1997
    Area covered
    Description

    Link Function: information

  13. Hydrographic and Impairment Statistics Database: AMME

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated Jun 4, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Park Service (2024). Hydrographic and Impairment Statistics Database: AMME [Dataset]. https://catalog.data.gov/dataset/hydrographic-and-impairment-statistics-database-amme
    Explore at:
    Dataset updated
    Jun 4, 2024
    Dataset provided by
    National Park Servicehttp://www.nps.gov/
    Description

    Hydrographic and Impairment Statistics (HIS) is a National Park Service (NPS) Water Resources Division (WRD) project established to track certain goals created in response to the Government Performance and Results Act of 1993 (GPRA). One water resources management goal established by the Department of the Interior under GRPA requires NPS to track the percent of its managed surface waters that are meeting Clean Water Act (CWA) water quality standards. This goal requires an accurate inventory that spatially quantifies the surface water hydrography that each bureau manages and a procedure to determine and track which waterbodies are or are not meeting water quality standards as outlined by Section 303(d) of the CWA. This project helps meet this DOI GRPA goal by inventorying and monitoring in a geographic information system for the NPS: (1) CWA 303(d) quality impaired waters and causes; and (2) hydrographic statistics based on the United States Geological Survey (USGS) National Hydrography Dataset (NHD). Hydrographic and 303(d) impairment statistics were evaluated based on a combination of 1:24,000 (NHD) and finer scale data (frequently provided by state GIS layers).

  14. Retail Credit Bank Data

    • kaggle.com
    Updated Sep 10, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SR (2021). Retail Credit Bank Data [Dataset]. https://www.kaggle.com/datasets/surekharamireddy/credit-data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 10, 2021
    Dataset provided by
    Kaggle
    Authors
    SR
    Description

    Context

    A retail bank would like to hire you to build a credit default model for their credit card portfolio. The bank expects the model to identify the consumers who are likely to default on their credit card payments over the next 12 months. This model will be used to reduce the bank’s future losses. The bank is willing to provide you with some sample datathat they can currently extract from their systems. This data set (credit_data.csv) consists of 13,444 observations with 14 variables.

    Content

    Based on the bank’s experience, the number of derogatory reports is a strong indicator of default. This is all that the information you are able to get from the bank at the moment. Currently, they do not have the expertise to provide any clarification on this data and are also unsure about other variables captured by their systems

  15. d

    Crash Data

    • catalog.data.gov
    • data.townofcary.org
    Updated Jul 12, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Cary (2025). Crash Data [Dataset]. https://catalog.data.gov/dataset/crash-data
    Explore at:
    Dataset updated
    Jul 12, 2025
    Dataset provided by
    Cary
    Description

    This dataset contains crash information from the last five years to the current date. The data is based on the National Incident Based Reporting System (NIBRS). The data is dynamic, allowing for additions, deletions and modifications at any time, resulting in more accurate information in the database. Due to ongoing and continuous data entry, the numbers of records in subsequent extractions are subject to change.About Crash DataThe Cary Police Department strives to make crash data as accurate as possible, but there is no avoiding the introduction of errors into this process, which relies on data furnished by many people and that cannot always be verified. As the data is updated on this site there will be instances of adding new incidents and updating existing data with information gathered through the investigative process.Not surprisingly, crash data becomes more accurate over time, as new crashes are reported and more information comes to light during investigations.This dynamic nature of crash data means that content provided here today will probably differ from content provided a week from now. Likewise, content provided on this site will probably differ somewhat from crime statistics published elsewhere by the Town of Cary, even though they draw from the same database.About Crash LocationsCrash locations reflect the approximate locations of the crash. Certain crashes may not appear on maps if there is insufficient detail to establish a specific, mappable location.

  16. r

    RI- Demographic Data

    • redivis.com
    Updated Dec 19, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Columbia Population Research Center (2023). RI- Demographic Data [Dataset]. https://redivis.com/datasets/fh74-90v3ge9m2
    Explore at:
    Dataset updated
    Dec 19, 2023
    Dataset authored and provided by
    Columbia Population Research Center
    Description

    The table RI- Demographic Data is part of the dataset Demographic Data, available at https://columbia.redivis.com/datasets/fh74-90v3ge9m2. It contains 734919 rows across 699 variables.

  17. r

    HI- Demographic Data

    • redivis.com
    Updated Dec 19, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Columbia Population Research Center (2023). HI- Demographic Data [Dataset]. https://redivis.com/datasets/fh74-90v3ge9m2
    Explore at:
    Dataset updated
    Dec 19, 2023
    Dataset authored and provided by
    Columbia Population Research Center
    Description

    The table HI- Demographic Data is part of the dataset Demographic Data, available at https://columbia.redivis.com/datasets/fh74-90v3ge9m2. It contains 767560 rows across 699 variables.

  18. Hydrographic and Impairment Statistics Database: THRB

    • catalog.data.gov
    • datasets.ai
    Updated Jun 5, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Park Service (2024). Hydrographic and Impairment Statistics Database: THRB [Dataset]. https://catalog.data.gov/dataset/hydrographic-and-impairment-statistics-database-thrb
    Explore at:
    Dataset updated
    Jun 5, 2024
    Dataset provided by
    National Park Servicehttp://www.nps.gov/
    Description

    Hydrographic and Impairment Statistics (HIS) is a National Park Service (NPS) Water Resources Division (WRD) project established to track certain goals created in response to the Government Performance and Results Act of 1993 (GPRA). One water resources management goal established by the Department of the Interior under GRPA requires NPS to track the percent of its managed surface waters that are meeting Clean Water Act (CWA) water quality standards. This goal requires an accurate inventory that spatially quantifies the surface water hydrography that each bureau manages and a procedure to determine and track which waterbodies are or are not meeting water quality standards as outlined by Section 303(d) of the CWA. This project helps meet this DOI GRPA goal by inventorying and monitoring in a geographic information system for the NPS: (1) CWA 303(d) quality impaired waters and causes; and (2) hydrographic statistics based on the United States Geological Survey (USGS) National Hydrography Dataset (NHD). Hydrographic and 303(d) impairment statistics were evaluated based on a combination of 1:24,000 (NHD) and finer scale data (frequently provided by state GIS layers).

  19. g

    Biological oceanography datasets from Centre for Estuarine and Marine...

    • gimi9.com
    Updated May 3, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Biological oceanography datasets from Centre for Estuarine and Marine Ecology (NIOO-CEME) | gimi9.com [Dataset]. https://gimi9.com/dataset/nl_15918-biological-oceanography-datasets-from-centre-for-estuarine-and-marine-ecology--nioo-ceme-/
    Explore at:
    Dataset updated
    May 3, 2025
    Description

    The National Oceanographic Data Committee (NODC) of the Netherlands is the national platform for exchange of oceanographic and marine data and information, and for advisory services in the field of ocean and marine data management. The overall objective of the NODC is to effect a major and significant improvement in the overview and access to marine and oceanographic data and data-products from government and research institutes in the Netherlands. This is not done alone and only with a national focus, but on a European scale as an active partner in the Pan-European SeaDataNet project, complying to the INSPIRE and the new Marine Strategy EU Directives, and on a global scale as the Netherlands representative in major international organisations in this field, ICES and IOC-IODE. A major step has been made with the launch of the NODCi - National Infrastructure for access to Oceanographic and Marine Data and Information. This was developed in the framework of the Ruimte voor Geo-Informatie (RGI) programme as RGI-014 project. It includes a new NODC-i portal (www.nodc.nl), that provides users with a range of metadata services and a unique interface to the data management systems of each of the NODC members. By this Common Data Index (CDI) interface, users can get harmonised access to the datasets, that are managed in a distributed way at each of the NODC members. The NODCi portal functions as the Dutch node in the SeaDataNet infrastructure. The NODC CDI service contains several thousands of references to individual marine and oceanographic datasets. For inclusion in the National Geo Register these have been aggregated by combinations of Data Holding Centres - Disciplines. Each NGR - NODC record therefore represents a large number of individual metadata records and associated datasets. By following the specified URL to the NODCi portal, users can consider these metadata in detail and can achieve downloading of interesting datasets via the shopping cart transaction system, that is integrated in the NODCi portal.

  20. u

    Amazon review data 2018

    • cseweb.ucsd.edu
    • nijianmo.github.io
    • +1more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UCSD CSE Research Project, Amazon review data 2018 [Dataset]. https://cseweb.ucsd.edu/~jmcauley/datasets/amazon_v2/
    Explore at:
    Dataset authored and provided by
    UCSD CSE Research Project
    Description

    Context

    This Dataset is an updated version of the Amazon review dataset released in 2014. As in the previous version, this dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand, and image features), and links (also viewed/also bought graphs). In addition, this version provides the following features:

    • More reviews:

      • The total number of reviews is 233.1 million (142.8 million in 2014).
    • New reviews:

      • Current data includes reviews in the range May 1996 - Oct 2018.
    • Metadata: - We have added transaction metadata for each review shown on the review page.

      • Added more detailed metadata of the product landing page.

    Acknowledgements

    If you publish articles based on this dataset, please cite the following paper:

    • Jianmo Ni, Jiacheng Li, Julian McAuley. Justifying recommendations using distantly-labeled reviews and fined-grained aspects. EMNLP, 2019.
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Rui Simões (2022). Orange dataset table [Dataset]. http://doi.org/10.6084/m9.figshare.19146410.v1

Orange dataset table

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
xlsxAvailable download formats
Dataset updated
Mar 4, 2022
Dataset provided by
figshare
Authors
Rui Simões
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, Mitotracker red CMXRos area and intensity (3 h and 24 h incubations with both compounds), Mitosox oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, it does not contain any missing values and data was standardized across features. The small number of samples prevented a full and strong statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.

Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) with instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using the Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments were performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure and area under the ROC curve (AUC) metrics.

Search
Clear search
Close search
Google apps
Main menu