8 datasets found
  1. UCI datasets

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 4, 2023
    Cite
    Drton, Mathias (2023). UCI datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7681647
    Dataset updated
    Apr 4, 2023
    Dataset provided by
    Zadorozhnyi, Oleksandr
    Drton, Mathias
    Haug, Stephan
    Reifferscheidt, David
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Collection of two datasets from the UCI website that could be used for structure learning tasks. Includes datasets regarding

    Air Quality

    US census 1990

    Size: Two datasets of sizes 9471*17 and 2458285*68, respectively

    Number of features: 15-68

    Ground truth: No

    Type of Graph: No ground truth

    More information about the datasets is contained in the dataset_description.html files.

  2. UCI Communities and Crime Unnormalized Data Set

    • kaggle.com
    Updated Feb 21, 2018
    Cite
    Kavitha (2018). UCI Communities and Crime Unnormalized Data Set [Dataset]. https://www.kaggle.com/kkanda/communities%20and%20crime%20unnormalized%20data%20set/notebooks
    Dataset updated
    Feb 21, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kavitha
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    Introduction: The dataset used for this experiment is real and authentic. It was acquired from the UCI Machine Learning Repository website [13]. The title of the dataset is 'Crime and Communities'. It combines socio-economic data from the 1990 US Census, law enforcement data from the 1990 US LEMAS survey, and crime data from the 1995 FBI UCR [13]. The dataset contains 147 attributes and 2216 instances.

    The per capita crimes variables were calculated using population values included in the 1995 FBI data (which differ from the 1990 Census values).

    Content

    The variables included in the dataset describe the community, such as the percent of the population considered urban and the median family income, and law enforcement, such as the per capita number of police officers and the percent of officers assigned to drug units. The crime attributes (N=18) that could be predicted are the 8 crimes considered 'Index Crimes' by the FBI (Murders, Rape, Robbery, ....), per capita (actually per 100,000 population) versions of each, and Per Capita Violent Crimes and Per Capita Nonviolent Crimes.

    predictive variables: 125
    non-predictive variables: 4
    potential goal/response variables: 18

    Acknowledgements

    http://archive.ics.uci.edu/ml/datasets/Communities%20and%20Crime%20Unnormalized

    U. S. Department of Commerce, Bureau of the Census, Census Of Population And Housing 1990 United States: Summary Tape File 1a & 3a (Computer Files),

    U.S. Department Of Commerce, Bureau Of The Census Producer, Washington, DC and Inter-university Consortium for Political and Social Research Ann Arbor, Michigan. (1992)

    U.S. Department of Justice, Bureau of Justice Statistics, Law Enforcement Management And Administrative Statistics (Computer File) U.S. Department Of Commerce, Bureau Of The Census Producer, Washington, DC and Inter-university Consortium for Political and Social Research Ann Arbor, Michigan. (1992)

    U.S. Department of Justice, Federal Bureau of Investigation, Crime in the United States (Computer File) (1995)

    Inspiration


    Data available in the dataset may not act as a complete source of information for identifying factors that contribute to more violent and non-violent crimes as many relevant factors may still be missing.

    However, I would like to try to answer the following questions.

    1. Did the number of vacant and occupied houses, and the length of time houses stood vacant, contribute to any significant change in violent and non-violent crime rates in communities?

    2. How has unemployment changed the crime rate (violent and non-violent) in the communities?

    3. Were people from a particular age group more vulnerable to crime?

    4. Does ethnicity play a role in crime rate?

    5. Has education played a role in bringing down the crime rate?

  3. Online Shoppers Intention

    • kaggle.com
    Updated Aug 20, 2024
    Cite
    Julio Ortega (2024). Online Shoppers Intention [Dataset]. https://www.kaggle.com/datasets/julioortegagimenez/online-shoppers-intention/code
    Dataset updated
    Aug 20, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Julio Ortega
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The data was obtained from the following website: https://archive.ics.uci.edu/ml/datasets/Online+Shoppers+Purchasing+Intention+Dataset

    Sakar, C. and Kastro, Yomi. (2018). Online Shoppers Purchasing Intention Dataset. UCI Machine Learning Repository. https://doi.org/10.24432/C5F88Q.

  4. UCI E3SM1.0 model output prepared for CMIP6 PAMIP pdSST-pdSICSIT

    • wdc-climate.de
    Updated 2021
    Cite
    University of California Irvine (UCI) (2021). UCI E3SM1.0 model output prepared for CMIP6 PAMIP pdSST-pdSICSIT [Dataset]. http://doi.org/10.22033/ESGF/CMIP6.16550
    Dataset updated
    2021
    Dataset provided by
    Earth System Grid
    World Data Center for Climate (WDCC) at DKRZ
    Authors
    University of California Irvine (UCI)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Coupled Model Intercomparison Project Phase 6 (CMIP6) datasets. These data include all datasets published for 'CMIP6.PAMIP.UCI.E3SM-1-0.pdSST-pdSICSIT' with the full Data Reference Syntax following the template 'mip_era.activity_id.institution_id.source_id.experiment_id.member_id.table_id.variable_id.grid_label.version'.
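
    The Data Reference Syntax template quoted above can be read as an ordered list of dot-separated fields. The following is a minimal sketch, not an official CMIP6 tool: the helper name `parse_drs` is hypothetical, and it assumes that no individual component contains a dot. It maps the identifier from this description onto the leading fields of the template.

```python
# Field names are taken verbatim from the template quoted in the description.
DRS_TEMPLATE = (
    "mip_era.activity_id.institution_id.source_id.experiment_id."
    "member_id.table_id.variable_id.grid_label.version"
)


def parse_drs(dataset_id: str) -> dict:
    """Map each dot-separated component of an identifier to its template field.

    Hypothetical helper for illustration. Identifiers shorter than the
    template (such as the experiment-level one quoted above) fill only
    the leading fields.
    """
    fields = DRS_TEMPLATE.split(".")
    parts = dataset_id.split(".")
    if len(parts) > len(fields):
        raise ValueError("identifier has more components than the template")
    return dict(zip(fields, parts))


parsed = parse_drs("CMIP6.PAMIP.UCI.E3SM-1-0.pdSST-pdSICSIT")
for field, value in parsed.items():
    print(f"{field}: {value}")
```

    Only the first five fields are populated here, because the quoted identifier stops at the experiment level; member, table, variable, grid, and version belong to the individual datasets grouped under it.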

    The E3SM 1.0 (Energy Exascale Earth System Model) climate model, released in 2018, includes the following components: aerosol: MAM4 with resuspension, marine organics, and secondary organics (same grid as atmos), atmos: EAM (v1.0, cubed sphere spectral-element grid; 5400 elements with p=3; 1 deg average grid spacing; 90 x 90 x 6 longitude/latitude/cubeface; 72 levels; top level 0.1 hPa), atmosChem: Troposphere specified oxidants for aerosols. Stratosphere linearized interactive ozone (LINOZ v2) (same grid as atmos), land: ELM (v1.0, cubed sphere spectral-element grid; 5400 elements with p=3; 1 deg average grid spacing; 90 x 90 x 6 longitude/latitude/cubeface; satellite phenology mode), MOSART (v1.0, 0.5 degree latitude/longitude grid), ocean: MPAS-Ocean (v6.0, oEC60to30 unstructured SVTs mesh with 235160 cells and 714274 edges, variable resolution 60 km to 30 km; 60 levels; top grid cell 0-10 m), seaIce: MPAS-Seaice (v6.0, same grid as ocean). The model was run by the Department of Earth System Science, University of California Irvine, Irvine, CA 92697, USA (UCI) in native nominal resolutions: aerosol: 100 km, atmos: 100 km, atmosChem: 100 km, land: 100 km, ocean: 50 km, seaIce: 50 km.

    Project: These data have been generated as part of the internationally-coordinated Coupled Model Intercomparison Project Phase 6 (CMIP6; see also GMD Special Issue: http://www.geosci-model-dev.net/special_issue590.html). The simulation data provides a basis for climate research designed to answer fundamental science questions and serves as a resource for authors of the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC-AR6).

    CMIP6 is a project coordinated by the Working Group on Coupled Modelling (WGCM) as part of the World Climate Research Programme (WCRP). Phase 6 builds on previous phases executed under the leadership of the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and relies on the Earth System Grid Federation (ESGF) and the Centre for Environmental Data Analysis (CEDA) along with numerous related activities for implementation. The original data is hosted and partially replicated on a federated collection of data nodes, and most of the data relied on by the IPCC is being archived for long-term preservation at the IPCC Data Distribution Centre (IPCC DDC) hosted by the German Climate Computing Center (DKRZ).

    The project includes simulations from about 120 global climate models and around 45 institutions and organizations worldwide. Project website: https://pcmdi.llnl.gov/CMIP6.

  5. kddcup.data.gz

    • figshare.com
    application/gzip
    Updated Jun 23, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Moreau (2023). kddcup.data.gz [Dataset]. http://doi.org/10.6084/m9.figshare.23566914.v1
    Available download formats: application/gzip
    Dataset updated
    Jun 23, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Moreau
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    Data for joblib example on compression, to make sure we can always serve it. Please don't use this data but refer to the original website: http://kdd.ics.uci.edu/databases/kddcup99/task.html

  6. UCI CESM1-WACCM-SC model output prepared for CMIP6 PAMIP futSST-pdSIC

    • wdc-climate.de
    Updated 2020
    Cite
    Peings, Yannick (2020). UCI CESM1-WACCM-SC model output prepared for CMIP6 PAMIP futSST-pdSIC [Dataset]. http://doi.org/10.22033/ESGF/CMIP6.12292
    Dataset updated
    2020
    Dataset provided by
    Earth System Grid
    World Data Center for Climate (WDCC) at DKRZ
    Authors
    Peings, Yannick
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Coupled Model Intercomparison Project Phase 6 (CMIP6) datasets. These data include all datasets published for 'CMIP6.PAMIP.NCAR.CESM1-WACCM-SC.futSST-pdSIC' with the full Data Reference Syntax following the template 'mip_era.activity_id.institution_id.source_id.experiment_id.member_id.table_id.variable_id.grid_label.version'.

    The Community Earth System Model 1, with the Whole Atmosphere Community Climate Model and Specified Chemistry climate model, released in 2011, includes the following components: aerosol: MOZART-specified (same grid as atmos), atmos: WACCM4 (1.9x2.5 finite volume grid; 144 x 96 longitude/latitude; 66 levels; top level 5.9e-06 mb), atmosChem: MOZART-specified (same grid as atmos), land: CLM4.0, ocean: POP2 (320 x 384 longitude/latitude; 60 levels; top grid cell 0-10 m), ocnBgchem: BEC (same grid as ocean), seaIce: CICE4 (same grid as ocean). The model was run by the Department of Earth System Science, University of California Irvine, Irvine, CA 92697, USA (UCI) in native nominal resolutions: aerosol: 250 km, atmos: 250 km, atmosChem: 250 km, land: 250 km, ocean: 100 km, ocnBgchem: 100 km, seaIce: 100 km.

    Project: These data have been generated as part of the internationally-coordinated Coupled Model Intercomparison Project Phase 6 (CMIP6; see also GMD Special Issue: http://www.geosci-model-dev.net/special_issue590.html). The simulation data provides a basis for climate research designed to answer fundamental science questions and serves as a resource for authors of the Sixth Assessment Report of the Intergovernmental Panel on Climate Change (IPCC-AR6).

    CMIP6 is a project coordinated by the Working Group on Coupled Modelling (WGCM) as part of the World Climate Research Programme (WCRP). Phase 6 builds on previous phases executed under the leadership of the Program for Climate Model Diagnosis and Intercomparison (PCMDI) and relies on the Earth System Grid Federation (ESGF) and the Centre for Environmental Data Analysis (CEDA) along with numerous related activities for implementation. The original data is hosted and partially replicated on a federated collection of data nodes, and most of the data relied on by the IPCC is being archived for long-term preservation at the IPCC Data Distribution Centre (IPCC DDC) hosted by the German Climate Computing Center (DKRZ).

    The project includes simulations from about 120 global climate models and around 45 institutions and organizations worldwide. Project website: https://pcmdi.llnl.gov/CMIP6.

  7. Mouse Protein Expression

    • kaggle.com
    Updated Feb 19, 2022
    Cite
    Andre Ye (2022). Mouse Protein Expression [Dataset]. https://www.kaggle.com/datasets/washingtongold/mpempe
    Dataset updated
    Feb 19, 2022
    Dataset provided by
    Kaggle
    Authors
    Andre Ye
    Description

    From UCI Dataset Repository. From the website:

    Source: Clara Higuera Department of Software Engineering and Artificial Intelligence, Faculty of Informatics and the Department of Biochemistry and Molecular Biology, Faculty of Chemistry, University Complutense, Madrid, Spain. Email: clarahiguera '@' ucm.es

    Katheleen J. Gardiner, creator and owner of the protein expression data, is currently with the Linda Crnic Institute for Down Syndrome, Department of Pediatrics, Department of Biochemistry and Molecular Genetics, Human Medical Genetics and Genomics, and Neuroscience Programs, University of Colorado, School of Medicine, Aurora, Colorado, USA. Email: katheleen.gardiner '@' ucdenver.edu

    Krzysztof J. Cios is currently with the Department of Computer Science, Virginia Commonwealth University, Richmond, Virginia, USA, and IITiS Polish Academy of Sciences, Poland. Email: kcios '@' vcu.edu

    Data Set Information: The data set consists of the expression levels of 77 proteins/protein modifications that produced detectable signals in the nuclear fraction of cortex. There are 38 control mice and 34 trisomic mice (Down syndrome), for a total of 72 mice. In the experiments, 15 measurements were registered of each protein per sample/mouse. Therefore, for control mice, there are 38x15, or 570 measurements, and for trisomic mice, there are 34x15, or 510 measurements. The dataset contains a total of 1080 measurements per protein. Each measurement can be considered as an independent sample/mouse.

    The eight classes of mice are described based on features such as genotype, behavior and treatment. According to genotype, mice can be control or trisomic. According to behavior, some mice have been stimulated to learn (context-shock) and others have not (shock-context) and in order to assess the effect of the drug memantine in recovering the ability to learn in trisomic mice, some mice have been injected with the drug and others have not.

    Classes:
    c-CS-s: control mice, stimulated to learn, injected with saline (9 mice)
    c-CS-m: control mice, stimulated to learn, injected with memantine (10 mice)
    c-SC-s: control mice, not stimulated to learn, injected with saline (9 mice)
    c-SC-m: control mice, not stimulated to learn, injected with memantine (10 mice)
    t-CS-s: trisomy mice, stimulated to learn, injected with saline (7 mice)
    t-CS-m: trisomy mice, stimulated to learn, injected with memantine (9 mice)
    t-SC-s: trisomy mice, not stimulated to learn, injected with saline (9 mice)
    t-SC-m: trisomy mice, not stimulated to learn, injected with memantine (9 mice)
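
    The class sizes and measurement counts quoted above can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the variable names are assumptions, and the numbers are copied from the description rather than read from the data file.

```python
# Class sizes as listed in the description (number of mice per class).
class_sizes = {
    "c-CS-s": 9, "c-CS-m": 10, "c-SC-s": 9, "c-SC-m": 10,
    "t-CS-s": 7, "t-CS-m": 9, "t-SC-s": 9, "t-SC-m": 9,
}
MEASUREMENTS_PER_MOUSE = 15  # measurements registered per protein per mouse

# Class names starting with "c-" are control mice, "t-" are trisomic mice.
control_mice = sum(n for name, n in class_sizes.items() if name.startswith("c-"))
trisomic_mice = sum(n for name, n in class_sizes.items() if name.startswith("t-"))

control_measurements = control_mice * MEASUREMENTS_PER_MOUSE
trisomic_measurements = trisomic_mice * MEASUREMENTS_PER_MOUSE
total_measurements = control_measurements + trisomic_measurements

print(control_mice, trisomic_mice, total_measurements)
```

    The per-class counts sum to the 38 control and 34 trisomic mice stated in the text, and 15 measurements per mouse give the stated 570, 510, and 1080 measurements per protein.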

    The aim is to identify subsets of proteins that are discriminant between the classes.

    Attribute Information:

    1 Mouse ID
    2..78 Values of expression levels of 77 proteins; the names of proteins are followed by "_n" indicating that they were measured in the nuclear fraction. For example: DYRK1A_n
    79 Genotype: control (c) or trisomy (t)
    80 Treatment type: memantine (m) or saline (s)
    81 Behavior: context-shock (CS) or shock-context (SC)
    82 Class: c-CS-s, c-CS-m, c-SC-s, c-SC-m, t-CS-s, t-CS-m, t-SC-s, t-SC-m

    Relevant Papers:

    The posted data were analyzed by: Higuera C, Gardiner KJ, Cios KJ (2015) Self-Organizing Feature Maps Identify Proteins Critical to Learning in a Mouse Model of Down Syndrome. PLoS ONE 10(6): e0129126. [Web Link] journal.pone.0129126

    The data are a subset of the data analyzed by: Ahmed MM, Dhanasekaran AR, Block A, Tong S, Costa ACS, Stasko M, et al. (2015) Protein Dynamics Associated with Failed and Rescued Learning in the Ts65Dn Mouse Model of Down Syndrome. PLoS ONE 10(3): e0119491. [Web Link]

  8. Replication Data for: The Gender Readings Gap in Political Science Graduate...

    • search.dataone.org
    Updated Nov 22, 2023
    Cite
    Hardt, Heidi; Smith, Amy Erica; Kim, Hannah June; Meister, Philippe (2023). Replication Data for: The Gender Readings Gap in Political Science Graduate Training [Dataset]. http://doi.org/10.7910/DVN/UNWIHE
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Hardt, Heidi; Smith, Amy Erica; Kim, Hannah June; Meister, Philippe
    Description

    This is the replication Stata code and log file for the Journal of Politics research note, "The Gender Readings Gap in Political Science Graduate Training," by Heidi Hardt, Amy Erica Smith, Hannah June Kim and Philippe Meister. For our searchable database, see our website here: http://gradtraining.socsci.uci.edu/
