Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
biology
Collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. Datasets approved for the repository will be assigned a Digital Object Identifier (DOI) if they do not already possess one. Datasets will be licensed under a Creative Commons Attribution 4.0 International license (CC BY 4.0), which allows for the sharing and adaptation of the datasets for any purpose, provided that appropriate credit is given.
The UCI Machine Learning Repository is a collection of over 550 datasets.
https://academictorrents.com/nolicensespecified
The UCI Machine Learning Repository is a collection of databases, domain theories, and data generators that are used by the machine learning community for the empirical analysis of machine learning algorithms. The archive was created as an ftp archive in 1987 by David Aha and fellow graduate students at UC Irvine. Since that time, it has been widely used by students, educators, and researchers all over the world as a primary source of machine learning data sets. As an indication of the impact of the archive, it has been cited over 1000 times, making it one of the top 100 most cited "papers" in all of computer science. The current version of the web site was designed in 2007 by Arthur Asuncion and David Newman, and this project is in collaboration with Rexa.info at the University of Massachusetts Amherst. Funding support from the National Science Foundation is gratefully acknowledged. Many people deserve thanks for making the repository a success. Foremost among them are the d
This dataset was created by Ben Duong
Financial overview and grant giving statistics of University of California Irvine Foundation
This dataset provides atmospheric concentrations of halocarbons and hydrocarbons measured by the UC-Irvine Whole Air Sampler (WAS) during airborne campaigns conducted by NASA's Atmospheric Tomography (ATom) mission. The analysis of samples from the UCI WAS provides measurements of more than 50 trace gases, including C2-C10 NMHCs, C1-C2 halocarbons, C1-C5 alkyl nitrates, and selected sulfur compounds. Species were identified and measured using an established technique of airborne whole air sampling followed by laboratory analysis using gas chromatography (GC) with flame ionization detection (FID), and mass spectrometric detection (MSD). The ATom mission deployed an extensive gas and aerosol payload on the NASA DC-8 aircraft for systematic, global-scale sampling of the atmosphere, profiling continuously from 0.2 to 12 km altitude. Flights occurred in each of 4 seasons from 2016 to 2018.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Western U.S., especially southern California and Orange County; Baja California, Mexico.
https://whoisdatacenter.com/terms-of-use/
Explore the historical Whois records related to uc-irvine.com (Domain). Get insights into ownership history and changes over time.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Collection of two datasets from the UCI website that could be used for structure learning tasks. Includes datasets regarding
Size: Two datasets of sizes 9471 × 17 and 2458285 × 68, respectively
Number of features: 15-68
Ground truth: No
Type of Graph: No ground truth
More information about the datasets is contained in the dataset_description.html files.
Financial overview and grant giving statistics of Regents of the University of California at Irvine
University of California-Irvine/PERSIANN (Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks) system data.
This is the Occupancy Detection Data Set from UCI, as used in the article "how-to-predict-room-occupancy-based-on-environmental-factors". The file has the following columns:
"no","date","Temperature","Humidity","Light","CO2","HumidityRatio","Occupancy"
UC Irvine Machine Learning Repository
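As a minimal sketch of how a file with the header above could be read, the snippet below uses Python's standard `csv` module. Only the header line comes from the listing; the two data rows, the `load_occupancy` helper name, and the assumption that `Occupancy` is a 0/1 integer flag are illustrative and not taken from the dataset itself.

```python
import csv
import io

# Header copied from the listing; the two data rows are made-up placeholders.
SAMPLE = "\n".join([
    '"no","date","Temperature","Humidity","Light","CO2","HumidityRatio","Occupancy"',
    '"1","2000-01-01 12:00:00",21.5,30.0,400.0,600.0,0.0045,1',
    '"2","2000-01-01 12:01:00",21.0,29.5,0.0,580.0,0.0044,0',
]) + "\n"

# Columns assumed to hold floating-point sensor readings.
FLOAT_COLUMNS = ("Temperature", "Humidity", "Light", "CO2", "HumidityRatio")


def load_occupancy(text):
    """Parse CSV text into a list of dicts, converting numeric columns."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"no": int(record["no"]), "date": record["date"]}
        for name in FLOAT_COLUMNS:
            row[name] = float(record[name])
        # Assumption: Occupancy is an integer flag (0 = empty, 1 = occupied).
        row["Occupancy"] = int(record["Occupancy"])
        rows.append(row)
    return rows


rows = load_occupancy(SAMPLE)
# Fraction of observations in which the room was occupied.
occupied_share = sum(r["Occupancy"] for r in rows) / len(rows)
```

For a real file, the same function could be fed the contents of the file read with `open(path, newline="").read()`; the path would depend on where the dataset was downloaded.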
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GIS data collected as part of a campus-wide tree inventory, done in part to determine the scope and effect of Polyphagous Shot Hole Borer (PSHB) infestation.
This data set includes data from the CIRPAS Twin Otter aircraft during the "Physics of Stratocumulus Tops" (POST) project off the coast of Monterey, CA. This data set contains ASCII txt files of data from the UC Irvine 40 Hz probes.
https://choosealicense.com/licenses/unknown/
Dataset Card for [Dataset Name]
Dataset Summary
The SMS Spam Collection v.1 is a public set of labeled SMS messages that have been collected for mobile phone spam research. It has one collection composed of 5,574 real, non-encoded English messages, tagged as either legitimate (ham) or spam.
Supported Tasks and Leaderboards
[More Information Needed]
Languages
English
Dataset Structure
Data Instances
[More Information… See the full description on the dataset page: https://huggingface.co/datasets/ucirvine/sms_spam.
Author:
Source: Unknown - Date unknown
Please cite:
Internet Usage Data
Data Type
multivariate
Abstract
This data contains general demographic information on internet users
in 1997.
Sources
Original Owner
[1]Graphics, Visualization, & Usability Center College of Computing Georgia Institute of Technology Atlanta, GA
Donor
[2]Dr Di Cook Department of Statistics Iowa State University
Date Donated: June 30, 1999
Data Characteristics
This data comes from a survey conducted by the Graphics and
Visualization Unit at Georgia Tech October 10 to November 16, 1997.
The full details of the survey are available [3]here.
The particular subset of the survey provided here is the "general
demographics" of internet users. The data have been recoded as
entirely numeric, with an index to the codes described in the "Coding"
file.
The full survey is available from the web site above, along with
summaries, tables and graphs of their analyses. In addition there is
information on other parts of the survey, including technology
demographics and web commerce.
Data Format
The data is stored in an ASCII file with one observation per line.
Spaces separate fields.
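A minimal sketch of reading a file in the format just described (one observation per line, fields separated by spaces) is shown here. The sample values and the `parse_observations` name are hypothetical; fields are kept as strings because the meaning of each code is defined in the separate "Coding" file, which is not reproduced here.

```python
def parse_observations(text):
    """Split space-separated text into one list of string fields per line."""
    observations = []
    for line in text.splitlines():
        fields = line.split()  # splits on any run of whitespace
        if not fields:
            continue  # skip blank lines
        observations.append(fields)
    return observations


# Made-up sample: three observations with four coded fields each.
SAMPLE = "1 41 0 7\n2 28 1 3\n\n3 25 0 5\n"

observations = parse_observations(SAMPLE)
# Every observation should have the same number of fields.
field_counts = {len(obs) for obs in observations}
```

Checking that `field_counts` contains a single value is a cheap way to catch lines that were truncated or split incorrectly before any codes are interpreted.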
Past Usage
This data was used in the American Statistical Association Statistical
Graphics and Computing Sections 1999 Data Exposition.
[4]The UCI KDD Archive
[5]Information and Computer Science
[6]University of California, Irvine
Irvine, CA 92697-3425
Last modified: June 30, 1999
References
1. http://www.gvu.gatech.edu/gvu/user_surveys/survey-1997-10/
2. http://www.public.iastate.edu/~dicook/
3. http://www.cc.gatech.edu/gvu/user_surveys/survey-1997-10/
4. http://kdd.ics.uci.edu/
5. http://www.ics.uci.edu/
6. http://www.uci.edu/
Information about the dataset CLASSTYPE: nominal CLASSINDEX: none specific
ANTswers is an experimental chatbot that can answer questions about the UC Irvine Libraries. ANTswers is a web-based application that runs on a remote library server and is accessed through a web interface page. ANTswers’ personality and persona are based on the UCI mascot, Peter the Anteater. ANTswers responds to simple and short questions. The first link in a response opens in a preview window; all other links open in a new window. Each transaction is reviewed and a data form is filled out to track usage, such as date, time, and answer rate.
University of California Irvine - BioCentury Company Profiles for the biopharma industry
A database of general chemical information. The datasets are comprised of various available chemical datasets annotated with interesting properties to train and test machine-learning prediction and searching methods. Tools provided include ChemicalSearch, Virtual Chemical Space, Reaction Explorer, Datasets, and supplemental material. ChemicalSearch is a tool that allows users to find a chemical by basic criteria like molecular weight and predicted logP, or by the more abstract notion of structural similarity. Virtual Chemical Space is a tool which lets users interactively deconstruct target compounds into component precursors and reconstruct similar building-blocks into combinatorial libraries representing the virtual chemical space near the target compound. Reaction Explorer is a synthesis explorer and mechanism explorer. It provides an interactive system for learning and practicing reactions, syntheses and mechanisms in organic chemistry, with advanced support for the automatic generation of random problems, curved-arrow mechanism diagrams, and inquiry-based learning.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
biology