49 datasets found
  1. Basic information on 40 datasets from UCI repository used in this study...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Gregor Stiglic; Simon Kocbek; Igor Pernek; Peter Kokol (2023). Basic information on 40 datasets from UCI repository used in this study including information about number of instances, attributes, classes, length of longest attribute name (LAN) and length of the longest nominal attribute value (LAV). [Dataset]. http://doi.org/10.1371/journal.pone.0033812.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Gregor Stiglic; Simon Kocbek; Igor Pernek; Peter Kokol
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Basic information on 40 datasets from UCI repository used in this study including information about number of instances, attributes, classes, length of longest attribute name (LAN) and length of the longest nominal attribute value (LAV).

  2. DatasetofDatasets (DoD)

    • kaggle.com
    zip
    Updated Aug 12, 2024
    Cite
    Konstantinos Malliaridis (2024). DatasetofDatasets (DoD) [Dataset]. https://www.kaggle.com/terminalgr/datasetofdatasets-124-1242024
    Explore at:
    zip(7583 bytes)Available download formats
    Dataset updated
    Aug 12, 2024
    Authors
    Konstantinos Malliaridis
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset is essentially the metadata from 164 datasets. Each of its lines concerns a dataset from which 22 features have been extracted, which are used to classify each dataset into one of the categories 0-Unmanaged, 2-INV, 3-SI, 4-NOA (DatasetType).

    This dataset consists of 164 rows. Each row is the metadata of another dataset. The target column is datasetType, which has 4 values indicating the dataset type. These are:

    2 - Invoice detail (INV): This dataset type is a special report (usually called Detailed Sales Statement) produced by company accounting or Enterprise Resource Planning (ERP) software. Using an INV-type dataset directly for ARM is extremely convenient for users, as it relieves them of the tedious work of transforming the data into another, more suitable form. INV-type data input typically includes a header, but only two of its attributes are essential for data mining. The first attribute serves as the grouping identifier that creates a unique transaction (e.g., Invoice ID, Order Number), while the second attribute contains the items used for data mining (e.g., Product Code, Product Name, Product ID).

    3 - Sparse Item (SI): This type is widespread in Association Rules Mining (ARM). It involves a header and a fixed number of columns. Each item corresponds to a column. Each row represents a transaction. The typical cell stores a value, usually one character in length, that depicts the presence or absence of the item in the corresponding transaction. The absence character must be identified or declared before the Association Rules Mining process takes place.

    4 - Nominal Attributes (NOA): This type is commonly used in Machine Learning and Data Mining tasks. It involves a fixed number of columns. Each column registers nominal/categorical values. The presence of a header row is optional. However, in cases where no header is provided, there is a risk of extracting incorrect rules if similar values exist in different attributes of the dataset. The potential values for each attribute can vary.

    0 - Unmanaged for ARM: Not all datasets are suitable for extracting useful association rules or frequent item sets. Examples are datasets characterized predominantly by numerical features with arbitrary values, or datasets that contain fragmented or mixed data types. For such datasets, ARM processing becomes possible only by introducing a data discretization stage, which in turn introduces information loss. Such datasets are not considered in the present treatise and are termed (0) Unmanaged in the sequel.

    The dataset type is crucial to determine for ARM, and the current dataset is used to classify the dataset's type using a Supervised Machine Learning Model.

    There is also another dataset type, 1 - Market Basket List (MBL), where each dataset row is a transaction. A transaction involves a variable number of items. Because of this characteristic, these datasets can be easily categorized using procedural programming, and DoD does not include instances of them. For more details about dataset types, please refer to the article "WebApriori: a web application for association rules mining". https://link.springer.com/chapter/10.1007/978-3-030-49663-0_44
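
    As a rough illustration of the INV layout described above, the sketch below groups invoice lines into one item list per transaction, which is the shape ARM tools usually consume. The column names and rows are hypothetical and are not taken from DoD itself.

```python
from collections import defaultdict

def inv_to_transactions(rows, group_key, item_key):
    """Group INV-style rows into one item list per transaction."""
    transactions = defaultdict(list)
    for row in rows:
        # The first essential attribute identifies the transaction,
        # the second holds the item that belongs to it.
        transactions[row[group_key]].append(row[item_key])
    return dict(transactions)

# Hypothetical invoice lines (illustrative only).
rows = [
    {"InvoiceID": "A1", "ProductCode": "milk"},
    {"InvoiceID": "A1", "ProductCode": "bread"},
    {"InvoiceID": "A2", "ProductCode": "milk"},
]
transactions = inv_to_transactions(rows, "InvoiceID", "ProductCode")
```

    Each resulting list has a variable number of items, which matches the Market Basket List form mentioned above.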

  3. Patient categorical and nominal attributes.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Bogumil M. Konopka; Felicja Lwow; Magdalena Owczarz; Łukasz Łaczmański (2023). Patient categorical and nominal attributes. [Dataset]. http://doi.org/10.1371/journal.pone.0201950.t002
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Bogumil M. Konopka; Felicja Lwow; Magdalena Owczarz; Łukasz Łaczmański
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Patient categorical and nominal attributes.

  4. autoPrice

    • kaggle.com
    zip
    Updated Sep 6, 2020
    Cite
    Mathurin Aché (2020). autoPrice [Dataset]. https://www.kaggle.com/mathurinache/autoprice
    Explore at:
    zip(3004 bytes)Available download formats
    Dataset updated
    Sep 6, 2020
    Authors
    Mathurin Aché
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    All nominal attributes and instances with missing values are deleted. Price treated as the class attribute.

    As used by Kilpatrick, D. & Cameron-Jones, M. (1998). Numeric prediction using instance-based learning with encoding length selection. In Progress in Connectionist-Based Information Systems. Singapore: Springer-Verlag.


    Title: 1985 Auto Imports Database

    Source Information:
    -- Creator/Donor: Jeffrey C. Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu)
    -- Date: 19 May 1987
    -- Sources:
       1) 1985 Model Import Car and Truck Specifications, 1985 Ward's Automotive Yearbook.
       2) Personal Auto Manuals, Insurance Services Office, 160 Water Street, New York, NY 10038
       3) Insurance Collision Report, Insurance Institute for Highway Safety, Watergate 600, Washington, DC 20037

    Past Usage:
    -- Kibler, D., Aha, D. W., & Albert, M. (1989). Instance-based prediction of real-valued attributes. Computational Intelligence, 5, 51-57.
    -- Predicted price of car using all numeric and Boolean attributes
    -- Method: an instance-based learning (IBL) algorithm derived from a localized k-nearest neighbor algorithm. Compared with a linear regression prediction...so all instances with missing attribute values were discarded. This resulted in a training set of 159 instances, which was also used as a test set (minus the actual instance during testing).
    -- Results: Percent Average Deviation Error of Prediction from Actual
       -- 11.84% for the IBL algorithm
       -- 14.12% for the resulting linear regression equation

    Relevant Information: -- Description: This data set consists of three types of entities: (a) the specification of an auto in terms of various characteristics, (b) its assigned insurance risk rating, (c) its normalized losses in use as compared to other cars. The second rating corresponds to the degree to which the auto is more risky than its price indicates. Cars are initially assigned a risk factor symbol associated with their price. Then, if a car is more risky (or less), this symbol is adjusted by moving it up (or down) the scale. Actuaries call this process "symboling". A value of +3 indicates that the auto is risky, -3 that it is probably pretty safe.

    The third factor is the relative average loss payment per insured vehicle year. This value is normalized for all autos within a particular size classification (two-door small, station wagons, sports/speciality, etc...), and represents the average loss per car per year.

    -- Note: Several of the attributes in the database could be used as a "class" attribute.

    Number of Instances: 205

    Number of Attributes: 26 total -- 15 continuous -- 1 integer -- 10 nominal

    Attribute Information (attribute: attribute range):
    - symboling: -3, -2, -1, 0, 1, 2, 3.
    - normalized-losses: continuous from 65 to 256.
    - make: alfa-romero, audi, bmw, chevrolet, dodge, honda, isuzu, jaguar, mazda, mercedes-benz, mercury, mitsubishi, nissan, peugot, plymouth, porsche, renault, saab, subaru, toyota, volkswagen, volvo
    - fuel-type: diesel, gas.
    - aspiration: std, turbo.
    - num-of-doors: four, two.
    - body-style: hardtop, wagon, sedan, hatchback, convertible.
    - drive-wheels: 4wd, fwd, rwd.
    - engine-location: front, rear.
    - wheel-base: continuous from 86.6 to 120.9.
    - length: continuous from 141.1 to 208.1.
    - width: continuous from 60.3 to 72.3.
    - height: continuous from 47.8 to 59.8.
    - curb-weight: continuous from 1488 to 4066.
    - engine-type: dohc, dohcv, l, ohc, ohcf, ohcv, rotor.
    - num-of-cylinders: eight, five, four, six, three, twelve, two.
    - engine-size: continuous from 61 to 326.
    - fuel-system: 1bbl, 2bbl, 4bbl, idi, mfi, mpfi, spdi, spfi.
    - bore: continuous from 2.54 to 3.94.
    - stroke: continuous from 2.07 to 4.17.
    - compression-ratio: continuous from 7 to 23.
    - horsepower: continuous from 48 to 288.
    - peak-rpm: continuous from 4150 to 6600.
    - city-mpg: continuous from 13 to 49.
    - highway-mpg: continuous from 16 to 54.
    - price: continuous from 5118 to 45400.

    Missing Attribute Values: (denoted by "?") Number of instances missing a value, for each of the seven affected attributes: 41, 2, 4, 4, 2, 2, 4
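
    The preprocessing stated at the top of this description (drop instances with missing values, drop nominal attributes, keep price as the class) could look roughly like the sketch below. The three rows and the reduced attribute set are made up for illustration; only the "?" missing-value marker comes from the description.

```python
MISSING = "?"  # missing-value marker named in the description

def drop_incomplete(rows):
    """Keep only rows in which no attribute is missing."""
    return [r for r in rows if MISSING not in r.values()]

def keep_numeric(rows, nominal):
    """Drop nominal attributes and convert the remaining values to float."""
    return [
        {k: float(v) for k, v in r.items() if k not in nominal}
        for r in rows
    ]

# Hypothetical rows using a small subset of the attributes listed above.
rows = [
    {"make": "audi", "fuel-type": "gas", "horsepower": "102", "price": "13950"},
    {"make": "bmw", "fuel-type": "gas", "horsepower": "?", "price": "16430"},
    {"make": "volvo", "fuel-type": "diesel", "horsepower": "106", "price": "?"},
]
NOMINAL = {"make", "fuel-type"}
clean = keep_numeric(drop_incomplete(rows), NOMINAL)
```

    Rows are filtered before the nominal columns are dropped, so a "?" in a nominal attribute also removes the instance, which matches "instances with missing values are deleted".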

  5. Abdominal Electromyograms (EMGs) Dataset: Breathing Patterns of Sleeping...

    • data.mendeley.com
    Updated Apr 13, 2023
    Cite
    Gennady Chuiko (2023). Abdominal Electromyograms (EMGs) Dataset: Breathing Patterns of Sleeping Adults [Dataset]. http://doi.org/10.17632/pmspdmgcd4.3
    Explore at:
    Dataset updated
    Apr 13, 2023
    Authors
    Gennady Chuiko
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data set supports machine learning for identifying breathing patterns of sleeping adults from preprocessed abdominal electromyograms (EMGs). The 40 records were picked at random from a larger database (Computing in Cardiology Challenge 2018: Training/Test Sets. 2018. URL: https://archive.physionet.org/physiobank/database/challenge/2018/). The optimal exponential smoothing model was the same for all records: additive errors, small undamped trends, and no seasonality. Once cleared of trends and noise, the signals had autocorrelation functions with power-law decay, which made it possible to evaluate their persistence factors (Hurst exponents).
    Most of the signals (38 of 40) showed frequent outliers, with outlier shares ranging from a few percent up to 24.6%. The wide data variability was rated with the median absolute deviation, which is the most robust statistic in such a case. Such high variability is somewhat unexpected given the fairly low noise levels. The outlier percentage, variability, SNR (signal-to-noise ratio), and persistence factors were z-scored using medians and median absolute deviations. Their linear combinations then form three independent Principal Components: the numeric attributes z_1, z_2, and z_3 of the data set.
    The matrix of Manhattan distances among the subjects' vectors in the 4D attribute space allows the data set to be represented as a weighted biconnected graph whose vertices are the subjects. The edge weights reflect the distance between each pair of subjects. The "closeness centralities" of the vertices, a well-known parameter in graph theory, allowed us to split the data into two clusters with 11 and 29 subjects. They form two biconnected subgraphs, peripheral and core, respectively. Membership in one of them is recorded in the binary (nominal) attribute z_4: 0 labels the peripheral subgraph and 1 the core one. The periodograms of the EMGs allowed us to find 10 subjects with regular breathing and 30 with irregular breathing, defining two unequal classes through the nominal attribute z_5. The data set is therefore offered for machine learning in ARFF format, containing 40 instances with five attributes, whose meaning is described above.
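
    Two of the steps named above, z-scoring with medians and median absolute deviations, and closeness centrality on a Manhattan-distance graph, can be sketched as follows. This is a minimal illustration under assumptions, not the author's procedure: it assumes the unscaled MAD, and it assumes closeness is defined as (n - 1) divided by the sum of distances to all other vertices. Because Manhattan distance obeys the triangle inequality, the direct edge between two vertices of the complete graph is already a shortest path.

```python
from statistics import median

def robust_z(values):
    """Z-score with the median as centre and the unscaled MAD as spread."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust z-score is undefined")
    return [(v - med) / mad for v in values]

def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def closeness(points):
    """Closeness centrality of each point in the complete distance graph.

    Assumed definition: (n - 1) / sum of distances to the other points.
    The distance of a point to itself is 0, so it can stay in the sum.
    """
    n = len(points)
    return [(n - 1) / sum(manhattan(p, q) for q in points) for p in points]

# Hypothetical values: one extreme value barely moves median and MAD.
scores = robust_z([1, 2, 3, 4, 100])

# Hypothetical subjects on a line; the middle one is the most central.
centralities = closeness([(0, 0), (1, 0), (4, 0)])
```

    Vertices with high closeness sit in the core of the graph and those with low closeness on its periphery, which is the kind of split the description reports.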

  6. Adult Income - UCI Dataset

    • kaggle.com
    zip
    Updated Jan 5, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Zahra Meki (2023). Adult Income - UCI Dataset [Dataset]. https://www.kaggle.com/datasets/zahrameki/adult-income-uci-dataset
    Explore at:
    zip(722129 bytes)Available download formats
    Dataset updated
    Jan 5, 2023
    Authors
    Zahra Meki
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    My Application Link: https://zahrameki-project.streamlit.app

    Dataset Description

    Number of Instances: - 48842 instances, mix of continuous and discrete (train=32561, test=16281) - 45222 if instances with unknown values are removed (train=30162, test=15060)

    Number of Attributes: - 6 continuous, 8 nominal attributes.

    Attribute Information:
    - age: continuous.
    - workclass: Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.
    - fnlwgt: continuous.
    - education: Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.
    - education-num: continuous.
    - marital-status: Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse.
    - occupation: Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.
    - relationship: Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.
    - race: White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.
    - gender: Female, Male.
    - capital-gain: continuous.
    - capital-loss: continuous.
    - hours-per-week: continuous.
    - native-country: United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.
    - income: >50K, <=50K

  7. Data from: DATA MINING THE GALAXY ZOO MERGERS

    • data.nasa.gov
    • gimi9.com
    • +3more
    Updated Mar 31, 2025
    Cite
    nasa.gov (2025). DATA MINING THE GALAXY ZOO MERGERS [Dataset]. https://data.nasa.gov/dataset/data-mining-the-galaxy-zoo-mergers
    Explore at:
    Dataset updated
    Mar 31, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    DATA MINING THE GALAXY ZOO MERGERS. STEVEN BAEHR, ARUN VEDACHALAM, KIRK BORNE, AND DANIEL SPONSELLER.

    Abstract. Collisions between pairs of galaxies usually end in the coalescence (merger) of the two galaxies. Collisions and mergers are rare phenomena, yet they may signal the ultimate fate of most galaxies, including our own Milky Way. With the onset of massive collection of astronomical data, a computerized and automated method will be necessary for identifying those colliding galaxies worthy of more detailed study. This project researches methods to accomplish that goal. Astronomical data from the Sloan Digital Sky Survey (SDSS) and human-provided classifications on merger status from the Galaxy Zoo project are combined and processed with machine learning algorithms. The goal is to determine indicators of merger status based solely on discovering those automated pipeline-generated attributes in the astronomical database that correlate most strongly with the patterns identified through visual inspection by the Galaxy Zoo volunteers. In the end, we aim to provide a new and improved automated procedure for classification of collisions and mergers in future petascale astronomical sky surveys. Both information gain analysis (via the C4.5 decision tree algorithm) and cluster analysis (via the Davies-Bouldin Index) are explored as techniques for finding the strongest correlations between human-identified patterns and existing database attributes. Galaxy attributes measured in the SDSS green waveband images are found to represent the most influential of the attributes for correct classification of collisions and mergers. Only a nominal information gain is noted in this research; however, there is a clear indication of which attributes contribute, so that a direction for further study is apparent.
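
    To make the notion of information gain in the abstract concrete, the sketch below computes the plain entropy-based gain of one discrete attribute with respect to a class label. It is not the C4.5 implementation used in the study, and the attribute values and merger labels are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on values."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical labels: "m" = merger, "n" = non-merger.
labels = ["m", "m", "n", "n"]
gain_informative = information_gain(["a", "a", "b", "b"], labels)
gain_uninformative = information_gain(["a", "b", "a", "b"], labels)
```

    An attribute that separates the classes perfectly recovers the full label entropy, while one that is independent of the label yields no gain. Ranking attributes by this quantity is one way to see which ones "contribute".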

  8. Data from: Identification and Prioritization of Important Attributes of...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Nov 4, 2016
    Cite
    Hiligsmann, Mickaël; Kremer, Ingrid E. H.; van der Weijden, Trudy; van de Kolk, Ilona; Evers, Silvia M. A. A.; Jongen, Peter J. (2016). Identification and Prioritization of Important Attributes of Disease-Modifying Drugs in Decision Making among Patients with Multiple Sclerosis: A Nominal Group Technique and Best-Worst Scaling [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001589376
    Explore at:
    Dataset updated
    Nov 4, 2016
    Authors
    Hiligsmann, Mickaël; Kremer, Ingrid E. H.; van der Weijden, Trudy; van de Kolk, Ilona; Evers, Silvia M. A. A.; Jongen, Peter J.
    Description

    Objectives: Understanding the preferences of patients with multiple sclerosis (MS) for disease-modifying drugs and involving these patients in clinical decision making can improve the concordance between medical decisions and patient values and may, subsequently, improve adherence to disease-modifying drugs. This study aims first to identify which characteristics (or attributes) of disease-modifying drugs influence patients' decisions about these treatments and second to quantify the attributes' relative importance among patients.

    Methods: First, three focus groups of relapsing-remitting MS patients were formed to compile a preliminary list of attributes using a nominal group technique. Based on this qualitative research, a survey with several choice tasks (best-worst scaling) was developed to prioritize attributes, asking a larger patient group to choose the most and least important attributes. The attributes' mean relative importance scores (RIS) were calculated.

    Results: Nineteen patients reported 34 attributes during the focus groups and 185 patients evaluated the importance of the attributes in the survey. The effect on disease progression received the highest RIS (RIS = 9.64, 95% confidence interval: [9.48–9.81]), followed by quality of life (RIS = 9.21 [9.00–9.42]), relapse rate (RIS = 7.76 [7.39–8.13]), severity of side effects (RIS = 7.63 [7.33–7.94]) and relapse severity (RIS = 7.39 [7.06–7.73]). Subgroup analyses showed heterogeneity in preference of patients. For example, side effect-related attributes were statistically more important for patients who had no experience in using disease-modifying drugs compared to experienced patients (p < .001).

    Conclusions: This study shows that, on average, patients valued effectiveness and unwanted effects as most important. Clinicians should be aware of the average preferences but also that attributes of disease-modifying drugs are valued differently by different patients. Person-centred clinical decision making would be needed and requires eliciting individual preferences.

  9. nickel

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Bart Massey (2020). nickel [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_322467
    Explore at:
    Dataset updated
    Jan 24, 2020
    Authors
    Bart Massey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering.

    Title: Nickle Repository Transaction Data

    Sources: (a) Original creators of database: Bart Massey 01 503 725-5393 Computer Science Dept. PO Box 751 MS CMPS Portland State University Portland, OR USA 97207-0751 bart@cs.pdx.edu

    (b) Donor of database: owner

    (c) Date received: 31 March 2005

    Past Usage: none

    Relevant Information:

    This dataset was assembled by analyzing the publicly-available CVS archives of the Nickle programming language (http://nickle.org) using a modified version of CVSAnalY (http://metricsgrimoire.github.io/CVSAnalY). It is intended for a wide variety of uses, and thus no dependent variable is specified. See Massey's PROMISE 2005 paper, "Longitudinal Analysis of Long-Timescale Open Source Repository Data", for further information.

    Number of Instances: 2972

    Number of Attributes: 10 (incl. 2 inferred)

    Attribute information:

    - Inferred filetype: non-numeric, nominal; 9 values (documentation, images, i18n, ui, multimedia, code, build, devel-doc, unknown)
    - File Pathname: UNIX relative path name; non-numeric, structured; 15 directories, 265 files
    - Revision: dotted-decimal revision string; non-numeric, structured; 161 unique revision numbers
    - Author ID: integer identifier; non-numeric, nominal; 6 unique authors
    - Lines added: integer; numeric, integer; MIN: 0 MAX: 1011 MEAN: 24.0148 STDEV: 65.6858
    - Lines removed: integer; numeric, integer; MIN: 0 MAX: 678 MEAN: 11.6413 STDEV: 42.4373
    - File has since been removed ("in Attic"): boolean (0, 1); non-numeric, nominal; 11.98% positive
    - Commit has CVS_SILENT flag: boolean (0, 1); non-numeric, nominal; never set
    - Inferred that committer was not author: boolean (0, 1); non-numeric, nominal; never set
    - Commit date: date; MIN: "1999-01-13 06:22:11" MAX: "2005-01-14 16:58:53"

    Note: CVS has trouble with timezones; we assume all dates are UTC.

    Missing Attribute Values:

    unknown inferred filetype (attr 1): 69

    Class Distribution: N/A
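
    Since the description states that all commit dates are assumed to be UTC, a reader loading them might attach that timezone explicitly rather than leave the timestamps naive. The sketch below does this for the two boundary dates quoted above. The format string is inferred from those two examples and is an assumption about the rest of the file.

```python
from datetime import datetime, timezone

def parse_utc(stamp):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string and mark it as UTC."""
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=timezone.utc)

# Boundary values quoted in the attribute description.
first = parse_utc("1999-01-13 06:22:11")
last = parse_utc("2005-01-14 16:58:53")

# Length of the covered history as a timedelta.
span = last - first
```

    Timezone-aware values avoid accidental mixing with local-time timestamps when the commit history is compared with other sources.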

  10. Person Information

    • kaggle.com
    zip
    Updated Dec 16, 2021
    Cite
    Jatin (2021). Person Information [Dataset]. https://www.kaggle.com/datasets/jkanthony/person-information/discussion
    Explore at:
    zip(1040 bytes)Available download formats
    Dataset updated
    Dec 16, 2021
    Authors
    Jatin
    Description

    This database contains 5 numeric-valued attributes.

    Attribute Information:

    1. ID: distinct for each instance and represented numerically

    2. hobby: nominal values ranging between 1 and 3 (Chess, Sports, Stamps)

    3. age: nominal values ranging between 1 and 4 (Child, Teenager, Young, Old)

    4. educational level: nominal values ranging between 1 and 4 (Primary, Higher Secondary, Graduate, Post Graduate)

    5. marital status: nominal values ranging between 1 and 4 (Not Married, Married, Divorced, Complicated)

    6. class: nominal value between 1 and 3 (Lower, Middle, Upper)
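
    The attributes above are nominal values stored as integers, so a small codebook makes records readable. The sketch below assumes that the integer codes follow the order in which the labels are listed (1 = first label); the description does not state this explicitly, and the sample record is hypothetical.

```python
# Assumed code-to-label mapping: code 1 is the first label listed, and so on.
CODEBOOK = {
    "hobby": {1: "Chess", 2: "Sports", 3: "Stamps"},
    "age": {1: "Child", 2: "Teenager", 3: "Young", 4: "Old"},
    "educational level": {
        1: "Primary", 2: "Higher Secondary", 3: "Graduate", 4: "Post Graduate",
    },
    "marital status": {
        1: "Not Married", 2: "Married", 3: "Divorced", 4: "Complicated",
    },
    "class": {1: "Lower", 2: "Middle", 3: "Upper"},
}

def decode(record):
    """Replace integer codes with labels; attributes without a codebook pass through."""
    return {
        name: CODEBOOK[name][value] if name in CODEBOOK else value
        for name, value in record.items()
    }

# Hypothetical record.
decoded = decode({"ID": 7, "hobby": 2, "age": 3, "class": 1})
```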

  11. Data from: Online Retail Dataset

    • kaggle.com
    zip
    Updated Sep 27, 2022
    Cite
    Sowndarya23 (2022). Online Retail Dataset [Dataset]. https://www.kaggle.com/datasets/sowndarya23/online-retail-dataset/code
    Explore at:
    zip(7571371 bytes)Available download formats
    Dataset updated
    Sep 27, 2022
    Authors
    Sowndarya23
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset:

    This is a transnational data set that contains all the transactions of an online retailer occurring between 2010 and 2011. The information is collected from countries such as the US, UK, and France.

    Attribute Details:

    - InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
    - StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
    - Description: Product (item) name. Nominal.
    - Quantity: The quantities of each product (item) per transaction. Numeric.
    - InvoiceDate: Invoice date and time. Numeric. The day and time when a transaction was generated.
    - UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
    - CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
    - Country: Country name. Nominal. The name of the country where a customer resides.
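
    The cancellation rule for InvoiceNo can be applied with a simple prefix check, sketched below. The check is case-insensitive as a precaution, which is an assumption on top of the description, and the invoice numbers are hypothetical.

```python
def is_cancellation(invoice_no):
    """Return True if the invoice code starts with the letter 'c' (any case)."""
    return str(invoice_no).strip().lower().startswith("c")

def split_invoices(invoice_numbers):
    """Separate regular invoices from cancellations, keeping input order."""
    regular, cancelled = [], []
    for number in invoice_numbers:
        (cancelled if is_cancellation(number) else regular).append(number)
    return regular, cancelled

# Hypothetical invoice numbers.
regular, cancelled = split_invoices(["536365", "C536379", "536366"])
```

    Separating cancellations first keeps them from being counted as ordinary purchases in later analysis.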

  12. Peatland Decomposition Database (1.1.0)

    • data.niaid.nih.gov
    Updated Mar 5, 2025
    Cite
    Teickner, Henning; Knorr, Klaus-Holger (2025). Peatland Decomposition Database (1.1.0) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_11276064
    Explore at:
    Dataset updated
    Mar 5, 2025
    Dataset provided by
    University of Münster
    Authors
    Teickner, Henning; Knorr, Klaus-Holger
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1 Introduction

    The Peatland Decomposition Database (PDD) stores data from published litterbag experiments related to peatlands. Currently, the database focuses on northern peatlands and Sphagnum litter and peat, but it also contains data from some vascular plant litterbag experiments. Currently, the database contains entries from 34 studies, 2,160 litterbag experiments, and 7,297 individual samples with 117,841 measurements for various attributes (e.g. relative mass remaining, N content, holocellulose content, mesh size). The aim is to provide a harmonized data source that can be useful to re-analyse existing data and to plan future litterbag experiments.

    The Peatland Productivity and Decomposition Parameter Database (PPDPD) (Bona et al. 2018) is similar to the Peatland Decomposition Database (PDD) in that both contain data from peatland litterbag experiments. The differences are that both databases partly contain different data, that PPDPD additionally contains information on vegetation productivity, which PDD does not, and that PDD provides more information and metadata on litterbag experiments, and also measurement errors.

    2 Updates

    Compared to version 1.0.0, this version has a new structure for table experimental_design_format, contains additional metadata on the experimental design (these were omitted in version 1.0.0), and contains the scripts that were used to import the data into the database.

    3 Methods

    3.1 Data collection

    Data for the database was collected from published litterbag studies, by extracting published data from figures, tables, or other data sources, and by contacting the authors of the studies to obtain raw data. All data processing was done with R (R version 4.2.0 (2022-04-22)) (R Core Team 2022).

    Studies were identified via a Scopus search with search string (TITLE-ABS-KEY ( peat* AND ( "litter bag" OR "decomposition rate" OR "decay rate" OR "mass loss")) AND NOT ("tropic*")) (2022-12-17). These studies were further screened to exclude those which do not contain litterbag data or which recycle data from other studies that have already been considered. Additional studies with litterbag experiments in northern peatlands that we were aware of, but which were not identified in the literature search, were added to the list of publications. For studies not older than 10 years, authors were contacted to obtain raw data; however, this was successful in only a few cases. To date, the database focuses on Sphagnum litterbag experiments, and data from some of the studies identified by the literature search have not yet been included.

    Data from figures were extracted using the package ‘metaDigitise’ (1.0.1) (Pick, Nakagawa, and Noble 2018). Data from tables were extracted manually.

    Data from the following studies are currently included: Farrish and Grigal (1985), Bartsch and Moore (1985), Farrish and Grigal (1988), Vitt (1990), Hogg, Lieffers, and Wein (1992), Sanger, Billett, and Cresser (1994), Hiroki and Watanabe (1996), Szumigalski and Bayley (1996), Prevost, Belleau, and Plamondon (1997), Arp, Cooper, and Stednick (1999), Robbert A. Scheffer and Aerts (2000), R. A. Scheffer, Van Logtestijn, and Verhoeven (2001), Limpens and Berendse (2003), Waddington, Rochefort, and Campeau (2003), Asada, Warner, and Banner (2004), Thormann, Bayley, and Currah (2001), Trinder, Johnson, and Artz (2008), Breeuwer et al. (2008), Trinder, Johnson, and Artz (2009), Bragazza and Iacumin (2009), Hoorens, Stroetenga, and Aerts (2010), Straková et al. (2010), Straková et al. (2012), Orwin and Ostle (2012), Lieffers (1988), Manninen et al. (2016), Johnson and Damman (1991), Bengtsson, Rydin, and Hájek (2018a), Bengtsson, Rydin, and Hájek (2018b), Asada and Warner (2005), Bengtsson, Granath, and Rydin (2017), Bengtsson, Granath, and Rydin (2016), Hagemann and Moroni (2015), Hagemann and Moroni (2016), B. Piatkowski et al. (2021), B. T. Piatkowski et al. (2021), Mäkilä et al. (2018), Golovatskaya and Nikonova (2017), Golovatskaya and Nikonova (2017).

    4 Database records

    The database is a ‘MariaDB’ database and the database schema was designed to store data and metadata following the Ecological Metadata Language (EML) (Jones et al. 2019). Descriptions of the tables are shown in Tab. 1.

    The database contains general metadata relevant for litterbag experiments (e.g., geographical, temporal, and taxonomic coverage, mesh sizes, experimental design). However, it does not contain a detailed description of sample handling, sample preprocessing methods, site descriptions, because there currently are no discipline-specific metadata and reporting standards.

    Table 1: Description of the individual tables in the database.

    Name | Description
    attributes | Defines the attributes of the database and the values in column attribute_name in table data.
    citations | Stores bibtex entries for references and data sources.
    citations_to_datasets | Links entries in table citations with entries in table datasets.
    custom_units | Stores custom units.
    data | Stores measured values for samples, for example remaining masses.
    datasets | Lists the individual datasets.
    experimental_design_format | Stores information on the experimental design of litterbag experiments.
    measurement_scales, measurement_scales_date_time, measurement_scales_interval, measurement_scales_nominal, measurement_scales_ordinal, measurement_scales_ratio | Defines data value types.
    missing_value_codes | Defines how missing values are encoded.
    samples | Stores information on individual samples.
    samples_to_samples | Links samples to other samples, for example litter samples collected in the field to litter samples collected during the incubation of the litterbags.
    units, unit_types | Stores information on measurement units.

    5 Attributes

    Table 2: Definition of attributes in the Peatland Decomposition Database and entries in the column attribute_name in table data.

    Name Definition Example value Unit Measurement scale Number type Minimum value Maximum value String format

    4_hydroxyacetophenone_mass_absolute A numeric value representing the content of 4-hydroxyacetophenone, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    4_hydroxyacetophenone_mass_relative_mass A numeric value representing the content of 4-hydroxyacetophenone, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    4_hydroxybenzaldehyde_mass_absolute A numeric value representing the content of 4-hydroxybenzaldehyde, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    4_hydroxybenzaldehyde_mass_relative_mass A numeric value representing the content of 4-hydroxybenzaldehyde, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    4_hydroxybenzoic_acid_mass_absolute A numeric value representing the content of 4-hydroxybenzoic acid, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    4_hydroxybenzoic_acid_mass_relative_mass A numeric value representing the content of 4-hydroxybenzoic acid, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    abbreviation In table custom_units: A string representing an abbreviation for the custom unit. gC NA nominal NA NA NA NA

    acetone_extractives_mass_absolute A numeric value representing the content of acetone extractives, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    acetone_extractives_mass_relative_mass A numeric value representing the content of acetone extractives, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    acetosyringone_mass_absolute A numeric value representing the content of acetosyringone, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    acetosyringone_mass_relative_mass A numeric value representing the content of acetosyringone, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    acetovanillone_mass_absolute A numeric value representing the content of acetovanillone, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    acetovanillone_mass_relative_mass A numeric value representing the content of acetovanillone, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    arabinose_mass_absolute A numeric value representing the content of arabinose, as described in Straková et al. (2010). 0.26 g ratio real 0 Inf NA

    arabinose_mass_relative_mass A numeric value representing the content of arabinose, as described in Straková et al. (2010). 0.26 g/g ratio real 0 1 NA

    ash_mass_absolute A numeric value representing the content of ash (after burning at 550°C). 4 g ratio real 0 Inf NA

    ash_mass_relative_mass A numeric value representing the content of ash (after burning at 550°C). 0.05 g/g ratio real 0 Inf NA

    attribute_definition A free text field with a textual description of the meaning of attributes in the dpeatdecomposition database. NA NA nominal NA NA NA NA

    attribute_name A string describing the names of the attributes in all tables of the dpeatdecomposition database. attribute_name NA nominal NA NA NA NA

    bibtex A string representing the bibtex code used for a literature reference throughout the dpeatdecomposition database. Galka.2021 NA nominal NA NA NA NA

    bounds_maximum A numeric value representing the minimum possible value for a numeric attribute. 0 NA interval real Inf Inf NA

    bounds_minimum A numeric value representing the maximum possible value for a numeric attribute. INF NA interval real Inf Inf NA

    bulk_density A numeric value representing the bulk density of the sample [g cm-3]. 0,2 g/cm^3 ratio real 0 Inf NA

    C_absolute The absolute mass of C in the sample. 1 g ratio real 0 Inf NA

    C_relative_mass The relative mass of C in the sample. 1 g/g ratio real 0 Inf NA

    C_to_N A numeric value representing the C to N ratio of the sample. 35 g/g ratio real 0 Inf NA

    C_to_P A numeric value representing the C to P ratio of the sample. 35 g/g ratio real 0 Inf NA

    Ca_absolute The

  13. DCCEEW_Geospatial - Interim Biogeographic Regionalisation for Australia...

    • gimi9.com
    + more versions
    Cite
    DCCEEW_Geospatial - Interim Biogeographic Regionalisation for Australia (IBRA) Version 5.1 (Subregions) | gimi9.com [Dataset]. https://gimi9.com/dataset/au_erin-interim-biogeographic-regionalisation-for-australia-ibra-version-5-1-subregions/
    Explore at:
    Description

    IBRA version 5.1 Sub-regions, like their parent regionalisation IBRA version 5.1, represent a landscape based approach to classifying the land surface of Australia from a range of continental data on environmental attributes, at a finer scale. 354 IBRA Sub-regions have been delineated, each reflecting a unifying set of major environmental influences which shape the occurrence of flora and fauna and their interaction with the physical environment. The IBRA Version 5.1 Sub-regions are the result of refinement of the IBRA Version 4 boundaries. These refined boundaries were jointly defined by the Commonwealth, State and Territory nature and conservation agencies. Following a DEH facilitated workshop on the revision of boundaries on 24 July 2000, spatial data refinements were undertaken by DEH in conjunction with relevant State / Territory agencies. Nominal attributes for the IBRA and IBRA Sub-regions are: climate, lithology/geology, landform, vegetation, flora and fauna, and landuse. The use of these attributes varies across the States.

  14. Online Retail II

    • kaggle.com
    zip
    Updated Apr 12, 2021
    Cite
    Bojan Tunguz (2021). Online Retail II [Dataset]. https://www.kaggle.com/tunguz/online-retail-ii
    Explore at:
    Available download formats: zip (7471823 bytes)
    Dataset updated
    Apr 12, 2021
    Authors
    Bojan Tunguz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Source:

    Dr. Daqing Chen, Course Director: MSc Data Science. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.

    Data Set Information:

    This Online Retail II data set contains all the transactions occurring for a UK-based and registered, non-store online retailer between 01/12/2009 and 09/12/2011. The company mainly sells unique all-occasion gift-ware. Many customers of the company are wholesalers.

    Attribute Information:

    InvoiceNo: Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation.
    StockCode: Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product.
    Description: Product (item) name. Nominal.
    Quantity: The quantities of each product (item) per transaction. Numeric.
    InvoiceDate: Invoice date and time. Numeric. The day and time when a transaction was generated.
    UnitPrice: Unit price. Numeric. Product price per unit in sterling (£).
    CustomerID: Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer.
    Country: Country name. Nominal. The name of the country where a customer resides.
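The attribute notes above can be turned into a small check. The following is a minimal sketch, not the dataset author's code: the `InvoiceLine` class, the helper names, and the two sample rows are hypothetical, and treating the leading 'c' as case-insensitive is an assumption.

```python
from dataclasses import dataclass


@dataclass
class InvoiceLine:
    # Hypothetical record holding three of the attributes described above.
    invoice_no: str    # InvoiceNo, kept as text because it may start with a letter
    quantity: int      # Quantity
    unit_price: float  # UnitPrice, in sterling per unit


def is_cancellation(invoice_no: str) -> bool:
    # Assumption: the leading 'c' is matched case-insensitively.
    return invoice_no[:1].lower() == "c"


def line_total(line: InvoiceLine) -> float:
    # Value of one invoice line: quantity times unit price.
    return line.quantity * line.unit_price


# Illustrative rows only; these values are made up, not taken from the dataset.
rows = [
    InvoiceLine("489434", 12, 6.95),
    InvoiceLine("C489449", -1, 2.95),
]

completed = [r for r in rows if not is_cancellation(r.invoice_no)]
revenue = sum(line_total(r) for r in completed)
```

Keeping InvoiceNo as a string avoids losing the cancellation prefix, which a numeric parse would reject or drop.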

    Relevant Papers:

    Chen, D., Sain, S.L., and Guo, K. (2012), Data mining for the online retail industry: A case study of RFM model-based customer segmentation using data mining, Journal of Database Marketing and Customer Strategy Management, Vol. 19, No. 3, pp. 197-208. doi: [Web Link].
    Chen, D., Guo, K. and Ubakanma, G. (2015), Predicting customer profitability over time based on RFM time series, International Journal of Business Forecasting and Marketing Intelligence, Vol. 2, No. 1, pp. 1-18. doi: [Web Link].
    Chen, D., Guo, K., and Li, Bo (2019), Predicting Customer Profitability Dynamically over Time: An Experimental Comparative Study, 24th Iberoamerican Congress on Pattern Recognition (CIARP 2019), Havana, Cuba, 28-31 Oct, 2019.
    Laha Ale, Ning Zhang, Huici Wu, Dajiang Chen, and Tao Han, Online Proactive Caching in Mobile Edge Computing Using Bidirectional Deep Recurrent Neural Network, IEEE Internet of Things Journal, Vol. 6, Issue 3, pp. 5520-5530, 2019.
    Rina Singh, Jeffrey A. Graves, Douglas A. Talbert, William Eberle, Prefix and Suffix Sequential Pattern Mining, Industrial Conference on Data Mining 2018: Advances in Data Mining. Applications and Theoretical Aspects, pp. 309-324. 2018.

    Citation Request:

    If you have no special citation requests, please leave this field blank.

  15. Interim Biogeographic Regionalisation for Australia (IBRA), Version 5.1 -...

    • data.gov.au
    shapefile
    Updated Nov 20, 2018
    + more versions
    Cite
    Australian Government Department of Climate Change, Energy, the Environment and Water (2018). Interim Biogeographic Regionalisation for Australia (IBRA), Version 5.1 - Sub-regions [Dataset]. https://data.gov.au/dataset/ds-environment-A981947A-B93A-4CC0-97A6-6B7A55B4DDA6?q=
    Explore at:
    Available download formats: shapefile
    Dataset updated
    Nov 20, 2018
    Dataset provided by
    Australian Government (http://www.australia.gov.au/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    IBRA version 5.1 Sub-regions, like their parent regionalisation IBRA version 5.1, represent a landscape based approach to classifying the land surface of Australia from a range of continental data on environmental attributes, at a finer scale. 354 IBRA Sub-regions have been delineated, each reflecting a unifying set of major environmental influences which shape the occurrence of flora and fauna and their interaction with the physical environment. The IBRA Version 5.1 Sub-regions are the result of refinement of the IBRA Version 4 boundaries. These refined boundaries were jointly defined by the Commonwealth, State and Territory nature and conservation agencies. Following a DEH facilitated workshop on the revision of boundaries on 24 July 2000, spatial data refinements were undertaken by DEH in conjunction with relevant State / Territory agencies. Nominal attributes for the IBRA and IBRA Sub-regions are: climate, lithology/geology, landform, vegetation, flora and fauna, and landuse. The use of these attributes varies across the States.

  16. List of potentially important attributes/interventions.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 14, 2023
    Cite
    Vicki Nelson; Alex Dubov; Kelly Morton; Liana Fraenkel (2023). List of potentially important attributes/interventions. [Dataset]. http://doi.org/10.1371/journal.pone.0264921.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Vicki Nelson; Alex Dubov; Kelly Morton; Liana Fraenkel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of potentially important attributes/interventions.

  17. isbsg10

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jan 24, 2020
    Cite
    ISBSG Limited; ISBSG Limited (2020). isbsg10 [Dataset]. http://doi.org/10.5281/zenodo.268485
    Explore at:
    Available download formats: bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    ISBSG Limited; ISBSG Limited
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is teaser data from the International Software Benchmarking Standards Group. Stored here is a small subset of the ISBSG data. The rest of the data can be accessed, for a nominal cost, from https://www.isbsg.org/.

    See also

    The COSMIC data set http://openscience.us/repo/effort/isbsg/cosmic.html

    Reference

    The International Software Benchmarking Standards Group Limited, ISBSG (http://www.isbsg.org)

    Attribute Information

  18. TrajectoryProfile - R4.x257.000.0019 - os75nb - 28.89N, 88.20W - 2018-07-18

    • erddap.griidc.org
    Updated Jan 13, 2021
    + more versions
    Cite
    Tracey Sutton (2021). TrajectoryProfile - R4.x257.000.0019 - os75nb - 28.89N, 88.20W - 2018-07-18 [Dataset]. https://erddap.griidc.org/erddap/info/R4_x257_000_0019_os75nb/index.html
    Explore at:
    Dataset updated
    Jan 13, 2021
    Dataset provided by
    Gulf of Mexico Research Initiative Information and Data (GRIIDC)
    Authors
    Tracey Sutton
    Time period covered
    Jul 19, 2018 - Aug 2, 2018
    Area covered
    Variables measured
    crs, flag, time, depth, latitude, platform, longitude, instrument, trajectory, percent_good, and 8 more
    Description

    This dataset contains Acoustic Doppler Current Profiler (ADCP), serial sensor, shipboard computer system (SCS) measurements, Conductivity, Temperature, Depth (CTD), and the ship load and cruise track data from aboard the R/V Point Sur for cruise DP06 for an area encompassing roughly 27°N to 28°N and 86.5°W to 90°W. Data was collected July 19-August 1, 2018. The overall purpose of this cruise is to perform deep water sampling of in-situ seawater and associated fauna. CODAS_variables: Variables in this Common Oceanographic Data Analysis System (CODAS) short-form Netcdf file are intended for most end-user scientific analysis and display purposes. For additional information see the CODAS_processing_note global attribute and the attributes of each of the variables.

    ============= =================================================================
    time          Time at the end of the ensemble, days from start of year.
    lon, lat      Longitude, Latitude from Global Positioning System (GPS) at the end of the ensemble.
    u,v           Ocean zonal and meridional velocity component profiles.
    uship, vship  Zonal and meridional velocity components of the ship.
    heading       Mean ship heading during the ensemble.
    depth         Bin centers in nominal meters (no sound speed profile correction).
    tr_temp       ADCP transducer temperature.
    pg            Percent Good pings for u, v averaging after editing.
    pflag         Profile Flags based on editing, used to mask u, v.
    amp           Received signal strength in ADCP-specific units; no correction for spreading or attenuation.
    ============= =================================================================

    cdm_data_type=TrajectoryProfile
    cdm_profile_variables=time
    cdm_trajectory_variables=trajectory, latitude, longitude
    comment=software: pycurrents
    comment1=CODAS_variables: Variables in this CODAS short-form Netcdf file are intended for most end-user scientific analysis and display purposes. For additional information see the CODAS_processing_note global attribute and the attributes of each of the variables.

    ============= =================================================================
    time          Time at the end of the ensemble, days from start of year.
    lon, lat      Longitude, Latitude from GPS at the end of the ensemble.
    u,v           Ocean zonal and meridional velocity component profiles.
    uship, vship  Zonal and meridional velocity components of the ship.
    heading       Mean ship heading during the ensemble.
    depth         Bin centers in nominal meters (no sound speed profile correction).
    tr_temp       ADCP transducer temperature.
    pg            Percent Good pings for u, v averaging after editing.
    pflag         Profile Flags based on editing, used to mask u, v.
    amp           Received signal strength in ADCP-specific units; no correction for spreading or attenuation.
    ============= =================================================================

    comment2=CODAS_processing_note:

    CODAS processing note:

    Overview

    The CODAS database is a specialized storage format designed for shipboard ADCP data. "CODAS processing" uses this format to hold averaged shipboard ADCP velocities and other variables, during the stages of data processing. The CODAS database stores velocity profiles relative to the ship as east and north components along with position, ship speed, heading, and other variables. The netCDF short form contains ocean velocities relative to earth, time, position, transducer temperature, and ship heading; these are designed to be "ready for immediate use". The netCDF long form is just a dump of the entire CODAS database. Some variables are no longer used, and all have names derived from their original CODAS names, dating back to the late 1980's.

    Post-processing

    CODAS post-processing, i.e. that which occurs after the single-ping profiles have been vector-averaged and loaded into the CODAS database, includes editing (using automated algorithms and manual tools), rotation and scaling of the measured velocities, and application of a time-varying heading correction. Additional algorithms developed more recently include translation of the GPS positions to the transducer location, and averaging of ship's speed over the times of valid pings when Percent Good is reduced. Such post-processing is needed prior to submission of "processed ADCP data" to JASADCP or other archives.

    Full CODAS processing

    Whenever single-ping data have been recorded, full CODAS processing provides the best end product.

    Full CODAS processing starts with the single-ping velocities in beam coordinates. Based on the transducer orientation relative to the hull, the beam velocities are transformed to horizontal, vertical, and "error velocity" components. Using a reliable heading (typically from the ship's gyro compass), the velocities in ship coordinates are rotated into earth coordinates.

    Pings are grouped into an "ensemble" (usually 2-5 minutes duration) and undergo a suite of automated editing algorithms (removal of acoustic interference; identification of the bottom; editing based on thresholds; and specialized editing that targets CTD wire interference and "weak, biased profiles"). The ensemble of single-ping velocities is then averaged using an iterative reference layer averaging scheme. Each ensemble is approximated as a single function of depth, with a zero-average over a reference layer plus a reference layer velocity for each ping. Adding the average of the single-ping reference layer velocities to the function of depth yields the ensemble-average velocity profile. These averaged profiles, along with ancillary measurements, are written to disk, and subsequently loaded into the CODAS database. Everything after this stage is "post-processing".

    note (time):

    Time is stored in the database using UTC Year, Month, Day, Hour, Minute, Seconds. Floating point time "Decimal Day" is the floating point interval in days since the start of the year, usually the year of the first day of the cruise.
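The "Decimal Day" convention described above can be converted back to a calendar timestamp with standard-library date arithmetic. This is a minimal sketch under the assumption that decimal day 0.0 corresponds to 1 January 00:00 UTC of the reference year, as the wording "interval in days since the start of the year" suggests; the function name is hypothetical and is not part of the CODAS software.

```python
from datetime import datetime, timedelta, timezone


def decimal_day_to_utc(year: int, dday: float) -> datetime:
    # Assumption: dday counts days elapsed since 00:00 UTC on 1 January of `year`.
    start_of_year = datetime(year, 1, 1, tzinfo=timezone.utc)
    return start_of_year + timedelta(days=dday)


# Illustrative value: decimal day 199.5 in 2018 falls at noon on 19 July 2018.
example = decimal_day_to_utc(2018, 199.5)
```

Because the reference year is usually that of the first day of the cruise, a cruise that crosses a year boundary would carry decimal days above 365 (or 366) instead of wrapping around.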

    note (heading):

    CODAS processing uses heading from a reliable device, and (if available) uses a time-dependent correction by an accurate heading device. The reliable heading device is typically a gyro compass (for example, the Bridge gyro). Accurate heading devices can be POSMV, Seapath, Phins, Hydrins, MAHRS, or various Ashtech devices; this varies with the technology of the time. It is always confusing to keep track of the sign of the heading correction. Headings are written in degrees, positive clockwise. Setting up some variables:

    X = transducer angle (CONFIG1_heading_bias) positive clockwise (beam 3 angle relative to ship)
    G = Reliable heading (gyrocompass)
    A = Accurate heading
    dh = G - A = time-dependent heading correction (ANCIL2_watrk_hd_misalign)

    Rotation of the measured velocities into the correct coordinate system amounts to (u+i*v)*(exp(i*theta)) where theta is the sum of the corrected heading and the transducer angle.

    theta = X + (G - dh) = X + G - dh

    Watertrack and Bottomtrack calibrations give an indication of the residual angle offset to apply, for example if mean and median of the phase are all 0.5 (then R=0.5). Using the "rotate" command, the value of R is added to "ANCIL2_watrk_hd_misalign".

    new_dh = dh + R

    Therefore the total angle used in rotation is

    new_theta = X + G - new_dh = X + G - (dh + R) = (X - R) + (G - dh)

    The new estimate of the transducer angle is: X - R
    ANCIL2_watrk_hd_misalign contains: dh + R
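The heading bookkeeping above can be checked numerically. The sketch below is not the CODAS implementation: the function names and the numeric values are hypothetical, angles are in degrees, and the rotation applies (u+i*v)*exp(i*theta) literally as written in the note, without resolving how the clockwise-positive heading convention maps onto the sign of theta.

```python
import cmath
import math


def rotation_angle(x_deg: float, g_deg: float, dh_deg: float) -> float:
    # theta = X + (G - dh), all angles in degrees.
    return x_deg + (g_deg - dh_deg)


def rotate(u: float, v: float, theta_deg: float) -> tuple:
    # Apply (u + i*v) * exp(i*theta) and return the rotated components.
    w = complex(u, v) * cmath.exp(1j * math.radians(theta_deg))
    return w.real, w.imag


def apply_residual(x_deg: float, dh_deg: float, r_deg: float) -> tuple:
    # Fold a calibration residual R into the stored correction (new_dh = dh + R),
    # which is equivalent to a new transducer angle estimate X - R.
    return x_deg - r_deg, dh_deg + r_deg


# Illustrative values only.
X, G, dh, R = 45.0, 90.0, 1.0, 0.5
new_X, new_dh = apply_residual(X, dh, R)

theta = rotation_angle(X, G, dh)            # angle before calibration
new_theta = rotation_angle(X, G, new_dh)    # X + G - (dh + R)
same_theta = rotation_angle(new_X, G, dh)   # (X - R) + (G - dh)
```

Both `new_theta` and `same_theta` describe the same total rotation, which is the point of the identity in the note: the residual can be read either as a change to the stored heading correction or as a revised transducer angle.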

    ====================================================

    Profile flags

    Profile editing flags are provided for each depth cell:

    binary    decimal    below    Percent
    value     value      bottom   Good       bin
    -------+----------+--------+----------+-------+
    000         0
    001         1                            bad
    010         2                  bad
    011         3                  bad       bad
    100         4         bad
    101         5         bad                bad
    110         6         bad      bad
    111         7         bad      bad       bad
    -------+----------+--------+----------+-------+

    contributor_email=acook1@nova.edu
    contributor_institution=Nova Southeastern University / Halmos College of Natural Sciences and Oceanography
    contributor_name=April Cook
    contributor_phone=+1-954-262-3733
    contributor_role=Project Manager
    contributor_role_vocabulary=https://vocab.nerc.ac.uk/collection/G04/current/
    contributor_url=https://cnso.nova.edu/overview/faculty-staff-profiles/april_cook.html
    Conventions=CF-1.6, ACDD-1.3, IOOS-1.2, COARDS
    Country=USA
    cruise_name=PS19-04_Sutton_ADCP
    date_metadata_modified=2021-01-13T17:55:08Z
    Easternmost_Easting=-87.30255555555556
    featureType=TrajectoryProfile
    geospatial_bounds=Polygon ((25.08759 -91.13281, 31.35551 -91.13281, 31.35551 -81.81641, 25.08759 -81.81641, 25.08759 -91.13281))
    geospatial_bounds_crs=EPSG:4326
    geospatial_bounds_vertical_crs=EPSG:5831
    geospatial_lat_max=30.358338888888888
    geospatial_lat_min=27.422730555555557
    geospatial_lat_resolution=-1.4737620404287115E-4
    geospatial_lat_units=degrees_north
    geospatial_lon_max=-87.30255555555556
    geospatial_lon_min=-89.09468055555556
    geospatial_lon_resolution=1.7407068240401683E-4
    geospatial_lon_units=degrees_east
    geospatial_vertical_max=970.989990234375
    geospatial_vertical_min=26.93000030517578
    geospatial_vertical_positive=down
    geospatial_vertical_resolution=16.0
    geospatial_vertical_units=m
    history=Created: 2018-08-02 14:24:54 UTC
    id=PS19-04_Sutton_ADCP
    infoUrl=https://data.gulfresearchinitiative.org/data/R4.x257.000:0019
    institution=Nova Southeastern University / Halmos College of Natural Sciences and Oceanography
    instrument=ADCP
    instrument_vocabulary=GCMD Science Keywords Version 9.1.5
    keywords_vocabulary=GCMD Science
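The profile flags table in this description reads as a three-bit mask, with the lowest bit marking a bad bin, the middle bit a Percent Good failure, and the highest bit a cell below the bottom. The sketch below decodes a flag value on that reading; the constant and function names are hypothetical, and the bit assignment is inferred from the table rather than taken from the CODAS software.

```python
# Bit values inferred from the profile flags table (assumption, not an official API).
BAD_BIN = 0b001           # decimal 1: bin flagged bad
BAD_PERCENT_GOOD = 0b010  # decimal 2: Percent Good criterion failed
BELOW_BOTTOM = 0b100      # decimal 4: cell lies below the bottom


def decode_pflag(flag: int) -> list:
    # Return the reasons encoded in one profile flag value (0 to 7).
    if not 0 <= flag <= 7:
        raise ValueError("profile flag must be between 0 and 7")
    reasons = []
    if flag & BELOW_BOTTOM:
        reasons.append("below bottom")
    if flag & BAD_PERCENT_GOOD:
        reasons.append("percent good")
    if flag & BAD_BIN:
        reasons.append("bin")
    return reasons


def is_good(flag: int) -> bool:
    # A depth cell passes editing only when no flag bit is set.
    return flag == 0
```

For example, a flag of 5 (binary 101) decodes to "below bottom" and "bin", matching the corresponding row of the table.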

  19. Nominal transport risk value of hazardous materials.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    + more versions
    Cite
    Changxi Ma; Wei Hao; Fuquan Pan; Wang Xiang (2023). Nominal transport risk value of hazardous materials. [Dataset]. http://doi.org/10.1371/journal.pone.0198931.t006
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Changxi Ma; Wei Hao; Fuquan Pan; Wang Xiang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Nominal transport risk value of hazardous materials.

  20. Credit card approvals

    • kaggle.com
    zip
    Updated Aug 28, 2023
    Cite
    Muzammil Rizvi1 (2023). Credit card approvals [Dataset]. https://www.kaggle.com/datasets/muzammilrizvi1/credit-card-apprivals
    Explore at:
    Available download formats: zip (9206 bytes)
    Dataset updated
    Aug 28, 2023
    Authors
    Muzammil Rizvi1
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect confidentiality of the data.

    This dataset is interesting because there is a good mix of attributes -- continuous, nominal with small numbers of values, and nominal with larger numbers of values. There are also a few missing values.
