2 datasets found
  1. Detecting Heavy Drinking Data Set

    • kaggle.com
    zip
    Updated Aug 6, 2021
    Cite
    ABDUL AMEER (2021). Detecting Heavy Drinking Data Set [Dataset]. https://www.kaggle.com/ameerkings/detecting-heavy-drinking-data-set
    Download format: zip (164377445 bytes)
    Dataset updated: Aug 6, 2021
    Authors: ABDUL AMEER
    Description

    Data Set Information:

    Relevant Information: All data is fully anonymized.

    Data was originally collected from 19 participants, but the TAC readings of 6 participants were deemed unusable by SCRAM [1]. The data included is from the remaining 13 participants.

    Accelerometer data was collected from smartphones at a sampling rate of 40 Hz (file: all_accelerometer_data_pids_13.csv). The file contains 5 columns: a timestamp, a participant ID, and one sample from each of the three accelerometer axes. Data was collected from a mix of 11 iPhones and 2 Android phones, as noted in phone_types.csv.

    TAC data was collected at 30-minute intervals using SCRAM [2] ankle bracelets. The raw TAC readings are in the raw_tac directory. TAC readings that are more readily usable for processing are in the clean_tac directory and have two columns: a timestamp and a TAC reading. The cleaned TAC readings (1) were processed with a zero-phase low-pass filter to smooth noise without shifting phase, and (2) were shifted backwards by 45 minutes so the labels more closely match the true intoxication of the participant (since alcohol takes about 45 minutes to exit through the skin). Please see the referenced study for more details on how the data was processed ([Web Link]).
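    The two cleaning steps described above can be illustrated with a small sketch. The description does not say which low-pass filter was used, so the first-order exponential filter here, its `alpha` value, and the function names are all assumptions chosen for illustration. The only points being shown are that running a causal filter forward and then backward cancels the phase lag, and that the timestamps move 45 minutes earlier.

```python
from typing import List, Tuple

SHIFT_SECONDS = 45 * 60  # 45-minute backward shift from the description


def ema(values: List[float], alpha: float) -> List[float]:
    """Causal first-order low-pass filter (hypothetical stand-in)."""
    out: List[float] = []
    prev = values[0]
    for v in values:
        prev = alpha * v + (1.0 - alpha) * prev
        out.append(prev)
    return out


def zero_phase_lowpass(values: List[float], alpha: float = 0.5) -> List[float]:
    """Filter forward, then backward, so the two phase lags cancel."""
    if not values:
        return []
    forward = ema(values, alpha)
    backward = ema(forward[::-1], alpha)
    return backward[::-1]


def clean_tac(
    readings: List[Tuple[int, float]], alpha: float = 0.5
) -> List[Tuple[int, float]]:
    """Smooth the TAC values and shift each timestamp 45 minutes earlier."""
    timestamps = [t for t, _ in readings]
    smoothed = zero_phase_lowpass([v for _, v in readings], alpha)
    return [(t - SHIFT_SECONDS, v) for t, v in zip(timestamps, smoothed)]


# Synthetic readings at 30-minute intervals (unix seconds), not real data.
raw = [(1_000_000 + i * 1800, v) for i, v in enumerate([0.0, 0.0, 0.1, 0.0, 0.0])]
cleaned = clean_tac(raw)
```

    In practice a library routine for forward-backward filtering would likely replace the hand-written filter; the plain-Python version is used here only to keep the sketch self-contained.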

    1 - [Web Link]
    2 - J. Robert Zettl. The determination of blood alcohol concentration by transdermal measurement. [Web Link], 2002.

    Number of Instances:
    - Accelerometer readings: 14,057,567
    - TAC readings: 715
    - Participants: 13

    Number of Attributes:
    - Time series: 3 axes of accelerometer data (columns x, y, z in all_accelerometer_data_pids_13.csv)
    - Static: 1 phone-type feature (in phone_types.csv)
    - Target: 1 time series of TAC for each of the 13 participants (in clean_tac directory)

    For Each Attribute:

    (Main)
    all_accelerometer_data_pids_13.csv:
    - time: integer, unix timestamp, milliseconds
    - pid: symbolic, 13 categories listed in pids.txt
    - x: continuous, time-series
    - y: continuous, time-series
    - z: continuous, time-series
    clean_tac/*.csv:
    - timestamp: integer, unix timestamp, seconds
    - TAC_Reading: continuous, time-series
    phone_types.csv:
    - pid: symbolic, 13 categories listed in pids.txt
    - phonetype: symbolic, 2 categories (iPhone, Android)

    (Other)
    raw_tac/*.xlsx:
    - TAC Level: continuous, time-series
    - IR Voltage: continuous, time-series
    - Temperature: continuous, time-series
    - Time: datetime
    - Date: datetime
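    Because the accelerometer timestamps are in milliseconds and the clean TAC timestamps are in seconds, the two series need a unit conversion before they can be joined. The sketch below uses the column names listed above, but the rows, the `P1` participant ID, and the helper names are invented for illustration. It also assumes the TAC rows are sorted by time and that each accelerometer sample takes the most recent TAC reading as its label, which is one possible alignment rule rather than the one the study necessarily used.

```python
import bisect
import csv
import io

# Synthetic rows that follow the documented column layout (not real data).
ACCEL_CSV = """time,pid,x,y,z
1493733882000,P1,0.01,-0.02,0.98
1493735700500,P1,0.10,0.05,0.90
1493737500000,P1,-0.03,0.00,1.01
"""
TAC_CSV = """timestamp,TAC_Reading
1493733600,0.010
1493735400,0.085
1493737200,0.120
"""


def load_tac(text):
    """Return parallel lists of TAC timestamps (seconds) and readings."""
    rows = list(csv.DictReader(io.StringIO(text)))
    times = [int(r["timestamp"]) for r in rows]
    values = [float(r["TAC_Reading"]) for r in rows]
    return times, values


def tac_at(times, values, t_seconds):
    """Most recent TAC reading at or before t_seconds, or None if none exists."""
    i = bisect.bisect_right(times, t_seconds) - 1
    return values[i] if i >= 0 else None


def label_accelerometer(accel_text, tac_times, tac_values):
    """Attach a TAC label to each accelerometer row."""
    labeled = []
    for row in csv.DictReader(io.StringIO(accel_text)):
        t_seconds = int(row["time"]) / 1000.0  # milliseconds -> seconds
        label = tac_at(tac_times, tac_values, t_seconds)
        labeled.append((row["pid"], t_seconds, label))
    return labeled


tac_times, tac_values = load_tac(TAC_CSV)
labeled = label_accelerometer(ACCEL_CSV, tac_times, tac_values)
```

    With the real files, the same lookup would be applied per participant, since the description indicates one TAC time series for each of the 13 participants in the clean_tac directory.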

    Missing Attribute Values: None

    Target Distribution: TAC is measured in g/dl, where 0.08 is the legal limit for intoxication while driving.
    - Mean TAC: 0.065 +/- 0.182
    - Max TAC: 0.443
    - TAC Inner Quartiles: 0.002, 0.029, 0.092
    - Mean Time-to-last-drink: 16.1 +/- 6.9 hrs


  2. Bar Crawl: Detecting Heavy Drinking Data Set

    • kaggle.com
    Updated Mar 25, 2020
    Cite
    Gaafa (2020). Bar Crawl: Detecting Heavy Drinking Data Set [Dataset]. https://www.kaggle.com/nautiyalamit/bar-crawl-detecting-heavy-drinking-data-set/code
    Dataset updated: Mar 25, 2020
    Dataset provided by: Kaggle (http://kaggle.com/)
    Authors: Gaafa
    Description


    Relevant Papers:

    Past Usage:

    (a) Complete reference of article where it was described/used: Killian, J.A., Passino, K.M., Nandi, A., Madden, D.R. and Clapp, J., Learning to Detect Heavy Drinking Episodes Using Smartphone Accelerometer Data. In Proceedings of the 4th International Workshop on Knowledge Discovery in Healthcare Data co-located with the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019) (pp. 35-42). [Web Link]

    (b) Indication of what attribute(s) were being predicted:
    - Features: Three-axis time series accelerometer data
    - Target: Time series transdermal alcohol content (TAC) data (real-time measure of intoxication)

    (c) Indication of study's results: The study decomposed each time series into 10-second windows and performed binary classification to predict whether each window corresponded to an intoxicated participant (TAC >= 0.08) or a sober participant (TAC < 0.08). The study tested several models and achieved a test accuracy of 77.5% with a random forest.
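    The windowing and labeling scheme summarized above can be sketched as follows. The 10-second window length and the 0.08 threshold come from the text. Everything else is an assumption made for illustration: windows are aligned to multiples of 10 seconds, the two magnitude features are placeholders rather than the features used in the study, and no classifier is trained here.

```python
import math
from statistics import mean, pstdev

WINDOW_MS = 10_000      # 10-second windows, as described in the study summary
TAC_THRESHOLD = 0.08    # intoxicated if TAC >= 0.08


def window_samples(samples):
    """Group (time_ms, x, y, z) samples into consecutive 10-second windows."""
    windows = {}
    for t_ms, x, y, z in samples:
        windows.setdefault(t_ms // WINDOW_MS, []).append((x, y, z))
    return [windows[key] for key in sorted(windows)]


def features(window):
    """Hypothetical features: mean and std of the acceleration magnitude."""
    mags = [math.sqrt(x * x + y * y + z * z) for x, y, z in window]
    return mean(mags), pstdev(mags)


def is_intoxicated(tac):
    """Binary label for a window, given the TAC value assigned to it."""
    return tac >= TAC_THRESHOLD


# Synthetic stream: 25 seconds at 40 Hz (one sample every 25 ms), not real data.
samples = [(i * 25, 0.0, 0.0, 1.0) for i in range(1000)]
windows = window_samples(samples)
window_sizes = [len(w) for w in windows]
```

    At the stated 40 Hz sampling rate a full window holds 400 samples, so the trailing window in this synthetic stream is only half full; whether to keep or drop such partial windows is a design choice the summary does not address.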

    Citation Request:

    When using this dataset, please cite: Killian, J.A., Passino, K.M., Nandi, A., Madden, D.R. and Clapp, J., Learning to Detect Heavy Drinking Episodes Using Smartphone Accelerometer Data. In Proceedings of the 4th International Workshop on Knowledge Discovery in Healthcare Data co-located with the 28th International Joint Conference on Artificial Intelligence (IJCAI 2019) (pp. 35-42). [Web Link]


Detecting Heavy Drinking Data Set

Accelerometer and transdermal alcohol content data from a college bar crawl.

182 scholarly articles cite this dataset (View in Google Scholar)