7 datasets found

Gender, Age, and Emotion Detection from Voice
kaggle.com
Updated May 29, 2021
Rohit Zaman (2021). Gender, Age, and Emotion Detection from Voice [Dataset]. https://www.kaggle.com/datasets/rohitzaman/gender-age-and-emotion-detection-from-voice/suggestions
Dataset updated: May 29, 2021
Dataset provided by: Kaggle (http://kaggle.com/)
Authors: Rohit Zaman
Description
Context

Our goal was to predict gender, age, and emotion from audio. We found labeled audio datasets from Mozilla and RAVDESS. Using the R programming language, we extracted 20 statistical features and then added the labels to form these datasets. Audio files were collected from "Mozilla Common Voice" and the "Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)".

Content

The datasets contain 20 feature columns and 1 column denoting the label. The 20 statistical features were extracted through frequency spectrum analysis using the R programming language. They are:
1) meanfreq - The mean frequency (in kHz) is a pitch measure that assesses the center of the distribution of power across frequencies.
2) sd - The standard deviation of frequency describes the dispersion of the frequency distribution around its mean and is calculated as the square root of the variance.
3) median - The median frequency (in kHz) is the middle value of the sorted list of values.
4) Q25 - The first quartile (in kHz), referred to as Q1, is the median of the lower half of the data set. About 25 percent of the values are below Q1 and about 75 percent are above it.
5) Q75 - The third quartile (in kHz), referred to as Q3, is the median of the upper half of the data set. About 75 percent of the values are below Q3 and about 25 percent are above it.
6) IQR - The interquartile range (in kHz) is a measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles, that is, between the upper and lower quartiles.
7) skew - The skewness is the degree of distortion from the normal distribution. It measures the lack of symmetry in the data distribution.
8) kurt - The kurtosis is a statistical measure of how much the tails of a distribution differ from the tails of a normal distribution. It reflects how prone the distribution is to outliers.
9) sp.ent - The spectral entropy is a measure of signal irregularity computed from the normalized spectral power of the signal.
10) sfm - The spectral flatness or tonality coefficient, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. It is usually expressed in decibels and quantifies how noise-like, as opposed to tone-like, a sound is.
11) mode - The mode frequency is the most frequently observed value in a data set.
12) centroid - The spectral centroid is a metric used to describe a spectrum in digital signal processing. It indicates where the spectrum's center of mass is located.
13) meanfun - The average of the fundamental frequency measured across the acoustic signal.
14) minfun - The minimum fundamental frequency measured across the acoustic signal.
15) maxfun - The maximum fundamental frequency measured across the acoustic signal.
16) meandom - The average of the dominant frequency measured across the acoustic signal.
17) mindom - The minimum of the dominant frequency measured across the acoustic signal.
18) maxdom - The maximum of the dominant frequency measured across the acoustic signal.
19) dfrange - The range of the dominant frequency measured across the acoustic signal.
20) modindx - The modulation index, which expresses the degree of frequency modulation numerically as the ratio of the frequency deviation to the frequency of the modulating signal for a pure tone modulation.
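As a rough illustration of how several of the spectrum-based features described above can be derived, here is a minimal Python sketch. It is not the authors' R code: the function name spectral_stats, its two inputs (bin frequencies in kHz and their amplitudes), and the convention of treating normalized amplitudes as weights are all assumptions made for this example.

```python
import math


def spectral_stats(freqs_khz, amps):
    """Summarise a magnitude spectrum (hypothetical helper, not the original R code).

    freqs_khz: bin frequencies in kHz, in ascending order.
    amps: non-negative amplitude of each bin (at least one must be positive).
    """
    total = sum(amps)
    # Assumption: normalised amplitudes act as weights over the frequency bins.
    weights = [a / total for a in amps]

    mean = sum(w * f for w, f in zip(weights, freqs_khz))
    sd = math.sqrt(sum(w * (f - mean) ** 2 for w, f in zip(weights, freqs_khz)))

    def quantile(q):
        # First frequency at which the cumulative weight reaches q.
        cumulative = 0.0
        for w, f in zip(weights, freqs_khz):
            cumulative += w
            if cumulative >= q:
                return f
        return freqs_khz[-1]

    q25, median, q75 = quantile(0.25), quantile(0.50), quantile(0.75)

    # Spectral entropy: Shannon entropy of the weights, scaled to [0, 1].
    entropy = -sum(w * math.log(w) for w in weights if w > 0) / math.log(len(weights))

    # Spectral flatness: geometric mean over arithmetic mean of the amplitudes.
    # Zero amplitudes are skipped here to avoid log(0).
    positive = [a for a in amps if a > 0]
    geometric = math.exp(sum(math.log(a) for a in positive) / len(positive))
    flatness = geometric / (sum(positive) / len(positive))

    # Mode: frequency of the bin with the largest amplitude (first one on ties).
    peak_index = max(range(len(amps)), key=amps.__getitem__)

    return {
        "meanfreq": mean,
        "sd": sd,
        "median": median,
        "Q25": q25,
        "Q75": q75,
        "IQR": q75 - q25,
        "sp.ent": entropy,
        "sfm": flatness,
        "mode": freqs_khz[peak_index],
        "centroid": mean,  # under this convention the centroid equals the weighted mean
    }
```

A perfectly flat spectrum gives an entropy and a flatness of 1 under these definitions, while a single sharp peak pushes both toward 0. The fundamental-frequency and dominant-frequency features (meanfun through modindx) need pitch tracking over time and are not covered by this sketch.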

Acknowledgements

Gender and Age Audio Data Source: https://commonvoice.mozilla.org/en
Emotion Audio Data Source: https://smartlaboratory.org/ravdess/
synthetic-trades-XRP-cleaned_params
huggingface.co
Updated Aug 12, 2025
Habibullah Akbar (2025). synthetic-trades-XRP-cleaned_params [Dataset]. https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-XRP-cleaned_params
Dataset updated: Aug 12, 2025
Authors: Habibullah Akbar
Description
Descriptive Statistics for final_pnl_ratio:

Count: 16603
Mean: -0.0464
Median: -0.0037
Standard Deviation: 0.1414
Min: -1.2449
Max: 0.1881
25th Percentile (Q1): -0.0208
75th Percentile (Q3): -0.0008
Total trades analyzed: 16603

Trade Range:

99% of trades have a Final PnL Ratio between: -1.0009 and 0.0186 (This means 99% of your PnL ratios fall within this range.)

Risk Analysis:

Number of trades with >30% loss (PnL < -0.30): 699
Probability of a catastrophic loss (>30% loss): 4.21%…
See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-XRP-cleaned_params.
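Summary figures of this kind could be produced along the following lines. This is a minimal sketch, not the dataset author's code: the function name summarize_pnl, the use of the sample standard deviation, and the "inclusive" quartile method are assumptions, since the listing does not say how the statistics were computed.

```python
import statistics


def summarize_pnl(pnl_ratios, loss_threshold=-0.30):
    """Descriptive statistics and tail-loss rate for a list of final PnL ratios.

    Hypothetical helper: the quartile method and the sample (n - 1)
    standard deviation are assumptions, not taken from the dataset page.
    """
    count = len(pnl_ratios)
    q1, median, q3 = statistics.quantiles(pnl_ratios, n=4, method="inclusive")
    # A trade counts as catastrophic when its ratio is strictly below the threshold.
    catastrophic = sum(1 for ratio in pnl_ratios if ratio < loss_threshold)
    return {
        "count": count,
        "mean": statistics.fmean(pnl_ratios),
        "median": median,
        "std": statistics.stdev(pnl_ratios),
        "min": min(pnl_ratios),
        "max": max(pnl_ratios),
        "q1": q1,
        "q3": q3,
        "catastrophic_count": catastrophic,
        "catastrophic_probability": catastrophic / count,
    }
```

The reported probability is simply the count of losing trades divided by the total: 699 / 16603 is about 0.0421, which matches the stated 4.21%.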
Hinode/EIS Solar Flare Footpoint Non-Thermal Velocity Dataset (2011–2024)
zenodo.org
bin, csv, txt
Updated Aug 16, 2025
Andy S.H. To (2025). Hinode/EIS Solar Flare Footpoint Non-Thermal Velocity Dataset (2011–2024) [Dataset]. http://doi.org/10.5281/zenodo.15613861
Available download formats: csv, bin, txt
Unique identifier: https://doi.org/10.5281/zenodo.15613861
Dataset updated: Aug 16, 2025
Dataset provided by: Zenodo (http://zenodo.org/)
Authors: Andy S.H. To
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
EIS Flare Catalog and Non-thermal Velocity (NTV) Curves Data

This dataset contains non-thermal velocity (NTV) measurements from Hinode/EIS spectral lines during solar flares, organized by flare class and spectral line. It also includes a comprehensive flare catalog linking EIS observations to GOES flare events.

Files:

eis_flare_catalog_zenodo_release.csv - EIS flare catalog

ntv_curves_normalized_for_zenodo.csv - Normalized NTV curves

ntv_curves_raw_for_zenodo.csv - Raw (non-normalized) NTV curves

Flare Catalog Structure (eis_flare_catalog_zenodo_release.csv)

Each row corresponds to an EIS raster or exposure associated with a GOES flare. The catalog includes:

Flare Timing and Classification:

flare_peak, flare_start, flare_end, peak_time – Flare timing (GOES event list)

subtracted_goes_class, recalibrated_goes_class, uncalibrated_goes_class – Flare classification based on different calibration schemes

granular_flare_category – Simplified flare category (C1, C2–3, C>4, M, X)

Flare Positioning:

flare_hpcx, flare_hpcy – Helioprojective flare coordinates (arcsec)

flare_hpcx_pixel, flare_hpcy_pixel – Flare location in EIS pixel coordinates

EIS Observation Metadata:

study_id_x – EIS study identifier

eis_start_time_x, scan_time_x, column_time – EIS raster timing

time_offset_seconds – Time offset between EIS observation and flare peak

duration_from_peak – Observation duration relative to flare peak

eis_xcen, eis_ycen – Raster center in arcsec

eis_fovx, eis_fovy – Raster field-of-view dimensions (arcsec)

NTV Data Structure:

Each NTV CSV file contains the following columns:

Time and Identification:

bin_center: Time relative to flare peak (seconds)

spectral_line: Name of the spectral line (e.g., 'fe_12_195.12_ntv')

flare_class: Flare category ('C1', 'C2-3', 'C>4', 'M_class', 'X_class')

flare_category: Same as flare_class (for normalized data)

Statistical Measures (Raw Values):

mean: Mean NTV value in the time bin (km/s)

median: Median NTV value in the time bin (km/s)

std: Standard deviation of NTV values in the time bin (km/s)

percentile_25: 25th percentile (Q1) of NTV values (km/s)

percentile_75: 75th percentile (Q3) of NTV values (km/s)

Sample Information:

count: Number of individual measurements in the time bin

unique_flares: Number of unique flares contributing to the time bin

Normalized Values (only in normalized file):

norm_mean: Mean NTV normalized by peak value

norm_median: Median NTV normalized by peak value

norm_p25: 25th percentile normalized by peak value

norm_p75: 75th percentile normalized by peak value

normalization_factor: The peak value used for normalization (km/s)

Usage Notes:

Time bins are 300 seconds wide

Negative times indicate before flare peak, positive times after peak

Only time bins with sufficient data (>10 measurements) are included in plots

Normalization is performed by dividing by the maximum smoothed median value within ±10000s of peak

Spectral lines with insufficient data or known issues have been excluded
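The normalization rule in the usage notes can be sketched as follows. This is only an illustration under stated assumptions: the description does not say which smoothing was applied, so a centered moving average with a hypothetical window of 3 bins stands in for it, and the function names moving_average and normalize_curve are made up for this example.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def normalize_curve(bin_centers, medians, half_window_s=10_000.0, smooth_window=3):
    """Divide a median NTV curve by its peak smoothed value near the flare peak.

    bin_centers: time relative to flare peak, in seconds (one per time bin).
    medians: median NTV per time bin, in km/s.
    Returns (normalized_medians, normalization_factor).
    """
    # Assumption: the smoothing is a plain moving average over neighbouring bins.
    smoothed = moving_average(medians, smooth_window)
    # Only bins within +/- half_window_s of the flare peak may set the factor.
    candidates = [s for t, s in zip(bin_centers, smoothed) if abs(t) <= half_window_s]
    if not candidates:
        raise ValueError("no time bins within the normalization window")
    factor = max(candidates)
    return [m / factor for m in medians], factor
```

Because the factor comes from the smoothed curve while the raw medians are divided by it, normalized values slightly above 1 are possible in this sketch. Whether the published norm_median column behaves the same way is not stated in the description.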

Spectral Line Information:

Lines are named as: element_ionization_wavelength_ntv
Example: 'fe_12_195.12_ntv' = Fe XII 195.12 Å non-thermal velocity
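The naming convention above can be decoded with a small parser. The sketch below is an assumption-laden illustration: the names SpectralLine, parse_line_name and spectroscopic_label are hypothetical, and the regular expression only covers names shaped like the single example given in the description.

```python
import re
from typing import NamedTuple


class SpectralLine(NamedTuple):
    element: str               # e.g. "Fe"
    ion_stage: int             # e.g. 12, written XII in spectroscopic notation
    wavelength_angstrom: float


_ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
          (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

# Assumed pattern: element_ionization_wavelength_ntv, all lower case.
_NAME_PATTERN = re.compile(r"([a-z]+)_(\d+)_(\d+(?:\.\d+)?)_ntv")


def to_roman(number):
    """Convert a positive integer to a Roman numeral."""
    parts = []
    for value, symbol in _ROMAN:
        while number >= value:
            parts.append(symbol)
            number -= value
    return "".join(parts)


def parse_line_name(name):
    """Split a name such as 'fe_12_195.12_ntv' into its components."""
    match = _NAME_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"unrecognised spectral line name: {name!r}")
    element, stage, wavelength = match.groups()
    return SpectralLine(element.capitalize(), int(stage), float(wavelength))


def spectroscopic_label(line):
    """Format a parsed line in the usual 'Fe XII 195.12 Å' style."""
    return f"{line.element} {to_roman(line.ion_stage)} {line.wavelength_angstrom:g} Å"
```

For the documented example, parse_line_name('fe_12_195.12_ntv') yields the element Fe, ionization stage 12 and wavelength 195.12, which spectroscopic_label renders as Fe XII 195.12 Å.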

Citation:
Please cite the associated paper when using this data.

Python packages used for data curation:

SunPy, EISPAC, pandas, Astropy
synthetic-trades-crypto-params
huggingface.co
Updated Aug 12, 2025
Habibullah Akbar (2025). synthetic-trades-crypto-params [Dataset]. https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-crypto-params
Dataset updated: Aug 12, 2025
Authors: Habibullah Akbar
Description
Descriptive Statistics for final_pnl_ratio:

Count: 102304
Mean: -0.0186
Median: -0.0030
Standard Deviation: 0.0936
Min: -1.5163
Max: 1.7247
25th Percentile (Q1): -0.0148
75th Percentile (Q3): 0.0001
Total trades analyzed: 102304

Trade Range:

99% of trades have a Final PnL Ratio between: -0.6826 and 0.1957 (This means 99% of your PnL ratios fall within this range.)
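A range like the one above is typically a central percentile interval. The sketch below assumes it runs from the 0.5th to the 99.5th percentile. The listing does not state which percentiles or interpolation method were actually used, and the function name central_range is hypothetical.

```python
import statistics


def central_range(values, coverage=0.99):
    """Return (low, high) bounds enclosing the central `coverage` share of values.

    Assumption: equal tails, so 99% coverage leaves 0.5% on each side.
    """
    tail = (1.0 - coverage) / 2.0
    # Number of equal-probability slices so that one slice equals one tail.
    slices = round(1.0 / tail)
    cut_points = statistics.quantiles(values, n=slices, method="inclusive")
    # The first cut point is the lower bound, the last one the upper bound.
    return cut_points[0], cut_points[-1]
```

With coverage=0.99 the data are split into 200 slices, so the first and last cut points are the 0.5th and 99.5th percentiles. A strongly negative lower bound next to a small upper bound, as reported here, points to a long left tail of losses.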

Risk Analysis:

Number of trades with >50% loss (PnL < -0.50): 793
Probability of a catastrophic loss (>50% loss): 0.78%…
See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-crypto-params.
synthetic-trades-BNB-cleaned_params
huggingface.co
Updated Aug 12, 2025
Habibullah Akbar (2025). synthetic-trades-BNB-cleaned_params [Dataset]. https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-BNB-cleaned_params
Dataset updated: Aug 12, 2025
Authors: Habibullah Akbar
Description
Descriptive Statistics for final_pnl_ratio:

Count: 25131
Mean: -0.0066
Median: -0.0025
Standard Deviation: 0.0617
Min: -1.2345
Max: 1.3500
25th Percentile (Q1): -0.0125
75th Percentile (Q3): 0.0020
Total trades analyzed: 25131

Trade Range:

99% of trades have a Final PnL Ratio between: -0.2718 and 0.2446 (This means 99% of your PnL ratios fall within this range.)

Risk Analysis:

Number of trades with >30% loss (PnL < -0.30): 107
Probability of a catastrophic loss (>30% loss): 0.43%…
See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-BNB-cleaned_params.
synthetic-trades-ETH-cleaned_params
huggingface.co
Updated Aug 12, 2025
Habibullah Akbar (2025). synthetic-trades-ETH-cleaned_params [Dataset]. https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-ETH-cleaned_params
Dataset updated: Aug 12, 2025
Authors: Habibullah Akbar
Description
Descriptive Statistics for final_pnl_ratio:

Count: 21949
Mean: -0.0074
Median: -0.0024
Standard Deviation: 0.0637
Min: -1.2313
Max: 1.7247
25th Percentile (Q1): -0.0129
75th Percentile (Q3): 0.0021
Total trades analyzed: 21949

Trade Range:

99% of trades have a Final PnL Ratio between: -0.2829 and 0.2315 (This means 99% of your PnL ratios fall within this range.)

Risk Analysis:

Number of trades with >30% loss (PnL < -0.30): 95
Probability of a catastrophic loss (>30% loss): 0.43%…
See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-ETH-cleaned_params.
synthetic-trades-ADA-cleaned_params
huggingface.co
Updated Aug 12, 2025
Habibullah Akbar (2025). synthetic-trades-ADA-cleaned_params [Dataset]. https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-ADA-cleaned_params
Dataset updated: Aug 12, 2025
Authors: Habibullah Akbar
Description
Descriptive Statistics for final_pnl_ratio:

Count: 14426
Mean: -0.0410
Median: -0.0043
Standard Deviation: 0.1252
Min: -1.1856
Max: 0.5808
25th Percentile (Q1): -0.0212
75th Percentile (Q3): -0.0009
Total trades analyzed: 14426

Trade Range:

99% of trades have a Final PnL Ratio between: -1.0003 and 0.0288 (This means 99% of your PnL ratios fall within this range.)

Risk Analysis:

Number of trades with >30% loss (PnL < -0.30): 480
Probability of a catastrophic loss (>30% loss): 3.33%…
See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-ADA-cleaned_params.
Gender, Age, and Emotion Detection from Voice

Extracted statistical features from audios and added labels to form the datasets

35 scholarly articles cite this dataset (View in Google Scholar)
