Our goal was to predict gender, age, and emotion from audio. Labeled audio files were collected from "Mozilla Common Voice" and the "Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS)". Using the R programming language, 20 statistical features were extracted from each audio file, and the labels were then added to form these datasets.
Each dataset contains 20 feature columns and 1 label column. The 20 statistical features were extracted through frequency spectrum analysis using the R programming language. They are:
1) meanfreq - The mean frequency (in kHz) is a pitch measure that assesses the center of the distribution of power across frequencies.
2) sd - The standard deviation of frequency describes the dispersion of the frequency distribution relative to its mean and is calculated as the square root of the variance.
3) median - The median frequency (in kHz) is the middle value of the sorted frequency distribution.
4) Q25 - The first quartile (in kHz), referred to as Q1, is the median of the lower half of the data set. About 25 percent of the values are below Q1, and about 75 percent are above it.
5) Q75 - The third quartile (in kHz), referred to as Q3, is the median of the upper half of the data set. About 75 percent of the values are below Q3, and about 25 percent are above it.
6) IQR - The interquartile range (in kHz) is a measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles, that is, between the upper and lower quartiles.
7) skew - The skewness is the degree of distortion from the normal distribution. It measures the lack of symmetry in the data distribution.
8) kurt - The kurtosis describes how much the tails of the distribution differ from the tails of a normal distribution, which makes it an indicator of outliers in the data.
9) sp.ent - The spectral entropy is a measure of signal irregularity, computed as the entropy of the signal's normalized spectral power.
10) sfm - The spectral flatness or tonality coefficient, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. It is usually expressed in decibels and quantifies how tone-like a sound is, as opposed to noise-like.
11) mode - The mode frequency is the most frequently observed frequency value.
12) centroid - The spectral centroid is a metric used to describe a spectrum in digital signal processing. It indicates where the spectrum's center of mass is located.
13) meanfun - The average fundamental frequency measured across the acoustic signal.
14) minfun - The minimum fundamental frequency measured across the acoustic signal.
15) maxfun - The maximum fundamental frequency measured across the acoustic signal.
16) meandom - The average dominant frequency measured across the acoustic signal.
17) mindom - The minimum dominant frequency measured across the acoustic signal.
18) maxdom - The maximum dominant frequency measured across the acoustic signal.
19) dfrange - The range of the dominant frequency measured across the acoustic signal.
20) modindx - The modulation index, which expresses the degree of frequency modulation as the ratio of the frequency deviation to the frequency of the modulating signal for a pure tone modulation.
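To make a few of the definitions above concrete, here is a minimal sketch in Python with NumPy. It is not the R pipeline that produced these datasets: the function name `spectral_features`, the use of the amplitude spectrum as weights, and the exact formulas for flatness and entropy are assumptions chosen for illustration, so the values will not necessarily match the dataset columns.

```python
import numpy as np


def spectral_features(signal, sample_rate):
    """Return a few frequency-spectrum statistics; frequencies are in kHz.

    Hypothetical helper for illustration only, not the authors' method.
    """
    # Amplitude spectrum of the real-valued signal and its frequency axis
    amplitude = np.abs(np.fft.rfft(signal))
    freqs_khz = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate) / 1000.0

    # Treat the normalized spectrum as a probability distribution over frequency
    weights = amplitude / amplitude.sum()
    cumulative = np.cumsum(weights)

    meanfreq = float(np.sum(freqs_khz * weights))
    sd = float(np.sqrt(np.sum(weights * (freqs_khz - meanfreq) ** 2)))

    def quantile(q):
        # First frequency bin at which the cumulative weight reaches q
        index = min(np.searchsorted(cumulative, q), len(freqs_khz) - 1)
        return float(freqs_khz[index])

    q25, median, q75 = quantile(0.25), quantile(0.50), quantile(0.75)

    # Spectral flatness: geometric mean divided by arithmetic mean (as a ratio, not dB)
    eps = 1e-12
    sfm = float(np.exp(np.mean(np.log(amplitude + eps))) / (np.mean(amplitude) + eps))

    # Spectral entropy: Shannon entropy of the weights, scaled to the range [0, 1]
    sp_ent = float(-np.sum(weights * np.log2(weights + eps)) / np.log2(len(weights)))

    return {
        "meanfreq": meanfreq, "sd": sd, "median": median,
        "Q25": q25, "Q75": q75, "IQR": q75 - q25,
        "sfm": sfm, "sp.ent": sp_ent,
    }


# Synthetic inputs: a pure 1 kHz tone and white noise, one second at 8 kHz
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone_features = spectral_features(np.sin(2 * np.pi * 1000 * t), sample_rate)
noise_features = spectral_features(np.random.default_rng(0).normal(size=sample_rate), sample_rate)
```

For the pure tone, the mean, median, and quartiles all sit at the tone frequency and the flatness is close to zero, while white noise gives a much higher flatness and entropy. That contrast is what makes sfm and sp.ent useful for separating tone-like from noise-like sounds.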
Gender and Age Audio Data Source: https://commonvoice.mozilla.org/en
Emotion Audio Data Source: https://smartlaboratory.org/ravdess/
Descriptive Statistics for final_pnl_ratio:
Count: 16603
Mean: -0.0464
Median: -0.0037
Standard Deviation: 0.1414
Min: -1.2449
Max: 0.1881
25th Percentile (Q1): -0.0208
75th Percentile (Q3): -0.0008
Total trades analyzed: 16603
Trade Range:
99% of trades have a Final PnL Ratio between: -1.0009 and 0.0186 (This means 99% of your PnL ratios fall within this range.)
Risk Analysis:
Number of trades with >30% loss (PnL < -0.30): 699 Probability of a catastrophic loss (>30% loss): 4.21%… See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-XRP-cleaned_params.
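As a rough illustration of how a summary like the one above could be computed, the following Python sketch uses only the standard library. The function name `summarize_pnl`, the sample values, the use of the sample standard deviation, and the choice of the 0.5th and 99.5th percentiles for the 99% range are all assumptions; the dataset page does not say how its figures were derived.

```python
import statistics


def summarize_pnl(pnl_ratios, loss_threshold=-0.30):
    """Summarize final PnL ratios (hypothetical helper, for illustration only)."""
    count = len(pnl_ratios)
    # Quartiles: three cut points splitting the data into four equal parts
    q1, median, q3 = statistics.quantiles(pnl_ratios, n=4)
    # Assumed definition of the 99% range: 0.5th to 99.5th percentile
    cuts = statistics.quantiles(pnl_ratios, n=200)
    # A trade counts as catastrophic when its ratio is below the threshold
    losses = sum(1 for ratio in pnl_ratios if ratio < loss_threshold)
    return {
        "count": count,
        "mean": statistics.fmean(pnl_ratios),
        "median": median,
        "std": statistics.stdev(pnl_ratios),  # sample standard deviation (assumption)
        "min": min(pnl_ratios),
        "max": max(pnl_ratios),
        "q1": q1,
        "q3": q3,
        "range_99": (cuts[0], cuts[-1]),
        "catastrophic_losses": losses,
        "catastrophic_probability": losses / count,
    }


# Made-up PnL ratios, not taken from the dataset
sample = [-0.45, -0.02, -0.01, -0.004, -0.003, 0.0, 0.01, 0.05, -0.35, 0.002]
summary = summarize_pnl(sample)

# The reported probability is the loss count divided by the trade count
reported_probability = 699 / 16603
```

The last line checks the figures quoted above: 699 catastrophic trades out of 16603 is about 4.21%, which matches the stated probability.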
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains non-thermal velocity (NTV) measurements from Hinode/EIS spectral lines during solar flares, organized by flare class and spectral line. It also includes a comprehensive flare catalog linking EIS observations to GOES flare events.
Each row corresponds to an EIS raster or exposure associated with a GOES flare. The catalog includes:
Flare Timing and Classification:
Flare Positioning:
EIS Observation Metadata:
Each NTV CSV file contains the following columns:
Time and Identification:
Lines are named as: element_ionization_wavelength_ntv
Example: 'fe_12_195.12_ntv' = Fe XII 195.12 Å non-thermal velocity
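The naming convention above can be decoded programmatically. The sketch below is one possible way to do it in Python; the helpers `to_roman` and `parse_line_column` are hypothetical names introduced here and are not part of the dataset or of any of its listed packages. It assumes every line column follows the `element_ionization_wavelength_ntv` pattern exactly.

```python
import re

# Value/symbol pairs for Roman numerals, largest first
_ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"), (90, "XC"),
          (50, "L"), (40, "XL"), (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]

# element (letters), ionization stage (integer), wavelength (decimal), fixed suffix
_LINE_PATTERN = re.compile(r"([a-z]+)_(\d+)_(\d+(?:\.\d+)?)_ntv")


def to_roman(number):
    """Convert a positive integer to a Roman numeral, e.g. 12 to 'XII'."""
    parts = []
    for value, symbol in _ROMAN:
        while number >= value:
            parts.append(symbol)
            number -= value
    return "".join(parts)


def parse_line_column(name):
    """Split a column name such as 'fe_12_195.12_ntv' into ion and wavelength."""
    match = _LINE_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a line NTV column: {name!r}")
    element, stage, wavelength = match.groups()
    return {
        "ion": f"{element.capitalize()} {to_roman(int(stage))}",
        "wavelength_angstrom": float(wavelength),
    }


parsed = parse_line_column("fe_12_195.12_ntv")
```

Here `parsed["ion"]` is the spectroscopic label and `parsed["wavelength_angstrom"]` the wavelength in Å. Columns that do not match the pattern, such as time or identification columns, raise a `ValueError` and can be skipped by the caller.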
Citation:
Please cite the associated paper when using this data.
Python packages used for data curation:
SunPy, EISPAC, pandas, Astropy
Descriptive Statistics for final_pnl_ratio:
Count: 102304
Mean: -0.0186
Median: -0.0030
Standard Deviation: 0.0936
Min: -1.5163
Max: 1.7247
25th Percentile (Q1): -0.0148
75th Percentile (Q3): 0.0001
Total trades analyzed: 102304
Trade Range:
99% of trades have a Final PnL Ratio between: -0.6826 and 0.1957 (This means 99% of your PnL ratios fall within this range.)
Risk Analysis:
Number of trades with >50% loss (PnL < -0.50): 793 Probability of a catastrophic loss (>50% loss): 0.78%… See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-crypto-params.
Descriptive Statistics for final_pnl_ratio:
Count: 25131
Mean: -0.0066
Median: -0.0025
Standard Deviation: 0.0617
Min: -1.2345
Max: 1.3500
25th Percentile (Q1): -0.0125
75th Percentile (Q3): 0.0020
Total trades analyzed: 25131
Trade Range:
99% of trades have a Final PnL Ratio between: -0.2718 and 0.2446 (This means 99% of your PnL ratios fall within this range.)
Risk Analysis:
Number of trades with >30% loss (PnL < -0.30): 107 Probability of a catastrophic loss (>30% loss): 0.43%… See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-BNB-cleaned_params.
Descriptive Statistics for final_pnl_ratio:
Count: 21949
Mean: -0.0074
Median: -0.0024
Standard Deviation: 0.0637
Min: -1.2313
Max: 1.7247
25th Percentile (Q1): -0.0129
75th Percentile (Q3): 0.0021
Total trades analyzed: 21949
Trade Range:
99% of trades have a Final PnL Ratio between: -0.2829 and 0.2315 (This means 99% of your PnL ratios fall within this range.)
Risk Analysis:
Number of trades with >30% loss (PnL < -0.30): 95 Probability of a catastrophic loss (>30% loss): 0.43%… See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-ETH-cleaned_params.
Descriptive Statistics for final_pnl_ratio:
Count: 14426
Mean: -0.0410
Median: -0.0043
Standard Deviation: 0.1252
Min: -1.1856
Max: 0.5808
25th Percentile (Q1): -0.0212
75th Percentile (Q3): -0.0009
Total trades analyzed: 14426
Trade Range:
99% of trades have a Final PnL Ratio between: -1.0003 and 0.0288 (This means 99% of your PnL ratios fall within this range.)
Risk Analysis:
Number of trades with >30% loss (PnL < -0.30): 480 Probability of a catastrophic loss (>30% loss): 3.33%… See the full description on the dataset page: https://huggingface.co/datasets/ChavyvAkvar/synthetic-trades-ADA-cleaned_params.