13 datasets found

A Replication Dataset for Fundamental Frequency Estimation
zenodo.org
live.european-language-grid.eu
+1more
bin
Updated Apr 24, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bastian Bechtold; Bastian Bechtold (2025). A Replication Dataset for Fundamental Frequency Estimation [Dataset]. http://doi.org/10.5281/zenodo.3904389
Explore at:
binAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.3904389
Dataset updated
Apr 24, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Bastian Bechtold; Bastian Bechtold
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Part of the dissertation Pitch of Voiced Speech in the Short-Time Fourier Transform: Algorithms, Ground Truths, and Evaluation Methods.
© 2020, Bastian Bechtold. All rights reserved.

Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimatimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains estimated fundamental frequency tracks of 25 algorithms, six speech corpora, two noise corpora, at nine signal-to-noise ratios between -20 and 20 dB SNR, as well as an additional evaluation of synthetic harmonic tone complexes in white noise.

The dataset also contains pre-calculated performance measures both novel and traditional, in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, or to replicate existing studies from a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data is available to download, and entirely reproducible, albeit requiring about one year of processor-time.

Included Code and Data

ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora:

CMU-ARCTIC (consensus truth) [1]

FDA (corpus truth and consensus truth) [2]

KEELE (corpus truth and consensus truth) [3]

MOCHA-TIMIT (consensus truth) [4]

PTDB-TUG (corpus truth and consensus truth) [5]

TIMIT (consensus truth) [6]

noisy speech data.zip is a JBOF datasets of fundamental frequency estimates of speech files mixed with noise from the following corpora:

NOISEX [7]

QUT-NOISE [8]

synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise.

noisy_speech.pkl and synthetic_speech.pkl are pickled Pandas dataframes of performance metrics derived from the above data for the following list of fundamental frequency estimation algorithms:

AUTOC [9]

AMDF [10]

BANA [11]

CEP [12]

CREPE [13]

DIO [14]

DNN [15]

KALDI [16]

MAPS

MBSC [17]

NLS [18]

PEFAC [19]

PRAAT [20]

RAPT [21]

SACC [22]

SAFE [23]

SHR [24]

SIFT [25]

SRH [26]

STRAIGHT [27]

SWIPE [28]

YAAPT [29]

YIN [30]

noisy speech evaluation.py and synthetic speech evaluation.py are Python programs to calculate the above Pandas dataframes from the above JBOF datasets. They calculate the following performance measures:

Gross Pitch Error (GPE), the percentage of pitches where the estimated pitch deviates from the true pitch by more than 20%.

Fine Pitch Error (FPE), the mean error of grossly correct estimates.

High/Low Octave Pitch Error (OPE), the percentage pitches that are GPEs and happens to be at an integer multiple of the true pitch.

Gross Remaining Error (GRE), the percentage of pitches that are GPEs but not OPEs.

Fine Remaining Bias (FRB), the median error of GREs.

True Positive Rate (TPR), the percentage of true positive voicing estimates.

False Positive Rate (FPR), the percentage of false positive voicing estimates.

False Negative Rate (FNR), the percentage of false negative voicing estimates.

F₁, the harmonic mean of precision and recall of the voicing decision.

Pipfile is a pipenv-compatible pipfile for installing all prerequisites necessary for running the above Python programs.

The Python programs take about an hour to compute on a fast 2019 computer, and require at least 32 Gb of memory.

References:

John Kominek and Alan W Black. CMU ARCTIC database for speech synthesis, 2003.

Paul C Bagshaw, Steven Hiller, and Mervyn A Jack. Enhanced Pitch Tracking and the Processing of F0 Contours for Computer Aided Intonation Teaching. In EUROSPEECH, 1993.

F Plante, Georg F Meyer, and William A Ainsworth. A Pitch Extraction Reference Database. In Fourth European Conference on Speech Communication and Technology, pages 837–840, Madrid, Spain, 1995.

Alan Wrench. MOCHA MultiCHannel Articulatory database: English, November 1999.

Gregor Pirker, Michael Wohlmayr, Stefan Petrik, and Franz Pernkopf. A Pitch Tracking Corpus with Evaluation on Multipitch Tracking Scenario. page 4, 2011.

John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, and Victor Zue. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 1993.

Andrew Varga and Herman J.M. Steeneken. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recog- nition systems. Speech Communication, 12(3):247–251, July 1993.

David B. Dean, Sridha Sridharan, Robert J. Vogt, and Michael W. Mason. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. Proceedings of Interspeech 2010, 2010.

Man Mohan Sondhi. New methods of pitch extraction. Audio and Electroacoustics, IEEE Transactions on, 16(2):262—266, 1968.

Myron J. Ross, Harry L. Shaffer, Asaf Cohen, Richard Freudberg, and Harold J. Manley. Average magnitude difference function pitch extractor. Acoustics, Speech and Signal Processing, IEEE Transactions on, 22(5):353—362, 1974.

Na Yang, He Ba, Weiyang Cai, Ilker Demirkol, and Wendi Heinzelman. BaNa: A Noise Resilient Fundamental Frequency Detection Algorithm for Speech and Music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):1833–1848, December 2014.

Michael Noll. Cepstrum Pitch Determination. The Journal of the Acoustical Society of America, 41(2):293–309, 1967.

Jong Wook Kim, Justin Salamon, Peter Li, and Juan Pablo Bello. CREPE: A Convolutional Representation for Pitch Estimation. arXiv:1802.06182 [cs, eess, stat], February 2018. arXiv: 1802.06182.

Masanori Morise, Fumiya Yokomori, and Kenji Ozawa. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications. IEICE Transactions on Information and Systems, E99.D(7):1877–1884, 2016.

Kun Han and DeLiang Wang. Neural Network Based Pitch Tracking in Very Noisy Speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):2158–2168, Decem- ber 2014.

Pegah Ghahremani, Bagher BabaAli, Daniel Povey, Korbinian Riedhammer, Jan Trmal, and Sanjeev Khudanpur. A pitch extraction algorithm tuned for automatic speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pages 2494–2498. IEEE, 2014.

Lee Ngee Tan and Abeer Alwan. Multi-band summary correlogram-based pitch detection for noisy speech. Speech Communication, 55(7-8):841–856, September 2013.

Jesper Kjær Nielsen, Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen, and Søren Holdt Jensen. Fast fundamental frequency estimation: Making a statistically
US Means of Transportation to Work Census Data
kaggle.com
Updated Feb 23, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sagar G (2022). US Means of Transportation to Work Census Data [Dataset]. https://www.kaggle.com/goswamisagard/american-census-survey-b08301-cleaned-csv-data/discussion
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 23, 2022
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Sagar G
Area covered
United States
Description

US Census Bureau conducts American Census Survey 1 and 5 Yr surveys that record various demographics and provide public access through APIs. I have attempted to call the APIs through the python environment using the requests library, Clean, and organize the data in a usable format.

Data Ingestion and Cleaning:

ACS Subject data [2011-2019] was accessed using Python by following the below API Link: https://api.census.gov/data/2011/acs/acs1?get=group(B08301)&for=county:* The data was obtained in JSON format by calling the above API, then imported as Python Pandas Dataframe. The 84 variables returned have 21 Estimate values for various metrics, 21 pairs of respective Margin of Error, and respective Annotation values for Estimate and Margin of Error Values. This data was then undergone through various cleaning processes using Python, where excess variables were removed, and the column names were renamed. Web-Scraping was carried out to extract the variables' names and replace the codes in the column names in raw data.

The above step was carried out for multiple ACS/ACS-1 datasets spanning 2011-2019 and then merged into a single Python Pandas Dataframe. The columns were rearranged, and the "NAME" column was split into two columns, namely 'StateName' and 'CountyName.' The counties for which no data was available were also removed from the Dataframe. Once the Dataframe was ready, it was separated into two new dataframes for separating State and County Data and exported into '.csv' format

Data Source:

More information about the source of Data can be found at the URL below: US Census Bureau. (n.d.). About: Census Bureau API. Retrieved from Census.gov https://www.census.gov/data/developers/about.html

Final Word:

I hope this data helps you to create something beautiful, and awesome. I will be posting a lot more databases shortly, if I get more time from assignments, submissions, and Semester Projects 🧙🏼‍♂️. Good Luck.
f
Enhancing UNCDF Operations: Power BI Dashboard Development and Data Mapping
figshare.com
Updated Jan 6, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Maryam Binti Haji Abdul Halim (2025). Enhancing UNCDF Operations: Power BI Dashboard Development and Data Mapping [Dataset]. http://doi.org/10.6084/m9.figshare.28147451.v1
Explore at:
Unique identifier
https://doi.org/10.6084/m9.figshare.28147451.v1
Dataset updated
Jan 6, 2025
Dataset provided by
figshare
Authors
Maryam Binti Haji Abdul Halim
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This project focuses on data mapping, integration, and analysis to support the development and enhancement of six UNCDF operational applications: OrgTraveler, Comms Central, Internal Support Hub, Partnership 360, SmartHR, and TimeTrack. These apps streamline workflows for travel claims, internal support, partnership management, and time tracking within UNCDF.Key Features and Tools:Data Mapping for Salesforce CRM Migration: Structured and mapped data flows to ensure compatibility and seamless migration to Salesforce CRM.Python for Data Cleaning and Transformation: Utilized pandas, numpy, and APIs to clean, preprocess, and transform raw datasets into standardized formats.Power BI Dashboards: Designed interactive dashboards to visualize workflows and monitor performance metrics for decision-making.Collaboration Across Platforms: Integrated Google Collab for code collaboration and Microsoft Excel for data validation and analysis.
Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic...
zenodo.org
data.niaid.nih.gov
bin, csv, zip
Updated Dec 24, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa (2022). Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials [Dataset]. http://doi.org/10.5281/zenodo.6965147
Explore at:
bin, zip, csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.6965147
Dataset updated
Dec 24, 2022
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials

Background

This dataset contains data from monotonic and cyclic loading experiments on structural metallic materials. The materials are primarily structural steels and one iron-based shape memory alloy is also included. Summary files are included that provide an overview of the database and data from the individual experiments is also included.

The files included in the database are outlined below and the format of the files is briefly described. Additional information regarding the formatting can be found through the post-processing library (https://github.com/ahartloper/rlmtp/tree/master/protocols).

Usage

The data is licensed through the Creative Commons Attribution 4.0 International.

If you have used our data and are publishing your work, we ask that you please reference both:

this database through its DOI, and

any publication that is associated with the experiments. See the Overall_Summary and Database_References files for the associated publication references.

Included Files

Overall_Summary_2022-08-25_v1-0-0.csv: summarises the specimen information for all experiments in the database.

Summarized_Mechanical_Props_Campaign_2022-08-25_v1-0-0.csv: summarises the average initial yield stress and average initial elastic modulus per campaign.

Unreduced_Data-#_v1-0-0.zip: contain the original (not downsampled) data

Where # is one of: 1, 2, 3, 4, 5, 6. The unreduced data is broken into separate archives because of upload limitations to Zenodo. Together they provide all the experimental data.

We recommend you un-zip all the folders and place them in one "Unreduced_Data" directory similar to the "Clean_Data"

The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.

There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the unreduced data.

The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.

Clean_Data_v1-0-0.zip: contains all the downsampled data

The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.

There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the clean data.

The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.

Database_References_v1-0-0.bib

Contains a bibtex reference for many of the experiments in the database. Corresponds to the "citekey" entry in the summary files.

File Format: Downsampled Data

These are the "LP_

The header of the first column is empty: the first column corresponds to the index of the sample point in the original (unreduced) data

Time[s]: time in seconds since the start of the test

e_true: true strain

Sigma_true: true stress in MPa

(optional) Temperature[C]: the surface temperature in degC

These data files can be easily loaded using the pandas library in Python through:

import pandas data = pandas.read_csv(data_file, index_col=0)

The data is formatted so it can be used directly in RESSPyLab (https://github.com/AlbanoCastroSousa/RESSPyLab). Note that the column names "e_true" and "Sigma_true" were kept for backwards compatibility reasons with RESSPyLab.

File Format: Unreduced Data

These are the "LP_

The first column is the index of each data point

S/No: sample number recorded by the DAQ

System Date: Date and time of sample

Time[s]: time in seconds since the start of the test

C_1_Force[kN]: load cell force

C_1_Déform1[mm]: extensometer displacement

C_1_Déplacement[mm]: cross-head displacement

Eng_Stress[MPa]: engineering stress

Eng_Strain[]: engineering strain

e_true: true strain

Sigma_true: true stress in MPa

(optional) Temperature[C]: specimen surface temperature in degC

The data can be loaded and used similarly to the downsampled data.

File Format: Overall_Summary

The overall summary file provides data on all the test specimens in the database. The columns include:

hidden_index: internal reference ID

grade: material grade

spec: specifications for the material

source: base material for the test specimen

id: internal name for the specimen

lp: load protocol

size: type of specimen (M8, M12, M20)

gage_length_mm_: unreduced section length in mm

avg_reduced_dia_mm_: average measured diameter for the reduced section in mm

avg_fractured_dia_top_mm_: average measured diameter of the top fracture surface in mm

avg_fractured_dia_bot_mm_: average measured diameter of the bottom fracture surface in mm

fy_n_mpa_: nominal yield stress

fu_n_mpa_: nominal ultimate stress

t_a_deg_c_: ambient temperature in degC

date: date of test

investigator: person(s) who conducted the test

location: laboratory where test was conducted

machine: setup used to conduct test

pid_force_k_p, pid_force_t_i, pid_force_t_d: PID parameters for force control

pid_disp_k_p, pid_disp_t_i, pid_disp_t_d: PID parameters for displacement control

pid_extenso_k_p, pid_extenso_t_i, pid_extenso_t_d: PID parameters for extensometer control

citekey: reference corresponding to the Database_References.bib file

yield_stress_mpa_: computed yield stress in MPa

elastic_modulus_mpa_: computed elastic modulus in MPa

fracture_strain: computed average true strain across the fracture surface

c,si,mn,p,s,n,cu,mo,ni,cr,v,nb,ti,al,b,zr,sn,ca,h,fe: chemical compositions in units of %mass

file: file name of corresponding clean (downsampled) stress-strain data

File Format: Summarized_Mechanical_Props_Campaign

Meant to be loaded in Python as a pandas DataFrame with multi-indexing, e.g.,

tab1 = pd.read_csv('Summarized_Mechanical_Props_Campaign_' + date + version + '.csv', index_col=[0, 1, 2, 3], skipinitialspace=True, header=[0, 1], keep_default_na=False, na_values='')

citekey: reference in "Campaign_References.bib".

Grade: material grade.

Spec.: specifications (e.g., J2+N).

Yield Stress [MPa]: initial yield stress in MPa

size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign

Elastic Modulus [MPa]: initial elastic modulus in MPa

size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign

Caveats

The files in the following directories were tested before the protocol was established. Therefore, only the true stress-strain is available for each:

A500

A992_Gr50

BCP325

BCR295

HYP400

S460NL

S690QL/25mm

S355J2_Plates/S355J2_N_25mm and S355J2_N_50mm
Maricopa County Assessor "Fast Food" Search Query
kaggle.com
Updated Sep 21, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FoxbatCS (2021). Maricopa County Assessor "Fast Food" Search Query [Dataset]. https://www.kaggle.com/foxbatcs/maricopa-county-assessor-fast-food-search-query/code
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 21, 2021
Dataset provided by
Kagglehttp://kaggle.com/
Authors
FoxbatCS
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Maricopa County
Description
SOURCE

This data was obtained from the Maricopa County Assessor under the search "Fast Food". The query has approximately 1342 results, with only 1000 returned due MCA Data Policies.

DATA CLEANING

Due to some Subdivision Name values posessing unescaped commas that interfered with Pandas' ability to properly align the columns, some manual cleaning in Libre Office was performed by me.

Aside from a handful of Null values, the data is fairly clean and requires little from Pandas.

NULL VALUES

Here are the sums and percentage of NULLS in the dataframe. Interestingly, there are 17 NULLS that do not have any physical addresses. This amounts to 1.7% of values for the Address, City, and Zip, and are all corresponding rows for those missing values.

I have looked into a couple of these on the Maricopa County Assessor's GIS Portal, and they do not appear to have any assigned physical addresses. This is a good avenue of exploration for EDA. Possibly an error that could be corrected, or some obscure legal reason, but interesting nonetheless.

Additionally, there are 391 NULLS in Subdivision Name accounting for 39.1%. This is a feature that I am interested in exploring to determine if there are any predominant groups. It could also generate a list of Entities that can be searched later to see if the dataset can be enriched beyond it's initial 1,000 record limit.

There are 348 NULLS in the MCR column. This is the definition according to the MCA Glossary

MCR (MARICOPA COUNTY RECORDER NUMBER) Often associated with recorded plat maps.

This seems to be an uninteresting nominal value, so I will drop this columns.

While Property Type and Rental have no NULLS, 100% of those values are Fast Food Restaurant and N (for No), and therefore offer no useful information, and will be dropped.

I will leave the S/T/R column, although it also seems to be uninteresting nominal values, I am curious if there are predominent groups, and since it also has no NULLS, might be useful for further data enrichment.
s
Data from: Nairobi Motorcycle Transit Comparison Dataset: Fuel vs. Electric...
scholardata.sun.ac.za
data.mendeley.com
Updated Mar 8, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Martin Kitetu; Alois Mbutura; Halloran Stratford; MJ Booysen (2025). Nairobi Motorcycle Transit Comparison Dataset: Fuel vs. Electric Vehicle Performance Tracking (2023) [Dataset]. http://doi.org/10.25413/sun.28554200.v1
Explore at:
Unique identifier
https://doi.org/10.25413/sun.28554200.v1
Dataset updated
Mar 8, 2025
Dataset provided by
SUNScholarData
Authors
Martin Kitetu; Alois Mbutura; Halloran Stratford; MJ Booysen
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Nairobi
Description
This dataset contains GPS tracking data and performance metrics for motorcycle taxis (boda bodas) in Nairobi, Kenya, comparing traditional internal combustion engine (ICE) motorcycles with electric motorcycles. The study was conducted in two phases:Baseline Phase: 118 ICE motorcycles tracked over 14 days (2023-11-13 to 2023-11-26)Transition Phase: 108 ICE motorcycles (control) and 9 electric motorcycles (treatment) tracked over 12 days (2023-12-10 to 2023-12-21)The dataset is organised into two main categories:Trip Data: Individual trip-level records containing timing, distance, duration, location, and speed metricsDaily Data: Daily aggregated summaries containing usage metrics, economic data, and energy consumptionThis dataset enables comparative analysis of electric vs. ICE motorcycle performance, economic modelling of transportation costs, environmental impact assessment, urban mobility pattern analysis, and energy efficiency studies in emerging markets.Institutions:EED AdvisoryClean Air TaskforceStellenbosch UniversitySteps to reproduce:Raw Data CollectionGPS tracking devices installed on motorcycles, collecting location data at 10-second intervalsRider-reported information on revenue, maintenance costs, and fuel/electricity usageProcessing StepsGPS data cleaning: Filtered invalid coordinates, removed duplicates, interpolated missing pointsTrip identification: Defined by >1 minute stationary periods or ignition cyclesTrip metrics calculation: Distance, duration, idle time, average/max speedsDaily data aggregation: Summed by user_id and date with self-reported economic dataValidation: Cross-checked with rider logs and known routesAnonymisation: Removed start and end coordinates for first and last trips of each day to protect rider privacy and home locationsTechnical InformationGeographic coverage: Nairobi, KenyaTime period: November-December 2023Time zone: UTC+3 (East Africa Time)Currency: Kenyan Shillings (KES)Data format: CSV filesSoftware used: Python 3.8 (pandas, numpy, geopy)Notes: Some location data points are intentionally missing to protect rider privacy. Self-reported economic and energy consumption data has some missing values where riders did not report.CategoriesMotorcycle, Transportation in Africa, Electric Vehicles
Pre-Processed Power Grid Frequency Time Series
zenodo.org
bin, zip
Updated Jul 15, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Johannes Kruse; Johannes Kruse; Benjamin Schäfer; Benjamin Schäfer; Dirk Witthaut; Dirk Witthaut (2021). Pre-Processed Power Grid Frequency Time Series [Dataset]. http://doi.org/10.5281/zenodo.3744121
Explore at:
zip, binAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.3744121
Dataset updated
Jul 15, 2021
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Johannes Kruse; Johannes Kruse; Benjamin Schäfer; Benjamin Schäfer; Dirk Witthaut; Dirk Witthaut
Description
Overview
This repository contains ready-to-use frequency time series as well as the corresponding pre-processing scripts in python. The data covers three synchronous areas of the European power grid:

Continental Europe

Great Britain

Nordic

This work is part of the paper "Predictability of Power Grid Frequency"[1]. Please cite this paper, when using the data and the code. For a detailed documentation of the pre-processing procedure we refer to the supplementary material of the paper.

Data sources
We downloaded the frequency recordings from publically available repositories of three different Transmission System Operators (TSOs).

Continental Europe [2]: We downloaded the data from the German TSO TransnetBW GmbH, which retains the Copyright on the data, but allows to re-publish it upon request [3].

Great Britain [4]: The download was supported by National Grid ESO Open Data, which belongs to the British TSO National Grid. They publish the frequency recordings under the NGESO Open License [5].

Nordic [6]: We obtained the data from the Finish TSO Fingrid, which provides the data under the open license CC-BY 4.0 [7].

Content of the repository

A) Scripts

In the "Download_scripts" folder you will find three scripts to automatically download frequency data from the TSO's websites.

In "convert_data_format.py" we save the data with corrected timestamp formats. Missing data is marked as NaN (processing step (1) in the supplementary material of [1]).

In "clean_corrupted_data.py" we load the converted data and identify corrupted recordings. We mark them as NaN and clean some of the resulting data holes (processing step (2) in the supplementary material of [1]).

The python scripts run with Python 3.7 and with the packages found in "requirements.txt".

B) Data_converted and Data_cleansed
The folder "Data_converted" contains the output of "convert_data_format.py" and "Data_cleansed" contains the output of "clean_corrupted_data.py".

File type: The files are zipped csv-files, where each file comprises one year.

Data format: The files contain two columns. The first one represents the time stamps in the format Year-Month-Day Hour-Minute-Second, which is given as naive local time. The second column contains the frequency values in Hz.

NaN representation: We mark corrupted and missing data as "NaN" in the csv-files.

Use cases
We point out that this repository can be used in two different was:

Use pre-processed data: You can directly use the converted or the cleansed data. Note however that both data sets include segments of NaN-values due to missing and corrupted recordings. Only a very small part of the NaN-values were eliminated in the cleansed data to not manipulate the data too much. If your application cannot deal with NaNs, you could build upon the following commands to select the longest interval of valid data from the cleansed data:

from helper_functions import * import pandas as pd cleansed_data = pd.read_csv('/Path_to_cleansed_data/data.zip', index_col=0, header=None, squeeze=True, parse_dates=[0]) valid_bounds, valid_sizes = true_intervals(~cleansed_data.isnull()) start,end= valid_bounds[ np.argmax(valid_sizes) ] data_without_nan = cleansed_data.iloc[start:end]

Produce your own cleansed data: Depending on your application, you might want to cleanse the data in a custom way. You can easily add your custom cleansing procedure in "clean_corrupted_data.py" and then produce cleansed data from the raw data in "Data_converted".

License
We release the code in the folder "Scripts" under the MIT license [8]. In the case of Nationalgrid and Fingrid, we further release the pre-processed data in the folder "Data_converted" and "Data_cleansed" under the CC-BY 4.0 license [7]. TransnetBW originally did not publish their data under an open license. We have explicitly received the permission to publish the pre-processed version from TransnetBW. However, we cannot publish our pre-processed version under an open license due to the missing license of the original TransnetBW data.
Netflix
kaggle.com
Updated Jul 30, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Prasanna@82 (2025). Netflix [Dataset]. https://www.kaggle.com/datasets/prasanna82/netflix
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 30, 2025
Dataset provided by
Kaggle
Authors
Prasanna@82
Description
Netflix Dataset Exploration and Visualization

This project involves an in-depth analysis of the Netflix dataset to uncover key trends and patterns in the streaming platform’s content offerings. Using Python libraries such as Pandas, NumPy, and Matplotlib, this notebook visualizes and interprets critical insights from the data.

Objectives:

Analyze the distribution of content types (Movies vs. TV Shows)

Identify the most prolific countries producing Netflix content

Study the ratings and duration of shows

Handle missing values using techniques like interpolation, forward-fill, and custom replacements

Enhance readability with bar charts, horizontal plots, and annotated visuals

Key Visualizations:

Bar charts for type distribution and country-wise contributions

Handling missing data in rating, duration, and date_added

Annotated plots showing values for clarity

Tools Used:

Python 3

Pandas for data wrangling

Matplotlib for visualizations

Jupyter Notebook for hands-on analysis

Outcome: This project provides a clear view of Netflix's content library, helping data enthusiasts and beginners understand how to process, clean, and visualize real-world datasets effectively.

Feel free to fork, adapt, and extend the work.
image-impeccable
huggingface.co
Updated May 11, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
ThinkOnward (2025). image-impeccable [Dataset]. https://huggingface.co/datasets/thinkonward/image-impeccable
Explore at:
Dataset updated
May 11, 2025
Dataset provided by
Think Onward LLC
Authors
ThinkOnward
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Card for Image Impeccable

Dataset Description

This data was produced by ThinkOnward for the Image Impeccable Challenge, using a synthetic seismic dataset generator called Synthoseis.

Created by: Mike McIntire and Jesse Pisel License: CC 4.0

Uses How to generate a dataset

This dataset is provided as paired noisy and clean seismic volumes. Follow the following step to load the data to numpy volumes import pandas as pd import numpy as… See the full description on the dataset page: https://huggingface.co/datasets/thinkonward/image-impeccable.
Data from: BSRN solar radiation data for the testing, validation and...
zenodo.org
portaldelainvestigacion.uma.es
+2more
bin
Updated Feb 11, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jose A Ruiz-Arias; Jose A Ruiz-Arias (2024). BSRN solar radiation data for the testing, validation and benchmarking of solar irradiance components separation models [Dataset]. http://doi.org/10.5281/zenodo.10593079
Explore at:
binAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10593079
Dataset updated
Feb 11, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Jose A Ruiz-Arias; Jose A Ruiz-Arias
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset is an excerpt of the validation dataset used in:

Ruiz-Arias JA, Gueymard CA. Review and performance benchmarking of 1-min solar irradiance components separation methods: The critical role of dynamically-constrained sky conditions. Submitted for publication to Renewable and Sustainable Energy Reviews.

and it is ready to use in the Python package splitting_models developed during that research. See the documentation in the Python package for usage details. Below, there is a detailed description of the dataset.

The data is in a single parquet file that contains 1-min time series of solar geometry, clear-sky solar irradiance simulations, solar irradiance observations and CAELUS sky types for 5 BSRN sites, one per primary Köppen-Geiger climate, namely: Minamitorishima (mnm), JP, for equatorial climate; Alice Springs (asp), AU, for dry climate; Carpentras (car), FR, for temperate climate; Bondville (bon), US, for continental climate; and Sonnblick (son), AT, for cold/polar/snow climate. It includes one calendar year per site. The BSRN data is publicly available. See download instructions in https://bsrn.awi.de/data.

The specific variables included in the dataset are:

climate: primary Köppen-Geiger climate. Values are: A (equatorial), B (dry), C (temperate), D (continental) and E (polar/snow).

longitude: longitude, in degrees east.

latitude: latitude, in degrees north.

sza: solar zenith angle, in degrees.

eth: extraterrestrial solar irradiance (i.e., top of atmosphere solar irradiance), in W/m2.

ghics: clear-sky global solar irradiance, in W/m2. It is evaluated with the SPARTA clear-sky model and MERRA-2 clear-sky atmosphere.

difcs: clear-sky diffuse solar irradiance, in W/m2.It is evaluated with the SPARTA clear-sky model and MERRA-2 clear-sky atmosphere.

ghicda: clean-and-dry clear-sky global solar irradiance, in W/m2. It is evaluated with the SPARTA clear-sky model and MERRA-2 clear-sky atmosphere, prescribing zero aerosols and zero precipitable water.

ghi: observed global horizontal irradiance, in W/m2.

dif: observed diffuse irradiance, in W/m2.

sky_type: CAELUS sky type. Values are: 1 (unknown), 2 (overcast), 3 (thick clouds), 4 (scattered clouds), 5 (thin clouds), 6 (cloudless) and 7 (cloud enhancement).

The dataset can be easily loaded in a Python Pandas DataFrame as follows:

import pandas as pd

data = pd.read_parquet(

The dataframe has a multi-index with two levels: times_utc and site. The former are the UTC timestamps at the center of each 1-min interval. The latter is each site's label.
Z
A subsection of England and Wales EPC households, joined with PPD data, used...
data.niaid.nih.gov
zenodo.org
Updated Nov 15, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Phillips, Tom (2022). A subsection of England and Wales EPC households, joined with PPD data, used for simulation modelling [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7322966
Explore at:
Dataset updated
Nov 15, 2022
Dataset provided by
Jenkinson, Ryan
Lopez-Garcia, Daniel
Chan, Stephanie
Phillips, Tom
License
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Area covered
Wales, England
Description
If you want to give feedback on this dataset, or wish to request it in another form (e.g csv), please fill out this survey here. We are a not-for-profit research organisation keen to see how others use our open models and tools, so all feedback is appreciated! It's a short form that takes 5 minutes to complete.

Important Note: Before downloading this dataset, please read the License and Software Attribution section at the bottom.

This dataset aligns with the work published in Centre for Net Zero's report "Hitting the Target". In this work, we simulate a range of interventions to model the situations in which we believe the UK will meet its 600,000 heat pump installation per year target by 2028. For full modelling assumptions and findings, read our report on our website.

The code for running our simulation is open source here.

This dataset contains over 9 million households that have been address matched between Energy Performance Certificates (EPC) data and Price Paid Data (PPD). The code for our address matching is here. Since these datasets are Open Government License (OGL), this dataset is too. We basically model specific columns from various datasets, as set out in our methodology section in our report, to simplify and clean up this dataset for academic use. License information is also available in the appendix of our report above.

The EPC data loaders can be found here (the data is here) and the rest of the schemas and data download locations can be found here.

Note that this dataset is not regularly maintained or updated. It is correct as of January 2022. The data was curated and tested using dbt via this Github repository and would be simple to rerun on the latest data.

The schema / data dictionary for this data can be found here.

Our recommended way of loading this data is in Python. After downloading all "parts" of the dataset to a folder. You can run:

import pandas as pd data = pd.read_parquet("path/to/data/folder/")

Licenses and software attribution:

For EPC, PPD and UK House Price Index data:

For the EPC data, we are permitted to republish this providing we mention that all researchers who download this dataset follow these copyright restrictions. We do not explicitly release any Royal Mail address data, instead we use these fields to generate a pseudonymised "address_cluster_id" which reflects a unique combination of the address lines and postcodes, as well as other metadata. When viewing ICO and GDPR guidelines, this still counts as personal data, but we have gone to measures to pseudonymise as much as possible to fulfil our obligations as a data processor. You must read this carefully before downloading the data, and ensure that you are using it for the research purposes as determined by this copyright notice.

Contains HM Land Registry data © Crown copyright and database right 2021. This data is licensed under the Open Government Licence v3.0.

Contains OS data © Crown copyright and database right 2022.

Contains Office for National Statistics data licensed under the Open Government Licence v.3.0.

The OGL v3.0 license states that we are free to:

copy, publish, distribute and transmit the Information;

adapt the Information;

exploit the Information commercially and non-commercially for example, by combining it with other Information, or by including it in your own product or application.

However we must (where we do any of the above):

acknowledge the source of the Information in your product or application by including or linking to any attribution statement specified by the Information Provider(s) and, where possible, provide a link to this licence;

You can see more information here.

For XOServe Off Gas Postcodes:

This dataset has been released openly for all uses here.

For the address matching:

GNU Parallel: O. Tange (2018): GNU Parallel 2018, March 2018, https://doi.org/10.5281/zenodo.1146014
Singapore Residents dataset
kaggle.com
Updated Aug 28, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Anuj_sahay (2019). Singapore Residents dataset [Dataset]. https://www.kaggle.com/anujsahay112/singapore-residents-dataset/kernels
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 28, 2019
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Anuj_sahay
Area covered
Singapore
Description
Context

This dataset is in context of the real world data science work and how the data analyst and data scientist work.

Content

The dataset consists of four columns Year, Level_1(Ethnic group/gender), Level_2(Age group), and population

Acknowledgements

I would sincerely thank GeoIQ for sharing this dataset with me along with tasks. Just having a basic knowledge of Pandas and Numpy and other python data science libraries is not enough. How can you execute tasks and how can you preprocess the data before making any prediction is very important. Most of the datasets in Kaggle are clean and well arranged but this dataset thought me how real world data science and analysis works. Every data science beginner must work on this dataset and try to execute the tasks. It would only give them a good exposer to the real data science world.

Inspiration

Identify the largest Ethnic group in Singapore. Their average population growth over the years and what proportion of the total population do they constitute.

Identify the largest age group in Singapore. Their average population growth over the years and what proportion of the total population do they constitute.

Identify the group (by age, ethnicity and gender) that: a. Has shown the highest growth rate b. Has shown the lowest growth rate c. Has remained the same

Plot a graph for population trends
Flipkart OnlineOrders
kaggle.com
Updated Jun 22, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sabya (2020). Flipkart OnlineOrders [Dataset]. https://www.kaggle.com/sabya40/filpkart-onlineorders/kernels
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jun 22, 2020
Dataset provided by
Kagglehttp://kaggle.com/
Authors
Sabya
Description
Context

This dataset contains 6 months of Customer online orders. The data is simple but messy and unorganized. This for beginner and Intermediate level who want to improve there skills in Pandas, matplotlib, and seaborn.

Content

Dataset context columns like: crawl_timestamp, product_name, product_category_tree, retail_price, discounted_price, brand.

The main focus is to clean the dataset and make it organized using pandas.

Acknowledgements

I wouldn't be here without the help of data.world. Thank You.

Inspiration

I have some questions for this Dataset: 1. What was the best month for sales? How much was earned that month? 2. What time should we display advertisements to maximize the likelihood of purchases? 3. Which category sold most in that six month period? 4. Top 10 products sold most in that six month period?
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Bastian Bechtold; Bastian Bechtold (2025). A Replication Dataset for Fundamental Frequency Estimation [Dataset]. http://doi.org/10.5281/zenodo.3904389

A Replication Dataset for Fundamental Frequency Estimation

Explore at:

binAvailable download formats

Unique identifier

https://doi.org/10.5281/zenodo.3904389

Dataset updated

Apr 24, 2025

Dataset provided by

Zenodohttp://zenodo.org/

Authors

Bastian Bechtold; Bastian Bechtold

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Estimating the fundamental frequency of speech remains an active area of research, with varied applications in speech recognition, speaker identification, and speech compression. A vast number of algorithms for estimatimating this quantity have been proposed over the years, and a number of speech and noise corpora have been developed for evaluating their performance. The present dataset contains estimated fundamental frequency tracks of 25 algorithms, six speech corpora, two noise corpora, at nine signal-to-noise ratios between -20 and 20 dB SNR, as well as an additional evaluation of synthetic harmonic tone complexes in white noise.

The dataset also contains pre-calculated performance measures both novel and traditional, in reference to each speech corpus’ ground truth, the algorithms’ own clean-speech estimate, and our own consensus truth. It can thus serve as the basis for a comparison study, or to replicate existing studies from a larger dataset, or as a reference for developing new fundamental frequency estimation algorithms. All source code and data is available to download, and entirely reproducible, albeit requiring about one year of processor-time.

Included Code and Data

ground truth data.zip is a JBOF dataset of fundamental frequency estimates and ground truths of all speech files in the following corpora:
- CMU-ARCTIC (consensus truth) [1]
- FDA (corpus truth and consensus truth) [2]
- KEELE (corpus truth and consensus truth) [3]
- MOCHA-TIMIT (consensus truth) [4]
- PTDB-TUG (corpus truth and consensus truth) [5]
- TIMIT (consensus truth) [6]
noisy speech data.zip is a JBOF datasets of fundamental frequency estimates of speech files mixed with noise from the following corpora:
- NOISEX [7]
- QUT-NOISE [8]
synthetic speech data.zip is a JBOF dataset of fundamental frequency estimates of synthetic harmonic tone complexes in white noise.
noisy_speech.pkl and synthetic_speech.pkl are pickled Pandas dataframes of performance metrics derived from the above data for the following list of fundamental frequency estimation algorithms:
- AUTOC [9]
- AMDF [10]
- BANA [11]
- CEP [12]
- CREPE [13]
- DIO [14]
- DNN [15]
- KALDI [16]
- MAPS
- MBSC [17]
- NLS [18]
- PEFAC [19]
- PRAAT [20]
- RAPT [21]
- SACC [22]
- SAFE [23]
- SHR [24]
- SIFT [25]
- SRH [26]
- STRAIGHT [27]
- SWIPE [28]
- YAAPT [29]
- YIN [30]
noisy speech evaluation.py and synthetic speech evaluation.py are Python programs to calculate the above Pandas dataframes from the above JBOF datasets. They calculate the following performance measures:
- Gross Pitch Error (GPE), the percentage of pitches where the estimated pitch deviates from the true pitch by more than 20%.
- Fine Pitch Error (FPE), the mean error of grossly correct estimates.
- High/Low Octave Pitch Error (OPE), the percentage pitches that are GPEs and happens to be at an integer multiple of the true pitch.
- Gross Remaining Error (GRE), the percentage of pitches that are GPEs but not OPEs.
- Fine Remaining Bias (FRB), the median error of GREs.
- True Positive Rate (TPR), the percentage of true positive voicing estimates.
- False Positive Rate (FPR), the percentage of false positive voicing estimates.
- False Negative Rate (FNR), the percentage of false negative voicing estimates.
- F₁, the harmonic mean of precision and recall of the voicing decision.
Pipfile is a pipenv-compatible pipfile for installing all prerequisites necessary for running the above Python programs.

The Python programs take about an hour to compute on a fast 2019 computer, and require at least 32 Gb of memory.

References:

John Kominek and Alan W Black. CMU ARCTIC database for speech synthesis, 2003.
Paul C Bagshaw, Steven Hiller, and Mervyn A Jack. Enhanced Pitch Tracking and the Processing of F0 Contours for Computer Aided Intonation Teaching. In EUROSPEECH, 1993.
F Plante, Georg F Meyer, and William A Ainsworth. A Pitch Extraction Reference Database. In Fourth European Conference on Speech Communication and Technology, pages 837–840, Madrid, Spain, 1995.
Alan Wrench. MOCHA MultiCHannel Articulatory database: English, November 1999.
Gregor Pirker, Michael Wohlmayr, Stefan Petrik, and Franz Pernkopf. A Pitch Tracking Corpus with Evaluation on Multipitch Tracking Scenario. page 4, 2011.
John S. Garofolo, Lori F. Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, and Victor Zue. TIMIT Acoustic-Phonetic Continuous Speech Corpus, 1993.
Andrew Varga and Herman J.M. Steeneken. Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recog- nition systems. Speech Communication, 12(3):247–251, July 1993.
David B. Dean, Sridha Sridharan, Robert J. Vogt, and Michael W. Mason. The QUT-NOISE-TIMIT corpus for the evaluation of voice activity detection algorithms. Proceedings of Interspeech 2010, 2010.
Man Mohan Sondhi. New methods of pitch extraction. Audio and Electroacoustics, IEEE Transactions on, 16(2):262—266, 1968.
Myron J. Ross, Harry L. Shaffer, Asaf Cohen, Richard Freudberg, and Harold J. Manley. Average magnitude difference function pitch extractor. Acoustics, Speech and Signal Processing, IEEE Transactions on, 22(5):353—362, 1974.
Na Yang, He Ba, Weiyang Cai, Ilker Demirkol, and Wendi Heinzelman. BaNa: A Noise Resilient Fundamental Frequency Detection Algorithm for Speech and Music. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):1833–1848, December 2014.
Michael Noll. Cepstrum Pitch Determination. The Journal of the Acoustical Society of America, 41(2):293–309, 1967.
Jong Wook Kim, Justin Salamon, Peter Li, and Juan Pablo Bello. CREPE: A Convolutional Representation for Pitch Estimation. arXiv:1802.06182 [cs, eess, stat], February 2018. arXiv: 1802.06182.
Masanori Morise, Fumiya Yokomori, and Kenji Ozawa. WORLD: A Vocoder-Based High-Quality Speech Synthesis System for Real-Time Applications. IEICE Transactions on Information and Systems, E99.D(7):1877–1884, 2016.
Kun Han and DeLiang Wang. Neural Network Based Pitch Tracking in Very Noisy Speech. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 22(12):2158–2168, Decem- ber 2014.
Pegah Ghahremani, Bagher BabaAli, Daniel Povey, Korbinian Riedhammer, Jan Trmal, and Sanjeev Khudanpur. A pitch extraction algorithm tuned for automatic speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on, pages 2494–2498. IEEE, 2014.
Lee Ngee Tan and Abeer Alwan. Multi-band summary correlogram-based pitch detection for noisy speech. Speech Communication, 55(7-8):841–856, September 2013.
Jesper Kjær Nielsen, Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen, and Søren Holdt Jensen. Fast fundamental frequency estimation: Making a statistically

Clear search

Close search

Google apps

Main menu

A Replication Dataset for Fundamental Frequency Estimation

US Means of Transportation to Work Census Data

Data Ingestion and Cleaning:

Data Source:

Final Word:

Enhancing UNCDF Operations: Power BI Dashboard Development and Data Mapping

Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic...

Maricopa County Assessor "Fast Food" Search Query

SOURCE

DATA CLEANING

NULL VALUES

Data from: Nairobi Motorcycle Transit Comparison Dataset: Fuel vs. Electric...

Pre-Processed Power Grid Frequency Time Series

Netflix

image-impeccable

Data from: BSRN solar radiation data for the testing, validation and...

A subsection of England and Wales EPC households, joined with PPD data, used...

Singapore Residents dataset

Context

Content

Acknowledgements

Inspiration

Flipkart OnlineOrders

Context

Content

Acknowledgements

Inspiration

A Replication Dataset for Fundamental Frequency Estimation