Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
A curated dataset of 9,450 labeled audio segments capturing the rich and diverse soundscapes of South Asia. This dataset is designed for audio classification tasks and includes sounds ranging from traditional music and religious rituals to environmental noises and animal calls.
File naming pattern: Class-ID_Class-Name_Segment-Number.wav
Metadata file: metadata.csv

| Class ID | Class Name |
|---|---|
| 0 | Tanpura |
| 1 | Traditional Song |
| 2 | Railway Engine |
| 3 | Children Class Noise |
| 4 | Harmonium |
| 5 | Dhak |
| 6 | Tabla |
| 7 | Azan |
| 8 | Church Prayer |
| 9 | Irrigation Engine |
| 10 | Ektara |
| 11 | Launch Engine |
| 12 | Flute |
| 13 | Buddhist Prayer |
| 14 | Fish Market |
| 15 | Tiger |
| 16 | Elephant |
| 17 | Kalboishakhi Storm |
| 18 | Sanatan Religion Aroti |
| 19 | Rickshaw Horn |
| 20 | Afghanistan Pashto Music |
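
The naming pattern Class-ID_Class-Name_Segment-Number.wav can be split back into its parts when building labels. The following is a minimal sketch, not the dataset authors' tooling: `SegmentName`, `parse_segment_name`, and the example file names are hypothetical, and it assumes the class ID and segment number are plain integers at the two ends of the name.

```python
from pathlib import Path
from typing import NamedTuple


class SegmentName(NamedTuple):
    """Parts of a file name shaped like Class-ID_Class-Name_Segment-Number.wav."""
    class_id: int
    class_name: str
    segment_number: int


def parse_segment_name(file_name: str) -> SegmentName:
    """Split a segment file name into class ID, class name, and segment number."""
    stem = Path(file_name).stem  # drop the .wav extension
    # Take the ID from the left and the segment number from the right, so a
    # class name that itself contains underscores stays in one piece.
    class_id, rest = stem.split("_", 1)
    class_name, segment_number = rest.rsplit("_", 1)
    return SegmentName(int(class_id), class_name, int(segment_number))


# Hypothetical file name, made up for illustration.
example = parse_segment_name("6_Tabla_12.wav")
print(example)
```

Splitting from both ends, rather than on every underscore, is a defensive choice because the source does not say how multi-word class names such as "Children Class Noise" are written in file names.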
- slice_file_name: Name of the audio segment
- slicing_start_time: Start time of the segment
- slicing_end_time: End time of the segment
- ClassID: Numeric class label (0 to 20)
- Class_name: Descriptive class name
- folder: Folder containing the segment

Website: sas-kiit.netlify.app
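
The metadata columns listed above can be read with Python's standard `csv` module. This is a sketch under stated assumptions: the sample rows are invented, the start and end times are assumed to be numeric seconds, and `load_metadata` is a hypothetical helper rather than part of the dataset.

```python
import csv
import io
from collections import Counter

# Made-up rows in the documented column layout. The real metadata.csv may
# format times and folder names differently.
SAMPLE_CSV = """slice_file_name,slicing_start_time,slicing_end_time,ClassID,Class_name,folder
6_Tabla_1.wav,0.0,4.0,6,Tabla,fold1
6_Tabla_2.wav,4.0,8.0,6,Tabla,fold1
15_Tiger_1.wav,0.0,4.0,15,Tiger,fold2
"""


def load_metadata(handle):
    """Read metadata rows, validate the class label, and add a duration field."""
    rows = []
    for row in csv.DictReader(handle):
        class_id = int(row["ClassID"])
        if not 0 <= class_id <= 20:  # the documented label range
            raise ValueError(f"unexpected ClassID {class_id}")
        start = float(row["slicing_start_time"])  # assumed to be seconds
        end = float(row["slicing_end_time"])
        rows.append({**row, "ClassID": class_id, "duration": end - start})
    return rows


rows = load_metadata(io.StringIO(SAMPLE_CSV))
counts = Counter(row["ClassID"] for row in rows)
print(counts)
```

To run this against the real file, pass `open("metadata.csv", newline="")` instead of the in-memory sample.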
Paper link: https://ieeexplore.ieee.org/document/10829485

If you use this dataset, please cite:

@inproceedings{chatterjee2024south,
  title={South Asian Sounds: Audio Classification},
  author={Chatterjee, Rajdeep and Bishwas, Pappu and Chakrabarty, Sudip and Bandyopadhyay, Tathagata},
  booktitle={2024 4th International Conference on Computer, Communication, Control & Information Technology (C3IT)},
  pages={1--6},
  year={2024},
  organization={IEEE}
}
This dataset is intended for research and academic use only.
Please provide proper citation when using it in your work.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
!!!WARNING~~~
This dataset has a large number of flaws and cannot properly answer many of the questions that people generally use it to answer, such as whether national hate crimes are changing (or at least they use the data so improperly that they get the wrong answer). A large number of people using this data (academics, advocates, reporters, US Congress) do so inappropriately and get the wrong answer to their questions as a result. Indeed, many published papers using this data should be retracted. Before using this data I highly recommend that you thoroughly read my book on UCR data, particularly the chapter on hate crimes (https://ucrbook.com/hate-crimes.html), as well as the FBI's own manual on this data. The questions you could potentially answer well are relatively narrow and generally exclude any causal relationships.
~~~WARNING!!!

Version 8 release notes:
Adds 2019 data.

Version 7 release notes:
Changes release notes description, does not change data.

Version 6 release notes:
Adds 2018 data.

Version 5 release notes:
Adds data in the following formats: SPSS, SAS, and Excel.
Changes project name to avoid confusing this data for the ones done by NACJD.
Adds data for 1991.
Fixes bug where bias motivation "anti-lesbian, gay, bisexual, or transgender, mixed group (lgbt)" was labeled "anti-homosexual (gay and lesbian)" prior to 2013, causing there to be two columns and zero values for years with the wrong label.
All data is now directly from the FBI, not NACJD. The data initially comes as ASCII+SPSS Setup files and is read into R using the package asciiSetupReader. All work to clean the data and save it in various file formats was also done in R.

Version 4 release notes:
Adds data for 2017.
Adds rows for agencies that submitted a zero-report (i.e. the agency reported no hate crimes in the year). This is for all years 1992-2017.
Made changes to categorical variables (e.g. bias motivation columns) to make categories consistent over time. Different years had slightly different names (e.g. 'anti-am indian' and 'anti-american indian'), which I made consistent.
Made the 'population' column, which is the total population in that agency.

Version 3 release notes:
Adds data for 2016.
Orders rows by year (descending) and ORI.

Version 2 release notes:
Fixes bug where Philadelphia Police Department had incorrect FIPS county code.

The Hate Crime data is an FBI data set that is part of the annual Uniform Crime Reporting (UCR) Program data. This data contains information about hate crimes reported in the United States. Please note that the files are quite large and may take some time to open.

Each row indicates a hate crime incident for an agency in a given year. I have made a unique ID column ("unique_id") by combining the year, agency ORI9 (the 9 character Originating Identifier code), and incident number columns together. Each column is a variable related to that incident or to the reporting agency. Some of the important columns are the incident date, what crime occurred (up to 10 crimes), the number of victims for each of these crimes, the bias motivation for each of these crimes, and the location of each crime. It also includes the total number of victims, total number of offenders, and race of offenders (as a group). Finally, it has a number of columns indicating if the victim for each offense was a certain type of victim or not (e.g. individual victim, business victim, religious victim, etc.).

The only changes I made to the data are the following: minor changes to column names to make all column names 32 characters or fewer (so the data can be saved in a Stata format), making all character values lower case, and reordering columns. I also generated incident month, weekday, and month-day variables from the incident date variable included in the original data.
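
The derived columns described above (the combined unique ID and the month, weekday, and month-day variables) can be illustrated as follows. The original processing was done in R, so this Python version is only a sketch of the idea: apart from "unique_id", the column names, the underscore separator, the ISO date format, and the example row are all assumptions made for illustration.

```python
from datetime import date

MONTHS = ("january", "february", "march", "april", "may", "june", "july",
          "august", "september", "october", "november", "december")
WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday", "friday",
            "saturday", "sunday")


def add_derived_fields(row, sep="_"):
    """Return a copy of row with a unique ID and date-derived fields added."""
    incident = date.fromisoformat(row["incident_date"])  # assumes YYYY-MM-DD
    out = dict(row)
    # Combine year, ORI9, and incident number; the separator is an assumption.
    out["unique_id"] = sep.join(
        [str(row["year"]), row["ori9"], row["incident_number"]]
    )
    # Lower-case names, matching the lower-cased character values in the data.
    out["incident_month"] = MONTHS[incident.month - 1]
    out["incident_weekday"] = WEEKDAYS[incident.weekday()]  # Monday is 0
    out["incident_month_day"] = f"{incident.month:02d}-{incident.day:02d}"
    return out


# Hypothetical row: the ORI9 and incident number are placeholders, not real codes.
example_row = {
    "year": 2019,
    "ori9": "XX0000000",
    "incident_number": "000123",
    "incident_date": "2019-07-04",
}
result = add_derived_fields(example_row)
print(result["unique_id"], result["incident_weekday"])
```

Keeping the incident number as a string preserves any leading zeros, which matters if the combined ID is later used to join files.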