65 datasets found

Singapore Residents dataset
kaggle.com
zip
Updated Aug 28, 2019
Cite
Anuj_sahay (2019). Singapore Residents dataset [Dataset]. https://www.kaggle.com/anujsahay112/singapore-residents-dataset
Available download formats: zip (116422 bytes)
Dataset updated
Aug 28, 2019
Authors
Anuj_sahay
Area covered
Singapore
Description
Context

This dataset reflects real-world data science work and how data analysts and data scientists operate.

Content

The dataset consists of four columns: Year, Level_1 (ethnic group/gender), Level_2 (age group), and population.

Acknowledgements

I sincerely thank GeoIQ for sharing this dataset with me along with the tasks. Basic knowledge of Pandas, NumPy and other Python data science libraries is not enough. Knowing how to execute tasks and how to preprocess the data before making any prediction is very important. Most of the datasets on Kaggle are clean and well arranged, but this dataset taught me how real-world data science and analysis works. Every data science beginner should work on this dataset and try to execute the tasks. It will give them good exposure to the real data science world.

Inspiration

Identify the largest ethnic group in Singapore, its average population growth over the years, and the proportion of the total population it constitutes.

Identify the largest age group in Singapore, its average population growth over the years, and the proportion of the total population it constitutes.

Identify the group (by age, ethnicity and gender) that: a. has shown the highest growth rate; b. has shown the lowest growth rate; c. has remained the same.

Plot a graph of population trends.
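As a starting point for the growth and proportion tasks above, here is a minimal sketch in plain Python. It assumes the CSV rows have been read into dictionaries keyed Year, Level_1, Level_2 and population (the exact header spellings are an assumption), and that the rows passed in contain no pre-aggregated total rows. The sample figures are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict


def yearly_totals(rows, group_key):
    """Sum population per group and year; returns {group: {year: total}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in rows:
        totals[row[group_key]][int(row["Year"])] += float(row["population"])
    return totals


def average_growth(rows, group_key):
    """Mean year-over-year growth rate per group (0.10 means 10 %)."""
    result = {}
    for group, by_year in yearly_totals(rows, group_key).items():
        years = sorted(by_year)
        rates = [
            (by_year[later] - by_year[earlier]) / by_year[earlier]
            for earlier, later in zip(years, years[1:])
            if by_year[earlier] > 0
        ]
        result[group] = sum(rates) / len(rates) if rates else 0.0
    return result


def share_in_year(rows, group_key, year):
    """Each group's proportion of the summed population in one year."""
    totals = yearly_totals(rows, group_key)
    grand_total = sum(by_year.get(year, 0.0) for by_year in totals.values())
    if grand_total == 0:
        return {}
    return {g: by_year.get(year, 0.0) / grand_total for g, by_year in totals.items()}


# Made-up toy rows, NOT real figures from the dataset.
TOY_ROWS = [
    {"Year": "2000", "Level_1": "Group A", "Level_2": "0 - 4 Years", "population": "100"},
    {"Year": "2001", "Level_1": "Group A", "Level_2": "0 - 4 Years", "population": "110"},
    {"Year": "2002", "Level_1": "Group A", "Level_2": "0 - 4 Years", "population": "121"},
    {"Year": "2000", "Level_1": "Group B", "Level_2": "0 - 4 Years", "population": "200"},
    {"Year": "2001", "Level_1": "Group B", "Level_2": "0 - 4 Years", "population": "200"},
    {"Year": "2002", "Level_1": "Group B", "Level_2": "0 - 4 Years", "population": "200"},
]

growth = average_growth(TOY_ROWS, "Level_1")
shares = share_in_year(TOY_ROWS, "Level_1", 2002)
```

Passing "Level_2" instead of "Level_1" as the group key would give the same figures per age group. The group with the largest value in shares is the largest group for that year.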
Resident Population Born Outside Singapore by Age Group, Ethnic Group and...
data.gov.sg
Updated Nov 5, 2025
Cite
Singapore Department of Statistics (2025). Resident Population Born Outside Singapore by Age Group, Ethnic Group and Sex (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_61ef44ab621ed1ef5592be1ab19b48fe/view
Dataset updated
Nov 5, 2025
Dataset authored and provided by
Singapore Department of Statistics
License
https://data.gov.sg/open-data-licence
Area covered
Singapore
Description
Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_61ef44ab621ed1ef5592be1ab19b48fe/view
Resident Population Aged 5 Years and Over by Language Most / Second Most...
data.gov.sg
Updated Nov 5, 2025
Cite
Singapore Department of Statistics (2025). Resident Population Aged 5 Years and Over by Language Most / Second Most Frequently Spoken at Home, Age Group and Ethnic Group (All Ethnic Groups) (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_ad4a8ccbdab03d16c486a9ee6988289d/view
Dataset updated
Nov 5, 2025
Dataset authored and provided by
Singapore Department of Statistics
License
https://data.gov.sg/open-data-licence
Description
Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_ad4a8ccbdab03d16c486a9ee6988289d/view
New Events Data in Singapore
kaggle.com
zip
Updated Sep 14, 2024
Cite
Techsalerator (2024). New Events Data in Singapore [Dataset]. https://www.kaggle.com/datasets/techsalerator/new-events-data-in-singapore
Available download formats: zip (4948 bytes)
Dataset updated
Sep 14, 2024
Authors
Techsalerator
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Area covered
Singapore
Description
Techsalerator's News Events Data for Singapore: A Comprehensive Overview

Techsalerator's News Events Data for Singapore offers a powerful resource for businesses, researchers, and media organizations. This dataset compiles information on significant news events across Singapore, pulling from a wide range of media sources, including news outlets, online publications, and social platforms. It provides valuable insights for those looking to track trends, analyze public sentiment, or monitor industry-specific developments.

Key Data Fields

- Event Date: Captures the exact date of the news event. This is crucial for analysts who need to monitor trends over time or for businesses responding to market shifts.
- Event Title: A brief headline describing the event. This allows users to quickly categorize and assess news content based on relevance to their interests.
- Source: Identifies the news outlet or platform where the event was reported. This helps users track credible sources and assess the reach and influence of the event.
- Location: Provides geographic information, indicating where the event took place within Singapore. This is especially valuable for regional analysis or localized marketing efforts.
- Event Description: A detailed summary of the event, outlining key developments, participants, and potential impact. Researchers and businesses use this to understand the context and implications of the event.

Top 5 News Categories in Singapore

- Politics: Major news coverage on government decisions, political movements, elections, and policy changes that affect the national landscape.
- Economy: Focuses on Singapore’s economic indicators, inflation rates, international trade, and corporate activities influencing business and finance sectors.
- Social Issues: News events covering public health, education, and other societal concerns that drive public discourse.
- Sports: Highlights events in popular sports such as soccer, swimming, and table tennis, often drawing widespread attention and engagement.
- Technology and Innovation: Reports on tech developments, startups, and innovations in Singapore’s thriving tech ecosystem, featuring emerging companies and advancements.

Top 5 News Sources in Singapore

- The Straits Times: A leading news outlet providing comprehensive coverage of national politics, economy, and social issues.
- Channel News Asia: A major news platform known for its timely updates on breaking news, politics, and current affairs.
- The Business Times: A widely-read newspaper offering insights into economic developments, business news, and corporate activities.
- TODAY: A significant news source covering a broad spectrum of topics, including politics, economy, and social issues.
- Channel 8 News: The national news channel delivering updates on significant events, public health, and sports across Singapore.

Accessing Techsalerator’s News Events Data for Singapore

To access Techsalerator’s News Events Data for Singapore, please contact info@techsalerator.com with your specific needs. We will provide a customized quote based on the data fields and records you require, with delivery available within 24 hours. Ongoing access options can also be discussed.

Included Data Fields

- Event Date
- Event Title
- Source
- Location
- Event Description
- Event Category (Politics, Economy, Sports, etc.)
- Participants (if applicable)
- Event Impact (Social, Economic, etc.)

Techsalerator’s dataset is an invaluable tool for keeping track of significant events in Singapore. It aids in making informed decisions, whether for business strategy, market analysis, or academic research, providing a clear picture of the country’s news landscape.
Singapore Residents By Age Group, Ethnic Group And Sex, At End June, Annual
data.gov.sg
Updated Nov 1, 2025
Cite
Singapore Department of Statistics (2025). Singapore Residents By Age Group, Ethnic Group And Sex, At End June, Annual [Dataset]. https://data.gov.sg/datasets/d_3cf667d761b4bdc6d4d3d3aeec37dea5/view
Dataset updated
Nov 1, 2025
Dataset authored and provided by
Singapore Department of Statistics
License
https://data.gov.sg/open-data-licence
Time period covered
Jan 1957 - Dec 2025
Area covered
Singapore
Description
Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_3cf667d761b4bdc6d4d3d3aeec37dea5/view
Resident Population Born In Singapore by Age Group, Ethnic Group and Sex...
data.gov.sg
Updated Nov 17, 2025
Cite
Singapore Department of Statistics (2025). Resident Population Born In Singapore by Age Group, Ethnic Group and Sex (Census of Population 2010) [Dataset]. https://data.gov.sg/datasets/d_8f1404e56fa6ef520b901a4b51062ee6/view
Dataset updated
Nov 17, 2025
Dataset authored and provided by
Singapore Department of Statistics
License
https://data.gov.sg/open-data-licence
Area covered
Singapore
Description
Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_8f1404e56fa6ef520b901a4b51062ee6/view
Data from: The National University of Singapore SMS Corpus
kaggle.com
zip
Updated Aug 7, 2017
Cite
Rachael Tatman (2017). The National University of Singapore SMS Corpus [Dataset]. https://www.kaggle.com/datasets/rtatman/the-national-university-of-singapore-sms-corpus/discussion
Available download formats: zip (3788449 bytes)
Dataset updated
Aug 7, 2017
Authors
Rachael Tatman
Area covered
Singapore
Description
Context:

Short Message Service (SMS) messages are short messages sent from one person to another via their mobile phones. They are a means of personal communication and an important communicative artifact of our current digital era. This dataset contains SMS messages that were collected from users who knew they were participating in a research project and that their messages would be shared publicly. The messages are in two languages: Singapore English and Mandarin Chinese.

Content:

This is a corpus of SMS (Short Message Service) messages collected for research at the Department of Computer Science at the National University of Singapore. This dataset consists of 67,093 SMS messages taken from the corpus on Mar 9, 2015. The messages largely originate from Singaporeans and mostly from students attending the University. These messages were collected from volunteers who were made aware that their contributions were going to be made publicly available. The data collectors opportunistically collected as much metadata about the messages and their senders as possible, so as to enable different types of analyses.

Acknowledgements:

This corpus was collected by Tao Chen and Min-Yen Kan. If you use this data, please cite the following paper:

Tao Chen and Min-Yen Kan (2013). Creating a Live, Public Short Message Service Corpus: The NUS SMS Corpus. Language Resources and Evaluation, 47(2), pages 299-355. URL: https://link.springer.com/article/10.1007%2Fs10579-012-9197-9

Inspiration:

This dataset contains a lot of short, informal texts and is ideal for trying your hand at various natural language processing tasks. There’s also a lot of information about the messages which might reveal interesting insights. Here are some ideas to get you started:

This dataset contains Singapore English. How well do tools trained on other varieties of English, like stemmers or part of speech taggers, work on it?

What time of day are most SMS messages sent? Is this different for the English and Mandarin datasets?

Unlike English, Mandarin does not have spaces between words, which can be made up of several characters. Can you build or implement a system for word identification?
🐎 Hong Kong Jockey Club and Singapore TurfClub
kaggle.com
zip
Updated Oct 15, 2024
Cite
mexwell (2024). 🐎 Hong Kong Jockey Club and Singapore TurfClub [Dataset]. https://www.kaggle.com/datasets/mexwell/hong-kong-jockey-club-and-singapore-turfclub/data
Available download formats: zip (41690560 bytes)
Dataset updated
Oct 15, 2024
Authors
mexwell
Area covered
Singapore
Description
Foreword

Gambling is bad, m'kay.

This repository provides horse race data for the Hong Kong Jockey Club and the Singapore Turf Club. The data was obtained by scraping their respective public websites, and comes with no guarantee of correctness whatsoever.

A particularly cool thing is that we also provide historical odds over a period of time for HKJC races. Being able to predict the final odds for a given horse in a given race is extremely valuable, but historical data are, as far as we know, not publicly available. We thus wrote a scraper that ran for 2 seasons and probed the odds at regular intervals up to the race start. This allows for cool time series analysis that can't be done with the historical data available on the public websites.

The dataset is provided as a set of compressed CSV files that can easily be reloaded into a database of your choice, a pandas dataframe, or even Excel if you don't know any better. The HKJC website is just a little less crappy than the TurfClub one, so in general the HK data contains more information than its Singaporean counterpart.

Original Data

Dataset

horses

List of all the horses (some retired) for HKJC and SGTC that ran a race, up to 2018-07-01.

performances

Each row of this table is the result for a single horse in a single race, with its position and final odds (for first place; more explicit dividends can be found in the all_dividends table for HK races). This is the main source of information for the statistics you want. Note that some races found in the performances table do NOT have a counterpart in the races table.

This contains historical results from 1979 up to 2018-06-27 for Hong Kong, and 2002-03-08 to 2018-04-24 for Singapore.

races

List of all the races run between 2016-09-28 and 2018-06-27 for Hong Kong and 2016-09-25 to 2018-04-24 for Singapore. Note that some races not found in this table still have available performances in the performances table.

all_dividends

Each row of this table contains the JSON-encoded dividend results (which can be used to infer the final odds) for each race run in Hong Kong between 2016-09-28 and 2018-06-27.

sectional_times

Each row contains the sectional times for races run between 2008-06-05 and 2018-06-27. That is, for a given horse in a given race, its placing and time at a given section of the track.

live_odds

Live odds evolution for Hong Kong races run between 2016-09-27 and 2018-06-27. HKJC is a "pari-mutuel" system where odds for a given horse / bet evolve up to the start of the race. This dataset was collected by polling for the odds at various intervals before a race (with the interval getting smaller as the race got closer, since that's when the odds tend to vary the most). As far as we can tell, this kind of information cannot be found in historical datasets, and can only be collected in real time.
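Because the performances and races tables only partly overlap, it can help to list the performance rows that have no matching race before joining the two. The sketch below shows one way to do that in plain Python. The key name race_id and the sample rows are hypothetical; the actual column names in the CSV files are not given in this description.

```python
def performances_without_race(performances, races, key="race_id"):
    """Return performance rows whose race key is absent from the races table.

    `key` is a hypothetical column name; replace it with whatever
    identifier the real CSV files use to link the two tables.
    """
    known_races = {race[key] for race in races}
    return [row for row in performances if row[key] not in known_races]


# Made-up rows for illustration only.
toy_races = [{"race_id": "R2"}, {"race_id": "R3"}]
toy_performances = [
    {"race_id": "R1", "horse": "A", "position": 1},
    {"race_id": "R2", "horse": "B", "position": 2},
    {"race_id": "R3", "horse": "C", "position": 1},
]

orphans = performances_without_race(toy_performances, toy_races)
```

Here orphans holds the single row for race R1, which would be silently dropped by an inner join on the races table.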

Acknowledgement

Photo by Gene Devine on Unsplash
Resident Households by Ethnic Group, Labour Force Status and Sex of...
data.gov.sg
Updated Nov 6, 2025
Cite
Singapore Department of Statistics (2025). Resident Households by Ethnic Group, Labour Force Status and Sex of Household Reference Person (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_5c1b7f454248e9bb20bc5959eea5928b/view
Dataset updated
Nov 6, 2025
Dataset authored and provided by
Singapore Department of Statistics
License
https://data.gov.sg/open-data-licence
Description
Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_5c1b7f454248e9bb20bc5959eea5928b/view
Singapore-music-classifier
kaggle.com
zip
Updated Nov 13, 2020
Cite
Fajilatun Nahar (2020). Singapore-music-classifier [Dataset]. https://www.kaggle.com/fajilatunnahar/singaporemusicclassifier
Available download formats: zip (1534107746 bytes)
Dataset updated
Nov 13, 2020
Authors
Fajilatun Nahar
Area covered
Singapore
Description
Context

A dataset to classify Chinese, Malay, Hindi and Tamil songs

Content

The songs were downloaded from Spotify and are 30 seconds long each. Low-level and high-level features were extracted using openSMILE and Essentia respectively. Mel spectrogram images were also extracted using librosa.

Dataset Description

Dataset 1: 260 low-level features

Dataset 2: 127 high-level features

Dataset 3: Combination of 387 high and low level features

Dataset 4: 260 low-level features, each row in the dataset is a 5 second frame of a song.

Dataset 5: 1820 low-level features (feature space increased with statistical metrics: mean, minimum, maximum, variance, skewness, kurtosis)

Dataset 6: 111 low-level features with feature selection (wrapper method)

Dataset 7: 82 high-level features with feature selection (wrapper method)

Dataset 8: 182 low-level features with feature selection (filter method)

Dataset 9: 67 high-level features with feature selection (filter method)

Dataset 10: 92 low-level common features from datasets 6 and 8

Dataset 11: 49 high-level common features from datasets 7 and 9

Dataset 12: Mel Spectrogram images

Inspiration

What is the best classification accuracy?

Can we identify exclusive traits of each ethnic group via the songs?
Twitter bot profiling
researchdata.smu.edu.sg
smu.edu.sg
pdf
Updated May 31, 2023
Cite
Living Analytics Research Centre (2023). Twitter bot profiling [Dataset]. http://doi.org/10.25440/smu.12062706.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.25440/smu.12062706.v1
Dataset updated
May 31, 2023
Dataset provided by
SMU Research Data Repository (RDR)
Authors
Living Analytics Research Centre
License
http://rightsstatements.org/vocab/InC/1.0/
Description
This dataset comprises a set of Twitter accounts in Singapore that are used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates contents and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:

- Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
- Consumption bot. The main purpose of this bot is to aggregate contents from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.
- Spam bot. This type of bot posts malicious contents (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or promotes harmless but invalid/irrelevant contents aggressively.

This categorization is general enough to cater for new, emerging types of bot (e.g., chatbots can be viewed as a special type of broadcast bot). The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users having many followers), their follow, retweet, and user mention links were crawled. The data collection proceeded by adding those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. Using this procedure, a total of 159,724 accounts were collected. To identify bots, the first step was to select active accounts that tweeted at least 15 times within the month of April 2014. These accounts were then manually checked and labelled, and 589 of them were found to be bots. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.

Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
United States Imports from Singapore
tradingeconomics.com
csv, excel, json, xml
Updated Jun 3, 2017
Cite
TRADING ECONOMICS (2017). United States Imports from Singapore [Dataset]. https://tradingeconomics.com/united-states/imports-from-singapore
Available download formats: csv, excel, xml, json
Dataset updated
Jun 3, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 31, 1985 - Feb 29, 2024
Area covered
United States
Description
Imports from Singapore in the United States increased to 3384.61 USD Million in February from 3272.07 USD Million in January of 2024. This dataset includes a chart with historical data for the United States Imports from Singapore.
Singapore SG: Diabetes Prevalence: % of Population Aged 20-79
ceicdata.com
Cite
CEICdata.com, Singapore SG: Diabetes Prevalence: % of Population Aged 20-79 [Dataset]. https://www.ceicdata.com/en/singapore/health-statistics/sg-diabetes-prevalence--of-population-aged-2079
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2017
Area covered
Singapore
Description
Singapore SG: Diabetes Prevalence: % of Population Aged 20-79 data was reported at 10.990 % in 2017. The data is updated yearly, averaging 10.990 % (median) from Dec 2017 to 2017, with 1 observation. The data remains in active status in CEIC and is reported by the World Bank. It is categorized under Global Database’s Singapore – Table SG.World Bank.WDI: Health Statistics. Diabetes prevalence refers to the percentage of people ages 20-79 who have type 1 or type 2 diabetes. Source: International Diabetes Federation, Diabetes Atlas. Aggregation method: weighted average.
SINGA:PURA (SINGApore: Polyphonic URban Audio)
zenodo.org
bin, csv, pdf, zip
Updated Jul 17, 2024
Cite
Kenneth Ooi; Karn N. Watcharasupat; Santi Peksi; Furi Andi Karnapi; Zhen-Ting Ong; Danny Chua; Hui-Wen Leow; Li-Long Kwok; Xin-Lei Ng; Zhen-Ann Loh; Woon-Seng Gan (2024). SINGA:PURA (SINGApore: Polyphonic URban Audio) [Dataset]. http://doi.org/10.5281/zenodo.5645825
Available download formats: bin, zip, pdf, csv
Unique identifier
https://doi.org/10.5281/zenodo.5645825
Dataset updated
Jul 17, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Kenneth Ooi; Karn N. Watcharasupat; Santi Peksi; Furi Andi Karnapi; Zhen-Ting Ong; Danny Chua; Hui-Wen Leow; Li-Long Kwok; Xin-Lei Ng; Zhen-Ann Loh; Woon-Seng Gan
Area covered
Singapore
Description
SINGA:PURA Dataset (v1.0a)

This repository contains the strongly-labelled subset of recordings of the SINGA:PURA (SINGApore: Polyphonic URban Audio) dataset and corresponding metadata, formatted in a manner compatible with a soundata dataset loader.

Please note that this repository does not contain the unlabelled recordings of the SINGA:PURA dataset! If you wish to access the unlabelled recordings, please refer to https://doi.org/10.21979/N9/Y8UQ6F for the full version (v1.0) of the SINGA:PURA dataset (which contains both the strongly-labelled and unlabelled recordings).

Regarding this repository

The SINGA:PURA dataset is a polyphonic urban sound dataset with spatiotemporal context that contains 6547 strongly-labelled and 72406 unlabelled recordings from a wireless acoustic sensor network deployed in Singapore to identify and mitigate noise sources in Singapore. However, this repository only contains the subset of 6547 strongly-labelled recordings from the SINGA:PURA dataset and their corresponding labels, formatted in a manner compatible with a soundata dataset loader. The recordings are all 10 seconds in length, and may have 1 or 7 channels, depending on the recording device used to record them.

The readme file in this repository ("Readme.md") contains the same information as this description: a short description of the organisation of this repository, as well as our label taxonomy and the dataset itself. For full details regarding the sensor units used, the recording conditions, and the annotation methodology, please refer to our conference paper below:

K. Ooi, K. N. Watcharasupat, S. Peksi, F. A. Karnapi, Z.-T. Ong, D. Chua, H.-W. Leow, L.-L. Kwok, X.-L. Ng, Z.-A. Loh, W.-S. Gan, "A Strongly-Labelled Polyphonic Dataset of Urban Sounds with Spatiotemporal Context," in 13th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2021.

The conference paper has also been included in this repository as "APSIPA.pdf".

Directory structure

This repository contains a total of 9 files. 5 of the files ("labelled.zip", "labelled.z01", "labelled.z02", "labelled.z03", "labelled.z04") form a multi-part ZIP archive that, when extracted, contain the subset of 6547 strongly-labelled recordings (in FLAC format) in the SINGA:PURA dataset organised in folders by date of recording. The other 4 files are:

"APSIPA.pdf": A PDF copy of the conference paper describing the dataset, recording and annotation methodology in detail.

"labelled_metadata_public.csv": A CSV file containing the metadata for the 6547 strongly-labelled recordings. Each row corresponds to a single recording. See the section titled "Metadata CSV file" for more information.

"labels_public.zip": A ZIP archive that, when extracted, contains 6547 CSV files that each contain the strong labels for their corresponding strongly-labelled recording. The names of the CSV files are identical to the names of the corresponding FLAC files containing the recordings, save for the file extension. Each row corresponds to a single acoustic event. See "Labels CSV files" for more information.

"Readme.md": The readme file for this repository.

Each numbered part of the multi-part ZIP archive is 1000 MB in size, which makes the dataset in its entirety about 5 GB in size. Please ensure that your connection has sufficient bandwidth to support the download, and it may also be useful to use a download manager for downloading the individual files of the dataset. To extract the multi-part ZIP archive, it may be helpful to use either WinRAR or WinZip.

After extraction, the directory structure of this repository should be as follows:

.
├─ labelled
│  ├─ 2020-08-03
│  │  └─ [b827eb7d576e][2020-08-03T23-32-11Z][manual][---][565a40f866f3d2804332ca7896a4c77d][93.29-86.29 66.65]!-90.flac
│  │
│  ├─ 2020-08-17
│  │  └─ <.flac files>
│  │
│  ├─ ...
│  │
│  └─ 2020-10-31
│     └─ <.flac files>
│
├─ labels_public
│  ├─ [b827eb0a63c9][2020-08-20T11-29-04Z][manual][---][de313d12d7f31937615be80cc47a1ad9][]-53.csv
│  ├─ [b827eb0a63c9][2020-08-20T11-30-04Z][manual][---][de313d12d7f31937615be80cc47a1ad9][]-54.csv
│  ├─ ...
│  └─ [b827ebf3744c][2020-09-02T06-53-04Z][manual][---][4edbade2d41d5f80e324ee4f10d401c0][]-1647.csv
│
├─ APSIPA.pdf
├─ labelled_metadata_public.csv
└─ Readme.md
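The file names in the listing above pack several square-bracketed fields into one string. The following sketch splits out those fields and parses the second one as a UTC timestamp. That reading of the second field is an assumption based on its appearance (for example 2020-08-20T11-29-04Z); the meaning of the remaining fields is not described here, so they are returned untouched. The function name is hypothetical.

```python
import re
from datetime import datetime, timezone

# Matches each [...] group, including empty ones such as "[]".
BRACKET_FIELD = re.compile(r"\[([^\]]*)\]")


def parse_recording_name(filename):
    """Split a recording or label file name into its bracketed fields.

    Assumption: the second field is a UTC timestamp written as
    YYYY-MM-DDTHH-MM-SSZ. Other fields are returned as raw strings.
    """
    fields = BRACKET_FIELD.findall(filename)
    if len(fields) < 2:
        raise ValueError("unexpected file name layout: " + filename)
    stamp = datetime.strptime(fields[1], "%Y-%m-%dT%H-%M-%SZ")
    return {
        "first_field": fields[0],
        "timestamp": stamp.replace(tzinfo=timezone.utc),
        "fields": fields,
    }


example = parse_recording_name(
    "[b827eb0a63c9][2020-08-20T11-29-04Z][manual][---]"
    "[de313d12d7f31937615be80cc47a1ad9][]-53.csv"
)
```

Since label CSV files share their names with the FLAC recordings apart from the extension, the same parser can be applied to both.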

Label taxonomy

Our label taxonomy is derived from the taxonomy used in the SONYC-UST datasets, but has been adapted to fit the local (Singapore) context while retaining compatibility with the SONYC-UST ontology. We chose this taxonomy to allow the SINGA:PURA dataset to be used in conjunction with the SONYC-UST datasets when training urban sound tagging models by simply omitting the labels that are absent in the SONYC-UST taxonomy from the recordings in the SINGA:PURA dataset. For more information regarding the SONYC-UST datasets, please refer to the following paper published by the SONYC team:

M. Cartwright, J. Cramer, A. E. M. Mendez, Y. Wang, H. Wu, V. Lostanlen, M. Fuentes, G. Dove, C. Mydlarz, J. Salamon, O. Nov, J. P. Bello, "SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context," in Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020.

Specifically, our label taxonomy consists of 14 coarse-grained classes and 40 fine-grained classes. Their organisation is as follows:

─┬─ 1. Engine ───────────────┬─ 1. Small engine
 │                           ├─ 2. Medium engine
 │                           └─ 3. Large engine
 ├─ 2. Machinery impact ─────┬─ 1. Rock drill
 │                           ├─ 2. Jackhammer
 │                           ├─ 3. Hoe ram
 │                           └─ 4. Pile driver
 ├─ 3. Non-machinery impact ─┬─ 1. Glass breaking*
 │                           ├─ 2. Car crash*
 │                           └─ 3. Explosion*
 ├─ 4. Powered saw ──────────┬─ 1. Chainsaw
 │                           ├─ 2. Small/medium rotating saw
 │                           └─ 3. Large rotating saw
 ├─ 5. Alert signal ─────────┬─ 1. Car horn
 │                           ├─ 2. Car alarm
 │                           ├─ 3. Siren
 │                           └─ 4. Reverse beeper
 ├─ 6. Music ────────────────┬─ 1. Stationary music
 │                           └─ 2. Mobile music
 ├─ 7. Human voice ──────────┬─ 1. Talking
 │                           ├─ 2. Shouting
 │                           ├─ 3. Large crowd
 │                           ├─ 4. Amplified speech
 │                           └─ 5. Singing*
 ├─ 8. Human movement* ──────┬─ 1. Footsteps*
 │                           └─ 2. Clapping*
 ├─ 9. Animal* ──────────────┬─ 1. Dog barking
 │                           ├─ 2. Bird chirping*
 │                           └─ 3. Insect chirping*
 ├─ 10. Water* ──────────────── 1. Hose pump*
 ├─ 11. Weather* ────────────┬─ 1. Rain*
 │                           ├─ 2. Thunder*
 │                           └─ 3. Wind*
 ├─ 12. Brake* ──────────────┬─ 1. Friction brake*
 │                           └─ 2. Exhaust brake*
 ├─ 13. Train* ──────────────── 1. Electric train*
 └─ 0. Others* ──────────────┬─ 1. Screeching*
                             ├─ 2. Plastic crinkling*
                             ├─ 3. Cleaning*
                             └─ 4. Gear*

Classes marked with an asterisk (*) are present in the SINGA:PURA taxonomy but not the SONYC taxonomy. The "Ice cream truck" class from the SONYC taxonomy has been excluded from the SINGA:PURA taxonomy because this class does not exist in the local context.

In addition, note that the label for the coarse-grained class "Others" in this repository is "0", which is different from the label "X" that is used in the full version of the SINGA:PURA dataset.
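To illustrate the idea of omitting SINGA:PURA-only labels before combining with SONYC-UST data, here is a rough sketch. The set of (coarse, fine) pairs is transcribed from the asterisked fine-grained classes in the taxonomy above, using "0" for "Others" as in this repository. The event representation (dictionaries with coarse and fine keys) is hypothetical and does not reflect the actual layout of the labels CSV files. The sketch decides at the fine-grained level only; how the remaining class numbers map onto SONYC-UST identifiers is not covered.

```python
# Fine-grained classes marked with an asterisk in the taxonomy,
# written as (coarse_id, fine_id). "Others" uses coarse id 0.
SINGAPURA_ONLY = {
    (3, 1), (3, 2), (3, 3),
    (7, 5),
    (8, 1), (8, 2),
    (9, 2), (9, 3),
    (10, 1),
    (11, 1), (11, 2), (11, 3),
    (12, 1), (12, 2),
    (13, 1),
    (0, 1), (0, 2), (0, 3), (0, 4),
}


def drop_singapura_only(events):
    """Keep only events whose class is not marked as SINGA:PURA-specific.

    `events` is a list of dicts with hypothetical keys "coarse" and "fine".
    """
    return [
        event for event in events
        if (event["coarse"], event["fine"]) not in SINGAPURA_ONLY
    ]


# Made-up events for illustration.
toy_events = [
    {"coarse": 1, "fine": 2},   # Medium engine: kept
    {"coarse": 11, "fine": 1},  # Rain*: dropped
    {"coarse": 9, "fine": 1},   # Dog barking: kept
    {"coarse": 7, "fine": 5},   # Singing*: dropped
]
kept = drop_singapura_only(toy_events)
```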

Metadata CSV file

Each row of "labelled_metadata_public.csv" corresponds to a single recording and contains the following fields:

"sensor_id": A string representing the identity of the sensor that the recording was taken from. Each sensor node has a unique identity. In other words, two recordings were taken from different sensors if and only if their "sensor_id" strings differ.

"filename": The name of the raw audio file corresponding to this row of metadata. Note that there is actually a timestamp on the
Singapore SG: Net Migration
ceicdata.com
Cite
CEICdata.com, Singapore SG: Net Migration [Dataset]. https://www.ceicdata.com/en/singapore/population-and-urbanization-statistics/sg-net-migration
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 1962 - Dec 1, 2012
Area covered
Singapore
Variables measured
Population
Description
Singapore SG: Net Migration data was reported at 298,448.000 Person in 2017. This records a decrease from the previous number of 337,932.000 Person for 2012. Singapore SG: Net Migration data is updated yearly, averaging 193,369.000 Person from Dec 1962 (Median) to 2017, with 12 observations. The data reached an all-time high of 449,245.000 Person in 2007 and a record low of -363.000 Person in 1967. Singapore SG: Net Migration data remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Singapore – Table SG.World Bank.WDI: Population and Urbanization Statistics. Net migration is the net total of migrants during the period, that is, the total number of immigrants less the annual number of emigrants, including both citizens and noncitizens. Data are five-year estimates. Source: United Nations Population Division, World Population Prospects: 2017 Revision. Aggregation method: Sum.
Data from: Heart rate changes during partial seizures: A study amongst...
catalog.data.gov
data.virginia.gov
Updated Sep 6, 2025
National Institutes of Health (2025). Heart rate changes during partial seizures: A study amongst Singaporean patients [Dataset]. https://catalog.data.gov/dataset/heart-rate-changes-during-partial-seizures-a-study-amongst-singaporean-patients
Dataset updated
Sep 6, 2025
Dataset provided by
National Institutes of Health
Area covered
Singapore
Description
Introduction

Studies in Europe and America showed that tachycardia, less often bradycardia, frequently accompanied partial seizures in Caucasian patients. We determine frequency, magnitude and type of ictal heart rate changes during partial seizures in non-Caucasian patients in Singapore.

Methods

Partial seizures recorded during routine EEGs performed in a tertiary hospital between 1995 and 1999 were retrospectively reviewed. All routine EEGs had simultaneous ECG recording. Heart rate before and during seizures was determined and correlated with epileptogenic focus. Differences in heart rate before and during seizures were grouped into 4 types: (1) >10% decrease; (2) -10 to +20% change; (3) 20–50% increase; (4) >50% increase.

Results

Of the total of 37 partial seizures, 18 were left hemisphere (LH), 13 were right hemisphere (RH) and 6 were bilateral (BL) in onset. 51% of all seizures showed no significant change in heart rate (type 2), 22% had moderate sinus tachycardia (type 3), 11% showed severe sinus tachycardia (type 4), while 16% had sinus bradycardia (type 1). Asystole was recorded in one seizure. Apart from having more tachycardia in bilateral onset seizures, there was no correlation between side of ictal discharge and heart rate response. Compared to Caucasian patients, sinus tachycardia was considerably less frequent. Frequency of bradycardia was similar to that recorded in the literature.

Conclusions

Significant heart rate changes during partial seizures were seen in half of Singaporean patients. Although sinus tachycardia was the most common heart rate change, the frequency was considerably lower compared to Caucasian patients. This might be due to methodological and ethnic differences. Rates of bradycardia are similar to those recorded in the literature.
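The four heart rate change types described in the Methods can be illustrated with a short sketch. This is not the authors' code: the function names are hypothetical, and the handling of the exact boundaries (a change of exactly -10% or +20% is assigned to type 2) is an assumption, since the abstract does not say which type the boundary values belong to.

```python
def percent_change(before_bpm: float, during_bpm: float) -> float:
    """Relative change in heart rate during the seizure, in percent."""
    if before_bpm <= 0:
        raise ValueError("pre-seizure heart rate must be positive")
    return (during_bpm - before_bpm) / before_bpm * 100.0


def classify_heart_rate_change(before_bpm: float, during_bpm: float) -> int:
    """Return the change type (1 to 4) using the thresholds in the abstract.

    Assumption: boundary values of exactly -10% and +20% fall into type 2.
    """
    change = percent_change(before_bpm, during_bpm)
    if change < -10.0:
        return 1  # >10% decrease (sinus bradycardia)
    if change <= 20.0:
        return 2  # -10% to +20% (no significant change)
    if change <= 50.0:
        return 3  # 20-50% increase (moderate sinus tachycardia)
    return 4      # >50% increase (severe sinus tachycardia)


# Illustrative heart rates, not taken from the study.
for before, during in [(80, 70), (80, 88), (80, 104), (80, 128)]:
    print(before, during, classify_heart_rate_change(before, during))
```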
Global Financial Inclusion (Global Findex) Database 2011 - Singapore
microdata.worldbank.org
catalog.ihsn.org
Updated Apr 15, 2015
Development Research Group, Finance and Private Sector Development Unit (2015). Global Financial Inclusion (Global Findex) Database 2011 - Singapore [Dataset]. https://microdata.worldbank.org/index.php/catalog/1240
Dataset updated
Apr 15, 2015
Dataset authored and provided by
Development Research Group, Finance and Private Sector Development Unit
Time period covered
2011
Area covered
Singapore
Description
Abstract

Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.

The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.

Geographic coverage

National Coverage.

Analysis unit

Individual

Universe

The target population is the civilian, non-institutionalized population 15 years and above. The sample is nationally representative.

Kind of data

Sample survey data [ssd]

Sampling procedure

The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.

Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
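The choice between selection proportional to population size and simple random sampling can be sketched as follows. This is an illustration only, not Gallup's procedure: the function name and the unit labels are hypothetical, and the proportional branch draws with replacement for simplicity, which the description above does not specify.

```python
import random
from typing import Optional, Sequence


def select_sampling_units(
    units: Sequence[str],
    n: int,
    sizes: Optional[Sequence[float]] = None,
    rng: Optional[random.Random] = None,
) -> list:
    """Pick n primary sampling units.

    If population sizes are known, each draw selects a unit with probability
    proportional to its size (with replacement, an assumption of this sketch).
    Otherwise fall back to simple random sampling without replacement.
    """
    rng = rng or random.Random()
    if sizes is not None:
        return rng.choices(list(units), weights=list(sizes), k=n)
    return rng.sample(list(units), n)


# Hypothetical clusters and population counts.
clusters = ["cluster_a", "cluster_b", "cluster_c"]
print(select_sampling_units(clusters, 2, sizes=[1200, 300, 4500]))
print(select_sampling_units(clusters, 2))
```

Drawing without replacement under proportional-to-size probabilities needs a more careful scheme, such as systematic selection along the cumulative sizes, which is outside the scope of this sketch.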

Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.

The sample size in the majority of economies was 1,000 individuals.

Mode of data collection

Face-to-face [f2f]

Research instrument

The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.

Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.

Sampling error estimates

Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.
Immunization, measles (% of children ages 12-23 months) - Singapore
macro-rankings.com
csv, excel
Updated Oct 4, 2025
macro-rankings (2025). Immunization, measles (% of children ages 12-23 months) - Singapore [Dataset]. https://www.macro-rankings.com/singapore/immunization-measles-(-of-children-ages-12-23-months)
Available download formats
excel, csv
Dataset updated
Oct 4, 2025
Dataset authored and provided by
macro-rankings
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Singapore
Description
Time series data for the statistic Immunization, measles (% of children ages 12-23 months) and country Singapore.

Indicator definition: Child immunization, measles, measures the percentage of children ages 12-23 months who received the measles vaccination before 12 months or at any time before the survey. A child is considered adequately immunized against measles after receiving one dose of vaccine.

The indicator "Immunization, measles (% of children ages 12-23 months)" stands at 97.00 as of 12/31/2023, the highest value since 12/31/1998. The one-year change is an increase of 1.04 percent over the prior year's value. The 3-year change is 0.0 percent, the 5-year change is 0.0 percent, and the 10-year change is 2.11 percent. The series' long-term average value is 90.84; its latest available value, on 12/31/2023, is 6.78 percent higher than that average. The change from the series' minimum value, on 12/31/1980, to its latest available value is +106.38%. The change from its maximum value, on 12/31/1997, to its latest available value is -1.02%.
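The percentage comparisons in this description appear to be ordinary relative changes against a reference value. The sketch below assumes the formula (latest - reference) / reference × 100, which the source does not state explicitly; the function name is hypothetical. Applied to the two values quoted above (latest value 97.00 and long-term average 90.84), it reproduces the reported 6.78 percent.

```python
def percent_change(reference: float, latest: float) -> float:
    """Relative change of `latest` against `reference`, in percent.

    Assumed formula; the data provider does not document its calculation.
    """
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (latest - reference) / reference * 100.0


long_term_average = 90.84  # quoted in the description
latest_value = 97.00       # quoted in the description, as of 12/31/2023

print(round(percent_change(long_term_average, latest_value), 2))  # 6.78
```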
Employed Residents Aged 15 Years and Over by Occupation, Ethnic Group and...
data.gov.sg
Updated Nov 6, 2025
Singapore Department of Statistics (2025). Employed Residents Aged 15 Years and Over by Occupation, Ethnic Group and Sex (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_4d42b895bcd91b3efe4ed2c0e021b928/view
Dataset updated
Nov 6, 2025
Dataset authored and provided by
Singapore Department of Statistics
License
https://data.gov.sg/open-data-licence
Description
Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_4d42b895bcd91b3efe4ed2c0e021b928/view
Getting credit: Credit registry coverage (% of adults) - Singapore
macro-rankings.com
csv, excel
Updated Dec 31, 2004
macro-rankings (2004). Getting credit: Credit registry coverage (% of adults) - Singapore [Dataset]. https://www.macro-rankings.com/singapore/getting-credit-credit-registry-coverage-(-of-adults)
Available download formats
csv, excel
Dataset updated
Dec 31, 2004
Dataset authored and provided by
macro-rankings
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Singapore
Description
Time series data for the statistic Getting credit: Credit registry coverage (% of adults) and country Singapore. Indicator definition: The credit registry coverage reports the number of individuals and firms listed in a credit registry’s database as of January 1 with information on their borrowing history from the past five years, plus the number of individuals and firms that have had no borrowing history in the past five years but for which a lender requested a credit report from the registry in the previous calendar year. The number is expressed as a percentage of the adult population.

