65 datasets found
  1. Singapore Residents dataset

    • kaggle.com
    zip
    Updated Aug 28, 2019
    Cite
    Anuj_sahay (2019). Singapore Residents dataset [Dataset]. https://www.kaggle.com/anujsahay112/singapore-residents-dataset
    Available download formats: zip (116422 bytes)
    Dataset updated
    Aug 28, 2019
    Authors
    Anuj_sahay
    Area covered
    Singapore
    Description

    Context

    This dataset is meant to reflect real-world data science work and how data analysts and data scientists approach it.

    Content

    The dataset consists of four columns: Year, Level_1 (ethnic group/gender), Level_2 (age group), and population.

    Acknowledgements

    I sincerely thank GeoIQ for sharing this dataset with me, along with the tasks. Basic knowledge of Pandas, NumPy and other Python data science libraries is not enough: knowing how to execute tasks and how to preprocess the data before making any prediction is very important. Most datasets on Kaggle are clean and well arranged, but this dataset taught me how real-world data science and analysis works. Every data science beginner should work on this dataset and try to execute the tasks; it will give them good exposure to the real data science world.

    Inspiration

    1. Identify the largest ethnic group in Singapore, its average population growth over the years, and the proportion of the total population it constitutes.
    2. Identify the largest age group in Singapore, its average population growth over the years, and the proportion of the total population it constitutes.
    3. Identify the group (by age, ethnicity and gender) that: a. has shown the highest growth rate; b. has shown the lowest growth rate; c. has remained the same.
    4. Plot a graph of population trends.
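
    A minimal sketch of tasks 1 and 2, using only the Python standard library. It assumes rows shaped like the four columns described under Content (Year, Level_1, Level_2, population); the exact header spellings, the group labels and the toy numbers are illustrative assumptions, not values from the dataset.

```python
from collections import defaultdict

# Toy rows for illustration only; labels and numbers are invented.
rows = [
    {"Year": 2000, "Level_1": "Group A", "Level_2": "0 - 4 Years", "population": 100},
    {"Year": 2000, "Level_1": "Group B", "Level_2": "0 - 4 Years", "population": 40},
    {"Year": 2001, "Level_1": "Group A", "Level_2": "0 - 4 Years", "population": 110},
    {"Year": 2001, "Level_1": "Group B", "Level_2": "0 - 4 Years", "population": 42},
    {"Year": 2002, "Level_1": "Group A", "Level_2": "0 - 4 Years", "population": 121},
    {"Year": 2002, "Level_1": "Group B", "Level_2": "0 - 4 Years", "population": 42},
]


def yearly_totals(rows, key):
    """Sum population per (group, year) for the given grouping column."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        totals[row[key]][row["Year"]] += row["population"]
    return totals


def summarize(rows, key):
    """Return (largest group, its average yearly growth rate, its share of the total)."""
    totals = yearly_totals(rows, key)
    # "Largest" here means the largest population summed over all years.
    largest = max(totals, key=lambda g: sum(totals[g].values()))
    series = [totals[largest][y] for y in sorted(totals[largest])]
    # Year-over-year growth rates, skipping years with a zero base.
    rates = [(b - a) / a for a, b in zip(series, series[1:]) if a]
    avg_growth = sum(rates) / len(rates) if rates else 0.0
    grand_total = sum(sum(by_year.values()) for by_year in totals.values())
    share = sum(series) / grand_total
    return largest, avg_growth, share


group, growth, share = summarize(rows, "Level_1")
print(group, round(growth, 3), round(share, 3))
```

    Passing "Level_2" instead of "Level_1" gives the same summary per age group. If the real file contains aggregate rows (for example totals across groups), they would need to be filtered out first, otherwise they would be picked as the largest group.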
  2. Resident Population Born Outside Singapore by Age Group, Ethnic Group and...

    • data.gov.sg
    Updated Nov 5, 2025
    + more versions
    Cite
    Singapore Department of Statistics (2025). Resident Population Born Outside Singapore by Age Group, Ethnic Group and Sex (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_61ef44ab621ed1ef5592be1ab19b48fe/view
    Dataset updated
    Nov 5, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Area covered
    Singapore
    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_61ef44ab621ed1ef5592be1ab19b48fe/view

  3. Resident Population Aged 5 Years and Over by Language Most / Second Most...

    • data.gov.sg
    Updated Nov 5, 2025
    + more versions
    Cite
    Singapore Department of Statistics (2025). Resident Population Aged 5 Years and Over by Language Most / Second Most Frequently Spoken at Home, Age Group and Ethnic Group (All Ethnic Groups) (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_ad4a8ccbdab03d16c486a9ee6988289d/view
    Dataset updated
    Nov 5, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_ad4a8ccbdab03d16c486a9ee6988289d/view

  4. New Events Data in Singapore

    • kaggle.com
    zip
    Updated Sep 14, 2024
    Cite
    Techsalerator (2024). New Events Data in Singapore [Dataset]. https://www.kaggle.com/datasets/techsalerator/new-events-data-in-singapore
    Available download formats: zip (4948 bytes)
    Dataset updated
    Sep 14, 2024
    Authors
    Techsalerator
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Area covered
    Singapore
    Description

    Techsalerator's News Events Data for Singapore: A Comprehensive Overview

    Techsalerator's News Events Data for Singapore offers a powerful resource for businesses, researchers, and media organizations. This dataset compiles information on significant news events across Singapore, pulling from a wide range of media sources, including news outlets, online publications, and social platforms. It provides valuable insights for those looking to track trends, analyze public sentiment, or monitor industry-specific developments.

    Key Data Fields

    • Event Date: Captures the exact date of the news event. This is crucial for analysts who need to monitor trends over time or for businesses responding to market shifts.
    • Event Title: A brief headline describing the event. This allows users to quickly categorize and assess news content based on relevance to their interests.
    • Source: Identifies the news outlet or platform where the event was reported. This helps users track credible sources and assess the reach and influence of the event.
    • Location: Provides geographic information, indicating where the event took place within Singapore. This is especially valuable for regional analysis or localized marketing efforts.
    • Event Description: A detailed summary of the event, outlining key developments, participants, and potential impact. Researchers and businesses use this to understand the context and implications of the event.

    Top 5 News Categories in Singapore

    • Politics: Major news coverage on government decisions, political movements, elections, and policy changes that affect the national landscape.
    • Economy: Focuses on Singapore’s economic indicators, inflation rates, international trade, and corporate activities influencing business and finance sectors.
    • Social Issues: News events covering public health, education, and other societal concerns that drive public discourse.
    • Sports: Highlights events in popular sports such as soccer, swimming, and table tennis, often drawing widespread attention and engagement.
    • Technology and Innovation: Reports on tech developments, startups, and innovations in Singapore’s thriving tech ecosystem, featuring emerging companies and advancements.

    Top 5 News Sources in Singapore

    • The Straits Times: A leading news outlet providing comprehensive coverage of national politics, economy, and social issues.
    • Channel News Asia: A major news platform known for its timely updates on breaking news, politics, and current affairs.
    • The Business Times: A widely-read newspaper offering insights into economic developments, business news, and corporate activities.
    • TODAY: A significant news source covering a broad spectrum of topics, including politics, economy, and social issues.
    • Channel 8 News: The national news channel delivering updates on significant events, public health, and sports across Singapore.

    Accessing Techsalerator’s News Events Data for Singapore

    To access Techsalerator’s News Events Data for Singapore, please contact info@techsalerator.com with your specific needs. We will provide a customized quote based on the data fields and records you require, with delivery available within 24 hours. Ongoing access options can also be discussed.

    Included Data Fields

    • Event Date
    • Event Title
    • Source
    • Location
    • Event Description
    • Event Category (Politics, Economy, Sports, etc.)
    • Participants (if applicable)
    • Event Impact (Social, Economic, etc.)

    Techsalerator’s dataset is an invaluable tool for keeping track of significant events in Singapore. It aids in making informed decisions, whether for business strategy, market analysis, or academic research, providing a clear picture of the country’s news landscape.

  5. Singapore Residents By Age Group, Ethnic Group And Sex, At End June, Annual

    • data.gov.sg
    Updated Nov 1, 2025
    + more versions
    Cite
    Singapore Department of Statistics (2025). Singapore Residents By Age Group, Ethnic Group And Sex, At End June, Annual [Dataset]. https://data.gov.sg/datasets/d_3cf667d761b4bdc6d4d3d3aeec37dea5/view
    Dataset updated
    Nov 1, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Time period covered
    Jan 1957 - Dec 2025
    Area covered
    Singapore
    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_3cf667d761b4bdc6d4d3d3aeec37dea5/view

  6. Resident Population Born In Singapore by Age Group, Ethnic Group and Sex...

    • data.gov.sg
    Updated Nov 17, 2025
    + more versions
    Cite
    Singapore Department of Statistics (2025). Resident Population Born In Singapore by Age Group, Ethnic Group and Sex (Census of Population 2010) [Dataset]. https://data.gov.sg/datasets/d_8f1404e56fa6ef520b901a4b51062ee6/view
    Dataset updated
    Nov 17, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Area covered
    Singapore
    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_8f1404e56fa6ef520b901a4b51062ee6/view

  7. Data from: The National University of Singapore SMS Corpus

    • kaggle.com
    zip
    Updated Aug 7, 2017
    Cite
    Rachael Tatman (2017). The National University of Singapore SMS Corpus [Dataset]. https://www.kaggle.com/datasets/rtatman/the-national-university-of-singapore-sms-corpus/discussion
    Available download formats: zip (3788449 bytes)
    Dataset updated
    Aug 7, 2017
    Authors
    Rachael Tatman
    Area covered
    Singapore
    Description

    Context:

    Short Message Service (SMS) messages are short messages sent from one person to another from their mobile phones. They represent a means of personal communication that is an important communicative artifact in our current digital era. This dataset contains SMS messages that were collected from users who knew they were participating in a research project and that their messages would be shared publicly. This dataset contains SMS messages in two languages: Singapore English and Mandarin Chinese.

    Content:

    This is a corpus of SMS (Short Message Service) messages collected for research at the Department of Computer Science at the National University of Singapore. This dataset consists of 67,093 SMS messages taken from the corpus on Mar 9, 2015. The messages largely originate from Singaporeans and mostly from students attending the University. These messages were collected from volunteers who were made aware that their contributions were going to be made publicly available. The data collectors opportunistically collected as much metadata about the messages and their senders as possible, so as to enable different types of analyses.

    Acknowledgements:

    This corpus was collected by Tao Chen and Min-Yen Kan. If you use this data, please cite the following paper:

    Tao Chen and Min-Yen Kan (2013). Creating a Live, Public Short Message Service Corpus: The NUS SMS Corpus. Language Resources and Evaluation, 47(2), pages 299-355. URL: https://link.springer.com/article/10.1007%2Fs10579-012-9197-9

    Inspiration:

    This dataset contains a lot of short, informal texts and is ideal for trying your hand at various natural language processing tasks. There’s also a lot of information about the messages which might reveal interesting insights. Here are some ideas to get you started:

    • This dataset contains Singapore English. How well do tools trained on other varieties of English, like stemmers or part of speech taggers, work on it?
    • What time of day are most SMS messages sent? Is this different for the English and Mandarin datasets?
    • Unlike English, Mandarin does not have spaces between words, which can be made up of several characters. Can you build or implement a system for word identification?
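
    As a starting point for the word identification idea, here is a minimal sketch of forward maximum matching, a simple dictionary-based baseline for segmenting text written without spaces. It is not a method used by the corpus authors; the function name, the `max_word_len` parameter and the tiny vocabulary are assumptions made for this illustration.

```python
def forward_max_match(text, vocabulary, max_word_len=4):
    """Greedy left-to-right segmentation.

    At each position, take the longest substring found in the vocabulary,
    falling back to a single character when nothing matches.
    """
    words = []
    i = 0
    while i < len(text):
        longest = min(max_word_len, len(text) - i)
        for length in range(longest, 0, -1):
            candidate = text[i:i + length]
            if length == 1 or candidate in vocabulary:
                words.append(candidate)
                i += length
                break
    return words


# Toy vocabulary, invented for this example.
vocabulary = {"我们", "今天", "晚上", "吃饭", "吃"}
print(forward_max_match("我们今天晚上吃饭", vocabulary))
```

    A greedy matcher like this cannot resolve ambiguous segmentations and depends entirely on the vocabulary, so it is best treated as a baseline to compare statistical or neural segmenters against.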
  8. 🐎 Hong Kong Jockey Club and Singapore TurfClub

    • kaggle.com
    zip
    Updated Oct 15, 2024
    Cite
    mexwell (2024). 🐎 Hong Kong Jockey Club and Singapore TurfClub [Dataset]. https://www.kaggle.com/datasets/mexwell/hong-kong-jockey-club-and-singapore-turfclub/data
    Available download formats: zip (41690560 bytes)
    Dataset updated
    Oct 15, 2024
    Authors
    mexwell
    Area covered
    Singapore
    Description

    Foreword

    Gambling is bad, m'kay.

    This repository provides horse race data for the Hong Kong Jockey Club and the Singapore Turf Club. The data was obtained by scraping their respective public websites, and comes with no guarantee of correctness whatsoever.

    A particularly cool thing is that we also provide historical odds for a period of time for HKJC races. Being able to predict the final odds for a given horse in a given race is extremely valuable, but historical data are, as far as we know, not publicly available. We thus wrote a scraper that ran for 2 seasons and probed the odds at regular intervals up to the race start. This allows for cool time series analysis that can't be done with the historical data available on the public websites.

    The dataset is provided as a set of compressed CSV files that can easily be loaded into a database of your choice, a pandas dataframe, or even Excel if you don't know any better. The HKJC website is just a little less crappy than the TurfClub one, so in general HK data contains more information than its Singaporean counterpart.

    Original Data

    Dataset

    horses

    List of all the horses (some retired) for HKJC and SGTC that ran a race, up to 2018-07-01.

    performances

    Each row of this table is the result for a single horse in a single race, with their position, final odds (for first place -- more explicit dividends can be found in the all_dividends table for HK races). This is the main source of information for the statistics you want. Note that some races found in the performance table do NOT have their counterpart in the races table.

    This contains historical results from 1979 up to 2018-06-27 for Hong Kong, and 2002-03-08 to 2018-04-24 for Singapore.

    races

    List of all the races run between 2016-09-28 and 2018-06-27 for Hong Kong and 2016-09-25 to 2018-04-24 for Singapore. Note that some races not found in this table still have available performances in the performances table.

    all_dividends

    Each row of this table contains the JSON-encoded dividend results (which can be used to infer the final odds) for each race run in Hong Kong between 2016-09-28 and 2018-06-27.

    sectional_times

    Each row contains the sectional times for races run between 2008-06-05 and 2018-06-27: for a given horse in a given race, their placing and time at a given section of the track.

    live_odds

    Live odds evolution for Hong Kong races run between 2016-09-27 and 2018-06-27. HKJC is a "pari-mutuel" system where odds for a given horse / bet evolve up to the start of the race. This dataset was collected by polling the odds at various intervals before a race (with the interval getting smaller as the race got closer, since that's when the odds tend to vary the most). As far as we can tell, this kind of information cannot be found in historical datasets, and can only be collected in real time.
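
    To illustrate why pari-mutuel odds keep moving until the race starts, here is a minimal sketch of how win odds follow from the money in the pool. The function name, the horse labels, the pool amounts and the 20% takeout are all assumptions for illustration; the actual HKJC takeout and rounding rules are not stated in this description, and the live_odds table may store odds in a different form.

```python
def parimutuel_win_odds(pool_by_horse, takeout):
    """Return decimal win odds per horse from the money bet on each horse.

    Winners share the pool that remains after the operator's cut, so the
    payout per unit staked is (net pool) / (amount bet on that horse).
    """
    total = sum(pool_by_horse.values())
    net_pool = total * (1.0 - takeout)
    return {
        horse: (net_pool / amount if amount > 0 else float("inf"))
        for horse, amount in pool_by_horse.items()
    }


# Two hypothetical snapshots of the same pool, taken before the race.
early = {"A": 500.0, "B": 300.0, "C": 200.0}
late = {"A": 900.0, "B": 400.0, "C": 700.0}

for label, snapshot in (("early", early), ("late", late)):
    odds = parimutuel_win_odds(snapshot, takeout=0.20)
    print(label, {horse: round(value, 2) for horse, value in odds.items()})
```

    Because every new bet changes both the total pool and one horse's share of it, the odds for all horses shift together, which is what makes a time series of odds snapshots informative.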

    Acknowledgement

    Photo by Gene Devine on Unsplash

  9. Resident Households by Ethnic Group, Labour Force Status and Sex of...

    • data.gov.sg
    Updated Nov 6, 2025
    + more versions
    Cite
    Singapore Department of Statistics (2025). Resident Households by Ethnic Group, Labour Force Status and Sex of Household Reference Person (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_5c1b7f454248e9bb20bc5959eea5928b/view
    Dataset updated
    Nov 6, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_5c1b7f454248e9bb20bc5959eea5928b/view

  10. Singapore-music-classifier

    • kaggle.com
    zip
    Updated Nov 13, 2020
    Cite
    Fajilatun Nahar (2020). Singapore-music-classifier [Dataset]. https://www.kaggle.com/fajilatunnahar/singaporemusicclassifier
    Available download formats: zip (1534107746 bytes)
    Dataset updated
    Nov 13, 2020
    Authors
    Fajilatun Nahar
    Area covered
    Singapore
    Description

    Context

    A dataset to classify Chinese, Malay, Hindi and Tamil songs

    Content

    The songs were downloaded from Spotify as 30-second clips. Low-level and high-level features were extracted using openSMILE and Essentia, respectively. Mel spectrogram images were also extracted using librosa.

    Dataset Description

    • Dataset 1: 260 low-level features
    • Dataset 2: 127 high-level features
    • Dataset 3: Combination of 387 high and low level features
    • Dataset 4: 260 low-level features, each row in the dataset is a 5 second frame of a song.
    • Dataset 5: 1820 low-level features (feature space increased with statistical metrics: mean, minimum, maximum, variance, skewness, kurtosis)
    • Dataset 6: 111 low-level features with feature selection (wrapper method)
    • Dataset 7: 82 high-level features with feature selection (wrapper method)
    • Dataset 8: 182 low-level features with feature selection (filter method)
    • Dataset 9: 67 high-level features with feature selection (filter method)
    • Dataset 10: 92 low-level common features from datasets 6 and 8
    • Dataset 11: 49 high-level common features from datasets 7 and 9
    • Dataset 12: Mel Spectrogram images
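
    The statistical metrics listed for Dataset 5 can be sketched as follows for one feature measured over the frames of one song. This is only an illustration of the six statistics: how the authors actually derived Dataset 5 (for example which frames were used, or whether population or sample variance was taken) is not stated here, and the function name and toy values are assumptions.

```python
from statistics import fmean


def summary_stats(values):
    """Mean, min, max, population variance, skewness and excess kurtosis."""
    n = len(values)
    mean = fmean(values)
    # Central moments of order 2, 3 and 4.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    # Skewness and kurtosis are undefined for a constant series; report 0.0.
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "variance": m2,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Hypothetical values of one low-level feature across the frames of one song.
frames = [1.0, 2.0, 3.0, 4.0, 5.0]
print(summary_stats(frames))
```

    Applying such a function to every frame-level feature turns a variable number of frames into a fixed-length vector per song, which is one common reason for expanding a feature space this way.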

    Inspiration

    • What is the best classification accuracy?
    • Can we identify exclusive traits of that ethnic group via the songs?
  11. Twitter bot profiling

    • researchdata.smu.edu.sg
    • smu.edu.sg
    • +1 more
    pdf
    Updated May 31, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Living Analytics Research Centre (2023). Twitter bot profiling [Dataset]. http://doi.org/10.25440/smu.12062706.v1
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    SMU Research Data Repository (RDR)
    Authors
    Living Analytics Research Centre
    License

    http://rightsstatements.org/vocab/InC/1.0/

    Description

    This dataset comprises a set of Twitter accounts in Singapore that are used for social bot profiling research conducted by the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Here a bot is defined as a Twitter account that generates contents and/or interacts with other users automatically (at least according to human judgment). In this research, Twitter bots have been categorized into three major types:

    • Broadcast bot. This bot aims at disseminating information to a general audience by providing, e.g., benign links to news, blogs or sites. Such a bot is often managed by an organization or a group of people (e.g., bloggers).
    • Consumption bot. The main purpose of this bot is to aggregate content from various sources and/or provide update services (e.g., horoscope reading, weather update) for personal consumption or use.
    • Spam bot. This type of bot posts malicious content (e.g., to trick people by hijacking certain accounts or redirecting them to malicious sites), or aggressively promotes harmless but invalid/irrelevant content.

    This categorization is general enough to cater for new, emerging types of bot (e.g., chatbots can be viewed as a special type of broadcast bots).

    The dataset was collected from 1 January to 30 April 2014 via the Twitter REST and streaming APIs. Starting from popular seed users (i.e., users having many followers), their follow, retweet, and user mention links were crawled. The data collection proceeds by adding those followers/followees, retweet sources, and mentioned users who state Singapore in their profile location. Using this procedure, a total of 159,724 accounts have been collected.

    To identify bots, the first step is to check active accounts who tweeted at least 15 times within the month of April 2014. These accounts were then manually checked and labelled, of which 589 bots were found. As many more human users are expected in the Twitter population, the remaining accounts were randomly sampled and manually checked. With this, 1,024 human accounts were identified. In total, this results in 1,613 labelled accounts.

    Related Publication: R. J. Oentaryo, A. Murdopo, P. K. Prasetyo, and E.-P. Lim. (2016). On profiling bots in social media. Proceedings of the International Conference on Social Informatics (SocInfo’16), 92-109. Bellevue, WA. https://doi.org/10.1007/978-3-319-47880-7_6
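
    The activity filter described above (at least 15 tweets within April 2014) can be sketched as below. The input format, a list of (account id, timestamp) pairs, and the function name are assumptions for illustration; the description does not specify how the crawled tweets are stored.

```python
from collections import Counter
from datetime import datetime


def active_accounts(tweets, year=2014, month=4, min_tweets=15):
    """Return the ids of accounts with at least `min_tweets` tweets in the given month."""
    counts = Counter(
        account
        for account, timestamp in tweets
        if timestamp.year == year and timestamp.month == month
    )
    return {account for account, n in counts.items() if n >= min_tweets}


# Hypothetical tweets: "a" posts 15 times in April, "b" only 14 times (plus 5 in March).
tweets = (
    [("a", datetime(2014, 4, day)) for day in range(1, 16)]
    + [("b", datetime(2014, 4, day)) for day in range(1, 15)]
    + [("b", datetime(2014, 3, day)) for day in range(1, 6)]
)
print(active_accounts(tweets))
```

    Only the accounts returned by such a filter would go on to the manual checking and labelling step.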

  12. United States Imports from Singapore

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jun 3, 2017
    Cite
    TRADING ECONOMICS (2017). United States Imports from Singapore [Dataset]. https://tradingeconomics.com/united-states/imports-from-singapore
    Available download formats: csv, excel, xml, json
    Dataset updated
    Jun 3, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 31, 1985 - Feb 29, 2024
    Area covered
    United States
    Description

    Imports from Singapore in the United States increased to 3384.61 USD Million in February from 3272.07 USD Million in January of 2024. This dataset includes a chart with historical data for the United States Imports from Singapore.

  13. Singapore SG: Diabetes Prevalence: % of Population Aged 20-79

    • ceicdata.com
    Cite
    CEICdata.com, Singapore SG: Diabetes Prevalence: % of Population Aged 20-79 [Dataset]. https://www.ceicdata.com/en/singapore/health-statistics/sg-diabetes-prevalence--of-population-aged-2079
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2017
    Area covered
    Singapore
    Description

    Singapore SG: Diabetes Prevalence: % of Population Aged 20-79 was reported at 10.990 % in 2017. The series is updated yearly and has a single observation (Dec 2017), so its average and median are also 10.990 %. The data remains in active status in CEIC and is reported by the World Bank. It is categorized under Global Database’s Singapore – Table SG.World Bank.WDI: Health Statistics. Diabetes prevalence refers to the percentage of people aged 20-79 who have type 1 or type 2 diabetes. Source: International Diabetes Federation, Diabetes Atlas. Aggregation method: weighted average.

  14. SINGA:PURA (SINGApore: Polyphonic URban Audio)

    • zenodo.org
    bin, csv, pdf, zip
    Updated Jul 17, 2024
    Cite
    Kenneth Ooi; Karn N. Watcharasupat; Santi Peksi; Furi Andi Karnapi; Zhen-Ting Ong; Danny Chua; Hui-Wen Leow; Li-Long Kwok; Xin-Lei Ng; Zhen-Ann Loh; Woon-Seng Gan (2024). SINGA:PURA (SINGApore: Polyphonic URban Audio) [Dataset]. http://doi.org/10.5281/zenodo.5645825
    Available download formats: bin, zip, pdf, csv
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kenneth Ooi; Karn N. Watcharasupat; Santi Peksi; Furi Andi Karnapi; Zhen-Ting Ong; Danny Chua; Hui-Wen Leow; Li-Long Kwok; Xin-Lei Ng; Zhen-Ann Loh; Woon-Seng Gan
    Area covered
    Singapore
    Description

    SINGA:PURA Dataset (v1.0a)

    This repository contains the strongly-labelled subset of recordings of the SINGA:PURA (SINGApore: Polyphonic URban Audio) dataset and corresponding metadata, formatted in a manner compatible with a soundata dataset loader.

    Please note that this repository does not contain the unlabelled recordings of the SINGA:PURA dataset! If you wish to access the unlabelled recordings, please refer to https://doi.org/10.21979/N9/Y8UQ6F for the full version (v1.0) of the SINGA:PURA dataset (which contains both the strongly-labelled and unlabelled recordings).

    Regarding this repository

    The SINGA:PURA dataset is a polyphonic urban sound dataset with spatiotemporal context that contains 6547 strongly-labelled and 72406 unlabelled recordings from a wireless acoustic sensor network deployed in Singapore to identify and mitigate noise sources in Singapore. However, this repository only contains the subset of 6547 strongly-labelled recordings from the SINGA:PURA dataset and their corresponding labels, formatted in a manner compatible with a soundata dataset loader. The recordings are all 10 seconds in length, and may have 1 or 7 channels, depending on the recording device used to record them.

    The readme file in this repository ("Readme.md") contains the same information as this description: a short description of the organisation of this repository, as well as our label taxonomy and the dataset itself. For full details regarding the sensor units used, the recording conditions, and annotation methodology, please refer to our conference paper below:

    K. Ooi, K. N. Watcharasupat, S. Peksi, F. A. Karnapi, Z.-T. Ong, D. Chua, H.-W. Leow, L.-L. Kwok, X.-L. Ng, Z.-A. Loh, W.-S. Gan, "A Strongly-Labelled Polyphonic Dataset of Urban Sounds with Spatiotemporal Context," in 13th Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2021.

    The conference paper has also been included in this repository as "APSIPA.pdf".

    Directory structure

    This repository contains a total of 9 files. 5 of the files ("labelled.zip", "labelled.z01", "labelled.z02", "labelled.z03", "labelled.z04") form a multi-part ZIP archive that, when extracted, contain the subset of 6547 strongly-labelled recordings (in FLAC format) in the SINGA:PURA dataset organised in folders by date of recording. The other 4 files are:

    • "APSIPA.pdf": A PDF copy of the conference paper describing the dataset, recording and annotation methodology in detail.
    • "labelled_metadata_public.csv": A CSV file containing the metadata for the 6547 strongly-labelled recordings. Each row corresponds to a single recording. See the section titled "Metadata CSV file" for more information.
    • "labels_public.zip": A ZIP archive that, when extracted, contains 6547 CSV files that each contain the strong labels for their corresponding strongly-labelled recording. The names of the CSV files are identical to the names of the corresponding FLAC files containing the recordings, save for the file extension. Each row corresponds to a single acoustic event. See "Labels CSV files" for more information.
    • "Readme.md": The readme file for this repository.
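
    Since each label CSV shares its name with the corresponding FLAC file apart from the extension, recordings and labels can be paired by file name. The sketch below assumes the archives have already been extracted into "labelled" and "labels_public" folders under a common root; the function name and the returned structure are illustrative choices, not part of the dataset's tooling.

```python
from pathlib import Path


def pair_recordings_with_labels(root):
    """Map each .flac under <root>/labelled to its CSV under <root>/labels_public.

    Returns (pairs, missing): a dict of FLAC path -> CSV path, and a list of
    FLAC paths for which no label file was found.
    """
    root = Path(root)
    pairs = {}
    missing = []
    for flac in sorted((root / "labelled").rglob("*.flac")):
        # Swap only the final extension; the rest of the name is kept verbatim,
        # including any brackets and dots it contains.
        csv_name = flac.name[: -len(".flac")] + ".csv"
        csv_path = root / "labels_public" / csv_name
        if csv_path.exists():
            pairs[flac] = csv_path
        else:
            missing.append(flac)
    return pairs, missing
```

    Calling `pair_recordings_with_labels(".")` from the extracted repository root would return the matched pairs together with any recordings lacking a label file, which doubles as a quick completeness check after extraction.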

    Each numbered part of the multi-part ZIP archive is 1000 MB in size, which makes the dataset in its entirety about 5 GB in size. Please ensure that your connection has sufficient bandwidth to support the download, and it may also be useful to use a download manager for downloading the individual files of the dataset. To extract the multi-part ZIP archive, it may be helpful to use either WinRAR or WinZip.

    After extraction, the directory structure of this repository should be as follows:

    .
    ├─ labelled
    │ ├─ 2020-08-03
    │ │ └─ [b827eb7d576e][2020-08-03T23-32-11Z][manual][---][565a40f866f3d2804332ca7896a4c77d][93.29-86.29 66.65]!-90.flac
    │ │
    │ ├─ 2020-08-17
    │ │ └─ <.flac files>
    │ │
    │ ├─ ...
    │ │
    │ └─ 2020-10-31
    │   └─ <.flac files>
    │
    ├─ labels_public
    │ ├─ [b827eb0a63c9][2020-08-20T11-29-04Z][manual][---][de313d12d7f31937615be80cc47a1ad9][]-53.csv
    │ ├─ [b827eb0a63c9][2020-08-20T11-30-04Z][manual][---][de313d12d7f31937615be80cc47a1ad9][]-54.csv
    │ ├─ ...
    │ └─ [b827ebf3744c][2020-09-02T06-53-04Z][manual][---][4edbade2d41d5f80e324ee4f10d401c0][]-1647.csv
    │
    ├─ APSIPA.pdf
    ├─ labelled_metadata_public.csv
    └─ Readme.md

    Label taxonomy

    Our label taxonomy is derived from the taxonomy used in the SONYC-UST datasets, but has been adapted to fit the local (Singapore) context while retaining compatibility with the SONYC-UST ontology. We chose this taxonomy to allow the SINGA:PURA dataset to be used in conjunction with the SONYC-UST datasets when training urban sound tagging models by simply omitting the labels that are absent in the SONYC-UST taxonomy from the recordings in the SINGA:PURA dataset. For more information regarding the SONYC-UST datasets, please refer to the following paper published by the SONYC team:

    M. Cartwright, J. Cramer, A. E. M. Mendez, Y. Wang, H. Wu, V. Lostanlen, M. Fuentes, G. Dove, C. Mydlarz, J. Salamon, O. Nov, J. P. Bello, "SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context," in Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020.

    Specifically, our label taxonomy consists of 14 coarse-grained classes and 40 fine-grained classes. Their organisation is as follows:

    ─┬─ 1. Engine ───────────────┬─ 1. Small engine
     │                           ├─ 2. Medium engine
     │                           └─ 3. Large engine
     ├─ 2. Machinery impact ─────┬─ 1. Rock drill
     │                           ├─ 2. Jackhammer
     │                           ├─ 3. Hoe ram
     │                           └─ 4. Pile driver
     ├─ 3. Non-machinery impact ─┬─ 1. Glass breaking*
     │                           ├─ 2. Car crash*
     │                           └─ 3. Explosion*
     ├─ 4. Powered saw ──────────┬─ 1. Chainsaw
     │                           ├─ 2. Small/medium rotating saw
     │                           └─ 3. Large rotating saw
     ├─ 5. Alert signal ─────────┬─ 1. Car horn
     │                           ├─ 2. Car alarm
     │                           ├─ 3. Siren
     │                           └─ 4. Reverse beeper
     ├─ 6. Music ────────────────┬─ 1. Stationary music
     │                           └─ 2. Mobile music
     ├─ 7. Human voice ──────────┬─ 1. Talking
     │                           ├─ 2. Shouting
     │                           ├─ 3. Large crowd
     │                           ├─ 4. Amplified speech
     │                           └─ 5. Singing*
     ├─ 8. Human movement* ──────┬─ 1. Footsteps*
     │                           └─ 2. Clapping*
     ├─ 9. Animal* ──────────────┬─ 1. Dog barking
     │                           ├─ 2. Bird chirping*
     │                           └─ 3. Insect chirping*
     ├─ 10. Water* ──────────────── 1. Hose pump*
     ├─ 11. Weather* ────────────┬─ 1. Rain*
     │                           ├─ 2. Thunder*
     │                           └─ 3. Wind*
     ├─ 12. Brake* ──────────────┬─ 1. Friction brake*
     │                           └─ 2. Exhaust brake*
     ├─ 13. Train* ──────────────── 1. Electric train*
     └─ 0. Others* ──────────────┬─ 1. Screeching*
                                 ├─ 2. Plastic crinkling*
                                 ├─ 3. Cleaning*
                                 └─ 4. Gear*

    Classes marked with an asterisk (*) are present in the SINGA:PURA taxonomy but not the SONYC taxonomy. The "Ice cream truck" class from the SONYC taxonomy has been excluded from the SINGA:PURA taxonomy because this class does not exist in the local context.

    In addition, note that the label for the coarse-grained class "Others" in this repository is "0", which is different from the label "X" that is used in the full version of the SINGA:PURA dataset.
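
As a rough illustration of the compatibility idea described above, the sketch below drops coarse-grained labels that are marked with an asterisk in the taxonomy tree. The integer IDs, the `COARSE_CLASSES` table, and the function name are assumptions made for this example and are not part of the dataset's own tooling. It works at the coarse level only; fine-grained classes such as "Dog barking", which is unmarked although its parent class is marked, would need a separate mapping.

```python
# Coarse-grained classes copied from the taxonomy tree.
# Each entry maps a coarse ID to (name, is_singapura_only), where
# is_singapura_only mirrors the asterisk (*) in the tree.
# Using integer IDs is an assumption made for this sketch.
COARSE_CLASSES = {
    1: ("Engine", False),
    2: ("Machinery impact", False),
    3: ("Non-machinery impact", False),
    4: ("Powered saw", False),
    5: ("Alert signal", False),
    6: ("Music", False),
    7: ("Human voice", False),
    8: ("Human movement", True),
    9: ("Animal", True),
    10: ("Water", True),
    11: ("Weather", True),
    12: ("Brake", True),
    13: ("Train", True),
    0: ("Others", True),
}


def keep_sonyc_compatible(coarse_ids):
    """Return only the coarse IDs that are not marked as SINGA:PURA-only."""
    return [cid for cid in coarse_ids if not COARSE_CLASSES[cid][1]]


# Hypothetical recording tagged with Engine, Animal, Human voice and Others.
print(keep_sonyc_compatible([1, 9, 7, 0]))
```

Keeping the asterisk information in one table means the filter stays a single list comprehension, and the table can be checked by eye against the tree.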

    Metadata CSV file

    Each row of "labelled_metadata_public.csv" corresponds to a single recording and contains the following fields:

    • "sensor_id": A string representing the identity of the sensor that the recording was taken from. Each sensor node has a unique identity. In other words, two recordings were taken from different sensors if and only if their "sensor_id" strings differ.
    • "filename": The name of the raw audio file corresponding to this row of metadata. Note that there is actually a timestamp on the

  15. Singapore SG: Net Migration

    • ceicdata.com
    Cite
    CEICdata.com, Singapore SG: Net Migration [Dataset]. https://www.ceicdata.com/en/singapore/population-and-urbanization-statistics/sg-net-migration
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 1962 - Dec 1, 2012
    Area covered
    Singapore
    Variables measured
    Population
    Description

    Singapore SG: Net Migration data was reported at 298,448 persons in 2017, a decrease from the previous figure of 337,932 persons for 2012. The series is updated yearly, averaging 193,369 persons (median) from Dec 1962 to 2017, with 12 observations. The data reached an all-time high of 449,245 persons in 2007 and a record low of -363 persons in 1967. The series remains active in CEIC and is reported by the World Bank. It is categorized under Global Database’s Singapore – Table SG.World Bank.WDI: Population and Urbanization Statistics. Net migration is the net total of migrants during the period, that is, the total number of immigrants less the annual number of emigrants, including both citizens and noncitizens. Data are five-year estimates. Source: United Nations Population Division, World Population Prospects: 2017 Revision. Aggregation: Sum.

  16. Data from: Heart rate changes during partial seizures: A study amongst...

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). Heart rate changes during partial seizures: A study amongst Singaporean patients [Dataset]. https://catalog.data.gov/dataset/heart-rate-changes-during-partial-seizures-a-study-amongst-singaporean-patients
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Area covered
    Singapore
    Description

    Introduction

    Studies in Europe and America showed that tachycardia, and less often bradycardia, frequently accompanied partial seizures in Caucasian patients. We determine the frequency, magnitude and type of ictal heart rate changes during partial seizures in non-Caucasian patients in Singapore.

    Methods

    Partial seizures recorded during routine EEGs performed in a tertiary hospital between 1995 and 1999 were retrospectively reviewed. All routine EEGs had simultaneous ECG recording. Heart rate before and during seizures was determined and correlated with epileptogenic focus. Differences in heart rate before and during seizures were grouped into 4 types: (1) >10% decrease; (2) -10 to +20% change; (3) 20–50% increase; (4) >50% increase.

    Results

    Of the total of 37 partial seizures, 18 were left hemisphere (LH), 13 were right hemisphere (RH) and 6 were bilateral (BL) in onset. 51% of all seizures showed no significant change in heart rate (type 2), 22% had moderate sinus tachycardia (type 3), 11% showed severe sinus tachycardia (type 4), while 16% had sinus bradycardia (type 1). Asystole was recorded in one seizure. Apart from more tachycardia in bilateral-onset seizures, there was no correlation between side of ictal discharge and heart rate response. Compared to Caucasian patients, sinus tachycardia was considerably less frequent. The frequency of bradycardia was similar to that recorded in the literature.

    Conclusions

    Significant heart rate changes during partial seizures were seen in half of Singaporean patients. Although sinus tachycardia was the most common heart rate change, its frequency was considerably lower than in Caucasian patients. This might be due to methodological and ethnic differences. Rates of bradycardia are similar to those recorded in the literature.
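
The four heart-rate change types described in the abstract can be sketched as a small classifier. This is only an illustration: the function names are hypothetical, and the treatment of values that fall exactly on a boundary (-10%, +20%, +50%) is an assumption, since the abstract does not say which side they belong to.

```python
def percent_change(before_bpm, during_bpm):
    """Relative change in heart rate from before to during a seizure, in percent."""
    return (during_bpm - before_bpm) / before_bpm * 100.0


def classify_heart_rate_change(before_bpm, during_bpm):
    """Map a heart-rate change to types 1-4 as listed in the abstract.

    Boundary handling is an assumption: exactly -10% and +20% count as
    type 2, and exactly +50% counts as type 3.
    """
    change = percent_change(before_bpm, during_bpm)
    if change < -10:
        return 1  # >10% decrease (sinus bradycardia)
    if change <= 20:
        return 2  # -10 to +20% change (no significant change)
    if change <= 50:
        return 3  # 20-50% increase (moderate sinus tachycardia)
    return 4      # >50% increase (severe sinus tachycardia)


# Hypothetical readings: 80 bpm before the seizure, 104 bpm during it (+30%).
print(classify_heart_rate_change(80, 104))
```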

  17. Global Financial Inclusion (Global Findex) Database 2011 - Singapore

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Apr 15, 2015
    Cite
    Development Research Group, Finance and Private Sector Development Unit (2015). Global Financial Inclusion (Global Findex) Database 2011 - Singapore [Dataset]. https://microdata.worldbank.org/index.php/catalog/1240
    Dataset updated
    Apr 15, 2015
    Dataset authored and provided by
    Development Research Group, Finance and Private Sector Development Unit
    Time period covered
    2011
    Area covered
    Singapore
    Description

    Abstract

    Well-functioning financial systems serve a vital purpose, offering savings, credit, payment, and risk management products to people with a wide range of needs. Yet until now little had been known about the global reach of the financial sector - the extent of financial inclusion and the degree to which such groups as the poor, women, and youth are excluded from formal financial systems. Systematic indicators of the use of different financial services had been lacking for most economies.

    The Global Financial Inclusion (Global Findex) database provides such indicators. This database contains the first round of Global Findex indicators, measuring how adults in more than 140 economies save, borrow, make payments, and manage risk. The data set can be used to track the effects of financial inclusion policies globally and develop a deeper and more nuanced understanding of how people around the world manage their day-to-day finances. By making it possible to identify segments of the population excluded from the formal financial sector, the data can help policy makers prioritize reforms and design new policies.

    Geographic coverage

    National Coverage.

    Analysis unit

    Individual

    Universe

    The target population is the civilian, non-institutionalized population 15 years and above. The sample is nationally representative.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The Global Findex indicators are drawn from survey data collected by Gallup, Inc. over the 2011 calendar year, covering more than 150,000 adults in 148 economies and representing about 97 percent of the world's population. Since 2005, Gallup has surveyed adults annually around the world, using a uniform methodology and randomly selected, nationally representative samples. The second round of Global Findex indicators was collected in 2014 and is forthcoming in 2015. The set of indicators will be collected again in 2017.

    Surveys were conducted face-to-face in economies where landline telephone penetration is less than 80 percent, or where face-to-face interviewing is customary. The first stage of sampling is the identification of primary sampling units, consisting of clusters of households. The primary sampling units are stratified by population size, geography, or both, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise, simple random sampling is used. Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. If an interview cannot be obtained at the initial sampled household, a simple substitution method is used. Respondents are randomly selected within the selected households by means of the Kish grid.
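
The choice between probability-proportional-to-size selection and simple random sampling described above can be sketched as follows. This is a simplified illustration, not the survey's actual procedure: the function name and inputs are hypothetical, and the proportional draw here is made with replacement, which is an assumption of the sketch.

```python
import random


def select_primary_sampling_units(units, n, rng, populations=None):
    """Pick n primary sampling units (PSUs).

    If population sizes are known, draw with probability proportional to
    size (with replacement, an assumption of this sketch). Otherwise fall
    back to simple random sampling without replacement.
    """
    if populations is not None:
        return rng.choices(units, weights=populations, k=n)
    return rng.sample(units, n)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
clusters = ["A", "B", "C", "D"]  # hypothetical household clusters

# Population information available: larger clusters are more likely to be drawn.
print(select_primary_sampling_units(clusters, 2, rng, populations=[100, 400, 250, 250]))

# No population information: every cluster is equally likely.
print(select_primary_sampling_units(clusters, 2, rng))
```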

    Surveys were conducted by telephone in economies where landline telephone penetration is over 80 percent. The telephone surveys were conducted using random digit dialing or a nationally representative list of phone numbers. In selected countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of year.

    The sample size in the majority of economies was 1,000 individuals.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The questionnaire was designed by the World Bank, in conjunction with a Technical Advisory Board composed of leading academics, practitioners, and policy makers in the field of financial inclusion. The Bill and Melinda Gates Foundation and Gallup, Inc. also provided valuable input. The questionnaire was piloted in over 20 countries using focus groups, cognitive interviews, and field testing. The questionnaire is available in 142 languages upon request.

    Questions on insurance, mobile payments, and loan purposes were asked only in developing economies. The indicators on awareness and use of microfinance institutions (MFIs) are not included in the public dataset. However, adults who report saving at an MFI are considered to have an account; this is reflected in the composite account indicator.

    Sampling error estimates

    Estimates of standard errors (which account for sampling error) vary by country and indicator. For country- and indicator-specific standard errors, refer to the Annex and Country Table in Demirguc-Kunt, Asli and L. Klapper. 2012. "Measuring Financial Inclusion: The Global Findex." Policy Research Working Paper 6025, World Bank, Washington, D.C.

  18. Immunization, measles (% of children ages 12-23 months) - Singapore

    • macro-rankings.com
    csv, excel
    Updated Oct 4, 2025
    Cite
    macro-rankings (2025). Immunization, measles (% of children ages 12-23 months) - Singapore [Dataset]. https://www.macro-rankings.com/singapore/immunization-measles-(-of-children-ages-12-23-months)
    Available download formats
    excel, csv
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Singapore
    Description

    Time series data for the statistic Immunization, measles (% of children ages 12-23 months) and country Singapore.

    Indicator definition: Child immunization, measles, measures the percentage of children ages 12-23 months who received the measles vaccination before 12 months or at any time before the survey. A child is considered adequately immunized against measles after receiving one dose of vaccine.

    The indicator "Immunization, measles (% of children ages 12-23 months)" stands at 97.00 as of 12/31/2023, the highest value since 12/31/1998. The current value is an increase of 1.04 percent over the value one year prior. The 3-year change is 0.0 percent, the 5-year change is 0.0 percent, and the 10-year change is 2.11 percent. The series' long-term average value is 90.84; its latest available value, on 12/31/2023, is 6.78 percent higher than that average. The change from the series' minimum value, on 12/31/1980, to its latest available value is +106.38%. The change from its maximum value, on 12/31/1997, to its latest available value is -1.02%.
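
The percentage comparisons quoted in this description follow the usual relative-change formula. The sketch below reproduces one of them, the latest value (97.00) against the long-term average (90.84), using only figures stated in the text; the function name is a hypothetical helper.

```python
def percent_change(latest, reference):
    """Relative change of `latest` with respect to `reference`, in percent."""
    return (latest - reference) / reference * 100.0


# Latest value and long-term average as quoted in the description.
latest_value = 97.00
long_term_average = 90.84

print(round(percent_change(latest_value, long_term_average), 2))  # 6.78
```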

  19. Employed Residents Aged 15 Years and Over by Occupation, Ethnic Group and...

    • data.gov.sg
    Updated Nov 6, 2025
    Cite
    Singapore Department of Statistics (2025). Employed Residents Aged 15 Years and Over by Occupation, Ethnic Group and Sex (Census of Population 2020) [Dataset]. https://data.gov.sg/datasets/d_4d42b895bcd91b3efe4ed2c0e021b928/view
    Dataset updated
    Nov 6, 2025
    Dataset authored and provided by
    Singapore Department of Statistics
    License

    https://data.gov.sg/open-data-licence

    Description

    Dataset from Singapore Department of Statistics. For more information, visit https://data.gov.sg/datasets/d_4d42b895bcd91b3efe4ed2c0e021b928/view

  20. Getting credit: Credit registry coverage (% of adults) - Singapore

    • macro-rankings.com
    csv, excel
    Updated Dec 31, 2004
    Cite
    macro-rankings (2004). Getting credit: Credit registry coverage (% of adults) - Singapore [Dataset]. https://www.macro-rankings.com/singapore/getting-credit-credit-registry-coverage-(-of-adults)
    Available download formats
    csv, excel
    Dataset updated
    Dec 31, 2004
    Dataset authored and provided by
    macro-rankings
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Singapore
    Description

    Time series data for the statistic Getting credit: Credit registry coverage (% of adults) and country Singapore. Indicator definition: The credit registry coverage reports the number of individuals and firms listed in a credit registry’s database as of January 1 with information on their borrowing history from the past five years, and the number of individuals and firms that have had no borrowing history in the past five years but for which a lender requested a credit report from the registry in the previous calendar year. The number is expressed as a percentage of the adult population.

Cite
Anuj_sahay (2019). Singapore Residents dataset [Dataset]. https://www.kaggle.com/anujsahay112/singapore-residents-dataset

Singapore Residents dataset

Population exploratory data

Available download formats
zip (116422 bytes)
Dataset updated
Aug 28, 2019
Authors
Anuj_sahay
Area covered
Singapore
Description

Context

This dataset reflects real-world data science work and how data analysts and data scientists operate.

Content

The dataset consists of four columns: Year, Level_1 (ethnic group/gender), Level_2 (age group), and population.

Acknowledgements

I sincerely thank GeoIQ for sharing this dataset with me, along with the tasks. Basic knowledge of Pandas, NumPy, and other Python data science libraries is not enough; knowing how to execute tasks and how to preprocess the data before making any prediction is very important. Most datasets on Kaggle are clean and well arranged, but this dataset taught me how real-world data science and analysis work. Every data science beginner should work on this dataset and try to execute the tasks; doing so gives good exposure to the real data science world.

Inspiration

  1. Identify the largest ethnic group in Singapore, its average population growth over the years, and the proportion of the total population it constitutes.
  2. Identify the largest age group in Singapore, its average population growth over the years, and the proportion of the total population it constitutes.
  3. Identify the group (by age, ethnicity and gender) that: a. has shown the highest growth rate; b. has shown the lowest growth rate; c. has remained the same.
  4. Plot a graph of population trends.
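
One possible starting point for the growth-rate tasks above is sketched below in plain Python. The rows are invented for illustration and only reuse the column names from the Content section; the group names, the numbers, and the function name are all hypothetical, and the real file may need cleaning before it fits this shape.

```python
from collections import defaultdict

# Hypothetical rows that reuse the dataset's column names.
rows = [
    {"Year": 2000, "Level_1": "Group A", "population": 100},
    {"Year": 2001, "Level_1": "Group A", "population": 110},
    {"Year": 2002, "Level_1": "Group A", "population": 121},
    {"Year": 2000, "Level_1": "Group B", "population": 50},
    {"Year": 2001, "Level_1": "Group B", "population": 50},
    {"Year": 2002, "Level_1": "Group B", "population": 50},
]


def average_growth_by_group(rows):
    """Average year-over-year growth (in percent) for each Level_1 group."""
    series = defaultdict(dict)
    for row in rows:
        series[row["Level_1"]][row["Year"]] = row["population"]

    result = {}
    for group, by_year in series.items():
        years = sorted(by_year)
        # Growth between each pair of consecutive years present for the group.
        rates = [
            (by_year[later] - by_year[earlier]) / by_year[earlier] * 100.0
            for earlier, later in zip(years, years[1:])
        ]
        result[group] = sum(rates) / len(rates) if rates else 0.0
    return result


growth = average_growth_by_group(rows)
print(max(growth, key=growth.get))  # group with the highest average growth
print(min(growth, key=growth.get))  # group with the lowest average growth
```

The same grouping can be repeated on Level_2 for the age-group questions.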