Language is the fourth release from the Census of Canada taken on May 11, 2021. This release explores language characteristics of the Canadian population: mother tongue, knowledge of official languages, languages most often spoken at home, and other home languages. In Alberta, most people speak English but immigrant languages, especially those from Asian countries, are becoming increasingly common. In addition, Indigenous languages are increasingly being used in households.
English (Canada) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home command and control, in-car command and control, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (466 people in total), which is intended to improve model performance on real, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage. Our datasets all comply with GDPR, CCPA, and PIPL.
This Alberta Official Statistic compares the knowledge of languages among the Aboriginal Identity population in provinces and territories, based on self-assessment of the ability to converse in the language. Based on the 2011 National Household Survey (NHS), English is the most common language known by the Aboriginal Identity Population across Canada. In most provinces, nearly 100% of the Aboriginal Identity population can converse in English. The lowest proportion of English-speaking Aboriginal people is in Quebec, where the majority speak French. The highest proportion of Aboriginal people who speak Aboriginal languages was in Nunavut at 88.6%, followed by Quebec (32.4%) and the Northwest Territories (32.1%). In Alberta, more Aboriginal people are able to speak Aboriginal languages (15.1%) than are able to speak French or other (non-Aboriginal) languages. The proportion of Alberta Aboriginal people able to speak Aboriginal languages was sixth highest among provinces and territories.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset, released in August 2017, contains the Australian resident population by birthplace, divided into English speaking (ES) and non-English speaking (NES) countries, 2016. The following countries are designated as ES: Canada, Ireland, New Zealand, South Africa, United Kingdom and the United States of America; the remaining countries are designated as NES. The dataset also includes the population of people who were born overseas and report poor proficiency in English. The data is by Local Government Area (LGA) 2016 geographic boundaries. For more information please see the data source notes on the data. Source: Compiled by PHIDU based on the ABS Census of Population and Housing, August 2016. AURIN has spatially enabled the original data. Data that was not shown/not applicable/not published/not available for the specific area ('#', '..', '^', 'np', 'n.a.', 'n.y.a.' in original PHIDU data) was removed and replaced by blank cells. For other keys and abbreviations refer to PHIDU Keys.
The Data Portal on English-Speaking Quebec (DESQ) is a growing online database created by the Quebec English-Speaking Communities Research Network (QUESCREN) at Concordia University. DESQ stores custom statistical datasets related to English-speaking Quebec created for individual researchers or organizations. Most are drawn from Canadian census data but are unavailable on the Statistics Canada website. Datasets are available in three formats: The website is currently under development, and not all dataset formats listed here may be available yet. All datasets on this site were created by Statistics Canada for third-party organizations. They are freely available via an open license from Statistics Canada and open-use licensing agreements with the third-party organizations. You are free to use, share, publish and freely distribute these tables as long as you provide an acknowledgement of source (e.g., Statistics Canada, “name of product,” “reference date,” custom dataset created for “name of partner organization” by Statistics Canada. Reproduced and distributed on an as-is basis with the permission of Statistics Canada.) Producer of DESQ: Quebec English-Speaking Communities Research Network (QUESCREN) Project Management IT Consultant Project Consultant The Secrétariat aux relations avec les Québécois d’expression anglaise, Gouvernement du Québec, funded this project. QUESCREN partner organizations originally obtained the datasets from Statistics Canada. We acknowledge their generosity in agreeing to make the data available to the public via DESQ, and the work of project consultant Jan Warnke in facilitating their availability. QUESCREN also receives funding from the Department of Canadian Heritage, the Canadian Institute for Research on Linguistic Minorities, and Concordia University.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ORBIT (Object Recognition for Blind Image Training)-India dataset is a collection of 105,243 images of 76 commonly used objects, collected by 12 individuals in India who are blind or have low vision. This dataset is an "Indian subset" of the original ORBIT dataset [1, 2], which was collected in the UK and Canada. In contrast to the ORBIT dataset, which was created in a Global North, Western, and English-speaking context, the ORBIT-India dataset features images taken in a low-resource, non-English-speaking, Global South context, home to 90% of the world’s population of people with blindness. Since it is easier for blind or low-vision individuals to gather high-quality data by recording videos, this dataset, like the ORBIT dataset, contains images (each sized 224x224) derived from 587 videos. These videos were taken by our data collectors from various parts of India using the Find My Things [3] Android app. Each data collector was asked to record eight videos of at least 10 objects of their choice.
Collected between July and November 2023, this dataset represents a set of objects commonly used by people who are blind or have low vision in India, including earphones, talking watches, toothbrushes, and typical Indian household items such as a belan (rolling pin) and a steel glass. These videos were taken in various settings of the data collectors' homes and workspaces using the Find My Things Android app.
The image dataset is stored in the ‘Dataset’ folder, organized into one folder per data collector (P1, P2, ...P12). Each collector's folder includes sub-folders named with the object labels as provided by our data collectors. Within each object folder, there are two subfolders: ‘clean’ for images taken on clean surfaces and ‘clutter’ for images taken in cluttered environments where the objects are typically found. The annotations are saved inside an ‘Annotations’ folder containing a JSON file per video (e.g., P1--coffee mug--clean--231220_084852_coffee mug_224.json) that contains keys corresponding to all frames/images in that video (e.g., "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, ...). The ‘object_not_present_issue’ key is true if the object is not present in the image, and the ‘pii_present_issue’ key is true if personally identifiable information (PII) is present in the image. Note that all PII present in the images has been blurred to protect the identity and privacy of our data collectors. This dataset version was created by cropping images originally sized at 1080 × 1920; an unscaled version of the dataset will follow soon.
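The annotation layout described above can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the function names are hypothetical, it assumes the frame keys always consist of five parts joined by `--` (collector, object label, condition, video name, frame number) as in the example keys, and it assumes object labels never contain `--` themselves. The second entry in the sample is made up for illustration so that the filter has something to exclude.

```python
import json

def parse_frame_name(frame_name):
    """Split a frame key such as
    'P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg'
    into its parts. Assumes exactly five '--'-separated fields."""
    stem = frame_name.rsplit(".", 1)[0]  # drop the file extension
    collector, label, condition, video, frame = stem.split("--")
    return {
        "collector": collector,
        "label": label,
        "condition": condition,  # 'clean' or 'clutter'
        "video": video,
        "frame": int(frame),
    }

def frames_with_object(annotations):
    """Return the frame names whose 'object_not_present_issue' flag is false."""
    return [
        name
        for name, flags in annotations.items()
        if not flags["object_not_present_issue"]
    ]

# Hypothetical sample in the documented per-video JSON shape.
sample = json.loads("""
{
  "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg":
    {"object_not_present_issue": false, "pii_present_issue": false},
  "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg":
    {"object_not_present_issue": true, "pii_present_issue": false}
}
""")

kept = frames_with_object(sample)
info = parse_frame_name(kept[0])
```

For a real annotation file, `sample` would instead come from `json.load` on the per-video JSON file. Frames flagged with ‘pii_present_issue’ are not filtered here because the PII in those images has already been blurred; whether to exclude them anyway is a choice for the user.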
This project was funded by the Engineering and Physical Sciences Research Council (EPSRC) Industrial ICASE Award with Microsoft Research UK Ltd. as the Industrial Project Partner. We would like to acknowledge and express our gratitude to our data collectors for their efforts and time invested in carefully collecting videos to build this dataset for their community. The dataset is designed for developing few-shot learning algorithms, aiming to support researchers and developers in advancing object-recognition systems. We are excited to share this dataset and would love to hear from you if and how you use this dataset. Please feel free to reach out if you have any questions, comments or suggestions.
REFERENCES:
[1] Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, and Katja Hofmann. 2021. ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision. DOI: https://doi.org/10.25383/city.14294597
[2] microsoft/ORBIT-Dataset. https://github.com/microsoft/ORBIT-Dataset
[3] Linda Yilin Wen, Cecily Morrison, Martin Grayson, Rita Faia Marques, Daniela Massiceti, Camilla Longden, and Edward Cutrell. 2024. Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision. In Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI EA '24). Association for Computing Machinery, New York, NY, USA, Article 403, 1–6. https://doi.org/10.1145/3613905.3648641
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This fact sheet is based on data from the 2021 Canadian Legal Problems Survey (CLPS) undertaken by Statistics Canada and commissioned by Justice Canada. The CLPS is a legal needs or legal problems survey; these surveys are done in countries around the world to measure the incidence of legal problems, how respondents attempt to resolve them, and the impacts of these problems. The CLPS reached people aged 18 years and older who could speak English or French. The final sample size was 21,170 people from 10 provinces and included an oversample of Indigenous people.