67 datasets found

Languages in Canada 2022
statista.com
Statista, Languages in Canada 2022 [Dataset]. https://www.statista.com/statistics/271218/languages-in-canada/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2022
Area covered
Canada
Description
The statistic reflects the distribution of languages in Canada in 2022. In 2022, 87.1 percent of the total population in Canada spoke English as their native tongue.
Canadian English General Conversation Speech Dataset for ASR
futurebeeai.com
wav
Updated Aug 1, 2022
FutureBee AI (2022). Canadian English General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-english-canada
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the Canadian English General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of English speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Canadian English communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade English speech models that understand and respond to authentic Canadian accents and dialects.
Speech Data
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Canadian English. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
•Participant Diversity:
•Speakers: 60 verified native Canadian English speakers from FutureBeeAI’s contributor community.
•Regions: Representing various provinces of Canada to ensure dialectal diversity and demographic balance.
•Demographics: A gender split of 60% male and 40% female, with participant ages ranging from 18 to 70 years.
•Recording Details:
•Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
•Duration: Each conversation ranges from 15 to 60 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, recorded at a 16 kHz sample rate.
•Environment: Quiet, echo-free settings with no background noise.
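The audio format stated above (stereo, 16-bit, 16 kHz) can be checked before training. The following is a minimal sketch using only Python's standard `wave` module; the helper names and the in-memory silent file are illustrative stand-ins, not part of the dataset or its tooling.

```python
import io
import wave

# Expected format taken from the listing: stereo, 16-bit (2 bytes), 16 kHz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def wav_format(source):
    """Return channel count, sample width, and sample rate of a WAV path or file object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
        }


def make_silent_wav(seconds=1):
    """Build an in-memory stereo 16-bit 16 kHz WAV of silence (hypothetical stand-in file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        # 2 channels * 2 bytes per sample * 16000 frames per second
        wav.writeframes(b"\x00" * (2 * 2 * 16000 * seconds))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(wav_format(make_silent_wav()) == EXPECTED)
```

In practice `wav_format` would be called with the path of each downloaded file, and any file whose result differs from `EXPECTED` flagged for review.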

Topic Diversity
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
•Sample Topics Include:
•Family & Relationships
•Food & Recipes
•Education & Career
•Healthcare Discussions
•Social Issues
•Technology & Gadgets
•Travel & Local Culture
•Shopping & Marketplace Experiences, and many more.
Transcription
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
•Transcription Highlights:
•Speaker-segmented dialogues
•Time-coded utterances
•Non-speech elements (pauses, laughter, etc.)
•High transcription accuracy, achieved through double QA pass, average WER < 5%
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
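The listing quotes an average word error rate (WER) below 5% but does not say how it was measured. As a minimal sketch, the conventional definition is the word-level edit distance between a reference and a hypothesis transcript, divided by the number of reference words. The function name and sample sentences below are illustrative and not taken from the dataset.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Text normalization (casing, punctuation, non-speech tags) affects the result, so any comparison against the quoted figure would need the same normalization on both sides.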
Metadata
The dataset comes with granular metadata for both speakers and recordings:
•Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
•Recording Metadata: Topic, duration, audio format, device type, and sample rate.

Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
Usage and Applications
This dataset is a versatile resource for multiple English speech and language AI applications:
•ASR Development: Train accurate speech-to-text systems for Canadian English.
•Voice Assistants: Build smart assistants capable of understanding natural Canadian conversations.
Rate of English–French bilingualism in Québec and Canada 1971-2021
statista.com
Updated Jul 9, 2025
Statista (2025). Rate of English–French bilingualism in Québec and Canada 1971-2021 [Dataset]. https://www.statista.com/statistics/1338881/rate-english-french-bilingualism-quebec-canada/
Dataset updated
Jul 9, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Canada
Description
Over the past fifty years, the proportion of Quebecers speaking both English and French has increased steadily, from **** percent in 1971 to almost half the population (**** percent) in 2021. The rate of English-French bilingualism, on the other hand, has declined in the rest of the country: outside Quebec, just over ten percent of people were bilingual in English and French in 2001, compared to *** percent two decades later.
Percent official language speakers by municipality
open.canada.ca
open.alberta.ca
html
Updated Jul 24, 2024
Government of Alberta (2024). Percent official language speakers by municipality [Dataset]. https://open.canada.ca/data/dataset/e669ec7d-bb3c-465f-ab77-3d2a58d93461
Available download formats: html
Dataset updated
Jul 24, 2024
Dataset provided by
Government of Alberta
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
Refers to the percentage of individuals who, at the time of the census, most often spoke at least one of English or French at home.
Facts and Figures 2015: Profiles of Official Language Immigrants: English...
data.urbandatacentre.ca
Updated Oct 1, 2024
(2024). Facts and Figures 2015: Profiles of Official Language Immigrants: English Speaking Permanent Residents inside Quebec - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-caa61377-f34c-4f31-89ae-a57c8a73f99d
Dataset updated
Oct 1, 2024
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Area covered
Canada, Quebec
Description
Facts and Figures, Profiles of Official Language Immigrants: English Speaking Permanent Residents in Quebec presents the annual intake of English-speaking permanent residents in the province of Quebec by category of immigration from 2006 to 2015. The report examines selected characteristics for English-speaking permanent residents. “English-speaking immigrants” are defined by the following criteria: 1) permanent residents with English as Mother Tongue; 2) permanent residents with Mother Tongue other than English and with “English Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. In these datasets, figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
Facts and Figures 2015: Profiles of Official Language Immigrants: English...
open.canada.ca
data.amerigeoss.org
xls
Updated Nov 22, 2024
Immigration, Refugees and Citizenship Canada (2024). Facts and Figures 2015: Profiles of Official Language Immigrants: English Speaking Permanent Residents inside Quebec [Dataset]. https://open.canada.ca/data/dataset/caa61377-f34c-4f31-89ae-a57c8a73f99d
Available download formats: xls
Dataset updated
Nov 22, 2024
Dataset provided by
Immigration, Refugees and Citizenship Canada (http://www.cic.gc.ca/)
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Time period covered
Jan 1, 2006 - Dec 31, 2015
Area covered
Quebec
Description
Facts and Figures, Profiles of Official Language Immigrants: English Speaking Permanent Residents in Quebec presents the annual intake of English-speaking permanent residents in the province of Quebec by category of immigration from 2006 to 2015. The report examines selected characteristics for English-speaking permanent residents. “English-speaking immigrants” are defined by the following criteria: 1) permanent residents with English as Mother Tongue; 2) permanent residents with Mother Tongue other than English and with “English Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. In these datasets, figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
Population of Montréal in Canada 2021, by official language spoken and...
statista.com
Updated Jan 23, 2025
Statista (2025). Population of Montréal in Canada 2021, by official language spoken and gender [Dataset]. https://www.statista.com/statistics/1339075/population-montreal-canada-official-language-spoken-gender/
Dataset updated
Jan 23, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2021
Area covered
Canada
Description
In 2021, French was the first language spoken by over 71 percent of the population of Montréal, Québec in Canada. 20.4 percent of the city's residents had English as their first language, 6.7 percent used both English and French as their primary language, and 1.6 percent of the population spoke another language. That same year, 46.4 percent of people living in the province of Québec could speak both English and French.
Percentage of population with knowledge of English and French by census...
datasets.ai
catalogue.arctic-sdi.org
+1more
0, 21, 23, 52
Updated Sep 22, 2024
Statistics Canada | Statistique Canada (2024). Percentage of population with knowledge of English and French by census division, 2016 [Dataset]. https://datasets.ai/datasets/7043f8c1-d5e5-492f-8bb1-7eeac9f2a74f
Available download formats: 52, 21, 0, 23
Dataset updated
Sep 22, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Authors
Statistics Canada | Statistique Canada
Area covered
French
Description
This service shows the percentage of population, excluding institutional residents, with knowledge of English and French for Canada by 2016 census division. The data is from the Census Profile, Statistics Canada Catalogue no. 98-316-X2016001.

Knowledge of official languages refers to whether the person can conduct a conversation in English only, French only, in both languages or in neither language. For a child who has not yet learned to speak, this includes languages that the child is learning to speak at home. For additional information refer to 'Knowledge of official languages' in the 2016 Census Dictionary.

To have a cartographic representation of the ecumene with this socio-economic indicator, it is recommended to add as the first layer, the “NRCan - 2016 population ecumene by census division” web service, accessible in the data resources section below.
Population of Montréal in Canada 2021, by knowledge of official languages...
statista.com
Updated Jan 23, 2025
Statista (2025). Population of Montréal in Canada 2021, by knowledge of official languages and gender [Dataset]. https://www.statista.com/statistics/1338899/population-montreal-canada-knowledge-official-languages-gender/
Dataset updated
Jan 23, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2021
Area covered
Canada
Description
In 2021, most of the population of the city of Montreal, located in the Canadian province of Quebec, could speak both English and French. In fact, approximately 1.23 million men and 1.68 million women were bilingual. Of those who spoke only one of the official languages, the majority (1.43 million people) spoke only French. In addition, more than 68,400 people did not know either language, with women outnumbering men.
2016. English Spoken at Home, French Spoken at Home, Aboriginal Language...
desq.quescren.ca
Updated Mar 30, 2024
(2024). 2016. English Spoken at Home, French Spoken at Home, Aboriginal Language Spoken at Home, Immigrant Language Spoken at Home, Mother Tongue, Age and Sex for the Population Excluding Institutional Residents of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations - Dataset - Data Portal on English-Speaking Quebec [Dataset]. https://desq.quescren.ca/dataset/chssn-2016-98-400-x2016344
Dataset updated
Mar 30, 2024
Area covered
Canada, French, Quebec
Description
100% data.
207 Hours – Canadian Speaking English Speech Data by Mobile Phone
nexdata.ai
Updated Apr 29, 2024
Nexdata (2024). 207 Hours – Canadian Speaking English Speech Data by Mobile Phone [Dataset]. https://www.nexdata.ai/datasets/speechrecog/1047?source=Github
Dataset updated
Apr 29, 2024
Dataset authored and provided by
Nexdata
Area covered
Canada
Variables measured
Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition
Description
English (Canada) scripted monologue smartphone speech dataset, collected from monologues based on given scripts and covering the generic domain, human-machine interaction, smart home command and control, in-car command and control, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (466 people in total), which is intended to improve model performance on real, complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage; our datasets all comply with GDPR, CCPA, and PIPL.
Canadian English Call Center Data for Telecom AI
futurebeeai.com
wav
Updated Aug 1, 2022
FutureBee AI (2022). Canadian English Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-english-canada
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada
Dataset funded by
FutureBeeAI
Description
Introduction
This Canadian English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Canadian English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.
•Participant Diversity:
•Speakers: 60 native Canadian English speakers from our verified contributor pool.
•Regions: Representing multiple provinces across Canada to ensure coverage of various accents and dialects.
•Participant Profile: Gender mix of 60% male and 40% female, with age distribution from 18 to 70 years.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
•Call Duration: Ranges from 5 to 15 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, at 8 kHz and 16 kHz sample rates.
•Recording Environment: Captured in clean conditions with no echo or background noise.
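Dual-channel (stereo) call recordings are often split into one mono stream per channel before ASR training. The following is a minimal sketch using only the standard library; it assumes 16-bit stereo PCM as described above, and it does not assume which channel carries the agent and which the customer, since the listing does not say. The helper names and demo samples are illustrative.

```python
import array
import io
import sys
import wave


def split_channels(source):
    """Return (left, right) arrays of 16-bit samples from a stereo WAV path or file object."""
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 2 or wav.getsampwidth() != 2:
            raise ValueError("expected stereo 16-bit audio")
        samples = array.array("h")
        samples.frombytes(wav.readframes(wav.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    # Stereo frames are interleaved: left, right, left, right, ...
    return samples[0::2], samples[1::2]


def make_demo_wav(frames=4):
    """Build an in-memory stereo WAV whose left samples are 1 and right samples are -1."""
    interleaved = array.array("h", [1, -1] * frames)
    if sys.byteorder == "big":
        interleaved.byteswap()
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        wav.writeframes(interleaved.tobytes())
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    left, right = split_channels(make_demo_wav())
    print(list(left), list(right))
```

The channel-to-speaker mapping would need to be confirmed from the dataset's own metadata before labeling either stream.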

Topic Diversity
This speech corpus includes both inbound and outbound calls with positive, negative, and neutral conversational outcomes, ensuring broad scenario coverage for telecom AI development.
•Inbound Calls:
•Phone Number Porting
•Network Connectivity Issues
•Billing and Payments
•Technical Support
•Service Activation
•International Roaming Enquiry
•Refund Requests and Billing Adjustments
•Emergency Service Access, and others
•Outbound Calls:
•Welcome Calls & Onboarding
•Payment Reminders
•Customer Satisfaction Surveys
•Technical Updates
•Service Usage Reviews
•Network Complaint Status Calls, and more
This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.
Transcription
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, coughs)
•High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.
These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.
Metadata
Rich metadata is available for each participant and conversation:
•Participant Metadata: ID, age, gender, accent, dialect, and location.
Mother Tongue (English), 1996 - Catalogue - Canadian Urban Data Catalogue...
data.urbandatacentre.ca
Updated Oct 1, 2024
(2024). Mother Tongue (English), 1996 - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-e65a4300-8893-11e0-a6eb-6cf049291510
Dataset updated
Oct 1, 2024
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Area covered
Canada
Description
This map shows the percentage of the Canadian population whose mother tongue is English. The 1996 Census defines mother tongue as the first language a person learned at home in childhood and still understood at the time of the census. The 1996 Census showed that 24.0 million Canadians could speak English (84%), 19.3 million spoke English most often at home (68%) and 17.1 million had English mother tongue (60%).
Mother Tongue (English), 1996
ouvert.canada.ca
open.canada.ca
+1more
jp2, zip
Updated Mar 14, 2022
Natural Resources Canada (2022). Mother Tongue (English), 1996 [Dataset]. https://ouvert.canada.ca/data/dataset/e65a4300-8893-11e0-a6eb-6cf049291510
Available download formats: zip, jp2
Dataset updated
Mar 14, 2022
Dataset provided by
Natural Resources Canada
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
This map shows the percentage of the Canadian population whose mother tongue is English. The 1996 Census defines mother tongue as the first language a person learned at home in childhood and still understood at the time of the census. The 1996 Census showed that 24.0 million Canadians could speak English (84%), 19.3 million spoke English most often at home (68%) and 17.1 million had English mother tongue (60%).
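Each count and percentage pair quoted above implies a base population, which gives a quick consistency check on the description. This is only an arithmetic sketch: the counts and percentages are the rounded values from the text, and the function and label names are illustrative.

```python
# (count in millions, percentage of population), as quoted in the description.
FIGURES = {
    "could speak English": (24.0, 84),
    "spoke English most often at home": (19.3, 68),
    "had English mother tongue": (17.1, 60),
}


def implied_total(count_millions, percent):
    """Population (in millions) implied by a count and its share of the whole."""
    return count_millions / (percent / 100)


if __name__ == "__main__":
    for label, (count, percent) in FIGURES.items():
        print(f"{label}: about {implied_total(count, percent):.1f} million")
```

The three implied totals agree to within a few hundred thousand, which is what rounding in the quoted figures would produce.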
Canadian English Call Center Data for Healthcare AI
futurebeeai.com
wav
Updated Aug 1, 2022
FutureBee AI (2022). Canadian English Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-english-canada
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada
Dataset funded by
FutureBeeAI
Description
Introduction
This Canadian English Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of English speech recognition, spoken language understanding, and conversational AI systems. With 30 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
Speech Data
The dataset features 30 hours of dual-channel call center conversations between native Canadian English speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
•Participant Diversity:
•Speakers: 60 verified native Canadian English speakers from our contributor community.
•Regions: Diverse provinces across Canada to ensure broad dialectal representation.
•Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted conversations.
•Call Duration: Each session ranges from 5 to 15 minutes.
•Audio Format: WAV format, stereo, 16-bit depth at 8 kHz and 16 kHz sample rates.
•Recording Environment: Captured in clear conditions without background noise or echo.

Topic Diversity
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
•Inbound Calls:
•Appointment Scheduling
•New Patient Registration
•Surgical Consultation
•Dietary Advice and Consultations
•Insurance Coverage Inquiries
•Follow-up Treatment Requests, and more
•Outbound Calls:
•Appointment Reminders
•Preventive Care Campaigns
•Test Results & Lab Reports
•Health Risk Assessment Calls
•Vaccination Updates
•Wellness Subscription Outreach, and more
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Transcription
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
•Transcription Includes:
•Speaker-identified Dialogues
•Time-coded Segments
•Non-speech Annotations (e.g., silence, cough)
•High transcription accuracy, with word error rate below 5%, backed by dual-layer QA checks.
Metadata
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
•Participant Metadata: ID, gender, age, region, accent, and dialect.
•Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

Usage and Applications
This dataset can be used across a range of healthcare and voice AI use cases:
Canadian English Call Center Data for Realestate AI
futurebeeai.com
wav
Updated Aug 1, 2022
FutureBee AI (2022). Canadian English Call Center Data for Realestate AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-english-canada
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada
Dataset funded by
FutureBeeAI
Description
Introduction
This Canadian English Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking real estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, ideal for building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
Speech Data
The dataset features 30 hours of dual-channel call center recordings between native Canadian English speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
•Participant Diversity:
•Speakers: 60 native Canadian English speakers from our verified contributor community.
•Regions: Representing different provinces across Canada to ensure accent and dialect variation.
•Participant Profile: Gender mix of 60% male and 40% female, with an age range from 18 to 70.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted agent-customer discussions.
•Call Duration: Average 5–15 minutes per call.
•Audio Format: Stereo WAV, 16-bit, recorded at 8 kHz and 16 kHz.
•Recording Environment: Captured in noise-free and echo-free conditions.

Topic Diversity
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
•Inbound Calls:
•Property Inquiries
•Rental Availability
•Renovation Consultation
•Property Features & Amenities
•Investment Property Evaluation
•Ownership History & Legal Info, and more
•Outbound Calls:
•New Listing Notifications
•Post-Purchase Follow-ups
•Property Recommendations
•Value Updates
•Customer Satisfaction Surveys, and others
Such domain-rich variety ensures model generalization across common real estate support conversations.
Transcription
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., background noise, pauses)
•High transcription accuracy with word error rate below 5% via dual-layer human review.
These transcriptions streamline ASR and NLP development for English real estate voice applications.
Metadata
Detailed metadata accompanies each participant and conversation:
•Participant Metadata: ID, age, gender, location, accent, and dialect.
•Conversation Metadata: Topic, call type, sentiment, sample rate, and technical details.

This enables smart filtering, dialect-focused model training, and structured dataset exploration.
Usage and Applications
This dataset is ideal for voice AI and NLP systems built for the real estate sector:
2011. First Official Language Spoken, Detailed Language Spoken Most Often at...
desq.quescren.ca
Updated Mar 31, 2024
(2024). 2011. First Official Language Spoken, Detailed Language Spoken Most Often at Home, Age Groups and Sex for the Population Excluding Institutional Residents of Canada, Provinces, Territories, Census Divisions and Census Subdivisions - Dataset - Data Portal on English-Speaking Quebec [Dataset]. https://desq.quescren.ca/dataset/chssn-2011-98-314-xcb2011039
Dataset updated
Mar 31, 2024
Area covered
Canada, Quebec
Description
This ZIP file contains an IVT file. File size: 13.9 MiB
Statistics Canada, Population by Language Spoken at Home by Census Division,...
geocommons.com
Updated Jul 3, 2008
Brendan (2008). Statistics Canada, Population by Language Spoken at Home by Census Division, Alberta-Canada, 2006 [Dataset]. http://geocommons.com/search.html
Dataset updated
Jul 3, 2008
Dataset provided by
Statistics Canada
Brendan
Description
This dataset displays information regarding the language spoken most often at home. This data is available on the Census Division level, and is available from the 2006 Canadian Census. This data was obtained through: Statistics Canada. This data refers to the language spoken most often at home by the individual at the time of the census. Other languages spoken at home on a regular basis were also collected. Included are population figures for the following attributes: Total Population, English, French, Non-Official, English and French, English and Non-Official Language, French and Non-Official Language, and English French and Non-Official Speaking. This data is also broken down by Age Group.
2016. First Official Language Spoken, Language Spoken Most Often at Home,...
desq.quescren.ca
Updated Mar 30, 2024
(2024). 2016. First Official Language Spoken, Language Spoken Most Often at Home, Age and Sex for the Population Excluding Institutional Residents of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations - Dataset - Data Portal on English-Speaking Quebec [Dataset]. https://desq.quescren.ca/dataset/chssn-2016-98-400-x2016071
Dataset updated
Mar 30, 2024
Area covered
Quebec, Canada
Description
100% data.
Number of students in official languages programs, public elementary and...
www150.statcan.gc.ca
data.urbandatacentre.ca
+3more
Updated Oct 10, 2024
Government of Canada, Statistics Canada (2024). Number of students in official languages programs, public elementary and secondary schools, by program type, grade and sex [Dataset]. http://doi.org/10.25318/3710000901-eng
Unique identifier
https://doi.org/10.25318/3710000901-eng
Dataset updated
Oct 10, 2024
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
Enrolments in regular second language programs (or core language programs), French immersion programs, and education programs in the minority official language offered in public elementary and secondary schools, by type of program, grade and sex.
