The statistic reflects the distribution of languages in Canada in 2022. In 2022, 87.1 percent of the total population in Canada spoke English as their native tongue.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Canadian English General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of English speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Canadian English communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade English speech models that understand and respond to authentic Canadian accents and dialects.
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Canadian English. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
The dataset comes with granular metadata for both speakers and recordings:
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
This dataset is a versatile resource for multiple English speech and language AI applications:
Over the past fifty years, the proportion of Quebecers speaking both English and French has increased steadily, from **** percent in 1971 to almost half the population (**** percent) in 2021. The rate of English-French bilingualism, on the other hand, has declined in the rest of the country: outside Quebec, just over ten percent of people were bilingual in English and French in 2001, compared to *** percent two decades later.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Refers to the percentage of individuals who, at the time of the census, most often spoke at least one of English or French at home.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Facts and Figures, Profiles of Official Language Immigrants: English Speaking Permanent Residents in Quebec presents the annual intake of English-speaking permanent residents in the province of Quebec by category of immigration from 2006 to 2015. The report examines selected characteristics for English-speaking permanent residents. “English-speaking immigrants” are defined by the following criteria: 1) permanent residents with English as Mother Tongue; 2) permanent residents with Mother Tongue other than English and with “English Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
In 2021, French was the first language spoken by over 71 percent of the population of Montréal, Québec in Canada. 20.4 percent of the city's residents had English as their first language, 6.7 percent used both English and French as their primary language, and 1.6 percent of the population spoke another language. That same year, 46.4 percent of people living in the province of Québec could speak both English and French.
This service shows the percentage of population, excluding institutional residents, with knowledge of English and French for Canada by 2016 census division. The data is from the Census Profile, Statistics Canada Catalogue no. 98-316-X2016001.
Knowledge of official languages refers to whether the person can conduct a conversation in English only, French only, in both languages or in neither language. For a child who has not yet learned to speak, this includes languages that the child is learning to speak at home. For additional information refer to 'Knowledge of official languages' in the 2016 Census Dictionary.
For a cartographic representation of the ecumene with this socio-economic indicator, it is recommended to add the “NRCan - 2016 population ecumene by census division” web service, accessible in the data resources section below, as the first layer.
In 2021, most of the population of the city of Montreal, located in the Canadian province of Quebec, could speak both English and French. In fact, approximately 1.23 million men and 1.68 million women were bilingual. Of those who spoke only one of the official languages, the majority (1.43 million people) spoke only French. In addition, more than 68,400 people did not know either language, with women outnumbering men.
100% data.
English (Canada) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home command and control, in-car command and control, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (466 people in total), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring that user privacy and legal rights are maintained throughout data collection, storage, and usage. Our datasets are all GDPR, CCPA, and PIPL compliant.
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Canadian English Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.
The dataset contains 30 hours of dual-channel call center recordings between native Canadian English speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.
This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.
This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.
Rich metadata is available for each participant and conversation:
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This map shows the percentage of the Canadian population whose mother tongue is English. The 1996 Census defines mother tongue as the first language a person learned at home in childhood and still understood at the time of the census. The 1996 Census showed that 24.0 million Canadians could speak English (84%), 19.3 million spoke English most often at home (68%), and 17.1 million had English as their mother tongue (60%).
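As a quick arithmetic check, each count and percentage pair quoted above implies a base population, and the three pairs should point to the same base. The sketch below is illustrative only: the helper name `implied_population_range` is hypothetical, and it assumes each percentage was rounded to the nearest whole percent while treating the counts in millions as exact.

```python
# Figures quoted in the description: (millions of people, rounded percent).
figures = {
    "can speak English": (24.0, 84),
    "speak English most often at home": (19.3, 68),
    "English as mother tongue": (17.1, 60),
}


def implied_population_range(millions: float, percent: int) -> tuple[float, float]:
    """Range of base populations (in millions) consistent with a rounded percent.

    Assumption: the true share lies within half a percentage point
    of the published whole-number percentage.
    """
    low = millions / ((percent + 0.5) / 100)
    high = millions / ((percent - 0.5) / 100)
    return low, high


ranges = {label: implied_population_range(m, p) for label, (m, p) in figures.items()}

# The figures are mutually consistent if all three ranges overlap.
overlap_low = max(low for low, _ in ranges.values())
overlap_high = min(high for _, high in ranges.values())

for label, (low, high) in ranges.items():
    print(f"{label}: {low:.2f} to {high:.2f} million")
print(f"common range: {overlap_low:.2f} to {overlap_high:.2f} million")
```

Under these assumptions the three ranges overlap at roughly 28.4 to 28.6 million, so the quoted counts and percentages are consistent with a single base population.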
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Canadian English Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of English speech recognition, spoken language understanding, and conversational AI systems. With 30 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
The dataset features 30 hours of dual-channel call center conversations between native Canadian English speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
This dataset can be used across a range of healthcare and voice AI use cases:
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Canadian English Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for English-speaking real estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, ideal for building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
The dataset features 30 hours of dual-channel call center recordings between native Canadian English speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
Such domain-rich variety ensures model generalization across common real estate support conversations.
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
These transcriptions streamline ASR and NLP development for English real estate voice applications.
Detailed metadata accompanies each participant and conversation:
This enables smart filtering, dialect-focused model training, and structured dataset exploration.
This dataset is ideal for voice AI and NLP systems built for the real estate sector:
This ZIP file contains an IVT file. File size: 13.9 MiB
This dataset displays information regarding the language spoken most often at home. This data is available on the Census Division level, and is available from the 2006 Canadian Census. This data was obtained through: Statistics Canada. This data refers to the language spoken most often at home by the individual at the time of the census. Other languages spoken at home on a regular basis were also collected. Included are population figures for the following attributes: Total Population, English, French, Non-Official, English and French, English and Non-Official Language, French and Non-Official Language, and English French and Non-Official Speaking. This data is also broken down by Age Group.
100% data.
Enrolments in regular second language programs (or core language programs), French immersion programs, and education programs in the minority official language offered in public elementary and secondary schools, by type of program, grade and sex.