Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Facts and Figures, Profiles of Official Language Immigrants: French Speaking Permanent Residents outside Quebec presents the annual intake of French-speaking permanent residents in Canada outside the province of Québec, by category of immigration from 2006 to 2015. The report examines selected characteristics for French-speaking permanent residents. “French-speaking immigrants” are defined by the following criteria: 1) permanent residents with French as Mother Tongue; 2) permanent residents with Mother Tongue other than French and with “French Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
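The suppression and rounding rule described above can be illustrated with a short sketch. This is not the publisher's actual procedure: the function name `publish` and the sample counts are hypothetical, and the sketch assumes that counts of 1 to 4 are suppressed while a true zero is kept as 0 (the description does not say how the endpoints are treated).

```python
def publish(value: int):
    """Return the published form of a true count under the stated rule."""
    # Assumption: counts of 1 to 4 are suppressed; a true zero stays 0.
    if 0 < value < 5:
        return "--"
    # Round to the nearest multiple of 5 (integers never fall on a tie).
    return (value + 2) // 5 * 5


# Hypothetical true counts for three immigration categories.
true_counts = [3, 12, 48]

published = [publish(v) for v in true_counts]
published_total = publish(sum(true_counts))

# Sum of the figures a reader can actually see in the published table.
visible_sum = sum(v for v in published if v != "--")

print(published, published_total, visible_sum)
```

In this made-up example the visible figures add up to less than the published total, which is the kind of mismatch the note warns about.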

https://www.futurebeeai.com/policies/ai-data-license-agreement
This Canadian French Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking Real Estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, making it well suited for building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
The dataset features 30 hours of dual-channel call center recordings between native Canadian French speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
Such domain-rich variety ensures model generalization across common real estate support conversations.
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
These transcriptions streamline ASR and NLP development for French real estate voice applications.
Detailed metadata accompanies each participant and conversation:
This enables smart filtering, dialect-focused model training, and structured dataset exploration.
This dataset is ideal for voice AI and NLP systems built for the real estate sector:

This service shows the percentage of population, excluding institutional residents, with knowledge of English and French for Canada by 2016 census division. The data is from the Census Profile, Statistics Canada Catalogue no. 98-316-X2016001.
Knowledge of official languages refers to whether the person can conduct a conversation in English only, French only, in both languages or in neither language. For a child who has not yet learned to speak, this includes languages that the child is learning to speak at home. For additional information refer to 'Knowledge of official languages' in the 2016 Census Dictionary.
To produce a cartographic representation of the ecumene with this socio-economic indicator, add the “NRCan - 2016 population ecumene by census division” web service as the first layer; it is accessible in the data resources section below.

https://www.futurebeeai.com/policies/ai-data-license-agreement
This Canadian French Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.
The dataset contains 30 hours of dual-channel call center recordings between native Canadian French speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.
This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.
This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.
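As a rough illustration of how time-coded JSON transcriptions of dual-channel calls might be consumed, the sketch below totals speaking time per speaker. The schema is hypothetical: the field names `speaker`, `start`, `end`, and `text`, and the sample content, are assumptions made for this example and are not taken from the dataset's documentation.

```python
import json
from collections import defaultdict

# Hypothetical transcript; field names are assumptions, not the real schema.
sample = '''
[
  {"speaker": "agent", "start": 0.0, "end": 2.5, "text": "Bonjour, comment puis-je vous aider ?"},
  {"speaker": "customer", "start": 2.5, "end": 6.0, "text": "Je voudrais vérifier ma facture."},
  {"speaker": "agent", "start": 6.0, "end": 7.5, "text": "Bien sûr."}
]
'''


def talk_time_by_speaker(segments):
    """Sum segment durations (in seconds) for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


segments = json.loads(sample)
totals = talk_time_by_speaker(segments)
print(totals)
```

The same loop could filter segments by speaker or time range before the text is passed to an ASR training pipeline, once the actual field names are known.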
Rich metadata is available for each participant and conversation:

https://www.futurebeeai.com/policies/ai-data-license-agreement
This Canadian French Call Center Speech Dataset for the BFSI (Banking, Financial Services, and Insurance) sector is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking customers. Featuring over 30 hours of real-world, unscripted audio, it offers authentic customer-agent interactions across a range of BFSI services to train robust and domain-aware ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI developers, financial technology teams, and NLP researchers to build high-accuracy, production-ready models across BFSI customer service scenarios.
The dataset contains 30 hours of dual-channel call center recordings between native Canadian French speakers. Captured in realistic financial support settings, these conversations span diverse BFSI topics from loan enquiries and card disputes to insurance claims and investment options, providing deep contextual coverage for model training and evaluation.
This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral, ensuring real-world BFSI voice coverage.
This variety ensures models trained on the dataset are equipped to handle complex financial dialogues with contextual accuracy.
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
These transcriptions are production-ready, making financial domain model training faster and more accurate.
Rich metadata is available for each participant and conversation:

This map shows the percentage of the Canadian population whose mother tongue was French. The 1996 Census defines mother tongue as the first language a person learned at home in childhood and still understood at the time of the census. The 1996 Census showed that 8.9 million Canadians could conduct a conversation in French (31%), 6.4 million spoke French most often at home (23%) and 6.7 million had French as their mother tongue (24%).

https://www.futurebeeai.com/policies/ai-data-license-agreement
The French TTS Monologue Speech Dataset is a professionally curated resource built to train realistic, expressive, and production-grade text-to-speech (TTS) systems. It contains studio-recorded long-form speech by trained native French voice artists, each contributing 1 to 2 hours of clean, uninterrupted monologue audio.
Unlike typical prompt-based datasets with short, isolated phrases, this collection features long-form, topic-driven monologues that mirror natural human narration. It includes content types that are directly useful for real-world applications, like audiobook-style storytelling, educational lectures, health advisories, product explainers, digital how-tos, formal announcements, and more.
All recordings are captured in professional studios using high-end equipment and under the guidance of experienced voice directors.
Only clean, production-grade audio makes it into the final dataset.
All voice artists are native French speakers with professional training or prior experience in narration. We ensure a diverse pool in terms of age, gender, and region to bring a balanced and rich vocal dataset.
Scripts are not generic or repetitive. They are professionally authored by domain experts to reflect real-world use cases, and they avoid redundancy while including modern vocabulary, emotional range, and phonetically rich sentence structures.
While the script is used during the recording, we also provide post-recording updates to ensure the transcript reflects the final spoken audio. Minor edits are made to adjust for skipped or rephrased words.

100% data.

This Alberta Official Statistic compares the knowledge of languages among the Aboriginal Identity population in provinces and territories, based on self-assessment of the ability to converse in the language. Based on the 2011 National Household Survey (NHS), English is the most common language known by the Aboriginal Identity Population across Canada. In most provinces, nearly 100% of the Aboriginal Identity population can converse in English. The lowest proportion of English-speaking Aboriginal people is in Quebec, where the majority speak French. The highest proportion of Aboriginal people who speak Aboriginal languages was in Nunavut at 88.6%, followed by Quebec (32.4%) and the Northwest Territories (32.1%). In Alberta, more Aboriginal people are able to speak Aboriginal languages (15.1%) than are able to speak French or other (non-Aboriginal) languages. The proportion of Alberta Aboriginal people able to speak Aboriginal languages was sixth highest among provinces and territories.

Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Facts and Figures, Profiles of Official Language Immigrants: English Speaking Permanent Residents in Quebec presents the annual intake of English-speaking permanent residents in the province of Quebec by category of immigration from 2006 to 2015. The report examines selected characteristics for English-speaking permanent residents. “English-speaking immigrants” are defined by the following criteria: 1) permanent residents with English as Mother Tongue; 2) permanent residents with Mother Tongue other than English and with “English Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.

Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 1392 series, with data for years 1994 - 1998 (not all combinations necessarily have data for all years), and was last released on 2007-01-29. This table contains data described by the following dimensions (not all combinations are available): Geography (29 items: Austria; Belgium (French speaking); Canada; Belgium (Flemish speaking) ...), Sex (2 items: Males; Females ...), Age groups (3 items: 11 years; 13 years; 15 years ...), Student response (2 items: Yes; No ...), Family member (4 items: Mother; Father; Stepfather; Stepmother ...).
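The series count quoted above appears to be the product of the item counts of the five dimensions. The following sketch checks that arithmetic; the dictionary name `dimensions` is just an illustrative choice, and the product is an upper bound on populated series, since the description notes that not all combinations have data.

```python
import math

# Item counts per dimension, as listed in the table description.
dimensions = {
    "Geography": 29,
    "Sex": 2,
    "Age groups": 3,
    "Student response": 2,
    "Family member": 4,
}

# Every combination of one item per dimension defines one series.
series_count = math.prod(dimensions.values())
print(series_count)
```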

https://www.futurebeeai.com/policies/ai-data-license-agreement
Presenting the Canadian French Scripted Monologue Speech Dataset for the Telecom Domain, a purpose-built dataset created to accelerate the development of French speech recognition and voice AI models specifically tailored for the telecommunications industry.
This dataset includes over 6,000 high-quality scripted prompt recordings in Canadian French, representing real-world telecom customer service scenarios. It’s designed to support the training of speech-based AI systems used in call centers, virtual agents, and voice-powered support tools.
The dataset reflects a wide variety of common telecom customer interactions, including:
To maximize contextual richness, prompts include:
Each audio file is paired with an accurate, verbatim transcription for precise model training:
Detailed metadata is included to

Enrolments in regular second language programs (or core language programs), French immersion programs, and education programs in the minority official language offered in public elementary and secondary schools, by type of program, grade and sex.

https://www.futurebeeai.com/policies/ai-data-license-agreement
Introducing the Canadian French Scripted Monologue Speech Dataset for the Real Estate Domain, a dataset designed to support the development of French speech recognition and conversational AI technologies tailored for the real estate industry.
This dataset includes over 6,000 high-quality scripted prompt recordings in Canadian French. The speech content reflects a wide range of real estate interactions to help build intelligent, domain-specific customer support systems and speech-enabled tools.
This dataset captures a broad spectrum of use cases and conversational themes within the real estate sector, such as:
Each scripted prompt incorporates key elements to simulate realistic real estate conversations:
To ensure precision in model training, each audio recording is paired with a verbatim text transcription:
Each data sample is enriched with detailed metadata to enhance usability:

Refers to the percentage of individuals who, at the time of the census, most often spoke at least one of English or French at home.

https://www.futurebeeai.com/policies/ai-data-license-agreement
The Canadian French Wake Word & Voice Command Dataset is expertly curated to support the training and development of voice-activated systems. This dataset includes a large collection of wake words and command phrases, essential for enabling seamless user interaction with voice assistants and other speech-enabled technologies. It’s designed to ensure accurate wake word detection and voice command recognition, enhancing overall system performance and user experience.
This dataset includes 20,000+ audio recordings of wake words and command phrases. Each participant contributed 400 recordings, captured under varied environmental conditions and speaking speeds. The data covers:
This diversity ensures robust training for real-world voice assistant applications.
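The recording counts quoted for this dataset imply a rough minimum number of contributors. A small sketch of that arithmetic follows, assuming "20,000+" means at least 20,000 recordings and that every participant contributed exactly 400; the variable names are illustrative only.

```python
import math

total_recordings = 20_000   # stated as "20,000+", treated here as a lower bound
per_participant = 400       # recordings contributed by each participant

# Smallest whole number of participants consistent with both figures.
min_participants = math.ceil(total_recordings / per_participant)
print(min_participants)
```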
Each audio file is accompanied by detailed metadata to support advanced filtering and training needs.

Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This dataset represents the number of Anglophone and Francophone Canadian Armed Forces (CAF) Regular Force members by Officers and Non-Commissioned Members from 1997 to 2022. Military Personnel Command (MPC) supports the requirement to release accurate and timely information to Canadians, in line with the principles of Open Government. MPC has made every attempt to ensure the accuracy and reliability of the information provided. However, data contained within this report may also appear in historic, current and future reports of a similar nature where it may be represented differently, and in some cases appear to be in conflict with the current report. MPC assumes no responsibility, or liability, for any errors or omissions in the content of this publication. The Commander of Military Personnel Command (MILPERSCOM) is also appointed as the Chief of Military Personnel (CMP).

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This Gallup poll seeks the opinions of Canadians. The primary subject of this survey is politics, with the questions focussing on politicians and political parties, as well as other issues of political importance to both Canada and other countries. Respondents were also asked questions so that they could be grouped according to geographic, demographic and social groups. Topics of interest include: Adolf Eichmann's trial in Israel; concentration camps; the Conservative party's majority; federal elections; friendliness towards people from Germany and Japan; mandatory English classes in French-speaking provinces; mandatory French classes in English-speaking provinces; Kennedy's performance as American President; major problems facing the government; nuclear weapons testing, and the possibility of nuclear war; the Peace Corps; preferred political parties; religion being taught in schools; unemployment; union membership; voting behaviour; and whether Western Canada is more friendly than the rest of Canada. Basic demographic variables are also included.

Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This call for proposals aims to fund projects that improve access to health services for members of French-speaking communities outside Québec and English-speaking communities in Québec (known as official language minority communities - OLMCs).

Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Population that speaks an official language (English or French) as the primary language in the home expressed as a percentage of the total population.