20 datasets found

Facts and Figures 2015: Profiles of Official Language Immigrants: French...
ouvert.canada.ca
open.canada.ca
xls
Updated Nov 22, 2024
Cite
Immigration, Refugees and Citizenship Canada (2024). Facts and Figures 2015: Profiles of Official Language Immigrants: French Speaking Permanent Residents Outside Quebec [Dataset]. https://ouvert.canada.ca/data/dataset/656d603b-b07e-4f6c-9e3a-92b1d85f2d91
Available download formats
xls
Dataset updated
Nov 22, 2024
Dataset provided by
Immigration, Refugees and Citizenship Canada (http://www.cic.gc.ca/)
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Time period covered
Jan 1, 2006 - Dec 31, 2015
Area covered
Québec City, Quebec, French
Description
Facts and Figures, Profiles of Official Language Immigrants: French Speaking Permanent Residents outside Quebec presents the annual intake of French-speaking permanent residents in Canada outside the province of Québec, by category of immigration, from 2006 to 2015. The report examines selected characteristics of French-speaking permanent residents. “French-speaking immigrants” are defined by the following criteria: 1) permanent residents with French as Mother Tongue; 2) permanent residents with a Mother Tongue other than French and with “French Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
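The suppression and rounding rule described above can be illustrated with a short Python sketch. It is an illustration under stated assumptions, not the publisher's actual procedure: the function name `disclose` is hypothetical, "between 0 and 5" is read as inclusive at both ends, and rounding is plain nearest-multiple rounding.

```python
def disclose(value: int) -> str:
    """Apply the described suppression/rounding rule to one count (illustrative only)."""
    if value < 0:
        raise ValueError("counts cannot be negative")
    if value <= 5:
        # Assumption: values 0 through 5 inclusive are suppressed.
        return "--"
    # Round to the nearest multiple of 5 (integers never fall exactly halfway).
    return str(5 * ((value + 2) // 5))


# Hypothetical cell counts, chosen only to show why published sums can differ from totals.
cells = [6, 6, 6]
published_cells = [disclose(c) for c in cells]   # each cell rounds to "5"
published_total = disclose(sum(cells))           # the true total, 18, rounds to "20"
print(published_cells, published_total)
```

In this made-up example the published cells add up to 15 while the published total is 20, which is the kind of discrepancy the description warns about.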
Canadian French Call Center Data for Realestate AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French Call Center Data for Realestate AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-french-canada
Available download formats
wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
Introduction
This Canadian French Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking real estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, making it well suited to building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
Speech Data
The dataset features 30 hours of dual-channel call center recordings between native Canadian French speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
•Participant Diversity:
•Speakers: 60 native Canadian French speakers from our verified contributor community.
•Regions: Representing different provinces across Canada to ensure accent and dialect variation.
•Participant Profile: Balanced gender mix (60% male, 40% female) and age range from 18 to 70.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted agent-customer discussions.
•Call Duration: Average 5–15 minutes per call.
•Audio Format: Stereo WAV, 16-bit, recorded at 8 kHz and 16 kHz.
•Recording Environment: Captured in noise-free and echo-free conditions.
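The audio format stated above (stereo, 16-bit, 8 kHz or 16 kHz) can be checked on a downloaded file with Python's standard `wave` module. The sketch below is illustrative only: `describe_wav` and `matches_listing` are hypothetical helper names, and the one-second silent clip is synthetic stand-in data, not part of the dataset.

```python
import io
import wave


def describe_wav(source) -> dict:
    """Read basic header fields from a WAV file path or file-like object."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def matches_listing(info: dict) -> bool:
    """Check the header against the format stated in the listing."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] in (8000, 16000)
    )


# Build one second of stereo 16-bit silence at 8 kHz in memory (synthetic data).
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(2)
    out.setsampwidth(2)      # 2 bytes per sample = 16-bit
    out.setframerate(8000)
    out.writeframes(b"\x00" * (8000 * 2 * 2))  # frames * channels * bytes per sample
buf.seek(0)

info = describe_wav(buf)
print(info, matches_listing(info))
```

For a real file, passing its path to `describe_wav` in place of the in-memory buffer should work the same way.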

Topic Diversity
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
•Inbound Calls:
•Property Inquiries
•Rental Availability
•Renovation Consultation
•Property Features & Amenities
•Investment Property Evaluation
•Ownership History & Legal Info, and more
•Outbound Calls:
•New Listing Notifications
•Post-Purchase Follow-ups
•Property Recommendations
•Value Updates
•Customer Satisfaction Surveys, and others
Such domain-rich variety ensures model generalization across common real estate support conversations.
Transcription
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., background noise, pauses)
•High transcription accuracy with word error rate below 5% via dual-layer human review.
These transcriptions streamline ASR and NLP development for French real estate voice applications.
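For readers unfamiliar with the word error rate figure quoted above, the following is a minimal sketch of the usual definition: word-level edit distance (substitutions, insertions, deletions) divided by the number of reference words. How the vendor measures it is not stated in the listing, so the function `word_error_rate` and the sample sentences are assumptions for illustration only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# Made-up sentences: one word ("la") is missing from the hypothesis.
wer = word_error_rate("je voudrais visiter la maison", "je voudrais visiter maison")
print(wer)
```

A rate below 5% would mean fewer than one such error per twenty reference words on average.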
Metadata
Detailed metadata accompanies each participant and conversation:
•Participant Metadata: ID, age, gender, location, accent, and dialect.
•Conversation Metadata: Topic, call type, sentiment, sample rate, and technical details.

This enables smart filtering, dialect-focused model training, and structured dataset exploration.
Usage and Applications
This dataset is ideal for voice AI and NLP systems built for the real estate sector:
Percentage of population with knowledge of English and French by census...
datasets.ai
catalogue.arctic-sdi.org
0, 21, 23, 52
Updated Mar 19, 2019
Cite
Statistics Canada | Statistique Canada (2019). Percentage of population with knowledge of English and French by census division, 2016 [Dataset]. https://datasets.ai/datasets/7043f8c1-d5e5-492f-8bb1-7eeac9f2a74f
Available download formats
52, 21, 0, 23
Dataset updated
Mar 19, 2019
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Authors
Statistics Canada | Statistique Canada
Area covered
French
Description
This service shows the percentage of population, excluding institutional residents, with knowledge of English and French for Canada by 2016 census division. The data is from the Census Profile, Statistics Canada Catalogue no. 98-316-X2016001.

Knowledge of official languages refers to whether the person can conduct a conversation in English only, French only, in both languages or in neither language. For a child who has not yet learned to speak, this includes languages that the child is learning to speak at home. For additional information refer to 'Knowledge of official languages' in the 2016 Census Dictionary.

To have a cartographic representation of the ecumene with this socio-economic indicator, it is recommended to add as the first layer, the “NRCan - 2016 population ecumene by census division” web service, accessible in the data resources section below.
Canadian French Call Center Data for Telecom AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-french-canada
Available download formats
wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
Introduction
This Canadian French Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Canadian French speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.
•Participant Diversity:
•Speakers: 60 native Canadian French speakers from our verified contributor pool.
•Regions: Representing multiple provinces across Canada to ensure coverage of various accents and dialects.
•Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
•Call Duration: Ranges from 5 to 15 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, at 8 kHz and 16 kHz sample rates.
•Recording Environment: Captured in clean conditions with no echo or background noise.

Topic Diversity
This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.
•Inbound Calls:
•Phone Number Porting
•Network Connectivity Issues
•Billing and Payments
•Technical Support
•Service Activation
•International Roaming Enquiry
•Refund Requests and Billing Adjustments
•Emergency Service Access, and others
•Outbound Calls:
•Welcome Calls & Onboarding
•Payment Reminders
•Customer Satisfaction Surveys
•Technical Updates
•Service Usage Reviews
•Network Complaint Status Calls, and more
This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.
Transcription
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, coughs)
•High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.
These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.
Metadata
Rich metadata is available for each participant and conversation:
•Participant Metadata: ID, age, gender, accent, dialect, and location.

Canadian French Call Center Data for BFSI AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French Call Center Data for BFSI AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/bfsi-call-center-conversation-french-canada
Available download formats
wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
Introduction
This Canadian French Call Center Speech Dataset for the BFSI (Banking, Financial Services, and Insurance) sector is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking customers. Featuring over 30 hours of real-world, unscripted audio, it offers authentic customer-agent interactions across a range of BFSI services to train robust and domain-aware ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI developers, financial technology teams, and NLP researchers to build high-accuracy, production-ready models across BFSI customer service scenarios.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Canadian French speakers. Captured in realistic financial support settings, these conversations span diverse BFSI topics from loan enquiries and card disputes to insurance claims and investment options, providing deep contextual coverage for model training and evaluation.
•Participant Diversity:
•Speakers: 60 native Canadian French speakers from our verified contributor pool.
•Regions: Representing multiple provinces across Canada to ensure coverage of various accents and dialects.
•Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
•Call Duration: Ranges from 5 to 15 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, at 8 kHz and 16 kHz sample rates.
•Recording Environment: Captured in clean conditions with no echo or background noise.

Topic Diversity
This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral, ensuring real-world BFSI voice coverage.
•Inbound Calls:
•Debit Card Block Request
•Transaction Disputes
•Loan Enquiries
•Credit Card Billing Issues
•Account Closure & Claims
•Policy Renewals & Cancellations
•Retirement & Tax Planning
•Investment Risk Queries, and more
•Outbound Calls:
•Loan & Credit Card Offers
•Customer Surveys
•EMI Reminders
•Policy Upgrades
•Insurance Follow-ups
•Investment Opportunity Calls
•Retirement Planning Reviews, and more
This variety ensures models trained on the dataset are equipped to handle complex financial dialogues with contextual accuracy.
Transcription
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, background noise)
•High transcription accuracy with word error rate < 5% due to double-layered quality checks.
These transcriptions are production-ready, making financial domain model training faster and more accurate.
Metadata
Rich metadata is available for each participant and conversation:
•Participant Metadata: ID, age, gender,
Mother Tongue (French), 1996
datasets.ai
open.canada.ca
0, 57
Updated Sep 26, 2016
Cite
Natural Resources Canada | Ressources naturelles Canada (2016). Mother Tongue (French), 1996 [Dataset]. https://datasets.ai/datasets/e66df20f-8893-11e0-bf79-6cf049291510
Available download formats
57, 0
Dataset updated
Sep 26, 2016
Dataset authored and provided by
Natural Resources Canada | Ressources naturelles Canada
Area covered
French
Description
This map shows the percentage of the Canadian population whose mother tongue was French. The 1996 Census defines mother tongue as the first language a person learned at home in childhood and still understood at the time of the census. The 1996 Census showed that 8.9 million Canadians could conduct a conversation in French (31%), 6.4 million spoke French most often at home (23%) and 6.7 million had French as their mother tongue (24%).
Canadian French TTS Speech Dataset for Speech Synthesis
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French TTS Speech Dataset for Speech Synthesis [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/tts-monolgue-french-canada
Available download formats
wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
The French TTS Monologue Speech Dataset is a professionally curated resource built to train realistic, expressive, and production-grade text-to-speech (TTS) systems. It contains studio-recorded long-form speech by trained native French voice artists, each contributing 1 to 2 hours of clean, uninterrupted monologue audio.
Unlike typical prompt-based datasets with short, isolated phrases, this collection features long-form, topic-driven monologues that mirror natural human narration. It includes content types that are directly useful for real-world applications, like audiobook-style storytelling, educational lectures, health advisories, product explainers, digital how-tos, formal announcements, and more.
All recordings are captured in professional studios using high-end equipment and under the guidance of experienced voice directors.
Recording & Audio Quality
•Audio Format: WAV, 48 kHz, available in 16-bit, 24-bit, and 32-bit depth
•SNR: Minimum 30 dB
•Channel: Mono
•Recording Duration: 20–30 minutes
•Recording Environment: Studio-controlled, acoustically treated rooms
•Per Speaker Volume: 1–2 hours of speech per artist
•Quality Control: Each file is reviewed and cleaned for common acoustic issues, including reverberation, lip smacks, mouth clicks, thumping, hissing, plosives, sibilance, background noise, static interference, clipping, and other artifacts.

Only clean, production-grade audio makes it into the final dataset.
Voice Artist Selection
All voice artists are native French speakers with professional training or prior experience in narration. We ensure a diverse pool in terms of age, gender, and region to bring a balanced and rich vocal dataset.
•Artist Profile:
•Gender: Male and Female
•Age Range: 20–60 years
•Regions: Native French-speaking regions of Canada
•Selection Process: All artists are screened, onboarded, and sample-approved using FutureBeeAI’s proprietary Yugo platform.

Script Quality & Coverage
Scripts are not generic or repetitive. Scripts are professionally authored by domain experts to reflect real-world use cases. They avoid redundancy and include modern vocabulary, emotional range, and phonetically rich sentence structures.
•Word Count per Script: 3,000–5,000 words per 30-minute session

•Content Types:
•Storytelling
•Script and book reading
•Informational explainers
•Government service instructions
•E-commerce tutorials
•Motivational content
•Health & wellness guides
•Education & career advice
•Linguistic Design: Balanced punctuation, emotional range, modern syntax, and vocabulary diversity

Transcripts & Alignment
While the script is used during the recording, we also provide post-recording updates to ensure the transcript reflects the final spoken audio. Minor edits are made to adjust for skipped or rephrased words.
•Segmentation: Time-stamped at the sentence level, aligned to actual spoken delivery
•Format: Available in plain text and JSON

•Post-processing:
•Corrected for disfluencies
2016. English Spoken at Home, French Spoken at Home, Aboriginal Language...
desq.quescren.ca
Updated Mar 30, 2024
Cite
(2024). 2016. English Spoken at Home, French Spoken at Home, Aboriginal Language Spoken at Home, Immigrant Language Spoken at Home, Mother Tongue, Age and Sex for the Population Excluding Institutional Residents of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations - Dataset - Data Portal on English-Speaking Quebec [Dataset]. https://desq.quescren.ca/dataset/chssn-2016-98-400-x2016344
Dataset updated
Mar 30, 2024
Area covered
Quebec, Canada, French
Description
100% data.
Knowledge of Language of Aboriginal Identity Population, Canada, Provinces...
open.alberta.ca
Updated May 28, 2013
Cite
(2013). Knowledge of Language of Aboriginal Identity Population, Canada, Provinces and Territories - Open Government [Dataset]. https://open.alberta.ca/dataset/knowledge-of-language-of-aboriginal-identity-population-canada-provinces-and-territories
Dataset updated
May 28, 2013
Area covered
Canada
Description
This Alberta Official Statistic compares the knowledge of languages among the Aboriginal Identity population in provinces and territories, based on self-assessment of the ability to converse in the language. Based on the 2011 National Household Survey (NHS), English is the most common language known by the Aboriginal Identity Population across Canada. In most provinces, nearly 100% of the Aboriginal Identity population can converse in English. The lowest proportion of English-speaking Aboriginal people is in Quebec, where the majority speak French. The highest proportion of Aboriginal people who speak Aboriginal languages was in Nunavut at 88.6%, followed by Quebec (32.4%) and the Northwest Territories (32.1%). In Alberta, more Aboriginal people are able to speak Aboriginal languages (15.1%) than are able to speak French or other (non-Aboriginal) languages. The proportion of Alberta Aboriginal people able to speak Aboriginal languages was sixth highest among provinces and territories.
Facts and Figures 2015: Profiles of Official Language Immigrants: English...
data.urbandatacentre.ca
Updated Oct 19, 2025
Cite
(2025). Facts and Figures 2015: Profiles of Official Language Immigrants: English Speaking Permanent Residents inside Quebec - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-caa61377-f34c-4f31-89ae-a57c8a73f99d
Dataset updated
Oct 19, 2025
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Area covered
Quebec, Canada
Description
Facts and Figures, Profiles of Official Language Immigrants: English Speaking Permanent Residents in Quebec presents the annual intake of English-speaking permanent residents in the province of Quebec, by category of immigration, from 2006 to 2015. The report examines selected characteristics of English-speaking permanent residents. “English-speaking immigrants” are defined by the following criteria: 1) permanent residents with English as Mother Tongue; 2) permanent residents with a Mother Tongue other than English and with “English Only” as official language spoken (excluding “Both English and French” as official language spoken). Note that official language(s) spoken (English only, French only, both French and English, and neither language) are self-declared indicators of knowledge of an official language. Please note that in these datasets, the figures have been suppressed or rounded to prevent the identification of individuals when the datasets are compiled and compared with other publicly available statistics. Values between 0 and 5 are shown as “--” and all other values are rounded to the nearest multiple of 5. As a result, the sum of the figures may not equal the totals indicated.
Student response to question: Which of these people live at your home...
data.wu.ac.at
www150.statcan.gc.ca
csv, html, xml
Updated Jul 26, 2018
Cite
Statistics Canada | Statistique Canada (2018). Student response to question: Which of these people live at your home (answers are for the home where they live most of the time), by sex, age group and selected countries [Dataset]. https://data.wu.ac.at/schema/www_data_gc_ca/YWJhNGIwNTUtNmY2ZS00MTIyLTgwMGYtNDQyNDc2YTk2ZTc4
Available download formats
html, csv, xml
Dataset updated
Jul 26, 2018
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
This table contains 1392 series, with data for years 1994-1998 (not all combinations necessarily have data for all years), and was last released on 2007-01-29. This table contains data described by the following dimensions (not all combinations are available): Geography (29 items: Austria; Belgium (French speaking); Canada; Belgium (Flemish speaking) ...), Sex (2 items: Males; Females ...), Age groups (3 items: 11 years; 13 years; 15 years ...), Student response (2 items: Yes; No ...), Family member (4 items: Mother; Father; Stepfather; Stepmother ...).
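The series count quoted above is consistent with the dimension sizes listed in the description: 29 × 2 × 3 × 2 × 4 = 1392. A minimal check, assuming each series corresponds to one combination of the five dimensions (the description itself notes that not every combination necessarily has data):

```python
from math import prod

# Dimension sizes as listed in the table description.
dimension_sizes = {
    "Geography": 29,
    "Sex": 2,
    "Age groups": 3,
    "Student response": 2,
    "Family member": 4,
}

series_count = prod(dimension_sizes.values())
print(series_count)
```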
Canadian French Scripted Monologue Speech Data for Telecom
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French Scripted Monologue Speech Data for Telecom [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/telecom-scripted-speech-monologues-spanish-usa
Available download formats
wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
Introduction
Presenting the Canadian French Scripted Monologue Speech Dataset for the Telecom Domain, a purpose-built dataset created to accelerate the development of French speech recognition and voice AI models specifically tailored for the telecommunications industry.
Speech Data
This dataset includes over 6,000 high-quality scripted prompt recordings in Canadian French, representing real-world telecom customer service scenarios. It’s designed to support the training of speech-based AI systems used in call centers, virtual agents, and voice-powered support tools.
•Participant Diversity
•Speakers: 60 native Canadian French speakers
•Geographic Distribution: Carefully selected from multiple regions across Canada to capture a wide spectrum of dialects and speaking styles
•Demographics: Balanced representation of males and females (60:40 ratio), aged 18 to 70 years
•Recording Specifications
•Type: Scripted monologue prompts focused on telecom industry use cases
•Duration: Each audio clip ranges from 5 to 30 seconds
•Format: WAV files in mono, 16-bit depth, with sample rates of 8 kHz and 16 kHz
•Environment: Clean, echo-free, and noise-controlled settings to ensure optimal audio clarity

Topic Coverage
The dataset reflects a wide variety of common telecom customer interactions, including:
•Customer onboarding and service inquiries
•Billing and payment questions
•Data plans and product information
•Technical support requests
•Network coverage discussions
•Regulatory compliance and policy information
•Upgrades, renewals, and service plan changes
•Domain-specific scripted interactions tailored to real-world telecom use cases
Contextual Depth
To maximize contextual richness, prompts include:
•Localized Names: Common Canadian names in various formats
•Addresses: Region-specific address structures for realism
•Dates & Times: Spoken date and time references in typical telecom scenarios (e.g., billing cycles, service activation times)
•Telecom Terminology: Keywords related to mobile data, network, SIM, devices, plans, etc.
•Numbers & Rates: Usage statistics, pricing info, recharge values, and billing figures
•Service Providers: References to telecom companies and third-party service entities

Transcription
Each audio file is paired with an accurate, verbatim transcription for precise model training:
•Content: Transcriptions are direct representations of each recorded prompt
•Format: Plain text (.TXT), with filenames matching their corresponding audio files
•Verification: Every transcription is manually verified by native Canadian French linguists to ensure consistency and accuracy

Metadata
Detailed metadata is included to
Number of students in official languages programs, public elementary and...
www150.statcan.gc.ca
data.urbandatacentre.ca
Updated Oct 28, 2025
Cite
Government of Canada, Statistics Canada (2025). Number of students in official languages programs, public elementary and secondary schools, by program type, grade and sex [Dataset]. http://doi.org/10.25318/3710000901-eng
Unique identifier
https://doi.org/10.25318/3710000901-eng
Dataset updated
Oct 28, 2025
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
Enrolments in regular second language programs (or core language programs), French immersion programs, and education programs in the minority official language offered in public elementary and secondary schools, by type of program, grade and sex.
Canadian French Scripted Monologue Speech Data in Real Estate
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French Scripted Monologue Speech Data in Real Estate [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/realestate-scripted-speech-monologues-spanish-usa
Available download formats
wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
Introduction
Introducing the Canadian French Scripted Monologue Speech Dataset for the Real Estate Domain, a dataset designed to support the development of French speech recognition and conversational AI technologies tailored for the real estate industry.
Speech Data
This dataset includes over 6,000 high-quality scripted prompt recordings in Canadian French. The speech content reflects a wide range of real estate interactions to help build intelligent, domain-specific customer support systems and speech-enabled tools.
•Participant Diversity
•Speakers: 60 native French speakers from across Canada
•Regional Variation: Balanced representation of regional dialects and speaking styles
•Demographics: Ages 18–70, with a 60:40 male-to-female ratio
•Recording Specifications
•Type: Scripted monologue recordings
•Duration: 5–30 seconds per audio clip
•Audio Format: WAV, mono channel, 16-bit, sampled at 8 kHz and 16 kHz
•Recording Environment: Quiet, echo-free settings with no background noise

Topic and Scenario Coverage
This dataset captures a broad spectrum of use cases and conversational themes within the real estate sector, such as:
•Property inquiries and viewing appointments
•Price negotiations and financial discussions
•Contractual and legal clarifications
•Relocation coordination and service support
•Real estate agent interactions
•Regulatory information and buyer/seller advisory
•Domain-specific spoken statements and service dialogues
Contextual Depth
Each scripted prompt incorporates key elements to simulate realistic real estate conversations:
•Names: Culturally appropriate Canadian names in various spoken formats
•Addresses: Detailed location references, including cities, districts, and street names
•Dates & Times: Contextual references to appointments, contract timelines, or move-in dates
•Property Descriptions: Features, measurements, and amenities of real estate listings
•Financial Details: Prices, rental amounts, down payments, deposits, and loan-related figures
•Legal Terms: Frequently used terms in property contracts and documentation

Transcription
To ensure precision in model training, each audio recording is paired with a verbatim text transcription:
•Content: Exact scripted text for each corresponding audio prompt
•Format: Plain text (.TXT) files named to match their associated audio recordings
•Quality Control: All transcriptions are manually reviewed by native Canadian French linguists for consistency and correctness

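Because the transcripts are described as plain-text files named to match their audio recordings, a loader can pair them by file stem. The following is a minimal sketch, not the provider's tooling: the function name `pair_audio_with_transcripts`, the lowercase `.wav` extension, and the assumption that each transcript sits in the same folder as its audio file are all hypothetical, since the listing does not describe the directory layout.

```python
from pathlib import Path


def pair_audio_with_transcripts(root):
    """Pair each WAV file under `root` with a same-named transcript.

    Assumption (not stated in the listing): the transcript lives next to
    the audio file and differs only in its extension (.txt or .TXT).
    Returns (pairs, missing), where `pairs` is a list of (wav, txt) paths
    and `missing` lists WAV files with no transcript found.
    """
    root = Path(root)
    pairs, missing = [], []
    for wav in sorted(root.rglob("*.wav")):
        # Try both spellings of the extension mentioned in the listing.
        candidates = [wav.with_suffix(".txt"), wav.with_suffix(".TXT")]
        txt = next((c for c in candidates if c.exists()), None)
        if txt is None:
            missing.append(wav)
        else:
            pairs.append((wav, txt))
    return pairs, missing
```

Reporting unmatched audio files separately, rather than silently skipping them, makes gaps in a delivered copy of the dataset easy to spot before training.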
Metadata
Each data sample is enriched with detailed metadata to enhance usability:
•Participant Metadata: ...
Percent official language speakers by municipality - Catalogue - Canadian...
data.urbandatacentre.ca
Updated Oct 19, 2025
Cite
(2025). Percent official language speakers by municipality - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-e669ec7d-bb3c-465f-ab77-3d2a58d93461
Explore at:
Dataset updated
Oct 19, 2025
Description
Refers to the percentage of individuals most often speaking at home at least one of English or French at the time of the census
Canadian French Wake Words & Voice Commands Speech Data
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Canadian French Wake Words & Voice Commands Speech Data [Dataset]. https://www.futurebeeai.com/dataset/wake-words-and-commands-dataset/wake-words-and-commands-french-canada
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Canada, French
Dataset funded by
FutureBeeAI
Description
Introduction
The Canadian French Wake Word & Voice Command Dataset is expertly curated to support the training and development of voice-activated systems. This dataset includes a large collection of wake words and command phrases, essential for enabling seamless user interaction with voice assistants and other speech-enabled technologies. It’s designed to ensure accurate wake word detection and voice command recognition, enhancing overall system performance and user experience.
Speech Data
This dataset includes 20,000+ audio recordings of wake words and command phrases. Each participant contributed 400 recordings, captured under varied environmental conditions and speaking speeds. The data covers:
•Wake words alone
•Wake words followed by command phrases
Participant Diversity
•Speakers: 50 native Canadian French speakers from the FutureBeeAI community
•Regions: Participants from various Canadian provinces, ensuring broad coverage of accents and dialects
•Demographics: Ages 18–70; 60% male and 40% female participants

Recording Details
•Type: Scripted wake words and command phrases
•Duration: 1 to 15 seconds per clip
•Format: WAV, stereo, 16-bit, with sample rates ranging from 16 kHz to 48 kHz

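The recording details above can be checked against a downloaded file using Python's standard `wave` module. This is an illustrative sketch only: the helper names `describe_wav` and `matches_listing` are hypothetical, and the thresholds are simply the figures quoted in the listing (stereo, 16-bit, 16 kHz to 48 kHz, 1 to 15 seconds), not a validation rule published by the provider.

```python
import wave


def describe_wav(path):
    """Read basic format properties from a PCM WAV file header."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "bit_depth": wf.getsampwidth() * 8,  # bytes per sample -> bits
            "sample_rate": rate,
            "duration_s": wf.getnframes() / rate,
        }


def matches_listing(info):
    """Compare a clip's properties with the figures quoted in the listing."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and 16_000 <= info["sample_rate"] <= 48_000
        and 1 <= info["duration_s"] <= 15
    )
```

Running `matches_listing(describe_wav(path))` over every clip is a quick way to flag files whose format differs from what the listing states before they reach a training pipeline.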
Dataset Diversity
•Wake Word Types
•Automobile Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Ok Ford, etc.
•Voice Assistant Wake Words: Hey Siri, Ok Google, Alexa, Hey Cortana, Hi Bixby, Hey Celia, etc.
•Home Appliance Wake Words: Hi LG, Ok LG, Hello Lloyd, and more
•Command Types by Use Case
•Automobile: Play music, check directions, voice search, provide feedback, and more
•Voice Assistant: Ask general questions, make calls, control devices, shopping, manage calendars, and more
•Home Appliances: Control appliances, check status, set reminders/alarms, manage shopping lists, etc.

•Recording Environments
•No background noise
•Background traffic noise
•People talking in the background
•Speaking Pace
•Normal speed
•Fast speed
This diversity ensures robust training for real-world voice assistant applications.
Metadata
Each audio file is accompanied by detailed metadata to support advanced filtering and training needs.
•Participant Metadata: Unique ID, age, gender, region, accent, dialect
•Recording Metadata: Transcript, environment, pace, device used, sample rate, bit depth, file format

Use Cases & Applications
•Voice Assistant Activation: Train models to accurately detect and trigger based on wake words
•Smart Home Devices: Enable responsive voice control in smart appliances
Canadian Armed Forces Regular Force Francophone and Anglophone Officers and...
open.canada.ca
csv
Updated Mar 6, 2025
Link copied
Cite
National Defence (2025). Canadian Armed Forces Regular Force Francophone and Anglophone Officers and NCMs [Dataset]. https://open.canada.ca/data/en/dataset/b579bb2a-8799-49d9-9aa4-dd55b8ccecf1
Explore at:
Available download formats: csv
Dataset updated
Mar 6, 2025
Dataset provided by
National Defence
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Time period covered
Apr 1, 1997 - Mar 31, 2024
Area covered
Canada
Description
This dataset represents the number of Anglophone and Francophone Canadian Armed Forces (CAF) Regular Force members by Officers and Non-Commissioned Members from 1997 to 2022. Military Personnel Command (MPC) supports the requirement to release accurate and timely information to Canadians, in line with the principles of Open Government. MPC has made every attempt to ensure the accuracy and reliability of the information provided. However, data contained within this report may also appear in historic, current and future reports of a similar nature where it may be represented differently, and in some cases appear to be in conflict with the current report. MPC assumes no responsibility, or liability, for any errors or omissions in the content of this publication. The Commander of Military Personnel Command (MILPERSCOM) is also appointed as the Chief of Military Personnel (CMP).
Canadian Gallup Poll, May 1961, #288
borealisdata.ca
dataone.org
Updated Jun 23, 2023
Cite
Gallup Canada (2023). Canadian Gallup Poll, May 1961, #288 [Dataset]. http://doi.org/10.5683/SP2/ERNKPC
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.5683/SP2/ERNKPC
Dataset updated
Jun 23, 2023
Dataset provided by
Borealis
Authors
Gallup Canada
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Canada
Description
This Gallup poll seeks the opinions of Canadians. The primary subject of this survey is politics, with the questions focussing on politicians and political parties, as well as other issues of political importance to both Canada and other countries. Respondents were also asked questions so that they could be classified by geographic, demographic and social group. Topics of interest include: Adolf Eichmann's trial in Israel; concentration camps; the Conservative party's majority; federal elections; friendliness towards people from Germany and Japan; mandatory English classes in French speaking provinces; mandatory French classes in English speaking provinces; Kennedy's performance as American President; major problems facing the government; nuclear weapons testing, and the possibility of nuclear war; the Peace Corps; preferred political parties; religion being taught in schools; unemployment; union membership; voting behaviour; and whether Western Canada is more friendly than the rest of Canada. Basic demographic variables are also included.
Official Languages Health Program Call for Proposals 2019-2022:...
data.urbandatacentre.ca
Updated Oct 19, 2025
Cite
(2025). Official Languages Health Program Call for Proposals 2019-2022: Micro-Funding to Improve Access to Health Services for Official Language Minority Communities - Applicant Guide - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-29942ce0-c87c-4ee2-8d02-f9f49ef77033
Explore at:
Dataset updated
Oct 19, 2025
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Area covered
Canada
Description
This call for proposals aims to fund projects that improve access to health services for members of French-speaking communities outside Québec and English-speaking communities in Québec (known as official language minority communities - OLMCs).
Population by language by municipality
open.canada.ca
open.alberta.ca
csv, html, json, xlsx +1
Updated Jul 24, 2024
Cite
Government of Alberta (2024). Population by language by municipality [Dataset]. https://open.canada.ca/data/dataset/4666e9fe-532c-44ae-ba40-ded754218c12
Explore at:
Available download formats: xlsx, csv, html, json, xml
Dataset updated
Jul 24, 2024
Dataset provided by
Government of Alberta
License
Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
License information was derived automatically
Description
Population that speaks an official language (English or French) as the primary language in the home expressed as a percentage of the total population.


