16 datasets found

Facebook users in Vietnam 2019-2028
statista.com
ai-chatbox.pro
Updated Jul 10, 2025
Cite
Statista (2025). Facebook users in Vietnam 2019-2028 [Dataset]. https://www.statista.com/forecasts/1136459/facebook-users-in-vietnam
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Vietnam
Description
The number of Facebook users in Vietnam was forecast to increase between 2024 and 2028 by a total of *** million users (+**** percent). This overall increase does not happen continuously, notably not in 2027 and 2028. The Facebook user base is estimated to amount to ***** million users in 2028. Notably, the number of Facebook users was continuously increasing over the past years. User figures, shown here for the Facebook platform, have been estimated by taking into account company filings or press material, secondary research, app downloads and traffic data. They refer to the average monthly active users over the period and count multiple accounts held by one person only once. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of Facebook users in countries like Indonesia and Malaysia.
760 Hours - Vietnamese(Vietnam) Scripted Monologue Smartphone speech dataset...
nexdata.ai
m.nexdata.ai
Updated May 4, 2024
Cite
Nexdata (2024). 760 Hours - Vietnamese(Vietnam) Scripted Monologue Smartphone speech dataset [Dataset]. https://www.nexdata.ai/datasets/speechrecog/1006?source=Huggingface
Dataset updated
May 4, 2024
Dataset authored and provided by
Nexdata
Area covered
Vietnam
Variables measured
Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition, Language(Region) Code, Features of annotation
Description
Vietnamese(Vietnam) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home command and control, in-car command and control, numbers, news and other domains. Transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (1,751 people in total), which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets all comply with GDPR, CCPA, and PIPL.
Vietnamese Call Center Data for Telecom AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Vietnamese-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Vietnamese speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.
•Participant Diversity:
•Speakers: 60 native Vietnamese speakers from our verified contributor pool.
•Regions: Representing multiple provinces across Vietnam to ensure coverage of various accents and dialects.
•Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
•Call Duration: Ranges from 5 to 15 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
•Recording Environment: Captured in clean conditions with no echo or background noise.
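The audio format stated above (stereo, 16-bit, 8 kHz or 16 kHz) can be checked on delivered files with Python's standard `wave` module. The following is a minimal sketch, not the vendor's tooling: the helper names `check_wav` and `make_silent_wav` are hypothetical, and the in-memory silent file exists only so the check can be tried without the dataset.

```python
import io
import wave

EXPECTED_RATES = {8000, 16000}  # sample rates named in the listing


def check_wav(source):
    """Return a list of mismatches against the stated format (empty if none)."""
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 2:
            problems.append(f"expected 2 channels, got {wav.getnchannels()}")
        if wav.getsampwidth() != 2:  # 2 bytes per sample = 16 bits
            problems.append(f"expected 16-bit samples, got {8 * wav.getsampwidth()}-bit")
        if wav.getframerate() not in EXPECTED_RATES:
            problems.append(f"unexpected sample rate {wav.getframerate()} Hz")
    return problems


def make_silent_wav(channels, sample_width, rate, seconds=1):
    """Build a small WAV of zero bytes in memory (hypothetical test helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (channels * sample_width * rate * seconds))
    buffer.seek(0)
    return buffer


print(check_wav(make_silent_wav(2, 2, 16000)))  # matches the stated format
print(check_wav(make_silent_wav(1, 2, 8000)))   # mono, so one mismatch is reported
```

For real files, `check_wav` can be given a file path instead of an in-memory buffer, since `wave.open` accepts either.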

Topic Diversity
This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.
•Inbound Calls:
•Phone Number Porting
•Network Connectivity Issues
•Billing and Payments
•Technical Support
•Service Activation
•International Roaming Enquiry
•Refund Requests and Billing Adjustments
•Emergency Service Access, and others
•Outbound Calls:
•Welcome Calls & Onboarding
•Payment Reminders
•Customer Satisfaction Surveys
•Technical Updates
•Service Usage Reviews
•Network Complaint Status Calls, and more
This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.
Transcription
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, coughs)
•High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.
These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.
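The listing does not say how the quoted word error rate is measured. As a point of reference, the sketch below uses the common definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words.
print(word_error_rate("a b c d", "a b x d"))  # 0.25
```

Under this definition, a rate below 5% means fewer than one word-level error per twenty reference words.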
Metadata
Rich metadata is available for each participant and conversation:
•Participant Metadata: ID, age, gender, accent, dialect, and location.
Social media users in Vietnam 2020-2029
statista.com
ai-chatbox.pro
Updated Jul 11, 2025
Cite
Statista (2025). Social media users in Vietnam 2020-2029 [Dataset]. https://www.statista.com/forecasts/1147065/social-media-users-in-vietnam
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Vietnam
Description
The number of social media users in Vietnam was forecast to continuously increase between 2024 and 2029 by a total of ** million users (+***** percent). After the ninth consecutive year of increase, the social media user base is estimated to reach ***** million users and therefore a new peak in 2029. Notably, the number of social media users was continuously increasing over the past years. The shown figures regarding social media users have been derived from survey data that has been processed to estimate missing demographics. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to *** countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of social media users in countries like Thailand and Malaysia.
1,149 Hours - Vietnamese(Vietnam) Spontaneous Dialogue Smartphone speech...
nexdata.ai
Updated Sep 24, 2023
+ more versions
Cite
Nexdata (2023). 1,149 Hours - Vietnamese(Vietnam) Spontaneous Dialogue Smartphone speech dataset [Dataset]. https://www.nexdata.ai/datasets/1122?source=Huggingface
Dataset updated
Sep 24, 2023
Dataset authored and provided by
Nexdata
Area covered
Vietnam
Variables measured
Device, Format, Country, Speaker, Language, Annotation, Accuracy rate, Recording content, Language(Region) Code, Recording Environment
Description
Vietnamese(Vietnam) Spontaneous Dialogue Smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamp, speaker's ID, gender and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (around 1,400 native speakers), which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets all comply with GDPR, CCPA, and PIPL.
Vietnamese Call Center Data for Realestate AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for Realestate AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Vietnamese-speaking real estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, ideal for building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
Speech Data
The dataset features 30 hours of dual-channel call center recordings between native Vietnamese speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
•Participant Diversity:
•Speakers: 60 native Vietnamese speakers from our verified contributor community.
•Regions: Representing different provinces across Vietnam to ensure accent and dialect variation.
•Participant Profile: Balanced gender mix (60% male, 40% female) and age range from 18 to 70.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted agent-customer discussions.
•Call Duration: Average 5–15 minutes per call.
•Audio Format: Stereo WAV, 16-bit, recorded at 8kHz and 16kHz.
•Recording Environment: Captured in noise-free and echo-free conditions.

Topic Diversity
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
•Inbound Calls:
•Property Inquiries
•Rental Availability
•Renovation Consultation
•Property Features & Amenities
•Investment Property Evaluation
•Ownership History & Legal Info, and more
•Outbound Calls:
•New Listing Notifications
•Post-Purchase Follow-ups
•Property Recommendations
•Value Updates
•Customer Satisfaction Surveys, and others
Such domain-rich variety ensures model generalization across common real estate support conversations.
Transcription
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., background noise, pauses)
•High transcription accuracy with word error rate below 5% via dual-layer human review.
These transcriptions streamline ASR and NLP development for Vietnamese real estate voice applications.
Metadata
Detailed metadata accompanies each participant and conversation:
•Participant Metadata: ID, age, gender, location, accent, and dialect.
•Conversation Metadata: Topic, call type, sentiment, sample rate, and technical details.

This enables smart filtering, dialect-focused model training, and structured dataset exploration.
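As an illustration of the filtering described here, the sketch below selects calls by metadata field. The record layout is hypothetical: the key names (`call_id`, `call_type`, `sentiment`, `sample_rate`, `topic`) only mirror the fields named in the listing, and the actual files may be organized differently.

```python
import json

# Hypothetical metadata records; keys are assumptions modeled on the
# conversation metadata fields the listing names.
SAMPLE = json.loads("""
[
  {"call_id": "c001", "call_type": "inbound",  "sentiment": "positive", "sample_rate": 16000, "topic": "Property Inquiries"},
  {"call_id": "c002", "call_type": "outbound", "sentiment": "neutral",  "sample_rate": 8000,  "topic": "Value Updates"},
  {"call_id": "c003", "call_type": "inbound",  "sentiment": "negative", "sample_rate": 16000, "topic": "Rental Availability"}
]
""")


def filter_calls(records, **criteria):
    """Keep the records whose fields equal every keyword criterion given."""
    return [
        record for record in records
        if all(record.get(key) == value for key, value in criteria.items())
    ]


inbound_16k = filter_calls(SAMPLE, call_type="inbound", sample_rate=16000)
print([record["call_id"] for record in inbound_16k])  # ['c001', 'c003']
```

The same helper would work for participant-level fields such as accent or dialect, provided those are stored as flat keys.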
Usage and Applications
This dataset is ideal for voice AI and NLP systems built for the real estate sector:
Vietnamese Call Center Data for Travel AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for Travel AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/travel-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
Vietnam
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the Travel industry is purpose-built to power the next generation of voice AI applications for travel booking, customer support, and itinerary assistance. With over 30 hours of unscripted, real-world conversations, the dataset enables the development of highly accurate speech recognition and natural language understanding models tailored for Vietnamese-speaking travelers.
Created by FutureBeeAI, this dataset supports researchers, data scientists, and conversational AI teams in building voice technologies for airlines, travel portals, and hospitality platforms.
Speech Data
The dataset includes 30 hours of dual-channel audio recordings between native Vietnamese speakers engaged in real travel-related customer service conversations. These audio files reflect a wide variety of topics, accents, and scenarios found across the travel and tourism industry.
•Participant Diversity:
•Speakers: 60 native Vietnamese contributors from our verified pool.
•Regions: Covering multiple Vietnam provinces to capture accent and dialectal variation.
•Participant Profile: Balanced representation of age (18–70) and gender (60% male, 40% female).
•Recording Details:
•Conversation Nature: Naturally flowing, spontaneous customer-agent calls.
•Call Duration: Between 5 and 15 minutes per session.
•Audio Format: Stereo WAV, 16-bit depth, at 8kHz and 16kHz.
•Recording Environment: Captured in controlled, noise-free, echo-free settings.

Topic Diversity
Inbound and outbound conversations span a wide range of real-world travel support situations with varied outcomes (positive, neutral, negative).
•Inbound Calls:
•Booking Assistance
•Destination Information
•Flight Delays or Cancellations
•Support for Disabled Passengers
•Health and Safety Travel Inquiries
•Lost or Delayed Luggage, and more
•Outbound Calls:
•Promotional Travel Offers
•Customer Feedback Surveys
•Booking Confirmations
•Flight Rescheduling Alerts
•Visa Expiry Notifications, and others
These scenarios help models understand and respond to diverse traveler needs in real-time.
Transcription
Each call is accompanied by manually curated, high-accuracy transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-Stamped Segments
•Non-speech Markers (e.g., pauses, coughs)
•High transcription accuracy: dual-layered transcription review keeps the word error rate under 5%.
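To show how speaker-segmented, time-stamped JSON of this kind might be consumed, the sketch below totals speaking time per speaker. The schema is an assumption: the keys `speaker`, `start`, `end`, and `text`, the use of seconds, and the bracketed non-speech marker are all hypothetical, since the listing does not publish the transcript layout.

```python
import json
from collections import defaultdict

# Hypothetical transcript: one object per segment, times in seconds.
TRANSCRIPT = json.loads("""
[
  {"speaker": "agent",    "start": 0.0, "end": 4.5,  "text": "(agent utterance)"},
  {"speaker": "customer", "start": 4.5, "end": 9.0,  "text": "(customer utterance)"},
  {"speaker": "agent",    "start": 9.0, "end": 12.0, "text": "[pause]"}
]
""")


def talk_time(segments):
    """Sum segment durations per speaker, in seconds."""
    totals = defaultdict(float)
    for segment in segments:
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)


print(talk_time(TRANSCRIPT))  # {'agent': 7.5, 'customer': 4.5}
```

Non-speech segments are counted here like any other; a real pipeline would probably filter them out first, once the actual marker format is known.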
Metadata
Extensive metadata enriches each call and speaker for better filtering and AI training:
•Participant Metadata: ID, age, gender, region, accent, and dialect.
•Conversation Metadata: Topic, domain, call type, sentiment, and audio specs.

Usage and Applications
This dataset is ideal for a variety of AI use cases in the travel and tourism space:
•ASR Systems: Train Vietnamese speech-to-text engines for travel platforms.
Vietnamese Call Center Data for Healthcare AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Vietnamese speech recognition, spoken language understanding, and conversational AI systems. With 30 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
Speech Data
The dataset features 30 hours of dual-channel call center conversations between native Vietnamese speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
•Participant Diversity:
•Speakers: 60 verified native Vietnamese speakers from our contributor community.
•Regions: Diverse provinces across Vietnam to ensure broad dialectal representation.
•Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted conversations.
•Call Duration: Each session ranges from 5 to 15 minutes.
•Audio Format: WAV format, stereo, 16-bit depth at 8kHz and 16kHz sample rates.
•Recording Environment: Captured in clear conditions without background noise or echo.

Topic Diversity
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
•Inbound Calls:
•Appointment Scheduling
•New Patient Registration
•Surgical Consultation
•Dietary Advice and Consultations
•Insurance Coverage Inquiries
•Follow-up Treatment Requests, and more
•Outbound Calls:
•Appointment Reminders
•Preventive Care Campaigns
•Test Results & Lab Reports
•Health Risk Assessment Calls
•Vaccination Updates
•Wellness Subscription Outreach, and more
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Transcription
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
•Transcription Includes:
•Speaker-identified Dialogues
•Time-coded Segments
•Non-speech Annotations (e.g., silence, cough)
•High transcription accuracy with word error rate below 5%, backed by dual-layer QA checks.
Metadata
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
•Participant Metadata: ID, gender, age, region, accent, and dialect.
•Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

Usage and Applications
This dataset can be used across a range of healthcare and voice AI use cases:
Number of smartphone users in Vietnam 2014-2029
statista.com
ai-chatbox.pro
Updated Dec 21, 2023
Cite
Statista Research Department (2023). Number of smartphone users in Vietnam 2014-2029 [Dataset]. https://www.statista.com/topics/9168/smartphone-market-in-vietnam/
Dataset updated
Dec 21, 2023
Dataset provided by
Statista (http://statista.com/)
Authors
Statista Research Department
Area covered
Vietnam
Description
The number of smartphone users in Vietnam was forecast to continuously increase between 2024 and 2029 by a total of 12.9 million users (+15.05 percent). After the fifteenth consecutive year of increase, the smartphone user base is estimated to reach 98.64 million users and therefore a new peak in 2029. Notably, the number of smartphone users was continuously increasing over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The shown figures have been derived from survey data that has been processed to estimate missing demographics. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of smartphone users in countries like Malaysia and Indonesia.
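The three figures quoted in this description can be cross-checked with simple arithmetic. The sketch below derives the implied 2024 base from the 2029 forecast and the stated increase, then recomputes the percentage growth. It assumes the percentage is measured against the 2024 value; because the published inputs are rounded, the result is only approximate.

```python
increase_m = 12.9      # forecast increase 2024-2029, million users (from the listing)
users_2029_m = 98.64   # forecast 2029 user base, million users (from the listing)

# Implied starting point, assuming the increase is counted from 2024.
users_2024_m = users_2029_m - increase_m
growth_pct = increase_m / users_2024_m * 100

print(f"Implied 2024 base: {users_2024_m:.2f} million")  # 85.74 million
print(f"Implied growth: {growth_pct:.2f} percent")       # 15.05 percent
```

The recomputed growth agrees with the +15.05 percent stated in the description, so the three published numbers are mutually consistent.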
711 Hours - Vietnamese(Vietnam) Real-world Casual Conversation and Monologue...
nexdata.ai
Updated Mar 30, 2025
Cite
Nexdata (2025). 711 Hours - Vietnamese(Vietnam) Real-world Casual Conversation and Monologue speech dataset [Dataset]. https://www.nexdata.ai/datasets/speechrecog/1128
Dataset updated
Mar 30, 2025
Dataset authored and provided by
Nexdata
Area covered
Vietnam
Variables measured
Format, Country, Accuracy, Language, Content category, Language(Region) Code, Recording environment, Features of annotation
Description
Vietnamese(Vietnam) Real-world Casual Conversation and Monologue speech dataset, covering self-media, conversation, live and other generic domains, and mirroring real-world interactions. Transcribed with text content, speaker's ID, gender and other attributes. The dataset was collected from a large and geographically diverse pool of speakers, which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets all comply with GDPR, CCPA, and PIPL.
Vietnamese Call Center Data for Delivery & Logistics AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for Delivery & Logistics AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/delivery-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the Delivery and Logistics industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Vietnamese-speaking customers. With over 30 hours of real-world, unscripted call center audio, this dataset captures authentic delivery-related conversations essential for training high-performance ASR models.
Curated by FutureBeeAI, this dataset empowers AI teams, logistics tech providers, and NLP researchers to build accurate, production-ready models for customer support automation in delivery and logistics.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Vietnamese speakers. Captured across various delivery and logistics service scenarios, these conversations cover everything from order tracking to missed delivery resolutions, offering a rich, real-world training base for AI models.
•Participant Diversity:
•Speakers: 60 native Vietnamese speakers from our verified contributor pool.
•Regions: Multiple provinces of Vietnam for accent and dialect diversity.
•Participant Profile: Balanced gender distribution (60% male, 40% female) with ages ranging from 18 to 70.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted customer-agent dialogues.
•Call Duration: 5 to 15 minutes on average.
•Audio Format: Stereo WAV, 16-bit depth, recorded at 8kHz and 16kHz.
•Recording Environment: Captured in clean, noise-free, echo-free conditions.

Topic Diversity
This speech corpus includes both inbound and outbound delivery-related conversations, covering varied outcomes (positive, negative, neutral) to train adaptable voice models.
•Inbound Calls:
•Order Tracking
•Delivery Complaints
•Undeliverable Addresses
•Return Process Enquiries
•Delivery Method Selection
•Order Modifications, and more
•Outbound Calls:
•Delivery Confirmations
•Subscription Offer Calls
•Incorrect Address Follow-ups
•Missed Delivery Notifications
•Delivery Feedback Surveys
•Out-of-Stock Alerts, and others
This comprehensive coverage reflects real-world logistics workflows, helping voice AI systems interpret context and intent with precision.
Transcription
All recordings come with high-quality, human-generated verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, noise)
•High transcription accuracy with word error rate under 5% via dual-layer quality checks.
These transcriptions support fast, reliable model development for Vietnamese voice AI applications in the delivery sector.
Metadata
Detailed metadata is included for each participant and conversation:
•Participant Metadata: ID, age, gender, region, accent, dialect.
•Conversation Metadata: Topic, call type, sentiment, sample rate, and technical attributes.

This metadata aids in training specialized models, filtering demographics, and running advanced analytics.
Usage and Applications
Vietnamese Call Center Data for BFSI AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for BFSI AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/bfsi-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the BFSI (Banking, Financial Services, and Insurance) sector is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Vietnamese-speaking customers. Featuring over 30 hours of real-world, unscripted audio, it offers authentic customer-agent interactions across a range of BFSI services to train robust and domain-aware ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI developers, financial technology teams, and NLP researchers to build high-accuracy, production-ready models across BFSI customer service scenarios.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Vietnamese speakers. Captured in realistic financial support settings, these conversations span diverse BFSI topics from loan enquiries and card disputes to insurance claims and investment options, providing deep contextual coverage for model training and evaluation.
•Participant Diversity:
•Speakers: 60 native Vietnamese speakers from our verified contributor pool.
•Regions: Representing multiple provinces across Vietnam to ensure coverage of various accents and dialects.
•Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
•Call Duration: Ranges from 5 to 15 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
•Recording Environment: Captured in clean conditions with no echo or background noise.

Topic Diversity
This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral, ensuring real-world BFSI voice coverage.
•Inbound Calls:
•Debit Card Block Request
•Transaction Disputes
•Loan Enquiries
•Credit Card Billing Issues
•Account Closure & Claims
•Policy Renewals & Cancellations
•Retirement & Tax Planning
•Investment Risk Queries, and more
•Outbound Calls:
•Loan & Credit Card Offers
•Customer Surveys
•EMI Reminders
•Policy Upgrades
•Insurance Follow-ups
•Investment Opportunity Calls
•Retirement Planning Reviews, and more
This variety ensures models trained on the dataset are equipped to handle complex financial dialogues with contextual accuracy.
Transcription
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, background noise)
•High transcription accuracy with word error rate < 5% due to double-layered quality checks.
These transcriptions are production-ready, making financial domain model training faster and more accurate.
Metadata
Rich metadata is available for each participant and conversation:
•Participant Metadata: ID, age, gender, accent, dialect,
Vietnamese Call Center Data for Retail & E-Commerce AI
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Vietnamese Call Center Data for Retail & E-Commerce AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/retail-call-center-conversation-vietnamese-vietnam
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
This Vietnamese Call Center Speech Dataset for the Retail and E-commerce industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Vietnamese speakers. Featuring over 30 hours of real-world, unscripted audio, it provides authentic human-to-human customer service conversations vital for training robust ASR models.
Curated by FutureBeeAI, this dataset empowers voice AI developers, data scientists, and language model researchers to build high-accuracy, production-ready models across retail-focused use cases.
Speech Data
The dataset contains 30 hours of dual-channel call center recordings between native Vietnamese speakers. Captured in realistic scenarios, these conversations span diverse retail topics from product inquiries to order cancellations, providing a wide context range for model training and testing.
•Participant Diversity:
•Speakers: 60 native Vietnamese speakers from our verified contributor pool.
•Regions: Representing multiple provinces across Vietnam to ensure coverage of various accents and dialects.
•Participant Profile: Balanced gender mix (60% male, 40% female) with age distribution from 18 to 70 years.
•Recording Details:
•Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
•Call Duration: Ranges from 5 to 15 minutes.
•Audio Format: Stereo WAV files, 16-bit depth, at 8kHz and 16kHz sample rates.
•Recording Environment: Captured in clean conditions with no echo or background noise.
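As a quick sanity check on the audio format described above, the following Python sketch reads a WAV header with the standard-library wave module and confirms that a file is stereo, 16-bit, and sampled at 8 kHz or 16 kHz. The names check_wav and make_silent_wav are illustrative assumptions and not part of any FutureBeeAI tooling; make_silent_wav only builds a silent in-memory file so the example runs without access to the dataset.

```python
import io
import wave

# Expected properties, taken from the "Audio Format" line above.
EXPECTED_CHANNELS = 2            # stereo (dual-channel)
EXPECTED_SAMPLE_WIDTH = 2        # 16-bit depth = 2 bytes per sample
EXPECTED_RATES = {8000, 16000}   # 8 kHz and 16 kHz


def check_wav(source):
    """Return (ok, info) for a WAV file path or binary file-like object."""
    with wave.open(source, "rb") as wf:
        info = {
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "sample_rate": wf.getframerate(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }
    ok = (
        info["channels"] == EXPECTED_CHANNELS
        and info["sample_width_bytes"] == EXPECTED_SAMPLE_WIDTH
        and info["sample_rate"] in EXPECTED_RATES
    )
    return ok, info


def make_silent_wav(channels, rate, seconds=1):
    """Hypothetical helper: build a silent 16-bit WAV in memory for testing."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        # One frame = 2 bytes per sample * number of channels.
        wf.writeframes(b"\x00" * (2 * channels * rate * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    ok, info = check_wav(make_silent_wav(channels=2, rate=16000))
    print(ok, info)
```

In practice check_wav would be called with the path of each delivered recording, and any file for which ok is False could be set aside for manual inspection.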

Topic Diversity
This speech corpus includes both inbound and outbound calls with varied conversational outcomes like positive, negative, and neutral, ensuring real-world scenario coverage.
•Inbound Calls:
•Product Inquiries
•Order Cancellations
•Refund & Exchange Requests
•Subscription Queries, and more
•Outbound Calls:
•Order Confirmations
•Upselling & Promotions
•Account Updates
•Loyalty Program Offers
•Customer Verifications, and others
Such variety enhances your model’s ability to generalize across retail-specific voice interactions.
Transcription
All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.
•Transcription Includes:
•Speaker-Segmented Dialogues
•Time-coded Segments
•Non-speech Tags (e.g., pauses, cough)
•High transcription accuracy with word error rate < 5% due to double-layered quality checks.
These transcriptions are production-ready, making model training faster and more accurate.
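For readers who want to verify an accuracy figure like the one quoted above, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The Python sketch below is a minimal illustration of that standard definition; it splits on whitespace only, and it is not a description of how FutureBeeAI measures its own figure.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

A rate below 0.05 on a held-out sample would be consistent with the stated threshold. Whitespace splitting is a simplifying assumption here; a real evaluation would also normalize casing and punctuation before comparing.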
Metadata
Rich metadata is available for each participant and conversation:
•Participant Metadata: ID, age, gender, accent, dialect, and location.
•Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

This granularity supports advanced analytics, dialect filtering, and fine-tuned model evaluation.
Usage and Applications
This dataset is ideal for a range of voice AI and NLP applications:
•Automatic Speech Recognition (ASR): Fine-tune Vietnamese speech-to-text systems.

GrabFood, GrabExpress Restaurant & Food Delivery Transaction Data |...
datarade.ai
.json, .xml, .csv
Updated Oct 13, 2023
Cite
Measurable AI (2023). GrabFood, GrabExpress Restaurant & Food Delivery Transaction Data | E-Receipt Data | South East Asia | Granular & Aggregate Data avail. [Dataset]. https://datarade.ai/data-products/grabfood-grabexpress-restaurant-food-delivery-transaction-measurable-ai
Explore at:
Available download formats: .json, .xml, .csv
Dataset updated
Oct 13, 2023
Dataset authored and provided by
Measurable AI
Area covered
Indonesia, Singapore, Malaysia, Philippines, Japan, Vietnam, Cambodia, Thailand, South East Asia
Description
The Measurable AI GrabFood and GrabExpress Restaurant & Food Delivery Transaction datasets are leading sources of email receipts and transaction data, offering data collected directly from users via Proprietary Consumer Apps, with millions of opt-in users.

We source our email receipt consumer data panel via two consumer apps which garner the express consent of our end-users (GDPR compliant). We then aggregate and anonymize all the transactional data to produce raw and aggregate datasets for our clients.

Use Cases
Our clients leverage our datasets to produce actionable consumer insights such as:
•Market share analysis
•User behavioral traits (e.g. retention rates)
•Average order values
•Promotional strategies used by the key players
Several of our clients also use our datasets for forecasting and understanding industry trends better.

Coverage - SE Asia (Singapore, Indonesia, Thailand, Malaysia, Philippines, Vietnam, Cambodia)

Granular Data
Itemized, high-definition data per transaction level with metrics such as:
•Order value
•Items ordered
•No. of orders per user
•Delivery fee
•Service fee
•Promotions used
•Geolocation data and more

Aggregate Data
•Weekly/monthly order volume
•Revenue delivered in aggregate form, with historical data dating back to 2018.
All the transactional e-receipts are sent from the GrabFood and Grab Express food delivery apps to users’ registered accounts.

Most of our clients are fast-growing Tech Companies, Financial Institutions, Buyside Firms, Market Research Agencies, Consultancies and Academia.

Our dataset is GDPR compliant, contains no PII, and is aggregated and anonymized with user consent. Contact business@measurable.ai for a data dictionary and for our data volume in each country.
East Asian Children Facial Image Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
+ more versions
Cite
FutureBee AI (2022). East Asian Children Facial Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/image-dataset/facial-images-minor-east-asian
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Area covered
East Asia
Dataset funded by
FutureBeeAI
Description
Introduction
Welcome to the East Asian Child Faces Dataset, meticulously curated to enhance face recognition models and support the development of advanced biometric identification systems, child identification models, and other facial recognition technologies.
Facial Image Data
This dataset comprises over 5,000 child image sets, divided into participant-wise sets with each set including:
•Facial Images: 15 different high-quality images per child.

Diversity and Representation
The dataset includes contributions from a diverse network of children across East Asian countries:
•Geographical Representation: Participants from East Asian countries, including China, Japan, Philippines, Malaysia, Singapore, Thailand, Vietnam, Indonesia, and more.
•Demographics: Participants are children under the age of 18, representing both males and females.
•File Format: The dataset contains images in JPEG and HEIC file formats.

Quality and Conditions
To ensure high utility and robustness, all images are captured under varying conditions:
•Lighting Conditions: Images are taken in different lighting environments to ensure variability and realism.
•Backgrounds: A variety of backgrounds are available to enhance model generalization.
•Device Quality: Photos are taken using the latest mobile devices to ensure high resolution and clarity.

Metadata
Each facial image set is accompanied by detailed metadata for each participant, including:
•Participant Identifier
•File Name
•Age
•Gender
•Country
•Demographic Information
•File Format
This metadata is essential for training models that can accurately recognize and identify children's faces across different demographics and conditions.
Usage and Applications
This facial image dataset is ideal for various applications in the field of computer vision, including but not limited to:
•Facial Recognition Models: Improving the accuracy and reliability of facial recognition systems.
•KYC Models: Streamlining the identity verification processes for financial and other services.
•Biometric Identity Systems: Developing robust biometric identification solutions.
•Child Identification Models: Training models to accurately identify children in various scenarios.
•Age Prediction Models: Training models to accurately predict the age of minors based on facial features.
•Generative AI Models: Training generative AI models to create realistic and diverse synthetic facial images.

Secure and Ethical Collection
•Data Security: Data was securely stored and processed within our platform, ensuring data security and confidentiality.
•Ethical Guidelines: The biometric data collection process adhered to strict ethical guidelines, ensuring the privacy and consent of all participants’ guardians.
•Participant Consent: The guardians were informed of the purpose of collection and potential use of the data, as agreed through written consent.

Asia Building Footprint Data | 3M+ Locations in Asia: India Vietnam (...)
datarade.ai
Updated Feb 13, 2025
Cite
InfobelPRO (2025). Asia Building Footprint Data | 3M+ Locations in Asia: India Vietnam (...) [Dataset]. https://datarade.ai/data-products/asia-building-footprint-data-3m-locations-in-asia-india-infobelpro
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Feb 13, 2025
Dataset authored and provided by
InfobelPRO
Area covered
Vietnam, India
Description
Access 3M+ high-precision building footprints across 7 countries, enabling advanced mapping, location analysis, and strategic decision-making. With 30+ years of data expertise, we provide clean, validated, and enriched datasets to power businesses worldwide.

Expand market reach with global-scale, high-precision data.

Enhance mapping, navigation, and spatial analysis.

Optimize site selection, urban planning, and infrastructure development.

Improve logistics, delivery routes, and network optimization.

Assess property values, competitor landscapes, and demographic trends.

Strengthen disaster management and risk assessment with reliable insights.

Leverage AI-driven enrichment for deeper, data-driven decision-making.

Our use cases demonstrate how our data has been beneficial and helped our customers in several key areas:

Gain a Competitive Edge with Smarter Mapping: Use building footprint data to analyse competitors, identify high-traffic areas, and optimize locations for maximum market impact.

Enhance Navigation & Last-Mile Efficiency: Improve customer experiences with precise building entrances, parking areas, and optimized routes for seamless navigation and delivery.

Find the Perfect Site for Growth: Leverage building footprint data to select prime locations, maximize foot traffic, and drive higher sales.

Optimize Energy & Infrastructure Planning: Assess rooftop solar potential, utility networks, and energy distribution for smarter, more efficient urban development.

Improve Risk Assessment & Security: Use precise building data for insurance underwriting, security planning, and crime prevention strategies.
Facebook users in Vietnam 2019-2028

10 scholarly articles cite this dataset (View in Google Scholar)
