17 datasets found
  1. 769 Hours - French Speech Data by Mobile Phone

    • m.nexdata.ai
    Updated Oct 22, 2023
    Cite
    Nexdata (2023). 769 Hours - French Speech Data by Mobile Phone [Dataset]. https://m.nexdata.ai/datasets/speechrecog/952
    Dataset updated
    Oct 22, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    French
    Variables measured
    Device, Format, Country, Speaker, Language, Accuracy rate, Content category, Recording device, Recording condition, Language(Region) Code, and 1 more
    Description

    French (France) scripted monologue smartphone speech dataset, collected from monologues based on given prompts and covering the general and human-machine interaction categories. Transcribed with text content. The dataset was collected from a large, geographically diverse pool of speakers (1623 native speakers), which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage; our datasets all comply with GDPR, CCPA, and PIPL.

  2. 80 Hours - French(Canada) Spontaneous Dialogue Smartphone speech dataset

    • m.nexdata.ai
    • nexdata.ai
    Updated Jun 26, 2025
    Cite
    Nexdata (2025). 80 Hours - French(Canada) Spontaneous Dialogue Smartphone speech dataset [Dataset]. https://m.nexdata.ai/datasets/speechrecog/1302?source=Kaggle
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    Nexdata
    Area covered
    Canada, French
    Variables measured
    Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition, Language(Region) Code, Features of annotation
    Description

    French (Canada) spontaneous dialogue smartphone speech dataset, collected from dialogues based on given topics and covering 20+ domains. Transcribed with text content, speaker ID, gender, age, and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (126 native speakers), which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage; our datasets all comply with GDPR, CCPA, and PIPL.

  3. Sites mobiles 5G

    • kaggle.com
    Updated Mar 16, 2021
    Cite
    Mathurin Aché (2021). Sites mobiles 5G [Dataset]. https://www.kaggle.com/mathurinache/sites-mobiles-5g/discussion
    Available as Croissant, a format for machine-learning datasets (see mlcommons.org/croissant).
    Dataset updated
    Mar 16, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mathurin Aché
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset results from a merge ("fusion-hybridation") of the Arcep and ANFR datasets on 5G sites, since each of these two organizations presents partial information and they do not synchronize their publications. https://www.data.gouv.fr/fr/datasets/fichier-complet-des-sites-mobiles-5g/

  4. Mobile Phone Repair Shops in France - 4,598 Available (Free Sample)

    • poidata.io
    csv
    Updated Jun 4, 2025
    Cite
    Poidata.io (2025). Mobile Phone Repair Shops in France - 4,598 Available (Free Sample) [Dataset]. https://www.poidata.io/report/mobile-phone-repair-shop/france
    Available download formats: csv
    Dataset updated
    Jun 4, 2025
    Dataset provided by
    Poidata.io
    Area covered
    France
    Description

    This dataset provides information on 4,598 mobile phone repair shops in France as of June 2025. It includes details such as email addresses (where publicly available), phone numbers (where publicly available), and geocoded addresses. Explore market trends, identify potential business partners, and gain valuable insights into the industry. Download a complimentary sample of 10 records to see what's included.

  5. French Newspaper, Magazine, and Books OCR Image Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). French Newspaper, Magazine, and Books OCR Image Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/french-newspaper-book-magazine-ocr-image-dataset
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    French
    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the French Newspaper, Books, and Magazine Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the French language.

    Dataset Content & Diversity:

    Containing a total of 5000 images, this French OCR dataset offers an equal distribution across newspapers, books, and magazines. It includes a diverse collection of content, such as articles, advertisements, cover pages, headlines, callouts, and author sections from a variety of newspapers, books, and magazines. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.

    To ensure the diversity of the dataset and to build a robust text recognition model, we allow only a limited number (fewer than five) of unique images from a single source. Stringent measures have been taken to exclude any personally identifiable information (PII), and in each image at least 80% of the space contains visible French text.

    Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, further enhancing dataset diversity. The collection features images in portrait and landscape modes.

    All these images were captured by native French people to ensure text quality and to avoid toxic content and PII text. We used the latest iOS and Android mobile devices with cameras above 5 MP to capture all these images and maintain image quality. In this training dataset, images are available in both JPEG and HEIC formats.

    Metadata:

    Along with the image data you will also receive detailed structured metadata in CSV format. For each image it includes metadata like device information, source type like newspaper, magazine or book image, and image type like portrait or landscape etc. Each image is properly renamed corresponding to the metadata.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of French text recognition models.

    Update & Custom Collection:

    We're committed to expanding this dataset by continuously adding more images with the assistance of our native French crowd community.

    If you require a custom dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.

    Furthermore, using our crowd community, we can annotate or label the images with bounding boxes or transcribe the text in the images to match your specific requirements.

    License:

    This Image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage the power of this image dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the French language. Your journey to enhanced language understanding and processing starts here.

  6. Veraset Movement | Europe | GPS Mobile Location Data | Reliable, Compliant,...

    • datarade.ai
    .csv
    Updated May 31, 2022
    Cite
    Veraset (2022). Veraset Movement | Europe | GPS Mobile Location Data | Reliable, Compliant, Precise Location Data [Dataset]. https://datarade.ai/data-products/veraset-movement-europe-gps-mobile-location-data-reli-veraset
    Available download formats: .csv
    Dataset updated
    May 31, 2022
    Dataset authored and provided by
    Veraset
    Area covered
    Germany, Luxembourg, Hungary, Lithuania, Bulgaria, Estonia, Spain, Finland, Belgium, Italy
    Description

    Leverage the most reliable and compliant mobile device location/foot traffic dataset on the market.

    Veraset Movement (Mobile Location Data) offers unparalleled insights into footfall traffic patterns across dozens of European countries.

    Covering 45+ European countries, Veraset's Mobile Location Data draws on raw GPS data from tier-1 apps, SDKs, and aggregators of mobile devices to provide customers with accurate, up-to-the-minute information on human movement. Ideal for ad tech, planning, retail, and transportation logistics, Veraset's Movement data helps shape strategy and make impactful data-driven decisions.

    Veraset’s European Movement Panel includes the following countries: United Kingdom (GB), Germany (DE), France (FR), Spain (ES), Italy (IT), The Netherlands (NL), Switzerland (CH), Belgium (BE), Sweden (SE), Austria (AT), Denmark (DK), Finland (FI), Cyprus (CY), Poland (PL), Ireland (IE), Portugal (PT), Romania (RO), Hungary (HU), Czech Republic (CZ), Greece (GR), Bulgaria (BG), Lithuania (LT), Croatia (HR), Norway (NO), Latvia (LV), Luxembourg (LU), Slovakia (SK), Estonia (EE), Cayman Islands (KY), Slovenia (SI), Vatican City (VA), Turks and Caicos Islands (TC), Bermuda (BM), Malta (MT), Iceland (IS), Liechtenstein (LI), Monaco (MC), British Virgin Islands (VG), Anguilla (AI), Andorra (AD), Greenland (GL), San Marino (SM), Federated States of Micronesia (FM), Montserrat (MS), Pitcairn Islands (PN).

    Common Use Cases of Veraset's Mobile Location Data:
    • Advertising
    • Ad Placement, Attribution, and Segmentation
    • Audience Creation/Building
    • Dynamic Ad Targeting
    • Infrastructure Plans
    • Route Optimization
    • Public Transit Optimization
    • Credit Card Loyalty
    • Competitive Analysis
    • Risk Assessment, Underwriting, and Policy Personalization
    • Enrichment of Existing Datasets
    • Trade Area Analysis
    • Predictive Analytics and Trend Forecasting

  7. Cell Phone Accessory Stores in France - 3,931 Available (Free Sample)

    • poidata.io
    csv
    Updated May 30, 2025
    Cite
    Poidata.io (2025). Cell Phone Accessory Stores in France - 3,931 Available (Free Sample) [Dataset]. https://www.poidata.io/report/cell-phone-accessory-store/france
    Available download formats: csv
    Dataset updated
    May 30, 2025
    Dataset provided by
    Poidata.io
    Area covered
    France
    Description

    This dataset provides information on 3,931 cell phone accessory stores in France as of May 2025. It includes details such as email addresses (where publicly available), phone numbers (where publicly available), and geocoded addresses. Explore market trends, identify potential business partners, and gain valuable insights into the industry. Download a complimentary sample of 10 records to see what's included.

  8. Observatoire des ondes - Relais mobiles 5G activés - Tours Métropole Val de...

    • toursmetropole.opendatasoft.com
    • data.tours-metropole.fr
    csv, excel, geojson +1
    Updated Feb 5, 2024
    Cite
    (2024). Observatoire des ondes - Relais mobiles 5G activés - Tours Métropole Val de Loire [Dataset]. https://toursmetropole.opendatasoft.com/explore/dataset/sites-mobiles-5g-france/api/
    Available download formats: excel, json, csv, geojson
    Dataset updated
    Feb 5, 2024
    License

    Licence Ouverte / Open Licence 2.0, https://www.etalab.gouv.fr/wp-content/uploads/2018/11/open-licence.pdf
    License information was derived automatically

    Area covered
    Tours, Centre-Val de Loire
    Description

    This dataset lists and locates the latest version of the 5G mobile antennas of the various operators and the frequency bands available at these sites in metropolitan France. Enrichment: addition of administrative hierarchies.

  9. Data from: Common Phone: A Multilingual Dataset for Robust Acoustic...

    • zenodo.org
    • explore.openaire.eu
    application/gzip
    Updated Jul 17, 2024
    Cite
    Philipp Klumpp; Tomás Arias-Vergara; Paula Andrea Pérez-Toro; Elmar Nöth; Juan Rafael Orozco-Arroyave (2024). Common Phone: A Multilingual Dataset for Robust Acoustic Modelling [Dataset]. http://doi.org/10.5281/zenodo.5846137
    Available download formats: application/gzip
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Philipp Klumpp; Tomás Arias-Vergara; Paula Andrea Pérez-Toro; Elmar Nöth; Juan Rafael Orozco-Arroyave
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Release Date: 17.01.22

    Welcome to Common Phone 1.0

    Legal Information

    Common Phone is a subset of the Common Voice corpus collected by Mozilla Corporation. By using Common Phone, you agree to the Common Voice Legal Terms. Common Phone is maintained and distributed by speech researchers at the Pattern Recognition Lab of Friedrich-Alexander-University Erlangen-Nuremberg (FAU) under the CC0 license.

    Like for Common Voice, you must not make any attempt to identify speakers that contributed to Common Phone.

    About Common Phone

    This corpus aims to provide a basis for Machine Learning (ML) researchers and enthusiasts to train and test their models against a wide variety of speakers, hardware/software ecosystems and acoustic conditions to improve generalization and availability of ML in real-world speech applications.
    The current version of Common Phone comprises 116.5 hours of speech samples, collected from 11,246 speakers in 6 languages:

    Language   Speakers (train / dev / test)   Hours (train / dev / test)
    English    4716 / 771 / 774                14.1 / 2.3 / 2.3
    French     796 / 138 / 135                 13.6 / 2.3 / 2.2
    German     1176 / 202 / 206                14.5 / 2.5 / 2.6
    Italian    1031 / 176 / 178                14.6 / 2.5 / 2.5
    Spanish    508 / 88 / 91                   16.5 / 3.0 / 3.1
    Russian    190 / 34 / 36                   12.7 / 2.6 / 2.8
    Total      8417 / 1409 / 1420              85.8 / 15.2 / 15.5
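
    As a quick consistency check, the per-language figures in the table above can be summed and compared with the stated totals. This is only a minimal sketch: the variable and function names are illustrative, and the numbers are copied from the table rather than read from the dataset itself.

```python
# Per-language (train, dev, test) figures copied from the table above.
speakers = {
    "English": (4716, 771, 774),
    "French": (796, 138, 135),
    "German": (1176, 202, 206),
    "Italian": (1031, 176, 178),
    "Spanish": (508, 88, 91),
    "Russian": (190, 34, 36),
}
hours = {
    "English": (14.1, 2.3, 2.3),
    "French": (13.6, 2.3, 2.2),
    "German": (14.5, 2.5, 2.6),
    "Italian": (14.6, 2.5, 2.5),
    "Spanish": (16.5, 3.0, 3.1),
    "Russian": (12.7, 2.6, 2.8),
}


def column_totals(table):
    """Sum the train, dev and test columns across all languages."""
    return tuple(round(sum(column), 1) for column in zip(*table.values()))


speaker_totals = column_totals(speakers)
hour_totals = column_totals(hours)
print("speakers:", speaker_totals, "overall:", sum(speaker_totals))
print("hours:", hour_totals, "overall:", round(sum(hour_totals), 1))
```

    The speaker columns reproduce the stated totals (8417 / 1409 / 1420, 11,246 overall). The rounded per-language train hours add up to 86.0 rather than the stated 85.8, presumably because the published totals were computed from unrounded durations.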

    Presented train, dev and test splits are not identical to those shipped with Common Voice. Speaker separation among splits was realized by only using those speakers that had provided age and gender information. This information can only be provided as a registered user on the website. When logged in, the session ID of contributed recordings is always linked to your user, thus we could easily link recordings to individual speakers. Keep in mind this would not be possible for unregistered users, as their session ID changes if they decide to contribute more than once.
    During speaker selection, we considered that some speakers had contributed to more than one of the six Common Voice datasets (one for each language). In Common Phone, a speaker will only appear in one language.
    The dataset is structured as follows:

    • Six top-level directories, one for each language.
    • Each language folder contains:
      • [train|dev|test].csv files listing audio files, respective speaker ID and plain text transcript.
      • meta.csv provides speaker information: age group, gender, language, accent (if available) and which of the three splits this speaker was assigned to. File names match corresponding audio file names except their extension.
      • /grids/ contains phonetic transcription for every audio file in Praat TextGrid format.
      • /mp3/ contains audio files in mp3, identical to those of Common Voice, e.g., sampling rates have been preserved and may vary for different files.
      • /wav/ contains raw audio files in 16 bits/sample, 16 kHz single channel. They were created from the original MP3 audio files. We provide them for convenience; keep in mind that their source has undergone MP3 compression.
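
    The layout described above could be checked after download with a short script. The sketch below assumes nothing beyond the file and folder names listed in the description; the function names are hypothetical, and the names of the six language folders are not given here, so the caller has to supply the path to one of them.

```python
import wave
from pathlib import Path

SPLITS = ("train", "dev", "test")


def expected_entries(language_dir):
    """Return the files and folders the description lists for one language folder."""
    root = Path(language_dir)
    files = [root / f"{split}.csv" for split in SPLITS] + [root / "meta.csv"]
    folders = [root / name for name in ("grids", "mp3", "wav")]
    return files + folders


def missing_entries(language_dir):
    """List the expected entries that are absent on disk."""
    return [path for path in expected_entries(language_dir) if not path.exists()]


def wav_matches_description(path):
    """Check a WAV file against the stated format: 16 bits/sample, 16 kHz, single channel."""
    with wave.open(str(path), "rb") as wav_file:
        return (
            wav_file.getsampwidth() == 2  # 2 bytes = 16 bits per sample
            and wav_file.getframerate() == 16000
            and wav_file.getnchannels() == 1
        )
```

    Calling missing_entries on a language folder returns an empty list when all seven entries are present, and wav_matches_description can be run over the files in /wav/ to confirm the stated audio format.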

    Where does the phonetic annotation come from?

    Phonetic annotation was computed via BAS Web Services. We used the regular Pipeline (G2P-MAUS) without ASR to create an alignment of text transcripts with audio signals. We chose International Phonetic Alphabet (IPA) output symbols as they work well even in a multi-lingual setup. Common Phone annotation comprises 101 phonetic symbols, including silence.

    Why Common Phone?

    • Large number of speakers and varying acoustic conditions to improve robustness of ML models
    • Time-aligned IPA phonetic transcription for every audio sample
    • Gender-balanced and age-group-matched (equal number of female/male speakers in every age group)
    • Support for six different languages to leverage multi-lingual approaches
    • Original MP3 files plus standard WAVE files

    Is there any publication available?

    Yes, a paper describing Common Phone in detail is currently under revision for LREC 2022. You can access a pre-print version on arXiv entitled “Common Phone: A Multilingual Dataset for Robust Acoustic Modelling”.

  10. French Product Image OCR Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). French Product Image OCR Dataset [Dataset]. https://www.futurebeeai.com/dataset/ocr-dataset/french-product-image-ocr-dataset
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    French
    Dataset funded by
    FutureBeeAI
    Description

    What’s Included

    Introducing the French Product Image Dataset - a diverse and comprehensive collection of images meticulously curated to propel the advancement of text recognition and optical character recognition (OCR) models designed specifically for the French language.

    Dataset Content & Diversity:

    Containing a total of 2000 images, this French OCR dataset offers diverse distribution across different types of front images of Products. In this dataset, you'll find a variety of text that includes product names, taglines, logos, company names, addresses, product content, etc. Images in this dataset showcase distinct fonts, writing formats, colors, designs, and layouts.

    To ensure the diversity of the dataset and to build a robust text recognition model we allow limited (less than five) unique images from a single resource. Stringent measures have been taken to exclude any personally identifiable information (PII) and to ensure that in each image a minimum of 80% of space contains visible French text.

    Images have been captured under varying lighting conditions – both day and night – along with different capture angles and backgrounds, to build a balanced OCR dataset. The collection features images in portrait and landscape modes.

    All these images were captured by native French people to ensure text quality and to avoid toxic content and PII text. We used the latest iOS and Android mobile devices with cameras above 5 MP to capture all these images and maintain image quality. In this training dataset, images are available in both JPEG and HEIC formats.

    Metadata:

    Along with the image data, you will also receive detailed structured metadata in CSV format. For each image, it includes metadata like image orientation, country, language, and device information. Each image is properly renamed corresponding to the metadata.

    The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of French text recognition models.

    Update & Custom Collection:

    We're committed to expanding this dataset by continuously adding more images with the assistance of our native French crowd community.

    If you require a custom product image OCR dataset tailored to your guidelines or specific device distribution, feel free to contact us. We're equipped to curate specialized data to meet your unique needs.

    Furthermore, using our crowd community, we can annotate or label the images with bounding boxes or transcribe the text in the images to match your specific project requirements.

    License:

    This Image dataset, created by FutureBeeAI, is now available for commercial use.

    Conclusion:

    Leverage the power of this product image OCR dataset to elevate the training and performance of text recognition, text detection, and optical character recognition models within the realm of the French language. Your journey to enhanced language understanding and processing starts here.

  11. GPS Raw data France

    • ckan.mobidatalab.eu
    csv
    Updated Nov 16, 2023
    Cite
    Singlespot (2023). GPS Raw data France [Dataset]. https://ckan.mobidatalab.eu/dataset/gps_raw_data_france
    Available download formats: csv
    Dataset updated
    Nov 16, 2023
    Dataset provided by
    Singlespot
    Area covered
    France
    Description

    This dataset contains GPS tracks of mobile phone users.

  12. 520 Hours - French Speaking English Speech Data by Mobile Phone

    • m.nexdata.ai
    • nexdata.ai
    Updated May 5, 2025
    Cite
    Nexdata (2025). 520 Hours - French Speaking English Speech Data by Mobile Phone [Dataset]. https://m.nexdata.ai/datasets/speechrecog/989?source=Kaggle
    Dataset updated
    May 5, 2025
    Dataset authored and provided by
    Nexdata
    Area covered
    French
    Variables measured
    Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition, Features of annotation
    Description

    English (France) scripted monologue smartphone speech dataset, collected from monologues based on given scripts and covering the generic domain, human-machine interaction, smart home commands, in-car commands, numbers, and other domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (1,089 people in total), which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage; our datasets all comply with GDPR, CCPA, and PIPL.

  13. Sites mobiles 5G - France

    • data.82amenagement.fr
    • data.smartidf.services
    • +2more
    csv, excel, geojson +1
    Updated Apr 27, 2025
    Cite
    (2025). Sites mobiles 5G - France [Dataset]. https://data.82amenagement.fr/explore/dataset/sites-mobiles-5g-france/
    Available download formats: geojson, csv, excel, json
    Dataset updated
    Apr 27, 2025
    License

    Licence Ouverte / Open Licence 2.0, https://www.etalab.gouv.fr/wp-content/uploads/2018/11/open-licence.pdf
    License information was derived automatically

    Area covered
    France
    Description

    This dataset lists and locates the latest version of the 5G mobile antennas of the various operators and the frequency bands available at these sites in metropolitan France. Enrichment: addition of administrative hierarchies.

  14. Mobile Phones Market Size Volume in France, 2023

    • reportlinker.com
    Updated Apr 5, 2024
    Cite
    ReportLinker (2024). Mobile Phones Market Size Volume in France, 2023 [Dataset]. https://www.reportlinker.com/dataset/1203bca66faa59ac78eadf8a3f1059e22b9b4738
    Dataset updated
    Apr 5, 2024
    Dataset authored and provided by
    ReportLinker
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    France
    Description

    Mobile Phones Market Size Volume in France, 2023. Discover more data with ReportLinker!

  15. Sites mobiles 2G, 3G, 4G - France

    • data.smartidf.services
    • ods.backoffice.smartidf.services
    • +2more
    csv, excel, geojson +1
    Updated Mar 27, 2025
    Cite
    (2025). Sites mobiles 2G, 3G, 4G - France [Dataset]. https://data.smartidf.services/explore/dataset/buildingref-france-arcep-mobile-site-2g3g4g/
    Available download formats: csv, json, excel, geojson
    Dataset updated
    Mar 27, 2025
    License

    Licence Ouverte / Open Licence 2.0, https://www.etalab.gouv.fr/wp-content/uploads/2018/11/open-licence.pdf
    License information was derived automatically

    Area covered
    France
    Description

    This dataset lists and locates the latest version of the mobile antennas of the four operators and the availability of the different technologies (2G, 3G, 4G) at these sites in metropolitan France and the overseas territories. Note: errors have been detected in the site identifiers (IDs of the form E+XX). Enrichment: addition of administrative hierarchies; attachment of overseas sites to the administrative hierarchy; addition of the operator name for overseas sites.

  16. French Call Center Data for Telecom AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). French Call Center Data for Telecom AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-french-france
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    French
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This French Call Center Speech Dataset for the Telecom industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for French-speaking telecom customers. Featuring over 30 hours of real-world, unscripted audio, it delivers authentic customer-agent interactions across key telecom support scenarios to help train robust ASR models.

    Curated by FutureBeeAI, this dataset empowers voice AI engineers, telecom automation teams, and NLP researchers to build high-accuracy, production-ready models for telecom-specific use cases.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native French speakers. Captured in realistic customer support settings, these conversations span a wide range of telecom topics from network complaints to billing issues, offering a strong foundation for training and evaluating telecom voice AI solutions.

    Participant Diversity:
    • Speakers: 60 native French speakers from our verified contributor pool.
    • Regions: Representing multiple provinces across France to ensure coverage of various accents and dialects.
    • Participant Profile: Gender mix of 60% male and 40% female, with ages ranging from 18 to 70 years.
    Recording Details:
    • Conversation Nature: Naturally flowing, unscripted interactions between agents and customers.
    • Call Duration: Ranges from 5 to 15 minutes.
    • Audio Format: Stereo WAV files, 16-bit depth, at 8 kHz and 16 kHz sample rates.
    • Recording Environment: Captured in clean conditions with no echo or background noise.

    Topic Diversity

    This speech corpus includes both inbound and outbound calls with varied conversational outcomes (positive, negative, and neutral), ensuring broad scenario coverage for telecom AI development.

    Inbound Calls:
    • Phone Number Porting
    • Network Connectivity Issues
    • Billing and Payments
    • Technical Support
    • Service Activation
    • International Roaming Enquiry
    • Refund Requests and Billing Adjustments
    • Emergency Service Access, and others
    Outbound Calls:
    • Welcome Calls & Onboarding
    • Payment Reminders
    • Customer Satisfaction Surveys
    • Technical Updates
    • Service Usage Reviews
    • Network Complaint Status Calls, and more

    This variety helps train telecom-specific models to manage real-world customer interactions and understand context-specific voice patterns.

    Transcription

    All audio files are accompanied by manually curated, time-coded verbatim transcriptions in JSON format.

    Transcription Includes:
    • Speaker-Segmented Dialogues
    • Time-coded Segments
    • Non-speech Tags (e.g., pauses, coughs)
    • High transcription accuracy with word error rate < 5% thanks to dual-layered quality checks.
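
    For readers unfamiliar with the metric, word error rate (WER) is conventionally defined as the number of word substitutions, deletions and insertions needed to turn a hypothesis into the reference, divided by the number of reference words. The sketch below implements that textbook definition with a word-level edit distance. The description does not say how the provider computes its figure, so the function name and the whitespace tokenisation are assumptions made for illustration.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

    Under this definition, a WER below 5% means fewer than one word-level error per 20 reference words on average.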

    These transcriptions are production-ready, allowing for faster development of ASR and conversational AI systems in the Telecom domain.

    Metadata

    Rich metadata is available for each participant and conversation:

    Participant Metadata: ID, age, gender, accent, dialect, and location.

  17. 231.9 Hours - French Scripted Monologue Smartphone speech dataset

    • m.nexdata.ai
    Updated Apr 12, 2024
    Cite
    Nexdata (2024). 231.9 Hours - French Scripted Monologue Smartphone speech dataset [Dataset]. https://m.nexdata.ai/datasets/speechrecog/114
    Dataset updated
    Apr 12, 2024
    Dataset provided by
    nexdata technology inc
    Authors
    Nexdata
    Area covered
    French
    Variables measured
    Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition, Features of annotation
    Description

    French scripted monologue smartphone speech dataset, collected from monologues based on given scripts and covering the economy, entertainment, news, informal language, numbers, and alphabet domains. Transcribed with text content and other attributes. The dataset was collected from a large, geographically diverse pool of speakers (406 speakers from France, Canada, Africa, and elsewhere), which is intended to improve model performance on real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, protecting user privacy and legal rights throughout data collection, storage, and usage; our datasets all comply with GDPR, CCPA, and PIPL.
