7 datasets found
  1. Main languages spoken at home in Tunisia 2022

    • statista.com
    Updated Apr 25, 2014
    Cite
    Statista (2014). Main languages spoken at home in Tunisia 2022 [Dataset]. https://www.statista.com/statistics/1279956/main-languages-spoken-at-home-in-tunisia/
    Dataset updated
    Apr 25, 2014
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 21, 2022 - Mar 17, 2022
    Area covered
    Tunisia
    Description

    According to a survey conducted in 2021, Tunisian Arabic was the main language spoken in around 93 percent of households in Tunisia. Arabic followed, spoken by roughly 6 percent of Tunisians. Berber accounted for only 0.1 percent, according to the survey.

  2. Main languages spoken at home in Tunisia 2022, by area of residence

    • statista.com
    Updated Mar 15, 2023
    Cite
    Statista (2023). Main languages spoken at home in Tunisia 2022, by area of residence [Dataset]. https://www.statista.com/statistics/1279963/main-languages-spoken-at-home-in-tunisia-by-area-of-residence/
    Dataset updated
    Mar 15, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 21, 2022 - Mar 17, 2022
    Area covered
    Tunisia
    Description

    According to a survey conducted in 2022, Tunisian Arabic was the main language spoken at home in Tunisia. Tunisian Arabic-speaking households were more common in urban areas (around 94 percent) than in rural areas (roughly 91 percent). By contrast, rural areas showed greater linguistic diversity among households, with larger shares of people who spoke Arabic, French, and Berber.

  3. Number of living languages in Tunisia 2021, by type

    • statista.com
    Updated Dec 8, 2021
    Cite
    Statista (2021). Number of living languages in Tunisia 2021, by type [Dataset]. https://www.statista.com/statistics/1280627/number-of-living-languages-in-tunisia-by-status/
    Dataset updated
    Dec 8, 2021
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2021
    Area covered
    Tunisia
    Description

    As of 2021, there were seven living languages in Tunisia. The largest group, three languages, was categorized as developing, meaning that they were in the initial phase of development. In addition, two languages (Arabic and French) were used at institutional levels in the country.

  4. OrienTel French as spoken in Tunisia database

    • catalogue.elra.info
    • live.european-language-grid.eu
    Updated Feb 22, 2007
    Cite
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). OrienTel French as spoken in Tunisia database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0188/
    Dataset updated
    Feb 22, 2007
    Dataset provided by
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
    ELRA (European Language Resources Association)
    License

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    Area covered
    French, Tunisia
    Description

    The OrienTel French as spoken in Tunisia database comprises 576 Tunisian speakers of French (290 males, 286 females) recorded over the Tunisian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.

    Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

    Each speaker uttered the following items:
    • 1 isolated single digit
    • 1 sequence of 10 isolated digits
    • 5 connected digits: 1 prompt sheet number (6 digits), 1 telephone number (6-15 digits), 1 credit card number (14-16 digits), 1 PIN code (6 digits), 1 spontaneous phone number
    • 1 currency money amount
    • 2 natural numbers
    • 3 dates: 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Western calendar)
    • 2 time phrases: 1 time of day (spontaneous), 1 time phrase (word style)
    • 3 spelled words: 1 spontaneous (own forename), 1 city name, 1 real word for coverage
    • 5 directory assistance utterances: 1 spontaneous (own forename), 1 city of childhood (spontaneous), 1 frequent city name, 1 frequent company name, 1 common forename and surname
    • 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
    • 6 application keywords/keyphrases
    • 1 word spotting phrase using embedded application words
    • 4 phonetically rich words
    • 9 phonetically rich sentences
    • 2+3 spontaneous items (for control)

    The following age distribution has been obtained: 2 speakers are below 16, 407 speakers are between 16 and 30, 104 speakers are between 31 and 45, 59 speakers are between 46 and 60, 4 speakers are over 60.

    A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
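
The description states that speech samples are stored as 8-bit 8 kHz A-law. The following is a minimal sketch of how such samples could be expanded to 16-bit linear PCM using the standard ITU-T G.711 A-law expansion. It assumes the signal files are headerless byte streams, which the catalogue entry does not confirm; the function names and the output path argument are illustrative only and are not part of the OrienTel distribution.

```python
import struct
import wave


def alaw_to_pcm16(byte):
    """Expand one G.711 A-law byte to a signed 16-bit linear sample."""
    a = byte ^ 0x55                 # undo the even-bit inversion used by A-law
    mantissa = (a & 0x0F) << 4      # 4-bit mantissa, shifted into position
    exponent = (a & 0x70) >> 4      # 3-bit segment number
    if exponent == 0:
        sample = mantissa + 8
    else:
        sample = (mantissa + 0x108) << (exponent - 1)
    # In A-law a set sign bit marks a positive sample.
    return sample if a & 0x80 else -sample


def decode_alaw(data):
    """Decode a bytes object of A-law samples to a list of 16-bit integers."""
    return [alaw_to_pcm16(b) for b in data]


def write_wav(samples, path, sample_rate=8000):
    """Write mono 16-bit samples to a WAV file (8 kHz assumed, per the description)."""
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(sample_rate)
        out.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

A raw file would be read with `open(path, "rb").read()` and passed to `decode_alaw`; if the files turn out to carry a header, that header would need to be skipped first.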

  5. TunSwitch: Code-Switched Tunisian Arabic Speech Dataset

    • zenodo.org
    pdf, zip
    Updated Jul 11, 2024
    Cite
    Salah Zaiem; Ahmed Amine Ben Abdallah (2024). TunSwitch: Code-Switched Tunisian Arabic Speech Dataset [Dataset]. http://doi.org/10.5281/zenodo.8342762
    Available download formats: zip, pdf
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Salah Zaiem; Ahmed Amine Ben Abdallah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tunisia
    Description

    We developed a tool for collecting Tunisian dialect data, prompting users to record themselves reading provided phrases. We sourced sentences from Tunisiya. These sentences are consequently removed from the LM training corpus. In total, 89 people participated, leading to the collection of 2631 distinct phrases. This set will be called TunSwitch TO, "TO" standing for Tunisian Only, as these sentences do not contain non-Tunisian words.

    In response to the limited availability of paired Text-Speech Tunisian datasets with code-switching, we have built a corpus through meticulous manual annotation. Whenever encountered, French and English words are enclosed within "<>" tags, while Tunisian words are left without any enclosing tags. Although these tags have not been used in the proposed models, they make it possible to compute language-usage statistics and may be useful for further approaches handling code-switching. The resulting set is released as TunSwitch CS, "CS" standing for Code-Switched.
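
As an illustration of the language-usage statistics these tags allow, here is a minimal sketch that counts tagged and untagged words in one transcript. It assumes that each French or English word, or run of words, appears as `<...>` inside the transcript text; the exact file layout of the release is not described here, and the function name and the sample string are invented for the example.

```python
import re

# Assumed tag shape: foreign (French/English) words wrapped as <word> or <several words>.
TAG_RE = re.compile(r"<([^<>]+)>")


def language_usage(transcript):
    """Count words inside <> tags versus untagged (Tunisian) words."""
    foreign = sum(len(span.split()) for span in TAG_RE.findall(transcript))
    tunisian = len(TAG_RE.sub(" ", transcript).split())
    total = foreign + tunisian
    share = foreign / total if total else 0.0
    return {"foreign": foreign, "tunisian": tunisian, "foreign_share": share}


# Placeholder tokens stand in for real transcript words.
stats = language_usage("w1 <bonjour> w2 <the weekend>")
```

Summing these counts over all transcripts would give corpus-level figures; separating French from English is not possible from the tags alone under this assumption.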

    The TunSwitch CS dataset samples come from a set of radio shows and podcasts, representing diverse topics and a large number of unique speakers. The audio is first segmented into chunks, prioritizing word integrity, using the WebRTC-VAD algorithm for silence detection. Afterward, a Pyannote overlap detection model is used to remove overlapping speech sections. Then, a music detection model is employed to eliminate music-containing chunks that could disrupt ASR model accuracy.

  6. OrienTel Tunisia MCA (Modern Colloquial Arabic) database

    • catalogue.elra.info
    • live.european-language-grid.eu
    Updated Feb 22, 2007
    Cite
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). OrienTel Tunisia MCA (Modern Colloquial Arabic) database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0186/
    Dataset updated
    Feb 22, 2007
    Dataset provided by
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
    ELRA (European Language Resources Association)
    License

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    Area covered
    Tunisia
    Description

    The OrienTel Tunisia MCA (Modern Colloquial Arabic) database comprises 792 Tunisian speakers (426 males, 366 females) recorded over the Tunisian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.

    Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

    Each speaker uttered the following items:
    • 1 isolated single digit
    • 1 sequence of 10 isolated digits
    • 5 connected digits: 1 prompt sheet number (6 digits), 1 telephone number (6-15 digits), 1 credit card number (14-16 digits), 1 PIN code (6 digits), 1 spontaneous phone number
    • 2 currency money amounts
    • 1 natural number
    • 4 dates: 1 spontaneous (date or year of birth), 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Islamic calendar)
    • 2 time phrases: 1 time of day (spontaneous), 1 time phrase (word style)
    • 3 spelled words: 1 spontaneous (own forename), 1 city name, 1 real word for coverage
    • 5 directory assistance utterances: 1 spontaneous (own forename), 1 city of childhood (spontaneous), 1 frequent city name, 1 frequent company name, 1 common forename and surname
    • 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
    • 6 application keywords/keyphrases
    • 1 word spotting phrase using embedded application words
    • 4 phonetically rich words
    • 9 phonetically rich sentences
    • 2+3 spontaneous items (for control)
    • 1 free spontaneous speech

    The following age distribution has been obtained: 516 speakers are between 16 and 30, 193 speakers are between 31 and 45, 82 speakers are between 46 and 60, 1 speaker is over 60.

    A pronunciation lexicon with a phonemic transcription in SAMPA is also included.

  7. OrienTel Tunisia MSA (Modern Standard Arabic) database

    • catalogue.elra.info
    • live.european-language-grid.eu
    Updated Feb 22, 2007
    Cite
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). OrienTel Tunisia MSA (Modern Standard Arabic) database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0187/
    Dataset updated
    Feb 22, 2007
    Dataset provided by
    ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
    ELRA (European Language Resources Association)
    License

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf

    https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf

    Area covered
    Tunisia
    Description

    The OrienTel Tunisia MSA (Modern Standard Arabic) database comprises 598 Tunisian speakers (359 males, 239 females) recorded over the Tunisian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.

    Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

    Each speaker uttered the following items:
    • 1 isolated single digit
    • 2 sequences of 5 isolated digits
    • 7+1 connected digits: 1 prompt sheet number (6 digits), 6 strings of 4 digits in written format, +1 prompt sheet number in digits
    • 2 currency money amounts
    • 2 natural numbers
    • 3 dates: 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Islamic calendar)
    • 1 time phrase
    • 2 spelled words: string of 4 letter sequences
    • 3 directory assistance utterances: 1 frequent city name, 1 frequent company name, 1 personal name (first name and family name)
    • 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
    • 6 application keywords/keyphrases
    • 1 word spotting phrase using embedded application words
    • 4 phonetically rich words
    • 9 phonetically rich sentences
    • 4+1 spontaneous items (for control)

    The following age distribution has been obtained: 2 speakers are below 16, 441 speakers are between 16 and 30, 101 speakers are between 31 and 45, 54 speakers are between 46 and 60.

    A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
