7 datasets found

Main languages spoken at home in Tunisia 2022
statista.com
Updated Apr 25, 2014
Statista (2014). Main languages spoken at home in Tunisia 2022 [Dataset]. https://www.statista.com/statistics/1279956/main-languages-spoken-at-home-in-tunisia/
Dataset updated
Apr 25, 2014
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 21, 2022 - Mar 17, 2022
Area covered
Tunisia
Description
According to a survey conducted in 2021, Tunisian Arabic was the main language spoken in around 93 percent of households in Tunisia. Arabic followed, spoken by roughly 6 percent of Tunisians. The Berber language accounted for only 0.1 percent, according to the survey.
Main languages spoken at home in Tunisia 2022, by area of residence
statista.com
Updated Mar 15, 2023
Statista (2023). Main languages spoken at home in Tunisia 2022, by area of residence [Dataset]. https://www.statista.com/statistics/1279963/main-languages-spoken-at-home-in-tunisia-by-area-of-residence/
Dataset updated
Mar 15, 2023
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Feb 21, 2022 - Mar 17, 2022
Area covered
Tunisia
Description
According to a survey conducted in 2022, Tunisian Arabic was the main language spoken at home in Tunisia. Tunisian Arabic-speaking households were more common in urban areas (around 94 percent) than in rural areas (roughly 91 percent). By contrast, rural areas showed greater linguistic diversity among households, with larger shares of people who spoke Arabic, French, and Berber.
Number of living languages in Tunisia 2021, by type
statista.com
Updated Dec 8, 2021
Statista (2021). Number of living languages in Tunisia 2021, by type [Dataset]. https://www.statista.com/statistics/1280627/number-of-living-languages-in-tunisia-by-status/
Dataset updated
Dec 8, 2021
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2021
Area covered
Tunisia
Description
As of 2021, there were seven living languages in Tunisia. The largest group, three languages, was categorized as developing, meaning that they were in the initial phase of development. In addition, two languages (Arabic and French) were used at institutional levels in the country.
OrienTel French as spoken in Tunisia database
catalogue.elra.info
live.european-language-grid.eu
Updated Feb 22, 2007
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). OrienTel French as spoken in Tunisia database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0188/
Dataset updated
Feb 22, 2007
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
Area covered
French, Tunisia
Description
The OrienTel French as spoken in Tunisia database comprises 576 Tunisian speakers of French (290 males, 286 females) recorded over the Tunisian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.

Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

Each speaker uttered the following items:
• 1 isolated single digit
• 1 sequence of 10 isolated digits
• 5 connected digits: 1 prompt sheet number (6 digits), 1 telephone number (6-15 digits), 1 credit card number (14-16 digits), 1 PIN code (6 digits), 1 spontaneous phone number
• 1 currency money amount
• 2 natural numbers
• 3 dates: 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Western calendar)
• 2 time phrases: 1 time of day (spontaneous), 1 time phrase (word style)
• 3 spelled words: 1 spontaneous (own forename), 1 city name, 1 real word for coverage
• 5 directory assistance utterances: 1 spontaneous, own forename, 1 city of childhood (spontaneous), 1 frequent city name, 1 frequent company name, 1 common forename and surname
• 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
• 6 application keywords/keyphrases
• 1 word spotting phrase using embedded application words
• 4 phonetically rich words
• 9 phonetically rich sentences
• 2+3 spontaneous items (for control)

The following age distribution has been obtained: 2 speakers are below 16, 407 speakers are between 16 and 30, 104 speakers are between 31 and 45, 59 speakers are between 46 and 60, 4 speakers are over 60.

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
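The 8-bit, 8 kHz A-law storage mentioned in the description can be expanded to 16-bit linear PCM with the standard G.711 A-law rule. The sketch below is only an illustration: it assumes the signal files are headerless raw A-law bytes, which the description does not confirm, and the function names are hypothetical rather than part of any OrienTel tooling.

```python
def alaw_byte_to_pcm16(code: int) -> int:
    """Decode one G.711 A-law byte to a signed 16-bit linear sample."""
    code ^= 0x55                    # undo the bit inversion applied by the encoder
    exponent = (code >> 4) & 0x07   # 3-bit segment number
    mantissa = code & 0x0F          # 4-bit step within the segment
    if exponent == 0:
        magnitude = (mantissa << 4) + 8
    else:
        magnitude = ((mantissa << 4) + 0x108) << (exponent - 1)
    # In A-law the sign bit is set for positive samples.
    return magnitude if code & 0x80 else -magnitude


def decode_alaw(data: bytes) -> list[int]:
    """Decode a whole buffer (assumed headerless raw A-law) to PCM samples."""
    return [alaw_byte_to_pcm16(b) for b in data]


def duration_seconds(data: bytes, sample_rate: int = 8000) -> float:
    """One byte per sample, so the duration is byte count / sample rate."""
    return len(data) / sample_rate
```

Under these assumptions, reading a signal file in binary mode and passing its bytes to `decode_alaw` would yield one integer per sample, and `duration_seconds` gives the utterance length at the 8 kHz rate stated in the description.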
TunSwitch: Code-Switched Tunisian Arabic Speech Dataset
zenodo.org
pdf, zip
Updated Jul 11, 2024
Salah Zaiem; Ahmed Amine Ben Abdallah (2024). TunSwitch: Code-Switched Tunisian Arabic Speech Dataset [Dataset]. http://doi.org/10.5281/zenodo.8342762
Available download formats: zip, pdf
Unique identifier
https://doi.org/10.5281/zenodo.8342762
Dataset updated
Jul 11, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Salah Zaiem; Ahmed Amine Ben Abdallah
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Tunisia
Description
We developed a tool for collecting Tunisian dialect data that prompts users to record themselves reading provided phrases. The sentences were sourced from Tunisiya and were consequently removed from the LM training corpus. In total, 89 people participated, yielding 2,631 distinct phrases. This set is called TunSwitch TO, "TO" standing for Tunisian Only, as these sentences contain no non-Tunisian words.

In response to the limited availability of paired text-speech Tunisian datasets with code-switching, we built a corpus through meticulous manual annotation. French and English words are enclosed within "<>" tags wherever they occur, while Tunisian words are left untagged. Although these tags were not used in the proposed models, they make it possible to compute language-usage statistics and may be useful for future approaches to handling code-switching. The resulting set is released as TunSwitch CS, "CS" standing for Code-Switched.
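As a rough illustration of the language-usage statistics that the "<>" tags make possible, the sketch below counts tagged (French or English) and untagged (Tunisian) words in a transcript. It assumes tags are never nested and that a single tag may wrap one or several words; neither detail is specified in the description, and the function names and sample transcripts are invented for the example.

```python
import re
from collections import Counter

# Matches one "<...>" span; assumes tags are never nested.
TAG_SPAN = re.compile(r"<([^<>]*)>")


def code_switch_stats(transcript: str) -> Counter:
    """Count words inside "<>" tags (foreign) and outside them (Tunisian)."""
    tagged_words = []
    for span in TAG_SPAN.findall(transcript):
        tagged_words.extend(span.split())
    # Blank out the tagged spans, then count what is left.
    untagged_words = TAG_SPAN.sub(" ", transcript).split()
    return Counter(foreign=len(tagged_words), tunisian=len(untagged_words))


def foreign_share(transcripts) -> float:
    """Fraction of all words that were tagged as French or English."""
    total = Counter()
    for transcript in transcripts:
        total += code_switch_stats(transcript)
    n_words = total["foreign"] + total["tunisian"]
    return total["foreign"] / n_words if n_words else 0.0


# Made-up transliterated example, not taken from the dataset.
stats = code_switch_stats("nemchi <demain> lel <bureau>")
```

The same counts could be broken down per speaker or per show if that metadata is available alongside the transcripts.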

The TunSwitch CS samples come from a set of radio shows and podcasts covering diverse topics and a large number of unique speakers. The audio is first segmented into chunks, using the WebRTC-VAD algorithm for silence detection so that cuts preserve word integrity. A Pyannote overlap detection model is then used to remove overlapping speech sections. Finally, a music detection model eliminates chunks containing music, which could degrade ASR model accuracy.
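The segmentation step can be pictured as grouping frame-level speech decisions into chunks and cutting only at sufficiently long silences, so that words are not split. The sketch below is not the authors' implementation: it assumes a voice activity detector such as WebRTC-VAD has already produced one boolean per fixed-length frame, and the `frame_ms` and `min_silence_ms` values are hypothetical. Overlap and music filtering are not covered.

```python
def group_speech_frames(is_speech, frame_ms=30, min_silence_ms=300):
    """Return (start_ms, end_ms) chunks from per-frame speech decisions.

    A chunk is closed only after at least `min_silence_ms` of consecutive
    non-speech frames, so short pauses inside an utterance do not split it.
    """
    min_silence_frames = max(1, min_silence_ms // frame_ms)
    chunks = []
    start = None        # index of the first speech frame in the open chunk
    last_speech = None  # index of the most recent speech frame
    for i, speech in enumerate(is_speech):
        if speech:
            if start is None:
                start = i
            last_speech = i
        elif start is not None and i - last_speech >= min_silence_frames:
            chunks.append((start * frame_ms, (last_speech + 1) * frame_ms))
            start = None
    if start is not None:  # close a chunk still open at the end of the audio
        chunks.append((start * frame_ms, (last_speech + 1) * frame_ms))
    return chunks


# Two speech bursts separated by a one-frame pause, then a long silence.
frames = [True, True, False, True, True] + [False] * 10 + [True, True]
chunks = group_speech_frames(frames)
```

With these settings the one-frame pause stays inside the first chunk, while the ten-frame silence triggers a cut.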
OrienTel Tunisia MCA (Modern Colloquial Arabic) database
catalogue.elra.info
live.european-language-grid.eu
Updated Feb 22, 2007
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). OrienTel Tunisia MCA (Modern Colloquial Arabic) database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0186/
Dataset updated
Feb 22, 2007
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
Area covered
Tunisia
Description
The OrienTel Tunisia MCA (Modern Colloquial Arabic) database comprises 792 Tunisian speakers (426 males, 366 females) recorded over the Tunisian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.

Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

Each speaker uttered the following items:
• 1 isolated single digit
• 1 sequence of 10 isolated digits
• 5 connected digits: 1 prompt sheet number (6 digits), 1 telephone number (6-15 digits), 1 credit card number (14-16 digits), 1 PIN code (6 digits), 1 spontaneous phone number
• 2 currency money amounts
• 1 natural number
• 4 dates: 1 spontaneous (date or year of birth), 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Islamic calendar)
• 2 time phrases: 1 time of day (spontaneous), 1 time phrase (word style)
• 3 spelled words: 1 spontaneous (own forename), 1 city name, 1 real word for coverage
• 5 directory assistance utterances: 1 spontaneous, own forename, 1 city of childhood (spontaneous), 1 frequent city name, 1 frequent company name, 1 common forename and surname
• 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
• 6 application keywords/keyphrases
• 1 word spotting phrase using embedded application words
• 4 phonetically rich words
• 9 phonetically rich sentences
• 2+3 spontaneous items (for control)
• 1 free spontaneous speech

The following age distribution has been obtained: 516 speakers are between 16 and 30, 193 speakers are between 31 and 45, 82 speakers are between 46 and 60, 1 speaker is over 60.

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
OrienTel Tunisia MSA (Modern Standard Arabic) database
catalogue.elra.info
live.european-language-grid.eu
Updated Feb 22, 2007
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). OrienTel Tunisia MSA (Modern Standard Arabic) database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0187/
Dataset updated
Feb 22, 2007
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
Area covered
Tunisia
Description
The OrienTel Tunisia MSA (Modern Standard Arabic) database comprises 598 Tunisian speakers (359 males, 239 females) recorded over the Tunisian fixed and mobile telephone network. This database is partitioned into 1 CD and 1 DVD. The speech databases made within the OrienTel project were validated by SPEX, the Netherlands, to assess their compliance with the OrienTel format and content specifications.

Speech samples are stored as sequences of 8-bit 8 kHz A-law. Each prompted utterance is stored in a separate file. Each signal file is accompanied by an ASCII SAM label file which contains the relevant descriptive information.

Each speaker uttered the following items:
• 1 isolated single digit
• 2 sequences of 5 isolated digits
• 7+1 connected digits: 1 prompt sheet number (6 digits), 6 strings of 4 digits in written format, +1 prompt sheet number in digits
• 2 currency money amounts
• 2 natural numbers
• 3 dates: 1 prompted date, 1 relative or general date expression, 1 prompted date phrase (Islamic calendar)
• 1 time phrase
• 2 spelled words: string of 4 letter sequences
• 3 directory assistance utterances: 1 frequent city name, 1 frequent company name, 1 personal name (first name and family name)
• 2 yes/no questions: 1 predominantly "yes" question, 1 predominantly "no" question
• 6 application keywords/keyphrases
• 1 word spotting phrase using embedded application words
• 4 phonetically rich words
• 9 phonetically rich sentences
• 4+1 spontaneous items (for control)

The following age distribution has been obtained: 2 speakers are below 16, 441 speakers are between 16 and 30, 101 speakers are between 31 and 45, 54 speakers are between 46 and 60.

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.