In 2021, 52.6 million people in England and Wales had English as a main language, approximately 91.1 percent of the population. Although the number of English speakers has grown since 2011, when there were 49.8 million, their share of the population has declined by 1.2 percentage points.
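The arithmetic behind these figures can be sketched as follows. This is a rough illustration only: it reads the 1.2 decline as percentage points (an assumption), and the helper name `implied_base` is hypothetical rather than part of any published method.

```python
def implied_base(count_millions: float, share_percent: float) -> float:
    """Population base (in millions) implied by a count and its share."""
    return count_millions / (share_percent / 100)

speakers_2021 = 52.6   # millions, from the text
speakers_2011 = 49.8   # millions, from the text
share_2021 = 91.1      # percent, from the text
# Assumption: the stated decline of 1.2 is in percentage points.
share_2011 = share_2021 + 1.2

base_2021 = implied_base(speakers_2021, share_2021)
base_2011 = implied_base(speakers_2011, share_2011)

# Relative growth in the number of speakers between the two censuses.
growth_percent = (speakers_2021 - speakers_2011) / speakers_2011 * 100

print(f"Implied base 2021: {base_2021:.1f} million")
print(f"Implied base 2011: {base_2011:.1f} million")
print(f"Growth in speakers: {growth_percent:.1f} percent")
```

The count grows while the share falls because the implied population base grows faster than the number of speakers.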
In 2021, there were 611,845 people who spoke Polish as a main language in England and Wales, making it the most common non-English language among the population. It was followed by Romanian and Panjabi, which had 471,945 and 290,745 speakers respectively.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Dataset population: Persons aged 3 and over
Main language (detailed)
The language that is a person's first or preferred language.
This information helps central government, local authorities and the NHS to allocate resources and provide services for non-English speakers, e.g. English teaching and translation services. It is a better indicator than country of birth, which was previously used to forecast the additional cost of providing services to people whose first language is not English.
The data are also used to assess the impact of English or Welsh language ability on employment and other social inclusion indicators.
Information on the number of British Sign Language users helps with service planning and assists in developing policies to address the needs of the deaf community.
These statistics are used by public service providers to effectively target the delivery of their services, for example in the provision of translation and interpretation services, the availability of English language lessons, and the distribution of official information leaflets in alternative languages.
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
The UK English Speecon database is divided into 2 sets:
1) The first set comprises the recordings of 606 adult UK English speakers (325 males, 281 females), recorded over 4 microphone channels in 4 recording environments (office, entertainment, car, public place), and consisting of about 195 hours of audio data.
2) The second set comprises the recordings of 51 child UK English speakers (14 boys, 37 girls), recorded over 4 microphone channels in 1 recording environment (children room), and consisting of about 9 hours of audio data.
This database is partitioned into 31 DVDs (first set) and 4 DVDs (second set).
The speech databases made within the Speecon project were validated by SPEX, the Netherlands, to assess their compliance with the Speecon format and content specifications.
Each of the four speech channels is recorded at 16 kHz, 16 bit, uncompressed unsigned integers in Intel format (lo-hi byte order). To each signal file corresponds an ASCII SAM label file which contains the relevant descriptive information.
Each speaker uttered the following items (over 290 items for adults and over 210 items for children):
Calibration data: 6 noise recordings; the “silence word” recording.
Free spontaneous items (adults only): 5 minutes (session time) of free spontaneous, rich context items (story telling) (an open number of spontaneous topics out of a set of 30 topics).
17 elicited spontaneous items (adults only): 3 dates, 2 times, 3 proper names, 2 city names, 1 letter sequence, 2 answers to questions, 3 telephone numbers, 1 language.
Read speech: 30 phonetically rich sentences uttered by adults and 60 uttered by children; 5 phonetically rich words (adults only); 4 isolated digits; 1 isolated digit sequence; 4 connected digit sequences; 1 telephone number; 3 natural numbers; 1 money amount; 2 time phrases (T1: analogue, T2: digital); 3 dates (D1: analogue, D2: relative and general date, D3: digital); 3 letter sequences; 1 proper name; 2 city or street names; 2 questions; 2 special keyboard characters; 1 Web address; 1 email address.
208 application specific words and phrases per session (adults).
74 toy commands, 14 phone commands and 34 general commands (children).
The following age distribution has been obtained:
Adults: 321 speakers are between 16 and 30, 182 speakers are between 31 and 45, 103 speakers are over 46.
Children: All 51 speakers are between 11 and 14.
A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
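As a minimal sketch of what the stated signal format implies for a reader of the files, the snippet below decodes 16-bit samples in lo-hi (little-endian) byte order and derives a duration from the 16 kHz sampling rate. It assumes headerless single-channel signal files and takes the "unsigned" sample type from the description above at face value; the function names are hypothetical and not part of any Speecon tooling.

```python
import struct

SAMPLE_RATE_HZ = 16_000   # 16 kHz, as stated in the description
BYTES_PER_SAMPLE = 2      # 16 bit

def decode_samples(raw: bytes) -> list[int]:
    """Decode raw bytes as 16-bit unsigned little-endian samples.

    Assumption: the file holds bare samples with no header.
    """
    if len(raw) % BYTES_PER_SAMPLE:
        raise ValueError("byte count is not a multiple of the sample size")
    count = len(raw) // BYTES_PER_SAMPLE
    # "<" selects little-endian (Intel, lo-hi); "H" is an unsigned 16-bit value.
    return list(struct.unpack(f"<{count}H", raw))

def duration_seconds(raw: bytes) -> float:
    """Duration of a single-channel recording at the stated sampling rate."""
    return len(raw) / BYTES_PER_SAMPLE / SAMPLE_RATE_HZ

# Usage with synthetic bytes (not real corpus data).
example = b"\x01\x00\x00\x80"
print(decode_samples(example))
print(duration_seconds(bytes(32_000)))
```

If the samples turn out to be signed in practice, the format character would be `h` instead of `H`; the label file for each signal is the place to check.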
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
High-Quality UK English Wake Word Dataset for AI & Speech Models
Title: UK English Language Dataset
Dataset Type: Wake Word
Description: Wake Words / Voice Command / Trigger Word /…
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Dataset population: Households
English as a household language
This variable describes whether English is used as a main language in a household.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides Census 2021 estimates that classify usual residents in England and Wales by their proficiency in English. The estimates are as at Census Day, 21 March 2021.
Area type
Census 2021 statistics are published for a number of different geographies. These can be large, for example the whole of England, or small, for example an output area (OA), the lowest level of geography for which statistics are produced.
For higher levels of geography, more detailed statistics can be produced. When a lower level of geography is used, such as output areas (which have a minimum of 100 persons), the statistics produced have less detail. This is to protect the confidentiality of people and ensure that individuals or their characteristics cannot be identified.
Coverage
Census 2021 statistics are published for the whole of England and Wales. Data are also available in these geographic types:
Proficiency in English language (6 categories)
How well people whose main language is not English (English or Welsh in Wales) speak English.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This provides estimates of the percentage of usual residents aged 3 and over in England and Wales by their proficiency in English. The proficiency in English classification corresponds to the tick box response options on the census questionnaire. Estimates are used to help central government, local authorities and the NHS allocate resources and provide services for non-English speakers. They also help public service providers effectively target the delivery of their services, for example translation and interpretation services and material in alternative languages. Statistical Disclosure Control: to protect against disclosure of personal information from the Census, records in the Census database have been swapped between different geographic areas, so some counts will be affected. In the main, the greatest effects will be at the lowest geographies, since the record swapping is targeted towards those households with unusual characteristics in small areas. Data is powered by LG Inform Plus and automatically checked for new data on the 3rd of each month.
In 2024/25, approximately 21.4 percent of all pupils at schools in England did not speak English as a first language, compared with 18 percent in 2015/16.
In 2021, most people in England and Wales who did not have English as their main language were proficient in English to some degree, with 43.9 percent reporting that they could speak English "very well" and a further 35.8 percent that they could speak it "well".
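Combining the two quoted shares gives the proportion who speak English at least "well". The short sketch below does only that arithmetic; the remainder it reports is simply everyone in the other response categories, which are not broken down in this text.

```python
very_well = 43.9   # percent, from the text
well = 35.8        # percent, from the text

# Share of people with a non-English main language who speak English
# "well" or "very well".
well_or_better = round(very_well + well, 1)

# Everyone else, across the remaining response categories combined.
remainder = round(100 - well_or_better, 1)

print(well_or_better, remainder)
```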
English and maths (formerly Skills for Life) qualifications are designed to give people the reading, writing, maths and communication skills they need in everyday life, to operate effectively in work and to help them succeed on other training courses.
These data provide information on participation and achievements for English and maths qualifications and are broken down into a number of key reports.
If you need help finding data please refer to the table finder tool to search for specific breakdowns available for FE statistics.
MS Excel Spreadsheet, 10.9 MB
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides Census 2021 estimates that classify usual residents aged 3 years and over in England and Wales by proficiency in English and by age. The estimates are as at Census Day, 21 March 2021.
Estimates for single year of age between ages 90 and 100+ are less reliable than other ages. Estimation and adjustment at these ages was based on the age range 90+ rather than five-year age bands. Read more about this quality notice.
Area type
Census 2021 statistics are published for a number of different geographies. These can be large, for example the whole of England, or small, for example an output area (OA), the lowest level of geography for which statistics are produced.
For higher levels of geography, more detailed statistics can be produced. When a lower level of geography is used, such as output areas (which have a minimum of 100 persons), the statistics produced have less detail. This is to protect the confidentiality of people and ensure that individuals or their characteristics cannot be identified.
Lower tier local authorities
Lower tier local authorities provide a range of local services. There are 309 lower tier local authorities in England, made up of 181 non-metropolitan districts, 59 unitary authorities, 36 metropolitan districts and 33 London boroughs (including City of London). In Wales there are 22 local authorities, all of which are unitary authorities.
Coverage
Census 2021 statistics are published for the whole of England and Wales. However, you can choose to filter areas by:
Proficiency in English language
How well people whose main language is not English (English or Welsh in Wales) speak English.
Age
A person’s age on Census Day, 21 March 2021 in England and Wales. Infants aged under 1 year are classified as 0 years of age.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the UK English General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of English speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world UK English communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade English speech models that understand and respond to authentic British accents and dialects.
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of UK English. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
The dataset comes with granular metadata for both speakers and recordings:
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
This dataset is a versatile resource for multiple English speech and language AI applications:
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Dataset population: Persons aged 3 and over
Age
Age is derived from the date of birth question and is a person's age at their last birthday, at 27 March 2011. Dates of birth that imply an age over 115 are treated as invalid and the person's age is imputed. Infants less than one year old are classified as 0 years of age.
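The derivation described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the census processing code: the function names are hypothetical, and returning `None` merely stands in for "treated as invalid and imputed", since the imputation method is not described here.

```python
from datetime import date
from typing import Optional

CENSUS_DAY_2011 = date(2011, 3, 27)   # census day stated in the text
MAX_VALID_AGE = 115                   # implied ages above this are invalid

def age_at_last_birthday(dob: date, on: date = CENSUS_DAY_2011) -> int:
    """Whole years completed between the date of birth and the given day."""
    had_birthday_this_year = (on.month, on.day) >= (dob.month, dob.day)
    return on.year - dob.year - (0 if had_birthday_this_year else 1)

def census_age(dob: date) -> Optional[int]:
    """Age on census day, or None where the value would need imputing."""
    age = age_at_last_birthday(dob)
    if age > MAX_VALID_AGE:
        return None   # placeholder for imputation (method not given here)
    return age

print(census_age(date(2010, 12, 1)))   # infant under one year old
print(census_age(date(1980, 3, 28)))   # birthday falls the day after census day
print(census_age(date(1890, 1, 1)))    # implies an age over 115
```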
General health
General health is a self-assessment of a person's general state of health. People were asked to assess whether their health was very good, good, fair, bad or very bad.
For England and Wales, this assessment is not based on a person's health over any specified period of time.
Proficiency in English
Proficiency in English language classifies people whose main language is not English (or not English or Welsh in Wales) according to their ability to speak English. A person is classified in one of the categories:
This question was handled slightly differently in the England and Wales censuses.
In the English census a tick box was used in Question 18, asking 'What is your main language?', giving the option of 'English' or 'Other'.
In the Welsh census, a tick box was used in Question 18, asking 'What is your main language?', giving the option of 'English or Welsh' or 'Other'.
Those who ticked 'Other' would be asked about their ability to speak English.
As a consequence, a person who reports their main language to be Welsh and completed the Welsh census will not be asked about their ability to speak English, whereas a person who indicates that their main language is Welsh and lives in England would be asked about their ability to speak English.
Copies of the census forms can be found here: UK census forms.
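The routing difference between the two questionnaires can be expressed as a small rule. The sketch below is only an illustration of the logic described above; the function name and the string labels for the forms and languages are hypothetical.

```python
# Main-language answers that are covered by the first tick box on each form,
# as described in the text: 'English' in England, 'English or Welsh' in Wales.
FIRST_TICK_BOX = {
    "England": {"English"},
    "Wales": {"English", "Welsh"},
}

def asked_about_english_ability(census_form: str, main_language: str) -> bool:
    """Return True if the respondent would tick 'Other' and so be asked
    about their ability to speak English."""
    if census_form not in FIRST_TICK_BOX:
        raise ValueError(f"unknown census form: {census_form!r}")
    return main_language not in FIRST_TICK_BOX[census_form]

# A Welsh speaker is routed differently depending on which form they complete.
print(asked_about_english_ability("Wales", "Welsh"))
print(asked_about_english_ability("England", "Welsh"))
```

This makes the asymmetry explicit: the same main language produces a proficiency answer on one form and none on the other, which matters when comparing proficiency counts across the border.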
Sex
The classification of a person as either male or female.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Dataset population: Persons aged 3 and over
Age upon arrival in the UK
The age of arrival in the UK is derived from the date that a person last arrived to live in the UK and their age. Short visits away from the UK are not counted in determining the date that a person last arrived.
Age of arrival is only applicable to usual residents who were not born in the UK. It does not include usual residents born in the UK who have emigrated and since returned; these are recorded in the category 'Born in the UK'.
Proficiency in English
Proficiency in English language classifies people whose main language is not English (or not English or Welsh in Wales) according to their ability to speak English. A person is classified in one of the categories:
This question was handled slightly differently in the England and Wales censuses.
In the English census a tick box was used in Question 18, asking 'What is your main language?', giving the option of 'English' or 'Other'.
In the Welsh census, a tick box was used in Question 18, asking 'What is your main language?', giving the option of 'English or Welsh' or 'Other'.
Those who ticked 'Other' would be asked about their ability to speak English.
As a consequence, a person who reports their main language to be Welsh and completed the Welsh census will not be asked about their ability to speak English, whereas a person who indicates that their main language is Welsh and lives in England would be asked about their ability to speak English.
Copies of the census forms can be found here: UK census forms.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Dataset population: Persons aged 3 and over
Age
Age is derived from the date of birth question and is a person's age at their last birthday, at 27 March 2011. Dates of birth that imply an age over 115 are treated as invalid and the person's age is imputed. Infants less than one year old are classified as 0 years of age.
Main language
The language that is a person's first or preferred language.
This information helps central government, local authorities and the NHS to allocate resources and provide services for non-English speakers, for example English teaching and translation services. It is a better indicator than country of birth, which was previously used to forecast the additional cost of providing services to people whose first language is not English.
The data are also used to assess the impact of English or Welsh language ability on employment and other social inclusion indicators.
Information on the number of British Sign Language users helps with service planning and assists in developing policies to address the needs of the deaf community.
These statistics are used by public service providers to effectively target the delivery of their services, for example in the provision of translation and interpretation services, the availability of English language lessons, and the distribution of official information leaflets in alternative languages.
Sex
The classification of a person as either male or female.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Dataset population: Persons aged 3 and over
Age
Age is derived from the date of birth question and is a person's age at their last birthday, at 27 March 2011. Dates of birth that imply an age over 115 are treated as invalid and the person's age is imputed. Infants less than one year old are classified as 0 years of age.
Proficiency in English
Proficiency in English language classifies people whose main language is not English (or not English or Welsh in Wales) according to their ability to speak English. A person is classified in one of the categories:
This question was handled slightly differently in the England and Wales censuses.
In the English census a tick box was used in Question 18, asking "What is your main language?", giving the option of 'English' or 'Other'.
In the Welsh census, a tick box was used in Question 18, asking "What is your main language?", giving the option of 'English or Welsh' or 'Other'.
Those who ticked 'Other' would be asked about their ability to speak English.
As a consequence, a person who reports their main language to be Welsh and completed the Welsh census will not be asked about their ability to speak English, whereas a person who indicates that their main language is Welsh and lives in England would be asked about their ability to speak English.
Copies of the census forms can be found here: UK census forms.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
According to the 2021 Census, 81.7% of the population of England and Wales was white, 9.3% Asian, 4.0% black, 2.9% mixed and 2.1% from other ethnic groups.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Decennial life tables for males and for females have been constructed based on the mortality experience of the population of England and Wales during the three years 2010, 2011 and 2012.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides Census 2022 estimates of English language skills for individuals in Scotland.
A classification of a person's skills in the English language. It breaks down into combinations of "Understand (spoken)", "Speak", "Read" and "Write".
Details of classification can be found here
The quality assurance report can be found here