21 datasets found
  1. HISPANIC OR LATINO AND RACE - DP05_PIN_T - Dataset - CKAN

    • portal.tad3.org
    Updated Nov 17, 2024
    Cite
    (2024). HISPANIC OR LATINO AND RACE - DP05_PIN_T - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/hispanic-or-latino-and-race-dp05_pin_t
    Dataset updated
    Nov 17, 2024
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ACS DEMOGRAPHIC AND HOUSING ESTIMATES - HISPANIC OR LATINO AND RACE - DP05. Universe - Total population. Survey-Program - American Community Survey 5-year estimates. Years - 2020, 2021, 2022.

    The terms “Hispanic,” “Latino,” and “Spanish” are used interchangeably. Some respondents identify with all three terms while others may identify with only one of these three specific terms. People who identify with the terms “Hispanic,” “Latino,” or “Spanish” are those who classify themselves in one of the specific Hispanic, Latino, or Spanish categories listed on the questionnaire (“Mexican, Mexican Am., or Chicano,” “Puerto Rican,” or “Cuban”) as well as those who indicate that they are “another Hispanic, Latino, or Spanish origin.” People who do not identify with one of the specific origins listed on the questionnaire but indicate that they are “another Hispanic, Latino, or Spanish origin” are those whose origins are from Spain, the Spanish-speaking countries of Central or South America, or another Spanish culture or origin. Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before their arrival in the United States. People who identify their origin as Hispanic, Latino, or Spanish may be of any race.

  2. Census Data - Languages spoken in Chicago, 2008 – 2012

    • data.cityofchicago.org
    • healthdata.gov
    • +4more
    csv, xlsx, xml
    Updated Sep 12, 2014
    Cite
    U.S. Census Bureau (2014). Census Data - Languages spoken in Chicago, 2008 – 2012 [Dataset]. https://data.cityofchicago.org/Health-Human-Services/Census-Data-Languages-spoken-in-Chicago-2008-2012/a2fk-ec6q
    Available download formats: xlsx, xml, csv
    Dataset updated
    Sep 12, 2014
    Dataset provided by
    United States Census Bureau: http://census.gov/
    Authors
    U.S. Census Bureau
    Area covered
    Chicago
    Description

    This dataset contains estimates of the number of residents aged 5 years or older in Chicago who “speak English less than very well,” by the non-English language spoken at home and community area of residence, for the years 2008 – 2012. See the full dataset description for more information at: https://data.cityofchicago.org/api/views/fpup-mc9v/files/dK6ZKRQZJ7XEugvUavf5MNrGNW11AjdWw0vkpj9EGjg?download=true&filename=P:\EPI\OEPHI\MATERIALS\REFERENCES\ECONOMIC_INDICATORS\Dataset_Description_Languages_2012_FOR_PORTAL_ONLY.pdf

  3. English Proficiency by Age - Datasets - CTData.org

    • data.ctdata.org
    Updated Mar 16, 2016
    Cite
    (2016). English Proficiency by Age - Datasets - CTData.org [Dataset]. http://data.ctdata.org/dataset/english-proficiency-by-age
    Dataset updated
    Mar 16, 2016
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    English Proficiency by Age reports demographic details regarding how many people speak English natively, and the proficiency of non-native speakers.

  4. Population of the Limited English Proficient (LEP) Speakers by Community...

    • catalog.data.gov
    • data.cityofnewyork.us
    • +1more
    Updated Jan 19, 2024
    Cite
    data.cityofnewyork.us (2024). Population of the Limited English Proficient (LEP) Speakers by Community District [Dataset]. https://catalog.data.gov/dataset/population-of-the-limited-english-proficient-lep-speakers-by-community-district
    Dataset updated
    Jan 19, 2024
    Dataset provided by
    data.cityofnewyork.us
    Description

    Many residents of New York City speak more than one language; a number of them speak and understand non-English languages more fluently than English. This dataset, derived from the Census Bureau's American Community Survey (ACS), includes information on over 1.7 million limited English proficient (LEP) residents and a subset of that population called limited English proficient citizens of voting age (CVALEP) at the Community District level. There are 59 community districts throughout NYC, with each district being represented by a Community Board.

  5. peoples_speech

    • huggingface.co
    Cite
    MLCommons, peoples_speech [Dataset]. https://huggingface.co/datasets/MLCommons/peoples_speech
    Dataset authored and provided by
    MLCommons
    License

    Attribution 2.0 (CC BY 2.0): https://creativecommons.org/licenses/by/2.0/
    License information was derived automatically

    Description

    Dataset Card for People's Speech

      Dataset Summary
    

    The People's Speech Dataset is among the world's largest English speech recognition corpora today that are licensed for academic and commercial usage under CC-BY-SA and CC-BY 4.0. It includes 30,000+ hours of transcribed speech in English with a diverse set of speakers. This open dataset is large enough to train speech-to-text systems and, crucially, is available with a permissive license.

      Supported Tasks… See the full description on the dataset page: https://huggingface.co/datasets/MLCommons/peoples_speech.
    
  6. Oakland Equal Access Accommodations

    • kaggle.com
    Updated Dec 6, 2019
    Cite
    City of Oakland (2019). Oakland Equal Access Accommodations [Dataset]. https://www.kaggle.com/cityofoakland/oakland-equal-access-accommodations/code
    Dataset updated
    Dec 6, 2019
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    City of Oakland
    Area covered
    Oakland
    Description

    Content

    The equal access accommodations Indicator is measured by comparing the percent of public contact position (PCP) employees who speak Spanish to the percent of Spanish speakers who have limited English proficiency (LEP) citywide. The Equal Access to Services Ordinance includes a requirement for City departments to offer bilingual services based on citywide demographics. In FY2016-2017, the two languages required by the ordinance were Spanish and Chinese. We chose to measure Spanish-speaking PCP employees for this Indicator because Spanish speakers comprise a larger proportion of the population.

    Context

    This is a dataset hosted by the city of Oakland in California. The organization has an open data platform found here and they update their information according to the amount of data that is brought in. Explore Oakland's Data using Kaggle and all of the data sources available through the city of Oakland organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.

    Cover photo by Luís Eusébio on Unsplash
    Unsplash Images are distributed under a unique Unsplash License.

    This dataset is distributed under NA

  7. Census 21 - English proficiency ward

    • data.leicester.gov.uk
    csv, excel, geojson +1
    Updated Jun 26, 2023
    Cite
    (2023). Census 21 - English proficiency ward [Dataset]. https://data.leicester.gov.uk/explore/dataset/census-21-english-proficiency-ward/
    Available download formats: json, geojson, excel, csv
    Dataset updated
    Jun 26, 2023
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    The census is undertaken by the Office for National Statistics every 10 years and gives us a picture of all the people and households in England and Wales. The most recent census took place in March of 2021.

    The census asks every household questions about the people who live there and the type of home they live in. In doing so, it helps to build a detailed snapshot of society. Information from the census helps the government and local authorities to plan and fund local services, such as education, doctors' surgeries and roads.

    Key census statistics for Leicester are published on the open data platform to make information accessible to local services, voluntary and community groups, and residents. There is also a dashboard published showcasing various datasets from the census allowing users to view data for all wards and compare this with Leicester overall statistics.

    Further information about the census and full datasets can be found on the ONS website - https://www.ons.gov.uk/census/aboutcensus/censusproducts

    Proficiency in English

    This dataset provides Census 2021 estimates that classify usual residents in England and Wales by their proficiency in English. The estimates are as at Census Day, 21 March 2021.

    Definition: How well people whose main language is not English (English or Welsh in Wales) speak English.

    This dataset provides details for the electoral wards of Leicester city.

  8. People Speaking English Less Than "Very Well" GIS

    • hub.arcgis.com
    • data-sccphd.opendata.arcgis.com
    Updated Aug 24, 2022
    Cite
    Santa Clara County Public Health (2022). People Speaking English Less Than "Very Well" GIS [Dataset]. https://hub.arcgis.com/maps/sccphd::people-speaking-english-less-than-very-well-gis
    Dataset updated
    Aug 24, 2022
    Dataset authored and provided by
    Santa Clara County Public Health
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Table contains count and percentage of county residents ages 5 years and older who speak English less than "very well". Data are presented at county, city, zip code and census tract level. Data are presented for zip codes (ZCTAs) fully within the county. Source: U.S. Census Bureau, 2016-2020 American Community Survey 5-year estimates, Table S1601; data accessed on August 23, 2022 from https://api.census.gov. The 2020 Decennial geographies are used for data summarization.

    METADATA:

    • notes (String): Lists table title, notes, sources
    • geolevel (String): Level of geography
    • GEOID (Numeric): Geography ID
    • NAME (String): Name of geography
    • pop_5plus (Numeric): Population ages 5 years and older
    • speak_Eng_lt_very_well (Numeric): Number of people ages 5 and older who speak English less than "very well"
    • pct_speak_Eng_lt_very_well (Numeric): Percent of people ages 5 and older who speak English less than "very well"
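
    The percent field in the metadata above can presumably be derived from the two count fields. The sketch below assumes it is the simple ratio of speak_Eng_lt_very_well to pop_5plus, times 100; the description does not state the exact formula or rounding, and the numbers used are made up for illustration.

```python
def pct_speak_eng_lt_very_well(pop_5plus, speak_eng_lt_very_well):
    """Percent of residents ages 5+ who speak English less than "very well".

    Assumption: percent = count / population * 100, with no rounding.
    Returns None when the population is zero or negative.
    """
    if pop_5plus <= 0:
        return None
    return 100.0 * speak_eng_lt_very_well / pop_5plus


# Hypothetical tract: 2,000 residents ages 5+, 500 of whom
# speak English less than "very well".
print(pct_speak_eng_lt_very_well(2000, 500))
```

    Returning None for an empty population is a design choice for this sketch; the published table may handle such geographies differently.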

  9. Language Access Secret Shopper (LASS) Ratings

    • catalog.data.gov
    • data.cityofnewyork.us
    Updated Dec 20, 2024
    Cite
    data.cityofnewyork.us (2024). Language Access Secret Shopper (LASS) Ratings [Dataset]. https://catalog.data.gov/dataset/language-access-secret-shopper-lass-ratings
    Dataset updated
    Dec 20, 2024
    Dataset provided by
    data.cityofnewyork.us
    Description

    This dataset shows the work of the Language Access Secret Shopper (LASS) program from 2014 onward (though the LASS program did not run in 2020 and 2021 due to the COVID-19 pandemic). The LASS program assigns secret shoppers to visit more than 200 of New York City’s service centers to assess how well the service centers provide services to customers with Limited English Proficiency (LEP). As used in this dataset, LEP individuals do not speak English as their primary language and have a limited ability to read, speak, write, or understand English. Additional information is available at https://www.nyc.gov/site/operations/performance/language-access-secret-shopper-program.page#:~:text=Started%20in%202010%2C%20LASS%20secret,and%20highlight%20exceptional%20customer%20service.

  10. english_dialects

    • huggingface.co
    Cite
    Yoach Lacombe, english_dialects [Dataset]. https://huggingface.co/datasets/ylacombe/english_dialects
    Authors
    Yoach Lacombe
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for "english_dialects"

      Dataset Summary
    

    This dataset consists of 31 hours of transcribed high-quality audio of English sentences recorded by 120 volunteers speaking with different accents of the British Isles. The dataset is intended for linguistic analysis as well as use for speech technologies. The speakers self-identified as native speakers of Southern England, Midlands, Northern England, Welsh, Scottish and Irish varieties of English. The recording scripts… See the full description on the dataset page: https://huggingface.co/datasets/ylacombe/english_dialects.

  11. Data_Sheet_1_Virtual mentalizing imagery therapy for Spanish language Latino...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated May 31, 2023
    Cite
    Liliana Ramirez-Gomez; Julene K. Johnson; Christine Ritchie; Ashley K. Meyer; Emily Tan; Saira Madarasmi; Paulina Gutierrez-Ramirez; Cecilianna Aldarondo-Hernández; David Mischoulon; Sreya Banerjee; Felipe A. Jain (2023). Data_Sheet_1_Virtual mentalizing imagery therapy for Spanish language Latino family dementia caregivers: A feasibility and acceptability study.PDF [Dataset]. http://doi.org/10.3389/fpsyg.2023.961835.s001
    Available download formats: pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    Frontiers
    Authors
    Liliana Ramirez-Gomez; Julene K. Johnson; Christine Ritchie; Ashley K. Meyer; Emily Tan; Saira Madarasmi; Paulina Gutierrez-Ramirez; Cecilianna Aldarondo-Hernández; David Mischoulon; Sreya Banerjee; Felipe A. Jain
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spanish speaking family caregivers of people living with dementia have limited supportive resources in Spanish. There are few validated, culturally acceptable virtual interventions for reducing these caregivers’ psychological distress. We investigated the feasibility of a Spanish language adaptation of a virtual Mentalizing Imagery Therapy (MIT) program, which provides guided imagery and mindfulness training to reduce depression, increase mentalizing, and promote well-being. 12 Spanish-speaking family dementia caregivers received a 4-week virtual MIT program. Follow-up was obtained post group and at 4 months post baseline assessment. Feasibility, acceptability, and satisfaction with MIT were assessed. The primary psychological outcome was depressive symptoms; secondary outcomes included caregiver burden, dispositional mindfulness, perceived stress, well-being, interpersonal support, and neurological quality of life. Statistical analysis was performed with mixed linear models. Caregivers were 52 ± 8 (mean ± SD) years of age. 60% had a high school education or less. Participation in weekly group meetings was 100%. Home practice was performed on average 4 ± 1 times per week [range 2–5]. Satisfaction with MIT reached 19 ± 2 of a possible 20 points. Reduction in depression from baseline was observed by week three (p = 0.01) and maintained at 4 month follow-up (p = 0.05). There were significant improvements in mindfulness post-group, and in caregiver burden and well-being at 4 months. MIT was successfully adapted for Latino Spanish language family dementia caregivers within a virtual group environment. MIT is feasible and acceptable and may help reduce depressive symptoms and improve subjective well-being. Larger, randomized controlled trials of MIT should determine durability of effects and validate efficacy in this population.

  12. 2023 Limited English Proficiency (LEP) for the DVRPC Region Public Use...

    • catalog.dvrpc.org
    • njogis-newjersey.opendata.arcgis.com
    • +1more
    api, geojson, html +1
    Updated Aug 28, 2025
    Cite
    DVRPC (2025). 2023 Limited English Proficiency (LEP) for the DVRPC Region Public Use Microdata Areas [Dataset]. https://catalog.dvrpc.org/dataset/2023-limited-english-proficiency-lep-for-the-dvrpc-region-public-use-microdata-areas
    Available download formats: xml, geojson, api, html
    Dataset updated
    Aug 28, 2025
    Dataset authored and provided by
    DVRPC
    Description

    The Delaware Valley Regional Planning Commission (DVRPC) is committed to upholding the principles and intentions of the 1964 Civil Rights Act and related nondiscrimination statutes in all of the Commission’s work, including publications, products, communications, public input, and decision-making processes. Language barriers may prohibit people who are Limited in English Proficiency (also known as LEP persons) from obtaining services or information, or from participating in public planning processes. To better identify LEP populations and thoroughly evaluate the Commission’s efforts to provide meaningful access, DVRPC has produced this Limited-English Proficiency Plan. This is the data that was used to make the maps for the upcoming plan. Public Use Microdata Areas (PUMAs) are geographies of at least 100,000 people that are nested within states or equivalent entities. States are able to delineate PUMAs within their borders, or use PUMA Criteria provided by the Census Bureau. Census table used: 2019-2023 American Community Survey 5-Year Estimates, Table B16001: Language Spoken at Home by Ability to Speak English for the Population 5 Years and Over. ACS data are derived from a survey and are subject to sampling variability.

    *Limited English Proficiency (LEP) refers to those persons that speak English less than "very well". DVRPC has mapped the below Language Groups for our Plan.

    Spanish

    Russian

    Chinese

    Korean

    Vietnamese

    Source of PUMA boundaries: US Census Bureau, TIGER/Line Files. Please refer to U:_OngoingProjects\LEP\ACS_5YR_B16001_PUMAs_metadata.xlsx for the full attribute lookup and the fields used in making the DVRPC LEP Map Series. Please contact Chris Pollard (cpollard@dvrpc.org) should you have any questions about this dataset.

  13. 520 Hours - French Speaking English Speech Data by Mobile Phone

    • nexdata.ai
    Updated Sep 9, 2023
    Cite
    Nexdata (2023). 520 Hours - French Speaking English Speech Data by Mobile Phone [Dataset]. https://www.nexdata.ai/datasets/speechrecog/989?source=Github
    Dataset updated
    Sep 9, 2023
    Dataset authored and provided by
    Nexdata
    Area covered
    French
    Variables measured
    Format, Country, Speaker, Language, Accuracy Rate, Content category, Recording device, Recording condition, Features of annotation
    Description

    English (France) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering the generic domain, human-machine interaction, smart home commands, in-car commands, numbers, and other domains. Transcribed with text content and other attributes. Our dataset was collected from a large, geographically diverse pool of speakers (1,089 people in total), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets are all GDPR, CCPA, and PIPL compliant.

  14. Data from: Language Development of Non-verbal Children Age 3 Years through 7...

    • icpsr.umich.edu
    ascii, delimited, r +3
    Updated Oct 25, 2016
    Cite
    Brady, Nancy (2016). Language Development of Non-verbal Children Age 3 Years through 7 Years, 2007 to 2012 [Kansas City Metro Area] [Dataset]. http://doi.org/10.3886/ICPSR36472.v1
    Available download formats: r, stata, sas, spss, ascii, delimited
    Dataset updated
    Oct 25, 2016
    Dataset provided by
    Inter-university Consortium for Political and Social Research: https://www.icpsr.umich.edu/web/pages/
    Authors
    Brady, Nancy
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/36472/terms

    Time period covered
    2007 - 2012
    Area covered
    Kansas, Kansas
    Description

    The Language Development of Non-verbal Children Age 3 Years through 7 Years in the Kansas Metro Area is one of the three projects in the Communication of People with MR, 2006 to 2012 Series, which focuses on identifying participant variables that predict success in increasing communication skills of individuals with intellectual disabilities. Data for Dataset 1 of this study were collected to illustrate how acquisition of symbolic communication using Voice Output Communication Aid (VOCA) affects the development of successful communication exchanges. For the data collection of Dataset 1, children were recruited by contacting school districts in and near the Kansas City metropolitan area, specifically, in Topeka, Kansas, and Wichita, Kansas. Teachers and speech-language pathologists were asked to nominate any children meeting specific criteria. The 93 children who were enrolled were administered the Mullen Scales of Early Learning and the Preschool Language Scale. A structured play assessment was also administered. Subsequently, data for Dataset 2 were collected to analyze and compare 19 Spanish-speaking children to the original sample. Both data files contain the results of the Complexity of Communication Scale, a measure developed by the Communication of People with MR project.

  15. Speech Accent Archive

    • marketplace.sshopencloud.eu
    Updated Apr 24, 2020
    Cite
    (2020). Speech Accent Archive [Dataset]. https://marketplace.sshopencloud.eu/dataset/jnNNLE
    Dataset updated
    Apr 24, 2020
    Description

    Everyone who speaks a language, speaks it with an accent. A particular accent essentially reflects a person's linguistic background. When people listen to someone speak with a different accent from their own, they notice the difference, and they may even make certain biased social judgments about the speaker. The speech accent archive is established to uniformly exhibit a large set of speech accents from a variety of language backgrounds. Native and non-native speakers of English all read the same English paragraph and are carefully recorded. The archive is constructed as a teaching tool and as a research tool. It is meant to be used by linguists as well as other people who simply wish to listen to and compare the accents of different English speakers. This dataset allows you to compare the demographic and linguistic backgrounds of the speakers in order to determine which variables are key predictors of each accent. The speech accent archive demonstrates that accents are systematic rather than merely mistaken speech. All of the linguistic analyses of the accents are available for public scrutiny. We welcome comments on the accuracy of our transcriptions and analyses.

  16. GCCSA-P11a Proficiency in Spoken Eng by Year of Arrival by Sex-Census 2016 -...

    • data.aurin.org.au
    Updated Mar 5, 2025
    Cite
    (2025). GCCSA-P11a Proficiency in Spoken Eng by Year of Arrival by Sex-Census 2016 - Dataset - AURIN [Dataset]. https://data.aurin.org.au/dataset/au-govt-abs-census-gccsa-p11a-english-profic-by-arrival-yr-by-sex-census-2016-gccsa-2016
    Dataset updated
    Mar 5, 2025
    License

    Attribution 2.5 (CC BY 2.5): https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Description

    GCCSA based data for Proficiency in Spoken English by Year of Arrival in Australia by Sex, in Place of Enumeration Profile (PEP), 2016 Census. Count of persons born overseas in the following categories of proficiency: Speaks English only, Multilingual speaks English well, Multilingual speaks English not well/at all, Multilingual English proficiency not stated, Total multilingual people, Language and English proficiency not stated, and Total persons. Excludes persons born in 'Australia, (includes External Territories), nfd', 'Norfolk Island' and 'Australian External Territories, nec' and persons who did not state a country of birth. Where year of arrival is stated as 2016, it refers to the period from 1 January 2016 to 9 August 2016. P11 is broken up into 2 sections (P11a – P11b); this section contains 'Males Speaks English only Year of arrival Before 2000' - 'Persons Speaks other language and speaks English Total Year of arrival 2000 2005'. The data is by GCCSA 2016 boundaries. Periodicity: 5-Yearly. Note: There are small random adjustments made to all cell values to protect the confidentiality of data. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals. For more information visit the data source: http://www.abs.gov.au/census.

  17. Identified Areas of Emerging CALD Communities - Non-main English-Speaking...

    • gimi9.com
    Updated Feb 1, 2025
    Cite
    (2025). Identified Areas of Emerging CALD Communities - Non-main English-Speaking Country of Birth (Polygon) (SA1 Level) (2001-2021) | gimi9.com [Dataset]. https://gimi9.com/dataset/au_ecald_dataset_1_cald_cob_polygon/
    Dataset updated
    Feb 1, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An emerging CALD community refers to a place with a significant increase in the number of Culturally and Linguistically Diverse (CALD) populations according to ABS census counts. These communities may experience social barriers that adversely affect the quality of life. Emerging CALD Communities are an ongoing feature of the Australian cultural landscape. Further research has been required into the status of Emerging CALD Communities. This project concerns how social and environmental inequalities have been distributed in Australia's CALD populations over the last two decades. It aims to measure changes in the CALD populations and exposure to urban heat and greening due to social inequities and climate change. Two layers of CALD total populations at the SA1 level were generated for five consecutive Australian Census years (2001, 2006, 2011, 2016, 2021) using historic ABS Census datasets. The first layer represents individuals who speak a non-English language at home, while the second layer includes those born in a country where the main language is non-English. Both layers were transformed and aggregated to ensure consistency across Census years, providing a detailed analysis of CALD population trends over two decades. This project expands AURINʼs infrastructure of data and tools, in particular the integrated Heat Vulnerability Index toolkit developed by CI Sun that has provided cloud computing tools for deriving environmental indicators. The outcome of this project is a new nationwide longitudinal database with the quantification of CALD populations and social-environmental inequalities, which will fill a critical gap for AURINʼs data catalogue. The database supports and facilitates multidisciplinary research to perform spatial and statistical analyses to reveal the disproportionate exposure to urban heat and greening across CALD communities in Australia. 
Spatially explicit information can be generated from the database for planners to make intervention strategies for vulnerable CALD populations, to diminish the inequality for CALD. This significantly advances AURINʼs capability to support CALD research across social science, public health, and the environment, and achieve SDG goals.

  18. 2024 American Community Survey: B16005E | Nativity by Language Spoken at...

    • data.census.gov
    Cite
    ACS, 2024 American Community Survey: B16005E | Nativity by Language Spoken at Home by Ability to Speak English for the Population 5 Years and Over (Native Hawaiian and Other Pacific Islander Alone) (ACS 1-Year Estimates Detailed Tables) [Dataset]. https://data.census.gov/table/ACSDT1Y2024.B16005E?q=Native-Born&t=Language+Spoken+at+Home&g=030XX00US1
    Dataset provided by
    United States Census Bureau: http://census.gov/
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024
    Description

    Key Table Information
    Table Title: Nativity by Language Spoken at Home by Ability to Speak English for the Population 5 Years and Over (Native Hawaiian and Other Pacific Islander Alone)
    Table ID: ACSDT1Y2024.B16005E
    Survey/Program: American Community Survey
    Year: 2024
    Dataset: ACS 1-Year Estimates Detailed Tables
    Source: U.S. Census Bureau, 2024 American Community Survey, 1-Year Estimates
    Dataset Universe: The dataset universe of the American Community Survey (ACS) is the U.S. resident population and housing. For more information about ACS residence rules, see the ACS Design and Methodology Report. Note that each table describes the specific universe of interest for that set of estimates.

    Methodology
    Unit(s) of Observation: American Community Survey (ACS) data are collected from individuals living in housing units and group quarters, and about housing units whether occupied or vacant. For more information about ACS sampling and data collection, see the ACS Design and Methodology Report.
    Geography Coverage: ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year. Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
    Sampling: The ACS consists of two separate samples: housing unit addresses and group quarters facilities. Independent housing unit address samples are selected for each county or county-equivalent in the U.S. and Puerto Rico, with sampling rates depending on a measure of size for the area. For more information on sampling in the ACS, see the Accuracy of the Data document.
    Confidentiality: The Census Bureau has modified or suppressed some estimates in ACS data products to protect respondents' confidentiality. Title 13 United States Code, Section 9, prohibits the Census Bureau from publishing results in which an individual's data can be identified. For more information on confidentiality protection in the ACS, see the Accuracy of the Data document.

    Technical Documentation/Methodology: Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website.
    Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.
    Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.

    Weights: ACS estimates are obtained from a raking ratio estimation procedure that results in the assignment of two sets of weights: a weight to each sample person record and a weight to each sample housing unit record. Estimates of person characteristics are based on the person weight. Estimates of family, household, and housing unit characteristics are based on the housing unit weight. For any given geographic area, a characteristic total is estimated by summing the weights assigned to the persons, households, families or housing units possessing the characteristic in the geographic area. For more information on weighting and estimation in the ACS, see the Accuracy of the Data document.
    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and ...
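
The margin-of-error and weighting statements in the description above can be illustrated with a small sketch. It is not the Census Bureau's code: the function names and the sample numbers are hypothetical, and the only outside fact assumed is that a 90 percent margin of error corresponds to 1.645 standard errors.

```python
from typing import Iterable, Tuple

# Assumption: a 90 percent margin of error equals 1.645 standard errors.
Z_90 = 1.645


def confidence_bounds(estimate: float, moe: float) -> Tuple[float, float]:
    """Return (lower, upper): the estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe: float) -> float:
    """Convert a 90 percent margin of error back to a standard error."""
    return moe / Z_90


def weighted_total(records: Iterable[Tuple[float, bool]]) -> float:
    """Sum the weights of the records that possess the characteristic."""
    return sum(weight for weight, has_characteristic in records if has_characteristic)


# Hypothetical values, for illustration only (not taken from table B16005E).
lower, upper = confidence_bounds(estimate=1200, moe=150)

# Each record is (person weight, possesses the characteristic?).
sample = [(50.0, True), (75.0, False), (60.0, True)]
total = weighted_total(sample)
```

Here the bounds are simply the estimate shifted down and up by the published margin of error, and the characteristic total counts only the weights of records that have the characteristic. Nonsampling error is not captured by either calculation.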

  19. a

    SA2-P11b Proficiency in Spoken Eng by Year of Arrival by Sex-Census 2016 -...

    • data.aurin.org.au
    Updated Mar 5, 2025
    (2025). SA2-P11b Proficiency in Spoken Eng by Year of Arrival by Sex-Census 2016 - Dataset - AURIN [Dataset]. https://data.aurin.org.au/dataset/au-govt-abs-census-sa2-p11b-english-profic-by-arrival-yr-by-sex-census-2016-sa2-2016
    License

    Attribution 2.5 (CC BY 2.5) https://creativecommons.org/licenses/by/2.5/
    License information was derived automatically

    Description

    SA2-based data for Proficiency in Spoken English by Year of Arrival in Australia by Sex, in Place of Enumeration Profile (PEP), 2016 Census. Count of persons born overseas in the following categories of proficiency: Speaks English only, Multilingual speaks English well, Multilingual speaks English not well/at all, Multilingual English proficiency not stated, Total multilingual people, Language and English proficiency not stated, and Total persons. Excludes persons born in 'Australia, (includes External Territories), nfd', 'Norfolk Island' and 'Australian External Territories, nec' and persons who did not state a country of birth. Where year of arrival is stated as 2016, it refers to the period from 1 January 2016 to 9 August 2016. P11 is broken up into 2 sections (P11a – P11b); this section contains 'Persons Speaks other language and speaks English Total Year of arrival 2006-2010' - 'Persons Total Total'. The data is by SA2 2016 boundaries. Periodicity: 5-Yearly. Note: There are small random adjustments made to all cell values to protect the confidentiality of data. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals. For more information visit the data source: http://www.abs.gov.au/census.

  20. f

    S1 Data -

    • plos.figshare.com
    zip
    Updated Mar 22, 2024
    Yunxia Wang (2024). S1 Data - [Dataset]. http://doi.org/10.1371/journal.pone.0299425.s001
    Available download formats: zip
    Dataset provided by
    PLOS ONE
    Authors
    Yunxia Wang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To help non-native English speakers quickly master English vocabulary and improve their reading, writing, listening, speaking, and communication skills, this study designs, constructs, and improves an English vocabulary learning model that integrates Spiking Neural Network (SNN) and Convolutional Long Short-Term Memory (Conv LSTM) algorithms. Fusing the SNN and Conv LSTM algorithms can fully utilize the advantages of SNN in processing temporal information and of Conv LSTM in sequence data modeling, yielding a fusion model that performs well in English vocabulary learning. By adding information transfer and interaction modules, the feature learning and the timing information processing are optimized to improve the vocabulary learning ability of the model in different text contents. The training set used in this study is an open data set from the WordNet and Oxford English Corpus corpora. The model is presented as a computer program and applied to an English learning application program, an online vocabulary learning platform, or language education software. The experiment will use the open data set to generate a test set with text volume ranging from 100 to 4000. The performance indicators of the proposed fusion model are compared with those of five traditional models and applied to the latest vocabulary exercises. From the perspective of learners, 10 indicators are considered: model accuracy, loss, polysemy processing accuracy, training time, syntactic structure capturing accuracy, vocabulary coverage, F1-score, context understanding accuracy, word sense disambiguation accuracy, and word order relation processing accuracy. The experimental results reveal that the performance of the fusion model is better under different text sizes.
    In the range of 100–400 text volume, the accuracy is 0.75–0.77, the loss is less than 0.45, the F1-score is greater than 0.75, the training time is within 300s, and the other performance indicators are more than 65%; in the range of 500–1000 text volume, the accuracy is 0.81–0.83, the loss is not more than 0.40, the F1-score is not less than 0.78, the training time is within 400s, and the other performance indicators are above 70%; in the range of 1500–3000 text volume, the accuracy is 0.82–0.84, the loss is less than 0.28, the F1-score is not less than 0.78, the training time is within 600s, and the remaining performance indicators are higher than 70%. The fusion model can adapt to various types of questions in practical application. After evaluation by professional teachers, the average scores for choice, filling-in-the-blank, spelling, matching, exercises, and synonyms are 85.72, 89.45, 80.31, 92.15, 87.62, and 78.94, which are much higher than those of the traditional models. This shows that as text volume increases, the performance of the fusion model gradually improves, with higher accuracy and lower loss. At the same time, in practical application, the fusion model proposed in this study has a good effect on English learning tasks and offers greater benefits for people unfamiliar with English vocabulary structure, grammar, and question types. This study aims to provide efficient and accurate natural language processing tools to help non-native English speakers understand and apply language more easily, and improve English vocabulary learning and comprehension.
