79 datasets found
  1. Hispanic population U.S. 2023, by state

    • statista.com
    Updated Oct 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Hispanic population U.S. 2023, by state [Dataset]. https://www.statista.com/statistics/259850/hispanic-population-of-the-us-by-state/
    Explore at:
    Dataset updated
    Oct 18, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2023, California had the highest Hispanic population in the United States, with over 15.76 million people claiming Hispanic heritage. Texas, Florida, New York, and Illinois rounded out the top five states for Hispanic residents in that year. History of Hispanic people Hispanic people are those whose heritage stems from a former Spanish colony. The Spanish Empire colonized most of Central and Latin America in the 15th century, which began when Christopher Columbus arrived in the Americas in 1492. The Spanish Empire expanded its territory throughout Central America and South America, but the colonization of the United States did not include the Northeastern part of the United States. Despite the number of Hispanic people living in the United States having increased, the median income of Hispanic households has fluctuated slightly since 1990. Hispanic population in the United States Hispanic people are the second-largest ethnic group in the United States, making Spanish the second most common language spoken in the country. In 2021, about one-fifth of Hispanic households in the United States made between 50,000 to 74,999 U.S. dollars. The unemployment rate of Hispanic Americans has fluctuated significantly since 1990, but has been on the decline since 2010, with the exception of 2020 and 2021, due to the impact of the coronavirus (COVID-19) pandemic.

  2. Number of native Spanish speakers worldwide 2024, by country

    • statista.com
    Updated Jan 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Number of native Spanish speakers worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/991020/number-native-spanish-speakers-country-worldwide/
    Explore at:
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    World
    Description

    Mexico is the country with the largest number of native Spanish speakers in the world. As of 2024, 132.5 million people in Mexico spoke Spanish with a native command of the language. Colombia was the nation with the second-highest number of native Spanish speakers, at around 52.7 million. Spain came in third, with 48 million, and Argentina fourth, with 46 million. Spanish, a world language As of 2023, Spanish ranked as the fourth most spoken language in the world, only behind English, Chinese, and Hindi, with over half a billion speakers. Spanish is the official language of over 20 countries, the majority on the American continent, nonetheless, it's also one of the official languages of Equatorial Guinea in Africa. Other countries have a strong influence, like the United States, Morocco, or Brazil, countries included in the list of non-Hispanic countries with the highest number of Spanish speakers. The second most spoken language in the U.S. In the most recent data, Spanish ranked as the language, other than English, with the highest number of speakers, with 12 times more speakers as the second place. Which comes to no surprise following the long history of migrations from Latin American countries to the Northern country. Moreover, only during the fiscal year 2022. 5 out of the top 10 countries of origin of naturalized people in the U.S. came from Spanish-speaking countries.

  3. Spanish speakers in countries where Spanish is not an official language 2024...

    • statista.com
    Updated Jan 15, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). Spanish speakers in countries where Spanish is not an official language 2024 [Dataset]. https://www.statista.com/statistics/1276290/number-spanish-speakers-non-hispanic-countries-worldwide/
    Explore at:
    Dataset updated
    Jan 15, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    World
    Description

    The United States is the non-hispanic country with the largest number of native Spanish speakers in the world, with approximately 41.89 million people with a native command of the language in 2024. However, the European Union had the largest group of non-native speakers with limited proficiency of Spanish, at around 28 million people. Furthermore, Mexico is the country with the largest number of native Spanish speakers in the world as of 2024.

  4. Percentage of Hispanic population in the U.S. by state 2023

    • statista.com
    Updated Oct 21, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Percentage of Hispanic population in the U.S. by state 2023 [Dataset]. https://www.statista.com/statistics/259865/percentage-of-hispanic-population-in-the-us-by-state/
    Explore at:
    Dataset updated
    Oct 21, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2022, around 48.59 percent of New Mexico's population was of Hispanic origin, compared to the national percentage of 19.45. California, Texas, and Arizona also registered shares over 30 percent. The distribution of the U.S. population by ethnicity can be accessed here.

  5. The most spoken languages worldwide 2023

    • statista.com
    Updated Jan 23, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2025). The most spoken languages worldwide 2023 [Dataset]. https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/
    Explore at:
    Dataset updated
    Jan 23, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    2022
    Area covered
    World
    Description

    In 2023, there were around 1.5 billion people worldwide who spoke English either natively or as a second language, slightly more than the 1.1 billion Mandarin Chinese speakers at the time of survey. Hindi and Spanish accounted for the third and fourth most widespread languages that year.

    Languages in the United States The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation and other official pronouncements. The United States is a land of immigrations and the languages spoken in the United States vary as a result of the multi-cultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which over 41 million people spoke at home in 2021. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese),1.7 million Tagalog speakers and 1.5 million Vietnamese speakers counted in the United States that year.

    Different languages at home The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage of population speaking a language other than English is California. About 44 percent of California’s population was speaking a language other than English at home in 2021.

  6. 2010-2014 ACS Language Spoken at Home Variables - Boundaries

    • hub.arcgis.com
    Updated Nov 20, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Esri (2020). 2010-2014 ACS Language Spoken at Home Variables - Boundaries [Dataset]. https://hub.arcgis.com/maps/98bf5b2403c5456492df577ee3cee241
    Explore at:
    Dataset updated
    Nov 20, 2020
    Dataset authored and provided by
    Esrihttp://esri.com/
    Area covered
    Description

    This layer contains 2010-2014 American Community Survey (ACS) 5-year data, and contains estimates and margins of error. The layer shows language group of language spoken at home by age. This is shown by tract, county, and state boundaries. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percentage of the population age 5+ who speak Spanish at home. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right. Vintage: 2010-2014ACS Table(s): B16007 Data downloaded from: Census Bureau's API for American Community Survey Date of API call: November 11, 2020National Figures: data.census.govThe United States Census Bureau's American Community Survey (ACS):About the SurveyGeography & ACSTechnical DocumentationNews & UpdatesThis ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.Data Note from the Census:Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.Data Processing Notes:This layer has associated layers containing the most recent ACS data available by the U.S. Census Bureau. Click here to learn more about ACS data releases and click here for the associated boundaries layer. The reason this data is 5+ years different from the most recent vintage is due to the overlapping of survey years. It is recommended by the U.S. Census Bureau to compare non-overlapping datasets.Boundaries come from the US Census TIGER geodatabases. Boundary vintage (2014) appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2010 AWATER (Area Water) boundaries offered by TIGER. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto RicoCensus tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.The estimate is controlled. A statistical test for sampling variability is not appropriate.The data for this geographic area cannot be displayed because the number of sample cases is too small.

  7. Hispanic population of the U.S. 2000-2023

    • statista.com
    Updated Oct 18, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Hispanic population of the U.S. 2000-2023 [Dataset]. https://www.statista.com/statistics/259806/hispanic-population-of-the-us/
    Explore at:
    Dataset updated
    Oct 18, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    United States
    Description

    The number of people of Hispanic origin living in the United States has increased around 80 percent from 2000 to 2023. During this last year, about 65.22 million people of Hispanic origin were living in the United States. California and Texas ranked as the states with the highest number of Hispanic origin people as of 2023.

  8. F

    Healthcare Call Center Speech Data: Spanish (Mexico)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Healthcare Call Center Speech Data: Spanish (Mexico) [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-spanish-mexico
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreementhttps://www.futurebeeai.com/data-license-agreement

    Area covered
    Mexico
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Mexican Spanish Call Center Speech Dataset for the Healthcare domain designed to enhance the development of call center speech recognition models specifically for the Healthcare industry. This dataset is meticulously curated to support advanced speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

    Speech Data

    This training dataset comprises 30 Hours of call center audio recordings covering various topics and scenarios related to the Healthcare domain, designed to build robust and accurate customer service speech technology.

    Participant Diversity:
    Speakers: 60 expert native Mexican Spanish speakers from the FutureBeeAI Community.
    Regions: Different states/provinces of Mexico, ensuring a balanced representation of Mexican accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Details:
    Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
    Call Duration: Average duration of 5 to 15 minutes per call.
    Formats: WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 and 16 kHz.
    Environment: Without background noise and without echo.

    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

    Inbound Calls:
    Appointment Scheduling
    New Patient Registration
    Surgery Consultation
    Consultation regarding Diet, and many more
    Outbound Calls:
    Appointment Reminder
    Health and Wellness Subscription Programs
    Lab Tests Results
    Health Risk Assessments
    Preventive Care Reminders, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

    Speaker-wise Segmentation: Time-coded segments for both agents and customers.
    Non-Speech Labels: Tags and labels for non-speech elements.
    Word Error Rate: Word error rate is less than 5% thanks to the dual layer of QA.

    These ready-to-use transcriptions accelerate the development of the Healthcare domain call center conversational AI and ASR models for the Mexican Spanish language.

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.
    Conversation Metadata: Domain, topic, call type, outcome/sentiment, bit depth, and sample rate.

    This metadata is a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of Mexican Spanish call center speech recognition models.

    Usage and Applications

    This dataset can be used for various applications in the fields of speech recognition, natural language processing, and conversational AI, specifically tailored to the Healthcare domain. Potential use cases include:

  9. F

    Telecom Call Center Speech Data: Spanish (USA)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Telecom Call Center Speech Data: Spanish (USA) [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/telecom-call-center-conversation-spanish-usa
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreementhttps://www.futurebeeai.com/data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the US Spanish Call Center Speech Dataset for the Telecom domain designed to enhance the development of call center speech recognition models specifically for the Telecom industry. This dataset is meticulously curated to support advanced speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

    Speech Data

    This training dataset comprises 30 Hours of call center audio recordings covering various topics and scenarios related to the Telecom domain, designed to build robust and accurate customer service speech technology.

    Participant Diversity:
    Speakers: 60 expert native US Spanish speakers from the FutureBeeAI Community.
    Regions: Different states/provinces of USA, ensuring a balanced representation of US accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Details:
    Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
    Call Duration: Average duration of 5 to 15 minutes per call.
    Formats: WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 and 16 kHz.
    Environment: Without background noise and without echo.

    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

    Inbound Calls:
    Phone Number Porting
    Network Connectivity Issues
    Billing and Payments
    Technical Support
    Service Activation
    International Roaming Enquiry
    Refunds and Billing Adjustments
    Emergency Service Access, and many more
    Outbound Calls:
    Welcome Calls / Onboarding Process
    Payment Reminders
    Customer Surveys
    Technical Updates
    Service Usage Reviews
    Network Compliant Status Call, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

    Speaker-wise Segmentation: Time-coded segments for both agents and customers.
    Non-Speech Labels: Tags and labels for non-speech elements.
    Word Error Rate: Word error rate is less than 5% thanks to the dual layer of QA.

    These ready-to-use transcriptions accelerate the development of the Telecom domain call center conversational AI and ASR models for the US Spanish language.

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.
    <b

  10. F

    Travel Call Center Speech Data: Spanish (Mexico)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Travel Call Center Speech Data: Spanish (Mexico) [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/travel-call-center-conversation-spanish-mexico
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreementhttps://www.futurebeeai.com/data-license-agreement

    Area covered
    Mexico
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Mexican Spanish Call Center Speech Dataset for the Travel domain designed to enhance the development of call center speech recognition models specifically for the Travel industry. This dataset is meticulously curated to support advanced speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

    Speech Data:

    This training dataset comprises 30 Hours of call center audio recordings covering various topics and scenarios related to the Travel domain, designed to build robust and accurate customer service speech technology.

    Participant Diversity:
    Speakers: 60 expert native Mexican Spanish speakers from the FutureBeeAI Community.
    Regions: Different states/provinces of Mexico, ensuring a balanced representation of Mexican accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Details:
    Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
    Call Duration: Average duration of 5 to 15 minutes per call.
    Formats: WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 and 16 kHz.
    Environment: Without background noise and without echo.

    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

    Inbound Calls:
    Booking inquiries and assistance
    Destination information and recommendations
    Assistance with flight delays or cancellations
    Special assistance for passengers with disabilities
    Travel-related health and safety inquiry
    Assistance with lost or delayed baggage, and many more
    Outbound Calls:
    Promotional offers and package deals
    Customer satisfaction surveys
    Booking confirmations and updates
    Flight schedule changes and notifications
    Customer feedback collection
    Reminders for passport or visa expiration date, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

    Speaker-wise Segmentation: Time-coded segments for both agents and customers.
    Non-Speech Labels: Tags and labels for non-speech elements.
    Word Error Rate: Word error rate is less than 5% thanks to the dual layer of QA.

    These ready-to-use transcriptions accelerate the development of the Travel domain call center conversational AI and ASR models for the Mexican Spanish language.

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.
    Conversation Metadata: Domain, topic, call type, outcome/sentiment, bit depth, and sample rate.

  11. F

    Real Estate Call Center Speech Data: Spanish (Spain)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Real Estate Call Center Speech Data: Spanish (Spain) [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-spanish-spain
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreementhttps://www.futurebeeai.com/data-license-agreement

    Area covered
    Spain
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Spanish Call Center Speech Dataset for the Real Estate domain designed to enhance the development of call center speech recognition models specifically for the Real Estate industry. This dataset is meticulously curated to support advanced speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

    Speech Data:

    This training dataset comprises 30 Hours of call center audio recordings covering various topics and scenarios related to the Real Estate domain, designed to build robust and accurate customer service speech technology.

    Participant Diversity:
    Speakers: 60 expert native Spanish speakers from the FutureBeeAI Community.
    Regions: Different states/provinces of Spain, ensuring a balanced representation of Spanish accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Details:
    Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
    Call Duration: Average duration of 5 to 15 minutes per call.
    Formats: WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 and 16 kHz.
    Environment: Without background noise and without echo.

    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

    Inbound Calls:
    Property Inquiry
    Rental Property Search & Availability
    Renovation Inquiries
    Property Features & Amenities Inquiry
    Investment Property Analysis & Advice
    Property History & Ownership Details, and many more
    Outbound Calls:
    New Property Listing Update
    Post Purchase Follow-ups
    Investment Opportunities & Property Recommendations
    Property Value Updates
    Customer Satisfaction Surveys, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

    Speaker-wise Segmentation: Time-coded segments for both agents and customers.
    Non-Speech Labels: Tags and labels for non-speech elements.
    Word Error Rate: Word error rate is less than 5% thanks to the dual layer of QA.

    These ready-to-use transcriptions accelerate the development of the Real Estate domain call center conversational AI and ASR models for the Spanish language.

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.
    Conversation Metadata: Domain, topic, call type, outcome/sentiment, bit depth, and sample rate.

    This metadata is a powerful tool for understanding and characterizing the data, enabling informed decision-making in the development of Spanish call center speech recognition models.

    Usage and

  12. Hispanic population in the U.S. 2023, by origin

    • statista.com
    Updated Oct 21, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Hispanic population in the U.S. 2023, by origin [Dataset]. https://www.statista.com/statistics/234852/us-hispanic-population/
    Explore at:
    Dataset updated
    Oct 21, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    2023
    Area covered
    United States
    Description

    As of 2023, around 37.99 million people of Mexican descent were living in the United States - the largest of any Hispanic group. Puerto Ricans, Salvadorans, Cubans, and Dominicans rounded out the top five Hispanic groups living in the U.S. in that year.

  13. Non-White Population in the US (Current ACS)

    • gis-for-racialequity.hub.arcgis.com
    Updated Jul 1, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Urban Observatory by Esri (2021). Non-White Population in the US (Current ACS) [Dataset]. https://gis-for-racialequity.hub.arcgis.com/maps/bd59d1d55f064d1b815997f4b6c7735f
    Explore at:
    Dataset updated
    Jul 1, 2021
    Dataset provided by
    Esrihttp://esri.com/
    Authors
    Urban Observatory by Esri
    Area covered
    Description

    This map shows the percentage of people who identify as something other than non-Hispanic white throughout the US according to the most current American Community Survey. The pattern is shown by states, counties, and Census tracts. Zoom or search for anywhere in the US to see a local pattern. Click on an area to learn more. Filter to your area and save a new version of the map to use for your own mapping purposes.The Arcade expression used was: 100 - B03002_calc_pctNHWhiteE, which is simply 100 minus the percent of population who identifies as non-Hispanic white. The data is from the U.S. Census Bureau's American Community Survey (ACS). The figures in this map update automatically annually when the newest estimates are released by ACS. For more detailed metadata, visit the ArcGIS Living Atlas Layer: ACS Race and Hispanic Origin Variables - Boundaries.The data on race were derived from answers to the question on race that was asked of individuals in the United States. The Census Bureau collects racial data in accordance with guidelines provided by the U.S. Office of Management and Budget (OMB), and these data are based on self-identification. The racial categories included in the census questionnaire generally reflect a social definition of race recognized in this country and not an attempt to define race biologically, anthropologically, or genetically. The categories represent a social-political construct designed for collecting data on the race and ethnicity of broad population groups in this country, and are not anthropologically or scientifically based. Learn more here.Other maps of interest:American Indian or Alaska Native Population in the US (Current ACS)Asian Population in the US (Current ACS)Black or African American Population in the US (Current ACS)Hawaiian or Other Pacific Islander Population in the US (Current ACS)Hispanic or Latino Population in the US (Current ACS) (some people prefer Latinx)Population who are Some Other Race in the US (Current ACS)Population who are Two or More Races in the US (Current ACS) (some people prefer mixed race or multiracial)White Population in the US (Current ACS)Race in the US by Dot DensityWhat is the most common race/ethnicity?

  14. Ranking of languages spoken at home in the U.S. 2022

    • statista.com
    Updated Dec 9, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2024). Ranking of languages spoken at home in the U.S. 2022 [Dataset]. https://www.statista.com/statistics/183483/ranking-of-languages-spoken-at-home-in-the-us-in-2008/
    Explore at:
    Dataset updated
    Dec 9, 2024
    Dataset authored and provided by
    Statistahttp://statista.com/
    Time period covered
    2022
    Area covered
    United States
    Description

    In 2022, around 42.03 million people in the United States spoke Spanish at home. In comparison, approximately 974,829 people were speaking Russian at home during the same year. The distribution of the U.S. population by ethnicity can be accessed here. A ranking of the most spoken languages across the world can be accessed here.

  15. Predominant Race and Ethnicity in the US (2020 Census)

    • redistricting-willcountygis.hub.arcgis.com
    Updated Aug 23, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Esri (2021). Predominant Race and Ethnicity in the US (2020 Census) [Dataset]. https://redistricting-willcountygis.hub.arcgis.com/maps/b0232184dfd44b709071bd33224c19aa
    Explore at:
    Dataset updated
    Aug 23, 2021
    Dataset authored and provided by
    Esrihttp://esri.com/
    Area covered
    Description

    This multi-scale map shows the predominant (most numerous) race/ethnicity living within an area. Map opens at the state level, centered on the lower 48 states. Data is from U.S. Census Bureau's 2020 PL 94-171 data for state, county, tract, block group, and block.The map's colors indicate which of the eight race/ethnicity categories have the highest total count.Race and ethnicity highlights from the U.S. Census Bureau:White population remained the largest race or ethnicity group in the United States, with 204.3 million people identifying as White alone. Overall, 235.4 million people reported White alone or in combination with another group. However, the White alone population decreased by 8.6% since 2010.Two or More Races population (also referred to as the Multiracial population) has changed considerably since 2010. The Multiracial population was measured at 9 million people in 2010 and is now 33.8 million people in 2020, a 276% increase.“In combination” multiracial populations for all race groups accounted for most of the overall changes in each racial category.All of the race alone or in combination groups experienced increases. The Some Other Race alone or in combination group (49.9 million) increased 129%, surpassing the Black or African American population (46.9 million) as the second-largest race alone or in combination group.The next largest racial populations were the Asian alone or in combination group (24 million), the American Indian and Alaska Native alone or in combination group (9.7 million), and the Native Hawaiian and Other Pacific Islander alone or in combination group (1.6 million).Hispanic or Latino population, which includes people of any race, was 62.1 million in 2020. Hispanic or Latino population grew 23%, while the population that was not of Hispanic or Latino origin grew 4.3% since 2010.View more 2020 Census statistics highlights on race and ethnicity.

  16. s

    Hispanic and or Black, Indigenous or People of Color (Hspbipoc) Population...

    • ndp.sdsc.edu
    • nationaldataplatform.org
    Updated Mar 7, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Hispanic and or Black, Indigenous or People of Color (Hspbipoc) Population Concentration - Northern CA - Dataset - CKAN [Dataset]. https://ndp.sdsc.edu/catalog/dataset/clm-hispanic-and-or-black-indigenous-or-people-of-color-hspbipoc-population-concentration-north3
    Explore at:
    Dataset updated
    Mar 7, 2025
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Northern California, California
    Description

    Relative concentration of the Northern California region's Hispanic and/or Black, Indigenous or person of color (HSPBIPOC) population. The variable HSPBIPOC is equivalent to all individuals who select a combination of racial and ethnic identity in response to the Census questionnaire EXCEPT those who select "not Hispanic" for the ethnic identity question, and "white race alone" for the racial identity question. This is the most encompassing possible definition of racial and ethnic identities that may be associated with historic underservice by agencies, or be more likely to express environmental justice concerns (as compared to predominantly non-Hispanic white communities). Until 2021, federal agency guidance for considering environmental justice impacts of proposed actions focused on how the actions affected "racial or ethnic minorities." "Racial minority" is an increasingly meaningless concept in the USA, and particularly so in California, where only about 3/8 of the state's population identifies as non-Hispanic and white race alone - a clear majority of Californians identify as Hispanic and/or not white. Because many federal and state map screening tools continue to rely on "minority population" as an indicator for flagging potentially vulnerable / disadvantaged/ underserved populations, our analysis includes the variable HSPBIPOC which is effectively "all minority" population according to the now outdated federal environmental justice direction. A more meaningful analysis for the potential impact of forest management actions on specific populations considers racial or ethnic populations individually: e.g., all people identifying as Hispanic regardless of race; all people identifying as American Indian, regardless of Hispanic ethnicity; etc. "Relative concentration" is a measure that compares the proportion of population within each Census block group data unit that identify as HSPBIPOC alone to the proportion of all people that live within the 1,207 block groups in the Northern California RRK region that identify as HSPBIPOC alone. Example: if 5.2% of people in a block group identify as HSPBIPOC, the block group has twice the proportion of HSPBIPOC individuals compared to the Northern California RRK region (2.6%), and more than three times the proportion compared to the entire state of California (1.6%). If the local proportion is twice the regional proportion, then HSPBIPOC individuals are highly concentrated locally.

  17. Spanish-language e-book revenue worldwide 2020-2023, by country

    • flwrdeptvarieties.store
    • statista.com
    Updated Dec 18, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Amy Watson (2023). Spanish-language e-book revenue worldwide 2020-2023, by country [Dataset]. https://flwrdeptvarieties.store/?_=%2Ftopics%2F1474%2Fe-books%2F%23zUpilBfjadnZ6q5i9BcSHcxNYoVKuimb
    Explore at:
    Dataset updated
    Dec 18, 2023
    Dataset provided by
    Statistahttp://statista.com/
    Authors
    Amy Watson
    Description

    In 2023, Spanish-language e-books sold in Spain made up 55.7 percent of the global Spanish-language e-book sales revenue. Mexico was the second largest market with over 20 percent of the global sales. The United States ranked third.

  18. ACS Language Spoken at Home Variables - Centroids

    • hub.arcgis.com
    • share-open-data-njtpa.hub.arcgis.com
    • +1more
    Updated Oct 20, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Esri (2018). ACS Language Spoken at Home Variables - Centroids [Dataset]. https://hub.arcgis.com/maps/eba9adeb95394d29a43ad9b380bc31bc
    Explore at:
    Dataset updated
    Oct 20, 2018
    Dataset authored and provided by
    Esrihttp://esri.com/
    Area covered
    Description

    This layer shows language group of language spoken at home by age. This is shown by tract, county, and state centroids. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percent and count of population age 5+ who speak Spanish at home. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right. Current Vintage: 2019-2023ACS Table(s): B16007Data downloaded from: Census Bureau's API for American Community Survey Date of API call: December 12, 2024National Figures: data.census.govThe United States Census Bureau's American Community Survey (ACS):About the SurveyGeography & ACSTechnical DocumentationNews & UpdatesThis ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.Data Note from the Census:Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.Data Processing Notes:This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases.Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).The States layer contains 52 records - all US states, Washington D.C., and Puerto RicoCensus tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.The estimate is controlled. A statistical test for sampling variability is not appropriate.The data for this geographic area cannot be displayed because the number of sample cases is too small.

  19. b

    Percent of Residents - Hispanic

    • data.baltimorecity.gov
    • vital-signs-bniajfi.hub.arcgis.com
    • +1more
    Updated Feb 27, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Baltimore Neighborhood Indicators Alliance (2020). Percent of Residents - Hispanic [Dataset]. https://data.baltimorecity.gov/maps/bc346d573ee74963beaa8a8b69eb7dfb
    Explore at:
    Dataset updated
    Feb 27, 2020
    Dataset authored and provided by
    Baltimore Neighborhood Indicators Alliance
    Area covered
    Description

    The percentage of persons, out of the total number of persons living in an area, self-identifying their ethnicity as Hispanic or Latino. Hispanic origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before they arrived in the United States. People who identify their origin as Hispanic, Latino, or Spanish may be of any race. Source: U.S. Census Bureau, American Community Survey Years Available: 2010, 2011-2015, 2012-2016, 2013-2017, 2014-2018, 2015-2019, 2020, 2017-2021, 2018-2022

  20. F

    Delivery & Logistics Call Center Speech Data: Spanish (USA)

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    FutureBee AI (2022). Delivery & Logistics Call Center Speech Data: Spanish (USA) [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/delivery-call-center-conversation-spanish-usa
    Explore at:
    wavAvailable download formats
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/data-license-agreementhttps://www.futurebeeai.com/data-license-agreement

    Area covered
    United States
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the US Spanish Call Center Speech Dataset for the Delivery and Logistics domain designed to enhance the development of call center speech recognition models specifically for the Delivery and Logistics industry. This dataset is meticulously curated to support advanced speech recognition, natural language processing, conversational AI, and generative voice AI algorithms.

    Speech Data

    This training dataset comprises 30 Hours of call center audio recordings covering various topics and xscenarios related to the Delivery and Logistics domain, designed to build robust and accurate customer service speech technology.

    Participant Diversity:
    Speakers: 60 expert native US Spanish speakers from the FutureBeeAI Community.
    Regions: Different states/provinces of USA, ensuring a balanced representation of US accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.
    Recording Details:
    Conversation Nature: Unscripted and spontaneous conversations between call center agents and customers.
    Call Duration: Average duration of 5 to 15 minutes per call.
    Formats: WAV format with stereo channels, a bit depth of 16 bits, and a sample rate of 8 and 16 kHz.
    Environment: Without background noise and without echo.

    Topic Diversity

    This dataset offers a diverse range of conversation topics, call types, and outcomes, including both inbound and outbound calls with positive, neutral, and negative outcomes.

    Inbound Calls:
    Order Tracking
    Delivery Complaint
    Undeliverable Address
    Delivery Method Selection
    Return Process Enquiry
    Order Modification, and many more
    Outbound Calls:
    Delivery Confirmation
    Delivery Subscription
    Incorrect Address
    Missed Delivery Attempt
    Delivery Feedback
    Out-of-Stock Notification
    Delivery Satisfaction Survey, and many more

    This extensive coverage ensures the dataset includes realistic call center scenarios, which is essential for developing effective customer support speech recognition models.

    Transcription

    To facilitate your workflow, the dataset includes manual verbatim transcriptions of each call center audio file in JSON format. These transcriptions feature:

    Speaker-wise Segmentation: Time-coded segments for both agents and customers.
    Non-Speech Labels: Tags and labels for non-speech elements.
    Word Error Rate: Word error rate is less than 5% thanks to the dual layer of QA.

    These ready-to-use transcriptions accelerate the development of the Delivery and Logistics domain call center conversational AI and ASR models for the US Spanish language.

    Metadata

    The dataset provides comprehensive metadata for each conversation and participant:

    Participant Metadata: Unique identifier, age, gender, country, state, district, accent and dialect.
    Conversation Metadata: Domain, topic, call type, outcome/sentiment, bit depth, and sample

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Statista (2024). Hispanic population U.S. 2023, by state [Dataset]. https://www.statista.com/statistics/259850/hispanic-population-of-the-us-by-state/
Organization logo

Hispanic population U.S. 2023, by state

Explore at:
15 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Oct 18, 2024
Dataset authored and provided by
Statistahttp://statista.com/
Time period covered
2023
Area covered
United States
Description

In 2023, California had the highest Hispanic population in the United States, with over 15.76 million people claiming Hispanic heritage. Texas, Florida, New York, and Illinois rounded out the top five states for Hispanic residents in that year. History of Hispanic people Hispanic people are those whose heritage stems from a former Spanish colony. The Spanish Empire colonized most of Central and Latin America in the 15th century, which began when Christopher Columbus arrived in the Americas in 1492. The Spanish Empire expanded its territory throughout Central America and South America, but the colonization of the United States did not include the Northeastern part of the United States. Despite the number of Hispanic people living in the United States having increased, the median income of Hispanic households has fluctuated slightly since 1990. Hispanic population in the United States Hispanic people are the second-largest ethnic group in the United States, making Spanish the second most common language spoken in the country. In 2021, about one-fifth of Hispanic households in the United States made between 50,000 to 74,999 U.S. dollars. The unemployment rate of Hispanic Americans has fluctuated significantly since 1990, but has been on the decline since 2010, with the exception of 2020 and 2021, due to the impact of the coronavirus (COVID-19) pandemic.

Search
Clear search
Close search
Google apps
Main menu