4 datasets found
  1. Primary language spoken by the Medicaid and CHIP population

    • data.virginia.gov
    • healthdata.gov
    csv
    Updated Jan 18, 2025
    Cite
    Centers for Medicare & Medicaid Services (2025). Primary language spoken by the Medicaid and CHIP population [Dataset]. https://data.virginia.gov/dataset/primary-language-spoken-by-the-medicaid-and-chip-population
    Available download formats
    csv
    Dataset updated
    Jan 18, 2025
    Dataset provided by
    Centers for Medicare & Medicaid Services
    Description

    This data set includes annual counts and percentages of Medicaid and Children’s Health Insurance Program (CHIP) enrollees by primary language spoken (English, Spanish, and all other languages). Results are shown overall; by state; and by five subpopulation topics: race and ethnicity, age group, scope of Medicaid and CHIP benefits, urban or rural residence, and eligibility category. These results were generated using Transformed Medicaid Statistical Information System (T-MSIS) Analytic Files (TAF) Release 1 data and the Race/Ethnicity Imputation Companion File. This data set includes Medicaid and CHIP enrollees in all 50 states, the District of Columbia, Puerto Rico, and the U.S. Virgin Islands who were enrolled for at least one day in the calendar year, except where otherwise noted. Enrollees in Guam, American Samoa, the Northern Mariana Islands, and select states with data quality issues with the primary language variable in TAF are not included. Results shown for the race and ethnicity subpopulation topic exclude enrollees in the U.S. Virgin Islands. Results shown overall (where subpopulation topic is "Total enrollees") exclude enrollees younger than age 5 and enrollees in the U.S. Virgin Islands. Results for states with TAF data quality issues in the year have a value of "Unusable data." Some rows in the data set have a value of "DS," which indicates that data were suppressed according to the Centers for Medicare & Medicaid Services’ Cell Suppression Policy for values between 1 and 10. This data set is based on the brief: "Primary language spoken by the Medicaid and CHIP population in 2020." Enrollees are assigned to a primary language category based on their reported ISO language code in TAF (English/missing, Spanish, and all other language codes) (Primary Language). 
Enrollees are assigned to a race and ethnicity subpopulation using the state-reported race and ethnicity information in TAF when it is available and of good quality; if it is missing or unreliable, race and ethnicity is indirectly estimated using an enhanced version of Bayesian Improved Surname Geocoding (BISG) (Race and ethnicity of the national Medicaid and CHIP population in 2020). Enrollees are assigned to an age group subpopulation using age as of December 31st of the calendar year. Enrollees are assigned to the comprehensive benefits or limited benefits subpopulation according to the criteria in the "Identifying Beneficiaries with Full-Scope, Comprehensive, and Limited Benefits in the TAF" DQ Atlas brief. Enrollees are assigned to an urban or rural subpopulation based on the 2010 Rural-Urban Commuting Area (RUCA) code associated with their home or mailing address ZIP code in TAF (Rural Medicaid and CHIP enrollees in 2020). Enrollees are assigned to an eligibility category subpopulation using their latest reported eligibility group code, CHIP code, and age in the calendar year. Please refer to the full brief for additional context about the methodology and detailed findings. Future updates to this data set will include more recent data years as the TAF data become available.
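    The description above notes that some cells hold "DS" (suppressed) or "Unusable data" rather than a number. A minimal sketch of how such cells might be handled when reading the CSV follows. It assumes the literal strings are exactly "DS" and "Unusable data" and that counts may use comma thousands separators; the sample values are hypothetical and not taken from the data set.

```python
from typing import Optional

# Assumed literal sentinel strings, based on the description above.
SUPPRESSED = "DS"            # suppressed under the CMS Cell Suppression Policy
UNUSABLE = "Unusable data"   # state-year with TAF data quality issues


def parse_count(raw: str) -> Optional[int]:
    """Return an integer count, or None when the cell is suppressed or unusable."""
    value = raw.strip()
    if value in (SUPPRESSED, UNUSABLE):
        return None
    # Assumption: counts may be written with comma thousands separators.
    return int(value.replace(",", ""))


# Hypothetical cell values, for illustration only.
cells = ["1,234", "DS", "Unusable data", "57"]
parsed = [parse_count(c) for c in cells]
print(parsed)
```

    Mapping both sentinels to None keeps them out of sums and averages; an analysis that needs to tell suppression apart from unusable data would have to keep the original string as well.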

  2. Property Listings for 5 South American Countries

    • kaggle.com
    Updated May 25, 2020
    Cite
    Rasmus Jacobsen (2020). Property Listings for 5 South American Countries [Dataset]. https://www.kaggle.com/rmjacobsen/property-listings-for-5-south-american-countries/code
    Dataset updated
    May 25, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rasmus Jacobsen
    Area covered
    Americas, South America
    Description

    Context

    The datasets contain real estate listings in Argentina, Colombia, Ecuador, Perú, and Uruguay, with information on the number of rooms, districts, prices, etc. They include houses, apartments, commercial lots, and more.

    The datasets originate from Properati Data, the data division of Properati, the Latin American property search site. On their website you can find links to different tools and datasets to use freely for your projects. All you have to do is make sure you credit them for the data.

    Content

    Wait a minute, the dataset is in Spanish?! Yes, and for that reason I have provided a translated overview below. Keep in mind that although Spanish is a single language, certain words and expressions vary by country and region; e.g., the word for apartment is "apartamento" in Colombia but "departamento" in Argentina. All of these are easy to translate with Google Translate.

    Overview of Data

    • type - Type of listing:
      • Propiedad (Property).
      • Desarrollo/Proyecto (Development/Project).
    • country - Country in which the listing is published:
      • Argentina
      • Colombia
      • Ecuador
      • Perú
      • Uruguay
    • id - Identifier of the listing. It is not unique: if the real estate agency updates the listing (creating a new version), a new record is created with the same id but different registration and cancellation dates.
    • start_date - Date of registration of the listing.
    • end_date - Cancellation date of the listing.
    • created_on - Date of registration of the first version of the listing.
    • lat - Latitude of the property.
    • lon - Longitude of the property.
    • l1 - Administrative Level 1: Country of the property.
    • l2 - Administrative Level 2: Usually the province of the property.
    • l3 - Administrative Level 3: Usually the city of the property.
    • l4 - Administrative Level 4: Usually the neighbourhood of the property.
    • operation - Type of operation:
      • Venta (Sale).
      • Alquiler (Rent).
    • type - Type of property:
      • Casa (House).
      • Departamento (Apartment).
      • PH (Horizontal Property).
    • rooms - Number of rooms (useful for Argentina).
    • bedrooms - Number of bedrooms (useful for the rest of the countries).
    • bathrooms - Number of bathrooms.
    • surface_total - Total area in m².
    • surface_covered - Covered area in m².
    • price - Price published in the listing.
    • currency - Currency of published price.
    • price_period - Payment periods:
      • Diario (Daily).
      • Semanal (Weekly).
      • Mensual (Monthly).
    • title - Title of the listing (These are in Spanish).
    • description - Description of the listing (In Spanish).
    • status - Development status (Completed, Under construction, ...).
    • name - Development name.
    • short_description - Short listing description.
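    Because id is not unique (each update of a listing creates a new record with the same id), an analysis will often want one record per listing. A minimal sketch follows, assuming start_date is an ISO-formatted date string (YYYY-MM-DD); the rows are hypothetical and only the column names come from the overview above.

```python
from datetime import date

# Hypothetical rows for illustration; not taken from the dataset.
rows = [
    {"id": "a1", "start_date": "2020-01-05", "end_date": "2020-02-01", "price": 100000},
    {"id": "a1", "start_date": "2020-02-01", "end_date": "2020-03-15", "price": 95000},
    {"id": "b2", "start_date": "2020-01-20", "end_date": "2020-04-01", "price": 700},
]


def latest_versions(records):
    """Keep, for each id, the record with the most recent start_date."""
    latest = {}
    for row in records:
        current = latest.get(row["id"])
        if current is None or (
            date.fromisoformat(row["start_date"])
            > date.fromisoformat(current["start_date"])
        ):
            latest[row["id"]] = row
    return list(latest.values())


deduped = latest_versions(rows)
print([(r["id"], r["price"]) for r in deduped])
```

    Keeping the latest version is only one possible choice; a price-history analysis would instead keep every version and order the records by start_date.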

    Acknowledgements & Inspiration

    I want to thank Properati Data for providing the datasets free of charge, especially since datasets on real estate listings can be difficult to come by without spending time building crawlers and finding websites that allow crawling.

    I came across the datasets in the first place through my personal project on predicting apartment prices in Buenos Aires.

    Additional Information

    Data was downloaded on May 24, 2020.

  3. Table_1_Expressing diminutive meaning in heritage Spanish: linking the...

    • frontiersin.figshare.com
    xlsx
    Updated Jul 8, 2024
    Cite
    Abel Cruz (2024). Table_1_Expressing diminutive meaning in heritage Spanish: linking the heritage experience to diminutive use in everyday speech.XLSX [Dataset]. http://doi.org/10.3389/flang.2024.1377977.s001
    Available download formats
    xlsx
    Dataset updated
    Jul 8, 2024
    Dataset provided by
    Frontiers
    Authors
    Abel Cruz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    This paper studies the pragmatic force that heritage speakers may convey through the use of the diminutive in everyday speech. In particular, I analyze the use of the Spanish diminutive in 49 sociolinguistic interviews from a Spanish–English bilingual community in Southern Arizona, U.S. where Spanish is the heritage language. I compare the use of the diminutive in heritage Spanish to the distribution of the diminutive in the speech of a Spanish monolingual community (18 sociolinguistic interviews) from the same dialectal region. Although Spanish and English employ different morphosyntactic strategies to express diminutive meaning, the analysis reveals that the diminutive morpheme -ito/a is a productive morphological device in the Spanish-discourse of heritage speakers from Southern Arizona (i.e., similar diminutive distributions to their monolingual counterparts). While heritage speakers employed the diminutive -ito/a to express the notion of “smallness” in their Spanish-discourse, the analysis indicates that these language users are more likely to invoke a subjective evaluation through the diminutive -ito/a when talking about their family members and/or childhood experiences. This particular finding suggests that the concept “child” is the semantic/pragmatic driving force of the diminutive in heritage Spanish as a marker of speech by, about, to, or with some relation to children. The analysis further suggests that examining the pragmatic dimensions of the diminutive in everyday speech can provide important insights into how heritage speakers encode and create cultural meaning in their heritage languages.

    Methods

    In this study, I analyze the use of Spanish diminutives in two U.S.-Mexico border regions. The first data set is representative of a Spanish–English bilingual community in Southern Arizona, U.S., provided in the Corpus del Español en el Sur de Arizona (The CESA Corpus). The CESA Corpus comprises 49 sociolinguistic interviews of ~1 h each for a total of ~305,542 words. The second data set comprises 18 sociolinguistic interviews of predominantly monolingual Spanish speakers from the city of Mexicali, Baja California in Mexico, provided in the Proyecto Para el Estudio Sociolingüístico del Español de España y de América (PRESEEA). The Mexicali data set consists of ~119,162 words.

    Results

    The analysis revealed that the Spanish diminutive morpheme -ito/a is a productive morphological device in the Spanish-discourse of heritage speakers from Southern Arizona. In addition to its prototypical meaning (i.e., the notion of “smallness”), the diminutive morpheme -ito/a conveyed an array of pragmatic functions in the everyday speech of Spanish heritage speakers and their monolingual counterparts from the same dialectal region. Importantly, these pragmatic functions are mediated by speakers' subjective perceptions of the entity in question. Unlike their monolingual counterparts, heritage speakers are more likely to invoke a subjective evaluation through the diminutive -ito/a when talking about their family members and/or childhood experiences. Altogether, the study suggests that the concept “child” is the semantic/pragmatic driving force of the diminutive in heritage Spanish as a marker of speech by, about, to, or with some relation to children.

    Discussion

    In this study, I followed Reynoso's framework to study the pragmatic dimensions of the diminutive in everyday speech, that is, speakers' publicly conveyed meaning. The analysis revealed that heritage speakers applied most of the pragmatic functions and their respective values observed in Reynoso's cross-dialectal study of Spanish diminutives, and hence providing further support for her framework. Similarly, the study provides further evidence to Jurafsky's proposal that morphological diminutives arise from semantic or pragmatic links with children. Finally, the analysis indicated that examining the semantic/pragmatic dimensions of the diminutive in everyday speech can provide important insights into how heritage speakers encode and create cultural meaning in their heritage languages, which can in turn have further ramifications for heritage language learning and teaching.

  4. Primary Language of Newly Medi-Cal Eligible Individuals

    • data.chhs.ca.gov
    • data.ca.gov
    • +2 more
    csv, zip
    Updated Mar 19, 2025
    Cite
    Department of Health Care Services (2025). Primary Language of Newly Medi-Cal Eligible Individuals [Dataset]. https://data.chhs.ca.gov/dataset/primary-language-of-newly-medi-cal-eligible-individuals
    Available download formats
    csv (32459), zip
    Dataset updated
    Mar 19, 2025
    Dataset provided by
    California Department of Health Care Services (http://www.dhcs.ca.gov/)
    Authors
    Department of Health Care Services
    Description

    This dataset includes the primary language of newly Medi-Cal eligible individuals who identified their primary language as English, Spanish, Vietnamese, Mandarin, Cantonese, Arabic, Other Non-English, Armenian, Russian, Farsi, Korean, Tagalog, Other Chinese Languages, Hmong, Cambodian, Portuguese, Lao, French, Thai, Japanese, Samoan, Other Sign Language, American Sign Language (ASL), Turkish, Ilacano, Mien, Italian, Hebrew, and Polish, by reporting period. The primary language data is from the Medi-Cal Eligibility Data System (MEDS) and includes eligible individuals without prior Medi-Cal eligibility. This dataset is part of the public reporting requirements set forth in California Welfare and Institutions Code 14102.5.


