CC0 1.0 Universal Public Domain Dedication
https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the american community survey (acs) with r and monetdb experimental. think of the american community survey (acs) as the united states' census for off-years - the ones that don't end in zero. every year, one percent of all americans respond, making it the largest complex sample administered by the u.s. government (the decennial census has a much broader reach, but since it attempts to contact 100% of the population, it's not a survey). the acs asks how people live: although the questionnaire only includes about three hundred questions on demography, income, and insurance, it's often accurate at sub-state geographies and - depending on how many years are pooled - down to small counties. households are the sampling unit, and once a household gets selected for inclusion, all of its residents respond to the survey. this allows household-level data (like home ownership) to be collected more efficiently and lets researchers examine family structure. the census bureau runs and finances this behemoth, of course.

the downloadable american community survey ships as two distinct comma-separated value (.csv) files, one household-level and one person-level. merging the two just rectangulates the data, since each person in the person-file has exactly one matching record in the household-file. for analyses of small, smaller, and microscopic geographic areas, choose one-, three-, or five-year pooled files. use as few pooled years as you can, unless you like sentences that start with, "over the period of 2006 - 2010, the average american ... [insert yer findings here]."

rather than processing the acs public use microdata sample line-by-line, the r language brazenly reads everything into memory by default. to prevent overloading your computer, dr. thomas lumley wrote the sqlsurvey package principally to deal with this ram-gobbling monster. if you're already familiar with the syntax used for the survey package, be patient and read the sqlsurvey examples carefully when something doesn't behave as you expect it to - some sqlsurvey commands require a different structure (i.e. svyby gets called through svymean) and others might not exist anytime soon (like svyolr). gimme some good news: sqlsurvey uses ultra-fast monetdb (click here for speed tests), so follow the monetdb installation instructions before running this acs code. monetdb imports, writes, and recodes data slowly, but reads it hyper-fast. a magnificent trade-off: data exploration typically requires you to think, send an analysis command, think some more, send another query, repeat. importation scripts (especially the ones i've already written for you) can be left running overnight sans hand-holding.

the acs weights generalize to the whole united states population including individuals living in group quarters, but non-residential respondents get an abridged questionnaire, so most (not all) analysts exclude records with a relp variable of 16 or 17 right off the bat (a minimal sketch of this exclusion follows the script list below).

this new github repository contains four scripts:

2005-2011 - download all microdata.R
- create the batch (.bat) file needed to initiate the monet database in the future
- download, unzip, and import each file for every year and size specified by the user
- create and save household- and merged/person-level replicate weight complex sample designs
- create a well-documented block of code to re-initiate the monet db server in the future
fair warning: this full script takes a loooong time. run it friday afternoon, commune with nature for the weekend, and if you've got a fast processor and speedy internet connection, monday morning it should be ready for action. otherwise, either download only the years and sizes you need or - if you gotta have 'em all - run it, minimize it, and then don't disturb it for a week.

2011 single-year - analysis examples.R
- run the well-documented block of code to re-initiate the monetdb server
- load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
- perform the standard repertoire of analysis examples, only this time using sqlsurvey functions

2011 single-year - variable recode example.R
- run the well-documented block of code to re-initiate the monetdb server
- copy the single-year 2011 table to maintain the pristine original
- add a new age category variable by hand
- add a new age category variable systematically
- re-create then save the sqlsurvey replicate weight complex sample design on this new table
- close everything, then load everything back up in a fresh instance of r
- replicate a few of the census statistics. no muss, no fuss

replicate census estimates - 2011.R
- run the well-documented block of code to re-initiate the monetdb server
- load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
- match every nationwide statistic on the census bureau's estimates page, using sqlsurvey functions

click here to view these four scripts

for more detail about the american community survey (acs), visit: the us census...
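below is a minimal sketch of that relp exclusion, written in r against the monetdb server these scripts set up. the connection url, the table name (acs2011_1yr_p), and the column names (relp, agep) are assumptions for illustration - match them to whatever your import actually created. a raw sql average like this one is unweighted; real point estimates and standard errors come from svymean on the saved sqlsurvey replicate weight design.

```r
# minimal sketch, assuming the monetdb server is already running and the
# single-year 2011 person file was imported as a table named acs2011_1yr_p.
# url, table name, and column names are assumptions for illustration.

library(DBI)
library(MonetDB.R)  # the monetdb driver these scripts rely on

db <- dbConnect( MonetDB.R() , "monetdb://localhost:50000/acs" )

# most analysts exclude group-quarters records (relp of 16 or 17) up front.
# writing the filter into sql keeps the ram-gobbling monster at bay.
res <- dbGetQuery(
	db ,
	"select count(*) as n , avg( agep ) as mean_age
	from acs2011_1yr_p
	where relp not in ( 16 , 17 )"
)

print( res )

dbDisconnect( db )
```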
How much time do people spend on social media?
As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in
the U.S. was just two hours and 16 minutes.
Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively.
People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with friends and current events.
Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general.
During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased political polarization, and heightened everyday distractions.
Knowing who your consumers are is essential for businesses, marketers, and researchers. This detailed demographic file offers an in-depth look at American consumers, packed with insights about personal details, household information, financial status, and lifestyle choices. Let's take a closer look at the data:
Personal Identifiers and Basic Demographics
At the heart of this dataset are the key details that make up a consumer profile:
- Unique IDs (PID, HHID) for individuals and households
- Full names (First, Middle, Last) and suffixes
- Gender and age
- Date of birth
- Complete location details (address, city, state, ZIP)
These identifiers are critical for accurate marketing and form the base for deeper analysis.
Geospatial Intelligence
This file goes beyond just listing addresses by including rich geospatial data like:
- Latitude and longitude
- Census tract and block details
- Codes for Metropolitan Statistical Areas (MSA) and Core-Based Statistical Areas (CBSA)
- County size codes
- Geocoding accuracy
This allows for precise geographic segmentation and localized marketing, as the sketch below illustrates.
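As a quick illustration of that idea, here is a minimal R sketch that keeps only the records within a given radius of a store. The file name and the latitude/longitude column names are assumptions; substitute the actual field names from the file layout.

```r
# Minimal sketch: radius targeting with the latitude/longitude fields.
# File path and column names (latitude, longitude) are assumptions.

# Haversine great-circle distance in miles
haversine_miles <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  3959 * 2 * asin(sqrt(a))  # mean earth radius of ~3,959 miles
}

consumers <- read.csv("consumer_file.csv")  # hypothetical file name

# Keep consumers within 25 miles of a hypothetical store in Manhattan
store_lat <- 40.7128
store_lon <- -74.0060
nearby <- consumers[
  haversine_miles(store_lat, store_lon,
                  consumers$latitude, consumers$longitude) <= 25, ]
```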
Housing and Property Data
The dataset covers a lot of ground when it comes to housing, providing valuable insights for real estate professionals, lenders, and home service providers:
- Homeownership status
- Dwelling type (single-family, multi-family, etc.)
- Property values (market, assessed, and appraised)
- Year built and square footage
- Room count, amenities like fireplaces or pools, and building quality
This data is crucial for targeting homeowners with products and services like refinancing or home improvement offers.
Wealth and Financial Data
For a deeper dive into consumer wealth, the file includes:
- Estimated household income
- Wealth scores
- Credit card usage
- Mortgage info (loan amounts, rates, terms)
- Home equity estimates and investment property ownership
These indicators are invaluable for financial services, luxury brands, and fundraising organizations looking to reach affluent individuals.
Lifestyle and Interests
One of the most useful features of the dataset is its extensive lifestyle segmentation:
- Hobbies and interests (e.g., gardening, travel, sports)
- Book preferences, magazine subscriptions
- Outdoor activities (camping, fishing, hunting)
- Pet ownership, tech usage, political views, and religious affiliations
This data is perfect for crafting personalized marketing campaigns and developing products that align with specific consumer preferences.
Consumer Behavior and Purchase Habits
The file also sheds light on how consumers behave and shop:
- Online and catalog shopping preferences
- Gift-giving tendencies, presence of children, vehicle ownership
- Media consumption (TV, radio, internet)
Retailers and e-commerce businesses will find this behavioral data especially useful for tailoring their outreach.
Demographic Clusters and Segmentation
The file ships with pre-built segments:
- Household, neighborhood, family, and digital clusters
- Generational and lifestage groups
These make it easier to quickly target specific demographics, streamlining the process for market analysis and campaign planning.
Ethnicity and Language Preferences
In today's multicultural market, knowing your audience's cultural background is key. The file includes:
- Ethnicity codes and language preferences
- Flags for Hispanic/Spanish-speaking households
This helps ensure culturally relevant and sensitive communication.
Education and Occupation Data
The dataset also tracks education and career info:
- Education level and occupation codes
- Home-based business indicators
This data is essential for B2B marketers, recruitment agencies, and education-focused campaigns.
Digital and Social Media Habits
With everyone online, digital behavior insights are a must:
- Internet, TV, radio, and magazine usage
- Social media platform engagement (Facebook, Instagram, LinkedIn)
- Streaming subscriptions (Netflix, Hulu)
This data helps marketers, app developers, and social media managers connect with their audience in the digital space.
Political and Charitable Tendencies
For political campaigns or non-profits, this dataset offers:
- Political affiliations and outlook
- Charitable donation history
- Volunteer activities
These insights are perfect for cause-related marketing and targeted political outreach.
Neighborhood Characteristics
By incorporating census data, the file provides a bigger picture of the consumer's environment:
- Population density, racial composition, and age distribution
- Housing occupancy and ownership rates
This offers important context for understanding the demographic landscape.
Predictive Consumer Indexes
The dataset includes forward-looking indicators in categories like:
- Fashion, automotive, and beauty products
- Health, home decor, pet products, sports, and travel
These predictive insights help businesses anticipate consumer trends and needs.
Contact Information
Finally, the file includes ke...
This dataset shows the number of hospital admissions for influenza-like illness, pneumonia, or the ICD-10-CM code for 2019 novel coronavirus (U07.1). Influenza-like illness is defined as a mention of any of: fever and cough, fever and sore throat, fever and shortness of breath or difficulty breathing, or influenza. Patients who were subsequently assigned only an ICD-10-CM code for influenza are excluded. Pneumonia is defined as a mention or diagnosis of pneumonia. Baseline data represent the average number of people with COVID-19-like illness who are admitted to the hospital during this time of year, based on historical counts: the average of the daily counts from the same week (same day +/- 3 days) in the prior 3 years. Percent change data represent the change in the count of people admitted compared to the previous day. Data sources include all hospital admissions from emergency department visits in NYC. Data are collected electronically and transmitted to the NYC Health Department hourly. This dataset is updated daily. All identifying health information is excluded from the dataset.
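The two derived measures described above are simple to compute. Below is a minimal R sketch of both on a toy data frame; the column names (date, admissions) are assumptions, and the year shift uses flat 365-day steps, so a production version would handle leap years.

```r
# Minimal sketch of the baseline and percent-change definitions above.
# Toy data; real inputs would come from the published admissions counts.

set.seed(1)
daily <- data.frame(date = seq(as.Date("2017-01-01"),
                               as.Date("2020-03-31"), by = "day"))
daily$admissions <- rpois(nrow(daily), lambda = 50)

# Percent change: change in admissions compared to the previous day
daily$pct_change <- c(NA, 100 * diff(daily$admissions) /
                            head(daily$admissions, -1))

# Baseline: average over the same week (same day +/- 3 days)
# in each of the prior 3 years
baseline_for <- function(target_date, df) {
  window <- as.Date(unlist(lapply(1:3, function(y)
    seq(target_date - 3, target_date + 3, by = "day") - 365 * y)),
    origin = "1970-01-01")
  mean(df$admissions[df$date %in% window], na.rm = TRUE)
}

baseline_for(as.Date("2020-03-15"), daily)  # historical average for mid-March
```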
https://choosealicense.com/licenses/llama3/
Enhancing Human-Like Responses in Large Language Models
🤗 Models | 📊 Dataset | 📄 Paper
Human-Like-DPO-Dataset
This dataset was created as part of research aimed at improving conversational fluency and engagement in large language models. It is suitable for formats like Direct Preference Optimization (DPO) to guide models toward generating more human-like responses. The dataset includes 10,884 samples across 256 topics, including: Technology Daily Life Science… See the full description on the dataset page: https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset.
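For orientation, a DPO-style preference sample pairs one prompt with a preferred and a dispreferred response. The R sketch below shows that shape; the field names (prompt, chosen, rejected) follow the common DPO convention and are an assumption about this dataset's exact schema, not confirmed from the dataset card.

```r
# Minimal sketch of a DPO-style preference record; field names are the
# usual DPO convention, assumed (not confirmed) for this dataset.
sample_record <- list(
  prompt   = "Oh, I just saw the best meme - have you seen it?",
  chosen   = "I haven't! Send it over, I could use a good laugh.",  # human-like
  rejected = "As an AI language model, I do not view memes."        # robotic
)

# A DPO trainer increases the margin by which the model prefers
# `chosen` over `rejected` for the same `prompt`.
```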
https://choosealicense.com/licenses/other/
ilovehentai9000/iloveanimesakuga Dataset
Because the website is slow and I hate people who request "Data" to "Improve" their model. There's no need for this kind of BS.
Uses
Just don't.
License
GAYSEX-Dont Be A Prick License
As of January 2024, #love was the most used hashtag on Instagram, being included in over two billion posts on the social media platform. #Instagood and #instagram were used over one billion times as of early 2024.
This comparative study investigates the contemporary social and political characteristics and activities of the urban middle classes in Paris and London. It investigates a range of neighbourhood types in each city - inner-city gentrified (not socially mixed); gentrifying (socially mixed); suburban; exurban; and gated communities - to ask to what extent the middle classes compare or contrast across these different locations in terms of their social relations and political attitudes and engagements (including, for example, schooling, use of public services and neighbourhood activism). The research consists of in-depth interviews with middle-class residents and elite actors in each neighbourhood, as well as an analysis of relevant documents that discuss middle-class identity and activity in these cities. The study will draw out the implications of the findings for urban politics and policies (compared with the role the middle classes are assumed to play in these policies) at the neighbourhood, city, national and transnational scales. This is a fully comparative bilateral project with colleagues in Paris who are equivalently funded by the Agence Nationale de la Recherche.
[INST] What career options are available in the field of computer science? [/INST] Computer science offers various paths like software development, data science, and cybersecurity. [INST] I enjoy helping people. What are some rewarding careers in healthcare? [/INST] Consider careers like nursing, physician assistant, or social work for a fulfilling healthcare role. [INST] How can I pursue a career in digital marketing? [/INST] Start by gaining digital marketing skills, and explore roles like… See the full description on the dataset page: https://huggingface.co/datasets/wrandhawa/test.
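To make the layout above concrete, here is a minimal R sketch that splits one such string into instruction/response pairs; the variable `raw` simply holds the excerpt shown above.

```r
# Minimal sketch: parse the [INST] ... [/INST] chat format into pairs.
raw <- "[INST] What career options are available in the field of computer science? [/INST] Computer science offers various paths like software development, data science, and cybersecurity. [INST] I enjoy helping people. What are some rewarding careers in healthcare? [/INST] Consider careers like nursing, physician assistant, or social work."

pieces <- strsplit(raw, "\\[INST\\]")[[1]]
pieces <- pieces[nzchar(trimws(pieces))]  # drop the empty lead-in chunk

pairs <- do.call(rbind, lapply(pieces, function(p) {
  halves <- strsplit(p, "\\[/INST\\]")[[1]]
  data.frame(instruction = trimws(halves[1]),
             response    = trimws(halves[2]))
}))

print(pairs)
```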
https://creativecommons.org/publicdomain/zero/1.0/
If you enjoyed this dataset, please don't forget to upvote it.
The 59th quadrennial presidential election was held on Tuesday, November 3, 2020. To win, a candidate needs 270 of the 538 electoral votes. A good sign that a candidate is doing well is winning states that aren't expected to go their way.
This dataset contains county-level results from the 2020 US election.
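As a small example of working with county-level results, the R sketch below rolls counties up to state winners; the file name and the column names (state, candidate, total_votes) are assumptions about the layout.

```r
# Minimal sketch: aggregate county rows to state totals and pick winners.
# File name and column names are assumptions about this dataset's layout.
counties <- read.csv("us_election_2020_county.csv")

state_totals <- aggregate(total_votes ~ state + candidate,
                          data = counties, FUN = sum)

# Winner per state: the candidate with the most aggregated votes
winners <- do.call(rbind, lapply(split(state_totals, state_totals$state),
                                 function(s) s[which.max(s$total_votes), ]))
head(winners)
```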