50 Million Rows MSSQL Backup File with Clustered Columnstore Index.
This dataset contains:
- 27K categorized Turkish supermarket items.
- 81 stores (every city in Turkey has a store).
- 100K customers with real Turkish names and addresses.
- 10M rows of randomly generated sales data.
- Near-real prices for all items, with an inflation factor applied over time.
All of the data was generated randomly. The usernames were built from real Turkish names and surnames, but they do not belong to real people.
The sales data was also generated randomly, subject to some rules.
For example, every order contains 1-9 kinds of item.
Every order line has a quantity of 1-9 pieces.
The randomizing function works according to the population of each city.
So the number of orders for Istanbul (the biggest city in Turkey) is about 20% of all data,
and for another city, for example Gaziantep (2.5% of Turkey's population), it is about 2.5% of all data.
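The generation rules above can be illustrated with a minimal Python sketch. It is not the author's actual generator: the function name `generate_order`, the `CITY_WEIGHTS` table, and the lumped "Other" city are assumptions made for illustration. Only the 1-9 ranges and the Istanbul and Gaziantep shares come from the description.

```python
import random

# Hypothetical city weights. Only the Istanbul (20%) and Gaziantep (2.5%)
# shares come from the description; "Other" lumps the remaining cities.
CITY_WEIGHTS = {"Istanbul": 20.0, "Gaziantep": 2.5, "Other": 77.5}


def generate_order(rng, item_ids):
    """Build one random order that follows the rules described above."""
    cities = list(CITY_WEIGHTS)
    # Pick the city with probability proportional to its weight.
    city = rng.choices(cities, weights=[CITY_WEIGHTS[c] for c in cities], k=1)[0]
    # 1-9 distinct kinds of item per order.
    kinds = rng.randint(1, min(9, len(item_ids)))
    lines = [
        {"item_id": item, "amount": rng.randint(1, 9)}  # 1-9 pieces per line
        for item in rng.sample(item_ids, kinds)
    ]
    return {"city": city, "lines": lines}


rng = random.Random(42)
orders = [generate_order(rng, list(range(1, 101))) for _ in range(10_000)]
istanbul_share = sum(o["city"] == "Istanbul" for o in orders) / len(orders)
print(f"Istanbul share: {istanbul_share:.1%}")
```

With enough orders, the printed Istanbul share should settle near 20%, which mirrors the population-weighted behavior the description mentions.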
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the United States population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of the United States. The dataset can be used to understand the population distribution of the United States by age. For example, using this dataset, we can identify the largest age group in the United States.
Key observations
The largest age group in the United States was 30 to 34 years, with a population of 22.71 million (6.86%), according to the ACS 2018-2022 5-Year Estimates. At the same time, the smallest age group in the United States was 80 to 84 years, with a population of 6.25 million (1.89%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates.
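The percentages in the key observations can be cross-checked with simple arithmetic. The sketch below is only an illustration using the two rounded figures quoted above; the function names are made up for this example and the implied total is approximate because the inputs are rounded.

```python
def implied_total(group_population, share_percent):
    """Total population implied by one group's count and its percentage share."""
    return group_population / (share_percent / 100)


def share_of_total(group_population, total_population):
    """Percentage share of the total held by one group."""
    return 100 * group_population / total_population


# Figures quoted in the key observations (population in millions).
total = implied_total(22.71, 6.86)
smallest_share = share_of_total(6.25, total)
print(f"implied total: {total:.1f} million")
print(f"80 to 84 years share: {smallest_share:.2f}%")
```

The share recomputed for the 80 to 84 years group agrees with the 1.89% quoted above, so the two figures are consistent with each other.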
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for United States Population by Age.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description : This dataset contains information on the largest companies in the world ranked by their revenue in USD millions. It includes key financial metrics and details about each company, making it a valuable resource for analysis and comparison.
This list comprises the world's largest companies by consolidated revenue, according to the Fortune Global 500 2024 rankings and other sources. American retail corporation Walmart has been the world's largest company by revenue since 2014. The list is limited to the largest 50 companies, all of which have annual revenues exceeding US$130 billion. This list is incomplete, as not all companies disclose their information to the media or general public. Of the 50 largest companies, 23 are American, 17 are Asian, and 10 are European.
Features :
Source : The data has been sourced from the Wikipedia page on List of Largest Companies by Revenue.
Usage : This dataset can be used for various analyses, including :
- Financial performance comparisons across industries.
- Visualization of the largest global companies.
- Insights into employment statistics in relation to revenue.
Beginner-Friendly : This dataset is suitable for beginners looking to practice data analysis, data visualization, and financial comparisons. It provides a straightforward structure with easily understandable features, making it an excellent starting point for those new to data science.
The Survey of Consumer Finances (SCF) is normally a triennial cross-sectional survey of U.S. families. The survey data include information on families' balance sheets, pensions, income, and demographic characteristics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Texas by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Texas. The dataset can be used to understand the population distribution of Texas by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in Texas. It can also be used to see how the gender ratio changes from birth to the most senior age group, and to compare the male-to-female ratio across age groups in Texas.
Key observations
Largest age group (population): Male # 10-14 years (1.12 million) | Female # 10-14 years (1.08 million). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender :
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for Texas Population by Gender.
Success.ai’s Phone Number Data offers direct access to over 50 million verified phone numbers for professionals worldwide, extracted from our expansive collection of 170 million profiles. This robust dataset includes work emails and key decision-maker profiles, making it an essential resource for companies aiming to enhance their communication strategies and outreach efficiency. Whether you're launching targeted marketing campaigns, setting up sales calls, or conducting market research, our phone number data ensures you're connected to the right professionals at the right time.
Why Choose Success.ai’s Phone Number Data?
- Direct Communication: Reach out directly to professionals with verified phone numbers and work emails, ensuring your message gets to the right person without delay.
- Global Coverage: Our data spans across continents, providing phone numbers for professionals in North America, Europe, APAC, and emerging markets.
- Continuously Updated: We regularly refresh our dataset to maintain accuracy and relevance, reflecting changes like promotions, company moves, or industry shifts.
Comprehensive Data Points:
- Verified Phone Numbers: Direct lines and mobile numbers of professionals across various industries.
- Work Emails: Reliable email addresses to complement phone communications.
- Professional Profiles: Decision-makers’ profiles including job titles, company details, and industry information.
Flexible Delivery and Integration: Success.ai offers this dataset in various formats suitable for seamless integration into your CRM or sales platform. Whether you prefer API access for real-time data retrieval or static files for periodic updates, we tailor the delivery to meet your operational needs.
Competitive Pricing with Best Price Guarantee: We provide this essential data at the most competitive prices in the industry, ensuring you receive the best value for your investment. Our best price guarantee means you can trust that you are getting the highest quality data at the lowest possible cost.
Targeted Applications for Phone Number Data:
- Sales and Telemarketing: Enhance your telemarketing campaigns by reaching out directly to potential customers, bypassing gatekeepers.
- Market Research: Conduct surveys and research directly with industry professionals to gather insights that can shape your business strategy.
- Event Promotion: Invite prospects to webinars, conferences, and seminars directly through personal calls or SMS.
- Customer Support: Improve customer service by integrating accurate contact information into your support systems.
Quality Assurance and Compliance:
- Data Accuracy: Our data is verified for accuracy to ensure over 99% deliverability rates.
- Compliance: Fully compliant with GDPR and other international data protection regulations, allowing you to use the data with confidence globally.
Customization and Support:
- Tailored Data Solutions: Customize the data according to geographic, industry-specific, or job role filters to match your unique business needs.
- Dedicated Support: Our team is on hand to assist with data integration, usage, and any questions you may have.
Start with Success.ai Today: Engage with Success.ai to leverage our Phone Number Data and connect with global professionals effectively. Schedule a consultation or request a sample through our dedicated client portal and begin transforming your outreach and communication strategies today.
Remember, with Success.ai, you don’t just buy data; you invest in a partnership that grows with your business needs, backed by our commitment to quality and affordability.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the New York population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of New York. The dataset can be used to understand the population distribution of New York by age. For example, using this dataset, we can identify the largest age group in New York.
Key observations
The largest age group in New York was 30 to 34 years, with a population of 1.43 million (7.22%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in New York was 80 to 84 years, with a population of 403,663 (2.03%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for New York Population by Age.
https://creativecommons.org/publicdomain/zero/1.0/
I have spent some time scraping and shaping PubChem data into a Neo4j graph database. The process took a lot of time, mainly for downloading the data and loading it into Neo4j. The whole process took weeks. If you want to build your own, I will show you how to download mine and set it up in less than an hour (most of the time you’ll just have to wait). How this dataset was created is described in the following blogs:
- https://medium.com/@nijhof.dns/exploring-neodash-for-197m-chemical-full-text-graph-e3baed9615b8
- https://medium.com/neo4j/combining-3-biochemical-datasets-in-a-graph-database-8e9aafbb5788
- https://medium.com/p/d9ee9779dfbe
The full database is a merge of 3 datasets, PubChem (compounds + synonyms), NCI60 (GI50), and ChEMBL (cell lines). It contains 6 nodes of interest:
● Compound: This is related to a compound of PubChem. It has 1 property.
  ○ pubChemCompId: The id within pubchem. So “compound:cid162366967” links to https://pubchem.ncbi.nlm.nih.gov/compound/162366967. This number can be used with both PubChem RDF and PUG.
● Synonym: A name found in the literature. This name can refer to zero, one, or more compounds. This helps find relations between natural language names and absolute compounds they are related to.
  ○ Name: Natural language name. Can contain letters, spaces, numbers, and any other Unicode character.
  ○ pubChemSynId: PubChem synonym id as used within the RDF
● CellLine: These are the ChEMBL cell lines. They hold a lot of information.
  ○ Name: The name of the cell line.
  ○ Uri: A unique URI for every element within the ChEMBL RDF.
  ○ cellosaurusId: The id to connect it to the Cellosaurus dataset. This is one of the most extensive cell line datasets out there.
● Measurement: A measurement you can do within a biomedical experiment. Currently, only GI50 (the concentration needed for Growth Inhibition of 50%) is added.
  ○ Name: Name of the measurement.
● Condition: A single condition of an experiment. A condition is part of an experiment. Examples are: an individual of the control group, a sample with drug A, or a sample with more CO2
● Experiment: A collection of multiple conditions all done at the same time with the same bias. Meaning we assume all uncontrolled variables are the same.
  ○ Name: Name of experiment.
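The mapping from a pubChemCompId value to its PubChem page, as described for the Compound node, can be sketched in Python as follows. The function name `compound_url` and the strict id pattern are assumptions for illustration; only the example id and the URL form come from the description.

```python
import re

PUBCHEM_COMPOUND_URL = "https://pubchem.ncbi.nlm.nih.gov/compound/{cid}"


def compound_url(pub_chem_comp_id):
    """Turn an id such as 'compound:cid162366967' into its PubChem page URL."""
    # Assumed format: the literal prefix 'compound:cid' followed by digits.
    match = re.fullmatch(r"compound:cid(\d+)", pub_chem_comp_id.strip(), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unexpected pubChemCompId format: {pub_chem_comp_id!r}")
    return PUBCHEM_COMPOUND_URL.format(cid=match.group(1))


print(compound_url("compound:cid162366967"))
```

Raising on an unexpected format, rather than guessing, keeps malformed ids from silently producing broken links.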
How to download it
Warning: you need 120 GB of free disk space. The compressed file you download is already 30 GB, and the uncompressed file is another 30 GB. The database afterward is 60 GB. So 60 GB is only for temporary files, and the other 60 GB is for the database. If you do this on an HDD it will be slow.
If you load this into Neo4j Desktop as a local database (like I do) it will scream and yell at you; just ignore this. We are pushing it much further than it was designed for, but it will still work.
Go to this Kaggle dataset and download the dump file. Unzip the file, then delete the zipped file. This part needs 60 GB but only takes 30 by the end of it.
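Before starting the download, it may help to confirm that the target drive has the space mentioned in the warning. This is a minimal sketch, assuming 1 GB means 10^9 bytes and that the current directory sits on the drive you plan to use; the function name `has_enough_space` is made up for this example.

```python
import shutil

REQUIRED_GB = 120  # total free space the description asks for


def has_enough_space(path=".", required_gb=REQUIRED_GB):
    """Return True when the drive holding `path` has at least `required_gb` free."""
    free_gb = shutil.disk_usage(path).free / 1e9  # assumes 1 GB = 10**9 bytes
    return free_gb >= required_gb


if has_enough_space():
    print("Enough free space for the download and the database.")
else:
    print(f"Less than {REQUIRED_GB} GB free; free up space before downloading.")
```

Pass the folder where Neo4j stores its databases as `path` if it lives on a different drive than the download folder.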
Create a database
Open the Neo4j desktop app, and click “Reveal files in File Explorer”. Move the .dump you downloaded into this folder.
Click on the ... behind the .dump file and click Create new DBMS from dump. This database is a dump from Neo4j V4, so your database also needs to be V4.x.x!
It will now create the database. This will take a long time, and it might even say it has timed out. Do not believe this lie! It is still running in the background. Every time you start it, it will time out. Just let it run and press start again later. The second time, it will start up directly.
Every time I start it up I get the timed-out error. After waiting 10 minutes and clicking start again, the database, and with it more than 200 million nodes, is ready. And you are done! Good luck, and let me know what you build with it.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Money Supply M0 in the United States increased to 53615000 USD Million in October from 5478000 USD Million in September of 2025. This dataset provides - United States Money Supply M0 - actual values, historical data, forecast, chart, statistics, economic calendar and news.
How many people use social media?
Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.
Who uses social media?
Social networking is one of the most popular digital activities worldwide, and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.
How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America had the highest average time spent per day on social media.
What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.
Which country has the most Facebook users?
There are more than 378 million Facebook users in India alone, making it the leading country in terms of Facebook audience size. To put this into context, if India’s Facebook audience were a country then it would be ranked third in terms of largest population worldwide. Apart from India, there are several other markets with more than 100 million Facebook users each: The United States, Indonesia, and Brazil with 193.8 million, 119.05 million, and 112.55 million Facebook users respectively.
Facebook – the most used social media
Meta, the company that was previously called Facebook, owns four of the most popular social media platforms worldwide: WhatsApp, Facebook Messenger, Facebook, and Instagram. As of the third quarter of 2021, there were around 3.5 billion cumulative monthly users of the company’s products worldwide. With around 2.9 billion monthly active users, Facebook is the most popular social media platform worldwide. With an audience of this scale, it is no surprise that the vast majority of Facebook’s revenue is generated through advertising.
Facebook usage by device
As of July 2021, it was found that 98.5 percent of active users accessed their Facebook account from mobile devices. In fact, almost 81.8 percent of Facebook audiences worldwide access the platform only via mobile phone. Facebook is not only available through mobile browser as the company has published several mobile apps for users to access their products and services. As of the third quarter 2021, the four core Meta products were leading the ranking of most downloaded mobile apps worldwide, with WhatsApp amassing approximately six billion downloads.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the California population distribution across 18 age groups. It lists the population in each age group along with its percentage of the total population of California. The dataset can be used to understand the population distribution of California by age. For example, using this dataset, we can identify the largest age group in California.
Key observations
The largest age group in California was 30 to 34 years, with a population of 2.98 million (7.61%), according to the ACS 2019-2023 5-Year Estimates. At the same time, the smallest age group in California was 80 to 84 years, with a population of 680,447 (1.73%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are available for free download at https://www.neilsberg.com/research/.
This dataset is part of the main dataset for California Population by Age.
https://www.usa.gov/government-works/
In 2018, food waste in the United States was a significant issue with substantial environmental and economic consequences. Here are some key statistics:
Overall Waste Volume and Percentage:
Approximately 103 million tons (206 billion pounds) of food waste were generated in the US in 2018, according to the EPA.
This amounted to between 30-40% of the entire US food supply going uneaten.
On a per-person basis, it was roughly one pound of food wasted per person per day.
Economic Impact:
The annual food waste in America had an approximate value of $161 billion to $218 billion.
The average American family of four reportedly threw out $1,500 in wasted food per year (based on 2010 price data, which would be higher in 2018).
The restaurant industry alone incurred an estimated $162 billion in costs related to wasted food.
Environmental Impact:
Food waste was the number one material in American landfills, accounting for 24.1% of all municipal solid waste (MSW).
When food rots in landfills, it produces methane, a potent greenhouse gas that is 28 times more powerful than CO2 at trapping heat. Food waste was responsible for an estimated 58% of landfill methane emissions to the atmosphere.
The production of wasted food in the US was equivalent to the greenhouse gas emissions of 37 million cars.
Wasted food also means wasted resources like land, water, and energy. Annually, food loss and waste took up an area of agricultural land the size of California and New York combined, and wasted enough energy to power 50 million US homes for a year.
Approximately 21% of agricultural water resources and 19% of US croplands were wasted for food that was ultimately thrown away.
Sources of Food Waste:
Food waste occurs across the entire supply chain, with significant contributions from:
Households: An estimated 43% of food waste came from homes.
Grocery stores, restaurants, and food service companies: Accounted for about 40% of food waste.
Farms: Responsible for around 16% of food loss.
Manufacturers: Contributed about 2% of food waste.
Breakdown by Material (within MSW):
Food waste comprised the fourth largest material category in total MSW generation, estimated at 63.1 million tons or 21.6% in 2018.
These statistics highlight the significant scale of food waste in the US in 2018 and its wide-ranging negative impacts on the economy and the environment.
Food waste flows between waste-generating sectors and waste management routes are captured by these Flow-By-Sector (FBS) databases. Typically, the sectors use codes from the 2012 North American Industry Classification System (NAICS). Method 1 (m1 dataset file), the first dataset, assigns sectors to food waste creation and disposal statistics from the USEPA Wasted Food Report. The National Commercial Non-Hazardous Waste (CNHW) FBS dataset's discarded food data is attributed to sectors using the second approach, method 2 (m2 dataset file).
The CSV file "Food_Waste_national_2018_m2_v1.3.2_9b1bb41.csv" contains the following columns with their likely meanings:
Flowable: The type of material being tracked, in this case, "Food Waste".
Class: A classification for the "Flowable" material, here "Other".
SectorProducedBy: A numerical code indicating the sector that produced the food waste.
SectorConsumedBy: A numerical code indicating the sector that consumed or received the food waste.
SectorSourceName: The source of the sector classification, which is "NAICS_2012_Code" (North American Industry Classification System 2012 Code).
Context: This column appears to be empty in the provided data.
Location: This column seems to contain a location code, e.g., "=""00000""".
LocationSystem: The system used for location identification, which is "FIPS" (Federal Information Processing Standards).
FlowAmount: The quantity of food waste.
Unit: The unit of measurement for "FlowAmount", which is "kg" (kilograms).
FlowType: The type of flow, which is "WASTE_FLOW".
Year: The year the data pertains to, in this case, "2018".
MeasureofSpread: This column appears to be empty in the provided data.
Spread: A value related to the spread of the data, here "0.0".
DistributionType: This column appears to be empty in the provided data.
Min: Minimum value, here "0.0".
Max: Maximum value, here "0.0".
DataReliability: Data reliability value, here "0.0".
TemporalCorrelation: Temporal correlation value, here "0.0".
GeographicalCorrelation: Geographical correlation value, here "0.0".
TechnologicalCorrelation: Technological correlation value, here "0.0".
DataCollection: Data collection method or source, here "CalRecycle_WasteCharacterization".
**MetaSources...
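As an illustration of how the columns listed above might be used, the following Python sketch sums FlowAmount per producing sector. It is a minimal example under stated assumptions: the helper name `total_by_producer` is made up, and the sample rows (sector codes and amounts) are placeholders rather than values from the real file. Only the column names come from the description.

```python
import csv
import io
from collections import defaultdict

# A few made-up rows using some of the columns listed above; the sector codes
# and amounts are placeholders, not values from the real file.
SAMPLE = """Flowable,SectorProducedBy,SectorConsumedBy,FlowAmount,Unit,Year
Food Waste,722511,562212,1000.0,kg,2018
Food Waste,722511,562219,250.5,kg,2018
Food Waste,445110,562212,400.0,kg,2018
"""


def total_by_producer(csv_file):
    """Sum FlowAmount (kg) per SectorProducedBy code."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["SectorProducedBy"]] += float(row["FlowAmount"])
    return dict(totals)


totals = total_by_producer(io.StringIO(SAMPLE))
# Print sectors from the largest to the smallest total.
for sector, kg in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sector}: {kg:,.1f} kg")
```

To run this against the actual download, open the CSV file with `open(path, newline="")` and pass the file object to `total_by_producer` instead of the in-memory sample.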
This web map displays data from the voter registration database as the percent of registered voters by census tract in King County, Washington. The data for this web map is compiled from King County Elections voter registration data for the years 2013-2019. The total number of registered voters is based on the geo-location of the voter's registered address at the time of the general election for each year.
The eligible voting population, age 18 and over, is based on the estimated population increase from the US Census Bureau and the Washington Office of Financial Management and was calculated as a projected 6 percent population increase for the years 2010-2013, 7 percent population increase for the years 2010-2014, 9 percent population increase for the years 2010-2015, 11 percent population increase for the years 2010-2016 & 2017, 14 percent population increase for the years 2010-2018 and 17 percent population increase for the years 2010-2019. The total population 18 and over in 2010 was 1,517,747 in King County, Washington. The percentage of registered voters represents the number of people who are registered to vote as compared to the eligible voting population, age 18 and over.
The voter registration data by census tract was grouped into six percentage range estimates: 50% or below, 51-60%, 61-70%, 71-80%, 81-90% and 91% or above, with an overall 84 percent registration rate. In the map the lighter colors represent a relatively low percentage range of voter registration and the darker colors represent a relatively high percentage range of voter registration. PDF maps of these data can be viewed at King County Elections downloadable voter registration maps.
The 2019 General Election Voter Turnout layer is voter turnout data by historical precinct boundaries for the corresponding year. The data is grouped into six percentage ranges: 0-30%, 31-40%, 41-50%, 51-60%, 61-70%, and 71-100%. The lighter colors represent lower turnout and the darker colors represent higher turnout.
The King County Demographics Layer is census data for language, income, poverty, race and ethnicity at the census tract level and is based on the 2010-2014 American Community Survey 5-Year Average provided by the United States Census Bureau. Since the data are based on a survey, they are considered to be estimates and should be used with that understanding. The demographic data sets were developed and are maintained by King County staff to support the King County Equity and Social Justice program. Other data for this map is located in the King County GIS Spatial Data Catalog, where data is managed by the King County GIS Center, a multi-department enterprise GIS in King County, Washington.
King County has nearly 1.3 million registered voters and is the largest jurisdiction in the United States to conduct all elections by mail. In the map you can view the percent of registered voters by census tract, compare registration within political districts, compare registration and demographic data, verify your voter registration, or register to vote through a link to VoteWA, the Washington State Online Voter Registration web page.
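The eligible-population projection and the six registration ranges described above can be sketched in Python. This is an illustration, not King County's actual method: the function names are made up, rounding the rate to a whole percent before bucketing is an assumption, and the 840 of 1,000 input is a placeholder rather than real tract data. The 2010 base population and the growth percentages come from the description.

```python
POP_18_PLUS_2010 = 1_517_747  # King County population aged 18 and over in 2010

# Projected growth since 2010, in percent, as listed in the description.
GROWTH_PERCENT = {2013: 6, 2014: 7, 2015: 9, 2016: 11, 2017: 11, 2018: 14, 2019: 17}


def eligible_population(year):
    """Estimated county-wide population aged 18 and over for a given year."""
    return round(POP_18_PLUS_2010 * (100 + GROWTH_PERCENT[year]) / 100)


def registration_bucket(registered, eligible):
    """Place a registration rate into one of the six map ranges."""
    percent = round(100 * registered / eligible)  # assumption: whole percents
    if percent <= 50:
        return "50% or below"
    if percent >= 91:
        return "91% or above"
    lower = (percent - 1) // 10 * 10 + 1  # 51, 61, 71, or 81
    return f"{lower}-{lower + 9}%"


print(eligible_population(2019))
print(registration_bucket(840, 1000))  # placeholder counts, not real tract data
```

A rate equal to the overall 84 percent figure falls in the 81-90% range under this bucketing.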
During a 2024 survey, 77 percent of respondents from Nigeria stated that they used social media as a source of news. In comparison, just 23 percent of Japanese respondents said the same. Large portions of social media users around the world admit that they do not trust social platforms either as media sources or as a way to get news, and yet they continue to access such networks on a daily basis.
Social media: trust and consumption
Despite the majority of adults surveyed in each country reporting that they used social networks to keep up to date with news and current affairs, a 2018 study showed that social media is the least trusted news source in the world. Less than 35 percent of adults in Europe considered social networks to be trustworthy in this respect, yet more than 50 percent of adults in Portugal, Poland, Romania, Hungary, Bulgaria, Slovakia and Croatia said that they got their news on social media.
What is clear is that we live in an era where social media is such an enormous part of daily life that consumers will still use it in spite of their doubts or reservations. Concerns about fake news and propaganda on social media have not stopped billions of users accessing their favorite networks on a daily basis.
Most Millennials in the United States use social media for news every day, and younger consumers in European countries are much more likely to use social networks for national political news than their older peers.
Like it or not, reading news on social media is fast becoming the norm for younger generations, and this form of news consumption will likely increase further regardless of whether consumers fully trust their chosen network.
In the fourth quarter of 2024, TikTok generated around 186 million downloads from users worldwide. First launched in China by ByteDance as Douyin, the short-video format was popularized by TikTok and took over the global social media environment in 2020. In the first quarter of 2020, TikTok downloads peaked at over 313.5 million worldwide, up by 62.3 percent compared to the first quarter of 2019.
TikTok interactions: is there a magic formula for content success?
In 2024, TikTok registered an engagement rate of approximately 4.64 percent on video content hosted on its platform. During the same year, the social video app recorded over 1,100 interactions on average. These interactions were primarily likes, with fewer than 20 comments per piece of content on average in 2024.
The platform has been actively monitoring the issue of fake interactions: it removed around 236 million fake likes during the first quarter of 2024. Though there is no secret formula for maximizing these metrics, following the recommended video length may contribute to the success of content on TikTok.
As of the first quarter of 2024, the recommended video length for tiny TikTok accounts with up to 500 followers was around 2.6 minutes, while the ideal video duration for huge TikTok accounts with over 50,000 followers was 7.28 minutes. The average length of TikTok videos posted by creators in 2024 was around 43 seconds.
What’s trending on TikTok Shop?
Since its launch in September 2023, TikTok Shop has become one of the most popular online shopping platforms, offering consumers a wide variety of products. In 2023, TikTok shops featuring beauty and personal care items sold over 370 million products worldwide.
TikTok shops featuring womenswear and underwear, as well as food and beverages, followed with 285 million and 138 million products sold, respectively. Similarly, in the United States market, health and beauty products were the best-selling items, accounting for 85 percent of sales made via the TikTok Shop feature during the first month of its launch. In 2023, Indonesia was the market with the largest number of TikTok Shops, hosting over 20 percent of all TikTok Shops. Thailand and Vietnam followed with 18.29 and 17.54 percent of the total shops listed on the short-video platform, respectively.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
Geoscape GNAF is the geocoded address database for Australian businesses and governments. It’s the trusted source of geocoded address data for Australia with over 50 million contributed addresses distilled into 15.4 million G-NAF addresses. It is built and maintained by Geoscape Australia using independently examined and validated government data.
It contains the state, suburb, street, number and coordinate reference or geocode for street addresses in Australia.
Columns extracted into this csv
The csv here contains 15.3 million rows with text and numeric column values extracted from the source GNAF files using PostgreSQL scripts from: https://github.com/dylanhogg/address-net/tree/master/gnaf_loading
Columns:
building_name
flat_number
flat_number_prefix
flat_number_suffix
flat_type
latitude
level_number
level_number_prefix
level_number_suffix
level_type
locality_name
longitude
lot_number
lot_number_prefix
lot_number_suffix
number_first
number_first_prefix
number_first_suffix
number_last
number_last_prefix
number_last_suffix
postcode
state_abbreviation
street_name
street_suffix_code
street_type_code
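As a rough illustration of how these columns fit together, the sketch below joins the non-empty parts of a row into a one-line address. The sample rows are made up for the example and are not taken from the dataset, the helper names (join_parts, format_address) are hypothetical, and the ordering of the parts is an assumption rather than an official G-NAF layout.

```python
import csv
import io

# Made-up sample rows for illustration only; they are not taken from the dataset.
SAMPLE_CSV = """flat_type,flat_number,number_first,number_first_suffix,number_last,street_name,street_type_code,locality_name,state_abbreviation,postcode,latitude,longitude
UNIT,3,12,A,,EXAMPLE,STREET,SAMPLETOWN,NSW,2000,-33.0,151.0
,,5,,7,DEMO,ROAD,TESTVILLE,VIC,3000,-37.0,145.0
"""


def join_parts(row, keys, sep=""):
    """Concatenate the non-empty values of `keys`, skipping blank or missing columns."""
    values = [(row.get(k) or "").strip() for k in keys]
    return sep.join(v for v in values if v)


def format_address(row):
    """Build a one-line address from one row (the part ordering is an assumption)."""
    flat = join_parts(row, ["flat_type", "flat_number"], sep=" ")
    first = join_parts(row, ["number_first_prefix", "number_first", "number_first_suffix"])
    last = join_parts(row, ["number_last_prefix", "number_last", "number_last_suffix"])
    # A street number range such as 5-7 uses both number_first and number_last.
    number = f"{first}-{last}" if first and last else first
    street = join_parts(row, ["street_name", "street_type_code", "street_suffix_code"], sep=" ")
    locality = join_parts(row, ["locality_name", "state_abbreviation", "postcode"], sep=" ")
    street_line = " ".join(p for p in (flat, number, street) if p)
    return ", ".join(p for p in (street_line, locality) if p)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
addresses = [format_address(r) for r in rows]
for address in addresses:
    print(address)
```

With the real file, the same function could be applied to rows read by csv.DictReader from the downloaded csv instead of the in-memory sample.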
References
• Original data source: https://data.gov.au/data/dataset/geocoded-national-address-file-g-naf
• G-NAF Product Description: https://docs.geoscape.com.au/projects/gnaf_desc/en/stable/index.html
Restrictions
The EULA terms are based on the Creative Commons Attribution 4.0 International license (CC BY 4.0). However, an important restriction relating to the use of the open G-NAF for the sending of mail has been added. The open G-NAF data must not be used for the generation of an address or a compilation of addresses for the sending of mail unless the user has verified that each address to be used for the sending of mail is capable of receiving mail by reference to a secondary source of information.
https://creativecommons.org/publicdomain/zero/1.0/
Diabetes is among the most prevalent chronic diseases in the United States, impacting millions of Americans each year and exerting a significant financial burden on the economy. Diabetes is a serious chronic disease in which individuals lose the ability to effectively regulate levels of glucose in the blood, and can lead to reduced quality of life and life expectancy. After different foods are broken down into sugars during digestion, the sugars are then released into the bloodstream. This signals the pancreas to release insulin. Insulin helps enable cells within the body to use those sugars in the bloodstream for energy. Diabetes is generally characterized by either the body not making enough insulin or being unable to use the insulin that is made as effectively as needed.
Complications like heart disease, vision loss, lower-limb amputation, and kidney disease are associated with chronically high levels of sugar remaining in the bloodstream for those with diabetes. While there is no cure for diabetes, strategies like losing weight, eating healthily, being active, and receiving medical treatments can mitigate the harms of this disease in many patients. Early diagnosis can lead to lifestyle changes and more effective treatment, making predictive models for diabetes risk important tools for the public and public health officials.
The scale of this problem is also important to recognize. The Centers for Disease Control and Prevention has indicated that as of 2018, 34.2 million Americans have diabetes and 88 million have prediabetes. Furthermore, the CDC estimates that 1 in 5 diabetics and roughly 8 in 10 prediabetics are unaware of their risk. While there are different types of diabetes, type II diabetes is the most common form and its prevalence varies by age, education, income, location, race, and other social determinants of health. Much of the burden of the disease falls on those of lower socioeconomic status as well. Diabetes also places a massive burden on the economy, with diagnosed diabetes costs of roughly $327 billion and total costs with undiagnosed diabetes and prediabetes approaching $400 billion annually.
The Behavioral Risk Factor Surveillance System (BRFSS) is a health-related telephone survey that is collected annually by the CDC. Each year, the survey collects responses from over 400,000 Americans on health-related risk behaviors, chronic health conditions, and the use of preventative services. It has been conducted every year since 1984. For this project, a csv of the dataset available on Kaggle for the year 2015 was used. This original dataset contains responses from 441,455 individuals and has 330 features. These features are either questions directly asked of participants, or calculated variables based on individual participant responses.
This dataset contains 3 files:
1. diabetes_012_health_indicators_BRFSS2015.csv is a clean dataset of 253,680 survey responses to the CDC's BRFSS2015. The target variable Diabetes_012 has 3 classes. 0 is for no diabetes or only during pregnancy, 1 is for prediabetes, and 2 is for diabetes. There is class imbalance in this dataset. This dataset has 21 feature variables.
2. diabetes_binary_5050split_health_indicators_BRFSS2015.csv is a clean dataset of 70,692 survey responses to the CDC's BRFSS2015. It has an equal 50-50 split of respondents with no diabetes and with either prediabetes or diabetes. The target variable Diabetes_binary has 2 classes. 0 is for no diabetes, and 1 is for prediabetes or diabetes. This dataset has 21 feature variables and is balanced.
3. diabetes_binary_health_indicators_BRFSS2015.csv is a clean dataset of 253,680 survey responses to the CDC's BRFSS2015. The target variable Diabetes_binary has 2 classes. 0 is for no diabetes, and 1 is for prediabetes or diabetes. This dataset has 21 feature variables and is not balanced.
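The relationship between the three-class and binary targets can be sketched as follows. This is a minimal illustration on invented toy labels, not the procedure used to build the files: the description does not say how the 50-50 split was produced, so random undersampling of the majority class is an assumption here, and the function names to_binary and balanced_sample are hypothetical.

```python
import random
from collections import Counter


def to_binary(diabetes_012):
    """Collapse the 3-class label: 0 stays 0; 1 (prediabetes) and 2 (diabetes) become 1."""
    return 0 if diabetes_012 == 0 else 1


def balanced_sample(labels, seed=0):
    """Return sorted row indices of a 50-50 undersample of binary labels.

    Assumption: the balance is obtained by random undersampling; the source
    does not state how the balanced file was actually created.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for index, label in enumerate(labels):
        by_class[label].append(index)
    n = min(len(by_class[0]), len(by_class[1]))
    picked = rng.sample(by_class[0], n) + rng.sample(by_class[1], n)
    return sorted(picked)


# Toy labels invented for illustration; not drawn from the survey data.
toy_012 = [0, 0, 0, 0, 0, 0, 1, 2, 2, 0]
binary = [to_binary(y) for y in toy_012]
idx = balanced_sample(binary)
counts = Counter(binary[i] for i in idx)
print(binary, dict(counts))
```

Undersampling discards majority-class rows, which is one reason a balanced file can be much smaller than the full one.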
Explore some of the following research questions:
1. Can survey questions from the BRFSS provide accurate predictions of whether an individual has diabetes?
2. What risk factors are most predictive of diabetes risk?
3. Can we use a subset of the risk factors to accurately predict whether an individual has diabetes?
4. Can we create a short form of questions from the BRFSS using feature selection to accurately predict if someone might have diabetes or is at high risk of diabetes?
It is important to reiterate that I did not create this dataset; it is just a cleaned and consolidated dataset created from the BRFSS 2015 dataset already on Kaggle. That dataset can be found here and the notebook I used for the data cleaning can be found here.
Zidian Xie et al fo...
This data was collected as part of a university research paper in which COVID-19 cases were analysed using a cross-sectional regression model as at 17th May 2020. To better understand COVID-19 case growth at a country level, I decided to create a dataset containing key dates in the progression of the virus globally.
210 rows, 6 columns.
This dataset contains data relating to COVID-19 cases for 210 countries globally. Data was collected using the most recent and reliable information as at 17th May 2020. The majority of data was collected from Worldometer. https://www.worldometers.info/coronavirus/#countries
This dataset contains, for 210 countries, the dates of the 1st coronavirus case, the 100th coronavirus case, and the 50th coronavirus case per 1 million people. Data is also provided for the number of days between the 1st case and the 100th, as well as between the 1st case and the 50th per 1 million people.
Data prior to 15th February 2020 was not easily accessible at the country level from Worldometer. Therefore any dates prior to 15th February 2020 were sourced not from Worldometer but from reputable government and local media sources.
Blanks (null values) indicate that the country in question has not reached either 50 coronavirus cases per 1 million people or 100 coronavirus cases. These were left blank.
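The day-count columns described above can be derived from the date columns roughly as in the sketch below. The dates are hypothetical and chosen only for illustration, and the function name days_between is an assumption; a blank milestone is represented as None and propagates to a blank result.

```python
from datetime import date


def days_between(first_case, milestone):
    """Days from the first case to a milestone date, or None if either date is blank."""
    if first_case is None or milestone is None:
        return None
    return (milestone - first_case).days


# Hypothetical dates for illustration; not taken from the dataset.
first_case = date(2020, 2, 20)
hundredth_case = date(2020, 3, 5)

reached = days_between(first_case, hundredth_case)
not_reached = days_between(first_case, None)  # milestone not yet reached
print(reached, not_reached)
```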
I would like to acknowledge Worldometer for providing the vast majority of the data in this file. Worldometer is a website that provides real time statistics on topics such as coronavirus cases. Its sources include government official reports as well as trusted local media sources all of which are referenced on their website.
Hopefully this data can be used to better understand the growth of COVID-19 cases globally.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Multilingual Spoken Words Corpus is a large and growing audio dataset of spoken words in 50 languages collectively spoken by over 5 billion people, for academic research and commercial applications in keyword spotting and spoken term search, licensed under CC-BY 4.0. The dataset contains more than 340,000 keywords, totaling 23.4 million 1-second spoken examples (over 6,000 hours). The dataset has many use cases, ranging from voice-enabled consumer devices to call center automation. This dataset is generated by applying forced alignment on crowd-sourced sentence-level audio to produce per-word timing estimates for extraction. All alignments are included in the dataset.
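As a quick consistency check on the figures quoted above, the total duration implied by 23.4 million one-second examples can be computed directly. The variable names are illustrative only.

```python
# Figures taken from the description above.
examples = 23_400_000        # number of spoken examples
seconds_per_example = 1      # each example is 1 second long

hours = examples * seconds_per_example / 3600
print(hours)
```

The result is consistent with the stated "over 6,000 hours".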