Psychological scientists increasingly study web data, such as user ratings or social media postings. However, whether research relying on such web data leads to the same conclusions as research based on traditional data is largely unknown. To test this, we (re)analyzed three datasets, thereby comparing web data with lab and online survey data. We calculated correlations across these different datasets (Study 1) and investigated identical, illustrative research questions in each dataset (Studies 2 to 4). Our results suggest that web and traditional data are not fundamentally different and usually lead to similar conclusions, but also that it is important to consider differences between data types such as populations and research settings. Web data can be a valuable tool for psychologists when accounting for such differences, as it allows for testing established research findings in new contexts, complementing them with insights from novel data sources.
Nursing Home Compare has detailed information about every Medicare and Medicaid nursing home in the country. A nursing home is a place for people who can’t be cared for at home and need 24-hour nursing care. These are the official datasets used on the Medicare.gov Nursing Home Compare Website provided by the Centers for Medicare & Medicaid Services. These data allow you to compare the quality of care at every Medicare and Medicaid-certified nursing home in the country, including over 15,000 nationwide.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset consists of the top 50 most visited websites in the world, as well as the category and principal country/territory for each site. The data provides insights into which sites are most popular globally, and what type of content is most popular in different parts of the world.
This dataset can be used to track the most popular websites in the world over time. It can also be used to compare website popularity between different countries and categories.
- To track the most popular websites in the world over time
- To see how website popularity changes by region
- To find out which website categories are most popular
Dataset by Alexa Internet, Inc. (2019), released on Kaggle under the Open Data Commons Public Domain Dedication and License (ODC-PDDL)
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: df_1.csv

| Column name | Description |
|:--------------------------------|:---------------------------------------------------------------------|
| Site | The name of the website. (String) |
| Domain Name | The domain name of the website. (String) |
| Category | The category of the website. (String) |
| Principal country/territory | The principal country/territory where the website is based. (String) |
How prevalent is sports betting across the United States? This dataset provides information on the legal status of sports betting, revenue generated by sports betting, the number of sports betting outlets, and more. Use this dataset to compare the revenue generated by sports betting across different states.
This dataset can be used to understand the prevalence of sports betting across the United States and to compare the revenue generated by sports betting across states.
File: New Jersey.csv

| Column name | Description |
|:------------------|:--------------------------------------------------------------|
| date | The date of the data. (Date) |
| New Jersey | The amount of money bet on sports in New Jersey. (Numeric) |
| Pennsylvania | The amount of money bet on sports in Pennsylvania. (Numeric) |
| Delaware | The amount of money bet on sports in Delaware. (Numeric) |
| Mississippi | The amount of money bet on sports in Mississippi. (Numeric) |
| Nevada | The amount of money bet on sports in Nevada. (Numeric) |
| Rhode Island | The amount of money bet on sports in Rhode Island. (Numeric) |
| West Virginia | The amount of money bet on sports in West Virginia. (Numeric) |
| Arkansas | The amount of money bet on sports in Arkansas. (Numeric) |
| New York | The amount of money bet on sports in New York. (Numeric) |
| Iowa | The amount of money bet on sports in Iowa. (Numeric) |
| Indiana | The amount of money bet on sports in Indiana. (Numeric) |
| Oregon | The amount of money bet on sports in Oregon. (Numeric) |
| New Hampshire | The amount of money bet on sports in New Hampshire. (Numeric) |
| Michigan | The amount of money bet on sports in Michigan. (Numeric) |
| Montana | The amount of money bet on sports in Montana. (Numeric) |
| Colorado | The amount of money bet on sports in Colorado. (Numeric) |
| Washington DC | The amount of money bet on sports in Washington DC. (Numeric) |
| Illinois | The amount of money bet on sports in Illinois. (Numeric) |
| Tennessee | The amount of money bet on sports in Tennessee. (Numeric) |
File: PopulationStates.csv

| Column name | Description |
|:--------------|:----------------------------------------------------|
| State | The state in which the data was collected. (String) |

File: homeless.csv

| Column name | Description |
|:----------------|:----------------------------------------------------|
| year | The year the data was collected. (Integer) |
| unsheltered | The number of people who are unsheltered. (Integer) |
File: income.csv

| Column name | Description |
|:------------------|:--------------------------------------------------------------|
| Pennsylvania | The amount of money bet on sports in Pennsylvania. (Numeric) |
| Delaware | The amount of money bet on sports in Delaware. (Numeric) |
| Mississippi | The amount of money bet on sports in Mississippi. (Numeric) |
| Nevada | The amount of money bet on sports in Nevada. (Numeric) |
| Rhode Island | The amount of money bet on sports in Rhode Island. (Numeric) |
| West Virginia | The amount of money bet on sports in West Virginia. (Numeric) |
| Arkansas | The amount of money bet on sports in Arkansas. (Numeric) |
| New York | The amount of money bet on sports in New York. (Numeric) |
| Iowa | The amount of money bet on sports in Iowa. (Numeric) |
| Indiana | The amount of money bet on sports in Indiana. (Numeric) |
| New Hampshire | The amount of money bet on sports in New Hampshire. (Numeric) |
| Michigan | The amount of money bet on sports in Michigan. (Numeric) |
| Colorado | The amount of money bet on sports in Colorado. (Numeric) |
| Washington DC | The amount of money bet on sports in Washington DC. (Numeric) |
| Illinois | The amount of money bet on sports in Illinois. (Nume...
The Urban Observatory Compare app shows maps of the same subject for three cities, in a side by side comparison view. The app allows quick visual comparisons of the patterns at work in cities around the world. The app allows people to interact with rich datasets for each city. People can use the Urban Observatory web application to easily compare cities by using a simple web browser. As a user zooms in to one digital city map, other city maps will zoom in parallel, revealing similarities and differences in density and distribution. For instance, a person can simultaneously view traffic density for Abu Dhabi and Paris or simultaneously view vegetation in London and Tokyo. The Urban Observatory is brought to you by Richard Saul Wurman, creator of Technology/Entertainment/Design (TED) and 19.20.21; Jon Kamen of the Academy Award-, Emmy Award-, and Golden Globe Award-winning film company @radical.media; and Esri president Jack Dangermond. "A map is a pattern made understandable, and patterns must be compared to understand successes, failures, and opportunities of our global cities," says Wurman. "The Urban Observatory demonstrates this new paradigm, using cartographic language and constructive data display. People and cities can use maps as a common language," said Wurman. The application utilizes Esri's ArcGIS API for JavaScript. Once a web map is created, it is added to a group and tagged to indicate its city and subject information. Those tags are read by the application as it starts up in the browser.
PIECE is a plant gene structure comparison and evolution database with 25 species. Annotated genes extracted from the species are classified based on the Pfam motif and phylogenetic trees are reconstructed for each gene category integrating exon-intron and protein motif information. Resources in this dataset: Resource Title: Web Page. File Name: Web Page, url: https://probes.pw.usda.gov/piece/index.php
Child Welfare Policies and Demographic Characteristics: A Compilation of State-Level Data is a suite of datasets gathered from various sources. All datasets in this suite contain information about states. It is intended to be a resource for researchers doing policy studies in the areas of foster care, adoption, and child abuse, and is intended as a supplement to the AFCARS and NCANDS datasets. It consists of five studies, their data, and final reports (if any). The common thread linking this suite of datasets is that the level of analysis is always states. This information can be used to group or classify states in some domain, coupled with using the AFCARS or NCANDS data to explore how states or groups of states compare. The intention is that this process will increase the value of AFCARS and NCANDS for analyzing the effects of policy differences across states. Most of the data were gleaned from reports published by academic or public interest organizations, such as The Urban Institute, the North American Council on Adoptable Children, or the John F. Kennedy School of Government at Harvard University. Each of these reports is available at the organization's web site, and is included in the files that accompany this User Guide in PDF format. The value of this compilation is in providing the data in a form that is readily readable by statistical programs such as SAS, SPSS, and Stata, and in compiling in one place the descriptions of the variables and values contained in the reports. Other data in this suite were collected from the United States Bureau of the Census and Wikipedia, a web-based encyclopedia.
Investigators: Hansen, Mary & Dineen, Michael
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comprehensive dataset containing 6 verified SmarterHome.ai - Compare Local Internet Deals locations in Alabama, United States with complete contact information, ratings, reviews, and location data.
Comparison of Online Tools.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
MapViewer is a graphical tool for viewing and comparing Gossypium spp. genetic maps. It includes dynamically scrollable maps, correspondence matrices, dot plots, links to details about map features, and exporting functionality. It was developed by the MainLab at Washington State University and is available for download for use in other Tripal databases. The query interface allows the user to select Species, Map, and Linkage Group options. Help information includes a video tutorial, user manual, and sample map, correspondence matrix, dot plot, and exported figures. Resources in this dataset: Resource Title: Website Pointer for CottonGen Map Viewer. File Name: Web Page, url: https://www.cottongen.org/MapViewer
The Federal/State Tribal Data Comparison web map can be used to compare the reservation boundaries that appear on the Minnesota State Highway Map with the U.S. Census Bureau reservation boundaries. This map also shows off-reservation trust land owned by tribes. The map is for informational purposes only. It is not a land survey and does not contain coordinate-correct data. The boundaries shown do not constitute recognition, endorsement, or acceptance by MnDOT or the State of Minnesota.
The Trivago dataset comes from a real-world task in the travel metasearch domain. Users who are planning a business or leisure trip can use Trivago's website to compare accommodations and prices from various booking sites.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset extracted from the post Mutual Funds Comparison Calculator – 2025 | Free and Fast Online Tool | How to Use it on Smart Investello.
https://academictorrents.com/nolicensespecified
https://creativecommons.org/publicdomain/zero/1.0/
This dataset labels whether a page is a blog or not, based on website URLs. Most of the features are taken from the article cited as [1], which can be consulted for detailed information. Information about features not included in this dataset will be added soon.
[1] Vrbančič, G., Fister Jr, I., & Podgorelec, V. (2020). Datasets for phishing websites detection. Data in Brief, 33, 106438.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Compiled in mid-2022, this dataset contains the raw data file, randomized ranked lists of R1 and R2 research institutions, and files created to support data visualization for Elizabeth Szkirpan's 2022 study regarding availability of data services and research data information via university libraries for online users. Files are available in Microsoft Excel formats.
Comparison of missing values, ‘don’t know’ values and inconsistent values between the paper-and-pencil and web-based mode and number of data entry mistakes in the paper-and-pencil mode (n = 149).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
More and more customers consult online product reviews and comments on the Web when deciding to buy one product over another. In this context, sentiment analysis techniques are the traditional way to summarize users' opinions that criticize or highlight the positive aspects of a product. Sentiment analysis of reviews usually relies on extracting positive and negative aspects of products, neglecting comparative opinions. Such opinions do not directly express a positive or negative view but contrast aspects of products from different competitors.
Here, we present the first effort to study comparative opinions in Portuguese, creating two new Portuguese datasets with comparative sentences labeled by three human annotators. This repository consists of three important files: (1) a lexicon that contains words frequently used to make a comparison in Portuguese; (2) a Twitter dataset with labeled comparative sentences; and (3) a Buscapé dataset with labeled comparative sentences.
The lexicon is a set of 176 words frequently used to express a comparative opinion in the Portuguese language. The lexicon is used as a filter to build two datasets of comparative sentences from two important contexts: (1) online social networks; and (2) product reviews.
For Twitter, we collected all Portuguese tweets published in Brazil on 2018/01/10 and kept only the tweets that contained at least one keyword present in the lexicon, obtaining 130,459 tweets. Our work is based on the sentence level. Thus, all sentences were extracted and a sample of 2,053 sentences was created, which was manually labeled by three human annotators, reaching 83.2% agreement according to Fleiss' kappa coefficient. For Buscapé, a Brazilian website (https://www.buscape.com.br/) used to compare product prices on the web, the same methodology was applied, creating a set of 2,754 labeled sentences obtained from comments made in 2013. This dataset was labeled by three human annotators, reaching 83.46% agreement according to Fleiss' kappa coefficient.
The Twitter dataset has 2,053 labeled sentences, of which 918 are comparative. The Buscapé dataset has 2,754 labeled sentences, of which 1,282 are comparative.
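The agreement figures above are reported with Fleiss' kappa. The source does not say how the computation was carried out, so the following is only a minimal sketch of the standard formula, assuming each sentence is summarized as a row of counts (how many of the three annotators chose each category). The function name `fleiss_kappa` and the toy count matrices are illustrative and not part of the dataset.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a matrix where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical helper)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must be rated by the same number of raters")

    n_categories = len(counts[0])
    total = n_items * n_raters

    # Share of all assignments that fall into each category.
    p_cat = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per item, then averaged over items.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items

    # Agreement expected by chance.
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")

    return (p_observed - p_expected) / (1 - p_expected)


# Toy example: three raters, two categories, full agreement on every item.
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
kappa_perfect = fleiss_kappa(perfect)
```

With three annotators and the five `type` values, each row would have five counts summing to three.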
The datasets contain these labeled properties:
text: the sentence extracted from the review comment.
entity_s1: the first entity compared in the sentence.
entity_s2: the second entity compared in the sentence.
keyword: the comparative keyword used in the sentence to express comparison.
preferred_entity: the preferred entity.
id_start: the keyword's initial position in the sentence.
id_end: the keyword's final position in the sentence.
type: the sentence label, which specifies whether the phrase is a comparison.
Additional Information:
1 - The sentences were separated using a sentence tokenizer.
2 - If the compared entity is not specified, the field is set to "_".
3 - The property "type" can take five values:
0: Non-comparative (Não Comparativa).
1: Non-Equal-Gradable (Gradativa com Predileção).
2: Equative (Equitativa).
3: Superlative (Superlativa).
4: Non-Gradable (Não Gradativa).
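As a rough illustration of how the properties listed above might be read, here is a minimal sketch that parses rows, converts the "_" placeholder to `None`, and keeps only comparative sentences (any `type` other than 0). The file layout (comma-separated, with these column names as a header), the two sample rows, and their position values are assumptions made for the example; they are not taken from the dataset.

```python
import csv
import io

# Portuguese names of the five "type" values, as listed in the description.
TYPE_LABELS = {
    0: "Não Comparativa",
    1: "Gradativa com Predileção",
    2: "Equitativa",
    3: "Superlativa",
    4: "Não Gradativa",
}

# Hypothetical rows for illustration only; the real files may use a
# different delimiter, column order, or position convention.
SAMPLE = """text,entity_s1,entity_s2,keyword,preferred_entity,id_start,id_end,type
O celular A é melhor que o celular B,celular A,celular B,melhor,celular A,14,20,1
Gostei muito do produto,_,_,muito,_,7,12,0
"""


def load_rows(handle):
    """Read rows, cast "type" to int, and map the "_" placeholder to None."""
    rows = []
    for row in csv.DictReader(handle):
        row["type"] = int(row["type"])
        for field in ("entity_s1", "entity_s2", "preferred_entity"):
            if row[field] == "_":
                row[field] = None
        rows.append(row)
    return rows


def comparative_only(rows):
    """Keep sentences whose type is anything other than 0 (non-comparative)."""
    return [row for row in rows if row["type"] != 0]


rows = load_rows(io.StringIO(SAMPLE))
comparative = comparative_only(rows)
for row in comparative:
    print(TYPE_LABELS[row["type"]], "|", row["keyword"], "|", row["preferred_entity"])
```

To use it on a real file, `io.StringIO(SAMPLE)` would be replaced by an open file handle, after checking the actual delimiter and header.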
If you use this data, please cite our paper as follows:
"Daniel Kansaon, Michele A. Brandão, Julio C. S. Reis, Matheus Barbosa, Breno Matos, and Fabrício Benevenuto. 2020. Mining Portuguese Comparative Sentences in Online Reviews. In Brazilian Symposium on Multimedia and the Web (WebMedia ’20), November 30-December 4, 2020, São Luís, Brazil. ACM, New York, NY, USA, 8 pages. https://doi.org/10.1145/3428658.3431081"
--------------
Plus Information:
We make the raw sentences available in the dataset to allow future work to test different pre-processing steps. Therefore, to obtain the exact sentences used in the paper above, you must reproduce the pre-processing step described in the paper (Figure 2).
For each sentence with more than one keyword in the dataset:
Note that the final processed sentence can have more than six words because the stopwords are not counted as part of the range.
Background
The objective of the study is to examine the reliability of erosion and joint space narrowing scores derived from hand x-rays posted on the Internet compared to scores derived from original plain x-rays.
Methods
Left and right x-rays of the hands of 36 patients were first digitized and then posted in standard fashion to a secure Internet website. Both the plain and Internet x-rays were scored for erosions and joint space narrowing using the Sharp/Genant method. All scoring was completed in a blind and randomized manner. Agreement between plain and Internet x-ray scores was calculated using Lin's concordance correlations and Bland-Altman graphical representation.
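The two agreement measures named in the Methods can be sketched as follows. This is an illustration of the standard definitions, not the study's own analysis code: `lins_ccc` uses population (1/n) moments, `bland_altman_limits` uses the conventional mean difference plus or minus 1.96 sample standard deviations, and the score lists are made-up numbers.

```python
from statistics import fmean, stdev


def lins_ccc(x, y):
    """Lin's concordance correlation coefficient for paired scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two values")
    n = len(x)
    mean_x, mean_y = fmean(x), fmean(y)
    # Population (1/n) variances and covariance.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denominator = var_x + var_y + (mean_x - mean_y) ** 2
    if denominator == 0:
        raise ValueError("concordance is undefined for identical constant scores")
    return 2 * cov_xy / denominator


def bland_altman_limits(x, y):
    """Mean difference (y - x) and the 95% limits of agreement."""
    diffs = [b - a for a, b in zip(x, y)]
    bias = fmean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up scores: the second reading is always one point higher.
plain_scores = [1, 2, 3]
internet_scores = [2, 3, 4]
ccc = lins_ccc(plain_scores, internet_scores)
bias, lower, upper = bland_altman_limits(plain_scores, internet_scores)
```

A constant offset between the two readings lowers the concordance even though the scores are perfectly correlated, which is why concordance rather than plain correlation is used for agreement.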
Results
Erosion scores for plain x-rays showed almost perfect concordance with x-rays read on the Internet (concordance 0.887). However, joint space narrowing scores were only "fair" (concordance 0.365). Global scores demonstrated substantial concordance between plain and Internet readings (concordance 0.769). Hand x-rays with less disease involvement showed a tendency to be scored higher on the Internet versions than those with greater disease involvement. This was primarily evident in the joint space narrowing scores.
Conclusions
The Internet represents a valid medium for displaying and scoring hand x-rays of patients with RA. Higher scores from the Internet version may be related to better viewing conditions on the computer screen relative to the plain x-ray viewing, which did not include a magnifying lens or bright light. The capability to view high quality x-rays on the Internet has the potential to facilitate information sharing and education and to encourage collaborative studies.
State estimates for these years are no longer available due to methodological concerns with combining 2019 and 2020 data. We apologize for any inconvenience or confusion this may cause. Because of the COVID-19 pandemic, most respondents answered the survey via the web in Quarter 4 of 2020, even though all responses in Quarter 1 were from in-person interviews. It is known that people may respond to the survey differently while taking it online, thus introducing what is called a mode effect. When the state estimates were released, it was assumed that the mode effect was similar for different groups of people. However, later analyses have shown that this assumption should not be made. Because of these analyses, along with concerns about the rapid societal changes in 2020, it was determined that averages across the two years could be misleading. For more detail on this decision, see the 2019-2020 state data page.