Nursing Home Compare has detailed information about every Medicare and Medicaid nursing home in the country. A nursing home is a place for people who can’t be cared for at home and need 24-hour nursing care. These are the official datasets used on the Medicare.gov Nursing Home Compare Website provided by the Centers for Medicare & Medicaid Services. These data allow you to compare the quality of care at each of the more than 15,000 Medicare- and Medicaid-certified nursing homes nationwide.
This online application gives manufacturers the ability to compare Iowa to other states on a number of different topics including: business climate, education, operating costs, quality of life and workforce.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of odds ratio (OR) tables for the interaction of rs7522462 and rs11945978 in the WTCCC data with the shared controls (left) and the interaction of the proxy SNPs, rs296533 and rs2089509, in the IBDGC data (right). The legend of this table is the same as that of Table 3.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for WORLD reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
This Location Data & Foot traffic dataset available for all countries include enriched raw mobility data and visitation at POIs to answer questions such as:
-How often do people visit a location? (daily, monthly, absolute, and averages).
-What type of places do they visit? (parks, schools, hospitals, etc.)
-What social characteristics do visitors to a certain POI have? - Breakdown by type: residents, workers, visitors.
-What's their mobility like during night hours & day hours?
-What's the frequency of the visits partitioned by day of the week and hour of the day?
Extra insights -Visitors' relative income level. -Visitors' preferences as derived from their visits to shopping, parks, sports facilities, churches, among others.
Overview & Key Concepts Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process is compliant with applicable privacy laws.
We clean and process these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different data science and machine learning applications, especially those related to understanding customer behavior.
Featured attributes of the data Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.
Night base of the device: we calculate the approximate location where the device spends the night, which is usually its home neighborhood.
Day base of the device: we calculate the most common daylight location during weekdays, which is usually their work location.
Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.
POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.
Category of visited POI: for each observation that can be attributed to a POI, we also include a standardized location category (park, hospital, among others).
Coverage: Worldwide.
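As an illustration of how attributes such as device speed and night base can be derived from raw pings, here is a minimal pandas sketch. The file name, column names (device_id, timestamp, lat, lon), the 22:00-06:00 night window, and the grid-cell rounding are assumptions made for the example, not the provider's actual pipeline.

```python
import numpy as np
import pandas as pd

# Hypothetical ping table; the file and column names are assumptions.
pings = pd.read_csv("pings.csv", parse_dates=["timestamp"])
pings = pings.sort_values(["device_id", "timestamp"]).reset_index(drop=True)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

grp = pings.groupby("device_id")

# Device speed: distance to the previous observation divided by the elapsed time.
dist_km = haversine_km(pings["lat"], pings["lon"],
                       grp["lat"].shift(), grp["lon"].shift())
hours = grp["timestamp"].diff().dt.total_seconds() / 3600.0
pings["speed_kmh"] = dist_km / hours

# Night base: most frequent ~100 m grid cell observed between 22:00 and 06:00.
hour = pings["timestamp"].dt.hour
night = pings[(hour >= 22) | (hour < 6)].copy()
night["cell"] = list(zip(night["lat"].round(3), night["lon"].round(3)))
night_base = night.groupby("device_id")["cell"].agg(lambda s: s.value_counts().idxmax())
```

The income-level attribute described above would then follow by joining night_base against whatever socioeconomic data is available for those cells.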
Delivery schemas We can deliver the data in three different formats:
Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.
Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the most valuable elements, and it helps identify who visited a specific POI and characterize consumer behavior.
Audience profiles: one record per mobile device in a given period of time (usually monthly). The visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful for quickly understanding the types of consumers in a particular area and for creating cohorts of users.
https://creativecommons.org/publicdomain/zero/1.0/
The two datasets provided contain temperature and light sensor readings. The measurements were taken from 22:35:41 to 04:49:41.
The challenge with this dataset is to plot the measurements against time in datetime.time format when the measurements span midnight (22:35:41 - 04:49:41), without showing any date. The plot function automatically starts at 00:00 and places the data measured before midnight at the end of the plot.
The data types are: - Temperature dataset: - Time: datetime.time - Temperature: numpy.float64 - Light dataset: - Time: datetime.time - Light: numpy.float64
I hope you can help me solve this little plot mystery!
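One way to get the desired plot, assuming pandas and matplotlib and that the readings live in a CSV with Time and Temperature columns (the file name, column names, and the noon cut-off are assumptions), is to anchor every time on a dummy date, push the post-midnight readings onto the following day, and then format the x-axis so only the time is shown:

```python
import datetime as dt
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical file; "Time" holds strings like "22:35:41", "Temperature" is a float.
df = pd.read_csv("temperature.csv")
df["Time"] = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.time

anchor = dt.date(2000, 1, 1)  # arbitrary dummy date, never shown on the axis

def wrap_time(t: dt.time) -> dt.datetime:
    # Times after midnight belong to the "next" day so they sort after 22:35:41.
    # The 12:00 threshold works because measurements only run 22:35:41-04:49:41.
    day = anchor + dt.timedelta(days=1) if t < dt.time(12, 0) else anchor
    return dt.datetime.combine(day, t)

x = df["Time"].map(wrap_time)

fig, ax = plt.subplots()
ax.plot(x, df["Temperature"])
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))  # hide the dummy date
plt.show()
```

The same wrapping can be applied unchanged to the light dataset, since its Time column has the same format.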
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GENERAL INFORMATION
Title of Dataset: A dataset from a survey investigating disciplinary differences in data citation
Date of data collection: January to March 2022
Collection instrument: SurveyMonkey
Funding: Alfred P. Sloan Foundation
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: These data are available under a CC BY 4.0 license
Links to publications that cite or use the data:
Gregory, K., Ninkov, A., Ripp, C., Peters, I., & Haustein, S. (2022). Surveying practices of data citation and reuse across disciplines. Proceedings of the 26th International Conference on Science and Technology Indicators. International Conference on Science and Technology Indicators, Granada, Spain. https://doi.org/10.5281/ZENODO.6951437
Gregory, K., Ninkov, A., Ripp, C., Roblin, E., Peters, I., & Haustein, S. (2023). Tracing data: A survey investigating disciplinary differences in data citation. Zenodo. https://doi.org/10.5281/zenodo.7555266
DATA & FILE OVERVIEW
File List
Filename: MDCDatacitationReuse2021Codebookv2.pdf Codebook
Filename: MDCDataCitationReuse2021surveydatav2.csv Dataset format in csv
Filename: MDCDataCitationReuse2021surveydatav2.sav Dataset format in SPSS
Filename: MDCDataCitationReuseSurvey2021QNR.pdf Questionnaire
Additional related data collected that was not included in the current data package: Open ended questions asked to respondents
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data:
The development of the questionnaire (Gregory et al., 2022) was centered around the creation of two main branches of questions for the primary groups of interest in our study: researchers that reuse data (33 questions in total) and researchers that do not reuse data (16 questions in total). The population of interest for this survey consists of researchers from all disciplines and countries, sampled from the corresponding authors of papers indexed in the Web of Science (WoS) between 2016 and 2020.
We received 3,632 responses, 2,509 of which were completed, representing a completion rate of 68.6%. Incomplete responses were excluded from the dataset. The final total contains 2,492 complete responses, an uncorrected response rate of 1.57%. Controlling for invalid emails, bounced emails and opt-outs (n=5,201) produced a response rate of 1.62%, similar to surveys using comparable recruitment methods (Gregory et al., 2020).
Methods for processing the data:
Results were downloaded from SurveyMonkey in CSV format and were prepared for analysis using Excel and SPSS by recoding ordinal and multiple choice questions and by removing missing values.
Instrument- or software-specific information needed to interpret the data:
The dataset is provided in SPSS format, which requires IBM SPSS Statistics. The dataset is also available in a coded CSV format. The codebook is required to interpret the values.
DATA-SPECIFIC INFORMATION FOR: MDCDataCitationReuse2021surveydata
Number of variables: 95
Number of cases/rows: 2,492
Missing data codes: 999 = Not asked
Refer to MDCDatacitationReuse2021Codebook.pdf for detailed variable information.
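For anyone working with the CSV version, a minimal pandas sketch that respects the missing-data code above might look like this (pandas and the print checks are illustrative, not part of the data package):

```python
import pandas as pd

# Treat the documented missing-data code (999 = "Not asked") as NaN when loading.
# The file name matches the file list above; variable meanings come from the codebook.
survey = pd.read_csv("MDCDataCitationReuse2021surveydatav2.csv", na_values=[999])

print(survey.shape)          # expected: (2492, 95) per the counts above
print(survey.isna().sum())   # how often each variable was not asked / missing
```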
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 1 row and is filtered where the book is Ordinary difference-differential equations. It features 7 columns including author, publication date, language, and book publisher.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The AmsterTime dataset offers a collection of 2,500 well-curated image pairs capturing the same scenes: a present-day street-view image matched to a historical archival image from the city of Amsterdam. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existing benchmark datasets, AmsterTime is directly crowdsourced from a GIS navigation platform (Mapillary). All matching pairs were then verified by a human expert to confirm the correct matches and to evaluate human competence in the Visual Place Recognition (VPR) task for future reference.
The properties of the dataset are summarized as:
Two sub-tasks are created on the dataset:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for INDEX reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 1-minute interval data for selected stocks, cryptocurrencies, and forex pairs.
You can use this dataset for data analysis or even training a machine learning model for trading. :)
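As a hedged starting point, assuming each instrument is stored as a CSV with timestamp/open/high/low/close/volume columns (the file name and column layout are assumptions, not a documented schema), the 1-minute bars can be resampled before feature engineering:

```python
import pandas as pd

# Hypothetical per-symbol file with assumed column names.
bars = pd.read_csv("BTCUSD_1min.csv", parse_dates=["timestamp"], index_col="timestamp")

# Resample the 1-minute bars to 1-hour bars, a common first step before
# building features for a trading model.
hourly = bars.resample("1h").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)
hourly["return"] = hourly["close"].pct_change()
```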
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for 1000 reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
The dataset is designed to simulate password-related events, creating a synthetic representation of actions related to password management. It includes fields like timestamp, action, event type, location, IP address, password, hour, and time difference.
This synthetic dataset can be used for training and testing machine learning models related to cyber security, anomaly detection, or password management. It allows researchers and practitioners to experiment with data resembling real-world scenarios without compromising actual user information.
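A minimal sketch of such an experiment, assuming the events are stored as a CSV whose headers match the listed fields (the exact column names, the file name, and the choice of IsolationForest are assumptions, not part of the dataset description), could look like this:

```python
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import OrdinalEncoder

# Column names follow the fields listed above but are assumed, not documented.
events = pd.read_csv("password_events.csv", parse_dates=["timestamp"])

# Encode the categorical fields and keep the numeric ones (hour, time difference).
cat_cols = ["action", "event_type", "location"]
X_cat = OrdinalEncoder().fit_transform(events[cat_cols])
X = pd.concat(
    [pd.DataFrame(X_cat, columns=cat_cols), events[["hour", "time_difference"]]],
    axis=1,
)

# Flag the most unusual ~2% of events as potential anomalies.
iso = IsolationForest(contamination=0.02, random_state=0)
events["anomaly"] = iso.fit_predict(X)  # -1 = anomaly, 1 = normal
```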
https://spdx.org/licenses/etalab-2.0.html
Three projects (LANDES, REMORA, DEPECHEMOD) were conducted between 2007 and 2016 to quantify the atmospheric vertical fluxes of aerosol particles as a function of their size and the micrometeorological conditions, above different natural surfaces (maize, grassland, bare soil and forest). An original methodology based on the eddy correlation method and spectral analysis of the turbulent scalars in the atmospheric surface layer was used to quantify the vertical fluxes and dry deposition (Damay, 2010; Damay et al., 2009; Pellerin, 2017; Pellerin et al., 2017). This method uses two condensation particle counters (CPC 3788 and 3786, TSI) for the particle size range of 2.5-14 nm (Twin CPC device) and an Electrical Low-Pressure Impactor (ELPI, DEKATI) for particles between 7 nm and 1.2 µm (ELPI device). These devices were coupled with a sonic anemometer (Young 81000). Nine experimental campaigns were carried out in France, and emission and deposition fluxes of aerosol particles were quantified. Emission fluxes are probably due to particle growth by condensation of water vapor at canopy level. Deposition fluxes are due to dry deposition (Brownian diffusion, interception, impaction, gravitational settling) on surfaces. These datasets of vertical fluxes and dry deposition velocities can be used to validate atmospheric models for air pollution and climate change.
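At its core, the eddy-correlation (eddy-covariance) method estimates the vertical flux as the covariance between the fluctuations of vertical wind speed and scalar concentration over an averaging period. The sketch below illustrates only that central calculation with synthetic 10 Hz data; it omits the despiking, coordinate rotation, and density corrections a real processing chain would apply.

```python
import numpy as np

def eddy_covariance_flux(w, c):
    """Vertical turbulent flux of a scalar by the eddy-covariance method.

    w : vertical wind speed time series (m s-1), e.g. from the sonic anemometer
    c : particle number concentration time series (# m-3), e.g. from the CPC/ELPI
    Returns mean(w' * c'), the covariance of the fluctuations.
    Negative values indicate deposition, positive values emission.
    """
    w = np.asarray(w, dtype=float)
    c = np.asarray(c, dtype=float)
    return np.mean((w - w.mean()) * (c - c.mean()))

# Example with synthetic 10 Hz data over a 30-minute averaging period.
rng = np.random.default_rng(0)
n = 10 * 60 * 30
w = rng.normal(0.0, 0.3, n)
c = 5e9 - 2e9 * w + rng.normal(0, 1e9, n)   # anti-correlated -> downward flux
print(eddy_covariance_flux(w, c))            # negative: net deposition
```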
https://ega-archive.org/dacs/EGAC00001000179
Regions of common inter-individual DNA methylation differences in human monocytes – potential function and genetic basis WGBS Data of Samples: 43_Hm03_BlMo_Ct, 43_Hm02_BlMo_Ct, 43_Hm05_BlMo_Ct, 43_Hm01_BlMo_Ct For details about sequencing or sample metadata check http://deep.dkfz.de/
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
The storage and persistence of soil organic matter (SOM) is of critical importance to soil health and to the terrestrial carbon cycle, with implications for long-term climate change. To better understand the spatio-temporal controls on SOM, we have developed a new dataset spanning two previously described marine terrace soil chronosequences from northern California, USA: the Santa Cruz and the Mattole River chronosequences. Each of these sites comprises several terrace surfaces spanning at least 200 ka of soil development. The sites differ with regard to local precipitation, with the Mattole site receiving nearly double the mean annual precipitation of the Santa Cruz site. During the period from 2011 through 2016, we collected and analyzed samples from eight soil pits (four at each chronosequence site) in order to facilitate the comparison of SOM content and isotopic composition with other soil biogeochemical measurements. In the resulting climo-chronosequence dataset, we repor ...
Published at the 12th AIAA Aviation Technology, Integration, and Operations (ATIO) Conference and 14th AIAA/ISSM, 17-19 September 2012, Indianapolis, Indiana
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for G reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
The NADAC Weekly Comparison identifies the drug products with current NADAC rates that are replaced with new NADAC rates. Other changes (e.g. NDC additions and terminations) to the NADAC file are not reflected in this comparison. Note: Effective Date was not recorded in the dataset until 6/7/2017
There has been a tremendous increase in the volume of sensor data collected over the last decade for different monitoring tasks. For example, petabytes of earth science data are collected from modern satellites, in-situ sensors and different climate models. Similarly, huge amounts of flight operational data are downloaded for different commercial airlines. These different types of datasets need to be analyzed to find outliers. Information extraction from such rich data sources using advanced data mining methodologies is a challenging task, not only due to the massive volume of data, but also because these datasets are physically stored at different geographical locations with only a subset of features available at any location. Moving these petabytes of data to a single location may waste a lot of bandwidth. To solve this problem, in this paper, we present a novel algorithm which can identify outliers in the entire data without moving all the data to a single location. The method we propose only centralizes a very small sample from the different data subsets at different locations. We analytically prove and experimentally verify that the algorithm offers high accuracy compared to complete centralization with only a fraction of the communication cost. We show that our algorithm is highly relevant to both earth sciences and aeronautics by describing applications in these domains. The performance of the algorithm is demonstrated on two large publicly available datasets: (1) the NASA MODIS satellite images and (2) a simulated aviation dataset generated by the ‘Commercial Modular Aero-Propulsion System Simulation’ (CMAPSS).
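To make the sample-then-score idea concrete, here is a generic sketch (not the paper's exact algorithm): each site ships only a small random sample to the coordinator, and every site then flags its own points that lie far from everything in the pooled sample. The synthetic data, the 500-point sample size, and the nearest-neighbour distance score are all illustrative assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(42)

# Data is partitioned across sites and never moved in full (4 synthetic sites here).
sites = [rng.normal(0, 1, size=(100_000, 3)) for _ in range(4)]

# 1) Each site centralizes only a small random sample (a fraction of the communication cost).
def local_sample(data, k):
    idx = rng.choice(len(data), size=k, replace=False)
    return data[idx]

shared = np.vstack([local_sample(d, k=500) for d in sites])
tree = cKDTree(shared)

# 2) Each site scores its own points against the pooled sample; points far from every
#    sampled point are reported to the coordinator as outlier candidates.
candidates = []
for d in sites:
    dist, _ = tree.query(d)                 # distance to nearest sampled point
    cutoff = np.quantile(dist, 0.999)       # local top 0.1% by distance
    candidates.append(d[dist >= cutoff])
```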