100+ datasets found

Data from: STRATEGY FOR EXTRACTION OF FOURSQUARE’S SOCIAL MEDIA GEOGRAPHIC...
scielo.figshare.com
jpeg
Updated May 31, 2023
Cite
Paula Fernandez Costa; Irving da Silva Badolato; Rogério Luís Ribeiro Borba; Julia Celia Mercedes Strauch (2023). STRATEGY FOR EXTRACTION OF FOURSQUARE’S SOCIAL MEDIA GEOGRAPHIC INFORMATION THROUGH DATA MINING [Dataset]. http://doi.org/10.6084/m9.figshare.8031641.v1
Explore at:
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.8031641.v1
Dataset updated
May 31, 2023
Dataset provided by
SciELO (http://www.scielo.org/)
Authors
Paula Fernandez Costa; Irving da Silva Badolato; Rogério Luís Ribeiro Borba; Julia Celia Mercedes Strauch
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: The aim of this paper is the acquisition of geographic data from the Foursquare application, using data mining to perform exploratory and spatial analyses of the distribution of tourist attractions and their density in Rio de Janeiro city. In accordance with the Extraction, Transformation, and Load methodology, three research algorithms were developed using a hierarchical tree structure to collect information for the categories of Museums, Monuments and Landmarks, Historic Sites, Scenic Lookouts, and Trails in the Foursquare database. A quantitative analysis of check-ins per neighborhood of Rio de Janeiro city was performed, and kernel density (hot spot) maps were generated. The results presented in this paper show the need for a data filtering process (less than 50% of the mined data were used) and that a large part of the density of the Museums, Historic Sites, and Monuments and Landmarks categories is in the center of the city, while the Scenic Lookouts and Trails categories predominate in the south zone. This kind of analysis was shown to be a tool to support the city's tourism management with respect to the spatial location of these categories, the tourists’ evaluations of the places, and the frequency of the target public.
The top 10 clusters of innovativeness research named using the dominant...
plos.figshare.com
xls
Updated Jun 21, 2023
Cite
Yousif Elsamani; Cristian Mejia; Yuya Kajikawa (2023). The top 10 clusters of innovativeness research named using the dominant theme with the most important quantitative data (number of articles, average publication year, top three journals, and number of articles in each journal) until 2021. [Dataset]. http://doi.org/10.1371/journal.pone.0280005.t003
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0280005.t003
Dataset updated
Jun 21, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Yousif Elsamani; Cristian Mejia; Yuya Kajikawa
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The top 10 clusters of innovativeness research named using the dominant theme with the most important quantitative data (number of articles, average publication year, top three journals, and number of articles in each journal) until 2021.
Bitcoin data part two from Jan 2009 to Feb 2018
kaggle.com
zip
Updated Apr 17, 2020
+ more versions
Cite
ZouJiu (2020). Bitcoin data part two from Jan 2009 to Feb 2018 [Dataset]. https://www.kaggle.com/shiheyingzhe/bitcoin-data-part-two-from-jan-2009-to-feb-2018
Explore at:
Available download formats: zip (10311105755 bytes)
Dataset updated
Apr 17, 2020
Authors
ZouJiu
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
During my senior year at Shan Dong University, my tutor gave me bitcoin transaction data analysis as the research direction for my university thesis, so I crawled all bitcoin transaction data from January 2009 to February 2018. I performed statistical and quantitative analysis on it. I hope this data will be of some help to you; data mining is interesting and helpful, not only as a skill but also in our lives.

I crawled this data from the website https://www.blockchain.com/explorer. Each file contains many blocks, and the range of blocks is reflected in the file name; for example, the file 0-68732.csv covers block 0 (also called the genesis block) through block 68732. A block that has no input is not included in the file. Each file has five columns: the Height column gives the block height, the Input column gives the input address of the block, the Output column gives the output address of the block, the Sum column gives the bitcoin transaction amount corresponding to the Output, and the Time column gives the generation time of the block. A block contains many transactions.

This page is only part two of the full data; the other parts can be found at https://www.kaggle.com/shiheyingzhe/datasets
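As a rough illustration of the five-column layout described above (Height, Input, Output, Sum, Time), the following minimal sketch totals the Sum column per block height. The sample rows, the addresses, the comma delimiter, and the timestamp format are all hypothetical and would need to be checked against an actual downloaded file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows. The column names follow the description;
# the delimiter, address strings, and time format are assumptions.
SAMPLE = """Height,Input,Output,Sum,Time
170,addrA,addrB,10.0,2009-01-12 03:30:25
170,addrA,addrA,40.0,2009-01-12 03:30:25
181,addrA,addrC,30.0,2009-01-12 06:02:13
"""


def total_by_height(handle):
    """Add up the Sum column for each block height."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        totals[int(row["Height"])] += float(row["Sum"])
    return dict(totals)


totals = total_by_height(io.StringIO(SAMPLE))
print(totals)
```

To run this on a real file, the `io.StringIO(SAMPLE)` argument could be swapped for an opened file such as `open("0-68732.csv", newline="")`, assuming the file has a header row with these column names.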
Additional file 1 of Novel methods of qualitative analysis for health policy...
springernature.figshare.com
txt
Updated Jun 1, 2023
Cite
Mireya Martínez-García; Maite Vallejo; Enrique Hernández-Lemus; Jorge Alberto Álvarez-Díaz (2023). Additional file 1 of Novel methods of qualitative analysis for health policy research [Dataset]. http://doi.org/10.6084/m9.figshare.7587416.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.7587416.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Mireya Martínez-García; Maite Vallejo; Enrique Hernández-Lemus; Jorge Alberto Álvarez-Díaz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Interactive network files with all statistical and topological analyses. This is a Cytoscape .cys session file. To open, view, or modify it, please use the freely available Cytoscape software platform, available at http://www.cytoscape.org/download.php. (SIF 3413 kb)
Which of the Five Types of Data Science Does Your Startup Need? - Data...
tomtunguz.com
Updated Oct 2, 2013
Cite
Tomasz Tunguz (2013). Which of the Five Types of Data Science Does Your Startup Need? - Data Analysis [Dataset]. https://tomtunguz.com/data-science-types/
Dataset updated
Oct 2, 2013
Dataset provided by
Theory Ventures
Authors
Tomasz Tunguz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Discover the 5 distinct types of data scientists your startup needs, from quantitative PhDs to operational analysts. Learn which role best fits your company's growth stage.
SPHERE: Students' performance dataset of conceptual understanding,...
data.mendeley.com
Updated Jan 15, 2025
+ more versions
Cite
Purwoko Haryadi Santoso (2025). SPHERE: Students' performance dataset of conceptual understanding, scientific ability, and learning attitude in physics education research (PER) [Dataset]. http://doi.org/10.17632/88d7m2fv7p.2
Unique identifier
https://doi.org/10.17632/88d7m2fv7p.2
Dataset updated
Jan 15, 2025
Authors
Purwoko Haryadi Santoso
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
SPHERE is a dataset of students' performance in physics education research. It is presented as a multi-domain learning dataset of students’ performance in physics that has been collected through several research-based assessments (RBAs) established by the physics education research (PER) community. A total of 497 eleventh-grade students were involved, from three large and one small public high school located in a suburban district of a highly populated province in Indonesia. Some variables related to demographics, accessibility to literature resources, and students’ physics identity are also investigated. The RBAs used in this dataset were selected based on concepts learned by the students in the Indonesian physics curriculum. We commenced the survey of students’ understanding of Newtonian mechanics at the end of the first semester using the Force Concept Inventory (FCI) and the Force and Motion Conceptual Evaluation (FMCE). In the second semester, we assessed the students’ scientific abilities and learning attitude through the Scientific Abilities Assessment Rubrics (SAAR) and the Colorado Learning Attitudes about Science Survey (CLASS), respectively. The conceptual assessments continued in the second semester with the Rotational and Rolling Motion Conceptual Survey (RRMCS), Fluid Mechanics Concept Inventory (FMCI), Mechanical Waves Conceptual Survey (MWCS), Thermal Concept Evaluation (TCE), and Survey of Thermodynamic Processes and First and Second Laws (STPFaSL). We expect SPHERE to be a valuable dataset for supporting the advancement of the PER field, particularly in quantitative studies. For example, there is a need to help advance research on using machine learning and data mining techniques in PER, which might face challenges due to the lack of datasets built specifically for PER studies.
SPHERE can be reused as a students’ performance dataset on physics specifically dedicated to PER scholars who might be willing to implement machine learning techniques in physics education.
Dataset for: Fifty years of research on questionable research practices in...
data.niaid.nih.gov
search.dataone.org
+1 more
zip
Updated Sep 26, 2023
Cite
Michelle Jin Yee Neoh; Alessandro Carollo; Albert Lee; Gianluca Esposito (2023). Dataset for: Fifty years of research on questionable research practices in science: Quantitative analysis of co-citation patterns [Dataset]. http://doi.org/10.5061/dryad.2fqz612tx
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.2fqz612tx
Dataset updated
Sep 26, 2023
Dataset provided by
Nanyang Technological University
University of Trento
Authors
Michelle Jin Yee Neoh; Alessandro Carollo; Albert Lee; Gianluca Esposito
License
https://spdx.org/licenses/CC0-1.0.html
Description
Questionable research practices (QRPs) have been the focus of the scientific community amid greater scrutiny and evidence highlighting issues with replicability across many fields of science. To capture the most impactful publications and the main thematic domains in the literature on QRPs, this study uses a document co-citation analysis. The analysis was conducted on a sample of 341 documents that covered the past 50 years of research in QRPs. Nine major thematic clusters emerged. Statistical reporting and statistical power emerged as key areas of research, where systemic-level factors in how research is conducted are consistently raised as the precipitating factors for QRPs. There is also an encouraging shift in the focus of research into open science practices designed to address engagement in QRPs. Such a shift is indicative of the growing momentum of the open science movement, and more research can be conducted on how these practices are employed on the ground and how their uptake by researchers can be further promoted. However, the results suggest that, while pre-registration and registered reports receive the most research interest, less attention has been paid to other open science practices (e.g., data and methods sharing).

Methods: All data were downloaded from the Scopus platform on 6 February 2023. Data were retrieved using the string TITLE-ABS-KEY("questionable research practice*"). The files contain information for 341 documents published between 1974–2023.
Data from: The use of project portfolios in effective strategy execution to...
researchdata.up.ac.za
zip
Updated May 31, 2023
+ more versions
Cite
Palesa Agnes Ramashala (2023). The use of project portfolios in effective strategy execution to improve business value [Dataset]. http://doi.org/10.25403/UPresearchdata.13280141.v3
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.25403/UPresearchdata.13280141.v3
Dataset updated
May 31, 2023
Dataset provided by
University of Pretoria
Authors
Palesa Agnes Ramashala
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Qualitative data gathered from interviews conducted with four case organisations. The data were analysed using a qualitative data analysis tool (Atlas.ti) to code the responses and generate network diagrams; software such as Atlas.ti 8 for Windows is helpful for viewing these results. The details of the responses from respondents at the case organisations are captured. The data gathered during the interview sessions are captured in tabular form, and graphs were also created to identify trends. The study also includes a desktop review of the case organisations, carried out using published annual reports covering a period of more than seven years. The analysis was done given the scope of the project and its constructs.
Code from: Beyond the classroom: Alicia’s multivariate journey
search.dataone.org
Updated Nov 27, 2025
Cite
Allison Theobold (2025). Code from: Beyond the classroom: Alicia’s multivariate journey [Dataset]. http://doi.org/10.5061/dryad.c59zw3rg6
Unique identifier
https://doi.org/10.5061/dryad.c59zw3rg6
Dataset updated
Nov 27, 2025
Dataset provided by
Dryad Digital Repository
Authors
Allison Theobold
Description
The importance of data science skills for modern scientific research cannot be overstated. Although policy documents increasingly recommend what skills should be included in undergraduate statistics and data science curricula, little is known about how students actually develop and apply these skills. This paper addresses this gap through an in-depth case study tracing one student’s learning progressions throughout her master’s program. Using a qualitative method to analyze student code, which has seen little use in statistics education research, I examined how Alicia transferred the data science skills from her applied statistics course into authentic research settings. The analysis shows that, while Alicia successfully navigated new challenges, she encountered persistent hurdles when extending bivariate techniques into multivariate contexts, particularly with visualizations and summary statistics. These findings highlight the obs...

R Script files submitted by Alicia (pseudonym) over the course of the study. The files are named according to when they were submitted:

December 2018: R Script #1

April 2019: R Script #1 (revised), R Script #2

September 2019: R Script #1 (revised), R Script #2 (revised)

Qualitative Data Analysis Files (rich text files): December 2018 Script #1, April 2019 Script #1, April 2019 Script #2, September 2019 Script #1, September 2019 Script #2

Quantitative Data Analysis Files: r-code-themes.csv

Comma separated values file with separate sheets for each R script. Each sheet contains the qualitative code assigned to each line of code and whether the code contained errors.

Code from: Beyond the classroom: Alicia’s multivariate journey

https://doi.org/10.5061/dryad.c59zw3rg6

This repository contains the R script files submitted by Alicia (pseudonym) throughout this study, files associated with the qualitative analysis of the code, and files associated with visualizations of the qualitative themes included in Alicia's code.

Description of the data and file structure

As this is a qualitative analysis, the usage of these "data" files differs from a typical quantitative analysis.

The .R Files contain the scripts generated by Alicia at each time point (December 2018, April 2019, September 2019)

The -codes.rft Files contain the (qualitative) process codes for each R script

The r-code-themes.xlsx file contains information on every script and the qualitative code assigned to each line of code.

Code/Software

While the "data" for this analysis are R scripts, these scripts cannot be execu...
Enhancing Quantitative Analysis in Social Sciences with Large Language...
openicpsr.org
delimited
Updated Sep 6, 2025
Cite
James Sebastian; Jeong-Mi Moon; Eric Camburn (2025). Enhancing Quantitative Analysis in Social Sciences with Large Language Models (LLMs): A Methodological Case Study in Educational Research [Dataset]. http://doi.org/10.3886/E237744V3
Explore at:
Available download formats: delimited
Unique identifier
https://doi.org/10.3886/E237744V3
Dataset updated
Sep 6, 2025
Dataset provided by
University of Missouri-Columbia
University of Missouri-Kansas City
Korea Institute of Energy Technology
Authors
James Sebastian; Jeong-Mi Moon; Eric Camburn
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The objective of this paper is to explore the potential of Large Language Models (LLMs) for assisting with quantitative data analysis in social science research. Specifically, it aims to introduce key concepts to help researchers effectively integrate LLMs into their workflows. For this purpose, we replicate a research paper in educational leadership on the relationship between school program coherence and student achievement. By leveraging LLMs to generate code for statistical tools like Mplus and R, researchers can streamline their data analysis, potentially saving time and effort. The quality of analytical code generated by LLMs can be influenced by the researcher’s understanding and application of concepts like context windows, LLM training data and training cut-off, model parameter settings like temperature, zero- and few-shot learning, and Retrieval-Augmented Generation. By describing and demonstrating the applications of these concepts, we aim to equip researchers with a basic toolset to leverage LLMs effectively to assist with coding for quantitative analysis.
Research General Stopwords.csv
psycharchives.org
Updated Oct 8, 2019
+ more versions
Cite
(2019). Research General Stopwords.csv [Dataset]. https://www.psycharchives.org/en/item/cab36090-633c-473c-9b78-420010637fa4
Dataset updated
Oct 8, 2019
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Systematic reviews are the method of choice to synthesize research evidence. To identify main topics (so-called hot spots) relevant to large corpora of original publications in need of a synthesis, one must address the “three Vs” of big data (volume, velocity, and variety), especially in loosely defined or fragmented disciplines. For this purpose, text mining and predictive modeling are very helpful. Thus, we applied these methods to a compilation of documents related to digitalization in aesthetic, arts, and cultural education, as a prototypical, loosely defined, fragmented discipline, and particularly to quantitative research within it (QRD-ACE). By broadly querying the abstract and citation database Scopus with terms indicative of QRD-ACE, we identified a corpus of N = 55,553 publications for the years 2013–2017. As the result of an iterative approach of text mining, priority screening, and predictive modeling, we identified n = 8,304 potentially relevant publications of which n = 1,666 were included after priority screening. Analysis of the subject distribution of the included publications revealed video games as a first hot spot of QRD-ACE. Topic modeling resulted in aesthetics and cultural activities on social media as a second hot spot, related to 4 of k = 8 identified topics. This way, we were able to identify current hot spots of QRD-ACE by screening less than 15% of the corpus. We discuss implications for harnessing text mining, predictive modeling, and priority screening in future research syntheses and avenues for future original research on QRD-ACE. Dataset for: Christ, A., Penthin, M., & Kröner, S. (2019). Big Data and Digital Aesthetic, Arts, and Cultural Education: Hot Spots of Current Quantitative Research. Social Science Computer Review, 089443931988845. https://doi.org/10.1177/0894439319888455
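The "less than 15% of the corpus" figure can be checked with a short calculation. This sketch assumes that the screened set is the n = 8,304 potentially relevant publications out of the N = 55,553 in the corpus; the description does not state this mapping explicitly, so treat it as one plausible reading.

```python
# Counts quoted in the description above.
corpus_size = 55_553
potentially_relevant = 8_304  # assumed here to be the set that was screened
included = 1_666

# Share of the corpus that was screened, under the assumption above.
screened_share = potentially_relevant / corpus_size

# Share of the screened publications that were finally included.
included_share = included / potentially_relevant

print(f"screened: {screened_share:.1%} of corpus")
print(f"included: {included_share:.1%} of screened")
```

Under this reading the screened share comes out just below the 15% threshold quoted in the abstract.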
Nobel Laureates, from 1901 to 2023
dataverse.harvard.edu
search.dataone.org
Updated Feb 4, 2024
Cite
Tyler J Duckworth (2024). Nobel Laureates, from 1901 to 2023 [Dataset]. http://doi.org/10.7910/DVN/DJQFDE
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/DJQFDE
Dataset updated
Feb 4, 2024
Dataset provided by
Harvard Dataverse
Authors
Tyler J Duckworth
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset contains data about all Nobel Prizes and their respective recipients from 1901 to 2023 as well as the code to regenerate the dataset to include future years. This dataset can be used to conduct quantitative analysis and was created to fulfill an assignment for COSC426: Introduction to Data Mining.
Dataframe of Significant Stems for: Big Data and Digital Aesthetic, Arts and...
demo-b2find.dkrz.de
Updated Sep 21, 2025
+ more versions
Cite
(2025). Dataframe of Significant Stems for: Big Data and Digital Aesthetic, Arts and Cultural Education: Hot Spots of Current Quantitative Research Dataset for: Big Data and Digital Aesthetic, Arts and Cultural Education: Hot Spots of Current Quantitative Research - Dataset - B2FIND [Dataset]. http://demo-b2find.dkrz.de/dataset/0bd97871-d19f-5b9b-bfcc-87f133bd9275
Dataset updated
Sep 21, 2025
Description
Systematic reviews are the method of choice to synthesize research evidence. To identify main topics (so-called hot spots) relevant to large corpora of original publications in need of a synthesis, one must address the “three Vs” of big data (volume, velocity, and variety), especially in loosely defined or fragmented disciplines. For this purpose, text mining and predictive modeling are very helpful. Thus, we applied these methods to a compilation of documents related to digitalization in aesthetic, arts, and cultural education, as a prototypical, loosely defined, fragmented discipline, and particularly to quantitative research within it (QRD-ACE). By broadly querying the abstract and citation database Scopus with terms indicative of QRD-ACE, we identified a corpus of N = 55,553 publications for the years 2013–2017. As the result of an iterative approach of text mining, priority screening, and predictive modeling, we identified n = 8,304 potentially relevant publications of which n = 1,666 were included after priority screening. Analysis of the subject distribution of the included publications revealed video games as a first hot spot of QRD-ACE. Topic modeling resulted in aesthetics and cultural activities on social media as a second hot spot, related to 4 of k = 8 identified topics. This way, we were able to identify current hot spots of QRD-ACE by screening less than 15% of the corpus. We discuss implications for harnessing text mining, predictive modeling, and priority screening in future research syntheses and avenues for future original research on QRD-ACE. Dataset for: Christ, A., Penthin, M., & Kröner, S. (2019). Big Data and Digital Aesthetic, Arts, and Cultural Education: Hot Spots of Current Quantitative Research. Social Science Computer Review, 089443931988845. https://doi.org/10.1177/0894439319888455
Dataframe of Significant Stems.csv
psycharchives.org
Updated Oct 8, 2019
Cite
(2019). Dataframe of Significant Stems.csv [Dataset]. https://www.psycharchives.org/en/item/84d5c4b2-579d-48a0-8d4e-f02f2ae99192
Dataset updated
Oct 8, 2019
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Systematic reviews are the method of choice to synthesize research evidence. To identify main topics (so-called hot spots) relevant to large corpora of original publications in need of a synthesis, one must address the “three Vs” of big data (volume, velocity, and variety), especially in loosely defined or fragmented disciplines. For this purpose, text mining and predictive modeling are very helpful. Thus, we applied these methods to a compilation of documents related to digitalization in aesthetic, arts, and cultural education, as a prototypical, loosely defined, fragmented discipline, and particularly to quantitative research within it (QRD-ACE). By broadly querying the abstract and citation database Scopus with terms indicative of QRD-ACE, we identified a corpus of N = 55,553 publications for the years 2013–2017. As the result of an iterative approach of text mining, priority screening, and predictive modeling, we identified n = 8,304 potentially relevant publications of which n = 1,666 were included after priority screening. Analysis of the subject distribution of the included publications revealed video games as a first hot spot of QRD-ACE. Topic modeling resulted in aesthetics and cultural activities on social media as a second hot spot, related to 4 of k = 8 identified topics. This way, we were able to identify current hot spots of QRD-ACE by screening less than 15% of the corpus. We discuss implications for harnessing text mining, predictive modeling, and priority screening in future research syntheses and avenues for future original research on QRD-ACE. Dataset for: Christ, A., Penthin, M., & Kröner, S. (2019). Big Data and Digital Aesthetic, Arts, and Cultural Education: Hot Spots of Current Quantitative Research. Social Science Computer Review, 089443931988845. https://doi.org/10.1177/0894439319888455
StarMine Text Mining Credit Risk Model
lseg.com
Updated Oct 14, 2025
Cite
LSEG (2025). StarMine Text Mining Credit Risk Model [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/company-data/quantitative-models/credit-risk-models/starmine-text-mining
Explore at:
Available download formats: csv, delimited, gzip, json, python, sql, text, user interface, xml, zip archive
Dataset updated
Oct 14, 2025
Dataset provided by
London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
Authors
LSEG
License
https://www.lseg.com/en/policies/website-disclaimer
Description
Assess risk in publicly traded companies with LSEG's StarMine Text Mining Credit Risk Model (TMCR), scoring over 38,000 companies.
Integrating Machine Learning Techniques in the Evaluation of Management...
search.dataone.org
Updated Mar 6, 2024
Cite
David, Lemuel (2024). Integrating Machine Learning Techniques in the Evaluation of Management Control Systems for Enhanced Predictive Analytics [Dataset]. http://doi.org/10.7910/DVN/7AYC1L
Unique identifier
https://doi.org/10.7910/DVN/7AYC1L
Dataset updated
Mar 6, 2024
Dataset provided by
Harvard Dataverse
Authors
David, Lemuel
Description
These data were used to analyze the transformative role of machine learning (ML) techniques in refining Management Control Systems (MCS) to bolster predictive analytics capabilities within varied organizational contexts. Utilizing a mixed-methods approach, this research synthesizes a comprehensive quantitative analysis of Bloomberg's extensive dataset, encompassing 4,500 companies across multiple industries from 2015 to 2023.
Data from: Analysis of spatiotemporal specificity of small RNAs regulating...
figshare.com
xlsx
Updated Sep 29, 2019
Cite
Lu Li (2019). Analysis of spatiotemporal specificity of small RNAs regulating hPSC differentiation and beyond [Dataset]. http://doi.org/10.6084/m9.figshare.9911918.v2
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.9911918.v2
Dataset updated
Sep 29, 2019
Dataset provided by
Figshare (http://figshare.com/)
Authors
Lu Li
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We present a quantitative analysis of small RNA dynamics during the transition from hPSCs to the three germ layer lineages to identify spatiotemporal-specific small RNAs that may be involved in hPSC differentiation. To determine the degree of spatiotemporal specificity, we utilized two algorithms, namely normalized maximum timepoint specificity index (NMTSI) and across-tissue specificity index (ASI). NMTSI could identify spatiotemporal-specific small RNAs that go up or down at just one timepoint in a specific lineage. ASI could identify spatiotemporal-specific small RNAs that maintain high expression from intermediate timepoints to the terminal timepoint in a specific lineage. Beyond analyzing single small RNAs, we also quantified the spatiotemporal-specificity of microRNA families and observed their differential expression patterns in certain lineages. To clarify the regulatory effects of group miRNAs on cellular events during lineage differentiation, we performed a gene ontology (GO) analysis on the downstream targets of synergistically up- and downregulated microRNAs. To provide an integrated interface for researchers to access and browse our analysis results, we designed a web-based tool at https://keyminer.pythonanywhere.com/km/.
Data underlying the paper: Quantitative analysis of spectroscopic Low Energy...
data.4tu.nl
zip
Cite
Tobias A. de Jong; J. (Johannes) Jobst, Data underlying the paper: Quantitative analysis of spectroscopic Low Energy Electron Microscopy Data [Dataset]. http://doi.org/10.4121/uuid:7f672638-66f6-4ec3-a16c-34181cc45202
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.4121/uuid:7f672638-66f6-4ec3-a16c-34181cc45202
Dataset provided by
4TU.Centre for Research Data
Authors
Tobias A. de Jong; J. (Johannes) Jobst
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset contains a Low Energy Electron Microscopy dataset consisting of raw data of both a dark field and a bright field spectroscopic image series of a region of few-layer graphene on Silicon Carbide. Additionally, it contains calibration data: a dark count dataset, an HDR calibration dataset, and two curves showcasing the difference between HDR and non-HDR imaging.
EURUSD 15 minutes data
kaggle.com
zip
Updated Sep 16, 2025
Cite
DOCTOR DIEGO LEON (2025). EURUSD 15 minutes data [Dataset]. https://www.kaggle.com/datasets/doctordiegoleon/eurusd-15-minutes-data
Explore at:
Available download formats: zip (1511380 bytes)
Dataset updated
Sep 16, 2025
Authors
DOCTOR DIEGO LEON
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
This portfolio provides a detailed analysis of the EUR/USD currency pair on a 15-minute timeframe, aiming to explore market patterns, volatility, and potential opportunities for developing algorithmic trading strategies.

Included in this work:

Data cleaning and preprocessing of historical records.

Exploratory analysis of prices, volumes, and movement ranges.

Pattern detection such as consecutive candles, trends, and reversals.

Quantitative metrics to assess risk and performance.

Dataset preparation for backtesting and predictive modeling.

This project is designed for traders, quantitative analysts, and data science enthusiasts interested in applying analytical methods to Forex markets, with a practical and replicable approach to generating financial insights.
Looking for data (Expert interviews)
search.gesis.org
Updated Jul 2, 2025
Friedrich, Tanja (2025). Looking for data (Expert interviews) [Dataset]. https://search.gesis.org/research_data/SDN-10.7802-1.1943
Dataset updated
Jul 2, 2025
Dataset provided by
GESIS search
GESIS, Köln
Authors
Friedrich, Tanja
License
https://www.gesis.org/en/institute/data-usage-terms
Description
These interview data are part of the project "Looking for data: information seeking behaviour of survey data users", a study of secondary data users’ information-seeking behaviour. The overall goal of this study was to create evidence of actual information practices of users of one particular retrieval system for social science data in order to inform the development of research data infrastructures that facilitate data sharing. In the project, data were collected based on a mixed methods design. The research design included a qualitative study in the form of expert interviews and, building on the results found therein, a quantitative web survey of secondary survey data users. For the qualitative study, expert interviews were conducted with six reference persons of a large social science data archive. They were interviewed in their role as intermediaries who provide guidance for secondary users of survey data. The knowledge from their reference work was expected to provide a condensed view of goals, practices, and problems of people who are looking for survey data. The anonymized transcripts of these interviews are provided here. They can be reviewed or reused upon request. The survey dataset from the quantitative study of secondary survey data users is downloadable through this data archive after registration. The core result of the Looking for data study is that community involvement plays a pivotal role in survey data seeking. The analyses show that survey data communities are an important determinant in survey data users' information seeking behaviour and that community involvement facilitates data seeking and has the capacity to reduce problems or barriers. The qualitative part of the study was designed and conducted using constructivist grounded theory methodology as introduced by Kathy Charmaz (2014).
In line with grounded theory methodology, the interviews did not follow a fixed set of questions, but were conducted based on a guide that included areas of exploration with tentative questions. This interview guide can be obtained together with the transcript. For the Looking for data project, the data were coded and scrutinized by constant comparison, as proposed by grounded theory methodology. This analysis resulted in core categories that make up the "theory of problem-solving by community involvement". This theory was exemplified in the quantitative part of the study. For this exemplification, the following hypotheses were drawn from the qualitative study:
(1) The data seeking hypotheses:
(1a) When looking for data, information seeking through personal contact is used more often than impersonal ways of information seeking.
(1b) Ways of information seeking (personal or impersonal) differ with experience.
(2) The experience hypotheses:
(2a) Experience is positively correlated with having ambitious goals.
(2b) Experience is positively correlated with having more advanced requirements for data.
(2c) Experience is positively correlated with having more specific problems with data.
(3) The community involvement hypothesis: Experience is positively correlated with community involvement.
(4) The problem solving hypothesis: Community involvement is positively correlated with problem solving strategies that require personal interactions.


