21 datasets found

Text Analysis Software Report
marketresearchforecast.com
doc, pdf, ppt
Updated Mar 20, 2025
Cite
Market Research Forecast (2025). Text Analysis Software Report [Dataset]. https://www.marketresearchforecast.com/reports/text-analysis-software-42331
Available download formats: doc, pdf, ppt
Dataset updated
Mar 20, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global text analysis software market is booming, projected to reach $2746.5 million by 2033 with a 5.6% CAGR. Discover key trends, drivers, restraints, and leading companies shaping this rapidly evolving landscape of NLP and AI-powered text analytics solutions.

Global Data Science Tool Market Research Report: By Application (Predictive...

wiseguyreports.com

Updated Sep 15, 2025

Cite

(2025). Global Data Science Tool Market Research Report: By Application (Predictive Analytics, Data Mining, Machine Learning, Statistical Analysis), By Deployment Model (On-Premise, Cloud-Based, Hybrid), By End User (Retail, Healthcare, Finance, Manufacturing), By Functionality (Data Visualization, Data Preparation, Model Building, Model Deployment) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/data-science-tool-market

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	9.0(USD Billion)
MARKET SIZE 2025	10.05(USD Billion)
MARKET SIZE 2035	30.0(USD Billion)
SEGMENTS COVERED	Application, Deployment Model, End User, Functionality, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Growing demand for data-driven insights, Increasing adoption of machine learning, Rising need for data visualization tools, Expanding use of big data analytics, Emergence of cloud-based solutions
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	RapidMiner, IBM, Snowflake, TIBCO Software, DataRobot, Oracle, Tableau, Teradata, MathWorks, Microsoft, Cloudera, Google, SAS Institute, Alteryx, Qlik
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	Increased demand for AI solutions, Growing importance of big data analytics, Rising adoption of cloud-based tools, Integration of automation technologies, Expanding use cases across industries
COMPOUND ANNUAL GROWTH RATE (CAGR)	11.6% (2025 - 2035)
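
The growth rate in the table can be checked against the two market-size figures. The following is a minimal Python sketch, assuming the rate is compounded annually over the ten years from 2025 to 2035; the helper name `cagr` is illustrative and does not come from the report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.116 means 11.6%)."""
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes from the table above, in USD billion.
# Assumption: ten compounding periods between 2025 and 2035.
rate = cagr(start_value=10.05, end_value=30.0, years=10)
print(f"{rate:.1%}")  # prints 11.6%
```

Under that assumption the computed value rounds to the reported 11.6%, so the listed market sizes and growth rate are mutually consistent.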

Data Analytics Software Report
archivemarketresearch.com
doc, pdf, ppt
Updated May 4, 2025
Cite
Archive Market Research (2025). Data Analytics Software Report [Dataset]. https://www.archivemarketresearch.com/reports/data-analytics-software-558003
Available download formats: doc, pdf, ppt
Dataset updated
May 4, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global Data Analytics Software market is experiencing robust growth, driven by the increasing adoption of cloud-based solutions, the expanding volume of big data, and the rising demand for data-driven decision-making across various industries. The market, valued at approximately $150 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% during the forecast period of 2025-2033. This significant expansion is fueled by several key factors. Businesses are increasingly recognizing the strategic importance of data analytics in optimizing operations, enhancing customer experiences, and gaining a competitive edge. The shift towards cloud-based solutions offers scalability, cost-effectiveness, and accessibility, making data analytics accessible to a broader range of businesses, from SMEs to large enterprises. Furthermore, advancements in artificial intelligence (AI) and machine learning (ML) are integrating seamlessly into data analytics platforms, providing more sophisticated insights and predictive capabilities. The market's growth is further segmented by deployment model (on-premise vs. cloud-based) and user type (SMEs vs. large enterprises), reflecting the diverse needs and adoption rates across various business segments. While the market presents substantial opportunities, certain challenges persist. Data security and privacy concerns remain paramount, requiring robust security measures and compliance with evolving regulations. The complexity of implementing and managing data analytics solutions can also pose a barrier to entry for some organizations, requiring skilled professionals and substantial investments in infrastructure and training. Despite these challenges, the long-term outlook for the Data Analytics Software market remains highly positive, driven by continuous technological innovation, growing data volumes, and the increasing strategic importance of data-driven decision-making across industries. 
The market's evolution will continue to be shaped by the ongoing integration of AI and ML, the expansion of cloud-based offerings, and the increasing demand for advanced analytics capabilities. This dynamic landscape will present both challenges and opportunities for existing players and new entrants alike.
Text Analytics Tool Report
datainsightsmarket.com
doc, pdf, ppt
Updated Apr 17, 2025
Cite
Data Insights Market (2025). Text Analytics Tool Report [Dataset]. https://www.datainsightsmarket.com/reports/text-analytics-tool-1460176
Available download formats: pdf, ppt, doc
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Text Analytics Tools market is booming, projected to reach $50 billion by 2033 with an 18% CAGR. Explore key drivers, trends, and regional insights in this comprehensive market analysis, covering leading vendors like IBM, Google, and more. Discover the potential of NLP and AI in unlocking valuable insights from text data.
CORD-19 Dataset v2020
kaggle.com
zip
Updated Oct 18, 2020
Cite
SMLRA-KJSCE (2020). CORD-19 Dataset v2020 [Dataset]. https://www.kaggle.com/smlrakjsce/cord19-dataset-v2020
Available download formats: zip (47121 bytes)
Dataset updated
Oct 18, 2020
Authors
SMLRA-KJSCE
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Open-Ended track where your team can build anything using the dataset provided by us

Dataset Description

In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). CORD-19 is a resource of over 200,000 scholarly articles, including over 100,000 with full text, about COVID-19, SARS-CoV-2, and related coronaviruses. This freely available dataset is provided to the global research community to apply recent advances in natural language processing and other AI techniques to generate new insights in support of the ongoing fight against this infectious disease. There is a growing urgency for these approaches because of the rapid acceleration in new coronavirus literature, making it difficult for the medical research community to keep up.

Call to Action

We are issuing a call to action to the world's artificial intelligence experts to develop text and data mining tools that can help the medical community develop answers to high priority scientific questions. The CORD-19 dataset represents the most extensive machine-readable coronavirus literature collection available for data mining to date. This allows the worldwide AI research community the opportunity to apply text and data mining approaches to find answers to questions within, and connect insights across, this content in support of the ongoing COVID-19 response efforts worldwide. There is a growing urgency for these approaches because of the rapid increase in coronavirus literature, making it difficult for the medical community to keep up.

Many of the questions are suitable for text mining, and we encourage researchers to develop text mining tools to provide insights on these questions. We are maintaining a summary of the community's contributions.

Acknowledgements

We wouldn't be here without the help of others. The dataset is a subset of the dataset available at AI2's Semantic Scholar - https://pages.semanticscholar.org/coronavirus-research
This dataset was created by the Allen Institute for AI in partnership with the Chan Zuckerberg Initiative, Georgetown University’s Center for Security and Emerging Technology, Microsoft Research, IBM, and the National Library of Medicine - National Institutes of Health, in coordination with The White House Office of Science and Technology Policy.

Dataset

The dataset is in tar.gz format and can be downloaded from - https://drive.google.com/file/d/15SV8_Nc1HECN9uaplDSQx7H1yKFR4F_Z/view?usp=sharing

Submissions

Notebook and Output results are expected as appropriate submissions.
Saudi Banking App Customer Reviews
data.mendeley.com
Updated Oct 11, 2025
Cite
Nuha ALmoqren (2025). Saudi Banking App Customer Reviews [Dataset]. http://doi.org/10.17632/7rv5j4tw9r.1
Unique identifier
https://doi.org/10.17632/7rv5j4tw9r.1
Dataset updated
Oct 11, 2025
Authors
Nuha ALmoqren
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Saudi Arabia
Description
This dataset was collected and curated as part of an ongoing PhD research project focusing on user-centered requirement analysis and satisfaction modeling in Saudi mobile banking applications. It contains raw customer review data, including the review date, source store (Google Play or App Store), bank name, textual review content, and user rating (1–5 scale).

As a PhD student researcher, I intend to release an updated and refined version of this dataset in the future that will include additional processed attributes such as sentiment polarity, user intent, Kano classification (Must-Be, Performance, Attractive), and structural metrics derived from ontology-based analysis.

This initial version serves as a foundational resource for researchers interested in sentiment analysis, feature prioritization, or satisfaction–achievement modeling within the context of Saudi mobile banking services.
Text Analysis Software Report
datainsightsmarket.com
doc, pdf, ppt
Updated Apr 25, 2025
Cite
Data Insights Market (2025). Text Analysis Software Report [Dataset]. https://www.datainsightsmarket.com/reports/text-analysis-software-1448703
Available download formats: doc, pdf, ppt
Dataset updated
Apr 25, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The text analysis software market is booming, projected to reach $2864 million by 2033 with a 5.8% CAGR. This in-depth analysis explores market drivers, trends, restraints, and key players like Microsoft & IBM, covering regional breakdowns and segments. Discover the future of text analytics!
Commercial Patent Database Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jul 12, 2025
Cite
Data Insights Market (2025). Commercial Patent Database Report [Dataset]. https://www.datainsightsmarket.com/reports/commercial-patent-database-1413170
Available download formats: doc, ppt, pdf
Dataset updated
Jul 12, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global commercial patent database market is experiencing robust growth, driven by the increasing need for intellectual property (IP) management and competitive intelligence among businesses. The market's expansion is fueled by several key factors, including rising R&D investments across various industries (pharmaceuticals, technology, etc.), a surge in patent filings worldwide, and the growing adoption of sophisticated analytical tools for patent data mining. This necessitates comprehensive and user-friendly databases that offer advanced search functionalities, allowing businesses to identify opportunities, track competitors, and protect their own innovations effectively. The market's segmentation reflects the diverse needs of users, encompassing solutions tailored to specific industries and IP management tasks. Leading players are continuously innovating, integrating AI and machine learning capabilities to enhance search precision and data analysis, creating more efficient and insightful platforms. The competitive landscape is characterized by a mix of established players and emerging technology companies, each striving for differentiation through superior user experience, data quality, and analytical features. We estimate the market size to be approximately $2.5 billion in 2025, growing at a compound annual growth rate (CAGR) of 12% between 2025 and 2033. This strong growth is projected to continue throughout the forecast period, primarily due to the ongoing digital transformation across sectors and the increasing reliance on data-driven decision-making. However, challenges remain, including the high cost of access to premium database features and the complex nature of patent data, requiring specialized expertise to interpret effectively. The market will see continued consolidation, with larger players acquiring smaller companies to expand their market reach and product offerings. 
Furthermore, the focus on user experience and the development of more intuitive interfaces will be critical to broaden the appeal of these databases to a wider range of users, from IP professionals to business strategists. Geographic expansion, particularly in emerging economies with growing R&D activities, will also be a key driver of market growth in the coming years.
Product data mining: entity classification&linking
kaggle.com
zip
Updated Jul 13, 2020
Cite
zzhang (2020). Product data mining: entity classification&linking [Dataset]. https://www.kaggle.com/ziqizhang/product-data-miningentity-classificationlinking
Available download formats: zip (10933 bytes)
Dataset updated
Jul 13, 2020
Authors
zzhang
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
IMPORTANT: Round 1 results are now released, check our website for the leaderboard. We now open Round 2 submissions!

1. Overview

We release two datasets that are part of the Semantic Web Challenge on Mining the Web of HTML-embedded Product Data, which is co-located with the 19th International Semantic Web Conference (https://iswc2020.semanticweb.org/, 2-6 Nov 2020 at Athens, Greece). The datasets belong to two shared tasks related to product data mining on the Web: (1) product matching (linking) and (2) product classification. This event is organised by The University of Sheffield, The University of Mannheim and Amazon, and is open to anyone. Systems that successfully beat the baseline of the respective task will be invited to write a paper describing their method and system and to present the method as a poster (and potentially also a short talk) at the ISWC2020 conference. Winners of each task will be awarded 500 euro as a prize (partly sponsored by Peak Indicators, https://www.peakindicators.com/).

2. Task and dataset brief

The challenge organises two tasks, product matching and product categorisation.

i) Product Matching deals with identifying product offers on different websites that refer to the same real-world product (e.g., the same iPhone X model offered using different names/offer titles as well as different descriptions on various websites). A multi-million product offer corpus (16M) containing product offer clusters is released for the generation of training data. A validation set containing 1.1K offer pairs and a test set of 600 offer pairs will also be released. The goal of this task is to classify whether each offer pair in these datasets is a match (i.e., both offers refer to the same product) or a non-match.

ii) Product classification deals with assigning predefined product category labels (which can be multiple levels) to product instances (e.g., iPhone X is a ‘SmartPhone’, and also ‘Electronics’). A training dataset containing 10K product offers, a validation set of 3K product offers and a test set of 3K product offers will be released. Each dataset contains product offers with their metadata (e.g., name, description, URL) and three classification labels each corresponding to a level in the GS1 Global Product Classification taxonomy. The goal is to classify these product offers into the pre-defined category labels.

All datasets are built based on structured data that was extracted from the Common Crawl (https://commoncrawl.org/) by the Web Data Commons project (http://webdatacommons.org/). Datasets can be found at: https://ir-ischool-uos.github.io/mwpd/
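
To make the product matching task concrete, here is a minimal Python sketch of a pair classifier based on token overlap (Jaccard similarity) between two offer titles. This is not the challenge's official baseline: the function names, the example titles, and the 0.5 threshold are all assumptions chosen for illustration.

```python
import re


def tokens(title: str) -> set[str]:
    """Lower-case a title and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(title_a: str, title_b: str) -> float:
    """Share of tokens the two titles have in common (0.0 to 1.0)."""
    a, b = tokens(title_a), tokens(title_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_match(title_a: str, title_b: str, threshold: float = 0.5) -> bool:
    """Label an offer pair as match when token overlap reaches the threshold.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return jaccard(title_a, title_b) >= threshold


# Hypothetical offer titles, not taken from the challenge data.
same = ("Apple iPhone X 64GB Space Grey", "iPhone X 64 GB (Space Grey) by Apple")
different = ("Apple iPhone X 64GB Space Grey", "Samsung Galaxy S9 64GB Black")
print(is_match(*same), is_match(*different))
```

A real system would normally learn from the released training corpus rather than rely on a fixed threshold; the sketch only shows the shape of the input (an offer pair) and the output (match or non-match).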

3. Resources and tools

The challenge will also release utility code (in Python) for processing the above datasets and scoring the system outputs. In addition, the following language resources for product-related data mining tasks will be released: a text corpus of 150 million product offer descriptions, and word embeddings trained on that corpus.

4. Challenge website

For details of the challenge please visit https://ir-ischool-uos.github.io/mwpd/

5. Organizing committee

Dr Ziqi Zhang (Information School, The University of Sheffield)
Prof. Christian Bizer (Institute of Computer Science and Business Informatics, The Mannheim University)
Dr Haiping Lu (Department of Computer Science, The University of Sheffield)
Dr Jun Ma (Amazon Inc. Seattle, US)
Prof. Paul Clough (Information School, The University of Sheffield & Peak Indicators)
Ms Anna Primpeli (Institute of Computer Science and Business Informatics, The Mannheim University)
Mr Ralph Peeters (Institute of Computer Science and Business Informatics, The Mannheim University)
Mr. Abdulkareem Alqusair (Information School, The University of Sheffield)

6. Contact

To contact the organising committee please use the Google discussion group https://groups.google.com/forum/#!forum/mwpd2020
Parkinsons Disease Discovery Database
neuinfo.org
scicrunch.org
Updated Mar 4, 2025
Cite
(2025). Parkinsons Disease Discovery Database [Dataset]. http://identifiers.org/RRID:SCR_014160/resolver/mentions?q=&i=rrid
Unique identifier
https://identifiers.org/RRID:SCR_014160 https://identifiers.org/RRID:SCR_014160/resolver/mentions?q=&i=rrid
Dataset updated
Mar 4, 2025
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented Jan. 5, 2016. Tools will be available for biomedical data mining and visualization as well as linkages to Google Maps and other online resources.
Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter...
zenodo.org
application/gzip
Updated Mar 16, 2021
Cite
João Felipe; Leonardo; Vanessa; Juliana (2021). Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks / Understanding and Improving the Quality and Reproducibility of Jupyter Notebooks [Dataset]. http://doi.org/10.5281/zenodo.3519618
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.3519618
Dataset updated
Mar 16, 2021
Dataset provided by
Zenodo: http://zenodo.org/
Authors
João Felipe; Leonardo; Vanessa; Juliana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of Jupyter Notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we analyzed 1.4 million notebooks from GitHub. Based on the results, we proposed and evaluated Julynter, a linting tool for Jupyter Notebooks.

Papers:

PIMENTEL, J. F.; MURTA, L.; BRAGANHOLO, V.; FREIRE, J.; A large-scale study about quality and reproducibility of jupyter notebooks. In: International Conference on Mining Software Repositories (MSR), 2019, Montreal, Canada.

PIMENTEL, J. F.; MURTA, L.; BRAGANHOLO, V.; FREIRE, J.; Understanding and Improving the Quality and Reproducibility of Jupyter Notebooks. Empirical Software Engineering, 2021 (in press)

This repository contains three files:

db2020-09-22.dump.gz

sample.tar.gz

julynter_reproducility.tar.gz

Reproducing the Notebook Study

The db2020-09-22.dump.gz file contains a PostgreSQL dump of the database, with all the data we extracted from notebooks. For loading it, run:

gunzip -c db2020-09-22.dump.gz | psql jupyter

Note that this file contains only the database with the extracted data. The actual repositories are available in a google drive folder, which also contains the docker images we used in the reproducibility study. The repositories are stored as content/{hash_dir1}/{hash_dir2}.tar.bz2, where hash_dir1 and hash_dir2 are columns of repositories in the database.
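
The storage layout described above can be expressed as a small path helper. This is a minimal Python sketch, assuming `hash_dir1` and `hash_dir2` have already been read from the repositories data in the database; the function name and the sample hash values are hypothetical.

```python
from pathlib import PurePosixPath


def repository_archive_path(
    hash_dir1: str, hash_dir2: str, root: str = "content"
) -> PurePosixPath:
    """Build content/{hash_dir1}/{hash_dir2}.tar.bz2 for one repository."""
    return PurePosixPath(root) / hash_dir1 / f"{hash_dir2}.tar.bz2"


# Hypothetical hash values, for illustration only.
print(repository_archive_path("ab", "cdef0123"))  # content/ab/cdef0123.tar.bz2
```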

For scripts, notebooks, and detailed instructions on how to analyze or reproduce the data collection, please check the instructions on the Jupyter Archaeology repository (tag 1.0.0)

The sample.tar.gz file contains the repositories obtained during the manual sampling.

Reproducing the Julynter Experiment

The julynter_reproducility.tar.gz file contains all the data collected in the Julynter experiment and the analysis notebooks. Reproducing the analysis is straightforward:

Uncompress the file: $ tar zxvf julynter_reproducibility.tar.gz

Install the dependencies: $ pip install -r julynter/requirements.txt

Run the notebooks in order: J1.Data.Collection.ipynb; J2.Recommendations.ipynb; J3.Usability.ipynb.

The collected data is stored in the julynter/data folder.

Changelog

2019/01/14 - Version 1 - Initial version
2019/01/22 - Version 2 - Update N8.Execution.ipynb to calculate the rate of failure for each reason
2019/03/13 - Version 3 - Update package for camera ready. Add columns to db to detect duplicates, change notebooks to consider them, and add N1.Skip.Notebook.ipynb and N11.Repository.With.Notebook.Restriction.ipynb.
2021/03/15 - Version 4 - Add Julynter experiment; Update database dump to include new data collected for the second paper; remove scripts and analysis notebooks from this package (moved to GitHub), add a link to Google Drive with collected repository files

Global Augmented Analytics Market Research Report: By Technology (Natural...

wiseguyreports.com

Updated Sep 15, 2025

Cite

(2025). Global Augmented Analytics Market Research Report: By Technology (Natural Language Processing, Machine Learning, Data Mining, Predictive Analytics), By Deployment Type (On-Premises, Cloud-Based, Hybrid), By End User (BFSI, Retail, Healthcare, IT and Telecom, Manufacturing), By Application (Business Intelligence, Customer Experience Management, Fraud Detection, Risk Management) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/augmented-analytics-market

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	4.65(USD Billion)
MARKET SIZE 2025	5.51(USD Billion)
MARKET SIZE 2035	30.0(USD Billion)
SEGMENTS COVERED	Technology, Deployment Type, End User, Application, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Data democratization trends, AI and machine learning integration, Growing demand for self-service analytics, Increased cloud adoption, Enhanced data visualization tools
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	ThoughtSpot, Microsoft, Yellowfin, Google, Domo, TIBCO Software, SAP, Oracle, Alteryx, Qlik, Looker, SAS Institute, IBM, MicroStrategy, Tableau Software, Sisense
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	Increased demand for real-time insights, Growth of self-service analytics tools, Rising adoption of AI-driven solutions, Expanding use in various industries, Enhanced data governance capabilities
COMPOUND ANNUAL GROWTH RATE (CAGR)	18.4% (2025 - 2035)

SOTorrent 2018-12-09
kaggle.com
zip
Updated Dec 18, 2018
Cite
SOTorrent (2018). SOTorrent 2018-12-09 [Dataset]. https://www.kaggle.com/datasets/sotorrent/2018-12-09
Available download formats: zip (0 bytes)
Dataset updated
Dec 18, 2018
Dataset authored and provided by
SOTorrent
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Please notice

Tables TitleVersion and Votes are not yet visible in the Data preview page, but they are accessible in Kernels.

Context

Stack Overflow (SO) is the most popular question-and-answer website for software developers, providing a large amount of code snippets and free-form text on a wide variety of topics. Like other software artifacts, questions and answers on SO evolve over time, for example when bugs in code snippets are fixed, code is updated to work with a more recent library version, or text surrounding a code snippet is edited for clarity. To be able to analyze how content on SO evolves, we built SOTorrent, an open dataset based on the official SO data dump.

Content

SOTorrent provides access to the version history of SO content at the level of whole posts and individual text or code blocks. It connects SO posts to other platforms by aggregating URLs from text blocks and comments, and by collecting references from GitHub files to SO posts. Our vision is that researchers will use SOTorrent to investigate and understand the evolution of SO posts and their relation to other platforms such as GitHub. If you use this dataset in your work, please cite our MSR 2018 paper or our MSR 2019 mining challenge proposal.

This version is based on the official Stack Overflow data dump released 2018-12-02 and the Google BigQuery GitHub data set queried 2018-12-09.
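
As an illustration of what block-level version history makes possible, the following Python sketch compares two versions of a code block and lists the lines that changed. It uses the standard-library `difflib` module and is not SOTorrent's own method or schema; the function name and the two sample versions are hypothetical.

```python
import difflib


def changed_lines(old_block: str, new_block: str) -> list[str]:
    """Return removed ('-') and added ('+') lines between two block versions."""
    diff = list(
        difflib.unified_diff(
            old_block.splitlines(), new_block.splitlines(), lineterm="", n=0
        )
    )
    # The first two entries are the '---' and '+++' file headers; skip them,
    # then keep only removals and additions (hunk markers start with '@@').
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical versions of one code block, e.g. a snippet updated for Python 3.
version_1 = "x = input()\nprint x"
version_2 = "x = input()\nprint(x)"
print(changed_lines(version_1, version_2))  # ['-print x', '+print(x)']
```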

Inspiration

The goal of the MSR 2019 mining challenge is to study the origin, evolution, and usage of Stack Overflow code snippets. Questions that are, to the best of our knowledge, not sufficiently answered yet include:

How are code snippets on Stack Overflow maintained?

How many clones of code snippets exist inside Stack Overflow?

How can we detect buggy versions of Stack Overflow code snippets and find them in GitHub projects?

How frequently are code snippets copied from external sources into Stack Overflow and then co-evolve there?

How do snippets copied from Stack Overflow to GitHub co-evolve?

Does the evolution of Stack Overflow code snippets follow patterns?

Do these patterns differ between programming languages?

Are the licenses of external sources compatible with Stack Overflow’s license (CC BY-SA 3.0)?

How many code blocks on Stack Overflow do not contain source code (and are only used for markup)?

Can we reliably predict bug-fixing edits to code on Stack Overflow?

Can we reliably predict popularity of Stack Overflow code snippets on GitHub?

These are just some of the questions that could be answered using SOTorrent. We encourage challenge participants to adapt the above questions or formulate their own research questions about the origin, evolution, and usage of content on Stack Overflow.
Data from: QSAR-Co: An Open Source Software for Developing Robust...
acs.figshare.com
datasetcatalog.nlm.nih.gov
zip
Updated May 30, 2023
Cite
Pravin Ambure; Amit Kumar Halder; Humbert González Díaz; M. Natália D. S. Cordeiro (2023). QSAR-Co: An Open Source Software for Developing Robust Multitasking or Multitarget Classification-Based QSAR Models [Dataset]. http://doi.org/10.1021/acs.jcim.9b00295.s002
Available download formats: zip
Unique identifier
https://doi.org/10.1021/acs.jcim.9b00295.s002
Dataset updated
May 30, 2023
Dataset provided by
ACS Publications
Authors
Pravin Ambure; Amit Kumar Halder; Humbert González Díaz; M. Natália D. S. Cordeiro
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Quantitative structure–activity relationships (QSAR) modeling is a well-known computational technique with wide applications in fields such as drug design, toxicity predictions, nanomaterials, etc. However, QSAR researchers still face certain problems in developing robust classification-based QSAR models, especially while handling response data pertaining to diverse experimental and/or theoretical conditions. In the present work, we have developed an open source standalone software “QSAR-Co” (available to download at https://sites.google.com/view/qsar-co) to set up classification-based QSAR models that allow mining the response data coming from multiple conditions. The software comprises two modules: (1) the Model development module and (2) the Screen/Predict module. This user-friendly software provides several functionalities required for developing a robust multitasking or multitarget classification-based QSAR model using linear discriminant analysis or random forest techniques, with appropriate validation, following the principles set by the Organisation for Economic Co-operation and Development (OECD) for applying QSAR models in regulatory assessments.

Global Online Analytical Processing OLAP Tool Market Research Report: By...

wiseguyreports.com

Updated Sep 15, 2025

(2025). Global Online Analytical Processing OLAP Tool Market Research Report: By Application (Finance, Retail, Healthcare, Telecommunications, Manufacturing), By Deployment Mode (On-Premises, Cloud-Based, Hybrid), By User Type (Small Enterprises, Medium Enterprises, Large Enterprises), By Functionality (Data Mining, Reporting, Data Integration, Predictive Analysis) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/online-analytical-processing-olap-tool-market

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	4.17 (USD Billion)
MARKET SIZE 2025	4.52 (USD Billion)
MARKET SIZE 2035	10.0 (USD Billion)
SEGMENTS COVERED	Application, Deployment Mode, User Type, Functionality, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Growing demand for data analytics, Increasing cloud adoption, Rising need for real-time insights, Technological advancements in data processing, Expanding investment in BI tools
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Informatica, Sisense, IBM, Amazon Web Services, Domo, TIBCO Software, Oracle, MicroStrategy, SAP, Pentaho, Microsoft, Tableau Software, Board International, Google, SAS Institute, Qlik
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	Cloud-based OLAP solutions, Increased demand for real-time analytics, Integration with AI and ML technologies, Expansion in emerging markets, Growing business intelligence adoption
COMPOUND ANNUAL GROWTH RATE (CAGR)	8.3% (2025 - 2035)

Global Clustering Software Market Research Report: By Application (Data...

wiseguyreports.com

Updated Oct 14, 2025

(2025). Global Clustering Software Market Research Report: By Application (Data Mining, Machine Learning, Image Processing, Natural Language Processing), By Deployment Type (On-Premises, Cloud-Based, Hybrid), By End User (BFSI, Healthcare, Retail, Telecommunications), By Organization Size (Small Enterprises, Medium Enterprises, Large Enterprises) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/clustering-software-market

Dataset updated

Oct 14, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Oct 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	2397.5 (USD Million)
MARKET SIZE 2025	2538.9 (USD Million)
MARKET SIZE 2035	4500.0 (USD Million)
SEGMENTS COVERED	Application, Deployment Type, End User, Organization Size, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	increasing big data adoption, rising demand for advanced analytics, growing need for real-time insights, expansion of cloud computing, integration of AI technologies
MARKET FORECAST UNITS	USD Million
KEY COMPANIES PROFILED	Tableau, Qlik, SAS Institute, MathWorks, SAP, Google Cloud, KNIME, TIBCO Software, Microsoft, H2O.ai, Alteryx, IBM, AWS, Databricks, Oracle, RapidMiner
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	AI-driven data analysis, Cloud-based clustering solutions, Integration with IoT devices, Real-time data processing, Enhanced cybersecurity features
COMPOUND ANNUAL GROWTH RATE (CAGR)	5.9% (2025 - 2035)

Global Analytics Software Market Research Report: By Application (Business...

wiseguyreports.com

Updated Sep 15, 2025

(2025). Global Analytics Software Market Research Report: By Application (Business Intelligence, Predictive Analytics, Data Visualization, Data Mining), By Deployment Type (On-Premises, Cloud-Based, Hybrid), By End Use (BFSI, Healthcare, Retail, Manufacturing, Telecom), By Analytics Type (Descriptive Analytics, Diagnostic Analytics, Predictive Analytics, Prescriptive Analytics) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/analytics-software-market

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	30.7 (USD Billion)
MARKET SIZE 2025	32.4 (USD Billion)
MARKET SIZE 2035	55.6 (USD Billion)
SEGMENTS COVERED	Application, Deployment Type, End Use, Analytics Type, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Increasing data volume, Rising demand for insights, Advancements in AI technologies, Growing cloud adoption, Enhanced regulatory compliance requirements
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Sisense, IBM, Domo, TIBCO Software, Oracle, MicroStrategy, ThoughtSpot, MathWorks, SAP, Looker, Microsoft, Tableau Software, Google, SAS Institute, Alteryx, Qlik
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	AI-driven analytics solutions, Increased demand for real-time insights, Growth in big data adoption, Expansion of cloud-based analytics, Focus on predictive analytics and automation
COMPOUND ANNUAL GROWTH RATE (CAGR)	5.6% (2025 - 2035)

Global Analytics Sandbox Market Research Report: By Application (Data...

wiseguyreports.com

Updated Sep 15, 2025

(2025). Global Analytics Sandbox Market Research Report: By Application (Data Integration, Data Analysis, Data Visualization, Predictive Analytics), By Deployment Type (Cloud-Based, On-Premises, Hybrid), By End User (BFSI, Healthcare, Retail, Telecommunications), By Functionality (Real-Time Analytics, Batch Processing, Data Mining, Predictive Modeling) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/analytics-sandbox-market

Dataset updated

Sep 15, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	2.72 (USD Billion)
MARKET SIZE 2025	3.06 (USD Billion)
MARKET SIZE 2035	10.0 (USD Billion)
SEGMENTS COVERED	Application, Deployment Type, End User, Functionality, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	data-driven decision making, cloud adoption increasing, regulatory compliance requirements, competitive analytics capabilities, integration with AI technologies
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	IBM, Amazon Web Services, Domo, TIBCO Software, Oracle, MicroStrategy, Tableau, SAP, Microsoft, Google, SAS Institute, Qlik
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	Cloud-based analytics solutions, Increased demand for data privacy, Enhanced collaboration tools integration, Growing adoption of IoT devices, Rising need for real-time analytics
COMPOUND ANNUAL GROWTH RATE (CAGR)	12.6% (2025 - 2035)

Global Data Science Platform Market Research Report: By Application...

wiseguyreports.com

Updated Jan 1, 2025

(2025). Global Data Science Platform Market Research Report: By Application (Predictive Analytics, Data Mining, Machine Learning, Data Visualization), By Deployment Model (On-Premises, Cloud-Based, Hybrid), By End User (Healthcare, Financial Services, Retail, Government), By Features (Automated Machine Learning, Data Preparation, Collaboration Tools, Data Governance) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/de/reports/data-science-platform-market

Dataset updated

Jan 1, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Sep 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	7.77 (USD Billion)
MARKET SIZE 2025	8.79 (USD Billion)
MARKET SIZE 2035	30.0 (USD Billion)
SEGMENTS COVERED	Application, Deployment Model, End User, Features, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Growing demand for data analytics, Increased investment in AI technologies, Rising adoption of cloud solutions, Expanding need for real-time insights, Shortage of skilled data professionals
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Microsoft, DataRobot, Cloudera, MathWorks, Google, SAS, SAP, H2O.ai, Amazon, IBM, RapidMiner, TIBCO Software, Oracle, Tableau, Alteryx, Qlik
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	AI integration for analytics, Growing demand for predictive insights, Cloud-based solutions expansion, Increasing focus on data-driven decisions, Enhancement of machine learning tools
COMPOUND ANNUAL GROWTH RATE (CAGR)	13.1% (2025 - 2035)

Global High Performance Data Analytics Market Research Report: By Deployment...

wiseguyreports.com

Updated Mar 20, 2025

(2025). Global High Performance Data Analytics Market Research Report: By Deployment Model (On-Premises, Cloud-Based, Hybrid), By Application (Predictive Analytics, Prescriptive Analytics, Descriptive Analytics, Diagnostic Analytics), By End Use Industry (Healthcare, Financial Services, Retail, Manufacturing), By Tool Type (Data Integration Tools, Data Visualization Tools, Data Mining Tools, Big Data Analytics Tools) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/cn/reports/high-performance-data-analytic-market

Dataset updated

Mar 20, 2025

License

https://www.wiseguyreports.com/pages/privacy-policy

Time period covered

Aug 25, 2025

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2023
REGIONS COVERED	North America, Europe, APAC, South America, MEA
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2024	11.23 (USD Billion)
MARKET SIZE 2025	12.28 (USD Billion)
MARKET SIZE 2035	30.0 (USD Billion)
SEGMENTS COVERED	Deployment Model, Application, End Use Industry, Tool Type, Regional
COUNTRIES COVERED	US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
KEY MARKET DYNAMICS	Increased data volume, Growing cloud adoption, Rising demand for real-time analytics, Technological advancements in AI, Competitive market landscape
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Informatica, Deloitte, Amazon Web Services, MicroStrategy, Cloudera, Microsoft, Google, Oracle, SAP, SAS Institute, Qlik, Teradata, TIBCO Software, Palantir Technologies, Snowflake, Salesforce, IBM
MARKET FORECAST PERIOD	2025 - 2035
KEY MARKET OPPORTUNITIES	Increased demand for real-time analytics, Growth in AI and machine learning applications, Expansion of cloud-based data solutions, Rising need for predictive analytics, Enhanced focus on data security and compliance
COMPOUND ANNUAL GROWTH RATE (CAGR)	9.3% (2025 - 2035)

Market Research Forecast (2025). Text Analysis Software Report [Dataset]. https://www.marketresearchforecast.com/reports/text-analysis-software-42331

Text Analysis Software Report

2 scholarly articles cite this dataset.

Available download formats: doc, pdf, ppt

Dataset updated

Mar 20, 2025

Dataset authored and provided by

Market Research Forecast

License

https://www.marketresearchforecast.com/privacy-policy

Time period covered

2025 - 2033

Area covered

Global

Variables measured

Market Size

Description

The global text analysis software market is booming, projected to reach $2746.5 million by 2033 with a 5.6% CAGR. Discover key trends, drivers, restraints, and leading companies shaping this rapidly evolving landscape of NLP and AI-powered text analytics solutions.
