18 datasets found

Dataset - Templates Recommendation in the Open Research Knowledge Graph

zenodo.org

json

Updated Jun 3, 2022

+ more versions

Cite

Omar Arab Oghli; Omar Arab Oghli (2022). Dataset - Templates Recommendation in the Open Research Knowledge Graph [Dataset]. http://doi.org/10.5281/zenodo.6607165

Explore at:

Available download formats: json

Unique identifier

https://doi.org/10.5281/zenodo.6607165

Dataset updated

Jun 3, 2022

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Omar Arab Oghli; Omar Arab Oghli

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset was created for implementing a content-based recommender system in the context of the Open Research Knowledge Graph (ORKG). The recommender system accepts a research paper's title and abstract as input and recommends existing ORKG templates that are semantically relevant to the given paper.

Two approaches have been trained on this dataset in the context of this master's thesis, namely a Natural Language Inference (NLI) approach based on SciBERT embeddings and an unsupervised approach based on ElasticSearch.

This publication therefore consists of one general dataset, two training sets (one for each approach), a validation set for the supervised approach, and a test set for both approaches.

dataset.json

The main JSON object consists of a list of templates and a list of neutral papers.

Each template object has an ID, a label, a list of research fields, a list of properties, and a list of papers using that template. Each paper object has an ID, a label, a DOI, a research field, and an abstract.

Each neutral paper object has the same schema as a paper object listed under a template.

See an example instance below.

{
  "templates": [
    {
      "id": "R138668",
      "label": "Psychiatric Disorders AI Overview",
      "research_fields": [
        {
          "id": "http://orkg.org/orkg/resource/R133",
          "label": "Artificial Intelligence"
        },
        ...
      ],
      "properties": [
        "Study cohort",
        ...
      ],
      "papers": [
        {
          "id": "R138698",
          "label": "Application of Autoencoder in Depression Diagnosis",
          "doi": "10.12783/dtcse/csma2017/17335",
          "research_field": {
            "id": "R104",
            "label": "Bioinformatics"
          },
          "abstract": "Major depressive disorder (MDD) is a mental disorder characterized by at least two weeks of low mood which is present across most situations. Diagnosis of MDD using rest-state functional magnetic resonance imaging (fMRI) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability. No method can automatically extract discriminative features from the origin time series in fMRI images for MDD diagnosis. In this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge. An autoencoder was used to learn pre-training parameters of a dimensionality reduction process using 3-D convolution network. Through comparison with the other three feature extraction methods, our method achieved the best classification performance. This method can be used not only in MDD diagnosis, but also other similar disorders."
        },
        ...
      ]
    },
    ...
  ],
  "neutral_papers": [
    {
      "id": "R109377",
      "label": "Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants",
      "doi": "10.1016/j.jpha.2020.03.009",
      "research_field": {
        "id": "R104",
        "label": "Bioinformatics"
      },
      "abstract": "Abstract The recent outbreak of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 in December 2019 raised global health concerns. The viral 3-chymotrypsin-like cysteine protease (3CLpro) enzyme controls coronavirus replication and is essential for its life cycle. 3CLpro is a proven drug discovery target in the case of severe acute respiratory syndrome coronavirus (SARS-CoV) and middle east respiratory syndrome coronavirus (MERS-CoV). Recent studies revealed that the genome sequence of SARS-CoV-2 is very similar to that of SARS-CoV. Therefore, herein, we analysed the 3CLpro sequence, constructed its 3D homology model, and screened it against a medicinal plant library containing 32,297 potential anti-viral phytochemicals/traditional Chinese medicinal compounds. Our analyses revealed that the top nine hits might serve as potential anti- SARS-CoV-2 lead molecules for further optimisation and drug development process to combat COVID-19."
    },
    ...
  ]
}
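
The schema above can be walked with a few lines of Python. The following is a minimal sketch, not part of the published dataset: `SAMPLE` is a trimmed stand-in for `dataset.json` built from the example instance (abstracts shortened), and the helper names `template_paper_pairs` and `summarize` are hypothetical.

```python
import json

# Trimmed stand-in for dataset.json, based on the example instance above.
# Abstracts are shortened here; the real file holds the full text.
SAMPLE = json.loads("""
{
  "templates": [
    {
      "id": "R138668",
      "label": "Psychiatric Disorders AI Overview",
      "research_fields": [
        {"id": "http://orkg.org/orkg/resource/R133", "label": "Artificial Intelligence"}
      ],
      "properties": ["Study cohort"],
      "papers": [
        {
          "id": "R138698",
          "label": "Application of Autoencoder in Depression Diagnosis",
          "doi": "10.12783/dtcse/csma2017/17335",
          "research_field": {"id": "R104", "label": "Bioinformatics"},
          "abstract": "Major depressive disorder (MDD) is a mental disorder"
        }
      ]
    }
  ],
  "neutral_papers": [
    {
      "id": "R109377",
      "label": "Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants",
      "doi": "10.1016/j.jpha.2020.03.009",
      "research_field": {"id": "R104", "label": "Bioinformatics"},
      "abstract": "Abstract The recent outbreak of coronavirus disease 2019"
    }
  ]
}
""")


def template_paper_pairs(dataset):
    """Yield (template_id, paper_id) for every paper listed under a template."""
    for template in dataset["templates"]:
        for paper in template["papers"]:
            yield template["id"], paper["id"]


def summarize(dataset):
    """Count templates, papers attached to templates, and neutral papers."""
    return {
        "templates": len(dataset["templates"]),
        "template_papers": sum(1 for _ in template_paper_pairs(dataset)),
        "neutral_papers": len(dataset["neutral_papers"]),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

To run this against the real file, replace `SAMPLE` with the result of `json.load` on an opened `dataset.json`, assuming the file follows the schema shown above.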

All other files

The main JSON object consists of a list of entailments, a list of contradictions and a list of neutrals.

Each object in these lists has the same schema: an instance_id created by concatenating the template_id (when it exists) with the paper_id, a template_id, a paper_id, a premise, a hypothesis, their concatenation in sequence, and the target class. In the example instance below, the premise holds the template's label and properties, and the hypothesis holds the paper's title and abstract.

See an example instance below.

{
  "entailments": [
    {
      "instance_id": "R138668xR138698",
      "template_id": "R138668",
      "paper_id": "R138698",
      "premise": "psychiatric disorders ai overview study cohort outcome assessment aims performance findings used models data",
      "hypothesis": "application of autoencoder in depression diagnosis major depressive disorder (mdd) is a mental disorder characterized by at least two weeks of low mood which is present across most situations diagnosis of mdd using rest state functional magnetic resonance imaging (fmri) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability no method can automatically extract discriminative features from the origin time series in fmri images for mdd diagnosis in this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge an autoencoder was used to learn pre training parameters of a dimensionality reduction process using 3 d convolution network through comparison with the other three feature extraction methods, our method achieved the best classification performance this method can be used not only in mdd diagnosis, but also other similar disorders",
      "sequence": "[CLS] psychiatric disorders ai overview study cohort outcome assessment aims performance findings used models data [SEP] application of autoencoder in depression diagnosis major depressive disorder (mdd) is a mental disorder characterized by at least two weeks of low mood which is present across most situations diagnosis of mdd using rest state functional magnetic resonance imaging (fmri) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability no method can automatically extract discriminative features from the origin time series in fmri images for mdd diagnosis in this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge an autoencoder was used to learn pre training parameters of a dimensionality reduction process using 3 d convolution network through comparison with the other three feature extraction methods, our method achieved the best classification performance this method can be used not only in mdd diagnosis, but also other similar disorders [SEP]",
      "target": "entailment"
    },
   ...
   ],
  "contradictions": [ ... ],
  "neutrals": [ ... ]
}
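
The relationship between the fields can be reproduced from the example instance. The sketch below is an illustration under assumptions, not the author's preprocessing code: the `x` separator in `instance_id` and the `[CLS] ... [SEP] ... [SEP]` layout are read off the example, the fallback to the bare `paper_id` when no template exists is a guess, and `make_instance` is a hypothetical helper name.

```python
def make_instance(template_id, paper_id, premise, hypothesis, target):
    """Assemble one instance in the shape shown in the example above.

    Assumption: when there is no template, instance_id is just the paper_id.
    The source only says the template_id is included "when it exists".
    """
    instance_id = f"{template_id}x{paper_id}" if template_id else paper_id
    return {
        "instance_id": instance_id,
        "template_id": template_id,
        "paper_id": paper_id,
        "premise": premise,
        "hypothesis": hypothesis,
        # Layout observed in the example: [CLS] premise [SEP] hypothesis [SEP]
        "sequence": f"[CLS] {premise} [SEP] {hypothesis} [SEP]",
        "target": target,
    }


if __name__ == "__main__":
    instance = make_instance(
        "R138668",
        "R138698",
        "psychiatric disorders ai overview study cohort",
        "application of autoencoder in depression diagnosis",
        "entailment",
    )
    print(instance["instance_id"])
    print(instance["sequence"])
```

The text fields in the example are lowercased and stripped of some punctuation; that normalization is not reproduced here because its exact rules are not described.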

Statistics

Class	Training (supervised)	Validation (supervised)	Training (unsupervised)	Test
Entailment	180	20	200	52
Neutral	180	20	200	64
Contradiction	736	84	0	0
Total	1096	124	400	116
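
The totals row can be checked against the per-class counts, which also makes the class imbalance in the supervised splits visible. This is a small sketch using only the numbers from the table; the names `COUNTS`, `REPORTED_TOTALS`, `column_totals` and `class_share` are illustrative.

```python
# Per-class instance counts, copied from the statistics table.
COUNTS = {
    "Training (supervised)": {"Entailment": 180, "Neutral": 180, "Contradiction": 736},
    "Validation (supervised)": {"Entailment": 20, "Neutral": 20, "Contradiction": 84},
    "Training (unsupervised)": {"Entailment": 200, "Neutral": 200, "Contradiction": 0},
    "Test": {"Entailment": 52, "Neutral": 64, "Contradiction": 0},
}

# Totals row as reported in the table.
REPORTED_TOTALS = {
    "Training (supervised)": 1096,
    "Validation (supervised)": 124,
    "Training (unsupervised)": 400,
    "Test": 116,
}


def column_totals(counts):
    """Sum the class counts of each split."""
    return {split: sum(by_class.values()) for split, by_class in counts.items()}


def class_share(counts, split, label):
    """Fraction of a split's instances that carry the given class label."""
    return counts[split][label] / sum(counts[split].values())


if __name__ == "__main__":
    assert column_totals(COUNTS) == REPORTED_TOTALS
    share = class_share(COUNTS, "Training (supervised)", "Contradiction")
    print(f"Contradiction share in supervised training: {share:.1%}")
```

Contradictions make up roughly two thirds of the supervised training set (736 of 1096), which is worth keeping in mind when choosing evaluation metrics.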

Data Sheet 1_A global model-agnostic rule-based XAI method based on...
frontiersin.figshare.com
pdf
Updated Sep 20, 2024
Cite
Ephrem Tibebe Mekonnen; Luca Longo; Pierpaolo Dondio (2024). Data Sheet 1_A global model-agnostic rule-based XAI method based on Parameterized Event Primitives for time series classifiers.pdf [Dataset]. http://doi.org/10.3389/frai.2024.1381921.s001
Explore at:
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/frai.2024.1381921.s001
Dataset updated
Sep 20, 2024
Dataset provided by
Frontiers
Authors
Ephrem Tibebe Mekonnen; Luca Longo; Pierpaolo Dondio
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Time series classification is a challenging research area where machine learning and deep learning techniques have shown remarkable performance. However, often, these are seen as black boxes due to their minimal interpretability. On the one hand, there is a plethora of eXplainable AI (XAI) methods designed to elucidate the functioning of models trained on image and tabular data. On the other hand, adapting these methods to explain deep learning-based time series classifiers may not be straightforward due to the temporal nature of time series data. This research proposes a novel global post-hoc explainable method for unearthing the key time steps behind the inferences made by deep learning-based time series classifiers. This novel approach generates a decision tree graph, a specific set of rules, that can be seen as explanations, potentially enhancing interpretability. The methodology involves two major phases: (1) training and evaluating deep-learning-based time series classification models, and (2) extracting parameterized primitive events, such as increasing, decreasing, local max and local min, from each instance of the evaluation set and clustering such events to extract prototypical ones. These prototypical primitive events are then used as input to a decision-tree classifier trained to fit the model predictions of the test set rather than the ground truth data. Experiments were conducted on diverse real-world datasets sourced from the UCR archive, employing metrics such as accuracy, fidelity, robustness, number of nodes, and depth of the extracted rules. The findings indicate that this global post-hoc method can improve the global interpretability of complex time series classification models.
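
To make the notion of primitive events concrete, the following is a deliberately simplified sketch that labels each interior time step of a one-dimensional series as increasing, decreasing, local max or local min. It is not the authors' implementation: the parameterization of events, the clustering into prototypes and the decision-tree step described above are all omitted, and `primitive_events` is a hypothetical name.

```python
def primitive_events(series):
    """Label interior points of a 1-D series with a primitive event.

    Simplification: only strict comparisons with the two direct neighbours
    are used, so points on a plateau receive no label.
    """
    events = []
    for i in range(1, len(series) - 1):
        prev, cur, nxt = series[i - 1], series[i], series[i + 1]
        if cur > prev and cur > nxt:
            events.append((i, "local_max"))
        elif cur < prev and cur < nxt:
            events.append((i, "local_min"))
        elif prev < cur < nxt:
            events.append((i, "increasing"))
        elif prev > cur > nxt:
            events.append((i, "decreasing"))
    return events


if __name__ == "__main__":
    for index, event in primitive_events([0, 1, 2, 1, 0, 1]):
        print(index, event)
```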
Machine Learning model data
ecmwf.int
Updated Jan 1, 2023
Cite
European Centre for Medium-Range Weather Forecasts (2023). Machine Learning model data [Dataset]. https://www.ecmwf.int/en/forecasts/dataset/machine-learning-model-data
Explore at:
Dataset updated
Jan 1, 2023
Dataset authored and provided by
European Centre for Medium-Range Weather Forecasts (http://ecmwf.int/)
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
three of these models are available:
Energy consumption when training LLMs in 2022 (in MWh)
statista.com
Updated Sep 10, 2024
+ more versions
Cite
Statista (2024). Energy consumption when training LLMs in 2022 (in MWh) [Dataset]. https://www.statista.com/statistics/1384401/energy-use-when-training-llm-models/
Explore at:
Dataset updated
Sep 10, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2022
Area covered
Worldwide
Description
Energy consumption of artificial intelligence (AI) models in training is considerable, with both GPT-3, the original release of the current iteration of OpenAI's popular ChatGPT, and Gopher consuming well over a thousand megawatt-hours of energy for training alone. As this figure covers training only, the energy consumption over the entire usage and lifetime of GPT-3 and other large language models (LLMs) is likely significantly higher. The largest consumer of energy, GPT-3, used roughly as much energy as 200 Germans did in 2022. While not a staggering amount, it is a considerable use of energy.

Energy savings through AI

While it is undoubtedly true that training LLMs takes a considerable amount of energy, the energy savings are also likely to be substantial. Any AI model that improves processes by even minute amounts might save hours on shipment, liters of fuel, or dozens of computations. Each of these uses energy as well, and the sum of energy saved through an LLM might far exceed its energy cost. A good example is mobile phone operators, a third of which expect that AI might reduce power consumption by ten to fifteen percent. Considering that much of the world uses mobile phones, this would be a considerable energy saver.

Emissions are considerable

The amount of CO2 emissions from training LLMs is also considerable, with GPT-3 producing nearly 500 tonnes of CO2. This figure could change radically depending on the type of energy production behind the emissions. Most data center operators, for instance, would prefer nuclear energy, a notably low-emission energy source, to play a key role.
Impact of AI and ML use on retail performance 2022-2024
statista.com
Updated Feb 23, 2024
Cite
Statista (2024). Impact of AI and ML use on retail performance 2022-2024 [Dataset]. https://www.statista.com/statistics/1453198/ai-and-ml-impact-on-retail-performance/
Explore at:
Dataset updated
Feb 23, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Worldwide
Description
Retailers using artificial intelligence (AI) and machine learning (ML) technologies performed better than their competitors. In both 2023 and 2024, retail companies using these technologies saw double-digit sales growth compared to the respective previous years. Similarly, their annual profit grew by roughly eight percent, outperforming retailers that did not use AI or ML solutions.
Data from: Revisiting Electrocatalyst Design by a Knowledge Graph of...
acs.figshare.com
figshare.com
xlsx
Updated Jun 13, 2023
Cite
Yang Gao; Ludi Wang; Xueqing Chen; Yi Du; Bin Wang (2023). Revisiting Electrocatalyst Design by a Knowledge Graph of Cu-Based Catalysts for CO2 Reduction [Dataset]. http://doi.org/10.1021/acscatal.3c00759.s002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.1021/acscatal.3c00759.s002
Dataset updated
Jun 13, 2023
Dataset provided by
ACS Publications
Authors
Yang Gao; Ludi Wang; Xueqing Chen; Yi Du; Bin Wang
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Electrocatalysis plays a significant role in the production of sustainable fuels and chemicals. The combination of artificial intelligence and catalytic science is exhibiting great potential to extract, analyze, and predict electrocatalysts. However, current machine learning approaches usually require large amounts of data from density functional theory calculations to train and optimize models. In contrast, a knowledge graph has the potential to extract useful information from a large amount of the literature without referring to density functional theory. Herein, a knowledge graph of Cu-based electrocatalysts for electrocatalytic CO2 reduction is constructed based on a linguistically enriched SciBERT-based framework. This framework retrieves multiple types of entities including material, regulation method, product, Faradaic efficiency, etc. from 757 scientific publications, generates representations with abundant domain-specific semantic information, and exhibits the capability to deal with electrocatalysts for CO2 reduction. The obtained graph shows the development history of related catalysts, builds relationships between the factors associated with catalysis, and provides intuitive charts for researchers to gain useful information. Furthermore, we propose a deep learning-based prediction model, which integrates the semantic information from the scientific literature (word embedding) with the correlation of knowledge triples (graph embedding) and realizes the prediction of the Faradaic efficiency for a targeted case. This work paves the way for catalyst design in the manner of merging artificial intelligence with catalytic science.
HindiMathQuest - Math Problems & Reasoning
kaggle.com
Updated Oct 14, 2024
+ more versions
Cite
Dnyanesh Walwadkar (2024). HindiMathQuest - Math Problems & Reasoning [Dataset]. http://doi.org/10.34740/kaggle/ds/5832290
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.34740/kaggle/ds/5832290
Dataset updated
Oct 14, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Dnyanesh Walwadkar
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Overview:

The Hindi Mathematics Reasoning and Problem-Solving Dataset is designed to advance the capabilities of language models in understanding and solving mathematical problems presented in the Hindi language. The dataset covers a comprehensive range of question types, including logical reasoning, numeric calculations, translation-based problems, and complex mathematical tasks typically seen in competitive exams. This dataset is intended to fill a critical gap by focusing on numeric reasoning and mathematical logic in Hindi, offering high-quality prompts that challenge models to handle both linguistic and mathematical complexity in one of the world’s most widely spoken languages.

Key Features:

-**Diverse Range of Mathematical Problems**: The dataset includes questions from areas such as arithmetic, algebra, geometry, physics, and number theory, all expressed in Hindi.

-**Logical and Reasoning Tasks**: Includes logic-based problems requiring pattern recognition, deduction, and reasoning, often seen in competitive exams like IIT JEE, GATE, and GRE.

-**Complex Numerical Calculations in Hindi**: Numeric expressions and their handling in Hindi text, a common challenge for language models, are a major focus of this dataset. Questions require models to accurately interpret and solve mathematical problems where numbers are written in Hindi words (e.g., "पचासी हजार सात सौ नवासी" for 85789).

-**Real-World Application Scenarios**: Paragraph-based problems, puzzles, and word problems that mirror real-world scenarios and test both language comprehension and problem-solving capabilities.

-**Culturally Relevant Questions**: Carefully curated questions that avoid regional or social biases, ensuring that the dataset accurately reflects the linguistic and cultural nuances of Hindi-speaking regions.

Dataset Breakdown:

-**Logical and Reasoning-based Questions**: Questions testing pattern recognition, deduction, and logical reasoning, often seen in IQ tests and competitive exams.

-**Calculation-based Problems**: Includes numeric operations such as addition, subtraction, multiplication, and division, presented in Hindi text.

-**Translation-based Mathematical Problems**: Questions that involve translating between numeric expressions and Hindi word forms, enhancing model understanding of Hindi numerals.

-**Competitive Exam-style Questions**: Sourced and inspired by advanced reasoning and problem-solving questions from exams like GATE, IIT JEE, and GRE, providing high-level challenge.

-**Series and Sequence Questions**: Number series, progressions, and pattern recognition problems, essential for logical reasoning tasks.

-**Paragraph-based Word Problems**: Real-world math problems described in multiple sentences of Hindi text, requiring deeper language comprehension and reasoning.

-**Geometry and Trigonometry**: Includes geometry-based problems using Hindi terminology for angles, shapes, and measurements.

-**Physics-based Problems**: Mathematical problems based on physics concepts like mechanics, thermodynamics, and electricity, all expressed in Hindi.

-**Graph and Data Interpretation**: Interpretation of graphs and data in Hindi, testing both visual and mathematical understanding.

-**Olympiad-style Questions**: Advanced math problems, similar to those found in math Olympiads, designed to test high-level reasoning and problem-solving skills.

Preprocessing and Quality Control:

-**Human Verification**: Over 30% of the dataset has been manually reviewed and verified by native Hindi speakers. Additionally, a random sample of English-to-Hindi translated prompts showed a 100% success rate in translation quality, further boosting confidence in the overall quality of the dataset.

-**Dataset Curation**: The dataset was generated using a combination of human-curated questions, AI-assisted translations from existing English datasets, and publicly available educational resources. Special attention was given to ensure cultural sensitivity and accurate representation of the language.

-**Handling Numeric Challenges in Hindi**: Special focus was given to numeric reasoning tasks, where numbers are presented in Hindi words—a well-known challenge for existing language models. The dataset aims to push the boundaries of current models by providing complex scenarios that require a deep understanding of both language and numeric relationships.

Usage:

This dataset is ideal for researchers, educators, and developers working on natural language processing, machine learning, and AI models tailored for Hindi-speaking populations. The dataset can be used for:

Fine-tuning language models for improved understanding of mathematical reasoning in Hindi.

Training question-answering systems for educational tools that cater to Hindi-speaking students.

Developing AI systems for competitive exam preparati...
Machine Learning in Finance Market will grow at a CAGR of 22.50% from 2023...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Jan 15, 2025
Cite
Cognitive Market Research (2025). Machine Learning in Finance Market will grow at a CAGR of 22.50% from 2023 to 2030! [Dataset]. https://www.cognitivemarketresearch.com/machine-learning-in-finance-market-report
Explore at:
Available download formats: pdf, excel, csv, ppt
Dataset updated
Jan 15, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
The global Machine Learning in Finance market was valued at USD 7.52 billion in 2022 and is projected to reach USD 38.13 billion by 2030, registering a CAGR of 22.50% for the forecast period 2023-2030.

Market Dynamics of the Machine Learning in Finance Market

Market Driver of the Machine Learning in Finance Market

The growing demand for predictive analytics and data-driven insights is driving the Machine Learning in Finance market.

The rapid expansion and adoption of machine learning (ML) in finance can be attributed to the rising need for data-driven insights and predictive analytics. Using vast databases to find insightful patterns has become important as financial institutions try to navigate the complexity of a constantly shifting global economy. This increase in demand is driven by the understanding that standard analytical techniques frequently fail to capture the details and complex relationships contained in financial data. The ability of ML algorithms to analyse enormous volumes of data at high speed lets them find hidden trends, correlations, and inconsistencies that manual analysis would miss. This capability is particularly important in the financial markets, where a slight edge in anticipating market movements, asset price fluctuations, and risk exposures can result in significant gains or reduced losses. Additionally, the use of ML in finance goes beyond trading and investment strategies. Various fields, including risk management, fraud detection, customer service, and regulatory compliance, are affected. By using advanced algorithms to recognize possible risks and model scenarios, financial organizations can analyze and manage risk more effectively and make better decisions. Systems that use machine learning to detect fraud are more accurate than those that use rule-based methods because they can identify unexpected patterns and behaviors that could be signs of fraud in real time. For instance, customers who use the ML-based CPP Fraud Analytics software for credit card fraud detection and prevention experience increases in detection rates of between 50% and 90% and decreases in investigation times for individual fraud cases of up to 70%.

Growing demand for cost-effectiveness and scalability

Market Restraint of the Machine Learning in Finance Market

The efficiency of machine learning models in finance may be affected by a lack of reliable, unbiased financial data.

The effectiveness of machine learning (ML) models in finance depends directly on the accessibility and quality of the data used to develop and run them. The absence of high-quality, unbiased financial data is a significant barrier that frequently limits the effectiveness of ML applications in finance. A lack of thorough and reliable information can compromise the effectiveness and dependability of ML models in a sector characterized by complexity, quick market changes, and a wide range of influencing factors. Financial data is extremely diverse and includes market prices, economic indicators, trade volumes, sentiment research, and much more. For ML algorithms to produce useful insights and precise forecasts, this data must be accurate, current, and representative of the larger financial landscape. If the historical data is biased or incomplete, the ML software may produce biased results and, in turn, identify wrong or ineffective trends.

The growing use of artificial intelligence to improve customer service and automate financial tasks is a trend in the Machine Learning in Finance market.

The rapid and prevalent adoption of artificial intelligence (AI) is currently driving a revolutionary trend in the financial market. There is growing use of artificial intelligence (AI) to improve customer service and automate a variety of financial processes. For instance, AI has the ability to increase economic growth by 26% and financial services revenue by 34%. This change is radically changing how financial organizations engage with their customers, streamline their processes, and provide services. These smart systems are made to respond to consumer queries, offer immediate support, and make specific suggestions. These AI-driven interfaces can comprehend and reply to consumer inquiries in a human-like manner by utilizin...
AI in marketing revenue worldwide 2020-2028
statista.com
Updated Dec 10, 2024
Cite
Statista (2024). AI in marketing revenue worldwide 2020-2028 [Dataset]. https://www.statista.com/statistics/1293758/ai-marketing-revenue-worldwide/
Explore at:
Dataset updated
Dec 10, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2020
Area covered
Worldwide
Description
In 2021, the market for artificial intelligence (AI) in marketing was estimated at 15.84 billion U.S. dollars. The source projected that the value would increase to more than 107.5 billion by 2028.

What is AI and who uses it?

Artificial intelligence (AI) has become one of the most impactful digital innovations of the past few decades. The term refers to the ability of a computer or machine to mimic the competencies of the human mind, with the current ecosystem consisting of machine learning, robotics, artificial neural networks, and natural language processing. All of these features and algorithms are highly versatile and adaptable to the specific requirements of the user, explaining why they have become embedded into many different industries, ranging from telecommunications and financial services to healthcare and pharma. Overall, the global artificial intelligence market was valued at around 327 billion U.S. dollars in 2021.

AI at the marketing wheel

AI is deeply embedded into the digital marketing landscape, and based on the latest reports, more than 80 percent of industry experts integrate some form of AI technology into their online marketing activities. This widespread adoption of artificial intelligence for marketing purposes is no surprise considering that its benefits include task automation, campaign personalization, and data analysis, to name but a few. When asked about marketers' main application areas of AI in a recent survey, roughly 50 percent of respondents from the U.S., Canada, the UK, and India mentioned ad targeting. Other popular activities they trusted AI with included personalizing content, optimizing e-mail send times, and calculating conversion probability.
International Journal of Engineering and Advanced Technology -...
researchhelpdesk.org
Updated Feb 23, 2022
+ more versions
Cite
Research Help Desk (2022). International Journal of Engineering and Advanced Technology - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/552/international-journal-of-engineering-and-advanced-technology
Explore at:
Dataset updated
Feb 23, 2022
Dataset authored and provided by
Research Help Desk
Description
International Journal of Engineering and Advanced Technology - ResearchHelpDesk - The International Journal of Engineering and Advanced Technology (IJEAT), Online-ISSN 2249-8958, is a bi-monthly international journal published in February, April, June, August, October, and December by Blue Eyes Intelligence Engineering & Sciences Publication (BEIESP), Bhopal (M.P.), India, since 2011. It is an academic, online, open access, double-blind, peer-reviewed international journal. It aims to publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. All submitted papers will be reviewed by the board of committee of IJEAT.

Aim of IJEAT Journal

disseminate original, scientific, theoretical or applied research in the field of Engineering and allied fields.

dispense a platform for publishing results and research with a strong empirical component.

bridge the significant gap between research and practice by promoting the publication of original, novel, industry-relevant research.

seek original and unpublished research papers based on theoretical or experimental works for the publication globally.

publish original, theoretical and practical advances in Computer Science & Engineering, Information Technology, Electrical and Electronics Engineering, Electronics and Telecommunication, Mechanical Engineering, Civil Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences.

impart a platform for publishing results and research with a strong empirical component.

create a bridge for a significant gap between research and practice by promoting the publication of original, novel, industry-relevant research.

solicit original and unpublished research papers, based on theoretical or experimental works.

Scope of IJEAT

International Journal of Engineering and Advanced Technology (IJEAT) covers all topics of all engineering branches. Some of them are Computer Science & Engineering, Information Technology, Electronics & Communication, Electrical and Electronics, Electronics and Telecommunication, Civil Engineering, Mechanical Engineering, Textile Engineering and all interdisciplinary streams of Engineering Sciences. The main topics include, but are not limited to:

1. Smart Computing and Information Processing Signal and Speech Processing Image Processing and Pattern Recognition WSN Artificial Intelligence and machine learning Data mining and warehousing Data Analytics Deep learning Bioinformatics High Performance computing Advanced Computer networking Cloud Computing IoT Parallel Computing on GPU Human Computer Interactions

2. Recent Trends in Microelectronics and VLSI Design Process & Device Technologies Low-power design Nanometer-scale integrated circuits Application specific ICs (ASICs) FPGAs Nanotechnology Nano electronics and Quantum Computing

3. Challenges of Industry and their Solutions, Communications Advanced Manufacturing Technologies Artificial Intelligence Autonomous Robots Augmented Reality Big Data Analytics and Business Intelligence Cyber Physical Systems (CPS) Digital Clone or Simulation Industrial Internet of Things (IIoT) Manufacturing IOT Plant Cyber security Smart Solutions – Wearable Sensors and Smart Glasses System Integration Small Batch Manufacturing Visual Analytics Virtual Reality 3D Printing

4. Internet of Things (IoT) Internet of Things (IoT) & IoE & Edge Computing Distributed Mobile Applications Utilizing IoT Security, Privacy and Trust in IoT & IoE Standards for IoT Applications Ubiquitous Computing Block Chain-enabled IoT Device and Data Security and Privacy Application of WSN in IoT Cloud Resources Utilization in IoT Wireless Access Technologies for IoT Mobile Applications and Services for IoT Machine/ Deep Learning with IoT & IoE Smart Sensors and Internet of Things for Smart City Logic, Functional programming and Microcontrollers for IoT Sensor Networks, Actuators for Internet of Things Data Visualization using IoT IoT Application and Communication Protocol Big Data Analytics for Social Networking using IoT IoT Applications for Smart Cities Emulation and Simulation Methodologies for IoT IoT Applied for Digital Contents

5. Microwaves and Photonics Microwave filter Micro Strip antenna Microwave Link design Microwave oscillator Frequency selective surface Microwave Antenna Microwave Photonics Radio over fiber Optical communication Optical oscillator Optical Link design Optical phase lock loop Optical devices

6. Computation Intelligence and Analytics Soft Computing Advance Ubiquitous Computing Parallel Computing Distributed Computing Machine Learning Information Retrieval Expert Systems Data Mining Text Mining Data Warehousing Predictive Analysis Data Management Big Data Analytics Big Data Security

7. Energy Harvesting and Wireless Power Transmission Energy harvesting and transfer for wireless sensor networks Economics of energy harvesting communications Waveform optimization for wireless power transfer RF Energy Harvesting Wireless Power Transmission Microstrip Antenna design and application Wearable Textile Antenna Luminescence Rectenna

8.
Advance Concept of Networking and Database Computer Network Mobile Adhoc Network Image Security Application Artificial Intelligence and machine learning in the Field of Network and Database Data Analytic High performance computing Pattern Recognition 9. Machine Learning (ML) and Knowledge Mining (KM) Regression and prediction Problem solving and planning Clustering Classification Neural information processing Vision and speech perception Heterogeneous and streaming data Natural language processing Probabilistic Models and Methods Reasoning and inference Marketing and social sciences Data mining Knowledge Discovery Web mining Information retrieval Design and diagnosis Game playing Streaming data Music Modelling and Analysis Robotics and control Multi-agent systems Bioinformatics Social sciences Industrial, financial and scientific applications of all kind 10. Advanced Computer networking Computational Intelligence Data Management, Exploration, and Mining Robotics Artificial Intelligence and Machine Learning Computer Architecture and VLSI Computer Graphics, Simulation, and Modelling Digital System and Logic Design Natural Language Processing and Machine Translation Parallel and Distributed Algorithms Pattern Recognition and Analysis Systems and Software Engineering Nature Inspired Computing Signal and Image Processing Reconfigurable Computing Cloud, Cluster, Grid and P2P Computing Biomedical Computing Advanced Bioinformatics Green Computing Mobile Computing Nano Ubiquitous Computing Context Awareness and Personalization, Autonomic and Trusted Computing Cryptography and Applied Mathematics Security, Trust and Privacy Digital Rights Management Networked-Driven Multicourse Chips Internet Computing Agricultural Informatics and Communication Community Information Systems Computational Economics, Digital Photogrammetric Remote Sensing, GIS and GPS Disaster Management e-governance, e-Commerce, e-business, e-Learning Forest Genomics and Informatics Healthcare Informatics 
Information Ecology and Knowledge Management Irrigation Informatics Neuro-Informatics Open Source: Challenges and opportunities Web-Based Learning: Innovation and Challenges Soft computing Signal and Speech Processing Natural Language Processing 11. Communications Microstrip Antenna Microwave Radar and Satellite Smart Antenna MIMO Antenna Wireless Communication RFID Network and Applications 5G Communication 6G Communication 12. Algorithms and Complexity Sequential, Parallel And Distributed Algorithms And Data Structures Approximation And Randomized Algorithms Graph Algorithms And Graph Drawing On-Line And Streaming Algorithms Analysis Of Algorithms And Computational Complexity Algorithm Engineering Web Algorithms Exact And Parameterized Computation Algorithmic Game Theory Computational Biology Foundations Of Communication Networks Computational Geometry Discrete Optimization 13. Software Engineering and Knowledge Engineering Software Engineering Methodologies Agent-based software engineering Artificial intelligence approaches to software engineering Component-based software engineering Embedded and ubiquitous software engineering Aspect-based software engineering Empirical software engineering Search-Based Software engineering Automated software design and synthesis Computer-supported cooperative work Automated software specification Reverse engineering Software Engineering Techniques and Production Perspectives Requirements engineering Software analysis, design and modelling Software maintenance and evolution Software engineering tools and environments Software engineering decision support Software design patterns Software product lines Process and workflow management Reflection and metadata approaches Program understanding and system maintenance Software domain modelling and analysis Software economics Multimedia and hypermedia software engineering Software engineering case study and experience reports Enterprise software, middleware, and tools Artificial 
intelligent methods, models, techniques Artificial life and societies Swarm intelligence Smart Spaces Autonomic computing and agent-based systems Autonomic computing Adaptive Systems Agent architectures, ontologies, languages and protocols Multi-agent systems Agent-based learning and knowledge discovery Interface agents Agent-based auctions and marketplaces Secure mobile and multi-agent systems Mobile agents SOA and Service-Oriented Systems Service-centric software engineering Service oriented requirements engineering Service oriented architectures Middleware for service based systems Service discovery and composition Service level agreements (drafting,
cardiotox
tensorflow.org
Updated Dec 6, 2022
Cite
(2022). cardiotox [Dataset]. https://www.tensorflow.org/datasets/catalog/cardiotox
Dataset updated
Dec 6, 2022
Description
The Drug Cardiotoxicity dataset [1-2] defines a molecule classification task: detecting cardiotoxicity caused by binding to the hERG target, a protein associated with heartbeat rhythm. The data covers over 9000 molecules with hERG activity.

Note:

The data is split into four splits: train, test-iid, test-ood1, test-ood2.

Each molecule in the dataset has 2D graph annotations, which are designed to facilitate graph neural network modeling. Nodes are the atoms of the molecule and edges are the bonds. Each atom is represented as a vector encoding basic atom information such as atom type. Similar logic applies to bonds.

We include Tanimoto fingerprint distance (to training data) for each molecule in the test sets to facilitate research on distributional shift in graph domain.
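As a rough illustration of the distance mentioned above, the sketch below computes a Tanimoto distance between two fingerprints represented as sets of on-bit indices. The function names, the sample fingerprints, and the choice to aggregate by taking the nearest training molecule are assumptions made for illustration; the description does not say how the dataset's own distance values were aggregated.

```python
from __future__ import annotations


def tanimoto_distance(fp_a: set[int], fp_b: set[int]) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B| for fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 0.0
    return 1.0 - len(fp_a & fp_b) / len(union)


def distance_to_training_set(fp: set[int], training_fps: list[set[int]]) -> float:
    """Hypothetical aggregate: distance to the nearest training fingerprint."""
    return min(tanimoto_distance(fp, train_fp) for train_fp in training_fps)


# Made-up fingerprints, for illustration only.
train = [{1, 2, 3, 4}, {2, 5, 8}]
query = {1, 2, 3, 9}
print(distance_to_training_set(query, train))
```

A larger value means the test molecule shares fewer fingerprint bits with the training molecules, which is one way to quantify how far out of distribution it is.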

For each example, the features include:
atoms: a 2D tensor with shape (60, 27) storing node features. Molecules with fewer than 60 atoms are padded with zeros. Each atom has 27 atom features.
pairs: a 3D tensor with shape (60, 60, 12) storing edge features. Each edge has 12 edge features.
atom_mask: a 1D tensor with shape (60, ) storing node masks. 1 indicates the corresponding atom is real, otherwise a padded one.
pair_mask: a 2D tensor with shape (60, 60) storing edge masks. 1 indicates the corresponding edge is real, otherwise a padded one.
active: a one-hot vector indicating whether the molecule is toxic. [0, 1] indicates that it is toxic; [1, 0] indicates that it is non-toxic.
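The padding and masking scheme described above can be sketched in plain Python as follows. This is a minimal illustration, not the code used to build the dataset: the helper names are hypothetical, the sample molecule is made up, and the symmetric edge mask is an assumption (the description only states the mask shape).

```python
MAX_ATOMS = 60      # padded number of atoms per molecule (from the description)
ATOM_FEATURES = 27  # features per atom (from the description)


def pad_atoms(atom_rows):
    """Pad per-atom feature rows with zeros up to MAX_ATOMS and build the node mask."""
    n_real = len(atom_rows)
    if n_real > MAX_ATOMS:
        raise ValueError("molecule has more atoms than the padded size")
    padding = [[0.0] * ATOM_FEATURES for _ in range(MAX_ATOMS - n_real)]
    atoms = [list(row) for row in atom_rows] + padding
    atom_mask = [1] * n_real + [0] * (MAX_ATOMS - n_real)
    return atoms, atom_mask


def build_pair_mask(bonds):
    """Mark each bonded atom pair (i, j) as a real edge (assumed symmetric)."""
    pair_mask = [[0] * MAX_ATOMS for _ in range(MAX_ATOMS)]
    for i, j in bonds:
        pair_mask[i][j] = 1
        pair_mask[j][i] = 1
    return pair_mask


def is_toxic(active):
    """Decode the one-hot label: [0, 1] is toxic, [1, 0] is non-toxic."""
    return list(active) == [0, 1]


# Made-up molecule with 3 atoms and 2 bonds, for illustration only.
atoms, atom_mask = pad_atoms([[1.0] * ATOM_FEATURES for _ in range(3)])
pair_mask = build_pair_mask([(0, 1), (1, 2)])
print(sum(atom_mask), is_toxic([0, 1]))
```

Multiplying model outputs by the masks is a common way to keep padded atoms and edges from contributing to a loss or a pooled representation.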

References

[1]: V. B. Siramshetty et al. Critical Assessment of Artificial Intelligence Methods for Prediction of hERG Channel Inhibition in the Big Data Era. JCIM, 2020. https://pubs.acs.org/doi/10.1021/acs.jcim.0c00884

[2]: K. Han et al. Reliable Graph Neural Networks for Drug Discovery Under Distributional Shift. NeurIPS DistShift Workshop 2021. https://arxiv.org/abs/2111.12951

To use this dataset:

import tensorflow_datasets as tfds
ds = tfds.load('cardiotox', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
The PCCs and RMSEs (pKd/pKi) for our DC-GBT models in three test cases,...
plos.figshare.com
xls
Updated Jun 8, 2023
Cite
Xiang Liu; Huitao Feng; Jie Wu; Kelin Xia (2023). The PCCs and RMSEs (pKd/pKi) for our DC-GBT models in three test cases, i.e., PDBbind-v2007, PDBbind-v2013 and PDBbind-v2016. [Dataset]. http://doi.org/10.1371/journal.pcbi.1009943.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pcbi.1009943.t003
Dataset updated
Jun 8, 2023
Dataset provided by
PLOS Computational Biology
Authors
Xiang Liu; Huitao Feng; Jie Wu; Kelin Xia
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Three DC-GBT models are considered with features from different types of bipartite graphs. The DC-GBT(Dist) model uses features from distance-based bipartite graphs; The DC-GBT(Chrg) model uses features from electrostatic-based bipartite graphs; The DC-GBT(Dist+Chrg) model uses features from both distance-based bipartite graphs and electrostatic-based bipartite graphs.
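The two metrics named in this table's title, the Pearson correlation coefficient (PCC) and the root-mean-square error (RMSE), follow their standard definitions and can be sketched as below. The function names and the sample affinity values are made up for illustration and are not taken from the DC-GBT results.

```python
import math


def pearson_correlation(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / len(predicted))


# Made-up binding affinities in pKd/pKi units, for illustration only.
measured = [4.0, 5.0, 6.0, 7.0]
predicted = [4.5, 5.5, 6.5, 7.5]
print(pearson_correlation(predicted, measured), rmse(predicted, measured))
```

The sample shows why both metrics are usually reported together: a constant offset leaves the correlation perfect while still producing a nonzero RMSE.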
International Journal of Engineering and Advanced Technology CiteScore...
researchhelpdesk.org
Updated Apr 5, 2022
+ more versions
Cite
Research Help Desk (2022). International Journal of Engineering and Advanced Technology CiteScore 2024-2025 - ResearchHelpDesk [Dataset]. https://www.researchhelpdesk.org/journal/sjr/552/international-journal-of-engineering-and-advanced-technology
Dataset updated
Apr 5, 2022
Dataset authored and provided by
Research Help Desk
GSM8K Dataset
paperswithcode.com
tensorflow.org
+2 more
Updated Dec 31, 2024
Cite
Karl Cobbe; Vineet Kosaraju; Mohammad Bavarian; Mark Chen; Heewoo Jun; Lukasz Kaiser; Matthias Plappert; Jerry Tworek; Jacob Hilton; Reiichiro Nakano; Christopher Hesse; John Schulman (2024). GSM8K Dataset [Dataset]. https://paperswithcode.com/dataset/gsm8k
Dataset updated
Dec 31, 2024
Authors
Karl Cobbe; Vineet Kosaraju; Mohammad Bavarian; Mark Chen; Heewoo Jun; Lukasz Kaiser; Matthias Plappert; Jerry Tworek; Jacob Hilton; Reiichiro Nakano; Christopher Hesse; John Schulman
Description
GSM8K is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems created by human problem writers. The dataset is segmented into 7.5K training problems and 1K test problems. These problems take between 2 and 8 steps to solve, and solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the final answer. A bright middle school student should be able to solve every problem. The dataset can be used to evaluate multi-step mathematical reasoning.
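When scoring a model on problems like these, a common first step is to pull the final numeric answer out of a worked solution. The sketch below assumes the solution text ends with a line of the form `#### <answer>`, which is the convention used in the released GSM8K files; the function name and the sample solution are made up for illustration.

```python
import re


def extract_final_answer(solution: str):
    """Return the number after the '####' marker as a string, or None if absent."""
    match = re.search(r"####\s*(-?[\d,]*\.?\d+)", solution)
    if match is None:
        return None
    # Drop thousands separators so "1,200" and "1200" compare equal.
    return match.group(1).replace(",", "")


# Made-up two-step solution, for illustration only.
solution = (
    "Each box holds 12 pencils, so 3 boxes hold 3 * 12 = 36 pencils.\n"
    "After giving away 6, 36 - 6 = 30 pencils remain.\n"
    "#### 30"
)
print(extract_final_answer(solution))
```

Comparing normalized answer strings keeps the check independent of how the intermediate steps are worded.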
Data Sheet 1_MDNN-DTA: a multimodal deep neural network for drug-target...
figshare.com
pdf
Updated Mar 20, 2025
Cite
Xu Gao; Mengfan Yan; Chengwei Zhang; Gang Wu; Jiandong Shang; Congxiang Zhang; Kecheng Yang (2025). Data Sheet 1_MDNN-DTA: a multimodal deep neural network for drug-target affinity prediction.pdf [Dataset]. http://doi.org/10.3389/fgene.2025.1527300.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fgene.2025.1527300.s001
Dataset updated
Mar 20, 2025
Dataset provided by
Frontiers
Authors
Xu Gao; Mengfan Yan; Chengwei Zhang; Gang Wu; Jiandong Shang; Congxiang Zhang; Kecheng Yang
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Determining drug-target affinity (DTA) is a pivotal step in drug discovery, where in silico methods can significantly improve efficiency and reduce costs. Artificial intelligence (AI), especially deep learning models, can automatically extract high-dimensional features from the biological sequences of drug molecules and target proteins. This technology demonstrates lower complexity in DTA prediction compared to traditional experimental methods, particularly when handling large-scale data. In this study, we introduce a multimodal deep neural network model for DTA prediction, referred to as MDNN-DTA. This model employs Graph Convolutional Networks (GCN) and Convolutional Neural Networks (CNN) to extract features from the drug and protein sequences, respectively. One notable strength of our method is its ability to accurately predict DTA directly from the sequences of the target proteins, obviating the need for protein 3D structures, which are frequently unavailable in drug discovery. To comprehensively extract features from the protein sequence, we leverage an ESM pre-trained model for extracting biochemical features and design a specific Protein Feature Extraction (PFE) block for capturing both global and local features of the protein sequence. Furthermore, a Protein Feature Fusion (PFF) Block is engineered to augment the integration of multi-scale protein features derived from the abovementioned techniques. We then compare MDNN-DTA with other models on the same dataset, conducting a series of ablation experiments to assess the performance and efficacy of each component. The results highlight the advantages and effectiveness of the MDNN-DTA method.
math_dataset
tensorflow.org
huggingface.co
Updated Jan 4, 2023
Cite
(2023). math_dataset [Dataset]. https://www.tensorflow.org/datasets/catalog/math_dataset
Dataset updated
Jan 4, 2023
Description
Mathematics database.

This dataset code generates mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.

Original paper: Analysing Mathematical Reasoning Abilities of Neural Models (Saxton, Grefenstette, Hill, Kohli).

Example usage:

import tensorflow_datasets as tfds
train_examples, val_examples = tfds.load(
    'math_dataset/arithmetic_mul',
    split=['train', 'test'],
    as_supervised=True)

To use this dataset:

import tensorflow_datasets as tfds
ds = tfds.load('math_dataset', split='train')
for ex in ds.take(4):
    print(ex)

See the guide for more information on tensorflow_datasets.
Project CodeNet Dataset
paperswithcode.com
Updated Jun 10, 2022
Cite
Ruchir Puri; David S. Kung; Geert Janssen; Wei zhang; Giacomo Domeniconi; Vladimir Zolotov; Julian Dolby; Jie Chen; Mihir Choudhury; Lindsey Decker; Veronika Thost; Luca Buratti; Saurabh Pujar; Shyam Ramji; Ulrich Finkler; Susan Malaika; Frederick Reiss (2022). Project CodeNet Dataset [Dataset]. https://paperswithcode.com/dataset/project-codenet
Dataset updated
Jun 10, 2022
Authors
Ruchir Puri; David S. Kung; Geert Janssen; Wei zhang; Giacomo Domeniconi; Vladimir Zolotov; Julian Dolby; Jie Chen; Mihir Choudhury; Lindsey Decker; Veronika Thost; Luca Buratti; Saurabh Pujar; Shyam Ramji; Ulrich Finkler; Susan Malaika; Frederick Reiss
Description
Project CodeNet is a large-scale dataset with approximately 14 million code samples, each of which is an intended solution to one of 4000 coding problems. The code samples are written in over 50 programming languages (although the dominant languages are C++, C, Python, and Java) and are annotated with a rich set of information, such as code size, memory footprint, CPU run time, and status, which indicates acceptance or error types. The dataset is accompanied by a repository, where we provide a set of tools to aggregate code samples based on user criteria and to transform code samples into token sequences, simplified parse trees and other code graphs. A detailed discussion of Project CodeNet is available in this paper.

The rich annotation of Project CodeNet enables research in code search, code completion, code-code translation, and a myriad of other use cases. We also extracted several benchmarks in Python, Java and C++ to drive innovation in deep learning and machine learning models in code classification and code similarity.

Citation

@inproceedings{puri2021codenet,
  author = {Ruchir Puri and David Kung and Geert Janssen and Wei Zhang and Giacomo Domeniconi and Vladimir Zolotov and Julian Dolby and Jie Chen and Mihir Choudhury and Lindsey Decker and Veronika Thost and Luca Buratti and Saurabh Pujar and Ulrich Finkler},
  title = {Project CodeNet: A Large-Scale AI for Code Dataset for Learning a Diversity of Coding Tasks},
  year = {2021},
}
Implementation of emerging technologies in companies worldwide 2023
statista.com
flwrdeptvarieties.store
Updated Nov 9, 2024
Cite
Statista (2024). Implementation of emerging technologies in companies worldwide 2023 [Dataset]. https://www.statista.com/statistics/661164/worldwide-cio-survey-operational-priorities/
Dataset updated
Nov 9, 2024
Dataset authored and provided by
Statista, http://statista.com/
Time period covered
Jun 22, 2023 - Sep 18, 2023
Area covered
Worldwide
Description
As of 2023, nearly 92 percent of digital leaders globally stated that their companies had adopted cloud technology on either a small or a large scale. Big data/analytics was the second most widely adopted technology, with around 61 percent of respondents reporting the same.

Artificial intelligence/machine learning

At the same time, 26 percent of respondents were considering using artificial intelligence (AI)/machine learning (ML) technology, while 24 percent said that their companies were piloting the implementation of AI/ML technology.

What is cloud computing?  

Cloud computing refers to the use of networks of remote servers accessed over the internet to store, manage, and process data. It offers customers access to a wide range of technologies while lowering costs and reducing the need for technical expertise. The cloud service market is divided into three primary service models encompassing infrastructure, platforms, and software. Customers are able to choose between private, public, or hybrid cloud deployment depending on their business needs and security concerns.

SaaS: the most widely adopted cloud solutions    

In line with increases in companies’ adoption of cloud computing technologies, the worldwide revenue generated from these technologies has increased rapidly in recent years. Software as a Service (SaaS) is the largest segment of the global cloud computing market with revenues forecast to be around 197 billion U.S. dollars in 2023. Popular applications of SaaS include customer relationship management and enterprise resource planning software.

Cite

Omar Arab Oghli; Omar Arab Oghli (2022). Dataset - Templates Recommendation in the Open Research Knowledge Graph [Dataset]. http://doi.org/10.5281/zenodo.6607165

Dataset - Templates Recommendation in the Open Research Knowledge Graph

Available download formats: json

Unique identifier

https://doi.org/10.5281/zenodo.6607165

Dataset updated

Jun 3, 2022

Dataset provided by

Zenodo, http://zenodo.org/

Authors

Omar Arab Oghli; Omar Arab Oghli

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This publication therefore consists of one general dataset, two training sets (one for each approach), a validation set for the supervised approach, and a test set for both approaches.

dataset.json

The main JSON object consists of a list of templates and a list of neutral papers.

Each template object has an ID, a label, a list of research fields, a list of properties, and a list of papers using that template. Each paper object has an ID, a label, a DOI, a research field, and an abstract.

Each neutral paper object has the same schema as a paper object that uses a template.

See an example instance below.

{
  "templates": [
    {
      "id": "R138668",
      "label": "Psychiatric Disorders AI Overview",
      "research_fields": [
        {
          "id": "http://orkg.org/orkg/resource/R133",
          "label": "Artificial Intelligence"
        },
        ...
      ],
      "properties": [
        "Study cohort",
        ...
      ],
      "papers": [
        {
          "id": "R138698",
          "label": "Application of Autoencoder in Depression Diagnosis",
          "doi": "10.12783/dtcse/csma2017/17335",
          "research_field": {
            "id": "R104",
            "label": "Bioinformatics"
          },
          "abstract": "Major depressive disorder (MDD) is a mental disorder characterized by at least two weeks of low mood which is present across most situations. Diagnosis of MDD using rest-state functional magnetic resonance imaging (fMRI) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability. No method can automatically extract discriminative features from the origin time series in fMRI images for MDD diagnosis. In this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge. An autoencoder was used to learn pre-training parameters of a dimensionality reduction process using 3-D convolution network. Through comparison with the other three feature extraction methods, our method achieved the best classification performance. This method can be used not only in MDD diagnosis, but also other similar disorders."
        },
        ...
      ]
    },
    ...
  ],
  "neutral_papers": [
    {
      "id": "R109377",
      "label": "Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants",
      "doi": "10.1016/j.jpha.2020.03.009",
      "research_field": {
        "id": "R104",
        "label": "Bioinformatics"
      },
      "abstract": "Abstract The recent outbreak of coronavirus disease 2019 (COVID-19) caused by SARS-CoV-2 in December 2019 raised global health concerns. The viral 3-chymotrypsin-like cysteine protease (3CLpro) enzyme controls coronavirus replication and is essential for its life cycle. 3CLpro is a proven drug discovery target in the case of severe acute respiratory syndrome coronavirus (SARS-CoV) and middle east respiratory syndrome coronavirus (MERS-CoV). Recent studies revealed that the genome sequence of SARS-CoV-2 is very similar to that of SARS-CoV. Therefore, herein, we analysed the 3CLpro sequence, constructed its 3D homology model, and screened it against a medicinal plant library containing 32,297 potential anti-viral phytochemicals/traditional Chinese medicinal compounds. Our analyses revealed that the top nine hits might serve as potential anti- SARS-CoV-2 lead molecules for further optimisation and drug development process to combat COVID-19."
    },
    ...
  ]
}
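
The schema above can be walked with a few lines of Python. This is a minimal sketch, not part of the published dataset: the inline `SAMPLE` is a cut-down stand-in for dataset.json with abstracts shortened for illustration, and the helper name `template_paper_pairs` is hypothetical.

```python
import json

# Cut-down stand-in for dataset.json; abstracts are shortened for illustration.
SAMPLE = json.loads("""
{
  "templates": [
    {
      "id": "R138668",
      "label": "Psychiatric Disorders AI Overview",
      "research_fields": [
        {"id": "http://orkg.org/orkg/resource/R133", "label": "Artificial Intelligence"}
      ],
      "properties": ["Study cohort"],
      "papers": [
        {
          "id": "R138698",
          "label": "Application of Autoencoder in Depression Diagnosis",
          "doi": "10.12783/dtcse/csma2017/17335",
          "research_field": {"id": "R104", "label": "Bioinformatics"},
          "abstract": "..."
        }
      ]
    }
  ],
  "neutral_papers": [
    {
      "id": "R109377",
      "label": "Structural basis of SARS-CoV-2 3CLpro and anti-COVID-19 drug discovery from medicinal plants",
      "doi": "10.1016/j.jpha.2020.03.009",
      "research_field": {"id": "R104", "label": "Bioinformatics"},
      "abstract": "..."
    }
  ]
}
""")


def template_paper_pairs(dataset):
    """Return (template_id, paper_id) for every paper listed under a template."""
    return [
        (template["id"], paper["id"])
        for template in dataset["templates"]
        for paper in template["papers"]
    ]


pairs = template_paper_pairs(SAMPLE)
neutral_ids = [paper["id"] for paper in SAMPLE["neutral_papers"]]
print(pairs, neutral_ids)
```

To use the real file, replace `SAMPLE` with the result of `json.load` on an opened dataset.json.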

All other files

The main JSON object consists of a list of entailments, a list of contradictions and a list of neutrals.

See an example instance below.

{
  "entailments": [
    {
      "instance_id": "R138668xR138698",
      "template_id": "R138668",
      "paper_id": "R138698",
      "premise": "psychiatric disorders ai overview study cohort outcome assessment aims performance findings used models data",
      "hypothesis": "application of autoencoder in depression diagnosis major depressive disorder (mdd) is a mental disorder characterized by at least two weeks of low mood which is present across most situations diagnosis of mdd using rest state functional magnetic resonance imaging (fmri) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability no method can automatically extract discriminative features from the origin time series in fmri images for mdd diagnosis in this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge an autoencoder was used to learn pre training parameters of a dimensionality reduction process using 3 d convolution network through comparison with the other three feature extraction methods, our method achieved the best classification performance this method can be used not only in mdd diagnosis, but also other similar disorders",
      "sequence": "[CLS] psychiatric disorders ai overview study cohort outcome assessment aims performance findings used models data [SEP] application of autoencoder in depression diagnosis major depressive disorder (mdd) is a mental disorder characterized by at least two weeks of low mood which is present across most situations diagnosis of mdd using rest state functional magnetic resonance imaging (fmri) data faces many challenges due to the high dimensionality, small samples, noisy and individual variability no method can automatically extract discriminative features from the origin time series in fmri images for mdd diagnosis in this study, we proposed a new method for feature extraction and a workflow which can make an automatic feature extraction and classification without a prior knowledge an autoencoder was used to learn pre training parameters of a dimensionality reduction process using 3 d convolution network through comparison with the other three feature extraction methods, our method achieved the best classification performance this method can be used not only in mdd diagnosis, but also other similar disorders [SEP]",
      "target": "entailment"
    },
    ...
  ],
  "contradictions": [ ... ],
  "neutrals": [ ... ]
}
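
In the example instance, `instance_id` joins the template and paper IDs with an `x`, and `sequence` wraps the premise and hypothesis in `[CLS]` and `[SEP]` markers. The sketch below reproduces only that visible pattern. The function names are hypothetical, the hypothesis is shortened for illustration, and how the authors normalised the text beforehand is not shown here.

```python
def build_instance_id(template_id: str, paper_id: str) -> str:
    """Join the two IDs the way the example instance does (e.g. R138668xR138698)."""
    return f"{template_id}x{paper_id}"


def build_sequence(premise: str, hypothesis: str) -> str:
    """Concatenate premise and hypothesis with the markers seen in the example."""
    return f"[CLS] {premise} [SEP] {hypothesis} [SEP]"


premise = (
    "psychiatric disorders ai overview study cohort outcome assessment "
    "aims performance findings used models data"
)
# Shortened for illustration; the real hypothesis is the paper title plus abstract.
hypothesis = "application of autoencoder in depression diagnosis"

instance = {
    "instance_id": build_instance_id("R138668", "R138698"),
    "template_id": "R138668",
    "paper_id": "R138698",
    "premise": premise,
    "hypothesis": hypothesis,
    "sequence": build_sequence(premise, hypothesis),
    "target": "entailment",
}
print(instance["sequence"])
```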

Statistics

Class	Training (supervised)	Validation (supervised)	Training (unsupervised)	Test
Entailment	180	20	200	52
Neutral	180	20	200	64
Contradiction	736	84	0	0
Total	1096	124	400	116
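
As a quick consistency check, the per-class counts in the table can be summed and compared with the reported totals. The dictionary layout below is only an illustrative assumption; the numbers are copied from the table.

```python
# Per-class instance counts copied from the statistics table.
COUNTS = {
    "Training (supervised)": {"entailment": 180, "neutral": 180, "contradiction": 736},
    "Validation (supervised)": {"entailment": 20, "neutral": 20, "contradiction": 84},
    "Training (unsupervised)": {"entailment": 200, "neutral": 200, "contradiction": 0},
    "Test": {"entailment": 52, "neutral": 64, "contradiction": 0},
}

# Totals as reported in the last row of the table.
REPORTED_TOTALS = {
    "Training (supervised)": 1096,
    "Validation (supervised)": 124,
    "Training (unsupervised)": 400,
    "Test": 116,
}

computed_totals = {split: sum(classes.values()) for split, classes in COUNTS.items()}
for split, total in computed_totals.items():
    status = "ok" if total == REPORTED_TOTALS[split] else "mismatch"
    print(f"{split}: {total} ({status})")
```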
