MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Please temporarily download the raw data from: https://drive.google.com/drive/folders/1ATaUy2VKGyvh6DzuvkMm_DSSAvOEgdH0?usp=drive_link We will upload it here as soon as possible. Please note that the full raw data requires approximately 270 GB of storage.
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
The dataset comprises 8,712 files across 6 programming languages, featuring verified tasks and benchmarks for evaluating coding agents and language models. It introduces new benchmarks with real-world coding tasks, providing datasets of software engineering problems and tests. It builds upon the original SWE-bench by evaluating repository-level challenges and scoring performance.
With its multi-language test sets and golden patches, the dataset lets researchers and developers compare large language models and developer tools on real software engineering challenges.
Engineered specifically for evaluating advanced coding and software development, the SWE-Bench dataset supports research in code generation, automated patching, and fixing GitHub issues.
The dataset provides a robust foundation for achieving higher accuracy in code generation and advancing automated software development tools, which are essential for improving developer productivity and software quality.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Several grafting methods have been developed, and bench grafting with stratification is the most widely used technique, except in Brazil, where it is still being adapted. The objective of this study was to evaluate how long plant material can be stored before grafting and the optimum temperature for stratification. Cultivar 'Paulsen 1103' was used as rootstock and 'Niagara Rosada' as scion cultivar. The storage period treatments were 0, 30, 60 and 90 days at a temperature of 3 °C and 95% relative humidity. After the storage period, the branches were removed from the cold chamber, grafted, and then placed at 19 °C and 24 °C for stratification. After 21 days of stratification, the vine grafts were planted in commercial substrate and left to grow for 160 days. The vine cuttings of cultivars Niagara and Paulsen 1103 can be stored in a cold chamber at 3 °C for 90 days and, during this period, bench grafting can be performed at any time. However, the vines from cuttings stored in the cold chamber for more than 30 days show better growth. It is recommended to stratify the vine grafts at 19 °C.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
WebGen-Bench was created to benchmark the ability of LLM-based agents to generate websites from scratch. The dataset is introduced in WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch. It contains 101 instructions and 647 test cases. It also has a training set of 6667 instructions, named WebGen-Instruct.
The evaluation code, the training code, and the data are released at WebGen-Bench (GitHub).
If you find our project useful, please cite:
@misc{lu2025webgenbenchevaluatingllmsgenerating,
title={WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch},
author={Zimu Lu and Yunqiao Yang and Houxing Ren and Haotian Hou and Han Xiao and Ke Wang and Weikang Shi and Aojun Zhou and Mingjie Zhan and Hongsheng Li},
year={2025},
eprint={2505.03733},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2505.03733},
}
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
paper | kaggle | huggingface | github
This repository is the official implementation of the paper: DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?
Vision-language models (VLMs) exhibit strong zero-shot generalization on natural images and show early promise in interpretable medical image analysis. However, existing benchmarks do not systematically evaluate whether these models truly reason like human clinicians or merely imitate superficial patterns.
To address this gap, we propose DrVD-Bench, the first multimodal benchmark for clinical visual reasoning. DrVD-Bench consists of three modules: Visual Evidence Comprehension, Reasoning Trajectory Assessment, and Report Generation Evaluation, comprising 7,789 image-question pairs.
Our benchmark covers 20 task types, 17 diagnostic categories, and five imaging modalities: CT, MRI, ultrasound, X-ray, and pathology. DrVD-Bench mirrors the clinical workflow from modality recognition to lesion identification and diagnosis.
We benchmark 19 VLMs (general-purpose & medical-specific, open-source & proprietary) and observe that performance drops sharply as reasoning complexity increases. While some models begin to exhibit traces of human-like reasoning, they often rely on shortcut correlations rather than grounded visual understanding. DrVD-Bench therefore provides a rigorous framework for developing clinically trustworthy VLMs.
pip3 install -r requirements.txt
Report generation evaluation uses DeepSeek to extract report keywords. Models with weaker instruction-following ability can also use DeepSeek to extract answers from their outputs.
You can apply for an API key on the DeepSeek platform.
For more details, please refer to the official documentation: DeepSeek API Docs.
model_response field in the corresponding files.
model_response format requirements: a single option letter (A / B / C …) or a list of option letters (['B','D','A']).
The Qwen-2.5-VL-72B API can be obtained on the Alibaba Cloud Bailian platform.
· task - joint_qa.jsonl
~~~bash
python qwen2.5vl_example.py \
  --API_KEY="your_qwen_api_key" \
  --INPUT_PATH="/path/to/joint_qa.jsonl" \
  --OUTPUT_PATH="/path/to/result.jsonl" \
  --IMAGE_ROOT='path/to/benchmark/data/root' \
  --type="joint"
~~~
· other tasks
~~~bash
python qwen2.5vl_example.py \
  --API_KEY="your_qwen_api_key" \
  --INPUT_PATH="/path/to/qa.jsonl" \
  --OUTPUT_PATH="/path/to/result.jsonl" \
  --IMAGE_ROOT='path/to/benchmark/data/root' \
  --type="single"
~~~
For models with weaker instruction-following ability: if your model cannot standardize its outputs according to the above format, you can use the following script to extract option answers from the model_response field:
~~~bash
python map.py \
  --API_KEY="your_deepseek_api_key" \
  --INPUT_FILE="/path/to/model_result.jsonl" \
  --OUTPUT_FILE="/path/to/model_result_mapped.jsonl"
~~~
python compute_choice_metric.py \
--json_path="/path/to/results.jsonl" \
--type='single'
python compute_choice_metric.py \
--json_path="/path/to/results.jsonl" \
--type='joint'
python report_generation_metric.py \
--API_KEY='your_deepseek_api_key' \
--JSON_PATH='/path/to/results.jsonl'
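To make the choice-metric step more concrete, here is a minimal sketch of an exact-match accuracy over a results file. It is not the repository's compute_choice_metric.py. The model_response field comes from the description above, while the answer field name, the helper names, and the exact-match scoring rule are assumptions made for illustration.

```python
import json


def normalize(resp):
    """Upper-case and strip an option letter, or each letter in a list."""
    if isinstance(resp, list):
        return [str(r).strip().upper() for r in resp]
    return str(resp).strip().upper()


def choice_accuracy(records):
    """Fraction of records whose model_response exactly matches the answer.

    Assumption: each record carries a reference under the key "answer".
    """
    if not records:
        return 0.0
    hits = sum(
        normalize(r["model_response"]) == normalize(r["answer"]) for r in records
    )
    return hits / len(records)


def load_jsonl(path):
    """Read one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Made-up records covering a single-choice and a joint (list) response.
sample = [
    {"model_response": "b", "answer": "B"},
    {"model_response": "C", "answer": "A"},
    {"model_response": ["B", "D", "A"], "answer": ["B", "D", "A"]},
]
print(choice_accuracy(sample))
```

With a real results file, `choice_accuracy(load_jsonl("/path/to/results.jsonl"))` would apply the same rule, provided the records use these field names.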
http://www.nationalarchives.gov.uk/doc/non-commercial-government-licence/non-commercial-government-licence.htm
Data from Transnationalizing Modern Languages (09-2018)
Transnationalizing Modern Languages: Mobility, Identity and Translation in Modern Italian Cultures (TML) (funded by the AHRC under the "Translating Cultures" theme, 2014-17)
PI Charles Burdett, University of Bristol. CIs Jenny Burns (Warwick), Loredana Polezzi (Warwick/Cardiff), Derek Duncan (St Andrews), Margaret Hills de Zarate (QMU)
RAs: Barbara Spadaro (Bristol), Carlo Pirozzi (St Andrews), Marco Santello (Warwick), Naomi Wells (Warwick), Luisa Percopo (Cardiff)
PhD students: Iacopo Colombini (St Andrews), Georgia Wall (Warwick)
Below is a short description of the project. Within the repository, there is a longer description of TML and each folder is accompanied by an explanatory text.
The project investigates practices of linguistic and cultural interchange within communities and individuals and explores the ways in which cultural translation intersects with linguistic translation in the everyday lives of people. The project has used as its primary object of enquiry the 150-year history of Italy as a nation state and its patterns of emigration and immigration. TML has concentrated on a series of exemplary cases, representative of the geographic, historical and linguistic map of Italian mobility. Focussing on the cultural associations that each community has formed, it examines the wealth of publications and materials that are associated with these organizations.
Working closely with researchers from across Modern Languages, the project has sought to demonstrate the principle that language is most productively apprehended in the frame of translation and the national in the frame of the transnational. TML is contributing to the development of a new framework for the disciplinary field of MLs, one which puts the interaction of languages and cultures at its core.
The principles of co-production and co-research lie at the core of the project and TML has worked closely with a very extensive range of partners, including Castlebrae and Drummond Community High Schools and cultural associations across the world. The project exhibition, featuring the research of the project and including the work of photographer Mario Badagliacca, was curated by Viviana Gravano and Giulia Grechi of Routes Agency. Project events in the UK have drawn on the expertise of Rita Wilson (Monash), the writer Shirin Ramzanali Fazel and all members of the Advisory Board. The project, in close collaboration with the University of Namibia (UNAM) and the Phoenix Project (Cardiff), has been followed by "TML: Global Challenges".
This is the OST dataset from our work, OST-Bench: An Online Spatio-temporal Scene Understanding benchmark.
The OST_bench_v0.json file contains 1.4k multi-turn sessions (10k samples in total) for evaluation; the OST_bench_training_v0.json file contains 7k multi-turn sessions (50k samples in total) for training. Each multi-turn session represents an agent's exploration and contains a set of multi-turn dialogues. Each sample conforms to the following dictionary format:
python
{
scan_id(str): The scan ID,
system_prompt(str): The system prompt to input to the model,
user_message(list[dict]): A list of multi-turn dialogues.
}
The "user_message" contains multi-turn dialogues with the agent, where each turn is represented as a dictionary in the following format, each path in the "image_paths" refers to an image in the img directory:
python
{
turn_id(int): the index of the turn, which acts as a timestamp,
type(str): the subtype of the question,
origin_question(str): the original version of the question,
answer(str): the answer to the question,
option(list[str]): the options,
image_paths(list[str]): a list of new observations,
prompt(str): the prompt fed to the model together with the new observations for each turn,
}
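As an illustration of the two dictionary formats above, the following sketch walks the turns of each session in turn_id order and counts question subtypes. It assumes the JSON file's top level is a list of such session dictionaries. The helper names and the demo values (including the subtype strings) are made up for the example and are not taken from the dataset.

```python
from collections import Counter


def iter_turns(sessions):
    """Yield (scan_id, turn) pairs, with each session's turns ordered by turn_id."""
    for session in sessions:
        turns = sorted(session["user_message"], key=lambda t: t["turn_id"])
        for turn in turns:
            yield session["scan_id"], turn


def count_by_type(sessions):
    """Count how many turns fall under each question subtype."""
    return dict(Counter(turn["type"] for _, turn in iter_turns(sessions)))


# Made-up session mirroring the documented fields (values are hypothetical).
demo = [
    {
        "scan_id": "scan_0000",
        "system_prompt": "You are an agent exploring a scene.",
        "user_message": [
            {
                "turn_id": 1,
                "type": "counting",
                "origin_question": "How many chairs have you seen?",
                "answer": "B",
                "option": ["A. 1", "B. 2"],
                "image_paths": ["img/scan_0000/1.jpg"],
                "prompt": "How many chairs have you seen? A. 1 B. 2",
            },
            {
                "turn_id": 0,
                "type": "judgement",
                "origin_question": "Is the door visible?",
                "answer": "A",
                "option": ["A. yes", "B. no"],
                "image_paths": ["img/scan_0000/0.jpg"],
                "prompt": "Is the door visible? A. yes B. no",
            },
        ],
    }
]

print(count_by_type(demo))
```

In practice `demo` would be replaced by `json.load(open("OST_bench_v0.json"))`, if the file layout matches the assumption above.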
Our data samples include diverse question types, covering three major categories (Agent State, Agent Visible Info and Agent-object Spatial Relationship) and four distinct question formats (Judgement / Counting / Temporal-localization / Estimation). For details on how to use this data for model evaluation, please refer to our code repository.
The evaluation images are stored in img.zip, while the training images are split into two parts: img_train_part.zip and img_train_part.z01.
https://www.datainsightsmarket.com/privacy-policy
The Battery Test Bench market is experiencing robust growth, driven by the burgeoning electric vehicle (EV) industry and the increasing demand for energy storage solutions. The market's expansion is fueled by stringent government regulations promoting EV adoption globally, alongside the continuous advancements in battery technology requiring rigorous testing procedures. This necessitates sophisticated test benches capable of evaluating various battery parameters, including performance, safety, and lifespan under diverse operating conditions. While precise market sizing data wasn't provided, considering the rapid growth of the EV sector and the crucial role of battery testing, a reasonable estimate for the 2025 market size could be placed in the range of $2.5 billion to $3 billion. Assuming a conservative Compound Annual Growth Rate (CAGR) of 15% over the forecast period (2025-2033), the market is projected to reach a significant size by 2033, driven by factors like increasing EV production, grid-scale energy storage deployments, and expanding research and development in battery technologies. Key market segments include those catering to different battery chemistries (Li-ion, solid-state, etc.), testing types (cell, module, pack), and application areas (EVs, grid storage, portable electronics). Major players like FEV, HORIBA, Simpro, Chroma ATE, and others are actively investing in R&D and expanding their product portfolios to capitalize on this expanding market. However, challenges remain, including the high cost of advanced testing equipment and the need for standardized testing protocols across different regions. Furthermore, the relatively longer lead times for customized testing solutions can pose a constraint to immediate market expansion. Nevertheless, the long-term outlook remains positive, given the continued growth of the EV and renewable energy sectors, driving demand for robust and reliable battery test benches for years to come.
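The blurb gives a 2025 estimate and an assumed CAGR but no 2033 figure. The sketch below applies the standard compound-growth formula, value × (1 + CAGR)^years, to those two stated assumptions. The resulting numbers are arithmetic on the blurb's own estimates, not reported market data, and the function name is illustrative.

```python
def project(value, cagr, years):
    """Compound a starting value at a constant annual growth rate."""
    return value * (1.0 + cagr) ** years


YEARS = 2033 - 2025   # forecast period stated in the text
CAGR = 0.15           # "conservative" rate assumed in the text

low = project(2.5, CAGR, YEARS)    # lower 2025 estimate, billion USD
high = project(3.0, CAGR, YEARS)   # upper 2025 estimate, billion USD
print(f"{low:.2f} to {high:.2f} billion USD")
```

Under these assumptions the 2033 range comes out at roughly $7.6 billion to $9.2 billion, about three times the 2025 estimate.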
The data set comprises data from several sensors collected from a typical induction motor drive deployed at a test bench. It includes measurements from the electrical, thermal and mechanical domains. The test bench measurements were collected at the LEA department of Paderborn University.
The data set comprises approximately 262 hours of test bench measurements in the complete operating range of the exemplary drive system.
A comprehensive description of the data set can be found in the following paper (freely available):
This work was funded by the German Research Foundation (DFG) under the reference number 389029890.
Department of Power Electronics and Electrical Drives, Paderborn University, Germany
Data files © Original Authors
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
All data is deanonymized now. Moreover, 17 additional measurement profiles were added, expanding the dataset from 138 hours to 185 hours of records.
The data set comprises data from several sensors collected from a permanent magnet synchronous motor (PMSM) deployed on a test bench. The PMSM represents a German OEM's prototype model. Test bench measurements were collected by the LEA department at Paderborn University.
All recordings are sampled at 2 Hz. The data set consists of multiple measurement sessions, which can be distinguished from each other by column "profile_id". A measurement session can be between one and six hours long.
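Since sessions are identified by "profile_id" and sampled at 2 Hz, the length of each session follows directly from its row count. The sketch below shows that calculation on made-up profile IDs. The function name is illustrative, and the column values would come from the dataset's "profile_id" column.

```python
from collections import Counter

SAMPLE_RATE_HZ = 2.0  # sampling rate stated in the dataset description


def session_hours(profile_ids, sample_rate_hz=SAMPLE_RATE_HZ):
    """Map each profile_id to its session length in hours.

    profile_ids: one entry per recorded row, e.g. the "profile_id" column.
    """
    counts = Counter(profile_ids)
    return {pid: n / sample_rate_hz / 3600.0 for pid, n in counts.items()}


# Made-up example: two sessions with 7,200 and 14,400 rows.
ids = [4] * 7200 + [6] * 14400
print(session_hours(ids))
```

A one-hour session therefore corresponds to 7,200 rows and a six-hour session to 43,200 rows.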
The motor is excited by hand-designed driving cycles denoting a reference motor speed and a reference torque. Currents in d/q-coordinates (columns "i_d" and "i_q") and voltages in d/q-coordinates (columns "u_d" and "u_q") are a result of a standard control strategy trying to follow the reference speed and torque. Columns "motor_speed" and "torque" are the resulting quantities achieved by that strategy, derived from set currents and voltages.
Most driving cycles denote random walks in the speed-torque-plane in order to imitate real world driving cycles to a more accurate degree than constant excitations and ramp-ups and -downs would.
Several publications leveraged the setup of the PMSM in the Paderborn University Lab:
Please cite the following publication if you intend to use this dataset for your own publications:
The most interesting target features are rotor temperature ("pm"), stator temperatures ("stator_*") and torque. Rotor temperature and torque in particular cannot be measured reliably and economically in a commercial vehicle.
Being able to have strong estimators for the rotor temperature helps the automotive industry to manufacture motors with less material and enables control strategies to utilize the motor to its maximum capability. A precise torque estimate leads to more accurate and adequate control of the motor, reducing power losses and eventually heat build-up.
The data set comprises data from several sensors collected from a typical combined system of an inverter, an induction motor, and a control system, deployed on a test bench. Test bench measurements were collected by the LEA department at Paderborn University.
An inverter is a power electronic component with transistors (read 'switches') that determine how the battery voltage (the so-called DC-link voltage) is applied to the three phase circuits of the electric motor. According to some control strategy, the control unit decides the switching states of the inverter at each discrete point in time.
The data set comprises approximately 235 thousand samples in the complete operating range of an exemplary drive system.
Rows follow no particular order.
The most important aspect of an electric vehicle from a marketing and engineering perspective is its efficiency and, thus, achievable range. For this, it is essential to avoid over-dimensioning of the drive train, i.e. applying more and heavier metal packs to increase its thermal capabilities. If the motor is controlled inefficiently through the inverter, there'll be superfluous power losses, i.e. heat build-up, which eventually leads to electric power derating during operation and, crucially, early depletion of the battery.
Precise phase voltage information is mandatory in order to enable an accurate, efficient or high dynamic control performance of electric motor drives, especially if a torque-controlled operation is considered. However, most electrical drives do not measure the phase voltages online due to their cost implications, and, therefore, these have to be estimated by inverter models.
Because of various nonlinear switching effects partly at nanosecond scale, an analytical white-box modeling approach is hardly feasible in a control context. Hence, data-driven inverter models seem favorable for this purpose.
Since the control utilizes pulse width modulation (PWM), the mean phase voltages for each PWM interval are the targets of the inverter models.
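To illustrate what a mean phase voltage per PWM interval is, the sketch below averages an idealized switching waveform over one interval. It assumes ideal switches, so the phase-to-negative-rail voltage equals the DC-link voltage while the upper switch is on and zero otherwise. It deliberately ignores the nonlinear switching effects mentioned above, which are what a data-driven inverter model would have to capture. The function name and the example values are illustrative and not taken from the dataset.

```python
def mean_phase_voltage(switch_states, u_dc):
    """Average phase-to-negative-rail voltage over one PWM interval.

    switch_states: equally spaced samples of the upper switch state
                   (1 = on, 0 = off) covering exactly one PWM interval.
    u_dc:          DC-link voltage in volts.
    Ideal-switch assumption: no dead time, no voltage drops.
    """
    if not switch_states:
        raise ValueError("need at least one sample per PWM interval")
    duty_cycle = sum(switch_states) / len(switch_states)
    return duty_cycle * u_dc


# Made-up interval: switch on for 25 of 100 samples, 560 V DC link.
states = [1] * 25 + [0] * 75
print(mean_phase_voltage(states, 560.0))
```

In this idealized case the mean voltage is simply the duty cycle times the DC-link voltage; the gap between this value and the measured mean is the quantity an inverter model needs to explain.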
A comprehensive description of the data set can be found in the following paper (freely available): Data Set Description: Three-Phase IGBT Two-Level Inverter for Electrical Drives DOI: 10.13140/RG.2.2.23335.37280
This work was funded by the German Research Foundation (DFG) under the reference number 389029890.
Department of Power Electronics and Electrical Drives, Paderborn University, Germany
https://creativecommons.org/publicdomain/zero/1.0/
This is a dataset of VM images from the VMware marketplace, mainly intended for use within data deduplication projects. This dataset is compatible with the DedupBench framework on GitHub.
Note that this is the full DEB dataset used in our VectorCDC paper from FAST 2025. Please cite our paper using the citation below if you are using this dataset.
Udayashankar, S., Baba, A. and Al-Kiswany, S., 2025, February. VectorCDC: Accelerating Data Deduplication with Vector Instructions. In 2025 USENIX 23rd Conference on File and Storage Technologies (FAST '25). USENIX.
@inproceedings {305256,
author = {Sreeharsha Udayashankar and Abdelrahman Baba and Samer Al-Kiswany},
title = {{VectorCDC}: Accelerating Data Deduplication with Vector Instructions},
booktitle = {23rd USENIX Conference on File and Storage Technologies (FAST 25)},
year = {2025},
isbn = {978-1-939133-45-8},
address = {Santa Clara, CA},
pages = {513--522},
url = {https://www.usenix.org/conference/fast25/presentation/udayashankar},
publisher = {USENIX Association},
month = feb
}