Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
C-Eval is a comprehensive Chinese evaluation suite for foundation models. It consists of 13948 multi-choice questions spanning 52 diverse disciplines and four difficulty levels. Please visit our website and GitHub or check our paper for more details. Each subject consists of three splits: dev, val, and test. The dev set per subject consists of five exemplars with explanations for few-shot evaluation. The val set is intended to be used for hyperparameter tuning. And the test set is for model… See the full description on the dataset page: https://huggingface.co/datasets/ceval/ceval-exam.
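The few-shot setup described above (five dev exemplars per subject, followed by the question to be answered) can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the field names `question`, `A` to `D`, and `answer`, and the prompt layout, are assumptions made for this sketch.

```python
from typing import Dict, List

# Assumed record layout (hypothetical for this sketch): a question string,
# four option strings under "A".."D", and the correct letter under "answer".
CHOICE_KEYS = ("A", "B", "C", "D")


def format_example(item: Dict[str, str], include_answer: bool = True) -> str:
    """Render one multiple-choice item as plain text."""
    lines = [item["question"]]
    lines.extend(f"{key}. {item[key]}" for key in CHOICE_KEYS)
    lines.append(f"Answer: {item['answer']}" if include_answer else "Answer:")
    return "\n".join(lines)


def build_few_shot_prompt(dev_items: List[Dict[str, str]], target: Dict[str, str]) -> str:
    """Prepend the dev exemplars (with answers) to the unanswered target item."""
    blocks = [format_example(item) for item in dev_items]
    blocks.append(format_example(target, include_answer=False))
    return "\n\n".join(blocks)
```

Calling `build_few_shot_prompt(dev_items, target)` with the five dev exemplars of a subject yields a prompt that ends with an open `Answer:` line for the model to complete.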

Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by GinRawin
Released under Apache 2.0

Adds a 'choices' column to the original dataset.
Citation
If you use C-Eval benchmark or the code in your research, please cite their paper: @article{huang2023ceval, title={C-Eval: A Multi-Level Multi-Discipline Chinese Evaluation Suite for Foundation Models}, author={Huang, Yuzhen and Bai, Yuzhuo and Zhu, Zhihao and Zhang, Junlei and Zhang, Jinghan and Su, Tangjun and Liu, Junteng and Lv, Chuancheng and Zhang, Yikai and Lei, Jiayi and Fu, Yao and Sun, Maosong and He, Junxian}… See the full description on the dataset page: https://huggingface.co/datasets/zacharyxxxxcr/ceval-exam.
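As a rough illustration of what adding a 'choices' column could look like, the sketch below collects the four options of a record into one list. It assumes the original records store the options under separate `A`, `B`, `C`, and `D` keys and that `choices` is an ordered list; the dataset page does not confirm either detail.

```python
from typing import Any, Dict

OPTION_KEYS = ("A", "B", "C", "D")  # assumed option field names


def add_choices(record: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record with a 'choices' list built from A-D."""
    enriched = dict(record)  # leave the input record untouched
    enriched["choices"] = [record[key] for key in OPTION_KEYS]
    return enriched
```

With the Hugging Face `datasets` library, a function of this shape could be passed to `Dataset.map` to add the column to every row.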

Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
C-Eval (Test-Split) in Unified JSONL Format
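Dataset description
This project converts the test split of the C-Eval dataset into a unified instruction-style JSONL format to simplify the evaluation and testing of large language models. C-Eval is a comprehensive Chinese evaluation suite for foundation models, designed to measure a language model's abilities in Chinese language and knowledge. The data in this repository comes from the ceval test set and uses exactly the same processing pipeline and data structure as the CMMLU dataset, so users can evaluate both within a unified framework.
Data format
The dataset is in JSONL format; each line is a standalone JSON object. The structure is designed to fit standard instruction-tuning and inference pipelines. Each JSON object contains the following fields:
id: the unique identifier of the sample.
instruction: the instruction text that guides the model to answer a single-choice question.
choices: a dictionary containing four options, with keys "A", "B", "C", "D".
answer: the correct answer to the question ('A', 'B', 'C', or 'D').
Format example: { "id":… See the full description on the dataset page: https://huggingface.co/datasets/zhaode/ceval.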
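The field list above can be checked mechanically when reading the file. The following is a minimal sketch, assuming only the four documented fields; the helper names `parse_record` and `accuracy` are hypothetical and not part of the dataset's tooling.

```python
import json
from typing import Any, Dict, List

REQUIRED_FIELDS = ("id", "instruction", "choices", "answer")
OPTION_KEYS = {"A", "B", "C", "D"}


def parse_record(line: str) -> Dict[str, Any]:
    """Parse one JSONL line and verify it has the documented structure."""
    record = json.loads(line)
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    if set(record["choices"]) != OPTION_KEYS:
        raise ValueError("choices must have exactly the keys A, B, C, D")
    if record["answer"] not in OPTION_KEYS:
        raise ValueError("answer must be one of A, B, C, D")
    return record


def accuracy(records: List[Dict[str, Any]], predictions: Dict[Any, str]) -> float:
    """Fraction of records whose predicted letter equals the gold answer."""
    if not records:
        return 0.0
    correct = sum(1 for r in records if predictions.get(r["id"]) == r["answer"])
    return correct / len(records)
```

Here `predictions` maps each sample `id` to the letter a model chose; records without a prediction count as wrong.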

WebQA, CEval, CMMLU, and MMLU for general chat

liangzid/robench-eval-Time1-c dataset hosted on Hugging Face and contributed by the HF Datasets community

New Ceval Ltda Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

MIT License https://opensource.org/licenses/MIT
License information was derived automatically
(1) As far as we know, this is the largest QA dataset for Chinese Construction Laws and Regulations (CCLR). For example, well-known datasets like C-Eval typically contain only about 500 questions in a single domain, whereas our dataset focuses specifically on the CCLR domain and includes 6,359 questions.
(2) This dataset has 2,240 questions from the Registered Constructor Qualification Examination (RCQE) and 4,119 self-designed questions covering 8 CCLR subdomains.
(3)… See the full description on the dataset page: https://huggingface.co/datasets/AnonymousSite/QA_dataset_for_CCLR.

Accent Footwear Limited C O Ceval Logis Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset is about: (Appendix C) Carbonate, organic carbon, and Rock-Eval pyrolysis at DSDP Hole 77-535. Please consult parent dataset @ https://doi.org/10.1594/PANGAEA.809136 for more information.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: This study aims to construct a multimodal radiomics model based on contrast-enhanced ultrasound (CEUS) radiomic features, combined with conventional ultrasonography (US) images and clinical data, to evaluate its diagnostic efficacy in differentiating benign and malignant thyroid nodules (TNs) classified as C-TIRADS 4, and to assess the clinical application value of the model.
Methods: This retrospective study enrolled 135 patients with C-TIRADS 4 thyroid nodules who underwent concurrent US and CEUS before FNA/surgery. From each case, one US image and three CEUS key frames (2s post-perfusion, peak enhancement, 2s post-peak) were selected. Patients were randomly split into training (n=108) and test (n=27) cohorts (8:2 ratio). ROIs were manually delineated (3D-Slicer), with radiomics features extracted (PyRadiomics) and selected via mRMR and LASSO. Six CEUS radiomics-based machine learning models (KNN, SVM, RF, XGBoost, LightGBM, SGD) were developed and evaluated using AUC, accuracy, sensitivity, specificity, and F1-score. The optimal classifier was used to build US-only, US+CEUS, and clinical+US+CEUS models. Statistical comparisons employed DeLong tests, calibration curves, and DCA.
Results: The CEUS radiomics model demonstrated favorable diagnostic performance in differentiating benign and malignant C-TIRADS 4 thyroid nodules, with sensitivity, specificity, and accuracy of 0.875, 0.769, and 0.833, respectively. When CEUS radiomic features were combined with US features, the diagnostic performance of the CEUS radiomics model was comparable to that of the US+CEUS radiomics model (AUC: 0.813 vs. 0.829, P=0.005). Furthermore, the multimodal radiomics model integrating clinical data (clinical+US+CEUS radiomics model) achieved significantly improved diagnostic efficacy, with an AUC of 0.967, along with accuracy, sensitivity, specificity, and F1-score values of 0.815, 0.823, 0.792, and 0.884, respectively.
Conclusion: Our study developed a high-performance multimodal diagnostic model through the innovative integration of radiomic features from three critical CEUS timepoints combined with conventional ultrasound and clinical data, establishing a novel decision-support tool for accurate noninvasive classification of C-TIRADS 4 thyroid nodules. The model’s superior diagnostic performance (AUC 0.967) demonstrates the transformative potential of multimodal integration in overcoming single-modality limitations and enhancing clinical decision-making, positioning this approach as a promising solution to mitigate unnecessary diagnostic procedures and overtreatment.
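For readers less familiar with the metrics reported above, the sketch below shows how sensitivity, specificity, accuracy, and F1-score follow from a binary confusion matrix. It uses the standard textbook definitions with malignant as the positive class; the function name and any counts passed to it are hypothetical and are not the study's data or code.

```python
from typing import Dict


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute common diagnostic metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one positive case,
    one negative case, and one positive prediction).
    """
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }
```

AUC is not covered here because it depends on the ranked prediction scores rather than on a single set of counts.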

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SA-Co/Gold is a benchmark for promptable concept segmentation (PCS) in images developed by Meta for the Segment Anything 3 model (SAM 3). The benchmark contains images paired with text labels (also referred to as noun phrases, or NPs), each annotated exhaustively with masks on all object instances that match the label.
SA-Co/Gold comprises 7 subsets, each targeting a different annotation domain. For each subset, the annotations are multi-reviewed and agreed upon by 3 human annotators, resulting in a high-quality benchmark.
This Project allows you to explore MetaCLIP Merged C Release Test, which is one of three annotations for the MetaCLIP captioner NPs subset. You can see the other two at MetaCLIP Merged B Release Test and MetaCLIP Merged A Release Test.
The SA-Co/Gold test data is available in its canonical, eval-ready form below.
Download SA-1B images: https://sa-co.roboflow.com/gold/sa1b-images.zip
Download MetaCLIP images: https://sa-co.roboflow.com/gold/metaclip-images.zip
Download ground truth annotations: https://sa-co.roboflow.com/gold/gt-annotations.zip
Download the full bundle: https://sa-co.roboflow.com/gold/all.zip
The SA-Co/Gold dataset covers 2 image sources and 7 annotation domains. The image sources are: MetaCLIP and SA-1B. The annotation domains are: MetaCLIP captioner NPs, SA-1B captioner NPs, Attributes, Crowded Scenes, Wiki-Common1K, Wiki-Food/Drink, Wiki-Sports Equipment.
Explore all: SA-Co/Gold on Roboflow Universe
Read Meta's data license for SA-Co/Gold: License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SA-Co/Gold is a benchmark for promptable concept segmentation (PCS) in images developed by Meta for the Segment Anything 3 model (SAM 3). The benchmark contains images paired with text labels (also referred to as noun phrases, or NPs), each annotated exhaustively with masks on all object instances that match the label.
SA-Co/Gold comprises 7 subsets, each targeting a different annotation domain. For each subset, the annotations are multi-reviewed and agreed upon by 3 human annotators, resulting in a high-quality benchmark.
This Project allows you to explore Attributes Merged C Release Test, which is one of three annotations for the Attributes subset. You can see the other two at Attributes Merged A Release Test and Attributes Merged B Release Test.
The SA-Co/Gold test data is available in its canonical, eval-ready form below.
Download SA-1B images: https://sa-co.roboflow.com/gold/sa1b-images.zip
Download MetaCLIP images: https://sa-co.roboflow.com/gold/metaclip-images.zip
Download ground truth annotations: https://sa-co.roboflow.com/gold/gt-annotations.zip
Download the full bundle: https://sa-co.roboflow.com/gold/all.zip
The SA-Co/Gold dataset covers 2 image sources and 7 annotation domains. The image sources are: MetaCLIP and SA-1B. The annotation domains are: MetaCLIP captioner NPs, SA-1B captioner NPs, Attributes, Crowded Scenes, Wiki-Common1K, Wiki-Food/Drink, Wiki-Sports Equipment.
Explore all: SA-Co/Gold on Roboflow Universe
Read Meta's data license for SA-Co/Gold: License

The primary objectives for the initial treatment period of this study are to further evaluate the safety of natalizumab monotherapy by evaluating the risk of hypersensitivity reactions and immunogenicity following re-exposure to natalizumab and confirming the safety of switching from interferon (IFN), glatiramer acetate, or other multiple sclerosis (MS) therapies to natalizumab. The primary objective for the long-term treatment period of this study is to evaluate the long-term impact of natalizumab monotherapy on the progression of disability measured by Expanded Disability Status Scale (EDSS) changes over time.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SA-Co/Gold is a benchmark for promptable concept segmentation (PCS) in images developed by Meta for the Segment Anything 3 model (SAM 3). The benchmark contains images paired with text labels (also referred to as noun phrases, or NPs), each annotated exhaustively with masks on all object instances that match the label.
SA-Co/Gold comprises 7 subsets, each targeting a different annotation domain. For each subset, the annotations are multi-reviewed and agreed upon by 3 human annotators, resulting in a high-quality benchmark.
This Project allows you to explore SA-1B Merged C Release Test, which is one of three annotations for the SA-1B captioner NPs subset. You can see the other two at SA-1B Merged A Release Test and SA-1B Merged B Release Test.
The SA-Co/Gold test data is available in its canonical, eval-ready form below.
Download SA-1B images: https://sa-co.roboflow.com/gold/sa1b-images.zip
Download MetaCLIP images: https://sa-co.roboflow.com/gold/metaclip-images.zip
Download ground truth annotations: https://sa-co.roboflow.com/gold/gt-annotations.zip
Download the full bundle: https://sa-co.roboflow.com/gold/all.zip
The SA-Co/Gold dataset covers 2 image sources and 7 annotation domains. The image sources are: MetaCLIP and SA-1B. The annotation domains are: MetaCLIP captioner NPs, SA-1B captioner NPs, Attributes, Crowded Scenes, Wiki-Common1K, Wiki-Food/Drink, Wiki-Sports Equipment.
Explore all: SA-Co/Gold on Roboflow Universe
Read Meta's data license for SA-Co/Gold: License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SA-Co/Gold is a benchmark for promptable concept segmentation (PCS) in images developed by Meta for the Segment Anything 3 model (SAM 3). The benchmark contains images paired with text labels (also referred to as noun phrases, or NPs), each annotated exhaustively with masks on all object instances that match the label.
SA-Co/Gold comprises 7 subsets, each targeting a different annotation domain. For each subset, the annotations are multi-reviewed and agreed upon by 3 human annotators, resulting in a high-quality benchmark.
This Project allows you to explore FG Food Merged C Release Test, which is one of three annotations for the Wiki-Food/Drink subset. You can see the other two at FG Food Merged A Release Test and FG Food Merged B Release Test.
The SA-Co/Gold test data is available in its canonical, eval-ready form below.
Download SA-1B images: https://sa-co.roboflow.com/gold/sa1b-images.zip
Download MetaCLIP images: https://sa-co.roboflow.com/gold/metaclip-images.zip
Download ground truth annotations: https://sa-co.roboflow.com/gold/gt-annotations.zip
Download the full bundle: https://sa-co.roboflow.com/gold/all.zip
The SA-Co/Gold dataset covers 2 image sources and 7 annotation domains. The image sources are: MetaCLIP and SA-1B. The annotation domains are: MetaCLIP captioner NPs, SA-1B captioner NPs, Attributes, Crowded Scenes, Wiki-Common1K, Wiki-Food/Drink, Wiki-Sports Equipment.
Explore all: SA-Co/Gold on Roboflow Universe
Read Meta's data license for SA-Co/Gold: License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SA-Co/Gold is a benchmark for promptable concept segmentation (PCS) in images developed by Meta for the Segment Anything 3 model (SAM 3). The benchmark contains images paired with text labels (also referred to as noun phrases, or NPs), each annotated exhaustively with masks on all object instances that match the label.
SA-Co/Gold comprises 7 subsets, each targeting a different annotation domain. For each subset, the annotations are multi-reviewed and agreed upon by 3 human annotators, resulting in a high-quality benchmark.
This Project allows you to explore FG Sports Equipment Merged C Release Test, which is one of three annotations for the Wiki-Sports Equipment subset. You can see the other two at FG Sports Equipment Merged A Release Test and FG Sports Equipment Merged B Release Test.
The SA-Co/Gold test data is available in its canonical, eval-ready form below.
Download SA-1B images: https://sa-co.roboflow.com/gold/sa1b-images.zip
Download MetaCLIP images: https://sa-co.roboflow.com/gold/metaclip-images.zip
Download ground truth annotations: https://sa-co.roboflow.com/gold/gt-annotations.zip
Download the full bundle: https://sa-co.roboflow.com/gold/all.zip
The SA-Co/Gold dataset covers 2 image sources and 7 annotation domains. The image sources are: MetaCLIP and SA-1B. The annotation domains are: MetaCLIP captioner NPs, SA-1B captioner NPs, Attributes, Crowded Scenes, Wiki-Common1K, Wiki-Food/Drink, Wiki-Sports Equipment.
Explore all: SA-Co/Gold on Roboflow Universe
Read Meta's data license for SA-Co/Gold: License

Introduction: The notion of physical literacy is gaining interest in several countries as a potential mechanism for understanding the development of the physical self. This study is a first attempt to translate the Australian Physical Literacy Questionnaire for Children (PL-C Quest) into Chinese and to evaluate the reliability and validity of the Chinese version for assessing physical literacy among children in mainland China.
Methods: The Beaton translation paradigm was used to carry out language translation, back-translation, cultural adaptation, and presurveys. Data were collected from 642 children aged 6–12 years, with a mean age of 9.71 years (SD 1.816), to test the reliability of the Chinese version of the PL-C Quest.
Results: The PL-C Quest items translated well (6.187 ~ 15.499) and correlated well (0.441 ~ 0.622). The Chinese version of the PL-C Quest had good reliability, with retest reliability values ranging from 0.74 to 0.91, Cronbach’s alpha from 0.65 to 0.894, and a McDonald’s ω from the Spearman-Brown coefficient of 0.84. The validity results are acceptable: the CFI, IFI, and TLI values are above 0.8 and close to 0.9, the model fit’s chi-square to degrees-of-freedom ratio is 2.299, and the RMSEA is 0.05, which is less than 0.08.
Discussion: After translation and cultural adaptation, the Chinese version of the PL-C Quest is a reliable measurement tool and can be used in the Chinese region.

The primary objectives of this study are to further evaluate the safety of natalizumab (Tysabri®) monotherapy by evaluating the risk of hypersensitivity and immunogenicity following re-exposure to natalizumab, and to confirm the safety of switching to natalizumab from interferon beta (IFN-β), glatiramer acetate (GA), or other multiple sclerosis (MS) therapies.

Background: We aimed to suggest muscle mass-based criteria for use of the cystatin C test for accurate estimation of the glomerular filtration rate (eGFR).
Materials and methods: We recruited 138 Korean subjects and compared eGFRcr (derived from the Chronic Kidney Disease Epidemiology Collaboration (CKD-EPI) equation based on creatinine) with eGFRcys based on cystatin C as the reference value. The skeletal muscle mass index (SMI) by bioelectrical impedance analysis (BIA) was used as a representative measure of muscle mass. Calf circumference (CC) was also evaluated. We defined cases with eGFRcr ≥ 60 mL/min/1.73 m² but eGFRcys < 60 mL/min/1.73 m² as detection of hidden renal impairment (DHRI). Cut-off values were determined based on muscle mass for the cases of DHRI, suggesting criteria for the cystatin C test in renal function evaluation.
Results: We confirmed a significant negative correlation between the %difference of eGFRcr from eGFRcys and SMI (r, −0.592 for male, −0.484 for female) or CC (r, −0.646 for male, −0.351 for female). SMI values of 7.3 kg/m² for males and 5.7 kg/m² for females were suggested as significant cutoffs for indication of the cystatin C test. We also suggested that CC would be valuable for cystatin C indication.
Conclusion: We suggested muscle mass-based objective criteria relating to SMI and CC that would indicate the use of cystatin C to evaluate renal function in sarcopenic cases. Our results highlight the importance of muscle mass-based selection of renal function tests.
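The DHRI definition above can be written as a simple rule. The sketch below is illustrative only: the function names are hypothetical, and the percent-difference formula (difference relative to eGFRcys, the reference value) is an assumption, since the abstract does not spell it out.

```python
def percent_difference(egfr_cr: float, egfr_cys: float) -> float:
    """Percent difference of eGFRcr from eGFRcys.

    Assumed formula: (eGFRcr - eGFRcys) / eGFRcys * 100, taking eGFRcys
    as the reference value; the abstract does not state the exact form.
    """
    return (egfr_cr - egfr_cys) / egfr_cys * 100.0


def is_dhri(egfr_cr: float, egfr_cys: float, threshold: float = 60.0) -> bool:
    """Detection of hidden renal impairment (DHRI).

    True when the creatinine-based estimate is at or above the threshold
    (mL/min/1.73 m^2) while the cystatin C-based estimate is below it.
    """
    return egfr_cr >= threshold and egfr_cys < threshold
```

For instance, a subject with eGFRcr of 75 and eGFRcys of 55 would be flagged, while one with both values above 60 would not.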