As of March 2024, OpenAI o1 was the large language model (LLM) with the best benchmark score for solving math problems, at **** percent. Close behind, in second place, was OpenAI o1-mini, followed by GPT-4o.
Australian and New Zealand Journal of Statistics Impact Factor 2024-2025 - ResearchHelpDesk - The Australian & New Zealand Journal of Statistics is an international journal managed jointly by the Statistical Society of Australia and the New Zealand Statistical Association. Its purpose is to report significant and novel contributions in statistics, ranging across articles on statistical theory, methodology, applications and computing. The journal has a particular focus on statistical techniques that can be readily applied to real-world problems, and on application papers with an Australasian emphasis. Outstanding articles submitted to the journal may be selected as Discussion Papers, to be read at a meeting of either the Statistical Society of Australia or the New Zealand Statistical Association.

The main body of the journal is divided into three sections. The Theory and Methods Section publishes papers containing original contributions to the theory and methodology of statistics, econometrics and probability, and seeks papers motivated by a real problem that demonstrate the proposed theory or methodology in that situation. There is a strong preference for papers motivated by, and illustrated with, real data. The Applications Section publishes papers demonstrating applications of statistical techniques to problems faced by users of statistics in the sciences, government and industry. A particular focus is the application of newly developed statistical methodology to real data and the demonstration of better use of established statistical methodology in an area of application. It seeks to aid teachers of statistics by placing statistical methods in context. The Statistical Computing Section publishes papers containing new algorithms, code snippets, or software descriptions (for open-source software only) which enhance the field through the application of computing. Preference is given to papers featuring publicly available code and/or data, and to those motivated by statistical methods for practical problems. In addition, suitable review papers and articles of historical and general interest will be considered. The journal also publishes book reviews on a regular basis.

Abstracting and Indexing Information: Academic Search (EBSCO Publishing); Academic Search Alumni Edition (EBSCO Publishing); Academic Search Elite (EBSCO Publishing); Academic Search Premier (EBSCO Publishing); CompuMath Citation Index (Clarivate Analytics); Current Index to Statistics (ASA/IMS); Journal Citation Reports/Science Edition (Clarivate Analytics); Mathematical Reviews/MathSciNet/Current Mathematical Publications (AMS); RePEc: Research Papers in Economics; Science Citation Index Expanded (Clarivate Analytics); SCOPUS (Elsevier); Statistical Theory & Method Abstracts (Zentralblatt MATH); ZBMATH (Zentralblatt MATH)
https://paper.erudition.co.in/terms
Get Exam Question Paper Solutions of Mathematical Statistics and many more.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Suppose we observe a random vector X from some distribution in a known family with unknown parameters. We ask the following question: when is it possible to split X into two pieces f(X) and g(X) such that neither part is sufficient to reconstruct X by itself, but both together can recover X fully, and their joint distribution is tractable? One common solution to this problem when multiple samples of X are observed is data splitting, but Rasines and Young offer an alternative approach that uses additive Gaussian noise—this enables post-selection inference in finite samples for Gaussian distributed data and asymptotically when errors are non-Gaussian. In this article, we offer a more general methodology for achieving such a split in finite samples by borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting. We call our method data fission, as an alternative to data splitting, data carving and p-value masking. We exemplify the method on several prototypical applications, such as post-selection inference for trend filtering and other regression problems, and effect size estimation after interactive multiple testing. Supplementary materials for this article are available online.
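To make the Gaussian case concrete, here is a minimal sketch (illustrative parameters, not the authors' code): for X ~ N(mu, sigma^2) and independent noise Z ~ N(0, sigma^2), the pieces f(X) = X + Z and g(X) = X - Z are independent, yet together they recover X exactly.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gaussian data-fission sketch (illustrative parameters): with
# X ~ N(mu, sigma^2) and independent Z ~ N(0, sigma^2),
# f(X) = X + Z and g(X) = X - Z are independent pieces of X,
# and X = (f(X) + g(X)) / 2 reconstructs X exactly.
mu, sigma, n = 1.0, 2.0, 100_000
X = rng.normal(mu, sigma, n)
Z = rng.normal(0.0, sigma, n)
fX, gX = X + Z, X - Z

assert np.allclose((fX + gX) / 2, X)   # full reconstruction
print(np.corrcoef(fX, gX)[0, 1])       # near 0: the two pieces are independent
```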
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Experiment 1: The main effect of distance, of RC Type, and their interaction on question-response accuracy; and the effect of distance within subject and object relatives.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These datasets relate to the computational study presented in the paper The Berth Allocation Problem with Channel Restrictions, authored by Paul Corry and Christian Bierwirth. They consist of all the randomly generated problem instances along with the computational results presented in the paper.
Results across all problem instances assume ship separation parameters of [delta_1, delta_2, delta_3] = [0.25, 0, 0.5].
Excel Workbook Organisation:
The data is organised into separate Excel files for each table in the paper, as indicated by the file description. Within each file, each row of data presented (aggregating 10 replications) in the corresponding table is captured in two worksheets, one with the problem instance data, and the other with generated solution data obtained from several solution methods (described in the paper). For example, row 3 of Tab. 2 will have data for 10 problem instances on worksheet T2R3, and corresponding solution data on T2R3X.
Problem Instance Data Format:
On each problem instance worksheet (e.g. T2R3), each row of data corresponds to a different problem instance, and there are 10 replications on each worksheet.
The first column provides a replication identifier which is referenced on the corresponding solution worksheet (e.g. T2R3X).
Following this, there are n*(2c+1) columns (n = number of ships, c = number of channel segments) with headers p(i)_(j).(k), where i references the operation (channel transit/berth visit) id, j references the ship id, and k references the index of the operation within the ship. All indexing starts at 0. These columns define the transit or dwell times on each segment. A value of -1 indicates a segment on which a berth allocation must be applied, and hence the dwell time is unknown.
There are then a further n columns with headers r(j), defining the release times of each ship.
For ChSP problems, there are a final n columns with headers b(j), defining the berth to be visited by each ship. ChSP problems with fixed berth sequencing enforced have an additional n columns with headers toa(j), indicating the order in which ship j sits within its berth sequence. For BAP-CR problems, these columns are not present, but are replaced by n*m columns (m = number of berths) with headers p(j).(b), defining the berth processing time of ship j if allocated to berth b.
Solution Data Format:
Each row of data corresponds to a different solution.
Column A references the replication identifier (from the corresponding instance worksheet) that the solution refers to.
Column B defines the algorithm that was used to generate the solution.
Column C shows the objective function value (total waiting and excess handling time) obtained.
Column D shows the CPU time consumed in generating the solution, rounded to the nearest second.
Column E shows the optimality gap as a proportion. A value of -1 or an empty value indicates that the optimality gap is unknown.
From column F onwards, there are n*(2c+1) columns with the previously described p(i)_(j).(k) headers. The values in these columns define the entry times at each segment.
For BAP-CR problems only, following this there are a further 2n columns. For each ship j, there will be columns titled b(j) and p.b(j) defining the berth that was allocated to ship j, and the processing time on that berth respectively.
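As a convenience, the following sketch shows one way to pair an instance worksheet with its solution worksheet using pandas, following the worksheet layout described above. The file name "Table2.xlsx" and the sheet names are assumptions taken from the naming convention; adjust them to the actual workbook.

```python
import pandas as pd

# Hypothetical sketch: file name and sheet names assumed from the
# T2R3 / T2R3X convention described above.
book = pd.read_excel("Table2.xlsx", sheet_name=None)  # dict of all worksheets

instances = book["T2R3"]    # 10 problem instances, one per row
solutions = book["T2R3X"]   # solutions referencing those instances

# Join each solution to its instance via the replication identifier,
# which is the first column of both worksheets.
merged = solutions.merge(
    instances,
    left_on=solutions.columns[0],
    right_on=instances.columns[0],
    suffixes=("_sol", "_inst"),
)
print(merged.head())
```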
https://www.enterpriseappstoday.com/privacy-policy
Google Gemini Statistics: In 2023, Google unveiled its most powerful AI model to date. Google Gemini is billed as the world's most advanced AI, leaving ChatGPT-4 behind. Google offers three different model sizes, each suited to tasks of different complexity. According to Google Gemini Statistics, these models can understand and solve complex problems related to almost anything. Google has even said it will develop AI in such a way that it shows how helpful AI is in our daily routine. Well, we hope our next generation won't be fully dependent on such technologies; otherwise, we will lose all of our natural talent!

Editor's Choice

- Google Gemini can follow natural and engaging conversations.
- According to Google Gemini Statistics, Gemini Ultra scores 90.0% on the MMLU benchmark, which tests knowledge and problem-solving in subjects including history, physics, math, law, ethics, and medicine.
- If you ask Gemini what to do with your raw material, it can provide ideas in the form of text or images according to the given input.
- Gemini has outperformed ChatGPT-4 in the majority of test cases.
- According to the report, this LLM is said to be unique because it can process multiple types of data at the same time, including video, images, computer code, and text.
- Google regards this development as "The Gemini Era", signalling how significant AI is in improving our daily lives.
- Google Gemini can talk like a real person.
- Gemini Ultra is the largest model and can solve extremely complex problems.
- Gemini models are trained on multilingual and multimodal datasets.
- On the MMMU benchmark, Gemini Ultra outperformed GPT-4V with the following results: Art and Design (74.2), Business (62.7), Health and Medicine (71.3), Humanities and Social Science (78.3), and Technology and Engineering (53.0).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains research output from studying the Transit Network Design Problem (TNDP). At a high level, the dataset includes: a novel transit network based on the Brisbane transport infrastructure, and results from the testing of new methods on the Brisbane network and existing benchmark networks (Mandl and Mumford).
This dataset contains four subsets of data related to Joshua Rosentreter's PhD thesis. These are outlined below:
Transit Network Dataset: A novel transit network for researchers to use when addressing the Transit Network Planning Problem. The network is based on the Brisbane City transportation infrastructure.
MIP Model for TNDFSP: Evaluations of existing solutions to the TNDP and TNDFSP using a variety of existing methods and a proposed mixed integer programming (MIP) model.
Meta-Heuristic Method for TNDFSP: Results from a novel (adapted from existing) method designed to target the hub-and-spoke style structure of the demand within a metropolitan-city-based network.
Hybrid Method for TNDFSP: Results from a novel method created through the hybridisation of the MIP model and meta-heuristic method.
Further descriptions of the data are contained in the subfolders within.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a data repository for the paper "Binary classification as a phase separation process", by Rafael Monteiro.
For a description of what the repository contains, see the guide README.pdf on the GitHub page Binary_Classification_Phase_Separation, where a script that downloads (and organizes) all this data is also available (download_PSBC.sh).
I did not include a copy of the train-test set (the 0-1 subset of the MNIST database) in every folder with simulations. But you can find a copy of the normalized dataset in the tarball "PSBC_Examples.tar.gz" as
data_test_normalized_MNIST.csv and data_train_normalized_MNIST.csv.
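A minimal loading sketch, assuming the two CSVs have been extracted from the tarball into the working directory (adjust the paths if they sit inside a subfolder):

```python
import pandas as pd

# Assumed: CSVs extracted from PSBC_Examples.tar.gz into the current
# directory; file names taken from the description above.
train = pd.read_csv("data_train_normalized_MNIST.csv")
test = pd.read_csv("data_test_normalized_MNIST.csv")
print(train.shape, test.shape)
```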
https://paper.erudition.co.in/terms
Question Paper Solutions of chapter Applied Statistics of Mathematics - II A, 2nd Semester, Computer Science and Engineering
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mathematics is strongly connected to gambling through the mathematical models underlying any game of chance. Mathematics is reflected not only in games' design/characteristics and their outcomes, but also in gamblers' perception and knowledge of the mathematics-related facts of gambling, which influence their gambling behavior. The math-indispensability principle (Bărboianu, 2013) applies not only in problem-gambling research, but also in the gambling industry. The structural, informative, strategic, psychological, pathological, and ethical aspects of gambling have been identified as being grounded in the mathematics of games and gambling (Griffiths, 1993; Bărboianu, 2014, 2015; Turner & Hobay, 2004; Harrigan, 2009, and others).

In this theoretical framework, research is able to derive concrete norms and criteria to adequately reflect the mathematical dimension of gambling in the communication and texts associated with the gambling industry. These norms and criteria of adequacy will be further communicated to policy and decision makers in both governmental and private sectors, with the recommendation for implementation.

Our study aims to evaluate qualitatively the reflection of the mathematical dimension of gambling in the content of gambling websites. This analysis is necessary in order to have an objective and concrete image of the actual state of this matter in the online industry, and of the challenges that such research and application would face in the real world of gambling. A minimum of 600 gambling websites will be reviewed annually for their content in that respect. A statistical analysis will record the presence of the mathematical dimension of gambling and its forms in the content of participating websites, and a qualitative analysis will assess the quality of the content with respect to that dimension.
https://paper.erudition.co.in/terms
Question Paper Solutions of chapter Applied Statistics of Mathematics III, 3rd Semester, Mechanical Engineering
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bayesian reasoning - the optimal process of updating a hypothesis or belief with new information - is a critical aspect of both everyday decision-making and statistics education, but strategies for effectively teaching the topic in the classroom remain elusive. This study leverages the findings of prior research on facilitating Bayesian reasoning by utilizing a visualization, called the bar display, as a method for teaching Bayes' theorem and its underlying probability concepts. Data were collected from a college-level statistics-in-psychology course, wherein students were taught and tested on Bayesian reasoning either with or without the bar display. In addition to testing the immediate efficacy of the bar display, data were also collected to test long-term retention and the potential differential benefits for low numeracy and high anxiety students. Results indicated engagement with the bar display as a method for visually approximating answers to Bayesian questions, with students trained with the bar display providing more accurate answers to Bayesian reasoning questions before training and at long-term assessment. Additionally, students with self-reported low numeracy and high math anxiety performed better on Bayesian reasoning questions when learning with the bar display. Recommendations for future implementations are discussed.
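For context, this is the kind of Bayesian updating question such training targets, as a small worked sketch (the numbers are illustrative, not taken from the study):

```python
# Worked Bayes' theorem example (illustrative numbers): 1% of people
# have a condition; a test detects it 90% of the time and false-alarms
# 5% of the time. Given a positive test, what is the probability of
# actually having the condition?
prior = 0.01          # P(condition)
sensitivity = 0.90    # P(positive | condition)
false_alarm = 0.05    # P(positive | no condition)

p_positive = prior * sensitivity + (1 - prior) * false_alarm
posterior = prior * sensitivity / p_positive
print(f"P(condition | positive) = {posterior:.3f}")  # ~0.154, far below 0.90
```

The counterintuitively low posterior (about 15%) is exactly the base-rate effect that visualizations like the bar display aim to make visible.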
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
AceMath RM Training Data Card
We release the AceMath RM training data that is used to train AceMath-7/72B-RM for math outcome reward modeling. Below are the data statistics:
Number of unique math questions: 356,058
Number of examples: 2,136,348 (each question has 6 different responses)
Benchmark Results (AceMath-Instruct + AceMath-72B-RM)
We compare AceMath to leading proprietary and open-access math models in the table above. Our… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/AceMath-RM-Training-Data.
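A minimal loading sketch, assuming the dataset id from the URL above; the split name "train" and the record schema are assumptions, so check the dataset page:

```python
from datasets import load_dataset

# Assumed: dataset id taken from the URL above; split name and
# field names are not confirmed by this card.
ds = load_dataset("nvidia/AceMath-RM-Training-Data", split="train")
print(ds)      # features and number of rows
print(ds[0])   # one question/response/label record, schema permitting
```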
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
All population characteristics in the table were identical for the synthetic microdata and the American Community Survey data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Topic: generating uniform random samples from the set of all integer partitions for a given total N and a number of parts S.

Problem: current random integer-partitioning functions in mathematical software can take a long time to generate a single partition for a given N (regardless of S), and an untenable amount of time generating partitions of N with S parts. Currently, no function among mathematical software packages, the peer-reviewed literature, stackexchange, and arxiv.org generates random partitions with respect to both N and S. Consequently, if one is interested in generating random integer partitions of N having S parts, then one must usually waste time generating random partitions of N and rejecting those not matching S.

Note! I have since solved this problem definitively. The question is asked and the solution is presented here: stackoverflow.com/questions/10287021/an-algorithm-for-randomly-generating-integer-partitions-of-a-particular-length/12742508#12742508 I've recently published a preprint of a manuscript on figshare outlining a simple and unbiased solution to this question: figshare.com/articles/Random_integer_partitions_with_restricted_numbers_of_parts/156290

However, below is an approach I tried and was unable to eliminate sampling bias from while keeping reasonable speed. I guess this page is more useful as a good way to NOT go about getting random integer partitions for N and S.

Deprecated alternative approach (often biased and slower than the above solution): generate a single random partition of N and randomly manipulate it until its number of parts equals S. Why? Because randomly perturbing a partition of N until it satisfies S can be faster than generating random partitions based solely on N and rejecting those without S parts.

Contents (results of deprecated algorithm): visual comparisons of 500 random samples generated from the new function derived by myself (red curves) against 500 random samples generated using the random partition function found in the Sage mathematical environment (black curves). Kernel density curves (red ones and black ones) are for statistical evenness across the partition. Statistical evenness is a standardized log-transform of the variance. Kernel density curves that overlap nearly completely reveal that the random samples of partitions generated by the two approaches share a similar structure. Evenness is estimated using Evar, a transform of the variance of log summand values. Evar is standardized to take values between 0.0 (no evenness) and 1.0 (perfect evenness). Close agreement between the random-manipulation approach and the Sage function (very high rejection rates, as most partitions of N don't match S) was also found using other statistical characteristics (e.g. median summand, relative size of largest summand). These results reveal that the statistical quality of evenness (a transform of the variance) is in high agreement between the two approaches (Sage's function and the potential alternative of randomly manipulating integer partitions using conjugates).

Note: I have found biases in skewness and the median summand value with this type of method (randomly manipulating an integer partition to arrive at a uniform random sample based on N and S), and would not recommend this approach.
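For illustration, a minimal sketch of the deprecated perturbation idea described above: merge or split parts at random until the partition has S parts. The helper name is hypothetical, and, as the text itself notes, this sampler is biased; it is shown only as the approach to avoid.

```python
import random

def perturb_to_s_parts(partition, s):
    """Randomly merge or split parts of a partition of N until it has
    exactly s parts. Mirrors the deprecated approach described above;
    biased, shown only for illustration. Requires 1 <= s <= N."""
    p = list(partition)
    while len(p) != s:
        if len(p) > s:
            # too many parts: merge two randomly chosen parts
            i, j = sorted(random.sample(range(len(p)), 2))
            p[i] += p[j]
            del p[j]
        else:
            # too few parts: split a randomly chosen part > 1 in two
            k = random.choice([idx for idx, x in enumerate(p) if x > 1])
            cut = random.randint(1, p[k] - 1)
            p[k] -= cut
            p.append(cut)
    return sorted(p, reverse=True)

print(perturb_to_s_parts([4, 3, 2, 1], s=2))  # e.g. [7, 3]: a partition of 10 with 2 parts
```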
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
LogicJa Dataset Card
Overview
LogicJa is a multi-turn benchmark designed to assess the reasoning capabilities of Japanese language models across multiple domains. This dataset consists of 105 multi-turn tasks (each containing two questions) for a total of 210 questions. Each category has 30 questions to ensure statistical significance.
| Category | Reasoning | Math | Writing | Coding | Understanding | Grammar | Culture | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Multi-turn Tasks | 15 | 15 | 15 | 15 | 15 | 15 | 15 | 105 |

See the full description on the dataset page: https://huggingface.co/datasets/sionic-ai/LogicJa.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of positive and negative examples for each functional site.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rubric Questions.