2 datasets found
  1. Arithmetic Word Problem Compendium

    • opendatabay.com
    Updated Jun 18, 2025
    Cite
    Cephalopod Studio (2025). Arithmetic Word Problem Compendium [Dataset]. https://www.opendatabay.com/data/ai-ml/b3f879dd-c434-4df5-bb6d-430513edf930
    Dataset updated
    Jun 18, 2025
    Dataset authored and provided by
    Cephalopod Studio
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Machine Learning and AI
    Description

    Arithmetic Word Problem Compendium Dataset (AWPCD)

    Dataset Description

    The dataset is a comprehensive collection of mathematical word problems spanning multiple domains, with rich metadata and natural language variations. Each problem contains 1 to 5 steps of mathematical operations and is specifically designed to encourage showing work and maintaining appropriate decimal precision throughout the calculation.

    All of the problems are new and free from copyright restrictions.

    The available data has 100,000 problems. To license the templating system that created the data, whether for orders of magnitude more data or for customizations such as the number of mathematical steps involved or the addition of domains, contact hello@cephalopod.studio.

    Intended Uses & Limitations

    Intended Uses: The data can be used in 4 areas:
    1) Pretraining
    2) Instruction tuning
    3) Finetuning
    4) Benchmarking existing models

    All those areas are in service of:
    - Training mathematical reasoning systems
    - Developing step-by-step problem-solving capabilities
    - Testing arithmetic operations across diverse real-world contexts
    - Evaluating precision in decimal calculations
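For the benchmarking use, one possible scoring approach is to pull the last number out of a model's free-text answer and compare it with the record's numeric solution. The sketch below is an illustration only, not the authors' evaluation method: the function names `extract_final_number` and `is_correct` and the regular expression are assumptions, and the "last number wins" rule is a heuristic that can misfire on text such as "5-3".

```python
import re
from decimal import Decimal
from typing import Optional, Union

# Heuristic pattern (an assumption): optional minus sign, digits with optional
# thousands separators, and an optional decimal part.
NUMBER_PATTERN = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[Decimal]:
    """Return the last number that appears in the text, or None if there is none."""
    matches = NUMBER_PATTERN.findall(text)
    if not matches:
        return None
    # Drop thousands separators such as the comma in "44,885".
    return Decimal(matches[-1].replace(",", ""))


def is_correct(model_output: str, solution: Union[int, float]) -> bool:
    """Compare the last number in the model output with the reference solution."""
    predicted = extract_final_number(model_output)
    # Going through str() avoids binary floating-point noise in the reference.
    return predicted is not None and predicted == Decimal(str(solution))


# The answer sentences below are taken from the sample answers on this page.
print(is_correct("Based on these steps, the answer is 42.", 42))
print(is_correct("Thus, we arrive at the answer: 44885.0.", 44885))
print(is_correct("This means the final result is 0.14.", 0.14))
```

Comparing as `Decimal` values means that `44885.0` and `44885` count as equal, which matches how the sample answers mix the two forms.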

    Limitations:
    - Currently English-only
    - Limited to specific mathematical operations
    - Template-based generation may introduce structural patterns
    - Focused on arithmetic operations with up to 5 numbers

    Dataset Description

    Dataset Summary

    The dataset contains 100,000 problems in total.

    Problems span multiple domains including:
    - Agriculture (soil temperature changes, etc.)
    - Athletics (training hours, distances, etc.)
    - Construction (elevation changes, work hours, etc.)
    - Culinary (cooking temperature changes, calories per serving, etc.)
    - Education (GPA changes, etc.)
    - Entertainment (show ratings, stage lighting, etc.)
    - Finance (stock prices, account balances, etc.)

    Data Format

    Each example is provided in JSONL format with the following structure:

        {
          "id": "problem_X",
          "question": "Text of the math problem",
          "metadata": {
            "discrete": boolean,
            "domain": string,
            "numbers": number[],
            "object_type": string,
            "solution": number,
            "operators": string[],
            "decimals": number
          },
          "answer": "Text of the step-by-step solution to the problem"
        }
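As a rough illustration of reading this format, the sketch below parses JSONL text line by line and checks that each record carries the keys listed above. The helper name `parse_jsonl` and the metadata values in the sample record (domain string, operator names, and so on) are assumptions made up for the example; only the key names, the question text, and the answer text come from this page.

```python
import json
from typing import Any, Dict, Iterable, List

# Key names as listed in the structure above.
TOP_LEVEL_KEYS = {"id", "question", "metadata", "answer"}
METADATA_KEYS = {
    "discrete", "domain", "numbers", "object_type",
    "solution", "operators", "decimals",
}


def parse_jsonl(lines: Iterable[str]) -> List[Dict[str, Any]]:
    """Parse JSONL (one JSON object per line), skipping blank lines."""
    records = []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        missing = TOP_LEVEL_KEYS - record.keys()
        missing |= METADATA_KEYS - record.get("metadata", {}).keys()
        if missing:
            raise ValueError(f"line {line_number}: missing keys {sorted(missing)}")
        records.append(record)
    return records


# Hypothetical record: the metadata values are illustrative guesses, not real data.
sample_line = json.dumps({
    "id": "problem_X",
    "question": "Jack sets up 19 bank accounts for clients. First the total "
                "rises to be 2 times greater than before. Following that, "
                "another 4 accounts were added.",
    "metadata": {
        "discrete": True,
        "domain": "finance",
        "numbers": [19, 2, 4],
        "object_type": "accounts",
        "solution": 42,
        "operators": ["multiply", "add"],
        "decimals": 0,
    },
    "answer": "Based on these steps, the answer is 42.",
})

records = parse_jsonl([sample_line, ""])
print(records[0]["metadata"]["solution"])
```

With a downloaded file, the same function can be given an open file handle, since iterating over a text file yields one line at a time.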

    Sample Data

    1. Finance (Account Management):
    "Jack sets up 19 bank accounts for clients. First the total rises to be 2 times greater than before. Following that, another 4 accounts were added."
    
    2. Agriculture (Grain Storage):
    "Kevin oversees 14,457 metric tons of grain storage in the new concrete silo. In the beginning, the facility grows to 3 times its current measure of grain. Following that, the overall supply of grain grows by 1,514 tons. Then, Kevin divides the holdings evenly by 1 and continues operations with a single unit."
    
    3. Temperature Monitoring:
    "The ground temperature measures 5.48 degrees Celsius. First, the temperature values are adjusted to be 1/3.74 the value they were before. Next, a sensor calibration adjustment multiplies all readings by 2.07, and later the measurements need to be scaled down by 1/3.94 due to sensor calibration. Later, the measurements need to be scaled down by 1/2.54 due to sensor calibration, and after that the measurements need to be scaled down by 1/2.21 due to sensor calibration. What is the final soil temperature in degrees Celsius? Round your answer and any steps to 2 decimal places."
    

    Answer Examples

    1. Finance (Account Management):
    "Here's how we can solve this problem:
    "19 accounts times 2 equals 38
    "Addition step: 38 + 4 = 42 accounts
    
    "Based on these steps, the answer is 42."
    
    2. Agriculture (Grain Storage):
    "Following these steps will give us the answer:
    "Multiplication operation: 14,457 tons * 3 = 43,371
    "Add 1514 to 43,371 tons: 44,885
    "x) 44,885 x 1 = 44,885 tons
    
    "Thus, we arrive at the answer: 44885.0."
    
    3. Temperature Monitoring:
    "We can find the answer like this:
    "Division: 5.48 degrees ÷ 3.74 = (Note: rounding to 2 decimal places) about 1.47
    "Multiplication operation: 1.47 degrees * 2.07 = (Note: rounding to 2 decimal places) approximately 3.04
    "3.04 degrees ÷ 3.94 (Note: rounding to 2 decimal places) approximately 0.77
    "0.77 degrees ÷ 2.54 (Note: rounding to 2 decimal places) approximately 0.30
    "When 0.30 degrees are divided by 2.21, the result is (Note: rounding to 2 decimal places) about 0.14
    
    "This means the final result is 0.14."
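The answer examples round after every step rather than only at the end, which can change the final digits. A minimal sketch of replaying that behavior with Python's `decimal` module follows. The `replay` function and its `(operator, operand)` step encoding are hypothetical, and round-half-up is an assumption, since the page does not say which rounding rule the dataset uses.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import List, Tuple


def replay(start: str, steps: List[Tuple[str, str]], decimals: int) -> Decimal:
    """Apply each step in order, rounding the running value after every step."""
    quantum = Decimal(1).scaleb(-decimals)  # e.g. 0.01 when decimals is 2
    value = Decimal(start)
    for operator, operand in steps:
        number = Decimal(operand)
        if operator == "+":
            value = value + number
        elif operator == "-":
            value = value - number
        elif operator == "*":
            value = value * number
        elif operator == "/":
            value = value / number
        else:
            raise ValueError(f"unsupported operator: {operator}")
        # Assumed rounding rule: round half up to the requested precision.
        value = value.quantize(quantum, rounding=ROUND_HALF_UP)
    return value


# Steps transcribed from the three sample problems above.
accounts = replay("19", [("*", "2"), ("+", "4")], decimals=0)
grain = replay("14457", [("*", "3"), ("+", "1514"), ("/", "1")], decimals=0)
temperature = replay(
    "5.48",
    [("/", "3.74"), ("*", "2.07"), ("/", "3.94"), ("/", "2.54"), ("/", "2.21")],
    decimals=2,
)
print(accounts, grain, temperature)  # 42 44885 0.14
```

Passing operands as strings keeps values such as 3.74 exact, which avoids binary floating-point error creeping into the intermediate roundings.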
    

    Features

    Each problem includes:
    - Unique problem ID
    - Natural language question text
    - Arithmetic operations involving decimals and integers, positive and negative values, and requirements for rounding to a specific number of decimal places
    - Detailed metadata including:
      - Domain classification
      - Object types and units
      - Numerical values used
      - Mathematical operators
      - Solution value
      - Discreteness flag
      - Decimal precision
    - Tailored value ranges

  2. AWPCD

    • huggingface.co
    Updated Jan 29, 2025
    Cite
    Matthew Waller (2025). AWPCD [Dataset]. https://huggingface.co/datasets/HelloCephalopod/AWPCD
    Dataset updated
    Jan 29, 2025
    Authors
    Matthew Waller
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Arithmetic Word Problem Compendium Dataset (AWPCD)

    Dataset Description

    The dataset is a comprehensive collection of mathematical word problems spanning multiple domains, with rich metadata and natural language variations. Each problem contains 1 to 5 steps of mathematical operations and is specifically designed to encourage showing work and maintaining appropriate decimal precision throughout the calculation. The available data is a sample of 1,000 problems, and commercial… See the full description on the dataset page: https://huggingface.co/datasets/HelloCephalopod/AWPCD.
