100+ datasets found
  1. Mathematics Dataset

    • github.com
    • opendatalab.com
    Updated Apr 3, 2019
    Cite
    DeepMind (2019). Mathematics Dataset [Dataset]. https://github.com/Wikidepia/mathematics_dataset_id
    Dataset updated
    Apr 3, 2019
    Dataset provided by
    DeepMind (http://deepmind.com/)
    Description

    This dataset consists of mathematical question and answer pairs, from a range of question types at roughly school-level difficulty. This is designed to test the mathematical learning and algebraic reasoning skills of learning models.

    ## Example questions

     Question: Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.
     Answer: 4
     
     Question: Calculate -841880142.544 + 411127.
     Answer: -841469015.544
     
     Question: Let x(g) = 9*g + 1. Let q(c) = 2*c + 1. Let f(i) = 3*i - 39. Let w(j) = q(x(j)). Calculate f(w(a)).
     Answer: 54*a - 30
    

    It contains 2 million (question, answer) pairs per module, with questions limited to 160 characters in length, and answers to 30 characters in length. Note the training data for each question type is split into "train-easy", "train-medium", and "train-hard". This allows training models via a curriculum. The data can also be mixed together uniformly from these training datasets to obtain the results reported in the paper. Categories:

    • algebra (linear equations, polynomial roots, sequences)
    • arithmetic (pairwise operations and mixed expressions, surds)
    • calculus (differentiation)
    • comparison (closest numbers, pairwise comparisons, sorting)
    • measurement (conversion, working with time)
    • numbers (base conversion, remainders, common divisors and multiples, primality, place value, rounding numbers)
    • polynomials (addition, simplification, composition, evaluating, expansion)
    • probability (sampling without replacement)
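    The released modules are plain-text files in which questions and answers appear on alternating lines (an assumption based on the repository's documented layout), so a loader only needs to pair them up. A minimal sketch:

```python
def load_pairs(lines):
    """Pair up alternating question/answer lines from a module file."""
    it = iter(lines)
    return [(q.strip(), a.strip()) for q, a in zip(it, it)]

# Two of the example questions quoted above, in the assumed file layout.
sample = [
    "Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.\n",
    "4\n",
    "Calculate -841880142.544 + 411127.\n",
    "-841469015.544\n",
]
pairs = load_pairs(sample)
print(pairs[0])  # ('Solve -42*r + 27*c = -1167 and 130*r + 4*c = 372 for r.', '4')
```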
  2. two datasets and the visualization

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Dongsheng Yang (2023). two datasets and the visualization [Dataset]. http://doi.org/10.6084/m9.figshare.13007747.v1
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Dongsheng Yang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    • dataset_a.json: 1000 random numbers over the range 0-100
    • dataset_b.json: new numbers derived from the original 1000 numbers in 1_a.json using the equation y = 3x + 6
    • results.png: generated from these two datasets
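    The described construction can be reproduced in a few lines. This is a sketch under stated assumptions: the file names and the y = 3x + 6 relation come from the description, while uniform sampling and the seed are guesses, not verified against the upload.

```python
import json
import random

random.seed(0)  # arbitrary seed, for reproducibility of the sketch
xs = [random.uniform(0, 100) for _ in range(1000)]  # dataset_a: 1000 random numbers in 0-100
ys = [3 * x + 6 for x in xs]                        # dataset_b: y = 3x + 6

with open("dataset_a.json", "w") as f:
    json.dump(xs, f)
with open("dataset_b.json", "w") as f:
    json.dump(ys, f)
```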

  3. Consecutive Bates Range - Gap Finder

    • kaggle.com
    Updated Sep 15, 2023
    Cite
    Patrick Zelazko (2023). Consecutive Bates Range - Gap Finder [Dataset]. https://www.kaggle.com/datasets/patrickzel/consecutive-bates-range
    Available in Croissant format. Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Sep 15, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Patrick Zelazko
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Here's a sample Production Bates Range for a Gap Analysis exercise via Python. It's a CSV with one column containing a range of numbers following the convention "D0000001, D0000002, .... D0099999."

    This script can be run against a variable/column on a document production index to identify document sequence gaps, which can be helpful to determine missing documents in a set or to diagnose a technical issue during data processing or exchange phases.

    More broadly, this code can be updated to apply over any sequential data range (dates, student ID, serial number, item number, etc.), to show any gaps or available digits.
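    A minimal gap finder along these lines (a hypothetical helper, not the Kaggle script itself): strip the "D" prefix, convert to integers, and report any missing values in the sequence.

```python
def find_gaps(bates):
    """Return the missing integers in a consecutive Bates number range."""
    nums = sorted(int(b.lstrip("D")) for b in bates)
    full = set(range(nums[0], nums[-1] + 1))
    return sorted(full - set(nums))

produced = ["D0000001", "D0000002", "D0000005"]
print(find_gaps(produced))  # [3, 4] -> D0000003 and D0000004 are missing
```

The same pattern applies to any sequential identifier (student IDs, serial numbers) by swapping the prefix-stripping step.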

  4. Prime Number Source Code with Dataset

    • figshare.com
    zip
    Updated Oct 12, 2024
    Cite
    Ayman Mostafa (2024). Prime Number Source Code with Dataset [Dataset]. http://doi.org/10.6084/m9.figshare.27215508.v1
    Available download formats: zip
    Dataset updated
    Oct 12, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ayman Mostafa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper addresses the computational methods and challenges associated with prime number generation, a critical component of encryption algorithms for ensuring data security. Generating primes efficiently is a long-standing challenge in cryptography, number theory, and computer science, driven by the increasing demand for secure communication and data storage and by the need for efficient algorithms to solve complex mathematical problems. We present two novel algorithms for generating prime numbers: one that generates primes up to a given limit and another that generates primes within a specified range. Both are founded on formulas for odd composite numbers, allowing them to achieve substantial performance improvements over existing prime number generation algorithms. Our experimental results show that the proposed algorithms outperform well-established algorithms such as Miller-Rabin, the Sieve of Atkin, the Sieve of Eratosthenes, and the Sieve of Sundaram in mean execution time. More notably, our algorithms can generate primes from range to range with strong performance. This improvement in performance and adaptability can benefit applications that depend on prime numbers, from cryptographic systems to distributed computing, enabling more secure and reliable communication systems, faster computations in number theory, and advanced research in computer science and mathematics.
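    The paper's own algorithms are not reproduced here, but one of the baselines it benchmarks against, the Sieve of Eratosthenes, can serve as a reference implementation of the two tasks described: primes up to a limit, and primes within a range.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross off every multiple of p starting at p*p.
            sieve[p * p :: p] = bytearray(len(sieve[p * p :: p]))
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def primes_in_range(lo, hi):
    """Naive range variant: sieve up to hi, then filter (a segmented
    sieve would avoid the full sieve for large lo)."""
    return [p for p in primes_up_to(hi) if p >= lo]

print(primes_up_to(30))         # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(primes_in_range(20, 40))  # [23, 29, 31, 37]
```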

  5. Large Dataset of Generalization Patterns in the Number Game

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Aug 10, 2018
    Cite
    Eric J. Bigelow; Steven T. Piantadosi (2018). Large Dataset of Generalization Patterns in the Number Game [Dataset]. http://doi.org/10.7910/DVN/A8ZWLF
    Available in Croissant format. Croissant is a format for machine-learning datasets; learn more at mlcommons.org/croissant.
    Dataset updated
    Aug 10, 2018
    Dataset provided by
    Harvard Dataverse
    Authors
    Eric J. Bigelow; Steven T. Piantadosi
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    272,700 two-alternative forced choice responses in a simple numerical task modeled after Tenenbaum (1999, 2000), collected from 606 Amazon Mechanical Turk workers. Subjects were shown sets of 1 to 4 numbers from the range 1 to 100 (e.g. {12, 16}) and asked what other numbers were likely to belong to that set (e.g. 1, 5, 2, 98). Their generalization patterns reflect both rule-like (e.g. “even numbers,” “powers of two”) and distance-based (e.g. numbers near 50) generalization. This data set is available for further analysis of these simple and intuitive inferences, for developing hands-on modeling instruction, and for attempts to understand how probability and rules interact in human cognition.
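    The task the dataset captures can be sketched with a toy Bayesian number-game model in the style of Tenenbaum's size principle. The hypothesis space and uniform prior below are illustrative assumptions, not the paper's model.

```python
from fractions import Fraction

# Illustrative hypothesis space over the range 1-100 (an assumption).
hypotheses = {
    "even":             [n for n in range(1, 101) if n % 2 == 0],
    "odd":              [n for n in range(1, 101) if n % 2 == 1],
    "powers_of_two":    [2 ** k for k in range(1, 7)],   # 2..64
    "multiples_of_ten": list(range(10, 101, 10)),
}

def posterior(data):
    """Posterior over hypotheses under the size principle: each consistent
    observation has likelihood 1/|h|, so small hypotheses win with data."""
    scores = {}
    for name, h in hypotheses.items():
        if all(d in h for d in data):
            scores[name] = Fraction(1, len(h)) ** len(data)
        else:
            scores[name] = Fraction(0)
    z = sum(scores.values())
    return {name: s / z for name, s in scores.items()}

post = posterior([16, 8, 2])
# "powers_of_two" dominates "even": both fit, but the smaller set explains
# the data less coincidentally -- the rule-like behavior the dataset probes.
print(max(post, key=post.get))  # powers_of_two
```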

  6. Range View Road Cross Street Data in Estes Park, CO

    • ownerly.com
    Updated Jan 16, 2022
    Cite
    Ownerly (2022). Range View Road Cross Street Data in Estes Park, CO [Dataset]. https://www.ownerly.com/co/estes-park/range-view-rd-home-details
    Dataset updated
    Jan 16, 2022
    Dataset authored and provided by
    Ownerly
    Area covered
    Colorado, Range View Road, Estes Park
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Road cross streets in Estes Park, CO.

  7. Data from: HomeRange: A global database of mammalian home ranges

    • data.niaid.nih.gov
    • search.dataone.org
    zip
    Updated Sep 13, 2023
    Cite
    Maarten Broekman; Selwyn Hoeks; Rosa Freriks; Merel Langendoen; Katharina Runge; Ecaterina Savenco; Ruben ter Harmsel; Mark Huijbregts; Marlee Tucker (2023). HomeRange: A global database of mammalian home ranges [Dataset]. http://doi.org/10.5061/dryad.d2547d85x
    Available download formats: zip
    Dataset updated
    Sep 13, 2023
    Dataset provided by
    Radboud University Nijmegen
    Authors
    Maarten Broekman; Selwyn Hoeks; Rosa Freriks; Merel Langendoen; Katharina Runge; Ecaterina Savenco; Ruben ter Harmsel; Mark Huijbregts; Marlee Tucker
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Motivation: Home range is a common measure of animal space use, as it provides ecological information that is useful for conservation applications. In macroecological studies, values are typically aggregated to species means to examine general patterns of animal space use. However, this ignores the environmental context in which the home range was estimated and does not account for intraspecific variation in home range size. In addition, macroecological studies of home ranges have historically been biased toward terrestrial mammals. The use of aggregated numbers and the terrestrial focus limit our ability to examine home range patterns across different environments, variation in time, and between different levels of organisation. Here we introduce HomeRange, a global database with 75,611 home-range values across 960 different mammal species, including terrestrial as well as aquatic and aerial species.

    Main types of variables contained: The dataset contains mammal home-range estimates; species names; methodological information on data collection, home-range estimation method, and period of data collection; study coordinates and name of location; and species traits derived from the studies, such as body mass, life stage, reproductive status, and locomotor habit.

    Spatial location and grain: The collected data are distributed globally. Spatial accuracy varies across studies, with the coarsest resolution being 1 degree.

    Time period and grain: The data represent information published between 1939 and 2022. Temporal accuracy varies across studies: some report start and end dates to the day, while others report only the month or year.

    Major taxa and level of measurement: Mammal species from 24 of the 27 different taxonomic orders. Home-range estimates range from individual-level values to population-level averages.

    Methods: Mammalian home range papers were compiled via an extensive literature search. All home range values were extracted from the literature, including individual, group, and population-level values, along with the associated variables described above. Here we include the database, associated metadata, and a reference list of all sources from which home range data were extracted. We also provide an R package, which can be installed from https://github.com/SHoeks/HomeRange. The HomeRange R package provides functions for downloading the latest version of the HomeRange database and loading it as a standard dataframe into R, plotting several statistics of the database, and attaching species traits (e.g. species average body mass, trophic level) from the COMBINE dataset (Soria et al. 2021) for statistical analysis.

  8. Data from: Current and projected research data storage needs of Agricultural...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Current and projected research data storage needs of Agricultural Research Service researchers in 2016 [Dataset]. https://catalog.data.gov/dataset/current-and-projected-research-data-storage-needs-of-agricultural-research-service-researc-f33da
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service (https://www.ars.usda.gov/)
    Description

    The USDA Agricultural Research Service (ARS) recently established SCINet, which consists of a shared high performance computing resource, Ceres, and the dedicated high-speed Internet2 network used to access Ceres. Current and potential SCINet users are using and generating very large datasets, so SCINet needs to be provisioned with adequate data storage for their active computing. It is not designed to hold data beyond active research phases. At the same time, the National Agricultural Library has been developing the Ag Data Commons, a research data catalog and repository designed for public data release and professional data curation. Ag Data Commons needs to anticipate the size and nature of data it will be tasked with handling. The ARS Web-enabled Databases Working Group, organized under the SCINet initiative, conducted a study to establish baseline data storage needs and practices, and to make projections that could inform future infrastructure design, purchases, and policies. The SCINet Web-enabled Databases Working Group helped develop the survey which is the basis for an internal report. While the report was for internal use, the survey and resulting data may be generally useful and are being released publicly. From October 24 to November 8, 2016 we administered a 17-question survey (Appendix A) by emailing a Survey Monkey link to all ARS Research Leaders, intending to cover the data storage needs of all 1,675 SY (Category 1 and Category 4) scientists. We designed the survey to accommodate either individual researcher responses or group responses. Research Leaders could decide, based on their unit's practices or their management preferences, whether to delegate the response to a data management expert in their unit, to ask all members of their unit to respond, or to collate responses from their unit themselves before reporting in the survey.

    Larger storage ranges cover vastly different amounts of data, so the implications could be significant depending on whether the true amount is at the lower or higher end of the range. We therefore requested more detail from "Big Data users," the 47 respondents who indicated they had more than 10 to 100 TB or over 100 TB of total current data (Q5); all other respondents are called "Small Data users." Because not all of these follow-up requests were successful, we used actual follow-up responses to estimate likely responses for those who did not respond. We defined active data as data that would be used within the next six months; all other data were considered inactive, or archival. To calculate per-person storage needs we used the high end of the reported range divided by 1 for an individual response, or by G, the number of individuals in a group response. For Big Data users we used the actual reported values or estimated likely values.

    Resources in this dataset:

    • Resource Title: Appendix A: ARS data storage survey questions. File Name: Appendix A.pdf. Description: The full list of questions asked, with the possible responses. The survey was not administered using this PDF; the PDF was generated directly from the administered survey using the Print option under Design Survey. Asterisked questions were required. A list of Research Units and their associated codes was provided in a drop-down not shown here. Recommended software: Adobe Acrobat (https://get.adobe.com/reader/).
    • Resource Title: CSV of Responses from ARS Researcher Data Storage Survey. File Name: Machine-readable survey response data.csv. Description: Raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. This is the same data as in the Excel spreadsheet (also provided).
    • Resource Title: Responses from ARS Researcher Data Storage Survey. File Name: Data Storage Survey Data for public release.xlsx. Description: MS Excel worksheet that includes raw responses from the administered survey, as downloaded unfiltered from Survey Monkey, including incomplete responses. Also includes additional classification and calculations to support analysis. Individual email addresses and IP addresses have been removed. Recommended software: Microsoft Excel (https://products.office.com/en-us/excel).
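    The per-person storage rule described above (high end of the reported range, divided by 1 for an individual or by G for a group) can be sketched as follows. The range labels are illustrative assumptions, not the survey's exact response options.

```python
# Hypothetical mapping from a reported storage range to its high end, in TB.
RANGE_HIGH_TB = {"<1 TB": 1, "1-10 TB": 10, ">10-100 TB": 100}

def per_person_tb(reported_range, group_size=1):
    """High end of the reported range divided by the number of people
    covered by the response (1 for an individual, G for a group)."""
    return RANGE_HIGH_TB[reported_range] / group_size

print(per_person_tb("1-10 TB", group_size=5))  # 2.0
```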

  9. Data from: Accounting for nonlinear responses to traits improves range shift...

    • datadryad.org
    • search.dataone.org
    zip
    Updated Apr 3, 2024
    Cite
    Anthony Cannistra; Lauren Buckley (2024). Accounting for nonlinear responses to traits improves range shift predictions [Dataset]. http://doi.org/10.5061/dryad.wstqjq2v8
    Available download formats: zip
    Dataset updated
    Apr 3, 2024
    Dataset provided by
    Dryad
    Authors
    Anthony Cannistra; Lauren Buckley
    Time period covered
    Mar 21, 2024
    Description

    We assess model performance using six datasets encompassing a broad taxonomic range. The number of species per dataset ranges from 28 to 239 (mean=118, median=94), and range shifts were observed over periods ranging from 20 to 100+ years. Each dataset was derived from previous evaluations of traits as range shift predictors and consists of a list of focal species, associated species-level traits, and a range shift metric.

  10. Tree Age Estimation Across the U.S. Using Forest Inventory and Analysis...

    • zenodo.org
    csv
    Updated Mar 11, 2025
    Cite
    Jiaming Lu; Chengquan Huang; Karen Schleeweis; Zhenhua Zou; Weishu Gong (2025). Tree Age Estimation Across the U.S. Using Forest Inventory and Analysis Database (FIADB) [Dataset]. http://doi.org/10.5281/zenodo.14775738
    Available download formats: csv
    Dataset updated
    Mar 11, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jiaming Lu; Chengquan Huang; Karen Schleeweis; Zhenhua Zou; Weishu Gong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 30, 2025
    Description

    The tree age dataset was derived for tally trees in the Forest Inventory and Analysis program (FIA) of the US Forest Service using an age-size relationship modeling framework that incorporates species-specific and environmental variables.

    Associated paper: Lu, J., Huang, C., Schleeweis, K., Zou, Z., & Gong, W. (2025). Tree age estimation across the US using forest inventory and analysis database. Forest Ecology and Management, 584, 122603.

    Abstract
    Tree age information is crucial for a range of environmental, scientific, and conservation-related purposes, and helps in understanding and managing forest resources effectively and sustainably. This study presents an approach to estimating tree age across diverse U.S. forested ecosystems using field inventory and climate datasets. The age-size relationship modeling framework incorporates species-specific and environmental variables, enabling its application across various regions. Model R² values range from 0.51 to 0.87, and relative RMSEs (using the mean as the denominator) range from 0.14 to 0.49. These models have higher accuracies and are applicable over larger areas than those in existing studies. The developed tree age dataset reveals marked differences in tree age distribution between Eastern and Western U.S. forests, attributed to historical land use, disturbance, climatic variations, and forest management practices. In the East, forests exhibit a younger age structure due to historical deforestation and subsequent reforestation, while Western forests show an older age structure, influenced by diverse environmental conditions and less human disturbance. By deriving individual tree ages for all trees surveyed in the United States Forest Inventory and Analysis Program, the approach increases the number of tally trees in the FIA database that have age data by more than 20 times over what is currently available. The curated dataset is a crucial resource for forest management and conservation, enhancing our ability to estimate forest carbon sequestration accurately.
    Keywords: Tree Age; Forests; FIA; Structural Attributes
    Data Summary
    The tables are stored as csv files separately for each state. Please see the table below for the column names and description. Among the columns, CN, PLT_CN, INVYR, STATE can be linked to FIA's tree and plot data to query the tree and plot records. Users can also query other variables that were used in the modeling such as diameter and species groups using these keys.
    Columns:

    • CN: Tree sequence number
    • PLT_CN: Plot sequence number
    • INVYR: Inventory year
    • Tree_Age: Predicted tree age
    • zoneID: ID number indicating the modeling zone where this tree is located, corresponding to the modeling zones in Figure 6 in the paper.
    • US_L3CODE: Code indicating the US level-3 ecoregion where this tree is located.
    • US_L3NAME: Name of the US level-3 ecoregion where this tree is located.
    • State: Two-letter abbreviation for each state.
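    Linking the age tables to FIA tree records via these keys is a standard join. A sketch with pandas, using tiny illustrative frames rather than the distributed state files:

```python
import pandas as pd

# Placeholder data: the real age table would be read from a state CSV, and
# FIA tree records from the FIA database download (not reproduced here).
ages = pd.DataFrame({"CN": [101, 102], "Tree_Age": [35, 120]})
fia_trees = pd.DataFrame({"CN": [101, 102], "DIA": [8.2, 21.5]})  # diameter

# Attach predicted ages to the tree records by the CN key.
merged = fia_trees.merge(ages, on="CN", how="left")
print(merged)
```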


  11. South Range, MI Population Breakdown by Gender and Age Dataset: Male and...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    Cite
    Neilsberg Research (2025). South Range, MI Population Breakdown by Gender and Age Dataset: Male and Female Population Distribution Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/e200fba9-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: csv, json
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    South Range, Michigan
    Variables measured
    Male and Female Population Under 5 Years, Male and Female Population over 85 years, Male and Female Population Between 5 and 9 years, Male and Female Population Between 10 and 14 years, Male and Female Population Between 15 and 19 years, Male and Female Population Between 20 and 24 years, Male and Female Population Between 25 and 29 years, Male and Female Population Between 30 and 34 years, Male and Female Population Between 35 and 39 years, Male and Female Population Between 40 and 44 years, and 8 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the three variables, namely (a) Population (Male), (b) Population (Female), and (c) Gender Ratio (Males per 100 Females), we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau across 18 age groups, ranging from under 5 years to 85 years and above. These age groups are described above in the variables section. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of South Range by gender across 18 age groups. It lists the male and female population in each age group, along with the gender ratio, for South Range. The dataset can be utilized to understand the population distribution of South Range by gender and age. For example, using this dataset, we can identify the largest age group for both men and women in South Range. Additionally, it can be used to see how the gender ratio changes from birth to the oldest age group, and the male-to-female ratio across each age group.

    Key observations

    Largest age group (population): Male # 20-24 years (49) | Female # 20-24 years (50). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender, and respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.

    Variables / Data Columns

    • Age Group: This column displays the age group for the South Range population analysis. There are 18 expected values, defined above in the age groups section.
    • Population (Male): The male population of South Range in the given age group.
    • Population (Female): The female population of South Range in the given age group.
    • Gender Ratio: Also known as the sex ratio, this column displays the number of males per 100 females in South Range for each age group.
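    The gender ratio column follows directly from the two population columns. For example, using the reported 20-24 age group (49 males, 50 females):

```python
def gender_ratio(male, female):
    """Males per 100 females, as defined for the Gender Ratio column."""
    return 100 * male / female

# South Range's largest age group per the key observations above.
print(round(gender_ratio(49, 50), 1))  # 98.0
```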

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability, and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to assess the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for South Range Population by Gender, which can be referred to for further research.

  12. Success.ai | | US Premium B2B Emails & Phone Numbers Dataset - APIs and flat...

    • datarade.ai
    Updated Oct 12, 2024
    Cite
    Success.ai (2024). Success.ai | | US Premium B2B Emails & Phone Numbers Dataset - APIs and flat files available – 170M+, Verified Profiles - Best Price Guarantee [Dataset]. https://datarade.ai/data-products/success-ai-us-premium-b2b-emails-phone-numbers-dataset-success-ai
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 12, 2024
    Dataset provided by
    Area covered
    United States
    Description

    Success.ai offers a comprehensive, enterprise-ready B2B leads data solution, ideal for businesses seeking access to over 150 million verified employee profiles and 170 million work emails. Our data empowers organizations across industries to target key decision-makers, optimize recruitment, and fuel B2B marketing efforts. Whether you're looking for UK B2B data, B2B marketing data, or global B2B contact data, Success.ai provides the insights you need with pinpoint accuracy.

    Tailored for B2B Sales, Marketing, Recruitment and more: Our B2B contact data and B2B email data solutions are designed to enhance your lead generation, sales, and recruitment efforts. Build hyper-targeted lists based on job title, industry, seniority, and geographic location. Whether you’re reaching mid-level professionals or C-suite executives, Success.ai delivers the data you need to connect with the right people.

    API Features:

    • Real-Time Updates: Our APIs deliver real-time updates, ensuring that the contact data your business relies on is always current and accurate.
    • High Volume Handling: Designed to support up to 860k API calls per day, our system is built for scalability and responsiveness, catering to enterprises of all sizes.
    • Flexible Integration: Easily integrate with CRM systems, marketing automation tools, and other enterprise applications to streamline your workflows and enhance productivity.

    Key Categories Served:

    • B2B sales leads – Identify decision-makers in key industries
    • B2B marketing data – Target professionals for your marketing campaigns
    • Recruitment data – Source top talent efficiently and reduce hiring times
    • CRM enrichment – Update and enhance your CRM with verified, updated data
    • Global reach – Coverage across 195 countries, including the United States, United Kingdom, Germany, India, Singapore, and more

    Global Coverage with Real-Time Accuracy: Success.ai’s dataset spans a wide range of industries such as technology, finance, healthcare, and manufacturing. With continuous real-time updates, your team can rely on the most accurate data available:

    • 150M+ Employee Profiles: Access professional profiles worldwide with insights including full name, job title, seniority, and industry.
    • 170M Verified Work Emails: Reach decision-makers directly with verified work emails, available across industries and geographies, including Singapore and UK B2B data.
    • GDPR-Compliant: Our data is fully compliant with GDPR and other global privacy regulations, ensuring safe and legal use of B2B marketing data.

    Key Data Points for Every Employee Profile: Every profile in Success.ai’s database includes over 20 critical data points, providing the information needed to power B2B sales and marketing campaigns: Full Name, Job Title, Company, Work Email, Location, Phone Number, LinkedIn Profile, Experience, Education, Technographic Data, Languages, Certifications, Industry, Publications & Awards.

    Use Cases Across Industries: Success.ai’s B2B data solution is incredibly versatile and can support various enterprise use cases, including:

    • B2B Marketing Campaigns: Reach high-value professionals in industries such as technology, finance, and healthcare.
    • Enterprise Sales Outreach: Build targeted B2B contact lists to improve sales efforts and increase conversions.
    • Talent Acquisition: Accelerate hiring by sourcing top talent with accurate and updated employee data, filtered by job title, industry, and location.
    • Market Research: Gain insights into employment trends and company profiles to enrich market research.
    • CRM Data Enrichment: Ensure your CRM stays accurate by integrating updated B2B contact data.
    • Event Targeting: Create lists for webinars, conferences, and product launches by targeting professionals in key industries.

    Use Cases for Success.ai's Contact Data:

    • Targeted B2B Marketing: Create precise campaigns by targeting key professionals in industries like tech and finance.
    • Sales Outreach: Build focused sales lists of decision-makers and C-suite executives for faster deal cycles.
    • Recruiting Top Talent: Easily find and hire qualified professionals with updated employee profiles.
    • CRM Enrichment: Keep your CRM current with verified, accurate employee data.
    • Event Targeting: Create attendee lists for events by targeting relevant professionals in key sectors.
    • Market Research: Gain insights into employment trends and company profiles for better business decisions.
    • Executive Search: Source senior executives and leaders for headhunting and recruitment.
    • Partnership Building: Find the right companies and key people to develop strategic partnerships.

    Why Choose Success.ai’s Employee Data? Success.ai is the top choice for enterprises looking for comprehensive and affordable B2B data solutions. Here’s why:

    • Unmatched Accuracy: Our AI-powered validation process ensures 99% accuracy across all data points, resulting in higher engagement and fewer bounces.
    • Global Scale: With 150M+ employee profiles and 170M veri...

  13. Range View Drive Cross Street Data in Bailey, CO

    • ownerly.com
    Updated Feb 17, 2024
    Cite
    Ownerly (2024). Range View Drive Cross Street Data in Bailey, CO [Dataset]. https://www.ownerly.com/co/bailey/range-view-dr-home-details
    Explore at:
    Dataset updated
    Feb 17, 2024
    Dataset authored and provided by
    Ownerly
    Area covered
    Rangeview Drive, Bailey, Colorado
    Description

    This dataset provides information about the number of properties, residents, and average property values for Range View Drive cross streets in Bailey, CO.

  14. Coffee Shop Daily Revenue Prediction Dataset

    • kaggle.com
    Updated Feb 7, 2025
    Cite
    Himel Sarder (2025). Coffee Shop Daily Revenue Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/himelsarder/coffee-shop-daily-revenue-prediction-dataset/data
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 7, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Himel Sarder
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset Overview

    This dataset contains 2,000 rows of data from coffee shops, offering detailed insights into factors that influence daily revenue. It includes key operational and environmental variables that provide a comprehensive view of how business activities and external conditions affect sales performance. Designed for use in predictive analytics and business optimization, this dataset is a valuable resource for anyone looking to understand the relationship between customer behavior, operational decisions, and revenue generation in the food and beverage industry.

    Columns & Variables

    The dataset features a variety of columns that capture the operational details of coffee shops, including customer activity, store operations, and external factors such as marketing spend and location foot traffic.

    1. Number of Customers Per Day

      • The total number of customers visiting the coffee shop on any given day.
      • Range: 50 - 500 customers.
    2. Average Order Value ($)

      • The average dollar amount spent by each customer during their visit.
      • Range: $2.50 - $10.00.
    3. Operating Hours Per Day

      • The total number of hours the coffee shop is open for business each day.
      • Range: 6 - 18 hours.
    4. Number of Employees

      • The number of employees working on a given day. This can influence service speed, customer satisfaction, and ultimately, sales.
      • Range: 2 - 15 employees.
    5. Marketing Spend Per Day ($)

      • The amount of money spent on marketing campaigns or promotions on any given day.
      • Range: $10 - $500 per day.
    6. Location Foot Traffic (people/hour)

      • The number of people passing by the coffee shop per hour, a variable indicative of the shop's location and its potential to attract customers.
      • Range: 50 - 1000 people per hour.

    Target Variable

    • Daily Revenue ($)
      • This is the dependent variable representing the total revenue generated by the coffee shop each day.
      • It is calculated as a combination of customer visits, average spending, and other operational factors like marketing spend and staff availability.
      • Range: $200 - $10,000 per day.

    Data Distribution & Insights

    The dataset spans a wide variety of operational scenarios, from small neighborhood coffee shops with limited traffic to larger, high-traffic locations with extensive marketing budgets. This variety allows for exploring different predictive modeling strategies. Key insights that can be derived from the data include:

    • The effect of marketing spend on daily revenue.
    • The correlation between customer count and daily sales.
    • The relationship between staffing levels and revenue generation.
    • The influence of foot traffic and operating hours on customer behavior.

    Use Cases & Applications

    The dataset offers a wide range of applications, especially in predictive analytics, business optimization, and forecasting:

    • Predictive Modeling: Use machine learning models such as regression, decision trees, or neural networks to predict daily revenue based on operational data.
    • Business Strategy Development: Analyze how changes in marketing spend, staff numbers, or operating hours can optimize revenue and improve efficiency.
    • Customer Insights: Identify patterns in customer behavior related to shop operations and external factors like foot traffic and marketing campaigns.
    • Resource Allocation: Determine optimal staffing levels and marketing budgets based on predicted sales, improving overall profitability.
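    As an illustration of the predictive-modeling use case, the sketch below fits a one-feature least-squares regression on synthetic rows drawn from the ranges listed above. The revenue formula, noise level, and variable names are invented for illustration and are not taken from the actual Kaggle data:

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# Synthetic rows sampled from the documented ranges (illustrative only):
# revenue is modeled as customers x average order value plus noise.
random.seed(42)
customers = [random.randint(50, 500) for _ in range(2000)]
order_value = [random.uniform(2.50, 10.00) for _ in range(2000)]
revenue = [c * v + random.gauss(0, 50) for c, v in zip(customers, order_value)]

# Engineered feature: customers x average order value, a gross-sales proxy.
gross = [c * v for c, v in zip(customers, order_value)]
slope, intercept = fit_line(gross, revenue)
```

    On data generated this way, the fitted slope recovers a value close to 1, confirming that the engineered feature explains most of the simulated revenue.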

    Real-World Applications in the Food & Beverage Industry

    For coffee shop owners, managers, and analysts in the food and beverage industry, this dataset provides an essential tool for refining daily operations and boosting profitability. Insights gained from this data can help:

    • Optimize Marketing Campaigns: Evaluate the effectiveness of daily or seasonal marketing campaigns on revenue.
    • Staff Scheduling: Predict busy days and ensure that the right number of employees are scheduled to maximize efficiency.
    • Revenue Forecasting: Provide accurate revenue projections that can assist with financial planning and decision-making.
    • Operational Efficiency: Discover the most profitable operating hours and adjust business hours accordingly.

    This dataset is also ideal for aspiring data scientists and machine learning practitioners looking to apply their skills to real-world business problems in the food and beverage sector.

    Conclusion

    The Coffee Shop Revenue Prediction Dataset is a versatile and comprehensive resource for understanding the dynamics of daily sales performance in coffee shops. With a focus on key operational factors, it is perfect for building predictive models, ...

  15. SIGMOD 2024 Programming Contest Datasets

    • zenodo.org
    Updated Oct 27, 2024
    Cite
    Guoliang Li; Dong Deng (2024). SIGMOD 2024 Programming Contest Datasets [Dataset]. http://doi.org/10.5281/zenodo.13998879
    Explore at:
    Available download formats: bin
    Dataset updated
    Oct 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Guoliang Li; Dong Deng
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Our datasets, both the released and the evaluation sets, are derived from the YFCC100M Dataset. Each dataset comprises vectors encoded from images using the CLIP model, which are then reduced to 100 dimensions using Principal Component Analysis (PCA). Additionally, categorical and timestamp attributes are selected from the metadata of the images. The categorical attribute is discretized into integers starting from 0, and the timestamp attribute is normalized into floats between 0 and 1.

    For each query, a query type is randomly selected from four possible types, denoted by the numbers 0 to 3. Then, we randomly choose two data points from dataset D, using their categorical attribute (C), timestamp attribute (T), and vectors to determine the values of the query. Specifically:

    • Randomly sample two data points from D.
    • Use the categorical value of the first data point as v for the equality predicate over the categorical attribute C.
    • Use the timestamp attribute values of the two sampled data points for the range predicate. Designate l as the smaller timestamp value and r as the larger. The range predicate is thus defined as l≤T≤r.
    • Use the vector of the first data point as the query vector.
    • If the query type does not involve v, l, or r, their values are set to -1.

    We ensure that at least 100 data points in D satisfy each query's predicates.
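    The sampling steps above can be sketched as follows. The tuple layout for D is illustrative, not the contest's actual data structure:

```python
import random

def generate_query(D):
    """Sketch of the query-sampling procedure described above.
    D is a list of (category, timestamp, vector) tuples (illustrative)."""
    query_type = random.randint(0, 3)
    p1, p2 = random.sample(D, 2)                 # two random data points from D
    v = p1[0] if query_type in (1, 3) else -1    # categorical value of point 1
    if query_type in (2, 3):
        l, r = sorted((p1[1], p2[1]))            # l = smaller, r = larger timestamp
    else:
        l, r = -1, -1                            # unused predicate values are -1
    return (query_type, v, l, r, p1[2])          # vector of point 1 is the query vector
```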

    Dataset Structure

    Dataset D is in a binary format, beginning with a 4-byte integer num_vectors (uint32_t) indicating the number of vectors. This is followed by data for each vector, stored consecutively, with each vector occupying 102 (= 2 + vector_num_dimension) x sizeof(float32) bytes, summing up to num_vectors x 102 x sizeof(float32) bytes in total. Specifically, for the 102 dimensions of each vector: the first dimension denotes the discretized categorical attribute C and the second dimension denotes the normalized timestamp attribute T. The remaining 100 dimensions are the vector.
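    A reader for this layout can be sketched as follows. Little-endian byte order is an assumption, as the page does not state it explicitly:

```python
import struct

import numpy as np

def read_dataset(path, num_dims=100):
    """Read the layout described above: a uint32 vector count, then
    num_vectors rows of (2 + num_dims) float32 values each.
    Little-endian byte order is assumed, not stated by the source."""
    with open(path, "rb") as f:
        num_vectors = struct.unpack("<I", f.read(4))[0]
        data = np.fromfile(f, dtype=np.float32,
                           count=num_vectors * (2 + num_dims))
    data = data.reshape(num_vectors, 2 + num_dims)
    C = data[:, 0]         # discretized categorical attribute
    T = data[:, 1]         # normalized timestamp attribute in [0, 1]
    vectors = data[:, 2:]  # the 100-dimensional PCA-reduced CLIP vector
    return C, T, vectors
```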

    Query Set Structure

    Query set Q is in a binary format, beginning with a 4-byte integer num_queries (uint32_t) indicating the number of queries. This is followed by data for each query, stored consecutively, with each query occupying 104 (= 4 + vector_num_dimension) x sizeof(float32) bytes, summing up to num_queries x 104 x sizeof(float32) bytes in total.

    The 104-dimensional representation for a query is organized as follows:

    • The first dimension denotes query_type (takes values from 0, 1, 2, 3).
    • The second dimension denotes the specific query value v for the categorical attribute (if not queried, takes -1).
    • The third dimension denotes the specific query value l for the timestamp attribute (if not queried, takes -1).
    • The fourth dimension denotes the specific query value r for the timestamp attribute (if not queried, takes -1).
    • The remaining 100 dimensions are the query vector.

    There are four types of queries; query_type takes values from 0, 1, 2, and 3, corresponding to:

    • If query_type=0: Vector-only query, i.e., the conventional approximate nearest neighbor (ANN) search query.
    • If query_type=1: Vector query with categorical attribute constraint, i.e., ANN search for data points satisfying C=v.
    • If query_type=2: Vector query with timestamp attribute constraint, i.e., ANN search for data points satisfying l≤T≤r.
    • If query_type=3: Vector query with both categorical and timestamp attribute constraints, i.e. ANN search for data points satisfying C=v and l≤T≤r.

    The predicate for the categorical attribute is an equality predicate, i.e., C=v. And the predicate for the timestamp attribute is a range predicate, i.e., l≤T≤r.
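    A filtered ANN baseline can apply these two predicates before (or while) ranking candidates by vector distance. A minimal sketch of the predicate check for the four query types:

```python
def matches(query_type, v, l, r, C, T):
    """Whether a data point with categorical attribute C and timestamp T
    satisfies the attribute predicates of a query of the given type."""
    if query_type in (1, 3) and C != v:             # equality predicate C = v
        return False
    if query_type in (2, 3) and not (l <= T <= r):  # range predicate l <= T <= r
        return False
    return True  # query_type 0 imposes no attribute predicate
```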

    Originally provided at https://dbgroup.cs.tsinghua.edu.cn/sigmod2024/task.shtml?content=datasets.

  16. Data from: Regression-Test History Data for Flaky Test-Research, Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Aug 12, 2024
    Cite
    Winter, Stefan (2024). Regression-Test History Data for Flaky Test-Research, Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10639029
    Explore at:
    Dataset updated
    Aug 12, 2024
    Dataset provided by
    Wendler, Philipp
    Winter, Stefan
    Description

    The dataset comprises developer test results of Maven projects with flaky tests across a range of consecutive commits from the projects' git commit histories. The Maven projects are a subset of those investigated in an OOPSLA 2020 paper. The commit range for this dataset has been chosen as the flakiness-introducing commit (FIC) and iDFlakies-commit (see the OOPSLA paper for details). The commit hashes have been obtained from the IDoFT dataset.

    The dataset will be presented at the 1st International Flaky Tests Workshop 2024 (FTW 2024). Please refer to our extended abstract for more details about the motivation for and context of this dataset.

    The following table provides a summary of the data.

    Slug (Module)                                      | FIC Hash | Tests | Commits | Av. Commits/Test | Flaky Tests | Tests w/ Consistent Failures | Total Distinct Histories
    TooTallNate/Java-WebSocket                         | 822d40   | 146   | 75      | 75               | 24          | 1                            | 2.6x10^9
    apereo/java-cas-client (cas-client-core)           | 5e3655   | 157   | 65      | 61.7             | 3           | 2                            | 1.0x10^7
    eclipse-ee4j/tyrus (tests/e2e/standard-config)     | ce3b8c   | 185   | 16      | 16               | 12          | 0                            | 261
    feroult/yawp (yawp-testing/yawp-testing-appengine) | abae17   | 1     | 191     | 191              | 1           | 1                            | 8
    fluent/fluent-logger-java                          | 5fd463   | 19    | 131     | 105.6            | 11          | 2                            | 8.0x10^32
    fluent/fluent-logger-java                          | 87e957   | 19    | 160     | 122.4            | 11          | 3                            | 2.1x10^31
    javadelight/delight-nashorn-sandbox                | d0d651   | 81    | 113     | 100.6            | 2           | 5                            | 4.2x10^10
    javadelight/delight-nashorn-sandbox                | d19eee   | 81    | 93      | 83.5             | 1           | 5                            | 2.6x10^9
    sonatype-nexus-community/nexus-repository-helm     | 5517c8   | 18    | 32      | 32               | 0           | 0                            | 18
    spotify/helios (helios-services)                   | 23260    | 190   | 448     | 448              | 0           | 37                           | 190
    spotify/helios (helios-testing)                    | 78a864   | 43    | 474     | 474              | 0           | 7                            | 43

    The columns are composed of the following variables:

    Slug (Module): The project's GitHub slug (i.e., the project's URL is https://github.com/{Slug}) and, if specified, the module for which tests have been executed.

    FIC Hash: The flakiness-introducing commit hash for a known flaky test as described in this OOPSLA 2020 paper. As different flaky tests have different FIC hashes, there may be multiple rows for the same slug/module with different FIC hashes.

    Tests: The number of distinct test class and method combinations over the entire considered commit range.

    Commits: The number of commits in the considered commit range.

    Av. Commits/Test: The average number of commits per test class and method combination in the considered commit range. The number of commits may vary for each test class, as some tests may be added or removed within the considered commit range.

    Flaky Tests: The number of distinct test class and method combinations that have more than one test result (passed/skipped/error/failure + exception type, if any + assertion message, if any) across 30 repeated test suite executions on at least one commit in the considered commit range.

    Tests w/ Consistent Failures: The number of distinct test class and method combinations that have the same error or failure result (error/failure + exception type, if any + assertion message, if any) across all 30 repeated test suite executions on at least one commit in the considered commit range.

    Total Distinct Histories: The number of distinct test results (passed/skipped/error/failure + exception type, if any + assertion message, if any) for all test class and method combinations along all commits for that test in the considered commit range.

  17. Human Vital Sign Dataset

    • kaggle.com
    Updated Jul 19, 2024
    Cite
    DatasetEngineer (2024). Human Vital Sign Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/8992827
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    DatasetEngineer
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    The Human Vital Signs Dataset is a comprehensive collection of key physiological parameters recorded from patients. This dataset is designed to support research in medical diagnostics, patient monitoring, and predictive analytics. It includes both original attributes and derived features to provide a holistic view of patient health.

    Attributes

    1. Patient ID

      • Description: A unique identifier assigned to each patient.
      • Type: Integer
      • Example: 1, 2, 3, ...
    2. Heart Rate

      • Description: The number of heartbeats per minute.
      • Type: Integer
      • Range: 60-100 bpm (for this dataset)
      • Example: 72, 85, 90
    3. Respiratory Rate

      • Description: The number of breaths taken per minute.
      • Type: Integer
      • Range: 12-20 breaths per minute (for this dataset)
      • Example: 16, 18, 15
    4. Timestamp

      • Description: The exact time at which the vital signs were recorded.
      • Type: Datetime
      • Format: YYYY-MM-DD HH:MM
      • Example: 2023-07-19 10:15:30
    5. Body Temperature

      • Description: The body temperature measured in degrees Celsius.
      • Type: Float
      • Range: 36.0-37.5°C (for this dataset)
      • Example: 36.7, 37.0, 36.5
    6. Oxygen Saturation

      • Description: The percentage of oxygen-bound hemoglobin in the blood.
      • Type: Float
      • Range: 95-100% (for this dataset)
      • Example: 98.5, 97.2, 99.1
    7. Systolic Blood Pressure

      • Description: The pressure in the arteries when the heart beats (systolic pressure).
      • Type: Integer
      • Range: 110-140 mmHg (for this dataset)
      • Example: 120, 130, 115
    8. Diastolic Blood Pressure

      • Description: The pressure in the arteries when the heart rests between beats (diastolic pressure).
      • Type: Integer
      • Range: 70-90 mmHg (for this dataset)
      • Example: 80, 75, 85
    9. Age

      • Description: The age of the patient.
      • Type: Integer
      • Range: 18-90 years (for this dataset)
      • Example: 25, 45, 60
    10. Gender

      • Description: The gender of the patient.
      • Type: Categorical
      • Categories: Male, Female
    11. Weight (kg)

      • Description: The weight of the patient in kilograms.
      • Type: Float
      • Range: 50-100 kg (for this dataset)
      • Example: 70.5, 80.3, 65.2
    12. Height (m)

      • Description: The height of the patient in meters.
      • Type: Float
      • Range: 1.5-2.0 m (for this dataset)
      • Example: 1.75, 1.68, 1.82

    Derived Features

    1. Derived_HRV (Heart Rate Variability)

      • Description: A measure of the variation in time between heartbeats.
      • Type: Float
      • Formula: HRV = (Standard Deviation of Heart Rate over a Period) / (Mean Heart Rate over the Same Period)
      • Example: 0.10, 0.12, 0.08
    2. Derived_Pulse_Pressure (Pulse Pressure)

      • Description: The difference between systolic and diastolic blood pressure.
      • Type: Integer
      • Formula: PP = Systolic Blood Pressure - Diastolic Blood Pressure
      • Example: 40, 45, 30
    3. Derived_BMI (Body Mass Index)

      • Description: A measure of body fat based on weight and height.
      • Type: Float
      • Formula: BMI = Weight (kg) / (Height (m))^2
      • Example: 22.8, 25.4, 20.3
    4. Derived_MAP (Mean Arterial Pressure)

      • Description: The average blood pressure in an individual during a single cardiac cycle.
      • Type: Float
      • Formula: MAP = Diastolic Blood Pressure + (1/3) x (Systolic Blood Pressure - Diastolic Blood Pressure)
      • Example: 93.3, 100.0, 88.7

    Target Feature

    Risk Category

      • Description: Classification of patients into "High Risk" or "Low Risk" based on their vital signs.
      • Type: Categorical
      • Categories: High Risk, Low Risk
      • Criteria for High Risk (any of the following conditions):
        • Heart Rate: > 90 bpm or < 60 bpm
        • Respiratory Rate: > 20 breaths per minute or < 12 breaths per minute
        • Body Temperature: > 37.5°C or < 36.0°C
        • Oxygen Saturation: < 95%
        • Systolic Blood Pressure: > 140 mmHg or < 110 mmHg
        • Diastolic Blood Pressure: > 90 mmHg or < 70 mmHg
        • BMI: > 30 or < 18.5
      • Low Risk: None of the above conditions
      • Example: High Risk, Low Risk

    This dataset, with a total of 200,000 samples, provides a robust foundation for various machine learning and statistical analysis tasks aimed at understanding and predicting patient health outcomes based on vital signs. The inclusion of both original attributes and derived features enhances the richness and utility of the dataset.
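    The closed-form derived features and the risk rule described above can be computed directly from one reading. A minimal sketch (function and parameter names are illustrative, not the dataset's column names):

```python
def derived_features(sbp, dbp, weight_kg, height_m):
    """Compute pulse pressure, BMI, and mean arterial pressure."""
    pp = sbp - dbp                    # pulse pressure: systolic minus diastolic
    bmi = weight_kg / height_m ** 2   # body mass index: weight / height^2
    map_ = dbp + (sbp - dbp) / 3      # mean arterial pressure: DBP + (SBP - DBP)/3
    return pp, bmi, map_

def risk_category(hr, rr, temp, spo2, sbp, dbp, bmi):
    """Apply the High Risk thresholds listed above; Low Risk otherwise."""
    high = (hr > 90 or hr < 60
            or rr > 20 or rr < 12
            or temp > 37.5 or temp < 36.0
            or spo2 < 95
            or sbp > 140 or sbp < 110
            or dbp > 90 or dbp < 70
            or bmi > 30 or bmi < 18.5)
    return "High Risk" if high else "Low Risk"
```

    For example, a 120/80 mmHg reading gives a pulse pressure of 40 and a mean arterial pressure of about 93.3, matching the examples listed above.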

  18. UKCP09: Gridded Datasets of Annual values of extreme temperature range

    • ckan.publishing.service.gov.uk
    • cloud.csiss.gmu.edu
    • +1more
    Updated Jan 26, 2011
    Cite
    ckan.publishing.service.gov.uk (2011). UKCP09: Gridded Datasets of Annual values of extreme temperature range [Dataset]. https://ckan.publishing.service.gov.uk/dataset/ukcp09-gridded-annual-datasets-of-extreme-temperature-range
    Explore at:
    Dataset updated
    Jan 26, 2011
    Dataset provided by
    CKAN (https://ckan.org/)
    Description

    UKCP09: Gridded datasets of annual values. Extreme temperature range, defined as the annual maximum temperature minus the annual minimum temperature. The datasets have been created with financial support from the Department for Environment, Food and Rural Affairs (Defra) and they are being promoted by the UK Climate Impacts Programme (UKCIP) as part of the UK Climate Projections (UKCP09). http://ukclimateprojections.defra.gov.uk/content/view/12/689/. To view this data you will have to register on the Met Office website, here: http://www.metoffice.gov.uk/research/climate/climate-monitoring/UKCP09/register

  19. CompanyData.com (BoldData) - Company Dataset of 6M IT companies worldwide

    • datarade.ai
    Updated Apr 27, 2021
    Cite
    CompanyData.com (BoldData) (2021). CompanyData.com (BoldData) - Company Dataset of 6M IT companies worldwide [Dataset]. https://datarade.ai/data-products/list-of-6m-it-companies-worldwide-bolddata
    Explore at:
    Available download formats: .json, .csv, .xls, .txt
    Dataset updated
    Apr 27, 2021
    Dataset authored and provided by
    CompanyData.com (BoldData)
    Area covered
    Libya, British Indian Ocean Territory, New Zealand, Maldives, Algeria, Turks and Caicos Islands, Swaziland, Korea (Democratic People's Republic of), Uruguay, Taiwan
    Description

    At CompanyData.com (BoldData), we provide verified company data sourced directly from official trade registers. Our global IT company dataset gives you access to 6 million IT businesses worldwide, including software firms, tech consultancies, system integrators, SaaS providers, and other IT service companies. Every record is sourced from authoritative local registries, ensuring unmatched accuracy, coverage, and compliance.

    This dataset is built for professionals who need reliable, structured insights into the global technology sector. Each company profile includes firmographic details such as legal entity name, registration number, business structure, size, revenue range, and industry classification (NACE/SIC). In addition, you'll find direct contact information for decision-makers—emails, mobile numbers, job titles, and department roles—helping you connect with the right people instantly.

    Whether you're validating suppliers for compliance, identifying high-potential leads for sales, enriching your CRM data, or building AI models with clean and segmented business intelligence, our IT dataset is designed to support a wide range of critical use cases. From global enterprises to fast-scaling startups, our data empowers businesses to move faster and smarter.

    We offer multiple delivery methods tailored to your needs. Choose from custom bulk files, access data through our self-service platform, integrate it directly into your systems via real-time API, or let us enrich your existing database with missing fields and decision-maker insights.

    With a database spanning 380 million companies globally, deep IT sector segmentation, and proven expertise in sourcing from local trade registers, CompanyData.com (BoldData) helps your team identify opportunities, ensure compliance, and scale efficiently—wherever your growth takes you.

  20. USA POI & Foot Traffic Enriched Geospatial Dataset by Predik Data-Driven

    • app.mobito.io
    Cite
    USA POI & Foot Traffic Enriched Geospatial Dataset by Predik Data-Driven [Dataset]. https://app.mobito.io/data-product/usa-enriched-geospatial-framework-dataset
    Explore at:
    Area covered
    United States
    Description

    Our dataset provides detailed and precise insights into the business, commercial, and industrial aspects of any given area in the USA, including Point of Interest (POI) data and foot traffic. The dataset is divided into 150x150 sqm areas (geohash 7) and has over 50 variables.

    • Use it for different applications: Our combined dataset, which includes POI and foot traffic data, can be employed for various purposes. Different data teams use it to guide retailers and FMCG brands in site selection, fuel marketing intelligence, analyze trade areas, and assess company risk. Our dataset has also proven to be useful for real estate investment.
    • Get reliable data: Our datasets have been processed, enriched, and tested so your data team can use them more quickly and accurately.
    • Ideal for training ML models: The high quality of our geographic information layers results from more than seven years of work dedicated to the deep understanding and modeling of geospatial Big Data. Among the features that distinguish this dataset is the use of anonymized and user-compliant mobile device GPS location data, enriched with other alternative and public data.
    • Easy to use: Our dataset is user-friendly and can be easily integrated into your current models. Also, we can deliver your data in different formats, like .csv, according to your analysis requirements.
    • Get personalized guidance: In addition to providing reliable datasets, we advise your analysts on their correct implementation. Our data scientists can guide your internal team on the optimal algorithms and models to get the most out of the information we provide (without compromising the security of your internal data).

    Answer questions like:

    • What places does my target user visit in a particular area?
    • Which are the best areas to place a new POS?
    • What is the average yearly income of users in a particular area?
    • What is the influx of visits that my competition receives?
    • What is the volume of traffic surrounding my current POS?

    This dataset is useful for getting insights from industries like:

    • Retail & FMCG
    • Banking, Finance, and Investment
    • Car Dealerships
    • Real Estate
    • Convenience Stores
    • Pharma and medical laboratories
    • Restaurant chains and franchises
    • Clothing chains and franchises

    Our dataset includes more than 50 variables, such as:

    • Number of pedestrians seen in the area.
    • Number of vehicles seen in the area.
    • Average speed of movement of the vehicles seen in the area.
    • Points of Interest (POIs) (in number and type) seen in the area (supermarkets, pharmacies, recreational locations, restaurants, offices, hotels, parking lots, wholesalers, financial services, pet services, shopping malls, among others).
    • Average yearly income range (anonymized and aggregated) of the devices seen in the area.

    Notes to better understand this dataset:

    • POI confidence means the average confidence of POIs in the area. In this case, POIs are any kind of location, such as a restaurant, a hotel, or a library.
    • Category confidences, for example "food_drinks_tobacco_retail_confidence", indicate how confident we are in the existence of food/drink/tobacco retail locations in the area.
    • We added predictions for The Home Depot and Lowe's Home Improvement stores in the dataset sample. These predictions were the result of a machine-learning model that was trained with the data. Knowing where the current stores are, we can find the most similar areas for new stores to open.

    How efficient is a geohash? Geohash is a faster, cost-effective geofencing option that reduces input data load and provides actionable information. Its benefits include faster querying, reduced cost, minimal configuration, and ease of use. Geohash ranges from 1 to 12 characters. The dataset can be split into variable-size geohashes, with the default being geohash7 (150m x 150m).
