35 datasets found
  1. Daily active users of DeepSeek 2025

    • statista.com
    Updated Jul 18, 2025
    Cite
    Statista (2025). Daily active users of DeepSeek 2025 [Dataset]. https://www.statista.com/statistics/1561128/deepseek-daily-active-users/
    Explore at:
    Dataset updated
    Jul 18, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 11, 2025 - Feb 15, 2025
    Area covered
    China
    Description

    As of mid-February 2025, the Chinese AI chatbot DeepSeek had around ** million daily active users. When DeepSeek released the research paper illustrating its chatbot's capabilities, a global audience became aware of the company, and the number of daily active users skyrocketed.

  2. DeepSeek Revenue and Usage Statistics (2025)

    • businessofapps.com
    Updated Jan 28, 2025
    Cite
    Business of Apps (2025). DeepSeek Revenue and Usage Statistics (2025) [Dataset]. https://www.businessofapps.com/data/deepseek-statistics/
    Explore at:
    Dataset updated
    Jan 28, 2025
    Dataset authored and provided by
    Business of Apps
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    A chatbot from Chinese AI lab DeepSeek sent shockwaves through the market in January, due to its ability to perform mathematics, coding and reasoning at a similar level to ChatGPT and other top-tier...

  3. Gender distribution of deepseek.com's audience 2025

    • statista.com
    Updated Feb 11, 2025
    Cite
    Statista (2025). Gender distribution of deepseek.com's audience 2025 [Dataset]. https://www.statista.com/statistics/1556992/deepseek-global-website-user-gender-distribution/
    Explore at:
    Dataset updated
    Feb 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2025
    Area covered
    Worldwide, China
    Description

    In January 2025, deepseek.com attracted a total of 278 million visits, with male users accounting for over two-thirds of the audience. Having developed its advanced large language model at a fraction of the usual cost, the Chinese company DeepSeek has rapidly emerged as a significant player in the global AI industry. Its chatbot app hit 20 million daily active users in just three weeks.

  4. Age distribution of deepseek.com's audience 2025

    • statista.com
    Updated Feb 11, 2025
    Cite
    Statista (2025). Age distribution of deepseek.com's audience 2025 [Dataset]. https://www.statista.com/statistics/1556994/deepseek-global-website-user-age-distribution/
    Explore at:
    Dataset updated
    Feb 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2025
    Area covered
    Worldwide, China
    Description

    At the end of January 2025, DeepSeek recorded a spike in its web traffic after media reports highlighted the company's efficient and affordable large language model, which disrupted the global AI landscape. Younger internet users have shown strong enthusiasm: that month, around 57 percent of deepseek.com's visitors were between 18 and 34 years old.

  5. Distribution of deepseek.com traffic 2025, by country

    • statista.com
    Updated Feb 11, 2025
    Cite
    Statista (2025). Distribution of deepseek.com traffic 2025, by country [Dataset]. https://www.statista.com/statistics/1556986/deepseek-global-web-traffic-share-by-country/
    Explore at:
    Dataset updated
    Feb 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2025
    Area covered
    Worldwide, China
    Description

    At the end of January 2025, the Chinese AI company Deepseek made global headlines with its cost-effective large language model (LLM), which rivals industry leaders like OpenAI's GPT-4o, sending shockwaves through the global tech community. The web traffic of deepseek.com surged to 278 million visits on desktop and mobile in January 2025, compared to only 12 million visits in the previous month. The company's home country contributed almost a quarter of the desktop traffic, followed by the United States and Brazil.

  6. grammar-correction-deepseek-v9-10k

    • huggingface.co
    Updated Jul 8, 2025
    Cite
    Stimuler (2025). grammar-correction-deepseek-v9-10k [Dataset]. https://huggingface.co/datasets/stimuler/grammar-correction-deepseek-v9-10k
    Explore at:
    Dataset updated
    Jul 8, 2025
    Dataset authored and provided by
    Stimuler
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    grammar-correction-deepseek-v9-10k

    Grammar correction dataset using DeepSeek v9 with GPT prompts for training conversational models

      Dataset Description
    

    This dataset contains conversational data for grammar correction tasks, with system prompts, user inputs, and assistant responses.

      Dataset Structure
    

    Each example contains:

    messages: List of conversation messages with roles (system/user/assistant) and content
    source: Source identifier for the dataset
    … See the full description on the dataset page: https://huggingface.co/datasets/stimuler/grammar-correction-deepseek-v9-10k.
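
    As a minimal sketch (not part of the dataset card), a record with these fields can be inspected via the Hugging Face datasets library; the "train" split name is an assumption:

    ```python
    # Minimal sketch: load the dataset and inspect the fields described above.
    # Assumption: the data lives in a "train" split.
    from datasets import load_dataset

    ds = load_dataset("stimuler/grammar-correction-deepseek-v9-10k", split="train")

    example = ds[0]
    print(example["source"])              # source identifier for this record
    for msg in example["messages"]:       # system / user / assistant turns
        print(msg["role"], "->", msg["content"][:80])
    ```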

  7. math-code-science-deepseek-r1-en

    • huggingface.co
    Updated Jul 5, 2025
    Cite
    Hugo Wong (2025). math-code-science-deepseek-r1-en [Dataset]. https://huggingface.co/datasets/Hugodonotexit/math-code-science-deepseek-r1-en
    Explore at:
    Dataset updated
    Jul 5, 2025
    Authors
    Hugo Wong
    License

    https://choosealicense.com/licenses/cc/

    Description

    R1 Dataset Collection

    Aggregated high-quality English prompts and model-generated responses from DeepSeek R1 and DeepSeek R1-0528.

      Dataset Summary
    

    The R1 Dataset Collection combines multiple public DeepSeek-generated instruction-response corpora into a single, cleaned, English-only JSONL file. Each example consists of a <|user|> prompt and a <|assistant|> response in one "text" field. This release includes:

    ~21,000 examples from the DeepSeek-R1-0528 Distilled Custom… See the full description on the dataset page: https://huggingface.co/datasets/Hugodonotexit/math-code-science-deepseek-r1-en.
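
    Since each record stores the prompt and response in a single "text" field, a minimal sketch of splitting it on the markers might look as follows (assumptions: a "train" split and exactly one <|assistant|> marker per record):

    ```python
    # Minimal sketch: split the single "text" field into prompt and response
    # using the <|user|> / <|assistant|> markers described above.
    from datasets import load_dataset

    ds = load_dataset("Hugodonotexit/math-code-science-deepseek-r1-en", split="train")

    text = ds[0]["text"]
    user_part, assistant_part = text.split("<|assistant|>", 1)   # assumes one marker
    prompt = user_part.replace("<|user|>", "").strip()
    response = assistant_part.strip()
    print(prompt[:100])
    print(response[:100])
    ```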

  8. DAG-Reasoning-DeepSeek-R1-0528

    • huggingface.co
    Updated Jul 29, 2025
    Cite
    t.d.a.g. (2025). DAG-Reasoning-DeepSeek-R1-0528 [Dataset]. http://doi.org/10.57967/hf/6134
    Explore at:
    Dataset updated
    Jul 29, 2025
    Authors
    t.d.a.g.
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Click here to support our open-source dataset and model releases! DAG-Reasoning-DeepSeek-R1-0528 is a dataset focused on analysis and reasoning: directed acyclic graphs generated to test the limits of DeepSeek R1 0528's graph-reasoning skills. This dataset contains:

    4.08k synthetically generated prompts to create directed acyclic graphs in response to user input, with all responses generated using DeepSeek R1 0528. All responses contain a multi-step thinking process to perform effective… See the full description on the dataset page: https://huggingface.co/datasets/sequelbox/DAG-Reasoning-DeepSeek-R1-0528.
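
    For downstream checks of such responses, a graph parsed from a model answer can be verified to be a true DAG; the edge-list representation below is purely illustrative, since the card does not specify how graphs are serialized:

    ```python
    # Minimal sketch: confirm that an extracted edge list forms a directed
    # acyclic graph. The edges here are illustrative, not taken from the dataset.
    import networkx as nx

    edges = [("ingest", "clean"), ("clean", "train"), ("train", "evaluate")]

    graph = nx.DiGraph(edges)
    print(nx.is_directed_acyclic_graph(graph))   # True for this example
    print(list(nx.topological_sort(graph)))      # one valid ordering
    ```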

  9. deepseek-synthetic-emotional-support

    • huggingface.co
    Updated Jul 12, 2025
    Cite
    mrfakename (2025). deepseek-synthetic-emotional-support [Dataset]. https://huggingface.co/datasets/mrfakename/deepseek-synthetic-emotional-support
    Explore at:
    Dataset updated
    Jul 12, 2025
    Authors
    mrfakename
    Description

    Synthetic DeepSeek Emotional Support - Multi-Turn

    100% synthetic emotional support dataset generated by R1 0528 with a new method I'm working on. This is more than just two instances of DeepSeek talking to each other - this new method allows much more genuine and realistic user responses. For now I used this method to generate emotional support conversations but it can easily be applied to other fields in the future. I plan to open-source the entire framework soon - stay tuned.… See the full description on the dataset page: https://huggingface.co/datasets/mrfakename/deepseek-synthetic-emotional-support.

  10. Monthly active users of AI search apps in China 2025

    • ai-chatbox.pro
    • statista.com
    Updated Jun 3, 2025
    Cite
    Lai Lin Thomala (2025). Monthly active users of AI search apps in China 2025 [Dataset]. https://www.ai-chatbox.pro/?_=%2Ftopics%2F8084%2Fbaidu-inc%2F%23XgboDwS6a1rKoGJjSPEePEUG%2FVFd%2Bik%3D
    Explore at:
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    Statista (http://statista.com/)
    Authors
    Lai Lin Thomala
    Area covered
    China
    Description

    China's most popular search engine, Baidu, has leveraged AI capabilities to cement its dominant status. In March 2025, Baidu AI reportedly had over 290 million monthly active users. Douyin's AI search feature outperformed DeepSeek in terms of monthly active users.

  11. Leading global generative AI apps 2025, by downloads

    • statista.com
    Updated Feb 18, 2025
    Cite
    Statista (2025). Leading global generative AI apps 2025, by downloads [Dataset]. https://www.statista.com/statistics/1554189/top-gen-ai-apps-by-downloads/
    Explore at:
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2025
    Area covered
    Worldwide
    Description

    In January 2025, ChatGPT was the most downloaded generative AI mobile app worldwide, with over 40.5 million downloads. DeepSeek ranked second, with 17.6 million downloads, while the app's domestic version for the Chinese market ranked fifth and added 7.8 million downloads to the AI brand. Google Gemini ranked third with approximately 10 million global downloads from global users during January 2025.

  12. Leading generative AI apps South Korea 2025

    • statista.com
    Updated May 22, 2025
    Cite
    Statista (2025). Leading generative AI apps South Korea 2025 [Dataset]. https://www.statista.com/statistics/1553757/south-korea-popular-gen-ai-apps/
    Explore at:
    Dataset updated
    May 22, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2025
    Area covered
    South Korea
    Description

    As of February 2025, the leading generative artificial intelligence (AI) smartphone app in South Korea was ChatGPT, with almost 3.9 million monthly users. The Chinese-developed service DeepSeek-R1 still featured on the list despite a ban on new downloads imposed by the South Korean government.

  13. Replication Package of the paper "Large Language Models for Multilingual...

    • zenodo.org
    zip
    Updated Mar 14, 2025
    + more versions
    Cite
    Anonymous; Anonymous (2025). Replication Package of the paper "Large Language Models for Multilingual Code Generation: A Benchmark and a Study on Code Quality" [Dataset]. http://doi.org/10.5281/zenodo.15028641
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anonymous; Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Large Language Models for Multilingual Code Generation: A Benchmark and a Study on Code Quality

    Abstract

    Having been trained in the wild, Large Language Models (LLMs) may suffer from different types of bias. As shown in previous studies outside software engineering, this includes a language bias, i.e., these models perform differently depending on the language used for the query/prompt. However, so far the impact of language bias on source code generation has not been thoroughly investigated. Therefore, in this paper, we study the influence of the language adopted in the prompt on the quality of the source code generated by three LLMs, specifically GPT, Claude, and DeepSeek. We consider 230 coding tasks for Python and 230 for Java, and translate their related prompts into four languages: Chinese, Hindi, Spanish, and Italian. After generating the code, we measure code quality in terms of passed tests, code metrics, warnings generated by static analysis tools, and language used for the identifiers. Results indicate that (i) source code generated from the English queries is not necessarily better in terms of passed tests and quality metrics, (ii) the quality for different languages varies depending on the programming language and LLM being used, and (iii) the generated code tends to contain mixes of comments and literals written in English and the language used to formulate the prompt.

    Replication Package

    This replication package is organized into two main directories: data and scripts. The data directory contains all the data used in the analysis, including prompts and final results. The scripts directory contains all the Python scripts used for code generation and analysis.

    Data

    The data directory contains five subdirectories, each corresponding to a stage in the analysis pipeline. These are enumerated to reflect the order of the process:

    1. prompt_translation: Contains files with manually translated prompts for each language. Each file is associated with both Python and Java. The structure of each file is as follows:

      • id: The ID of the query in the CoderEval benchmark.
      • prompt: The original English prompt.
      • summary: The original summary.
      • code: The original code.
      • translation: The translation generated by GPT.
      • correction: The manual correction of the GPT-generated translation.
      • correction_tag: A list of tags indicating the corrections made to the translation.
      • generated_code: This column is initially empty and will contain the code generated from the translated prompt.
    2. generation: Contains the code generated by the three LLMs for each programming language and natural language. Each subdirectory (e.g., java_chinese_claude) contains the following:

      • files: The files with the generated code (named by the query ID).
      • report: Reports generated by static analysis tools.
      • A CSV file (e.g., java_chinese_claude.csv) containing the generated code in the corresponding column.
    3. tests: Contains input files for the testing process and the results of the tests. Files in the input_files directory are formatted according to the CoderEval benchmark requirements. The results directory holds the output of the testing process.

    4. quantitative_analysis: Contains all the CSV reports of the static analysis tools and the test output for all languages and models. These files are the inputs for the statistical analysis. The stats directory contains all the output tables for the statistical analysis, which are shown in the paper's tables.

    5. qualitative_analysis: Contains files used for the qualitative analysis:

      • CohenKappaagreement.csv: A file containing the subset used to compute Cohen's kappa metrics for manual analysis.
      • files: Contains all files for the qualitative analysis. Each file has the following columns:
        • id: The ID of the query in the CoderEval benchmark.
        • generated_code: The code generated by the model.
        • comments: The language used for comments.
        • identifiers: The language used for identifiers.
        • literals: The language used for literals.
        • notes: Additional notes.
    6. ablation_study: Contains files for the ablation study. Each file has the following columns:

      • id: The ID of the query in the CoderEval benchmark.
      • prompt: The prompt used for code generation.
      • generated_code, comments, identifiers, and literals: Same as in the qualitative analysis.
      • results.pdf: This file shows the table containing all the percentages of comments, identifiers, and literals extracted from the CSV files of the ablation study.

      Files prefixed with italian contain prompts with signatures and docstrings translated into Italian. The system prompt used is the same as the initial one (see the paper). Files with the english prefix have prompts with the original signature (in English) and the docstring in Italian. The system prompt differs as follows:

    You are an AI that only responds with Python code. You will be given a function signature and its docstring by the user. Write your full implementation (restate the function signature).
    Use a Python code block to write your response.
    Comments and identifiers must be in Italian. 
    For example:
    ```python
    print("Hello World!")

    Scripts

    The scripts directory contains all the scripts used to perform the generations and analyses. All files are properly commented. Here is a brief description of each file:

    • code_generation.py: This script automates code generation using AI models (GPT, DeepSeek, and Claude) for different programming and natural languages. It reads prompts from CSV files, generates code based on the prompts, and saves the results in structured directories. It logs the process, handles errors, and stores the generated code in separate files for each iteration.

    • computeallanalysis.py: This script performs static code analysis on generated code files using different models, languages, and programming languages. It runs various analyses (Flake8, Pylint, Lizard) depending on the programming language: for Python, it runs all three analyses, while for Java, only Lizard is executed. The results are stored in dedicated report directories for each iteration. The script ensures the creation of necessary directories and handles any errors that occur during the analysis process.

    • createtestjava.py: This script processes Java code generated by different models and languages, extracting methods using a JavaParser server. It iterates through multiple iterations of generated code, extracts the relevant method code (or uses the full code if no method is found), and stores the results in a JSONL file for each language and model combination.

    • deepseek_model.py: This script defines a function that sends a request to the DeepSeek API, passing a system and user prompt, and extracts the generated code snippet based on the specified programming language. It prints the extracted code in blue to the console, and if any errors occur during the request or extraction, it prints an error message in red. If successful, it returns the extracted code snippet; otherwise, it returns None. A minimal sketch of this request-and-extract pattern appears after this list.

    • extractpmdreport.py: This script processes PMD analysis reports in SARIF format and converts them into CSV files. It extracts the contents of ZIP files containing the PMD reports, parses the SARIF file to gather analysis results, and saves the findings in a CSV file. The output includes details such as file names, rules, messages, and the count of issues found. The script iterates through multiple languages, models, and iterations, ensuring that PMD reports are properly processed and saved for each combination.

    • flake_analysis.py: The flake_analysis function runs Flake8 to analyze Python files for errors and generates a CSV report summarizing the results. It processes the output, extracting error details such as filenames, error codes, and messages. The errors are grouped by file and saved in a CSV file for easy review.

    • generatepredictionclaude_java.py: The generatecodefrom_prompt function processes a JSON file containing prompts, generates Java code using the Claude API, and saves the generated code to a new JSON file. It validates each prompt, ensures it's JSON-serializable, and sends it to the Claude API for code generation. If the generation is successful, the code is stored in a structured format, and the output is saved to a JSON file for further use.

    • generatepredictionclaude_python.py: This code defines a function generatecodefrom_prompt that processes a JSON file containing prompts, generates Python code using the Claude API, and saves the generated code to a new JSON file. It handles invalid values and ensures all prompts are
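
    The request-and-extract pattern described for deepseek_model.py can be sketched as follows; this is not the package's actual code, and it assumes the OpenAI-compatible DeepSeek endpoint with an illustrative model name and regex:

    ```python
    # Minimal sketch of the deepseek_model.py pattern described above.
    # Assumptions: OpenAI-compatible DeepSeek endpoint, "deepseek-chat" model,
    # and a fenced code block in the response.
    import os
    import re
    from openai import OpenAI

    client = OpenAI(api_key=os.environ["DEEPSEEK_API_KEY"],
                    base_url="https://api.deepseek.com")

    def generate_snippet(system_prompt: str, user_prompt: str, language: str) -> str | None:
        try:
            response = client.chat.completions.create(
                model="deepseek-chat",
                messages=[{"role": "system", "content": system_prompt},
                          {"role": "user", "content": user_prompt}],
            )
            text = response.choices[0].message.content
            # Extract the first fenced code block for the requested language.
            match = re.search(rf"```{language}\n(.*?)```", text, re.DOTALL)
            return match.group(1) if match else None
        except Exception as exc:
            print(f"DeepSeek request failed: {exc}")
            return None
    ```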

  14. open_r1_dataset

    • huggingface.co
    Updated Jan 22, 2025
    Cite
    huang (2025). open_r1_dataset [Dataset]. https://huggingface.co/datasets/xiushenghuang/open_r1_dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 22, 2025
    Authors
    huang
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Integration of publicly available datasets related to R1

    We integrate all the data and remove contaminated data and data with inconsistent formats. The default selected version is 'V1', with a total of 2,592,286 samples.

      1 Relevant datasets mentioned in HuggingFace/open_r1:
    

    (1) HuggingFaceH4/numina-deepseek-r1-qwen-7b: A dataset distilled using DeepSeek-R1-Distill-Qwen-7B. Hugging Face downloads: 631.

    (2) AI-MO/NuminaMath-TIR: A subset of 70K math-related samples… See the full description on the dataset page: https://huggingface.co/datasets/xiushenghuang/open_r1_dataset.
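
    A minimal sketch of the merge-and-clean mechanics described above, using the datasets library; the tiny in-memory datasets and the shared "text" column are illustrative assumptions rather than the card's actual pipeline:

    ```python
    # Minimal sketch: concatenate R1-related corpora and drop exact duplicates.
    # The in-memory stand-ins below assume each source has been normalized to a
    # shared "text" column; they are not the card's real inputs.
    from datasets import Dataset, concatenate_datasets

    part_a = Dataset.from_dict({"text": ["q1 -> a1", "q2 -> a2"]})
    part_b = Dataset.from_dict({"text": ["q2 -> a2", "q3 -> a3"]})  # contains a duplicate

    merged = concatenate_datasets([part_a, part_b])

    seen = set()
    def unseen(example):
        key = hash(example["text"])   # exact-match dedup on the shared column
        if key in seen:
            return False
        seen.add(key)
        return True

    deduplicated = merged.filter(unseen)
    print(len(merged), "->", len(deduplicated))   # 4 -> 3
    ```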

  15. Bespoke-Stratos-17k-DeepSeekrized

    • huggingface.co
    Updated Jan 25, 2025
    Cite
    Seungwoo Ryu (2025). Bespoke-Stratos-17k-DeepSeekrized [Dataset]. https://huggingface.co/datasets/tryumanshow/Bespoke-Stratos-17k-DeepSeekrized
    Explore at:
    Dataset updated
    Jan 25, 2025
    Authors
    Seungwoo Ryu
    Description

    Bespoke-Stratos-17k-DeepSeekrized

    Created by: Seungwoo Ryu

      Introduction
    

    This dataset is a modified version of the original HuggingFaceH4/Bespoke-Stratos-17k dataset, reformatted to match the output format of DeepSeek models.

      Modifications
    

    The user and assistant fields from the original dataset's messages have been moved to user_modified and agent_modified respectively. The content in the agent_modified field has been transformed to match the DeepSeek model's… See the full description on the dataset page: https://huggingface.co/datasets/tryumanshow/Bespoke-Stratos-17k-DeepSeekrized.
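
    A minimal sketch of the field move described above; the content transformation applied to agent_modified is left as a placeholder because the card text is truncated before specifying the exact format, and the column names are taken from the card:

    ```python
    # Minimal sketch: move user/assistant message content into user_modified /
    # agent_modified. reformat_to_deepseek_style is a hypothetical placeholder
    # for the card's (truncated) description of the DeepSeek-style rewrite.
    def reformat_to_deepseek_style(assistant_content: str) -> str:
        return assistant_content  # placeholder; actual transformation not shown in the card

    def deepseekrize(example):
        user_msg = next(m for m in example["messages"] if m["role"] == "user")
        assistant_msg = next(m for m in example["messages"] if m["role"] == "assistant")
        return {
            "user_modified": user_msg["content"],
            "agent_modified": reformat_to_deepseek_style(assistant_msg["content"]),
        }
    ```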

  16. End-to-End Response Time by Input Token Count, by Model

    • artificialanalysis.ai
    Updated Dec 19, 2024
    Cite
    Artificial Analysis (2024). End-to-End Response Time by Input Token Count by Models Model [Dataset]. https://artificialanalysis.ai/models
    Explore at:
    Dataset updated
    Dec 19, 2024
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison by model of the seconds to output 500 tokens, including reasoning-model 'thinking' time; lower is better.

  17. Top AI applications worldwide 2025, by MAU

    • statista.com
    Updated Jul 4, 2025
    Cite
    Statista (2025). Top AI applications worldwide 2025, by MAU [Dataset]. https://www.statista.com/statistics/1609163/top-ai-applications-mau-worldwide/
    Explore at:
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 2025
    Area covered
    Worldwide
    Description

    In February 2025, ChatGPT was the most popular artificial intelligence (AI) application worldwide, with over 400.61 million monthly active users (MAU). The ByteDance-owned chatbot Doubao had around 81.91 million MAU, making it the most popular Chinese-based tool of this kind. The ChatGPT-powered Nova Assistant ranked third with 62.79 million MAU, followed by the Chinese-based DeepSeek with around 61.81 million MAU.

  18. Global ChatGPT and Gemini app downloads 2025

    • statista.com
    • ai-chatbox.pro
    Updated Apr 23, 2025
    + more versions
    Cite
    Statista (2025). Global ChatGPT and Gemini app downloads 2025 [Dataset]. https://www.statista.com/statistics/1497377/global-chatgpt-vs-gemini-app-downloads/
    Explore at:
    Dataset updated
    Apr 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    May 2023 - Mar 2025
    Area covered
    Worldwide
    Description

    In March 2025, ChatGPT’s mobile app recorded over 64.26 million App Store and Google Play downloads worldwide. Google's Gemini AI Assistant mobile app was released on February 8, 2024, and was initially available in the U.S. market only. In the same month, the app registered around 13.92 million downloads.

    Regional preferences shape AI app adoption

    ChatGPT has a strong global presence with over 400.61 million monthly active users in February 2025, but regional preferences vary. In the United States, ChatGPT had a 45 percent download market share, compared to Google Gemini's 11 percent. However, Gemini emerged as the preferred generative AI app in India, representing a 52 percent market share. This competitive landscape now also includes Chinese-based players like ByteDance's Doubao and DeepSeek, indicating an increasingly diverse and evolving worldwide AI ecosystem.

    The AI-powered revolution in online search

    The global AI market has experienced substantial growth, exceeding 184 billion U.S. dollars in 2024 and projected to surpass 826 billion U.S. dollars by 2030. This expansion is mirrored in user behavior, with around 15 million adults in the United States using AI-powered tools as their first option for online search in 2024. Additionally, 68 percent of U.S. adults reported using AI-powered search engines to explore new topics in 2024, and another 44 percent of respondents utilized these tools to learn or explain concepts.

  19. s1_54k_filter_with_isreasoning

    • huggingface.co
    Updated May 29, 2025
    Cite
    XuHu (2025). s1_54k_filter_with_isreasoning [Dataset]. https://huggingface.co/datasets/XuHu6736/s1_54k_filter_with_isreasoning
    Explore at:
    Dataset updated
    May 29, 2025
    Authors
    XuHu
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dataset Card for XuHu6736/s1_54k_filter_with_isreasoning

      Dataset Description
    

    XuHu6736/s1_54k_filter_with_isreasoning is an enhanced version of the XuHu6736/s1_54k_filter dataset. This version includes additional annotations to assess the suitability of each question for reasoning training. These annotations, isreasoning_score and isreasoning, were generated using the deepseek-v3 model. The purpose of these new fields is to allow users to filter, weight, or specifically… See the full description on the dataset page: https://huggingface.co/datasets/XuHu6736/s1_54k_filter_with_isreasoning.
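
    A minimal sketch of using the new annotations to keep only reasoning-suitable questions; the split name, field types, and score cutoff are assumptions:

    ```python
    # Minimal sketch: filter by the isreasoning / isreasoning_score annotations
    # described above. The 0.8 cutoff and field types are illustrative assumptions.
    from datasets import load_dataset

    ds = load_dataset("XuHu6736/s1_54k_filter_with_isreasoning", split="train")

    reasoning_only = ds.filter(
        lambda ex: bool(ex["isreasoning"]) and float(ex["isreasoning_score"]) >= 0.8
    )
    print(len(ds), "->", len(reasoning_only))
    ```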

  20. Coding Index, by Model

    • artificialanalysis.ai
    Updated Dec 19, 2024
    + more versions
    Cite
    Artificial Analysis (2024). Coding Index by Models Model [Dataset]. https://artificialanalysis.ai/models
    Explore at:
    Dataset updated
    Dec 19, 2024
    Dataset authored and provided by
    Artificial Analysis
    Description

    Comparison by model of the average of the coding benchmarks in the Artificial Analysis Intelligence Index (LiveCodeBench & SciCode).
