100+ datasets found
  1. #Janatahack: Independence Day 2020 ML Hackathon

    • kaggle.com
    zip
    Updated Aug 15, 2020
    Cite
    VinayVikram (2020). #Janatahack: Independence Day 2020 ML Hackathon [Dataset]. https://www.kaggle.com/vin1234/janatahack-independence-day-2020-ml-hackathon
    Available download formats: zip (12001207 bytes)
    Dataset updated
    Aug 15, 2020
    Authors
    VinayVikram
    Description

    Problem Statement

    Topic Modeling for Research Articles

    Researchers have access to large online archives of scientific articles. As a consequence, finding relevant articles has become more difficult. Tagging, or topic modelling, gives each research article an identifying label, which facilitates recommendation and search.

    Given the abstract and title for a set of research articles, predict the topics for each article included in the test set.

    Note that a research article can possibly have more than 1 topic. The research article abstracts and titles are sourced from the following 6 topics:

    1. Computer Science

    2. Physics

    3. Mathematics

    4. Statistics

    5. Quantitative Biology

    6. Quantitative Finance

    Data Dictionary

      train.csv
    Column                  Description
    ID                      Unique ID for each article
    TITLE                   Title of the research article
    ABSTRACT                Abstract of the research article
    Computer Science        Whether article belongs to topic computer science (1/0)
    Physics                 Whether article belongs to topic physics (1/0)
    Mathematics             Whether article belongs to topic Mathematics (1/0)
    Statistics              Whether article belongs to topic Statistics (1/0)
    Quantitative Biology    Whether article belongs to topic Quantitative Biology (1/0)
    Quantitative Finance    Whether article belongs to topic Quantitative Finance (1/0)

    ID                      Unique ID for each article
    TITLE                   Title of the research article
    ABSTRACT                Abstract of the research article

    ID                      Unique ID for each article
    TITLE                   Title of the research article
    ABSTRACT                Abstract of the research article
    Computer Science        Whether article belongs to topic computer science (1/0)
    Physics                 Whether article belongs to topic physics (1/0)
    Mathematics             Whether article belongs to topic Mathematics (1/0)
    Statistics              Whether article belongs to topic Statistics (1/0)
    Quantitative Biology    Whether article belongs to topic Quantitative Biology (1/0)
    Quantitative Finance    Whether article belongs to topic Quantitative Finance (1/0)
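
    As a minimal sketch of how the train.csv columns might be consumed, the snippet below reads rows in that layout and turns the six topic columns into a 0/1 label vector per article. The two sample rows are made up for illustration, and the helper name load_labels is hypothetical rather than part of the dataset.

```python
import csv
import io

# Topic columns exactly as named in the data dictionary.
TOPICS = [
    "Computer Science", "Physics", "Mathematics",
    "Statistics", "Quantitative Biology", "Quantitative Finance",
]


def load_labels(handle):
    """Return {article ID: [0/1 per topic]} from a file laid out like train.csv."""
    reader = csv.DictReader(handle)
    return {row["ID"]: [int(row[topic]) for topic in TOPICS] for row in reader}


# Hypothetical two-row sample standing in for the real file.
SAMPLE = io.StringIO(
    "ID,TITLE,ABSTRACT," + ",".join(TOPICS) + "\n"
    "1,Some title,Some abstract,1,0,0,1,0,0\n"
    "2,Another title,Another abstract,0,1,0,0,0,0\n"
)

labels = load_labels(SAMPLE)
print(labels["1"])  # label vector for article 1, in TOPICS order
```

    With the real file, the same function could be called on an open file handle instead of the in-memory sample.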

    Evaluation Metric

    • Submissions are evaluated on micro F1 Score between the predicted and observed topics for each article in the test set.
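
    For readers unfamiliar with the metric, here is a small sketch of micro-averaged F1 for multi-label predictions. It pools true positives, false positives, and false negatives over every (article, topic) pair before computing a single score. The function name micro_f1 and the toy label matrices are assumptions for illustration; the hackathon's own scoring code is not shown in the source.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (article, topic) pairs of 0/1 labels."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # topic correctly predicted
            elif t == 0 and p == 1:
                fp += 1  # topic predicted but not present
            elif t == 1 and p == 0:
                fn += 1  # topic present but missed
    denominator = 2 * tp + fp + fn
    # Returning 0.0 when there are no positives at all is a convention chosen here.
    return 2 * tp / denominator if denominator else 0.0


# Toy example: two articles, six topics each (made-up values).
y_true = [[1, 0, 0, 1, 0, 0], [0, 1, 0, 0, 0, 0]]
y_pred = [[1, 0, 0, 0, 0, 0], [0, 1, 1, 0, 0, 0]]
score = micro_f1(y_true, y_pred)
print(round(score, 4))
```

    Because counts are pooled before averaging, frequent topics weigh more heavily in the score than rare ones.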


  2. Interactions: Beyond the Research Article

    • figshare.com
    xlsx
    Updated May 31, 2023
    Cite
    Jean Liu (2023). Interactions: Beyond the Research Article [Dataset]. http://doi.org/10.6084/m9.figshare.649417.v1
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jean Liu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    What do the alt-metrics of figshare items tell us? This dataset lists Altmetric data for the top 100 figshare repository items, categorised by type (retrieved on 9 March 2013). The data appear in an Interactions post on the Altmetric blog.

  3. CT-FAN: A Multilingual dataset for Fake News Detection

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 23, 2022
    Cite
    Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl; Juliane Köhler; Michael Wiegand; Melanie Siegel (2022). CT-FAN: A Multilingual dataset for Fake News Detection [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4714516
    Dataset updated
    Oct 23, 2022
    Dataset provided by
    University of Applied Sciences Potsdam
    University of Hildesheim
    Darmstadt University of Applied Sciences
    University of Duisburg-Essen
    University of Klagenfurt
    Authors
    Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl; Juliane Köhler; Michael Wiegand; Melanie Siegel
    Description

    By downloading the data, you agree with the terms & conditions mentioned below:

    Data Access: The data in the research collection may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use them only for research purposes.

    Summaries, analyses and interpretations of the linguistic properties of the information may be derived and published, provided it is impossible to reconstruct the information from these summaries. You may not try identifying the individuals whose texts are included in this dataset. You may not try to identify the original entry on the fact-checking site. You are not permitted to publish any portion of the dataset besides summary statistics or share it with anyone else.

    We grant you the right to access the collection's content as described in this agreement. You may not otherwise make unauthorised commercial use of, reproduce, prepare derivative works, distribute copies, perform, or publicly display the collection or parts of it. You are responsible for keeping and storing the data in a way that others cannot access. The data is provided free of charge.

    Citation

    Please cite our work as

    @InProceedings{clef-checkthat:2022:task3,
      author = {K{\"o}hler, Juliane and Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Wiegand, Michael and Siegel, Melanie and Mandl, Thomas},
      title = "Overview of the {CLEF}-2022 {CheckThat}! Lab Task 3 on Fake News Detection",
      year = {2022},
      booktitle = "Working Notes of CLEF 2022---Conference and Labs of the Evaluation Forum",
      series = {CLEF~'2022},
      address = {Bologna, Italy},
    }

    @article{shahi2021overview,
      title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection},
      author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas},
      journal={Working Notes of CLEF},
      year={2021}
    }

    Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English and German.

    Task 3: Multi-class fake news detection of news articles (English)

    Sub-task A frames fake news detection as a four-class classification problem. Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other. The training data will be released in batches and comprises roughly 1264 English-language articles with their respective labels. Our definitions for the categories are as follows:

    False - The main claim made in an article is untrue.

    Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

    True - This rating indicates that the primary elements of the main claim are demonstrably true.

    Other - An article that cannot be categorised as true, false, or partially false due to a lack of evidence about its claims. This category includes articles in dispute and unproven articles.

    Cross-Lingual Task (German)

    Along with the multi-class task for the English language, we have introduced a task for a low-resource language. We will provide the test data in German. The idea of the task is to use the English data and the concept of transfer to build a classification model for German.

    Input Data

    The data will be provided with the fields Id, title, text, rating, and domain; the columns are described as follows:

    ID - Unique identifier of the news article

    Title - Title of the news article

    text - Text mentioned inside the news article

    our rating - Class of the news article: false, partially false, true, or other

    Output data format

    public_id - Unique identifier of the news article

    predicted_rating - Predicted class

    Sample File

    public_id, predicted_rating
    1, false
    2, true
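
    The output format above can be sketched as a small writer. This is an illustration only: the function name write_submission, the lowercase label spellings, and the choice to emit standard CSV without a space after the comma are assumptions, not details confirmed by the task organisers.

```python
import csv
import io

# Assumed label spellings, taken from the "our rating" description.
ALLOWED_RATINGS = {"false", "partially false", "true", "other"}


def write_submission(predictions, handle):
    """Write (public_id, predicted_rating) pairs with the expected header row."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["public_id", "predicted_rating"])
    for public_id, rating in predictions:
        if rating not in ALLOWED_RATINGS:
            raise ValueError(f"unexpected rating: {rating!r}")
        writer.writerow([public_id, rating])


# Usage with the two rows from the sample file.
buffer = io.StringIO()
write_submission([(1, "false"), (2, "true")], buffer)
output = buffer.getvalue()
print(output)
```

    Validating the label before writing catches misspelled classes early, which is usually cheaper than discovering them after a submission is rejected.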

    IMPORTANT!

    The data spans 2010 to 2022, and the fake news content covers a mix of topics such as elections and COVID-19.

    Baseline: For this task, we have created a baseline system. The baseline system can be found at https://zenodo.org/record/6362498

    Related Work

    Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf

    G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14

    Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104

    Shahi, G. K., Struß, J. M., & Mandl, T. (2021). Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection. Working Notes of CLEF.

    Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeno, A., Míguez, R., Shaar, S., ... & Mandl, T. (2021, March). The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news. In European Conference on Information Retrieval (pp. 639-649). Springer, Cham.

    Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeño, A., Míguez, R., Shaar, S., ... & Kartal, Y. S. (2021, September). Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 264-291). Springer, Cham.

  4. A study of the impact of data sharing on article citations using journal...

    • plos.figshare.com
    • dataverse.harvard.edu
    docx
    Updated Jun 1, 2023
    Cite
    Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose (2023). A study of the impact of data sharing on article citations using journal policies as a natural experiment [Dataset]. http://doi.org/10.1371/journal.pone.0225883
    Available download formats: docx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted. We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift. We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies. We find that articles that make their data available receive 97 additional citations (estimate standard error of 34). We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.
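
    To put the reported estimate in context, the sketch below turns the point estimate (97 additional citations) and its standard error (34) into an approximate z-statistic and 95% confidence interval. The normal approximation and the 1.96 critical value are assumptions made here for illustration; the study's own inference procedure may differ.

```python
# Reported in the description: point estimate and its standard error.
estimate = 97.0
std_error = 34.0

# Assumption: normal approximation with a two-sided 95% critical value.
Z_CRIT = 1.96

z_stat = estimate / std_error
ci_low = estimate - Z_CRIT * std_error
ci_high = estimate + Z_CRIT * std_error

print(f"z = {z_stat:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

    Under these assumptions the interval excludes zero but is wide, so the size of the effect is estimated imprecisely.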

  5. Google Scholar Article Listing(Data Mining)

    • kaggle.com
    zip
    Updated Apr 21, 2023
    Cite
    Muhammad Anas Mahmood (2023). Google Scholar Article Listing(Data Mining) [Dataset]. https://www.kaggle.com/muhammadanasmahmood/google-scholar-article-listingdata-mining
    Available download formats: zip (155055 bytes)
    Dataset updated
    Apr 21, 2023
    Authors
    Muhammad Anas Mahmood
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset lists Google Scholar articles on data mining and may be useful for educational research. It contains 936 unique entries, each with a title, description, author names, article link, and "cited by" and "related articles" fields.

  6. Types of internal company data/information in articles using internal...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    L. Susan Wieland; Lainie Rutkow; S. Swaroop Vedula; Christopher N. Kaufmann; Lori M. Rosman; Claire Twose; Nirosha Mahendraratnam; Kay Dickersin (2023). Types of internal company data/information in articles using internal documents from different types of companies (n = 361 articles). [Dataset]. http://doi.org/10.1371/journal.pone.0094709.t003
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    L. Susan Wieland; Lainie Rutkow; S. Swaroop Vedula; Christopher N. Kaufmann; Lori M. Rosman; Claire Twose; Nirosha Mahendraratnam; Kay Dickersin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1. The totals in this column equal the number of articles using a particular type of data, minus instances of duplicate classification by type of company within category of type of data. These instances were: other types of data were used by articles classified as both tobacco and transportation, both mining and manufacturing, and both tobacco and alcohol, and quantitative data from internal company studies were used by the article classified as both mining and manufacturing. The overall column total is not shown, as it is greater than the total number of included articles (n = 361) because several articles used multiple types of internal documents.

    2. The totals in this row equal the total number of articles for each type of company, minus instances where articles used multiple types of data, of which there are too many to list. The totals for the columns are therefore not equal to the sum of the classifications within the columns. The overall row total is not shown, as it is greater than the total number of included articles (N = 361) because three articles were classified with two types of companies.

  7. Research Article: Breast Cancer Research : BCR

    • catalog.data.gov
    • data.virginia.gov
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). Research Article: Breast Cancer Research : BCR [Dataset]. https://catalog.data.gov/dataset/research-article-breast-cancer-research-bcr
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Disruption of the balance between apoptosis and proliferation is considered to be an important factor in the development and progression of tumours. In the present study we determined the in vivo cell kinetics along the spectrum of apparently normal epithelium, hyperplasia, preinvasive lesions and invasive carcinoma, in breast tissues affected by fibrocystic changes in which preinvasive and/or invasive lesions developed, as a model of breast carcinogenesis.

    Materials and methods: A total of 32 areas of apparently normal epithelium and 135 ductal proliferative and neoplastic lesions were studied. More than one epithelial lesion per case was analyzed. The apoptotic index (AI) and the proliferative index (PI) were expressed as the percentage of TdT-mediated dUTP-nick end-labelling (TUNEL) and Ki-67-positive cells, respectively. The PI/AI (P/A index) was calculated for each case.

    Results: The AIs and PIs were significantly higher in hyperplasia than in apparently normal epithelium (P = 0.04 and P = 0.0005, respectively), in atypical hyperplasia than in hyperplasia (P = 0.01 and P = 0.04, respectively) and in invasive carcinoma than in in situ carcinoma (P < 0.001 and P < 0.001, respectively). The two indices were similar in atypical hyperplasia and in in situ carcinoma. The P/A index increased significantly from normal epithelium to hyperplasia (P = 0.01) and from preinvasive lesions to invasive carcinoma (P = 0.04) whereas it was decreased (non-significantly) from hyperplasia to preinvasive lesions. A strong positive correlation between the AIs and the PIs was found (r = 0.83, P < 0.001).

    Conclusion: These findings suggest accelerating cell turnover along the continuum of breast carcinogenesis. Atypical hyperplasias and in situ carcinomas might be kinetically similar lesions. In the transition from normal epithelium to hyperplasia and from preinvasive lesions to invasive carcinoma the net growth of epithelial cells results from a growth imbalance in favour of proliferation. In the transition from hyperplasia to preinvasive lesions there is an imbalance in favour of apoptosis.

  8. Italy Residents Trends 2030 vs. 2022

    • kaggle.com
    zip
    Updated Jul 22, 2022
    Cite
    Roberto Lofaro (2022). Italy Residents Trends 2030 vs. 2022 [Dataset]. https://www.kaggle.com/datasets/robertolofaro/italy-residents-trends-2030-vs-2022
    Available download formats: zip (2874 bytes)
    Dataset updated
    Jul 22, 2022
    Authors
    Roberto Lofaro
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    Italy
    Description

    As part of forthcoming publications, I collect data that might be interesting in association with other data.

    On 2022-07-18 the Italian business newspaper published an article that reminded me of a book I read a few years ago, "Peoplequake" (you can read here a couple of articles that I posted in Italian in 2017 and 2018).

    The population of Italy, along with Japan, is old and getting older, and most commentators focus on the health system impacts.

    In reality, coupled with a contraction of births well below the "replacement level" (i.e. to keep population steady), this implies the need to rethink something more than just the health system.

    For the time being, see article in Italian referencing part of the data

    As for the data:
    * it is the same information contained within the article, i.e. at the county ("provincia") level
    * to ease clustering analysis and comparison with other data that usually are by region or aggregation of regions within Italy, added clustering by Region/Area from ISTAT, the National Statistics Bureau of Italy

    More information about other indicators at the county level will be gradually added.

    Sources:
    * for the main table: Il Sole 24 Ore (paper edition, manually re-entered)
    * for the region and area list: ISTAT

  9. Austria: Ceramic Household Articles and Toilet Articles 2019-2025

    • app.indexbox.io
    Cite
    IndexBox AI Platform, Austria: Ceramic Household Articles and Toilet Articles 2019-2025 [Dataset]. https://app.indexbox.io/table/6911h6912/40/monthly/
    Dataset authored and provided by
    IndexBox AI Platform
    License

    Attribution-NoDerivs 3.0 (CC BY-ND 3.0): https://creativecommons.org/licenses/by-nd/3.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2019 - Dec 31, 2025
    Area covered
    Austria
    Description

    Statistics on the consumption, production, prices, and trade of Ceramic Household Articles and Toilet Articles in Austria from Jan 2019 to Nov 2025.

  10. Data and code for the article "Bayesian inference for spatio-temporal...

    • entrepot.recherche.data.gouv.fr
    pdf, zip
    Updated Dec 14, 2022
    Cite
    Frédéric Fabre (2022). Data and code for the article "Bayesian inference for spatio-temporal stochastic transmission of plant disease in the presence of roguing: a case study to estimate the dispersal distance of Flavescence dorée" by Kwame Adrakey H., Gibson G.J., Eveillard S., Malembic-Maher S. and F. Fabre [Dataset]. http://doi.org/10.57745/YXOEHX
    Available download formats: pdf (414489), zip (393183939)
    Dataset updated
    Dec 14, 2022
    Dataset provided by
    Recherche Data Gouv
    Authors
    Frédéric Fabre
    License

    https://spdx.org/licenses/etalab-2.0.html

    Description

    Data and code for reproducing the MCMC inferences, tables, and main figures described in Kwame Adrakey et al. 2023. Bayesian inference for spatio-temporal stochastic transmission of plant disease in the presence of roguing: a case study to estimate the dispersal distance of Flavescence dorée. The code is written in C and R.

  11. Data from: Data release associated with the journal article "Solar and...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 19, 2025
    Cite
    U.S. Geological Survey (2025). Data release associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-associated-with-the-journal-article-solar-and-sensor-geometry-not-vegetation-
    Dataset updated
    Nov 19, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States, United States
    Description

    This dataset supports the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" (DOI:10.1016/j.rse.2020.112013). The data release allows users to replicate, test, or further explore results. The dataset consists of 4 separate items based on the analysis approach used in the original publication:

    1) The 'Phenocam' dataset uses images from a phenocam in a pinyon juniper ecosystem in Grand Canyon National Park to determine phenological patterns of multiple plant species. The 'Phenocam' dataset consists of scripts and tabular data developed while performing analyses and includes the final NDVI values for all areas of interest (AOIs) described in the associated publication.

    2) The 'SolarSensorAnalysis' dataset uses downloaded tabular MODIS data to explore relationships between NDVI and multiple solar and sensor angles. The 'SolarSensorAnalysis' dataset consists of download and analysis scripts in Google Earth Engine and R. The source MODIS data used in the analysis are too large to include but are provided through MODIS providers and can be accessed through Google Earth Engine using the included script. A csv file includes solar and sensor angle information for the MODIS pixel closest to the phenocam as well as for a sample of 100 randomly selected MODIS pixels within the GRCA-PJ ecosystem.

    3) The 'WinterPeakExtent' dataset includes final geotiffs showing the temporal frequency extent and associated vegetation physiognomic types experiencing winter NDVI peaks in the western US.

    4) The 'SensorComparison' dataset contains the NDVI time series at the phenocam location from 4 other satellites as well as the code used to download these data.

  12. List of Top Authors of Journal of Big Data sorted by article citations

    • exaly.com
    csv, json
    Updated Nov 1, 2025
    Cite
    (2025). List of Top Authors of Journal of Big Data sorted by article citations [Dataset]. https://exaly.com/journal/30122/journal-of-big-data/top-authors/most-cited
    Available download formats: csv, json
    Dataset updated
    Nov 1, 2025
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    List of Top Authors of Journal of Big Data sorted by article citations.

  13. Descriptive statistics of extracted article titles.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Nicolas Bérubé; Maxime Sainte-Marie; Philippe Mongeon; Vincent Larivière (2023). Descriptive statistics of extracted article titles. [Dataset]. http://doi.org/10.1371/journal.pone.0197775.t001
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Nicolas Bérubé; Maxime Sainte-Marie; Philippe Mongeon; Vincent Larivière
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistics of extracted article titles.

  14. Slovakia Exports of other articles of plastics to Italy

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 29, 2017
    Cite
    TRADING ECONOMICS (2017). Slovakia Exports of other articles of plastics to Italy [Dataset]. https://tradingeconomics.com/slovakia/exports/italy/articles-plastics-polymers-resins
    Available download formats: excel, csv, json, xml
    Dataset updated
    Jul 29, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1990 - Dec 31, 2025
    Area covered
    Slovakia
    Description

    Slovakia's exports of other articles of plastics to Italy were US$43.07 million in 2024, according to the United Nations COMTRADE database on international trade. The data, historical chart, and statistics were last updated in November 2025.

  15. China CN: Other Daily Sundry Article: YoY: Product Inventory

    • ceicdata.com
    Updated Feb 15, 2025
    Cite
    CEICdata.com (2025). China CN: Other Daily Sundry Article: YoY: Product Inventory [Dataset]. https://www.ceicdata.com/en/china/daily-sundry-article-other-daily-sundry-article/cn-other-daily-sundry-article-yoy-product-inventory
    Dataset updated
    Feb 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 1, 2014 - Oct 1, 2015
    Area covered
    China
    Variables measured
    Economic Activity
    Description

    China Other Daily Sundry Article: YoY: Product Inventory data was reported at 14.061 % in Oct 2015, an increase from the previous figure of 11.419 % for Sep 2015. The data is updated monthly, with a median of 10.963 % from Jan 2006 to Oct 2015 across 89 observations. The data reached an all-time high of 43.130 % in Feb 2008 and a record low of -3.172 % in May 2013. The series remains in active status in CEIC and is reported by the National Bureau of Statistics. The data is categorized under China Premium Database’s Industrial Sector – Table CN.BIM: Daily Sundry Article: Other Daily Sundry Article.

  16. Article on Awassi Sheep in Palmyra and Its Surrounding Desert Areas:...

    • drs.britishmuseum.org
    • figshare.com
    pdf
    Updated Oct 24, 2025
    Cite
    AHMAD ALKHANEE; Hasan Ali (2025). Article on Awassi Sheep in Palmyra and Its Surrounding Desert Areas: Statistics and Renowned Breeders in English [Dataset]. http://doi.org/10.25420/britishmuseum.30054130.v1
    Available download formats: pdf
    Dataset updated
    Oct 24, 2025
    Dataset provided by
    The British Museum
    Authors
    AHMAD ALKHANEE; Hasan Ali
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Article on Awassi Sheep in Palmyra and Its Surrounding Desert Areas: Statistics and Renowned Breeders in English

  17. China CN: Other Daily Sundry Article: Account Receivable

    • ceicdata.com
    Cite
    CEICdata.com (2019). China CN: Other Daily Sundry Article: Account Receivable [Dataset]. https://www.ceicdata.com/en/china/daily-sundry-article-other-daily-sundry-article/cn-other-daily-sundry-article-account-receivable
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 1, 2014 - Oct 1, 2015
    Area covered
    China
    Variables measured
    Economic Activity
    Description

    China Other Daily Sundry Article: Account Receivable data was reported at 9.647 RMB bn in Oct 2015, an increase from the previous figure of 9.364 RMB bn for Sep 2015. The data is updated monthly, with a median of 7.258 RMB bn from Dec 2003 to Oct 2015 across 97 observations. The data reached an all-time high of 9.781 RMB bn in Jul 2015 and a record low of 2.043 RMB bn in Dec 2003. The series remains in active status in CEIC and is reported by the National Bureau of Statistics. The data is categorized under China Premium Database’s Industrial Sector – Table CN.BIM: Daily Sundry Article: Other Daily Sundry Article.

  18. Prevalence of journal-specific features (peer-reviewed journal articles...

    • plos.figshare.com
    xls
    Updated Jun 15, 2023
    Cite
    Brady T. West; Joseph W. Sakshaug; Guy Alain S. Aurelien (2023). Prevalence of journal-specific features (peer-reviewed journal articles only). [Dataset]. http://doi.org/10.1371/journal.pone.0158120.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 15, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Brady T. West; Joseph W. Sakshaug; Guy Alain S. Aurelien
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Prevalence of journal-specific features (peer-reviewed journal articles only).

  19. Spain Imports of statuettes and other ornamental ceramic articles from...

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 17, 2025
    Cite
    TRADING ECONOMICS (2025). Spain Imports of statuettes and other ornamental ceramic articles from Kyrgyzstan [Dataset]. https://tradingeconomics.com/spain/imports/kyrgyzstan/statuettes-ornamental-ceramic-articles
    Explore at:
    Available download formats: xml, csv, excel, json
    Dataset updated
    Jul 17, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1990 - Dec 31, 2025
    Area covered
    Spain
    Description

    Spain Imports of statuettes and other ornamental ceramic articles from Kyrgyzstan were US$834 during 2024, according to the United Nations COMTRADE database on international trade. The data, historical chart and statistics were last updated in November 2025.

  20. Spain Exports of statuettes and other ornamental ceramic articles to...

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jun 28, 2024
    Cite
    TRADING ECONOMICS (2024). Spain Exports of statuettes and other ornamental ceramic articles to Bulgaria [Dataset]. https://tradingeconomics.com/spain/exports/bulgaria/statuettes-ornamental-ceramic-articles
    Explore at:
    Available download formats: excel, xml, json, csv
    Dataset updated
    Jun 28, 2024
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1990 - Dec 31, 2025
    Area covered
    Spain
    Description

    Spain Exports of statuettes and other ornamental ceramic articles to Bulgaria were US$78.51 Thousand during 2024, according to the United Nations COMTRADE database on international trade. The data, historical chart and statistics were last updated in December 2025.

Cite
VinayVikram (2020). #Janatahack: Independence Day 2020 ML Hackathon [Dataset]. https://www.kaggle.com/vin1234/janatahack-independence-day-2020-ml-hackathon

#Janatahack: Independence Day 2020 ML Hackathon

Topic Modeling for Research Articles

Explore at:
Available download formats: zip (12001207 bytes)
Dataset updated
Aug 15, 2020
Authors
VinayVikram
Description

Problem Statement

Topic Modeling for Research Articles

Researchers have access to large online archives of scientific articles. As a consequence, finding relevant articles has become more difficult. Tagging or topic modelling gives each research article identifying labels, which facilitates recommendation and search.

Given the abstract and title for a set of research articles, predict the topics for each article included in the test set.

Note that a research article can have more than one topic. The research article abstracts and titles are sourced from the following 6 topics:

  1. Computer Science

  2. Physics

  3. Mathematics

  4. Statistics

  5. Quantitative Biology

  6. Quantitative Finance

Data Dictionary

    train.csv

Column                 Description
ID                     Unique ID for each article
TITLE                  Title of the research article
ABSTRACT               Abstract of the research article
Computer Science       Whether article belongs to topic computer science (1/0)
Physics                Whether article belongs to topic physics (1/0)
Mathematics            Whether article belongs to topic Mathematics (1/0)
Statistics             Whether article belongs to topic Statistics (1/0)
Quantitative Biology   Whether article belongs to topic Quantitative Biology (1/0)
Quantitative Finance   Whether article belongs to topic Quantitative Finance (1/0)

ID                     Unique ID for each article
TITLE                  Title of the research article
ABSTRACT               Abstract of the research article

Evaluation Metric

  • Submissions are evaluated on the micro-averaged F1 score between the predicted and observed topics for each article in the test set.
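
As a rough illustration of the metric, the sketch below computes a micro-averaged F1 score by pooling true positives, false positives and false negatives over every (article, topic) pair before taking the ratio. The function name `micro_f1` and the toy label rows are hypothetical, and the competition's own scoring code may differ in details such as how an all-zero case is handled.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 over multi-label 0/1 indicator rows."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # topic correctly predicted
            elif t == 0 and p == 1:
                fp += 1  # topic predicted but not present
            elif t == 1 and p == 0:
                fn += 1  # topic present but missed
    denom = 2 * tp + fp + fn
    # Assumption: return 0.0 when there are no positive labels or predictions.
    return 2 * tp / denom if denom else 0.0


# Toy rows in the column order of the data dictionary:
# Computer Science, Physics, Mathematics, Statistics,
# Quantitative Biology, Quantitative Finance
y_true = [
    [1, 0, 0, 1, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1, 0],
]
y_pred = [
    [1, 0, 0, 0, 0, 0],  # misses Statistics
    [0, 1, 1, 0, 0, 1],  # adds a spurious Quantitative Finance
    [0, 0, 0, 0, 1, 0],
]

score = micro_f1(y_true, y_pred)
print(f"micro F1 = {score:.3f}")
```

Because the counts are pooled before dividing, frequent topics influence the score more than rare ones, which is the main difference from a macro-averaged F1.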

Inspiration

Your data will be in front of the world's largest data science community. What questions do you want to see answered?
