100+ datasets found
  1. A study of the impact of data sharing on article citations using journal...

    • plos.figshare.com
    • dataverse.harvard.edu
    • +2 more
    docx
    Updated Jun 1, 2023
    Cite
    Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose (2023). A study of the impact of data sharing on article citations using journal policies as a natural experiment [Dataset]. http://doi.org/10.1371/journal.pone.0225883
    Explore at:
    Available download formats: docx
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted. We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift. We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies. We find that articles that make their data available receive 97 additional citations (estimate standard error of 34). We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.

  2. CT-FAN-21 corpus: A dataset for Fake News Detection

    • zenodo.org
    Updated Oct 23, 2022
    Cite
    Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl (2022). CT-FAN-21 corpus: A dataset for Fake News Detection [Dataset]. http://doi.org/10.5281/zenodo.4714517
    Explore at:
    Dataset updated
    Oct 23, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl
    Description

    Data Access: The data in the research collection provided may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use it only for research purposes. Due to these restrictions, the collection is not open data. Please download the Data Sharing Agreement and send the signed form to fakenewstask@gmail.com.

    Citation

    Please cite our work as

    @article{shahi2021overview,
     title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection},
     author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas},
     journal={Working Notes of CLEF},
     year={2021}
    }

    Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English.

    Subtask 3A: Multi-class fake news detection of news articles (English). Subtask 3A frames fake news detection as a four-class classification problem. The training data will be released in batches and comprises roughly 900 articles with their respective labels. Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other. Our definitions for the categories are as follows:

    • False - The main claim made in an article is untrue.

    • Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

    • True - This rating indicates that the primary elements of the main claim are demonstrably true.

    • Other - An article that cannot be categorised as true, false, or partially false due to lack of evidence about its claims. This category includes articles in dispute and unproven articles.

    Subtask 3B: Topical Domain Classification of News Articles (English). Fact-checkers require background expertise to identify the truthfulness of an article. The categorisation will help to automate the sampling process from a stream of data. Given the text of a news article, determine the topical domain of the article. This is a classification problem. The task is to categorise fake news articles into six topical categories such as health, election, crime, climate, and education. This task will be offered for a subset of the data of Subtask 3A.

    Input Data

    The data will be provided with the columns Id, title, text, rating, and domain; the columns are described as follows:

    Task 3a

    • ID - Unique identifier of the news article
    • Title - Title of the news article
    • text - Text mentioned inside the news article
    • our rating - class of the news article as false, partially false, true, other

    Task 3b

    • public_id - Unique identifier of the news article
    • Title - Title of the news article
    • text - Text mentioned inside the news article
    • domain - domain of the given news article (applicable only for task B)

    Output data format

    Task 3a

    • public_id - Unique identifier of the news article
    • predicted_rating - predicted class

    Sample File

    public_id, predicted_rating
    1, false
    2, true

    Task 3b

    • public_id - Unique identifier of the news article
    • predicted_domain - predicted domain

    Sample file

    public_id, predicted_domain
    1, health
    2, crime
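
The output format above can be sketched in Python as follows. This is a minimal illustration, not the organizers' tooling: the column names public_id, predicted_rating and predicted_domain come from the description, while the helper names write_submission and make_task_3a_submission are hypothetical. The sample files show a space after the comma; this sketch writes plain comma-separated values, and whether the scorer requires the space is not stated in the description.

```python
import csv
import io

# Label set taken from the Subtask 3A description (lower-cased here by assumption).
RATINGS = {"false", "partially false", "true", "other"}


def write_submission(predictions, out, value_column="predicted_rating"):
    """Write (public_id, label) pairs as a two-column CSV with a header row."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["public_id", value_column])
    for public_id, label in predictions:
        writer.writerow([public_id, label])


def make_task_3a_submission(predictions):
    """Validate Task 3a labels and return the submission file content as text."""
    for public_id, label in predictions:
        if label not in RATINGS:
            raise ValueError(f"unknown rating {label!r} for article {public_id}")
    buffer = io.StringIO()
    write_submission(predictions, buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(make_task_3a_submission([(1, "false"), (2, "true")]), end="")
```

For Task 3b the same writer could be called with value_column="predicted_domain".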

    Additional data for Training

    To train their models, participants can use additional data in a similar format; some datasets are available on the web. We don't provide the ground truth for those datasets. For testing, we will not use any articles from other datasets. Some of the possible sources:

    IMPORTANT!

    1. The fake news articles used for task 3b are a subset of those used for task 3a.
    2. We have used data from 2010 to 2021, and the fake news content covers a mix of topics such as elections and COVID-19.

    Evaluation Metrics

    This task is evaluated as a classification task. We will use the F1-macro measure for the ranking of teams. There is a limit of 5 runs (total and not per day), and only one person from a team is allowed to submit runs.
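
As a rough illustration of the F1-macro measure named above, the sketch below computes a per-class F1 score and averages the classes with equal weight. It is not the official scorer: the function name macro_f1 is hypothetical, and scoring a class with no true or predicted instances as 0.0 is an assumed convention.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores over the given labels."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # F1 = 2*TP / (2*TP + FP + FN); assumed to be 0.0 when undefined.
        denominator = 2 * tp + fp + fn
        scores.append(2 * tp / denominator if denominator else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = ["true", "false", "other", "true"]
    predicted = ["true", "false", "true", "true"]
    print(macro_f1(gold, predicted, ["true", "false", "other"]))
```

Because every class counts equally, a rare class such as "other" influences the ranking as much as a frequent one.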

    Submission Link: https://competitions.codalab.org/competitions/31238

    Related Work

    • Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf
    • G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in Workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14
    • Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104
  3. WaterAPT article dataset

    • catalog.data.gov
    • s.cnmilf.com
    • +1 more
    Updated Dec 11, 2021
    + more versions
    Cite
    U.S. EPA Office of Research and Development (ORD) (2021). WaterAPT article dataset [Dataset]. https://catalog.data.gov/dataset/waterapt-article-dataset
    Explore at:
    Dataset updated
    Dec 11, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Input tables used to generate the output, i.e., technology ranking in the manuscript.

  4. Data articles in journals

    • data.niaid.nih.gov
    • zenodo.org
    Updated Sep 22, 2023
    + more versions
    Cite
    Balsa-Sanchez, Carlota (2023). Data articles in journals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3753373
    Explore at:
    Dataset updated
    Sep 22, 2023
    Dataset provided by
    Balsa-Sanchez, Carlota
    Loureiro, Vanesa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Version: 5

    Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

    Date of data collection: 2023/09/05

    General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v5.xlsx: full list of 140 academic journals in which data papers and/or software papers could be published
    • data_articles_journal_list_v5.csv: full list of 140 academic journals in which data papers and/or software papers could be published

    Relationship between files: both files contain the same information. Two formats are offered to improve reuse.

    Type of version of the dataset: final processed version

    Versions of the files: 5th version - Information updated: number of journals, URL, document types associated to a specific journal.

    Version: 4

    Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

    Date of data collection: 2022/12/15

    General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v4.xlsx: full list of 140 academic journals in which data papers and/or software papers could be published
    • data_articles_journal_list_v4.csv: full list of 140 academic journals in which data papers and/or software papers could be published

    Relationship between files: both files contain the same information. Two formats are offered to improve reuse.

    Type of version of the dataset: final processed version

    Versions of the files: 4th version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR), Scopus and Web of Science (WOS), Journal Master List.

    Version: 3

    Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

    Date of data collection: 2022/10/28

    General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v3.xlsx: full list of 124 academic journals in which data papers and/or software papers could be published
    • data_articles_journal_list_3.csv: full list of 124 academic journals in which data papers and/or software papers could be published

    Relationship between files: both files contain the same information. Two formats are offered to improve reuse.

    Type of version of the dataset: final processed version

    Versions of the files: 3rd version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).

    Erratum - Data articles in journals Version 3:

    • Botanical Studies -- ISSN 1999-3110 -- JCR (JIF) Q2
    • Data -- ISSN 2306-5729 -- JCR (JIF) n/a
    • Data in Brief -- ISSN 2352-3409 -- JCR (JIF) n/a

    Version: 2

    Author: Francisco Rubio, Universitat Politècnica de València.

    Date of data collection: 2020/06/23

    General description: Datasets can be published in line with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

    • data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers and/or software papers could be published
    • data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers and/or software papers could be published

    Relationship between files: both files contain the same information. Two formats are offered to improve reuse.

    Type of version of the dataset: final processed version

    Versions of the files: 2nd version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Scimago Journal and Country Rank (SJR)

    Total size: 32 KB

    Version 1: Description

    This dataset contains a list of journals that publish data articles, code, software articles and database articles.

    The search strategy in DOAJ and Ulrichsweb was the search for the word data in the title of the journals. Acknowledgements: Xaquín Lores Torres for his invaluable help in preparing this dataset.

  5. Frequency of reported types of studies and use of descriptive and...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Matthew J. Hayat; Amanda Powell; Tessa Johnson; Betsy L. Cadwell (2023). Frequency of reported types of studies and use of descriptive and inferential statistics (n = 216). [Dataset]. http://doi.org/10.1371/journal.pone.0179032.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Matthew J. Hayat; Amanda Powell; Tessa Johnson; Betsy L. Cadwell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Frequency of reported types of studies and use of descriptive and inferential statistics (n = 216).

  6. Proportion of articles that share data.

    • figshare.com
    tiff
    Updated Jun 1, 2023
    + more versions
    Cite
    Ryan P. Womack (2023). Proportion of articles that share data. [Dataset]. http://doi.org/10.1371/journal.pone.0143460.g004
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ryan P. Womack
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This graph shows the proportion of all articles by discipline that share data, making it available to the reader via any indicated mechanism, along with associated confidence intervals. See Tables 9 and 11 for numeric values.

  7. article meta data

    • kaggle.com
    zip
    Updated Feb 27, 2022
    + more versions
    Cite
    Sandeep Gautam (2022). article meta data [Dataset]. https://www.kaggle.com/datasets/gautamsandeep/article-meta-data
    Explore at:
    Available download formats: zip (330196 bytes)
    Dataset updated
    Feb 27, 2022
    Authors
    Sandeep Gautam
    Description

    Dataset

    This dataset was created by Sandeep Gautam


  8. India - Scientific And Technical Journal Articles

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jun 25, 2017
    Cite
    TRADING ECONOMICS (2017). India - Scientific And Technical Journal Articles [Dataset]. https://tradingeconomics.com/india/scientific-and-technical-journal-articles-wb-data.html
    Explore at:
    Available download formats: excel, json, xml, csv
    Dataset updated
    Jun 25, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    India
    Description

    Scientific and technical journal articles in India were reported at 207390 in 2022, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Scientific and technical journal articles - actual values, historical data, forecasts and projections were sourced from the World Bank in March 2025.

  9. Les articles à la une

    • pacificdata.org
    • data.gouv.nc
    • +1 more
    csv, geojson, json +1
    Updated Mar 4, 2025
    Cite
    Gouvernement de la Nouvelle-Calédonie (2025). Les articles à la une [Dataset]. https://pacificdata.org/data/dataset/data_articles_a_la_une-qzpsfn
    Explore at:
    Available download formats: csv, xls, geojson, json
    Dataset updated
    Mar 4, 2025
    Dataset provided by
    Gouvernement de la Nouvelle-Calédonie
    Description

    This dataset records the history of the publications featured on data.gouv.nc since 2019.

  10. Data used by EPA researchers to generate illustrative figures for overview...

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Nov 14, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Data used by EPA researchers to generate illustrative figures for overview article "Multiscale Modeling of Background Ozone: Research Needs to Inform and Improve Air Quality Management" [Dataset]. https://catalog.data.gov/dataset/data-used-by-epa-researchers-to-generate-illustrative-figures-for-overview-article-multisc
    Explore at:
    Dataset updated
    Nov 14, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Data sets used to prepare illustrative figures for the overview article “Multiscale Modeling of Background Ozone”.

    Overview: The CMAQ model output datasets used to create illustrative figures for this overview article were generated by scientists in EPA/ORD/CEMM and EPA/OAR/OAQPS.

    The EPA/ORD/CEMM-generated dataset consisted of hourly CMAQ output from two simulations. The first simulation was performed for July 1 – 31 over a 12 km modeling domain covering the Western U.S. The simulation was configured with the Integrated Source Apportionment Method (ISAM) to estimate the contributions from 9 source categories to modeled ozone. ISAM source contributions for July 17 – 31 averaged over all grid cells located in Colorado were used to generate the illustrative pie chart in the overview article. The second simulation was performed for October 1, 2013 – August 31, 2014 over a 108 km modeling domain covering the northern hemisphere. This simulation was also configured with ISAM to estimate the contributions from non-US anthropogenic sources, natural sources, stratospheric ozone, and other sources on ozone concentrations. Ozone ISAM results from this simulation were extracted along a boundary curtain of the 12 km modeling domain specified over the Western U.S. for the time period January 1, 2014 – July 31, 2014 and used to generate the illustrative time-height cross-sections in the overview article.

    The EPA/OAR/OAQPS-generated dataset consisted of hourly gridded CMAQ output for surface ozone concentrations for the year 2016. The CMAQ simulations were performed over the northern hemisphere at a horizontal resolution of 108 km. NO2 and O3 data for July 2016 were extracted from these simulations to generate the vertically-integrated column densities shown in the illustrative comparison to satellite-derived column densities.

    CMAQ Model Data: The data from the CMAQ model simulations used in this research effort are very large (several terabytes) and cannot be uploaded to ScienceHub due to size restrictions. The model simulations are stored on the /asm archival system accessible through the atmos high-performance computing (HPC) system. Due to data management policies, files on /asm are subject to expiry depending on the template of the project. Files not requested for extension after the expiry date are deleted permanently from the system. The format of the files used in this analysis and listed below is ioapi/netcdf. Documentation of this format, including definitions of the geographical projection attributes contained in the file headers, is available at https://www.cmascenter.org/ioapi/. Documentation on the CMAQ model, including a description of the output file format and output model species, can be found in the CMAQ documentation on the CMAQ GitHub site at https://github.com/USEPA/CMAQ.

    This dataset is associated with the following publication: Hogrefe, C., B. Henderson, G. Tonnesen, R. Mathur, and R. Matichuk. Multiscale Modeling of Background Ozone: Research Needs to Inform and Improve Air Quality Management. EM Magazine. Air and Waste Management Association, Pittsburgh, PA, USA, 1-6, (2020).

  11. Prevalence of journal-specific features (peer-reviewed journal articles...

    • plos.figshare.com
    xls
    Updated Jun 15, 2023
    Cite
    Brady T. West; Joseph W. Sakshaug; Guy Alain S. Aurelien (2023). Prevalence of journal-specific features (peer-reviewed journal articles only). [Dataset]. http://doi.org/10.1371/journal.pone.0158120.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 15, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Brady T. West; Joseph W. Sakshaug; Guy Alain S. Aurelien
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Prevalence of journal-specific features (peer-reviewed journal articles only).

  12. CONTRAST-IT corpus: Spanish data collection

    • zenodo.org
    Updated Sep 2, 2023
    + more versions
    Cite
    Anna-Maria De Cesare (2023). CONTRAST-IT corpus: Spanish data collection [Dataset]. http://doi.org/10.5281/zenodo.8307878
    Explore at:
    Dataset updated
    Sep 2, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anna-Maria De Cesare
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data collection served to create the CONTRAST-IT corpus.

    CONTRAST-IT is a medium-size multilingual corpus (including ca. 1.5 million words) based on a comparable collection of articles published in online daily newspapers. The articles are written in five languages: Italian (from Italy), French (from France), Spanish (from Spain), English (from the UK), and German (from Germany).

    This Spanish dataset includes 300,000 words drawn from 476 articles. All the texts collected are authentic, full-length electronic journalistic articles, chosen based on their high representativeness of contemporary Spanish newspaper language. The articles were published in 2011 and 2012 in two electronic daily newspapers (elpais.com and elmundo.es).

    The corpus and data collection were used in two Swiss National Science Foundation Projects:

    For details on the corpus and data collection, see:

    CONTRAST-IT. Corpus | contrast_it: italiano in prospettiva contrastiva / Italian in a contrastive perspective (unibas.ch)

  13. Data underlying the research of four scenarios in the operation of water...

    • data.4tu.nl
    • 4tu.edu.hpc.n-helix.com
    • +1 more
    zip
    Updated Apr 14, 2021
    Cite
    Hang Wan (2021). Data underlying the research of four scenarios in the operation of water discharge patterns of a dam [Dataset]. http://doi.org/10.4121/14398946.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 14, 2021
    Dataset provided by
    4TU.ResearchData
    Authors
    Hang Wan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Reservoir operation rules:
    (1) continuous flood discharge with ecological priority
    (2) pulse flood discharge with ecological priority
    (3) pulse flood discharge with equal weight of ecology and power generation
    (4) pulse flood discharge with power generation priority

  14. Data release associated with the journal article "Solar and sensor geometry,...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Data release associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-associated-with-the-journal-article-solar-and-sensor-geometry-not-vegetation-
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States, United States
    Description

    This dataset supports the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" (DOI:10.1016/j.rse.2020.112013). The data release allows users to replicate, test, or further explore results. The dataset consists of 4 separate items based on the analysis approach used in the original publication:

    1) The 'Phenocam' dataset uses images from a phenocam in a pinyon juniper ecosystem in Grand Canyon National Park to determine phenological patterns of multiple plant species. The 'Phenocam' dataset consists of scripts and tabular data developed while performing analyses and includes the final NDVI values for all areas of interest (AOIs) described in the associated publication.
    2) The 'SolarSensorAnalysis' dataset uses downloaded tabular MODIS data to explore relationships between NDVI and multiple solar and sensor angles. The 'SolarSensorAnalysis' dataset consists of download and analysis scripts in Google Earth Engine and R. The source MODIS data used in the analysis are too large to include but are provided through MODIS providers and can be accessed through Google Earth Engine using the included script. A csv file includes solar and sensor angle information for the MODIS pixel closest to the phenocam as well as for a sample of 100 randomly selected MODIS pixels within the GRCA-PJ ecosystem.
    3) The 'WinterPeakExtent' dataset includes final geotiffs showing the temporal frequency extent and associated vegetation physiognomic types experiencing winter NDVI peaks in the western US.
    4) The 'SensorComparison' dataset contains the NDVI time series at the phenocam location from 4 other satellites as well as the code used to download these data.

  15. Data for GMD article "A framework for expanding aqueous chemistry in the...

    • gimi9.com
    • datasets.ai
    • +3 more
    Updated Apr 24, 2017
    Cite
    (2017). Data for GMD article "A framework for expanding aqueous chemistry in the Community Multiscale Air Quality (CMAQ) model version 5.1" [Dataset]. https://www.gimi9.com/dataset/data-gov_data-for-gmd-article-a-framework-for-expanding-aqueous-chemistry-in-the-community-multisca/
    Explore at:
    Dataset updated
    Apr 24, 2017
    Description

    These data were used to generate the figures included in the following manuscript: Fahey, et al. (2017) "A framework for expanding aqueous chemistry in the Community Multiscale Air Quality (CMAQ) model version 5.1". Geosci. Mod. Dev. This dataset is associated with the following publication: Fahey, K., A. Carlton, H. Pye, J. Baek, B. Hutzell, C. Stanier, K. Baker, W. Appel, M. Jaoui, and J. Offenberg. A framework for expanding aqueous chemistry in the Community Multiscale Air Quality (CMAQ) model version 5.1. Geoscientific Model Development. Copernicus Publications, Katlenburg-Lindau, GERMANY, 10: 1587-1605, (2017).

  16. Data for the article "Low latitude mesospheric clouds in a warmer climate"

    • figshare.com
    bin
    Updated Dec 14, 2023
    Cite
    Deepashree Dutta (2023). Data for the article "Low latitude mesospheric clouds in a warmer climate" [Dataset]. http://doi.org/10.6084/m9.figshare.24808437.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    figshare
    Authors
    Deepashree Dutta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Seasonal mean (DJF and JJA) temperature, water vapor and cloud fraction data for different experiments from WACCM4.

  17. Data from: MLASK: Multimodal Summarization of Video-based News Articles

    • lindat.mff.cuni.cz
    • live.european-language-grid.eu
    • +1 more
    Updated Nov 2, 2023
    + more versions
    Cite
    Mateusz Krubiński; Pavel Pecina (2023). MLASK: Multimodal Summarization of Video-based News Articles [Dataset]. https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-5135?show=full&locale-attribute=cs
    Dataset updated
    Nov 2, 2023
    Authors
    Mateusz Krubiński; Pavel Pecina
    License

    https://lindat.mff.cuni.cz/repository/xmlui/page/szn-dataset-licence

    Description

    The MLASK corpus consists of 41,243 multi-modal documents – video-based news articles in the Czech language – collected from Novinky.cz (https://www.novinky.cz/) and Seznam Zprávy (https://www.seznamzpravy.cz/). It was introduced in "MLASK: Multimodal Summarization of Video-based News Articles" (Krubiński & Pecina, EACL 2023). The articles' publication dates range from September 2016 to February 2022. The intended use case of the dataset is to model the task of multimodal summarization with multimodal output: based on a pair of a textual article and a short video, a textual summary is generated, and a single frame from the video is chosen as a pictorial summary.

    Each document consists of the following:
    - a .mp4 video
    - a single image (cover picture)
    - the article's text
    - the article's summary
    - the article's title
    - the article's publication date

    All of the videos are re-sampled to 25 fps and resized to the same resolution of 1280x720p. The maximum length of the video is 5 minutes, and the shortest one is 7 seconds. The average video duration is 86 seconds. The quantitative statistics of the lengths of titles, abstracts, and full texts (measured in the number of tokens) are below. Q1 and Q3 denote the first and third quartiles, respectively.

    -          mean               Q1     Median   Q3
    Title      11.16 ± 2.78       9      11       13
    Abstract   33.40 ± 13.86      22     32       43
    Article    276.96 ± 191.74    154    231      343

    The proposed training/dev/test split follows the chronological ordering based on publication date. We use the articles published in the first half (Jan-Jun) of 2021 for validation (2,482 instances) and the ones published in the second half (Jul-Dec) of 2021 and the beginning (Jan-Feb) of 2022 for testing (2,652 instances). The remaining data is used for training (36,109 instances).

    The textual data is shared as a single .tsv file. The visual data (video+image) is shared as a single archive for validation and test splits, and the one from the training split is partitioned based on the publication date.
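
    The chronological split described above can be sketched as a small date-based rule. This is only an illustration under stated assumptions, not the authors' code: the function name `assign_split` and the split labels are hypothetical, and the layout of the .tsv file (column names, date format) is not specified in the description, so the sketch works on plain `datetime.date` values.

```python
from datetime import date

# Boundaries taken from the dataset description:
# validation = Jan-Jun 2021, test = Jul 2021 through Feb 2022, train = the rest.
DEV_START, DEV_END = date(2021, 1, 1), date(2021, 6, 30)
TEST_START, TEST_END = date(2021, 7, 1), date(2022, 2, 28)


def assign_split(pub_date: date) -> str:
    """Return a hypothetical split label ("train", "dev" or "test") for one article."""
    if DEV_START <= pub_date <= DEV_END:
        return "dev"
    if TEST_START <= pub_date <= TEST_END:
        return "test"
    # Everything else (September 2016 through December 2020) is training data.
    return "train"


if __name__ == "__main__":
    for d in (date(2016, 9, 1), date(2021, 3, 15), date(2022, 2, 1)):
        print(d.isoformat(), assign_split(d))
```

    Splitting by date rather than at random keeps the evaluation articles strictly later than the training articles, which matches the chronological ordering the description states.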

  18. Global import data of Article Made

    • volza.com
    csv
    Updated Feb 17, 2025
    Close
    Cite
    Volza.LLC (2025). Global import data of Article Made [Dataset]. https://www.volza.com/imports-mali/mali-import-data-of-article+made
    Available download formats: csv
    Dataset updated
    Feb 17, 2025
    Dataset provided by
    Volza.LLC
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    724 global import shipment records of Article Made, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.

  19. Data related to research article: Towards mouse genetic-specific...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Oct 2, 2021
    Cite
    Jan, Maxime (2021). Data related to research article: Towards mouse genetic-specific RNA-sequencing read mapping [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5513979
    Dataset updated
    Oct 2, 2021
    Dataset provided by
    Franken, Paul
    Xenarios, Ioannis
    Gobet, Nastassia
    Jan, Maxime
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains data related to the research article: "Towards mouse genetic-specific RNA-sequencing read mapping".

  20. Data supplementary for article "PRV-FCM: an extension of fuzzy cognitive...

    • data.mendeley.com
    Updated Mar 9, 2023
    Cite
    William Hoyos (2023). Data supplementary for article "PRV-FCM: an extension of fuzzy cognitive maps for prescriptive modeling" [Dataset]. http://doi.org/10.17632/sz79bzd8mf.1
    Dataset updated
    Mar 9, 2023
    Authors
    William Hoyos
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository stores the synthetic data, programs and models developed for the research entitled "PRV-FCM: an extension of fuzzy cognitive maps for prescriptive modeling".

Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose (2023). A study of the impact of data sharing on article citations using journal policies as a natural experiment [Dataset]. http://doi.org/10.1371/journal.pone.0225883

A study of the impact of data sharing on article citations using journal policies as a natural experiment

60 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: docx
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted. We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift. We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies. We find that articles that make their data available receive 97 additional citations (estimate standard error of 34). We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.
