100+ datasets found

A study of the impact of data sharing on article citations using journal...
plos.figshare.com
dataverse.harvard.edu
+2more
docx
Updated Jun 1, 2023
Cite
Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose (2023). A study of the impact of data sharing on article citations using journal policies as a natural experiment [Dataset]. http://doi.org/10.1371/journal.pone.0225883
Available download formats: docx
Unique identifier
https://doi.org/10.1371/journal.pone.0225883
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted. We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift. We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies. We find that articles that make their data available receive 97 additional citations (estimate standard error of 34). We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.
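As a rough check on how the reported estimate relates to its standard error, the sketch below computes a z-statistic, a two-sided p-value, and an approximate 95% confidence interval. It assumes a normal approximation, which the description does not state; only the figures 97 and 34 come from the text.

```python
import math

# Figures quoted in the description above.
estimate = 97.0        # additional citations for articles that share data
standard_error = 34.0  # reported standard error of that estimate

# Assumption: two-sided test under a normal approximation.
z = estimate / standard_error
p_value = math.erfc(abs(z) / math.sqrt(2))

# 1.96 approximates the 97.5th percentile of the standard normal.
z_crit = 1.96
ci_low = estimate - z_crit * standard_error
ci_high = estimate + z_crit * standard_error

print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
print(f"approximate 95% CI: [{ci_low:.1f}, {ci_high:.1f}]")
```

Under these assumptions the interval excludes zero, which is consistent with the authors describing the effect as a positive one.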
CT-FAN-21 corpus: A dataset for Fake News Detection
zenodo.org
Updated Oct 23, 2022
Cite
Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl (2022). CT-FAN-21 corpus: A dataset for Fake News Detection [Dataset]. http://doi.org/10.5281/zenodo.4714517
Unique identifier
https://doi.org/10.5281/zenodo.4714517
Dataset updated
Oct 23, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gautam Kishore Shahi; Julia Maria Struß; Thomas Mandl
Description
Data Access: The data in the research collection provided may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use it only for research purposes. Due to these restrictions, the collection is not open data. Please download the Data Sharing Agreement and send the signed form to fakenewstask@gmail.com.

Citation

Please cite our work as

@article{shahi2021overview,
  title={Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection},
  author={Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas},
  journal={Working Notes of CLEF},
  year={2021}
}

Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other (e.g., claims in dispute) and detect the topical domain of the article. This task will run in English.

Subtask 3A: Multi-class fake news detection of news articles (English). Subtask 3A frames fake news detection as a four-class classification problem. The training data will be released in batches and will comprise roughly 900 articles with their respective labels. Given the text of a news article, determine whether the main claim made in the article is true, partially true, false, or other. Our definitions for the categories are as follows:

False - The main claim made in an article is untrue.

Partially False - The main claim of an article is a mixture of true and false information. The article contains partially true and partially false information but cannot be considered 100% true. It includes all articles in categories like partially false, partially true, mostly true, miscaptioned, misleading etc., as defined by different fact-checking services.

True - This rating indicates that the primary elements of the main claim are demonstrably true.

Other - An article that cannot be categorised as true, false, or partially false due to lack of evidence about its claims. This category includes articles in dispute and unproven articles.

Subtask 3B: Topical Domain Classification of News Articles (English). Fact-checkers require background expertise to identify the truthfulness of an article. The categorisation will help to automate the sampling process from a stream of data. Given the text of a news article, determine the topical domain of the article (English). This is a classification problem. The task is to categorise fake news articles into six topical categories such as health, election, crime, climate, and education. This task will be offered for a subset of the data of Subtask 3A.

Input Data

The data will be provided with the columns ID, title, text, rating, and domain. The columns are described below:

Task 3a

ID- Unique identifier of the news article

Title- Title of the news article

text- Text mentioned inside the news article

our rating - class of the news article as false, partially false, true, other

Task 3b

public_id- Unique identifier of the news article

Title- Title of the news article

text- Text mentioned inside the news article

domain - domain of the given news article(applicable only for task B)
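A minimal sketch of reading Task 3a rows and checking the labels against the four classes defined above. It assumes a comma-separated file whose header names are `public_id`, `title`, `text`, and `our rating`; the actual header spelling and delimiter of the released files may differ, and the sample rows are made up for illustration.

```python
import csv
import io

# The four classes defined for Subtask 3A.
ALLOWED_RATINGS = {"false", "partially false", "true", "other"}


def load_task3a(handle):
    """Read Task 3a rows and normalise the rating label to lower case."""
    rows = []
    for row in csv.DictReader(handle):
        rating = row["our rating"].strip().lower()
        if rating not in ALLOWED_RATINGS:
            raise ValueError(
                f"unexpected rating {rating!r} for id {row['public_id']}"
            )
        rows.append(
            {
                "public_id": row["public_id"],
                "title": row["title"],
                "text": row["text"],
                "rating": rating,
            }
        )
    return rows


# Made-up rows for illustration only (hypothetical header names).
sample = io.StringIO(
    "public_id,title,text,our rating\n"
    "1,Example title,Example body text,False\n"
    "2,Another title,More body text,partially false\n"
)
articles = load_task3a(sample)
print([a["rating"] for a in articles])
```

Rejecting unknown labels early keeps a mislabelled or misparsed row from silently becoming a fifth class during training.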

Output data format

Task 3a

public_id- Unique identifier of the news article

predicted_rating- predicted class

Sample File

public_id, predicted_rating
1, false
2, true

Task 3b

public_id- Unique identifier of the news article

predicted_domain- predicted domain

Sample file

public_id, predicted_domain
1, health
2, crime
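The output format for both tasks is a two-column file, which could be produced along the following lines. This is a sketch only: the function name `write_predictions` is hypothetical, and the exact delimiter and whitespace expected by the submission system are assumptions (the sample files show a space after the comma, which this writer omits).

```python
import csv
import io


def write_predictions(handle, predictions, label_column):
    """Write (public_id, label) pairs under the given label column name."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["public_id", label_column])
    for public_id, label in predictions:
        writer.writerow([public_id, label])


# Task 3a uses predicted_rating; Task 3b would use predicted_domain.
buffer = io.StringIO()
write_predictions(buffer, [(1, "false"), (2, "true")], "predicted_rating")
submission_text = buffer.getvalue()
print(submission_text)
```

Passing the column name as a parameter lets the same helper serve both subtasks.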

Additional data for Training

To train their models, participants can use additional data in a similar format; some datasets are available on the web. We do not provide the ground truth for those datasets. For testing, we will not use any articles from other datasets. Some possible sources:

Fakenews Classification Datasets

Fake News Detection Challenge KDD 2020

FakeNewsNet

IMPORTANT!

The fake news articles used for Task 3b are a subset of those used for Task 3a.

We have used data from 2010 to 2021, and the fake news content covers a mix of topics such as elections, COVID-19, etc.

Evaluation Metrics

This task is evaluated as a classification task. We will use the F1-macro measure for the ranking of teams. There is a limit of 5 runs (total and not per day), and only one person from a team is allowed to submit runs.
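For reference, macro-averaged F1 is the unweighted mean of the per-class F1 scores. The sketch below implements that definition directly; the toy labels are invented, and scoring a class with no gold or predicted instances as 0 is an assumption, so the organisers' scoring script may behave differently in that edge case.

```python
def macro_f1(gold, predicted, labels):
    """Unweighted mean of per-class F1 over the given label set."""
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        denom = 2 * tp + fp + fn
        # Assumption: a class with no gold and no predicted items scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


labels = ["false", "partially false", "true", "other"]
# Invented toy data, not taken from the corpus.
gold = ["false", "false", "true", "other", "partially false", "true"]
pred = ["false", "true", "true", "other", "false", "true"]
score = macro_f1(gold, pred, labels)
print(round(score, 3))
```

Because every class counts equally, a system that never predicts a rare class such as "partially false" is penalised as heavily as one that misses a frequent class.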

Submission Link: https://competitions.codalab.org/competitions/31238

Related Work

Shahi GK. AMUSED: An Annotation Framework of Multi-modal Social Media Data. arXiv preprint arXiv:2010.00502. 2020 Oct 1. https://arxiv.org/pdf/2010.00502.pdf

G. K. Shahi and D. Nandini, “FakeCovid – a multilingual cross-domain fact check news dataset for covid-19,” in Workshop Proceedings of the 14th International AAAI Conference on Web and Social Media, 2020. http://workshop-proceedings.icwsm.org/abstract?id=2020_14

Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of covid-19 misinformation on twitter. Online Social Networks and Media, 22, 100104. doi: 10.1016/j.osnem.2020.100104
WaterAPT article dataset
catalog.data.gov
s.cnmilf.com
+1more
Updated Dec 11, 2021
+ more versions
Cite
U.S. EPA Office of Research and Development (ORD) (2021). WaterAPT article dataset [Dataset]. https://catalog.data.gov/dataset/waterapt-article-dataset
Dataset updated
Dec 11, 2021
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
Input tables used to generate the output, i.e., technology ranking in the manuscript.
Data articles in journals
data.niaid.nih.gov
zenodo.org
Updated Sep 22, 2023
+ more versions
Cite
Balsa-Sanchez, Carlota (2023). Data articles in journals [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3753373
Dataset updated
Sep 22, 2023
Dataset provided by
Balsa-Sanchez, Carlota
Loureiro, Vanesa
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Version: 5

Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

Date of data collection: 2023/09/05

General description: Datasets can be published in accordance with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

data_articles_journal_list_v5.xlsx: full list of 140 academic journals in which data papers or/and software papers could be published

data_articles_journal_list_v5.csv: full list of 140 academic journals in which data papers or/and software papers could be published

Relationship between files: both files have the same information. Two different formats are offered to improve reuse

Type of version of the dataset: final processed version

Versions of the files: 5th version - Information updated: number of journals, URL, document types associated to a specific journal.

Version: 4

Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

Date of data collection: 2022/12/15

General description: Datasets can be published in accordance with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

data_articles_journal_list_v4.xlsx: full list of 140 academic journals in which data papers or/and software papers could be published

data_articles_journal_list_v4.csv: full list of 140 academic journals in which data papers or/and software papers could be published

Relationship between files: both files have the same information. Two different formats are offered to improve reuse

Type of version of the dataset: final processed version

Versions of the files: 4th version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR), Scopus and Web of Science (WOS), Journal Master List.

Version: 3

Authors: Carlota Balsa-Sánchez, Vanesa Loureiro

Date of data collection: 2022/10/28

General description: Datasets can be published in accordance with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

data_articles_journal_list_v3.xlsx: full list of 124 academic journals in which data papers or/and software papers could be published

data_articles_journal_list_3.csv: full list of 124 academic journals in which data papers or/and software papers could be published

Relationship between files: both files have the same information. Two different formats are offered to improve reuse

Type of version of the dataset: final processed version

Versions of the files: 3rd version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Journal Citation Reports (JCR) and/or Scimago Journal and Country Rank (SJR).

Erratum - Data articles in journals Version 3:

Botanical Studies -- ISSN 1999-3110 -- JCR (JIF) Q2
Data -- ISSN 2306-5729 -- JCR (JIF) n/a
Data in Brief -- ISSN 2352-3409 -- JCR (JIF) n/a

Version: 2

Author: Francisco Rubio, Universitat Politècnica de València.

Date of data collection: 2020/06/23

General description: Datasets can be published in accordance with the FAIR principles by publishing a data paper (or software paper) in a data journal or in a standard academic journal. The Excel and CSV files contain a list of academic journals that publish data papers and software papers. File list:

data_articles_journal_list_v2.xlsx: full list of 56 academic journals in which data papers or/and software papers could be published

data_articles_journal_list_v2.csv: full list of 56 academic journals in which data papers or/and software papers could be published

Relationship between files: both files have the same information. Two different formats are offered to improve reuse

Type of version of the dataset: final processed version

Versions of the files: 2nd version - Information updated: number of journals, URL, document types associated to a specific journal, publishers normalization and simplification of document types - Information added : listed in the Directory of Open Access Journals (DOAJ), indexed in Web of Science (WOS) and quartile in Scimago Journal and Country Rank (SJR)

Total size: 32 KB

Version 1: Description

This dataset contains a list of journals that publish data articles, code, software articles and database articles.

The search strategy in DOAJ and Ulrichsweb was the search for the word data in the title of the journals. Acknowledgements: Xaquín Lores Torres for his invaluable help in preparing this dataset.
Frequency of reported types of studies and use of descriptive and...
plos.figshare.com
xls
Updated May 31, 2023
Cite
Matthew J. Hayat; Amanda Powell; Tessa Johnson; Betsy L. Cadwell (2023). Frequency of reported types of studies and use of descriptive and inferential statistics (n = 216). [Dataset]. http://doi.org/10.1371/journal.pone.0179032.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0179032.t002
Dataset updated
May 31, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Matthew J. Hayat; Amanda Powell; Tessa Johnson; Betsy L. Cadwell
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Frequency of reported types of studies and use of descriptive and inferential statistics (n = 216).
Proportion of articles that share data.
figshare.com
tiff
Updated Jun 1, 2023
+ more versions
Cite
Ryan P. Womack (2023). Proportion of articles that share data. [Dataset]. http://doi.org/10.1371/journal.pone.0143460.g004
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pone.0143460.g004
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Ryan P. Womack
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This graph shows the proportion of all articles by discipline that share data, making it available to the reader via any indicated mechanism, along with associated confidence intervals. See Tables 9 and 11 for numeric values.
article meta data
kaggle.com
zip
Updated Feb 27, 2022
+ more versions
Cite
Sandeep Gautam (2022). article meta data [Dataset]. https://www.kaggle.com/datasets/gautamsandeep/article-meta-data
Available download formats: zip (330196 bytes)
Dataset updated
Feb 27, 2022
Authors
Sandeep Gautam
Description
Dataset

This dataset was created by Sandeep Gautam

Contents
India - Scientific And Technical Journal Articles
tradingeconomics.com
csv, excel, json, xml
Updated Jun 25, 2017
Cite
TRADING ECONOMICS (2017). India - Scientific And Technical Journal Articles [Dataset]. https://tradingeconomics.com/india/scientific-and-technical-journal-articles-wb-data.html
Available download formats: excel, json, xml, csv
Dataset updated
Jun 25, 2017
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 1, 1976 - Dec 31, 2025
Area covered
India
Description
The number of scientific and technical journal articles in India was reported at 207390 in 2022, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Scientific and technical journal articles - actual values, historical data, forecasts and projections were sourced from the World Bank in March 2025.
Les articles à la une
pacificdata.org
data.gouv.nc
+1more
csv, geojson, json +1
Updated Mar 4, 2025
Cite
Gouvernement de la Nouvelle-Calédonie (2025). Les articles à la une [Dataset]. https://pacificdata.org/data/dataset/data_articles_a_la_une-qzpsfn
Available download formats: csv, xls, geojson, json
Dataset updated
Mar 4, 2025
Dataset provided by
Gouvernement de la Nouvelle-Calédonie
Description
This dataset records the history of publications featured on data.gouv.nc since 2019.
Data used by EPA researchers to generate illustrative figures for overview...
catalog.data.gov
datasets.ai
+1more
Updated Nov 14, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Data used by EPA researchers to generate illustrative figures for overview article "Multiscale Modeling of Background Ozone: Research Needs to Inform and Improve Air Quality Management" [Dataset]. https://catalog.data.gov/dataset/data-used-by-epa-researchers-to-generate-illustrative-figures-for-overview-article-multisc
Dataset updated
Nov 14, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
Data sets used to prepare illustrative figures for the overview article “Multiscale Modeling of Background Ozone”

Overview

The CMAQ model output datasets used to create illustrative figures for this overview article were generated by scientists in EPA/ORD/CEMM and EPA/OAR/OAQPS. The EPA/ORD/CEMM-generated dataset consisted of hourly CMAQ output from two simulations. The first simulation was performed for July 1 – 31 over a 12 km modeling domain covering the Western U.S. The simulation was configured with the Integrated Source Apportionment Method (ISAM) to estimate the contributions from 9 source categories to modeled ozone. ISAM source contributions for July 17 – 31 averaged over all grid cells located in Colorado were used to generate the illustrative pie chart in the overview article. The second simulation was performed for October 1, 2013 – August 31, 2014 over a 108 km modeling domain covering the northern hemisphere. This simulation was also configured with ISAM to estimate the contributions from non-US anthropogenic sources, natural sources, stratospheric ozone, and other sources on ozone concentrations. Ozone ISAM results from this simulation were extracted along a boundary curtain of the 12 km modeling domain specified over the Western U.S. for the time period January 1, 2014 – July 31, 2014 and used to generate the illustrative time-height cross-sections in the overview article. The EPA/OAR/OAQPS-generated dataset consisted of hourly gridded CMAQ output for surface ozone concentrations for the year 2016. The CMAQ simulations were performed over the northern hemisphere at a horizontal resolution of 108 km. NO2 and O3 data for July 2016 were extracted from these simulations to generate the vertically-integrated column densities shown in the illustrative comparison to satellite-derived column densities.

CMAQ Model Data

The data from the CMAQ model simulations used in this research effort are very large (several terabytes) and cannot be uploaded to ScienceHub due to size restrictions. The model simulations are stored on the /asm archival system accessible through the atmos high-performance computing (HPC) system. Due to data management policies, files on /asm are subject to expiry depending on the template of the project. Files not requested for extension after the expiry date are deleted permanently from the system. The format of the files used in this analysis and listed below is ioapi/netcdf. Documentation of this format, including definitions of the geographical projection attributes contained in the file headers, is available at https://www.cmascenter.org/ioapi/. Documentation on the CMAQ model, including a description of the output file format and output model species, can be found in the CMAQ documentation on the CMAQ GitHub site at https://github.com/USEPA/CMAQ.

This dataset is associated with the following publication: Hogrefe, C., B. Henderson, G. Tonnesen, R. Mathur, and R. Matichuk. Multiscale Modeling of Background Ozone: Research Needs to Inform and Improve Air Quality Management. EM Magazine. Air and Waste Management Association, Pittsburgh, PA, USA, 1-6, (2020).
Prevalence of journal-specific features (peer-reviewed journal articles...
plos.figshare.com
xls
Updated Jun 15, 2023
Cite
Brady T. West; Joseph W. Sakshaug; Guy Alain S. Aurelien (2023). Prevalence of journal-specific features (peer-reviewed journal articles only). [Dataset]. http://doi.org/10.1371/journal.pone.0158120.t005
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0158120.t005
Dataset updated
Jun 15, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Brady T. West; Joseph W. Sakshaug; Guy Alain S. Aurelien
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Prevalence of journal-specific features (peer-reviewed journal articles only).
CONTRAST-IT corpus: Spanish data collection
zenodo.org
Updated Sep 2, 2023
+ more versions
Cite
Anna-Maria De Cesare (2023). CONTRAST-IT corpus: Spanish data collection [Dataset]. http://doi.org/10.5281/zenodo.8307878
Unique identifier
https://doi.org/10.5281/zenodo.8307878
Dataset updated
Sep 2, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Anna-Maria De Cesare
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data collection served to create the CONTRAST-IT corpus.

CONTRAST-IT is a medium-size multilingual corpus (including ca. 1.5 million words) based on a comparable collection of articles published in online daily newspapers. The articles are written in five languages: Italian (from Italy), French (from France), Spanish (from Spain), English (from the UK), and German (from Germany).

This Spanish dataset includes 300,000 words drawn from 476 articles. All the texts collected are authentic, full-length electronic journalistic articles, chosen based on their high representativeness of contemporary Spanish newspaper language. The articles were published in 2011 and 2012 in two electronic daily newspapers (elpais.com and elmundo.es).

The corpus and data collection were used in two Swiss National Science Foundation Projects:

Italian Constituent Order in a Contrastive Perspective (ICOCP) (snf.ch)

Italian Sentence Adverbs in a Contrastive Perspective (ISAaC) (snf.ch)

For details on the corpus and data collection, see:

De Cesare, A.-M. 2019. CONTRAST-IT e COMPARE-IT. Due nuovi corpora per l’italiano contemporaneo. CHIMERA. Romance Corpora and Linguistic studies 6: 43-74. https://revistas.uam.es/chimera/article/view/11430

CONTRAST-IT. Corpus | contrast_it: italiano in prospettiva contrastiva / Italian in a contrastive perspective (unibas.ch)
Data underlying the research of four scenarios in the operation of water...
data.4tu.nl
4tu.edu.hpc.n-helix.com
+1more
zip
Updated Apr 14, 2021
Cite
Hang Wan (2021). Data underlying the research of four scenarios in the operation of water discharge patterns of a dam [Dataset]. http://doi.org/10.4121/14398946.v1
Available download formats: zip
Unique identifier
https://doi.org/10.4121/14398946.v1
Dataset updated
Apr 14, 2021
Dataset provided by
4TU.ResearchData
Authors
Hang Wan
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Reservoir operation rules:
(1) continuous flood discharge with ecological priority
(2) pulse flood discharge with ecological priority
(3) pulse flood discharge with equal weight of ecology and power generation
(4) pulse flood discharge with power generation priority
Data release associated with the journal article "Solar and sensor geometry,...
catalog.data.gov
data.usgs.gov
+1more
Updated Jul 6, 2024
+ more versions
Cite
U.S. Geological Survey (2024). Data release associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-associated-with-the-journal-article-solar-and-sensor-geometry-not-vegetation-
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Area covered
Western United States, United States
Description
This dataset supports the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" (DOI:10.1016/j.rse.2020.112013). The data release allows users to replicate, test, or further explore results. The dataset consists of 4 separate items based on the analysis approach used in the original publication:

1) The 'Phenocam' dataset uses images from a phenocam in a pinyon juniper ecosystem in Grand Canyon National Park to determine phenological patterns of multiple plant species. The 'Phenocam' dataset consists of scripts and tabular data developed while performing analyses and includes the final NDVI values for all areas of interest (AOIs) described in the associated publication.

2) The 'SolarSensorAnalysis' dataset uses downloaded tabular MODIS data to explore relationships between NDVI and multiple solar and sensor angles. The 'SolarSensorAnalysis' dataset consists of download and analysis scripts in Google Earth Engine and R. The source MODIS data used in the analysis are too large to include but are provided through MODIS providers and can be accessed through Google Earth Engine using the included script. A csv file includes solar and sensor angle information for the MODIS pixel closest to the phenocam as well as for a sample of 100 randomly selected MODIS pixels within the GRCA-PJ ecosystem.

3) The 'WinterPeakExtent' dataset includes final geotiffs showing the temporal frequency extent and associated vegetation physiognomic types experiencing winter NDVI peaks in the western US.

4) The 'SensorComparison' dataset contains the NDVI time series at the phenocam location from 4 other satellites as well as the code used to download these data.
Data for GMD article "A framework for expanding aqueous chemistry in the...
gimi9.com
datasets.ai
+3more
Updated Apr 24, 2017
Cite
(2017). Data for GMD article "A framework for expanding aqueous chemistry in the Community Multiscale Air Quality (CMAQ) model version 5.1" [Dataset]. https://www.gimi9.com/dataset/data-gov_data-for-gmd-article-a-framework-for-expanding-aqueous-chemistry-in-the-community-multisca/
Dataset updated
Apr 24, 2017
Description
These data were used to generate the figures included in the following manuscript: Fahey, et al. (2017) "A framework for expanding aqueous chemistry in the Community Multiscale Air Quality (CMAQ) model version 5.1". Geosci. Mod. Dev. This dataset is associated with the following publication: Fahey, K., A. Carlton, H. Pye, J. Baek, B. Hutzell, C. Stanier, K. Baker, W. Appel, M. Jaoui, and J. Offenberg. A framework for expanding aqueous chemistry in the Community Multiscale Air Quality (CMAQ) model version 5.1. Geoscientific Model Development. Copernicus Publications, Katlenburg-Lindau, GERMANY, 10: 1587-1605, (2017).
Data for the article "Low latitude mesospheric clouds in a warmer climate"
figshare.com
bin
Updated Dec 14, 2023
Cite
Deepashree Dutta (2023). Data for the article "Low latitude mesospheric clouds in a warmer climate" [Dataset]. http://doi.org/10.6084/m9.figshare.24808437.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.24808437.v1
Dataset updated
Dec 14, 2023
Dataset provided by
figshare
Authors
Deepashree Dutta
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Seasonal mean (DJF and JJA) temperature, water vapor and cloud fraction data for different experiments from WACCM4.
Data from: MLASK: Multimodal Summarization of Video-based News Articles
lindat.mff.cuni.cz
live.european-language-grid.eu
+1more
Updated Nov 2, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mateusz Krubiński; Pavel Pecina (2023). MLASK: Multimodal Summarization of Video-based News Articles [Dataset]. https://lindat.mff.cuni.cz/repository/xmlui/handle/11234/1-5135?show=full&locale-attribute=cs
Dataset updated
Nov 2, 2023
Authors
Mateusz Krubiński; Pavel Pecina
License
https://lindat.mff.cuni.cz/repository/xmlui/page/szn-dataset-licence
Description
The MLASK corpus consists of 41,243 multi-modal documents – video-based news articles in the Czech language – collected from Novinky.cz (https://www.novinky.cz/) and Seznam Zprávy (https://www.seznamzpravy.cz/). It was introduced in "MLASK: Multimodal Summarization of Video-based News Articles" (Krubiński & Pecina, EACL 2023). The articles' publication dates range from September 2016 to February 2022. The intended use case of the dataset is to model the task of multimodal summarization with multimodal output: based on a pair of a textual article and a short video, a textual summary is generated, and a single frame from the video is chosen as a pictorial summary.

Each document consists of the following:
- a .mp4 video
- a single image (cover picture)
- the article's text
- the article's summary
- the article's title
- the article's publication date

All of the videos are re-sampled to 25 fps and resized to the same resolution of 1280x720p. The maximum length of the video is 5 minutes, and the shortest one is 7 seconds. The average video duration is 86 seconds. The quantitative statistics of the lengths of titles, abstracts, and full texts (measured in the number of tokens) are below. Q1 and Q3 denote the first and third quartiles, respectively.

| - | Mean | Q1 | Median | Q3 |
| Title | 11.16 ± 2.78 | 9 | 11 | 13 |
| Abstract | 33.40 ± 13.86 | 22 | 32 | 43 |
| Article | 276.96 ± 191.74 | 154 | 231 | 343 |

The proposed training/dev/test split follows the chronological ordering of the articles by publication date. We use the articles published in the first half (Jan-Jun) of 2021 for validation (2,482 instances) and the ones published in the second half (Jul-Dec) of 2021 and the beginning (Jan-Feb) of 2022 for testing (2,652 instances). The remaining data is used for training (36,109 instances).
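
The chronological split described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `assign_split` and the two boundary constants are hypothetical and are not part of the released corpus. The cut-off dates are taken from the description (validation starts in January 2021, testing starts in July 2021).

```python
from datetime import date

# Hypothetical boundary constants derived from the dataset description:
# validation covers Jan-Jun 2021; testing covers Jul 2021 through Feb 2022.
DEV_START = date(2021, 1, 1)
TEST_START = date(2021, 7, 1)


def assign_split(published: date) -> str:
    """Return the split name for an article with the given publication date."""
    if published < DEV_START:
        return "train"  # everything published before 2021
    if published < TEST_START:
        return "dev"    # first half of 2021
    return "test"       # second half of 2021 and early 2022


# Example usage with made-up publication dates.
examples = [date(2016, 9, 15), date(2021, 3, 2), date(2022, 2, 10)]
splits = [assign_split(d) for d in examples]
```

A date-based rule like this keeps the evaluation data strictly later than the training data, which matches the chronological ordering the authors propose.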

The textual data is shared as a single .tsv file. The visual data (video + image) for the validation and test splits is shared as a single archive, while the visual data for the training split is partitioned by publication date.
Global import data of Article Made
volza.com
csv
Updated Feb 17, 2025
Cite
Volza.LLC (2025). Global import data of Article Made [Dataset]. https://www.volza.com/imports-mali/mali-import-data-of-article+made
Available download formats: csv
Dataset updated
Feb 17, 2025
Dataset provided by
Volza.LLC
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
Description
724 global import shipment records of Article Made, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Data related to research article: Towards mouse genetic-specific...
data.niaid.nih.gov
zenodo.org
Updated Oct 2, 2021
Cite
Jan, Maxime (2021). Data related to research article: Towards mouse genetic-specific RNA-sequencing read mapping [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5513979
Dataset updated
Oct 2, 2021
Dataset provided by
Franken, Paul
Xenarios, Ioannis
Gobet, Nastassia
Jan, Maxime
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains data related to the research article: "Towards mouse genetic-specific RNA-sequencing read mapping".
Data supplementary for article "PRV-FCM: an extension of fuzzy cognitive...
data.mendeley.com
Updated Mar 9, 2023
Cite
William Hoyos (2023). Data supplementary for article "PRV-FCM: an extension of fuzzy cognitive maps for prescriptive modeling" [Dataset]. http://doi.org/10.17632/sz79bzd8mf.1
Unique identifier
https://doi.org/10.17632/sz79bzd8mf.1
Dataset updated
Mar 9, 2023
Authors
William Hoyos
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository stores the synthetic data, programs and models developed for the research entitled "PRV-FCM: an extension of fuzzy cognitive maps for prescriptive modeling".
Cite

Garret Christensen; Allan Dafoe; Edward Miguel; Don A. Moore; Andrew K. Rose (2023). A study of the impact of data sharing on article citations using journal policies as a natural experiment [Dataset]. http://doi.org/10.1371/journal.pone.0225883
60 scholarly articles cite this dataset (View in Google Scholar)