Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1. The totals in this column equal the number of articles using a particular type of data, minus instances of duplicate classification by type of company within category of type of data. These instances were: other types of data were used by articles classified as both tobacco and transportation, both mining and manufacturing, and both tobacco and alcohol; and quantitative data from internal company studies were used by the article classified as both mining and manufacturing. The overall column total is not shown, as it is greater than the total number of included articles (N = 361) because several articles used multiple types of internal documents.
2. The totals in this row equal the total number of articles for each type of company, minus instances where articles used multiple types of data, of which there are too many to list. The totals for the columns are therefore not equal to the sum of the classifications within the columns. The overall row total is not shown, as it is greater than the total number of included articles (N = 361) because three articles were classified with two types of companies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study estimates the effect of data sharing on the citations of academic articles, using journal policies as a natural experiment. We begin by examining 17 high-impact journals that have adopted the requirement that data from published articles be publicly posted. We match these 17 journals to 13 journals without policy changes and find that empirical articles published just before their change in editorial policy have citation rates with no statistically significant difference from those published shortly after the shift. We then ask whether this null result stems from poor compliance with data sharing policies, and use the data sharing policy changes as instrumental variables to examine more closely two leading journals in economics and political science with relatively strong enforcement of new data policies. We find that articles that make their data available receive 97 additional citations (estimated standard error of 34). We conclude that: a) authors who share data may be rewarded eventually with additional scholarly citations, and b) data-posting policies alone do not increase the impact of articles published in a journal unless those policies are enforced.
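As background on the method, here is a minimal two-stage least squares (2SLS) sketch in Python with simulated data and hypothetical variable names; it illustrates the estimation strategy only, not the study's actual code or data:

import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical setup: a journal policy (instrument) shifts data sharing
# (endogenous treatment), which in turn affects citations (outcome).
policy = rng.integers(0, 2, n)                      # 1 if the article faces a sharing policy
shared = (0.5 * policy + rng.normal(0, 1, n)) > 0   # endogenous treatment
citations = 97 * shared + rng.normal(0, 30, n)      # outcome with a true effect of 97

# Stage 1: regress treatment on the instrument (with intercept).
Z = np.column_stack([np.ones(n), policy])
shared_hat = Z @ np.linalg.lstsq(Z, shared.astype(float), rcond=None)[0]

# Stage 2: regress the outcome on the fitted treatment.
X = np.column_stack([np.ones(n), shared_hat])
beta = np.linalg.lstsq(X, citations, rcond=None)[0]
print(f"2SLS estimate of the data-sharing effect: {beta[1]:.1f}")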
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Images, data, and statistical analysis scripts for a review article on cover crop roots.
Optimization of root traits to provide enhanced ecosystem services in agricultural systems: a focus on cover crops - [https://doi.org/10.1111/pce.14247]
Research site, planting, and growth
Cover crop field trial, October 2020 to April 26, 2021. DDPSC FRS at Planthaven Farm, O'Fallon, MO 63366 (latitude 38.848240°, longitude -90.686640°).
The field was tilled before sowing of cover crops. Seeds of each cover crop were spread using a push seed spreader and were lightly irrigated.
Alfalfa (Medicago sativa), dundale pea (Pisum sativum), milkvetch (Astragalus canadensis, Astragalus bisulcatus), crimson clover (Trifolium incarnatum), hairy vetch (Vicia villosa), mustard (Brassica juncea var Mighty Mustard, var Kodiak), barley (Hordeum vulgare), wheat (Triticum aestivum, winter, spring), winter rye (Secale cereale), and triticale (× Triticosecale Wittmack).
Field harvest measurements
Four canopy images were taken across each cover crop row using a Canon 5DS R camera, shot manually from a height of 5 ft above each plot. Green color was thresholded from the canopy images in batch using an OpenCV Python script, and percent green cover was calculated (Jupyter notebook).
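As a rough illustration of this step (not the authors' actual script, whose thresholds live in the notebook in the ZIP archive listed below), percent green cover can be computed with OpenCV like this:

import cv2
import numpy as np

# Illustrative HSV bounds for green vegetation; the filename and the
# thresholds here are assumptions for the sketch.
img = cv2.imread("canopy_plot1.jpg")
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))
percent_green = 100.0 * np.count_nonzero(mask) / mask.size
print(f"Percent green cover: {percent_green:.1f}%")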
Five soil monoliths were excavated using a "shovelomics" approach, with an average monolith size of 25.4 cm x 25.4 cm x 20 cm. One soil monolith was imaged using a Canon 50D DSLR camera in a photogrammetry shed; all photogrammetric analysis was conducted using Pix4Dmapper software (Pix4D S.A., Prilly, Switzerland), and point cloud cleaning was conducted in CloudCompare v2.10.2. The remaining four soil monoliths were destructively analyzed.
Cover crop shoots from the remaining soil monoliths were cut and placed into paper bags for dry biomass determination (60 °C for 5 days). A cover crop shoot count was conducted for each monolith, with each tiller counted as a shoot for the grasses (barley, wheat, triticale). After cover crop shoot harvesting, a photo was taken of each soil monolith with the remaining weed biomass. A weed score was assigned to each image by one trained researcher, from 1 (low weed presence) to 5 (high weed presence).
Soil monoliths were then soaked briefly in water, and the soil was washed away with a hose while retaining the roots. Roots were then scanned on an Epson Expression 12000XL Photo Scanner with a transparency unit. Images labeled with "_part" were samples with too many roots to scan in full, so the unscanned portion was weighed separately. Dry root biomass was recorded for the scanned and unscanned roots separately. Root length was determined from the images using RhizoVision Explorer (https://doi.org/10.5281/zenodo.4095629); total root length was estimated by extrapolating the scanned root length to the unscanned root biomass via the scanned length-to-biomass ratio.
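Stated explicitly, and assuming root length per unit dry biomass is the same for scanned and unscanned roots (my reading of the sentence above, not a formula given by the authors):

def total_root_length(scanned_length_cm, scanned_biomass_g, unscanned_biomass_g):
    # Specific root length (cm per g) from the scanned subsample,
    # applied to the biomass that was too dense to scan.
    srl = scanned_length_cm / scanned_biomass_g
    return scanned_length_cm + srl * unscanned_biomass_g

print(total_root_length(1500.0, 2.0, 0.5))  # example values, purely illustrative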
Along each cover crop plot, a 10 ft trench was dug perpendicular to the row using a Yanmar ViO20-6 excavator, with each trench fully bisecting the plot. Each trench was one bucket wide (19 inches) and approximately 36 inches deep at the middle of the row. For each cover crop, the five deepest roots observable in the trench wall were measured manually with a tape measure. A garden trowel and shovel were used to excavate and confirm roots in the trench wall.
Data were analyzed using an R script, and the raw data and script used for data processing and figure generation are included (2021PlantHavenCovercrop_dataprocessing.R). Principal component analysis (PCA) was conducted using the “FactoMineR” package (Husson et al. 2019) to explore the relationships between the traits within the dataset, clustered by family.
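The published analysis was run in R with FactoMineR; for readers working in Python, a comparable sketch with scikit-learn (an illustrative substitute with hypothetical trait columns, not the authors' script) would be:

import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical trait table; the real data ship in the dataprocessing ZIP.
df = pd.read_csv("covercrop_traits.csv")
traits = df[["shoot_biomass", "root_biomass", "root_length", "rooting_depth"]]

# Standardize traits, then project onto the first two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(traits))
df[["PC1", "PC2"]] = scores  # can then be plotted and colored by family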
Individual ZIP file contents:
2021PlantHavenCovercrop_CanopyImages.zip – Raw canopy images, processed percent green cover images, and the Jupyter notebook Python script (2021PlantHavenCovercrop_ImageBatchColorThreshold.ipynb).
2021PlantHavenCovercrop_RootFlatbedImages.zip – Raw flatbed root scans of cover crops and processed images using RhizoVision Explorer.
2021PlantHavenCovercrop_SoilMonolithWeedImages.zip – Images of soil monoliths after cover crop shoot biomass was removed.
2021PlantHavenCovercrop_dataprocessing.zip – R script and raw data used for data processing and figure generation (2021PlantHavenCovercrop_dataprocessing.R).
2021PlantHavenCovercrop_ShootPhotogrammetry.zip – 3D models of cover crop shoots from excavated soil monoliths. The .bin files can be opened using CloudCompare app.
This dataset supports the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" (DOI: 10.1016/j.rse.2020.112013). The data release allows users to replicate, test, or further explore results. The dataset consists of 4 separate items based on the analysis approach used in the original publication:
1) The 'Phenocam' dataset uses images from a phenocam in a pinyon juniper ecosystem in Grand Canyon National Park to determine phenological patterns of multiple plant species. It consists of scripts and tabular data developed while performing analyses and includes the final NDVI values for all areas of interest (AOIs) described in the associated publication.
2) The 'SolarSensorAnalysis' dataset uses downloaded tabular MODIS data to explore relationships between NDVI and multiple solar and sensor angles. It consists of download and analysis scripts in Google Earth Engine and R. The source MODIS data used in the analysis are too large to include, but they are provided through MODIS providers and can be accessed through Google Earth Engine using the included script. A csv file includes solar and sensor angle information for the MODIS pixel closest to the phenocam, as well as for a sample of 100 randomly selected MODIS pixels within the GRCA-PJ ecosystem.
3) The 'WinterPeakExtent' dataset includes final geotiffs showing the temporal frequency extent and associated vegetation physiognomic types experiencing winter NDVI peaks in the western US.
4) The 'SensorComparison' dataset contains the NDVI time series at the phenocam location from 4 other satellites, as well as the code used to download these data.
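NDVI itself is the standard normalized-difference index; as a quick illustration (not code from this data release), it can be computed from red and near-infrared reflectance as follows:

import numpy as np

def ndvi(nir, red):
    # NDVI = (NIR - Red) / (NIR + Red), bounded in [-1, 1].
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    return (nir - red) / (nir + red)

print(ndvi([0.45], [0.08]))  # dense vegetation -> NDVI near 0.7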
By downloading the data, you agree to the terms & conditions mentioned below:
Data Access: The data in the research collection may only be used for research purposes. Portions of the data are copyrighted and have commercial value as data, so you must be careful to use them only for research purposes.
Summaries, analyses, and interpretations of the linguistic properties of the information may be derived and published, provided it is impossible to reconstruct the information from these summaries. You may not try to identify the individuals whose texts are included in this dataset. You may not try to identify the original entry on the fact-checking site. You are not permitted to publish any portion of the dataset besides summary statistics, or to share it with anyone else.
We grant you the right to access the collection's content as described in this agreement. You may not otherwise make unauthorised commercial use of, reproduce, prepare derivative works, distribute copies, perform, or publicly display the collection or parts of it. You are responsible for keeping and storing the data in a way that others cannot access. The data is provided free of charge.
Citation
Please cite our work as
@InProceedings{clef-checkthat:2022:task3,
  author    = {K{\"o}hler, Juliane and Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Wiegand, Michael and Siegel, Melanie and Mandl, Thomas},
  title     = {Overview of the {CLEF}-2022 {CheckThat}! Lab Task 3 on Fake News Detection},
  year      = {2022},
  booktitle = {Working Notes of CLEF 2022---Conference and Labs of the Evaluation Forum},
  series    = {CLEF~'2022},
  address   = {Bologna, Italy},
}

@article{shahi2021overview,
  title   = {Overview of the {CLEF}-2021 {CheckThat}! lab task 3 on fake news detection},
  author  = {Shahi, Gautam Kishore and Stru{\ss}, Julia Maria and Mandl, Thomas},
  journal = {Working Notes of CLEF},
  year    = {2021},
}
Problem Definition: Given the text of a news article, determine whether the main claim made in the article is true, partially false, false, or other (e.g., claims in dispute), and detect the topical domain of the article. This task will run in English and German.
Task 3: Multi-class fake news detection of news articles (English). Sub-task A addresses fake news detection as a four-class classification problem: given the text of a news article, determine whether the main claim made in the article is true, partially false, false, or other. The training data will be released in batches, comprising roughly 1,264 articles with their respective labels in English. Our definitions for the categories are as follows:
False - The main claim made in an article is untrue.
Partially False - The main claim of the article mixes true and false information; the article contains partially true and partially false information and cannot be considered 100% true. It includes all articles in categories such as partially false, partially true, mostly true, miscaptioned, and misleading, as defined by different fact-checking services.
True - This rating indicates that the primary elements of the main claim are demonstrably true.
Other- An article that cannot be categorised as true, false, or partially false due to a lack of evidence about its claims. This category includes articles in dispute and unproven articles.
Cross-Lingual Task (German)
Along with the multi-class task for English, we have introduced a task for a low-resource language. We will provide test data in German. The idea of the task is to use the English data and the concept of transfer learning to build a classification model for German.
Input Data
The data will be provided with the columns Id, title, text, rating, and domain; the columns are described as follows (a minimal loading sketch follows the list):
ID- Unique identifier of the news article
Title- Title of the news article
text- Full text of the news article
our rating - class of the news article as false, partially false, true, other
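A minimal loading sketch (the filename and exact column spellings are assumptions; the organizers distribute the actual files):

import pandas as pd

train = pd.read_csv("Task3_train.csv")     # hypothetical filename
print(train.columns.tolist())              # expected: public_id, title, text, our rating
print(train["our rating"].value_counts())  # class distribution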
Output data format
public_id- Unique identifier of the news article
predicted_rating- predicted class
Sample File
public_id, predicted_rating
1, false
2, true
IMPORTANT!
We have used data from 2010 to 2022, and the fake news content covers several topics, such as elections and COVID-19.
Baseline: For this task, we have created a baseline system. The baseline system can be found at https://zenodo.org/record/6362498
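For orientation, a minimal four-class text classifier in Python might look like the sketch below; this is a hypothetical illustration, not the Zenodo baseline:

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy examples only; real training data come from the task release.
texts = ["claim shown to be accurate", "claim mixes facts and errors",
         "claim is fabricated", "claim is still in dispute"]
labels = ["true", "partially false", "false", "other"]

# TF-IDF features feeding a multinomial logistic regression.
model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
model.fit(texts, labels)
print(model.predict(["a disputed claim about elections"]))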
Related Work
Shahi, G. K. (2020). AMUSED: An annotation framework of multi-modal social media data. arXiv preprint arXiv:2010.00502. https://arxiv.org/pdf/2010.00502.pdf
Shahi, G. K., & Nandini, D. (2020). FakeCovid – a multilingual cross-domain fact check news dataset for COVID-19. In Workshop Proceedings of the 14th International AAAI Conference on Web and Social Media. http://workshop-proceedings.icwsm.org/abstract?id=2020_14
Shahi, G. K., Dirkson, A., & Majchrzak, T. A. (2021). An exploratory study of COVID-19 misinformation on Twitter. Online Social Networks and Media, 22, 100104. doi:10.1016/j.osnem.2020.100104
Shahi, G. K., Struß, J. M., & Mandl, T. (2021). Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection. Working Notes of CLEF.
Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeno, A., Míguez, R., Shaar, S., ... & Mandl, T. (2021, March). The CLEF-2021 CheckThat! lab on detecting check-worthy claims, previously fact-checked claims, and fake news. In European Conference on Information Retrieval (pp. 639-649). Springer, Cham.
Nakov, P., Da San Martino, G., Elsayed, T., Barrón-Cedeño, A., Míguez, R., Shaar, S., ... & Kartal, Y. S. (2021, September). Overview of the CLEF–2021 CheckThat! Lab on Detecting Check-Worthy Claims, Previously Fact-Checked Claims, and Fake News. In International Conference of the Cross-Language Evaluation Forum for European Languages (pp. 264-291). Springer, Cham.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Costa Rica Imports from Mexico of Textile Products and Articles, for Technical Use was US$135.46 Thousand during 2024, according to the United Nations COMTRADE database on international trade. Costa Rica Imports from Mexico of Textile Products and Articles, for Technical Use - data, historical chart and statistics - was last updated in November 2025.
Which keywords are most widely used in online media worldwide in relation to the Russia-Ukraine war? Looking at the number of online articles published by global news outlets in 65 languages, keyword combinations related to sending weapons to Ukraine, Western sanctions on Russia, and nuclear weapons were used the most in 2024. The Russia-Ukraine war began in February 2022, resulting in a humanitarian crisis, economic, trade, and financial restrictions on Russia, and supply chain interruptions around the globe.

Sending weapons to Ukraine has been widely covered in online media
A keyword combination commonly mentioned in global online articles was that of the words "Ukraine" and "weapons." In total, from 2022 to 2024, around *** million online articles mentioned both terms in the same text. In view of the Russian invasion, Western countries provided military aid to Ukraine, with the United States, Germany, and the United Kingdom (UK) providing the largest amounts.

The invasion has raised fears of a nuclear war
Over the course of 2022, the online press worldwide extensively covered the topic of nuclear weapons and the nuclear threat, with around ******* online articles published across the world mentioning "Russia" or "Ukraine" together with "nuclear" in combination with "war." The number of online articles mentioning nuclear weapons was higher in that year than in any other recent year.
This dataset includes papers containing published research results on COVID-19. Each paper has its PubMed ID, DOI number, journal title, journal country, article title, authors, abstract, date of publication, and the number of citations as of the date of update. It contains more than 150,000 articles in total. Newly added papers and citation numbers are updated monthly.
All articles with the word "COVID-19" published before September 2021 were included in the dataset. All data were collected using the PubMed API; using multiple API endpoints, the data related to the articles were combined into a single dataset. The dataset will be updated with new articles every month.
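The collection scripts are not part of this dataset; as a sketch of the general approach, PubMed IDs for matching articles can be retrieved through the NCBI E-utilities API (the public interface underlying most PubMed clients; the exact query and date bounds here are assumptions):

import requests

# NCBI E-utilities: esearch returns PubMed IDs matching a query.
resp = requests.get(
    "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi",
    params={"db": "pubmed", "term": "COVID-19", "retmax": 20,
            "datetype": "pdat", "mindate": "2019/12/01",  # assumed lower bound
            "maxdate": "2021/08/31",                      # "before September 2021"
            "retmode": "json"},
)
pmids = resp.json()["esearchresult"]["idlist"]
print(pmids)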
All data are in the papers.csv file.
See LICENSE for details.
http://opendatacommons.org/licenses/dbcl/1.0/
Deskdrop is an internal communications platform developed by CI&T, focused on companies using Google G Suite. Among other features, the platform allows a company's employees to share relevant articles with their peers and collaborate around them.
This rich and rare dataset contains a real sample of 12 months of logs (Mar. 2016 - Feb. 2017) from CI&T's internal communications platform (DeskDrop).
It contains about 73k logged user interactions on more than 3k public articles shared on the platform.
This dataset features some distinctive characteristics:
If you like it, please upvote!
Take a look at these featured Python kernels:
- Deskdrop datasets EDA: Exploratory analysis of the articles and interactions in the dataset
- DeskDrop Articles Topic Modeling: A statistical analysis of the main articles topics using LDA
- Recommender Systems in Python 101: A practical introduction of the main Recommender Systems approaches: Popularity model, Collaborative Filtering, Content-Based Filtering and Hybrid Filtering.
We thank CI&T for the support and permission to share a sample of real usage data from its internal communication platform: Deskdrop.
The two main approaches for Recommender Systems are Collaborative Filtering and Content-Based Filtering.
In the RecSys community, there are some popular datasets available with users ratings on items (explicit feedback), like MovieLens and Netflix Prize, which are useful for Collaborative Filtering techniques.
However, it is very difficult to find open datasets with additional item attributes, which would allow the application of Content-Based Filtering techniques or Hybrid approaches, especially in the domain of ephemeral textual items (e.g., articles and news).
News datasets are also reported in the academic literature as very sparse: because users are usually not required to log in to news portals, IDs are based on device cookies, making it hard to track a user's page visits across different portals, browsing sessions, and devices.
This difficult scenario for research and experiments on Content Recommender Systems was the main motivation for sharing this dataset.
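To make the collaborative-filtering use case concrete, here is a rough sketch of a low-rank model on implicit feedback; the filename and column names are assumptions, and this is not code shipped with the dataset:

import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix
from sklearn.decomposition import TruncatedSVD

# Assumed schema: one row per interaction, with user and article identifiers.
logs = pd.read_csv("users_interactions.csv")
users = logs["personId"].astype("category")
items = logs["contentId"].astype("category")

# Implicit-feedback matrix: nonzero wherever a user interacted with an article.
m = csr_matrix((np.ones(len(logs)), (users.cat.codes, items.cat.codes)))

# Low-rank factorization; user and item factors score unseen articles.
svd = TruncatedSVD(n_components=20, random_state=0)
user_factors = svd.fit_transform(m)
scores = user_factors[0] @ svd.components_   # scores for the first user over all items
print(items.cat.categories[np.argsort(scores)[::-1][:5]])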
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Colombia Imports of textile products and articles, for technical use from San Marino was US$19 during 2024, according to the United Nations COMTRADE database on international trade. Colombia Imports of textile products and articles, for technical use from San Marino - data, historical chart and statistics - was last updated in November 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cumulative usage statistics (HTML views, XML views, PDF downloads) from the PLOS website for all PLOS articles. Data collected August 27, 2013.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
Google Trends is a novel, freely accessible tool that allows users to interact with Internet search data, which may provide deep insights into population behavior and health-related phenomena. However, there is limited knowledge about its potential uses and limitations. We therefore systematically reviewed health care literature using Google Trends to classify articles by topic and study aim; evaluate the methodology and validation of the tool; and address limitations for its use in research.

Methods and Findings
PRISMA guidelines were followed. Two independent reviewers systematically identified studies utilizing Google Trends for health care research from MEDLINE and PubMed. Seventy studies met our inclusion criteria. Google Trends publications increased seven-fold from 2009 to 2013. Studies were classified into four topic domains: infectious disease (27% of articles), mental health and substance use (24%), other non-communicable diseases (16%), and general population behavior (33%). By use, 27% of articles utilized Google Trends for causal inference, 39% for description, and 34% for surveillance. Among surveillance studies, 92% were validated against a reference standard data source, and 80% of studies using correlation had a correlation statistic ≥0.70. Overall, 67% of articles provided a rationale for their search input. However, only 7% of articles were reproducible based on complete documentation of search strategy. We present a checklist to facilitate appropriate methodological documentation for future studies. A limitation of the study is the challenge of classifying heterogeneous studies utilizing a novel data source.

Conclusion
Google Trends is being used to study health phenomena in a variety of topic domains in myriad ways. However, poor documentation of methods precludes the reproducibility of the findings. Such documentation would enable other researchers to determine the consistency of results provided by Google Trends for a well-specified query over time. Furthermore, greater transparency can improve its reliability as a research tool.
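For readers who want to experiment programmatically, Google Trends data can be pulled with pytrends, an unofficial third-party Python client (an assumption for illustration; the reviewed studies generally used the web interface and its CSV exports):

from pytrends.request import TrendReq

# Hypothetical health-related query, in the spirit of the surveillance studies.
pytrends = TrendReq(hl="en-US")
pytrends.build_payload(["influenza symptoms"], timeframe="today 5-y", geo="US")
trend = pytrends.interest_over_time()   # weekly relative search interest, 0-100
print(trend.head())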
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
France Imports from Belarus of Textile Products and Articles, for Technical Use was US$1 during 2023, according to the United Nations COMTRADE database on international trade. France Imports from Belarus of Textile Products and Articles, for Technical Use - data, historical chart and statistics - was last updated in October 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mexico Exports of textile products and articles, for technical use to Ethiopia was US$21 during 2013, according to the United Nations COMTRADE database on international trade. Mexico Exports of textile products and articles, for technical use to Ethiopia - data, historical chart and statistics - was last updated in November 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Catchment areas directly involved in the decision-making process are shown in bold.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Honduras Imports from El Salvador of Textile Products and Articles, for Technical Use was US$202 during 2021, according to the United Nations COMTRADE database on international trade. Honduras Imports from El Salvador of Textile Products and Articles, for Technical Use - data, historical chart and statistics - was last updated in November 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Morocco Exports of textile products and articles, for technical use to Colombia was US$5.47 Thousand during 2023, according to the United Nations COMTRADE database on international trade. Morocco Exports of textile products and articles, for technical use to Colombia - data, historical chart and statistics - was last updated in November 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estonia Exports of textile products and articles, for technical use to Qatar was US$1.29 Thousand during 2024, according to the United Nations COMTRADE database on international trade. Estonia Exports of textile products and articles, for technical use to Qatar - data, historical chart and statistics - was last updated in November 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Switzerland Exports of textile products and articles, for technical use to Norway was US$107.46 Thousand during 2024, according to the United Nations COMTRADE database on international trade. Switzerland Exports of textile products and articles, for technical use to Norway - data, historical chart and statistics - was last updated in November 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Trinidad And Tobago Imports of textile products and articles, for technical use from Ghana was US$28.08 Thousand during 2024, according to the United Nations COMTRADE database on international trade. Trinidad And Tobago Imports of textile products and articles, for technical use from Ghana - data, historical chart and statistics - was last updated in November 2025.