100+ datasets found

Twitter usage in MENA by language 2016
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Twitter usage in MENA by language 2016 [Dataset]. https://www.statista.com/statistics/729700/mena-twitter-usage-by-language/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Mar 2016
Area covered
MENA, Asia
Description
This statistic describes the distribution of Twitter usage in the Middle East and North Africa in 2016, by language. In 2016, Arabic was the most used language on Twitter in the MENA region, at ** percent.
Twitter event datasets (2012-2016)
figshare.com
tar
Updated May 30, 2023
Cite
Arkaitz Zubiaga (2023). Twitter event datasets (2012-2016) [Dataset]. http://doi.org/10.6084/m9.figshare.5100460.v2
Available download formats: tar
Unique identifier
https://doi.org/10.6084/m9.figshare.5100460.v2
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Arkaitz Zubiaga
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This collection includes data for 30 different Twitter datasets associated with real-world events. The datasets were collected between 2012 and 2016, always using the streaming API with a set of keywords. These datasets are released in accordance with Twitter's TOS, which allows sharing of tweet IDs, and are intended for non-commercial research.

Note: Twitter's developer policy doesn't allow sharing more than 1,500,000 tweet IDs (https://dev.twitter.com/overview/terms/policy#updated-policy), unless the author is affiliated with an academic institution (which is my case) and tweet IDs are solely used for non-commercial purposes (https://twittercommunity.com/t/policy-update-clarification-research-use-cases/87566). Hence, by downloading these datasets you agree that you will not use them for commercial purposes.

Please cite the following paper if you make use of these datasets for your research: https://onlinelibrary.wiley.com/doi/full/10.1002/asi.24026

See README file for more details.
Data from: Twitter historical dataset: March 21, 2006 (first tweet) to July...
data.niaid.nih.gov
live.european-language-grid.eu
+2 more
Updated May 20, 2020
Cite
Gayo-Avello, Daniel (2020). Twitter historical dataset: March 21, 2006 (first tweet) to July 31, 2009 (3 years, 1.5 billion tweets) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3833781
Dataset updated
May 20, 2020
Dataset authored and provided by
Gayo-Avello, Daniel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Disclaimer: This dataset is distributed by Daniel Gayo-Avello, an associate professor in the Department of Computer Science at the University of Oviedo, for the sole purpose of non-commercial research, and it includes only tweet IDs.

The dataset contains tweet IDs for all the published tweets (in any language) between March 21, 2006 and July 31, 2009, thus comprising the first three full years of Twitter since its creation, that is, about 1.5 billion tweets (see file Twitter-historical-20060321-20090731.zip).

It covers several defining developments on Twitter, such as the invention of hashtags, retweets and trending topics, and it includes tweets related to the 2008 US presidential election, Obama's first inauguration speech, and the 2009 Iran election protests (one of the so-called Twitter Revolutions).

Finally, it contains tweets in many major languages (mainly English, Portuguese, Japanese, Spanish, German and French), so it should be possible, at least in theory, to analyze international events from different cultural perspectives.

The dataset was completed in November 2016, so the tweet IDs it contains were publicly available at that time. This means that some tweets that were public during the covered period may not appear in the dataset, and that a substantial share of the tweets in the dataset have been deleted (or locked) since 2016.

To make the decay of tweet IDs in the dataset easier to understand, a number of representative samples (99% confidence level, confidence interval of 0.5) are provided.
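
The description does not say how the sample sizes were chosen. As a rough illustration only, the sketch below applies the standard Cochran formula with a finite population correction, assuming that "0.5 confidence interval" means a margin of error of ±0.5 percentage points and using the worst-case proportion p = 0.5. The function name and the rounded population of 1.5 billion are assumptions made for this example.

```python
import math
from statistics import NormalDist


def sample_size(population: int, confidence: float = 0.99,
                margin: float = 0.005, p: float = 0.5) -> int:
    """Cochran sample size with finite population correction.

    margin=0.005 encodes the assumed +/-0.5 percentage points.
    """
    # Two-sided critical value, about 2.576 for 99% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Shrink the sample when the population itself is small.
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


whole_dataset = sample_size(1_500_000_000)  # roughly the full collection
first_interval = sample_size(5_512)         # the smallest 90-day interval
print(whole_dataset, first_interval)
```

Under these assumptions a sample of a few tens of thousands of IDs is enough for the full collection, while for the earliest and smallest interval the required sample approaches the whole interval.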

In general terms, 85.5% ±0.5 of the historical tweets are available as of May 19, 2020 (see file Twitter-historical-20060321-20090731-sample.txt). However, since the number of tweets varies greatly throughout the three-year period covered by the dataset, additional representative samples are provided for 90-day intervals (see the file 90-day-samples.zip).

In that regard, the ratio of publicly available tweets (as of May 19, 2020) is as follows:

March 21, 2006 to June 18, 2006: 88.4% ±0.5 (from 5,512 tweets).

June 18, 2006 to September 16, 2006: 82.7% ±0.5 (from 14,820 tweets).

September 16, 2006 to December 15, 2006: 85.7% ±0.5 (from 107,975 tweets).

December 15, 2006 to March 15, 2007: 88.2% ±0.5 (from 852,463 tweets).

March 15, 2007 to June 13, 2007: 89.6% ±0.5 (from 6,341,665 tweets).

June 13, 2007 to September 11, 2007: 88.6% ±0.5 (from 11,171,090 tweets).

September 11, 2007 to December 10, 2007: 87.9% ±0.5 (from 15,545,532 tweets).

December 10, 2007 to March 9, 2008: 89.0% ±0.5 (from 23,164,663 tweets).

March 9, 2008 to June 7, 2008: 66.5% ±0.5 (from 56,416,772 tweets; see below for more details on this).

June 7, 2008 to September 5, 2008: 78.3% ±0.5 (from 62,868,189 tweets; see below for more details on this).

September 5, 2008 to December 4, 2008: 87.3% ±0.5 (from 89,947,498 tweets).

December 4, 2008 to March 4, 2009: 86.9% ±0.5 (from 169,762,425 tweets).

March 4, 2009 to June 2, 2009: 86.4% ±0.5 (from 474,581,170 tweets).

June 2, 2009 to July 31, 2009: 85.7% ±0.5 (from 589,116,341 tweets).
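
As a consistency check on the figures listed above, the per-interval ratios can be combined into a tweet-weighted average. This is only a sketch of the arithmetic: the overall 85.5% figure in the description comes from its own separate sample, not from this calculation.

```python
# (tweets in the interval, percent still public as of May 19, 2020),
# copied from the list above in chronological order.
intervals = [
    (5_512, 88.4), (14_820, 82.7), (107_975, 85.7), (852_463, 88.2),
    (6_341_665, 89.6), (11_171_090, 88.6), (15_545_532, 87.9),
    (23_164_663, 89.0), (56_416_772, 66.5), (62_868_189, 78.3),
    (89_947_498, 87.3), (169_762_425, 86.9), (474_581_170, 86.4),
    (589_116_341, 85.7),
]

total = sum(count for count, _ in intervals)
# Weight each interval's ratio by the number of tweets it contains.
weighted = sum(count * pct for count, pct in intervals) / total

print(f"{total:,} tweets, {weighted:.1f}% available (tweet-weighted)")
```

The total comes out close to the 1.5 billion tweets quoted in the description, and the weighted ratio is dominated by the last three intervals, which hold most of the tweets.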

The apparent drop in available tweets from March 9, 2008 to September 5, 2008 has an easy, although embarrassing, explanation.

At the moment of cleaning all the data to publish this dataset, there seemed to be a gap between April 1, 2008 and July 7, 2008 (actually, the data was not missing but stored in a different backup). Since tweet IDs are easy to regenerate for that Twitter era (source code is provided in generate-ids.m), I simply produced all those that were created between those two dates. All those tweets actually existed, but a number of them were obviously private and not crawlable. For those regenerated IDs the actual ratio of public tweets (as of May 19, 2020) is 62.3% ±0.5.

In other words, what you see in that period (April to July, 2008) is not actually a huge number of deleted tweets but the combination of deleted and non-public tweets (whose IDs should ideally not be in the dataset, for performance reasons when rehydrating it).

Additionally, given that not everybody will need the whole period of time, the earliest tweet ID for each date is provided in the file date-tweet-id.tsv.
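
A minimal sketch of how such a per-date index could be used to cut out a date window follows. The layout of date-tweet-id.tsv is not described here, so the two-column format (ISO date, then tweet ID), the function names, and the sample rows are all assumptions; the IDs below are made up. The approach also assumes that tweet IDs grow over time in this era.

```python
import csv
import io
from datetime import date, timedelta

# Made-up rows standing in for date-tweet-id.tsv (hypothetical layout and IDs).
SAMPLE_TSV = (
    "2008-04-01\t1000\n"
    "2008-04-02\t1500\n"
    "2008-04-03\t2100\n"
    "2008-04-04\t2800\n"
)


def load_index(handle):
    """Map each date to the earliest tweet ID published on that date."""
    rows = csv.reader(handle, delimiter="\t")
    return {date.fromisoformat(day): int(tweet_id) for day, tweet_id in rows}


def id_range(index, start, end):
    """Return (low, high) so that low <= id < high covers start..end inclusive.

    high is None when the day after `end` is not in the index.
    """
    low = index[start]
    high = index.get(end + timedelta(days=1))
    return low, high


index = load_index(io.StringIO(SAMPLE_TSV))
low, high = id_range(index, date(2008, 4, 2), date(2008, 4, 3))
print(low, high)
```

With the real file, the same bounds could be used to filter the full ID list before rehydrating only the period of interest.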

For additional details regarding this dataset please see: Gayo-Avello, Daniel. "How I Stopped Worrying about the Twitter Archive at the Library of Congress and Learned to Build a Little One for Myself." arXiv preprint arXiv:1611.08144 (2016).

If you use this dataset in any way please cite that preprint (in addition to the dataset itself).

If you need to contact me, you can find me as @PFCdgayo on Twitter.
Twitter usage in Saudi Arabia by device 2016
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Twitter usage in Saudi Arabia by device 2016 [Dataset]. https://www.statista.com/statistics/729706/saudi-arabia-twitter-usage-by-device/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2016
Area covered
Saudi Arabia
Description
This statistic describes the distribution of Twitter usage in Saudi Arabia in 2016, by device. In 2016, mobile was the most used device for Twitter in Saudi Arabia, at almost ** percent.
Replication Data for: An Analysis of How the Twitter Discourse Surrounding...
search.dataone.org
borealisdata.ca
Updated Dec 28, 2023
Cite
Kimmons, Royce; Paskevicius, Michael; Veletsianos, George (2023). Replication Data for: An Analysis of How the Twitter Discourse Surrounding Open Education Unfolded From 2009 to 2016 [Dataset]. http://doi.org/10.5683/SP2/FPFKUN
Unique identifier
https://doi.org/10.5683/SP2/FPFKUN
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Kimmons, Royce; Paskevicius, Michael; Veletsianos, George
Description
Inspired by open educational resources, open pedagogy, and open source software, the openness movement in education has different meanings for different people. In this study, we use Twitter data to examine the discourses surrounding openness as well as the people who participate in discourse around openness. By targeting hashtags related to open education, we gathered the most extensive dataset of historical open education tweets to date (n = 178,304 tweets and 23,061 users) and conducted a mixed methods analysis of openness from 2009 to 2016. Findings show that the diversity of participants has varied somewhat over time and that the discourse has predominantly revolved around open resources, although there are signs that an increase in interest around pedagogy, teaching, and learning is emerging.
City of Seattle official Twitter accounts statistics
data.wu.ac.at
application/excel +5
Updated Aug 9, 2017
Cite
Eckstine, Nate (2017). City of Seattle official Twitter accounts statistics [Dataset]. https://data.wu.ac.at/odso/data_seattle_gov/bTdwYS1qejZi
Available download formats: xml, application/excel, application/xml+rdf, csv, xlsx, json
Dataset updated
Aug 9, 2017
Dataset provided by
Eckstine, Nate
Area covered
Seattle
Description
This dataset contains statistics on the usage patterns of the official City of Seattle Twitter accounts, as well as their outreach impact (Jun 2016 - Jan 2017).
US 2016 Election Data used for classification of Organized Behavior in...
figshare.com
bz2
Updated Jun 1, 2023
Cite
Erdem Begenilmis (2023). US 2016 Election Data used for classification of Organized Behavior in Twitter [Dataset]. http://doi.org/10.6084/m9.figshare.6683004.v1
Available download formats: bz2
Unique identifier
https://doi.org/10.6084/m9.figshare.6683004.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Erdem Begenilmis
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
TweetIds and Collections

Erdem Beğenilmiş and Suzan Uskudarli. 2018. Organized Behavior Classification of Tweet Sets using Supervised Learning Methods. In WIMS ’18: 8th International Conference on Web Intelligence, Mining and Semantics, June 25–27, 2018, Novi Sad, Serbia. ACM, New York, NY, USA, 9 pages. https://doi.org/10.1145/3227609.3227665
TweetsKB (Part 4, Nov 2015 - Aug 2016)
data.niaid.nih.gov
Updated Dec 8, 2021
+ more versions
Cite
Fafalios, Pavlos (2021). TweetsKB (Part 4, Nov 2015 - Aug 2016) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_785055
Dataset updated
Dec 8, 2021
Dataset provided by
Fafalios, Pavlos
Iosifidis, Vasileios
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
TweetsKB is a public RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for more than 1.9 billion tweets, spanning more than 7 years (2013 - 2020). Metadata information about the tweets as well as extracted entities, sentiments, hashtags, user mentions and URLs are exposed in RDF using established RDF/S vocabularies*. Example queries and more information are available through TweetsKB's home page: https://data.gesis.org/tweetskb/.

For the sake of privacy, we anonymize user IDs and we do not provide the text of the tweets.
Data from: Annotated Dataset of History-related Tweets
zenodo.org
csv
Updated Sep 19, 2021
Cite
Yasunobu Sumikawa; Adam Jatowt (2021). Annotated Dataset of History-related Tweets [Dataset]. http://doi.org/10.5281/zenodo.4657223
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.4657223
Dataset updated
Sep 19, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yasunobu Sumikawa; Adam Jatowt
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains tweet IDs together with five types of contextual information: 1) hashtags, 2) their categories, 3) entities obtained by NERD, 4) time references normalized by HeidelTime, and 5) Web categories for the URLs attached to the tweets. The tweets carry history-related hashtags and were collected in order to analyze how history-related content is disseminated in online social networks. Our IJDL paper shows the analysis results. The preliminary version of the analysis report is available here.

We used the official Twitter search API to collect tweets. Note that three kinds of tweets are typically found on Twitter: tweets, retweets and quote tweets. A tweet is an original text posted by a Twitter user. A retweet is a copy of an original tweet made to propagate its content to more users (i.e., one's followers). Finally, a quote tweet copies the content of another tweet and also allows new content to be added. A quote tweet is sometimes called a retweet with a comment. In this work, we simply treat all quote tweets as original tweets since they include additional information/text. However, only 1,877 (0.2%) tweets in our dataset were recognized as quote tweets.
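
The three kinds of tweets described above can be told apart from the tweet object itself. The sketch below is not the authors' code: it assumes the field names of the Twitter API v1.1 tweet JSON (`retweeted_status`, `is_quote_status`), and the function names and example records are invented for illustration.

```python
def tweet_kind(tweet: dict) -> str:
    """Classify a v1.1-style tweet object as 'retweet', 'quote' or 'tweet'."""
    # A retweet embeds the tweet it copies under 'retweeted_status'.
    if "retweeted_status" in tweet:
        return "retweet"
    # A quote tweet is flagged and carries its own additional text.
    if tweet.get("is_quote_status"):
        return "quote"
    return "tweet"


def counts_as_original(tweet: dict) -> bool:
    """Quote tweets add new text, so they are grouped with original tweets."""
    return tweet_kind(tweet) != "retweet"


# Invented example records, not taken from the dataset.
examples = [
    {"id": 1, "text": "On this day in 1969..."},
    {"id": 2, "text": "RT @a: On this day in 1969...", "retweeted_status": {"id": 1}},
    {"id": 3, "text": "Worth reading", "is_quote_status": True},
]

kinds = [tweet_kind(t) for t in examples]
originals = sum(counts_as_original(t) for t in examples)
print(kinds, originals)
```

Checking for a retweet first matters because a retweet of a quote tweet can carry both markers, and it should still be counted as a retweet.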

To collect tweets that refer to the past or are related to the collective memory of past events or entities, we performed hashtag-based crawling together with a bootstrapping procedure.
At the beginning, we gathered several historical hashtags selected by experts (e.g. #HistoryTeacher, #history, #WmnHist).
In addition, we prepared several hashtags that are commonly used when referring to the past: #onthisday, #thisdayinhistory, #throwbackthursday, #otd. We then collected tweets that contain these hashtags by using the official Twitter search API.

The collected tweets were issued from 8 March 2016 to 2 July 2018.
Bootstrapping allowed us to search for other hashtags frequently used with the seed hashtags. The tweets tagged with such hashtags were then included in the seed set after manual inspection of all the discovered hashtags for their relation to history, filtering out those that are unrelated.
In total, we gathered 147 history-related hashtags, which allowed us to collect 2,370,252 tweet IDs pointing to 882,977 tweets and 1,487,275 retweets.
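
One bootstrapping round of the kind described above could look roughly like the following sketch. It only proposes candidates for the manual inspection step; the co-occurrence threshold, the function name, and the toy hashtag sets are assumptions for illustration, not details from the dataset description.

```python
from collections import Counter


def candidate_hashtags(tweets, seeds, min_count=2):
    """Return non-seed hashtags that co-occur with a seed in >= min_count tweets."""
    seeds = {s.lower() for s in seeds}
    counts = Counter()
    for tags in tweets:
        tags = {t.lower() for t in tags}
        if tags & seeds:                 # tweet uses at least one seed hashtag
            counts.update(tags - seeds)  # count only the new hashtags
    return [tag for tag, n in counts.most_common() if n >= min_count]


# Toy data: each tweet is represented by its set of hashtags.
tweets = [
    {"#history", "#ww2"},
    {"#OTD", "#ww2", "#archives"},
    {"#onthisday", "#archives"},
    {"#cats", "#ww2"},  # no seed hashtag, so it is ignored
]
seeds = {"#history", "#onthisday", "#otd"}

candidates = candidate_hashtags(tweets, seeds)
print(sorted(candidates))
```

Candidates that pass human review would be added to the seed set before the next collection round.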

Related papers:

Yasunobu Sumikawa, Adam Jatowt, and Marten During, "Digital History meets Microblogging: Analyzing Collective Memories in Twitter", In Proceedings of the 18th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL'18, IEEE/ACM, pp. 213 -- 222, 2018. [paper]

Yasunobu Sumikawa and Adam Jatowt, "Analyzing History-related Posts in Twitter", International Journal on Digital Libraries, Springer, 2020. https://doi.org/10.1007/s00799-020-00296-2 [paper][dataset]

Yasunobu Sumikawa and Adam Jatowt, "Annotated Dataset of History-related Tweets", Data in Brief, Vol. 38, pp. 107344, Elsevier, 2021. [paper]
Data from: Dataset: input and results related to the paper 'Anticipointment...
ssh.datastations.nl
narcis.nl
zip
Updated Jun 29, 2017
Cite
F.A. Kunneman; M.J.P. van Mulken; A.P.J. van den Bosch (2017). Dataset: input and results related to the paper 'Anticipointment detection in event tweets' [Dataset]. http://doi.org/10.17026/DANS-XCP-X989
Available download formats: zip (17693), zip (424769593)
Unique identifier
https://doi.org/10.17026/DANS-XCP-X989
Dataset updated
Jun 29, 2017
Dataset provided by
DANS Data Station Social Sciences and Humanities
Authors
F.A. Kunneman; M.J.P. van Mulken; A.P.J. van den Bosch
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset features the training models, emotion classifications and emotion patterns before and after events, related to the paper:

F. Kunneman, M. van Mulken and A. Van den Bosch, Anticipointment detection in event tweets (under review)

Abstract of the study:

We developed a system to detect positive expectation, disappointment, and satisfaction in tweets that refer to events automatically discovered in the Twitter stream. The emotional content shared on Twitter when referring to public events can provide insights into the presumed and experienced quality of the event. We expected to find a connection between positive expectation and disappointment, a succession that is sometimes referred to as anticipointment. The application of computational approaches makes it possible to detect the presence and strength of this hypothetical relation for a large number of events. We extracted events from a longitudinal data set of Dutch Twitter posts, and modeled classifiers to recognize emotion in the tweets related to those events by means of hashtag-labeled training data. After classifying all tweets before and after the events in our data set, we summarized the collective emotions by calculating the percentage of tweets classified with an emotion as well as ranking tweets based on the classifier confidence score for an emotion and selecting the 90th percentile. Only a weak correlation of around 0.2 was found between positive expectation and disappointment, while a higher correlation of 0.6 was found between positive expectation and satisfaction. The most anticipointing events were events with a clear loss, such as a canceled event or when the favored sports team had lost. We conclude that senders of Twitter posts might be more inclined to share satisfaction than disappointment after a much anticipated event.

Subject period: January 1st 2011 until October 31st 2015

Date: start=2015-11-01; end=2016-02-28 (data collection)
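
The two per-event summaries mentioned in the abstract (share of tweets classified with an emotion, and the 90th percentile of classifier confidence) and the correlation between emotions can be sketched as below. This is an illustrative reading of the abstract, not the authors' code: the 0.5 decision threshold, the inclusive percentile method, the function names and the toy scores are all assumptions.

```python
from statistics import quantiles


def summarize(scores, threshold=0.5):
    """Return (share of tweets at or above threshold, 90th percentile score)."""
    share = sum(s >= threshold for s in scores) / len(scores)
    # quantiles(..., n=10) yields nine cut points; the last is the 90th percentile.
    p90 = quantiles(scores, n=10, method="inclusive")[-1]
    return share, p90


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Toy confidence scores for one emotion across the tweets of one event.
share, p90 = summarize([0.1, 0.6, 0.9, 0.4])

# Toy per-event summaries: expectation before vs. satisfaction after.
r = pearson([0.2, 0.5, 0.8], [0.3, 0.4, 0.9])
print(share, p90, round(r, 2))
```

In this reading, each event contributes one summary value per emotion, and the correlation is then computed across events.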
Data for right-wing groups found on Twitter around the 2016 US election
royalholloway.figshare.com
txt
Updated Oct 25, 2018
Cite
John Bryden; Eric Silverman (2018). Data for right-wing groups found on Twitter around the 2016 US election [Dataset]. http://doi.org/10.17637/rh.7160027.v2
Available download formats: txt
Unique identifier
https://doi.org/10.17637/rh.7160027.v2
Dataset updated
Oct 25, 2018
Dataset provided by
Royal Holloway, University of London
Authors
John Bryden; Eric Silverman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
File contains a list of Twitter account IDs in ASCII format. These accounts were those which we sampled and then analysed in the paper. The data we used are available from Twitter with the REST API.
Frequency of Twitter usage among young adults in India 2016
statista.com
Updated Jul 10, 2025
Cite
Statista (2025). Frequency of Twitter usage among young adults in India 2016 [Dataset]. https://www.statista.com/statistics/734936/twitter-usage-frequency-among-young-adults-india/
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2016
Area covered
India
Description
This statistic displays the results of a youth survey conducted among 15-34 year olds across ** states in India in 2016 about the frequency of Twitter usage. A majority of respondents, about ** percent, never used the news and social networking service, while about eight percent used it daily during the survey period.
TweetsKB (Part 5, Dec 2016 - Oct 2017)
zenodo.org
Updated Dec 8, 2021
+ more versions
Cite
Pavlos Fafalios; Vasileios Iosifidis (2021). TweetsKB (Part 5, Dec 2016 - Oct 2017) [Dataset]. http://doi.org/10.5281/zenodo.1765240
Unique identifier
https://doi.org/10.5281/zenodo.1765240
Dataset updated
Dec 8, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Pavlos Fafalios; Vasileios Iosifidis
Description
CHECK THE OPEN ACCESS VERSION OF THIS DATASET: https://zenodo.org/record/1095592
Twitter data related to Canada at the Rio 2016 Olympics and Paralympics
search.dataone.org
borealisdata.ca
Updated Dec 28, 2023
Cite
Library and Archives Canada (2023). Twitter data related to Canada at the Rio 2016 Olympics and Paralympics [Dataset]. http://doi.org/10.5683/SP/V0ZIFT
Unique identifier
https://doi.org/10.5683/SP/V0ZIFT
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Library and Archives Canada
Time period covered
Jul 29, 2016 - Sep 23, 2016
Area covered
Canada
Description
Tweet IDs for tweets containing hashtags related to Canada at the 2016 Rio Olympics and Paralympics, held Aug. 5-21 and Sept. 7-18 respectively. These were captured as part of a larger web archiving project focused on Canada's involvement in the Rio Games. Tweet IDs can be hydrated using Ed Summers' twarc (https://github.com/edsu/twarc). Hydrating will recreate the original tweet(s) in JSON format, provided the content is still available on Twitter.

Tweets were collected July 29 - Sept. 23. Several hashtags were tracked, with new ones added as they were identified:

Added July 29: #teamcanada, #equipecanada

Added Aug. 3: #CAN, #CanadaRED, #CanWNT

Added Aug. 8: #GoCanadaGo

Added Aug. 10: #LetsGoCanada, #Canadaolympics, #Flytheflag

Added Aug. 12: #Pennyoleksiak

Added Sept. 6: #Paratough, #Parafort
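
Because hashtags were added at different points in the collection, a hashtag has no coverage before its start date. The sketch below encodes the schedule from the description so that an analysis can check which hashtags were being tracked on a given day. The year 2016 is taken from the collection period, and the function name is an assumption for this example.

```python
from datetime import date

# (date added, hashtags) as listed in the description.
SCHEDULE = [
    (date(2016, 7, 29), ["#teamcanada", "#equipecanada"]),
    (date(2016, 8, 3), ["#CAN", "#CanadaRED", "#CanWNT"]),
    (date(2016, 8, 8), ["#GoCanadaGo"]),
    (date(2016, 8, 10), ["#LetsGoCanada", "#Canadaolympics", "#Flytheflag"]),
    (date(2016, 8, 12), ["#Pennyoleksiak"]),
    (date(2016, 9, 6), ["#Paratough", "#Parafort"]),
]


def tracked_on(day):
    """Return every hashtag whose tracking had started on or before `day`."""
    return [tag for added, tags in SCHEDULE if added <= day for tag in tags]


opening_day = tracked_on(date(2016, 8, 5))   # Olympics opening day
final_day = tracked_on(date(2016, 9, 23))    # last day of collection
print(len(opening_day), len(final_day))
```

For example, a low count for #GoCanadaGo before Aug. 8 reflects the tracking schedule rather than actual usage.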
Data from: Dataset: tweets and events linked to the paper 'Open-domain...
phys-techsciences.datastations.nl
bin, pdf +3
Updated Jan 31, 2017
Cite
F.A. Kunneman; A.P.J. van den Bosch (2017). Dataset: tweets and events linked to the paper 'Open-domain extraction of future events from Twitter' [Dataset]. http://doi.org/10.17026/DANS-227-36WN
Available download formats: text/plain; charset=us-ascii (many small files), txt (364695, 525949774, 815769173), zip (431638), pdf (55247), bin (144)
Unique identifier
https://doi.org/10.17026/DANS-227-36WN
Dataset updated
Jan 31, 2017
Dataset provided by
DANS Data Station Physical and Technical Sciences
Authors
F.A. Kunneman; A.P.J. van den Bosch
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Input data and output of research conducted in the study described in the paper:

F. Kunneman and A. Van den Bosch (2016), Open-domain extraction of future events from Twitter, Natural Language Engineering, doi: 10.1017/S1351324916000036

The paper describes a system that extracts future-referring time expressions and entities from Twitter messages, and subsequently detects events as pairs of a date and an entity that are often mentioned in the same tweet. This dataset features the IDs of a large set of Dutch tweets posted in August 2014, which was used as input to the system, as well as the time expression and/or entity that was extracted from each tweet, if any. Furthermore, the detected events are included, each represented as a date, one or more describing terms, the tweet IDs that refer to it, and the assessment of the event by human annotators.
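
The idea of detecting an event as a date and an entity that are often mentioned in the same tweet can be illustrated with a simple co-occurrence count. The paper's actual scoring is not described here, so the raw count threshold below is a stand-in, and the function name and sample tweets are invented for this sketch.

```python
from collections import Counter
from itertools import product


def detect_events(tweets, min_tweets=2):
    """Return (date, entity) pairs that co-occur in at least min_tweets tweets."""
    pair_counts = Counter()
    for dates, entities in tweets:
        # Count each pair once per tweet, even if it is mentioned repeatedly.
        pair_counts.update(set(product(dates, entities)))
    return {pair: n for pair, n in pair_counts.items() if n >= min_tweets}


# Invented input: (extracted future dates, extracted entities) per tweet.
tweets = [
    (["2014-08-16"], ["Lowlands"]),
    (["2014-08-16"], ["Lowlands", "Ajax"]),
    (["2014-08-17"], ["Ajax"]),
    ([], ["Ajax"]),  # no time expression, so no pair is formed
]

events = detect_events(tweets)
print(events)
```

Tweets without an extracted date or entity contribute nothing, which matches the note that a time expression or entity was extracted from each tweet only "if any".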
TweetsKB (Part 4, Jan 2016 - Nov 2016)
zenodo.org
Updated Dec 8, 2021
+ more versions
Cite
Pavlos Fafalios; Vasileios Iosifidis (2021). TweetsKB (Part 4, Jan 2016 - Nov 2016) [Dataset]. http://doi.org/10.5281/zenodo.1757118
Unique identifier
https://doi.org/10.5281/zenodo.1757118
Dataset updated
Dec 8, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Pavlos Fafalios; Vasileios Iosifidis
Description
CHECK THE OPEN ACCESS VERSION OF THIS DATASET: https://zenodo.org/record/579601
Twitter usage in Qatar by device 2016
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Twitter usage in Qatar by device 2016 [Dataset]. https://www.statista.com/statistics/729709/qatar-twitter-usage-by-device/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2016
Area covered
Qatar
Description
This statistic describes the distribution of Twitter usage in Qatar in 2016, by device. In 2016, mobile was the most used device for Twitter in Qatar, at almost ** percent.
A collection of Tweets related to climate change/infectious diseases and...
zenodo.org
datadryad.org
txt, zip
Updated Jun 4, 2022
Cite
Zitao He; Chris T. Bauch; Justin Schonfeld; Edward Qian (2022). A collection of Tweets related to climate change/infectious diseases and vaccines, broken down by week [Dataset]. http://doi.org/10.5061/dryad.djh9w0w05
Available download formats: zip, txt
Unique identifier
https://doi.org/10.5061/dryad.djh9w0w05
Dataset updated
Jun 4, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Zitao He; Chris T. Bauch; Justin Schonfeld; Edward Qian
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Twitter has been widely used to share opinions and sentiments on various topics. Studies have found correlations between the sentiments on Twitter and social trends in the real world. Here, we collected the tweet IDs related to climate change, infectious diseases, and vaccines through the Twitter Application Programming Interface (API), which can be useful for further research on different topics. The data range from October 30th, 2016, to April 24th, 2021, and are broken down by week.
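
To map a tweet date onto the weekly breakdown, a week index can be computed from the start of the collection. The sketch below assumes that weeks are consecutive 7-day blocks beginning on October 30th, 2016; the actual file naming and week boundaries in the dataset are not described here, and the function name is hypothetical.

```python
from datetime import date

START = date(2016, 10, 30)  # first day covered, per the description
END = date(2021, 4, 24)     # last day covered, per the description


def week_index(day):
    """Return the 0-based index of the 7-day block containing `day`."""
    if not START <= day <= END:
        raise ValueError(f"{day} is outside the collection period")
    return (day - START).days // 7


first = week_index(START)
last = week_index(END)
print(first, last, last - first + 1)
```

Under this assumption the collection period divides evenly into full weeks, with the last covered day closing the final block.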
A Twitter Dataset for Spatial Infectious Disease Surveillance
zenodo.org
data.niaid.nih.gov
csv, txt, zip
Updated Jan 6, 2021
Cite
Roberto C.S.N.P. Souza; Manoel Horta Ribeiro; Wagner Meira Jr.; Renato M. Assuncao; Walter dos Santos (2021). A Twitter Dataset for Spatial Infectious Disease Surveillance [Dataset]. http://doi.org/10.5281/zenodo.2541440
Available download formats: csv, txt, zip
Unique identifier
https://doi.org/10.5281/zenodo.2541440
Dataset updated
Jan 6, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Roberto C.S.N.P. Souza; Manoel Horta Ribeiro; Wagner Meira Jr.; Renato M. Assuncao; Walter dos Santos
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dengue is a mosquito-borne viral disease which infects millions of people every year, especially in developing countries. Some of the main challenges in confronting the disease are reporting risk indicators and rapidly detecting outbreaks. Traditional surveillance systems rely on passive reporting from health-care facilities, often ignoring human mobility and locating each individual by their home address. Yet, geolocated data are becoming commonplace in social media, which is widely used as a means to discuss a large variety of health topics, including the users' health status. In this dataset paper, we make available two large collections of dengue-related labeled Twitter data. One is a set of tweets available through the Streaming API using the keywords dengue and aedes from 2010 to 2016. The other is the set of all geolocated tweets in Brazil during the year 2015 (also available through the Streaming API). We detail the process of collecting the tweets and labeling each tweet containing dengue-related keywords with one of 5 categories: personal experience, information, opinion, campaign, and joke. This dataset can be useful for the development of models for spatial disease surveillance, but also for scenarios such as understanding health-related content in a language other than English, and studying human mobility.
Data from: Natural Hazards Twitter Dataset Dataset
paperswithcode.com
Updated May 2, 2021
+ more versions
Cite
(2021). Natural Hazards Twitter Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/natural-hazards-twitter-dataset
Dataset updated
May 2, 2021
Description
Natural Hazards is a natural disaster dataset with sentiment labels, which contains nearly 50,00 Twitter data about different natural disasters in the United States (e.g., a tornado in 2011, a hurricane named Sandy in 2012, a series of floods in 2013, a hurricane named Matthew in 2016, a blizzard in 2016, a hurricane named Harvey in 2017, a hurricane named Michael in 2018, a series of wildfires in 2018, and a hurricane named Dorian in 2019).
