https://creativecommons.org/publicdomain/zero/1.0/
Kaggle has fixed the issue with gzip files, and Version 510 should now reflect properly working files.
Please use version 508 of the dataset, as 509 is broken. The properly working version is linked here: https://www.kaggle.com/datasets/bwandowando/ukraine-russian-crisis-twitter-dataset-1-2-m-rows/versions/508
The context and history of the current ongoing conflict can be found at https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine.
[Jun 16] (Sunset) Twitter has finally pulled the plug on all of my remaining Twitter API accounts as part of its effort to make developers migrate to the new API. The last tweets that I pulled were dated Jun 14, and there is no more data from Jun 15 onwards. It was fun while it lasted, and I hope that this dataset has helped and will continue to help a lot of people. I'll just leave the dataset here for future download and reference. Thank you all!
[Apr 19] Two additional developer accounts have been permanently suspended, so expect lower throughput in the next few weeks. I will pull data until they ban my last account.
[Apr 08] I woke up this morning and saw that Twitter has permanently suspended 4 of my developer accounts. I have a few more, but it is just a matter of time until all my accounts will most likely get banned as well. This was a fun project that I maintained for as long as I could. I will pull data until my last account gets banned.
[Feb 26] I've started to pull in RETWEETS again, so I am expecting a significant increase in tweet throughput, on top of the dedicated processes that I have that get NON-RETWEETS. If you don't want RETWEETS, just filter them out.
[Feb 24] It's been a year since I started getting tweets about this conflict, and I had no idea that a year later it would still be ongoing. Almost everyone assumed that Ukraine would crumble in a matter of days, but that has not been the case. To those who have been using my dataset, I hope that I am helping all of you in one way or another. I'll do my best to keep updating this dataset for as long as I can.
[Feb 02] I seem to be getting fewer tweets as my crawlers are getting throttled. I used to get 2500 tweets per 15 mins, but around 2-3 of my crawlers are getting throttling limit errors. Twitter may have made some kind of update to its rate limits or something similar. I will try to find ways to increase the throughput again.
[Jan 02] All new dataset files will now be prefixed by the year, so for Jan 01, 2023, the name will be 20230101_XXXX.
[Dec 28] For those looking for a cleaned version of my dataset, with the retweets removed from before Aug 08, here is a dataset by @vbmokin https://www.kaggle.com/datasets/vbmokin/russian-invasion-ukraine-without-retweets
[Nov 19] I noticed that one of my developer accounts, which ISN'T TWEETING ANYTHING and is just pulling data out of Twitter, has been permanently banned by Twitter.com, hence the decrease in unique tweets. I will try to come up with a solution to increase my throughput and sign up for a new developer account.
[Oct 19] I just noticed that this dataset is finally "GOLD", after roughly seven months since I first uploaded my gzipped csv files.
[Oct 11] Sudden spike in number of tweets revolving around most recent development(s) about the Kerch Bridge explosion and the response from Russia.
[Aug 19 - IMPORTANT] I raised the missing dataset issue with the Kaggle team and they confirmed it was a bug introduced by a ReactJS upgrade; the conversation and details can be seen here: https://www.kaggle.com/discussions/product-feedback/345915 . It has been fixed already, and I've reuploaded all the gzipped files that were lost PLUS the new files that were generated AFTER the issue was identified.
[Aug 17] It seems the latest version of my dataset lost 100+ files. Good thing this dataset is versioned, so one can just go back to the previous version(s) and download them. Version 188 HAS ALL THE LOST FILES. I won't be reuploading all the files as it would be tedious, and I've already deleted them locally since I only store the latest 2-3 days.
[Aug 10] 3/5 of my Python processes errored out, which resulted in around 10-12 hours of NO data gathering for those processes, hence the sharp decrease in tweets for the Aug 09 dataset. I've added exception/error checking to prevent this from happening again.
[Aug 09] Significant drop in tweets extracted, but I am now getting ORIGINAL/ NON-RETWEETS.
[Aug 08] I've noticed that I had a spike of Tweets extracted, but they are literally thousands of retweets of a single original tweet. I also noticed that my crawlers seem to deviate because of this tactic being used by some Twitter users where they flood Twitter w...
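For readers who want to follow the "just filter them out" advice from the Feb 26 note, here is a minimal sketch using only the Python standard library. The column name `is_retweet` and the sample rows are assumptions made for illustration; the notes above do not describe the real schema, so check the header of the actual gzipped CSV files first.

```python
import csv
import gzip
import io

# Hypothetical column name; check the header of the real files before use.
RETWEET_FLAG = "is_retweet"


def drop_retweets(gz_bytes, flag_column=RETWEET_FLAG):
    """Return the rows of a gzipped CSV whose retweet flag is not 'true'."""
    with gzip.open(io.BytesIO(gz_bytes), mode="rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle)
        return [row for row in reader if row[flag_column].strip().lower() != "true"]


def make_sample():
    """Build a tiny gzipped CSV in memory so the sketch runs without the dataset."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["tweetid", "text", RETWEET_FLAG])
    writer.writerow(["1", "original post", "False"])
    writer.writerow(["2", "RT of post 1", "True"])
    writer.writerow(["3", "another original", "False"])
    return gzip.compress(buffer.getvalue().encode("utf-8"))


kept = drop_retweets(make_sample())
print([row["tweetid"] for row in kept])  # ['1', '3']
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to `drop_retweets`; for very large files, iterating over the reader instead of building a list would keep memory use down.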
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about books. It has 9 rows and is filtered where the book subject is Ukraine-Relations-Russia. It features 9 columns, including author, publication date, language, and book publisher.
Russia launched its armed aggression against Ukraine in February 2014, seizing Crimea and subsequently occupying parts of the Donetsk and Luhansk oblasts of Ukraine. On February 24, 2022, Russia started a large-scale invasion of Ukraine on multiple fronts, deploying troops and shelling Ukrainian cities and infrastructure. As of the end of 2023, the war against Ukraine is still ongoing and its outcome is unknown. At different stages of the war, KIIS has studied the public opinion of the Ukrainian population regarding Russian aggression. It included surveys on people's attitudes towards the annexation of Crimea, and Ukraine's countermeasures in Eastern Ukraine (Anti-Terrorist Operation, ATO) covering the period from 2014 to 2018. Since 2022, public opinion polls have asked questions regarding people's feelings and opinions about the ongoing war between Ukraine and Russia, perceptions of the government's actions, readiness for concessions / compromises to end the war, etc. Data from individual surveys for the period 2014-2023 (14 in total) were combined into a merged dataset. Each of these polls is representative of Ukraine's adult population (aged 18 and older), and typically includes about 2,000 respondents. The background information includes respondents' socio-demographic profiles (gender, age, education, nationality, occupation, self-assessment of financial situation) and place of residence (oblast, type of settlement). These data provide a snapshot of public opinion of the Ukrainian population on some aspects of the Russian-Ukrainian war. Some questions are repeated, which makes it possible to track changes in opinions over time.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset - Ukraine-Relations-Russia in the news
This dataset, DSI-6411, comprises soil moisture data and accompanying information for the agricultural regions of Western Russia (west of ~60E) and Ukraine for the period from 1992 to 1996. These data are collected routinely for agro-meteorological monitoring of these two countries and serve as an input for the in-situ assessment of the state of the major crops.
Version 11.1 Release Date: August 22, 2022
The Office of the Geographer and Global Issues at the U.S. Department of State produces the Large Scale International Boundaries (LSIB) dataset. These data and their derivatives are the only international boundary lines approved for U.S. Government use. They reflect U.S. Government policy, and not necessarily de facto limits of control. This dataset is a National Geospatial Data Asset.
Sources for these data include treaties, relevant maps, and data from boundary commissions and national mapping agencies. Where available, the dataset incorporates information from courts, tribunals, and international arbitrations. The research and recovery of the data involves analysis of satellite imagery and elevation data. Due to the limitations of source materials and processing techniques, most lines are within 100 meters of their true position on the ground.
The dataset uses the following attributes:

| Attribute Name | Explanation |
| --- | --- |
| Country Code | Country-level codes are from the Geopolitical Entities, Names, and Codes Standard (GENC). The Q2 code denotes a line representing a boundary associated with an area not in GENC. |
| Country Names | Names approved by the U.S. Board on Geographic Names (BGN). Names for lines associated with a Q2 code are descriptive and are not necessarily BGN-approved. |
| Label | Required text label for the line segment where scale permits |
| Rank/Status | Rank 1: International Boundary; Rank 2: Other Line of International Separation; Rank 3: Special Line |
| Notes | Explanation of any applicable special circumstances |

Cartographic Usage

Depiction of the LSIB requires a visual differentiation between the three categories of boundaries: International Boundaries (Rank 1), Other Lines of International Separation (Rank 2), and Special Lines (Rank 3). Rank 1 lines must be the most visually prominent. Rank 2 lines must be less visually prominent than Rank 1 lines. Rank 3 lines must be shown in a manner visually subordinate to Ranks 1 and 2. Where scale permits, Rank 2 and 3 lines must be labeled in accordance with the "Label" field. Data marked with a Rank 2 or 3 designation does not necessarily correspond to a disputed boundary. Additional cartographic information can be found in Guidance Bulletins (https://hiu.state.gov/data/cartographic_guidance_bulletins/) published by the Office of the Geographer and Global Issues. Please direct inquiries to internationalboundaries@state.gov.
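The rank-based depiction rules above could be encoded as a small style table. The sketch below is only an illustration: the line widths and dash patterns are hypothetical values chosen to respect the required ordering of visual prominence, and the function names are not part of the LSIB or of any mapping library.

```python
# Hypothetical style values. The guidance only fixes the ORDER of prominence
# (Rank 1 > Rank 2 > Rank 3), not specific widths or dash patterns.
RANK_STYLES = {
    1: {"name": "International Boundary", "width": 1.5, "dash": None},
    2: {"name": "Other Line of International Separation", "width": 1.0, "dash": (4, 2)},
    3: {"name": "Special Line", "width": 0.5, "dash": (1, 2)},
}


def style_for(rank):
    """Look up the drawing style for an LSIB rank, rejecting unknown ranks."""
    if rank not in RANK_STYLES:
        raise ValueError(f"unknown LSIB rank: {rank!r}")
    return RANK_STYLES[rank]


def needs_label(rank):
    """Rank 2 and 3 lines must carry the Label text where scale permits."""
    return rank in (2, 3)


for rank in (1, 2, 3):
    style = style_for(rank)
    print(rank, style["name"], style["width"], "label" if needs_label(rank) else "no label")
```

Keeping the styles in one table makes it easy to check that the widths stay strictly decreasing if the values are later tuned for a particular map scale.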
The lines in the LSIB dataset are the product of decades of collaboration between geographers at the Department of State and the National Geospatial-Intelligence Agency with contributions from the Central Intelligence Agency and the UK Defence Geographic Centre. Attribution is welcome: U.S. Department of State, Office of the Geographer and Global Issues.
This version of the LSIB contains changes and accuracy refinements for the following line segments. These changes reflect improvements in spatial accuracy derived from newly available source materials, an ongoing review process, or the publication of new treaties or agreements. Changes to lines include: • Akrotiri (UK) / Cyprus • Albania / Montenegro • Albania / Greece • Albania / North Macedonia • Armenia / Turkey • Austria / Czechia • Austria / Slovakia • Austria / Hungary • Austria / Slovenia • Austria / Germany • Austria / Italy • Austria / Switzerland • Azerbaijan / Turkey • Azerbaijan / Iran • Belarus / Latvia • Belarus / Russia • Belarus / Ukraine • Belarus / Poland • Bhutan / India • Bhutan / China • Bulgaria / Turkey • Bulgaria / Romania • Bulgaria / Serbia • China / Tajikistan • China / India • Croatia / Slovenia • Croatia / Hungary • Croatia / Serbia • Croatia / Montenegro • Czechia / Slovakia • Czechia / Poland • Czechia / Germany • Finland / Russia • Finland / Norway • Finland / Sweden • France / Italy • Georgia / Turkey • Germany / Poland • Germany / Switzerland • Greece / North Macedonia • Guyana / Suriname • Hungary / Slovenia • Hungary / Serbia • Hungary / Romania • Hungary / Ukraine • Iran / Turkey • Iraq / Turkey • Italy / Slovenia • Italy / Switzerland • Italy / Vatican City • Italy / San Marino • Kazakhstan / Russia • Kazakhstan / Uzbekistan • Kosovo / North Macedonia • Kosovo / Serbia • Kyrgyzstan / Tajikistan • Kyrgyzstan / Uzbekistan • Latvia / Russia • Latvia / Lithuania • Lithuania / Poland • Lithuania / Russia • Moldova / Ukraine • Moldova / Romania • Norway / Russia • Norway / Sweden • Poland / Russia • Poland / Ukraine • Poland / Slovakia • Romania / Ukraine • Romania / Serbia • Russia / Ukraine • Syria / Turkey • Tajikistan / Uzbekistan
This release also contains topology fixes, land boundary terminus refinements, and tripoint adjustments.
While U.S. Government works prepared by employees of the U.S. Government as part of their official duties are not subject to Federal copyright protection (see 17 U.S.C. § 105), copyrighted material incorporated in U.S. Government works retains its copyright protection. The works on or made available through download from the U.S. Department of State's website may not be used in any manner that infringes any intellectual property rights or other proprietary rights held by any third party. Use of any copyrighted material beyond what is allowed by fair use or other exemptions may require appropriate permission from the relevant rightsholder. With respect to works on or made available through download from the U.S. Department of State's website, neither the U.S. Government nor any of its agencies, employees, agents, or contractors make any representations or warranties (express, implied, or statutory) as to the validity, accuracy, completeness, or fitness for a particular purpose; nor represent that use of such works would not infringe privately owned rights; nor assume any liability resulting from use of such works; and shall in no way be liable for any costs, expenses, claims, or demands arising out of use of such works.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset includes Telegram channels with both pro-Kremlin and anti-Kremlin communications, collected over a timeframe covering one year prior to and one year following the Russian invasion. It consists of 404 pro-Kremlin channels featuring 4,109,645 posts and 114 anti-Kremlin channels containing 1,117,768 posts, all provided in JSON format. anti_kremlin_channel_list and pro_kremlin_channel_list contain details such as the channel name, username, Telegram link, and corresponding annotations. Important Note: For proper attribution, researchers who use this dataset in their work are invited to cite the following papers that describe this dataset and an example analysis. Bawa, A., Kursuncu, U., Achilov, D., Shalin, V. L., Agarwal, N., & Akbas, E. (2025). Telegram as a Battlefield: Kremlin-related Communications during the Russia-Ukraine Conflict. arXiv preprint arXiv:2501.01884. Bawa, A., Kursuncu, U., Achilov, D., & Shalin, V. L. (2024). The adaptive strategies of anti-Kremlin digital dissent in Telegram during the Russian invasion of Ukraine. arXiv preprint arXiv:2408.07135.
Since 2008, KIIS has been tracking public opinion in Ukraine regarding Russia by asking the question 'What is your general attitude towards Russia now?' with a 4-point scale from 'very good' to 'very bad.' To gain a deeper understanding of the situation, every few years the surveys also included additional questions about attitudes towards Russians (residents of Russia) and the Russian leadership. Each survey wave in Ukraine was carried out on a sample representative of Ukraine's adult population (aged 18 and older), with an average sample size of about 2,000 respondents. The merged dataset contains data from 49 waves of the survey conducted in Ukraine from 2008 to 2022 with a total of 98,575 respondents. The background information includes respondents' socio-demographic profiles (gender, age, education, nationality, occupation, self-assessment of financial situation) and place of residence (oblast, type of settlement). These data enable tracking Ukrainian public opinion regarding Russia for the period of 14 years, from 2008 to 2022, both among the population as a whole and among its different subpopulations. This monitoring of public opinion in Ukraine on Russia is a part of a joint project with the Levada Center, which simultaneously tracked public opinion in Russia on Ukraine, using the same question wording. However, only the data from the polls conducted in Ukraine are presented in this data collection.
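Tracking opinion over time with a merged, multi-wave file like this usually comes down to computing a share per wave. The sketch below assumes the data have been reduced to `(wave, answer)` pairs. The two middle answer labels ("mostly good", "mostly bad") and the toy records are invented for illustration and are not survey results. It also ignores any survey weights the real file may provide.

```python
from collections import Counter

# Assumed answer labels: the description only names the endpoints of the
# 4-point scale ('very good' and 'very bad'); the middle labels are hypothetical.
POSITIVE = {"very good", "mostly good"}


def positive_share_by_wave(records):
    """Return {wave: share of positive answers}, with waves in sorted order."""
    totals = Counter()
    positives = Counter()
    for wave, answer in records:
        totals[wave] += 1
        if answer in POSITIVE:
            positives[wave] += 1
    return {wave: positives[wave] / totals[wave] for wave in sorted(totals)}


# Toy records, invented for this example only.
toy = [
    ("wave_01", "very good"), ("wave_01", "mostly good"),
    ("wave_01", "mostly bad"), ("wave_01", "very bad"),
    ("wave_02", "mostly good"), ("wave_02", "mostly bad"),
    ("wave_02", "very bad"), ("wave_02", "very bad"),
]
print(positive_share_by_wave(toy))  # {'wave_01': 0.5, 'wave_02': 0.25}
```

The same function could be applied to a subpopulation by filtering the records (for example, by oblast or age group) before passing them in.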
Since the early 1990s, KIIS has systematically polled the question "How would you like to see Ukraine's relations with Russia?" to estimate Ukrainian preferences regarding these relations. The answer options provided to respondents were: "They should be the same as with other states - with closed borders, visas, customs"; "Ukraine and Russia should be independent but friendly states - with open borders, no visas, and no customs"; "Ukraine and Russia should unite into one state." Each survey wave was carried out on a sample representative of Ukraine's adult population (aged 18 and older), with an average sample size of about 2,000 respondents. To facilitate analysis, the results of the individual survey waves from 1993 to 2023 were merged into a single dataset, including 82 polls with a total of 166,314 respondents. The background information includes respondents' socio-demographic profiles (gender, age, education, nationality, occupation, self-assessment of financial situation) and place of residence (oblast, type of settlement). These data enable tracking Ukrainian public opinion on what the relationship between Ukraine and Russia should be like, from Ukraine's independence to 2023, both among the population as a whole and among its different subpopulations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The benchmark interest rate in Russia was last recorded at 20 percent. This dataset provides the latest reported value for - Russia Interest Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset QOL Ukraine
The dataset includes fifteen programmes of the festival, which has been held annually in the border regions of Ukraine, Belarus and Russia since at least 1975. Until 2005 the festival was called Friendship, and since 2005 the Russian organisers have designated it as the Slavic Unity Festival, a name that was never adopted by Ukraine. After Russia's annexation of Crimea and the outbreak of armed conflict in Donbass in 2014, Ukraine withdrew from participation. The dataset with programmes before and after Ukraine's withdrawal clearly demonstrates how new narratives were added to the festival first in 2005 and then again after 2014.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sheet 1 describes per capita electricity generation by country and region across the globe, and it shows that Africa's energy generation has been the lowest in comparison to the rest of the world. Sheet 2 describes the number of people who are severely food insecure, broken down by the different areas that Africa is divided into. The trend shows that the number of severely food-insecure people keeps increasing drastically every year. Sheet 3 (FAO real food prices) gives a trend of how food prices have changed since the Ukraine-Russia war started.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book series. It has 1 row and is filtered where the book is National identity and foreign policy : nationalism and leadership in Poland, Russia and Ukraine. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.
On 24 February 2022, Russia invaded Ukraine, starting what is now also known as the Russo-Ukrainian War. We obtained our dataset through the Twitter API from 23 February 2022 until 23 June 2023. The collected dataset has 127,275,386 tweets, shared in the form of anonymized text, where the tweet/user IDs and user mentions are anonymized and do not provide any personal information. The provided dataset contains user discussion in more than 70 languages, of which the 20 most popular are: 'eng', 'fr', 'de', 'mix', 'it', 'es', 'ja', 'ru', 'pl', 'uk', 'tr', 'th', 'hi', 'qme', 'qht', 'nl', 'fi', 'ar', 'zh' and 'pt'. For the purpose of information integrity, tweets are separated and stored in different files ordered by creation date. The provided dataset is shared for further research purposes. Additionally, we provide the list of tweet IDs in the GitHub repository, which can be retrieved via the Twitter API. Furthermore, we also carried out some initial analysis, including volume/activity, hashtag popularity, sentiment, and military intelligence, and published the results in the web portal.
https://creativecommons.org/publicdomain/zero/1.0/
- I'd recommend going through the Content to understand the Data.
- This data is Clean
- The data contains exports made by Russia to the world:
  i. 225 partner countries
  ii. ~**20** years
  iii. ~**3000** unique commodities (SITC code description)
- Starter Code to optimize performance
World trade is going through a massive change after the COVID-19 situation, fueled by the recent Russia-Ukraine conflict. This is Russia's part in world trade for around 15 years with 225 countries.
Reference to sanctions might help drawing conclusions
The COVID-19 pandemic is likely to be known as the inflection point in history that changed the nature of the post-World Trade Organization (WTO) global trade policy environment. The last time the world witnessed a similar situation was in 1995, when the WTO was established, creating a rule-based global trading system.
The war in Ukraine is causing worldwide disruptions to trade and investment, affecting auto makers in Europe, hoteliers in Georgia and the Maldives, as well as impacting consumers of food and fuel globally. Although the world's poor, who spend a large part of their incomes on life's necessities, are the most vulnerable, no country, region, or industry is left untouched by these disruptions.
Classification - SITC Version 4 (latest) code description
Year - Year when the trade was made
Commodity Code - Code of the commodity
Commodity - Name of the commodity
Qty Unit - Unit of the item / quantity
Qty - Quantity of item
Netweight (kg) - Item weight in kilograms
Trade Value (US$) - Trade value in US dollars
Aggregate Level - 5 levels (1 to 5). You can choose one or all (Group is the sweet spot).

| Aggregate Level | Level Name | Code Format | Number of Items |
| --- | --- | --- | --- |
| 1 | Section | 0 | 10 Items |
| 2 | Division | 01 | 67 Items |
| 3 | Group | 012 | 261 Items |
| 4 | Subgroup | 012.1 | 1033 Items |
| 5 | Item | 012.13 | 3121 Items |

| Reporter Code | Reporter | Reporter ISO |
| --- | --- | --- |
| Reporting country's code | Name of country reporting | Reporting country's ISO code |
| 644 (Constant) | Russia (Constant) | RUS (Constant) |

| Partner Code | Partner | Partner ISO |
| --- | --- | --- |
| Partner (receiver) country's code | Partner country's name | Partner ISO code |
| 3-digit code | 225 unique countries | 3-digit code |
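The code formats listed for the five aggregate levels suggest that the level can be read off the number of digits in a commodity code. The sketch below is an illustration under that assumption: it takes codes as strings shaped like the examples (`0`, `01`, `012`, `012.1`, `012.13`), and the function name is hypothetical. If a file stores codes as integers, leading zeros are lost and this approach would need adjusting.

```python
LEVEL_NAMES = {1: "Section", 2: "Division", 3: "Group", 4: "Subgroup", 5: "Item"}


def aggregate_level(code):
    """Infer the SITC aggregate level (1-5) from the number of digits in a code string."""
    digits = code.replace(".", "")
    if not digits.isdigit() or not 1 <= len(digits) <= 5:
        raise ValueError(f"not a recognised SITC code format: {code!r}")
    return len(digits)


for sample in ["0", "01", "012", "012.1", "012.13"]:
    level = aggregate_level(sample)
    print(sample, level, LEVEL_NAMES[level])
```

Filtering rows to a single level this way (for example, level 3, Group) avoids double counting trade values that appear once per level of the hierarchy.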
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Information about incidents within a conflict, e.g., shelling of an area of interest, is scattered amongst different data or media sources. For example, the ACLED dataset continuously documents local incidents recorded within the context of a specific conflict such as Russia's war in Ukraine. However, these blocks of information might be incomplete. Therefore, it is useful to collect data from several sources to enrich the information pool of a certain incident. In this paper, we present a dataset of social media messages covering the same war events as those collected in the ACLED dataset. The information is extracted from automatically geocoded Twitter text data using state-of-the-art natural language processing methods based on large pre-trained language models (LMs). Our method can be applied to various textual data sources. Both the data and the approach can help human analysts obtain a broader understanding of conflict events.
KIIS monitors the geopolitical preferences of the Ukrainian population by asking respondents about their readiness to act in a certain way (vote for, against, or not to participate in the vote) in a hypothetical situation, namely, if a referendum on Ukraine's accession to the European Union, NATO, the Union with Russia and Belarus, or the Customs Union (with Russia, Belarus, Kazakhstan) were held now. In addition to these questions, some polls also ask respondents which direction of foreign policy they consider more preferable, with the options "accession to the European Union", "accession to the Customs Union of Russia, Belarus, Kazakhstan, Kyrgyzstan" and "not joining either the European Union or the Customs Union". This wording of the question enables evaluating the broader attitudes of the population regarding the geopolitical direction without requiring a definitive choice (such as voting for or against a specific option). Each survey wave was carried out on a sample representative of Ukraine's adult population (aged 18 and older), with an average sample size of about 2,000 respondents. In order to facilitate the analysis, the data collected for the period 2005-2022 were combined into one dataset, including 31 polls with a total of 62,911 respondents. The background information includes respondents' socio-demographic profiles (gender, age, education, nationality, occupation, self-assessment of financial situation) and place of residence (oblast, type of settlement). These data enable tracking Ukrainian public opinion on the desired course of Ukraine's foreign policy for the period of 17 years, from 2005 to 2022, both among the population as a whole and among its different subpopulations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data from the experiment and Python code for the project titled "Love or politics? Political views regarding the war in Ukraine in an online dating experiment".
Paper abstract: Political views affect various behaviors, including relationship formation. This study conducts a field experiment on a large Russian dating site and gathers data from over 3,000 profile evaluations. The findings reveal significant penalties for those who express pro-war or anti-war positions on their dating profiles. Age emerges as the most polarizing factor: younger individuals are less likely to approach pro-war profiles but not anti-war ones, whereas older individuals are less likely to respond positively to profiles indicating anti-war views but not pro-war ones. The results align with survey evidence of a positive relationship between respondents' age and expressed support for the war in Russia, although the experiment indicates a higher degree of polarization. Overall, the experimental findings demonstrate that survey data can reveal trends and relationships between individuals' characteristics and their opinions, but may overstate the levels of support for government agendas in non-democratic states.
The experiment was conducted in October - November, 2022, on a large online dating site in Russia in three Russian regions: Moscow, Saint Petersburg, and Sverdlovskaya oblast. There are three separate data files, one for each region. Each file contains information on dating site users that have been liked by and/or have viewed the experimental profiles.
File ExperimentDataMainLikedUsers.csv contains data on the main sample of liked users. The hair color of these users was recorded from profile photos whenever possible. Weights have also been added to enable analysis with adjustment for differences in age distribution between dating site users and a subset of the Russian population that shares similar observable characteristics.
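To illustrate how such weights are typically used, here is a minimal sketch of a weighted mean of a 0/1 outcome. The outcome, the weight values, and the function name are invented for this example; the real column names in ExperimentDataMainLikedUsers.csv are not given in this description, and the study's own analysis code may proceed differently.

```python
def weighted_mean(values, weights):
    """Weighted average: sum(w * v) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Toy data, invented for illustration: 1 = positive response, 0 = no response.
responded = [1, 0, 1]
age_weights = [1.0, 2.0, 1.0]  # hypothetical weights

print(sum(responded) / len(responded))        # unweighted rate, about 0.667
print(weighted_mean(responded, age_weights))  # weighted rate: 0.5
```

In the toy data, the non-responding user carries twice the weight of the others, so the weighted rate falls below the unweighted one. Age-distribution weights shift estimates in the same way, toward the groups that are under-represented among dating site users.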
The folder also contains Python code for data analysis.
The description of the study is available at https://mpra.ub.uni-muenchen.de/120731/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset - Pianists-Russia-Biography in the news