https://creativecommons.org/publicdomain/zero/1.0/
Kaggle has fixed the issue with gzip files, and version 510 should now contain properly working files.
Please use version 508 of the dataset, as version 509 is broken. The properly working version is available at https://www.kaggle.com/datasets/bwandowando/ukraine-russian-crisis-twitter-dataset-1-2-m-rows/versions/508
The context and history of the ongoing conflict can be found at https://en.wikipedia.org/wiki/2022_Russian_invasion_of_Ukraine
[Jun 16] (🌇Sunset) Twitter has finally pulled the plug on all of my remaining Twitter API accounts as part of its effort to make developers migrate to the new API. The last tweets that I pulled were dated Jun 14, and there is no data from Jun 15 onwards. It was fun while it lasted, and I hope that this dataset has helped and will continue to help many people. I'll leave the dataset here for future download and reference. Thank you all!
[Apr 19] Two additional developer accounts have been permanently suspended, so expect lower throughput in the next few weeks. I will pull data until they ban my last account.
[Apr 08] I woke up this morning and saw that Twitter has permanently suspended 4 of my developer accounts. I have a few more, but it is just a matter of time until all of my accounts most likely get banned as well. This was a fun project that I maintained for as long as I could. I will pull data until my last account gets banned.
[Feb 26] I've started to pull in RETWEETS again, so I am expecting a significant increase in tweet throughput on top of the dedicated processes that collect NON-RETWEETS. If you don't want RETWEETS, just filter them out.
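Filtering out retweets can be sketched roughly as follows. This is only an illustration under stated assumptions: the column names `tweetid` and `text` and the conventional "RT @" prefix are assumptions, not the documented schema of this dataset, so check the actual CSV header first.

```python
import csv
import gzip
import io


def drop_retweets(rows, text_field="text"):
    """Yield only rows whose text does not look like a classic retweet."""
    for row in rows:
        # Assumption: retweets carry the conventional "RT @user:" prefix.
        text = row.get(text_field) or ""
        if not text.lstrip().startswith("RT @"):
            yield row


def read_gzipped_csv(path):
    """Read one daily gzipped CSV file into a list of dicts."""
    with gzip.open(path, mode="rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


# Tiny in-memory sample standing in for one daily file (hypothetical columns).
sample = io.StringIO(
    "tweetid,text\n"
    "1,Original report from Kyiv\n"
    "2,RT @someone: Original report from Kyiv\n"
)
originals = list(drop_retweets(csv.DictReader(sample)))
print([row["tweetid"] for row in originals])
```

For a real file, `drop_retweets(read_gzipped_csv(path))` would apply the same filter. If the files expose a dedicated retweet flag column, filtering on that column would be more reliable than the text prefix.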
[Feb 24] It's been a year since I started collecting tweets about this conflict, and I had no idea that a year later it would still be ongoing. Almost everyone assumed that Ukraine would crumble in a matter of days, but that has not been the case. To those who have been using my dataset, I hope that I am helping all of you in one way or another. I'll do my best to keep updating this dataset for as long as I can.
[Feb 02] I seem to be getting fewer tweets as my crawlers are getting throttled. I used to get 2500 tweets per 15 minutes, but around 2-3 of my crawlers are now hitting rate-limit errors. Twitter may have changed its rate limits or something similar. I will try to find ways to increase the throughput again.
[Jan 02] All new dataset files will now be prefixed with the year, so the file for Jan 01, 2023 will be named 20230101_XXXX.
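A minimal sketch of reading the date back out of the year-prefixed naming scheme. The helper name `date_from_filename` is hypothetical, and the sketch only handles the `YYYYMMDD_` prefix described above; it makes no assumption about how earlier files were named and simply returns `None` for them.

```python
import re
from datetime import date, datetime

# Eight digits followed by an underscore, e.g. "20230101_XXXX".
_PREFIX = re.compile(r"^(\d{8})_")


def date_from_filename(name):
    """Return the date encoded in a YYYYMMDD_ prefix, or None if absent/invalid."""
    match = _PREFIX.match(name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None


print(date_from_filename("20230101_XXXX"))
```

This makes it straightforward to sort the daily files chronologically or to select a date range before loading them.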
[Dec 28] For those looking for a cleaned version of my dataset, with the retweets from before Aug 08 removed, here is a dataset by @vbmokin: https://www.kaggle.com/datasets/vbmokin/russian-invasion-ukraine-without-retweets
[Nov 19] I noticed that one of my developer accounts, which ISN'T TWEETING ANYTHING and is just pulling data out of Twitter, has been permanently banned by Twitter.com, hence the decrease in unique tweets. I will try to come up with a solution to increase my throughput and sign up for a new developer account.
[Oct 19] I just noticed that this dataset is finally "GOLD", roughly seven months after I first uploaded my gzipped CSV files.
[Oct 11] Sudden spike in the number of tweets about the most recent developments: the Kerch Bridge explosion and the response from Russia.
[Aug 19 - IMPORTANT] I raised the missing dataset issue with the Kaggle team, and they confirmed it was a bug introduced by a ReactJS upgrade. The conversation and details can be seen here: https://www.kaggle.com/discussions/product-feedback/345915 . It has already been fixed, and I've reuploaded all the gzipped files that were lost PLUS the new files that were generated AFTER the issue was identified.
[Aug 17] It seems the latest version of my dataset lost 100+ files. Fortunately this dataset is versioned, so one can go back to the previous version(s) and download them. Version 188 HAS ALL THE LOST FILES. I won't be reuploading all the files, as it would be tedious and I've already deleted them locally; I only store the latest 2-3 days.
[Aug 10] 3 of my 5 Python processes errored out, which resulted in around 10-12 hours of NO data gathering for those processes, hence the sharp decrease in tweets in the Aug 09 dataset. I've added exception/error checking to prevent this from happening again.
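The kind of exception checking mentioned above might look roughly like the sketch below. This is not the author's actual crawler code: `run_with_error_checking`, `fetch_batch`, and `handle_batch` are hypothetical names, and the flaky fetch function only simulates a failure.

```python
import logging
import time


def run_with_error_checking(fetch_batch, handle_batch, max_batches, pause_seconds=0.0):
    """Fetch and handle batches, logging failures instead of letting the process die."""
    failures = 0
    for _ in range(max_batches):
        try:
            handle_batch(fetch_batch())
        except Exception:
            # Record the traceback, wait briefly, then carry on with the next batch.
            failures += 1
            logging.exception("batch failed; continuing")
            time.sleep(pause_seconds)
    return failures


# Simulated crawler: the second call fails, the others succeed.
calls = {"n": 0}


def flaky_fetch():
    calls["n"] += 1
    if calls["n"] == 2:
        raise RuntimeError("simulated rate-limit error")
    return ["tweet %d" % calls["n"]]


collected = []
failed = run_with_error_checking(flaky_fetch, collected.extend, max_batches=3)
print(failed, collected)
```

Catching the error inside the loop means one bad request costs a single batch rather than hours of missed collection.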
[Aug 09] Significant drop in tweets extracted, but I am now getting ORIGINAL/ NON-RETWEETS.
[Aug 08] I've noticed that I had a spike of Tweets extracted, but they are literally thousands of retweets of a single original tweet. I also noticed that my crawlers seem to deviate because of this tactic being used by some Twitter users where they flood Twitter w...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ukraine Open Data Site
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data collected and prepared for a project of the World Bank Group, the Power Transmission Project in Support of the Energy Sector Reform & Development Program in Ukraine. This data is based on a digitized PDF map and is intended only as a schematic of the rough locations of the power network. It is not suitable for applications requiring high accuracy. The PDF map can be viewed on the last page of the attached report. To learn more, please visit http://projects.worldbank.org/P096207/power-transmission-project-support...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Contains data from the World Bank's data portal. There is also a consolidated country dataset on HDX.
Education is one of the most powerful instruments for reducing poverty and inequality and lays a foundation for sustained economic growth. The World Bank compiles data on education inputs, participation, efficiency, and outcomes. Data on education are compiled by the United Nations Educational, Scientific, and Cultural Organization (UNESCO) Institute for Statistics from official responses to surveys and from reports provided by education authorities in each country.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These data were generated using the approach described in Sadeh et al. (submitted to Scientific Data). Files are structured as zipped shapefiles per year, per oblast. Please contact yuval.sadeh@monash.edu with questions or comments.
https://academictorrents.com/nolicensespecified
Speech Recognition for Ukrainian 🇺🇦
The aim of this repository is to collect information and datasets for speech recognition in Ukrainian. Get in touch with us in our Telegram group.
Datasets
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Descriptive data on Russian operations in and against Europe since its invasion of Ukraine in 2022. This dataset has been made by three researchers at Leiden University in the Netherlands: Bart Schuurman, Dion Jordens, and Stijn Willem van 't Land. The dataset's focus is on operations / incidents in the physical domain, such as sabotage, assassination and certain types of influence operations. Most cyber attacks and espionage operations are not included for this reason, unless they had a physical effect (e.g., a cyber attack that caused physical sabotage, or an influence campaign that targeted specific individuals rather than a broad online audience). Although Russian sabotage and hybrid warfare operations against Europe have repeatedly made news headlines, a complete overview of these activities since the invasion of Ukraine has not yet been put together. The primary purpose of this dataset is to provide such an overview and use it to inform the public debate on the threat posed by Russia. Further information about the project will be made available on the Leiden University website in late 2024 / early 2025 (see metadata below for more information).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The official data on the new administrative-territorial structure of Ukraine were taken from the Geoportal of the administrative-territorial structure of Ukraine: https://atu.gki.com.ua/
The digital topographic map of Ukraine at a scale of 1:100 000 is available at https://gki.com.ua/access_map_100k
https://choosealicense.com/licenses/openrail++/
Ukrainian NLI (translated)
We obtained the first-of-its-kind Ukrainian Natural Language Inference dataset by translating English NLI data.
Dataset formation:
English data source: https://nlp.stanford.edu/projects/snli/
Translation into Ukrainian using the model: https://huggingface.co/facebook/nllb-200-distilled-600M
Labels: 0 - entailment, 1 - neutral, 2 - contradiction.
Load dataset:
from datasets import load_dataset dataset =… See the full description on the dataset page: https://huggingface.co/datasets/ukr-detect/ukr-nli-dataset-translated-stanford.
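The label scheme above can be turned into readable names with a small lookup. This is a sketch under assumptions: the record layout with a `label` field is hypothetical, and the sample records are made up for illustration rather than taken from the dataset.

```python
from collections import Counter

# Label ids as stated in the dataset description.
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def label_name(label_id):
    """Map a numeric NLI label to its name; raises KeyError for unknown ids."""
    return LABEL_NAMES[label_id]


def label_distribution(records, label_field="label"):
    """Count how often each named label occurs in an iterable of dict records."""
    return Counter(label_name(record[label_field]) for record in records)


# Made-up records with an assumed "label" field.
records = [{"label": 0}, {"label": 2}, {"label": 2}, {"label": 1}]
print(label_distribution(records))
```

Checking the label distribution this way is a quick sanity test before training a classifier on the data.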
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ukraine Government Securities Allocation: Open Market data was reported at 36,874.470 UAH mn in Sep 2018. This is an increase from the previous figure of 35,735.090 UAH mn for Aug 2018. Ukraine Government Securities Allocation: Open Market data is updated monthly, with a median of 15,714.520 UAH mn from Jan 2005 to Sep 2018 across 165 observations. The data reached an all-time high of 127,045.680 UAH mn in Aug 2014 and a record low of 533.120 UAH mn in May 2009. Ukraine Government Securities Allocation: Open Market data remains in active status in CEIC and is reported by the National Bank of Ukraine. The data is categorized under Global Database’s Ukraine – Table UA.Z005: Government Bonds and Government Securities Allocation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
GDP from Public Administration in Ukraine decreased to 121459 UAH Million in the fourth quarter of 2024 from 138773 UAH Million in the third quarter of 2024. This dataset provides actual values, historical data, forecasts, charts, statistics, an economic calendar, and news for Ukraine GDP From Public Administration.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset for our paper Linked4Resilience: Linked Open Data for Data-Centric Resilience of Damaged Cultural Properties and Infrastructures in Ukraine presented in 2024 at the 3rd International Workshop on Social Communication and Information Activity in Digital Humanities, October 31, 2024, Lviv, Ukraine.
The dataset consists of annotated cultural sites damaged during the Russian invasion of Ukraine. The sources of the UNESCO and ScienceAtRisk webpages are included.
Among the 351 damaged cultural properties published by UNESCO, only 211 meet our criteria and were included. More details can be found on our website: https://linked4resilience.eu .
The SPARQL endpoint of our datasets and use cases can be found on TriplyDB: https://triplydb.com/linked4resilience/cultural-sites-poc-v2/graphs.
The annotation criteria and other details can be found in the paper.
The license is CC BY-NC 4.0. The authors should be informed of any derivation and use.
UNFPA Real-Time Ukraine Activity Dataset
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset covers the distribution within Ukraine of amphibians listed in the Red Data Book of Ukraine, based on a literature source (Писанец Е.М., Литвинчук С.Н., Куртяк Ф.Ф., Радченко В.И. // Земноводные Красной книги Украины (Справочник-кадастр). - К.: Зоомузей ННПМ НАН Украины, 2005. 230 с.).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about universities in Ukraine. It has 4 rows. It features 3 columns: country, and ranking.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Ukraine administrative levels 0 (country), 1 (oblast, autonomous republic, city with special status), and 2 (raion, city municipality) population statistics
REFERENCE YEAR: 2018
These tables are suitable for GIS or database linkage to the Ukraine - Subnational Administrative Divisions at administrative level 0-1.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Key information about Ukraine Public Consumption: % of GDP
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Register of datasets held by the Office of the Prosecutor General of Ukraine, listing identification numbers, names of open datasets, file formats, and hyperlinks to the information on the website of the Prosecutor General's Office and the Unified State Open Data Portal.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes bat records collected over several decades by various researchers mainly in Kharkiv city. The records were initially gathered by Professor A.S. Lysetskiy (Department of Zoology) from 1948 to 1991, followed by additional data collected by A. Vlaschenko from 1999 to 2007, with the involvement of K. Kravchenko, A. Prylutska, and others starting from 2008. The data was compiled and digitized by the Ukrainian Bat Rehabilitation Center (UBRC) team, coordinated by M. Yerofieieva, in 2024. This dataset focuses exclusively on bats that were found and subsequently examined by specialists. It provides valuable insights into the bat populations of Kharkiv and its surrounding regions, covering 5 species and a total of 5,844 individual bat records from 1948 to 2012. The dataset significantly contributes to the conservation and monitoring of bat populations in northeastern Ukraine.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The project "Open biodiversity data: serving nature conservation in Ukraine" is supported by the Rufford Foundation and is aimed at collecting data on biodiversity in Ukraine by specialists of the Ukrainian Nature Conservation Group (UNCG) for publication in GBIF. The project became especially relevant after the invasion of Ukraine by Russian troops, which created many threats to rare species in different parts of Ukraine.