License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
This is what real world data looks like! It is often messy, complicated, and leaves you wondering what you can even do with it. That is the fun and difficulty of data science. You have information, but what can you do with it? Should you try to use machine learning? Should you use statistics? That is for you to find out! 😄
This dataset contains data on the ongoing conflict between Ukraine and Russia, dating back to 2014. There are two CSV files in this dataset: one contains data from 2014 to 2021, the other contains data from 2018 to 2023. Use your data science skills to better understand a conflict that is happening in real time! This is an excellent project for those looking to better understand global events or who want to work on a dataset with greater implications and a larger impact than a cat vs. dog classifier. 👍
I will be contributing to this dataset as new data becomes available, so stay tuned!
The Ukraine-Russia conflict began in 2014 when Russia annexed Crimea from Ukraine, although the history between these two nations goes back much further. Since the annexation, pro-Russian separatists have been fighting Ukrainian government forces in the Donbas region of eastern Ukraine. The conflict has resulted in thousands of deaths and the displacement of over 1.5 million people.
In 2022, the conflict escalated again, with Russia mobilizing its military near the Ukrainian border and launching a large-scale invasion in February. Ukrainian forces have been engaged in heavy fighting with Russian troops and separatist militias, resulting in a humanitarian crisis and significant civilian casualties.
The international community has condemned Russia's actions and imposed economic sanctions on the country. Diplomatic efforts to resolve the conflict, including negotiations and ceasefires, have not been successful so far. The conflict remains ongoing and the situation is highly volatile.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
On 24 February 2022, Russia invaded Ukraine in a major escalation of the Russo-Ukrainian War that began in 2014. The invasion caused Europe's largest refugee crisis since World War II, with more than 6.3 million Ukrainians fleeing the country and a third of the population displaced (Source: Wikipedia).
This dataset is a collection of 407 news articles from The New York Times (NYT) and The Guardian related to the ongoing conflict between Russia and Ukraine. The publishing dates of the articles range from Feb 1st, 2022 to Jul 31st, 2022.
Here are some ideas to explore:
I am looking forward to seeing your work and ideas, and will keep adding more ideas to explore.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset describes russian and Ukrainian Equipment Losses During The 2022 Russian Invasion Of Ukraine. The dataset was created based on Oryx by scraping its Ukrainian losses and russian losses pages. This list only includes destroyed vehicles and equipment for which photo or videographic evidence is available. Therefore, the actual amount of equipment destroyed is significantly higher than recorded. You can find numbers in the 2022 Ukraine Russia War Dataset.
Image data includes pictures of Equipment Losses. More than 30k (10 GB) images of destroyed equipment can be found here. Data has been split into different folders by country and type of equipment. You can find the folder structure and some picture examples in the Data Overview Notebook.
Tabular data includes Equipment Losses, Equipment Models, Countries that produce Equipment, the Number of Equipment Losses, and types of Losses (abandoned, damaged, destroyed, captured, etc.). You can find a basic overview of the data in the Data Overview Notebook.
Tabular metadata includes a list of images available in the dataset.
Main Columns
- equipment
- model
- sub_model
- manufacturer
- losses_total
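As a rough sketch of how the main columns might be used, the snippet below sums `losses_total` per equipment type with pandas. The rows are placeholders invented for illustration (they are not taken from the dataset), and no assumption is made about the real file name or about any other columns.

```python
import pandas as pd

# Placeholder rows only; the values are made up and do not come from the dataset.
losses = pd.DataFrame(
    {
        "equipment": ["Tanks", "Tanks", "Artillery"],
        "model": ["model_a", "model_b", "model_c"],
        "sub_model": ["model_a_v1", "model_b_v2", "model_c_v1"],
        "manufacturer": ["Country X", "Country X", "Country Y"],
        "losses_total": [10, 5, 3],
    }
)

# Sum losses per equipment type, largest first.
by_equipment = (
    losses.groupby("equipment")["losses_total"].sum().sort_values(ascending=False)
)
print(by_equipment)
```

The same grouping works with `model` or `manufacturer` as the key if a finer or coarser breakdown is wanted.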
| Update Date | War Day | Notes |
|---|---|---|
| 2025-06-18 | 1211 | updated |
| 2024-07-12 | 870 | updated |
| 2023-10-02 | 586 | images metadata csv added |
| 2023-02-05 | 347 | |
| 2022-11-27 | 277 | |
| 2022-10-09 | 228 | |
| 2022-09-18 | 207 | |
| 2022-09-04 | 193 | |
| 2022-08-14 | 172 | |
| 2022-07-31 | 158 | |
| 2022-07-17 | 144 | |
| 2022-07-03 | 130 | |
| 2022-06-19 | 116 | |
| 2022-06-12 | 109 | |
| 2022-06-05 | 102 | |
| 2022-05-29 | 95 | |
| 2022-05-15 | 81 | images added |
| 2022-05-08 | 74 | |
| 2022-04-30 | 66 | |
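The War Day column appears to count days from the start of the invasion, with 24 February 2022 as day 1. That convention is inferred from the table rows rather than stated by the author, so treat the following as a sketch under that assumption; `war_day` is a hypothetical helper name.

```python
from datetime import date

# Assumption: 24 February 2022 is counted as war day 1.
INVASION_START = date(2022, 2, 24)


def war_day(update: date) -> int:
    """Return the 1-based day of the war for a given calendar date."""
    return (update - INVASION_START).days + 1


print(war_day(date(2022, 4, 30)))  # 66
print(war_day(date(2023, 2, 5)))   # 347
```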
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The USD/RUB exchange rate fell to 77.1688 on December 2, 2025, down 0.72% from the previous session. Over the past month, the Russian Ruble has strengthened 4.39%, and it is up by 26.50% over the last 12 months. Russian Ruble values, historical data, forecasts and news, updated in December 2025.
License: https://dataverse.no/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18710/1U2AQJ
Dataset description: This dataset contains corpus data used in the paper described below. The dataset consists of HTML pages that contain the results of corpus searches in the Russian National Corpus (RNC), as described in the methodology of the corresponding paper and in the methodological information of this README file. It also contains the scripts that were used to save these HTML pages and to extract the relevant information from them. The scripts created CSV files, which were then imported into a LibreOffice Calc document with the ".ods" extension.
Article description: The present small-scale study compares the usage of the verbal prefix do- in contemporary Russian and Ukrainian using the Ukrainian parallel corpus of the Russian National Corpus. Two datasets were analyzed: the first covered translations of Russian do- verbs into Ukrainian, whereas the second covered translations of Ukrainian do- verbs into Russian. The focus of the discussion was on cognate translations with different prefixes. While the amount of data does not allow any strong conclusions, it is shown that in both languages do- prefixes can express the same meanings, namely REACH, REACH (ABSTRACT), ADD, CONVEY, and, when used together with postfix -sja, EXCESS. As the discussion shows, there is reason to believe that the CONVEY meaning is less productive in Russian, where it is used in words restricted to official contexts and in fixed expressions. A quantitative analysis showed that among cognate translations from Ukrainian into Russian, the prefix was more often different than in translations from Russian into Ukrainian. This can be seen as a further clue for a wider application of Ukrainian do- compared to its Russian counterpart.
The dataset includes fifteen programmes of the festival, which has been held annually in the border regions of Ukraine, Belarus and Russia since at least 1975. Until 2005 the festival was called Friendship, and since 2005 the Russian organisers have designated it as the Slavic Unity Festival, a name that was never adopted by Ukraine. After Russia's annexation of Crimea and the outbreak of armed conflict in Donbass in 2014, Ukraine withdrew from participation. The dataset with programmes before and after Ukraine's withdrawal clearly demonstrates how new narratives were added to the festival first in 2005 and then again after 2014.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To assess media transformation, news from Channel One (2013-2023) is analyzed due to its popularity and status as a state-owned broadcaster (Statista Research Department, 2023). Television remains a key source of information in Russia, especially among older generations, and is more influential than newspapers (Levada-Center, 2022). Data is collected and coded into major and minor themes (e.g., "domestic issues" and "corruption"), with shifts in tone, framing, and emphasis over time considered.
Values are measured through themes in media discourse and government messaging. The same coding process is applied to explore how media and government policies promote specific values. Official government statements are analyzed to understand how these values are communicated.
A comparative analysis is conducted to explore the media's role in the war with Ukraine. Government decisions are compared to media coverage to see how events were framed or omitted. Special focus is given to Ukraine-related themes, such as portrayals of the Ukrainian government and narratives justifying the war (e.g., "neo-Nazi," "terrorist," "NATO expansion threat"). Thematic analysis traces how these narratives evolved to shape values that legitimize the invasion.
License: https://www.reddit.com/wiki/api
On 14 February, the mods of r/worldnews started creating "live" threads to aggregate all the comments related to what was then catalogued as the "*Ukraine-Russia Tensions*". Over time, the interest in these threads (measured by how many comments they accrue) has varied as the conflict has advanced.
This dataset aims to capture all the comments users have made so far in these threads. It is useful if you want to examine how the discourse around the war has been evolving, if you are interested in seeing which comments get upvoted the most, or if you simply want to toy around with a dataset that contains categorical, numeric and text data.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Flash Eurobarometer survey shows large consensus among EU citizens in all EU Member States in favour of the EU’s response to Russia’s invasion of Ukraine. The majority of Europeans think that since the war started, the EU has shown solidarity (79%) and has been united (63%) and fast (58%) in its reaction. Respondents are widely in favour of the unwavering support to Ukraine and its people. In particular, more than nine out of ten respondents (93%) approve providing humanitarian support to the people affected by the war. 88% of Europeans approve the idea of welcoming in the EU people fleeing the war. 80% approve the financial support provided to Ukraine. 66% agree that ‘Ukraine should join the EU when it is ready’, 71% believe that Ukraine is part of the European family and 89% feel sympathy towards Ukrainians.
Processed data files for the Eurobarometer surveys are published in .xlsx format.
For SPSS files and questionnaires, please contact GESIS - Leibniz Institute for the Social Sciences: https://www.gesis.org/eurobarometer
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Russia RU: Pharmaceutical Industry: Total Exports data was reported at 2.523 USD bn in 2021. This records an increase from the previous number of 1.039 USD bn for 2020. Russia RU: Pharmaceutical Industry: Total Exports data is updated yearly, averaging 315.687 USD mn from Dec 1996 (Median) to 2021, with 26 observations. The data reached an all-time high of 2.523 USD bn in 2021 and a record low of 99.550 USD mn in 2000. Russia RU: Pharmaceutical Industry: Total Exports data remains active status in CEIC and is reported by Organisation for Economic Co-operation and Development. The data is categorized under Global Database’s Russian Federation – Table RU.OECD.MSTI: Trade Statistics: Non OECD Member: Annual.
In response to Russia's large-scale aggression against Ukraine, the OECD Council decided on 8 March 2022 to immediately suspend the participation of Russia and Belarus in OECD bodies. In view of this decision, the OECD suspended its solicitation of official statistics on R&D from Russian authorities, leading to the absence of more recent R&D statistics for this country in the OECD database. Previously collected and compiled indicators are still available.
The business enterprise sector includes all organisations and enterprises whose main activity is connected with the production of goods and services for sale, including those owned by the state, and private non-profit institutions serving the above-mentioned organisations. In practice however, R&D performed in this sector is carried out mostly by industrial research institutes other than enterprises. This particularity reflects the traditional organisation of Russian R&D.
Headcount data include full-time personnel only, and hence are underestimated, while data in full-time equivalents (FTE) are calculated on the basis of both full-time and part-time personnel. This explains why the FTE data are greater than the headcount data.
New budgetary procedures introduced in 2005 have resulted in items previously classified as GBARD being attributed to other headings and have affected the coverage and breakdown by socio-economic objective.
The dataset collects tweets regarding Russia and Ukraine from 21st February to date. It will be updated every day.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: The widespread HIV epidemic in Ukraine is concentrated among people who inject drugs (PWID), making access to sterile injection paraphernalia (SIP) like sterile needles and syringes a critical method of HIV/AIDS prevention; however, the Russian invasion has threatened to disrupt the operations of syringe services programs (SSPs), creating a risk of HIV outbreaks among PWID.
Methods: We conducted 10 semi-structured interviews with outreach workers from SSPs. Interviews were purposively sampled to cover three prototypic regions of Ukraine: temporarily Russian-controlled, frontline, and destination. Qualitative results from interviews were then compared against a standardized, nationwide harm reduction database.
Results: We found that the Russian invasion triggered both supply and demand challenges for SSPs. Demand increased for all regions due to client transitions from pharmacies that closed to SSPs, increases in illicit drug use, greater client openness to NGO support, and displacement of clients to destination regions. Supply decreased for all areas (except for remote destination regions) due to battle-related barriers like curfews, roadblocks, and Internet disruptions; diminished deliveries of SIP and funding; and staff displacement. Time series plots of the number of unique clients accessing harm reduction services showed that an initial decrease in service provision occurred at the start of the war but that most regions recovered within several months except for Russian-controlled regions, which continued to provide services to fewer clients relative to previous years.
Conclusion: To ensure continued scale-up of SIP and other HIV prevention services, the SyrEx database should be leveraged to serve as a streamlined harm reduction locator that can inform workers and clients of open site locations and other pertinent information.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The dataset contains available information about launched and shot down missiles and drones during russian massive missile and drone (UAV) strikes on infrastructure (since October 2022) as part of its invasion of Ukraine. The dataset also contains information about some single shot-down UAVs.
The dataset was created manually based on the official reports of Air Force Command of UA Armed Forces and General Staff of the Armed Forces of Ukraine published on social media such as Facebook or Telegram.
You can find a short data overview here.
Russian attacks on Ukraine double since Trump inauguration, by BBC News
Russia continues record-setting aerial attacks, US cuts off arms shipments to Ukraine, by USA Today
Operational Fires in the Age of Punishment, by Center for Strategic and International Studies
Sustained Russian Shahed Swarms: The War of Precision Mass Continues, by War Quants
Calculating the Cost-Effectiveness of Russia’s Drone Strikes, by Center for Strategic and International Studies
Missile Attacks Calendar, by Pavlo Krasnomovets
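Putin bruker Vestens somling for alt det er verdt. De neste dagene blir helt avgjørende [Putin is exploiting the West's dithering for all it is worth. The next few days will be absolutely decisive], by Aftenposten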
missile_attacks_daily.csv
- time_start - start attack time
- time_end - end attack time
- model - missile or UAV name
- launch_place - city or region from which missiles were launched
- target - could be a city in Ukraine, or a region of Ukraine, or a direction, or all of Ukraine
- carrier - missile launch platform
- launched - the number of launched missiles or UAVs, null means 'Unknown'
- destroyed - the number of destroyed missiles or UAVs, null means 'Unknown'
- not_reach_goal - the number of missiles or UAVs that have not reached the target (crashed), since July 2024
- border_crossing - the number of UAVs that crossed Ukraine and then crossed another border
- still_attacking - the number of UAVs that were still attacking at the time of publishing the results
- num_hit_location - number of locations with direct hits
- num_fall_fragment_location - number of locations with hits of fall fragments of UAVs or missiles
- affected_region - Ukraine regions (oblasts, cities, etc.) where missiles or UAVs hits were registered
- destroyed_details - detailed information about the target (usually if there are more than 3 targets during the attack)
- launched_details - detailed information about the number of launched missiles or UAVs
- launch_place_details - detailed information about launch_place
- source - information source (mainly official Facebook posts, sometimes with corrections from Monitor and Monitorwar Telegram Channels)
missiles_and_uavs.csv
- model - missile or UAV name
- category - type of missile or UAV
- national_origin - manufacturer country
- type - subtype of missile or UAV
- launch_platform - launch platform
- name - official name
- name_NATO - official name by NATO
- in_service - year of start of production
- designer - company-designer
- manufacturer - company-manufacturer
- guidance_system - guidance system
- unit_cost - one unit cost
launch_place
| Region | Target | Decimal Coordinates | Google Maps |
|---|---|---|---|
| Bryansk Oblast | Navlya | 52.8281, 34.4989 | Open |
| Kursk Oblast | Khalino | 51.7504, 36.3108 | [Open](https://www.google.com/... |
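To illustrate how the `launched` and `destroyed` columns of missile_attacks_daily.csv could be combined, here is a minimal pandas sketch that computes the share of launched missiles or UAVs reported as destroyed, per model. The rows, model names, and the `destroyed_share` column are invented for the example; only the column names `time_start`, `model`, `launched`, and `destroyed` come from the description above.

```python
import pandas as pd

# Made-up rows using the documented column names; not real attack data.
attacks = pd.DataFrame(
    {
        "time_start": ["2024-01-01 22:00", "2024-01-02 23:30", "2024-01-03 21:15"],
        "model": ["Example-UAV", "Example-Missile", "Example-UAV"],
        "launched": [20, 10, None],   # null means 'Unknown'
        "destroyed": [15, 4, None],
    }
)

# Keep only rows where both counts are known, then total per model.
known = attacks.dropna(subset=["launched", "destroyed"])
totals = known.groupby("model")[["launched", "destroyed"]].sum()

# Share of launched missiles or UAVs reported as destroyed (hypothetical column).
totals["destroyed_share"] = totals["destroyed"] / totals["launched"]
print(totals)
```

Dropping rows with unknown counts before summing avoids treating 'Unknown' as zero, which would otherwise distort the ratio.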
License: CC0 1.0, https://creativecommons.org/publicdomain/zero/1.0/
The Russian invasion of Ukraine started on February 24, 2022. Attacks by Russian forces were reported in major cities across Ukraine, including Berdyansk, Chernihiv, Kharkiv, Odesa, Sumy, and the capital Kyiv. Western officials claimed that by scope, the war could be the largest in Europe since 1945. The Office of the United Nations High Commissioner for Human Rights (OHCHR) verified over 5.7 thousand deaths of civilians in Ukraine during the war as of September 2022.
The invasion caused Europe's largest refugee crisis since World War II, with over 7.2 million Ukrainians fleeing the country and a third of the population displaced. The refugees of the war mostly fled to the neighboring countries of Ukraine located in Central and Eastern Europe, prominently the nations of Poland, Hungary, Romania, Slovakia, Belarus, Republic of Moldova and Russia as well. With the situation in the regions of Ukraine changing, it is important to keep a general record regarding where the refugees are located, to provide better assistance to them and the concerned authorities.
This dataset contains information about the number of Ukrainian refugees that a neighboring country is housing at different points in time, starting from early March. The countries that mostly feature in the data are obviously the ones mentioned before that share borders with the nation of Ukraine. Each record mentions the country, the date of recording, the number of refugees in that country, and geospatial data of the particular region which could help in some useful geographical analysis. The consecutive entries for one country seem to be not more than a week apart at any given time. United Nations High Commissioner for Refugees (UNHCR) and local governments are the main sources.
This file was extracted using an API about war data from RapidAPI. I will also provide regular updates to this dataset whenever I find any. I am still new to this technique of extraction so any feedback would be highly appreciated.
The war has inflicted large-scale damage on many different communities, and I believe the data science community has the knowledge and resources to help. I believe all data enthusiasts learn data science to help solve real-world problems that society faces, and providing aid during humanitarian crises would be influential work of the highest order.
Visit this link if you wish to donate or provide other support to the efforts in Ukraine: https://stand-with-ukraine.pp.ua/
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Data will be updated weekly
Each new record accumulates data from previous days
This dataset describes Equipment Losses & Death Toll & Military Wounded & Prisoner of War of russians in 2022 Ukraine russia War. All data are official and additionally structured by myself. A lot of civilians and children have already been killed by russia troops. Ukraine is in war flame and under missile attack now. We are strong. Stand with Ukraine.
- russia_losses_personnel.csv - contains Personnel Losses during the war
- russia_losses_equipment.csv - contains Equipment Losses during the war
- russia_losses_equipment_correction.csv - contains some data corrections for russia_losses_equipment.csv (date: 2022-10-13, date: 2023-05-27)
I recommend the Daily Data Notebook if you need daily data. It contains the full code that shows how to convert the data, or you can adapt the code snippet below.
df = df.diff().fillna(df).fillna(0).astype(int)
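The one-liner above converts cumulative totals into daily values. A self-contained sketch of the same idea follows; the frame, its column names, and its numbers are placeholders made up for illustration, and it assumes the date has been moved into the index so that only numeric columns remain.

```python
import pandas as pd

# Made-up cumulative totals; the column names here are placeholders.
cumulative = pd.DataFrame(
    {
        "date": ["2022-02-25", "2022-02-26", "2022-02-27"],
        "tank": [80, 146, 150],
        "aircraft": [10, 27, 27],
    }
).set_index("date")

# diff() gives day-over-day increases; the first row becomes NaN,
# so fillna(cumulative) restores its original value before casting to int.
daily = cumulative.diff().fillna(cumulative).fillna(0).astype(int)
print(daily)
```

The second `fillna(0)` only matters when the cumulative data itself has missing values; without it, `astype(int)` would fail on any remaining NaN.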
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The Ukraine conflict has shaken Europe and the world. The atrocities committed are difficult to deny, yet opinions and stances surrounding this conflict are sharply polarized. Most people would agree that disinformation is a big issue in today's world. It can maliciously affect the electorate's opinion, or influence individuals to commit irrational and harmful acts. On the other hand, it can be weaponized to discredit a group or an individual who has an unfavourable opinion. This problem applies to the conflict in Ukraine. It is hard to know exactly which side has the "true" information, as the two sides are giving polar opposite accounts of the conflict.
Knowing the problem stated above, Lex Fridman was astute enough to invite two individuals with polar opinions regarding the war in Ukraine on his podcast. At first, he had Oliver Stone (American film director, producer, and screenwriter) to discuss the issues surrounding the war in Ukraine. Stone has a very clear bias for Russia. Link to the interview: https://www.youtube.com/watch?v=ygAqYC8JOQI
A few days later, Lex Fridman invited Stephen Kotkin (American historian, academic, and author) to discuss the same topics. Kotkin leans towards the West and Ukraine. Link to the interview: https://www.youtube.com/watch?v=2a7CDKqWcZ0&t=5128s
This effort by Lex Fridman to give his audience a better understanding of the conflict by inviting polar opposite views was personally enriching, but further analysis can be done.
The comment section is arguably as rich as the content itself, so I decided to scrape the comments and publish them so the community could draw analysis and exploit the data.
One idea would be to do a sentiment or polarization analysis with the two datasets, and see which side the public agrees with.
License: Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Detailed graphically-documented daily losses of Russian tanks according to ORYX
About the dataset
Detailed graphically-documented daily losses of Russian tanks according to ORYX, broken down by the model of the tank, its generation and by series-year.
Contents
The Excel file contains 3 sheets:
- Model_Level: the cumulative number of tanks lost (destroyed, damaged, abandoned and captured), broken down by the precise model of the tank
- Year_Level: the cumulative number of tanks lost (destroyed, damaged, abandoned and captured), broken down by the decade in which the tank entered production
- SeriesYear_Level: the cumulative number of tanks lost (destroyed, damaged, abandoned and captured), broken down by the decade in which the tank entered production and the series the tank belongs to
The column headers depend on the sheet:
- Model_Level: the headers contain the name of the model. White spaces, dashes and other signs of punctuation have been removed
- Year_Level: the headers contain the decade in which the tank entered production, preceded by an ‘x’. Unknown tanks are grouped under the ‘xUnknown’ header
- SeriesYear_Level: the headers contain the concatenation of the decade (yyyy) and the series of the tank (up to four letters of the series name). Unknown tanks do not have an attributed decade.
What sets this dataset apart?
Method of collection
The data was collected from the ORYX website on a daily basis. Since ORYX does not provide a real-time dataset, I obtain the real-time data by using the Wayback machine and then save the snapshot as HTML code for each date.
Upon loading the HTML file for each day, I filter each level of aggregation by using the h3 tags and only select the Tanks. At the category level, the values are found within an h3 tag, while each specific piece of equipment is found in a bullet list.
Although ORYX provides the number by item, one of its limitations is that the numbers reported may be different from the sum of the individual pieces because of inputting errors or miscategorization. I therefore contrast this information with the sum of the individual pieces of equipment listed according to their state (more on this in the following section).
Cleaning and treatment of the data
There is an extensive cleaning process involved:
- Cleaning of the names to remove typographical errors
- Correcting the aggregate number by equipment if that value is absurd given the number of individual pieces of equipment
- These checks ensure that the error between the aggregate and individual numbers is less than 5 in absolute terms or less than 5%, whichever condition is the most restrictive. The final number in the dataset is the minimum value between the aggregate and the individual numbers.
- As there may be revisions after the first time the information is published, I make sure to take the minimum value of the remaining series. This ensures that the numbers I provide are the most conservative possible.
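The checks described above might look roughly like the sketch below. This is one possible reading, not the author's actual code: the function names are hypothetical, the 5% threshold is assumed to be relative to the larger of the two counts, and "minimum value of the remaining series" is interpreted as the minimum over the current and all later observations.

```python
import pandas as pd


def is_consistent(aggregate: int, individual: int) -> bool:
    """True if the gap is under 5 units and under 5% (both must hold)."""
    gap = abs(aggregate - individual)
    # Assumption: the percentage is taken relative to the larger count.
    return gap < 5 and gap < 0.05 * max(aggregate, individual)


def conservative_value(aggregate: int, individual: int) -> int:
    """Keep the smaller of the aggregate and the itemised count."""
    return min(aggregate, individual)


def apply_revisions(series: pd.Series) -> pd.Series:
    """Replace each value with the minimum of itself and all later values."""
    return series[::-1].cummin()[::-1]


# Made-up daily counts in which day 2 was later revised downwards.
counts = pd.Series([10, 12, 11, 13], index=["d1", "d2", "d3", "d4"])
print(is_consistent(200, 203))
print(conservative_value(200, 203))
print(apply_revisions(counts).tolist())
```

Taking the reverse running minimum keeps the series non-decreasing, which is what a cumulative loss count should be.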
Frequency of the dataset
I plan on updating the dataset every week, with the dataset made available on Tuesday.
Companion datasets
All my datasets:
- Russian losses (materiel and personnel) according to the Ukrainian Ministry of Defense: https://www.kaggle.com/datasets/ol4ubert/rus-modukr-equipmentpersonnel
- Ukrainian losses (materiel and personnel) according to the Russian Ministry of Defense: https://www.kaggle.com/datasets/ol4ubert/ukr-modrus-equipmentpersonnel
- Russian losses (materiel) according to ORYX: https://www.kaggle.com/datasets/ol4ubert/rus-oryx-equipment
- Ukrainian losses (materiel) according to ORYX: https://www.kaggle.com/datasets/ol4ubert/ukr-oryx-equipment
- Russian tank losses according to ORYX: https://www.kaggle.com/datasets/ol4ubert/rus-oryx-tanks
- Ukrainian tank losses according to ORYX: https://www.kaggle.com/datasets/ol4ubert/ukr-oryx-tanks
- Ukrainian personnel losses (UALosses): https://www.kaggle.com/datasets/ol4ubert/confirmed-ukrainian-military-personnel-losses
- Russian personnel losses (KilledInUkraine): https://www.kaggle.com/datasets/ol4ubert/confirmed-russian-military-officers-losses
- Ukrainian losses in Kursk (materiel and personnel) according to the Russian Ministry of Defense: https://www.kaggle.com/datasets/ol4ubert/ukrainian-military-losses-in-kursk-mod-russia
Any comment is welcome. Please use the Discussion feature or send me an email directly.
License: https://www.reddit.com/wiki/api
The data was extracted in order to conduct a sentiment analysis on the Ukraine-Russia war and understand public opinion on the conflict. The posts were found by searching for keywords such as Ukraine, Russia and NATO. Reddit was selected because of the ease with which its users can extract data and because its subreddits provide rich content. To learn more about the PRAW API that was used to extract data, visit [https://praw.readthedocs.io/en/stable/].
The data is not ready for sentiment analysis!!! So your first task would be to clean the data by removing punctuation, tags, URLs, symbols, processing emojis and emoticons, tokenizing, lemmatizing, removing stop words, and so on.
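A minimal sketch of the first cleaning steps, using only the Python standard library, is shown below. It covers lowercasing, URL and tag removal, punctuation stripping, whitespace tokenization, and stop-word filtering; lemmatization and emoji or emoticon handling are left out and would need a dedicated NLP library. The stop-word list and the `clean_comment` name are illustrative assumptions.

```python
import re

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "or", "of", "to", "in", "on"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_RE = re.compile(r"<[^>]+>")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_comment(text: str) -> list[str]:
    """Lowercase, strip URLs, tags and punctuation, tokenize, drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = TAG_RE.sub(" ", text)
    # Drops everything except a-z and whitespace (digits, emojis, symbols too).
    text = NON_LETTER_RE.sub(" ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


print(clean_comment("The talks are ongoing!!! See <b>this</b>: https://example.com"))
```

Because the last regex keeps only ASCII letters, it would also strip non-English text; comments in other languages would need a different character class.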
Then, using sentiment analysis, you can predict the overall positive, negative, and neutral comments. You could come up with your own methods of analyzing data, such as categorizing comments as pro-Russian, neutral, or pro-Ukraine, and so on.
For further details on how I extracted the data, you can refer to my Colab notebook.
License: Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Ukraine and Russia are major exporters of agricultural commodities, including wheat, corn, sunflower oil, and fertilizer. Together, they account for about 30% of the world's wheat exports, 60% of the world's sunflower oil exports, and 20% of the world's corn exports. The war in Ukraine has disrupted global food supplies, as Ukrainian ports have been blocked and Russian exports have been sanctioned. This has led to rising food prices and concerns about food shortages in some countries. The United Nations has warned that the war could have a "devastating impact" on global food security. Here are some specific examples of how the war in Ukraine has affected global food supplies:
- Wheat prices have risen by more than 50% since the start of the war.
- Sunflower oil prices have doubled.
- The price of corn has risen by about 30%.
- The price of fertilizer has risen by more than 100%.
| Columns | Description |
|---|---|
| Domestic wheat supply | National wheat production for domestic consumption. |
| Wheat exports | Quantity of wheat sent to other countries for trade. |
| Wheat imports | Amount of wheat purchased from other countries. |
| Wheat stocks | Remaining supply of wheat within the country. |
| Net wheat imports | Difference between wheat imports and exports. |
| Wheat imports (% domestic supply) | Percentage of wheat imports relative to domestic supply. |
| Wheat stocks (% domestic supply) | Percentage of wheat stocks relative to domestic supply. |
| Wheat exports per capita | Amount of wheat exports per person. |
| Wheat imports per capita | Amount of wheat imports per person. |
| Net wheat imports per capita | Difference between wheat imports and exports per person. |
| Wheat stocks per capita | Amount of wheat stocks per person. |
| Domestic wheat per capita | Amount of domestic wheat production per person. |
| Wheat imports from Ukraine | Quantity of wheat imported from Ukraine. |
| Wheat imports from Russia | Quantity of wheat imported from Russia. |
| Wheat imports from Ukraine + Russia | Combined wheat imports from Ukraine and Russia. |
| Wheat imports from Ukraine per capita | Amount of wheat imports from Ukraine per person. |
| Wheat imports from Russia per capita | Amount of wheat imports from Russia per person. |
| Wheat imports from Ukraine + Russia per capita | Combined wheat imports from Ukraine and Russia per person. |
| Wheat imports from Ukraine (% imports) | Percentage of wheat imports from Ukraine relative to total imports. |
| Wheat imports from Russia (% imports) | Percentage of wheat imports from Russia relative to total imports. |
| Wheat imports from Ukraine + Russia (% imports) | Percentage of wheat imports from Ukraine and Russia relative to total imports. |
| Wheat imports from Ukraine (% supply) | Percentage of wheat imports from Ukraine relative to domestic supply. |
| Wheat imports from Russia (% supply) | Percentage of wheat imports from Russia relative to domestic supply. |
| Wheat imports from Ukraine + Russia (% supply) | Percentage of wheat imports from Ukraine and Russia relative to domestic supply. |
| Domestic maize supply | National maize production for domestic consumption. |
| Maize exports | Quantity of maize sent to other countries for trade. |
| Maize imports | Amount of maize purchased from other countries. |
| Maize stocks | Remaining supply of maize within the country. |
| Net maize imports | Difference between maize imports and exports. |
| Maize imports (% domestic supply) | Percentage of maize imports relative to domestic supply. |
| Maize stocks (% domestic supply) | Percentage of maize stocks relative to domestic supply. |
| Maize exports per capita | Amount of maize exports per person. |
| Maize imports per capita | Amount of maize imports per person. |
| Net maize imports per capita | Difference between maize imports and exports per person. |
| Maize stocks per capita | Amount of maize stocks per person. |
| Domestic maize per capita | Amount of domestic maize production per person. |
| Maize imports from Ukraine | Quantity of maize imported from Ukraine. |
| Maize imports from Russia | Quantity of maize imported from Russia. |
| Maize imports from Ukraine + Russia | Combined maize imports from Ukraine and Russia. |
| Maize imports from Ukraine per capita | Amount of maize imports from Ukraine per person. |
| Maize imports from Russia per capita | Amount of maize imports from Russia per person. |
| Maize imports from Ukraine + Russia per capita | Combined maize imports from Ukraine and Russia per person. |
| Maize imports from Ukraine (% imports) | Percentage of maize imports from Ukraine relative to total imports. |
| Maize imports from R... |
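
Several of the columns above are derived from a few base quantities. The sketch below recomputes some of the wheat columns for a single country. The formulas are inferred from the column descriptions (for example, net imports taken as imports minus exports), so they are assumptions rather than the dataset's documented method. The function name `derive_wheat_columns` and the `population` argument are hypothetical, and the table does not state units.

```python
def derive_wheat_columns(row: dict[str, float], population: float) -> dict[str, float]:
    """Recompute a few derived wheat columns from one country's base quantities."""
    supply = row["Domestic wheat supply"]
    imports = row["Wheat imports"]
    exports = row["Wheat exports"]
    combined = row["Wheat imports from Ukraine"] + row["Wheat imports from Russia"]

    def pct(part: float, whole: float) -> float:
        # Guard against division by zero for countries with no supply or imports.
        return 100.0 * part / whole if whole else 0.0

    return {
        "Net wheat imports": imports - exports,  # assumed sign: imports minus exports
        "Wheat imports (% domestic supply)": pct(imports, supply),
        "Wheat imports per capita": imports / population,
        "Wheat imports from Ukraine + Russia": combined,
        "Wheat imports from Ukraine + Russia (% imports)": pct(combined, imports),
        "Wheat imports from Ukraine + Russia (% supply)": pct(combined, supply),
    }


if __name__ == "__main__":
    # Made-up numbers, chosen only to make the arithmetic easy to follow.
    example = {
        "Domestic wheat supply": 1000.0,
        "Wheat exports": 100.0,
        "Wheat imports": 400.0,
        "Wheat imports from Ukraine": 100.0,
        "Wheat imports from Russia": 60.0,
    }
    for name, value in derive_wheat_columns(example, population=200.0).items():
        print(f"{name}: {value}")
```

Comparing recomputed values with the published columns is a quick consistency check before using the "% supply" and "% imports" figures in an analysis. The maize columns follow the same pattern.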
This repo contains tweets about the war in Ukraine collected from both sides: Russian accounts and Western accounts. I chose some of the best-known Russian propaganda accounts and some of the most representative Western Twitter accounts covering the events in Ukraine, and scraped their tweets from the beginning of the war until May 9, 2022.
You can find insights into how the Russian propaganda machine works: how disinformation is created, how it spreads, and how it targets its audience.
Then you can compare it with the Western perspective on the war. I scraped tweets from both sides to avoid bias.
Ideas: use a naive Bayes classifier to score the sentiment of both sides; identify accounts with similar behavior, such as hate speech or positive speech (clustering); search for top tweets and identify hot topics.
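For the first idea, a small multinomial naive Bayes classifier can be written from scratch. The sketch below assumes you already have tokenized tweets and a handful of hand-labeled examples; the class name `NaiveBayes` and the sample labels are illustrative, and this is not the repo author's code. In practice a library implementation such as scikit-learn's `MultinomialNB` would be the usual choice.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over word counts with add-one (Laplace) smoothing."""

    def fit(self, docs: list[list[str]], labels: list[str]) -> "NaiveBayes":
        self.class_counts = Counter(labels)        # number of documents per class
        self.word_counts = defaultdict(Counter)    # per-class word frequencies
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens: list[str]) -> str:
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus summed log likelihoods; words never seen in
            # training are skipped.
            score = math.log(n_docs / total_docs)
            for t in tokens:
                if t in self.vocab:
                    score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    # Hypothetical hand-labeled training examples.
    docs = [["great", "victory"], ["brave", "defenders"],
            ["terrible", "attack"], ["terrible", "loss"]]
    labels = ["positive", "positive", "negative", "negative"]
    model = NaiveBayes().fit(docs, labels)
    print(model.predict(["terrible", "news"]))
```

Working in log space avoids numeric underflow when tweets contain many words, and add-one smoothing keeps a word seen in only one class from zeroing out the other class entirely.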