Analyzing the War in Ukraine Using R. An exploratory analysis of the conflict using the tidyverse and maps!
In 2014, pro-Russian separatists in the Donetsk and Luhansk Oblasts of Ukraine declared independence. A conflict ensued between the self-declared Donetsk and Luhansk People’s Republics and the Ukrainian government. The conflict is complex and involves a great many variables. People have written books, articles, and all kinds of opinion pieces about it. I was curious about what the data show.
After some searching, I came across The Humanitarian Data Exchange: https://data.humdata.org/. There’s a lot of really interesting data available there. Luckily for me, a very nice and fairly granular data set concerning the Ukrainian conflict was available. Each row in the data set represents a kinetic event. The data span from the Euromaidan revolution of 2014 up to 2018. I read the data into R and munged it a bit by removing some columns I wasn’t interested in (I left that part out of the code below). We’ll begin by checking the data for NAs (make sure you load the tidyverse!):
```r
library(tidyverse)

## read in the data
conflict <- read_csv('conflict_data_ukr - Raw.csv')

## make.names on the columns
colnames(conflict) <- make.names(colnames(conflict))

## check for NA's in the column adm_1
sum(is.na(conflict$adm_1)) # 119
conflict[is.na(conflict$adm_1), ]

## all of the NAs in adm_1 are for the donets basin area which
## isn't contained in one oblast.
## replace NA's in adm_1 w/ 'Donets Basin Area'
conflict$adm_1 <- conflict$adm_1 %>% replace_na('Donets Basin Area')

any(is.na(conflict)) ## FALSE
```
The “adm_1” variable records which oblast the event occurred in (an oblast is a first-level administrative region, roughly comparable to a province). I was concerned about NAs in that specific column. The sum(is.na(...)) check told me I had 119 NAs there. Subsetting those rows showed that all of them were events in the “Donets Basin Area”, based on the information in the “where_coordinates” column. I replaced the NAs in the adm_1 column with “Donets Basin Area” and then renamed some columns:
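The same check-inspect-replace pattern can be tried on a small made-up table before running it on the real file. This is only a sketch: the `toy` tibble and its values are invented for illustration, and only the column names `adm_1` and `where_coordinates` come from the data set described above.

```r
library(tidyverse)

## toy stand-in for the conflict data (values are made up for illustration)
toy <- tibble(
  adm_1 = c("Donetsk oblast", NA, "Luhansk oblast", NA),
  where_coordinates = c("Donetsk town", "Donets Basin area",
                        "Luhansk town", "Donets Basin area")
)

## count the NAs in adm_1
n_missing <- sum(is.na(toy$adm_1))

## look at where the rows with a missing adm_1 took place
missing_places <- toy %>%
  filter(is.na(adm_1)) %>%
  count(where_coordinates)

## fill in the missing region and confirm nothing is left
toy <- toy %>%
  mutate(adm_1 = replace_na(adm_1, "Donets Basin Area"))

any(is.na(toy))
```

Counting `where_coordinates` among the missing rows is a quick way to confirm that every NA really does share one location before overwriting it with a single label.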
```r
## rename some columns
conflict <- conflict %>%
  rename(gov.forces = side_a,
         op.forces = side_b,
         kia = deaths_a,
         ekia = deaths_b,
         civcas = deaths_civilians,
         region = adm_1)
```
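Note that rename() takes its arguments as new_name = old_name. A minimal sketch on invented numbers shows the direction of the rename, followed by one possible first summary of the renamed columns. The `toy` and `by_region` objects are hypothetical, and the per-region totals are my own illustration rather than a step from the original analysis.

```r
library(tidyverse)

## toy rows using the original column names (numbers are invented)
toy <- tibble(
  adm_1 = c("Donetsk oblast", "Donetsk oblast", "Luhansk oblast"),
  deaths_a = c(1, 0, 2),
  deaths_b = c(3, 1, 0),
  deaths_civilians = c(0, 2, 1)
)

## rename() takes new_name = old_name pairs
toy <- toy %>%
  rename(kia = deaths_a,
         ekia = deaths_b,
         civcas = deaths_civilians,
         region = adm_1)

## total each casualty column per region
by_region <- toy %>%
  group_by(region) %>%
  summarise(across(c(kia, ekia, civcas), sum), .groups = "drop")

by_region
```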
A data scientist/analyst in the making needs to format and clean data before being able to perform any kind of exploratory data analysis.