Military Spending Analysis

Pandas Couple
7 min read · Apr 11, 2022

A quick and insightful data analysis of countries’ military spending.

Photo by Levi Meir Clancy on Unsplash

Wars are part of human history, being recorded and studied since antiquity.

Wars are armed conflicts that arise for many reasons, such as religious disagreements, political and economic interests, territorial disputes, and ethnic rivalries.

Unfortunately, the world is living through this terrible part of history again with the current conflict between Russia and Ukraine, and that was the motivation for this work: analyzing countries’ military and defense spending.

Through the data, we will look at how countries behave in terms of military expenditure and compare them, in order to gain insights into a topic that has come up again this year. Let’s go!

Data Collection

Data were collected from the following sources: Our World in Data (ourworldindata) and the World Bank (worldbank).

The Our World in Data datasets contain information on countries’ military expenditures up to 2020.

The World Bank datasets contain information on countries’ populations and GDP up to 2020.

To proceed with this analysis, we did some data engineering, which we will not detail in this article. Feel free to head over to my GitHub to see how it was done in full!

The final dataset used in this work

This is the result of our data engineering, which joined all the datasets into a single table.
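The full engineering lives in the repository, but the joining step can be sketched roughly as follows. This is only a minimal illustration: the column names (`country`, `year`, `military_expenditure`, `gdp`, `population`) and the toy values are assumptions made for the example, not the real data or the exact code we used.

```python
import pandas as pd

# Toy stand-ins for the three sources; names and values are illustrative only.
military = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": [2019, 2020, 2020],
    "military_expenditure": [10.0, 12.0, 30.0],
})
gdp = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": [2019, 2020, 2020],
    "gdp": [500.0, 520.0, 900.0],
})
population = pd.DataFrame({
    "country": ["A", "A", "B", "C"],
    "year": [2019, 2020, 2020, 2020],
    "population": [1000, 1010, 2500, 400],
})

# Inner joins on (country, year) keep only the rows present in every source,
# so country "C", which has no military or GDP data, is dropped.
final = (
    military
    .merge(gdp, on=["country", "year"], how="inner")
    .merge(population, on=["country", "year"], how="inner")
)
print(final)
```

An inner join is one possible choice here. A left join would keep countries with missing GDP or population and leave those cells empty instead.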

Exploratory Data Analysis

Let’s use the dataset above to answer some interesting questions!

1. What share of its GDP does a country’s military expenditure represent?

The idea here is to express each country’s military spending as a percentage of its total GDP.

Dataset with the column we created to answer the first question
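A column like this can be derived with a single division. The sketch below assumes hypothetical column names and made-up values, with military expenditure and GDP expressed in the same currency unit.

```python
import pandas as pd

# Made-up values; both money columns are assumed to share one currency unit.
df = pd.DataFrame({
    "country": ["A", "B", "C"],
    "year": [2020, 2020, 2020],
    "military_expenditure": [12.0, 30.0, 4.0],
    "gdp": [600.0, 1000.0, 50.0],
})

# Military expenditure as a percentage of GDP.
df["military_pct_gdp"] = df["military_expenditure"] / df["gdp"] * 100
print(df[["country", "military_pct_gdp"]])
```

Notice that country C spends the least in absolute terms but has the highest percentage, because its GDP is so small.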

Let’s look at this data graphically, filtered to the most recent year.

The ten countries with the highest military spending as a percentage of GDP in 2020
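A ranking like this one can be obtained by filtering to the latest year and keeping the largest values. Here is a minimal sketch with made-up numbers, where the column name `military_pct_gdp` is an assumption for the example.

```python
import pandas as pd

# Made-up values for three hypothetical countries over two years.
df = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C"],
    "year": [2019, 2020, 2019, 2020, 2019, 2020],
    "military_pct_gdp": [2.1, 2.0, 2.9, 3.0, 7.5, 8.0],
})

latest_year = df["year"].max()

# Keep only the latest year, then take up to ten rows with the largest values.
top = (
    df[df["year"] == latest_year]
    .nlargest(10, "military_pct_gdp")
    .reset_index(drop=True)
)
print(top)
```

From there, something like `top.plot.barh(x="country", y="military_pct_gdp")` would draw a horizontal bar chart, assuming matplotlib is installed.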

The list is made up only of Asian and African countries: these are the countries with the highest military expenditures as a percentage of GDP.

These countries have relatively low GDP, so when military expenditure is expressed as a percentage of GDP, their percentages are higher than those of countries that spend far more in absolute terms.

Anyway, it’s a surprise not to see any North American or European country on this list, don’t you agree?

2. What is the per capita military expenditure?

Now let’s analyze military expenditure per capita and compare the ten countries with the highest values, as we did in the previous topic.

Dataset with the column we created to answer the second question
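The per capita column follows the same pattern as the percentage column, dividing by population instead of GDP. The column names and numbers below are assumptions for illustration.

```python
import pandas as pd

# Made-up values; expenditure is in currency units, population in people.
df = pd.DataFrame({
    "country": ["A", "B"],
    "year": [2020, 2020],
    "military_expenditure": [1_000_000.0, 9_000_000.0],
    "population": [500, 90_000],
})

# Military expenditure per inhabitant.
df["military_per_capita"] = df["military_expenditure"] / df["population"]
print(df[["country", "military_per_capita"]])
```

Country A spends nine times less than B in total, yet far more per inhabitant, which is why the per capita ranking can look so different from the absolute one.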

Let’s look at this data graphically, filtered to the most recent year.

The ten countries with the highest military spending per capita in 2020

This list is more diverse and includes some major world powers from North America, Europe and Asia.

See how countries like Israel and the United States spend almost twice as much per capita as the last countries on this list. Can you imagine the gap for countries that are not even on the list?

3. Which ten countries had the highest military expenditures in the last year?

Now let’s analyze something simpler but still interesting: the absolute expenditure of countries.

Let’s follow the same line of reasoning as in the previous topics so that we can compare.

The ten countries with the highest absolute military spending in 2020

How surprising is that? US spending in absolute terms far outstrips that of every other country on the list, but we see China approaching!

Now, instead of looking at the values just for the last year, let’s check the values over all the years available in the dataset.

The ten countries with the highest absolute military expenditures over the years

We see that the United States has always invested much more than other countries.

One thing that made me curious about this graph is the set of extreme points in Russia’s boxplot. Shall we investigate?
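As a side note, the extreme points in a boxplot are usually the values lying more than 1.5 times the interquartile range (IQR) beyond the quartiles, which is the default whisker rule in matplotlib and seaborn. Assuming that default, the same points can be flagged numerically. The values below are made up.

```python
import pandas as pd

# Made-up yearly spending for one hypothetical country.
spending = pd.Series([10, 11, 12, 12, 13, 14, 40], name="military_expenditure")

q1 = spending.quantile(0.25)
q3 = spending.quantile(0.75)
iqr = q3 - q1

# Values outside [q1 - 1.5 * IQR, q3 + 1.5 * IQR] are drawn as extreme points.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = spending[(spending < lower) | (spending > upper)]
print(outliers)
```

Applied to one country's rows, a filter like this shows which years produced the extreme points.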

Unfortunately, we don’t have data for Russia before 1988. Could it be because of how closed the government was during the Soviet era?

The data starts in 1988, which may be a sign of the government opening up under Mikhail Gorbachev, the Soviet leader at the time. It was during his leadership that the USSR began to decline, amid his attempts to reform the system. The USSR came to an end in 1991. Makes sense?

Anyway, the extreme values that we saw in Russia’s boxplot come from this period, from the late 1980s into the 1990s, when the USSR was still standing. After the USSR ceased to exist, we see a long decline.

This leads us to believe that the USSR spent much more on its military than Russia did afterwards. I also imagine that the country went through severe financial crises during this time of turmoil and transition, hence the sudden drop.

4. Were there extreme values in countries’ military expenditures in years when there were wars?

We know that there have been wars in this period, and it would be plausible for countries to increase their military spending during them. Let’s see whether the data tells us that.

The first year for which we have data

Between 1949 and 2020 there was no world war, since the Second World War ended in 1945. From 1947 to 1991, however, we had the Cold War, which was a long conflict and may not be the best case for drawing the kind of conclusion we are after in this topic.

Anyway, let’s plot a graph between Russia and the United States to see if we can get any evidence.
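One convenient way to prepare a comparison like this is to keep only the countries of interest and reshape the table so that each country becomes a column indexed by year. The sketch below uses hypothetical country labels and made-up values.

```python
import pandas as pd

# Made-up values for three hypothetical countries.
df = pd.DataFrame({
    "country": ["X", "Y", "Z", "X", "Y", "Z"],
    "year": [1990, 1990, 1990, 1991, 1991, 1991],
    "military_expenditure": [300.0, 200.0, 5.0, 280.0, 150.0, 6.0],
})

selected = ["X", "Y"]

# One row per year, one column per selected country.
wide = (
    df[df["country"].isin(selected)]
    .pivot(index="year", columns="country", values="military_expenditure")
)
print(wide)
```

With the data in this shape, `wide.plot()` would draw one line per country over time, assuming matplotlib is installed.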

As we suspected, it is difficult to confirm the pattern we are looking for using the Cold War as an example. What we can say is that the United States spends much more than Russia, according to the data we have.

A major conflict of the 21st century was the one between the United States and extremist groups based in Afghanistan and Pakistan. This conflict lasted from around 2001 to 2011. Let’s check it out graphically.

Once again, we were unable to draw significant conclusions. We do see a possible upward trend in US spending in this period, but we cannot say the same for the other two countries. Again, what stands out is the huge gap between the military spending of the United States and that of the other two.
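A simple numerical complement to the chart is to compare each country's average spending inside and outside the conflict years. The sketch below is only one possible approach, not what we did in the project. The 2001 to 2011 window comes from the text above, while the column names, country labels and values are made up.

```python
import pandas as pd

# Made-up values for two hypothetical countries.
df = pd.DataFrame({
    "country": ["X"] * 4 + ["Y"] * 4,
    "year": [1999, 2000, 2001, 2002] * 2,
    "military_expenditure": [100.0, 100.0, 140.0, 160.0, 10.0, 10.0, 10.0, 10.0],
})

# Label each row as inside or outside the period (between is inclusive).
df["period"] = df["year"].between(2001, 2011).map({True: "inside", False: "outside"})

# Average spending per country and period, one column per period label.
means = (
    df.groupby(["country", "period"])["military_expenditure"]
    .mean()
    .unstack("period")
)

# A ratio above 1 means higher average spending during the period.
means["ratio"] = means["inside"] / means["outside"]
print(means)
```

A ratio well above 1 hints at increased spending during the period, but it does not separate the effect of the conflict from long-term growth, so it should be read with care.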

5. How do military expenditures correlate with GDP and population?

Finally, let’s check the degree of correlation between the variables used in this project: military expenditure, population, GDP and the two that we created (military expenditure as a percentage of GDP and military expenditure per capita).

Pearson correlation between dataset variables

For context, the correlation coefficient ranges from -1 to 1. Two variables are positively correlated if a change in one is associated with a change in the other in the same direction, and negatively correlated if a positive change in one is associated with a negative change in the other. A coefficient of 0 means that the variables have no linear association with each other.

As we can see, population and GDP are both very highly correlated with a country’s military spending. We also see that GDP and population are highly correlated with each other, but that was to be expected, right?
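To make these three cases concrete, here is a small sketch using pandas’ `DataFrame.corr` with the Pearson method on made-up columns. It also shows that a coefficient of 0 only rules out a linear relationship: the `squared` column depends entirely on `x`, yet their Pearson correlation is 0.

```python
import pandas as pd

# Made-up columns built from x to show the three cases.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
df = pd.DataFrame({
    "x": x,
    "same_direction": [2 * v + 1 for v in x],   # rises whenever x rises
    "opposite_direction": [-v for v in x],      # falls whenever x rises
    "squared": [v ** 2 for v in x],             # depends on x, but not linearly
})

corr = df.corr(method="pearson")
print(corr["x"])
```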

Let’s see this in a scatterplot!

GDP x Military Expenditure
Population x Military Expenditure

Notice the linear trends between the variables, as the correlation coefficient suggested, though we also see some noise.

Would we be able to predict countries’ military expenditures with just these variables, or would we need others for a predictive model to work well?

I think we would have to try it to be sure. Who knows, maybe in a future project? I imagine it would be very interesting to try to predict these values!
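As a starting point for anyone who wants to try, a first baseline could be a straight-line fit of military expenditure against GDP, scored with R². This is only a sketch on made-up numbers using NumPy’s `polyfit`. It is not a model we trained in this project, and a real attempt would need a train/test split and probably more variables.

```python
import numpy as np

# Made-up values: GDP and military expenditure in the same arbitrary unit.
gdp = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
military = np.array([0.021, 0.039, 0.062, 0.079, 0.101])

# Least-squares straight line: military ≈ slope * gdp + intercept.
slope, intercept = np.polyfit(gdp, military, 1)
predicted = slope * gdp + intercept

# R²: share of the variance in military expenditure explained by the line.
ss_res = np.sum((military - predicted) ** 2)
ss_tot = np.sum((military - military.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(slope, intercept, r2)
```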

Conclusion

Through this analysis we explored several questions in which military expenditure, population and GDP data help us obtain insights that could support better decision-making in a company or even a government.

Also, in this project, we went through essential steps of any data-related work. Let’s recap:

  • We made a brief introduction
  • We collected the data
  • We did data engineering to join the datasets and work with a final table
  • We did an extensive exploratory analysis and answered interesting questions
  • We drew our conclusions

In the end, we left a question in the air: how well can a predictive model perform with this dataset? Would we have to collect more samples? Add more variables? Or maybe create new variables from the ones we already have, as we did in this project?

Anyway, that is a topic for future work. You, reader, should feel free to continue it: do more exploratory analysis, answer other questions, and even try the predictive analysis we talked about.

Thank you for your time!

Let me know if you have any questions or feedback. I’d love to hear from you!

Pandas Couple

A couple of data scientists contributing to the Data Science community.