Analyzing 6,000+ Reddit posts to find the most common questions about personal finance

Jan. 21, 2019

As I have talked about before, I think Reddit can be a great source for finding interesting content. The insights, feedback, and advice people give there are often really helpful. The personal finance subreddit is one of the largest and has an active community. There are questions from all aspects of personal finance. You could see someone $100k+ in debt and another person trying to decide what to do with a multi-million dollar inheritance. It truly ranges across the board.

I began to wonder what the most commonly asked questions were. With so many people asking questions, there had to be repeats and near-duplicates. With enough data, you could crowdsource a list of the topics people care about most for their finances. As that question hung in my head, I got to work.

The criteria

Being a developer, I started figuring out a way to get a lot of titles from Reddit. For the data to be useful, I had to process the titles in a way that would make sense. Here are the criteria I ended up with:

  • 1,000 most upvoted posts from each month for the last 6 months (6,000 total)
  • Only use titles that were phrased as a question
  • Look for common phrases and subjects

The idea was that the most upvoted questions would be the ones people cared about the most. Then I wanted to look for commonalities between the posts to find topics that people were concerned about. It's not the most scientific approach, but I think it does the trick.

I will save you the technical details, but I wrote a script that did just that. I pulled 1,000 posts at a time in 30-day increments. From there, I used natural language processing libraries to help filter the data and find common phrases. I then exported all of this data into an Excel spreadsheet for easier processing.
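For readers who do want a taste of the technical details, here is a minimal sketch of the "top 1,000 posts per 30-day window" selection. It is not the original script: it assumes the posts have already been downloaded into a list of dictionaries, and the keys `created`, `score`, and `title` as well as the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def top_posts_per_window(posts, end, windows=6, days=30, limit=1000):
    """Keep the `limit` highest-scoring posts in each of `windows`
    consecutive `days`-long windows, counting backwards from `end`.

    `posts` is assumed to be a list of dicts with a timezone-aware
    `created` datetime and an integer `score` (hypothetical shape).
    """
    selected = []
    for i in range(windows):
        window_end = end - timedelta(days=days * i)
        window_start = window_end - timedelta(days=days)
        in_window = [p for p in posts if window_start <= p["created"] < window_end]
        # Highest score first, then cut off at the per-window limit.
        in_window.sort(key=lambda p: p["score"], reverse=True)
        selected.extend(in_window[:limit])
    return selected
```

With the defaults, this yields at most 6 x 1,000 = 6,000 posts, which matches the total in the criteria above.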

The initial results

Of the 6,000 posts I analyzed, only 2,231 were phrased as questions. This was still quite a large number of topics to glean insights from, so I felt good about the number. I was surprised that over 60% of posts (3,769 of 6,000) weren't questions. I chalked this up to my semi-scientific way of filtering for questions, which was checking whether the post title ended in a question mark.
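The question-mark check is simple enough to sketch in a few lines. This is an illustration under the assumption that titles are plain strings; the function names are made up for this example and are not from the original script.

```python
def is_question(title):
    """Treat a title as a question if it ends in a question mark,
    ignoring trailing whitespace."""
    return title.strip().endswith("?")


def filter_questions(titles):
    """Return only the titles that pass the question-mark check."""
    return [t for t in titles if is_question(t)]


if __name__ == "__main__":
    sample = [
        "Should I pay off my car loan early?",
        "Finally debt free after 4 years",
        "How much should I keep in savings?  ",
    ]
    print(filter_questions(sample))
```

A check like this misses questions phrased as statements ("Need advice on my 401k") and titles where the question mark is followed by extra text, which may explain part of the low share of questions.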

From those 2,231 posts, here are some interesting insights.

Most common topics

The first step was to figure out what topics people were asking about. To do this, I counted the most common single words while filtering out common words like articles and conjunctions (the, and, etc.) and pronouns (I, she, etc.). Here are the top 10:

Word       Number of times seen
credit     252
pay        159
money      114
card       109
car        90
student    88
401k       86
debt       77
job        76
account    69

It can be a bit difficult to decipher any real questions from this list, but some interesting insights are starting to appear. It's obvious that people are concerned with debt, as credit, car, student, and debt are all debt-related topics. People also seem to be asking about their jobs (pay and job are both in the list), which isn't too surprising. I was a bit surprised not to see more retirement-related topics in this top 10; only 401k made the list. I imagine this is because people ask about retirement in many different ways, as it is a broad topic.
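The single-word count can be sketched with nothing but the standard library. This is a rough illustration rather than the original script, which used natural language processing libraries; the stop-word set below is a small hypothetical sample, not the list actually used.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set (an assumption); a real run would
# use a much fuller list from an NLP library.
STOP_WORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "for", "on",
    "is", "it", "i", "my", "me", "you", "she", "he", "we", "they",
    "should", "how", "what", "do", "can",
}


def tokenize(title):
    """Lowercase a title and split it into word-like tokens."""
    return re.findall(r"[a-z0-9']+", title.lower())


def top_words(titles, n=10):
    """Count non-stop-word tokens across all titles, most common first."""
    counts = Counter(
        word
        for title in titles
        for word in tokenize(title)
        if word not in STOP_WORDS
    )
    return counts.most_common(n)
```

Calling `top_words(question_titles)` on the filtered titles would produce (word, count) pairs in the same shape as the table above.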

Most common phrases

The next step was to find the phrases people were using. This would help me see more question-like insights. I also wanted to see if there was some overlap with the single-word topics. Here are the top 10 for both 2-word and 3-word phrases:

2-word phrases:

Phrase
credit card
to pay
my credit
pay off
the best
credit score
student loans
to buy a
a house
a car

3-word phrases:

Phrase
a credit card
my credit card
my credit score
credit card debt
buy a house
a roth ira
can i afford
how much should
off student loans
student loan debt

The longer phrases surface a few more interesting topics and are more retirement-focused (a Roth IRA shows up, for example). There are still a ton of debt-focused questions; debt must be what people are struggling with most and need the most help with. There are also a lot of questions about how much someone can afford, such as house affordability.

I think the two- and three-word phrases carry a bit more context. The single words function more like categories than a picture of what people are actually looking for. By the time you get to 3-word phrases, you have a good idea of what the topic actually is.
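Counting phrases like these amounts to counting word n-grams (runs of n consecutive words). Here is a minimal sketch, assuming plain string titles; the helper names are illustrative and not from the original script. Stop words are deliberately kept, since the lists above include phrases such as "to pay" and "a house".

```python
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into word-like tokens."""
    return re.findall(r"[a-z0-9']+", title.lower())


def ngrams(words, n):
    """Return every run of n consecutive words, joined by spaces."""
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def top_phrases(titles, n, k=10):
    """Count n-word phrases across all titles, most common first."""
    counts = Counter()
    for title in titles:
        counts.update(ngrams(tokenize(title), n))
    return counts.most_common(k)
```

Running `top_phrases(question_titles, 2)` and `top_phrases(question_titles, 3)` would give the two kinds of lists shown above.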

Now that I had these lists, it was time to find these phrases in my long list of questions. From there, I could try to distill the commonalities between the questions.

The most commonly asked questions about personal finance

Starting from the most common topics and phrases, I then read the titles of the posts myself. This part was all manual and gave me a bit more insight into how people were phrasing questions, the context around them, and so on. I used the search function in Excel to find topics and phrases and the questions that surrounded them.

After doing some more analysis, I simplified most of the questions. There were a lot of similarities, so I did my best to group questions together. Here is my final list (in no particular order):

  • Should I get a credit card? / Will I qualify for a credit card?
  • How can I best pay off my credit card?
  • How can I improve my credit score?
  • Why did my credit score drop?
  • When can I buy a house? / Should I buy a house?
  • How much should I be saving?
  • How much can I afford for a house/car/etc.?
  • Should I be contributing to a Roth IRA?
  • Should I pay off X debt vs Y debt first?
  • Should I pay off debt faster?
  • Should I max out X account?

The end results look pretty similar to what we saw before. I will say that when going through by hand, I noticed there were more 401k-specific questions than my initial analysis surfaced. I suspect this is because many people spelled 401k differently (with parentheses, with a space in between, etc.), which my script did not account for. If I continue to do this analysis, I will improve the script so I have to do less parsing myself.
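One possible way to handle the 401k spelling problem is to normalize the variants before counting. This is a sketch of an improvement, not something the original script did; the pattern and the function name are my own assumptions about which spellings matter.

```python
import re

# Matches "401k", "401K", "401 k", "401(k)", and "401 (k)".
# The trailing \b keeps phrases like "401 kids" from being merged.
_401K_PATTERN = re.compile(r"\b401\s*(?:\(\s*k\s*\)|k\b)", re.IGNORECASE)


def normalize_401k(title):
    """Rewrite common spellings of 401(k) to the single token '401k'."""
    return _401K_PATTERN.sub("401k", title)


if __name__ == "__main__":
    for raw in ["Should I max my 401(k)?", "401 K rollover", "Is a 401k match free money?"]:
        print(normalize_401k(raw))
```

Applying this to each title before tokenizing means all of these spellings are counted under the same word.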

I am also going to try to answer more of these questions on this site with calculators and posts. It's obvious debt is a huge concern for people. After seeing these results, I think there is a big need for more tools to help people understand their debt and pay it off faster. Be on the lookout for answers to these questions on this site.

What do you think? Do these questions look like topics you are concerned with? Have any other questions? Let us know!
