Jan. 21, 2019
1,085 words long, 5 minute read
As I have talked about before, I think Reddit can be a great source for finding interesting content. The insights, feedback, and advice people give there are often really helpful. The personal finance subreddit is one of the largest and has an active community. There are questions from all aspects of personal finance. You could see someone $100k+ in debt and another person trying to decide what to do with a multi-million dollar inheritance. It truly ranges across the board.
I began to wonder what the most common questions were. With such a large number of people asking questions, there had to be repeats and similar questions. With enough data, you could crowdsource a list of the topics people care about most for their finances. As that question hung in my head, I got to work.
Being a developer, I started figuring out a way to get a lot of titles from Reddit. For the data to be useful, I had to process the titles in a way that would make sense. Here are the criteria I came up with:
The idea was that the most upvoted questions would be the ones people cared about the most. Then I wanted to look for commonality between the posts to find topics that people were concerned about. It's not the most scientific, but I think it does the trick.
I will save you the technical details, but I wrote a script that did just that. I got 1,000 posts at a time in 30-day increments. From there, I used natural language processing libraries to help filter the data and find common phrases. I then exported all of this data into an Excel spreadsheet for easier processing.
From the 6,000 posts I analyzed, only 2,231 were phrased as questions. This was still quite a large number of topics to glean insights from, so I felt good about the number. I was surprised that roughly 63% of posts weren't questions. I chalked this up to my semi-scientific way of filtering for questions, which was checking whether the post title ended in a question mark.
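The question filter described above is simple enough to sketch in a few lines. This is a minimal illustration, not the original script: the function name `is_question` and the sample titles are made up for the example.

```python
def is_question(title: str) -> bool:
    """Treat a title as a question if it ends in a question mark."""
    # Strip trailing whitespace so "…fund? " still counts.
    return title.strip().endswith("?")


# Hypothetical sample titles, invented for illustration only.
titles = [
    "Should I pay off my car loan early?",
    "Finally paid off my student loans!",
    "How much should I keep in an emergency fund? ",
    "Is this a good idea? Asking for a friend",
]

questions = [t for t in titles if is_question(t)]
share = len(questions) / len(titles)
print(f"{len(questions)} of {len(titles)} titles look like questions ({share:.0%})")
```

Note that the last sample title is a question but is not counted because it does not end in a question mark, which is one way a filter like this can undercount.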
From those 2,231 posts here were some interesting insights.
The first step was to figure out what topics people were asking about. To do this, I counted up the most common single words while filtering out common words like articles and conjunctions (the, and, etc.) and pronouns (I, she, etc.). Here were the top 10 most common:
|Word||Number of times seen|
It can be a bit difficult to decipher any real questions from this list, but there are some interesting insights starting to appear. It's obvious that people are concerned with debt, as credit, car, student, and debt are all debt-related topics. People also seem to be asking about their jobs (pay and job are both in the list), which isn't too surprising. I was a bit surprised not to see more retirement-related topics in this top 10. I imagine that is because retirement is a broad topic and people ask about it in many different ways.
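For readers curious about the single-word count, here is a rough sketch of how it could be done with Python's standard library. The stop word list, the tokenizing regex, and the sample titles are all assumptions for illustration; the original script used natural language processing libraries that are not named in this post.

```python
import re
from collections import Counter

# A small, illustrative stop word list (an assumption; the real list is not given).
STOP_WORDS = {
    "the", "a", "an", "and", "or", "i", "my", "she", "he",
    "to", "of", "is", "in", "on", "for", "should", "than",
}


def count_words(titles):
    """Count how often each non-stop word appears across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase and keep runs of letters, digits, and apostrophes.
        words = re.findall(r"[a-z0-9']+", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# Hypothetical titles, invented for illustration only.
titles = [
    "Should I pay off my credit card or my car loan?",
    "Is credit card debt worse than student debt?",
]

word_counts = count_words(titles)
for word, seen in word_counts.most_common(10):
    print(word, seen)
```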
The next step was to find the phrases people were using. This would help me see more question-like insights. I also wanted to see if there was some overlap with the single-word topics. Here are the top 10 phrases for both two-word and three-word phrases:
|to buy a|
|a credit card|
|my credit card|
|my credit score|
|credit card debt|
|buy a house|
|a roth ira|
|can i afford|
|how much should|
|off student loans|
|student loan debt|
A few more interesting topics show up here, and the longer phrases are more retirement-focused. There are still a ton of debt-focused questions. It must be what people are struggling with most and need the most help with. There are also a lot of questions about how much someone can afford, house affordability, etc.
I think that the two- and three-word phrases had a bit more context. The single words function much more like categories than like what people are actually looking for. By the time you get to three-word phrases, you have a good idea of what the topic actually is.
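Counting two- and three-word phrases is the same idea as counting single words, just over sliding windows of adjacent words. The sketch below is an assumption about how this could be done, not the original script; the helper names `ngrams` and `count_phrases` and the sample titles are hypothetical.

```python
import re
from collections import Counter


def ngrams(title, n):
    """Return every run of n consecutive words in a title, lowercased."""
    words = re.findall(r"[a-z0-9']+", title.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def count_phrases(titles, n):
    """Count how often each n-word phrase appears across all titles."""
    counts = Counter()
    for title in titles:
        counts.update(ngrams(title, n))
    return counts


# Hypothetical titles, invented for illustration only.
titles = [
    "Can I afford to buy a house?",
    "Should I buy a house or pay off student loans?",
]

two_word = count_phrases(titles, 2)
three_word = count_phrases(titles, 3)
print(three_word.most_common(3))
```

Stop words are left in on purpose here, since phrases such as "can i afford" only make sense with them.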
Now that I had these lists, it was time to find these phrases in the long list of questions I had. From there I would be able to try and distill the commonalities between the questions.
With the most common topics and phrases in hand, I then went and read the titles of the posts myself. This part was all manual and gave me a bit more insight into how people were phrasing questions, their context, etc. I used the search function in Excel to find topics and phrases and the questions that surrounded them.
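The lookup described above was done by hand in Excel, but a scripted equivalent is easy to imagine. This is only a sketch under that assumption: `titles_by_phrase` is a hypothetical helper and the titles are invented.

```python
def titles_by_phrase(titles, phrases):
    """Map each phrase to the titles containing it (case-insensitive substring match)."""
    lowered = [(title, title.lower()) for title in titles]
    return {
        phrase: [title for title, low in lowered if phrase.lower() in low]
        for phrase in phrases
    }


# Hypothetical titles, invented for illustration only.
titles = [
    "How do I get out of credit card debt?",
    "Can I afford to buy a house on one income?",
    "Should I buy a house or keep renting?",
]

groups = titles_by_phrase(titles, ["credit card debt", "buy a house"])
for phrase, matches in groups.items():
    print(f"{phrase}: {len(matches)} title(s)")
```

Because this is a plain substring match, a phrase like "credit card" would also match "credit cards", which is usually what you want when grouping questions by topic.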
After doing some more analysis, I simplified most of the questions. There were a lot of similarities so I did my best to try and group questions together. Here is my final list (in no particular order):
The end results look pretty similar to what we saw before. I will say that when going through by hand I noticed there were more 401k-specific questions than my initial analysis surfaced. I suspect this is because many people spelled 401k differently (with parentheses, a space in between, etc.), which my script did not account for. If I continue to do this analysis, I will improve the script so I have to do less parsing myself.
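One possible way to handle the 401k spelling problem is to normalize the variants before counting. The regular expression below is a sketch of that idea, not a fix that was actually applied to the script; the function name `normalize_401k` is hypothetical.

```python
import re

# Matches spellings such as "401k", "401K", "401(k)", "401 k", "401-k", and "401 (k)".
# The pattern is deliberately loose, so it would also rewrite unlikely text like "401 kids".
PATTERN_401K = re.compile(r"\b401\s*-?\s*\(?k\)?", re.IGNORECASE)


def normalize_401k(title):
    """Rewrite common 401k spellings to the single form '401k'."""
    return PATTERN_401K.sub("401k", title)


# Hypothetical titles, invented for illustration only.
for title in ["Should I max out my 401(k)?", "401 K vs Roth IRA", "Rolling over a 401-k"]:
    print(normalize_401k(title))
```

Running the titles through a step like this before the word and phrase counts would let all of the variants land in the same bucket.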
I am also going to try and answer more of these questions on this site with calculators and posts. It's obvious debt is a huge concern for people. After seeing these results I think there is a big need for more tools to help people understand their debt and pay it off faster. Be on the lookout for these questions to be answered for you on this site.
What do you think? Do these questions look like topics you are concerned with? Have any other questions? Let us know!