Who gets bigger home loans?

Federal Housing Finance Agency Data Study

Data gathering, data scrubbing, analysis, visualizations and conclusions are all by me, Michael Moss, as a final project for my data analytics class. Below is the summary of the study, followed by links to the slideshow I prepared for the final presentation, the Python code, and the raw data on Google Drive.

Summary / Main Inquiry: The Federal Housing Finance Agency provides a number of datasets covering various aspects of the real estate lending market for research purposes. Using the 2019 Federal Home Loan Bank Public Use Database (https://www.fhfa.gov/DataTools/Downloads/Documents/FHLBank-PUDB/2019_PUDB_EXPORT_123119.csv), this capstone project explores the relationship, if any, between the borrower’s employment status (employed or self-employed) and the total amount of the mortgage loan.

Correlation between Employed or Self-Employed Status and Mortgage Loan Amount

The key statistic for comparison is the mortgage amount, a continuous variable. It is first split by the independent variable, employed vs. self-employed, and can then be broken down further by zip code, issuing bank, and the race and ethnicity of the borrowers.
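As a rough illustration of that first split, the sketch below groups loan amounts by employment status and reports the count and mean for each group. It is only a sketch: the column names "self_employed" and "loan_amount" are hypothetical stand-ins, not the actual field names in the FHFA file, and the five rows are toy values made up for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Toy rows for illustration only; NOT taken from the FHFA file.
# The column names "self_employed" and "loan_amount" are hypothetical.
SAMPLE_CSV = """self_employed,loan_amount
0,200000
0,240000
1,180000
1,260000
1,250000
"""


def mean_amount_by_group(rows, group_col, amount_col):
    """Return {group value: (count, mean amount)} for the given rows."""
    amounts = defaultdict(list)
    for row in rows:
        amounts[row[group_col]].append(float(row[amount_col]))
    return {group: (len(vals), mean(vals)) for group, vals in amounts.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = mean_amount_by_group(rows, "self_employed", "loan_amount")
for group, (n, avg) in sorted(summary.items()):
    print(f"self_employed={group}: n={n}, mean loan amount={avg:,.0f}")
```

The same function could be reused for the further breakdowns by passing a different grouping column, such as a zip code or issuing bank field.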

There are close to 90,000 entries per year. I have only chosen 2019 so far, but could add more years to increase the sample sizes if needed.

Audience: This information would be helpful to borrowers researching a loan or lenders looking to determine the amount of risk on a certain loan. This information could also be used by government agencies to conduct policy review and implementation. On a personal note, as a self-employed translator and localization professional for the better part of the last decade, I am curious to see the actual data. Limited access to capital is a real drawback of working in the gig economy that is not often addressed.

Statistics: 

Hypothesis Pair 1: The null hypothesis is that the mean mortgage amount for borrowers with a salaried job is equal to the mean mortgage amount for borrowers who are classified as self-employed. The alternative hypothesis is that the mean mortgage amount for borrowers with a salaried job is not equal to the mean mortgage amount for borrowers who are classified as self-employed.

Hypothesis Pair 2: The null hypothesis is that the mean mortgage amount for principal residences is equal to the mean mortgage amount for investment properties and second homes. The alternative hypothesis is that the mean mortgage amount for principal residences is not equal to the mean mortgage amount for investment properties and second homes.

As the mortgage amount is a continuous variable, the independent samples t-test will be used to test whether there is a significant difference in the mean mortgage amount between borrowers with a salaried job and borrowers classified as self-employed.
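To make the test concrete, here is a minimal sketch of an independent samples t-test written with only the Python standard library. It assumes the Welch (unequal variance) variant, which is my choice for the illustration and not necessarily the variant used in the project notebook. The function name welch_t_test and the sample values are hypothetical, and the p-value uses a normal approximation that is only reasonable for large samples such as the roughly 90,000 entries here. In practice a library routine such as scipy.stats.ttest_ind would normally be used instead.

```python
import math
from statistics import NormalDist, mean, variance


def welch_t_test(a, b):
    """Welch's unequal-variance t-test for two independent samples.

    Returns (t statistic, degrees of freedom, approximate two-sided p-value).
    """
    n_a, n_b = len(a), len(b)
    se2_a = variance(a) / n_a  # squared standard error of each sample mean
    se2_b = variance(b) / n_b
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    # Normal approximation to the t distribution; only suitable for large samples.
    p_approx = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, df, p_approx


# Toy loan amounts for illustration only; NOT taken from the FHFA file.
employed = [200000, 240000, 210000, 230000]
self_employed = [180000, 260000, 250000, 270000]

t_stat, dof, p_value = welch_t_test(employed, self_employed)
print(f"t = {t_stat:.3f}, df = {dof:.1f}, approximate p = {p_value:.3f}")
```

If the p-value falls below the chosen significance level (0.05 is a common choice), the null hypothesis of equal mean mortgage amounts would be rejected.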

View final presentation in slideshow.

View code in notebook.

View code in GitHub repository.