First we are going to rename the Geographic Area Name to State in the asm data. We also want to replace the full name of the states with their abbreviations. For example Alabama with AL
## [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "DC" "FL" "GA" "HI" "ID" "IL" "IN"
## [16] "IA" "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH"
## [31] "NJ" "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT"
## [46] "VT" "VA" "WA" "WI" "WY"
## `summarise()` has grouped output by 'NaicsIndustry', 'BorrState'. You can
## override using the `.groups` argument.
## State Year Total_Fringe_Benefits Total_Materials_Cost Cost_Materials_Used
## 1 0 0 0 0 0
## Cost_Resales Inventories_Beginning_Year Finished_Goods_Beginning
## 1 0 0 15
## Work_In_Process_Beginning Materials_Supplies_Beginning Inventories_End_Year
## 1 15 0 0
## Finished_Goods_End Work_In_Process_End Materials_Supplies_End
## 1 5 5 0
## Total_Capital_Expenditures Capital_Buildings Capital_Machinery
## 1 0 3 3
## Capital_Automobiles Capital_Computers Capital_Other sum_gross_approval
## 1 3 3 1 0
## sum_jobs_supported sum_third_party_dollars
## 1 0 0
First we are going to make a graph on the sum of jobs supported by each state. We are going to separate by year, and make separate graphs for each of the years so we can see how, over time,these loans are supporting jobs in the manufacturing industry for each respective state.
## Warning in sf_column %in% names(g): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 172 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
We are going to add another column named region and group the states by region.
## `summarise()` has grouped output by 'region'. You can override using the
## `.groups` argument.
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
## Warning: line.color doesn't (yet) support data arrays
The first step is filtering the data to focus on the manufacturing industry and selecting relevant columns.
## Rows: 12,444
## Columns: 6
## $ ApprovalFiscalYear <int> 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2010, 2…
## $ ProjectState <chr> "FL", "MA", "WI", "NJ", "TN", "TX", "CA", "UT", "…
## $ TermInMonths <int> 240, 240, 120, 240, 240, 240, 240, 240, 240, 240,…
## $ GrossApproval <int> 2019000, 211000, 174000, 446000, 762000, 899000, …
## $ JobsSupported <int> 21, 4, 10, 7, 50, 10, 20, 8, 15, 15, 20, 0, 12, 5…
## $ GrossChargeOffAmount <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
To analyze loan trends, the data is grouped by fiscal year. We calculate the total gross approval and average loan term for each year, to create a line graph of loan trends over the years.
We will rescale the total loan amount to millions to make the values easier to interpret.
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
For our next visual, we grouped states into regions to avoid overcrowding and analyze average loan amounts and jobs supported more clearly.
To make the average loan amounts easier to compare across regions, we rescaled them to a log scale, which helps show differences more clearly.
## `geom_smooth()` using formula = 'y ~ x'
For our last visual in this part, we are analyzing and visualizing the total charge-off amounts by state, rescaled to millions for easier interpretation.
We are going to highlight states with significant charge-off amounts using a color gradient, where lighter shades represent lower amounts and darker shades indicate higher amounts. This gradient will help emphasize the states with the highest loan charge-offs.
Let’s start by trying to create a map of the Total Capital Expenditures by Each State and year. In order to do this we have to first prepare the US States Map to recognize the abbreviations. Next, we have to summarize the data bu the state and year.
## `summarise()` has grouped output by 'State'. You can override using the
## `.groups` argument.
Next, lets merge the map data with the States and create the map and filter out any NA values.
## Warning in left_join(., asm_summary, by = "State"): Detected an unexpected many-to-many relationship between `x` and `y`.
## ℹ Row 1 of `x` matches multiple rows in `y`.
## ℹ Row 5 of `y` matches multiple rows in `x`.
## ℹ If a many-to-many relationship is expected, set `relationship =
## "many-to-many"` to silence this warning.
Finally, let’s generate an interactive map of total capital expenditure by state and year.
Let’s see if there is a relationship between the total fringe benefits that employees recieve and the total capital expenditures for that year. First let’s pick which variables to include
## `geom_smooth()` using formula = 'y ~ x'
## NULL
We are going to look at which states and which year had the highest material cost based on the state.
## `summarise()` has grouped output by 'Industry'. You can override using the
## `.groups` argument.