Chapter 7 Conclusion

7.1 Work done

We have discussed possible relationships among several socioeconomic factors through geographic analysis, dependency relation analysis, and time series analysis. We have also illustrated the most revealing findings at both the county level and the state level.

7.2 Limitations

Many of the variables examined in our project are averages over a time period. For example, the percentages of the civilian labor force in different occupational industries are all averages over 2014 to 2018, as are the percentages of persons at different education levels, the deep poverty rate, and per capita income. Because yearly values are not available for these variables, we also had to average the unemployment rate, the most important socioeconomic factor discussed in our project, in order to compare it with them. In doing so, we lose much of the information on how variables such as education levels, occupational employment shares, and the deep poverty rate affect the unemployment rate from year to year.
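The information lost by averaging can be illustrated with a minimal sketch. The two counties and their yearly unemployment rates below are hypothetical values made up for this illustration, not figures from our dataset, and the helper name yearly_change is likewise an assumption. The sketch shows two series with opposite trends that collapse to the same 2014 to 2018 average.

```python
from statistics import mean

# Hypothetical yearly unemployment rates (percent) for 2014-2018.
# These values are invented for illustration; they are NOT from the project's data.
YEARS = [2014, 2015, 2016, 2017, 2018]
county_a = [8.0, 7.0, 6.0, 5.0, 4.0]  # rate falls every year
county_b = [4.0, 5.0, 6.0, 7.0, 8.0]  # rate rises every year


def yearly_change(rates):
    """Return the average year-over-year change in percentage points."""
    diffs = [later - earlier for earlier, later in zip(rates, rates[1:])]
    return mean(diffs)


# The period averages are identical for both counties ...
mean_a = mean(county_a)
mean_b = mean(county_b)

# ... even though the yearly trends point in opposite directions.
trend_a = yearly_change(county_a)
trend_b = yearly_change(county_b)

print("period averages:", mean_a, mean_b)
print("average yearly change:", trend_a, trend_b)
```

With only the period averages available, these two hypothetical counties would be indistinguishable, which is the kind of yearly detail that our averaged variables cannot capture.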

To identify a trend or correlation among socioeconomic factors, we normally need data covering a relatively long time frame. However, a considerable number of variables have data for only a single year. One example is the poverty rate, another important socioeconomic factor included in our project. Because it is available only for 2018, any correlation we find between the poverty rate and other variables reflects only the general pattern in 2018.

7.3 Future directions

We could look into apparent outliers or unusual trends and try to work out the reasons behind them. To do so, we need to find other data sources that cover all of these socioeconomic factors over a relatively long time frame, so that we can explore how the factors change in different regions of the United States. Based solely on the plots in this project, we can find only some plausible correlations among socioeconomic, geographical, and other factors. To gain a deeper and more comprehensive understanding of the dependencies among these factors, we need to build statistical models or apply machine learning algorithms to this dataset.