Analysis of Key Finding from "Immigration and American Jobs"

Summary

The study "Immigration and American Jobs", written by economist Madeline Zavodny, was published in December 2011. Page 4 of that study lists the following as the first of its four main findings:

...from 2000 to 2007, an additional 100 foreign-born workers in STEM fields with advanced degrees from US universities is associated with an additional 262 jobs among US natives.

This analysis finds significant methodological problems with the study, which indicate that its findings may be unwarranted. It consists of the following three parts:

  1. Part 1 looks at the study's own data and attempts to replicate the above finding.
  2. Part 2 looks at replicating the study's data by extracting it from the original source.
  3. Part 3 looks at updating the study through 2013 and seeing the effect that has on the finding.

Key Issues for 2000-2007

  1. California had over three times as many foreign STEM workers with advanced U.S. degrees as the next highest state, New York, and over a quarter of the total in the United States.
  2. Over a quarter of the data points (one for each state and year) have no such STEM workers.
  3. The native worker employment rate used by the study incorrectly counts as unemployed both those not in the labor force (retired and otherwise) and the self-employed.
  4. The results of the study depend strongly on the addition of dummy variables for state and year. As a result, the model implies that the major factors affecting the native employment rate are the state, the year, and the number of foreign STEM workers with advanced U.S. degrees. No consideration is given to any other factors, including other workers, foreign and native.
  5. After removing the predicted effects of the state and year, both the scatter plot and the correlation coefficient show very weak correlation.
  6. California, which has by far the most such workers, shows a negative correlation, though at -0.1446 it is fairly weak.
  7. New Jersey and Oregon show stronger negative correlations of -0.5969 and -0.5362, respectively.
  8. In many states with a positive correlation, that correlation often reflects a decrease in both the native employment rate and the share of such workers. An extreme example is Michigan.
  9. It is possible to duplicate all key slopes in the study's findings with a model that includes the year, state, and 4 groups of foreign workers with a bachelor's degree or higher.
  10. Of these groups, foreign workers with only a bachelor's degree had a negative slope, meaning that they are associated with a decrease in native jobs.
  11. For more information, conclusions are listed here.
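
Issues 4 and 5 above turn on what remains once the state and year effects are removed. The following is a minimal sketch of that mechanic, written in Python rather than the R used by the analysis, and it is not the analysis's own code. It assumes a balanced panel (one row per state and year), for which subtracting the state mean and the year mean and adding back the grand mean gives the same residuals as regressing on state and year dummies. The column name `immigrant_share`, the helper names, and every number in the toy panel are invented for illustration.

```python
from collections import defaultdict
from math import sqrt


def demean_two_way(rows, key):
    """Remove state and year means from rows[key] (balanced panel only)."""
    by_state, by_year = defaultdict(list), defaultdict(list)
    for r in rows:
        by_state[r["state"]].append(r[key])
        by_year[r["year"]].append(r[key])
    state_mean = {s: sum(v) / len(v) for s, v in by_state.items()}
    year_mean = {y: sum(v) / len(v) for y, v in by_year.items()}
    grand = sum(r[key] for r in rows) / len(rows)
    return [r[key] - state_mean[r["state"]] - year_mean[r["year"]] + grand
            for r in rows]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Toy balanced panel (3 states x 3 years). All numbers are invented.
states = {"A": (1.0, 10.0), "B": (5.0, 20.0), "C": (9.0, 30.0)}  # (share, rate) state effects
years = {2000: 0.0, 2001: 1.0, 2002: 2.0}                        # common year effect
noise = {"A": (1, -1, 0), "B": (-1, 0, 1), "C": (0, 1, -1)}      # within-state variation

rows = []
for s, (share_effect, rate_effect) in states.items():
    for i, (year, year_effect) in enumerate(years.items()):
        e = noise[s][i]
        rows.append({
            "state": s,
            "year": year,
            "immigrant_share": share_effect + year_effect + e,
            "native_emprate": rate_effect + year_effect - 0.5 * e,
        })

raw_r = pearson([r["immigrant_share"] for r in rows],
                [r["native_emprate"] for r in rows])
adj_r = pearson(demean_two_way(rows, "immigrant_share"),
                demean_two_way(rows, "native_emprate"))
print(f"raw r = {raw_r:.4f}, adjusted r = {adj_r:.4f}")
```

The toy panel is built so that the raw correlation is positive while the correlation after removing state and year effects is -1. It only shows that the two can differ; it says nothing about the size or sign of either in the study's data.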

Key Issues for 2000-2013

  1. The selection of 2000-2007 is critical to the slope of 0.0045 on which the 2.6 jobs number depends. For example, 2002-2009 gives a slope of -0.0020 and 2002-2007 gives a slope of 0.0005.
  2. Extending the study through 2013 decreases the slope of the regression from 0.0042 to 0.0034.
  3. Fixing the employment rate decreases the slope of the regression further to 0.0020 and greatly lowers the p-values of the first three regressions, indicating higher significance.
  4. For more information, conclusions are listed here.
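
The difference between the two employment-rate definitions can be sketched with simple arithmetic. This is a minimal Python illustration, not the analysis's R code. It assumes that the study's rate divides wage-and-salary employment by the whole native population, and that the corrected rate counts the self-employed as employed and divides by the labor force only. Both formulas are an interpretation of the issue described above, and the function names and counts are invented.

```python
def emprate_population(wage_salary, population):
    """Study-style rate (assumed): everyone who is not a wage-and-salary
    worker, including the self-employed and those outside the labor force,
    lowers the rate."""
    return wage_salary / population


def emprate_labor_force(wage_salary, self_employed, unemployed):
    """Corrected rate (assumed): employed, including the self-employed,
    divided by the labor force."""
    employed = wage_salary + self_employed
    return employed / (employed + unemployed)


# Invented counts for one hypothetical state and year.
counts = {
    "wage_salary": 600,
    "self_employed": 50,
    "unemployed": 50,
    "not_in_labor_force": 300,  # retired and other
}
population = sum(counts.values())

study_style = emprate_population(counts["wage_salary"], population)
labor_force_style = emprate_labor_force(
    counts["wage_salary"], counts["self_employed"], counts["unemployed"]
)
print(f"study-style rate: {study_style:.3f}")
print(f"labor-force rate: {labor_force_style:.3f}")
```

With these invented counts, the first rate is 600/1000 and the second is 650/700. The first rate therefore moves with retirements and self-employment even when nobody loses a job.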

Following is a table of contents of the three parts with links to the analysis:

Part 1 - Replicating the Finding from the Study's Own Data

  1. The Claim, its Source, and its Mention in the Media
  2. Calculation of the Claim from the Results of a Linear Regression
  3. Key Variables in the Regression
  4. An Initial Look at the Data
  5. Initial Attempt at a Regression Fails Due to Zero Values
  6. Problem with Study's Calculation of the Employment Rate
  7. Removing Zero Values Changes Correlation in Regression from Positive to Negative
  8. The Effect of Dummy or Indicator Variables on the Study
  9. Use of Correlation Coefficients and P-Values
  10. Removing the Effects of Year and State from the Scatter Plot
  11. Removing the Effects of Year from the Scatter Plot
  12. Looking at the States Individually
  13. Regression Statistics
  14. Replicating Other Results from the Study
  15. Fallacy of Composition
  16. First and Last Year are Critical to Results
  17. Correlation Does Not Imply Causation
  18. Conclusions

Part 2 - Replicating the Study's Data from the Original Source

  1. Recap of Prior Analysis
  2. Replicating and Extending the Study's Data
  3. Source of Missing Values that were Converted to Zero
  4. Comparison of Extracted Data to Study's Data
  5. Output Using Extracted Data instead of Author's Data

Part 3 - Updating the Study Through 2013 and Seeing the Effect on the Finding

  1. Recap of Prior Analysis
  2. Updating the Study Through 2013
  3. Slope of Key Regression Cut in Half and Significance of First 3 Regressions Increased
  4. An Initial Look at the Data
  5. Regression with Corrected Employment Rate Shows Negative Correlation
  6. Removing Zero Values Shows Negative Correlation
  7. The Effect of Dummy or Indicator Variables on the Study
  8. Removing the Effects of Year and State from the Scatter Plot
  9. Removing the Effects of Year from the Scatter Plot
  10. Looking at the States Individually
  11. Regression Statistics
  12. Conclusions

Source Code for R Programs Used in this Analysis

Note: The latest versions of the following code can be found on GitHub.

  1. Source code for amjobs.R (function called by following 4 amjobs*.R files)
  2. Source code for amjobs0.R (processes author's data from 2000-2007)
  3. Source code for amjobs07.R (processes CPS MORG data from 2000-2007)
  4. Source code for amjobs13.R (processes CPS MORG data from 2000-2013)
  5. Source code for amjobs13lf.R (processes CPS MORG data from 2000-2013 with modified native_emprate)
  6. Source code for amjobs_coef0.R (coefficients for author's data from 2000-2007)
  7. Source code for amjobs_coef07.R (coefficients for CPS MORG data from 2000-2007)
  8. Source code for amjobs_coef13.R (coefficients for CPS MORG data from 2000-2013)
  9. Source code for amjobs_coef13lf.R (coefficients for CPS MORG data from 2000-2013 with modified native_emprate)
  10. Source code for amjobsg.R (various functions)
  11. Source code for morg07.R (extracts CPS MORG data from 2000-2007)
  12. Source code for morg13.R (extracts CPS MORG data from 2000-2013)
  13. Source code for morg13lf.R (extracts CPS MORG data from 2000-2013 with modified native_emprate)

Articles and Blog Posts Referencing this Analysis

  1. The Trouble with State-by-State Analyses of H-1B - Norm Matloff, November 19, 2014
  2. The paradox of job creation - Beryl Lieff Benderly, February 5, 2015
  3. Update on the Zavodny Job-Creation Research - Norm Matloff, February 6, 2015
  4. Sold Out: How High-Tech Billionaires & Bipartisan Beltway Crapweasels Are Screwing America's Best & Brightest Workers (Chapter 3) - Michelle Malkin and John Miano, November 10, 2015
