On March 29, 2018, the San Jose Mercury News published a story titled "H-1B: Silicon Valley doesn’t get majority of controversial visas, report says". It begins:
The controversial H-1B work visa used heavily by Silicon Valley tech firms to acquire talent is much more widely used by companies based in New York and Texas, according to new research.
From 2010 to 2016, almost a third of the visas, which are intended for workers in jobs requiring specialized knowledge and a bachelor’s degree or higher, went to businesses in the New York City area, the Pew Research Center reported Thursday.
The referenced Pew Research Center Study is titled "East Coast and Texas metros had the most H-1B visas for skilled workers from 2010 to 2016" and begins as follows:
The employment of high-skilled foreign workers with H-1B visas centered in large East Coast metropolitan areas from fiscal years 2010 to 2016. These foreign workers also made up a significant part of the workforces in several Texas metro areas, according to a Pew Research Center analysis of previously unpublished metro-level government data of H-1B visa approvals obtained through a public records request.
The H-1B visa program is the nation’s largest temporary employment visa program. About 247,900 H-1B visa approvals – 29% of the nation’s total – went to employers in the New York City metro area from fiscal 2010 to 2016 (the most recent years for which data are available at the metropolitan level). The Dallas and Washington metro areas (74,000 and 64,800 approvals, respectively) had the next-highest totals, with Boston (38,300 approvals) also among the top metro areas by this measure. The data, obtained from U.S. Citizenship and Immigration Services, include details of those approved for an H-1B visa.
In the next paragraph, the article states a remarkable finding that College Station, Texas "stands far above the rest, with about 32 H-1B approvals per 100 workers". It also mentions that "more than 99% of the metro area’s H-1B approvals went to employees of Cognizant Technology Solutions Corp., whose U.S. headquarters is in College Station".
This last finding especially seemed to merit additional investigation. The U.S. Citizenship and Immigration Services website links to PDF and CSV files showing the number of approved H-1B petitions by employer for 2015 through 2017, but the data does not contain the location of the employers or the H-1B workers. The following code outputs the first 20 employers in the file:
import pandas as pd
csvfile = 'https://www.uscis.gov/sites/default/files/USCIS/Resources/Reports%20and%20Studies/Immigration%20Forms%20Data/BAHA/Approved_H1B_2017_Employers_3.2.18.csv'
aa = pd.read_csv(csvfile, skiprows = 4)
aa.columns = ['TaxID','Employer','Approved','Salary','Degree','Subcount','Misc']
aa = aa.iloc[:, 0:4]
print(aa[aa.Employer.str.strip() != ""].head(20))
As can be seen, the employers appear to be listed in descending number of approvals. Also, Cognizant did have the top number of approved H-1B visas in 2017.
In any event, the location for both the employer and the H-1B workers is contained in disclosure files for the LCA programs. LCA stands for Labor Condition Application and is a form employers must file with the U.S. Department of Labor for H-1B, H-1B1 (Singapore and Chile) and E-3 (Australia) work visas.
The Pew study gives more information about the LCA program. If you click on "About this analysis" in the study, you'll see a section that begins:
The H-1B visa program allows U.S. employers to hire foreigners to work for up to six years in jobs that require highly specialized knowledge, and workers’ employment may be extended if they have green card applications pending. To participate, employers first submit applications to the U.S. Department of Labor attesting that no U.S. worker would be displaced by the prospective foreign worker. The application is then reviewed by U.S. Citizenship and Immigration Services (USCIS) before the State Department interviews the foreign worker and issues the visa.
This section goes on to describe the source for this data used in the study. It states:
The data on H-1B visa approvals were obtained from USCIS through a Freedom of Information Act request and were received in November 2017. Approvals that are subject to the annual cap account for nearly all (99.9%) approvals in this analysis. The FOIA data does not distinguish between approvals for initial and continuing employment. It includes employer names and employer location (city and state) but excludes worker location, which could be at the employer location or another location. This analysis assumes H-1B visa approvals are for foreigners who will work at the employer location.
Hence, the study looks at the employer names and location, not the worker location. It states that it "assumes H-1B visa approvals are for foreigners who will work at the employer location". The analysis below will show that this is a very flawed assumption. It will look at the LCA files to obtain information about the worker location. These files can be found on the U.S. Department of Labor website under the Disclosure Data tab. The following Python code loads the LCA file for 2017 in order to look at the location data.
At some point between November 9, 2018 and January 13, 2019, the "About this analysis" functionality in the Pew Study appears to have stopped working. Clicking on "ABOUT THIS ANALYSIS" (where it's capitalized) in a November 9, 2018 archive of the page causes text that includes the excerpts listed in the prior section to be displayed. On the other hand, clicking on it in a January 13, 2019 archive of the page has no effect. However, the excerpts listed in the prior section can be seen in the January 13th archive via the following steps:
The first of the following lines should be highlighted:
> <h4 class="collapsible-title">...</h4>
</div>
> <div class="content" style="display:none;">...</div>
Click the arrow on the left of the last line. That should display the following lines.
> <p>...</p>
> <p>...</p>
> <p>...</p>
> <p>...</p>
> <p>...</p>
> <p>...</p>
Repeating the above steps for the November 9, 2018 archive of the page will reveal similar but slightly different code. I'm not a JavaScript expert, so I would appreciate the opinion of anyone who can look at the code and judge the precise purpose of the change that was made. However, I did find that, if you change the "display:none;" to "display:block;" in the third line above (this can be done via right-click and selecting "Edit as HTML"), the hidden text will remain visible.
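The manual edit described above can also be scripted. The following is a minimal sketch, assuming the archived page's HTML is already available as a string. The function name reveal_hidden and the sample fragment are hypothetical and only mimic the structure shown above; they are not taken from the Pew page itself.

```python
import re

def reveal_hidden(html):
    """Switch inline 'display:none' styles to 'display:block'."""
    # Allow optional whitespace after the colon, as in 'display: none'.
    return re.sub(r'display:\s*none', 'display:block', html)

# Hypothetical fragment that mimics the collapsed section shown above.
sample = ('<h4 class="collapsible-title">ABOUT THIS ANALYSIS</h4>'
          '<div class="content" style="display:none;"><p>Hidden text</p></div>')

revealed = reveal_hidden(sample)
print(revealed)
```

This only rewrites inline styles. If a page hides the text through a stylesheet or a script instead, a different approach would be needed.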
The H-1B LCA Data is from https://www.foreignlaborcert.doleta.gov/pdf/PerformanceData/2017/H-1B_Disclosure_Data_FY17.xlsx and is linked to at https://www.foreignlaborcert.doleta.gov/performancedata.cfm (under the Disclosure Data tab). This data needs to be copied to the local directory initially. It can take several minutes to load this XLSX file the first time so the code writes the needed subset of the data out to a CSV file which loads quickly in subsequent runs.
import pandas as pd
import os.path
import time
csv_file = 'H-1B_FY17.csv'
xlsx_file = 'H-1B_Disclosure_Data_FY17.xlsx'
if (os.path.isfile(csv_file)):
    # Fast path: the CSV subset was written by an earlier run.
    start = time.time()
    yy = pd.read_csv(csv_file)
    end = time.time()
    print('Seconds to load: ' + str(end - start))
else:
    # Slow path: read the full XLSX file (this can take several minutes).
    start = time.time()
    zz = pd.read_excel(xlsx_file)
    end = time.time()
    print('Seconds to load: ' + str(end - start))
    zz.info()
    # Keep only the columns needed below and cache them as a CSV file.
    yy = zz.loc[:, ['CASE_STATUS','VISA_CLASS','EMPLOYER_NAME','EMPLOYER_CITY','EMPLOYER_STATE','EMPLOYER_POSTAL_CODE',
                    'SOC_NAME','TOTAL_WORKERS','WORKSITE_CITY','WORKSITE_COUNTY','WORKSITE_STATE','WORKSITE_POSTAL_CODE']]
    yy.to_csv(csv_file)
The following code limits the applications to just those that have been certified and outputs the first five rows of the data.
xx = yy[yy['CASE_STATUS'] == 'CERTIFIED']
xx.head()
The following code describes the fields in the current data.
xx.info()
The following code shows the counties which requested the most H-1B workers in 2017:
def checkCounty(county, state):
    ee = xx[xx['WORKSITE_COUNTY'].str.contains(county, na=False)]
    ff = ee[ee['WORKSITE_STATE'].str.contains(state, na=False)]
    gg = ff.groupby(['WORKSITE_COUNTY','WORKSITE_STATE']).agg({'TOTAL_WORKERS':'sum'})
    ss = gg.sort_values(by='TOTAL_WORKERS',ascending=False)
    print(ss[ss['TOTAL_WORKERS'] > 9].head(10))
checkCounty('','')
As can be seen, Santa Clara County, the key county in Silicon Valley, requested the most H-1B workers. In fact, it requested more than double the number requested by New York County. This does not necessarily mean that they got the most approved but it certainly calls for further investigation.
For any specific company, the following function will list the locations that request the most H-1B workers by both the EMPLOYER_CITY (typically, the location of the company headquarters) and the WORKSITE_CITY (the location where the workers are to actually work).
def checkEmployer(empname):
    ee = xx[xx['EMPLOYER_NAME'].str.contains(empname, na=False)]
    gg = ee.groupby(['EMPLOYER_NAME','EMPLOYER_CITY','EMPLOYER_STATE']).agg({'TOTAL_WORKERS':'sum'})
    ss = gg.sort_values(by='TOTAL_WORKERS',ascending=False)
    print(ss[ss['TOTAL_WORKERS'] > 9].head(10))
    gg = ee.groupby(['EMPLOYER_NAME','WORKSITE_CITY','WORKSITE_STATE']).agg({'TOTAL_WORKERS':'sum'})
    ss = gg.sort_values(by='TOTAL_WORKERS',ascending=False)
    print(ss[ss['TOTAL_WORKERS'] > 9].head(10))
If the function is called with an empty string, the function will look at all employers as shown below.
checkEmployer("")
As can be seen, Cognizant did request the second highest number of H-1B workers when the requests are grouped by EMPLOYER_CITY. However, the second list shows that Apple in Cupertino is the employer which requested the most workers when the requests are grouped by WORKSITE_CITY. In addition, Nvidia and Qualcomm Atheros also are in the top ten for worksite cities in Santa Clara County. In any case, this function is used to focus in on Cognizant below.
Following is the result of running the function for Cognizant:
checkEmployer("COGNIZANT TECH")
As can be seen, College Station, Texas was listed as the EMPLOYER_CITY for all of the H-1B workers requested by Cognizant in 2017. However, the WORKSITE_CITY appears to be distributed all throughout the country. The only Texas location in the top 10 was Irving which is in the Dallas-Fort Worth metropolitan area. Following is what the Pew study said about College Station:
When looking at the footprint of high-skilled foreign workers by metro area, College Station, Texas, stands far above the rest, with about 32 H-1B approvals per 100 workers. (More than 99% of the metro area’s H-1B approvals went to employees of Cognizant Technology Solutions Corp., whose U.S. headquarters is in College Station.) By comparison, no other metro area had more than five H-1B approvals per 100 workers.
From the wide distribution of worksite cities shown above, the 32 H-1B approvals per 100 workers for College Station is totally mistaken. Nearly all of the workers appear to have been working in other locations. It is the working population of those other locations that the Cognizant workers should have been compared against.
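The effect of dividing by the wrong workforce can be seen in a small sketch. All of the numbers below are made up purely to keep the arithmetic easy; none come from the LCA files or the Pew study. The area names, the workforce figures, and the helper per_100_workers are hypothetical.

```python
import pandas as pd

# Made-up requests: one employer based in 'Smalltown' that places most
# of its workers in 'Big City'. Not real data.
requests = pd.DataFrame({
    'EMPLOYER_AREA': ['Smalltown', 'Smalltown', 'Smalltown'],
    'WORKSITE_AREA': ['Smalltown', 'Big City', 'Big City'],
    'TOTAL_WORKERS': [10, 200, 90],
})
# Made-up total workforce for each area.
workforce = pd.Series({'Smalltown': 1000, 'Big City': 100000})

def per_100_workers(df, area_col, workforce):
    """Requested H-1B workers per 100 local workers, grouped by area_col."""
    totals = df.groupby(area_col)['TOTAL_WORKERS'].sum()
    return (100 * totals / workforce.reindex(totals.index)).round(2)

by_employer = per_100_workers(requests, 'EMPLOYER_AREA', workforce)
by_worksite = per_100_workers(requests, 'WORKSITE_AREA', workforce)
print(by_employer)
print(by_worksite)
```

Grouped by employer area, all 300 workers are charged to Smalltown's 1,000-person workforce, giving 30 per 100 workers. Grouped by worksite, Smalltown has 1 per 100 and Big City has 0.29 per 100.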
Is Cognizant the only such miscalculation? To answer this, it's useful to look at the other employers with the most H-1B approvals in 2017. Following are the results for Tata Consultancy, the second on the list:
checkEmployer("TATA CONSULT")
As can be seen, the EMPLOYER_CITY for Tata Consultancy is Rockville, Maryland. According to the Wikipedia entry for the Washington metropolitan area, Rockville is a part of that area. As can be seen in the interactive table in the Pew study, the Washington metropolitan area is the area with the third largest number of approvals. Note that the table can be sorted by approvals using the arrow above that column. In any case, once again the actual worksites of the H-1B workers appear to be spread throughout the country, at least according to the original LCA requests.
Following are the results for Infosys Limited, the employer with the third most H-1B approvals in 2017:
checkEmployer("INFOSYS LIMITED")
As can be seen, the EMPLOYER_CITY for Infosys Limited is Plano, Texas. According to the Wikipedia entry for the Dallas–Fort Worth–Arlington metropolitan area, Plano is a part of that area. As can be seen in the interactive table in the Pew study, this metropolitan area is the area with the second largest number of approvals. Once again the actual worksites of the H-1B workers appear to be spread throughout the country.
Following are the results for Wipro Limited, the employer with the fourth most H-1B approvals in 2017:
checkEmployer("WIPRO")
As can be seen, the EMPLOYER_CITY for Wipro Limited is East Brunswick, New Jersey. According to the Wikipedia entry for East Brunswick, it's a part of the New York-Newark-Jersey City metropolitan area. As can be seen in the interactive table in the Pew study, this metropolitan area is the area with the largest number of approvals. Once again the actual worksites of the H-1B workers appear to be spread throughout the country.
Following are the results for Deloitte Consulting, the employer with the fifth most H-1B approvals in 2017:
checkEmployer("DELOITTE CONSULT")
As can be seen, the EMPLOYER_CITY for Deloitte Consulting is Philadelphia, Pennsylvania. As can be seen in the interactive table in the Pew study, the Philadelphia-Camden-Wilmington metropolitan area is the area with the sixth largest number of approvals. Once again the actual worksites of the H-1B workers appear to be spread throughout the country.
Following are the results for Accenture LLP, the employer with the sixth most H-1B approvals in 2017:
checkEmployer("ACCENTURE")
As can be seen, the EMPLOYER_CITY for Accenture LLP is Chicago, Illinois. As can be seen in the interactive table in the Pew study, the Chicago-Naperville-Elgin metropolitan area is the area with the seventh largest number of approvals. Once again the actual worksites of the H-1B workers appear to be spread throughout the country.
The following table shows the employers who made the 6 largest requests for H-1B workers in 2017 and their ranking for H-1B visa approvals in the Pew study. The last column shows the rank of metropolitan area by population as shown in the Wikipedia entry for U.S. metropolitan areas.
Employer                     Approved  Salary Employer_City       Metropolitan Area               Pew Rank Population Rank
---------------------------- -------- ------- ------------------- ------------------------------- -------- ---------------
COGNIZANT TECH SOLNS US CORP   28,908  85,429 COLLEGE STATION, TX College Station-Bryan, TX              5             187
TATA CONSULTANCY SVCS LTD      14,697  73,505 ROCKVILLE, MD       Washington-Arlington-Alexandria        3               6
INFOSYS LTD                    13,408  85,717 PLANO, TX           Dallas-Fort Worth-Arlington            2               4
WIPRO LIMITED                   6,529  75,082 EAST BRUNSWICK, NJ  New York-Newark-Jersey City            1               1
DELOITTE CONSULTING LLP         6,027 106,797 PHILADELPHIA, PA    Philadelphia-Camden-Wilmington         6               7
ACCENTURE LLP                   5,070  83,573 CHICAGO, IL         Chicago-Naperville-Elgin               7               3
                                                                  San Jose-Sunnyvale-Santa Clara        10              35
Hence, those six companies appear to play a major role in the results in the Pew study. As shown above, however, the H-1B workers in all six were spread among numerous metropolitan areas. For this reason, it would seem that the Pew study's conclusions on which metropolitan areas had the most H-1B visas for skilled workers from 2010 to 2016 are largely unfounded. As stated in the study itself, it is based on the locations of the company headquarters, not the locations where the H-1B workers are actually working. It is the workforce of the worksite locations that the number of approvals should be compared against.
All six of the companies listed above are IT consulting firms. The study's assumption that "H-1B visa approvals are for foreigners who will work at the employer location" may be less unreasonable for companies whose chief product is not consulting and/or services. Of those companies listed in the first table above, Amazon, Microsoft, Google, Apple, and Facebook had the majority of their requested workers slated to work at the employer location. The percentages ranged from 67% for Google to 86% for Amazon. The largest exception was Intel which had only 17% (1110 of 6586 requested workers) slated to work in the employer city of Santa Clara with 2666 slated for Hillsboro, OR and 1218 slated for Folsom, CA. Cisco had just 47% (3531 of 7583) slated to work in the employer city of San Jose, CA. Hence, the assumption is better for non-consulting companies but still has problems. In any event, companies whose main business is consulting obtain the great majority of H-1B workers and the assumption that their H-1B workers will work in the employer city appears to be totally wrong for them.
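The percentages above can be computed with a short helper. This is a minimal sketch: the function share_at_employer_city is hypothetical, and the small table below is built from the Intel figures quoted above rather than loaded from the LCA file. The 'OTHER' row is a placeholder that lumps together the remaining 1,592 requested workers (6,586 minus the three named cities).

```python
import pandas as pd

def share_at_employer_city(df):
    """Percent of requested workers whose worksite matches the employer city and state."""
    same_place = ((df['WORKSITE_CITY'] == df['EMPLOYER_CITY']) &
                  (df['WORKSITE_STATE'] == df['EMPLOYER_STATE']))
    at_employer = df.loc[same_place, 'TOTAL_WORKERS'].sum()
    return 100.0 * at_employer / df['TOTAL_WORKERS'].sum()

# Intel figures quoted above; 'OTHER'/'XX' is a placeholder for all other worksites.
intel = pd.DataFrame({
    'EMPLOYER_CITY': ['SANTA CLARA'] * 4,
    'EMPLOYER_STATE': ['CA'] * 4,
    'WORKSITE_CITY': ['SANTA CLARA', 'HILLSBORO', 'FOLSOM', 'OTHER'],
    'WORKSITE_STATE': ['CA', 'OR', 'CA', 'XX'],
    'TOTAL_WORKERS': [1110, 2666, 1218, 1592],
})

share = share_at_employer_city(intel)
print(f"{share:.0f}%")  # 17%
```

On the real data, the same function could be applied to the rows selected for one employer. Exact string matching on city names is an assumption and would miss spelling variants of the same city.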
The above analysis looks at the difference between the Employer City and the Worksite City for specific employers. This helps show the effect of companies whose main business is consulting on the metropolitan areas listed in the Pew study. This section looks at the employer cities, worksite cities, and worksite counties with the largest H-1B worker requests since the Pew study likewise gives numbers by metropolitan area, ignoring specific employers.
Following is Python code that looks at the Employer Cities requesting the most H-1B workers in 2017, followed by the resulting table and bar chart:
import matplotlib.pyplot as plt
%matplotlib inline
gg = xx.groupby(['EMPLOYER_CITY','EMPLOYER_STATE']).agg({'TOTAL_WORKERS':'sum'})
ss = gg.sort_values(by='TOTAL_WORKERS',ascending=False)
ss.to_csv('EMPCITY'+'.csv', sep=',')
ss['1000s_OF_WORKERS'] = ss['TOTAL_WORKERS'] / 1000
print(ss.iloc[:,[1]].head(10))
ss.index = ss.index.get_level_values(0)+", "+ss.index.get_level_values(1)
plt.grid(zorder=0)
plt.barh(range(10),ss['1000s_OF_WORKERS'].head(10),zorder=3)
plt.yticks(range(10), ss.index[:10])  # label only the ten bars drawn
plt.gca().invert_yaxis()
plt.xlabel('Thousands of H-1B Workers Requested')
plt.title("Employer Cities Requesting the Most H-1B Workers")
Following is Python code that looks at the Worksite Cities (instead of the Employer Cities) requesting the most H-1B workers in 2017, followed by the resulting table and bar chart:
gg = xx.groupby(['WORKSITE_CITY','WORKSITE_STATE']).agg({'TOTAL_WORKERS':'sum'})
ss = gg.sort_values(by='TOTAL_WORKERS',ascending=False)
ss.to_csv('WORKCITY'+'.csv', sep=',')
ss['1000s_OF_WORKERS'] = ss['TOTAL_WORKERS'] / 1000
print(ss.iloc[:,[1]].head(10))
ss.index = ss.index.get_level_values(0)+", "+ss.index.get_level_values(1)
plt.grid(zorder=0)
plt.barh(range(10),ss['1000s_OF_WORKERS'].head(10),zorder=3)
plt.yticks(range(10), ss.index[:10])  # label only the ten bars drawn
plt.gca().invert_yaxis()
plt.xlabel('Thousands of H-1B Workers Requested')
plt.title("Worksite Cities Requesting the Most H-1B Workers")
As can be seen above, looking at worksite instead of employer cities causes Philadelphia to drop from first with 131,266 requests to tenth with 18,648 requests. Philadelphia is the location of the headquarters for Deloitte Consulting. Also, College Station (TX), Sunnyvale (CA), Plano (TX), and Rockville (MD) drop out of the top ten. These are the locations of the headquarters for Cognizant Technology, HCL America, Infosys Limited, and Tata Consultancy Services Limited, respectively.
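The two rankings can also be placed side by side in one table, which makes drops like Philadelphia's easier to spot. The following is a minimal sketch that uses the same column names as the LCA data. The function compare_city_totals and the toy rows are hypothetical; the city names and counts are invented for illustration.

```python
import pandas as pd

def compare_city_totals(df):
    """Requested workers per city, counted once as employer city and once as worksite city."""
    emp = (df.groupby(['EMPLOYER_CITY', 'EMPLOYER_STATE'])['TOTAL_WORKERS'].sum()
             .rename('AS_EMPLOYER_CITY').rename_axis(['CITY', 'STATE']))
    work = (df.groupby(['WORKSITE_CITY', 'WORKSITE_STATE'])['TOTAL_WORKERS'].sum()
              .rename('AS_WORKSITE_CITY').rename_axis(['CITY', 'STATE']))
    # Outer join on (CITY, STATE); a city missing from one side gets 0.
    both = pd.concat([emp, work], axis=1).fillna(0).astype(int)
    return both.sort_values('AS_EMPLOYER_CITY', ascending=False)

# Invented rows: a headquarters city that places most workers elsewhere.
toy = pd.DataFrame({
    'EMPLOYER_CITY': ['HQTOWN', 'HQTOWN', 'HQTOWN', 'TECHVILLE'],
    'EMPLOYER_STATE': ['TX', 'TX', 'TX', 'CA'],
    'WORKSITE_CITY': ['HQTOWN', 'TECHVILLE', 'ELSEWHERE', 'TECHVILLE'],
    'WORKSITE_STATE': ['TX', 'CA', 'NY', 'CA'],
    'TOTAL_WORKERS': [5, 60, 35, 40],
})

table = compare_city_totals(toy)
print(table)
```

In the toy data, HQTOWN has 100 requested workers as an employer city but only 5 as a worksite city, the same pattern seen above for the headquarters cities of the consulting firms.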
Also notable is that three of the worksite cities (San Jose, Santa Clara, and Cupertino) are in Santa Clara County, California. This is the location of Silicon Valley. The following Python code looks at the worksite counties requesting the most H-1B workers in 2017, followed by the resulting table and bar chart:
gg = xx.groupby(['WORKSITE_COUNTY','WORKSITE_STATE']).agg({'TOTAL_WORKERS':'sum'})
ss = gg.sort_values(by='TOTAL_WORKERS',ascending=False)
ss.to_csv('WORKCOUNTY'+'.csv', sep=',')
ss['1000s_OF_WORKERS'] = ss['TOTAL_WORKERS'] / 1000
print(ss.iloc[:,[1]].head(10))
ss.index = ss.index.get_level_values(0)+", "+ss.index.get_level_values(1)
plt.grid(zorder=0)
plt.barh(range(10),ss['1000s_OF_WORKERS'].head(10),zorder=3)
plt.yticks(range(10), ss.index[:10])  # label only the ten bars drawn
plt.gca().invert_yaxis()
plt.xlabel('Thousands of H-1B Workers Requested')
plt.title("Worksite Counties Requesting the Most H-1B Workers")
As can be seen, Santa Clara County had over twice as many requests as the number two worksite county, New York County. Santa Clara County is the location of Silicon Valley. Hence, Silicon Valley did have the most requests for H-1B workers by worksite county despite the San Jose-Sunnyvale-Santa Clara, CA area being ranked tenth in Pew's table of employer metropolitan areas. In fact, the H-1B visa approvals per 100 workers shown in the table appear to be largely meaningless. This is especially the case for the 32 H-1B visa approvals per 100 workers listed for College Station, Texas. This is because the Pew study is comparing the H-1B workers, who are working in the Worksite Cities, to the population of workers in the Employer City (where the company headquarters is located). As shown above, these are often very different cities, especially in the case of [IT consulting firms](https://en.wikipedia.org/wiki/List_of_IT_consulting_firms).