Scrape data from SEC EDGAR website (from Form 10-K; 82,955 .txt links provided)

I’m interested in collecting information about employee unionization from public company annual reports (Form 10-K). Each 10-K contains several standardized sections. The labor union info I’m interested in is located in “Item 1. Business” and “Item 1A. Risk Factors”.

Step 1: Open each of the Form 10-K .txt links (N = 82,955) in the attached read file and search ONLY Item 1 and Item 1A for the keywords below…
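One possible way to approach Step 1 is sketched below in Python. It is only a rough heuristic, not a finished parser: the heading patterns, the `USER_AGENT` contact string, and the function names `fetch_filing` and `extract_items` are all assumptions made for illustration. Because a 10-K usually lists "Item 1" and "Item 1A" in its table of contents as well as in the body, the sketch keeps the longest span found between a start heading and the next end heading.

```python
import re
import urllib.request

# Placeholder contact string (an assumption): replace with your own name and email.
USER_AGENT = "Example Researcher researcher@example.edu"

def fetch_filing(url: str) -> str:
    """Download one filing; SEC asks automated clients to send an identifying User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")

# Heading patterns are guesses at common layouts such as "Item 1." or "ITEM 1A:".
ITEM_1 = re.compile(r"^[ \t]*item\s+1\s*[.:\-]", re.I | re.M)
ITEM_1_END = re.compile(r"^[ \t]*item\s+(?:1a|1b|2)\s*[.:\-]", re.I | re.M)
ITEM_1A = re.compile(r"^[ \t]*item\s+1a\s*[.:\-]", re.I | re.M)
ITEM_1A_END = re.compile(r"^[ \t]*item\s+(?:1b|2)\s*[.:\-]", re.I | re.M)

def longest_span(text: str, start_re: re.Pattern, end_re: re.Pattern) -> str:
    """Return the longest stretch from a start heading to the next end heading."""
    best = ""
    for start in start_re.finditer(text):
        end = end_re.search(text, start.end())
        if end is None:
            continue
        candidate = text[start.start():end.start()]
        if len(candidate) > len(best):
            best = candidate
    return best.strip()

def extract_items(text: str) -> dict:
    """Split out Item 1 and Item 1A; either value is '' if no heading pair is found."""
    return {
        "item1": longest_span(text, ITEM_1, ITEM_1_END),
        "item1a": longest_span(text, ITEM_1A, ITEM_1A_END),
    }

# Tiny made-up filing with a table of contents followed by the body.
sample = """Table of Contents
Item 1. Business
Item 1A. Risk Factors
Item 2. Properties

Item 1. Business
We make widgets. None of our employees are represented by a labor union.
Item 1A. Risk Factors
A strike by unionized workers could disrupt operations.
Item 2. Properties
We lease an office.
"""
items = extract_items(sample)
```

Real filings often embed HTML markup inside the .txt file, so tags would likely need to be stripped before the headings match reliably. With 82,955 links, requests should also be throttled to stay within SEC's published fair-access limits.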

KEYWORDS: collective bargaining, collective-bargaining, CBA, labo(u)r union(s), labo(u)r agreement(s), labo(u)r contract(s), labo(u)r organization(s), union agreement(s), union contract(s), union organization(s), or union(s)
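The keyword list above could be expressed as a single case-insensitive regular expression, as in this sketch. The names `KEYWORD_RE` and `find_keywords` are illustrative only. Note that the bare keyword "union(s)" will also match phrases such as "European Union" or "credit union", so some false positives should be expected.

```python
import re

# One alternation per keyword family; longer phrases come before the bare "union(s)".
KEYWORD_RE = re.compile(
    r"\b(?:"
    r"collective[-\s]bargaining"
    r"|CBA"
    r"|labou?r\s+(?:union|agreement|contract|organization)s?"
    r"|union\s+(?:agreement|contract|organization)s?"
    r"|unions?"
    r")\b",
    re.IGNORECASE,
)

def find_keywords(text: str) -> list:
    """Return every keyword hit in the text, in order of appearance."""
    return [match.group(0) for match in KEYWORD_RE.finditer(text)]

hits = find_keywords(
    "Our employees are covered by a collective-bargaining agreement with a labour union."
)
```

The word boundaries keep "reunion" from matching, but they also mean "unionized" is not matched, since it is not in the keyword list.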

Step 2: If one of the above keywords matches the text in Item 1 or 1A, add the entire sentence (or paragraph, whichever is easier) containing the match to a new field/column in the read file. Maybe the output file could have one field for any Item 1 output and a second field for any Item 1A output.
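A minimal sketch of Step 2, assuming sentence-level extraction and a CSV output with one column per item. The sentence splitter is deliberately naive (it will break on abbreviations such as "Inc." followed by a capitalized word). The column names, the " | " separator, the example URL, and the short stand-in keyword pattern are all hypothetical choices.

```python
import csv
import io
import re

# Naive splitter: sentence-ending punctuation, whitespace, then a capital, quote, or "(".
SENTENCE_SPLIT = re.compile(r'(?<=[.!?])\s+(?=[A-Z"“(])')

def matching_sentences(section_text: str, keyword_re: re.Pattern) -> list:
    """Return the sentences in a section that contain at least one keyword."""
    flat = " ".join(section_text.split())  # undo hard line wraps
    return [s for s in SENTENCE_SPLIT.split(flat) if keyword_re.search(s)]

def write_rows(rows: list, out) -> None:
    """Write one row per filing, with separate columns for Item 1 and Item 1A text."""
    fields = ["url", "item1_union_text", "item1a_union_text"]
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

# Stand-in pattern for illustration; the full keyword list would be used in practice.
demo_re = re.compile(r"collective[-\s]bargaining|\bunions?\b", re.IGNORECASE)
item1 = ("We employ 1,200 people. About 30% of them are covered by a collective\n"
         "bargaining agreement. We lease our offices.")
item1a = "A strike by one of our unions could disrupt production. Demand is seasonal."

row = {
    "url": "https://example.com/filing.txt",  # hypothetical link
    "item1_union_text": " | ".join(matching_sentences(item1, demo_re)),
    "item1a_union_text": " | ".join(matching_sentences(item1a, demo_re)),
}
buffer = io.StringIO()
write_rows([row], buffer)
```

Writing to an open file instead of `io.StringIO` works the same way; open it with `newline=""` so the csv module controls line endings.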

Appendix C and Appendix D of the attached research paper (pp. 36-37) provide some examples of the union-related text that I’m looking for.

Step 3: (If possible) Create three union-related variables (binary, percentage, number) from the extracted Item 1 sentences/paragraphs, and a separate set of three from the Item 1A text. First, code whether the union-related statement is positive or negative (i.e., employees are represented/covered by a union vs. employees are NOT in a union, such as “none of our employees are represented”) as a binary variable (=1 for (some) union representation, =0 for no representation). Second, extract the percentage of employees covered, if available. Third, extract the number of employees covered, if available.
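The three variables in Step 3 might be approximated with simple pattern rules, as in the sketch below. Everything here is a heuristic assumption: the negation phrases, the percentage and count patterns, and the function name `union_variables` are illustrative, and the output will need the manual check mentioned in the posting.

```python
import re

# Phrases that usually signal "no union representation" (an incomplete, assumed list).
NEGATION_RE = re.compile(
    r"\b(?:none\s+of|no\s+employees?"
    r"|not\s+(?:\w+\s+)?(?:represented|covered|unionized|party|parties))\b",
    re.IGNORECASE,
)
# A number followed by "%" or "percent".
PERCENT_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(?:%|percent\b)", re.IGNORECASE)
# A head count such as "400 employees" or "400 of our 1,200 employees".
COUNT_RE = re.compile(
    r"\b(\d{1,3}(?:,\d{3})+|\d+)\s+"
    r"(?:of\s+(?:our|the)\s+(?:[\d,]+\s+)?)?"
    r"(?:[\w-]+\s+){0,2}?(?:employees|workers)\b",
    re.IGNORECASE,
)

def union_variables(text: str) -> dict:
    """Return binary, percentage, and count variables for one extracted passage."""
    if not text.strip():
        return {"union": None, "pct": None, "count": None}
    pct = PERCENT_RE.search(text)
    count = COUNT_RE.search(text)
    return {
        "union": 0 if NEGATION_RE.search(text) else 1,
        "pct": float(pct.group(1)) if pct else None,
        "count": int(count.group(1).replace(",", "")) if count else None,
    }

positive = union_variables(
    "Approximately 400 of our 1,200 employees are represented by labor unions."
)
negative = union_variables("None of our employees are represented by a labor union.")
share = union_variables(
    "About 30% of our employees are covered by collective bargaining agreements."
)
```

Known weak spots include mixed statements (for example, no union employees in one country but some in another, which this sketch codes as 0) and sentences that state the total head count before the covered count, where the first number found may be the wrong one.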

I realize this last part is tricky to do mechanically. I’ll have to check this part manually anyway, so any progress here with a reasonable error rate will be appreciated.

Skills: Data Mining, Web Scraping


About the Employer:
( 1 review ) State College, United States

Project ID: #17182236

Awarded to:

$83 USD in 5 days
(3 Reviews)

18 freelancers are bidding on average $143 for this job


Hi, I have experience of working on similar projects to extract data from different sources. I have scraped upto 27 millions of records in past projects. So I can help in your project according to your requirements. More

$200 USD in 3 days
(53 Reviews)

Hello, I can help with you in your project Scrape data from SEC EDGAR website . I have more than 5 years of experience in Data Mining, Web Scraping. We have worked on several similar projects before! We have worked More

$250 USD in 3 days
(25 Reviews)

Hello sir, I have completed web/data scraping jobs for many times. I am interested in your project as well. I would like to discuss details via pm. I am looking forward to hear from you soon. Best Regards,

$84 USD in 3 days
(26 Reviews)

Hello there, Hope you are doing well and thanks for reviewing our proposal. We reviewed the job requirement thoroughly and would like to assist by offering our services related to website data scraping. We ha More

$250 USD in 5 days
(4 Reviews)
$100 USD in 3 days
(1 Review)

Hi, I have good experience in web scraping. I will provide application where you can add list of url, match with keywords automatically and search result. I already have application developed using and s More

$111 USD in 2 days
(2 Reviews)

I'm a developer with extensive experience in building high quality sites and apps. I have an experience in (Ionic framework/React Native/NativeScript/PHP/Javascript/UI design). I know how to do apps in native IOS More

$85 USD in 2 days
(2 Reviews)

Hello, I am a web search and data entry specialist . I will do a quality job for you which will meet your requirements and expectations with full compliance with the time limit. Hope to hear from you Relevant Skills More

$250 USD in 3 days
(2 Reviews)

Hello Having experience of python, I can do what you want. Python is my primary programming language. let us discuss details in chat.

$150 USD in 3 days
(4 Reviews)

I have 6 year experience Freelancer,up work,Fiverr & 99design market place I have seen your project that i can to do easily because I have many experience to Graphic Design,Webdesign,Web Develop & programming .So I cou More

$155 USD in 3 days
(0 Reviews)
$155 USD in 3 days
(0 Reviews)

Hello, I'm an Expert Python Developer. With over 12 years of demonstrated experience, I am an individual developer but have an ability as a company, I can work with local developers near me as a team for completi More

$80 USD in 4 days
(1 Review)

how are you,sir? I am a ultimate developer who has rich experience in this field. If you contact me, you and i will all be happy. Thank you for your reply in advance. Scrape data from SEC EDGAR website (from Form 10-K More

$155 USD in 1 day
(0 Reviews)

I can do this really fast , i have a whole app that Crawls like this, u can find it on playstore "Owledge"

$55 USD in 1 day
(0 Reviews)

Good day, i am an expert web scraper with lot of experience. i usually use php and curl and have success almost all the times; i do not support captcha; my code is well written, fast and long lasting. I have p More

$155 USD in 3 days
(0 Reviews)
$100 USD in 5 days
(0 Reviews)

How are you? I understood your requirements exactly. I have good experiences in this field. This job is easy for me and my talent. I am free now and can start right away and will finish this project asap. I will d More

$155 USD in 3 days
(0 Reviews)