I need to extract financial information from PDF forms. These PDFs are in standard formats, with tables that contain the form entry text. The PDF reading software I have tried can read the empty tables but cannot extract the form entries. Acrobat Reader's 'File -> Save As -> Text' does extract the entries to a TXT file, but the table formatting is lost. I would like a script that reads both the PDF table and the form entries and outputs the data to a Google Sheet. If possible, the process should be automated: the script periodically checks the relevant websites for PDFs whose date parameter has not yet been saved, and extracts the data from them. The script should be capable of being run as a desktop application, or of being embedded in a WordPress site that utilises the data once it is stored in the Google Sheet. Robust error handling needs to be included. I can provide example PDF forms.
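A minimal sketch of the 'unsaved date' check described above, to make that requirement concrete. It assumes that each PDF URL carries its date in a query parameter named `date` and that the dates already stored in the sheet are available as a set of strings. Both are assumptions for illustration, as is the function name `unsaved_pdf_urls`; the real sites may expose the date differently.

```python
from urllib.parse import urlparse, parse_qs


def unsaved_pdf_urls(candidate_urls, saved_dates, date_param="date"):
    """Return the URLs whose date parameter has not been stored yet.

    candidate_urls: PDF links found on the monitored pages (hypothetical input).
    saved_dates:    date strings already present in the sheet (hypothetical input).
    date_param:     name of the query parameter holding the date (an assumption).
    """
    seen = set(saved_dates)
    pending = []
    for url in candidate_urls:
        values = parse_qs(urlparse(url).query).get(date_param)
        if not values:
            # No date parameter on this link: skip it rather than guess.
            continue
        date_value = values[0]
        if date_value not in seen:
            pending.append(url)
            # Avoid queuing the same date twice within one run.
            seen.add(date_value)
    return pending


if __name__ == "__main__":
    urls = [
        "https://example.com/forms/report.pdf?date=2024-01-31",
        "https://example.com/forms/report.pdf?date=2024-02-29",
        "https://example.com/forms/undated.pdf",
    ]
    # Only the February link is new relative to the saved January date.
    print(unsaved_pdf_urls(urls, {"2024-01-31"}))
```

Only the links returned by this check would then need to be downloaded, parsed, and written to the sheet; links without a date are skipped so that the error handling can report them separately.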