2nd time around.
I would like to extract articles from nominated categories from different article websites.
I need a program where I can enter the name of the website and the category, and it will then extract all the articles from that category and save them as a text file or, preferably, as an XML file.
I will use several different article directory websites for this.
Please let me know your price and time frame along with a proposed solution.
This may run locally on a Windows PC or, preferably, as a script (with a simple HTML interface) run from a Linux web server that has all the standard cPanel hosting package utilities available.
As this is a very simple task, low bids are expected.
Please note:
1) These articles will be imported into WordPress (2.6+), which is why I mention XML. Sorry, that should have been made clearer in the original post.
2) Yes, there are many article server configurations, so I will need to be able to nominate the URL format (domain/category/etc). Most don't use numbers but SEF (search-engine-friendly) URLs, so I will most likely need to collect *all* articles for a nominated category, then filter out the unwanted articles.
3) WordPress import has a limitation of 64 MB per file (unlikely to be an issue, but worth noting).
4) I only want the article and not the whole page saved. i.e., I don't want to edit out all the advertising links etc.
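To make points 1) to 3) concrete, here is a rough sketch of how a nominated URL format, the filtering step, and the XML output could fit together. Everything named here is an assumption for illustration only: the `{domain}`/`{category}` placeholders, the keyword-based filter, and the plain RSS 2.0 layout (WordPress's own export format carries more fields, and whether a given WordPress version accepts this exact layout would need checking).

```python
import xml.etree.ElementTree as ET

# 64 MB import limit mentioned in point 3.
MAX_IMPORT_BYTES = 64 * 1024 * 1024


def build_category_url(template, domain, category):
    """Fill a user-nominated URL format, e.g. 'http://{domain}/{category}/'.

    The placeholder names are hypothetical; each site would get its own template.
    """
    return template.format(domain=domain, category=category)


def filter_articles(articles, unwanted_keywords):
    """Drop articles whose title contains any unwanted keyword (case-insensitive)."""
    lowered = [k.lower() for k in unwanted_keywords]
    return [
        a for a in articles
        if not any(k in a["title"].lower() for k in lowered)
    ]


def articles_to_rss(articles, channel_title):
    """Serialise articles as a minimal RSS 2.0 document (assumed layout)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    for a in articles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = a["title"]
        ET.SubElement(item, "link").text = a["url"]
        # ElementTree escapes &, < and > in the body automatically.
        ET.SubElement(item, "description").text = a["body"]
    return ET.tostring(rss, encoding="unicode")


def fits_import_limit(xml_text):
    """Check the serialised file against the 64 MB import limit."""
    return len(xml_text.encode("utf-8")) <= MAX_IMPORT_BYTES
```

The articles are assumed to be dictionaries with `title`, `url` and `body` keys; the scraping step that produces them is site-specific and is not shown.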
Hi, I know how to extract data from different web pages using regular expressions. I will take care of your project at a cost of $75 for each article website you want scraped. Let me know if you're ready to start this project.
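As a rough illustration of the regular-expression approach, the sketch below pulls only the article text out of a page and leaves the surrounding advertising behind. The `article-body` class name is purely an assumption; every article site would need its own pattern, and a pattern this simple would stop at the first closing `</div>` if the article container held nested `<div>` elements.

```python
import re
from html import unescape

# Hypothetical container markup; each target site would need its own pattern.
ARTICLE_RE = re.compile(
    r'<div class="article-body">(.*?)</div>',
    re.DOTALL | re.IGNORECASE,
)
# Matches any remaining HTML tag inside the extracted block.
TAG_RE = re.compile(r"<[^>]+>")


def extract_article_text(page_html):
    """Return the article text only, or None if the container is not found."""
    match = ARTICLE_RE.search(page_html)
    if match is None:
        return None
    text = TAG_RE.sub("", match.group(1))  # strip inner tags
    return unescape(text).strip()          # decode entities such as &amp;
```

For example, a page with an advertising block followed by the article container would yield just the article's own text.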