Extract data from webpages as JSON and CSV. Offered by Innoplexus Consulting Services
How to use the plugin:
1. Go to the webpage you wish to extract data from.
2. Right-click the element you want to select and choose "add to extractor".
3. A popup window will appear listing all similar elements. Give the selected element a field name (such as name, location, or title) and click add; the field will then appear in the metadata table at the bottom of the screen.
4. Repeat the above steps for every element you want to add.
5. The extension supports pagination. If you want data extracted from all paginated URLs as well, select the pagination option, enter the URLs of the next two pages and the number of pages to crawl, then click update.
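Supplying two consecutive page URLs gives the extension enough to infer the pagination pattern. A minimal Python sketch of how such inference could work (a hypothetical helper, not the extension's actual code):

```python
import re

def infer_page_urls(url_a, url_b, num_pages):
    """Guess a pagination pattern from two consecutive page URLs.

    Hypothetical sketch: find the numeric chunk that differs between
    the two URLs and step it to generate the remaining page URLs.
    """
    # Split each URL into alternating non-digit / digit chunks.
    parts_a = re.split(r"(\d+)", url_a)
    parts_b = re.split(r"(\d+)", url_b)
    if len(parts_a) != len(parts_b):
        raise ValueError("URLs do not share a common structure")
    # Find the chunk where the page number differs and step it.
    for i, (a, b) in enumerate(zip(parts_a, parts_b)):
        if a != b and a.isdigit() and b.isdigit():
            start, step = int(a), int(b) - int(a)
            prefix = "".join(parts_a[:i])
            suffix = "".join(parts_a[i + 1:])
            return [prefix + str(start + step * n) + suffix
                    for n in range(num_pages)]
    raise ValueError("No differing page number found")
```

For example, given `.../list?page=2` and `.../list?page=3` with 4 pages requested, this would yield the URLs for pages 2 through 5.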
6. For lazy-loading pages, select the lazy load option, enter the required information, and click update.
7. If you wish to do nested (deep) crawling, add two links you wish to deep-crawl with the attribute name "deep_crawl_link". Then define fields on one of those URLs; the extension will pick up the same fields from all similar URLs.
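A hypothetical sketch of how two example links can generalize to "all similar URLs" (assuming similarity means same host and same path depth; the extension's real matching logic may differ):

```python
from urllib.parse import urlparse

def find_similar_links(example_links, all_page_links):
    """Return links from a page that look structurally similar to the
    given deep-crawl examples (same host, same path depth)."""
    def shape(url):
        p = urlparse(url)
        # Count non-empty path segments, e.g. /item/1 -> depth 2.
        return (p.netloc, len([s for s in p.path.split("/") if s]))
    shapes = {shape(u) for u in example_links}
    return [u for u in all_page_links if shape(u) in shapes]
```

With examples like `/item/1` and `/item/2`, this would keep every other `/item/<n>` link on the page and skip unrelated links such as `/about`.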
8. If you wish to schedule your extraction process periodically, select a periodic schedule option (monthly/weekly/daily) and click update.
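For illustration, a periodic schedule boils down to computing the next run time from the last one. A hypothetical sketch (with naive month handling, not the extension's code):

```python
from datetime import datetime, timedelta

def next_run(last_run, schedule):
    """Compute the next crawl time for a daily/weekly/monthly schedule."""
    if schedule == "daily":
        return last_run + timedelta(days=1)
    if schedule == "weekly":
        return last_run + timedelta(weeks=1)
    if schedule == "monthly":
        # Naive month step: same day next month (fails for e.g. Jan 31).
        month = last_run.month % 12 + 1
        year = last_run.year + (1 if last_run.month == 12 else 0)
        return last_run.replace(year=year, month=month)
    raise ValueError(f"unknown schedule: {schedule}")
```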
9. After completing these steps, you have multiple options:
i. Get a JSON file of the selected fields locally.
ii. Get a CSV file of the selected fields locally.
iii. Or enter your email address in the top left corner and click get data to receive the data by email after the crawling has finished.
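The two local export options above amount to serializing the extracted records (one dict per row, keyed by the field names you gave in the metadata table) as JSON or CSV. A hypothetical sketch:

```python
import csv
import io
import json

def export_records(records, fmt="json"):
    """Serialize a list of extracted-field dicts as JSON or CSV text."""
    if fmt == "json":
        return json.dumps(records, indent=2)
    # CSV: use the field names of the first record as the header row.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

For records like `[{"name": "Acme", "location": "Pune"}]`, the CSV output starts with a `name,location` header row followed by one line per record.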