Overview
web pages are crawled by being loaded into browser using multiple tabs parallelly
WATCH OUT: more tabs you use, more computer resources (CPU, memory) will be used, and each page costs a bit disk to save the content (in IndexedDb, accessible from extensions -> Inspect views: background page). The "spider" works in this way: 1) The current url is used as the starting point, and it's loaded again in a new tab. 2) After this page is loaded, fetch all the links on the page. 3) Get all the links on the page, including relative urls. 4) Open the extracted link parallelly in all the tabs used (by default 3, set in eventPage). 5) repeat 2-4 All source code at: https://github.com/nobodxbodon/ChromeCrawlerWildSpider
Details
- Version0.0.3
- UpdatedMarch 8, 2019
- Offered byXuan Wu
- Size121KiB
- LanguagesEnglish (United States)
- Non-traderThis developer has not identified itself as a trader. For consumers in the European Union, please note that consumer rights do not apply to contracts between you and this developer.
Privacy
The developer has not provided any information about the collection or usage of your data.