![]() You can create your own concurrency implementation. If one browser instance crashes for any reason, this will not affect other jobs. One browser (using an incognito page) per URL. Incognito page (see IncognitoBrowserContext) for each URL Shares everything (cookies, localStorage, etc.) between jobs. The default option is Cluster.CONCURRENCY_CONTEXT, but it is recommended to always specify which one you want to use. ![]() You can set it in the options when calling Cluster.launch. There are different concurrency models, which define how isolated each job is run. Provide types for input/output with TypeScript generics.Using a different puppeteer library (like puppeteer-core or puppeteer-firefox). ![]() Deep crawling the Google search results.We then queue two jobs and wait for the cluster to finish. Then a task is defined which includes going to the URL and taking a screenshot. A cluster is created with 2 concurrent workers. The following is a typical example of using puppeteer-cluster. Npm install -save puppeteer-cluster Usage Install puppeteer (if you don't already have it installed): Progress view and monitoring statistics (see below).Different concurrency models to choose from (pages, contexts, browsers).Auto restarts the browser in case of a crash.Typings for input/output (via TypeScript Generics).Puppeteer Cluster takes care of reusing Chromium and restarting the browser in case of errors. This is helpful if you want to crawl multiple pages or run tests in parallel. This library spawns a pool of Chromium instances via Puppeteer and helps to keep track of jobs and errors.
0 Comments
Leave a Reply. |
AuthorWrite something about yourself. No need to be fancy, just an overview. ArchivesCategories |