Using multiple collectors

It is advisable to use multiple collectors for one scraping job if the task is complex enough or has different kinds of subtasks. A good example is the Coursera course scraper, where two collectors are used: one parses the list views and handles paging, and the other collects course details.

Colly has some built-in methods to support the usage of multiple collectors.


Use collector.ID when debugging to distinguish between collectors.

Cloning collectors

You can use the Clone() method of a collector if collectors have similar configuration. Clone() duplicates a collector with identical configuration but without the attached callbacks.

c := colly.NewCollector(
	colly.UserAgent(""),
	colly.AllowedDomains("", ""),
)
// Custom User-Agent and allowed domains are cloned to c2
c2 := c.Clone()

Passing custom data between collectors

Use a collector’s Request() method to share a context with other collectors.

Example of sharing context:

c.OnResponse(func(r *colly.Response) {
	c2.Request("GET", "", nil, r.Ctx, nil)
})