Often called spidering.

Automated browsing of the World Wide Web.

Breadth-First Crawling

Prioritize exploring a website’s breadth before going deep.

Start by crawling all links on the seed page, then move to the links on those pages.

Get a broad overview of a website’s structure and content.
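A minimal sketch of breadth-first crawling, assuming a small in-memory link graph instead of real HTTP requests. The `SITE` dictionary, the `crawl_bfs` function, and the page paths are all hypothetical names chosen for illustration. A real crawler would fetch each page and parse its links.

```python
from collections import deque

# Hypothetical site map: each page maps to the links found on it.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/team"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/team": [],
    "/blog/post-1": ["/"],  # link back to the seed page
    "/blog/post-2": [],
}


def crawl_bfs(seed, get_links):
    """Return pages in breadth-first visit order, starting from seed."""
    order = []
    seen = {seed}          # pages already queued, so none is visited twice
    queue = deque([seed])  # FIFO queue: oldest discovered page is crawled next
    while queue:
        page = queue.popleft()
        order.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order


bfs_order = crawl_bfs("/", lambda page: SITE.get(page, []))
print(bfs_order)
# ['/', '/about', '/blog', '/team', '/blog/post-1', '/blog/post-2']
```

Both links on the seed page (`/about`, `/blog`) are visited before any page one level further down, which is what gives the broad overview first.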

Depth-First Crawling

Prioritize depth over breadth.

Useful for finding specific content or reaching deep into a website’s structure.
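A minimal sketch of depth-first crawling, again assuming a small in-memory link graph rather than live pages. The `SITE` dictionary, the `crawl_dfs` function, and the page paths are hypothetical and exist only for illustration.

```python
# Hypothetical site map: each page maps to the links found on it.
SITE = {
    "/": ["/about", "/blog"],
    "/about": ["/team"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/team": [],
    "/blog/post-1": ["/"],  # link back to the seed page
    "/blog/post-2": [],
}


def crawl_dfs(seed, get_links):
    """Return pages in depth-first visit order, starting from seed."""
    order = []
    seen = set()
    stack = [seed]  # LIFO stack: most recently discovered page is crawled next
    while stack:
        page = stack.pop()
        if page in seen:
            continue
        seen.add(page)
        order.append(page)
        # Push in reverse so the first link on the page is followed first.
        for link in reversed(get_links(page)):
            if link not in seen:
                stack.append(link)
    return order


dfs_order = crawl_dfs("/", lambda page: SITE.get(page, []))
print(dfs_order)
# ['/', '/about', '/team', '/blog', '/blog/post-1', '/blog/post-2']
```

Here `/team` is reached before `/blog`, because the crawler follows the `/about` branch to its end before returning to the seed page's second link. The only structural difference from a breadth-first crawler is the stack in place of a queue.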

robots.txt