Basic Crawler Requirements

The crawler should be limited to one domain. Given a starting URL - say http://bbc.co.uk - it should visit all pages within that domain, but not follow links to external sites such as Google or Twitter.

The output should be a simple structured site map (this does not need to be a traditional XML sitemap, just some output that reflects what the crawler has discovered), showing for each page:

- links to other pages under the same domain
- links to external URLs
- links to static content such as images
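The core of these requirements is deciding, for every link found on a page, which of the three buckets it belongs in. A minimal sketch of that classification step, using only the standard library (the function name `classify_links` and the extension list are illustrative assumptions, not part of the spec):

```python
from urllib.parse import urljoin, urlparse

# Assumed set of extensions treated as static content; a real crawler
# might instead inspect the Content-Type header of the response.
STATIC_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".css", ".js", ".pdf")

def classify_links(page_url, hrefs):
    """Group a page's raw hrefs into internal pages, external URLs,
    and static assets, relative to the page's own domain."""
    base_domain = urlparse(page_url).netloc
    internal, external, static = [], [], []
    for href in hrefs:
        url = urljoin(page_url, href)      # resolve relative links against the page
        parsed = urlparse(url)
        if parsed.path.lower().endswith(STATIC_EXTENSIONS):
            static.append(url)             # image/asset: record, never crawl
        elif parsed.netloc == base_domain:
            internal.append(url)           # same domain: candidate for crawling
        else:
            external.append(url)           # external site: record, don't follow
    return {"internal": internal, "external": external, "static": static}

result = classify_links(
    "http://bbc.co.uk/news",
    ["/sport", "https://twitter.com/bbc", "logo.png"],
)
```

One design decision this sketch glosses over: the exact-netloc comparison treats http://www.bbc.co.uk and http://bbc.co.uk as different domains, so whether subdomains count as "within the domain" should be agreed up front.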
Documentation Log https://docs.google.com/document/d/1WlrFs5fia8KGxwyePGK7tTp9nTOiLJAYAtVi9-4ZBXo/edit?usp=sharing