Commit 948804f

Added static javascript and image files to the URLs crawled by the test-crawler.
- Legacy-Id: 9913
1 parent: 1b36eec. Commit: 948804f.

1 file changed

Lines changed: 1 addition & 1 deletion

bin/test-crawl

@@ -65,7 +65,7 @@ def strip_url(url):
     return url
 
 def extract_html_urls(content):
-    for m in re.finditer(r'(<(?:a|link) [^>]*href=[\'"]([^"]+)[\'"][^>]*>)', content):
+    for m in re.finditer(r'(<(?:(?:a|link) [^>]*href|(?:img|script) [^>]*src)=[\'"]([^"]+)[\'"][^>]*>)', content):
         if re.search(r'rel=["\']?nofollow["\']', m.group(1)):
             continue
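
To illustrate what the changed line does, here is a minimal sketch comparing the two patterns from the diff above on a small HTML fragment. The two regular expressions and the nofollow check are copied from the diff. The helper `find_tag_urls`, the names `OLD_PATTERN` and `NEW_PATTERN`, and the sample markup are hypothetical and not part of bin/test-crawl. The rest of `extract_html_urls` is not shown in the diff, so this sketch simply collects the captured URL.

```python
import re

# Pattern from the "-" line of the diff: only href on <a> and <link>.
OLD_PATTERN = r'(<(?:a|link) [^>]*href=[\'"]([^"]+)[\'"][^>]*>)'

# Pattern from the "+" line of the diff: also src on <img> and <script>.
NEW_PATTERN = r'(<(?:(?:a|link) [^>]*href|(?:img|script) [^>]*src)=[\'"]([^"]+)[\'"][^>]*>)'


def find_tag_urls(pattern, content):
    """Hypothetical helper: return the URLs captured by `pattern`,
    skipping tags marked rel=nofollow as the diff context does."""
    urls = []
    for m in re.finditer(pattern, content):
        # group(1) is the whole tag, group(2) is the attribute value.
        if re.search(r'rel=["\']?nofollow["\']', m.group(1)):
            continue
        urls.append(m.group(2))
    return urls


# Made-up markup for illustration only.
SAMPLE = (
    '<link rel="stylesheet" href="/css/base.css">\n'
    '<script src="/js/app.js"></script>\n'
    '<img alt="logo" src="/images/logo.png">\n'
    '<a href="/doc/active/">Active documents</a>\n'
    '<a rel="nofollow" href="/private/">Private</a>\n'
)

if __name__ == "__main__":
    print("old:", find_tag_urls(OLD_PATTERN, SAMPLE))
    print("new:", find_tag_urls(NEW_PATTERN, SAMPLE))
```

On this sample, the old pattern yields only the stylesheet and the followed link, while the new pattern also picks up the script and image URLs. The nofollow link is skipped in both cases.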
