Exclude some URLs from the archives in robots.txt

From a pure search perspective, this makes no major change compared to what
was there before:

* /message-id/flat/ was already flagged with a META tag to be excluded from
  indexing, since it's the same data as /message-id/ (see the sketch after
  this list).
* /list/ was already flagged with a META tag to be excluded from indexing,
  since it carries no actual content, just links, and the links and descriptions
  of the lists are already available under /community/ as well.
* /message-id/raw/ required a login, so it produced a bunch of 401s anyway,
  but this way we don't need to probe for that.
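
For reference, the META-tag exclusion mentioned above is the standard robots
meta tag emitted by the page itself. A minimal sketch of that approach in
Django might look like the following; the view name, template name and
context flag are illustrative only, not the actual pgweb code:

    # Hypothetical view: mark the rendered page as non-indexable via a
    # <meta name="robots"> tag instead of blocking it in robots.txt.
    from django.shortcuts import render

    def flat_message(request, messageid):
        # The template checks this flag and emits
        # {% if noindex %}<meta name="robots" content="noindex, nofollow">{% endif %}
        # in its <head>, so crawlers still fetch the page but don't index it.
        return render(request, 'flat.html', {
            'messageid': messageid,
            'noindex': True,
        })

The drawback, as noted below, is that the server still has to render the
whole page just so the crawler can discover that it shouldn't be indexed.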

It's more efficient to block these things in robots.txt so we don't have to
spend the processing power to render a page that's not going to get indexed
anyway.
Magnus Hagander
2013-07-10 09:57:25 +02:00
parent 7dc9e105f9
commit 4d773d447f

@@ -124,6 +124,9 @@ def robots(request):
     return HttpResponse("""User-agent: *
 Disallow: /admin/
 Disallow: /account/
+Disallow: /list/
+Disallow: /message-id/raw/
+Disallow: /message-id/flat/
 Sitemap: http://www.postgresql.org/sitemap.xml
 """, mimetype='text/plain')