Exclude some URLs from the archives in robots.txt

From a pure search perspective, this makes no major change compared to what
was there before:

* /message-id/flat/ was already flagged with a META tag to be excluded from
  indexing, since it's the same data as /message-id/ (see the sketch after
  this list).
* /list/ was already flagged with a META tag to be excluded from indexing,
  since it carries no actual content, just links, and the links and descriptions
  of the lists are already available under /community/ as well.
* /message-id/raw/ required a login, so it produced a bunch of 401s anyway,
  but this way we don't need to probe for that.
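
For reference, the META-tag exclusion mentioned above is the standard robots
meta tag emitted by the page itself. A minimal sketch of that approach in
Django might look like the following; the view name, template name and
context flag are illustrative only, not the actual pgweb code:

    # Hypothetical view: mark the rendered page as non-indexable via a
    # <meta name="robots"> tag instead of blocking it in robots.txt.
    from django.shortcuts import render

    def flat_message(request, messageid):
        # The template checks this flag and emits
        # {% if noindex %}<meta name="robots" content="noindex, nofollow">{% endif %}
        # in its <head>, so crawlers still fetch the page but don't index it.
        return render(request, 'flat.html', {
            'messageid': messageid,
            'noindex': True,
        })

The drawback, as noted below, is that the server still has to render the
whole page just so the crawler can discover that it shouldn't be indexed.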

It's more efficient to block these things in robots.txt so we don't have to
spend the processing power to render a page that's not going to get indexed
anyway.
Magnus Hagander
2013-07-10 09:57:25 +02:00
parent 7dc9e105f9
commit 4d773d447f

@@ -124,6 +124,9 @@ def robots(request):
     return HttpResponse("""User-agent: *
 Disallow: /admin/
 Disallow: /account/
+Disallow: /list/
+Disallow: /message-id/raw/
+Disallow: /message-id/flat/
 Sitemap: http://www.postgresql.org/sitemap.xml
 """, mimetype='text/plain')