Technical Guide: Protecting Forgejo Instances from AI Web Crawlers with Nginx Configuration
By todsacerdoti
Summary
The article shows an nginx configuration that protects a Forgejo instance from AI web crawlers while keeping it accessible to legitimate users. Requests are blocked by default unless they carry a specific cookie or a git-related user agent, so git clone over HTTP/HTTPS keeps working. A blocked request receives a 418 response whose body is a short JavaScript snippet that sets the required cookie and reloads the page. A human using a JavaScript-enabled browser sees only a brief reload, while a crawler that does not execute JavaScript never obtains the cookie and stays blocked.
Key quotes
TL;DR: Put that in your nginx config:

location / {
  # needed to still allow git clone from http/https URLs
  if ($http_user_agent ~* "git/|git-lfs/") {
    set $bypass_cookie 1;
  }
  # If we see the expected cookie; we could also bypass the blocker page
  if ($cookie_Yogsototh_opens_the_door = "1") {
    set $bypass_cookie 1;
  }
  # Redirect to 418 if neither condition is met
  if ($bypass_cookie != 1) {
    add_header Content-Type text/html always;
    return 418 '<script>document.cookie = "Yogsototh_opens_the_door=1; Path=/"; window.location.reload();</script>';
  }
  # ... rest of the location block, not quoted here
}
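The decision this location block makes can be modeled outside nginx to check one's understanding of it. The following is a minimal Python sketch, not part of the article: the function names `challenged` and `simulate_browser` are hypothetical, and the example user-agent strings are assumptions chosen for illustration. Only the cookie name and the user-agent pattern come from the quoted config.

```python
import re

# Taken from the quoted nginx config.
COOKIE_NAME = "Yogsototh_opens_the_door"
# nginx `~*` is a case-insensitive, unanchored regex match.
GIT_UA = re.compile(r"git/|git-lfs/", re.IGNORECASE)


def challenged(user_agent: str, cookies: dict) -> bool:
    """Hypothetical model: True if the request would get the 418 page."""
    bypass = False
    # First `if`: git clients are let through so clone over HTTP(S) works.
    if GIT_UA.search(user_agent or ""):
        bypass = True
    # Second `if`: the cookie set by the 418 page's script lets the request through.
    if cookies.get(COOKIE_NAME) == "1":
        bypass = True
    # Third `if`: neither condition matched, so the 418 page is returned.
    return not bypass


def simulate_browser(user_agent: str = "Mozilla/5.0") -> list:
    """Model a JavaScript-capable browser: one challenge, then access."""
    cookies = {}
    first = challenged(user_agent, cookies)
    if first:
        # What the inline <script> does before calling reload().
        cookies[COOKIE_NAME] = "1"
    second = challenged(user_agent, cookies)
    return [first, second]


if __name__ == "__main__":
    print(challenged("git/2.43.0", {}))    # git client, no cookie
    print(challenged("ExampleBot/1.0", {}))  # cookie-less client
    print(simulate_browser())              # browser: first visit, then reload
```

Under this model a client is stopped only if it neither runs the script nor presents a git-like user agent, which also means any client that sends a user agent containing `git/` passes the check.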