Nginx and Lua
For several months, we've been gradually integrating Lua into Mixlr's front-end Nginx configuration.Lua is a dynamic scripting language which can be embedded inside Nginx config files. This means that arbitrary Lua code can be executed inline during any phase of the Nginx request-handling process.
We started off just using Lua to route requests around backend servers, but it has ended up having a much more profound impact on our infrastructure.
Here are some things we make it do.
Force search engines to only index mixlr.com.
Google treats sub-domains as completely separate websites. We don't want our page rank diluting by inadvertently having crawlers fetch pages from anything other than our root domain.location /robots.txt {
rewrite_by_lua '
if ngx.var.http_host ~= "mixlr.com" then
return ngx.exec("/robots_disallow.txt");
end
';
}
robots.txt
to robots_diallow.txt
if the request was made to any domain other than mixlr.com. It's
possible to do this with standard Nginx rewrites, but this is easier to
maintain and extend.Set response headers based on some arbitrary logic
Lua provides much more flexibility than the default Nginx configuration rules.In this example, we want to ensure that a response's headers are correctly set. Browsers will cache static files indefinitely if they are sent with certain headers, meaning visitors only have to download them once.
This rewrite ensures that any static file requested with a timestamped query-string will be sent with appropriate
Expires
and Cache-Control
headers.location / {
header_filter_by_lua '
if ngx.var.query_string and ngx.re.match( ngx.var.query_string, "^([0-9]{10})$" ) then
ngx.header["Expires"] = ngx.http_time( ngx.time() + 31536000 );
ngx.header["Cache-Control"] = "max-age=31536000";
end
';
try_files $uri @dynamic;
}
Remove timestamp args from jQuery JSONP requests
Many external clients request JSONP from our servers while including a cache-busting_=some_timestamp
arg. This breaks Nginx's proxy cache, which cannot be configured to ignore specific HTTP args.This rewrite removes the arg internally and allows Nginx to cache responses from the upstream server, meaning our backend Rails servers are better protected from load.
location / {
rewrite_by_lua '
if ngx.var.args ~= nil then
-- /some_request?_=1346491660 becomes /some_request
local fixed_args, count = ngx.re.sub( ngx.var.args, "&?_=[0-9]+", "" );
if count > 0 then
return ngx.exec(ngx.var.uri, fixed_args);
end
end
';
}
location / {
rewrite_by_lua '
if ngx.var.arg__ and ngx.var.arg__ ~= "1" then
local arg_table = ngx.decode_args(ngx.var.args);
arg_table["_"] = "1";
ngx.exec(ngx.var.uri, arg_table);
end
';
}
Log slow Rails responses to the Nginx error log
If a response from a backend server takes a long time to arrive, it can be easily logged to the Nginx error log for further investigation.location / {
log_by_lua '
if tonumber(ngx.var.upstream_response_time) >= 1 then
ngx.log(ngx.WARN, "[SLOW] Ngx upstream response time: " .. ngx.var.upstream_response_time .. "s from " .. ngx.var.upstream_addr);
end
';
}
Instantly block IP addresses by adding them to a Redis set
Occasionally, a rogue crawler must be stopped. We used to do this upstream using Rack middleware, but again it's much more desirable to just not bother our backend servers at all.lua_shared_dict banned_ips 1m;
location / {
access_by_lua '
local banned_ips = ngx.shared.banned_ips;
local updated_at = banned_ips:get("updated_at");
-- only update banned_ips from Redis once every ten seconds:
if updated_at == nil or updated_at < ( ngx.now() - 10 ) then
local redis = require "resty.redis";
local red = redis:new();
red:set_timeout(200);
local ok, err = red:connect("your-redis-hostname", 6379);
if not ok then
ngx.log(ngx.WARN, "Redis connection error retrieving banned_ips: " .. err);
else
local updated_banned_ips, err = red:smembers("banned_ips");
if err then
ngx.log(ngx.WARN, "Redis read error retrieving banned_ips: " .. err);
else
-- replace the locally stored banned_ips with the updated values:
banned_ips:flush_all();
for index, banned_ip in ipairs(updated_banned_ips) do
banned_ips:set(banned_ip, true);
end
banned_ips:set("updated_at", ngx.now());
end
end
end
if banned_ips:get(ngx.var.remote_addr) then
ngx.log(ngx.WARN, "Banned IP detected and refused access: " .. ngx.var.remote_addr);
return ngx.exit(ngx.HTTP_FORBIDDEN);
end
';
}
ruby> $redis.sadd("banned_ips", "200.1.35.4")
$remote_addr
is set with the correct value.This approach could also be trivially modified to check for HTTP user agent matching certain patterns.
Render CSRF (form_authenticity_token) using Nginx
Mixlr makes heavy use of page caching. One problem page caching introduces is how to render a unique, per-session CSRF token in each page.To achieve this, we fetch the authenticity token from our upstream web servers using an Nginx subrequest, and then use Nginx's SSI (server-side include) functionality to include it in the rendered page (which may be generated cold by Rails, or served warm from a static HTML file).
This results in both protection against CSRF attacks, and our backend Rails servers are well protected from spells of heavy traffic.
location /csrf_token_endpoint {
internal;
include /opt/nginx/conf/proxy.conf;
proxy_pass "http://upstream";
}
location @dynamic {
ssi on;
set $csrf_token '';
rewrite_by_lua '
-- Using a subrequest, we our upstream servers for the CSRF token for this session:
local csrf_capture = ngx.location.capture("/csrf_token_endpoint");
if csrf_capture.status == 200 then
ngx.var.csrf_token = csrf_capture.body;
-- if this is a new session, ensure it sticks by passing through the new session_id
-- to both the subsequent upstream request, and the response:
if not ngx.var.cookie_session then
local match = ngx.re.match(csrf_capture.header["Set-Cookie"], "session=([a-zA-Z0-9_+=/+]+);");
if match then
ngx.req.set_header("Cookie", "session=" .. match[1]);
ngx.header["Set-Cookie"] = csrf_capture.header["Set-Cookie"];
end
end
else
ngx.log(ngx.WARN, "No CSRF token returned from upstream, ignoring.");
end
';
try_files /maintenance.html /rails_cache$uri @thin;
}
app/metal/csrf_token_endpoint.rb
:class CsrfTokenEndpoint
def self.call(env)
if env["PATH_INFO"] =~ /^\/csrf_token_endpoint/
session = env["rack.session"] || {}
token = session[:_csrf_token]
if token.nil?
token = SecureRandom.base64(32)
session[:_csrf_token] = token
end
[ 200, { "Content-Type" => "text/plain" }, [ token ] ]
else
[404, {"Content-Type" => "text/html"}, ["Not Found"]]
end
end
end
<meta name="csrf-param" value="authenticity_token"/>
<meta name="csrf-token" value="<!--# echo var="csrf_token" default="" encoding="none" -->"/>
/csrf_token_endpoint
.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.