In the beginning there was request? → response?

An HTTP serv­er is re­spon­si­ble for ac­cept­ing an HTTP re­quest and re­turn­ing an HTTP re­sponse, with some com­pu­ta­tion in be­tween. That much you prob­a­bly al­ready knew. To put a more ab­stract spin on this, an HTTP serv­er can be con­sid­ered a func­tion that takes an HTTP re­quest as ar­gu­ment and whose val­ue is an HTTP re­sponse.

That's what a servlet is.

Out of the box, Rack­et comes with two struc­ture types, one for HTTP re­quests and an­oth­er for HTTP re­spons­es. Us­ing the con­ven­tion­al ques­tion mark, re­quest? is a pred­i­cate that takes a Rack­et val­ue and re­turns #t if it is an HTTP re­quest. Sim­i­lar­ly, re­sponse? is for HTTP re­spons­es. A servlet, then, is a func­tion whose sig­na­ture is

re­quest? → re­sponse?

The Rack­et web serv­er will han­dle a stream of bytes com­ing over the net­work and make sure that you, the pro­gram­mer, get a re­quest? val­ue. Your task—should you choose to ac­cept it—is to gen­er­ate an HTTP re­sponse val­ue.

Your job, then, is to de­fine and com­bine servlets.
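To make the signature concrete, here is a minimal sketch of a servlet. The name hello and the use of response/full to build the response are my choices for illustration; any function from request? to response? would do just as well.

```racket
#lang racket
(require web-server/http/request-structs
         web-server/http/response-structs)

;; hello : request? -> response?
;; A hypothetical servlet that ignores its argument and always
;; answers with the same plain-text body.
(define (hello req)
  (response/full 200 #"OK"
                 (current-seconds)
                 #"text/plain; charset=utf-8"
                 '()                        ; no extra headers
                 (list #"Hello, world!")))  ; body, as a list of byte strings
```

Whatever the request looks like, hello returns a value that satisfies response?, and that is all the signature demands.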

Servlets: big, small, and all around

Your web ap­pli­ca­tion, from the serv­er point of view, can be con­sid­ered as a sin­gle large servlet: a func­tion that takes in every re­quest what­so­ev­er, and re­turns suit­able re­spons­es. This sug­gests that servlets are big func­tions. They car­ry a heavy load. As your web project grows, this one servlet gets big­ger and big­ger.

A more help­ful per­spec­tive is to think of an HTTP serv­er as be­ing com­posed of servlets, each one de­vot­ed to han­dling a lit­tle part of your over­all site. There's the main servlet, the one through which every re­quest pass­es. But the main servlet can dis­patch re­quests to oth­er, small­er servlets. And these servlets, in turn, can them­selves be com­posed of oth­er servlets.
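As a rough sketch of that idea, the main servlet below looks at the path of the requested URL and hands the request to one of three smaller servlets. All of the names (text, home, about, not-found, main-servlet) are hypothetical, and matching on the path by hand is just one way to dispatch.

```racket
#lang racket
(require net/url
         web-server/http/request-structs
         web-server/http/response-structs)

;; text : number? bytes? bytes? -> response?
;; Hypothetical helper that builds a small plain-text response.
(define (text code message body)
  (response/full code message (current-seconds) #"text/plain" '() (list body)))

;; Three small servlets, each request? -> response?
(define (home req)      (text 200 #"OK" #"home page"))
(define (about req)     (text 200 #"OK" #"about page"))
(define (not-found req) (text 404 #"Not Found" #"no such page"))

;; main-servlet : request? -> response?
;; Every request passes through here and is sent on to a smaller servlet.
(define (main-servlet req)
  (match (map path/param-path (url-path (request-uri req)))
    [(list "")      (home req)]        ; the path "/"
    [(list "about") (about req)]       ; the path "/about"
    [_              (not-found req)])) ; everything else
```

Each of the smaller servlets could itself dispatch further in the same way.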

Think of servlets the way you think of the main func­tion in a pro­gram. The main func­tion is, of course, a func­tion. But I'll bet that if your pro­gram has any in­ter­est­ing com­plex­i­ty to it at all, your main func­tion will be di­vid­ed into small­er func­tions. These small­er func­tions are writ­ten to help de­com­pose our pro­gram, to make it more un­der­stand­able and mod­u­lar, and so on.

The same line of think­ing ap­plies to writ­ing servlets.

HTTP re­quests

Re­quests (pro­vid­ed from web-serv­er/http/re­quest-structs) are struc­tures with eight com­po­nents:

| Field | Contract | Explanation |
|---|---|---|
| method | bytes? | The method (GET, POST, etc.) requested. |
| uri | url? | The requested URL. |
| headers/raw | (listof header?) | A list of headers. |
| bindings/raw-promise | (promise/c (listof binding?)) | A (promise of an) association list of key-value pairs. Primarily used when processing forms. |
| post-data/raw | (or/c false/c bytes?) | The request body. The post bit is somewhat of a misnomer: a body may be present even for non-POST requests. |
| host-ip | string? | The IP address of the host being requested. |
| host-port | number? | The port number of the host to which the request should be sent. |
| client-ip | string? | The IP address of the client making the request. |
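Normally the server builds requests for you, but it can help to construct one by hand and poke at its fields. The sketch below does that with make-request; every value in it (URL, header, binding, addresses) is made up for illustration.

```racket
#lang racket
(require net/url
         web-server/http/request-structs)

;; A hand-built request, roughly what the server might hand you
;; for "GET /search?q=racket". All eight fields, in order.
(define req
  (make-request #"GET"
                (string->url "http://localhost:6995/search?q=racket")
                (list (header #"Accept" #"text/html"))
                (delay (list (binding:form #"q" #"racket")))
                #f              ; post-data/raw: no body
                "127.0.0.1"     ; host-ip
                6995            ; host-port
                "192.0.2.10"))  ; client-ip

(request-method req)                          ; a byte string, not a string
(url->string (request-uri req))               ; the URL, rendered as a string
(map header-field (request-headers/raw req))  ; names of the headers
(map binding-id (request-bindings/raw req))   ; forces the promise
```

Note that request-bindings/raw forces the promise stored in the bindings/raw-promise field, so you rarely need to touch the promise yourself.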

HTTP re­spons­es

Re­spons­es (pro­vid­ed from web-serv­er/http/re­sponse-structs) have six fields:

| Field | Contract | Explanation |
|---|---|---|
| code | number? | The response status code (e.g., 200, 404, etc.) |
| message | bytes? | The summary of the response. Normally goes along with the status code: if that is 200, then this will be #"OK", etc. But it could be arbitrary (even empty). |
| seconds | number? | Timestamp. The current time, in seconds, since midnight, January 1, 1970 (UTC). |
| mime | (or/c false/c bytes?) | The MIME type for this response (e.g., text/html [as a sequence of bytes]). |
| headers | (listof header?) | Headers. |
| output | (-> output-port? any) | The body of the response. Writes to an output port. |

(For a list of stan­dard and not-so-stan­dard HTTP re­sponse sta­tus codes, see the list on Wikipedia.)
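Here is a small sketch of a response built directly from its six fields. The helper body-bytes is hypothetical: it simply runs the output function against an in-memory port so that we can look at what would be sent as the body.

```racket
#lang racket
(require web-server/http/response-structs)

;; A response built directly with the six-field constructor.
(define resp
  (response 200
            #"OK"
            (current-seconds)
            #"text/plain; charset=utf-8"
            '()                                     ; no extra headers
            (λ (out) (write-bytes #"Hello!" out)))) ; the body, as a function

;; body-bytes : response? -> bytes?
;; Hypothetical helper: call the output function on an in-memory
;; port and collect whatever it writes.
(define (body-bytes r)
  (define out (open-output-bytes))
  ((response-output r) out)
  (get-output-bytes out))

(response-code resp)   ; 200
(body-bytes resp)      ; #"Hello!"
```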

Head­ers

Head­ers can show up in re­quests or re­spons­es. A head­er is, es­sen­tial­ly, a key-val­ue as­so­ci­a­tion, where the key and the val­ue are byte strings.

| Field | Contract | Explanation |
|---|---|---|
| field | bytes? | The name of the header (e.g., Last-Modified) |
| value | bytes? | The value of the header. |
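A quick sketch of headers in action. The header values below are invented; headers-assq* is the library's case-insensitive lookup, which is handy because header names are not case-sensitive.

```racket
#lang racket
(require web-server/http/request-structs)

;; Two made-up headers; both parts are byte strings.
(define hs
  (list (header #"Content-Type" #"text/html; charset=utf-8")
        (header #"Last-Modified" #"Tue, 15 Nov 1994 12:45:26 GMT")))

;; Look a header up by name, ignoring case. The result is either
;; a header or #f.
(define found (headers-assq* #"content-type" hs))

(and found (header-value found))   ; #"text/html; charset=utf-8"
(headers-assq* #"X-Missing" hs)    ; #f
```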

Con­ve­nient­ly gen­er­at­ing re­spons­es

Included with this chapter is a code snippet that is used in virtually every other chapter of the book: respond.rkt. The main purpose of that module is to define a single function, respond, that conveniently generates Racket HTTP responses. Of course, one can always construct responses directly using response. But if you're like me, you may well find that approach rather nettlesome, which will cause you to want to roll your own convenience functions. (Naturally, you can do that. I make no claim to the finality or superiority of my code.)

Rather than walk through re­spond.rkt here, let's be con­tent with the fact that, in many chap­ters, you'll see this func­tion called many times.

Big bites of bytes

In the dis­cus­sion of re­quests, re­spons­es, and head­ers, you may have no­ticed that byte strings fea­tured promi­nent­ly. Why is that? Why not strings?

For in­stance, when ex­tract­ing the method of a re­quest, why do we get a byte string rather than, say, the string POST? That's a very sim­ple string. Why does it have to be so byte-y?

The byte per­spec­tive makes sense be­cause bytes are in fact what is com­ing to the serv­er over the wire. Strings are, from this point of view, a non-triv­ial data struc­ture, the re­sult of pars­ing a se­quence of bytes us­ing, say, the rules laid out in the de­f­i­n­i­tion of UTF-8.

Working with bytes feels real and raw. But it may, at times, be a bit nettlesome to constantly work in terms of bytes. One such annoyance is the conversion of byte strings into ordinary strings. The built-in bytes->string/utf-8 function gets used frequently. But this function doesn't (and can't!) convert arbitrary byte strings into strings. That is so because not every sequence of bytes is well-formed from the standpoint of UTF-8. (Continuing with the parsing idea, we know that not every sequence of characters can be parsed as a C program. Analogously, not every sequence of bytes can be understood as a UTF-8 string.)

Thus, in much of the code that you'll see in this book, there will frequently be a check of whether a byte string can be decoded as UTF-8. A function that I've found useful goes something like this:

(define (bytes->string b)
  (with-handlers ([exn:fail:contract? (const #f)])
    (bytes->string/utf-8 b)))

(We're us­ing con­st to make a con­stant func­tion.)

The func­tion bytes->string takes any Rack­et val­ue as in­put. If it's not a byte string, then we re­turn #f. If the val­ue is a byte string, we use bytes->string/utf-8 to get a prop­er string out of it; if that fails, we again re­turn #f. Oth­er­wise, we re­turn the (con­vert­ed) string.
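The three cases just described can be tried out directly. The sketch below repeats the definition so that it runs on its own; the sample inputs are mine.

```racket
#lang racket

;; any/c -> (or/c string? #f)
(define (bytes->string b)
  (with-handlers ([exn:fail:contract? (const #f)])
    (bytes->string/utf-8 b)))

(bytes->string #"POST")             ; "POST": well-formed UTF-8
(bytes->string (bytes #xCE #xBB))   ; "λ": a two-byte UTF-8 sequence
(bytes->string (bytes 255))         ; #f: the byte 0xFF never occurs in UTF-8
(bytes->string 'not-bytes)          ; #f: not a byte string at all
```

Both failure cases raise exn:fail:contract inside bytes->string/utf-8, which is why a single handler covers them.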

I've writ­ten a servlet. How do I make it run?

Once you've got a servlet ready to roll, you can put it to use us­ing serve/servlet. Here's an in­vo­ca­tion that you'll see many times, with some vari­a­tions, through­out the book:

(serve/servlet
 let-er-rip
 #:port 6995
 #:servlet-regexp #rx"")

Evaluating this expression starts an HTTP server that listens for requests on port 6995. For each request, the server calls let-er-rip and serializes the response (that is, the value of let-er-rip) for you.

(The #:servlet-regexp bit is to ensure that every request received gets passed on to let-er-rip. The regular expression is matched against the path of the requested URL; only requests whose path matches are handed to the servlet. The empty regular expression matches every path, so nothing is filtered out.)
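For completeness, here is a sketch of a whole module built around that invocation. The body of let-er-rip is a stand-in of my own, the run wrapper is hypothetical, and I've added #:launch-browser? #f (an assumption on my part) because serve/servlet otherwise tries to open a browser window.

```racket
#lang racket
(require web-server/servlet-env
         web-server/http/request-structs
         web-server/http/response-structs)

;; let-er-rip : request? -> response?
;; A stand-in servlet that echoes the request method back.
(define (let-er-rip req)
  (response/full 200 #"OK" (current-seconds) #"text/plain" '()
                 (list #"You sent a " (request-method req) #" request.\n")))

;; run : -> void?
;; Starts the server; this call blocks until the server is stopped.
(define (run)
  (serve/servlet let-er-rip
                 #:port 6995
                 #:servlet-regexp #rx""
                 #:launch-browser? #f))
```

Call (run) to start the server, then try something like curl -i http://localhost:6995/anything from another terminal.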

Servlet kata: HEAD re­quests

A com­mon task for many web sites is to rewrite HTTP re­quests and re­spons­es. In re­quest rewrit­ing, one re­ceives an HTTP re­quest, tweaks it in some way, and pass­es the ma­nip­u­lat­ed re­quest on to an­oth­er par­ty. Re­sponse rewrit­ing is sim­i­lar: one re­ceives an HTTP re­sponse, ma­nip­u­lates it some­how, and then pass­es that along to an­oth­er par­ty who is look­ing for a re­sponse.

With Racket, since requests and responses are structures, a straightforward way to accomplish rewriting is to use struct-copy. This form takes, say, an HTTP response as input and produces a copy of it, with some details changed.

Let's see how that works in the case of HEAD re­quests.

The pur­pose of an HTTP HEAD re­quest is, es­sen­tial­ly, to car­ry out a GET re­quest but re­turn no body. Such re­quests are of­ten used to de­ter­mine how big a re­source would be, if it were to be fetched with a real GET re­quest.

A nat­ur­al way of im­ple­ment­ing HEAD is to take the re­quest as in­put, rewrite its HTTP method from HEAD to GET, pass along that re­quest, and then throw away the re­sponse body.

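To pull this off in Racket, we need a few ingredients:

  1. a function that rewrites the method of a request from HEAD to GET,
  2. a function that throws away the body of a response, and
  3. a core function that produces responses, plus a wrapper that ties the first two pieces to it.

Let's take care of these tasks one at a time.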

HEAD to GET

This func­tion un­con­di­tion­al­ly rewrites the HTTP method of a re­quest into GET:

;; request? -> request?
(define (head->get req)
  (struct-copy request
               req
               [method #"GET"]))
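To see the rewrite at work, the sketch below applies head->get to a hand-built HEAD request (the request itself is made up). It also shows that struct-copy leaves the original request alone: we get a new request rather than a mutated one.

```racket
#lang racket
(require net/url
         web-server/http/request-structs)

;; request? -> request?
(define (head->get req)
  (struct-copy request req [method #"GET"]))

;; A made-up HEAD request for illustration.
(define head-req
  (make-request #"HEAD" (string->url "http://localhost/") '()
                (delay '()) #f "127.0.0.1" 6995 "127.0.0.1"))

(define get-req (head->get head-req))

(request-method get-req)    ; #"GET": the copy carries the new method
(request-method head-req)   ; #"HEAD": the original is untouched
```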

Throw away the body

This function discards a response's body:

;; response? -> response?
(define (strip-body resp)
  (struct-copy response
               resp
               [output write-nothing]))

where write-noth­ing is the func­tion

;; output-port? -> exact-nonnegative-integer?
(define (write-nothing port)
  (write-bytes #"" port))

write-noth­ing takes a port as in­put and writes the emp­ty (byte) string to it.

The gotcha here is that, for Rack­et re­spons­es, the body is a func­tion. It's not sim­ply, say, a (byte) string. That's why write-noth­ing—a func­tion—is the val­ue stored in the out­put field.

Core and wrap­per func­tions

The core function, the one that actually produces responses, can be whatever you want. The mantra to keep in mind is: request as argument, response as value. Let's call the core function dispatcher.

The wrap­per func­tion (for lack of a bet­ter word) is re­spon­si­ble for tak­ing the orig­i­nal re­quest as in­put, pos­si­bly chang­ing some de­tails, and pass­ing the pos­si­bly mod­i­fied re­quest along to the core re­spon­der. Let's call the wrap­per func­tion start.

;; request? -> response?
(define (start req)
  (if (bytes=? #"HEAD" (request-method req))
      (strip-body (dispatcher (head->get req)))
      (dispatcher req)))

No­tice that dis­patch­er gets used in ei­ther case. In case the re­quest method is not HEAD, we sim­ply in­voke the dis­patch­er di­rect­ly. If we do get a HEAD re­quest, we

  1. fake a GET re­quest,
  2. pass it along to dis­patch­er, and
  3. throw away what­ev­er re­sponse body comes back from dis­patch­er.
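Putting the pieces together, here is a sketch that can be run without touching the network. The dispatcher shown is a stand-in (any request? -> response? function would do), and fake-request and body-of are hypothetical helpers for building a request by hand and for capturing a response body in memory.

```racket
#lang racket
(require net/url
         web-server/http/request-structs
         web-server/http/response-structs)

(define (head->get req)
  (struct-copy request req [method #"GET"]))

(define (write-nothing port)
  (write-bytes #"" port))

(define (strip-body resp)
  (struct-copy response resp [output write-nothing]))

;; Stand-in core servlet: always answers with the body "hello".
(define (dispatcher req)
  (response/full 200 #"OK" (current-seconds) #"text/plain" '()
                 (list #"hello")))

(define (start req)
  (if (bytes=? #"HEAD" (request-method req))
      (strip-body (dispatcher (head->get req)))
      (dispatcher req)))

;; Hypothetical helper: a request with the given method, all else made up.
(define (fake-request method)
  (make-request method (string->url "http://localhost/") '()
                (delay '()) #f "127.0.0.1" 6995 "127.0.0.1"))

;; Hypothetical helper: run the output function against an in-memory port.
(define (body-of resp)
  (define out (open-output-bytes))
  ((response-output resp) out)
  (get-output-bytes out))

(body-of (start (fake-request #"GET")))    ; #"hello"
(body-of (start (fake-request #"HEAD")))   ; #""
```

The status code and headers of the HEAD response are whatever dispatcher produced for the faked GET; only the body differs.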