Fight the Lie
March 19th, 2008
It’s easy and fun to fight the lie.
(This was originally going to be a comment over on Comet Daily, but it got a bit long.)
Regarding this, my comment about avoiding pub/sub was directed at Dan’s proposal to introduce SUBSCRIBE and NOTIFY methods into HTTP. However, I think he’s really on to something in his desire to (otherwise) stick to the use of GET. Before I explain what that something is, I should say that I wasn’t clear in my previous post about caching and resource mutation. I had some time to think about it after I submitted, and I came to the conclusion that since caching aids scalability, and since in some Comet applications there may be many clients interested in some resource, it would be useful for the resource to be cacheable. Say your application has 20k clients. (The ratio of clients to caches comes into play here, but it always does.) When some resource is updated, which would be better: the server sending the same update to each client, or each client GETting the resource, allowing any intermediary cache(s) to return a cached result? If your answer doesn’t involve the use of GET, increase the number of clients to 100k, 500k, 1m, until your answer changes.
What I’m arguing is that since GETting updated resources will aid scalability of Comet applications in some rather important cases, a proposal to modify HTTP should be applicable to those cases.
Now back to Dan’s suggestion. While I don’t like the idea of SUBSCRIBE and NOTIFY methods per se, there’s a way of applying the spirit of his idea that seems maximally REST-friendly. In the case where a client is interested in getting updates to a single resource, an HTTP GET + When-Modified-After will suffice. If the client cares about multiple resources, instead of using POST to signal interest in those resources, the client can instead GET + When-Modified-After a feed which is updated whenever any of the resources is updated. To use Kris’ example, if a client is interested in updates to “news” and “weather,” the feed URIs of which are:
http://feeds.example.com/news
http://feeds.example.com/weather
Then the client could simply issue a GET + When-Modified-After to:
http://feeds.example.com/news;weather
And the server would respond with any relevant updated news or weather URIs modified after the given time. The client could then GET those URIs, exploiting any intermediary caches.
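Here is a minimal sketch of that request in Python. It assumes the proposed When-Modified-After header carries an HTTP date the way If-Modified-Since does; the proposal's actual syntax may differ. The function name `combined_feed_request` is made up for illustration, and the request is only built, never sent.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

FEED_BASE = "http://feeds.example.com/"  # base URI from the example above


def combined_feed_request(feeds, modified_after):
    """Build (but do not send) a GET for a combined feed such as news;weather.

    When-Modified-After is the proposed, non-standard header. Formatting its
    value as an HTTP date is an assumption made for this sketch.
    """
    url = FEED_BASE + ";".join(feeds)
    http_date = format_datetime(modified_after.astimezone(timezone.utc), usegmt=True)
    return Request(url, headers={"When-Modified-After": http_date}, method="GET")


req = combined_feed_request(
    ["news", "weather"], datetime(2008, 3, 19, tzinfo=timezone.utc)
)
print(req.get_method(), req.full_url)
```

The server's side of the exchange (holding the response open until one of the feeds changes) is not modelled here.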
A possibly healthy enhancement to HTTP cache behavior would be to respect the proposed semantics of When-Modified-After. I expect that HTTP caches typically behave like:
If I have a cached copy of resource R:
    return it to the client
otherwise
    block the client
    GET resource R
    return it to the client
To respect When-Modified-After, they would do:
If I have a cached copy of resource R:
    return it to the client
otherwise
    if header contains When-Modified-After
        add client to the set S of clients waiting for R
        GET resource R
        write same copy of R to each client in set S
    otherwise
        block the client
        GET resource R
        return it to the client
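The second block of pseudocode can be modelled as a toy in-memory class. This is only a sketch, not a real HTTP cache: `CoalescingCache`, `deliver`, and the `origin` callable are hypothetical names, and the point at which the upstream GET completes is simulated by calling `deliver` by hand.

```python
class CoalescingCache:
    """Toy model of the When-Modified-After pseudocode; not a real HTTP cache."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: uri -> body
        self._cached = {}                # uri -> cached body
        self._waiting = {}               # uri -> clients waiting for it (set S)

    def get(self, uri, client, when_modified_after=False):
        if uri in self._cached:
            return self._cached[uri]
        if when_modified_after:
            # Park the client in S; it is answered later by deliver().
            self._waiting.setdefault(uri, []).append(client)
            return None
        # Plain GET: fetch, remember, and return the resource.
        body = self._fetch(uri)
        self._cached[uri] = body
        return body

    def deliver(self, uri):
        """One upstream GET, then the same copy goes to every waiting client."""
        body = self._fetch(uri)
        self._cached[uri] = body
        return {client: body for client in self._waiting.pop(uri, [])}


origin_calls = []


def origin(uri):
    origin_calls.append(uri)
    return "contents of " + uri


cache = CoalescingCache(origin)
uri = "http://feeds.example.com/news"
cache.get(uri, "c1", when_modified_after=True)
cache.get(uri, "c2", when_modified_after=True)
responses = cache.deliver(uri)
print(responses, len(origin_calls))
```

Both waiting clients receive the same body from a single upstream fetch, which is the scalability benefit the pseudocode is after.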
In general, that’s what HTTP caches would have to do to respect When-Modified-After. However, while writing the pseudocode I realized that two clients that GET the same resource using When-Modified-After will not necessarily be interested in the same version of R. Client c1 may be interested in R only when it’s been modified after time t1, and client c2 may care about R only when modified after t2. That limits cacheability yet again, and it convinces me that there’s insufficient separation of concerns in Kris’ proposal. Instead of saying “Respond when the resource has been modified after time t” (which embeds state into the request), the semantics of the header should be “Respond when the resource has been modified,” relying on feed paging to prevent loss of updates. That is, clients should receive a response when the resource has been modified, and in the case that the response is a feed containing the URIs of modified (or, if the resources are immutable, new) resources, the client will be responsible for GETting those resources, taking advantage of feed paging to ensure that no relevant modifications are lost. This strikes me as a more REST-friendly approach, principally with respect to REST’s statelessness and cacheability.
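To make the feed-paging idea concrete, here is a rough sketch of the client-side bookkeeping. The layout is an assumption made for illustration: pages arrive newest first, each page lists entry URIs newest first, and the client remembers the last URI it processed. The function name `unseen_entries` and the `/news/N` URIs are hypothetical.

```python
def unseen_entries(pages, last_seen):
    """Return entry URIs newer than last_seen, oldest first.

    pages is an iterable of feed pages, newest page first; each page is a
    list of entry URIs, newest entry first (a hypothetical layout).
    """
    unseen = []
    for page in pages:
        for uri in page:
            if uri == last_seen:
                # Everything collected so far is newer than what we had.
                return list(reversed(unseen))
            unseen.append(uri)
    # last_seen never appeared, so every entry is new to this client.
    return list(reversed(unseen))


pages = [["/news/5", "/news/4"], ["/news/3", "/news/2"], ["/news/1"]]
print(unseen_entries(pages, "/news/2"))
```

Because the request itself carries no timestamp, every client can be handed the same feed page, and each one works out locally how far back it needs to walk.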
While I set out to think about a REST-friendly way for a client to keep track of multiple resources, the cache pseudocode exposed what should have been obvious to me: the time parameter of the proposed When-Modified-After header subverts the statelessness of REST whether the client is interested in a single resource (in which case the resource being GETted is the resource in question) or multiple resources (in which case, in my scheme at least, the resource being GETted is a paged feed). With that point in mind, if I were on the IETF’s HTTP working group, I would reject Kris’ proposed When-Modified-After in favor of a parameterless version (“When-Modified”). Whether a parameterless When-Modified would work well in conjunction with cache channels, and whether cache channels obviate the need for When-Modified, are questions I’ll have to think about this coming week.
One of the problems with Comet is that it’s a bag of non-standard tricks (or even implementation-dependent hacks, depending on how you slice it). This proposal to add support for Comet techniques to the HTTP RFC may remedy that ailment. It boils down to:
Also, When-Modified-After plus Last-Modified or Content-Range aids in the prevention of lost updates. I truly hope this or something like it becomes standard and – fingers crossed – implemented by the major browsers. As if that weren’t asking for the world already, I’d like to see it in mobile browsers as well. One can wish….
I’ve been thinking for a while about how my current line of work, IP telephony, relates to REST. The future architecture of my employer’s product line will apparently be based on SIP (for signaling among endpoints, of course) and the WS-* stack (for applications to integrate into our platform), which is the primary object of derision of REST advocates. The back-and-forth among and between the REST and WS-* folks is something I’m not qualified to evaluate as though from above, nor is it a race in which I’m particularly interested in having a horse. I’m really just a grasshopper who’s been mulling over the design of a call control protocol based on URI and HTTP. All of my on-the-job work with IP telephony has been with CTI (computer telephony integration), which boils down to first- and third-party call control protocols like TSAPI and CSTA-III XML. Take TSAPI. It allows you to instruct one endpoint to call another using a method called MakeCall. It takes two endpoint identifiers as parameters and the result of a successful invocation is that a call is made from the calling endpoint to the called endpoint. (It doesn’t necessarily make the called endpoint *answer* the call though.) MakeCall assumes a connection to a “switching element,” which is capable of making the calling endpoint go off hook and dial the called endpoint. How would this look with HTTP methods and URIs? The operation isn’t idempotent, so we’d use HTTP POST. One way would be to put the two endpoints on equal footing in the URI:
http://call.example.com/tel:5454;tel:6662
Another way would be to consider the calling endpoint “above” the called endpoint in the hierarchy of the URI:
http://call.example.com/tel:5454/tel:6662
In the first case, the meaning seems to be “establish a call between extensions 5454 and 6662.” In the second, it seems to be “make a call from 5454 to 6662.” The second seems more fitting, because MakeCall itself doesn’t force the called endpoint to answer the call.
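A small sketch of the two URI shapes, plus a MakeCall-style POST built against the second one. The helper names (`peer_call_uri`, `directed_call_uri`, `make_call_request`) are invented for illustration, the empty request body is a placeholder since no body format has been defined, and nothing is sent over the network.

```python
from urllib.request import Request

CALL_BASE = "http://call.example.com/"  # host from the examples above


def peer_call_uri(a, b):
    """Both endpoints on equal footing: a call between a and b."""
    return f"{CALL_BASE}tel:{a};tel:{b}"


def directed_call_uri(calling, called):
    """Calling endpoint above the called one: a call from calling to called."""
    return f"{CALL_BASE}tel:{calling}/tel:{called}"


def make_call_request(calling, called):
    """Build (but do not send) the POST that would stand in for MakeCall."""
    return Request(directed_call_uri(calling, called), data=b"", method="POST")


req = make_call_request("5454", "6662")
print(req.get_method(), req.full_url)
print(peer_call_uri("5454", "6662"))
```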
It would be nice for a group of interested people to get together and hammer out a scheme for doing (CSTA-style?) call control using HTTP+URI. In addition to the design of the URIs, they’d have to define which operations are legal (e.g. what, if anything, does GET on http://call.example.com/tel:5454 mean?), the names and types of POST parameters, and the HTTP body formats. HTTP requests and responses should be used exhaustively. I have to wonder whether it’s possible or desirable to use the CSTA model — which seems to describe a graph in which endpoints and calls are vertices and connections are edges — as the starting point. If the WWW is a graph in which resources are vertices and URIs are edges between them, why can’t a network of telephony endpoints be described and manipulated using the same constraints as the WWW?
What would be really nice is a soft phone that ran on the iPhone. It could be a Skype app or someone else’s, I don’t particularly care, but something like SkypeOut would be useful. My fundamental problem is that AT&T’s signal isn’t so great inside my house; being able to make VOIP calls from my iPhone over wifi would solve that problem. The next problem is incoming calls. In order to prevent the soft phone and the iPhone from ringing at the same time, which could be problematic, the call routing system (whether it’s AT&T’s or, say, Google GrandCentral’s) would need to ring based on network presence. If my iPhone is using my home wifi network, I’d want the soft phone to ring. This could be accomplished using XMPP or SIP from the soft phone to the call router.
Googling a little, hey, cool! Not quite what I was looking for, but it’s a start.
I love my iPhone, etc, etc, but I was surprised that it doesn’t give you a way to tell Google Maps “I have no idea where I am, but I want to get to X,” which is what you’d want to do when you’re lost. Just use cellular location services to get a reasonable approximation of your location and give that to Google Maps as the start location. Microsoft apparently has software for doing this; why doesn’t Apple?
Fibonacci strings are related to Fibonacci numbers in that the length of the nth Fibonacci string is the nth Fibonacci number. Here’s pseudocode for a function that generates them:
fibstr: int -> string
    if n is 0
        return the empty string
    else if n is 1
        return the string "b"
    else if n is 2
        return the string "a"
    else
        return fibstr(n-1) concatenated with fibstr(n-2)
The strings generated by this function for n > 2 have the interesting property that if you delete the last two letters the resulting string is a palindrome. Anyway, for kicks I decided to use Fibonacci strings to test the string performance of popular programming languages. The test was simple: run fibstr(31) 31 times. I was lazy in writing the tests, but the results were consistent across many cups of Bhakti chai and many executions of the test yesterday morning, so as far as I’m concerned they’re valid representations of the relative performance of the languages.
| Language | Real | User | System |
|---|---|---|---|
| C | 0.809s | 0.800s | 0.000s |
| Java (StringBuilder w/JIT) | 2.807s | 2.652s | 0.084s |
| Java (String w/JIT) | 2.819s | 2.692s | 0.076s |
| JavaScript (Rhino 1.6.r5-3) | 4.240s | 4.100s | 0.080s |
| Python (2.4.4) | 4.708s | 3.992s | 0.012s |
| Perl (5.8.8) | 7.528s | 7.484s | 0.008s |
| Java (StringBuilder w/o JIT) | 8.171s | 8.053s | 0.040s |
| Java (String w/o JIT) | 11.707s | 11.553s | 0.068s |
| Ruby (1.8.6) | 21.348s | 18.565s | 2.488s |
Off-the-cuff … Ruby’s performance speaks for itself. I expected Perl to do better, since it’s been around the longest, except for C. I can see why Sun responded to early complaints about Java’s performance with JIT. It’s clearly effective, but I’d like to know more about how the JRE decides to compile to native code. With JIT, StringBuilder doesn’t buy you anything if it’s compiled to native code; without it, StringBuilder can make a noticeable (on paper) difference over plain String, but I wonder whether the difference shows up much in real workloads. If a string-manipulating function is a hot spot, it’ll probably be compiled to native code. I suppose one would have to examine the memory impact of String v. StringBuilder as well.
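For reference, here is the fibstr pseudocode transcribed into Python, along with a check of the palindrome property and a timing loop in the shape of the test described above. This is a fresh sketch, not the code behind the table; `is_palindrome` and `run_test` are names invented here, and timings on a current interpreter will not be comparable to the Python 2.4.4 row.

```python
import time


def fibstr(n):
    """Return the nth Fibonacci string, following the pseudocode above."""
    if n == 0:
        return ""
    if n == 1:
        return "b"
    if n == 2:
        return "a"
    return fibstr(n - 1) + fibstr(n - 2)


def is_palindrome(s):
    return s == s[::-1]


def run_test(n=31, repetitions=31):
    """Time `repetitions` calls of fibstr(n); returns elapsed seconds."""
    start = time.perf_counter()
    for _ in range(repetitions):
        fibstr(n)
    return time.perf_counter() - start


# Small values only, so that this snippet finishes instantly.
print([fibstr(n) for n in range(1, 7)])
print(all(is_palindrome(fibstr(n)[:-2]) for n in range(3, 15)))
```

Calling `run_test()` with its defaults repeats the fibstr(31) workload 31 times. It is left out of the snippet because the naive recursion takes a while.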
If I’m remembering my lambda calculus correctly, this is a function that evaluates to itself when applied to itself … or something. Hard core.
As if being horribly cool weren’t enough, David Bowie is also a smart cookie.
I was having the problem described here, except on Debian. My fix:
$ cd $(dirname $(locate setup.rb)) && sudo ruby setup.rb