Server-generated HTML: the browser GETs it and renders it as it streams through.
Client-generated HTML: the browser GETs an HTML page with a JavaScript link, then GETs the .js file, then executes it; the JS GETs the JSON, then translates that into HTML at the DOM level.
The latter is arguably something that will lead to a better user experience once the page is in a "steady state", i.e. all dependent representations are loaded into the browser and rendered. But relying on it for "first render" makes for a slow experience when (e.g.) clicking on a link to an individual Twitter status.
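To put rough numbers on the two request chains above, here is a back-of-the-envelope sketch. It assumes every GET costs one full round trip, that nothing is cached, and that the script needs some time to execute before it can request the JSON. The 100 ms round trip and 50 ms execution time are made-up illustrative values, not measurements of Twitter.

```python
def first_render_ms(rtt_ms: float, round_trips: int, exec_ms: float = 0.0) -> float:
    """Estimate time to first render as sequential round trips plus script time.

    Deliberately ignores streaming, parallel connections, and server time;
    it only counts the requests that must happen one after another.
    """
    return rtt_ms * round_trips + exec_ms


RTT_MS = 100.0       # assumed network round trip
JS_EXEC_MS = 50.0    # assumed time to parse and run the script

# Server-generated HTML: one GET, rendered as it streams in.
server_rendered = first_render_ms(RTT_MS, round_trips=1)

# Client-generated HTML: GET the page, GET the .js, run it, GET the JSON.
client_rendered = first_render_ms(RTT_MS, round_trips=3, exec_ms=JS_EXEC_MS)

print(f"server-rendered: {server_rendered:.0f} ms")
print(f"client-rendered: {client_rendered:.0f} ms")
```

Under these assumptions the gap comes entirely from the two extra sequential round trips and the script execution, which is why it shows up on first render and not once everything is loaded.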
Getting that JavaScript is a one-time operation, and a 304 from then on. Also, the HTML can include bootstrapped data, saving the roundtrip for JSON.
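For what "bootstrapped data" could look like, here is a minimal sketch: the server embeds the initial JSON in the HTML shell so the script does not need a separate request for it. The `render_shell` function, the `window.BOOTSTRAP` name, and the `/app.js` path are all hypothetical, chosen for illustration only.

```python
import json


def render_shell(initial_data: dict) -> str:
    """Return an HTML shell with the initial data inlined as JSON.

    "</" is escaped as "<\\/" (a valid JSON escape for "/") so that a string
    containing "</script>" cannot close the inline script tag early.
    """
    payload = json.dumps(initial_data).replace("</", "<\\/")
    return (
        "<!doctype html>\n"
        "<html><body>\n"
        '<div id="app"></div>\n'
        f"<script>window.BOOTSTRAP = {payload};</script>\n"
        '<script src="/app.js"></script>\n'
        "</body></html>"
    )


if __name__ == "__main__":
    print(render_shell({"status": {"id": 1, "text": "hello"}}))
```

The client script would then read the inlined object instead of issuing its own GET for the JSON, which removes one round trip from the first render.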
Also, with client-side rendering you execute more code on the client but less on the server, so in an environment like Twitter where it's not possible to do any sort of heavy caching (everybody sees something else), you're simply trading time on the server for time on the client. Not faster, not slower.
Server-side HTML generation is not a magical 0ms process.
Don't forget that JavaScript has to be interpreted every time the page loads, even if you have a cached copy of it. If it's a large chunk of code, the time to do that is not trivial.
I'm not quite sure that it's as much of a zero-sum game as you present it. I can easily think of scenarios where rendering on the server is much faster (e.g. using a compiled language vs JS, taking advantage of powerful hardware, granular caching, etc) and much more constant.
"Getting that JavaScript is a one-time operation, and a 304 from then on."
In theory it sounds right. However, there are a couple of cases where users will have to load the JS a lot more often than they should. Since most of the logic lives in the JS file(s), they will be changed and pushed out a lot more often. This will force users to download the JS again every time code is deployed.
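The deploy problem can be sketched with a conditional GET. This assumes the server validates the cached copy with an ETag derived from the file contents, which is one common setup and not necessarily what Twitter does; `etag_for` and `respond` are hypothetical helpers.

```python
import hashlib
from typing import Optional, Tuple


def etag_for(body: bytes) -> str:
    """Derive a validator from the file contents (quoted, as ETags are)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(current_js: bytes, if_none_match: Optional[str]) -> Tuple[int, bytes]:
    """Answer a GET for the script: 304 if the client's copy is current."""
    if if_none_match == etag_for(current_js):
        return 304, b""          # cached copy still valid, no body sent
    return 200, current_js       # first visit or stale copy: full download


v1 = b"console.log('v1');"
v2 = b"console.log('v2');"      # what the server holds after a deploy

first_visit = respond(v1, None)
repeat_visit = respond(v1, etag_for(v1))
after_deploy = respond(v2, etag_for(v1))

print(first_visit[0], repeat_visit[0], after_deploy[0])
```

The repeat visit gets the cheap 304, but the same client pays for a full download again as soon as the deployed file changes, and a new user always pays for it.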
Also, I am not sure what percent of "New users" land on Twitter pages, but they will have to download the JS.