This is a continuation of a series of blog entries on this topic. The series starts here.
Several of the YSlow performance rules reflect the effective use of the caching services that are available for web content. These services include the portion of the local file system that is used for the web client's cache, Content Delivery Networks (CDNs), which are caches geographically distributed around the globe, and various server-side caching mechanisms. Effective use of caching improves the round trip time for any static content that can readily be cached. Since network transmission time is roughly a function of distance, the cache that is physically closest to the web client is naturally the most effective at reducing RTT. Of the caches that are available, the cache maintained by the web browser on the client machine's file system is physically the closest and thus is usually the best place for caching to occur. The web browser automatically stores a copy of any HTTP objects it has requested that are eligible for caching in a designated folder within the file system; in Internet Explorer, for example, the browser cache corresponds to the Temporary Internet Files folder.
The YSlow model of web application performance, depicted back in Equations 3 & 4 in the previous post, leads directly to an optimization strategy to minimize the number of round trips, decrease round trip time, or both. Several of the YSlow performance rules reflect tactics for minimizing the number of round trips to the web server and back that are required to render the page. These include
- designing the Page so there are fewer objects to Request,
- using compression to make objects smaller so they require fewer packets to be transmitted, and
- techniques for packing multiple objects into a single request.
Minify, for example, is a PHP5 app that helps you follow several of Yahoo!'s Rules for High Performance Web Sites: it combines multiple CSS or JavaScript files, removes unnecessary whitespace and comments, and serves them with gzip encoding and optimal client-side cache headers. All the text-based files that are used in composing the page – .htm, .css, and .js – tend to compress very well, and the HTTP protocol supports automatic unpacking of gzip-encoded files. Since there is little benefit from compressing files that are already smaller than the Ethernet MTU, YSlow recommends packing smaller files into larger ones so that text compression is more effective.
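A quick back-of-the-envelope experiment makes the point. The sketch below, in which the file names and contents are invented purely for illustration, compares gzipping a handful of small text assets one at a time against packing them into a single larger file first; the combined bundle typically compresses better and, just as important, can be delivered in a single Request.

```python
import gzip

# Hypothetical small text assets of the kind a page references one by
# one; the names and contents are invented for this illustration.
assets = {
    "reset.css":  b"body { margin: 0; padding: 0; }\n" * 20,
    "layout.css": b".grid { display: flex; flex-wrap: wrap; }\n" * 20,
    "theme.css":  b".btn { color: #fff; background: #0078d7; }\n" * 20,
}

# Compressing each file separately: every file pays gzip's fixed header
# and trailer overhead and compresses only against its own short history.
separate_total = sum(len(gzip.compress(data)) for data in assets.values())

# Packing the files into one bundle before compressing lets gzip exploit
# redundancy across the whole bundle, and the result needs only one Request.
combined_total = len(gzip.compress(b"".join(assets.values())))

print(f"compressed separately:    {separate_total} bytes, {len(assets)} Requests")
print(f"compressed as one bundle: {combined_total} bytes, 1 Request")
```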
Meanwhile, the performance rules associated with cache effectiveness are designed to minimize RTT, the round trip time. If current copies of the HTTP objects requested from the web server can be retrieved from sources physically located considerably closer to the requestor, the average network round trip time for those Requests can be improved.
With its focus on the number and size of the files the web browser must assemble in order to construct the page's document object from these component parts, YSlow uses an approach to optimization known in the field of Operations Research (OR) as decomposition. The classic example of decomposition in OR is the time and motion study, where a complex task is broken into a set of activities that are performed in sequence to complete it. The one practical obstacle to using decomposition here, however, is that while YSlow understands the components that are used to compose the web page, it lacks measurements of how long the overall task and its component parts take.
As discussed in the previous section, these measurements would be problematic from the standpoint of a tool like YSlow, which analyzes the DOM once it has been completely assembled; YSlow does not attempt to measure the time it took to perform that assembly. Moreover, the way the tool works, YSlow deals with only a single instance of the rendered page. If it did attempt to measure network latency, cache effectiveness, or client-side processing power, it would be capable of gathering only a single instance of those measurements, and there is no guarantee that a single observation would be representative of the range and variation in behavior a public-facing web application should expect to encounter in reality. As we consider the many and varied ways caching technology, for example, is used to speed up page load times, you will start to see just how problematic it can be to use a single observation of the page load time measurement to represent the range and variation in actual web page load times.
Caching.
If a file referenced in a GET Request is already resident in the web browser cache – the disk folder where recently accessed cacheable HTTP objects are stored – the browser can add that file to the DOM without having to make a network request. Web servers add an Expires header to Response messages to indicate to the web browser that the content is eligible for caching. As the name indicates, the Expires header specifies how long the existing copy of that content remains current. Fetching content from the browser cache requires a disk operation, which is normally significantly faster than a network request. If a valid copy of the requested content is already resident in the browser cache, the round trip time normally improves by an order of magnitude, since a block can be fetched from disk in 5-10 milliseconds on average. Note that reading a cached file from disk is not always faster than accessing the network to get the same data. As with any other design alternative, it is important to measure to see which performs better. In the case of an intranet web application where web browser requests can be fielded very quickly, network access, often involving less than 1 ms of latency, might actually be preferred: it could be much faster to get the requested HTTP object directly from the IIS kernel-mode cache than for the web client to have to access the local disk folder where its Temporary Internet Files are stored.
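To make the mechanism concrete, here is a minimal sketch of a static file handler that stamps cacheable content types with Cache-Control and Expires headers, built on Python's standard http.server module purely for illustration; the one-week freshness lifetime and the list of file extensions are arbitrary choices, not recommendations from the YSlow rules.

```python
import http.server
import time
from email.utils import formatdate

ONE_WEEK = 7 * 24 * 60 * 60  # freshness lifetime in seconds (arbitrary choice)

class CachingHandler(http.server.SimpleHTTPRequestHandler):
    """Serves files from the current directory, marking relatively static
    content types as cacheable for one week."""

    def end_headers(self):
        # Only relatively static content types get a far-off expiration
        # date; everything else falls through with no caching headers.
        if self.path.endswith((".css", ".js", ".png", ".jpg")):
            self.send_header("Cache-Control", f"max-age={ONE_WEEK}")
            self.send_header("Expires",
                             formatdate(time.time() + ONE_WEEK, usegmt=True))
        super().end_headers()

if __name__ == "__main__":
    http.server.HTTPServer(("", 8000), CachingHandler).serve_forever()
```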
Note also that while caching does not help the first time a customer accesses a new web page, it has a substantial impact on subsequent accesses to the page. Web traffic analysis programs will report the number of unique visitors to a web site – each of these arrives with a browser cache that contains none of the content being requested, which is referred to as a cold start in cache. It is only the repeat visitors that benefit from caching, and only if the repeat visit to the web site occurs prior to the content expiration date and time. In his book, Souders reports an encouragingly high number of repeat visits to the Yahoo site as evidence for the YSlow recommendation. When network latency for an external web site is at least 100-200 ms, accessing the local disk-resident browser cache is an order of magnitude faster.
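Even when a repeat visit occurs after the content's expiration date, the browser does not necessarily re-download the object; it can revalidate its stale copy with a conditional GET and, if nothing has changed, receive a 304 Not Modified response with no body at all. The sketch below walks through that exchange by hand; the URL is just a placeholder, and a real server may or may not supply the Last-Modified validator the sketch relies on.

```python
import urllib.request
from urllib.error import HTTPError

url = "https://example.com/"   # placeholder; substitute a page of your own

# First visit: download the object and note the validator the server supplies.
with urllib.request.urlopen(url) as resp:
    body = resp.read()
    last_modified = resp.headers.get("Last-Modified")

assert last_modified, "this server did not supply a Last-Modified validator"

# Repeat visit after expiration: ask the server whether the copy changed.
req = urllib.request.Request(url, headers={"If-Modified-Since": last_modified})
try:
    with urllib.request.urlopen(req) as resp:
        print(f"content changed, re-downloaded {len(resp.read())} bytes")
except HTTPError as err:
    if err.code == 304:
        print(f"304 Not Modified: reuse the {len(body)} cached bytes")
    else:
        raise
```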
When the web browser is hosted on a mobile phone, which is often configured without a secondary storage device, the capability to cache content is consequently very limited. When Chrome detects it is running on an Android phone, for example, it configures a memory-resident cache that will only hold up to 32 files at any one time. If you access any reasonably complex web site landing page with, say, more than 20-30 href= external file references, the effect is to flush the contents of the Chrome mobile phone cache.
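A toy simulation shows why such a small cache ends up being flushed. The script below assumes a 32-entry cache with least-recently-used eviction – the eviction policy is an assumption made for illustration, not a description of Chrome's actual implementation – and replays a landing page that references 40 external files.

```python
from collections import OrderedDict

class TinyBrowserCache:
    """A toy fixed-size cache with least-recently-used eviction; the
    eviction policy is an assumption made purely for illustration."""

    def __init__(self, max_entries=32):
        self.max_entries = max_entries
        self.entries = OrderedDict()

    def fetch(self, url):
        if url in self.entries:               # cache hit: no network needed
            self.entries.move_to_end(url)
            return "hit"
        if len(self.entries) >= self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used
        self.entries[url] = object()          # pretend we stored the response
        return "miss"

cache = TinyBrowserCache(max_entries=32)
page_assets = [f"https://example.com/asset{i}.js" for i in range(40)]

for url in page_assets:        # first visit: every reference is a miss
    cache.fetch(url)

hits = sum(cache.fetch(url) == "hit" for url in page_assets)   # repeat visit
print(f"repeat visit: {hits} of {len(page_assets)} objects served from cache")
```

Because the repeat visit's early misses keep evicting the very objects that would have been hits a moment later, the cache ends up satisfying none of the 40 requests.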
Any CSS and JavaScript files that are relatively stable can potentially also be cached, but this entails implementing a versioning policy that your web developers adhere to. The snippet of HTML that I pulled from an Amazon product landing page, discussed earlier, illustrates the sort of versioning policy your web developers need to implement to reap the benefits of caching while still enabling program bug fixes, updates, and other maintenance to ship promptly.
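One common way to implement such a policy is to derive the version from the file's contents, so the URL changes automatically whenever the file does. The helper below is a hypothetical sketch of that idea, not the scheme Amazon actually uses, and it assumes the named file exists locally; pairing the versioned URL with a far-future Expires header lets old copies stay cacheable while updates are picked up immediately.

```python
import hashlib
from pathlib import Path

def versioned_url(path: str) -> str:
    """Return a URL whose name embeds a digest of the file's contents."""
    digest = hashlib.md5(Path(path).read_bytes()).hexdigest()[:8]
    stem, _, suffix = path.rpartition(".")
    return f"/static/{stem}.{digest}.{suffix}"

# "site.css" becomes something like /static/site.<8-char digest>.css;
# editing the file yields a new URL, so browsers fetch the change right
# away while the old, unchanged copies remain safely cacheable.
print(versioned_url("site.css"))
```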
Another caching consideration is that when popular JavaScript libraries like jquery.js or angular.js, or any of their add-on libraries, are incorporated into your web applications, you will often find that current copies of these files already exist in the browser's cache and do not require network requests to retrieve them. Taking a moment to check the contents of my Internet Explorer disk cache, I can see that several different versions of jquery.js are currently resident in the IE cache. Another example is the Google Analytics script, ga.js, which so many web sites utilize for tracking web page usage and which is frequently already resident in the browser cache. (I will be discussing some interesting aspects of the Google Analytics program in an upcoming section.)
Content that is generated dynamically is more problematic to cache. Web 2.0 pages that are custom-built for a specific customer probably contain some elements that are unique to that user ID, while other parts of the page are apt to be shared among many customers. Typically, the web server programs that build dynamic HTML Response messages will simply flag them to expire immediately so that they are ineligible for caching by the web browser. Caching dynamically generated content is challenging; nevertheless, it is appropriate whenever common portions of the pages are reused, especially when it is resource-intensive to re-generate that content on demand. We will discuss strategies and facilities for caching at least some portion of the dynamic content web sites generate in a future Post.
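As a preview of the idea, the sketch below caches just the shared portion of a dynamically built page on the server for a short time, while the per-user portion is rebuilt on every request. The function names, the fragment being cached, and the 60-second lifetime are all assumptions made for the example.

```python
import time

_fragment_cache: dict[str, tuple[float, str]] = {}

def cached_fragment(key: str, builder, ttl: float = 60.0) -> str:
    """Return a cached copy of a page fragment, rebuilding it after `ttl` seconds."""
    now = time.time()
    hit = _fragment_cache.get(key)
    if hit and hit[0] > now:
        return hit[1]                          # still fresh: reuse it
    html = builder()                           # expensive to regenerate
    _fragment_cache[key] = (now + ttl, html)
    return html

def render_page(user_name: str) -> str:
    # The product catalog sidebar is shared by every customer, so it is
    # cached; the greeting is unique to this user and is always rebuilt.
    sidebar = cached_fragment("catalog-sidebar",
                              lambda: "<ul><li>...product list...</li></ul>")
    return f"<p>Hello, {user_name}</p>{sidebar}"

print(render_page("Alice"))
```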
Beyond caching at the local machine, YSlow also recommends the use of a Content Delivery Network (CDN) such as the Akamai commercial caching engine to reduce the RTT for relatively static Response messages. CDNs replicate your web site content across a set of geographically distributed web servers, which allows the CDN server physically closest to the requestor to serve up the requested content. The net result is a reduction in the networking round trip time simply because the CDN server is physically closer to the end user than your corporate site. Note that the benefits of a CDN even extend to first-time visitors of your site, because the CDN servers hold up-to-date copies of the most recent static content from your primary web site host. For Microsoft IIS web servers and ASP.NET applications, there are additional server-side caching options for both static and dynamic content that I will explore much later in this discussion.
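Wiring a site to a CDN is often as simple as referencing static assets from the CDN's hostname instead of the origin's and letting the CDN pull and replicate the content. The helper below is a sketch of that pattern; the hostname is invented for the example and is not a real CDN endpoint.

```python
CDN_HOST = "https://static.example-cdn.net"   # invented hostname, not a real CDN

def asset_url(path: str, use_cdn: bool = True) -> str:
    """Build a reference to a static asset, served from the CDN when enabled."""
    clean = path.lstrip("/")
    return f"{CDN_HOST}/{clean}" if use_cdn else f"/{clean}"

print(asset_url("css/site.css"))         # served from the CDN's edge servers
print(asset_url("css/site.css", False))  # fallback: served from the origin
```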
Extensive use of caching techniques in web technologies to improve page load time is one of the reasons why a performance tool like YSlow does not actually attempt to measure Page Load Time. When YSlow re-loads the page to inventory all the file-based HTTP objects that are assembled to construct the DOM, the web browser is likely to discover many of these objects in its local disk cache, drastically reducing the time it takes to compose and render the web page. Were YSlow to measure the response time, the impact of the local disk cache would bias the results. A tool like the WebPageTest.org site tries to deal with this measurement quandary by accessing your web site a second time and comparing the results to a first-time user access involving a browser cache cold start.
Having read and absorbed the expert advice encapsulated in the YSlow performance rules, and beginning to contemplate modifying your web application based on that advice, you start to feel the lack of actual page load time measurements keenly. It is good to know that using a minify utility and making effective use of the cache control headers should speed up page load time. But without actual page load time measurements, you cannot know how much adopting these best practices will help your specific application. Nor can you weigh the value of improvements from tactics like adding Expires headers for CSS and JavaScript files to boost cache effectiveness against the burden of augmenting your software development practices with an appropriate approach to versioning those files. Fortunately, there are popular tools that measure web browser Page Load Time directly, and we will look at them in a moment.
Next: Complications that the simple YSlow model does not fully take into account