finish review and deleted the origin from sources

This commit is contained in:
wwy-hust 2015-05-14 13:19:34 +08:00
parent 3a28051046
commit 25df496145
2 changed files with 24 additions and 471 deletions

View File

@ -1,446 +0,0 @@
translating by wwy-hust
Web Caching Basics: Terminology, HTTP Headers, and Caching Strategies
=====================================================================
### Introduction
Intelligent content caching is one of the most effective ways to improve
the experience for your site's visitors. Caching, or temporarily storing
content from previous requests, is part of the core content delivery
strategy implemented within the HTTP protocol. Components throughout the
delivery path can all cache items to speed up subsequent requests,
subject to the caching policies declared for the content.
In this guide, we will discuss some of the basic concepts of web content
caching. This will mainly cover how to select caching policies to ensure
that caches throughout the internet can correctly process your content.
We will talk about the benefits that caching affords, the side effects
to be aware of, and the different strategies to employ to provide the
best mixture of performance and flexibility.
What Is Caching?
----------------
Caching is the term for storing reusable responses in order to make
subsequent requests faster. There are many different types of caching
available, each of which has its own characteristics. Application caches
and memory caches are both popular for their ability to speed up certain
responses.
Web caching, the focus of this guide, is a different type of cache. Web
caching is a core design feature of the HTTP protocol meant to minimize
network traffic while improving the perceived responsiveness of the
system as a whole. Caches are found at every level of a content's
journey from the original server to the browser.
Web caching works by caching the HTTP responses for requests according
to certain rules. Subsequent requests for cached content can then be
fulfilled from a cache closer to the user instead of sending the request
all the way back to the web server.
Benefits
--------
Effective caching aids both content consumers and content providers.
Some of the benefits that caching brings to content delivery are:
- **Decreased network costs**: Content can be cached at various points
in the network path between the content consumer and content origin.
When the content is cached closer to the consumer, requests will not
cause much additional network activity beyond the cache.
- **Improved responsiveness**: Caching enables content to be retrieved
faster because an entire network round trip is not necessary. Caches
maintained close to the user, like the browser cache, can make this
retrieval nearly instantaneous.
- **Increased performance on the same hardware**: For the server where
the content originated, more performance can be squeezed from the
same hardware by allowing aggressive caching. The content owner can
leverage the powerful servers along the delivery path to take the
brunt of certain content loads.
- **Availability of content during network interruptions**: With
certain policies, caching can be used to serve content to end users
even when it may be unavailable for short periods of time from the
origin servers.
Terminology
-----------
When dealing with caching, there are a few terms that you are likely to
come across that might be unfamiliar. Some of the more common ones are
below:
- **Origin server**: The origin server is the original location of the
content. If you are acting as the web server administrator, this is
the machine that you control. It is responsible for serving any
content that could not be retrieved from a cache along the request
route and for setting the caching policy for all content.
- **Cache hit ratio**: A cache's effectiveness is measured in terms of
its cache hit ratio or hit rate. This is a ratio of the requests
able to be retrieved from a cache to the total requests made. A high
cache hit ratio means that a high percentage of the content was able
to be retrieved from the cache. This is usually the desired outcome
for most administrators.
- **Freshness**: Freshness is a term used to describe whether an item
within a cache is still considered a candidate to serve to a client.
Content in a cache will only be used to respond if it is within the
freshness time frame specified by the caching policy.
- **Stale content**: Items in the cache expire according to the cache
freshness settings in the caching policy. Expired content is
"stale". In general, expired content cannot be used to respond to
client requests. The origin server must be re-contacted to retrieve
the new content or at least verify that the cached content is still
accurate.
- **Validation**: Stale items in the cache can be validated in order
to refresh their expiration time. Validation involves checking in
with the origin server to see if the cached content still represents
the most recent version of item.
- **Invalidation**: Invalidation is the process of removing content
from the cache before its specified expiration date. This is
necessary if the item has been changed on the origin server and
having an outdated item in cache would cause significant issues for
the client.
There are plenty of other caching terms, but the ones above should help
you get started.
What Can be Cached?
-------------------
Certain content lends itself more readily to caching than others. Some
very cache-friendly content for most sites are:
- Logos and brand images
- Non-rotating images in general (navigation icons, for example)
- Style sheets
- General Javascript files
- Downloadable Content
- Media Files
These tend to change infrequently, so they can benefit from being cached
for longer periods of time.
Some items that you have to be careful in caching are:
- HTML pages
- Rotating images
- Frequently modified Javascript and CSS
- Content requested with authentication cookies
Some items that should almost never be cached are:
- Assets related to sensitive data (banking info, etc.)
- Content that is user-specific and frequently changed
In addition to the above general rules, it's possible to specify
policies that allow you to cache different types of content
appropriately. For instance, if authenticated users all see the same
view of your site, it may be possible to cache that view anywhere. If
authenticated users see a user-sensitive view of the site that will be
valid for some time, you may tell the user's browser to cache, but tell
any intermediary caches not to store the view.
Locations Where Web Content Is Cached
-------------------------------------
Content can be cached at many different points throughout the delivery
chain:
- **Browser cache**: Web browsers themselves maintain a small cache.
Typically, the browser sets a policy that dictates the most
important items to cache. This may be user-specific content or
content deemed expensive to download and likely to be requested
again.
- **Intermediary caching proxies**: Any server in between the client
and your infrastructure can cache certain content as desired. These
caches may be maintained by ISPs or other independent parties.
- **Reverse Cache**: Your server infrastructure can implement its own
cache for backend services. This way, content can be served from the
point-of-contact instead of hitting backend servers on each request.
Each of these locations can and often do cache items according to their
own caching policies and the policies set at the content origin.
Caching Headers
---------------
Caching policy is dependent upon two different factors. The caching
entity itself gets to decide whether or not to cache acceptable content.
It can decide to cache less than it is allowed to cache, but never more.
The majority of caching behavior is determined by the caching policy,
which is set by the content owner. These policies are mainly articulated
through the use of specific HTTP headers.
Through various iterations of the HTTP protocol, a few different
cache-focused headers have arisen with varying levels of sophistication.
The ones you probably still need to pay attention to are below:
- **`Expires`**: The `Expires` header is very straight-forward,
although fairly limited in scope. Basically, it sets a time in the
future when the content will expire. At this point, any requests for
the same content will have to go back to the origin server. This
header is probably best used only as a fall back.
- **`Cache-Control`**: This is the more modern replacement for the
`Expires` header. It is well supported and implements a much more
flexible design. In almost all cases, this is preferable to
`Expires`, but it may not hurt to set both values. We will discuss
the specifics of the options you can set with `Cache-Control` a bit
later.
- **`Etag`**: The `Etag` header is used with cache validation. The
origin can provide a unique `Etag` for an item when it initially
serves the content. When a cache needs to validate the content it
has on-hand upon expiration, it can send back the `Etag` it has for
the content. The origin will either tell the cache that the content
is the same, or send the updated content (with the new `Etag`).
- **`Last-Modified`**: This header specifies the last time that the
item was modified. This may be used as part of the validation
strategy to ensure fresh content.
- **`Content-Length`**: While not specifically involved in caching,
the `Content-Length` header is important to set when defining
caching policies. Certain software will refuse to cache content if
it does not know in advanced the size of the content it will need to
reserve space for.
- **`Vary`**: A cache typically uses the requested host and the path
to the resource as the key with which to store the cache item. The
`Vary` header can be used to tell caches to pay attention to an
additional header when deciding whether a request is for the same
item. This is most commonly used to tell caches to key by the
`Accept-Encoding` header as well, so that the cache will know to
differentiate between compressed and uncompressed content.
### An Aside about the Vary Header
The `Vary` header provides you with the ability to store different
versions of the same content at the expense of diluting the entries in
the cache.
In the case of `Accept-Encoding`, setting the `Vary` header allows for a
critical distinction to take place between compressed and uncompressed
content. This is needed to correctly serve these items to browsers that
cannot handle compressed content and is necessary in order to provide
basic usability. One characteristic that tells you that
`Accept-Encoding` may be a good candidate for `Vary` is that it only has
two or three possible values.
Items like `User-Agent` might at first glance seem to be a good way to
differentiate between mobile and desktop browsers to serve different
versions of your site. However, since `User-Agent` strings are
non-standard, the result will likely be many versions of the same
content on intermediary caches, with a very low cache hit ratio. The
`Vary` header should be used sparingly, especially if you do not have
the ability to normalize the requests in intermediate caches that you
control (which may be possible, for instance, if you leverage a content
delivery network).
How Cache-Control Flags Impact Caching
--------------------------------------
Above, we mentioned how the `Cache-Control` header is used for modern
cache policy specification. A number of different policy instructions
can be set using this header, with multiple instructions being separated
by commas.
Some of the `Cache-Control` options you can use to dictate your
content's caching policy are:
- **`no-cache`**: This instruction specifies that any cached content
must be re-validated on each request before being served to a
client. This, in effect, marks the content as stale immediately, but
allows it to use revalidation techniques to avoid re-downloading the
entire item again.
- **`no-store`**: This instruction indicates that the content cannot
be cached in any way. This is appropriate to set if the response
represents sensitive data.
- **`public`**: This marks the content as public, which means that it
can be cached by the browser and any intermediate caches. For
requests that utilized HTTP authentication, responses are marked
`private` by default. This header overrides that setting.
- **`private`**: This marks the content as `private`. Private content
may be stored by the user's browser, but must *not* be cached by any
intermediate parties. This is often used for user-specific data.
- **`max-age`**: This setting configures the maximum age that the
content may be cached before it must revalidate or re-download the
content from the origin server. In essence, this replaces the
`Expires` header for modern browsing and is the basis for
determining a piece of content's freshness. This option takes its
value in seconds with a maximum valid freshness time of one year
(31536000 seconds).
- **`s-maxage`**: This is very similar to the `max-age` setting, in
that it indicates the amount of time that the content can be cached.
The difference is that this option is applied only to intermediary
caches. Combining this with the above allows for more flexible
policy construction.
- **`must-revalidate`**: This indicates that the freshness information
indicated by `max-age`, `s-maxage` or the `Expires` header must be
obeyed strictly. Stale content cannot be served under any
circumstance. This prevents cached content from being used in case
of network interruptions and similar scenarios.
- **`proxy-revalidate`**: This operates the same as the above setting,
but only applies to intermediary proxies. In this case, the user's
browser can potentially be used to serve stale content in the event
of a network interruption, but intermediate caches cannot be used
for this purpose.
- **`no-transform`**: This option tells caches that they are not
allowed to modify the received content for performance reasons under
any circumstances. This means, for instance, that the cache is not
able to send compressed versions of content it did not receive from
the origin server compressed and is not allowed.
These can be combined in different ways to achieve various caching
behavior. Some mutually exclusive values are:
- `no-cache`, `no-store`, and the regular caching behavior indicated
by absence of either
- `public` and `private`
The `no-store` option supersedes the `no-cache` if both are present. For
responses to unauthenticated requests, `public` is implied. For
responses to authenticated requests, `private` is implied. These can be
overridden by including the opposite option in the `Cache-Control`
header.
Developing a Caching Strategy
-----------------------------
In a perfect world, everything could be cached aggressively and your
servers would only be contacted to validate content occasionally. This
doesn't often happen in practice though, so you should try to set some
sane caching policies that aim to balance between implementing long-term
caching and responding to the demands of a changing site.
### Common Issues
There are many situations where caching cannot or should not be
implemented due to how the content is produced (dynamically generated
per user) or the nature of the content (sensitive banking information,
for example). Another problem that many administrators face when setting
up caching is the situation where older versions of your content are out
in the wild, not yet stale, even though new versions have been
published.
These are both frequently encountered issues that can have serious
impacts on cache performance and the accuracy of content you are
serving. However, we can mitigate these issues by developing caching
policies that anticipate these problems.
### General Recommendations
While your situation will dictate the caching strategy you use, the
following recommendations can help guide you towards some reasonable
decisions.
There are certain steps that you can take to increase your cache hit
ratio before worrying about the specific headers you use. Some ideas
are:
- **Establish specific directories for images, css, and shared
content**: Placing content into dedicated directories will allow you
to easily refer to them from any page on your site.
- **Use the same URL to refer to the same items**: Since caches key
off of both the host and the path to the content requested, ensure
that you refer to your content in the same way on all of your pages.
The previous recommendation makes this significantly easier.
- **Use CSS image sprites where possible**: CSS image sprites for
items like icons and navigation decrease the number of round trips
needed to render your site and allow your site to cache that single
sprite for a long time.
- **Host scripts and external resources locally where possible**: If
you utilize javascript scripts and other external resources,
consider hosting those resources on your own servers if the correct
headers are not being provided upstream. Note that you will have to
be aware of any updates made to the resource upstream so that you
can update your local copy.
- **Fingerprint cache items**: For static content like CSS and
Javascript files, it may be appropriate to fingerprint each item.
This means adding a unique identifier to the filename (often a hash
of the file) so that if the resource is modified, the new resource
name can be requested, causing the requests to correctly bypass the
cache. There are a variety of tools that can assist in creating
fingerprints and modifying the references to them within HTML
documents.
In terms of selecting the correct headers for different items, the
following can serve as a general reference:
- **Allow all caches to store generic assets**: Static content and
content that is not user-specific can and should be cached at all
points in the delivery chain. This will allow intermediary caches to
respond with the content for multiple users.
- **Allow browsers to cache user-specific assets**: For per-user
content, it is often acceptable and useful to allow caching within
the user's browser. While this content would not be appropriate to
cache on any intermediary caching proxies, caching in the browser
will allow for instant retrieval for users during subsequent visits.
- **Make exceptions for essential time-sensitive content**: If you
have content that is time-sensitive, make an exception to the above
rules so that the out-dated content is not served in critical
situations. For instance, if your site has a shopping cart, it
should reflect the items in the cart immediately. Depending on the
nature of the content, the `no-cache` or `no-store` options can be
set in the `Cache-Control` header to achieve this.
- **Always provide validators**: Validators allow stale content to be
refreshed without having to download the entire resource again.
Setting the `Etag` and the `Last-Modified` headers allow caches to
validate their content and re-serve it if it has not been modified
at the origin, further reducing load.
- **Set long freshness times for supporting content**: In order to
leverage caching effectively, elements that are requested as
supporting content to fulfill a request should often have a long
freshness setting. This is generally appropriate for items like
images and CSS that are pulled in to render the HTML page requested
by the user. Setting extended freshness times, combined with
fingerprinting, allows caches to store these resources for long
periods of time. If the assets change, the modified fingerprint will
invalidate the cached item and will trigger a download of the new
content. Until then, the supporting items can be cached far into the
future.
- **Set short freshness times for parent content**: In order to make
the above scheme work, the containing item must have relatively
short freshness times or may not be cached at all. This is typically
the HTML page that calls in the other assisting content. The HTML
itself will be downloaded frequently, allowing it to respond to
changes rapidly. The supporting content can then be cached
aggressively.
The key is to strike a balance that favors aggressive caching where
possible while leaving opportunities to invalidate entries in the future
when changes are made. Your site will likely have a combination of:
- Aggressively cached items
- Cached items with a short freshness time and the ability to
re-validate
- Items that should not be cached at all
The goal is to move content into the first categories when possible
while maintaining an acceptable level of accuracy.
Conclusion
----------
Taking the time to ensure that your site has proper caching policies in
place can have a significant impact on your site. Caching allows you to
cut down on the bandwidth costs associated with serving the same content
repeatedly. Your server will also be able to handle a greater amount of
traffic with the same hardware. Perhaps most importantly, clients will
have a faster experience on your site, which may lead them to return
more frequently. While effective web caching is not a silver bullet,
setting up appropriate caching policies can give you measurable gains
with minimal work.
---
作者: [Justin Ellingwood](https://www.digitalocean.com/community/users/jellingwood)
译者:[译者ID](https://github.com/译者ID)
校对:[校对者ID](https://github.com/校对者ID)
推荐:[royaso](https://github.com/royaso)
via: https://www.digitalocean.com/community/tutorials/web-caching-basics-terminology-http-headers-and-caching-strategies
本文由 [LCTT](https://github.com/LCTT/TranslateProject) 原创翻译,[Linux中国](http://linux.cn/) 荣誉推出

View File

@ -3,9 +3,9 @@ Web
### 简介
对于您的站点的访问者来说智能化的内容缓存是提高用户体验最有效的方式之一。缓存或者对之前请求的临时存储是HTTP协议实现中最核心的内容分发策略之一。分发路径中的组件均可以缓存内容来加速后续的请求这受制于对内容所声明的缓存策略。
对于您的站点的访问者来说智能化的内容缓存是提高用户体验最有效的方式之一。缓存或者对之前请求的临时存储是HTTP协议实现中最核心的内容分发策略之一。分发路径中的组件均可以缓存内容来加速后续的请求这受制于对内容所声明的缓存策略。
在这份指南中我们将讨论一些Web内容缓存的基本概念。这主要包括如何选择缓存策略以保证互联网范围内的缓存能够正确的处理您的内容。我们将谈一谈缓存带来的好处、副作用以及不同的策略能带来的性能和灵活性的最大结合。
在这份指南中我们将讨论一些Web内容缓存的基本概念。这主要包括如何选择缓存策略以保证互联网范围内的缓存能够正确的处理您的内容。我们将谈一谈缓存带来的好处、副作用以及不同的策略能带来的性能和灵活性的最大结合。
什么是缓存?
------------
@ -19,7 +19,7 @@ Web
好处
----
有效的缓存技术不仅帮助用户,还帮助内容的提供者。一些缓存对内容分发带来的好处有:
有效的缓存技术不仅帮助用户,还帮助内容的提供者。缓存对内容分发带来的好处有:
- **减少网络开销**:内容可以在从内容提供者到内容消费者网络路径之间的许多不同的地方被缓存。当内容在距离内容消费者更近的地方被缓存时,由于缓存的存在,请求将不会消耗额外的网络资源。
@ -27,7 +27,7 @@ Web
- **在同样的硬件上提高速度**:对于保存原始内容的服务器来说,更多的性能可以通过允许激进的缓存策略从硬件上压榨出来。内容拥有者们可以利用交付路径上强大的服务器来应对某个内容负载的冲击。
- **网络中断时内容依旧可用**:使用某种策略,缓存可以保证内容对用户的可用,尽管当原始服务器可能在某段时间内变得不可用。
- **网络中断时内容依旧可用**:使用某种策略,缓存可以保证在原始服务器变得不可用时,相应的内容对用户依旧可用。
术语
----
@ -36,20 +36,20 @@ Web
- **原始服务器**原始服务器是内容的原始存放地点。如果您是Web服务器管理员它就是您所管理的机器。它负责为任何不能从缓存中得到的内容进行回复并且负责设置所有内容的缓存策略。
- **缓存命中率**:一个缓存有效性依照缓存的命中率进行度量。他是可以从缓存中得到的数据的请求数与所有请求数的比率。缓存命中率高意味着有很高比例的数据可以从缓存中获得。这通常是大多数管理员想要的结果。
- **缓存命中率**:一个缓存的有效性依照缓存的命中率进行度量。它是可以从缓存中得到数据的请求数与所有请求数的比率。缓存命中率高意味着有很高比例的数据可以从缓存中获得。这通常是大多数管理员想要的结果。
- **新鲜度**:新鲜度用来描述一个缓存中的项目是否依旧适合返回给客户端。缓存中的内容只有在由缓存策略指定的新鲜度时间内才会被返回。
- **新鲜度**:新鲜度用来描述一个缓存中的项目是否依旧适合返回给客户端。缓存中的内容只有在由缓存策略指定的新鲜内才会被返回。
- **过期内容**:缓存中根据缓存策略的新鲜度设置已过期的内容。过期的内容被标记为"陈旧"。通常,过期内容不能用于回复客户端的请求。必须重新从原始服务器请求新的内容或者至少验证缓存的内容是否仍然准确。
- **过期内容**:缓存中根据缓存策略的新鲜期设置已过期的内容。过期的内容被标记为“陈旧”。通常,过期内容不能用于回复客户端的请求。必须重新从原始服务器请求新的内容或者至少验证缓存的内容是否仍然准确。
- **校验**:缓存中的过期内容可以被验证有效以便刷新过期时间。验证过程包括联系原始服务器以检查缓存的数据是否依旧代表了最近的版本。
- **失效**:失效是依据过期日期从缓存中移除内容的过程。当内容在原始服务器上已被改变并且缓存中过期的内容会导致客户端签名问题时这个步骤是必须的。
- **失效**:失效是依据过期日期从缓存中移除内容的过程。当内容在原始服务器上已被改变并且缓存中过期的内容会导致客户端签名问题时这个步骤是必须的。
还有许多其他的缓存术语,不过上面的这些应该能帮助开始。
还有许多其他的缓存术语,不过上面的这些应该能帮助开始。
什么能被缓存?
-------------
--------------
某些特定的内容比其他内容更容易被缓存。对大多数站点来说,一些缓存友好的内容如下:
@ -74,7 +74,7 @@ Web
- 与敏感信息相关的资源(银行数据,等)
- 用户相关且经常更改的数据
除上面的通用规则外,通常您需要指定一些规则以便于更好地缓存不同种类的内容。例如,如果登陆的用户都看到的是同样的网站视图,就应该在任何地方缓存这个页面。如果登陆的用户会在一段时间内看到站点中用户敏感的视图,您应该让用户的浏览器缓存该数据但不让中间的任何中介缓存该视图。
除上面的通用规则外,通常您需要指定一些规则以便于更好地缓存不同种类的内容。例如,如果登陆的用户都看到的是同样的网站视图,就应该在任何地方缓存这个页面。如果登陆的用户会在一段时间内看到站点中用户敏感的视图,您应该让用户的浏览器缓存该数据而不应让任何中介节点缓存该视图。
Web内容缓存的位置
-----------------
@ -92,9 +92,9 @@ Web
缓存策略依赖于两个不同的因素。缓存实体本身需要决定是否应该缓存可接受的内容。它可以只缓存部分可以缓存的内容,但不能缓存超过限制的内容。
缓存行为主要由缓存策略决定,而缓存策略由内容拥有者设置。这些策略主要由特定的HTTP头部来清晰的表达。
缓存行为主要由缓存策略决定,而缓存策略由内容拥有者设置。这些策略主要通过特定的HTTP头部来清晰地表达。
经过不同HTTP协议的迭代一些不同的聚焦缓存的头部出现了,它们的复杂度各不相同。下面列出了那些你也许应该注意的:
经过不同HTTP协议的迭代一些不同的聚焦缓存的头部出现了,它们的复杂度各不相同。下面列出了那些你也许应该注意的:
- **`Expires`**:尽管使用范围相当有限,但`Expires`头部是非常简洁明了的。通常它设置一个未来的时间内容会在此时间过期。这时任何对同样内容的请求都应该回到原始服务器处。这个头部或许仅仅最适合回退模式fall back
- **`Cache-Control`**:这是`Expires`的一个更加现代化的替换物。它已被很好的支持,且拥有更加灵活的实现。在大多数案例中,它比`Expires`更好,但同时设置两者的值也无妨。稍后我们将讨论您可以设置的`Cache-Control`的详细选项。
@ -105,11 +105,11 @@ Web
### Vary头部的隐语
`Vary`头部提供给您存储同一个内容不同版本的能力。代价是稀释了缓存的入口。
`Vary`头部提供给您存储同一个内容不同版本的能力。代价是稀释了缓存的入口。
在`Accept-Encoding`的案例中,设置`Vary`头部允许压缩和未压缩的内容拥有决定性的区别。这在服务某些不能处理压缩数据的浏览器时很重要,它可以保证基本的使用。`Accept-Encoding`是`Vary`的不错的候选值是因为它只有两到三个可用的值这一特性
在`Accept-Encoding`的案例中,设置`Vary`头部允许压缩和未压缩的内容拥有决定性的区别。这在服务某些不能处理压缩数据的浏览器时很重要,它可以保证基本的使用。`Accept-Encoding`是`Vary`的不错的候选值是因为它只有两到三个可选的值
`User-Agent`这样的头部可能一开始看上去是区分移动浏览器和桌面浏览器以便您的站点提供差异化的服务的好主意。但`User-Agent`字符串是非标准的结果将会造成在中间缓存中对同一内容的许多不同版本的缓存,这会导致缓存命中率的降低。`Vary`头部应该被保守地使用,尤其是您不具备在您控制的中间混存(例如,您控制着内容分发网络)中使请求标准化的能力。
`User-Agent`这样的头部可能一开始看上去是区分移动浏览器和桌面浏览器以便您的站点提供差异化的服务的好主意。但`User-Agent`字符串是非标准的结果将会造成在中间缓存中对同一内容的许多不同版本的缓存,这会导致缓存命中率的降低。`Vary`头部应该被保守地使用,尤其是您不具备在您控制的中间混存(例如,您控制着内容分发网络)中使请求标准化的能力。
缓存控制标志怎样影响缓存
------------------------
@ -120,9 +120,9 @@ Web
- **`no-cache`**:这个指令指示所有缓存的内容在新的请求到达时必须先重新验证,再发送给客户端。这条指令实际将内容立刻标记为过期的,但允许它们被验证技术重新验证以避免重新下载整个内容。
- **`no-store`**:这条指令指示缓存的内容不能以任何方式被缓存。它适合在回复敏感信息时设置。
- **`public`**它将内容标记为公有的这意味着它能被浏览器和其他任何中间节点缓存。通常对于使用HTTP验证的请求其回复被默认标记为`private`。`public`标记将会覆盖这个设置。
- **`public`**:它将内容标记为公有的,这意味着它能被浏览器和其他任何中间节点缓存。通常,对于使用HTTP验证的请求其回复被默认标记为`private`。`public`标记将会覆盖这个设置。
- **`private`**:它将内容标记为私有的。私有数据可以被用户的浏览器缓存,但*不能*被任何中间节点缓存。它通常用于用户相关的数据。
- **`max-age`**:这个设置指示了缓存内容的最大生存期,它在最大生存期后必须在源服务器处被验证或被重新下载。在现代浏览器中这个选项大体上取代了`Expires`头部,浏览器也将其作为决定内容的新鲜度的基础。这个选项的值以秒为单位表示,最大可以表示一年的新鲜31536000秒
- **`max-age`**:这个设置指示了缓存内容的最大生存期,它在最大生存期后必须在源服务器处被验证或被重新下载。在现代浏览器中这个选项大体上取代了`Expires`头部,浏览器也将其作为决定内容的新鲜度的基础。这个选项的值以秒为单位表示,最大可以表示一年的新鲜31536000秒
- **`s-maxage`**:这个选项非常类似于`max-age`,它指明了内容能够被缓存的时间。区别是这个选项只在中间节点的缓存中有效。结合这两个选项可以构建更加灵活的缓存策略。
- **`must-revalidate`**:它指明了由`max-age`、`s-maxage`或`Expires`头部指明的新鲜度信息必须被严格的遵守。它避免了缓存的数据在网络中断等类似的场景中被使用。
- **`proxy-revalidate`**:它和上面的选项有着一样的作用,但只应用于中间的代理节点。在这种情况下,用户的浏览器可以在网络中断时使用过期内容,但中间的缓存内容不能用于此目的。
@ -130,7 +130,7 @@ Web
这些选项能够以不同的方式结合以获得不同的缓存行为。一些互斥的值如下:
- `no-cache``no-store`以及由其他前面提到的选项指明的常用的缓存行为
- `no-cache``no-store`以及由其他前面提到的选项指明的常用的缓存行为
- `public`和`private`
如果`no-store`和`no-cache`都被设置,那么`no-store`会取代`no-cache`。对于非授权的请求的回复,`public`是隐含的设置。对于授权的请求的回复,`private`选项是隐含的。他们可以通过在`Cache-Control`头部中指明相应的相反的选项以覆盖。
@ -138,8 +138,7 @@ Web
开发一种缓存策略
----------------
在理想情况下,任何内容都可以被激进的缓存,而您的服务器只需要偶尔的提供一些验证内容即可。但这在现实中很少发生,因此您应该尝试设置一些理智的缓存策略,以在长期缓存和站点改变的需求间达到平衡。
在理想情况下,任何内容都可以被激进的缓存,而您的服务器只需要偶尔的提供一些验证内容即可。但这在现实中很少发生,因此您应该尝试设置一些明智的缓存策略,以在长期缓存和站点改变的需求间达到平衡。
### 常见问题
@ -147,11 +146,11 @@ Web
这些都是经常遇到的问题,它们会影响缓存的性能和您提供的数据的准确性。然而,我们可以通过开发提前预见这些问题的缓存策略来缓和这些问题。
### 一般建议
### 一般建议
尽管您的实际情况会指导您选择的缓存策略,但是下面的建议能帮助您获得一些合理的决定。
在您担心使用哪一个特定的头部之前,有一些特定的步骤可以帮助您提您的缓存命中率。一些建议如下:
在您担心使用哪一个特定的头部之前,有一些特定的步骤可以帮助您提您的缓存命中率。一些建议如下:
- **为图像、CSS和共享的内容建立特定的文件夹**:将内容放到特定的文件夹内使得您可以方便的从您的站点中的任何页面引用这些内容。
- **使用同样的URL来表示同样的内容**:由于缓存使用内容请求中的主机名和路径作为键,因此应保证您的所有页面中的该内容的引用方式相同,前一个建议能让这点更加容易做到。
@ -162,7 +161,7 @@ Web
对于不同的文件正确地选择不同的头部这件事,下面的内容可以作为一般性的参考:
- **允许所有的缓存存储一般内容**:静态内容以及非用户相关的内容应该在分发链的所有节点被缓存。这使得中间节点可以将该内容回复给多个用户。
- **允许浏览器缓存用户相关的内容**:对于每个用户的数据,通常在用户自己的浏览器中缓存是可以被接且有益的。缓存在用户自身的浏览器能够使得用户在接下来的浏览中能够瞬时读取,但这些内容不适合在任何中间代理节点缓存。
- **允许浏览器缓存用户相关的内容**:对于每个用户的数据,通常在用户自己的浏览器中缓存是可以被接且有益的。缓存在用户自身的浏览器能够使得用户在接下来的浏览中能够瞬时读取,但这些内容不适合在任何中间代理节点缓存。
- **将时间敏感的内容作为特例**:如果您的数据是时间敏感的,那么相对上面两条参考,应该将这些数据作为特例,以保证过期的数据不会在关键的情况下被使用。例如,您的站点有一个购物车,它应该立刻反应购物车里面的物品。依据内容的特点,可以在`Cache-Control`头部中使用`no-cache`或`no-store`选项。
- **总是提供验证器**:验证器使得过期的内容可以无需重新下载而得到刷新。设置`ETag`和`Last-Modified`头部将允许缓存向原始服务器验证内容,并在内容未修改时刷新该内容新鲜度以减少负载。
- **对于支持的内容设置长的新鲜期**为了更加有效的利用缓存一些作为支持性的内容应该被设置较长的新鲜期。这通常比较适合图像和CSS等由用户请求用来渲染HTML页面的内容。和文件摘要一起设置延长的新鲜期将允许缓存长时间的存储这些资源。如果资源发生改变修改的文件摘要将会使缓存的数据无效并触发对新的内容的下载。那时新的支持的内容会继续被缓存。
@ -179,7 +178,7 @@ Web
结论
----
花时间确保您的站点使用了合适的缓存策略将对您的站点产生重要的影响。混存使得您可以保证服务同样内容的同时减少带宽的使用。您的服务器因此可以靠同样的硬件处理更多的流量。或许更重要的是客户们能在您的网站中获得更快的体验这会使得他们更愿意频繁的访问。尽管有效的Web缓存并不是银弹但设置合适的缓存策略会使您以最小的代价获得可观的收获。
花时间确保您的站点使用了合适的缓存策略将对您的站点产生重要的影响。缓存使得您可以在保证服务同样内容的同时减少带宽的使用。您的服务器因此可以靠同样的硬件处理更多的流量。或许更重要的是,客户们能在您的网站中获得更快的体验,这会使得他们更愿意频繁的访问您的站点。尽管有效的Web缓存并不是银弹但设置合适的缓存策略会使您以最小的代价获得可观的收获。
---