How To Create Caching Of Web Site Pages For Offline Access


From this article, you will learn how to create a page that lists previously cached materials right on the mobile device, in the browser, so that a hypothetical user stuck in an elevator does not miss the Internet. On the way to that goal, we will touch on the following topics:

  • Caching site pages for offline access;
  • Keeping a record of the pages available offline and passing along the necessary data;
  • Monitoring the network status (online or offline);
  • Communication between the service worker and the pages and tabs it serves;
  • Intercepting, in the service worker, a request for the /offline/ address and generating a new page directly on the device, without contacting the server.

If service workers and Progressive Web Apps (PWA) are new to you, get to know them better before reading this article.

This guide grew out of the pitfalls we ran into while implementing a PWA for the mobile version of some websites.

In the text, there will be small examples of code that illustrate the story. An extended demo version can be viewed on GitHub.

Connecting the service worker

A service worker that serves the entire site should be located at the root, for example at the /service-worker.js address. In our case, this is required. If you serve the service worker file from a subdirectory, for example /js/service-worker.js, then by default it will be able to process only those network requests that begin with /js/…
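
To illustrate why the file location matters, here is a minimal sketch of how the default scope follows from the script address. The helpers defaultScope and isControlled are hypothetical names invented for this example; the browser performs the equivalent check itself.

```javascript
// Sketch: the default scope of a service worker is the directory
// that contains its script. Helper names are hypothetical.

/** Returns the default scope path for a service worker script URL. */
function defaultScope(scriptUrl) {
  // './' resolved against the script URL gives its directory
  return new URL('./', scriptUrl).pathname;
}

/** True if a page path falls inside the given scope. */
function isControlled(pagePath, scope) {
  return pagePath.startsWith(scope);
}

const rootScope = defaultScope('https://example.com/service-worker.js');
const jsScope = defaultScope('https://example.com/js/service-worker.js');

// A page outside /js/ is covered by the root worker only
console.log(rootScope, isControlled('/articles/homer.html', rootScope));
console.log(jsScope, isControlled('/articles/homer.html', jsScope));
```

The scope can also be passed explicitly as the second argument of register(), but it cannot be wider than the script directory unless the server sends the Service-Worker-Allowed response header.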

We connect the service-worker from the page of our site:


// app.js - runs on the site page

// After the page is fully loaded, register the service worker
if ('serviceWorker' in navigator) {
  window.addEventListener('load', registerServiceWorker);
}

function registerServiceWorker() {
  // The file is served from the root of the site,
  // so it can process requests for all subsections
  navigator.serviceWorker.register('/service-worker.js')
    .then(registration => {
      if (!registration.active) {
        // Not yet activated
        return;
      }

      // The service worker is activated, you can work with it.
      // A little later we will add calls to the necessary functions.

    });
}

The initialization code of the service worker in our example should contain a complete list of the resources needed to render the future /offline/ page correctly: all styles, images, and so on. We pre-cache them on the install event, the first in the chain of life-cycle events.


// service-worker.js

// Name of the cache that stores the site's resources
const cacheName = 'myApp';

// Files that are required offline
const dependencies = [
  '/css/app.css',
  '/js/offline_page.js',
  '/img/logo.png',
  '/img/default_thumb.png'
];

// Installation phase, the service worker is not yet active
self.addEventListener('install', event => {

  // Load all the files that are required for offline mode
  const loadDependencies = self.caches.open(cacheName)
    .then(cache => cache.addAll(dependencies));

  // The service worker will move on to the next stage of its life cycle
  // once all the necessary files are loaded and cached
  event.waitUntil(loadDependencies);
});

The next event is activate. It is useful to us for clearing the old cache and the records in the database. In our example, the simple idb-keyval helper is used to work with IndexedDB. It and its more capable sibling idb are handy wrappers that simplify working with the dated IndexedDB API.


// service-worker.js

import {clear} from 'idb-keyval';

// Name of the cache that stores the site's resources
const cacheName = 'myApp';

// Files that are required offline
const dependencies = [/* ... */];

// Activation
self.addEventListener('activate', event => {

  // Clear the records in IndexedDB
  const promiseClearIDB = clear();

  // Clear the old cache
  const promiseClearCache = self.caches.open(cacheName)
    .then(cache => cache.keys()
      .then(cacheKeys => Promise.all(cacheKeys.map(request => {
        // Delete everything except the files required offline.
        // request.url is absolute, while the list holds paths,
        // so we compare by pathname.
        const {pathname} = new URL(request.url);
        const canDelete = !dependencies.includes(pathname);
        return canDelete
          ? cache.delete(request, {ignoreVary: true})
          : Promise.resolve();
      }))));

  const promiseClearAll = Promise.all([promiseClearIDB, promiseClearCache])
    .catch(err => console.error(err));

  // The life cycle of the service worker will continue
  // once the cache and IndexedDB are cleared
  event.waitUntil(promiseClearAll);
});

After activation, the service worker is ready to work. It will be able to process network requests and receive messages from all pages of our site that were opened after its activation. You just need to add the appropriate event handlers. This is where we will catch the request for the /offline/ page.


// service-worker.js

// Process outgoing network requests
self.addEventListener('fetch', event => {
  const {request} = event;
  const url = new URL(request.url);

  if (url.origin !== self.location.origin) {
    // Foreign domain, do not process this request:
    // without respondWith the browser handles it as usual
    return;
  }

  // Check whether the /offline/ page has been requested
  const isOfflineListRequested = /^\/offline\//.test(url.pathname);

  const response = isOfflineListRequested
    // Create a custom response with a page that
    // lists the materials available offline
    ? createOfflineListResponse()

    // Make the usual request.
    // Here, depending on the URL, we can apply
    // different caching strategies: save some things "for ages",
    // update others every time, etc.
    : fetchWithOneOrAnotherCacheStrategy(request);

  event.respondWith(response);
});

What are “caching strategies” and why are they needed?

The resources that we load play different roles on the page. One may be a logo image or a shared JS library that will most likely never change. Another may be a JSON file with comments that is updated every five minutes.

Depending on their purpose, the files and documents involved in building the page and in its life cycle can be roughly divided into groups:

  • Those that can be cached “forever”;
  • Those that cannot be cached for long;
  • Those that can be cached but should be updated when possible;
  • And so on; this list is limited only by your imagination and business objectives.

If you filter resources into such groups by address, file type, or any other criterion, you can apply to each group its own logic for combining network requests with the local cache. A few examples of different caching strategies can be found in the example repository.
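
As a rough illustration of such filtering, the sketch below routes requests to one of three strategies by path. The URL patterns, the /api/ prefix, the strategy names, and the fetchWithStrategy helper are assumptions made for this example and are not taken from the demo repository.

```javascript
// Sketch: choosing a caching strategy by URL. All patterns and names
// here are illustrative assumptions.

const STRATEGIES = [
  // Static files that rarely change: cache "forever"
  {name: 'cache-first', test: path => /\.(css|js|png|jpg|svg|woff2)$/.test(path)},
  // Frequently updated data: do not cache
  {name: 'network-only', test: path => path.startsWith('/api/')},
];

/** Picks a strategy name; everything else is cached but refreshed when possible. */
function pickStrategy(url) {
  const {pathname} = new URL(url);
  const match = STRATEGIES.find(strategy => strategy.test(pathname));
  return match ? match.name : 'network-first';
}

/** Cache first: answer from the cache, go to the network only on a miss. */
async function cacheFirst(request, cacheName = 'myApp') {
  const cache = await caches.open(cacheName);
  const cached = await cache.match(request);
  if (cached) return cached;
  const response = await fetch(request);
  if (response.ok) await cache.put(request, response.clone());
  return response;
}

/** Network first: refresh the cache when online, use the cached copy otherwise. */
async function networkFirst(request, cacheName = 'myApp') {
  const cache = await caches.open(cacheName);
  try {
    const response = await fetch(request);
    if (response.ok) await cache.put(request, response.clone());
    return response;
  } catch (err) {
    const cached = await cache.match(request);
    if (cached) return cached;
    throw err;
  }
}

/** Dispatches a Request to the strategy chosen for its URL. */
function fetchWithStrategy(request) {
  switch (pickStrategy(request.url)) {
    case 'cache-first': return cacheFirst(request);
    case 'network-only': return fetch(request);
    default: return networkFirst(request);
  }
}
```

Only pickStrategy is pure; the strategy functions rely on the caches and fetch globals, so they are meant to run inside a service worker.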

We now have offline support for cached pages: they will open even when the phone is in airplane mode. The next step is to collect them all in one place, on a separate page.

Registering offline pages

To render the list of materials available offline, the list must first be created and then updated each time a new page is opened. The logic for registering pages is as follows:

  • When the page is opened, we create a data object describing it (address, title, and the address of a preview image to display in the list).
  • Once the data is ready, we send it to the service worker via postMessage.
  • The service worker receives the data and adds it to the general list.

The script that collects data about a page runs on that page, as part of it. This is more convenient: it lets you pass the necessary information through the layout, for example in the head block, and read it from there. Or in any other way that suits you.

Let’s use Open Graph markup. Today it is difficult to imagine a site without it, and it can carry all the information we need:

<meta property="og:title" content="Homer Simpson" />
<meta property="og:url" content="http://example.com/homer.html" />
<meta property="og:image" content="http://example.com/homer.png" />

Why transfer the page address to the layout? Why not get it in JS via the location object?

Today, most sites use analytics that rely on all kinds of GET parameters, marking, for example, the source of traffic. As a result, the addresses /homer.html, /homer.html?utm_source=vk and /homer.html?utm_source=email actually lead to the same page, which means it must be registered in the list only once. Here the “canonical” address passed through og:url helps us: it is always the same. Most likely you already have all the necessary og markup; you can check its completeness with an extension for Google Chrome.
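
As an illustration of reading the canonical address from the layout, here is one possible sketch of a function that collects the page description from Open Graph tags. The fallbacks to the document title and to location, and the readMeta helper, are assumptions of this example.

```javascript
// Sketch: collecting page info from Open Graph meta tags.
// readMeta and the fallback rules are hypothetical.

/** Reads the content of <meta property="..."> or returns null. */
function readMeta(doc, property) {
  const node = doc.querySelector(`meta[property="${property}"]`);
  return node ? node.getAttribute('content') : null;
}

/**
 * @param {Document} doc - document to read from (the current page by default)
 * @return {{url: string, title: string, thumb: (string|null)}}
 */
function getPageInfoFromHtml(doc = document) {
  return {
    // og:url is canonical, so ?utm_source=... variants collapse into one record
    url: readMeta(doc, 'og:url') || doc.location.origin + doc.location.pathname,
    title: readMeta(doc, 'og:title') || doc.title,
    thumb: readMeta(doc, 'og:image'),
  };
}
```

Passing the document as a parameter keeps the function easy to check outside the browser with a stub object.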

So, let’s teach the page to tell the service worker that it has loaded. We modify the registerServiceWorker function (see above).


// app.js - runs on the site page

function registerServiceWorker() {
  navigator.serviceWorker.register('/service-worker.js')
    .then(registration => {
      if (!registration.active) {
        // Not yet activated
        return;
      }
      // The service worker is activated, you can work with it.
      // Report that the current page is now available offline
      registerPageAsCached();
    });
}

/**
 * Registers the current page of the site as available offline
 */
function registerPageAsCached() {
  // We will not analyze the function getPageInfoFromHtml,
  // the main thing is that it should return an object with fields:
  // url - "canonical" page address
  // title - title of the page
  // thumb - the address of the page thumbnail
  const page = getPageInfoFromHtml();

  // Send the data to the service worker
  postMessage({
    action: 'registerPage',
    page
  });
}

/**
 * Sending a message to the service worker
 * @param {object} message
 */
function postMessage(message) {
  const {controller} = navigator.serviceWorker;
  if (controller) {
    controller.postMessage(message);
  }
}

Note: in the message, in addition to the data about the page, we pass the action field, which describes the type of the message. This will allow us to transmit different data for different purposes in the future.
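
As the number of message types grows, the action field can be dispatched through a lookup table instead of a long switch. The sketch below shows one way to do it; createMessageRouter is a hypothetical helper, not part of the demo.

```javascript
// Sketch: routing messages by their `action` field.
// createMessageRouter is a hypothetical helper.

function createMessageRouter(handlers) {
  return function route(message = {}) {
    // Only own properties count, so 'toString' and the like are ignored
    const known = Object.prototype.hasOwnProperty.call(handlers, message.action);
    if (!known || typeof handlers[message.action] !== 'function') {
      return undefined; // unknown action: ignore the message
    }
    return handlers[message.action](message);
  };
}

// Usage: each action gets its own handler
const registered = [];
const route = createMessageRouter({
  registerPage: ({page}) => registered.push(page.url),
  ping: () => 'pong',
});

route({action: 'registerPage', page: {url: '/homer.html'}});
console.log(registered, route({action: 'ping'}));
```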

Someone will ask: how do we know that the page is cached?

All requests from our site go through one of the caching strategies introduced earlier, so we assume that everything displayed in the browser has passed through the cache.

We receive the data from the page in the service worker:


// service-worker.js

import {get, set} from 'idb-keyval';

/**
 * Processing of messages from pages
 */
self.addEventListener('message', event => {
  const {data = {}} = event;
  const {page} = data;

  // Messages can be different,
  // so we dispatch on the action field
  switch (data.action) {
    case 'registerPage':
      // Keep the service worker alive until the record is saved
      event.waitUntil(addToOfflineList(page));
      break;
  }
});

/**
 * Registers the page as available offline
 * @param {object} pageInfo
 * @return {Promise}
 */
export function addToOfflineList(pageInfo) {
  // Cache the preview of the page using the appropriate strategy,
  // the image will be useful in offline mode
  if (pageInfo.thumb) {
    fetchWithOneOrAnotherCacheStrategy(pageInfo.thumb);
  }

  // Add the page information to IndexedDB
  return get('cachedPages')
    .then((pages = {}) => set('cachedPages', {
      ...pages,
      [pageInfo.url]: pageInfo
    }));
}

The page is registered.

In this example, we used the title, address, and image to describe the page, but the list of data can be extended. For example, it makes sense to store the timestamp of the last update. This makes it possible to sort the articles by download date and to remove old materials from the cache.
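
A minimal sketch of the timestamp idea follows. The cachedAt field name and both helpers are assumptions of this example; the records have the same shape as the ones stored under cachedPages.

```javascript
// Sketch: stamping page records with a time and sorting by it.
// The cachedAt field and helper names are hypothetical.

/** Returns a copy of pageInfo stamped with the time it was cached. */
function withTimestamp(pageInfo, now = Date.now()) {
  return {...pageInfo, cachedAt: now};
}

/** Returns the records as an array, newest first. */
function sortByDate(pages) {
  return Object.values(pages).sort((a, b) => b.cachedAt - a.cachedAt);
}

// Usage: stamp the record before saving, sort before rendering the list
const pages = {
  '/homer.html': withTimestamp({url: '/homer.html', title: 'Homer'}, 1000),
  '/marge.html': withTimestamp({url: '/marge.html', title: 'Marge'}, 2000),
};
console.log(sortByDate(pages).map(page => page.title));
```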

Monitoring the status of the network connection

While the page is open, we will teach it to monitor the status of the network, or rather the availability of our server. When the server stops responding, the page shows a message with a link to /offline/, which we will build later. For convenience, we will also highlight the links that remain available directly on the page.

For example, you can dim the materials that are unavailable and visually highlight the cached ones.

In the script running on the page, create a ping function that is called periodically at a given interval and sends a message to the service worker.


// app.js - runs on the site page

const PING_INTERVAL = 10000; // 10 seconds

function registerServiceWorker() {
  navigator.serviceWorker.register('/service-worker.js')
    .then(registration => {
      if (!registration.active) {
        // Not yet activated
        return;
      }
      // The service worker is activated, you can work with it
      registerPageAsCached(); // see above

      // Start pinging
      ping();
    });
}

/**
 * Periodic verification of the availability of the network (our server)
 */
function ping() {
  postMessage({
    action: 'ping',
  });
  setTimeout(ping, PING_INTERVAL);
}

/**
 * Sending a message to the service worker
 * @param {object} message
 */
function postMessage(message) {
  const {controller} = navigator.serviceWorker;
  if (controller) {
    controller.postMessage(message);
  }
}

On the service worker side, we receive the message, check the availability of the server, and send back a report. Any URL can be requested for the check, but a static file is best, for example the traditional pixel.


// service-worker.js

import {get} from 'idb-keyval';

/**
 * Processing of messages from pages
 */
self.addEventListener('message', event => {
  const {data = {}} = event;
  const {page} = data;

  // Messages can be different,
  // so we dispatch on the action field
  switch (data.action) {
    case 'ping':
      ping();
      break;
  }
});

/**
 * Check the availability of the server
 */
export function ping() {
  // Bypass the HTTP cache, otherwise a cached pixel
  // would make the server look available
  fetch('/ping.gif', {cache: 'no-store'}).then(
    () => pingHandler(true),
    () => pingHandler(false)
  );
}

/**
 * Sends a message to the page
 * about the success or failure of the ping
 * @param {boolean} isOnline
 */
function pingHandler(isOnline) {
  postMessage({
    action: 'ping',
    online: isOnline,
  });
}

/**
 * Sends data to all pages and tabs
 * served by the service worker
 * @param {object} message
 */
function postMessage(message) {
  // Find all the open pages and tabs of our site
  self.clients.matchAll().then(clients => {

    // If there is no network, add the list
    // of cached pages to the message
    const offlinePagesPromise = message.online
      ? Promise.resolve()
      : get('cachedPages');

    offlinePagesPromise.then(offlinePages => {
      if (offlinePages) {
        message.offlinePages = offlinePages;
      }
      clients.forEach(client => {
        // The client can disappear, so we check for it
        if (client) {
          client.postMessage(message);
        }
      });
    });
  });
}
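
On a weak connection a request may hang instead of failing, so the check can be given a time limit. The sketch below is one way to do that with AbortController; the isServerReachable name, the 5-second default, and the injectable fetchFn parameter are assumptions of this example.

```javascript
// Sketch: an availability check that gives up after a timeout.
// The function name, the default timeout and fetchFn are hypothetical.

function isServerReachable(url, {timeout = 5000, fetchFn = fetch} = {}) {
  const controller = new AbortController();
  // Abort the request if the server does not answer in time
  const timer = setTimeout(() => controller.abort(), timeout);

  return fetchFn(url, {cache: 'no-store', signal: controller.signal})
    .then(() => true, () => false)
    .finally(() => clearTimeout(timer));
}

// Possible use inside the service worker:
// isServerReachable('/ping.gif').then(online => pingHandler(online));
```

The function resolves to true for any HTTP response and to false when the request fails or is aborted, which matches the behavior of the two callbacks in the code above.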

Several tabs of our site can be open in the browser, and each of them will call its own ping method. Therefore, it is better to load the pixel not from the page but through the service worker, which can limit the frequency of the check requests, for example with the throttle micro-pattern. Knowing the network status can also be useful to the service worker itself.
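
A throttle can be sketched in a few lines. This version is a simplified assumption of how it might look: calls that arrive too soon are dropped, and the clock is injectable so the behavior can be checked without waiting.

```javascript
// Sketch: a throttle that lets the wrapped function run at most once
// per interval. The injectable clock `now` exists for testing.

function throttle(fn, interval, now = Date.now) {
  let lastCall = -Infinity;
  return function throttled(...args) {
    const time = now();
    if (time - lastCall < interval) {
      return false; // called too soon, skip this call
    }
    lastCall = time;
    fn(...args);
    return true;
  };
}

// Possible use in the service worker, where ping is the check function:
// const throttledPing = throttle(ping, 10000);
// ... case 'ping': throttledPing(); break;
```

Skipped calls do not leave any tab without an answer, because the report of each real check is sent to all open pages and tabs.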

After receiving the report, the page updates its contents accordingly:


// app.js - runs on the site page

let isOnline = true;

function registerServiceWorker() {
  navigator.serviceWorker.register('/service-worker.js')
    .then(registration => {
      if (!registration.active) {
        // Not yet activated
        return;
      }

      // Subscribe to messages from the service worker
      navigator.serviceWorker.addEventListener('message', handleMessage);

      registerPageAsCached(); // see above
      ping(); // see above
    });
}

/**
 * Processing a message from the service worker
 * @param {MessageEvent} e
 */
function handleMessage(e) {
  const {data} = e;
  if (data.action === 'ping' && isOnline !== data.online) {
    isOnline = data.online;
    toggleNetworkState(data);
  }
}

/**
 * Toggles the online / offline state of the page
 * @param {object} params
 */
function toggleNetworkState(params) {
  const {online, offlinePages = {}} = params;

  // Set the global modifier,
  // in our example it makes all links "fade"
  document.body.classList.toggle('offline', !online);

  // For offline mode, highlight the cached links
  if (!online) {
    Array.from(document.links).forEach(link => {
      const href = link.getAttribute('href');
      // Records are keyed by the canonical (og:url) address, which may be
      // absolute, so we check the resolved link.href as well
      const isCached = Boolean(offlinePages[href] || offlinePages[link.href])
        || href === '/offline/';
      link.classList.toggle('cached', isCached);
    });
  }
}

Creating the /offline/ page in the service worker

So, we have reached the main part: creating a page inside the service worker without accessing the server. We need a template that renders the HTML, and the data about the cached pages.

We used a simple pug template in the demo version. However, you can use anything else, up to a “server render” of an isomorphic React application.

Our template looks like this:


html(lang="en")
  head
    title Available offline
    link(rel="stylesheet" href="/css/app.css")

  body
    section.layout
      header.layout__header
        a.layout__header__logo(href="/")

      h1 You can read it offline

      ul.articles-list
        each page in pages
          li.articles-list__item
            a(href=page.url)
              if page.thumb
                img.avatar(src=page.thumb alt="")
              else
                img.avatar(src="/img/default_thumb.png" alt="")
              span= page.title
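
If you prefer not to bring a template engine into the service worker, the same markup can be produced with a template literal. This is only a sketch under that assumption; escapeHtml and renderOfflinePage are hypothetical helpers, and the class names repeat those of the pug template above.

```javascript
// Sketch: rendering the list with a template literal instead of pug.
// escapeHtml and renderOfflinePage are hypothetical helpers.

const HTML_ESCAPES = {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'};

/** Escapes characters that are special in HTML text and attributes. */
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, char => HTML_ESCAPES[char]);
}

/** Builds the HTML of the /offline/ page from an array of page records. */
function renderOfflinePage(pages) {
  const items = pages.map(page => `
        <li class="articles-list__item">
          <a href="${escapeHtml(page.url)}">
            <img class="avatar" src="${escapeHtml(page.thumb || '/img/default_thumb.png')}" alt="">
            <span>${escapeHtml(page.title)}</span>
          </a>
        </li>`).join('');

  return `<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Available offline</title>
    <link rel="stylesheet" href="/css/app.css">
  </head>
  <body>
    <section class="layout">
      <header class="layout__header"><a class="layout__header__logo" href="/"></a></header>
      <h1>You can read it offline</h1>
      <ul class="articles-list">${items}
      </ul>
    </section>
  </body>
</html>`;
}
```

Titles and addresses come from page markup, so they are escaped before being placed into the HTML.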

In the service worker, in the fetch event handler, pick out the request to /offline/ and return a freshly created page to the unsuspecting browser:


// service-worker.js

import {get} from 'idb-keyval';
const template = require('offlinePage.pug');

// Process outgoing network requests
self.addEventListener('fetch', event => {
  const {request} = event;
  const url = new URL(request.url);

  if (url.origin !== self.location.origin) {
    // Foreign domain, do not process this request:
    // without respondWith the browser handles it as usual
    return;
  }

  // Check whether the /offline/ page has been requested
  const isOfflineListRequested = /^\/offline\//.test(url.pathname);

  const response = isOfflineListRequested
    // Our case!
    // Create a custom response with a page that
    // lists the materials available offline
    ? createOfflineListResponse()

    // Common request
    : fetchWithOneOrAnotherCacheStrategy(request);

  event.respondWith(response);
});

/**
 * Generates a response object with data about the pages
 * available offline
 * @return {Promise<Response>}
 */
function createOfflineListResponse() {
  // Get information about offline pages
  return get('cachedPages')
    .then((pagesList = {}) => {

      // Pass the list of available pages to the template
      const html = template({
        pages: Object.values(pagesList)
      });

      // Create and return a response object
      const blob = new Blob([html], {
        type: 'text/html; charset=utf-8'
      });
      return new Response(blob, {
        status: 200,
        statusText: 'OK'
      });
    }).catch(err => console.error(err));
}

As a result, the /offline/ address opens a page with the list of materials available offline, generated entirely on the device.

In conclusion

To keep this guide from growing too long, some topics had to be omitted, even though they are extremely important. The most important one is cleaning the cache. You have to do this regularly and on your own, otherwise the storage space provided by the browser will run out.

  • It makes sense to keep in the cache the resources that are required regularly: CSS, JS, images of interface elements. For the rest, you should come up with some kind of “rule of extinction”. For example, delete everything that has not been requested for more than three days (or five, or ten, or a year?).
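
One possible shape of such a rule is sketched below. It assumes that the service worker keeps a lastUsed map with the time of the last request for each cached address; that map, the five-day default, and the selectForEviction name are all assumptions of this example.

```javascript
// Sketch of a "rule of extinction": pick cached URLs that were not
// requested for too long. The lastUsed map is a hypothetical structure.

const DAY_MS = 24 * 60 * 60 * 1000;

/**
 * @param {Object<string, number>} lastUsed - URL -> time of last request (ms)
 * @param {string[]} keep - URLs that must never be removed
 * @param {number} maxAgeDays - how long an unused entry may live
 * @param {number} now - current time (ms)
 * @return {string[]} URLs to delete from the cache
 */
function selectForEviction(lastUsed, keep, maxAgeDays = 5, now = Date.now()) {
  const keepSet = new Set(keep);
  return Object.keys(lastUsed).filter(url =>
    !keepSet.has(url) && now - lastUsed[url] > maxAgeDays * DAY_MS
  );
}

// Possible use in the service worker:
// const cache = await caches.open('myApp');
// const expired = selectForEviction(lastUsed, dependencies);
// await Promise.all(expired.map(url => cache.delete(url)));
```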

Detailed logging of each stage makes debugging easier. To do this, you can create your own log utility that is switched on or off by a flag from the environment, and output all information through it. Unlike pages, the service worker keeps living across page reloads and closed tabs, so we recommend enabling the Preserve log checkbox in the Developer Tools console.
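
Such a utility can be very small. In the sketch below, the createLogger name, the [sw] prefix, and the DEBUG flag are assumptions; the flag would normally be injected by the build environment.

```javascript
// Sketch: a log utility that is switched on or off by a flag.
// createLogger, the prefix and the DEBUG flag are hypothetical.

function createLogger(enabled, sink = console.log) {
  return function log(stage, ...details) {
    if (!enabled) return; // logging is switched off
    sink(`[sw] ${stage}`, ...details);
  };
}

// The flag is assumed to come from the environment, for example
// from a constant defined by the bundler
const log = createLogger(typeof DEBUG !== 'undefined' && DEBUG);
log('install', 'caching dependencies');
```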