Exposing privileged APIs to web content

This is a discussion of the challenges we face in exposing privileged APIs to web content, and a proposal for how such APIs could be exposed to web pages while mitigating the risks inherent in doing so.

Privileged APIs, as a rough definition, include things such as access to a user’s web cam, geolocation, desktop notifications, access to anything that a user considers sensitive data (e.g. a native address book) or, looking to the future, any API that would allow web content to access system-level functionality. Exposing these APIs is a problem we already have today. Anything that could leak information the user considers sensitive, allow third parties to fingerprint the user or their device, or cause a myriad of other issues falls under the umbrella of this term.

Let’s jump right into the problem.

Say we wanted to offer a shiny new API to web pages that allows them to, for example, work with raw sockets. I add some JavaScript to my web page to interact with the Acme Raw Sockets API as follows:

var socket = new UDPSocket('udp4');
socket.listen(10001); // open a listening socket on port 10001
socket.addEventListener('message', function(data) {
  socket.send(data.address, data.port, data.message); // echo message back
});

Based on current web security policy, providing something like a raw sockets API to web pages in this way is simply not possible. If we were to allow web pages to interact with raw sockets like this then we would be compromising the sandbox that web browsers have carefully constructed to protect their users.

The web, by its very nature, is an insecure and untrusted environment for content. That fact runs counter to the objective of exposing powerful but sensitive APIs to web content. If I called the Acme Raw Sockets API from a web page today and my browser implemented such an API, I should expect to have a Security Violation exception thrown back on line 1.
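
In other words, with today’s model the best a web page could do is anticipate that failure. A minimal sketch, assuming a browser shipped the hypothetical Acme API at all:

try {
  // Hypothetical Acme Raw Sockets API call from an ordinary web page
  var socket = new UDPSocket('udp4');
} catch (e) {
  // Today this is where we end up: a Security Violation exception
  // (or, in practice, UDPSocket simply not being defined at all)
}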

The only solution we have today that we could hook into for obtaining permission to access the Acme Raw Sockets API from a web page would be to prompt the user whenever that web page tried to access a privileged feature.

Prompting for each feature on a case-by-case basis doesn’t scale well and can quickly become disruptive for a user’s workflow:

[Image: ‘notification hell’, a page cluttered with stacked permission prompts]

Thus, the problem here is how I could enable my web page to interact with the Acme Raw Sockets API and other privileged APIs without compromising the sandbox the web page has been given by the web browser and without having to resort to the case-by-case prompting we have today – an approach that doesn’t even begin to solve the underlying issues with exposing privileged APIs to web content in the first place.

To get to the bottom of this issue, this article will first explore the general risks in exposing such privileged APIs to web pages and then propose a solution for mitigating the risks identified.

The inherent risks in exposing privileged APIs to web content

Let’s say, for argument’s sake, that I do have full access to the UDPSocket object directly in the DOM of a standard web page. If I do have access to this API, where do the security risks lie in this scenario?

For the majority of privileged APIs the problem lies in the fact that a web page could surreptitiously phone my private and sensitive content from my device back to some 3rd-party server. Even if I’m prompted on a case-by-case basis to enable specific APIs, providing access to these privileged APIs would not resolve this fundamental issue.

If we could reduce the ability of a web page to do this, we could in turn reduce the attack surface considerably. Essentially, by reducing or eliminating a web page’s ability to access the Internet we could significantly reduce and/or eliminate the attack surface for the page to maliciously use the data we deliver to it.

Packaged Web Applications emulate bad current practice

Current proposals revolve around separating content from the general web and packaging that content up into ‘applications’.

This approach does intrinsically enforce the environment we identified above: Internet access is, by default, revoked from a package since it no longer belongs to any particular domain origin. If a packaged web application needs to access any Internet servers, it must explicitly request the specific URLs in the outside world that it requires access to.

While packaging web applications generally works as a format for delivering privileged APIs to content, it eschews the web ecosystem as we know it today. Web application packaging resorts to splicing web content into standalone packages. It is a mechanism that emulates the failures of existing application paradigms and forgoes the fundamental features that made the web so successful in the first place (tl;dr: app discoverability and content discoverability are built into the web via URLs).

What’s more, packaging web applications creates information silos and allows proprietary App Store models to take root. This is a bad model for the web as an open ecosystem.

So let’s look at other mechanisms here. Let’s look at a proposal that treats existing web content as first-class code and simply builds on top of what web developers do already today…

Working towards exposing the Acme Raw Sockets API to web pages

If a web page doesn’t have access to the Internet at all, ever, then it is not going to be of much use to the user. Web pages do initially need some web access to load the resources they reference – things like images, scripts and stylesheets – and those same web pages will often update their resources depending on whatever features the developers have decided to add or remove on any given day.

Removing network access on load is therefore not a viable option. The web would simply cease to work. A web page could fully cache all the resources it requires at runtime in the web browser, but such procedures still initially need network access to work. While it is entirely feasible that a web page could cache its required resources, it shouldn’t be a necessary burden on developers to have to cache everything before they can gain access to privileged APIs. This article is an exploration of an idea that separates caching and privileged API access into orthogonal and complementary issues.

What if we started to think about web pages in terms of their initialization lifecycle? How could network access – the primary attack vector identified above – be revoked without breaking any existing web content? The current initialization states of a web page are exposed right now via the document.readyState attribute. The initialization states (or, in specification terms, the current document readiness) of a web page are currently defined as follows:

loading -> interactive -> complete

By the time we reach the complete state, all assets of the web page have been loaded and the page has been fully rendered and is functional. Along the way, each initialization state has triggered the web browser to do some specific setup procedures.
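
For reference, a page can already observe these transitions via the standard readystatechange event:

// Log the standard document readiness transitions as they occur
document.addEventListener('readystatechange', function() {
  console.log('readyState is now: ' + document.readyState);
  // a listener registered while the page is still loading will
  // typically see 'interactive' followed by 'complete'
});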

When I load a web page I will eventually be presented with the following rendering once the browser transitions the current document readiness to complete:

[Screenshot: the ‘My Socket App’ web page, fully loaded]

Now that the web page has reached the complete state, let’s forcibly remove my machine’s Internet access. I’m going to simply yank the Internet cable out. Here is how my web page looks now:

[Screenshot: the ‘My Socket App’ web page, unchanged after disconnecting]

No change from the connected state.

Why is there no change? Because all the web page’s resources were loaded during the standard web page initialization process and are temporarily held by the web browser for current execution of the web page. All of my page’s JavaScript still works. All my page’s CSS styles still apply. The web page is active despite there being no current Internet access.

So, if I were able to use the Acme Raw Sockets API in this state then I wouldn’t have much of a problem trusting that the web page was not operating maliciously. In this state the page cannot ‘phone out’ to remote servers, and the sandbox in which my web page runs is still being enforced by the web browser, with one important additional restriction placed on it: it now has no network access capability.

What if we could induce this environment without having to forcibly remove the network cable each time? Imagine if, when those conditions were recreated, I could catch that my page had entered some higher state of current document readiness, with an initialization lifecycle that instead looked like this:

loading -> interactive -> complete -> installed

A web page may or may not reach the installed state; reaching this state implies that the user has taken some action to install that web page via their web browser’s chrome. Once the user triggers that install, the web browser preps the environment to be safe for access to privileged APIs. Specifically, when the browser transitions the web page’s document.readyState to installed, the following conditions would be enforced by the web browser for the current web page:

  1. Immediately suspend all Browser Extensions and, if applicable, UserJS from interacting with the current web page.
  2. Remove the Security Violation Exceptions from the Privileged APIs that have been requested and approved during the Installation process (see ‘Requesting Privileged API Access‘ below).
  3. Enforce X-Frame-Options for the current web page with a value of deny. This prevents the web page from being rendered in an <iframe> element by any other web page, which would otherwise pose a security risk.
  4. Ensure any API calls from other web pages that expect a WindowProxy return value actually receive an empty object. This prevents code injection to the installed web page and prevents other web pages being able to set up e.g. messaging channels for export of data from the installed execution environment.
  5. Enforce a blanket Network Proxy for the current web page with an initial configuration of deny all. Additional rules can only be added to this configuration based on URLs, if any, that have been requested in the Installation process (see ‘Requesting Privileged URL Access’ below). A short sketch of the effect of this rule follows this list.
  6. If/when the currently installed web page attempts to navigate to or open another web page programmatically (via e.g. window.open or window.location), or the user invokes a URL in the interface that points outside of the current web page (e.g. via an <a> element), then the web browser must explicitly display the requested URL to the user and obtain opt-in permission from them before navigating to that URL. This is akin to ‘going back online’ and prevents an installed web page from exporting data obtained from privileged APIs to other web pages without the user’s permission.
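
To illustrate the deny-all rule in point 5: once installed, a request to any URL that was not approved during installation would be expected to fail. A rough sketch follows; the URL and the exact failure mode are assumptions rather than part of the proposal:

// Hypothetical: example.com was never requested in this page's manifest,
// so the installed page's deny-all network policy would block this request
var xhr = new XMLHttpRequest();
xhr.addEventListener('error', function() {
  console.log('Blocked by the installed page\'s network policy');
});
xhr.open('GET', 'http://example.com/exfiltrate-data');
xhr.send(); // never reaches the network; could instead surface as a thrown exception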

Let’s look at the current web page again after this process has completed (after an installed readystatechange event has been fired):

[Screenshot: the ‘My Socket App’ web page after installation, visually unchanged]

Again, no change.

The web browser has effectively created the conditions we’d have if we had unplugged the network cable without having needed to physically unplug it ourselves.

In this environment, and according to the steps discussed above, we are now also providing sandboxed privileged API access to the web page. If I included the following JavaScript somewhere in my web page then I would expect to be able to use the Acme Raw Sockets API to its full extent:

document.addEventListener('readystatechange', function(evt) {
  if(document.readyState == 'installed') { 
    var socket = new UDPSocket('udp4'); 
    socket.listen(10001); 
    socket.addEventListener('message', function(data) { 
      socket.send(data.address, data.port, data.message); 
    }); 
    // WORKS! 
    // ... do some more installed app initialization stuff here 
  } 
});

And how does the web page look in this state you ask? As follows:

[Screenshot: the ‘My Socket App’ web page, now showing a ‘working’ indicator]

It looks the same as before, except that this time we’ve run some code in the installed readystatechange event handler to replace a bit of text on our web page to indicate success.

Dynamically loading app-only resources during installation

The current proposal discussed above allows us to transition the current web page to an installed state that provides a good baseline sandbox in which we can begin to expose privileged APIs.

Another consideration, especially for large or complex web applications, is how to dynamically load the JavaScript, CSS, images or any other resources we require once we have transitioned to the installed state, so that we don’t require all assets to be downloaded to the user’s machine up-front in the case that the user does not choose to install the current web page.

If we introduce a transitional current document readiness state between complete and installed then we could use this transitional state to load any resources we may require in app-only mode into our web page.

The initialization states for our web page will, thus, look as follows:

loading -> interactive -> complete -> installing -> installed

The installing state does not have any of the limitations imposed once the web page transitions to installed. No network restrictions have yet been applied at this stage and resources can be downloaded as required during this state.

Having this model lets us load any resources we need by catching a transition into the installing state as follows:

document.addEventListener('readystatechange', function(evt) {
  if(document.readyState == 'installing') { 
    // Dynamically load some JS/CSS/Image/etc resources
    // from the network for our app here
  } 
});
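
As a concrete illustration, the placeholder comment above might be filled in by injecting a script element; the file name below is hypothetical:

// Minimal sketch: fetch an app-only bundle while still in 'installing',
// before the network restrictions of the 'installed' state are applied
var script = document.createElement('script');
script.src = '/js/app-only.js'; // hypothetical bundle of installed-only logic
script.addEventListener('load', function() {
  console.log('App-only resources loaded');
});
document.head.appendChild(script);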

Having this intermediate transition state allows us to e.g. serve a landing web page with a minimal set of assets and then, once the user chooses to install the web page, import all the application logic and other assets required into the web page before we reach, and start executing, the sandboxed installed-state logic.

Requesting Privileged API Access

So to take things one step further, a web page is unlikely to need every privileged API provided by a web browser. If we were to offer a set of APIs similar to those available in the Browser Extensions environment it would be unnecessary and unwise for us to grant all web pages access to all privileged APIs all of the time.

Instead, we would prefer a mechanism that allowed web pages to request Privileged APIs on an as-needed basis.

This describes, more or less, one of the purposes of the Web Application Manifest format currently under discussion at the W3C.

I could include the permissions my web page requires in my socketapp.json manifest file:

{
  // ...socketapp.json content...
  "permissions": { 
    "acmerawsockets": { 
      "description": "We need this to setup a listening UDP socket" 
    } 
    // etc 
  }
  // ...socketapp.json content... 
}

Then I could attach this manifest file to my web page as follows:

<!DOCTYPE html>
<html appconfig="/socketapp.json"> 
<title>My Awesome Application</title> 
...

We now have a way for browsers to know which privileged APIs the web page would like to use, and the web browser can present this information to the user for review before any kind of installation is triggered.

In this way we prevent the web page from accessing all privileged APIs unnecessarily when it actually needs, and is using, only one of them: the Acme Raw Sockets API.

Progressive enhancement of web pages

There is one key aspect to this proposal that I want to highlight further here:

A manifest file attached to a web page only refers to that web page.

The manifest file should not apply to the whole domain on which the web page happens to be provided and it should not mean other web pages can adopt the same permissions without going through the same installation process. Different web pages may point to the same manifest file but authorization is only provided on a page-by-page basis.

Why? Because page-by-page authorization keeps privileged access tightly scoped: a site can progressively enhance individual pages with installable, privileged functionality while the rest of its pages remain ordinary web content, and each installed page carries only the permissions it has explicitly requested and had approved.

Requesting Privileged URL Access

Any Raw Sockets API would be severely limited if it only worked on the local machine and without any network access capability. There are a number of use cases for setting up a socket for listening only on a local port but what if we want to e.g. send and receive data over this socket with 3rd parties?

Enter, URL access restrictions.

Above I described how the default network policy for web pages in the installed state is set to deny all.

If a web page wants to send data to a given address it could specify the hosts it would like to interact with in its manifest file as follows:

{
  // ...socketapp.json content...
  "origins": [ 
    'http://acmeinc.com/*', 
    'http://richt.me/socketserver' 
    /* etc*/ 
  ]
  // ...socketapp.json content... 
}

Because the web page is required to be explicit about the URLs it would like to connect to, the web browser is able to display that list of URLs to the user during the installation process.
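
With those origins approved at install time, the earlier socket example could then exchange data with one of the listed hosts. Exactly how origin rules would map onto raw socket endpoints is left open here, but a sketch using the hypothetical Acme API might look like this:

// Hypothetical: richt.me was approved via the manifest's "origins" list,
// so the installed page's network policy permits traffic to it
var socket = new UDPSocket('udp4');
socket.send('richt.me', 10001, 'hello from an installed web page');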

This in and of itself does not eliminate a web page’s ability to be malicious, but there are a number of things we can do to reduce the likelihood that a web page can do something malicious without the user knowing about it.

Specifically, we could bubble a safe browsing metric up to the user during the installation procedure. By including a trust indicator such as Web of Trust or McAfee SiteAdvisor in the installation process we can give users an indication of the trustworthiness of the web page’s current domain and, thus, an indication of whether installing the current web page would be a good idea or not.

Secondary to that, we could enable transparent logging at the web browser level of all data sent and received by the web page while it is operating in an installed state. While of little use to the average user, more technical users would be able to analyse these logs and the data being sent to and received from the network by the current web page. When tied in to the safe browsing metric idea above, these users could then report unethical or unsafe practices back to the safe browsing authorities, who in turn would update their warnings to users for future installation procedures (and, potentially, warn users when they revisit a previously installed web page whose trust status has since changed).

Additional security considerations

On top of the mechanisms presented here, there are additional aspects of security that we should address in order to protect data obtained by ‘installed’ web pages.

Let’s assume that a web application, operating in an installed state, decides to store data it has received into localStorage:

localStorage.setItem('useremail', 'john.doe@gmail.com');

Web pages may try to store something when operating in an installed state and then try to sneakily read that data back on e.g. a page reload, before that page has been elevated to installed status once again.

To counter this risk, web browsers should enforce that if a web page is operating in a current document readiness state that is not installed, then a read of a key previously written while in an installed state is prohibited and results in a standard Security Violation exception being thrown to the caller.

// Throws a standard Security Violation exception unless the current
// web page is in a document.readyState == 'installed' state
var email = localStorage.getItem('useremail');

The same security principle (i.e. no read access to any storage objects written while in the installed mode) would be applied to both sessionStorage and document.cookie attributes also.
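
For example, a value written to document.cookie while installed would be subject to the same restriction. How exactly the rule applies to the combined cookie string is an open detail, but the intent would be roughly:

// Written while document.readyState == 'installed'
document.cookie = 'sessiontoken=abc123';

// Later, in a non-installed state, reading back anything written while
// installed would be prohibited (throwing a Security Violation exception)
var allCookies = document.cookie;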

This means we don’t need separate storage locations for web pages operating in an installed mode, but it does maintain a clear long-term storage boundary between pages that are installed and those that are not.

Conclusion

Given the current state of the art, this proposal is intended to present a way in which web browsers could bring more powerful APIs to the web platform without breaking existing web content: allowing developers to build richer experiences on top of existing web content, without requiring a separate run-time environment for that content and without requiring web developers to rewrite their web applications to operate specifically in other modalities (e.g. chromeless or packaged modalities).

This proposal is decoupled from caching mechanisms currently under discussion (such as NavigationController). The more nuanced explanation is that this proposal actually decouples the concepts of ‘offline’ and ‘cached’ from each other. A web page can be ‘offline’ if I yank the network cable out once all the page’s resources have been fully loaded. A ‘cache’ mechanism is useful when that web page needs to be loaded while the network cable is unplugged during that load process.

A mechanism that allows existing web pages to access privileged APIs is sorely needed. Having such a mechanism would avoid emulating past failures by taking the Packaged Applications route as the only way to use privileged APIs. Hopefully we don’t stay only on that road.


Have something to add? Tweet me.