<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Papers in Computer Science &#187; Security</title>
	<atom:link href="http://papersincomputerscience.org/category/security/feed/" rel="self" type="application/rss+xml" />
	<link>http://papersincomputerscience.org</link>
	<description>Discussion of computer science publications</description>
	<lastBuildDate>Fri, 25 Mar 2011 20:24:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.4</generator>
		<item>
		<title>A Learning-Based Approach to Reactive Security</title>
		<link>http://papersincomputerscience.org/2011/03/25/a-learning-based-approach-to-reactive-security/</link>
		<comments>http://papersincomputerscience.org/2011/03/25/a-learning-based-approach-to-reactive-security/#comments</comments>
		<pubDate>Fri, 25 Mar 2011 20:21:56 +0000</pubDate>
		<dc:creator>dcoetzee</dc:creator>
				<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://papersincomputerscience.org/?p=186</guid>
		<description><![CDATA[This 2010 paper, a collaboration between security and machine learning researchers, makes the bold claim that rather than invest your resources in making your system as secure as possible up-front, in many scenarios it's just as good, or even <em>preferable</em>, to fix security problems as attackers discover and exploit them, a paradigm they call <em>reactive security</em>.]]></description>
			<content:encoded><![CDATA[<p><strong>Citation</strong>: A. Barth, Benjamin I. P. Rubinstein, M. Sundararajan, J. C. Mitchell, Dawn Song, and Peter L. Bartlett. A learning-based approach to reactive security. In Proceedings of Financial Cryptography and Data Security (FC10), 2010. (<a href="http://www.adambarth.com/papers/2010/barth-rubinstein-sundararajan-mitchell-song-bartlett.pdf">PDF</a>)</p>
<p><strong>Abstract</strong>: Despite the conventional wisdom that proactive security is superior to reactive security, we show that reactive security can be competitive with proactive security as long as the reactive defender learns from past attacks instead of myopically overreacting to the last attack. Our game-theoretic model follows common practice in the security literature by making worst-case assumptions about the attacker: we grant the attacker complete knowledge of the defender’s strategy and do not require the attacker to act rationally. In this model, we bound the competitive ratio between a reactive defense algorithm (which is inspired by online learning theory) and the best fixed proactive defense. Additionally, we show that, unlike proactive defenses, this reactive strategy is robust to a lack of information about the attacker’s incentives and knowledge.</p>
<p><strong>Discussion</strong>: This 2010 paper, a collaboration between security and machine learning researchers, makes the bold claim that rather than invest your resources in making your system as secure as possible up-front, in many scenarios it&#8217;s just as good, or even <em>preferable</em>, to fix security problems based on what systems attackers target, a paradigm they call <em>reactive security</em>. It justifies this with a simple game-theoretic model in which a defender with finite resources, typically a high-level security manager, must allocate them among particular lower-level security tasks.</p>
<div style="float: right;"><a href="http://papersincomputerscience.org/wp-content/uploads/2011/03/Reactive_security_example.png"><img class="alignnone size-full wp-image-189" title="Reactive_security_example" src="http://papersincomputerscience.org/wp-content/uploads/2011/03/Reactive_security_example.png" alt="" width="300" height="155" /></a></div>
<p>The primary model used by the paper, of interest on its own, is a two-player game taking place on a graph: the attacker begins at a <em>start vertex</em> and moves through the graph by executing successful attacks. Each vertex has a <em>payoff</em> value that the attacker receives once that vertex is reached, and each edge has a cost, indicating how much the attacker must pay to cross it (representing effort invested in an attack). The initial cost of an edge is based on the surface area of the system being attacked, and the defender can <em>invest</em> in defending an edge, temporarily increasing its cost for one round, but has only finite resources to go around during each round. Attackers and defenders alternate in rounds: the defender picks a set of edges to invest in, then the attacker gets to execute a series of attacks, always beginning at the start vertex. Finally, edge costs are hidden from the defender until the attacker uses the edge; this models how difficult it is to determine <em>a priori</em> where the security weaknesses in a system are.</p>
<p>Although the examples in the paper have vertices representing system components like machines on a network, I think vertices in the graph are best thought of not as particular systems being exploited, but rather a <em>set of resources</em> controlled by the attacker, or more generally, <em>the current attacking capabilities of the attacker</em>. At the start vertex, they control nothing but their own system; as each edge is traversed, they add new attacking resources to their collection. This allows systems like the one shown to the right, where an attacker may or may not have <em>full</em> control over a front-end system before attacking a back-end system, to be modelled.</p>
<p>The main result of the paper is this: under the assumption that the defender&#8217;s investment in an edge is linearly reflected in the attacker&#8217;s cost for that edge, a specific machine learning algorithm for reactive security (based on placing more weight on recently attacked nodes and exponentially decaying the weight over time) performs comparably to a <em>pure proactive</em> approach, in which the defender knows all edge costs <em>a priori</em> and picks a fixed, optimal strategy. This is done essentially by reducing the problem to a standard online learning problem and using known results. Moreover, they argue that in cases where the edge costs are estimated incorrectly or the attacker acts unexpectedly due to incomplete knowledge, the reactive security algorithm is superior to the proactive approach. Besides providing support for reactive security, this model also provides formal support for other pragmatic measures like <a href="http://en.wikipedia.org/wiki/Defense_in_depth_%28computing%29"><em>defense in depth</em></a>, which is investing in defense measures that are never needed unless some other defensive measure is first overcome.</p>
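<p>To make the idea of weighting recent attacks and decaying that weight over time concrete, here is a minimal sketch. It is not the paper&#8217;s exact algorithm: the function name <code>reactive_allocation</code> and the <code>decay</code> and <code>rate</code> parameters are assumptions chosen for illustration only.</p>

```python
import math

def reactive_allocation(attack_history, edges, budget, decay=0.5, rate=1.0):
    """Split `budget` across `edges` based on past attacks (illustrative only).

    attack_history: list of rounds, oldest first; each round is the list of
    edges the attacker used in that round.
    decay: factor in (0, 1) applied per round of age, so older attacks count less.
    rate: how sharply the allocation reacts to the decayed attack scores.
    """
    scores = {e: 0.0 for e in edges}
    # Walk the history from newest to oldest; age 0 is the latest round.
    for age, attacked in enumerate(reversed(attack_history)):
        for e in attacked:
            scores[e] += decay ** age
    # Exponential weighting: heavily attacked edges get a larger share,
    # but every edge keeps some defense.
    weights = {e: math.exp(rate * s) for e, s in scores.items()}
    total = sum(weights.values())
    return {e: budget * w / total for e, w in weights.items()}
```

<p>With a history in which edge <code>a</code> was attacked two rounds ago and edge <code>b</code> last round, <code>b</code> receives the largest share, <code>a</code> a smaller one, and a never-attacked edge the smallest, while the shares still sum to the budget.</p>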
<p>Although the model is simple, it is quite general: not only can vertices represent machines on a LAN, but also components of an application, or a hybrid thereof. Traversing an edge can correspond to an attack from outside, a privilege elevation on a single machine, or to an attack between machines on a LAN. Two different attacks on the same system can be expressed by distinct edges with distinct costs, if the attacker possesses a different set of resources in each case, which makes sense. Even social engineering can be modelled: once the attacker has invested in tricking or bribing the employee, they can add the employee to their set of resources, lowering the cost of attacking new systems.</p>
<p>On the other hand, the model also has a number of weaknesses. A couple were made explicit in the paper:</p>
<ul>
<li>The model assumes that the mapping from defender investment in an edge to attacker cost to overcome it is linear. This is justified heuristically with the claim that the rate of discovering new security defects in software is roughly constant. Experience shows, however, that vulnerabilities that are easy to exploit can be hard to fix (e.g. DoS attacks), and vulnerabilities that are hard to exploit can be easy to fix (e.g. a buffer overflow in code that doesn&#8217;t directly handle user input). Moreover, the same &#8220;investment&#8221; in a system can be spent many different ways, and the model offers little insight into which way is most effective. One solution is to successively use the model itself to decompose vertices into subgraphs.</li>
<li>Reactive security is unsuitable for dealing with situations in which an attack is so devastating to the company that it cannot recover. For example, a company that suffers a high-profile blow to its reputation in the press may see a drop in sales that ultimately drives it into bankruptcy, or may have major investors pull out. Even an optimal response in such a scenario is too little, too late.</li>
</ul>
<p>Some other weaknesses are less obvious:</p>
<ul>
<li>The assumption that edge costs become known as soon as an attacker exploits that edge. In reality, the detection of a single exploit of a system gives very little information about the overall security of a system, particularly after that exploit is repaired. The frequency of exploits of an edge might be a better indicator (the learning may already effectively take this into account).</li>
<li>The assumption that edges have a fixed cost which only increases temporarily if the defender invests in them. In reality, the cost of an attack depends on many dynamic factors, including the skill set and resources available to the attacker marketplace, and the behavior of the system under attack. As new attackers enter the attacker marketplace, as attackers learn new techniques, as transaction costs among attackers change (or as they form teams), or as new hacking tools become available, the cost of attack can go up or down. Patches, upgrades, installation of new applications, or even changes in load can expose new vulnerabilities or fix old ones (in particular, patches implemented in reaction to particular attacks do not go away after the round is over). All these factors can change quickly and are largely invisible to the defender.</li>
<li>Similarly, payoff is assumed to be fixed from round to round. Payoff changes by the minute based not only on what new information the attacked resources are storing, but also the marketplace value of the information, which can be rapidly shifting and unpredictable. Reactive investment in defense of a system that, tomorrow, is of little interest to attackers is a bad idea.</li>
<li>Edge costs are assumed to be independent, in the sense that investing in one edge does not affect the cost of any other edge. In reality, as is easy to see in the diagram above, improvements to one edge may quite directly affect the cost of another related edge as well. Another common example would be rolling out an operating system upgrade in the data center, which could increase the cost of all edges simultaneously.</li>
<li>Attackers are assumed, after each round, to lose control over all their attacked resources. Long-lived attacks such as rootkits can go undetected during attempts to clean up after an attack, allowing the attacker to start in the next round with more resources in the bag at the beginning. In the worst case, the rootkit itself delivers a positive payoff to the attacker, and the attacker doesn&#8217;t need to take any additional action at all!</li>
</ul>
<p>Although the model provides formal evidence that reactive security can be beneficial, it also provides formal evidence that <em>extreme</em> reactive security, pouring all your resources into the system that was just attacked last week, is a terrible strategy. Simple attack strategies which alternate between systems can exploit this kind of reactionary tactic.</p>
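<p>A toy simulation can illustrate why the purely myopic defender fares badly against an alternating attacker. Everything here is an assumption made for illustration: two edges named <code>A</code> and <code>B</code>, a base cost of 1, a budget of 1 that adds linearly to edge cost, and the hypothetical helper names below. None of it comes from the paper.</p>

```python
def attacker_total_cost(defend, rounds=10, base_cost=1.0, budget=1.0):
    """Total effort an alternating attacker spends against a defender.

    defend(last_attacked) returns the defender's budget split as fractions
    over the two edges "A" and "B"; investment adds linearly to edge cost.
    """
    last_attacked = None
    total = 0.0
    for r in range(rounds):
        allocation = defend(last_attacked)
        target = "A" if r % 2 == 0 else "B"   # attacker simply alternates
        total += base_cost + budget * allocation[target]
        last_attacked = target
    return total

def myopic(last_attacked):
    # Pour everything into whatever was attacked in the previous round.
    if last_attacked is None:
        return {"A": 0.5, "B": 0.5}
    return {"A": 1.0 if last_attacked == "A" else 0.0,
            "B": 1.0 if last_attacked == "B" else 0.0}

def even_split(last_attacked):
    # Ignore history and defend both edges equally.
    return {"A": 0.5, "B": 0.5}
```

<p>Over ten rounds the alternating attacker pays 10.5 against <code>myopic</code> but 15.0 against <code>even_split</code>: after the first round, the myopic defender&#8217;s budget is always sitting on the edge the attacker has just left.</p>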
<p>Despite the many weaknesses and limitations enumerated above, as I would expect to find in any nascent research area, I find this work exciting and think it opens up a range of possibilities for software security management and challenges the intuition that reacting to attackers is impulsive or short-sighted. Perhaps future work in this area may provide richer models that will offer more new and surprising strategies to defenders.</p>
]]></content:encoded>
			<wfw:commentRss>http://papersincomputerscience.org/2011/03/25/a-learning-based-approach-to-reactive-security/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Conditioned-safe Ceremonies and a User Study of an Application to Web Authentication</title>
		<link>http://papersincomputerscience.org/2010/06/13/conditioned-safe-ceremonies-and-a-user-study-of-an-application-to-web-authentication/</link>
		<comments>http://papersincomputerscience.org/2010/06/13/conditioned-safe-ceremonies-and-a-user-study-of-an-application-to-web-authentication/#comments</comments>
		<pubDate>Sun, 13 Jun 2010 10:30:38 +0000</pubDate>
		<dc:creator>dcoetzee</dc:creator>
				<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://papersincomputerscience.org/?p=181</guid>
		<description><![CDATA[This paper introduces conditioned-safe ceremonies, an informal model for security protocols that explicitly models the actions of users. Rather than conservatively considering users to be unpredictable agents capable of any action, it takes advantage of their properties as creatures of habit to help facilitate the desired, secure outcome.]]></description>
			<content:encoded><![CDATA[<p><strong>Citation</strong>: Chris Karlof, J. Doug Tygar, David Wagner. Conditioned-safe Ceremonies and a User Study of an Application to Web Authentication. Sixteenth Annual Network and Distributed Systems Security Symposium,  2009. (<a href="http://www.cs.berkeley.edu/~daw/papers/condsafe-ndss09.pdf">PDF</a>)</p>
<p><strong>Abstract</strong>: We introduce the notion of a conditioned-safe ceremony. A “ceremony” is similar to the conventional notion of a protocol, except that a ceremony explicitly includes human participants. Our formulation of a conditioned-safe ceremony draws on several ideas and lessons learned from the human factors and human reliability community: forcing functions, defense in depth, and the use of human tendencies, such as rule-based decision making. We propose design principles for building conditioned-safe ceremonies and apply these principles to develop a registration ceremony for machine authentication based on email. We evaluated our email registration ceremony with a user study of 200 participants. We designed our study to be as ecologically valid as possible: we employed deception, did not use a laboratory environment, and attempted to create an experience of risk. We simulated attacks against the users and found that email registration was significantly more secure than challenge question based registration. We also found evidence that conditioning helped email registration users resist attacks, but contributed towards making challenge question users more vulnerable.</p>
<p><strong>Discussion</strong>: This paper from NDSS 2009 introduces <em>conditioned-safe ceremonies</em>, an informal model for security protocols that explicitly models the actions of users. Rather than conservatively considering users to be unpredictable agents capable of any action, it takes advantage of their properties as creatures of habit to help facilitate the desired, secure outcome.</p>
<p>Many of us are familiar with the problem of security warnings sometimes known as <em>click fatigue</em>: when a user is asked to dismiss a security warning frequently during normal operation, they begin to disregard it in all situations. This was for example the primary criticism of Windows Vista&#8217;s <a href="http://en.wikipedia.org/wiki/User_Account_Control">User Account Control</a> (UAC) feature. There are two reasons for this: one is that security is not the primary concern of users who are focused on completing their primary task; the other is that humans asked to perform a process repeatedly will naturally begin to streamline the process by omitting optional steps and completing mandatory steps using rapid rule-based processing and simple pattern matching. If a situation called for a particular response in the past, visually similar stimuli will encourage users to perform the same task nearly automatically. In psychology this kind of decision-making strategy that settles upon an adequate solution is known as <a href="http://en.wikipedia.org/wiki/Satisficing"><em>satisficing</em></a>, and is very difficult to reverse.</p>
<p>Unfortunately this is precisely the type of user behavior exploited by phishers: a typical user presented with a log-in form resembling one they have used many times in the past will thoughtlessly enter their credentials, having long since eliminated the optional steps of carefully examining the security indicators that would expose it as a fraudulent replica.</p>
<p>A <a href="http://en.wikipedia.org/wiki/Cryptographic_protocol">cryptographic protocol</a> &#8211; such as <a href="http://en.wikipedia.org/wiki/Transport_Layer_Security">SSL</a> &#8211; is usually described in terms of a number of nodes representing participating machines which exchange messages over channels. A <em>ceremony</em>, coined by Intel&#8217;s Jesse Walker, extends the concept of protocols by incorporating nodes for the human users themselves and explicitly representing communication between users and their machines via I/O devices (Carl Ellison. <a href="http://eprint.iacr.org/2007/399.pdf">Ceremony Design and Analysis</a>. Cryptology ePrint Archive, Report 2007/399, 2007). This model opens up opportunities for modelling user behavior.</p>
<p>A <em>conditioned-safe</em> ceremony is one designed under the assumption that users will satisfice and behave according to habit; it operates by <em>conditioning</em> the user to follow certain <em>rules</em> during the ceremony. A simple example of conditioning is the Windows log-in screen, which asks the user to press CTRL+ALT+DEL before logging in. This key combination signals to the operating system that the information entered in the log-in dialog should not be made available to applications or keyboard interception drivers. Because users are always asked to do this, and are not permitted to skip this step, they develop a consistent habit of doing so. Making it impossible to log in without first pressing this key combination is an example of a <em>forcing function</em>: forcing functions encourage conditioning and discourage omitting steps which may seem unimportant.</p>
<p>A conditioned-safe ceremony should satisfy several important properties:</p>
<ul>
<li>It should only condition <em>safe </em>rules &#8211; &#8220;rules that are harmless to apply in the presence of an adversary.&#8221;</li>
<li>It should condition at least one <em>immunizing </em>rule &#8211; &#8220;a rule which when applied during an attack causes the attack to fail.&#8221;</li>
<li>Conditioned rules should be safe to  follow under all circumstances, without any complex  decision-making.</li>
<li>It should not assume users will reliably perform any action that is not conditioned by the ceremony.</li>
</ul>
<p>There are two different types of errors a user can make during a ceremony which may threaten its security:</p>
<ul>
<li>An error of <em>omission</em>: The user was expected to apply a rule but took no action.</li>
<li>An error of <em>commission</em>: The user took an unexpected action not conditioned by the ceremony.</li>
</ul>
<p>An attacker may attempt to induce either type of error. For example, if an application pops up a Windows log-in box, a user may unthinkingly enter their password without pressing the protective CTRL+ALT+DEL hotkey, because they were not instructed to do so in this instance; this is an error of <em>omission</em>. An error of <em>commission</em> is usually induced by the attacker giving the user specific instructions, such as &#8220;visit this URL in your web browser.&#8221; Users tend to be suspicious of unfamiliar instructions, making attacks of this type more difficult. An ideal conditioned-safe ceremony should protect against as many errors of both types as possible.</p>
<p>The paper presents an example conditioned-safe ceremony for machine validation: the first time a user logs into a site from a particular machine, they must validate their identity. To do this, the site sends them an e-mail containing a link that they must click; after the link is clicked, a cookie is installed and the user has full access to the site from their current machine. The link only works once. The goal of the attacker is to trick the user into <em>not</em> clicking on the link in the e-mail, instead giving it to the attacker; to accomplish this, they display a phishing web page giving specific instructions on how to do this.  This involves both an error of omission (not clicking on a link that they usually click on), and an error of commission (pasting the link into the website, an action they do not normally take). The expectation is that, if users tend to perform actions they are accustomed to, they will ignore or fail to complete the attacker&#8217;s instructions, and the attack will fail.</p>
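<p>The server side of such a ceremony might look roughly like the sketch below. This is a toy model under stated assumptions, not the paper&#8217;s implementation: the class <code>RegistrationSite</code> and its method names are hypothetical, and e-mail delivery and browser cookies are reduced to plain return values.</p>

```python
import secrets

class RegistrationSite:
    """Toy model of the email registration ceremony (names are hypothetical)."""

    def __init__(self):
        self.pending = {}            # one-time token -> user
        self.trusted_cookies = {}    # cookie -> user

    def start_registration(self, user):
        """Called on first login from a new machine; returns the emailed token."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = user
        return token

    def click_link(self, token):
        """Visiting the emailed link: consumes the token, returns a cookie."""
        user = self.pending.pop(token, None)   # pop makes the link single-use
        if user is None:
            return None
        cookie = secrets.token_urlsafe(16)
        self.trusted_cookies[cookie] = user
        return cookie

    def is_trusted(self, user, cookie):
        """True if this cookie was issued to this user by a clicked link."""
        return self.trusted_cookies.get(cookie) == user
```

<p>In this sketch, whoever presents the token first receives the cookie, and a second visit to the same link returns nothing. That is why the attacker needs the user to hand over the link instead of clicking it, and why habitual clicking acts as the immunizing rule.</p>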
<p>Sure enough, the experiment bears this out: although as many as 40% of users fall for the attack described above, an alternative design that does not follow the principles of conditioned-safe ceremonies leads to attack  success rates of over 90%. Interviews with the subjects who didn&#8217;t fall for the attack show that over half of them didn&#8217;t notice the attacker&#8217;s special instructions, or thought they were unimportant &#8211; the same inattentive attitude that makes security warnings useless now <em>benefits</em> users!</p>
<p>The most exciting thing about this work to me is that it&#8217;s one of the first to adopt and exploit a successful model of human behavior in the design of security protocols &#8211; a critical step, as humans all too often remain the weakest link in any secure system. Previous efforts have given great insight into the types of errors  people make, but not into how designs can work around these limitations in human behavior.</p>
<p>On the other hand, the informal model presented in this work stands in stark contrast to the mathematical models used in cryptography, where cryptographic protocols are routinely subjected to formal verification techniques such as model checking and theorem proving &#8211; an important future direction is to generalize these same tools to ceremonies. Moreover, although satisficing behavior is evidently an important component in user behavior, it is obviously not the only such component: 40% of users were persuaded to commit multiple errors in the ceremony. It should come as no surprise that humans are complex creatures that cannot be adequately modelled by a simple set of conditioned rules. In this case, what processes underlie these divergent behaviors, and how can they be modelled? Another important question involves modelling of errors, or divergence of users from the model: can we empirically predict the likelihood of certain sets of errors occurring, and then formally validate that attacks are not possible  in the most likely scenarios? The attacks in this work were relatively <em>ad hoc</em> and don&#8217;t seem to rule out the possibility of another attack involving only a single user error.</p>
<p>In short, the area of ceremonies is fertile ground for the development of new models that can effectively predict the behavior of the system as a whole, facilitating the development of protocols that will subtly push users towards making all the right security decisions, even when security is the last thing on their mind.</p>
<p><em>The author releases all rights to all content herein and grants this  work into the public domain, with the exception of works owned by  others such as abstracts, quotations, and WordPress theme content.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://papersincomputerscience.org/2010/06/13/conditioned-safe-ceremonies-and-a-user-study-of-an-application-to-web-authentication/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Efficient Software-Based Fault Isolation</title>
		<link>http://papersincomputerscience.org/2009/12/19/efficient-software-based-fault-isolation/</link>
		<comments>http://papersincomputerscience.org/2009/12/19/efficient-software-based-fault-isolation/#comments</comments>
		<pubDate>Sat, 19 Dec 2009 22:37:27 +0000</pubDate>
		<dc:creator>dcoetzee</dc:creator>
				<category><![CDATA[Operating systems]]></category>
		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://papersincomputerscience.org/?p=174</guid>
		<description><![CDATA[This 1993 paper describes a software-based method for isolating modules on RISC machines, with immediate application to fine-grained privilege separation. Address spaces, implemented in hardware, are used to isolate processes in modern commodity OS's, and software fault isolation (SFI) is an alternative with several advantages: most notably, it requires no hardware support, and communication cost between protection domains is much lower.]]></description>
			<content:encoded><![CDATA[<p><strong>Citation</strong>: Wahbe, R., Lucco, S., Anderson, T. E., and Graham, S. L. 1993. Efficient software-based fault isolation. In <em>Proceedings of the Fourteenth ACM Symposium on Operating Systems Principles</em> (Asheville, North Carolina, United States, December 05 &#8211; 08, 1993). SOSP &#8217;93. ACM, New York, NY, 203-216. (<a href="http://www.cs.washington.edu/homes/tom/pubs/sfi.ps">PS</a>) (<a href="http://crypto.stanford.edu/cs155/papers/sfi.pdf">PDF</a>)</p>
<p><strong>Abstract</strong>: One way to provide fault isolation among cooperating software modules is to place each in its own address space. However, for tightly-coupled modules, this solution incurs prohibitive context switch overhead. In this paper, we present a software approach to implementing fault isolation within a single address space. Our approach has two parts. First, we load the code and data for a distrusted module into its own <em>fault domain</em>, a logically separate portion of the application&#8217;s address space. Second, we modify the object code of a distrusted module to prevent it from writing or jumping to an address outside its fault domain. Both these software operations are portable and programming language independent.</p>
<p>Our approach poses a tradeoff relative to hardware fault isolation: substantially faster communication between fault domains, at a cost of slightly increased execution time for distrusted modules. We demonstrate that for frequently communicating modules, implementing fault isolation in software rather than hardware can substantially improve end-to-end application performance.</p>
<p><strong>Discussion</strong>: This 1993 paper by <a href="http://www.microsoft.com/presspass/exec/wahbe/">Wahbe</a> and <a href="http://www.microsoft.com/presspass/exec/de/Lucco/default.mspx">Lucco</a> (now at Microsoft), <a href="http://www.cs.washington.edu/homes/tom/">Thomas E. Anderson</a> (now at the University of Washington), and <a href="http://www.eecs.berkeley.edu/~graham/">Susan L. Graham</a> describes a software-based method for isolating modules on RISC machines, with immediate application to fine-grained privilege separation. Address spaces, implemented in hardware, are used to isolate processes in modern commodity OS&#8217;s, and software fault isolation (SFI) is an alternative with several advantages: most notably, it requires no hardware support, and communication cost between protection domains is much lower.</p>
<p><strong>Background</strong></p>
<p>Suppose a system is divided into <em>modules </em>(components) &#8211; for example, a multitasking operating system may have a kernel with various applications running on top of it. A system is said to provide <em>fault isolation</em> if it can recover from the failure of a module without risking the integrity of the rest of the system. For example, if you&#8217;re playing a game and it crashes, that shouldn&#8217;t cause your web browser &#8211; or your entire computer &#8211; to crash, or even to malfunction. Fault isolation is valuable for mitigating failures in large software systems where failures are inevitable.</p>
<p>Fault isolation also forms the foundation for a valuable security technique called <a href="http://en.wikipedia.org/wiki/Privilege_separation"><em>privilege separation</em></a>: if each module is only permitted to perform a limited set of certain operations, then even if a module is compromised by an attacker, the attacker only gains access to the privileges held by that module, rather than the entire system. For example, if you&#8217;re playing a game it should only have access to the game data files, not the files containing your bank information, which it doesn&#8217;t need; then, even if an attacker hijacks your game, they still can&#8217;t access your bank information. Privilege separation allows security vulnerabilities in large systems to be mitigated and allows the effort of security verification and auditing to be concentrated on the modules that hold the most dangerous privileges.</p>
<p>In a typical fault isolation system, each module <em>owns</em> some subset of memory where it stores its code and private data structures. To ensure that a misbehaving module does not compromise the integrity of other modules, modules must not be able to write to memory owned by other modules. Moreover, modules must communicate in a controlled manner, usually through an explicit interface, so that other modules can&#8217;t trick a module into modifying its own memory maliciously on their behalf.</p>
<p>Today, the primary mechanism used to enforce these properties is <em>dynamic address translation</em>, which maps virtual addresses (memory locations accessed by the application) to physical addresses (locations on the memory device itself) on-the-fly. This functionality is implemented by a specialized piece of hardware called a <a href="http://en.wikipedia.org/wiki/Memory_management_unit">memory management unit</a> (MMU). By introducing this layer of indirection, the operating system can ensure that a process has no ability to read or write memory belonging to other processes simply by configuring the MMU to provide no mapping from any virtual address to these locations. Because only the operating system kernel can reconfigure the MMU, this protection is secure enough to use with malicious code.</p>
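<p>As a rough illustration of how a missing mapping provides isolation, here is a minimal single-level page table sketch. Real MMUs use multi-level tables and permission bits; the 4096-byte page size and the names <code>translate</code> and <code>PageFault</code> are assumptions for this example.</p>

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

class PageFault(Exception):
    """Raised when a process touches a virtual page it has no mapping for."""

def translate(page_table, virtual_address):
    """Map a virtual address to a physical one, or fault if unmapped.

    page_table: dict from virtual page number to physical frame number,
    set up per process by the (trusted) operating system.
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        raise PageFault(hex(virtual_address))
    return page_table[page] * PAGE_SIZE + offset
```

<p>A process whose table is <code>{0: 7}</code> can reach physical frame 7 through virtual page 0, but any other address faults, so memory belonging to other processes is simply unreachable.</p>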
<p>The downside to MMU-based protection is that communication is expensive. Suppose module A wants to send a message to module B. Normally, if both modules had the same view of memory, it would just call module B and pass it a pointer to the message. This takes only a few instructions and is very fast. When two modules have different address space mappings, however, the cost rises dramatically: just switching from running one module to running the other requires a context switch, which involves saving and restoring the complete register set, reconfiguring the MMU, and flushing the MMU&#8217;s cache (the <a href="http://en.wikipedia.org/wiki/Translation_lookaside_buffer">translation lookaside buffer</a>). On top of that, it can&#8217;t just pass a pointer to the message, because for module B that same pointer may map to a different location in physical memory. There are <a href="http://en.wikipedia.org/wiki/Inter-process_communication">many different ways</a> to send messages from one process to another, but these all involve overhead.</p>
<p>In applications where the number of messages sent is large, this overhead rapidly becomes prohibitive. An example is a Gigabit Ethernet driver, which has to send a new packet to the network stack about every 12 microseconds. Considering the CPU time needed to parse the headers, this leaves very little time for expensive context switches. As a consequence, in practice components like this are simply not modularized; if they crash, the system crashes.</p>
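<p>As a rough sanity check on that figure, here is the arithmetic under the assumption that every frame carries a full 1500-byte payload (the standard Ethernet MTU) and that framing overhead is ignored. The function and constant names are illustrative only.</p>

```python
# Time to put one Ethernet frame on a 1 Gbit/s link.
# Assumptions: 1500-byte frames, no preamble or inter-frame gap counted.
LINK_BITS_PER_SECOND = 1_000_000_000
FRAME_BYTES = 1500

def frame_interval_us(frame_bytes, bits_per_second):
    """Return the time in microseconds needed to transmit frame_bytes."""
    return frame_bytes * 8 * 1_000_000 / bits_per_second

interval = frame_interval_us(FRAME_BYTES, LINK_BITS_PER_SECOND)
print(interval, "microseconds per frame")
```

<p>Smaller frames arrive even more often, so the time budget per packet only gets tighter.</p>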
<p>More generally, any system that is decomposed into fine-grained modules will tend to exhibit a lot of communication between modules &#8211; the performance disadvantage of doing this under the MMU model outweighs the security and robustness advantages of having smaller modules.</p>
<p><strong>Software fault isolation (SFI)<br />
</strong></p>
<p>Software-based fault isolation, or SFI, aims to provide a substantially different model for fault isolation emphasizing fast communication between modules. In this model, all modules have the same view of memory (run in the same address space), but different subsets of memory are owned by different modules, and every write to memory is checked to make sure the current module owns that memory. Additionally, function calls between modules are controlled so that each module can only be entered at a predetermined set of entry points described in its interface specification.</p>
<p>Despite the name, the important thing here is not that these mechanisms are implemented in software &#8211; this offers the convenience of deploying the solution on existing commodity platforms, but is not essential, and indeed most SFI systems rely on some creative combination of software mechanisms and (existing) hardware mechanisms.</p>
<p>In its simplest form, this model is straightforward to provide: imagine you have a trusted compiler and you use it to compile an application consisting of multiple modules. Each module is assigned a fixed range of memory for its code and data. Whenever the compiler emits a write instruction, it also emits a check to make sure that the address being written to is in the range of the module being executed. Likewise, whenever the compiler emits an indirect branch or jump, it inserts a check to make sure that the jump target either lies within the current module or is a valid entry point of some other module. This simple solution has two main problems:</p>
<ol>
<li>It&#8217;s really slow.</li>
<li>It&#8217;s possible for a malicious module to circumvent the checks. For example, it could insert an indirect branch to one of its own write instructions, skipping the check in front of it.</li>
</ol>
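<p>To make the naive scheme concrete, here is a minimal simulation of the check a trusted compiler might emit in front of every write. It is a sketch, not the code from the paper: <tt>Module</tt>, <tt>ProtectionFault</tt> and <tt>checked_write</tt> are hypothetical names, and memory is modelled as a plain byte array.</p>

```python
class ProtectionFault(Exception):
    """Raised when a module writes outside the memory it owns."""

class Module:
    def __init__(self, name, base, size):
        self.name = name
        self.base = base      # first address owned by the module
        self.size = size      # number of bytes owned

    def owns(self, address):
        return self.base <= address < self.base + self.size

def checked_write(memory, module, address, value):
    """Stand-in for an emitted store: a range check, then the write."""
    if not module.owns(address):
        raise ProtectionFault(f"{module.name} may not write {address:#x}")
    memory[address] = value

memory = bytearray(64)
driver = Module("driver", base=0, size=32)
checked_write(memory, driver, 5, 0xAB)   # inside the driver's range: allowed
```

<p>The comparison and conditional branch on every store are what make this slow, and nothing here stops code from jumping straight to the store and skipping the check.</p>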
<p>To deal with the circumvention problem, Wahbe et al use a <em>dedicated register</em> that holds an address. When the compiler emits code, it maintains two invariants:</p>
<ol>
<li>All writes must be performed to the address stored in the dedicated register;</li>
<li>The dedicated register should almost always point to a valid address inside the current module; if any instruction invalidates this invariant, the program must either fail or restore the invariant before the next write or indirect branch instruction.</li>
</ol>
<p>Now, a module is free to jump to any instruction it wants within its own bounds, without risking writing to another module&#8217;s memory. This does imply that the dedicated register can&#8217;t be used for any other purpose, but on a RISC machine with 32 registers this is not a problem. The same trick can be used to protect indirect branches (jumps to data must be excluded; this can be done either by using a second dedicated register for code, or by marking all data non-executable).</p>
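<p>The effect of the two invariants can be sketched with a toy straight-line machine. Everything here is an assumption made for illustration: the instruction names <tt>set_dedicated</tt> and <tt>store</tt> are invented, and the check-then-commit step is treated as a single atomic instruction, which is simpler than the real instruction sequences.</p>

```python
# Toy machine for illustration only; instruction names are hypothetical.
DATA_BASE, DATA_SIZE = 0x100, 0x100    # region owned by the current module

class Fault(Exception):
    """Raised when code tries to load an invalid address into the register."""

def run(program, start, regs, memory):
    """Execute program from index start; return the addresses written."""
    dedicated = DATA_BASE              # invariant 2: always inside the region
    written = []
    for op, arg in program[start:]:
        if op == "set_dedicated":      # the only way to change the register
            address = regs[arg]
            if not (DATA_BASE <= address < DATA_BASE + DATA_SIZE):
                raise Fault(hex(address))
            dedicated = address
        elif op == "store":            # invariant 1: stores use the register
            memory[dedicated] = regs[arg]
            written.append(dedicated)
    return written

program = [("set_dedicated", "r1"), ("store", "r2")]
regs = {"r1": 0x900, "r2": 42}         # r1 points outside the module
memory = {}
written = run(program, 1, regs, memory)  # "jump" straight to the store
```

<p>Starting at index 1 models an indirect branch that skips the check. The store still lands inside the module, because the register it writes through never held the bad address.</p>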
<p>This leaves open the question of how to allow calls between modules (called <em>cross-fault-domain RPCs</em> by Wahbe et al). The scheme implemented in this work stores a jump table inside each module, with one entry for each possible cross-module call. The trusted compiler permits this jump table (and only this jump table) to contain branches to points outside the module. This allows the checks on indirect jumps to be very simple (they just have to make sure modules only jump to their own code region). Rather than transferring control directly to another domain, these jump tables transfer control to call stubs which perform several important operations before invoking the actual call:</p>
<ul>
<li>Because each module needs to access its own data on the call stack, each module is given a private stack in its own data region. When transferring control to another module, we have to switch to its stack and copy across any stack-based arguments. A similar mechanism is used by commodity OSs when trapping in and out of the kernel.</li>
<li>Standard calling conventions include callee-save registers, registers which must not be altered by the function being called. Since the called module may be malicious, the call stub saves these registers instead.</li>
<li>The dedicated register invariants must be restored upon return.</li>
</ul>
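<p>The three duties of a call stub listed above can be sketched as follows. This is a simplified model under stated assumptions, not the stub code from the paper: registers are a dictionary, stacks are lists, and the register names <tt>s0</tt>, <tt>s1</tt> and <tt>dedicated</tt> are hypothetical.</p>

```python
CALLEE_SAVED = ("s0", "s1")    # hypothetical callee-save register names
DEDICATED = "dedicated"        # hypothetical name for the dedicated register

def make_call_stub(entry_point, callee_stack, caller_region_base):
    """Wrap an entry point of another module in a trusted call stub."""
    def stub(regs, caller_stack, arg_count):
        # Save the registers the caller expects to survive the call.
        saved = {name: regs[name] for name in CALLEE_SAVED}
        # Switch stacks: copy stack-based arguments to the callee's stack.
        args = caller_stack[len(caller_stack) - arg_count:]
        depth = len(callee_stack)
        callee_stack.extend(args)
        result = entry_point(regs, callee_stack)
        # Back in the caller: drop the callee's frame and restore state.
        del callee_stack[depth:]
        regs.update(saved)
        regs[DEDICATED] = caller_region_base
        return result
    return stub

def untrusted_sum(regs, stack):
    regs["s0"] = 0xDEAD        # clobbers a callee-save register
    regs[DEDICATED] = 0        # and leaves the dedicated register invalid
    return stack[-1] + stack[-2]

jump_table = {"sum": make_call_stub(untrusted_sum, [], 0x1000)}
regs = {"s0": 7, "s1": 8, DEDICATED: 0x1000}
caller_stack = [99, 3, 4]
result = jump_table["sum"](regs, caller_stack, 2)
```

<p>The caller never hands its own stack to the callee and never relies on the callee to preserve registers, so a misbehaving callee can return a wrong answer but cannot corrupt the caller&#8217;s state.</p>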
<p>Finally, we return to the problem of performance: inserting complex checks before every write and indirect jump is expensive, especially if those checks involve branches. The insight of Wahbe et al is that if the code and data region for each module is the set of all addresses with some common bit prefix (say, addresses <tt>0x1f000000</tt> through <tt>0x1fffffff</tt>) then we can make sure that all writes and jumps go into this region by merely bitmasking the target address to have the correct prefix. If the original address lies outside the region, this will cause some arbitrary part of the module to get overwritten or jumped to &#8211; but only malicious or invalid code will encounter this behavior, so it&#8217;s not a problem (except, perhaps, during debugging).</p>
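<p>A minimal sketch of the masking step, assuming 32-bit addresses and a module whose region is identified by the 8-bit prefix <tt>0x1f</tt>. The constant and function names are illustrative and not taken from the paper.</p>

```python
SEGMENT_BITS = 8                       # assumed width of the region prefix
OFFSET_BITS = 32 - SEGMENT_BITS
OFFSET_MASK = (1 << OFFSET_BITS) - 1   # keeps the low 24 bits
SEGMENT_ID = 0x1F << OFFSET_BITS       # the prefix, shifted into place

def sandbox(address):
    """Force a 32-bit address into the module's region using AND and OR."""
    return (address & OFFSET_MASK) | SEGMENT_ID

inside = sandbox(0x1F001234)           # already in the region: unchanged
outside = sandbox(0x20001234)          # redirected into the region
```

<p>There is no comparison and no branch: a legal address passes through untouched, and an illegal one is silently redirected to the same offset within the module&#8217;s own region.</p>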
<p>Another important optimization involves the stack: writes to the stack are considerably more common than writes to the heap in typical programs, and are usually made at a small fixed offset from either the stack pointer or the frame pointer. Rather than check every one of these, they take advantage of the stack&#8217;s locality by maintaining the invariant that the stack pointer (or frame pointer) points inside the module&#8217;s data region, and only checking it when it&#8217;s modified. Since writes through the stack pointer may have a small offset attached, they create &#8220;guard regions&#8221; containing nothing useful before and after each module&#8217;s data region; even if the stack pointer is at the beginning or end of the region and the offset is set to the minimum or maximum value, it still can&#8217;t write past the guard regions.</p>
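<p>The guard region argument is a matter of bounds arithmetic, sketched below. The sizes are assumptions chosen for illustration: a 16-bit signed offset field, as on MIPS-like machines, and guard regions exactly as large as the largest offset magnitude.</p>

```python
DATA_BASE = 0x1F000000
DATA_SIZE = 0x01000000
MIN_OFFSET, MAX_OFFSET = -2**15, 2**15 - 1   # assumed 16-bit signed offset
GUARD_SIZE = 2**15                           # guard bytes on each side

def stack_write_target(sp, offset):
    """Address written by a store at sp + offset, with sp already checked."""
    if not (DATA_BASE <= sp < DATA_BASE + DATA_SIZE):
        raise ValueError("stack pointer invariant violated")
    if not (MIN_OFFSET <= offset <= MAX_OFFSET):
        raise ValueError("offset does not fit in the instruction")
    return sp + offset

def within_guarded_region(address):
    """True if address is in the data region or one of its guard regions."""
    low = DATA_BASE - GUARD_SIZE
    high = DATA_BASE + DATA_SIZE + GUARD_SIZE
    return low <= address < high

lowest = stack_write_target(DATA_BASE, MIN_OFFSET)
highest = stack_write_target(DATA_BASE + DATA_SIZE - 1, MAX_OFFSET)
```

<p>Both extreme cases land in a guard region rather than in a neighbouring module, which is why the individual stack writes need no check of their own.</p>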
<p>To actually implement privilege separation with SFI, a little something extra is needed &#8211; if all modules in an application could make the same system calls, they would effectively have the same privilege. As described in section 3.4, Wahbe et al&#8217;s scheme uses a simple but flexible model in which one module is permitted to make system calls freely and no other module is permitted to make any; it acts as an <em>arbitrator</em> and can implement arbitrary fine-grained policies by observing which module invoked it.</p>
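<p>A toy version of such an arbitrator might look like the following. The policy format, the module names and the stand-in system calls are all invented for this sketch. In a real system the identity of the calling module would be supplied by the trusted call stub rather than passed in by the caller.</p>

```python
# Stand-ins for real system calls; hypothetical, for illustration only.
SYSTEM_CALLS = {
    "read_clock": lambda: 1234,
    "open_file": lambda path: "handle:" + path,
}

class Arbitrator:
    """The one module allowed to make system calls; others must ask it."""

    def __init__(self, policy):
        self.policy = policy           # module name -> set of permitted calls

    def request(self, module_name, call, *args):
        allowed = self.policy.get(module_name, set())
        if call not in allowed:
            raise PermissionError(f"{module_name} may not call {call}")
        return SYSTEM_CALLS[call](*args)

arbitrator = Arbitrator({
    "parser": {"read_clock"},
    "loader": {"read_clock", "open_file"},
})
handle = arbitrator.request("loader", "open_file", "/tmp/x")
```

<p>Because every request passes through one place, the policy can be as coarse or as fine-grained as the application needs without changing the isolation mechanism itself.</p>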
<p>Evaluations show that with all the optimizations described above, the overall scheme is quite efficient &#8211; runtime overhead ranged from 0 to 12%, with an average of 4.3%. However, these figures cover only write checks. Checking reads as well, which the MMU model does and which is important for protecting sensitive module-private data such as passwords, costs considerably more (21.8% on average) because there are more reads than writes to check.</p>
<p>One practical issue with Wahbe et al&#8217;s scheme as a security solution is that it depends on a large, trusted compiler for correctness &#8211; later work such as MIT&#8217;s PittSFIeld mitigated this problem by using a separate, smaller verifier that checks the machine code when it is loaded.</p>
<p><strong>Practical impact and later work<br />
</strong></p>
<p>Despite the apparent utility of Wahbe et al&#8217;s scheme, sometimes termed <em>classical SFI</em>, it is not widely used in practice. This may be because the scheme depends critically on a number of features of RISC machines, and RISC machines are not widely deployed in the PC market (they are, however, quite common in the mobile and embedded market). It may also be because it was never effectively developed into a robust product or integrated well with other tools. More recent work, such as MIT&#8217;s <a href="http://people.csail.mit.edu/smcc/projects/pittsfield/">PittSFIeld</a> (2006) and <a href="http://pdos.csail.mit.edu/~baford/vm/">Vx32</a> (2008), has effectively extended SFI to CISC platforms such as the x86 and put more effort into building a robust prototype.</p>
<p>Although SFI is a hot research area now, I remain skeptical of approaches like Vx32 that depend on creative exploitation of legacy hardware that is in the process of being phased out. The way I see it, SFI presents a useful fault isolation model that is independent of its software-based implementation, and deserves explicit hardware support to mitigate its performance costs. Doing an efficient check for each read and write is exactly the sort of thing hardware is good at (and software is really bad at). The hard question is exactly what hardware support should be implemented to support typical applications &#8211; SFI is so rarely deployed that few applications have ever been architected with fine-grained modularization and privilege separation in mind, and the array of possible design choices is overwhelming. Some proposed schemes include <a href="http://groups.csail.mit.edu/cag/scale/mondriaan/index.html">Mondriaan Memory Protection</a> (2002) and the <a href="http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-97.html">Hard Object</a> (2009) project I worked on at UC Berkeley. Just as important is more work on programming tools to describe and implement module isolation and interfaces in a generic way that can be implemented using a variety of fault isolation schemes, ranging from no protection to dynamic address translation to SFI to new hardware schemes &#8211; this will help to bootstrap the process of getting applications built that are large enough for new isolation platforms to be evaluated. Regardless, this is going to be an important area of research in the near future and I&#8217;m excited to see what techniques will gain adoption.</p>
<p><em>The author releases all rights to all content herein and grants this work into the public domain, with the exception of works owned by others such as abstracts, quotations, and WordPress theme content.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://papersincomputerscience.org/2009/12/19/efficient-software-based-fault-isolation/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

