Members of the defense community identified the need for MLS-capable systems in the 1960s, and a few vendors implemented the basic features (Weissman 1969, Hoffman 1973, Karger and Schell 1974). However, government studies of the MLS problem emphasized the danger of relying on large, opaque operating systems to protect really valuable secrets (Ware 1970, Anderson 1972). Operating systems were already notorious for unreliability, and these reports highlighted the threat of a software bug allowing leaks of highly sensitive information. The recommended solution was to achieve high assurance through extensive analysis, review, and testing.
High assurance would clearly increase vendors' development costs and lead to higher product costs. This did not deter the US defense community, which foresaw long-term cost savings. Karger and Schell (1974) repeated an assertion that MLS capabilities could save the US Air Force alone $100,000,000 a year, based on computing costs at that time.
Every MLS device poses a fundamental question: does it really enforce MLS, or does it leak information somehow? The first MLS challenge is to develop a way to answer that question. We can decompose the problem into two more questions:
The first question was answered by the development of security models, like the Bell-LaPadula model summarized earlier. A true security model provides a formal, mathematical representation of MLS information flow restrictions. The formal model makes the enforcement problem clear to non-programmers. It also makes the operating requirement clear to the programmers who implemented the MLS mechanisms.
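The two Bell-LaPadula flow restrictions can be sketched in a few lines. This is only an illustration under simplifying assumptions: the `Level` enumeration and the function names are hypothetical, and the sketch ignores compartments and discretionary controls that a full model would include.

```python
from enum import IntEnum


class Level(IntEnum):
    """Illustrative hierarchy of classification levels (an assumption)."""
    UNCLASSIFIED = 0
    CONFIDENTIAL = 1
    SECRET = 2
    TOP_SECRET = 3


def may_read(subject: Level, obj: Level) -> bool:
    """Simple security property: a subject may not read above its level."""
    return subject >= obj


def may_write(subject: Level, obj: Level) -> bool:
    """*-property: a subject may not write below its level."""
    return subject <= obj
```

Under these rules a Secret process may read Secret data and write into a Top Secret file, but it may not read the Top Secret file, and a Top Secret process may not write into a Secret one.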
To address the evaluation question, designers needed a way to prove that the system's MLS controls indeed work correctly. By the late 1960s, this had become a serious challenge. Software systems had become much too large for anyone to review and validate: in The Mythical Man-Month, Brooks (1975) described the difficulties of building a large-scale software system and reported that IBM had over a thousand people working on its groundbreaking system, OS/360. Project size wasn't the only obstacle to building reliable and secure software: even smaller teams, like the team responsible for the Multics security mechanisms, could not detect and close every vulnerability (Karger and Schell, 1974).
The security community developed two sets of strategies for evaluating MLS systems: strategies for designing a reliable MLS system and strategies to prove the MLS system works correctly. The design strategies emphasized a special structure to ensure uniform enforcement of data access rules, called the reference monitor. The design strategies further required that the designers explicitly identify all system components that played a role in enforcing MLS; those components were defined as being part of the trusted computing base, which included all components that required high assurance.
The strategies for proving correctness relied heavily on formal design specifications and on techniques to analyze those designs. Some of these strategies were a reaction to ongoing quality control problems in the software industry, but others were developed as an attempt to detect covert channels, a largely unresolved weakness in MLS systems.
During the early 1970s, the US Air Force commissioned a study to develop feasible strategies for constructing and verifying MLS systems. The study pulled together significant findings by security researchers at that time into a report, called the Anderson report (1972), which heavily influenced subsequent US government support of MLS systems. A later study (Nibaldi 1979) identified the most promising strategies for trusted system development and proposed a set of criteria for evaluating such systems.
These proposals led to published criteria for developing and evaluating MLS systems called the Trusted Computer System Evaluation Criteria (TCSEC), or "Orange Book" (Department of Defense, 1985a). The US government established a process by which computer system vendors could submit their products for security evaluation. A government organization, the National Computer Security Center (NCSC), evaluated products against the TCSEC and rated the products according to their capabilities and trustworthiness. For a product to achieve the highest rating for trustworthiness, the NCSC needed to verify the correctness of the product's design.
To make design verification feasible, the Anderson report recommended (and the TCSEC required) that MLS systems enforce security through a "reference validation mechanism" that today we call the reference monitor. The reference monitor is the central point that enforces all access permissions. Specifically, a reference monitor must have three features:
Operating system designers had by that time recognized the concept of an operating system kernel: a portion of the system that made unrestricted accesses to the computer's resources so that other components didn't need unrestricted access. Many designers believed that a good kernel should be small for the same reason as a reference monitor: it's easier to build confidence in a small software component than in a large one. This led to the concept of a security kernel: an operating system kernel that incorporated a reference monitor. Layered atop the security kernel would be supporting processes and utility programs to serve the system's users and the administrators. Some non-kernel software would require privileged access to system resources, but none would bypass the security kernel. The combination of the computer hardware, the security kernel, and its privileged components made up the trusted computing base (TCB) - the system components responsible for enforcing MLS restrictions. The TCB was the focus of assurance efforts: if it worked correctly, then the system would correctly enforce the MLS restrictions.
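The idea of a single mediation point can be illustrated with a toy sketch. The class and method names below are hypothetical, levels are plain integers, and Python cannot actually make the monitor tamperproof (the leading underscore is only a convention), so this shows the structure rather than a real security kernel.

```python
class ReferenceMonitor:
    """Toy mediation point: callers never touch stored objects directly."""

    def __init__(self):
        self._objects = {}  # name -> (level, contents)
        self.audit = []     # one entry per access decision

    def create(self, name, level, contents=""):
        self._objects[name] = (level, contents)

    def read(self, subject_level, name):
        level, contents = self._objects[name]
        allowed = subject_level >= level  # no read up
        self.audit.append(("read", subject_level, name, allowed))
        if not allowed:
            raise PermissionError(f"read of {name} denied")
        return contents

    def write(self, subject_level, name, contents):
        level, _ = self._objects[name]
        allowed = subject_level <= level  # no write down
        self.audit.append(("write", subject_level, name, allowed))
        if not allowed:
            raise PermissionError(f"write to {name} denied")
        self._objects[name] = (level, contents)
```

Because every read and write passes through the same two short methods, a reviewer has only a small body of code to examine, which is the argument the text makes for keeping the kernel small.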
The computer industry has always relied primarily on system testing for quality assurance. However, the Anderson report recognized the shortcomings of testing by repeating Dijkstra's observation that tests can only prove the presence of bugs, not their absence. To improve assurance, the report made specific recommendations about how MLS systems should be designed, built, and tested. These recommendations became requirements in the TCSEC, particularly for products intended for the most critical applications:
These activities were not substituted for conventional product development techniques. Instead, these additional tasks were combined with the accepted "best practices" used in conventional computer system development. These practices tended to follow a "waterfall" process (Boehm, 1981; Department of Defense, 1985b): first, the builders develop a requirements specification; from that they develop the top-down design; then they implement the product; and finally they test the product against the requirements. In the idealized process for developing an MLS product, the requirements specification focuses on testable functions and measurable performance capabilities, while the policy model captures security requirements that can't be tested directly. Figure 6 shows how these elements worked together to validate an MLS product's correct operation.
Product development has always been expensive. Many development organizations, especially smaller ones, try to save time and money by skipping the planning and design steps of the waterfall process. The TCSEC did not demand the waterfall process, but its requirements for highly assured systems imposed significant costs on development organizations. Both the Nibaldi study and the TCSEC recognized that not all product developers could afford to achieve the highest levels of assurance. Instead, the evaluation process identified a range of assurance levels that a product could achieve. Products intended for less-critical activities could spend less money on their development process and achieve a lower standard of assurance. Products intended for the most critical applications, however, were expected to meet the highest practical assurance standard.
Shortly after the Anderson report appeared, Lampson (1973) published a note which examined the general problem of keeping information in one program secret from another, a problem at the root of MLS enforcement. Lampson noted that computer systems contain a variety of channels by which two processes might exchange data. In addition to explicit channels like the file system or interprocess communications services, there are covert channels that can also carry data between processes. These channels typically exploit operating system resources shared among all processes. For example, when one process can take exclusive control of a file, it prevents other processes from accessing the file, or when one process uses up all the free space on the hard drive, other processes will "see" this activity.
Since MLS systems could not achieve their fundamental objective (to protect secrets) if covert channels were present, defense security experts developed techniques to detect such channels. The TCSEC required a covert channel analysis of all MLS systems except those achieving the lowest assurance levels.
In general, there are two categories of covert channels: storage channels and timing channels. A storage channel transmits data from a "high" process to a "low" one by writing data to a storage location visible to the "low" one. For example, if a Secret process can see how much memory is left after a Top Secret process allocates some memory, the Top Secret process can send a numeric message by allocating or freeing the amount of memory equal to the message's numeric value. The covert channel consists of setting the contents of a storage location (the size of free memory) to a value by the "high" process that is readable by the "low" one.
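The free-memory example can be simulated in a few lines. This is a minimal sketch with hypothetical names (`MemoryPool`, `transmit`); it assumes the two processes are perfectly synchronized and that nothing else allocates memory in between, which a real channel could not count on.

```python
class MemoryPool:
    """Shared allocator whose free count is visible to every process."""

    def __init__(self, total):
        self.free = total

    def allocate(self, n):
        if n > self.free:
            raise MemoryError("pool exhausted")
        self.free -= n

    def release(self, n):
        self.free += n


def transmit(message: bytes, pool: MemoryPool) -> bytes:
    """Interleave the 'high' sender and 'low' receiver, one byte per round."""
    received = bytearray()
    for byte in message:
        baseline = pool.free                   # low: sample free memory first
        pool.allocate(byte)                    # high: encode byte as an allocation
        received.append(baseline - pool.free)  # low: decode from the difference
        pool.release(byte)                     # high: reset for the next round
    return bytes(received)
```

The receiver never reads the sender's data; it only observes the size of free memory, which is exactly the shared storage location the text describes.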
A timing channel is one in which the "high" process communicates to the "low" one by varying the timing of some detectable event. For example, the Top Secret process might instruct the hard drive to visit particular disk blocks. When the Secret process goes to read data from the hard drive itself, the disk activity by the Top Secret process will cause varying delays in the Secret program when it tries to use the hard drive itself. The Top Secret program can systematically impose delays on the Secret program's disk activities, and thus transmit information through the pattern of those delays. Wray (1991) describes a covert channel based on hard drive access speed, and also uses the example to show how ambiguous the two covert channel categories can be.
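A disk-based timing channel can be sketched with a simulated drive. Everything here is an assumption made for illustration: the `SimulatedDisk` class, the latency model of one time unit per block of head travel, and the block numbers and threshold. Real timing measurements would be noisy, so the sketch uses simulated latencies to stay deterministic.

```python
class SimulatedDisk:
    """Toy drive: latency equals the distance the head must travel."""

    def __init__(self):
        self.head = 0

    def read(self, block):
        latency = abs(block - self.head)
        self.head = block
        return latency


NEAR_BLOCK = 0    # where the "low" process keeps its data
FAR_BLOCK = 100   # block the "high" process visits to signal a 1
THRESHOLD = 50    # latencies above this are decoded as 1


def high_send_bit(disk, bit):
    # Move the head far away for a 1, leave it nearby for a 0.
    disk.read(FAR_BLOCK if bit else NEAR_BLOCK)


def low_receive_bit(disk):
    # The low process reads its own block and times the request.
    latency = disk.read(NEAR_BLOCK)
    return 1 if latency > THRESHOLD else 0


def transmit_bits(bits):
    disk = SimulatedDisk()
    received = []
    for bit in bits:
        high_send_bit(disk, bit)
        received.append(low_receive_bit(disk))
    return received
```

Note that the signal travels through the head position, which could also be viewed as a stored value; this hints at why the boundary between the two channel categories is hard to draw.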
The fundamental strategy for seeking covert channels is to inspect all shared resources in the system, decide whether any could yield an effective covert channel, and measure the bandwidth of whatever covert channels are uncovered. While a casual inspection by a trained analyst may often uncover covert channels, there is no guarantee that a casual inspection will find all such channels. Systematic techniques help increase confidence that the search has been comprehensive. An early technique, the shared resource matrix (Kemmerer, 1983; 2002), can analyze a system from either a formal or informal specification. While the technique can detect covert storage channels, it cannot detect covert timing channels. An alternative approach, noninterference, requires formal policy and design specifications (Haigh and Young, 1987). This technique locates both timing and storage channels by proving theorems to show that processes in the system, as described in the design specification, can't perform detectable ("interfering") actions that are visible to other processes in violation of MLS restrictions.
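The core idea of a shared resource matrix can be sketched as a table of shared attributes against the operations that reference or modify them. The attribute and operation names below are invented for illustration, and the sketch is heavily simplified: it covers only the first screening step, and the published method involves further analysis that is omitted here.

```python
# Rows are shared attributes; columns are operations open to user processes.
# "R" means the operation can reference the attribute, "M" that it can modify it.
MATRIX = {
    "free_memory":   {"allocate": "RM", "release": "M", "query_free": "R"},
    "file_lock":     {"lock_file": "RM", "unlock_file": "M", "open_file": "R"},
    "file_contents": {"write_file": "M", "read_file": "R"},
}

# Attributes assumed to be already covered by the MLS access checks.
MEDIATED = {"file_contents"}


def candidate_storage_channels(matrix, mediated):
    """List unmediated attributes that one operation can modify and another can reference."""
    candidates = []
    for attribute, operations in matrix.items():
        if attribute in mediated:
            continue
        modifiers = sorted(op for op, marks in operations.items() if "M" in marks)
        readers = sorted(op for op, marks in operations.items() if "R" in marks)
        if modifiers and readers:
            candidates.append((attribute, modifiers, readers))
    return candidates


for attribute, modifiers, readers in candidate_storage_channels(MATRIX, MEDIATED):
    print(f"{attribute}: modified by {modifiers}, referenced by {readers}")
```

Each candidate still needs an analyst's judgment about whether a "high" process can really reach a modifying operation while a "low" process reaches a referencing one.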
To be effective at locating covert channels, the design specification must accurately model all resource sharing that is visible to user processes in the system. Typically, the specification focuses its attention on the system functions made available to user processes: system calls to manipulate files, allocate memory, communicate with other processes, and so on. The development program for the LOCK system (Saydjari, Beckman, and Leaman, 1989; Saydjari, 2002), for example, included the development of a formal design specification to support a covert channel analysis. The LOCK design specification identified all system calls, described all inputs and outputs produced by these calls, including error results, and represented the internal mechanisms necessary to support those capabilities. The LOCK team used a form of noninterference to develop proofs that the system enforced MLS correctly (Fine, 1994).
As with any flaw detection technique, there is no way to confirm that all flaws have been found. Techniques that analyze the formal specification will detect all flaws in that specification, but there is no way to conclusively prove that the actual system implements the specification perfectly. Techniques based on less-formal design descriptions are also limited by the quality of those descriptions: if the description omits a feature, there's no way to know if that feature opens a covert channel. At some point there must be a trade-off between the effort spent on searching for covert channels and the effort spent searching for other system flaws.
In practice, system developers have found it almost impossible to eliminate all covert channels. While evaluation criteria encourage developers to eliminate as many covert channels as possible, the criteria also recognize that practical systems will probably include some channels. Instead of eliminating the channels, developers must identify them, measure their possible bandwidth, and provide strategies to reduce their potential for damage. While not all security experts agree that covert channels are inevitable (Proctor and Neumann, 1992), typical MLS products contain covert channels. Thus, even the approved MLS products contain known weaknesses.
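A rough upper bound on a channel's bandwidth follows from how many distinct values one signal can carry and how long each signal takes. The function below is a back-of-the-envelope sketch with a hypothetical name; it assumes a noise-free channel with equal-duration symbols and ignores synchronization overhead, so real channels would deliver less.

```python
import math


def channel_bandwidth(distinct_values: int, seconds_per_symbol: float) -> float:
    """Upper bound in bits per second for a noise-free channel."""
    return math.log2(distinct_values) / seconds_per_symbol
```

For instance, a channel that signals one of two values every hundredth of a second is bounded at about 100 bits per second, while one that signals one of 256 values every half second is bounded at 16 bits per second.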
How does assurance fit into the process of actually deploying a system? In theory, one can plug a computer in and throw the switch without knowing anything about its reliability. In the defense community, however, a responsible officer must approve all critical systems before they can go into operation, especially if they handle classified information. Approval rarely occurs unless the officer receives appropriate assurance that the system will operate correctly. There are three major elements to this approval process in the US defense community:
In military environments, a high-ranking officer, typically an admiral or general, must formally grant approval (accreditation) before a critical system goes into operation. Accreditation shows that the officer believes the system is safe to operate, or at least that the system's risks are outweighed by its benefits. The decision is based on the results of the system's certification: a process in which technical experts analyze and test the system to verify that it meets its security and safety requirements. The certification and accreditation process must meet certain standards (Department of Defense, 1997). Under rare, emergency conditions an officer could accredit a system even if there are problems with the certification.
Certification can be very expensive, especially for MLS systems. Tests and analyses must show that the system is not going to fail in a way that will leak classified information or interfere with the organization's mission. Tests must also show that all security mechanisms and procedures work as specified in the requirements. Certification of a custom-built system often involves design reviews and source code inspections. This work requires a lot of effort and special skills, leading to very high costs.
The product evaluation process heralded by the TCSEC was intended to provide off-the-shelf computing equipment that reliably enforced MLS restrictions. Although organizations could implement, certify, and accredit custom systems enforcing MLS, the certification costs were hard to predict and could overwhelm the project budget. If system developers could use off-the-shelf MLS products, their certification costs and project risks would be far lower. Certifiers could rely on the security features verified during evaluation, instead of having to verify a product's implementation themselves.
Product evaluations assess two major aspects: functionality and assurance. A successful evaluation indicates that the product contains the appropriate functional features and meets the specified level of assurance. The TCSEC defined a range of evaluation levels to reflect increasing levels of compliance with both functional and assurance requirements. Each higher evaluation level either incorporated the requirements of the next lower level, or superseded particular requirements with a stronger requirement. Alphanumeric codes indicated each level, with D being lowest and A1 being highest:
Although the TCSEC defined a whole range of evaluation levels, the government wanted to encourage vendors to develop systems that met the highest levels. In fact, one of the pioneering evaluated products was SCOMP, an A1 system constructed by Honeywell (Fraim, 1983). Very few other vendors pursued an A1 evaluation. High assurance caused high product development costs; one project estimated that the high assurance tasks added 26% to the development effort's labor hours (Smith, 2001). In the fast-paced world of computer product development, that extra effort can cause delays that make the difference between a product's success and its failure.
To date, no commercial computer vendor has offered a genuine "off the shelf" MLS product. A handful of vendors implemented MLS operating systems, but none of these were standard product offerings. All MLS products were expensive, special-purpose systems marketed almost exclusively to military and government customers. Almost all MLS products were evaluated to the B1 level, meeting minimum assurance standards. Thus, the TCSEC program failed on two levels: it failed to persuade vendors to incorporate MLS features into their standard products, and it failed to persuade more than a very few vendors to produce products that met the "A1" requirements for high assurance.
A survey of security product evaluations completed by the end of 1999 (Smith, 2000) noted that only a fraction of security products ever pursued evaluation. Most products pursued "medium assurance" evaluations, which could be sufficient for a minimal (B1) MLS implementation.
TCSEC evaluations were discontinued in 2000. The handful of modern MLS products are evaluated under the Common Criteria (Common Criteria Project Sponsoring Organizations, 1999), a set of evaluation criteria designed to address a broader range of security products.
The most visible failure of MLS technology is its absence from typical desktops. As Microsoft's Windows operating systems came to dominate the desktop in the 1990s, Microsoft made no significant move to implement MLS technology. Versions of Windows have earned a TCSEC C2 evaluation and a more-stringent EAL-4 evaluation under the Common Criteria, but Windows has never incorporated MLS. The closest Microsoft has come to offering MLS technology has been its "Palladium" effort announced in 2002. The technology focused on the problem of digital rights management - restricting the distribution of copyrighted music and video - but the underlying mechanisms caught the interest of many in the MLS community because of potential MLS applications. The technology was slated for incorporation in a future Windows release codenamed "Longhorn," but was dropped from Microsoft's plans in 2004 (Orlowski, 2004).
Arguably several factors have contributed to the failure of the MLS product space. Microsoft demonstrated clearly that there was a giant market for products that omit MLS. Falling computer prices also played a role: sites where users typically work at a couple of different security levels find it cheaper to put two computers on every desktop than to try to deploy MLS products. Finally, the sheer cost and uncertainty of MLS product development undoubtedly discourage many vendors. It is hard to justify the effort to develop a "highly secure" system when it's likely that the system will still have identifiable weaknesses, like covert channels, after all the costly, specialized work is done.