1. SCAP Source Data Stream Collections

The Security Content Automation Protocol (SCAP — pronounced ess-cap) standard [1] contains a family of Extensible Markup Language (XML) [2] vocabularies for representing security content. System administrators use configuration checking and vulnerability scanning software tools that produce and consume SCAP data to secure servers, workstations, networks, and other deployed hardware and applications. United States government agencies and their service providers are required to use SCAP-conforming software products [3] and content [4] for security policy compliance checking and continuous monitoring of information technology assets. The private sector also relies on SCAP products and content for ensuring its information systems are protected. For example, banks and credit card companies use SCAP-conforming security configuration checklists [5] to verify compliance with the Payment Card Industry Data Security Standard [6]. Additionally, some private sector organizations use SCAP's XML vocabularies to develop security content for internal use, or to provide to their customers [7].

Central to the SCAP standard is the source data stream collection data model, an XML schema [8] defined in NIST Special Publication (SP) 800-126 (Technical Specification for the Security Content Automation Protocol) [9]. This schema specifies how to package, into a self-contained entity, the collective input required for an SCAP-conforming software product to assure a system is not overly vulnerable to cyberattack. The schema enables lossless exchange of security content between SCAP-conforming software products, allowing SCAP users to avoid vendor lock-in and to share source data stream collections within their organization or with partners.

But the same SCAP source data stream collection schema that promotes interoperability and shareability among SCAP-conforming software products also makes source data stream collection authoring hard. The reason is that no single schema can meet the needs of both SCAP scanner tool developers and SCAP content authors. Section 3 of this paper discusses use of the Darwin Information Typing Architecture (DITA) standard [10] to define a new source data stream collection XML document type for content authors. This new document type, which specializes DITA's map type, does not replace the SP 800-126 schema. Instead, the new DITA document type serves as an alternative that makes it easier for authors to express a source data stream collection. Section 4 describes SCAP Composer, a software application that implements the new, author-friendly document type and transforms valid instances of it into valid instances of the SP 800-126 schema. SCAP Composer uses the DITA Open Toolkit [11][1], an open source DITA processor. The National Institute of Standards and Technology expects to make SCAP Composer available as open source software in late 2019.

Before delving into the authoring schema and application, it is important to understand why they are needed. Section 2 of this paper provides an overview of the NIST SP 800-126 schema's underlying data model, highlighting its advantages and drawbacks. But first it is instructive to see an example of SCAP in action. Figure 1 illustrates what an SCAP-conforming software product can do with a source data stream collection. The software depicted enables a user to open an SCAP source data stream collection file, view security checklists, and perform configuration and vulnerability scans. The checklists reside in the source data stream collection's checklist component, whose XML content conforms to the Extensible Configuration Checklist Description Format (XCCDF) specification [12]. XCCDF is an XML vocabulary SCAP uses to represent security configuration rules. Clicking the Scan button causes the target system (either the local computer or a remote system) to be checked against the checklist to determine whether the checklist's rules are satisfied. Using this software to scan a remote target requires that an SCAP scanning client be installed on the remote system, and that the network allows secure upload of the SCAP source data stream collection file to that system.

Figure 1

SCAP-conforming scanning and tailoring software.

This software also provides a user interface for selecting a subset of rules from a checklist component, assigning parameters to the subset, and saving the result as a tailoring document. The tailoring document references the original checklist component without modifying the XML resource it contains, and thus promotes reuse of SCAP content.

The preceding discussion of Figure 1 suggests the following important requirements that the SP 800-126 schema must address:

  • Self-containment: A source data stream collection must bundle all information needed to perform a scan on the target, without relying on references to external files or resources. Self-containment facilitates scans of remote targets by ensuring a complete transmission of information between the host system initiating the scan and the target. Also, a self-contained source data stream collection may be digitally signed in its entirety to ensure integrity and trustworthiness.

  • Reversibility: A source data stream collection must bundle its components such that the XML resources the components contain are unmodified from their original states, and any XML resource can be extracted and re-bundled into a new collection without modification to the XML. Reversibility makes it easier to reuse SCAP content such as the checklist shown in Figure 1.

2. The SP 800-126 Schema

Figure 2 provides a high-level example of an SCAP source data stream collection containing two data streams[2] and five components. Each component encapsulates an XML resource conforming to an SCAP vocabulary schema (such as the schema for checklists). Each data stream represents a specific SCAP use case, for example, checking the configuration of a server running Ubuntu Linux version 16.04. Data streams reference components, as shown by the arrows. More than one data stream can reference the same component.

Figure 2

SCAP data stream collection.

Figure 2 does not capture two subtleties of SCAP source data stream collections. The first has to do with the part-whole relationship between a component and the XML resource it encapsulates. An XML resource typically exists outside the scope of a source data stream collection. For example, a checklist contained in a component may have been copied from its original residing in a security checklist repository. Therefore, one can think of a component as a snapshot of an XML resource at a specific point in time. Figure 3 illustrates this idea with two distinct components, each in a distinct source data stream collection, encapsulating the same XML resource. To capture this snapshot notion, the SP 800-126 schema represents a component as a wrapper element, with a time stamp attribute, surrounding a copy of the XML resource.

Figure 3

Components reusing the same XML resource.
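
The snapshot idea, together with the reversibility requirement, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the wrapper element name component, the attribute names id and timestamp, the sample checklist, and the second identifier's org.example creator are all hypothetical stand-ins, not the SP 800-126 schema's actual definitions.

```python
import copy
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def wrap_resource(resource_root, component_id, when):
    """Return a wrapper element holding an unmodified copy of the resource.

    The element and attribute names are assumptions made for illustration.
    """
    component = ET.Element("component", {
        "id": component_id,
        "timestamp": when.isoformat(),
    })
    component.append(copy.deepcopy(resource_root))
    return component


def extract_resource(component):
    """Return a copy of the single XML resource a component encapsulates."""
    (resource,) = list(component)  # exactly one child is expected
    return copy.deepcopy(resource)


# A toy checklist resource that lives outside any collection.
checklist = ET.fromstring('<Benchmark id="example"><Rule id="r1"/></Benchmark>')
snapshot_time = datetime(2019, 1, 1, tzinfo=timezone.utc)

# Two components, destined for two different collections, snapshot the same resource.
comp_a = wrap_resource(checklist, "scap_gov.nist_comp_content-xccdf", snapshot_time)
comp_b = wrap_resource(checklist, "scap_org.example_comp_content-xccdf", snapshot_time)

# Reversibility: what comes out is byte-for-byte what went in.
round_trip_ok = ET.tostring(extract_resource(comp_a)) == ET.tostring(checklist)
```

Because the wrapper never touches the encapsulated XML, the extracted resource can be re-bundled into another collection unchanged, which is the property Figure 3 relies on.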

The second subtlety pertains to component references — shown as directional arrows in Figure 2 and Figure 3. At first glance, it seems that an XLink simple link [13] can easily represent a component reference. But there is a less-than-obvious complication. A data stream collection might have two components, with one component's XML resource referencing the other component's XML resource. In fact, it is common in SCAP for a source data stream to have both a component containing a checklist resource and a component containing a check resource described in the Open Vulnerability Assessment Language (OVAL). OVAL [14] is an XML vocabulary for representing system configuration information, tests and states. An XCCDF checklist rule typically references check definitions in an OVAL resource that are used to determine if the current state of a system satisfies the rule criteria. XCCDF checklist rules and OVAL definitions together usually make up most of the XML data in a source data stream collection[3].

Figure 4 shows a source data stream collection containing checklist and check components. The directional arrow labeled href indicates a reference from within a checklist rule to a check definition (each represented by a small square inside the XML resource). The problem is that the act of encapsulating the checklist and check resources into components in a source data stream collection breaks the internal references from checklist rules to check definitions. As an example, consider the following reference to a check definition from within a checklist rule:

<check-content-ref href="oval.xml"
                   name="oval:nist.validation.family:def:1"/>

The referenced check definition's identifier is oval:nist.validation.family:def:1, and the definition is in a check resource whose Uniform Resource Identifier (URI), relative to the checklist resource URI, is oval.xml. This relative URI reference is useless for an SCAP-conforming software product consuming the source data stream shown in Figure 4. What the SCAP-conforming software product needs to know is where to find the check component contained inside the source data stream collection, not the check resource outside the source data stream collection's scope. However, SCAP's reversibility requirement dictates that modifying the encapsulated XML resource inside a component is not allowed.

Figure 4

Checklist resource with URI reference to a check resource.

The SP 800-126 schema solves this problem by requiring that a data stream referencing a component with internal references to a location inside another component include a mapping. The mapping enables the source data stream collection consumer to translate internal references within the encapsulated XML resource to the corresponding component location within the source data stream collection. Figure 4 shows the mapping as dotted lines indicating a pair of URI references. The first URI reference is the relative URI reference inside the checklist rule. The second is the reference to the check component in the source data stream collection.

The SP 800-126 schema expresses the mapping using XML Catalogs standard [15] syntax. Figure 5 shows a possible XML representation of the reference to the checklist component shown in Figure 4. The component-ref element has an XLink simple link pointing to the component, whose identifier is scap_gov.nist_comp_content-xccdf. An embedded XML Catalogs uri element tells the source data stream collection consumer to translate internal references to oval.xml from within the checklist resource to references to the URI of the data stream component reference that points to the check component, #scap_gov.nist_cref_content-oval.

Figure 5

<component-ref id="scap_gov.nist_cref_content-xccdf"
   xlink:href="#scap_gov.nist_comp_content-xccdf">
   <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <uri name="oval.xml" 
           uri="#scap_gov.nist_cref_content-oval"/>
   </catalog>
</component-ref>

Component reference with a mapping from an internal URI reference to the corresponding component reference.
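
How a consumer might apply the mapping in Figure 5 can be sketched as follows. This is a minimal illustration, not the behavior of any particular SCAP product: the function names are hypothetical, the xlink namespace declaration is added only so the fragment parses on its own, and returning None for an unmapped reference is an assumption.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Figure 5, with an xlink namespace declaration added so it parses standalone.
COMPONENT_REF = """
<component-ref xmlns:xlink="http://www.w3.org/1999/xlink"
   id="scap_gov.nist_cref_content-xccdf"
   xlink:href="#scap_gov.nist_comp_content-xccdf">
   <catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
      <uri name="oval.xml"
           uri="#scap_gov.nist_cref_content-oval"/>
   </catalog>
</component-ref>
"""


def catalog_mapping(component_ref):
    """Collect the name -> uri pairs from the embedded XML Catalogs element."""
    return {
        entry.get("name"): entry.get("uri")
        for entry in component_ref.iter("{%s}uri" % CATALOG_NS)
    }


def translate(component_ref, internal_href):
    """Translate a reference found inside the encapsulated resource.

    Returns the component reference URI, or None when no mapping exists
    (the None fallback is an assumption made for this sketch).
    """
    return catalog_mapping(component_ref).get(internal_href)


cref = ET.fromstring(COMPONENT_REF.strip())
target_component = cref.get("{%s}href" % XLINK_NS)
translated = translate(cref, "oval.xml")
```

Here the checklist resource still says oval.xml, and only the lookup table held by the component reference knows that oval.xml means #scap_gov.nist_cref_content-oval inside this collection.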

The long, verbose identifiers in Figure 5 are the result of an SP 800-126 requirement that data stream collections, data streams, components and component references have globally unique identifiers (GUIDs). To this end, the schema requires the identifier format conventions shown in Table I. An identifier must be underscore-delimited, beginning with scap, followed by a reverse domain name system (DNS) style substring associated with the creator, followed by a substring indicating the object type being identified (collection, datastream, cref or comp), and ending with an XML NCName [16]. For example, a data stream containing the component reference shown in Figure 5 could have scap_gov.nist_datastream_example as its identifier. By requiring GUIDs, SP 800-126 reduces the likelihood of conflicting identifiers within a source data stream collection or identifiers that conflict with those in another organization's source data stream collection.

Table I

SCAP GUID format convention.

Object | Identifier Format Convention
Data Stream Collection | scap_reverseDNS_collection_name
Data Stream | scap_reverseDNS_datastream_name
Component Reference | scap_reverseDNS_cref_name
Component | scap_reverseDNS_comp_name
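
The convention in Table I can be expressed as a small builder and parser. The sketch below is illustrative only: the function names are hypothetical, the reverse DNS part is assumed to contain no underscores, and the name pattern is a simplified ASCII-only approximation of an XML NCName rather than the full production.

```python
import re

OBJECT_TYPES = ("collection", "datastream", "cref", "comp")

# Assumptions: no underscore in the reverse DNS part; ASCII-only NCName subset.
_TYPES = "|".join(OBJECT_TYPES)
_NAME = r"[A-Za-z_][A-Za-z0-9._-]*"
GUID_RE = re.compile(
    rf"scap_(?P<reverse_dns>[^_]+)_(?P<type>{_TYPES})_(?P<name>{_NAME})"
)


def make_guid(reverse_dns, object_type, name):
    """Assemble an identifier following the Table I convention."""
    guid = f"scap_{reverse_dns}_{object_type}_{name}"
    if GUID_RE.fullmatch(guid) is None:
        raise ValueError(f"not a valid SCAP identifier: {guid!r}")
    return guid


def parse_guid(guid):
    """Split an identifier into its reverse DNS, type, and name parts."""
    match = GUID_RE.fullmatch(guid)
    if match is None:
        raise ValueError(f"not a valid SCAP identifier: {guid!r}")
    return match.groupdict()


datastream_id = make_guid("gov.nist", "datastream", "example")
cref_parts = parse_guid("scap_gov.nist_cref_content-xccdf")
```

Only the reverse DNS substring and the name vary per object, so an authoring format can ask for each of them once and derive every identifier mechanically.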

To recap, the SP 800-126 self-containment and reversibility requirements, although important for component integration and interoperability, result in added schema complexity and added pain for source data stream collection authors. The GUID format minimizes the possibility of ambiguous or dangling references, but it results in verbose, repetitive, and author-unfriendly identifiers. The SP 800-126 schema's underlying data model, hampered by XML's inability to intuitively express part-whole relationships between a component and its encapsulated XML resource, represents component references as first-class objects using XML Catalogs to handle translation of resource-to-resource references from within the source data stream collection. This added complexity makes authoring a chore and impedes human readability of source data stream collections. Appendix A shows the source data stream collection XML that was read into the SCAP scanner software application shown in Figure 1. Figure 6 shows a less verbose, easier-to-read, and more author-friendly equivalent representation of this source data stream collection using the new DITA type discussed in Section 3.

3. The Source Data Stream Collection DITA Type

This section describes a new DITA element type for source data stream collections. The source data stream collection element type design arose from a conceptual implementation-agnostic information model developed to gain a deeper understanding of source data stream collections and how they achieve self-containment and reversibility. Reference [17] discusses the conceptual model and how it paved the way for an earlier DITA application. This paper provides a more implementation-focused perspective, with expanded discussions of DITA specialization in this section and DITA-OT implementation in Section 4.

DITA, developed by the Organization for the Advancement of Structured Information Standards (OASIS), is a standardized XML-based architecture for authoring, managing, reusing and transforming technical content [10]. DITA does not specify a schema or set of schemas per se. Instead, the DITA architecture provides:

  • A set of architectural building blocks for forming XML vocabularies called element types. These building blocks provide a variety of useful features for enabling content management and reuse.

  • Two basic element types created from the building blocks: the topic and the map. A topic is a chunk of information. A map is a structured collection of references to topics, other maps, and non-DITA resources such as XML documents created using non-DITA vocabularies.

  • Rules for creating new element types that inherit their processing semantics from existing DITA element types. A new element type is a specialization of the element type from which it inherits. The architecture provides rules for creation of specialized element and attribute domains, which are then collectively used to define specialized element types. DITA vocabulary developers may also specify constraints on a vocabulary's elements and attributes. Element and attribute domain specializations can be reused to create additional new element types, and thus promote modular and interoperable DITA development. Any new element type must be a specialization of an existing DITA vocabulary. Specialization makes DITA unique among other XML technologies in that implementations of specialized element types automatically inherit the functionality of DITA-conforming implementations of the element type specialized [18].

  • Rules for creating an XML schema for authoring documents conforming to a DITA element type. This schema is called a document type shell. The document type shell is used only for authoring. DITA processing is determined by the DITA document's architectural attributes, not the document type shell used to author it.

The burden of following the rules for creating specializations and document type shells is a downside of DITA. But this burden falls mostly on developers of specializations. Content authors are not exposed to this burden. A DITA document's architectural attributes are defined by the document type shell using default values, hiding them — and their complexity — from content authors. More importantly, software application developers implementing a DITA specialization gain the benefit of inherited functionality, resulting in reduced implementation cost [18].

The source data stream collection element type is a specialization of DITA's map element type. The map element type was chosen for specialization because SCAP source data stream collections and data streams are map-like in nature. Like a DITA map, a source data stream collection is essentially a structured collection of components and references to components. Each source data stream collection element inherits from one of the following elements from the DITA map element type:

  • map: A DITA map's top-level element.

  • topicref: References a topic or external (non-DITA) resource. Can also aggregate groups of nested topicref elements.

  • keydef: Creates an alias for a file path or short piece of text.

Most of the attributes defined in the source data stream collection element type inherit from @props, a DITA attribute from which new metadata attributes can be specialized.

Table II lists all the XML elements and attributes in the source data stream collection element type's document type shell. The leftmost column contains the element names. The second column from the left specifies the DITA map element (either map, topicref, or keydef) from which the source data stream collection element inherits. The third column from the left shows the element's content model. *, + and ? indicate zero or more occurrences, one or more occurrences, and optional, respectively. The rightmost column specifies each element's attributes.

Table II

Source data stream collection DITA document type.

Element | Inherits From | Content Model | Attributes
scapDataStreamCollection | map | title?, scapComponent+, scapDataStream+ | reverseDNS, scapName, schematronVersion
scapComponent | keydef | EMPTY | keys, href, scope?
scapDataStream | topicref | scapDictionaries?, scapChecklists?, scapChecks | scapName, scapVersion, useCase
scapDictionaries | topicref | scapCpeListRef+ | NONE
scapChecklists | topicref | scapBenchmarkRef+, scapTailoringRef+ | NONE
scapChecks | topicref | scapOvalRef+, scapOcilRef* | NONE
scapCpeListRef, scapBenchmarkRef, scapTailoringRef, scapOvalRef, scapOcilRef | topicref | scapExternalLinks? | keyref
scapExternalLinks | topicref | scapUri+ | NONE
scapUri | topicref | EMPTY | keyref, localUri?
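
To make the content model notation in Table II concrete, the sketch below turns a few of the table's content models into regular expressions and checks sequences of child element names against them. It is a reading aid only: the function names are hypothetical, and real validation would be done by a schema-aware parser against the document type shell, not by code like this.

```python
import re

# Content models copied from Table II (a subset, for illustration).
CONTENT_MODELS = {
    "scapDataStreamCollection": "title?, scapComponent+, scapDataStream+",
    "scapDataStream": "scapDictionaries?, scapChecklists?, scapChecks",
    "scapChecks": "scapOvalRef+, scapOcilRef*",
}


def model_to_regex(model):
    """Translate 'a?, b+, c' into a regex over comma-terminated child names."""
    parts = []
    for particle in model.split(","):
        particle = particle.strip()
        occurrence = particle[-1] if particle[-1] in "*+?" else ""
        name = particle.rstrip("*+?")
        parts.append(f"(?:{re.escape(name)},){occurrence}")
    return re.compile("".join(parts))


def children_match(parent, child_names):
    """Return True when the ordered child names satisfy the parent's model."""
    pattern = model_to_regex(CONTENT_MODELS[parent])
    sequence = "".join(name + "," for name in child_names)
    return pattern.fullmatch(sequence) is not None


only_checks = children_match("scapDataStream", ["scapChecks"])
missing_checks = children_match("scapDataStream", ["scapChecklists"])
wrong_order = children_match("scapDataStream", ["scapChecks", "scapChecklists"])
```

With these models, a data stream holding only scapChecks is accepted, while one that omits scapChecks or places it before scapChecklists is rejected.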

scapDataStreamCollection, the source data stream collection's root element, has three attributes. @reverseDNS provides the reverse DNS portion for all GUIDs in the collection. @scapName provides the name portion of the collection's GUID. @schematronVersion specifies which version of the SCAP Requirements Schematron [19] schema to use for validating that a transformation (discussed in Section 4) of the DITA map conforms to SP 800-126 requirements.

scapComponent represents a component. @keys, @href, and @scope are attributes defined in the DITA standard. In DITA, @keys provides a list of key names, but the source data stream collection element type further constrains it to represent a single key name. This succinct name may be used elsewhere in the DITA map in place of the XML resource URI (represented by @href) that the component encapsulates. This attribute enables source data stream collection authors to specify the URI in just one place and use the key name elsewhere in the DITA map. The optional @scope attribute specifies whether @href points to a local resource (the default) or an external resource on the Internet.

scapDataStream represents a source data stream. @scapName provides the name portion of the data stream's GUID. @scapVersion and @useCase correspond to required attributes in the SP 800-126 schema specifying the version of the SCAP standard to which the data stream content should conform, and the data stream's use case.

scapDictionaries, scapChecklists, and scapChecks aggregate groups of dictionary component references, checklist component references, and check component references, respectively.

scapCpeListRef, scapBenchmarkRef, scapTailoringRef, scapOvalRef, and scapOcilRef are all component references. @keyref, a DITA attribute, enables the author to specify the component being referenced using the short name corresponding to the scapComponent element's @keys value, saving authors the trouble of having to type the same URI multiple times, and minimizing the number of DITA map revisions needed if the URI changes. scapCpeListRef references a dictionary component that assigns identifiers to platforms (hardware, operating system, or software application) using SCAP's Common Platform Enumeration (CPE) nomenclature. These identifiers are typically used in checklist and check components for checking the presence of a device, operating system, or software product on the target system. scapBenchmarkRef, scapTailoringRef, scapOvalRef, and scapOcilRef reference a benchmark component, tailoring component, check component containing an OVAL check resource, and check component containing an Open Checklist Interactive Language (OCIL) check resource, respectively. OCIL is used for checking state via a human-oriented collection of information when OVAL-based methods are not feasible.

scapExternalLinks and scapUri together represent the mapping (shown in Figure 4) translating internal references within an encapsulated XML resource to the corresponding component location within the source data stream collection. @keyref is used to specify the component whose encapsulated XML resource is being referenced from within the component referenced by the parent element of scapExternalLinks. The optional @localUri is for overriding the URI obtained when DITA processing resolves the key reference specified in @keyref. This is needed when the referencing and referenced XML resources are in different local directories, or when one is external and the other is local.

Figure 6 lists a DITA map representing the data stream collection shown in Appendix A. This data stream collection contains one data stream and four components referenced by the data stream. The dictionary component and the check component referenced by the dictionary component encapsulate dictionary and check resources in the same directory as the DITA map. The benchmark component encapsulates a checklist resource in a subdirectory. Another check component encapsulates a check resource located externally (and has the value external for @scope). Because the dictionary and checklist resources both reference check resources, the dictionary component reference and the benchmark component reference each contain scapExternalLinks and scapUri elements.

Figure 6

<scapDataStreamCollection reverseDNS="gov.nist" scapName="example" 
    schematronVersion="1.3">
    <scapComponent href="checklist-content/xccdf.xml" keys="content-xccdf"/>
    <scapComponent scope="external" keys="content-oval"
        href="https://raw.githubusercontent.com/usnistgov/sctools/master/dita/examples/nist-example/checklist-content/oval.xml"/>
    <scapComponent href="cpe-oval.xml" keys="content-cpe-oval"/>
    <scapComponent href="cpe-dictionary.xml" keys="content-cpe-dictionary"/>
    <scapDataStream scapName="example" scapVersion="1.3" 
                    useCase="CONFIGURATION">
        <scapDictionaries>
            <scapCpeListRef keyref="content-cpe-dictionary">
                <scapExternalLinks>
                    <scapUri localUri="cpe-oval.xml" 
                             keyref="content-cpe-oval"/>
                </scapExternalLinks>
            </scapCpeListRef>
        </scapDictionaries>
        <scapChecklists>
            <scapBenchmarkRef keyref="content-xccdf">
                <scapExternalLinks>
                    <scapUri localUri="oval.xml" keyref="content-oval"/>
                </scapExternalLinks>
            </scapBenchmarkRef>
        </scapChecklists>
        <scapChecks>
            <scapOvalRef keyref="content-oval"/>
            <scapOvalRef keyref="content-cpe-oval"/>
        </scapChecks>
    </scapDataStream>
</scapDataStreamCollection>

Source data stream collection DITA map representing the XML from Appendix A.
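
The following Python sketch shows how the short names in Figure 6 could expand into the verbose SP 800-126 identifiers and catalog entries. It is not SCAP Composer's code (which is XSLT running on DITA-OT). The function resolve_map is hypothetical, and the sketch assumes that each key name is reused as the name portion of the component and component reference GUIDs, which matches Figure 5 but is not confirmed here.

```python
import xml.etree.ElementTree as ET

# The DITA map from Figure 6.
DITA_MAP = """
<scapDataStreamCollection reverseDNS="gov.nist" scapName="example"
    schematronVersion="1.3">
    <scapComponent href="checklist-content/xccdf.xml" keys="content-xccdf"/>
    <scapComponent scope="external" keys="content-oval"
        href="https://raw.githubusercontent.com/usnistgov/sctools/master/dita/examples/nist-example/checklist-content/oval.xml"/>
    <scapComponent href="cpe-oval.xml" keys="content-cpe-oval"/>
    <scapComponent href="cpe-dictionary.xml" keys="content-cpe-dictionary"/>
    <scapDataStream scapName="example" scapVersion="1.3"
                    useCase="CONFIGURATION">
        <scapDictionaries>
            <scapCpeListRef keyref="content-cpe-dictionary">
                <scapExternalLinks>
                    <scapUri localUri="cpe-oval.xml"
                             keyref="content-cpe-oval"/>
                </scapExternalLinks>
            </scapCpeListRef>
        </scapDictionaries>
        <scapChecklists>
            <scapBenchmarkRef keyref="content-xccdf">
                <scapExternalLinks>
                    <scapUri localUri="oval.xml" keyref="content-oval"/>
                </scapExternalLinks>
            </scapBenchmarkRef>
        </scapChecklists>
        <scapChecks>
            <scapOvalRef keyref="content-oval"/>
            <scapOvalRef keyref="content-cpe-oval"/>
        </scapChecks>
    </scapDataStream>
</scapDataStreamCollection>
"""

REF_ELEMENTS = {"scapCpeListRef", "scapBenchmarkRef", "scapTailoringRef",
                "scapOvalRef", "scapOcilRef"}


def resolve_map(root):
    """Expand key references into GUIDs and catalog mappings (illustrative)."""
    reverse_dns = root.get("reverseDNS")
    # Key definitions: short key name -> URI of the resource to encapsulate.
    keys = {c.get("keys"): c.get("href") for c in root.iter("scapComponent")}

    def guid(kind, name):
        # Assumption: the key name doubles as the GUID's name portion.
        return f"scap_{reverse_dns}_{kind}_{name}"

    refs = []
    for ref in root.iter():
        if ref.tag not in REF_ELEMENTS:
            continue
        key = ref.get("keyref")
        catalog = {}
        for uri in ref.iter("scapUri"):
            target = uri.get("keyref")
            # @localUri overrides the URI obtained by resolving the key.
            local = uri.get("localUri") or keys[target]
            catalog[local] = "#" + guid("cref", target)
        refs.append({"id": guid("cref", key), "href": "#" + guid("comp", key),
                     "source": keys[key], "catalog": catalog})
    return refs


component_refs = resolve_map(ET.fromstring(DITA_MAP.strip()))
```

Under these assumptions, the entry generated for scapBenchmarkRef carries the same identifier, link target, and oval.xml mapping as the hand-written component-ref in Figure 5.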

4. SCAP Composer DITA Open Toolkit Implementation

The DITA Open Toolkit (DITA-OT) [11] meets the DITA standard's requirements for a specialization-aware, output-producing DITA processor. As such, DITA-OT merges topics referenced in a map and resolves key references, eliminating the need for custom transformation code to perform these functions. Because it is specialization-aware, DITA-OT inherits the processing behavior for elements in the source data stream collection element type from their supertypes. Thus, scapComponent inherits keydef's processing behavior, scapDataStreamCollection inherits map's processing behavior, and the elements based on topicref inherit topicref's processing behavior. As a result, DITA-OT built-in functionality greatly reduced the coding effort required to implement SCAP Composer.

DITA-OT has a modular architecture with an extensible plug-in mechanism. SCAP Composer consists of two DITA-OT plug-ins:

  • A document type plug-in implementing the source data stream collection element type discussed in Section 3. The source data stream collection document type shell is defined using RELAX NG compact syntax [20] with annotations to support default attributes [21].

  • A transformation plug-in, requiring the document type plug-in, that converts a source data stream collection DITA map into an SCAP source data stream collection XML document. The transformation plug-in uses the NIST SCAP Content Validation Tool [22], also known as SCAPVal[4], to check conformance of individual XML resources and the converted source data stream to SP 800-126 requirements.

SCAP Composer can be deployed with any XML authoring software product that uses version 3 or higher of DITA-OT. Alternatively, SCAP Composer may be deployed with a self-installed DITA-OT, where authoring is done with a non-DITA-aware XML editor or text editor. SCAP Composer has been successfully integrated into a commercially available XML authoring software product with a built-in DITA-OT. SCAP Composer has also been successfully deployed using the free GNU Emacs text editor, which includes an nxml mode for authoring and validating XML documents against a RELAX NG schema, and a standalone DITA-OT installation. Both deployment options were tested with numerous source data stream collection DITA maps, including the source data stream collection DITA map in Figure 6.

SCAP Composer's source code consists principally of the following:

  • RELAX NG compact syntax definitions for the specialized element and attribute domains, constraints, document type shell, and public and system identifiers needed for the source data stream collection specialized map type. The document type plug-in contains these definitions.

  • Extensible Stylesheet Language Transformations (XSLT) [23] code that extracts data from the source data stream collection DITA map and generates SCAP output. This code is part of the transformation plug-in. The code leverages DITA-OT built-in transformation logic and therefore only needs to perform tasks beyond basic DITA processing, such as transforming the scapUri element and extracting values of attributes needed for SCAPval.

  • An Ant script for managing the various transformation steps. Ant [24] is a tool for declaring a sequence of build actions in XML. DITA-OT provides an Ant script declaring a default sequence of extensible transformation steps. Every plug-in has its own Ant script, which may use, extend, or skip steps in the default Ant script.

The flowchart in Figure 7 illustrates the workflow defined by the transformation plug-in's Ant script. This workflow begins when DITA-OT is invoked with sds (short for source data stream collection) as the output format argument. The plug-in accepts two additional optional arguments:

  • sds.scapval: The path to the SCAPval Java Archive (JAR) file.

  • sds.componentkey: The @keys attribute value of one of the input DITA map's scapComponent elements.

Figure 7

SCAP Composer transformation plug-in processing flow.

Processing begins with a preprocessing stage common to all transformation plug-ins. The preprocessing includes the merging and key reference resolution operations mentioned earlier. If sds.componentkey is specified, SCAPval validates the XML resource pointed to by the key's scapComponent element, and a validation report is produced as output (assuming sds.scapval is also specified; if not, processing stops with an error message). If sds.componentkey is not specified, the plug-in generates a single file in SP 800-126 schema format from the DITA map and XML resources. If sds.scapval is specified, the plug-in then validates the generated file and produces a validation report.
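
The branching just described can be summarized as a small decision function. This is a paraphrase of the prose in Python, not the plug-in's Ant script: the function name and the step labels are hypothetical, and only the two argument names come from the text.

```python
def plan_steps(scapval=None, componentkey=None):
    """Return the ordered processing steps for a given argument combination.

    scapval and componentkey stand for the sds.scapval and sds.componentkey
    arguments; the step labels are illustrative names, not Ant target names.
    """
    steps = ["preprocess"]  # merging and key reference resolution
    if componentkey is not None:
        if scapval is None:
            # Validating a single component is impossible without SCAPval.
            raise ValueError("sds.componentkey requires sds.scapval")
        steps += ["validate-component", "write-validation-report"]
        return steps
    steps.append("generate-collection")
    if scapval is not None:
        steps += ["validate-collection", "write-validation-report"]
    return steps


transform_only = plan_steps()
transform_and_validate = plan_steps(scapval="scapval-1.3.2.jar")
component_check = plan_steps(scapval="scapval-1.3.2.jar",
                             componentkey="content-xccdf")
```

The three calls correspond to the three successful paths through Figure 7. Passing a component key without a SCAPval path is the one combination that stops with an error.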

Suppose the source data stream collection DITA map file name is nist-example-hybrid.ditamap, and DITA-OT is invoked as follows:

dita -i nist-example-hybrid.ditamap -f sds --sds.scapval=scapval-1.3.2.jar

SCAP Composer will produce two outputs: the transformation result shown in Appendix A and a SCAPval-generated validation report of the transformation result. Figure 8 shows the beginning summary section of the validation report. The rest of the report provides detailed results for each validation requirement tested. Reference [22] provides more information regarding SCAP validation requirements, their associated test procedures, and which test procedures include validation using SCAPval.

Figure 8

SCAPval validation report summary.

5. Discussion

SCAP Composer takes an incremental approach to aiding SCAP content authors. This is both a limitation and a strength. The limitation is that SCAP Composer only helps with creating source data stream collections. It does not offer any help with creating the XML resources encapsulated in a source data stream collection. Checklist and check resources are large, highly complex, and hard to create using conventional XML editing software applications. Reference [25] explores the feasibility of developing DITA element types for representing checklist rules and profiles (collections of rules), but more implementation and testing are needed to scale the proof-of-concept demonstration to real-world rule sets.

SCAP Composer's incrementalism is a strength in that its limited scope makes it easy both to deploy and integrate with other SCAP content development aids. SCAP Composer has no software dependencies other than DITA-OT, which runs in all common operating systems. SCAP Composer is not tethered to a larger SCAP software product or content repository infrastructure. This flexibility enables SCAP Composer to contribute to a larger authoring and content management solution by providing the piece responsible for creating source data stream collections, leaving it up to other mechanisms to produce and manage the XML resources to be encapsulated.

DITA-OT's extensible and customizable transformation workflow offers many possibilities for combining DITA processing with other capabilities. SCAP Composer exploits this to combine transformation from DITA to SP 800-126 XML with SCAPval validation and report generation. DITA-OT flexibility even allows a workflow with no DITA processing at all, as demonstrated by a DITA-OT plug-in from Jason Fox [26] that automatically displays a random cat picture or XKCD comic strip as a splash screen while waiting for another plug-in's transformation to complete.

Although Fox's plug-in sounds frivolous, the same underlying idea can be practical in the context of SCAP Composer. For example, a future version of SCAP Composer could be supplemented with a decompose plug-in whose input includes an SP 800-126 schema-conforming source data stream collection, and whose output is the set of XML resources encapsulated by the components contained in the collection. Such a plug-in would perform no DITA processing yet would add useful and complementary functionality.
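The core of such a decompose step can be sketched in a few lines of Python. This is only an illustration of the idea, not part of SCAP Composer or DITA-OT: the function name decompose and the abbreviated SAMPLE collection are hypothetical, and the sketch assumes that each sds:component element wraps a single resource element, as in Appendix A.

```python
import xml.etree.ElementTree as ET

SDS_NS = "http://scap.nist.gov/schema/scap/source/1.2"

# Abbreviated, hypothetical stand-in for the Appendix A collection: two
# components whose resources are empty placeholder elements.
SAMPLE = """\
<sds:data-stream-collection
    xmlns:sds="http://scap.nist.gov/schema/scap/source/1.2"
    id="scap_gov.nist_collection_example" schematron-version="1.3">
  <sds:component id="scap_gov.nist_comp_content-xccdf"
                 timestamp="2019-04-05T20:57:48.814-04:00">
    <Benchmark xmlns="http://checklists.nist.gov/xccdf/1.2"/>
  </sds:component>
  <sds:component id="scap_gov.nist_comp_content-cpe-dictionary"
                 timestamp="2019-04-05T20:57:48.814-04:00">
    <cpe-list xmlns="http://cpe.mitre.org/dictionary/2.0"/>
  </sds:component>
</sds:data-stream-collection>
"""


def decompose(collection_xml):
    """Map each component id to the serialized XML resource it encapsulates."""
    root = ET.fromstring(collection_xml)
    resources = {}
    for component in root.findall(f"{{{SDS_NS}}}component"):
        children = list(component)
        # Assumption: one resource element per component, as in Appendix A.
        if len(children) != 1:
            raise ValueError(
                f"component {component.get('id')!r} does not wrap exactly one resource"
            )
        # strip() drops the whitespace that followed the resource element.
        resources[component.get("id")] = ET.tostring(
            children[0], encoding="unicode"
        ).strip()
    return resources


if __name__ == "__main__":
    for component_id, resource in decompose(SAMPLE).items():
        print(component_id, "->", resource)
```

A real plug-in would additionally write each resource to its own file and would likely use the catalog entries in the data stream to choose file names. Both are omitted here.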

Software products that produce and manage SCAP content are part of the broader research and development goal of modeling cybersecurity compliance requirements in a manner that enables them to be structured, organized, executed, and reused efficiently [27]. SCAP Composer is a small contribution to this goal. Other efforts, such as the Compliance as Code open source project [28], are more ambitious. Compliance as Code contains a collection of compliance rules written in a YAML [29] format, OVAL XML fragments, and code fragments for automated remediation of compliance issues. A collection of build scripts generates SCAP source data stream collections and remediation scripts from the fragments. Another effort, ConfigValidator [30], is a system that checks a variety of targets, including running containers and cloud-based environments, for compliance with configuration rules written in a YAML-based declarative language.

The Compliance as Code and ConfigValidator projects have broader scopes and implement far more capabilities than SCAP Composer. However, both rely on one-off collections of scripts for processing compliance rules written in non-XML formats. Lack of standardization is a barrier to reusing the Compliance as Code and ConfigValidator technology in other projects. Also, both projects reject the use of XML for authoring compliance rules, claiming XML is hard for humans to edit. But, as SCAP Composer shows, XML is not the problem. The real problem is that, as discussed in Section 2, no single schema can meet all implementation requirements.

6. Conclusion

This paper describes SCAP Composer, a novel software application for creating and validating SCAP source data stream collections. What makes SCAP Composer unique is its use of DITA specialization and the DITA Open Toolkit to simplify the reuse of SCAP content while adhering to self-containment and reversibility requirements. Integration with SCAPval adds the ability for users to conveniently check source data stream collections and their components for conformance to SP 800-126. Other efforts to facilitate authoring and reuse of SCAP content rely on ad hoc authoring formats and extract/transform/load workflows, making them difficult to maintain or deploy in new projects. SCAP Composer is based on DITA, a robust architecture designed specifically for authoring and organizing topic-oriented information. Because it is standards-based and has a small footprint and scope, SCAP Composer is easy to integrate into a variety of SCAP authoring and deployment scenarios.

Note

The author thanks his colleagues at the National Institute of Standards and Technology and the anonymous Balisage reviewers for helpful comments and feedback on earlier versions of this paper.

Appendix A. SCAP-conforming Source Data Stream Collection

This appendix lists the SCAP-conforming XML read into the scanner software shown in Figure 1. To save space, the XML resource markup is not shown.

<sds:data-stream-collection 
   xmlns:sds="http://scap.nist.gov/schema/scap/source/1.2"
   xmlns:cat="urn:oasis:names:tc:entity:xmlns:xml:catalog"
   xmlns:xlink="http://www.w3.org/1999/xlink"
   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   id="scap_gov.nist_collection_example"
   schematron-version="1.3">
   <sds:data-stream id="scap_gov.nist_datastream_example"
                    scap-version="1.3"
                    timestamp="2019-04-05T20:57:48.814-04:00"
                    use-case="CONFIGURATION">
      <sds:dictionaries>
         <sds:component-ref 
            id="scap_gov.nist_cref_content-cpe-dictionary"
            xlink:href="#scap_gov.nist_comp_content-cpe-dictionary">
            <cat:catalog>
               <cat:uri name="cpe-oval.xml" 
                        uri="#scap_gov.nist_cref_content-cpe-oval"/>
            </cat:catalog>
         </sds:component-ref>
      </sds:dictionaries>
      <sds:checklists>
         <sds:component-ref id="scap_gov.nist_cref_content-xccdf"
                            xlink:href="#scap_gov.nist_comp_content-xccdf">
            <cat:catalog>
               <cat:uri name="oval.xml" 
                        uri="#scap_gov.nist_cref_content-oval"/>
            </cat:catalog>
         </sds:component-ref>
      </sds:checklists>
      <sds:checks>
         <sds:component-ref 
            id="scap_gov.nist_cref_content-oval"
            xlink:href="#scap_gov.nist_comp_content-oval"/>
         <sds:component-ref 
            id="scap_gov.nist_cref_content-cpe-oval"
            xlink:href="#scap_gov.nist_comp_content-cpe-oval"/>
      </sds:checks>
   </sds:data-stream>
   <sds:component id="scap_gov.nist_comp_content-xccdf"
                  timestamp="2019-04-05T20:57:48.814-04:00">
      <Benchmark xmlns="http://checklists.nist.gov/xccdf/1.2" ...>
  ...</Benchmark>
   </sds:component>
   <sds:component id="scap_gov.nist_comp_content-oval"
                  timestamp="2019-04-05T20:57:48.814-04:00">
      <oval_definitions 
         xmlns="http://oval.mitre.org/XMLSchema/oval-definitions-5" ...>
   ...</oval_definitions>
   </sds:component>
   <sds:component id="scap_gov.nist_comp_content-cpe-oval"
                  timestamp="2019-04-05T20:57:48.814-04:00">
      <oval_definitions 
         xmlns="http://oval.mitre.org/XMLSchema/oval-definitions-5" ...>
   ...</oval_definitions>
   </sds:component>
   <sds:component id="scap_gov.nist_comp_content-cpe-dictionary"
                  timestamp="2019-04-05T20:57:48.814-04:00">
      <cpe-list xmlns="http://cpe.mitre.org/dictionary/2.0" ...>
   ...</cpe-list>
   </sds:component>
</sds:data-stream-collection>
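
The listing above relies on two kinds of internal links: each sds:component-ref points at an sds:component through xlink:href, and each cat:uri points at another sds:component-ref through its uri attribute. The following Python sketch shows how those links could be checked for dangling targets. It is an illustrative aid only and is no substitute for SCAPval, which tests far more requirements. The function name dangling_references and the abbreviated SAMPLE, whose components are left empty, are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {
    "sds": "http://scap.nist.gov/schema/scap/source/1.2",
    "cat": "urn:oasis:names:tc:entity:xmlns:xml:catalog",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Abbreviated, hypothetical version of the listing; components are empty.
SAMPLE = """\
<sds:data-stream-collection
    xmlns:sds="http://scap.nist.gov/schema/scap/source/1.2"
    xmlns:cat="urn:oasis:names:tc:entity:xmlns:xml:catalog"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    id="scap_gov.nist_collection_example" schematron-version="1.3">
  <sds:data-stream id="scap_gov.nist_datastream_example" scap-version="1.3"
                   timestamp="2019-04-05T20:57:48.814-04:00"
                   use-case="CONFIGURATION">
    <sds:checklists>
      <sds:component-ref id="scap_gov.nist_cref_content-xccdf"
                         xlink:href="#scap_gov.nist_comp_content-xccdf">
        <cat:catalog>
          <cat:uri name="oval.xml" uri="#scap_gov.nist_cref_content-oval"/>
        </cat:catalog>
      </sds:component-ref>
    </sds:checklists>
    <sds:checks>
      <sds:component-ref id="scap_gov.nist_cref_content-oval"
                         xlink:href="#scap_gov.nist_comp_content-oval"/>
    </sds:checks>
  </sds:data-stream>
  <sds:component id="scap_gov.nist_comp_content-xccdf"
                 timestamp="2019-04-05T20:57:48.814-04:00"/>
  <sds:component id="scap_gov.nist_comp_content-oval"
                 timestamp="2019-04-05T20:57:48.814-04:00"/>
</sds:data-stream-collection>
"""


def dangling_references(collection_xml):
    """Return a message for every internal link whose target id is missing."""
    root = ET.fromstring(collection_xml)
    component_ids = {c.get("id") for c in root.findall("sds:component", NS)}
    crefs = root.findall(".//sds:component-ref", NS)
    cref_ids = {r.get("id") for r in crefs}
    problems = []
    for cref in crefs:
        # xlink:href must name an sds:component in the same collection.
        href = cref.get(XLINK_HREF, "")
        if href.lstrip("#") not in component_ids:
            problems.append(f"{cref.get('id')}: no component for {href}")
        # Each catalog entry must name another sds:component-ref.
        for entry in cref.findall("cat:catalog/cat:uri", NS):
            uri = entry.get("uri", "")
            if uri.lstrip("#") not in cref_ids:
                problems.append(f"{cref.get('id')}: no component-ref for {uri}")
    return problems


if __name__ == "__main__":
    print(dangling_references(SAMPLE) or "all internal links resolve")
```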

References

[1] Quinn S, Scarfone K, Waltermire D (2012) Guide to Adopting and Using the Security Content Automation Protocol (SCAP) Version 1.2 (Draft), NIST Special Publication 800-117.

[2] Extensible Markup Language (XML) 1.0 (Fifth Edition) (2008), W3C Recommendation. Available at http://www.w3.org/TR/xml/

[3] SCAP Validated Products and Modules - Security Content Automation Protocol Validation Program. Available at https://csrc.nist.gov/Projects/scap-validation-program/Validated-Products-and-Modules

[4] The United States Government Configuration Baseline (USGCB) - NIST. Available at https://usgcb.nist.gov/

[5] Guide to the Secure Configuration of Red Hat Enterprise Linux 7. OpenSCAP Security Guide. Available at https://static.open-scap.org/ssg-guides/ssg-rhel7-guide-pci-dss.html

[6] Payment Card Industry (PCI) Data Security Standard (2018), Version 3.2.1.

[7] OVAL Repository: Top Contributors. Available at https://oval.cisecurity.org/repository/top-contributors

[8] XML Schema Part 0: Primer Second Edition (2004), W3C Recommendation. Available at https://www.w3.org/TR/xmlschema-0/

[9] Waltermire D, Quinn S, Booth H, Scarfone K, Prisaca D (2018) The technical specification for the security content automation protocol (SCAP) version 1.3 (National Institute of Standards and Technology, Gaithersburg, MD), NIST SP 800-126r3. doi:https://doi.org/10.6028/NIST.SP.800-126r3

[10] DITA Version 1.3 Specification (2018) (Organization for the Advancement of Structured Information Standards), OASIS Standard. Available at http://docs.oasis-open.org/dita/dita/v1.3/dita-v1.3-part0-overview.html

[11] The DITA Open Toolkit: dita-ot/dita-ot (2019) (DITA Open Toolkit). Available at https://github.com/dita-ot/dita-ot

[12] Waltermire D, Schmidt C, Scarfone K, Ziring N (2011) Specification for the Extensible Configuration Checklist Description Format (XCCDF) Version 1.2, NIST Interagency Report 7275 Revision 4. Available at http://csrc.nist.gov/publications/PubsNISTIRs.html

[13] XML Linking Language (XLink) Version 1.1 (2010), W3C Recommendation. Available at https://www.w3.org/TR/xlink11/

[14] OVAL Documentation. Available at http://ovalproject.github.io/

[15] XML Catalogs (2005), OASIS Standard V1.1. Available at https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html

[16] Namespaces in XML 1.0 (Third Edition) (2009), W3C Recommendation. Available at https://www.w3.org/TR/xml-names/

[17] Lubell J (2018) A New SCAP Information and Data Model for Content Authors. Critical Infrastructure Protection XII, eds Staggs J, Shenoi S (Springer International Publishing), pp 127–146. doi:https://doi.org/10.1007/978-3-030-04537-1_8. Available at https://www.nist.gov/publications/new-scap-information-model-and-data-model-content-authors

[18] Kimber E (2012) DITA for Practitioners Volume 1: Architecture and Technology (XMLPress).

[19] Information technology — Document Schema Definition Language (DSDL) — Part 3: Rule-based validation — Schematron (2016) (International Organization for Standardization), ISO/IEC 19757-3. Available at http://schematron.com

[20] Information technology — Document Schema Definition Language (DSDL) — Part 2: Regular-grammar-based validation — RELAX NG (2008) (International Organization for Standardization), ISO/IEC 19757-2. Available at https://relaxng.org

[21] RELAX NG DTD Compatibility (2001) (Organization for the Advancement of Structured Information Standards), Committee Specification. Available at https://relaxng.org/compatibility-20011203.html

[22] Cook M, Quinn S, Waltermire D, Prisaca D (2018) Security content automation protocol (SCAP) version 1.3 validation program test requirements (National Institute of Standards and Technology, Gaithersburg, MD), NIST IR 7511r5. doi:https://doi.org/10.6028/NIST.IR.7511r5

[23] XSL Transformations (XSLT) Version 2.0 (2007), W3C Recommendation. Available at https://www.w3.org/TR/xslt20/

[24] Apache Ant (2019) (The Apache Software Foundation). Available at https://github.com/apache/ant

[25] Lubell J (2017) Using DITA to Create Security Configuration Checklists: A Case Study. Proceedings of Balisage: The Markup Conference, Balisage Series on Markup Technologies. (Washington, DC). doi:https://doi.org/10.4242/BalisageVol19.Lubell01

[26] Fox J (2019) Splash Screen Plug-in for the DITA Open Toolkit. Available at https://github.com/jason-fox/fox.jason.splash

[27] Steffens A, Lichter H, Moscher M (2018) Towards Data-driven Continuous Compliance Testing. 3rd Workshop on Continuous Software Engineering (Ulm, Germany), pp 78–84.

[28] Security compliance content in SCAP, Bash, Ansible, and other formats: ComplianceAsCode/content (2019) (ComplianceAsCode). Available at https://github.com/ComplianceAsCode/content

[29] Ben-Kiki O, Evans C (2009) YAML Ain’t Markup Language (YAML™) Version 1.2, 3rd Edition.

[30] Baset S, Suneja S, Bila N, Tuncer O, Isci C (2017) Usable declarative configuration specification and validation for applications, systems, and cloud. Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference on Industrial Track - Middleware ’17 (ACM Press, Las Vegas, Nevada), pp 29–35. doi:https://doi.org/10.1145/3154448.3154453



[1] Certain commercial and third-party products and services are identified in this paper to foster understanding. Such identification does not imply recommendation or endorsement by the National Institute of Standards and Technology, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose.

[2] Although SCAP allows for source data stream collections to contain multiple data streams, it is common for a collection to contain only a single data stream.

[3] To help reduce the sea of acronyms in this paper, unless explicitly stated otherwise, the terms checklist, checklist component, and checklist resource imply XCCDF as the checklist language. Similarly, check, check component, and check resource imply OVAL as the check language.

[4] SCAPval is a command-line application that validates SCAP source and result data streams against SP 800-126 XML schemas, and encapsulated XML resources against their XML schemas. SCAPval uses Schematron [19] to perform additional validations. SCAPval's output is a detailed validation report (Figure 8 shows an example of a validation report's summary section). Laboratories accredited to test SCAP products are required to use SCAPval reports as part of their testing process. Developers of SCAP software products such as the scanner shown in Figure 1, SCAP content developers, and organizations deploying SCAP products may use SCAPval for quality assurance, or to gain insight into SCAP validation requirements or product capabilities.

Joshua Lubell

Computer Scientist

National Institute of Standards and Technology

Joshua Lubell is a computer scientist whose work focuses on smart manufacturing systems cybersecurity. His technical interests include markup languages and information modeling. His Baseline Tailor software tool for security control selection won an award from Government Computer News. He received the United States Department of Commerce Silver Medal for his leadership in developing ISO 10303-203, a standard for representation and exchange of computer-aided designs. He is also a Balisage hyper-local, residing in the heart of Rockville, Maryland.