TDOSCA & OSCake: Automating FOSS Compliance
By releasing the Open Source License Compendium and the Open Source Compliance Advisor, Deutsche Telekom (DT) has already contributed to the task of achieving Open Source compliance. But DT offers so many complex Open Source based products that creating the necessary Open Source compliance artifacts manually is too expensive. Thus, DT needs an automated toolchain that is usable in practice. This article discusses a new method (TDOSCA) and a new tool (OSCake) that DT is developing and contributing under the umbrella of the OpenChain project.
3 simple questions for an Open Source Compliance tool
There are undoubtedly many Open Source compliance tools already. The OpenChain Reference Tooling Work Group has compiled a list of them, which can be clustered according to various criteria:
- Some of the tools can be grouped by the offering organizations like the Apache Foundation, SPDX, Eclipse, or the About Code Initiative.
- Some of the tools stand apart because they have a narrow, specific focus or are not really tools at all.
- Some of the ‘tools’ are better described as services than as tools.
Deutsche Telekom has a simple point of view on FOSS compliance tools. Whenever DT comes across such a tool, it asks:
- Does this tool deliver the FOSS compliance artifacts DT really needs?
- If not, which part of them can it deliver?
- How much work would DT still have to do manually if it used the tool?
DT has a long tradition of evaluating FOSS compliance tools. Its employees have encountered excellent tools and brilliant experts who were often completely convinced that they could substantially support DT. But in the end, DT mostly felt that they did not really understand what DT needed (and still needs). To clarify this point: whoever delivers large lists of found FOSS items and says that a company now has to discuss each entry with its legal department does not really help the company.
Nevertheless, DT has to deal with such large lists. Open Source compliance is not a question of pleasure or displeasure: either one uses Open Source software and fulfills the respective requirements, or one does not use the software. Therefore, DT can’t wait any longer. The complexity of its products forces DT to actively advance the automation of Open Source compliance. To solve this issue, it does not want to start yet another greenfield project but to participate in existing projects, entirely in the spirit of the open-source idea.
Setting up the Test-Driven environment to develop tools generating Open Source Compliance Artifacts
DT’s first step was to improve its own communication: it wants to state more clearly what it really needs, from the point of view of a large company dealing with many complex software stacks. Thus, DT tried to apply the idea of ‘Test-Driven Software Development’ to the development of compliance tools by defining test cases:
- On the one hand, these test cases should contain really usable software, together with the licensing and dependency information as they are usually put together in real projects.
- On the other hand, these test cases should contain those compliance artifacts that would allow distributing the software compliantly if added to the respective software package.
Additionally, DT thinks
- that existing open source projects are mostly too complex to be used as reference material,
- that artificially generated software can focus better on the essential compliance issues,
- that the reference software should, on the one hand, functionally be a simple hello world program,
- and, on the other hand, should ‘implement’ sophisticated compliance issues as they are found in real open-source projects.
Using such test cases, the community, the tool makers, and the companies can verify
- with which compliance traps a tool can already successfully deal,
- which artifacts a tool already delivers (and which not),
- where there are still some open issues, and
- where deviating results are only a matter of interpretation.
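The verification idea above can be sketched as a comparison between the components a tool reports and a reference list of components. The following is only a minimal Python sketch and not part of TDOSCA; the function name `compare_components` and the component identifiers are hypothetical.

```python
def compare_components(reference, reported):
    """Compare the components a tool reported with the reference components."""
    reference, reported = set(reference), set(reported)
    return {
        "matched": sorted(reference & reported),      # found by the tool, as expected
        "missing": sorted(reference - reported),      # expected, but not found
        "unexpected": sorted(reported - reference),   # found, but not expected
    }


# Hypothetical component identifiers, for illustration only.
reference_components = ["hello-world-1.0", "libgreeter-0.3"]
tool_output = ["hello-world-1.0", "libextra-2.1"]

result = compare_components(reference_components, tool_output)
print(result)
```

Entries under "missing" and "unexpected" are the candidates for open issues or for deviations that are only a matter of interpretation; deciding between the two still needs a human.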
The ‘Hello World’ Open Source Compliance Test Cases
All TDOSCA test cases are hosted under the GitHub organization Open-Source-Compliance and grouped by the prefix tdosca. The README of the main repository tdosca describes the general approach; each test case follows the same structure. For example, take a look at tdosca-tc06-plainhw:
- On the top level, a test-case-specific README describes its intention.
- In the directory input-sources, you find a compilable software package
  - that contains the licensing information just as real open source projects do
  - and can be installed by a standard technique (in this case: Java and Maven).
- On the top level, a compliance-trap file describes the challenges that are implemented in the source and should be managed by the tools.
- And in the directory reference-compliance-artifacts, one can find the compliance artifacts that a tool should deliver:
  - a BOM file listing the (sub)components of the package
  - a list of the packages that must be preinstalled on the target host
  - the Open Source Compliance File, which, added to the package, establishes a compliantly distributable open-source software package.
The test cases themselves are stored in the respective repositories tdosca-tc01 … tdosca-tc0n
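Because every test case is meant to follow the same layout, a local checkout can be checked mechanically. The following Python sketch assumes only the two directory names mentioned above; the helper name `check_test_case_layout` is hypothetical, and accepting any top-level file whose name starts with README is an assumption, since the exact file names are not stated here.

```python
from pathlib import Path
import tempfile

# Directory names taken from the description above.
REQUIRED_DIRS = ("input-sources", "reference-compliance-artifacts")


def check_test_case_layout(root):
    """Return a list of problems found in a local test-case checkout."""
    root = Path(root)
    problems = [
        f"missing directory: {name}"
        for name in REQUIRED_DIRS
        if not (root / name).is_dir()
    ]
    # Assumption: any top-level file starting with "README" counts.
    if not any(p.is_file() for p in root.glob("README*")):
        problems.append("missing top-level README")
    return problems


# Demonstration on a throwaway directory that mimics an incomplete checkout.
with tempfile.TemporaryDirectory() as tmp:
    case = Path(tmp) / "tdosca-tc06-plainhw"
    (case / "input-sources").mkdir(parents=True)
    (case / "README.md").write_text("demo")
    problems = check_test_case_layout(case)
    print(problems)
```

The demonstration directory deliberately lacks the reference-compliance-artifacts directory, so exactly one problem is reported.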
The core reference entity of a test case is its Open Source Compliance File (OSCF): such a file shall contain all compliance artifacts, so that a package can be distributed compliantly if it is bundled with the respective OSCF. This idea was inspired by the file that Cisco adds to its Jabber client: https://www.cisco.com/c/dam/en_us/about/doing_business/open_source/docs/CiscoJabberforWindows-128-1578365187.pdf. That file is not completely sufficient, but it gives a good idea of how to deal with this issue. In the TDOSCA context, the meaning of such an Open Source Compliance File can be explained by looking at the OSCF of the sixth test case.
A summary and an addendum:
In general each TDOSCA test-case implements the following structure:
The TDOSCA initiative, hosted under the umbrella of OpenChain and the OpenChain Reference Tooling Work Group, could be a good way for the community to evaluate its tools against such test cases.
But if DT followed this approach purely, DT would easily slip into the role of a police officer or a judge. That’s not what DT wants to be; it wants to be a supportive part of the community. For that purpose, DT has already evaluated existing tools on the basis of the TDOSCA test cases, has gained some experience, and has drawn some consequences:
Applying the approach to ORT
First, DT decided to use ORT, the OSS Review Toolkit, to create a first end-to-end version of a toolchain that takes the test-case input and derives the compliance output:
In the picture you see
- the five components that ORT mentions in its README,
- the data they generate, and
- how they use the output of their predecessors.
Using this outline, we can now exemplify some of …
… and gaining experience with ORT
- First, DT noticed that even the first and simplest test case, which uses the GNU Autotools, could not be evaluated yet.
- Second, DT had to learn that, in combination with Gradle, ORT cannot decide for the moment which of the found licenses is the default license.
- Third, DT noticed that the standard templates included in the ORT reporter follow the principle of over-fulfillment, that is, of over-fulfilling the license requirements.
What does the last point mean? If a software project is completely and exclusively licensed under the MIT license, then it is sufficient to bundle the license text and its embedded copyright line with the package to make it compliantly distributable. Tools that follow the principle of over-fulfillment would also add the artifacts derived from the GPL requirements, such as ‘all copyright headers of all files’ and so on.
This is an often applied approach. But following the principle of over-fulfillment is a problematic strategy:
- On the one hand, distributors are also responsible for incorrectly created compliance artifacts, even for those that the actually relevant license does not require and that should not have been supplied at all.
- On the other hand, the surplus compliance artifacts could override or undermine the essential artifacts.
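The difference between exact fulfillment and over-fulfillment can be made concrete with a small Python sketch. The rule table below is a strongly simplified illustration built from the MIT and GPL example in the text; the artifact names, the license identifiers, and both function names are hypothetical and do not describe how ORT or any other tool works internally.

```python
# Simplified, illustrative rule table; real license obligations are more nuanced.
REQUIRED_ARTIFACTS = {
    "MIT": {"license text with copyright line"},
    "GPL-2.0-only": {"license text", "copyright headers of all files"},
}


def exact_artifacts(licenses):
    """Artifacts required by the licenses that actually apply to the package."""
    required = set()
    for lic in licenses:
        required |= REQUIRED_ARTIFACTS[lic]
    return required


def over_fulfilled_artifacts(licenses):
    """Artifacts emitted when every known rule is applied, whatever the license."""
    # The licenses argument is deliberately ignored: that is the over-fulfillment.
    return set().union(*REQUIRED_ARTIFACTS.values())


project_licenses = ["MIT"]
needed = exact_artifacts(project_licenses)
surplus = over_fulfilled_artifacts(project_licenses) - needed

print(sorted(needed))
print(sorted(surplus))
```

Everything in `surplus` is an artifact the distributor would ship, and be responsible for, without any license of the package asking for it.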
Fortunately, ORT follows the design principle of making everything configurable and extendable, which allows DT to adapt it to its needs in three ways:
Consequence 1: Improving ORT
- Deutsche Telekom plans to implement a technique for evaluating Autotools scripts and to contribute it back to ORT.
- It will define, implement, and contribute upstream to ORT a generally usable strategy for determining the default license of a package.
Consequence 2: Extending the case structure
- DT will define more test cases along the dimensions of a multi-dimensional space: complexity, programming language, and dependency manager.
Consequence 3: Defining an intelligent Open Source Compliance artifact knowledge engine
- DT is developing an intelligent component into which it embeds Open Source license compliance knowledge in a declarative manner by
  - adding respective writers to ORT
  - adding a FOSS compliance domain-specific language realized on the basis of Eclipse Xtext
  - adding a respective compliance artifact composer based on Xtend.
This new component of and for Open Source compliance chains is called OSCake, the Open Source Compliance artifact knowledge engine. It is developed under the terms of the Eclipse Public License 2.0.
OSCake shall close the gaps left by Open Source scanning tools that follow the principle of compliance over-fulfillment. It will take Open Source Compliance collections and deliver Open Source Compliance Files that really fit the requirements of the involved Open Source licenses and their contexts. OSCake will become a tool-agnostic compliance knowledge engine; it will not depend on a specific scanning tool but only on an error-tolerant input format. To be able to offer these features, OSCake will have an internal structure:
Conclusion
TDOSCA and OSCake establish a promising set of goals for the company itself as well as for the community and other commercial approaches:
- DT wants to set up a FOSS compliance toolchain that is usable in practice and automatically generates the compliance artifacts DT needs.
- DT wants to reduce the manual work as far as possible.
- And DT develops this chain (and its components) under the control of TDOSCA: the project to develop Test-Driven Open Source Compliance Artifact Gatherers and Compilers, including DT’s own tool ‘OSCake’.
It is also notable that both parts are developed under the umbrella of OpenChain and its OpenChain Reference Tooling Work Group.
Related Links:
- OSLiC sources: https://github.com/telekom/oslic
- OSLiC homepage: http://telekom.github.io/oslic/
- OSLiC version 1.0.2: https://telekom.github.io/oslic/releases/oslic.pdf
- OSCAd sources: https://github.com/telekom/oscad
- OSCAd homepage: https://telekom.github.io/oscad/
- OSCAd instance: http://oscad.fodina.de/
- OpenChain homepage: https://www.openchainproject.org/
- Respective Linux Foundation project page: https://www.linuxfoundation.org/projects/security-compliance/
- Introduction into the Open Chain Reference Tooling Work Group: https://www.openchainproject.org/news/2020/03/15/openchain-reference-tooling-work-group-in-2020
- Open Chain Reference Tooling Work Group homepage: http://oss-compliance-tooling.org/
- Existing Open Source license compliance tools: http://oss-compliance-tooling.org/Tooling-Landscape/OSS-Based-License-Compliance-Tools/
- OSS Review Toolkit (ORT): https://github.com/oss-review-toolkit/ort
- Test Driven Open Source Compliance Initiative: https://github.com/Open-Source-Compliance/tdosca
- Open Source Compliance artifact knowledge engine: https://github.com/Open-Source-Compliance/OSCake
- Open Compliance Summit 2020: https://events.linuxfoundation.org/open-compliance-summit/program/schedule/