Data model¶
-
class
firehose.model.
Analysis
¶ The
Analysis
class represents one invocation of a code analysis tool. It corresponds to the
<analysis>
XML element, the top-level element of a Firehose XML document.
-
results
¶ A list of
Result
objects, representing the various issues, failures, and other information found during the analysis.
-
customfields
¶ CustomFields
or None
Here is the pertinent part of the XML schema:
<start>
  <!-- Results from the invocation of an analysis tool -->
  <element name="analysis">
    <ref name="metadata-element"/>
    <element name="results">
      <zeroOrMore>
        <choice>
          <ref name="issue-element"/>
          <ref name="failure-element"/>
          <ref name="info-element"/>
        </choice>
      </zeroOrMore>
    </element>
    <optional>
      <ref name="custom-fields-element"/>
    </optional>
  </element>
</start>
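The structure described by the schema can be explored with nothing but the standard library. The following is a minimal sketch, not Firehose's own parser: the sample document is hypothetical and abbreviated (its child elements are left empty, so it would not validate against the full schema), and `count_results` is an illustrative helper name.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, abbreviated document: the child elements are left empty,
# so this would not validate against the full schema.
SAMPLE = b"""<analysis>
  <metadata/>
  <results>
    <issue/>
    <issue/>
    <failure/>
    <info/>
  </results>
</analysis>"""


def count_results(fileobj):
    """Count the children of <results> by element name."""
    root = ET.parse(fileobj).getroot()
    if root.tag != "analysis":
        raise ValueError("expected <analysis> as the top-level element")
    results = root.find("results")
    if results is None:
        raise ValueError("missing <results> element")
    return Counter(child.tag for child in results)


counts = count_results(io.BytesIO(SAMPLE))
```

The helper takes a file-like object, in the same spirit as `from_xml`, so it works equally on an open file or an in-memory buffer.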
-
__init__(self, metadata, results, customfields=None):
Parameters:
- metadata (Metadata)
- results (list(Result))
- customfields (CustomFields or None)
-
classmethod
from_xml
(cls, fileobj)¶ Parse XML from fileobj, and return an
Analysis
instance representing the data seen there.
-
to_xml
(self)¶ Generate an
ET.ElementTree()
representing the data within self.
-
to_xml_bytes
(self)¶ Generate a
bytes
instance containing an XML serialization of the data within self.
-
Results¶
-
class
firehose.model.
Result
¶ Result is a base class. There are three subclasses:
- an Issue represents a report from the analyzer about a possible problem with the software under test.
- an Info represents additional kinds of information generated by an analyzer that is not a problem per se, e.g. code metrics, licensing info, etc.
- a Failure represents a report about a failure of the analyzer itself (e.g. if the analyzer crashed).
-
class
firehose.model.
Issue
(Result)¶ An
Issue
represents a report from the analyzer about a possible problem with the software under test. It corresponds to the
<issue>
XML element within a Firehose XML document.
-
cwe
¶ (
int
or None
): The Common Weakness Enumeration ID (see http://cwe.mitre.org/index.html ) e.g. “131” representing CWE-131 aka “Incorrect Calculation of Buffer Size” http://cwe.mitre.org/data/definitions/131.html
-
testid
¶ (
str
or None
): Each static analysis tool potentially has multiple tests, with its own IDs for its own tests. These can be captured here, as free-form strings.
-
trace
¶ (
Trace
or None
): An optional list of events that describe the circumstances leading up to a problem.
-
severity
¶ (
str
or None
): Each static analysis tool can potentially report a “severity”, which may be of use for filtering. The precise strings are likely to vary from tool to tool, so to avoid data-transfer issues the severity is stored here as an optional free-form string.
See: http://lists.fedoraproject.org/pipermail/firehose-devel/2013-February/000001.html
-
customfields
¶ (
CustomFields
or None
): A given tool/testid may have additional key/value pairs that it may be useful to capture.
-
write_as_gcc_output
(self, out)¶ Write the issue in the style of a GCC warning to the given file-like object.
>>> issue.write_as_gcc_output(sys.stderr)
examples/python-src-example.c:40:4: warning: ob_refcnt of '*item' is 1 too high [CWE-401]
was expecting final item->ob_refcnt to be N + 1 (for some unknown N)
due to object being referenced by: PyListObject.ob_item[0]
but final item->ob_refcnt is N + 2
examples/python-src-example.c:36:14: note: PyLongObject allocated at: item = PyLong_FromLong(random());
examples/python-src-example.c:37:8: note: when PyList_Append() succeeds
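The first line of that output follows the familiar `path:line:column: warning: message [CWE-n]` shape. As a rough illustration of that shape only, the hypothetical helper below (not part of Firehose, and not how `write_as_gcc_output` is implemented) formats a single diagnostic line from plain values.

```python
def format_gcc_style(path, line, column, message, cwe=None):
    """Format one diagnostic as 'path:line:column: warning: message [CWE-n]'.

    Hypothetical helper for illustration; the CWE suffix is added only
    when a CWE ID is supplied.
    """
    text = "%s:%d:%d: warning: %s" % (path, line, column, message)
    if cwe is not None:
        text += " [CWE-%d]" % cwe
    return text


first_line = format_gcc_style(
    "examples/python-src-example.c", 40, 4,
    "ob_refcnt of '*item' is 1 too high", cwe=401)
```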
-
get_cwe_str
(self)¶ Get a string giving the CWE title, or None:
>>> issue.get_cwe_str()
'CWE-131'
-
get_cwe_url
(self)¶ Get a string containing the URL of the CWE id, or None:
>>> issue.get_cwe_url()
'http://cwe.mitre.org/data/definitions/131.html'
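The two return values shown above are simple functions of the integer CWE ID. The standalone sketch below reproduces the documented outputs without importing Firehose; the function names are hypothetical, and this is an approximation of the behaviour rather than the library's actual code.

```python
def cwe_str(cwe):
    """Return e.g. 'CWE-131' for 131, or None when no CWE ID is set."""
    if cwe is None:
        return None
    return "CWE-%d" % cwe


def cwe_url(cwe):
    """Return the cwe.mitre.org definition URL for the ID, or None."""
    if cwe is None:
        return None
    return "http://cwe.mitre.org/data/definitions/%d.html" % cwe
```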
-
-
class
firehose.model.
Info
(Result)¶ An
Info
represents additional kinds of information generated by an analyzer that is not a problem per se, e.g. code metrics, licensing info, cross-referencing information, etc. It corresponds to the
<info>
XML element within a Firehose XML document.
-
infoid
¶ (
str
or None
): an optional free-form string identifying the kind of information being reported.
-
customfields
¶ CustomFields
or None
-
-
class
firehose.model.
Failure
(Result)¶ A
Failure
represents a report about a failure of the analyzer itself (e.g. if the analyzer crashed). If any of these are present then we don’t have full coverage.
For some analyzers this is an all-or-nothing affair: we either get issues reported, or a failure happens (e.g. a segfault of the analysis tool).
Other analyzers may be more fine-grained: able to report some issues, but choke on some subset of the code under analysis. For example cpychecker runs once per function, and any unhandled Python exceptions only affect one function.
It corresponds to the
<failure>
XML element within a Firehose XML document.
-
failureid
¶ (
str
or None
): Each static analysis tool can potentially identify the different ways in which it can fail. Those that do can record them here, as (optional) free-form strings.
-
location
¶ Location
: Some analysis tools may be able to annotate a failure report by providing the location within the software-under-test that broke them. For example, gcc-python-plugin has a
gcc.set_location()
method which can be used by a code analysis script to record what location is being analyzed, so that if an unhandled Python exception occurs, it is reported at that location. This is invaluable when debugging analysis failures.
-
customfields
¶ CustomFields
or None
: Every type of failure seems to have its own kinds of data that are worth capturing:
- stdout/stderr/returncode for a failed subprocess
- traceback for an unhandled Python exception
- verbose extra information about a cppcheck failure
etc. Hence we allow a
<failure>
to optionally contain extra key/value pairs, based on the
failureid
.
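To make the first case in that list concrete, here is a minimal sketch of gathering stdout/stderr/returncode from a failed subprocess into ordered key/value pairs. It uses only the standard library; the helper name and the key names `returncode`, `stdout` and `stderr` are assumptions chosen for illustration, not names mandated by Firehose.

```python
import subprocess
import sys
from collections import OrderedDict


def capture_failure_fields(argv):
    """Run argv; return ordered key/value pairs if it failed, else None."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output to str
    )
    if proc.returncode == 0:
        return None
    # Key names are hypothetical; any free-form names could be used.
    return OrderedDict([
        ("returncode", proc.returncode),
        ("stdout", proc.stdout),
        ("stderr", proc.stderr),
    ])


# Stand-in for an analyzer that fails: a child Python process exiting with 3.
fields = capture_failure_fields(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"])
```

Pairs collected this way could then be handed to whatever builds the `customfields` of a failure report.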
-
Metadata¶
-
class
firehose.model.
Metadata
¶ Holder for metadata about an analyzer invocation.
It corresponds to the
<metadata>
XML element within a Firehose XML document.
-
class
firehose.model.
Stats
¶ Stats
is an optional field of
Metadata
for capturing stats about an analysis run.
-
wallclocktime
¶ float
: how long (in seconds) the analyzer took to run
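A value suitable for `wallclocktime` is simply elapsed real time as a float number of seconds. The sketch below measures that around an arbitrary callable using the standard library; `timed` is a hypothetical helper, and the `sum` call merely stands in for running an analyzer.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed wall-clock seconds as a float)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in workload; a real caller would invoke the analysis tool here.
result, wallclocktime = timed(sum, range(1000))
```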
-
Describing the software under test¶
Warning
This part of the schema may need more thought/work.
-
class
firehose.model.
Sut
¶ Base class for describing the software-under-test.
-
class
firehose.model.
SourceRpm
(Sut)¶ It corresponds to the
<source-rpm>
XML element within a Firehose XML document.
-
name
¶ str
-
version
¶ str
-
release
¶ str
-
buildarch
¶ str
-
-
class
firehose.model.
DebianBinary
(Sut)¶ Internal Firehose representation of a Debian binary package. This object is very similar to a SourceRpm.
It corresponds to the
<debian-binary>
XML element within a Firehose XML document.
-
name
¶ str
: the binary package name.
-
version
¶ str
: should match upstream’s version number
-
release
¶ str
or None
: should be the Debian package local version. This should only be omitted if the package is a Debian native package.
-
buildarch
¶ str
: valid entries include amd64, kfreebsd-amd64, armhf, and hurd-i386, among others for Debian.
-
-
class
firehose.model.
DebianSource
(Sut)¶ Internal Firehose representation of a Debian source package. This object is very similar to a SourceRpm, but does not include the buildarch attribute.
It corresponds to the
<debian-source>
XML element within a Firehose XML document.
-
name
¶ str
: should be the source package name
-
version
¶ str
: should match upstream’s version number
-
release
¶ str
or None
: if given, should be the Debian package local version. This should only be omitted if the package is a Debian native package.
-
Describing source code¶
-
class
firehose.model.
Location
¶ A particular source code location.
It corresponds to the
<location>
XML element within a Firehose XML document.
-
function
¶ Function
or None
: The function (or method) containing the problem. This is optional. Some problems occur in global scope, and unfortunately, some analyzers don’t always report which function each problem was discovered in. Given that function names are less likely to change than line numbers, this is something that we should patch in each upstream analyzer as we go.
We can refer to either a location, or a range of locations within the file:
-
-
class
firehose.model.
File
¶ A description of a particular source file.
It corresponds to the
<file>
XML element within a Firehose XML document.
-
givenpath
¶ str
: the filename given by the analyzer. This is typically the one supplied to it on the command line, which might be absolute or relative.
Examples:
- “foo.c”
- “./src/foo.c”
- “/home/david/libfoo-1.0/src/foo.c”
-
abspath
¶ (
str
or None
): Optionally, a record of the absolute path of the file, to help deal with collating results from a build that changes working directory (e.g. recursive make).
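One way a report generator might derive an absolute path from a given path is to join it onto the working directory in effect when the analyzer ran. The following is a sketch under that assumption; `to_abspath` is a hypothetical helper, and `posixpath` is used so the result does not depend on the host platform.

```python
import posixpath


def to_abspath(givenpath, cwd):
    """Resolve givenpath against cwd; absolute given paths win unchanged."""
    return posixpath.normpath(posixpath.join(cwd, givenpath))


# The three example given paths can all name the same file, depending on
# the working directory the analyzer was invoked from.
paths = [
    to_abspath("foo.c", "/home/david/libfoo-1.0/src"),
    to_abspath("./src/foo.c", "/home/david/libfoo-1.0"),
    to_abspath("/home/david/libfoo-1.0/src/foo.c", "/tmp"),
]
```

With the absolute path recorded, results from sub-builds that ran in different directories can be collated on a single key.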
-
-
class
firehose.model.
Hash
¶ An optional value within
File
, allowing the report to specify a hash value for a particular file. This can be used to track different versions of files when collating different reports, and e.g. to cache file content in a UI.
It corresponds to the
<hash>
XML element within a Firehose XML document.
-
alg
¶ str
: the name of the hash algorithm. TODO: what naming convention?
-
hexdigest
¶ str
: the hexadecimal value of the digest (lower-case hexdigits, without any leading 0x).
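A digest in this form (lower-case hex digits, no leading 0x) is what Python's `hashlib` produces from `hexdigest()`. The sketch below computes an `alg`/`hexdigest` pair for some file content; since the naming convention for `alg` is still an open question above, the use of hashlib's own algorithm names such as `"sha1"` is an assumption, and `hash_fields` is a hypothetical helper.

```python
import hashlib


def hash_fields(content, alg="sha1"):
    """Return (alg, hexdigest) for bytes content.

    The alg name follows hashlib's naming, which is an assumption here.
    """
    digest = hashlib.new(alg, content).hexdigest()
    return alg, digest


alg, hexdigest = hash_fields(b"")
```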
-
-
class
firehose.model.
Function
¶ Identification of a particular function within source code.
It corresponds to the
<function>
XML element within a Firehose XML document.
-
name
¶ str
: the name of the function or method.
-
-
class
firehose.model.
Point
¶ Identification of a particular line/column within a source file.
It corresponds to the
<point>
XML element within a Firehose XML document.
-
line
¶ int
: the 1-based number of the line containing the point
-
column
¶ int
: the 1-based number of the column
Note
GCC uses a 1-based convention for source columns, whereas Emacs’s
M-x column-number-mode
uses a 0-based convention.
For example, an error in the initial, left-hand column of source line 3 is reported by GCC as:
some-file.c:3:1: error: ...etc...
On navigating to the location of that error in Emacs (e.g. via
next-error
), the locus is reported in the Mode Line (assuming
M-x column-number-mode
) as:
some-file.c 10% (3, 0)
i.e.
3:1:
in GCC corresponds to
(3, 0)
in Emacs.
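The difference between the two conventions is a fixed offset of one on the column only; line numbers are 1-based in both. A small sketch of the conversion, using hypothetical helper names:

```python
def gcc_to_emacs_column(column):
    """Convert a 1-based (GCC-style) column to a 0-based (Emacs-style) one."""
    if column < 1:
        raise ValueError("1-based columns start at 1")
    return column - 1


def emacs_to_gcc_column(column):
    """Convert a 0-based (Emacs-style) column to a 1-based (GCC-style) one."""
    if column < 0:
        raise ValueError("0-based columns start at 0")
    return column + 1


# GCC's 3:1 is shown by Emacs as (3, 0): the line is unchanged.
line, column = 3, 1
emacs_locus = (line, gcc_to_emacs_column(column))
```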
-
Capturing the circumstances leading up to a problem¶
-
class
firehose.model.
Trace
¶ An optional list of events within an
Issue
that describe the circumstances leading up to a problem. It corresponds to the
<trace>
XML element within a Firehose XML document. See example of a trace.
Other data¶
-
class
firehose.model.
CustomFields
(OrderedDict)¶ A big escape-hatch in the data model: support for arbitrary, ordered key/value pairs for roundtripping data specific to a particular situation, e.g. debugging attributes for a particular failure.
It corresponds to the
<custom-fields>
XML element within a Firehose XML document.