Callgraphs
The Kythe graph contains information about callable objects like functions and ref/call edges that target them. This information can be used to satisfy queries about the locations at which different functions are called:
foo calls bar. (C++)
//- @bar defines/binding FnBar
void bar() { }
//- @"bar()" ref/call FnBar
//- @"bar()" childof FnFoo
//- @foo defines/binding FnFoo
void foo() { bar(); }
In this example, function foo makes a single call to function bar. The
callsite is the expression bar(). This is spanned by an anchor that has a
ref/call edge pointing to the function node for bar. The anchor also has a
childof edge that points to function foo. This edge indicates that the
effects of the anchor should be blamed on foo. There will be up to one such
blame childof edge with an anchor source and a semantic target.
The assertions in the example trace a possible set of queries made against a Kythe graph:
-
Identify the semantic object for the definition in question (
FnBar). -
Find the locations at which that object is called. These will be
ref/calledges withanchorsources and semantic targets. -
Find the semantic objects associated with each callsite anchor. These will be attached by
childofedges. -
Finally, we can show the binding location for the caller by looking up its
defines/bindingedge.
These queries will not capture the full set of traditional callsites. We will fix this by first finding calls to forward declarations; then we will consider overrides.
Forward declarations
A call made using a forward declaration may not always ref/call the definition
that completedby that declaration. This allows the
Kythe graph to model a set of programs (where different programs link against
different implementations for the same declaration) rather than a single program
(where linking different implementations in this way would result in undefined
behavior). In order to build a complete call graph, one must therefore also look
along completedby edges to find alternate possible nodes that may be used to
make calls to a given declaration or definition.
foo calls a forward-declared bar. (C++)
#include "bardecl.h"
//- @"c.bar()" ref/call BarDecl
void foo(C& c) { c.bar(); }
#example bardecl.h
//- @bar defines/binding BarDecl
struct C { void bar(); };
#example barimpl.cc
#include "bardecl.h"
//- @bar defines/binding BarImpl
void C::bar() { }
// Imagine that we only had BarDecl (by looking up the ref/call from c.bar()
// in foo()). We can find the implementation BarImpl of C::bar by taking the
// following trip:
//- BarDecl completedby BarImpl
Note that this overestimates the possible set of nodes for a given callee: for example, in a given program a forward declaration might never be linked (in the joining-together-object-files sense) with a definition. In order to filter these extra associations out, one would need to consider both which program is in focus by the user and which modules went into building that program. This is out of scope for this document.
completedby relationships should only be established when an indexer observes
that a completion has actually occurred. This allows users of the graph to
avoid connecting implementations to signatures that those implementations
never observe:
Unrelated defs and decls can be distinguished. (C++)
#example foo1.h
// A header that declares a global foo().
//- @foo defines/binding Foo1Decl
void foo();
#example foo2.h
// An unrelated header that also declares a foo().
//- @foo defines/binding Foo2Decl
void foo();
#example foo1.cc
#include "foo1.h"
// This definition sees the first declaration (but never the second).
//- @foo defines/binding Foo1Defn
//- Foo1Decl completedby Foo1Defn
void foo() { }
#example foo2.cc
#include "foo2.h"
// This definition sees only the second declaration.
//- @foo defines/binding Foo2Defn
//- Foo2Decl completedby Foo2Defn
void foo() { }
Finally, be careful to follow the completedby relationship in both directions,
since the Kythe graph records specifically which declaration is used in a
function call.
Make sure to reach every callsite. (C++)
#include "foo.h"
//- @"foo()"=FooDeclCall ref/call FooDecl
void baz() { foo(); }
//- @foo=FooImplAnchor defines/binding FooImpl
void foo() { }
//- @"foo()"=FooImplCall ref/call FooImpl
void bar() { foo(); }
#example foo.h
//- @foo=FooDeclAnchor defines/binding FooDecl
void foo();
// Finding all callsites for foo() follows the same query pattern. Depending
// on your starting point, different VNames are unknown: if you begin with
// FooImplAnchor, you'll need to search forward along the completedby edge for
// FooDecl.
//- FooDecl completedby FooImpl
// If you start with FooDeclAnchor, you first need to find FooDecl,
// then you can search backward for all of the nodes that complete it.
//- FooImplAnchor defines/binding FooImpl
//- FooDeclAnchor defines/binding FooDecl
// Then look for all the ref/call edges terminating at those nodes:
//- FooDeclCall ref/call FooDecl
//- FooImplCall ref/call FooImpl
Overrides
For callsites that involve dynamic binding, Kythe indexers should emit
ref/call edges to the most specific possible node.
Use the most specific static overrides. (C++)
//- @f defines/binding DefSF
struct S { virtual void f() { } };
//- @f defines/binding DefTF
//- DefTF overrides DefSF
struct T : public S { void f() override { } };
//- @"s->f()" ref/call DefSF
void CallSF(S* s) { s->f(); }
//- @"t->f()" ref/call DefTF
void CallTF(T* t) { t->f(); }
Depending on the semantics of the query you want to make, you may need to walk
up or down the override chain. For example, to find all of the calls to S::f
in the example above, including those calls that are made to functions that
override the implementation in S:
Follow the override chain downward. (C++)
//- @f defines/binding DefSF
struct S { virtual void f() { } };
struct T : public S { void f() override { } };
void CallSF(S* s) { s->f(); }
void CallTF(T* t) { t->f(); }
// Starting with DefSF, find all callers (and callers of overrides):
// Build a set of overrides.
//- DefTF overrides DefSF
// Find all callsites.
//- DefSFCall ref/call DefSF
//- DefTFCall ref/call DefTF
Final query
We can now build a single query that will unify defs and decls and search up and
down the override chain. This query starts with some semantic object Fn and
yields the various anchors that call Fn in the broadest sense.
-
Add
Fnto setO. -
Repeat until fixed point:
-
For every
oinO, for all nodeso'such thato' overrides ooro overrides o', addo'toO. -
For every
oinO, for all nodeso'such that (for somea)a defines/binding o'and (o completedby o'), addo'toO. -
For every
oinO, for all nodeso'such that (for somea)a defines/binding oand (o' completedby o), addo'toO.
-
-
Return the set of all
asuch that there is someoinOsuch thata ref/call o.
Since this query is expensive in practice, we (intend to) precompute it as part of building serving tables.