The influences relation
  Kythe stores information about dataflow in the graph using
influences edges. A node A
influences a node B if A directly affects B during the evaluation
of a program. This relationship is not intended to model all dataflow
interactions. In particular, it excludes nontrivial aliasing (that can’t
be decided within the scope of a single compilation unit). Despite this,
the influence relation can be used to trace simple paths through a program
by exploring the subgraph iteratively, allowing for non-universal queries
that fit the following examples:
- 
find places where a flag passed to a program directly affects a different program value
 - 
find places where the return value from a sensitive function is passed to an insensitive sink, like leaking a key to a logging function
 - 
identify the places that affect a program value that was assigned something unexpected at runtime
 
This document is intended to describe how to add the influences relationship
to a Kythe indexer based on common language features. Note that this document
is still under active development.
General framework
We assume that the indexer being amended performs an AST walk over a tree with resolved references (so it’s possible to construct VNames for, e.g., a variable that is referenced). The indexer should maintain a stack of sets of influencers (using whichever type is appopriate; for C++, we use pointers to declarations). As the indexer enters and leaves various kinds of AST node and encounters others, it will manipulate this stack.
Names follow this convention:
- 
vis a simple program variable. - 
pis a parameter. - 
eis an expression. 
Popping the set "onto" an object o with node n means removing the top
set from the influencers stack and recording i influences n edges for all
nodes i for the objects in that top set.
The blame context is the node or nodes that would be assigned responsibility for making a function call at the current point in the AST. For example:
int f() { g(); }
Here, the call to g would be assigned to blame context f.
Variables
Simple reference
v
- 
Insert
v. 
Simple assignment
v := e
- 
Push
{}. - 
Traverse
e. - 
Pop onto
v. 
A simple expression. (C++)
void f() {
  //- @x defines/binding VarX
  int x = 0;
  //- @y defines/binding VarY
  int y = 1;
  //- // VarX influences VarY
  y = x;
}
Functions
Simple calls
e(e1 ... eN)
- 
Traverse
e. - 
For each of
e1 ... eNwith corresponding parametersp1 ... pN: - 
Push
{}. - 
Traverse
eI. - 
Pop onto
pI. - 
Insert
e. 
Returns
return e;
- 
Push
{}. - 
Traverse
e. - 
Pop onto the blame context.
 
A function call. (C++)
//- @x defines/binding ParamX
//- @g defines/binding FnG
int g(int x) {
  //- ParamX influences FnG
  return x + 1;
}
void f() {
  //- @y defines/binding VarY
  int y = 0;
  //- @z defines/binding VarZ
  int z;
  //- VarY influences ParamX
  //- FnG influences VarZ
  z = y + g(y);
}
Forward declarations
f'(p'1 ... p'N);
f(p1 ... pN) { ... }
- 
p'IinfluencespI. - 
finfluencesf'. 
Forward declarations. (C++)
//- @f defines/binding FwdFnF
//- @x defines/binding FwdParamX
//- @y defines/binding FwdParamY
int f(int x, int y);
//- @f defines/binding DefnFnF
//- @x defines/binding DefnParamX
//- @y defines/binding DefnParamY
int f(int x, int y) {
  return x + y;
}
//- FwdParamX influences DefnParamX
//- FwdParamY influences DefnParamY
//- DefnFnF influences FwdFnF