This document assumes that the latest Kythe release archive has been unpacked into /opt/kythe/. See /opt/kythe/README for more information.

Extracting Compilations

# Generate Kythe protobuf sources
bazel build //kythe/proto:all

# Environment variables common to Kythe extractors
export KYTHE_ROOT_DIRECTORY="$PWD"                        # Root of source code corpus
export KYTHE_OUTPUT_DIRECTORY="/tmp/kythe.compilations/"  # Output directory
export KYTHE_VNAMES="$PWD/kythe/data/vnames.json"         # Optional: VNames configuration


# Extract a Java compilation
# java -Xbootclasspath/p:third_party/javac/javac*.jar \
# \
#   <javac_arguments>
java -Xbootclasspath/p:third_party/javac/javac*.jar \
  -jar /opt/kythe/extractors/javac_extractor.jar \
  <javac_arguments>

# Extract a C++ compilation
# /opt/kythe/extractors/cxx_extractor <arguments>
/opt/kythe/extractors/cxx_extractor -x c++ kythe/cxx/common/scope_guard.h
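The extractor variables can also be scoped to a single invocation instead of being exported into the whole shell session. The following is a minimal sketch of that idea, not part of the Kythe release: the function name run_kythe_extractor is hypothetical, and creating the output directory up front is a precaution rather than something the extractors are known to require.

```shell
# Hypothetical helper: run one extractor command with the Kythe variables set
# only for that command. KYTHE_VNAMES, if exported, is inherited unchanged.
run_kythe_extractor() {
  if [ "$#" -lt 3 ]; then
    echo "usage: run_kythe_extractor ROOT OUTPUT_DIR COMMAND [ARGS...]" >&2
    return 2
  fi
  local root="$1" out="$2"
  shift 2
  # Assumption: pre-creating the output directory is harmless.
  mkdir -p "$out" || return 1
  env KYTHE_ROOT_DIRECTORY="$root" KYTHE_OUTPUT_DIRECTORY="$out" "$@"
}

# Example (same C++ extraction as above):
#   run_kythe_extractor "$PWD" /tmp/kythe.compilations/ \
#     /opt/kythe/extractors/cxx_extractor -x c++ kythe/cxx/common/scope_guard.h
```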

Extracting Compilations using Bazel

Kythe uses Bazel to build itself and has implemented Bazel action_listeners that use Kythe’s Java and C++ extractors. This effectively allows Bazel to extract each compilation as it is run during the build.

Extracting the Kythe repository

Add the flag --experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_java to make Bazel extract Java compilations and --experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_cxx to do the same for C++.

# Extract all Java/C++ compilations in Kythe
bazel build -k \
  --experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_java \
  --experimental_action_listener=@io_kythe//kythe/extractors:extract_kzip_cxx \
  --experimental_extra_action_top_level_only \
  //kythe/cxx/... //kythe/java/...

# Find the extracted .kzip files
find -L bazel-out -name '*.kzip'

The provided utility script does a full extraction using Bazel and then moves the compilations into the directory structure used by the kythe/kythe Docker image.
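For a quick look at the results without that script, the find command above can be wrapped to gather the .kzip files in one place. This is only a sketch: collect_kzips is a hypothetical name, and the flat destination layout is an assumption that does not reproduce the directory structure the Docker image expects.

```shell
# Hypothetical helper: copy every .kzip found under SRC into DEST (flat).
# Files that share a basename overwrite each other; DEST should live
# outside SRC so that copied files are not found again.
collect_kzips() {
  local src="$1" dest="$2"
  mkdir -p "$dest" || return 1
  # -L follows symlinks such as the bazel-out link in the workspace root.
  find -L "$src" -name '*.kzip' -type f -exec cp {} "$dest"/ \;
}

# Example:
#   collect_kzips bazel-out /tmp/kythe.kzips
```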

Extracting other Bazel-based repositories

You can use the Kythe release to extract compilations from other Bazel-based repositories.

# Download and unpack the latest Kythe release
wget -O /tmp/kythe.tar.gz \$KYTHE_VERSION/kythe-$KYTHE_VERSION.tar.gz
tar --no-same-owner -xvzf /tmp/kythe.tar.gz --directory /opt
echo 'KYTHE_DIR=/opt/kythe-$KYTHE_VERSION' >> $BASH_ENV

# Build the repository with extraction enabled
bazel --bazelrc=$KYTHE_DIR/extractors.bazelrc \
    build --override_repository kythe_release=$KYTHE_DIR \

Extracting CMake-based repositories

These instructions assume your environment is already set up to successfully run cmake for your repository.

Set the following three environment variables:

  • KYTHE_ROOT_DIRECTORY: The absolute path for file input to be extracted. This is generally the root of the repository. All files extracted will be stored relative to this path.
  • KYTHE_OUTPUT_DIRECTORY: The absolute path for storing output.
  • KYTHE_CORPUS: The corpus label for extracted files.

$ export KYTHE_ROOT_DIRECTORY="/absolute/path/to/repo/root"
$ export KYTHE_OUTPUT_DIRECTORY="/tmp/kythe-output"
$ export KYTHE_CORPUS=""

# $CMAKE_ROOT_DIRECTORY is passed into the -sourcedir flag. This value should be
# the directory that contains the top-level CMakeLists.txt file. In many
# repositories this path is the same as $KYTHE_ROOT_DIRECTORY.
$ export CMAKE_ROOT_DIRECTORY="/absolute/path/to/cmake/root"

$ /opt/kythe/tools/runextractor cmake \
    -extractor=/opt/kythe/extractors/cxx_extractor \
    -sourcedir="$CMAKE_ROOT_DIRECTORY"
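
Because the variables above must hold absolute paths, a small pre-flight check can catch mistakes before the extractor runs. The sketch below is illustrative only: check_cmake_extraction_env is a hypothetical name, and treating an empty KYTHE_CORPUS as a warning rather than an error is an assumption.

```shell
# Hypothetical pre-flight check for the CMake extraction variables.
# Returns non-zero if a path variable is not absolute or CMakeLists.txt
# is missing from CMAKE_ROOT_DIRECTORY.
check_cmake_extraction_env() {
  local name value status=0
  for name in KYTHE_ROOT_DIRECTORY KYTHE_OUTPUT_DIRECTORY CMAKE_ROOT_DIRECTORY; do
    # Indirect lookup of the variable named by $name (names are fixed above).
    eval "value=\${$name:-}"
    case "$value" in
      /*) ;;  # starts with a slash, so it is absolute
      *)
        echo "$name must be an absolute path (got '$value')" >&2
        status=1
        ;;
    esac
  done
  if [ ! -f "${CMAKE_ROOT_DIRECTORY:-}/CMakeLists.txt" ]; then
    echo "no CMakeLists.txt found in CMAKE_ROOT_DIRECTORY" >&2
    status=1
  fi
  # Assumption: an empty corpus label is worth a warning, not a failure.
  if [ -z "${KYTHE_CORPUS:-}" ]; then
    echo "warning: KYTHE_CORPUS is empty" >&2
  fi
  return "$status"
}

# Example:
#   check_cmake_extraction_env || exit 1
```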

Indexing Compilations

All Kythe indexers analyze compilations emitted from extractors as .kzip files. The indexers then emit a delimited stream of entry protobufs that can be stored in a GraphStore.

# Indexing a C++ compilation
# /opt/kythe/indexers/cxx_indexer --ignore_unimplemented <kzip-file> > entries
/opt/kythe/indexers/cxx_indexer --ignore_unimplemented \
  .kythe_compilations/c++/ > entries

# Indexing a Java compilation
# java -Xbootclasspath/p:third_party/javac/javac*.jar \
# \
#   <kzip-file> > entries
java -Xbootclasspath/p:third_party/javac/javac*.jar \ \
  $PWD/.kythe_compilations/java/ > entries

# View indexer's output entry stream as JSON
/opt/kythe/tools/entrystream --write_format=json < entries

# Write entry stream into a GraphStore
/opt/kythe/tools/write_entries --graphstore leveldb:/tmp/gs < entries
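
When a directory holds many C++ compilations, the index and write steps above can be looped over each .kzip file. The following is a rough sketch under stated assumptions: index_cxx_kzips and the KYTHE_CXX_INDEXER / KYTHE_WRITE_ENTRIES variables are hypothetical names, and it assumes write_entries may be run repeatedly against the same GraphStore.

```shell
# Tool locations mirror the commands above; override them as needed.
KYTHE_CXX_INDEXER="${KYTHE_CXX_INDEXER:-/opt/kythe/indexers/cxx_indexer}"
KYTHE_WRITE_ENTRIES="${KYTHE_WRITE_ENTRIES:-/opt/kythe/tools/write_entries}"

# Hypothetical helper: index each .kzip directly inside DIR and write the
# resulting entries into GRAPHSTORE (for example leveldb:/tmp/gs).
index_cxx_kzips() {
  local dir="$1" graphstore="$2" kzip entries status=0
  entries=$(mktemp) || return 1
  for kzip in "$dir"/*.kzip; do
    [ -e "$kzip" ] || continue  # the glob matched nothing
    # Use an intermediate file so a failure in either tool is detected.
    if "$KYTHE_CXX_INDEXER" --ignore_unimplemented "$kzip" > "$entries" &&
       "$KYTHE_WRITE_ENTRIES" --graphstore "$graphstore" < "$entries"; then
      echo "indexed $kzip"
    else
      echo "failed: $kzip" >&2
      status=1
    fi
  done
  rm -f "$entries"
  return "$status"
}

# Example:
#   index_cxx_kzips .kythe_compilations/c++ leveldb:/tmp/gs
```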

Indexing the Kythe Repository

mkdir -p .kythe_{graphstore,compilations}
# .kythe_serving is the output directory for the resulting Kythe serving tables
# .kythe_graphstore is the output directory for the resulting Kythe GraphStore
# .kythe_compilations will contain the intermediary .kzip file for each
#   indexed compilation

# Produce the .kzip files for each compilation in the Kythe repo
./kythe/extractors/bazel/ "$PWD" .kythe_compilations

# Index the compilations, producing a GraphStore containing a Kythe index
bazel build //kythe/release:docker
docker run --rm \
  -v "${PWD}:/repo" \
  -v "${PWD}/.kythe_compilations:/compilations" \
  -v "${PWD}/.kythe_graphstore:/graphstore" \
  google/kythe --index

# Generate corresponding serving tables
/opt/kythe/tools/write_tables --graphstore .kythe_graphstore --out .kythe_serving
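
Before starting the indexing step, it can help to confirm that extraction actually produced compilations. The sketch below counts .kzip files per subdirectory; summarize_compilations is a hypothetical name, and the one-subdirectory-per-language layout is an assumption based on the .kythe_compilations/c++/ and .kythe_compilations/java/ paths used earlier.

```shell
# Hypothetical helper: print "<subdirectory><TAB><number of .kzip files>"
# for each immediate subdirectory of ROOT.
summarize_compilations() {
  local root="$1" lang_dir count
  for lang_dir in "$root"/*/; do
    [ -d "$lang_dir" ] || continue  # the glob matched nothing
    count=$(find "$lang_dir" -name '*.kzip' -type f | wc -l | tr -d ' ')
    printf '%s\t%s\n' "$(basename "$lang_dir")" "$count"
  done
}

# Example:
#   summarize_compilations .kythe_compilations
```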

Using Cayley to explore a GraphStore

Install Cayley if necessary.

# Convert GraphStore to nquads format
bazel run //kythe/go/storage/tools/triples -- --graphstore /path/to/graphstore | \
  gzip >kythe.nq.gz

cayley repl --dbpath kythe.nq.gz # or cayley http --dbpath kythe.nq.gz

// Get all file nodes
cayley> g.V().Has("/kythe/node/kind", "file").All()

// Get definition anchors for all record nodes
cayley> g.V().Has("/kythe/node/kind", "record").Tag("record").In("/kythe/edge/defines").All()

// Get the file(s) defining a particular node
cayley> g.V("node_ticket").In("/kythe/edge/defines").Out("/kythe/edge/childof").Has("/kythe/node/kind", "file").All()

Visualizing Cross-References

As part of Kythe’s first release, a sample UI has been made to show Kythe’s basic cross-reference capabilities. The following command can be run over the serving table created with the write_tables binary (see above).

# --listen localhost:8080 allows access from only this machine; change to
# --listen :8080 to allow access from any machine
/opt/kythe/tools/http_server \
  --public_resources /opt/kythe/web/ui \
  --listen localhost:8080 \
  --serving_table .kythe_serving