
Indexes R code into a queryable graph: nodes are definitions (functions, S4/R5/R6 classes and their methods, plain objects) and edges are function calls, plus a table of imports (library/require/requireNamespace/source). Query it with graph_search(), graph_node(), graph_callers(), and graph_callees().

Usage

build_graph(x, cache = TRUE, refresh = FALSE)

Arguments

x

A directory containing R source files, or the name of an installed package.

cache

Cache the graph (as .insider_graph.rds inside a source directory; under tools::R_user_dir("insider", "cache") for installed packages)? The cache is invalidated when source files change or the package version changes.

refresh

Force a rebuild even if a valid cache exists.
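
As a rough illustration of the cache argument, the sketch below computes the two cache locations named above and shows one way a "source files changed" check could be done. The fingerprint() helper and the content-hash approach are assumptions made for illustration only; the page does not say how build_graph() actually detects changes.

```r
# Cache locations as described for the `cache` argument.
src_dir <- tempfile("pkgsrc")
dir.create(src_dir)
writeLines("f <- function(x) x", file.path(src_dir, "code.R"))
source_cache <- file.path(src_dir, ".insider_graph.rds")
installed_cache_dir <- tools::R_user_dir("insider", which = "cache")

# Hypothetical staleness check (an assumption, not the package's method):
# fingerprint every .R file by its MD5 content hash.
fingerprint <- function(dir) {
  files <- sort(list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE))
  unname(tools::md5sum(files))
}

before <- fingerprint(src_dir)
writeLines("f <- function(x) x + 1", file.path(src_dir, "code.R"))
after <- fingerprint(src_dir)

# A changed fingerprint would mean the cached graph should be rebuilt.
cache_is_stale <- !identical(before, after)
```

Passing refresh = TRUE sidesteps any such check and rebuilds unconditionally.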

Value

An insider_graph object with data frames nodes, edges, imports, and unresolved, plus meta.

edges carries provenance for every call relationship:

  • provenance: how the edge was discovered — one of "static_parse" (source mode: exact call site from parse data), "codetools" (installed mode: codetools::findGlobals() free-variable analysis), "namespace", "s3_registry", "s4_registry", "runtime_trace", "heuristic", or "user_confirmed" (reserved for future builders / manual corrections).

  • confidence: a [0, 1] score defaulted by provenance — user_confirmed 1.00, runtime_trace 0.99, static_parse 0.95, namespace 0.90, s4_registry 0.85, codetools 0.80, s3_registry 0.70, heuristic 0.40.

  • resolved: does the edge point to a single, unambiguous target? FALSE for calls to generics whose dispatch target depends on a runtime class the static/namespace pass cannot know (currently predict(), summary(), plot(), print()).

  • notes: NA_character_ normally, or a short explanation when resolved is FALSE or the edge is otherwise noteworthy.
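
The provenance fields above can be used to filter edges by how much they are trusted. The following is a minimal sketch on a hand-built stand-in for the edges data frame: the confidence defaults are copied from the list above, while the from and to column names and the example rows are assumptions made for illustration.

```r
# Default confidence per provenance, copied from the list above.
default_confidence <- c(
  user_confirmed = 1.00, runtime_trace = 0.99, static_parse = 0.95,
  namespace = 0.90, s4_registry = 0.85, codetools = 0.80,
  s3_registry = 0.70, heuristic = 0.40
)

# Hand-built stand-in for the edges table; `from` and `to` are assumed names.
edges <- data.frame(
  from = c("f", "f", "h"),
  to = c("g", "print", "g"),
  provenance = c("static_parse", "static_parse", "heuristic"),
  resolved = c(TRUE, FALSE, TRUE),
  notes = c(NA_character_, "generic: target depends on runtime class", NA_character_),
  stringsAsFactors = FALSE
)

# Look up the default score for each edge from its provenance.
edges$confidence <- unname(default_confidence[edges$provenance])

# Keep edges that are resolved and scored at least as high as codetools.
trusted <- edges[edges$resolved & edges$confidence >= 0.80, ]
```

Here only the f to g edge survives: the print() call is unresolved and the heuristic edge falls below the threshold.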

unresolved lists references the graph could not pin to one target, one row per occurrence: from_node, reference_name (the symbol called), reference_kind (e.g. "generic_call"), candidates (a comma-separated string of node names that could plausibly be the target, or "" when none are known), reason, and — in source mode — file/line (both NA in installed mode, which has no call-site positions).
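
Because candidates is a single comma-separated string, it usually needs splitting before use. A minimal sketch on a hand-built stand-in for the unresolved table follows; the node names (fit_model, report, predict.lm, predict.glm) are hypothetical.

```r
# Hand-built stand-in for the unresolved table; all node names are hypothetical.
unresolved <- data.frame(
  from_node = c("fit_model", "report"),
  reference_name = c("predict", "print"),
  reference_kind = "generic_call",
  candidates = c("predict.lm,predict.glm", ""),
  stringsAsFactors = FALSE
)

# Split each candidates string into a character vector; "" yields character(0).
candidate_list <- lapply(
  strsplit(unresolved$candidates, ",", fixed = TRUE),
  trimws
)
names(candidate_list) <- unresolved$reference_name

# Number of plausible targets per unresolved reference.
n_candidates <- lengths(candidate_list)
```

A count of zero corresponds to the "" case described above, where no candidate targets are known.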

Details

Two modes:

  • Source directory (path with .R files, e.g. a package repo): parsed statically, so nodes and call sites carry exact files and line numbers. Nothing is executed.

  • Installed package (package name): built by namespace introspection; call edges come from static analysis of the loaded functions and have no line numbers unless the package was installed with source references.
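
The difference between the two modes can be sketched with base tooling. This is not the package's implementation, only an illustration of the underlying idea: parse data gives call sites with line numbers without running anything, whereas codetools::findGlobals() works on a function object and returns names only. The caller function is a hypothetical example.

```r
# Source mode idea: parse without evaluating, then read call sites
# (with line numbers) from the parse data.
src <- "f <- function(x) g(x)\ng <- function(x) x + 1"
exprs <- parse(text = src, keep.source = TRUE)
parse_data <- utils::getParseData(exprs)
call_sites <- parse_data[
  parse_data$token == "SYMBOL_FUNCTION_CALL",
  c("line1", "text")
]

# Installed mode idea: only function objects are available, so ask codetools
# which global functions the body refers to. No positions come back.
caller <- function(x) g(x)
called <- codetools::findGlobals(caller, merge = FALSE)$functions
```

Both approaches find the call to g, but only call_sites records that it occurs on line 1, which mirrors why file and line are NA in installed mode.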

Examples

dir <- tempfile()
dir.create(dir)
writeLines("f <- function(x) g(x)\ng <- function(x) x + 1", file.path(dir, "code.R"))
g <- build_graph(dir, cache = FALSE)
g
#> 
#> ── insideR code graph ──────────────────────────────────────────────────────────
#> Target: /tmp/RtmpJhnVG8/file18a92e2f4d1e (source mode)
#> Nodes: 2 (function: 2)
#> Call edges: 1 (1 internal)
#> Unresolved references: 0
# \donttest{
# graph an installed package (slower: introspects the whole namespace)
g_stats <- build_graph("stats", cache = FALSE)
head(graph_callers(g_stats, "var"))
#>                caller     kind file line
#> 1            StructTS function <NA>   NA
#> 2 ansari.test.default function <NA>   NA
#> 3              bw.bcv function <NA>   NA
#> 4              bw.nrd function <NA>   NA
#> 5              bw.ucv function <NA>   NA
#> 6     density.default function <NA>   NA
# }