insideR: make R package calls transparent

insideR is an experimental R package/framework for making R package calls transparent, editable, replayable, and optionally contributable.

The goal is not to replace R packages with loose scripts. The goal is to help users open up one package call, understand what happened, modify their own local copy safely, compare the result, and decide whether the change should remain local or be proposed back to the package.

Core idea

Many R users trust scripts because they can open and edit them, but they may fear packages because package logic feels hidden behind function calls. insideR addresses that fear at the level where users actually work: the individual call.

insider::unpack_call(
  predict(fit, newdata = recent_data),
  output_dir = "predict_replay"
)

A successful unpack creates a small, documented replay project. In v0.1 the core artifact is a single self-contained script:

predict_replay/
  replay.R              # documented extracted functions + input loading
                        # + the reproducing call + built-in verification
  data/
    fit.rds             # captured inputs
    recent_data.rds
    original_result.rds # the original package result
  insider_manifest.rds   # machine-readable unpack metadata
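The input-capture step behind the data/ folder can be sketched in base R. This is a minimal illustration under stated assumptions, not insideR's actual implementation: the helper name capture_inputs() is hypothetical, and it only handles arguments that are plain symbols (literal values and nested calls are skipped).

```r
# Hypothetical helper (not part of insideR): keep a call unevaluated and
# save every object its arguments name as an .rds file under data/.
capture_inputs <- function(expr, output_dir, envir = parent.frame()) {
  force(envir)                              # resolve the caller's environment first
  call <- substitute(expr)                  # the call itself, not its result
  data_dir <- file.path(output_dir, "data")
  dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

  saved <- character()
  for (arg in as.list(call)[-1]) {          # drop the function name
    if (is.symbol(arg)) {                   # only plain variable names are captured
      name <- as.character(arg)
      saveRDS(get(name, envir = envir),
              file.path(data_dir, paste0(name, ".rds")))
      saved <- c(saved, name)
    }
  }
  list(call = call, inputs = saved)
}

# Example with a built-in dataset standing in for real inputs.
fit <- lm(mpg ~ wt, data = mtcars)
recent_data <- head(mtcars, 5)
out_dir <- file.path(tempdir(), "predict_replay")
captured <- capture_inputs(predict(fit, newdata = recent_data), out_dir)

captured$inputs
list.files(file.path(out_dir, "data"))
```

Storing inputs as .rds keeps attributes and classes intact, which is what lets a later replay dispatch to the same method as the original call.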

replay.R opens with a provenance header (package, version, source repository, R version) and the findings of a static security scan of the extracted code (shell execution, network access, dynamic evaluation, file writes, global-state mutation, compiled entry points). Running source("replay.R") from the project directory re-executes the call with the extracted functions and verifies the result against the stored original — and, when the package is installed, against a live package::f() call.
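The verification idea can be sketched with base R alone: store the inputs and the original result, reload them, re-run the call, and compare. The directory name and the use of all.equal() are assumptions made for illustration; the generated replay.R may check results differently.

```r
# Assumed layout: inputs and the original result saved as .rds files,
# mirroring the data/ folder of a replay project.
data_dir <- file.path(tempdir(), "verify_sketch", "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

fit <- lm(mpg ~ wt, data = mtcars)
recent_data <- head(mtcars, 5)
saveRDS(fit, file.path(data_dir, "fit.rds"))
saveRDS(recent_data, file.path(data_dir, "recent_data.rds"))
saveRDS(predict(fit, newdata = recent_data),
        file.path(data_dir, "original_result.rds"))

# Replay: reload the inputs, re-run the call, compare with the stored original.
fit_replay <- readRDS(file.path(data_dir, "fit.rds"))
data_replay <- readRDS(file.path(data_dir, "recent_data.rds"))
replayed <- predict(fit_replay, newdata = data_replay)
original <- readRDS(file.path(data_dir, "original_result.rds"))

check <- all.equal(original, replayed)
if (isTRUE(check)) {
  message("Replay matches the stored original.")
} else {
  stop("Replay differs from the stored original: ",
       paste(check, collapse = "; "))
}
```

all.equal() tolerates tiny floating-point differences, which is usually the right default when the replay runs on a different machine; identical() would be the stricter choice.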

The richer multi-file layout below (customize/compare/proposal scripts) remains the roadmap for later versions.

What problem does insideR solve?

R packages improve reuse, testing, documentation, and maintainability. But in applied teams, especially teams used to scripts, packages can feel like black boxes. Users may worry that:

they cannot see what a function does;
they cannot safely modify behavior for their own analysis;
they do not know how to recover if they break something;
they do not know how to propose improvements;
maintainers become bottlenecks for every local variation.

insideR is designed to reduce that fear without destroying package discipline.

Design principle

Every important R package call should be able to become a transparent mini-project.

That mini-project should show:

the original call;
the resolved function or method;
the executed function path;
the relevant source code where possible;
the input objects needed to replay;
the output from the original call;
a safe customization script;
comparison between original and customized results;
replay limitations;
an optional proposal bundle for maintainers.

What insideR is not

insideR is not:

a replacement for R packages;
a replacement for Git;
a promise that every R call can become package-free code;
a full debugger;
a full dependency manager;
a way to bypass maintainers and mutate production packages directly.

It is a transparency and replay layer around package calls.

Planned user workflow

1. Explain a call

insider::explain_call(
  predict(fit, newdata = recent_data)
)

Expected output:

Original call:
predict(fit, newdata = recent_data)

Resolved method:
predict.my_model()

Function path:
1. predict.my_model()
2. build_features()
3. apply_model()
4. post_process()

Replay status:
Partial standalone replay. Requires mgcv for predict.gam().
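The "Resolved method" line depends on S3 dispatch. A rough sketch of that lookup, using utils::getS3method() from base R, is shown here. The helper resolve_s3_method() is hypothetical and ignores implicit classes (such as "matrix" or "integer") and S4 dispatch, so treat it as an approximation of what explain_call() has to do.

```r
# Hypothetical helper: walk the object's class vector in order and return
# the first S3 method registered for the generic, falling back to "default".
resolve_s3_method <- function(generic, object) {
  for (cls in c(class(object), "default")) {
    method <- utils::getS3method(generic, cls, optional = TRUE)
    if (!is.null(method)) {
      return(paste(generic, cls, sep = "."))
    }
  }
  NA_character_                             # no method found for any class
}

fit_lm <- lm(mpg ~ wt, data = mtcars)
resolve_s3_method("predict", fit_lm)        # "predict.lm"

# A glm object has class c("glm", "lm"); the more specific method wins.
fit_glm <- glm(am ~ wt, data = mtcars, family = binomial())
resolve_s3_method("predict", fit_glm)       # "predict.glm"
```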

2. Unpack a call

insider::unpack_call(
  predict(fit, newdata = recent_data),
  output_dir = "predict_replay"
)

This writes a local replay folder containing scripts, data, extracted functions, and documentation.

3. Customize locally

Users edit:

03_customize_here.R

They should not need to edit the installed package directly.

4. Compare results

insider::compare_call("predict_replay")

This compares the original output, replay output, and customized output.
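Since compare_call() is still planned, the following is only a sketch of what such a comparison could report. The helper compare_results() and its return fields are hypothetical; the example customization (flooring predictions at 20) is invented purely to produce a visible difference.

```r
# Hypothetical helper: report whether two results agree and, for numeric
# vectors of equal length, the largest absolute difference.
compare_results <- function(original, other, tolerance = 1e-8) {
  equal <- isTRUE(all.equal(original, other, tolerance = tolerance))
  comparable <- is.numeric(original) && is.numeric(other) &&
    length(original) == length(other)
  max_abs_diff <- if (comparable) max(abs(original - other)) else NA_real_
  list(equal = equal, max_abs_diff = max_abs_diff)
}

fit <- lm(mpg ~ wt, data = mtcars)
recent_data <- head(mtcars, 5)

original   <- predict(fit, newdata = recent_data)
replayed   <- predict(fit, newdata = recent_data)
customized <- pmax(original, 20)            # example local change: floor at 20

vs_replay <- compare_results(original, replayed)
vs_custom <- compare_results(original, customized)

vs_replay$equal                             # the replay should reproduce the original
vs_custom$equal                             # the customization should not
vs_custom$max_abs_diff
```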

5. Optionally propose a change

insider::propose_change("predict_replay")

This creates a reviewable bundle:

proposal/
  change_summary.md
  original_call.R
  custom_call.R
  changed_functions.diff
  before_after_comparison.md
  test_case.R
  session_info.txt

Not every customization should feed back to the package. Local changes stay local unless they reveal a general pattern, bug, documentation gap, or reusable feature.
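Because propose_change() is not implemented yet, the bundle contents can only be sketched. The example below writes two of the listed files with base R. The two functions are invented for illustration, and the line comparison uses setdiff(), which is a set difference on deparsed lines rather than a true diff.

```r
# Invented example functions standing in for an original and a customized step.
original_fn <- function(x) {
  x <- x[!is.na(x)]
  mean(x)
}
custom_fn <- function(x) {
  x <- x[!is.na(x)]
  median(x)
}

# Compare the deparsed source line by line (set difference, not a real diff).
old_lines <- deparse(original_fn)
new_lines <- deparse(custom_fn)
removed <- trimws(setdiff(old_lines, new_lines))
added   <- trimws(setdiff(new_lines, old_lines))

proposal_dir <- file.path(tempdir(), "proposal")
dir.create(proposal_dir, recursive = TRUE, showWarnings = FALSE)

# A short human-readable summary plus the session details reviewers need.
writeLines(
  c("Changed lines:", paste0("- ", removed), paste0("+ ", added)),
  file.path(proposal_dir, "change_summary.md")
)
writeLines(capture.output(sessionInfo()),
           file.path(proposal_dir, "session_info.txt"))

readLines(file.path(proposal_dir, "change_summary.md"))
```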

Next direction: function sequence extraction

The next planned direction is to make the core object a function call sequence rather than a full package copy.

The goal is to trace one package function call, record the function path, extract the relevant R functions into a local workspace, let the user modify selected steps, then redocument, replay, compare, and save the result as a reusable workflow.

Concise product sentence:

insideR traces a function call sequence, extracts the minimal reproducible logic, lets the user modify selected steps, then saves the modified sequence as a documented and testable workflow.

See docs/insider-next.md for the sequence extraction roadmap.
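One way to picture a function call sequence with a modified step is a named list of functions applied in order. This is a conceptual sketch only: the step names echo the example function path earlier in this README, but the step bodies, run_sequence(), and the list representation are assumptions, not the planned design.

```r
# A sequence as a named list of steps; each step takes the previous output.
steps <- list(
  build_features = function(x) x * 2,
  apply_model    = function(x) x + 1,
  post_process   = function(x) round(x, 1)
)

# Hypothetical runner: feed the input through every step in order.
run_sequence <- function(steps, input) {
  Reduce(function(value, step) step(value), steps, input)
}

input <- c(1.26, 2.04)
original <- run_sequence(steps, input)

# Modify one selected step in a copy; the original sequence stays untouched.
custom_steps <- steps
custom_steps$post_process <- function(x) round(x)
customized <- run_sequence(custom_steps, input)

original
customized
```

Keeping each step as a separate named element is what makes it possible to swap one step, replay the whole sequence, and compare the two results.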

Scope for the first version

The first version should support:

plain R functions;
S3 methods;
exported package functions;
non-exported namespace helpers where possible;
input capture using .rds files;
source extraction using R introspection;
simple call tree reporting;
replay project generation;
local customization;
before/after comparison;
proposal bundle generation.

The first version should not attempt to fully solve:

compiled C/C++/Fortran code;
complex S4/R6 systems;
database or API calls;
Shiny/reactive workflows;
heavy tidy-evaluation/non-standard evaluation;
parallel execution;
hidden global side effects.

Those should be reported as limitations, not hidden.
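Simple call tree reporting through R introspection can be approximated by walking a function body and collecting the names it calls. The sketch below is static and illustrative: called_functions() and call_tree() are hypothetical helpers, the three example functions are invented, and calls with empty arguments (such as x[, 1]) or dynamically constructed calls are not handled.

```r
# Collect the names of functions called anywhere inside an expression.
called_functions <- function(expr) {
  if (!is.call(expr)) return(character())
  head <- expr[[1]]
  own <- if (is.symbol(head)) as.character(head) else called_functions(head)
  nested <- unlist(lapply(as.list(expr)[-1], called_functions),
                   use.names = FALSE)
  unique(c(own, nested))
}

# Print an indented tree, expanding only functions defined in `env`.
call_tree <- function(name, env, depth = 0, max_depth = 5) {
  line <- paste0(strrep("  ", depth), name)
  fn <- get0(name, envir = env, mode = "function", inherits = FALSE)
  if (is.null(fn) || depth >= max_depth) return(line)
  callees <- called_functions(body(fn))
  c(line, unlist(lapply(callees, call_tree, env = env,
                        depth = depth + 1, max_depth = max_depth)))
}

# Invented functions standing in for package code.
pkg_env <- new.env()
pkg_env$build_features <- function(x) scale(x)
pkg_env$apply_model    <- function(x) rowSums(x)
pkg_env$predict_sketch <- function(x) apply_model(build_features(x))

tree <- call_tree("predict_sketch", pkg_env)
writeLines(tree)
```

The max_depth cut-off plays the same role as the max_depth argument of explain_call(): it bounds the report and stops recursive functions from expanding forever.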

Package-level code graph

Beyond single calls, insideR can index a whole package (or any directory of R source) into a queryable code graph of functions, S4, Reference Class (R5), and R6 classes and methods, imports, and call edges. This is similar to what tools like codegraph do for other languages, but implemented natively in R with no external runtime:

g <- build_graph("~/repos/courieR")   # source dir: exact files + line numbers
g <- build_graph("mgcv")              # installed package: namespace introspection

graph_search(g, "predict")            # find definitions by pattern
graph_node(g, "build_features")       # one definition, line-numbered source
graph_callers(g, "build_features")    # who calls it (file:line in source mode)
graph_callees(g, "predict.my_model")  # what it calls, internal vs external

Graphs are cached (.insider_graph.rds in the source dir; the user cache dir for installed packages) and rebuilt automatically when files or the package version change. Nothing is executed in source mode — it is pure static parsing.
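Pure static parsing with line numbers is possible in base R through parse(keep.source = TRUE), which attaches source references without evaluating anything. The sketch below finds top-level function definitions in a few lines of source text. find_definitions() is a hypothetical helper and much simpler than a full graph builder: it ignores nested definitions, setMethod() calls, and class declarations.

```r
# Hypothetical helper: locate top-level `name <- function(...)` definitions
# and the line each one starts on. Nothing in `lines` is executed.
find_definitions <- function(lines) {
  exprs <- parse(text = lines, keep.source = TRUE)
  srcrefs <- attr(exprs, "srcref")
  found <- list()
  for (i in seq_along(exprs)) {
    e <- exprs[[i]]
    is_assignment <- is.call(e) && length(e) == 3 &&
      (identical(e[[1]], as.name("<-")) || identical(e[[1]], as.name("=")))
    if (is_assignment && is.symbol(e[[2]]) && is.call(e[[3]]) &&
        identical(e[[3]][[1]], as.name("function"))) {
      # The first element of a srcref is the starting line number.
      found[[as.character(e[[2]])]] <- as.integer(srcrefs[[i]])[1]
    }
  }
  unlist(found)
}

source_text <- c(
  "build_features <- function(x) {",
  "  scale(x)",
  "}",
  "",
  "apply_model = function(x) rowSums(x)",
  "threshold <- 0.5"
)

defs <- find_definitions(source_text)
defs
```

For a real source directory the same approach would apply parse(file = ...) to each file and record the file name alongside the line.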

Security transparency

Transparency includes knowing what the code you are about to run or edit can do. insideR approaches this in three layers:

Static risk audit (implemented in v0.1). Every explain/unpack scans the involved functions for calls worth reviewing: shell command execution, network access, dynamic code evaluation, file system modification, environment/global-state mutation, and compiled/internal entry points. Findings appear in explain_call() output and in the replay.R header.
Supply-chain provenance (partial in v0.1). The generated script records the package name, version, and source repository, so a replay is always traceable to the code it came from. Checksum/dependency-tree verification is planned.
Sandboxed replay (planned). Replays and customized code would run in a restricted child process (no network access, writes limited to a temporary directory).

The static scan is a transparency report, not a malware detector.
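A name-based scan of this kind can be sketched with all.names() from base R. The watch list below is a small, illustrative sample chosen for this example, not insideR's actual rule set, and audit_function() is a hypothetical helper. Because it matches names rather than resolved calls, a variable that happens to be called url or parse would also be flagged, which is one reason such a scan only produces items to review.

```r
# Illustrative watch list: function name -> category to report.
risky_calls <- c(
  system        = "shell command execution",
  system2       = "shell command execution",
  download.file = "network access",
  url           = "network access",
  eval          = "dynamic code evaluation",
  parse         = "dynamic code evaluation",
  unlink        = "file system modification",
  file.remove   = "file system modification",
  assign        = "environment/global-state mutation",
  `<<-`         = "environment/global-state mutation",
  .Call         = "compiled/internal entry point",
  .C            = "compiled/internal entry point"
)

# Hypothetical helper: list watched names that appear in a function body.
audit_function <- function(fn) {
  names_used <- all.names(body(fn), functions = TRUE, unique = TRUE)
  hits <- intersect(names_used, names(risky_calls))
  risky_calls[hits]
}

# Invented function with calls worth reviewing; it is never run here.
example_fn <- function(path) {
  cmd <- paste("ls", path)
  system(cmd)
  eval(parse(text = "1 + 1"))
}

findings <- audit_function(example_fn)
findings
```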

Project status

v0.1 implements the first milestone: explain_call() and unpack_call() for plain R functions and S3 methods, with input capture, source extraction, documentation carry-over from the installed package’s help topics, self-verifying replay scripts, and the static security audit. The package passes R CMD check cleanly, and the test suite covers resolution, extraction, audit, and end-to-end replay of generated scripts.

Package API

Implemented:

explain_call(expr, max_depth = 5)
unpack_call(expr, output_dir, max_depth = 5, overwrite = FALSE)

build_graph(x, cache = TRUE, refresh = FALSE)
graph_search(graph, pattern, kind = NULL)
graph_node(graph, name)
graph_callers(graph, name)
graph_callees(graph, name, internal_only = FALSE)

Planned:

replay_call(path)
customize_call(path)
compare_call(path)
propose_change(path)
restore_replay(path)

Philosophy

The package should protect both sides:

users get visibility, safety, and freedom to experiment;
maintainers keep a clean package API and decide what becomes official.

The core promotion path is:

local customization
  -> repeated useful pattern
  -> documented recipe
  -> formal option
  -> package feature

insideR exists to make that path visible and manageable.