Query the rrlm graph and retrieve a token-budgeted context
Source: R/graph_traverse.R
query_context.Rd
Performs a relevance-guided breadth-first search starting from a
seed node and builds a context string that fits within
budget_tokens. The budget is a hard constraint: the
function never returns more tokens than requested.
Usage
query_context(
  graph,
  query,
  seed_node = NULL,
  budget_tokens = 2000L,
  min_relevance = 0.1,
  max_nodes = 20L,
  method = NULL,
  verbose = FALSE
)
Arguments
- graph
An rrlm_graph/igraph object created by build_rrlm_graph().
- query
Character(1). User query string.
- seed_node
Character(1) or NULL. Name of the vertex to start traversal from. NULL (default) triggers automatic selection: the function-type node with the highest PageRank.
- budget_tokens
Integer(1). Hard token limit. Default 2000L.
- min_relevance
Numeric(1). Minimum relevance score \([0, 1]\) for a node to be admitted. Default 0.1.
- max_nodes
Integer(1). Maximum number of nodes (excluding the seed) to absorb. Default 20L.
- method
Character(1) or NULL. Embedding method passed to embed_query(). NULL (default) reads the "embed_model" graph attribute.
- verbose
Logical(1). Print progress messages via cli_inform(). Default FALSE.
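The default seed selection described for seed_node can be illustrated with a small base-R sketch. This is not the package's code: the vertex table, its column names (name, node_type, pagerank) and the helper pick_seed() are hypothetical and exist only for this example.

```r
# Hypothetical vertex table; the column names are assumptions for illustration.
nodes <- data.frame(
  name      = c("load_data", "fit_model", "README", "plot_fit"),
  node_type = c("function", "function", "document", "function"),
  pagerank  = c(0.31, 0.24, 0.35, 0.10),
  stringsAsFactors = FALSE
)

# When seed_node is NULL, pick the function-type node with the highest
# PageRank; otherwise check that the supplied name exists and use it.
pick_seed <- function(nodes, seed_node = NULL) {
  if (is.null(seed_node)) {
    fn <- nodes[nodes$node_type == "function", , drop = FALSE]
    if (nrow(fn) == 0L) stop("no function-type nodes available")
    return(fn$name[which.max(fn$pagerank)])
  }
  if (!seed_node %in% nodes$name) stop("unknown seed node: ", seed_node)
  seed_node
}

pick_seed(nodes)              # "load_data": README ranks higher but is not a function
pick_seed(nodes, "plot_fit")  # "plot_fit"
```

Restricting the candidates to function-type nodes before ranking is what keeps a highly linked non-function vertex from becoming the seed.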
Value
A named list with class c("rrlm_context", "list"):
- nodes
Character vector of absorbed node names, relevance-ordered, seed first.
- context_string
Character(1) assembled by assemble_context_string().
- tokens_used
Integer(1). Approximate token count of context_string.
- budget_tokens
Integer(1). The budget that was used.
- seed_node
Character(1). The seed node name.
- relevance_scores
Named numeric vector of relevance scores for every absorbed node (seed included).
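To make the shape of the return value concrete, the sketch below builds a mock object with the documented fields and checks the documented invariants. Every value in it is invented for illustration; a real result would come from query_context().

```r
# Mock object mirroring the documented fields; all values are invented.
ctx <- structure(
  list(
    nodes            = c("load_data", "read_csv_safe"),
    context_string   = "load_data <- function(path) ...",
    tokens_used      = 8L,
    budget_tokens    = 1000L,
    seed_node        = "load_data",
    relevance_scores = c(load_data = 1, read_csv_safe = 0.62)
  ),
  class = c("rrlm_context", "list")
)

stopifnot(inherits(ctx, "rrlm_context"))
stopifnot(ctx$tokens_used <= ctx$budget_tokens)     # hard budget
stopifnot(identical(ctx$nodes[1], ctx$seed_node))   # seed comes first
stopifnot(setequal(names(ctx$relevance_scores), ctx$nodes))

# Share of the budget that was consumed.
budget_share <- ctx$tokens_used / ctx$budget_tokens
```

Checks like these are cheap to run on a real result before passing context_string on to a model.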
Algorithm
1. Embed query with embed_query().
2. Identify the seed node: if seed_node is NULL, select the function-type vertex with the highest pre-computed PageRank; otherwise validate and use the supplied name.
3. Initialise visited = {seed} and frontier = neighbours(seed).
4. BFS loop while tokens_used < budget_tokens and frontier is non-empty:
   - Score every frontier node with compute_relevance().
   - Select the best-scoring node with score >= min_relevance.
   - Compute its token cost (.count_tokens()).
   - Accept the node only if adding it stays within the budget; otherwise skip and try the next-best.
   - Mark as visited; expand its neighbours into the frontier.
5. Call update_task_weights() to update the learning trace.
6. Assemble the final context string with assemble_context_string().
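The budgeted traversal loop can be sketched in base R on a toy graph. This is a minimal illustration under stated assumptions, not the package implementation: the adjacency list, the fixed relevance scores and token costs, and the helper budgeted_bfs() are all hypothetical stand-ins for the graph, compute_relevance() and .count_tokens(). Embedding, the learning-trace update and context assembly are left out.

```r
# Toy inputs (all hypothetical): adjacency list, relevance scores, token costs.
adj <- list(
  a = c("b", "c"),
  b = c("a", "d"),
  c = c("a"),
  d = c("b")
)
relevance <- c(a = 1.0, b = 0.8, c = 0.05, d = 0.6)
cost      <- c(a = 40L, b = 50L, c = 10L, d = 30L)

budgeted_bfs <- function(adj, relevance, cost, seed,
                         budget_tokens, min_relevance = 0.1, max_nodes = 20L) {
  visited <- seed
  used    <- cost[[seed]]
  if (used > budget_tokens) stop("seed alone exceeds the budget")
  frontier <- setdiff(adj[[seed]], visited)

  while (used < budget_tokens && length(frontier) > 0L &&
         length(visited) - 1L < max_nodes) {
    # Drop frontier nodes under the threshold, then rank the rest by relevance.
    cand <- frontier[relevance[frontier] >= min_relevance]
    cand <- cand[order(relevance[cand], decreasing = TRUE)]
    # Skip candidates that would overshoot the budget; take the best that fits.
    fits <- cand[used + cost[cand] <= budget_tokens]
    if (length(fits) == 0L) break
    pick     <- fits[1]
    visited  <- c(visited, pick)
    used     <- used + cost[[pick]]
    # Expand the accepted node's neighbours into the frontier.
    frontier <- setdiff(union(frontier, adj[[pick]]), visited)
  }
  list(nodes = visited, tokens_used = used)
}

res <- budgeted_bfs(adj, relevance, cost, seed = "a", budget_tokens = 100L)
res$nodes        # "a" "b"
res$tokens_used  # 90
```

In this run node d is relevant enough but would push the total to 120 tokens, so it is skipped, and node c falls below min_relevance. The loop therefore stops at 90 tokens rather than exceeding the 100-token budget.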
Examples
if (FALSE) { # \dontrun{
  g <- build_rrlm_graph("mypkg")
  ctx <- query_context(g, "load training data", budget_tokens = 1000L)
  cat(ctx$context_string)
} # }