
Building Functional Models with kerasnip
This vignette demonstrates how to use the create_keras_functional_spec() function to build complex, non-linear Keras models that integrate seamlessly with the tidymodels ecosystem.
When to Use the Functional API
While create_keras_sequential_spec() is perfect for models that are a simple, linear stack of layers, many advanced architectures are not linear. The Keras Functional API is designed for these cases. You should use create_keras_functional_spec() when your model has:
- Multiple input or multiple output layers.
- Shared layers between different branches.
- Residual connections (e.g., ResNets), where a layer’s input is added to its output.
- Any other non-linear topology.
kerasnip makes it easy to define these architectures by automatically connecting a graph of layer blocks.
The Core Concept: Building a Graph
kerasnip builds the model's graph by inspecting the layer_blocks you provide. The connection logic is simple but powerful:
- The names of the list elements in layer_blocks define the names of the nodes in your graph (e.g., main_input, dense_path, output).
- The names of the arguments in each block function specify its inputs. A block function like my_block <- function(input_a, input_b, ...) declares that it needs input from the nodes named input_a and input_b.
There are two special requirements:
- Input Block: The first block in the list is treated as the main input node. Its function should not take other blocks as input.
- Output Block: Exactly one block must be named "output". The tensor returned by this block is used as the final output of the Keras model.
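For example, in the minimal sketch below (using hypothetical node names, and assuming keras3 is loaded), the argument main_input in the hidden block declares an edge from the node named main_input:
# A hypothetical three-node graph. Argument names declare the edges:
# 'hidden' consumes 'main_input', and 'output' consumes 'hidden'.
blocks_sketch <- list(
  main_input = function(input_shape) layer_input(shape = input_shape),
  hidden = function(main_input, units = 8) {
    main_input |> layer_dense(units = units, activation = "relu")
  },
  output = function(hidden) layer_dense(hidden, units = 1)
)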
Let’s see this in action.
Example 1: A Fork-Join Regression Model
We will build a model that forks the input, passes it through two separate dense layer paths, and then joins the results with a concatenation layer before producing a final prediction.
Step 1: Load Libraries
First, we load the necessary packages.
library(kerasnip)
library(tidymodels)
library(keras3)
## ── Attaching packages ────────────────────────────────────── tidymodels 1.3.0 ──
## ✔ broom 1.0.9 ✔ recipes 1.3.1
## ✔ dials 1.4.1 ✔ rsample 1.3.1
## ✔ dplyr 1.1.4 ✔ tibble 3.3.0
## ✔ ggplot2 3.5.2 ✔ tidyr 1.3.1
## ✔ infer 1.0.9 ✔ tune 1.3.0
## ✔ modeldata 1.4.0 ✔ workflows 1.2.0
## ✔ parsnip 1.3.2 ✔ workflowsets 1.1.1
## ✔ purrr 1.1.0 ✔ yardstick 1.3.2
## ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
## ✖ purrr::discard() masks scales::discard()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ recipes::step() masks stats::step()
##
## Attaching package: 'keras3'
## The following object is masked from 'package:yardstick':
##
## get_weights
# Silence the startup messages from remove_keras_spec
options(kerasnip.show_removal_messages = FALSE)
Step 2: Define Layer Blocks
These are the building blocks of our model. Each function represents a node in the graph.
# The input node. `input_shape` is supplied automatically by the engine.
input_block <- function(input_shape) {
layer_input(shape = input_shape)
}
# A generic block for a dense path. `units` will be a tunable parameter.
path_block <- function(tensor, units = 16) {
tensor |> layer_dense(units = units, activation = "relu")
}
# A block to join two tensors.
concat_block <- function(input_a, input_b) {
layer_concatenate(list(input_a, input_b))
}
# The final output block for regression.
output_block_reg <- function(tensor) {
layer_dense(tensor, units = 1)
}
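Because each block is an ordinary R function that returns a Keras tensor, you can sanity-check a block in isolation before wiring up the graph. A quick sketch (the input shape here is arbitrary):
# Call a block directly on a symbolic input tensor to confirm it composes.
x <- input_block(input_shape = 3)
path_block(x, units = 4)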
Step 3: Create the Model Specification
Now we assemble the blocks into a graph. We use the inp_spec() helper to connect the blocks. This avoids writing verbose anonymous functions like function(main_input, units) path_block(main_input, units). inp_spec() automatically creates a wrapper that renames the arguments of our blocks to match the node names from the layer_blocks list.
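To make the mapping concrete, the two definitions below are equivalent; the names path_a_by_hand and path_a_wrapped are ours, for illustration only:
# Verbose: write the renaming wrapper by hand.
path_a_by_hand <- function(main_input, units = 16) {
  path_block(main_input, units = units)
}
# Concise: inp_spec() builds an equivalent wrapper automatically.
path_a_wrapped <- inp_spec(path_block, "main_input")
With that in hand, we can assemble the full specification: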
model_name <- "forked_reg_spec"
# Clean up the spec when the vignette is done knitting
on.exit(remove_keras_spec(model_name), add = TRUE)
create_keras_functional_spec(
model_name = model_name,
layer_blocks = list(
# Node names are defined by the list names
main_input = input_block,
# `inp_spec()` renames the first argument of `path_block` ('tensor')
# to 'main_input' to match the node name.
path_a = inp_spec(path_block, "main_input"),
path_b = inp_spec(path_block, "main_input"),
# For multiple inputs, `inp_spec()` takes a named vector to map
# new argument names to the original block's argument names.
concatenated = inp_spec(concat_block, c(path_a = "input_a", path_b = "input_b")),
# The output block takes the concatenated tensor as its input.
output = inp_spec(output_block_reg, "concatenated")
),
mode = "regression"
)
Step 4: Use and Fit the Model
The new function forked_reg_spec() is now available. Its arguments (path_a_units, path_b_units) were discovered automatically from our block definitions.
# We can override the default `units` from `path_block` for each path.
spec <- forked_reg_spec(
path_a_units = 16,
path_b_units = 8,
fit_epochs = 10,
fit_verbose = 0 # Suppress fitting output in vignette
) |>
set_engine("keras")
print(spec)
## forked reg spec Model Specification (regression)
##
## Main Arguments:
## num_main_input = structure(list(), class = "rlang_zap")
## num_path_a = structure(list(), class = "rlang_zap")
## num_path_b = structure(list(), class = "rlang_zap")
## num_concatenated = structure(list(), class = "rlang_zap")
## num_output = structure(list(), class = "rlang_zap")
## path_a_units = 16
## path_b_units = 8
## learn_rate = structure(list(), class = "rlang_zap")
## fit_batch_size = structure(list(), class = "rlang_zap")
## fit_epochs = 10
## fit_callbacks = structure(list(), class = "rlang_zap")
## fit_validation_split = structure(list(), class = "rlang_zap")
## fit_validation_data = structure(list(), class = "rlang_zap")
## fit_shuffle = structure(list(), class = "rlang_zap")
## fit_class_weight = structure(list(), class = "rlang_zap")
## fit_sample_weight = structure(list(), class = "rlang_zap")
## fit_initial_epoch = structure(list(), class = "rlang_zap")
## fit_steps_per_epoch = structure(list(), class = "rlang_zap")
## fit_validation_steps = structure(list(), class = "rlang_zap")
## fit_validation_batch_size = structure(list(), class = "rlang_zap")
## fit_validation_freq = structure(list(), class = "rlang_zap")
## fit_verbose = 0
## fit_view_metrics = structure(list(), class = "rlang_zap")
## compile_optimizer = structure(list(), class = "rlang_zap")
## compile_loss = structure(list(), class = "rlang_zap")
## compile_metrics = structure(list(), class = "rlang_zap")
## compile_loss_weights = structure(list(), class = "rlang_zap")
## compile_weighted_metrics = structure(list(), class = "rlang_zap")
## compile_run_eagerly = structure(list(), class = "rlang_zap")
## compile_steps_per_execution = structure(list(), class = "rlang_zap")
## compile_jit_compile = structure(list(), class = "rlang_zap")
## compile_auto_scale_loss = structure(list(), class = "rlang_zap")
##
## Computational engine: keras
# Fit the model on the mtcars dataset
rec <- recipe(mpg ~ ., data = mtcars)
wf <- workflow() |>
add_recipe(rec) |>
add_model(spec)
fit_obj <- fit(wf, data = mtcars)
predict(fit_obj, new_data = mtcars[1:5, ])
## 1/1 - 0s - 39ms/step
## # A tibble: 5 × 1
## .pred
## <dbl>
## 1 16.0
## 2 16.0
## 3 13.5
## 4 12.7
## 5 21.5
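As a quick sanity check of the fit, we can score the predictions with yardstick. This is an in-sample estimate, so treat it as illustrative only:
# Compare predictions to the observed mpg values (in-sample; prefer a
# held-out set or resampling for honest performance estimates).
predict(fit_obj, new_data = mtcars) |>
  bind_cols(mtcars["mpg"]) |>
  rmse(truth = mpg, estimate = .pred)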
Example 2: Tuning a Functional Model’s Depth
A key feature of kerasnip is the ability to tune the depth of the network by repeating a block multiple times. A block can be repeated if it has exactly one input tensor from another block in the graph.
Let's create a simple functional model and tune both its width (units) and its depth (num_...).
Step 1: Define Blocks and Create Spec
This model is architecturally sequential, but we build it with the functional API to demonstrate the repetition feature.
dense_block <- function(tensor, units = 16) {
tensor |> layer_dense(units = units, activation = "relu")
}
output_block_class <- function(tensor, num_classes) {
tensor |> layer_dense(units = num_classes, activation = "softmax")
}
model_name_tune <- "tunable_func_mlp"
on.exit(remove_keras_spec(model_name_tune), add = TRUE)
create_keras_functional_spec(
model_name = model_name_tune,
layer_blocks = list(
main_input = input_block,
# This block has a single input ('main_input'), so it can be repeated.
dense_path = inp_spec(dense_block, "main_input"),
output = inp_spec(output_block_class, "dense_path")
),
mode = "classification"
)
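Because the generated function's arguments follow the node names, one quick way to confirm that the depth argument exists is to inspect the function's formals (a sketch):
# The generated spec function should list num_dense_path and
# dense_path_units among its arguments.
names(formals(tunable_func_mlp))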
Step 2: Set up and Run Tuning
We will tune dense_path_units (the width) and num_dense_path (the depth). The num_dense_path argument was created automatically because dense_path is a repeatable block.
tune_spec <- tunable_func_mlp(
dense_path_units = tune(),
num_dense_path = tune(),
fit_epochs = 5,
fit_verbose = 0
) |>
set_engine("keras")
rec <- recipe(Species ~ ., data = iris)
tune_wf <- workflow() |>
add_recipe(rec) |>
add_model(tune_spec)
folds <- vfold_cv(iris, v = 2)
# Define the tuning grid
params <- extract_parameter_set_dials(tune_wf) |>
update(
dense_path_units = hidden_units(c(8, 32)),
num_dense_path = num_terms(c(1, 3)) # Hidden-layer depth can range from 1 to 3
)
grid <- grid_regular(params, levels = 2)
grid
## # A tibble: 4 × 2
## num_dense_path dense_path_units
## <int> <int>
## 1 1 8
## 2 3 8
## 3 1 32
## 4 3 32
control <- control_grid(save_pred = FALSE, verbose = FALSE)
tune_res <- tune_grid(
tune_wf,
resamples = folds,
grid = grid,
control = control
)
## 3/3 - 0s - 16ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 21ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 16ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 21ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 15ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 20ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 15ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 21ms/step
## 3/3 - 0s - 7ms/step
show_best(tune_res, metric = "accuracy")
## # A tibble: 4 × 8
## num_dense_path dense_path_units .metric .estimator mean n std_err .config
## <int> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 3 32 accura… multiclass 0.853 2 0.0267 Prepro…
## 2 1 32 accura… multiclass 0.813 2 0.133 Prepro…
## 3 3 8 accura… multiclass 0.493 2 0.16 Prepro…
## 4 1 8 accura… multiclass 0.3 2 0.0333 Prepro…
The results show that tidymodels successfully trained and evaluated models with different numbers of hidden layers, demonstrating that we can tune the very architecture of the network.
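From here, the standard tune workflow applies: select the best configuration and refit on the full training data. A minimal sketch:
# Finalize the workflow with the best architecture and refit on all rows.
best <- select_best(tune_res, metric = "accuracy")
final_fit <- tune_wf |>
  finalize_workflow(best) |>
  fit(data = iris)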
Conclusion
The create_keras_functional_spec() function provides a powerful and intuitive way to define, fit, and tune complex Keras models within the tidymodels framework. By defining the model as a graph of connected blocks, you can represent nearly any architecture while kerasnip handles the boilerplate of integrating it with parsnip, dials, and tune.