
Building Functional Models with kerasnip
This vignette demonstrates how to use the create_keras_functional_spec() function to build complex, non-linear Keras models that integrate seamlessly with the tidymodels ecosystem.
When to Use the Functional API
While create_keras_sequential_spec() is perfect for models that are a simple, linear stack of layers, many advanced architectures are not linear. The Keras Functional API is designed for these cases. You should use create_keras_functional_spec() when your model has:
- Multiple input or multiple output layers.
- Shared layers between different branches.
- Residual connections (e.g., ResNets), where a layer’s input is added to its output.
- Any other non-linear topology.
kerasnip makes it easy to define these architectures by automatically connecting a graph of layer blocks.
The Core Concept: Building a Graph
kerasnip builds the model's graph by inspecting the layer_blocks you provide. The connection logic is simple but powerful:
- The names of the list elements in layer_blocks define the names of the nodes in your graph (e.g., main_input, dense_path, output).
- The names of the arguments in each block function specify its inputs. A block function like my_block <- function(input_a, input_b, ...) declares that it needs input from the nodes named input_a and input_b.
There are two special requirements:
- Input Block: The first block in the list is treated as the main input node. Its function should not take other blocks as input.
- Output Block: Exactly one block must be named "output". The tensor returned by this block is used as the final output of the Keras model.
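For example, in the minimal sketch below (using hypothetical node names, and assuming keras3 is loaded), the argument main_input in the hidden block declares an edge from the node named main_input:
# A hypothetical three-node graph. Argument names declare the edges:
# 'hidden' consumes 'main_input', and 'output' consumes 'hidden'.
blocks_sketch <- list(
  main_input = function(input_shape) layer_input(shape = input_shape),
  hidden = function(main_input, units = 8) {
    main_input |> layer_dense(units = units, activation = "relu")
  },
  output = function(hidden) layer_dense(hidden, units = 1)
)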
Let’s see this in action.
Example 1: A Fork-Join Regression Model
We will build a model that forks the input, passes it through two separate dense layer paths, and then joins the results with a concatenation layer before producing a final prediction.
Step 1: Load Libraries
First, we load the necessary packages.
library(kerasnip)
library(tidymodels)
library(keras3)
## ── Attaching packages ────────────────────────────────────── tidymodels 1.3.0 ──
## ✔ broom 1.0.9 ✔ recipes 1.3.1
## ✔ dials 1.4.1 ✔ rsample 1.3.1
## ✔ dplyr 1.1.4 ✔ tibble 3.3.0
## ✔ ggplot2 3.5.2 ✔ tidyr 1.3.1
## ✔ infer 1.0.9 ✔ tune 1.3.0
## ✔ modeldata 1.4.0 ✔ workflows 1.2.0
## ✔ parsnip 1.3.2 ✔ workflowsets 1.1.1
## ✔ purrr 1.1.0 ✔ yardstick 1.3.2
## ── Conflicts ───────────────────────────────────────── tidymodels_conflicts() ──
## ✖ purrr::discard() masks scales::discard()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ recipes::step() masks stats::step()
##
## Attaching package: 'keras3'
## The following object is masked from 'package:yardstick':
##
## get_weights
# Silence the startup messages from remove_keras_spec
options(kerasnip.show_removal_messages = FALSE)
Step 2: Define Layer Blocks
These are the building blocks of our model. Each function represents a node in the graph.
# The input node. `input_shape` is supplied automatically by the engine.
input_block <- function(input_shape) {
layer_input(shape = input_shape)
}
# A generic block for a dense path. `units` will be a tunable parameter.
path_block <- function(tensor, units = 16) {
tensor |> layer_dense(units = units, activation = "relu")
}
# A block to join two tensors.
concat_block <- function(input_a, input_b) {
layer_concatenate(list(input_a, input_b))
}
# The final output block for regression.
output_block_reg <- function(tensor) {
layer_dense(tensor, units = 1)
}
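Because each block is an ordinary R function that returns a Keras tensor, you can sanity-check a block in isolation before wiring up the graph. A quick sketch (the input shape here is arbitrary):
# Call a block directly on a symbolic input tensor to confirm it composes.
x <- input_block(input_shape = 3)
path_block(x, units = 4)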
Step 3: Create the Model Specification
Now we assemble the blocks into a graph. We use the inp_spec() helper to connect the blocks. This avoids writing verbose anonymous functions like function(main_input, units) path_block(main_input, units). inp_spec() automatically creates a wrapper that renames the arguments of our blocks to match the node names from the layer_blocks list.
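To make the mapping concrete, the two definitions below are equivalent; the names path_a_by_hand and path_a_wrapped are ours, for illustration only:
# Verbose: write the renaming wrapper by hand.
path_a_by_hand <- function(main_input, units = 16) {
  path_block(main_input, units = units)
}
# Concise: inp_spec() builds an equivalent wrapper automatically.
path_a_wrapped <- inp_spec(path_block, "main_input")
With that in hand, we can assemble the full specification: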
model_name <- "forked_reg_spec"
# Clean up the spec when the vignette is done knitting
on.exit(remove_keras_spec(model_name), add = TRUE)
create_keras_functional_spec(
model_name = model_name,
layer_blocks = list(
# Node names are defined by the list names
main_input = input_block,
# `inp_spec()` renames the first argument of `path_block` ('tensor')
# to 'main_input' to match the node name.
path_a = inp_spec(path_block, "main_input"),
path_b = inp_spec(path_block, "main_input"),
# For multiple inputs, `inp_spec()` takes a named vector to map
# new argument names to the original block's argument names.
concatenated = inp_spec(concat_block, c(path_a = "input_a", path_b = "input_b")),
# The output block takes the concatenated tensor as its input.
output = inp_spec(output_block_reg, "concatenated")
),
mode = "regression"
)
Step 4: Use and Fit the Model
The new function forked_reg_spec() is now available. Its arguments (path_a_units, path_b_units) were discovered automatically from our block definitions.
# We can override the default `units` from `path_block` for each path.
spec <- forked_reg_spec(
path_a_units = 16,
path_b_units = 8,
fit_epochs = 10,
fit_verbose = 0 # Suppress fitting output in vignette
) |>
set_engine("keras")
print(spec)
## forked reg spec Model Specification (regression)
##
## Main Arguments:
## num_main_input = structure(list(), class = "rlang_zap")
## num_path_a = structure(list(), class = "rlang_zap")
## num_path_b = structure(list(), class = "rlang_zap")
## num_concatenated = structure(list(), class = "rlang_zap")
## num_output = structure(list(), class = "rlang_zap")
## path_a_units = 16
## path_b_units = 8
## learn_rate = structure(list(), class = "rlang_zap")
## fit_batch_size = structure(list(), class = "rlang_zap")
## fit_epochs = 10
## fit_callbacks = structure(list(), class = "rlang_zap")
## fit_validation_split = structure(list(), class = "rlang_zap")
## fit_validation_data = structure(list(), class = "rlang_zap")
## fit_shuffle = structure(list(), class = "rlang_zap")
## fit_class_weight = structure(list(), class = "rlang_zap")
## fit_sample_weight = structure(list(), class = "rlang_zap")
## fit_initial_epoch = structure(list(), class = "rlang_zap")
## fit_steps_per_epoch = structure(list(), class = "rlang_zap")
## fit_validation_steps = structure(list(), class = "rlang_zap")
## fit_validation_batch_size = structure(list(), class = "rlang_zap")
## fit_validation_freq = structure(list(), class = "rlang_zap")
## fit_verbose = 0
## fit_view_metrics = structure(list(), class = "rlang_zap")
## compile_optimizer = structure(list(), class = "rlang_zap")
## compile_loss = structure(list(), class = "rlang_zap")
## compile_metrics = structure(list(), class = "rlang_zap")
## compile_loss_weights = structure(list(), class = "rlang_zap")
## compile_weighted_metrics = structure(list(), class = "rlang_zap")
## compile_run_eagerly = structure(list(), class = "rlang_zap")
## compile_steps_per_execution = structure(list(), class = "rlang_zap")
## compile_jit_compile = structure(list(), class = "rlang_zap")
## compile_auto_scale_loss = structure(list(), class = "rlang_zap")
##
## Computational engine: keras
# Fit the model on the mtcars dataset
rec <- recipe(mpg ~ ., data = mtcars)
wf <- workflow() |>
add_recipe(rec) |>
add_model(spec)
fit_obj <- fit(wf, data = mtcars)
predict(fit_obj, new_data = mtcars[1:5, ])
## 1/1 - 0s - 39ms/step
## # A tibble: 5 × 1
## .pred
## <dbl>
## 1 16.0
## 2 16.0
## 3 13.5
## 4 12.7
## 5 21.5
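As a quick sanity check of the fit, we can score the predictions with yardstick. This is an in-sample estimate, so treat it as illustrative only:
# Compare predictions to the observed mpg values (in-sample; prefer a
# held-out set or resampling for honest performance estimates).
predict(fit_obj, new_data = mtcars) |>
  bind_cols(mtcars["mpg"]) |>
  rmse(truth = mpg, estimate = .pred)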
Example 2: Tuning a Functional Model’s Depth
A key feature of kerasnip is the ability to tune the depth of the network by repeating a block multiple times. A block can be repeated if it has exactly one input tensor from another block in the graph.
Let's create a simple functional model and tune both its width (units) and its depth (num_...).
Step 1: Define Blocks and Create Spec
This model is architecturally sequential, but we build it with the functional API to demonstrate the repetition feature.
dense_block <- function(tensor, units = 16) {
tensor |> layer_dense(units = units, activation = "relu")
}
output_block_class <- function(tensor, num_classes) {
tensor |> layer_dense(units = num_classes, activation = "softmax")
}
model_name_tune <- "tunable_func_mlp"
on.exit(remove_keras_spec(model_name_tune), add = TRUE)
create_keras_functional_spec(
model_name = model_name_tune,
layer_blocks = list(
main_input = input_block,
# This block has a single input ('main_input'), so it can be repeated.
dense_path = inp_spec(dense_block, "main_input"),
output = inp_spec(output_block_class, "dense_path")
),
mode = "classification"
)
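Because the generated function's arguments follow the node names, one quick way to confirm that the depth argument exists is to inspect the function's formals (a sketch):
# The generated spec function should list num_dense_path and
# dense_path_units among its arguments.
names(formals(tunable_func_mlp))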
Step 2: Set up and Run Tuning
We will tune dense_path_units (the width) and num_dense_path (the depth). The num_dense_path argument was created automatically because dense_path is a repeatable block.
tune_spec <- tunable_func_mlp(
dense_path_units = tune(),
num_dense_path = tune(),
fit_epochs = 5,
fit_verbose = 0
) |>
set_engine("keras")
rec <- recipe(Species ~ ., data = iris)
tune_wf <- workflow() |>
add_recipe(rec) |>
add_model(tune_spec)
folds <- vfold_cv(iris, v = 2)
# Define the tuning grid
params <- extract_parameter_set_dials(tune_wf) |>
update(
dense_path_units = hidden_units(c(8, 32)),
num_dense_path = num_terms(c(1, 3)) # Hidden-layer depth can range from 1 to 3
)
grid <- grid_regular(params, levels = 2)
grid
## # A tibble: 4 × 2
## num_dense_path dense_path_units
## <int> <int>
## 1 1 8
## 2 3 8
## 3 1 32
## 4 3 32
control <- control_grid(save_pred = FALSE, verbose = FALSE)
tune_res <- tune_grid(
tune_wf,
resamples = folds,
grid = grid,
control = control
)
## 3/3 - 0s - 16ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 21ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 16ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 21ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 15ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 20ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 15ms/step
## 3/3 - 0s - 7ms/step
## 3/3 - 0s - 21ms/step
## 3/3 - 0s - 7ms/step
show_best(tune_res, metric = "accuracy")
## # A tibble: 4 × 8
## num_dense_path dense_path_units .metric .estimator mean n std_err .config
## <int> <int> <chr> <chr> <dbl> <int> <dbl> <chr>
## 1 3 32 accura… multiclass 0.853 2 0.0267 Prepro…
## 2 1 32 accura… multiclass 0.813 2 0.133 Prepro…
## 3 3 8 accura… multiclass 0.493 2 0.16 Prepro…
## 4 1 8 accura… multiclass 0.3 2 0.0333 Prepro…
The results show that tidymodels successfully trained and evaluated models with different numbers of hidden layers, demonstrating that we can tune the very architecture of the network.
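From here, the standard tune workflow applies: select the best configuration and refit on the full training data. A minimal sketch:
# Finalize the workflow with the best architecture and refit on all rows.
best <- select_best(tune_res, metric = "accuracy")
final_fit <- tune_wf |>
  finalize_workflow(best) |>
  fit(data = iris)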
Conclusion
The create_keras_functional_spec() function provides a powerful and intuitive way to define, fit, and tune complex Keras models within the tidymodels framework. By defining the model as a graph of connected blocks, you can represent nearly any architecture while kerasnip handles the boilerplate of integrating it with parsnip, dials, and tune.