Layer Wrappers
NaiveNASflux wraps Flux layers in mutable wrapper types by default so that the layer computed by a vertex can be mutated without having to recreate the whole model. Additional wrappers which might be useful are described here.
NaiveNASflux.ActivationContribution — Type

ActivationContribution{L,C,M} <: AbstractMutableComp
ActivationContribution(l)
ActivationContribution(l, method)
Calculate neuron utility based on activations and gradients using method.

Designed to be used as the layerfun argument to fluxvertex.

Can be a performance bottleneck in cases with large activations. Use NeuronUtilityEvery to mitigate.
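For orientation, here is a minimal sketch of passing it as layerfun; the vertex name, layer sizes and the composition with LazyMutable are illustrative assumptions, not requirements:

julia> using NaiveNASflux, Flux

julia> iv = denseinputvertex("in", 2);

julia> v = fluxvertex(Dense(2 => 3, relu), iv; layerfun=ActivationContribution);

julia> v2 = fluxvertex(Dense(2 => 3, relu), iv; layerfun=LazyMutable ∘ ActivationContribution);  # keep mutation lazy while still tracking utility

A custom utility method goes in through the two-argument form ActivationContribution(l, method).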
Default method is described in https://arxiv.org/abs/1611.06440.

Short summary is that the first order Taylor approximation of the optimization problem "which neurons shall I remove to minimize impact on the loss function?" boils down to "the ones which minimize abs(gradient * activation)" (assuming parameter independence).
NaiveNASflux.LazyMutable — Type

LazyMutable
LazyMutable(m::AbstractMutableComp)
Lazy version of MutableLayer in the sense that it does not perform any mutations until invoked to perform a computation.
This reduces the need to garbage collect when multiple mutations might be applied to a vertex before evaluating the model.
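As a rough sketch of what the laziness buys (the vertex name, layer sizes and requested size changes below are arbitrary assumptions for illustration):

julia> using NaiveNASflux, Flux

julia> iv = denseinputvertex("in", 3);

julia> v = fluxvertex(Dense(3 => 5, relu), iv; layerfun=LazyMutable);

julia> Δnout!(v => 2);    # request two extra output neurons...

julia> Δnout!(v => -1);   # ...then drop one again; no new Dense has been created so far

julia> g = CompGraph(iv, v);

julia> out = g(ones(Float32, 3, 1));   # the wrapped layer is rebuilt once, here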
Also usable for factory-like designs where the actual layers of a computation graph are not instantiated until the graph is used.
Examples
julia> using NaiveNASflux, Flux
julia> struct DenseConfig end
julia> lazy = LazyMutable(DenseConfig(), 2, 3);
julia> layer(lazy)
DenseConfig()
julia> function NaiveNASflux.dispatch!(m::LazyMutable, ::DenseConfig, x)
           m.mutable = Dense(nin(m)[1], nout(m), relu)
           return m.mutable(x)
       end;
julia> lazy(ones(Float32, 2, 5)) |> size
(3, 5)
julia> layer(lazy)
Dense(2 => 3, relu) # 9 parameters