Iterator Maps
Iterator map is the name chosen (for lack of a better name) for a mapping from one iterator to another. The main use cases are:
- Limiting the batch size of a candidate to prevent out of memory errors (see Batch Size Utilities).
- Enabling search for the best training batch size, using e.g. `TrainBatchSizeMutation` and/or `IteratorMapCrossover` (a crossover sketch is included at the end of this page).
- Enabling search for the best data augmentation setting (not part of this package as of yet).
Iterator maps are intended to be used with `CandidateDataIterMap` and must extend `AbstractIteratorMap`. See the API documentation for functions related to iterator maps.
In an attempt to kill two birds with one stone, here is an example of a custom iterator map which logs the sizes of the batches returned by a wrapped iterator. This lets us see the effects of `BatchSizeIteratorMap` without digging too deep into the internals.
```julia
using NaiveGAflux, Optimisers, Flux
using Test # for @test_logs used further down
import NaiveGAflux: AbstractIteratorMap

struct SizeSpyingIteratorMap <: AbstractIteratorMap end
NaiveGAflux.maptrain(::SizeSpyingIteratorMap, iter) = Iterators.map(iter) do val
    @info "The sizes are $(size.(val))"
    return val
end
```
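As a quick sanity check (a sketch of our own, not part of the original example), we can wrap a plain one-batch iterator and iterate it to trigger the logging:

```julia
# Any iterable of batches works as the wrapped iterator.
demoiter = NaiveGAflux.maptrain(SizeSpyingIteratorMap(), [(ones(2, 3), ones(1, 3))])
foreach(identity, demoiter) # logs: The sizes are ((2, 3), (1, 3))
```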
Create the iterator map we want to use. The last argument to `BatchSizeIteratorMap` is normally created through `batchsizeselection`, but since we will use a dummy model for which the maximum batch size computation is not defined, we pass a function which just returns the suggested batch size as-is.
```julia
iteratormap = IteratorMaps(SizeSpyingIteratorMap(), BatchSizeIteratorMap(8, 16, (bs, _) -> bs))
```
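For reference, with a real model the last argument would typically come from `batchsizeselection`. A minimal sketch, assuming it accepts the model's input shape (check the API reference for the exact signature and options):

```julia
# Sketch only: assumes batchsizeselection takes the input shape (batch dimension excluded).
bs = batchsizeselection((32, 32, 3))
BatchSizeIteratorMap(8, 16, bs)
```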
Create a candidate with the dummy model mentioned above.
```julia
cand = CandidateDataIterMap(iteratormap, CandidateModel(sum))
```
The data set has 20 examples, and for brevity we provide it "raw", without any batching. The other arguments are not important for this example.
```julia
fitstrat = TrainThenFitness(
    dataiter = ((randn(32, 32, 3, 20), randn(1, 20)),),
    defaultloss = (x, y) -> sum(x .+ y),
    defaultopt = Optimisers.Descent(),
    fitstrat = SizeFitness()
)
```
When the model is trained, the training data iterator is wrapped according to our `iteratormap`. With 20 examples and a train batch size of 8, we expect batches of 8, 8 and 4.
@test_logs((:info, "The sizes are ((32, 32, 3, 8), (1, 8))"),
(:info, "The sizes are ((32, 32, 3, 8), (1, 8))"),
(:info, "The sizes are ((32, 32, 3, 4), (1, 4))"),
match_mode=:any,
fitness(fitstrat, cand))
Let's mutate the candidate with a new batch size (`SizeSpyingIteratorMap` does not have any properties to mutate). Here we set the lower and upper bounds equal (`l1 == l2`) so that randomness does not break the test case, but in practice you might want to use something like `TrainBatchSizeMutation(-0.1, 0.1, ntuple(i -> 2^i, 10))`. The last argument ensures that we select a power of two as the new batch size.
```julia
batchsizemutation = TrainBatchSizeMutation(0.1, 0.1, ntuple(i -> 2^i, 10))
```
`MapCandidate` creates new candidates from a set of mutations or crossovers. With the new train batch size of 16, the 20 examples should now be split into batches of 16 and 4.
```julia
newcand = cand |> MapCandidate(batchsizemutation)
```
@test_logs((:info, "The sizes are ((32, 32, 3, 16), (1, 16))"),
(:info, "The sizes are ((32, 32, 3, 4), (1, 4))"),
match_mode=:any,
fitness(fitstrat, newcand))
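`IteratorMapCrossover` (mentioned at the top of this page) can be plugged in the same way. A hedged sketch, assuming the usual NaiveGAflux convention that crossovers map a tuple of two candidates to a new tuple (see the API reference for the exact signature):

```julia
# Sketch: swap iterator maps between the two candidates created above.
imcrossover = IteratorMapCrossover()
crossedcand1, crossedcand2 = (cand, newcand) |> MapCandidate(imcrossover)
```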
This page was generated using Literate.jl.