Deep Convolutional Auto Encoders - Part 2
Adaptation to encode the entire swarm
- Introduction
- Data Normalisation and preparation
- Define the encoder and decoder
- 1D Convolution
- Training
- Results
- Conclusion
Introduction
In the first part I reimplemented the convolutional auto encoder from TimeCluster by Ali et al. This time, I will adapt the model to handle all 300 flock agents.
In this notebook a 1D convolutional approach is evaluated.
Data Normalisation and preparation
The flock data is min-max normalised to the range [0, 1] and then cut into overlapping sliding windows of 60 timesteps:
function normalise(M)
# Scale the whole matrix to [0, 1] using its global minimum and maximum
mn = minimum(M)
mx = maximum(M)
return (M .- mn) ./ (mx - mn)
end
normalised = Array(df) |> normalise
window_size = 60
# Overlapping windows of 60 timesteps, moving one timestep at a time
data = slidingwindow(normalised', window_size, stride=1)
# 70% train, 20% validation, 10% test
train, validate, test = splitobs(shuffleobs(data), (0.7, 0.2));
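Each window delivered by slidingwindow is a 900 x 60 slice (features x timesteps); rearrange_1D further down permutes batches of these into the 60 x 900 x batch layout the network expects. A quick sanity check, assuming getobs from MLDataPattern is available:

getobs(data, 1) |> size   # expected: (900, 60)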
Define the encoder and decoder
We can define the network shape in a couple of different ways:
- Keeping the convolution 1 dimensional and simply increasing the number of features from 3 to 900 (3 * num_of_agents)
- Using 2D convolution, with input shaped window_size x num_of_agents x dimensions (3) x batch (see the sketch after this list)
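For reference, a batch of windows could be rearranged for the 2D variant roughly as follows. This is only a sketch: rearrange_2D is a hypothetical helper, and it assumes the 900 feature columns are ordered agent by agent (x, y, z for agent 1, then agent 2, and so on).

# Hypothetical helper for the 2D-convolution variant (not used in this notebook).
# Takes a window_size x 900 x batch array and returns window_size x num_of_agents x 3 x batch.
function rearrange_2D(x; num_of_agents = 300)
    w, f, b = size(x)
    @assert f == 3 * num_of_agents
    # Split the flattened feature axis into (coordinate, agent),
    # then swap so the agent axis comes before the coordinate axis
    permutedims(reshape(x, w, 3, num_of_agents, b), (1, 3, 2, 4))
end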
In this notebook we will look at the 1D approach.
1D Convolution
function create_ae_1d()
# Define the encoder and decoder networks
encoder = Chain(
# 60x900xb
Conv((9,), 900 => 9000, relu; pad = SamePad()),
MaxPool((2,)),
# 30x9000xb
Conv((5,), 9000 => 4500, relu; pad = SamePad()),
MaxPool((2,)),
# 15x4500xb
Conv((5,),4500 => 2250, relu; pad = SamePad()),
# 15x2250xb
MaxPool((3,)),
# 5x2250xb
Conv((3,),2250 => 1000, relu; pad = SamePad()),
# 5x1000xb
Conv((3,),1000 => 100, relu; pad = SamePad()),
# 5x100xb
Flux.flatten,
Dense(500,100)
)
decoder = Chain(
Dense(100,500),
(x -> reshape(x, 5,100,:)),
# 5x100xb
ConvTranspose((3,), 100 => 1000, relu; pad = SamePad()),
ConvTranspose((3,), 1000 => 2250, relu; pad = SamePad()),
Upsample((3,)),
# 15x2250xb
ConvTranspose((5,), 2250 => 4500, relu; pad = SamePad()),
Upsample((2,)),
# 30x4500xb
ConvTranspose((5,), 4500 => 9000, relu; pad = SamePad()),
Upsample((2,)),
# 60x9000xb
ConvTranspose((9,), 9000 => 900, relu; pad = SamePad()),
# 60x900xb
)
return (encoder, decoder)
end
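The training code below refers to model, which isn't constructed in these snippets. Presumably the encoder and decoder are simply chained together and moved to the GPU, along these lines:

# Sketch (assumed): build the full autoencoder and move it to the GPU
encoder, decoder = create_ae_1d()
model = Chain(encoder, decoder) |> Flux.gpu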
Training
function save_model(m, epoch, loss)
# Move the weights to the CPU and serialise them to an Arrow file via LegolasFlux
model_row = LegolasFlux.ModelRow(; weights = fetch_weights(cpu(m)), architecture_version=1, loss=loss)
write_model_row("1d_300_model-$epoch-$loss.arrow", model_row)
end
function rearrange_1D(x)
# Stack a batch of 900x60 windows into a single 60x900xbatch array
permutedims(cat(x..., dims=3), [2,1,3])
end
function train_model_1D!(model, train, validate, opt; epochs=20, bs=16, dev=Flux.gpu)
ps = Flux.params(model)
local train_loss, train_loss_acc
local validate_loss, validate_loss_acc
local last_improvement = 0      # epoch of the last checkpoint
local prev_best_loss = 0.01     # best validation loss seen so far
local improvement_thresh = 5.0  # epochs without improvement before halving the learning rate
validate_losses = Vector{Float64}()
for e in 1:epochs
train_loss_acc = 0.0
for x in eachbatch(train, size=bs)
x = rearrange_1D(x) |> dev
gs = Flux.gradient(ps) do
train_loss = Flux.Losses.mse(model(x),x)
return train_loss
end
train_loss_acc += train_loss
Flux.update!(opt, ps, gs)
end
validate_loss_acc = 0.0
for y in eachbatch(validate, size=bs)
y = rearrange_1D(y) |> dev
validate_loss = Flux.Losses.mse(model(y), y)
validate_loss_acc += validate_loss
end
validate_loss_acc = round(validate_loss_acc / (length(validate)/bs); digits=6)
train_loss_acc = round(train_loss_acc / (length(train)/bs) ;digits=6)
# Only start checkpointing and adjusting the learning rate once the loss is reasonably low
if validate_loss_acc < 0.001
if validate_loss_acc < prev_best_loss
@info "new best accuracy $validate_loss_acc saving model..."
save_model(model, e, validate_loss_acc)
last_improvement = e
prev_best_loss = validate_loss_acc
elseif (e - last_improvement) >= improvement_thresh && opt.eta > 1e-5
@info "Not improved in $improvement_thresh epochs. Dropping learning rate to $(opt.eta / 2.0)"
opt.eta /= 2.0
last_improvement = e # give it some time to improve
improvement_thresh = improvement_thresh * 1.5
elseif (e - last_improvement) >= 15
@info "Not improved in 15 epochs. Converged I guess"
break
end
end
push!(validate_losses, validate_loss_acc)
println("Epoch $e/$epochs\t train loss: $train_loss_acc\t validate loss: $validate_loss_acc")
end
validate_losses
end
losses_0001 = train_model_1D!(model, train, validate, Flux.Optimise.ADAM(0.0001); epochs=200, bs=48);
Results
# Take a random window from the test set and render the original movement as a gif
test_data = rand(test)
create_gif_from_raw(test_data)
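The reconstruction below uses new_model, which isn't defined in the snippets above. Presumably it is the autoencoder rebuilt and loaded from the best checkpoint written by save_model; a rough sketch (the filename is only a placeholder):

# Sketch (assumed): rebuild the architecture and load the best saved weights
encoder, decoder = create_ae_1d()
new_model = Chain(encoder, decoder)
model_row = read_model_row("1d_300_model-<epoch>-<loss>.arrow")  # placeholder filename
load_weights!(new_model, collect(model_row.weights))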
# Reconstruct the same window with the trained autoencoder and render it for comparison
input = Flux.unsqueeze(test_data', 3)
output = new_model(input)
output = reshape(output, 60,900)'
create_gif_from_raw(output)
Conclusion
After a few hours of training on the GPU, we can now reasonably encode the movement of the whole swarm (300 agents) over 60 timesteps into 100 variables. However, I want to reduce that encoding even further, down to ~10 parameters that can be used to sonify the dynamics.
Next time I will see how much further I can reduce the latent space, as well as how useful other dimensionality reduction methods are when applied to it.