Commit 8f2e9a40 authored by David Peter

Conclusion

parent a846ead6
% **************************************************************************************************
Resource-efficient \glspl{dnn} are key components of modern \gls{kws} systems. In this thesis, we employed several methods, including \gls{nas}, weight and activation quantization, end-to-end models, and multi-exit models, to obtain resource-efficient \glspl{cnn} for \gls{kws}.
We observed that \gls{nas} is an excellent method for obtaining accurate, resource-efficient models. Furthermore, \gls{nas} allowed us to explore the accuracy-size tradeoff of the models.
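To illustrate the size axis of this tradeoff, the following is a minimal sketch, assuming a PyTorch setup: it randomly samples small \gls{cnn} candidates and counts their parameters. It is not the \gls{nas} algorithm used in this thesis; a real search would also train and validate every candidate to obtain the accuracy side of the tradeoff.
\begin{verbatim}
# Minimal sketch: sample candidate architectures and count parameters.
# The search space (depth, width) is an illustrative assumption.
import random
import torch.nn as nn


def build_candidate(n_blocks, width, n_classes=12):
    layers, in_ch = [], 40
    for _ in range(n_blocks):
        layers += [nn.Conv1d(in_ch, width, 3, padding=1), nn.ReLU()]
        in_ch = width
    layers += [nn.AdaptiveAvgPool1d(1), nn.Flatten(),
               nn.Linear(width, n_classes)]
    return nn.Sequential(*layers)


random.seed(0)
for _ in range(5):
    blocks, width = random.choice([2, 3, 4]), random.choice([16, 32, 64])
    model = build_candidate(blocks, width)
    n_params = sum(p.numel() for p in model.parameters())
    # In a real search, each candidate would also be trained and
    # validated to obtain its accuracy.
    print(f"blocks={blocks} width={width} params={n_params}")
\end{verbatim}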
With weight and activation quantization, we further reduced the memory requirements of our \gls{kws} models. We observed that, using quantization-aware training with the \gls{ste}, it is possible to train a model with binary weights and binary activations while still obtaining reasonable performance. We also explored learned bitwidth quantization, where the bitwidth of every layer is learned during training. Learned bitwidth quantization allowed us to find the optimal bitwidth for every layer with respect to the accuracy-size tradeoff established by regularizing the cross-entropy loss of the model.
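The following is a minimal sketch of quantization-aware training with the \gls{ste}, assuming PyTorch; the binarization to $\{-1, +1\}$ and the layer sizes are illustrative and do not reproduce the exact setup of this thesis.
\begin{verbatim}
# Minimal STE sketch: binarize in the forward pass, pass gradients
# straight through (clipped to the linear region) in the backward pass.
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # STE: identity gradient inside [-1, 1], zero outside.
        return grad_output * (x.abs() <= 1).float()


class BinaryLinear(nn.Linear):
    # Binary weights in the forward pass; the latent full-precision
    # weights are updated by the optimizer as usual.
    def forward(self, x):
        w_bin = BinarizeSTE.apply(self.weight)
        return F.linear(x, w_bin, self.bias)


layer = BinaryLinear(64, 32)
x = torch.randn(8, 64)
act = BinarizeSTE.apply(layer(x))  # binary activations for the next layer
act.sum().backward()               # gradients reach layer.weight via STE
\end{verbatim}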
With end-to-end \gls{kws} models, we were able to skip the extraction of hand-crafted speech features and instead perform classification directly on the raw audio waveforms. By removing the need for hand-crafted speech features, we found models with fewer parameters. However, we also observed a small performance degradation of end-to-end models compared to ordinary models using \glspl{mfcc} as speech features.
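As an illustration of the end-to-end idea, the following minimal sketch, assuming PyTorch, replaces the \gls{mfcc} front end with a strided 1D convolution that learns a filterbank directly from the raw waveform; the kernel size, stride, and channel count are illustrative assumptions, not the configuration used in this thesis.
\begin{verbatim}
# Minimal sketch of a learned front end on raw audio.
import torch
import torch.nn as nn


class RawAudioFrontEnd(nn.Module):
    def __init__(self, n_filters=40, kernel_size=400, stride=160):
        super().__init__()
        # Roughly 25 ms windows with a 10 ms hop at 16 kHz.
        self.conv = nn.Conv1d(1, n_filters, kernel_size, stride=stride)
        self.bn = nn.BatchNorm1d(n_filters)

    def forward(self, waveform):               # (batch, 1, samples)
        return torch.relu(self.bn(self.conv(waveform)))


frontend = RawAudioFrontEnd()
wave = torch.randn(4, 1, 16000)                # one second of 16 kHz audio
features = frontend(wave)                      # (4, 40, 98) learned features
print(features.shape)
\end{verbatim}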
In our last experiment, we explored multi-exit models for \gls{kws}, where we compared different exit topologies. Furthermore, we compared distillation-based training to ordinary training. Multi-exit models substantially increase the flexibility of a \gls{kws} system, allowing us to interrupt the forward pass early if necessary. However, this increase in flexibility comes at the cost of an increased number of model parameters. We observed that the exit topology has a substantial impact on the performance of a multi-exit model, and that distillation-based training is beneficial for training multi-exit models.
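The following is a minimal sketch of a multi-exit classifier with distillation-based training, assuming PyTorch; the backbone, the exit topology, and the loss weighting are illustrative assumptions and not the exact configuration used in this thesis.
\begin{verbatim}
# Minimal sketch: two exits, the early exit also learns from the
# softened predictions of the final exit (distillation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiExitNet(nn.Module):
    def __init__(self, n_classes=12):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv1d(40, 64, 3, padding=1),
                                    nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv1d(64, 64, 3, padding=1),
                                    nn.ReLU())
        self.exit1 = nn.Linear(64, n_classes)  # early exit after block1
        self.exit2 = nn.Linear(64, n_classes)  # final exit after block2

    def forward(self, x):                      # x: (batch, 40, frames)
        h1 = self.block1(x)
        logits1 = self.exit1(h1.mean(dim=-1))  # global average pooling
        h2 = self.block2(h1)
        logits2 = self.exit2(h2.mean(dim=-1))
        return logits1, logits2


def distillation_loss(logits1, logits2, targets, T=2.0, alpha=0.5):
    # Cross-entropy on both exits plus a soft-target term for the
    # early exit, using the detached final exit as the teacher.
    ce = F.cross_entropy(logits1, targets) + F.cross_entropy(logits2, targets)
    soft_teacher = F.softmax(logits2.detach() / T, dim=-1)
    kd = F.kl_div(F.log_softmax(logits1 / T, dim=-1), soft_teacher,
                  reduction="batchmean") * T * T
    return ce + alpha * kd


model = MultiExitNet()
x, y = torch.randn(8, 40, 98), torch.randint(0, 12, (8,))
loss = distillation_loss(*model(x), y)
loss.backward()
\end{verbatim}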