Commit 01dcaeb0 authored by David Peter's avatar David Peter

Change notation of loss

parent 26627881
In Section~\ref{sec:dnn:over_underfitting} we discussed the concept of over- and underfitting.
\newsubsubsection{Mean Squared Error}{dnn:loss:mse}
The \gls{mse} is the average of the squared differences between the predicted and the true targets across all $N$ samples in the dataset. The \gls{mse} is most commonly used for regression tasks and depends on the model parameters $\vm{\theta}$. The \gls{mse} for a single sample is defined as
\begin{equation}
l_{\mathrm{MSE}}(f(\vm{x}_i; \vm{\theta}), \vm{y}_i) = \frac{1}{2} || f(\vm{x}_i; \vm{\theta}) - \vm{y}_i ||^2
\end{equation}
where $\vm{y}_i$ is the $i$-th one-hot encoded target and $f(\vm{x}_i; \vm{\theta})$ is the model prediction for the $i$-th input $\vm{x}_i$. The factor $\frac{1}{2}$ is a common convention that cancels against the exponent when taking the gradient of the squared term. The \gls{mse} over all samples is simply the average over all per-sample losses
\begin{equation}
\mathcal{L}_{\mathrm{MSE}}(\vm{\theta}) = \frac{1}{N} \sum_{i=1}^N l_{\mathrm{MSE}}(f(\vm{x}_i; \vm{\theta}), \vm{y}_i).
\end{equation}
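As a concrete illustration, $\mathcal{L}_{\mathrm{MSE}}$ can be computed with a few lines of NumPy. The following is only a minimal sketch, assuming the predictions $f(\vm{x}_i; \vm{\theta})$ and targets $\vm{y}_i$ have already been evaluated and stacked into arrays of shape $(N, D)$; the function name \texttt{mse\_loss} is illustrative and not part of any particular library.
\begin{verbatim}
import numpy as np

def mse_loss(predictions, targets):
    # predictions, targets: arrays of shape (N, D).
    # Per-sample loss: 1/2 * squared Euclidean distance
    # between the prediction f(x_i; theta) and the target y_i.
    per_sample = 0.5 * np.sum((predictions - targets) ** 2, axis=1)
    # Dataset loss: average over all N per-sample losses.
    return per_sample.mean()
\end{verbatim}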
\newsubsubsection{Cross Entropy Loss}{dnn:loss:ce}
For classification tasks, a different loss function has to be used. For a task with $C$ classes, the typical choice is the cross entropy loss, which for a single sample is defined as
\begin{equation}
l_{\mathrm{CE}}(f(\vm{x}_i; \vm{\theta}), \vm{y}_i) = -\sum_{j=1}^C y_{ij} \log{f(\vm{x}_i; \vm{\theta})_{j}}
\end{equation}
where $y_{ij}$ is the $j$-th element of the $i$-th one-hot encoded target and $f(\vm{x}_i; \vm{\theta})_j$ is the $j$-th element of the model prediction for the $i$-th input $\vm{x}_i$. The cross entropy loss over all samples is simply the average over all per-sample losses
\begin{equation}
\mathcal{L}_{\mathrm{CE}}(\vm{\theta}) = \frac{1}{N} \sum_{i=1}^N l_{\mathrm{CE}}(f(\vm{x}_i; \vm{\theta}), \vm{y}_i).
\end{equation}
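Analogously, a minimal NumPy sketch of $\mathcal{L}_{\mathrm{CE}}$ is given below, assuming the model outputs a matrix of class probabilities of shape $(N, C)$ (e.g.\ after a softmax) and the targets are one-hot encoded; the function name \texttt{cross\_entropy\_loss} and the small constant \texttt{eps}, which guards against evaluating $\log 0$, are our assumptions.
\begin{verbatim}
import numpy as np

def cross_entropy_loss(probs, targets, eps=1e-12):
    # probs: (N, C) predicted class probabilities f(x_i; theta).
    # targets: (N, C) one-hot encoded labels y_i.
    # Per-sample loss: -sum_j y_ij * log f(x_i; theta)_j.
    per_sample = -np.sum(targets * np.log(probs + eps), axis=1)
    # Dataset loss: average over all N per-sample losses.
    return per_sample.mean()
\end{verbatim}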