Let's say your network does not 'learn' during training, i.e. you notice that the loss does not decrease as the number of epochs increases, contrary to expectation. Which of the following hyperparameters (or the architecture) of the net would you consider changing only after you've tried all the other options? (A sketch of this debugging loop follows the options.)
◯ Learning rate
◯ Optimiser
◯ The architecture itself
◯ Initialisation strategy
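To make the scenario concrete, below is a minimal sketch of how one might cycle through these options when the loss stays flat. The model, data, and hyperparameter values are illustrative assumptions (a small PyTorch model on toy data), not part of the original question: the cheap knobs, such as the learning rate and the optimiser, are swept first, and more invasive changes are left for last.

```python
# Illustrative sketch only: toy data and model stand in for whatever
# network is failing to learn; all names and values are assumptions.
import torch
import torch.nn as nn

X = torch.randn(256, 10)          # toy inputs
y = torch.randn(256, 1)           # toy targets
loss_fn = nn.MSELoss()

def make_model():
    """Fresh copy of the (hypothetical) model under investigation."""
    return nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))

def final_loss(model, optimiser, epochs=50):
    """Run a short training loop and return the final loss value."""
    for _ in range(epochs):
        optimiser.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optimiser.step()
    return loss.item()

# 1. Sweep the learning rate with the current optimiser.
for lr in (1e-1, 1e-2, 1e-3, 1e-4):
    model = make_model()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    print(f"SGD  lr={lr:g}: final loss {final_loss(model, opt):.4f}")

# 2. Try a different optimiser at a sensible default learning rate.
model = make_model()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
print(f"Adam lr=1e-3: final loss {final_loss(model, opt):.4f}")

# 3. A different initialisation strategy can be tried similarly,
#    e.g. by re-initialising the layers before training.
# Only if none of these makes the loss move would one go back and
# change the architecture itself.
```

The ordering in the sketch reflects cost: changing the learning rate or optimiser requires no redesign, whereas changing the architecture means rebuilding and revalidating the model.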