The strategy we use here is to mark a subset of the neural net activations as checkpoint nodes. For the simple feed-forward network in our example, the optimal choice is to mark every sqrt(n)-th node ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results