We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions.
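As a minimal illustration of such a first-order method, the sketch below implements the widely known Adam update rule (exponential moving averages of the gradient and its square, with bias correction); the function name `adam_step`, the toy objective, and the hyperparameter values are illustrative choices, not taken from this text.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on parameters theta given gradient grad at step t (t >= 1)."""
    m = beta1 * m + (1 - beta1) * grad       # first moment: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad**2    # second moment: running mean of squared gradients
    m_hat = m / (1 - beta1**t)               # bias-corrected first moment
    v_hat = v / (1 - beta2**t)               # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
theta = np.array([1.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 501):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, lr=0.1)
print(float(theta[0]))
```

Each parameter receives an individually scaled step, since the update divides the smoothed gradient by the square root of its smoothed magnitude.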