Strong error analysis for stochastic gradient descent optimization algorithms

Arnulf Jentzen, Benno Kuckuck, Ariel Neufeld*, Philippe von Wurstemberger

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

23 Citations (Scopus)

Abstract

Stochastic gradient descent (SGD) optimization algorithms are key ingredients in a wide range of machine learning applications. In this article we perform a rigorous strong error analysis for SGD optimization algorithms. In particular, we prove for every arbitrarily small \varepsilon \in (0,\infty) and every arbitrarily large p \in (0,\infty) that the considered SGD optimization algorithm converges in the strong L^p-sense with order 1/2 - \varepsilon to the global minimum of the objective function of the considered stochastic optimization problem, under standard convexity-type assumptions on the objective function and relaxed assumptions on the moments of the stochastic errors appearing in the employed SGD optimization algorithm. The key ideas in our convergence proof are, first, to employ techniques from the theory of Lyapunov-type functions for dynamical systems to develop a general convergence machinery for SGD optimization algorithms based on such functions; second, to apply this general machinery to concrete Lyapunov-type functions with polynomial structures; and, thereafter, to perform an induction argument along the powers appearing in the Lyapunov-type functions in order to achieve strong L^p-convergence rates for every arbitrarily large p \in (0,\infty).
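For orientation, the following display is a minimal sketch of an SGD recursion and of the form such a strong L^p-convergence statement takes; the notation (iterates \Theta_n, step sizes \gamma_n, stochastic gradient estimator G, global minimum \vartheta of the objective function, constant C) is an illustrative assumption and is not taken verbatim from the article.

% Hypothetical notation, chosen for illustration only:
% \Theta_n: SGD iterates, \gamma_n: step sizes,
% G(\theta, X): a stochastic estimate of the gradient of the objective at \theta,
% \vartheta: global minimum of the objective, C: a finite constant.
\begin{align*}
  \Theta_{n+1} &= \Theta_n - \gamma_n \, G(\Theta_n, X_{n+1}), \qquad n \in \mathbb{N}_0, \\
  \big( \mathbb{E}\big[ \| \Theta_n - \vartheta \|^p \big] \big)^{1/p} &\leq C \, n^{-(1/2 - \varepsilon)}, \qquad n \in \mathbb{N},
\end{align*}
holding for every p \in (0,\infty) and every \varepsilon \in (0,\infty), which is what is meant by strong L^p-convergence with order 1/2 - \varepsilon.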

Original language: English
Pages (from-to): 455-492
Number of pages: 38
Journal: IMA Journal of Numerical Analysis
Volume: 41
Issue number: 1
DOIs
Publication status: Published - Jan 1, 2021
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s) 2018. Published by Oxford University Press on behalf of the Institute of Mathematics and its Applications. All rights reserved.

ASJC Scopus Subject Areas

  • General Mathematics
  • Computational Mathematics
  • Applied Mathematics

Keywords

  • Stochastic approximation algorithms
  • Stochastic gradient descent
  • Strong error analysis
