In this article we present a new approach to automatic adjoint differentiation (AAD), with a special focus on computations where the derivatives ∂F(X)/∂X are required for multiple instances of the vector X. In practice, the presented approach can calculate all the derivatives faster than the primal (original) C++ program for F runs.
Major application areas are:
• Gradient methods for optimization problems, including global model calibration, speech recognition, image deblurring and machine learning in general.
• Derivatives of mathematical expectations.
• Pathwise sensitivities of stochastic differential equations.
Code transformation vs. operator overloading
Currently, AAD tools follow two main approaches:
• Code transformation (CT). The tool analyses the computer program that implements the function F and generates code for the adjoint differentiation (AD) method.
• Operator overloading (OO). All mathematical operations are overloaded so that information about the computational graph of F is saved in a data structure called the Tape. The Tape is then used to run the backward pass of the AD method.
There are rather successful CT AAD tools; however, they restrict the available language features and make the build system more complex. The difficulty of building such a tool is further reflected in the fact that no CT AAD tool is currently available for C++. The OO approach usually runs slower because of the runtime overhead incurred in each iteration. Let us look at this point in more detail.
Image: USAF / Judson Brohmer [Public domain]