Fast exact joint S&P 500/VIX smile calibration in discrete and continuous time

- By Florian Bourgey and Julien Guyon
- 31 Jan 2024

Florian Bourgey and Julien Guyon introduce a novel discrete-time-continuous-time exact calibration method. They build an S&P 500/VIX jointly calibrated discrete-time model, which they later extend to continuous time by martingale interpolation. The benefit of this technique is that both steps can be made much faster than the known methods that calibrate a continuous-time model directly.

In previous work, Guyon (2020, 2024) showed how to build a nonparametric discrete-time arbitrage-free model that perfectly matches market data on Standard & Poor’s 500 index value (SPX) futures, SPX options, Chicago Board Options Exchange Volatility Index (VIX) futures and VIX options. The probability distribution is built by minimising the relative entropy with respect to a reference probability measure, and this Schrödinger problem is solved numerically using an extended Sinkhorn algorithm. This provided the first exact solution to this joint calibration problem, a difficult problem (especially for short maturities) that had eluded quants for many years. Jointly calibrating to SPX and VIX futures and options is important to prevent arbitrage and ensure accurate pricing of liquid hedging instruments; calibrating to VIX derivatives means incorporating market information on SPX forward volatilities. Figure 1 showcases the additional information contributed by the VIX by quantifying how model-free bounds for SPX path-dependent payoffs tighten when the prices of VIX futures and VIX options are included. The extra information is also seen from table A, where we compare the prices of several options in models that are all calibrated to SPX smiles but not all calibrated to VIX futures and smiles. Table A shows that to avoid mispricing some payoffs (particularly forward-starting payoffs, which are sensitive to forward volatilities) it is important that the model also fits VIX futures and options, even when the payoff depends only on SPX prices.

The aim of this paper is twofold:

- to speed up the construction of the discrete-time model by turning to Newton-type methods for solving the Schrödinger system (in the next section, we show numerically that a mixed Newton–Sinkhorn method and an implied Newton method converge much faster than the Sinkhorn algorithm); and
- to quickly build a continuous-time extension of the model, allowing the pricing of options depending on SPX values at any date $t$ , while ensuring calibration to the market smiles of SPX and VIX.

The first step draws inspiration from De March (2018), who explored the entropic approximation for discrete-time multi-dimensional martingale optimal transport, excluding VIX, using Newton’s method. Our continuous-time construction in the second step may appear similar to the Bass local volatility of Conze & Henry-Labordère (2022) but it is fundamentally different. First, unlike the construction by Conze & Henry-Labordère (2022), our purely forward Markov functional construction does not require the solution of a fixed-point problem; it is thus much faster. This is because we first build an arbitrage-free multi-marginal discrete-time model consistent with market data. Second, our continuous-time interpolated model fits not only SPX option prices but also VIX market data.

By following both steps, we thus quickly (in less than a minute) build a continuous-time model that is, by construction, exactly calibrated to SPX and VIX smiles and futures. By contrast, other known continuous-time exact solutions to the joint calibration problem (Guo et al 2022; Guyon 2022a) are more involved and demand significantly more computation time. Approximate parametric continuous-time solutions, including those based on rough or rough-like path-dependent volatility models, classical stochastic volatility models or signature-based models, are also costly in terms of computation time; we refer the reader to the extended online version of this article (Bourgey & Guyon 2022) for references. The main benefit of our novel discrete-time-continuous-time calibration method is that both steps are much faster than the known methods that directly calibrate a continuous-time model.

A natural practical application of our continuous-time model is the pricing and hedging of structured products by exotics desks (see table A). With our model, the pricing and hedging of structured products on the SPX indeed take into account all the information given by SPX smiles (the risk-neutral distributions of future SPX values) as well as all the information brought by VIX futures and VIX smiles (the risk-neutral distributions of some future SPX implied volatilities). Once the model has been calibrated (this takes less than a minute per VIX expiry), it is straightforward to implement and use, as it is a Markov functional model that involves simulating only one Brownian motion, along with the VIX at VIX future expiries. The model can also be used for computing reserves and other valuation adjustments and for assessing model risk.

Fast exact joint calibration in discrete time

Setting and notation

Let $T_{1}>0$ denote a VIX future maturity and set $T_{2}=T_{1}+\tau$ , where $\tau=30$ days. For simplicity, assume zero interest rates, repos and dividends. We take as given the full market smiles of the SPX index $S$ at $T_{1}$ and $T_{2}$ (ie, the full continuum of SPX call prices $C_{i}(K)$ of maturity $T_{i}$ for any $i\in\{1,2\}$ and all strikes $K\geq 0$ ) as well as the full market smile of the VIX index $V$ at $T_{1}$ (ie, the full continuum of VIX call prices $C_{V}(K)$ for all strikes $K\geq 0$ ). For $i\in\{1,2\}$ , we use the shorthand notation $S_{i}:=S_{T_{i}}$ . We use the term forward-starting log contract (FSLC) to denote the financial derivative that pays $-(2/\tau)\ln(S_{2}/S_{1})$ at $T_{2}$ . From the VIX definition (substituting the strip of out-of-the-money options with the log contract for simplicity), the price of the FSLC at $T_{1}$ is $V^{2}$ .

Table A: Prices of various options in the Dupire local volatility model, local two-factor Bergomi model and our continuous-time model. [We choose the same parameters as Guyon (2022b, table 4) for the local two-factor Bergomi, jointly calibrated to the term-structures of SPX ATM skew and VIX$^{2}$ implied volatility: $k_{1}=21.91$ , $k_{2}=1.04$ , $\rho_{XY}=1$ , $\rho_{SX}=-1$ , $\rho_{SY}=-1$ , $\theta_{1}=0.77$ , $\omega=6.64$ . We define $M_{t,T}:=\max_{t\leq u\leq T}S_{u}$ , $A_{t,T}:=(1/(T-t))\int_{t}^{T}S_{u}\,\mathrm{d}u$ ]

Price	LV	LV + Bergomi 2F	Our continuous-time model
$(M_{0,T_{2}}-S_{0})_{+}$	$76.81\pm 0.30$	$73.13\pm 0.28$	$75.37\pm 0.36$
$(M_{T_{1},T_{2}}-S_{0})_{+}$	$70.53\pm 0.32$	$67.20\pm 0.30$	$68.92\pm 0.38$
$(M_{T_{1},T_{2}}-S_{T_{1}})_{+}$	$61.21\pm 0.28$	$56.16\pm 0.23$	$62.99\pm 0.34$
$100\times M_{T_{1},T_{2}}/S_{T_{1}}$	$101.896\pm 0.011$	$101.707\pm 0.09$	$101.941\pm 0.012$
$(A_{T_{1},T_{2}}-S_{0})_{+}$	$32.02\pm 0.22$	$31.96\pm 0.22$	$33.22\pm 0.26$
$(A_{T_{1},T_{2}}-S_{T_{1}})_{+}$	$19.51\pm 0.16$	$19.05\pm 0.15$	$23.14\pm 0.17$
$(A_{T_{1},T_{2}}/S_{T_{1}}-1)_{+}$	$(7.05\pm 0.06)\times 10^{-3}$	$(6.87\pm 0.05)\times 10^{-3}$	$(8.27\pm 0.06)\times 10^{-3}$

For each maturity $T_{i}$ , $i\in\{1,2\}$ , the absence of static SPX arbitrage is equivalent to the existence of a risk-neutral measure $\mu_{i}=\partial^{2}C_{i}/\partial K^{2}$ , $i\in\{1,2\}$ , such that the price of any vanilla option $u_{i}(\cdot)$ written on $S_{i}$ is the expectation $\mathbb{E}^{i}[u_{i}(S_{i})]:=\mathbb{E}^{\mu_{i}}[u_{i}(S_{i})]$ of the payoff under $\mu_{i}$ . Similarly, by the absence of static VIX arbitrage, there exists a risk-neutral measure $\mu_{V}=\partial^{2}C_{V}/\partial K^{2}$ such that the price of any vanilla option $u_{V}(\cdot)$ written on $V$ is the expectation $\mathbb{E}^{V}[u_{V}(V)]:=\mathbb{E}^{\mu_{V}}[u_{V}(V)]$ of the payoff under $\mu_{V}$ .
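The link between call prices and the risk-neutral density can be checked numerically. The sketch below is an illustration only: it assumes Black–Scholes call prices (a hypothetical stand-in for market quotes, with zero rates as in the text) on a uniform strike grid, takes a central second difference in strike, and compares the result with the known lognormal density. The helper names `bs_call` and `density_from_calls` are introduced here for illustration.

```python
import numpy as np
from scipy.stats import norm

def bs_call(s0, k, sigma, t):
    """Black-Scholes call price with zero rates and dividends (toy market data)."""
    d1 = (np.log(s0 / k) + 0.5 * sigma**2 * t) / (sigma * np.sqrt(t))
    d2 = d1 - sigma * np.sqrt(t)
    return s0 * norm.cdf(d1) - k * norm.cdf(d2)

def density_from_calls(strikes, calls):
    """Central second difference d^2 C / dK^2 on a uniform strike grid."""
    h = strikes[1] - strikes[0]
    return (calls[2:] - 2.0 * calls[1:-1] + calls[:-2]) / h**2

s0, sigma, t = 100.0, 0.2, 30.0 / 365.0
strikes = np.linspace(60.0, 140.0, 1601)          # strike step of 0.05
calls = bs_call(s0, strikes, sigma, t)
density = density_from_calls(strikes, calls)      # defined at interior strikes

# Exact lognormal density of S_T in the same toy model, for comparison
k = strikes[1:-1]
std = sigma * np.sqrt(t)
exact = norm.pdf((np.log(k / s0) + 0.5 * std**2) / std) / (k * std)

max_abs_error = np.max(np.abs(density - exact))
total_mass = np.sum(density) * (strikes[1] - strikes[0])
```

With real quotes, the call prices would first have to be interpolated into an arbitrage-free curve in strike; differencing raw, noisy quotes directly generally produces an unusable density.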

In the absence of dynamic SPX arbitrage (or calendar arbitrage), $\mu_{1}$ and $\mu_{2}$ are of convex order (ie, $\mathbb{E}^{1}[f(S_{1})]\leq\mathbb{E}^{2}[f(S_{2})]$ for any convex function $f\colon\mathbb{R}_{>0}\to\mathbb{R}$ ), even if we allow trading in the FSLC at $T_{1}$ . By the absence of arbitrage, the price of $S_{i}$ at time $0$ is the initial SPX spot value $S_{0}>0$ (ie, $\mathbb{E}^{i}[S_{i}]=S_{0}$ ). Furthermore, $\mathbb{E}^{V}[V]=F_{V}\geq 0$ , where $F_{V}$ is the value at time $0$ of the VIX future maturing at $T_{1}$ . Finally, in order for the log contracts and the VIX squared to have finite prices, in the remainder of this section we assume the following.

Assumption 1

For any $i\in\{1,2\}$ , the given marginals $\mu_{1}$ , $\mu_{V}$ and $\mu_{2}$ satisfy $\mathbb{E}^{i}[S_{i}]=S_{0}$ , $\mathbb{E}^{i}[|\ln S_{i}|]<\infty$ and $\mathbb{E}^{V}[V]=F_{V}$ , $\mathbb{E}^{V}[V^{2}]<\infty$ .

Let $\mathcal{X}:=\mathbb{R}_{>0}\times\mathbb{R}_{\geq 0}\times\mathbb{R}_{>0}$ and define the strictly convex function $L\colon x\in\mathbb{R}_{>0}\mapsto-(2/\tau)\ln x$ . For a probability distribution $\rho$ on $\mathbb{R}$ , we denote the associated cumulative distribution by $F_{\rho}$ (ie, $F_{\rho}(x)=\rho((-\infty,x])$ for every $x\in\mathbb{R}$ ). Let $\mathcal{P}(\mathbb{R}_{>0}^{2})$ (respectively, $\mathcal{P}(\mathcal{X})$ ) denote the set of all probability measures on $\mathbb{R}_{>0}^{2}$ (respectively, $\mathcal{X}$ ).

Let $\mathcal{U}^{V}$ be the set of all measurable functions $u_{1},u_{2}\colon\mathbb{R}_{>0}\to\mathbb{R}$ , $u_{V}\colon\mathbb{R}_{\geq 0}\to\mathbb{R}$ , $\Delta_{S},\Delta_{L}\colon\mathbb{R}_{>0}\times\mathbb{R}_{\geq 0}\to\mathbb{R}$ satisfying $u_{i}\in\mathrm{L}^{1}(\mu_{i})$ for $i\in\{1,V,2\}$ , and $\Delta_{S}$ , $\Delta_{L}$ bounded. We use the shorthand notation:

	$\displaystyle\Delta_{S}^{(S)}(s_{1},v,s_{2})$	$=\Delta_{S}(s_{1},v)(s_{2}-s_{1})$
	$\displaystyle\Delta_{L}^{(L)}(s_{1},v,s_{2})$	$=\Delta_{L}(s_{1},v)(L(s_{2}/s_{1})-v^{2})$

to denote the profit-and-losses (P&Ls) from delta-hedging at time $T_{1}$ in the SPX and the log-contract, respectively. Finally, let $\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2})$ denote the set of all VIX-constrained martingale probability measures:

	$\displaystyle\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2}):=\{\mu\in\mathcal{P}(\mathcal{X})\colon S_{1}\overset{\mu}{\sim}\mu_{1},\ V\overset{\mu}{\sim}\mu_{V},\ S_{2}\overset{\mu}{\sim}\mu_{2},$
	$\displaystyle\qquad\qquad\qquad{}\mathbb{E}^{\mu}[S_{2}\mid S_{1},V]=S_{1},\ \mathbb{E}^{\mu}[L(S_{2}/S_{1})\mid S_{1},V]=V^{2}\}$

Entropy minimisation

Solving the joint calibration problem is equivalent to building a probability measure $\mu\in\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2})$ . In the absence of joint SPX/VIX arbitrage, there may exist an infinite number of models $\mu$ within the convex set $\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2})$ of jointly calibrated models. To build a specific, meaningful jointly calibrated model, in the spirit of Avellaneda et al (1997), Guyon (2020, 2024) suggests a minimum entropy approach: choose a reasonable (though not jointly calibrated) model $\bar{\mu}$ on $\mathcal{X}$ , possibly one derived from a model already in use at the financial institution, and build the probability measure $\mu\in\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2})$ that is closest to $\bar{\mu}$ in the entropic sense (ie, $\mu$ has minimum relative entropy with respect to $\bar{\mu}$ ):

\left.\begin{aligned} D_{\bar{\mu}}&:=\inf_{\mu\in\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2})}H(\mu\mid\bar{\mu})\\ H(\mu\mid\bar{\mu})&:=\begin{cases}\mathbb{E}^{\mu}\bigg[\ln\bigg(\displaystyle\frac{\mathrm{d}\mu}{\mathrm{d}\bar{\mu}}\bigg)\bigg]=\mathbb{E}^{\bar{\mu}}\bigg[\displaystyle\frac{\mathrm{d}\mu}{\mathrm{d}\bar{\mu}}\ln\bigg(\displaystyle\frac{\mathrm{d}\mu}{\mathrm{d}\bar{\mu}}\bigg)\bigg]&\text{if }\mu\ll\bar{\mu}\\ +\infty&\text{otherwise}\end{cases}\end{aligned}\right\}

(M)

From Guyon (2024), if the minimum entropy problem is finite, there exists a unique minimiser $\mu^{*}\in\mathcal{M}_{c}(\mu_{1},\mu_{V},\mu_{2})$ of the form:

	$\displaystyle\frac{\mathrm{d}\mu^{*}}{\mathrm{d}\bar{\mu}}=e_{u}(S_{1},V,S_{2})$
	$\displaystyle e_{u}(s_{1},v,s_{2})=e^{u_{1}(s_{1})+u_{V}(v)+u_{2}(s_{2})+\Delta_{S}^{(S)}(s_{1},v,s_{2})+\Delta_{L}^{(L)}(s_{1},v,s_{2})}$

where $u=(u_{1},u_{V},u_{2},\Delta_{S},\Delta_{L})$ , if they exist, are maximisers (called Schrödinger potentials) of the dual problem:

\left.\begin{aligned} P_{\bar{\mu}}&:=\sup_{u\in\mathcal{U}^{V}}J_{\bar{\mu}}(u)\\ J_{\bar{\mu}}(u)&:=\mathbb{E}^{1}[u_{1}(S_{1})]+\mathbb{E}^{V}[u_{V}(V)]+\mathbb{E}^{2}[u_{2}(S_{2})]-\mathbb{E}^{\bar{\mu}}[e_{u}(S_{1},V,S_{2})]+1\end{aligned}\right\}

(P)

Additionally, in any case, $D_{\bar{\mu}}=P_{\bar{\mu}}$ . Note that (P) is an unconstrained concave maximisation problem. Both problems are dual to each other; following a terminology proposed by Dupire, (M) is a measure problem while (P) is a portfolio problem.

As, in practice, only a finite number of SPX and VIX vanilla options are available for trading, we consider vanilla payoffs $u_{1}$ , $u_{V}$ and $u_{2}$ that are linear combinations of finitely many call options, along with one position in the bond, one position in $S_{1}$ and one position in the VIX futures. Therefore, we consider a market data set $\mathcal{K}$ composed of call options on $S_{1}$ , $V$ and $S_{2}$ , which we denote by $(C_{K}^{1})_{K\in\mathcal{K}_{1}}$ , $(C_{K}^{V})_{K\in\mathcal{K}_{V}}$ and $(C_{K}^{2})_{K\in\mathcal{K}_{2}}$ with respective strikes $\mathcal{K}_{1}$ , $\mathcal{K}_{V}$ and $\mathcal{K}_{2}$ , and we build a model of the form:

	$\displaystyle\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}=e_{\theta}(S_{1},V,S_{2})$
	$\displaystyle\begin{aligned}e_{\theta}(s_{1},v,s_{2})&=\exp\bigg(c+\Delta_{S}^{0}s_{1}+\Delta_{V}^{0}v+\sum_{K\in\mathcal{K}_{1}}a_{K}^{1}(s_{1}-K)_{+}\\ &\qquad\qquad{}+\sum_{K\in\mathcal{K}_{V}}a_{K}^{V}(v-K)_{+}+\sum_{K\in\mathcal{K}_{2}}a_{K}^{2}(s_{2}-K)_{+}\\ &\qquad\qquad{}+\Delta_{S}^{(S)}(s_{1},v,s_{2})+\Delta_{L}^{(L)}(s_{1},v,s_{2})\bigg)\end{aligned}$

where $\theta$ is an element of the set $\varTheta$ of all $\theta=(c,\Delta_{S}^{0},\Delta_{V}^{0},a^{1},a^{V},a^{2},\Delta_{S},\Delta_{L})$ such that $c,\Delta_{S}^{0},\Delta_{V}^{0}\in\mathbb{R}$ , $a^{1}\in\mathbb{R}^{\mathcal{K}_{1}}$ , $a^{V}\in\mathbb{R}^{\mathcal{K}_{V}}$ , $a^{2}\in\mathbb{R}^{\mathcal{K}_{2}}$ and $\Delta_{S},\Delta_{L}\colon\mathbb{R}_{>0}\times\mathbb{R}_{\geq 0}\to\mathbb{R}$ are bounded measurable functions of $(s_{1},v)$ . The measure $\mu_{\mathcal{K},\theta}$ is then a consistent, arbitrage-free model that is jointly calibrated to the market prices of SPX/VIX futures and options if and only if $\theta$ solves the so-called $\mathcal{K}$ -Schrödinger system:

\left.\begin{aligned} &\mathbb{E}^{\bar{\mu}}\bigg[\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\bigg]=1,\quad\mathbb{E}^{\bar{\mu}}\bigg[S_{1}\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\bigg]=S_{0},\quad\mathbb{E}^{\bar{\mu}}\bigg[V\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\bigg]=F_{V}\\ &\mathbb{E}^{\bar{\mu}}\bigg[(S_{1}-K)_{+}\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\bigg]=C_{K}^{1},\quad\forall K\in\mathcal{K}_{1}\\ &\mathbb{E}^{\bar{\mu}}\bigg[(V-K)_{+}\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\bigg]=C_{K}^{V},\quad\forall K\in\mathcal{K}_{V}\\ &\mathbb{E}^{\bar{\mu}}\bigg[(S_{2}-K)_{+}\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\bigg]=C_{K}^{2},\quad\forall K\in\mathcal{K}_{2}\\ &\mathbb{E}^{\bar{\mu}}\bigg[(S_{2}-S_{1})\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\biggm|S_{1}=s_{1},\ V=v\bigg]=0,\quad\forall s_{1}>0,\ v\geq 0\\ &\mathbb{E}^{\bar{\mu}}\bigg[\bigg(L\bigg(\frac{S_{2}}{S_{1}}\bigg)-V^{2}\bigg)\frac{\mathrm{d}\mu_{\mathcal{K},\theta}}{\mathrm{d}\bar{\mu}}\biggm|S_{1}=s_{1},\ V=v\bigg]=0,\quad\forall s_{1}>0,\ v\geq 0\end{aligned}\right\}

(1)

The first equation states that $\mu_{\mathcal{K},\theta}$ is a probability measure, while the others state that it belongs to the set $\mathcal{M}_{c,\mathcal{K}}(\mu_{1},\mu_{V},\mu_{2})$ of probability measures $\mu$ satisfying:

	$\displaystyle\mathbb{E}^{\mu}[S_{1}]=S_{0},~\mathbb{E}^{\mu}[V]=F_{V},~\forall K\in\mathcal{K}_{1},~\mathbb{E}^{\mu}[(S_{1}-K)_{+}]=C_{K}^{1}$
	$\displaystyle\forall K\in\mathcal{K}_{V},~\mathbb{E}^{\mu}[(V-K)_{+}]=C_{K}^{V},~\forall K\in\mathcal{K}_{2},~\mathbb{E}^{\mu}[(S_{2}-K)_{+}]=C_{K}^{2}$
	$\displaystyle\mathbb{E}^{\mu}[S_{2}\mid S_{1},V]=S_{1},~\mathbb{E}^{\mu}[L(S_{2}/S_{1})\mid S_{1},V]=V^{2}$

Remark 1

The prior measure $\bar{\mu}$ is chosen by the modeller. Examples include a lognormal prior:

\bar{\mu}(\mathrm{d}s_{1},\mathrm{d}v,\mathrm{d}s_{2})=\nu(\mathrm{d}s_{1},\mathrm{d}v)T(s_{1},v,\mathrm{d}s_{2})

where $\nu=\mu_{1}\otimes\mu_{V}$ and $T(s_{1},v,\mathrm{d}s_{2})$ is the distribution of:

s_{1}\exp(v\sqrt{\tau}G-\tfrac{1}{2}v^{2}\tau)

with $G\sim\mathcal{N}(0,1)$ , or the independent prior (product measure) $\bar{\mu}=\mu_{1}\otimes\mu_{V}\otimes\mu_{2}$ . Assuming that $S_{2}$ is lognormal conditional on $S_{1}$ and $V$ is financially natural, but it may not be the best choice in practice. Indeed, in this case, $\mathbb{E}^{\bar{\mu}}[\mathrm{e}^{\delta S_{2}}\mid S_{1},V]=+\infty$ for $\delta>0$ , so the expectations in (1) may not be well defined. In practice, we avoid those integrability issues by working with a finite-support approximation of $\bar{\mu}$ stemming from the Gaussian quadrature approximation of the expectations (see remark 2).

Remark 2

To evaluate the expectations arising in (1), we use a Gauss–Legendre quadrature when integrating with respect to $s_{1}$ (respectively, $v$ ) with grid $\mathcal{G}_{1}:=\{s_{1}^{(1)}\leq\dots\leq s_{1}^{(n_{1})}\}$ (respectively, $\mathcal{G}_{V}:=\{v^{(1)}\leq\dots\leq v^{(n_{V})}\}$ ). For $s_{2}$ , in the case of the lognormal prior, we use a Gauss–Hermite quadrature with knots $\{z^{(1)},\dots,z^{(n_{2})}\}$ .
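As an illustration of the Gauss–Hermite step for the lognormal prior of remark 1, the sketch below discretises the kernel $T(s_{1},v,\mathrm{d}s_{2})$ at a single hypothetical node $(s_{1},v)$ and checks that the discretised prior already satisfies the martingale condition and the VIX consistency condition. The function name `lognormal_kernel_nodes` and the numerical values are assumptions made for this example; NumPy's probabilists' Hermite rule `hermegauss` is used.

```python
import numpy as np

def lognormal_kernel_nodes(s1, v, tau, n2=25):
    """Finite-support approximation of T(s1, v, ds2) for the lognormal prior.

    Returns nodes s2 = s1 * exp(v sqrt(tau) z - v^2 tau / 2) and probability
    weights, where z are Gauss-Hermite knots for the weight exp(-z^2 / 2).
    """
    z, w = np.polynomial.hermite_e.hermegauss(n2)
    w = w / np.sqrt(2.0 * np.pi)          # normalise so the weights sum to 1
    s2 = s1 * np.exp(v * np.sqrt(tau) * z - 0.5 * v**2 * tau)
    return s2, w

tau = 30.0 / 365.0
s1, v = 100.0, 0.25                        # hypothetical grid node (s1, v)
s2, w = lognormal_kernel_nodes(s1, v, tau)

mean_s2 = np.sum(w * s2)                                   # should equal s1
fslc = np.sum(w * (-(2.0 / tau) * np.log(s2 / s1)))        # should equal v^2
```

Because $-(2/\tau)\ln(s_{2}/s_{1})$ is affine in the Gaussian variable, the quadrature reproduces $v^{2}$ up to rounding, which is one way to see why this prior is a natural starting point for the VIX constraint.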

Solving the Schrödinger system (1)

The Sinkhorn algorithm

The classical method for solving Schrödinger systems is the Sinkhorn algorithm, an iterative method that solves the individual equations of the Schrödinger system (here, (1)) one after the other; the resulting iterates converge to the solution $\theta^{*}$ of the whole system. This algorithm has recently gained popularity in machine learning, where it is used to quickly compute Wasserstein distances and more generally to solve optimal transport problems, via a small entropic penalty. It has also been applied to martingale optimal transport problems (De March 2018) and, in particular, in quantitative finance to quickly build arbitrage-free smiles (De March & Henry-Labordère 2019). In Guyon (2024) the Sinkhorn algorithm was extended to accommodate the martingality and consistency constraints in (1), and shown to converge toward a jointly calibrated model. However, the convergence was somewhat slow (see the ‘Comparison of the different algorithms’ section below). In the following sections, we present two faster alternatives for solving (1) numerically; both rely on solving the portfolio problem (P).
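For readers unfamiliar with the method, the following is a minimal sketch of the classical two-marginal Sinkhorn iteration on a small discrete prior. It is not the extended algorithm of Guyon (2024), which also handles the martingality and VIX consistency constraints; the matrix `prior`, the marginals `mu` and `nu` and the function name `sinkhorn` are hypothetical choices for illustration.

```python
import numpy as np

def sinkhorn(prior, mu, nu, n_iter=500):
    """Rescale a positive prior matrix so that its row sums equal mu and its
    column sums equal nu, fitting one marginal at a time."""
    a = np.ones_like(mu)
    b = np.ones_like(nu)
    for _ in range(n_iter):
        a = mu / (prior @ b)       # solve the row-marginal equations for a
        b = nu / (prior.T @ a)     # solve the column-marginal equations for b
    return a[:, None] * prior * b[None, :]

rng = np.random.default_rng(0)
prior = rng.uniform(0.5, 1.5, size=(4, 5))
prior /= prior.sum()                       # toy reference measure
mu = np.array([0.1, 0.2, 0.3, 0.4])        # target law of the first variable
nu = np.full(5, 0.2)                       # target law of the second variable
coupling = sinkhorn(prior, mu, nu)
```

The result has the form prior times $\mathrm{e}^{u_{1}+u_{2}}$ with $u_{1}=\ln a$ and $u_{2}=\ln b$ , which is the two-marginal analogue of the exponential density used in the text.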

The Newton–Sinkhorn algorithm

Observe that if we define the concave function:

	$\displaystyle J_{\bar{\mu},\mathcal{K}}(\theta)$	$=c+\Delta_{S}^{0}S_{0}+\Delta_{V}^{0}F_{V}$
		$\displaystyle\qquad{}+\sum_{i\in\{1,V,2\}}\sum_{K\in\mathcal{K}_{i}}a_{K}^{i}C_{K}^{i}-\mathbb{E}^{\bar{\mu}}[e_{\theta}(S_{1},V,S_{2})]+1$

solving the $\mathcal{K}$ -Schrödinger system is equivalent to setting the gradient of $J_{\bar{\mu},\mathcal{K}}$ to zero. Hence, to solve the system and build $\mu^{*}$ , one can directly solve the portfolio problem:

P_{\bar{\mu},\mathcal{K}}:=\sup_{\theta\in\varTheta}J_{\bar{\mu},\mathcal{K}}(\theta)

(2)

which is the ‘finitely many payoffs’ version of (P). To this end, we suggest the following Newton–Sinkhorn algorithm. Each iteration involves a Newton step followed by a Sinkhorn step.

Newton step. Starting from an initial guess $\theta^{(0)}$ , we first solve for every iteration $n\in\mathbb{N}$ and $(s_{1},v)\in\mathcal{G}_{1}\times\mathcal{G}_{V}$ the portfolio problem (2):

\theta^{-\Delta,(n+1)}=\operatorname*{arg\,max}_{\theta^{-\Delta}\in\varTheta^{-\Delta}}J_{\bar{\mu},\mathcal{K}}(\theta^{-\Delta},\Delta_{S}^{(n)}(s_{1},v),\Delta_{L}^{(n)}(s_{1},v))

(3)

where:

\theta^{-\Delta}:=(c,\Delta_{S}^{0},\Delta_{V}^{0},a^{1},a^{V},a^{2})\in\varTheta^{-\Delta}

Since the Hessian of $J_{\bar{\mu},\mathcal{K}}(\theta^{-\Delta},\Delta_{S}^{(n)}(s_{1},v),\Delta_{L}^{(n)}(s_{1},v))$ is known in closed form, this step is extremely fast. We solve (3) using the function scipy.optimize.minimize (method="trust-exact") from the scipy library.

Sinkhorn step. Then, for all $(s_{1},v)\in\mathcal{G}_{1}\times\mathcal{G}_{V}$ , we jointly solve for $\smash{\Delta_{S}^{(n+1)}(s_{1},v)}$ and $\smash{\Delta_{L}^{(n+1)}(s_{1},v)}$ the two-dimensional nonlinear system:

	$\displaystyle f_{s_{1},v}(\Delta_{S}^{(n+1)}(s_{1},v),\Delta_{L}^{(n+1)}(s_{1},v),a^{2,(n+1)})$	$\displaystyle=0,$		(4)
	$\displaystyle g_{s_{1},v}(\Delta_{S}^{(n+1)}(s_{1},v),\Delta_{L}^{(n+1)}(s_{1},v),a^{2,(n+1)})$	$\displaystyle=0,$		(5)

where $a^{2,(n+1)}$ is the optimal vector $a^{2}$ from the previous step, (3), and for all $(x,y)\in\mathbb{R}^{2}$ :

	$\displaystyle f_{s_{1},v}(x,y,a^{2})$	$\displaystyle=\int_{\mathbb{R}}(s_{2}-s_{1})\,e^{\sum_{K\in\mathcal{K}_{2}}a_{K}^{2}(s_{2}-K)_{+}+x(s_{2}-s_{1})+y(L(s_{2}/s_{1})-v^{2})}\,\bar{\mu}(s_{1},v,\mathrm{d}s_{2})$
	$\displaystyle g_{s_{1},v}(x,y,a^{2})$	$\displaystyle=\int_{\mathbb{R}}\bigg(L\bigg(\frac{s_{2}}{s_{1}}\bigg)-v^{2}\bigg)\,e^{\sum_{K\in\mathcal{K}_{2}}a_{K}^{2}(s_{2}-K)_{+}+x(s_{2}-s_{1})+y(L(s_{2}/s_{1})-v^{2})}\,\bar{\mu}(s_{1},v,\mathrm{d}s_{2})$

This is the Sinkhorn step, where the last two equations of (1) are both solved.

We use the Levenberg–Marquardt algorithm to solve (4) and (5) via scipy.optimize.root(method="lm"). A full Newton algorithm with parameterised deltas has also been considered in the extended online version of this article (Bourgey & Guyon 2022).
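A minimal sketch of this two-dimensional solve at a single grid node is given below. It assumes the lognormal prior discretised by Gauss–Hermite quadrature, prices normalised so that $s_{1}=1$ , and a single hypothetical $S_{2}$ strike with a hypothetical weight `a2` standing in for the output of the Newton step; none of these values come from the article.

```python
import numpy as np
from scipy.optimize import root

tau = 30.0 / 365.0
s1, v = 1.0, 0.25                          # one node (s1, v), in units of S_0

# Gauss-Hermite discretisation of the lognormal kernel T(s1, v, ds2)
z, w = np.polynomial.hermite_e.hermegauss(25)
w = w / np.sqrt(2.0 * np.pi)
s2 = s1 * np.exp(v * np.sqrt(tau) * z - 0.5 * v**2 * tau)

strikes2 = np.array([1.0])                 # hypothetical S_2 strike
a2 = np.array([0.5])                       # hypothetical call weight
vanilla = np.maximum(s2[:, None] - strikes2[None, :], 0.0) @ a2

ds = s2 - s1                               # P&L of the SPX delta hedge
dl = -(2.0 / tau) * np.log(s2 / s1) - v**2 # P&L of the log-contract hedge

def residuals(xy):
    """Discretised (f, g): martingality and VIX consistency at (s1, v)."""
    x, y = xy
    tilt = w * np.exp(vanilla + x * ds + y * dl)
    return np.array([np.sum(ds * tilt), np.sum(dl * tilt)])

initial_residual = np.max(np.abs(residuals(np.zeros(2))))
sol = root(residuals, x0=np.zeros(2), method="lm")
delta_s, delta_l = sol.x
max_residual = np.max(np.abs(residuals(sol.x)))
```

In the full algorithm this solve is repeated independently at every node of $\mathcal{G}_{1}\times\mathcal{G}_{V}$ , so it parallelises naturally across nodes.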

The implied Newton algorithm

Inspired by De March (2018), we observe that:

	$\displaystyle\theta^{*}$	$\displaystyle=\operatorname*{arg\,max}_{\theta\in\varTheta}J_{\bar{\mu},\mathcal{K}}(\theta)$
		$\displaystyle=\operatorname*{arg\,max}_{\theta^{-\Delta}\in\varTheta^{-\Delta}}\tilde{J}_{\bar{\mu},\mathcal{K}}(\theta^{-\Delta},\Delta_{S}^{*}(\cdot,\cdot,a^{2}),\Delta_{L}^{*}(\cdot,\cdot,a^{2}))$

where $(\Delta_{S}^{*}(\cdot,\cdot,a^{2}),\Delta_{L}^{*}(\cdot,\cdot,a^{2}))$ solves the two-dimensional nonlinear system:

	$\displaystyle f_{s_{1},v}(\Delta_{S}^{*}(\cdot,\cdot),\Delta_{L}^{*}(\cdot,\cdot),a^{2})$	$\displaystyle=0,$
	$\displaystyle g_{s_{1},v}(\Delta_{S}^{*}(\cdot,\cdot),\Delta_{L}^{*}(\cdot,\cdot),a^{2})$	$\displaystyle=0.$

That is, for each $\theta^{-\Delta}$ , we first optimise over $\Delta_{S}(\cdot,\cdot)$ and $\Delta_{L}(\cdot,\cdot)$ , and then we optimise over $\theta^{-\Delta}$ . Note that the inner optimisation depends on $\theta^{-\Delta}$ only through $a^{2}$ .

As with $J_{\bar{\mu},\mathcal{K}}$ , the gradient and Hessian of $\tilde{J}_{\bar{\mu},\mathcal{K}}$ are known in closed form. In fact, $\tilde{J}_{\bar{\mu},\mathcal{K}}$ has the same gradient and Hessian as $J_{\bar{\mu},\mathcal{K}}$ , except for the terms involving differentiation with respect to $a^{2}$ , whose exact computation is detailed in Bourgey & Guyon (2022).

Comparison of the different algorithms

All the numerical tests were performed on a MacBook Pro laptop with a 2.6 GHz six-core Intel Core i7 processor and 32 GB memory using Python 3.9.6.

To compare the various algorithms, we plot the base-10 logarithm of the calibration error for the SPX smiles at $T_{1}$ and $T_{2}$ and the VIX smile at $T_{1}$ as a function of computational time, for the calibration date April 20, 2020 ( $T_{1}=30$ days); see figure 2. Other calibration dates in 2018 and 2022 are considered by Bourgey & Guyon (2022). The calibration error is computed as the sum of the absolute relative errors of the three futures, the mean of the three smiles and the total mass of $\mu^{*}$ , which must be a probability measure.

We choose the lognormal prior for the reference measure $\bar{\mu}$ (see remark 1) and we take $n_{1}=n_{V}=45$ knots for the integration with respect to $s_{1}$ and $v$ , and $n_{2}=25$ knots for the integration with respect to $s_{2}$ . We set $s_{1}^{(1)}=F_{\mu_{1}}^{-1}(q)$ , $v^{(1)}=F_{\mu_{V}}^{-1}(q)$ and $s_{1}^{(n_{1})}=F_{\mu_{1}}^{-1}(1-q)$ , $v^{(n_{V})}=F_{\mu_{V}}^{-1}(1-q)$ with $q=10^{-3}$ for the lowest and highest values of the quadrature grids of $s_{1}$ and $v$ .
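One possible way to set up such a truncated quadrature grid is sketched below. Two assumptions should be noted: the quantiles $F^{-1}(q)$ and $F^{-1}(1-q)$ are used here as the endpoints of the integration interval (Gauss–Legendre nodes lie strictly inside it, so the extreme nodes are close to, but not equal to, those quantiles), and a lognormal law is used as a hypothetical stand-in for the market-implied $\mu_{1}$ . The function name `gauss_legendre_grid` is introduced for illustration.

```python
import numpy as np
from scipy.stats import lognorm

def gauss_legendre_grid(ppf, n, q=1e-3):
    """Gauss-Legendre nodes and weights mapped to [ppf(q), ppf(1 - q)]."""
    lo, hi = ppf(q), ppf(1.0 - q)
    x, w = np.polynomial.legendre.leggauss(n)      # rule on [-1, 1]
    nodes = 0.5 * (hi - lo) * x + 0.5 * (hi + lo)
    weights = 0.5 * (hi - lo) * w
    return nodes, weights

# Hypothetical lognormal stand-in for mu_1 with mean s0
s0 = 100.0
std = 0.2 * np.sqrt(30.0 / 365.0)
mu1 = lognorm(s=std, scale=s0 * np.exp(-0.5 * std**2))

nodes, weights = gauss_legendre_grid(mu1.ppf, n=45)
mass = np.sum(weights * mu1.pdf(nodes))            # close to 1 - 2q
```

The captured mass is about $1-2q$ rather than one, which is why the total mass of the calibrated measure is monitored as part of the calibration error.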

We compare the calibration speed of the Sinkhorn (S), Newton–Sinkhorn (NS) and implied Newton (IN) algorithms. For IN, as suggested by De March (2018, section 7.1), we initialise the algorithm with 10 iterations of a (pure) S algorithm (warm-start procedure). A warm-start initialisation of NS yielded no improvement.

In figure 2, we observe that IN is the fastest, and S the slowest. In figure 3, we plot the futures and smiles for $S_{1}$ , $V$ and $S_{2}$ obtained with the IN algorithm after 60 seconds as well as the market smiles: the fits are perfect. The martingality and VIX constraints (reported in Bourgey & Guyon (2022)) are also perfectly satisfied. Similar plots are obtained for the other two calibration dates by Bourgey & Guyon (2022).

We also tested the three methods when the reference measure is the product measure (independent prior). The numerical results are reported in Bourgey & Guyon (2022). We observe that the three methods seem to be less stable with the independent prior and typically require a higher number of nodes; we chose $n_{1}=n_{V}=n_{2}=45$ . With the independent prior, IN needs more steps to converge.

Continuous-time extension

We now extend the discrete-time model $\mu^{*}$ (and the probability space) to build a continuous-time model for $(S_{t})_{t\in[0,T_{2}]}$ . The model will also include the VIX at $T_{1}$ , $V$ . That is, we build a probability $\mathbb{P}$ on $C([0,T_{2}],\mathbb{R})\times\mathbb{R}_{+}$ representing the distribution of $((S_{t})_{t\in[0,T_{2}]},V)$ . Here, $V$ plays the role of a discrete-time stochastic volatility, representing the stochastic volatility anticipated at $T_{1}$ for the $[T_{1},T_{2}]$ period, but our model involves no continuous-time stochastic volatility process. Similarly to that of Conze & Henry-Labordère (2022), our model is computationally efficient, as it only requires simulating one Brownian motion (and $V$ ).

The key advantage of our construction compared with the Bass local volatility of Conze & Henry-Labordère (2022) is that it starts directly from a joint discrete distribution of $(S_{1},S_{2})$ , making the continuous-time interpolation a purely forward construction with no need to solve a fixed-point problem. Moreover, it includes a stochastic volatility component, $V$ , for the calibration to VIX futures and VIX smiles in addition to SPX smiles. As a result, in contrast with Conze & Henry-Labordère (2022) and in line with the path-dependency observed in financial markets (Guyon & Lekeufack 2023), our model is path-dependent: the SPX dynamics after $T_{1}$ depend on both $S_{1}$ and $V$ .

Step 1: simulation of ${(S_{t})}_{t \in [0, T_{1}]}$

We want $(S_{t})_{t\in[0,T_{1}]}$ to be a $\mathbb{P}$ -martingale and $S_{1}$ to have distribution $\mu_{1}$ under $\mathbb{P}$ . To achieve this, one possible choice is to use a Markov functional model $S_{t}=u(t,W_{t})$ , $t\in[0,T_{1}]$ , where $W$ is a $\mathbb{P}$ -Brownian motion and $u$ satisfies the heat equation:

\partial_{t}u+\tfrac{1}{2}\partial_{x}^{2}u=0

Further, define the filtration:

\mathcal{F}_{t}:=\begin{cases}\sigma((W_{s})_{s\in[0,t]})&\text{if }t\in[0,T_{1})\\ \sigma((W_{s})_{s\in[0,t]},V)&\text{if }t\in[T_{1},T_{2}].\end{cases}

In such a case, $(S_{t})_{t\in[0,T_{1}]}$ is an $((\mathcal{F}_{t}),\mathbb{P})$ -martingale, and the terminal condition $u(T_{1},\cdot)=g$ is determined via quantiles so that $u(T_{1},W_{T_{1}})$ has distribution $\mu_{1}$ , ie, for all $x\in\mathbb{R}$ , we set:

g(x):=F_{\mu_{1}}^{-1}\bigg(F_{\mathcal{N}(0,1)}\bigg(\frac{x}{\sqrt{T_{1}}}\bigg)\bigg).

The solution $u$ to the heat equation is explicit and given by:

	$\displaystyle u(t,x)$	$\displaystyle=\mathbb{E}[g(W_{T_{1}})\mid W_{t}=x]$
		$\displaystyle=\mathbb{E}[g(x+W_{T_{1}}-W_{t})]=(g\ast K_{T_{1}-t})(x)$		(6)

where $\ast$ is the convolution operator and where, for $t>0$ , $K_{t}(x)=\smash{\mathrm{e}^{-x^{2}/2t}/\sqrt{2\pi t}}$ is the heat kernel.
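The construction of step 1 can be sketched numerically as follows. To have a closed form to compare against, the example assumes a lognormal $\mu_{1}$ with volatility $\sigma$ (a hypothetical stand-in for the market-implied law), for which $u(t,x)=S_{0}\exp(\sigma x-\sigma^{2}t/2)$ . The convolution with the heat kernel is evaluated by Gauss–Hermite quadrature, and the probability passed to the quantile function is clipped just below one to avoid an infinite quantile at extreme knots; both are implementation choices made here, not taken from the article.

```python
import numpy as np
from scipy.stats import norm, lognorm

T1 = 30.0 / 365.0
s0, sigma = 100.0, 0.2
# Hypothetical lognormal stand-in for mu_1 (mean s0, log-variance sigma^2 T1)
mu1 = lognorm(s=sigma * np.sqrt(T1), scale=s0 * np.exp(-0.5 * sigma**2 * T1))

def g(x):
    """Terminal map g(x) = F_{mu_1}^{-1}(Phi(x / sqrt(T1)))."""
    p = np.minimum(norm.cdf(x / np.sqrt(T1)), 1.0 - 1e-15)   # avoid ppf(1) = inf
    return mu1.ppf(p)

z, w = np.polynomial.hermite_e.hermegauss(40)
w = w / np.sqrt(2.0 * np.pi)

def u(t, x):
    """u(t, x) = E[g(x + W_{T1} - W_t)], the heat-kernel convolution (6)."""
    return np.sum(w * g(x + np.sqrt(T1 - t) * z))

t, x = 0.5 * T1, 0.1
value = u(t, x)
closed_form = s0 * np.exp(sigma * x - 0.5 * sigma**2 * t)
spot = u(0.0, 0.0)                                  # should recover s0
```

Simulating $S_{t}=u(t,W_{t})$ then only requires Brownian increments and evaluations of `u`, which in practice would be tabulated on a grid in $(t,x)$ rather than recomputed for each path.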

Step 2: simulation of $V$ given ${(S_{t})}_{t \in [0, T_{1}]}$

At this stage, we have simulated $(S_{t})_{t\in[0,T_{1}]}$ such that $(S_{t})_{t\in[0,T_{1}]}$ is an $((\mathcal{F}_{t}),\mathbb{P})$ -martingale and $S_{1}$ has distribution $\mu_{1}$ under $\mathbb{P}$ . Now, we simulate $V$ given $\sigma((W_{s})_{s\in[0,T_{1}]})$ under $\mathbb{P}$ as follows: the distribution of $V$ given $\sigma((W_{s})_{s\in[0,T_{1}]})$ under $\mathbb{P}$ is assumed to depend only on $S_{1}$ and is taken equal to the distribution of $V$ given $S_{1}$ under $\mu^{*}$ . Since $S_{1}$ has distribution $\mu_{1}$ under both $\mu^{*}$ and $\mathbb{P}$ , this means that the distribution of $(S_{1},V)$ is the same under $\mu^{*}$ and $\mathbb{P}$ ; in particular, $V$ has the distribution $\mu_{V}$ under $\mathbb{P}$ .
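When $\mu^{*}$ is available as weights on a discrete grid, drawing $V$ given $S_{1}$ amounts to inverting a conditional cumulative distribution. The sketch below assumes a small hypothetical joint table for $(S_{1},V)$ and assumes that the simulated $S_{1}$ has already been mapped to a row of that table (for instance, its nearest grid node); how to interpolate between nodes is left open here.

```python
import numpy as np

def sample_v_given_s1(joint, v_grid, s1_index, uniforms):
    """Inverse-CDF draws of V from row `s1_index` of a discrete joint law
    of (S_1, V), given uniform random numbers in (0, 1)."""
    cond = joint[s1_index] / joint[s1_index].sum()   # P(V = v_j | S_1 = s_i)
    cdf = np.cumsum(cond)
    idx = np.searchsorted(cdf, uniforms, side="left")
    return v_grid[np.minimum(idx, len(v_grid) - 1)]  # guard against rounding

v_grid = np.array([0.15, 0.25, 0.40])
joint = np.array([[0.05, 0.15, 0.30],     # low S_1 node: high VIX more likely
                  [0.30, 0.15, 0.05]])    # high S_1 node: low VIX more likely
draws = sample_v_given_s1(joint, v_grid, 0, np.array([0.05, 0.30, 0.90]))
```

In a simulation, `uniforms` would come from a random generator that is independent of the Brownian motion $W$ .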

Step 3: simulation of $(S_{t})_{t\in[T_{1},T_{2}]}$ given $\mathcal{F}_{T_{1}}$

In this last step, we build dynamics for $(S_{t})_{t\in[T_{1},T_{2}]}$ conditional on $\mathcal{F}_{T_{1}}$ such that $(S_{t})_{t\in[T_{1},T_{2}]}$ is an $((\mathcal{F}_{t}),\mathbb{P})$ -martingale starting from $S_{1}$ , and $S_{2}$ has distribution $\mu_{2}$ . We use once again a Markov functional construction. Given $S_{1}=s$ and $V=v$ , we consider:

g_{s,v}(x):=F_{\mu_{2|s,v}}^{-1}\bigg(F_{\mathcal{N}(0,1)}\bigg(\frac{x}{\sqrt{\tau}}\bigg)\bigg)

for all $x\in\mathbb{R}$ and where $\mu_{2|s,v}$ is the distribution of $S_{2}$ given $S_{1}=s$ and $V=v$ under $\mu^{*}$ . Then, given $\mathcal{F}_{T_{1}}$ , we define for $t\in(T_{1},T_{2}]$ :

S_{t}:=u_{S_{1},V}(t,W_{t}-W_{T_{1}})

where for every $s,v>0$ :

	$\displaystyle u_{s,v}(t,x)$	$\displaystyle=\mathbb{E}[g_{s,v}(W_{T_{2}}-W_{T_{1}})\mid W_{t}-W_{T_{1}}=x]$
		$\displaystyle=\mathbb{E}[g_{s,v}(x+W_{T_{2}}-W_{t})]=(g_{s,v}*K_{T_{2}-t})(x)$		(7)

It is easy to check that $(S_{t})_{t\in[T_{1},T_{2}]}$ is an $((\mathcal{F}_{t}),\mathbb{P})$ -martingale starting from $S_{1}$ . Moreover, the distribution of $S_{2}$ given $(S_{1},V)$ is the same under $\mu^{*}$ and $\mathbb{P}$ . As the distribution of $(S_{1},V)$ is the same under $\mu^{*}$ and $\mathbb{P}$ , we conclude that the distribution of $(S_{1},V,S_{2})$ is the same under $\mu^{*}$ and $\mathbb{P}$ . In particular $S_{1}$ , $V$ and $S_{2}$ have distributions $\mu_{1}$ , $\mu_{V}$ and $\mu_{2}$ under $\mathbb{P}$ . We have thus built a model $\mathbb{P}$ on $\smash{((S_{t})_{t\in[0,T_{2}]},V)}$ such that (a) $S_{1}$ , $V$ and $S_{2}$ have distributions $\mu_{1}$ , $\mu_{V}$ and $\mu_{2}$ under $\mathbb{P}$ ; (b) $\smash{(S_{t})_{t\in[0,T_{2}]}}$ is an $((\mathcal{F}_{t}),\mathbb{P})$ -martingale; and (c) $V$ is the VIX at $T_{1}$ , since by construction:

$\displaystyle\mathbb{E}^{\mathbb{P}}\bigg[L\bigg(\frac{S_{2}}{S_{1}}\bigg)\biggm|\mathcal{F}_{T_{1}}\bigg]=\mathbb{E}^{\mathbb{P}}\bigg[L\bigg(\frac{S_{2}}{S_{1}}\bigg)\biggm|S_{1},V\bigg]=\mathbb{E}^{\mu^{*}}\bigg[L\bigg(\frac{S_{2}}{S_{1}}\bigg)\biggm|S_{1},V\bigg]=V^{2}$

Remark 3 (Extension to several VIX maturities)

Note that this approach can easily be iterated on intervals $[T_{i},T_{i+1}]$ . For example, after step 3, we will have generated $S_{2}\sim\mu_{2}$ . Then, we just need to generate $V_{T_{2}}\sim\mu_{V_{T_{2}}}$ and repeat the same procedure. Here, we disregard the Wednesday/Friday issue and make the approximation that the VIX future maturities are exactly 30 days apart.

The numerical implementation of the continuous-time extension is easy; for details we refer the reader to Bourgey & Guyon (2022). The resulting smiles (market, discrete and continuous time) for the SPX at $T_{1}$ and $T_{2}$ and that of the VIX at $T_{1}$ are displayed in figure 4; they were computed by Monte Carlo simulation with $10^{5}$ paths.

Pricing

Our continuous-time model can be used to price path-dependent options on the SPX with the guarantee that the model exactly matches the SPX smiles at $T_{1}$ and $T_{2}$ and the VIX future and VIX smile at $T_{1}$, thus taking into account information about the forward volatility at $T_{1}$ that is not included in the SPX smiles. As a pricing exercise, we compare it with models commonly used by practitioners: the Dupire local volatility model (Dupire 1994) and the local stochastic version of the two-factor Bergomi model (Bergomi 2005); those models are calibrated to the full SPX implied volatility surface but not to VIX smiles. We consider spot-starting, forward-starting and mixed versions of lookback and Asian options. The prices are reported in table A along with their 95% confidence intervals. We used $10^{5}$ Monte Carlo paths and a trapezoidal rule to approximate the time integral for the Asian option. Note that for forward-starting options the price in our jointly calibrated model is always larger than in the other two models: ignoring the VIX information leads to underpricing these forward-starting payoffs.
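The Monte Carlo mechanics described above (a trapezoidal rule for the time average and a 95% confidence interval) can be sketched as follows. The path generator is a driftless lognormal placeholder with hypothetical parameters, used only so that the example runs on its own; it is not the jointly calibrated model, and the arithmetic-average call payoff, strike and path count are likewise assumptions made for illustration.

```python
import math
import random
from statistics import mean, stdev

def trapezoid_average(path, dt):
    """Trapezoidal approximation of (1/T) * integral_0^T S_t dt on a uniform grid."""
    T = dt * (len(path) - 1)
    integral = dt * (sum(path) - 0.5 * (path[0] + path[-1]))
    return integral / T

def mc_price_with_ci(payoffs):
    """Monte Carlo estimate and half-width of the 95% confidence interval."""
    est = mean(payoffs)
    half_width = 1.96 * stdev(payoffs) / math.sqrt(len(payoffs))
    return est, half_width

def placeholder_paths(n_paths, n_steps, T, s0=100.0, vol=0.2, seed=0):
    """Driftless lognormal paths: a stand-in, NOT the jointly calibrated model."""
    rng = random.Random(seed)
    dt = T / n_steps
    for _ in range(n_paths):
        s, path = s0, [s0]
        for _ in range(n_steps):
            shock = rng.gauss(0.0, 1.0)
            s *= math.exp(-0.5 * vol**2 * dt + vol * math.sqrt(dt) * shock)
            path.append(s)
        yield path

# Hypothetical contract: spot-starting Asian call over [0, T] with strike 100.
T, n_steps, strike = 60 / 365, 60, 100.0
dt = T / n_steps
payoffs = [max(trapezoid_average(p, dt) - strike, 0.0)
           for p in placeholder_paths(10_000, n_steps, T)]
price, half_width = mc_price_with_ci(payoffs)
print(f"Asian call: {price:.3f} +/- {half_width:.3f}")
```

Swapping `placeholder_paths` for paths simulated from the continuous-time model would leave the averaging and confidence-interval code unchanged.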

Conclusion

In this article, we have:

improved model-free bounds on SPX options by incorporating VIX options data;
built the minimum-entropy jointly calibrated discrete-time model $\mu^{*}$ very fast using an implied Newton method;
seamlessly extended this discrete-time model to continuous time in a purely forward fashion, using Markov functionals.

Thus, we have established a swift process for creating an arbitrage-free continuous-time model for SPX that accurately calibrates to SPX smiles, VIX futures and VIX smiles. Such a model can be used for pricing and hedging exotic options, computing reserves or valuation adjustments, and assessing model risk. Our main methodological contribution is that we first build a jointly calibrated discrete-time model $(S_{T_{i}})$ (where the $T_{i}$ are the calibrated maturities) that is later extended to continuous time using an arbitrage-free martingale time interpolation. Since the discrete-time model can be exactly calibrated much faster than continuous-time models, and since extremely fast extrapolations exist, this novel approach seems to be a promising new avenue for calibrating models.

Florian Bourgey is a quantitative researcher at Bloomberg in New York, while Julien Guyon is a professor of applied mathematics at École des Ponts ParisTech in Paris. They appreciate the valuable feedback from the anonymous referees.

Email: fbourgey@bloomberg.net, julien.guyon@enpc.fr.

References

Avellaneda M, C Friedman, R Holmes and D Samperi, 1997
Calibrating volatility surfaces via relative-entropy minimization
Applied Mathematical Finance 4(1), pages 37–64
Bergomi L, 2005
Smile dynamics II
Risk October, pages 67–73
Bourgey F and J Guyon, 2022
Fast exact joint S&P 500/VIX smile calibration in discrete and continuous time
Preprint, SSRN 4315084
Conze A and P Henry-Labordère, 2022
A new fast local volatility model
Risk April, http://www.risk.net/7944906
De March H, 2018
Entropic approximation for multi-dimensional martingale optimal transport
Preprint, arXiv:1812.11104
De March H and P Henry-Labordère, 2019
Building arbitrage-free implied volatility: Sinkhorn’s algorithm and variants
Preprint, SSRN 3326486
Dupire B, 1994
Pricing with a smile
Risk January, pages 18–20
Guo I, G Loeper, J Obłój and S Wang, 2022
Joint modeling and calibration of SPX and VIX by optimal transport
SIAM Journal on Financial Mathematics 13(1), pages 1–31
Guyon J, 2020
The joint S&P 500/VIX smile calibration puzzle solved
Risk April, http://www.risk.net/7518926
Guyon J, 2022a
Dispersion-constrained martingale Schrödinger bridges: joint entropic calibration of stochastic volatility models to S&P 500 and VIX smiles
Preprint, SSRN 3853237
Guyon J, 2022b
The VIX future in Bergomi models: fast approximation formulas and joint calibration with S&P 500 skew
SIAM Journal on Financial Mathematics 13(4), pages 1,418–1,485
Guyon J, 2024
Dispersion-constrained martingale Schrödinger problems and the exact joint S&P 500/VIX smile calibration puzzle
Finance and Stochastics 28(1), pages 27–79
Guyon J and J Lekeufack, 2023
Volatility is (mostly) path-dependent
Quantitative Finance 23(9), pages 1,221–1,258
Henry-Labordère P, 2017
Model-Free Hedging: A Martingale Optimal Transport Viewpoint
Chapman and Hall/CRC
